typo, urls and congrats on the strong model!
#1
by JanPf - opened
README.md CHANGED

```diff
@@ -52,7 +52,7 @@ Key improvements over Gemma-2-2B baseline:
 - ARC-DE: +41% (32.3% vs 22.9%)
 - Average zero-shot: +40% (35.8% vs 25.5%)
 
-BübleLM-2B
+BübleLM-2B consistently outperforms both the base Gemma-2-2B and other German models like LLäMmlein-1B across most tasks.
 
 <table class="model-comparison">
 <thead>
@@ -75,7 +75,7 @@ Key improvements over Gemma-2-2B baseline:
 </thead>
 <tbody>
 <tr>
-    <td>Gemma-2-2B</td>
+    <td><a href="https://huggingface.co/google/gemma-2-2b" target="_blank">Gemma-2-2B</a></td>
     <td align="center">22.9</td>
     <td align="center">23.1</td>
     <td align="center">28.0</td>
@@ -84,7 +84,7 @@ Key improvements over Gemma-2-2B baseline:
     <td align="center">25.5</td>
 </tr>
 <tr>
-    <td>
+    <td><a href="https://huggingface.co/LSX-UniWue/LLaMmlein_120M" target="_blank">LLäMmlein-120M</a></td>
     <td align="center">24.7 ↑+8%</td>
     <td align="center">-</td>
     <td align="center">32.0 ↑+14%</td>
@@ -93,7 +93,7 @@ Key improvements over Gemma-2-2B baseline:
     <td align="center">27.2 ↑+7%</td>
 </tr>
 <tr>
-    <td>
+    <td><a href="https://huggingface.co/LSX-UniWue/LLaMmlein_1B" target="_blank">LLäMmlein-1B</a></td>
     <td align="center">30.0 ↑+31%</td>
     <td align="center">-</td>
     <td align="center"><strong>48.5</strong> ↑+73%</td>
@@ -102,7 +102,7 @@ Key improvements over Gemma-2-2B baseline:
     <td align="center">34.0 ↑+33%</td>
 </tr>
 <tr>
-    <td>Sauerkraut-Gemma-2B</td>
+    <td><a href="https://huggingface.co/VAGOsolutions/SauerkrautLM-Gemma-2b" target="_blank">Sauerkraut-Gemma-2B</a></td>
     <td align="center">28.0 ↑+22%</td>
     <td align="center">34.6 ↑+50%</td>
     <td align="center">37.2 ↑+33%</td>
```