Update README.md
README.md
CHANGED
@@ -82,11 +82,11 @@ It demonstrates strong capabilities in:
 - Generalizing across different programming languages, coding styles, and codebases.
 
 | | DeepSeek-Coder-6.7B-Base | OpenCoder-8B-Base | Qwen2.5-Coder-7B | Seed-Coder-8B-Base |
-
-| HumanEval |
-| MBPP |
-| MultiPL-E |
-| CruxEval-O |
+|------------|:------------------------:|:-----------------:|:----------------:|:------------------:|
+| HumanEval | 47.6 | 66.5 | 72.0 | 77.4 |
+| MBPP | 70.2 | 79.9 | 79.4 | 82.0 |
+| MultiPL-E | 44.7 | 61.0 | 58.8 | 67.6 |
+| CruxEval-O | 41.0 | 43.9 | 56.0 | 48.4 |
 
 For detailed benchmark results, please refer to our [📑 paper](https://arxiv.org/pdf/xxx.xxxxx).
 