Update README.md
README.md
CHANGED
@@ -81,6 +81,13 @@ It demonstrates strong capabilities in:
 - Reasoning about code structure and inferring missing logic.
 - Generalizing across different programming languages, coding styles, and codebases.
 
+| | DeepSeek-Coder-6.7B-Base | OpenCoder-8B-Base | Qwen2.5-Coder-7B | Seed-Coder-8B-Base |
+|------------|--------------------------|-------------------|:----------------:|--------------------|
+| HumanEval | 47.6 | 66.5 | 72.0 | 77.4 |
+| MBPP | 70.2 | 79.9 | 79.4 | 82.0 |
+| MultiPL-E | 44.7 | 61.0 | 58.8 | 67.6 |
+| CruxEval-O | 41.0 | 43.9 | 56.0 | 48.4 |
+
 For detailed benchmark results, please refer to our [📑 paper](https://arxiv.org/pdf/xxx.xxxxx).
 
 ## Citation