[ICLR'24 Spotlight] Tool-Augmented Reward Modeling
ernie-research
community
AI & ML interests: Large Language Models
models: 12
ernie-research/Themis-7b • Updated • 10 • 4
ernie-research/APPS-Gemma-7B-MA-PPO-Fixed10 • 9B • Updated • 45
ernie-research/APPS-Gemma-2B-MA-PPO-Fixed10 • 3B • Updated • 5
ernie-research/HH-RLHF-Gemma-2B-MA-PPO-Fixed5 • 3B • Updated • 10
ernie-research/HH-RLHF-Gemma-7B-MA-PPO-Fixed5 • 9B • Updated • 6
ernie-research/TLDR-Gemma-7B-MA-PPO-Fixed5 • 9B • Updated • 5
ernie-research/TLDR-Gemma-2B-MA-PPO-Fixed5 • 3B • Updated • 12 • 1
ernie-research/TLDR-Gemma-2-27B-MA-PPO-Fixed5 • 27B • Updated • 8
ernie-research/ernie-code-560m • Updated • 29 • 10
ernie-research/MonoGPT • Text Generation • 0.4B • Updated • 10 • 2