Space: whyu/MM-Vet_Evaluator

whyu committed on Aug 21, 2023

Commit 0b0d73f (1 parent: d67424e)

Fix sleep bug

Files changed (1): app.py

@@ -166,6 +166,7 @@ def grade(file_obj, progress=gr.Progress()):
                 grade_sample_run_complete = False
                 temperature = 0.0
+                num_sleep = 0
                 while not grade_sample_run_complete:
                     try:
                         response = openai.ChatCompletion.create(
@@ -206,8 +207,15 @@ def grade(file_obj, progress=gr.Progress()):
                         grade_sample_run_complete = True
                     except:
                         # gpt4 may have token rate limit
+                        num_sleep += 1
+                        if num_sleep > 2:
+                            score = 0.0
+                            grade_sample_run_complete = True
+                            num_sleep = 0
+                            continue
                         print("sleep 30s")
                         time.sleep(30)
                 if len(sample_grade['model']) >= j + 1:
                     sample_grade['model'][j] = response['model']
@@ -298,7 +306,7 @@ markdown = """
 In this demo, we offer MM-Vet LLM-based (GPT-4) evaluator to grade open-ended outputs from your models.
-Plese upload your json file of your model results containing `\{v1_0\: ..., v1_1\: ..., \}`like [this json file](https://raw.githubusercontent.com/yuweihao/MM-Vet/main/results/llava_llama2_13b_chat.json).
+Plese upload your json file of your model results containing `{v1_0: ..., v1_1: ..., }`like [this json file](https://raw.githubusercontent.com/yuweihao/MM-Vet/main/results/llava_llama2_13b_chat.json).
 The grading may last 5 minutes. Sine we only support 1 queue, the grading time may be longer when you need to wait for other users' grading to finish.
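
Review note: the retry cap added in this commit (give up once `num_sleep` exceeds 2 and fall back to a score of 0.0) can be sketched in isolation as below. This is a minimal illustration, not the code in app.py: the helper name `call_with_retry_cap` and its parameters `max_sleeps`, `sleep_seconds`, and `sleep` are hypothetical. The sketch returns an explicit success flag so the caller does not read a result after giving up; in the diff, the line following the loop appears to read `response['model']` even when every attempt failed.

```python
import time


# Hypothetical helper (not in app.py): retries `call` with a capped number of sleeps.
def call_with_retry_cap(call, max_sleeps=2, sleep_seconds=30.0, sleep=time.sleep):
    """Run call() until it succeeds or it has failed more than max_sleeps times.

    Returns (result, True) on success and (None, False) after giving up.
    """
    num_sleep = 0
    while True:
        try:
            return call(), True
        except Exception as exc:  # narrower than a bare `except:`; lets Ctrl-C through
            num_sleep += 1
            if num_sleep > max_sleeps:
                # Same cap as the diff: stop instead of sleeping forever.
                return None, False
            print(f"sleep {sleep_seconds}s after error: {exc!r}")
            sleep(sleep_seconds)


# Usage with a stand-in for the API call; `sleep` is stubbed out so nothing blocks.
def always_rate_limited():
    raise RuntimeError("rate limited")


result, ok = call_with_retry_cap(always_rate_limited, sleep=lambda seconds: None)
score = result if ok else 0.0  # mirror the diff's fallback score
```

With `max_sleeps=2` the call is attempted three times and sleeps twice before giving up, which matches the `num_sleep > 2` check in the diff.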