zhimin-z committed · Commit 4b78e58 · Parent(s): 64746d3
merge wanted
README.md
CHANGED
@@ -31,34 +31,51 @@ Key metrics from the last 180 days:
@@ -71,33 +88,49 @@ Submissions are validated and data loads within seconds.
|
| 31 |
- **Total Issues**: Issues the assistant has been involved with (authored, assigned, or commented on)
|
| 32 |
- **Closed Issues**: Issues that were closed
|
| 33 |
- **Resolved Issues**: Closed issues marked as completed
|
| 34 |
+
- **Resolved Rate**: Percentage of closed issues successfully resolved
|
| 35 |
+
- **Resolved Wanted Issues**: Long-standing issues (30+ days old) from major open-source projects that the assistant resolved via merged pull requests
|
| 36 |
|
| 37 |
**Monthly Trends**
|
| 38 |
+
- Resolved rate trends (line plots)
|
| 39 |
- Issue volume over time (bar charts)
|
| 40 |
|
| 41 |
+
**Issues Wanted**
|
| 42 |
+
- Long-standing open issues (30+ days) with fix-needed labels (e.g. `bug`, `enhancement`) from tracked organizations (Apache, GitHub, Hugging Face)
|
| 43 |
+
|
| 44 |
We focus on 180 days to highlight current capabilities and active assistants.
|
| 45 |
|
| 46 |
## How It Works
|
| 47 |
|
| 48 |
**Data Collection**
|
| 49 |
+
We mine GitHub activity from [GHArchive](https://www.gharchive.org/), tracking two types of issues:
|
| 50 |
+
|
| 51 |
+
1. **Agent-Assigned Issues**:
|
| 52 |
+
- Issues opened by or assigned to the assistant (`IssuesEvent`)
|
| 53 |
+
- Issue comments by the assistant (`IssueCommentEvent`)
|
| 54 |
+
|
| 55 |
+
2. **Wanted Issues** (from tracked organizations: Apache, GitHub, Hugging Face):
|
| 56 |
+
- Long-standing open issues (30+ days) with fix-needed labels (`bug`, `enhancement`)
|
| 57 |
+
- Pull requests created by assistants that reference these issues
|
| 58 |
+
- An issue counts as resolved only when the assistant's PR is merged and the issue is subsequently closed (see the sketch below)
|
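As a rough illustration of the linking step, the sketch below shows how a pull request body can be mapped back to the issues it references. This is a minimal example rather than the exact pipeline: the helper name is made up, but the regular expression mirrors the one the `msr.py` miner uses.

```python
import re

# Matches either a full GitHub issue URL or a short "#123" reference.
ISSUE_REF = re.compile(r'(?:https?://github\.com/[\w-]+/[\w-]+/issues/\d+)|(?:#\d+)')

def linked_issue_urls(pr_url: str, pr_body: str) -> set:
    """Hypothetical helper: collect issue URLs referenced by a pull request body."""
    refs = set()
    for ref in ISSUE_REF.findall(pr_body or ""):
        if ref.startswith('#'):
            # Short references are resolved against the PR's own org/repo.
            parts = pr_url.split('/')
            org, repo = parts[-4], parts[-3]
            refs.add(f"https://github.com/{org}/{repo}/issues/{ref[1:]}")
        else:
            refs.add(ref)
    return refs
```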
| 59 |
|
| 60 |
**Regular Updates**
|
| 61 |
Leaderboard refreshes weekly (Friday at 00:00 UTC).
|
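For context, a weekly Friday-midnight refresh like this can be wired up with APScheduler, which `app.py` already imports. The sketch below is only an assumption about how the job might be registered; `refresh_leaderboard` is a placeholder name, not the actual refresh function.

```python
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger

def refresh_leaderboard():
    # Placeholder: reload leaderboard data from the HuggingFace dataset here.
    print("Refreshing leaderboard data...")

scheduler = BackgroundScheduler(timezone="UTC")
# Fire every Friday at 00:00 UTC, matching the stated refresh schedule.
scheduler.add_job(refresh_leaderboard, CronTrigger(day_of_week="fri", hour=0, minute=0))
scheduler.start()
```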
| 62 |
|
| 63 |
**Community Submissions**
|
| 64 |
+
Anyone can submit an assistant. We store metadata in `SWE-Arena/bot_metadata` and results in `SWE-Arena/leaderboard_metadata`. All submissions are validated via GitHub API.
|
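To make the submission flow concrete, a record in `SWE-Arena/bot_metadata` looks roughly like the sketch below. The filename doubles as the agent's GitHub identifier; the exact field set is an assumption inferred from what the leaderboard code reads (`name`, `organization`, `website`, `status`), not an authoritative schema.

```python
import json

# Hypothetical submission record; only agents with status "active" are counted.
submission = {
    "name": "Example Assistant",
    "organization": "Example Org",
    "website": "https://example.com",
    "status": "active",
}

# The file is named after the agent's GitHub identifier, e.g. example-bot[bot].json.
with open("example-bot[bot].json", "w", encoding="utf-8") as f:
    json.dump(submission, f, indent=2)
```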
| 65 |
|
| 66 |
## Using the Leaderboard
|
| 67 |
|
| 68 |
### Browsing
|
| 69 |
+
**Leaderboard Tab**:
|
| 70 |
- Searchable table (by assistant name or website)
|
| 71 |
+
- Filterable columns (by resolved rate)
|
| 72 |
- Monthly charts (resolution trends and activity)
|
| 73 |
+
- View both agent-assigned metrics and wanted issue resolutions
|
| 74 |
+
|
| 75 |
+
**Issues Wanted Tab**:
|
| 76 |
+
- Browse long-standing open issues (30+ days) from major open-source projects
|
| 77 |
+
- Filter by tracked organizations (Apache, GitHub, Hugging Face)
|
| 78 |
+
- See which issues need attention from the community
|
| 79 |
|
| 80 |
### Adding Your Assistant
|
| 81 |
The Submit Assistant tab requires:
|
|
|
|
| 88 |
|
| 89 |
## Understanding the Metrics
|
| 90 |
|
| 91 |
+
**Resolved Rate**
|
| 92 |
Percentage of closed issues successfully completed:
|
| 93 |
|
| 94 |
```
|
| 95 |
+
Resolved Rate = resolved issues ÷ closed issues × 100
|
| 96 |
```
|
| 97 |
|
| 98 |
An issue is "resolved" when `state_reason` is `completed` on GitHub. This means the problem was solved, not just closed without resolution.
|
| 99 |
|
| 100 |
Context matters: 100 closed issues at 70% resolution (70 resolved) differs from 10 closed issues at 90% (9 resolved). Consider both rate and volume.
|
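A minimal sketch of the formula above, assuming each issue is a dict carrying GitHub's `state` and `state_reason` fields:

```python
def resolved_rate(issues):
    """Percentage of closed issues whose state_reason is 'completed'."""
    closed = [i for i in issues if i.get("state") == "closed"]
    resolved = [i for i in closed if i.get("state_reason") == "completed"]
    return round(len(resolved) / len(closed) * 100, 2) if closed else 0.0

# Example: 7 of 10 closed issues marked completed -> 70.0
print(resolved_rate([{"state": "closed", "state_reason": "completed"}] * 7
                    + [{"state": "closed", "state_reason": "not_planned"}] * 3))
```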
| 101 |
|
| 102 |
+
**Resolved Wanted Issues**
|
| 103 |
+
Long-standing issues (30+ days old) from major open-source projects that the assistant resolved. An issue qualifies when:
|
| 104 |
+
1. It's from a tracked organization (Apache, GitHub, Hugging Face)
|
| 105 |
+
2. It has a fix-needed label (`bug`, `enhancement`)
|
| 106 |
+
3. The assistant created a pull request referencing the issue
|
| 107 |
+
4. The pull request was merged
|
| 108 |
+
5. The issue was subsequently closed
|
| 109 |
+
|
| 110 |
+
This metric highlights assistants' ability to tackle challenging, community-identified problems in high-impact projects.
|
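A hedged sketch of this qualification check, assuming the wanted-issue metadata shape used in `msr.py` (`repo`, `labels`, `state`) and the configured organization and label sets; the helper itself is illustrative, not the production code.

```python
TRACKED_ORGS = {"apache", "github", "huggingface"}
PATCH_WANTED_LABELS = {"bug", "enhancement"}

def qualifies_as_resolved_wanted(issue, merged_pr_creator, agent_ids, days_open, min_days=30):
    """Illustrative check for the five conditions listed above."""
    org = issue.get("repo", "").split("/")[0]                                    # 1. tracked organization
    has_label = any(l in PATCH_WANTED_LABELS for l in issue.get("labels", []))   # 2. fix-needed label
    return (
        org in TRACKED_ORGS
        and has_label
        and merged_pr_creator in agent_ids        # 3 + 4. the agent's PR referenced the issue and was merged
        and issue.get("state") == "closed"        # 5. the issue was subsequently closed
        and days_open >= min_days                 # long-standing (30+ days)
    )
```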
| 111 |
+
|
| 112 |
+
**Long-Standing Issues**
|
| 113 |
+
Issues that have been open for 30+ days represent real challenges the community has struggled to address. They tend to be harder than typical issues, so resolving them is a stronger signal of an assistant's problem-solving capability.
|
| 114 |
+
|
| 115 |
**Monthly Trends**
|
| 116 |
+
- **Line plots**: Resolved rate changes over time
|
| 117 |
- **Bar charts**: Issue volume per month
|
| 118 |
|
| 119 |
Patterns to watch:
|
| 120 |
- Consistent high rates = effective problem-solving
|
| 121 |
- Increasing trends = improving assistants
|
| 122 |
- High volume + good rates = productivity + effectiveness
|
| 123 |
+
- High wanted issue resolution = ability to tackle challenging community problems
|
| 124 |
|
| 125 |
## What's Next
|
| 126 |
|
| 127 |
Planned improvements:
|
| 128 |
- Repository-based analysis
|
| 129 |
+
- Extended metrics (comment activity, response time, code complexity)
|
| 130 |
+
- Resolution time tracking from issue creation to PR merge
|
| 131 |
+
- Issue category patterns and difficulty assessment
|
| 132 |
+
- Expanded organization and label tracking for wanted issues
|
| 133 |
+
- Integration with additional high-impact open-source organizations
|
| 134 |
|
| 135 |
## Questions or Issues?
|
| 136 |
|
app.py
CHANGED
@@ -3,6 +3,7 @@ from gradio_leaderboard import Leaderboard, ColumnFilter
@@ -14,6 +15,7 @@ import plotly.graph_objects as go
@@ -23,8 +25,11 @@ load_dotenv()
@@ -95,52 +101,113 @@ def validate_github_username(identifier):
@@ -483,6 +550,7 @@ def get_leaderboard_dataframe():
@@ -493,7 +561,7 @@ def get_leaderboard_dataframe():
@@ -508,6 +576,54 @@ def get_leaderboard_dataframe():
@@ -657,6 +773,25 @@ with gr.Blocks(title="SWE Agent Issue Leaderboard", theme=gr.themes.Soft()) as app:
|
| 3 |
import json
|
| 4 |
import os
|
| 5 |
import time
|
| 6 |
+
import subprocess
|
| 7 |
import requests
|
| 8 |
from huggingface_hub import HfApi, hf_hub_download
|
| 9 |
from huggingface_hub.errors import HfHubHTTPError
|
|
|
|
| 15 |
from plotly.subplots import make_subplots
|
| 16 |
from apscheduler.schedulers.background import BackgroundScheduler
|
| 17 |
from apscheduler.triggers.cron import CronTrigger
|
| 18 |
+
from datetime import datetime, timezone
|
| 19 |
|
| 20 |
# Load environment variables
|
| 21 |
load_dotenv()
|
|
|
|
| 25 |
# =============================================================================
|
| 26 |
|
| 27 |
AGENTS_REPO = "SWE-Arena/bot_metadata" # HuggingFace dataset for agent metadata
|
| 28 |
+
AGENTS_REPO_LOCAL_PATH = os.path.expanduser("~/bot_metadata") # Local git clone path
|
| 29 |
LEADERBOARD_FILENAME = f"{os.getenv('COMPOSE_PROJECT_NAME')}.json"
|
| 30 |
LEADERBOARD_REPO = "SWE-Arena/leaderboard_metadata" # HuggingFace dataset for leaderboard data
|
| 31 |
+
LONGSTANDING_GAP_DAYS = 30 # Minimum days for an issue to be considered long-standing
|
| 32 |
+
GIT_SYNC_TIMEOUT = 300 # 5 minutes timeout for git pull
|
| 33 |
MAX_RETRIES = 5
|
| 34 |
|
| 35 |
LEADERBOARD_COLUMNS = [
|
|
|
|
| 38 |
("Total Issues", "number"),
|
| 39 |
("Resolved Issues", "number"),
|
| 40 |
("Resolved Rate (%)", "number"),
|
| 41 |
+
("Resolved Wanted Issues", "number"),
|
| 42 |
]
|
| 43 |
|
| 44 |
# =============================================================================
|
|
|
|
| 101 |
# HUGGINGFACE DATASET OPERATIONS
|
| 102 |
# =============================================================================
|
| 103 |
|
| 104 |
+
def sync_agents_repo():
|
| 105 |
+
"""
|
| 106 |
+
Sync local bot_metadata repository with remote using git pull.
|
| 107 |
+
This is MANDATORY to ensure we have the latest bot data.
|
| 108 |
+
Raises exception if sync fails.
|
| 109 |
+
"""
|
| 110 |
+
if not os.path.exists(AGENTS_REPO_LOCAL_PATH):
|
| 111 |
+
error_msg = f"Local repository not found at {AGENTS_REPO_LOCAL_PATH}"
|
| 112 |
+
print(f" Error {error_msg}")
|
| 113 |
+
print(f" Please clone it first: git clone https://huggingface.co/datasets/{AGENTS_REPO}")
|
| 114 |
+
raise FileNotFoundError(error_msg)
|
| 115 |
+
|
| 116 |
+
if not os.path.exists(os.path.join(AGENTS_REPO_LOCAL_PATH, '.git')):
|
| 117 |
+
error_msg = f"{AGENTS_REPO_LOCAL_PATH} exists but is not a git repository"
|
| 118 |
+
print(f" Error {error_msg}")
|
| 119 |
+
raise ValueError(error_msg)
|
| 120 |
+
|
| 121 |
try:
|
| 122 |
+
# Run git pull with extended timeout due to large repository
|
| 123 |
+
result = subprocess.run(
|
| 124 |
+
['git', 'pull'],
|
| 125 |
+
cwd=AGENTS_REPO_LOCAL_PATH,
|
| 126 |
+
capture_output=True,
|
| 127 |
+
text=True,
|
| 128 |
+
timeout=GIT_SYNC_TIMEOUT
|
| 129 |
+
)
|
| 130 |
|
| 131 |
+
if result.returncode == 0:
|
| 132 |
+
output = result.stdout.strip()
|
| 133 |
+
if "Already up to date" in output or "Already up-to-date" in output:
|
| 134 |
+
print(f" Success Repository is up to date")
|
| 135 |
+
else:
|
| 136 |
+
print(f" Success Repository synced successfully")
|
| 137 |
+
if output:
|
| 138 |
+
# Print first few lines of output
|
| 139 |
+
lines = output.split('\n')[:5]
|
| 140 |
+
for line in lines:
|
| 141 |
+
print(f" {line}")
|
| 142 |
+
return True
|
| 143 |
+
else:
|
| 144 |
+
error_msg = f"Git pull failed: {result.stderr.strip()}"
|
| 145 |
+
print(f" Error {error_msg}")
|
| 146 |
+
raise RuntimeError(error_msg)
|
| 147 |
+
|
| 148 |
+
except subprocess.TimeoutExpired:
|
| 149 |
+
error_msg = f"Git pull timed out after {GIT_SYNC_TIMEOUT} seconds"
|
| 150 |
+
print(f" Error {error_msg}")
|
| 151 |
+
raise TimeoutError(error_msg)
|
| 152 |
+
except (FileNotFoundError, ValueError, RuntimeError, TimeoutError):
|
| 153 |
+
raise # Re-raise expected exceptions
|
| 154 |
+
except Exception as e:
|
| 155 |
+
error_msg = f"Error syncing repository: {str(e)}"
|
| 156 |
+
print(f" Error {error_msg}")
|
| 157 |
+
raise RuntimeError(error_msg) from e
|
| 158 |
|
|
|
|
|
|
|
| 159 |
|
| 160 |
+
def load_agents_from_hf():
|
| 161 |
+
"""
|
| 162 |
+
Load all agent metadata JSON files from local git repository.
|
| 163 |
+
ALWAYS syncs with remote first to ensure we have the latest bot data.
|
| 164 |
+
"""
|
| 165 |
+
# MANDATORY: Sync with remote first to get latest bot data
|
| 166 |
+
print(f" Syncing bot_metadata repository to get latest agents...")
|
| 167 |
+
sync_agents_repo() # Will raise exception if sync fails
|
| 168 |
|
| 169 |
+
agents = []
|
|
|
|
| 170 |
|
| 171 |
+
# Scan local directory for JSON files
|
| 172 |
+
if not os.path.exists(AGENTS_REPO_LOCAL_PATH):
|
| 173 |
+
raise FileNotFoundError(f"Local repository not found at {AGENTS_REPO_LOCAL_PATH}")
|
| 174 |
|
| 175 |
+
# Walk through the directory to find all JSON files
|
| 176 |
+
files_processed = 0
|
| 177 |
+
print(f" Loading agent metadata from {AGENTS_REPO_LOCAL_PATH}...")
|
| 178 |
|
| 179 |
+
for root, dirs, files in os.walk(AGENTS_REPO_LOCAL_PATH):
|
| 180 |
+
# Skip .git directory
|
| 181 |
+
if '.git' in root:
|
| 182 |
+
continue
|
| 183 |
|
| 184 |
+
for filename in files:
|
| 185 |
+
if not filename.endswith('.json'):
|
| 186 |
+
continue
|
| 187 |
+
|
| 188 |
+
files_processed += 1
|
| 189 |
+
file_path = os.path.join(root, filename)
|
| 190 |
+
|
| 191 |
+
try:
|
| 192 |
+
with open(file_path, 'r', encoding='utf-8') as f:
|
| 193 |
+
agent_data = json.load(f)
|
| 194 |
+
|
| 195 |
+
# Only include active agents
|
| 196 |
+
if agent_data.get('status') != 'active':
|
| 197 |
+
continue
|
| 198 |
+
|
| 199 |
+
# Extract github_identifier from filename
|
| 200 |
+
github_identifier = filename.replace('.json', '')
|
| 201 |
+
agent_data['github_identifier'] = github_identifier
|
| 202 |
+
|
| 203 |
+
agents.append(agent_data)
|
| 204 |
|
| 205 |
except Exception as e:
|
| 206 |
+
print(f" Warning Error loading {filename}: {str(e)}")
|
| 207 |
continue
|
| 208 |
|
| 209 |
+
print(f" Success Loaded {len(agents)} active agents (from {files_processed} total files)")
|
| 210 |
+
return agents
|
|
|
|
|
|
|
|
|
|
|
|
|
| 211 |
|
| 212 |
|
| 213 |
def get_hf_token():
|
|
|
|
| 550 |
total_issues,
|
| 551 |
data.get('resolved_issues', 0),
|
| 552 |
data.get('resolved_rate', 0.0),
|
| 553 |
+
data.get('resolved_wanted_issues', 0),
|
| 554 |
])
|
| 555 |
|
| 556 |
print(f"Filtered out {filtered_count} agents with 0 issues")
|
|
|
|
| 561 |
df = pd.DataFrame(rows, columns=column_names)
|
| 562 |
|
| 563 |
# Ensure numeric types
|
| 564 |
+
numeric_cols = ["Total Issues", "Resolved Issues", "Resolved Rate (%)", "Resolved Wanted Issues"]
|
| 565 |
for col in numeric_cols:
|
| 566 |
if col in df.columns:
|
| 567 |
df[col] = pd.to_numeric(df[col], errors='coerce').fillna(0)
|
|
|
|
| 576 |
return df
|
| 577 |
|
| 578 |
|
| 579 |
+
def get_wanted_issues_dataframe():
|
| 580 |
+
"""Load wanted issues and convert to pandas DataFrame."""
|
| 581 |
+
saved_data = load_leaderboard_data_from_hf()
|
| 582 |
+
|
| 583 |
+
if not saved_data or 'wanted_issues' not in saved_data:
|
| 584 |
+
print(f"No wanted issues data available")
|
| 585 |
+
return pd.DataFrame(columns=["Title", "URL", "Age (days)", "Labels"])
|
| 586 |
+
|
| 587 |
+
wanted_issues = saved_data['wanted_issues']
|
| 588 |
+
print(f"Loaded {len(wanted_issues)} wanted issues")
|
| 589 |
+
|
| 590 |
+
if not wanted_issues:
|
| 591 |
+
return pd.DataFrame(columns=["Title", "URL", "Age (days)", "Labels"])
|
| 592 |
+
|
| 593 |
+
rows = []
|
| 594 |
+
for issue in wanted_issues:
|
| 595 |
+
# Calculate age
|
| 596 |
+
created_at = issue.get('created_at')
|
| 597 |
+
age_days = 0
|
| 598 |
+
if created_at and created_at != 'N/A':
|
| 599 |
+
try:
|
| 600 |
+
created = datetime.fromisoformat(created_at.replace('Z', '+00:00'))
|
| 601 |
+
age_days = (datetime.now(timezone.utc) - created).days
|
| 602 |
+
except:
|
| 603 |
+
pass
|
| 604 |
+
|
| 605 |
+
# Create clickable link
|
| 606 |
+
url = issue.get('url', '')
|
| 607 |
+
repo = issue.get('repo', '')
|
| 608 |
+
issue_number = issue.get('number', '')
|
| 609 |
+
url_link = f'<a href="{url}" target="_blank">{repo}#{issue_number}</a>'
|
| 610 |
+
|
| 611 |
+
rows.append([
|
| 612 |
+
issue.get('title', ''),
|
| 613 |
+
url_link,
|
| 614 |
+
age_days,
|
| 615 |
+
', '.join(issue.get('labels', []))
|
| 616 |
+
])
|
| 617 |
+
|
| 618 |
+
df = pd.DataFrame(rows, columns=["Title", "URL", "Age (days)", "Labels"])
|
| 619 |
+
|
| 620 |
+
# Sort by age descending
|
| 621 |
+
if "Age (days)" in df.columns and not df.empty:
|
| 622 |
+
df = df.sort_values(by="Age (days)", ascending=False).reset_index(drop=True)
|
| 623 |
+
|
| 624 |
+
return df
|
| 625 |
+
|
| 626 |
+
|
| 627 |
def submit_agent(identifier, agent_name, organization, website):
|
| 628 |
"""
|
| 629 |
Submit a new agent to the leaderboard.
|
|
|
|
| 773 |
)
|
| 774 |
|
| 775 |
|
| 776 |
+
# Issues Wanted Tab
|
| 777 |
+
with gr.Tab("Issues Wanted"):
|
| 778 |
+
gr.Markdown("### Long-Standing Patch-Wanted Issues")
|
| 779 |
+
gr.Markdown(f"*Issues open for {LONGSTANDING_GAP_DAYS}+ days with patch-wanted labels from tracked organizations*")
|
| 780 |
+
|
| 781 |
+
wanted_table = gr.Dataframe(
|
| 782 |
+
value=pd.DataFrame(columns=["Title", "URL", "Age (days)", "Labels"]),
|
| 783 |
+
datatype=["str", "html", "number", "str"],
|
| 784 |
+
interactive=False,
|
| 785 |
+
wrap=True
|
| 786 |
+
)
|
| 787 |
+
|
| 788 |
+
app.load(
|
| 789 |
+
fn=get_wanted_issues_dataframe,
|
| 790 |
+
inputs=[],
|
| 791 |
+
outputs=[wanted_table]
|
| 792 |
+
)
|
| 793 |
+
|
| 794 |
+
|
| 795 |
# Submit Agent Tab
|
| 796 |
with gr.Tab("Submit Agent"):
|
| 797 |
|
msr.py
CHANGED
@@ -25,13 +25,27 @@ load_dotenv()
@@ -509,9 +523,310 @@ def fetch_all_issue_metadata_streaming(conn, identifiers, start_date, end_date):
@@ -571,7 +886,7 @@ def load_agents_from_hf():
@@ -705,12 +1020,21 @@ def calculate_monthly_metrics_by_agent(all_metadata_dict, agents):
@@ -720,18 +1044,22 @@ def construct_leaderboard_from_metadata(all_metadata_dict, agents):
@@ -739,13 +1067,20 @@ def save_leaderboard_data_to_hf(leaderboard_dict, monthly_metrics):
@@ -809,11 +1144,15 @@ def mine_all_agents():
|
|
| 25 |
# CONFIGURATION
|
| 26 |
# =============================================================================
|
| 27 |
|
| 28 |
+
AGENTS_REPO = "SWE-Arena/bot_metadata"
|
| 29 |
+
AGENTS_REPO_LOCAL_PATH = os.path.expanduser("~/bot_metadata") # Local git clone path
|
| 30 |
DUCKDB_CACHE_FILE = "cache.duckdb"
|
| 31 |
GHARCHIVE_DATA_LOCAL_PATH = os.path.expanduser("~/gharchive/data")
|
| 32 |
LEADERBOARD_FILENAME = f"{os.getenv('COMPOSE_PROJECT_NAME')}.json"
|
| 33 |
+
LEADERBOARD_REPO = "SWE-Arena/leaderboard_metadata"
|
| 34 |
LEADERBOARD_TIME_FRAME_DAYS = 180
|
| 35 |
+
LONGSTANDING_GAP_DAYS = 30 # Minimum days for an issue to be considered long-standing
|
| 36 |
+
|
| 37 |
+
# GitHub organizations and repositories to track for wanted issues
|
| 38 |
+
TRACKED_ORGS = [
|
| 39 |
+
"apache",
|
| 40 |
+
"github",
|
| 41 |
+
"huggingface",
|
| 42 |
+
]
|
| 43 |
+
|
| 44 |
+
# Labels that indicate "patch wanted" status
|
| 45 |
+
PATCH_WANTED_LABELS = [
|
| 46 |
+
"bug",
|
| 47 |
+
"enhancement",
|
| 48 |
+
]
|
| 49 |
|
| 50 |
# Git sync configuration (mandatory to get latest bot data)
|
| 51 |
GIT_SYNC_TIMEOUT = 300 # 5 minutes timeout for git pull
|
|
|
|
| 523 |
return dict(metadata_by_agent)
|
| 524 |
|
| 525 |
|
| 526 |
+
def fetch_unified_issue_metadata_streaming(conn, identifiers, start_date, end_date):
|
| 527 |
+
"""
|
| 528 |
+
UNIFIED: Fetch both agent-assigned issues AND wanted issues using streaming batch processing.
|
| 529 |
+
|
| 530 |
+
Tracks TWO types of issues:
|
| 531 |
+
1. Agent-assigned issues: Issues where agents are assigned to or commented on
|
| 532 |
+
2. Wanted issues: Long-standing issues from tracked orgs linked to merged PRs by agents
|
| 533 |
+
|
| 534 |
+
Args:
|
| 535 |
+
conn: DuckDB connection instance
|
| 536 |
+
identifiers: List of GitHub usernames/bot identifiers
|
| 537 |
+
start_date: Start datetime (timezone-aware)
|
| 538 |
+
end_date: End datetime (timezone-aware)
|
| 539 |
+
|
| 540 |
+
Returns:
|
| 541 |
+
Dictionary with three keys:
|
| 542 |
+
- 'agent_issues': {agent_id: [issue_metadata]} for agent-assigned issues
|
| 543 |
+
- 'wanted_open': [open_wanted_issues] for long-standing open issues
|
| 544 |
+
- 'wanted_resolved': {agent_id: [resolved_wanted]} for resolved wanted issues
|
| 545 |
+
"""
|
| 546 |
+
# First, get agent-assigned issues using existing function
|
| 547 |
+
print(f" [1/2] Fetching agent-assigned/commented issues...")
|
| 548 |
+
agent_issues = fetch_all_issue_metadata_streaming(conn, identifiers, start_date, end_date)
|
| 549 |
+
|
| 550 |
+
# Now fetch wanted issues
|
| 551 |
+
print(f"\n [2/2] Fetching wanted issues from tracked orgs...")
|
| 552 |
+
identifier_set = set(identifiers)
|
| 553 |
+
|
| 554 |
+
# Storage for wanted issues
|
| 555 |
+
all_issues = {} # issue_url -> issue_metadata
|
| 556 |
+
issue_to_prs = defaultdict(set) # issue_url -> set of PR URLs
|
| 557 |
+
pr_creators = {} # pr_url -> creator login
|
| 558 |
+
pr_merged_at = {} # pr_url -> merged_at timestamp
|
| 559 |
+
|
| 560 |
+
# Calculate total batches
|
| 561 |
+
total_days = (end_date - start_date).days
|
| 562 |
+
total_batches = (total_days // BATCH_SIZE_DAYS) + 1
|
| 563 |
+
|
| 564 |
+
# Process in batches
|
| 565 |
+
current_date = start_date
|
| 566 |
+
batch_num = 0
|
| 567 |
+
|
| 568 |
+
print(f" Streaming {total_batches} batches for wanted issues...")
|
| 569 |
+
|
| 570 |
+
while current_date <= end_date:
|
| 571 |
+
batch_num += 1
|
| 572 |
+
batch_end = min(current_date + timedelta(days=BATCH_SIZE_DAYS - 1), end_date)
|
| 573 |
+
|
| 574 |
+
# Get file patterns for THIS BATCH ONLY
|
| 575 |
+
file_patterns = generate_file_path_patterns(current_date, batch_end)
|
| 576 |
+
|
| 577 |
+
if not file_patterns:
|
| 578 |
+
print(f" Batch {batch_num}/{total_batches}: {current_date.date()} to {batch_end.date()} - NO DATA")
|
| 579 |
+
current_date = batch_end + timedelta(days=1)
|
| 580 |
+
continue
|
| 581 |
+
|
| 582 |
+
# Progress indicator
|
| 583 |
+
print(f" Batch {batch_num}/{total_batches}: {current_date.date()} to {batch_end.date()} ({len(file_patterns)} files)... ", end="", flush=True)
|
| 584 |
+
|
| 585 |
+
# Build file patterns SQL for THIS BATCH
|
| 586 |
+
file_patterns_sql = '[' + ', '.join([f"'{fp}'" for fp in file_patterns]) + ']'
|
| 587 |
+
|
| 588 |
+
try:
|
| 589 |
+
# Create temp view from file read (done ONCE per batch)
|
| 590 |
+
conn.execute(f"""
|
| 591 |
+
CREATE OR REPLACE TEMP VIEW batch_data AS
|
| 592 |
+
SELECT *
|
| 593 |
+
FROM read_json({file_patterns_sql}, union_by_name=true, filename=true, compression='gzip', format='newline_delimited', ignore_errors=true, maximum_object_size=2147483648)
|
| 594 |
+
""")
|
| 595 |
+
|
| 596 |
+
# Query 1: Fetch all issues (NOT PRs) from tracked orgs
|
| 597 |
+
issue_query = """
|
| 598 |
+
SELECT
|
| 599 |
+
json_extract_string(payload, '$.issue.html_url') as issue_url,
|
| 600 |
+
json_extract_string(repo, '$.name') as repo_name,
|
| 601 |
+
json_extract_string(payload, '$.issue.title') as title,
|
| 602 |
+
json_extract_string(payload, '$.issue.number') as issue_number,
|
| 603 |
+
MIN(json_extract_string(payload, '$.issue.created_at')) as created_at,
|
| 604 |
+
MAX(json_extract_string(payload, '$.issue.closed_at')) as closed_at,
|
| 605 |
+
json_extract(payload, '$.issue.labels') as labels
|
| 606 |
+
FROM batch_data
|
| 607 |
+
WHERE
|
| 608 |
+
type IN ('IssuesEvent', 'IssueCommentEvent')
|
| 609 |
+
AND json_extract_string(payload, '$.issue.pull_request') IS NULL
|
| 610 |
+
AND json_extract_string(payload, '$.issue.html_url') IS NOT NULL
|
| 611 |
+
GROUP BY issue_url, repo_name, title, issue_number, labels
|
| 612 |
+
"""
|
| 613 |
+
|
| 614 |
+
issue_results = conn.execute(issue_query).fetchall()
|
| 615 |
+
|
| 616 |
+
# Filter issues by tracked orgs and collect them
|
| 617 |
+
for row in issue_results:
|
| 618 |
+
issue_url = row[0]
|
| 619 |
+
repo_name = row[1]
|
| 620 |
+
title = row[2]
|
| 621 |
+
issue_number = row[3]
|
| 622 |
+
created_at = row[4]
|
| 623 |
+
closed_at = row[5]
|
| 624 |
+
labels_json = row[6]
|
| 625 |
+
|
| 626 |
+
if not issue_url or not repo_name:
|
| 627 |
+
continue
|
| 628 |
+
|
| 629 |
+
# Extract org from repo_name
|
| 630 |
+
parts = repo_name.split('/')
|
| 631 |
+
if len(parts) != 2:
|
| 632 |
+
continue
|
| 633 |
+
org = parts[0]
|
| 634 |
+
|
| 635 |
+
# Filter by tracked orgs
|
| 636 |
+
if org not in TRACKED_ORGS:
|
| 637 |
+
continue
|
| 638 |
+
|
| 639 |
+
# Parse labels
|
| 640 |
+
try:
|
| 641 |
+
if isinstance(labels_json, str):
|
| 642 |
+
labels_data = json.loads(labels_json)
|
| 643 |
+
else:
|
| 644 |
+
labels_data = labels_json
|
| 645 |
+
|
| 646 |
+
if not isinstance(labels_data, list):
|
| 647 |
+
label_names = []
|
| 648 |
+
else:
|
| 649 |
+
label_names = [label.get('name', '').lower() for label in labels_data if isinstance(label, dict)]
|
| 650 |
+
|
| 651 |
+
except (json.JSONDecodeError, TypeError):
|
| 652 |
+
label_names = []
|
| 653 |
+
|
| 654 |
+
# Determine state
|
| 655 |
+
normalized_closed_at = normalize_date_format(closed_at) if closed_at else None
|
| 656 |
+
state = 'closed' if (normalized_closed_at and normalized_closed_at != 'N/A') else 'open'
|
| 657 |
+
|
| 658 |
+
# Store issue metadata
|
| 659 |
+
all_issues[issue_url] = {
|
| 660 |
+
'url': issue_url,
|
| 661 |
+
'repo': repo_name,
|
| 662 |
+
'title': title,
|
| 663 |
+
'number': issue_number,
|
| 664 |
+
'state': state,
|
| 665 |
+
'created_at': normalize_date_format(created_at),
|
| 666 |
+
'closed_at': normalized_closed_at,
|
| 667 |
+
'labels': label_names
|
| 668 |
+
}
|
| 669 |
+
|
| 670 |
+
# Query 2: Find PRs from both IssueCommentEvent and PullRequestEvent
|
| 671 |
+
pr_query = """
|
| 672 |
+
SELECT DISTINCT
|
| 673 |
+
COALESCE(
|
| 674 |
+
json_extract_string(payload, '$.issue.html_url'),
|
| 675 |
+
json_extract_string(payload, '$.pull_request.html_url')
|
| 676 |
+
) as pr_url,
|
| 677 |
+
COALESCE(
|
| 678 |
+
json_extract_string(payload, '$.issue.user.login'),
|
| 679 |
+
json_extract_string(payload, '$.pull_request.user.login')
|
| 680 |
+
) as pr_creator,
|
| 681 |
+
COALESCE(
|
| 682 |
+
json_extract_string(payload, '$.issue.pull_request.merged_at'),
|
| 683 |
+
json_extract_string(payload, '$.pull_request.merged_at')
|
| 684 |
+
) as merged_at,
|
| 685 |
+
COALESCE(
|
| 686 |
+
json_extract_string(payload, '$.issue.body'),
|
| 687 |
+
json_extract_string(payload, '$.pull_request.body')
|
| 688 |
+
) as pr_body
|
| 689 |
+
FROM batch_data
|
| 690 |
+
WHERE
|
| 691 |
+
(type = 'IssueCommentEvent' AND json_extract_string(payload, '$.issue.pull_request') IS NOT NULL)
|
| 692 |
+
OR type = 'PullRequestEvent'
|
| 693 |
+
"""
|
| 694 |
+
|
| 695 |
+
pr_results = conn.execute(pr_query).fetchall()
|
| 696 |
+
|
| 697 |
+
for row in pr_results:
|
| 698 |
+
pr_url = row[0]
|
| 699 |
+
pr_creator = row[1]
|
| 700 |
+
merged_at = row[2]
|
| 701 |
+
pr_body = row[3]
|
| 702 |
+
|
| 703 |
+
if not pr_url or not pr_creator:
|
| 704 |
+
continue
|
| 705 |
+
|
| 706 |
+
pr_creators[pr_url] = pr_creator
|
| 707 |
+
pr_merged_at[pr_url] = merged_at
|
| 708 |
+
|
| 709 |
+
# Extract linked issues from PR body
|
| 710 |
+
if pr_body:
|
| 711 |
+
# Match issue URLs or #number references
|
| 712 |
+
issue_refs = re.findall(r'(?:https?://github\.com/[\w-]+/[\w-]+/issues/\d+)|(?:#\d+)', pr_body, re.IGNORECASE)
|
| 713 |
+
|
| 714 |
+
for ref in issue_refs:
|
| 715 |
+
# Convert #number to full URL if needed
|
| 716 |
+
if ref.startswith('#'):
|
| 717 |
+
# Extract org/repo from PR URL
|
| 718 |
+
pr_parts = pr_url.split('/')
|
| 719 |
+
if len(pr_parts) >= 5:
|
| 720 |
+
org = pr_parts[-4]
|
| 721 |
+
repo = pr_parts[-3]
|
| 722 |
+
issue_num = ref[1:]
|
| 723 |
+
issue_url = f"https://github.com/{org}/{repo}/issues/{issue_num}"
|
| 724 |
+
issue_to_prs[issue_url].add(pr_url)
|
| 725 |
+
else:
|
| 726 |
+
issue_to_prs[ref].add(pr_url)
|
| 727 |
+
|
| 728 |
+
print(f"✓ {len(issue_results)} issues, {len(pr_results)} PRs")
|
| 729 |
+
|
| 730 |
+
# Clean up temp view after batch processing
|
| 731 |
+
conn.execute("DROP VIEW IF EXISTS batch_data")
|
| 732 |
+
|
| 733 |
+
except Exception as e:
|
| 734 |
+
print(f"\n ✗ Batch {batch_num} error: {str(e)}")
|
| 735 |
+
traceback.print_exc()
|
| 736 |
+
# Clean up temp view even on error
|
| 737 |
+
try:
|
| 738 |
+
conn.execute("DROP VIEW IF EXISTS batch_data")
|
| 739 |
+
except:
|
| 740 |
+
pass
|
| 741 |
+
|
| 742 |
+
# Move to next batch
|
| 743 |
+
current_date = batch_end + timedelta(days=1)
|
| 744 |
+
|
| 745 |
+
# Post-processing: Filter issues and assign to agents
|
| 746 |
+
print(f"\n Post-processing {len(all_issues)} wanted issues...")
|
| 747 |
+
|
| 748 |
+
wanted_open = []
|
| 749 |
+
wanted_resolved = defaultdict(list)
|
| 750 |
+
current_time = datetime.now(timezone.utc)
|
| 751 |
+
|
| 752 |
+
for issue_url, issue_meta in all_issues.items():
|
| 753 |
+
# Check if issue has linked PRs
|
| 754 |
+
linked_prs = issue_to_prs.get(issue_url, set())
|
| 755 |
+
if not linked_prs:
|
| 756 |
+
continue
|
| 757 |
+
|
| 758 |
+
# Check if any linked PR was merged AND created by an agent
|
| 759 |
+
resolved_by = None
|
| 760 |
+
for pr_url in linked_prs:
|
| 761 |
+
merged_at = pr_merged_at.get(pr_url)
|
| 762 |
+
if merged_at: # PR was merged
|
| 763 |
+
pr_creator = pr_creators.get(pr_url)
|
| 764 |
+
if pr_creator in identifier_set:
|
| 765 |
+
resolved_by = pr_creator
|
| 766 |
+
break
|
| 767 |
+
|
| 768 |
+
if not resolved_by:
|
| 769 |
+
continue
|
| 770 |
+
|
| 771 |
+
# Process based on issue state
|
| 772 |
+
if issue_meta['state'] == 'open':
|
| 773 |
+
# For open issues: check if labels match PATCH_WANTED_LABELS
|
| 774 |
+
issue_labels = issue_meta.get('labels', [])
|
| 775 |
+
has_patch_label = False
|
| 776 |
+
for issue_label in issue_labels:
|
| 777 |
+
for wanted_label in PATCH_WANTED_LABELS:
|
| 778 |
+
if wanted_label.lower() in issue_label:
|
| 779 |
+
has_patch_label = True
|
| 780 |
+
break
|
| 781 |
+
if has_patch_label:
|
| 782 |
+
break
|
| 783 |
+
|
| 784 |
+
if not has_patch_label:
|
| 785 |
+
continue
|
| 786 |
+
|
| 787 |
+
# Check if long-standing
|
| 788 |
+
created_at_str = issue_meta.get('created_at')
|
| 789 |
+
if created_at_str and created_at_str != 'N/A':
|
| 790 |
+
try:
|
| 791 |
+
created_dt = datetime.fromisoformat(created_at_str.replace('Z', '+00:00'))
|
| 792 |
+
days_open = (current_time - created_dt).days
|
| 793 |
+
if days_open >= LONGSTANDING_GAP_DAYS:
|
| 794 |
+
wanted_open.append(issue_meta)
|
| 795 |
+
except:
|
| 796 |
+
pass
|
| 797 |
+
|
| 798 |
+
elif issue_meta['state'] == 'closed':
|
| 799 |
+
# For closed issues: must be closed within time frame AND open 30+ days
|
| 800 |
+
closed_at_str = issue_meta.get('closed_at')
|
| 801 |
+
created_at_str = issue_meta.get('created_at')
|
| 802 |
+
|
| 803 |
+
if closed_at_str and closed_at_str != 'N/A' and created_at_str and created_at_str != 'N/A':
|
| 804 |
+
try:
|
| 805 |
+
closed_dt = datetime.fromisoformat(closed_at_str.replace('Z', '+00:00'))
|
| 806 |
+
created_dt = datetime.fromisoformat(created_at_str.replace('Z', '+00:00'))
|
| 807 |
+
|
| 808 |
+
# Calculate how long the issue was open
|
| 809 |
+
days_open = (closed_dt - created_dt).days
|
| 810 |
+
|
| 811 |
+
# Only include if closed within timeframe AND was open 30+ days
|
| 812 |
+
if start_date <= closed_dt <= end_date and days_open >= LONGSTANDING_GAP_DAYS:
|
| 813 |
+
wanted_resolved[resolved_by].append(issue_meta)
|
| 814 |
+
except:
|
| 815 |
+
pass
|
| 816 |
+
|
| 817 |
+
print(f" ✓ Found {len(wanted_open)} long-standing open wanted issues")
|
| 818 |
+
print(f" ✓ Found {sum(len(issues) for issues in wanted_resolved.values())} resolved wanted issues across {len(wanted_resolved)} agents")
|
| 819 |
+
|
| 820 |
+
return {
|
| 821 |
+
'agent_issues': agent_issues,
|
| 822 |
+
'wanted_open': wanted_open,
|
| 823 |
+
'wanted_resolved': dict(wanted_resolved)
|
| 824 |
+
}
|
| 825 |
+
|
| 826 |
+
|
| 827 |
def sync_agents_repo():
|
| 828 |
"""
|
| 829 |
+
Sync local bot_metadata repository with remote using git pull.
|
| 830 |
This is MANDATORY to ensure we have the latest bot data.
|
| 831 |
Raises exception if sync fails.
|
| 832 |
"""
|
|
|
|
| 886 |
ALWAYS syncs with remote first to ensure we have the latest bot data.
|
| 887 |
"""
|
| 888 |
# MANDATORY: Sync with remote first to get latest bot data
|
| 889 |
+
print(f" Syncing bot_metadata repository to get latest agents...")
|
| 890 |
sync_agents_repo() # Will raise exception if sync fails
|
| 891 |
|
| 892 |
agents = []
|
|
|
|
| 1020 |
}
|
| 1021 |
|
| 1022 |
|
| 1023 |
+
def construct_leaderboard_from_metadata(all_metadata_dict, agents, wanted_resolved_dict=None):
|
| 1024 |
+
"""Construct leaderboard from in-memory issue metadata.
|
| 1025 |
+
|
| 1026 |
+
Args:
|
| 1027 |
+
all_metadata_dict: Dictionary mapping agent ID to list of issue metadata (agent-assigned issues)
|
| 1028 |
+
agents: List of agent metadata
|
| 1029 |
+
wanted_resolved_dict: Optional dictionary mapping agent ID to list of resolved wanted issues
|
| 1030 |
+
"""
|
| 1031 |
if not agents:
|
| 1032 |
print("Error: No agents found")
|
| 1033 |
return {}
|
| 1034 |
|
| 1035 |
+
if wanted_resolved_dict is None:
|
| 1036 |
+
wanted_resolved_dict = {}
|
| 1037 |
+
|
| 1038 |
cache_dict = {}
|
| 1039 |
|
| 1040 |
for agent in agents:
|
|
|
|
| 1044 |
bot_metadata = all_metadata_dict.get(identifier, [])
|
| 1045 |
stats = calculate_issue_stats_from_metadata(bot_metadata)
|
| 1046 |
|
| 1047 |
+
# Add wanted issues count
|
| 1048 |
+
resolved_wanted = len(wanted_resolved_dict.get(identifier, []))
|
| 1049 |
+
|
| 1050 |
cache_dict[identifier] = {
|
| 1051 |
'name': agent_name,
|
| 1052 |
'website': agent.get('website', 'N/A'),
|
| 1053 |
'github_identifier': identifier,
|
| 1054 |
+
**stats,
|
| 1055 |
+
'resolved_wanted_issues': resolved_wanted
|
| 1056 |
}
|
| 1057 |
|
| 1058 |
return cache_dict
|
| 1059 |
|
| 1060 |
|
| 1061 |
+
def save_leaderboard_data_to_hf(leaderboard_dict, monthly_metrics, wanted_issues=None):
|
| 1062 |
+
"""Save leaderboard data, monthly metrics, and wanted issues to HuggingFace dataset."""
|
| 1063 |
try:
|
| 1064 |
token = get_hf_token()
|
| 1065 |
if not token:
|
|
|
|
| 1067 |
|
| 1068 |
api = HfApi(token=token)
|
| 1069 |
|
| 1070 |
+
if wanted_issues is None:
|
| 1071 |
+
wanted_issues = []
|
| 1072 |
+
|
| 1073 |
combined_data = {
|
| 1074 |
+
'metadata': {
|
| 1075 |
+
'last_updated': datetime.now(timezone.utc).isoformat(),
|
| 1076 |
+
'leaderboard_time_frame_days': LEADERBOARD_TIME_FRAME_DAYS,
|
| 1077 |
+
'longstanding_gap_days': LONGSTANDING_GAP_DAYS,
|
| 1078 |
+
'tracked_orgs': TRACKED_ORGS,
|
| 1079 |
+
'patch_wanted_labels': PATCH_WANTED_LABELS
|
| 1080 |
+
},
|
| 1081 |
'leaderboard': leaderboard_dict,
|
| 1082 |
'monthly_metrics': monthly_metrics,
|
| 1083 |
+
'wanted_issues': wanted_issues
|
|
|
|
|
|
|
| 1084 |
}
|
| 1085 |
|
| 1086 |
with open(LEADERBOARD_FILENAME, 'w') as f:
|
|
|
|
| 1144 |
start_date = end_date - timedelta(days=LEADERBOARD_TIME_FRAME_DAYS)
|
| 1145 |
|
| 1146 |
try:
|
| 1147 |
+
# USE UNIFIED STREAMING FUNCTION FOR BOTH ISSUE TYPES
|
| 1148 |
+
results = fetch_unified_issue_metadata_streaming(
|
| 1149 |
conn, identifiers, start_date, end_date
|
| 1150 |
)
|
| 1151 |
|
| 1152 |
+
agent_issues = results['agent_issues']
|
| 1153 |
+
wanted_open = results['wanted_open']
|
| 1154 |
+
wanted_resolved = results['wanted_resolved']
|
| 1155 |
+
|
| 1156 |
except Exception as e:
|
| 1157 |
print(f"Error during DuckDB fetch: {str(e)}")
|
| 1158 |
traceback.print_exc()
|
|
|
|
| 1163 |
print(f"\n[4/4] Saving leaderboard...")
|
| 1164 |
|
| 1165 |
try:
|
| 1166 |
+
leaderboard_dict = construct_leaderboard_from_metadata(agent_issues, agents, wanted_resolved)
|
| 1167 |
+
monthly_metrics = calculate_monthly_metrics_by_agent(agent_issues, agents)
|
| 1168 |
+
save_leaderboard_data_to_hf(leaderboard_dict, monthly_metrics, wanted_open)
|
| 1169 |
|
| 1170 |
except Exception as e:
|
| 1171 |
print(f"Error saving leaderboard: {str(e)}")
|