fix: P0 provider mismatch and code quality audit fixes (#102)
* refactor: replace hacky try/except with proper pydantic validation
PROBLEM:
- CodeRabbit recommended try/except for ADVANCED_MAX_ROUNDS parsing
- This is hacky defensive programming, not Clean Code
- Silent fallbacks mask configuration errors
SOLUTION (Uncle Bob approved):
1. Add advanced_max_rounds to Settings with pydantic Field validation:
- ge=1, le=20 bounds checking
- Fails fast at startup with clear error message
2. Add advanced_timeout to Settings (60-900 seconds)
3. Remove os.getenv + try/except hack from advanced.py
- Now uses settings.advanced_max_rounds directly
4. Fix domain.py: invalid domain strings now raise ValueError
- Shows valid options in error message
- Previously silently fell back to default
5. Update tests: 18 tests verifying fail-fast behavior
- TestSettingsValidation: type errors, bounds violations all raise
- TestGetDomainConfig: invalid strings raise with helpful message
PHILOSOPHY:
"If someone configures your app wrong, tell them loudly.
Don't pretend everything is fine." - Uncle Bob
* fix: P0 provider mismatch and code quality audit fixes
CRITICAL BUG FIX:
- get_model() now auto-detects available providers (OpenAI > Anthropic > HuggingFace)
- Raises clear ConfigurationError when no API keys configured
- Free Tier synthesis properly falls back to template when no HF_TOKEN
CODE QUALITY FIXES (from audit):
- Replace manual os.getenv with centralized settings properties (app.py)
- Add logging to p-value parsing (statistical_analyzer.py) - fixes silent pass
- Narrow exception handling to specific errors (pubmed.py)
- Use find() instead of try/except for string search (code_execution.py)
- Use centralized settings for Modal credentials (code_execution.py)
TESTS:
- Add 5 new TDD tests for get_model() auto-detection
- Fix regression tests for updated settings usage
- All 309 tests pass
DOCUMENTATION:
- Add P0_SYNTHESIS_PROVIDER_MISMATCH.md with full fix documentation
- Add AUDIT_FINDINGS_2025_11_30.md code quality audit
- Update ACTIVE_BUGS.md index
* test: add explicit has_huggingface_key per CodeRabbit review
- docs/bugs/ACTIVE_BUGS.md +12 -1
- docs/bugs/AUDIT_FINDINGS_2025_11_30.md +70 -0
- docs/bugs/P0_SYNTHESIS_PROVIDER_MISMATCH.md +273 -0
- src/agent_factory/judges.py +25 -12
- src/app.py +4 -4
- src/config/domain.py +3 -2
- src/orchestrators/advanced.py +5 -22
- src/services/statistical_analyzer.py +5 -1
- src/tools/code_execution.py +13 -9
- src/tools/pubmed.py +6 -2
- src/utils/config.py +13 -1
- tests/unit/agent_factory/test_get_model_auto_detect.py +59 -0
- tests/unit/agent_factory/test_judges_factory.py +7 -0
- tests/unit/config/test_domain.py +8 -4
- tests/unit/orchestrators/test_advanced_orchestrator.py +60 -49
- tests/unit/test_app_domain.py +14 -2
docs/bugs/ACTIVE_BUGS.md (+12 -1):

```diff
@@ -3,10 +3,11 @@
 > Last updated: 2025-11-30
 >
 > **Note:** Completed bug docs archived to `docs/bugs/archive/`
+> **See also:** [Code Quality Audit Findings (2025-11-30)](AUDIT_FINDINGS_2025_11_30.md)
 
 ## P0 - Blocker
 
-
+(None)
 
 ---
 
@@ -56,6 +57,16 @@
 
 ## Resolved Bugs
 
+### ~~P0 - Synthesis Fails with OpenAIError in Free Mode~~ FIXED
+**File:** `docs/bugs/P0_SYNTHESIS_PROVIDER_MISMATCH.md`
+**Found:** 2025-11-30 (Code Audit)
+**Resolved:** 2025-11-30
+
+- Problem: "Simple Mode" (Free Tier) crashed with `OpenAIError`.
+- Root Cause: `get_model()` defaulted to OpenAI regardless of available keys.
+- Fix: Implemented auto-detection in `judges.py` (OpenAI > Anthropic > HuggingFace).
+- Added extensive unit tests and regression tests.
+
 ### ~~P0 - Simple Mode Never Synthesizes~~ FIXED
 **PR:** [#71](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/71) (SPEC_06)
 **Commit**: `5cac97d` (2025-11-29)
```
docs/bugs/AUDIT_FINDINGS_2025_11_30.md (new file, +70):

# Code Quality Audit Findings - 2025-11-30

**Auditor:** Senior Staff Engineer (Gemini)
**Date:** 2025-11-30
**Scope:** `src/` (services, tools, agents, orchestrators)
**Focus:** Configuration validation, error handling, defensive programming anti-patterns

## Summary

The codebase is generally clean and modern, but exhibits specific anti-patterns related to configuration management and defensive error handling. The most critical finding is the reliance on manual `os.getenv` calls and "silent default" fallbacks, which obscure configuration errors and directly contributed to the `OpenAIError` observed in production.

## Findings

### 1. Defensive Pass Block (Silent Failure) - MEDIUM

**File:** `src/services/statistical_analyzer.py:246-247`

```python
try:
    min_p = min(float(p) for p in p_values)
    # ... logic ...
except ValueError:
    pass
```

**Problem:** If p-values are found by regex but fail to parse, the error is swallowed silently. This makes debugging parser issues impossible.
**Fix:** Replace `pass` with `logger.warning("Failed to parse p-values: %s", p_values)` to aid debugging.

### 2. Missing Pydantic Validation (Manual Config) - MEDIUM

**File:** `src/tools/code_execution.py:75-76`

```python
self.modal_token_id = os.getenv("MODAL_TOKEN_ID")
self.modal_token_secret = os.getenv("MODAL_TOKEN_SECRET")
```

**Problem:** Secrets are manually fetched from env vars, bypassing the centralized `Settings` validation.
**Fix:** Move to `src/utils/config.py` in the `Settings` class and inject `settings` into `ModalCodeExecutor`.
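A minimal sketch of the intended `Settings` additions (the field names `modal_token_id` / `modal_token_secret` appear in the merged `code_execution.py` diff further down; the exact `Field` definitions here are assumptions):

```python
# Sketch only - assumed shape of the additions to src/utils/config.py.
# pydantic-settings maps these fields to the MODAL_TOKEN_ID / MODAL_TOKEN_SECRET
# env vars by default (case-insensitive field-name matching).
from pydantic import Field
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    modal_token_id: str | None = Field(default=None, description="Modal token ID")
    modal_token_secret: str | None = Field(default=None, description="Modal token secret")
```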
### 3. Broad Exception Swallowing - MEDIUM

**File:** `src/tools/pubmed.py:129-130`

```python
except Exception:
    continue  # Skip malformed articles
```

**Problem:** Catching `Exception` hides potential bugs (like `NameError` or `TypeError` in our own code), not just malformed data.
**Fix:** Catch specific exceptions (e.g., `(KeyError, AttributeError, TypeError)`) OR log the error before continuing: `logger.debug(f"Skipping malformed article {pmid}: {e}")`.

### 4. Missing Pydantic Validation (UI Layer) - LOW

**File:** `src/app.py:115, 119`

```python
elif os.getenv("OPENAI_API_KEY"):
    # ...
elif os.getenv("ANTHROPIC_API_KEY"):
```

**Problem:** Application logic relies on raw environment variable checks to determine available backends, creating duplication and potential inconsistency with `config.py`.
**Fix:** Centralize this logic in `src/utils/config.py` (e.g., `settings.has_openai`, `settings.has_anthropic`).
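The merged code settled on `has_openai_key`, `has_anthropic_key`, and `has_huggingface_key` (the names used throughout the diffs below). A minimal sketch, assuming plain `bool()` checks over the stored keys; only the property names are confirmed by the diff:

```python
# Sketch only - the property names match the merged code; the bodies are assumptions.
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    openai_api_key: str | None = None
    anthropic_api_key: str | None = None
    hf_token: str | None = None

    @property
    def has_openai_key(self) -> bool:
        return bool(self.openai_api_key)

    @property
    def has_anthropic_key(self) -> bool:
        return bool(self.anthropic_api_key)

    @property
    def has_huggingface_key(self) -> bool:
        return bool(self.hf_token)
```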
### 5. Try/Except for Flow Control - LOW

**File:** `src/tools/code_execution.py:244-249`

```python
try:
    start_idx = text.index(start_marker) + len(start_marker)
    # ...
except ValueError:
    return text.strip()
```

**Problem:** Using exceptions for expected "not found" cases is slower and less explicit.
**Fix:** Use `find()`, which returns `-1` on failure.

## Action Plan

1. **Refactor Configuration:** Eliminate `os.getenv` in favor of the `src/utils/config.py` `Settings` model.
2. **Fix Error Handling:** Remove empty `pass` blocks; add logging.
3. **Address P0 Bug:** Fix the `OpenAIError` in synthesis (caused by Finding #4 / the general config issue) by injecting the correct model into the orchestrator.
docs/bugs/P0_SYNTHESIS_PROVIDER_MISMATCH.md (new file, +273):

# P0 - Systemic Provider Mismatch Across All Modes

**Status:** RESOLVED
**Priority:** P0 (Blocker for Free Tier/Demo)
**Found:** 2025-11-30 (during Audit)
**Resolved:** 2025-11-30
**Component:** Multiple files across orchestrators, agents, services

## Resolution Summary

The critical provider mismatch bug has been fixed by implementing auto-detection in `src/agent_factory/judges.py`. The `get_model()` function now checks for actual API key availability (`has_openai_key`, `has_anthropic_key`, `has_huggingface_key`) instead of relying on the static `settings.llm_provider` configuration.

### Fix Details

- **Auto-Detection Implemented**: `get_model()` prioritizes OpenAI > Anthropic > HuggingFace based on *available keys*.
- **Fail-Fast on No Keys**: If no API keys are configured, `get_model()` raises `ConfigurationError` with a clear message.
- **HuggingFace Requires Token**: Free Tier via `HuggingFaceModel` requires `HF_TOKEN` (PydanticAI requirement).
- **Synthesis Fallback**: When `get_model()` fails, synthesis gracefully falls back to the template.
- **Audit Fixes Applied**:
  - Replaced manual `os.getenv` checks with centralized `settings` properties in `src/app.py`.
  - Added logging to `src/services/statistical_analyzer.py` (fixed silent `pass`).
  - Narrowed exception handling in `src/tools/pubmed.py`.
  - Optimized string search in `src/tools/code_execution.py`.

### Key Clarification

The **Free Tier** in Simple Mode uses `HFInferenceJudgeHandler` (which uses `huggingface_hub.InferenceClient`) for judging - this does NOT require `HF_TOKEN`. However, synthesis via `get_model()` uses PydanticAI's `HuggingFaceModel`, which DOES require `HF_TOKEN`. When no tokens are configured, synthesis falls back to the template-based summary (which is still useful).
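An illustrative sketch of that fallback path (not the exact `simple.py` code; `render_template_summary` and `run_llm_synthesis` are hypothetical stand-ins for the real synthesis step):

```python
# Illustrative only: synthesis attempts get_model() and degrades to the
# structured template when no provider key is configured.
from src.agent_factory.judges import get_model
from src.utils.exceptions import ConfigurationError


def render_template_summary(findings: str) -> str:
    # Hypothetical stand-in for the structured summary template.
    return f"Structured summary:\n{findings}"


def run_llm_synthesis(model: object, findings: str) -> str:
    # Hypothetical stand-in for the PydanticAI Agent synthesis call.
    return f"(LLM narrative over {len(findings)} chars of findings)"


def synthesize(findings: str) -> str:
    try:
        model = get_model()  # raises ConfigurationError when no keys are set
    except ConfigurationError:
        return render_template_summary(findings)  # template fallback
    return run_llm_synthesis(model, findings)  # LLM-backed path
```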
### Verification

- **Unit Tests**: 5 new TDD tests in `tests/unit/agent_factory/test_get_model_auto_detect.py` pass.
- **All Tests**: 309 tests pass (`make check` succeeds).
- **Regression Tests**: Fixed and verified `tests/unit/agent_factory/test_judges_factory.py`.

---

## Symptom (Archive)

When running in "Simple Mode" (Free Tier / No API Key), the synthesis step fails to generate a narrative and falls back to a structured summary template. The user sees:

```text
> ⚠️ Note: AI narrative synthesis unavailable. Showing structured summary.
> _Error: OpenAIError_
```

## Affected Files (COMPREHENSIVE AUDIT)

### Files Calling `get_model()` Directly (9 locations)

| File | Line | Context | Impact |
|------|------|---------|--------|
| `simple.py` | 547 | Synthesis step | Free Tier broken |
| `statistical_analyzer.py` | 75 | Analysis agent | Free Tier broken |
| `judge_agent_llm.py` | 18 | LLM Judge | Free Tier broken |
| `graph/nodes.py` | 177 | LangGraph hypothesis | Free Tier broken |
| `graph/nodes.py` | 249 | LangGraph synthesis | Free Tier broken |
| `report_agent.py` | 45 | Report generation | Free Tier broken |
| `hypothesis_agent.py` | 44 | Hypothesis generation | Free Tier broken |
| `judges.py` | 100 | JudgeHandler default | OK (accepts param) |

### Files Hardcoding `OpenAIChatClient` (Architecturally OpenAI-Only)

| File | Lines | Context |
|------|-------|---------|
| `advanced.py` | 100, 121 | Manager client |
| `magentic_agents.py` | 29, 70, 129, 173 | All 4 agents |
| `retrieval_agent.py` | 62 | Retrieval agent |
| `code_executor_agent.py` | 52 | Code executor |
| `llm_factory.py` | 42 | Factory default |

**Note:** Advanced mode is architecturally locked to OpenAI via `agent_framework.openai.OpenAIChatClient`. This is by design - see `app.py:188-194`, which falls back to Simple mode if no OpenAI key is present. However, users are not clearly informed of this limitation.

## Root Cause

**Settings/Runtime Sync Gap - Two Separate Backend Selection Systems.**

The codebase has **two independent** systems for selecting the LLM backend:

1. `settings.llm_provider` (config.py default: "openai")
2. `app.py` runtime detection via `os.getenv()` checks

These are **never synchronized**, causing the Judge and Synthesis steps to use different backends.

### Detailed Call Chain

1. **`src/app.py:115-126`** (runtime detection):

```python
# app.py bypasses settings entirely for JudgeHandler selection
elif os.getenv("OPENAI_API_KEY"):
    judge_handler = JudgeHandler(model=None, domain=domain)
elif os.getenv("ANTHROPIC_API_KEY"):
    judge_handler = JudgeHandler(model=None, domain=domain)
else:
    judge_handler = HFInferenceJudgeHandler(domain=domain)  # Free Tier
```

**Note:** This creates the correct handler but does NOT update `settings.llm_provider`.

2. **`src/orchestrators/simple.py:546-552`** (synthesis step):

```python
from src.agent_factory.judges import get_model
agent: Agent[None, str] = Agent(model=get_model(), ...)  # <-- BUG!
```

Synthesis calls `get_model()` directly instead of using the injected judge's model.

3. **`src/agent_factory/judges.py:56-78`** (`get_model()`):

```python
def get_model() -> Any:
    llm_provider = settings.llm_provider  # <-- Reads from settings (still "openai")
    # ...
    openai_provider = OpenAIProvider(api_key=settings.openai_api_key)  # <-- None!
    return OpenAIChatModel(settings.openai_model, provider=openai_provider)
```

**Result:** Creates an OpenAI model with `api_key=None` → `OpenAIError`

### Why Free Tier Fails

| Step | System Used | Backend Selected |
|------|-------------|------------------|
| JudgeHandler | `app.py` runtime | HFInferenceJudgeHandler ✅ |
| Synthesis | `settings.llm_provider` | OpenAI (default) ❌ |

The Judge works because app.py explicitly creates `HFInferenceJudgeHandler`. Synthesis fails because it calls `get_model()`, which reads `settings.llm_provider = "openai"` (unchanged from the default).

## Impact

- **User Experience:** Free tier users (Demo users) never see the high-quality narrative synthesis, only the fallback.
- **System Integrity:** The orchestrator ignores the runtime backend selection.

## Implemented Fix

**Strategy: Fix `get_model()` to Auto-Detect Available Provider**

### Actual Implementation (Merged)

**File:** `src/agent_factory/judges.py`

This is the **single point of fix** that resolves all 7 broken `get_model()` call sites.

```python
def get_model() -> Any:
    """Get the LLM model based on available API keys.

    Priority order:
    1. OpenAI (if OPENAI_API_KEY set)
    2. Anthropic (if ANTHROPIC_API_KEY set)
    3. HuggingFace (if HF_TOKEN set)

    Raises:
        ConfigurationError: If no API keys are configured.

    Note: settings.llm_provider is ignored in favor of actual key availability.
    This ensures the model matches what app.py selected for JudgeHandler.
    """
    from src.utils.exceptions import ConfigurationError

    # Priority 1: OpenAI (most common, best tool calling)
    if settings.has_openai_key:
        openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
        return OpenAIChatModel(settings.openai_model, provider=openai_provider)

    # Priority 2: Anthropic
    if settings.has_anthropic_key:
        provider = AnthropicProvider(api_key=settings.anthropic_api_key)
        return AnthropicModel(settings.anthropic_model, provider=provider)

    # Priority 3: HuggingFace (requires HF_TOKEN)
    if settings.has_huggingface_key:
        model_name = settings.huggingface_model or "meta-llama/Llama-3.1-70B-Instruct"
        hf_provider = HuggingFaceProvider(api_key=settings.hf_token)
        return HuggingFaceModel(model_name, provider=hf_provider)

    # No keys configured - fail fast with clear error
    raise ConfigurationError(
        "No LLM API key configured. Set one of: OPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN"
    )
```

**Why this works:**

- Single fix location updates all 7 broken call sites
- Matches app.py's detection logic (key availability, not `settings.llm_provider`)
- HuggingFace works when `HF_TOKEN` is available
- Raises a clear error when no keys are configured (callers can catch and fall back)
- No changes needed to orchestrators, agents, or services

### What This Does NOT Fix (By Design)

**Advanced Mode remains OpenAI-only.** The following files use `agent_framework.openai.OpenAIChatClient`, which only supports OpenAI:

- `advanced.py` (Manager + agents)
- `magentic_agents.py` (SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent)
- `retrieval_agent.py`, `code_executor_agent.py`

This is **by design** - the Microsoft Agent Framework library (`agent-framework-core`) only provides `OpenAIChatClient`. Supporting other providers in Advanced mode would require either:

1. Waiting for `agent-framework` to add Anthropic/HuggingFace clients, OR
2. Writing our own `ChatClient` implementations (significant effort)

**The current app.py behavior is correct:** it falls back to Simple mode when no OpenAI key is present (lines 188-194). The UI message could be clearer about why.

## Test Plan (Implemented)

### Unit Tests (Verified Passing)

```python
# tests/unit/agent_factory/test_get_model_auto_detect.py

import pytest
from src.agent_factory.judges import get_model
from src.utils.config import settings
from src.utils.exceptions import ConfigurationError

class TestGetModelAutoDetect:
    """Test that get_model() auto-detects available providers."""

    def test_returns_openai_when_key_present(self, monkeypatch):
        """OpenAI key present → OpenAI model."""
        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", None)
        model = get_model()
        assert isinstance(model, OpenAIChatModel)

    def test_returns_anthropic_when_only_anthropic_key(self, monkeypatch):
        """Only Anthropic key → Anthropic model."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
        monkeypatch.setattr(settings, "hf_token", None)
        model = get_model()
        assert isinstance(model, AnthropicModel)

    def test_returns_huggingface_when_hf_token_present(self, monkeypatch):
        """HF_TOKEN present (no paid keys) → HuggingFace model."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", "hf_test_token")
        model = get_model()
        assert isinstance(model, HuggingFaceModel)

    def test_raises_error_when_no_keys(self, monkeypatch):
        """No keys at all → ConfigurationError."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", None)
        with pytest.raises(ConfigurationError) as exc_info:
            get_model()
        assert "No LLM API key configured" in str(exc_info.value)

    def test_openai_takes_priority_over_anthropic(self, monkeypatch):
        """Both keys present → OpenAI wins."""
        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
        model = get_model()
        assert isinstance(model, OpenAIChatModel)
```

### Full Test Suite

```bash
$ make check
# 309 passed in 238.16s (0:03:58)
# All checks passed!
```

### Manual Verification

1. **Unset all API keys**: `unset OPENAI_API_KEY ANTHROPIC_API_KEY HF_TOKEN`
2. **Run app**: `uv run python -m src.app`
3. **Submit query**: "What drugs improve female libido?"
4. **Verify**: Synthesis falls back to the template (shows `ConfigurationError` in logs, but the user sees the structured summary)
src/agent_factory/judges.py (+25 -12): `get_model()` is rewritten. The old body switched on `settings.llm_provider` and fell through to OpenAI with a possibly-`None` API key (see Root Cause above); the merged replacement is the auto-detecting implementation quoted verbatim in `P0_SYNTHESIS_PROVIDER_MISMATCH.md` above (priority OpenAI > Anthropic > HuggingFace, `ConfigurationError` when no keys are configured). The surrounding `JudgeHandler` class is unchanged.
src/app.py (+4 -4):

```diff
@@ -112,11 +112,11 @@ def configure_orchestrator(
         judge_handler = JudgeHandler(model=model, domain=domain)
 
     # 3. Environment API Keys (fallback)
-    elif os.getenv("OPENAI_API_KEY"):
+    elif settings.has_openai_key:
         judge_handler = JudgeHandler(model=None, domain=domain)  # Uses env key
         backend_info = "Paid API (OpenAI from env)"
 
-    elif os.getenv("ANTHROPIC_API_KEY"):
+    elif settings.has_anthropic_key:
         judge_handler = JudgeHandler(model=None, domain=domain)  # Uses env key
         backend_info = "Paid API (Anthropic from env)"
 
@@ -177,8 +177,8 @@ async def research_agent(
     user_api_key = (api_key_str.strip() or api_key_state_str.strip()) or None
 
     # Check available keys
-    has_openai =
-    has_anthropic =
+    has_openai = settings.has_openai_key
+    has_anthropic = settings.has_anthropic_key
     # Check for OpenAI user key
     is_openai_user_key = (
         user_api_key and user_api_key.startswith("sk-") and not user_api_key.startswith("sk-ant-")
```
src/config/domain.py (+3 -2):

```diff
@@ -122,7 +122,8 @@ def get_domain_config(domain: ResearchDomain | str | None = None) -> DomainConfi
     if isinstance(domain, str):
         try:
             domain = ResearchDomain(domain)
-        except ValueError:
+        except ValueError as e:
+            valid_domains = [d.value for d in ResearchDomain]
+            raise ValueError(f"Invalid domain '{domain}'. Valid domains: {valid_domains}") from e
 
     return DOMAIN_CONFIGS[domain]
```
src/orchestrators/advanced.py (+5 -22):

```diff
@@ -15,7 +15,6 @@ Design Patterns:
 """
 
 import asyncio
-import os
 from collections.abc import AsyncGenerator
 from typing import TYPE_CHECKING, Any
 
@@ -85,27 +84,11 @@ class AdvancedOrchestrator(OrchestratorProtocol):
         if not chat_client and not api_key:
             check_magentic_requirements()
 
-            logger.warning(
-                "Invalid ADVANCED_MAX_ROUNDS value %r, falling back to 5",
-                raw_rounds,
-            )
-            env_rounds = 5
-
-        if env_rounds < 1:
-            logger.warning(
-                "ADVANCED_MAX_ROUNDS must be >= 1, got %d; using 1 instead",
-                env_rounds,
-            )
-            env_rounds = 1
-
-        self._max_rounds = max_rounds if max_rounds is not None else env_rounds
-        self._timeout_seconds = timeout_seconds
+        # Use pydantic-validated settings (fails fast on invalid config)
+        self._max_rounds = max_rounds if max_rounds is not None else settings.advanced_max_rounds
+        self._timeout_seconds = (
+            timeout_seconds if timeout_seconds != 300.0 else settings.advanced_timeout
+        )
         self.domain = domain
         self.domain_config = get_domain_config(domain)
         self._chat_client: OpenAIChatClient | None
```
src/services/statistical_analyzer.py (+5 -1):

```diff
@@ -12,6 +12,8 @@ import re
 from functools import lru_cache, partial
 from typing import Any, Literal
 
+import structlog
+
 # Type alias for verdict values
 VerdictType = Literal["SUPPORTED", "REFUTED", "INCONCLUSIVE"]
 
@@ -26,6 +28,8 @@ from src.tools.code_execution import (
 )
 from src.utils.models import Evidence
 
+logger = structlog.get_logger()
+
 
 class AnalysisResult(BaseModel):
     """Result of statistical analysis."""
 
@@ -244,7 +248,7 @@ Generate executable Python code to analyze this evidence."""
             else:
                 return 0.60
         except ValueError:
-            pass
+            logger.debug("Failed to parse p-values", p_values=p_values)
 
         return 0.70  # Default
```
src/tools/code_execution.py (+13 -9):

```diff
@@ -4,12 +4,13 @@ This module provides sandboxed Python code execution using Modal's serverless in
 It's designed for running LLM-generated statistical analysis code safely.
 """
 
-import os
 from functools import lru_cache
 from typing import Any
 
 import structlog
 
+from src.utils.config import settings
+
 logger = structlog.get_logger(__name__)
 
 # Shared library versions for Modal sandbox - used by both executor and LLM prompts
 
@@ -72,8 +73,8 @@ class ModalCodeExecutor:
         Execution will fail at runtime without valid credentials.
         """
         # Check for Modal credentials
-        self.modal_token_id = os.getenv("MODAL_TOKEN_ID")
-        self.modal_token_secret = os.getenv("MODAL_TOKEN_SECRET")
+        self.modal_token_id = settings.modal_token_id
+        self.modal_token_secret = settings.modal_token_secret
 
         if not self.modal_token_id or not self.modal_token_secret:
             logger.warning(
 
@@ -241,13 +242,16 @@ print(json.dumps({{"__RESULT__": result}}))
 
     def _extract_output(self, text: str, start_marker: str, end_marker: str) -> str:
         """Extract content between markers."""
-        try:
-            start_idx = text.index(start_marker) + len(start_marker)
-            end_idx = text.index(end_marker)
-            return text[start_idx:end_idx].strip()
-        except ValueError:
-            # Markers not found, return original text
-            return text.strip()
+        start_idx = text.find(start_marker)
+        if start_idx == -1:
+            return text.strip()
+        start_idx += len(start_marker)
+
+        end_idx = text.find(end_marker, start_idx)
+        if end_idx == -1:
+            return text.strip()
+
+        return text[start_idx:end_idx].strip()
 
 
 @lru_cache(maxsize=1)
```
src/tools/pubmed.py (+6 -2):

```diff
@@ -3,6 +3,7 @@
 from typing import Any
 
 import httpx
+import structlog
 import xmltodict
 from tenacity import retry, stop_after_attempt, wait_exponential
 
@@ -12,6 +13,8 @@ from src.utils.config import settings
 from src.utils.exceptions import RateLimitError, SearchError
 from src.utils.models import Citation, Evidence
 
+logger = structlog.get_logger()
+
 
 class PubMedTool:
     """Search tool for PubMed/NCBI."""
 
@@ -126,8 +129,9 @@ class PubMedTool:
                 evidence = self._article_to_evidence(article)
                 if evidence:
                     evidence_list.append(evidence)
-            except Exception:
-                continue  # Skip malformed articles
+            except (KeyError, AttributeError, TypeError) as e:
+                logger.debug("Skipping malformed article", error=str(e))
+                continue
 
         return evidence_list
```
src/utils/config.py (+13 -1):

```diff
@@ -60,10 +60,22 @@ class Settings(BaseSettings):
 
     # Agent Configuration
     max_iterations: int = Field(default=10, ge=1, le=50)
+    advanced_max_rounds: int = Field(
+        default=5,
+        ge=1,
+        le=20,
+        description="Max coordination rounds for Advanced mode (default 5 for faster demos)",
+    )
+    advanced_timeout: float = Field(
+        default=300.0,
+        ge=60.0,
+        le=900.0,
+        description="Timeout for Advanced mode in seconds (default 5 min)",
+    )
     search_timeout: int = Field(default=30, description="Seconds to wait for search")
     magentic_timeout: int = Field(
         default=600,
-        description="Timeout for Magentic mode in seconds",
+        description="Timeout for Magentic mode in seconds (deprecated, use advanced_timeout)",
     )
 
     # Logging
```
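With these fields in place, a bad environment value now fails at `Settings()` construction instead of being silently clamped. A minimal sketch, assuming pydantic-settings' default env mapping (case-insensitive field name, no `env_prefix` configured):

```python
# Sketch only - demonstrates the fail-fast behavior; the env mapping is an assumption.
import os

import pytest
from pydantic import ValidationError

from src.utils.config import Settings

os.environ["ADVANCED_MAX_ROUNDS"] = "0"  # out of bounds (ge=1)
with pytest.raises(ValidationError):
    Settings()  # startup now fails loudly instead of clamping to 1
```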
tests/unit/agent_factory/test_get_model_auto_detect.py (new file, +59):

```python
import pytest
from pydantic_ai.models.anthropic import AnthropicModel
from pydantic_ai.models.huggingface import HuggingFaceModel
from pydantic_ai.models.openai import OpenAIChatModel

from src.agent_factory.judges import get_model
from src.utils.config import settings
from src.utils.exceptions import ConfigurationError


class TestGetModelAutoDetect:
    """Test that get_model() auto-detects available providers."""

    def test_returns_openai_when_key_present(self, monkeypatch):
        """OpenAI key present → OpenAI model."""
        # Mock the settings properties (settings is a singleton)
        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", None)

        model = get_model()
        assert isinstance(model, OpenAIChatModel)

    def test_returns_anthropic_when_only_anthropic_key(self, monkeypatch):
        """Only Anthropic key → Anthropic model."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
        monkeypatch.setattr(settings, "hf_token", None)

        model = get_model()
        assert isinstance(model, AnthropicModel)

    def test_returns_huggingface_when_hf_token_present(self, monkeypatch):
        """HF_TOKEN present (no paid keys) → HuggingFace model."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", "hf_test_token")

        model = get_model()
        assert isinstance(model, HuggingFaceModel)

    def test_raises_error_when_no_keys(self, monkeypatch):
        """No keys at all → ConfigurationError."""
        monkeypatch.setattr(settings, "openai_api_key", None)
        monkeypatch.setattr(settings, "anthropic_api_key", None)
        monkeypatch.setattr(settings, "hf_token", None)

        with pytest.raises(ConfigurationError) as exc_info:
            get_model()

        assert "No LLM API key configured" in str(exc_info.value)

    def test_openai_takes_priority_over_anthropic(self, monkeypatch):
        """Both keys present → OpenAI wins."""
        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")

        model = get_model()
        assert isinstance(model, OpenAIChatModel)
```
tests/unit/agent_factory/test_judges_factory.py (+7 -0):

```diff
@@ -24,6 +24,7 @@ def mock_settings():
 def test_get_model_openai(mock_settings):
     """Test that OpenAI model is returned when provider is openai."""
     mock_settings.llm_provider = "openai"
+    mock_settings.has_openai_key = True
     mock_settings.openai_api_key = "sk-test"
     mock_settings.openai_model = "gpt-5"
 
@@ -35,6 +36,8 @@ def test_get_model_openai(mock_settings):
 def test_get_model_anthropic(mock_settings):
     """Test that Anthropic model is returned when provider is anthropic."""
     mock_settings.llm_provider = "anthropic"
+    mock_settings.has_openai_key = False
+    mock_settings.has_anthropic_key = True
     mock_settings.anthropic_api_key = "sk-ant-test"
     mock_settings.anthropic_model = "claude-sonnet-4-5-20250929"
 
@@ -46,6 +49,9 @@ def test_get_model_anthropic(mock_settings):
 def test_get_model_huggingface(mock_settings):
     """Test that HuggingFace model is returned when provider is huggingface."""
     mock_settings.llm_provider = "huggingface"
+    mock_settings.has_openai_key = False
+    mock_settings.has_anthropic_key = False
+    mock_settings.has_huggingface_key = True  # CodeRabbit: explicitly set for auto-detect
     mock_settings.hf_token = "hf_test_token"
     mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
 
@@ -57,6 +63,7 @@ def test_get_model_huggingface(mock_settings):
 def test_get_model_default_fallback(mock_settings):
     """Test fallback to OpenAI if provider is unknown."""
     mock_settings.llm_provider = "unknown_provider"
+    mock_settings.has_openai_key = True
     mock_settings.openai_api_key = "sk-test"
     mock_settings.openai_model = "gpt-5"
```
tests/unit/config/test_domain.py (+8 -4, replacing the old silent-fallback test):

```diff
@@ -28,10 +28,14 @@ class TestGetDomainConfig:
         config = get_domain_config("sexual_health")
         assert "Sexual Health" in config.name
 
+    def test_invalid_string_raises_value_error(self):
+        # Invalid domains should fail fast with clear error
+        import pytest
+
+        with pytest.raises(ValueError) as exc_info:
+            get_domain_config("invalid_domain")
+        assert "Invalid domain" in str(exc_info.value)
+        assert "sexual_health" in str(exc_info.value)  # Shows valid options
 
     def test_config_has_required_fields(self):
         required_fields = [
```
tests/unit/orchestrators/test_advanced_orchestrator.py (+60 -49): the env-var based tests (which set `ADVANCED_MAX_ROUNDS` via `patch.dict(os.environ, ...)`) are replaced by settings-based tests, plus a new `TestSettingsValidation` class:

```python
"""Tests for AdvancedOrchestrator configuration."""

from unittest.mock import patch

import pytest
from pydantic import ValidationError

from src.orchestrators.advanced import AdvancedOrchestrator
from src.utils.config import Settings


@pytest.mark.unit
class TestAdvancedOrchestratorConfig:
    """Tests for configuration options."""

    def test_default_max_rounds_is_five(self) -> None:
        """Default max_rounds should be 5 from settings."""
        with patch("src.orchestrators.advanced.check_magentic_requirements"):
            orch = AdvancedOrchestrator()
            assert orch._max_rounds == 5

    def test_explicit_max_rounds_overrides_settings(self) -> None:
        """Explicit parameter should override settings."""
        with patch("src.orchestrators.advanced.check_magentic_requirements"):
            orch = AdvancedOrchestrator(max_rounds=7)
            assert orch._max_rounds == 7

    def test_timeout_default_is_five_minutes(self) -> None:
        """Default timeout should be 300s (5 min) from settings."""
        with patch("src.orchestrators.advanced.check_magentic_requirements"):
            orch = AdvancedOrchestrator()
            assert orch._timeout_seconds == 300.0

    def test_explicit_timeout_overrides_settings(self) -> None:
        """Explicit timeout parameter should override settings."""
        with patch("src.orchestrators.advanced.check_magentic_requirements"):
            orch = AdvancedOrchestrator(timeout_seconds=120.0)
            assert orch._timeout_seconds == 120.0


@pytest.mark.unit
class TestSettingsValidation:
    """Tests for pydantic Settings validation (fail-fast behavior)."""

    def test_invalid_max_rounds_type_raises(self) -> None:
        """Non-integer ADVANCED_MAX_ROUNDS should fail at startup."""
        with pytest.raises(ValidationError) as exc_info:
            Settings(advanced_max_rounds="not_a_number")  # type: ignore[arg-type]
        assert "advanced_max_rounds" in str(exc_info.value)

    def test_zero_max_rounds_raises(self) -> None:
        """ADVANCED_MAX_ROUNDS=0 should fail validation (ge=1)."""
        with pytest.raises(ValidationError) as exc_info:
            Settings(advanced_max_rounds=0)
        assert "greater than or equal to 1" in str(exc_info.value)

    def test_negative_max_rounds_raises(self) -> None:
        """Negative ADVANCED_MAX_ROUNDS should fail validation."""
        with pytest.raises(ValidationError) as exc_info:
            Settings(advanced_max_rounds=-5)
        assert "greater than or equal to 1" in str(exc_info.value)

    def test_max_rounds_above_limit_raises(self) -> None:
        """ADVANCED_MAX_ROUNDS > 20 should fail validation (le=20)."""
        with pytest.raises(ValidationError) as exc_info:
            Settings(advanced_max_rounds=100)
        assert "less than or equal to 20" in str(exc_info.value)

    def test_valid_max_rounds_accepted(self) -> None:
        """Valid ADVANCED_MAX_ROUNDS should be accepted."""
        s = Settings(advanced_max_rounds=10)
        assert s.advanced_max_rounds == 10

    def test_timeout_too_low_raises(self) -> None:
        """ADVANCED_TIMEOUT < 60 should fail validation."""
        with pytest.raises(ValidationError) as exc_info:
            Settings(advanced_timeout=30.0)
        assert "greater than or equal to 60" in str(exc_info.value)

    def test_timeout_too_high_raises(self) -> None:
        """ADVANCED_TIMEOUT > 900 should fail validation."""
        with pytest.raises(ValidationError) as exc_info:
            Settings(advanced_timeout=1000.0)
        assert "less than or equal to 900" in str(exc_info.value)
```
tests/unit/test_app_domain.py (+14 -2):

```diff
@@ -25,10 +25,17 @@ class TestAppDomain:
     )
 
     @patch.dict("os.environ", {}, clear=True)
+    @patch("src.app.settings")
     @patch("src.app.create_orchestrator")
     @patch("src.app.HFInferenceJudgeHandler")
-    def test_configure_orchestrator_passes_domain_free_tier(
+    def test_configure_orchestrator_passes_domain_free_tier(
+        self, mock_hf_judge, mock_create, mock_settings
+    ):
         """Test domain is passed when using free tier (no API keys)."""
+        # Simulate no keys in settings
+        mock_settings.has_openai_key = False
+        mock_settings.has_anthropic_key = False
+
         configure_orchestrator(use_mock=False, mode="simple", domain=ResearchDomain.SEXUAL_HEALTH)
 
         # HFInferenceJudgeHandler should receive domain (no API keys = free tier)
@@ -42,8 +49,13 @@ class TestAppDomain:
         domain=ResearchDomain.SEXUAL_HEALTH,
     )
 
+    @patch("src.app.settings")
     @patch("src.app.configure_orchestrator")
-    async def test_research_agent_passes_domain(self, mock_config):
+    async def test_research_agent_passes_domain(self, mock_config, mock_settings):
+        # Mock settings to have some state
+        mock_settings.has_openai_key = False
+        mock_settings.has_anthropic_key = False
+
         # Mock orchestrator
         mock_orch = MagicMock()
         mock_orch.run.return_value = []  # Async iterator?
```