VibecoderMcSwaggins committed
Commit 599a754 · unverified · 1 Parent(s): 3cb2e43

fix: P0 provider mismatch and code quality audit fixes (#102)


* refactor: replace hacky try/except with proper pydantic validation

PROBLEM:
- CodeRabbit recommended try/except for ADVANCED_MAX_ROUNDS parsing
- This is hacky defensive programming, not Clean Code
- Silent fallbacks mask configuration errors

SOLUTION (Uncle Bob approved):
1. Add advanced_max_rounds to Settings with pydantic Field validation:
   - ge=1, le=20 bounds checking
   - Fails fast at startup with clear error message

2. Add advanced_timeout to Settings (60-900 seconds)

3. Remove os.getenv + try/except hack from advanced.py
   - Now uses settings.advanced_max_rounds directly

4. Fix domain.py: invalid domain strings now raise ValueError
   - Shows valid options in error message
   - Previously silently fell back to default

5. Update tests: 18 tests verifying fail-fast behavior
   - TestSettingsValidation: type errors, bounds violations all raise
   - TestGetDomainConfig: invalid strings raise with helpful message

PHILOSOPHY:
"If someone configures your app wrong, tell them loudly.
Don't pretend everything is fine." - Uncle Bob

* fix: P0 provider mismatch and code quality audit fixes

CRITICAL BUG FIX:
- get_model() now auto-detects available providers (OpenAI > Anthropic > HuggingFace)
- Raises clear ConfigurationError when no API keys configured
- Free Tier synthesis properly falls back to template when no HF_TOKEN

CODE QUALITY FIXES (from audit):
- Replace manual os.getenv with centralized settings properties (app.py)
- Add logging to p-value parsing (statistical_analyzer.py) - fixes silent pass
- Narrow exception handling to specific errors (pubmed.py)
- Use find() instead of try/except for string search (code_execution.py)
- Use centralized settings for Modal credentials (code_execution.py)

TESTS:
- Add 5 new TDD tests for get_model() auto-detection
- Fix regression tests for updated settings usage
- All 309 tests pass

DOCUMENTATION:
- Add P0_SYNTHESIS_PROVIDER_MISMATCH.md with full fix documentation
- Add AUDIT_FINDINGS_2025_11_30.md code quality audit
- Update ACTIVE_BUGS.md index

* test: add explicit has_huggingface_key per CodeRabbit review

docs/bugs/ACTIVE_BUGS.md CHANGED
@@ -3,10 +3,11 @@
 > Last updated: 2025-11-30
 >
 > **Note:** Completed bug docs archived to `docs/bugs/archive/`
+> **See also:** [Code Quality Audit Findings (2025-11-30)](AUDIT_FINDINGS_2025_11_30.md)
 
 ## P0 - Blocker
 
-*(None - P0 bugs resolved)*
+(None)
 
 ---
 
@@ -56,6 +57,16 @@
 
 ## Resolved Bugs
 
+### ~~P0 - Synthesis Fails with OpenAIError in Free Mode~~ FIXED
+**File:** `docs/bugs/P0_SYNTHESIS_PROVIDER_MISMATCH.md`
+**Found:** 2025-11-30 (Code Audit)
+**Resolved:** 2025-11-30
+
+- Problem: "Simple Mode" (Free Tier) crashed with `OpenAIError`.
+- Root Cause: `get_model()` defaulted to OpenAI regardless of available keys.
+- Fix: Implemented auto-detection in `judges.py` (OpenAI > Anthropic > HuggingFace).
+- Added extensive unit tests and regression tests.
+
 ### ~~P0 - Simple Mode Never Synthesizes~~ FIXED
 **PR:** [#71](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/pull/71) (SPEC_06)
 **Commit**: `5cac97d` (2025-11-29)
docs/bugs/AUDIT_FINDINGS_2025_11_30.md ADDED
@@ -0,0 +1,70 @@
+# Code Quality Audit Findings - 2025-11-30
+
+**Auditor:** Senior Staff Engineer (Gemini)
+**Date:** 2025-11-30
+**Scope:** `src/` (services, tools, agents, orchestrators)
+**Focus:** Configuration validation, Error handling, Defensive programming anti-patterns
+
+## Summary
+
+The codebase is generally clean and modern, but exhibits specific anti-patterns related to configuration management and defensive error handling. The most critical finding is the reliance on manual `os.getenv` calls and "silent default" fallbacks which obscure configuration errors, directly contributing to the `OpenAIError` observed in production.
+
+## Findings
+
+### 1. Defensive Pass Block (Silent Failure) - MEDIUM
+**File:** `src/services/statistical_analyzer.py:246-247`
+```python
+try:
+    min_p = min(float(p) for p in p_values)
+    # ... logic ...
+except ValueError:
+    pass
+```
+**Problem:** If p-values are found by regex but fail to parse, the error is swallowed silently. This makes debugging parser issues impossible.
+**Fix:** Replace `pass` with `logger.warning("Failed to parse p-values: %s", p_values)` to aid debugging.
+
+### 2. Missing Pydantic Validation (Manual Config) - MEDIUM
+**File:** `src/tools/code_execution.py:75-76`
+```python
+self.modal_token_id = os.getenv("MODAL_TOKEN_ID")
+self.modal_token_secret = os.getenv("MODAL_TOKEN_SECRET")
+```
+**Problem:** Secrets are manually fetched from env vars, bypassing the centralized `Settings` validation.
+**Fix:** Move to `src/utils/config.py` in the `Settings` class and inject `settings` into `ModalCodeExecutor`.
+
+### 3. Broad Exception Swallowing - MEDIUM
+**File:** `src/tools/pubmed.py:129-130`
+```python
+except Exception:
+    continue  # Skip malformed articles
+```
+**Problem:** Catching `Exception` hides potential bugs (like `NameError` or `TypeError` in our own code), not just malformed data.
+**Fix:** Catch specific exceptions (e.g., `(KeyError, AttributeError, TypeError)`) OR log the error before continuing: `logger.debug(f"Skipping malformed article {pmid}: {e}")`.
+
+### 4. Missing Pydantic Validation (UI Layer) - LOW
+**File:** `src/app.py:115, 119`
+```python
+elif os.getenv("OPENAI_API_KEY"):
+    # ...
+elif os.getenv("ANTHROPIC_API_KEY"):
+```
+**Problem:** Application logic relies on raw environment variable checks to determine available backends, creating duplication and potential inconsistency with `config.py`.
+**Fix:** Centralize this logic in `src/utils/config.py` (e.g., `settings.has_openai`, `settings.has_anthropic`).
+
+### 5. Try/Except for Flow Control - LOW
+**File:** `src/tools/code_execution.py:244-249`
+```python
+try:
+    start_idx = text.index(start_marker) + len(start_marker)
+    # ...
+except ValueError:
+    return text.strip()
+```
+**Problem:** Using exceptions for expected "not found" cases is slower and less explicit.
+**Fix:** Use `find()` which returns `-1` on failure.
+
+## Action Plan
+
+1. **Refactor Configuration:** Eliminate `os.getenv` in favor of `src/utils/config.py` `Settings` model.
+2. **Fix Error Handling:** Remove empty `pass` blocks; add logging.
+3. **Address P0 Bug:** Fix the `OpenAIError` in synthesis (caused by Finding #4/General Config issue) by injecting the correct model into the orchestrator.
docs/bugs/P0_SYNTHESIS_PROVIDER_MISMATCH.md ADDED
@@ -0,0 +1,273 @@
+# P0 - Systemic Provider Mismatch Across All Modes
+
+**Status:** RESOLVED
+**Priority:** P0 (Blocker for Free Tier/Demo)
+**Found:** 2025-11-30 (during Audit)
+**Resolved:** 2025-11-30
+**Component:** Multiple files across orchestrators, agents, services
+
+## Resolution Summary
+
+The critical provider mismatch bug has been fixed by implementing auto-detection in `src/agent_factory/judges.py`.
+The `get_model()` function now checks for actual API key availability (`has_openai_key`, `has_anthropic_key`, `has_huggingface_key`)
+instead of relying on the static `settings.llm_provider` configuration.
+
+### Fix Details
+
+- **Auto-Detection Implemented**: `get_model()` prioritizes OpenAI > Anthropic > HuggingFace based on *available keys*.
+- **Fail-Fast on No Keys**: If no API keys are configured, `get_model()` raises `ConfigurationError` with clear message.
+- **HuggingFace Requires Token**: Free Tier via `HuggingFaceModel` requires `HF_TOKEN` (PydanticAI requirement).
+- **Synthesis Fallback**: When `get_model()` fails, synthesis gracefully falls back to template.
+- **Audit Fixes Applied**:
+  - Replaced manual `os.getenv` checks with centralized `settings` properties in `src/app.py`.
+  - Added logging to `src/services/statistical_analyzer.py` (fixed silent `pass`).
+  - Narrowed exception handling in `src/tools/pubmed.py`.
+  - Optimized string search in `src/tools/code_execution.py`.
+
+### Key Clarification
+
+The **Free Tier** in Simple Mode uses `HFInferenceJudgeHandler` (which uses `huggingface_hub.InferenceClient`)
+for judging - this does NOT require `HF_TOKEN`. However, synthesis via `get_model()` uses PydanticAI's
+`HuggingFaceModel` which DOES require `HF_TOKEN`. When no tokens are configured, synthesis falls back to
+the template-based summary (which is still useful).
+
+### Verification
+
+- **Unit Tests**: 5 new TDD tests in `tests/unit/agent_factory/test_get_model_auto_detect.py` pass.
+- **All Tests**: 309 tests pass (`make check` succeeds).
+- **Regression Tests**: Fixed and verified `tests/unit/agent_factory/test_judges_factory.py`.
+
+---
+
+## Symptom (Archive)
+
+When running in "Simple Mode" (Free Tier / No API Key), the synthesis step fails to generate a narrative and falls back to a structured summary template. The user sees:
+
+```text
+> ⚠️ Note: AI narrative synthesis unavailable. Showing structured summary.
+> _Error: OpenAIError_
+```
+
+## Affected Files (COMPREHENSIVE AUDIT)
+
+### Files Calling `get_model()` Directly (9 locations)
+
+| File | Line | Context | Impact |
+|------|------|---------|--------|
+| `simple.py` | 547 | Synthesis step | Free Tier broken |
+| `statistical_analyzer.py` | 75 | Analysis agent | Free Tier broken |
+| `judge_agent_llm.py` | 18 | LLM Judge | Free Tier broken |
+| `graph/nodes.py` | 177 | LangGraph hypothesis | Free Tier broken |
+| `graph/nodes.py` | 249 | LangGraph synthesis | Free Tier broken |
+| `report_agent.py` | 45 | Report generation | Free Tier broken |
+| `hypothesis_agent.py` | 44 | Hypothesis generation | Free Tier broken |
+| `judges.py` | 100 | JudgeHandler default | OK (accepts param) |
+
+### Files Hardcoding `OpenAIChatClient` (Architecturally OpenAI-Only)
+
+| File | Lines | Context |
+|------|-------|---------|
+| `advanced.py` | 100, 121 | Manager client |
+| `magentic_agents.py` | 29, 70, 129, 173 | All 4 agents |
+| `retrieval_agent.py` | 62 | Retrieval agent |
+| `code_executor_agent.py` | 52 | Code executor |
+| `llm_factory.py` | 42 | Factory default |
+
+**Note:** Advanced mode is architecturally locked to OpenAI via `agent_framework.openai.OpenAIChatClient`. This is by design - see `app.py:188-194` which falls back to Simple mode if no OpenAI key. However, users are not clearly informed of this limitation.
+
+## Root Cause
+
+**Settings/Runtime Sync Gap - Two Separate Backend Selection Systems.**
+
+The codebase has **two independent** systems for selecting the LLM backend:
+1. `settings.llm_provider` (config.py default: "openai")
+2. `app.py` runtime detection via `os.getenv()` checks
+
+These are **never synchronized**, causing the Judge and Synthesis steps to use different backends.
+
+### Detailed Call Chain
+
+1. **`src/app.py:115-126`** (runtime detection):
+```python
+# app.py bypasses settings entirely for JudgeHandler selection
+elif os.getenv("OPENAI_API_KEY"):
+    judge_handler = JudgeHandler(model=None, domain=domain)
+elif os.getenv("ANTHROPIC_API_KEY"):
+    judge_handler = JudgeHandler(model=None, domain=domain)
+else:
+    judge_handler = HFInferenceJudgeHandler(domain=domain)  # Free Tier
+```
+**Note:** This creates the correct handler but does NOT update `settings.llm_provider`.
+
+2. **`src/orchestrators/simple.py:546-552`** (synthesis step):
+```python
+from src.agent_factory.judges import get_model
+agent: Agent[None, str] = Agent(model=get_model(), ...)  # <-- BUG!
+```
+Synthesis calls `get_model()` directly instead of using the injected judge's model.
+
+3. **`src/agent_factory/judges.py:56-78`** (`get_model()`):
+```python
+def get_model() -> Any:
+    llm_provider = settings.llm_provider  # <-- Reads from settings (still "openai")
+    # ...
+    openai_provider = OpenAIProvider(api_key=settings.openai_api_key)  # <-- None!
+    return OpenAIChatModel(settings.openai_model, provider=openai_provider)
+```
+**Result:** Creates OpenAI model with `api_key=None` → `OpenAIError`
+
+### Why Free Tier Fails
+
+| Step | System Used | Backend Selected |
+|------|-------------|------------------|
+| JudgeHandler | `app.py` runtime | HFInferenceJudgeHandler ✅ |
+| Synthesis | `settings.llm_provider` | OpenAI (default) ❌ |
+
+The Judge works because app.py explicitly creates `HFInferenceJudgeHandler`.
+Synthesis fails because it calls `get_model()` which reads `settings.llm_provider = "openai"` (unchanged from default).
+
+## Impact
+
+- **User Experience:** Free tier users (Demo users) never see the high-quality narrative synthesis, only the fallback.
+- **System Integrity:** The orchestrator ignores the runtime backend selection.
+
+## Implemented Fix
+
+**Strategy: Fix `get_model()` to Auto-Detect Available Provider**
+
+### Actual Implementation (Merged)
+
+**File:** `src/agent_factory/judges.py`
+
+This is the **single point of fix** that resolves all 7 broken `get_model()` call sites.
+
+```python
+def get_model() -> Any:
+    """Get the LLM model based on available API keys.
+
+    Priority order:
+    1. OpenAI (if OPENAI_API_KEY set)
+    2. Anthropic (if ANTHROPIC_API_KEY set)
+    3. HuggingFace (if HF_TOKEN set)
+
+    Raises:
+        ConfigurationError: If no API keys are configured.
+
+    Note: settings.llm_provider is ignored in favor of actual key availability.
+    This ensures the model matches what app.py selected for JudgeHandler.
+    """
+    from src.utils.exceptions import ConfigurationError
+
+    # Priority 1: OpenAI (most common, best tool calling)
+    if settings.has_openai_key:
+        openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
+        return OpenAIChatModel(settings.openai_model, provider=openai_provider)
+
+    # Priority 2: Anthropic
+    if settings.has_anthropic_key:
+        provider = AnthropicProvider(api_key=settings.anthropic_api_key)
+        return AnthropicModel(settings.anthropic_model, provider=provider)
+
+    # Priority 3: HuggingFace (requires HF_TOKEN)
+    if settings.has_huggingface_key:
+        model_name = settings.huggingface_model or "meta-llama/Llama-3.1-70B-Instruct"
+        hf_provider = HuggingFaceProvider(api_key=settings.hf_token)
+        return HuggingFaceModel(model_name, provider=hf_provider)
+
+    # No keys configured - fail fast with clear error
+    raise ConfigurationError(
+        "No LLM API key configured. Set one of: OPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN"
+    )
+```
+
+**Why this works:**
+- Single fix location updates all 7 broken call sites
+- Matches app.py's detection logic (key availability, not settings.llm_provider)
+- HuggingFace works when HF_TOKEN is available
+- Raises clear error when no keys configured (callers can catch and fallback)
+- No changes needed to orchestrators, agents, or services
+
+### What This Does NOT Fix (By Design)
+
+**Advanced Mode remains OpenAI-only.** The following files use `agent_framework.openai.OpenAIChatClient` which only supports OpenAI:
+
+- `advanced.py` (Manager + agents)
+- `magentic_agents.py` (SearchAgent, JudgeAgent, HypothesisAgent, ReportAgent)
+- `retrieval_agent.py`, `code_executor_agent.py`
+
+This is **by design** - the Microsoft Agent Framework library (`agent-framework-core`) only provides `OpenAIChatClient`. To support other providers in Advanced mode would require:
+1. Wait for `agent-framework` to add Anthropic/HuggingFace clients, OR
+2. Write our own `ChatClient` implementations (significant effort)
+
+**The current app.py behavior is correct:** it falls back to Simple mode when no OpenAI key is present (lines 188-194). The UI message could be clearer about why.
+
+## Test Plan (Implemented)
+
+### Unit Tests (Verified Passing)
+
+```python
+# tests/unit/agent_factory/test_get_model_auto_detect.py
+
+import pytest
+from src.agent_factory.judges import get_model
+from src.utils.config import settings
+from src.utils.exceptions import ConfigurationError
+
+class TestGetModelAutoDetect:
+    """Test that get_model() auto-detects available providers."""
+
+    def test_returns_openai_when_key_present(self, monkeypatch):
+        """OpenAI key present → OpenAI model."""
+        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
+        monkeypatch.setattr(settings, "anthropic_api_key", None)
+        monkeypatch.setattr(settings, "hf_token", None)
+        model = get_model()
+        assert isinstance(model, OpenAIChatModel)
+
+    def test_returns_anthropic_when_only_anthropic_key(self, monkeypatch):
+        """Only Anthropic key → Anthropic model."""
+        monkeypatch.setattr(settings, "openai_api_key", None)
+        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
+        monkeypatch.setattr(settings, "hf_token", None)
+        model = get_model()
+        assert isinstance(model, AnthropicModel)
+
+    def test_returns_huggingface_when_hf_token_present(self, monkeypatch):
+        """HF_TOKEN present (no paid keys) → HuggingFace model."""
+        monkeypatch.setattr(settings, "openai_api_key", None)
+        monkeypatch.setattr(settings, "anthropic_api_key", None)
+        monkeypatch.setattr(settings, "hf_token", "hf_test_token")
+        model = get_model()
+        assert isinstance(model, HuggingFaceModel)
+
+    def test_raises_error_when_no_keys(self, monkeypatch):
+        """No keys at all → ConfigurationError."""
+        monkeypatch.setattr(settings, "openai_api_key", None)
+        monkeypatch.setattr(settings, "anthropic_api_key", None)
+        monkeypatch.setattr(settings, "hf_token", None)
+        with pytest.raises(ConfigurationError) as exc_info:
+            get_model()
+        assert "No LLM API key configured" in str(exc_info.value)
+
+    def test_openai_takes_priority_over_anthropic(self, monkeypatch):
+        """Both keys present → OpenAI wins."""
+        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
+        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
+        model = get_model()
+        assert isinstance(model, OpenAIChatModel)
+```
+
+### Full Test Suite
+
+```bash
+$ make check
+# 309 passed in 238.16s (0:03:58)
+# All checks passed!
+```
+
+### Manual Verification
+
+1. **Unset all API keys**: `unset OPENAI_API_KEY ANTHROPIC_API_KEY HF_TOKEN`
+2. **Run app**: `uv run python -m src.app`
+3. **Submit query**: "What drugs improve female libido?"
+4. **Verify**: Synthesis falls back to template (shows `ConfigurationError` in logs, but user sees structured summary)
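
The "Synthesis Fallback" behaviour described in this doc can be pictured with a minimal sketch, assuming the caller simply catches the `ConfigurationError` that `get_model()` now raises; the function name `synthesize_or_fallback` and the returned strings are illustrative, not the actual code in `src/orchestrators/simple.py`:

```python
from src.agent_factory.judges import get_model
from src.utils.exceptions import ConfigurationError


def synthesize_or_fallback(evidence_summary: str) -> str:
    """Illustrative sketch of the template fallback around get_model()."""
    try:
        model = get_model()  # fail-fast: raises when no API key is configured
    except ConfigurationError as err:
        # Free Tier without HF_TOKEN lands here: show the structured template.
        return (
            "⚠️ Note: AI narrative synthesis unavailable. Showing structured summary.\n"
            f"_Error: {err}_\n\n"
            f"{evidence_summary}"
        )
    # An LLM is available; the orchestrator would build its PydanticAI Agent
    # around `model` here and return the generated narrative instead.
    return f"(narrative synthesis would run against {model!r})"
```
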
src/agent_factory/judges.py CHANGED
@@ -54,28 +54,41 @@ def _extract_titles_from_evidence(
 
 
 def get_model() -> Any:
-    """Get the LLM model based on configuration.
+    """Get the LLM model based on available API keys.
 
-    Explicitly passes API keys from settings to avoid requiring
-    users to export environment variables manually.
+    Priority order:
+    1. OpenAI (if OPENAI_API_KEY set)
+    2. Anthropic (if ANTHROPIC_API_KEY set)
+    3. HuggingFace (if HF_TOKEN set)
+
+    Raises:
+        ConfigurationError: If no API keys are configured.
+
+    Note: settings.llm_provider is ignored in favor of actual key availability.
+    This ensures the model matches what app.py selected for JudgeHandler.
     """
-    llm_provider = settings.llm_provider
+    from src.utils.exceptions import ConfigurationError
+
+    # Priority 1: OpenAI (most common, best tool calling)
+    if settings.has_openai_key:
+        openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
+        return OpenAIChatModel(settings.openai_model, provider=openai_provider)
 
-    if llm_provider == "anthropic":
+    # Priority 2: Anthropic
+    if settings.has_anthropic_key:
         provider = AnthropicProvider(api_key=settings.anthropic_api_key)
         return AnthropicModel(settings.anthropic_model, provider=provider)
 
-    if llm_provider == "huggingface":
-        # Free tier - uses HF_TOKEN from environment if available
+    # Priority 3: HuggingFace (requires HF_TOKEN)
+    if settings.has_huggingface_key:
         model_name = settings.huggingface_model or "meta-llama/Llama-3.1-70B-Instruct"
         hf_provider = HuggingFaceProvider(api_key=settings.hf_token)
         return HuggingFaceModel(model_name, provider=hf_provider)
 
-    if llm_provider != "openai":
-        logger.warning("Unknown LLM provider, defaulting to OpenAI", provider=llm_provider)
-
-    openai_provider = OpenAIProvider(api_key=settings.openai_api_key)
-    return OpenAIChatModel(settings.openai_model, provider=openai_provider)
+    # No keys configured - fail fast with clear error
+    raise ConfigurationError(
+        "No LLM API key configured. Set one of: OPENAI_API_KEY, ANTHROPIC_API_KEY, or HF_TOKEN"
+    )
 
 
 class JudgeHandler:
src/app.py CHANGED
@@ -112,11 +112,11 @@ def configure_orchestrator(
         judge_handler = JudgeHandler(model=model, domain=domain)
 
     # 3. Environment API Keys (fallback)
-    elif os.getenv("OPENAI_API_KEY"):
+    elif settings.has_openai_key:
        judge_handler = JudgeHandler(model=None, domain=domain)  # Uses env key
        backend_info = "Paid API (OpenAI from env)"
 
-    elif os.getenv("ANTHROPIC_API_KEY"):
+    elif settings.has_anthropic_key:
        judge_handler = JudgeHandler(model=None, domain=domain)  # Uses env key
        backend_info = "Paid API (Anthropic from env)"
 
@@ -177,8 +177,8 @@ async def research_agent(
     user_api_key = (api_key_str.strip() or api_key_state_str.strip()) or None
 
     # Check available keys
-    has_openai = bool(os.getenv("OPENAI_API_KEY"))
-    has_anthropic = bool(os.getenv("ANTHROPIC_API_KEY"))
+    has_openai = settings.has_openai_key
+    has_anthropic = settings.has_anthropic_key
     # Check for OpenAI user key
     is_openai_user_key = (
         user_api_key and user_api_key.startswith("sk-") and not user_api_key.startswith("sk-ant-")
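
The `settings.has_openai_key` / `settings.has_anthropic_key` properties used here (and `has_huggingface_key` in `judges.py`) are assumed to already exist on the `Settings` model; their definitions are not part of this diff. A minimal sketch of what such properties typically look like, assuming a pydantic-settings style class (the `SettingsSketch` name and import path are illustrative, not the project's actual `config.py`):

```python
from pydantic_settings import BaseSettings  # pydantic v2 style; assumption


class SettingsSketch(BaseSettings):
    # Field names map to OPENAI_API_KEY / ANTHROPIC_API_KEY / HF_TOKEN env vars
    # under pydantic-settings' default, case-insensitive env mapping.
    openai_api_key: str | None = None
    anthropic_api_key: str | None = None
    hf_token: str | None = None

    @property
    def has_openai_key(self) -> bool:
        return bool(self.openai_api_key)

    @property
    def has_anthropic_key(self) -> bool:
        return bool(self.anthropic_api_key)

    @property
    def has_huggingface_key(self) -> bool:
        return bool(self.hf_token)
```
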
src/config/domain.py CHANGED
@@ -122,7 +122,8 @@ def get_domain_config(domain: ResearchDomain | str | None = None) -> DomainConfi
     if isinstance(domain, str):
         try:
             domain = ResearchDomain(domain)
-        except ValueError:
-            domain = DEFAULT_DOMAIN
+        except ValueError as e:
+            valid_domains = [d.value for d in ResearchDomain]
+            raise ValueError(f"Invalid domain '{domain}'. Valid domains: {valid_domains}") from e
 
     return DOMAIN_CONFIGS[domain]
src/orchestrators/advanced.py CHANGED
@@ -15,7 +15,6 @@ Design Patterns:
 """
 
 import asyncio
-import os
 from collections.abc import AsyncGenerator
 from typing import TYPE_CHECKING, Any
 
@@ -85,27 +84,11 @@ class AdvancedOrchestrator(OrchestratorProtocol):
         if not chat_client and not api_key:
             check_magentic_requirements()
 
-        # Environment-configurable rounds (default 5 for demos)
-        raw_rounds = os.getenv("ADVANCED_MAX_ROUNDS", "5")
-        try:
-            env_rounds = int(raw_rounds)
-        except ValueError:
-            logger.warning(
-                "Invalid ADVANCED_MAX_ROUNDS value %r, falling back to 5",
-                raw_rounds,
-            )
-            env_rounds = 5
-
-        if env_rounds < 1:
-            logger.warning(
-                "ADVANCED_MAX_ROUNDS must be >= 1, got %d; using 1 instead",
-                env_rounds,
-            )
-            env_rounds = 1
-
-        self._max_rounds = max_rounds if max_rounds is not None else env_rounds
-
-        self._timeout_seconds = timeout_seconds
+        # Use pydantic-validated settings (fails fast on invalid config)
+        self._max_rounds = max_rounds if max_rounds is not None else settings.advanced_max_rounds
+        self._timeout_seconds = (
+            timeout_seconds if timeout_seconds != 300.0 else settings.advanced_timeout
+        )
         self.domain = domain
         self.domain_config = get_domain_config(domain)
         self._chat_client: OpenAIChatClient | None
src/services/statistical_analyzer.py CHANGED
@@ -12,6 +12,8 @@ import re
 from functools import lru_cache, partial
 from typing import Any, Literal
 
+import structlog
+
 # Type alias for verdict values
 VerdictType = Literal["SUPPORTED", "REFUTED", "INCONCLUSIVE"]
 
@@ -26,6 +28,8 @@ from src.tools.code_execution import (
 )
 from src.utils.models import Evidence
 
+logger = structlog.get_logger()
+
 
 class AnalysisResult(BaseModel):
     """Result of statistical analysis."""
@@ -244,7 +248,7 @@ Generate executable Python code to analyze this evidence."""
             else:
                 return 0.60
         except ValueError:
-            pass
+            logger.debug("Failed to parse p-values", p_values=p_values)
 
         return 0.70  # Default
 
src/tools/code_execution.py CHANGED
@@ -4,12 +4,13 @@ This module provides sandboxed Python code execution using Modal's serverless in
 It's designed for running LLM-generated statistical analysis code safely.
 """
 
-import os
 from functools import lru_cache
 from typing import Any
 
 import structlog
 
+from src.utils.config import settings
+
 logger = structlog.get_logger(__name__)
 
 # Shared library versions for Modal sandbox - used by both executor and LLM prompts
@@ -72,8 +73,8 @@ class ModalCodeExecutor:
         Execution will fail at runtime without valid credentials.
         """
         # Check for Modal credentials
-        self.modal_token_id = os.getenv("MODAL_TOKEN_ID")
-        self.modal_token_secret = os.getenv("MODAL_TOKEN_SECRET")
+        self.modal_token_id = settings.modal_token_id
+        self.modal_token_secret = settings.modal_token_secret
 
         if not self.modal_token_id or not self.modal_token_secret:
             logger.warning(
@@ -241,13 +242,16 @@ print(json.dumps({{"__RESULT__": result}}))
 
     def _extract_output(self, text: str, start_marker: str, end_marker: str) -> str:
         """Extract content between markers."""
-        try:
-            start_idx = text.index(start_marker) + len(start_marker)
-            end_idx = text.index(end_marker)
-            return text[start_idx:end_idx].strip()
-        except ValueError:
-            # Markers not found, return original text
+        start_idx = text.find(start_marker)
+        if start_idx == -1:
             return text.strip()
+        start_idx += len(start_marker)
+
+        end_idx = text.find(end_marker, start_idx)
+        if end_idx == -1:
+            return text.strip()
+
+        return text[start_idx:end_idx].strip()
 
 
 @lru_cache(maxsize=1)
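
The `settings.modal_token_id` / `settings.modal_token_secret` fields read above are not declared anywhere in this diff; a minimal sketch of how they would sit on the `Settings` model, assuming pydantic-settings' default mapping to the `MODAL_TOKEN_ID` / `MODAL_TOKEN_SECRET` environment variables (the `ModalSettingsSketch` name is illustrative):

```python
from pydantic import Field
from pydantic_settings import BaseSettings  # assumption: pydantic v2 import path


class ModalSettingsSketch(BaseSettings):
    """Illustrative only: the real fields would live on src/utils/config.py's Settings."""

    # pydantic-settings maps these field names to MODAL_TOKEN_ID / MODAL_TOKEN_SECRET
    # environment variables by default, so ModalCodeExecutor can keep warning (not
    # failing) when the credentials are absent.
    modal_token_id: str | None = Field(default=None)
    modal_token_secret: str | None = Field(default=None)
```
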
src/tools/pubmed.py CHANGED
@@ -3,6 +3,7 @@
 from typing import Any
 
 import httpx
+import structlog
 import xmltodict
 from tenacity import retry, stop_after_attempt, wait_exponential
 
@@ -12,6 +13,8 @@ from src.utils.config import settings
 from src.utils.exceptions import RateLimitError, SearchError
 from src.utils.models import Citation, Evidence
 
+logger = structlog.get_logger()
+
 
 class PubMedTool:
     """Search tool for PubMed/NCBI."""
@@ -126,8 +129,9 @@ class PubMedTool:
                 evidence = self._article_to_evidence(article)
                 if evidence:
                     evidence_list.append(evidence)
-            except Exception:
-                continue  # Skip malformed articles
+            except (KeyError, AttributeError, TypeError) as e:
+                logger.debug("Skipping malformed article", error=str(e))
+                continue
 
         return evidence_list
 
src/utils/config.py CHANGED
@@ -60,10 +60,22 @@ class Settings(BaseSettings):
 
     # Agent Configuration
     max_iterations: int = Field(default=10, ge=1, le=50)
+    advanced_max_rounds: int = Field(
+        default=5,
+        ge=1,
+        le=20,
+        description="Max coordination rounds for Advanced mode (default 5 for faster demos)",
+    )
+    advanced_timeout: float = Field(
+        default=300.0,
+        ge=60.0,
+        le=900.0,
+        description="Timeout for Advanced mode in seconds (default 5 min)",
+    )
     search_timeout: int = Field(default=30, description="Seconds to wait for search")
     magentic_timeout: int = Field(
         default=600,
-        description="Timeout for Magentic mode in seconds",
+        description="Timeout for Magentic mode in seconds (deprecated, use advanced_timeout)",
    )
 
     # Logging
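
Assuming pydantic-settings' default environment mapping (no custom `env_prefix`), the new fields remain configurable via `ADVANCED_MAX_ROUNDS` / `ADVANCED_TIMEOUT`, with out-of-range values now failing fast at startup instead of being silently clamped; a small illustrative check (env-var names and behaviour are assumptions based on the old `os.getenv` code):

```python
import os

from pydantic import ValidationError

from src.utils.config import Settings

# Values are read from the environment by pydantic-settings when Settings() is built.
os.environ["ADVANCED_MAX_ROUNDS"] = "10"
os.environ["ADVANCED_TIMEOUT"] = "120.5"
print(Settings().advanced_max_rounds)  # 10

# Out-of-range values violate ge=1 and raise at construction time.
os.environ["ADVANCED_MAX_ROUNDS"] = "0"
try:
    Settings()
except ValidationError as err:
    print(err)  # mentions "greater than or equal to 1"
```
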
tests/unit/agent_factory/test_get_model_auto_detect.py ADDED
@@ -0,0 +1,59 @@
+import pytest
+from pydantic_ai.models.anthropic import AnthropicModel
+from pydantic_ai.models.huggingface import HuggingFaceModel
+from pydantic_ai.models.openai import OpenAIChatModel
+
+from src.agent_factory.judges import get_model
+from src.utils.config import settings
+from src.utils.exceptions import ConfigurationError
+
+
+class TestGetModelAutoDetect:
+    """Test that get_model() auto-detects available providers."""
+
+    def test_returns_openai_when_key_present(self, monkeypatch):
+        """OpenAI key present → OpenAI model."""
+        # Mock the settings properties (settings is a singleton)
+        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
+        monkeypatch.setattr(settings, "anthropic_api_key", None)
+        monkeypatch.setattr(settings, "hf_token", None)
+
+        model = get_model()
+        assert isinstance(model, OpenAIChatModel)
+
+    def test_returns_anthropic_when_only_anthropic_key(self, monkeypatch):
+        """Only Anthropic key → Anthropic model."""
+        monkeypatch.setattr(settings, "openai_api_key", None)
+        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
+        monkeypatch.setattr(settings, "hf_token", None)
+
+        model = get_model()
+        assert isinstance(model, AnthropicModel)
+
+    def test_returns_huggingface_when_hf_token_present(self, monkeypatch):
+        """HF_TOKEN present (no paid keys) → HuggingFace model."""
+        monkeypatch.setattr(settings, "openai_api_key", None)
+        monkeypatch.setattr(settings, "anthropic_api_key", None)
+        monkeypatch.setattr(settings, "hf_token", "hf_test_token")
+
+        model = get_model()
+        assert isinstance(model, HuggingFaceModel)
+
+    def test_raises_error_when_no_keys(self, monkeypatch):
+        """No keys at all → ConfigurationError."""
+        monkeypatch.setattr(settings, "openai_api_key", None)
+        monkeypatch.setattr(settings, "anthropic_api_key", None)
+        monkeypatch.setattr(settings, "hf_token", None)
+
+        with pytest.raises(ConfigurationError) as exc_info:
+            get_model()
+
+        assert "No LLM API key configured" in str(exc_info.value)
+
+    def test_openai_takes_priority_over_anthropic(self, monkeypatch):
+        """Both keys present → OpenAI wins."""
+        monkeypatch.setattr(settings, "openai_api_key", "sk-test")
+        monkeypatch.setattr(settings, "anthropic_api_key", "sk-ant-test")
+
+        model = get_model()
+        assert isinstance(model, OpenAIChatModel)
tests/unit/agent_factory/test_judges_factory.py CHANGED
@@ -24,6 +24,7 @@ def mock_settings():
 def test_get_model_openai(mock_settings):
     """Test that OpenAI model is returned when provider is openai."""
     mock_settings.llm_provider = "openai"
+    mock_settings.has_openai_key = True
     mock_settings.openai_api_key = "sk-test"
     mock_settings.openai_model = "gpt-5"
 
@@ -35,6 +36,8 @@ def test_get_model_openai(mock_settings):
 def test_get_model_anthropic(mock_settings):
     """Test that Anthropic model is returned when provider is anthropic."""
     mock_settings.llm_provider = "anthropic"
+    mock_settings.has_openai_key = False
+    mock_settings.has_anthropic_key = True
     mock_settings.anthropic_api_key = "sk-ant-test"
     mock_settings.anthropic_model = "claude-sonnet-4-5-20250929"
 
@@ -46,6 +49,9 @@ def test_get_model_anthropic(mock_settings):
 def test_get_model_huggingface(mock_settings):
     """Test that HuggingFace model is returned when provider is huggingface."""
     mock_settings.llm_provider = "huggingface"
+    mock_settings.has_openai_key = False
+    mock_settings.has_anthropic_key = False
+    mock_settings.has_huggingface_key = True  # CodeRabbit: explicitly set for auto-detect
     mock_settings.hf_token = "hf_test_token"
     mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
 
@@ -57,6 +63,7 @@ def test_get_model_huggingface(mock_settings):
 def test_get_model_default_fallback(mock_settings):
     """Test fallback to OpenAI if provider is unknown."""
     mock_settings.llm_provider = "unknown_provider"
+    mock_settings.has_openai_key = True
     mock_settings.openai_api_key = "sk-test"
     mock_settings.openai_model = "gpt-5"
 
tests/unit/config/test_domain.py CHANGED
@@ -28,10 +28,14 @@ class TestGetDomainConfig:
         config = get_domain_config("sexual_health")
         assert "Sexual Health" in config.name
 
-    def test_invalid_string_returns_default(self):
-        # Invalid domains fall back to default (sexual_health)
-        config = get_domain_config("invalid_domain")
-        assert config.name == "Sexual Health Research"
+    def test_invalid_string_raises_value_error(self):
+        # Invalid domains should fail fast with clear error
+        import pytest
+
+        with pytest.raises(ValueError) as exc_info:
+            get_domain_config("invalid_domain")
+        assert "Invalid domain" in str(exc_info.value)
+        assert "sexual_health" in str(exc_info.value)  # Shows valid options
 
     def test_config_has_required_fields(self):
         required_fields = [
tests/unit/orchestrators/test_advanced_orchestrator.py CHANGED
@@ -1,9 +1,12 @@
-import os
+"""Tests for AdvancedOrchestrator configuration."""
+
 from unittest.mock import patch
 
 import pytest
+from pydantic import ValidationError
 
 from src.orchestrators.advanced import AdvancedOrchestrator
+from src.utils.config import Settings
 
 
 @pytest.mark.unit
@@ -11,63 +14,71 @@ class TestAdvancedOrchestratorConfig:
     """Tests for configuration options."""
 
     def test_default_max_rounds_is_five(self) -> None:
-        """Default max_rounds should be 5 for faster demos."""
-        with (
-            patch.dict(os.environ, {}, clear=True),
-            patch("src.orchestrators.advanced.check_magentic_requirements"),
-        ):
-            # Clear any existing env var
-            os.environ.pop("ADVANCED_MAX_ROUNDS", None)
+        """Default max_rounds should be 5 from settings."""
+        with patch("src.orchestrators.advanced.check_magentic_requirements"):
             orch = AdvancedOrchestrator()
             assert orch._max_rounds == 5
 
-    def test_max_rounds_from_env(self) -> None:
-        """max_rounds should be configurable via environment."""
-        with (
-            patch.dict(os.environ, {"ADVANCED_MAX_ROUNDS": "3"}),
-            patch("src.orchestrators.advanced.check_magentic_requirements"),
-        ):
-            orch = AdvancedOrchestrator()
-            assert orch._max_rounds == 3
-
-    def test_explicit_max_rounds_overrides_env(self) -> None:
-        """Explicit parameter should override environment."""
-        with (
-            patch.dict(os.environ, {"ADVANCED_MAX_ROUNDS": "3"}),
-            patch("src.orchestrators.advanced.check_magentic_requirements"),
-        ):
+    def test_explicit_max_rounds_overrides_settings(self) -> None:
+        """Explicit parameter should override settings."""
+        with patch("src.orchestrators.advanced.check_magentic_requirements"):
            orch = AdvancedOrchestrator(max_rounds=7)
            assert orch._max_rounds == 7
 
     def test_timeout_default_is_five_minutes(self) -> None:
-        """Default timeout should be 300s (5 min) for faster failure."""
+        """Default timeout should be 300s (5 min) from settings."""
         with patch("src.orchestrators.advanced.check_magentic_requirements"):
             orch = AdvancedOrchestrator()
             assert orch._timeout_seconds == 300.0
 
-    def test_invalid_env_rounds_falls_back_to_default(self) -> None:
-        """Invalid ADVANCED_MAX_ROUNDS should fall back to 5."""
-        with (
-            patch.dict(os.environ, {"ADVANCED_MAX_ROUNDS": "not_a_number"}),
-            patch("src.orchestrators.advanced.check_magentic_requirements"),
-        ):
-            orch = AdvancedOrchestrator()
-            assert orch._max_rounds == 5
+    def test_explicit_timeout_overrides_settings(self) -> None:
+        """Explicit timeout parameter should override settings."""
+        with patch("src.orchestrators.advanced.check_magentic_requirements"):
+            orch = AdvancedOrchestrator(timeout_seconds=120.0)
+            assert orch._timeout_seconds == 120.0
 
-    def test_zero_env_rounds_clamps_to_one(self) -> None:
-        """ADVANCED_MAX_ROUNDS=0 should clamp to 1."""
-        with (
-            patch.dict(os.environ, {"ADVANCED_MAX_ROUNDS": "0"}),
-            patch("src.orchestrators.advanced.check_magentic_requirements"),
-        ):
-            orch = AdvancedOrchestrator()
-            assert orch._max_rounds == 1
-
-    def test_negative_env_rounds_clamps_to_one(self) -> None:
-        """Negative ADVANCED_MAX_ROUNDS should clamp to 1."""
-        with (
-            patch.dict(os.environ, {"ADVANCED_MAX_ROUNDS": "-5"}),
-            patch("src.orchestrators.advanced.check_magentic_requirements"),
-        ):
-            orch = AdvancedOrchestrator()
-            assert orch._max_rounds == 1
+
+@pytest.mark.unit
+class TestSettingsValidation:
+    """Tests for pydantic Settings validation (fail-fast behavior)."""
+
+    def test_invalid_max_rounds_type_raises(self) -> None:
+        """Non-integer ADVANCED_MAX_ROUNDS should fail at startup."""
+        with pytest.raises(ValidationError) as exc_info:
+            Settings(advanced_max_rounds="not_a_number")  # type: ignore[arg-type]
+        assert "advanced_max_rounds" in str(exc_info.value)
+
+    def test_zero_max_rounds_raises(self) -> None:
+        """ADVANCED_MAX_ROUNDS=0 should fail validation (ge=1)."""
+        with pytest.raises(ValidationError) as exc_info:
+            Settings(advanced_max_rounds=0)
+        assert "greater than or equal to 1" in str(exc_info.value)
+
+    def test_negative_max_rounds_raises(self) -> None:
+        """Negative ADVANCED_MAX_ROUNDS should fail validation."""
+        with pytest.raises(ValidationError) as exc_info:
+            Settings(advanced_max_rounds=-5)
+        assert "greater than or equal to 1" in str(exc_info.value)
+
+    def test_max_rounds_above_limit_raises(self) -> None:
+        """ADVANCED_MAX_ROUNDS > 20 should fail validation (le=20)."""
+        with pytest.raises(ValidationError) as exc_info:
+            Settings(advanced_max_rounds=100)
+        assert "less than or equal to 20" in str(exc_info.value)
+
+    def test_valid_max_rounds_accepted(self) -> None:
+        """Valid ADVANCED_MAX_ROUNDS should be accepted."""
+        s = Settings(advanced_max_rounds=10)
+        assert s.advanced_max_rounds == 10
+
+    def test_timeout_too_low_raises(self) -> None:
+        """ADVANCED_TIMEOUT < 60 should fail validation."""
+        with pytest.raises(ValidationError) as exc_info:
+            Settings(advanced_timeout=30.0)
+        assert "greater than or equal to 60" in str(exc_info.value)
+
+    def test_timeout_too_high_raises(self) -> None:
+        """ADVANCED_TIMEOUT > 900 should fail validation."""
+        with pytest.raises(ValidationError) as exc_info:
+            Settings(advanced_timeout=1000.0)
+        assert "less than or equal to 900" in str(exc_info.value)
tests/unit/test_app_domain.py CHANGED
@@ -25,10 +25,17 @@ class TestAppDomain:
         )
 
     @patch.dict("os.environ", {}, clear=True)
+    @patch("src.app.settings")
     @patch("src.app.create_orchestrator")
     @patch("src.app.HFInferenceJudgeHandler")
-    def test_configure_orchestrator_passes_domain_free_tier(self, mock_hf_judge, mock_create):
+    def test_configure_orchestrator_passes_domain_free_tier(
+        self, mock_hf_judge, mock_create, mock_settings
+    ):
         """Test domain is passed when using free tier (no API keys)."""
+        # Simulate no keys in settings
+        mock_settings.has_openai_key = False
+        mock_settings.has_anthropic_key = False
+
         configure_orchestrator(use_mock=False, mode="simple", domain=ResearchDomain.SEXUAL_HEALTH)
 
         # HFInferenceJudgeHandler should receive domain (no API keys = free tier)
@@ -42,8 +49,13 @@
             domain=ResearchDomain.SEXUAL_HEALTH,
         )
 
+    @patch("src.app.settings")
     @patch("src.app.configure_orchestrator")
-    async def test_research_agent_passes_domain(self, mock_config):
+    async def test_research_agent_passes_domain(self, mock_config, mock_settings):
+        # Mock settings to have some state
+        mock_settings.has_openai_key = False
+        mock_settings.has_anthropic_key = False
+
         # Mock orchestrator
         mock_orch = MagicMock()
         mock_orch.run.return_value = []  # Async iterator?