Sync from GitHub (main)
AGENTS.md
CHANGED
|
@@ -1,4 +1,4 @@
|
|
| 1 |
-
#
|
| 2 |
|
| 3 |
This repository contains the LLM-based Cancer Risk Assessment Assistant.
|
| 4 |
|
|
@@ -8,11 +8,33 @@ This repository contains the LLM-based Cancer Risk Assessment Assistant.
|
|
| 8 |
- **uv** for environment and dependency management
|
| 9 |
- **hydra** for configuration management
|
| 10 |
|
| 11 |
-
##
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
-
|
| 15 |
-
-
|
| 16 |
|
| 17 |
### Variable Naming
|
| 18 |
- **Avoid single-letter variable names** (x, y, i, j, e, t, f, m, c, ct) in favor of descriptive names.
|
|
@@ -45,40 +67,105 @@ This repository contains the LLM-based Cancer Risk Assessment Assistant.
|
|
| 45 |
- `f in MODELS_DIR.glob` → `file_path in MODELS_DIR.glob`
|
| 46 |
- `t in field_type.__args__` → `type_arg in field_type.__args__`
|
| 47 |
|
| 48 |
### Import Management
|
| 49 |
-
- **Place all imports at the top of the file**,
|
| 50 |
-
-
|
| 51 |
-
|
| 52 |
|
| 53 |
## Testing
|
| 54 |
-
- Write meaningful tests that verify core functionality and prevent regressions.
|
| 55 |
-
- Run tests with `uv run pytest`.
|
| 56 |
|
| 57 |
-
|
| 58 |
-
-
|
| 59 |
|
| 60 |
-
|
| 61 |
-
-
|
| 62 |
|
| 63 |
-
|
| 64 |
-
|
| 65 |
-
|
| 66 |
-
|
| 67 |
-
|
| 68 |
-
|
| 69 |
-
The third page, **Assessment**, runs the AI analysis, displays a results dashboard, and provides export and chat options.
|
| 70 |
|
| 71 |
-
|
| 72 |
|
| 73 |
-
|
| 74 |
|
| 75 |
-
|
| 76 |
-
|
| 77 |
-
|
| 78 |
|
| 79 |
-
|
| 80 |
|
| 81 |
-
Implemented risk calculators include:
|
| 82 |
- **Gail** - Breast cancer risk
|
| 83 |
- **Claus** - Breast cancer risk based on family history
|
| 84 |
- **BOADICEA** - Breast and ovarian cancer risk (via CanRisk API)
|
|
@@ -91,3 +178,268 @@ Implemented risk calculators include:
|
|
| 91 |
- **QCancer** - Multi-site cancer differential
|
| 92 |
|
| 93 |
Additional models should follow the interfaces under `src/sentinel/risk_models`.
|
| 1 |
+
# Repository Guidelines
|
| 2 |
|
| 3 |
This repository contains the LLM-based Cancer Risk Assessment Assistant.
|
| 4 |
|
|
|
|
| 8 |
- **uv** for environment and dependency management
|
| 9 |
- **hydra** for configuration management
|
| 10 |
|
| 11 |
+
## Development Setup
|
| 12 |
+
|
| 13 |
+
### Environment Setup
|
| 14 |
+
- Create the virtual environment (at `.venv`) with `uv sync`.
|
| 15 |
+
- Because the repository uses uv, run every command through it, e.g., `uv run python ...`, not `python ...`.
|
| 16 |
+
|
| 17 |
+
### Running Commands
|
| 18 |
+
- **Streamlit Interface**: `uv run streamlit run apps/streamlit_ui/main.py`
|
| 19 |
+
- **CLI Demo**: `uv run python apps/cli/main.py`
|
| 20 |
+
- **Tests**: `uv run pytest`
|
| 21 |
+
|
| 22 |
+
The multi-page Streamlit interface for expert feedback can be launched with `uv run streamlit run apps/streamlit_ui/main.py`.
|
| 23 |
+
The first page, **User Profile**, allows experts to load or create a profile stored in `st.session_state.user_profile`.
|
| 24 |
+
The second page, **Configuration**, lets experts choose the model and knowledge base modules while previewing the generated prompt.
|
| 25 |
+
The third page, **Assessment**, runs the AI analysis, displays a results dashboard, and provides export and chat options.
|
| 26 |
+
|
| 27 |
+
## Coding Standards
|
| 28 |
+
|
| 29 |
+
### Coding Philosophy
|
| 30 |
+
- Write simple, explicit, modular code
|
| 31 |
+
- Prioritize clarity over cleverness
|
| 32 |
+
- Prefer small pure functions over large ones
|
| 33 |
+
- Return early instead of nesting deeply
|
| 34 |
+
- Favor functions over classes unless essential
|
| 35 |
+
- Favor simple replication over heavy abstraction
|
| 36 |
+
- Keep comments short and only where code isn't self-explanatory
|
| 37 |
+
- Avoid premature optimization or over-engineering
|
| 38 |
|
| 39 |
### Variable Naming
|
| 40 |
- **Avoid single-letter variable names** (x, y, i, j, e, t, f, m, c, ct) in favor of descriptive names.
|
|
|
|
| 67 |
- `f in MODELS_DIR.glob` → `file_path in MODELS_DIR.glob`
|
| 68 |
- `t in field_type.__args__` → `type_arg in field_type.__args__`
|
| 69 |
|
| 70 |
+
### Path Handling
|
| 71 |
+
- **Always use `pathlib.Path`** for all file I/O, joining, and globbing
|
| 72 |
+
- Accept `Path | str` at function boundaries; normalize to `Path` internally
|
| 73 |
+
- **Never use `os.path`** for path operations
|
| 74 |
+
|
| 75 |
+
Example:
|
| 76 |
+
```python
from pathlib import Path


def read_text(file: Path | str) -> str:
    # Normalize once at the boundary; work with Path internally
    path = Path(file)
    return path.read_text(encoding="utf-8")
```
|
| 83 |
+
|
| 84 |
+
### Type Hints and Modern Python
|
| 85 |
+
- **Use modern type hints**: `list`, `dict`, `tuple`, `set` (not `List`, `Dict`, etc.)
|
| 86 |
+
- **Use PEP 604 unions**: `A | B` (not `Union[A, B]` or `Optional[A]`)
|
| 87 |
+
- Import from `typing` only when necessary (`TypedDict`, `Literal`, `Annotated`, etc.)
|
| 88 |
+
- **Never use** `from __future__ import annotations`
|
| 89 |
+
- Add type hints to all public functions and methods
|
| 90 |
+
- Prefer precise types (`float`, `Path`, etc.) over generic ones
|
| 91 |
+
- If `Any` is required, isolate and document why
|
| 92 |
+
|
| 93 |
### Import Management
|
| 94 |
+
- **Place all imports at the top of the file**, never inside functions or classes
|
| 95 |
+
- Group imports in three sections with blank lines between:
|
| 96 |
+
1. Standard library imports
|
| 97 |
+
2. Third-party library imports
|
| 98 |
+
3. Local/project imports
|
| 99 |
+
- This keeps a module's dependencies visible in one place and avoids re-running import statements on every function call
|
| 100 |
+
|
| 101 |
+
### Error Handling and Logging
|
| 102 |
+
- **Use `try/except` only for I/O or external APIs**
|
| 103 |
+
- Catch specific exceptions only (never broad `except:`)
|
| 104 |
+
- Raise clear, actionable error messages
|
| 105 |
+
- **Use `loguru`** for logging, never `print()` in production code
|
| 106 |
+
|
| 107 |
+
Example:
|
| 108 |
+
```python
from pathlib import Path

from loguru import logger

# `file_path` (Path | str) is assumed to be supplied by the surrounding code.
try:
    data = Path(file_path).read_text(encoding="utf-8")
except FileNotFoundError as error:
    logger.error(f"Configuration file not found: {file_path}")
    raise ValueError(f"Missing required config: {file_path}") from error
```
|
| 117 |
+
|
| 118 |
+
### Docstring Standards
|
| 119 |
+
- **Use Google-style docstrings** for all public functions and classes
|
| 120 |
+
- Do NOT include type hints in docstrings (they're in the signature)
|
| 121 |
+
- Describe behavior, invariants, side effects, and edge cases
|
| 122 |
+
- Include examples for complex functions
|
| 123 |
+
- Avoid verbose docstrings for simple, self-explanatory functions
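A Google-style docstring following the points above might look like the sketch below. The helper `body_mass_index` is a hypothetical example and is not claimed to exist in the codebase; types appear only in the signature, while the docstring covers behavior, constraints, and an example.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """Compute body mass index from weight and height.

    The result is not rounded; callers decide on display precision.

    Args:
        weight_kg: Body weight in kilograms. Must be positive.
        height_cm: Standing height in centimetres. Must be positive.

    Returns:
        Body mass index in kg/m^2.

    Raises:
        ValueError: If either measurement is zero or negative.

    Example:
        >>> round(body_mass_index(70.0, 175.0), 1)
        22.9
    """
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError(
            f"Weight and height must be positive, got {weight_kg} kg and {height_cm} cm"
        )
    height_m = height_cm / 100
    return weight_kg / height_m**2
```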
|
| 124 |
|
| 125 |
## Testing
|
|
|
|
|
|
|
| 126 |
|
| 127 |
+
### Testing Philosophy
|
| 128 |
+
- Write meaningful tests that verify core functionality and prevent regressions
|
| 129 |
+
- Use `pytest` as the testing framework
|
| 130 |
+
- Tests go under `tests/` mirroring the source layout
|
| 131 |
+
- Test both valid and invalid input scenarios
|
| 132 |
|
| 133 |
+
### Test Types
|
| 134 |
+
- **Unit tests**: Small, deterministic, one concept per test
|
| 135 |
+
- **Integration tests**: Real workflows or reference comparisons with external systems
|
| 136 |
+
- Use `pytest.mark` to tag slow or manual tests
|
| 137 |
|
| 138 |
+
### Test Coverage Requirements
|
| 139 |
+
- Ensure comprehensive test coverage for all risk models
|
| 140 |
+
- **Ground Truth Validation**: Test against known reference values
|
| 141 |
+
- **Input Validation**: Test that invalid inputs raise `ValueError`
|
| 142 |
+
- **Edge Cases**: Test boundary conditions
|
| 143 |
+
- **Inapplicable Cases**: Test when models should return "N/A"
|
|
|
|
| 144 |
|
| 145 |
+
### Running Tests
|
| 146 |
+
```bash
uv run pytest                          # Run all tests
uv run pytest -q                       # Quiet mode
uv run pytest -v                       # Verbose mode
uv run pytest tests/test_risk_models/  # Specific directory
```
|
| 152 |
|
| 153 |
+
### Pre-Submission Checklist
|
| 154 |
+
Before committing code, verify:
|
| 155 |
+
1. ✅ Run `uv run pytest -q` (all tests pass)
2. ✅ Run `pre-commit run --all-files` (all hooks pass)
3. ✅ No `print()` statements in production code
4. ✅ No broad `except:` blocks
5. ✅ All type hints present on public functions
6. ✅ File paths use `pathlib.Path`
7. ✅ Logging uses `loguru`
|
| 162 |
|
| 163 |
+
## Risk Models
|
| 164 |
+
|
| 165 |
+
### Implemented Models
|
| 166 |
|
| 167 |
+
The assistant currently includes the following built-in risk calculators:
|
| 168 |
|
|
|
|
| 169 |
- **Gail** - Breast cancer risk
|
| 170 |
- **Claus** - Breast cancer risk based on family history
|
| 171 |
- **BOADICEA** - Breast and ovarian cancer risk (via CanRisk API)
|
|
|
|
| 178 |
- **QCancer** - Multi-site cancer differential
|
| 179 |
|
| 180 |
Additional models should follow the interfaces under `src/sentinel/risk_models`.
|
| 181 |
+
|
| 182 |
+
### Risk Model Implementation Guide
|
| 183 |
+
|
| 184 |
+
#### Base Architecture
|
| 185 |
+
|
| 186 |
+
All risk models must inherit from `RiskModel` in `src/sentinel/risk_models/base.py`:
|
| 187 |
+
|
| 188 |
+
```python
from sentinel.risk_models.base import RiskModel


class YourRiskModel(RiskModel):
    def __init__(self) -> None:
        super().__init__("your_model_name")
```
|
| 195 |
+
|
| 196 |
+
#### Required Methods
|
| 197 |
+
|
| 198 |
+
Every risk model must implement these abstract methods:
|
| 199 |
+
|
| 200 |
+
```python
def compute_score(self, user: UserInput) -> str:
    """Compute the risk score for a given user profile.

    Args:
        user: The user profile containing demographics, medical history, etc.

    Returns:
        Risk percentage as a string, or an N/A message if inapplicable.

    Raises:
        ValueError: If required inputs are missing or invalid.
    """

def cancer_type(self) -> str:
    """Return the cancer type this model assesses."""
    return "breast"  # or "lung", "prostate", etc.

def description(self) -> str:
    """Return a detailed description of the model."""

def interpretation(self) -> str:
    """Return guidance on how to interpret the results."""

def references(self) -> list[str]:
    """Return list of reference citations."""
```
|
| 227 |
+
|
| 228 |
+
#### UserInput Structure
|
| 229 |
+
|
| 230 |
+
**All risk models must use the centralized `UserInput` structure** - this is the single source of truth for all data types and enums. The `UserInput` class follows a hierarchical structure:
|
| 231 |
+
|
| 232 |
+
```
UserInput
├── demographics: Demographics
│   ├── age_years: int
│   ├── sex: Sex (enum)
│   ├── ethnicity: Ethnicity | None
│   └── anthropometrics: Anthropometrics
│       ├── height_cm: float | None
│       └── weight_kg: float | None
├── lifestyle: Lifestyle
│   ├── smoking: SmokingHistory
│   └── alcohol: AlcoholConsumption
├── personal_medical_history: PersonalMedicalHistory
│   ├── chronic_conditions: list[ChronicCondition]
│   ├── previous_cancers: list[CancerType]
│   ├── genetic_mutations: list[GeneticMutation]
│   └── tyrer_cuzick_polygenic_risk_score: float | None
├── female_specific: FemaleSpecific | None
│   ├── menstrual: MenstrualHistory
│   ├── parity: ParityHistory
│   └── breast_health: BreastHealthHistory
├── symptoms: list[SymptomEntry]
└── family_history: list[FamilyMemberCancer]
```
|
| 256 |
+
|
| 257 |
+
#### REQUIRED_INPUTS Specification
|
| 258 |
+
|
| 259 |
+
Every risk model must define a `REQUIRED_INPUTS` class attribute using `typing.Annotated` types with Pydantic `Field` constraints:
|
| 260 |
+
|
| 261 |
+
```python
from typing import Annotated

from pydantic import Field

from sentinel.user_input import Ethnicity, Sex

REQUIRED_INPUTS: dict[str, tuple[type, bool]] = {
    "demographics.age_years": (Annotated[int, Field(ge=18, le=100)], True),
    "demographics.sex": (Sex, True),
    "demographics.ethnicity": (Ethnicity | None, False),
    "family_history": (list, False),  # list[FamilyMemberCancer]
    "symptoms": (list, False),  # list[SymptomEntry]
}
```
|
| 270 |
+
|
| 271 |
+
#### Input Validation
|
| 272 |
+
|
| 273 |
+
Every `compute_score` method must start with input validation:
|
| 274 |
+
|
| 275 |
+
```python
def compute_score(self, user: UserInput) -> str:
    """Compute the risk score for a given user profile."""
    # Validate inputs first
    is_valid, errors = self.validate_inputs(user)
    if not is_valid:
        raise ValueError(f"Invalid inputs for {self.name}: {'; '.join(errors)}")

    # Model-specific validation
    if user.demographics.sex != Sex.FEMALE:
        return "N/A: Model is only applicable to female patients."

    # Continue with model-specific logic...
```
|
| 289 |
+
|
| 290 |
+
#### Data Access Patterns
|
| 291 |
+
|
| 292 |
+
```python
# Demographics
age = user.demographics.age_years
sex = user.demographics.sex
ethnicity = user.demographics.ethnicity

# Female-specific data
if user.female_specific is not None:
    menarche_age = user.female_specific.menstrual.age_at_menarche
    num_births = user.female_specific.parity.num_live_births

# Family history
for member in user.family_history:
    if member.cancer_type == CancerType.BREAST:
        relation = member.relation
        age_at_diagnosis = member.age_at_diagnosis
```
|
| 309 |
+
|
| 310 |
+
#### Enum Usage
|
| 311 |
+
|
| 312 |
+
**Always use enums from `sentinel.user_input`, never string literals or custom enums:**
|
| 313 |
+
|
| 314 |
+
```python
# ✅ Correct - using UserInput enums
if user.demographics.sex == Sex.FEMALE:
    ...
if member.cancer_type == CancerType.BREAST:
    ...
if member.relation == FamilyRelation.MOTHER:
    ...

# ❌ Incorrect - string literals
if user.demographics.sex == "female":
    ...
if member.cancer_type == "breast":
    ...

# ❌ Incorrect - custom enums
if user.demographics.sex == MyCustomSex.FEMALE:
    ...
```
|
| 327 |
+
|
| 328 |
+
**Important**: All risk models must use the same centralized enums from `UserInput`. If a required enum doesn't exist in `UserInput`, you must:
|
| 329 |
+
1. **Extend UserInput** by adding the new enum to `src/sentinel/user_input.py`
|
| 330 |
+
2. **Never create model-specific enums** - this prevents divergence between models
|
| 331 |
+
3. **Update all models** to use the new centralized enum
|
| 332 |
+
|
| 333 |
+
This ensures all risk models share the same data structure and prevents fragmentation.
|
| 334 |
+
|
| 335 |
+
#### Extending UserInput
|
| 336 |
+
|
| 337 |
+
When a risk model needs fields or enums that don't exist in `UserInput`:
|
| 338 |
+
|
| 339 |
+
1. **Add to UserInput**: Extend `src/sentinel/user_input.py` with new fields/enums
|
| 340 |
+
2. **Update all models**: Ensure all existing models can handle the new fields (use `| None` for optional fields)
|
| 341 |
+
3. **Never create model-specific structures**: This prevents divergence and fragmentation
|
| 342 |
+
4. **Test thoroughly**: Add tests for new fields in `tests/test_user_input.py`
|
| 343 |
+
|
| 344 |
+
Example of extending UserInput:
|
| 345 |
+
```python
# In src/sentinel/user_input.py
class ChronicCondition(str, Enum):
    # ... existing values
    NEW_CONDITION = "new_condition"  # Add new enum value


class PersonalMedicalHistory(StrictBaseModel):
    # ... existing fields
    new_field: float | None = Field(None, description="New field description")
```
|
| 355 |
+
|
| 356 |
+
#### Testing Requirements
|
| 357 |
+
|
| 358 |
+
Create comprehensive test files with:
|
| 359 |
+
- **Ground Truth Validation**: Test against known reference values
|
| 360 |
+
- **Input Validation**: Test that invalid inputs raise `ValueError`
|
| 361 |
+
- **Edge Cases**: Test boundary conditions
|
| 362 |
+
- **Inapplicable Cases**: Test cases where model should return "N/A"
|
| 363 |
+
|
| 364 |
+
Example test structure:
|
| 365 |
+
|
| 366 |
+
```python
import pytest

from sentinel.risk_models import YourRiskModel
from sentinel.user_input import Demographics, Sex, UserInput

GROUND_TRUTH_CASES = [
    {
        "name": "test_case_name",
        "input": UserInput(
            demographics=Demographics(
                age_years=40,
                sex=Sex.FEMALE,
                # ... other fields
            ),
            # ... rest of input
        ),
        "expected": 1.5,  # Expected risk percentage
    },
    # ... more test cases
]


class TestYourRiskModel:
    def setup_method(self) -> None:
        # Fresh model instance for every test
        self.model = YourRiskModel()

    @pytest.mark.parametrize("case", GROUND_TRUTH_CASES, ids=lambda case: case["name"])
    def test_ground_truth_validation(self, case):
        """Test against ground truth results."""
        user_input = case["input"]
        expected_risk = case["expected"]

        actual_risk_str = self.model.compute_score(user_input)
        actual_risk = float(actual_risk_str)
        assert actual_risk == pytest.approx(expected_risk, abs=0.01)
```
|
| 398 |
+
|
| 399 |
+
#### Migration Checklist
|
| 400 |
+
|
| 401 |
+
When adapting an existing risk model to the new structure:
|
| 402 |
+
|
| 403 |
+
- [ ] Update imports to use new `user_input` module
|
| 404 |
+
- [ ] Add `REQUIRED_INPUTS` with Pydantic validation
|
| 405 |
+
- [ ] Refactor `compute_score` to use new `UserInput` structure
|
| 406 |
+
- [ ] Replace string literals with enums
|
| 407 |
+
- [ ] Update parameter extraction logic
|
| 408 |
+
- [ ] Add input validation at start of `compute_score`
|
| 409 |
+
- [ ] Update all test cases to use new `UserInput` structure
|
| 410 |
+
- [ ] Run the full test suite and confirm every test passes
|
| 411 |
+
- [ ] Run pre-commit hooks to ensure code quality
|
| 412 |
+
|
| 413 |
+
## LLM and Code Assistant Guidelines
|
| 414 |
+
|
| 415 |
+
When generating or modifying code, AI assistants MUST:
|
| 416 |
+
|
| 417 |
+
### Mandatory Rules
|
| 418 |
+
- Follow ALL guidelines in this document without exception
|
| 419 |
+
- Never use forbidden constructs (`os.path`, `Optional[]`, `List[]`, `print()`, broad `except:`)
|
| 420 |
+
- Never add decorative comment banners or unnecessary formatting
|
| 421 |
+
- Always generate clean, modular, statically typed code
|
| 422 |
+
|
| 423 |
+
### Code Generation Standards
|
| 424 |
+
- Prefer clarity and simplicity over cleverness
|
| 425 |
+
- Use modern Python type hints exclusively
|
| 426 |
+
- Include comprehensive docstrings for non-trivial functions
|
| 427 |
+
- Ensure all examples compile, type-check, and pass linting
|
| 428 |
+
|
| 429 |
+
### Verification
|
| 430 |
+
All generated code must:
|
| 431 |
+
- Pass `ruff format` and `ruff check`
|
| 432 |
+
- Include proper type hints
|
| 433 |
+
- Use `pathlib.Path` for all file operations
|
| 434 |
+
- Use `loguru` for logging
|
| 435 |
+
- Follow the Variable Naming guidelines
|
| 436 |
+
|
| 437 |
+
## Important Note for Developers
|
| 438 |
+
|
| 439 |
+
When making changes to the project, ensure that the following files are updated to reflect the changes:
|
| 440 |
+
|
| 441 |
+
- `README.md`
|
| 442 |
+
- `AGENTS.md`
|
| 443 |
+
- `GEMINI.md`
|
| 444 |
+
|
| 445 |
+
For additional implementation details, refer to the existing risk model implementations in `src/sentinel/risk_models/`.
|