
Repository Guidelines

This repository contains the LLM-based Cancer Risk Assessment Assistant.

Core Technologies

  • FastAPI for the web framework
  • LangChain for LLM orchestration
  • uv for environment and dependency management
  • Hydra for configuration management

Development Setup

Environment Setup

  • Create the virtual environment (at '.venv') with uv sync.
  • Because the repository uses uv, run every command through it, e.g., "uv run python ..." NOT "python ...".

Running Commands

  • Streamlit Interface: uv run streamlit run apps/streamlit_ui/main.py
  • CLI Demo: uv run python apps/cli/main.py
  • Tests: uv run pytest

Coding Standards

Coding Philosophy

  • Write simple, explicit, modular code
  • Prioritize clarity over cleverness
  • Prefer small pure functions over large ones
  • Return early instead of nesting deeply
  • Favor functions over classes unless essential
  • Favor simple replication over heavy abstraction
  • Keep comments short and only where code isn't self-explanatory
  • Avoid premature optimization or over-engineering

Variable Naming

  • Avoid single-letter variable names (x, y, i, j, e, t, f, m, c) in favor of descriptive names.
  • Avoid abbreviations (fh, ct, w, h) in favor of full descriptive names.
  • Use context-specific names for loop indices based on what you're iterating over:
    • item_index for general enumeration
    • line_index for text line iteration
    • column_index for table/array column iteration
    • row_index for table/array row iteration
  • Use descriptive names for comprehensions and iterations:
    • item instead of i for general items
    • element instead of e for list elements
    • key instead of k for dictionary keys
    • value instead of v for dictionary values
  • Use descriptive names for coordinates and positions:
    • x_position, y_position instead of x, y
    • width, height instead of w, h
  • Use descriptive names for data structures:
    • file_path instead of f for file paths
    • model instead of m for model instances
    • user instead of u for user objects

Examples from recent refactoring:

  • for i, ref in enumerate(references) → for ref_index, ref in enumerate(references)
  • for e in examples → for example in examples
  • for m in models → for model in models
  • x = pdf.get_x() → x_position = pdf.get_x()
  • fh → family_history (avoid abbreviations)
  • ct for ct in cancer_types → cancer_type for cancer_type in cancer_types
  • f in MODELS_DIR.glob → file_path in MODELS_DIR.glob
  • t in field_type.__args__ → type_arg in field_type.__args__

Path Handling

  • Always use pathlib.Path for all file I/O, joining, and globbing
  • Accept Path | str at function boundaries; normalize to Path internally
  • Never use os.path for path operations

Example:

from pathlib import Path

def read_text(file: Path | str) -> str:
    path = Path(file)
    return path.read_text(encoding="utf-8")

Type Hints and Modern Python

  • Use modern type hints: list, dict, tuple, set (not List, Dict, etc.)
  • Use PEP 604 unions: A | B (not Union[A, B] or Optional[A])
  • Import from typing only when necessary (TypedDict, Literal, Annotated, etc.)
  • Never use from __future__ import annotations
  • Add type hints to all public functions and methods
  • Prefer precise types (float, Path, etc.) over generic ones
  • If Any is required, isolate and document why

Import Management

  • Place all imports at the top of the file, never inside functions or classes
  • Group imports in three sections with blank lines between:
    1. Standard library imports
    2. Third-party library imports
    3. Local/project imports
  • This improves performance (imports loaded once) and code readability

Error Handling and Logging

  • Use try/except only for I/O or external APIs
  • Catch specific exceptions only (never broad except:)
  • Raise clear, actionable error messages
  • Use loguru for logging, never print() in production code

Example:

from pathlib import Path

from loguru import logger

# file_path is a Path | str supplied by the caller
try:
    data = Path(file_path).read_text(encoding="utf-8")
except FileNotFoundError as error:
    logger.error(f"Configuration file not found: {file_path}")
    raise ValueError(f"Missing required config: {file_path}") from error

Docstring Standards

  • Use Google-style docstrings for all public functions and classes
  • Do NOT include type hints in docstrings (they're in the signature)
  • Describe behavior, invariants, side effects, and edge cases
  • Include examples for complex functions
  • Avoid verbose docstrings for simple, self-explanatory functions
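
As an illustration, a Google-style docstring that follows these points might look like the sketch below. The helper itself is hypothetical; only the docstring layout matters here.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """Compute body mass index from weight and height.

    Args:
        weight_kg: Body weight in kilograms. Must be positive.
        height_cm: Standing height in centimetres. Must be positive.

    Returns:
        Weight divided by height in metres squared, rounded to one decimal.

    Raises:
        ValueError: If either measurement is zero or negative.

    Example:
        >>> body_mass_index(70.0, 175.0)
        22.9
    """
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight_kg and height_cm must be positive")
    height_m = height_cm / 100
    return round(weight_kg / height_m**2, 1)
```

The Args and Returns entries describe meaning, units, and constraints, while the types appear only in the signature.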

Testing

Testing Philosophy

  • Write meaningful tests that verify core functionality and prevent regressions
  • Use pytest as the testing framework
  • Tests go under tests/ mirroring the source layout
  • Test both valid and invalid input scenarios

Test Types

  • Unit tests: Small, deterministic, one concept per test
  • Integration tests: Real workflows or reference comparisons with external systems
  • Use pytest.mark to tag slow or manual tests

Test Coverage Requirements

  • Ensure comprehensive test coverage for all risk models
  • Ground Truth Validation: Test against known reference values
  • Input Validation: Test that invalid inputs raise ValueError
  • Edge Cases: Test boundary conditions
  • Inapplicable Cases: Test when models should return "N/A"

Running Tests

uv run pytest              # Run all tests
uv run pytest -q          # Quiet mode
uv run pytest -v          # Verbose mode
uv run pytest tests/test_risk_models/  # Specific directory

Pre-Submission Checklist

Before committing code, verify:

  1. ✅ Run uv run pytest -q (all tests pass)
  2. ✅ Run pre-commit run --all-files (all hooks pass)
  3. ✅ No print() statements in production code
  4. ✅ No broad except: blocks
  5. ✅ All type hints present on public functions
  6. ✅ File paths use pathlib.Path
  7. ✅ Logging uses loguru

Risk Models

Implemented Models

The assistant currently includes the following built-in risk calculators:

  • Gail - Breast cancer risk
  • Claus - Breast cancer risk based on family history
  • Tyrer-Cuzick - Breast cancer risk (IBIS model)
  • BOADICEA - Breast and ovarian cancer risk (via CanRisk API)
  • PLCOm2012 - Lung cancer risk
  • LLPi - Liverpool Lung Project improved model for lung cancer risk (8.7-year prediction)
  • CRC-PRO - Colorectal cancer risk
  • PCPT - Prostate cancer risk
  • Extended PBCG - Prostate cancer risk (extended model)
  • Prostate Mortality - Prostate cancer-specific mortality prediction
  • MRAT - Melanoma risk (5-year prediction)
  • aMAP - Hepatocellular carcinoma (liver cancer) risk
  • QCancer - Multi-site cancer differential

Additional models should follow the interfaces under src/sentinel/risk_models.

Risk Model Implementation Guide

Base Architecture

All risk models must inherit from RiskModel in src/sentinel/risk_models/base.py:

from sentinel.risk_models.base import RiskModel

class YourRiskModel(RiskModel):
    def __init__(self):
        super().__init__("your_model_name")
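
The inheritance pattern can be tried without the repository installed. The sketch below uses stand-in classes: StandInRiskModel and StandInUser are hypothetical substitutes for RiskModel and UserInput, and it assumes the real base class is an abstract base class that stores the name passed to __init__. The scoring arithmetic is a placeholder, not a real model.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class StandInUser:
    """Stand-in for UserInput; the real class is nested and much richer."""

    age_years: int


class StandInRiskModel(ABC):
    """Stand-in for RiskModel (assumed to be an ABC that stores its name)."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def compute_score(self, user: StandInUser) -> str:
        """Return the risk percentage as a string, or an N/A message."""

    @abstractmethod
    def cancer_type(self) -> str:
        """Return the cancer type this model assesses."""


class ExampleRiskModel(StandInRiskModel):
    def __init__(self) -> None:
        super().__init__("example_model")

    def compute_score(self, user: StandInUser) -> str:
        if user.age_years < 35:
            return "N/A: Model is only applicable from age 35."
        # Placeholder arithmetic for illustration only.
        return f"{user.age_years * 0.05:.2f}"

    def cancer_type(self) -> str:
        return "breast"
```

With an abstract base class, a subclass that omits one of the abstract methods cannot be instantiated: Python raises TypeError at construction time, so an incomplete model fails fast.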

Required Methods

Every risk model must implement these abstract methods:

def compute_score(self, user: UserInput) -> str:
    """Compute the risk score for a given user profile.

    Args:
        user: The user profile containing demographics, medical history, etc.

    Returns:
        Risk percentage as a string, or an N/A message if inapplicable.

    Raises:
        ValueError: If required inputs are missing or invalid.
    """

def cancer_type(self) -> str:
    """Return the cancer type this model assesses."""
    return "breast"  # or "lung", "prostate", etc.

def description(self) -> str:
    """Return a detailed description of the model."""

def interpretation(self) -> str:
    """Return guidance on how to interpret the results."""

def references(self) -> list[str]:
    """Return list of reference citations."""

UserInput Structure

All risk models must use the centralized UserInput structure - this is the single source of truth for all data types and enums. The UserInput class follows a hierarchical structure:

UserInput
├── demographics: Demographics
│   ├── age_years: int
│   ├── sex: Sex (enum)
│   ├── ethnicity: Ethnicity | None
│   └── anthropometrics: Anthropometrics
│       ├── height_cm: float | None
│       └── weight_kg: float | None
├── lifestyle: Lifestyle
│   ├── smoking: SmokingHistory
│   └── alcohol: AlcoholConsumption
├── personal_medical_history: PersonalMedicalHistory
│   ├── chronic_conditions: list[ChronicCondition]
│   ├── previous_cancers: list[CancerType]
│   ├── genetic_mutations: list[GeneticMutation]
│   └── tyrer_cuzick_polygenic_risk_score: float | None
├── female_specific: FemaleSpecific | None
│   ├── menstrual: MenstrualHistory
│   ├── parity: ParityHistory
│   └── breast_health: BreastHealthHistory
├── symptoms: list[SymptomEntry]
└── family_history: list[FamilyMemberCancer]

REQUIRED_INPUTS Specification

Every risk model must define a REQUIRED_INPUTS class attribute using Annotated types (from typing) with Pydantic Field constraints:

REQUIRED_INPUTS: dict[str, tuple[type, bool]] = {
    "demographics.age_years": (Annotated[int, Field(ge=18, le=100)], True),
    "demographics.sex": (Sex, True),
    "demographics.ethnicity": (Ethnicity | None, False),
    "family_history": (list, False),  # list[FamilyMemberCancer]
    "symptoms": (list, False),  # list[SymptomEntry]
}

Input Validation

Every compute_score method must start with input validation:

def compute_score(self, user: UserInput) -> str:
    """Compute the risk score for a given user profile."""
    # Validate inputs first
    is_valid, errors = self.validate_inputs(user)
    if not is_valid:
        raise ValueError(f"Invalid inputs for {self.name}: {'; '.join(errors)}")

    # Model-specific validation
    if user.demographics.sex != Sex.FEMALE:
        return "N/A: Model is only applicable to female patients."

    # Continue with model-specific logic...

Data Access Patterns

# Demographics
age = user.demographics.age_years
sex = user.demographics.sex
ethnicity = user.demographics.ethnicity

# Female-specific data
if user.female_specific is not None:
    menarche_age = user.female_specific.menstrual.age_at_menarche
    num_births = user.female_specific.parity.num_live_births

# Family history
for member in user.family_history:
    if member.cancer_type == CancerType.BREAST:
        relation = member.relation
        age_at_diagnosis = member.age_at_diagnosis

Enum Usage

Always use enums from sentinel.user_input, never string literals or custom enums:

# ✅ Correct - using UserInput enums
if user.demographics.sex == Sex.FEMALE:
if member.cancer_type == CancerType.BREAST:
if member.relation == FamilyRelation.MOTHER:

# ❌ Incorrect - string literals
if user.demographics.sex == "female":
if member.cancer_type == "breast":

# ❌ Incorrect - custom enums
if user.demographics.sex == MyCustomSex.FEMALE:

Important: All risk models must use the same centralized enums from UserInput. If a required enum doesn't exist in UserInput:

  1. Extend UserInput by adding the new enum to src/sentinel/user_input.py
  2. Do not create a model-specific enum instead, since that lets models diverge
  3. Update all models to use the new centralized enum

This ensures all risk models share the same data structure and prevents fragmentation.
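
One way to keep string literals out of model code is to convert raw text to the enum once, at the boundary where data enters the system. In the sketch below, StandInSex is a hypothetical stand-in for the Sex enum in sentinel.user_input (its values are assumed), and parse_sex is an illustrative helper.

```python
from enum import Enum


class StandInSex(str, Enum):
    """Stand-in for the Sex enum; the real values live in sentinel.user_input."""

    FEMALE = "female"
    MALE = "male"


def parse_sex(raw_value: str) -> StandInSex:
    """Convert raw text to the enum once, at the input boundary."""
    normalized_value = raw_value.strip().lower()
    # Enum lookup by value raises ValueError for anything unrecognized,
    # so typos fail loudly instead of silently comparing unequal.
    return StandInSex(normalized_value)
```

After this conversion, model code compares enum members only, and a misspelled value is rejected at the boundary instead of quietly failing an equality check deep inside a model.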

Extending UserInput

When a risk model needs fields or enums that don't exist in UserInput:

  1. Add to UserInput: Extend src/sentinel/user_input.py with new fields/enums
  2. Update all models: Ensure all existing models can handle the new fields (use | None for optional fields)
  3. Never create model-specific structures: This prevents divergence and fragmentation
  4. Test thoroughly: Add tests for new fields in tests/test_user_input.py

Example of extending UserInput:

# In src/sentinel/user_input.py
class ChronicCondition(str, Enum):
    # ... existing values
    NEW_CONDITION = "new_condition"  # Add new enum value

class PersonalMedicalHistory(StrictBaseModel):
    # ... existing fields
    new_field: float | None = Field(None, description="New field description")

Testing Requirements

Create comprehensive test files with:

  • Ground Truth Validation: Test against known reference values
  • Input Validation: Test that invalid inputs raise ValueError
  • Edge Cases: Test boundary conditions
  • Inapplicable Cases: Test cases where model should return "N/A"

Example test structure:

import pytest
from sentinel.user_input import UserInput, Demographics, Sex
from sentinel.risk_models import YourRiskModel

GROUND_TRUTH_CASES = [
    {
        "name": "test_case_name",
        "input": UserInput(
            demographics=Demographics(
                age_years=40,
                sex=Sex.FEMALE,
                # ... other fields
            ),
            # ... rest of input
        ),
        "expected": 1.5,  # Expected risk percentage
    },
    # ... more test cases
]

class TestYourRiskModel:
    def setup_method(self) -> None:
        """Create a fresh model instance before each test."""
        self.model = YourRiskModel()

    @pytest.mark.parametrize("case", GROUND_TRUTH_CASES, ids=lambda case: case["name"])
    def test_ground_truth_validation(self, case):
        """Test against ground truth results."""
        user_input = case["input"]
        expected_risk = case["expected"]

        actual_risk_str = self.model.compute_score(user_input)
        actual_risk = float(actual_risk_str)
        assert actual_risk == pytest.approx(expected_risk, abs=0.01)

Migration Checklist

When adapting an existing risk model to the new structure:

  • Update imports to use new user_input module
  • Add REQUIRED_INPUTS with Pydantic validation
  • Refactor compute_score to use new UserInput structure
  • Replace string literals with enums
  • Update parameter extraction logic
  • Add input validation at start of compute_score
  • Update all test cases to use new UserInput structure
  • Run full test suite to ensure 100% pass rate
  • Run pre-commit hooks to ensure code quality

LLM and Code Assistant Guidelines

When generating or modifying code, AI assistants MUST:

Mandatory Rules

  • Follow ALL guidelines in this document without exception
  • Never use forbidden constructs (os.path, Optional[], List[], print(), broad except:)
  • Never add decorative comment banners or unnecessary formatting
  • Always generate clean, modular, statically typed code

Code Generation Standards

  • Prefer clarity and simplicity over cleverness
  • Use modern Python type hints exclusively
  • Include comprehensive docstrings for non-trivial functions
  • Ensure all examples compile, type-check, and pass linting

Verification

All generated code must:

  • Pass ruff format and ruff check
  • Include proper type hints
  • Use pathlib.Path for all file operations
  • Use loguru for logging
  • Follow the Variable Naming guidelines

Important Note for Developers

When making changes to the project, ensure that the following files are updated to reflect the changes:

  • README.md
  • AGENTS.md
  • GEMINI.md

For additional implementation details, refer to the existing risk model implementations in src/sentinel/risk_models/.