---
title: Sentinel - Cancer Risk Assessment Assistant
emoji: 🏥
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 8501
pinned: false
---
# LLM-based Cancer Risk Assessment Assistant
This project is an API service that provides preliminary cancer risk assessments based on user-provided data. It is built using FastAPI and LangChain, with a flexible architecture that supports both local and API-based LLMs.
## Development Setup
1. Create the virtual environment and install the project dependencies:
```bash
uv sync
```
## External API Configuration
For risk models that require external APIs, such as CanRisk (BOADICEA model), fill in the following section of the `.env` file:
```bash
# .env
CANRISK_USERNAME=your_canrisk_username
CANRISK_PASSWORD=your_canrisk_password
```
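Then load it into your shell with `source .env`. Plain `source` sets these variables in the current shell without exporting them to child processes; if the application reads them from its environment, run `set -a; source .env; set +a` instead.
For CanRisk API access, register at https://www.canrisk.org/.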
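A minimal sketch of how code could check these credentials before calling CanRisk, assuming they are exposed as environment variables. The helper name `load_canrisk_credentials` is hypothetical and not taken from the project, which may load its settings differently.

```python
import os


def load_canrisk_credentials(env=None):
    """Return (username, password), or raise if either variable is unset.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    missing = [
        name
        for name in ("CANRISK_USERNAME", "CANRISK_PASSWORD")
        if not env.get(name)
    ]
    if missing:
        # Fail early with the exact variable names that need to be set.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["CANRISK_USERNAME"], env["CANRISK_PASSWORD"]
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.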
## Using a Local LLM (Ollama)
1. Install [Ollama](https://ollama.com) for your platform.
2. Pull the default model from the command line:
```bash
ollama pull gemma3:4b
```
3. Ensure the Ollama desktop app or server is running. You can check your installed models with `ollama list`.
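The same check can be done programmatically. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally installed models, and assumes the server listens on its default address `http://localhost:11434`. The function names are illustrative and not part of this project.

```python
import json
import urllib.request


def parse_model_names(payload):
    """Extract model names from an /api/tags response body."""
    return [model["name"] for model in payload.get("models", [])]


def list_ollama_models(base_url="http://localhost:11434", timeout=2.0):
    """Return installed model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts;
        # ValueError covers a response that is not valid JSON.
        return None
    return parse_model_names(payload)
```

A return value of `None` means the server could not be reached, while a list that does not contain `gemma3:4b` means the model still needs to be pulled.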
## Using API-based LLMs (Google)
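1. Add your `GOOGLE_API_KEY` to the `.env` file in the project root (`>>` appends, so existing entries such as the CanRisk credentials are kept, and the file is created if it does not exist):
```bash
echo "GOOGLE_API_KEY=your_key_here" >> .env
```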
Make sure the Generative AI API is enabled for your Google Cloud project.
2. Run the command line demo with the Google provider (default):
```bash
uv run python apps/cli/main.py
```
Switch to the local model with:
```bash
uv run python apps/cli/main.py model=gemma3_4b
```
3. The `model` override also works with the Streamlit and FastAPI interfaces.
## Interactive Demo
Run a simple command line demo with:
```bash
uv run python apps/cli/main.py
```
Enable developer mode and load user data from a file with:
```bash
uv run python apps/cli/main.py dev_mode=true user_file=examples/user_example.yaml
```
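The `key=value` arguments in these commands resemble Hydra-style overrides; whether the project actually uses Hydra is an assumption. As a rough illustration of how such arguments map to settings, the hypothetical `parse_overrides` helper below turns them into a dictionary:

```python
def parse_overrides(argv):
    """Split `key=value` arguments into a dict, converting true/false to bool.

    Illustrative only; the project's real configuration handling may differ.
    """
    overrides = {}
    for arg in argv:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"Expected key=value, got {arg!r}")
        lowered = value.lower()
        if lowered in ("true", "false"):
            overrides[key] = lowered == "true"
        else:
            overrides[key] = value
    return overrides
```

For the command above, this would yield `dev_mode` as a boolean and `user_file` as a path string.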
The script collects user data, prints the structured JSON assessment, and then allows follow-up questions in a chat-like loop. Type `quit` to exit.
A multi-page Streamlit interface for expert feedback is located at
`apps/streamlit_ui/main.py`.
The first page, **User Profile**, lets you upload or manually create a profile
before running assessments.
The **Configuration** page allows you to choose the model and knowledge base modules and shows a live preview of the full LLM prompt.
The **Assessment** page runs the model, shows a dashboard of results, and lets you export or chat with the assistant.
### Exporting Reports
After the initial assessment is displayed in the terminal, you will be prompted to export the full report to a formatted file. You can choose to generate a PDF, an Excel file, or both. The generated files (e.g., `Cancer_Risk_Report_20250626_213000.pdf`) will be saved in the root directory of the project.
**Note:** This feature requires the `openpyxl` and `reportlab` libraries.
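The example filename above suggests a `YYYYMMDD_HHMMSS` timestamp suffix. A minimal sketch of building such a name, assuming that format; the helper `report_filename` is hypothetical and the project's own naming code may differ:

```python
from datetime import datetime


def report_filename(extension, now=None):
    """Build a timestamped report name such as Cancer_Risk_Report_20250626_213000.pdf."""
    now = datetime.now() if now is None else now
    # Assumed format: date and time separated by an underscore.
    stamp = now.strftime("%Y%m%d_%H%M%S")
    return f"Cancer_Risk_Report_{stamp}.{extension}"
```

Calling it once per export with `"pdf"` or `"xlsx"` (the format `openpyxl` writes) gives each report a unique name as long as exports are at least a second apart.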
You can also provide a JSON or YAML file with all user information to skip the
interactive prompts:
```bash
uv run python apps/cli/main.py user_file=examples/user_example.yaml
```
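A sketch of how a user file could be read based on its extension. It assumes the file holds a single mapping of user fields and that PyYAML is installed for YAML input; the function name `load_user_file` is illustrative, not the project's API.

```python
import json
from pathlib import Path


def load_user_file(path):
    """Load a user profile from a .json, .yaml, or .yml file."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    suffix = path.suffix.lower()
    if suffix == ".json":
        return json.loads(text)
    if suffix in (".yaml", ".yml"):
        # PyYAML is a third-party package; import it only when needed.
        import yaml

        return yaml.safe_load(text)
    raise ValueError(f"Unsupported user file type: {suffix or '(none)'}")
```

`yaml.safe_load` is used rather than `yaml.load` so that untrusted profile files cannot construct arbitrary Python objects.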
To launch the Streamlit interface, run the following command from the root of the
project:
```bash
uv run streamlit run apps/streamlit_ui/main.py
```
*Note:* To expose the locally running app through a public URL, you can use `ngrok`:
```bash
ngrok http 8501
```
## Important Note for Developers
When making changes to the project, check whether the following files should also be updated to reflect the changes:
- `README.md`
- `AGENTS.md`
- `GEMINI.md`
## Available Risk Models
The assistant currently includes the following built-in risk calculators:
- **Gail** - Breast cancer risk
- **Claus** - Breast cancer risk based on family history
- **Tyrer-Cuzick** - Breast cancer risk (IBIS model)
- **BOADICEA** - Breast and ovarian cancer risk (via CanRisk API)
- **PLCOm2012** - Lung cancer risk
- **LLPi** - Liverpool Lung Project improved model for lung cancer risk (8.7-year prediction)
- **CRC-PRO** - Colorectal cancer risk
- **PCPT** - Prostate cancer risk
- **Extended PBCG** - Prostate cancer risk (extended model)
- **Prostate Mortality** - Prostate cancer-specific mortality prediction
- **MRAT** - Melanoma risk (5-year prediction)
- **aMAP** - Hepatocellular carcinoma (liver cancer) risk
- **QCancer** - Multi-site cancer differential
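To see coverage at a glance, the list above can be grouped by cancer type. The sketch below copies the model names and cancer types from that list; the dictionary layout and the function `models_by_cancer_type` are illustrative and do not reflect how the project registers its models.

```python
from collections import defaultdict

# Model names and cancer types copied from the list above.
RISK_MODELS = {
    "Gail": ["breast"],
    "Claus": ["breast"],
    "Tyrer-Cuzick": ["breast"],
    "BOADICEA": ["breast", "ovarian"],
    "PLCOm2012": ["lung"],
    "LLPi": ["lung"],
    "CRC-PRO": ["colorectal"],
    "PCPT": ["prostate"],
    "Extended PBCG": ["prostate"],
    "Prostate Mortality": ["prostate"],
    "MRAT": ["melanoma"],
    "aMAP": ["liver"],
    "QCancer": ["multi-site"],
}


def models_by_cancer_type(models):
    """Invert a model -> cancer types mapping into cancer type -> models."""
    grouped = defaultdict(list)
    for name, cancer_types in models.items():
        for cancer_type in cancer_types:
            grouped[cancer_type].append(name)
    return dict(grouped)
```

Grouping this way makes it easy to spot which cancer types are covered by several calculators and which rely on a single one.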
## Generating Documentation
The project includes a PDF documentation generator that describes all implemented risk models and their input requirements.
### Generate Risk Model Documentation
To generate the PDF documentation:
```bash
uv run python scripts/generate_documentation.py
```
This will create a comprehensive PDF document (`docs/risk_model_documentation.pdf`) that includes:
1. **Overview Section**:
   - Cancer type coverage chart
   - Statistics on implemented risk scores and cancer types covered
2. **Detailed Model Information**:
   - Description, interpretation, and references for each risk model
   - Complete input requirements with field details, required status, units, and possible values/choices
3. **Input-to-Cancer Mapping**:
   - Reverse mapping showing which cancer types use each input field
   - Possible values for each field
   - Comprehensive coverage analysis
The documentation is automatically regenerated based on the current codebase, ensuring it stays up-to-date as new risk models and input fields are added.
### Documentation Features
- **Comprehensive Coverage**: Documents all risk models and their input requirements
- **Visual Charts**: Includes cancer type coverage visualization
- **Detailed Tables**: Shows field specifications, constraints, and valid values
- **Professional Layout**: Clean, readable PDF format suitable for sharing
- **Auto-Generated**: Stays synchronized with code changes automatically