---
title: Sentinel - Cancer Risk Assessment Assistant
emoji: 🏥
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 8501
pinned: false
---

# LLM-based Cancer Risk Assessment Assistant

This project is an API service that provides preliminary cancer risk assessments based on user-provided data. It is built using FastAPI and LangChain, with a flexible architecture that supports both local and API-based LLMs.

## Development Setup

1. Create the virtual environment:

```bash
uv sync
```

## External API Configuration

For risk models that require external APIs, such as CanRisk (BOADICEA model), fill in the following section of the `.env` file:

```bash
# .env
CANRISK_USERNAME=your_canrisk_username
CANRISK_PASSWORD=your_canrisk_password
```

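Then load it into your shell. A plain `source .env` sets the variables without exporting them, so use `set -a; source .env; set +a` if child processes need to see them.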

For CanRisk API access, register at https://www.canrisk.org/.
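Before running a model that calls CanRisk, it can help to confirm that both variables are actually visible to Python. The following is a minimal sketch, not part of the project: `missing_canrisk_vars` is a hypothetical helper, and only the two variable names come from the `.env` example above.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = ("CANRISK_USERNAME", "CANRISK_PASSWORD")


def missing_canrisk_vars(env=None):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_canrisk_vars()
    if missing:
        print("Missing CanRisk settings:", ", ".join(missing))
    else:
        print("CanRisk credentials found in the environment.")
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.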

## Using a Local LLM (Ollama)

1. Install [Ollama](https://ollama.com) for your platform.
2. Pull the default model from the command line:

```bash
ollama pull gemma3:4b
```
3. Ensure the Ollama desktop app or server is running. You can check your installed models with `ollama list`.
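
The same check can be scripted. This sketch assumes Ollama is listening on its default local address (`http://localhost:11434`) and uses its `/api/tags` endpoint, which lists locally installed models. The function names are illustrative and not part of this project.

```python
import json
import urllib.error
import urllib.request

# Default local Ollama address; adjust if your server runs elsewhere.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(payload):
    """Extract model names from a parsed /api/tags response."""
    return [model["name"] for model in payload.get("models", [])]


def installed_models(url=OLLAMA_TAGS_URL, timeout=3.0):
    """Return installed model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (urllib.error.URLError, OSError):
        return None


if __name__ == "__main__":
    models = installed_models()
    if models is None:
        print("Ollama server not reachable.")
    elif "gemma3:4b" not in models:
        print("Server is up, but gemma3:4b is not installed.")
    else:
        print("gemma3:4b is available.")
```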

## Using API-based LLMs (Google)

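1. Add your `GOOGLE_API_KEY` to the `.env` file in the project root. The command below appends with `>>`, so existing entries such as the CanRisk credentials are preserved:

   ```bash
   echo "GOOGLE_API_KEY=your_key_here" >> .env
   ```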

   Make sure the Generative AI API is enabled for your Google Cloud project.

2. Run the command-line demo with the Google provider (default):

   ```bash
   uv run python apps/cli/main.py
   ```

   Switch to the local model with:

   ```bash
   uv run python apps/cli/main.py model=gemma3_4b
   ```

3. The `model` override also works with the Streamlit and FastAPI interfaces.


## Interactive Demo

Run a simple command-line demo with:

```bash
uv run python apps/cli/main.py
```

Enable developer mode and load user data from a file with:

```bash
uv run python apps/cli/main.py dev_mode=true user_file=examples/user_example.yaml
```

The script collects user data, prints the structured JSON assessment, and then allows follow-up questions in a chat-like loop. Type `quit` to exit.

The multi-page Streamlit app at `apps/streamlit_ui/main.py` provides an interface for expert feedback. It has three pages:

- **User Profile** lets you upload or manually create a profile before running assessments.
- **Configuration** lets you choose the model and knowledge base modules and shows a live preview of the full LLM prompt.
- **Assessment** runs the model, shows a dashboard of results, and lets you export the report or chat with the assistant.

### Exporting Reports

After the initial assessment is displayed in the terminal, you will be prompted to export the full report to a formatted file. You can choose to generate a PDF, an Excel file, or both. The generated files (e.g., `Cancer_Risk_Report_20250626_213000.pdf`) will be saved in the root directory of the project.

**Note:** This feature requires the `openpyxl` and `reportlab` libraries.
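
The example filename above suggests a `YYYYMMDD_HHMMSS` timestamp suffix. A minimal sketch of building such a name, assuming that pattern holds; `report_filename` is a hypothetical helper, not the project's own function:

```python
from datetime import datetime


def report_filename(when, extension):
    """Build a report name like Cancer_Risk_Report_20250626_213000.pdf.

    Assumes the timestamp format inferred from the example filename.
    """
    stamp = when.strftime("%Y%m%d_%H%M%S")
    return f"Cancer_Risk_Report_{stamp}.{extension}"


if __name__ == "__main__":
    now = datetime.now()
    print(report_filename(now, "pdf"))
    print(report_filename(now, "xlsx"))
```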

You can also provide a JSON or YAML file with all user information to skip the
interactive prompts:

```bash
uv run python apps/cli/main.py user_file=examples/user_example.yaml
```
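
To inspect a user file before passing it to the CLI, a loader that picks the parser from the file extension is enough. This is only a sketch: `load_user_file` is a hypothetical name, the project's own loading and validation logic may differ, and the YAML branch assumes PyYAML is installed.

```python
import json
from pathlib import Path


def load_user_file(path):
    """Parse a JSON or YAML file into Python data, chosen by file extension.

    Hypothetical helper; it does not validate the profile schema.
    """
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    suffix = path.suffix.lower()
    if suffix == ".json":
        return json.loads(text)
    if suffix in (".yaml", ".yml"):
        import yaml  # PyYAML, imported lazily; assumed to be installed

        return yaml.safe_load(text)
    raise ValueError(f"Unsupported user file type: {suffix!r}")


if __name__ == "__main__":
    print(load_user_file("examples/user_example.yaml"))
```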

To launch the Streamlit interface, run the following command from the root of the
project:

```bash
uv run streamlit run apps/streamlit_ui/main.py
```

**Note:** To expose the locally running app through a public URL, you can use `ngrok`:

```bash
ngrok http 8501
```

## Important Note for Developers

When making changes to the project, check whether the following files should also be updated to reflect them:

- `README.md`
- `AGENTS.md`
- `GEMINI.md`

## Available Risk Models

The assistant currently includes the following built-in risk calculators:

- **Gail** - Breast cancer risk
- **Claus** - Breast cancer risk based on family history
- **Tyrer-Cuzick** - Breast cancer risk (IBIS model)
- **BOADICEA** - Breast and ovarian cancer risk (via CanRisk API)
- **PLCOm2012** - Lung cancer risk
- **LLPi** - Liverpool Lung Project improved model for lung cancer risk (8.7-year prediction)
- **CRC-PRO** - Colorectal cancer risk
- **PCPT** - Prostate cancer risk
- **Extended PBCG** - Prostate cancer risk (extended model)
- **Prostate Mortality** - Prostate cancer-specific mortality prediction
- **MRAT** - Melanoma risk (5-year prediction)
- **aMAP** - Hepatocellular carcinoma (liver cancer) risk
- **QCancer** - Multi-site cancer differential

## Generating Documentation

The project includes a comprehensive PDF documentation generator that creates detailed documentation of all implemented risk models and their input requirements.

### Generate Risk Model Documentation

To generate the PDF documentation:

```bash
uv run python scripts/generate_documentation.py
```

This will create a comprehensive PDF document (`docs/risk_model_documentation.pdf`) that includes:

1. **Overview Section**:
   - Cancer type coverage chart
   - Statistics on implemented risk scores and cancer types covered

2. **Detailed Model Information**:
   - Description, interpretation, and references for each risk model
   - Complete input requirements with field details, required status, units, and possible values/choices

3. **Input-to-Cancer Mapping**:
   - Reverse mapping showing which cancer types use each input field
   - Possible values for each field
   - Comprehensive coverage analysis

The documentation is built from the current codebase each time the script runs, so rerunning it keeps the PDF up to date as new risk models and input fields are added.

### Documentation Features

- **Comprehensive Coverage**: Documents all risk models and their input requirements
- **Visual Charts**: Includes cancer type coverage visualization
- **Detailed Tables**: Shows field specifications, constraints, and valid values
- **Professional Layout**: Clean, readable PDF format suitable for sharing
- **Auto-Generated**: Built from the codebase, so rerunning the script picks up code changes