	---
	title: Sentinel - Cancer Risk Assessment Assistant
	emoji: 🏥
	colorFrom: blue
	colorTo: purple
	sdk: docker
	app_port: 8501
	pinned: false
	---

	# LLM-based Cancer Risk Assessment Assistant

	This project is an API service that provides preliminary cancer risk assessments based on user-provided data. It is built using FastAPI and LangChain, with a flexible architecture that supports both local and API-based LLMs.

	## Development Setup

	1. Create the virtual environment:

	```bash
	uv sync
	```

	## External API Configuration

	For risk models that require external APIs, such as CanRisk (BOADICEA model), fill in the following section of the `.env` file:

	```bash
	# .env
	CANRISK_USERNAME=your_canrisk_username
	CANRISK_PASSWORD=your_canrisk_password
	```

	Then source it: `source .env`

	For CanRisk API access, register at https://www.canrisk.org/.
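
	One caveat with `source .env`: lines of the form `KEY=value` are set in the current shell but not exported, so a child process such as `uv run` may not see them. Whether the application also reads `.env` on its own is not stated here. If it relies on the shell environment, a minimal sketch along these lines exports every variable in the file. The `load_env` helper name is hypothetical and not part of the project:

	```bash
	# Hypothetical helper (not part of the project): export every KEY=value
	# line of an env file so that child processes inherit the variables.
	load_env() {
		env_file="${1:-.env}"
		# Prefix bare file names with ./ so the file is not looked up on PATH.
		case "$env_file" in
			*/*) ;;
			*) env_file="./$env_file" ;;
		esac
		if [ ! -f "$env_file" ]; then
			echo "env file not found: $env_file" >&2
			return 1
		fi
		set -a          # auto-export every variable assigned from here on
		. "$env_file"
		set +a          # switch auto-export off again
	}
	```

	For example, `load_env .env && uv run python apps/cli/main.py` would start the CLI with the CanRisk credentials visible to the Python process.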

	## Using a Local LLM (Ollama)

	1. Install [Ollama](https://ollama.com) for your platform.
	2. Pull the default model from the command line:

	```bash
	ollama pull gemma3:4b
	```
	3. Ensure the Ollama desktop app or server is running. You can check your installed models with `ollama list`.
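
	To script the check in step 3, the output of `ollama list` can be filtered for the expected model. The sketch below assumes the output has a header row followed by one row per model, with the model name in the first column. `has_model` is a hypothetical helper, not an Ollama command:

	```bash
	# Hypothetical helper: reads `ollama list` output on stdin and succeeds
	# only if the given model name appears in the first column.
	has_model() {
		awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
	}

	# Usage (requires a running Ollama installation):
	#   ollama list | has_model gemma3:4b && echo "gemma3:4b is installed"
	```

	If `ollama list` produces no rows, for example because the server is not running, the helper returns a non-zero status.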

	## Using API-based LLMs (Google)

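	1. Add your `GOOGLE_API_KEY` to the `.env` file in the project root. The command below appends to the file, creating it if needed, so existing entries such as the CanRisk credentials are preserved:

	```bash
	echo "GOOGLE_API_KEY=your_key_here" >> .env
	```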

	Make sure the Generative AI API is enabled for your Google Cloud project.

	2. Run the command line demo with the Google provider (default):

	```bash
	uv run python apps/cli/main.py
	```

	Switch to the local model with:

	```bash
	uv run python apps/cli/main.py model=gemma3_4b
	```

	3. The `model` override also works with the Streamlit and FastAPI interfaces.


	## Interactive Demo

	Run a simple command line demo with:

	```bash
	uv run python apps/cli/main.py
	```

	Enable developer mode and load user data from a file with:

	```bash
	uv run python apps/cli/main.py dev_mode=true user_file=examples/user_example.yaml
	```

	The script collects user data, prints the structured JSON assessment, and then allows follow-up questions in a chat-like loop. Type `quit` to exit.

	The multi-page Streamlit app at `apps/streamlit_ui/main.py` provides an interface for expert feedback.
	The first page, User Profile, lets you upload or manually create a profile before running assessments.
	The Configuration page lets you choose the model and knowledge base modules and shows a live preview of the full LLM prompt.
	The Assessment page runs the model, shows a dashboard of results, and lets you export the report or chat with the assistant.

	### Exporting Reports

	After the initial assessment is displayed in the terminal, you will be prompted to export the full report to a formatted file. You can choose to generate a PDF, an Excel file, or both. The generated files (e.g., `Cancer_Risk_Report_20250626_213000.pdf`) will be saved in the root directory of the project.

	Note: This feature requires the `openpyxl` and `reportlab` libraries.
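
	The example file name embeds a timestamp that appears to follow the `YYYYMMDD_HHMMSS` pattern, so lexical order should match chronological order. Under that assumption, a small hypothetical helper (`latest_report`, not part of the project) can pick out the most recent PDF report in a directory:

	```bash
	# Hypothetical helper: print the newest PDF report in a directory.
	# Assumes file names embed a YYYYMMDD_HHMMSS timestamp, so the last
	# match in the shell's sorted glob expansion is the most recent one.
	latest_report() {
		latest=""
		for f in "${1:-.}"/Cancer_Risk_Report_*.pdf; do
			# Skip the unexpanded pattern when no report exists.
			[ -e "$f" ] && latest="$f"
		done
		[ -n "$latest" ] && printf '%s\n' "$latest"
	}
	```

	Run from the project root, `latest_report .` prints the path of the newest report, or returns a non-zero status if none exists.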

	You can also provide a JSON or YAML file with all user information to skip the
	interactive prompts:

	```bash
	uv run python apps/cli/main.py user_file=examples/user_example.yaml
	```

	To launch the Streamlit interface, run the following command from the root of the
	project:

	```bash
	uv run streamlit run apps/streamlit_ui/main.py
	```

	Note: To expose the locally running app through a public URL, you can use `ngrok`:

	```bash
	ngrok http 8501
	```

	## Important Note for Developers

	When making changes to the project, check whether the following files should also be updated to reflect them:

	- `README.md`
	- `AGENTS.md`
	- `GEMINI.md`

	## Available Risk Models

	The assistant currently includes the following built-in risk calculators:

	- Gail - Breast cancer risk
	- Claus - Breast cancer risk based on family history
	- Tyrer-Cuzick - Breast cancer risk (IBIS model)
	- BOADICEA - Breast and ovarian cancer risk (via CanRisk API)
	- PLCOm2012 - Lung cancer risk
	- LLPi - Liverpool Lung Project improved model for lung cancer risk (8.7-year prediction)
	- CRC-PRO - Colorectal cancer risk
	- PCPT - Prostate cancer risk
	- Extended PBCG - Prostate cancer risk (extended model)
	- Prostate Mortality - Prostate cancer-specific mortality prediction
	- MRAT - Melanoma risk (5-year prediction)
	- aMAP - Hepatocellular carcinoma (liver cancer) risk
	- QCancer - Multi-site cancer differential

	## Generating Documentation

	The project includes a comprehensive PDF documentation generator that creates detailed documentation of all implemented risk models and their input requirements.

	### Generate Risk Model Documentation

	To generate the PDF documentation:

	```bash
	uv run python scripts/generate_documentation.py
	```

	This will create a comprehensive PDF document (`docs/risk_model_documentation.pdf`) that includes:

	1. Overview Section:
	   - Cancer type coverage chart
	   - Statistics on implemented risk scores and cancer types covered

	2. Detailed Model Information:
	   - Description, interpretation, and references for each risk model
	   - Complete input requirements with field details, required status, units, and possible values/choices

	3. Input-to-Cancer Mapping:
	   - Reverse mapping showing which cancer types use each input field
	   - Possible values for each field
	   - Comprehensive coverage analysis

	The script builds the documentation from the current codebase, so rerunning it after new risk models or input fields are added brings the PDF up to date.

	### Documentation Features

	- Comprehensive Coverage: Documents all risk models and their input requirements
	- Visual Charts: Includes cancer type coverage visualization
	- Detailed Tables: Shows field specifications, constraints, and valid values
	- Professional Layout: Clean, readable PDF format suitable for sharing
	- Auto-Generated: Stays synchronized with code changes automatically