# Copilot Instructions
This document outlines the guiding principles and directives for the GitHub Copilot assistant for the duration of this project. The primary objective is to build, evaluate, and deploy a Retrieval-Augmented Generation (RAG) application in accordance with `project-prompt-and-rubric.md` and `project-plan.md`.
## Core Mission
Your primary goal is to assist in developing a RAG application that meets all requirements for a grade of 5. You must adhere to the development plan, follow best practices, and proactively contribute to the project's success.
## Guiding Principles
1. **Plan-Driven Development:** Always refer to `project-plan.md` as the source of truth for the current task and overall workflow. Do not deviate from the plan without explicit instruction.
2. **Test-Driven Development (TDD):** This is a strict requirement. For every new feature or piece of logic, you must first write the failing tests using `pytest` and then implement the code to make the tests pass.
3. **Continuous Integration/Continuous Deployment (CI/CD):** The project prioritizes early and continuous deployment. All changes must pass the CI/CD pipeline (install, test, build) before being merged into the `main` branch.
4. **Rubric-Focused:** All development choices should be justifiable against the `project-prompt-and-rubric.md`. This includes technology choices, implementation details, and evaluation metrics.
5. **Reproducibility:** Ensure the application is reproducible by managing dependencies in `requirements.txt` and setting fixed seeds where applicable (e.g., chunking, evaluation).
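
As an illustration of principles 2 and 5 together, the sketch below shows one test-first cycle for a deterministic chunking helper and a seeded sampler for evaluation questions. The names `chunk_text`, `sample_questions`, and `SEED` are hypothetical and do not come from `project-plan.md`; the chunk size, overlap, and seed value are assumptions chosen only for the example.

```python
import random
from typing import List, Sequence

SEED = 42  # assumed project-wide seed; any fixed value works


# Step 1 (TDD): these tests are written first and fail until the
# functions below exist and behave as specified.
def test_chunk_text_splits_into_fixed_size_pieces():
    assert chunk_text("abcdefghij", chunk_size=4) == ["abcd", "efgh", "ij"]


def test_sample_questions_is_reproducible():
    questions = [f"q{i}" for i in range(20)]
    assert sample_questions(questions, k=5) == sample_questions(questions, k=5)


# Step 2 (TDD): the minimal implementation that makes the tests pass.
def chunk_text(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into character windows of chunk_size with optional overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


def sample_questions(questions: Sequence[str], k: int, seed: int = SEED) -> List[str]:
    """Draw k evaluation questions using a private, seeded generator."""
    rng = random.Random(seed)  # local RNG avoids touching global random state
    return rng.sample(list(questions), k)
```

Using a local `random.Random(seed)` instance rather than `random.seed(...)` keeps the sampling reproducible even if other code draws from the global generator.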
## Technical Stack & Constraints
- **Language:** Python
- **Web Framework:** Flask
- **Testing:** `pytest`
- **Vector Database:** ChromaDB (local)
- **Embedding & LLM APIs:** Use free-tier services (e.g., OpenRouter, Groq, HuggingFace).
- **Deployment:** Render
- **CI/CD:** GitHub Actions
## Step-by-Step Workflow
You must follow the sequence laid out in `project-plan.md`. The key phases are:
1. **Project Setup:** Initialize the repository, virtual environment, and placeholder files.
2. **"Hello World" Deployment:** Create a minimal Flask app with a `/health` endpoint and deploy it to Render via the initial CI/CD pipeline. This is a critical first milestone.
3. **TDD Cycles:** For all subsequent features (data ingestion, embedding, RAG, web UI):
   - Write tests.
   - Implement the feature.
   - Run tests locally.
   - Commit and push to trigger the CI/CD pipeline.
   - Verify deployment.
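
The "Hello World" milestone in phase 2 could look roughly like the sketch below: a minimal Flask app exposing `/health`, plus the `pytest` test that would be written before it. The `create_app` factory and the `{"status": "ok"}` payload are assumptions for illustration; the plan only requires a simple JSON status.

```python
from flask import Flask, jsonify


def create_app() -> Flask:
    """Application factory (hypothetical name) so tests can build a fresh app."""
    app = Flask(__name__)

    @app.route("/health")
    def health():
        # Assumed payload; the requirement is only "simple JSON status".
        return jsonify({"status": "ok"})

    return app


def test_health_returns_json_status():
    # Flask's built-in test client issues requests without a running server.
    client = create_app().test_client()
    response = client.get("/health")
    assert response.status_code == 200
    assert response.get_json() == {"status": "ok"}
```

Running `pytest` against this file exercises the endpoint in process, which is the same check the CI/CD pipeline would perform before deploying to Render.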
## Key Application Requirements
- **Endpoints:**
  - `/`: Web chat interface.
  - `/chat`: API that accepts questions via POST and returns answers as JSON with citations.
  - `/health`: Simple JSON status.
- **Guardrails (Must be tested):**
  - Refuse to answer questions outside the provided corpus.
  - Limit output length.
  - Always cite sources for every answer.
- **Documentation:**
  - Keep `README.md` updated with setup and run instructions.
  - Incrementally populate `design-and-evaluation.md` as decisions are made and results are gathered.
  - Ensure `deployed.md` always contains the correct public URL.
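
The three guardrails above can be sketched as a framework-independent post-processing step that a `/chat` handler might call. Everything named here (`ChatResponse`, `build_response`, `REFUSAL`, `MAX_ANSWER_CHARS`) is hypothetical, and treating "no retrieved sources" as the out-of-corpus signal is an assumption; a real implementation may use a similarity threshold instead.

```python
from dataclasses import dataclass, field
from typing import List

REFUSAL = "I can only answer questions covered by the provided documents."
MAX_ANSWER_CHARS = 500  # assumed limit; tune against the rubric


@dataclass
class ChatResponse:
    answer: str
    citations: List[str] = field(default_factory=list)
    refused: bool = False


def build_response(draft_answer: str, sources: List[str],
                   max_chars: int = MAX_ANSWER_CHARS) -> ChatResponse:
    """Apply the guardrails to a draft answer and its retrieved sources."""
    # Guardrail 1: nothing retrieved is treated as out of corpus, so refuse.
    if not sources:
        return ChatResponse(answer=REFUSAL, refused=True)

    # Guardrail 2: cap the answer length, leaving room for the ellipsis.
    answer = draft_answer.strip()
    if len(answer) > max_chars:
        answer = answer[: max_chars - 3].rstrip() + "..."

    # Guardrail 3: always attach citations, de-duplicated in original order.
    citations = list(dict.fromkeys(sources))
    return ChatResponse(answer=answer, citations=citations)
```

Keeping this logic separate from the Flask route makes each guardrail directly testable with `pytest`, without needing a live LLM or vector store.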
## Your Role
- **Implementer:** Write code, create files, and configure services based on my requests.
- **Tester:** Write `pytest` tests for all functionality.
- **Reviewer:** Proactively identify potential issues, suggest improvements, and ensure code quality.
- **Navigator:** Keep track of the current step in `project-plan.md` and be ready to proceed to the next one.