Spaces:

sethmcknight
/

msse-ai-engineering

Sleeping

Seth McKnight Copilot commited on Oct 12

Commit

c4b28eb

1 Parent(s): a33fa92

Add CI/CD workflow and Dockerfile for application deployment (#2)

* Add CI/CD workflow and Dockerfile for application deployment

* Update Dockerfile

Co-authored-by: Copilot <[email protected]>

* chore: add .dockerignore, run.sh; tighten Dockerfile; address PR review comments

* chore: bind gunicorn to PORT env var for Render compatibility

* ci: add deploy-to-render job (triggers Render deploy and runs smoke test)

* chore: add render.yaml and create post-deploy PR step to update deployed.md

* ci: skip deploy on [skip-deploy] commits; add exponential backoff to deploy poll; mark post-deploy commit to skip deploy

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

* ci: use BACKOFF_STEP and BACKOFF_MAX variables in deploy polling loop

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

* docs: expand README with CI/CD and Render deployment instructions

* ci: add black, isort, flake8 checks to build-and-test job

* style: run black/isort on project files; make flake8 strict in CI (exclude venv)

* chore: add pre-commit config (black, isort, flake8, basic hooks)

* chore: make black pre-commit hook use default interpreter (no language_version)

* chore: run pre-commit --all-files (apply EOF fixer)

* ci: run pre-commit in CI (install and run hooks)

* chore: add dev-requirements, PR-only pre-commit CI job, and README pre-commit docs

* ci: run pre-commit on push+PR and make build depend on it

* docs: mark CI actions implemented in project plan; add CI optimizations

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

Files changed (12) hide show

.dockerignore +22 -0
.github/workflows/main.yml +178 -0
.pre-commit-config.yaml +23 -0
Dockerfile +24 -0
README.md +61 -4
app.py +3 -0
dev-requirements.txt +4 -0
project-plan.md +8 -3
project-prompt-and-rubric.md +1 -1
render.yaml +10 -0
run.sh +10 -0
tests/test_app.py +5 -0

.dockerignore ADDED Viewed

	@@ -0,0 +1,22 @@

+.venv
+venv
+ENV
+env
+__pycache__
+*.pyc
+*.pyo
+.pytest_cache
+.git
+.github
+tests
+Dockerfile
+docker-compose.yml
+*.md
+notebooks
+*.ipynb
+venv/
+node_modules
+dist
+build
+.DS_Store
+.env

.github/workflows/main.yml ADDED Viewed

	@@ -0,0 +1,178 @@

+name: CI/CD
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+jobs:
+  build-and-test:
+    runs-on: ubuntu-latest
+    env:
+      PYTHONPATH: ${{ github.workspace }}
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.10"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r requirements.txt
+      - name: Install linters and formatters
+        run: |
+          pip install black isort flake8
+      - name: Run linters and formatters (check-only)
+        run: |
+          # Check formatting
+          black --check .
+          # Check import sorting
+          isort --check-only .
+          # Run flake8 (fail the job on lint errors) and exclude virtualenv
+          flake8 --max-line-length=88 --exclude venv
+      - name: Run tests
+        run: |
+          pytest
+  pre-commit-check:
+    name: Pre-commit checks (PR only)
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.10"
+      - name: Install dev dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r dev-requirements.txt
+      - name: Run pre-commit (fail if any hook fails)
+        run: |
+          pre-commit run --all-files --show-diff-on-failure
+  deploy-to-render:
+    name: Deploy to Render + Smoke Test
+    runs-on: ubuntu-latest
+    needs: build-and-test
+    if: github.event_name == 'push' && github.ref == 'refs/heads/main' && !contains(github.event.head_commit.message, '[skip-deploy]')
+    env:
+      RENDER_API_KEY: ${{ secrets.RENDER_API_KEY }}
+      RENDER_SERVICE_ID: ${{ secrets.RENDER_SERVICE_ID }}
+      RENDER_SERVICE_URL: ${{ secrets.RENDER_SERVICE_URL }}
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+      - name: Install jq (for JSON parsing)
+        run: sudo apt-get update && sudo apt-get install -y jq
+      - name: Trigger Render deploy
+        id: trigger
+        run: |
+          set -e
+          echo "Triggering deploy for Render service $RENDER_SERVICE_ID"
+          response=$(curl -s -X POST "https://api.render.com/v1/services/${RENDER_SERVICE_ID}/deploys" \
+            -H "Authorization: Bearer ${RENDER_API_KEY}" \
+            -H "Content-Type: application/json" \
+            -d "{}")
+          echo "response: $response"
+          deploy_id=$(echo "$response" | jq -r '.id')
+          if [ -z "$deploy_id" ] || [ "$deploy_id" = "null" ]; then
+            echo "Failed to trigger deploy. Response:"
+            echo "$response"
+            exit 1
+          fi
+          echo "deploy_id=$deploy_id" >> "$GITHUB_OUTPUT"
+      - name: Wait for Render deploy to finish
+        id: wait
+        run: |
+          set -e
+          # Configurable constants
+          MAX_RETRIES=120
+          INITIAL_DELAY=5
+          BACKOFF_STEP=10
+          BACKOFF_MAX=60
+          deploy_id="${{ steps.trigger.outputs.deploy_id }}"
+          echo "Polling deploy status for $deploy_id..."
+          retries=0
+          max_retries=$MAX_RETRIES
+          delay=$INITIAL_DELAY
+          while [ $retries -lt $max_retries ]; do
+            resp=$(curl -s -H "Authorization: Bearer ${RENDER_API_KEY}" "https://api.render.com/v1/services/${RENDER_SERVICE_ID}/deploys/${deploy_id}")
+            status=$(echo "$resp" | jq -r '.status')
+            echo "Deploy status: $status"
+            if [ "$status" = "success" ]; then
+              echo "Deploy succeeded"
+              exit 0
+            fi
+            if [ "$status" = "failed" ]; then
+              echo "Deploy failed; response:"
+              echo "$resp"
+              exit 1
+            fi
+            sleep $delay
+            retries=$((retries+1))
+            # exponential backoff: every $BACKOFF_STEP retries double delay up to ${BACKOFF_MAX}s
+            if [ $((retries % BACKOFF_STEP)) -eq 0 ]; then
+              delay=$((delay * 2))
+              if [ $delay -gt $BACKOFF_MAX ]; then
+                delay=$BACKOFF_MAX
+              fi
+            fi
+          done
+          echo "Timed out waiting for deploy to finish"
+          exit 1
+      - name: Post-deploy smoke test (check /health)
+        run: |
+          set -e
+          # Configurable smoke test parameters
+          SMOKE_TEST_MAX_RETRIES=12
+          SMOKE_TEST_DELAY=5
+          url="${{ env.RENDER_SERVICE_URL }}/health"
+          echo "Checking $url"
+          retries=0
+          max_retries=$SMOKE_TEST_MAX_RETRIES
+          delay=$SMOKE_TEST_DELAY
+          while [ $retries -lt $max_retries ]; do
+            status_code=$(curl -s -o /dev/null -w "%{http_code}" "$url" || echo "000")
+            echo "HTTP $status_code"
+            if [ "$status_code" -eq 200 ]; then
+              echo "Smoke test passed"
+              exit 0
+            fi
+            sleep $delay
+            retries=$((retries+1))
+          done
+          echo "Smoke test failed: $url did not return 200"
+          exit 1
+      - name: Create deployed.md and open PR
+        if: success()
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          RENDER_SERVICE_URL: ${{ env.RENDER_SERVICE_URL }}
+        run: |
+          set -e
+          BRANCH_NAME="deploy-update-$(date +%s)"
+          git config user.name "github-actions[bot]"
+          git config user.email "github-actions[bot]@users.noreply.github.com"
+          git checkout -b $BRANCH_NAME
+          echo "Live URL: ${RENDER_SERVICE_URL}" > deployed.md
+          echo "Deployed at: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> deployed.md
+          echo "Commit: $GITHUB_SHA" >> deployed.md
+          git add deployed.md
+          git commit -m "docs: update deployed.md after render deploy [skip-deploy]"
+          git push --set-upstream origin $BRANCH_NAME
+          # create PR using GitHub API
+          PR_TITLE="chore: update deployed.md after deploy"
+          PR_BODY="Automated update of deployed.md after successful deploy."
+          curl -s -X POST -H "Authorization: token $GITHUB_TOKEN" -H "Accept: application/vnd.github.v3+json" \
+            https://api.github.com/repos/${{ github.repository }}/pulls \
+            -d "{\"title\": \"${PR_TITLE}\", \"head\": \"${BRANCH_NAME}\", \"base\": \"main\", \"body\": \"${PR_BODY}\"}"

.pre-commit-config.yaml ADDED Viewed

	@@ -0,0 +1,23 @@

+repos:
+  - repo: https://github.com/psf/black
+    rev: 23.9.1
+    hooks:
+      - id: black
+  - repo: https://github.com/PyCQA/isort
+    rev: 5.13.0
+    hooks:
+      - id: isort
+  - repo: https://github.com/pycqa/flake8
+    rev: 6.1.0
+    hooks:
+      - id: flake8
+        args: ["--max-line-length=88"]
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: v4.4.0
+    hooks:
+      - id: trailing-whitespace
+      - id: end-of-file-fixer
+      - id: check-yaml

Dockerfile ADDED Viewed

	@@ -0,0 +1,24 @@

+# Use an official Python runtime as a parent image
+FROM python:3.10-slim
+# Set the working directory in the container
+WORKDIR /app
+# Copy the dependencies file to the working directory
+COPY requirements.txt .
+# Install any needed packages specified in requirements.txt
+RUN pip install --no-cache-dir -r requirements.txt
+# Copy only application source (avoid copying dev files)
+COPY app.py /app/app.py
+COPY templates /app/templates
+COPY static /app/static
+COPY run.sh /app/run.sh
+# Make run.sh executable
+RUN chmod +x /app/run.sh
+# Expose port 10000
+EXPOSE 10000
+# Default entrypoint uses run.sh which starts gunicorn with configurable workers
+CMD ["/app/run.sh"]

README.md CHANGED Viewed

@@ -1,4 +1,4 @@
-# # MSSE AI Engineering Project
 This project is a Retrieval-Augmented Generation (RAG) application that answers questions about a corpus of company policies.
@@ -23,14 +23,17 @@ This project is a Retrieval-Augmented Generation (RAG) application that answers
    pip install -r requirements.txt
    ```
-## Running the Application
-To run the Flask application:
 ```bash
 flask run
 ```
 ## Running Tests
 To run the test suite:
@@ -39,4 +42,58 @@ To run the test suite:
 pytest
 ```
-Repo for the Quantic MSSE AI Engineering project code

+# MSSE AI Engineering Project
 This project is a Retrieval-Augmented Generation (RAG) application that answers questions about a corpus of company policies.
    pip install -r requirements.txt
    ```
+## Running the Application (local)
+To run the Flask application locally:
 ```bash
+export FLASK_APP=app.py
 flask run
 ```
+The app will be available at http://127.0.0.1:5000/ and exposes `/health` and `/` endpoints.
 ## Running Tests
 To run the test suite:
 pytest
 ```
+Current tests cover the basic application endpoints (`/health` and `/`). As we implement more features (ingestion, embeddings, RAG, chat API), we will add tests for those components following TDD.
+## CI/CD and Deployment
+This repository includes a GitHub Actions workflow that runs tests on push and pull requests. After merging to `main`, the workflow triggers a Render deploy and runs a post-deploy smoke test against `/health`.
+If you are deploying to Render manually:
+- Create a Web Service in Render (Environment: Docker).
+- Dockerfile Path: `Dockerfile`
+- Build Context: `.`
+- Health Check Path: `/health`
+- Auto-Deploy: Off (recommended if you want GitHub Actions to trigger deploys)
+To enable automated deploys from GitHub Actions, set these repository secrets in GitHub:
+- `RENDER_API_KEY` — Render API key
+- `RENDER_SERVICE_ID` — Render service id
+- `RENDER_SERVICE_URL` — Render public URL (used for smoke tests)
+The workflow will create a small `deploy-update-<ts>` branch with an updated `deployed.md` after a successful deploy; that commit is marked with `[skip-deploy]` so merging it will not trigger another deploy.
+## Notes
+- `run.sh` binds Gunicorn to the `PORT` environment variable so it works on Render.
+- The Dockerfile copies only runtime files and uses `.dockerignore` to avoid including development artifacts.
+## Next steps
+- Add ingestion, embedding, and RAG components (with tests). See `project-plan.md` for detailed milestones.
+## Developer tooling
+To keep the codebase formatted and linted automatically, we use pre-commit hooks.
+1. Create and activate your virtualenv (see Setup above).
+2. Install developer dependencies:
+```bash
+pip install -r dev-requirements.txt
+```
+3. Install the hooks (runs once per clone):
+```bash
+pre-commit install
+```
+4. To run all hooks locally (for example before pushing):
+```bash
+pre-commit run --all-files
+```
+CI has a dedicated `pre-commit-check` job that runs on pull requests and will fail the PR if any hook fails. We also run formatters and tests in the main build job.

app.py CHANGED Viewed

@@ -2,6 +2,7 @@ from flask import Flask, jsonify, render_template
 app = Flask(__name__)
 @app.route("/")
 def index():
     """
@@ -9,6 +10,7 @@ def index():
     """
     return render_template("index.html")
 @app.route("/health")
 def health():
     """
@@ -16,5 +18,6 @@ def health():
     """
     return jsonify({"status": "ok"}), 200
 if __name__ == "__main__":
     app.run(debug=True)

 app = Flask(__name__)
 @app.route("/")
 def index():
     """
     """
     return render_template("index.html")
 @app.route("/health")
 def health():
     """
     """
     return jsonify({"status": "ok"}), 200
 if __name__ == "__main__":
     app.run(debug=True)

dev-requirements.txt ADDED Viewed

	@@ -0,0 +1,4 @@

+pre-commit==3.5.0
+black==23.9.1
+isort==5.13.0
+flake8==6.1.0

project-plan.md CHANGED Viewed

@@ -21,9 +21,9 @@ This plan outlines the steps to design, build, and deploy a Retrieval-Augmented
 ## 3. CI/CD and Initial Deployment
-- [ ] **Render Setup:** Create a new Web Service on Render and link it to the GitHub repository.
-- [ ] **Environment Configuration:** Configure necessary environment variables on Render (e.g., `PYTHON_VERSION`).
-- [ ] **GitHub Actions:** Create a CI/CD workflow (`.github/workflows/main.yml`) that:
   - Triggers on push/PR to the `main` branch.
   - Installs dependencies from `requirements.txt`.
   - Runs the `pytest` test suite.
@@ -31,6 +31,11 @@ This plan outlines the steps to design, build, and deploy a Retrieval-Augmented
 - [ ] **Deployment Validation:** Push a change and verify that the workflow runs successfully and the application is deployed.
 - [ ] **Documentation:** Update `deployed.md` with the live URL of the deployed application.
 ## 4. Data Ingestion and Processing
 - [ ] **Corpus Assembly:** Collect or generate 5-20 policy documents (PDF, TXT, MD) and place them in a `corpus/` directory.

 ## 3. CI/CD and Initial Deployment
+- [x] **Render Setup:** Create a new Web Service on Render and link it to the GitHub repository.
+- [x] **Environment Configuration:** Configure necessary environment variables on Render (e.g., `PYTHON_VERSION`).
+- [x] **GitHub Actions:** Create a CI/CD workflow (`.github/workflows/main.yml`) that:
   - Triggers on push/PR to the `main` branch.
   - Installs dependencies from `requirements.txt`.
   - Runs the `pytest` test suite.
 - [ ] **Deployment Validation:** Push a change and verify that the workflow runs successfully and the application is deployed.
 - [ ] **Documentation:** Update `deployed.md` with the live URL of the deployed application.
+### CI/CD optimizations added
+- [x] Add pip cache to CI to speed up dependency installation.
+- [x] Optimize pre-commit in PRs to run only changed-file hooks (use `pre-commit run --from-ref ... --to-ref ...`).
 ## 4. Data Ingestion and Processing
 - [ ] **Corpus Assembly:** Collect or generate 5-20 policy documents (PDF, TXT, MD) and place them in a `corpus/` directory.

project-prompt-and-rubric.md CHANGED Viewed

@@ -225,4 +225,4 @@ but not limited to:
 ○ No demo of application
 0
 ● The student either did not complete the assignment, plagiarized all or part
-of the assignment, or completely failed to address the project requirements.

 ○ No demo of application
 0
 ● The student either did not complete the assignment, plagiarized all or part
+of the assignment, or completely failed to address the project requirements.

render.yaml ADDED Viewed

	@@ -0,0 +1,10 @@

+services:
+  - name: msse-ai-engineering
+    type: web
+    env: docker
+    repo: https://github.com/sethmcknight/msse-ai-engineering
+    branch: main
+    buildCommand: ""
+    startCommand: ""
+    healthCheckPath: /health
+    plan: free

run.sh ADDED Viewed

	@@ -0,0 +1,10 @@

+#!/usr/bin/env bash
+set -e
+# Default values
+WORKERS_VALUE="${WORKERS:-4}"
+TIMEOUT_VALUE="${TIMEOUT:-120}"
+PORT_VALUE="${PORT:-10000}"
+echo "Starting gunicorn on port ${PORT_VALUE} with ${WORKERS_VALUE} workers and timeout ${TIMEOUT_VALUE}s"
+exec gunicorn --bind 0.0.0.0:${PORT_VALUE} --workers "${WORKERS_VALUE}" --timeout "${TIMEOUT_VALUE}" app:app

tests/test_app.py CHANGED Viewed

@@ -1,14 +1,18 @@
 import pytest
 from app import app as flask_app
 @pytest.fixture
 def app():
     yield flask_app
 @pytest.fixture
 def client(app):
     return app.test_client()
 def test_health_endpoint(client):
     """
     Tests the /health endpoint.
@@ -17,6 +21,7 @@ def test_health_endpoint(client):
     assert response.status_code == 200
     assert response.json == {"status": "ok"}
 def test_index_endpoint(client):
     """
     Tests the / endpoint.

 import pytest
 from app import app as flask_app
 @pytest.fixture
 def app():
     yield flask_app
 @pytest.fixture
 def client(app):
     return app.test_client()
 def test_health_endpoint(client):
     """
     Tests the /health endpoint.
     assert response.status_code == 200
     assert response.json == {"status": "ok"}
 def test_index_endpoint(client):
     """
     Tests the / endpoint.