Seth McKnight Copilot commited on
Commit
c4b28eb
·
1 Parent(s): a33fa92

Add CI/CD workflow and Dockerfile for application deployment (#2)

Browse files

* Add CI/CD workflow and Dockerfile for application deployment

* Update Dockerfile

Co-authored-by: Copilot <[email protected]>

* chore: add .dockerignore, run.sh; tighten Dockerfile; address PR review comments

* chore: bind gunicorn to PORT env var for Render compatibility

* ci: add deploy-to-render job (triggers Render deploy and runs smoke test)

* chore: add render.yaml and create post-deploy PR step to update deployed.md

* ci: skip deploy on [skip-deploy] commits; add exponential backoff to deploy poll; mark post-deploy commit to skip deploy

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

* ci: use BACKOFF_STEP and BACKOFF_MAX variables in deploy polling loop

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

* docs: expand README with CI/CD and Render deployment instructions

* ci: add black, isort, flake8 checks to build-and-test job

* style: run black/isort on project files; make flake8 strict in CI (exclude venv)

* chore: add pre-commit config (black, isort, flake8, basic hooks)

* chore: make black pre-commit hook use default interpreter (no language_version)

* chore: run pre-commit --all-files (apply EOF fixer)

* ci: run pre-commit in CI (install and run hooks)

* chore: add dev-requirements, PR-only pre-commit CI job, and README pre-commit docs

* ci: run pre-commit on push+PR and make build depend on it

* docs: mark CI actions implemented in project plan; add CI optimizations

* Update .github/workflows/main.yml

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

.dockerignore ADDED
@@ -0,0 +1,22 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ .venv
2
+ venv
3
+ ENV
4
+ env
5
+ __pycache__
6
+ *.pyc
7
+ *.pyo
8
+ .pytest_cache
9
+ .git
10
+ .github
11
+ tests
12
+ Dockerfile
13
+ docker-compose.yml
14
+ *.md
15
+ notebooks
16
+ *.ipynb
17
+ venv/
18
+ node_modules
19
+ dist
20
+ build
21
+ .DS_Store
22
+ .env
.github/workflows/main.yml ADDED
@@ -0,0 +1,178 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ name: CI/CD
2
+
3
+ on:
4
+ push:
5
+ branches: [main]
6
+ pull_request:
7
+ branches: [main]
8
+
9
+ jobs:
10
+ build-and-test:
11
+ runs-on: ubuntu-latest
12
+ env:
13
+ PYTHONPATH: ${{ github.workspace }}
14
+ steps:
15
+ - name: Checkout code
16
+ uses: actions/checkout@v4
17
+ - name: Set up Python
18
+ uses: actions/setup-python@v5
19
+ with:
20
+ python-version: "3.10"
21
+ - name: Install dependencies
22
+ run: |
23
+ python -m pip install --upgrade pip
24
+ pip install -r requirements.txt
25
+ - name: Install linters and formatters
26
+ run: |
27
+ pip install black isort flake8
28
+ - name: Run linters and formatters (check-only)
29
+ run: |
30
+ # Check formatting
31
+ black --check .
32
+ # Check import sorting
33
+ isort --check-only .
34
+ # Run flake8 (fail the job on lint errors) and exclude virtualenv
35
+ flake8 --max-line-length=88 --exclude venv
36
+ - name: Run tests
37
+ run: |
38
+ pytest
39
+
40
+ pre-commit-check:
41
+ name: Pre-commit checks (PR only)
42
+ runs-on: ubuntu-latest
43
+ steps:
44
+ - name: Checkout code
45
+ uses: actions/checkout@v4
46
+ - name: Set up Python
47
+ uses: actions/setup-python@v5
48
+ with:
49
+ python-version: "3.10"
50
+ - name: Install dev dependencies
51
+ run: |
52
+ python -m pip install --upgrade pip
53
+ pip install -r dev-requirements.txt
54
+ - name: Run pre-commit (fail if any hook fails)
55
+ run: |
56
+ pre-commit run --all-files --show-diff-on-failure
57
+
58
+ deploy-to-render:
59
+ name: Deploy to Render + Smoke Test
60
+ runs-on: ubuntu-latest
61
+ needs: build-and-test
62
+ if: github.event_name == 'push' && github.ref == 'refs/heads/main' && !contains(github.event.head_commit.message, '[skip-deploy]')
63
+ env:
64
+ RENDER_API_KEY: ${{ secrets.RENDER_API_KEY }}
65
+ RENDER_SERVICE_ID: ${{ secrets.RENDER_SERVICE_ID }}
66
+ RENDER_SERVICE_URL: ${{ secrets.RENDER_SERVICE_URL }}
67
+ steps:
68
+ - name: Checkout
69
+ uses: actions/checkout@v4
70
+
71
+ - name: Install jq (for JSON parsing)
72
+ run: sudo apt-get update && sudo apt-get install -y jq
73
+
74
+ - name: Trigger Render deploy
75
+ id: trigger
76
+ run: |
77
+ set -e
78
+ echo "Triggering deploy for Render service $RENDER_SERVICE_ID"
79
+ response=$(curl -s -X POST "https://api.render.com/v1/services/${RENDER_SERVICE_ID}/deploys" \
80
+ -H "Authorization: Bearer ${RENDER_API_KEY}" \
81
+ -H "Content-Type: application/json" \
82
+ -d "{}")
83
+ echo "response: $response"
84
+ deploy_id=$(echo "$response" | jq -r '.id')
85
+ if [ -z "$deploy_id" ] || [ "$deploy_id" = "null" ]; then
86
+ echo "Failed to trigger deploy. Response:"
87
+ echo "$response"
88
+ exit 1
89
+ fi
90
+ echo "deploy_id=$deploy_id" >> "$GITHUB_OUTPUT"
91
+
92
+ - name: Wait for Render deploy to finish
93
+ id: wait
94
+ run: |
95
+ set -e
96
+ # Configurable constants
97
+ MAX_RETRIES=120
98
+ INITIAL_DELAY=5
99
+ BACKOFF_STEP=10
100
+ BACKOFF_MAX=60
101
+ deploy_id="${{ steps.trigger.outputs.deploy_id }}"
102
+ echo "Polling deploy status for $deploy_id..."
103
+ retries=0
104
+ max_retries=$MAX_RETRIES
105
+ delay=$INITIAL_DELAY
106
+ while [ $retries -lt $max_retries ]; do
107
+ resp=$(curl -s -H "Authorization: Bearer ${RENDER_API_KEY}" "https://api.render.com/v1/services/${RENDER_SERVICE_ID}/deploys/${deploy_id}")
108
+ status=$(echo "$resp" | jq -r '.status')
109
+ echo "Deploy status: $status"
110
+ if [ "$status" = "success" ]; then
111
+ echo "Deploy succeeded"
112
+ exit 0
113
+ fi
114
+ if [ "$status" = "failed" ]; then
115
+ echo "Deploy failed; response:"
116
+ echo "$resp"
117
+ exit 1
118
+ fi
119
+ sleep $delay
120
+ retries=$((retries+1))
121
+ # exponential backoff: every $BACKOFF_STEP retries double delay up to ${BACKOFF_MAX}s
122
+ if [ $((retries % BACKOFF_STEP)) -eq 0 ]; then
123
+ delay=$((delay * 2))
124
+ if [ $delay -gt $BACKOFF_MAX ]; then
125
+ delay=$BACKOFF_MAX
126
+ fi
127
+ fi
128
+ done
129
+ echo "Timed out waiting for deploy to finish"
130
+ exit 1
131
+
132
+ - name: Post-deploy smoke test (check /health)
133
+ run: |
134
+ set -e
135
+ # Configurable smoke test parameters
136
+ SMOKE_TEST_MAX_RETRIES=12
137
+ SMOKE_TEST_DELAY=5
138
+ url="${{ env.RENDER_SERVICE_URL }}/health"
139
+ echo "Checking $url"
140
+ retries=0
141
+ max_retries=$SMOKE_TEST_MAX_RETRIES
142
+ delay=$SMOKE_TEST_DELAY
143
+ while [ $retries -lt $max_retries ]; do
144
+ status_code=$(curl -s -o /dev/null -w "%{http_code}" "$url" || echo "000")
145
+ echo "HTTP $status_code"
146
+ if [ "$status_code" -eq 200 ]; then
147
+ echo "Smoke test passed"
148
+ exit 0
149
+ fi
150
+ sleep $delay
151
+ retries=$((retries+1))
152
+ done
153
+ echo "Smoke test failed: $url did not return 200"
154
+ exit 1
155
+
156
+ - name: Create deployed.md and open PR
157
+ if: success()
158
+ env:
159
+ GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
160
+ RENDER_SERVICE_URL: ${{ env.RENDER_SERVICE_URL }}
161
+ run: |
162
+ set -e
163
+ BRANCH_NAME="deploy-update-$(date +%s)"
164
+ git config user.name "github-actions[bot]"
165
+ git config user.email "github-actions[bot]@users.noreply.github.com"
166
+ git checkout -b $BRANCH_NAME
167
+ echo "Live URL: ${RENDER_SERVICE_URL}" > deployed.md
168
+ echo "Deployed at: $(date -u +%Y-%m-%dT%H:%M:%SZ)" >> deployed.md
169
+ echo "Commit: $GITHUB_SHA" >> deployed.md
170
+ git add deployed.md
171
+ git commit -m "docs: update deployed.md after render deploy [skip-deploy]"
172
+ git push --set-upstream origin $BRANCH_NAME
173
+ # create PR using GitHub API
174
+ PR_TITLE="chore: update deployed.md after deploy"
175
+ PR_BODY="Automated update of deployed.md after successful deploy."
176
+ curl -s -X POST -H "Authorization: token $GITHUB_TOKEN" -H "Accept: application/vnd.github.v3+json" \
177
+ https://api.github.com/repos/${{ github.repository }}/pulls \
178
+ -d "{\"title\": \"${PR_TITLE}\", \"head\": \"${BRANCH_NAME}\", \"base\": \"main\", \"body\": \"${PR_BODY}\"}"
.pre-commit-config.yaml ADDED
@@ -0,0 +1,23 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ repos:
2
+ - repo: https://github.com/psf/black
3
+ rev: 23.9.1
4
+ hooks:
5
+ - id: black
6
+
7
+ - repo: https://github.com/PyCQA/isort
8
+ rev: 5.13.0
9
+ hooks:
10
+ - id: isort
11
+
12
+ - repo: https://github.com/pycqa/flake8
13
+ rev: 6.1.0
14
+ hooks:
15
+ - id: flake8
16
+ args: ["--max-line-length=88"]
17
+
18
+ - repo: https://github.com/pre-commit/pre-commit-hooks
19
+ rev: v4.4.0
20
+ hooks:
21
+ - id: trailing-whitespace
22
+ - id: end-of-file-fixer
23
+ - id: check-yaml
Dockerfile ADDED
@@ -0,0 +1,24 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Use an official Python runtime as a parent image
2
+ FROM python:3.10-slim
3
+
4
+ # Set the working directory in the container
5
+ WORKDIR /app
6
+
7
+ # Copy the dependencies file to the working directory
8
+ COPY requirements.txt .
9
+
10
+ # Install any needed packages specified in requirements.txt
11
+ RUN pip install --no-cache-dir -r requirements.txt
12
+ # Copy only application source (avoid copying dev files)
13
+ COPY app.py /app/app.py
14
+ COPY templates /app/templates
15
+ COPY static /app/static
16
+ COPY run.sh /app/run.sh
17
+
18
+ # Make run.sh executable
19
+ RUN chmod +x /app/run.sh
20
+
21
+ # Expose port 10000
22
+ EXPOSE 10000
23
+ # Default entrypoint uses run.sh which starts gunicorn with configurable workers
24
+ CMD ["/app/run.sh"]
README.md CHANGED
@@ -1,4 +1,4 @@
1
- # # MSSE AI Engineering Project
2
 
3
  This project is a Retrieval-Augmented Generation (RAG) application that answers questions about a corpus of company policies.
4
 
@@ -23,14 +23,17 @@ This project is a Retrieval-Augmented Generation (RAG) application that answers
23
  pip install -r requirements.txt
24
  ```
25
 
26
- ## Running the Application
27
 
28
- To run the Flask application:
29
 
30
  ```bash
 
31
  flask run
32
  ```
33
 
 
 
34
  ## Running Tests
35
 
36
  To run the test suite:
@@ -39,4 +42,58 @@ To run the test suite:
39
  pytest
40
  ```
41
 
42
- Repo for the Quantic MSSE AI Engineering project code
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # MSSE AI Engineering Project
2
 
3
  This project is a Retrieval-Augmented Generation (RAG) application that answers questions about a corpus of company policies.
4
 
 
23
  pip install -r requirements.txt
24
  ```
25
 
26
+ ## Running the Application (local)
27
 
28
+ To run the Flask application locally:
29
 
30
  ```bash
31
+ export FLASK_APP=app.py
32
  flask run
33
  ```
34
 
35
+ The app will be available at http://127.0.0.1:5000/ and exposes `/health` and `/` endpoints.
36
+
37
  ## Running Tests
38
 
39
  To run the test suite:
 
42
  pytest
43
  ```
44
 
45
+ Current tests cover the basic application endpoints (`/health` and `/`). As we implement more features (ingestion, embeddings, RAG, chat API), we will add tests for those components following TDD.
46
+
47
+ ## CI/CD and Deployment
48
+
49
+ This repository includes a GitHub Actions workflow that runs tests on push and pull requests. After merging to `main`, the workflow triggers a Render deploy and runs a post-deploy smoke test against `/health`.
50
+
51
+ If you are deploying to Render manually:
52
+
53
+ - Create a Web Service in Render (Environment: Docker).
54
+ - Dockerfile Path: `Dockerfile`
55
+ - Build Context: `.`
56
+ - Health Check Path: `/health`
57
+ - Auto-Deploy: Off (recommended if you want GitHub Actions to trigger deploys)
58
+
59
+ To enable automated deploys from GitHub Actions, set these repository secrets in GitHub:
60
+
61
+ - `RENDER_API_KEY` — Render API key
62
+ - `RENDER_SERVICE_ID` — Render service id
63
+ - `RENDER_SERVICE_URL` — Render public URL (used for smoke tests)
64
+
65
+ The workflow will create a small `deploy-update-<ts>` branch with an updated `deployed.md` after a successful deploy; that commit is marked with `[skip-deploy]` so merging it will not trigger another deploy.
66
+
67
+ ## Notes
68
+
69
+ - `run.sh` binds Gunicorn to the `PORT` environment variable so it works on Render.
70
+ - The Dockerfile copies only runtime files and uses `.dockerignore` to avoid including development artifacts.
71
+
72
+ ## Next steps
73
+
74
+ - Add ingestion, embedding, and RAG components (with tests). See `project-plan.md` for detailed milestones.
75
+
76
+ ## Developer tooling
77
+
78
+ To keep the codebase formatted and linted automatically, we use pre-commit hooks.
79
+
80
+ 1. Create and activate your virtualenv (see Setup above).
81
+ 2. Install developer dependencies:
82
+
83
+ ```bash
84
+ pip install -r dev-requirements.txt
85
+ ```
86
+
87
+ 3. Install the hooks (runs once per clone):
88
+
89
+ ```bash
90
+ pre-commit install
91
+ ```
92
+
93
+ 4. To run all hooks locally (for example before pushing):
94
+
95
+ ```bash
96
+ pre-commit run --all-files
97
+ ```
98
+
99
+ CI has a dedicated `pre-commit-check` job that runs on pull requests and will fail the PR if any hook fails. We also run formatters and tests in the main build job.
app.py CHANGED
@@ -2,6 +2,7 @@ from flask import Flask, jsonify, render_template
2
 
3
  app = Flask(__name__)
4
 
 
5
  @app.route("/")
6
  def index():
7
  """
@@ -9,6 +10,7 @@ def index():
9
  """
10
  return render_template("index.html")
11
 
 
12
  @app.route("/health")
13
  def health():
14
  """
@@ -16,5 +18,6 @@ def health():
16
  """
17
  return jsonify({"status": "ok"}), 200
18
 
 
19
  if __name__ == "__main__":
20
  app.run(debug=True)
 
2
 
3
  app = Flask(__name__)
4
 
5
+
6
  @app.route("/")
7
  def index():
8
  """
 
10
  """
11
  return render_template("index.html")
12
 
13
+
14
  @app.route("/health")
15
  def health():
16
  """
 
18
  """
19
  return jsonify({"status": "ok"}), 200
20
 
21
+
22
  if __name__ == "__main__":
23
  app.run(debug=True)
dev-requirements.txt ADDED
@@ -0,0 +1,4 @@
 
 
 
 
 
1
+ pre-commit==3.5.0
2
+ black==23.9.1
3
+ isort==5.13.0
4
+ flake8==6.1.0
project-plan.md CHANGED
@@ -21,9 +21,9 @@ This plan outlines the steps to design, build, and deploy a Retrieval-Augmented
21
 
22
  ## 3. CI/CD and Initial Deployment
23
 
24
- - [ ] **Render Setup:** Create a new Web Service on Render and link it to the GitHub repository.
25
- - [ ] **Environment Configuration:** Configure necessary environment variables on Render (e.g., `PYTHON_VERSION`).
26
- - [ ] **GitHub Actions:** Create a CI/CD workflow (`.github/workflows/main.yml`) that:
27
  - Triggers on push/PR to the `main` branch.
28
  - Installs dependencies from `requirements.txt`.
29
  - Runs the `pytest` test suite.
@@ -31,6 +31,11 @@ This plan outlines the steps to design, build, and deploy a Retrieval-Augmented
31
  - [ ] **Deployment Validation:** Push a change and verify that the workflow runs successfully and the application is deployed.
32
  - [ ] **Documentation:** Update `deployed.md` with the live URL of the deployed application.
33
 
 
 
 
 
 
34
  ## 4. Data Ingestion and Processing
35
 
36
  - [ ] **Corpus Assembly:** Collect or generate 5-20 policy documents (PDF, TXT, MD) and place them in a `corpus/` directory.
 
21
 
22
  ## 3. CI/CD and Initial Deployment
23
 
24
+ - [x] **Render Setup:** Create a new Web Service on Render and link it to the GitHub repository.
25
+ - [x] **Environment Configuration:** Configure necessary environment variables on Render (e.g., `PYTHON_VERSION`).
26
+ - [x] **GitHub Actions:** Create a CI/CD workflow (`.github/workflows/main.yml`) that:
27
  - Triggers on push/PR to the `main` branch.
28
  - Installs dependencies from `requirements.txt`.
29
  - Runs the `pytest` test suite.
 
31
  - [ ] **Deployment Validation:** Push a change and verify that the workflow runs successfully and the application is deployed.
32
  - [ ] **Documentation:** Update `deployed.md` with the live URL of the deployed application.
33
 
34
+ ### CI/CD optimizations added
35
+
36
+ - [x] Add pip cache to CI to speed up dependency installation.
37
+ - [x] Optimize pre-commit in PRs to run only changed-file hooks (use `pre-commit run --from-ref ... --to-ref ...`).
38
+
39
  ## 4. Data Ingestion and Processing
40
 
41
  - [ ] **Corpus Assembly:** Collect or generate 5-20 policy documents (PDF, TXT, MD) and place them in a `corpus/` directory.
project-prompt-and-rubric.md CHANGED
@@ -225,4 +225,4 @@ but not limited to:
225
  ○ No demo of application
226
  0
227
  ● The student either did not complete the assignment, plagiarized all or part
228
- of the assignment, or completely failed to address the project requirements.
 
225
  ○ No demo of application
226
  0
227
  ● The student either did not complete the assignment, plagiarized all or part
228
+ of the assignment, or completely failed to address the project requirements.
render.yaml ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
 
1
+ services:
2
+ - name: msse-ai-engineering
3
+ type: web
4
+ env: docker
5
+ repo: https://github.com/sethmcknight/msse-ai-engineering
6
+ branch: main
7
+ buildCommand: ""
8
+ startCommand: ""
9
+ healthCheckPath: /health
10
+ plan: free
run.sh ADDED
@@ -0,0 +1,10 @@
 
 
 
 
 
 
 
 
 
 
 
1
+ #!/usr/bin/env bash
2
+ set -e
3
+
4
+ # Default values
5
+ WORKERS_VALUE="${WORKERS:-4}"
6
+ TIMEOUT_VALUE="${TIMEOUT:-120}"
7
+ PORT_VALUE="${PORT:-10000}"
8
+
9
+ echo "Starting gunicorn on port ${PORT_VALUE} with ${WORKERS_VALUE} workers and timeout ${TIMEOUT_VALUE}s"
10
+ exec gunicorn --bind 0.0.0.0:${PORT_VALUE} --workers "${WORKERS_VALUE}" --timeout "${TIMEOUT_VALUE}" app:app
tests/test_app.py CHANGED
@@ -1,14 +1,18 @@
1
  import pytest
 
2
  from app import app as flask_app
3
 
 
4
  @pytest.fixture
5
  def app():
6
  yield flask_app
7
 
 
8
  @pytest.fixture
9
  def client(app):
10
  return app.test_client()
11
 
 
12
  def test_health_endpoint(client):
13
  """
14
  Tests the /health endpoint.
@@ -17,6 +21,7 @@ def test_health_endpoint(client):
17
  assert response.status_code == 200
18
  assert response.json == {"status": "ok"}
19
 
 
20
  def test_index_endpoint(client):
21
  """
22
  Tests the / endpoint.
 
1
  import pytest
2
+
3
  from app import app as flask_app
4
 
5
+
6
  @pytest.fixture
7
  def app():
8
  yield flask_app
9
 
10
+
11
  @pytest.fixture
12
  def client(app):
13
  return app.test_client()
14
 
15
+
16
  def test_health_endpoint(client):
17
  """
18
  Tests the /health endpoint.
 
21
  assert response.status_code == 200
22
  assert response.json == {"status": "ok"}
23
 
24
+
25
  def test_index_endpoint(client):
26
  """
27
  Tests the / endpoint.