sethmcknight committed
Commit 45cf08e · 1 Parent(s): 7e43525

refactor: enhance gunicorn startup script with health checks and config handling

Files changed (3)
  1. Dockerfile +1 -0
  2. POSTGRES_MIGRATION.md +51 -13
  3. run.sh +35 -9
Dockerfile CHANGED
@@ -25,6 +25,7 @@ COPY src ./src
 COPY data ./data
 COPY scripts ./scripts
 COPY run.sh ./run.sh
+COPY gunicorn.conf.py ./gunicorn.conf.py
 
 RUN chmod +x run.sh && chmod +x scripts/init_pgvector.py || true
 
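Note: the commit copies `gunicorn.conf.py` into the image, but the file's contents are not part of this diff. A minimal sketch of what such a gunicorn config module can look like — purely illustrative, and the `WEB_CONCURRENCY`/`GUNICORN_TIMEOUT` env var names are assumptions, not the repository's actual file:

```python
# gunicorn.conf.py — illustrative sketch only; the real file is not in this diff.
# Gunicorn imports this module when launched with `--config gunicorn.conf.py`;
# command-line flags (as passed in run.sh) take precedence over these values.
import os

workers = int(os.environ.get("WEB_CONCURRENCY", "1"))    # assumed env var name
timeout = int(os.environ.get("GUNICORN_TIMEOUT", "120"))  # assumed env var name
loglevel = "info"
accesslog = "-"        # access log to stdout
errorlog = "-"         # error log to stderr
capture_output = True  # redirect worker stdout/stderr into the error log
```
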
POSTGRES_MIGRATION.md CHANGED
@@ -1,11 +1,13 @@
 # PostgreSQL Migration Guide
 
 ## Overview
+
 This branch implements PostgreSQL with pgvector as an alternative to ChromaDB for vector storage. This reduces memory usage from 400MB+ to ~50-100MB by storing vectors on disk instead of in RAM.
 
 ## What's Been Implemented
 
 ### 1. PostgresVectorService (`src/vector_db/postgres_vector_service.py`)
+
 - Full PostgreSQL integration with pgvector extension
 - Automatic table creation and indexing
 - Similarity search using cosine distance
@@ -13,25 +15,30 @@ This branch implements PostgreSQL with pgvector as an alternative to ChromaDB fo
 - Health monitoring and collection info
 
 ### 2. PostgresVectorAdapter (`src/vector_db/postgres_adapter.py`)
+
 - Compatibility layer for existing ChromaDB interface
 - Ensures seamless migration without code changes
 - Converts between PostgreSQL and ChromaDB result formats
 
 ### 3. Updated Configuration (`src/config.py`)
+
 - Added `VECTOR_STORAGE_TYPE` environment variable
 - PostgreSQL connection settings
 - Memory optimization parameters
 
 ### 4. Factory Pattern (`src/vector_store/vector_db.py`)
+
 - `create_vector_database()` function selects backend automatically
 - Supports both ChromaDB and PostgreSQL based on configuration
 
 ### 5. Migration Script (`scripts/migrate_to_postgres.py`)
+
 - Data optimization (text summarization, metadata cleaning)
 - Batch processing with memory management
 - Handles 4GB → 1GB data reduction for free tier
 
 ### 6. Tests (`tests/test_vector_store/test_postgres_vector.py`)
+
 - Unit tests with mocked dependencies
 - Integration tests for real database
 - Compatibility tests for ChromaDB interface
@@ -39,15 +46,18 @@ This branch implements PostgreSQL with pgvector as an alternative to ChromaDB fo
 ## Setup Instructions
 
 ### Step 1: Create Render PostgreSQL Database
+
 1. Go to Render Dashboard
 2. Create → PostgreSQL
 3. Choose "Free" plan (1GB storage, 30 days)
 4. Save the connection details
 
 ### Step 2: Enable pgvector Extension
+
 You have several options to enable pgvector:
 
 **Option A: Use the initialization script (Recommended)**
+
 ```bash
 # Set your database URL
 export DATABASE_URL="postgresql://user:password@host:port/database"
@@ -58,16 +68,19 @@ python scripts/init_pgvector.py
 
 **Option B: Manual SQL**
 Connect to your database and run:
+
 ```sql
 CREATE EXTENSION IF NOT EXISTS vector;
 ```
 
 **Option C: From Render Dashboard**
+
 1. Go to your PostgreSQL service → Info tab
 2. Use the "PSQL Command" to connect
 3. Run: `CREATE EXTENSION IF NOT EXISTS vector;`
 
 The initialization script (`scripts/init_pgvector.py`) will:
+
 - Test database connection
 - Check PostgreSQL version compatibility (13+)
 - Install pgvector extension safely
@@ -75,7 +88,9 @@ The initialization script (`scripts/init_pgvector.py`) will:
 - Provide detailed logging and error messages
 
 ### Step 3: Update Environment Variables
+
 Add to your Render environment variables:
+
 ```bash
 DATABASE_URL=postgresql://username:password@host:port/database
 VECTOR_STORAGE_TYPE=postgres
@@ -83,12 +98,15 @@ MEMORY_LIMIT_MB=400
 ```
 
 ### Step 4: Install Dependencies
+
 ```bash
 pip install psycopg2-binary==2.9.7
 ```
 
 ### Step 5: Run Migration (Optional)
+
 If you have existing ChromaDB data:
+
 ```bash
 python scripts/migrate_to_postgres.py --database-url="your-connection-string"
 ```
@@ -96,12 +114,15 @@ python scripts/migrate_to_postgres.py --database-url="your-connection-string"
 ## Usage
 
 ### Switch to PostgreSQL
+
 Set environment variable:
+
 ```bash
 export VECTOR_STORAGE_TYPE=postgres
 ```
 
 ### Use in Code (No Changes Required!)
+
 ```python
 from src.vector_store.vector_db import create_vector_database
 
@@ -113,22 +134,24 @@ results = vector_db.search(query_embedding, top_k=5)
 
 ## Expected Memory Reduction
 
-| Component | Before (ChromaDB) | After (PostgreSQL) | Savings |
-|-----------|------------------|-------------------|---------|
-| Vector Storage | 200-300MB | 0MB (disk) | 200-300MB |
-| Embedding Model | 100MB | 50MB (smaller model) | 50MB |
-| Application Code | 50-100MB | 50-100MB | 0MB |
-| **Total** | **350-500MB** | **50-150MB** | **300-350MB** |
+| Component        | Before (ChromaDB) | After (PostgreSQL)   | Savings       |
+| ---------------- | ----------------- | -------------------- | ------------- |
+| Vector Storage   | 200-300MB         | 0MB (disk)           | 200-300MB     |
+| Embedding Model  | 100MB             | 50MB (smaller model) | 50MB          |
+| Application Code | 50-100MB          | 50-100MB             | 0MB           |
+| **Total**        | **350-500MB**     | **50-150MB**         | **300-350MB** |
 
 ## Migration Optimizations
 
 ### Data Size Reduction
+
 - **Text Summarization**: Documents truncated to 1000 characters
 - **Metadata Cleaning**: Only essential fields kept
 - **Dimension Reduction**: Can use smaller embedding models
 - **Quality Filtering**: Skip very short or low-quality documents
 
 ### Memory Management
+
 - **Batch Processing**: Process documents in small batches
 - **Garbage Collection**: Aggressive cleanup between operations
 - **Streaming**: Process data without loading everything into memory
@@ -136,17 +159,20 @@ results = vector_db.search(query_embedding, top_k=5)
 ## Testing
 
 ### Unit Tests
+
 ```bash
 pytest tests/test_vector_store/test_postgres_vector.py -v
 ```
 
 ### Integration Tests (Requires Database)
+
 ```bash
 export TEST_DATABASE_URL="postgresql://test:test@localhost:5432/test_db"
 pytest tests/test_vector_store/test_postgres_vector.py -m integration -v
 ```
 
 ### Migration Test
+
 ```bash
 python scripts/migrate_to_postgres.py --test-only
 ```
@@ -154,13 +180,17 @@ python scripts/migrate_to_postgres.py --test-only
 ## Deployment
 
 ### Local Development
+
 Keep using ChromaDB:
+
 ```bash
 export VECTOR_STORAGE_TYPE=chroma
 ```
 
 ### Production (Render)
+
 Switch to PostgreSQL:
+
 ```bash
 export VECTOR_STORAGE_TYPE=postgres
 export DATABASE_URL="your-render-postgres-url"
@@ -169,10 +199,13 @@ export DATABASE_URL="your-render-postgres-url"
 ## Troubleshooting
 
 ### Common Issues
+
 1. **"pgvector extension not found"**
+
    - Run `CREATE EXTENSION vector;` in your database
 
 2. **Connection errors**
+
    - Verify DATABASE_URL format: `postgresql://user:pass@host:port/db`
    - Check firewall/network connectivity
 
@@ -181,6 +214,7 @@ export DATABASE_URL="your-render-postgres-url"
    - Check that old ChromaDB files aren't being loaded
 
 ### Monitoring
+
 ```python
 from src.vector_db.postgres_vector_service import PostgresVectorService
 
@@ -190,23 +224,27 @@ print(health) # Shows connection status, document count, etc.
 ```
 
 ## Rollback Plan
+
 If issues occur, simply change back to ChromaDB:
+
 ```bash
 export VECTOR_STORAGE_TYPE=chroma
 ```
+
 The factory pattern ensures seamless switching between backends.
 
 ## Performance Comparison
 
-| Operation | ChromaDB | PostgreSQL | Notes |
-|-----------|----------|------------|-------|
-| Insert | Fast | Medium | Network overhead |
-| Search | Very Fast | Fast | pgvector is optimized |
-| Memory | High | Low | Vectors stored on disk |
-| Persistence | File-based | Database | More reliable |
-| Scaling | Limited | Excellent | Can upgrade storage |
+| Operation   | ChromaDB   | PostgreSQL | Notes                  |
+| ----------- | ---------- | ---------- | ---------------------- |
+| Insert      | Fast       | Medium     | Network overhead       |
+| Search      | Very Fast  | Fast       | pgvector is optimized  |
+| Memory      | High       | Low        | Vectors stored on disk |
+| Persistence | File-based | Database   | More reliable          |
+| Scaling     | Limited    | Excellent  | Can upgrade storage    |
 
 ## Next Steps
+
 1. Test locally with PostgreSQL
 2. Create Render PostgreSQL database
 3. Run migration script
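The guide leans on `create_vector_database()` without showing it. For reference, a minimal sketch of the backend dispatch it describes — hypothetical code, since the actual `src/vector_store/vector_db.py` is not part of this commit; the `ChromaVectorDatabase` import and the `database_url=` constructor argument are invented placeholders:

```python
# Hypothetical sketch of the factory described in the guide; the actual
# implementation in src/vector_store/vector_db.py is not shown in this commit.
import os

def create_vector_database():
    """Return a vector store chosen via the VECTOR_STORAGE_TYPE env var."""
    storage_type = os.environ.get("VECTOR_STORAGE_TYPE", "chroma").lower()
    if storage_type == "postgres":
        # The adapter exposes the ChromaDB-compatible interface over pgvector.
        from src.vector_db.postgres_adapter import PostgresVectorAdapter
        return PostgresVectorAdapter(database_url=os.environ["DATABASE_URL"])
    if storage_type == "chroma":
        from src.vector_store.chroma_db import ChromaVectorDatabase  # placeholder name
        return ChromaVectorDatabase()
    raise ValueError(f"Unsupported VECTOR_STORAGE_TYPE: {storage_type!r}")
```

Because callers only ever touch the object this factory returns, flipping `VECTOR_STORAGE_TYPE` is the entire rollback story described above.
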
run.sh CHANGED
@@ -25,16 +25,24 @@ done
 echo "Starting gunicorn on port ${PORT_VALUE} with ${WORKERS_VALUE} workers and timeout ${TIMEOUT_VALUE}s"
 export PYTHONPATH="/app${PYTHONPATH:+:$PYTHONPATH}"
 
+# Determine gunicorn config usage
+GUNICORN_CONFIG_ARG=""
+if [ -f gunicorn.conf.py ]; then
+    GUNICORN_CONFIG_ARG="--config gunicorn.conf.py"
+else
+    echo "Warning: gunicorn.conf.py not found; starting with inline CLI options only."
+fi
+
 # Start gunicorn in background so we can trap signals and collect diagnostics
 gunicorn \
     --bind 0.0.0.0:${PORT_VALUE} \
     --workers "${WORKERS_VALUE}" \
     --timeout "${TIMEOUT_VALUE}" \
-    --log-level debug \
+    --log-level info \
     --access-logfile - \
     --error-logfile - \
     --capture-output \
-    --config gunicorn.conf.py \
+    ${GUNICORN_CONFIG_ARG} \
    app:app &
 
 GUNICORN_PID=$!
@@ -55,18 +63,36 @@ handle_term() {
 }
 trap 'handle_term' SIGTERM SIGINT
 
-# Give gunicorn a moment to start before pre-warm
-echo "Waiting for server to start to pre-warm..."
-sleep 5
+# Readiness probe loop
+echo "Waiting for application readiness (health endpoint)..."
+READY_TIMEOUT="${READY_TIMEOUT:-60}"  # total seconds to wait
+READY_INTERVAL="${READY_INTERVAL:-3}" # seconds between checks
+ELAPSED=0
+READY=0
+while [ "$ELAPSED" -lt "$READY_TIMEOUT" ]; do
+    if ! kill -0 "${GUNICORN_PID}" 2>/dev/null; then
+        echo "Gunicorn process exited prematurely during startup; aborting." >&2
+        exit 1
+    fi
+    if curl -fsS "http://localhost:${PORT_VALUE}/health" >/dev/null 2>&1; then
+        READY=1
+        break
+    fi
+    sleep "$READY_INTERVAL"
+    ELAPSED=$((ELAPSED + READY_INTERVAL))
+done
+if [ "$READY" -ne 1 ]; then
+    echo "Health endpoint not ready after ${READY_TIMEOUT}s; continuing but marking as degraded." >&2
+fi
 
-# Pre-warm application (best-effort; don't fail startup if warm request fails)
-echo "Pre-warming application..."
+# Pre-warm (chat) if health is ready
+echo "Pre-warming application via /chat endpoint..."
 curl -sS -X POST http://localhost:${PORT_VALUE}/chat \
     -H "Content-Type: application/json" \
     -d '{"message":"pre-warm"}' \
-    --max-time 180 --fail >/dev/null 2>&1 || echo "Pre-warm request failed but continuing..."
+    --max-time 30 --fail >/dev/null 2>&1 || echo "Pre-warm request failed but continuing..."
 
-echo "Server is running."
+echo "Server is running (PID ${GUNICORN_PID})."
 
 # Wait for gunicorn to exit and forward its exit code
 wait "${GUNICORN_PID}"
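The new readiness loop polls `http://localhost:${PORT_VALUE}/health`, so it assumes the application exposes such a route. A minimal sketch of a matching endpoint — assuming a Flask app behind the `app:app` gunicorn target, which this diff does not confirm:

```python
# Hypothetical /health route matching the probe in run.sh; app.py itself is
# not shown in this commit, and Flask is an assumption based on `app:app`.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # Keep this handler cheap: run.sh polls it every READY_INTERVAL seconds
    # during startup, and platform health checks may hit it continuously.
    return jsonify({"status": "ok"}), 200
```

Keeping the readiness check on a lightweight GET route, and reserving the heavier POST to `/chat` for the one-shot pre-warm, is what lets the script cut the pre-warm `--max-time` from 180s to 30s.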