# Monitoring Memory Usage in Production on Render

This document provides guidance on monitoring memory usage in production for the RAG application deployed on Render's free tier, which has a 512MB memory limit.

## Integrated Memory Monitoring Tools

The application includes enhanced memory monitoring specifically optimized for Render deployments:

### 1. Memory Status Endpoint

The application exposes a dedicated endpoint for monitoring memory usage:

```
GET /memory/render-status
```

This endpoint returns detailed information about current memory usage, including:

- Current memory usage in MB
- Peak memory usage since startup
- Memory usage trends (5-minute and 1-hour)
- Current memory status (normal, warning, critical, emergency)
- Actions taken if memory thresholds were exceeded

Example response:

```json
{
  "status": "success",
  "is_render": true,
  "memory_status": {
    "timestamp": "2023-10-25T14:32:15.123456",
    "memory_mb": 342.5,
    "peak_memory_mb": 398.2,
    "context": "api_request",
    "status": "warning",
    "action_taken": "light_cleanup",
    "memory_limit_mb": 512.0
  },
  "memory_trends": {
    "current_mb": 342.5,
    "peak_mb": 398.2,
    "samples_count": 356,
    "trend_5min_mb": 12.5,
    "trend_1hour_mb": -24.3
  },
  "render_limit_mb": 512
}
```

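
A monitoring script usually needs only a few numbers from this payload. The sketch below assumes the response shape shown in the example; `summarize_status` and the keys it returns are hypothetical names chosen for illustration, not part of the application.

```python
import json


def summarize_status(payload: dict) -> dict:
    """Reduce a /memory/render-status response to a few headline figures."""
    status = payload["memory_status"]
    limit_mb = float(payload["render_limit_mb"])
    used_mb = float(status["memory_mb"])
    return {
        "status": status["status"],
        "used_mb": used_mb,
        "headroom_mb": round(limit_mb - used_mb, 1),
        "percent_used": round(100.0 * used_mb / limit_mb, 1),
    }


# Trimmed copy of the example response, kept inline so the sketch runs offline.
example = json.loads("""
{
  "memory_status": {"memory_mb": 342.5, "status": "warning"},
  "render_limit_mb": 512
}
""")

print(summarize_status(example))
```

In a real check, the payload would come from an HTTP GET against the deployed service rather than from an inline string.
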
### 2. Detailed Diagnostics

For more detailed memory diagnostics, use:

```
GET /memory/diagnostics
```

This provides a deeper look at memory allocation and usage patterns.

### 3. Force Memory Cleanup

If you notice memory usage approaching critical levels, you can trigger a manual cleanup:

```
POST /memory/force-clean
```

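
As a rough illustration, the cleanup call can be prepared with Python's standard library. This sketch assumes the endpoint accepts an empty POST body and needs no authentication, neither of which is stated here; `build_cleanup_request` is a hypothetical helper and `https://example.com` stands in for the service URL.

```python
import urllib.request


def build_cleanup_request(base_url: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the force-clean endpoint."""
    url = base_url.rstrip("/") + "/memory/force-clean"
    # An empty body is an assumption; the endpoint's expected payload is not documented here.
    return urllib.request.Request(url, data=b"", method="POST")


request = build_cleanup_request("https://example.com/")
print(request.get_method(), request.full_url)

# To actually trigger the cleanup, send the prepared request:
# urllib.request.urlopen(request, timeout=30)
```
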
## Setting Up External Monitoring

### Using Uptime Robot or Similar Services

1. Set up a monitor to check the `/health` endpoint every 5 minutes
2. Set up a separate monitor to check the `/memory/render-status` endpoint every 15 minutes

### Automated Alerting

Configure alerts based on memory thresholds:

1. **Warning Alert**: When memory usage exceeds 400MB (78% of the 512MB limit)
2. **Critical Alert**: When memory usage exceeds 450MB (88% of the 512MB limit)

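
The alert rules above reduce to two comparisons. The following is a minimal sketch using the thresholds from this section; `alert_level` and `percent_of_limit` are hypothetical helpers, and "exceeds" is read as strictly greater than.

```python
from typing import Optional

WARNING_MB = 400.0
CRITICAL_MB = 450.0
LIMIT_MB = 512.0


def alert_level(memory_mb: float) -> Optional[str]:
    """Return the alert to raise for a memory reading, or None if no alert applies."""
    if memory_mb > CRITICAL_MB:
        return "critical"
    if memory_mb > WARNING_MB:
        return "warning"
    return None


def percent_of_limit(memory_mb: float) -> int:
    """Express a reading as a whole-number percentage of the 512MB limit."""
    return round(100.0 * memory_mb / LIMIT_MB)


for reading in (342.5, 420.0, 460.0):
    print(reading, alert_level(reading), f"{percent_of_limit(reading)}%")
```
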
### Monitoring Logs in Render Dashboard

1. Log into your Render dashboard
2. Navigate to the service logs
3. Filter for memory-related log messages:
   - `[MEMORY CHECKPOINT]`
   - `[MEMORY MILESTONE]`
   - `Memory usage`
   - `WARNING: Memory usage`
   - `CRITICAL: Memory usage`

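
If the logs are downloaded or streamed elsewhere, the same filter can be applied offline. This is a sketch only: `memory_lines` is a hypothetical helper and the sample log lines are invented for illustration, since the application's exact log format is not shown here.

```python
from typing import Iterable, List

# "Memory usage" also matches the WARNING and CRITICAL variants listed above.
MEMORY_MARKERS = ("[MEMORY CHECKPOINT]", "[MEMORY MILESTONE]", "Memory usage")


def memory_lines(lines: Iterable[str]) -> List[str]:
    """Keep only log lines that contain one of the memory markers."""
    return [line for line in lines if any(marker in line for marker in MEMORY_MARKERS)]


# Invented sample lines; real log output will look different.
sample_log = [
    "[MEMORY CHECKPOINT] after_model_load",
    "GET /health 200",
    "WARNING: Memory usage is high",
    "[MEMORY MILESTONE] ingestion_done",
]

for line in memory_lines(sample_log):
    print(line)
```
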
## Memory Usage Patterns to Watch For

### Warning Signs

1. **Steadily Increasing Memory**: Memory trends show continuous growth over both the 5-minute and 1-hour windows
2. **High Peak After Ingestion**: Memory spikes above 450MB after document ingestion
3. **Failure to Release Memory**: Memory does not decrease after operations complete

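
The first warning sign can be checked mechanically from the `memory_trends` block of the status response. The sketch below assumes `trend_5min_mb` and `trend_1hour_mb` report the change in MB over each window, which this document does not spell out; `looks_like_steady_growth` and its 5MB default are hypothetical.

```python
def looks_like_steady_growth(trends: dict, min_growth_mb: float = 5.0) -> bool:
    """Flag growth only when both the short and long windows are rising."""
    return (
        trends["trend_5min_mb"] > min_growth_mb
        and trends["trend_1hour_mb"] > min_growth_mb
    )


# Values from the example response: a short-term rise, but a net drop over the hour.
example_trends = {"trend_5min_mb": 12.5, "trend_1hour_mb": -24.3}
print(looks_like_steady_growth(example_trends))
```

Requiring both windows to rise avoids flagging a brief spike that the hourly trend shows has already been released.
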
### Preventative Actions

1. **Regular Cleanup**: Call `/memory/force-clean` on a schedule during low-traffic periods
2. **Batch Processing**: For large document sets, ingest in smaller batches
3. **Monitoring Before Bulk Operations**: Check memory status before starting resource-intensive operations

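
Batching and pre-checking combine naturally: check memory before each batch and stop early if headroom runs out. This is a sketch under stated assumptions, not the application's ingestion code; `ingest_in_batches`, its callbacks, the batch size of 10, and the 400MB cut-off are all hypothetical.

```python
from typing import Callable, Iterator, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def ingest_in_batches(
    docs: Sequence[str],
    ingest: Callable[[Sequence[str]], object],
    current_memory_mb: Callable[[], float],
    batch_size: int = 10,
    max_start_mb: float = 400.0,
) -> int:
    """Ingest batch by batch, stopping before any batch that would start above the cut-off."""
    done = 0
    for batch in batched(docs, batch_size):
        if current_memory_mb() >= max_start_mb:
            break  # Leave the rest for later, e.g. after a cleanup.
        ingest(batch)
        done += len(batch)
    return done


# Simulated memory readings stand in for calls to the status endpoint.
readings = iter([300.0, 350.0, 410.0])
processed: list = []
count = ingest_in_batches(
    [f"doc{i}" for i in range(25)], processed.extend, lambda: next(readings)
)
print(count)  # prints 20
```
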
## Memory Optimization Features

The application includes several memory optimization features:

1. **Automatic Thresholds**: Memory is monitored against configured thresholds (400MB, 450MB, 480MB)
2. **Progressive Cleanup**: Different levels of cleanup are applied based on severity
3. **Request Circuit Breaker**: New requests are rejected when memory is critically high
4. **Memory Metrics Export**: Memory metrics are saved to `/tmp/render_metrics/` for later analysis

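
One plausible reading of the three thresholds is that they separate the four statuses reported by `/memory/render-status`. The sketch below encodes that assumption; the mapping, the choice of `emergency` as the circuit-breaker trigger, and the function names are hypothetical. The application's real logic may weigh other inputs, since the example response reports `warning` at 342.5MB.

```python
from bisect import bisect_left

THRESHOLDS_MB = (400.0, 450.0, 480.0)
STATUSES = ("normal", "warning", "critical", "emergency")


def memory_status(memory_mb: float) -> str:
    """Map a reading to a status; a threshold counts only once it is exceeded."""
    return STATUSES[bisect_left(THRESHOLDS_MB, memory_mb)]


def should_reject_requests(memory_mb: float) -> bool:
    """Assumed circuit-breaker rule: refuse new work only in the top tier."""
    return memory_status(memory_mb) == "emergency"


for reading in (350.0, 420.0, 460.0, 490.0):
    print(reading, memory_status(reading), should_reject_requests(reading))
```
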
## Troubleshooting Memory Issues

If you encounter persistent memory issues:

1. **Review Logs**: Check Render logs for memory checkpoints and milestones
2. **Analyze Trends**: Use the `/memory/render-status` endpoint to identify patterns
3. **Check Operations Timing**: Check whether high memory correlates with specific operations, such as document ingestion
4. **Adjust Configuration**: Consider adjusting `EMBEDDING_BATCH_SIZE` or other parameters in `config.py`

## Available Environment Variables

These environment variables can be configured in Render:

- `MEMORY_DEBUG=1`: Enable detailed memory diagnostics
- `MEMORY_LOG_INTERVAL=10`: Log memory usage every 10 seconds
- `ENABLE_TRACEMALLOC=1`: Enable tracemalloc for detailed memory allocation tracking
- `RENDER=1`: Enable Render-specific optimizations (automatically set on Render)

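
For orientation, here is how a Python service might read these flags at startup. This is a sketch, not the application's actual configuration code: `read_memory_settings`, the returned keys, and the default interval of 10 seconds are assumptions.

```python
import os
import tracemalloc
from typing import Mapping


def read_memory_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Parse the memory-related environment variables into typed settings."""
    return {
        "debug": env.get("MEMORY_DEBUG") == "1",
        # Default of 10 seconds is an assumption for when the variable is unset.
        "log_interval_s": int(env.get("MEMORY_LOG_INTERVAL", "10")),
        "tracemalloc": env.get("ENABLE_TRACEMALLOC") == "1",
        # Any non-empty value counts, since the exact value set by the platform may vary.
        "on_render": bool(env.get("RENDER")),
    }


settings = read_memory_settings({"MEMORY_DEBUG": "1", "MEMORY_LOG_INTERVAL": "30"})
print(settings)

# tracemalloc adds overhead of its own, so it is started only on request.
if settings["tracemalloc"] and not tracemalloc.is_tracing():
    tracemalloc.start()
```
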