Monitoring Memory Usage in Production on Render
This document provides guidance on monitoring memory usage in production for the RAG application deployed on Render's free tier, which has a 512MB memory limit.
Integrated Memory Monitoring Tools
The application includes enhanced memory monitoring specifically optimized for Render deployments:
1. Memory Status Endpoint
The application exposes a dedicated endpoint for monitoring memory usage:
GET /memory/render-status
This endpoint returns detailed information about current memory usage, including:
- Current memory usage in MB
- Peak memory usage since startup
- Memory usage trends (5-minute and 1-hour)
- Current memory status (normal, warning, critical, emergency)
- Actions taken if memory thresholds were exceeded
Example response:
{
"status": "success",
"is_render": true,
"memory_status": {
"timestamp": "2023-10-25T14:32:15.123456",
"memory_mb": 342.5,
"peak_memory_mb": 398.2,
"context": "api_request",
"status": "warning",
"action_taken": "light_cleanup",
"memory_limit_mb": 512.0
},
"memory_trends": {
"current_mb": 342.5,
"peak_mb": 398.2,
"samples_count": 356,
"trend_5min_mb": 12.5,
"trend_1hour_mb": -24.3
},
"render_limit_mb": 512
}
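As a rough illustration of how this response might be consumed, the sketch below fetches the endpoint and reduces the payload to a few numbers worth watching. The base URL argument and the helper names (fetch_memory_status, summarize_memory_status) are assumptions made for this example and are not part of the application; the field names are taken from the example response above.

```python
import json
from urllib.request import urlopen


def fetch_memory_status(base_url: str, timeout: float = 10.0) -> dict:
    """GET /memory/render-status and decode the JSON body.

    base_url is a placeholder for your own service URL.
    """
    url = f"{base_url.rstrip('/')}/memory/render-status"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_memory_status(payload: dict) -> dict:
    """Reduce the payload to status, usage, headroom and percent of limit."""
    status = payload["memory_status"]
    used = status["memory_mb"]
    limit = status["memory_limit_mb"]
    return {
        "status": status["status"],
        "used_mb": used,
        "headroom_mb": round(limit - used, 1),
        "percent_of_limit": round(100 * used / limit, 1),
    }


# Values copied from the example response above (trimmed to the fields used).
sample = {
    "memory_status": {
        "memory_mb": 342.5,
        "status": "warning",
        "memory_limit_mb": 512.0,
    }
}
print(summarize_memory_status(sample))
```

For the example values, this reports 169.5 MB of headroom, or about 66.9% of the 512MB limit in use.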
2. Detailed Diagnostics
For more detailed memory diagnostics, use:
GET /memory/diagnostics
This provides a deeper look at memory allocation and usage patterns.
3. Force Memory Cleanup
If you notice memory usage approaching critical levels, you can trigger a manual cleanup:
POST /memory/force-clean
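A minimal sketch of calling this endpoint from a script, using only the Python standard library. The base URL is a placeholder, the helper names are made up for this example, and the sketch assumes the endpoint needs no request body or authentication, which this document does not specify.

```python
from urllib.request import Request, urlopen


def build_force_clean_request(base_url: str) -> Request:
    """Build a POST request for /memory/force-clean (empty body assumed)."""
    url = f"{base_url.rstrip('/')}/memory/force-clean"
    return Request(url, data=b"", method="POST")


def force_clean(base_url: str, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urlopen(build_force_clean_request(base_url), timeout=timeout) as resp:
        return resp.status
```

Calling force_clean("https://your-service.example") would send the request; replace the URL with your own Render service address.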
Setting Up External Monitoring
Using Uptime Robot or Similar Services
- Set up a monitor to check the /health endpoint every 5 minutes
- Set up a separate monitor to check the /memory/render-status endpoint every 15 minutes
Automated Alerting
Configure alerts based on memory thresholds:
- Warning Alert: When memory usage exceeds 400MB (78% of limit)
- Critical Alert: When memory usage exceeds 450MB (88% of limit)
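The two thresholds above can be expressed as a small classification function, for example inside a script that an external monitor runs. This is only a sketch: the function name and return values are illustrative, and "exceeds" is read as strictly greater than.

```python
from typing import Optional

WARNING_MB = 400.0   # 400 / 512 is about 78% of the limit
CRITICAL_MB = 450.0  # 450 / 512 is about 88% of the limit
LIMIT_MB = 512.0


def alert_level(memory_mb: float) -> Optional[str]:
    """Return 'critical', 'warning', or None for a memory reading in MB."""
    if memory_mb > CRITICAL_MB:
        return "critical"
    if memory_mb > WARNING_MB:
        return "warning"
    return None


def percent_of_limit(memory_mb: float) -> int:
    """Express a reading as a whole-number percentage of the 512MB limit."""
    return round(100 * memory_mb / LIMIT_MB)
```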
Monitoring Logs in Render Dashboard
- Log into your Render dashboard
- Navigate to the service logs
- Filter for memory-related log messages:
  - [MEMORY CHECKPOINT]
  - [MEMORY MILESTONE]
  - Memory usage
  - WARNING: Memory usage
  - CRITICAL: Memory usage
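If you download or stream the logs, the same filter can be applied offline. The sketch below matches lines on the markers listed above; the exact log line layout is not documented here, so the sample lines in the usage example are invented purely to exercise the filter.

```python
from typing import Iterable, List

# Substrings taken from the filter list above. "Memory usage" also matches
# the WARNING and CRITICAL variants.
MEMORY_MARKERS = ("[MEMORY CHECKPOINT]", "[MEMORY MILESTONE]", "Memory usage")


def memory_lines(lines: Iterable[str]) -> List[str]:
    """Keep only the lines that contain one of the memory markers."""
    return [line for line in lines if any(m in line for m in MEMORY_MARKERS)]


def severity(line: str) -> str:
    """Classify a memory log line by its prefix marker."""
    if "CRITICAL: Memory usage" in line:
        return "critical"
    if "WARNING: Memory usage" in line:
        return "warning"
    return "info"


# Invented sample lines, for illustration only.
sample_log = [
    "[MEMORY CHECKPOINT] after startup",
    "GET /health 200",
    "WARNING: Memory usage high",
    "CRITICAL: Memory usage very high",
]
for line in memory_lines(sample_log):
    print(severity(line), "|", line)
```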
Memory Usage Patterns to Watch For
Warning Signs
- Steadily Increasing Memory: If memory trends show continuous growth
- High Peak After Ingestion: Memory spikes above 450MB after document ingestion
- Failure to Release Memory: Memory doesn't decrease after operations complete
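One way to check for the first warning sign is to look at the two trend fields returned by /memory/render-status. The sketch below assumes that trend_5min_mb and trend_1hour_mb are net changes in MB over each window (positive meaning growth); this document does not define them precisely, so treat that reading as an assumption. The function name is illustrative.

```python
def growth_flags(trends: dict) -> dict:
    """Flag short-term, long-term and sustained growth from the trend fields."""
    short = trends["trend_5min_mb"]
    long_term = trends["trend_1hour_mb"]
    return {
        "rising_5min": short > 0,
        "rising_1hour": long_term > 0,
        # Growth in both windows suggests memory is not being released.
        "sustained_growth": short > 0 and long_term > 0,
    }


# Trend values from the example response earlier in this document.
example_trends = {"trend_5min_mb": 12.5, "trend_1hour_mb": -24.3}
flags = growth_flags(example_trends)
```

With the example values, memory rose over the last 5 minutes but fell over the last hour, so sustained_growth is False.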
Preventative Actions
- Regular Cleanup: Schedule calls to /memory/force-clean during low-traffic periods
- Batch Processing: For large document sets, ingest in smaller batches
- Monitoring Before Bulk Operations: Check memory status before starting resource-intensive operations
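The batch-processing and check-before-bulk-work advice can be combined in a client-side loop like the one sketched below. Everything here is hypothetical scaffolding: ingest stands for whatever call submits documents to the application, current_memory_mb for a function that reads memory_mb from /memory/render-status, and the 400MB ceiling simply reuses the warning threshold.

```python
from typing import Callable, Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def ingest_in_batches(
    docs: Sequence[T],
    batch_size: int,
    ingest: Callable[[Sequence[T]], None],
    current_memory_mb: Callable[[], float],
    ceiling_mb: float = 400.0,
) -> int:
    """Ingest docs batch by batch; stop early if memory is above ceiling_mb.

    Returns the number of documents ingested before stopping.
    """
    done = 0
    for batch in batched(docs, batch_size):
        # Check memory before each batch rather than once up front.
        if current_memory_mb() > ceiling_mb:
            break
        ingest(batch)
        done += len(batch)
    return done
```

Stopping early instead of retrying leaves the caller free to trigger a cleanup and resume from the returned count.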
Memory Optimization Features
The application includes several memory optimization features:
- Automatic Thresholds: Memory is monitored against configured thresholds (400MB, 450MB, 480MB)
- Progressive Cleanup: Different levels of cleanup based on severity
- Request Circuit Breaker: Rejects new requests when memory is critically high
- Memory Metrics Export: Memory metrics are saved to /tmp/render_metrics/ for later analysis
Troubleshooting Memory Issues
If you encounter persistent memory issues:
- Review Logs: Check Render logs for memory checkpoints and milestones
- Analyze Trends: Use the /memory/render-status endpoint to identify patterns
- Check Operations Timing: High memory could correlate with specific operations
- Adjust Configuration: Consider adjusting EMBEDDING_BATCH_SIZE or other parameters in config.py
Available Environment Variables
These environment variables can be configured in Render:
- MEMORY_DEBUG=1: Enable detailed memory diagnostics
- MEMORY_LOG_INTERVAL=10: Log memory usage every 10 seconds
- ENABLE_TRACEMALLOC=1: Enable tracemalloc for detailed memory allocation tracking
- RENDER=1: Enable Render-specific optimizations (automatically set on Render)
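For reference, the sketch below shows one way such variables could be read in Python. It is not the application's actual parsing code: the function name, the returned keys, and the choice to treat only the value "1" as enabled are assumptions made for this example.

```python
import os
from typing import Mapping, Optional


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Treat a variable as enabled only when it is set to '1' (assumption)."""
    return env.get(name, "0") == "1"


def read_memory_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the memory-related settings from an environment mapping."""
    raw_interval: Optional[str] = env.get("MEMORY_LOG_INTERVAL")
    return {
        "memory_debug": _flag(env, "MEMORY_DEBUG"),
        # Seconds between memory log lines; None when the variable is unset.
        "log_interval_seconds": int(raw_interval) if raw_interval else None,
        "enable_tracemalloc": _flag(env, "ENABLE_TRACEMALLOC"),
        "render": _flag(env, "RENDER"),
    }


print(read_memory_settings({"MEMORY_DEBUG": "1", "MEMORY_LOG_INTERVAL": "10"}))
```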