# Monitoring Memory Usage in Production on Render
This document explains how to monitor production memory usage for the RAG application deployed on Render's free tier, which has a 512MB memory limit.
## Integrated Memory Monitoring Tools
The application includes memory monitoring tailored to Render deployments:
### 1. Memory Status Endpoint
The application exposes a dedicated endpoint for monitoring memory usage:
```
GET /memory/render-status
```
This endpoint returns detailed information about current memory usage, including:
- Current memory usage in MB
- Peak memory usage since startup
- Memory usage trends (5-minute and 1-hour)
- Current memory status (normal, warning, critical, emergency)
- Actions taken if memory thresholds were exceeded
Example response:
```json
{
  "status": "success",
  "is_render": true,
  "memory_status": {
    "timestamp": "2023-10-25T14:32:15.123456",
    "memory_mb": 342.5,
    "peak_memory_mb": 398.2,
    "context": "api_request",
    "status": "warning",
    "action_taken": "light_cleanup",
    "memory_limit_mb": 512.0
  },
  "memory_trends": {
    "current_mb": 342.5,
    "peak_mb": 398.2,
    "samples_count": 356,
    "trend_5min_mb": 12.5,
    "trend_1hour_mb": -24.3
  },
  "render_limit_mb": 512
}
```
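A monitoring script can turn this response into a one-line summary. The following is a minimal sketch using only the Python standard library. The helper names `fetch_render_status` and `summarize_status` are hypothetical, and the code assumes the response always has the shape shown in the example and that the trend fields express a change in MB over the window.

```python
import json
import urllib.request


def fetch_render_status(base_url: str, timeout: float = 10.0) -> dict:
    """GET /memory/render-status and decode the JSON body."""
    url = base_url.rstrip("/") + "/memory/render-status"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_status(payload: dict) -> str:
    """Build a one-line summary from the fields shown in the example response."""
    status = payload["memory_status"]
    trends = payload["memory_trends"]
    used = status["memory_mb"]
    limit = status["memory_limit_mb"]
    return (
        f"{status['status']}: {used:.1f}/{limit:.0f} MB "
        f"({used / limit:.0%}), peak {status['peak_memory_mb']:.1f} MB, "
        f"5-min trend {trends['trend_5min_mb']:+.1f} MB"
    )
```

For the example response above, `summarize_status` returns `warning: 342.5/512 MB (67%), peak 398.2 MB, 5-min trend +12.5 MB`.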
### 2. Detailed Diagnostics
For more detailed memory diagnostics, use:
```
GET /memory/diagnostics
```
This provides a deeper look at memory allocation and usage patterns.
### 3. Force Memory Cleanup
If you notice memory usage approaching critical levels, you can trigger a manual cleanup:
```
POST /memory/force-clean
```
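The cleanup can also be triggered from a script. This sketch assumes the endpoint accepts an empty-body POST without authentication, which the application may not allow. `build_force_clean_request` and `force_clean` are hypothetical helper names.

```python
import urllib.request


def build_force_clean_request(base_url: str) -> urllib.request.Request:
    """Prepare a POST to /memory/force-clean (assumed to need no body or auth)."""
    url = base_url.rstrip("/") + "/memory/force-clean"
    # Passing method="POST" explicitly keeps the verb unambiguous.
    return urllib.request.Request(url, data=b"", method="POST")


def force_clean(base_url: str, timeout: float = 30.0) -> int:
    """Send the cleanup request and return the HTTP status code."""
    request = build_force_clean_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes the helper easy to check without touching the live service.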
## Setting Up External Monitoring
### Using Uptime Robot or Similar Services
1. Set up a monitor to check the `/health` endpoint every 5 minutes
2. Set up a separate monitor to check the `/memory/render-status` endpoint every 15 minutes
### Automated Alerting
Configure alerts based on memory thresholds:
1. **Warning Alert**: When memory usage exceeds 400MB (78% of limit)
2. **Critical Alert**: When memory usage exceeds 450MB (88% of limit)
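The percentages follow from the 512MB limit: 400/512 is about 78% and 450/512 is about 88%. As an illustration, an external monitoring script could apply the two alert thresholds like this. The function names are hypothetical, and "exceeds" is read as strictly greater than.

```python
from typing import Optional

RENDER_LIMIT_MB = 512
WARNING_MB = 400   # about 78% of the limit
CRITICAL_MB = 450  # about 88% of the limit


def percent_of_limit(memory_mb: float, limit_mb: float = RENDER_LIMIT_MB) -> float:
    """Express a memory reading as a percentage of the Render limit."""
    return 100.0 * memory_mb / limit_mb


def alert_level(memory_mb: float) -> Optional[str]:
    """Return the alert to raise for a reading, or None if no alert applies."""
    if memory_mb > CRITICAL_MB:
        return "critical"
    if memory_mb > WARNING_MB:
        return "warning"
    return None
```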
### Monitoring Logs in Render Dashboard
1. Log into your Render dashboard
2. Navigate to the service logs
3. Filter for memory-related log messages:
   - `[MEMORY CHECKPOINT]`
   - `[MEMORY MILESTONE]`
   - `Memory usage`
   - `WARNING: Memory usage`
   - `CRITICAL: Memory usage`
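If you export or copy logs out of the dashboard, the same filters can be applied offline. The sketch below assumes plain-text log lines that contain the markers listed above verbatim. The exact log format is not documented here, and the helper names are hypothetical.

```python
from typing import Iterable, List

# "Memory usage" also matches the WARNING and CRITICAL variants.
MEMORY_MARKERS = ("[MEMORY CHECKPOINT]", "[MEMORY MILESTONE]", "Memory usage")


def filter_memory_lines(lines: Iterable[str]) -> List[str]:
    """Keep only lines that contain one of the memory-related markers."""
    return [line for line in lines if any(marker in line for marker in MEMORY_MARKERS)]


def severity(line: str) -> str:
    """Classify a memory log line by its prefix marker."""
    if "CRITICAL: Memory usage" in line:
        return "critical"
    if "WARNING: Memory usage" in line:
        return "warning"
    return "info"
```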
## Memory Usage Patterns to Watch For
### Warning Signs
1. **Steadily Increasing Memory**: Memory trends show continuous growth, for example when `trend_5min_mb` and `trend_1hour_mb` both stay positive
2. **High Peak After Ingestion**: Memory spikes above 450MB after document ingestion
3. **Failure to Release Memory**: Memory doesn't decrease after operations complete
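These checks can be approximated from the `memory_trends` block of the status response. The sketch below is one possible reading of the warning signs, not the application's own logic. `warning_signs`, `failed_to_release`, and the 25MB tolerance are hypothetical choices.

```python
from typing import List


def warning_signs(memory_trends: dict, spike_mb: float = 450.0) -> List[str]:
    """Flag steady growth and high peaks from a memory_trends dict."""
    signs = []
    # Growth over both windows suggests a leak rather than a short burst.
    if memory_trends["trend_5min_mb"] > 0 and memory_trends["trend_1hour_mb"] > 0:
        signs.append("steady_growth")
    if memory_trends["peak_mb"] > spike_mb:
        signs.append("high_peak")
    return signs


def failed_to_release(before_mb: float, after_mb: float, tolerance_mb: float = 25.0) -> bool:
    """True if memory after an operation stays well above the earlier baseline."""
    return after_mb - before_mb > tolerance_mb
```

To use `failed_to_release`, record `memory_mb` before an operation and again a few minutes after it completes.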
### Preventative Actions
1. **Regular Cleanup**: Call `/memory/force-clean` on a schedule during low-traffic periods
2. **Batch Processing**: For large document sets, ingest in smaller batches
3. **Monitoring Before Bulk Operations**: Check memory status before starting resource-intensive operations
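Batch ingestion and the pre-operation memory check can be combined in one loop. This is a sketch under stated assumptions: `ingest_batch` and `read_memory_mb` are hypothetical callables you would supply (for example, a wrapper around the ingestion API and a call to the status endpoint), and the 400MB ceiling is borrowed from the warning threshold.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[List]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def ingest_in_batches(
    documents: Sequence,
    ingest_batch: Callable[[List], None],
    read_memory_mb: Callable[[], float],
    batch_size: int = 8,
    max_memory_mb: float = 400.0,
) -> int:
    """Ingest batch by batch, stopping early if memory passes max_memory_mb.

    Returns the number of documents ingested.
    """
    ingested = 0
    for batch in batched(documents, batch_size):
        # Check memory before each batch, not only before the whole run.
        if read_memory_mb() > max_memory_mb:
            break
        ingest_batch(batch)
        ingested += len(batch)
    return ingested
```

Stopping early leaves the remaining documents for a later run, after memory has been released or a cleanup has been triggered.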
## Memory Optimization Features
The application includes several memory optimization features:
1. **Automatic Thresholds**: Memory is monitored against configured thresholds (400MB, 450MB, 480MB)
2. **Progressive Cleanup**: Different levels of cleanup based on severity
3. **Request Circuit Breaker**: Rejects new requests when memory is critically high
4. **Memory Metrics Export**: Memory metrics are saved to `/tmp/render_metrics/` for later analysis
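To make the threshold and circuit breaker ideas concrete, here is a minimal sketch. It assumes the three thresholds pair with the `warning`, `critical`, and `emergency` statuses in that order, that a reading equal to a threshold counts as reaching it, and that requests are rejected from `critical` upward. None of this is confirmed by the application, whose rules may differ (the example response earlier reports `warning` at 342.5MB). The function names are hypothetical.

```python
from bisect import bisect_right

THRESHOLDS_MB = (400, 450, 480)
STATUSES = ("normal", "warning", "critical", "emergency")


def memory_status(memory_mb: float) -> str:
    """Map a reading to a status by counting how many thresholds it has reached."""
    return STATUSES[bisect_right(THRESHOLDS_MB, memory_mb)]


def accept_request(memory_mb: float, reject_from: str = "critical") -> bool:
    """Circuit breaker: refuse new work once the status reaches reject_from."""
    return STATUSES.index(memory_status(memory_mb)) < STATUSES.index(reject_from)
```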
## Troubleshooting Memory Issues
If you encounter persistent memory issues:
1. **Review Logs**: Check Render logs for memory checkpoints and milestones
2. **Analyze Trends**: Use the `/memory/render-status` endpoint to identify patterns
3. **Check Operations Timing**: Check whether high memory readings coincide with specific operations, such as document ingestion
4. **Adjust Configuration**: Consider adjusting `EMBEDDING_BATCH_SIZE` or other parameters in `config.py`
## Available Environment Variables
These environment variables can be configured in Render:
- `MEMORY_DEBUG=1`: Enable detailed memory diagnostics
- `MEMORY_LOG_INTERVAL=10`: Log memory usage every 10 seconds
- `ENABLE_TRACEMALLOC=1`: Enable tracemalloc for detailed memory allocation tracking
- `RENDER=1`: Enable Render-specific optimizations (automatically set on Render)
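For reference, a process could read these variables as sketched below. This is not the application's actual parsing code. `MemorySettings` and both helper functions are hypothetical, an unset interval is left as `None` rather than guessing a default, and flags accept `true` as well as `1` on the assumption that the platform may set `RENDER` to `true`.

```python
import os
import tracemalloc
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class MemorySettings:
    memory_debug: bool
    log_interval_seconds: Optional[int]
    enable_tracemalloc: bool
    on_render: bool


def load_memory_settings(env: Mapping[str, str] = os.environ) -> MemorySettings:
    """Read the memory-related variables from an environment mapping."""
    def flag(name: str) -> bool:
        # Assumption: "1" and "true" (any case) both count as enabled.
        return env.get(name, "").strip().lower() in {"1", "true"}

    interval = env.get("MEMORY_LOG_INTERVAL")
    return MemorySettings(
        memory_debug=flag("MEMORY_DEBUG"),
        log_interval_seconds=int(interval) if interval else None,
        enable_tracemalloc=flag("ENABLE_TRACEMALLOC"),
        on_render=flag("RENDER"),
    )


def start_tracemalloc_if_enabled(settings: MemorySettings) -> bool:
    """Start tracemalloc when requested; return whether tracing is active."""
    if settings.enable_tracemalloc and not tracemalloc.is_tracing():
        tracemalloc.start()
    return tracemalloc.is_tracing()
```

Passing the environment as a mapping keeps the parser easy to exercise with a plain dictionary.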