# Monitoring Memory Usage in Production on Render

This document provides guidance on monitoring memory usage in production for the RAG application deployed on Render's free tier, which has a 512MB memory limit.

## Integrated Memory Monitoring Tools

The application includes enhanced memory monitoring specifically optimized for Render deployments:

### 1. Memory Status Endpoint

The application exposes a dedicated endpoint for monitoring memory usage:

```
GET /memory/render-status
```

This endpoint returns detailed information about current memory usage, including:

- Current memory usage in MB
- Peak memory usage since startup
- Memory usage trends (5-minute and 1-hour)
- Current memory status (normal, warning, critical, emergency)
- Actions taken if memory thresholds were exceeded

Example response:

```json
{
  "status": "success",
  "is_render": true,
  "memory_status": {
    "timestamp": "2023-10-25T14:32:15.123456",
    "memory_mb": 342.5,
    "peak_memory_mb": 398.2,
    "context": "api_request",
    "status": "warning",
    "action_taken": "light_cleanup",
    "memory_limit_mb": 512.0
  },
  "memory_trends": {
    "current_mb": 342.5,
    "peak_mb": 398.2,
    "samples_count": 356,
    "trend_5min_mb": 12.5,
    "trend_1hour_mb": -24.3
  },
  "render_limit_mb": 512
}
```
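As a rough illustration of how a script might consume this response, the sketch below fetches the endpoint and reduces the payload to a few headline numbers. The field names come from the example response above; the `base_url` argument and the `summarize` helper are hypothetical and not part of the application.

```python
import json
import urllib.request


def fetch_memory_status(base_url: str, timeout: float = 10.0) -> dict:
    """GET /memory/render-status and return the decoded JSON body."""
    url = f"{base_url.rstrip('/')}/memory/render-status"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize(payload: dict) -> dict:
    """Reduce the response to the fields most useful for a dashboard or alert."""
    status = payload["memory_status"]
    trends = payload["memory_trends"]
    limit_mb = status["memory_limit_mb"]
    used_mb = status["memory_mb"]
    return {
        "status": status["status"],
        "used_mb": used_mb,
        "headroom_mb": round(limit_mb - used_mb, 1),
        "percent_of_limit": round(100 * used_mb / limit_mb, 1),
        # A positive 5-minute trend means usage is currently growing.
        "growing_5min": trends["trend_5min_mb"] > 0,
    }
```

For the example response, `summarize` reports 169.5 MB of headroom, which is about 66.9% of the limit in use.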

### 2. Detailed Diagnostics

For more detailed memory diagnostics, use:

```
GET /memory/diagnostics
```

This provides a deeper look at memory allocation and usage patterns.

### 3. Force Memory Cleanup

If you notice memory usage approaching critical levels, you can trigger a manual cleanup:

```
POST /memory/force-clean
```
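A minimal sketch of triggering the cleanup from a script, using only the standard library. It assumes the endpoint accepts an empty POST body and needs no authentication; neither is confirmed by this document, so adjust to match the actual deployment. The helper names are hypothetical.

```python
import urllib.request


def build_force_clean_request(base_url: str) -> urllib.request.Request:
    """Build the POST request for /memory/force-clean (empty body is an assumption)."""
    url = f"{base_url.rstrip('/')}/memory/force-clean"
    return urllib.request.Request(url, data=b"", method="POST")


def force_clean(base_url: str, timeout: float = 30.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(build_force_clean_request(base_url), timeout=timeout) as resp:
        return resp.status
```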

## Setting Up External Monitoring

### Using Uptime Robot or Similar Services

1. Set up a monitor to check the `/health` endpoint every 5 minutes
2. Set up a separate monitor to check the `/memory/render-status` endpoint every 15 minutes

### Automated Alerting

Configure alerts based on memory thresholds:

1. **Warning Alert**: When memory usage exceeds 400MB (78% of limit)
2. **Critical Alert**: When memory usage exceeds 450MB (88% of limit)
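The thresholds and percentages above can be checked with a few lines of arithmetic. The sketch below is one way an external alerting script might classify a reading; the function names and the `"ok"` label are illustrative, not part of the application.

```python
RENDER_LIMIT_MB = 512
WARNING_MB = 400
CRITICAL_MB = 450


def percent_of_limit(memory_mb: float) -> int:
    """Memory usage as a whole-number percentage of the Render limit."""
    return round(100 * memory_mb / RENDER_LIMIT_MB)


def alert_level(memory_mb: float) -> str:
    """Classify a reading; an alert fires only when usage exceeds a threshold."""
    if memory_mb > CRITICAL_MB:
        return "critical"
    if memory_mb > WARNING_MB:
        return "warning"
    return "ok"
```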

### Monitoring Logs in Render Dashboard

1. Log into your Render dashboard
2. Navigate to the service logs
3. Filter for memory-related log messages:
   - `[MEMORY CHECKPOINT]`
   - `[MEMORY MILESTONE]`
   - `Memory usage`
   - `WARNING: Memory usage`
   - `CRITICAL: Memory usage`
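If you export logs and want to apply the same filters offline, a small script along these lines could work. The markers are the ones listed above; the exact format of the surrounding log line is not specified here, so the code only does substring matching.

```python
# "Memory usage" also matches the WARNING and CRITICAL variants listed above.
MEMORY_MARKERS = ("[MEMORY CHECKPOINT]", "[MEMORY MILESTONE]", "Memory usage")


def memory_lines(lines):
    """Keep only lines containing one of the memory-related markers."""
    return [line.rstrip("\n") for line in lines if any(m in line for m in MEMORY_MARKERS)]


def severity(line: str) -> str:
    """Classify a memory log line by its prefix marker."""
    if "CRITICAL: Memory usage" in line:
        return "critical"
    if "WARNING: Memory usage" in line:
        return "warning"
    return "info"
```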

## Memory Usage Patterns to Watch For

### Warning Signs

1. **Steadily Increasing Memory**: Memory trends show continuous growth instead of leveling off
2. **High Peak After Ingestion**: Memory spikes above 450MB after document ingestion
3. **Failure to Release Memory**: Memory doesn't decrease after operations complete
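These three signs can be expressed as simple checks over readings collected from `/memory/render-status`. The sketch below is only illustrative: the 20 MB tolerance and the minimum sample count are assumptions chosen for the example, and the application's own trend calculation may differ.

```python
def is_steadily_increasing(samples_mb, min_samples: int = 3) -> bool:
    """True when every reading is higher than the one before it."""
    if len(samples_mb) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(samples_mb, samples_mb[1:]))


def spiked_after_ingestion(peak_mb: float, threshold_mb: float = 450.0) -> bool:
    """True when the post-ingestion peak is above the 450MB mark."""
    return peak_mb > threshold_mb


def failed_to_release(before_mb: float, after_mb: float, tolerance_mb: float = 20.0) -> bool:
    """True when memory stays well above its pre-operation level (tolerance is assumed)."""
    return after_mb - before_mb > tolerance_mb
```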

### Preventative Actions

1. **Regular Cleanup**: Schedule calls to `/memory/force-clean` during low-traffic periods
2. **Batch Processing**: For large document sets, ingest in smaller batches
3. **Monitoring Before Bulk Operations**: Check memory status before starting resource-intensive operations
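Batch processing and the pre-operation check can be combined in a client-side ingestion loop. In the sketch below, `ingest` and `current_memory_mb` are hypothetical callables you would supply (for example, a function that uploads documents and one that reads `memory_mb` from `/memory/render-status`); the batch size and the 400MB ceiling are example values.

```python
from typing import Callable, Iterator, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[list]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def ingest_in_batches(
    documents: Sequence,
    ingest: Callable[[list], None],
    current_memory_mb: Callable[[], float],
    batch_size: int = 10,
    max_memory_mb: float = 400.0,
) -> int:
    """Ingest batch by batch, stopping early if memory is above the ceiling.

    Returns the number of documents ingested so the caller can resume later.
    """
    ingested = 0
    for batch in batched(documents, batch_size):
        if current_memory_mb() > max_memory_mb:
            break  # Leave the remaining documents for a later run.
        ingest(batch)
        ingested += len(batch)
    return ingested
```

Returning the count rather than raising lets a scheduler pick up from the first document that was not ingested once memory has dropped.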

## Memory Optimization Features

The application includes several memory optimization features:

1. **Automatic Thresholds**: Memory is monitored against configured thresholds (400MB, 450MB, 480MB)
2. **Progressive Cleanup**: Different levels of cleanup based on severity
3. **Request Circuit Breaker**: Rejects new requests when memory is critically high
4. **Memory Metrics Export**: Memory metrics are saved to `/tmp/render_metrics/` for later analysis
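To make the interaction between thresholds, progressive cleanup, and the circuit breaker concrete, here is a simplified sketch. It is not the application's implementation: it assumes the three thresholds map in order to the `warning`, `critical`, and `emergency` statuses, that the circuit breaker trips at the highest threshold, and that cleanup amounts to garbage collection of increasing depth.

```python
import gc

# Assumed mapping of the configured thresholds to status names, highest first.
THRESHOLDS_MB = ((480.0, "emergency"), (450.0, "critical"), (400.0, "warning"))


def memory_status(memory_mb: float) -> str:
    """Return the status for the highest threshold the reading exceeds."""
    for threshold_mb, name in THRESHOLDS_MB:
        if memory_mb > threshold_mb:
            return name
    return "normal"


def should_reject_request(memory_mb: float, reject_above_mb: float = 480.0) -> bool:
    """Circuit breaker: refuse new work above the cutoff (cutoff is an assumption)."""
    return memory_mb > reject_above_mb


def cleanup(status: str) -> int:
    """Progressive cleanup: do more work as severity rises.

    Returns the number of unreachable objects found by the collector.
    """
    if status == "normal":
        return 0
    if status == "warning":
        return gc.collect(generation=0)  # Light pass over the youngest generation.
    return gc.collect()  # Full collection for critical and emergency.
```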

## Troubleshooting Memory Issues

If you encounter persistent memory issues:

1. **Review Logs**: Check Render logs for memory checkpoints and milestones
2. **Analyze Trends**: Use the `/memory/render-status` endpoint to identify patterns
3. **Check Operations Timing**: Check whether high memory correlates with specific operations, such as document ingestion
4. **Adjust Configuration**: Consider adjusting `EMBEDDING_BATCH_SIZE` or other parameters in `config.py`

## Available Environment Variables

These environment variables can be configured in Render:

- `MEMORY_DEBUG=1`: Enable detailed memory diagnostics
- `MEMORY_LOG_INTERVAL=10`: Log memory usage every 10 seconds
- `ENABLE_TRACEMALLOC=1`: Enable tracemalloc for detailed memory allocation tracking
- `RENDER=1`: Enable Render-specific optimizations (automatically set on Render)
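As an illustration of how such variables are typically read at startup, the sketch below parses them into a settings dictionary and starts `tracemalloc` on request. The function names, the defaults for unset variables, and the decision to accept `true` as well as `1` are assumptions made for this example; the application's actual parsing may differ.

```python
import os
import tracemalloc


def _flag(env, name: str) -> bool:
    """Treat "1" or "true" (case-insensitive) as enabled; anything else as disabled."""
    return env.get(name, "0").strip().lower() in ("1", "true")


def load_memory_settings(env=None) -> dict:
    """Read the memory-related variables from env (defaults to os.environ)."""
    env = os.environ if env is None else env
    return {
        "memory_debug": _flag(env, "MEMORY_DEBUG"),
        # The default of 10 seconds when unset is an assumption.
        "log_interval_s": int(env.get("MEMORY_LOG_INTERVAL", "10")),
        "enable_tracemalloc": _flag(env, "ENABLE_TRACEMALLOC"),
        "is_render": _flag(env, "RENDER"),
    }


def maybe_start_tracemalloc(settings: dict) -> bool:
    """Start tracemalloc if requested; return whether tracing is active."""
    if settings["enable_tracemalloc"] and not tracemalloc.is_tracing():
        tracemalloc.start()
    return tracemalloc.is_tracing()
```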