
msse-ai-engineering / docs /memory_monitoring.md

Seth McKnight

Add memory diagnostics endpoints and logging enhancements (#80)

0a7f9b4 about 2 months ago


	# Monitoring Memory Usage in Production on Render

	This document provides guidance on monitoring memory usage in production for the RAG application deployed on Render's free tier, which has a 512MB memory limit.

	## Integrated Memory Monitoring Tools

	The application includes enhanced memory monitoring specifically optimized for Render deployments:

	### 1. Memory Status Endpoint

	The application exposes a dedicated endpoint for monitoring memory usage:

	```
	GET /memory/render-status
	```

	This endpoint returns detailed information about current memory usage, including:

	- Current memory usage in MB
	- Peak memory usage since startup
	- Memory usage trends (5-minute and 1-hour)
	- Current memory status (normal, warning, critical, emergency)
	- Actions taken if memory thresholds were exceeded

	Example response:

	```json
	{
	  "status": "success",
	  "is_render": true,
	  "memory_status": {
	    "timestamp": "2023-10-25T14:32:15.123456",
	    "memory_mb": 342.5,
	    "peak_memory_mb": 398.2,
	    "context": "api_request",
	    "status": "warning",
	    "action_taken": "light_cleanup",
	    "memory_limit_mb": 512.0
	  },
	  "memory_trends": {
	    "current_mb": 342.5,
	    "peak_mb": 398.2,
	    "samples_count": 356,
	    "trend_5min_mb": 12.5,
	    "trend_1hour_mb": -24.3
	  },
	  "render_limit_mb": 512
	}
	```
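As a rough illustration of how a client might consume this payload, the sketch below reduces it to a few headline numbers. The helper name `summarize_memory` is hypothetical and not part of the application; only the field names come from the example response above.

```python
def summarize_memory(payload: dict) -> dict:
    """Reduce a /memory/render-status payload to a few headline numbers.

    Hypothetical helper: only the field names are taken from the
    example response in this document.
    """
    status = payload["memory_status"]
    limit_mb = float(status["memory_limit_mb"])
    used_mb = float(status["memory_mb"])
    return {
        "status": status["status"],
        "used_mb": used_mb,
        "headroom_mb": round(limit_mb - used_mb, 1),
        "used_pct": round(100 * used_mb / limit_mb, 1),
    }


# Trimmed copy of the example response.
example = {
    "memory_status": {
        "memory_mb": 342.5,
        "status": "warning",
        "memory_limit_mb": 512.0,
    },
    "render_limit_mb": 512,
}

print(summarize_memory(example))
```

Against a live deployment, the payload could be loaded with `json.load(urllib.request.urlopen(url))`, where `url` is your service's base URL followed by `/memory/render-status`.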

	### 2. Detailed Diagnostics

	For more detailed memory diagnostics, use:

	```
	GET /memory/diagnostics
	```

	This provides a deeper look at memory allocation and usage patterns.

	### 3. Force Memory Cleanup

	If you notice memory usage approaching critical levels, you can trigger a manual cleanup:

	```
	POST /memory/force-clean
	```
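A minimal sketch of issuing that request from Python's standard library follows. The base URL is a placeholder, and whether the endpoint expects a request body or authentication is not stated here, so the sketch assumes an empty, unauthenticated POST.

```python
import urllib.request


def build_force_clean_request(base_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST to /memory/force-clean."""
    url = base_url.rstrip("/") + "/memory/force-clean"
    # Assumption: the endpoint accepts an empty body with no auth headers.
    return urllib.request.Request(url, data=b"", method="POST")


# Placeholder host; substitute your own Render service URL.
request = build_force_clean_request("https://example.onrender.com")
print(request.get_method(), request.full_url)

# To actually send it:
# with urllib.request.urlopen(request, timeout=30) as response:
#     print(response.status)
```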

	## Setting Up External Monitoring

	### Using Uptime Robot or Similar Services

	1. Set up a monitor to check the `/health` endpoint every 5 minutes
	2. Set up a separate monitor to check the `/memory/render-status` endpoint every 15 minutes

	### Automated Alerting

	Configure alerts based on memory thresholds:

	1. Warning Alert: When memory usage exceeds 400MB (78% of limit)
	2. Critical Alert: When memory usage exceeds 450MB (88% of limit)
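The two thresholds above can be expressed as a small check, sketched here for use in a custom alerting script. The function and constant names are illustrative and do not come from the application.

```python
from typing import Optional

RENDER_LIMIT_MB = 512   # free-tier limit stated in this document
WARNING_MB = 400
CRITICAL_MB = 450


def percent_of_limit(memory_mb: float) -> int:
    """Express a memory reading as a whole-number percentage of the limit."""
    return round(100 * memory_mb / RENDER_LIMIT_MB)


def alert_level(memory_mb: float) -> Optional[str]:
    """Return the alert to raise for a reading, or None if no alert applies."""
    if memory_mb > CRITICAL_MB:
        return "critical"
    if memory_mb > WARNING_MB:
        return "warning"
    return None


for reading in (342.5, 420.0, 470.0):
    print(reading, percent_of_limit(reading), alert_level(reading))
```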

	### Monitoring Logs in Render Dashboard

	1. Log into your Render dashboard
	2. Navigate to the service logs
	3. Filter for memory-related log messages:
	   - `[MEMORY CHECKPOINT]`
	   - `[MEMORY MILESTONE]`
	   - `Memory usage`
	   - `WARNING: Memory usage`
	   - `CRITICAL: Memory usage`
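If you download the logs, the same filter can be applied offline. The sketch below matches the markers listed above; the sample log lines are made up for illustration and are not real application output.

```python
# "Memory usage" also covers the WARNING and CRITICAL variants.
MEMORY_MARKERS = ("[MEMORY CHECKPOINT]", "[MEMORY MILESTONE]", "Memory usage")


def memory_lines(lines):
    """Keep only log lines that contain one of the memory markers."""
    return [line for line in lines if any(marker in line for marker in MEMORY_MARKERS)]


# Made-up sample lines; real log formats may differ.
sample_log = [
    "GET /health 200",
    "[MEMORY CHECKPOINT] after_ingestion",
    "WARNING: Memory usage high",
    "POST /chat 200",
]

for line in memory_lines(sample_log):
    print(line)
```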

	## Memory Usage Patterns to Watch For

	### Warning Signs

	1. Steadily Increasing Memory: Memory trends show continuous growth instead of leveling off
	2. High Peak After Ingestion: Memory spikes above 450MB after document ingestion
	3. Failure to Release Memory: Memory doesn't decrease after operations complete
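The first two warning signs can be checked against the `memory_trends` object returned by `/memory/render-status`. This is only a sketch: it assumes the trend fields are changes in MB over each window, with positive values meaning growth, which is not confirmed by the endpoint description. The function name and sign labels are hypothetical.

```python
def warning_signs(trends: dict, peak_limit_mb: float = 450.0) -> list:
    """Flag warning signs from a memory_trends object.

    Assumption: trend_5min_mb and trend_1hour_mb are MB deltas over the
    window, so a positive value means memory grew.
    """
    signs = []
    if trends["trend_5min_mb"] > 0 and trends["trend_1hour_mb"] > 0:
        signs.append("steady_growth")
    if trends["peak_mb"] > peak_limit_mb:
        signs.append("high_peak")
    return signs


# Values from the example response: short-term growth, long-term decline.
example_trends = {"peak_mb": 398.2, "trend_5min_mb": 12.5, "trend_1hour_mb": -24.3}
print(warning_signs(example_trends))
```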

	### Preventative Actions

	1. Regular Cleanup: Schedule calls to `/memory/force-clean` during low-traffic periods
	2. Batch Processing: For large document sets, ingest in smaller batches
	3. Monitoring Before Bulk Operations: Check memory status before starting resource-intensive operations
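Batch processing can be as simple as slicing the document list before ingesting it. The sketch below shows only the slicing step; `batches` is a hypothetical helper, and the actual ingestion call depends on your application code.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


# Hypothetical file names, for illustration only.
documents = ["a.md", "b.md", "c.md", "d.md", "e.md"]
for batch in batches(documents, 2):
    # Check memory status here, then ingest `batch` with your own code.
    print(batch)
```

Smaller batches lower the peak memory of each ingestion step, at the cost of more round trips.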

	## Memory Optimization Features

	The application includes several memory optimization features:

	1. Automatic Thresholds: Memory is monitored against configured thresholds (400MB, 450MB, 480MB)
	2. Progressive Cleanup: Different levels of cleanup based on severity
	3. Request Circuit Breaker: Rejects new requests when memory is critically high
	4. Memory Metrics Export: Memory metrics are saved to `/tmp/render_metrics/` for later analysis

	## Troubleshooting Memory Issues

	If you encounter persistent memory issues:

	1. Review Logs: Check Render logs for memory checkpoints and milestones
	2. Analyze Trends: Use the `/memory/render-status` endpoint to identify patterns
	3. Check Operations Timing: Check whether memory spikes coincide with specific operations, such as document ingestion
	4. Adjust Configuration: Consider adjusting `EMBEDDING_BATCH_SIZE` or other parameters in `config.py`

	## Available Environment Variables

	These environment variables can be configured in Render:

	- `MEMORY_DEBUG=1`: Enable detailed memory diagnostics
	- `MEMORY_LOG_INTERVAL=10`: Log memory usage every 10 seconds
	- `ENABLE_TRACEMALLOC=1`: Enable tracemalloc for detailed memory allocation tracking
	- `RENDER=1`: Enable Render-specific optimizations (automatically set on Render)
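For reference, flags like these are typically read from the process environment at startup. The sketch below shows one way that could look. It is not the application's actual parsing code, and the fallback values used when a variable is unset are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class MemorySettings:
    debug: bool
    log_interval_s: int
    tracemalloc: bool
    on_render: bool


def load_memory_settings(env: Mapping[str, str] = os.environ) -> MemorySettings:
    """Read the memory-related variables listed above.

    Assumptions: a flag is on only when set to "1", and the log
    interval falls back to 10 seconds when unset.
    """
    def flag(name: str) -> bool:
        return env.get(name, "0") == "1"

    return MemorySettings(
        debug=flag("MEMORY_DEBUG"),
        log_interval_s=int(env.get("MEMORY_LOG_INTERVAL", "10")),
        tracemalloc=flag("ENABLE_TRACEMALLOC"),
        on_render=flag("RENDER"),
    )


settings = load_memory_settings({"MEMORY_DEBUG": "1", "MEMORY_LOG_INTERVAL": "30"})
print(settings)
```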