Fix max_tokens calculation to respect model context window (5000 tokens) - Add dynamic max_tokens calculation based on input size - Add novita_model_context_window configuration - Prevents 400 errors when input tokens exceed available output space
Integrate Novita AI as exclusive inference provider - Add Novita AI API integration with DeepSeek-R1-Distill-Qwen-7B model - Remove all local model dependencies - Optimize token allocation for user inputs and context - Add Anaconda environment setup files - Add comprehensive test scripts and documentation