Description: Generates a text response with advanced generation parameters and thinking modes. When TTS is enabled, the response also includes precise audio duration information for timing and synchronization.
Body: JSON or form data with message and optional parameters
curl -X POST "http://localhost:5002/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is consciousness?",
    "max_tokens": 300,
    "temperature": 0.7,
    "thinking_mode": "thinking-full",
    "language": "english",
    "include_audio": true,
    "voice": "en-US-JennyNeural",
    "voice_speed": 1.0,
    "system_instructions": "You are a psychology expert.",
    "context_messages": 50,
    "preserve_history": true
  }'
Response:
{
  "success": true,
  "response": "Consciousness is a fascinating topic...",
  "model": "lucy_128k-Q3_K_M.gguf",
  "audio_url": "/audio",
  "audio_duration_seconds": 31.2,
  "audio_size_kb": 486.7,
  "voice_speed": 1.0
}
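The same request can be made from Python. The following is a minimal client sketch using only the standard library. The URL and field names come from the curl example above; the helper names (build_chat_request, summarize_reply, send_chat) are hypothetical, and the assumption that an unsuccessful call returns "success": false is not confirmed by this page.

```python
import json
import urllib.request

API_URL = "http://localhost:5002/api/chat"  # taken from the curl example


def build_chat_request(message, **options):
    """Build a POST request for /api/chat with a JSON body.

    Extra keyword arguments (e.g. include_audio=True) are passed
    through unchanged as optional parameters.
    """
    payload = {"message": message, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_reply(reply):
    """Extract the text and, when present, the audio timing fields."""
    # Assumption: failures are signalled with "success": false.
    if not reply.get("success"):
        raise RuntimeError("chat request was not successful")
    return {
        "text": reply["response"],
        "audio_url": reply.get("audio_url"),
        "duration": reply.get("audio_duration_seconds"),
    }


def send_chat(message, timeout=60, **options):
    """Send the request and return the summarized reply (needs a running server)."""
    request = build_chat_request(message, **options)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return summarize_reply(json.load(response))
```

With the server running, send_chat("What is consciousness?", include_audio=True, voice="en-US-JennyNeural") would return the reply text together with the audio URL and duration, if the server supplied them.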
⏱️ Audio Duration & Timing Features
Precise Duration Calculation: The server uses the mutagen library to read the exact duration of each generated MP3 file, providing accurate timing information.
Integration Use Cases: Voice command systems, progress indicators, mobile apps, batch processing, and accessibility features.
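The duration calculation and the progress-indicator use case above can be sketched as follows. This is an illustration, not the server's actual code: it reads the duration with mutagen's MP3 class (a third-party package, installed with pip install mutagen), and the function names, the one-decimal rounding, and the kilobyte calculation (bytes / 1024) are assumptions chosen to resemble the response example.

```python
import os


def mutagen_duration(path):
    """Read an MP3 file's duration in seconds using mutagen."""
    # Imported lazily so the other helpers work without mutagen installed.
    from mutagen.mp3 import MP3
    return MP3(path).info.length


def describe_audio(path, read_duration=mutagen_duration):
    """Return timing fields shaped like the response example.

    read_duration can be swapped out (e.g. in tests) for any callable
    that maps a file path to a duration in seconds.
    """
    return {
        "audio_duration_seconds": round(read_duration(path), 1),
        "audio_size_kb": round(os.path.getsize(path) / 1024, 1),
    }


def format_duration(seconds):
    """Format a duration in seconds as M:SS for a progress indicator."""
    minutes, secs = divmod(int(round(seconds)), 60)
    return f"{minutes}:{secs:02d}"
```

A client could pass audio_duration_seconds from the response to format_duration to label a playback bar, without downloading and decoding the audio first.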