🦙 Chat

🦙 TryToSpeak
AI Language Assistant


Model:

Models Available:

Port:

🎛️ Chat Settings ▼

System Instructions:

🤖 Model Parameters ▼

Max Tokens: Maximum response length

Temperature: Response creativity (0=focused, 2=creative)

Thinking Mode: How the model processes responses

💭 Conversation History Settings ▼

Context Messages: Number of past exchanges to include as context

Preserve History Across Sessions: Save conversation history to disk per model

Hands-Free Mode: Automatically open microphone after audio response ends

🖥️ Interface Display Settings ▼

Show Text Input Area: Display the message input box and send controls

Show Response Display Area: Display the model response box

🌙 Night Mode: Switch to dark theme for better viewing in low light

🎵 Audio & Language Settings ▼

Language: Response language and voice selection

Include Audio (TTS): Generate spoken audio responses

Voice Selection: Choose preferred voice for TTS

Current:

Speech Speed: Adjust voice playback speed

🎵 Available Voices for

📚 Available Models ADMIN ONLY

🔽 Download New Model ADMIN ONLY

Repository ID (e.g., huggingface/model-name):

Filename:

HF Token (optional for private repos):

🔌 API Endpoints Documentation

GET /health

Description: Check server health and status

Response: JSON with server status, model info, memory usage, and availability

curl -X GET "http://localhost:5002/health"

{
  "status": "healthy",
  "llamacpp_available": true,
  "current_model": "Qwen3-Zro-Cdr-Reason-V2-0.8B-NEO-EX-D_AU-Q4_K_M-imat.gguf",
  "models_count": 5,
  "memory_status": {
    "total_ram_gb": 16.0,
    "available_ram_gb": 8.5,
    "used_percent": 47.2,
    "safe_context_length": 4096
  }
}
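The same check can be scripted. The sketch below is a minimal Python client using only the standard library; the base URL comes from the curl example, while the function names and the readiness policy (healthy status, llama.cpp available, a model loaded) are assumptions made for illustration, not part of the documented API.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5002"  # host and port taken from the curl example


def fetch_health(base_url: str = BASE_URL, timeout: float = 5.0) -> dict:
    """GET /health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def is_ready(health: dict) -> bool:
    """Assumed policy: ready only if the server reports healthy,
    llama.cpp is available, and a current model is set."""
    return (
        health.get("status") == "healthy"
        and bool(health.get("llamacpp_available"))
        and bool(health.get("current_model"))
    )


def summarize_memory(health: dict) -> str:
    """Format the memory_status block as a one-line summary."""
    mem = health.get("memory_status", {})
    return (
        f"{mem.get('available_ram_gb')} of {mem.get('total_ram_gb')} GB free, "
        f"safe context length {mem.get('safe_context_length')}"
    )
```

Calling is_ready(fetch_health()) before sending chat requests avoids posting to a server that has no model loaded.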

POST /api/chat

Description: Generate a text response with advanced parameters and thinking modes. When TTS is enabled, the response also includes the precise audio duration for timing and synchronization.

Body: JSON or form data with message and optional parameters

curl -X POST "http://localhost:5002/api/chat" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What is consciousness?",
    "max_tokens": 300,
    "temperature": 0.7,
    "thinking_mode": "thinking-full",
    "language": "english",
    "include_audio": true,
    "voice": "en-US-JennyNeural",
    "voice_speed": 1.0,
    "system_instructions": "You are a psychology expert.",
    "context_messages": 50,
    "preserve_history": true
  }'

{
  "success": true,
  "response": "Consciousness is a fascinating topic...",
  "model": "lucy_128k-Q3_K_M.gguf",
  "audio_url": "/audio",
  "audio_duration_seconds": 31.2,
  "audio_size_kb": 486.7,
  "voice_speed": 1.0
}
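For callers that prefer Python over curl, the request can be assembled with the standard library. This is a sketch under stated assumptions: the URL and field names are copied from the example above, while build_chat_request and send_chat are hypothetical helper names and the 120-second timeout is an arbitrary choice.

```python
import json
import urllib.request

CHAT_URL = "http://localhost:5002/api/chat"  # from the curl example


def build_chat_request(message: str, **options) -> urllib.request.Request:
    """Build a JSON POST for /api/chat. Optional parameters such as
    max_tokens, temperature, or include_audio are passed as keywords."""
    payload = {"message": message, **options}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message: str, timeout: float = 120.0, **options) -> dict:
    """Send the request and decode the JSON reply."""
    req = build_chat_request(message, **options)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as send_chat("What is consciousness?", max_tokens=300, include_audio=True) would then return a dictionary shaped like the sample response, assuming the server is running locally.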

⏱️ Audio Duration & Timing Features

Precise Duration Calculation: The server uses the mutagen library to read the exact duration of each generated MP3 file, so the timing information it reports is accurate.
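The duration lookup might look roughly like the following. This is not the server's actual code: only the use of mutagen on MP3 files is stated above, so the helper names, the one-decimal rounding, and the 1024-bytes-per-KB convention are assumptions inferred from the sample response.

```python
import os


def mp3_duration_seconds(path: str) -> float:
    """Return the playback length of an MP3 file in seconds."""
    # Imported lazily so this module still loads where mutagen is absent.
    from mutagen.mp3 import MP3  # third-party: pip install mutagen

    return MP3(path).info.length


def audio_fields(duration_seconds: float, size_bytes: int) -> dict:
    """Shape values like the audio fields in the /api/chat response.
    Rounding and the KB divisor are assumptions, not documented behavior."""
    return {
        "audio_duration_seconds": round(duration_seconds, 1),
        "audio_size_kb": round(size_bytes / 1024, 1),
    }


def describe_mp3(path: str) -> dict:
    """Combine measured duration and file size for one MP3 file."""
    return audio_fields(mp3_duration_seconds(path), os.path.getsize(path))
```

Reading the length from the file's stream info, rather than estimating it from text length or bitrate, is what makes the reported value reliable for synchronization.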

Integration Use Cases: Voice command systems, progress indicators, mobile apps, batch processing, and accessibility features.
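As one illustration of these use cases, a client could drive a progress indicator or a hands-free microphone from audio_duration_seconds. The helpers below are hypothetical, and the 0.25-second padding before reopening the microphone is an arbitrary assumption.

```python
def playback_progress(elapsed_seconds: float, duration_seconds: float) -> float:
    """Fraction of the audio already played, clamped to the range 0..1."""
    if duration_seconds <= 0:
        return 1.0  # nothing to play, so treat playback as finished
    return min(max(elapsed_seconds / duration_seconds, 0.0), 1.0)


def mic_reopen_delay(
    duration_seconds: float,
    elapsed_seconds: float = 0.0,
    padding_seconds: float = 0.25,
) -> float:
    """Seconds to wait before reopening the microphone: the remaining
    playback time plus a small padding (assumed value)."""
    return max(duration_seconds - elapsed_seconds, 0.0) + padding_seconds
```

For a 40-second response that has played for 10 seconds, playback_progress(10.0, 40.0) gives 0.25.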
