AI Developer for STT, TTS, ML, and NLP (Monthly Salary, 2-3 Years)
We are looking for an experienced AI Developer to join our team for a long-term collaboration to improve and expand the AI capabilities of our proprietary tool. The foundation is in place, but key modules require fine-tuning, optimization, and additional features.
Current Tech Stack & Setup
- Frontend: C# (WPF, MVVM), currently being optimized by our in-house developer
- Backend: Python (Django) + MongoDB
- Deployment: Local (on-premises high-performance server)
Existing AI Features (Functional but Need Refinement):
- TTS (Text-to-Speech) – Working, but requires fine-tuning
- STT (Speech-to-Text) – Working, but requires fine-tuning
- AI Stories (Animated Avatars) – Lip-sync & basic facial expressions work, but need:
  - Body movements (hands, head, eyes)
  - Emotion control (happy, sad, etc.)
  - Improved realism & synchronization
Key AI Modules to Develop/Enhance
- AI Translate – Real-time translation of live streams into multiple languages (needs AI training)
- AI Avatar – Auto-generated content feeds using scripts (needs AI training)
- AI Stories – Expand beyond lip movements to include natural body language & expressions
- Video Studio AI – Video generation & editing, similar to Zebracat.ai (needs AI training)
- Film Studio – Video translation while preserving the original speaker’s voice (needs AI training)
- AI Training – Improve STT/TTS accuracy & refine voice cloning (prototype ready, needs optimization)
- Voice Cloning (needs AI training)
What’s Already in Place:
- Training Data: 1000+ audio & text files for AI training
- Voices & Video Content: Available for integration
- Tool Prototype: Partially functional (TTS, STT, basic avatar animations)
- Database: MongoDB set up and operational
How We’ll Work
- Remote access to our local server for development & training
- Step-by-step, module-wise improvements – Focus on one AI component at a time
- Fine-tuning & optimization for high-quality, realistic output
- Close collaboration with our team for iterative testing & feedback
Ideal Candidate
- Strong experience in AI/ML, NLP, TTS/STT, generative avatars, and video synthesis
- Proficient in Python, Django, and AI model training/fine-tuning
- Familiarity with C# (WPF) integration is a plus
- Able to work long-term with a structured, milestone-based approach
Please provide:
- Examples of similar AI projects you’ve worked on (especially TTS/STT, avatar animation, or video AI).
- Your proposed plan for improving lip-sync, adding body movements, and refining TTS/STT.
- Your availability for a long-term, collaborative project.