Forward Deployed ML Engineer, Agents
AION is seeking a hands-on AI Engineer with 3-5+ years of experience building production-grade multimodal AI systems and LLM applications. In this role, you will function like the CTO of an AI startup, working in small teams to deliver high-stakes customer projects. You will embed directly at client sites to architect, build, and deploy intelligent agent solutions.
You should be adept at writing production code, presenting technical solutions to C-level executives, and debugging complex AI systems in various environments. Experience shipping voice agents, video processing systems, or conversational AI to production is essential. You will excel at translating ambiguous business requirements into concrete technical solutions that drive measurable impact.
This role spans the full AI deployment lifecycle, from use case discovery and solution architecture to multimodal agent development, MLOps pipeline implementation, and production optimization. A strong understanding of how agents perform in production, and of how to improve their quality systematically through observability and evaluation, is crucial.
Experience with voice AI platforms, RAG systems, and LLM orchestration frameworks is highly desirable. Exceptional communication skills, customer empathy, and a drive to build transformative AI solutions are key.
What You'll Do
Customer Engagement & Multimodal Agent Development
- Work directly at customer sites conducting discovery workshops and technical assessments to identify high-impact AI opportunities.
- Design and architect end-to-end multimodal agent systems (voice + video + text) leveraging AION's distributed GPU infrastructure and managed services.
- Build production-grade voice AI systems using STT and TTS APIs and LLMs deployed on AION's platform.
- Develop vision-enabled agents that process real-time video streams using computer vision pipelines on AION's infrastructure.
- Implement sophisticated multi-agent orchestration with frameworks like LangChain or LlamaIndex, enabling tool use, memory management, and autonomous task completion.
- Rapidly prototype POCs in 2-4 weeks, coding alongside client teams to validate concepts and iterate based on feedback.
- Optimize for sub-500ms latency, natural conversation flow, turn detection, and interruption handling in real-time systems.
- Integrate agents directly into customer codebases via REST/GraphQL/WebSocket APIs and custom SDKs (Python, TypeScript).
- Act as a trusted technical advisor to customers, shaping AI strategy and guiding roadmap decisions from concept to production.
Data Strategy & MLOps Infrastructure
- Design data architectures with efficient processing pipelines and ingestion workflows for training and inference on AION's platform.
- Implement RAG systems with vector databases, optimizing embedding strategies, chunk sizes, and retrieval methods.
- Prepare and validate datasets for fine-tuning, evaluation, and synthetic data generation.
- Collaborate with MLEs, MLOps, and SREs to carry out model deployment and productionization.
Observability, Evaluation & Production Operations
- Implement LLM and agent observability and monitoring, tracking token usage, latency, costs, and quality metrics across deployments on AION's infrastructure.
- Instrument applications to trace LLM calls, retrieval operations, agent actions, and data flows.
- Build evaluation frameworks with offline benchmarks (accuracy, relevance, safety metrics) and online monitoring (user feedback, drift detection).
Technical Skills & Experience
We encourage you to apply if you meet some of these requirements and are comfortable learning the rest:
- 3-5+ years of hands-on experience building production AI/ML systems, with 1-2+ years deploying LLM applications to production.
- Multimodal AI expertise: practical experience building voice agents, vision systems, or conversational AI serving real users.
- Strong LLM foundations: hands-on with modern foundation models, including fine-tuning, prompt engineering, and evaluation methodologies.
- Agent framework proficiency: production experience with LangChain, LlamaIndex, or similar orchestration frameworks.
- Voice AI platform experience: built real-time conversational systems with production STT/TTS integration.
- Proficiency in Python (production-grade, async programming, type hints) and JavaScript/TypeScript (full-stack development).
- RAG implementation experience: built retrieval-augmented generation systems with vector databases.
- MLOps & deployment: hands-on with Docker, Kubernetes, CI/CD pipelines, and infrastructure-as-code.
- Cloud platform experience: AWS, Azure, or GCP for ML workloads and infrastructure management.
- Exceptional communication skills, with the ability to explain complex AI concepts clearly to both technical and business stakeholders.
- Customer-facing experience in Solutions Architecture, Technical Account Management, or Pre-Sales Engineering is highly desirable.
- Computer vision experience (video processing, object detection, vision-language models) is a plus.
- Model fine-tuning experience (LoRA/QLoRA, supervised fine-tuning, RLHF) is a plus.
- Inference optimization experience (vLLM, TensorRT-LLM, Triton, model quantization) is desirable.
- Observability tooling experience (LLM monitoring, tracing, evaluation frameworks) is a strong plus.
- Familiarity with WebRTC, real-time streaming protocols, and low-latency media processing is a plus.
Why Join AION
- Work directly with high-pedigree founders shaping technical and product strategy.
- Build infrastructure powering the future of AI compute globally.
- Significant ownership and impact with equity reflective of your contributions.
- Competitive compensation, flexible work options, and wellness benefits.
