AION, posted 4h ago
Career Pages

Machine Learning Engineer

Bengaluru, Karnataka, India
Full Time
Mid Level

Full Job Description

We are seeking a hands-on Machine Learning Engineer with 4-6 years of experience to join our team in Bengaluru, Karnataka, India. This role focuses on building and fine-tuning large language models (LLMs) and transformer-based models, tackling complex problems at the intersection of ML research and production systems.

You will be involved in the entire ML development lifecycle, including data preparation, model fine-tuning, evaluation, and optimization. A strong understanding of what drives model performance and how to systematically improve it through experimentation is key. Experience with LLM fine-tuning techniques (LoRA, QLoRA), RLHF pipelines, and comprehensive model evaluation is highly desirable. We are looking for an individual with strong ownership, initiative, and a passion for developing production-ready ML models that will impact thousands of developers worldwide.

What You'll Do:

ML Model Development & Optimization

  • Design and implement end-to-end LLMOps pipelines for model training, fine-tuning, and evaluation.
  • Fine-tune and customize LLMs (e.g., Llama, Mistral, Gemma) using full fine-tuning and PEFT techniques (LoRA, QLoRA) with tools such as Unsloth, Axolotl, and HuggingFace Transformers.
  • Implement Reinforcement Learning from Human Feedback (RLHF) pipelines for model alignment and preference optimization.
  • Design experiments for automated hyperparameter tuning, training strategies, and model selection.
  • Prepare and validate training datasets, ensuring data quality, preprocessing, and format correctness.
  • Build comprehensive model evaluation systems with standard and custom metrics (e.g., BLEU, ROUGE, perplexity, accuracy) and develop synthetic data generation pipelines.
  • Optimize model accuracy, token efficiency, and training performance through systematic experimentation.
  • Design and maintain prompt engineering workflows with version control systems.
  • Deploy models using vLLM with multi-adapter LoRA serving, hot-swapping, and inference optimizations such as speculative decoding, continuous batching, and KV cache management.
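
As a rough illustration of the evaluation work in this list, perplexity can be computed from per-token log-probabilities. The sketch below is a minimal standard-library example, not AION's evaluation code; the function name `perplexity` and the sample values are assumptions made for illustration.

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative natural-log probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)

# Hypothetical log-probabilities: a 4-token sequence, each token assigned p = 0.25.
sample = [math.log(0.25)] * 4
print(perplexity(sample))
```

A model that assigns every token a probability of 0.25 has a perplexity of about 4, meaning it is as uncertain as a uniform choice among four tokens.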

ML Operations & Technical Leadership

  • Set up ML-specific monitoring for model quality, drift detection, and performance tracking, with automated retraining triggers.
  • Manage model versioning, artifact storage, lineage tracking, and reproducibility using experiment tracking tools.
  • Debug production model issues and optimize cost-performance trade-offs for training and inference.
  • Collaborate with infrastructure engineers on ML-specific compute requirements and deployment pipelines.
  • Document model development processes and share knowledge through internal tech talks.

Technical Skills & Experience:

We encourage you to apply if you meet some of these requirements and are eager to learn the rest.

  • 4-6 years of hands-on experience in machine learning engineering or applied ML roles.
  • Strong fine-tuning experience with modern LLMs, including practical knowledge of transformer architectures, attention mechanisms, and PEFT techniques (LoRA/QLoRA).
  • Deep understanding of transformer model architectures and their modern variants (MoE, Grouped-Query Attention, Flash Attention, state space models).
  • Production ML experience, including building and fine-tuning models for real-world applications.
  • Proficiency in Python and ML frameworks such as PyTorch, HuggingFace Transformers, PEFT, and TRL, with hands-on experience in tools like Unsloth and Axolotl.
  • Experience building model evaluation systems with metrics like BLEU, ROUGE, perplexity, and accuracy.
  • Hands-on experience with prompt engineering, synthetic data generation, and data preprocessing pipelines.
  • Basic deployment experience with vLLM, including multi-adapter serving, hot-swapping, and inference optimizations.
  • Understanding of GPU computing concepts such as memory management, multi-GPU training, mixed precision, and gradient accumulation.
  • Strong debugging skills for training failures, OOM errors, convergence issues, and data quality problems.
  • Experience with model alignment techniques (RLHF, DPO), including implementing RLHF pipelines, is highly desirable.
  • Experience with distributed training (DeepSpeed, FSDP, DDP) is a plus.
  • Knowledge of model quantization techniques (GPTQ, AWQ) and their impact on model quality is desirable.
  • Prior experience with AWS SageMaker and with experiment tracking tools such as MLflow and Weights & Biases is a strong plus.
  • Exposure to cloud platforms (AWS/GCP/Azure) for training workloads is beneficial.
  • Familiarity with Docker containerization for reproducible training environments.
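
To make the PEFT requirement above concrete, the sketch below counts trainable parameters for a LoRA adapter on a single linear layer. It uses the standard low-rank formulation, in which the update to a d_out x d_in weight matrix is factored into matrices of shape d_out x r and r x d_in. The layer size and rank are illustrative assumptions, not values from any AION model.

```python
def lora_trainable_params(d_in, d_out, rank):
    """Parameters in the two low-rank factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

def full_params(d_in, d_out):
    """Parameters in the dense weight matrix (bias ignored)."""
    return d_in * d_out

# Hypothetical layer: a 4096 x 4096 projection adapted at rank 8.
d_in = d_out = 4096
rank = 8
lora = lora_trainable_params(d_in, d_out, rank)
full = full_params(d_in, d_out)
print(lora, full, lora / full)
```

Under these assumed sizes the adapter trains 65,536 parameters against 16,777,216 in the dense layer, which is why LoRA and QLoRA fit on far less GPU memory than full fine-tuning.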

Preferred Attributes:

  • High ownership, self-driven, and a bias for action.
  • Strong strategic thinking and the ability to connect technical decisions to business impact.
  • Excellent communication and mentoring skills.
  • Thrives in ambiguous, fast-paced environments and early-stage startup cultures.

Why Join AION?

  • Work directly with high-pedigree founders shaping technical and product strategy.
  • Contribute to building the infrastructure powering the future of AI compute globally.
  • Significant ownership and impact with equity reflective of your contributions.
  • Competitive compensation, flexible work options, and wellness benefits.

If you are a machine learning engineer ready to lead ML-as-a-Service (MLaaS) architecture and scale next-generation AI infrastructure, we encourage you to apply. Please include the following in your application summary:

  • Your resume highlighting relevant projects and leadership experience.
  • Links to products, code (GitHub), or demos you have built.
  • A brief note explaining why AION’s mission excites you.

Company

AION

AION is pioneering a decentralized AI cloud platform designed for high-performance computing (HPC). We are transforming the future of compute by democratizing access and offering managed services, aim...

Posted on Career Pages