AION · Posted 15h ago on Career Pages

Senior Software Engineer

Bengaluru, Karnataka, India
Full Time
Senior Level

AION is seeking a Senior Software Engineer to join its Inference Platform team in Bengaluru, Karnataka, India. You will be instrumental in building and scaling high-performance inference systems for AI/ML workloads, addressing complexities in serving models at scale, including latency optimization, resource orchestration, autoscaling, and production reliability. Experience designing distributed systems that handle thousands of requests per second with sub-second response times and cost efficiency is crucial.

We are looking for candidates with a strong understanding of inference at scale. Experience with Golang is highly preferred, with bonus points for familiarity with inference engines (vLLM, TGI, TensorRT), containerization, and distributed systems. You should be comfortable taking ownership of platform-level decisions, strategically balancing performance and cost, and contributing to a platform used by thousands of developers globally.

As a product-minded engineer, you will understand how your technical decisions shape the end-user experience. This role requires a team player comfortable with diverse responsibilities: optimizing inference latency, managing infrastructure, engaging with customers to understand their challenges, and contributing to UI/UX, customer success, documentation, and product operations.

Responsibilities

Inference Platform Architecture & Core Services

  • Design and build AION's inference service platform, the core for large-scale AI model serving.
  • Architect and own key platform components: AI Gateway, Resource Orchestrator, Runtime Engines, and Autoscaler.
  • Develop highly modular, scalable, and extensible low-level designs for inference infrastructure.
  • Lead high-level design discussions, establish architectural patterns, and drive technical decisions for the inference stack.

Model Deployment & Lifecycle Management

  • Optimize model deployment, version upgrades, and rollback strategies.
  • Build robust pipelines for zero-downtime model updates.
  • Design intelligent routing for multi-model serving, A/B testing, and canary deployments.
  • Implement efficient GPU utilization and model cold-start optimization strategies.

Performance & Distributed Systems

  • Develop highly performant software for low-latency, high-throughput inference serving.
  • Build and debug production-grade distributed systems for real-time AI workloads.
  • Optimize inference pipelines for latency, throughput, batching, and resource utilization.
  • Design fault-tolerant systems with graceful degradation and auto-recovery.

Observability & Engineering Excellence

  • Build a high-performance telemetry and observability stack for inference metrics, performance tracking, and debugging.
  • Implement comprehensive monitoring for model latency, throughput, errors, GPU utilization, and cost.
  • Conduct thorough code reviews to ensure code quality, performance, and architectural consistency.
  • Establish engineering best practices for testing, documentation, and production readiness.

Requirements

  • 4+ years of experience building and scaling backend systems, distributed platforms, or inference infrastructure.
  • Strong understanding of AI/ML inference systems and experience with inference engines (vLLM, TGI, TensorRT-LLM, or similar).
  • Deep knowledge of distributed systems design, microservices architecture, and API gateway patterns.
  • Proficiency in Golang strongly preferred; experience with Python, Rust, or C++ for performance-critical components is a plus.
  • Experience with container orchestration (Kubernetes, Docker) and infrastructure-as-code.
  • Solid understanding of autoscaling strategies, load balancing, and resource scheduling algorithms.
  • Experience building high-throughput, low-latency systems (sub-100ms response times).
  • Familiarity with message queues (Kafka, RabbitMQ), databases (PostgreSQL, Redis), and event-driven architectures.
  • Knowledge of GPU computing, model serving optimizations (batching, quantization, multi-tenancy), and resource allocation.
  • Experience with observability tools (Prometheus, Grafana, OpenTelemetry) and distributed tracing.
  • Understanding of API design, rate limiting, authentication/authorization, and security best practices.
  • Exposure to AI model deployment workflows and model lifecycle management is highly desirable.

Bonus / Good to Have

  • HPC & Cluster Management: Experience with large-scale HPC clusters (Kubernetes, Slurm) for job scheduling and resource orchestration.
  • Data Engineering: Expertise in data pipelines, ETL systems, and large-scale data processing frameworks.
  • Systems-Level Programming: Experience with low-level systems programming (storage, Kubernetes operators, OS-level software, daemon services).
  • ML Platform Engineering: Experience productionizing ML pipelines, batch job orchestration, model fine-tuning, and Jupyter notebook orchestration.
  • Enterprise Deployment: Experience packaging software for on-premises or VPC deployments, focusing on security and compliance.

Preferred Attributes

  • High ownership, self-driven, and a bias for action.
  • Strong strategic thinking and ability to link technical decisions to business impact.
  • Excellent communication and mentoring skills.
  • Ability to thrive in ambiguous, fast-paced startup environments.

Why Join AION?

  • Work directly with founders shaping technical and product strategy.
  • Build infrastructure for the future of AI compute.
  • Significant ownership and impact with competitive equity.
  • Competitive compensation, flexible work options, and wellness benefits.

Apply Now

If you are a strong engineer ready to lead architecture and scale next-generation AI infrastructure, we encourage you to apply. Please include:

  • Your resume highlighting relevant projects and leadership experience.
  • Links to products, code, or demos you have built.
  • A brief note on why AION's mission excites you.

Company

AION

AION is developing a decentralized AI cloud platform designed to power high-performance computing (HPC). This platform aims to democratize compute access and offer managed services, functioning as an ...
