AION, posted 5h ago
Career Pages

Senior Software Engineer

Bengaluru, Karnataka, India
Full Time
Senior Level

Full Job Description

AION is seeking a highly skilled Senior Software Engineer to join our Compute Platform team in Bengaluru, India. You will play a pivotal role in designing, building, and scaling the next generation of AI cloud infrastructure.

About the Role:

As a Senior Software Engineer, Compute Platform, you will architect and implement AION's multi-cloud compute platform. You will create abstraction layers that unify diverse cloud providers (AWS, GCP, Azure) and bare-metal data centers, enabling seamless integration and management of compute resources. You will develop and maintain AION's managed services, with a focus on scalable orchestration systems for GPU workloads, container scheduling, and efficient resource allocation. You will also develop robust APIs and control planes for compute lifecycle management, lead technical discussions on reliability, performance, and cost optimization, and deliver peripheral platform services such as billing, usage accounting, and observability.

Responsibilities:

  • Design and architect AION's multi-cloud compute platform, unifying diverse cloud providers and bare-metal data centers.
  • Collaborate with cloud providers to expand AION's compute pool, considering pricing, availability, GPU types, and capacity.
  • Build and maintain AION's managed services, including scalable orchestration for GPU workloads and containers.
  • Develop APIs and control planes for compute lifecycle management (provisioning, scaling, termination).
  • Lead initiatives for platform reliability, performance optimization, and cost efficiency.
  • Implement peripheral platform services including billing, usage accounting, observability, and compliance tooling.
  • Build monitoring and telemetry systems for compute utilization, cost tracking, and performance metrics.
  • Establish and uphold engineering standards for platform development.
  • Mentor junior engineers on infrastructure best practices and distributed systems design.

Who You Are:

You are a seasoned engineer with a proven track record of building and scaling high-performance inference systems for AI/ML workloads. You excel at designing distributed systems that handle high request volumes with low latency and cost efficiency. You possess a product-minded approach, understanding the impact of technical decisions on developer experience. You are a collaborative team player, comfortable with diverse responsibilities ranging from feature development to customer interaction and platform operations.

Technical Skills and Experience:

  • 4+ years of experience building and scaling complex backend systems, cloud infrastructure, or distributed platforms.
  • Strong understanding of multi-cloud architectures and experience with AWS, GCP, or Azure at scale.
  • Deep knowledge of cloud abstractions: compute, storage, networking.
  • Proficiency in Golang strongly preferred; Python, Rust, or other systems languages a plus.
  • Experience with Kubernetes, container orchestration, and infrastructure-as-code (Terraform, Pulumi, CloudFormation).
  • Solid understanding of distributed systems principles.
  • Experience building APIs, control planes, and platform services for infrastructure management.
  • Familiarity with databases, message queues, and event-driven architectures.
  • Knowledge of GPU orchestration, AI/ML workloads, or HPC systems is highly desirable.
  • Experience with observability tools and distributed tracing.
  • Understanding of cloud billing models and cost optimization strategies.

Bonus / Good to Have:

  • HPC & Cluster Management: Experience with large-scale HPC clusters using Kubernetes and Slurm.
  • Data Engineering: Expertise with data pipelines and large-scale data processing.
  • Systems-Level Programming: Experience with low-level systems programming.
  • ML Platform Engineering: Experience productionizing ML pipelines and orchestration systems.
  • Enterprise Deployment: Experience packaging software for on-premises or customer VPC deployments, emphasizing security and compliance.

Why Join AION?

  • Work directly with founders shaping technical and product strategy.
  • Build infrastructure powering the future of AI compute globally.
  • Significant ownership and impact, with equity.
  • Competitive compensation, flexible work options, and wellness benefits.

Company

AION

AION is at the forefront of building an interoperable AI cloud platform. We are revolutionizing high-performance computing (HPC) through a decentralized AI cloud, designed for bare-metal performance. ...
