IBM, posted 2h ago
Naukri

Site Reliability Engineer

Bengaluru
Mid Level

Full Job Description

Site Reliability Engineer at IBM Bengaluru

IBM is seeking a talented Site Reliability Engineer to join our dynamic team in Bengaluru. This role is crucial for ensuring the high availability, resilience, and scalability of our cutting-edge IBM Quantum platforms and services.

Responsibilities:

  • Lead incident response efforts, participate in critical war room activities, and drive comprehensive post-incident reviews and corrective actions to prevent recurrence.
  • Collaborate closely with development teams to effectively debug, deploy, and maintain quantum workloads and backend services, ensuring seamless operation.
  • Establish, refine, and maintain observability across logs, metrics, traces, and alerting systems for proactive issue detection and resolution.
  • Design and build innovative internal tools, robust automations, and efficient operational workflows to significantly improve team efficiency and minimize manual toil.
  • Champion a culture of operational ownership, ensuring every quantum job runs reliably with complete traceability from inception to completion.
  • Drive significant platform-wide improvements by leveraging operational insights, lessons learned from incidents, and established reliability patterns.

Required Qualifications:

  • Bachelor's Degree.
  • 2–5 years of proven professional experience as a Site Reliability Engineer.
  • Strong systems-thinking ability to correlate complex data points including logs, traces, metrics, and code across distributed workloads.
  • Hands-on experience with incident management, production operations, and on-call responsibilities in a demanding environment.
  • Proficiency with modern observability tools such as Grafana, Sysdig, Jaeger, and similar solutions.
  • Familiarity with container orchestration technologies such as Kubernetes, a deep understanding of Linux internals, and programming proficiency in Python or Go.
  • Demonstrated ability to collaborate effectively across development, infrastructure, and platform teams.
  • Proven ability to transform incident learnings into actionable automation, robust fixes, or significant architectural improvements.
  • Solid understanding of SLI/SLO/SLA frameworks and key reliability metrics.

Preferred Qualifications:

  • Experience with IBM Cloud services.
  • Familiarity with Qiskit or foundational quantum computing concepts.
  • Master's Degree.

Posted on Naukri