What is the salary for this Site Reliability Engineer II position?

Salary information for this Site Reliability Engineer II position is available upon application.

What experience is required for this Site Reliability Engineer II role?

This Site Reliability Engineer II position requires mid-level experience.

Where is this Site Reliability Engineer II job located?

This Site Reliability Engineer II position is located in Bengaluru / Bangalore, India.

How do I apply for this Site Reliability Engineer II position at Backblaze?

You can apply for this Site Reliability Engineer II position by clicking the 'Apply Now' button on this page, which will direct you to the official application portal.

Site Reliability Engineer II at Backblaze | Bengaluru / Bangalore, India | MindMyJob

Backblaze is seeking a talented Site Reliability Engineer II (SRE II) to join our team in Bengaluru / Bangalore, India. This role is crucial for ensuring the stability, scalability, and reliability of our services and infrastructure. You will focus on developing automation, enhancing observability, and supporting incident response to maintain peak performance for our customer-facing systems.

You will collaborate closely with engineering, product, and operations teams to integrate reliability best practices into daily development and operational workflows. Your contributions will help build tools and processes that boost efficiency and minimize manual tasks.

Key Responsibilities

Service Reliability & Operations

Ensure the availability and durability of critical services in production environments.
Monitor service health using SLIs, SLOs, and error budgets, escalating issues when thresholds are at risk.
Participate in on-call rotations, incident response, and post-incident reviews to drive service improvements.
Adhere to ITIL/OSS processes, including incident, change, problem, and capacity management.

Automation & Tooling

Develop automation for routine operational tasks to reduce manual intervention and toil.
Contribute to monitoring, logging, and alerting frameworks (e.g., Prometheus, Grafana, Catchpoint, ELK).
Work with CI/CD pipelines, configuration management, and infrastructure as code tools such as Terraform, Ansible, and Jenkins.
Write scripts in languages like Bash, Python, or Go to enhance system reliability and efficiency.

Collaboration

Partner with engineering, product, and operations teams to support resilient system design and operations.
Assist in capacity planning and disaster recovery exercises.
Collaborate with vendors and service providers to troubleshoot issues and track SLA performance.
Document systems, share knowledge, and foster a reliability-focused engineering culture.

Continuous Improvement

Contribute to playbooks, runbooks, and operational documentation.
Identify recurring issues and propose long-term solutions.
Promote reliability-focused practices within development and operations teams.

Qualifications

Education & Experience

Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
2-4 years of experience in site reliability, systems engineering, or operations.
Experience with large-scale, production-grade systems.

Technical Skills

Strong Linux systems administration and troubleshooting skills.
Familiarity with service reliability concepts including monitoring, alerting, incident response, and root cause analysis.
Proficiency in at least one scripting language (Python, Bash, or Go).
Understanding of containerization technologies (Kubernetes, Docker) and microservices.
Knowledge of incident response and operational best practices.

Preferred Attributes

Experience in a SaaS, service provider, or distributed systems environment.
Familiarity with ITIL/OSS practices and SLO/SLA concepts.
Excellent problem-solving abilities and a strong desire to learn new technologies.
Experience with cloud platforms such as AWS, GCP, or Azure.
Ability to work independently, take ownership, and drive projects from identification to resolution.

At Backblaze, we are committed to creating a workplace where everyone feels valued and empowered. We encourage applications from individuals of all backgrounds and experiences. We believe in fairness, good treatment of our employees, and fostering diversity, equity, and inclusion at all levels.
