What is the salary for this Senior Production Reliability Engineer position?

Salary information for this Senior Production Reliability Engineer position is available upon application.

What experience is required for this Senior Production Reliability Engineer role?

This Senior Production Reliability Engineer position requires senior-level experience.

Where is this Senior Production Reliability Engineer job located?

This Senior Production Reliability Engineer position is located in Bengaluru.

How do I apply for this Senior Production Reliability Engineer position at Saviynt?

You can apply for this Senior Production Reliability Engineer position by clicking the 'Apply Now' button on this page, which will direct you to the official application portal.

Senior Production Reliability Engineer at Saviynt | Bengaluru | Apply Now | MindMyJob

Join Saviynt's SaaS Operations team as a Senior Production Reliability Engineer in Bengaluru. This role is part of the Monitoring and Alerting team, which merges operational excellence with development expertise to deliver highly available, resilient services through automation and Infrastructure as Code. You will build reliability into the ecosystem by applying best practices in Resiliency Engineering, Automation, Observability, and Chaos Testing. The team thrives on diverse technical backgrounds and offers challenges in software and systems engineering, with a strong emphasis on building and managing Monitoring and Alerting systems. We seek a Systems Thinking Principal Engineer with a track record of scaling teams through production insights, operational automation, building observability programs, developer guidance, and real-time metrics.

As a Senior Site Reliability Engineer on the Product SRE team, you will report to the Senior Director, Site Reliability Engineering.

WHAT YOU WILL BE DOING

Create and maintain infrastructure and tools to ensure service reliability and enhance customer experience.
Collaborate with teams to improve observability, automation, deployment processes, and system reliability.
Develop, deploy, and manage scalable, dependable infrastructure solutions to support global cloud services.
Partner with product, operations, and security teams for seamless implementation of features, tools, and updates across the platform.
Develop and deploy AI-powered tools to increase operational efficiency and drive engineering excellence.

WHAT WE ARE LOOKING FOR

Implement comprehensive observability for microservices and Kubernetes clusters using tools like OpenTelemetry.
Build and manage automation tools to streamline deployment, patching, scaling, and infrastructure management.
Develop scalable portals for SRE dashboards, SLI/SLO/SLA tracking, error budgets, and executive metrics to facilitate data-driven decisions.
Proficiency in programming and scripting languages such as Java, Python, Go, or Shell.
Experience with OpenStack cloud, Linux, Kafka, RabbitMQ, Prometheus, Terraform, Kubernetes, Ansible, MLOps, Generative AI, PostgreSQL, and analytics databases.
Familiarity with AWS solutions; Azure experience is a plus.
Experience with containerized workloads, particularly Helm, AKS & EKS, other K8s distributions, Docker, and JFrog.
Hands-on experience with logging and monitoring tools such as Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry, Datadog, AWS CloudWatch, Azure Monitor, Log Analytics, and Fluentd.
Knowledge of Network Security concepts including AWS/Azure Policy, VPN, Active Directory/RBAC, ACLs, NSG rules, and private endpoints.
Proven experience in implementing advanced observability practices and techniques at scale.
Hands-on experience with one or more observability tools (Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry, Datadog, etc.).

WHAT YOU BRING

Bachelor’s degree in Computer Science or a related field, or equivalent experience, with 4+ years in Cloud-SRE, DevOps, or Systems Engineering.
Strong problem-solving abilities, excellent collaboration and communication skills, and a proactive approach to teamwork.
Knowledge of testing tools and frameworks.
