Saviynt
Posted 13h ago on Career Pages

Senior Production Reliability Engineer

Bengaluru
Full Time
Senior Level

Full Job Description

Join Saviynt's SaaS Operations team as a Senior Production Reliability Engineer in Bengaluru. This role is part of the Monitoring and Alerting team, which combines operational excellence with development expertise to deliver highly available, resilient services through automation and Infrastructure as Code. You will build reliability into the ecosystem by applying best practices in resiliency engineering, automation, observability, and chaos testing. The team thrives on diverse technical backgrounds and offers challenges in software and systems engineering, with a strong emphasis on building and managing monitoring and alerting systems. We seek a principal-level engineer with a systems-thinking mindset and a track record of scaling teams through production insights, operational automation, observability programs, developer guidance, and real-time metrics.

As a Senior Site Reliability Engineer on the Product SRE team, you will report to the Senior Director, Site Reliability Engineering.

WHAT YOU WILL BE DOING

  • Create and maintain infrastructure and tools to ensure service reliability and enhance customer experience.
  • Collaborate with teams to improve observability, automation, deployment processes, and system reliability.
  • Develop, deploy, and manage scalable, dependable infrastructure solutions to support global cloud services.
  • Partner with product, operations, and security teams for seamless implementation of features, tools, and updates across the platform.
  • Develop and deploy AI-powered tools to increase operational efficiency and drive engineering excellence.

WHAT WE ARE LOOKING FOR

  • Experience implementing comprehensive observability for microservices and Kubernetes clusters using tools such as OpenTelemetry.
  • Experience building and managing automation tools that streamline deployment, patching, scaling, and infrastructure management.
  • Experience developing scalable portals for SRE dashboards, SLI/SLO/SLA tracking, error budgets, and executive metrics that support data-driven decisions.
  • Proficiency in programming and scripting languages such as Java, Python, Go, or Shell.
  • Experience with OpenStack cloud, Linux, Kafka, RabbitMQ, Prometheus, Terraform, Kubernetes, Ansible, MLOps, Generative AI, PostgreSQL, and analytics databases.
  • Familiarity with AWS solutions; Azure experience is a plus.
  • Experience with containerized workloads, particularly Helm, AKS and EKS, other Kubernetes distributions, Docker, and JFrog.
  • Hands-on experience with logging and monitoring tools such as Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry, Datadog, AWS CloudWatch, Azure Monitor, Log Analytics, and Fluentd.
  • Knowledge of Network Security concepts including AWS/Azure Policy, VPN, Active Directory/RBAC, ACLs, NSG rules, and private endpoints.
  • Proven experience in implementing advanced observability practices and techniques at scale.
  • Hands-on experience with one or more observability tools (Prometheus, Grafana, ELK/OpenSearch, OpenTelemetry, Datadog, etc.).

WHAT YOU BRING

  • Bachelor’s degree in Computer Science or a related field, or equivalent experience, with 4+ years in Cloud-SRE, DevOps, or Systems Engineering.
  • Strong problem-solving abilities, excellent collaboration and communication skills, and a proactive approach to teamwork.
  • Knowledge of testing tools and frameworks.

Company

Saviynt

Saviynt is a leading provider of AI-powered identity governance solutions. Their platform manages and governs human and non-human access to an organization's applications, data, and business processes...
