
Senior Site Reliability Engineer
Experience Level: Senior Level
Full Job Description
EarnIn is seeking a Senior Site Reliability Engineer (SRE) to join our expanding team in Bengaluru, India. This hybrid role is crucial for delivering exceptional product experiences to our community members. You will collaborate with various teams to deploy production-ready features rapidly while building and contributing to essential infrastructure, reliability tooling, and best practices. Your focus will be on system reliability, performance, and seamless deployments, with the goal of making every deployment as predictable and uneventful as possible.
As a Senior SRE, you will be a technical leader responsible for the design, monitoring, and operation of our production systems. You will focus on the holistic behavior of services, including their reliability, performance, failure modes, and the overall developer experience. We are passionate about creating a robust and efficient operational environment.
Key responsibilities include designing systems with resilience and capacity in mind, defining and measuring Service Level Objectives (SLOs) and Service Level Indicators (SLIs) that reflect customer experience, and building comprehensive observability using tools like Datadog and CloudWatch. You will configure alerting and incident management through incident.io so that every page corresponds to a critical issue. A significant part of the role involves continuously improving our incident lifecycle, from detection and triage to communication and post-incident analysis, with the aim of blameless reviews and actionable outcomes.
We are looking for candidates with a Master's or Bachelor's degree in Computer Science or a related field, or equivalent practical experience, and at least 4 years of experience in SRE or Software Engineering. You should have a proven track record of managing production environments at scale, a strong belief in the importance of observability, and experience using SLOs, SLIs, and KPIs to drive decisions. Familiarity with SRE principles, as outlined in the SRE book, and the ability to contextualize them for different teams are essential. Proficiency in leveraging AI productivity tools for operational efficiency and a deep understanding of shepherding services from design to production, including learning from incidents, are also required. Experience in handling site-wide outages, implementing technical and process changes to prevent recurrence, and a passion for mentoring engineers to reduce toil are highly valued.
EarnIn offers excellent employee benefits, including healthcare, internet/cell phone reimbursement, a learning and development stipend, and opportunities to collaborate with our Palo Alto HQ and Bangkok teams. We are committed to fostering a diverse and inclusive culture.
Company
EarnIn
EarnIn is a pioneer in earned wage access, dedicated to building financial flexibility for individuals living paycheck to paycheck. Our community members can access their earnings as they earn them, w...