
Intangles•5h ago
Foundit
Site Reliability Engineer
Pune, India
Full Time
Mid Level
Full Job Description
About the Role
Intangles Lab is seeking a dedicated, hands-on Site Reliability Engineer with a strong background in FinTech to manage our large-scale 24x7 cloud operations. The role requires a proactive individual with more than 2 years of experience managing complex cloud infrastructure.
Key Responsibilities
- Administer and maintain production environments utilizing technologies such as Linux, AWS, Terraform, Kubernetes, MongoDB, Elasticsearch, and PostgreSQL.
- Ensure the high availability, performance, and reliability of our production systems.
- Troubleshoot, debug, and resolve issues across production and QA environments, providing timely technical solutions.
- Participate in on-call rotations as per team policy to ensure 24/7 operational support.
- Develop and enhance automation scripts and tools to improve operational efficiency.
- Collaborate closely with internal teams and customers to meet stringent uptime service level agreements (SLAs).
- Create, update, and maintain comprehensive documentation, including runbooks, playbooks, and postmortem reports for production incidents.
- Be prepared to work in a 24/7 operational environment as required to maintain platform reliability.
Must-Have Skills
- AWS Cloud (Advanced): Certification is a plus.
- Networking (Intermediate): Solid understanding of networking concepts.
- Ubuntu/Linux & OS (Advanced): Strong Linux and networking fundamentals, backed by prior hands-on experience.
- Database Administration: Hands-on experience with MongoDB, PostgreSQL, or Elasticsearch. Basic knowledge of SQL and NoSQL databases is required.
- Containerization: Docker.
- Kubernetes (Advanced): Including Amazon EKS, StatefulSets, and Helm charts.
- CI/CD (Advanced): Proficiency with tools like CircleCI, Argo Project, or GitHub Actions.
- Programming/Scripting: Python, Shell scripting. A basic ability to write code is required.
Monitoring Stack
- Prometheus, Grafana, Alertmanager, Istio, Jaeger, Datadog, PagerDuty (or similar), Elastic APM.
Optional Skills (Bonus)
- Intermediate to advanced application development experience in JavaScript, Python, or Java.
- Understanding of N-tier Architectures.
- Familiarity with REST & gRPC API Frameworks.
- Knowledge of Node.js web servers.
Additional Requirements
- Awareness of change, incident, problem, issue, and risk management processes, including escalations.
- Flexibility to work in rotational shifts, including night hours and weekends.
- Excellent analytical and problem-solving skills.