
Site Reliability Engineer
Experience Level: Mid Level
Full Job Description
MakeMyTrip is seeking a Site Reliability Engineer (NOC) for its Gurgaon, India location. The role operates on rotational shifts and requires 1-3 years of experience. As part of the Site Reliability Team, you will be the first line of defense, ensuring the availability and performance of MakeMyTrip's production services 24x7x365. You will monitor production servers and services and interact with the Engineering, Sales, and Product teams, which calls for a broad understanding of components, systems, and networks. Diligence, attention to detail, multitasking, and prioritization are essential. Extensive prior knowledge is not mandatory, but a strong eagerness to learn and grow through research and experience is expected. Excellent communication and problem-solving skills are also crucial.
Prime Responsibilities:
- Monitor multiple systems for deviations in application layers.
- Respond to alerts, escalate issues, and track resolutions with incident reporting.
- Set up and monitor alerts using ops tools and monitoring applications such as Zabbix, Grafana, and the ELK stack.
- Develop shell/Python scripts for reports and cron scheduling.
- Adhere to defined processes and handle ad-hoc and surprise incidents.
- Create documentation and share knowledge for continuous improvement.
- Ensure clear and effective communication and reporting.
- Troubleshoot live production issues by correlating component behavior.
- Perform day-to-day maintenance of application systems, including troubleshooting and issue resolution or escalation.
Desired Skills:
- 3-6 years of experience in a 24x7 AWS Cloud-based Linux production environment.
- Ability to monitor diverse architectures, troubleshoot, analyze impact, and escalate issues.
- Willingness to work rotational shifts, including nights and weekends.
- Basic Linux command skills are mandatory; scripting language experience (Shell/Python) is a plus.
- Fundamental knowledge of Web/Internet concepts (DNS, protocols, ports, cookies, Firebug).
- Experience in L2 debugging, including log analysis for errors/exceptions.
- Basic knowledge of SQL queries.
- Ability to work effectively in a busy team, learn quickly, and handle various issues.
- Prior experience with ELK, Zabbix, or Grafana is advantageous.
- Knowledge of the AWS Cloud environment is a significant plus.
MakeMyTrip offers a dynamic environment where technology drives innovative travel experiences. Join a team committed to customer-centricity and continuous improvement.
Company
MakeMyTrip
MakeMyTrip (MMT) is a leading travel technology company that leverages cutting-edge solutions, including AI, machine learning, and cloud infrastructure, to deliver seamless and innovative travel exper...