
Site Reliability Engineer
Experience Level: Mid Level
Full Job Description
NICE Ltd. is seeking an experienced Site Reliability Engineer (SRE) in Pune, India, to support large, complex enterprise software clients. This role involves managing applications, servers, SQL databases, and networks, requiring exceptional problem-solving abilities. The SRE will deliver real-time insights from massive-scale data, collaborating with cross-functional teams to develop innovative solutions and enhance user experiences.
Key Responsibilities:
- Manage the production environment by monitoring availability and overall system health.
- Develop software and systems for platform infrastructure and application management.
- Enhance the reliability, quality, and time-to-market of software solutions.
- Measure and optimize system performance to drive innovation and meet customer needs.
- Provide primary operational support and engineering for multiple large distributed software applications.
- Analyze operating system and application metrics for performance tuning and fault identification.
- Collaborate with development teams on service improvements through rigorous testing and release processes.
- Participate in system design, platform management, and capacity planning.
- Automate and uplift systems and services for sustainability.
- Balance feature development speed against reliability, guided by service level objectives.
- Handle graveyard shift work as required.
Required Qualifications:
- 2+ years of programming/scripting experience in Go, Python, .NET (C#), or Node.js.
- Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
- 2-3 years of experience in systems engineering, automation, and reliability.
- Proficiency in at least one programming language (Python, Go, Java, C#) and scripting languages (Bash, PowerShell).
- Deep understanding of cloud computing platforms (e.g., AWS) and their services (e.g., EC2, ECS, Lambda, DynamoDB).
- Experience with infrastructure as code tools (e.g., CloudFormation, Terraform).
- Strong knowledge of CI/CD concepts and tools (e.g., Jenkins, GitLab CI/CD, CircleCI).
- Familiarity with containerization (Docker, Kubernetes) and microservices architecture.
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch).
- Excellent problem-solving skills for troubleshooting distributed systems.
- Experience with incident management, blameless postmortems, and cross-functional incident response.
Advantageous Skills:
- Kubernetes (certification preferred)
- Grafana
- AWS, Azure
- DevOps experience
About the Role:
- Company: NICE
- Location: Pune, India
- Job Type: Permanent
- Experience: 2-4 Years
- Reporting to: Tech Manager
- Role Type: Individual Contributor
Why Join NICE?
Join a dynamic, market-disrupting global company with a fast-paced, collaborative, and creative environment. NICE offers ample opportunities for learning, growth, and internal career advancement across various roles, disciplines, and locations. If you are passionate, innovative, and driven to excel, you might be our next NICEr!
NICE embraces the NICE-FLEX hybrid work model, offering a blend of 2 days in the office and 3 days of remote work weekly. Office days are designed for collaborative meetings, fostering innovation and interaction.
Company
NICE
NICE Ltd. (NASDAQ: NICE) is a global leader in providing software solutions for customer experience management, financial crime prevention, and public safety. With over 25,000 businesses worldwide rely...