
Site Reliability Operations Enginee...
Full Job Description
Finalsite is seeking a Site Reliability Operations Engineer to join our dynamic team. This role is pivotal in ensuring the day-to-day monitoring, maintenance, and support of our infrastructure and deployed products. You will collaborate closely with Site Reliability Engineering (SRE) teams to implement operational best practices, troubleshoot issues, and maintain the stability and performance of our systems hosted on Amazon Web Services (AWS) and Google Cloud Platform (GCP). A solid understanding of operational processes, cloud environments, and effective technical communication is essential for this position.
Key Responsibilities
- Monitor system performance, availability, and alerts across AWS and GCP using established tools.
- Respond to incidents and alerts, conduct initial troubleshooting, and escalate issues to SREs and other technical teams.
- Execute routine maintenance tasks (patching, updates, and backups) according to operational procedures, and prepare reports.
- Address escalations and requests from support, product development, and professional services teams across various global locations and time zones.
- Support the deployment of new applications and infrastructure components by following documented procedures and collaborating with SREs.
- Maintain and update operational documentation, runbooks, and knowledge base articles.
- Identify opportunities for operational improvements, automation, and efficiency gains in collaboration with SREs.
- Assist in managing and optimizing cloud costs and resource utilization.
- Ensure adherence to security policies and procedures in all operational activities.
- Provide feedback to SRE teams on operational challenges and requirements in a multi-cloud environment.
- Work Monday through Friday in a 24/5 shift environment.
Qualifications
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Minimum 2 years of experience in an operations role supporting production systems.
- Familiarity with cloud platforms, specifically AWS and GCP.
- Experience with observability, monitoring, and alerting tools (e.g., Prometheus, Grafana, Datadog, New Relic).
- Understanding of basic cloud networking concepts.
- Ability to troubleshoot and resolve basic technical issues.
- Experience executing documented operational procedures with attention to detail.
- Strong verbal English communication and collaboration skills.
- Eagerness to learn and adapt to new technologies and processes in a multi-cloud environment.
- Ability to thrive in a fast-paced, dynamic environment and manage multiple projects simultaneously.
Preferred Qualifications
- Relevant cloud certifications (e.g., AWS Certified SysOps Administrator, Google Cloud Professional Cloud Administrator).
- Experience with scripting languages for automation.
- Familiarity with ITIL or other operational frameworks.
- Experience in an Agile development environment.
This is a full-time, hybrid employment opportunity located in Chennai, TN, India. Candidates must be comfortable working on a 24/5 shift basis.
Company
Finalsite
Finalsite is a leading global provider of integrated website, communications, enrollment, and marketing solutions for educational institutions. Serving over 7,000 schools and districts across 119 coun...