Invoice Cloud
Naukri

MLOps Engineer

Hyderabad
Full Time
Senior Level

Full Job Description

InvoiceCloud seeks a Senior AIOps & Reliability Engineer in Hyderabad to design and build AI- and ML-driven operational intelligence systems. This role is crucial for enhancing proactive reliability, observability, and intelligent automation across cloud-native platforms. The objective is to enable early risk detection, accelerate incident resolution, and develop self-healing systems.

Key responsibilities include designing and implementing ML models for anomaly detection, predictive incident identification, failure forecasting, and root cause analysis. You will leverage AI for incident summarization, classification, and remediation recommendations, and engineer data pipelines to transform observability telemetry into ML-ready datasets. Continuous model evaluation, retraining, and improvement based on production feedback are essential.

The role also involves implementing 'shift-left' AIOps initiatives to identify risks early in the Software Development Life Cycle (SDLC). This includes applying ML to code changes and deployment metadata to predict operational risks, embedding ML-driven risk scoring into Azure DevOps CI/CD and Pull Request (PR) workflows, and collaborating with engineering teams to validate observability-first development practices.

You will design AI-driven incident summarization, AI-assisted runbooks, and guided remediation, while building human-in-the-loop decision systems for critical incidents. Balancing AI, ML, and deterministic automation with a focus on explainability and trust is paramount.

In Observability & Telemetry Engineering, you will instrument applications using OpenTelemetry (OTEL), normalize and correlate metrics, logs, and traces, and integrate telemetry pipelines with New Relic. Defining and monitoring Service Level Indicators (SLIs), Service Level Objectives (SLOs), and operational health signals is also key.

The role requires experience in Cloud, Kubernetes & Platform Operations: designing and operating workloads on Microsoft Azure, managing Azure Kubernetes Service (AKS) clusters, and deploying containerized .NET 8 services using Helm.

For ML-Enabled DevOps, Infrastructure & Automation, you will build Azure DevOps pipelines for application, infrastructure, and ML deployments, manage source control using Azure DevOps Repos, implement Infrastructure as Code (IaC) using Terraform, and automate workflows with Ansible.

The focus on Intelligent Automation & Self-Healing Systems involves building closed-loop automation triggered by ML predictions, reducing alert fatigue through intelligent correlation, and developing self-healing systems to minimize Mean Time To Resolution (MTTR).

Required skills include strong ML fundamentals (anomaly detection, time-series analysis); experience applying AI/LLM systems to operational workflows; hands-on Microsoft Azure and AKS experience; proficiency with Kubernetes, Helm, Azure DevOps, Terraform, and Ansible; and experience with OpenTelemetry and New Relic.

Nice-to-haves include MLOps or ML lifecycle management experience, Python for ML experimentation, and familiarity with Site Reliability Engineering (SRE) principles.

Success will be measured by operational risks identified before production, early incident prediction, reduced alert noise, faster incident resolution, and continuous improvement in platform reliability.

Company

Invoice Cloud

InvoiceCloud is a leading fintech company, recognized for its secure and innovative Software-as-a-Service (SaaS) solutions in the electronic bill presentment and payment (EBPP) sector. With a strong m...

Posted on Naukri