Mahindra

Data Engineer

Pune, India
Full Time
Mid Level



Data Engineer - Pune, India

Mahindra is seeking a skilled Data Engineer to join our Data Engineering & Infrastructure team in Pune, India. This critical role involves designing, building, and maintaining robust data pipelines and infrastructure on Google Cloud Platform (GCP). Your work will ensure the reliable flow of data across the organization, powering real-time analytics and data-driven decision-making, preventing data silos, safeguarding data quality, and enabling our data operations to scale.

Key Responsibilities and Deliverables

Data Pipeline Development

Design, develop, and maintain scalable ETL/ELT pipelines using GCP services such as Cloud Dataflow, Cloud Composer (Apache Airflow), and Cloud Functions. Build both real-time and batch data processing solutions capable of handling diverse data sources and formats.
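To make the shape of such a pipeline concrete, here is a minimal pure-Python sketch of the extract → transform → load stages that a production Cloud Dataflow job would implement at scale. All field names and the in-memory "warehouse" are illustrative stand-ins, not Mahindra's actual schema.

```python
# Minimal ETL sketch: the extract -> transform -> load stages a GCP
# pipeline (e.g. Cloud Dataflow) would implement at scale. Field names
# and the in-memory "warehouse" are illustrative.
from typing import Iterable


def extract(raw_rows: Iterable[str]) -> list:
    """Parse raw CSV-like lines into records (extract stage)."""
    records = []
    for line in raw_rows:
        vehicle_id, reading = line.split(",")
        records.append({"vehicle_id": vehicle_id, "reading": float(reading)})
    return records


def transform(records: list) -> list:
    """Drop invalid readings and add a derived field (transform stage)."""
    return [
        {**r, "reading_kw": r["reading"] / 1000}
        for r in records
        if r["reading"] >= 0
    ]


def load(records: list, sink: list) -> None:
    """Append records to the sink (BigQuery in production)."""
    sink.extend(records)


warehouse = []
load(transform(extract(["v1,1500", "v2,-3", "v3,2500"])), warehouse)
```

The same three-stage structure maps directly onto a Beam/Dataflow pipeline, where each function becomes a `ParDo` or composite transform.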

Cloud Infrastructure Management

Architect and implement data infrastructure on Google Cloud Platform, leveraging services like BigQuery, Cloud Storage, Cloud SQL, Cloud Spanner, and Bigtable. Focus on optimizing the performance, cost, and reliability of these cloud resources.

Data Integration & Orchestration

Integrate data from a variety of sources, including APIs, databases, IoT devices, and third-party systems. Implement data orchestration workflows using Cloud Composer to ensure seamless data flow across different systems.
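The core idea behind a Cloud Composer / Airflow workflow is running tasks in dependency order. A toy sketch of that scheduling logic, with illustrative task names, looks like this:

```python
# Toy orchestration sketch: execute tasks after their upstream
# dependencies, the core idea behind a Cloud Composer / Airflow DAG.
# Task names are illustrative.
def run_dag(tasks: dict, deps: dict) -> list:
    """Run each task after its upstream dependencies (simple DFS)."""
    done = []

    def visit(name: str) -> None:
        if name in done:
            return
        for upstream in deps.get(name, []):
            visit(upstream)
        tasks[name]()          # run the task body
        done.append(name)

    for name in tasks:
        visit(name)
    return done


order = run_dag(
    tasks={"extract": lambda: None, "transform": lambda: None, "load": lambda: None},
    deps={"transform": ["extract"], "load": ["transform"]},
)
```

In Airflow itself the same dependencies would be declared with operators and `extract >> transform >> load`; the scheduler, retries, and backfills come for free.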

Data Quality & Governance

Implement robust data quality checks, validation rules, and monitoring systems. Ensure adherence to data governance policies and security standards by utilizing GCP security services like Cloud DLP and Cloud IAM.
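A minimal sketch of rule-based validation, splitting valid rows from rejects before they reach the warehouse; the rule names and record fields are hypothetical:

```python
# Data-quality sketch: apply named validation rules to each record and
# split valid rows from rejects, as a pipeline stage might do before
# loading to BigQuery. Rule names and fields are illustrative.
RULES = {
    "vehicle_id_present": lambda r: bool(r.get("vehicle_id")),
    "reading_non_negative": lambda r: r.get("reading", -1) >= 0,
}


def validate(records):
    """Return (valid_records, rejects) where each reject carries the
    list of rule names it failed."""
    valid, rejects = [], []
    for record in records:
        failed = [name for name, rule in RULES.items() if not rule(record)]
        if failed:
            rejects.append((record, failed))
        else:
            valid.append(record)
    return valid, rejects


good, bad = validate([
    {"vehicle_id": "v1", "reading": 10.0},
    {"vehicle_id": "", "reading": 5.0},
])
```

Keeping the failed-rule names alongside each reject makes monitoring dashboards and dead-letter queues straightforward to build.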

Real-time Data Processing

Build streaming data pipelines using Cloud Pub/Sub, Cloud Dataflow, and BigQuery streaming inserts. Develop solutions that support real-time analytics and event-driven architectures.
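The central aggregation pattern in such pipelines is windowing. A pure-Python sketch of tumbling-window counts, the kind of aggregation a Pub/Sub → Dataflow job applies before streaming-inserting into BigQuery (the 60-second window and event shape are illustrative):

```python
# Streaming sketch: tumbling-window event counts -- the aggregation
# pattern a Pub/Sub -> Dataflow pipeline applies before streaming
# inserts into BigQuery. Timestamps are in seconds; the 60s window
# size is illustrative.
from collections import defaultdict


def tumbling_window_counts(events, window_secs=60):
    """Group (timestamp, key) events into fixed windows, count per key."""
    counts = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_secs)   # align to window boundary
        counts[(window_start, key)] += 1
    return dict(counts)


counts = tumbling_window_counts([(5, "a"), (59, "a"), (61, "b"), (130, "a")])
```

A real Dataflow job adds what this sketch omits: watermarks, late-data handling, and exactly-once sinks.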

Performance Optimization

Optimize query performance within BigQuery, employing partitioning and clustering strategies. Monitor and enhance pipeline performance through the use of Cloud Monitoring and Cloud Logging tools.
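Why partitioning cuts cost can be shown with a small sketch: on a date-partitioned table, a filter on the partition column lets the engine scan only the matching partitions rather than the whole table. The dates and daily granularity below are illustrative:

```python
# Partition-pruning sketch: with a date-partitioned table, a filter on
# the partition column restricts the scan to matching partitions -- the
# reason partitioned BigQuery tables are cheaper to query. Dates and
# daily granularity are illustrative.
from datetime import date, timedelta


def partitions_scanned(table_start, table_end, filter_start, filter_end):
    """Return the daily partitions a date-range filter would touch."""
    lo = max(table_start, filter_start)
    hi = min(table_end, filter_end)
    days = (hi - lo).days
    return [lo + timedelta(days=i) for i in range(days + 1)] if days >= 0 else []


# A full year of daily partitions, but the filter touches only 3 days.
scanned = partitions_scanned(
    date(2024, 1, 1), date(2024, 12, 31),
    date(2024, 6, 1), date(2024, 6, 3),
)
```

Clustering complements this: within each scanned partition, rows sorted by the clustering columns let BigQuery skip blocks as well.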

SAP Integration (Preferred)

Design and implement data integration solutions for SAP systems, including SAP ECC, S/4HANA, and BW/4HANA. Develop connectors and pipelines to extract data from SAP modules for advanced analytics and reporting.

Experience

We require 3-4 years of hands-on experience as a Data Engineer, with a strong emphasis on Google Cloud Platform. Proven experience in building and maintaining production-grade data pipelines and infrastructure on GCP is essential.

Qualifications

A Bachelor's or Master's degree in Statistics or Applied Statistics is required.

Primary Skill Requirements

Google Cloud Platform Expertise

  • Advanced proficiency in BigQuery, including SQL, DML/DDL, and optimization techniques.
  • Experience with Cloud Dataflow for both batch and streaming data processing.
  • Hands-on experience with Cloud Composer/Apache Airflow for workflow orchestration.
  • Implementation knowledge of Cloud Storage, Cloud SQL, Cloud Spanner, and Bigtable.
  • Experience with Cloud Pub/Sub for building event-driven architectures.
  • Familiarity with Cloud Functions and Cloud Run for serverless computing.
  • Experience with Dataproc for managed Spark/Hadoop workloads.

Programming & Tools

  • Strong programming skills in Python, Java, or Scala.
  • Proficiency in both SQL and NoSQL databases.
  • Experience with the Apache Beam SDK for data processing.
  • Experience with Infrastructure as Code tools like Terraform or Cloud Deployment Manager.
  • Proficiency in version control using Git and experience with CI/CD pipelines.

Data Engineering Concepts

  • Deep understanding of ETL/ELT design patterns and best practices.
  • Experience with data modeling techniques (dimensional, normalized, denormalized).
  • Knowledge of data warehousing and data lake architectures.
  • Familiarity with stream processing and real-time analytics concepts.
  • Expertise in data partitioning, sharding, and optimization strategies.
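Of the modeling techniques above, the dimensional (star-schema) pattern can be sketched in a few lines: a fact table keyed to dimension tables, denormalized at query time. Tables and fields here are illustrative only:

```python
# Dimensional-modeling sketch: a tiny star schema -- a fact table keyed
# to dimension tables -- denormalized with a dict join, as a reporting
# query would do. Tables and fields are illustrative.
dim_vehicle = {1: {"model": "XUV700"}, 2: {"model": "Thar"}}
dim_date = {20240601: {"month": "2024-06"}}

fact_sales = [
    {"vehicle_key": 1, "date_key": 20240601, "units": 3},
    {"vehicle_key": 2, "date_key": 20240601, "units": 1},
]


def denormalize(facts):
    """Join each fact row to its dimensions (what a reporting query does)."""
    return [
        {**f, **dim_vehicle[f["vehicle_key"]], **dim_date[f["date_key"]]}
        for f in facts
    ]


rows = denormalize(fact_sales)
```

The normalized and denormalized variants trade the same content off differently: normalized schemas minimize duplication for writes; denormalized ones minimize joins for reads.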

Security & Governance

  • Knowledge of GCP IAM, VPC, and security best practices.
  • Experience implementing data encryption and privacy measures.
  • Understanding of compliance frameworks such as GDPR and HIPAA.

Secondary Skill Requirements

SAP Knowledge (Preferred)

  • Understanding of SAP architecture and data models.
  • Experience with SAP HANA, BW/4HANA, or S/4HANA.
  • Experience with SAP data extraction methods like ODP, BAPI, or RFC.
  • Knowledge of SAP integration tools and connectors.

Additional Nice-to-Have Skills

  • Experience with other cloud platforms (AWS, Azure).
  • Knowledge of containerization technologies (Docker, Kubernetes/GKE).
  • Understanding of Machine Learning/AI pipelines on GCP (Vertex AI, ML Engine).
  • Experience with data visualization tools (Looker, Tableau, Data Studio).

Behavioral Competencies

  • Strong problem-solving and analytical thinking skills.
  • Excellent communication abilities, effective with both technical and non-technical stakeholders.
  • Ability to collaborate effectively within cross-functional teams.
  • Proactive approach to identifying and resolving data challenges.
  • A continuous learning mindset to adapt to evolving cloud technologies.
  • Meticulous attention to detail and a strong commitment to data quality.
  • Capacity to manage multiple projects and prioritize tasks effectively.
