Senior Data Engineer
Job Title: Senior Data Engineer
Location: Pune, Maharashtra
Job Type: Full-Time
TVARIT is seeking a highly motivated Senior Data Engineer to join our Pune team. This role is crucial for developing and optimizing scalable ETL pipelines for manufacturing analytics, focusing on high-frequency industrial data for real-time and batch processing. The ideal candidate will possess strong expertise in Azure Databricks, PySpark, and distributed computing, coupled with a positive attitude and excellent English communication skills.
Key Responsibilities:
- Develop scalable real-time and batch processing workflows using Azure Databricks, PySpark, and Apache Spark.
- Execute comprehensive data pre-processing, including cleaning, transformation, deduplication, normalization, encoding, and scaling.
- Design and manage cloud-based data architectures (data lakes, lakehouses, warehouses) adhering to Medallion Architecture principles.
- Deploy and optimize data solutions on cloud platforms (Azure preferred; AWS or GCP also relevant), prioritizing performance, security, and scalability.
- Create and refine ETL/ELT pipelines for diverse data sources such as IoT, MES, SCADA, LIMS, and ERP systems.
- Automate data workflows with CI/CD and DevOps best practices, ensuring compliance and security.
- Monitor, troubleshoot, and improve data pipelines for high availability and reliability.
- Utilize Docker and Kubernetes for scalable data processing.
- Collaborate with automation teams, data scientists, and engineers to deliver structured data for AI/ML models.
Desired Skills and Qualifications:
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- Minimum 7 years of core data engineering experience, with significant cloud platform exposure (Azure, AWS, or GCP).
- Advanced proficiency in PySpark, Azure Databricks, Python, and Apache Spark.
- At least 2 years of team leadership experience.
- Expertise in relational databases (SQL Server, PostgreSQL), time-series databases (InfluxDB), and NoSQL databases (MongoDB, Cassandra).
- Experience with containerization technologies like Docker and Kubernetes.
- Strong analytical and problem-solving abilities with meticulous attention to detail.
- Experience with MLOps and DevOps, including model lifecycle management, is a plus.
- Excellent communication and collaboration skills, with a proven ability to function effectively in a team environment.
- Adaptability and eagerness to thrive in a dynamic, fast-paced startup setting.
Company
TVARIT
TVARIT GmbH is a leading innovator in artificial intelligence (AI) solutions for the metal industry, offering cutting-edge software for sectors including steel, aluminum, copper, and cast iron. Our pr...