
Sandisk • 2h ago
Foundit
Data Engineer for Azure cloud Platf...
Bengaluru / Bangalore, India
Full Time
Mid Level
Full Job Description
We are seeking a results-oriented Data Engineer with at least 2 years of experience in developing data pipelines within cloud environments. This role focuses on designing, building, and optimizing Azure-based data ingestion and transformation pipelines using PySpark and Spark SQL. You will collaborate with cross-functional teams to deliver robust, scalable, and high-quality data solutions.
Key Responsibilities:
- Design, develop, and maintain high-performance ETL/ELT pipelines leveraging PySpark and Spark SQL.
- Construct and orchestrate data workflows within the Azure ecosystem.
- Implement hybrid data integration strategies, connecting on-premises databases to Azure Databricks using tools such as Azure Data Factory (ADF) and HVR/Fivetran over secure network configurations.
- Optimize Spark jobs for enhanced performance, scalability, and cost-effectiveness.
- Establish and maintain best practices for data quality, governance, and documentation.
- Partner with data analysts, data scientists, and business stakeholders to define and refine data requirements.
- Support CI/CD processes, automation tools, and version control systems such as Git.
- Conduct root cause analysis for data pipeline issues and ensure overall system reliability.
Required Qualifications:
- Bachelor's degree in Computer Science, Engineering, or a related field.
- Minimum of 2 years of direct experience in data engineering.
- Proficiency in PySpark, Spark SQL, and distributed data processing concepts.
- Solid understanding of Azure cloud services, including Azure Data Factory (ADF), Azure Databricks, and Azure Data Lake Storage (ADLS).
- Experience with SQL, data modeling, and performance optimization techniques.
- Familiarity with Git, CI/CD principles, and Agile development methodologies.
Preferred Qualifications:
- Experience with orchestration tools such as Apache Airflow or Azure Data Factory pipelines.
- Knowledge of real-time streaming and change data capture technologies such as Kafka, Azure Event Hubs, or HVR.
- Exposure to APIs, various data integration patterns, and cloud-native architectural principles.
- Familiarity with enterprise data management ecosystems.
Company
Sandisk
Sandisk is a leader in understanding data consumption for individuals and businesses, driving innovation to meet current needs and shape future advancements. With a legacy of breakthroughs in Flash an...
Posted on Foundit