Data Engineer
Experience Level: Senior Level
We are looking for a skilled Data Engineer with over 5 years of hands-on experience to join our team in Bengaluru, Karnataka, India. This role requires a strong foundation in Spark and Spark SQL, exceptional SQL proficiency, and proven experience in batch processing. You will be instrumental in building and optimizing robust data pipelines that support critical analytics and business objectives, ensuring both data reliability and high performance. Collaboration with cross-functional teams will be key to your success.
Must-Have Skills:
- Apache Spark (minimum 5 years of working experience), with a particular focus on Spark SQL and batch processing.
- Very strong SQL expertise, capable of writing efficient, optimized, and complex queries for large datasets.
- Solid experience with batch data processing, including scheduling, dependency management, failure recovery, and performance tuning.
- Demonstrated experience delivering data engineering projects end to end, from design through production support.
Key Responsibilities:
- Design, build, and maintain scalable data pipelines for ingestion, transformation, and loading (ETL/ELT).
- Optimize data pipeline performance and ensure data quality and reliability.
- Collaborate with data scientists, analysts, and other engineers to understand data needs and deliver solutions.
- Participate actively in code reviews, design discussions, and agile development processes.
- Troubleshoot and resolve issues related to data pipelines and data quality.
Required Skills & Experience:
- A minimum of 5 years of hands-on experience with Apache Spark, including Spark SQL and batch processing.
- Deep expertise in SQL, with the ability to craft efficient and complex queries for large-scale data.
- A solid understanding of data pipeline fundamentals, covering ingestion, transformation, and loading processes.
- Demonstrable experience with batch data processing, encompassing scheduling, dependency handling, failure recovery, and performance optimization.
- Familiarity with data warehousing concepts and relational database systems.
- Experience working with various data formats, including structured and semi-structured data such as CSV, Parquet, and JSON.
- Strong analytical, problem-solving, and debugging skills, with a focus on pipeline performance and data integrity.
Nice-to-Have Skills:
- Python proficiency is a plus, but not mandatory.
Company
Flexton Inc.
Founded in 2007 and headquartered in San Jose, California, Flexton Inc. is a dynamic professional services company with established development centers in India. Flexton has been recognized seven times by INC 5000 magazine.