
Meesho
Naukri
Software Development Engineer III D...
Bengaluru
Full Time
Senior Level
Full Job Description
About the Role
We are seeking a talented Software Development Engineer III specializing in Data to join our team at Meesho in Bengaluru. You will play a crucial role in designing, developing, and optimizing our data platforms and pipelines. This role involves working with high-throughput, low-latency systems and contributing to the evolution of our data infrastructure to support various product, ML, and analytics initiatives.
What You'll Do
- Design and implement scalable and fault-tolerant data pipelines, including batch and streaming, utilizing frameworks like Apache Spark, Flink, and Kafka.
- Lead the design and development of robust data platforms and reusable frameworks that cater to multiple teams and diverse use cases.
- Build and optimize data models and schemas to efficiently support large-scale operational and analytical workloads.
- Gain a deep understanding of Apache Spark internals and contribute to modifying or extending the open-source Spark codebase when necessary.
- Develop sophisticated streaming solutions using tools such as Apache Flink and Spark Structured Streaming.
- Drive initiatives to abstract infrastructure complexity, empowering ML, analytics, and product teams to accelerate their development on the platform.
- Champion a platform-building mindset focused on reusability, extensibility, and enabling developer self-service.
- Ensure high standards of data quality, consistency, and governance through the implementation of validation frameworks, observability tooling, and access controls.
- Optimize infrastructure for cost, latency, performance, and scalability within modern cloud-native environments.
- Mentor and guide junior engineers, actively participate in architecture reviews, and uphold rigorous engineering standards.
- Collaborate cross-functionally with product, ML, and data teams to ensure technical solutions are aligned with business objectives.
What We're Looking For
- 5-8 years of professional experience in software or data engineering, with a significant focus on distributed data systems.
- Strong programming proficiency in Java, Scala, or Python, coupled with expertise in SQL.
- A minimum of 2 years of hands-on experience with big data systems including Apache Kafka, Apache Spark/EMR/Dataproc, Hive, Delta Lake, Presto/Trino, Airflow, and data lineage tools (e.g., DataHub, Marquez, OpenLineage).
- Proven experience in implementing and tuning Spark/Delta Lake/Presto at terabyte-scale or beyond.
- A strong grasp of Apache Spark internals (Catalyst, Tungsten, shuffle, etc.), with practical experience in customizing or contributing to open-source code.
- Familiarity and practical experience with modern open-source and cloud-native data stack components such as:
- Apache Iceberg, Hudi, or Delta Lake
- Trino/Presto, DuckDB, ClickHouse, Pinot, or Druid
- Airflow, Dagster, or Prefect
- dbt, Great Expectations, DataHub, or OpenMetadata
- Kubernetes, Terraform, Docker
- Excellent analytical and problem-solving skills, with a demonstrated ability to debug complex issues in large-scale systems.
- Exposure to data security, privacy, observability, and compliance frameworks is considered a plus.
Good to Have
- Contributions to open-source projects within the big data ecosystem (e.g., Spark, Kafka, Hive, Airflow).
- Hands-on data modeling experience and exposure to end-to-end data pipeline development.
- Familiarity with OLAP data cubes and BI/reporting tools such as Tableau, Power BI, Superset, or Looker.
- Working knowledge of tools and technologies like ELK Stack (Elasticsearch, Logstash, Kibana), Redis, and MySQL.
- Exposure to backend technologies including RxJava, Spring Boot, and Microservices architecture.