Lead, Data Engineer
Join Kroll's expanding Data practice, a dynamic hub for artificial intelligence, machine learning, and analytics. As a Lead Data Engineer in India, you will be instrumental in shaping our data landscape. You'll design, build, and integrate data from diverse sources, collaborating with an advanced engineering team and professionals from leading global financial institutions, law enforcement, and government agencies.
At Kroll, your work will directly contribute to protecting, restoring, and maximizing value for our clients. This is an opportunity to significantly advance your career.
Responsibilities
- Design and build robust organizational data infrastructure and architecture.
- Identify, design, and implement process improvements, including infrastructure redesign for scalability, data delivery optimization, and automation of manual processes.
- Select optimal tools, services, and resources for building resilient data pipelines for ingestion, connection, transformation, and distribution.
- Design, develop, and manage ELT applications.
- Collaborate with global teams to deliver fault-tolerant, high-quality data pipelines.
Requirements
- Advanced experience in writing ETL/ELT jobs.
- Advanced experience with Azure, AWS, and the Databricks platform, with a focus on data-related services.
- Advanced experience with Python, Spark ecosystem (PySpark + Spark SQL), and SQL databases.
- Ability to develop REST APIs, Python SDKs or libraries, and Spark jobs.
- Proficiency with open-source tools and frameworks such as FastAPI, Pydantic, Polars, Pandas, PySpark, Delta Lake tables, Docker, and Kubernetes.
- Experience with Lakehouse and Medallion architectures, data governance, and data pipeline orchestration.
- Excellent communication skills.
- Ability to conduct data profiling, cataloging, and mapping for technical data flows.
- Ability to work effectively with international teams.
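For candidates less familiar with the Medallion pattern named above: the idea is to refine data through bronze (raw), silver (validated), and gold (reporting-ready) layers. A minimal plain-Python sketch of that flow follows; the record fields and helper names are hypothetical, and a production pipeline at this level would use PySpark with Delta Lake tables rather than in-memory lists.

```python
from dataclasses import dataclass

# Bronze layer: raw records as ingested -- untyped and possibly dirty.
bronze = [
    {"client": "acme", "amount": "120.50", "currency": "USD"},
    {"client": "acme", "amount": "bad-value", "currency": "USD"},  # fails parsing
    {"client": "globex", "amount": "75.00", "currency": "USD"},
]

@dataclass
class Transaction:
    client: str
    amount: float
    currency: str

def to_silver(raw: list[dict]) -> list[Transaction]:
    """Validate and type-cast bronze records, dropping ones that fail."""
    silver = []
    for rec in raw:
        try:
            silver.append(Transaction(rec["client"], float(rec["amount"]), rec["currency"]))
        except (KeyError, ValueError):
            continue  # a real pipeline would route these to a quarantine table
    return silver

def to_gold(silver: list[Transaction]) -> dict:
    """Aggregate silver records into a reporting-ready total per client."""
    totals: dict = {}
    for tx in silver:
        totals[tx.client] = totals.get(tx.client, 0.0) + tx.amount
    return totals

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'acme': 120.5, 'globex': 75.0}
```

The same bronze-to-gold progression maps directly onto Spark DataFrames, where each layer would be persisted as its own Delta table.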
Desired Skills
- Strong understanding of cloud architecture principles: compute, storage, networks, security, cost optimization.
- Advanced SQL and Spark query/data pipeline performance tuning skills.
- Experience building Lakehouse solutions using technologies like Azure Databricks, Azure Data Lake, SQL, and PySpark.
- Familiarity with programming paradigms such as object-oriented programming (OOP), asynchronous programming, and batch processing.
- Knowledge of CI/CD and Git.
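As an illustration of the asynchronous and batch paradigms listed above, here is a minimal `asyncio` sketch that fetches a batch of sources concurrently rather than one at a time. The source names and the `fetch_record` stub are hypothetical placeholders for real I/O-bound calls.

```python
import asyncio

async def fetch_record(source: str) -> dict:
    """Simulate an I/O-bound fetch from one upstream source."""
    await asyncio.sleep(0)  # stand-in for a network or database call
    return {"source": source, "rows": len(source)}

async def fetch_batch(sources: list[str]) -> list[dict]:
    """Fetch all sources concurrently; gather preserves input order."""
    return await asyncio.gather(*(fetch_record(s) for s in sources))

results = asyncio.run(fetch_batch(["crm", "ledger", "sanctions"]))
print(results)
```

Because the work is I/O-bound, the concurrent version scales with the slowest single source rather than the sum of all of them.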
Company
Kroll
Kroll stands as the premier independent provider of risk and financial advisory solutions. We utilize our distinct insights, data, and technology to empower clients in navigating complex challenges. W...