
Data Platform Engineer
Experience Level: Senior Level
Full Job Description
Accenture is seeking a Data Platform Engineer in Hyderabad to contribute to data platform blueprinting and design across all relevant platform components. The role involves close collaboration with Integration Architects and Data Architects to ensure seamless integration between systems and data models. You will also participate in strategic discussions to refine and enhance the overall data architecture.
Your responsibilities will include designing and building complex pipelines using technologies such as Delta Lake, Auto Loader, and Delta Live Tables (DLT), with deployment managed through Databricks Asset Bundles. You will draw on proven experience as a Data Architect and Data Engineer leading enterprise-scale Lakehouse initiatives. An expert-level understanding of modern Data & Analytics architecture patterns, including Data Mesh, Data Products, and Lakehouse Architecture, is essential. Proficiency in Python for programming and debugging, along with strong PySpark experience building scalable ETL/ELT pipelines, is required. You will architect data ingestion and transformation using DLT Expectations, modular Databricks Functions, and reusable pipeline components.
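For illustration, here is a minimal Delta Live Tables sketch of such a pipeline in Python, combining Auto Loader ingestion with a DLT Expectation. The landing path, table names, and column names are hypothetical; `dlt` and `spark` are provided by the Databricks pipeline runtime.

```python
import dlt

# Bronze layer: incrementally ingest raw JSON files with Auto Loader
# (the "cloudFiles" streaming source). The landing path is hypothetical.
@dlt.table(comment="Raw events ingested with Auto Loader")
def raw_events():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/Volumes/main/landing/events")
    )

# Silver layer: enforce a data-quality contract with a DLT Expectation;
# rows failing the check are dropped and surfaced in pipeline metrics.
@dlt.table(comment="Validated events")
@dlt.expect_or_drop("valid_event_id", "event_id IS NOT NULL")
def clean_events():
    return dlt.read_stream("raw_events").select("event_id", "event_type", "event_time")
```

A pipeline like this would typically be versioned and deployed as a Databricks Asset Bundle alongside its job and compute configuration.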
Hands-on expertise in at least one major cloud platform (AWS, GCP, or Azure) is mandatory. You will lead the implementation of Unity Catalog, including creating catalogs, schemas, role-based access policies, lineage visibility, and data classification tagging (PII, PHI, etc.). Guiding organization-wide governance through Unity Catalog setup, covering workspace linkage, SSO, audit logging, external locations, and Volume access, will be a key part of your role. You are also expected to enable cross-platform data access using Lakehouse Federation to query live data in externally hosted databases, and to leverage the Databricks Marketplace both for consuming high-quality third-party data and for securely publishing internal data assets.
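As a rough sketch of the Unity Catalog work described above, the following notebook snippet creates a catalog and schema, grants role-based access, and tags a column as PII. All object and group names are hypothetical, and the statements assume a Unity Catalog-enabled workspace with sufficient privileges.

```python
# Hypothetical catalog/schema bootstrap, run from a Databricks notebook.
spark.sql("CREATE CATALOG IF NOT EXISTS analytics")
spark.sql("CREATE SCHEMA IF NOT EXISTS analytics.bronze")

# Role-based access: engineers get catalog usage, analysts get read access.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data_engineers`")
spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA analytics.bronze TO `analysts`")

# Data classification tagging: mark a column as PII so governance tooling,
# lineage views, and masking policies can key off the tag.
spark.sql("""
    ALTER TABLE analytics.bronze.customers
    ALTER COLUMN email SET TAGS ('classification' = 'PII')
""")
```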
Experience with cloud-based services relevant to data engineering, data storage, data processing, data warehousing, real-time streaming, and serverless computing is crucial. You will govern and manage Delta Sharing for secure data sharing with external partners or across tenants. You will design and maintain PII anonymization, tokenization, and masking strategies using Databricks functions and Unity Catalog policies to comply with GDPR and HIPAA. You will also architect Power BI, Tableau, and Looker integrations with Databricks for live reporting and visualization over governed datasets, and build Databricks SQL dashboards that give stakeholders real-time insights, KPI tracking, and alerts.
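One common way to implement such masking is a Unity Catalog column mask: a SQL function that returns the raw value only to authorized groups. The following is a minimal sketch; the table, function, and group names are hypothetical.

```python
# Hypothetical column mask: members of `pii_readers` see the raw email,
# everyone else sees a redacted placeholder.
spark.sql("""
    CREATE OR REPLACE FUNCTION analytics.bronze.mask_email(email STRING)
    RETURN CASE
        WHEN is_account_group_member('pii_readers') THEN email
        ELSE 'REDACTED'
    END
""")

# Attach the mask so it is enforced on every query against the column.
spark.sql("""
    ALTER TABLE analytics.bronze.customers
    ALTER COLUMN email SET MASK analytics.bronze.mask_email
""")
```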
You will apply performance optimization techniques and lead cross-functional initiatives across data science, analytics, and platform teams to deliver secure, scalable, and value-aligned data products. Providing thought leadership on adopting advanced features such as Mosaic AI, Vector Search, Model Serving, and Databricks Marketplace publishing is encouraged. A strong background in data modeling and data warehousing concepts is required.
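By way of example, typical Delta Lake optimization steps include file compaction, data clustering, and stale-file cleanup; the table name below is hypothetical.

```python
# Compact small files and co-locate data on a frequently filtered column.
spark.sql("OPTIMIZE analytics.gold.daily_kpis ZORDER BY (event_date)")

# Enable optimized writes and auto-compaction for future writes.
spark.sql("""
    ALTER TABLE analytics.gold.daily_kpis SET TBLPROPERTIES (
        'delta.autoOptimize.optimizeWrite' = 'true',
        'delta.autoOptimize.autoCompact' = 'true'
    )
""")

# Remove data files no longer referenced by the table (default retention applies).
spark.sql("VACUUM analytics.gold.daily_kpis")
```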
Good-to-have skills include certifications such as Databricks Certified Professional; knowledge of machine learning concepts and popular ML libraries; big data processing (Spark, Hadoop, Hive, Kafka); data orchestration with Apache Airflow; CI/CD pipelines and DevOps practices in a cloud environment; experience with ETL tools (Informatica, Talend, Matillion, Fivetran); and familiarity with DBT (Data Build Tool).
Educational Qualification: 15 years of full-time education is required.