Architect
Data Architect
Quantiphi is seeking a highly skilled Data Architect with 8-13 years of experience for its Mumbai, Bangalore, or Trivandrum locations. As the senior technical owner of the platform's design, you will define and evolve the architectural blueprint, including the canonical data model, ingestion framework, transformation patterns, FHIR serialization layer, bidirectional FHIR flow, and governance and observability frameworks. You will set the standards for all other roles and work closely with the customer's Health Data Engine technical owners to anchor architectural decisions in real volume requirements and operational pain points.
Role Overview
You will own the end-to-end architecture across ingestion (Flink + PySpark), storage (Iceberg on GCS with BigLake Metastore), transformation (dbt over Starburst), FHIR serialization (flat FHIR Iceberg → bundles → FHIR-Repository), and consumption (Starburst, FHIR API, data products). You will lead the design and evolution of the Common Data Model (CDM), define the bidirectional FHIR flow with origin-tag-based loop prevention, and specify the patient identity resolution architecture using Informatica MDM. Key responsibilities include setting standards for naming conventions, hash algorithms, Iceberg table properties, DLQ taxonomy, observability, and security. You will also drive high-volume capacity design, evaluate and recommend technology choices, provide architectural review, mentor data engineers, lead architectural review meetings with customer stakeholders, and establish the disaster recovery, backup, and reprocessing strategy.
Key Responsibilities
- Own the end-to-end architecture across ingestion, storage, transformation, FHIR serialization, and consumption.
- Lead the design and evolution of the Common Data Model (CDM).
- Define the bidirectional FHIR flow with origin-tag-based loop prevention.
- Define the patient identity resolution architecture using Informatica MDM.
- Set the standards for naming conventions, hash algorithms, Iceberg table properties, DLQ taxonomy, observability, and security.
- Own the spec-driven development framework's program-level and component-level specs.
- Drive the high-volume capacity design, including ingest targets, fact table volumes, and concurrent workloads.
- Evaluate and recommend technology choices, providing trade-off analyses.
- Provide architectural review for engineering work and mentor data engineers.
- Lead architectural review meetings with customer technical stakeholders.
- Establish and evolve the disaster recovery, backup, and reprocessing strategy.
Required Skills and Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field.
- 8+ years of data engineering experience, including at least 3 years in a Data Architect or Lead Data Engineer role.
- Deep expertise architecting and operating large-scale data platforms on Google Cloud Platform (GCP), including Cloud Storage, Dataproc (Flink, Spark), Cloud Composer, Secret Manager, Workload Identity, IAM, and networking.
- Hands-on experience with Apache Iceberg in production, including table properties, partitioning, snapshot semantics, schema evolution, write modes, compaction, and metastore integration.
- Production experience with Starburst (Galaxy or Enterprise) or Trino for analytical workloads.
- Strong experience with streaming architectures using Apache Flink and Apache Kafka.
- Expert SQL and strong programming skills in Python; familiarity with PySpark and PyFlink.
- Deep experience with dbt, including model materialization, macros, tests, and project structure.
- Comprehensive understanding of healthcare data standards: HL7 v2, C-CDA, and FHIR R4 with US Core 6.1 profiles.
- Hands-on experience with FHIR runtime platforms like FHIR-Repository or HAPI FHIR.
- Experience with master data management for patient identity resolution, preferably Informatica MDM.
- Experience designing for HIPAA-regulated environments, including PHI handling, encryption, and audit logging.
- Demonstrated ability to lead complex technical initiatives and influence stakeholders.
Nice-to-Have Skills
- GCP Professional Data Engineer or Cloud Architect certification.
- Experience with Atlan or comparable governance platforms.
- Experience with multi-region active-active or active-passive deployments on GCP.
- Production experience operating Confluent Cloud at significant scale.
- Familiarity with spec-driven development workflows.
- Experience replacing or sunsetting legacy healthcare data platforms.
Company
Quantiphi
Quantiphi is a global company that values its diverse and inclusive culture as much as its technological expertise. We foster an environment built on transparency, integrity, and a commitment to conti...