
Data Engineer
Experience Level: Mid Level
Full Job Description
Guardant Health is seeking a dynamic and proficient Cloud Data Engineer to join our Data Team and contribute to the Guardant Data Platform.
Duties and Responsibilities:
- Rapidly learn and adapt to evolving technologies within the Data Team's stack, embracing new challenges.
- Incorporate usability, scalability, deployment, integration, maintenance, and automation considerations when integrating new technology stacks.
- Demonstrate strong programming proficiency in at least one language (Python, Scala, Java) and the ability to acquire new language skills.
- Develop and maintain robust ETL pipelines and data-driven systems using technologies such as Apache Spark, AWS Glue, Athena, Redshift, and AWS Batch.
- Apply expert-level SQL skills to write complex queries.
- Manage code effectively on GitHub, with a thorough understanding of advanced Git workflows and operations such as git-flow, rebasing, and squashing.
- Implement infrastructure as code using Terraform and leverage a wide array of AWS Analytics and Data Services including Glue, S3, Lambda, AWS Batch, Athena, Redshift, DynamoDB, CloudWatch, Kinesis, SQS, SNS, and DMS.
- Build deployment pipelines with Jenkins.
- Participate in requirements gathering and estimate the effort needed to integrate new technology stacks.
- Design and architect solutions for Machine Learning, Data Governance, Deployment/Integration Automations, and Data Analytics.
- Explore and learn additional AWS services such as ECS, ECR, and EC2, alongside Data Modeling techniques.
Qualifications:
- A minimum of 5 years of experience in software development, with at least 2 years focused on building scalable and stable data pipelines within the AWS tech stack.
- Proven experience in constructing Data Pipelines in the AWS Cloud, demonstrated through professional experience or personal projects.
- Strong programming skills and SQL proficiency are essential.
- Familiarity with components of the AWS analytics ecosystem and related open-source tools, including Apache Airflow, Apache Spark, S3, Glue, Kafka, AWS Athena, Lambda, Redshift, Lake Formation, AWS Batch, ECS on Fargate, Kinesis, Flink, DynamoDB, and SageMaker.
- Experience with Infrastructure as Code (IaC) for deploying Data Pipelines in AWS, such as Terraform or CloudFormation.
- Experience with Docker, Kubernetes, ECR, EC2, VPC, SNS, SQS, and CloudWatch is highly valued.
- Experience in building Jenkins deployment pipelines is highly valued.
- Proficiency in using collaboration tools like JIRA, Confluence, and GitHub is beneficial.
- Exposure to NoSQL databases is considered an advantage.
Additional Information:
Job Location: Hyderabad, Telangana, India (Hybrid model - Work from Office).
Why Join Us?
At Guardant Health, we are on a mission to conquer cancer with data. You’ll be part of a team that is revolutionizing precision oncology through cutting-edge technology and innovative software solutions. If you are a technically strong, people-oriented leader who thrives in a collaborative and high-impact environment, we’d love to hear from you! Visit our career page: http://www.guardanthealth.com/jobs/
Company
Guardant Health
Guardant Health is a premier precision oncology company dedicated to conquering cancer globally. We achieve this through our proprietary tests, extensive data sets, and advanced analytics. Our oncolog...