Soprasteria
Posted 4h ago
Career Pages

PySpark Module Lead

Noida, Uttar Pradesh, India
Full Time
Mid Level

Full Job Description

Job Opportunity: PySpark Module Lead in Noida, Uttar Pradesh

Sopra Steria is seeking a skilled and motivated PySpark Module Lead to join our dynamic team in Noida, Uttar Pradesh. This role involves close collaboration with Data Scientists to develop and deploy machine learning models. Proficiency in PySpark and related technologies is essential for building and maintaining robust pipelines for training and inference datasets.

Responsibilities

  • Collaborate with Data Scientists to design, develop, and implement machine learning pipelines.
  • Utilize PySpark for data processing, transformation, and preparation of datasets for model training.
  • Leverage AWS EMR and S3 for scalable and efficient data storage and processing.
  • Implement and manage ETL workflows using StreamSets for data ingestion and transformation.
  • Design and construct pipelines to deliver high-quality training and inference datasets.
  • Partner with cross-functional teams to ensure seamless deployment and real-time/near real-time inferencing capabilities.
  • Optimize and fine-tune pipelines for performance, scalability, and reliability.
  • Ensure appropriate configuration of IAM policies and permissions for secure data access and management.
  • Implement and optimize Spark architecture and Spark jobs for scalable data processing.

Qualifications and Requirements

This position requires a professional degree and 4 to 6 years of total experience.

Mandatory Skills:

  • Proficiency in advanced SQL (window functions), Spark architecture, PySpark or Scala with Spark, and Hadoop.
  • Proven expertise in designing and deploying data pipelines.
  • Strong problem-solving skills and the ability to work effectively in a collaborative team environment.
  • Excellent communication skills, with the ability to translate technical concepts to non-technical stakeholders.

Desirable Skills:

  • Hands-on experience with Airflow, S3, and StreamSets or similar ETL tools (training available).
  • Understanding of real-time or near real-time inferencing architectures.
  • Basic knowledge of Kafka, AWS IAM, AWS EMR, and Snowflake.

Sopra Steria is an equal opportunity employer committed to diversity and inclusion, and we welcome applications from individuals with disabilities.

Company

Soprasteria

Sopra Steria is a leading European technology company with 56,000 employees operating in approximately 30 countries. They specialize in consulting, digital services, and software development, assistin...
