
Senior Data Extraction Engineer
Experience Level: Mid Level
Full Job Description
D. E. Shaw India Private Limited is seeking resourceful and exceptional candidates for the Data Engineer role within its product development teams, based out of Hyderabad, Bengaluru, and Gurugram. As a Data Engineer at DESIS, you will develop web robots (web spiders) that crawl the web and retrieve structured and unstructured data in various formats, including HTML, plain text, PDF, and Excel. Your responsibilities will extend to converting scraped website data into a structured format and building automated, custom reports for business knowledge. The team also focuses on automating end-to-end data pipelines.
What You'll Do Day-to-Day:
As a member of the Data Engineering team, you will be responsible for several key aspects of data extraction:
- Understanding the data requirements of business groups.
- Reverse-engineering website technologies and data retrieval processes, then re-engineering these processes by developing web robots for automated data extraction.
- Building monitoring systems to ensure the integrity and quality of extracted data.
- Managing changes to website dynamics and layout to ensure clean downloads.
- Building scraping and parsing systems to transform raw data into structured forms.
- Providing operations support for high availability and zero data loss.
- Storing extracted data in recommended databases.
- Building high-performing, scalable data extraction systems and automating data pipelines.
Who We're Looking For:
The ideal candidate will possess:
- 2 to 4 years of experience in website data extraction and scraping.
- Good knowledge of relational databases, writing complex SQL queries, and ETL operations.
- Proficiency in Python for data manipulation and operations.
- Expertise in Python libraries and frameworks such as Requests, urllib2, Selenium, Beautiful Soup, and Scrapy.
- A solid understanding of HTTP requests and responses, HTML, CSS, XML, JSON, and JavaScript.
- Proficiency with debugging tools in Chrome for reverse-engineering website dynamics.
- A strong academic background, including a BCA/MCA/BS/MS degree, with a good foundation in data structures and algorithms and the ability to apply that knowledge in practice.
- Excellent problem-solving, analytical, and debugging skills.
Interested candidates are encouraged to apply through the D. E. Shaw India website: https://www.deshawindia.com/recruit/jobs/Ads/Link/SrDataExtEnggDec25.
We encourage candidates with relevant experience who are looking to restart their careers after a break to apply for this position. Learn more about Recommence, our gender-neutral return-to-work initiative.
The Firm offers excellent benefits, a casual, collegial working environment, and an attractive compensation package. For more details on our recruitment process, please visit https://www.deshawindia.com/careers.
Company
D. E. Shaw India Private Limited
The D. E. Shaw group is a globally recognized investment and technology development firm, entrusted by investors worldwide to manage assets by achieving an optimal balance between risk and reward. Whi...