Language Engineer
Description
The Amazon Artificial General Intelligence (AGI) Data Services organization is responsible for developing diverse datasets to train and evaluate Amazon AI models. We seek Language Engineers to join our science and engineering team to support the development of complex, multimodal datasets using synthetic data generation, model-supported data generation, and human-in-the-loop data collections. You will drive innovation and advance the state-of-the-art in evaluating and training AI models, collaborating with product managers, engineers, and data scientists to ensure best-in-class AI systems.
Key Responsibilities
- Design complex human-participant data collections: author instructions, define quality targets, coordinate collection efforts, and ensure completion of final deliverables.
- Design and conduct complex data creation tasks using state-of-the-art synthetic and model-based generation methods.
- Analyze and extract insights from large datasets.
- Build tools or prototypes for data analysis or creation using Python or other scripting languages.
- Utilize modeling tools to bootstrap or test new AI functionalities.
- Collaborate with scientists, software engineers, and data creators to evaluate AI model performance.
About the Team
Amazon aims to be the world's most customer-centric company, enabling customers to research and purchase anything online or offline. The AGI organization provides AI capabilities for various Amazon products and searches, delivering secure, flexible, cost-effective, and high-quality data development services to build advanced ML models.
Basic Qualifications
- Master's or higher degree in Computational Linguistics or a related field involving computational analysis.
- 2+ years of experience in computational linguistics, language data processing, or AI data creation.
- Experience with language data annotation systems and data markup.
- Proficiency in scripting languages like Python.
- Experience working with speech, text, and multimodal data in multiple languages.
- Excellent communication, strong organizational skills, and attention to detail.
- Comfortable in a fast-paced, highly collaborative, and dynamic work environment.
Preferred Qualifications
- PhD in Computational Linguistics or a related field with a computational emphasis.
- Expertise in bootstrapping AI data collections for evolving requirements.
- Extensive experience with speech, text, and multimodal data in multiple languages.
- Experience in data creation for complex agentic workflows.
- Practical experience with Machine Learning.
- Familiarity with technical concepts such as APIs.
- Practical knowledge of version control and agile development.
- Familiarity with database queries and data analysis tools (e.g., SQL, R, MATLAB).
- Willingness to support multiple projects and adapt to reprioritization.
- Strong analytical and problem-solving skills with creative thinking abilities.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. For workplace accommodations during the application and hiring process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations.