Crossing Hurdles•3h ago
LinkedIn
Backend Engineer
India
Contract
Mid Level
$15.3/hr
Full Job Description
SwarmBench Task Engineer SWE - Remote Contract Role
Position Overview
- Type: Short-Term Contract (4 Weeks)
- Compensation: $15.3 per hour
- Location: Fully Remote
- Commitment: 20-40 hours/week, including a mandatory 4-hour overlap with the PST time zone
Your Mission
We need a skilled engineer to build multi-agent benchmark tasks for advanced AI coding agents. You will create real-world scenarios based on open-source code changes (bug fixes, migrations, refactors) and validate them using our Harbor evaluation framework within Docker environments.
Key Responsibilities
- Benchmark Creation: Design precise task instructions defining file paths, function signatures, expected behaviors, and constraints based on real open-source codebases (Django, Flask, FastAPI, Node.js).
- Verification & Testing: Develop Python-based verification scripts to ensure the correctness of agent-generated code changes using pytest or custom assertions.
- Decomposition Strategy: Split complex code modifications across multiple independent sub-agents so the work can be processed in parallel.
- Docker Expertise: Write Dockerfiles, build images, debug container issues, and ensure tasks run reproducibly within isolated environments.
- Quality Assurance: Evaluate performance signals to refine task quality, clarity, difficulty levels, and determinism across runs.
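As a rough illustration of the verification work described above, here is a minimal sketch of a Python check for an agent-generated change. It assumes a hypothetical task that asks the agent to add a `slugify(text, separator)` function to a module. The function name, its signature, and the test cases are illustrative assumptions only; this is not the Harbor framework's API or an actual benchmark task.

```python
import importlib.util
import inspect
from pathlib import Path


def load_module(path: Path, name: str = "agent_output"):
    """Import a Python file from an arbitrary path without editing sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def verify_slugify(path: Path) -> list[str]:
    """Return failure messages; an empty list means the change passed.

    `slugify` and its expected signature are hypothetical task details.
    """
    module = load_module(path)

    func = getattr(module, "slugify", None)
    if not callable(func):
        return ["slugify() is missing"]

    # Check the signature that the (hypothetical) task instructions required.
    params = list(inspect.signature(func).parameters)
    if params != ["text", "separator"]:
        return [f"unexpected parameters: {params}"]

    # Fixed inputs keep the check deterministic across runs.
    cases = {
        ("Hello World", "-"): "hello-world",
        ("  A  B ", "_"): "a_b",
    }
    failures = []
    for args, expected in cases.items():
        actual = func(*args)
        if actual != expected:
            failures.append(f"slugify{args} returned {actual!r}, expected {expected!r}")
    return failures
```

A script like this could run inside the task's Docker container after the agent finishes, with a non-empty failure list mapped to a failing exit code. The same checks could also be written as pytest test functions.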
Requirements
- Strong professional experience in Python and JavaScript.
- Familiarity with AI coding benchmarks (e.g., SWE-bench, Terminal-Bench).
- Expertise navigating large open-source repositories.
- Mastery of Git workflows: pull requests, diffs, cherry-picking specific commits.
- Docker proficiency: creating environments and resolving containerization challenges.
- Ability to write unambiguous technical specifications independently in a remote setting.
Application Process
To apply via Easy Apply:
- Submit your application directly through the platform.
- Receive role details and access to our Google Form assessment.
- Shortlisted candidates will complete an ICF (Individual Capability Framework) & Technical Assessment.
Final selection leads to immediate onboarding for this impactful 4-week engagement.
Company
Crossing Hurdles
About Crossing Hurdles
Connecting skilled professionals with opportunities across leading AI training platforms and high-growth companies globally. In the era of expanding human input in AI development,...