Contract Spy
Remote (United Kingdom)
hiring experienced Python Engineers to support a variety of high-impact research collaborations with leading AI labs. Freelancers will help improve AI systems by extending coding benchmarks that reflect real-world development across diverse languages and domains. This is a unique opportunity to apply your engineering expertise toward shaping the next generation of intelligent systems.

Key Responsibilities
- Develop and validate coding benchmarks in Python by curating issues, solutions, and test suites from real-world repositories
- Ensure benchmark tasks include comprehensive unit and integration tests for solution verification
- Maintain consistency and scalability of benchmark task distribution
- Provide structured feedback on solution quality and clarity
- Debug, optimize, and document benchmark code for reliability and reproducibility

Ideal Qualifications
- 3–10 years of experience as a backend software engineer, ML engineer, or...