19:57
2026-05-27
deepswe.datacurve.ai
ai-agents
DeepSWE: Measuring frontier coding agents
DataCurve released DeepSWE, a new benchmark for evaluating frontier coding agents on original, long-horizon software engineering tasks. The benchmark features contamination-free tasks written from scr…