RAS: Reflection-Augmented Scaling with In-Context Learning for Executable Cypher Query Generation

Researchers introduced Reflection-Augmented Scaling (RAS), a method that feeds database execution error messages back to the model through in-context learning to improve Cypher query generation. Across three Neo4j datasets and five code-specialized language models, RAS reduced the query execution error rate by 41–50% at five attempts, compared with a 32–38% reduction from independent resampling. The findings indicate that database-generated error messages serve as actionable feedback, and that structuring inference-time compute around them is a more efficient route to executable queries than scaling independent samples.

arXiv:2605.22937v1 Abstract: Inference-time scaling can reduce errors in structured query generation, but methods for allocating that compute in query code generation remain underexplored. We study Text2Cypher, where language models generate Cypher queries that execute against property graph databases. Non-executable queries constitute a distinct syntactic failure separate from semantic inaccuracy: a syntax error triggers a system-generated error message from the database. These error messages are typically discarded at inference time rather than leveraged through in-context learning (ICL). We compare two inference methods: Independent Scaling (IS), which performs memoryless resampling, and Reflection-Augmented Scaling (RAS), which conditions each new attempt on prior execution feedback via ICL. Across three Neo4j datasets and five code-specialized language models, RAS reduces the Query Execution Error Rate by 41–50% at n = 5, outperforming IS at 32–38%. Execution errors are not merely failures to discard but actionable feedback, and structuring inference-time compute around them is a more efficient path to executability than scaling independent samples.
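
The contrast between the two inference methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the prompt layout in `build_prompt`, the function names, and the stand-in `toy_generate` and `toy_execute` callables are all assumptions made for illustration. In practice `generate` would wrap a language model call and `execute` would run the query against a Neo4j database and return its error message, if any.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interfaces (assumptions, not from the paper):
Generator = Callable[[str], str]           # prompt -> Cypher query text
Executor = Callable[[str], Optional[str]]  # query -> error message, or None on success


def build_prompt(question: str, history: List[Tuple[str, str]]) -> str:
    """Assemble a prompt that includes prior failed queries and their errors."""
    parts = [f"Question: {question}"]
    for query, error in history:
        parts.append(f"Previous attempt:\n{query}\nDatabase error:\n{error}")
    parts.append("Write a Cypher query that executes without error.")
    return "\n\n".join(parts)


def independent_scaling(question: str, generate: Generator, execute: Executor,
                        n: int = 5) -> Tuple[str, int, bool]:
    """IS: memoryless resampling. Every attempt sees the same prompt."""
    query = ""
    for attempt in range(1, n + 1):
        query = generate(build_prompt(question, []))
        if execute(query) is None:
            return query, attempt, True
    return query, n, False


def reflection_augmented_scaling(question: str, generate: Generator,
                                 execute: Executor,
                                 n: int = 5) -> Tuple[str, int, bool]:
    """RAS: each new attempt is conditioned on earlier execution feedback."""
    history: List[Tuple[str, str]] = []
    query = ""
    for attempt in range(1, n + 1):
        query = generate(build_prompt(question, history))
        error = execute(query)
        if error is None:
            return query, attempt, True
        history.append((query, error))  # keep the error instead of discarding it
    return query, n, False


# Toy stand-ins so the sketch runs without a model or a database.
def toy_generate(prompt: str) -> str:
    """Deterministic fake model: fixes its query only if it sees an error."""
    if "Database error" in prompt:
        return "MATCH (m:Movie) RETURN m.title"
    return "MATCH (m:Movie RETURN m.title"  # missing closing parenthesis


def toy_execute(query: str) -> Optional[str]:
    """Fake database: reports unbalanced parentheses as a syntax error."""
    if query.count("(") != query.count(")"):
        return "SyntaxError: unbalanced parentheses"
    return None


if __name__ == "__main__":
    question = "List all movie titles"
    print(independent_scaling(question, toy_generate, toy_execute, n=5))
    print(reflection_augmented_scaling(question, toy_generate, toy_execute, n=5))
```

Because the toy generator is deterministic, independent resampling never recovers here, which exaggerates the gap; with a real sampling model both methods reduce errors, and the abstract reports only that RAS reduces them more.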