{"slug": "our-problems-cont", "title": "Our Problems (cont.)", "summary": "Key research areas for advancing AI-assisted programming, focusing on next-action prediction models that anticipate user edits, file switches, and terminal commands with low latency. It also explores methods for scaling inference-time compute to enable higher-quality, multi-file edits and async code generation, while addressing the challenge of optimal context by combining retrieval, recurrence, and long-context attention to process vast codebases and documentation.", "body_md": "More problems\nSeveral exciting problem areas for the next phase of AI-programming.\nAs a follow-up to our original problems post, here are some more of the problems we believe matter most for the next phase of AI-programming.\nNext action prediction\nCursor comes with Copilot++, a more intelligent version of Copilot that predicts your next edit. Can we take this idea to its natural limit?\nWhen coding, you don’t just make low-entropy edits. Across the entire editor, you take low-entropy keystrokes, clicks, and actions. Can we build a model to predict each with low latency?\nTo start, we’ve extended Copilot++ to predict your next location. Combine this with next edit prediction, and the model can play through a sequence of low-entropy changes.\nWe’re working on predicting the next file you will move to. The next terminal command you will run. The next edit, conditioned on your previous terminal commands! A next action prediction model.\nFurthermore, the model should surface information the moment you need it, whether that is the right piece of code or documentation.\nCursor should feel like an extension of your will. 
The moment you think of a change, the language model requires minimal intent to execute it instantly.\nPromising directions\n-\nFundamental research on action prediction across a codebase.\n-\nContinued pre-training and post-training on ~5–13B active parameter code-models (for prefill-bound low-latency predictions).\n-\nAdditional inference tricks similar to Speculative Edits.\n-\nClever UX for surfacing “actions” in a non-obtrusive way. (How do you propose the next file a user should move to? Or the next location outside the current viewport?)\nPerfect edits\nCan we scale up inference-time compute to produce higher-quality, larger edits? How do we compensate for the increased latency?\nIt may be necessary to perform the edit in the background, spawning off a unit of work that you can entrust to intelligent models.\nWe’ll need models with strong editor-specific tool-use abilities, smarter codebase-wide context, and improved long-term reasoning.\nAnd how can we make async code-generation flow-preserving? This sounds like an oxymoron, but we believe clever research in model capabilities and UX may make this possible.\nHallucinated pseudocode\nUsers will write pseudocode that describes the desired change. Then we can trust Cursor to compile the pseudocode into the full change in the background.\nMulti-file edits\nCmd-k is already fantastic, but what if you could ask for a generic edit across your entire codebase? In particular, one that accurately spans multiple files?\nPromising directions\n-\nScaling inference-time compute. We know reward models and rejection sampling will show quick and easy improvements, but how much further can we go?\n-\nBetter reasoning models (gpt-5, claude-4, gemini 2.0)\n-\nRunning multiple language-server/file-system copies for a given user workspace. 
This will require model tool use and remotely reproducing runtime environments.\n-\nTraining/improving model performance on agent trajectories\n-\nSignificant UX experimentation for in-flow async edits\nOptimal context\nThere can be millions of tokens of documentation, tens of millions of tokens of source code, and tens of millions more tokens of commit history, all potentially useful for resolving a single query.\nNot to mention the pixels in your UI, logs in production and localhost, messages in Slack, etc.\nWe believe the best coding systems will use a mix of retrieval, recurrence, and long-context attention to ingest all this information.\nWe emphasize “systems” because, in the short term, this may be an ensemble of models and infra that together comprise an infinite-context engine for coding. In the long term, we expect it to be baked into the architecture.\nWe’re especially excited when thinking creatively about the future of retrieval. Moving past embeddings, what is the best performance possible given the primitive of an expensive indexing step and a cheap querying step (sublinear in the size of the corpus)?\nMaybe it looks like some variant of transformer memory as a differentiable search index. Perhaps something else entirely. It’s an underexplored research direction.\nMulti-hop context\nInside my codebase, I want to compute a diff between two strings. With embeddings, I get the chunk:\nfunction computeDiff(\n  firstModel: ITextModel,\n  secondModel: ITextModel,\n): string {\n  //...\n}\nTo satisfy the original query, I must determine how to create an ITextModel from a string. This is a query that requires two hops to resolve.\nThe hardest questions and queries in a codebase require several hops. Vanilla retrieval only works for one hop.\nPromising directions\n-\nSpecialized/improved embeddings and rerankers for codebases.\n-\nTraining multi-hop embedders. 
Given a query and the relevant code we’ve found so far, determine the next piece of code to hop to.\n-\nClever prefix-caching and perhaps custom attention masks better suited for codebases.\n-\nNovel research on codebase-level retrieval.\n-\nTeaching a model to learn a codebase in the weights, similar to transformers as a search index.\nBug detection and debugging\nExisting bug-detection systems struggle with calibration and lack sufficient codebase understanding.\nModels are smart enough to correctly identify bugs, but are plagued by false positives. Identifying the trickiest bugs requires a deeper understanding of the codebase. And buggy-looking code may turn out to be benign once you see the larger picture.\nOne way this could surface is a much better experience for code review using language models:\nDetecting bugs in AI review\nThe benefit of “AI review” is that the user is more tolerant of false positives, since they are requesting a review. The downside is that it requires changing user behavior.\nAI linting\nThe best version of bug detection is an always-on linter that catches your bugs in the background. It needs to be a cheaper, faster model than AI review, since we’d run it several times a minute. It must also be tuned to a lower false-positive rate.\nSmarter debugging\nPerhaps more impressive than bug detection is debugging difficult issues.\nWe’ll need to go beyond LLM-based static analysis. For example, we’ve built a cursor/debug package. 
When injected into your code, it tracks runtime information.\nIn the background, we can even use it to track additional variable states (akin to print-debugging with relevant outputs piped into Cursor’s context).\nPromising directions\n-\nClever dataset curation (likely synthetic data) and RL on frontier code models to improve calibration.\n-\nTrack relevant information from other surfaces (the browser or non-integrated terminal).\n-\nImprove frontier model performance on debugger-specific tool-use and chains.\n-\nInfinite context and near-perfect codebase understanding.\n-\nExpand the scope of our cursor/debug library to track all useful program-state information.\nIf any of these problems interest you, please reach out at hiring@cursor.com", "url": "https://wpnews.pro/news/our-problems-cont", "canonical_source": "https://cursor.com/blog/problems-2024", "published_at": "2024-05-25 00:00:00+00:00", "updated_at": "2026-05-19 22:16:19.097647+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "large-language-models", "developer-tools", "products"], "entities": ["Cursor", "Copilot++", "Speculative Edits"], "alternates": {"html": "https://wpnews.pro/news/our-problems-cont", "markdown": "https://wpnews.pro/news/our-problems-cont.md", "text": "https://wpnews.pro/news/our-problems-cont.txt", "jsonld": "https://wpnews.pro/news/our-problems-cont.jsonld"}}