Practical Learnings from Synthetic Document Finetuning

Apollo Research researchers have identified practical refinements to Synthetic Document Finetuning (SDF), a knowledge editing technique that implants beliefs into AI models by training them on LLM-generated documents. The team found that removing the special `` token from the standard SDF recipe increases the saliency of implanted facts, making the knowledge more unconditionally available during evaluation. They also warn of mode collapse issues, such as generating 439,000 occurrences of a single name, and recommend using negative prompts and random name seeding to maintain document diversity.

We've been using Synthetic Document Finetuning (SDF) quite a bit at Apollo Research lately. This post covers a few tweaks to the standard SDF recipe specific to our use cases, plus some general tips and tricks for getting good results. We’re sharing these notes in case they’re useful to others doing research with SDF.

Synthetic Document Finetuning (SDF) is a knowledge editing technique where models are finetuned on LLM-generated documents consistent with a target fact or belief. As described in Slocum et al. 2025 (https://arxiv.org/abs/2510.17941), SDF "often succeeds at implanting beliefs that behave similarly to genuine knowledge." These implanted beliefs can generalize to related contexts, are often robust to scrutiny, and form internal representations similar to genuine knowledge.

We mostly followed the pipeline described in Slocum et al. 2025 (https://arxiv.org/abs/2510.17941) and the safety-research/false-facts repository (https://github.com/safety-research/false-facts). The pipeline has several stages:

We mostly used Claude Sonnet 4.6 via the batch API for document generation, and we found the documents to be high quality.
When setting this up, we suggest starting small: generate about 5 documents, read through them to find anything that is wrong or not quite right, update the universe description and prompt, and iterate until the documents are good quality and match what you want.

A round of model-graded quality filtering on the final generated dataset can also prove useful, especially if you have certain hard constraints. For example, you can use an LLM grader to ensure that specific concepts are accurately represented, to check that the documents do not carry unwanted implications for how the model should behave, or to verify that important keywords are present above a certain threshold.

Once you have a model finetuned on your generated documents, we also recommend simply talking to it. Chatting with the post-SDF model to see how it naturally recalls the implanted facts gives you quick, qualitative feedback on whether the facts were learned and exactly how the model ended up representing the information.

One thing to watch out for is mode collapse in your synthetic documents. At one point it got a little out of hand, and it turned out we had 439,000 occurrences of "Sarah Chen" across our synthetic data. To fix this, we started using an "anti-universe" or negative prompt. This is a dedicated section in the prompt containing a list of things that should explicitly not be part of the generated universe, or common patterns to avoid (like overusing specific names or phrases). For the name collapse issue specifically, we also had success adding randomly generated names to the generation prompt to seed diversity.

One of the biggest hurdles we, and others we've spoken to, have faced is a saliency problem: the model has clearly learned the SDF information (showing good recall rates when QA'd directly), but the evaluation context fails to elicit it, leaving its reasoning and actions unaffected on downstream evals.
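As a brief aside on the mode-collapse mitigations described earlier, here is a minimal sketch of how a negative prompt section and random name seeding might be wired into a generation prompt, along with a simple counter for spotting overused names. Everything here is an assumption for illustration: the name pools, the wording of the "things to avoid" list, the prompt layout, and the helper names `sample_names`, `build_prompt`, and `name_counts` are hypothetical and are not taken from our pipeline or from the false-facts repository.

```python
import random
import re
from collections import Counter

# Hypothetical name pools; a real setup would likely use much larger lists.
FIRST_NAMES = ["Amara", "Lucia", "Tomas", "Priya", "Jonas", "Keiko"]
LAST_NAMES = ["Okafor", "Lindqvist", "Haddad", "Moreau", "Tanaka", "Novak"]

# Hypothetical negative-prompt ("anti-universe") entries.
NEGATIVE_PROMPT = [
    "Do not invent facts that contradict the universe description.",
    "Do not reuse the same character names across documents.",
    "Avoid stock phrases and repeated openings.",
]


def sample_names(rng, k=3):
    """Draw k distinct 'First Last' names from the pools."""
    names = set()
    while len(names) < k:
        names.add(f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}")
    return sorted(names)


def build_prompt(universe_description, rng, avoid=NEGATIVE_PROMPT):
    """Assemble a generation prompt with seeded names and a negative section."""
    avoid_block = "\n".join(f"- {item}" for item in avoid)
    name_block = ", ".join(sample_names(rng))
    return (
        f"Universe description:\n{universe_description}\n\n"
        f"Things to avoid:\n{avoid_block}\n\n"
        f"If the document needs named people, prefer: {name_block}\n"
    )


def name_counts(documents, candidate_names):
    """Count literal occurrences of each candidate name across documents."""
    counts = Counter()
    for doc in documents:
        for name in candidate_names:
            counts[name] += len(re.findall(re.escape(name), doc))
    return counts
```

Passing a fresh `random.Random(seed)` per document gives each prompt a different set of suggested names, and running `name_counts` over a small pilot batch is a cheap way to notice a single name dominating before scaling up generation.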
To achieve higher recall in these downstream settings, we tweaked a few parts of the default Slocum et al. recipe to maximize saliency. Slocum et al. recommend prefixing each synthetic document with a special