Cost Per Genomics Sample? Try Cost Per Sequencing Attempt

Genomics teams running GPU-accelerated short-read sequencing pipelines in the cloud are experiencing failure rates of 15 to 40 percent, paying for compute on failed runs that must be restarted from scratch. The hidden cost is masked by the standard "cost per sample" metric, which does not account for retries, and is driven by misconfigured workflow managers and the shift from CPU-era assumptions to GPU cloud infrastructure. This inefficiency means organizations are paying for up to 40 percent more compute than necessary without realizing it.

If you are a bioinformatics platform lead, an ML infrastructure engineer, or a genomics budget owner who is now running GPU-accelerated workflows in the cloud, pay attention. There is a hidden cost problem that almost every genomics infrastructure team is paying for, and very few are actually measuring. The observations here are specific to short-read sequencing workflows, which remain the dominant data type in production genomics environments.

Your genomics pipeline is probably failing 30 percent of the time, and you're paying for all of it.

Short-read sequencing pipelines, standard in next-generation sequencing (NGS) workflows, used to be CPU-heavy. You'd run them on a cluster, they'd grind through alignment and variant calling over hours, and the bottleneck was CPU throughput. GPU acceleration wasn't the story.

That has changed. AI-driven variant calling, GPU-accelerated alignment tools like Parabricks, and deep learning models running on top of sequencing data have all moved toward the GPU, which means teams are managing serious GPU infrastructure for the first time. The cost model that comes with GPU cloud differs sharply from that of CPU clusters, and people are bringing CPU-era assumptions about pipeline reliability and cost accounting into a GPU environment. That mismatch is costing them.

We work with a lot of these teams, and when we ask about infrastructure costs, they almost always lead with the same number: cost per sample. That's what gets reported upward and what sits in the budget. What that number hides is where things get interesting.

When Pipelines Fail

A typical short-read germline variant calling pipeline has maybe ten to fifteen distinct processing steps. You start with raw FASTQ files off the sequencer, then run quality control, alignment, duplicate marking, base quality score recalibration, variant calling, and annotation, with each step handing off to the next. These pipelines mostly run on workflow managers like Nextflow or Snakemake, which do have built-in mechanisms for resuming failed jobs. Nextflow's -resume flag is designed to let you pick up from step eight of eleven rather than restarting from scratch. In principle, that's exactly the right solution.

In practice, the problem is configuration. For that flag to work, Nextflow needs to find its cache directory, the folder that records which steps completed successfully. If the solutions architect set up the compute environment without properly configuring persistent disk space for that cache, the cache isn't there when you need it, and the pipeline restarts from step one anyway. That's a setup failure rather than a tool limitation, but the result is the same: you've paid for compute you didn't get output from. There is also a second failure mode. When a large task fails mid-execution rather than at a clean step boundary, even proper checkpointing won't save you, because the task has to be rerun in full.

A Problem Difficult To Measure

Genomics teams working with Nebius consistently report that 15 percent to 40 percent of their pipeline runs hit at least one failure and restart before completion. Pinning the figure down precisely is hard, and we have no definitive numbers. The range is wide because it depends heavily on how mature the infrastructure setup is. Teams with well-configured environments sit at the low end; teams newer to GPU cloud, or running on spot instances with higher interruption rates, sit at the high end.
What makes this invisible is that if your metric is cost per completed sample, a failed run that eventually completes still looks like one sample at normal cost. The retry disappears from the number that gets reported.

For example, a GPU-accelerated whole genome sequencing pipeline for germline variant calling takes roughly two GPU-hours on an H200. At current on-demand rates that's about $9 of compute per sample, and that's the visible cost. Now apply a 25 percent failure rate, toward the conservative end of what teams report. For every four samples you complete, one run failed, restarted, and ran from the beginning. If each failed attempt burns roughly a full run's worth of compute before it dies, you have paid for five runs to get four results. Your real cost per completed sample isn't $9 anymore: it's $11.25, a 25 percent hidden markup.

Scale that to a team processing 2,000 samples a month: the cost-per-sample figure implies $18,000, but the real cost is $22,500. That's $4,500 a month, or $54,000 a year, in compute that produced no output. For a mid-size genomics team, that's a meaningful fraction of the cloud budget, and it shows up nowhere as waste. That's before you touch storage.

The Hidden Costs

The storage picture is more nuanced than people expect. A standard whole genome generates roughly 200 gigabytes of raw FASTQ data, but that's the uncompressed figure. In practice, almost everything going into cold storage is compressed, typically down to around 30 gigabytes per sample, so the storage cost per sample is quite manageable.

Where it gets complicated is retrieval. When you want to reanalyze archived samples, say by running a new cohort through an updated pipeline, you pull those compressed files back, and your infrastructure then needs to decompress them. That 30 gigabyte compressed file expands to 200 gigabytes, which means you need the disk space and memory headroom to handle the expansion. If the environment wasn't sized for it, you get failures or severe slowdowns at the decompression step, which becomes another category of hidden cost that's rarely accounted for up front.
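The decompression sizing described above can be checked before a reanalysis batch starts. The following is a minimal sketch rather than a definitive sizing tool. The function names, the 25 percent safety margin, and the idea of multiplying by the number of samples decompressed at once are all assumptions made for illustration; the expansion ratio is simply the article's rough 30 gigabyte to 200 gigabyte figure, and real ratios vary by data and compressor. It covers disk only, not memory headroom.

```python
import shutil

GB = 10**9

# Assumption: ratio taken from the article's rough figures
# (about 30 GB compressed expanding to about 200 GB).
DEFAULT_EXPANSION_RATIO = 200 / 30


def required_scratch_bytes(compressed_bytes, concurrent_samples=1,
                           expansion_ratio=DEFAULT_EXPANSION_RATIO,
                           safety_margin=1.25):
    """Estimate scratch space needed to decompress archived FASTQ files.

    safety_margin is a hypothetical cushion for intermediate files.
    """
    if compressed_bytes < 0 or concurrent_samples < 1:
        raise ValueError("sizes must be non-negative and samples >= 1")
    expanded = compressed_bytes * expansion_ratio * concurrent_samples
    return expanded * safety_margin


def has_headroom(compressed_bytes, free_bytes, **kwargs):
    """Return True if free_bytes covers the estimated decompressed size."""
    return free_bytes >= required_scratch_bytes(compressed_bytes, **kwargs)


def scratch_volume_has_headroom(path, compressed_bytes, **kwargs):
    """Same check, reading free space from the volume that holds `path`."""
    free_bytes = shutil.disk_usage(path).free
    return has_headroom(compressed_bytes, free_bytes, **kwargs)
```

Under these assumptions, one 30 gigabyte sample needs about 250 gigabytes of scratch space, so a volume with 220 gigabytes free would fail the check even though the decompressed file alone would nominally fit. Running a check like this before the batch is launched turns a mid-pipeline failure into a sizing decision.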
In cancer research, the numbers are much larger. Somatic mutation calling runs at 60X to 100X sequencing depth, so 600 gigabyte FASTQ files aren't unusual. Everything we have described scales accordingly.

The key point: retrieval from cold storage always has a cost, regardless of where your compute lives relative to your storage. Some platforms charge for data egress between regions on top of that. Either way, the teams that haven't modeled their reanalysis frequency as a real line item are almost always surprised when they do.

Tracking, Tracking, And Tracking...

Bioinformatics engineers know the failure rates, because they are the ones watching jobs fail at 2 AM. But by the time the numbers roll up to whoever controls the budget, it's just "cloud costs." There's no line item for "compute we paid for and got no output from."

Cloud billing by service and instance type doesn't surface this. You see your GPU compute spend, your storage spend, your egress. You don't see "20 percent of your GPU spend this month was on runs that didn't complete." That decomposition requires deliberate instrumentation, and most teams haven't built it yet.

What Teams Should Measure Instead Of Cost Per Sample

Teams should measure a few things instead.

First, completion rate: the percentage of pipeline runs that complete without failure or restart. That's your pipeline reliability score, directly linked to compute waste.

Second, cost per attempted sample versus cost per completed sample. If those numbers are meaningfully different, you have a problem worth fixing.

Third, storage retrieval frequency and the infrastructure overhead of decompression: how often you're pulling archived data back, and whether you've properly sized the disk and memory headroom for it. This is the gap between what looks cheap in the storage bill and what it costs to use the data.

One Thing Genomics Infrastructure Teams Should Do Right Now

Instrument your pipeline failure rate, right now, before anything else.
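As a starting point, the first two metrics above can be computed from a plain log of run attempts. This is a minimal sketch under stated assumptions, not a reference implementation: the RunAttempt record, its field names, and the pipeline_metrics function are hypothetical, and the $4.50 per GPU-hour rate is only what the article's "about $9 for two GPU-hours" figure implies. It also assumes each failed attempt is logged with the GPU-hours it consumed before failing.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RunAttempt:
    """One pipeline launch for one sample (hypothetical record layout)."""
    sample_id: str
    gpu_hours: float
    completed: bool


def pipeline_metrics(attempts, gpu_hour_rate):
    """Summarize completion rate and per-attempt vs per-sample cost."""
    attempts = list(attempts)
    if not attempts:
        raise ValueError("no run attempts recorded")

    total_cost = 0.0
    wasted_cost = 0.0  # spend on attempts that produced no output
    attempts_per_sample = defaultdict(int)
    completed_samples = set()

    for attempt in attempts:
        cost = attempt.gpu_hours * gpu_hour_rate
        total_cost += cost
        attempts_per_sample[attempt.sample_id] += 1
        if attempt.completed:
            completed_samples.add(attempt.sample_id)
        else:
            wasted_cost += cost

    # Samples that finished on their first and only attempt.
    clean = sum(1 for s in completed_samples if attempts_per_sample[s] == 1)

    return {
        "completion_rate": clean / len(attempts_per_sample),
        "cost_per_attempted_sample": total_cost / len(attempts),
        "cost_per_completed_sample": (
            total_cost / len(completed_samples) if completed_samples else None
        ),
        "wasted_fraction": wasted_cost / total_cost if total_cost else 0.0,
    }


# The article's scenario: four samples, one of which failed once
# after consuming a full two GPU-hours, then succeeded on retry.
example_attempts = [
    RunAttempt("S1", 2.0, True),
    RunAttempt("S2", 2.0, True),
    RunAttempt("S3", 2.0, True),
    RunAttempt("S4", 2.0, False),
    RunAttempt("S4", 2.0, True),
]
metrics = pipeline_metrics(example_attempts, gpu_hour_rate=4.50)
```

For this example the sketch gives a completion rate of 0.75, a cost per attempted sample of $9.00, a cost per completed sample of $11.25, and a wasted fraction of 0.2, which is the kind of "20 percent of GPU spend" figure that cloud billing does not show on its own.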
The number itself doesn't fix anything, but it makes the problem visible. Once you can show that 15 percent or 25 percent of your compute spend is going toward runs that restart, with real dollar figures attached, the conversation about fixing the underlying infrastructure becomes easy to have. People move fast when they can see the waste. Everything else follows from that (better checkpointing configuration, smarter storage architecture, more stable compute), but you have to see the problem first.

Discover the breakthroughs shaping the future of AI in healthcare and life sciences. Visit https://nebius.com/solutions/life-sciences-and-healthcare to learn more and register for the 2026 AI Discovery Awards ceremony at https://nebius.com/ai-discovery-award.

Anastasia Raskolova is a senior product manager for Healthcare & Life Sciences at Nebius, where she focuses on infrastructure products for drug discovery and clinical AI workflows. Before that, she spent her career building ML products across computer vision, recommendation systems, and generative AI, and she stays grounded in clinical reality by volunteering in the Emergency Department at Massachusetts General Hospital.

Contributed by Nebius.