{"slug": "proof-of-compute-the-receipt-is-not-the-benchmark", "title": "Proof of Compute: The Receipt Is Not the Benchmark", "summary": "Phala Cloud's documentation separates GPU TEE performance benchmarks from per-job compute receipts, emphasizing that benchmarks describe hardware capacity while receipts record specific job execution details. A proposed compute receipt model keeps benchmark results, attestation evidence, and policy decisions as distinct fields, with the final allowed claim intentionally narrower than any of those inputs. The approach warns against conflating performance benchmarks with proof of individual compute jobs in decentralized ML and confidential computing environments.", "body_md": "Disclosure: AI tools were used for source collection and editorial review. The article was written by a human author, who checked the facts, code, and conclusions.\n\nCrypto risk disclosure: This article is a technical explanation, not investment advice. It is not a recommendation to buy, sell or hold any cryptoasset.\n\nA proof-of-compute receipt should prove one job. It should not put on benchmark theater. A benchmark can describe a hardware class, a runtime mode, or the shape of a chosen workload. A receipt does something narrower: it records what ran, where the verifier boundary sat, what output came out, and which claim is actually safe to make.\n\nThe benchmark boundary is where compute claims tend to get too loud. [Phala Cloud's Confidential AI documentation](https://docs.phala.com/phala-cloud/confidential-ai/overview) describes GPU TEE modes, and it points separately to benchmark material for GPU TEE performance. That separation is the whole point. Performance evidence can back a capacity discussion, but it is not a receipt for one model job.\n\nThe same boundary turns up in decentralized ML. [Gensyn's documentation](https://docs.gensyn.ai/) frames execution, verification, communication, and coordination as protocol components for ML workloads. 
Its [Products & Research page](https://docs.gensyn.ai/products-and-research) describes Verde as a verification system for decentralized machine learning and mentions reproducible operators. That helps with the verification problem. It is not a free pass to say every job on every network ships with a replayable proof.\n\nA compute receipt should open with fields a reviewer can replay. The list below is an author model for editorial and marketplace review, not a protocol-native standard.\n\n```\n{\n  \"receipt_id\": \"compute_receipt_2026_06_04_001\",\n  \"job_id\": \"inference_batch_7f3a\",\n  \"workload_digest\": \"sha256:container_or_model_bundle\",\n  \"input_commitment\": \"sha256:redacted_prompt_batch\",\n  \"runtime_measurement\": {\n    \"type\": \"tee_or_reproducible_operator_trace\",\n    \"measurement_digest\": \"sha256:runtime_measurement\"\n  },\n  \"benchmark_result\": {\n    \"id\": \"optional_capacity_context\",\n    \"claim_limit\": \"capacity context only; not per-job proof\"\n  },\n  \"attestation_result\": {\n    \"verifier\": \"named_verifier_endpoint\",\n    \"evidence_digest\": \"sha256:attestation_or_trace\",\n    \"claim_limit\": \"runtime or environment evidence only\"\n  },\n  \"verifier_policy_id\": \"gpu_tee_policy_v3\",\n  \"policy_decision\": {\n    \"status\": \"pass\",\n    \"reason\": \"evidence matched policy reference values\",\n    \"policy_id\": \"gpu_tee_policy_v3\"\n  },\n  \"output_digest\": \"sha256:result_bundle\",\n  \"allowed_claim\": \"This job ran under the named verifier policy and produced the output digest above.\",\n  \"blocked_claims\": [\n    \"This proves the model answer is true.\",\n    \"This proves the provider is always faster.\",\n    \"This benchmark proves this job completed.\"\n  ]\n}\n```\n\nA useful receipt keeps `benchmark_result`, `attestation_result`, and `policy_decision` apart, because each one answers a different question. `benchmark_result` carries capacity context. 
`attestation_result`, or a deterministic trace, backs a runtime claim. `policy_decision` records whether the verifier accepted, rejected, or held the evidence. Whatever those three inputs say, the final `allowed_claim` should come out smaller than all of them.\n\nAttestation can strengthen a compute receipt, but it stays bounded evidence. [NVIDIA's Attestation SDK documentation](https://docs.nvidia.com/attestation/attestation-client-tools-sdk/latest/gpu_and_switch_attestation.html) lists GPU attestation prerequisites for supported Confidential Computing GPUs and separates local attestation from remote. [NVIDIA's Confidential Containers attestation documentation](https://docs.nvidia.com/datacenter/cloud-native/confidential-containers/latest/attestation.html) describes remote attestation as proving guest TEE state to a verifier before secrets are released.\n\nThe policy denial case is the one worth keeping. NVIDIA's Confidential Containers quickstart explains that a denied resource request can still confirm three things: the client reached the key broker, the attestation service evaluated the request, and policy rejected it. So a failed receipt is not a wasted one. It can prove the verifier path was exercised while refusing the stronger claim.\n\nProvider constraints belong in the receipt, because infrastructure quietly changes what a compute claim means. [Akash's persistent storage documentation](https://akash.network/docs/learn/core-concepts/persistent-storage/) separates ephemeral from persistent storage, then notes that persistent storage is provider-local and does not survive lease termination or provider migration. That is not a proof-of-compute system on its own. 
It is exactly the kind of deployment boundary a receipt should not hide.\n\n| Bad claim | Safer receipt claim | Why the rewrite matters |\n|---|---|---|\n| \"The benchmark proves the job ran.\" | \"The benchmark gives capacity context; the receipt must prove this job.\" | Benchmark evidence is class-level, not job-level. |\n| \"TEE attestation proves the answer is true.\" | \"TEE attestation supports the runtime/environment claim.\" | Runtime state is not semantic correctness. |\n| \"The verifier denied the request, so nothing was learned.\" | \"The verifier path was reached and policy rejected the evidence.\" | A denial can be a useful audit result. |\n| \"The provider has persistent storage, so the artifact is durable anywhere.\" | \"The storage boundary is provider-local unless another backup path is shown.\" | Deployment constraints affect replayability. |\n\nProof of compute earns its name only when the allowed claim is smaller than the marketing claim. Benchmark numbers stay in the capacity box, attestation in the runtime box, output digests in the job box. After that, the receipt can say exactly what a reviewer is allowed to repeat.\n\nNone of this has to make compute claims boring. It has to make them reviewable. 
If a developer cannot point to the workload digest, the verifier policy, the output digest, and the allowed claim, then the honest label is benchmark or deployment note, not proof of compute.", "url": "https://wpnews.pro/news/proof-of-compute-the-receipt-is-not-the-benchmark", "canonical_source": "https://dev.to/aicryptosystems/proof-of-compute-the-receipt-is-not-the-benchmark-lgj", "published_at": "2026-06-04 08:31:25+00:00", "updated_at": "2026-06-04 08:41:43.650658+00:00", "lang": "en", "topics": ["artificial-intelligence", "machine-learning", "ai-infrastructure", "ai-research"], "entities": ["Phala Cloud", "Gensyn", "Verde"], "alternates": {"html": "https://wpnews.pro/news/proof-of-compute-the-receipt-is-not-the-benchmark", "markdown": "https://wpnews.pro/news/proof-of-compute-the-receipt-is-not-the-benchmark.md", "text": "https://wpnews.pro/news/proof-of-compute-the-receipt-is-not-the-benchmark.txt", "jsonld": "https://wpnews.pro/news/proof-of-compute-the-receipt-is-not-the-benchmark.jsonld"}}