{"slug": "comparison-and-benchmarking-of-rust-decimal-crates", "title": "Comparison and Benchmarking of Rust Decimal Crates", "summary": "A developer benchmarks Rust decimal crates, comparing fixed-point vs floating-point and fixed-precision vs arbitrary-precision designs for financial applications requiring exact decimal representation.", "body_md": "My English is not very good, so this article was translated with the help of AI. Here is the [Chinese version].\n\nAs is well known, because 2 and 10 do not share the same prime factors, binary\nfractions cannot represent most decimal fractions exactly. For example, `f64` has\nthe classic arithmetic error: `0.1 + 0.2 != 0.3`.\n\nSome application scenarios, such as finance, require exact representation of\ndecimal fractions. This is why decimal crates are needed. They use an integer to\nrepresent the mantissa, along with a scale representing the number of decimal\nplaces. For example, the value `1.23` can be represented using the integer `123`\nwith `scale = 2`.\n\nThere are many decimal crates in the Rust ecosystem, each with different designs and trade-offs. Their differences mainly fall into two dimensions:\n\nWhether the scale is fixed or variable. This corresponds to\n[Fixed-point](https://en.wikipedia.org/wiki/Fixed-point_arithmetic)\nvs [Floating-point](https://en.wikipedia.org/wiki/Floating-point_arithmetic).\n\nWhether the number of integers used for the mantissa is fixed or arbitrary. This corresponds to\n[Fixed-precision](https://en.wikipedia.org/wiki/Fixed-precision_arithmetic)\nvs [Arbitrary-precision](https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic).\n\nThis article chooses several crates for comparison and benchmarking.\n\nTable of contents:\n\nThe first two sections ([Fixed-point and Floating-point](#fixed-point-and-floating-point),\n[Fixed-size and Arbitrary-precision](#fixed-size-and-arbitrary-precision))\nintroduce the characteristics of these categories. 
There is nothing\nparticularly new here, so experienced readers may skip them.\n\nThe next section ([Choosing Crates](#choosing-crates)) introduces several\ndecimal crates.\n\nThe final section ([Benchmark Comparison](#benchmark-comparison)) is the\nmain focus of this article, benchmarking and comparing these crates.\n\n*Fixed-point* vs *Floating-point*\n\nIn fixed-point arithmetic, the scale is fixed and bound to the type. In floating-point arithmetic, the scale is variable and stored in each instance.\n\nLet’s illustrate this with code.\n\nA typical *fixed-point* type definition might look like this:\n\n```rust\nstruct FixedPoint<const SCALE: i32>(i128); // scale is bound to type\n```\n\nA typical *floating-point* decimal type might look like this:\n\n```rust\nstruct FloatingPoint {\n    mantissa: i128,\n    scale: i32, // scale is stored in each instance\n}\n```\n\nThis clearly shows that fixed-point numbers have fixed decimal precision, while\nfloating-point decimals have variable precision. For example, `FixedPoint<2>`\nalways has 2 decimal places, while the precision of `FloatingPoint` depends\non each instance’s scale.\n\nBecause of this distinction, fixed-point and floating-point types exhibit the following differences:\n\nFixed-point numbers have a smaller representable range, while floating-point numbers can represent a much larger range. This is because floating-point numbers sacrifice decimal precision as values become larger.\n\nFixed-point arithmetic is simpler and faster, while floating-point arithmetic is more complex and slower. For example, addition for fixed-point numbers only requires integer addition on the mantissa. Floating-point addition must first check whether the scales are equal (this check itself can already be slower than the addition), and if not, align the scales through multiplication. 
This will be discussed in detail in the benchmark section.\n\nFixed-point arithmetic is somewhat more cumbersome to use, while floating-point\narithmetic is more convenient. For example, with the `FixedPoint` type above,\nthe scale must be determined at compile time for each type, such as how many\ndecimal places `Balance` or `Price` should have. Floating-point decimals do\nnot require this consideration.\n\nThe difference between the two is somewhat analogous to the difference between statically typed and dynamically typed languages.\n\nMost applications use decimal crates simply to represent decimal fractions exactly, without particularly high requirements for performance or strict decimal precision. In such cases, floating-point decimals are usually preferred for convenience. However, for more serious services, especially many financial systems that require strict decimal precision or high performance, fixed-point decimals are recommended. For example, USD assets should have exactly 2 decimal places, neither more nor less.\n\nNOTE: Since built-in floating-point types in programming languages (such as C’s\n`float` and `double`, or Rust’s `f32` and `f64`) are commonly referred to as\n“floating-point”, and these types cannot represent decimal fractions exactly,\nmany people mistakenly think that “floating-point” inherently cannot represent\ndecimal fractions exactly. This is WRONG! More precisely, these are “binary\nfloating-point” numbers. The inability to represent decimal fractions exactly\ncomes from the “binary” part, not the “floating-point” part. Because people\noften omit the word “binary”, floating-point arithmetic unfairly gets blamed.\nIn fact, even *binary fixed-point* types, such as those in the\n[`fixed`](https://docs.rs/fixed/latest/fixed/) crate, cannot represent\ndecimal fractions exactly. 
As long as a crate is decimal-based, whether\nfixed-point or floating-point, it can represent decimal fractions exactly.\n\nNOTE: Floating-point arithmetic has a standard called\n[IEEE 754](https://en.wikipedia.org/wiki/IEEE_754), which defines both binary\nfloating-point formats (used by `f32`/`f64`) and decimal floating-point formats.\nHowever, this standard is only *one* implementation approach for floating-point\narithmetic, not the entirety of floating-point arithmetic itself. Other\nimplementations are also possible. In practice, most decimal crates do not\nfollow IEEE 754 decimal formats.\n\n*Fixed-precision* vs *Arbitrary-precision*\n\nFirst, let’s clarify the meaning of the word “precision” here. The term has two conflicting meanings:\n\n- the number of fraction places (digits after the decimal point)\n- the number of significant digits\n\nFor example, the value `1.23` has 2 fraction places but 3 significant digits.\nBoth meanings are widely used. For example,\n[std::fmt](https://doc.rust-lang.org/std/fmt/index.html#precision) uses the\nformer meaning, while here (Fixed-precision vs Arbitrary-precision) the latter\nmeaning is used. This is the [standard terminology](https://en.wikipedia.org/wiki/Fixed-precision_arithmetic),\nbut it easily causes confusion. “Fixed-precision” is often misunderstood as\nfixed fraction places, leading to confusion with fixed-point arithmetic.\n\nTo avoid ambiguity, this article uses the term *Fixed-size* instead of *Fixed-precision*.\n\nAs the name suggests, Fixed-size types use a fixed number of integers (one or more). Arbitrary-precision types use as many integers as necessary: expanding to the left to avoid overflow, and expanding to the right to avoid precision loss.\n\nNaturally, arbitrary precision requires heap allocation, meaning the type is not `Copy`,\nand the crate is not `no-alloc`. 
All operations also become significantly slower.\nUnless there is a clear requirement for arbitrary precision, Fixed-size types\nare generally preferable.\n\nWe choose several decimal crates for comparison and benchmarking:\n\n`bigdecimal`\n\n| Floating-point | Arbitrary-precision |\n\nThis is currently the only actively maintained Arbitrary-precision decimal crate.\nInternally, it uses a `Vec<u64>` or `Vec<u32>` to represent the mantissa.\nIts memory layout looks like this:\n\n```\n+-u64----+--------+--------+--------+--------+\n| sign   | Vec<u64>                 | scale  |\n+--------+--+-----+--------+--------+--------+\n            |\n            +--------+--------+----\n            | u64    |  …     |\n            +--------+--------+----\n```\n\nMetadata alone occupies 5 machine words, totaling 40 bytes, making the memory layout relatively loose. Since memory allocation is required during creation and expansion, and pointer dereferencing is needed during access, performance is relatively poor, as will be clearly shown in the benchmarks below.\n\nIn short, this crate prioritizes Arbitrary-precision at the expense of memory efficiency and performance.\n\n`fastnum`\n\n| Floating-point | Fixed-size |\n\nIts `Decimal` definition is:\n\n```rust\nstruct Decimal<const N: usize> { /* fields omitted */ }\n```\n\nHere, `N` is the number of `u64`s used to represent the mantissa. For example,\n`Decimal<2>` uses two `u64`s, giving a 128-bit mantissa. Because `N` can be chosen freely, its\ndocumentation also describes it as [Arbitrary-precision](https://crates.io/crates/fastnum/0.7.4).\nThe difference is that `bigdecimal` adjusts precision at runtime, while\n`fastnum` determines it at compile time.\n\nThe memory layout is:\n\n```\n+-u64----+--------+...+--------+\n| [u64; N]            | CBlock |\n+--------+--------+...+--------+\n```\n\n`CBlock` is an 8-byte `ControlBlock` used by `fastnum` to store metadata.\nBesides sign and scale, it contains additional fields. 
See the\n[documentation](https://docs.rs/fastnum/0.7.4/fastnum/#memory-layout) for details.\n\n`fastnum` also provides many scientific functions typically found in `f32`/`f64`,\nsuch as `sin`, `cos`, `sqrt`, and `log`. The other decimal crates offer far less of\nthis functionality, if any. Personally, I do not find these features particularly\ncompelling. People use decimal arithmetic to represent decimal fractions exactly,\nwhile scientific computations typically produce irrational numbers that cannot be\nrepresented exactly anyway. Scenarios requiring such operations (even in finance,\nsuch as pricing models) are better suited to much faster binary floating-point\ntypes (`f32`/`f64`).\n\nThe documentation claims the crate is [blazing fast](https://docs.rs/fastnum/0.7.4/fastnum/#why-fastnum),\nbut its benchmark comparisons are mostly against the already slow `bigdecimal`.\nIn the benchmarks below, `fastnum` turns out to be the slowest of the selected\nFixed-size crates. However, since it considers itself Arbitrary-precision,\nits intended competitor is probably `bigdecimal`.\n\nAlso, its documentation is extremely detailed.\n\n`rust_decimal`\n\n| Floating-point | Fixed-size |\n\nThe most popular decimal crate in the Rust ecosystem. Judging from download counts, reverse dependencies, and ecosystem integration (serde, postgres, etc.), it is by far the most widely used. It is also one of the oldest decimal crates, with its first release dating back to late 2016. Its age is probably a major reason for its popularity.\n\nIt only supports 128-bit signed decimals. Memory layout:\n\n```\n+-u32--+------+------+------+\n| flag | high | mid  | low  |\n+------+------+------+------+\n```\n\nThe mantissa consists of three `u32`s (`high`, `mid`, and `low`), totaling 96 bits,\nroughly equivalent to 28 decimal digits. 
Arithmetic operations on large mantissas must process all\nthree `u32`s sequentially, which hurts performance.\n\nThe `flag` field stores:\n\n- the sign\n- the scale (in the range `[0, 28]`)\n\nThe documentation claims this memory layout is chosen for\n[performance optimization](https://docs.rs/rust_decimal/1.41.0/rust_decimal/#comparison-to-other-decimal-implementations).\nHowever, the benchmarks below show that `rust_decimal` is not actually the fastest.\nHistorically, this design likely existed because Rust originally lacked stable\n128-bit integers.\n\nThe API also reveals traces of the pre-`i128` era. For example, the constructor\nfrom `i64` is called [`new`](https://docs.rs/rust_decimal/latest/rust_decimal/struct.Decimal.html#method.new),\nwhile the later-added `i128` constructor is named\n`from_i128_with_scale`.\n\n`decimax`\n\n| Floating-point | Fixed-size |\n\nThis crate occupies essentially the same niche as `rust_decimal`.\n\nAdvantages:\n\nDisadvantages:\n\n`rust_decimal`.\n\nOne reason this crate was selected is that I am its author :)\n\nIt uses a single integer representation. For the 128-bit signed type, the memory layout is:\n\n```\n+-u128-----------------------+\n|S|scale| mantissa           |\n+----------------------------+\n```\n\nThe sign (`S`) and scale occupy 1 bit and 5 bits respectively, leaving 122 bits\nfor the mantissa, or roughly 36 decimal digits, significantly more than\n`rust_decimal`’s 28 digits.\n\nArithmetic uses a single `u128` instead of three `u32`s, making it faster.\n\n`primitive_fixed_point_decimal`\n\n| Fixed-point | Fixed-size |\n\nThis is the only Fixed-point crate selected in this article. 
Its main difference\nfrom the others is precisely that it is Fixed-point, as discussed earlier\nin [Fixed-point and Floating-point](#fixed-point-and-floating-point).\n\nCompared with other Fixed-point decimal crates, its biggest feature is that\nbesides the typical `FixedPoint` style (using const generics to fix decimal\nplaces at compile time), it also provides an *Out-of-band scale* mode,\nallowing the scale to be specified at runtime for greater flexibility.\n\nFor example, in a multi-currency fund management system, using the typical\n`FixedPoint` type forces all currencies to share the same decimal precision.\nDefining:\n\n```rust\ntype Balance = FixedPoint<2>;\n```\n\nmeans all currencies are limited to 2 decimal places.\n\nWith the crate’s `Out-of-band scale` types, each currency can define its own\ndecimal precision. See the [Out-of-band documentation](https://docs.rs/primitive_fixed_point_decimal/latest/primitive_fixed_point_decimal/#specify-scale)\nfor details.\n\nSince the scale is kept outside the instance (either through const generics or Out-of-band metadata), no scale needs to be stored in the instance itself. Therefore, instances only store the mantissa. For the 128-bit signed type, the memory layout is:\n\n```\n+-i128-----------------------+\n| signed-mantissa            |\n+----------------------------+\n```\n\nThis crate also differs in another implementation detail: it uses signed mantissas, while all the other selected crates separate sign and mantissa handling. This distinction also originates from the difference between floating-point and fixed-point arithmetic, but we will not go into detail here. 
The only thing worth noting is that this leaves the mantissa with 127 bits instead of 128.\n\nLet’s compare memory efficiency by looking at metadata size:\n\nSpoiler: this ranking matches the benchmark results.\n\nNow we arrive at the core of this article: benchmark results.\n\nWe use [criterion](https://crates.io/crates/criterion) for benchmarking.\nThe project source code is available on [GitHub](https://github.com/WuBingzheng/decimal-crates-comparison).\n\nBenchmarks were run on three machines:\n\nResults vary somewhat across environments. For simplicity, this article only\npresents and analyzes the first machine (AMD EPYC). Readers interested in other\nenvironments can refer to the [full results](https://github.com/WuBingzheng/decimal-crates-comparison/tree/main/charts).\nYou are also welcome to run the benchmarks on your own machine; instructions\nare included on the project page.\n\nBesides the decimal crates above, native Rust `f64` is also included for\ncomparison. Since stable `f128` is not yet available, it was not benchmarked.\nHowever, in my private tests, `f128` performs almost identically to `f64`.\n\nWe primarily benchmark 128-bit and 64-bit signed types. However:\n\n- `bigdecimal` is variable-sized, so bit width is irrelevant.\n- `fastnum` supports much larger sizes, so this benchmark underutilizes it somewhat.\n- `rust_decimal` only supports 128-bit, not 64-bit.\n\nBenchmark cases:\n\nSubtraction behaves similarly to addition and is therefore omitted.\n\nOperand selection: Different benchmark cases use different scale configurations\ndepending on the scenario. The mantissas themselves (more precisely: both addition\noperands, both multiplication operands, and the dividend for division) are all\npowers of 10, increasing exponentially. 
For example, `x = 3` on the chart\nmeans the operand is `1e3`.\n\nBecause different crates support different mantissa sizes, their representable ranges differ, resulting in different line lengths in the charts:\n\n- `bigdecimal` supports arbitrary precision, but was restricted here to 128-bit-equivalent values, or 38 decimal digits.\n- `fastnum:128` has a full 128-bit mantissa, also about 38 digits.\n- `prim-fpdec:128` has a 127-bit mantissa, but still roughly 38 decimal digits.\n- `decimax:128` has a 122-bit mantissa, about 36 digits.\n- `rust_decimal` has a 96-bit mantissa, only about 28 digits.\n\nThe following sections explain the details.\n\nThe addition process works as follows:\n\nThis section benchmarks the equal-scale case. The next section covers unequal scales.\n\nFor simplicity, we use identical operands. The scale does not affect the benchmark and is fixed at 10. The mantissas are powers of 10 increasing in magnitude.\n\nChart:\n\nAs expected, `bigdecimal` sits far above the others. The remaining crates\nare compressed near the bottom, so we temporarily remove `bigdecimal`:\n\nNow things are much clearer.\n\nFor 128-bit types:\n\n- `fastnum:128` is the slowest\n- `rust_decimal` comes next\n- `decimax` follows\n- `prim-fpdec:128` is the fastest\n\nThe first three are floating-point decimals, so they must first check whether the scales are equal before addition. 
This check itself is relatively expensive and slows down the entire operation.\n\n`prim-fpdec:128` is fixed-point, so the operation is essentially just integer\naddition, almost a single CPU instruction.\n\nFor 64-bit types:\n\n- `fastnum:64` is slightly faster than `fastnum:128`\n- `decimax:64` performs similarly to `decimax:128`\n- `prim-fpdec:64` performs similarly to `prim-fpdec:128`\n\nMost curves are stable, except `rust_decimal` and `fastnum:64`, both of which\nexhibit noticeable jumps, though for different reasons:\n\nFor `rust_decimal`, the jump occurs because numbers are internally represented\nusing three `u32`s. Small mantissas fitting within one `u32` only require one\naddition, while larger mantissas require operations across all three `u32`s.\nHence the jump around `x = 9`.\n\nFor `fastnum:64`, the jump occurs because its 64-bit mantissa can represent up\nto 19 decimal digits. Since our benchmarks use powers of 10, the problematic\ncase occurs around `1e19`. Adding two such values yields `2e19`, exceeding the\n64-bit range (~`1.84e19`). Following floating-point behavior, the implementation\nmust rescale: `mantissa /= 10; scale += 1;`. Since division is slow, the\naddition operation suddenly becomes much slower.\nOther floating-point crates may encounter similar situations, though not within\nthis benchmark range. Fixed-point crates cannot rescale, so they simply overflow\nand return an error instead.\n\nNow let’s look at addition where the operand scales differ.\n\nFixed-point types cannot participate in this benchmark, so `primitive_fixed_point_decimal`\nis excluded.\n\nBefore adding mantissas, floating-point decimals must first align the scales. The algorithm typically works as follows:\n\nIn this benchmark, operand scales are fixed at 10 and 0, differing by 10.\nTherefore, alignment requires multiplying by `1e10`. 
Once the mantissa grows\nbeyond `1e(MAX_SCALE - 10)`, multiplication overflows and the slower fallback\npath involving division is triggered.\n\nChart:\n\nAgain, `bigdecimal` dominates the chart, so we temporarily remove it:\n\nCompared with equal-scale addition, absolute times are much higher because of scale alignment.\n\nAs explained above, all curves eventually exhibit jumps.\n\nAmong them:\n\n- `rust_decimal` shows the largest jump, tripling from ~15ns to ~45ns and becoming unstable afterward.\n- `fastnum:128` shows a moderate jump.\n- `decimax:128` shows the smallest jump.\n\nPerformance ranking (slower first):\n\nBefore the jump:\n\n`fastnum:128` > `rust_decimal` > `decimax:128`\n\nAfter the jump:\n\n`rust_decimal` > `fastnum:128` > `decimax:128`\n\nNow let’s examine multiplication.\n\nDecimal multiplication consists of two parts:\n\n- multiplying the mantissas\n- adding the scales\n\nBoth steps may overflow. If either overflows, a second phase is triggered, reducing both mantissa and scale to avoid overflow. Since division is involved, performance degrades significantly.\n\nWe again use identical operands with exponentially increasing mantissas. To avoid overflow of the decimal value itself (as opposed to overflow of the mantissa multiplication), scales are increased simultaneously so that the actual value remains 1.\n\nOnce the mantissa reaches approximately half the representable range, mantissa multiplication overflows and triggers the second phase.\n\nChart:\n\nBesides `bigdecimal`, both `fastnum` curves become extremely large in the\nlatter half. 
To better observe the other crates, we remove the entire\n`bigdecimal` curve and truncate the `fastnum` curves:\n\nThe chart is still somewhat messy, so let’s break it down carefully.\n\nBecause of mantissa multiplication overflow, most curves exhibit jumps around their midpoint.\n\nFirst, consider the post-jump behavior for 128-bit types:\n\n- `fastnum:128` slows down extremely rapidly after the jump.\n- `rust_decimal` exhibits multiple jumps, likely because of its three-`u32` representation.\n- `decimax` and `prim-oob-fpdec:128` are much more stable and significantly faster.\n\nNow consider the pre-jump region:\n\n- `fastnum:128` and `rust_decimal` are both stable before their jumps (`x=19` and `x=14` respectively), though `fastnum` survives longer.\n- `decimax` and `prim-oob-fpdec:128` are not only stable but extremely fast before their jumps.\n\nCareful readers may notice that `primitive_fixed_point_decimal` appears as two variants:\n`prim-oob-fpdec:128` and `prim-const-fpdec:128`. Only the former was discussed earlier.\nThis difference arises from fixed-point semantics. The multiplication process described\nearlier (multiply mantissas, add scales) applies to floating-point decimals. For fixed-point\ndecimals, however, the result scale is predetermined. After adding operand scales, the\nimplementation must further adjust to the target scale, similar to the\noverflow-adjustment phase. In other words, the second phase that floating-point types\nonly enter later is always active for fixed-point types. This is somewhat unfair to\nfixed-point arithmetic. Fortunately, `primitive_fixed_point_decimal` provides the more\nflexible `Out-of-band Scale` mode, allowing the result scale to equal the sum of operand\nscales. This avoids the second phase during the early part of the benchmark, enabling\nfairer comparison with floating-point types. 
That is what `prim-oob-fpdec:128` measures.\n\nHowever, this is not the real-world use case for fixed-point arithmetic. The `Out-of-band Scale`\nfeature was not designed specifically for this benchmark. To reflect realistic fixed-point\nusage, we also benchmark `prim-const-fpdec:128`, where the result scale remains fixed,\nforcing the second phase throughout the entire benchmark.\nAs the chart shows, `prim-const-fpdec:128` is initially the slowest, but later becomes\none of the fastest, converging with `prim-oob-fpdec:128`.\n\nDoes this mean fixed-point multiplication is slower than floating-point multiplication for small mantissas? For this specific case, yes. But over longer computation chains, not necessarily. Floating-point multiplication appears faster because it postpones scale adjustment, allowing both scale and mantissa to grow. As shown throughout this article, larger scales and mantissas tend to slow down subsequent operations. Unless the multiplication result is final and never used again (not even formatted as a string), the earlier performance advantage tends to be paid back later.\n\nThe 64-bit results behave similarly and are omitted here.\n\nDivision has several notable characteristics:\n\nOverall, division tends to consume disproportionate development and benchmarking effort for a relatively small portion of real-world usage. Therefore, this article only benchmarks two simple cases:\n\n- exact division\n- non-exact division\n\nwithout attempting exhaustive or perfectly fair comparison.\n\nThis section discusses the former, exact division.\n\nFor exactly divisible floating-point division, there are again two subcases:\n\n- The dividend is directly divisible, such as `200 / 25`.\n- The dividend is divisible only after rescaling, such as `2 / 25`.\n\nIn the second case, `2` does not divide evenly by `25`, but after rescaling to `200`,\ndivision succeeds. 
The difficulty is that the implementation initially does not know\nhow much rescaling is needed, or whether exact division is even possible.\nTherefore, implementations often first scale up aggressively, then perform the division,\nand finally strip trailing zeros.\nFor example, `2` might first become `20000000000`, producing `800000000`, and only\nafterward get reduced back to `8`. Even the number of zeros to strip must be discovered\niteratively, making this path potentially very slow.\n\nTo cover both cases, the benchmark fixes the divisor at `1e8`, while the dividend\nagain increases as powers of 10.\n\nThus:\n\n- Below `x=8`, rescaling is required (slow path)\n- From `x=8` onward, direct division succeeds (fast path)\n\nFixed-point types do not have these distinctions because quotient scale is predetermined.\n\nChart:\n\nFor floating-point types:\n\n- Below `x=8`, all implementations are very slow.\n- Beyond that point, `rust_decimal`, `fastnum:128`, and `decimax` become much faster, while `bigdecimal` remains slow.\n\nFor fixed-point:\n\n`prim-fpdec:128` avoids quotient-scale determination and is initially very fast.\nLater, larger mantissas gradually slow it down.\n\nNow consider the non-exact division case.\n\nAs explained above, exactness only matters for floating-point decimals. 
Fixed-point behavior remains unchanged, so the fixed-point results here should match the previous benchmark.\n\nChart:\n\nAgain, removing `bigdecimal` makes the comparison clearer:\n\nCompared with their exact-division counterparts:\n\n- `bigdecimal`, `fastnum:128`, and `rust_decimal` are consistently much slower.\n- `decimax:128` becomes significantly faster and very stable.\n- `prim-fpdec:128`, being fixed-point, behaves identically to the exact-division benchmark.\n\nExplaining these differences would require code-level analysis of each implementation, which is beyond the scope of this article.\n\nOverall, except for a few special cases, the approximate performance ranking is:\n\n```\nbigdecimal << fastnum < rust_decimal < decimax < primitive_fixed_point_decimal\n```\n\n(Further left means slower.)\n\nFloating-point arithmetic paths depend heavily on the specific operands, making performance relatively unstable. Fixed-point arithmetic, by comparison, is much more predictable, which is reflected in the mostly flat curves above.\n\nAgain, it is important to emphasize that these crates target different use cases, so pure performance comparison is not entirely fair.\n\nThis article introduced several categories of decimal crates and benchmarked several representative implementations.\n\nBased on the results, the following recommendations can be made:\n\nIf dynamic arbitrary precision is required, `bigdecimal` is the only option,\nat the cost of losing `Copy` semantics and suffering very poor performance.\n\nIf types larger than 128-bit are required, `fastnum` is the only choice.\nThis article does not benchmark larger-than-128-bit types, but performance is\nunlikely to be excellent. Interested readers can modify the benchmark project and test it themselves.\n\nIf fixed decimal precision is required, `primitive_fixed_point_decimal` is\nthe only suitable option among the crates compared here. 
Although slightly less convenient than floating-point\ntypes, it provides higher and more stable performance.\n\nIf none of the above requirements apply and you simply want exact decimal\nrepresentation, `rust_decimal` and `decimax` are both good choices. The former\nhas a stronger ecosystem; the latter offers better performance.", "url": "https://wpnews.pro/news/comparison-and-benchmarking-of-rust-decimal-crates", "canonical_source": "https://wubingzheng.github.io/en/Decimal-Crates-Comparison.html", "published_at": "2026-06-15 03:37:09+00:00", "updated_at": "2026-06-15 03:45:16.207900+00:00", "lang": "en", "topics": ["developer-tools"], "entities": ["Rust"], "alternates": {"html": "https://wpnews.pro/news/comparison-and-benchmarking-of-rust-decimal-crates", "markdown": "https://wpnews.pro/news/comparison-and-benchmarking-of-rust-decimal-crates.md", "text": "https://wpnews.pro/news/comparison-and-benchmarking-of-rust-decimal-crates.txt", "jsonld": "https://wpnews.pro/news/comparison-and-benchmarking-of-rust-decimal-crates.jsonld"}}