# Math Engine, eval()-free expression interpreter for Python

> Source: <https://dev.to/teske-systemtechnik/math-engine-eval-free-expression-interpreter-for-python-426p>
> Published: 2026-06-16 15:57:16+00:00

A safe evaluation engine for mathematical expressions, built from scratch: tokenizer, recursive-descent parser, AST, linear equation solver and a type-safe output system, entirely without Python's eval(). Live on PyPI, 399 tests, 90% coverage, green across five Python versions.

The obvious way to evaluate an expression like `3 + 4 * 2` in Python is a single line: `eval("3 + 4 * 2")`. That very line is the problem. `eval()` executes arbitrary Python code: a string disguised as numeric input, such as `__import__('os').system('rm -rf …')`, runs without complaint. For any application that takes expressions from a file, a form field, an API or a configuration string, `eval()` is therefore a direct code-execution vector, not a calculator.
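To make the risk concrete, here is a minimal sketch with a deliberately harmless payload. The string below is an illustration, not taken from the library: it only reads the current working directory, but `eval()` would run a destructive call just as readily.

```python
import os

# A harmless stand-in for a hostile payload: it only reads the working directory.
untrusted_input = "__import__('os').getcwd()"

# eval() does not check that the string is arithmetic; it simply runs it.
leaked = eval(untrusted_input)

# The "calculator" has just answered a question about the file system.
print(leaked == os.getcwd())
```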

The second, quieter defect is correctness. `eval()` and Python's `float` compute in binary: `0.1 + 0.2` yields `0.30000000000000004`, `1/3` is truncated, and large values computed as floats tip over into scientific notation. For a calculator, a financial formula or an educational context, that is not "almost right"; it is wrong.
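The float behaviour described above is easy to reproduce with the standard library alone. The sketch below compares `float` with `decimal.Decimal`; the precision of 50 digits is an arbitrary choice for illustration, not the engine's setting. Note that `Decimal` also rounds a non-terminating fraction such as `1/3`, but at a precision the caller chooses rather than at a fixed 53 binary bits.

```python
from decimal import Decimal, localcontext

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_sum = 0.1 + 0.2
print(float_sum)            # 0.30000000000000004
print(float_sum == 0.3)     # False

# Decimal works in base 10, so the same sum is exact.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum)          # 0.3

# A large value stored as a float is displayed in scientific notation.
big_float = float(10**30)
print(big_float)            # 1e+30

# With Decimal the number of significant digits is an explicit choice.
with localcontext() as ctx:
    ctx.prec = 50
    third = Decimal(1) / Decimal(3)
print(third)
```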

The third defect is diagnostics. Hand `eval()` a broken expression and you get a Python traceback at an internal line number, not the spot in the input string where the problem sits. For a tool that processes end-user input, that is useless.

The task, then: a complete evaluation engine from scratch that **(1) never executes foreign code, (2) computes exactly rather than binary-approximately, (3) pinpoints every error to the exact character**, and (4) does all of that at library quality, tested, documented, versioned and installable from PyPI. Not a weekend parser, but an engine with the discipline of a small compiler.

The library never calls Python's `eval()`, `exec()` or `compile()` anywhere. This is not an after-the-fact filter but the architecture itself. Input strings pass through a closed pipeline (Input → Tokenizer → Parser → Evaluator/Solver → Formatter → Output Converter) whose alphabet is a finite set of numbers, operators, parentheses and a whitelist of function names. At worst, an attacker-controlled string can trigger a typed `MathError`, never code execution. Even the single place that parses a user-supplied data structure uses the safe `ast.literal_eval`, which accepts literals only.
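The idea of a closed alphabet can be sketched in a few lines. Everything below is an assumption for illustration: the `MathError` class is a stand-in, and the token pattern and the `ALLOWED_NAMES` whitelist are invented, not the library's actual grammar. The point is only that anything outside the alphabet ends in a typed error before a parser ever sees it.

```python
import ast
import re

class MathError(Exception):
    """Hypothetical stand-in for the library's typed error."""

# Assumed whitelist; the real engine's list of names is not shown in the article.
ALLOWED_NAMES = {"sqrt", "sin", "cos", "x"}

TOKEN_RE = re.compile(r"""
    (?P<number>\d+(?:\.\d+)?)
  | (?P<name>[A-Za-z_]\w*)
  | (?P<op>\*\*|<<|>>|[-+*/%^&|=(),])
  | (?P<space>\s+)
""", re.VERBOSE)

def tokenize(text):
    """Split text into (kind, value) pairs or raise MathError."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise MathError(f"unexpected character {text[pos]!r} at column {pos}")
        kind, value = match.lastgroup, match.group()
        if kind == "name" and value not in ALLOWED_NAMES:
            raise MathError(f"unknown name {value!r} at column {pos}")
        if kind != "space":
            tokens.append((kind, value))
        pos = match.end()
    return tokens

print(tokenize("3 + 4 * 2"))

try:
    tokenize("__import__('os').system('...')")
except MathError as exc:
    print("rejected:", exc)

# ast.literal_eval accepts literals only, so a function call is refused.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    print("literal_eval refused the call")
```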

Operator precedence is not hacked in via regex or a shunting-yard table, but encoded structurally as ten nested parser closures, each handling exactly one precedence level: from `parse_gleichung` (`=`; "Gleichung" is German for "equation") through bitwise operators, shift operations, sum and term, down to `parse_power` (`**`) and `parse_factor`. Left- vs. right-associativity falls out of the structure: whatever consumes in a loop is left-associative (`a - b - c = (a - b) - c`); `parse_power` recurses to the right, which makes `**` correctly right-associative. A deliberate decision: `^` is bitwise XOR, not exponentiation, exactly as in C and Python.

Every number is a `decimal.Decimal` from the tokenizer through to the output, never a `float`, which is why `0.1 + 0.2` is exactly `0.3`. The precision of the Decimal context is determined anew for each calculation (between 100 and 10,000 digits, depending on the input), plus a hard input ceiling of 20,000 digits. The point: a long result is never silently truncated, and a short one never wastes memory. This is exactly the class of correctness that float-based calculators quietly lose.
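The two ideas above, precedence encoded as nested parse functions and `Decimal` from token to result, can be sketched together. This is a reduced illustration with four levels instead of ten; only `parse_power` and `parse_factor` are names from the article, while `parse_sum`, `parse_term`, `tokenize` and `evaluate` are assumed. It evaluates directly instead of building an AST, and it omits unary minus, variables and the per-calculation precision logic.

```python
import re
from decimal import Decimal

TOKEN_RE = re.compile(r"\s*(\d+(?:\.\d+)?|\*\*|[-+*/()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected input at column {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def evaluate(text):
    tokens = tokenize(text)
    index = 0

    def peek():
        return tokens[index] if index < len(tokens) else None

    def advance():
        nonlocal index
        token = tokens[index]
        index += 1
        return token

    def parse_sum():
        # A loop consumes operators left to right: left-associative.
        value = parse_term()
        while peek() in ("+", "-"):
            op = advance()
            right = parse_term()
            value = value + right if op == "+" else value - right
        return value

    def parse_term():
        value = parse_power()
        while peek() in ("*", "/"):
            op = advance()
            right = parse_power()
            value = value * right if op == "*" else value / right
        return value

    def parse_power():
        # Recursing into itself on the right side: right-associative.
        base = parse_factor()
        if peek() == "**":
            advance()
            return base ** parse_power()
        return base

    def parse_factor():
        if peek() is None:
            raise ValueError("unexpected end of input")
        token = advance()
        if token == "(":
            value = parse_sum()
            if peek() != ")":
                raise ValueError("missing ')'")
            advance()
            return value
        if token[0].isdigit():
            return Decimal(token)  # numbers never pass through float
        raise ValueError(f"unexpected token {token!r}")

    result = parse_sum()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result

print(evaluate("3 + 4 * 2"))    # 11
print(evaluate("10 - 4 - 3"))   # 3, i.e. (10 - 4) - 3
print(evaluate("2 ** 3 ** 2"))  # 512, i.e. 2 ** (3 ** 2)
print(evaluate("0.1 + 0.2"))    # 0.3
```

No precedence table appears anywhere: each function only knows the next tighter level, so the call order itself is the precedence order.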

Alongside the token list, the tokenizer keeps a span list: for each token a `(start_col, end_col, original_text)` triple. Every AST node and every `MathError` carries `position_start` / `position_end`. The payoff: an error does not say "syntax error somewhere", it points at the exact character. This bookkeeping is the reason the engine is debuggable across an API. Via a single setting (`readable_error`), the same position info switches between two contracts: typed exceptions for the library, a visual diagnostic with a `^` pointer under the faulty column for the console.
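A minimal sketch of the span bookkeeping follows. The field names `position_start` and `position_end` come from the article; the class body, the functions `tokenize_with_spans` and `render_diagnostic`, and the choice of half-open column ranges (end column exclusive) are assumptions made for illustration.

```python
import re

class MathError(Exception):
    """Hypothetical stand-in carrying the position fields named in the text."""
    def __init__(self, message, position_start, position_end):
        super().__init__(message)
        self.message = message
        self.position_start = position_start
        self.position_end = position_end

# Deliberately accepts "2.3.4" as one lexeme so the error can span all of it.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)*|\*\*|[-+*/()]")

def tokenize_with_spans(text):
    """Return (tokens, spans); spans[i] is (start_col, end_col, original_text)."""
    tokens, spans = [], []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise MathError("unexpected character", pos, pos + 1)
        value = match.group()
        if value.count(".") > 1:
            raise MathError("more than one '.' in a number", match.start(), match.end())
        tokens.append(value)
        spans.append((match.start(), match.end(), value))
        pos = match.end()
    return tokens, spans

def render_diagnostic(text, error):
    """Console-style output: the input, a caret line, then the message."""
    width = max(1, error.position_end - error.position_start)
    pointer = " " * error.position_start + "^" * width
    return f"{text}\n{pointer}\n{error.message}"

source = "1 + 2.3.4"
try:
    tokenize_with_spans(source)
except MathError as exc:
    print(render_diagnostic(source, exc))
```

The same exception object serves both contracts: a library caller reads the two position attributes, while a console front end passes it to the rendering function.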

The error model is a base class `MathError` plus exactly seven domain subclasses, backed by a catalogue of 78 unique four-digit error codes across nine families. The digits are structured: first digit = family, second = component, the rest = sequence number. Code `3008` therefore means "Calculator family, core parser, more than one '.' in a number". These codes are deliberately never renumbered; they are a contract toward the UI and external log parsers. The public `calculate()` function wraps the whole pipeline in a layered `except` block, so that no raw `ZeroDivisionError` or `ValueError` ever reaches the caller; everything lands typed in the `MathError` hierarchy.
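The code layout and the layered `except` can be sketched as follows. Only `MathError` and the code `3008` come from the article. The subclass `CalculationError`, the helper `split_code`, the function `calculate_sketch` and the two placeholder codes are invented for this illustration and do not correspond to entries in the real catalogue.

```python
from decimal import Decimal

class MathError(Exception):
    """Hypothetical base class: every failure carries a numeric code."""
    def __init__(self, code, message):
        super().__init__(f"[{code}] {message}")
        self.code = code
        self.message = message

class CalculationError(MathError):
    """Hypothetical domain subclass."""

# Placeholder codes for this sketch only; not taken from the real catalogue.
PLACEHOLDER_DIVISION_CODE = 9001
PLACEHOLDER_OPERAND_CODE = 9002

def split_code(code):
    """Split a four-digit code into (family, component, sequence)."""
    text = f"{code:04d}"
    return int(text[0]), int(text[1]), int(text[2:])

def calculate_sketch(numerator, denominator):
    """Divide two decimal strings; only MathError subclasses may escape."""
    try:
        return Decimal(numerator) / Decimal(denominator)
    except MathError:
        raise  # already typed, pass through unchanged
    except ZeroDivisionError as exc:
        raise CalculationError(PLACEHOLDER_DIVISION_CODE, "division by zero") from exc
    except (ArithmeticError, ValueError) as exc:
        raise CalculationError(PLACEHOLDER_OPERAND_CODE, "invalid operand") from exc

print(split_code(3008))  # (3, 0, 8)

for args in (("1", "4"), ("1", "0"), ("abc", "2")):
    try:
        print(calculate_sketch(*args))
    except MathError as exc:
        print(type(exc).__name__, exc.code)
```

The order of the handlers matters: the more specific `ZeroDivisionError` has to come before `ArithmeticError`, its base class, or the division case would be reported with the generic code.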

Two further capabilities sit on the same AST. If an expression contains an `=` and a variable, the engine solves the linear equation symbolically: each node returns a `(factor, constant)` pair, the solver brings both sides into the form `A·x + B = C·x + D` and computes `x`. Non-linearity is caught structurally (variable·variable, variable in the denominator, variable in the exponent), and degenerate cases are named cleanly ("No Solution", "Inf. Solutions"). On top of that comes a programmer's-calculator mode with fixed word width (8/16/32/64 bit), two's complement and bitwise operators, so that `127 + 1` in 8-bit signed mode correctly overflows to `-128`. A prefix-driven output system (`dec:`, `int:`, `hex:` …) determines the Python return type and refuses lossy conversions instead of silently truncating.
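Both mechanisms are small enough to sketch. The code below is an illustration under assumptions: the AST is modelled as nested tuples, and the names `linear_form`, `solve_linear`, `wrap_signed` and `NonLinearError` are hypothetical. It detects non-linearity by inspecting the computed factors, which is a simplification of the structural check the article describes, and it leaves out exponents.

```python
from decimal import Decimal

class NonLinearError(Exception):
    """Hypothetical error for equations that are not linear in x."""

def linear_form(node):
    """Reduce a node to (factor, constant), meaning factor*x + constant."""
    kind = node[0]
    if kind == "num":
        return Decimal(0), Decimal(node[1])
    if kind == "var":
        return Decimal(1), Decimal(0)
    lf, lc = linear_form(node[1])
    rf, rc = linear_form(node[2])
    if kind == "add":
        return lf + rf, lc + rc
    if kind == "sub":
        return lf - rf, lc - rc
    if kind == "mul":
        if lf != 0 and rf != 0:
            raise NonLinearError("variable * variable")
        return lf * rc + rf * lc, lc * rc
    if kind == "div":
        if rf != 0:
            raise NonLinearError("variable in the denominator")
        return lf / rc, lc / rc
    raise ValueError(f"unknown node kind {kind!r}")

def solve_linear(left, right):
    """Solve A*x + B = C*x + D for x."""
    a, b = linear_form(left)
    c, d = linear_form(right)
    if a == c:
        return "Inf. Solutions" if b == d else "No Solution"
    return (d - b) / (a - c)

def wrap_signed(value, bits):
    """Interpret value as a two's-complement integer of the given width."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= 1 << (bits - 1) else value

x = ("var",)
# 2*x + 3 = x + 7
lhs = ("add", ("mul", ("num", "2"), x), ("num", "3"))
rhs = ("add", x, ("num", "7"))
print(solve_linear(lhs, rhs))                          # 4
print(solve_linear(x, ("add", x, ("num", "1"))))       # No Solution
print(wrap_signed(127 + 1, 8))                         # -128
```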

Reliability was not a feature here but the reason for being: a safe engine you cannot trust is useless.

- `assert_error_location(expr, code, start, end)` checks not only that an expression fails, but that it fails with the exact error code at the exact character position; the position data is itself part of the test contract.
- `DOCUMENTATION.md` captures the architecture, the full API, parser internals and the complete error-code catalogue.
- The interactive REPL (`rich`, `prompt_toolkit`) offers persistent history and tab completion.
- Six minor releases (0.1.0 → 0.6.7) in roughly five months, following Semantic Versioning throughout.
- `math-engine`, installable via `pip install math-engine`, MIT-licensed, pure-Python wheel for Python 3.8+, with three console commands out of the box.
- `position_start` / `position_end` on every error.