# KlongPy: PyTorch Backend and Autograd

KlongPy supports multiple array backends. The PyTorch backend enables GPU acceleration and automatic differentiation (autograd) for gradient-based computations. It outperforms NumPy by up to 8x on large arrays and provides exact gradients via the `:>` autograd operator, while the `∇` operator always uses numeric differentiation regardless of backend. Enable it with the `--backend torch` command-line flag or by passing `backend="torch"` when creating a `KlongInterpreter`.

## Enabling the PyTorch Backend

### Command Line

```
# Use the --backend flag
kgpy --backend torch

# With GPU device selection
kgpy --backend torch --device cuda
```

### Programmatically

```python
from klongpy import KlongInterpreter

# Create an interpreter with the torch backend
klong = KlongInterpreter(backend="torch")
print(klong._backend.name)  # 'torch'

# With a specific device
klong = KlongInterpreter(backend="torch", device="cuda")
```

## Backend Comparison

| Feature | NumPy Backend | PyTorch Backend |
|---|---|---|
| Default | Yes | No (use `--backend torch`) |
| Object dtype | Yes | No |
| String operations | Yes | Not supported |
| GPU acceleration | No | Yes (CUDA/MPS) |
| Autograd | Numeric only | Native autograd |
| Small array performance | Faster | Slightly slower |
| Large array performance | Good | Better (especially on GPU) |

## Performance

The torch backend excels with large arrays:

```
Benchmark              NumPy      Torch      Winner
---------------------------------------------------------
vector add 100K        0.04ms     0.08ms     NumPy (2x)
vector add 1M          0.36ms     0.07ms     Torch (5x)
compound expr 1M       0.61ms     0.07ms     Torch (8x)
grade up 100K          0.59ms     0.19ms     Torch (3x)
```

For smaller arrays (around 100K elements or fewer), NumPy can be faster on simple operations such as vector add because of lower dispatch overhead, although torch already wins on grade up at 100K. For larger arrays, torch wins significantly.

## Automatic Differentiation

KlongPy provides several gradient and differentiation operators.

### Typing Special Characters

| Symbol | Name | Mac | Windows |
|---|---|---|---|
| ∇ | Nabla | Character Viewer (Ctrl+Cmd+Space) | Alt+8711 |
| ∂ | Partial | Option + d | Alt+8706 |

On Mac, ∂ can be typed directly with Option + d. For ∇, use the Character Viewer or copy-paste.

### `:>` Autograd Operator (Recommended)

The `:>` operator uses PyTorch autograd for exact gradients:

```
f::{x^2}         :" Define f(x) = x^2"
f:>3             :" Compute f'(3) = 6.0"
```

The syntax is `function:>point` where:

- `function` is a scalar-valued function (it must return a single number)
- `point` is the input at which to compute the gradient

### `∇` Numeric Gradient Operator

The `∇` operator always uses numeric differentiation (finite differences), regardless of backend:

```
f::{x^2}         :" Define f(x) = x^2"
3∇f              :" Compute f'(3) ≈ 6.0"
```

The syntax is `point∇function` (note: reversed order from `:>`).
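To illustrate what "numeric differentiation (finite differences)" means, here is a minimal Python sketch of a central-difference derivative. It is not KlongPy's implementation: the function name `central_difference`, the central scheme, and the step size `h = 1e-5` are assumptions chosen for illustration, since the documentation does not state which scheme or step `∇` uses.

```python
def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with a symmetric finite difference.

    The step size h is an assumed value, not one taken from KlongPy.
    """
    return (f(x + h) - f(x - h)) / (2.0 * h)


def square(x):
    """f(x) = x^2, the same function used in the Klong examples."""
    return x ** 2


approx = central_difference(square, 3.0)
exact = 2 * 3.0  # analytic derivative of x^2 at x = 3

# The approximation lands close to 6.0; the gap is the numeric error.
print(approx, abs(approx - exact))
```

A numeric gradient needs only function evaluations, so it works on any backend, but its result depends on the step size and on floating-point rounding. Autograd avoids the step-size trade-off by differentiating the computation itself.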
### How They Work

| Operator | Method | Precision | Speed |
|---|---|---|---|
| `:>` with torch | PyTorch autograd | Exact | Fast |
| `:>` without torch | Numeric | ~1e-6 error | Slower |
| `∇` (any backend) | Always numeric | ~1e-6 error | Slower |

With the torch backend (`--backend torch` or `backend='torch'`), prefer `:>` for:

- Exact gradients (no finite-difference approximation error)
- Complex computational graphs
- Better performance on large arrays

### Examples

Scalar function:

```
f::{x^3}         :" f(x) = x^3"
f:>2             :" f'(2) = 3*4 = 12.0"
```

Polynomial:

```
p::{((3*x^4)-(2*x^2))+x}   :" p(x) = 3x^4 - 2x^2 + x"
p:>1                       :" p'(1) = 12 - 4 + 1 = 9.0"
```

Vector function (sum of squares):

```
g::{+/x^2}           :" g(x) = sum(x_i^2)"
g:>[1.0 2.0 3.0]     :" [2 4 6] = 2*x"
```

Gradient descent:

```
f::{x^2}
x::5.0
lr::0.1
:" Update rule: x = x - lr * grad"
x::x-(lr*f:>x)
```

### Multi-Parameter Gradients

Compute gradients for multiple parameters simultaneously using a list of symbols:

```
w::2.0
b::3.0
loss::{(w^2)+(b^2)}
:" Compute gradients for both w and b"
grads::loss:>[w b]   :" [4.0 6.0] = [2w 2b]"
```

This is especially useful for neural network training:

```
w::1.0
b::0.0
X::[1 2 3]
Y::[3 5 7]
:" MSE loss"
loss::{(+/(((w*X)+b)-Y)^2)%3}
:" Compute both gradients in one call"
grads::loss:>[w b]
```

### Jacobian Computation

Compute the Jacobian matrix (the matrix of partial derivatives) using the `∂` operator or the `.jacobian` function:

```
f::{x^2}             :" Element-wise square"
:" Using the ∂ operator: point∂function"
[1 2]∂f              :" [[2 0] [0 4]] (diagonal matrix)"
:" Using the .jacobian function"
.jacobian(f;[1 2])   :" Same result"
```

For a vector-valued function f: R^n -> R^m, the Jacobian is an m x n matrix where J[i,j] = df_i/dx_j.

### Multi-Parameter Jacobians

Just like gradients, you can compute Jacobians with respect to multiple parameters using a list of symbols:

```
w::[1.0 2.0]
b::[3.0 4.0]
f::{w^2}             :" Returns [w0^2 w1^2]"
:" Compute Jacobians for both w and b"
jacobians::[w b]∂f   :" Returns [J_w J_b]"
```

This returns a list of Jacobian matrices, one per parameter.
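To make the m x n Jacobian layout concrete, the following plain-Python sketch builds a Jacobian by central differences for the element-wise square at the point `[1, 2]`. It is only an illustration of the definition J[i,j] = df_i/dx_j: the helper `numeric_jacobian` and its step size are hypothetical and are not part of KlongPy, whose `∂` operator may compute the matrix differently.

```python
def numeric_jacobian(f, x, h=1e-5):
    """Return J as m rows of n columns, where J[i][j] approximates df_i/dx_j.

    f maps a list of n floats to a list of m floats. The step size h is
    an assumed value for illustration.
    """
    n = len(x)
    columns = []
    for j in range(n):
        # Perturb only coordinate j, in both directions.
        forward = list(x)
        forward[j] += h
        backward = list(x)
        backward[j] -= h
        f_plus = f(forward)
        f_minus = f(backward)
        columns.append([(p - q) / (2.0 * h) for p, q in zip(f_plus, f_minus)])
    # columns[j][i] holds df_i/dx_j; transpose into row-major J[i][j].
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def elementwise_square(v):
    """Same function as f::{x^2} applied to a vector."""
    return [t ** 2 for t in v]


J = numeric_jacobian(elementwise_square, [1.0, 2.0])
for row in J:
    print([round(entry, 6) for entry in row])
```

Each output component depends only on its own input here, so the off-diagonal entries vanish and the diagonal holds 2*x, matching the diagonal matrix shown for `[1 2]∂f`.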
Multi-parameter Jacobians are useful for analyzing how vector-valued functions depend on multiple parameter sets.

### Custom Optimizers

KlongPy provides the gradient primitives (`:>`, `∂`, `.jacobian`). For optimizers, use the example classes in `examples/autograd/optimizers.py`, which you can copy to your project and customize.

Manual gradient descent (no optimizer needed):

```
w::10.0
loss::{w^2}
lr::0.1
:" Update rule: w = w - lr * gradient"
{w::w-(lr*loss:>w)}'!50
w                :" Close to 0"
```

Using a custom optimizer class:

- Copy `examples/autograd/optimizers.py` to your project directory
- Import with `.pyf`:

```
:" Import the optimizer class"
.pyf("optimizers";"SGDOptimizer")

:" Set up parameters and loss"
w::10.0
loss::{w^2}

:" Create optimizer with learning rate 0.1"
opt::SGDOptimizer(klong;["w"];:{["lr" 0.1]})

:" Run optimization steps"
{opt(loss)}'!50
w                :" Close to 0"
```

Available example optimizers:

- `SGDOptimizer` - Stochastic Gradient Descent with optional momentum
- `AdamOptimizer` - Adam optimizer with adaptive learning rates

SGD with momentum:

```
.pyf("optimizers";"SGDOptimizer")
opt::SGDOptimizer(klong;["w"];:{["lr" 0.01] ["momentum" 0.9]})
```

Adam optimizer:

```
.pyf("optimizers";"AdamOptimizer")
opt::AdamOptimizer(klong;["w" "b"];:{["lr" 0.001]})
```

Training loop example:

```
.pyf("optimizers";"AdamOptimizer")
w::1.0;b::0.0
X::[1 2 3];Y::[3 5 7]
loss::{(+/(((w*X)+b)-Y)^2)%3}
opt::AdamOptimizer(klong;["w" "b"];:{["lr" 0.1]})
:" Train for 500 steps"
{opt(loss)}'!500
```

Creating your own optimizer: the example optimizers use `multi_grad_of_fn` from `klongpy.autograd` to compute gradients for multiple parameters. Copy and modify the optimizer classes to implement custom update rules (RMSprop, AdaGrad, learning rate schedules, etc.).

## GPU Acceleration

When CUDA or Apple MPS is available, tensors automatically use the GPU:

```python
from klongpy import KlongInterpreter

klong = KlongInterpreter(backend='torch')
print(klong._backend.device)  # 'cuda:0', 'mps:0', or 'cpu'
```

### Device Selection

The backend automatically selects the best available device:

1. CUDA (NVIDIA GPU) - if available
2. MPS (Apple Silicon) - if available
3. CPU - fallback

### MPS Limitations

Apple's MPS backend has some limitations:

- No float64 support (uses float32)
- Some operations fall back to CPU

## Mixing with Python

Access torch tensors directly:

```python
from klongpy import KlongInterpreter

klong = KlongInterpreter(backend='torch')

# KlongPy operations return torch tensors
result = klong('2*1+!1000000')
print(type(result))
```
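The device priority described under Device Selection (CUDA, then MPS, then CPU) can be sketched in Python as follows. This is an illustrative sketch, not KlongPy's actual selection code: `pick_device` and `probe_torch` are hypothetical helper names, and the probe relies only on the standard PyTorch checks `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, falling back gracefully when torch is not installed.

```python
def pick_device(cuda_available, mps_available):
    """Apply the documented priority: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def probe_torch():
    """Report (cuda_available, mps_available) for the local machine.

    Returns (False, False) when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return False, False
    cuda = torch.cuda.is_available()
    # Older PyTorch builds may lack the mps backend module entirely.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
    return cuda, mps


cuda_ok, mps_ok = probe_torch()
print(pick_device(cuda_ok, mps_ok))
```

Keeping the priority rule separate from the hardware probe makes the ordering easy to check on a machine with no GPU. Passing `device="cuda"` to `KlongInterpreter`, as shown earlier, overrides automatic selection.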