{"slug": "python-as-a-declarative-programming-language-2017", "title": "Python as a Declarative Programming Language (2017)", "summary": "Python's performance in the benchmarks game is roughly 40 times slower than C or C++, yet it remains the dominant language for data analysis and machine learning because core libraries like NumPy, TensorFlow, and PyTorch offload heavy computation to native extensions. To maximize performance, developers must treat Python as a declarative language, pushing control flow to native layers and using vectorized operations instead of explicit loops. This approach, exemplified by replacing imperative for-loops with NumPy's array-wide functions, yields cleaner, faster code that describes what to compute rather than how to compute it.", "body_md": "If you look at the programming languages benchmarks game, Python is one of the [slowest commonly used\nprogramming languages out\nthere](http://benchmarksgame.alioth.debian.org/u64q/performance.php?test=nbody). Typical programs\nwritten in pure Python average around 40 times slower than the equivalent program written in C or\nC++.\n\nDespite the performance penalty, Python is still probably the most popular language choice out there for doing Data Analysis and Machine Learning. Most of the recent Deep Learning frameworks target Python for development: TensorFlow, Theano, and Keras all use Python. 
Torch was originally written for Lua, which is substantially faster than Python when using LuaJIT - but Torch failed to gain traction until switching to Python with the release of PyTorch.\n\nThe reason for this is that the performance penalty in writing programs in Python isn’t as large as the programming language benchmarks game would suggest: most of the best Python data libraries have their core routines written as native extensions.\n\nThis all means that to get the most out of these libraries, you need to treat Python as a Declarative Language - push as much control flow as possible down to a native layer, and just let the Python program describe what needs to be done.\n\n[Declarative Programming](https://en.wikipedia.org/wiki/Declarative_programming) Languages focus\non describing what should be computed - and avoid\nmentioning how that computation should be performed. In practice this means avoiding expressions of control\nflow: loops and conditional statements are removed and replaced with higher level constructs\nthat describe the logic of what needs to be computed.\n\nThe usual example of a declarative programming language is SQL. It lets you define what data you want computed - and translates that efficiently onto the database schema. This lets you avoid having to specify details of how to execute the query, and instead lets the query optimizer figure out the best index and query plan on a case-by-case basis.\n\nPython isn’t a pure Declarative Language - but the same flexibility that contributes to its sluggish speed can be leveraged to create Domain Specific APIs that use the same principles. 
I thought it would be kind of interesting to look at a couple of specific examples of how this plays out.\n\nThe design of [NumPy](http://www.numpy.org/) has a couple of neat declarative programming features\nthat lead to code that is not only cleaner and easier to understand - but also substantially\nfaster.\n\nA simple example of this might be to apply [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf) weighting to a sparse matrix.\nI originally needed to do this in order to add some [basic nearest neighbour recommendation\ncode](https://github.com/benfred/implicit/pull/14) as a baseline for my [implicit recommendation\nlibrary](https://github.com/benfred/implicit), and I thought it would provide a good example of\nwhat I’m talking about here.\n\nIn an imperative style, TF-IDF weighting on a sparse matrix can be written like:\n\n``` python\nfrom numpy import bincount, log, sqrt, zeros\nfrom scipy.sparse import coo_matrix\n\n\ndef tfidf_imperative(m):\n    # count up occurrences of each column of the sparse matrix\n    # (minlength keeps df long enough when trailing columns are empty)\n    X = coo_matrix(m)\n    df = bincount(X.col, minlength=X.shape[1])\n\n    # calculate inverse-document-frequency = log(N / (1 + df))\n    N = float(X.shape[0])\n    idf = zeros(X.shape[1])\n    for i in range(X.shape[1]):\n        idf[i] = log(N / (1.0 + df[i]))\n\n    # adjust data by TF-IDF weighting\n    X.data = X.data.copy()\n    for i in range(X.nnz):\n        X.data[i] = sqrt(X.data[i]) * idf[X.col[i]]\n\n    return X\n```\n\nThere are two different for loops in that code, each of which can be replaced by a different NumPy construct.\n\nThe first for loop calculates the IDF itself. NumPy lets you do almost all operations on arrays of values as well as on single values,\nand it’s much faster to use the vectorized form: `idf = log(N / (1.0 + bincount(X.col)))`.\n\nThis tells NumPy to loop over the array returned from the `bincount(X.col)` function, and create a new array with the\nIDF value transformed appropriately. 
In effect we’re telling NumPy what result we want and letting\nit figure out how best to calculate it.\n\nThe final TF-IDF weighting can be done in a similar fashion using [NumPy’s array\nindexing](https://docs.scipy.org/doc/numpy/user/basics.indexing.html) feature. This feature lets\nyou use one array as an index into another array. The expression `idf[X.col]` looks up each column index from X in `idf` and returns an\narray with the IDF weight for each of those columns.\n\nPutting it all together leads to code like:\n\n``` python\nfrom numpy import bincount, log, sqrt\nfrom scipy.sparse import coo_matrix\n\n\ndef tfidf_declarative(m):\n    X = coo_matrix(m)\n\n    # calculate IDF\n    N = float(X.shape[0])\n    idf = log(N / (1 + bincount(X.col)))\n\n    # apply TF-IDF adjustment\n    X.data = sqrt(X.data) * idf[X.col]\n    return X\n```\n\nNot only is the declarative version shorter and more readable, it’s also substantially faster. Removing the for loops lets the iteration happen\nin vectorized C calls, which makes this code run **75 times** faster than the imperative version\non my laptop. By avoiding writing any loops, we’ve declared the operations that need to happen\nand let NumPy translate that into control flow.\n\nOne frequently asked question is why [Deep Learning frameworks like TensorFlow are written for\nPython](http://stackoverflow.com/questions/35677724/tensorflow-why-was-python-the-chosen-language)\ninstead of a faster language like C++.\n\nThe answer is that for the most part these frameworks are *written* in a language like C++, but\nprovide a Python API to make them convenient to call. 
The Python code only describes the\ncomputations that need to be performed; all the real work happens in the library, either in C++ or in\nCUDA calls on the GPU.\n\nTake a look at this simple function that uses TensorFlow to do Linear Regression:\n\n``` python\nimport tensorflow as tf  # uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session)\n\n\ndef linear_regression(train_X, train_Y, learn_rate=0.005):\n    # define placeholders for input data\n    X, Y = tf.placeholder(\"float\"), tf.placeholder(\"float\")\n\n    # define the variables we're learning\n    slope, intercept = tf.Variable(0.0), tf.Variable(0.0)\n\n    # learn the slope/intercept on an ordinary least squares loss function\n    loss_function = (Y - (X * slope + intercept)) ** 2\n\n    # find the parameters slope/intercept using a basic GD optimizer\n    train = tf.train.GradientDescentOptimizer(learn_rate).minimize(loss_function)\n\n    with tf.Session() as sess:\n        tf.global_variables_initializer().run()\n\n        # Train the model\n        for _ in range(100):\n            sess.run(train, feed_dict={X: train_X, Y: train_Y})\n\n        return sess.run(slope), sess.run(intercept)\n```\n\nIn terms of line count, most of the function is spent defining the variables to optimize, the linear model that we’re trying to learn, the loss function, and the optimization method used to fit it. None of these calls do any work though - all they do is declare a computation graph of what should be done.\n\nBasically everything here happens in the `sess.run` call on the training operation. 
This call takes the computation\ngraph defined in the `train` variable, binds the placeholder variables X and Y to the training data, and runs the graph to learn the\nregression coefficients.\n\nPython is sort of like glue - it works well for binding different libraries together, but if you try to build a large fast program out of it you end up with a sticky mess that’s difficult to move through quickly.\n\nThe reason it has been so successful with Data Processing and Machine Learning tasks is that many of the libraries have adopted APIs where you declare the operations you want to perform, and the library executes those declarations in an efficient manner in a lower level language. This leads to the best of both worlds: code that’s easy to write in Python and that runs as fast as code written in C++.\n\nUsing an imperative style means that you spend too much time wading through the glue, but declaring what operations you want leads to code that’s efficient and clean.\n\nThe side effect of this is that in order to be a great Python programmer, you have to learn to program in a lower level language too. All of the most popular Python data libraries have native extensions: TensorFlow, scikit-learn, NumPy, Pandas, SciPy, spaCy, etc. all have significant portions of their code written in a native language. 
If you are comfortable just using these libraries, it’s enough to be just a good Python programmer; however, if you want to be the type of programmer that can produce libraries like these, you really should be learning something like C++ or Cython too.\n\nPublished on 28 February 2017", "url": "https://wpnews.pro/news/python-as-a-declarative-programming-language-2017", "canonical_source": "https://www.benfrederickson.com/python-as-a-declarative-programming-language/", "published_at": "2026-05-27 06:53:28+00:00", "updated_at": "2026-05-27 07:27:58.688332+00:00", "lang": "en", "topics": ["machine-learning", "artificial-intelligence", "neural-networks", "ai-tools", "ai-infrastructure"], "entities": ["Python", "TensorFlow", "Theano", "Keras", "Torch", "LuaJIT", "PyTorch", "C"], "alternates": {"html": "https://wpnews.pro/news/python-as-a-declarative-programming-language-2017", "markdown": "https://wpnews.pro/news/python-as-a-declarative-programming-language-2017.md", "text": "https://wpnews.pro/news/python-as-a-declarative-programming-language-2017.txt", "jsonld": "https://wpnews.pro/news/python-as-a-declarative-programming-language-2017.jsonld"}}