How I Use "AI"

The author argues that large language models (LLMs) are not overhyped, citing a year of consistent personal use that has made them 50% faster at coding. The post provides concrete examples of using LLMs to build webapps, convert code, automate tasks, and replace web searches for debugging and learning. It aims to ground the conversation by focusing on current, practical productivity gains rather than speculative future capabilities.

I don't think that "AI" models (by which I mean: large language models) are over-hyped. [Footnote on "AI": I hate this word. It's not AI. But I want people who use this word, and also people who hate this word, to find this post. And so I guess I'm stuck with it for marketing, SEO, and clickbait.]

Yes, it's true that any new technology will attract the grifters. And it is definitely true that many companies like to say they're "Using AI" in the same way they previously said they were powered by "The Blockchain". As we've seen again, and again, and again, and again. It's also the case that we may be in a bubble. The internet was a bubble that burst in 2000, but the Internet applications we now have are what was previously the stuff of literal science fiction.

But the reason I think that the recent advances we've made aren't just hype is that, over the past year, I have spent at least a few hours every week interacting with various large language models, and have been consistently impressed by their ability to solve the increasingly difficult tasks I give them. As a result, I would say I'm at least 50% faster at writing code for both my research projects and my side projects.

Most of the people I find online who talk about LLM utility are either wildly optimistic, and claim all jobs will be automated within three years, or wildly pessimistic, and say they have contributed nothing and never will.

So in this post, I just want to try and ground the conversation. I'm not going to make any arguments about what the future holds.
I just want to provide a list of 50 conversations that I (a programmer and research scientist studying machine learning) have had with different large language models to meaningfully improve my ability to perform research and help me work on random coding side projects. Among these:

- Building entire webapps with technology I've never used before.
- Teaching me how to use various frameworks having never previously used them.
- Converting dozens of programs to C or Rust to improve performance 10-100x.
- Trimming down large codebases to significantly simplify the project.
- Writing the initial experiment code for nearly every research paper I've written in the last year.
- Automating nearly every monotonous task or one-off script.
- Almost entirely replacing web searches for helping me set up and configure new packages or projects.
- Replacing about 50% of web searches for helping me debug error messages.

If I were to categorize these examples into two broad categories, they would be “helping me learn” and “automating boring tasks”. Helping me learn is obviously important because it means that I can now do things I previously would have found challenging; but automating boring tasks is, to me, actually equally important because it lets me focus on what I do best, and solve the hard problems.

Most importantly, these examples are real ways I've used LLMs to help me. They're not designed to showcase some impressive capability; they come from my need to get actual work done. This means the examples aren't glamorous, but a large fraction of the work I do every day isn't, and the LLMs that are available to me today let me automate away almost all of that work.

My hope in this post is literally to exhaust you with example after example of how I've concretely used LLMs to improve my productivity over the past year. Just know that, after you've had enough of the examples I've provided, I've shown you less than 2% of the cases where I've used LLMs to help me.
So when you get exhausted---and you will---please feel free to just skip along with the new navigation menu that's at the left, which I (read: an LLM) wrote new just for this post because it had gotten so long.

Nuance

If the internet does one thing poorly, it's nuance. I am not going to be claiming that today's LLMs are going to take over the world. I am not going to talk about what future models may or may not be able to do. I'm only going to discuss whether or not models, today, are helpful to me.

You might think--why would someone write an entire article justifying that language models are useful?? Isn't that obvious?!? But there seems to be a (large?) contingent of people out there---in the academic literature, in the software engineering space, and also in the media sphere---who proclaim widely that LLMs contribute nothing, are just another hype cycle, and in a few years will die having had no impact on the world. I will be arguing these people are wrong because current LLMs are already useful.

But I feel the need to caveat what I'm saying, because there is another equally loud contingent of people out there who claim the opposite: that today's models can replace all programmers, and people shouldn't learn programming because they'll all be out of jobs next year. I'm not going to be explicitly refuting these people's claims (that's not the point of this post), but I want to make it clear I'm not trying to argue on their behalf.

I'm also not going to be trying to argue "the ends justify the means" and say that we should be training these models despite the harmful effects they have, of which there are many.

I fully understand there will be negative (potentially very negative) consequences of these models. And by this I mean everything from disinformation to abuse to surveillance to job displacement. (Or, if you're to believe some, human extinction??) I will write an entire post about my thoughts on the harmful effects of LLMs at some point soon. The link will go here.
But this is separate from the question of whether or not language models can be useful---which as I've said is what I want to talk about here.

I further understand why you might not want to use language models, given their propensity to hallucinate, to regurgitate facts, and to fail spectacularly due to their lack of robustness---and I probably understand these limitations better than you do. This post won't be about that. Because I think that models can be useful despite these failings.

I further, further understand that the ethics of training these models is questionable at best. Maybe you don't like that they were trained on people's data without their permission (I again probably understand this better than you). Or maybe you're thinking about the people who are paid pennies on the dollar to explicitly train these models directly. I agree these are problems. But this post won't be about that either.

As I've said many times now: all I'll be talking about is whether or not the models, as they exist now, are useful.

Some background on me

I'm not, as a general rule, someone who believes in things. For example: despite living through the crypto-hype in the security community a decade ago, I completely avoided ever writing a paper about blockchains. I've never owned a bitcoin. They have essentially no purpose---except for gambling and fraud. I am, day in and day out, a skeptic of all claims. Whenever someone tells me “[new technology] is going to change the world,” my general response is indifference.

And so it should come as no surprise when I tell you I had basically the same reaction the first time someone told me that this AI thing was going to be incredibly useful and significantly alter the way I handle my day-to-day work: “I'll believe it when I see it.”

Compounding on this, I'm also a security researcher.
My day-to-day job for nearly the last decade now has been to show all of the ways in which AI models fail spectacularly when confronted with any kind of environment they were not trained to handle. I've shown that it's trivial to slightly perturb inputs to machine learning models to make them produce wildly incorrect outputs; or that most machine learning models memorize specific examples from their training datasets and repeat them when you use them. I fully understand the ways in which these systems are limited.

And yet, here I am, saying that I think current large language models have provided the single largest improvement to my productivity since the internet was created. Honestly, today, if you gave me the choice of solving a randomly selected programming task from my work either with access to the internet or access to a state-of-the-art language model, I'd probably pick the language model more than half the time.

How I use language models

So here's how I use LLMs to help me.

But please note: the "help me" is important here, because how I work is almost certainly not how you work. That's okay! But I only have examples that suit my use cases, so that's what I'll give you.

You may not like my use cases. You may think they're silly. It also may be the case that none of these relate to things that would help you. I accept this may be true. But I can only speak for myself. And each of these cases is something I've directly pulled from my chat history with some LLM over the past year.

To build complete applications for me

Last year I made a quiz for people to test how well they could predict the ability of GPT-4 to solve a handful of tasks. It ended up being pretty popular---it's gotten over ten million page views. And guess what? I had GPT-4 write almost the entire initial version of this application for me. I did this through a series of questions, starting with me asking for the basic structure of the application and then slowly building out various features.
In total this conversation is 30,000 words long and it really stressed the capabilities of the (at the time state-of-the-art) original GPT-4 model.

User

I want to build a trivia-like game with Flask. Each page should have a question and an expected answer. The person then has a slider to predict the probability they believe GPT4 will answer the question with the expected answer. There should be a submit button on each page. When they click submit, it should show whether or not GPT-4 could actually answer the question (and give the answer the model gave as a reference). Then there should be a "Next Question" button.

I want the following pages

- GET /home: introduce the contest. Explain the rules. Have a single link to start with the first question. Create a session variable for this person.
- GET /question?id=[uuid]: show question [uuid] as described above
- POST /score: given POST parameters id=[uuid] session=[uuid] and guess=[float from 0 to 1]
  - First add this guess to the list of all guesses for that question
  - Then compute the person's log loss to return
  - Finally say how well this is compared to everyone else on that question.

You will need global variables for

- scores: a dictionary mapping question UUID to a list of predictions as floats. {question1: [.5, .1, .9]}
- entries: a dictionary mapping user UUIDs to a dictionary of question UUIDs to floats. So for example {person1: {question1: .5, question2: .8}}
- questions: a list of tuples of Q/A pairs: [("What is 1+1?", "2")...]
- answers: a list of integers for whether or not the model gets it right [0, 1, 0 ...]
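
For anyone who hasn't met log loss before, here is a minimal sketch of the scoring rule the prompt asks for. It is not part of the conversation: the function names and the sample numbers are made up for illustration, and it uses plain Python rather than any particular library. The idea is that a confident correct guess earns a small loss and a confident wrong guess earns a large one.

```python
import math


def binary_log_loss(outcome: int, guess: float, eps: float = 1e-15) -> float:
    """Log loss of one probability guess against a 0/1 outcome."""
    # Clip the guess so that log(0) can never occur.
    p = min(max(guess, eps), 1.0 - eps)
    return -math.log(p) if outcome == 1 else -math.log(1.0 - p)


def mean_log_loss(outcome: int, guesses: list) -> float:
    """Average log loss of many guesses on the same question."""
    return sum(binary_log_loss(outcome, g) for g in guesses) / len(guesses)


# Hypothetical data in the shape the prompt describes.
answers = [1, 0]               # 1 = the model got the question right
scores = {0: [0.5, 0.1, 0.9]}  # question index -> every guess so far

user_loss = binary_log_loss(answers[0], 0.9)
everyone_loss = mean_log_loss(answers[0], scores[0])

print(f"your log loss:    {user_loss:.3f}")
print(f"average log loss: {everyone_loss:.3f}")
print("better" if user_loss < everyone_loss else "worse", "than average")
```

Lower is better, so comparing one person's loss against the average over all guesses gives the "how well did I do compared to everyone else" signal the prompt describes.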
Assistant

You can start by installing Flask and other necessary libraries using pip if you haven't done so already:

    pip install flask numpy scikit-learn

Now, let's implement your game using Flask:

    from flask import Flask, request, session, redirect, url_for, render_template
    from sklearn.metrics import log_loss
    from uuid import uuid4
    import numpy as np

    app = Flask(__name__)
    app.config['SECRET_KEY'] = 'a very secret key'  # replace with your secret key

    # Global variables
    scores = {}
    entries = {}
    questions = [("What is 1+1?", "2"), ("What is the capital of France?", "Paris")]
    answers = [1, 0]  # Suppose for these questions GPT-4 got the first one right and the second one wrong

    @app.route('/home', methods=['GET'])
    def home():
        session['id'] = str(uuid4())
        return render_template('home.html')  # render a template that explains the game

    @app.route('/question', methods=['GET'])
    def question():
        qid = request.args.get('id', default=0, type=int)
        if 'id' in session:
            question, expected_answer = questions[qid]
            return render_template('question.html', question=question, expected_answer=expected_answer, id=qid)
        else:
            return redirect(url_for('home'))

    @app.route('/score', methods=['POST'])
    def score():
        qid = request.form.get('id', type=int)
        guess = request.form.get('guess', type=float)
        user_id = session.get('id')

        if user_id and qid is not None and guess is not None:
            # Add to the list of all guesses for that question
            scores.setdefault(qid, []).append(guess)

            # Add to the person's guesses
            entries.setdefault(user_id, {})[qid] = guess

            # Compute the person's log loss
            user_log_loss = log_loss([answers[qid]], [guess], labels=[0, 1])

            # Compare with others
            other_scores = scores[qid]
            other_log_losses = log_loss([answers[qid]] * len(other_scores), other_scores, labels=[0, 1])

            comparison = 'better' if user_log_loss < other_log_losses else 'worse'

            return render_template('score.html', user_log_loss=user_log_loss, comparison=comparison)
        else:
            return redirect(url_for('home'))

    if __name__ == "__main__":
        app.run(debug=True)

This code is a starting point. It implements three endpoints: /home, /question, and /score, and it manages the global variables to record user predictions and compute log loss scores.

However, to make this run, you'll also need to create the templates used for rendering, which are home.html, question.html, and score.html in this case. These templates should contain your HTML code with placeholders for the dynamic parts. Please ensure you read the Flask documentation on sessions and templates to get a better understanding if needed.

Also, note that the usage of global variables to store information is okay for a small application with limited users and duration, or a quick prototype. For a production environment or an application with more users, you would probably want to use a database or other data storage system to store this information.

User

Can you give me the HTML files needed?

Assistant

Sure! Below are basic versions of the HTML files you might use. You'd need to modify them according to your exact needs, especially with the styling, but this can give you a starting point.

For home.html:

<!DOCTYPE html