Behind the Magic Trick: A Plain English Guide to How AI Actually Works

The Illusion of Intelligence

If you have ever sat in the audience of a Derren Brown show, you know the feeling. You watch him do something seemingly impossible - like perfectly extracting a deeply hidden thought from a random stranger's head - and your brain immediately leaps to the supernatural. It feels like actual magic. But the secret, which he freely admits, is that it isn't magic at all. It is a meticulously rehearsed combination of psychology, suggestion, statistical probability, and misdirection.

Right now, the tech industry is pulling off the exact same trick on the general public.

When you type a prompt into ChatGPT or Claude and it spits back a perfectly formatted, highly nuanced essay, it feels like there is a tiny, incredibly fast human trapped inside your screen. When you ask an AI to build a complex PowerPoint deck, and a .pptx file pops out moments later, it looks like sorcery.

But just like a stage illusion, when you break down the mechanics of a Large Language Model (LLM), the mysticism evaporates. It leaves behind something arguably more fascinating: very clever maths.

The Spicy Autocomplete

To understand how an LLM works, you have to understand what it actually is. It is not a database of facts. It is not a logical reasoning engine. At its absolute core, a standard LLM is essentially a steroid-injected version of the autocomplete function on your phone’s keyboard.

When an LLM is created (the "training" phase), it is fed an unfathomably large amount of text - essentially the entire public internet, Wikipedia, millions of books, and endless lines of code. It reads all of this data using massive clusters of powerful computers over several months.

But it isn't memorizing this text. It is building a massive, complex statistical map of how human language fits together. It learns the probability of which word (or "token") is most likely to follow the previous ones.

If I say, "The cat sat on the...", your brain likely guesses "mat." The LLM does the same thing, mathematically. It calculates that "mat" has a 95% probability of coming next, "couch" has a 4% chance, and "nuclear reactor" has a 0.0001% chance.

When you ask it a complex question, it isn't "thinking" about the answer. It is calculating the statistically most probable sequence of words that should follow your prompt, one single word at a time, at lightning speed.
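That "spicy autocomplete" idea can be sketched in a few lines. This is a toy, not a real model: the phrase, the candidate words, and the probabilities are made-up stand-ins for the billions of statistical patterns an actual LLM learns during training.

```python
# Toy statistical map: for one context phrase, the probability of each
# possible next word. These numbers are invented for illustration - a
# real LLM learns patterns like this from enormous amounts of text.
NEXT_WORD_PROBS = {
    "the cat sat on the": {
        "mat": 0.95,
        "couch": 0.04,
        "floor": 0.0099,
        "nuclear reactor": 0.0001,
    },
}

def predict_next_word(context):
    """Return the statistically most probable next word for a context."""
    probs = NEXT_WORD_PROBS[context.lower()]
    return max(probs, key=probs.get)

print(predict_next_word("The cat sat on the"))  # mat
```

A real model repeats this step in a loop - append the chosen word to the context, predict again - which is how whole essays emerge one token at a time.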

Thinking Before Speaking: "Reasoning" Models

You might have recently heard the term "reasoning models" thrown around by AI companies. If a standard LLM is just spicy autocomplete, what is a reasoning model?

The core technology is actually exactly the same. The difference is in the workflow.

In his famous book Thinking, Fast and Slow, the Nobel laureate psychologist Daniel Kahneman divided human thought into two distinct categories: System 1 (fast, instinctive, and automatic) and System 2 (slower, deliberative, and logical).

A standard LLM is pure System 1. It blurts out its answer immediately, generating text based on sheer mathematical instinct and pattern recognition.

A reasoning model is designed to mimic System 2. It is explicitly trained to pause and write out a hidden "scratchpad" or "chain of thought" before giving you the final answer.

Imagine asking someone a complex riddle. A standard LLM tries to guess the answer instantly based on its gut feeling. A reasoning model takes out a piece of paper, writes down the clues, tests a few incorrect theories, figures out the logical path, and then speaks the final answer to you. It isn't actually "conscious" reasoning - it is just the model using its autocomplete powers to write out the logical steps first, which dramatically increases the statistical probability that the final answer it gives you will be correct.
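The scratchpad idea can be illustrated with a small sketch. The `<think>` markers and the sample riddle text below are assumptions for illustration only, not any vendor's actual format: the point is simply that the model writes out reasoning first, and the system strips it before showing you the answer.

```python
# Hypothetical raw output from a "reasoning" model: the hidden chain of
# thought sits inside <think> markers, followed by the visible answer.
RAW_OUTPUT = (
    "<think>The riddle mentions keys but no locks, and space but no "
    "rooms. Testing 'piano'... no. Testing 'keyboard'... that fits "
    "every clue.</think>"
    "The answer is a keyboard."
)

def split_reasoning(raw):
    """Separate the hidden scratchpad from the user-facing answer."""
    start = raw.index("<think>") + len("<think>")
    end = raw.index("</think>")
    return raw[start:end], raw[end + len("</think>"):]

scratchpad, answer = split_reasoning(RAW_OUTPUT)
print(answer)  # The answer is a keyboard.
```

The scratchpad is just more autocomplete output; it helps only because writing the logical steps first makes a correct final answer statistically more likely.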

The Digital Waiter and the Calculator

Even with reasoning capabilities, an LLM floating in a digital void has severe limitations. It has no hands. It cannot click a mouse, it cannot open Microsoft Office, and ironically, it is terrible at maths.

Because an LLM predicts language, asking it to multiply 14,892 by 7,431 is risky. It might guess the right numbers, or it might confidently hallucinate a completely wrong number that simply "looks" like a correct answer. And in reality, you wouldn't ask a human to do that multiplication - you'd use a calculator - so engineers created a way for an LLM to use one too.

This is where Tools and Agents come in, relying heavily on something called an API.

An API (Application Programming Interface) is essentially a digital waiter. Imagine you are sitting in a restaurant. You give your order to the waiter, the waiter takes it to the kitchen, the chef cooks it, and the waiter brings your food back. The API is that waiter - it allows two entirely different pieces of software to talk to each other.

A Tool is a standard, old-school software program that can perform a specific task and is accessible via an API.

If we want our AI to be good at maths, we give it a Calculator Tool. Now, when you ask the AI to do that complex multiplication, the system pauses. The AI recognizes it needs to do maths, hands the equation to the "waiter" (the API), the waiter gives it to the standard Calculator program, the Calculator works out the exact, flawless answer, and the waiter brings it back to the AI. The AI then uses that factual number to write out your final response.

The Project Manager and the Workforce

An Agent is a system where the LLM is given autonomy to decide which tools to use and when. While they can act as digital project managers, Agents can be configured in countless ways. Sometimes you have a "Researcher Agent" that just browses the web. Sometimes you have multiple Agents talking to each other - a "Writer Agent" drafts a paragraph, a "Critic Agent" points out flaws, and an "Editor Agent" revises it.

Let's look at that complex PowerPoint example. Suppose you type: "Create a 5-slide presentation about dog walking in London. Use our corporate slide template, and include two generated images."

Here is the step-by-step reality of what the Agentic workflow does:

  1. The To-Do List: The Agent (the coordinating LLM acting as project manager) reads your prompt and figures out the steps required.
  2. Reading the Template: The Agent uses a File Reader Tool to look at your corporate template and understand where text and images go.
  3. Writing the Content: The LLM does what it does best - it predicts the text for the slides.
  4. Ordering the Images: The Agent realizes it needs pictures. It hands the request to an Image Generation Tool. The tool uses an Image Generation Model to create the files and passes them back to the Agent.
  5. Assembly: Finally, the Agent hands the text, the images, and the template rules over to a PowerPoint Builder Tool. This is a traditional piece of software that uses Microsoft's API to stitch everything together into a .pptx file.

The LLM didn't "draw" the slides. It just acted as the brain, deciding what should be done and delegating the manual labour to the right digital workers.
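The workflow above can be sketched as a coordinating loop. Every "tool" here is a stand-in stub - the real ones would be separate programs reached via APIs - so all the function names and return values are illustrative assumptions.

```python
# Stub tools: in a real system each of these is ordinary software
# reached via an API, not part of the language model itself.
def read_template(path):                      # step 2: File Reader Tool
    return {"layout": "title + body + image slots", "source": path}

def write_slide_text(topic, n_slides):        # step 3: the LLM's strength
    return [f"Slide {i + 1}: text about {topic}" for i in range(n_slides)]

def generate_images(prompts):                 # step 4: Image Generation Tool
    return [f"{p.replace(' ', '_')}.png" for p in prompts]

def build_pptx(template, texts, images):      # step 5: PowerPoint Builder Tool
    return {"template": template, "slides": texts, "images": images}

def agent(prompt):
    """The 'project manager': reads the request, plans the steps (step 1),
    then delegates each one to the right tool."""
    template = read_template("corporate_template.pptx")
    texts = write_slide_text("dog walking in London", 5)
    images = generate_images(["dog in a London park", "dog on a leash"])
    return build_pptx(template, texts, images)

deck = agent("Create a 5-slide presentation about dog walking in London.")
print(len(deck["slides"]), len(deck["images"]))  # 5 2
```

Notice the LLM only appears in one step; everything else is delegation, which is exactly why the result looks like sorcery but isn't.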

Why This Matters

The reason it is so important to demystify this isn't to diminish the achievement of the developers who build these models. The engineering required to make a system coordinate all of these tools seamlessly is one of the greatest technological leaps of our lifetime.

But treating AI like magic puts you at a disadvantage. When you understand that an LLM is a text-probability engine that relies on external Tools to interact with the world, its limitations become glaringly obvious. You understand why it sometimes makes up a fact (it just picked a highly probable but incorrect next word), and you understand why it can only perform tasks if it has been given the correct software Tools to do so.

It is a remarkably powerful ecosystem. But once you know how the trick is done, you stop being a passive spectator of the magic, and you start learning how to direct the show yourself.

The Uncomfortable Question

Before we get too arrogant about our own intellectual superiority, there is a lingering philosophical question that comes from spending too much time playing with these tools.

We look down on the LLM because it isn't truly "thinking." We comfort ourselves with the knowledge that it is just using its vast training data to predict the next statistically probable word based on the current context. It's just a statistical illusion.

But... is that really so different from us?

Think about the last time you were at the pub, half listening to a friend tell a story you have already heard three times before. When you nodded, took a sip of your pint, and said, "Yeah, absolutely mate, that is crazy," were you engaging in deep, conscious reasoning? Or were you just running your own internal, biological autocomplete - predicting the exact sequence of acceptable words required to politely maintain the social interaction until it was your turn to speak?

Perhaps the reason AI is getting so incredibly good at sounding human isn't because the machines are secretly becoming conscious. Perhaps it is just that a lot of human existence is far more algorithmic than we would like to admit.

I'll leave you to ponder that one. I am off to have another attempt at manually drawing that cartoon dog.
