Part 2: Neural Network Architecture
Part 2 of the series “Understanding Neural Networks”: the internals of a neural network – how it is structured and how it operates.
⇐ Back to Part 1: AI, Not Just Another Tool
Basic Anatomy of a Neural Network
Imagine a row of input nodes. These represent the data you give the network – a word, a picture, a sound, a list of numbers, a tic-tac-toe board, whatever. And then, you’ve got output nodes – the result the network produces. In between: one or more hidden layers with “neurons” that transform the input data to generate the desired output.
Imagine we want to create a neural network that plays tic-tac-toe. It takes skill and experience to find the right network structure for a given problem. This isn't a particularly good one for learning tic-tac-toe (for several reasons), but it serves the purpose of explaining how a neural network works without getting too complicated:
Each field on the tic-tac-toe board corresponds to an input node. Say we pass the value 0 if the field is empty, 1 if there is already an X, and 2 if there is an O. Given any board configuration, we want the neural network to tell us the next best move. The output node could thus be a value between 1 and 9, representing the field on the board where the neural network proposes to put the next marker.
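To make this concrete, here is a minimal sketch of such a network in Python. It only does the forward pass (no training yet), and the hidden-layer size, the activation functions, and the scaling of the output are my own illustrative assumptions, not part of the example above:

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_hidden, n_outputs = 9, 12, 1  # 9 board fields in, 1 move out

# Weights start out random (more on this in the section on weights below).
W1 = rng.standard_normal((n_inputs, n_hidden))
W2 = rng.standard_normal((n_hidden, n_outputs))

def forward(board):
    """board: 9 values, 0 = empty field, 1 = X, 2 = O."""
    hidden = np.tanh(board @ W1)                 # hidden layer transforms the input
    squashed = 1 / (1 + np.exp(-(hidden @ W2)))  # squash the result into (0, 1)
    return 1 + 8 * squashed[0]                   # map it to a value between 1 and 9

board = np.array([1, 0, 0, 0, 2, 0, 0, 0, 0])  # X in field 1, O in the center
print(forward(board))  # untrained weights, so this proposed "move" is meaningless
```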
Why This Structure? Blame Nature
We didn’t invent this architecture out of thin air. Just as planes mimic birds (at least in their wing structure), neural networks are modeled on how our brains are structured and operate. We saw something amazing in biology and asked, “Can we build a simplified version of that in silicon?”
Turns out, we can. And we have been able to for decades.
In your brain, neurons receive input (what you see, feel, hear), process it through layers of connections, and eventually produce an output (like moving your hand, speaking, or focusing attention).
Synapses connect these neurons. They have a direction: they go from a sending neuron to a receiving neuron. You could say that neurons communicate with each other through these synapses. When one neuron fires, synapses carry the signal – mostly chemical neurotransmitters, sometimes electrical charges – to the receiving neuron, which in turn does the same and sends a new signal to all of its receiving neurons. Different synapses have different strengths and thus amplify or weaken the signal on its way from one neuron to the next. They are like volume knobs, adjusting how loud the signal is played across a connection. This is the "transforming of data" I mentioned earlier.
The layered structure of artificial neural networks is inspired by how the human cortex processes information – passing signals through multiple stages to extract increasingly complex patterns.
Interesting fact: if we didn't have this layering of neurons, we would experience more epileptic seizures – that is what happens when neurons misfire in loops.
Synapses, Weights, and Signal Strength
In our brains, synapses have different strengths. In an artificial neural network, the digital counterparts have weights, and they define how much influence one node has on the next. A strong weight? The signal is passed on forcefully. A weak one? It’s barely noticed.
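In code, a weight is nothing more than a number the signal gets multiplied by on its way to the next node. The values here are made up purely for illustration:

```python
signal = 0.5            # output of the sending node

strong_weight = 2.5     # hypothetical "loud" connection
weak_weight = 0.05      # hypothetical "quiet" connection

print(signal * strong_weight)  # 1.25  -> passed on forcefully
print(signal * weak_weight)    # 0.025 -> barely noticed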
The initial weights are random (unless you start off with a pre-trained network) – which means that, at first, the network produces random garbage as output. You could call it guessing. It’s only through training (which we’ll cover in the next article) that these weights are adjusted until the network gets good at its task.
In summary, artificial neural networks simulate the layered structure of neurons and synapses in our brains.
The Size of Current Neural Networks
Large Language Models (LLMs) like ChatGPT or Claude are essentially souped-up neural networks that have been trained both to be knowledgeable about everything that can be found on the internet (and then some) and to communicate like humans. I might add another article later discussing in more detail how LLMs are structured and trained. For now, it is sufficient to see them simply as neural networks.
You might have heard that the size of LLMs is often measured in the number of parameters they have. These parameters are essentially the weights of the synapses between the neurons in the neural network. There are open-source models for which these numbers are known. Meta's LLaMA 3.1 model, released in July 2024, for instance, comes in versions ranging from 8 billion to 405 billion parameters.
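To connect this back to our toy example: counting parameters is essentially counting weights. Here is that count for the tic-tac-toe sketch from above (weights only; real models also count bias terms and other learned values):

```python
layer_sizes = [9, 12, 1]  # the tic-tac-toe sketch from above

# Each pair of adjacent layers is fully connected, so the number of
# weights between them is the product of the two layer sizes.
n_params = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

print(n_params)  # 120 (LLaMA 3.1 tops out at 405,000,000,000)
```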
The exact number of parameters of proprietary models like ChatGPT and Claude is not officially disclosed. GPT-4 and Claude are both estimated by some experts to have close to a couple of hundred billion parameters. The number of parameters says a lot, but not everything: ChatGPT and Claude, though probably smaller in parameter count, outperform LLaMA 3.1, even in its biggest configuration (405B), in most benchmarks.
In the next article, I'll also be comparing this to the size of human brains.
How Biological and Artificial Neural Networks Learn
Now that we understand the basic structure of neural networks, let's look at how they are trained to solve problems – how they learn to play tic-tac-toe or conjugate verbs. That's probably the most fascinating part, and you'll be surprised how similar it is to the way we humans learn.
⇒ Next up (part 3): how neural networks learn – and how it's uncannily similar to how humans learn.