I’ve long planned to write about the merits of relying on machine learning for optimizing PPC campaigns instead of manual settings. Like almost everything in PPC, the methodology here changes dynamically. By “dynamic”, I mean that what counts as best practice today could become an excellent way to burn money in just three months. Hence, staying up to date with industry changes is crucial.
To see the method’s limitations and possibilities accurately, it’s important to understand what machine learning and A.I. actually do. However, I must confess that my previous sentence was a bit of a lie: you can use A.I. at the highest level of sophistication without understanding what happens under the hood. Still, the topic gives me an excuse to write a separate article about how machine learning algorithms work. I want to, because I find the subject incredibly interesting and worldview-shaping. Besides, I have a habit of trying to explain everything to everyone.
So, allow me to explain in this article, as comprehensively and accessibly as possible, how machine learning works. If you suspect magic, you may be disappointed. Machine learning is fascinating precisely because it is fundamentally simple – it’s just done at enormous scale. By Daniel Stomp, Performance Marketing Expert.
The Perceptron
Many machine learning algorithms and A.I. systems are built on so-called neural networks, which use computers to mimic the interconnected neurons of human and animal brains.
The idea dates back to 1943, when Warren McCulloch and Walter Pitts developed the theoretical basis for neural networks: a simplified mathematical model of how neurons communicate. Building on this model, Frank Rosenblatt devised the algorithm named the “perceptron” and, in 1957, created the first machine that used it. After a learning phase, it could reasonably determine whether drawings on a 20×20 grid were circles or triangles.

Frank Rosenblatt’s neural network marked the beginning of a new era.
From this, it’s a huge leap to ChatGPT writing an 80-page dissertation in Urdu about the presence of Plutarch’s philosophy in Thomas Mann’s “The Magic Mountain” in just a few minutes. Since 1943, the theoretical foundation and computational capacity for running increasingly complex algorithms have advanced significantly. However, the basic principle remains essentially the same.
Perceptron vs. Real brain
Let’s look at how neuron connections work in your brain! Say 5 neurons are connected to a sixth one. Each such connection is called a synapse. Each of the 5 neurons is either activated or not. If activated, it sends a small electrical impulse through its synapse to the sixth neuron. However, synapses transmit impulses with varying strengths. The combined strength of the arriving impulses determines whether the sixth neuron activates: if the total exceeds a certain threshold, the sixth neuron activates and sends an impulse onward; if not, it remains inactive.

The human and animal brains consist of neurons that communicate with each other through synapses.
The perceptron was created to mimic this process, describing mathematically what happens using nothing more than basic arithmetic. Let’s use the same example of 5 neurons connected to a sixth one. If a neuron is active, its value is 1; if inactive, its value is 0. We then assign an activation threshold to the receiving neuron, say 15, and give the strength of each synapse a value between 1 and 10. We sum the products of each neuron’s activation value and its corresponding synapse strength. If the total exceeds the threshold (e.g., 22 > 15), the neuron activates; if not, it stays silent.

An example of how the perceptron works
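If you prefer code to prose, here is the same toy example in a few lines of Python. The activation values, synapse strengths, and threshold are simply the made-up numbers from above:

```python
# A single perceptron step: five input neurons feeding one output neuron.
inputs = [1, 0, 1, 1, 0]   # 1 = the neuron fired, 0 = it stayed silent
weights = [7, 3, 9, 6, 2]  # synapse strengths, each between 1 and 10
threshold = 15             # activation threshold of the receiving neuron

# Sum the products of each activation value and its synapse strength.
total = sum(x * w for x, w in zip(inputs, weights))  # 7 + 9 + 6 = 22

# The receiving neuron fires only if the combined signal exceeds the threshold.
activated = total > threshold
print(total, activated)  # 22 True, since 22 > 15
```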
Turning this into a useful algorithm
To utilize such a system for specific tasks (like recognizing shapes, writing text, or identifying users likely to convert from an online ad), you simply increase the number of neurons and connections. Let’s say you want to create a neural network to recognize 30 types of animals from a 640×480 black-and-white image. You need 640 times 480, so 307,200 input neurons (one for each pixel) that are assigned a value of 1 or 0 depending on whether the pixel is black or white. You also need 30 output neurons, each representing a different animal. When the network processes an image, ideally, the neuron corresponding to the correct animal will activate.
To create a neural network for recognizing 30 types of animals in a 640×480 black-and-white image, you connect each of the 307,200 input neurons to each of the 30 output neurons. This results in 307,200 × 30 = 9,216,000 synapses, and our neural network is complete. Now we set the transmission strength for each of the 9,216,000 synapses randomly between 1 and 10. Similarly, we randomly set the activation threshold for each of the 30 output neurons: some might end up at 10, others at 708,170, and so on (given the large number of input neurons and their multiplied weights, these thresholds can be quite high). This randomness initially produces unpredictable outputs, but it sets the stage for the training process, where adjustments are made to achieve the desired results.

Even simpler neural networks can require millions of connections.
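As a rough sketch in Python with numpy – the sizes come from the example above, everything else is arbitrary – the setup might look like this:

```python
import numpy as np

N_INPUTS = 640 * 480  # 307,200 input neurons, one per pixel
N_OUTPUTS = 30        # one output neuron per animal

rng = np.random.default_rng(seed=0)

# A random synapse strength (1-10) for every input-output pair:
# 307,200 x 30 = 9,216,000 weights in total.
weights = rng.integers(1, 11, size=(N_INPUTS, N_OUTPUTS))

# A random activation threshold for each output neuron. With 307,200
# inputs and weights up to 10, meaningful thresholds can run into the
# hundreds of thousands, hence the wide range.
thresholds = rng.integers(10, 1_000_000, size=N_OUTPUTS)

print(weights.size)  # 9216000
```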
Now, let’s feed a 640×480 black-and-white image of a cat into the network. Each pixel assigns a value of 0 or 1 to one of the 307,200 input neurons, representing white and black respectively. We then perform the calculations described by the perceptron algorithm: we multiply the input values (0 or 1) by the synapse weights (ranging from 1 to 10), sum these products for each of the 30 output neurons, and compare each sum to its activation threshold. If a sum exceeds its threshold, the corresponding output neuron activates; if not, it stays silent. This requires significant computational power: 9,216,000 multiplications, 30 sums of 307,200 terms each, and 30 comparisons. And with the random initial weights, the network might incorrectly identify the image as a walrus, dog, giraffe, and emu at the same time instead of a cat.
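Continuing the sketch (self-contained, with the same random setup repeated; the “image” here is just random noise standing in for a real cat photo):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
weights = rng.integers(1, 11, size=(640 * 480, 30))  # same setup as above
thresholds = rng.integers(10, 1_000_000, size=30)

# A stand-in for the photo: a 480x640 grid of 0s (white) and 1s (black).
image = rng.integers(0, 2, size=(480, 640))
inputs = image.reshape(-1)  # the 307,200 input activations

# All 9,216,000 multiplications and the 30 big sums in one matrix product...
totals = inputs @ weights   # shape (30,)

# ...followed by the 30 comparisons against the activation thresholds.
fired = totals > thresholds
print(fired)  # with random weights: an arbitrary mix of True and False
```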
Enter machine learning
This incorrect result is unsurprising, as the transmission strengths and activation thresholds were set randomly, leading to random outcomes. But we know the image is of a cat. Here’s the key: machine learning. We need to train this neural network. Our task is to adjust the transmission strengths and activation thresholds based on the results so that, after all the multiplications and summations, the cat-neuron activates when analyzing this image.
There is a mathematical method for adjusting the weights and thresholds, which I’ll explain briefly further down for the black-belt math-freaks. Using it, we adjust the weights and thresholds so that the cat-neuron activates correctly for the initial cat image. Next, we show a new cat image, again get an incorrect activation, and adjust the model once more so that it handles this new image without disrupting the previous one. Then we introduce a dog image. Predictably, we won’t get a dog activation, so we adjust our values again. This iterative process continues with each new image, fine-tuning the model across different inputs. Over time, with enough examples and adjustments, the neural network learns to identify the images correctly, and its performance improves significantly.
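To give a feel for this adjust-on-error loop, here is a toy version using the classic perceptron learning rule – the simplest such update method, not the procedure modern networks use – with training patterns invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Tiny training set: each row is an input pattern; each label says whether
# the output neuron should fire (1) or stay silent (0) for that pattern.
patterns = np.array([[1, 0, 1, 1, 0],
                     [0, 1, 0, 0, 1],
                     [1, 1, 1, 0, 0],
                     [0, 0, 0, 1, 1]])
labels = np.array([1, 0, 1, 0])

weights = rng.uniform(1, 10, size=5)  # random starting synapse strengths
threshold = 15.0                      # starting activation threshold
lr = 0.5                              # how strongly each mistake nudges the values

for epoch in range(100):
    mistakes = 0
    for x, target in zip(patterns, labels):
        fired = 1 if x @ weights > threshold else 0
        if fired != target:
            # Strengthen the active synapses after a miss, weaken them after
            # a false alarm, and shift the threshold the opposite way.
            weights += lr * (target - fired) * x
            threshold -= lr * (target - fired)
            mistakes += 1
    if mistakes == 0:
        break  # every training example is now classified correctly

print(weights, threshold)
```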
In theory, if this process is repeated long enough (using thousands of animal images), the neural network’s weights and activation thresholds will adjust in a way that it correctly identifies not only the training images but also new, previously unseen images. However, in practice, there are two main challenges.
Challenges of Machine Learning
One challenge is that reverse-engineering the results to set appropriate weights and thresholds is extremely complex. Fortunately, there is a solution. Although the method requires extensive computation, it is effective, and it amounts to repeating a predefined procedure until the network becomes sufficiently accurate. This process, known as backpropagation, iteratively adjusts the weights and thresholds to minimize errors.
(Let me explain the essence for math-savvy readers – if you’re not one, feel free to skip this part! The calculation is based on the least squares method: we treat the transmission strengths and activation thresholds as variables and form a function from the squared differences between the actual and expected activations, then seek its minimum. Even in our current example, that would mean minimizing a function of a few hundred thousand variables analytically, which would demand far too much computational power. Instead, we use successive approximations, iteratively adjusting the variables in the direction that causes the greatest decrease in the function.)
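Here is that idea of successive approximation on a deliberately tiny problem: two weights stepped downhill along the squared error, instead of hundreds of thousands (all numbers are illustrative):

```python
# Minimize the squared difference between actual and expected output of a
# two-input neuron by repeatedly stepping the weights "downhill".
x1, x2 = 1.0, 1.0  # a single training input
target = 15.0      # the activation sum we want to see
w1, w2 = 2.0, 3.0  # initial guesses for the two weights
lr = 0.05          # size of each approximation step

for step in range(200):
    output = w1 * x1 + w2 * x2
    error = output - target  # the squared error is error**2
    # Partial derivatives of error**2 with respect to w1 and w2:
    grad_w1 = 2 * error * x1
    grad_w2 = 2 * error * x2
    w1 -= lr * grad_w1  # step each weight in the direction that
    w2 -= lr * grad_w2  # shrinks the error fastest

print(w1, w2, w1 * x1 + w2 * x2)  # the sum has crept very close to 15
```

Backpropagation is, in essence, this same downhill stepping carried out simultaneously for every weight and threshold in the network.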
The second difficulty is that even with persistent training, the neural network described above often fails to activate the correct output neuron reliably in real-world scenarios. But to quote a classic: “If nothing else works, a total pig-headed unwillingness to look facts in the face will see us through” /Blackadder, Season 4/. So, we must stubbornly persist, believing it will work with even more effort. This means adding complexity – more layers of neurons – to improve the network’s performance and reliability.

Using hidden layers of neurons is the solution for achieving reliability.
To improve the network’s performance, we introduce several layers of “hidden” neurons between the input and output neurons. For example, we might add 3 hidden layers, each with 300 neurons. This means connecting each of the 307,200 input neurons to all 300 neurons in the first hidden layer, each neuron in the first hidden layer to all 300 neurons in the second hidden layer, and so on, until the third hidden layer is connected to all 30 output neurons. After some calculations, it becomes clear that while analyzing an image with the simple network (without hidden layers) takes roughly twenty million basic operations, adding 3 hidden layers of 300 neurons each increases the load about tenfold, to nearly two hundred million operations. This substantial increase in complexity demands far more computational resources, but it can dramatically improve the network’s performance and accuracy.
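For the curious, a sketch of the same forward pass with the hidden layers added. The layer sizes are the ones from the text; the step activation and the random value ranges are simplifications:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Layer sizes from the text: 307,200 inputs, three hidden layers of 300
# neurons each, and 30 outputs.
sizes = [640 * 480, 300, 300, 300, 30]

# One weight matrix and one threshold vector per pair of adjacent layers.
# (float32 keeps the first, huge matrix at a few hundred MB of memory.)
layers = [(rng.random((n_in, n_out), dtype=np.float32) * 9 + 1,
           rng.random(n_out, dtype=np.float32) * 1000)
          for n_in, n_out in zip(sizes, sizes[1:])]

activations = rng.integers(0, 2, size=sizes[0]).astype(np.float32)  # fake image
multiplications = 0
for weights, thresholds in layers:
    totals = activations @ weights  # multiply and sum, layer by layer
    multiplications += weights.size
    # Step activation: a neuron passes on a 1 if it fires, a 0 otherwise.
    activations = (totals > thresholds).astype(np.float32)

print(multiplications)  # 92,349,000 - about ten times the single-layer count
```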
It’s alive!
The wonder is that this approach starts to work. Training this new system with hundreds of thousands of images results in a neural network that can activate the correct output neuron for new, unseen images about 95% of the time. This success drives the creation of even more complex networks with more layers and neurons. These networks are trained with millions or billions of examples for various specialized tasks, such as predicting stock prices, generating appropriate text responses, or optimizing PPC keyword bids based on historical data.
Machine learning-based artificial intelligence can produce astonishing results. For example, in 2022, a machine learning algorithm was developed that could identify a patient’s race from chest X-rays without having any information about facial structure, and no one knows how it does it. It’s also incredible how A.I. systems like ChatGPT can write dissertations and code or generate photorealistic images from almost any absurd prompt.

This is what one of the many free A.I.s created for me based on the description “a robot is sharing a chocolate with his pet hamster”.
Is it alive?
However, the above lengthy and exhaustive explanation aims to show that these algorithms, despite their complex structures, are merely performing calculations. Once trained, they primarily multiply and add. There’s no magic here. The key to the effectiveness of machine learning lies in complexity and the size of the training dataset. The most advanced algorithms consist of billions of neurons and are trained using terabytes of data. This results in highly sophisticated calculators that, in human terms, lack true thinking and intuition.
With this in mind, whether we’re discussing PPC or other fields, there’s no need to fear that A.I. will take over jobs. A.I. will assist work much as calculators did: handling monotonous tasks and freeing up time for creative activities. Machine learning-based A.I. excels at identifying patterns in complex systems, far surpassing human capability – but only in this respect. Even the simplest calculator can extract the square root of a six-digit number much faster than the world’s best mental calculator, yet it remains just an optimized tool for specific tasks and fundamentally “dumb”. Creativity remains a uniquely human trait.

We still have to wait a while for independent, creative (and malicious) thinking machines.
(There is also a perspective that our “true” intelligence differs from artificial intelligence only in its complexity—merely due to the greater number of unique neurons and connections, but this factor aside, they do the same thing. According to this view, human personality, emotions, thoughts, intuition, and consciousness (or soul, if you will) are just the results of a sophisticated algorithmic system. If you haven’t had an existential crisis today, you’re welcome!)
This explains why Google Ads’ Maximize Conversions automated bidding strategy typically works well only after it has gathered at least a hundred conversions, and why competing with it using manual CPC bidding becomes nearly impossible after a thousand conversions. A neural network operates under the hood of your advertising account, and it needs data to learn; without sufficient data it cannot perform well, because it has no true thinking capability to fall back on.
In my next article, I will discuss the best current strategies for using machine learning in PPC campaigns.