I have been looking forward to writing this one for a long time, and it’s a necessary one, too: we will talk about algorithms and functions. Do you remember, back in school, when nobody could explain to you why on earth you needed to know about that stuff? Or why on earth you needed to sit through maths class at all? If you work in tech in 2021, you may be expected to understand neural networks, and understanding algorithms is a step in the right direction.
Machine Learning – Nothing but Pure Mathematics
How can you teach a machine a task? You have two options. Either you hardcode something, or you enable the machine to learn for itself.
If your problem is not too complex, writing a simple script may suffice. Let’s take an example we can all relate to: We get up early, it’s still dark outside, and we have to leave the house (something we all long for during lockdown times, right?). We have a set of five shirts and five pairs of jeans which we usually mix and match. However, we noticed that we cling to the same combinations, and we want to change that. That’s where our script comes in!
We create a list of our shirts and another list of our pairs of jeans, then have the script pick a random number for each list, and voilà: we mixed and matched without any human interaction!
import random

shirts = ["the red shirt", "the t-shirt", "the band shirt", "the shirt with the cats", "the nice shirt"]
jeans = ["the dark blue", "the light blue", "the black", "the red", "the white"]
a = random.randint(0, 4)
b = random.randint(0, 4)
print("You will wear " + shirts[a] + " and " + jeans[b] + " pair of jeans today. Have fun!")
As the number of shirts and pairs of jeans is limited and we always want to do the same task, there’s no need to overengineer this. Just use this simple script and you’re good! Now, guess what – this is not machine learning. Machine learning uses a different mechanism to get you a result.
The Learning Effect
The system works so well that you buy 30 more shirts and 45 more pairs of jeans (just for the funsies!), and you also add your 20 pairs of socks to the script. For a few weeks, you leave the house happy that you don’t have to think about anything in the morning.
But then it happens: Depending on what you’re wearing, the people you meet on your commute seem to be nicer or less nice! With certain combinations, you even received a free coffee twice! You cannot figure out which combinations you should avoid and which you should maybe wear more often (who doesn’t want free coffee?). With your 50 pairs of jeans, your 35 shirts and your 20 pairs of socks, you have so many possible combinations to go through that it would take you ages to guess the good ones. Maybe you would start your retirement before you had another free coffee!
And this is where machine learning comes into play: For the next few months, you collect daily data on what you wear and how people react. Depending on how thorough the data should be, you can also collect some metadata, like how long you’ve slept, what the weather is like, and which bus or train you caught. After a certain amount of time, you feed this data into an algorithm.
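To get a feeling for "so many possible combinations", here is a quick back-of-the-envelope calculation with the wardrobe numbers from the story. The one-outfit-per-day pace is an assumption made only for this illustration.

```python
# Wardrobe sizes taken from the story above.
shirts = 35
jeans = 50
socks = 20

# Every shirt can be paired with every pair of jeans and every pair of socks.
combinations = shirts * jeans * socks

# Assumption for illustration: you try exactly one new outfit per day.
years_to_try_all = combinations / 365

print(f"{combinations} possible outfits")
print(f"roughly {years_to_try_all:.0f} years to wear each one once")
```

Trying them all by hand is clearly not an option, which is why letting a machine look for patterns in the data is so attractive here.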
For our purposes, an algorithm is nothing more than a mathematical function with adjustable parameters. You start with a basic one that doesn’t know your preferences at all. If you applied this function to your data and asked it to give you the good combinations, its first step would be the same as a human’s first step: guessing.
The algorithm uses your data to check how far its guess is from reality. Depending on the outcome, it applies slight changes to its parameters and then guesses again. Did it perform better? Or worse? Depending on this outcome, the algorithm adapts its parameters again, and then checks again with the data. This process is repeated many, many times. If you collected metadata like the weather and your wake-up time, the machine may use this as well to come closer to the real results. It does so by guessing first, then adjusting the parameters, and then applying these very parameters to all the other data that was collected. The theory behind this is: If the parameters are correct, the function will guess correctly in as many cases as possible. This phase is called the training phase.
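The guess, check, adjust loop described above can be sketched in a few lines. This is only a minimal illustration, not how real machine learning libraries train: it uses simple random search (nudge the parameters, keep the change if the guesses did not get worse), whereas real libraries typically use more refined methods such as gradient descent. The data, the two features and all names are made up for this example.

```python
import random

# Made-up toy data, purely for illustration: each day is
# (wore_band_shirt, wore_red_jeans, got_free_coffee), all 0 or 1.
days = [
    (1, 1, 1), (1, 0, 1), (0, 1, 0), (0, 0, 0),
    (1, 1, 1), (0, 1, 0), (1, 0, 1), (0, 0, 0),
]

def predict(params, shirt, jeans):
    """The 'function': a weighted sum of the inputs, turned into a yes/no guess."""
    w_shirt, w_jeans, bias = params
    return 1 if w_shirt * shirt + w_jeans * jeans + bias > 0 else 0

def count_errors(params, data):
    """How far is the guess from reality? Count the days it got wrong."""
    return sum(predict(params, s, j) != coffee for s, j, coffee in data)

def train(data, steps=1000, seed=0):
    rng = random.Random(seed)
    params = [0.0, 0.0, 0.0]  # start without knowing anything
    errors = count_errors(params, data)
    for _ in range(steps):
        # Apply a slight random change to every parameter ...
        candidate = [p + rng.uniform(-0.1, 0.1) for p in params]
        candidate_errors = count_errors(candidate, data)
        # ... and keep it only if the guesses did not get worse.
        if candidate_errors <= errors:
            params, errors = candidate, candidate_errors
    return params, errors

initial_errors = count_errors([0.0, 0.0, 0.0], days)
params, final_errors = train(days)
print(f"wrong guesses before training: {initial_errors}, after: {final_errors}")
```

Because a change is only kept when it is at least as good as before, the number of wrong guesses can never go up during training. Whether it reaches zero depends on the data and on how many steps you allow.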
Once the algorithm stops getting any more accurate, it moves on to the testing phase. In order to be “accurate”, one would expect (put very simply) that the algorithm performs better than a blind ape that randomly picks shirts and jeans for you. If that is not the case, the training failed completely and has to be redone with modified, or completely different, data. If it has been successful, the algorithm tests its own abilities on some data that was set aside from the training data beforehand. If it performs well on this, too, you can apply it to the real world: Tomorrow, you’ll let that algorithm pick your outfit! This is the application of the new machine learning model.
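Here is a small sketch of what "testing on data that was set aside" can look like. Everything in it is an assumption for illustration: the records are randomly generated, the "model" is just a per-shirt tally of whether coffee was the more common outcome, and random yes/no guessing stands in for the blind ape.

```python
import random

rng = random.Random(42)

# Made-up toy records: (shirt_index, got_free_coffee). In this invented world,
# shirts 0 and 1 usually lead to a free coffee; the others usually do not.
records = []
for _ in range(200):
    shirt = rng.randrange(5)
    lucky = shirt in (0, 1)
    coffee = 1 if rng.random() < (0.9 if lucky else 0.1) else 0
    records.append((shirt, coffee))

# Set part of the data aside before training; the model never sees it.
rng.shuffle(records)
split = int(len(records) * 0.8)
train_set, test_set = records[:split], records[split:]

# "Training": for each shirt, remember whether coffee was the more common outcome.
counts = {}
for shirt, coffee in train_set:
    yes, total = counts.get(shirt, (0, 0))
    counts[shirt] = (yes + coffee, total + 1)
model = {shirt: 1 if 2 * yes > total else 0 for shirt, (yes, total) in counts.items()}

def accuracy(predict, data):
    """Share of records for which the prediction matches what really happened."""
    return sum(predict(shirt) == coffee for shirt, coffee in data) / len(data)

model_accuracy = accuracy(lambda shirt: model.get(shirt, 0), test_set)
# The "blind ape" baseline: guess yes or no at random.
baseline_accuracy = accuracy(lambda shirt: rng.randrange(2), test_set)

print(f"model: {model_accuracy:.2f}, random guessing: {baseline_accuracy:.2f}")
```

The important design choice is the split before training: the accuracy that counts is the one measured on records the model has never seen.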
For the next few weeks, you collect data again, based on the choices the algorithm made for you. Then you can feed this data, along with the answer to the question “Did you achieve the desired result (free coffee)?”, to your algorithm, and it can re-train itself and maybe get a bit more accurate again.
If It’s Not Getting Better
But what if you notice that it makes no difference to the number of free coffees whether the algorithm picks your outfit or you pick it yourself? Well, sometimes data shows a correlation, but no causality. You may have worn a specific outfit on the days you received free coffee, but that does not mean the outfit was the cause. In this case, you should go back to the beginning and make a new guess about the real cause. You think about this a lot, and you’re getting less and less sleep because of it – and the number of coffees increases again. So for this specific case, you now know: people just took pity on you because you looked so tired. They didn’t pay attention to your outfit at all.