Algorithms, machine learning, deep learning: anyone who understands what is behind these buzzwords has a professional advantage. Anyone can learn the basics of such processes – you don’t have to be a maths genius. Tips for getting started with complex digital processes.
Artificial intelligence, AI for short, is a buzzword of our time. But it is more than that, because AI is already being used productively in companies in the form of machine learning (ML) and deep learning (DL); we discuss the difference below. In addition, a number of pilot projects are under way to test the impact of AI on companies. Specialists and managers can either take part or let the development pass them by.
AI has arrived in companies and careers
Even if the technical development is left to professionals with job titles such as Data Scientist or Machine Learning Engineer, many other tasks arise in these projects: project management, testing and validation, and moving the prototype into production systems.
Most start-ups, financial service providers and established eCommerce companies have long been using AI in their mobile apps, shopping systems or the data processing of their online shops. These algorithms classify and predict: they suggest products or routes to customers, for example, or recommend ideal sales prices or social media campaign budgets to retailers. Powerful fraud detection will no longer be possible without AI either, and within a few years AI systems are expected to take over most auditing work.
In industrial companies, AI systems are also used to optimise order intervals in operational purchasing, to enable predictive maintenance of machines, to increase the energy efficiency of production logistics, to monitor the quality of machine output automatically, and generally to detect anomalies in production or other systems.
Specialists and managers with AI skills are needed
Today, the applications mentioned still work with rudimentary AI systems, some developed in-house and some bought in. In the future, these systems will be optimised, replaced by better ones or completely rethought. This requires good planning. In an AI system, the algorithm may be the more scientifically interesting part; in practice, however, the main challenge is integrating the system into the IT infrastructure and the business process.
The behaviour of the system must also be validated before, during and especially after integration. This is not an easy task, as we are dealing with extremely complex systems. Performing these activities confidently and taking responsibility for such projects requires general knowledge of the procedures and concepts behind common AI systems.
Artificial intelligence, machine learning and deep learning
Buzzwords like Big Data or Industry 4.0 are often difficult to categorise. However, AI, ML and DL can be put into a clear context.
Machine learning is a collection of mathematical methods of pattern recognition. For newcomers to the subject, it is useful to know that there are methods that have their origins in stochastics and work with frequencies of occurrence – i.e. probabilities – of certain events in the data and can make predictions based on them. These are also easy to apply when categorical data are to be examined. Conditional probabilities and entropy are important concepts here.
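To make the two concepts concrete, here is a minimal sketch in plain Python. The weather observations are invented purely for illustration; the two functions estimate a conditional probability from frequencies of occurrence and compute the Shannon entropy of the labels.

```python
import math
from collections import Counter

# Hypothetical toy data: (weather, played_outside) observations.
observations = [
    ("sunny", "yes"), ("sunny", "yes"), ("sunny", "no"),
    ("rainy", "no"), ("rainy", "no"), ("rainy", "yes"),
    ("sunny", "yes"), ("rainy", "no"),
]

def conditional_probability(data, outcome, given):
    """Estimate P(outcome | given) from relative frequencies."""
    matching = [label for feature, label in data if feature == given]
    return matching.count(outcome) / len(matching)

def entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# 3 of the 4 sunny observations carry the label "yes".
p_yes_given_sunny = conditional_probability(observations, "yes", "sunny")

# The labels are split evenly between "yes" and "no", the most uncertain case.
label_entropy = entropy([label for _, label in observations])

print(p_yes_given_sunny, label_entropy)
```

Methods such as Naive Bayes and decision trees build on these quantities: the former combines conditional probabilities, the latter looks for splits that reduce entropy.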
There are also procedures rooted in algebra that place the data in a multi-dimensional vector space, weight similarities there, compare data points with each other or split the space into subspaces. These algorithms are generally somewhat more powerful with a lot of numerical data; most of them optimise iteratively and therefore tend to be more computationally intensive. Vectors, matrix calculus, differential calculus and optimisation techniques play an essential role in understanding these learning algorithms.
In both categories, there are simple procedures and also procedures that are less easy to understand. But all these learning algorithms are capable of solving many everyday and also very specific problems. In a developer’s practice, however, problems often arise when there is either too little data or too many dimensions in the data.
This is why the key to the successful development of machine learning algorithms is what is known as feature engineering. Features are those attributes (basically the columns of a table) that appear most promising for the learning procedure; they are selected from the full set of available attributes. Statistical methods are used for this preselection.
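As a small illustration of what similarity in a vector space means, the following sketch computes two common measures in plain Python. The three vectors are made-up values, and these two measures are only examples; the procedures described here may use others.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two points in vector space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical two-dimensional data points.
point_a = [3.0, 4.0]
point_b = [6.0, 8.0]   # same direction as point_a, twice as long
point_c = [4.0, -3.0]  # perpendicular to point_a

distance_ab = euclidean_distance(point_a, point_b)
similarity_ab = cosine_similarity(point_a, point_b)
similarity_ac = cosine_similarity(point_a, point_c)

print(distance_ab, similarity_ab, similarity_ac)
```

The two measures answer different questions: point_a and point_b are far apart in distance, yet they point in exactly the same direction.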
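One simple statistical preselection is to rank the columns by how strongly they correlate with the target value. The sketch below assumes a tiny invented table (column names and numbers are hypothetical) and uses the Pearson correlation coefficient; real projects often combine several such criteria.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical table: every key is one column, i.e. one candidate feature.
columns = {
    "machine_temperature": [60, 65, 70, 75, 80, 85],
    "operator_id": [3, 1, 2, 3, 1, 2],
    "vibration_level": [1.0, 1.4, 1.9, 2.2, 2.4, 3.1],
}
target = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]  # e.g. an observed defect rate

# Score each column by the absolute strength of its correlation with the target.
scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
selected = ranked[:2]  # keep the two most promising columns

print(selected)
```

In this made-up table the operator ID has hardly any linear relationship with the defect rate, so it drops out of the selection.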
Deep Learning (DL) is a sub-discipline of machine learning that uses artificial neural networks. While the ideas for classical machine learning were developed from a certain mathematical logic, artificial neural networks are modelled on nature: on biological neural networks.
Artificial neural networks learn correlations
In artificial neural networks, an input vector (a list of numbers, one per dimension) forms the first layer. Further layers of so-called neurons expand or reduce it, and weights abstract it step by step until an output layer is reached. The result is an output vector that stands for a range of values, for example, or for a certain class, such as dog or cat in a picture. The neurons are like little lights that light up fully, not at all, or somewhere in between. Training adjusts the weights between the neurons so that certain input patterns (for example, photos of pets) reliably produce a certain output pattern of glowing neurons (for example, “The photo shows a cat” or “The photo shows a dog”, where one output neuron stands for the cat and the other for the dog). Ideally, over hundreds or thousands of iterations, training sets the weights so that the correct output neuron lights up in a sufficiently high proportion of cases.
The advantage of artificial neural networks is the very deep abstraction of correlations. This happens over several layers of the network, which allows very specific problems to be solved. This depth is where the name Deep Learning comes from.
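The path from input vector to glowing output neuron can be sketched in a few lines of plain Python. The weights below are hand-picked, hypothetical values (training would normally find them), and the sigmoid activation is just one common choice for making a neuron light up between 0 and 100 per cent.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0..1 (how brightly a neuron glows)."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then activation."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

# Hypothetical weights: 3 inputs -> 2 hidden neurons -> 2 output neurons.
hidden_weights = [[2.0, -1.0, 0.5], [-1.5, 1.0, 1.0]]
hidden_biases = [0.0, 0.1]
output_weights = [[3.0, -3.0], [-3.0, 3.0]]
output_biases = [0.0, 0.0]

input_vector = [0.9, 0.1, 0.4]  # e.g. three measurements taken from a photo
hidden = layer(input_vector, hidden_weights, hidden_biases)
output = layer(hidden, output_weights, output_biases)

classes = ["cat", "dog"]  # one output neuron per class
predicted = classes[output.index(max(output))]

print(output, predicted)
```

Training would repeat this forward pass many times and nudge the weights after each comparison with the correct answer; that part is omitted here.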
However, this is also associated with a disadvantage: A trained neural network with more than two layers becomes so difficult to understand in its mode of operation that it becomes a black box for us humans. Very complex tasks handled by artificial neural networks require the use of 20 or more layers with hundreds to thousands of neurons per layer – impossible to even begin to comprehend.
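A quick calculation shows why such networks are beyond human comprehension. The sketch assumes a plain fully connected network, in which every neuron is linked to every neuron of the next layer and has one bias value; real architectures differ, so the figures are only an order of magnitude.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network."""
    # One weight for every connection between neighbouring layers.
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # One bias for every neuron outside the input layer.
    biases = sum(layer_sizes[1:])
    return weights + biases

small_network = [3, 2, 2]      # 3 inputs, 2 hidden neurons, 2 outputs
deep_network = [1000] * 20     # 20 layers with 1,000 neurons each

small_count = count_parameters(small_network)
deep_count = count_parameters(deep_network)

print(small_count, deep_count)
```

The small network has 14 adjustable values that can still be followed by hand; the deep one has roughly 19 million.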
For the novice, this has the advantage that there is very limited need to deal with the theory and mathematics of Deep Learning, because even experts are essentially limited to configuring the neural network from the outside and then testing it.
Deep Learning comes into play when other machine learning methods reach their limits and also when separate feature engineering has to be dispensed with. The network learns by itself which features play a decisive role for the problem-solving pattern to be trained.
Artificial intelligence (AI) is a scientific field that includes machine learning but covers other areas as well. An artificial intelligence must not only learn, it must also be able to store, classify and retrieve knowledge efficiently.
In practice, AI systems that are (supposed to be) found in robots or autonomous vehicles, for example, are hybrid systems that are equipped with fixed rules in addition to learning procedures. These are to be regarded as the instincts of the system: Living beings also learn not only via neurons, but have innate instincts that cannot be broken – or only with great difficulty.
Can machine learning and deep learning be understood by ordinary people?
Classic machine learning methods are not particularly difficult to learn. The only real prerequisite is a willingness to engage with mathematics or statistics to some extent. But here too, and this is where I dispel a myth, you don’t have to be a maths genius to understand at least the best-known methods; by understanding I mean knowing how they work in the background.
Learning Deep Learning is actually somewhat easier, at least if you can do without the detailed theory. Deep Learning does involve very complex algorithms that can be explained using complicated formulas. But you can get started through practice, where the aim is to improve the performance of the network by more than simply adding or removing layers.
This knowledge is sufficient to understand the principles, strengths and limitations of AI. A data scientist or machine learning engineer, however, may well need to delve much deeper. An algorithm may be quick and easy to train – but increasing its predictive accuracy from 94.3 to 95.6 percent can turn into real science and require a lot of persistence.
Theory, programming and practice
To get started, the classic machine learning methods are best learned in theory: most of them, though not all, can be understood well on paper. No programming skills are needed. It is important to understand the distinction between supervised and unsupervised methods and between parametric and non-parametric models. Naive Bayes and decision trees are suitable as an introduction to machine learning in combination with statistics. As an introduction to machine learning based on algebra, the k-nearest neighbour algorithm and k-means are particularly suitable.
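How little is behind the k-nearest neighbour algorithm can be seen in this minimal sketch in plain Python. The data points and the labels "A" and "B" are invented; the function simply lets the k closest training points vote on the class of a new point.

```python
import math
from collections import Counter

def knn_predict(training_points, training_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(training_points, training_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical labelled training data: two clearly separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3), (5.0, 5.2), (5.3, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

prediction_near_a = knn_predict(points, labels, (1.1, 1.0))
prediction_near_b = knn_predict(points, labels, (5.0, 5.0))

print(prediction_near_a, prediction_near_b)
```

This is a supervised method, because the training points come with known labels; k-means, by contrast, would have to find the two groups without them.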
If the general understanding is not enough and if you have some programming knowledge, you should learn the programming language Python and look at programming examples with the free ML library Scikit-Learn. There are good tutorials on the internet as well as good books in English and also in German. Especially at the programming level, there are also good online courses.
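A first programming example with Scikit-Learn might look like the following sketch. It assumes that scikit-learn is installed and uses the iris flower data set that ships with the library; the choice of a decision tree, the tree depth and the split ratio are arbitrary illustrations rather than recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small example data set bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold back 30 per cent of the data to check the model on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Train a shallow decision tree (supervised learning).
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

# Share of correctly classified test examples.
accuracy = model.score(X_test, y_test)
print(accuracy)
```

Swapping in another classifier from the library usually means changing only the import and the line that creates the model, which makes it easy to compare methods.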
There are also online courses, tutorials and books for beginners in Deep Learning. Here, however, it is recommended to treat the theory of artificial neural networks only superficially and to get started right away in a practical way. There is no getting around the Python programming language and the TensorFlow and Keras libraries. TensorFlow is an open-source library for Deep Learning published by Google; it includes Keras as its high-level interface. In Keras, artificial neural networks can be created and executed with just a few lines of code.
This makes it easy to create multi-layer artificial neural networks, and anyone can try it out. Only larger problems are constrained by the hardware, which is certainly limited for ordinary consumers.
One more caveat: persistence and a knack for self-directed learning are general prerequisites for anyone who cannot familiarise themselves with the subject in a month-long full-time course but has to do it on the side. Please don’t give up: others have also managed it, tested it in practice and then written an article like this one.
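What "a few lines of code" means in Keras is shown by the following sketch. It assumes that TensorFlow and NumPy are installed; the random toy data, the layer sizes and the number of training epochs are hypothetical choices made only to keep the example small.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 200 samples with 4 numeric features each.
# The label is 1 when the sum of the features is positive, otherwise 0.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4)).astype("float32")
labels = (features.sum(axis=1) > 0).astype("int32")

# A network with one hidden layer and one output neuron per class.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Training adjusts the weights over several passes through the data.
model.fit(features, labels, epochs=5, batch_size=32, verbose=0)

# Each row holds the class probabilities for one sample.
probabilities = model.predict(features[:4], verbose=0)
print(probabilities)
```

Adding a layer is a matter of inserting one more `Dense` line, which is exactly the kind of outside configuration and testing that beginners can start with.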