Christopher Seinecke: Marc, in the first part of our interview, we talked about the benefits that using AI in production could bring. We ended up with the keyword “predictive quality”. That sounds very promising. However, the topics of AI and machine learning are a black box for many of us, including myself. These are things I’m not familiar with and find difficult to trust. Why don’t you shed some light on the subject? What is machine learning? What is AI? And what of it would I need for my production, for example?
Dr. Marc Großerüschkamp: Let me try to answer that briefly: An AI, or an artificial intelligence, imitates cognitive human behavior. This means that it can process certain input data, recognize patterns in it, and generate an appropriate response from it, which makes machines intelligent to a certain extent.
However, this intelligence must first be created, and this happens through so-called machine learning, or ML for short. In machine learning, artificial knowledge is generated from experience: the model behind the AI is trained. Trained means that the model is fed with data and the training algorithm maps the patterns contained in this data into the model. The result is a statistical model, which is then checked against separate test data to assess its quality. If the quality is insufficient, further training is required, with more data or with other data sets if necessary.
Christopher Seinecke: How can I imagine this in production?
Dr. Marc Großerüschkamp: Yes, you’re right, that was quite abstract. Let’s try to make it more concrete with an example. Going back to the case of the process engineering manufacturer mentioned in the first part of the interview, data is generated and recorded at various points in the production process. This data could include factors such as temperature, humidity, material properties, and blower usage, all of which can impact the production process. At the end of the day, a part comes off the production line either defective or not.
You now take some of this historical data and make it available to the algorithm. The algorithm will then analyze patterns in this data. For example, it might recognize that faulty production occurred when the temperature and air pressure were in a certain range. After that, the algorithm could calculate something like a 70% probability of defects occurring under the conditions mentioned earlier.
Now, of course, you must first find out whether the predictions are reliable. To do this, you take the other part of the historical data, which the algorithm has not seen, and check whether the predictions match what actually happened. If the predictions are off, you should think about whether you provided the algorithm with good data, or whether the training phase needs to be repeated.
If the predictions are correct, there are two possible courses of action. The first option is for a human to assess whether a probability of 70% warrants an intervention like stopping production or performing maintenance on the machine. The second option is to automate the process so that the machine itself stops production when the probability threshold is met.
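The workflow described in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the manufacturer's actual system: the records, the cut-off values in `condition`, and all function names are invented, and the "model" is a simple defect frequency per operating range rather than a real machine learning algorithm. One half of the history is used for training, the other half for checking the predictions, and a 70% threshold drives the stop decision.

```python
from collections import defaultdict

# Hypothetical historical records: (temperature_c, pressure_bar, defective).
# All values are invented for illustration only.
HISTORY = [
    (82, 1.4, True), (84, 1.5, True), (83, 1.4, True), (81, 1.5, False),
    (70, 1.0, False), (72, 1.1, False), (71, 1.0, False), (69, 1.1, False),
    (83, 1.5, True), (82, 1.4, False), (70, 1.1, False), (84, 1.4, True),
    (71, 1.1, False), (85, 1.5, True), (69, 1.0, False), (72, 1.0, True),
]


def condition(temperature_c, pressure_bar):
    """Map raw readings to a coarse operating range (assumed cut-offs)."""
    return ("hot" if temperature_c >= 80 else "normal",
            "high" if pressure_bar >= 1.3 else "normal")


def train(records):
    """Estimate the defect probability per operating range from past data."""
    counts = defaultdict(lambda: [0, 0])  # range -> [defects, total]
    for temperature_c, pressure_bar, defective in records:
        entry = counts[condition(temperature_c, pressure_bar)]
        entry[0] += int(defective)
        entry[1] += 1
    return {key: defects / total for key, (defects, total) in counts.items()}


def predict(model, temperature_c, pressure_bar):
    """Return the learned defect probability, or 0.0 for unseen ranges."""
    return model.get(condition(temperature_c, pressure_bar), 0.0)


def accuracy(model, records, threshold):
    """Share of records where the thresholded prediction matches the outcome."""
    hits = sum((predict(model, t, p) >= threshold) == d for t, p, d in records)
    return hits / len(records)


THRESHOLD = 0.7  # the 70% probability from the interview

# First half of the history for training, second half held back for testing.
train_set, test_set = HISTORY[:8], HISTORY[8:]
model = train(train_set)
holdout_accuracy = accuracy(model, test_set, THRESHOLD)

# Option two from the interview: the machine stops production by itself.
should_stop = predict(model, 83, 1.5) >= THRESHOLD
print(model, holdout_accuracy, should_stop)
```

For option one, the probability returned by `predict` would simply be shown to a human, who decides whether to intervene.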
Christopher Seinecke: You say that we must think about whether we’ve given the algorithm sufficient data. This is where we get into people’s skepticism towards AI. Can it take on a life of its own and make the wrong decisions? For example, the AI could stop production unnecessarily, which would cost me and my company thousands of euros.
Dr. Marc Großerüschkamp: We have now reached a crucial point. The reality is that every algorithm is only as good as its training, and that depends on the quality of the input data it was fed. It’s like with humans: we can only know what we have learned from somewhere. If I have only learned English, I will not suddenly be able to speak Japanese. The AI doesn’t take its knowledge from an external source, only from the input data.
Additionally, a lot of natural intelligence is required here. Without it, it is not possible to answer these questions: What do I want the AI to predict? Which parameters are important to be able to predict it at all? And what influence or weight do these parameters have in each case?
This may sound like “the more data, the better”, but it is not. It’s more like “the more relevant data, the better”. The algorithm will, of course, also account for irrelevant data. If too much unimportant information is processed, this will have a negative impact on the quality of the predictions.
Christopher Seinecke: And how will I know which data is the right data to use?
Dr. Marc Großerüschkamp: In the end, it’s a combination of expert knowledge and trial and error. The first step is to start with a data set that has been compiled to the best of your expert knowledge and belief, which is no easy task. It is important that various disciplines work together. The production expert who knows the plant and a data science/ML professional should both be involved here. Many misunderstandings can arise between them, so be prepared and be patient. The second step, as I said earlier, is to test with historical data to check the model’s predictions.
It is important to understand that once incorrect data has been trained into the model, it is impossible to get it out again. For example, if a sensor was defective and produced erroneous data that was then fed into the training algorithm, the algorithm will use this data for learning, irretrievably.
In this case it is necessary to build a new model, which is not as complex as it sounds. You simply train the model again and leave out the data that was faulty. That is why it’s critical to always have good data management and data quality control.
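A minimal sketch of this rebuild step, assuming every training record carries the ID of the sensor it came from. The records, the sensor names `T1` and `T2`, and the toy training rule (a temperature cut-off halfway between the two class means) are all invented for illustration. The point is only that the faulty data is filtered out and the model is trained again from scratch, instead of trying to repair the existing model.

```python
from statistics import mean

# Hypothetical training records; sensor "T2" is assumed to have been defective.
RECORDS = [
    {"sensor": "T1", "temperature_c": 81.0, "defective": True},
    {"sensor": "T1", "temperature_c": 83.0, "defective": True},
    {"sensor": "T2", "temperature_c": 999.0, "defective": True},
    {"sensor": "T1", "temperature_c": 70.0, "defective": False},
    {"sensor": "T2", "temperature_c": -40.0, "defective": False},
    {"sensor": "T1", "temperature_c": 72.0, "defective": False},
]


def train(records):
    """Toy training rule: cut-off halfway between the two class means."""
    bad = mean(r["temperature_c"] for r in records if r["defective"])
    good = mean(r["temperature_c"] for r in records if not r["defective"])
    return (bad + good) / 2


def rebuild_without(records, faulty_sensors):
    """Train a fresh model on all records except those from faulty sensors."""
    clean = [r for r in records if r["sensor"] not in faulty_sensors]
    return train(clean)


contaminated_cutoff = train(RECORDS)             # skewed by the faulty readings
clean_cutoff = rebuild_without(RECORDS, {"T2"})  # retrained on clean data only
print(contaminated_cutoff, clean_cutoff)
```

This only works if the origin of each record is tracked, which is one practical reason for the data management mentioned above.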
Christopher Seinecke: Okay, let’s assume that I have successfully trained the algorithm and it is now operational. How can I ensure that the algorithm won’t a) take on a life of its own, b) learn more and become smarter, and c) start making decisions on its own based on its new knowledge that I can no longer supervise or control?
Dr. Marc Großerüschkamp: Yes, that’s important to consider. Many people have the misconception that machines will overpower humans, the way it’s depicted in Hollywood scripts. As I mentioned earlier, an automation decision must be made. You need to decide whether the machine should only predict probabilities or also make decisions and take actions accordingly. It’s always an option to leave decisions and actions to a human. In that case, the AI only provides input as support for humans.
Regarding the algorithm’s ability to continue learning on its own: it doesn’t happen that way. You need to make a clear distinction between the different phases. During development, the model is trained, validated and tested. Once you put it to use, the model no longer changes; it is static. This means that you generate a model with a certain data set up to a certain point, and that model does not continue to learn unintentionally. You can create a new model with a new or extended data set, but you have to consciously go back into a new development phase.
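The separation between the development phase and use can be made explicit in code. The following is a hedged sketch, not a description of any particular ML framework: `DefectModel`, `develop`, and the temperature values are hypothetical. The trained model is an immutable object, so making predictions cannot change it, and new knowledge only enters through a deliberate new call to `develop`, which produces a new model version.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class DefectModel:
    """A trained model; frozen, so it cannot change while it is in use."""
    version: int
    temperature_cutoff_c: float

    def predict(self, temperature_c: float) -> bool:
        """Return True if a defect is predicted for this reading."""
        return temperature_c >= self.temperature_cutoff_c


def develop(version, defect_temps, ok_temps):
    """Development phase (toy rule): cut-off between the two groups."""
    cutoff = (min(defect_temps) + max(ok_temps)) / 2
    return DefectModel(version, cutoff)


# Development phase: train on the data available up to a certain point.
model_v1 = develop(1, [81.0, 83.0], [70.0, 72.0])

# Use phase: predictions read the model but never modify it.
predictions = [model_v1.predict(t) for t in (75.0, 78.0, 90.0)]

# Any attempt to change the deployed model is rejected.
try:
    model_v1.temperature_cutoff_c = 60.0
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

# Learning more requires consciously going back into development.
model_v2 = develop(2, [81.0, 83.0, 77.0], [70.0, 72.0, 74.0])
print(predictions, mutation_blocked, model_v1, model_v2)
```

Keeping `model_v1` and `model_v2` as separate versions also makes it possible to compare them on the same test data before switching over.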
Christopher Seinecke: What I haven’t quite understood yet is why it needs to be machine learning. The training process sounds quite complex. Why not use conventional algorithms?
Dr. Marc Großerüschkamp: That simply depends on the initial situation. A machine learning model is statistics-based, whereas conventional algorithms are knowledge-based.
If I already know exactly which parameters lead to which result under which circumstances, then it makes sense to develop a conventional algorithm. I can then test its performance in the same way.
Often, however, I don’t know the exact parameters, and in such cases, machine learning models prove to be useful. They can recognize and map patterns that we cannot represent so precisely with our knowledge of the process. However, these models are a black box for us, meaning that we cannot explain exactly how they arrive at their results. We input parameters, and the model generates predictions based on its training, which can be quite accurate. In the end, we won’t get an algorithm that we can reproduce step by step. Fortunately, many breakthroughs are happening in the field. There are now so-called “Explainable AI” algorithms. These can help you to better understand the model you want to work with, but that’s a completely different topic.
Christopher Seinecke: Let’s say I’m responsible for production as the CEO of a company, and I realize that a) I have access to a lot of data and b) I want to use predictive quality to minimize errors in production. What do I need to think about specifically, and what steps should I take to achieve my goals?
Dr. Marc Großerüschkamp: These are already very good prerequisites. To maximize the potential of predictive quality, describe very precisely how you will use it. What do you want to recognize, and how exactly? At what point is it economically worthwhile? To answer these questions, you must account for factors like error costs to quantify the benefit of using it. You should also look at your data. You don’t start with the AI training right away. As we have already discussed in detail, it is important to understand the data, prepare it, and assess whether it is the right data and of good quality.
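The question of economic worth can be approached with a simple back-of-the-envelope calculation. The sketch below is one possible way to set it up, not a method taken from the interview: the cost categories, the function name `annual_net_benefit`, and every figure are assumptions chosen purely for illustration. It weighs the error costs avoided by detected defects against the cost of unnecessary stops and the cost of running the system.

```python
def annual_net_benefit(expected_defects_per_year, detection_rate,
                       cost_per_defect, false_alarms_per_year,
                       cost_per_false_alarm, yearly_running_cost):
    """Avoided error costs minus false-alarm and running costs (per year)."""
    avoided = expected_defects_per_year * detection_rate * cost_per_defect
    wasted = false_alarms_per_year * cost_per_false_alarm
    return avoided - wasted - yearly_running_cost


# All numbers below are invented; replace them with your own error costs.
benefit = annual_net_benefit(
    expected_defects_per_year=2_000,  # defective parts without the system
    detection_rate=0.5,               # share of defects the model prevents
    cost_per_defect=50,               # scrap and rework cost per part
    false_alarms_per_year=20,         # unnecessary production stops
    cost_per_false_alarm=2_000,       # cost of one unnecessary stop
    yearly_running_cost=5_000,        # operation and maintenance
)
worthwhile = benefit > 0
print(benefit, worthwhile)
```

With these made-up figures the result is only slightly positive, which shows how strongly the cost of false alarms can affect the business case.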