AI and the chocolate factory

Sorting chocolate bars using Reinforcement Learning

Reinforcement Learning is an approach adopted from Artificial Intelligence (AI), which applies mathematical processes to imitate natural learning. The chocolate factory is an example of how industrial motion control applications can be developed using this procedure, with results that surprise even the experts.

Several conveyor belts transport chocolate bars: They are part of the demonstrator machine that shows how Artificial Intelligence can be used for motion control. What remains to be done in a real factory is to pack the chocolate bars – automated, of course. In this Intelligent Infeed Demonstrator machine from Siemens Digital Industries, the chocolate bars must be placed in evenly spaced slots on the outfeed belt. “The bars are placed on the inlet belt at random intervals,” says Martin Bischoff, Expert in Virtual Mechatronics at Technology, the research division at Siemens. “The system controller achieves the even spacing by altering the speeds of the conveyor belts. A line of three conveyor belts can be accelerated or slowed down to ensure the chocolate is positioned correctly on the outlet belt. The development of an optimized control algorithm for this application is a tricky programming task – if you don’t believe it, just try it yourself. Via Reinforcement Learning, we have trained an Artificial Intelligence controller to realize this task.”


Reinforcement Learning: failure and success

Reinforcement Learning is an Artificial Intelligence concept that works much the way most people learn to ride a bike: by trial and error, without knowing the underlying physics. Learners find out immediately from each attempt whether a technique works, and they gradually get better and better.


“That’s exactly how Reinforcement Learning works,” explains Michel Tokic, a fellow AI expert at Technology and lecturer in Applied Reinforcement Learning at Munich’s Ludwig Maximilian University. “The AI receives a target specification, such as: The chocolate bars must only be placed in the target fields and the system must perform the task as quickly as possible. The AI then makes – at first completely random – control attempts on the simulation model to meet this requirement. Triggered by photoelectric sensor signals, it receives feedback on how good these attempts were. Based on this feedback, a satisfactory solution is gradually developed over a large number of training cycles.” In the chocolate example, about three million training cycles were needed before the AI was able to place the products in the fields correctly.
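The trial-and-error loop described above can be illustrated with a deliberately tiny example. The sketch below is not the Siemens controller: the article does not say which learning algorithm was used, so tabular Q-learning is swapped in here as one common choice. The belt length, slot period, speed levels, reward values and all function names are hypothetical. A single bar travels along one belt, the controller chooses slow or fast at every tick, and the only feedback is a reward at the handover point plus a small time penalty.

```python
import random

# All constants are illustrative assumptions, not values from the demonstrator.
BELT_LENGTH = 6   # cells between the start and the handover point
SLOT_PERIOD = 4   # an empty slot passes the handover point every 4 ticks
SPEEDS = (1, 2)   # cells moved per tick: action 0 = slow, action 1 = fast


def step(state, action):
    """Advance the toy belt by one tick; return (next_state, reward, done)."""
    remaining, phase = state
    remaining = max(0, remaining - SPEEDS[action])
    phase = (phase + 1) % SLOT_PERIOD
    if remaining == 0:
        # Feedback at the handover point, standing in for the sensor signal:
        # +1 if a slot is there right now, -1 if the bar misses it.
        return (remaining, phase), (1.0 if phase == 0 else -1.0), True
    return (remaining, phase), -0.01, False  # small penalty per tick: be quick


def train(episodes=5000, alpha=0.5, gamma=1.0, seed=0):
    """Tabular Q-learning with exploration that starts out completely random."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated return
    actions = range(len(SPEEDS))
    for ep in range(episodes):
        # Exploration rate falls from 1.0 (pure trial and error) to 0.1.
        epsilon = max(0.1, 1.0 - ep / (episodes / 2))
        state = (BELT_LENGTH, rng.randrange(SLOT_PERIOD))
        done = False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(len(SPEEDS))
            else:
                action = max(actions, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q.get((nxt, a), 0.0) for a in actions)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
    return q


def greedy_hits_slot(q, start_phase):
    """Follow the learned values without exploration; True if the bar lands in a slot."""
    state, reward, done = (BELT_LENGTH, start_phase), 0.0, False
    while not done:
        action = max(range(len(SPEEDS)), key=lambda a: q.get((state, a), 0.0))
        state, reward, done = step(state, action)
    return reward > 0
```

Calling `train()` and then `greedy_hits_slot(q, phase)` for each starting phase checks whether the learned behaviour delivers the bar exactly when a slot passes. This toy has only a few dozen states, so a few thousand episodes are enough; the real task, with several belts and many bars, needed about three million training cycles.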


Training on the Digital Twin 

Errors in a motion control application can have expensive and dangerous consequences. That’s why it is common practice to develop and test controllers on digital twins of the machine without any risk (Siemens Virtual Commissioning). This digital twin can be used in the same way to train the AI. 

“After about 72 hours of training with the digital twin on a standard commercial computer (cloud-based computer clusters reduce that to about 24 hours), the AI is ready to control the real plant,” says Bischoff. “In any case, that’s a lot faster than when humans develop these control algorithms. In Erlangen we worked with our colleagues from Siemens Digital Industries to construct the demonstrator that’s controlled by this AI, and it performs exactly as we expected – that’s an important milestone for this project, financed by Siemens Innovationsfund.”

A different approach compared to the engineers

“If you observe the AI-controlled conveyor belts, you’ll see that the AI has found a strategy that involves transporting all the chocolate bars as fast as possible on the first conveyor belts, and only controlling the speed with precision on the last conveyor,” says Thomas Hennefelder, the engineer from Siemens Digital Industries responsible for the machine. “This strategy works well, and it’s interesting that it’s quite different from the one our conventional controller uses.” He sees a lot of potential in letting AI learn complex control tasks independently using the digital twin. “Thanks to this approach, we can now develop application-specific controllers faster and with less effort. In the future, production systems will no longer be limited to tasks for which a control program has already been developed but will be able to perform every task AI is capable of learning.”
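To make the idea of precise control on the last conveyor concrete, here is a hand-written sketch of the arithmetic involved. It is not the learned policy and not the conventional Siemens controller. The function name, its parameters and the assumption of a constant belt speed over the remaining distance are all hypothetical. Given the distance a bar still has to travel, the time until the next free slot reaches the handover point, and the spacing in time between slots, it picks the fastest admissible speed that makes the bar arrive exactly with a slot.

```python
import math


def last_belt_speed(distance, time_to_next_slot, slot_period, v_min, v_max):
    """Return the fastest speed in [v_min, v_max] that lands the bar on a slot.

    Hypothetical helper: assumes a constant speed over `distance`, a
    non-negative `time_to_next_slot`, and slots arriving every `slot_period`.
    Returns None if no admissible speed hits a slot.
    """
    if distance <= 0 or slot_period <= 0 or not 0 < v_min <= v_max:
        raise ValueError("distance, slot_period and speeds must be positive")
    # Even at top speed the bar cannot arrive earlier than this.
    earliest = distance / v_max
    # Index of the first slot that passes the handover point at or after `earliest`.
    k = max(0, math.ceil((earliest - time_to_next_slot) / slot_period))
    arrival = time_to_next_slot + k * slot_period
    speed = distance / arrival
    # Later slots would need even lower speeds, so there is nothing else to try.
    return speed if speed >= v_min else None
```

For example, with 3.0 units left to travel, a top speed of 6.0, the next slot due in 0.25 time units and a slot every 0.5 time units, the first slot is out of reach and the function aims for the second one, which arrives after 0.75 time units.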

Aenne Barnard, July 2021
