Abstract

Standard packaging lines with high output rates often struggle with uncertainty in the condition of the handled materials. This paper focuses on one machine in an automatic packaging line: an automated apparatus that uses suction cups to extract cardboard blanks from a buffer and transfer them to the next section. The success of this operation depends on various controllable parameters, disturbances, and time-dependent variables, whose mutual relationships are not easily identified and whose understanding has so far been entrusted to the experiential knowledge of human operators. Currently, a drop in the picking success rate requires stopping the machine so that operators can intervene on-site, drawing on their expertise to identify the issue and recalibrate it. To address this problem, this paper presents an artificial-intelligence-enabled controller that continuously and autonomously recalibrates the apparatus and compensates for disturbances, so as to avoid missed or incorrectly picked cardboard blanks. Specifically, this work exploits experimental data to build a model of the system, on which a reinforcement-learning algorithm is trained. The controller is tasked with regulating the controllable parameters while monitoring process variables. The resulting agent is tested on the real apparatus to assess its performance.
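The abstract does not detail the surrogate model or the reinforcement-learning algorithm. As a rough illustration of the described setup, the minimal Python sketch below pairs a hypothetical data-driven surrogate of the picking apparatus with a Gym-style interaction loop. The state and action names, the linear dynamics, and the reward are illustrative assumptions, not the authors' model; a trained policy would replace the placeholder action.

    import numpy as np

    class BlankPickingSurrogate:
        """Hypothetical data-driven surrogate of the picking apparatus.

        State: monitored process variables (placeholder names below).
        Action: adjustments to the controllable parameters.
        Reward: +1 for a successful pick, -1 for a missed/incorrect pick.
        The linear dynamics are an illustrative stand-in for a model
        fitted to experimental data, as described in the abstract.
        """

        STATE_DIM = 4   # e.g., vacuum level, blank position, buffer height, cycle time
        ACTION_DIM = 2  # e.g., suction-cup approach offset, dwell time

        def __init__(self, seed=0):
            rng = np.random.default_rng(seed)
            # Placeholder "learned" model parameters.
            self._A = rng.normal(scale=0.1, size=(self.STATE_DIM, self.STATE_DIM))
            self._B = rng.normal(scale=0.1, size=(self.STATE_DIM, self.ACTION_DIM))
            self._rng = rng
            self.state = None

        def reset(self):
            self.state = self._rng.normal(size=self.STATE_DIM)
            return self.state

        def step(self, action):
            # Surrogate dynamics: linear model plus a process disturbance term.
            noise = self._rng.normal(scale=0.05, size=self.STATE_DIM)
            self.state = self._A @ self.state + self._B @ np.asarray(action) + noise
            # Illustrative success criterion: process variables stay near nominal.
            success = np.linalg.norm(self.state) < 1.0
            reward = 1.0 if success else -1.0
            return self.state, reward, False, {}

    env = BlankPickingSurrogate()
    state = env.reset()
    episode_return = 0.0
    for _ in range(100):
        action = np.zeros(env.ACTION_DIM)  # replace with the trained policy's output
        state, reward, done, _ = env.step(action)
        episode_return += reward
    print(f"episode return: {episode_return}")

In this arrangement, the agent is trained entirely against the surrogate, so the real apparatus is never stopped during learning; only the resulting policy is deployed and evaluated on the physical machine.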
