Case study in the context of Pharma 4.0: binary pass / fail classification of pharmaceutical product batches based on raw material batch properties

School of Business | Bachelor's thesis
Date
2020
Degree programme
Tieto- ja palvelujohtaminen
Language
en
Pages
32
Abstract
“Pharma 4.0” (the implementation of Industry 4.0 concepts in pharmaceutical manufacturing) could help overcome barriers faced by the pharmaceutical industry. Developing new pharmaceuticals is expensive, and routine manufacturing operations, which typically involve batch processing instead of continuous processing, are often inefficient. “Pharma 4.0” could make the industry more efficient by automating decisions and reducing human intervention. Regulators such as the FDA also seem to encourage the pharmaceutical industry to adopt emerging technologies. In this bachelor’s thesis I conducted a case study in which a classification algorithm was trained to predict the success of pharmaceutical product batches based on the raw material batch combinations used in manufacturing each product batch. It was known beforehand that raw materials play a major role in the quality of the case product. The pass / fail classification was based on a single critical quality attribute of the product but could be extended to other attributes. The data set was rather small in a machine learning context, consisting of slightly over 350 batches, and imbalanced, as the majority of the batches had passed. However, training and cross-validating the classifier with approximately 70 % of the batches and testing with the remaining 30 % led to quite good results in terms of consistency between cross-validation and testing, high precision and sufficiently high recall. Performance was evaluated primarily with precision-recall curves but also with receiver operating characteristic (ROC) curves. Support Vector Machine (SVM), Naive Bayes and Random Forest algorithms gave the best results under the above-mentioned criteria; however, due to the limitations of the data set, no final conclusions are drawn about which algorithm is superior for this purpose. Instead, this study was a proof of concept that encourages the development of such a raw material selection tool in the company.
This should be relatively straightforward, as I focused here on data available from a single data source (ERP). A suitable next step following this case study could therefore be the implementation of the raw material selection tool for production planning purposes: when the product is to be manufactured, the classifier retrieves all available batches of the raw materials to be used and predicts the probability of “pass” for each possible raw material combination based on the raw material batch attributes. The classifier could then suggest to the production planner the raw material batch combination with the highest probability of success. In the long run, if the model proves useful in real use, the tool could reduce the workload of production planners and of the process and / or material experts they may occasionally need to consult when selecting the optimal raw material batch combination.
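The selection step proposed above can be illustrated with a minimal sketch. It is not the implementation from the thesis: the batch identifiers, attribute names, values and the `toy_pass_probability` function are hypothetical stand-ins, and in practice the scoring function would be a trained classifier that returns a pass probability for a feature vector.

```python
from itertools import product
from math import exp

# Hypothetical available batches per raw material: (batch id, attributes).
# Names and values are invented for illustration only.
available = {
    "api": [("A1", {"purity": 0.2}), ("A2", {"purity": 0.9})],
    "excipient": [("E1", {"moisture": -0.5}), ("E2", {"moisture": -0.1})],
}


def toy_pass_probability(features):
    """Stand-in for a trained classifier: logistic function of the feature sum."""
    return 1.0 / (1.0 + exp(-sum(features)))


def rank_combinations(available, predict_pass_probability):
    """Score every raw material batch combination, best first."""
    materials = sorted(available)  # fixed material order keeps features aligned
    ranked = []
    for combo in product(*(available[m] for m in materials)):
        batch_ids = tuple(batch_id for batch_id, _ in combo)
        # Flatten the attributes of all chosen batches into one feature vector.
        features = [
            value
            for _, attrs in combo
            for _, value in sorted(attrs.items())
        ]
        ranked.append((predict_pass_probability(features), batch_ids))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


ranked = rank_combinations(available, toy_pass_probability)
best_probability, best_batches = ranked[0]
print(best_batches, round(best_probability, 3))
```

The number of combinations grows multiplicatively with the number of available batches per raw material, so an exhaustive enumeration like this is only practical when each raw material has a modest number of candidate batches.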
Thesis advisor
Upreti, Bikesh
Keywords
Pharma 4.0, Industry 4.0, pharmaceutical manufacturing, binary classification, machine learning, data science