Machine learning for automated gating of flow cytometry data

Suffian, Muhammad; Montagna, Sara; Bogliolo, Alessandro; Ortolani, Claudio; Papa, Stefano; D'Atri, Mario

Manual gating is the traditional procedure for identifying cellular clusters in the multi-dimensional datasets generated by flow cytometry, a technique that detects and monitors diseases by acquiring single-cell features. However, identifying cellular subpopulations by manual gating is time-consuming and depends strongly on the operator's expertise. Automated analysis supported by computational methods, such as machine learning, can radically change the way flow cytometry data are processed. In this paper we apply a suite of machine learning classifiers to samples of peripheral blood acquired with flow cytometry, with the goal of identifying the CD4+ lymphocyte population. Four ML classifiers are examined (Support Vector Machine, Random Forest, Multilayer Perceptron and Logistic Regression) using stratified 10-fold cross-validation. All four models perform very well, with a balanced accuracy score > 0.945. We conclude that all four algorithms classify the events of interest with promising results, paving the way for further investigations.
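
To make the evaluation protocol concrete, the following is a minimal Python sketch of the two ideas named above: stratified fold assignment and balanced accuracy (the mean of per-class recall). It is not the authors' pipeline. The function names `balanced_accuracy` and `stratified_folds`, the toy label vector, and the 10% positive rate are all assumptions made for illustration only.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def stratified_folds(labels, k=10):
    """Deal the indices of each class round-robin into k folds,
    so every fold keeps roughly the overall class proportions."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


# Hypothetical toy labels (assumption): 1 = CD4+ event, 0 = any other event.
labels = [1] * 20 + [0] * 180
folds = stratified_folds(labels, k=10)

# A trivial baseline that always predicts the majority class.
always_negative = [0] * len(labels)
plain_accuracy = sum(t == p for t, p in zip(labels, always_negative)) / len(labels)
balanced = balanced_accuracy(labels, always_negative)

print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {balanced:.3f}")
```

On this toy example the majority-class baseline reaches a high plain accuracy while its balanced accuracy stays at chance level, which is one reason balanced accuracy is a sensible metric when the population of interest is a minority of the acquired events. Stratification serves a related purpose: each fold contains events from both classes in similar proportions, so per-fold scores remain comparable.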