Machine learning for automated gating of flow cytometry data

Muhammad Suffian; Sara Montagna; Alessandro Bogliolo; Claudio Ortolani; Stefano Papa; Mario D'Atri
2022

Abstract

Manual gating is the traditional procedure for identifying cellular clusters in multi-dimensional datasets generated by flow cytometry, a technique for detecting and monitoring diseases by measuring features of single cells. However, identifying cellular subpopulations by manual gating is time-consuming and depends strongly on the operator's expertise. Automated analysis supported by computational methods, such as machine learning, can radically change the way flow cytometry data are processed. In this paper we apply a suite of machine learning classifiers to peripheral blood samples acquired with flow cytometry. The goal is to identify the CD4+ lymphocyte population. Four ML classifiers are examined (Support Vector Machine, Random Forest, Multilayer Perceptron, and Logistic Regression) using stratified 10-fold cross-validation. All four models perform very well, with a balanced accuracy score > 0.945. We conclude that all four algorithms classify the events of interest with promising results, paving the way for further investigation.
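
The evaluation protocol named in the abstract (stratified folds scored by balanced accuracy) can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the helper names `stratified_folds` and `balanced_accuracy`, the toy labels, and the 20/80 class split are assumptions made for illustration only. In practice a library such as scikit-learn offers `StratifiedKFold` and `balanced_accuracy_score` for the same purpose.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign each sample index to one of k folds, keeping class proportions similar."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the indices of each class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Toy labels (hypothetical): 1 = CD4+ event, 0 = any other event.
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=10)

# A trivial predictor that labels every event as "other".
always_other = [0] * len(labels)
plain_accuracy = sum(t == p for t, p in zip(labels, always_other)) / len(labels)

print(plain_accuracy)                           # 0.8
print(balanced_accuracy(labels, always_other))  # 0.5
```

On imbalanced data the trivial predictor reaches a plain accuracy of 0.8 but a balanced accuracy of only 0.5, which is why balanced accuracy is a reasonable metric when the population of interest is a minority of the acquired events.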
Files in this record:
paper5.pdf (open access)
Type: Publisher's version
License: Creative Commons
Size: 1.66 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11576/2707490