Analysis and forecasting of the structure of marine phytoplankton assemblages using innovative molecular techniques of NGS (Next Generation Sequencing) and Machine Learning

Valbi, Eleonora
2020

Abstract

The work carried out during this three-year Ph.D. pursued two objectives: the development of a predictive model for the occurrence of the toxic algal species Alexandrium minutum, and the study, through data analysis techniques, of phytoplankton biodiversity at global and local scales, with particular interest in the European area. For the first objective, a predictive model was developed using the Random Forest technique. The model was supplied with A. minutum occurrence data, detected by PCR on water samples taken in the NE Adriatic Sea, together with data on the predictor variables. Specifically, two models were developed: one using 18 predictor variables, including the values of different nutrients, and one using only 12 of the 18 variables, excluding the nutrient values, so that the model can be used more easily in the future without the need for laboratory analysis. The results show that both models have good reliability and that the 12-variable model performs as well as the 18-variable one, so it can be used without the risk of losing essential information. A paper presenting these results has been published in the journal "Scientific Reports". For the second objective, we examined data from the Ocean Sampling Day (OSD) campaign, an international project in which partners all over the world collected seawater samples on the day of the summer solstice. The samples were analyzed with metagenomic techniques, which led to the identification, at the genus level, of the organisms present in the samples.
After a first exploratory analysis, in which the data were analyzed using Principal Coordinates Analysis, starting from a matrix of Jaccard coefficients between the stations, the dataset was divided into clusters by means of a space-constrained clustering, constrained by a matrix of geographical connections obtained from a Gabriel graph. Subsequently, we focused on the two Longhurst ecoregions with the greatest number of samples, namely the Mediterranean Sea and the NE Atlantic Shelves Province, looking for associations between taxa. Two association matrices, one for each province, were created using the Fager & McGowan coefficient and then compared using the Mantel test. The test found a certain correlation between the two matrices. The PROTEST performed on the two ordinations derived from the two matrices gave a result along the same lines. This suggests that associations between taxa, rather than being linked to their geographical position, depend on other factors, such as the physical characteristics typical of each taxon. A manuscript presenting this work and its results is currently being drafted.
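
As a rough illustration of the matrix comparison described in the abstract, the following Python sketch builds two taxon-by-taxon similarity matrices and compares them with a permutation-based Mantel test. It is a minimal sketch under stated assumptions, not the procedure used in the thesis: the Jaccard coefficient stands in for the Fager & McGowan coefficient, the occurrence sets are invented toy data, and all function and variable names are hypothetical.

```python
import math
import random


def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def similarity_matrix(items):
    """Symmetric matrix of pairwise Jaccard similarities."""
    n = len(items)
    return [[jaccard(items[i], items[j]) for j in range(n)] for i in range(n)]


def upper_triangle(m, order):
    """Upper-triangle values of m after reordering rows and columns."""
    n = len(order)
    return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(m1, m2, permutations=999, seed=0):
    """One-sided Mantel test: returns (observed r, permutation p-value)."""
    rng = random.Random(seed)
    identity = list(range(len(m1)))
    x = upper_triangle(m1, identity)
    r_obs = pearson(x, upper_triangle(m2, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute rows and columns of m2 together
        if pearson(x, upper_triangle(m2, order)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy data (invented): for each of five taxa, the samples in which it occurs,
# listed in the same taxon order for the two regions.
taxa_med = [{"s1", "s2", "s3"}, {"s1", "s2"}, {"s3", "s4"}, {"s4", "s5"}, {"s1", "s5"}]
taxa_atl = [{"a1", "a2"}, {"a1", "a2", "a3"}, {"a3", "a4"}, {"a4"}, {"a2", "a5"}]

m_med = similarity_matrix(taxa_med)
m_atl = similarity_matrix(taxa_atl)
r, p = mantel(m_med, m_atl)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Permuting rows and columns together keeps each matrix internally consistent while breaking the pairing between the two, which is why the test suits dependent pairwise values. With real data, the taxa must be listed in the same order in both matrices.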
Use this identifier to cite or link to this document: https://hdl.handle.net/11576/2673494