Sepideh Saran

Interpretability and Uncertainty Quantification of Machine Learning Models in Biological Applications (2020 - )

Advances in experimental methods in Biology and the falling cost of high-throughput experiments have produced a vast pool of datasets covering many types of measurements. These datasets offer insight into different layers of the biological system, including the genome, epigenome, and transcriptome. Machine Learning methods can exploit these datasets to study the underlying biological processes, disentangle their causal relationships, and shape new research questions in Biology.

Neural Networks have become the state-of-the-art methods for identifying functional elements in the genome. However, before these models can be used in critical downstream tasks, it is essential to be able to explain their decisions and to provide a measure of confidence in their outputs. Experimental noise, incorrect dataset labels, out-of-distribution samples, class imbalance, and the presence of multiple motifs (i.e., a multi-label setting) are major sources of uncertainty in computational models in Biology. Providing uncertainty measurements together with model interpretation therefore enhances the credibility of the proposed Machine Learning solution and helps clinicians in the subsequent decision-making process.

This project investigates the predictive uncertainty and interpretability of Machine Learning methods in various genomic applications. We focus on tailoring our solutions to the limitations of biological datasets, while also addressing the interpretability of the results, model performance, and reusability.

Peer-reviewed Publications (journal or conference)

  1. P. Rautenstrauch, A.H.C. Vlot, S. Saran, and U. Ohler (2021). Intricacies of single-cell multi-omics data integration. Trends in Genetics. https://doi.org/10.1016/j.tig.2021.08.012

Other (presentations at conferences or preprints)

  1. S. Saran, M. Ghanbari, and U. Ohler. An empirical analysis of uncertainty estimation in genomics applications (workshop paper). Bayesian Deep Learning Workshop, NeurIPS 2021, Online, 14 December 2021.
  2. S. Saran, M. Ghanbari, and U. Ohler. Similarity neural networks for RBP binding site detection (workshop poster). Learning Meaningful Representations of Life Workshop, NeurIPS 2021, Online, 14 December 2021.