Research

Members of the Sindi Lab come from a variety of disciplines, but are united by their common desire to use mathematical and statistical tools to gain insights into complex biological processes.

Below is an overview of some of our current research topics.

Mathematical Modeling of Complex Biological Systems


Multi-Scale Modeling of Prion Aggregation and Dynamics in Yeast

Postdoctoral Researcher: Mikahl Banwarth-Kuhn

Graduate Students: Jordan Collignon, Ali Heydari, Paul Lemarre, Thomas de Mondesir Pierron, Fabian Santiago

Collaborators: Dr. Laurent Pujo-Menjouet (Lyon), Dr. Tricia R. Serio (U Mass Amherst), Dr. Maxime Theillard (UC Merced)

Prions are responsible for a host of fatal, mammalian diseases – including notably bovine spongiform encephalopathy (mad cow disease), fatal familial insomnia, and Creutzfeldt-Jakob disease. These diseases arise when a misfolded (prion) form of a protein appears and aggregates. Aggregates of the misfolded form act as templates to convert the normally folded protein to its misfolded state. Fragmentation of prion aggregates amplifies the number of templates facilitating the spread of the disease. Beyond prions, these linear aggregates (amyloids) formed of other proteins are associated with over 20 non-transmissible neurodegenerative diseases such as Alzheimer's and Parkinson's disease. Yeast has emerged as an ideal model system for studying prions, and the more general protein misfolding process, because there are many distinct non-fatal prion proteins whose dynamics can be studied in living cells, and prion phenotypes appear quickly. A major challenge in prion biology has been linking experimental observables (often of populations of cells) with basic processes (occurring inside individual cells). The major focus of the Sindi Lab has been to address gaps in biological knowledge by developing mathematical models that depict both single cell and population processes and, through doing so, generate novel hypotheses for biologists to explore experimentally.
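The two processes described above, templated conversion and fragmentation, can be illustrated with a deliberately small toy model. The sketch below is not one of the lab's published models: the three-variable closure, the parameter values, and the function name `simulate_aggregation` are all assumptions chosen for illustration. It tracks folded protein `x`, the number of aggregates `y`, and the protein mass held in aggregates `z`, and assumes that every bond inside an aggregate can break at the same rate.

```python
def simulate_aggregation(x0=1.0, y0=0.001, z0=0.01, beta=5.0, b=0.2,
                         dt=0.001, steps=5000):
    """Forward-Euler integration of a toy prion aggregation model.

    x: concentration of normally folded protein
    y: number concentration of aggregates (each one acts as a template)
    z: concentration of protein held inside aggregates

    All parameter values are illustrative assumptions, not fitted rates.
    """
    x, y, z = x0, y0, z0
    history = [(0.0, x, y, z)]
    for i in range(1, steps + 1):
        # Templated conversion: aggregates recruit folded protein.
        conversion = beta * x * y
        # Fragmentation: aggregates contain (z - y) bonds in total, and
        # breaking any one bond creates one additional aggregate.
        fragmentation = b * (z - y)
        x, y, z = (x - dt * conversion,
                   y + dt * fragmentation,
                   z + dt * conversion)
        history.append((i * dt, x, y, z))
    return history


if __name__ == "__main__":
    t, x, y, z = simulate_aggregation()[-1]
    print(f"t={t:.1f}  folded={x:.3f}  aggregates={y:.3f}  aggregated mass={z:.3f}")
```

In this sketch fragmentation changes only the number of templates, not the total amount of protein, so `x + z` stays constant while `y` grows. That is the sense in which fragmentation amplifies spread: more templates means faster conversion of the remaining folded protein.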

This research combines multi-scale modeling, intracellular signaling, mechanobiology, scientific computing, uncertainty quantification and sensitivity analysis to develop computational tools that integrate biological data from experiments and produce predictive simulations that motivate the development of future experiments.

As part of this work, we have a number of ongoing projects:

  • We are developing methods to infer biochemical parameters relevant to yeast prion aggregation directly from in vivo data. These approaches involve both developing multi-scale structured models of aggregation within dividing yeast cells as well as an inverse problem formulation to determine critical biochemical rates from propagon recovery assays.

  • We are developing a cell-based model to investigate the spread of prion disease dynamics within an actively growing and dividing yeast colony. Most of the current mathematical models developed for studying prion disease dynamics only focus on isolated prion aggregate dynamics. However, in a population of living cells, different cell behaviors such as growth, diffusion, and division are known to impact the abundances and concentrations of reactants and could have a large impact on protein aggregation or more specifically propagation of prion aggregates throughout the colony. Developing a cell-based model that incorporates both intracellular dynamics and cell behaviors affecting protein aggregation will provide a novel tool to test hypotheses about mechanisms that can explain unresolved experimental data and yield new strategies for treating protein misfolding diseases.

  • We are studying how the processes of cell growth and asymmetric cell division interact with the ongoing intracellular processes of protein aggregation and fragmentation by embedding a reaction-diffusion partial differential equation formulation within a 3D level-set representation of a dividing yeast cell.


PhD Student Ali Heydari and Dr. Sindi collaborate with Dr. Maxime Theillard to use level-set methods for studying reaction/diffusion of prion aggregates in actively dividing cells.

Recent Publications

Heydari, AA; Sindi, S; Theillard, M, Conservative Finite Volume Method on Deforming Geometries: the Case of Protein Aggregation in Dividing Yeast Cells. Under Review (Journal of Computational Physics).

Villali, J., Dark, J., Brechtel, T. M., Pei, F., Sindi, S. S., & Serio, T. R. (2020). Nucleation seed size determines amyloid clearance and establishes a barrier to prion appearance in yeast. Nature Structural & Molecular Biology, 1-10.

Lemarre, P., Pujo-Menjouet, L., & Sindi, S. S. (2020). A unifying model for the propagation of prion proteins in yeast brings insight into the [PSI+] prion. PLOS Computational Biology, 16(5), e1007647.

Banwarth-Kuhn, M., Collignon, J., & Sindi, S. (2020). Quantifying the Biophysical Impact of Budding Cell Division on the Spatial Organization of Growing Yeast Colonies. Applied Sciences, 10(17), 5780.

Sensitivity Analysis and Uncertainty Quantification in Blood Coagulation

Graduate Students: Amandeep Kaur

Collaborators: Dr. Aaron Fogelson (Utah), Dr. Karin Leiderman (Mines), Dr. Katie Link (UC Davis), Dr. Dougald Monroe (UNC), Dr. Michael Stobb (Coe College)

Blood coagulation is a complex biochemical process in which dozens of plasma proteins take part in nearly 100 enzymatic reactions involving positive feedback (where a blood clot is initiated) and negative feedback (where the growth of the blood clot is stopped). Both overclotting and underclotting are associated with diseases and, as such, understanding the regulation of this system is important. However, comparisons between mathematical models of coagulation and coagulation experiments and assays were vague and qualitative. Moreover, mathematical models were only consistent with these experimental assays for a relatively narrow range of experimental conditions. To enable more informative comparisons between models and experiments, it was clear that interdisciplinary collaboration and statistical approaches, including sensitivity analysis and uncertainty quantification, were necessary.

As part of this work, we have a number of ongoing projects:

  • We are working on methods to enable precise quantitative comparisons between in vitro coagulation assays and mathematical models. We hypothesized that one reason for the quantitative disagreement was that mathematical models did not represent the full experimental assay. In particular, the models did not include the chromogenic substrates, synthetic peptides designed to measure the activity in a patient’s blood sample in an experimental assay. We built a model from the ground up to depict the precise experimental assay and observed two significant outcomes: (i) it is possible to have a quantitatively accurate comparison between models and coagulation assays, and (ii) chromogenic substrates alter the underlying reactions they are designed to measure by inhibiting the coagulation processes themselves.

  • We are working to understand modifiers of bleeding disorders. The bleeding disorder hemophilia A is characterized by an inability to form effective clots and is clinically classified by a deficiency in coagulation factor VIII. Treatment of hemophilia A is costly, and the bleeding disorder varies considerably among individuals. Intriguingly, a small fraction of hemophilia A patients exhibit much better clotting than is predicted by their measured factor VIII levels. This suggests the presence of an as-yet-unknown factor influencing the bleeding phenotype of a hemophilia A patient. A traditional clinical approach would be to collect patient samples to uncover the link; however, the rarity of this phenotype in an already rare bleeding disorder, along with considerable variability between individuals, would make it impossible to collect enough samples to reach statistical significance. Our team instead took a data-driven approach: using a previously published mathematical model (Leiderman/Fogelson), we conducted a global sensitivity analysis, varying the levels of all plasma proteins (as well as many other parameters) to identify possible candidates. We found that, in our model, the combination of an exceptionally high level of clotting factor II and a low level of clotting factor V leads to improved clotting when factor VIII is at hemophilia A levels. Remarkably, this non-intuitive model prediction was subsequently validated experimentally on hemophilia A samples.
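The general idea of a sampling-based global sensitivity analysis can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_model_output` is a hypothetical linear stand-in and not the Leiderman/Fogelson coagulation model, the input names `p1`, `p2`, `p3` are placeholders rather than real clotting factors, and ranking by absolute Pearson correlation is one simple choice among many possible sensitivity measures.

```python
import random


def toy_model_output(levels):
    """Hypothetical stand-in for a model output (NOT a coagulation model)."""
    return 3.0 * levels["p1"] + 1.0 * levels["p2"] + 0.1 * levels["p3"]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def rank_inputs(model, names, n_samples=2000, low=0.5, high=1.5, seed=0):
    """Vary all inputs at once and rank them by |correlation| with the output.

    Each input is drawn uniformly between `low` and `high` times its
    baseline level (assumed here to be 1.0 for every input).
    """
    rng = random.Random(seed)
    samples = [{name: rng.uniform(low, high) for name in names}
               for _ in range(n_samples)]
    outputs = [model(sample) for sample in samples]
    scores = {name: abs(pearson([s[name] for s in samples], outputs))
              for name in names}
    ranking = sorted(names, key=lambda name: scores[name], reverse=True)
    return ranking, scores


if __name__ == "__main__":
    ranking, scores = rank_inputs(toy_model_output, ["p1", "p2", "p3"])
    for name in ranking:
        print(f"{name}: |r| = {scores[name]:.2f}")
```

Because all inputs are varied simultaneously rather than one at a time, an analysis of this kind can surface combinations of inputs that matter. Variance-based measures would be needed to quantify interactions properly.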

Recent Publications

Stobb, M. T., Monroe, D. M., Leiderman, K., & Sindi, S. S. (2019). Assessing the impact of product inhibition in a chromogenic assay. Analytical biochemistry, 580, 62-71.

Link, K.G., Stobb, M.T., Sorrells, M.G., Bortot, M., Ruegg, K., Manco‐Johnson, M.J., Di Paola, J.A., Sindi, S.S., Fogelson, A.L., Leiderman, K. and Neeves, K.B. (2020). A mathematical model of coagulation under flow identifies factor V as a modifier of thrombin generation in hemophilia A. Journal of Thrombosis and Haemostasis, 18(2), 306-317.

Link, K. G., Stobb, M. T., Di Paola, J., Neeves, K. B., Fogelson, A. L., Sindi, S. S., & Leiderman, K. (2018). A local and global sensitivity analysis of a mathematical model of coagulation and platelet deposition under flow. PloS one, 13(7), e0200917.

Deep and Statistical Learning


Deep Learning and Mathematical Modeling for Personalized Treatment

Graduate Researcher: Ali Heydari

Applied Mathematics

This research lies at the intersection of deep learning, computer vision, and bioinformatics. The goal of our work is to develop computational methods for better downstream analysis of treatments (and personalized treatments). To achieve this, we have used a deep generative model to generate realistic synthetic single-cell data. Creating realistic in-silico data will help enable robust analysis of single-cell data and foster greater reproducibility in research on such data. The next aspect of this research focuses on developing large-scale deep learning models and transfer learning models for enhancing learning from small datasets. Our goal is to design an "all-in-one" network that can be quickly fine-tuned on small datasets to accurately perform domain-specific tasks, such as clustering, cell-type identification, and data generation. This pre-trained core enables researchers to transfer the vast existing knowledge to their downstream analyses, allowing efficient and accurate predictions for datasets that are orders of magnitude smaller than the training data.

Recent Publications

Heydari, AA; Davalos, O; Zhao, L; Hoyer, K; Sindi, S., ACTIVA: realistic single-cell RNA-seq generation with automatic cell-type identification using introspective variational autoencoders. Under Review, available at https://www.biorxiv.org/content/10.1101/2021.01.28.428725v2

Data Driven Discovery of Transcription Factor Binding Network with Machine Learning

Graduate Researcher: Akshay Paropkari

Collaborators: Dr. Clarissa Nobile

Quantitative Systems Biology

This project focuses on creating a quantitative model to understand the mechanism underlying biofilm formation in the human fungal pathogen Candida albicans. C. albicans is a normal resident of the human microbiome, and its ability to form biofilms is a major etiological factor for local and systemic infections. Recent evidence suggests that differential gene expression is associated with C. albicans biofilm formation, but the regulatory adaptations underlying its biofilm development are not yet fully understood. This work will inform us about which gene regulatory changes occur during C. albicans biofilm formation.

Data Science + Language

Graduate Researcher: Alex John Quijano

Applied Mathematics

Collaborators:
Dr. Rick Dale (UCLA), Dr. Arnold Kim (UC Merced), Graduate Students: Ayme Tompson and Maia Powell

Our research uses Google Ngram data to analyze eight languages and compares the observed word frequencies to a neutral model of word frequency evolution. The multivariate time-series dynamics of words are investigated using mathematical methods such as vector autoregression and dynamic mode decomposition. A second aim of this work is to use NLP methods to uncover the discourse and evolution behind certain hashtag social movements on Twitter. The final goal of this work is to use Transformer-based models, such as BERT, for question answering and text classification tasks on large natural language data.
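To give a sense of what a neutral model of word frequency evolution can look like, here is a minimal sketch. It assumes a Wright-Fisher-style resampling scheme, in which each token in the next time step is copied from the current corpus in proportion to word frequency, with no word favored over another. That specific form, the fixed corpus size, and the function names are assumptions made for illustration and may differ from the model used in this project.

```python
import random


def neutral_step(counts, rng):
    """Draw the next generation of word counts by frequency-proportional copying."""
    words = list(counts)
    total = sum(counts.values())
    # Every token in the new corpus copies a word chosen in proportion to
    # its current count; no word has any selective advantage.
    drawn = rng.choices(words, weights=[counts[w] for w in words], k=total)
    new_counts = {w: 0 for w in words}
    for w in drawn:
        new_counts[w] += 1
    return new_counts


def simulate_neutral(counts, generations, seed=0):
    """Return the list of word-count dictionaries over time (initial state first)."""
    rng = random.Random(seed)
    trajectory = [dict(counts)]
    for _ in range(generations):
        counts = neutral_step(counts, rng)
        trajectory.append(counts)
    return trajectory


if __name__ == "__main__":
    start = {"alpha": 500, "beta": 300, "gamma": 200}
    for year, state in enumerate(simulate_neutral(start, generations=5)):
        print(year, state)
```

Under a model of this kind, frequencies drift purely by chance, so it provides a baseline: observed word trajectories that deviate strongly from such drift are candidates for changes driven by something other than random copying.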

Computational and Evolutionary Biology


Identification of Structural Variation from Sequencing Data

Collaborators: Dr. Mario Banuelos (Fresno State), Dr. Roummel Marcia (UC Merced)

Many genetic disorders – including cancer – are caused by structural modifications of an individual’s genome. Structural variation (SV) in genomes consists of rearrangements ranging anywhere from a few nucleotides in length to millions of nucleotides in length. Originally, SVs such as inversions, insertions and deletions were thought to be rare, but today SVs have been linked to some heritable diseases and implicated in a number of cancers. With continually decreasing costs of DNA sequencing and the availability of high-quality reference genomes for a variety of species, the common paradigm for SV discovery has been to sequence reads from an individual genome and map these reads to the reference genome. Regions in the individual genome corresponding to an SV will be revealed by discordant configurations of mapped fragments. Unfortunately, deciphering the resulting data is complicated by both errors in the data and computational complexities arising from millions (and even billions) of observed data points.
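As a simplified illustration of what a discordant configuration means, the sketch below classifies a single mapped paired-end fragment by its orientation and mapped span. It assumes a library in which concordant pairs map in +/- orientation at roughly a known insert size. The `MappedPair` type, the thresholds, and the category labels are hypothetical, and real SV callers combine evidence from many fragments rather than judging one pair at a time.

```python
from dataclasses import dataclass


@dataclass
class MappedPair:
    left_pos: int      # reference coordinate where the leftmost read starts
    right_pos: int     # reference coordinate where the rightmost read ends
    left_strand: str   # "+" or "-"
    right_strand: str  # "+" or "-"


def classify_pair(pair, mean_insert=300, tolerance=100):
    """Label one mapped fragment as concordant or as a candidate SV signature.

    `mean_insert` and `tolerance` are illustrative values, not library defaults.
    """
    span = pair.right_pos - pair.left_pos
    if pair.left_strand == pair.right_strand:
        # Both reads on the same strand: consistent with an inversion breakpoint.
        return "inversion_candidate"
    if (pair.left_strand, pair.right_strand) != ("+", "-"):
        # Reads point away from each other (everted orientation).
        return "other_discordant"
    if span > mean_insert + tolerance:
        # Reads map farther apart than expected: sequence present in the
        # reference appears to be missing from the individual.
        return "deletion_candidate"
    if span < mean_insert - tolerance:
        # Reads map closer together than expected: the individual appears
        # to carry extra sequence between them.
        return "insertion_candidate"
    return "concordant"


if __name__ == "__main__":
    print(classify_pair(MappedPair(1000, 6000, "+", "-")))
```

A single discordant fragment like this can just as easily come from a mapping or sequencing error, which is why the statistical treatment of whole sets of fragments matters.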

Because of the high volume of data and the error-prone, noisy nature of sequencing data, sophisticated mathematical approaches are required to successfully predict SVs. What distinguishes our work from others is the use of statistical modeling to consider the configuration of a set of reads suggesting a potential SV.

We have several projects in this area including:

  • Developing likelihood-based models for determining not just the presence of an SV, but the likelihood that it is a true SV rather than an erroneous configuration of reads.

  • Leveraging multiple individuals to boost the signal of true SVs by combining

Recent Publications

Sindi, S., Helman, E., Bashir, A., & Raphael, B. J. (2009). A geometric approach for classification and comparison of structural variants. Bioinformatics, 25(12), i222-i230.

Sindi, S. S., Önal, S., Peng, L. C., Wu, H. T., & Raphael, B. J. (2012). An integrative probabilistic model for identification of structural variation in sequencing data. Genome biology, 13(3), R22.

Spence, M., Banuelos, M., Marcia, R. F., & Sindi, S. (2020). Detecting inherited and novel structural variants in low-coverage parent-child sequencing data. Methods, 173, 61-68.