Research

Transferring Simulation to Real Data

A primary factor in the success of machine learning is the quality of labeled training data. However, in many fields labeled data can be costly, difficult, or even impossible to acquire. In comparison, computer simulation data can now be generated in far greater abundance and at far lower cost, and could potentially remedy the data deficiency in many machine learning tasks.

We are interested in developing machine learning and deep learning techniques that leverage the knowledge in simulation data and transfer it to tasks on real data. In this process, we address the discrepancy between the two data domains arising from model assumptions, simplifications, and possible errors. We also attempt to distill the knowledge gained from simulation data so that it generalizes across all possible parameter settings of the simulation model. We investigate this concept in a variety of clinical applications.
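To make the notion of domain discrepancy concrete: the maximum mean discrepancy (MMD) is one standard statistic used in the domain-adaptation literature to quantify how far simulated features drift from real ones. The NumPy sketch below, with hypothetical toy features, is a generic illustration of that idea, not the specific method used in our projects.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF kernel values between rows of X and rows of Y.
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def mmd2(X, Y, gamma=1.0):
    # Biased estimate of the squared maximum mean discrepancy:
    # near zero when X and Y come from the same distribution.
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean())

rng = np.random.default_rng(0)
sim_feats = rng.normal(0.0, 1.0, size=(200, 2))   # toy "simulation" features
real_feats = rng.normal(1.5, 1.0, size=(200, 2))  # shifted toy "real" features
held_out = rng.normal(0.0, 1.0, size=(200, 2))    # same domain as sim_feats

gap_across = mmd2(sim_feats, real_feats)  # cross-domain discrepancy
gap_within = mmd2(sim_feats, held_out)    # within-domain baseline
```

Here the cross-domain discrepancy clearly exceeds the within-domain baseline; in transfer settings such a statistic can serve either as a diagnostic or as a training penalty that encourages domain-invariant features.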

Integrating Domain Knowledge into Statistical Inference

Figure: Bayesian formalism for 3D reconstruction of action potentials.

Funding sources: NSF CAREER ACI-1350374 (PI: Linwei Wang)
Student members: Sandesh Ghimire, Jwala Dhamala, Jingjia Xu (alumni)

Scientific research across many domains has been increasingly enabled by parallel advances in two broad disciplines: physics-based mathematical modeling, which supports quantitative, multi-scale, and multi-physics simulation of the behavior and mechanisms of complex systems, and modern sensor technologies, which continuously improve the quantity and quality of measurement data available for analysis. However, current computer models are generally decoupled from measurements of any individual system, and individualized data analysis often struggles in realistic domain contexts. This gap is ubiquitous across science and engineering domains.

Supported by an NSF CAREER grant, we develop theoretical and mathematical foundations for integrating physics-based modeling with data-driven inference to improve individualized assessment of systems. We are particularly interested in data-driven identification of, and adaptation to, errors in physics-based models during statistical inference. For more information, please visit here.
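One way to see why model error matters during inference: when a simplified physics model is fit to measurements, pretending the model is exact yields an overconfident posterior. The toy sketch below is our own illustrative example (not the project's formulation): it fits a linear model to data from a mildly nonlinear system, comparing a naive noise model against an inflated one that absorbs the structural error.

```python
import numpy as np

# Toy system: the true response has a quadratic term the model omits.
t = np.linspace(0.0, 1.0, 20)
theta_true = 2.0
rng = np.random.default_rng(1)
y_obs = theta_true * t + 0.3 * t**2 + rng.normal(0.0, 0.05, t.size)

def log_post(theta, sigma):
    # Gaussian log-likelihood of the simplified model y = theta * t
    # (flat prior), with assumed noise level sigma.
    resid = y_obs - theta * t
    return -0.5 * np.sum(resid**2) / sigma**2

grid = np.linspace(0.0, 4.0, 401)

def posterior(sigma):
    lp = np.array([log_post(th, sigma) for th in grid])
    p = np.exp(lp - lp.max())   # subtract max for numerical stability
    return p / p.sum()

post_naive = posterior(0.05)  # pretends the simplified model is exact
post_infl = posterior(0.20)   # inflated noise absorbs the model error
```

The inflated-noise posterior is wider, honestly reflecting that part of the misfit comes from the model rather than the sensors; data-driven identification of that error term, rather than a hand-tuned inflation factor, is what the research above pursues.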

Electrocardiographic Imaging (ECGi)

Funding sources: NIH/NHLBI R01HL145590 (PI: Linwei Wang), NIH/NHLBI R21HL125998 (PI: Linwei Wang)
Student members: Sandesh Ghimire, Omar A. Gharbia, Roland Sanford, Jingjia Xu (alumni), Azar Rahimi (alumni)

Cardiac arrhythmias are associated with an increased risk of stroke, heart failure, and death. Currently, detailed electrical maps of the heart are obtained through point-by-point contact mapping on the heart surface. These invasive procedures have numerous limitations, including limited access to the heart, limited sampling density, the inability to map unstable arrhythmia patterns, and increased patient risk. Electrocardiographic imaging (ECGi) is a noninvasive technique that computationally reconstructs the electrical activity of the heart from a combination of high-density body-surface ECG and image-derived thorax geometrical models. It provides a promising noninvasive complement to invasive mapping that could potentially improve the screening, diagnosis, and treatment of a variety of cardiac arrhythmias and heart diseases.

We focus on both the technical development and the clinical application of ECGi. This includes developing transmural ECGi solutions, reducing the cost of ECGi, and applying ECGi clinically to scar-related ventricular tachycardia and atrial fibrillation. This research is supported by funding from the NIH. For more information, please visit here.
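At its core, ECGi is an ill-posed inverse problem: given a forward (transfer) matrix H mapping cardiac potentials to body-surface potentials, recover the cardiac potentials from noisy surface measurements. The sketch below uses a random stand-in for H and zeroth-order Tikhonov regularization, a classical baseline for this problem rather than the specific methods developed here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_heart, n_body = 30, 20                 # more sources than sensors: ill-posed
H = rng.normal(size=(n_body, n_heart))   # hypothetical stand-in transfer matrix
x_true = np.sin(np.linspace(0.0, np.pi, n_heart))  # toy cardiac potentials
y = H @ x_true + rng.normal(0.0, 0.01, n_body)     # noisy body-surface ECG

def tikhonov(H, y, lam):
    # Zeroth-order Tikhonov: x = argmin ||H x - y||^2 + lam ||x||^2,
    # solved via the normal equations (H^T H + lam I) x = H^T y.
    n = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n), H.T @ y)

x_hat = tikhonov(H, y, lam=0.1)
```

In practice the regularization weight is chosen by criteria such as the L-curve or cross-validation, and the transfer matrix comes from a boundary-element or finite-element model of the patient's torso rather than random numbers.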

Video: A Machine Learning Approach for Computer-Guided Localization of the Origin of Ventricular Tachycardia Using 12-Lead Electrocardiograms

Learning Disentangled Representations

Funding sources: NIH/NHLBI R15HL140500 (PI: Linwei Wang)
Student members: Mohammed Alawad, Prashnna K. Gyawali, Zhiyuan Li

Confounding factors are inherent in most data analyses, especially analyses of clinical data. For example, 12-lead ECG data are generated by a large variety of physiological factors: some are pertinent to the diagnosis and treatment of arrhythmia, such as the rhythm type and the location at which the rhythm originates, while others represent inter-subject variations due to thorax anatomy, heart anatomy and structural remodeling, surface-lead positioning, signal artifacts, etc.

Properly disentangling these generative factors from ECG data is critical for automating ECG-based clinical tasks.

We are interested in developing deep representation learning methods that can separate these inter-subject variations from clinical data and transfer variations learned from one (larger) dataset, such as a simulation dataset, to another (smaller) dataset, such as a clinical dataset. We are also interested in clinical applications: in an NIH-funded project, we work with clinicians to develop a deep learning-based software tool that guides clinicians progressively closer to the surgical target in real time during the procedure. For more details, please visit here.
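A common starting point for disentanglement in the literature is the β-VAE objective (Higgins et al.), which strengthens the pull of the latent posterior toward a factorized prior. The snippet below sketches just the objective in NumPy, with the encoder and decoder omitted; it is a generic illustration, not the specific models used in this work.

```python
import numpy as np

def kl_diag_gaussian(mu, logvar):
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ),
    # summed over the latent dimensions.
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    # Squared-error reconstruction plus beta-weighted KL. Setting beta > 1
    # pressures the approximate posterior toward the factorized prior,
    # which is one route to disentangled latent factors.
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    return recon + beta * kl_diag_gaussian(mu, logvar)
```

During training, `mu` and `logvar` would come from the encoder and `x_recon` from the decoder; the weight `beta` trades reconstruction fidelity against the factorization pressure on the latent code.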

Video: Progressive Learning and Disentanglement of Hierarchical Representations

Video: A Machine Learning Approach for Computer-Guided Localization of the Origin of Ventricular Tachycardia Using 12-Lead Electrocardiograms

End-to-end Uncertainty Quantification

Student members: Jwala Dhamala, Ankit Aich

Mathematical models of a living system are always subject to epistemic uncertainties that reflect our limited knowledge of the system. While personalized models have shown increasing potential in medicine, their uncertainties remain the main roadblock to widespread adoption in healthcare. Existing efforts in uncertainty quantification (UQ) have mostly investigated how generic variability in a model element – represented by probability distributions defined a priori – results in output variability. However, a personalized model must first be customized from patient-specific data before it can make predictions pertinent to that individual. Therefore, the uncertainty in a personalized model is not generic but driven by the data used to customize the model. This poses a unique challenge: to measure the variability in patient-specific predictions, we must first infer the uncertainty within the data-driven model elements and then propagate it to the model predictions. This unmet need – which we term end-to-end UQ – is the focus of our research.

Specifically, we are interested in adopting and advancing Bayesian active learning for surrogate modeling focused on the posterior support of the model uncertainty, formulating hybrid MCMC methods accelerated by the surrogate model, and unifying the end-to-end propagation of uncertainty from the input data through the model to the model predictions. For more details, please visit here.
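One classical way to realize surrogate-accelerated MCMC is delayed-acceptance Metropolis-Hastings (Christen and Fox, 2005): a cheap surrogate screens each proposal, only survivors trigger an expensive model run, and a second acceptance step corrects for the surrogate's bias so the chain still targets the true posterior. The sketch below uses toy one-dimensional densities and is purely illustrative of that scheme, not this project's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(theta):
    # Stand-in for an expensive, simulation-based log-posterior.
    return -0.5 * (theta - 2.0) ** 2

def surrogate(theta):
    # Cheap approximation (here: a slightly biased quadratic).
    return -0.5 * (theta - 2.1) ** 2

def delayed_acceptance_mh(n_steps=5000, step=1.0):
    theta, samples, expensive_calls = 0.0, [], 0
    lp, ls = log_post(theta), surrogate(theta)
    expensive_calls += 1
    for _ in range(n_steps):
        prop = theta + step * rng.normal()
        ls_prop = surrogate(prop)
        # Stage 1: screen the proposal using only the cheap surrogate.
        if np.log(rng.uniform()) < ls_prop - ls:
            lp_prop = log_post(prop)          # expensive model run
            expensive_calls += 1
            # Stage 2: correct with the exact-vs-surrogate ratio so the
            # chain remains exact for the true posterior.
            if np.log(rng.uniform()) < (lp_prop - lp) - (ls_prop - ls):
                theta, lp, ls = prop, lp_prop, ls_prop
        samples.append(theta)
    return np.array(samples), expensive_calls

samples, calls = delayed_acceptance_mh()
```

Because stage 1 rejects many proposals before the expensive model is ever evaluated, the number of expensive calls falls well below the chain length; a Bayesian-active-learning surrogate, refined near the posterior support, sharpens this screening further.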