Research Topics
(Updated Dec 2023) Full publications on Google Scholar, DBLP, arXiv.

Research Summary: The research of my lab is focused on the principles and practice of machine intelligence, often with a focus on datasets, generalization, and making machine learning more reliable. Our applied research includes applications to healthcare, biomedical imaging, and cognitive neuroscience.

‡ indicates authors with equal contribution. ☆ indicates authors working closely with me.

(Robust) Machine Learning for Imperfect Data
The development of machine learning models, particularly in the context of label scarcity, increasingly necessitates the collection of substantial annotated data. Moreover, large datasets often display a long-tailed class distribution or subpopulation shifts, which results in notable imbalance issues. To this end, there is growing interest in training machine learning models jointly across imbalanced subpopulation distributions and limited annotations. We are developing novel algorithmic and computational approaches to ensure the efficiency and robustness of federated and distributed machine learning. Our applied research includes applications to healthcare, biomedical imaging, and cognitive neuroimaging.
Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations

Learning with Theoretical Guarantees
As machine learning methods have become ubiquitous in human decision-making, their reliability and interpretability have become increasingly important. This is particularly crucial in domains where decisions carry significant consequences, where interpretable models can uncover crucial but unexpected patterns that complex models often obscure. We are currently studying provably interpretable modeling with theoretical guarantees. We are also exploring structured sparsity and attention in deep neural networks to enable interpretability.
Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective

Learning with Multi-Modality Data
Multi-modality data is ubiquitous in science and engineering applications. We are pursuing various techniques for modeling such data, primarily using probabilistic graphical models and other statistical analyses. These tools are primarily used to facilitate clinical research. We are also developing tools to tackle real-world challenges associated with data heterogeneity. Of particular interest are novel methods that address robustness issues, such as confounding, as well as novel distributed computation approaches.
End-to-end Spoken Conversational Question Answering: Task, Dataset and Model

Foundation Models for Biomedical Data
The development of medical foundation models often requires massive and diverse biomedical data. To this end, I have developed various foundation models for biomedical imaging data and explored novel applications of these models. I have also developed novel medical AI agents that enable scalable and accurate predictive modeling, particularly for distribution shift problems.
Implicit Anatomical Rendering for Medical Image Segmentation with Stochastic Experts