Biostatistics

Estimation and testing for biological and medical data
Lifetime data analysis
Infectious disease modelling
Incomplete data and measurement error
Causal inference
Statistical genetics
The design and analysis of clinical trials
Longitudinal data analysis
Epidemiological studies
Analysis of clustered data
Analysis of genomic and multi-scale -omics data

Biostatistics and health data science

I am developing machine learning algorithms for high-dimensional data, with applications to the integration of genomics and gene expression data to predict disease risk and occurrence, as well as drug responses. In collaboration with colleagues in the Cumming School of Medicine, we develop computational tools tailored to different diseases, including cancers and neurodevelopmental disorders.

>> Find out more about Dr. Zhang's research interests

Our research addresses the following questions:

(1) How to learn sensible representations of high-dimensional data. Representation learning is an emerging technique, largely based on unsupervised learning, that learns the internal structure of the data. The outcome is a new, usually lower-dimensional, representation of the data that enables more effective downstream machine learning tasks.

(2) How to identify association and causality using data mining techniques. Identifying association and causality is a long-standing problem in statistics and has broad applications in genomics and precision medicine. A specific focus of our research is identifying them stably in the presence of noise and complicated correlation structures.

(3) How to carry out statistical inference in multi-scale -omics data (e.g., single-cell sequencing). Integrating multiple layers of data may represent the current trend in statistical inference. We focus on how to integrate -omics data, including genomics, transcriptomics, epigenomics and other -omics data, to form predictors. In particular, we are interested in analyzing single-cell -omics data to infer within-tissue and within-individual dynamics at single-cell resolution.

For more information, see my research page and my lab page.

My research interests lie in methodology development for, and statistical analysis of, complex and imperfect data arising from public health and medical research. Key areas of focus include the analysis of survival data, recurrent event data and longitudinal data, which often arise in both clinical trials and observational studies. The complications that I deal with include incomplete data, misclassification and measurement error, latent heterogeneity, joint outcomes, high dimensionality, and hierarchical structures. The aim is to identify significant factors, quantify associations and make valid inferences.

>> Find out more about Dr. Shen's research interests

My current research focuses on developing statistical methods to

analyze lifetime data involving latent processes, where the underlying disease may resolve while some covariates are incompletely observed or subject to misclassification, so as to avoid ignoring patient heterogeneity, biased estimates and invalid inference,
develop joint models for classification and prediction based on mixed measurements involving surrogate classifiers or observations subject to measurement error, to achieve higher accuracy and precision in subgroup attribution or diagnostic testing,
conduct causal inference using advanced statistical learning methods that address the complications of missing and/or misclassified confounders, to produce unbiased estimates of treatment effects,
propose advanced and adaptive methods for variable selection and group-variable selection in recurrent event analysis and survival analysis, and investigate their oracle properties, and
jointly model longitudinal and survival data, or multivariate lifetime data, and propose computationally efficient methods for algorithm implementation and statistical inference.

I am keen to support medical research through transdisciplinary partnerships. My collaborators include epidemiologists, oncologists, radiologists, medical physicists, gastroenterologists, cardiovascular specialists, and rheumatologists. Researchers in other areas are also welcome to contact me about prospective collaboration.

I am interested in working with students at both the graduate and undergraduate levels. Students with a good work ethic, strong interest and a solid background in statistics, biostatistics, applied mathematics, computer science or other related areas are welcome to inquire about graduate studies or post-doctoral positions.

Research Spotlight: Dr. Qingrun Zhang

Research Spotlight: Dr. Hua Shen