EDUCATION

Columbia University         Biostatistics

M.S. (Theory and Method Track)      Sep 2018 - May 2020

Columbia University         Chemical Engineering

M.S.       Sep 2014 - Dec 2015

Wuhan University           Applied Chemistry

B.S.       Sep 2010 - Jun 2014

                 

EXPERIENCE

Associate Statistician

Columbia University Medical Center

  • Applied regression models in R (glmnet, dplyr) to demographic, diagnostic and genetic data to identify potential patients and predict patients’ future conditions
  • Selected the best predictive models in classification and regression analyses using AIC or adjusted R²
  • Conducted statistical tests, including two-sample t-tests, chi-square tests and Fisher's exact test, to compare treatments, and adjusted p-values for multiple comparisons
  • Calculated sample sizes through power calculations
  • Prepared statistical reports on descriptive analyses and presented them with data visualizations (ggplot2 in R)

Associate Researcher II

Mount Sinai Health System

  • Supported data mining and exploration of 200,000+ records to understand disease mechanisms
  • Conducted literature reviews and data mining to generate lists of important features, and used feature selection to reduce model complexity and training time
  • Built predictive models to understand the role of different leading factors in neurological diseases using Python (numpy, pandas and scikit-learn)

                 

SKILLS AND CERTIFICATES

  • Computer skills: R, Python, SAS, SQL, Tableau, Excel, MATLAB
  • Modeling: OLS, Lasso, Ridge, Logistic Regression, SVM, Decision Tree, Random Forest, Boosting, PCA, KNN
  • Certificates: SAS Certified Advanced Programmer for SAS 9