HotCSE Seminar
Computational Science & Engineering
Wednesday, February 24, 12pm-1pm, 1116-E Klaus

Cloud-based Predictive Modeling System and its Application to Asthma Readmission Prediction

Robert Chen
Advisor: Prof. Jimeng Sun


The predictive modeling process is time-consuming and requires clinical researchers to handle complex electronic health record (EHR) data in restricted computational environments. To address this problem, we implemented a cloud-based predictive modeling system using a hybrid setup that combines a secure private server with the Amazon Web Services (AWS) Elastic MapReduce platform. EHR data is preprocessed on the private server, and the resulting de-identified event sequences are hosted on AWS. Based on user-specified modeling configurations, an on-demand web service launches a cluster of Elastic Compute Cloud (EC2) instances on AWS to run feature selection and classification algorithms in a distributed fashion. Afterwards, the secure private server aggregates the results and displays them via interactive visualization.
We tested the system on a pediatric asthma readmission task using a de-identified EHR dataset of 2,967 patients. We also conducted a larger-scale experiment on the CMS Linkable 2008-2010 Medicare Data Entrepreneurs’ Synthetic Public Use File dataset of 2 million patients, which achieved over 25-fold speedup compared to sequential execution.


Robert Chen’s research centers on machine learning algorithms and applications for healthcare analytics. He has worked on various large-scale projects with Brigham and Women’s Hospital, UCB, IBM TJ Watson Research Center, Vanderbilt University Medical Center, Children’s Healthcare of Atlanta, and the Centers for Disease Control and Prevention. He is an MD/PhD candidate, pursuing an MD at Emory University and a PhD in Computer Science at the Georgia Institute of Technology. He earned a BS in Mathematics from the Massachusetts Institute of Technology. He has published in venues including KDD, ICDM, AMIA, Nature Genetics, and Nature Protocols.