Wednesday Nov 4, 12pm-1pm, [Meeting Link]

ReGAL: Rule-guided Active Learning for Deep Text Classification

David Kartchner

Advisor: Prof. Cassie Mitchell

ABSTRACT

One of the main bottlenecks to extending deep learning systems to new domains is the prohibitive cost of acquiring sufficient training labels. While many previous works have sought to alleviate this problem with weak supervision and data programming, rule and label noise prevent them from approaching fully supervised performance. This work in progress provides a principled, AI-guided approach to improve rule-based and weakly supervised text classification by performing active learning not on individual data instances, but on entire labeling functions. We argue that such a framework can guide users and subject matter experts to select labeling rules that expand labeling function coverage without sacrificing clarity. Our experiments show that our framework, ReGAL, is able to generate coherent labeling rules while simultaneously obtaining state-of-the-art performance in weakly supervised text classification.

[Slides]

BIO

David Kartchner is a third-year CSE PhD student homed in Biomedical Engineering, advised by Dr. Cassie Mitchell. His research focuses on NLP-based information extraction and graph mining for scientific and biomedical text. His recent work has explored methods of denoising and efficiently curating automatic labeling functions to provide weak supervision for low-resource text corpora with minimal human effort. David's research seeks to bring ML to clinical and biomedical research settings and has been used by Intermountain Healthcare, Recursion Pharmaceuticals, and GSK. Prior to joining Georgia Tech, he received a BS and MS in applied and computational mathematics from Brigham Young University. Outside of school, David enjoys running, reading, ultimate frisbee, and spending time with his wife and young son.