Projects

  • HATHI 1M: Introducing a Million Page Historical Prose Dataset in English from the Hathi Trust [pdf] [data]
    Sunyam Bagga and Andrew Piper
    Journal of Open Humanities Data (JOHD 2022)
  • ‘Are you kidding me?’: Detecting Unpalatable Questions on Reddit [pdf] [code] [data] [talk]
    Sunyam Bagga, Andrew Piper and Derek Ruths
    16th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2021)
  • Detecting Narrativity Across Long Time Scales [pdf] [data & code]
    Andrew Piper, Sunyam Bagga, Laura Monteiro, Andrew Yang, Marie Labrosse and Yu Liu
    Computational Humanities Research Conference (CHR 2021)
  • Measuring the Effects of Bias in Training Data for Literary Classification [pdf] [code] [talk]
    Sunyam Bagga and Andrew Piper
    4th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature. International Committee on Computational Linguistics (COLING 2020)
  • Generalization Classification [book] [code]
    Classifies sentences by whether or not they encode a “generalization”, using deep learning models. Used in the chapter “Machine learning as a collaborative process” in Can We Be Wrong? The Problem of Textual Evidence in a Time of Data. Cambridge: Cambridge University Press.
  • Stylistic Accommodation on Reddit [pdf] [code]
    Caitrin Armstrong and Sunyam Bagga
  • Sentiment & Topic Analysis of Migrant Related Tweets [abstract] [report] [code]
    Sunyam Bagga and Alayne Moody
    Digital Humanities 2020 Conference, Ottawa, ON, Canada
  • Best Answer Prediction in Community-based Question-Answering Services [pdf] [code]
    Sunyam Bagga, Qianyu Liu and Jin Guo
  • Opportunistic Self Organizing Migrating Algorithm for real-time Dynamic Traveling Salesman Problem [pdf]
    Shubham Dokania, Sunyam Bagga and Rohit Sharma
    51st Annual Conference on Information Sciences and Systems (CISS), held at Johns Hopkins University, USA, 2017