Sarthak Jain

I am a fifth-year PhD student in the Khoury College of Computer Sciences at Northeastern University, advised by Byron Wallace.

My work has mostly focused on the interpretability and analysis of machine learning and NLP models.

Personal Information

  • Email:
  • Github:
  • Education:
    • PhD (Computer Science - 2017-2022), Khoury College of Computer Sciences, Northeastern University
    • B.Tech (Computer Engineering - 2013-2017), Delhi Technological University

Publications

You can find some of my representative publications below.

Interpretability and Analysis

Pouya Pezeshkpour, Sarthak Jain, Sameer Singh, Byron Wallace, Combining Feature and Instance Attribution to Detect Artifacts, In arXiv cs.CL, 2021

Pouya Pezeshkpour, Sarthak Jain, Byron Wallace, Sameer Singh, An Empirical Comparison of Instance Attribution Methods for NLP, In NAACL, 2021

Eric Lehman, Sarthak Jain, Karl Pichotta, Yoav Goldberg, Byron Wallace, Does BERT Pretrained on Clinical Notes Reveal Sensitive Data?, In NAACL, 2021

Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron Wallace, Learning to Faithfully Rationalize by Construction, In ACL, 2020

Jay DeYoung, Sarthak Jain, Nazneen Rajani, Eric Lehman, Caiming Xiong, Richard Socher, Byron Wallace, ERASER: A Benchmark to Evaluate Rationalized NLP Models, In ACL, 2020

Sarthak Jain, Byron Wallace, Attention is not Explanation, In NAACL, 2019

Disentangled Representations

Babak Esmaeili, Hao Wu, Sarthak Jain, Alican Bozkurt, N. Siddharth, Brooks Paige, Dana Brooks, Jennifer Dy, Jan-Willem van de Meent, Structured Disentangled Representations, In AISTATS, 2019

Sarthak Jain, Edward Banner, Jan-Willem van de Meent, Iain Marshall, Byron Wallace, Learning Disentangled Representations of Texts with Application to Biomedical Abstracts, In EMNLP, 2018

Long Document Processing

Sheng Zhang, Cliff Wong, Naoto Usuyama, Sarthak Jain, Tristan Naumann, Hoifung Poon, Modular Self-Supervision for Document-Level Relation Extraction, In EMNLP, 2021

Sarthak Jain, Madeleine van Zuylen, Hannaneh Hajishirzi, Iz Beltagy, SciREX: A Challenge Dataset for Document-Level Information Extraction, In ACL, 2020

Ramin Mohammadi, Sarthak Jain, Amir Namin, Melissa Heller, Ramya Palacholla, Sagar Kamarthi, Byron Wallace, Predicting Unplanned Readmissions Following a Hip or Knee Arthroplasty: Retrospective Observational Study, In JMIR Medical Informatics, 2020

Ramin Mohammadi, Sarthak Jain, Stephen Agboola, Ramya Palacholla, Sagar Kamarthi, Byron Wallace, Learning to Identify Patients at Risk of Uncontrolled Hypertension Using Electronic Health Records Data, In AMIA Summits on Translational Science Proceedings, 2019