Hi, I am a Machine Learning Engineer at Meta on the Brand Advertising team. Previously, I was a Senior Data Scientist at the AI Center of Excellence at Fidelity Investments, where I worked on machine learning and large language models. Before that, I was a Research Scientist at Meta.
I received my PhD, with a focus on Large Language Models and Applied NLP, from the University of Massachusetts in 2022, where I was advised by Anna Rumshisky. In the summer of 2021, I was an intern at Facebook on the News Signals team, where I worked on news personalization and integrity.
I'm interested in natural language processing, machine learning, optimization, and large language models. My research has focused on information extraction, question answering, the analysis of vector spaces in large language models, and cross-lingual transfer learning.
We present a new framework for multi-hop question generation that leverages a structured rationale schema to improve control over question difficulty and performance under low-supervision conditions, and that remains effective even with modest model sizes.
We introduce crossword puzzle-solving as a new natural language understanding task, providing a large-scale corpus collected from New York Times crosswords, releasing over half a million unique clue-answer pairs for open-domain question answering, and proposing both novel models and an evaluation framework for this task.
Our study challenges the notion of Transformer robustness to pruning by revealing that the removal of a tiny fraction of features in layer outputs, specifically high-magnitude normalization parameters in LayerNorm, significantly impairs the performance of popular pre-trained Transformer models like BERT, BART, XLNet, ELECTRA, and GPT-2.
Multilingual BERT shows reasonable zero-shot cross-lingual transfer capability, but aligning it with cross-lingual signal from parallel corpora or dictionaries using rotational or fine-tuning methods further improves performance, with the best alignment method depending on language proximity and task.
CliNER 2.0 is an open-source tool using LSTM models to achieve state-of-the-art performance for extracting clinical concepts from text to aid downstream tasks, with pre-trained models available for public use.
This work uses network polarization, sentiment analysis, and semantic shift to analyze the dynamics of social conflict over time, using social media data from the Ukraine-Russia Maidan crisis as a case study.
* Equal Contribution