research

SAIL Lab

Vocal Tract Articulation (2024 - Present)

By tracking muscle movements and articulations of the vocal tract, it is possible to reconstruct speech audio. Using rt-MRI scans to track vocal tract articulations for this purpose shows promise, but is not well-researched. I am currently using MRI data from the USC 75-Speaker Dataset to extract vocal tract contours to later use for speech audio reconstruction.

LLM-to-Brain Modeling (2024 - Present)

Figurative expressions evoke stronger neural activations in emotion-relevant brain regions than literal expressions. Does a similar phenomenon occur for LLMs? We investigate the responses of various LLMs across literal and figurative sentences and three valence levels to determine whether there is a difference in LLMs’ ability to model neural activity across conditions. To measure LLM-brain alignment, we compare functional magnetic resonance imaging (fMRI) recordings of individuals silently reading literal and figurative sentences to the corresponding LLM layerwise embeddings.
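One common way to compare brain recordings with layerwise embeddings is representational similarity analysis (RSA). The sketch below is only an illustration of that general idea, not necessarily the alignment measure used in this project: the function names, the toy data shapes, and the choice of Pearson correlation throughout are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = sqrt(sum((a - mx) ** 2 for a in x)) * sqrt(sum((b - my) ** 2 for b in y))
    return cov / denom if denom else 0.0


def dissimilarities(patterns):
    """Upper triangle of the dissimilarity matrix: 1 - r for each sentence pair."""
    return [1 - pearson(patterns[i], patterns[j])
            for i, j in combinations(range(len(patterns)), 2)]


def rsa_score(brain_patterns, layer_embeddings):
    """Correlate the brain and model dissimilarity structures for one layer.

    Both arguments hold one vector per sentence, in the same sentence order
    (hypothetical layout: voxels for the brain, hidden units for the model).
    """
    return pearson(dissimilarities(brain_patterns),
                   dissimilarities(layer_embeddings))


def best_layer(brain_patterns, embeddings_by_layer):
    """Return (layer_index, score) for the layer with the highest RSA score."""
    scores = [rsa_score(brain_patterns, layer) for layer in embeddings_by_layer]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]
```

Running `rsa_score` separately on literal and figurative sentences, or per valence level, would give one alignment value per condition to compare.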

Adolescent Suicidality (2024 - Present)

Children’s mental health emergency department (ED) visits and suicides are increasing. While the goal of these visits is to prevent future suicidal ideation, revisits are common. Many current systems pose risk assessment as a binary problem, when a more nuanced stratification would improve targeted intervention. I am performing a retrospective study of children aged 6-17 who visited the ED at Children’s Hospital Los Angeles between 2016 and 2023. We are examining health records, demographic information, and post-meeting clinical notes to identify predictor variables for suicidality and establish a better classification of outcomes for ED visits and revisits. We emphasize NLP-derived features extracted from the clinical notes and compare model performance with and without these features to examine their predictive power.

Exposome Factors for Dementia (2024 - Present)

Dementia is one of the leading causes of death and disability among the elderly. While the origins of dementia are unknown, we seek to understand how dementia arises holistically by investigating exposomes - the totality of all internal, societal, political, and environmental factors - that influence dementia development. In this multi-year project led by the NIH, I will investigate dementia exposomes and help identify and inform policies that can address this affliction.

Co-Well Lab

Technology and Contraception Responsibility (2021 - 2023)

I have been working in Dr. Jennifer Kim’s Co-Well Lab since Fall 2021, studying the intersection of bioinformatics and human-computer design. Our most recently published project investigates a long under-researched topic in academia - the burden that women and other female-sex individuals face when dealing with contraception. It has long been documented that male-sex individuals are consistently unwilling to use contraception such as condoms, especially if their partner uses a short- or long-acting contraceptive such as the pill. This has contributed to our current culture, where over 90% of women feel that contraception responsibility should be equally shared between couples, yet only 50% of women in relationships believe that it is. Given these discrepancies, our group saw an opportunity for technology to mediate some of this imbalance. We designed a birth control pill app prototype tailored towards collaboration - including shared notifications, in-app messaging, and infographics that highlighted the consistent “invisible work” that female-sex partners contribute for contraception efficacy. We interviewed 20 participants in relationships to elicit feedback and found that the common problems of shared contraception - including a lack of emotional support and a lack of communication on how and whether to help - were successfully addressed by our app prototype.

Strength-based Chatbot for Neurodiverse Job-Seekers (2023 - 2024)

Following our CSCW 2023 publication, my research in the Co-Well Lab shifted to applying my knowledge of NLP and HCI towards creating a chatbot for neurodiverse job seekers. Recent developments in the generative AI space have led to more informed, more lucid, and more knowledgeable chatbots. However, there are many flaws with these off-the-shelf models, especially when it comes to interfacing with clients with unique communication preferences. In this research, we investigated how to tailor chatbots to neurodiverse job-seekers and how we can customize responses to their individual strengths and needs.

NLP-X Lab

Authorship and Stylistics (2022 - 2024)

Authorship analysis is the systematic study of the persistent stylistic qualities shared by documents written by the same source. While authorship is a broad research area, there have been few papers on more “real-world” authorship applications. I investigated how current SOTA authorship models fare in noisy, multi-genre authorship setups - such as sourcing documents from several different social media platforms - and how to improve their design for a cross-domain environment. In summary, I’ve found that models that are effective in a single-domain experimental setup, such as the Transformer-based BertAA or the more traditional N-gram model, do not succeed in a multi-genre setting.
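For readers unfamiliar with the traditional N-gram approach, here is a minimal sketch of the general idea: build a character n-gram profile per candidate author and attribute an unknown document to the closest profile. This is an illustrative baseline under my own assumptions (trigrams, raw counts, cosine similarity, hypothetical function names), not the exact model evaluated in this work.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count profiles."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def attribute(unknown, known_by_author, n=3):
    """Return the candidate author whose profile is closest to the unknown text.

    known_by_author maps an author label to that author's known writing
    (a hypothetical input layout chosen for this sketch).
    """
    target = char_ngrams(unknown, n)
    return max(
        known_by_author,
        key=lambda author: cosine(target, char_ngrams(known_by_author[author], n)),
    )
```

Because the profile is built from surface character sequences, it picks up platform-specific habits (hashtags, abbreviations, formatting) along with personal style, which is one plausible reason such models struggle when the known and unknown documents come from different genres.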

Sociolinguistics Research

Language and Politics in the New South (2021 - 2022)

Under the guidance of Dr. Lelia Glass at Georgia Tech, I was a Vertically Integrated Projects (VIP) team leader and ambassador for several semesters between 2021 and 2022. During my time with the team, we investigated the relationship between accent and political leanings using sociophonetic analysis. We interviewed over 100 native Georgia residents and extracted vowel formants for analysis in R. We indeed found a correlation between being more politically conservative and speaking with the Southern Vowel Shift (which comprises several different vowel movements, such as the pin-pen conditional merger - i.e. a prototypical Southern accent pronounces “pin” and “pen” the same). While on the team, I established several interview methods, transcription pipelines, and vowel analysis scripts that are still in use today.
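To give a sense of what a vowel merger looks like in formant data, the sketch below measures how far apart a speaker's “pin” and “pen” vowels sit in F1/F2 space; a distance near zero suggests the two are merged. This is an illustration only: the team's actual scripts were written in R, and the function names and the use of a simple Euclidean distance between class means are assumptions on my part.

```python
from math import dist
from statistics import mean


def centroid(tokens):
    """Mean (F1, F2) in Hz over a list of (F1, F2) vowel measurements."""
    return (mean(f1 for f1, _ in tokens), mean(f2 for _, f2 in tokens))


def merger_distance(pin_tokens, pen_tokens):
    """Euclidean distance in Hz between the two vowel-class centroids."""
    return dist(centroid(pin_tokens), centroid(pen_tokens))
```

In practice formants are usually normalized across speakers before comparison, so raw Hz distances like these are only meaningful within a single speaker.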

Automated Linguistics Transcription Pipeline (2022 - 2023)

While I was a team member on the Language and Politics in the New South VIP, I was often tasked with transcribing huge quantities of audio interviews. Perhaps out of laziness, perhaps out of innovation (likely a bit of both), I found a way to utilize my NLP background to create an automatic transcription tool that would transcribe audio files and format them into specific export files that could be used with common linguistics analysis tools. This tool was used internally for about a year before I decided to integrate it into Dartmouth’s online linguistics website, a well-known hub for automated linguistic tools for use in the broader community. Today, this tool, Bed Word, is available on the DARLA website. As of August 2023, it has transcribed approximately 500 hours of audio interviews across 700 completed jobs since its launch in October 2022, saving approximately 1000 hours of manual transcription labor according to an informal study conducted by our VIP team. I have given a conference presentation about Bed Word at the linguistics conference NWAV 50, and a follow-up paper has been published in the journal Linguistics Vanguard.

CoNTRoL Lab

Neural Zone State Performance (2021 - 2022)

My time at the CoNTRoL Lab, run by Dr. Eric Schumacher, holds a special place in my heart as my first foray into research. I worked on investigating a phenomenon known as the Quasi-Periodic Pattern (QPP), which is the recurring anti-correlation of brain activity in two separate networks, the default mode network and the task positive network. Our work consisted of investigating how strongly a QPP could be detected when subjects were “in-the-zone” - meaning when they were focused strongly on a task. My specific role on this team was centered around data analysis and visualization. I was tasked with extracting useful brain-image slices out of raw 4-dimensional fMRI data (don’t worry, the fourth dimension is time, I’m not about to start talking about Klein bottles). Additionally, I used previous research on QPP to create an algorithm to detect these patterns in our subject data, where we indeed found that QPP activity is more prevalent during times of focus and attention.
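The anti-correlation at the heart of a QPP can be illustrated with a sliding-window correlation between the mean signals of the two networks. The sketch below is a deliberately simplified stand-in, not the detection algorithm I built for the lab: the function names, window length, and threshold are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    denom = sqrt(sum((a - mx) ** 2 for a in x)) * sqrt(sum((b - my) ** 2 for b in y))
    return cov / denom if denom else 0.0


def sliding_correlation(dmn, tpn, window):
    """Correlation between the two network time series in each window."""
    return [pearson(dmn[s:s + window], tpn[s:s + window])
            for s in range(len(dmn) - window + 1)]


def anticorrelated_fraction(dmn, tpn, window, threshold=-0.5):
    """Fraction of windows where the networks are strongly anti-correlated."""
    corrs = sliding_correlation(dmn, tpn, window)
    return sum(c < threshold for c in corrs) / len(corrs)
```

Here `dmn` and `tpn` are assumed to be the average signal of the default mode and task positive networks at each time point; comparing `anticorrelated_fraction` between focused and unfocused periods gives a rough measure of when the pattern is more prevalent.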