experience

A growing collection of my various industry-related endeavors.

2023

Amazon

Software development intern.
May 2023 - Aug 2023. Seattle, WA.

During my second summer at Amazon, I was tasked with creating, deploying, and integrating a DNS device hostname resolution script for thousands of devices in several key regionalized VPCs. While the code itself wasn't too difficult to write, the project was more of a challenge in integration - I had to work with many different moving parts within the internal AWS codebase, from extracting useful information from L3NTS dumps, to creating automated deployment pipeline alarms and rollbacks, to using Apollo package dependencies to schedule cron jobs for my script. Overall, a great summer. I'm sure gonna miss seeing Mount Rainier every morning from my room.

2022

Forschungszentrum Jülich

DAAD scholar research intern.
Aug 2022 - Oct 2022. Jülich, Germany.

I was supported by the DAAD RISE Research Scholarship to perform theoretical machine learning research for the IEK-10 Lab at Forschungszentrum Jülich. I investigated different methods for verifiable robustness of neural networks, which refers to maintaining classification accuracy within a region of input perturbations. A classic example of a non-robust network is an image classifier that can recognize an image of a stop sign, but classifies it as a cat if you change the RGB values of just a few pixels. The standard method of creating robust models, Interval Bound Propagation (IBP), replaces an input datum with a fixed-size orthogonal (axis-aligned) bounding box, which represents the epsilon of error to account for, and propagates this box through each layer, relaxing the result into another bounding box after each one to keep the problem tractable. I investigated McCormick relaxations as a layer propagation technique, where each layer's bounds would be defined by the convex and concave relaxations of the associated layer function. This has the benefit of still being tractable, as these functions are continuous and monotonic, while the relaxations are much tighter than the orthogonal boxes of IBP. Overall, like many academic projects, my investigation was a half-success. While my results showed that McCormick relaxations were about 40% tighter than IBP, calculating them took significantly longer (McCormick relaxation calculations grow quadratically with dimension size, while two points will always define an orthogonal box). So, McCormick relaxations are theoretically more efficient for robustness training, but in practice take too long to calculate.
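
For a rough idea of how IBP moves a box through a network, here is a minimal sketch in plain Python. It is not the code from the project: the toy weights, the input, and the helper names (`input_box`, `affine_bounds`, `relu_bounds`) are all made up for illustration, and it only covers one affine layer followed by a ReLU.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def input_box(x: Vector, eps: float) -> Tuple[Vector, Vector]:
    """Replace an input point with the axis-aligned box [x - eps, x + eps]."""
    return [v - eps for v in x], [v + eps for v in x]


def affine_bounds(W: Matrix, b: Vector, lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """Propagate the box [lower, upper] through the affine layer y = W x + b."""
    out_lower, out_upper = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for w, l, u in zip(row, lower, upper):
            # A non-negative weight keeps the bound order; a negative one swaps it.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower: Vector, upper: Vector) -> Tuple[Vector, Vector]:
    """ReLU is monotonic, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Toy layer and input (hypothetical values).
W = [[1.0, -2.0], [0.5, 1.0]]
b = [0.0, -1.0]
x = [1.0, 2.0]

lower, upper = input_box(x, eps=0.5)
lower, upper = relu_bounds(*affine_bounds(W, b, lower, upper))
print(lower, upper)
```

The whole box is described by just two vectors (its lower and upper corners), which is why IBP stays cheap even when the bounds it produces are loose.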

Amazon

Software development intern.
May 2022 - Aug 2022. Boston, MA.

At AWS, I worked with the RDS PostgreSQL team, which, to my chagrin, did not mean writing SQL commands for twelve weeks. Turns out, we were working ON Postgres, which meant diving deep into a labyrinth of C code that was mostly written before I was born. (Seriously, I found several unaddressed TODO comments timestamped from 1989.) My project focused on speeding up SQL JOINs by implementing semi-join hashing with Bloom filters during a Merge Join, eliminating unnecessary table rows early in the execution tree. Essentially, Merge Join is a join strategy that works much like Mergesort: both inputs are sorted on the join key and then merged in a single pass, so it runs in O(n log n) time. However, it turns out that even this operation can be made more efficient - by scanning one side of the join and inserting its join keys into a Bloom filter, which is essentially a low-storage probabilistic set (it can report false positives but never false negatives), large swathes of the other table can be removed if the two key distributions do not overlap much. I implemented this optimization in AWS' Postgres, along with a decision planner (based on a heuristic estimate of the aforementioned distribution overlap) to decide when to actually use it. Overall, this implementation was a success, increasing query speeds by up to 36% on ideal queries and improving our performance on the standard database benchmark TPC-H by about 5%.
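
To make the idea concrete, here is a small Python sketch of a Bloom-filtered merge join. The real work was done in Postgres' C code, so everything here is an illustrative assumption: the `BloomFilter` class, its sizes and hash scheme, and the function names are invented, and rows are simply `(key, value)` tuples.

```python
import hashlib
from typing import Iterator, List, Tuple

Row = Tuple[int, str]


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: int) -> Iterator[int]:
        # Derive several bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: int) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: int) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def merge_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    """Join two inputs that are already sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on the right, then pair it with the left run.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            while i < len(left) and left[i][0] == lk:
                out.extend((lk, left[i][1], right[jj][1]) for jj in range(j, j_end))
                i += 1
            j = j_end
    return out


def bloom_filtered_merge_join(left: List[Row], right: List[Row]) -> List[Tuple[int, str, str]]:
    """Build a filter from the left keys and drop right rows that cannot match."""
    bloom = BloomFilter()
    for key, _ in left:
        bloom.add(key)
    # Filtering happens before the sort, so discarded rows are never sorted or merged.
    candidates = [row for row in right if bloom.might_contain(row[0])]
    by_key = lambda row: row[0]
    return merge_join(sorted(left, key=by_key), sorted(candidates, key=by_key))
```

A false positive only means a useless row survives the filter; the merge step still compares keys exactly, so the join result is unchanged. The payoff depends on how many rows the filter removes, which is why a planner heuristic is needed to decide when building the filter is worth it.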

2021

Prognomiq

Software engineer intern.
May 2021 - July 2021. San Mateo, CA.

While at Prognomiq, I was tasked with creating an automated medical records pipeline for data analysis. Prognomiq had several thousand unprocessed medical records from their dozens of field studies across America, and each hospital recorded patient data in a different way. I created a script that converted all of these records into a single unified format, merged them with existing data on Prognomiq's servers, and created a version history for each patient who had multiple forms.
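
A pipeline like this boils down to three steps: map each source's fields onto one schema, merge with what already exists, and keep every version per patient. The sketch below shows that shape in Python. It is purely illustrative: the hospital names, field names, and functions (`FIELD_MAPS`, `normalize`, `build_histories`) are hypothetical and not taken from the actual script.

```python
from collections import defaultdict
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical mapping from each hospital's field names to the unified schema.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "hospital_a": {"PatientID": "patient_id", "DOB": "birth_date", "VisitDate": "recorded_on"},
    "hospital_b": {"pid": "patient_id", "date_of_birth": "birth_date", "date": "recorded_on"},
}


def normalize(source: str, raw: Record) -> Record:
    """Rename a raw record's fields to the unified schema and tag its origin."""
    mapping = FIELD_MAPS[source]
    record = {unified: raw[original] for original, unified in mapping.items() if original in raw}
    record["source"] = source
    return record


def build_histories(existing: List[Record], incoming: List[Record]) -> Dict[str, List[Record]]:
    """Merge old and new records, keeping every version per patient in date order."""
    histories: Dict[str, List[Record]] = defaultdict(list)
    for record in existing + incoming:
        histories[record["patient_id"]].append(record)
    for versions in histories.values():
        # Assumes ISO 8601 dates (YYYY-MM-DD), which sort correctly as strings.
        versions.sort(key=lambda r: r["recorded_on"])
    return dict(histories)


existing = [normalize("hospital_a", {"PatientID": "p1", "DOB": "1980-01-01", "VisitDate": "2021-03-01"})]
incoming = [
    normalize("hospital_b", {"pid": "p1", "date_of_birth": "1980-01-01", "date": "2021-01-15"}),
    normalize("hospital_b", {"pid": "p2", "date_of_birth": "1975-06-30", "date": "2021-02-02"}),
]
histories = build_histories(existing, incoming)
```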

Rocketansky

Full-stack engineer contractor.
Mar 2021 - May 2021. Atlanta, GA.

I developed the main Windows application for the Rocketansky app. Rocketansky was a cybersecurity authentication platform that forwent authentication via passwords in favor of hardware keys and biometrics. The logic was that passwords are easy to steal or guess, while physical hardware tied to a device is actually more secure nowadays. So, while working on my application, I had to write the backend service that would create the unique cryptographic key tied to each device, perform authentication handshakes with our AWS servers, and verify the identity of the user in question. Meanwhile, I also had to create the frontend application and figure out installation drivers for Windows machines. In the end, though, I got it done. Rocketansky closed shop in 2022, so their decision to "pay me in company shares" was rather unfortunate.
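
For anyone curious what a device-key handshake looks like in principle, here is a minimal challenge-response sketch using only the Python standard library. This is not Rocketansky's protocol: it swaps in a shared symmetric key with HMAC to stay self-contained, whereas a hardware-key system would more typically keep an asymmetric private key inside the device. The `Server` class and function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Dict


def generate_device_key() -> bytes:
    """Create a random 256-bit key to be stored on (and tied to) one device."""
    return secrets.token_bytes(32)


def sign_challenge(device_key: bytes, nonce: bytes) -> bytes:
    """Device side: prove possession of the key without ever sending it."""
    return hmac.new(device_key, nonce, hashlib.sha256).digest()


class Server:
    """Server side: remembers enrolled device keys and outstanding challenges."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytes] = {}
        self._pending: Dict[str, bytes] = {}

    def enroll(self, device_id: str, device_key: bytes) -> None:
        self._keys[device_id] = device_key

    def issue_challenge(self, device_id: str) -> bytes:
        nonce = secrets.token_bytes(16)
        self._pending[device_id] = nonce
        return nonce

    def verify(self, device_id: str, response: bytes) -> bool:
        # Each nonce is single-use, so a captured response cannot be replayed.
        nonce = self._pending.pop(device_id, None)
        key = self._keys.get(device_id)
        if nonce is None or key is None:
            return False
        expected = hmac.new(key, nonce, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


server = Server()
key = generate_device_key()
server.enroll("laptop-1", key)

nonce = server.issue_challenge("laptop-1")
response = sign_challenge(key, nonce)
authenticated = server.verify("laptop-1", response)
```

Because the server sends a fresh random nonce each time, the secret itself never crosses the network, which is the main advantage over typing a password.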