Most of the code and documentation is available on my GitHub.
Recent
- Convert LaBSE from TensorFlow to PyTorch: I migrated the LaBSE model’s TensorFlow checkpoints to PyTorch (which I use more often and prefer) and uploaded them to the HuggingFace Model Hub.
- The Seen and The Unseen - Bookshelf: One of my favorite podcasts has a bunch of book recommendations in the show notes. I built a tiny aggregator that displays all of those recommendations in a single place. (Source.)
- Apple’s COVID-19 Mobility Data for India: I spent a couple of days exploring the mobility data Apple started publishing. It had some interesting trends (specifically across different cities), and I pondered what was causing some of the specific changes. (Source.)
- Hugo Theme - Overflow Identity: I built the theme for my personal website (this!), which was a mashup of 2 HTML5UP themes from a few years ago. (Another example.)
- `transformers-embeddings`: Among the various things I built at Ginger / Headspace Health was a Python library to make inference with 🤗’s exceptional `transformers` library easier. I open sourced it in October 2022 during Hacktoberfest 2022. (Blogposts on the Ginger Engineering blog, and Headspace Engineering blog.)
Open Source
I like to take inspiration from FOSS projects, and contribute when it makes sense. When I run into issues and debug them, I also like to fix them.
All of my PRs and issues (on public repos) are listed here, but some of the contributions are highlighted below.
- I fixed an error with the language models’ loss calculation. This affected generative NLP models like GPT-2.
- I added support for dropping the last incomplete batch from the dataloader for training any transformer-based model. (And on TensorFlow.)
- I added Okta OAuth support.
- I added the ability to export timestamps from annotations.
- And I fixed breaking installs.
- I added support for using `poetry` as a packaging tool in Lambda functions.
- I consolidated (and refactored) previously written tests for Python Lambda functions.
- I added options to retry and to extend the timeout for SageMaker Batch Transform jobs.
- I added support for providing a custom bundling Docker image for Python Lambda functions.
I added feedstocks (conda-forge packages for the source Python packages) for:
On top of the ones I added, I also help maintain the feedstocks for:
Academic
- Integrating preventive care guidelines & EHR to provide better healthcare: My MS thesis is about improving preventive healthcare recommendations by using natural language processing. Through this project, I have succeeded in providing personalized preventive care recommendations to patients by analyzing patient EHR data and USPSTF Preventive Care guidelines.
- Prediction model for water demand in Central Indiana for Citizens Energy Group: I designed a parallel RNN algorithm to predict daily and monthly average water demand with a very high accuracy. My model achieved an average error rate of 1.69% for daily predictions and 2.29% for monthly predictions.
- Disease-based biomedical document search and retrieval using Word2Vec: I developed an algorithm that uses disease ontology for biomedical document search and retrieval. With an innovative concept weighting scheme for biomedical documents, I overcame the problem of semantically equivalent biomedical concepts being represented using heterogeneous lexicons.
- Navigation tool to compute the best route based on road safety: This project is based on artificial neural networks and uses information about past fatal accidents in the USA to predict future accidents and compare various route options from location A to location B.
- A Home Automation and Internet of Things Solution for Indian Homes: A home automation system that addresses problems specific to Indian homes, with automated passageway and room lights, a keyless door lock, and an LPG cooking gas leak detection and ordering system.
- Android app to log GPS and Accelerometer data to local storage and server: An Android app that periodically collects data from the GPS and accelerometer sensors and stores it in a local buffer and, if enabled, on a web server.
- Android navigation app that computes the safest route for travel: An extension of the previous navigation tool app that computes the safest route for travel based on time of journey, fatality rate prediction and weather conditions.
- Simulation of various Ad-Hoc Routing Protocols using NS-3: Simulated various Mobile Ad-Hoc Networks for routing protocols like AODV, DSDV, DSR, OLSR, GPSR and Bird Flocking Routing Algorithm (BFA) using Network Simulator-3.
- Smart Socket: A 3-pin socket that doesn’t enable electricity supply to the connected appliance until the plug is fully inserted, which helps prevent short circuits, excessive current draw, and electric shocks to the user. It also protects against surge voltage, under-voltage, and ground leakage.
Miscellaneous
- Sanjay Shah Seminar website: When my dad asked me if I would develop his website, I took up the challenge. I created it in early 2011 and maintained it through the first half of 2016, though I do not maintain it anymore.
- Bhavin Shah’s website: I also help a dear friend and mentor (and now a published author!) create and maintain his website.
- Not Just The Talks: I realized I wasn’t fine with the way things were in my country and in the society around me. This was my attempt at making a difference through my writing. It has been quite some time since I last wrote on there.