Graduate Student

Harvard Medical School

Biography

Josh is a computational biologist pursuing his PhD at Harvard Medical School. There, he is co-advised by Professor Kevin Haigis and Professor Peter Park as he studies cancer genetics and evolution. Specifically, he is working to understand the tissue-specific behavior of KRAS mutations in cancers.

In his free time, Josh enjoys learning about programming and computer science; his current project is creating a macOS app to summarize text using machine learning. Off the computer, his hobbies include running and caring for his plants.

Interests

  • Cancer genetics and evolution
  • Swift and iOS development
  • Bayesian data analysis

Education

  • PhD in Computational Biology & Cancer Genetics (in progress)

    Harvard Medical School

  • BS in Biochemistry and Molecular Biology, 2017

    University of California, Irvine

  • BS in Chemistry, 2017

    University of California, Irvine

Skills

R

Proficient

Python

Proficient

Swift

Intermediate

Linux

Sufficient

Statistics

Intermediate

iOS & watchOS

Intermediate

All Publications

The origins and genetic interactions of KRAS mutations are allele- and tissue-specific
Tissue-specific oncogenic activity of KRAS A146T
Loss of Magel2 impairs the development of hypothalamic Anorexigenic circuits

Projects

boston311

Python package for interfacing with Boston 311 API.

Bayesian Data Analysis

The steps I have taken to learn how to conduct Bayesian data analysis.

Counting Coffee

A suite of projects for tracking my coffee consumption.

Bayesian analysis of the ‘Facial Feedback Hypothesis’

My own visualization and statistical analysis of the data from a replication study of this famous psychology paper.

Exploration of the latent input space of a Progressive Growing GAN

We trained a Progressive Growing GAN to produce realistic, yet novel, hand radiographs and explored its latent input space to identify an embedding of bone age.

Text Summarization App

A web application that summarizes text down to an adjustable percentage of the most important sentences.

Sudoku Solver

A web application that solves Sudoku puzzles using integer linear programming.

Advent of Code (2020)

This links to my repository with my solutions to the Advent of Code 2020 programming challenges.

Tidy Tuesday

This links to my repository of #TidyTuesday submissions. This is a series of notebooks and scripts that analyze a different dataset each week where I experiment with various modeling and data visualization techniques.

textrank

A Swift package for summarizing long text into the most important sentences or words.
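
The idea behind TextRank-style summarization can be sketched in a few lines. The Python below is only an illustration under stated assumptions, not the implementation used by the Swift package: the function names (`split_sentences`, `similarity`, `summarize`), the naive sentence splitter, the unique-word overlap measure, and the default parameters are all hypothetical choices made for this example.

```python
import math
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter (an assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a: str, b: str) -> float:
    """Shared unique words, normalized by the log of each sentence's vocabulary size."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if len(words_a) < 2 or len(words_b) < 2:
        return 0.0  # avoids a zero denominator for one-word sentences
    return len(words_a & words_b) / (math.log(len(words_a)) + math.log(len(words_b)))


def summarize(
    text: str, fraction: float = 0.3, damping: float = 0.85, iterations: int = 50
) -> list[str]:
    """Return roughly `fraction` of the sentences, ranked by a PageRank-style score."""
    sentences = split_sentences(text)
    n = len(sentences)
    if n == 0:
        return []

    # Weighted, undirected sentence graph (no self-loops).
    weights = [
        [similarity(a, b) if i != j else 0.0 for j, b in enumerate(sentences)]
        for i, a in enumerate(sentences)
    ]
    out_weight = [sum(row) for row in weights]

    # Power iteration: each sentence receives score from the sentences it resembles.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                weights[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            for i in range(n)
        ]

    keep = max(1, round(fraction * n))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]  # preserve the original order
```

Calling `summarize(text, fraction=0.5)` keeps about half of the sentences; a sentence that shares no words with any other only ever receives the baseline score, so it is the first to be dropped.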

WaterMe

An iPhone app for tracking when plants have been watered. There is also an Apple Watch app that makes work in the garden a bit easier. (demo GIFs)

Apple Watch Telemetry Recorder

An Apple Watch application for recording telemetry data during a workout and uploading the data to iCloud Drive. Mathematical models are then fit to the data to identify the various stages of the exercise (e.g. the down and up positions of a push-up).

Workout Spinner Apple Watch App

An Apple Watch application that randomly selects quick exercises using a Wheel-of-Fortune-like spinning wheel.

mustashe

A simple system for saving and loading objects in R. Long running computations can be stashed after the first run and then reloaded the next time. Dependencies can be added to ensure that a computation is re-run if any of its dependencies or inputs have changed.

Frailea castanea from seed

My attempt at growing Frailea castanea from seed.

Germination Tracker iOS App

A simple app to help me record data on my seedlings.

Growing Lithops from seed

Documenting my journey from seed to Lithops.

ggasym (“gg-awesome”)

‘ggasym’ (pronounced “gg-awesome”) plots a symmetric matrix with three different fill aesthetics.

Plant Tracker iOS App

An app to help my mom keep track of and care for her plants.

Type hinting a list subclass in Python with function overloading

How to support type hints of the dunder methods on your subclass of the built-in Python list using function overloading.
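
As a rough illustration of the technique (not the code from the post), the sketch below overloads `__getitem__` on a hypothetical `Track` subclass of `list[float]` so that a type checker sees a `float` for integer indexing and a `Track` for slicing. The class name and element type are assumptions made for this example, and it assumes Python 3.9 or later.

```python
from __future__ import annotations

from typing import SupportsIndex, overload


class Track(list[float]):
    """Hypothetical list subclass, used only for illustration."""

    @overload
    def __getitem__(self, index: SupportsIndex) -> float: ...

    @overload
    def __getitem__(self, index: slice) -> Track: ...

    def __getitem__(self, index: SupportsIndex | slice) -> float | Track:
        if isinstance(index, slice):
            # The built-in list returns a plain list for slices, so re-wrap it.
            return Track(super().__getitem__(index))
        return super().__getitem__(index)
```

With the overloads in place, `Track([1.0, 2.0, 3.0])[0]` is inferred as `float`, while `Track([1.0, 2.0, 3.0])[1:]` is inferred as (and at runtime actually is) a `Track`.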

Setting up Appwrite on DigitalOcean

A step-by-step tutorial on how to get a powerful backend system up and running on DigitalOcean.

Notes on 'Deep Work'

My notes on the book Deep Work by Cal Newport about how to maximize productivity and success on meaningful work.

Mixing centered and non-centered parameterizations in a hierarchical model with PyMC3

How to build a hierarchical model in PyMC3 with a mixture of centered and non-centered parameterizations to avoid the dreaded funnel degeneracies.

Conducting a Bayesian analysis with 'rstanarm' and publishing with 'distill'

A brief description of the tools and process behind my recent analysis of data from a study of the ‘Facial Feedback Hypothesis’.

Accomplishments

Completed Hacktoberfest 2021

I opened four pull requests to open source projects to complete the 2021 Hacktoberfest challenge. This year, I challenged myself to commit primarily to others’ projects and was successful in this pursuit. I added specifications for R and Rscript commands to Fig, a tool that brings autocomplete to the terminal (1). I also added the specification for Rscript to the command line tool tldr (2). My third PR was to add an example of fitting a spline with PyMC3 to the pymc3-examples repository (3). The fourth PR was to my own project where I added Section 5 notes and exercises to my repository for the course Bayesian Data Analysis (4). While these were the four PRs that counted for Hacktoberfest, I continued with several others: I fixed a bug in snakemake (5), corrected a typo in a file for the class I am working on (6), added Section 6 to my course repo (7), and created a tutorial, blog post, and demo app using Appwrite and added these to the awesome-appwrite repo (8, 9).

boston311 Python package

My (first) Python package, boston311, for querying the Boston 311 reporting service API is available on PyPI.

The origins and genetic interactions of KRAS mutations are allele- and tissue-specific.

We have published our analysis on the genetic interactions of KRAS alleles in four different cancer types in Nature Communications.

Completed Advent of Code 2020

I completed both coding puzzles released on each of the 25 days before Christmas as a part of the Advent of Code challenge.

Completed Hacktoberfest 2020

I opened four pull requests to open source projects to complete the Hacktoberfest challenge. The first two added iPhone-to-Watch connectivity and Workout Session capabilities to my Apple Watch telemetry-recording application (1, 2). The third PR added the Workout Session capability to my Workout Spinner Apple Watch app (3). Finally, the fourth PR merged the resubmission branch of the KRAS comutation project code repository (4).

mustashe R package

My R package, mustashe, has been accepted to CRAN. It is ‘A simple system for saving and loading objects in R. Long running computations can be stashed after the first run and then reloaded the next time. Dependencies can be added to ensure that a computation is re-run if any of its dependencies or inputs have changed.’

100 Days of Python

To become comfortable with Python, I completed an hour (usually more) of deliberate practice every day for 100 days. I primarily followed data science-centered tutorials, namely Python for Data Analysis and Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. I also followed the tutorials for the common plotting libraries, Matplotlib, Seaborn, and Plotly. I recorded my progress at the linked GitHub repository. I did not finish the Hands-On ML book in the 100 days, but I continued working through it afterwards.

Completed Hacktoberfest 2019

I opened four pull requests to open source projects to complete the Hacktoberfest challenge. I merged large features into my Germination and Plant Tracker apps (1, 2, 3). Also, I made a pull request to add documentation for a statistical test used by Hierarchical HotNet (Reyna et al., Bioinformatics, 2018) from the Raphael Lab at Princeton University (4).

100 Days of Swift

I challenged myself to do at least an hour of Swift every day for 100 days. I recorded my progress at the linked GitHub repository. Towards the end of the challenge, I started developing two iOS apps, a Plant Tracker and a Germination Tracker.

Tissue-specific oncogenic activity of KRAS A146T

The Haigis Lab published on a biochemical, signaling, and computational description of the KRAS A146T allele.

Honorable Mention NSF GRFP

I have been awarded an Honorable Mention in the 2019 National Science Foundation (NSF) Graduate Research Fellowship Program competition.

ggasym R package

My R package, ggasym, has been accepted to CRAN. It was further promoted by RStudio in their March edition of Top 40 New CRAN Packages and R Weekly (March 18, 2019).

Toxoplasma gondii disrupts β1 integrin signaling and focal adhesion formation during monocyte hypermotility

My undergraduate research in the Lodoen Lab at UC Irvine on the interactions between the single-celled parasite, T. gondii, and human monocytes was accepted by JBC.

Contact

  • Boston, MA 02215