# Articles by category: machine-learning

#### Graph Coloring for Machine Learning

22 Feb 2020

This month, I posted an entry on Sisu’s engineering blog, where I discuss an effective strategy for lossless column reduction on sparse datasets. Check out the blog post there.

#### Stop Anytime Multiplicative Weights

05 Jan 2020

Multiplicative weights is a simple, randomized algorithm for picking an option among $n$ choices against an adversarial environment. The algorithm has widespread applications, but its analysis frequently introduces a learning rate parameter, $\epsilon$, which we’ll be trying to get rid of. In this first post, we introduce multiplicative weights and make some practical observations. We follow Arora’s survey for the most part. Problem Setting: We play $T$ rounds. On the $t$-th round, the pla...
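For readers who want a concrete picture before opening the post, here is a minimal sketch of one common form of the multiplicative weights update. It is not taken from the post itself: the function name `multiplicative_weights`, the assumption that costs lie in $[0, 1]$, and the fixed learning rate `epsilon` are all illustrative choices.

```python
import random


def multiplicative_weights(cost_rounds, epsilon=0.1, seed=0):
    """Run multiplicative weights over a sequence of cost vectors.

    cost_rounds: list of per-round cost lists, each cost assumed in [0, 1].
    epsilon: learning rate (hypothetical default, assumed <= 1/2).
    Returns (choices made each round, final probability distribution).
    """
    rng = random.Random(seed)
    n = len(cost_rounds[0])
    weights = [1.0] * n  # start with uniform weights over the n options
    choices = []
    for costs in cost_rounds:
        # Sample an option with probability proportional to its weight.
        choice = rng.choices(range(n), weights=weights, k=1)[0]
        choices.append(choice)
        # Shrink each weight by a factor that grows with the cost it incurred.
        weights = [w * (1.0 - epsilon * c) for w, c in zip(weights, costs)]
    total = sum(weights)
    return choices, [w / total for w in weights]


# Toy environment: option 0 never incurs cost, the other two always do.
rounds = [[0.0, 1.0, 1.0] for _ in range(50)]
choices, probs = multiplicative_weights(rounds, epsilon=0.1)
```

In this toy run the distribution concentrates on the cost-free option. Note that `epsilon` still has to be chosen up front here, which is the dependence the post sets out to remove.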

#### Compressed Sensing and Subgaussians

11 Sep 2019

Candes and Tao came up with a broad characterization of compressed sensing solutions a while ago. Partially inspired by a past homework problem, I’d like to explore an area of this setting. This post will dive into the compressed sensing context and then focus on a proof that squared subgaussian random variables are subexponential (the relation between the two will be explained). Compressed Sensing: For context, we’re interested in the setting where we observe a...

#### Subgaussian Concentration

22 Dec 2018

This is a quick write-up of a brief conversation I had with Nilesh Tripuraneni and Aditya Guntuboyina a while ago that I thought others might find interesting. This post focuses on the interplay between two types of concentration inequalities. Concentration inequalities usually describe some random quantity $X$ as a constant $c$ which it’s frequently near (henceforth, $c$ will be our stand-in for some constant which possibly changes equation-to-equation). Basical...

#### Beating TensorFlow Training in-VRAM

23 Dec 2017

In this post, I’d like to introduce a technique that I’ve found helps accelerate mini-batch SGD training in my use case. I suppose this post could also be read as a public grievance directed towards the TensorFlow Dataset API optimizing for the large vision deep learning use-case, but maybe I’m just not hitting the right incantation to get tf.Dataset working (in which case, drop me a line). The solution is to TensorFlow harder anyway, so this shouldn’t reall...

#### Non-convex First Order Methods

20 Jun 2017

This is a high-level overview of first order local improvement optimization methods for non-convex, Lipschitz, (sub)differentiable, and regularized functions with efficient derivatives, with a particular focus on neural networks (NNs). $\argmin_\vx f(\vx) = \argmin_\vx \frac{1}{n}\sum_{i=1}^nf_i(\vx)+\Omega(\vx)$ Make sure to read the general overview post first. I’d also reiterate, as Moritz Hardt has, that one should be wary of only looking at con...

#### Neural Network Optimization Methods

19 Jun 2017

The goal of this post and its related sub-posts is to explore at a high level how the theoretical guarantees of the various optimization methods interact with non-convex problems in practice, where we don’t really know Lipschitz constants, the validity of the assumptions that these methods make, or appropriate hyperparameters. Obviously, a detailed treatment would require delving into intricacies of cutting-edge research. That’s not the point of this post, w...

#### My Princeton Senior Thesis

23 May 2017

Submitted to the university as part of completion of the Computer Science BSE degree, June 2017. Completed during the 2016-2017 academic year. A concise and more up-to-date paper version. Link to download report. Code repository.

#### My Princeton Junior Year Research

03 Nov 2016

Unpublished. Submitted to the university as part of completion of the Computer Science BSE degree, January 2016. Completed during the fall semester of 2015-2016. Link to download report.

#### Ad Click Prediction: a View from the Trenches

Published August 2013. Paper link. Abstract. Introduction. Brief System Overview. Problem Statement: For any given query, ad, and associated interaction and metadata represented as a real feature vector $\textbf{x}\in\mathbb{R}^d$, provide an estimate of the probability $\mathbb{P}(\text{click}(\textbf{x}))$ that the user making the query will click on the ad. Solving this problem has beneficial implications for ad auction pricing in Google’s online adver...