Articles by category: hardware-acceleration

Facebook AI Similarity Search (FAISS), Part 1

18 Jul 2019

FAISS is a powerful GPU-accelerated library for similarity search. It’s available under MIT on GitHub. Even though the paper came out in 2017 and, under some interpretations, the library lost its SOTA title, when it comes to practical concerns: the library is actively maintained and cleanly written; it’s still extremely competitive by any metric, enough so that the bottleneck for your application likely won’t be in FAISS anyway; if you bug me enough, I may fix my one-line E...

Facebook AI Similarity Search (FAISS), Part 2

18 Jul 2019

I’ve previously motivated why nearest-neighbor search is important. Now we’ll look at how FAISS solves this problem. Recall that you have a set of database vectors $\{\textbf{y}_i\}_{i=0}^\ell$, each in $\mathbb{R}^d$. You can do some prep work to create an index. Then at runtime I ask for the $k$ closest vectors in $L^2$ distance. Formally, we want the set $L=\text{$k$-argmin}_i\|\textbf{x}-\textbf{y}_i\|$ given $\textbf{x}$. The main paper contributions in this rega...

Numpy Gems, Part 1

19 Jan 2019

Numpy Gems 1: Approximate Dictionary Encoding and Fast Python Mapping. Welcome to the first installment of Numpy Gems, a deep dive into a library that probably shaped Python itself into the language it is today, numpy. I’ve spoken extensively on numpy (HN discussion), but I think the library is full of delightful little gems that enable perfect instances of API-context fit, the situation where interfaces and algorithmic problem contexts fall in line oh-so-nicely and the resulting code is clean, ...

Beating TensorFlow Training in-VRAM

23 Dec 2017

In this post, I’d like to introduce a technique that I’ve found helps accelerate mini-batch SGD training in my use case. I suppose this post could also be read as a public grievance directed towards the TensorFlow Dataset API optimizing for the large vision deep learning use-case, but maybe I’m just not hitting the right incantation to get tf.Dataset working (in which case, drop me a line). The solution is to TensorFlow harder anyway, so this shouldn’t reall...

My Princeton Junior Year Research

03 Nov 2016

Unpublished. Submitted to the university as part of completion of the Computer Science BSE degree, January 2016. Completed during the fall semester of 2015-2016. Link to download report.

Vlad Feinberg