Articles by category: distributed-systems



MapReduce

17 Sep 2016

MapReduce: Simplified Data Processing on Large Clusters
Published December 2004
Abstract
MapReduce offers an abstraction for large-scale computation: it manages scheduling, distribution, parallelism, partitioning, communication, and reliability uniformly for every application that adheres to its execution template.
Introduction
Programming Model
MR offers the application-level programmer two operations through which to express a large-scale computation. Note: the types I offer here a...

Ad Click Prediction

17 Jul 2016

Ad Click Prediction: a View from the Trenches
Published August 2013
Abstract
Introduction
Brief System Overview
Problem Statement
Given a query, an ad, and the associated interaction and metadata, all represented as a real feature vector \(\textbf{x}\in\mathbb{R}^d\), provide an estimate of the probability \(\mathbb{P}(\text{click}(\textbf{x}))\) that the user making the query will click on the ad. Solving this problem has beneficial implications for ad auction pricing in Google’s online adver...