Gurmeet Manku — Resume
5 Sep 2023
What Do I Do?
Kahuna Labs ('23 onwards): Stealth mode startup with Sanjeev Gupta.

Computer Science: Tinkering with Generative AI these days. Experience in LLMs (Large Language Models), conversational systems (conversational Shopping), predictive modeling, data stream algorithms & distributed systems. [VLDB 10-year Best Paper Award in 2012]

Thankful2Plants: Website (2,600+ articles) ∼ YouTube ∼ Instagram: in-depth knowledge of Whole Food Plant-Based guidelines.

Personal website: Gurmeet.Net. Check out 600 Hikes in Bay Area.

Education
Stanford University Ph. D. in Computer Science 2004 (GPA: 4.00/4.00)
U C Berkeley M. S. in Computer Science 1997 (GPA: 3.97/4.00)
IIT Delhi B. Tech. in Computer Science & Engineering 1995 (GPA: 9.91/10.00)
Work Experience
Google Inc. (Aug 2004 to March 2023)
Software Engineer in Google Engineering & Google Research:
  • (2022—2023) Large Language Models: Structure extraction for Shopping.

    I developed a common infrastructure for building very large schemas in Shopping and other verticals, and for extracting tags and facets from hundreds of millions of Shopping web pages and feeds.

  • (2018—2022) Conversation Systems: Conversational shopping in Google Assistant.

    Shopping is an incredibly complex domain with a large, dynamic schema. My original slide deck led to the formation of a new team in Google Research focused on conversation systems for tabular data. I then led a team that built a parser for complex, multi-turn Shopping utterances. Launched in Google Assistant.

  • (2016—2018) Infrastructure for Reinforcement Learning: VOMC (Variable Ordered Markov Chains) and RL for Mobile Ads.

    The first large-scale RL project at Google in a production setting. Failed to show improvements.

  • (2011—2016) Google Analytics: Predictive modeling for conversions using Sibyl.

    Sibyl was a landmark, scalable ML system at Google. We used Sibyl for thousands of Google Analytics customers to predict user conversions based on their browsing history. I also built a quick demo for conversationalizing Google Analytics using Analyza, a system built by a sister team at Google.

  • (2010—2011) Google+: Data stream optimization.

    Google+ was bootstrapped in 2010. My work was to optimize the Google+ feed, with a focus on diversity of recommendations across news, blogs and videos.

  • (2005—2010) Infrastructure Group: Logs Team: Compression and anonymization of logs data.

    Logs data was one of the largest datasets at Google. I implemented column-based compression for logs data, resulting in significant savings. Plus miscellaneous engineering tasks on the logs team.

  • (2004—2005) Infrastructure Group: Duplicate detection in web crawls, compression of datasets.

    Some early projects at Google.

Gigabeat Inc. (Summer 2000) -- Summer Intern
Implemented crawlers for Icecast/Shoutcast radio stations. Designed and implemented a high-speed Gnutella crawler.

IBM Almaden Research (Jul 1997 - Sep 1999) -- Staff Software Engineer in Exploratory Databases Group
Research in databases. Designed and implemented quantile finding algorithms in DB2.

Intel Corp. (Summer 1996) -- Summer Intern at Intel Development Labs, Strategic CAD Technology
Re-designed and implemented half of Intel's netlist partitioner, resulting in a fivefold speedup and 30% overall memory savings when partitioning a Pentium Pro design with 1.5 million design blocks.
Publications

ShopTalk: A System for Conversational Faceted Search [PDF]
   by G Manku, J Lee-Thorp, B Kanagal, J Ainslie, J Feng, Z Pearson, E Anjorin, S Gandhe, I Eckstein, J Rosswog, S Sanghai, M Pohl, L Adams, D Sivakumar
   SIGIR eCom'22 (The 2022 SIGIR Workshop On eCommerce), Madrid, Spain, July 15, 2022.

RadixZip: Linear Time Compression of Token Streams [PDF]
   by B D Vo and G S Manku
   VLDB 2007 (33rd International Conference on Very Large Data Bases), Vienna, Austria, p 1162-1172, September 23-27, 2007.

Detecting Near-Duplicates for Web Crawling [PDF]
   by G S Manku, A Jain and A D Sarma
   WWW 2007 (16th International World Wide Web Conference), Banff, Alberta, Canada, p 141-149, May 8-12, 2007.

A Loop-free Gray Code for Minimal Signed-Binary Representations [PDF]
   by G S Manku and J Sawada
   ESA 2005 (13th Annual European Symposium on Algorithms), Palma de Mallorca, Spain, p 438-447, Oct 3-6, 2005.

(Brief Announcement) Papillon: Greedy Routing in Rings [PDF]
   by I Abraham, D Malkhi and G S Manku
   DISC 2005 (19th International Symposium on Distributed Computing), Cracow, Poland, p 514-515, Sep 26-29, 2005.

Decentralized Algorithms using Both Local and Random Probes for P2P Load Balancing [PDF]
   by K Kenthapadi and G S Manku
   SPAA 2005 (17th ACM Symposium on Parallelism in Algorithms and Architectures), p 135-144, July 2005.

Balanced Binary Trees for ID Management and Load Balance in Distributed Hash Tables [PDF]
   by G S Manku
   PODC 2004 (23rd ACM Symposium on Principles of Distributed Computing), p 197-205, July 2004.

Approximate Counts and Quantiles over Sliding Windows [PDF]
   by A Arasu and G S Manku
   PODS 2004 (22nd ACM Symposium on Principles of Database Systems), p 286-296, June 2004.

Know thy Neighbor's Neighbor: the Power of Lookahead in Randomized P2P Networks [PDF]
   by G S Manku, M Naor and U Wieder
   STOC 2004 (36th ACM Symposium on Theory of Computing), p 53-64, June 2004.

Optimal Routing in Chord [PDF]
   by P Ganesan and G S Manku
   SODA 2004 (15th Annual ACM-SIAM Symposium on Discrete Algorithms), p 169-178, Jan 2004.

Routing Networks for Distributed Hash Tables [PDF]
   by G S Manku
   PODC 2003 (22nd ACM Symposium on Principles of Distributed Computing), p 133-142, June 2003.

Symphony: Distributed Hashing in a Small World
   by G S Manku, M Bawa and P Raghavan
   USITS 2003 (4th USENIX Symposium on Internet Technologies and Systems), p 127-140, Mar 2003.

SETS: Search Enhanced by Topic Segmentation
   by M Bawa, G S Manku, and P Raghavan
   SIGIR 2003 (26th Annual Intl. ACM SIGIR Conference), p 306-313, July 2003.

Query Processing, Resource Management and Approximation in a Data Stream Management System
   by R Motwani, J Widom, A Arasu, B Babcock, S Babu, M Datar, G S Manku, C Olston, J Rosenstein and R Varma
   CIDR 2003 (1st Biennial Conf. On Innovative Data Systems Research), p 245-254, Jan 2003.

Approximate Frequency Counts over Data Streams (VLDB 10-Year Best Paper Award in 2012)
   by G S Manku and R Motwani
   VLDB 2002 (28th Intl. Conf. On Very Large Data Bases), p 346-357, August 2002.

Random Sampling Techniques for Space Efficient Online Computation of Order Statistics of Large Datasets
   by G S Manku, S Rajagopalan and B G Lindsay
   SIGMOD 1999, Vol 28, No 2, p 251-62, June 1999.

Approximate Medians and other Quantiles in One Pass and with Limited Memory
   by G S Manku, S Rajagopalan and B G Lindsay
   SIGMOD 1998, Vol 27, No 2, p 426-35, June 1998.

Structural Symmetry and Model Checking
   by G S Manku, R Hojati and R K Brayton
   CAV 1998 (10th Intl Conf on Computer-Aided Verification), LNCS 1427, p 159-171, July 1998.

Self-Similarity in File-System Traffic
   by S D Gribble, G S Manku, D Roselli, E A Brewer, T J Gibson and E L Miller
   SIGMETRICS 1998 (Joint Intl. Conf. on Measurement and Modeling of Computer Systems), p 141-150, June 24-26, 1998.

Object Tracking using Affine Structure for Point Correspondences
   by G S Manku, P Jain, A Aggarwal, L Kumar and S Banerjee
   CVPR 1997 (IEEE Conf. for Computer Vision and Pattern Recognition), p 704-9, June 17-19, 1997.

A New Voting Based Hardware Data Prefetch Scheme
   by G S Manku, M R Prasad and D A Patterson
   HiPC 1997 (4th Intl. Conf. on High Performance Computing), Bangalore, India, p 100-105, December 18-21, 1997.

A Linear Time Algorithm for the Bottleneck Biconnected Spanning Subgraph Problem
   by G S Manku
   Information Processing Letters, Vol 59, Number 1, 8 July 1996, p 1-7.

Circuit Partitioning with Partial Order for Mixed Simulation Emulation Environment
   by G S Manku, A Kumar and S Kumar
   RSP 1995 (6th Intl. Conf. on Rapid System Prototyping), p 201-7, 7-9 June, 1995.

Theses

(Ph D) Dipsea: A Modular Distributed Hash Table, by G S Manku, Stanford University, Aug 2004.

(M S) Structural Symmetries and Model Checking, by G S Manku, U C Berkeley Tech Report UCB/ERL M97/92, Dec 1997.

(B Tech) Object Tracking using Affine Multiple Views Geometry, by G S Manku and H Nautiyal, IIT Delhi, May 1995.

Patents

Calculating flight plans for reservation-based ad serving by B Stanley, G Aggarwal, G S Manku, A Agarwal, D Chamberlain, G Jain, US Patent #US9053492B1, Issued: June 9, 2015. Link.

Pre-computed Impression Lists by D Chamberlain, G S Manku, B A Stanley, US Patent #8832070, Issued: Sep 9, 2014. Link.

Near-Duplicate Document Detection for Web Crawling by A Jain, G S Manku, US Patent #8548972, Issued: Oct 1, 2013. Link.

Method and system for tokenized stream compression by B Vo, G S Manku, US Patent #8010510B1, Issued: August 30, 2011. Link.

Highly Compressed Randomly Accessed Storage of Large Tables with Arbitrary Columns by A Jain, G S Manku, US Patent #7496589, Issued: Feb 24, 2009. Link.

System and method for optimizing access to information in peer-to-peer computer networks by W J Labio, G T Nguyen, W W Liu, G S Manku, US Patent #7454480, Issued: Nov 18, 2008. Link.

System and Method for Searching Peer-to-Peer Computer Networks by Selecting a Computer Based on At Least a Number of Files Shared by the Computer by W J Labio, G T Nguyen, W W Liu, G S Manku, US Patent #07089301, Issued: Aug 8, 2006. Link.

Single Pass Space Efficient System and Method for Generating an Approximate Quantile in a Data Set Having an Unknown Size by B G Lindsay, G S Manku, S Rajagopalan, US Patent #06343288, Issued: Jan 29, 2002. Link.

Single Pass Space Efficient System and Method for Generating Approximate Quantiles Satisfying an Apriori User-Defined Approximation Error by B G Lindsay, G S Manku, S Rajagopalan, US Patent #06108658, Issued: Aug 22, 2000. Link.

Teaching
TA for cs154 in Winter 2003 (Automata and Complexity Theory)
TA for cs145 in Fall 2002 (Introduction to Databases)
Guest lecturer in cs361 (Advanced Algorithms): Quantiles over Data Streams
Miscellaneous Activities
Program Committee Member, VLDB 2007.
Organized the Stanford/ACM Local Programming Contest, 4 Oct 2003.
Invited to write a chapter for "Data Stream Management", edited by M Garofalakis, J Gehrke and R Rastogi, 2004.
External referee for SIGMOD, VLDB, PODS, ICDE, UbiComp and USENIX, 1999 - 2004.
Honors and Awards
VLDB 10-Year Best Paper Award in 2012
Paper: Approximate Frequency Counts over Data Streams, with Rajeev Motwani, 2002.

Stanford University
Stanford Graduate Fellowship, 1999-2002.

U C Berkeley
ERL Block Grant Fellowship, 1997.
U C Regents Fellowship, 1995-96.
ACM International Collegiate Programming Contest, 1996-97 (UC Berkeley team member)
Annual Berkeley Programming Contest, 1996 (Third rank).

IIT Delhi
3rd among 350+ students of all disciplines, 2nd among 45 students in Computer Science, 1995.
R Vibhakar Award for Best Overall Student at IIT Delhi, 1993-94.
R Bambawale Prize and R Subramanian Award for Best Overall Student at IIT Delhi, 1992-93.

Indian National Mathematics Olympiad - 1990 (Among the top 20 students in India).
Contact Information
e-mail: gurmeet@gmail.com                  Homepage: https://gurmeet.net
© Copyright 2008—2023, Gurmeet Manku.