Nipun Arora

Computer Scientist/Software Engineer


  • PhD in Computer Science

    Columbia University


  • Data Engineering/Streaming Pipelines
  • Search & Retrieval
  • Scalability and Performance
  • ML and ML Infra
  • PL & Compilers (once upon a time)

About Me:

I’m a researcher at heart and a software engineer in practice, with several years of experience leading software engineering teams on successful, impactful projects. I pride myself on delivering results and driving innovation in organizations, as well as improving engineering and business processes to boost developer productivity and benefit the org as a whole.

I have a PhD in Computer Science from Columbia University, where I worked at the Programming Systems Laboratory with Prof. Gail Kaiser. My research interests span data analytics, big data, stream processing, distributed systems, large-scale system debugging, and program analysis. I have also briefly worked on cloud computing and software-defined networking.

Work History:

Currently, I am the Director of Engineering at Priceline for the Flights Platform. I lead a team of 30+ engineers and managers on the search and pricing verticals, with teams in the US and Canada and consultant teams in Ukraine and Buenos Aires. I help drive our strategy, architectural decisions, and innovation for our flagship project Firefly, which is also the global supply aggregation platform for Booking Holdings (Priceline/Agoda/Kayak etc.). Our stack serves both B2B and B2C customers, and we are leveraging a plug-and-play architecture to move towards integrating approx. 50 supply connections by the end of 2022 (direct connections to airlines, and GDSs/aggregators) to make Firefly the largest flights supply aggregator.

Before joining Priceline, I worked as a Senior IC at Dropbox, New York, with the Previews Infrastructure Services Team. The team provides middle-layer services that convert uploaded files into previewable content for all user-facing Dropbox frontends (this is the second-largest infra fleet at Dropbox, after storage).

Even earlier, I was a researcher at NEC Labs America, Princeton, NJ, where I worked with the Systems Research Group (formerly a part of the Autonomic Computing Group). I also briefly interned as a Business Analyst at McKinsey & Co., New York, in 2008. In my undergrad years, I interned as a Research Consultant at Instituto de Soldadura e Qualidade (Lisbon, Portugal), a research organization under the aegis of the European Union, where I was involved in a project called “NaturalHy”. I was also a Research Assistant at the Indian Institute of Technology (Delhi, India) in the Computer Integrated Manufacturing Lab, where I worked on supply chain management.

Recent News:

  • Invited talk at Google Journal Club: Replay without Recording of Production Bugs for SOA (ASE 2018)
  • Awarded Excellent Invention Awards for patent applications, 2017: Next Generation Log Analytics Application: An Automated Anomaly Detection Service on Heterogeneous Logs
  • Awarded NEC Business Contribution Award - 2016 (awarded for Research Commercialization)
  • Spot Recognition Award for Supporting Log Analysis Technology Development, Oct 2016
  • NEC Recognition Award for Creating Patent Portfolio for Log Analysis Technologies, Jun 2016
  • Awarded NEC Business Contribution Award - 2015 (awarded for Research Commercialization)

Publications/Patents:

  • 10 Issued Patents, 26 Filed Patents (pending) - as of Feb 24, 2017
  • 17 Peer-Reviewed Publications

Community Activity:

  • Program Committee Member Middleware 2015
  • Reviewer IEEE Transactions on Parallel and Distributed Systems
  • Peer Reviewer SPIN 2014
  • Peer Reviewer Globecom 2014
  • Reviewer IEEE Transactions on Services Computing
  • Peer Reviewer SCSC 2014
  • Peer Reviewer SDN-AA Workshop 2014
  • Peer Reviewer ICAC 2014
  • Peer Reviewer SIGMETRICS 2014

Previous Interns/Students:

  • Mohammad Ali Gulzar, PhD Student, UCLA, Summer 2016
  • Muhammad Solaimani, University of Texas Dallas, Summer 2016
  • Pradeep Fernando, PhD Student, Georgia Tech University, Summer 2015
  • Yuanzhen Gu, PhD Student, Rutgers University, Summer 2014
  • Advait Dixit, PhD Student, Purdue University, Spring 2014
  • Hui Lu, PhD Student, Purdue University, Summer 2013
  • Nitin Natrajan, MS, Columbia - Fall 2010
  • Jyotsna Sebe, MS, Columbia - Fall 2009
  • Bing Wu, MS, Columbia - Fall 2008, Spring 2009
  • Suhas Anand, MS, Columbia - Fall 2008
  • Junxiong Jia, MS, Columbia - Fall 2008

Director of Engineering - Flights Search & Price

Priceline

Oct 2019 – Present Greater New York City Area
  • Engineering Manager for the search and price teams within flights.
  • Leading the teams in onboarding a new flights serving infrastructure for several of our sister companies (Booking.com, Agoda) towards building an international flights aggregation business.
  • Led the migration of flights infra from on-prem infrastructure to Google Kubernetes-hosted microservices.
  • Delivered multiple performance and scalability improvements to our search stack at both the implementation and architectural levels.

Senior Software Engineer

Dropbox

Jan 2018 – Oct 2019 New York City
  • Part of the team that manages the previews serving infrastructure, which converts uploaded files into previewable formats. The service handles 20k requests per second (QPS) at peak, running conversions within jailed environments.
  • Migrated legacy HTTP routes to gRPC-based services by creating a wrapper service around legacy libraries, in order to move towards SOA.
  • DRI for the migration of file metadata storage and the extraction pipeline to on-the-fly extraction. Gained alignment across multiple teams in order to deprecate defunct use cases and reduce costs by 400k/yr.
  • Worked on the new document conversion pipeline for converting MS Office documents into previewable formats securely in jailed environments, at scale.

Sr. Assoc Research Staff

NEC Labs America

Nov 2011 – Dec 2017 Princeton, NJ

NGLA: An end-to-end log analytics service (Jan 2015- Nov 2017)

  • Architected and led the design and development of streaming anomaly detection with a NoSQL database (Elasticsearch), Kafka, and Spark Streaming. Owned most components of the streaming analytics pipeline. Collaborated on the design of complex time-series, stateful, and stateless log analytics in a multi-tier setup. Designed a control interface for managing streaming analytics jobs (model management, in-memory states, periodic anomaly checks, start/stop, and cleanup).
  • Modified core Apache Spark code to support on-the-fly updates of broadcast models, and leveraged this to deploy a model control management interface in Spark Streaming.
  • Designed a prototype web interface for real-time visualization using Flask, JavaScript, and Bootstrap.
  • Founding member of an initial team of 3, with experience in data cleaning, preprocessing, and log pattern representation and parsing, including multiple POC trials on customer data.

Behavior Analysis Engine (Jan 2017-Nov 2017)

  • A semantic language framework for knowledge representation: expressing complex machine learning log models and allowing administrators to express domain knowledge as “rules” and “behaviors”.
  • Worked in a two-person team writing the language grammar and execution operations using Spark SQL. Developed (in progress) a RESTful API to convert BAE into an SOA with job and rule management.

CLUE: Distributed System Trace Analytics (Jan 2013- May 2015)

  • Stitched kernel event logs to generate end-to-end “transactions”, which can help give a “CLUE” to the root cause of bugs. CLUE uses data mining and transaction clustering to find potential anomalies.
  • Developed a novel hybrid (static + dynamic) binary instrumentation tool called iProbe, with an order of magnitude better performance than state-of-the-art tools.
  • Collaborated on core-engine development, designed the interface and data visualizations, and handled project management.

NetLogic (Jan 2015 - Dec 2015):

  • Built software-defined data-center and cloud environments by deploying OpenStack and Open vSwitch based network management. Deployed and managed the OpenStack infrastructure, and wrote several wrappers to set up a small internal cloud.
  • Developed a novel prototype network manager called HybNET for hybrid network infrastructure with both SDN and legacy switches. The controller allowed centralized network management despite only a partial transition to SDN switches.


Business Analyst

McKinsey and Co

Jun 2008 – Aug 2008 New York, NY
Product Owner Proxy for the Scrum roll-out team (Agile software development) in McKinsey App-Dev. Also designed the architecture and a proof of concept for a trend analysis tool.

Graduate Research Assistant

Columbia University

Jan 2008 – Jan 2011 New York, NY
  • Thesis: Developed an on-the-fly sandboxed debugging framework called Parikshan, which allows developers to debug SOA applications hosted on user-space containers in a cloned parallel container, without any downtime or impact on the production-facing service.
  • Parikshan leverages live cloning, a modification of live migration, and a new network duplication proxy to enable on-the-fly cloning of OpenVZ containers.
  • Also worked on other projects associated with the lab: COMPASS, research in multi-core software engineering, binary/run-time instrumentation, static and dynamic program analysis, and recommender systems, as well as system administration and mentoring research students.

Research Consultant

Instituto de Soldadura e Qualidade

Feb 2007 – May 2007 Cascais, Portugal
Designed a prototype for a Decision Support Tool with an interactive interface for a combined Natural Gas + Hydrogen fuel being tested for use in pipelines all over Europe. The tool was designed in Visual Basic .NET.

Research Assistant

Indian Institute of Technology

Jan 2005 – Jan 2006 New Delhi, India
Worked in the Computer Integrated Manufacturing (CIM) Lab on comparing genetic algorithms, simulated annealing and tabu search algorithms to evaluate algorithm efficiencies.

Publications

Replay without Recording of Production Bugs for Service Oriented Applications
LogLens: A Real-Time Log Analysis System
An Analytics Approach to Traffic Analysis in Network Virtualization
PerfScope: Practical Online Server Performance Bug Inference in Production Cloud Computing Infrastructures
Enabling Layer 2 Pathlet Tracing through Context Encoding in Software-Defined Networking
IntroPerf: Transparent Context-Sensitive Multi-Layer Performance Inference using System Stack Traces
Uscope: A Scalable Unified Tracer from Kernel to User Space
CLUE: System Trace Analytics for Cloud Service Performance Diagnosis
Software System Performance Debugging with Kernel Events Feature Guidance
DeltaPath: Precise and Scalable Calling Context Encoding
HybNET: Network Manager for Hybrid Network Infrastructure
iProbe: A Lightweight User-Level Dynamic Instrumentation Framework
The weHelp Reference Architecture for Community-Driven Recommender Systems
weHelp: A Reference Architecture for Social Recommender Systems
COMPASS: COMmunity driven Parallelization Advisor for Software Systems
COMPASS: COMmunity driven Parallelization Advisor for Software Systems

Patents

Issued:
  • USPTO - 14030 Path Selection in Hybrid Networks (8/9/2016)
  • USPTO - 13148 Dynamic Border Line Tracing for Tracking Message Flows Across Distributed Systems (1/3/2017)
  • USPTO - 13062 Transparent Performance Inference of Whole Software Layers and Context Sensitive Performance Debugging (6/14/2016)
  • USPTO - 13035 Method and Apparatus for Managing Hybrid Network Systems (9/20/2016)
  • USPTO - 12155 Guarding a Monitoring Scope and Interpreting Partial Control Flow (10/18/2016)
  • USPTO - 12082 Method and System for Computer Assisted Hot-Tracing Mechanism (11/8/2016)
  • USPTO - 12049 Blackbox Memory Monitoring with a Calling Context Memory Map and Semantic Extraction (4/7/2015)
  • USPTO - 12016 Efficient Unified Tracing of Kernel and User Events with Multi-Mode Stacking (11/25/2014)
  • USPTO - 12010 Method and Apparatus for Correlated Tracing with Automated Multi-Layer Function Instrumentation Localization (7/28/2015)
  • Japan Patent Office - 13035J Hybrid Network Management (11/10/2015)
Pending:

Pending patents available on request.

Contact

nipun<at>cs<dot>columbia<dot>edu