Nipun Arora

Computer Scientist/Software Engineer


  • PhD in Computer Science, 2017

    Columbia University

  • MSc in Computer Science, 2009

    Columbia University


  • Data Engineering/Streaming Pipelines
  • Search & Retrieval
  • Scalability and Performance
  • ML and ML Infra
  • PL & Compilers (once upon a time)

About Me:

I’m a researcher at heart and a software engineer in practice, with several years of experience leading software engineering teams on successful, impactful projects. I pride myself on delivering results and driving innovation in organizations, as well as improving engineering and business processes to raise developer productivity and benefit the org as a whole.

I have a PhD in Computer Science from Columbia University, where I worked at the Programming Systems Laboratory with Prof. Gail Kaiser. My research interests span data analytics, big data, stream processing, distributed systems, large-scale system debugging, and program analysis. I have also briefly worked on cloud computing and software-defined networking.

Work History:

I currently work as a Principal Engineer at Priceline in the Flights Backend Infrastructure Group, effectively as one of two Group Tech Leads for 4-6 scrum teams. I drive best practices and architect and drive changes across the Flights stack, and I represent the team in company-wide endeavors (moving to GKE, etc.). Before this, I worked as a Senior IC at Dropbox, New York, with the Previews Infrastructure Services Team. The previews infrastructure team provides middle-layer services that convert uploaded files into previewable content for all user-facing Dropbox frontends (this is the second-largest infra fleet at Dropbox, after storage). Earlier, I was a researcher at NEC Labs America, Princeton, NJ, where I worked with the Systems Research Group (formerly a part of the Autonomic Computing Group). I also briefly interned as a Business Analyst at McKinsey & Co., New York, in 2008. In my undergrad years, I interned as a Research Consultant at Instituto de Soldadura e Qualidade (Lisbon, Portugal), a research organization under the aegis of the European Union, where I was involved in a project called “Naturalhy”. I was also a Research Assistant at the Indian Institute of Technology (Delhi, India) in the Computer Integrated Manufacturing Lab, where I worked on supply chain management.

Recent News:

  • Invited talk at Google Journal Club: Replay without Recording of Production Bugs for SOA (ASE 2018)
  • Awarded Excellent Invention Awards for patent applications, 2017: Next Generation Log Analytics Application: An Automated Anomaly Detection Service on Heterogeneous Logs
  • Awarded NEC Business Contribution Award - 2016 (awarded for Research Commercialization)
  • Spot Recognition Award for Supporting Log Analysis Technology Development, Oct 2016
  • NEC Recognition Award for Creating Patent Portfolio for Log Analysis Technologies, Jun 2016
  • Awarded NEC Business Contribution Award - 2015 (awarded for Research Commercialization)

Publications/Patents:

  • 10 Issued Patents, 26 Filed Patents (pending) - as of Feb 24, 2017
  • 17 Peer-Reviewed Publications

Community Activity:

  • Program Committee Member Middleware 2015
  • Reviewer IEEE Transactions on Parallel and Distributed Systems
  • Peer Reviewer SPIN 2014
  • Peer Reviewer Globecom 2014
  • Reviewer IEEE Transactions on Services Computing
  • Peer Reviewer SCSC 2014
  • Peer Reviewer SDN-AA Workshop 2014
  • Peer Reviewer ICAC 2014
  • Peer Reviewer SIGMETRICS 2014

Previous Interns/Students:

  • Mohammad Ali Gulzar, PhD Student, UCLA, Summer 2016
  • Muhammad Solaimani, University of Texas Dallas, Summer 2016
  • Pradeep Fernando, PhD Student, Georgia Tech University, Summer 2015
  • Yuanzhen Gu, PhD Student, Rutgers University, Summer 2014
  • Advait Dixit, PhD Student, Purdue University, Spring 2014
  • Hui Lu, PhD Student, Purdue University, Summer 2013
  • Nitin Natrajan, MS, Columbia - Fall 2010
  • Jyotsna Sebe, MS, Columbia - Fall 2009
  • Bing Wu, MS, Columbia - Fall 2008, Spring 2009
  • Suhas Anand, MS, Columbia - Fall 2008
  • Junxiong Jia, MS, Columbia - Fall 2008

Experience

Group Technical Lead | Principal Software Engineer

Priceline

Oct 2019 – Present Greater New York City Area
  • Group Technical Lead & Architect for the Priceline flights backend infra team. I work with engineers across the flights org to improve processes and the architecture of our search/price and booking stacks.
  • Led the migration of flights infra from on-prem infrastructure to microservices hosted on Google Kubernetes Engine (GKE)
  • Delivered multiple performance and scalability improvements to our search stack, at both the implementation and architectural levels

Senior Software Engineer

Dropbox

Jan 2018 – Oct 2019 New York City
  • Part of the team that manages the previews serving infrastructure, which converts uploaded files into previewable formats. The service handles 20k requests per second at peak, performing conversions within jailed environments.
  • Migrated legacy HTTP routes to gRPC-based services by creating a wrapper service around legacy libraries, as a step towards SOA.
  • DRI for the migration of file metadata storage and the extraction pipeline to on-the-fly extraction. Gained alignment across multiple teams to deprecate defunct use cases and reduce costs by 400k/yr.
  • Worked on the new document conversion pipeline for converting MS Office documents into previewable formats securely in jailed environments, at scale.

Sr. Assoc Research Staff

NEC Labs America

Nov 2011 – Dec 2017 Princeton, NJ

NGLA: An end-to-end log analytics service (Jan 2015 – Nov 2017)

  • Architected and led the design and development of streaming anomaly detection with a NoSQL database (Elasticsearch), Kafka and Spark Streaming. Owned most components of the streaming analytics pipeline
  • Collaborated on the design of complex time-series, stateful and stateless log analytics in a multi-tier setup
  • Designed a control interface for streaming analytic job management (tasks involved: model management, in-memory states, periodic anomaly checks, start/stop, and cleanup)
  • Modified core Apache Spark code to add support for on-the-fly broadcast model updates, and leveraged this to deploy the model control management interface in Spark Streaming
  • Designed a prototype web interface for real-time visualization using Flask, JavaScript and Bootstrap
  • Founding member of an initial team of 3; experience in data cleaning, preprocessing, log pattern representation and parsing, including multiple POC trials on customer data

Behavior Analysis Engine (Jan 2017 – Nov 2017)

  • A semantic language framework for knowledge representation: it expresses complex machine learning log models and lets administrators express domain knowledge as “rules” and “behaviors”
  • Worked in a two-person team to write the language grammar and execution operations using Spark SQL
  • Developed (in progress) a RESTful API to convert BAE into an SOA with job and rule management

CLUE: Distributed System Trace Analytics (Jan 2013 – May 2015)

  • Stitched kernel event logs to generate end-to-end “transactions”, which can help give a “CLUE” to the root cause of bugs. CLUE uses data mining and transaction clustering to find potential anomalies
  • Developed a novel hybrid (static + dynamic) binary instrumentation tool called iProbe, with an order of magnitude better performance than state-of-the-art tools
  • Collaborated on core-engine development, designed the interface and data visualizations, and handled project management

NetLogic (Jan 2015 – Dec 2015)

  • Built software-defined data-center and cloud environments by deploying OpenStack and Open vSwitch based network management. Deployed and managed the OpenStack infrastructure, and wrote several wrappers to set up a small internal cloud
  • Developed a novel prototype network manager called HybNET for hybrid network infrastructure with both SDN and legacy switches. The controller allowed centralized network management despite only a partial transition to SDN switches

Business Analyst

McKinsey and Co

Jun 2008 – Aug 2008 New York, NY
Product Owner Proxy for the Scrum roll-out team (Agile software development) in McKinsey App-Dev. Also designed the architecture and a proof of concept of a trend analysis tool.

Graduate Research Assistant

Columbia University

Jan 2008 – Jan 2011 New York, NY
  • Thesis: Developed an on-the-fly sandboxed debugging framework called Parikshan, which allows developers to debug SOA applications hosted on user-space containers in a cloned parallel container, without any downtime or impact on the production-facing service
  • Parikshan leverages live cloning (a modification of live migration) and a new network duplication proxy to enable on-the-fly cloning of OpenVZ containers
  • Also worked on other projects associated with the lab: COMPASS, research in multi-core software engineering, binary/run-time instrumentation, static and dynamic program analysis, and recommender systems, as well as system administration and mentoring research students.

Research Consultant

Instituto de Soldadura e Qualidade

Feb 2007 – May 2007 Lisbon, Portugal
Designed a prototype Decision Support Tool with an interactive interface for a combined natural gas and hydrogen fuel being tested for use in pipelines across Europe. The tool was built in Visual Basic .NET.

Research Assistant

Indian Institute of Technology

Jan 2005 – Jan 2006 Delhi, India
Worked in the Computer Integrated Manufacturing (CIM) Lab on comparing genetic algorithms, simulated annealing and tabu search algorithms to evaluate algorithm efficiencies.

Projects

NGLA: Next Generation Log Analytics

Most modern software generates human-readable logs so that developers and administrators can understand the cause of an error or the behavior of the system. However, the volume, velocity and non-uniform formats of these logs make it difficult for administrators to find the root cause of errors in their systems. NGLA is a log analytics framework that automatically detects log patterns and leverages them to provide state-of-the-art automated real-time log anomaly detection.

CLUE

Modern computer systems, from single servers to large cloud deployments, generate billions of events that reflect the state and operation of the system. CLUE provides a black-box, unsupervised debugging tool to mine event patterns and diagnose performance issues in these systems. CLUE uses novel data mining technologies for automated information retrieval and a state-of-the-art debugging toolset to integrate and profile event transactions.

Publications

Patents

Issued:
  • USPTO - 14030 Path Selection in Hybrid Networks (8/9/2016)
  • USPTO - 13148 Dynamic Border Line Tracing for Tracking Message Flows Across Distributed Systems (1/3/2017)
  • USPTO - 13062 Transparent Performance Inference of Whole Software Layers and Context Sensitive Performance Debugging (6/14/2016)
  • USPTO - 13035 Method and Apparatus for Managing Hybrid Network Systems (9/20/2016)
  • USPTO - 12155 Guarding a Monitoring Scope and Interpreting Partial Control Flow (10/18/2016)
  • USPTO - 12082 Method and System for Computer Assisted Hot-Tracing Mechanism (11/8/2016)
  • USPTO - 12049 Blackbox Memory Monitoring with a Calling Context Memory Map and Semantic Extraction (4/7/2015)
  • USPTO - 12016 Efficient Unified Tracing of Kernel and User Events with Multi-Mode Stacking (11/25/2014)
  • USPTO - 12010 Method and Apparatus for Correlated Tracing with Automated Multi-Layer Function Instrumentation Localization (7/28/2015)
  • Japan Patent Office - 13035J Hybrid Network Management (11/10/2015)
Pending:

Pending patents available on request.

Contact

nipun<at>cs<dot>columbia<dot>edu