Cross-device User Clustering at Adobe (CANCELLED)

Thursday, November 29, 2018 - 11:00am to 12:30pm
Schapiro Hall (CEPSR) Davis Auditorium
Columbia University
New York, NY 10027
United States

Columbia Data Science Institute Industry Innovation Seminars

Charles Menguy, Senior Computer Scientist

As people now engage with digital properties using a myriad of devices such as laptops, smartphones, tablets, connected TVs and gaming consoles, the traditional cookie-based or device-level views of online user interaction are too narrow. Even when using a single device, a person may be assigned multiple IDs due to cookie churn or the use of different browsers. Marketers are looking through a fragmented lens and are spending their marketing dollars while seeing only a fraction of consumer interactions.

This talk will give an overview of how Adobe is tackling this problem using graph processing techniques, tracing our journey from the initial research ideas to a production system that runs at scale in the cloud while respecting user privacy. The focus will be on our open-source technology stack, including Apache Spark, GraphX, OpenTSDB, ... along with a description of our algorithms and the challenges we faced running them at very large scale on Amazon Web Services.

Bio: Charles is a data engineer/scientist at Adobe, where he has been involved in many projects revolving around big data, infrastructure and data science. Passionate about data, Charles is constantly on the lookout for the newest technologies and applications related to data science at scale. Since graduating with a Master’s degree in Computer Science and Artificial Intelligence from EPITA, France, in 2010, Charles has focused his career on the online advertising industry, working at an ad network, a Demand-Side Platform, and currently Adobe’s Data Management Platform.

