Dealing with structured and unstructured data at Facebook

Facebook has undergone tremendous growth in the last five years. We will start by looking at some basic statistics and trends that have accompanied this growth, and then dive into two topics. First, we will look at a general trend toward making data more structured at Facebook. More structured data is easier to manage, understand, and leverage. I will briefly discuss the tools (Hive) that have been built to enable the massive-scale data analysis that goes on at Facebook on a daily basis. In the second part of the talk, I will dive into the details of one of the systems that have contributed to the growth of Facebook: People You May Know. This system generates a significant number of the friend connections on Facebook, and by using increasingly sophisticated machine learning techniques, we have been able to make large improvements to its ranking since its original launch.
