Releasing Search Queries and Clicks Privately

Presented at: 18th International World Wide Web Conference (WWW2009)

by Aleksandra Korolova, Krishnaram Kenthapadi, Nina Mishra, Alexandros Ntoulas

Webpage: http://www2009.eprints.org/18/1/p171.pdf

The question of how to publish an anonymized search log was brought to the forefront by AOL's well-intentioned but privacy-unaware search log release. Since then, a series of ad hoc techniques has been proposed in the literature, though none is known to be provably private. In this paper, we take a major step towards a solution: we show how queries, clicks, and their associated perturbed counts can be published in a manner that rigorously preserves privacy. Our algorithm is decidedly simple to state but non-trivial to analyze. On the opposite side of privacy is the question of whether the data we can safely publish is of any use. Our findings offer a glimmer of hope: through a collection of experiments on a real search log, we demonstrate that a non-negligible fraction of queries and clicks can indeed be safely published. In addition, we select an application, keyword generation, and show that the keyword suggestions generated from the perturbed data resemble those generated from the original data.
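To make "queries and their associated perturbed counts" concrete, the following is a minimal sketch of one common way to publish noisy counts: cap each user's contribution, add Laplace noise, and release only queries whose noisy count clears a threshold. The abstract does not spell out the algorithm, so the function names, the parameters (max_per_user, threshold, select_scale, count_scale), and the choice of Laplace noise are all assumptions made for illustration. The sketch carries no privacy guarantee by itself; any guarantee depends on choosing the parameters according to an analysis such as the one in the paper.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample zero-mean Laplace noise; a scale of 0 means no noise."""
    if scale == 0:
        return 0.0
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_noisy_counts(user_queries, max_per_user, threshold,
                         select_scale, count_scale, rng=None):
    """Hypothetical helper: return {query: perturbed count} for released queries.

    user_queries maps a user id to that user's list of queries.
    All parameter names are illustrative assumptions, not the paper's notation.
    """
    rng = rng or random.Random()
    counts = Counter()
    for queries in user_queries.values():
        # Cap each user's contribution so no single user dominates a count.
        for query in queries[:max_per_user]:
            counts[query] += 1

    released = {}
    for query, count in counts.items():
        # One noise draw decides whether the query is published at all...
        if count + laplace_noise(select_scale, rng) > threshold:
            # ...and a fresh draw perturbs the count that is published.
            released[query] = count + laplace_noise(count_scale, rng)
    return released


if __name__ == "__main__":
    log = {"u1": ["weather", "news", "weather"],
           "u2": ["weather"],
           "u3": ["weather", "rare query"]}
    print(release_noisy_counts(log, max_per_user=2, threshold=2,
                               select_scale=1.0, count_scale=1.0,
                               rng=random.Random(0)))
```

Rare queries tend to fall below the threshold and are withheld, which matches the abstract's point that only a fraction of queries and clicks can be safely published.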

Keywords: Data Mining


Resource URI on the dog food server: http://data.semanticweb.org/conference/www/2009/paper/18

