Presented at: 16th International World Wide Web Conference (WWW2007)
by Yue Zhang, Jason Hong, Lorrie Cranor
Phishing is a significant problem involving fraudulent emails and web sites that trick unsuspecting users into revealing private information. In this paper, we present the design, implementation, and evaluation of CANTINA, a novel, content-based approach to detecting phishing web sites, based on the well-known TF-IDF algorithm used in information retrieval. We also discuss the design and evaluation of several heuristics we developed to reduce our false positive rates. Our experiments show that CANTINA detects phishing sites effectively, correctly labeling approximately 95% of them.
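For readers unfamiliar with TF-IDF, the following is a minimal sketch of the general technique, not the authors' implementation. The abstract names TF-IDF but does not describe how CANTINA applies it, so everything here is an assumption made for illustration: the function names `tf_idf` and `top_terms` are hypothetical, pages are assumed to be pre-tokenized lists of words, and the smoothed IDF variant is just one common choice.

```python
import math
from collections import Counter


def tf_idf(doc_tokens, corpus):
    """Score each term of doc_tokens by TF-IDF against corpus.

    doc_tokens: list of word tokens for the page being examined.
    corpus: list of token lists used as the reference collection (assumed).
    """
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus)
    scores = {}
    for term, count in counts.items():
        # Term frequency: share of the page's tokens that are this term.
        tf = count / total
        # Document frequency: number of reference documents containing the term.
        df = sum(1 for doc in corpus if term in doc)
        # Smoothed inverse document frequency (an assumed variant);
        # the +1 terms avoid division by zero for unseen terms.
        idf = math.log((1 + n_docs) / (1 + df)) + 1
        scores[term] = tf * idf
    return scores


def top_terms(doc_tokens, corpus, k=5):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    scores = tf_idf(doc_tokens, corpus)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]
```

Terms that are frequent on the page but rare in the reference collection score highest, which is why TF-IDF is often used to pick out the words that characterize a document.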
Resource URI on the dog food server: http://data.semanticweb.org/conference/www/2007/paper/main/557