Ensuring the Integrity of Wikipedia: A Data Science Approach

Abstract

In this paper, we present our research on the problem of ensuring the integrity of Wikipedia, the world’s biggest free encyclopedia. As anyone can edit Wikipedia, many malicious users take advantage of this situation to make edits that compromise pages’ content quality. Specifically, we present DePP, the state-of-the-art tool that detects article pages to protect with an accuracy of 93% and we introduce our research on identifying spam users. We show that we are able to classify spammers from benign users with 80.8% of accuracy and 0.88 mean average precision.

Publication
In 1st Women in Data Science Workshop (WinDS) @ SDM 2017.
Date
Links