Fake Mass-Produced Advertisements Detection on Global Online Adult Service Websites

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: A significant number of sex trafficking victims are advertised on online adult service websites, which are currently flooded with spam. Investigators rely on these websites to track cases of sex trafficking, but the ever-increasing volume of spam makes that task progressively more difficult. This thesis presents a machine learning-based approach for detecting fake mass-produced advertisements on global online adult service websites. The objective is to aid investigators by developing a robust spam classifier that identifies mass-produced spam while minimizing false positives on genuine ads, so that genuine ads are not filtered out of crucial investigations. The research involved cleaning advertisement text, generating text embeddings with sentence-BERT, clustering the embeddings with DBSCAN, and engineering features for classification with a random forest classifier. A dataset of two million advertisements was used for training and evaluation. The study achieved its goal of minimizing false positives. Using carefully engineered features, the classifier also demonstrates a high level of recall in distinguishing mass-produced spam from authentic ads. Furthermore, the investigation identified key markers of mass-produced spam, such as geographical spread and frequent use of profane language. This research fills a gap, as no previous attempts had been made to classify spam on these websites. The findings contribute to the field of machine learning and provide a comprehensive overview of the features of fraudulent advertisements, making sex trafficking investigations more efficient.
By equipping investigators with a reliable tool for navigating the vast amount of data on global online adult service websites, this work supports efforts to combat sex trafficking and to protect the integrity of the investigative process.
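
The clustering and feature-engineering steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the bag-of-words vectors are a stand-in for sentence-BERT embeddings, the DBSCAN routine is a plain textbook version, and the function names, the sample ads, and the values eps=0.3 and min_samples=3 are all assumptions made for illustration. The random forest step is omitted; the per-cluster features computed here are the kind of input such a classifier could be trained on.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase and keep alphabetic tokens (assumed stand-in for the cleaning step)."""
    return re.findall(r"[a-z]+", text.lower())


def embed(text):
    """Stand-in embedding: sparse bag-of-words counts, NOT sentence-BERT."""
    return Counter(clean(text))


def cosine_distance(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def dbscan(vectors, eps, min_samples):
    """Plain DBSCAN; returns one label per vector, with -1 meaning noise."""
    n = len(vectors)
    neighbors = [
        [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point, keep expanding
    return labels


def cluster_features(ads, labels):
    """Per-cluster size and number of distinct cities (geographical spread)."""
    grouped = {}
    for (_, city), label in zip(ads, labels):
        if label == -1:
            continue
        entry = grouped.setdefault(label, {"size": 0, "cities": set()})
        entry["size"] += 1
        entry["cities"].add(city)
    return {
        label: {"size": e["size"], "n_cities": len(e["cities"])}
        for label, e in grouped.items()
    }


# Synthetic (text, city) pairs invented for this sketch.
ads = [
    ("New in town, available now, call today!", "Stockholm"),
    ("NEW in town available now - call today", "Berlin"),
    ("new in town, available now, call today!!", "Madrid"),
    ("Experienced masseuse, quiet studio near the central station", "Oslo"),
    ("Weekend appointments only, please text before noon", "Paris"),
]
labels = dbscan([embed(text) for text, _ in ads], eps=0.3, min_samples=3)
print(labels)
print(cluster_features(ads, labels))
```

In this toy data the three near-identical ads posted from different cities fall into one cluster, while the two unique ads are marked as noise. A large cluster spread over many cities matches the geographical-spread marker mentioned in the abstract, although the thresholds that separate spam from genuine ads in practice are not given here.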
