Carnegie Mellon algorithm detects online fraudsters

September 08, 2016

PITTSBURGH -- An algorithm developed at Carnegie Mellon University makes it easier to determine if someone has faked an Amazon or Yelp review or if a politician with a suspiciously large number of Twitter followers might have bought and paid for that popularity.

The method, called FRAUDAR, marks the latest escalation in the cat-and-mouse game played by online fraudsters and the social media platforms that try to out them. In particular, the new algorithm makes it possible to see through camouflage that fraudsters use to make themselves look legitimate, said Christos Faloutsos, professor of machine learning and computer science.

In real-world experiments using Twitter data for 41.7 million users and 1.47 billion follower relationships, FRAUDAR fingered more than 4,000 accounts not previously identified as fraudulent, including many that used known follower-buying services such as TweepMe and TweeterGetter.

"We're not identifying anything criminal here, but these sorts of frauds can undermine people's faith in online reviews and behaviors," Faloutsos said. He noted most social media platforms try to flush out such fakery, and FRAUDAR's approach could be useful in keeping up with the latest practices of fraudsters.

The CMU algorithm is available as open-source code at http://www.andrew.cmu.edu/user/bhooi/camo.zip. A research paper describing the algorithm won the Best Paper Award last month at the Association for Computing Machinery's Conference on Knowledge Discovery and Data Mining (KDD2016) in San Francisco.

Faloutsos and his data analytics team specialize in graph mining, a method that looks for patterns in the data. In this case, social media interactions are plotted as a graph, with each user represented as a dot, or node, and transactions between users represented as lines, or edges.

The state of the art for detecting fraudsters, with tools such as Faloutsos' NetProbe, is to find a pattern known as a "bipartite core." These are groups of users who have many transactions with members of a second group, but no transactions with each other. This suggests a group of fraudsters, whose only purpose is to inflate the reputations of others by following them, by having fake interactions with them, or by posting flattering or unflattering reviews of products and businesses.
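As a rough illustration of the bipartite-core pattern described above, the following Python sketch measures how closely two groups of accounts match it. The function name `core_stats`, the account names and the tiny edge list are all hypothetical; this is not NetProbe or any published detector, only a way to make the definition concrete.

```python
def core_stats(edges, group_a, group_b):
    """Return (cross_density, internal_edges) for two groups of accounts.

    cross_density is the share of possible group_a -> group_b links that
    actually exist; internal_edges counts links that stay inside one group.
    A bipartite core has a cross_density near 1.0 and no internal edges.
    """
    a, b = set(group_a), set(group_b)
    edge_set = set(edges)  # ignore duplicate (source, target) pairs
    cross = sum(1 for s, t in edge_set if s in a and t in b)
    internal = sum(
        1 for s, t in edge_set
        if (s in a and t in a) or (s in b and t in b)
    )
    possible = len(a) * len(b)
    density = cross / possible if possible else 0.0
    return density, internal


# Hypothetical data: f1 and f2 follow both targets; h1 behaves normally.
edges = [
    ("f1", "t1"), ("f1", "t2"),
    ("f2", "t1"), ("f2", "t2"),
    ("h1", "t1"), ("h1", "h2"),
]

suspicious = core_stats(edges, ["f1", "f2"], ["t1", "t2"])
ordinary = core_stats(edges, ["h1", "h2"], ["t1", "t2"])
print(suspicious, ordinary)
```

In this toy data the first pair of groups is fully connected across and has no links within, which is the signature the paragraph describes, while the second pair is sparse and has an internal link.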

But fraudsters have learned to camouflage themselves, Faloutsos said. They link their fraudulent accounts with popular sites or celebrities, or they use legitimate user accounts they have hijacked. In either case, they try to look "normal." FRAUDAR can prune away this camouflage. Essentially, the algorithm begins by finding accounts that it can confidently identify as legitimate -- accounts that may follow a few random people, those that post only an occasional review and those that otherwise have normal behaviors. This pruning occurs repeatedly and rapidly. As these legitimate accounts are eliminated, so is the camouflage the fraudsters rely upon. This makes bipartite cores easier to spot.
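The repeated pruning described above can be sketched as a greedy "peeling" loop. The version below is a simplified illustration, not the released FRAUDAR code: it assumes account names are strings, scores a group by plain edges per node with no edge weighting, and uses a slow linear scan where a real implementation would need a faster data structure. The function name `greedy_peel` and the sample accounts are hypothetical.

```python
from collections import defaultdict


def greedy_peel(edges):
    """Repeatedly remove the least-connected account and return the
    densest (edges per node) remaining group seen along the way."""
    adj = defaultdict(set)
    for follower, followee in edges:
        # Tag each side so a follower and a followee never share a key.
        adj[("u", follower)].add(("v", followee))
        adj[("v", followee)].add(("u", follower))

    alive = set(adj)
    if not alive:
        return set(), 0.0
    n_edges = len(set(edges))
    best_density, best_set = n_edges / len(alive), set(alive)

    while len(alive) > 1:
        # The least-connected account looks the most ordinary: prune it.
        node = min(alive, key=lambda n: (len(adj[n]), n))
        for nbr in adj[node]:
            adj[nbr].discard(node)
        n_edges -= len(adj[node])
        alive.remove(node)
        density = n_edges / len(alive)
        if density > best_density:
            best_density, best_set = density, set(alive)
    return best_set, best_density


# Hypothetical data: four fake followers boost three targets and also
# follow a celebrity as camouflage; five ordinary users follow one account.
fakes = ["f1", "f2", "f3", "f4"]
edges = [(f, t) for f in fakes for t in ["t1", "t2", "t3"]]
edges += [(f, "celebrity") for f in fakes]
edges += [("h1", "celebrity"), ("h2", "celebrity"), ("h3", "t1"),
          ("h4", "x1"), ("h5", "x2")]

block, density = greedy_peel(edges)
print(sorted(block), density)
```

In this toy run the ordinary users are pruned first and the four fake followers remain in the densest group. The celebrity they all follow also remains, which is one reason an unweighted score like this one is only a starting point for handling camouflage.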

To test the algorithm, Faloutsos and his students used a massive Twitter database extracted from the social media platform in 2009 for research purposes. FRAUDAR found more than 4,000 accounts that appeared highly suspicious, though most of the tweets had not been removed and the accounts had not been suspended in the seven years since the data was collected. The researchers randomly selected 125 followers and 125 followees from the suspicious group, along with two control groups of 100 users who had not been picked out by the algorithm. They examined each for links associated with malware or scams and for clear robot-like behavior, such as replying to large numbers of tweets with identical messages. They found 57 percent of the followers and 40 percent of the followees in the suspicious group were labeled as fraudulent, compared to 12 percent and 25 percent in the control groups.

Among the suspicious accounts, the researchers found 41 percent of the followers and 26 percent of the followees included advertising for follower-buying services -- 62 percent and 42 percent, respectively, if deleted or suspended accounts are ignored. Few such mentions were found in the control groups.

"The algorithm is very fast and doesn't require us to target anybody," Faloutsos said. "We hope that by making this code available as open source, social media platforms can put it to good use."
In addition to Faloutsos, the research team included Ph.D. students Bryan Hooi, Hyun Ah Song, Neil Shah and Kijung Shin, and Alex Beutel, who recently received his Ph.D. in computer science. The National Science Foundation supported this research.

About Carnegie Mellon University: Carnegie Mellon is a private, internationally ranked research university with programs in areas ranging from science, technology and business, to public policy, the humanities and the arts. More than 13,000 students in the university's seven schools and colleges benefit from a small student-to-faculty ratio and an education characterized by its focus on creating and implementing solutions for real problems, interdisciplinary collaboration and innovation.

Carnegie Mellon University
