Celebrity Twitter accounts display 'bot-like' behavior

August 01, 2017

'Celebrity' Twitter accounts - those with more than 10 million followers - display more bot-like behaviour than users with fewer followers, according to new research.

The researchers, from the University of Cambridge, used data from Twitter to determine whether bots can be accurately detected, how bots behave, and how they impact Twitter activity.

They divided accounts into categories based on total number of followers, and found that accounts with more than 10 million followers tend to retweet at similar rates to bots. In accounts with fewer followers, however, bots tend to retweet far more than humans. These celebrity-level accounts also tweet at roughly the same pace as bots with similar follower numbers, whereas in smaller accounts, bots tweet far more than humans. Their results will be presented at the IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) in Sydney, Australia.
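
As a rough illustration of grouping accounts by follower count, the sketch below assigns each account to a bucket. Only the 10 million threshold comes from the research described here; the lower boundaries, the bucket labels and the function name `follower_bucket` are assumptions made for the example.

```python
from bisect import bisect_left

# Upper bounds of each bucket. Only the 10 million boundary is taken from
# the article; the three lower boundaries are assumed for illustration.
BOUNDARIES = [1_000, 100_000, 1_000_000, 10_000_000]
LABELS = ["up to 1K", "1K to 100K", "100K to 1M", "1M to 10M", "over 10M"]

def follower_bucket(followers: int) -> str:
    """Return the follower-count bucket for an account.

    bisect_left keeps an account sitting exactly on a boundary in the lower
    bucket, so "over 10M" means strictly more than 10 million followers.
    """
    if followers < 0:
        raise ValueError("follower count cannot be negative")
    return LABELS[bisect_left(BOUNDARIES, followers)]

print(follower_bucket(25_000_000))
```

Retweet and tweet rates for bots and humans can then be compared bucket by bucket rather than across the whole population.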

Bots, like people, can be malicious or benign. The term 'bot' is often associated with spam, offensive content or political infiltration, but many of the most reputable organisations in the world also rely on bots for their social media channels. For example, major news organisations, such as CNN or the BBC, who produce hundreds of pieces of content daily, rely on automation to share the news in the most efficient way. These accounts, while classified as bots, are seen by users as trustworthy sources of information.

"A Twitter user can be a human and still be a spammer, and an account can be operated by a bot and still be benign," said Zafar Gilani, a PhD student at Cambridge's Computer Laboratory, who led the research. "We're interested in seeing how effectively we can detect automated accounts and what effects they have."

Bots have been on Twitter for the majority of the social network's existence - it's been estimated that anywhere between 40 and 60% of all Twitter accounts are bots. Some bots have tens of millions of followers, although the vast majority have fewer than a thousand - human accounts have a similar distribution.

In order to reliably detect bots, the researchers first used the online tool BotOrNot (since renamed Botometer), which is one of the few available online bot detection tools. However, their initial results showed high levels of inaccuracy. BotOrNot showed low precision in detecting bots that had bot-like characteristics in their account name, profile info, content tweeting frequency and especially redirection to external sources. Gilani and his colleagues then decided to take a manual approach to bot detection.

Four undergraduate students were recruited to manually inspect accounts and determine whether they were bots. This was done using a tool that automatically presented Twitter profiles, and allowed the students to classify the profile and make notes. Each account was collectively reviewed before a final decision was reached.

In order to determine whether an account was a bot (or not), the students looked at different characteristics of each account. These included the account creation date, average tweet frequency, content posted, account description, whether the user replies to tweets, likes or favourites received and the follower-to-friend ratio. A total of 3,535 accounts were analysed: 1,525 were classified as bots and 2,010 as humans.
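
Several of these characteristics are simple quantities that can be derived from an account's public counts. The sketch below shows one way that might look; the dictionary keys, the function name `account_features` and the sample values are hypothetical, and this is not the tool the students used.

```python
from datetime import date

def account_features(account: dict, today: date) -> dict:
    """Derive a few of the characteristics listed above from raw counts.

    The keys of `account` are hypothetical names chosen for this example.
    """
    # Treat accounts created today as one day old to avoid dividing by zero.
    age_days = max((today - account["created"]).days, 1)
    friends = account["friends"]
    return {
        "account_age_days": age_days,
        "tweets_per_day": account["tweets"] / age_days,
        # An account that follows nobody has an unbounded ratio.
        "follower_friend_ratio": (
            account["followers"] / friends if friends else float("inf")
        ),
        "replies_to_others": account["replies"] > 0,
    }

# Made-up account, for illustration only.
example = {
    "created": date(2017, 1, 1),
    "tweets": 50,
    "followers": 200,
    "friends": 100,
    "replies": 3,
}
print(account_features(example, today=date(2017, 1, 11)))
```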

The students showed very high levels of agreement on whether individual accounts were bots. However, they showed significantly lower levels of agreement with the BotOrNot tool.
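
The article does not say which agreement statistic the researchers used. One common choice for comparing two sets of labels is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a minimal version under that assumption; the function name and the sample labels are made up.

```python
from collections import Counter

def cohen_kappa(labels_a: list, labels_b: list) -> float:
    """Cohen's kappa for two raters who labelled the same accounts."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set")
    n = len(labels_a)
    # Share of accounts on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected if each rater labelled at random with the same
    # label frequencies they actually used.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        counts_a[k] * counts_b[k] for k in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Made-up labels, for illustration only.
student = ["bot", "bot", "human", "human"]
tool = ["bot", "human", "bot", "human"]
print(cohen_kappa(student, student), cohen_kappa(student, tool))
```

A value near 1 indicates agreement well beyond chance, while a value near 0 indicates agreement no better than chance.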

The bot detection algorithm they subsequently developed achieved roughly 86% accuracy in detecting bots on Twitter. The algorithm is built on a type of classifier known as a Random Forest, which draws on 21 different features to detect bots, and the classifier itself is trained on the original dataset labelled by the human annotators.
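
To give a feel for the idea behind a random forest, the toy sketch below trains many very small decision trees, each on a random resample of the labelled accounts and a random subset of features, and lets them vote. It is not the researchers' classifier: the trees here are single-split "stumps", the three features and all values are invented for the example, and the real model used 21 features.

```python
import random
from collections import Counter

def train_stump(rows, labels, feature_ids):
    """Find the single-feature threshold split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({row[f] for row in rows}):
            for label_above in (0, 1):
                errors = sum(
                    (label_above if row[f] > threshold else 1 - label_above) != y
                    for row, y in zip(rows, labels)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, threshold, label_above)
    return best[1:]  # (feature, threshold, label_above)

def train_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Train stumps on bootstrap samples, each seeing a random feature subset.

    Assumes `labels` contains both classes (0 = human, 1 = bot).
    """
    rng = random.Random(seed)
    forest = []
    while len(forest) < n_trees:
        sample = [rng.randrange(len(rows)) for _ in rows]
        if len({labels[i] for i in sample}) < 2:
            continue  # redraw samples that miss one of the two classes
        features = rng.sample(range(len(rows[0])), n_features)
        forest.append(train_stump([rows[i] for i in sample],
                                  [labels[i] for i in sample], features))
    return forest

def predict(forest, row):
    """Majority vote over the stumps; 1 means bot, 0 means human."""
    votes = Counter(label_above if row[f] > t else 1 - label_above
                    for f, t, label_above in forest)
    return votes.most_common(1)[0][0]

# Made-up rows: (tweets per day, share of retweets, share of tweets with links)
rows = [(120, 0.80, 0.90), (95, 0.70, 0.85), (200, 0.90, 0.95), (60, 0.75, 0.80),
        (4, 0.20, 0.10), (9, 0.30, 0.15), (2, 0.10, 0.05), (12, 0.25, 0.20)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = train_forest(rows, labels)
print(predict(forest, (150, 0.85, 0.90)), predict(forest, (1, 0.05, 0.02)))
```

Averaging many weak, slightly different trees is what makes the approach robust: no single tree's mistake decides the outcome.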

The researchers found that bot accounts differ from humans in several key ways. Overall, bot accounts generate more tweets than human accounts. They also retweet far more often, and redirect users to external websites far more frequently than human users. The only exception to this was in accounts with more than 10 million followers, where bots and humans showed far more similarity in terms of the volume of tweets and retweets.

"We think this is probably because bots aren't that good at creating original Twitter content, so they rely a lot more on retweets and redirecting followers to external websites," said Gilani. "While bots are getting more sophisticated all the time, they're still pretty bad at one-on-one Twitter conversations, for instance - most of the time, a conversation with a bot will be mostly gibberish."

Despite the sheer volume of tweets produced by bots, humans still produce higher-quality, more engaging tweets - tweets by human accounts receive on average 19 times more likes and 10 times more retweets than tweets by bot accounts. Bots also spend less time liking other users' tweets.

"Many people tend to think that bots are nefarious or evil, but that's not true," said Gilani. "They can be anything, just like a person. Some of them aren't exactly legal or moral, but many of them are completely harmless. What I'm doing next is modelling the social cost of these bots - how are they changing the nature and quality of conversations online? What is clear though, is that bots are here to stay."

University of Cambridge
