Researchers teach 'machines' to detect Medicare fraud

October 30, 2018

Using a highly sophisticated form of pattern matching, researchers from Florida Atlantic University's College of Engineering and Computer Science are teaching "machines" to detect Medicare fraud. Medicare, the primary health care coverage for Americans 65 and older, accounts for 20 percent of health care spending in the United States. About $19 billion to $65 billion is lost every year because of Medicare fraud, waste or abuse.

Like searching for the proverbial "needle in a haystack," human auditors and investigators have the painstaking task of manually checking thousands of Medicare claims for specific patterns that could indicate foul play or fraudulent behavior. Furthermore, according to the U.S. Department of Justice, fraud enforcement efforts currently rely heavily on health care professionals coming forward with information about Medicare fraud.

A study published in the journal Health Information Science and Systems is the first to use big data from Medicare Part B and employ advanced data analytics and machine learning to automate the fraud detection process. Programming computers to predict, classify and flag potential fraudulent events and providers could significantly improve fraud detection and lighten the workload for auditors and investigators.

Researchers from FAU's Department of Computer and Electrical Engineering and Computer Science examined the Medicare Part B dataset from 2012 to 2015. They focused on detecting fraudulent provider claims within the dataset, which consisted of 37 million cases. Fraudulent activities include patient abuse or neglect as well as billing for services not rendered. Physicians and other providers who commit fraud are excluded from participating in federal health care programs like Medicare, and these cases are labeled as "fraud."

For the study, the researchers aggregated the 37 million cases down to a smaller dataset of 3.7 million and identified a unique process to map fraud labels with known fraudulent providers.
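As a rough illustration of this kind of aggregation, the sketch below collapses claim-level rows into one summary row per provider. The grouping key (NPI plus provider type), the field names and the sample values are all assumptions made for illustration; the article does not say how the researchers grouped the data.

```python
from collections import defaultdict

# Hypothetical claim-level rows; field names and values are illustrative only.
claims = [
    {"npi": "1000000001", "provider_type": "Dermatology", "procedures": 12, "payment": 840.0},
    {"npi": "1000000001", "provider_type": "Dermatology", "procedures": 3, "payment": 210.0},
    {"npi": "1000000002", "provider_type": "Cardiology", "procedures": 5, "payment": 1250.0},
]

def aggregate_by_provider(rows):
    """Collapse claim-level rows into one summary row per (NPI, provider type)."""
    totals = defaultdict(lambda: {"procedures": 0, "payment": 0.0, "rows": 0})
    for row in rows:
        key = (row["npi"], row["provider_type"])  # assumed grouping key
        totals[key]["procedures"] += row["procedures"]
        totals[key]["payment"] += row["payment"]
        totals[key]["rows"] += 1
    return [
        {
            "npi": npi,
            "provider_type": ptype,
            "procedures": t["procedures"],
            "total_payment": t["payment"],
            "claim_rows": t["rows"],
        }
        for (npi, ptype), t in totals.items()
    ]

provider_rows = aggregate_by_provider(claims)
```

Here the three claim rows reduce to two provider rows, which mirrors in miniature how a claim-level dataset can shrink once it is summarized per provider.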

Medicare Part B data included provider information, average payments and charges, procedure codes, the number of procedures performed as well as the medical specialty, which is referred to as provider type. In order to obtain exact matches, the researchers only used the National Provider Identifier (NPI) to match fraud labels to the Medicare Part B data. The NPI is a single identification number issued by the federal government to health care providers.

Researchers directly matched the NPI across the Medicare Part B data, flagging any provider in the "excluded" database as being "fraudulent." The research team classified a physician's NPI or specialty and specifically looked at whether the predicted specialty differed from the actual specialty, as indicated in the Medicare Part B data.
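The exact-match labeling described above can be pictured with a short sketch. It assumes the exclusion list has already been reduced to a set of NPIs; the function name `label_fraud`, the field names and the sample NPIs are hypothetical and are not taken from the study.

```python
# Hypothetical provider-level rows from the Part B data (values are made up).
providers = [
    {"npi": "1000000001", "provider_type": "Dermatology"},
    {"npi": "1000000002", "provider_type": "Cardiology"},
    {"npi": "1000000003", "provider_type": "Internal Medicine"},
]

# Hypothetical NPIs taken from a list of excluded providers.
excluded_npis = {"1000000002"}

def label_fraud(rows, excluded):
    """Return copies of the rows with a 'fraud' flag set by exact NPI match."""
    return [{**row, "fraud": row["npi"] in excluded} for row in rows]

labeled = label_fraud(providers, excluded_npis)
```

Matching on the identifier alone avoids the near-misses that fuzzy matching on names or addresses could introduce, at the cost of missing any excluded provider whose NPI is not recorded.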

"If we can predict a physician's specialty accurately based on our statistical analyses, then we could potentially find unusual physician behaviors and flag these as possible fraud for further investigation," said Taghi M. Khoshgoftaar, Ph.D., co-author and Motorola Professor in FAU's Department of Computer and Electrical Engineering and Computer Science. "For example, if a dermatologist is accurately classified as a cardiologist, then this could indicate that this particular physician is acting in a fraudulent or wasteful way."

For the study, Khoshgoftaar, along with Richard A. Bauder, senior author, a Ph.D. student at FAU and a data scientist at FPL, and Matthew Herland, a Ph.D. student in FAU's Department of Computer and Electrical Engineering and Computer Science, had to address the fact that the original labeled big dataset was highly imbalanced. This imbalance occurred because fraudulent providers are much less common than non-fraudulent providers. This scenario can be likened to "Where's Waldo?" and is problematic for machine learning approaches because the algorithms are trying to distinguish between the classes -- and one dominates the other, thereby fooling the learner.

To combat this imbalance, the researchers used random undersampling to reduce the dataset from the 3.7 million cases down to about 12,000 cases. They created seven class distributions and used six different learners across class distributions from severely imbalanced to balanced.
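A minimal sketch of random undersampling follows, assuming that every fraud case is kept and that non-fraud cases are dropped at random until a target class ratio is reached. The function name, the toy data sizes and the two ratios shown are illustrative assumptions, not the study's actual configuration.

```python
import random

def undersample(majority, minority, minority_fraction, seed=0):
    """Keep every minority (fraud) case and randomly drop majority cases
    until the minority makes up roughly `minority_fraction` of the result."""
    if not 0 < minority_fraction < 1:
        raise ValueError("minority_fraction must be between 0 and 1")
    # Number of majority cases needed to reach the requested ratio.
    target = round(len(minority) * (1 - minority_fraction) / minority_fraction)
    target = min(target, len(majority))  # cannot keep more cases than exist
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    kept = rng.sample(majority, target)
    combined = kept + list(minority)
    rng.shuffle(combined)
    return combined

# Toy data: 1,000 non-fraud and 20 fraud cases (made-up sizes), labeled 0 and 1.
non_fraud = [("provider_%d" % i, 0) for i in range(1000)]
fraud = [("fraud_provider_%d" % i, 1) for i in range(20)]

balanced = undersample(non_fraud, fraud, 0.50)  # 50:50 distribution
skewed = undersample(non_fraud, fraud, 0.10)    # 90:10 distribution
```

Because only the majority class is sampled, the scarce fraud cases are never discarded; the chosen ratio simply controls how many non-fraud cases remain for the learner to see.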

Results from the study show statistically significant differences between all of the learners as well as differences in class distributions for each learner. RF100 (Random Forest), a learning algorithm, was the best at detecting the positives of potential fraud events.

More interestingly, and contrary to the popular belief that balanced datasets perform best, the study found that a balanced dataset was not optimal for Medicare fraud detection. Keeping more of the non-fraud cases actually helped the learner/model better distinguish between the fraud and non-fraud cases. Specifically, the researchers found the "sweet spot" for identifying Medicare fraud to be a 90:10 distribution of normal vs. fraudulent data.

"There are so many intricacies involved in determining what is fraud and what is not fraud such as clerical error," said Bauder. "Our goal is to enable machine learners to cull through all of this data and flag anything suspicious. Then, we can alert investigators and auditors who will only have to focus on 50 cases instead of 500 cases or more."

This detection method also has applications for other types of fraud, including insurance, banking and finance. The researchers are currently adding other Medicare-related data sources such as Medicare Part D, using more data sampling methods for class imbalance, and testing other feature selection and engineering approaches.

"Given the importance of Medicare, which insures more than 54 million Americans over the age of 65, combating fraud is an essential part in providing them with the quality health care they deserve," said Stella Batalama, Ph.D., dean of FAU's College of Engineering and Computer Science. "The methodology being developed and tested in our college could be a game changer for how we detect Medicare fraud and other fraud in the United States as well as abroad."

About FAU's College of Engineering and Computer Science:

Florida Atlantic University's College of Engineering and Computer Science is committed to providing accessible and responsive programs of education and research recognized nationally for their high quality. Course offerings are presented on-campus, off-campus, and through distance learning in bioengineering, civil engineering, computer engineering, computer science, electrical engineering, environmental engineering, geomatics engineering, mechanical engineering and ocean engineering. For more information about the college, please visit

About Florida Atlantic University:

Florida Atlantic University, established in 1961, officially opened its doors in 1964 as the fifth public university in Florida. Today, the University, with an annual economic impact of $6.3 billion, serves more than 30,000 undergraduate and graduate students at sites throughout its six-county service region in southeast Florida. FAU's world-class teaching and research faculty serves students through 10 colleges: the Dorothy F. Schmidt College of Arts and Letters, the College of Business, the College for Design and Social Inquiry, the College of Education, the College of Engineering and Computer Science, the Graduate College, the Harriet L. Wilkes Honors College, the Charles E. Schmidt College of Medicine, the Christine E. Lynn College of Nursing and the Charles E. Schmidt College of Science. FAU is ranked as a High Research Activity institution by the Carnegie Foundation for the Advancement of Teaching. The University is placing special focus on the rapid development of critical areas that form the basis of its strategic plan: Healthy aging, biotech, coastal and marine issues, neuroscience, regenerative medicine, informatics, lifespan and the environment. These areas provide opportunities for faculty and students to build upon FAU's existing strengths in research and scholarship. For more information, visit

Florida Atlantic University
