Scientists team with business innovators to solve 'big data' bottleneck

February 07, 2013

In a study that represents a potential cultural shift in how basic science research can be conducted, researchers from Harvard Medical School, Harvard Business School and London Business School have demonstrated that a crowdsourcing platform pioneered in the commercial sector can solve a complex biological problem more quickly than conventional approaches--and at a fraction of the cost.

Partnering with TopCoder, a crowdsourcing platform with a global community of 450,000 algorithm specialists and software developers, researchers identified a program that can analyze vast amounts of data, in this case from the genes and gene mutations that build antibodies and T cell receptors. Since the immune system takes a limited number of genes and recombines them to fight a seemingly infinite number of invaders, predicting these genetic configurations has proven a massive challenge, with few good solutions.

The program identified through this crowdsourcing experiment succeeded with an unprecedented level of accuracy and remarkable speed.

"This is a proof-of-concept demonstration that we can bring people together not only from different schools and different disciplines, but from entirely different economic sectors, to solve problems that are bigger than one person, department or institution," said Eva Guinan, HMS associate professor of radiation oncology at Dana-Farber Cancer Institute and director of the Harvard Catalyst Linkages Program. "Given how complicated the immune system is, this has been a particularly formidable biological problem, and building tools for solving it has been hard and time-consuming. We were stunned by the power of these results and their potential application."

"This study makes us think about how greater efficiencies in academic research can be obtained," said Karim Lakhani, associate professor in the Technology and Operations Management Unit at Harvard Business School. "In a traditional setting, a life scientist who needs large volumes of data analyzed will hire a postdoc to create a solution, and it could take well over a year. We're showing that in certain instances, existing platforms and communities might solve these problems better, cheaper and faster."

"We're excited to see that ideas from economics and management fields can be so productively applied to medical research," said Kevin Boudreau, assistant professor of strategy and entrepreneurship at London Business School. "This progress is heartening, particularly in view of the computational challenges we face in understanding so many diseases. We hope this provides a model of how social science and medical researchers can collaborate to solve real-world problems that matter to people."

These findings are reported February 7 in Nature Biotechnology.

For several years Boudreau, Guinan and Lakhani--through Harvard Catalyst--have explored the potential applicability of open and distributed innovation approaches to new areas, such as medical research. This has involved bringing insights from social science and economics to processes of medical research. They teamed up with Ramy Arnaout, HMS assistant professor of pathology at Beth Israel Deaconess Medical Center. Arnaout is also a systems biologist whose laboratory studies immune sequencing and other so-called "big-data" problems in biomedicine. Arnaout had developed computational methods for analyzing immune repertoires, but he could foresee having to invest significant computer and personnel resources to keep those methods able to handle the ever-increasing influx of data.

The researchers offered TopCoder what they thought would be an impossible goal: to develop a predictive algorithm that was an order of magnitude better than either Arnaout's or the NIH's standard algorithm (known as BLAST), and that could scale up to the mounting data demands. To do this, they had to first reframe the problem, translating it so that it could be accessible to individuals not trained in computational biology.

In only two weeks, viable solutions came from 122 different individuals. Among these, 16 were more accurate--and up to 1,000 times faster--than BLAST. The research team has released the top five performing code submissions under an open source license.

"This is more than just a quick, inexpensive answer," said Guinan. "It's uniting different approaches to a problem by taking from Harvard many disparate reservoirs of knowledge and bringing them together to formulate the question, analyze the data, and then put it back to use. This draws on our faculty in a very diverse way. By extending the numbers of people who look at our specific problem, we get solutions rapidly. We have a lot of biases about doing that, and we really shouldn't. In the end this allows researchers to turn their attention to basic science questions and not get caught up in details that they are less well suited to address."

"In a way, the immune system is really the dark matter of biology," said Arnaout. "We have all this sequence data, and there's no good way to figure out what it's doing. Not only did the best entries achieve truly superior performance, but also this kind of crowdsourcing has the potential to be a general solution for a whole class of problems in biology. No single university or institution has the bandwidth and resources to achieve this kind of result so quickly and efficiently."

Co-authors on the study included Po-Ru Loh (Massachusetts Institute of Technology), Lars Backstrom (TopCoder), Carliss Baldwin (HBS), Eric Lonstein (HBS), Mike Lydon (TopCoder) and Alan MacCormack (HBS).

This work was funded by Harvard Business School's Division of Research and Faculty Development, the NASA Tournament Lab at Harvard's Institute for Quantitative Social Science, and Harvard Catalyst.

Harvard Medical School
