Advancement made in the visualization of large, complex datasets

December 02, 2019

(Boston)--An improvement to the premier data visualization tool t-distributed Stochastic Neighbor Embedding (t-SNE), called optimized-t-SNE (opt-SNE), gives researchers a clearer view of exactly what is in their datasets.

opt-SNE is an advancement of the widely used t-SNE, created nearly 10 years ago. While t-SNE can accurately analyze approximately half a million cells in any given sample, single-cell datasets have become much larger in recent years. With opt-SNE, researchers can now visualize data from samples containing tens of millions of cells with unprecedented resolution.

The development of opt-SNE was led by Anna Belkina, MD, PhD, assistant professor of pathology and laboratory medicine at Boston University School of Medicine (BUSM).

In addition to its capacity to properly process big datasets, opt-SNE was also able to successfully visualize very small, distinct populations of cells in the blood samples tested (with each cell in these groups as rare as one in a hundred thousand of the total number of cells in the sample). Prior to opt-SNE, this accurate, large-scale visualization with simultaneous magnification of minuscule populations was not possible. "t-SNE was originally a 'one-size-fits-all' algorithm, but opt-SNE computations are tailored to each individual dataset, and this allows both a bird's-eye and an up-close view of what is in your sample. With opt-SNE, both the haystack and the needles within it can be seen," explained Belkina, the corresponding author of the study. "It is a particularly valuable tool for the investigation of cytometry and single-cell transcriptomics data."

The visualization of different populations within a sample of 20 million human blood cells using t-SNE (left) and opt-SNE (middle, right)

opt-SNE allows researchers to pinpoint previously undetectable features that distinguish diseased samples from controls. This new lens into disease states may reveal novel targets for therapies as well as new biological phenomena. Multiple research groups already use the approach, thanks to Belkina's ongoing collaborations with developers of major single-cell data analysis platforms. These collaborators, who also co-authored the manuscript, implemented opt-SNE in a cloud analysis platform (Christopher Ciccolella, MS) and in FlowJo software (Josef Spidlen, PhD, and Richard Halpert, PhD). An open-source opt-SNE package has also been released.

Additional co-authors of the study, which appears online in Nature Communications, include Rina Anno, PhD and Jennifer Snyder-Cappione, PhD.
Funding for this study was provided by the National Institutes of Health (Grant RO1AG060890-0).

Editor's Note: C.O.C. is a founder of Omiq, Inc. R.H. and J.S. are employees of Becton Dickinson (BD); FlowJo is a subsidiary of BD. The remaining authors declare no competing interests.

Boston University School of Medicine
