
Big data for the universe

February 09, 2017

Astronomers at Lomonosov Moscow State University, in cooperation with French colleagues and with the help of citizen scientists, have released «The Reference Catalog of galaxy SEDs» (RCSED), which contains value-added information about 800,000 galaxies. The catalog is accessible on the web, and its description has been published in the Astrophysical Journal Supplement (impact factor 11.257). Two co-authors of the research paper are undergraduate students at the Faculty of Physics, Lomonosov Moscow State University. While still working on the catalog, the team published several research papers based on its data, including a study in the prestigious interdisciplinary journal Science.

What can one learn using RCSED and why is it unique?

RCSED describes the properties of 800,000 galaxies derived from an elaborate data analysis. For every galaxy, it presents the stellar composition and the brightness at ultraviolet, optical, and near-infrared wavelengths. From RCSED, one can also access galaxy spectra obtained by the Sloan Digital Sky Survey, measurements of spectral lines, and properties determined from them, such as the chemical composition of the stars and gas contained in those galaxies. This makes RCSED the first catalog of its kind to contain the results of a detailed, homogeneous analysis for such a large number of objects. Dr. Igor Chilingarian, an astronomer at Smithsonian Astrophysical Observatory, USA, and a Lead Researcher at Sternberg Astronomical Institute, Lomonosov Moscow State University, says: "For every galaxy we also provide a small cutout image from three sky surveys, which show how the galaxy looks at different wavelengths. This provides us with the data for further investigations." Dr. Ivan Katkov, a Senior Researcher at Sternberg Astronomical Institute, adds: "The analysis of emission line profiles presented in RCSED is substantially more detailed and accurate than the data published in other catalogs."

RCSED is flexible and easy to use. Enter an object name or its coordinates in the search field, and the web site displays on a single page all the information the catalog holds for that object. One can also use the catalog through Virtual Observatory applications such as TOPCAT. The RCSED web site also provides tutorials, including one describing the technique that Igor Chilingarian and Ivan Zolotukhin used to discover new compact elliptical galaxies, later reported in the research paper «Isolated compact elliptical galaxies: Stellar systems that ran away».

Another interesting detail about RCSED is that the team actively used the help of citizen scientists to develop the project web site. Among them were high-level experts in software development and web design who have daytime jobs at the largest Russian IT companies. Dr. Ivan Zolotukhin, a Researcher at Sternberg Astronomical Institute, explains: "Programmers sometimes get burnt out by their routine work, and they would like to do something interesting and pleasant in their spare time, for instance, to help scientists. We are very grateful to them; they have become important members of our team and significantly strengthened our project. It has always been interesting for us to cooperate with IT specialists, and we have a lot more projects where they can contribute. So if you use git, program in Python or know HTML/CSS, love stars, have a bit of spare time and are willing to help an international research team, please contact us using the address published on the web page."

Dr. Ivan Katkov adds: "The RCSED catalog became possible thanks to the application of an interdisciplinary Big Data approach, as we had to apply very complex scientific algorithms to a large dataset in a massively parallel way. Eventually, the expertise and resources available at large IT companies would undoubtedly allow researchers to significantly increase the quality and the quantity of research results and to make many important discoveries in astrophysics."

The fact that the RCSED catalog attracted serious interest in the scientific community even during its assembly phase proves its great potential. During the last three years, several external researchers were given access to the catalog on request and, using RCSED data, published over a dozen articles in professional peer-reviewed journals (Astrophysical Journal, Astronomy & Astrophysics, MNRAS). The catalog is the world's largest homogeneous value-added dataset for nearby galaxies, containing information collected with ground-based and space telescopes. The unique research material for extragalactic astrophysics contained in RCSED will certainly help astrophysicists achieve new and interesting scientific results, some of which would probably qualify for publication in the interdisciplinary journals Science and Nature.

RCSED expansion prospects: one million galaxies will be there soon

The current release of the RCSED catalog could have comprised a larger number of galaxies or contained additional information about the objects already included, but for now the scientists have decided to focus on well-characterized datasets, which are described in detail and have known advantages and disadvantages. However, given the project's importance for extragalactic astronomy and observational cosmology, the RCSED team intends to expand the catalog in the near future.

There are two principal directions for further RCSED development: expanding the galaxy sample and incorporating new data for existing objects. The team is considering including near- and mid-infrared data from the WISE satellite all-sky survey for the entire galaxy sample. However, this requires some additional methodological work to homogenize the data for galaxies at different redshifts.

Moreover, it is possible to expand the principal galaxy sample by including spectra from the latest data release of the SDSS-III survey. This would increase the sample from 800,000 to 1.5 million objects.

Incorporating the publicly available spectral data from the Hectospec archive (Igor Chilingarian has played a major role in the Hectospec archive project) will add 300,000-400,000 objects at larger distances, whose spectra were collected with the 6.5-meter MMT telescope in Arizona. The current RCSED release comprises mostly nearby galaxies (by cosmological measures), with redshifts smaller than 0.4, because SDSS did not include faint objects. Therefore, the early Universe is not represented in the catalog at all. The Hectospec archive will allow the team to move a little further along the cosmological distance scale, out to a redshift of 0.7. If they add several thousand galaxies from the DEEP2 survey, conducted with the 10-meter Keck telescope in the early 2000s, they could gain insights into objects at redshifts up to 1.0, when the Universe was less than half of its present age.
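
The link between the redshift limits quoted above and the age of the Universe can be illustrated with a short calculation. The sketch below is not part of RCSED or its tools; it assumes a flat Lambda-CDM cosmology with illustrative parameters (a Hubble constant of 70 km/s/Mpc and a matter density of 0.3, neither taken from the RCSED paper) and neglects radiation, which is a reasonable approximation at these redshifts. The function names are hypothetical.

```python
import math

# Assumed flat Lambda-CDM parameters (illustrative, not from the RCSED paper).
H0_KM_S_MPC = 70.0   # Hubble constant in km/s/Mpc
OMEGA_M = 0.3        # matter density; dark energy is 1 - OMEGA_M

KM_PER_MPC = 3.0856775814913673e19
SECONDS_PER_GYR = 3.15576e16  # Julian years


def hubble_time_gyr(h0=H0_KM_S_MPC):
    """Return 1/H0 expressed in gigayears."""
    return KM_PER_MPC / h0 / SECONDS_PER_GYR


def age_gyr(z, h0=H0_KM_S_MPC, omega_m=OMEGA_M):
    """Age of a flat Lambda-CDM universe at redshift z, in gigayears.

    Uses the closed-form solution valid when radiation is neglected:
    t(z) = 2 / (3 H0 sqrt(OL)) * asinh(sqrt(OL / OM) * (1 + z)^(-3/2)).
    """
    omega_l = 1.0 - omega_m
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return hubble_time_gyr(h0) * 2.0 / (3.0 * math.sqrt(omega_l)) * math.asinh(x)


def age_fraction(z):
    """Age at redshift z as a fraction of the present age (z = 0)."""
    return age_gyr(z) / age_gyr(0.0)


if __name__ == "__main__":
    # Redshift limits mentioned in the text: SDSS, Hectospec, DEEP2.
    for z in (0.4, 0.7, 1.0):
        print(f"z = {z:.1f}: age = {age_gyr(z):.1f} Gyr, "
              f"fraction of present age = {age_fraction(z):.2f}")
```

Under these assumed parameters, the age at a redshift of 1.0 comes out at a little over 40 percent of the present age, consistent with the statement that the Universe was then less than half as old as it is now. Different but reasonable parameter choices shift the numbers slightly without changing that conclusion.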

Igor Chilingarian concludes: "We shall be able to see the global picture in about ten years from now, when large surveys like DESI have collected 25-30 million galaxy spectra out to intermediate redshifts."

-end-

The RCSED project has been supported by a collaborative grant provided by the Russian Foundation for Basic Research (RFBR) and the French National Center for Scientific Research (Centre National de la Recherche Scientifique, CNRS). At earlier stages, the project was supported by grants from the Russian Science Foundation (RScF) and the President of the Russian Federation, along with French resources available in the framework of the VO-Paris Data Center at the Paris Observatory.

Lomonosov Moscow State University
