From molecules to the Milky Way: dealing with the data deluge

November 08, 2007

Most people have a few gigabytes of files on their PC. In the next decade, astronomers expect to be processing 10 million gigabytes of data every hour from the Square Kilometre Array telescope.

And with DNA sequencing getting cheaper, scientists will be data mining possibly hundreds of thousands of personal human genome databases, each of 50 gigabytes.
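To put those volumes in more familiar units, the figures above can be converted with some simple arithmetic. The following is only an illustrative sketch: it uses decimal units (1 terabyte = 1,000 gigabytes), and the genome count of 100,000 is an assumed stand-in for "hundreds of thousands", not a number from CSIRO.

```python
# Illustrative unit conversions for the data volumes quoted in the article.
# Decimal units are assumed: 1 PB = 1,000 TB = 1,000,000 GB.
GB_PER_PB = 1_000_000
SECONDS_PER_HOUR = 3600

# Square Kilometre Array: 10 million gigabytes every hour (from the article).
ska_gb_per_hour = 10_000_000
ska_pb_per_hour = ska_gb_per_hour / GB_PER_PB
ska_gb_per_second = ska_gb_per_hour / SECONDS_PER_HOUR

# Personal genome databases: 50 gigabytes each (from the article).
genome_gb = 50
# ASSUMPTION: 100,000 genomes, standing in for "hundreds of thousands".
genome_count = 100_000
genomes_pb = genome_gb * genome_count / GB_PER_PB

print(f"SKA: {ska_pb_per_hour:.0f} PB per hour, "
      f"about {ska_gb_per_second:,.0f} GB per second")
print(f"{genome_count:,} genomes at {genome_gb} GB each: {genomes_pb:.0f} PB")
```

Under these assumptions, an hour of telescope data would be roughly twice the size of the whole assumed genome collection.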




CSIRO has a new research program aimed at helping science and business cope with masses of data from areas like astronomy, gene sequencing, surveillance, image analysis and climate modelling.

The research program, which began this year, is called 'Terabyte Science', named for the now-commonplace data sets that start at terabytes (thousands of gigabytes) in size.

"CSIRO recognises that, for its science to be internationally competitive, the organisation needs to be able to analyse large volumes of complex, even intermittently available, data from a broad range of scientific fields," says program leader, Dr John Taylor, from CSIRO Mathematical and Information Sciences.

One aspect of the problem is that methods that work with small data sets don't necessarily work with large ones.
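A simple illustration of this point, which is not drawn from the CSIRO program itself: computing an average and variance by loading every value into memory is fine for a small file, but fails once the data no longer fits. A single-pass approach (here Welford's well-known online algorithm) keeps only three running numbers. The function name below is hypothetical and the sketch is a minimal one.

```python
from typing import Iterable, Tuple


def streaming_mean_variance(values: Iterable[float]) -> Tuple[int, float, float]:
    """Return (count, mean, sample variance) in one pass over the data.

    Uses Welford's online algorithm, so memory use stays constant
    no matter how many values are streamed through.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance


# Usage: any iterable works, including a generator reading a huge file
# line by line; a short list is used here only for demonstration.
count, mean, variance = streaming_mean_variance(
    iter([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
)
print(count, mean, variance)
```

Because the function accepts any iterable, the same code could in principle be fed values read lazily from disk or a network stream rather than from a list held in memory.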

The program aims to develop completely new mathematical approaches and processes that scientists in a range of disciplines can use to further their research, boosting Australia's position as a world science leader.

"Large and complex data is emerging almost everywhere in science and industry and it will hold back Australian research and business if it isn't dealt with in a timely way," Dr Taylor says.

Countries like the US also recognise the challenges, as Dr Taylor saw first hand during his ten years working in laboratories there.

"This will need major developments in computer infrastructure and computational tools. It involves IT people, mathematicians and statisticians, image technologists, and other specialists from across CSIRO all working together in a very focussed way," he says.

Following a workshop in September, specific research areas were identified, and projects are now progressing in advanced manufacturing, high-throughput image analysis, modelling ocean biogeochemical cycles, situation analysis and environmental modelling.

CSIRO Australia



© 2008 BrightSurf.com