astroML: Machine Learning for Astrophysics

Posted November 20, 2014

The amount of scientific data available to researchers and society is increasing at a rapid pace. Modern data acquisition methods and technologies can almost literally bury us under thick layers of new data generated daily and even hourly.

Undoubtedly, this trend is also driven by the continuous advancement of computational technology. But thanks to computers and the modern software tools they run, scientists and engineers are also gaining new and efficient means of coping with vast amounts of information.

These panels visualize a 4-dimensional correlation between orbits and surface color for about 35,000 main-belt asteroids (found between Mars and Jupiter) observed by the Sloan Digital Sky Survey. Image courtesy of the researchers.

Astronomy is no exception. Telescopes, satellites, detectors and various measurement devices enable astronomers to collect hundreds and thousands of terabytes of data. According to current predictions, over the next decade the volume of data should reach the petabyte domain, which will surely pose formidable challenges for those seeking to handle such abundant and complex data sets.

In a scientific paper that appeared on arXiv.org, a team of scientists from the University of Washington and the Georgia Institute of Technology, USA, presented this problem of ‘data abundance’ as a strong motivation for the development of new data mining, machine learning and knowledge discovery tools. In their newest work they describe the development and testing of astroML, a Python package which, as the authors say, is developed for “extracting knowledge from data, where ‘knowledge’ means a quantitative summary of data behavior, and ‘data’ essentially means results of measurements”.

In essence, astroML is a software tool whose creators adapted several readily available data processing techniques and packaged them as open-source code. The difference is that the new product was supplemented with algorithms specific to the field of astronomy. As the authors note, astroML is intended to serve two main purposes: as an open repository for those who seek to develop statistical routines commonly used in astronomy (using the Python programming environment), and as a source of examples of astrophysical data analysis that leverage techniques developed in the fields of statistics and machine learning.

The authors emphasize that the examples detailed in their paper (such as regression and model fitting, density estimation, estimation of data dimensionality, and time series analysis) are just a small portion of the methods implemented in the codebase of their new software tool. In this lightweight package, the scientists did not attempt to duplicate routines available in other well-known open-source libraries; instead, they prioritized keeping the codebase small by relying on existing tools and packages where available.
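
To give a rough sense of what one of these techniques, density estimation, looks like in practice, here is a minimal sketch written with the Python standard library only. It does not use astroML and does not reproduce the authors' code; the function name `gaussian_kde`, the bandwidth value and the synthetic "measurements" are all assumptions made for illustration.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the probability density at x.

    Illustrative helper (not an astroML function): each sample contributes
    a Gaussian bump of width `bandwidth`, and the bumps are averaged.
    """
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Synthetic "measurements": two made-up populations, not real survey data.
rng = random.Random(42)
samples = [rng.gauss(0.0, 1.0) for _ in range(300)]
samples += [rng.gauss(5.0, 0.5) for _ in range(200)]

# The bandwidth is a hand-picked assumption; real analyses tune it.
density = gaussian_kde(samples, bandwidth=0.4)

# Evaluate the estimate on a regular grid from -4 to 8.
step = 0.05
grid = [-4.0 + step * i for i in range(241)]
values = [density(x) for x in grid]

# A density should integrate to roughly 1 over the range of the data.
area = sum(values) * step
print(f"approximate area under the estimated density: {area:.3f}")
```

The resulting curve is a "quantitative summary of data behavior" in the sense quoted above: it turns a list of individual measurements into a smooth estimate showing where values concentrate, here around the two populations that were mixed together.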

The astroML package is available publicly and includes dataset loaders, statistical tools and hundreds of example scripts.

Written by Alius Noreika
