CyberSKA Global Science Data Delivery Platform

Exposing the invisible universe

Department of Physics and Astronomy, University of Calgary

The universe is full of cosmological marvels that are mostly invisible to the human eye, such as dark matter, black holes, and colliding galaxies. We are able to “see” these astronomical phenomena and peer into the distant workings of the universe because of a new generation of radio telescopes.

The CyberSKA platform lets astronomers around the world instantly collaborate on and visualize massive data sets generated by radio telescopes such as the Square Kilometre Array.

Extending the observable universe

The first of these next-generation radio telescopes, the Square Kilometre Array (SKA), is currently in development and partially functional. The SKA will combine the signals received from thousands of antennas spread over a distance of more than 3,000 km to simulate a single giant radio telescope capable of extremely high sensitivity. By the time the SKA is fully operational in 2024, it will be the world’s largest radio telescope, 50 times more sensitive and 10,000 times quicker at imaging large volumes of the universe than anything we have to date. Drawing on the support of 11 countries, this multibillion-dollar project will extend the observable universe, helping scientists determine how the first stars and galaxies were formed, what the fabric of the universe is, whether Einstein’s theory of relativity is correct in extreme situations, and whether we are alone in the universe.

The SKA will be one of the largest global science projects ever undertaken, and it represents a massive data challenge.

The need for digital infrastructure

The imaging power of the SKA will create massive multi-dimensional data sets. Surveys conducted by globally distributed research teams are expected to generate additional data at an unprecedented rate of almost 400 petabytes per day, an amount of data equivalent to 85 million DVDs. With such large data volumes, the traditional approach of downloading to local machines is not feasible, underscoring the need for digital infrastructure to remotely access, share, analyze, and visualize the data sets.
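The DVD comparison can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes decimal units and a single-layer DVD capacity of 4.7 GB, neither of which is stated in the article, and the function name is made up for this example.

```python
# Rough check of the "400 petabytes per day ~ 85 million DVDs" comparison.
# Assumptions (not from the article): decimal units (1 PB = 10**15 bytes)
# and a single-layer DVD holding 4.7 GB.
PETABYTE = 10**15
GIGABYTE = 10**9

daily_volume_bytes = 400 * PETABYTE
dvd_capacity_bytes = 4.7 * GIGABYTE


def dvds_needed(volume_bytes, capacity_bytes=dvd_capacity_bytes):
    """Return how many discs of the given capacity the volume would fill."""
    return volume_bytes / capacity_bytes


dvds_per_day = dvds_needed(daily_volume_bytes)
print(f"{dvds_per_day / 1e6:.1f} million DVDs per day")  # prints "85.1 million DVDs per day"
```

Under these assumptions the figure comes out at about 85 million discs per day, in line with the comparison above.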

CyberSKA is a research software platform developed by the University of Calgary to share data and visualizations and to support collaborative research on big data sets by distributed researchers. It currently supports data-intensive radio astronomy surveys conducted on telescopes such as the Expanded Very Large Array (EVLA) in New Mexico. Using this technology, radio astronomers can instantly visualize data sets that would normally take days to download, even over the fastest connection, from the convenience of their desktops or laptops. It also enables them to simultaneously view and control the same visualization session from different geographic locations.

Potential beyond the stars

CyberSKA will be of tremendous value to scientists around the world by allowing them to undertake explorations that will look back in time to the formation of the universe. In addition, its core technology and tools will have potential in other industries where huge amounts of distributed data need to be analyzed and visualized. This includes biotechnology, resource management, energy, and information technology.

CyberSKA will not only help Canada maintain its leadership position in astronomy but also lead the world in big-data research.

Important tradition of software reuse

The CyberSKA project is supported by CANARIE’s Research Software Program and uses Canada’s advanced fibre-optic networks to move the huge volumes of data involved. It is based on a previous CANARIE-supported project of the same name for managing a pool of visualization servers. To continue the important tradition of software reuse across diverse disciplines, CANARIE also supported the development of CyberSKA’s multispectral visualization tools (the Visualization Server Pool Management Service), which are available to the research community at no cost through the Software Services Registry.

Funding for the development of CyberSKA was provided through CANARIE’s Research Software Program.