[SCIENCE] Big data reaching the stars

The Sloan Digital Sky Survey (SDSS) is an astronomical project launched in 2000 to map the universe,
and it is one of the best examples of using big data in science. Before the SDSS, astronomers had to rely
on individual observations and painstakingly analyse them by hand. However, with the SDSS, they could
observe hundreds of thousands of celestial objects simultaneously, generating massive amounts of data.
The SDSS uses a 2.5-metre telescope, located in the US state of New Mexico, to scan the night sky. Each
scan covers a patch of sky about three times the apparent size of the full Moon, collecting data on
millions of stars and galaxies. To date, the SDSS has observed over 500 million celestial objects and
generated more than 150 terabytes of data. That is enough to fill a tower of CDs over 320 kilometres high.
However, analysing such a large amount of data is not easy. The SDSS team has developed sophisticated
algorithms and tools that use machine learning to automatically identify different types of celestial
objects and classify them based on their properties. Thanks to the SDSS and other big data projects in
astronomy, we have a better understanding of the structure and evolution of the universe.