By Chun-Hao To
Ongoing and upcoming cosmological surveys, such as the Dark Energy Survey (DES), Rubin Observatory's Legacy Survey of Space and Time (LSST), the Nancy Grace Roman Space Telescope, and the Euclid Space Telescope, will produce tremendous volumes of astrophysical data in the near future. Extracting cosmological information from these data requires theory modeling that combines physical models of our Universe, astrophysical models of galaxies, and telescope- and survey-specific models of measurement uncertainties. These theory models are now quite sophisticated and typically have thirty to fifty parameters, making it hard to find plausible sets of parameters that describe the observational data (see our post from March 2022 for a discussion of state-of-the-art analyses at DES). As a result, analyses of modern cosmological datasets are highly time-consuming, even with the most advanced algorithms and software.
Specifically, portions of DES's latest analysis took as long as twenty-one days on state-of-the-art supercomputing clusters. These runs will get significantly longer as datasets grow and upcoming analysis models become more sophisticated. Longer runtimes increase energy usage and cost, and delay the cosmological constraints once the data are in hand. The computational cost also places a significant burden on our environment: a typical cosmological analysis is run thousands of times to test different scenarios, and the electricity consumed results in considerable CO2 emissions.
In this paper, we build a Likelihood Inference Neural Network Accelerator (LINNA, Fig. 1) to speed up the process of extracting cosmological information from the data. We achieve this using a specially designed deep neural network, a powerful machine learning method that can approximate complicated models accurately and is fast to evaluate. The deep neural network is then combined with standard Markov chain Monte Carlo (MCMC) samplers, which, boiled down to basics, draw samples from the probability distribution of the model parameters given the data. We find that the combination of neural networks with standard sampling techniques can accurately reproduce expensive standard analyses in a few iterations (Fig. 2), with total computational costs a factor of 8-50 smaller.
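To make the emulate-then-sample idea concrete, here is a minimal toy sketch of the loop: train a fast surrogate on a handful of expensive model evaluations, run MCMC with the surrogate, then retrain on points drawn from the resulting chain. It is not LINNA's code or architecture. The two-parameter "expensive" model, the random-feature tanh network (only the output weights are fitted, via least squares), the random-walk Metropolis sampler, and every name below are assumptions chosen to keep the example short and dependency-free.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(0.1, 1.0, 5)                          # toy "data vector" grid
LO, HI = np.array([0.5, 1.0]), np.array([1.5, 3.0])   # flat prior box
SIGMA = 0.05                                          # assumed noise level

def expensive_model(theta):
    """Hypothetical stand-in for a slow theory calculation."""
    return theta[0] * X ** theta[1]

class Emulator:
    """Toy surrogate: fixed random tanh hidden layer, least-squares readout."""
    def __init__(self, n_hidden=100):
        self.W = rng.normal(size=(2, n_hidden))
        self.b = rng.normal(size=n_hidden)
        self.coef = None

    def _hidden(self, thetas):
        # Rescale parameters to [-1, 1] before the hidden layer.
        u = 2.0 * (np.atleast_2d(thetas) - LO) / (HI - LO) - 1.0
        return np.tanh(u @ self.W + self.b)

    def fit(self, thetas, outputs):
        self.coef, *_ = np.linalg.lstsq(self._hidden(thetas), outputs, rcond=1e-10)

    def predict(self, theta):
        return (self._hidden(theta) @ self.coef)[0]

def make_log_post(model, data):
    """Gaussian likelihood with a flat prior inside the box [LO, HI]."""
    def log_post(theta):
        if np.any(theta < LO) or np.any(theta > HI):
            return -np.inf
        resid = data - model(theta)
        return -0.5 * np.sum(resid ** 2) / SIGMA ** 2
    return log_post

def metropolis(log_post, start, n_steps, step_size):
    """Plain random-walk Metropolis sampler."""
    current = np.array(start, dtype=float)
    current_lp = log_post(current)
    chain = np.empty((n_steps, current.size))
    for i in range(n_steps):
        proposal = current + step_size * rng.normal(size=current.size)
        lp = log_post(proposal)
        if np.log(rng.uniform()) < lp - current_lp:
            current, current_lp = proposal, lp
        chain[i] = current
    return chain

# Synthetic observation generated from known "true" parameters.
theta_true = np.array([1.0, 2.0])
data = expensive_model(theta_true) + SIGMA * rng.normal(size=X.size)

# Start with training points spread over the prior.
train_x = rng.uniform(LO, HI, size=(300, 2))
train_y = np.array([expensive_model(t) for t in train_x])

emu = Emulator()
for iteration in range(3):
    emu.fit(train_x, train_y)
    # Sample the posterior using only the cheap emulator.
    chain = metropolis(make_log_post(emu.predict, data),
                       0.5 * (LO + HI), 4000, np.array([0.05, 0.2]))
    # Evaluate the expensive model at 100 thinned chain points and add
    # them to the training set for the next iteration.
    new_x = chain[1000::30]
    train_x = np.vstack([train_x, new_x])
    train_y = np.vstack([train_y, [expensive_model(t) for t in new_x]])

# Check the last emulator on chain points it was not trained on.
check = chain[1000::300]
max_err = max(np.max(np.abs(emu.predict(t) - expensive_model(t))) for t in check)
post_mean = chain[1000:].mean(axis=0)
print("posterior mean:", post_mean, "max emulator error:", max_err)
```

In this toy setup the expensive model is called only 600 times in total, while the sampler makes 12,000 emulator calls. The design choice worth noting is the retraining step: drawing new training points from the chain concentrates the surrogate's accuracy where the posterior actually has weight.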
This new tool will be particularly useful for ongoing and future surveys. If we use LINNA on the upcoming LSST year one (LSST-Y1) cosmological analyses, the time spent on posterior inference of cosmological parameters will be greatly reduced. This not only speeds up scientific progress but also has non-trivial environmental and economic impacts. Under very conservative assumptions, we find that applying LINNA to the LSST first-year analyses can save US$300k in energy costs and avoid 2400 tons of CO2 emissions, an amount equivalent to the annual carbon footprint of approximately 65 astronomers.
With LINNA, we can learn much more, and more efficiently, about the fundamental cosmological evolution of the Universe and how we got from the start all the way to living on this lovely blue and green marble floating through space, while at the same time minimizing the damage to Spaceship Earth, the only home in this vast Universe that we humans have ever known, thus far.
-------------------- Extra Reading
Astronomers for Planet Earth -- The Astronomical Perspective