Using Machine Learning to find quasar lenses in DESI data

Mar 21, 2024

 

Everett McArthur (Credit: KIPAC.)

by Everett McArthur

Have you ever looked through a wine glass and noticed objects farther away appear distorted? This effect, caused by the bending of light as it passes through the curved surface, is somewhat similar to strong gravitational lensing; like the wine glass warping light from distant objects, a foreground galaxy warps the appearance of a galaxy behind it by magnifying it, distorting it into arcs, and/or creating multiple images of it (see below). In my research, I use machine learning and data from the Dark Energy Spectroscopic Instrument (DESI) to look for a specific type of gravitational lens: a quasar lensing a background galaxy. But I want to learn about the lens itself, not the galaxy behind it.

This animation shows a gravitationally lensed galaxy, and how the two galaxies in the foreground comprise the lens that distorts and magnifies the distant galaxy's light. (Credit: EPFL/Austin Pool.)

How to turn a lens on itself

A supermassive black hole (SMBH) is thought to reside at the heart of nearly every massive galaxy. At certain points in the galaxy’s evolution, the SMBH can undergo periods when gas and dust from the surrounding environment fall into its gravitational well. As the matter heats up and accretes onto the SMBH, the region around the black hole glows in the optical and UV, becoming extremely luminous and transforming into a quasar. Previous studies indicate that a tight correlation exists between the mass of the black hole and the mass of the host galaxy, suggesting a coevolutionary relationship. The exact nature of the relationship is still an active area of study, complicated by the fact that the mass of the host galaxy is difficult to measure, primarily because the quasar’s light outshines that of the host galaxy and anything nearby.

But with quasars that act as strong lenses for background galaxies, the mass of the quasar’s host galaxy can be precisely measured from the distorted image that’s created by gravitational lensing. So far, only three quasar lenses [Figure 1] have been found in the Sloan Digital Sky Survey (SDSS) using imaging. Along with collaborators and KIPAC affiliates Martin Millon and Meredith Powell, I expect to add to this short list by using machine learning to search for lensing quasars in DESI’s Early Data Release (EDR), which contains thousands of spectra of galaxies at various cosmological distances.

 

Figure 1: The three confirmed lenses from the Sloan Digital Sky Survey with a quasar at the heart of each and white arrows pointing to the distorted background galaxies. (Credit: NASA / ESA / Z. Levay / [2] STScI/ F. Courbin (EPFL, Switzerland).)

 

Training and candidate selection

The spectrum of a quasar is a zoo of distinct features that may indicate the physical characteristics of the chaotic environment that exists around it. These features make quasar spectra fairly easy to identify, but if there is a background galaxy behind a quasar the instrument is bound to catch the light of both, which means the spectrum will contain features from both objects.

We use machine learning techniques, specifically a Convolutional Neural Network (CNN), for two reasons. First, the early data release contains thousands of quasar spectra, and inspecting each one by eye would be arduous. Second, a CNN is well suited to analyzing images or, in our case, spectra: each convolutional layer learns to pick out specific features in the data (noise, emission, and absorption features), and the network then classifies the spectrum as a lens or not [Figure 2].
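
To make the idea of a convolutional layer concrete, here is a minimal sketch in plain Python of a single filter sliding along a toy spectrum. It is not the network used in this work: the filter values, the weight and bias, and the function names (conv1d, toy_lens_score) are all hypothetical and set by hand for illustration, whereas a real CNN learns many filters from the training data.

```python
import math

def conv1d(signal, kernel):
    """Slide the kernel along the signal ('valid' mode, no padding)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def relu(values):
    """Keep positive responses and zero out the rest."""
    return [max(0.0, v) for v in values]

def sigmoid(x):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def toy_lens_score(flux, kernel, weight, bias):
    """One filter -> ReLU -> global max pool -> sigmoid score."""
    feature_map = relu(conv1d(flux, kernel))
    pooled = max(feature_map)  # strongest response anywhere in the spectrum
    return sigmoid(weight * pooled + bias)

# Hand-set, zero-sum filter (an assumption): it ignores a flat continuum
# and responds strongly to a narrow spike such as an emission line.
kernel = [-1.0, 2.0, -1.0]

flat = [1.0] * 20      # featureless toy spectrum
with_line = flat[:]
with_line[10] = 4.0    # add one narrow "emission line"

score_flat = toy_lens_score(flat, kernel, weight=2.0, bias=-3.0)
score_line = toy_lens_score(with_line, kernel, weight=2.0, bias=-3.0)
print(f"flat: {score_flat:.3f}, with line: {score_line:.3f}")
```

The flat spectrum produces no filter response and a low score, while the spectrum with an extra narrow line scores close to 1. Stacking many learned filters of this kind is what lets a CNN separate a quasar's own features from those of a background galaxy.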

Substantial amounts of data are required to train a CNN to effectively discern important features from noise in a spectrum. In our case, we do not have thousands of lenses to train on, so we created example lenses using real data from the DESI EDR by combining quasar spectra with the spectra of more distant emission line galaxies.
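
As a rough sketch of how such an example lens can be assembled, the snippet below adds a toy emission-line galaxy spectrum to a toy quasar spectrum on a shared wavelength grid. Everything here is an assumption for illustration: the function names, the Gaussian line shapes, the wavelength grid, the redshift, and the simple flux addition with a scale factor. The actual training set was built from real DESI EDR spectra rather than synthetic ones.

```python
import math

def gaussian_line(wavelengths, center, sigma, amplitude):
    """Toy emission line: a Gaussian bump in flux at the given wavelength."""
    return [amplitude * math.exp(-0.5 * ((w - center) / sigma) ** 2)
            for w in wavelengths]

def make_mock_lens(quasar_flux, galaxy_flux, scale=1.0):
    """Add a (scaled) background-galaxy spectrum to a quasar spectrum.

    Both spectra must already be sampled on the same observed-frame grid.
    """
    if len(quasar_flux) != len(galaxy_flux):
        raise ValueError("spectra must share the same wavelength grid")
    return [q + scale * g for q, g in zip(quasar_flux, galaxy_flux)]

# Hypothetical observed-frame grid in angstroms (3600 to 9798, 2 A steps).
wavelengths = [3600.0 + 2.0 * i for i in range(3100)]

# Toy quasar: flat continuum plus one broad emission line.
broad = gaussian_line(wavelengths, center=5000.0, sigma=40.0, amplitude=3.0)
quasar_flux = [1.0 + b for b in broad]

# Toy background galaxy: a narrow [O II] line (rest ~3727 A) seen at an
# assumed redshift z = 1.0, so it lands at 3727 * (1 + z) = 7454 A.
z_galaxy = 1.0
oii_observed = 3727.0 * (1.0 + z_galaxy)
galaxy_flux = gaussian_line(wavelengths, center=oii_observed,
                            sigma=3.0, amplitude=2.0)

mock_lens_flux = make_mock_lens(quasar_flux, galaxy_flux)
peak_index = wavelengths.index(oii_observed)
print(wavelengths[peak_index], round(mock_lens_flux[peak_index], 3))
```

The combined spectrum keeps the quasar's broad line and gains a narrow line at a wavelength that does not match the quasar's redshift, which is the kind of signature the network is trained to spot.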

Figure 2: A basic representation of an input and what the CNN is expected to output. (Credit: E. McArthur.)

 

We then fed approximately 3,000 example lenses into the neural network, along with 30,000 quasar spectra with no background galaxy, to teach the network which features appear in a lensed spectrum and which appear in a “normal” quasar spectrum. Training prepares the CNN to apply what it has learned to a sample it has not seen, the blind sample. Unlike in the training sample, we do not know which quasars in the blind sample have background galaxies. Out of the 30,000 quasars in this new sample, the CNN classified 200 spectra as possible lens candidates.
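
The numbers above can be put in perspective with a little arithmetic. The sketch below takes the approximate counts quoted in the text and computes the class balance of the training set and the fraction of the blind sample that was flagged. The inverse-frequency class weights are shown only as one common way to handle an imbalanced training set; whether any such weighting was used in this work is not stated, so treat that part as an assumption.

```python
# Approximate counts quoted in the text.
n_lens = 3_000        # mock lens spectra in the training set
n_quasar = 30_000     # ordinary quasar spectra in the training set
n_blind = 30_000      # quasars in the blind sample
n_candidates = 200    # spectra flagged as possible lenses

n_train = n_lens + n_quasar
lens_fraction = n_lens / n_train  # share of training spectra that are lenses

# Inverse-frequency class weights (an assumption, not the author's stated
# method): each class contributes equally to the loss overall.
n_classes = 2
class_weights = {
    "lens": n_train / (n_classes * n_lens),
    "quasar": n_train / (n_classes * n_quasar),
}

flagged_fraction = n_candidates / n_blind

print(f"lenses make up {lens_fraction:.1%} of the training set")
print(f"class weights: {class_weights}")
print(f"flagged {flagged_fraction:.2%} of the blind sample")
```

In other words, the training set contains about one mock lens for every ten ordinary quasars, and the network flagged well under 1% of the blind sample for follow-up.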

Raising the Bar

Next, we’ll analyze the spectra of our candidate lenses either by eye or with other machine-learning techniques. We expect a fair number of false positives and one or two true positives within our candidate sample, but at this early stage of looking for quasar-lens needles in the galactic haystack, every quasar strong lens we find enables us to measure the mass of a host galaxy with enormous precision and increase our understanding of this dynamic stage in the galaxy’s evolution.

Future surveys—notably the Legacy Survey of Space and Time from the Rubin Observatory—will produce significant amounts of data. Demonstrating that machine learning can be used to find quasar lenses is important to reduce the amount of human effort and search for more lenses efficiently in both imaging and spectroscopic data.

 

Related reading

The connection between supermassive black holes and dark matter halos

How to precisely weigh a quasar's host galaxy