Dark matter subhalo disruption: Insights from simulations and machine learning

Ethan O. Nadler. (Image courtesy of E.O. Nadler.)

by Ethan O. Nadler

Near the turn of the century, two seminal papers pointed out a striking discrepancy between the number of dark matter subhalos around Milky Way-like systems in dark-matter-only (DMO) simulations and the number of observed dwarf satellite galaxies around the Milky Way (MW). Historically, this discrepancy (shown graphically in the figure below) led to the notion of the "missing satellites problem" (MSP)—not the issue of where multiple Mars-bound satellites have disappeared to, but rather the idea that we observe significantly fewer dwarf satellite galaxies (by a factor of about 10!) in the Local Group than predicted by the standard cold dark matter (CDM) cosmological model.

(This remains true despite the discovery in recent years of several additional MW dwarf satellite galaxies, some of them found by KIPAC researchers working on the Dark Energy Survey.)

The number of subhalos as a function of their maximum internal circular velocity (more massive subhalos tend to have larger circular velocities), for a simulated galaxy cluster (solid line) and a simulated Milky Way-mass galaxy (dashed line), compared to observations of the Virgo cluster (open circles) and the "classical" MW satellite galaxy population (filled circles). There are now more than 50 observed Milky Way satellites, including the 11 classical systems shown here, but there is still a large deficit compared to the number of subhalos in dark-matter-only simulations. (Credit: Moore et al., 1999.)


However, as this recent review article points out, there is a sense of prediction implicit in the MSP: if every dark matter subhalo hosts a galaxy (as we naively expect from the CDM model, in which galaxies form inside of dark matter halos), and we properly account for the incompleteness of Local Group satellite searches (which results from limited sky coverage at the sensitivity necessary to detect faint dwarf galaxies), then the apparent lack of MW satellites could be significant. In particular, this mismatch might suggest that CDM cosmology accurately predicts the distribution of large-scale structure in the Universe while its predictions break down on smaller scales. Alternative dark matter models including warm dark matter and ultra-light scalar field dark matter (also sometimes called "fuzzy" dark matter) suppress subhalo abundances on small scales, so it's possible that the MSP is telling us something important about the fundamental nature of dark matter.

More Than Dark Matter at Work: The Effects of Normal Matter

The early papers on the MSP were cautious about interpreting the discrepancy, in large part because subhalo populations in DMO simulations are not perfect indicators of the resulting satellite populations. In particular, while DMO simulations provide a rough estimate of the abundance and properties of satellites in MW-like systems, baryonic physics (the physics of regular matter such as protons, neutrons, electrons, etc.) can dramatically alter these predictions.

For example, cosmic reionization suppresses star formation by limiting gas accretion and slowing gas cooling rates within subhalos, and these effects can partially or completely inhibit galaxy formation in low-mass systems. Another mechanism that suppresses the abundance of satellite galaxies is the dynamical influence of a central galactic disk. Simulations have shown that the tidal forces exerted on subhalos that pass near a disk tend to strip away significant amounts of dark matter. Tidal forces often completely disrupt subhalos, unbinding their particles and mixing them into the larger host halo, and this process presumably disrupts the corresponding satellite galaxies. (However, in general, it is necessary to account for the possibility that some "orphan" galaxies survive this process.) The number of surviving subhalos predicted by DMO simulations is therefore too large due to subhalo disruption, and also because of baryonic effects internal to subhalos. For example, supernova feedback associated with repeated episodes of bursty star formation can drive large amounts of gas and dark matter out of the center of a subhalo, which softens its central density cusp and makes it more susceptible to tidal disruption.

There has recently been renewed interest in these topics, in large part because high-resolution hydrodynamic "zoom-in" simulations of MW-mass host halos can now simultaneously resolve the formation of a central galactic disk and the effects of baryonic physics internal to subhalos. Two recent papers by KIPAC collaborators Andrew Wetzel and Shea Garrison-Kimmel studied zoom-in simulations of MW-mass systems from the Feedback In Realistic Environments (FIRE) simulation suite, which includes sub-grid models (i.e., approximate schemes to capture the small-scale baryonic physics that can’t be resolved numerically) for star formation, stellar feedback, and reionization. These authors showed that the satellite galaxy populations in two zoom-in simulations are consistent with observed satellite luminosity functions for the MW and the Andromeda Galaxy (M31), largely due to enhanced subhalo disruption caused by a central galactic disk.

The number of satellite galaxies around a simulated MW-mass host halo as a function of the satellite stellar mass (left) and satellite stellar velocity dispersion (right), which is a proxy for subhalo circular velocity. Results from a FIRE hydrodynamic zoom-in simulation (blue lines) are compared to the number of observed satellites around the MW (dashed line) and M31 (dotted line). The right-hand panel shows the number of subhalos from a corresponding DMO simulation for comparison. The predictions from this simulation are fairly consistent with the number of observed satellites around each Local Group galaxy, but studying a large, diverse sample of these simulations is difficult due to the computational costs of simulating baryonic physics. (Credit: Wetzel et al., 2016.)


The Best of Both (Simulated) Worlds

While these results suggest that baryonic physics might be able to resolve the small-scale problems associated with CDM in principle, it is difficult to make robust predictions based on these simulations because there are relatively few of them and because they rely on sub-grid baryonic physics prescriptions. In other words, simulating baryonic physics in a cosmological setting is an extremely computationally intensive task, which limits both the number of realistic simulations that we can study, as well as the reliability of any particular simulation due to specific sub-grid physics choices, which are often a point of contention among different research groups.

In a recent paper (with KIPAC alum Yao-Yuan Mao, KIPAC professor Risa Wechsler, Shea Garrison-Kimmel, and Andrew Wetzel), we therefore set out to model subhalo disruption due to baryonic effects in the hydrodynamic FIRE simulations, the idea being that this model can predict surviving subhalo populations in independent DMO simulations. Our approach avoids the slow and computationally expensive step of re-simulating a host halo with hydrodynamic physics by capturing the drivers of subhalo disruption in the FIRE simulations.

Perhaps unsurprisingly, modeling subhalo disruption due to an array of complicated baryonic effects is not straightforward. We found, in agreement with Shea’s previous paper, that the main factor that determines whether a subhalo will be disrupted is its orbit: subhalos that pass close to the central disk experience strong tidal forces and are more prone to disruption. However, we found that the likelihood of disruption also depends on the time at which the first pericentric passage (a subhalo's closest approach to the center of the host halo) occurs: subhalos with early first pericentric passages are relatively more likely to be disrupted, mainly because they continue to orbit, and get tidally stripped, for longer than subhalos with late pericentric passages. In addition, some internal subhalo features, such as mass and maximum circular velocity at the time of accretion onto the host halo, correlate with subhalo disruption probability.

Joint and marginal distributions of pericentric distance (d_peri), i.e., a subhalo’s distance of closest approach to the center of the host halo, and the time at which this pericentric passage occurs (a_peri; smaller values of a_peri correspond to earlier times in the Universe), for destroyed (red) and surviving (blue) subhalos from two hydrodynamic simulations of MW-mass host halos. Subhalos that pass closer to the center of the host halo at earlier times are more likely to be disrupted. (Credit: Nadler et al., 2017.)
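To make the two orbital quantities concrete, here is a minimal Python sketch of how one might extract them from a subhalo's orbit. It assumes the orbit is stored as a list of snapshots (a, x, y, z), with a the cosmological scale factor (which the a_peri notation suggests, though that is an assumption here) and positions in kiloparsecs relative to the host center. The function name first_pericenter and the sample numbers are hypothetical, and the definition used in the paper may differ in detail.

```python
import math

def first_pericenter(track):
    """Return (d_peri, a_peri) at the first local minimum of host-centric distance.

    track: list of (a, x, y, z) snapshots ordered by increasing scale factor a,
    with positions in kpc measured from the center of the host halo.
    Returns None if the subhalo has not yet passed pericenter.
    """
    dist = [(a, math.sqrt(x * x + y * y + z * z)) for a, x, y, z in track]
    for i in range(1, len(dist) - 1):
        # A pericentric passage: closer than the previous snapshot and
        # no farther than the next one.
        if dist[i][1] < dist[i - 1][1] and dist[i][1] <= dist[i + 1][1]:
            return dist[i][1], dist[i][0]
    return None

# Hypothetical orbit: falls in, passes pericenter, recedes, then returns.
orbit = [
    (0.4, 200.0, 0.0, 0.0),
    (0.5, 120.0, 0.0, 0.0),
    (0.6, 30.0, 40.0, 0.0),   # distance 50 kpc: first pericenter
    (0.7, 90.0, 0.0, 0.0),
    (0.8, 20.0, 0.0, 0.0),    # a later, closer passage
    (0.9, 60.0, 0.0, 0.0),
]
d_peri, a_peri = first_pericenter(orbit)
```

In practice the snapshot spacing limits how well the true pericenter is resolved, so a real pipeline would likely interpolate the orbit between snapshots.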


Insights from Machine Learning

Rather than explicitly modeling how subhalo disruption depends on a myriad of orbital and internal subhalo features, we used a supervised machine learning model called random forest classification to learn the relationship between subhalo features and disruption likelihood. By training our algorithm on the FIRE simulations, we taught it to identify subhalos that would be disrupted in FIRE-like hydrodynamic simulations. The trained classifier can then immediately predict surviving subhalo populations from DMO simulations. (Our approach assumes that, to first order, the orbits and internal properties of surviving subhalos are not strongly affected by baryonic physics. We learned over time that this is a good approximation for these simulations!)
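For readers who have not met random forests before, the idea can be sketched in a few dozen lines of plain Python. This is a toy illustration rather than the code used in our paper: the training set is a made-up grid in (d_peri, a_peri) with labels from an invented rule, and all function names are hypothetical. Each tree is grown on a bootstrap resample of the training data and considers a random subset of features at every split; the forest's disruption probability is the average over trees.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def build_tree(rows, labels, rng, depth=0, max_depth=3):
    """Grow one tree; a leaf is a float (fraction disrupted), a node is a tuple."""
    if depth >= max_depth or len(set(labels)) == 1:
        return sum(labels) / len(labels)
    n_features = len(rows[0])
    # Random feature subset at each split (about sqrt of the feature count).
    candidates = rng.sample(range(n_features), max(1, int(n_features ** 0.5)))
    best = None
    for f in candidates:
        for threshold in sorted({r[f] for r in rows}):
            left = [i for i, r in enumerate(rows) if r[f] <= threshold]
            right = [i for i, r in enumerate(rows) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, threshold, left, right)
    if best is None:  # no valid split on the chosen features
        return sum(labels) / len(labels)
    _, f, threshold, left, right = best
    return (f, threshold,
            build_tree([rows[i] for i in left], [labels[i] for i in left],
                       rng, depth + 1, max_depth),
            build_tree([rows[i] for i in right], [labels[i] for i in right],
                       rng, depth + 1, max_depth))

def tree_predict(node, row):
    """Walk down the tree until a leaf (a float) is reached."""
    while isinstance(node, tuple):
        f, threshold, left, right = node
        node = left if row[f] <= threshold else right
    return node

def train_forest(rows, labels, n_trees=50, seed=0):
    """Train each tree on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng))
    return forest

def disruption_probability(forest, row):
    """Average the per-tree predictions."""
    return sum(tree_predict(tree, row) for tree in forest) / len(forest)

# Toy training set (invented rule, NOT simulation data): subhalos with
# small d_peri (kpc) and early a_peri are labeled as disrupted (1).
rows, labels = [], []
for d in range(5, 200, 10):
    for a in [k / 10 for k in range(2, 11)]:
        rows.append((float(d), a))
        labels.append(1 if d < 20 + 60 * (1 - a) else 0)

forest = train_forest(rows, labels)
p_close_early = disruption_probability(forest, (10.0, 0.3))
p_far_late = disruption_probability(forest, (150.0, 0.9))
```

With this toy rule, the forest should assign a much higher disruption probability to the close, early-pericenter subhalo than to the distant, late one. A production analysis would more likely rely on an established library implementation and on additional features such as mass and maximum circular velocity at accretion.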

An example of what our classifier had to reconcile: Visualizations of dark matter in a dark-matter-only zoom-in simulation of a Milky Way-mass host halo (left) and in a hydrodynamic simulation of the same system from the FIRE project (right). The galactic disk in the baryonic simulation disrupts nearly all of the subhalos that pass close to the center of the host and reduces the overall number of subhalos by a factor of two. The image shows the projected dark matter density in a cube of side length 100 kiloparsecs (for comparison, our Solar System is about 8 kiloparsecs from the center of the Milky Way). (Credit: Garrison-Kimmel et al., 2017.)


To test whether our classifier learned anything useful, we applied it to the DMO counterparts of the FIRE simulations that we used to train the algorithm, and we found that the predicted subhalo populations are in good agreement with the full hydrodynamic results. Interestingly, by comparing our results to simulations that include the effects of a central disk without including baryonic effects internal to subhalos, we found that our classifier outperforms these disk simulations in many regimes, meaning that it picks up on both disk-driven subhalo disruption and internal effects such as stellar feedback.

The number of subhalos as a function of maximum circular velocity for two different host halos (m12i and m12f, shown in the left and right panels respectively). The red lines show the prediction from the hydrodynamic FIRE simulations, the DISK lines show results from a simulation that includes the central disk without including additional baryonic physics, and the blue lines and bands show the most probable prediction and the uncertainty from our random forest classifier. Our model generally matches the FIRE results more closely than the DISK simulations, suggesting that it captures subhalo disruption due to both the central galactic disk and baryonic physics internal to subhalos. (Credit: Nadler et al., 2017.)
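The comparison in this figure boils down to counting subhalos above a given maximum circular velocity. A minimal Python sketch, under the assumption that the classifier returns a survival probability for each DMO subhalo, is to weight each subhalo by that probability when counting. The function name and all numbers below are hypothetical, chosen only to illustrate the bookkeeping.

```python
def cumulative_velocity_function(vmax, weights=None, thresholds=(10, 20, 30, 40)):
    """Return N(> V) for each threshold V, in km/s.

    vmax: maximum circular velocity of each subhalo.
    weights: per-subhalo survival probability; None means every subhalo
    counts fully, as in an unmodified DMO simulation.
    """
    if weights is None:
        weights = [1.0] * len(vmax)
    return [sum(w for v, w in zip(vmax, weights) if v > t) for t in thresholds]

# Hypothetical DMO subhalos and classifier survival probabilities.
vmax = [12.0, 25.0, 18.0, 45.0, 33.0]
survival = [0.5, 0.75, 0.25, 1.0, 0.5]

n_dmo = cumulative_velocity_function(vmax)
n_predicted = cumulative_velocity_function(vmax, survival)
```

Summing probabilities gives the expected number of survivors; drawing many random realizations from the same probabilities would additionally give a spread around that expectation, similar in spirit to the bands in the figure.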


With our trained classifier in place, we then applied it to a suite of independent DMO zoom-in simulations to study the impact of baryonic physics on subhalo populations and on their corresponding satellite galaxy populations for a range of MW-mass host halos. In particular, we used abundance matching (which relates a subhalo population to its corresponding galaxy population) to map our prediction for the amount of subhalo disruption to a change in the number of surviving satellite galaxies above a certain luminosity threshold. This prediction can be used to interpret observations of satellite galaxies around MW-like systems from surveys like the Satellites Around Galactic Analogs Survey (SAGA).
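The simplest version of abundance matching is a rank-order assignment: the subhalo with the largest peak circular velocity hosts the brightest galaxy, the next largest hosts the next brightest, and so on. The Python sketch below uses that zero-scatter version, which is a simplification of what is done in practice, together with hypothetical survival probabilities to estimate how many satellites remain above a brightness limit. All names and numbers are illustrative assumptions.

```python
def rank_order_match(vpeak, magnitudes):
    """Assign the brightest magnitude to the subhalo with the largest vpeak.

    More negative absolute magnitudes correspond to brighter galaxies.
    Returns a list of magnitudes aligned with the input subhalo order.
    """
    order = sorted(range(len(vpeak)), key=lambda i: vpeak[i], reverse=True)
    brightest_first = sorted(magnitudes)
    assigned = [None] * len(vpeak)
    for rank, i in enumerate(order):
        assigned[i] = brightest_first[rank]
    return assigned

def expected_count_brighter_than(assigned_mags, survival_prob, mag_limit):
    """Expected number of surviving satellites brighter than mag_limit."""
    return sum(p for m, p in zip(assigned_mags, survival_prob) if m < mag_limit)

# Hypothetical subhalos (peak circular velocity in km/s) and a galaxy
# sample with the same number of objects.
vpeak = [20.0, 80.0, 45.0, 30.0]
magnitudes = [-10.0, -8.0, -16.0, -12.0]
survival = [0.9, 0.5, 0.25, 1.0]   # assumed classifier output

assigned = rank_order_match(vpeak, magnitudes)
n_before = expected_count_brighter_than(assigned, [1.0] * len(vpeak), -9.0)
n_after = expected_count_brighter_than(assigned, survival, -9.0)
```

Here three satellites are brighter than the limit before disruption is applied, and the expected number drops to 1.75 afterwards. Realistic abundance matching schemes typically match cumulative number densities and include scatter in the relation.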

The number of satellite galaxies around MW-mass host halos as a function of absolute magnitude (more negative values correspond to brighter galaxies) from a suite of 45 DMO simulations, before (black) and after (blue) applying our random forest classifier. The dashed vertical line indicates the completeness limit of the SAGA survey. (Credit: Nadler et al., 2017.)


Connecting this model for subhalo disruption to the missing satellites problem is challenging, since baryonic physics is only one of many effects that can alleviate (or exacerbate!) the MSP discrepancy. Other effects include the details of the abundance matching scheme, the relationship between halo sizes and galaxy sizes (size impacts surface brightness, which strongly influences whether satellites can be detected), numerical resolution and artificial subhalo disruption issues associated with DMO simulations, and the fact that the MW seems to be an outlier since it hosts both the Large and Small Magellanic Clouds, which correspond to particularly massive and relatively rare subhalos. However, the random forest model gives researchers a flexible and efficient means of accounting for the baryonic physics piece of this puzzle while leveraging the statistical power associated with large numbers of DMO simulations.

Near-field cosmology is inherently difficult because we only have one Local Group to study; sadly (or perhaps excitingly!), the Milky Way might be a statistical outlier. This necessitates the use of well-motivated, realistic simulations, which have already given researchers insights into the mechanisms that likely shaped the Milky Way’s subhalo population. Since our generated subhalo populations are consistent with state-of-the-art hydrodynamic simulations, we are hopeful that the machine learning technique described above will help the community dive further into the mystery of the missing dwarf satellites.

Related Reading

The Devil is in the Details: What Galaxy Dynamics Can Tell Us About Dark Matter

Referenced Papers

Where are the missing galactic satellites? (Klypin et al., 1999)

Dark Matter Substructure in Galactic Halos (Moore et al., 1999)

Notes on the Missing Satellites Problem (Bullock, 2010)

Reconciling dwarf galaxies with LCDM cosmology: Simulating a realistic population of satellites around a Milky Way-mass galaxy (Wetzel et al., 2016)

Not so lumpy after all: modeling the depletion of dark matter subhalos by Milky Way-like galaxies (Garrison-Kimmel et al., 2017)

Modeling the Impact of Baryons on Subhalo Populations with Machine Learning (Nadler et al., 2017)

The SAGA Survey. I. Satellite Galaxy Populations around Eight Milky Way Analogs (Geha et al., 2017)