Black holes regulate cool gas accretion in massive galaxies

Cosmology

We adopted a Chabrier initial mass function (IMF)³¹ to estimate star formation rate and assumed cosmological parameters of H₀ = 70 km s⁻¹ Mpc⁻¹, Ω_M = 0.3, and Ω_Λ = 0.7.

Sample selection

The BH sample

The sample for galaxies with directly measured BH masses is primarily from ref. ¹¹, which includes 91 central galaxies collected from refs. ^19,20,21. We excluded 18 sources with BH masses measured with reverberation mapping and kept only those measured with dynamical methods. We then added another 63 galaxies with measured BH masses from recent literature, which were matched with the group catalogue³² of nearby galaxies to select only central galaxies. We obtained the HI flux densities and masses of this sample by crossmatching with the nearby galaxy database, HyperLeda³³. Our final sample includes 69 central galaxies with 41 from ref. ¹¹ and the remaining from the compilation of recent literature. In Extended Data Table 3, we list the basic properties of our BH sample.

The galaxy sample

The sample for galaxies with HI measurements and indirect BH mass measurements are from the extended GALEX Arecibo SDSS Survey (xGASS; ref. ³⁴) and HI-MaNGA programme^35,36, which include HI observations towards a representative sample of about 1,200 and 6,000 galaxies with 10⁹M_⊙ < M_⋆ < 10^11.5M_⊙, respectively. The depth of the survey also allows for stringent constraints on the upper limits for the HI non-detections, enabling a comprehensive assessment of f_HI for the entire sample. We limited the redshift z < 0.035 to ensure high HI-detection rates even at the highest stellar masses and BH masses. We selected only group central galaxies, which include at least one satellite galaxy in their groups, based on the crossmatch with the group catalogue^37,38,39. Isolated central galaxies lacking any satellites in their groups are discarded because they may have probably suffered from additional environmental effects⁴⁰. We derived the BH masses for the xGASS and HI-MaNGA samples with their velocity dispersion²¹ from SDSS DR17³⁷ (σ_SDSS, and we require σ_SDSS ≥ 70 km s⁻¹):

$$\log \left(\frac{{M}_{{\rm{BH}}}}{{M}_{\odot }}\right)=(8.32\pm 0.04)+(5.35\pm 0.23)\log \left(\frac{{\sigma }_{{\rm{SDSS}}}}{200\,{\rm{km}}\,{{\rm{s}}}^{-1}}\right).$$

(1)

Physical parameters of the BH and galaxy sample

Stellar masses

The stellar masses for the galaxy sample are taken from the MPA-JHU catalogue^41,42, which are derived from SED fitting based on SDSS data. For the BH sample, because most of them lack the same photometric coverage as the galaxy sample, we derive their stellar masses from their K-band luminosity and velocity dispersion-dependent K-band mass-to-light ratio following ref. ²¹:

$${M}_{\star }/{L}_{{\rm{K}}}=0.1{\sigma }_{{\rm{e}}}^{0.45}.$$

(2)

As an accurate determination of σ_e is not available for all galaxies, we derived σ_e for the full BH sample from the tight correlation in ref. ²¹:

$$\begin{array}{l}\log \left(\frac{{\sigma }_{{\rm{e}}}}{{\rm{km}}}\,{{\rm{s}}}^{-1}\right)=(2.11\pm 0.01)+(0.71\pm 0.03)\log \left(\frac{{L}_{{\rm{K}}}}{1{0}^{11}{L}_{\odot }}\right)\\ \,\,\,\,\,\,\,+(-0.72\pm 0.05)\log \left(\frac{{R}_{{\rm{e}}}}{5\,{\rm{kpc}}}\right).\end{array}$$

(3)

To explore whether there are systematic differences between the two methods, we compare the stellar masses of the galaxy sample taken from the MPA-JHU catalogue and those derived from equation (2). A median mass difference 0.32 dex is found between the two methods (Extended Data Fig. 6), which may be attributed to the tilt from the fundamental plane beyond the mass-to-light ratio, for example, the dark matter component in the effective radius. We corrected these systematic mass differences for the BH sample to match that of the galaxy sample.

HI fraction and upper limits

The HI-detection limit depends not only on the sensitivity but also on the width of the HI line. To obtain more realistic upper limits, we first derived the expected HI line width for each HI non-detection. The width of the HI line indicates the circular velocity of the host galaxy, which should be proportional to the stellar masses. We explored this using the HI detections from the xGASS sample. Extended Data Fig. 1 shows the relation between M_⋆ and the observed line width, as well as M_⋆ and inclination-corrected line width. It indicates that the inclination-corrected line width is tightly correlated with M_⋆, which is further used to derive the expected line width for the HI non-detections. Combining the sensitivity of the HI observations and the expected line width, we derived the upper limits for all the HI non-detections in our BH and galaxy samples.

Morphology

For BH sample, the morphology indicator T is obtained from the HyperLEDA database³³. It can be a non-integer because for most objects the final T is averaged over various estimates available in the literature. For the galaxy sample, we classified them in to the early types and late types based on the Sérsic index (from NASA-Sloan Atlas catalogue; NSA: Blanton M.; http://www.nsatlas.org) larger or smaller than 2.

Star formation rates

The specific star formation rates (SSFR) of the galaxy sample are from the MPA-JHU catalogue based on ref. ⁴². The SSFR for the BH sample is taken from the original reference.

Bulge masses

The bulge information is from refs. ^43,44 for the BH sample and galaxy sample, respectively. More specifically, we calculate the bulge mass for the galaxy sample using r-band B/T.

Stellar mass surface density

We calculated the K-band effective radius for both the BH and the galaxy sample according to ref. ²¹: log R_e = 1.16 log R_{K_R_EFF} + 0.23 log q_{K_BA}, where R_e is the corrected apparent effective size, R_{K_R_EFF} and q_{K_BA} are K-band apparent effective radius and K-band axis ratio from 2MASS. After converting the apparent sizes to the physical sizes, the stellar mass surface density was derived as ${\varSigma }_{{\rm{star}}}={M}_{\star }/(2{\rm{\pi }}{R}_{{\rm{e}}}^{2})$.

H₂ masses

We collected H₂ masses from xCOLD GASS survey¹⁸ and ref. ⁴⁵ for xGASS and MaNGA galaxies, respectively. We acknowledge that at least in the nearby Universe, the molecular-to-atomic gas mass ratio increases only weakly with stellar masses and remains relatively low over a wide stellar mass range, with $R\equiv {M}_{{{\rm{H}}}_{2}}/{M}_{{\rm{HI}}} \sim 10-20 \% $ at 10⁹M_⊙ < M_⋆ < 10^11.5M_⊙. We calculate the total gas fractions as ${\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}\,=$$({M}_{{\rm{HI}}}+{M}_{{{\rm{H}}}_{2}})/{M}_{\star }$. For central galaxies (isolated centrals plus group centrals), we compare the M_BH–μ_HI and M_BH–${\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}$ relation in Extended Data Fig. 4. The M_BH–${\mu }_{{\rm{HI}}+{{\rm{H}}}_{2}}$ relation exhibits a stronger correlation with the smaller scatter than the M_BH–μ_HI relation. We acknowledge that, based on molecular hydrogen gas content traced through dust extinction, previous studies show an M_BH–${f}_{{{\rm{H}}}_{2}}$ correlation¹². Future studies with more direct measurements of molecular hydrogen gas for large samples will be needed to examine in detail whether M_BH also plays a fundamental part in regulating the molecular gas content in galaxies.

Quiescent fraction

To estimate the quiescent fraction at different M_BH, we selected galaxies from the MPA-JHU catalogue of SDSS galaxies with the same criteria as the galaxy sample, except that we limited the velocity dispersion to greater than 30 km s⁻¹ to cover broader M_BH and we made no constraints on the HI detection. We classified the sample galaxies into star-forming and quiescent ones, separated at SSFR = −11. In each M_BH bin, the quiescent fraction was calculated as the ratio between the number of quiescent galaxies and that of all galaxies. The result is shown in Extended Data Fig. 5, which is consistent with that of previous work^29,46.

Linear least squares approximation

We implemented linear regression for the BH sample and the galaxy sample using Python package LTS_LINEFIT introduced in ref. ⁴⁷, which is insensitive to outliers and can give the intrinsic scatter around the linear relation with corresponding errors of the fitted parameters.

Linear fitting including upper limits

To incorporate both detections and upper limits in the galaxy sample, we applied the Kaplan–Meier non-parametric estimator to derive the cumulative distribution function at different M_BH bins (with Python package Reliability⁴⁸), and performed 10,000 random draws from the cumulative distribution function at each bin to fit the relation between f_gas and M_BH. The linear relation and its corresponding errors are taken as the best fitting and standard deviations of these fittings (Extended Data Table 2). The non-detection rate of HI is relatively low across most of the M_BH range and becomes significant only for galaxies with the most massive BHs (reaching about 50% at M_BH > 10⁸M_⊙).

Partial least square regression

To derive the most significant physical parameters in determining μ_HI statistically, we used the Python package Scikit-learn⁴⁹ with partial least squares (PLS) Regression function, which uses a non-linear iterative partial least squares (NIPALS)⁵⁰ algorithm. The PLS algorithm generalizes a few latent variables (or principal components) that summarize the variance of independent variables, which is used to find the fundamental relation between a set of independent and dependent variables. It has advantages in regression among highly correlated predictor variables. It calculates the linear combinations of the original predictor datasets (latent variables) and the response datasets with maximal covariance, then fits the regression between the projected datasets and returns the model:

where X and Y are predictor and response datasets, B is the matrix of regression coefficients and F is the intercept matrix.

We constructed the X and Y matrices as the set of M_BH, M_⋆, Σ_star, M_bulge and the set of μ_HI. For the BH and galaxy samples, this returns the sample size of 45 and 189, respectively. The optimal number of latent variables (linear combinations of predictor variables) in PLS Regression is determined by the minimum of mean squared error from cross-validation (using function cross_val_predict in Scikit-learn) at each number of components. We find that the optimal number of latent variables for both the BH and the galaxy sample converges to one. Further increasing the number of latent variables yields only a few percentage changes in the mean squared errors, and M_BH remains the most significant predictor parameter. Following appendix B in ref. ⁵¹, the variance contribution from different parameters to μ_HI is decomposed as

$${\rm{Var}}(Y)=\mathop{\sum }\limits_{i=1}^{4}{\rm{Var}}({X}_{i}{B}_{i})+{\rm{Var}}(F),$$

(5)

where Var is a measure of the spread of a distribution. The portion of each parameter variance is shown in the last column of the Extended Data Table 3, which shows that M_BH dominates the variance. Further increasing the number of latent variables results only in a few percentage changes in the mean squared errors, and M_BH remains the most significant predictor parameter.

Source link