Grazing enhances carbon cycling, but reduces methane emission in the Siberian Pleistocene Park tundra site

Large herbivore grazing has been shown to substantially alter tundra soil and vegetation properties as well as carbon fluxes, yet observational evidence to quantify the impact of herbivore introduction into Arctic permafrost ecosystems remains sparse. In this study we investigated growing season CO2 and CH4 fluxes with flux chambers on a former wet tussock tundra inside Pleistocene Park, a landscape experiment in Northeast Siberia with a 22 year history of grazing. Reference data for an undisturbed system were collected on a nearby ungrazed tussock tundra. Linked to a reduction in soil moisture, topsoil 5 temperatures at the grazed site reacted one order of magnitude faster to changes in air temperatures compared to the ungrazed site and were significantly higher, while the difference strongly decreased with depth. Overall, both GPP (gross primary productivity, i.e. CO2 uptake by photosynthesis) and Reco (ecosystem respiration, i.e. CO2 release from the ecosystem) were significantly higher at the grazed site with notable variations across plots at each site. The increases in CO2 component fluxes largely compensated each other, leaving NEE (net ecosystem exchange) similar across grazed and ungrazed sites for the 10 observation period. Soil moisture and CH4 fluxes at the grazed site decreased over the observation period, while in contrast the constantly water-logged soils at the ungrazed site kept CH4 fluxes at significantly higher levels. Our results indicate that grazing of large herbivores promotes topsoil warming and drying, effectively accelerating CO2 turnover while decreasing methane emissions. Our experiment did not include autumn and winter fluxes, and thus no inferences can be made for the annual NEE and CH4 budgets at tundra ecosystems. 15

ble of stationarities (slopes) were calculated for varying start and end times within this period using a bootstrapping approach.
The final slope used for flux computations was identified as the frequency distribution's median. Implausible or disturbed signals were manually flagged, and excluded from further analysis. Such cases included for example unstable signals without a distinctly discernible, steady slope, which are not clearly interpretable, or signals obviously disturbed by leakage.
The median slope ( a) of greenhouse gas concentrations change over time were transformed to a flux using the following 155 formula: V ch and A ch are the volume and surface area of the chamber, respectively. R is the ideal gas constant (8.3144J/molkg), While fluxes of R eco and N EE could be calculated based on dark and light chamber measurements, respectively, the photosynthetic uptake portion of the flux (GP P ) was calculated as the difference between measured N EE and the mean of measured R eco for one measurement. The standard error (RMSE) of each flux measurement was calculated using all bootstrapped slopes, distinguishing between R eco and N EE measurements. The slope error for GP P was taken as the summed errors of R eco and N EE measurements. Error values are given in Tab. A1 in the Appendix. Calculations were conducted using 165 the software package R-studio.
To analyze the implications of the grazing disturbance on net CO 2 and CH 4 exchange across the sites, the time series of flux estimates for each plot was interpolated across the entire measurement period of 17 days at a resolution of 10 minutes.
Both R eco and CH 4 fluxes were interpolated using empirical plot-specific linear models linking flux rates to environmental parameters (T S , T air , and SM ). Parameters minimizing the fit uncertainty were not uniform across plots even at a site (see 170 Appendix A for an exact description of this approach, including the derivation of formulas, and chapter 4.2. for evaluation).
The following formulas were derived to interpolate R eco and CH 4 fluxes: F CH4 (GR) = a 0 T Soil,25cm + b 0 + a 1 SM 15cm + b 1 In each formula, a 0 is the slope of the first applied model, a 1 for the second. b 0 and b 1 are corresponding intercepts. Total 180 errors for all fluxes were derived considering the standard error from the final model compared to observed values (linear regression), further considering the standard error from the bootstrapping approach used to transfer measured concentration slopes into fluxes and the standard error from modeled T S (for R eco and CH 4 fluxes). A detailed error calculation is shown in Appendix A3.
GPP was modelled as a function of PAR, using a rectangular hyperbolic function (Runkle et al., 2013).
The fit parameters α and P max represent, respectively, the initial canopy quantum efficiency (the initial slope of the GPP-PAR curve at PAR= 0) and the maximum canopy photosynthetic potential, which is the hypothetical maximum of GPP at infinite 190 PAR. Both α and P max are assumed to have positive values, necessitating the negative sign on the equation's right-hand side to allow GPP to fit the NEE sign convention. Hereby, positive fluxes imply carbon losses from the ecosystem into the atmosphere.
This model contains the explicit assumption that GPP is insensitive to light stress or temperature effects (Runkle et al., 2013).
For each site, α and P max were determined by fitting PAR against the GP P -Fluxes from chamber measurements and applying a non-linear-least-squares (nls) optimization. Implausible PAR values from chamber measurements were replaced by PAR 195 derived from the net radiometer measurements. With these parameters, a continuous GP P time series could be modeled for the entire observation period.

Statistics
To visualize and compare carbon fluxes between plots and study sites, daily means for GP P , R eco , N EE and CH 4 -fluxes were calculated. Daily average T S and carbon fluxes were compared using a two-sided Mann-Whitney-test (?) in R. Mean 200 daily albedo and R net were compared using a two-sample t-test. 60.7% -56.7%, changes from the first to the last measurement day, respectively).

Air and soil temperatures
A change in the general weather pattern around mid July 2019 split our observation period roughly into a warm and sunny 220 first half, and a cool and cloudy second half. Regarding the mean air temperatures, during the first period (July 07 -15), daily average T air ranged between 14.3 -26.9°C, while conditions during the second period (July 16 -22) were much cooler (6.9 -10.9°C). Mean air temperatures at the grazed site were observed to be about 1°C warmer compared to the ungrazed site, with similar daily minima, while daily maxima were distinctly higher at GR.
The trends in soil temperatures matched those described for air temperature: while T soil across the observed vertical profile 225 were rising during the first part of the observation period, in the second half they declined (see also Fig. 1). We found topsoil temperatures in 5 cm depth at the grazed site (GR: max 19.6°C, min 9.0°C) to be significantly higher (p < 0.0001) compared to the ungrazed reference (UGR-1: max 15.4°C, min 5.6°C; UGR-2: max 11.6°C, min: 4.3°C) during the whole observation period. The time lag of T soil reacting to T air was one order of magnitude shorter at GR -moving averages of air temperatures explaining T soil in 5 cm integrated the last 4.3h (p < 0.0001) at GR, while reaching back 40.7h(p < 0.0001) at UGR-1 and 230 86.7h (p < 0.0001) at UGR-2. However, this difference vanished for T soil in 15 cm depth (100h, 67.5h, 108.3h at GR, UGR-1 and UGR-2, respectively; p < 0.0001). No comparisons could be made for T soil in 25 cm and 35 cm depth, respectively, because no significant correlation was found due to limited data availability at GR.
In the deeper soil layers (15 cm, 25 cm, 35 cm), the temperatures at the UGR-2 site were consistently lower than observed at all other sites (GR, UGR-1). Comparing observations from GR and UGR-1, during the first week sites showed similar average 235 soil temperatures, while during the second week soils became clearly warmer for the drier grazed site. Notably, T soil in 35 cm and 25 cm at GR were lower compared to UGR-1 during the first measurement days, but due to a steeper warming rate at the grazed site this changed after 5 days into the observations. In the second week, T soil in 35 cm and 25 cm at GR were significantly higher compared to UGR-1, most pronounced in 35 cm depth (p < 0.0001).

Thaw depths 240
Measured thaw depths were greater at all GR sites compared to all UGR sites throughout the observation period: while the values within the ungrazed reference area varied between 31 -36 cm over time and across sites, that range was 39 -58 cm at the grazed site. Importantly, also the temporal dynamics as well as the variability across sites differed strongly between these two study areas. At UGR, the average increase in thaw depths was 0.25 cm per day during the observation period. Even though both observation plots were situated about 50 m apart, conditions were fairly uniform between them, and thaw depths did not 245 differ by more than 2 cm. In contrast, the GR sites showed a higher average thaw depth increase (0.91 cm per day). Differences in measurements between plots reached up to 11 cm, even though sites were only separated by about 3 m, and vegetation and soil conditions seemed similar.

Carbon Fluxes
All interpolation models described in Section 2.4 yielded a significant linear regression fit between observed and calculated 250 values (for details, see Tab. 2).

CO 2 fluxes
As reflected in the strong enhancement in both component fluxes of N EE, i.e. GP P and R eco , the carbon turnover rates in the grazed ecosystem were increased as a response to the warmer and drier conditions in the top soil layers (see Fig. A1).
Regarding photosynthetic uptake of CO 2 , the average GP P was significantly higher at GR compared to UGR (p < 0.0001, 255 see Fig.2). While all three GR plots show higher GP P compared to flux rates at UGR, the average difference between the sites is dominated by differences between the greater fluxes at plot GR-2 and lower fluxes at UGR-2, while differences between GR-1, GR-3 and UGR-1 were not significant.
Across the entire measurement period, ecosystem respiration R eco was distinctly higher at GR compared to UGR. In this case, site differences were more consistent, i.e. differences in flux rates between plots for each site were minor. Greater T soil 260 and decreasing SM were identified as the main controls for the higher R eco at the GR sites, except for GR-3 where R eco did not increase in response to drying. Possible reasons for that are discussed in chapter 4.2.
For N EE, the observed differences in both GP P and R eco canceled out, resulting in no significant changes in N EE as a function of grazing disturbance. Temporal dynamics in N EE largely matched across sites, including a decrease in net uptake rates during the first and warmer week of the observation period, and an increased uptake during the subsequent, cooler days.

265
GR-2 was found to be the strongest carbon sink in the first period, while N EE at UGR-1 was largest during the second.
Overall, all sites were consistent sinks for atmospheric CO 2 during the observation period.

CH 4 fluxes
We observed strong variations in CH 4 fluxes between the plots at GR and UGR sites. While flux rates at both sites were similarly large in the beginning of the experiment, average CH 4 fluxes at GR plots started to decline within the first week of 270 observation, in close correlation with decreasing soil moisture. In contrast, the high water table at UGR facilitated high CH 4 fluxes throughout the observation period, while changes in time were mostly connected to changes in soil temperatures.
At both sites, the variability in CH 4 fluxes across plots for each site was larger compared to the CO 2 fluxes. Particularly for GR, flux estimates even between closely co-located plots changed from virtually zero to rates similar to those found at the ungrazed site. Fluxes at GR-3 were smallest throughout the observation period. During the first days of the experiment, 275 characterized by high soil moisture, CH 4 emissions at GR-1 and UGR-2 were largest, while in the second period, when soils at GR dried, they were largest at UGR-2 and UGR-1.

Assessing the Quality of Flux Chamber Measurements
The application of flux chambers may lead to biases in the ecosystem fluxes themselves (Kutzbach et al., 2007). Installing 280 collars in the ground before starting the experiment, while necessary to prevent leaking of air, can disturb the soil and plant roots. At Pleistocene Park, we had to cut a shallow slit in the ground to be able to tightly fit in collars. In a long term study   Figure 2. Overview on C-fluxes at all chamber plots from July 7th to July 21st. Differing letters indicate significant differences between plots (p < 0.01). Overall (site average), N EE did not differ significantly between GR and UGR, while GP P and Reco were significantly larger at GR. Daily average CH4 fluxes at grazed plots strongly decrease over time, leading to a substantial net reduction in methane emissions at the GR plots, compared to the UGR reference. Table 2. R 2 and p-values for linear regressions between final modeled fluxes and measured fluxes.
Another important aspect concerning representativeness of chamber-based carbon flux measurements is small-scale heterogeneity in the ecosystem, which may exist even within plots that are seemingly homogeneous. This heterogeneity is observable at sub-meter scales, and can be a result of disturbances by soil fauna, pockets of fine root proliferation, moisture gradients, or remnants of decaying organic matter (Davidson et al., 2002). In the Arctic tundra, this small-scale heterogeneity is common 295 (Aalto et al., 2013;Zona et al., 2011), and is e.g. reflected by variations in soil temperature, soil moisture and thaw depths. To account for it in flux uncertainties, more than one chamber is needed to adequately assess the mean and variance of surface- Grazing by large herbivores had a number of obvious impacts on the vegetation in Pleistocene Park. However, one issue that complicates the attribution of the herbivore influence on the vegetation is the year-long human disturbance by the park operations. While this influence is mostly restricted to selected areas and transects across the park, and no major direct impact was apparent for the study plots used within the context of this study, one cannot completely disentangle the impacts of man-made and grazing disturbances. Another issue that needs to be considered when interpreting the presented intercomparison of GR 355 and UGR sites are potential differences in site characteristics that were already present before grazing management in the Pleistocene Park areas started. Lacking pre-treatment flux datasets for both sites, the only reference that is available is the similar appearance of both GR and UGR sites in photographs from the early 2000s, i.e. a time when ecosystem transformation due to grazing was still in an initial stage. Accordingly, while the differences in aboveground ecosystem characteristics described below can clearly be attributed to grazing pressure over the past two decades, the highlighted differences in carbon fluxes 360 may at least partially have been present in the pre-treatment state. While unlikely to affect the qualitative tendencies of higher carbon turnover, and reduced methane emissions, this potential systematic bias clearly needs to be considered when evaluating and interpreting the flux numbers.
Around our chamber site at GR, almost all sedge-tussocks were in a state of decay, or had disappeared almost completely.
In place of them or between their remnants, many single plant tillers (mainly Carex spec. and Calamagrostis langsdorfii) grew.

365
These apparent changes in soil and vegetation properties in Pleistocene park are in accordance with previously reported observations documenting grazing impacts. For example, the transformation from tussocks to grass mats by grazing, accompanied by a strong increase in belowground biomass, was already observed for montane biomes (Hofstede and Rossenaar, 1995;Pucheta et al., 2004). Some sedges found in Arctic environments, such as Carex aquatilis, were shown to benefit from muskox-grazing, since they feature strong root production and the ability to produce dense grass tillers, and therefore more easily recover from 370 grazing (Raillard and Svoboda, 1999;Kitti et al., 2009). This ability gives them an advantage over other species (Tolvanen and Henry, 2000), leading to a more turf-like vegetation structure that gradually replaces the original plant community.
Fertilization of tundra ecosystems through available nutrients from urine and faeces also influences vegetation communities under grazing pressure Svoboda, 1999, 2000). Accelerated urea-nutrient uptake by living plants has been reported for upland tundra (Barthelemy et al., 2018), where graminoids were more efficient in using these resources compared to shrubs.

375
At ungrazed sites such as UGR, the aboveground parts of the plant die off, wither and accumulate on the topsoil, where they rot slowly, leading to a thick organic layer (Kwon et al., 2016). In comparison, at grazed sites such as GR, the plant shoots were regularly removed by the animals resulting in reduced litter accumulation. Linked to the same effect, valleys in the Canadian Arctic which are regularly grazed by muskoxen give the impression of a productive meadow, while ungrazed sites in the same region appear overgrown, and rather nutrient starved (Raillard and Svoboda, 2000). Similar effects were observed by Falk et al.

380
(2015), where excluding muskoxen from an Arctic mire decreased the amount of plant tillers and increased litter and moss cover. In both upland and lowland tundra ecosystems, herbivores, mostly reindeer or muskoxen, have been shown to reduce moss cover, and decrease shrub cover by trampling and browsing, promoting the expansion of graminoids (Kitti et al., 2009; GR is an ecosystem in transition from a tussock tundra topped by a thick organic layer to a grassland with dense tillers. At the same time, the mean water table during the growing season has been lowered. These shifts also change in which soil horizon carbon is primarily accumulated, with stronger accumulation in deeper soil layers, as observed by Hofstede and Rossenaar (1995) and Pucheta et al. (2004). The decay of the remnants of tussocks, as well as the decomposition of large carbon pools in 455 the now drier organic topsoil, contribute to the observed high ecosystem respiration rates. These fluxes, however, will probably decrease in the future, leading to decreasing rates of R eco and therefore increased N EE.
In contrast to the current situation in most circum-Arctic ecosystems, Pleistocene Park exhibits a high diversity of herbivores that are absent from the landscape elsewhere. Previous studies indicated that herbivore diversity leads to a more balanced use of food plants (Chang et al., 2018(Chang et al., , 2020Larter and Nagy, 2001;Sitters et al., 2020;Cromsigt et al., 2018) and correlates 460 with increased carbon uptake and soil carbon storage (Chang et al., 2018(Chang et al., , 2020Sitters et al., 2020). While such shifts yet have to be shown for permafrost ecosystems, first observations in Pleistocene Park hint at positive long-term effects of big herbivore introduction on carbon sequestration. These insights, and also questions emerging from ongoing studies, call for future research concerning the influence of big herbivores on the carbon balance of permafrost ecosystems, with a focus on long-term, year-round monitoring.

Conclusions
In this study, we investigated the impact of long-term grazing disturbance on a previously wet tussock tundra ecosystem underlain by permafrost in the Siberian Arctic using flux-chamber observations over 2.5 weeks during the growing season in summer 2019. Over the past 22 years, introduction of large herds of herbivores in the context of the so-called Pleistocene Park experiment has altered vegetation and soil properties within the affected area, this way initiating an ongoing transformation 470 from a water-logged, overgrown tussock tundra towards a drier ecosystem featuring more turf-like vegetation. We compared the managed ecosystem inside Pleistocene Park to a nearby undisturbed reference site, focusing our study on differences in soil thermal and hydrological properties, and how these influenced the exchange fluxes of carbon between ecosystem and atmosphere.
We measured a significantly lower albedo at the grazed site compared to the undisturbed reference, which can be mostly 475 explained by a lower abundance of shrubs. Soil compaction as a result of trampling, in combination with higher evapotranspiration losses, led to a decrease in soil moisture. Linked to the associated reduction in soil heat capacity, topsoil temperatures in the park were higher and reacted one order of magnitude faster to changes of air temperatures compared to the undisturbed tundra. Due to warmer and drier conditions in the soil, both GP P and R eco during July were significantly higher at the grazed site in the park compared to a undisturbed wet tussock tundra, while differences in N EE are not pronounced. CH 4 emissions,

480
following the shift in hydrological properties, were distinctly lower in the park, but also highly variable between plots.
The effect of grazing on nutrient availability, and associated responses of the vegetation community, remain open questions that must be quantitatively assessed at Pleistocene park. Furthermore, it is essential that carbon fluxes will be investigated over longer timescales, with year-round data coverage. Especially fluxes during autumn and early winter, which account for a Investigating the environmental drivers and their evolution in correlation with R eco measurements revealed that there is no uniform set of drivers across observation sites yielding optimum regression fits across.
For GR-1 and GR-2, in contrast to all other sites, changes in SM apparently exerted a strong influence on R eco (see Fig.A1).
However not in all SM -ranges both high and low T soil or T air were measured. Therefore, R eco fluxes during a moisture interval that shows a wide spectrum of soil temperatures, specific for each site (SM 7.5cm at GR-1: 54% -61%; at GR-2: 57.5% -62%), 495 were chosen in which a exponential regression (T soil in 5 cm ∼ R eco ) was conducted (blue dots, left graphs, Fig. A1). The resulting formula was applied to the T soil in 5 cm of the whole SM range, which yielded a simulated set of values for R eco if soil moisture did not change. This set of values was subtracted from the original flux values, and residuals were utilized for a second, linear regression (residuals ∼ SM 7.5cm , central graphs, Fig. A1) to account for the influence of variable soil moisture on fluxes. Huemmrich et al. (2010) observed a similar correlation between soil water regime, soil temperature and 500 soil moisture as proposed here, substantiating the approach. For R eco estimates at GR-3, UGR-1 and UGR-2, fits were optimal when utilizing air temperatures in combination with an exponential regression, while no significant correlation could be found when trying to explain residuals with SM , thaw depth or other variables (Fig A2).   Figure A1. Depiction of the relationship between TS,5cm and SM and Reco for GR-1 and GR-2. Interpolation models are formed by the equations of depicted regression curves. The graphs on the right show modeled vs. measured fluxes, respectively.

A2 Modeling CH 4 Fluxes
CH 4 fluxes showed a strong correlation to both T S (all sites) and SM at all depths (GR-1, GR-2, GR-3). However, there was a strong co-linearity between T S and SM . Therefore, to reach the best possible fit for interpolating CH 4 fluxes, at GR, while accounting for both drivers, data was split up in two moisture groups (SM 15cm > 60% and SM 15cm < 60%) to apply a pseudo-stepwise regression. Then, for each plot, a linear regression between CH 4 fluxes of the lower moisture group and 510 T S,25cm was computed (Fig.A4 b, e, h). Second, the resulting linear equation was applied to the complete dataset for each plot integrating both moisture groups. The residuals between these calculated values and the measured values was fitted against SM 15cm , applying another linear regression, resulting in a second linear equation. These two resulting equations were used to interpolate CH 4 fluxes for each plot. Since soil moisture did not change at UGR-1 and UGR-2, the linear regression between CH 4 fluxes and T S,15cm yielded the linear equations used to model fluxes at these sites (Fig.A3).  For the final modeled fluxes, which provide the basis to calculate daily average fluxes, a series of error sources was identified (see Tab. A1). First, a bootstrapping approach to obtain a median slope of CO 2 and CH 4 concentration changes (see also 520 methods section) allows to generate an error range for observed flux rates. The standard error of the calculated slopes was transformed into a flux by the same formula applied to the median slope, averaged over all measurements, and is called Err slope . For GP P , Err slope is composed by both the Err slope of N EE measurements and R eco measurements, since these two direct flux measurements needed to be combined for GPP. Second, modeling the chamber fluxes in order to have a continuous time series results in deviations from the modeled vs. the measured fluxes. Here, a linear regression (modeled 525 vs. measured) was applied to evaluate the model quality, and to obtain a standard error. Third, to model and interpolate R eco (at GR-1 and GR-2) and CH 4 fluxes (at UGR-1 and UGR-2), interpolated soil temperatures were used. Therefore, the RMSE of these models was considered by adding it to the T S -term in the interpolation formula for R eco (T S,5cm ) and CH 4 fluxes (T S,15cm ; T S,25cm ). In a last step, the initial flux was subtracted from this "enhanced" flux, and the result was defined as the