Microbial diversity through an oceanographic lens: re ﬁ ning the concept of ocean provinces through trophic-level analysis and productivity-speci ﬁ c length scales

productivity-speci ﬁ c length scales, the potential horizontal displacement of microbial communities by surface currents to an intrinsic biological rate (here, speci ﬁ c primary productivity). This scale provides key context for our trophically dis-aggregated diversity analysis that we could relate to underlying oceanographic features. We integrate this element to provide more nuanced insights into the mosaic-like nature of microbial provincialism, linking diversity patterns to oceanographic transport through primary production.


Introduction
The continuous movement of seawater and turnover of microbial biomass and diversity in marine ecosystems form a time-varying mosaic of phylogenetic and functional biodiversity across ocean basins.This complicates efforts to map, understand and monitor key marine ecosystem attributes such as microbial growth, biodiversity and carbon cycling (Stec et al., 2017).
Marine ecosystems comprise a wide array of microbial life: microbial (photo-)autotrophic, pro-and eukaryotes, form the base of the marine food web (Hutchins et al., 2015;Sunagawa et al., 2015), and sustain energy exchange, provision and recycling of resources (Falkowski et al., 2008;Guidi et al., 2016) for higher trophic levels.Heterotrophs and mixotrophs remineralize most of the carbon and nutrients from the primary production via the microbial loop, before these can be exported to the deep sea (Azam et al., 1983;Azam and Malfatti, 2007).
In terrestrial systems, primary producer communities have been used to define major biomes (Woodward, 1987;Woodward et al., 2004), biogeographic realms and ecoregions (Olson et al., 2001) through their physical and functional structuring of ecosystems (Cardinale et al., 2011).In the ocean, Longhurst (2007) used chlorophyll-a (chl a) concentrations as a proxy for phytoplankton biomass to delimit ocean provinces, alongside water temperature to distinguish water masses.Longhurst provinces are used to define oceanographic biogeographic subdivisions; however, static applications of these provinces typically overlook the dynamic interactions, life histories, endemism and/or vicariance within ecological assemblages, needed to truly map microbial biogeography.Thus, multiple variables including biomass, primary productivity (PP), and diversity need to be considered with carefully structured sampling across space and time (Kollmann et al., 2016).Here, we investigate how integrating multiple physical and biochemical variablesaccounting for their horizontal displacement by surface currentscan improve our understanding of microbial provincialization across a high-resolution transect (~0.5 latitude) of the Atlantic Ocean.
Microbial assemblages can act as 'fingerprints' for water masses thanks to their high diversity, responsiveness and (typically) fast generation time (hours to days; Martiny et al., 2006), aiding the definition of biogeographic boundaries in pelagic environments (Raes et al., 2011;Fuhrman et al., 2015).However, the dynamic nature of the oceanswith their fronts, currents, eddies, up-and downwellings, and other hydrographic featureshas the potential to add additional complexity to the ecological variation between and within such regions (Oliver and Irwin, 2008;Hernando-Morales et al., 2017).
Hydrographic features create structural variability in the ocean, which (through, e.g.modifying nutrient distributions) provides more diverse niche space for microbial communities to colonize.Unless mitigated, this would increase microbial diversity (e.g.Kemp and Mitsch, 1979;Cadotte, 2006) and should favour phytoplankton productivity (Legendre, 1981).While basin-scale horizontal dispersal of organisms by major ocean currents is known to reduce microbial β-diversity (Raes et al., 2011;Richter et al., 2020;Sommeria-Klein et al., 2021), mesoscale and sub-mesoscale horizontal transport of microbial communities, and the impact on their activity, is rarely taken into account, except in a few frameworks that couple ecology and hydrography (e.g. the 'dual-lens' approach; Oldham et al., 2013).Indeed, hydrographic dynamics within and between ocean provinces interact with both neutral and selective ecological processes, resulting in communities in different successional states (Zhou et al., 2014) and/or shaped by opportunistic responses (e.g.Duffy and Stachowicz, 2006;Hartmann et al., 2012;Fadeev et al., 2021).
Here, we assess the relative importance of regional water mass characteristics (physicochemical parameters and hydrography) on microbial diversity.Furthermore, we resolve diversity responses along major trophic groups (auto-, mixo-, heterotrophs), an important but understudied perspective in microbial ecology (reviewed in Seibold et al., 2018).We leverage a conceptual framework that asserts that microbial communities are distinct within oceanographic regions, separated by fronts and currents limiting microbial dispersal (Martiny et al., 2006;Milici et al., 2016;Raes et al., 2018).We then assess how ecosystem structure (i.e.hydrography), bottom-up and top-down factors can qualify and advance traditional partitioning of the Atlantic into biogeographic provinces.

Delineation of oceanographic, ecosystemic provinces
Our analyses of remote sensing observations of geostrophic currents, sea surface temperature and chl a combined with in situ measurements of oceanographic features, chl a and microbial diversity informed our determination of ecosystem boundaries shown in Fig. 1.The ecosystem boundaries broadly overlapped with the Longhurst Provinces and we thus maintain the same naming conventions (Fig. S1).One exception was the South Atlantic Gyral (SATL) province, wherein sites were clearly ordinated into separate groups; one with cool waters (COLD; in the Argentine Basin, south of the Rio Grande rise and northern, oligotrophic warmer waters (HOT; 25.1-27.5) in the Brazil Basin (Figs S2 and S3).Furthermore, Stations 24 and 92 were outliers with respect to the temperature-salinity plot used for identifying ecosystem boundaries (Fig. S2), and were thus not grouped in an ocean province.
On a broad scale, we could separate ocean provinces into provinces with low and high chl a with significant differences in chl a concentrations (Wilcoxon, p-value = 9.112e À6 , n 1 = 38, n 2 = 39), and significant differences in PP (Wilcoxon, p-value = 3.686e À5 , n 1 = 38, n 2 = 39; Table 1).In our principal component analysis (PCA) (Fig. S4), sites in the Southwest Atlantic Shelves province (FKLD), Brazil Current Coast province (BRAZ), Canary Current Coast province (CNRY), North Atlantic Subtropical Gyre province (NAST-E) and North Atlantic Drift province (NADR) (i.e.high-chl a) provinces were associated with high particulate organic matter and high dissolved inorganic nutrient concentrations.Furthermore, the SATL-COLD, SATL-HOT, Western Tropical Atlantic province (WTRA) and North Atlantic Tropical Gyre province (NATR) (i.e.low-chl a provinces) provinces were associated with low temperatures and larger distance to coast (>600 km; Fig. S4).

Biomass turnover and transport of pelagic microbial communities
In the following analyses, we used a quantitative comparison of biomass turnover (specific PP; P B ) and current speed (see Methods, Eqs. 3 and 4) to estimate the distance a microbial community has travelled through passive advection before half of its biomass has been turned over: the productivity-specific length scale (Fig. 2A).
Based on the biological and physical component of the productivity-specific length scale, we observed strong sample variations in both variables [P B based on PP per day (d) and chl a, Table 1, and surface current speed] within and between provinces (Fig. 2B).For example, the SATL-HOT province had high variations in P B (0.007-0.07 mg C m À3 d À1 , n = 15) with relatively constant horizontal current speed (1-3 m d À1 ).In contrast, we measured a large range of horizontal current speed (2-7 m d À1 , n = 4) at relatively low P B (0.002-0.01 mg C m À3 d À1 , n = 4) within the BRAZ province resulting in the largest productivity-specific length scales (up to 22 km) of our dataset.
We used our calculations of sample-specific productivity-specific length scales to examine variabilities in beta diversity (β-diversity) between neighbouring sites (Fig. 2C).We noted three major regimes in our joint analysis of β-diversity and productivity-specific length scales (Fig. 2C): The first regime matches the CNRY province and shows markedly high variability in β-diversity but always at low (<5 km) productivity-specific length scales.The second regime shows a largely random scatter bounded by a well-defined upper threshold of β-diversity (0.05 ordination distance).Similar to the first regime, we observed this distribution of β-diversity to be restricted to productivity-specific length scales under 5 km.The final regime occurs at productivity-specific length scales between 5 and 18 km.We observed an increase in β-diversity with increasing productivity-specific length scales.However, the variation between functional trophic groups increased with increasing productivity-specific length scales, too.Cyanobacteria, eukaryotic heterotrophs and mixotrophs were more similar to each other with a stronger increase in β-diversity compared to the β-diversity observed in prokaryotic heterotrophs and eukaryotic autotrophs.
We observed that α-diversity varied across trophic groups, provinces and latitude (Fig. 3A).The α-diversity of both eukaryotic and prokaryotic autotrophs and eukaryotic mixotrophs was lower than that of heterotrophs.Relative to other functional groups, heterotrophic prokaryotes had higher α-diversity in the southernmost and northernmost provinces (FKLD, BRAZ, NADR).Shannon diversity increased towards the equator and was most pronounced for heterotrophic pro-and eukaryotes (Fig. 3A) with the highest α-diversity in the SATL-HOT, SATL-COLD and WTRA provinces (Table S2); the warmest and most oligotrophic provinces sampled.Similarly, the low-chl a provinces had significantly higher mixotrophic α-diversity than the high-chl a provinces (Wilcoxon, p < 2.2e À16 , n 1 = 39, n 2 = 68).
We noted that α-diversity of trophic functional groups are similarly structured in their correlations with environmental (physical + chemical) parameters; however, the magnitudes and significances of the correlations varied between groups (e.g.NAST-E PO 4 , auto.cyano; r = 0.83, p = 0.01; PO 4 , het.prok, r = 0.39, p = 0.27, Figs S6 and S7), as did the distribution of residuals (Figs S8-S15).Between functional groups, we observed a significant positive correlation between the α-diversity of autotrophic eukaryotes and cyanobacteria in the CNRY provinces or between autotrophic eukaryotes and mixotrophs in the CNRY and NADR provinces as well as across the transect (Fig. 3B, S16).C. Beta diversity to neighbouring sites based on ordination distances against productivity-specific length scales for each trophic functional group.Sites are colour coded according to ocean provinces.Box 1 indicates high beta diversity with great variability within the CNRY province.Box 2 corresponds to beta diversity variation of most ocean provinces with an upper limit of 5 km productivity-specific length scale; box 3 indicates an increase in beta diversity with productivity-specific length scales >5 km.
We observed that each province has a distinct correlation pattern across the variables we examined (Fig. 3B).Correlations were more pronounced in the CNRY province (Fig. 3B).We could not identify pairs of parameters with correlations that were consistently repeated across provinces (e.g.temperature, auto.eukSATL-COLD; temperature, auto.eukSATL-HOT; Fig. 3B).Furthermore, we noted that samples were not evenly spread along bivariate plots, especially in physically energetic regions (i.e.large scatter of temperaturesalinity profile; Fig. S17).

Beta diversity (β-diversity) patterns of auto-, mixo-and heterotrophs across provinces
Sites belonging to high-chl a and low-chl a provinces were well separated along the first axes of our redundancy analysis (RDA) plots for each trophic functional group (Fig. 4; between 40% and 61% of variance constrained).Sites belonging to high-chl a provinces were associated with higher dissolved inorganic nutrient concentrations, particulate organic matter and chl a concentrations.Sites belonging to low-chl a provinces were associated with higher temperatures, salinity and larger distances to province boundaries and the coast.
We found that sites in low-chl a and high-chl a provinces were differentially ordinated in our RDA biplots.We observed the shortest distances between sites in low-chl a provinces when analysing autotrophic eukaryotic diversity.In contrast, heterotrophic eukaryotic and mixotrophic diversity led to sites in low-chl a provinces being ordinated furthest apart from those in other provinces (Table S4).While autotrophic eukaryotic communities were distinctly separated between the highchl a provinces, mixotrophic communities had short relative ordination distances or overlapped in these provinces (Fig. 4A and C; Table S4).Across all trophic groups, the samples from the CNRY province showed the largest spread among the second RDA axes and partially overlapped with samples from the NAST-E province (Fig. 4).
Prokaryotic heterotrophic communities were more similar to autotrophic eukaryotes in their pattern of the biplots (procrustes analysis, Table S5), with shorter relative distances in low-chl a than in high-chl a provinces (Fig. 4A and E; Table S4).In contrast, cyanobacteria, heterotrophic and mixotrophic eukaryotes were more similar to each other (procrustes analysis, Table S5); shorter relative distances in high-chl a than in low-chl a provinces (Fig. 4; Table S4).
Analysis of the β-diversity among communities of heterotrophic pro-and eukaryotes, and mixotrophs ordinated station 24 between those that are characteristic of the BRAZ province and the southern, cold part of the SATL-COLD province (Fig. 4C-E).Similarly, analysis of β-diversity of the autotrophic eukaryotes and mixotrophs at Station 92 (located between the SATL-HOT and the WTRA province) ordinated the station between the SATL-HOT and WTRA (Fig. 4A and C).
The SATL province, as defined by Longhurst (2007), clustered into two distinct groups in microbial β-diversity (Fig. 4) which was also apparent in our biogeochemical inspection and division into a southern dynamic (COLD) part, and a northern, oligotrophic (HOT) part in the Brazil Basin (Fig. 4).

Discussion
Our study revealed that microbial trophic groups showed variable αand β-diversity patterns in relation to physical and biogeochemical environmental parameters across our south-to-north transect.Below, we discuss how our integrative results can contribute to refining microbial, biogeographic provincialism.Our results provide an estimate of sample-based spatial scales of differences in biodiversity signals, suggesting that observations of ecosystem function and stability need regional highresolution sampling.
In situ measurements of province characteristics deviate from Longhurst provinces.Broadly, traditional Longhurst province boundaries (Longhurst, 2007) are determined by clustering remote sensing data on chl a and temperature data (e.g.Devred et al., 2007;Hardman-Mountford et al., 2008) or in situ measurements of phytoplankton pigment composition (Barlow et al., 2007;Taylor et al., 2011;Bracher et al., 2020); Our delineation largely overlapped (for about 80% of the stations) with the provinces delineated by Bracher et al., (2020) using hierarchical cluster analysis on phytoplankton group composition data derived from the HPLC marker pigments (Table S6).However, we observed regional divergence.Most pronounced was the separation of the SATL province into SATL-HOT and SATL-COLD; supported by our physical, chemical and microbial observations.This difference could not be detected in phytoplankton pigment-based analyses (see Bracher et al., 2020).Such observations (if confirmed) exemplify how new provincialisation can arise from microbial oceanographic perspectives.
However, we also noted that some of our in situ measurements could lead to spurious distinctions within provinces.For example, we occasionally measured high nutrient and chl a concentrations in provinces known to be oligotrophic (Longhurst, 2007; e.g. up to 0.43 mg m À3 chl a in the NATR province).Given our sampling regime, this may simply be the result of sampling a transient patch of high-nutrient/high-chl a water.Our β-diversity supports this possibility, as NATR sites were largely ordinated with sites from other oligotrophic provinces along the first major axis in each of our RDA triplots (Fig. 4).We note that this shows the value of microbial diversity data in holistically describing oceanographic provinces, especially when sampling dynamic regions.
Ocean currents can drive overlaps in β-diversity signals: For example, ordinations showed that sites from the NAST-E and CNRY province overlapped (Fig. 4), likely caused by transport and dispersal along the Azores Current which joins the Canary Current (Fedoseev, 1970; Fig. S2a).The complex dynamics in the CNRY province (driven, for example, by a weakening of the Canary Current through coastal upwelling and eddy formations as described in accompanying studies of the same sampling campaign; (Bracher et al., 2020;von Appen et al., 2020) likely contributed to increased microbial β-diversity, relative to most other provinces.Similarly, samples in physically energetic provinces (e.g.BRAZ province), defined by an increased scatter in their temperature-salinity profiles (Figs S2 and S17), were more distributed in their α-diversity suggesting that the habitat structure is more disrupted or patchy in these ecosystems and highresolution sampling (<50 km) is needed for stable biodiversity assessments as shown in other studies across ocean filaments (Fadeev et al., 2021) and fronts (Mousing et al., 2016).
Productivity-specific length scales and their role in microbial biogeography.The ratio of PP rate and current speed gives us a productivity-specific length over which a biological community is carried during a single biomass turnover.This can be interpreted as a scale relating the potential for observed changes in β-diversity to horizontal transport.This may be seen as a time-sensitive version of earlier relationships described between chl a patchiness and current speed (Powell et al., 1975).Specifically, we estimated the PP rate (normalized P B ) at each sampling site through 13 C stable isotope experiments and biomass assessment (chl a concentration).The advective environment in which this growth occurs was characterized through measuring surface current speedin m d À1at each sampling point.Overall, the productivity-specific length scale provides a first-order Fig. 4. Redundancy analysis (RDA) of CLR-transformed ASV counts, subsequently partitioned into separate tables for each functional group (A) autotrophic eukaryotes, (B) autotrophic prokaryotes (C) eukaryotic mixotrophs, (D) heterotrophic eukaryotes and (E) heterotrophic prokaryotes.Contextual spatial and environmental data (z-scored) were used as explanatory variables, represented as arrows.Samples cluster according to ocean provinces.Samples separate between low-chl a provinces (SATL-COLD, SATL-HOT, WTRA, NATR) (indicated with light grey star) and high-chl a provinces (FKLD, BRAZ, CNRY, NAST-E, NADR) (indicated with dark grey star) along the first axis (RDA 1) across all functional groups.estimate of the spatial scale of ecosystem patchiness in epipelagic systems, ultimately controlled by a key biological rate (PP) and horizontal current speed.
Across the Atlantic, we identified three regimes of β-diversity associated with different magnitudes of productivity-specific length scales [high diversity at low productivity-specific length scales (Fig. 2C Box 1), low diversity at low productivity-specific length scales (Fig. 2C Box 2), and increasing diversity with increasing productivity-specific length scales (Fig. 2C Box 3)], affected by different magnitudes of P B and current speed within different provinces (Fig. 2B).
For example, prokaryotic growth in oligotrophic regions with low current speed, such as the SATL-HOT province, has been shown to be controlled by heterotrophic grazing (Teira et al., 2019).In our data, the change in β-diversity within the SATL-HOT province occurred at intermediate productivity-specific length scales with higher P B , higher proportional heterotrophic biomass and higher eukaryotic heterotrophic richness relative to other sites (Fig. 2B and Cbox 2, Tables S1 and S2).This can suggest that differences in productivity-specific length scales are a result of greater biological top-down control on local biodiversity than differences in horizontal current speed.The β-diversity signals in provinces with high horizontal current speed increased with increasing productivity-specific length scales (Fig. 2C box 3 SATL_COLD, BRAZ), suggesting a possible shift in mechanisms for β-diversity at higher productivity-specific length scales.The observed increase in β-diversity was likely due to the variable local energetics of the Brazilian Coastal Current (De Souza and Robinson, 2004); reflected in a large range of temperature-salinity signals in our dataset (Fig. S2).These energetics can result in a confluence zone of communities potentially far from their region of origin (Malvinas-Brazil Confluence; (Clayton et al., 2013).However, at this point, we cannot differentiate between the physical and biological mechanisms as we also observed different magnitudes of ordination distances within different trophic functional groups.For most of our sites (Fig. 2C, Box 2), the length scale was <5 km and did not correlate with the relatively low β-diversity signals in our RDA analyses (<0.05 ordination distance).In this case, β-diversity can be controlled by one or more of the multiple environmental variables shaping microbial communities such as vertical mixing (Cheng et al., 2020), atmospheric processes (Mayol et al., 2014), macroorganisms (Troussellier et al., 2017) and anthropogenic impact (Nogales et al., 2011).
It is important to note that at this point, we discuss surface dynamics only, and do not include any consideration of vertical mixing, itself clearly an important feature in generating diversity changes (Delong et al., 2006), and likely the source for the great heterogeneity of β-diversity in the CNRY province (Fig. 2C -Box 1).Understanding the interplay of biological and physical controls would require more complete analyses of biological and physical interactions and is thus only a first-order estimate.
Essentially, the productivity-specific length scale can be seen as defining a relevant scale of the mosaic of the biodiversity signatures measured in this study, revealing where higher-resolution sampling (<5 km) would have added more value to our data set and thus where more careful interpretations of our β-diversity results are necessary.It also provides a perspective on the resolution future campaigns should adopt to develop biogeographic understandings in each region.
We are convinced that Lagrangian samplingat scales informed by sample-based productivity-specific length scalescould enhance our understanding of microbial pelagic biogeography.This approach would help characterize how far microbial communities in a given water packet could disperse within provinces boundaries, giving shape to their internal mosaics of biodiversity.This is analogous to how the Damköhler number relates exposure timescale and processing timescale of a chemical reaction, to express how spatially and temporally extended the impacts of that reaction will be felt (Oldham et al., 2013).
Province-dependent correlation structures between α-diversity and environmental variables.Across our entire transect, correlations between microbial α-diversity and environmental variables followed well-described latitudinal temperature-diversity relationships (e.g.Fuhrman et al., 2008;Ibarbalz et al., 2019).However, we observed more faceted environment-diversity relationships within provinces, especially within physically energetic provinces (Fig. 3B).More measured variables, including temperature, correlated with α-diversity in energetic provinces relative to those with less energetic profiles.This suggests that the temperature-diversity relationship is not controlled by thermal energy alone (see, e.g.Giebel et al., 2009) but is nested in a more faceted microbial response to local water mass characteristics.For example, the high-temperature provinces we sampled were also oligotrophic (low chl a), which would confound any attempt to independently assess temperaturediversity relationships without accounting for the impact of oligotrophic conditions on α-diversity (Fig. 3A; Santi et al., 2019).
Our results showed a nuanced relationship of PP to microbial α-diversity within provinces and do not corroborate previous observations, which detected significant positive correlations (Raes et al., 2018).We observed moderate positive (but not significant at n = 20) correlations of PP and prokaryotic heterotrophic α-diversity in low-chl a provinces.This may be due to more oligotrophic conditions supporting more even saturation of available niches, rather than boom-and-bust dynamics characteristic of eutrophic environments and events (Duffy and Stachowicz, 2006).In contrast, we observed a negative and significant correlation between α-diversity and PP in the highly productive CNRY province, suggesting the rise and succession (via invasibility) of a few, opportunistic phylotypes (Steiner and Leibold, 2004), due to input of limiting nutrients from Saharan dust (von Appen et al., 2020).Together, our observations in the Atlantic and those in the Pacific (Raes et al., 2018) suggest that the relationship between PP and microbial α-diversity is nuanced, and observed signals depend on province-specific characteristics, which drive competition.
Notably, the differences in sampling size (n min = 8, n max = 20) of different provinces impact individual correlation and correlation significance.Thus, we excluded the FKLD (n = 4) and NATR (n = 6) province from our correlation analysis.The great variability in significant correlations between environmental variables and microbial diversity showed that refined observations within provinces are needed to confirm the observed individual environment-diversity relationships.
Low and high chl a conditions correspond to contrasting diversity patterns between trophic groups.We observed contrasting β-diversity patterns in microbial functional groups associated with low-and high-chl a conditions, supporting the observations by Irwin et al. (2006) of differences in the importance of environmental predictors for different phytoplankton functional types.Our results show that environmental conditions are important predictors not only on a phylogenetic level (e.g.diatoms vs. dinoflagellates, Irwin et al., 2006) but also within trophic functional groups.
We observed that eukaryotic mixotrophic communities had relatively high α-diversity at each site and high β-diversity between sites in low-chl a provinces (vs.highchl a provinces), suggesting that mixotrophy is supported under low-nutrient, low chl a conditions (see, e.g.Hartmann et al., 2012).Similarly, the β-diversity of cyanobacteria and eukaryotic heterotrophs was greater in sites under low chl a conditions, suggesting these functional groups are more susceptible to selective forces in these provinces (e.g.increased cyanobacterial diversity under low phosphate conditions; Thompson et al., 2013).
In contrast, we observed that there is greater β-diversity between sites across high-chl a provinces (vs.low-chl a provinces) when examining eukaryotic autotrophic and prokaryotic heterotrophic communities (Fig. 4A and E), which are known to structure one another (Seymour et al., 2017).However, we did not observe any proportional change in the α-diversity of these groups between low-and high-chl a provinces.This suggests that higher productivity is contributing more to species turnover between sites, rather than greater diversity within sites (Vallina et al., 2014).Observing a stable number of Hutchinsonian niches suggests that their occupancy is driven more by stochastic than deterministic processes in these provinces.These processes may include the dilution of slow-growing cells in dynamic systems (Irwin et al., 2006) as well as the prevalence and favouring of r-strategists in the community.However, further work is needed to explore these speculations.
Overall, our results extend previous findings (Legendre, 1981) of coupling between microbial diversity and productivity, showing trophy-specific diversity patterns between low chl a and high chl a conditions.Only more temporally and spatially expanded observations (factoring in functional diversity using metagenomics/transcriptomics, improved hydrographic descriptions, and physicochemical/nutrient profiles in these regions would allow our observations to be more confidently linked to ecosystem states driven by productivity (Chase and Leibold, 2002) and broader biogeographic descriptions.

Conclusion
Here, we assessed how measures of microbial diversity and activity can better inform ecological partitioning of the ocean, granting new perspectives on functional microbial biogeography across, within and between provinces.We showed that eukaryotic autotrophs and prokaryotic heterotrophs show similar cross-province β-diversity patterns, distinct from those shared by mixotrophs, cyanobacteria and eukaryotic heterotrophs.Our calculations of a productivity-specific length scale are, to our knowledge, the first attempt to quantifyon a per-sample basisthe influence of surface ocean currents on microbial communities coupled to PP measurements.This provides a first-order estimate of how spatially extended a microbial diversity signature may be, and thus the scale of a recognizable patch in a larger biogeographic area.Our findings also show the value of exploring functional communities (i.e.guilds) of microorganisms to more holistically understand community ecology with phylogenetic diversity data.We conclude that highly resolved sampling of these factors along more Lagrangian designs would help microbial ecology coherently map the subdivisions of the ocean's biogeographic provinces.

Sample collection
Our sampling was part of the PS113 (ANT-XXXIII/4) campaign onboard RV Polarstern from Punta Arenas, Chile, to Bremerhaven, Germany, from 2018-05-08 to 2018-06-10 (Strass, 2018).We took discrete measurements for biophysical analyses of sea surface water at 193 stations in the Atlantic Ocean from about 11 m depth through the ship's seawater system (Teflon ® tubing with a membrane pump) at an interval of ~0.5 latitude.

Province delineation after Longhurst
We defined ecological regions based on the variables suggested by Longhurst (2007), which include gradients in sea surface temperature, salinity, chl a, and checked for matches in nutrient concentrations (Supplementary S1, Figs S1 and S2).We classified samples into provinces by identifying (i) clusters on a temperature-salinity plot (i.e. by water mass) where clusters were constrained by geographic proximity (Fig. S2), and (ii) boundary currents that coincided with province boundaries (i.e.where surface velocity vectors (geostrophic currents) were strong at province boundaries as identified above; Fig. S3a; Copernicus, 2020).We also identified ocean ridges in bathymetry profiles as features potentially structuring the modified provinces (GEBCO 2019).We compared our classification into Longhurst provinces with delineations proposed by Bracher et al. (2020) (Table S6).
We classified provinces as high-chl a provinces if their mean chl a concentrations were above 0.3 μg m À3 chl a, or as low-chl a provinces if their mean concentrations were below 0.1 μg m À3 chl a. Provinces with mean concentrations between 0.1 and 0.3 μg m À3 were treated as ambiguous, and we defaulted to classifications in Longhurst (2007).For details see Supplementary S1.
We calculated the distances to the coast and to province boundaries in qgis 3.14 (QGIS Geographic Information System, 2020) by densifying the vector coastlines and extracting vertices, followed by extracting the distance from each site points to the nearest hub on the coast contours.We used the '110 m vector coastline' (v4.1.0,Natural Earth Data) and the province boundaries (delineated above) in our calculations.

Biochemical and PP profiling
At each station, we collected biochemical samples for dissolved inorganic nutrients (silicate, phosphate, nitrate and nitrite) as well as particulate organic matter (POC and PN).For details see Supplementary S2.
We measured PP with 200 μmol NaH 13 CO 3 stable isotope incubations in triplicates over a time period of ~24 h.For details on experimental design see Supplementary S3 and incubation conditions in Table S7.Analysis of 13 C incorporated into organic matter was carried out on a PDZ Europa ANCA-GSL elemental analyser interfaced to a PDZ Europa 20-20 isotope ratio mass spectrometer (Sercon, Cheshire, UK) by the Isotopic Laboratory at the UC Davis, California campus.PP was calculated as in Eq. 1: with at.%(enriched) as the atom percent derived from 13 C-spiked samples; at.%(NA) is the atom percent of natural abundance derived from particulate organic carbon (POC) samples.Dissolved inorganic carbon (DIC) is assumed as 2000 μmol kg À1 (after Zeebe and Wolf-Gladrow, 2001).Incubation time is the time between sample spike and filtration (~24 h).
We calculated specific PP (P B ) based on PP and chl a concentration, as shown in Equation 2.
where PP is expressed in mg C m À3 d À1 and chl a concentration in mg chl a in m À3 , resulting in a P B value in mg C m À3 d À1 chl a À1 .

Productivity-specific length scale
To link the biomass turnover of microbial communities to their pelagic advection, we calculate a length scale for biological-physical coupling (productivity-specific length scale) using a quantitative comparison of measured biomass turnover and measured current speed.
productivity À specific length scale where the productivity-specific length scale is expressed in kilometer.Current speed reflects average horizontal current speed from the VM-ADCP which provided data for the depth range 20-50 m.The λ is derived as in Eq. 4.
PP is expressed in mg C m À3 d À1 and the standing stock biomass corresponds to the chl a concentration (in mg m À3 ) times 23 mg chl a (mg C) À1 [for conversion between chl a and carbon concentrations (Geider, 1987); validated against in situ POC concentrations; Table S8].

Microbial sampling, processing and amplicon sequence analyses
Flow cytometry.At each station, samples for cell counts were taken into 2 ml Eppendorf tubes and incubated in the dark prior to fixation using 0.2% paraformaldehyde.Samples were snap-frozen and stored at À80 C while at sea.Samples were analysed on a BD Accuri™ C6 Plus Flow Cytometer (BD Biosciences-US) according to Gasol and Moran (2015).For details on measurements see Supplementary S4.
Microbial DNA sampling.For 16S and 18S rRNA gene sequencing analysis, 4 L of seawater were filtered through 0.2 μm Sterivex ® filters.Filters were purged with air, tightly closed, snap-frozen in liquid nitrogen, and stored at À80 C until further analysis.
ASV tables for both 16S and 18S rRNA gene amplicon sequences were constructed using the DADA2 R package, v1.15.1 (Callahan et al., 2016).Samples were processed using the DADA2 pipeline (v.1.15)with an additional step where primers were trimmed using cutadapt v1.18 after pre-filtering of FASTQ files (dada2_16S.R, dada2_18S.R in Supplementary_Code/ dada2/).Diagnostics of each filtering step are in Table S9 and number of reads plotted for each filtering step in density plots (Fig. S18).
We classified eukaryotic taxa as either autotroph, mixotroph, heterotroph, or unknown if no information of trophy was available by performing an unstructured literature search to validate expert-led assignment by UJ (see outcomes in Table S10).Prokaryotes, except Cyanobacteria, were considered to be dominated by a heterotrophic lifestyle (sensu Herndl et al., 2008), see Table S10 for further details.

Ecological data analyses
We performed our data analysis using R v4.0.3 (R Core Team, 2020) and RStudio v1.4.1103 (RStudio Team, 2021).Biophysical parameters, concentrations of particulate organic matter, dissolved inorganic nutrients, microbial cell counts and PP measurements were used as environmental metadata for statistical analyses with microbial sequencing data.
Pearson correlations between microbial Shannon diversity and environmental parameters (salinity, temperature, PO 4À , H 4 SiO 4 , NO 3 , dissolved oxygen, distance to province boundary, distance to coast, PN, POC, chl a) were calculated and plotted using the corrplot() function of the corrplot package (v0.84) and adjusted for multiple testing using the Holm-Bonferroni method (Supplementary_Code/ correlation_analyses/general_corrplot.R).Residuals of all correlations were screened for notable departures from Gaussian distributions, in which case these correlations were excluded from our results (Supplementary_Code/cor-relation_analyses/Residuals_corrplot.R; Figs S8-S15).
ASV tables and environmental metadata were transformed for comparability and statistical downstream analyses (general_clr_hellinger_transformations.R, z-scoring_subset.R in Supplementary_Code/data_transformations/).Before transformations, we removed all ASVs with ≤3 instances across all samples.Furthermore, before CLR transformation, we performed Bayesian-multiplicative treatments of zeros in the ASV tables using cmultRepl() function of the zComposition package (v1.3.4): this uses sample-wise totals to convert zero counts (which will lead to errors in log-ratios) into near-zero estimates, assuming undersampling rather than absence.To account for compositionality effects in our ASV tables (see Gloor et al., 2017), we performed a CLR transformation for RDA.Prior to permutational MANOVA (PERMANOVA) analyses using the decostand() function in vegan (v2.5.6),Environmental variables were checked for normal distribution (Fig. S19) and z-scored for scale-independent intercomparability.
We examined the distribution of sites among environmental gradients using a PCA on our microbial diversity metadata.We tested differences of ocean provinces using two-sided Wilcoxon rank-sum tests (hereafter: Wilcoxon) (Supplementary_Code/environmental_analyses/PP_plots.R L.81-84).
To examine microbial beta diversity (β-diversity) and its relation to environmental and contextual variables, we performed a set of RDA using the CLR-transformed ASV tables as response matrices and tables of environmental variables as explanatory matrices.We calculated sun azimuth and altitude based on sampling time and location using the suncalc (v.0.5.0) package in R. We performed stepwise model to identify significant environmental variables using ordiR2step() function in vegan (Supplementary_Code/ multivariate_analysis_ordistep.R).Differences of microbial dissimilarity between ocean provinces were tested with a PERMANOVA (Anderson, 2001) on Aitchinson distance of the CLR-transformed ASV tables using the adonis2() function along with a beta dispersion test to evaluate the homogeneity of dispersion using the betadisper() function in vegan (Supplementary_Code/multivariate_analysis/ RDA_functions.R).

Fig. 1 .
Fig. 1.Map of the PS113 expedition showing chl a concentration gradients ranging from 0 to 3 mg chl a m À3 , total nanoplanktonic cell number (ranging from 1000 to 3000 cells μl À1 ) and primary productivity (PP) (nmol C L À1 h À1 ) against latitude.Stations are indicated by circles.Ocean provinces are indicated with coloured lines.

Fig. 2 .
Fig. 2. (A) Map of the PS113 expedition indicating productivity specific length scale using a quantitative comparison of measured biomass turnover [half of standing stock phytoplankton biomass (chl a concentration) is replaced by new biomass (calculated from primary productivity rates) and measured horizontal current speed (ADCP horizontal velocity)], see Eqs. 3 and 4 for more details.Direction is derived from ADCP measurements.Ocean provinces are indicated with coloured lines.Light grey stars indicate low-chl a provinces, dark grey stars indicate high-chl a provinces.B. Specific primary productivity (P B ) against current speed (m d À1 ) at each site.Sites are colour coded according to ocean provinces.Productivity-specific length scales (km) are indicated by grey lines across the plot.C.Beta diversity to neighbouring sites based on ordination distances against productivity-specific length scales for each trophic functional group.Sites are colour coded according to ocean provinces.Box 1 indicates high beta diversity with great variability within the CNRY province.Box 2 corresponds to beta diversity variation of most ocean provinces with an upper limit of 5 km productivity-specific length scale; box 3 indicates an increase in beta diversity with productivity-specific length scales >5 km.

Fig. 3 .
Fig. 3. Alpha diversity across ocean provinces.A. Mean microbial Shannon diversity of eukaryotic autotrophs, prokaryotic autotrophs (cyanobacteria), eukaryotic mixotrophs, heterotrophic eukaryotes and heterotrophic prokaryotes; Error bars indicate standard deviation of microbial Shannon diversity within each province.Sample size within each province is indicated below.B. Pearson correlations of Shannon diversities with and environmental parameters within ocean provinces.Distance to coast (dCoast) calculated as distance of sample to next shore (in kilometer), dBoundary is the distance to the province boundary of the sample identified to belong to, temperature is sea surface temperature ( C), salinity is sea surface salinity, oxygen (μM), nitrate (NO 3 ) in μM, phosphate (PO 4 ) in μM Silicate (Si) in μM, Particulate organic carbon concentration (POC) in nM, particulate nitrogen (PN) concentration in nM, chlorophyll a (chl a) in mg m À3 , specific primary productivity (P B ) in nmol C L À1 chl a À1 , primary productivity (PP) in nmol L À1 h À1 .Correlation plots with colours indicating gradient from negative (red) to positive (blue) correlation; correlations of not significant, i.e. p-value >0.05, and with non-normal distribution of residuals in linear regression model were removed from the plot [for full correlation plot see Fig. S6, and Figs S8-S14 for residual histogram plots of individual provinces and across the entire transect (Fig. S15)].Within provinces, correlations were calculated for eukaryotes and prokaryotes respectively, and each trophic group (autotroph, mixotroph and heterotroph) against different environmental variables and against each other.Coloured boxes indicate correlations within provinces: FKLD, Southwest Atlantic Shelves province (n = 4); BRAZ, Brazil Current Coastal province (n = 8); SATL, South Atlantic Subtropical Gyral province (COLD: n = 15; HOT: n = 20); WTRA, Western Tropical Atlantic province (n = 17); NATR, North Atlantic Tropical Gyral province (n = 6); CNRY, Canary Current Coastal province (n = 26), NAST, North Atlantic Subtropical Gyral province (n = 10); NADR, North Atlantic Drift province (n = 13).ALL indicates all samples across the entire transect (n = 121).