Genomic analyses suggest strong population connectivity over large spatial scales of the commercially important baitworm, Australonuphis teres (Onuphidae)

Barriers to dispersal can disrupt gene flow between populations, resulting in genetically distinct populations. Although many marine animals have potential for long-distance dispersal via a planktonic stage, gene flow among populations separated by large geographic distances is not always evident. Polychaetes are ecologically important and have been used as biological surrogates for marine biodiversity. Some polychaete species are used as bait for recreational fisheries, with this demand supporting commercial fisheries for polychaetes to service the retail bait market. However, despite their ecological and economic importance, very little is known about the life history or population dynamics of polychaetes, and few studies have used genetic or genomic approaches to understand polychaete population connectivity. Here, we investigate the population structure of one commonly collected beachworm species used for bait on the eastern coast of Australia, namely, Australonuphis teres, by using genome-wide single-nucleotide polymorphism data. We sampled A. teres from hierarchical nested spatial scales along 900 km of the coast in New South Wales. We identified six genetic groups, but there was no clear geographic pattern of distribution. Our results suggest that there is considerable gene flow among the sampled populations. These high-resolution genomic data support the findings of previous studies, and we infer that oceanographic processes promote genetic exchange among polychaete populations in south-eastern Australia. Additional keywords: admixture, fisheries, genotype-by-sequencing, management, polychaete, population structure, SNP. Received 11 February 2020, accepted 27 April 2020, published online 21 May 2020

Bermingham 1995; Roberts 1997;Cowen et al. 2007), many other marine taxa, even those with apparently high dispersal potential, show surprisingly little evidence for gene flow among disjunct populations (Dibacco et al. 2006;Purcell et al. 2006;Froukh and Kochzius 2007;Marko et al. 2007;Temby et al. 2007;Miller et al. 2009Miller et al. , 2014Xuereb et al. 2018). There is, thus, a growing recognition that marine systems are not all open and interconnected (Dennis and Hellberg 2010;Karsenti et al. 2011;Iacchei et al. 2013). Defining the scale of connectivity and determining the factors that influence gene flow are critical for understanding population dynamics and genetic structure of marine species, and for developing effective management strategies to mitigate anthropogenic effects on populations and ecosystems (Giangrande et al. 2005;Waples and Punt 2008).
Polychaete worms are common in intertidal ecosystems (Cole et al. 2007(Cole et al. , 2017, and are good indicators of species richness and community patterns in benthic invertebrate assemblages (Olsgard and Somerfield 2000;Giangrande et al. 2005). Indeed, polychaetes have been used as biological surrogates for marine biodiversity (Olsgard et al. 2003;Shokri et al. 2009). Among benthic groups, polychaetes are one of the best indicators of environmental disturbance, because there are both sensitive and tolerant species across pristine and heavily disturbed habitats (Olsgard and Somerfield 2000). Polychaetes are important members of the marine food chain; they can act as predators, scavengers and grazers of diverse other organisms, or as prey for a variety of bird, fish and crustacean species (Fauchald and Jumars 1979;Jumars et al. 2015).
In addition to their ecological importance, many species of polychaetes are an important bait resource for recreational fishers (Cole et al. 2018). As with any fisheries resource, management relies on information about stock size and population connectivity (Cowen et al. 2007;Fogarty and Botsford 2007). In Australia, recreational and commercial fishers target beachworms directly for bait or for sale into the bait market respectively. Beachworms, including the 'kingworm' or 'stumpy' Australonuphis teres (Onuphidae), inhabit highenergy sandy beaches from Maroochydore, Queensland to Lakes Entrance, Victoria (Paxton 1979). Although little is known about their population biology, they are thought to be broadcast spawners, have been recorded to be sexually mature at 420 mm in length (Paxton 1979), and are suspected to breed multiple times a year (Paxton 1986). As adults, dispersal of A. teres is relatively limited (Paxton 1979), with large topographic structures (e.g. headlands and river mouths) that separate sandy beaches acting as physical barriers to dispersal.
Despite their ecological and economic importance, few population genetic (and fewer population genomic) studies have yet been conducted for polychaetes. In one genetic study from the USA, the baitworm Glycera dibranchiata (Glyceridae) was inferred to have little connectivity among populations within an estuary and between intertidal and subtidal populations (Bristow and Vadas 1991). In contrast, in Australia, a recent study of estuarine polychaetes found little genetic differentiation among populations of the nephtyids Aglaophamus australiensis and Nephtys longipes, from which the authors inferred that pelagic larval dispersal is probably mediated by ocean currents in these two species (Smith et al. 2015). However, this latter research was based only on small fragments of mitochondrial and nuclear loci, which were not of a high-enough resolution to enable assessment of whether population connectivity was ongoing or merely recent. The connectivity of populations of polychaetes along Australian coasts, thus, remains largely unknown, yet, such knowledge is of considerable importance for managing the sustainable harvest of beachworms and other polychaete bait species.
Here, we use genomic approaches to investigate population structure of the polychaete worm A. teres at multiple nested spatial scales (i.e. sites separated by kilometres, beaches separated by tens of kilometres, and regions separated by hundreds of kilometres) along 900 km of the New South Wales (NSW) coast in Australia. We generated a single-nucleotide polymorphism (SNP) dataset via genotype-by-sequencing. The three possible findings are as follows: (1) genetic homogeneity in all A. teres samples across spatial scales, which would indicate a single panmictic population and considerable gene flow along the coast, (2) genetically distinct populations of A. teres, indicating negligible gene flow among populations, or (3) a mixed result, with both similarities and differences among populations, indicating some gene flow. This study is one of the first to use large-scale genomic data to assess population connectivity in polychaetes.

Materials and methods
Field methods for sampling Australonuphis teres Populations of A. teres were sampled in the Eastern Warm Temperate biogeographic zone of the NSW coast. Three bioregions were selected and we designated them as North, Central and South, corresponding to the Tweed-Moreton (288S-30.58S), Manning (30.58S-30.758S) and Batemans (34.68S-36.68S) bioregions respectively. A fully nested design was used to obtain samples of A. teres. In each of the three regions (separated by hundreds of kilometres), two beaches (separated by tens of kilometres), each with two randomly chosen sites (separated by kilometres), 45 individuals were collected. The posterior end (the last 20 mm of the body) was removed and placed in 98% ethanol and stored below À188C for subsequent analysis. Beaches were selected as those where commercial harvesting of beachworms occurs (.0.6 t year À1 commercial catch over the previous 2 years). Samples of the population of A. teres were obtained through beach sampling, whereby a mesh bag containing a fish frame was dragged along the sand in the swash zone to entice the worms to emerge, and any worms that surfaced were collected by hand.

DNA extraction, library preparation and sequencing
Tissue samples were removed from ethanol, dried and subsampled. DNA was extracted with the Qiagen DNeasy 96 Blood and Tissue Kit (Qiagen, Hilden, Germany) following the manufacturer's instructions. The quality and quantity of DNA were evaluated by gel electrophoresis and Qubit fluorometer (Invitrogen, Carlsbad, CA, USA). The 40 highest-yield extractions from each site were selected for sequencing, with a total of 480 individual A. teres being used in library preparation.
A genotyping-by-sequencing (GBS) approach to generate sequence data was used, as applied in similarly non-model systems in the past, and following Fraser et al. (2018a) and references therein. Briefly, GBS library preparation followed standard protocols (Elshire et al. 2011) with minor modifications (Wilson et al. 2019). To each DNA sample, a uniquely barcoded PstI adaptor was added (2.25 ng per sample; Morris et al. 2011). Digestion was performed with 4 U PstI-HF (New England Biolabs, MA, USA) in 1 Â CutSmart Buffer, and incubation at 378C for 2 h. Adapters were ligated with T4 DNA ligase in 1 Â ligation buffer (New England Biolabs, MA, USA) and incubated at 168C for 90 min (with 2 min at 378C every 30 min) and 808C for 30 min. Purification was performed using a Qiagen MinElute 96 UF PCR Purification Kit (Qiagen, Hilden, Germany), with elution in 25 mL 1 Â TE. Polymerase chain reactions (PCRs) were run on 50-mL volumes containing 10 mL purified DNA, 1 Â MyTaqTM HS Master Mix (Bioline, Sydney, Australia) and 1 mM each of PCR primers 5 0 AATGATACGGCGACCACCGAGATCTACACTCTTTC-CCTACACGACGCTCTTCCGATC*T and 5 0 CAAGCAGA-AGACGGCATACGAGATCGGTCTCGGCATTCCTGCTG-AACCGCTCTTCCGATC*T (where * indicates phosphorothioation) at 728C for 5 min, 958C for 60 s, and 24 cycles of 958C for 30 s, 658C for 30 s and 728C for 30 s, with a final extension step at 728C for 5 min. Concentrations for each sample were assessed using a LabChip GXII (Caliper Life Sciences, MA, USA) and pooled equimolarly. A 200-base pair (bp) fraction (fragment size 250-450 bp) of the pooled library was separated via electrophoresis on a 1.5% agarose gel. Sequencing of DNA from this size range was performed on one lane of a high-output flowcell in an Illumina NextSeq 500 system (Illumina, CA, USA, 75 bp paired-end). The raw sequence data are available in the NCBI Sequence Read Archive under BioProject: PRJNA512571 (http://www.ncbi.nlm.nih.gov/bioproject/512571).

Data analysis
Sequencing yielded 185 204 387 reads that were, on average, 76 bp long. The quality of the sequences was first assessed using FASTQC (Andrews 2010), and SNPs were extracted using STACKS 2.41 (Catchen et al. 2013). All scripts are available as Supplementary Material to this paper.
Raw sequence reads were demultiplexed and quality filtered using the process_radtags module in STACKS. As there is no reference genome, SNP calling was performed using the denovo_map.pl pipeline, using default parameter settings. After removing low-quality individuals, SNPs with less than 60% of individuals sequenced were discarded as likely collapsed repeats. We explored the data following McGaughran et al. (2019) in vcftools (Danecek et al. 2011) and filtered to have a maximum of 50% of samples with missing genotypes, yielding 853 SNP loci for 273 samples.
To further investigate connectivity and population structure, we investigated partitioning of variance at the three hierarchically sampled geographic levels (i.e. region, beach, site) by using an analysis of molecular variance (AMOVA, Excoffier et al. 1992), using the R packages poppr (Kamvar et al. 2014(Kamvar et al. , 2015, adegenet (Jombart 2008;Jombart and Ahmed 2011) and vcfR (Knaus and Grünwald 2017). To assess statistical significance, a randomisation test was performed using 1000 permutations.
Population-structure analysis was performed using fastSTRUCTURE ver. 2.3.4 (Pritchard et al. 2000;Falush et al. 2003Falush et al. , 2007Hubisz et al. 2009). We ran fastSTRUCTURE with default settings for between 1 and 12 populations, chose the model complexity to best explain the structure in the data, and produced plots to visualise the admixture analyses.

Results and discussion
Our GBS dataset contained 853 SNPs for 273 A. teres from the eastern coast of Australia. Genetic differentiation among populations in A. teres was small or near negligible across the sampling range. Model complexity of the fastSTRUCTURE analysis suggested six genetic populations, but there was no relationship between 'genetic populations' and geographic distribution (Fig. 1). The PCA (Fig. 2) showed no geographic structuring and each component represented less than 4% of the variation.
The lack of strong genetic differentiation was confirmed by pairwise measures of genetic distance between populations showing absent to negligible differentiation among populations along the eastern coast of Australia, using both Nei's genetic distance and Weir and Cockerham's Fst (Table 1).
The Mantel test showed no significant correlation between the geographic and genetic distance matrix, Nei's D À0.153 (P-value 0.672) and Weir and Cockerham's Fst À0.115 (P-value 0.587), further indicating that there was little genetic differentiation among populations across our sampling range. An AMOVA confirmed limited genetic structure across the sampling range, with most variance being among samples within beaches, and within samples (Table 2).
Using BayeScan ver. 2.1 (Foll and Gaggiotti 2008) to try to partition variance between potentially adaptive and truly neutral loci, we found that 12 SNPs showed an excess of differentiation among the three regions (Fig. 3) and, thus, could be under selection. In the absence of a reference genome, this number of bi-allelic variants is not enough to enable us to meaningfully assess genetic structure from a neutral versus adaptive perspective.
Population genetic structure of sedentary marine species is generally expected to be shaped by the dispersal ability of their larvae (Levin 2006). Long-lived pelagic larvae can connect populations through migration and gene flow, whereas species with non-dispersive larvae might be expected to have genetically differentiated populations (Kesäniemi et al. 2012). However, there are notable exceptions to these generic expectations; for example, congeneric species with similar life-history traits can show dissimilarities in population structure (Temby et al. 2007;Miller et al. 2009Miller et al. , 2014. The reproduction and larval development of A. teres is not well understood, but A. teres have been found with over 100 000 eggs in their body cavity (Paxton 1986), and mature gametes have been found throughout the year, suggesting a long spawning season (Paxton 1979). Exposure to a wide variety of oceanographic and biological processes would occur over this period, potentially influencing larval dispersal distance.
With these polychaetes increasingly being used as bait (Cole et al. 2018), there is a growing need to understand their population structure to support sustainable resource-management practices. Our results suggested that despite movement throughout the sample range of this study, each geographic population has a different, but not diagnostic, genetic composition, and there has been bidirectional genetic mixing along the coast. This inference is consistent with previous research along the same region on the eastern coast of Australia, except for benthic estuarine polychaete species, which were inferred to have high levels of gene flow among populations (Smith et al. 2015). Although they live in different habitats, A. teres and the two estuarine species studied by Smith et al. (2015) are likely to be influenced by the same ocean currents.
Most prior studies have found no or little genetic differentiation among polychaete populations (e.g. Shen and Gu 2015;Zaâbi et al. 2015;David et al. 2016). High levels of gene flow were inferred between populations of two polychaetes, Laenereis culveri and Capitella nonatoi, living in lagoons in Brazil, albeit with there being evidence of somewhat restricted gene flow between particular populations (Seixas et al. 2018). However, in a similar area, Nunes et al. (2017) found strong genetic differentiation between Phragmatopoma caudata polychaetes from Florida and those from Brazil, suggesting a biogeographical barrier between these locations. Overall, most research suggests that gene flow can be high in polychaetes, presumably because of planktonic larval dispersal. However, most previous studies have been limited to just one or a few markers, such as, for example, COI (Chatzigeorgiou et al. 2015;David et al. 2016), 18S rRNA (Shen and Gu 2015;Sun et al. 2018), 28S rRNA (Shen and Gu 2015;Sun et al. 2018) and ITS (Nunes et al. 2017;Sun et al. 2018), whereas our study employed a genomewide approach with hundreds of SNP loci, which enhanced confidence in our inferences.
The eastern coast of Australia appears to have few biogeographical barriers for marine coastal organisms with a high dispersal ability (e.g. Coleman et al. 2013;Smith et al. 2015;Bellgrove et al. 2017). Despite the general southward flow of the East Australian Current (EAC; Coleman et al. 2013), some small pelagic invertebrates have been inferred to have travelled 1000 km north from their release site (Ruello 1975;Montgomery 1990). Northward movement probably results from dispersal with counter-currents and local eddies (Coleman et al. 2011(Coleman et al. , 2013. These oceanographic currents could help explain the patchy distribution of A. teres, as the eddies, together with other population processes, affect the  (Levin 2006;Keane and Neira 2008;Chan et al. 2018). Other polychaete worms respond to chemical cues from adults when settling (Hsieh 1994), which could lead to density-dependent processes, including density blocking, limiting gene flow between connected populations (Waters et al. 2013;Fraser et al. 2018b). As the carrying capacity of a beach or site is approached, the rate of successful settlement of additional larvae might diminish, restricting gene flow. In the case of A. teres, the presence of six genetic groups could indicate that past structure that has since been eroded by natural disturbance events (e.g. storms), changes in environmental conditions affecting long-term rates of successful larval dispersal (e.g. changes in currents and temperature regimes), anthropogenic disturbance (e.g. periods of heavy harvesting reducing population densities and supporting successful recruitment), or combinations of the above, allowing the observed diverse lineages to establish at each site. Understanding the factors affecting genetic diversity, and minimising anthropogenic processes that could reduce diversity, should be a priority for the ongoing sustainable management of these beachworm populations. This study is one of the first population genomics studies of polychaetes and has demonstrated the power of SNP analyses in resolving fine-scale population structure. Genomic tools can assist with testing evolutionary and ecological questions for polychaetes globally.

Compliance with ethical standards
The authors declare no conflict of interest. The beachworms were sampled following jurisdictional requirements for biological sample collection and all necessary approvals were obtained before commencing this research.

Conflicts of interest
The authors declare that they have no conflicts of interest.

Declaration of funding
This work was funded by the New South Wales Recreational Fishing Saltwater Trust, Project DPIS011 (to RCC). C. I. Fraser was supported by a Rutherford Discovery Fellowship from the Royal Society of New Zealand (UOO1803).