Having an Eye for Species Identification: Iris Colouration Is Diagnostic In Highly Variable Leaf Litter (Craugastor: Craugastoridae) Frogs

An important aspect of evaluating biodiversity in a region, starting a monitoring program or informing conservation management decisions is having a good understanding of the taxonomy of local species. However, identification to the species-level can be challenging. A combination of DNA-based and phenotypic character analysis can provide a preliminary species list and help identify diagnostic features for taxonomically difficult groups such as the Neotropical leaf litter frogs ( Craugastor spp.). We used 16S and COI marker sequences to assess the number of phylogenetic Craugastor species present in Cusuco National Park, Honduras. We conducted a linear discriminant analysis to determine if phenotypic characteristics could validate identified monophyletic species. Subsequently, we evaluated the efficacy of the Automatic Barcode Gap Discovery (ABGD) algorithm, a DNA sequence similarity-based tool, for species delineation within Neotropical amphibians. Phylogenetic analyses conducted on sequences derived from 194 specimens produced concordant results between both loci, with reciprocal monophyly of mitochondrial DNA haplotypes for all clades, revealing the presence of four Craugastor species: C. rostralis, C. chac, C. laticeps and C. c.f. charadra . Iris colouration was discovered to be a diagnostic character and the ABGD algorithm accurately identified all four monophyletic species within the phylogenetic and phenotypic analyses. A further three species have been reported from Cusuco National Park including C. milesi, C. laevissimus and C. coffeus . These species are more readily identifiable than the cryptic species we examined, but they have yet to be confirmed using molecular analyses. We demonstrate that the use of molecular tools, in conjunction with the post hoc evaluation of phenotypic variation, can aid with the delineation of cryptic biological diversity and with the discovery of key diagnostic features for accurate species recognition in the field.


Introduction
An accurate assessment of species richness is a critical first step toward informed conservation and management decisions at a regional level.Species identification can be achieved with the use of morphological or molecular approaches, or better still, a combination thereof [1][2][3][4].
Molecular tools are highly valuable in the discovery and identification of species, particularly when traditional taxonomic identification is difficult to apply, such as cases of cryptic diversity [5,6].DNA sequence-based identification commonly employs a comparison of unknown individuals with a reference sequence library, but this has limited use in the study of lesser-known taxa where a reference library may not be established [7].A range of tools have been developed to effectively group the sequences derived from individuals into species clusters without subjective thresholds [8][9][10].The Automatic Barcode Gap Discovery (ABGD) algorithm has been a particularly useful tool because it allows multiple thresholds to be evaluated as opposed to a single standard limit [11].It statistically infers the potential threshold value from a set of input sequences, partitions the data based on that threshold, and then recursively applies the same procedure to the newly obtained groups of sequences.ABGD has previously been successful in species demarcation and cryptic species identification, particularly within taxonomically difficult amphibian groups [12,13].
Despite the benefits of such approaches in the lab, it remains necessary to have a reliable identification toolkit in the field.Species identification in the field must rely on phenotypic characteristics and, in some taxonomic groups, is unreliable especially when performed by multiple inexperienced surveyors.Advice on species identification from a taxonomic expert is often unavailable, particularly within reasonable timeframes.Additionally many diagnostic features often require microscopy or detailed examination and dissection of a specimen which is clearly inappropriate in the field.The incorporation of reliable yet manageable diagnostic phenotypic characteristics into molecularly classified groups is an important final step in the demarcation of cryptic species yet it is often overlooked.
This study tests the efficacy of the ABGD algorithm in conjunction with a phenotypic character analysis for species demarcation within the taxonomically difficult amphibian genus Craugastor.This diverse genus of leaf litter frogs, endemic to the Neotropics, include 113 species [14] that reproduce terrestrially where eggs hatch directly into small froglets bypassing the free-living tadpole stage.The taxonomy of species within this genus has been principally based on morphological characteristics but Craugastor species are known to show extreme phenotypic variability within populations [15], and they often lack distinct diagnostic features that can facilitate species recognition in the field.As taxonomic assignments within Craugastor are frequently accompanied by disagreement among experts, genetic characterization has become an indispensable tool for species determination within this group [16].
Many Craugastor species are at risk due to habitat destruction or modification, emerging pathogens (for example, the chytrid fungus Batrachochytrium dendrobatidis) or climate change [17].Leaf litter frogs inhabit lowland humid rainforest but can also occur at high altitude in fragmented and isolated patches of cloud forest.Cusuco National Park in north-western Honduras (Figure 1) is currently listed as the 25th most irreplaceable site (out of 173,000 protected sites) for amphibian biodiversity on Earth [18].A total of seven Craugastor species have been recorded in Cusuco National Park (Table 1) on the basis of morphology over nine years (2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014)(2015) of annual field surveys (Operation Wallacea Ltd., unpublished data).Some species can be reliably recognized, such as Miles' Robber Frog (Craugastor milesi), but most species, including the common Craugastor rostralis and Craugastor chac, are difficult to identify in the field and observational records are highly unreliable due to a lack of key diagnostic features.
This study aims to i) confirm the morphologically identified Craugastor species present in Cusuco National Park using molecular techniques, ii) find reliable and easy-to-use phenotypic characteristics for accurate recognition of species in the field, and finally iii) test whether sequence similarity-based approaches, such as the ABGD algorithm can accurately identify monophyletic species groups.

Data collection
Cusuco National Park (23,440 ha) consists of montane cloud forest and is part of a larger protected forest region that includes the Merendón Reserve range with an elevation gradient from 30 m to 2,240 m above sea level.Craugastor sampling was carried out by Operation Wallacea Ltd. during seasonal surveys for biodiversity in selected sites across the national park.Specimens were captured during visual encounter surveys of standardized transects at six sites over a three year period during 2011 to 2013 between the months June to August (Figure 1).Tissue samples from all Craugastor encountered on transects were collected resulting in a total of 194 specimens sampled from Cusuco National Park (Table S1).
All sampling was non-lethal and individuals were released within 15 minutes of capture, complying with research ethics codes of conduct (CCAC guidelines).Buccal swabs or toe clips were collected and stored in 95% ethanol.Specimens were measured and photographed when possible resulting in 95 (49%) and 138 (71%) specimens with good quality image data (Table S2) and a complete set of morphological measurements (Table S3) respectively.Photographs included a dorsolateral and ventral shot of the complete animal and a lateral close up of the head centred on the eye.Using dial callipers, eight body measurements were taken to the nearest 0.1 mm following Duellman [19] , Lynch and Duellman [20], specifically: distance from snout to vent, horizontal tympanum diameter, eye/tympanum diameter ratio, head width at its broadest point, head length from posterior end of tympanum to tip of snout, the snout length measured between anterior corner of eye and posterior nostril, hind leg length, foot length from the proximal edge of the heel to the tip of toe IV ( Figure S1).Additionally, the body weight of each specimen was attained using a Pesola spring scale (10,20   were recorded from each photograph: colour of the ventral surface of thighs including yellow, red, orange or uncolored; the presence or absence of a black and white checkered lip pattern; belly colour including uncolored or yellow; dorsum smooth or with warts and with or without a mid-dorsal line; upper half of the iris colour red, gold or silver; underside of throat with a mottled grey pattern or many isolated small grey points; lower half of the iris colour silver or maroon; the facial mask (following 21) having none or scattered black blotches (Z1), dark face mask reaching the tympanum (Z2) and dark face mask reaching beyond the tympanum (Z3); the presence or absence of a seat patch (see Figure S2).The colouration of the iris, belly, and thighs was classified into categories based on comparisons to 160 colour shades representing 11 standard colours ranging from white to black on the web based X11 colour codes chart.

Molecular analysis
DNA from buccal swabs was isolated [22].Swabs were placed in 1.5 mL microcentrifuge tubes and 600 µL of 50 mM NaOH was added.The tube was vortexed for 5 min and then heated at 95 o C for 10 min.Finally, 120 µL of 1M Tris (pH 8.0) was added to the tube, vortexed and the swab removed and discarded.DNA from tissue samples was isolated using Qiagen DNeasy Blood and Tissue Kits following the procedure for animal tissue.Mitochondrial sequence data was collected from fragments of two genes, the cytochrome c oxidase I (COI) and 16S ribosomal RNA (16S).The COI marker was amplified by polymerase chain reaction (PCR) using the primer pairs CO1f (5'-CCT GCA GGA GGA GGA GAY CC-3') and CO1a (5'-AGT ATA AGC GTC TGG GTA GTC-3') and the 16S marker was amplified using 16sar (5'-CGC CTG TTT ATC AAA AAC AT-3') and 16sbr (5'-CCG GTC TGA ACT CAG ATC ACG T-3') [23].The 50 μL PCR contained 10 μL of DNA template, 5.0 μL 10×PCR buffer, 0.2 μm of each primer, 0.2 mM of each dNTP, and 1 unit of Taq DNA polymerase.The PCR products were unidirectionally sequenced using ABI 3130 sequencer.All molecular sequence analysis was done within MEGA 6.0 [24].Sequences were aligned in ClustalW using the default parameters [25].For each gene, phylogenies were inferred separately using maximum likelihood approaches.The HKY+I model [26] and the T92+I model [27] were selected as the best fit to the COI and 16S data respectively using Bayesian Information Criterion [28].Clade support was assessed by non-parametric bootstrapping involving 1000 pseudoreplicates.The COI and 16S trees were rooted using Hyla intermedia (Hylidae, gb:FJ226788.1)and Phyllobates lugubris (Dendrobatidae, gb:FJ784587.1)respectively.The mean molecular distance between clades was accessed by averaging the number of base substitutions per site from all sequence pairs between groups using the K2 model [29] and standard error estimates were obtained by a bootstrap procedure involving 1000 pseudo-replicates.

Morphometric analysis
A linear discriminant analysis was conducted to determine if the morphometric measurements and phenotypic patterns of specimens would support the species identified by the molecular analysis.We used the clade classification from the maximum likelihood analysis as a grouping factor, excluding specimens without molecular data.Independent analyses were performed on categorical phenotypic characters (Table S2) and continuous morphological measurements (Table S2), to evaluate their propensity for species delineation independently.Multivariate normality was assessed visually with a Q-Q plot based on the squared Mahalanobis distance.The morphological measurements were log transformed and the categorical phenotypic variables were entered as dummy variables (0 or 1).All morphometric analyses were performed in R using the package MASS (lda; linear discriminant analysis) [30,31].

Delineation of candidate species
First, we delineated well-supported clades and then attempted to identify them by searching GenBank for all corresponding 16S and COI marker sequences identified as Craugastor.No prior published Craugastor sequences had enough similarity to provide clear identifications using COI, and these results were excluded.Next, we examined whether these clades could be distinguished from each other using the morphometric or phenotypic data.Finally, we evaluated the Automatic Barcode Gap Discovery (ABGD) algorithm's ability to delineate species within the unclassified specimens [11].A genetic distance matrix was built in MEGA6 with model and parameter values similar to the phylogenetic analysis described above and used as input for the ABGD web interface.Prior maximum intraspecific divergences thresholds were set to test 20 values ranging from 0.1% and 15%.

Results
Within this study, 79 (41%) and 52 (27%) Craugastor tissue samples were successfully amplified for the 16S and COI mitochondrial gene markers respectively.The final alignment of 16S sequences contained no gaps and consisted of 436 characters with 155 variable sites.The final COI sequence alignment consisted of 611 characters with 140 variable sites and the translated protein contained 203 amino acids with no internal stop codons.The maximum likelihood phylogenetic reconstruction for 16S and COI identified four and three well supported reciprocal monophyletic clades respectively (Figure 2).Clades A, B and D were distinguished with both genetic markers, however, clade C was present only with 16S.Only clade C aligned with sequences extracted from GenBank (Figure S3) and was identified as C. laticeps.The minimum mean sequence divergence between clades was 13.3 ± 1.6 % and 4.6 ± 1.0 % for COI and 16S respectively (Table 2).
The linear discriminant analysis (LDA) on continuous morphometric measurements with clades as a grouping factor included 45 (23%) specimens and extracted three discriminant axes explaining 58%, 34% and 8% of the variation respectively (Table 3).Snout length and head width were the major discriminating characteristics on the first axis, head length on the second axis and snout vent length on the third axis.The overall classification efficiency of the model, calculated as the average prediction for all samples, was 0.80 ± 0.18.There was substantial overlap in the measured variables and no diagnostic features were isolated (Figure 3A).The LDA on discrete phenotypic characters with clades as a grouping factor included 38 (20%) specimens.Lower iris colour data needed to be excluded from the LDA analysis as they perfectly separated clade D, composed of silver, from clades A, B and C, composed of maroon.The LDA on phenotypic patterns with the remaining characters extracted three linear discriminant axes explaining

IUCN category
Latin name Reference
59%, 29% and 12% of the variation respectively (Table 4).Upper iris colouration, including gold, silver and red, were major discriminating characteristics on all three axes.The overall classification efficiency of the model, calculated as the average prediction for all samples, was 0.84 ± 0.35.The LDA on phenotypic characters resulted in distinct clusters that more clearly separated the four phylogenetic clades (Figure 3B).There were only two individuals that did not cluster appropriately based on upper iris colouration (CR60 and CR115).
Overall, the combination of lower and upper iris colouration was found to be a strong diagnostic character for species identification.Clade A was identified as Craugastor rostralis, characterized by a gold upper half and a maroon lower half of the iris; clade B was identified as Craugastor chac with a red upper half of the iris and a maroon lower half of the iris; clade C was identified as Craugastor laticeps with a silver upper half and a lower maroon half of the iris; clade D was identified as      The percentage of variance in the dataset explained by respective axes is mentioned in brackets Craugastor cf.charadra and was characterized by a gold upper half and a silver lower half of the iris (Figure 4).
The Automatic Barcode Gap Discovery (ABGD) algorithm was applied to the 79 (41%) and 52 (27%) Craugastor specimens that were successfully sequences for 16S and COI mitochondrial gene markers respectively.A maximum intraspecific sequence threshold of 4.15% at 16S and 13.0% at COI implied 4 and 3 clades respectively.The clades or putative species generated by the ABGD algorithm were consistent with the maximum likelihood phylogenetic reconstruction and iris colouration diagnosis (Figure 2 and Table 2) further supporting the presence of four distinct species.

Discussion
The phylogenetic species concept distinguishes a species as the smallest biological unit that is monophyletic and supported by unique apomorphic characters [32].On this basis, we clearly delineated four sympatric species of Craugastor, which were also independently recognized using a diagnostic phenotypic character and a sequence similarity-based approach.The 16S and COI phylogenetic analyses distinguished three well supported monophyletic clades identified as C. rostralis, C. chac, and C. cf.charadra.A fourth species, C. laticeps, was differentiated using the 16S marker, but poorer amplification success left it undetected with COI.The COI primer sites are known to be problematic in amphibians, and resulted in lower success rates than might be expected from other taxa [33].As previously demonstrated, and further supported in this study, 16S is an important complementary marker to COI in amphibian studies [7,12].
All four monophyletic clades were reliably distinguished based on a combination of upper and lower iris colour while alive.Iris colouration in frogs has been demonstrated to have a clear association with ecology [34].A horizontally contrasted iris pattern, differing between the upper and lower iris, and a large dark tympanic patch are a typical eye structure for frogs inhabiting leaf litter [35].It is hypothesized that iris colouration could affect vision, function as a predator deterrent or be important as a mate recognition signal [36].The stability of iris colouration within the four Craugastor species in this study may point towards individual or mate recognition as a plausible hypothesis, particularly considering the otherwise highly variable morphology.Iris colouration is a common characteristic used in amphibian field identification guides due to its restricted variability, especially when used in conjunction with species range data.It was particularly useful in differentiating the two highly variable sympatric species, C. rostralis and C. chac, which are both typically light brown with darker markings on the hind legs.We have identified a diagnostic phenotypic character that will be crucial for the reliable differentiation of species in the field.A further three species have been reported from Cusuco National Park including C. milesi [37], C. laevissimus and C. coffeus (Operation Wallacea Ltd. unpublished data).These species are more readily identifiable than the cryptic species examined in this study, but have yet to be confirmed using molecular analysis.
The Automatic Barcode Gap Discovery (ABGD) algorithm, the sequence similarity-based tool evaluated in this study, successfully identified groups concordant with the four well supported monophyletic clades of the maximum likelihood phylogenetic analysis and matched the diagnostic phenotypic character of iris colouration between clades.The evaluation of various threshold values employed in the ABGD algorithm, obtained from the sequence data themselves, further tested and provided support for previously identified sequence threshold values for amphibian species in other studies [38,39].The results from this study demonstrate that the ABGD algorithm is highly applicable in the diagnosis of phylogenetic species and can be an important tool in the identification of cryptic species within amphibian groups.Furthermore, we demonstrate how mitochondrial genetic data, in conjunction with phenotypic character analyses can provide a strong framework for reliable field identification as well as the evaluation of DNA sequence similarity-based tools currently available for the assessment of species boundaries within amphibian datasets.

Figure 1 :
Figure 1: Map of Honduras with white star indicating Cusuco National Park (top-right) and map of Cusuco National Park displaying six survey sites (•) with grey circles proportional to number of specimens collected at each site (bottom-left).Collection sites included Buenos Aires (BA), Base Camp (BC), Cantiles (CA), Cortecito (CO), Guanales (GU), and Santo Tomas (ST).

Figure 1 :
Figure 1: Map of Honduras with white star indicating Cusuco National Park (top-right) and map of Cusuco National Park displaying six survey sites (•) with grey circles proportional to number of specimens collected at each site (bottom-left).Collection sites included Buenos Aires (BA), Base Camp (BC), Cantiles (CA), Cortecito (CO), Guanales (GU), and Santo Tomas (ST).

Figure 2 :
Figure 2: Maximum likelihood phylogeny of Craugastor in Cusuco National Park inferred from 16S (left) and COI (right) gene fragments representing 79 and 52 specimens respectively.The four clades A, B, C and D are highlighted in green, gold, pink and blue respectively.COI and 16S trees rooted (◊) using Hyla intermedia (Hylidae, gb:FJ226788.1)and Phyllobates lugubris (Dendrobatidae, gb:FJ784587.1)respectively.Tree is drawn to scale with branch lengths in units of number of base substitutions per site and with 1000 bootstrap replicate values indicated.

Figure 2 :
Figure 2: Maximum likelihood phylogeny of Craugastor in Cusuco National Park inferred from 16S (left) and COI (right) gene fragments representing 79 and 52 specimens respectively.The four clades A, B, C and D are highlighted in green, gold, pink and blue respectively.COI and 16S trees rooted (◊) using Hyla intermedia (Hylidae, gb:FJ226788.1)and Phyllobates lugubris (Dendrobatidae, gb:FJ784587.1)respectively.Tree is drawn to scale with branch lengths in units of number of base substitutions per site and with 1000 bootstrap replicate values indicated.

Figure 3 :
Figure 3: Biplot of first two axes of the linear discriminant analysis on morphological (A) and phenotypic (B) data, including 45 and 38 specimens respectively, with molecularly identified species as categories.Coloured circles group the clades: clade A highlighted in green, clade B highlighted in gold, clade C highlighted in pink and clade D highlighted in blue.Two outlier specimens indicated by the asterisks.

Figure 3 :
Figure 3: Biplot of first two axes of the linear discriminant analysis on morphological (A) and phenotypic (B) data, including 45 and 38 specimens respectively, with molecularly identified species as categories.Coloured circles group the clades: clade A highlighted in green, clade B highlighted in gold, clade C highlighted in pink and clade D highlighted in blue.Two outlier specimens indicated by the asterisks.

Figure 4 :
Figure 4: Images indicating diagnostic iris colour for molecular clades A, B, C and D representing Craugastor rostralis, Craugastor chac, Craugastor laticeps and Craugastor cf.charadra respectively.C. rostralis characterized by a gold upper half and a maroon lower half of the iris; C. chac characterized by a red upper half and a maroon lower half of the iris; C. laticeps characterized by a silver upper half and a lower maroon half of the iris; C. cf.charadra characterized by a gold upper half and a silver lower half of the iris.

Figure 4 :
Figure 4: Images indicating diagnostic iris colour for molecular clades A, B, C and D representing Craugastor rostralis, Craugastor chac, Craugastor laticeps and Craugastor cf.charadra respectively.C. rostralis characterized by a gold upper half and a maroon lower half of the iris; C. chac characterized by a red upper half and a maroon lower half of the iris; C. laticeps characterized by a silver upper half and a lower maroon half of the iris; C. cf.charadra characterized by a gold upper half and a silver lower half of the iris.

Table 2 :
Mean sequence divergence ± standard error between clades A, B, C and D for 16S (top) and COI (bottom) gene markers.

Table 3 :
Loadings of three linear discriminant analysis axes for 9 morphological measurements of 45 Craugastor individuals.