RESEARCH ARTICLE
Probing planetary biodiversity with DNA
barcodes: The Noctuoidea of North America
Reza Zahiri1,2*, J. Donald Lafontaine3, B. Christian Schmidt3, Jeremy R. deWaard2, Evgeny V. Zakharov2, Paul D. N. Hebert2
1 Canadian Food Inspection Agency, Ottawa Plant Laboratory, Entomology Unit, Ottawa, Ontario, Canada, 2 Centre for Biodiversity Genomics, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada, 3 Agriculture and Agri-Food Canada, Biodiversity Program, Canadian National Collection of Insects, Arachnids, and Nematodes, Ottawa, Ontario, Canada
*[email protected]
Abstract
This study reports the assembly of a DNA barcode reference library for species in the lepidopteran superfamily Noctuoidea from Canada and the USA. Based on the analysis of 69,378 specimens, the library provides coverage for 97.3% of the noctuoid fauna (3565 of 3664 species). In addition to verifying the strong performance of DNA barcodes in the discrimination of these species, the results indicate close congruence between the number of species analyzed (3565) and the number of sequence clusters (3816) recognized by the Barcode Index Number (BIN) system. Distributional patterns across 12 North American ecoregions are examined for the 3251 species that have GPS data while BIN analysis is used to quantify overlap between the noctuoid faunas of North America and other zoogeographic regions. This analysis reveals that 90% of North American noctuoids are endemic and that just 7.5% and 1.8% of BINs are shared with the Neotropics and with the Palearctic, respectively. One third (29) of the latter species are recent introductions and, as expected, they possess low intraspecific divergences.
OPEN ACCESS
Citation: Zahiri R, Lafontaine JD, Schmidt BC, deWaard JR, Zakharov EV, Hebert PDN (2017) Probing planetary biodiversity with DNA barcodes: The Noctuoidea of North America. PLoS ONE 12(6): e0178548. https://doi.org/10.1371/journal.pone.0178548
Editor: Igor B. Rogozin, National Center for Biotechnology Information, UNITED STATES
Received: December 20, 2016
Accepted: May 15, 2017
Published: June 1, 2017
Copyright: © 2017 Zahiri et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability Statement: Details on all barcoded specimens (e.g., voucher codes, higher taxonomy, repository institutions, voucher images, sequence length, collection dates, and collection data) are provided in S1 Dataset. Residual DNA extracts are stored in the DNA Archive at the Centre for Biodiversity Genomics. GenBank accession numbers for all new sequences are also available in S2 Dataset. Specimen data including images, details on the voucher repositories, GPS coordinates for collection sites, sequence records, trace files, and GenBank accession numbers are available in the Barcode of Life Data Systems (BOLD, www.boldsystems.org) in eight public datasets: DS-NAMNOC1 (dx.doi.org/10.5883/DS-NAMNOC1), DS-NAMNOC2 (dx.doi.org/10.5883/DS-NAMNOC2), DS-NAMNOC3 (dx.doi.org/10.5883/DS-NAMNOC3), DS-NAMNOC4 (dx.doi.org/10.5883/DS-NAMNOC4), DS-NAMNOC5 (dx.doi.org/10.5883/DS-NAMNOC5), DS-NAMNOC6 (dx.doi.org/10.5883/DS-NAMNOC6), DS-NAMNOC7 (dx.doi.org/10.5883/DS-NAMNOC7) and DS-NAMNOC8 (dx.doi.org/10.5883/DS-NAMNOC8).
Funding: This research was supported by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery grant to PDNH and by funding from the government of Canada through Genome Canada and Ontario Genomics in support of the International Barcode of Life project. This is a contribution from the Food From Thought research program supported by the Canada First Research Excellence Fund.
Competing interests: The authors have declared that no competing interests exist.
PLOS ONE | https://doi.org/10.1371/journal.pone.0178548 June 1, 2017
DNA barcode library for North American Noctuoidea
Introduction
Occupying 14% of the planet's land surface, Canada and the continental United States (Fig 1) span environments from the high arctic to the subtropics [1,2]. Past estimates suggest these nations host about 144,000 insect species although approximately a third are still undescribed [2]. With 11,500 described species and perhaps another 2700 species-in-waiting [3], the order Lepidoptera (moths and butterflies) is a substantial component of the fauna. With 3664 named species in 742 genera, the Noctuoidea is the largest superfamily [4,5], comprising 32% of all Nearctic Lepidoptera [6–9].
Since its inception in 2003 [11], DNA barcoding has gained diverse applications in biodiversity science: detecting new species and accelerating their description [12–14]; revealing cryptic species [15,16]; linking immature stages with adults [17]; clarifying sexual dimorphisms [18]; and establishing trophic associations [19]. In animals, it employs analysis of the DNA sequence of a standard fragment of the mitochondrial cytochrome c oxidase subunit I gene (COI) as a basis for specimen identification and species discovery [11]. This approach owes its effectiveness to the fact that this gene region is generally characterized by low intraspecific variation and much higher divergence between species. As a consequence, by assembling sequence data for known species (i.e., a DNA barcode reference library), newly encountered specimens can be assigned to a species by comparing their COI barcodes to those in the library. This approach has now gained global acceptance, motivating the assembly of DNA barcode reference libraries for varied groups [20–29], information that is curated and publicly available on BOLD, the Barcode of Life Data Systems [30]. Although DNA barcoding is known to deliver high species resolution in Lepidoptera [21–25], most prior studies have examined relatively small geographic areas or only a fraction of the species in a target assemblage [29].
The present study examines the impact on barcode resolution of increasing both taxon coverage and geographic scale. It examines the performance of a reference library that includes records for 97.3% of the noctuoid species known from Canada and the USA. Aside from comprehensive taxonomic coverage, the present library provides a good sense of geographic variation in many of these taxa as it is based on the analysis of nearly 70,000 specimens. Because of the comprehensive taxon coverage and large sample sizes, the present data also provide a good opportunity to test the performance of the Barcode Index Number (BIN) System [31], an interim taxonomic system that aggregates specimens and their COI sequences into persistent sequence clusters called BINs. By testing the concordance between BIN membership and species boundaries in noctuoids at a continental scale, the constraints of the BIN system for species delineation can be evaluated, aiding its application in lesser known groups. We also examine the frequency of species with deep intraspecific COI divergences with a view towards determining if such cases are often linked to physiographic barriers. Finally, we examine shifts in the composition of noctuoids among the terrestrial ecoregions of North America and use the BIN system to ascertain the level of endemism in the North American noctuoid fauna by examining its overlap with other zoogeographical regions.
Materials and methods
Taxonomic coverage
This study recovered DNA barcode records from 69,378 specimens; 43.1% (29,885) derived from Canada, 46.2% (32,032) from the United States, and 10.7% (7,461) from Mexico and the Neotropical and Palearctic regions (Table 1). The Canadian National Collection of Insects, Arachnids, and Nematodes (CNC) contributed ~18,000 museum specimens (representing 3168 species), while the Biodiversity Institute of Ontario (BIO) provided ~36,500 freshly collected specimens. The remainder (~14,500) derived from both institutional (e.g., National Museum of Natural History, Smithsonian Institution; Canadian Forest Service, Pacific Forestry Centre; University of Pennsylvania; Royal British Columbia Museum; Texas Lepidoptera Survey Research Collection, Houston) and private collections (e.g., D Handfield; JB Sullivan; H Kons, Jr.; R Borth; J Troubridge; T Mustelin; EH Metzler; LG Crabo). All specimens were examined, identified and validated by JDL and BCS; genitalia dissections were made when necessary. Taxonomy (S1 Table) follows the most recent checklist of the Noctuoidea of North America north of Mexico published in 2010 [7] and its three updates [6,8,9].
Whenever possible, specimens of each species were analyzed from across its range in North America (S1 Dataset). However, coverage for some species could only be obtained by analyzing specimens from outside North America (S2 Table). These 'extra-territorials' involved 57 species and were split into three categories: 1) species barcoded from neighbouring countries (e.g., Mexico, Cuba) that likely possess barcodes matching specimens from Canada/USA (S2 Table); 2) species barcoded from a more distant location (e.g., Costa Rica, Panama, or South America) where the barcodes may not match those from Canada/USA (S2 Table); 3) non-indigenous species from Eurasia that are either rare migrants to North America or introduced/invasive species whose populations failed to persist (S3 Table). Species in categories 1 and 2 are rare migrants to North America from the Neotropics, most represented by just a single or few specimens collected from the southern United States that are too old for barcode analysis. For species collected from Texas and Arizona, specimens from Mexico were selected as the best representatives with Guatemala as the second choice. For species collected in Florida, specimens from Cuba were selected when possible with the Dominican Republic and Puerto Rico as secondary options. The likely validity of these extra-territorial records as surrogates for barcode data from specimens collected in Canada/USA was assessed by comparing barcode records from 202 species with data from Canada/United States as well as from nations farther south (S1 Table). As this comparison did not reveal any case of deep intraspecific sequence divergence between specimens from Canada/USA and the other nations, it supports the conclusion that 'extra-territorials' will generally provide records valid for inclusion in the North American reference library.
Fig 1. Ecoregions of North America. Map of North America showing the boundaries of 15 ecoregions [10], which are numbered as follows: 1 Arctic Cordillera; 2 Tundra; 3 Taiga; 4 Hudson Plains; 5 Northern forests; 6 Northwestern forested mountains; 7 Marine west coast forests; 8 Eastern temperate forests; 9 Great plains; 10 North American deserts; 11 Mediterranean California; 12 Southern semi-arid highlands; 13 Temperate Sierras; 14 Tropical dry forests; 15 Tropical wet forests. In subsequent analyses, the Arctic Cordillera and Tundra are combined into one region (Arctic) and the Taiga, Hudson Plains, and Northern forests are merged to create a Boreal ecoregion. Numbers in each pie chart indicate the number of DNA barcodes from each ecoregion followed by the number of BINs (above) and mean/maximum Nearest-Neighbor distance (below). Modified from [10].
https://doi.org/10.1371/journal.pone.0178548.g001
Table 1. Noctuoidea of North America summary table.
Columns: Family; North American species; Origin of specimens (Nearctic/non-Nearctic); Barcode coverage (species #/percentage); Species with no barcodes; # DNA sequences/BINs; Mean NND/intraspecific divergence; % ID success; Species sharing barcodes.
Family        Species  Origin   Coverage    No barcodes  Seqs/BINs   NND/intra  %ID    Sharing
Doidae        2        2/0      2/100%      0            3/2         8.93/0.00  100.0  0
Notodontidae  134      123/1    124/93%     10           2932/147    3.90/0.59  96.8   4
Euteliidae    17       15/2     17/100%     0            377/19      3.10/0.18  100.0  0
Nolidae       39       38/0     38/97%      1            1140/57     4.24/0.59  94.9   2
Noctuidae     2520     2437/16  2453/97%    67           42478/2484  2.47/0.42  92.9   173
Erebidae      952      884/47   931/98%     21           22448/1107  2.90/0.53  91.8   76
Total         3664     3499/66  3565/97.3%  99           69378/3816  2.67/0.45  92.8   255
https://doi.org/10.1371/journal.pone.0178548.t001
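The coverage figures in Table 1 are simple ratios of barcoded to known species, and they can be recomputed as a consistency check. The sketch below transcribes the two species columns of Table 1; the function and variable names are illustrative and not part of the study.

```python
# Species counts transcribed from Table 1: (known species, species with barcodes).
TABLE1 = {
    "Doidae": (2, 2),
    "Notodontidae": (134, 124),
    "Euteliidae": (17, 17),
    "Nolidae": (39, 38),
    "Noctuidae": (2520, 2453),
    "Erebidae": (952, 931),
}


def coverage_percent(known, barcoded):
    """Percentage of known species that have at least one barcode record."""
    return 100.0 * barcoded / known


def summarize(table):
    """Return (total known, total barcoded, overall coverage in percent)."""
    known = sum(k for k, _ in table.values())
    barcoded = sum(b for _, b in table.values())
    return known, barcoded, coverage_percent(known, barcoded)


if __name__ == "__main__":
    for family, (known, barcoded) in TABLE1.items():
        print(f"{family}: {coverage_percent(known, barcoded):.0f}%")
    print(summarize(TABLE1))
```

The family rows sum to the totals reported in the table (3664 known, 3565 barcoded).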
Sampling strategy across the ecological regions of North America
North America is often partitioned into 15 terrestrial ecoregions (Fig 1) [2]: Arctic Cordillera, Tundra, Taiga, Hudson Plains, Northern Forests, Northwestern Forested Mountains, Marine West Coast Forests, Eastern Temperate Forests, Great Plains, North American Deserts, Mediterranean California, Southern Semi-Arid Highlands, Temperate Sierras, Tropical Dry Forests and Tropical Wet Forests. To better reflect insect phylogeography, our analysis collapsed several of these ecoregions. We merged the Arctic Cordillera and Tundra into an Arctic ecoregion, and merged the Taiga, Hudson Plains and Northern Forests to create a Boreal ecoregion, leaving 12 ecoregions for analysis. The analysis of ecoregions employed a dataset with ~53,180 records representing 3252 named species (including species with interim names) with accurate geographic coordinates (S4 Table). To extract points from each of the 12 ecoregions, we employed ArcGIS 10.2.2 [32] to generate a presence/absence data matrix for all North American noctuoid species with GPS information (S5 Table). To perform BIN analysis, we selected those sequences from inside and outside of North America associated with both BIN data and collection data (country name) to generate a dataset with 68,985 records. This dataset included 3804 BINs whose occurrence was then assessed in six zoogeographical regions (S6 Table).
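The merging of ecoregions and the presence/absence matrix described above can be sketched in a few lines. This is a minimal illustration only: it assumes each record already carries an ecoregion label (the study assigned points to ecoregions with ArcGIS), and the function names and the sample records in the usage note are hypothetical.

```python
# The 15 level I ecoregions named in the text.
ECOREGIONS = [
    "Arctic Cordillera", "Tundra", "Taiga", "Hudson Plains", "Northern Forests",
    "Northwestern Forested Mountains", "Marine West Coast Forests",
    "Eastern Temperate Forests", "Great Plains", "North American Deserts",
    "Mediterranean California", "Southern Semi-Arid Highlands",
    "Temperate Sierras", "Tropical Dry Forests", "Tropical Wet Forests",
]

# Ecoregions collapsed for the analysis; all others are kept unchanged.
MERGED = {
    "Arctic Cordillera": "Arctic",
    "Tundra": "Arctic",
    "Taiga": "Boreal",
    "Hudson Plains": "Boreal",
    "Northern Forests": "Boreal",
}


def merged_ecoregion(name):
    """Map a level I ecoregion to the unit used in the analysis."""
    return MERGED.get(name, name)


def presence_absence(records):
    """Build a presence/absence matrix from (species, ecoregion) pairs.

    Returns (regions, matrix) where matrix[species] is a list of 0/1 values
    in the same order as regions.
    """
    pairs = {(sp, merged_ecoregion(eco)) for sp, eco in records}
    regions = sorted({eco for _, eco in pairs})
    species = sorted({sp for sp, _ in pairs})
    matrix = {
        sp: [1 if (sp, eco) in pairs else 0 for eco in regions]
        for sp in species
    }
    return regions, matrix
```

For example, `presence_absence([("sp. A", "Tundra"), ("sp. A", "Taiga"), ("sp. B", "Great Plains")])` places the hypothetical "sp. A" in the Arctic and Boreal columns and "sp. B" in Great Plains only.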
Data acquisition and analysis
DNA extraction, PCR amplification, and sequencing of the COI barcode region were performed at the Canadian Centre for DNA Barcoding (CCDB) and followed standard protocols [33–37]. PCR and sequencing generally used a single pair of primers, LepF1 (ATTCAACCAATCATAAAGATATTGG) and LepR1 (TAAACTTCTGGATGTCCAAAAAATCA), which recovers a 658 bp region near the 5′ end of COI including the 648 bp barcode region for the animal kingdom [11]. For museum specimens older than ten years, primer pairs designed to amplify smaller overlapping fragments (307 bp, 407 bp) were employed [37].
Details on all barcoded specimens (e.g., voucher codes, higher taxonomy, repository institutions, voucher images, sequence length, collection dates, and collection data) are provided in S1 Dataset. Residual DNA extracts are stored in the DNA Archive at the Centre for Biodiversity Genomics. GenBank accession numbers for all new sequences are also available in S2 Dataset. Specimen data including images, details on the voucher repositories, GPS coordinates for collection sites, sequence records, trace files, and GenBank accession numbers are available in the Barcode of Life Data Systems (BOLD, www.boldsystems.org) in eight public datasets: DS-NAMNOC1 (dx.doi.org/10.5883/DS-NAMNOC1), DS-NAMNOC2 (dx.doi.org/10.5883/DS-NAMNOC2), DS-NAMNOC3 (dx.doi.org/10.5883/DS-NAMNOC3), DS-NAMNOC4 (dx.doi.org/10.5883/DS-NAMNOC4), DS-NAMNOC5 (dx.doi.org/10.5883/DS-NAMNOC5), DS-NAMNOC6 (dx.doi.org/10.5883/DS-NAMNOC6), DS-NAMNOC7 (dx.doi.org/10.5883/DS-NAMNOC7) and DS-NAMNOC8 (dx.doi.org/10.5883/DS-NAMNOC8). The number of barcode sequences per species varies from 1 to 614 (average = 19.46) (S1 Table). Only sequence records greater than 500 bp (range 500 bp–658 bp) and those that meet length and quality requirements for the BARCODE data standard [38] are included, excepting a few short but diagnostic sequences: Ectypia mexicana (307 bp); Hypotrix ocularis (447 bp); Cydosia nobilitella (307 bp and 370 bp); Cryphia flavipuncta (307 bp); Sympistis ra (316 bp and 379 bp); Sympistis knudsoni (407 bp); Grotella margueritaria (407 bp); Sympistis fortis (407 bp); Grotella olivacea (486 bp). Of the 3671 species known from North America, 102 very rare species lack barcode coverage (S7 Table). They include 23 Erebidae, 67 Noctuidae, 1 Nolidae, and 11 Notodontidae. Fifteen of the species lacking a barcode record are only known from their holotype.
Tests of barcode performance were first made at a continental level based on the North American checklist [6–9] and subsequently for each of the 12 ecoregions. Patterns of intra- and interspecific nucleotide sequence variation were examined at various taxonomic levels using the Kimura-2-Parameter (K2P) distance model and the Neighbor-Joining (NJ) algorithm calculated using the analytical tools on BOLD at a continental scale and for each ecoregion. To accommodate unequal variances and sample sizes in the Nearest-Neighbor (NN) distances and intraspecific data, an unequal variance t-test with random sampling of cases was employed. Finally, a nonparametric correlation test (Spearman) implemented in SPSS v18 (IBM) was used to assess the relationship between the number of species in a genus and the incidence of barcode sharing.
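For readers unfamiliar with the K2P model, the distance between two aligned sequences depends on the proportion of transitions (P) and transversions (Q): d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)]. The sketch below is a minimal stand-alone implementation for illustration; the study itself used the analytical tools on BOLD, and the handling of gaps and ambiguous bases here (pairwise deletion) is an assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguity code are skipped
    (pairwise deletion, an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument of the logarithm not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A single transition across ten compared sites gives a distance of about 0.112, slightly above the uncorrected proportion of 0.1 because the model corrects for multiple substitutions.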
Results
Barcode performance
DNA barcodes were obtained for 3565 of the 3664 valid noctuoid species known from North America. No indels, frameshift mutations or stop codons were detected among the 69,378 sequences recovered from these taxa, suggesting that they derive from COI rather than a pseudogene. Considered from a continent-wide perspective, 93% of all noctuoid species possess a diagnostic array of barcode sequences (including those species with deep intraspecific divergence) (S1–S8 Trees; Table 1). Barcode performance was slightly higher (96%) when analysis considered the species assemblage within each of the 12 ecoregions (Table 2). The cases of compromised resolution reflected the fact that 255 species (7%) shared their barcode with at least one other species when considered at a continental scale (S8 Table), while the mean incidence of sharing dropped to 4% for the species assemblage in each of the 12 ecoregions (Table 2).
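Screening a protein-coding barcode for stop codons is one of the checks mentioned above for ruling out pseudogenes. The sketch below illustrates the idea under the invertebrate mitochondrial genetic code (NCBI translation table 5), in which only TAA and TAG terminate translation. The reading-frame offset of a given barcode record is not stated in the text, so it is left as a parameter; the function names are illustrative.

```python
# Stop codons under the invertebrate mitochondrial code (NCBI table 5).
# In this code TGA encodes tryptophan and AGA/AGG encode serine.
STOP_CODONS = {"TAA", "TAG"}


def codons(seq, frame=0):
    """Yield complete codons of seq starting at offset frame (0, 1 or 2)."""
    seq = seq.upper()
    for i in range(frame, len(seq) - 2, 3):
        yield seq[i:i + 3]


def has_stop_codon(seq, frame=0):
    """True if any codon in the given frame is a stop codon."""
    return any(codon in STOP_CODONS for codon in codons(seq, frame))


def open_frames(seq):
    """Return the forward frames (0, 1, 2) that contain no stop codon."""
    return [frame for frame in range(3) if not has_stop_codon(seq, frame)]
```

A genuine COI barcode is expected to have one forward frame free of stop codons along its full length; a record with no open frame would be a candidate pseudogene or sequencing error. This sketch does not detect frameshifts caused by indels, which require an alignment against a reference.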
Mean NN distances showed limited variation among families, ranging from a low of 2.47% in the Noctuidae to a high of 3.90% in the Notodontidae after excluding the high NN distance (~9%) for the Doidae because it was only represented by two species (Table 1). There was, however, significant variation in barcode performance among families (X2 = 38.3, p < 0.0001). All species of Doidae (2 species) and Euteliidae (17 species) were unambiguously discriminated by barcode sequences, but barcode sharing in the other families ranged from 3.2% (4/125 species) in the Notodontidae to 5.1% (2/39 species) in the Nolidae, 7.1% (173/2452 species) in the Noctuidae, and 8.2% (76/931 species) in the Erebidae (Table 1 and S8 Table). Barcode sharing was most frequent in genera with many species (Spearman's rho = 0.432; p << 0.0001); it involved 10.6% (182/1717 species) of the species in the most diverse genera (42 genera with 16–182 species) versus 4.5% (73/1608 species) of the species in genera with fewer taxa (360 genera with 2–15 species). As expected, none of the 346 species in monotypic genera shared their barcodes with any other taxon (Fig 2).
Table 2. Mean sequence divergence and mean intraspecific variation at COI among noctuoid species from each of 12 North American ecoregions.
Columns: Ecoregion of North America (level I); DNA sequences #/BIN #; Species with high sequence divergence (>2%)/Species with low distance to another species (<2%); Species #/Genus #; Mean/Max NND; Mean/Max intraspecific divergence; % ID success; Species sharing barcodes.
Ecoregion                        Seqs/BINs   High/Low  Species/Genera  Mean/Max NND  Mean/Max intra  %ID  Sharing
Arctic Cordillera + Tundra       39/24       18/6      24/14           4.63/11.25    0.39/2.34       100  0
Boreal                           5499/608    491/104   637/235         3.98/12.49    0.31/4.12       96   26
Northwestern forested mountains  7443/747    93/238    788/226         3.40/12.88    0.29/4.53       96   31
Marine west coast forest         4065/340    296/24    341/161         4.50/12.95    0.19/2.39       98   7
Eastern temperate forests        19838/1247  800/106   1274/391        3.8/12.17     0.33/12.64      93   92
Great plains                     5576/968    744/111   991/365         4.43/11.85    0.29/3.62       98   23
North American deserts           5689/957    731/217   986/304         3.70/11.85    0.39/9.79       97   31
Mediterranean California         1851/317    223/27    318/143         4.82/11.87    0.38/3.81       99   2
Southern semi-arid highlands     2040/617    444/10    605/255         5.3/11.08     0.33/8.39       99   4
Temperate Sierras                619/207     166/1     209/123         5.88/10.00    0.26/4.92       98   5
Tropical dry forests             6/5         5/0       5/5             9.64/11.48    0.31/0.31       100  0
Tropical wet forests             518/142     134/6     141/98          7.34/11.93    0.23/2.35       100  0
Total                            53183/6179            2935/658        4.90/11.81    0.38/8.4        96   119
https://doi.org/10.1371/journal.pone.0178548.t002
Fig 2. The relationship between the number of species in a genus (plotted on a log2 scale categories) and the percentage of barcode sharing. Values at the top of the bars indicate the number of genera in each log2 category. Categories are: 1) genera with 1 species, 2) genera with 2–3 species, 3) genera with 4–7 species, 4) genera with 8–15 species, 5) genera with 16–31 species, 6) genera with 32–63 species, and 7) genera with 64 or more species.
https://doi.org/10.1371/journal.pone.0178548.g002
Cases of barcode sharing
In total, 255 of 3565 species of noctuoids (~7.1%) shared their barcode with at least one other species (S8 Table provides a list). These cases of barcode sharing involved just 55 of the 747 noctuoid genera (7.4%). The 76 cases of barcode sharing among the 931 species of Erebidae involved 14 of its 268 genera (S8 Table). Four genera (Catocala–26, Grammia–18, Cisthene–5, Haploa–5) were responsible for 71% (54 of 76 species) of these cases; the other 22 involved 10 genera with two or three species sharing the same barcode. The most striking cases of barcode sharing in this family involved Grammia (50%, 18 of 36 species) and Catocala (25%, 26 of 103 species). The 173 cases of barcode sharing among the 2452 species of Noctuidae involved 38 of its 420 genera (S8 Table). Among them, ten genera (Euxoa–24, Xestia–13, Schinia–13, Abagrotis–12, Acronicta–11, Lasionycta–10, Copablepharon–9, Sympistis–8, Lithophane–7, Bellura–5) represented 64% (110 out of 173 species) of these cases; the other 63 cases involved 28 genera with two to four species sharing the same barcode. Nine noctuid genera showed a particularly high incidence of barcode sharing with 83% of species in Bellura (5 out of 6), 39% of Copablepharon (9 of 23), 31% of Abagrotis (12 of 39), 26% of Xestia (13 of 50), 23% of Lasionycta (10 of 43), 16% of Acronicta, 14% of Lithophane, 13% of Euxoa, and 10% of Schinia.
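The log2 genus-size categories used in Fig 2 (1 species, 2–3, 4–7, 8–15, 16–31, 32–63, 64 or more) correspond to the bit length of the species count, capped at 7. The sketch below shows that binning together with a per-category sharing percentage; it is an illustration with hypothetical function names, not the procedure used to draw the figure.

```python
def genus_size_category(n_species):
    """Return the Fig 2 category (1-7) for a genus with n_species species.

    Category k covers genera with 2**(k-1) to 2**k - 1 species; genera with
    64 or more species all fall in category 7.
    """
    if n_species < 1:
        raise ValueError("a genus must contain at least one species")
    return min(n_species.bit_length(), 7)


def sharing_by_category(genera):
    """Percentage of species sharing barcodes in each size category.

    genera is an iterable of (n_species, n_species_sharing_barcodes) pairs,
    one per genus.
    """
    totals = {}
    for n_species, n_sharing in genera:
        cat = genus_size_category(n_species)
        counts = totals.setdefault(cat, [0, 0])
        counts[0] += n_species
        counts[1] += n_sharing
    return {cat: 100.0 * s / n for cat, (n, s) in sorted(totals.items())}
```

Applied to three genera with counts given in the text, Bellura (5 of 6), Grammia (18 of 36) and Catocala (26 of 103), the function returns roughly 83%, 50% and 25% for categories 3, 6 and 7.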
Cases of low barcode divergence
Cases of low sequence divergence were defined as those involving two or more species with less than 1% sequence divergence, but with no evidence of sequence sharing. In total, 14.7% (525 species) of North American noctuoid species showed from 0.15% to 0.99% divergence from their NN (S9 Table). These species belong to 109 genera in four families (S9 Table). Twenty-three genera (Abagrotis, Acronicta, Anarta, Annaphila, Apamea, Catocala, Copablepharon, Dasychira, Datana, Euxoa, Feltia, Grammia, Hadena, Lacinipolia, Lasionycta, Lithophane, Papaipema, Schinia, Sympistis, Virbia, Zale, Zanclognatha, Xestia) included six or more species with low divergence, but with no evidence of shared sequences (except where noted above). They included seven noctuid genera with many cases of low divergence (Euxoa–56, Sympistis–38, Lasionycta–22, Lithophane–21, Papaipema–19, Schinia–15, Acronicta–11) and two erebid genera (Catocala–21, Grammia–13).
Cases of deep intraspecific sequence divergence
Deep barcode divergence (>2%) was detected in 135 (3.8%) species, and another 22 species showed sufficient divergence (0.7%–1.99%) for their component specimens to be assigned to two or three BINs (S10 Table). These 157 cases (4.4% of the fauna) involved 12.5% of the noctuoid genera (93 of 747 genera). Most cases involved a species that was partitioned into either two (102 species in 70 genera) or three (31 species in 25 genera) BINs. However, 24 species in 17 genera were placed in four or more BINs (S10 Table). Nine genera included several cases of deep splits including Abagrotis (four species in 11 BINs), Euxoa (ten species in 25 BINs), Grammia (six species in 18 BINs), Idia (three species in 13 BINs), Lacinipolia (eight species in 30 BINs), Sympistis (eight species in 17 BINs), Virbia (two species in 22 BINs), and Xestia (eight species in 20 BINs). One species showed exceptional diversity; specimens of Virbia ferruginosa were assigned to 18 BINs.
Factors influencing nearest-neighbor distances
a) Ecoregions. When the North American fauna was partitioned into the species assemblages from each of 12 ecoregions (Fig 1), barcode resolution improved from 92.8% continent-wide to 96.0% (119 of 2935 valid species showed sequence sharing or identical haplotypes) (Table 2). When considered at a continental scale, 255 species (7.1%) shared their barcode with one or more species, but barcode sharing dropped to just 119 cases (4.0%) when ecoregions of North America were considered individually. NN distances showed a gradual increase from the north (e.g., Tundra = 4.63%) to the south (e.g., Tropical Wet Forests = 7.34%) (Fig 1, Table 2). While a noticeable decline in NN distance was observed with increasing latitude (Table 3), there was no similar trend with longitude although the average value for the Rocky Mountains ecoregion was slightly lower (~0.3%) than for the other regions (Table 4). There were also shifts in the relative diversity of the two major noctuoid families: the Noctuidae dominate the northern half of the continent whereas the Erebidae dominate in the south; this shift in faunal composition likely contributes to the NN pattern with latitude.
b) Phylogeny and non-indigenous species. It is thought that 35 noctuoid species now found in North America are either migrants or alien species introduced from Eurasia by human-mediated transport (S3 Table). These species have a significantly (p < 0.0001) higher mean NN distance (5.90%) than native taxa (3.60%), reflecting the fact that many have left their sister taxa behind (S3 Table). As might also be expected, these species have a significantly lower mean intraspecific divergence (0.20%) than native species (0.44%) (S3 Table), reflecting the loss of diversity due to population bottlenecking during their establishment (two-sample t-test, t = 7.1, p < 0.0001).
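The comparison of introduced and native species relies on a two-sample t-test that does not assume equal variances. As an illustration of how such a statistic is formed, the sketch below computes Welch's t and the Welch–Satterthwaite degrees of freedom using only the Python standard library. It is not the authors' analysis (which also involved random sampling of cases); it omits the p-value, which requires the t distribution, and the sample values in the usage note are made up.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test.

    t is the difference in means divided by its standard error; df is the
    Welch-Satterthwaite approximation to the degrees of freedom.
    """
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error contributed by each sample.
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df
```

For instance, `welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])` gives t of about -1.90 with about 5.9 degrees of freedom.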
Table 3. Sequence divergence at COI among North American noctuoid species along a north-south axis.
Zones    # Sequences  # BINs  Mean NND  Mean intraspecific
>55°     3637         276     3.79      0.34
50°–55°  9051         657     3.51      0.24
45°–50°  12739        1090    3.32      0.31
40°–45°  6800         957     3.43      0.36
35°–40°  8214         1283    3.43      0.41
30°–35°  10650        1686    3.73      0.45
25°–30°  3355         698     5.06      0.38
<25°     388          75      7.61      0.29
Total    54835        3202    4.2       0.3
https://doi.org/10.1371/journal.pone.0178548.t003
Species boundaries and BIN concordance
Although the BIN count (3816) was just 7% higher than the number of species (3565), this congruence partially reflected the counterbalancing effects of BIN splits and mergers (low divergence and barcode sharing). In actuality, perfect correspondence between the assignment of specimens to a particular species and their placement in a unique BIN was only evident for 2711 species (76.0%). Another 157 species (including all 135 species with >2% intraspecific sequence divergence) were involved in splits with their members assigned to two (102 species), three (31 species), or more (24 species) BINs (S10 Table and for intraspecific divergences see S11 Table). Another 741 species shared their BIN assignment with at least one other species. Some of these mergers reflected species (255) that shared haplotypes (S8 Table), but most (525) involved species with diagnostic but low barcode divergence (S9 Table). A few cases (40) involved species with mixed barcode sharing and low divergence. Finally, there were 43 species whose members were involved in both BIN splits and mergers. S11 Table reports mean intraspecific divergences, BIN counts and number of specimens analyzed for each species.
Records for 3803 BINs were examined for overlap among zoogeographical regions, an analysis which revealed that 284 (7.5%) are shared with the Neotropics while 70 (1.8%) representing 88 species are shared with the Holarctic. Two-thirds of the latter species (59 species) appear to have natural Holarctic distributions while 29 species are believed to have been introduced as a consequence of human activity.
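The four outcomes described above (perfect correspondence, split, merger, and species involved in both) follow directly from comparing each specimen's species name with its BIN assignment. The sketch below classifies species from a list of (species, BIN) pairs; the category labels and the record identifiers in the test are illustrative choices, not output from BOLD.

```python
from collections import defaultdict


def bin_concordance(records):
    """Classify each species by how its specimens map onto BINs.

    records is an iterable of (species, bin_id) pairs, one per specimen.
    Returns {species: category}, where category is one of:
      "match"   - all specimens in one BIN that holds no other species
      "split"   - specimens spread over two or more BINs, none shared
      "merge"   - a single BIN that also holds another species
      "mixture" - two or more BINs, at least one shared with another species
    """
    bins_of = defaultdict(set)      # species -> BINs containing it
    species_in = defaultdict(set)   # BIN -> species it contains
    for species, bin_id in records:
        bins_of[species].add(bin_id)
        species_in[bin_id].add(species)

    result = {}
    for species, bins in bins_of.items():
        is_split = len(bins) > 1
        is_merged = any(len(species_in[b]) > 1 for b in bins)
        if is_split and is_merged:
            result[species] = "mixture"
        elif is_split:
            result[species] = "split"
        elif is_merged:
            result[species] = "merge"
        else:
            result[species] = "match"
    return result
```

Counting the species in each category for a full dataset would give tallies comparable to those reported in this section, subject to how interim names and singleton records are treated.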
Table 4. Sequence divergence at COI among North American noctuoid species across an east-west axis.
Zones      # Sequences  # BINs  Mean NND  Mean intraspecific
>135°      169          84      4.50      0.73
125°–135°  1504         135     5.25      0.19
115°–125°  14882        1077    3.07      0.32
105°–115°  8200         1354    3.86      0.38
95°–105°   5271         1009    4.43      0.37
85°–95°    4650         753     4.10      0.43
75°–85°    13885        1157    3.63      0.40
<75°       6273         636     3.93      0.28
Total      54835        3202    4.1       0.4
https://doi.org/10.1371/journal.pone.0178548.t004
Cohesion at higher taxonomic levels
Although barcode sequences generally do not provide robust phylogenetic information beyond the species level [39], most genera formed cohesive clusters. Such genus-level cohesion, or its lack, may provide a useful preliminary assessment of monophyletic assemblages. For example, Acronicta insularis and A. ursini are embedded within the remainder of Acronicta, but were previously placed in separate genera, Simyra and Merolonche. Independent molecular markers and morphology have since shown that Simyra and Merolonche fall within the concept of Acronicta [40]. Similarly, barcode results for representative Eriopygini (Noctuidae) flagged the close similarity of species in four putative genera, leading to the recognition of a single unified genus, Hypotrix [41]. Conversely, genera that are split or widely separated in NJ trees may flag non-monophyletic groups in need of revision. For example, morphological study confirms that North American Orthosia represent several genera (JDL, unpubl. data), as suggested by the high COI divergence among its component taxa. Finally, extensive taxon sampling can hint at tribal and subfamily systematics; in the case of the genera currently comprising Eustrotiinae, two separate clusters of genera led to independent confirmation that this subfamily includes two distinct groups that are not closely related (BCS, unpubl. data). In situations such as the current one, where nearly complete taxon sampling maximizes phylogenetic signal, DNA barcode data have considerable potential to reveal phylogenetic affinities [42].
Discussion
The present study provides an average of 20× barcode coverage for 97.3% (3565/3664) of the currently recognized noctuoid species in North America. These results indicate that 3310 of these species (92.8%) possess a diagnostic array of DNA barcodes when considered at a continental scale, while barcode resolution rises to 96.0% when examined by ecoregion. About three quarters (76.0%) of these species perfectly coincide with BIN assignments. As reported in other studies [29], many of the cases of discordance involve species with either low sequence divergence from another species or with deep intraspecific divergence. Most species (3412 of 3565) showed a maximum intraspecific distance of less than 2%, but deeper divergence was detected in 157 species (4.4% of the total), and barcode sharing was detected in 255 species (7.1% of the total). Despite these complexities, the resultant DNA barcode library allows the unambiguous identification of 93.0% of currently recognized noctuoid species when considered at a continental level and identification success is 96.0% when analysis examines the species from a particular ecoregion.
Our results reinforce earlier indications that increased geographic sampling does not seriously diminish the performance of DNA barcodes in specimen identification [29,43,44]. In fact, the resolution for North American noctuoids is slightly higher than that for the northern half of the continent as Zahiri et al. [29] observed 90.0% resolution in their study of 1541 species of noctuoids from sites across Canada and 95.6% resolution when considered at a provincial scale. Similarly, deWaard et al. [25] found 93% resolution for 400 geometrid species from British Columbia, whereas Hebert et al. [45] observed 99.0% resolution in a study on 1200 species in diverse families of Lepidoptera from southern Ontario. DNA barcodes also distinguished 97% of more than 1000 species from northwestern Costa Rica [46]. Results from the Palearctic indicate similar performance with 93% of 219 species from selected subfamilies of European Geometridae [47], 90% for 185 species of Romanian butterflies [22], 98.5% for 400 species of Bavarian geometrids [23] and 99% for 957 species of butterflies and larger moths from southern Germany [24].
High intraspecific divergences (>2%) were present in 135 species (3.8%) of North American noctuoids, a slightly lower incidence than the 5–8% reported in other Lepidoptera faunas with well-studied taxonomy [16,22–24,29,36]. These deep intraspecific divergences (S11 Table) may indicate unrecognized sibling species, but may also reflect phylogeographic variation in a single species, divergence linked to bacterial endosymbionts or the recovery of a pseudogene. Cases of deep divergence can arise as a result of introgression following hybridization, paralogous pseudogenes, retained ancestral polymorphisms, and vertically transmitted symbionts [48,49]. Because mitochondrial genes are inherited maternally, are exposed to little recombination, and have an effective population size (Ne) that is ¼ that of nuclear genes, they are also particularly susceptible to selective sweeps [50,51]. Because COI is a protein-coding gene, pseudogenes can be recognized through translation of the nucleotide sequence to ensure the absence of stop codons or frameshift mutations. The endosymbiont Wolbachia can foster COI divergence among infected lineages [52]. Virbia ferruginosa showed an exceptionally high level of COI variation as indicated by its assignment to 18 BINs, but the cause of this diversity remains uncertain. Because morphological studies (BCS) suggest that this variation is not due to cryptic species, there is a need for further work to ascertain if Wolbachia or another agent