Comparison of number of fragments generated by in silico and in vitro ddRADseq for the invasive spotted lanternfly (Lycorma delicatula)
College:
The Dorothy and George Hennings College of Science, Mathematics, and Technology
Major:
Biotechnology/Molecular Biology
Faculty Research Advisor(s):
Brenna Levine
Abstract:
Lycorma delicatula, the spotted lanternfly, is an invasive species in the United States that originated in China, and it poses a significant threat to US agriculture. Quantifying gene flow and genetic structure among US populations would allow us to identify dispersal routes and source populations and thereby inform control efforts. However, genetic resources for the spotted lanternfly are currently restricted to small panels of microsatellite loci and analyses of mitochondrial DNA. Therefore, we performed in silico and in vitro tests of a reduced-representation genome sequencing approach (ddRADseq) with the goal of sequencing thousands of genomic variants for each individual. We first performed in silico digestion of the genome with two pairs of restriction enzymes (NlaIII/MluCI and NlaIII/EcoRI) using SimRAD. We used the results of this simulation to optimize a ddRAD protocol intended to generate high coverage of SNPs and thus high confidence in downstream analyses. We then prepared ddRAD libraries with these two enzyme pairs and sequenced them on an Illumina HiSeq X. The in silico double digestions with NlaIII/MluCI and NlaIII/EcoRI predicted a total of XXX and 193,1991 fragments, respectively. In contrast, the in vitro library preparations with these same enzyme combinations resulted in 727,069 and 400,425 fragments, respectively. Thus, in silico digestion dramatically underpredicted the number of fragments that this protocol would generate, resulting in lower-than-desired coverage and correspondingly low confidence in any downstream analytical conclusions. This discrepancy is likely due to the low quality of the reference genome used for the in silico analyses. Future work should consider the quality of the genome when generating ddRAD libraries for the spotted lanternfly, and should consider a pivot to low-coverage whole-genome sequencing, which does not rely on simulations to inform the laboratory protocol.
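To illustrate what an in silico double digest counts, the following is a minimal Python sketch. It is not SimRAD's implementation and was not part of this study. The function names, the toy sequence, and the optional size-selection parameters are hypothetical and chosen only for illustration. The recognition sites used are the standard ones for these enzymes (NlaIII: CATG^, MluCI: ^AATT, EcoRI: G^AATTC). The sketch scans a single sequence, collects the cut positions of both enzymes, and keeps only fragments flanked by one cut from each enzyme, which are the fragments a ddRAD library is designed to capture.

```python
import re

# Recognition site and cut offset on the top strand for each enzyme.
# The offset is the number of bases into the site at which the cut falls.
ENZYMES = {
    "NlaIII": ("CATG", 4),    # CATG^
    "MluCI": ("AATT", 0),     # ^AATT
    "EcoRI": ("GAATTC", 1),   # G^AATTC
}


def cut_positions(seq, enzyme):
    """Return top-strand cut positions of one enzyme in seq."""
    site, offset = ENZYMES[enzyme]
    # A lookahead pattern also finds overlapping occurrences of the site.
    return [m.start() + offset for m in re.finditer(f"(?={site})", seq)]


def double_digest(seq, enzyme_a, enzyme_b, min_len=1, max_len=None):
    """Return fragments flanked by one cut of each enzyme.

    min_len and max_len mimic size selection (hypothetical defaults).
    """
    cuts = [(pos, enzyme_a) for pos in cut_positions(seq, enzyme_a)]
    cuts += [(pos, enzyme_b) for pos in cut_positions(seq, enzyme_b)]
    cuts.sort()

    fragments = []
    # Walk over neighbouring cut sites; each pair bounds one fragment.
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        length = end - start
        if left == right:
            continue  # both ends cut by the same enzyme: not retained
        if length < min_len or (max_len is not None and length > max_len):
            continue  # outside the size-selection window
        fragments.append(seq[start:end])
    return fragments


# Toy sequence (made up for illustration, not lanternfly data).
toy = "GGCATGTTTTAATTCCCCCATGAAAGAATTCGG"
nla_mlu = double_digest(toy, "NlaIII", "MluCI")
nla_eco = double_digest(toy, "NlaIII", "EcoRI")
print(len(nla_mlu), nla_mlu)
print(len(nla_eco), nla_eco)
```

Because a count like this depends entirely on the sequence supplied, recognition sites that are missing from an incomplete or fragmented assembly cannot contribute to the predicted fragment number.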