Diatraea saccharalis
Profile
Species: Diatraea saccharalis (Fabricius) – sugarcane borer
Order: Lepidoptera
Family: Crambidae
Genus: Diatraea
The sugarcane borer, Diatraea saccharalis (Fabricius), is native to the western hemisphere, but not to the United States. It apparently was introduced into Louisiana about 1855, and has since spread to the other Gulf Coast States. It inhabits only the warmer portions of these states. Sugarcane borer also occurs throughout the Caribbean, Central America, and the warmer portions of South America south to northern Argentina.
Sugarcane borer sometimes is a serious pest of sugarcane. Larvae bore into the sugarcane stalks. In mature plants the tops tend to weaken or die, sometimes breaking off. In young plants the inner whorl of leaves is killed, resulting in a condition known as "dead heart." The amount and purity of juice that can be extracted from cane is reduced when borers are present, and sucrose yield may be decreased 10 to 20%. Lastly, when seed cane is attacked, the tunneling by borers makes the seed piece susceptible to fungal infection.
D. saccharalis attacks plants in the family Poaceae (grasses). Though principally a pest of sugarcane, this insect also will feed on other crops such as corn, rice, sorghum, and sudangrass.
Source: University of Florida ‘Featured Creatures’ https://entnemdept.ufl.edu/creatures
Sample Collection
Fourth- or fifth-instar larvae were collected on 25 March 2020 at 700 Chesterfield Pkwy W, Chesterfield, MO 63017, USA (Bayer).
Next Generation Sequencing
i) Illumina Hi-C sequencing 150 bp paired end data:
770,011,116 reads and 115,501,667,400 bases (770,011,116 reads x 150 bp), roughly 227x coverage relative to the 507,694,304 bp final assembly.
ii) PacBio HiFi data, of mean read length 9,416, total reads 508,419, read length N50 10,270, and total bases 4,787,652,669.
iii) PacBio CLR data, of mean read length 8,358, total reads 10,515,233, read length N50 11,110, and total bases 87,889,471,906. DNA for both CLR and HiFi was extracted from a single individual with a ZYMO HMW kit (6,000 ng gDNA).
iv) PacBio Iso-Seq data, 2 SMRT cells on a Sequel II.
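The sequencing depth implied by these figures can be checked with simple arithmetic. The sketch below is illustrative only: it assumes every Hi-C read is exactly 150 bp and uses the final assembly size reported under Final Results as the genome size, which is an assumption rather than an independent genome size estimate. The function and constant names are hypothetical.

```python
# Figures quoted in this profile.
HIC_READS = 770_011_116
HIC_READ_LENGTH = 150          # bp, assuming all reads are full length
HIFI_BASES = 4_787_652_669
CLR_BASES = 87_889_471_906

# Assumption: the final assembly size stands in for the true genome size.
ASSEMBLY_SIZE = 507_694_304


def coverage(total_bases, genome_size=ASSEMBLY_SIZE):
    """Return mean sequencing depth as total bases divided by genome size."""
    return total_bases / genome_size


hic_bases = HIC_READS * HIC_READ_LENGTH

for name, bases in [("Hi-C", hic_bases), ("HiFi", HIFI_BASES), ("CLR", CLR_BASES)]:
    print(f"{name}: {bases:,} bases, about {coverage(bases):.1f}x")
```

Depth computed this way is a nominal figure; duplicate reads, unmapped reads, and organelle sequence all reduce the effective coverage.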
Methods
DNA from a single non-sexed individual was used for PacBio CLR and HiFi sequencing (University of Georgia, USA), and DNA from multiple individuals was used for Hi-C Illumina sequencing (Arima Genomics, USA). Flye was used to assemble the PacBio CLR data and Hifiasm the HiFi data. The HiFi assembly was then scaffolded to chromosome level with Juicer followed by 3D-DNA, using the Hi-C data. Haplotigs were removed with purge_haplotigs. Manual curation was done to bring the assembly together and check for misassemblies. Unmapped reads were mapped back to the original assembly to check for missing sequence, and any missing sequence was incorporated into the final assembly. Error correction was done with the Hi-C data using freebayes.
A transcriptome was assembled from public RNA-seq data, PRJNA564321 (midgut) and PRJNA527774 (larval parasitization) (BUSCO: C:94.7%[S:22.9%,D:71.8%],F:1.1%,M:4.2%), and used together with the PGI Iso-Seq transcriptome (1 SMRT cell) (BUSCO: C:69.1%[S:36.7%,D:32.4%],F:4.6%,M:26.3%) in the MAKER2 annotation pipeline with trained AUGUSTUS and GeneMark gene predictors. PASA was used to update the gene models by adding UTRs, correcting existing models, and adding isoforms. Non-coding RNA was annotated using Infernal v1.1.4.
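The BUSCO summaries quoted here follow the usual short notation: C (complete) is the sum of S (single-copy) and D (duplicated), and C, F (fragmented), and M (missing) add up to 100%. A minimal sketch for reading such a string and checking those two identities is given below; the helper names and the rounding tolerance are assumptions for illustration and are not part of BUSCO itself.

```python
import re

# Matches fields such as "C:94.7%" or "S:22.9%" in a BUSCO short summary.
FIELD = re.compile(r"([CSDFM]):\s*([\d.]+)%")


def parse_busco(summary):
    """Return a dict mapping BUSCO category letters to percentages."""
    return {key: float(value) for key, value in FIELD.findall(summary)}


def is_consistent(scores, tolerance=0.15):
    """Check C = S + D (when S and D are given) and C + F + M = 100."""
    total_ok = abs(scores["C"] + scores["F"] + scores["M"] - 100.0) <= tolerance
    if "S" in scores and "D" in scores:
        parts_ok = abs(scores["S"] + scores["D"] - scores["C"]) <= tolerance
    else:
        parts_ok = True
    return total_ok and parts_ok


rnaseq = parse_busco("C:94.7%[S:22.9%,D:71.8%],F:1.1%,M:4.2%")
isoseq = parse_busco("C:69.1%[S:36.7%,D:32.4%],F:4.6%,M:26.3%")
print(rnaseq, is_consistent(rnaseq))
print(isoseq, is_consistent(isoseq))
```

The high duplicated fraction in a transcriptome assembly typically reflects multiple isoforms per gene rather than true gene duplication.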
A Pfam genomic track was created by translating the genome in all six reading frames and using HMMER to identify loci of interest, e.g. P450 Pfam domains, on the genome. Using this information, loci of interest including UDP-glycosyltransferase (UGT), P450, ABC transporter, and IRAC gene models were found and curated using mapped RNA-seq and the MAKER gene annotation.
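To illustrate what translating in six reading frames means, the sketch below translates a DNA string in the three forward frames and the three frames of its reverse complement, using the standard genetic code. It is a minimal illustration, not the pipeline used for this genome: the function names are hypothetical, and the later step of searching the translated frames against Pfam profiles with HMMER is not shown.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return translations for frames +1, +2, +3, -1, -2, -3."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


print(six_frame_translation("ATGGCC"))
```

Stop codons are kept as `*` so that downstream tools can split each frame into open stretches before domain searching.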
Final Results
A complete, annotated 23-chromosome assembly was deposited at NCBI under accession PRJEB47899 (including raw data).
BUSCO (Insecta odb10): C:98.4%,F:0.4%,M:1.2%
14,137 gene models - BUSCO C:96.7%[S:86.9%,D:9.8%],F:1.7%,M:1.6%
Scaffold No. (incl Mt): 26
N50: 22,222,603
N bases (bp): 114,600
Repeat: 42.93%
Total size (bp) (chr no.): 507,694,304 (23)
Curated: 91x P450, 55x ABC transporter, 29x UGT, and 116/130 IRAC gene models.
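For readers unfamiliar with the N50 statistic listed above: it is the length of the scaffold at which the running total, taken from the longest scaffold downwards, first reaches half of the total assembly length. A minimal sketch follows; the scaffold lengths in the example are made up for illustration and are not taken from this assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    half_total = sum(lengths) / 2
    running_total = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running_total += length
        if running_total >= half_total:
            return length


# Hypothetical scaffold lengths, for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```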
Other files
These are files that were not submitted to NCBI but might be useful.