Figure 2-1 Length distributions of the assembled transcripts and unigenes in Corchorus olitorius. The horizontal axis is the length of transcripts and unigenes, and the vertical axis is the number of transcripts and unigenes.

Transcripts were de novo assembled into unigenes, producing 70792 unigenes with a mean length of 752 bp and an N50 length of 1420 bp (Figure 2-1, Appendix 2). The length of these unigenes ranged from 201 bp to 13328 bp. A total of 43599 (61.58%) unigenes were in the length range of 201 to 500 bp and 11485 (16.2%) unigenes were in the length range of 501 to 1000 bp, whereas 15710 (22.2%) unigenes were longer than 1000 bp (Figure 2-1).

To assess the quality of the assembled unigenes, the transcriptome sequences assembled de novo by Trinity were used as the reference sequence. All the clean sequence reads were mapped to the assembled unigenes by RSEM software[15
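The assembly statistics reported in this section (mean length, N50 and the three length classes) can all be derived from a plain list of unigene lengths. The following is a minimal sketch of that calculation, not the Trinity or RSEM implementation. The function names `n50` and `length_bins` and the toy lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def length_bins(lengths):
    """Count sequences in the three length classes used in the text.

    Assumes every length is at least 201 bp, the minimum reported above.
    """
    bins = {"201-500": 0, "501-1000": 0, ">1000": 0}
    for length in lengths:
        if length <= 500:
            bins["201-500"] += 1
        elif length <= 1000:
            bins["501-1000"] += 1
        else:
            bins[">1000"] += 1
    return bins


# Hypothetical toy lengths in bp, for illustration only.
toy_lengths = [250, 300, 450, 800, 1200, 3000]
toy_mean = sum(toy_lengths) / len(toy_lengths)
toy_n50 = n50(toy_lengths)
toy_bins = length_bins(toy_lengths)
```

Because N50 is weighted by sequence length, it can be much larger than the mean when most sequences are short, which is consistent with the reported mean of 752 bp against an N50 of 1420 bp.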
Figure 2-2 Classification of most abundant annotated species.

2.3.3 GO annotation

GO annotations for the assembled unigenes, combined with non[...] information, were obtained using Blast2GO software[153]. The GO functions were categorized according to the WEGO program[154]. As a result, a total of [...] (39.45%) unigenes were categorized, based on GO, into 3 major categories (molecular function, biological process, and cellular component) and 56 subcategories (Figure 2-3). Among the subcategories, cellular process was the most dominant [...] unigenes, followed by binding, metabolic process, catalytic activity, singl[...] activity, cell, and cell part with 15134, 14962, 12605, 12010, 8859, 615[...] unigenes, respectively. Very few genes (fewer than 10) were assigned to growth[...] rhythmic process, extracellular matrix components, nucleoid, and synap[...]
Figure 2-3 Gene ontology classifications of assembled unigenes in jute.

2.3.4 KOG annotation

To evaluate the integrity and validity of the transcriptome sequence, the 33873 (47.84%) unigenes annotated in the Nr database were searched against the KOG database to categorize their potential roles. In total, 16993 (24%) genes were grouped into 26 KOG categories (Figure 2-4). Of these categories, R (general function), with 2646 (13.9%) genes, was the most abundant, followed by O (post-translational modification, protein turnover), T (signal transduction mechanisms), J (translation, ribosomal structure and biogenesis), C (energy production and conversion) and U (intracellular trafficking, secretion and vesicular transport), among others, with 2277 (11.96%), 1559 (8.19%), 1462 (7.68%), 1051 (5.52%), 1041 (5.47%), 928 (4.87%), 914 (4.80%) and 883 (4.69%) genes in descending order. N (cell motility) and X (unnamed protein) were the least represented, with 7 (0.03%) and 1 (0.01%) genes, respectively.
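The per-category percentages above are shares of a common total. The sketch below shows one way to compute and rank such shares. It assumes that the denominator is the total number of category assignments, where a gene may be counted in more than one category; the source does not state which denominator was used. The function name `category_shares` and the toy counts are hypothetical and are not data from the study.

```python
def category_shares(counts):
    """Map each category to its percentage of all category assignments.

    Assumption: the denominator is the sum of all per-category counts.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no assignments to summarise")
    return {cat: round(100 * n / total, 2) for cat, n in counts.items()}


# Hypothetical counts, for illustration only.
toy_counts = {"R": 50, "O": 30, "T": 15, "N": 5}
shares = category_shares(toy_counts)

# Categories ordered from most to least represented.
ranked = sorted(shares, key=shares.get, reverse=True)
```

Under this assumption the shares sum to 100% even when genes belong to several categories, which would not hold if the number of annotated genes were used as the denominator.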
