Comparasite Full-length cDNA Database


Statistics


Table 1
Statistics of the data contents for each of the individual full-length cDNA databases.

Species            Host       Stage                     Library method   #cDNA sequenced  #RefFull  DB
P. falciparum      human      erythrocytic, gametocyte  oligo-capping    12,484           1,465     Full-malaria
P. vivax           human      erythrocytic, gametocyte  oligo-capping    11,262           1,566     Full-malaria
P. yoelii          mouse      erythrocytic, gametocyte  oligo-capping    9,633            1,206     Full-malaria
P. berghei         mouse      erythrocytic, gametocyte  oligo-capping    1,518            416       Full-malaria
T. gondii          mammals    tachyzoite                oligo-capping    7,400            762       Full-toxoplasma
C. parvum          human/cow  sporozoite                oligo-capping    5,921            682       Full-cryptosporidium
E. multilocularis  dog/fox    larva                     vector trapping  10,966           ND        Full-echinococcus



Table 2
Number of RefFulls that corresponded to putative counterpart genes in each species.

Species  5  4  3  2  1  0  Total

P. falciparum118441404448181465
P. vivax120491503639831566
P. yoelii119491553296531206
P. berghei118398974195416
T. gondii117304185588762
C. parvum1183349101480682

*Number of RefFulls that corresponded to putative counterpart genes in the indicated number of species.


Table 3A
Number of annotation terms identified from the RefFulls in each species.

Annotations for RefFull (category)  Pf  Pv  Py  Pb  Tg  Cp

"Antigen" (Keyword)10741011312
"Transcription" (Keyword)42160329
"Kinase" (Pfam)1163841000
"Mitochondria" (PSORT)8111749316719
"Transmembrane domain" (SOSUI)508250166627943
"Transporter" (GO term)1319260273020
"TATA box" (Promoter)115760153022742233
"TATA box; Pf" (Promoter)13281007741273158299



Table 3B
Number of putative counterpart genes containing the corresponding annotation terms in common across the indicated number of species.

Annotations for RefFull (category)  6  5  4  3  2

"Antigen" (Keyword)0041456
"Transcription" (Keyword)001631
"Kinase" (Pfam)013429
"Mitochondria" (PSORT)003525
"Transmembrane domain" (SOSUI)04321108
"Transporter" (GO term)051185
"TATA box" (Promoter)021487443
"TATA box; Pf" (Promoter)01746192646



Table 4
Frequencies of the mapped and unmapped cDNAs for each species.

Species        #cDNA sequenced  #mapped cDNA  Mapping rate (%)
P. falciparum  8968             7206          80.4
P. vivax       9633             8043          83.5
P. yoelii      11262            8403          74.6
P. berghei     1518             1081          71.2
T. gondii      7400             6593          89.1
C. parvum      10110            9884          97.8
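
The mapping rates in Table 4 appear to be the mapped cDNA count divided by the sequenced cDNA count, expressed as a percentage and rounded to one decimal place. The following is a minimal sketch of that arithmetic using the counts from Table 4; the names `TABLE_4` and `mapping_rate` are illustrative assumptions and are not part of the database.

```python
# (cDNA sequenced, mapped cDNA) pairs copied from Table 4.
TABLE_4 = {
    "P. falciparum": (8968, 7206),
    "P. vivax": (9633, 8043),
    "P. yoelii": (11262, 8403),
    "P. berghei": (1518, 1081),
    "T. gondii": (7400, 6593),
    "C. parvum": (10110, 9884),
}


def mapping_rate(sequenced, mapped):
    """Return the percentage of sequenced cDNAs that were mapped (one decimal)."""
    return round(100.0 * mapped / sequenced, 1)


# Hypothetical helper output: one mapping rate per species.
rates = {species: mapping_rate(seq, mapped)
         for species, (seq, mapped) in TABLE_4.items()}

for species, rate in rates.items():
    print(f"{species}: {rate}")
```

Under this assumption the computed values reproduce the "Mapping rate (%)" column for all six species.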

*Because we used only cDNAs containing more than 200 bp with a phred value > 10, most of the unmapped cDNAs probably correspond to unfinished parts of the genomic sequences. We provide this information in the "Statistics" section of the website. However, we are not certain that genome completeness estimated in this way is accurate. We cannot exclude the possibility that cDNAs representing some genes in particular species are inherently difficult to map onto the genomic sequences. For example, our cDNAs mainly represent the 5'-end UTRs, which are very AT-rich, especially in P. falciparum, and form seemingly repetitive sequences.
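
As an illustration of the quality filter mentioned in the note above, the sketch below takes one possible reading of it: a cDNA read is kept when more than 200 of its bases have a phred value greater than 10. This reading, and the names `error_probability` and `passes_filter`, are assumptions for illustration; the actual pipeline is not described here. The only fixed fact used is the standard phred definition, in which a quality Q corresponds to a base-call error probability of 10^(-Q/10).

```python
def error_probability(phred):
    """Base-call error probability implied by a phred quality value."""
    return 10 ** (-phred / 10)


def passes_filter(qualities, min_quality=10, min_length=200):
    """Assumed filter: keep a read if more than `min_length` bases
    have a phred value strictly greater than `min_quality`."""
    good_bases = sum(1 for q in qualities if q > min_quality)
    return good_bases > min_length


# Hypothetical reads given as lists of per-base phred values.
long_good_read = [20] * 201   # 201 bases above the threshold
short_good_read = [20] * 200  # exactly 200 bases, not "more than 200"
low_quality_read = [10] * 300 # long, but no base exceeds phred 10

print(passes_filter(long_good_read))
print(passes_filter(short_good_read))
print(passes_filter(low_quality_read))
```

A phred value of 10 corresponds to an error probability of 0.1, so this threshold discards bases that are expected to be wrong more than about one time in ten.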