Statistics
Table 1 Statistics of the data contents for each of the individual full-length cDNA databases.

| Species | Host | Stage | Library method | #cDNA sequenced | #RefFull | DB |
| --- | --- | --- | --- | --- | --- | --- |
| P. falciparum | human | erythrocytic, gametocyte | oligo-capping | 12,484 | 1,465 | Full-malaria |
| P. vivax | human | erythrocytic, gametocyte | oligo-capping | 11,262 | 1,566 | Full-malaria |
| P. yoelii | mouse | erythrocytic, gametocyte | oligo-capping | 9,633 | 1,206 | Full-malaria |
| P. berghei | mouse | erythrocytic, gametocyte | oligo-capping | 1,518 | 416 | Full-malaria |
| T. gondii | mammals | tachyzoite | oligo-capping | 7,400 | 762 | Full-toxoplasma |
| C. parvum | human/cow | sporozoite | oligo-capping | 5,921 | 682 | Full-cryptosporidium |
| E. multilocularis | dog/fox | larva | vector trapping | 10,966 | ND | Full-echinococcus |

Table 2 Number of RefFulls corresponding to putative counterpart genes in each species.*

| Species | 5 | 4 | 3 | 2 | 1 | 0 | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| P. falciparum | 1 | 18 | 44 | 140 | 444 | 818 | 1465 |
| P. vivax | 1 | 20 | 49 | 150 | 363 | 983 | 1566 |
| P. yoelii | 1 | 19 | 49 | 155 | 329 | 653 | 1206 |
| P. berghei | 1 | 18 | 39 | 89 | 74 | 195 | 416 |
| T. gondii | 1 | 17 | 30 | 41 | 85 | 588 | 762 |
| C. parvum | 1 | 18 | 33 | 49 | 101 | 480 | 682 |

*Column headings give the number of species; each cell is the number of RefFulls corresponding to putative counterpart genes in the indicated number of species.

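
The rows of Table 2 can be checked for internal consistency: the six count columns of each row should add up to the Total column. The following is a minimal Python sketch of that check, with values copied from the table. The names `table2`, `row_total_mismatches`, and `share_with_counterpart` are illustrative and are not part of the database, and reading column "0" as "no putative counterpart in any other species" is an assumption based on the footnote.

```python
# Values copied from Table 2: counts for columns 5, 4, 3, 2, 1, 0,
# followed by the reported Total for each species.
table2 = {
    "P. falciparum": ([1, 18, 44, 140, 444, 818], 1465),
    "P. vivax": ([1, 20, 49, 150, 363, 983], 1566),
    "P. yoelii": ([1, 19, 49, 155, 329, 653], 1206),
    "P. berghei": ([1, 18, 39, 89, 74, 195], 416),
    "T. gondii": ([1, 17, 30, 41, 85, 588], 762),
    "C. parvum": ([1, 18, 33, 49, 101, 480], 682),
}


def row_total_mismatches(table):
    """Return {species: (computed_sum, reported_total)} for rows that disagree."""
    return {
        species: (sum(counts), total)
        for species, (counts, total) in table.items()
        if sum(counts) != total
    }


def share_with_counterpart(counts):
    """Fraction of RefFulls outside the last column ("0").

    Assumption: column "0" holds RefFulls with no putative counterpart.
    """
    return (sum(counts) - counts[-1]) / sum(counts)


if __name__ == "__main__":
    print("Rows whose counts do not sum to Total:", row_total_mismatches(table2))
    for species, (counts, _total) in table2.items():
        print(f"{species}: {share_with_counterpart(counts):.1%} with a counterpart")
```

An empty result from `row_total_mismatches` means every row is internally consistent.
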
Table 3A Number of annotation terms identified from the RefFulls in each species.

| Annotations for RefFull (category) | Pf | Pv | Py | Pb | Tg | Cp |
| --- | --- | --- | --- | --- | --- | --- |
| "Antigen" (Keyword) | 107 | 41 | 0 | 11 | 3 | 12 |
| "Transcription" (Keyword) | 42 | 16 | 0 | 3 | 2 | 9 |
| "Kinase" (Pfam) | 116 | 38 | 41 | 0 | 0 | 0 |
| "Mitochondria" (PSORT) | 81 | 117 | 49 | 31 | 67 | 19 |
| "Transmembrane domain" (SOSUI) | 508 | 250 | 166 | 62 | 79 | 43 |
| "Transporter" (GO term) | 131 | 92 | 60 | 27 | 30 | 20 |
| "TATA box" (Promoter) | 1157 | 601 | 530 | 227 | 42 | 233 |
| "TATA box; Pf" (Promoter) | 1328 | 1007 | 741 | 273 | 158 | 299 |

Table 3B Number of putative counterpart genes that share the corresponding annotation term in the indicated number of species.

| Annotations for RefFull (category) | 6 | 5 | 4 | 3 | 2 |
| --- | --- | --- | --- | --- | --- |
| "Antigen" (Keyword) | 0 | 0 | 4 | 14 | 56 |
| "Transcription" (Keyword) | 0 | 0 | 1 | 6 | 31 |
| "Kinase" (Pfam) | 0 | 1 | 3 | 4 | 29 |
| "Mitochondria" (PSORT) | 0 | 0 | 3 | 5 | 25 |
| "Transmembrane domain" (SOSUI) | 0 | 4 | 3 | 21 | 108 |
| "Transporter" (GO term) | 0 | 5 | 1 | 18 | 5 |
| "TATA box" (Promoter) | 0 | 2 | 14 | 87 | 443 |
| "TATA box; Pf" (Promoter) | 0 | 17 | 46 | 192 | 646 |

Table 4 Frequencies of the mapped and unmapped cDNAs for each species.

| Species | #cDNA sequenced | #mapped cDNA | Mapping rate (%) |
| --- | --- | --- | --- |
| P. falciparum | 8968 | 7206 | 80.4 |
| P. vivax | 9633 | 8043 | 83.5 |
| P. yoelii | 11262 | 8403 | 74.6 |
| P. berghei | 1518 | 1081 | 71.2 |
| T. gondii | 7400 | 6593 | 89.1 |
| C. parvum | 10110 | 9884 | 97.8 |

*Because we used only cDNAs longer than 200 bp with a phred value > 10, it is reasonable to hypothesize that at least the major part of the unmapped cDNAs corresponds to unfinished parts of the genomic sequences. We provide this information in the "Statistics" section of the website. However, we are not completely sure that the completeness of the genomes estimated in this way is accurate. We cannot exclude the possibility that cDNAs representing some genes in particular species are inherently difficult to map onto the genomic sequences. For example, our cDNAs mainly represent the 5'-end UTRs, which are very AT-rich, especially in P. falciparum, and form seemingly repetitive sequences.
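
The mapping rate in Table 4 is the number of mapped cDNAs divided by the number of cDNAs sequenced, expressed as a percentage. As a rough illustration, the Python sketch below recomputes that rate and the number of unmapped cDNAs from the counts in the table. The names `table4`, `mapping_rate`, and `unmapped` are hypothetical and are not part of the database.

```python
# Values copied from Table 4:
# (#cDNA sequenced, #mapped cDNA, reported mapping rate in %).
table4 = {
    "P. falciparum": (8968, 7206, 80.4),
    "P. vivax": (9633, 8043, 83.5),
    "P. yoelii": (11262, 8403, 74.6),
    "P. berghei": (1518, 1081, 71.2),
    "T. gondii": (7400, 6593, 89.1),
    "C. parvum": (10110, 9884, 97.8),
}


def mapping_rate(sequenced, mapped):
    """Mapped cDNAs as a percentage of sequenced cDNAs."""
    return 100.0 * mapped / sequenced


def unmapped(sequenced, mapped):
    """Number of cDNAs that were not mapped onto the genomic sequence."""
    return sequenced - mapped


if __name__ == "__main__":
    for species, (sequenced, mapped, reported) in table4.items():
        rate = mapping_rate(sequenced, mapped)
        # Agreement within half a unit of the last reported decimal place.
        agrees = abs(rate - reported) < 0.05
        print(
            f"{species}: {rate:.1f}% mapped, "
            f"{unmapped(sequenced, mapped)} unmapped, "
            f"matches reported value: {agrees}"
        )
```

As the footnote cautions, the unmapped count is only an indirect indicator of genome completeness, so these figures are best read as an upper-level summary, not as a measurement of the unfinished genomic fraction.
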