The tobacco genome initiative
Cultivated tobacco, Nicotiana tabacum, is a member of the Solanaceae family, which also includes eggplant, pepper, petunia, potato and tomato. N. tabacum is an amphiploid species (2n=48) resulting from an interspecific cross between N. sylvestris (2n=24) and N. tomentosiformis (2n=24). Its genome is very large compared with those of other cultivated solanaceous plants: at approximately 4.5 billion base pairs, it is 1.5 times the size of the human genome. As with the human genome, the vast majority of these base pairs occur as highly repetitive sequence. Although N. tabacum has been cultivated since before the time of Columbus and is of great economic significance today, relatively little information exists on its genome structure and organization. We are employing a combination of strategies to identify a large percentage of the genes in N. tabacum. During the past year, we constructed a BAC library (9.7-fold genome coverage) and initiated BAC-end sequencing (11,000 lanes) in preparation for construction of a physical map. We also constructed cDNA libraries from N. tabacum and N. benthamiana for EST sequencing. N. benthamiana, an amphiploid species with 38 chromosomes, is closely related to N. tabacum and is an important model host for studying plant disease interactions. To date, we have sequenced approximately 17,000 ESTs from each library. To expedite the project, we are using a methyl filtration library approach to identify gene-rich regions in N. tabacum. Results from our pilot experiments show a nearly 10-fold increase in gene discovery in filtered versus non-filtered libraries. We are currently moving to the production phase of the project.
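The coverage figures above can be put in perspective with a little arithmetic. The following is a minimal sketch, not the project's own calculation: it takes the genome size (about 4.5 billion base pairs) and the BAC library coverage (9.7-fold) from the text, and it assumes a hypothetical average insert size of 120 kb, which the text does not state. The probability that a given locus is present in the library uses the standard Poisson approximation to the Clarke-Carbon formula, P = 1 - e^(-c), where c is the fold coverage.

```python
import math

GENOME_SIZE_BP = 4.5e9        # approximate N. tabacum genome size (from the text)
BAC_COVERAGE = 9.7            # fold coverage of the BAC library (from the text)
ASSUMED_INSERT_BP = 120_000   # hypothetical average insert size; NOT given in the text


def total_cloned_bp(genome_bp: float, coverage: float) -> float:
    """Total base pairs held in a library of the given fold coverage."""
    return genome_bp * coverage


def clones_needed(genome_bp: float, coverage: float, insert_bp: float) -> int:
    """Number of clones required to reach the given coverage at a given insert size."""
    return math.ceil(genome_bp * coverage / insert_bp)


def probability_locus_covered(coverage: float) -> float:
    """Poisson approximation: chance that any one locus is in at least one clone."""
    return 1.0 - math.exp(-coverage)


if __name__ == "__main__":
    print(f"Total cloned sequence: {total_cloned_bp(GENOME_SIZE_BP, BAC_COVERAGE):.3e} bp")
    print(f"Clones at assumed insert size: "
          f"{clones_needed(GENOME_SIZE_BP, BAC_COVERAGE, ASSUMED_INSERT_BP):,}")
    print(f"P(locus represented): {probability_locus_covered(BAC_COVERAGE):.6f}")
```

Under these assumptions, a 9.7-fold library leaves only a very small chance that any particular locus is missing, which is consistent with using it as the basis for a physical map. The clone count scales inversely with the assumed insert size, so it should be read as illustrative only.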