Monday, March 3, 2008  12:00-5:00pm  SDSU Montezuma Hall

 

Shannon's Uncertainty and Kullback-Leibler Divergence in Microbial Genome and Metagenome Sequences
All genome sequence data contains inherent information. Shannon's uncertainty theory can be used to measure how much information a sequence has. Here we show that the amount of information in a sequence correlates with the likelihood that similar sequences will be found in the database using search algorithms (BLAST). Hence, a sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database. Measuring uncertainty may be a rapid way to screen for sequences likely to be similar to things in the database, and also to show which sequences with no known similarities are likely to be false negatives. Here, we also present some work on amino acid composition for each of the complete bacterial genome sequences.
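As a rough illustration of the measure described above, the following Python sketch computes Shannon's uncertainty, H = -sum(p_i * log2(p_i)), over the k-mers of a sequence. The function name `shannon_uncertainty`, the choice of overlapping k-mers, and the default `k = 1` are assumptions made for this example; the abstract does not state the word size or alphabet the authors used.

```python
import math
from collections import Counter


def shannon_uncertainty(sequence: str, k: int = 1) -> float:
    """Return Shannon's uncertainty (in bits) of the k-mer distribution.

    Hypothetical helper: the word size k and the use of overlapping
    k-mers are illustrative assumptions, not taken from the abstract.
    """
    # Collect every overlapping window of length k.
    kmers = [sequence[i:i + k] for i in range(len(sequence) - k + 1)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    total = len(kmers)
    # H = -sum(p * log2(p)) over the observed k-mer frequencies.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    for seq in ("AAAAAAAA", "AACCAACC", "ACGTACGT"):
        print(seq, round(shannon_uncertainty(seq), 3))
```

A homopolymer such as `AAAAAAAA` scores zero, while a sequence that uses all four nucleotides equally reaches the single-base maximum of 2 bits, which is the sense in which a low-complexity sequence carries less information.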
 
Sajia Akhter Poster
We show that (i) there is a significant difference in amino acid utilization between different phylogenetic groups of bacteria; (ii) the bacteria with the most skewed amino acid utilization profiles are endosymbionts or intracellular pathogens; and (iii) the skews are not restricted to one or a few metabolic processes but are found across all subsystems.
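One way a skewed amino acid utilization profile could be quantified is with the Kullback-Leibler divergence named in the title, D(P||Q) = sum(p_i * log2(p_i / q_i)). The sketch below is a minimal illustration under stated assumptions: the abstract does not say which reference distribution Q was used, so the comparison against a pooled reference set, the pseudocount smoothing, and the names `amino_acid_frequencies` and `kl_divergence` are all hypothetical.

```python
import math
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(proteins, pseudocount=1.0):
    """Return smoothed amino acid frequencies for a list of protein strings.

    The pseudocount is an assumption of this sketch; it keeps every
    frequency positive so the divergence below is always finite.
    """
    counts = Counter()
    for protein in proteins:
        counts.update(ch for ch in protein.upper() if ch in AMINO_ACIDS)
    total = sum(counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P||Q) in bits over the amino acids."""
    return sum(p[a] * math.log2(p[a] / q[a]) for a in AMINO_ACIDS if p[a] > 0)


if __name__ == "__main__":
    # Toy data only: a lysine- and isoleucine-rich "genome" versus a
    # reference that uses every amino acid equally.
    skewed = amino_acid_frequencies(["KKKKIIIINNNNKKKKIIII"])
    reference = amino_acid_frequencies([AMINO_ACIDS * 5])
    print(round(kl_divergence(skewed, reference), 3))
```

A genome whose profile matches the reference gives a divergence of zero, and larger values indicate a more skewed profile. The same function could be applied per subsystem to check whether a skew is confined to a few metabolic processes.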
     
     