BFG@University of Richmond

Monday, October 17, 2005

Nutritional Genomics


An interesting new take on the bioinformatics/functional genomics story is nutritional genomics (nutrigenomics). It's the study of how our genes affect our health and nutritional requirements. As you can imagine, this research has tremendous upside, but it is clear that one must view this research with healthy skepticism (no pun intended).

Wednesday, October 12, 2005

Finding mutations in FlyBase


An important way to test the in vivo function of a protein is to compare the phenotype of normal (wild type) individuals to the phenotype of individuals lacking the function of a particular gene. By studying the defects associated with loss of gene function, we can make inferences about normal gene function. Drosophila melanogaster is a great model organism for doing these types of experiments, as there is a treasure trove of genetic resources available. How can you find mutations in a particular gene? There are a number of ways:

1) FlyBase gene query.

Go to FlyBase home page.
Click on the Genes link.
Click on Genes Search link.
Type in gene name (CGnnnn or actual gene name) and click submit button. For example, I did this with the Kinesin light chain (Klc) gene. The results page has lots of results; be sure to pick the correct one!
The Synopsis of the Klc FlyBase entry has lots of information about the size of the protein and its function. We are interested, however, in finding mutations in the Klc gene.
The first place to look is the Stocks link on the right side of the page. Not all genes have mutant stocks available at the Stock Center, and some mutations that are available have not been ascribed to your favorite gene (even though they do exist). How can you find such mutations?
Click on the Gene Region Map link on the right side of the page. A map representing the part of the genome in which your favorite gene resides will appear. The map is clickable, and map data can be changed using the toggle buttons at the bottom of the page. Useful buttons include transgene insertion site, deleted segments, and transgene insertions. Click the buttons you wish to try (one at a time at first), and then click the update map button. Click on new features that appear on the map for more information.
What if this approach yields no results? One approach is to search the Special Collections of fly mutations, especially the Baylor/BDGP Gene Disruption Project and the Exelixis Collection.

Tuesday, October 11, 2005

Expired links: not so good


BLAST servers at NCBI, Ensembl, etc. have a nifty feature in which your search is assigned a unique identifier (RID), which can be used to retrieve your results later. Unfortunately (but understandably), the RID expires in a relatively short time (about 48 hours), so links to BLAST results through the RID stop working after a few days. The way around this is to save your BLAST results as an HTML file (better), or to link to the BLink entry (good).
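
Saving the results page can also be scripted. The sketch below is only an illustration and is not part of the original workflow: it assumes NCBI's current BLAST URL endpoint and its CMD, RID and FORMAT_TYPE parameters, and the function names are made up for this example. If the RID has already expired, or the search is still running, the saved page will be a status page rather than your results.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the current NCBI BLAST URL endpoint (not given in the post).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def results_url(rid):
    """Build the link that retrieves the results for one RID as HTML."""
    query = urlencode({"CMD": "Get", "RID": rid, "FORMAT_TYPE": "HTML"})
    return BLAST_URL + "?" + query


def save_results(rid, path):
    """Download the results page for an unexpired RID and save it to disk."""
    with urlopen(results_url(rid)) as response:
        html = response.read()
    with open(path, "wb") as handle:
        handle.write(html)
    return path


# Example use (needs network access and a RID that has not expired):
# save_results("YOUR_RID_HERE", "klc_blast_results.html")
```

Link to the saved file instead of the RID link, and the link will keep working after the RID is gone.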

Monday, October 10, 2005

IBM: Hands off our DNA


Today's NY Times has an interesting article about IBM's policy of not using genetic data in hiring decisions or in determining eligibility for company health benefits. IBM may be the first major corporation to explicitly prohibit genetic discrimination.

"What I.B.M. is doing is significant because you have a big, leadership company that is saying to its workers, 'We aren't going to use genetic testing against you,' " said Arthur L. Caplan, director of the Center for Bioethics at the University of Pennsylvania medical school.


IBM's action will hopefully serve as an example to other corporations, and to Congress.

This year, the Senate passed a genetic information nondiscrimination bill, by a vote of 98 to 0, and the House is now considering similar legislation. Two years ago, after the Senate passed a genetic privacy act, the House never voted on the legislation. But House sponsors are more optimistic this time. Also, about 40 states have laws that address some aspect of genetic privacy and discrimination.

To some extent, the privacy provisions in existing statutes like the Health Insurance Portability and Accountability Act and disability and civil rights laws already address the issue. They include prohibitions against using personal medical information to discriminate against people in hiring and in providing health insurance.

In an e-mail message to be sent to all I.B.M. employees today, Samuel J. Palmisano, I.B.M.'s chief executive, writes that the spread of gene-testing and genomic research is "enormously promising - but it also raises very significant issues, especially in the areas of privacy and security."

I.M.A.G.E. consortium


The I.M.A.G.E. consortium is a database and "stock center" for finding and obtaining cDNAs, libraries, and vectors. Their database is searchable using IQ and IMAGEne.

TIGR gene indices


The Institute for Genomic Research (TIGR) has great resources for studying ESTs and genomic DNA from a variety of organisms; here are links to a few of them.

TIGR BLAST: Sequence searches of dozens of organisms.
Eukaryotic Gene Orthologs: Searchable database of orthologous genes.
Unfinished Genomes Search: BLAST searches of sequencing projects currently underway at TIGR.
TIGR Genome Projects: Self-explanatory; pretty cool.

Fisher exact test


The Fisher exact test is a statistical approach for comparing the composition of two collections (organisms, genes, etc.). It is used to test the hypothesis that a given gene is not differentially represented in two cDNA libraries, as sampled by random sequencing. The result of the equation
p = [N! M! c! C!] / [(N+M)! a! b! A! B!]
is the probability value p.

The null hypothesis is rejected if
p <= 0.05/G, where 0.05 is the significance level and G is a correction factor based on the Bonferroni inequality, equal to the total number of possible tests. In the case of DDD, this value is:
G = #UniGene clusters represented in analysis * #Pools * (#Pools - 1) / 2
Therefore, whether a gene is called differentially represented depends on both the difference in the number of times it appears in the two cDNA libraries AND the total number of genes being compared, because the cutoff 0.05/G shrinks as G grows.
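
To make the arithmetic concrete, here is a minimal Python sketch of the formula and the Bonferroni cutoff. The post does not define the symbols, so the reading used here is an assumption: a and b are the numbers of ESTs for the gene in the two libraries, N and M are the total ESTs sequenced from each library, A = N - a, B = M - b, c = a + b, and C = A + B. The function names are made up for this example. Note that the formula gives the probability of the observed counts only; it does not sum over more extreme outcomes.

```python
from math import comb


def table_probability(a, b, N, M):
    """p from the formula above for one gene in two libraries.

    a, b: ESTs for the gene in library 1 and library 2 (assumed meaning)
    N, M: total ESTs sequenced from library 1 and library 2 (assumed meaning)
    """
    if not (0 <= a <= N and 0 <= b <= M):
        raise ValueError("gene counts must lie between 0 and the library size")
    c = a + b
    # N! M! c! C! / ((N+M)! a! b! A! B!) is the same quantity as
    # C(N, a) * C(M, b) / C(N + M, c), which avoids huge factorials.
    return comb(N, a) * comb(M, b) / comb(N + M, c)


def bonferroni_threshold(n_clusters, n_pools, alpha=0.05):
    """Cutoff alpha / G, with G as defined above for DDD."""
    G = n_clusters * n_pools * (n_pools - 1) / 2
    return alpha / G


def is_differential(a, b, N, M, n_clusters, n_pools):
    """True if the null hypothesis is rejected at the corrected cutoff."""
    return table_probability(a, b, N, M) <= bonferroni_threshold(n_clusters, n_pools)


# Hypothetical counts: 40 of 10000 ESTs in one library, 2 of 10000 in the other,
# with 1000 UniGene clusters compared across 2 pools.
print(table_probability(40, 2, 10000, 10000))
print(bonferroni_threshold(1000, 2))
print(is_differential(40, 2, 10000, 10000, 1000, 2))
```

The counts in the example are invented purely to show how the pieces fit together; they do not come from any real library.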

Digital Differential Display


A nice feature of the NCBI UniGene page is Digital Differential Display (DDD). DDD indirectly compares gene expression levels in different tissue types by comparing the number of ESTs identified during the random sequencing of cDNA libraries. Statistical tests are used to determine whether observed differences in the expression of a particular gene are nonrandom. You pick the libraries you wish to compare, and the program finds statistically significant differences in cDNA representation between (or among) the libraries. Here's a link to one I did comparing Drosophila head and embryo cDNA libraries. Genes important for eye function, such as arrestin and ninaE, are overrepresented in the head library, while genes important for embryonic development, such as lola and armadillo, are overrepresented in the embryonic library. Cool!