BFG@University of Richmond

Friday, September 30, 2005

Haemonchus contortus Kinesin

Haemonchus contortus

A common parasite infecting sheep and goats, H. contortus is of major economic importance. A search for kinesin homologues to the Drosophila P17210 kinesin protein revealed a large number of hits in the H. contortus chromosome. By performing BLAST searches on the sequences found to be similar, their individual similarity to known kinesins could be determined, and through that method six likely kinesin homologues were identified. This procedure of working with a non-annotated section of sequenced DNA made clear the importance of having properly annotated sequences, beyond simply having masses of sequenced DNA.

Sequencing by the Wellcome Trust Sanger Institute.
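
Since blastx compares a nucleotide query to protein databases by translating it in all six reading frames, here is a minimal sketch of that translation step. It assumes the standard genetic code and plain A/C/G/T input; the function names are made up for illustration and are not part of any BLAST tool.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (case-insensitive)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons not in the table become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return three forward-strand and three reverse-strand translations."""
    rc = reverse_complement(dna)
    return [translate(strand[offset:]) for strand in (dna, rc) for offset in range(3)]
```

Calling `six_frames` on one of the contigs below gives the six candidate protein strings that a protein-level search would work from; stop codons show up as `*`.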

Contig52451, P-value: 7.8e-30
AAGAGGTATAACGTGTTTGCAATTTCTTCCCTATCAAAAAGCAGCAAGC
AAGAGTGAGAGTTCGAAGTTGGAAATATCGCAAACAGCAAAATAGTTC
CATTTTTCAGGGAAAAGTATTCGTGTACGACAAAGTCTTTAAACCGAAT
ACAACCCAGGAGCAAGTTTATATGGGTGCAGCCTATCACATTGTCCAAG
ATGTTCTGTCGGGTTACAATGGGACAGTATTCGCCTACGGTCAGACGTC
TTCTGGTAAAACTCACACAATGGAG
Blast results (blastx): kinesin family members present

Contig23052, P-value: 4.0e-11
GTTGGAGCGACTCAGATGAACGAGGAAAGCAGTCGGTCCCATGCTATGT
TTTCGGTGACGGTTGAAGCGAGCGAAAGAGGCATGGTTACTCAGGGAAA
GTTGCATCTCGTCGATCTAGCGGGTAGCGAACGTCAGAGCAAAACTGGA
GCCGCTGGA
Blast results (blastx): kinesin family members present

Contig76363, P-value: 3.9e-09
CATATACCTTACCGTAATTCGAAACTGACTATGATTCTGCGCGATAGCCT
GGGAGCTGGTAATTCCAAGACGATGGTCATCGTTGCATTGAACCCCTCTG
CGGCTCAAATCGCAGAAACCAAACGAAGCCTGGAGTTCGCACAACAG
Blast results (blastx): kinesin family members present

Contig65838, P-value: 1.6e-06
CAGTCGCTTAGTGTGTTGGGTCGCGTAATCAGAACGCTGTCTGCAGTACA
TCGACGTGATGAGTTCGTGCCATATCGCGATTCTAAACTGACTCACATTT
TGCGAGATTCCCTCGGAGGAAACAGCAAGACAGCAGTGATTGTTAATCTT
CATCCC
Blast results (blastx): kinesin family members present

Contig22481, P-value: 1.7e-06
GAGAGCCAGaaGAGATTTGTTGATTTCAGCTCCCTCATGTCGCGTTTGCCT
ATCAGAATTACCGGTATCTTGACCACGTTCATTACCCGCTAGGTCAatcaaA
GAGAACTTCCCCCATAACtttttaTCTCtgaaaaaTggaagcTGagaacTAGCGaga
acgcGATgaaaaaaaaacgcactttcgaagcacaatttgaaaacggcatgcgatctgcttgaattg
gaatttgcagaggttgcacccgcagtgcgcatatcggttccttgtcgaatgagttccaggac
Blast results (blastx): kinesin family members present

Contig18696, P-value: 4.4e-05
TCCAGCGGCTCCTGTTTTGCTCTGACGTTCGCTACCAGCCAGATCGACGAG
ATGCAACTTTCCCTGAGTAACCATGCCTCTTTCGCTGGCTTCAACCGTCAC
CGAAAACATAGCATGGGAACGACTGCTTTCCTCGTTCATCTGAGTCGCTCC
AAC
Blast results (blastx): kinesin family members present

Thursday, September 29, 2005

Cow Kinesin

Bos taurus

I used the first 350 amino acids of the Drosophila melanogaster kinesin heavy chain protein (accession number P17210) as the query for all of the searches.


  • Using e!Ensembl Cow BLAST, 178 hits were retrieved with E-values ranging from 4.6e-203 to 8.7. Of these, 161 hits had an E-value of less than 0.005.
  • Using PSI-BLAST, there were 80 hits by the second iteration and no new sequences above the 0.005 threshold by the third iteration.

I feel that the e!Ensembl website is a great resource for BLAST searches, as its interface gives you more options than NCBI for arranging your results after the search is done. Still, having learned to run BLAST searches on the NCBI website, I find it difficult to adjust my research strategy.

For more information about the Cow genome:

NCBI Cow Genome Resources

Bovine Genome Project

Scientist to Start the Bovine Genome Project (March 4, 2003)

Maize Kinesin

A tblastn search was performed on the TIGR gene indices. The Drosophila kinesin heavy chain (P17210) sequence was used as the query against the maize (Zea mays) genome to identify maize kinesins. The hits from this search are listed below.


maizeTC249544 kinesin-like calmodulin binding protein E-value 2.4e-53
maizeTC252900 weakly similar to Kinesin-like protein KIFC3 E-value 1.7e-52
maizeTC260837 kinesin heavy chain Zea mays E-value 3.5e-52
maizeTC260836 Kinesin heavy chain E-value 7.7e-49
maizeTC252595 Kinesin heavy chain E-value 2.1e-47
maizeTC272656 Kinesin heavy chain E-value 8.0e-47
maizeTC266490 Kinesin heavy chain E-value 5.4e-45
maizeTC252863 Kinesin heavy chain E-value 5.6e-44
maizeTC250806 similar to Kinesin-like protein E-value 1.3e-37
maizeTC265814 similar to Kinesin-like protein E-value 1.7e-37
maizeTC273975 homologue to Kinesin heavy chain E-value 4.3e-34
maizeTC272467 Kinesin heavy chain E-value 1.5e-33
maizeTC252410 Kinesin heavy chain E-value 8.2e-33
maizeTC250055 kinesin heavy chain [Zea mays] E-value 2.9e-32
maizeTC253517 similar to Kinesin-protein KIFC3 E-value 6.5e-31
maizeTC259401 similar to T5A14.3 protein E-value 5.0e-30
maizeTC265827 homologue to UPQ94G20 E-value 6.1e-29
maizeTC265764 similar to Kinesin-like protein E-value 1.0e-28
maizeTC268301 similar to Kinesin-like protein E-value 1.1e-27
maizeTC266881 homologue to Kinesin-like protein E-value 3.2e-27

Streptomyces coelicolor kinesins

Streptomyces coelicolor

I tried looking on the TIGR website to see if my species was there, but unfortunately it wasn’t. So I went to the Sanger Institute website mentioned in one of the previous posts, which has a specific BLAST server for Streptomyces coelicolor. I fiddled around on this site, which has some nice links in the sidebar, allowing you to do things like perform DNA motif searches, look at protein clusters, inspect the clone map, and find general information about S. coelicolor. I then did a tblastn search under “S. coelicolor genome sequence EMBL entries” and got the following results of 4 distinct kinesins:

Streptomyces coelicolor A3(2)

Segment 20/29
Score: 213
E value: 2.9e-10

Segment 13/29
Score: 102
E value: 0.19

Segment 5/29
Score: 98
E value: 0.95

Segment 8/29
Score: 102
E value: 0.999

Overall, the Sanger Institute website was helpful. I did feel like there were some drawbacks in the BLAST searching. For example, there were not as many search options available as there are on the NCBI BLAST site. But it was a useful site, and had a lot of information on many species.

Wednesday, September 28, 2005

Pine Kinesin

Pine (Pinus taeda) Kinesins
A tblastn search on TIGR was run against the pine genome using the FASTA-format sequence of the Drosophila melanogaster kinesin heavy chain (P17210). This was done to determine the number of kinesins in pine trees. The FASTA-format protein sequence was found using the accession number as a query in Entrez Protein. A total of 20 hits were found with decent scores and E values. However, upon further analysis of each sequence, several were found not to be kinesins. For example, one protein was found to be a neurofilament protein. In general, the TIGR BLAST website provided several benefits over the NCBI BLAST website. TIGR BLAST allowed searches against more specific organism genomes, even incomplete ones. For example, NCBI did not have the pine genome in its database, while TIGR allowed a query search against the pine genome. The result format was similar to NCBI BLAST results, and TIGR BLAST even had links to the proteins on NCBI. One drawback of TIGR BLAST was the slow execution of the search: some searches took several minutes, compared to about 15 seconds with NCBI BLAST.

The total Database: Pine 45,557 sequences; 36,548,862 total letters

pineDR691181 similar to UPQ69UI4 (Q69UI4) Kinesin 1-like
Score 409 E value 1.6e-51
pineTC61066 similar to UPQ9M0X6 (Q9M0X6) Kinesin-like protein
Score 389 E value 2.4e-35
pineTC62224 similar to UPATK1_ARATH (Q07970) Kinesin 1
Score 327 E value 1.5e-28
pineDR080983 similar to GBBAB09933.110176794AB022214 kinesin-like protein
Score 275 E value 7.7e-23
pineBM157420 similar to UPQ9LXV6 (Q9LXV6) Kinesin-like protein
Score 185 E value 3.9e-22
pineTC67370 similar to UPQ6WJ05 (Q6WJ05) Central motor kinesin 1
Score 258 E value 1.1e-18
pineCF666088 similar to UPQ75UP8 (Q75UP8) Kinesin heavy chain-like protein
Score 202 E value 1.7e-17
pineDR182269 homologue to UPQ6WJ05 (Q6WJ05) Central motor kinesin 1
Score 178 E value 3.3e-11
pineDR684708 similar to UPQ6YZ52 (Q6YZ52) Kinesin motor protein 1-like
Score 176 E value 4.5e-11
pineDR100363 UPKINH_GIBMO (Q86Z98) Kinesin heavy chain
Score 152 E value 1.3e-09
pineBF610071 similar to GBBAB19066.111761076AP002744 kinesin-like protein
Score 114 E value 2.4e-05

Plant Kinesin Lab 2

Vitis vinifera (grape) Kinesin!

  • grapeTC48382
    Kinesin-like calmodulin binding protein, partial
    Score: 362; E value: 1.2e-32
  • grapeTC43494
    Kinesin-like protein KIFC3, partial
    Score: 311; E value: 4.6e-31
  • grapeTC39410
    Central motor kinesin 1, partial
    Score: 363; E value: 9.2e-31
  • grapeTC44010
    Kinesin-related protein-like, partial
    Score: 210; E value: 3.7e-16

I tried several different methods to locate the Vitis vinifera kinesins before I found a way that worked. I have to admit I did try googling "grape kinesin", but alas it didn't yield any helpful results. I also tried browsing the NCBI site, as that worked for some other groups, but didn't really find anything helpful there either. I browsed the TIGR site for a while and tried searches with part of the nucleotide sequence for the P17210 protein (to capture only the head region) and with the whole nucleotide sequence. No luck there. I also tried part of the protein sequence, but eventually found that performing a tblastn search on the TIGR database using the whole protein sequence (which I retrieved in FASTA format) gave me the best results. I did have to analyze the results and select the proteins that were most related to the head region (first 350 amino acids). I had to do this rather than copy and paste the results because some of the other heavy chain regions had higher similarity scores than the head region and were therefore mixed into my main results. I also found that some of the results were the same protein repeated. Other than that, I'd have to say that the TIGR website was a good resource for finding a specific species' kinesins. I would like to learn more about how to limit the parameters and manipulate the results a little more.
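
The manual step described above (keeping only hits that align to the head region and dropping repeats) could be sketched roughly as follows. This is only an illustration: the `Hit` record, its field names, and the 50-residue minimum overlap are assumptions, not anything the TIGR site provides.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject_id: str
    query_start: int  # 1-based position in the query protein
    query_end: int
    e_value: float


def overlaps_head(hit, head_end=350, min_overlap=50):
    """True if the alignment covers at least min_overlap residues of 1..head_end."""
    overlap = min(hit.query_end, head_end) - hit.query_start + 1
    return overlap >= min_overlap


def head_hits(hits, head_end=350, min_overlap=50):
    """Keep head-region hits, one per subject, best E value first."""
    best = {}
    for hit in hits:
        if not overlaps_head(hit, head_end, min_overlap):
            continue
        current = best.get(hit.subject_id)
        if current is None or hit.e_value < current.e_value:
            best[hit.subject_id] = hit
    return sorted(best.values(), key=lambda h: h.e_value)
```

Hits that align only to the rest of the heavy chain are dropped even when their scores are high, which matches the selection done by hand here.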

Xenopus tropicalis

Kinesin of Xenopus tropicalis


I tried three different organism-specific BLAST sites (TIGR, Ensembl, and JGI) for the frog species Xenopus tropicalis. The non-default TIGR parameters I used were a tblastn search using P17210 as the query, a cutoff of 80, and an E value cutoff of less than or equal to 0.00001. However, there was no option allowing a search with only the first 350 aa (the head domain of KHC). Therefore, I selected the first 350 aa from the FASTA format of the Drosophila KHC and pasted them into TIGR in FASTA format. My results showed 36 hits (TIGR results), all with statistically significant E values and bit scores suggesting homology. I checked my results by clicking on the GenBank accession number, clicking "links" (NCBI website), and then "unigene" (NCBI website).
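
Trimming the query to the head domain by hand, as described above, can also be done with a few lines of code. This is a minimal sketch assuming a single-record FASTA string; the function names are made up for illustration.

```python
def read_fasta(text):
    """Parse a single-record FASTA string into (header, sequence)."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    return lines[0][1:], "".join(lines[1:])


def head_domain_fasta(text, length=350, width=60):
    """Return a FASTA record holding only the first `length` residues."""
    header, sequence = read_fasta(text)
    head = sequence[:length]
    # Wrap the sequence at a fixed line width, as FASTA files usually are.
    body = "\n".join(head[i:i + width] for i in range(0, len(head), width))
    return f">{header} residues 1-{len(head)}\n{body}\n"
```

The returned text can be pasted directly into a search form that has no option for restricting the query range.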

The JGI website (genome.jgi-psf.org) has an organism-specific database for Xenopus tropicalis that has recently been updated to version 4.1. The search yielded hits with significantly low E values (see JGI Results). Clicking on the GenBank link and then the "more info" link reveals JGI's description of the protein. In this case, the #1 hit has a molecular aspect of motor activity, a functional aspect of ATP binding, and a cellular component of microtubule-associated complex. The protein is classified as part of the kinesin SMY1 subfamily, and this 350 aa region is the kinesin protein's motor region.

The Ensembl Multi View search yielded significantly more hits, 413 (Ensembl Results). However, many of these hits may be redundant because I used the entire 961 aa of the query (P17210). The top hit had a score of 317 and an E value of 1.9e-21. In addition, the results had a nice karyotype diagram.

All in all, these three organism-specific BLAST sites were much more useful, because of their specificity, than just using the NCBI website with the nr database.

Modification of Search for All Vertebrate Kinesin: psiBLAST

This week in lab we performed a position-specific iterated BLAST, or psiBLAST, for all vertebrate kinesins using the Drosophila KHC as our query. This particular BLAST search is designed for finding distantly related proteins. psiBLAST uses a position-specific scoring matrix (PSSM) and is therefore more sensitive than a regular BLAST search.
Our initial results yielded 995 hits, with the top hit being KIF5C protein [Homo sapiens] (score 520 and an E value of 9e-147) (psiBLAST Results). Our hits continued to increase after every iteration.

The Taxonomy Report

After 5 iterations, psiBLAST had yet to report convergence (no new results found). We ran into the problem of having too many hits. Therefore we modified our search parameters by typing "refseq" in the "limit by Entrez query" box. Although this most likely eliminated redundant database matches, it overrode our restriction to vertebrate kinesins. In the end we got only 17 hits, of which more than half were non-vertebrate.
In conclusion, we had more success with the BLASTp search. Although we ran into the problems of both too many and too few hits with psiBLAST, we were able to recognize the potential of this specific BLAST. More time needs to be spent working with the psiBLAST parameters in order to get the kind of results we are looking for.
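
To make the idea of a position-specific scoring matrix more concrete, here is a toy sketch that builds one from a small gap-free alignment. It is a simplification under stated assumptions: a uniform background frequency of 1/20 and a flat pseudocount, whereas the real program also weights sequences and uses observed background frequencies. The function names are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column.

    Assumes equal-length, gap-free sequences and a uniform background.
    """
    background = 1.0 / len(ALPHABET)
    pssm = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column) + pseudocount * len(ALPHABET)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in ALPHABET
        })
    return pssm


def score(pssm, sequence):
    """Sum the position-specific scores for a sequence of the same length."""
    return sum(column[aa] for column, aa in zip(pssm, sequence))
```

Unlike a fixed matrix such as BLOSUM62, the same residue can score differently at each position, which is why each iteration can pull in more distant relatives (and, as we saw, more hits).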

Honey Bee Kinesins

Apis mellifera

In order to find all the known (sequenced) kinesins of Apis mellifera (the honey bee), I first went to the e!Ensembl database and did a blastp search of the honey bee genome with the kinesin heavy chain sequence (amino acids 1-350) of Drosophila melanogaster as the query. The search worked, but it yielded many insignificant results, even though they had very low E values. To find better results, I used NCBI's PSI-BLAST program. I limited the search to Apis mellifera only, and no iterations were necessary. It returned 21 hits, all representing non-repeated kinesin proteins. I didn't have much trouble finding the kinesins, and would recommend the e!Ensembl and NCBI databases for specific BLAST searches.
Links to Search Results:
e!ensembl Blast Search Results
Results of PSI-Blast
PSI-Blast Taxonomy View

Plant blast searches

Arabidopsis
Brassica
TIGR plant sequence database

Organism-specific BLAST searches

Here are some nice alternatives to NCBI:

e! (ensembl)

EMBL-EBI
EMBL-EBI BLAST
TIGR databases -especially good for plant genomics
TIGR Sequence search
WU-BLAST2
Sanger Institute (really good)

vertebrate (chordate) blast searches

Ciona intestinalis (sea squirt)
S. purpuratus (sea urchin)
Danio rerio (zebrafish)
NOTE: The vertebrates get mad props from the general organism-specific sites such as NCBI.

Invertebrate blast searches


Dictyostelium discoideum (slime mold)
Streptomyces coelicolor A3(2) (soil bacterium)
Insects (cool interface)
Mosquito
Protozoans
nematode
plasmodium

Chapter 5 links


Hi. Here are LOTS of good links for organism-specific sequence databases and software alternatives to BLAST at NCBI.

Tuesday, September 27, 2005

Background about extreme value distribution

In class we are discussing the statistical parameters surrounding BLAST sequence searches. BLAST scores fit an extreme value distribution. There are three types; the Gumbel is of most interest to us. Extreme value theory is a branch of statistics that deals with predicting values _extremely_ different from the median. It has been shown empirically that extreme value distributions have better predictive power for rare events than other distributions, such as the normal distribution. Here's some information I found about extreme value distributions in other contexts:

The washer factory:

Background of the Generalized Extreme Value Distribution

Like the extreme value distribution, the generalized extreme value distribution is often used to model the smallest or largest value among a large set of independent, identically distributed random values representing measurements or observations. For example, you might have batches of 1000 washers from a manufacturing process. If you record the size of the largest washer in each batch, the data are known as block maxima (or minima if you record the smallest). You can use the generalized extreme value distribution as a model for those block maxima.

The generalized extreme value combines three simpler distributions into a single form, allowing a continuous range of possible shapes that includes all three of the simpler distributions. You can use any one of those distributions to model a particular dataset of block maxima. The generalized extreme value distribution allows you to "let the data decide" which distribution is appropriate.
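
The washer example can be tried out with a short simulation. This is only a sketch: the normal distribution for washer sizes, its mean and standard deviation, and the batch counts are all made-up assumptions.

```python
import random
import statistics


def block_maxima(n_batches=200, batch_size=1000, mean=10.0, sd=0.1, seed=0):
    """Simulate washer batches and return the largest size in each batch."""
    rng = random.Random(seed)
    return [
        max(rng.gauss(mean, sd) for _ in range(batch_size))
        for _ in range(n_batches)
    ]


if __name__ == "__main__":
    maxima = block_maxima()
    print("smallest block maximum:", round(min(maxima), 3))
    print("mean of block maxima:  ", round(statistics.mean(maxima), 3))
    print("largest block maximum: ", round(max(maxima), 3))
```

Even though individual washers are centered on the assumed mean, every block maximum sits well above it, and a histogram of the maxima would show the skewed shape that the extreme value distributions are meant to model.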

Size of a 100-year flood

A one-hundred year flood is calculated to be the maximum level of flood water to be expected in an average one-hundred-year period. Sometimes, a 500-year or 10,000-year flood is also calculated (especially, in low lying countries, such as the Netherlands). The 100-year flood is sometimes referred to as the 1% flood, since there is a 1% chance of it occurring in any year. Based on the expected water level, an expected area of inundation may be mapped out according to elevation above sea level. This area figures very importantly in building permits, environmental regulations, and flood insurance.

The mathematical field of extreme value theory was created to model rare events such as 100-year floods for the purposes of civil engineering.
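
As a small worked example of the "1% flood" idea, the chance of seeing at least one such flood over a span of years can be computed directly. The sketch assumes that years are independent and that the annual probability stays constant, which real flood records may not satisfy.

```python
def prob_at_least_one(annual_probability, years):
    """Chance of at least one event in `years` independent years."""
    return 1.0 - (1.0 - annual_probability) ** years


def return_period(annual_probability):
    """Average number of years between events (e.g. 100 for a 1% flood)."""
    return 1.0 / annual_probability
```

Under these assumptions, `prob_at_least_one(0.01, 100)` is about 0.63, so a 100-year flood is more likely than not to occur within any given century, but it is far from guaranteed.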


Wikipedia entry has good links, too
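
Coming back to BLAST: the extreme value distribution is what links an alignment score to an E value. The sketch below uses the standard Karlin-Altschul relations E = K·m·n·e^(-λS) and P = 1 - e^(-E). The λ and K values are illustrative assumptions only; the values that BLAST reports at the bottom of its output for the chosen matrix and gap costs should be used in practice.

```python
import math

# Illustrative parameters (assumed); use the values BLAST reports in practice.
LAMBDA = 0.267
K = 0.041


def bit_score(raw_score, lam=LAMBDA, k=K):
    """Normalize a raw alignment score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score, query_len, db_len, lam=LAMBDA, k=K):
    """Expected number of chance alignments scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


def p_value(e):
    """Probability of at least one chance alignment, given an E value."""
    return -math.expm1(-e)
```

For small E values, P and E are nearly identical; they only part ways as E approaches 1. An E value of 0.19, for example, corresponds to a P value of about 0.17.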

Monday, September 26, 2005

Arthropod Kinesins

Kinesin proteins are critical to the transport of various cellular components in many organisms. Since different classes of cargo are often transported by different kinesins, there is a huge diversity of kinesin transporters. Using a blastp search to identify homologues among arthropods to the head region of a Drosophila kinesin protein identified 151 kinesin-related sequences from a total of seven species.

The following organisms were represented:

  • Drosophila melanogaster 76 hits
  • Drosophila yakuba 1 hit
  • Drosophila pseudoobscura 23 hits
  • Anopheles gambiae str. 24 hits (malaria-carrying mosquito)
  • Apis mellifera 24 hits (honeybee)
  • Bombyx mori 1 hit (silk moth)
  • Lymantria dispar 2 hits (moth)

The blastp search was used for its ability to search between proteins of distant homology. The head region (amino acid residues 1 to 350) of the Drosophila kinesin protein P17210 was used as the query sequence, with the search limited to arthropods and entries with Entrez kinesin references. The default settings for the search matrix (BLOSUM62) and gap penalties (eleven for existence, one for extension) were used because of their wide application. The search reveals that many kinesin or kinesin-like proteins have been found in a wide variety of organisms, even among arthropods, few of which, aside from Drosophila, are genetically well-characterized.
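
For readers unfamiliar with the gap penalties mentioned above, here is a minimal sketch of how an affine gap cost works. It assumes the convention in which a gap of length k costs the existence penalty plus k times the extension penalty; the function name is made up for illustration.

```python
def gap_cost(length, existence=11, extension=1):
    """Affine penalty for a single gap spanning `length` residues."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return existence + extension * length
```

With the defaults, one gap of three residues costs 14, while three separate one-residue gaps cost 36, so the alignment favors a few longer gaps over many short ones.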

Sunday, September 25, 2005

Vertebrate Kinesins

Created by: Jason Close, Matt Morale and Lisa Warner

blastp

A BLAST search revealed 1033 predicted vertebrate kinesins. Vertebrate Kinesin BLAST Results

When broken down by organism, we see that human has 248 predicted kinesins (Vertebrate Kinesin Tax BLAST Report). Some of these 248 kinesins are repeats of one another, and others are not kinesins at all. According to the Kinesin Homepage there are 41 kinesin proteins currently known in humans.

The animals that the 1033 predicted kinesins come from are:
  • Bos taurus 30 hits
  • Canis familiaris 144 hits
  • Coturnix coturnix 3 hits
  • Danio rerio 60 hits
  • Gallus gallus 47 hits
  • Homo sapiens 248 hits
  • Macaca fascicularis 15 hits
  • Morone saxatilis 6 hits
  • Mus musculus 178 hits
  • Oryzias latipes 12 hits
  • Pan troglodytes 32 hits
  • Pongo pygmaeus 13 hits
  • Rattus norvegicus 70 hits
  • Tetraodon nigroviridis 39 hits
  • Xenopus laevis 63 hits
  • Xenopus tropicalis 5 hits

Our methodology for finding all vertebrate kinesins was to use a BLASTp search. We used the Drosophila KHC accession number (P17210) as our query sequence. Next we set our subsequence limits from 1 to 350 amino acids, as the first 350 aa form a conserved head domain in most known kinesins. We used the default nr database as the subject because it includes all of the other databases and is therefore the broadest. In the options for advanced blasting section, we limited our search to Vertebrata [ORGN] proteins only. We kept the default matrix (BLOSUM62) as well as the default gap costs (existence 11, extension 1). Lastly, we increased descriptions and alignments to 1000. An important part of BLAST searching is deciding whether an alignment is significant. Our cutoff for evaluating whether or not each protein was homologous to our query was an E value of less than or equal to 0.01. In addition, we were able to expand and organize our results using the taxonomy tool.
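
The per-organism breakdown from the taxonomy tool can be approximated from saved hit descriptions, which end with the organism name in square brackets. This is a rough sketch that assumes one description per line in that format; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches a trailing "[Genus species]" tag at the end of a description line.
ORGANISM = re.compile(r"\[([^\[\]]+)\]\s*$")


def tally_by_organism(descriptions):
    """Count hit descriptions by the organism named in trailing brackets."""
    counts = Counter()
    for line in descriptions:
        match = ORGANISM.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

`tally_by_organism(lines).most_common()` then lists organisms from most to fewest hits. Note that this counts hits, not distinct proteins, so repeated entries still inflate the totals.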

Wednesday, September 21, 2005

Plant Kinesin

Plant Kinesin Taxonomy
Plant Kinesin Blast Search

A blastp search yielded 241 hits from 23 organisms in the plant kingdom. Arabidopsis thaliana retrieved the most hits, 136, but according to the Kinesin Homepage this species has 61 recognized kinesins.

The organisms documented are
  • Arabidopsis thaliana (thale-cress) 136 hits
  • Brassica oleracea 1 hit
  • Brassica rapa (field mustard) 1 hit
  • Brassica rapa subsp. pekinensis (bai cai) 1 hit
  • Gossypium hirsutum (cotton) 3 hits
  • Cannabis sativa (marijuana) 1 hit
  • Arachis hypogaea 1 hit
  • Pisum sativum (peas) 1 hit
  • Glycine max (soybeans) 1 hit
  • Daucus carota (carrots) 1 hit
  • Nicotiana tabacum (tobacco) 16 hits
  • Solanum tuberosum (potatoes) 2 hits
  • Lycopersicon esculentum (tomato) 3 hits
  • Ipomoea batatas (sweet potato) 1 hit
  • Solanum demissum 2 hits
  • Coptis japonica 1 hit
  • Oryza sativa (japonica cultivar-group) (Japanese rice) 49 hits
  • Zea mays (maize) 13 hits
  • Picea abies 1 hit
  • Welwitschia mirabilis 1 hit
  • Chlamydomonas reinhardtii 3 hits
  • Stichococcus bacillaris 1 hit
  • Volvox carteri 1 hit

Dog kinesin


Canis familiaris (dog)



- e!Ensembl Dog BlastView was employed at ensembl.org to align the sequence of D. melanogaster kinesin heavy chain (P17210) against the Canis familiaris genome. The parameters for the search were as follows: E <= 0.5, filter = seg, length = 3, hitdist = 40, extension = 2:1, threshold = 15.

- Results: 801 alignments, S values: 79-2828, E values: 0-0.0036

- Because e!Ensembl returned too many redundant or inconsequential alignments, a blastp search of the NCBI database was performed with a query of aa's 1-350 of D. melanogaster KHC (P17210). The blastp algorithm proved to be much more selective, returning alignments of only 10 dog kinesin isoforms, all with E values of 1e-135 or better: Taxonomy Report. However, it was suspected that this search was too selective, and that the use of a position-specific scoring matrix would better achieve a balance between selectivity and sensitivity and correctly identify the known dog proteins homologous with fruit-fly kinesin.

- A psi-BLAST search of aa's 1-350 of P17210 against the dog genome, with max E set at 0.005, returned 88 hits within the set parameters (max E = 1e-4) and 2 outside of them (E = 3.6 and 6.3). For some reason hitting "2nd iteration" does nothing, but it shouldn't matter since the E values of hits 88 and 89 are so far apart. Also, I couldn't paste a working link to the psi-BLAST results into the blog, so I hope you didn't have your hopes too high for that.

*Overall, psi-BLAST proved to be the most useful tool for running an organism-specific search for a specific protein (dog, kinesin). Its usefulness lay in striking a balance that avoids both excessive sensitivity (Ensembl) and excessive selectivity (blastp).

Saving BLAST search results as HTML files

I discovered (new to me) that you may save your blast search results as an HTML file. The nice thing about that is your ability to save the results, and then open them later in a web browser or Word. The links embedded within the page are preserved, although the graphics are a little sketchy.

Fungal kinesins

Fungal kinesins

Here's the comprehensive list of the fungi in which kinesins have been found:

Syncephalastrum racemosum 1 hit
Aspergillus nidulans FGSC A4 14 hits
Emericella nidulans 4 hits
Gibberella moniliformis 9 hits
Neurospora crassa 16 hits
Cochliobolus heterostrophus 11 hits
Magnaporthe grisea 9 hits
Haematonectria haematococca 3 hits
Cryptococcus neoformans 11 hits
Aspergillus fumigatus 13 hits
Ustilago maydis 521 12 hits
Ustilago maydis 4 hits
Botryotinia fuckeliana 12 hits
Gibberella zeae 13 hits
Yarrowia lipolytica 9 hits
Candida albicans 9 hits
Schizosaccharomyces pombe 22 hits
Debaryomyces hansenii 9 hits
Ashbya gossypii 3 hits
Eremothecium gossypii 3 hits
Saccharomyces cerevisiae 21 hits
Thermomyces lanuginosus 1 hit
Cryptococcus neoformans 8 hits
Candida glabrata 8 hits
Kluyveromyces lactis 7 hits
Encephalitozoon cuniculi 6 hits
Candida albicans 3 hits
Schizosaccharomyces pombe 972h- 1 hit
Debaryomyces occidentalis 1 hit




RID: 1127312899-19385-21661114281.BLASTQ3

I got 243 hits, 222 with a score>50. Just for kicks, I selected the top 10 hits by clicking the checkbox next to each pairwise alignment in the blast results page, and then clicked the button "Get selected sequences". The next window lists the Genbank entries for these 10 proteins. In this window, I clicked on the "Details" tab to see whether I could create a URL for the genbank entries page; it didn't work. Bummer.

Request ID numbers

Request IDs are the unique identifiers assigned to each BLAST search. They allow users to return to searches done previously, rather than having to repeat the search. Request IDs work because NCBI stores results for 24 hours, giving researchers the opportunity to retrieve or reformat data, rather than using search capacity for those functions. More info is available in section 6.6 of the "Getting started" (click above for link). For example, copy the following Request ID

RID: 1127311572-2598-156417425331.BLASTQ3

into the "Type in the request ID" window on the "retrieve results" page in the META section at http://www.ncbi.nlm.nih.gov/BLAST/
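
A Request ID can also be turned into a results link programmatically. The sketch below only builds the URL and does not contact NCBI. The `CMD`, `RID`, and `FORMAT_TYPE` parameters reflect my understanding of NCBI's BLAST URL interface and should be checked against the current documentation; the RID used in the usage example is a placeholder, since real ones expire.

```python
from urllib.parse import urlencode

# Assumed base address of the NCBI BLAST web service.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def rid_results_url(rid, format_type="Text"):
    """Build a URL that asks for the stored results of one request ID."""
    query = urlencode({"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type})
    return f"{BLAST_URL}?{query}"


if __name__ == "__main__":
    print(rid_results_url("EXAMPLE123"))  # placeholder RID, not a real search
```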

How to link pubmed search results

1) Do pubmed search.
2) On results page, click on "Details" tab.
3) On Details page, click on "URL" button. This will send you to a new page.
4) Copy the text in the URL window on the new page; this "permanent" URL can be used as a web site link in Blogger or other programs.
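
The same kind of link can be assembled in code. This sketch assumes the present-day PubMed address and its `term` query parameter rather than the "Details" tab described in the steps above, so treat the URL pattern as an assumption to verify.

```python
from urllib.parse import urlencode


def pubmed_search_url(term):
    """Build a shareable PubMed search link for a query string."""
    return "https://pubmed.ncbi.nlm.nih.gov/?" + urlencode({"term": term})


if __name__ == "__main__":
    print(pubmed_search_url("kinesin AND Drosophila"))
```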

Friday, September 16, 2005

From Dayhoff to the airport

I'm off to a science teaching meeting at Furman in Greenville, SC. Richmond International Airport has FREE wireless internet. I'm not sure that people would come to the airport just to use the wireless connection, but it's a nice perk, and it allowed me to post.

Today in class we talked about the theory of scoring matrices in pairwise sequence alignment of protein sequences. I hope we can wrap our heads around the idea that pairwise alignment scores help us determine whether two proteins are similar because of homology or by random chance. Below the twilight zone of 20% identity, telling these possibilities apart becomes more difficult.

Now that we know how a scoring matrix such as PAM250 or BLOSUM62 works, we can talk about the computer programs that apply these scoring matrices.
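
As a reminder of where the numbers in such a matrix come from, here is a minimal sketch of a log-odds score. It assumes `pair_freq` is the frequency with which the ordered pair is seen aligned in trusted alignments, and that scores are scaled to half-bit units (scale = 2) and rounded. The frequencies in the usage example are made up, not taken from any real matrix.

```python
import math


def log_odds_score(pair_freq, freq_a, freq_b, scale=2.0):
    """Rounded log-odds score: scale * log2(observed / expected by chance)."""
    expected = freq_a * freq_b  # chance of the pair if residues were independent
    return round(scale * math.log2(pair_freq / expected))


if __name__ == "__main__":
    # Hypothetical frequencies: a pair seen four times more often than chance.
    print(log_odds_score(0.04, 0.1, 0.1))
```

A positive score means the pair is seen more often in related proteins than chance predicts, zero means chance level, and a negative score means the substitution is seen less often than chance.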

Have a good weekend!

Thursday, September 15, 2005

Human Disease reports

Yesterday we did presentations about "favorite" human diseases. Students used biomedical and general web sources to create blog entries about a disease of interest to them, and then presented their findings to the class. Nice job!

I will catalog some of the most useful links for general use.

Thanks again to the students for your hard work.

Wednesday, September 14, 2005

Colon Cancer

Introduction:

Over 130,000 cases of colorectal cancer are diagnosed annually, and the disease causes over 50,000 deaths. Colorectal cancer is the second most common cause of all cancer deaths. Colon and rectum cancers, together called colorectal cancer, afflict the lining of the large intestine. All forms of colon cancer arise from colorectal polyps that line the large intestine.

While there are environmental risk factors for cancer such as smoking, diet, and exposure to carcinogens, colon cancer is one of the most common inherited cancers. Hundreds of genes are involved in controlling cell division and ensuring that DNA replication functions properly. Important genes found to be involved in colorectal cancer include MSH2 and MSH6 on chromosome 2 and MLH1 on chromosome 3. When functioning properly, the protein products of these genes help repair DNA replication errors. If these genes are mutated, they code for faulty protein products that do not properly repair DNA replication errors. This can lead to irregular cell growth and, in this case, colon cancer.

Symptoms:

While most cases of colon cancer do not have symptoms, common symptoms of colon cancer include:


  • Pain in the lower abdomen
  • Blood in the stools
  • Weight loss without apparent reason
  • Diarrhea, constipation, or other unusual bowel habits that do not resolve
  • Stools that are narrower than usual

Prevention:

Regular physical examinations rarely indicate the presence of colon cancer, which is far more curable if found and treated in its early stages. Factors that increase the risk of colon cancer are colorectal polyps, a family history of colon cancer, and ulcerative colitis. Regular colonoscopy examinations are a very important preventive measure, especially if an individual possesses any of the above risk factors.

Treatment:

Common treatment includes removal of the lesion or of the part of the colon containing the tumor. Depending upon the stage of the cancer's development, chemotherapy with drugs such as 5-fluorouracil may be needed. More information about the diagnosis and treatment of colon cancer can be found at MedlinePlus Medical Encyclopedia.


Gene Homologs and Protein functions:

The MSH2 gene was discovered to be a human homolog of the E. coli mismatch repair gene mutS, which supports the idea of the MSH2 gene's involvement in DNA replication repair.

The MLH1 gene was identified as a frequently mutated locus in hereditary nonpolyposis colon cancer. It is a human homolog of the E. coli DNA mismatch repair gene mutL.

MSH6 also functions in damaged DNA binding and codes for a G-T binding mismatch repair protein.

Why mutations in genes necessary in all body tissues cause cancer mainly in the colon is not clear.

Information about the above genes and the function of their corresponding proteins was found using NCBI's Entrez Gene search engine.

Genes and Disease is a good source for general information on colon cancer.

Support groups:

CancerCare
Colon Cancer Network

Tuesday, September 13, 2005

Transmissible Spongiform Encephalopathies (TSEs)



Introduction
The prion diseases are a large group of related neurodegenerative conditions, which affect both animals and humans. Included are Creutzfeldt-Jakob disease (CJD) and Gerstmann-Sträussler-Scheinker (GSS) in humans; bovine spongiform encephalopathy (BSE), or mad cow disease, in cattle; chronic wasting disease (CWD) in mule deer and elk; and scrapie in sheep. These diseases all have long incubation periods but are typically rapidly progressive once clinical symptoms begin. All prion diseases are fatal, with no effective form of treatment currently; however, increased understanding of their pathogenesis has recently led to the promise of effective therapeutic interventions in the near future.

Prion diseases are unique in that they can be inherited, they can occur sporadically, and they can be infectious. The infectious agent in the prion disease is composed mainly or entirely of an abnormal conformation of a host-encoded glycoprotein called the prion protein. The replication of prions involves the recruitment of the normally expressed prion protein, which has mainly an alpha-helical structure, into a disease-specific conformation that is rich in beta-sheet.

The first of these diseases to be described was scrapie, a disease of sheep recognized for over 250 years. This illness, manifested by hyperexcitability, itching, and ataxia, leads to paralysis and death. It is called scrapie because of the tendency of affected animals to rub against the fences of their pens in order to stay upright, reflecting their cerebellar dysfunction. The transmission of this disease was demonstrated first in 1943 when a population of Scottish sheep was accidentally inoculated against a common virus using a formalin extract of lymphoid tissue from an animal with scrapie (Gordon, 1946). Accidental transmission of prions is a recurrent event in the history of these agents and is related to their unusual biophysical properties.

excerpt from eMedicine.com
Etiology
Highly divergent hypotheses have been put forward regarding the makeup of the prions, including that they consist of nucleic acid only or protein only, are lacking both protein and nucleic acid, or are a polysaccharide. The most widely accepted hypothesis, first described by Griffith (Griffith, 1967) and more explicitly detailed by Stanley Prusiner, MD, is the protein only hypothesis (Prusiner, 1998). Prusiner introduced the term prion to indicate that scrapie is related to a proteinaceous infectious particle (PrP) (Prusiner, 1982).
This hypothesis was initially greeted with great skepticism in the scientific community; now it represents the current dogma, and Prusiner won the 1997 Nobel Prize in Physiology or Medicine. This hypothesis suggests that prions contain no nucleic acid and are referred to as PrPSc. The latter represents a conformationally modified form of a normal cellular PrPC, which is a normal host protein found on the surface of many cells, in particular neurons. PrPSc, when introduced into normal healthy cells, causes the conversion of PrPC into PrPSc, initiating a self-perpetuating vicious cycle.
excerpt from eMedicine.com
Although prion diseases can be sporadic, infectious, or genetic, the focus here is on the hereditary forms.

Inherited Prion Diseases
Frequency
• In the US: The most common prion disease is CJD, with a uniform incidence of approximately 1 case per million population both in the United States and internationally. Familial forms of prion diseases, such as GSS and fatal familial insomnia (FFI), are much more rare. About 10% of cases of CJD are familial, with an autosomal dominant pattern of inheritance linked to mutations in the PRNP gene.

Mortality/Morbidity
Prion-related diseases are relentlessly progressive and invariably lead to death.
• The mean duration of sporadic CJD is 8 months.
• nvCJD has a slightly longer course, with a mean duration of 16 months.
• Familial CJD has a mean duration of 26 months, while GSS has the longest course, about 60 months.
* There is no sex or race predominance in the genetic disease.

Creutzfeldt-Jakob Disease (CJD)
10–15% of the cases of CJD are inherited
inherited as autosomal dominant
mutation of the PRNP gene - most common mutations are:
• a change in codon 200 converting glutamic acid (E) at that position to lysine (K), thus designated "E200K"
• a change from aspartic acid (D) at position 178 in the protein to asparagine (D178N) when it is accompanied by a polymorphism in the gene encoding valine at position 129. (When the polymorphism at codon 129 is Met/Met, the D178N mutation produces Fatal Familial Insomnia instead.)
• a change from valine (V) at position 210 to isoleucine (V210I)
Extracts of autopsied brain tissue from these patients can transmit the disease to:
• apes (whose PRNP gene is probably almost identical to that of humans).
• transgenic mice who have been given a Prnp gene that contains part of the human sequence.

***These results lead to the important realization that prion diseases can only be transmitted to animals that already carry a PRNP gene with a sequence that is at least similar to the one that encoded the PrPSc. In fact, knockout mice with no Prnp genes at all cannot be infected by PrPSc.

Gerstmann-Sträussler-Scheinker disease (GSS)
caused by the inheritance of a PRNP gene with a mutation encoding, most commonly:
• leucine instead of proline at position 102 (P102L) or
• valine instead of alanine at position 117 (A117V)

• Like CJD, GSS is also strongly associated with homozygosity for a polymorphism at position 129 (both residues being methionine).
Brain extracts from patients with GSS can transmit the disease to
• monkeys and apes
• transgenic mice containing a portion of the human PRNP gene.

Fatal Familial Insomnia (FFI)
People with this rare disorder have inherited
• a PRNP gene with asparagine instead of aspartic acid encoded at position 178 (D178N)
• the susceptibility polymorphism of methionine at position 129 of the PRNP gene.
Extracts from autopsied brains of FFI victims can transmit the disease to transgenic mice.
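
The mutation names used in this section follow a compact pattern: the single-letter code of the normal amino acid, the position, and the single-letter code of the replacement. As a rough illustration only, the Python sketch below unpacks that notation and applies the codon 129 rule described above for D178N. The function names and the small lookup table are hypothetical, made up for this example, and cover only the amino acids mentioned in this post.

```python
import re

# Single-letter codes for the amino acids mentioned in this post only.
AMINO_ACIDS = {
    "A": "alanine", "D": "aspartic acid", "E": "glutamic acid",
    "I": "isoleucine", "K": "lysine", "L": "leucine", "M": "methionine",
    "N": "asparagine", "P": "proline", "V": "valine",
}

def describe_mutation(name):
    """Expand a name such as 'E200K' into a readable description."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"not a substitution name: {name!r}")
    normal, position, mutant = match.groups()
    return (f"{AMINO_ACIDS[normal]} to {AMINO_ACIDS[mutant]} "
            f"at position {position}")

def d178n_disease(residue_129):
    """Disease linked to D178N, given the residue encoded at codon 129.

    Simplified from the text above: valine goes with familial CJD,
    methionine with fatal familial insomnia (FFI).
    """
    if residue_129 == "V":
        return "CJD"
    if residue_129 == "M":
        return "FFI"
    raise ValueError("codon 129 is expected to encode M or V")

for name in ["E200K", "D178N", "V210I", "P102L", "A117V"]:
    print(name, "=", describe_mutation(name))
```

This is only a reading aid for the notation; it says nothing about how likely a given mutation is to cause disease.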



Characteristics of the normal and mutated prion protein
PrPC
The normal protein
• is called PrPC (for cellular)
• is a glycoprotein normally found at the cell surface inserted in the plasma membrane
• has its secondary structure dominated by alpha helices (probably 3 of them)
• is easily soluble
• is easily digested by proteases
• is encoded by a gene designated (in humans) PRNP located on our chromosome 20.

PrPSc
The abnormal, disease-producing protein
• is called PrPSc (for scrapie)
• has the same amino acid sequence as the normal protein; that is, their primary structures are identical but
• its secondary structure is dominated by beta conformation
• is insoluble in all but the strongest solvents
• is highly resistant to digestion by proteases
• When PrPSc comes in contact with PrPC, it converts the PrPC into more of itself (even in the test tube).
• These molecules bind to each other forming aggregates.
• It is not yet clear if these aggregates are themselves the cause of the cell damage or are simply a side effect of the underlying disease process.
excerpt from CDC.gov webpage

Homologs
Homologene
Unigene



Other Related Information



Monday, September 12, 2005

Menkes Syndrome

Introduction: Menkes Syndrome, also known as Kinky Hair Disease, is an X-linked recessive disorder that disrupts copper metabolism and decreases a cell's ability to absorb copper. Research suggests that this disease is caused by a mutation in the gene encoding Cu(2+)-transporting ATPase, alpha polypeptide (OMIM #309400). First described in 1962, Menkes causes cerebral degeneration, seizures, and hair that appears both kinked and lightly pigmented under the microscope, and it will often result in death within the first year of life. The disease affects 1 in every 50,000-250,000 births, and the average life expectancy of a patient is between 6 months and 3 years (eMedicine). (General Summary of Disease; History of Disease)


Gene Information: According to eMedicine,

the Menkes gene is located on the long arm of the X chromosome at Xq13.3 (ATP7A), and the gene product is a 1500-amino-acid P type adenosine triphosphatase (ATPase), which has 17 domains -- 6 copper binding, 8 transmembrane, a phosphatase, a phosphorylation, and an ATP binding.

In normal, nondiseased cells, this ATPase transports copper into the secretory pathway of the cell for incorporation and excretion. When cells are affected by Menkes Disease, import of dietary copper is impaired, leading to an unhealthy accumulation of copper in some organs, such as the kidneys, while other organs, such as the brain, have an unhealthy lack of copper. Mutations of this gene are also known to cause Occipital horn syndrome, another copper transport disease that causes wedge-shaped calcifications on the skull. (Nucleotide Information; Entrez Gene)


Homologs:
Homologene- Notes the similarities between the gene, proteins, and phenotypes of Menkes Syndrome in H. sapiens and other organisms.
Unigene- Shows information on gene expression and mRNA and EST sequences as well as protein sequence similarities between humans and other organisms including E. coli, M. musculus, and C. elegans.


Symptoms: Symptoms include loss of developmental milestones, hypotonia, cerebral degeneration, seizures, temperature instability, hair that appears both kinked and lightly pigmented under the microscope, abnormal facial features such as a depressed nasal bridge or sagging cheeks, and skin laxity. Once the symptoms are observed, microscopic examination of the hair and testing of copper levels are the most common methods for confirming the disease. In some cases genetic tests are done to confirm mutations in the ATP7A gene.


Treatment: Oral, subcutaneous, and intravenous copper supplements have been shown to help symptoms but not to halt the progression of the disease. Alternative therapies are still being investigated. Parents can undergo genetic screening to identify possible carriers and evaluate potential pregnancy risks.
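
Because Menkes is recessive and carried on the X chromosome, the pregnancy risks for a carrier mother can be listed by simple enumeration. The Python sketch below is a textbook-style illustration under the usual assumption that each parental chromosome is passed on with equal probability; the function name and the "X"/"x" labels are made up for this example.

```python
from fractions import Fraction
from itertools import product

def x_linked_offspring(mother, father):
    """Enumerate offspring outcomes for an X-linked recessive trait.

    mother: her two X chromosomes, e.g. ("X", "x") for a carrier,
            where lowercase "x" carries the mutation.
    father: his X and Y, e.g. ("X", "Y") for an unaffected man.
    Returns a dict mapping outcome label to probability.
    """
    outcomes = {}
    for from_mother, from_father in product(mother, father):
        if from_father == "Y":
            label = "affected son" if from_mother == "x" else "unaffected son"
        elif "x" in (from_mother, from_father):
            label = ("affected daughter" if from_mother == from_father == "x"
                     else "carrier daughter")
        else:
            label = "unaffected daughter"
        outcomes[label] = outcomes.get(label, 0) + Fraction(1, 4)
    return outcomes

for label, probability in x_linked_offspring(("X", "x"), ("X", "Y")).items():
    print(f"{label}: {probability}")
```

For a carrier mother and an unaffected father, each of the four outcomes comes out at 1/4, which is why half of the sons are expected to be affected and half of the daughters to be carriers.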


Additional Links:
The Medical Directory- Contains useful diagrams illustrating and explaining patterns of inheritance of Menkes
Pubmed- Scholarly articles about Menkes Syndrome
Genetics Home Reference- The U.S. National Library of Medicine's simple reference page on the genetic condition- contains many additional links about research institutions, articles of interest, clinical trials, patient support, etc.

NINDS- General summary of Menkes Syndrome by the National Institute of Neurological Disorders and Stroke, which supports and funds research on this disease. Can link to organizations looking for Menkes patients for various studies.

Wednesday, September 07, 2005

Parkinson Disease

Background

Parkinson Disease (PD) is a chronic progressive neurodegenerative disorder. It was first described by James Parkinson in 1817. It is a member of the group of motor system disorders, which occur because of the loss of dopamine-producing brain cells. As explained by Southwestern Medical Center, PD is a result of a cascade of events in the nervous system:

When dopamine-producing neurons die, loss of dopamine release in the striatum causes the acetylcholine producers there to overstimulate their target neurons, thereby triggering a chain reaction of abnormal signaling leading to impaired mobility

PD affects around 1-1.5 million people in the US. Symptoms generally appear when the patient is 50 years or older. However, there is an early-onset PD which can occur in people 40 and younger.

Symptoms

Primary symptoms

  • Bradykinesia - slowness in voluntary movement due to delayed transmission signals between the brain and muscles
  • Tremors - most commonly occur in the hands, fingers, forearms, feet, mouth, and chin, when these muscles are at rest
  • Rigidity - painful stiffening of muscles that worsens during movement
  • Poor balance - occurs due to lessening of reflexes that maintain posture
  • Parkinson's gait - walk distinctive to PD that includes leaning unnaturally forward or backward, stooping, head and shoulders drooped, and minimal arm swing. Commencing walking may be difficult and it is common to halt in the middle of a stride.

Secondary symptoms

Patients may experience some, but generally not all, of the following symptoms. Severity varies between individuals:

Constipation, difficulty chewing and swallowing, excessive salivation or sweating, bowel/bladder control difficulties, loss of intellectual capacity, psychosocial problems, dry skin, and soft voice

This site demonstrates the neural pathways that lead to PD symptoms.

Genes

Until recently, PD was thought to be caused by environmental, not genetic factors. However, studies within the past decade have found a number of genes associated with the disease.

Polymeropoulos, et al. (1996) located a gene marker for PD in the 4q21-q23 region.

Further studies have specified particular genes involved in PD. This NLM site lists some of the genes involved and inheritance patterns:

  • Mutations in the LRRK2 or SNCA genes have been linked to an autosomal dominant inheritance of PD.
  • Mutations in PARK2, PARK7, or PINK1 display an autosomal recessive inheritance of PD.
  • When SNCAIP or UCHL1 genes are involved, there is no clear pattern of inheritance.

Information on environmental factors

Homologs

Evolutionarily conserved homologs have been found in:
Mus musculus
Rattus norvegicus
Gallus gallus
Drosophila melanogaster
Caenorhabditis elegans

Animal models

Some animal models have been used to better understand the genes and proteins involved in PD.

In mice:
Weaver Mouse Gene

In Drosophila:
Drosophila PD model
Drosophila neurodegenerative disease models

Protein
Alpha-synuclein (AS), a product of the SNCA gene, has been associated with PD. This protein also has implications in Alzheimer Disease, and they may share similar mechanisms. Studies (1, 2, 3) of families with a history of PD have explored the role of AS. Their findings have revealed some evidence that AS is related to Parkinsonian conditions. However, this issue is controversial and is being researched further. Mouse models were used to explore the role of AS in Parkinson Disease. Giasson, B. I. et al. (2002) gave mutant and wild-type versions of human AS to transgenic mice. The mice with the mutant AS developed motor problems, while the mice with wild-type AS did not.

Further information about alpha-synuclein is available at:
The NCBI Genes and Disease site
Society for Neuroscience

Treatment
There is no cure for PD. However, there are treatments that reduce the severity of the symptoms. Because the type of symptoms and responsiveness to medication vary from one individual to the next, there are a variety of treatment options and combinations. The two main types of treatment are:

Prescription medication is the most common form of treatment. There are several types of drugs. Levodopa, a dopamine precursor, is the most commonly used medication. It is taken up by the brain and then converted into dopamine. In addition to dopamine precursors, many PD patients take dopamine agonists, such as bromocriptine and ropinirole, which act directly on dopamine receptors, imitating dopamine.

Surgery is generally done as a last resort, for patients with severe symptoms or who are unresponsive to medication.
Pallidotomy is the lesioning of the globus pallidus. This helps treat rigidity, abnormal movements, tremors, freezing and falling.
Occasionally, a thalamotomy is performed to reduce tremors.
Deep Brain Stimulation, a relatively new technique, involves inserting electrodes in the brain to provide electrical stimulation. This fairly safe technique has been effective in helping PD patients control their motor movements.

Additional Info
Some helpful websites:
National Parkinson Foundation
The Michael J. Fox Foundation for Parkinson's Research
Parkinson's Disease Trials
OMIM Parkinson Disease overview article

Tay-Sachs disease (Hexosaminidase A deficiency)

introduction
tay-sachs is a neurodegenerative disorder inherited in an autosomal recessive pattern, usually first manifesting in patients between 3 and 6 months of age and causing death before age 4. the molecular basis of tay-sachs is a mutation in the gene HEXA on chromosome 15, which codes for the alpha subunit of beta-hexosaminidase A (Hex A), an alpha-beta heterodimeric lysosomal enzyme that, along with GM2 activator protein, catalyzes the breakdown of the glycosphingolipid GM2. an analogous disorder, sandhoff disease, occurs when HEXB is mutated, causing the production of deficient beta-hexosaminidase B (Hex B), the beta-beta homodimeric isoenzyme of Hex A. in addition, GM2 activator disease occurs as the result of the production of deficient GM2 activator protein.

when the mutated HEXA gene yields defective Hex A, GM2 accumulates in the patient's neurons, causing neurodegeneration, blindness, seizures and eventually total incapacitation and death. although the most common form of tay-sachs, acute infantile hexosaminidase A deficiency, is fatal early in life, adult-onset forms exist that manifest more variable symptoms.





for more summary, visit NIH's GeneTests

genetic screening for tay-sachs
tests exist for the more than 90 disease-causing mutations of HEXA, although the most commonly performed ones test for six specific alleles that usually account for Hex A deficiency. of these six alleles, three nullify Hex A's activity and account for classic TSD (+TATC1278, +1IVS12, and +1IVS9), one causes adult-onset TSD (G269S) and two are considered pseudodeficient (R247W and R249W), meaning their protein products act upon GM2 but are inactive against the synthetic substrate used to test Hex A activity levels in an individual's blood serum; GM2 itself is too unstable a reagent to be used in laboratory testing. the null alleles of HEXA are far more prevalent in persons of Ashkenazi Jewish heritage, and the adult-onset and pseudodeficient alleles in persons of non-Jewish heritage. before widespread population-based carrier screening was in place, 1 in 3600 Ashkenazi Jewish births resulted in TSD, indicating that about 1 in 30 Jewish Americans of Ashkenazi heritage is a carrier (compare to 1 in 250 Sephardic Jews and 1 in 300 non-Jews). since the implementation of widespread education, population-based carrier screening and genetic counseling, though, TSD incidence among North American Jews of Ashkenazi lineage has been reduced by over 90%.
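
the step from 1 in 3600 affected births to a carrier frequency of about 1 in 30 can be checked with a short calculation. the python sketch below assumes Hardy-Weinberg proportions (random mating and a single pooled disease allele), which is a simplification and not something GeneTests states; the function name is made up for this example.

```python
import math

def carrier_frequency(affected_births):
    """Estimate carrier frequency from the incidence of a recessive disease.

    affected_births: incidence expressed as "1 in N" births (pass N).
    Assumes Hardy-Weinberg proportions: incidence = q**2, carriers = 2*p*q.
    """
    q = math.sqrt(1 / affected_births)   # frequency of the disease allele
    p = 1 - q                            # frequency of the normal allele
    return 2 * p * q

freq = carrier_frequency(3600)
print(f"carrier frequency: about 1 in {1 / freq:.1f}")

# Cross-check in the other direction: two carriers (1/30 each) have a
# 1 in 4 chance of an affected child per pregnancy.
incidence = (1 / 30) ** 2 * (1 / 4)
print(f"incidence implied by 1 in 30 carriers: 1 in {1 / incidence:.0f}")
```

under these assumptions the allele frequency is 1/60 and the carrier frequency comes out close to 1 in 30, consistent with the figures quoted above.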




Citation: GeneTests




An interesting article addressing the social and ethical dilemmas presented by the rise of techniques such as preimplantation diagnosis (PID), antenatal screening (ANS) and antenatal diagnosis (AND); it briefly mentions the practice of terminating TSD-positive fetuses identified via AND: Antenatal screening and its possible meaning from unborn baby's perspective

knockout gene experimentation and its implications

mouse embryos were engineered by gene knockout to model each of the three GM2 gangliosidoses. although gene knockouts are usually a powerful tool for elucidating complex physiological processes in humans, knocking out the mouse genes Hexa and Hexb (inducing tay-sachs and sandhoff, respectively) yielded phenotypes vastly different from the human versions of the diseases. Hexa -/- mice exhibited some neuropathological signs of TSD, namely limited accumulation of GM2 in the nervous system, but remained asymptomatic. Hexb -/- mice experienced severe GM2 accumulation causing fatal neurodegeneration. finally, Gm2a -/- mice displayed limited GM2 build-up and defects in balance and motor coordination.

view the full text at
The Proceedings of the National Academy of Sciences


symptoms/treatment

the symptoms of tay-sachs are quite horrific. by 6-10 mo. of age, the infant stops acquiring new motor skills, becomes less visually attentive and exhibits irregular eye movements. by 10-12 mo., voluntary movements and attentiveness diminish, vision deteriorates and seizures increase in frequency. between 18 and 24 mo., the child develops progressive enlargement of the head, decerebrate posturing, difficulty swallowing, increasingly severe seizures and eventually a vegetative state. death usually occurs before 4 yrs. of age.

adequate nutrition/hydration, protection of the airway and management of infectious disease are the only measures that can be taken on behalf of the patient. seizure control drugs exist but have limited effectiveness.



Citation: GeneTests

Cytogenetic location of HEXA gene on chromosome 15 (scroll to bottom)


The Hexa mRNA transcript is 3163 bp long Entrez Nucleotide


the FASTA format of the 529 aa protein
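
FASTA is a plain-text format: a header line starting with ">" followed by one or more lines of sequence. as a minimal sketch, the python below reads such a record and reports the sequence length. the record shown is a made-up placeholder, not the real Hex A alpha subunit sequence, and the function name is hypothetical.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record begins; later lines are appended to it.
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records

# Placeholder record for illustration only (not the HEXA protein).
example = """>example_protein hypothetical record
MAAAKKKLLL
GGGSSSTTTV
"""

for header, sequence in parse_fasta(example).items():
    print(header, len(sequence), "aa")
```

applied to the real record, the reported length should match the 529 aa given above.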

Structure of the compound GM2: PubChem

Angelman Syndrome


Background:
Angelman Syndrome, first described only forty years ago, continues to both expand and challenge the limits of genetic research. The disorder affects all major racial groups at an incidence of between 1 in 15,000 and 1 in 30,000, though precise numbers are difficult to gauge because the diagnosis can require extensive genetic testing to confirm. The symptoms are consistent with a neurological disorder, including an inappropriately happy demeanor, difficulty in walking, and an inability to form words. There are several genetic bases for the disorder, but all involve the loss of genetic material on the maternally inherited copy of chromosome 15. Since genomic imprinting inactivates that region of the paternal copy of chromosome 15, carrying an intact paternal copy of the chromosome does not prevent the disease, meaning that inheritance is not typically Mendelian (Glenn C. et al., 1997).
Symptoms:
The symptoms of Angelman Syndrome often manifest themselves in developmental delays within the first year of life. With time, the profound developmental delays characteristic of AS become readily apparent. As learning and coordination are impaired, two of the most typical AS symptoms appear: an unbalanced gait and lack of verbal communication. The severity of these symptoms is variable, but they are noted to some degree in nearly all cases. Most children do, in time, learn to walk with an unsteady gait, though few ever develop the ability to speak. Learning in general, though not stopped by the disease, proceeds at a very slow pace, with basic life skills such as feeding taking many years to learn. Some things do still come naturally to those with AS, though, particularly the ability to enjoy music and an often inappropriately happy demeanor. Though both of these can pose challenges as the child with AS constantly makes noises or laughs at situations others do not find particularly amusing, they do show that these individuals retain the ability to enjoy their lives despite what appears to those on the outside a stifling existence.

Further information:
  • The Special Child provides general information on symptoms and typical trends of development with AS, written from the perspective of the parents.
  • The Angelman Syndrome Foundation provides extensive information also geared towards parents, outlining many particular challenges common to the disorder along with some genetic background.
  • NCBI's The Online Mendelian Inheritance in Man site provides extensive information in the form of many case studies, shedding some light on patterns of inheritance and suggesting that a significant portion of cases are linked to germ line mutations in the mother.

Genetic Origin:
Though the symptoms of AS are relatively easy to identify, the molecular basis for these symptoms has proved more elusive. Large deletions in the maternal copy of chromosome 15 have long been linked to the symptoms of AS, but until recently the precise gene responsible for the disorder has been difficult to identify. After extensive analysis of the commonly deleted regions of many patients, the location of the gene was narrowed down substantially until Matsuura et al. (1997) described patients showing typical AS symptoms but carrying loss-of-function mutations in the UBE3A gene instead of large deletions, suggesting a causative role of UBE3A in AS. Further research has supported this theory, as the genetic imprinting of UBE3A matches the non-Mendelian inheritance pattern of AS. Since only the maternal copy of UBE3A is expressed in the brain, the maternal allele exclusively determines phenotype in the brain. Elsewhere, however, biallelic expression rescues the diseased phenotype, explaining why the disorder selectively affects the brain (Vu and Hoffman, 1997; Rougeulle et al., 1997). Though loss of function of UBE3A appears to cause the symptoms of AS, the precise mechanism is still unclear. UBE3A is known to interact with a number of proteins, including the tumor suppressor protein p53. It is thought that, by ligating ubiquitin to p53, UBE3A marks it for the protein degradation pathway, a finding consistent with the elevated p53 levels in the brains of mice carrying UBE3A knockouts. The reason why elevated levels of p53 would lead to the observed symptoms, though, is still unknown.
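
The imprinting logic described above can be restated as a small truth table. The Python sketch below is a deliberately simplified model, assuming only that the paternal copy of UBE3A is silent in the brain and that either copy suffices elsewhere; the function name and the tissue labels are made up for this example.

```python
def ube3a_activity(maternal_ok, paternal_ok):
    """Model UBE3A activity by tissue for one individual.

    maternal_ok / paternal_ok: True if that copy of UBE3A is intact.
    Brain: only the maternal copy is expressed (paternal copy is imprinted).
    Other tissues: both copies are expressed, so one intact copy is enough.
    """
    brain = maternal_ok
    elsewhere = maternal_ok or paternal_ok
    return {"brain": brain, "other tissues": elsewhere}

for maternal_ok in (True, False):
    for paternal_ok in (True, False):
        result = ube3a_activity(maternal_ok, paternal_ok)
        print(f"maternal intact={maternal_ok}, paternal intact={paternal_ok}:",
              result)
```

In this toy model a lost maternal copy removes activity in the brain even when the paternal copy is intact, while a lost paternal copy has no effect there, which is the non-Mendelian pattern the paragraph describes.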

UBE3A Functional Homologues:

  • Caenorhabditis elegans, wwp-1, accession number AAN73850.
  • Canis familiaris, Predicted protein (derived from the genomic sequence and mRNA evidence), accession number XP_863724.
  • Mus musculus, Herc2, accession number NP_034548. Mice are the primary model organism for this disorder, and those lacking this gene suffer from neuromuscular and spermiogenic disorders. Though this does not precisely mimic the human pattern of the disease, mice have been a helpful model system, and analysis of the brains of sacrificed mutant mice has provided evidence that p53 levels are elevated in the brains of knockout strains, which supports the role of UBE3A in protein degradation via the ubiquitin pathway.

Marfan Syndrome

Introduction

Marfan Syndrome is a genetic disorder that affects the connective tissue in the heart, blood vessels, lungs, eyes, bones, and ligaments. It is one of the most common inherited connective tissue disorders and can affect men and women of all backgrounds. (Additional Background)



Symptoms

The most noticeable symptoms are a tall, slender body and long limbs. Heart problems and other internal problems can also result from the disorder. Symptoms range from mild to severe. More specific and additional symptoms can be found at NCBI's Online Mendelian Inheritance in Man website.





The Cause of Marfan Syndrome

Marfan Syndrome has been linked to the protein fibrillin. Fibrillin is encoded by the FBN1 gene on chromosome 15. The protein is crucial for the formation of elastic fibers that provide structural support in connective tissues. A mutation in the FBN1 gene renders the fibrillin protein nonfunctional, so connective tissues cannot be properly formed.


Data from several studies suggested fibrillin as a possible candidate gene for Marfan syndrome (154700). Fibrillin was immunolocalized to the ciliary zonule (Sakai et al., 1986), a ligamentous structure consisting mainly of fibrillin microfibrils which is characteristically affected in the Marfan syndrome. Hollister et al. (1990) demonstrated abnormal immunohistochemical patterns in the skin and cultured fibroblasts of patients with the Marfan syndrome. By pulse-chase analysis, Milewicz et al. (1992) demonstrated reproducible patterns of abnormal fibrillin-1 synthesis, secretion, or extracellular matrix utilization in the vast majority of fibroblast cell cultures from Marfan syndrome patients. Furthermore, the Marfan phenotype had been mapped by linkage studies to the same region of chromosome 15 that contained the fibrillin gene. The demonstration of linkage between the fibrillin gene (as recognized by a TaqI restriction site polymorphism) and the Marfan phenotype, combined with the demonstration of point mutations within the fibrillin gene in patients with classic Marfan syndrome (Dietz et al., 1991), concluded the proof that FBN1 is 'the Marfan gene.' In a letter received February 4, 1992, Hayward et al. (1992) stated that of the 2 fibrillin mutations discovered up to that time, arg239-to-pro (now designated as codon 1137; 134797.0001) had been found twice in 111 cases and cys1409-to-ser (now designated as codon 2307; 134797.0002) had been found once in 140 cases (mutations being scored once for each sporadic case and once for each family segregating the gene). They concluded that there are likely to be many different FBN1 mutations responsible for the Marfan syndrome.


Navigate through the following link (Marfan Cause) to understand more about the cause of Marfan Syndrome.

Genetic Information

mRNA Reference Sequence - 9749 bp

Protein Reference Sequence - 2871 aa
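
The two reference lengths can be related by simple arithmetic: each amino acid is encoded by three nucleotides, and the coding sequence ends with one stop codon. The Python sketch below, with a made-up function name, estimates how much of the 9749 bp mRNA is coding sequence and how much is left over as untranslated sequence. It assumes the 2871 aa figure is the full-length translated protein.

```python
def coding_and_utr_lengths(mrna_length, protein_length):
    """Split an mRNA length into coding and untranslated nucleotides.

    The coding sequence is 3 nucleotides per amino acid plus a 3-nucleotide
    stop codon; everything else is counted as untranslated (5' and 3' UTR).
    """
    coding = 3 * protein_length + 3
    if coding > mrna_length:
        raise ValueError("protein is too long for this mRNA")
    return coding, mrna_length - coding

coding, utr = coding_and_utr_lengths(9749, 2871)
print(f"coding: {coding} nt, untranslated: {utr} nt")
```

Under that assumption, 8616 nt of the transcript are coding and the remaining 1133 nt are untranslated.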


Representative portion of the 3-D structure of fibrillin with two Ca ions in two separate domains.








courtesy of NCBI

Gene Expression

cDNA sources - NCBI's UniGene

blood, bone, bone marrow, brain, colon, eye, heart, kidney, larynx, liver, lung, lymph node, mammary gland, muscle, ovary, pancreas, peripheral nervous system, placenta, prostate, skin, soft tissue, stomach, testis, thymus, uterus, vascular, embryo, juvenile, adult

Homologs


M. musculus
ref:NP_032019.1 - fibrillin 1 [Mus musculus] 96.17 % / 2871 aa (see ProtEST)
R. norvegicus
ref:NP_114013.1 - fibrillin-1 [Rattus norvegicus] 95.89 % / 2871 aa (see ProtEST)


Treatment

According to the Merck Manual:

There is no cure for Marfan syndrome nor any way to correct the abnormalities in the connective tissue. Treatment is aimed at fixing abnormalities before dangerous complications develop. Some doctors prescribe drugs, such as beta-blockers, that make blood flow more gently through the aorta. However, whether these drugs help is controversial. If the aorta has widened or developed an aneurysm, the affected section can be repaired or replaced surgically. A displaced lens or retina can usually be reattached surgically. Years ago, most people with Marfan syndrome died in their 40s. Now, most people with Marfan syndrome live until their 60s. Prevention of aortic dissection and rupture probably explains why the life span has been lengthened.


Other resources on Marfan Syndrome can be found at the links below.

The National Marfan Foundation - This link contains background information, related disorders, supports services, possible grant and research funding, and other significant information

http://www.americanheart.org/presenter.jhtml?identifier=4672 - This link provides more general background about Marfan Syndrome

http://marfan.stanford.edu/ - Information about the Stanford University Marfan Clinic