Thursday, December 18, 2008

Protein of the Day #15: Adropin

Recently in Cell Metabolism there was a very interesting article on the newly characterized peptide Adropin. Adropin seems quite important in regulating the balance between sugar and lipid use as fuels; it's one of several promising avenues to understanding and treating diabetes that have come out in the last year or two. The peptide is cleaved from the product of the gene Enho (energy homeostasis associated). It's extremely well conserved in mammals - for most species the amino acid sequence is 100% identical (a few ungulates have 1 amino acid difference). Here is a picture of the conservation from UCSC's genome browser:

Monday, December 15, 2008

Coloring book reject

The image below is a Groebner fan that is just too complicated to put in the coloring book, but I think it's pretty impressive:

The ideal generating this is from what I call the "super three-vortex problem", the equations for the central configurations of a 1/r^2 potential:

I am mainly interested in the nonzero solutions of this system. To get at those, we can saturate the ideal - or in practical terms we can introduce a new variable, w, and add the equation w*s12*s13*s23 - 1 = 0 to the ideal. The 3D Groebner fan of the resulting system can be seen here.
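To see why the extra equation removes the unwanted solutions, here is a minimal sketch in plain Python (not Sage). The toy system x*(x-1) = 0, y*(y-2) = 0 is an assumption made for illustration and stands in for the real vortex equations; the helper name extend_with_w is hypothetical as well. The point is that w*x*y - 1 = 0 can be solved for w exactly when x*y is nonzero, so only solutions with every coordinate nonzero survive.

```python
from fractions import Fraction
from itertools import product

# Toy stand-in for the real system (an assumption for illustration only):
#   x*(x - 1) = 0,  y*(y - 2) = 0
# Its solutions are all pairs with x in {0, 1} and y in {0, 2}.
solutions = [(Fraction(x), Fraction(y)) for x, y in product([0, 1], [0, 2])]

def extend_with_w(point):
    """Return (x, y, w) solving w*x*y - 1 = 0, or None if no such w exists."""
    x, y = point
    if x * y == 0:
        return None              # w*0 - 1 = -1 can never vanish
    return (x, y, 1 / (x * y))

# Only solutions with every coordinate nonzero extend to the enlarged system.
surviving = [s for s in map(extend_with_w, solutions) if s is not None]
print(surviving)
```

Of the four solutions of the toy system, only (1, 2) extends, with w = 1/2. Eliminating w again afterwards is what recovers the saturated ideal.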

Thursday, December 4, 2008

A mathematical coloring book

I've been working on a mathematical coloring book, with the pictures created using Sage. It still needs some work but I've put a preliminary version up at (I am not making any money on it, the cost is what charges to print it.) I have also made the download freely available. I would appreciate feedback, especially from people with kids who try it out.

Saturday, November 29, 2008

Protein of the Day #14: XP_001352106

The genome of the malaria-causing Plasmodium falciparum is bizarre in a number of ways - the most striking feature being the extremely high (over 80%) A+T content. Looking on the protein level, there seem to be many proteins with long asparagine inserts. These asparagine inserts must serve some sort of purpose but it is unclear what it is. They do tend to confuse sequence-alignment and similarity searches, which is one reason so many of the proteins remain uncharacterized.
An extreme case of this is XP_001352106, which has a run of 83 asparagines in a row. It does have some similarity to a subunit of cyclin kinase but not enough to be very confident about its identity.
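Finding such runs is easy to automate. Below is a minimal sketch in plain Python; the function name longest_run and the toy sequence are made up for illustration (the toy sequence is not the real XP_001352106), and a real analysis would read the sequences from a FASTA file instead.

```python
def longest_run(sequence, residue="N"):
    """Return (start, length) of the longest run of `residue` in `sequence`.

    `start` is a 0-based index; (0, 0) is returned if the residue is absent.
    """
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, aa in enumerate(sequence):
        if aa == residue:
            if run_len == 0:
                run_start = i        # a new run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0              # run broken by a different residue
    return best_start, best_len

# Made-up toy sequence with a run of 12 asparagines starting at index 6.
toy = "MKNNKL" + "N" * 12 + "DEKNNNY"
print(longest_run(toy))
```

Masking the runs found this way before an alignment or similarity search is one simple way to keep them from dominating the score.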

Sunday, November 23, 2008

Symmetric Venn Diagram

This was the start of a small industry of making symmetric Venn diagrams, which Branko Grunbaum found in 1975. I have been working on making a mathematical coloring book (first edition should be - needs to be - done by the holidays, so more details on that soon). I've been trying to make some symmetric Venn diagrams for it; this is a by-product of my first attempts:

Wednesday, November 19, 2008

Protein of the Day #13: ATP synthase, beta chain

One of my students is doing a comparative genomics study of Plasmodium falciparum (with an ultimate goal of developing better alignment algorithms for organisms with extreme genomes), and I was curious about what the most conserved protein is relative to a model organism such as yeast. Turns out it's ATP synthase; not too surprising but one might guess some other things too. The Wikipedia entry is pretty good, although perhaps not as good as it could be. It does have a nice cartoon of the structure:

Monday, November 17, 2008

Protein of the Day #12: V1rf3

V1rf3 (vomeronasal 1 receptor, F3), as it's called in mice, appears well-conserved in a variety of mammals, as shown below (sometimes under slightly different names). The vomeronasal system is distinct from the usual olfactory system for smelling, and can be sensitive to very different compounds. In mice the vomeronasal system is important for pheromones. It remains unclear whether humans have any sort of functional vomeronasal system at all; my guess is that most of us do not, but perhaps a few people still do.

Friday, November 14, 2008

Permutohedron mirrors

The image below is from a viewpoint in a mirror-faced 3D permutohedron (truncated cuboctahedron). It should link to a larger (1920x1200) version. This image was produced with Sage (using the Tachyon raytracer and some new code for polytopes that I've been working on).

Thursday, November 13, 2008

Slices of the 600-cell

Over the last 6 months or so I've been doing some work on visualizing polytopes, Groebner fans, and other geometric/algebraic objects. I gave a presentation last week about some of that, which forced me to finish up some projects. One of those was a movie of the 600-cell being sliced by 3-planes.

Monday, October 27, 2008

Protein of the Day #11: cystathionine gamma-lyase (CTH)

In a recent article in Science, Yang et al. found that cystathionine gamma-lyase can produce hydrogen sulfide gas in mice, and that this seems to help control blood pressure. Since I'm interested in mammalian hibernation, this made me wonder about connections with another Science article from 2005 which showed that a torpor-like state of lowered body temperature can be induced in mice by a particular level of H2S exposure. This protein is very well conserved; below is an alignment of the mouse version with those of some other mammals and the chicken. Normally it is involved in cysteine metabolism and presumably does not create H2S; there are some splice variants and perhaps the variants are important in this respect.

Tuesday, October 21, 2008

Protein of the Day #10: TRPV1

Mmmmm...spicy food. Wouldn't be as nice if we didn't have TRPV1, which responds to the capsaicin from hot peppers. Here is the predicted domain structure from the SMART database, with the transmembrane domains picked out (along with some ankyrin repeats):

Monday, October 13, 2008

Wordle Malaria

Wordle is a fun little site; I fed the OMIM entry for malaria susceptibility into it and got:

Protein of the Day #9: Retrocyclin

The defensins are an interesting protein family that is important in mammalian immune systems. It now seems that most mammals have some versions of the alpha- and beta-defensins, but only some primates have theta-defensins. In the human, there is a pseudo-gene for a theta-defensin that is post-translationally processed into a cyclic peptide called retrocyclin. It is possible that our loss of a functional retrocyclin contributes to our susceptibility to HIV and AIDS; it's an interesting avenue for future gene therapy.

I can't find out too much about retrocyclin; since you pretty much have to use rhesus monkeys to study it, there isn't a lot out there yet. A good place to start is the OMIM entry for the alpha-defensins, and the paper by Cole et al.

Thursday, October 9, 2008

Schlegel and the 600 cell

I just wrote a patch to do Schlegel diagrams (a sort of projection) of 4D polytopes in Sage. The 4D regular polytopes are an awful lot of fun to think about; below is a picture of the 600-cell as rendered by my new code. It's much more fun to play with it interactively - check it out.
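For anyone curious what the projection does, here is a minimal sketch in plain Python. It is not the code from the Sage patch, and it uses the tesseract (4D cube) rather than the 600-cell because its vertices are easy to write down. The function name schlegel_project and the viewpoint height h are assumptions made for illustration. The idea is to look at the polytope from a point just outside one facet and project everything onto that facet.

```python
from itertools import product

def schlegel_project(vertex, h=2.0):
    """Project a 4D point onto the hyperplane x4 = 1, seen from (0, 0, 0, h).

    For the tesseract with vertices (+-1, +-1, +-1, +-1) and any h > 1, the
    viewpoint sits just outside the facet x4 = 1, which is what a Schlegel
    diagram needs. The result is given as 3D coordinates within that facet.
    """
    x1, x2, x3, x4 = vertex
    t = (h - 1.0) / (h - x4)      # where the line from the viewpoint meets x4 = 1
    return (t * x1, t * x2, t * x3)

tesseract = list(product([-1, 1], repeat=4))
# Edges join vertices that differ in exactly one coordinate.
edges = [(a, b) for i, a in enumerate(tesseract) for b in tesseract[i + 1:]
         if sum(p != q for p, q in zip(a, b)) == 1]
projected = {v: schlegel_project(v) for v in tesseract}
```

Vertices on the chosen facet stay where they are, and the opposite facet shrinks to a smaller cube inside it, which gives the familiar cube-within-a-cube picture once the 32 edges are drawn.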

Thursday, October 2, 2008

Protein of the Day #8: Hemoglobin

So I should have called it "Protein of the Week". Ah well. It's the protein of the day, just not every day...

Hemoglobin: it's a classic! Don't think that makes it boring. On the contrary, I think it remains a fascinating protein.

It's possible that it deserves the title of most-studied protein. Right now there are 4820 hemoglobin sequences at NCBI. It was discovered in 1851, and the structure solved in 1959 - I think that was the first protein structure found by x-ray crystallography. I could go on and on...

One of my interests in it at the moment is that hemoglobin is the food for the Plasmodium species that cause malaria. Amazingly, they synthesize their own heme groups. Hemoglobin is a funny food though, because of dealing with all those heme units, and Plasmodium has to accumulate hemozoin garbage.

Thursday, September 25, 2008

Protein of the Day #7: Enolase

Mammals have three enolases. A more descriptive name is "phosphopyruvate hydratase" - they catalyze the conversion of 2-phospho-D-glycerate to phosphoenolpyruvate.

In Plasmodium falciparum, there is still some controversy about their enolase. It appears that at least part of it comes from a migration from the apicoplast genome into the nuclear genome, but it may be a hybrid. Here's the telltale insertion that matches up with plants (in this case rice, but the apicoplast probably came from a red algae ancestor endosymbiont):

Thursday, September 18, 2008

Movie of a Groebner fan

This summer I spent some time thinking about animation and visualizing algebraic and geometric information. I have a longer to-do list than accomplishments but I have made some progress.

One of my pilot project ideas was to take a 5-variable polynomial ideal and:
1) compute the Groebner fan using Sage and Gfan,
2) intersect it with a hyperplane (so now we're down to 4 dimensions)
3) slowly rotate the resulting polyhedral complex in 4 dimensions, rendering it using Tachyon/Sage
4) animate the resulting set of frames.
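The rotation in step 3 can be sketched in plain Python as follows. This is only an illustration, not the code used for the movie: the function names, the choice of rotation plane (axes 0 and 3), and the frame count are all assumptions, and the rendering of each frame and the joining of frames into a movie are left out.

```python
import math

def rotate_4d(point, angle, i=0, j=3):
    """Rotate a 4D point by `angle` radians in the plane of coordinate axes i and j."""
    p = list(point)
    c, s = math.cos(angle), math.sin(angle)
    p[i], p[j] = c * point[i] - s * point[j], s * point[i] + c * point[j]
    return tuple(p)

def frames(points, n_frames=120):
    """Yield the rotated point set for each frame of one full turn."""
    for k in range(n_frames):
        angle = 2.0 * math.pi * k / n_frames
        yield [rotate_4d(p, angle) for p in points]
```

Each frame's points would then be projected to 3D and handed to the renderer. Rotating in a plane that involves the fourth axis is what makes the motion look different from an ordinary 3D spin.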

For step 4, I initially wanted to use Blender, but that was really overkill for what I needed and I didn't want to figure out how to get Sage and Blender using the same copy of Python (although someone should). In the end I used ffmpeg to get my movie.

Check out my current best effort.

My next goal in this direction is to do something with Sage's @interact command and JMol to highlight pieces of the fan, since the movie isn't really informative (more art than math I think).

Protein of the Day #6: CD36/Fatty acid translocase

CD36 is a great example of the complexity of biology.

After some modification, it is the same thing as "platelet glycoprotein IV", an important protein in platelets and clotting - thrombospondin binds to it. It's also important in malaria, since Plasmodium-infected erythrocytes can bind to CD36, and mutations in it can result in varying severity of malaria.

But it's also "fatty acid translocase" and it's a receptor for low density lipoprotein (LDL). It's been associated with a number of effects on the immune system, reaction to hyperglycemia, and oxidant stress.

Both these roles make it interesting in the context of mammalian hibernation, where the clotting reactions must be suppressed and metabolism switched to using ketone bodies derived from lipid stores.

Free Stanford!

...just a little joke, no one is repressing it. But the Stanford Engineering school has done something extremely nice, namely put up entire course materials online for computer programming, AI, and some electrical engineering courses. Looks very well done. I really like the transcripts of the video lectures, since I like reading more than listening. I don't think MIT's OpenCourseWare does that, but that is another very nice open access project.

I am a little envious of today's self-motivated youth; it would be pretty easy to teach yourself almost anything these days. When I was a teenager I taught myself basic calculus from an old 1940s book (it was called something funny like Calculus for the Everyman, I can't remember exactly). It had nice line drawings but geez, being able to virtually sit in on MIT classes would have helped I think.

Friday, September 12, 2008

Protein of the Day #5: Aldehyde dehydrogenase 2

Aldehyde dehydrogenase 2, or ALDH2, is the highlight of an article in Science this week showing it is related to mechanisms for protecting the heart from ischemia (lack of blood flow, which results in hypoxia, a lack of oxygen). There are cytosolic and mitochondrial versions; since this protein is important in metabolizing alcohol, having both versions seems to clear alcohol faster, although I can't find a definitive reference for that fact.

Thursday, September 11, 2008

Protein of the Day #4: apical membrane antigen 1 (AMA1)

In the Plasmodium species that cause malaria, the merozoite stage must invade red blood cells. It does this using the strange apicoplast organelle, which is a much-warped descendant of a chloroplast. Apicoplast:chloroplast as Gollum:Hobbit. Anyway, one of the proteins that helps this invasion is the apical membrane antigen 1, although like most Plasmodial proteins not that much is known about it.

Eloquent Javascript

Because of my interest in Sage and eventually helping more with the notebook interface, I've been trying to learn some Javascript. Recently I found a fantastic online book that's a joy to use, partly because it has an interactive Javascript console and all code examples can be run within that framework: check out Eloquent Javascript.

Thursday, August 21, 2008

Protein of the Day #3: Bone morphogenic protein 7

In this week's Nature there is some exciting news about what makes brown adipose tissue, or "brown fat". It turns out that brown fat actually comes from muscle precursor cells, not fat cells, and bone morphogenic protein 7 (Bmp7) can induce the transformation.

Brown fat is very important in hibernation as it can generate heat by short-circuiting the mitochondrial proton pump. At very low temperatures, animals cannot shiver so they need other heat generating mechanisms. It's also important for infant mammals, including humans.

Bmp7 is part of the TGF-beta superfamily (transforming growth factors). In the human it's on chromosome 20. It has 7 exons, a very typical gene in that respect.

Brown fat has generated a lot of controversy over the years (since 1551!); this discovery seems like a big step forward.

Tuesday, August 19, 2008

Protein of the Day #2: Clathrin

Clathrin is a very cool protein. It forms polyhedral lattices that help cells in endocytosis. Your synapses are using a lot of clathrin right now.

Monday, August 18, 2008

Protein of the Day #1: Alpha-2 Macroglobulin

Long time no post. I've decided to try a "Protein of the Day" post mostly to help myself remember some of them.

For protein #1, I've picked alpha-2 macroglobulin, a serum protease inhibitor. NCBI's Online Mendelian Inheritance in Man (OMIM) is a nice curated resource for protein/gene information and it's where I often turn first. Wikipedia has a nice entry too - the quality of Wikipedia articles in biochemistry is usually excellent in my experience so far.

It's a glycoprotein, so there are some extra carbohydrates attached. There are four subunits held together by SS bonds. The gene structure is relatively complicated, with 36 exons (same number in human and mouse). It is thought to be evolutionarily related to the C3 and C4 proteins.

My interest in it mainly stems from the fact that it is important in mammalian hibernation. Among many other effects, it inhibits coagulation and fibrinolysis.

Friday, May 9, 2008

Anders Jensen's nonregular Groebner fan in 3D

The only example of a non-regular Groebner fan that I am aware of is the following one from Anders Jensen's 2007 thesis, here plotted in 3D with Sage and gfan:

# Variable order x, y, z, w is an assumption; only the names are fixed by the generators.
R4.<x,y,z,w> = PolynomialRing(QQ,4)
idnp = R4.ideal([x*y*z+x^2*z-x*y,x*w^2-z,x*w^4+x*z])
gfnp = idnp.groebner_fan()
show(gfnp.render3d(), frame = False)

You have to follow the link for the plot since I am not sure how to include JMol applets on a blogger post.

Saturday, May 3, 2008

OpenWetWare and Sage

After reading a nice article by Julius Lucks on OpenWetWare, about python, biopython and SWIG, I suggested he check out Sage. He in turn suggested I write something up on OpenWetWare, and so I have. Hopefully this will lead to some more biology and bioinformatics interest in Sage.

Sunday, April 27, 2008

Back from Lausanne

Last week I went to a very nice conference at the Bernoulli Center, on real algebraic and tropical geometry. I gave a talk on some problems on finiteness and bounds on the real solutions of polynomial systems coming from the n-body problem; Sage was featured in a variety of ways during my talk. Here's an animation of a 4-body central configuration (couldn't seem to upload it to the blog directly; maybe it's too big).

Thursday, April 17, 2008

making tracks

I've finally added a feature to Sage that I've wanted for a long time: tracking the solution paths of polynomial systems through a homotopy continuation (using Jan Verschelde's phcpack). I am cleaning up my code for formal inclusion, but it seems to work pretty well. The picture below tracks 87 of 99 solutions of the Albouy-Chenciner equations for the three-body problem (in the complex plane). The initial solutions (small blue dots) are for masses m1 = 1, m2 = 2, and m3 = 3. The final solutions are for m1 = 1/100, m2 = 1/10, and m3 = 3. Some of the solutions are moving off to infinity: the mixed volume for the system with m1=m2=0 is only 18, so 81 solutions have to coalesce or move out to infinity. (Why only 87 of the 99? The other twelve are somewhat degenerate, and their solution paths are a little jumpy.) Alex Jokela helped a lot with writing the parser for the phcpack path-tracking output.
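The basic idea of path tracking fits in a few lines of plain Python for a single polynomial in one variable. This is a toy sketch only: it is not phcpack's algorithm (which uses polyhedral homotopies, predictor steps, and adaptive step sizes), and the function names, step count, and the constant gamma are assumptions made for illustration. It deforms the start system x^n - 1 = 0, whose roots are known, into the target polynomial and follows each root with Newton's method.

```python
import cmath

def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest-degree coefficient first) by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def poly_deriv(coeffs):
    """Coefficients of the derivative, highest degree first."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]

def track_paths(target, steps=200, newton_iters=5, gamma=complex(0.6, 0.8)):
    """Follow the roots of x^n - 1 to the roots of `target` along
    H(x, t) = (1 - t)*gamma*(x^n - 1) + t*target(x), for t from 0 to 1."""
    n = len(target) - 1
    start = [1] + [0] * (n - 1) + [-1]                     # x^n - 1
    paths = [[cmath.exp(2j * cmath.pi * k / n)] for k in range(n)]
    for step in range(1, steps + 1):
        t = step / steps
        h = [(1 - t) * gamma * g + t * f for g, f in zip(start, target)]
        dh = poly_deriv(h)
        for path in paths:
            x = path[-1]
            for _ in range(newton_iters):                  # Newton corrector
                x = x - poly_eval(h, x) / poly_eval(dh, x)
            path.append(x)
    return paths

# Toy target x^2 - 5x + 6, with roots 2 and 3.
paths = track_paths([1, -5, 6])
print([path[-1] for path in paths])
```

Each entry of paths is the list of points visited by one root, which is the kind of data plotted in the picture. The non-real constant gamma keeps the paths away from points where two roots would collide in this toy example.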

Saturday, April 5, 2008

color me gfan, now in rgb

Various improvements to the gfan interface in Sage are in the works; one of the minor things I've had fun doing is adding more flexible color functions to the render function. Here's the Groebner fan of the 3-vortex problem relative equilibria equations, where the color is determined by the polynomial in each reduced Groebner basis which has the highest degree in any one variable - the degrees of the polynomial are converted to RGB values.
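A color function of this kind might look like the following sketch in plain Python. It is not the code in the Sage interface: the function names, the representation of a polynomial as a list of exponent tuples, and the scaling by a fixed max_degree are all assumptions made for illustration.

```python
def variable_degrees(poly):
    """Degree of a polynomial in each variable; `poly` is a list of exponent tuples."""
    return tuple(max(exps) for exps in zip(*poly))

def basis_color(basis, max_degree):
    """Pick the polynomial with the highest degree in any one variable and
    scale its three per-variable degrees into an (r, g, b) triple in [0, 1]."""
    top = max(basis, key=lambda poly: max(variable_degrees(poly)))
    return tuple(d / max_degree for d in variable_degrees(top))

# Hypothetical basis in x, y, z: {x^3*y - z, y^2 - x*z}
toy_basis = [[(3, 1, 0), (0, 0, 1)], [(0, 2, 0), (1, 0, 1)]]
print(basis_color(toy_basis, max_degree=4))
```

Here the first polynomial wins because of its x^3 term, and its degrees (3, 1, 1) in x, y, z become the red, green, and blue components.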

Wednesday, March 26, 2008

Interactive coalescents

Genetic coalescents are interesting statistically; the variance in the time to coalescence is large, which is the kind of quantity I think human intuition has trouble with. So it helps a bit to be able to play with them (code can be found here):
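A quick simulation shows how large the spread is. The sketch below is plain Python rather than the interactive Sage code; the function name, sample size, and seed are arbitrary choices made for illustration. It uses the standard Kingman coalescent, in which the waiting time while k lineages remain is exponential with rate k(k-1)/2, so the expected time to the most recent common ancestor of n lineages is 2(1 - 1/n) in coalescent time units.

```python
import random

def time_to_mrca(n, rng):
    """Simulate the time until n lineages coalesce into one (Kingman coalescent).

    While k lineages remain, the waiting time to the next coalescence is
    exponential with rate k*(k-1)/2 (time measured in coalescent units).
    """
    total = 0.0
    for k in range(n, 1, -1):
        total += rng.expovariate(k * (k - 1) / 2.0)
    return total

rng = random.Random(2008)
times = [time_to_mrca(10, rng) for _ in range(20000)]
mean = sum(times) / len(times)
variance = sum((t - mean) ** 2 for t in times) / (len(times) - 1)
print(mean, variance)
```

For n = 10 the theoretical mean is 1.8 and the standard deviation is a little over 1, so a single realization can easily be off by more than half the mean. Most of that spread comes from the final wait, when only two lineages remain.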

Tuesday, March 25, 2008

Gfan in 3D

Having upgraded the Sage interface to gfan for version 0.3, I've been thinking about other ways to leverage Sage's capabilities in this respect. One thing I've been working on is a 3D Groebner fan representation. I have some working code for this now, which hopefully will end up in Sage-3.0 if I have time to polish it up. Below are a couple of screenshots of the 3D rendering of the Groebner fan of the ideal generated by (w^3-x, x^3-y, y^3-w, z^2-x-y-w):

Friday, March 21, 2008

restricted four-body problem

The Albouy-Chenciner equations for the restricted four-body problem seem to generally have 191 solutions. For masses m1=17/20, m2 = 19/20, and m3 = 1, there are 160 complex solutions in 6 variables (the mutual distances between the particles). Plotting each sextuplet as a polygon gives the following plot, followed by the configurations formed by the positive real solutions. Computed with Sage and phcpack.

phcpack, sage, and interact

I've been working on integrating Jan Verschelde's phcpack software with Sage. phcpack finds solutions to pretty nasty systems of multivariable polynomials by using polyhedral homotopy continuation. Sage can provide a nice frontend for this. Here's an interactive display of the complex solutions of the Albouy-Chenciner equations (from the paper "Le problème des n corps et les distances mutuelles", Inventiones Mathematicae 131 p.151-184, 1998) for the 4-vortex problem:

Monday, March 17, 2008

Color me Gfan

The latest version of Gfan has some new capabilities that I am excited to use for testing whether ideals are zero-dimensional. But first I have to rewrite the Sage interface to Gfan. I thought that I should try to give some Sage-added-value while I was at it, so I am converting Gfan's xfig output to Sage graphics and adding some color. Here's one result so far: a map of all the reduced Groebner bases for the 3-vortex problem, colored by maximum degree:

Thursday, March 13, 2008

Sage 2.10.3

Sage 2.10.3 is out, with the first released version of the new interact command. As a spectator to the process, it looked like a tough fight between bugs and developers - which I think should be viewed as an entirely positive thing, since it is a consequence of improved QA practices.

Release notes are here; in case you need it, the download page is here.

If I can stop playing with interact (which has already been useful to me in teaching and research after 1 day!!) I hope to contribute a wee bit to some upcoming releases. I am rewriting the gfan interface to make use of gfan 0.3. Also, a student and I are working on pretty graphics for path-tracking solutions from homotopy solvers of polynomial systems (phcpack). I'll have to revise the plan a bit to exploit interact.

Monday, March 10, 2008


William Stein and Co. have delivered once again, with a new "interact" command that looks amazing even in its beta form. You can almost smell Sage-3.0; it should be out before the roses are blooming here in Duluth. Among other tests, I used it to explore the CpG content of the human mitochondrion averaged over a variable length window:
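The windowed average itself is simple to compute. Here is a minimal sketch in plain Python, without the interact controls; the function name cpg_fraction and the toy sequence are made up for illustration, and the real input would be the mitochondrial genome sequence.

```python
def cpg_fraction(seq, window):
    """Fraction of dinucleotide positions that are 'CG' in each sliding window.

    Returns one value per window start; each window of length `window`
    contains `window - 1` dinucleotide positions.
    """
    seq = seq.upper()
    if window < 2 or window > len(seq):
        raise ValueError("window must be between 2 and len(seq)")
    is_cpg = [1 if seq[i:i + 2] == "CG" else 0 for i in range(len(seq) - 1)]
    count = sum(is_cpg[:window - 1])
    fractions = [count / (window - 1)]
    for start in range(1, len(seq) - window + 1):
        # Slide by one base: drop the leftmost dinucleotide, add the new rightmost one.
        count += is_cpg[start + window - 2] - is_cpg[start - 1]
        fractions.append(count / (window - 1))
    return fractions

print(cpg_fraction("ACGCGTTACG", window=5))
```

Hooking the window argument up to a slider is what makes the interactive version fun: short windows show local spikes, long windows smooth them away.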

Thursday, February 28, 2008


Okay, so it's not about bioinformatics or Sage, but after a month of no posts I thought I should say something. In the near future I hope to put up some more technical stuff; right now I am working on updating the Gfan (Groebner basis fan software) component of Sage, which isn't very sexy.

So: some music picks. I have been very happy with Balkan Beat Box's eponymous first album. I highly recommend it. Also good recently was Dengue Fever's Escape From Dragon House.

Saturday, January 26, 2008


Given the title of this blog, I've meant to do some coalescent stuff in Sage for a while. Now that I am about to teach about it in a class, I've finally found the motivation:
(It's cooler animated, but I can't figure out how to post a gif animation here.)

Tuesday, January 15, 2008

Sage in 3D

It appears that Jmol will be the backend for most 3D graphics in Sage. After some hard work by developers (that I had nothing to do with) some of that functionality was put in Sage in time for the joint math meetings in San Diego. William somehow managed to find a slew of 3D glasses so that folks could enjoy the stereoscopic view:

(photo by Robert Bradshaw)

It will be exciting to see what happens as the Jmol integration proceeds.