# 2016 Tester Symposium: Láruson

A couple of pictures from the Tester Symposium of Áki Láruson giving a presentation about his sea urchin work.

# CRISPR's

I think there are only two, maybe three, people left in the world who have not heard about the CRISPR-Cas9 system. It originates from a type of bacterial immune response but has been recruited as a genetic tool and has swept through the genetic engineering community in the last few years. In fact, one of the criticisms of our last NIH grant applications was that we were not using CRISPR's... Here are some links to a few articles of wide interest regarding this technology.

# DARPA, Gene Drive Technologies, and the ENMOD Treaty

I just returned from a "gene drive" workshop at NCSU's Genetic Engineering and Society Center. There is a lot to talk about from the meeting. Here I want to focus on a couple of specific details that came up in reference to the military funding of genetic technology. The meeting was held under Chatham House Rule, so I cannot identify the people who made the original statements. Some of these were in group discussions and some of these were personal one-on-one conversations. As a brief, overly terse, background statement to describe a complex field: gene drive technology is a new emerging technology that is potentially very powerful and could be used for beneficial humanitarian and species conservation applications where other methods have fallen short in their long term effectiveness.

First of all, I was told that DARPA is interested in funding gene drive technology for environmental modifications. DARPA helps to develop new technologies for military applications.  This could be for both species conservation applications as well as preventing infectious disease (and also there is obviously the possibility of malicious hostile use in military applications but this was not brought up). Apparently a man named Dr. Jack Newman (link) is slated to become the program manager of mosquito gene drive technology at DARPA.

So, to be frank, I believe this is potentially a very bad idea for many reasons. The first is strategic. If these kinds of technologies are ultimately to be used for beneficial purposes, they must be acceptable to some degree to the public so that they can be adopted and utilized. The Pacific Islands have a very negative track record of being used as testing grounds for new technologies. This ranges from classical bio-control releases of invasive species, to loss of traditional land to military activities, to, probably the most glaring example, nuclear testing in the Marshall Islands that displaced Native People and left a region uninhabitable from the resulting radiation (also note French nuclear testing, under protest, in Tureia, link). Nested within these issues is the loss of self-determination resulting from colonialism by many Western countries across the Pacific. Like it or not, public perception is a very real force that cannot be ignored. The Three Mile Island accident in 1979 led to an effective moratorium on new nuclear reactor construction until 2012; however, many of these new projects have also been canceled, with the more recent Fukushima disaster also playing a role. The public reaction to GM crops has also had a very real effect on the laws surrounding the technology and on its adoption around the world, including in the Pacific (e.g., the GM Taro and GM Papaya controversies in Hawai'i, link). Right or wrong, in the Pacific, military funding of a new technology will initially be evaluated from the perspective of other military tests of new technologies and the effects these have had on the people of the Pacific Islands.
Even more relevant to gene drive technologies, in the 1970s a World Health Organization project to test the release of sterile mosquitoes in India (to suppress the local population and limit the transmission of disease to humans) was shut down due to public perceptions that it might also be a secret military bio-warfare test (link, incidentally there are also some documents on WikiLeaks related to this).

In a broader ethical-moral sense (and this is very much a personal opinion from the perspective of a US citizen), are we comfortable with the military guiding and controlling the research that goes on in our country? This may sound like hyperbole; however, the difference between the levels of US military funding (on the order of $610 billion) and National Science Foundation funding (on the order of $7 billion) is objectively dramatic. Advances in research depend on grant funding and support. Which technologies government funding agencies choose to support affects not only the advancement of these technologies but also the direction they develop in and, as a direct result, their future applications (the history of Project Orion is one example where limited funding sources and issues of potential military uses caused development focused on military applications yet ultimately stopped a line of scientifically promising yet controversial research, link).

Ideally, for gene drive technologies to realize their potential in beneficial applications, they should be supported and developed by sources other than the military and private companies; and yes, this is strongly motivated by public perception as well as ethical principles. Scientific funding bodies as well as state and local funding have more of a long-term potential benefit than is initially apparent. Furthermore, accepting funding from the military lends false support to continuing the objectively inflated funding of the military at the expense of government agencies devoted to scientific research (NSF and others); at the end of the day the military can say that it should continue to receive research funding because of the projects it has supported, but this comes with a social cost. Wouldn't it be better if NSF could make this statement instead, without the social cost?

Okay, now comes the ace card that I have been hiding so far in this article... At the meeting, someone brought up (within the context of more "traditional" synthetic biology) that the US is a signatory to the ENMOD international treaty, which came into force on October 5, 1978. This treaty prohibits the military from using environmental modification technologies that have widespread and/or long-lasting effects. Interestingly, "Environmental Modification Technique includes any technique for changing – through the deliberate manipulation of natural processes – the dynamics, composition or structure of the earth, including its biota" (full text). This gets into philosophical discussions about the role of physical coercion by the military and the state, which I do not want to go into here (links for reference, military, monopoly on violence); however, I will appeal to the common-sense notion that military force by a state is a hostile act, although this is more difficult to realize when it is done in a way that aligns with your own interests. Since gene drive technologies are deliberate manipulations of natural biological processes with long-term and possibly widespread effects on the environment, is DARPA military funding of gene drive technology even legal according to an international treaty that the US agreed to support?

# SimpleMathJax

I have neglected the genetics wiki on this site. One of the frustrations was not being able to write formulas correctly in the wiki markup code. However, I installed a SimpleMathJax extension and it allows TeX code to be added between  tags. I added a genetic drift page to the site to test it out. http://hawaiireedlab.com/gwiki/index.php?title=Genetic_Drift

# The evolution of antibiotic resistance

Here is one result from this semester's genetics teaching lab that I wanted to share. The students grew bacteria on a series of gradient media that had increasing concentrations of an antibiotic. At the end of the experiment the bacteria could grow on levels of antibiotic that would have prevented growth before the experiment (which we tested with a control that was genetically identical at the beginning of the experiment and was not exposed to antibiotics). The successive generations of bacteria evolved by mutation and selection to tolerate the antibiotic. (One of the goals of this was to show the students an example of evolution in action and illustrate the risks of over-using antibiotics.) We then measured levels of gene expression for all the genes in the genome and identified which genes had increased their activity and which ones had decreased their activity to allow the bacteria to survive (by extracting RNA and hybridizing it to an Affymetrix "GeneChip E. coli Genome 2.0 Array"). Next year I'm planning to have the students sequence some of the genes involved and try to find the precise mutations that have changed gene expression levels.

# Conway's recipe for success

“His recipe for success is to have 4 problems on the go: a big problem, difficult and important, that will probably depress you before it makes you successful; a workable problem, tedious but with a clear strategy so you can always make some progress and feel a sense of accomplishment; a book problem, for the book you're writing or may eventually write; and a fun problem, since life is hardly worth living if you're not having some fun.” pp. 114-115 Genius At Play: The Curious Mind of John Horton Conway by Siobhan Roberts, Bloomsbury Publishing 2015

# arXiv: Underdominance in Population Networks

We submitted a preprint of our "network" manuscript to arXiv and it will post today (link and info below).  We also submitted it to a journal but went over the page limit so are currently editing it down to be shorter.

http://arxiv.org/abs/1509.02205

arXiv:1509.02205

Title: Stability of Underdominant Genetic Polymorphisms in Population Networks
Authors: Áki J. Láruson and Floyd A. Reed
Categories: q-bio.PE

Heterozygote disadvantage is potentially a potent driver of population
genetic divergence. Also referred to as underdominance, this phenomena
describes a situation where a genetic heterozygote has a lower overall fitness
than either homozygote. Attention so far has mostly been given to
underdominance within a single population and the maintenance of genetic
differences between two populations exchanging migrants. Here we explore the
dynamics of an underdominant system in a network of multiple discrete, yet
interconnected, populations. Stability of genetic differences in response to
increases in migration in various topological networks is assessed. The network
topology can have a dominant and occasionally non-intuitive influence on the
genetic stability of the system. Applications of these results to theories of
speciation, population genetic engineering, and general dynamical systems are
described.
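To make the definition of underdominance concrete, here is a minimal single-population sketch in Python. It is not the network model from the manuscript. The symmetric fitnesses (1 for both homozygotes, 1 - s for the heterozygote), the value of s, and the function names are all illustrative assumptions. With these fitnesses a frequency of 0.5 is an unstable equilibrium, so which allele wins depends on which side of 0.5 the population starts.

```python
def next_frequency(p, s):
    """One generation of selection with fitnesses AA = 1, Aa = 1 - s, aa = 1."""
    q = 1.0 - p
    mean_fitness = 1.0 - 2.0 * p * q * s
    return (p * p + p * q * (1.0 - s)) / mean_fitness


def iterate(p0, s, generations):
    """Allele frequency after the given number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s)
    return p


# Starting just above or just below 0.5 sends the allele toward
# fixation or loss, respectively.
for start in (0.4, 0.5, 0.6):
    print(start, iterate(start, s=0.2, generations=200))
```

Migration between populations pulls frequencies back toward each other, which is what makes the stability question in a network non-trivial.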

By the way, this is my first arXiv submission. I wish I had done this years ago for other papers that were delayed for months, or in the two worst cases literally years, in review and resubmission cycles, after which we ended up getting scooped. I am planning to use arXiv a lot more in the future.

It feels odd to submit something to be widely available before publication.  However, many journals now accept this (see also the discussion here) and there is excellent work that is freely available in arXiv.  There is also some discussion whether or not authors should cite preprints on arXiv with the general feeling that this is fine and in fact appropriate to do so and should be encouraged.

# Go Vulcans!

Jolene Sutton is starting a tenure track position as a new assistant professor in the Biology Department at UH Hilo this fall!

# Harmonics, Convergence, and the Diffusion Approximation

Lots of different waveforms can be made by adding together harmonic series in certain ways. A simple sine wave can have harmonics that vibrate twice as fast, three times as fast, etc. Here is a plot of the odd numbered harmonics of a sine wave.

As the wavelength is reduced the amplitude of each wave is also purposely reduced.  It turns out that if you add these waves together you start approaching what is called a square wave.

However, the approach can be slow.  There is a lot of wiggle in the waveform.  Below is a plot with 50 odd numbered harmonics added together; $\sum_{i=1}^{50} \frac{\sin((2 i - 1) x )}{2 i - 1}$.

Closer, but you can still see the oscillation at the height of each peak.
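The oscillation that refuses to go away is known as the Gibbs phenomenon. Here is a minimal Python sketch of it. The function names and grid size are my own choices and are not how the plots were made. It evaluates the partial sum of odd harmonics and reports the tallest peak, which stays at roughly 0.926 however many terms are added, while the plateau the series converges to is $\pi/4 \approx 0.785$.

```python
import math


def square_partial_sum(x, n_terms):
    """Sum of the first n_terms odd harmonics, sin((2i-1)x)/(2i-1)."""
    return sum(math.sin((2 * i - 1) * x) / (2 * i - 1)
               for i in range(1, n_terms + 1))


def peak_height(n_terms, grid=4000):
    """Largest value of the partial sum on an even grid over (0, pi)."""
    return max(square_partial_sum(math.pi * k / grid, n_terms)
               for k in range(1, grid))


plateau = math.pi / 4  # value the series converges to for 0 < x < pi
for n in (10, 50, 200):
    print(n, round(peak_height(n), 4), round(plateau, 4))
```

Adding terms squeezes the overshoot closer to the jump but does not make it shorter, which is why the square wave looks slow to converge.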

If you add together the even harmonics, $\sum_{i=1}^{50} \frac{\sin(2 i x )}{2 i }$, you get a sawtooth wave.

Odd harmonics with a different weighting scheme, $\sum_{i=1}^{50} (-1)^i \frac{\sin((2 i -1) x )}{(2 i -1)^2}$, give triangular waves that converge quite fast.
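The faster convergence can be checked numerically. The sketch below is an illustration with hypothetical function names, not the code behind the plot. It compares the 50-term sum with the straight line of slope $-\pi/4$ that the series converges to on $-\pi/2 \le x \le \pi/2$. Because the weights fall off as $1/(2i-1)^2$, the worst-case error after $n$ terms is about $1/(4n)$ and shrinks everywhere, including at the corners.

```python
import math


def triangle_partial_sum(x, n_terms):
    """Sum of (-1)^i sin((2i-1)x)/(2i-1)^2 for i = 1..n_terms."""
    return sum((-1) ** i * math.sin((2 * i - 1) * x) / (2 * i - 1) ** 2
               for i in range(1, n_terms + 1))


def triangle_limit(x):
    """Limit of the series on -pi/2 <= x <= pi/2: a line of slope -pi/4."""
    return -math.pi * x / 4


def max_error(n_terms, grid=400):
    """Largest gap between the partial sum and its limit on [-pi/2, pi/2]."""
    xs = [-math.pi / 2 + math.pi * k / grid for k in range(grid + 1)]
    return max(abs(triangle_partial_sum(x, n_terms) - triangle_limit(x))
               for x in xs)


for n in (5, 50):
    print(n, max_error(n))
```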

Almost any waveform is possible,  $\sum_{i=1}^{50} \frac{\sin((2 i -1)^2 x )}{(2 i -1)^2}$

including this,  $\sum_{i=1}^{50} (-1)^i \frac{\sin((3 i -1) x )}{i ^2}$

Okay, so where am I going with this? There is a point here that ties back into population genetics. Complex curves can be built up from the sum of a series of simpler curves. However, it is also clear that in some cases the end result can take quite a large sum (the wiggle in the square and sawtooth waves above); in other words, the final curve is slow to converge and requires a large number of harmonics.

Kimura's (1955) famous diffusion approximations to model the process of genetic drift in a finite population are built up in a similar fashion.  The final curve is a sum of an infinite series of higher order harmonics.  The math is messy.  It makes use of the hypergeometric function and Gegenbauer polynomials, but the underlying idea is similar to the examples given above.

In the simplest case of the diffusion approximation the series is

$\sum_{i=1}^{\infty} p (1-p) i (i+1) (2 i +1) \,F\!(1-i, i+2, 2, p) \,F\!(1-i, i+2, 2, x) \, e^{-i(i+1)t / 4N}$

In this equation $p$ is the allele frequency, $N$ is the population size of diploid individuals, $t$ is the time in generations, which is often combined with $N$ in a parameter like $\tau=t / N$, and $F$ is a hypergeometric function (specifically it is the ordinary Gaussian hypergeometric function ${}_2F_1$; there are many more but this is the most common).  In the graph below are the first six odd order curves of the series with $p=0.5$ and $\tau=0.1$.
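Because the first argument, $1-i$, is zero or a negative integer, each $F$ in the series terminates and is just a polynomial of degree $i-1$. Writing out the first few from the standard series definition of ${}_2F_1$ (my own expansion, not copied from Kimura's paper):

$F(0, 3, 2, x) = 1$

$F(-1, 4, 2, x) = 1 - 2x$

$F(-2, 5, 2, x) = 1 - 5x + 5x^2$

This is also where the Gegenbauer polynomials come in. Writing $C_n^{(3/2)}$ for the Gegenbauer polynomial of degree $n$ with parameter $3/2$ (notation introduced here for illustration), the polynomials above are rescaled versions of them:

$F(1-i, i+2, 2, x) = \frac{2}{i(i+1)} \, C_{i-1}^{(3/2)}(1-2x)$

As a quick check of the series, the $i=1$ term is the flat curve $6 p (1-p) e^{-t/2N}$, which is all that is left of the distribution once the higher harmonics have decayed.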

You can see that they tend to build (are positive) near the centre of the x-axis, near a frequency of 0.5, and tend to alternate between positive and negative near the edges, cancelling each other out for a sum near zero.

Plotting these as sums of each new harmonic plus the previous ones gives these curves.

This plot just focuses on the last step, which is the sum up to $i=12$ in the equation.

This is starting to give us the distribution of allele frequencies expected after N/10 generations of genetic drift when starting from a frequency of 1/2; however, the wiggle to positive and negative values near the edges means that it has not yet converged satisfactorily.

Taking the iterations up to $i=25$ gives a nice result.

However, what if we want to look at even shorter periods of time? Holding the y-axis scale the same and letting the peak run off the top of the graph, look at what happens after just N/100 generations.

The sum has to be taken up to $i=100$ to get things to smooth out.

This takes some time on the computer; the hypergeometric function takes a bit of grinding to calculate. Drift over shorter periods of time is precisely the kind of situation where we might want to use this type of approach (it addresses standing variation and ignores new mutations that occur over deeper periods of time). This is why I have been exploring a faster alternative with the beta distribution that I wrote about in an earlier post.
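A second way to cut the cost, sketched here in Python as an assumption-laden illustration rather than the method used for the plots, is to skip the general hypergeometric routine altogether. Each $F(1-i, i+2, 2, y)$ equals $2\,C_{i-1}^{(3/2)}(1-2y) / (i(i+1))$, where $C_n^{(3/2)}$ is a Gegenbauer polynomial, and all of those can be generated in one pass with the standard three-term recurrence. The function names are hypothetical.

```python
import math


def gegenbauer_three_halves(n_max, z):
    """Return [C_0(z), ..., C_n_max(z)] for Gegenbauer parameter 3/2,
    using the recurrence n*C_n = (2n+1)*z*C_{n-1} - (n+1)*C_{n-2}."""
    values = [1.0]
    if n_max >= 1:
        values.append(3.0 * z)
    for n in range(2, n_max + 1):
        values.append(((2 * n + 1) * z * values[n - 1]
                       - (n + 1) * values[n - 2]) / n)
    return values


def drift_density(x, p, t, N, m):
    """Partial sum (i = 1..m) of the diffusion series for the density of
    allele frequency x after t generations, starting from frequency p."""
    cx = gegenbauer_three_halves(m - 1, 1.0 - 2.0 * x)
    cp = gegenbauer_three_halves(m - 1, 1.0 - 2.0 * p)
    total = 0.0
    for i in range(1, m + 1):
        # F(1-i, i+2, 2, y) = 2 / (i (i+1)) * C_{i-1}(1 - 2y)
        fx = 2.0 * cx[i - 1] / (i * (i + 1))
        fp = 2.0 * cp[i - 1] / (i * (i + 1))
        total += (p * (1 - p) * i * (i + 1) * (2 * i + 1)
                  * fx * fp * math.exp(-i * (i + 1) * t / (4.0 * N)))
    return total


# N/10 generations with 25 terms, then N/100 generations with 100 terms.
print(drift_density(0.5, p=0.5, t=100, N=1000, m=25))
print(drift_density(0.5, p=0.5, t=10, N=1000, m=100))
```

The cost grows only linearly with the number of terms for each value of x, so taking the sum up to $i=100$ for short time periods is cheap.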

By the way, I used Mathematica to generate the plots above. Here is the code if you are interested.

```mathematica
t = 10;
n = 1000;
p = 0.5;
m = 100;
(* the frequency distribution (probability) of the polymorphic fraction *)
poly = Plot[
  Sum[p*(1 - p)*i*(i + 1)*(2*i + 1)*
    Hypergeometric2F1[1 - i, i + 2, 2, x]*
    Hypergeometric2F1[1 - i, i + 2, 2, p]*
    E^(-t*i*(i + 1)/(4*n)), {i, 1, m}],
  {x, 0, 1},
  PlotRange -> {-1, 4},
  Filling -> Axis,
  PlotStyle -> Blue,
  AxesLabel -> {"allele frequency", "probability density"}]
```

I also tried to plot this in R and got the following.

I'm not sure what is going on but some kind of error seems to be building across the function.

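Here is my code:

```r
# Partial sum (i = 1..max) of the diffusion series for the density of
# allele frequency x after t generations, starting from frequency p,
# in a diploid population of size N.
myDiffuse <- function(x, p, t, N, max) {
  total <- 0
  for (i in 1:max) {
    hypgeosumx <- myHypergeoGaussSeries(i, x, max)
    hypgeosump <- myHypergeoGaussSeries(i, p, max)
    total <- total + p * (1 - p) * i * (i + 1) * (2 * i + 1) *
      hypgeosumx * hypgeosump * exp(-i * (i + 1) * t / (4 * N))
  }
  return(total)
}

# Coefficient of z^(l-1) in the series for F(1-i, i+2, 2, z), for l >= 2.
myKayAll <- function(i, l) {
  n <- 1
  for (j in 1:(l - 1)) {
    n <- n * (j - i) * (j + 1 + i)
  }
  d <- factorial(l) * factorial(l - 1)
  return(n / d)
}

# F(1-i, i+2, 2, z) as a power series truncated after m terms
# (the series terminates at degree i-1, so m >= i is enough).
myHypergeoGaussSeries <- function(i, z, m) {
  h <- 1
  if (m >= 2) {
    for (j in 2:m) {
      h <- h + myKayAll(i, j) * z^(j - 1)
    }
  }
  return(h)
}

x <- seq(0, 1, length.out = 10001)
t <- 10
N <- 1000
p <- 0.5
max <- 25  # number of terms in the sum
y <- myDiffuse(x, p, t, N, max)

plot(x, y, type = 'l', ylim = c(-5, 15))
```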
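One possible explanation for the error that builds across the R plot, offered as a guess rather than a confirmed diagnosis: the helper functions sum each polynomial in powers of z. For larger i the coefficients become very large and alternate in sign, so for z close to 1 the result depends on cancellation that double precision may not be able to resolve. The Python sketch below compares that style of summation with a three-term Gegenbauer recurrence, using the relation $F(1-i, i+2, 2, x) = 2\,C_{i-1}^{(3/2)}(1-2x) / (i(i+1))$. The function names are hypothetical, and the power-series version mirrors the approach of the R helpers, not their exact arithmetic.

```python
import math


def f_power_series(i, x):
    """F(1-i, i+2, 2, x) summed term by term in powers of x, in floats."""
    total, coeff = 1.0, 1.0
    for k in range(1, i):  # the series terminates at degree i-1
        coeff *= (k - i) * (k + 1 + i) / ((k + 1.0) * k)
        total += coeff * x ** k
    return total


def f_recurrence(i, x):
    """The same polynomial from the Gegenbauer three-term recurrence."""
    z = 1.0 - 2.0 * x
    prev, cur = 0.0, 1.0  # C_{-1} = 0 and C_0 = 1
    for n in range(1, i):
        prev, cur = cur, ((2 * n + 1) * z * cur - (n + 1) * prev) / n
    return 2.0 * cur / (i * (i + 1))


# Compare the two evaluations near the right-hand edge of the plot.
for i in (5, 15, 25):
    a = f_power_series(i, 0.95)
    b = f_recurrence(i, 0.95)
    print(i, a, b, abs(a - b))
```

If the two columns drift apart as i grows, that would support the cancellation explanation, and the recurrence could be ported back to R as a replacement for `myHypergeoGaussSeries`.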