Observe.Think.Touch Nature

August 27, 2010

Berg and Hippel on Position Weight Matrix basics

Filed under: Books,Science Related — hebin @ 3:33 pm

Berg and Hippel 1986

[Append some thoughts]

See below for the reading notes; here are some further thoughts appended to them:

An important step in formulating the Boltzmann distribution is to place the system of interest (A) in contact with a big heat bath (A’); the two form a combined system that is assumed to have a fixed total energy E0. A is assumed to be a much smaller system than A’, and so is its energy. Studying A can therefore be thought of as perturbing the combined system by a small amount near its total energy. The probability of finding A in a particular state with energy Er can thus be examined by counting the number of microstates in A’ when its energy falls below E0 by Er. As described below, this is done by writing the probability as a function of the bath energy E’ = E0 - Er and Taylor expanding its logarithm near E0. After the higher-order terms are dropped, the change in the logarithm is the first derivative multiplied by Er, where the first derivative, ∂ ln Ω’/∂E’, is defined as β, which in turn is defined as 1/kT. So the probability of finding A in a state of energy Er is made up of two parts: a parameter T describing the heat bath with which A is equilibrated, and the increment in energy Er which describes the

Think of it like this: you perturb the small system by exchanging an amount of energy Er with the heat bath, then evaluate the probability of finding system A in this energy state by counting the number of microstates of the heat bath after this energy change, normalized by the sum of this number over all possible Er perturbations.
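The counting argument above can be tried numerically. Below is a minimal sketch, assuming (my choice, not from the notes) that the heat bath is a set of identical oscillators sharing integer energy quanta, so that its number of microstates has a closed form. The names `bath_microstates` and `state_probabilities` and the sizes `E0` and `M` are hypothetical.

```python
from math import comb, exp, log

def bath_microstates(quanta: int, oscillators: int) -> int:
    """Ways to distribute `quanta` energy units among `oscillators` (assumed bath model)."""
    return comb(quanta + oscillators - 1, quanta)

def state_probabilities(total_quanta: int, oscillators: int, max_er: int) -> list[float]:
    """P(Er) is proportional to the bath multiplicity at E0 - Er, normalized over all Er."""
    weights = [bath_microstates(total_quanta - er, oscillators) for er in range(max_er + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]

E0, M = 10_000, 5_000  # hypothetical total energy (in quanta) and bath size
probs = state_probabilities(E0, M, max_er=5)

# beta: slope of ln(Omega') with respect to energy, estimated at E0
beta = log(bath_microstates(E0, M)) - log(bath_microstates(E0 - 1, M))

for er in range(5):
    # successive probability ratios should be close to the Boltzmann ratio exp(-beta)
    print(er, probs[er + 1] / probs[er], exp(-beta))
```

Because the bath is much larger than the energies handed to A, the ratio between successive probabilities stays nearly constant, which is exactly the exponential dependence on Er.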

One more thing: note that T describes the heat bath, and is assumed to be constant because A’ is much larger than A, so the energy exchanged with A has a negligible effect on it. This means two things for our system of binding sites: 1. The combined system here is the whole motif. The perturbation is made to one position, and the analogy is that the possible combinations at all the other positions form an “environment”, which is used to evaluate the probability of choosing a particular nucleotide at that position. However, it seems that the environment here cannot be assumed to be much larger than a single position, so the number of combinations in that buffer is not so large (the influence of this is not clear to me yet). 2. T, the thermodynamic temperature, is good for characterizing a thermodynamic system, but in the binding site case it doesn’t have a clear physical meaning. Let’s designate the rate of change in this case as lambda, following Berg and Hippel. It seems to me that this lambda A. should depend on the kind of TF; B. may vary as we perturb one position.
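To get a feel for point 1, here is a toy sketch, assuming (purely for illustration) that the “energy” of the environment is just its number of non-consensus bases and that each mismatch can be any of 3 bases. The functions `env_states`, `local_lambda` and `slope_change` are hypothetical names. The sketch compares how much the slope of ln(number of states) changes after one unit of perturbation for a small and a large environment.

```python
from math import comb, log

def env_states(n_mismatch: int, n_positions: int) -> int:
    """Environment sequences with exactly n_mismatch non-consensus bases (3 choices each)."""
    return comb(n_positions, n_mismatch) * 3 ** n_mismatch

def local_lambda(n_mismatch: int, n_positions: int) -> float:
    """Finite-difference slope of ln(number of states) w.r.t. the mismatch count."""
    return log(env_states(n_mismatch + 1, n_positions)) - log(env_states(n_mismatch, n_positions))

def slope_change(n_mismatch: int, n_positions: int) -> float:
    """How much the slope drops when the environment takes up one more mismatch."""
    return local_lambda(n_mismatch, n_positions) - local_lambda(n_mismatch + 1, n_positions)

small = slope_change(3, 9)       # motif-sized environment (assumed 9 other positions)
large = slope_change(300, 900)   # same mismatch fraction, 100x larger environment

print(f"small environment: slope changes by {small:.3f}")
print(f"large environment: slope changes by {large:.4f}")
```

In this toy model the slope is far from constant for a motif-sized environment, which is consistent with the worry that lambda may vary as one position is perturbed.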

[Notes]

I have to resort to the notes on Statistical Mechanics (by Richard Fitzpatrick at UT Austin) to understand the underlying equations, namely the relationship between the probability of a state and its energy.

First, how are these two related in the position weight matrix case? What “probability” is being evaluated here? The idea is to consider all the binding sites with a given energy in its immediate vicinity (between E and E+δE), and to treat a site as a system made up of L particles, corresponding to the L positions in a motif. Each position takes a specific nucleotide from [ACGT], while the total energy is constrained to be E. Then the probability of observing nucleotide B at position l, flB, can be expressed in the form of a Boltzmann factor.
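As a small illustration of what “in the form of a Boltzmann factor” could look like for one position, here is a sketch that assumes flB is proportional to exp(-lambda × energy of base B at that position), normalized over the four bases. That functional form, the function name `position_frequencies`, and the energy values are my assumptions for illustration, not quoted from the paper.

```python
import math

def position_frequencies(energies: dict[str, float], lam: float) -> dict[str, float]:
    """Boltzmann-weighted base frequencies at one motif position (assumed form)."""
    weights = {base: math.exp(-lam * energy) for base, energy in energies.items()}
    z = sum(weights.values())  # normalization so the four frequencies sum to 1
    return {base: w / z for base, w in weights.items()}

# Hypothetical mismatch energies relative to the best base at this position
energies = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 2.0}
freqs = position_frequencies(energies, lam=1.0)

for base, f in freqs.items():
    print(base, round(f, 3))
```

A larger lambda concentrates the frequencies on the lowest-energy base, while lambda close to 0 makes all four bases nearly equally likely.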

Now turn to the Boltzmann factor, or rather the Boltzmann distribution. This part of the theory involves viewing a macroscopic system from the microscopic point of view and trying to interpret the former in terms of the latter. In statistical thermodynamics, the key assumption is that at equilibrium, all accessible states at a given energy level are equally probable. Evaluating the probability of finding a system in a particular state or set of states then involves counting the number of such states and dividing it by the total number of accessible states.

An interesting side point on the relationship between Prob(system), energy and thermodynamic temperature. First, think about this: usually when we consider a system in equilibrium, we are considering it in thermal equilibrium with a much larger system (e.g. a cup of water in the open space). It is the total energy E0 of the two (A and A’) that we assume to be constant (with some inevitable error). Then the probability that A has energy E is P(E) = C Ω(E) Ω’(E0-E). Take the log of both sides and Taylor expand the two Ω terms in the vicinity of the energy Em that maximizes P(E), writing η = E - Em: ln(Ω(E)Ω’(E’)) = ln(Ω(Em)Ω’(E0-Em)) + (β(Em) - β’(E0-Em))η + second-order term + … The idea is that when this is evaluated at the maximum, the linear term must vanish, because by definition nothing can exceed the maximum (if the two β were not equal, then some η of the appropriate sign would exceed the “maximum”). From this we can see that β characterizes some aspect of the system that relates to the probability of states. Note that in the Taylor expansion, β is simply the first derivative of the logarithm of the number of states w.r.t. energy, i.e. how fast ln(Ω) changes with energy. This is later used to define the thermodynamic temperature T, which is a dimensionless parameter characterizing the system: 1/kT = β.
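The claim that the two β values coincide at the most probable energy can be checked with a small numerical sketch. It assumes (my choice, for illustration only) that both A and A’ are collections of identical oscillators sharing integer energy quanta; `multiplicity`, `beta`, `most_probable_split` and the sizes below are hypothetical.

```python
from math import comb, log

def multiplicity(quanta: int, oscillators: int) -> int:
    """Number of microstates for `quanta` energy units in `oscillators` (assumed model)."""
    return comb(quanta + oscillators - 1, quanta)

def beta(quanta: int, oscillators: int) -> float:
    """Centered finite-difference estimate of d ln(Omega) / dE."""
    upper = log(multiplicity(quanta + 1, oscillators))
    lower = log(multiplicity(quanta - 1, oscillators))
    return (upper - lower) / 2

def most_probable_split(total_quanta: int, n_a: int, n_b: int) -> int:
    """Energy of A that maximizes Omega(E) * Omega'(E0 - E)."""
    return max(
        range(1, total_quanta),
        key=lambda e: multiplicity(e, n_a) * multiplicity(total_quanta - e, n_b),
    )

E0, N_A, N_B = 600, 100, 300  # hypothetical total energy and system sizes
E_m = most_probable_split(E0, N_A, N_B)
beta_a = beta(E_m, N_A)
beta_b = beta(E0 - E_m, N_B)

print("most probable energy of A:", E_m)
print("beta of A:", beta_a, "beta of A':", beta_b)
```

At the maximizing energy the two slopes agree up to the discreteness of the quanta, which is the vanishing of the linear term in the expansion.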

It is thus intuitively understandable that if the system and the heat bath (A’) have the same β or T, then that is the most probable state, the one with the most entropy.

This note is more for recording and rethinking my own understanding, and is thus almost certainly not nearly as clear as the text I referred to. I would recommend reading the PDF notes by Richard Fitzpatrick.

Getting back to the Boltzmann factor. It concerns evaluating the probability of the system being in a particular state (not a set of states with a given energy). I’ll simply record here how frequency, microstates and energy, and hence kT, are related:
We want to determine the probability of the system being in a particular state r of energy Er (in our PWM example this refers to observing nucleotide B at position l). Since A is then in a single state, the number of accessible states of the combined system is determined by A’. It’s reasonable to assume that Er << E0, in which case we can again Taylor expand the logarithm of the probability near E0, i.e. perturb the combined system a little bit by changing the energy of the system of interest. Thus ln(Pr) = ln C’ + ln Ω’(E0) - βEr. Note that although we counted the accessible states of A’, which we don’t know much about, rather than those of A, the term relating to A’ turns out to be a constant that we don’t care about and that can be worked out by requiring all the Pr to sum to 1; the important thing becomes the perturbation to the combined system due to Er.
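Written out step by step, the expansion and the normalization look as follows. This is only a restatement of the argument above in LaTeX; the summation index s is a label I introduce for the states of A.

```latex
\begin{align}
\ln P_r &= \ln C' + \ln \Omega'(E_0 - E_r) \\
        &\approx \ln C' + \ln \Omega'(E_0) - \beta E_r,
        \qquad \beta \equiv \left.\frac{\partial \ln \Omega'}{\partial E'}\right|_{E' = E_0} \\
P_r     &= C'\,\Omega'(E_0)\, e^{-\beta E_r} \\
\sum_s P_s = 1 \;&\Rightarrow\; P_r = \frac{e^{-\beta E_r}}{\sum_s e^{-\beta E_s}}
\end{align}
```

The unknown constant C’ Ω’(E0) cancels in the last line, so nothing about the heat bath survives except β.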

Thus in the Boltzmann factor exp(-βEr), we should view β as the rate of change in the logarithm of the number of accessible states w.r.t. energy and Er as the size of the perturbation; in combination they give the change in the (log) number of accessible states, a microscopic quantity that relates to the probability.

Note that the partition function, which is used to determine C’, is not unique. Think of the equation as counting the number of microstates, so that the probability is simply a conditional probability.

December 12, 2008

Optimal gene circuit design–Alon, Chapter 10

Filed under: Books — hebin @ 10:55 am
  1. cost-benefit analysis for lacZ operon
    the benefit is linear in each additional lacZ enzyme given a certain environmental lactose amount: every additional protein ~ additional lactose digested ~ additional resource ~ gain in growth
    however, the cost is nonlinear. When cells have excess resources (such as ribosomes etc.), the cost is negligible. But otherwise, synthesizing additional proteins ??? Why is the cost nonlinear? Plus, I think the cells don’t simply turn on the lacZ system while keeping all others “business as usual”?
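To see how a linear benefit combined with a nonlinear cost leads to an intermediate optimum, here is a toy sketch. The cost shape (negligible at low expression, rising steeply as expression approaches a capacity limit) and all parameter values are hypothetical choices of mine for illustration, not taken from the chapter.

```python
import math

def growth_gain(z: float, benefit_per_enzyme: float, cost_per_enzyme: float, capacity: float) -> float:
    """Linear benefit minus an assumed cost that blows up as z approaches `capacity`."""
    benefit = benefit_per_enzyme * z
    cost = cost_per_enzyme * z / (1.0 - z / capacity)
    return benefit - cost

def optimal_expression(benefit_per_enzyme: float, cost_per_enzyme: float, capacity: float) -> float:
    """Expression level where marginal benefit equals marginal cost for the toy model."""
    if benefit_per_enzyme <= cost_per_enzyme:
        return 0.0  # the enzyme never pays for itself
    return capacity * (1.0 - math.sqrt(cost_per_enzyme / benefit_per_enzyme))

b, c, m = 4.0, 1.0, 2.0  # hypothetical benefit slope, cost slope, capacity
z_star = optimal_expression(b, c, m)

print("optimal expression:", z_star)
print("gain at optimum:", growth_gain(z_star, b, c, m))
```

With a purely linear cost the best strategy would be all or nothing; it is the curvature of the cost that produces a finite optimal expression level in this sketch.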
