4.6 Compatible observables and the uncertainty relation

Now that we have explained how observables correspond to normal operators, we can try to understand what implications follow from the fact that matrix multiplication does not generally commute: AB\neq BA. We can start by trying to figure out when exactly two given operators A and B will or will not commute, ideally in terms of eigenvectors (since this will let us talk about outcomes and their numerical values, using the language we have just built up). An important definition is the following: if a basis \{|e_1\rangle,\ldots,|e_n\rangle\} is such that each |e_k\rangle is an eigenvector of an operator A, then we call it an eigenbasis of A.

First of all, assume that A and B do commute, so that AB=BA, and let |e\rangle be some eigenvector of A with eigenvalue \lambda. Then \begin{aligned} AB|e\rangle &= BA|e\rangle \\&= B\lambda|e\rangle \\&= \lambda(B|e\rangle) \end{aligned} which says that B|e\rangle is also an eigenvector of A, with eigenvalue \lambda. If \lambda is non-degenerate, then this says¹ that B|e\rangle is proportional to |e\rangle, which is simply saying that |e\rangle is also an eigenvector of B. This means that any eigenbasis of A is also an eigenbasis of B. Another way of saying this is that A and B are simultaneously diagonalisable: there exists a basis in which both A and B are diagonal, namely any common eigenbasis of the two.

Conversely, say that A and B have some common eigenbasis \{|e_1\rangle,\ldots,|e_n\rangle\}, with A|e_k\rangle=\alpha_k|e_k\rangle and B|e_k\rangle=\beta_k|e_k\rangle. To show that AB=BA, it suffices to show that (AB)|\psi\rangle=(BA)|\psi\rangle for any state |\psi\rangle. But we can write any |\psi\rangle in the common eigenbasis as |\psi\rangle=\sum_k\lambda_k|e_k\rangle for some \lambda_k, and then \begin{aligned} (AB)|\psi\rangle &= AB\sum_k\lambda_k|e_k\rangle \\&= \sum_k\lambda_k AB|e_k\rangle \\&= \sum_k\lambda_k A\beta_k|e_k\rangle \\&= \sum_k\lambda_k \beta_k A|e_k\rangle \\&= \sum_k\lambda_k \beta_k\alpha_k|e_k\rangle \end{aligned} and \alpha_k and \beta_k commute, since they are just complex numbers. This means that running the same calculation for (BA)|\psi\rangle would give exactly the same result, and so AB=BA.

Two operators A and B commute if and only if there exists some common eigenbasis. In this case, we say that A and B are compatible; if A and B do not commute then we say that they are incompatible.
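This equivalence is easy to check numerically. The following NumPy sketch (all names and the choice of matrices are our own) builds two Hermitian matrices that are diagonal in the same randomly chosen orthonormal basis and verifies both directions of the statement: the matrices commute, and the shared basis vectors are eigenvectors of each.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build two Hermitian matrices that are diagonal in the same (random)
# orthonormal basis: A = U diag(a) U† and B = U diag(b) U†.
U, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
a = np.diag([1.0, 2.0, 3.0, 4.0])
b = np.diag([5.0, 6.0, 7.0, 8.0])
A = U @ a @ U.conj().T
B = U @ b @ U.conj().T

commutator = A @ B - B @ A
print(np.allclose(commutator, 0))  # True: a common eigenbasis implies AB = BA

# Conversely, each column of U is a common eigenvector of A and B:
v = U[:, 0]
print(np.allclose(A @ v, 1.0 * v), np.allclose(B @ v, 5.0 * v))  # True True
```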

We have said that eigenvectors |e_k\rangle of an operator A correspond to outcomes of the observable, where the eigenvalue \lambda_k is the associated numerical value. So if we have two compatible operators A and B, then we have a complete system of measurements for both observables at once, given by their common eigenbasis, say \{|e_1\rangle,\ldots,|e_n\rangle\}. What does this mean in terms of measurements? Well, if we measure A on some system initially in state |\psi\rangle, then we know that the system will collapse into one of the states |e_k\rangle. But this is also an eigenvector for B, so measuring B won’t affect the state at all, and similarly for a subsequent measurement of A.

If, however, A and B are incompatible operators, then things are very different. If we measure A, then B, and then A again, there is absolutely no guarantee that the two measurements of A will be the same. In other words, measuring B somehow makes the system “forget” the result of the first measurement of A. We see this in the lab if we measure position and momentum of a particle: taking the momentum measurement “spreads out” the position of the particle throughout space, meaning that a position measurement taken immediately prior will have no reason to be the same as a position measurement taken immediately afterwards.
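We can simulate this “forgetting” directly. In the sketch below (a toy simulation of our own, taking the two incompatible observables to be the Pauli operators Z and X on a qubit), we measure Z, then X, then Z again, many times over: the two Z results disagree about half the time, even though the system starts in a Z eigenstate.

```python
import numpy as np

rng = np.random.default_rng(1)

def measure(state, basis, rng):
    """Projective measurement: return (outcome index, collapsed state)."""
    probs = np.abs(basis.conj().T @ state) ** 2
    k = rng.choice(len(probs), p=probs / probs.sum())
    return k, basis[:, k]

z_basis = np.eye(2)                                 # eigenbasis of Pauli Z
x_basis = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # eigenbasis of Pauli X

disagree = 0
for _ in range(10_000):
    psi = np.array([1.0, 0.0])           # |0>, an eigenstate of Z
    z1, psi = measure(psi, z_basis, rng)
    _, psi = measure(psi, x_basis, rng)  # intervening incompatible measurement
    z2, psi = measure(psi, z_basis, rng)
    disagree += (z1 != z2)

print(disagree / 10_000)  # ≈ 0.5: the X measurement erases the first Z result
```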

Incompatible operators turn up all over the place, and actually turn out to be very interesting — sometimes it’s good when things don’t work too simply! One particularly interesting question we can ask is the following: can we quantify how far away from being compatible two incompatible operators are? We can make this question more mathematically concrete by rephrasing it slightly, asking if we can find at least some states that are close to being common eigenstates.

Imagine preparing a huge number of systems into the same initial state |\psi\rangle, and then measuring A on half of them and B on the other half. Doing so, we can obtain the expected values \langle A\rangle and \langle B\rangle, and we can calculate (using classical statistics) the standard deviation of these variables, \sigma_A and \sigma_B, respectively. The standard deviation of a random variable is basically a measurement of “how close to the expected value are all the resulting values”.² The smaller the standard deviation, the more “well defined” the measurement is. In particular, given any single operator A, we can always make the standard deviation exactly 0, by just preparing our system in an eigenstate of A. If A and B are compatible, then we can simultaneously make \sigma_A and \sigma_B exactly 0 as well, since we know that A and B have a common eigenbasis.
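For a finite-dimensional system, the standard deviation in a state |\psi\rangle is \sigma_A=\sqrt{\langle A^2\rangle-\langle A\rangle^2}, which we can compute directly. A small sketch (using the Pauli Z operator as an example of our own choosing) shows the two extremes: zero deviation in an eigenstate, maximal deviation in an equal superposition.

```python
import numpy as np

def std_dev(A, psi):
    """Standard deviation sigma_A = sqrt(<A^2> - <A>^2) in the state psi."""
    exp_A = np.real(psi.conj() @ A @ psi)
    exp_A2 = np.real(psi.conj() @ A @ A @ psi)
    return np.sqrt(max(exp_A2 - exp_A**2, 0.0))

Z = np.diag([1.0, -1.0])                   # Pauli Z, eigenstates |0> and |1>

eigenstate = np.array([1.0, 0.0])          # |0>
superposition = np.array([1.0, 1.0]) / np.sqrt(2)

print(f"{std_dev(Z, eigenstate):.3f}")     # 0.000: an eigenstate has a sharp value
print(f"{std_dev(Z, superposition):.3f}")  # 1.000: outcomes ±1, probability 1/2 each
```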

The really interesting, purely quantum, phenomenon, however, comes when A and B are incompatible: we can prove that the standard deviations cannot both be made simultaneously arbitrarily small.

The uncertainty principle for operators A and B says that \sigma_A\sigma_B \geqslant \left|\frac{1}{2i}\langle[A,B]\rangle\right| where [A,B]=AB-BA is the commutator.

This says that there does not exist any state for which \sigma_A\sigma_B is less than some specific value, which is determined entirely by the operators A and B. Of course, if A and B are compatible, then [A,B]=0, and so the uncertainty principle doesn’t tell us anything at all — it simply says that the product of two non-negative numbers is greater than or equal to 0, which is always the case!
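We can also check the inequality numerically. The sketch below (our own choice of example) uses the Pauli operators X and Y, for which [X,Y]=2iZ, so the bound reads \sigma_X\sigma_Y\geqslant|\langle Z\rangle|; it verifies this on a thousand random states.

```python
import numpy as np

rng = np.random.default_rng(2)

X = np.array([[0, 1], [1, 0]], dtype=complex)  # Pauli X
Y = np.array([[0, -1j], [1j, 0]])              # Pauli Y
comm = X @ Y - Y @ X                           # [X, Y] = 2i Z

def std_dev(A, psi):
    exp_A = np.real(psi.conj() @ A @ psi)
    exp_A2 = np.real(psi.conj() @ A @ A @ psi)
    return np.sqrt(max(exp_A2 - exp_A**2, 0.0))

for _ in range(1000):
    psi = rng.normal(size=2) + 1j * rng.normal(size=2)
    psi /= np.linalg.norm(psi)
    lhs = std_dev(X, psi) * std_dev(Y, psi)
    rhs = abs((psi.conj() @ comm @ psi) / 2j)  # |<[X,Y]>/(2i)| = |<Z>|
    assert lhs >= rhs - 1e-12                  # the uncertainty relation holds

print("uncertainty bound verified on 1000 random states")
```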

You might recognise the name, having maybe heard elsewhere of Heisenberg’s uncertainty principle, which is indeed a special case of this: one can show that the commutator of the (one-dimensional) position and momentum operators is exactly i\hbar (where \hbar is again the very small number known as the reduced Planck constant), whence \sigma_x\sigma_p\geqslant\frac{\hbar}{2}.
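We cannot represent the genuine position and momentum operators by finite matrices, but we can see the commutator appear approximately by discretising: put x on a grid and take p=-i\hbar\,\mathrm{d}/\mathrm{d}x as a central finite difference (with \hbar set to 1, and all details of the discretisation our own). Away from the boundary of the grid, [X,P]\psi then agrees with i\psi up to O(h^2) error for a smooth \psi:

```python
import numpy as np

# Grid of n points with spacing h; X is multiplication by x.
n, h = 200, 0.05
x = np.arange(n) * h
X = np.diag(x)

# P = -i d/dx (hbar = 1) via a central finite difference, zero at the boundary.
P = np.zeros((n, n), dtype=complex)
for j in range(1, n - 1):
    P[j, j + 1] = -1j / (2 * h)
    P[j, j - 1] = +1j / (2 * h)

# A smooth wavefunction supported away from the grid boundary:
psi = np.exp(-((x - x.mean()) ** 2))

commutator_psi = (X @ P - P @ X) @ psi
# Away from the boundary, [X, P] psi ≈ i psi, up to O(h^2) discretisation error:
print(np.allclose(commutator_psi[1:-1], 1j * psi[1:-1], atol=1e-2))  # True
```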

We said that \hbar is very small, and this is fundamental to the relationship between quantum and classical physics. Most of the things that we deal with in day-to-day life are on the macroscopic level, and are many many orders of magnitude larger than the Planck constant. Indeed, if we wave our hands quite a lot, then we can say that “we see quantum effects only when dealing with things on the same order of magnitude as the Planck constant”. For example, a single photon of green light (roughly midway through the visible spectrum) has energy \approx3.5\times10^{-19} joules, whereas a mole of such photons (which is a “reasonable” number to encounter when talking about things that actually look green in day-to-day life) has energy \approx200\times10^{3} joules, so we would expect a single photon to exhibit quantum behaviour much more measurably than, for example, the light emitted from a green light bulb.
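The figures quoted here are easy to reproduce from E=hc/\lambda (taking green light to have wavelength roughly 560\,\mathrm{nm} — a choice of ours, since “green” covers a range of wavelengths):

```python
# Back-of-the-envelope check of the photon-energy figures quoted above.
h = 6.626e-34          # Planck constant, J*s
c = 2.998e8            # speed of light, m/s
N_A = 6.022e23         # Avogadro's number

E_photon = h * c / 560e-9  # E = hc/lambda for green light, lambda ≈ 560 nm
print(E_photon)        # ≈ 3.5e-19 J per photon
print(E_photon * N_A)  # ≈ 2.1e5 J per mole of photons
```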

In a way which we shall not make precise, the fact that \hbar is strictly greater than zero (albeit very small) is what makes quantum physics inherently discrete, in contrast to classical physics which treats things like energy continuously. Quite wondrously, it is very often the case that taking a limit \hbar\to0 in some formula in quantum physics recovers the corresponding formula in classical physics — this is known as the classical limit or correspondence principle. This isn’t unique to quantum physics: special relativity reduces to classical mechanics if we take all velocities to be much smaller than the speed of light; general relativity reduces to the classical theory of gravity if we take all gravitational fields to be weak enough; statistical mechanics reduces to thermodynamics when we take the number of particles to be large enough; and so on.

This idea, that classical systems can be recovered from quantum ones by taking \hbar\to0, poses a question: can we go in the other direction? That is, given some classical theory that we know agrees with physical experiments, can we formulate some corresponding quantum version which we might hope to be correct on much smaller scales? Trying to answer this question has led to some incredibly deep (and very technical) mathematics known as quantization theory, with geometric quantization and deformation quantization being two key areas.

Before moving on, let us consider one more quantum phenomenon that arises when we look at incompatible operators. Suppose that we have three operators, say A, B, and C, and we wish to measure these on our quantum system sequentially, throwing away any results which are not a given outcome. That is, if we start (for simplicity) with some eigenstate |a\rangle of A, then we want to know the probability of measuring some specific output |c\rangle. But we know how to calculate this!

First of all, we know the probability of measuring outcome |c\rangle via a given intermediate state |b_k\rangle: this is the probability of |a\rangle collapsing under the measurement of B into |b_k\rangle multiplied by the probability of |b_k\rangle collapsing under the measurement of C into |c\rangle, i.e. \Pr(c\text{ and }b_k) = |\langle c|b_k\rangle|^2|\langle b_k|a\rangle|^2. Then to obtain the probability of measuring outcome |c\rangle we can just sum over all possible intermediate states: \Pr(c) = \sum_{k}|\langle c|b_k\rangle|^2|\langle b_k|a\rangle|^2.

But now, if we forget entirely about B then we could calculate \Pr(c) in a different way: it is simply given by \Pr(c) = |\langle c|a\rangle|^2. Using the fact that \sum_{k}|b_k\rangle\langle b_k|=\mathbf{1}, we can rewrite this as \Pr(c) = \left|\sum_k\langle c|b_k\rangle\langle b_k|a\rangle\right|^2 and this is not generally equal to the previous expression for \Pr(c). In fact, you can show that these two expressions agree if and only if [A,B]=0 or [B,C]=0, i.e. if and only if either A and B or B and C are compatible.
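Here is the disagreement made concrete (with A=Z, B=X, and the states |a\rangle and |c\rangle chosen by us to make the discrepancy as stark as possible): summing probabilities over the intermediate states gives 1/2, but summing amplitudes first and then squaring gives 0.

```python
import numpy as np

a = np.array([1.0, 0.0])                            # eigenstate of A = Z
c = np.array([0.0, 1.0])                            # a possible final outcome
b_basis = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # eigenbasis of B = X

# With the intermediate measurement of B (sum of probabilities):
pr_with_B = sum(abs(c @ b_basis[:, k]) ** 2 * abs(b_basis[:, k] @ a) ** 2
                for k in range(2))

# Without the intermediate measurement (sum of amplitudes, then square):
pr_without_B = abs(sum((c @ b_basis[:, k]) * (b_basis[:, k] @ a)
                       for k in range(2))) ** 2

print(f"{pr_with_B:.3f}")     # 0.500
print(f"{pr_without_B:.3f}")  # 0.000 — the cross terms cancel in the amplitude sum
```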

We briefly discuss an explicit scenario where three measurements behave in such a paradoxical way later on, in Section 9, when we introduce Bell’s theorem, in what is sometimes known as the quantum Venn diagram paradox.


  1. Making this argument fully formal, and dealing with the case where \lambda is degenerate, isn’t too hard, but we don’t want to get too involved with the necessary linear algebra here.↩︎

  2. For example, if the random variable is normally distributed, then around 68% of the results will lie within one standard deviation from the expected value.↩︎