4.11 Quantum theory, formally

Even though multiplying and adding probability amplitudes is essentially all there is to quantum theory, we hardly ever multiply and add amplitudes in a pedestrian way. Instead, as we have seen, we neatly tabulate the amplitudes into vectors and matrices and let the matrix multiplication take care of multiplication and addition of amplitudes corresponding to different alternatives. Thus vectors and matrices appear naturally as our bookkeeping tools: we use vectors to describe quantum states, and matrices (operators) to describe quantum evolutions and measurements. This leads to a convenient mathematical setting for quantum theory: a complex vector space with an inner product (which is exactly a Hilbert space, since we only work in finite dimension). It turns out, somewhat miraculously, that this purely mathematical construct is exactly what we need to formalise quantum theory. It gives us a precise language which is appropriate for making empirically testable predictions. At a very instrumental level, quantum theory is a set of rules designed to answer questions such as “given a specific preparation and a subsequent evolution, how can we compute probabilities for the outcomes of such-and-such a measurement?” Here is how we represent preparations, evolutions and measurements in mathematical terms, and how we get probabilities.

Note that we have already said much of what follows, but we now summarise it more precisely, formally defining the mathematical framework of quantum theory that we use.

We also need to point out that a vital part of the formalism of quantum theory is missing from the following description, namely the idea of tensor products. To talk about this, we need to introduce the notion of entanglement, and this will be the subject of the next chapter.

It is a very reasonable question to ask why this formalism (Hilbert spaces, unitary operators, the Born rule) is “the good one”. One answer is that “it just works” — the calculations that we do in this framework give us answers which are in agreement with the results of physical experiments — but this can be rather unsatisfying as an answer.

Quite beautifully, it turns out that if we start from just five axioms, then we can prove that our choice of formalism is actually the only one that makes sense. This is the result of L. Hardy’s “Quantum Theory From Five Reasonable Axioms”, arXiv:quant-ph/0101012. We start by saying that a quantum system should be characterised by two integers: the number of degrees of freedom K, and the dimension N. The former is (roughly) the minimum number of real numbers needed to specify any state; the latter is the maximum number of states that can be distinguished from one another in one single measurement. The five axioms are then as follows.

  1. Probabilities. Relative frequencies of observed outcomes from measuring an ensemble of n systems tend to a well defined value, called the probability, when n tends to infinity.
  2. Simplicity. The integer K is a function of N, and takes the minimum possible value consistent with these axioms for each N.
  3. Subspaces. If a system is such that its states all lie within an M-dimensional subspace (for some M<N), then it behaves exactly like a system of dimension M.
  4. Composite systems. Composite systems behave multiplicatively, i.e. if a system is a composite of two subsystems A and B, then N=N_AN_B and K=K_AK_B.
  5. Continuity. Given any two pure states of a system, there exists a continuous reversible transformation of the system that sends one to the other. (All of the states that we have been discussing so far are pure states; we define what this means in Section 8.1.)

What is particularly nice, as a bonus result, is that if we make one tiny change to these axioms — just dropping the word “continuous” from the fifth axiom — then the result is exactly classical probability theory.
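
As a small sanity check of the composite-systems axiom, the sketch below assumes the two forms of K reported in Hardy's paper, namely K=N for classical probability theory and K=N^2 for quantum theory (these formulas are not stated in the text above and are taken as assumptions here). The function names are hypothetical. It only verifies that both choices are multiplicative, i.e. that N=N_AN_B implies K=K_AK_B.

```python
def k_classical(n):
    """Assumed classical relation between K and N: K = N."""
    return n


def k_quantum(n):
    """Assumed quantum relation between K and N: K = N^2."""
    return n ** 2


def is_multiplicative(k, max_dim=6):
    """Check K(N_A * N_B) == K(N_A) * K(N_B) for all small dimensions."""
    dims = range(1, max_dim + 1)
    return all(k(na * nb) == k(na) * k(nb) for na in dims for nb in dims)


classical_ok = is_multiplicative(k_classical)      # True
quantum_ok = is_multiplicative(k_quantum)          # True
# A relation such as K = N + 1 would violate the axiom.
shifted_ok = is_multiplicative(lambda n: n + 1)    # False
```

The check says nothing about the other four axioms; it only illustrates what "behaving multiplicatively" means for concrete choices of K.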

Quantum states

With any isolated quantum system which can be prepared in n perfectly distinguishable states, we can associate a Hilbert space \mathcal{H} of dimension n such that each vector |v\rangle\in\mathcal{H} of unit length (\langle v|v\rangle =1) represents a quantum state of the system. The overall phase of the vector has no physical significance: |v\rangle and e^{i\varphi}|v\rangle, for any real \varphi, describe the same state. The inner product \langle u|v\rangle is the probability amplitude that a quantum system prepared in state |v\rangle will be found in state |u\rangle. States corresponding to orthogonal vectors, \langle u|v\rangle=0, are perfectly distinguishable, since the system prepared in state |v\rangle will never be found in state |u\rangle, and vice versa. In particular, states forming orthonormal bases are always perfectly distinguishable from each other.
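
The rules for states can be tried out numerically. The following is a minimal sketch using NumPy, with a qubit (n=2) as the example system; the helper names `normalise` and `amplitude` are hypothetical and chosen only for illustration. It shows that an overall phase does not change any probability, and that orthogonal states have zero overlap.

```python
import numpy as np


def normalise(v):
    """Return v rescaled to unit length, so that <v|v> = 1."""
    v = np.asarray(v, dtype=complex)
    return v / np.linalg.norm(v)


def amplitude(u, v):
    """Inner product <u|v>; np.vdot conjugates its first argument."""
    return np.vdot(u, v)


# Two orthogonal states of a qubit and an equal superposition of them.
ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = normalise(ket0 + ket1)

# An overall phase changes the vector but not the probability |<u|v>|^2.
phased = np.exp(1j * 0.7) * plus

prob_plain = abs(amplitude(ket0, plus)) ** 2     # 0.5
prob_phased = abs(amplitude(ket0, phased)) ** 2  # also 0.5
overlap_01 = amplitude(ket0, ket1)               # 0: perfectly distinguishable
```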

Quantum evolutions

Any physically admissible evolution of an isolated quantum system is represented by a unitary operator.

Unitary operators describing evolutions of quantum systems are usually derived from the Schrödinger equation¹ \frac{\mathrm{d}}{\mathrm{d}t} |\psi(t)\rangle = -\frac{i}{\hbar} \hat{H}|\psi(t)\rangle where \hat{H} is a Hermitian operator called the Hamiltonian.

This equation contains a complete specification of all interactions both within the system and between the system and the external potentials. For time-independent Hamiltonians, the formal solution of the Schrödinger equation reads \begin{gathered} |\psi(t)\rangle = U(t) |\psi(0)\rangle \\\text{where}\quad U(t) = e^{-\frac{i}{\hbar}\hat{H}t}. \end{gathered} Any unitary matrix can be represented as the exponential e^{-it\hat{H}} for some Hermitian matrix \hat{H} and some real coefficient t (from here on we absorb \hbar into \hat{H}, i.e. we work in units where \hbar=1): \begin{aligned} e^{-it\hat{H}} &= \mathbf{1}- it\hat{H} + \frac{(-it)^2}{2}\hat{H}^2 + \frac{(-it)^3}{2\cdot3}\hat{H}^3 +\ldots \\&= \sum_{n=0}^\infty \frac{(-it)^n}{n!}\hat{H}^n. \end{aligned} The state vector changes smoothly: for t=0 the time evolution operator is merely the unit operator \mathbf{1}, and when t is very small U(t)\approx \mathbf{1}-it\hat{H} is close to the unit operator, differing from it by something of order t.
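
To make the exponential concrete, here is a minimal NumPy sketch, assuming units where \hbar=1 and using the Pauli X matrix as an illustrative Hamiltonian (both are choices made for this example only). The hypothetical helper `evolution_operator` builds U(t) from the eigendecomposition of \hat{H} rather than summing the power series, then checks that U(t) is unitary and that \mathbf{1}-it\hat{H} is a good approximation for small t.

```python
import numpy as np


def evolution_operator(H, t):
    """U(t) = exp(-i t H) for Hermitian H, via its spectral decomposition.

    Assumes hbar = 1. If H = sum_k E_k |k><k|, then
    U(t) = sum_k exp(-i t E_k) |k><k|.
    """
    energies, vectors = np.linalg.eigh(H)
    phases = np.exp(-1j * t * energies)
    return vectors @ np.diag(phases) @ vectors.conj().T


# Illustrative Hamiltonian: the Pauli X matrix (Hermitian).
H = np.array([[0, 1], [1, 0]], dtype=complex)
identity = np.eye(2)

U = evolution_operator(H, 0.3)
is_unitary = np.allclose(U.conj().T @ U, identity)  # True

# For small t, U(t) differs from 1 - i t H only at order t^2.
t_small = 1e-3
first_order = identity - 1j * t_small * H
error = np.linalg.norm(evolution_operator(H, t_small) - first_order)
```

Using the eigendecomposition is a design choice for the sketch: it is exact for Hermitian matrices up to rounding, whereas a truncated power series would need a stopping rule.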

Quantum circuits

In this course we will hardly refer to the Schrödinger equation. Instead we will assume that our clever colleagues — experimental physicists — are able to implement certain unitary operations, and we will use these unitaries, like lego blocks, to construct other, more complex, unitaries. We refer to pre-selected elementary quantum operations as quantum logic gates and we often draw diagrams, called quantum circuits, to illustrate how they act on qubits. For example, two unitaries, U followed by V, acting on a single qubit are represented as

[Circuit diagram: a single horizontal qubit wire passing first through a box labelled U and then through a box labelled V.]

This diagram should be read from left to right, and the horizontal line represents a qubit that is inertly carried from one quantum operation to another (maybe through space, down a physical wire, but maybe through some other physical implementation — we don’t particularly mind!)
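
One point worth checking numerically is the ordering: a circuit is read left to right, but the corresponding matrix product is written right to left, so U followed by V acts on a state as VU. The sketch below assumes, purely for illustration, that U is the Hadamard gate and V is the phase gate diag(1, i); the helper `run_circuit` is hypothetical.

```python
import numpy as np

# Illustrative gate choices (assumptions for this example only).
U = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard
V = np.array([[1, 0], [0, 1j]], dtype=complex)               # phase gate


def run_circuit(gates, state):
    """Apply the gates to the state in left-to-right circuit order."""
    for gate in gates:
        state = gate @ state
    return state


ket0 = np.array([1, 0], dtype=complex)

# "U followed by V" in the circuit is the matrix product V U.
out_stepwise = run_circuit([U, V], ket0)
out_product = (V @ U) @ ket0

same_result = np.allclose(out_stepwise, out_product)  # True
order_matters = not np.allclose(V @ U, U @ V)         # True for these gates
```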


Measurements

A complete measurement in quantum theory is determined by the choice of an orthonormal basis \{|e_1\rangle,\ldots,|e_n\rangle\} in \mathcal{H}, and every such basis (in principle) represents a possible measurement. Given a quantum system in state |\psi\rangle such that |\psi\rangle = \sum_i |e_i\rangle\langle e_i|\psi\rangle, the measurement in the basis \{|e_1\rangle,\ldots,|e_n\rangle\} gives the outcome labelled by e_k with probability |\langle e_k|\psi\rangle|^2, and leaves the system in state |e_k\rangle after measurement. This is consistent with our interpretation of the inner product \langle e_k|\psi\rangle as the probability amplitude that a quantum system prepared in state |\psi\rangle will be found in state |e_k\rangle. State vectors forming orthonormal bases are perfectly distinguishable from each other (\langle e_i|e_j\rangle=\delta_{ij}), so there is no ambiguity about the outcome. A complete measurement is the best we can do in terms of resolving state vectors in the basis states.
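
The rule for complete measurements can be sketched in a few lines of NumPy. The example assumes a qubit measured in the basis \{|+\rangle,|-\rangle\} (an illustrative choice), and the helper names `born_probabilities` and `measure` are hypothetical. The first function computes |\langle e_k|\psi\rangle|^2 for each basis vector; the second samples an outcome and returns the post-measurement state |e_k\rangle.

```python
import numpy as np


def born_probabilities(psi, basis):
    """Probabilities |<e_k|psi>|^2 for each vector e_k of an orthonormal basis."""
    return np.array([abs(np.vdot(e, psi)) ** 2 for e in basis])


def measure(psi, basis, rng):
    """Sample an outcome k and return it with the post-measurement state e_k."""
    probs = born_probabilities(psi, basis)
    probs = probs / probs.sum()  # guard against floating-point rounding
    k = rng.choice(len(basis), p=probs)
    return k, basis[k]


ket0 = np.array([1, 0], dtype=complex)
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
basis = [plus, minus]

probs_ket0 = born_probabilities(ket0, basis)  # 1/2 for each outcome
probs_plus = born_probabilities(plus, basis)  # 1 for |+>, 0 for |->

# Measuring |+> in this basis always gives the first outcome.
rng = np.random.default_rng(seed=0)
outcome, post_state = measure(plus, basis, rng)
```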

In general, for any decomposition of the identity \sum_k P_k=\mathbf{1} into orthogonal projectors P_k (i.e. P_kP_l = P_k\delta_{kl}), there exists a measurement that takes a quantum system in state |\psi\rangle, outputs label k with probability \langle\psi|P_k|\psi\rangle, and leaves the system in the state P_k|\psi\rangle (multiplied by the normalisation factor i.e. divided by the length of P_k|\psi\rangle): |\psi\rangle \mapsto \frac{P_k|\psi\rangle}{\sqrt{\langle\psi|P_k|\psi\rangle}}. The projector formalism covers both complete and incomplete measurements. The complete measurements are exactly those defined by rank-one projectors P_k=|e_k\rangle\langle e_k|, projecting on vectors from some orthonormal basis \{|e_k\rangle\}.
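
An incomplete measurement can be illustrated in the same way. The sketch below assumes a three-level system with the decomposition P_0=|0\rangle\langle 0|+|1\rangle\langle 1| and P_1=|2\rangle\langle 2| (chosen only as an example), and a hypothetical helper `projective_measurement` that returns, for each k, the probability \langle\psi|P_k|\psi\rangle together with the normalised state P_k|\psi\rangle.

```python
import numpy as np


def projective_measurement(psi, projectors):
    """Return one (probability, post-measurement state) pair per projector.

    The post-measurement state is None for outcomes of zero probability.
    """
    results = []
    for P in projectors:
        projected = P @ psi
        p = np.vdot(psi, projected).real  # <psi|P_k|psi>
        post = projected / np.sqrt(p) if p > 1e-12 else None
        results.append((p, post))
    return results


# Three-level system: P0 projects onto span{|0>, |1>}, P1 onto |2>.
P0 = np.diag([1, 1, 0]).astype(complex)
P1 = np.diag([0, 0, 1]).astype(complex)
projectors = [P0, P1]

sums_to_identity = np.allclose(P0 + P1, np.eye(3))        # True
orthogonal = np.allclose(P0 @ P1, np.zeros((3, 3)))       # True

psi = np.array([1, 1, 1], dtype=complex) / np.sqrt(3)
(p0, post0), (p1, post1) = projective_measurement(psi, projectors)
# p0 = 2/3 with post-measurement state (|0> + |1>)/sqrt(2);
# p1 = 1/3 with post-measurement state |2>.
```

Note that outcome 0 leaves the superposition of |0\rangle and |1\rangle intact, which is what distinguishes this measurement from a complete one in the basis \{|0\rangle,|1\rangle,|2\rangle\}.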

  1. We briefly discussed this equation in Section 3.6.