5.3 Quantum theory, formally (continued)

In Section 4.11, we said that our formalism of quantum theory was missing a key part; we can now finally fill in this hole. The mathematical formalism that we use for the quantum theory of composite systems is based on the tensor product of Hilbert spaces.

Tensor products

Let the states of some system \mathcal{A} be described by vectors in an n-dimensional Hilbert space \mathcal{H}_{\mathcal{A}}, and the states of some system \mathcal{B} by vectors in an m-dimensional Hilbert space \mathcal{H}_{\mathcal{B}}. The combined system of \mathcal{A} and \mathcal{B} is then described by vectors in the nm-dimensional tensor product space \mathcal{H}_{\mathcal{A}}\otimes\mathcal{H}_{\mathcal{B}}. Given bases \{|a_1\rangle,\ldots,|a_n\rangle\} of \mathcal{H}_{\mathcal{A}} and \{|b_1\rangle,\ldots,|b_m\rangle\} of \mathcal{H}_{\mathcal{B}}, we form a basis of the tensor product by taking the ordered pairs |a_i\rangle\otimes|b_j\rangle, for i=1,\ldots,n and j=1,\ldots,m. For brevity, we sometimes write |a_i\rangle\otimes|b_j\rangle as |a_i\rangle|b_j\rangle, or simply |a_ib_j\rangle. The tensor product space \mathcal{H}_{\mathcal{A}}\otimes\mathcal{H}_{\mathcal{B}} then consists of all linear combinations of such tensor product basis vectors:¹⁰⁵ |\psi\rangle = \sum_{i,j} c_{ij}|a_i\rangle\otimes|b_j\rangle. \tag{$\ddagger$}
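To make the construction of the product basis concrete, here is a minimal numerical sketch. It assumes that vectors are written as coordinate arrays in the standard bases, in which case the tensor product of two vectors is given by the Kronecker product (NumPy's `np.kron`). The dimensions n=2 and m=3 and the helper name `product_basis` are illustrative choices, not part of the formalism above.

```python
import numpy as np

def product_basis(n, m):
    """Return the n*m product basis vectors |a_i> ⊗ |b_j> as rows of an array.

    Assumes the standard (coordinate) bases on both factors.
    """
    basis_a = np.eye(n)  # rows are |a_1>, ..., |a_n>
    basis_b = np.eye(m)  # rows are |b_1>, ..., |b_m>
    return np.array([np.kron(a, b) for a in basis_a for b in basis_b])

n, m = 2, 3
basis = product_basis(n, m)

# There are n*m basis vectors, each living in an (n*m)-dimensional space.
print(basis.shape)

# The product of two orthonormal bases is again orthonormal.
print(np.allclose(basis @ basis.T, np.eye(n * m)))
```

With this ordering, the coefficient c_{ij} of a general state sits at position i·m + j of the coordinate array (counting from zero), so an n-by-m array of coefficients can be flattened row by row to obtain the state vector.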

The tensor product operation \otimes is distributive: \begin{gathered} |a\rangle \otimes \left( \beta_1|b_1\rangle + \beta_2|b_2\rangle \right) = \beta_1|a\rangle\otimes|b_1\rangle + \beta_2|a\rangle\otimes|b_2\rangle \\\left( \alpha_1|a_1\rangle + \alpha_2|a_2\rangle \right) \otimes |b\rangle = \alpha_1|a_1\rangle\otimes|b\rangle + \alpha_2|a_2\rangle\otimes|b\rangle. \end{gathered}

The tensor product of Hilbert spaces is again a Hilbert space: the inner products on \mathcal{H}_{\mathcal{A}} and \mathcal{H}_{\mathcal{B}} give a natural inner product on \mathcal{H}_{\mathcal{A}}\otimes\mathcal{H}_{\mathcal{B}}, defined for any two product vectors by \left( \langle a'|\otimes\langle b'| \right) \left( |a\rangle\otimes|b\rangle \right) = \langle a'|a\rangle\langle b'|b\rangle and extended by linearity to sums of tensor products of vectors, and, by associativity¹⁰⁶, to any number of subsystems. Note that the bra corresponding to the tensor product state |a\rangle\otimes|b\rangle is written as (|a\rangle\otimes|b\rangle)^\dagger = \langle a|\otimes\langle b|, where the order of the factors on either side of \otimes does not change when the dagger operation is applied.
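The inner product rule for product vectors can be checked numerically. The sketch below again assumes the Kronecker-product representation of the tensor product in the standard bases; the random vectors, the seed, and the helper name `random_state` are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_state(dim):
    """Random normalised complex vector (illustrative helper)."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

a, a_prime = random_state(2), random_state(2)
b, b_prime = random_state(3), random_state(3)

# np.vdot conjugates its first argument, so np.vdot(x, y) computes <x|y>.
lhs = np.vdot(np.kron(a_prime, b_prime), np.kron(a, b))
rhs = np.vdot(a_prime, a) * np.vdot(b_prime, b)
print(np.isclose(lhs, rhs))

# Taking the dagger does not reorder the factors:
# the conjugate of a ⊗ b is conj(a) ⊗ conj(b).
bra_of_product = np.kron(a, b).conj()
product_of_bras = np.kron(a.conj(), b.conj())
print(np.allclose(bra_of_product, product_of_bras))
```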

Some joint states of \mathcal{A} and \mathcal{B} can be expressed as a single tensor product, say |\psi\rangle=|a\rangle\otimes|b\rangle, meaning that the subsystem \mathcal{A} is in state |a\rangle, and the subsystem \mathcal{B} in state |b\rangle. If we expand |a\rangle=\sum_i\alpha_i|a_i\rangle and |b\rangle=\sum_j\beta_j|b_j\rangle, then |\psi\rangle=\sum_{i,j}\alpha_i\beta_j|a_i\rangle\otimes|b_j\rangle and we see that, for all such states, the coefficients c_{ij} in Equation (\ddagger) are of a rather special form: c_{ij} = \alpha_i\beta_j. We call such states separable (or product states). States that are not separable are said to be entangled.
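The condition c_{ij} = \alpha_i\beta_j says that the n-by-m matrix of coefficients is an outer product of two vectors, which for a non-zero state is the same as saying that this matrix has rank 1. This gives a simple numerical test for separability of a pure state, sketched below. The helper name `is_separable` is hypothetical, and the sketch relies on the default numerical tolerance of `np.linalg.matrix_rank`.

```python
import numpy as np

def is_separable(psi, n, m):
    """Check whether a pure state in an (n*m)-dimensional product space is a product state.

    psi holds the coefficients c_ij in row-major order (position i*m + j),
    which matches the ordering produced by np.kron.
    """
    c = np.asarray(psi).reshape(n, m)
    return bool(np.linalg.matrix_rank(c) == 1)

# |a_1> ⊗ (|b_1> + |b_2>)/sqrt(2): a product state
product = np.kron([1, 0], [1, 1]) / np.sqrt(2)

# (|a_1 b_1> + |a_2 b_2>)/sqrt(2): not of the form alpha_i * beta_j
entangled = np.array([1, 0, 0, 1]) / np.sqrt(2)

print(is_separable(product, 2, 2))    # True
print(is_separable(entangled, 2, 2))  # False
```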

A useful fact about tensor products is that (\lambda a)\otimes b = a\otimes(\lambda b) (where a and b are vectors, and \lambda is a scalar). This means that we don’t need to worry about where exactly we put \lambda, and can simply write \lambda(a\otimes b).

We will also need the concept of the tensor product of two operators. If A is an operator on \mathcal{H}_{\mathcal{A}} and B an operator on \mathcal{H}_{\mathcal{B}}, then the tensor product operator A\otimes B is an operator on \mathcal{H}_{\mathcal{A}}\otimes\mathcal{H}_{\mathcal{B}} defined by its action on product vectors via (A\otimes B)(|a\rangle\otimes|b\rangle) = (A|a\rangle)\otimes (B|b\rangle) and with its action on all other vectors determined by linearity: A\otimes B \left( \sum_{i,j} c_{ij}|a_i\rangle\otimes|b_j\rangle \right) = \sum_{i,j}c_{ij} A|a_i\rangle\otimes B|b_j\rangle.
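Both parts of this definition can be checked numerically. The sketch below assumes, as before, that operators are written as matrices in the standard bases, so that the tensor product of operators is the Kronecker product of their matrices. The particular matrices, vectors, and coefficients are arbitrary illustrative choices.

```python
import numpy as np

# Illustrative operators: a bit-flip on system A, a sign-flip on system B.
A = np.array([[0, 1], [1, 0]])
B = np.array([[1, 0], [0, -1]])

a = np.array([1, 2]) / np.sqrt(5)
b = np.array([3, 4]) / 5

# Action on a product vector: (A ⊗ B)(a ⊗ b) = (A a) ⊗ (B b)
lhs = np.kron(A, B) @ np.kron(a, b)
rhs = np.kron(A @ a, B @ b)
print(np.allclose(lhs, rhs))

# Action on a general vector, extended by linearity.
c = np.array([[0.5, 0.5j], [-0.5, 0.5]])  # coefficients c_ij
psi = c.reshape(-1)                       # sum_ij c_ij |a_i> ⊗ |b_j>

# A|a_i> is the i-th column of A, and likewise for B.
term_by_term = sum(
    c[i, j] * np.kron(A[:, i], B[:, j])
    for i in range(2)
    for j in range(2)
)
print(np.allclose(np.kron(A, B) @ psi, term_by_term))
```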

We have described the tensor product in terms of how it acts on bases, and then extended everything by linearity, distributivity, and associativity. But there are other, more abstract approaches to defining the tensor product.

For example, given two vector spaces V and W, we can construct their tensor product V\otimes W as a quotient of the free vector space on the cartesian product V\times W (that is, the space of formal linear combinations of pairs (v,w) of vectors in V and vectors in W) by the subspace spanned by the elements that encode the relations we want the tensor product to satisfy: \begin{gathered} (v_1+v_2,w)-(v_1,w)-(v_2,w), \\(v,w_1+w_2)-(v,w_1)-(v,w_2), \\(\lambda v,w)-\lambda(v,w), \\(v,\lambda w)-\lambda(v,w). \end{gathered}

But really this is hinting at the so-called universal property that defines the tensor product without giving a choice of explicit construction: the tensor product of V and W is defined to be any vector space A along with a bilinear map \otimes\colon V\times W\to A such that, for any other vector space Z along with a bilinear map f\colon V\times W\to Z, there exists a unique linear map \tilde{f}\colon A\to Z such that f=\tilde{f}\circ\otimes. In the language of category theory, the tensor product is the initial object amongst vector spaces endowed with a bilinear map from V\times W; any other vector space Z with a bilinear map V\times W\to Z factors through the tensor product.

One specific reason to care about giving a definition in terms of a universal property is that this guarantees (by some abstract nonsense) that the resulting object will be unique (“up to unique isomorphism”) whenever it exists, so you don’t need to worry about proving this separately.

Tensor products are much more general than just for vector spaces: they can be defined for modules (which are like vector spaces over an arbitrary commutative ring, instead of over a field), and abelian groups are, it turns out, exactly “modules over \mathbb{Z}”, so they also have a notion of tensor product. Going a bit deeper, we can define tensor products for complexes of modules and sheaves of modules, and these constructions are absolutely fundamental to modern algebraic geometry.

Going even deeper still (and now far beyond the purview of this book), tensor products are generalised by the notion of monoidal categories.

As a final note, the universal property of the tensor product can be used to prove that we do not need to impose the postulate “the Hilbert space of a composite system is the tensor product of the Hilbert spaces of its components”, but that this actually follows “for free” from the state and the measurement postulates. This is shown in Carcassi, Maccone, and Aidala’s “The four postulates of quantum mechanics are three”, arXiv:2003.11007.

¹⁰⁵ If the bases \{|a_i\rangle\} and \{|b_j\rangle\} are orthonormal then so too is the tensor product basis \{|a_i\rangle\otimes|b_j\rangle\}.
¹⁰⁶ Associativity means that ({\mathcal{H}}_a \otimes {\mathcal{H}}_b)\otimes {\mathcal{H}}_c = {\mathcal{H}}_a \otimes ({\mathcal{H}}_b\otimes {\mathcal{H}}_c).