Some mathematical preliminaries

0.1 Euclidean vectors

We assume that you are familiar with Euclidean vectors: those arrow-like geometric objects which are used to represent physical quantities, such as velocities or forces. You know that any two velocities can be added to yield a third, and that the multiplication of a “velocity vector” by a real number is another “velocity vector”. So a linear combination of vectors is another vector. Mathematicians have simply taken these properties and defined vectors as anything that we can add and multiply by numbers, as long as everything behaves in a nice enough way. This is basically what the Italian mathematician Giuseppe Peano (1858–1932) did in a chapter of his 1888 book with an impressive title: Calcolo geometrico secondo l’Ausdehnungslehre di H. Grassmann preceduto dalle operazioni della logica deduttiva.

0.2 Vector spaces

Following Peano, we define a vector space as a mathematical structure in which the notion of linear combination “makes sense”.

More formally, a complex vector space is a set V such that, given any two vectors a and b (that is, any two elements of V) and any two complex numbers \alpha and \beta, we can form the linear combination1 \alpha a+\beta b, which is also a vector in V.

A subspace of V is any subset of V which is closed under vector addition and multiplication by complex numbers.

Here we start using the Dirac bra-ket notation and write vectors in a somewhat fancy way as |\text{label}\rangle, where the “label” is anything that serves to specify what the vector is. For example, |\uparrow\rangle and |\downarrow\rangle may refer to an electron with spin up or down along some prescribed direction and |0\rangle and |1\rangle may describe a quantum bit (a “qubit”) holding either logical 0 or 1. These are often called ket vectors, or simply kets. (We will deal with “bras” in a moment.)

A basis in V is a collection of vectors |e_1\rangle,|e_2\rangle,\ldots,|e_n\rangle such that every vector |v\rangle in V can be written (in exactly one way) as a linear combination of the basis vectors: |v\rangle=\sum_i v_i|e_i\rangle. The number of elements in a basis is called the dimension of V. (Showing that this definition is independent of the basis that we choose is a “fun” linear algebra exercise.)

The most common n-dimensional complex vector space is the space of ordered n-tuples of complex numbers, usually written as column vectors: \begin{gathered} |a\rangle = \begin{bmatrix}a_1\\a_2\\\vdots\\a_n\end{bmatrix} \qquad |b\rangle = \begin{bmatrix}b_1\\b_2\\\vdots\\b_n\end{bmatrix} \\\alpha|a\rangle+\beta|b\rangle = \begin{bmatrix}\alpha a_1+\beta b_1\\\alpha a_2+\beta b_2\\\vdots\\\alpha a_n+\beta b_n\end{bmatrix} \end{gathered}
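The component-wise rule for linear combinations of column vectors can be tried out numerically. The following is a minimal sketch in plain Python, assuming that kets in \mathbb{C}^n are modelled as lists of complex numbers; the helper name `linear_combination` and the sample values are illustrative choices, not part of these notes.

```python
# Kets in C^n modelled as plain Python lists of complex numbers (an assumption
# made for illustration only).
def linear_combination(alpha, a, beta, b):
    """Return alpha|a> + beta|b>, computed component by component."""
    if len(a) != len(b):
        raise ValueError("kets must have the same dimension")
    return [alpha * ai + beta * bi for ai, bi in zip(a, b)]


ket_a = [1 + 0j, 2j]
ket_b = [0j, 1 - 1j]

# 2|a> + i|b> is again a list of two complex numbers, i.e. a vector in C^2.
ket_c = linear_combination(2, ket_a, 1j, ket_b)
print(ket_c)
```

The result has the same length as the inputs, which is the closure property that makes \mathbb{C}^n a vector space.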

In fact, this is the space we will use most of the time. Throughout the course we will deal only with vector spaces of finite dimensions. This is sufficient for all our purposes and we will avoid many mathematical subtleties associated with infinite dimensional spaces, for which we would need the tools of functional analysis.

0.3 Bras and kets

An inner product on a vector space V (over the complex numbers) is a function that assigns to each pair of vectors |u\rangle,|v\rangle\in V a complex number \langle u|v\rangle, and satisfies the following conditions:

  • \langle u|v\rangle=\langle v|u\rangle^\star;
  • \langle v|v\rangle\geqslant 0 for all |v\rangle;
  • \langle v|v\rangle= 0 if and only if |v\rangle=0.

The inner product must also be linear in the second argument but antilinear in the first argument: \langle c_1u_1+c_2u_2|v\rangle = c_1^\star\langle u_1|v\rangle+c_2^\star\langle u_2|v\rangle for any complex constants c_1 and c_2.

With any physical system we associate a complex vector space with an inner product, known as a Hilbert space2 \mathcal{H}. The inner product between vectors |u\rangle and |v\rangle in {\mathcal{H}} is written as \langle u|v\rangle.

For example, for column vectors |u\rangle and |v\rangle in \mathbb{C}^n written as |u\rangle = \begin{bmatrix}u_1\\u_2\\\vdots\\u_n\end{bmatrix} \qquad |v\rangle = \begin{bmatrix}v_1\\v_2\\\vdots\\v_n\end{bmatrix} their inner product is defined as \langle u|v\rangle = u_1^\star v_1 + u_2^\star v_2+\ldots + u_n^\star v_n. Following Dirac we may split the inner product into two ingredients \langle u|v\rangle \longrightarrow \langle u|\,|v\rangle. Here |v\rangle is a ket vector, and \langle u| is called a bra vector, or a bra, and can be represented by a row vector: \langle u| = \begin{bmatrix}u_1^\star,&u_2^\star,&\ldots,&u_n^\star\end{bmatrix}. The inner product can now be viewed as the result of the matrix multiplication: \begin{aligned} \langle u|v\rangle &= \begin{bmatrix}u_1^\star,&u_2^\star,&\ldots,&u_n^\star\end{bmatrix} \cdot \begin{bmatrix}v_1\\v_2\\\vdots\\v_n\end{bmatrix} \\&= u_1^\star v_1 + u_2^\star v_2 + \ldots + u_n^\star v_n. \end{aligned}
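As a quick numerical check of the inner product on \mathbb{C}^n, here is a small sketch in plain Python. It assumes kets are stored as lists of complex numbers; the function name `inner` and the sample vectors are hypothetical, chosen only for illustration.

```python
def inner(u, v):
    """Return <u|v> = sum_i conj(u_i) * v_i for two kets of equal dimension."""
    if len(u) != len(v):
        raise ValueError("kets must have the same dimension")
    # The bra <u| carries the complex conjugates of the components of |u>.
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


u = [1 + 1j, 2 + 0j]
v = [1j, 1 + 0j]

print(inner(u, v))  # <u|v>
print(inner(v, u))  # <v|u>, the complex conjugate of <u|v>
print(inner(v, v))  # <v|v>, real and non-negative
```

Conjugating the components of the first argument is what makes the product antilinear in the first slot and linear in the second.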

Bras are vectors: you can add them, and multiply them by scalars (which, here, are complex numbers), but they are vectors in the space {\mathcal{H}}^\star which is dual to \mathcal{H}. Elements of {\mathcal{H}}^\star are linear functionals, that is, linear maps from \mathcal{H} to \mathbb{C}. A linear functional \langle u| acting on a vector |v\rangle in \mathcal{H} gives a complex number \langle u|v\rangle.

All Hilbert spaces of the same dimension are isomorphic, so the differences between quantum systems cannot really be understood without additional structure. This structure is provided by a specific algebra of operators acting on \mathcal{H}.

0.4 Daggers

Although \mathcal{H} and \mathcal{H}^\star are not identical spaces – the former is inhabited by kets, and the latter by bras – they are closely related. There is a bijective map from one to the other, |v\rangle\leftrightarrow \langle v|, denoted by a dagger:3 \begin{aligned} \langle v| &= (|v\rangle)^\dagger \\|v\rangle &= (\langle v|)^\dagger. \end{aligned} We usually omit the parentheses when it is obvious what the dagger operation applies to.

The dagger operation, also known as Hermitian conjugation, is antilinear: \begin{aligned} (c_1|v_1\rangle+c_2|v_2\rangle)^\dagger &= c_1^\star\langle v_1| + c_2^\star\langle v_2| \\(c_1\langle v_1|+c_2\langle v_2|)^\dagger &= c_1^\star|v_1\rangle + c_2^\star|v_2\rangle. \end{aligned} Also, when applied twice, the dagger operation is the identity map. In the matrix representation,4 |v\rangle = \begin{bmatrix}v_1\\v_2\\\vdots\\v_n\end{bmatrix} \overset{\dagger}{\longleftrightarrow} \langle v| = \begin{bmatrix}v_1^\star,&v_2^\star,&\ldots,&v_n^\star\end{bmatrix}.
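In the matrix representation the dagger is just the conjugate transpose, so its properties are easy to test. The sketch below assumes matrices are stored as lists of rows, with a ket as an n-by-1 matrix; the helper names `dagger` and `scale` are illustrative and not taken from these notes.

```python
def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[m[r][c].conjugate() for r in range(rows)] for c in range(cols)]


def scale(c, m):
    """Multiply every entry of the matrix m by the complex number c."""
    return [[c * x for x in row] for row in m]


ket_v = [[1 + 2j], [3 - 1j]]  # 2x1 column vector |v>
bra_v = dagger(ket_v)         # 1x2 row vector <v|

c = 2j
lhs = dagger(scale(c, ket_v))        # (c|v>)^dagger
rhs = scale(c.conjugate(), bra_v)    # c^* <v|

print(bra_v)
print(lhs == rhs)                # antilinearity
print(dagger(bra_v) == ket_v)    # applying the dagger twice gives the identity map
```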

0.5 Geometry

The inner product brings geometry: the length, or norm, of |v\rangle is given by \|v\|=\sqrt{\langle v|v\rangle}, and we say that |u\rangle and |v\rangle are orthogonal if \langle u|v\rangle=0. Any maximal set of pairwise orthogonal vectors of unit length5 forms an orthonormal basis, and so any vector can be expressed as a linear combination of the basis vectors: \begin{gathered} |v\rangle =\sum_i v_i|e_i\rangle \\\text{where $v_i=\langle e_i|v\rangle$}. \end{gathered} Then the bras \langle e_i| form the dual basis \begin{gathered} \langle v| =\sum_i v_i^\star\langle e_i| \\\text{where $v_i^\star=\langle v|e_i\rangle$}. \end{gathered}
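The expansion of a vector in an orthonormal basis can be checked numerically. The sketch below is plain Python under the assumption that kets are lists of numbers; the particular basis (the two vectors (1,1)/\sqrt{2} and (1,-1)/\sqrt{2}) and the vector v are arbitrary illustrative choices.

```python
import math


def inner(u, v):
    """Return <u|v> = sum_i conj(u_i) * v_i."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]  # an orthonormal basis of C^2 (illustrative choice)
v = [1 + 2j, 3 - 1j]

# Components v_i = <e_i|v>.
components = [inner(e, v) for e in basis]

# Rebuild |v> = sum_i v_i |e_i>, one coordinate at a time.
rebuilt = [
    sum(c * e[k] for c, e in zip(components, basis))
    for k in range(len(v))
]

print(components)
print(rebuilt)  # agrees with v up to floating-point rounding
```

Because the arithmetic is floating-point, comparisons with the original vector should use a small tolerance rather than exact equality.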

To make the notation a bit less cumbersome, we will sometimes label the basis kets as |i\rangle rather than |e_i\rangle, and write \begin{aligned} |v\rangle &= \sum_i |i\rangle\langle i|v\rangle \\\langle v| &= \sum_i \langle v|i\rangle\langle i|. \end{aligned} But do not confuse |0\rangle with the zero vector! We never write the zero vector as |0\rangle, but only ever as 0, without any bra or ket decorations (so e.g. |v\rangle+0=|v\rangle).

With any isolated quantum system, which can be prepared in n perfectly distinguishable states, we can associate a Hilbert space \mathcal{H} of dimension n such that each vector |v\rangle\in\mathcal{H} of unit length (i.e. \langle v|v\rangle =1) represents a quantum state of the system.

The overall phase of the vector has no physical significance: |v\rangle and e^{i\varphi}|v\rangle (for any real \varphi) both describe the same state.

The inner product \langle u|v\rangle is the probability amplitude that a quantum system prepared in state |v\rangle will be found in state |u\rangle upon measurement.

States corresponding to orthogonal vectors (i.e. \langle u|v\rangle=0) are perfectly distinguishable, since, if we prepare the system in state |v\rangle, then it will never be found in state |u\rangle, and vice versa. In particular, states forming orthonormal bases are always perfectly distinguishable from each other. Choosing such states, as we shall see in a moment, is equivalent to choosing a particular quantum measurement.
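The statements about amplitudes, global phases and orthogonal states can be illustrated with a few lines of plain Python. This is only a sketch, and it relies on one assumption that this section does not state: that the probability of an outcome is the squared modulus of the corresponding amplitude, |\langle u|v\rangle|^2. The state v, the phase 0.7 and the helper names are illustrative.

```python
import cmath
import math


def inner(u, v):
    """Return the amplitude <u|v> = sum_i conj(u_i) * v_i."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def probability(u, v):
    """ASSUMPTION (not stated in this section): probability = |<u|v>|^2."""
    return abs(inner(u, v)) ** 2


s = 1 / math.sqrt(2)
ket0 = [1, 0]
ket1 = [0, 1]
v = [s, 1j * s]  # the unit vector (|0> + i|1>)/sqrt(2)

p0 = probability(ket0, v)
p1 = probability(ket1, v)
print(p0, p1)  # the two probabilities add up to 1 (up to rounding)

# Orthogonal states: a system prepared in |1> is never found in |0>.
print(probability(ket0, ket1))

# A global phase changes the amplitude but not its modulus.
v_phased = [cmath.exp(0.7j) * x for x in v]
print(probability(ket0, v_phased))
```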

0.6 Operators

A linear map between two vector spaces \mathcal{H} and \mathcal{K} is a function A\colon\mathcal{H}\to\mathcal{K} that respects linear combinations: A(c_1|v_1\rangle+c_2|v_2\rangle)=c_1 A|v_1\rangle+c_2 A|v_2\rangle for any vectors |v_1\rangle,|v_2\rangle and any complex numbers c_1,c_2. We will focus mostly on endomorphisms, that is, maps from \mathcal{H} to \mathcal{H}, and we will call them operators. The symbol \mathbf{1} is reserved for the identity operator that maps every element of \mathcal{H} to itself (i.e. \mathbf{1}|v\rangle=|v\rangle for all |v\rangle\in\mathcal{H}).

The product AB of two operators A and B is the operator obtained by first applying B to some ket |v\rangle and then A to the ket which results from applying B: (AB)|v\rangle = A(B|v\rangle). The order does matter: in general, AB\neq BA. In the exceptional case in which AB=BA, one says that these two operators commute. The inverse of A, written as A^{-1}, is the operator that satisfies AA^{-1}=\mathbf{1}=A^{-1}A. For finite-dimensional spaces, one only needs to check one of these two conditions, since either one implies the other, whereas, on an infinite-dimensional space, both must be checked.

Given a particular basis, an operator A is uniquely determined by the entries of its matrix, defined by A_{ij}=\langle i|A|j\rangle. The adjoint, or Hermitian conjugate, of A, denoted by A^\dagger, is defined by the relation \begin{gathered} \langle i|A^\dagger|j\rangle = \langle j|A|i\rangle^\star \\\text{for all $|i\rangle,|j\rangle\in\mathcal{H}$}. \end{gathered}
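Operator products and adjoints become concrete once operators are written as matrices. The sketch below uses plain Python lists of rows; the matrices X and Z are the standard Pauli matrices, used here only as a convenient example of two operators that do not commute, and A and B are arbitrary illustrative matrices. None of these names come from this section.

```python
def matmul(a, b):
    """Matrix product AB for matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def dagger(m):
    """Adjoint: entry (i, j) of the result is the conjugate of entry (j, i) of m."""
    return [[m[r][c].conjugate() for r in range(len(m))] for c in range(len(m[0]))]


identity = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

XZ = matmul(X, Z)  # apply Z first, then X
ZX = matmul(Z, X)  # apply X first, then Z
print(XZ, ZX)      # the two products differ, so X and Z do not commute

print(matmul(X, X) == identity)  # X is its own inverse

A = [[1, 2j], [3, 1 - 1j]]
B = [[1j, 0], [1, 2]]
print(dagger(matmul(A, B)) == matmul(dagger(B), dagger(A)))  # (AB)^dagger = B^dagger A^dagger
```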

An operator A is said to be

  • normal if AA^\dagger = A^\dagger A,
  • unitary if AA^\dagger = A^\dagger A = \mathbf{1},
  • Hermitian (or self-adjoint) if A^\dagger = A.

Any physically admissible evolution of an isolated quantum system is represented by a unitary operator. Note that unitary operators preserve the inner product: given a unitary operator U and two kets |a\rangle and |b\rangle, and defining |a'\rangle=U|a\rangle and |b'\rangle=U|b\rangle, we have that \begin{gathered} \langle a'|=\langle a|U^\dagger \\\langle b'|=\langle b|U^\dagger \\\langle a'|b'\rangle=\langle a|U^\dagger U|b\rangle=\langle a|\mathbf{1}|b\rangle=\langle a|b\rangle. \end{gathered} Preserving the inner product implies preserving the norm induced by this product, so unit state vectors are mapped to unit state vectors. In other words, unitary operations are the isometries of the Euclidean norm.
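That a unitary operator preserves inner products can be verified numerically for a specific case. The sketch below is plain Python and uses the 2-by-2 Hadamard matrix as the unitary; that choice, the vectors a and b, and the helper names are assumptions made for illustration only.

```python
import math


def inner(u, v):
    """Return <u|v> = sum_i conj(u_i) * v_i."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def apply(m, x):
    """Apply the matrix m (a list of rows) to the ket x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in m]


s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]  # Hadamard matrix: real, symmetric and unitary

a = [1 + 2j, 3 - 1j]
b = [2j, 1 + 1j]
a_prime = apply(H, a)
b_prime = apply(H, b)

# Inner products and norms agree before and after, up to rounding.
print(inner(a, b), inner(a_prime, b_prime))
print(inner(a, a), inner(a_prime, a_prime))
```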

0.7 Outer products

Apart from the inner product \langle u|v\rangle, which is a complex number, we can also form the outer product |u\rangle\langle v|, which is a linear map (operator) on \mathcal{H} (or on \mathcal{H}^\star, depending on how you look at it). This is what physicists like (and what mathematicians dislike!) about Dirac notation: a certain degree of healthy ambiguity.

  • The result of |u\rangle\langle v| acting on a ket |x\rangle is |u\rangle\langle v|x\rangle, i.e. the vector |u\rangle multiplied by the complex number \langle v|x\rangle.
  • Similarly, the result of |u\rangle\langle v| acting on a bra \langle y| is \langle y|u\rangle\langle v|, i.e. the functional \langle v| multiplied by the complex number \langle y|u\rangle.

The product of two maps, A=|a\rangle\langle b| followed by B=|c\rangle\langle d|, is a linear map BA, which can be written in Dirac notation as BA = |c\rangle\langle d|a\rangle\langle b| = \langle d|a\rangle|c\rangle\langle b| i.e. the inner product (complex number) \langle d|a\rangle times the outer product (linear map) |c\rangle\langle b|.
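The rules for outer products acting on kets and for composing two outer products can be checked with small matrices. In this plain-Python sketch the outer product |u\rangle\langle v| is represented by the matrix with entries u_i v_j^\star; the vectors a, b, c, d, x and the helper names are illustrative assumptions.

```python
def inner(u, v):
    """Return <u|v> = sum_i conj(u_i) * v_i."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def outer(u, v):
    """Matrix of |u><v|, with entries u_i * conj(v_j)."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]


def apply(m, x):
    """Apply the matrix m (a list of rows) to the ket x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in m]


def matmul(a, b):
    """Matrix product AB for matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


a, b = [1 + 1j, 2], [1j, 1]
c, d = [3, 1 - 1j], [1, 2j]
x = [2, 1 + 1j]

A = outer(a, b)  # |a><b|
B = outer(c, d)  # |c><d|

# |a><b| acting on |x> is the ket |a> multiplied by the number <b|x>.
print(apply(A, x) == [inner(b, x) * ai for ai in a])

# A followed by B: BA = <d|a> |c><b|.
BA = matmul(B, A)
expected = [[inner(d, a) * entry for entry in row] for row in outer(c, b)]
print(BA == expected)
```

Exact equality works here only because all entries are small Gaussian integers; with general floating-point data a tolerance would be needed.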

Any operator on \mathcal{H} can be expressed as a sum of outer products. Given an orthonormal basis \{|e_i\rangle\}, any operator which maps the basis vectors |e_i\rangle to vectors |f_i\rangle can be written as \sum_i|f_i\rangle\langle e_i|, where the sum is over all the vectors in the orthonormal basis. If the vectors \{|f_i\rangle\} also form an orthonormal basis then the operator simply “rotates” one orthonormal basis into another. These are unitary operators which preserve the inner product. In particular, if each |e_i\rangle is mapped to |e_i\rangle, then we obtain the identity operator: \sum_i|e_i\rangle\langle e_i|=\mathbf{1}. This relation holds for any orthonormal basis, and it is one of the most ubiquitous and useful formulas in quantum theory. For example, for any vector |v\rangle and for any orthonormal basis \{|e_i\rangle\}, we have \begin{aligned} |v\rangle &= \mathbf{1}|v\rangle \\&= \sum_i |e_i\rangle\langle e_i|\;|v\rangle \\&= \sum_i |e_i\rangle\;\langle e_i|v\rangle \\&= \sum_i v_i|e_i\rangle, \end{aligned} where v_i=\langle e_i|v\rangle are the components of |v\rangle. Finally, note that the adjoint of |a\rangle\langle b| is |b\rangle\langle a|.
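The decomposition of the identity, and the operator \sum_i|f_i\rangle\langle e_i| that maps one orthonormal basis onto another, can both be built explicitly. The sketch below is plain Python; the two bases (the standard basis e and the basis f made of (1,1)/\sqrt{2} and (1,-1)/\sqrt{2}) and all helper names are illustrative assumptions.

```python
import math


def outer(u, v):
    """Matrix of |u><v|, with entries u_i * conj(v_j)."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]


def add(m, n):
    """Entry-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rm, rn)] for rm, rn in zip(m, n)]


def apply(m, x):
    """Apply the matrix m (a list of rows) to the ket x."""
    return [sum(row[k] * x[k] for k in range(len(x))) for row in m]


s = 1 / math.sqrt(2)
e = [[1, 0], [0, 1]]   # standard orthonormal basis
f = [[s, s], [s, -s]]  # another orthonormal basis

# sum_i |f_i><f_i| should be the identity, whichever orthonormal basis is used.
resolution = add(outer(f[0], f[0]), outer(f[1], f[1]))
print(resolution)

# U = sum_i |f_i><e_i| maps each e_i to f_i.
U = add(outer(f[0], e[0]), outer(f[1], e[1]))
print(apply(U, e[0]), apply(U, e[1]))
```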

0.8 The trace

The trace is an operation which turns outer products into inner products, \operatorname{tr}\colon |b\rangle\langle a| \longmapsto \langle a|b\rangle. We have just seen that any linear operator can be written as a sum of outer products, and so we can extend the definition of trace (by linearity) to any operator. Alternatively, for any square matrix A, the trace of A is defined to be the sum of its diagonal elements: \operatorname{tr}A = \sum_k \langle e_k|A|e_k\rangle = \sum_k A_{kk}. You can show, using this definition or otherwise, that the trace is cyclic (i.e. \operatorname{tr}(AB) = \operatorname{tr}(BA)) and linear (i.e. \operatorname{tr}(\alpha A+\beta B) = \alpha\operatorname{tr}(A)+\beta\operatorname{tr}(B), where A and B are square matrices and \alpha and \beta complex numbers). Moreover, \begin{aligned} \operatorname{tr}|b\rangle\langle a| &= \sum_k \langle e_k|b\rangle\langle a|e_k\rangle \\&= \sum_k \langle a|e_k\rangle\langle e_k|b\rangle \\&= \langle a|\mathbf{1}|b\rangle \\&= \langle a|b\rangle. \end{aligned} Here, the second term can be viewed both as the sum of the diagonal elements of |b\rangle\langle a| in the |e_k\rangle basis, and as the sum of the products of two complex numbers \langle e_k|b\rangle and \langle a|e_k\rangle. We have used the decomposition of the identity, \sum_k|e_k\rangle\langle e_k|=\mathbf{1}. Given that we can decompose the identity by choosing any orthonormal basis, it is clear that the trace does not depend on the choice of the basis.
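Both descriptions of the trace, as the map |b\rangle\langle a|\mapsto\langle a|b\rangle and as the sum of diagonal entries, can be compared directly. The following plain-Python sketch uses arbitrary illustrative vectors and matrices; the helper names are not taken from these notes.

```python
def inner(u, v):
    """Return <u|v> = sum_i conj(u_i) * v_i."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))


def outer(u, v):
    """Matrix of |u><v|, with entries u_i * conj(v_j)."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]


def matmul(a, b):
    """Matrix product AB for matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[k][k] for k in range(len(m)))


a = [1 + 1j, 2]
b = [1j, 3 - 1j]
A = [[1, 2j], [3, 1 - 1j]]
B = [[1j, 0], [1, 2]]

print(trace(outer(b, a)), inner(a, b))            # tr|b><a| equals <a|b>
print(trace(matmul(A, B)), trace(matmul(B, A)))   # tr(AB) equals tr(BA)
```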

0.9 Some useful identities

  • |a\rangle^\dagger = \langle a|
  • \langle a|^\dagger = |a\rangle
  • (\alpha|a\rangle+\beta|b\rangle)^\dagger = \alpha^\star\langle a|+\beta^\star\langle b|
  • (|a\rangle\langle b|)^\dagger = |b\rangle\langle a|
  • (AB)^\dagger=B^\dagger A^\dagger
  • (\alpha A+\beta B)^\dagger=\alpha^\star A^\dagger+\beta^\star B^\dagger
  • (A^\dagger)^\dagger=A
  • \operatorname{tr}(\alpha A+ \beta B) = \alpha \operatorname{tr}(A)+\beta\operatorname{tr}(B)
  • \operatorname{tr}|a\rangle\langle b| = \langle b|a\rangle
  • \operatorname{tr}(ABC) = \operatorname{tr}(CAB) = \operatorname{tr}(BCA)

  1. As we said, there are certain “nice properties” that these things must satisfy. Addition of vectors must be commutative and associative, with an identity (the zero vector, which will always be written as 0) and an inverse for each v (written as -v). Multiplication by complex numbers must obey the two distributive laws: (\alpha+\beta)v = \alpha v+\beta v and \alpha (v+w) = \alpha v+\alpha w.↩︎

  2. The term “Hilbert space” used to be reserved for an infinite-dimensional inner product space that is complete, i.e. such that every Cauchy sequence in the space converges to an element in the space. Nowadays, as in these notes, the term includes finite-dimensional spaces, which automatically satisfy the condition of completeness.↩︎

  3. “Is this a \dagger which I see before me…”↩︎

  4. Recall that the conjugate transpose, or the Hermitian conjugate, of an (n\times m) matrix A is an (m\times n) matrix A^\dagger, obtained by interchanging the rows and columns of A and taking complex conjugates of each entry in A, i.e. A^\dagger_{ij}=A^\star_{ji}. In mathematics texts it is often denoted by {}^\star rather than {}^\dagger.↩︎

  5. That is, consider sets of vectors |e_i\rangle such that \langle e_i|e_j\rangle=\delta_{ij} (where the Kronecker delta \delta_{ij} is 0 if i\neq j, and 1 if i=j), and then pick any of the largest such sets (which must exist, since we assume our vector spaces to be finite dimensional).↩︎