Chapter 5 Quantum entanglement

About the fundamental tool of quantum computing: entanglement, via the formalism of tensor products, which was the missing ingredient from our previous formalism of quantum theory. Also about various controlled gates, including the always useful controlled-NOT.

We now know everything we need to know about a single qubit and its quantum behaviour. But if we want to understand quantum computation — a complicated quantum interference of many interacting qubits — then we will need few more mathematical tools. Stepping up from one qubit to two or more is a bigger leap than you might expect. Already, with just two qubits, we will encounter the remarkable phenomenon of quantum entanglement and have a chance to discuss some of the most puzzling features of quantum theory that took people decades to understand.

5.1 A small history

The notion of quantum entanglement was the subject of many early debates that focused on the meaning of quantum theory. Back in the 1930s, Albert Einstein, Niels Bohr, Werner Heisenberg, and Erwin Schrödinger (to mention just the usual suspects) were trying hard to understand its conceptual consequences.67 Einstein, the most sceptical of them all, claimed that it was pointing toward the fatal flaw in quantum theory, and referred to it as “spooky action-at-a-distance”. In contrast, Schrödinger was much more prepared to accept quantum theory exactly as it was formulated, along with all its predictions, no matter how weird they might be. In his 1935 paper, which introduced quantum entanglement, he wrote “I would not call it one but rather the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought”. Today we still talk a lot about quantum entanglement, but more often it is viewed as a physical resource which enables us to communicate with perfect security, build very precise atomic clocks, and even teleport small quantum objects! But what exactly is quantum entanglement?

5.2 One, two, many…

In classical physics, the transition from a single object to a composite system of many objects is trivial: in order to describe the state of, say, 42 objects at any given moment of time, it is sufficient to describe the state of each of the objects separately. Indeed, the classical state of 42 point-like particles is described by specifying the position and the momentum of each particle.

In the classical world, “the whole is the sum of its parts”, but the quantum world is very different.

Consider, for example, a pair of qubits. Suppose that each one is described by a state vector: the first one by |a\rangle, and the second one by |b\rangle. One might therefore think that the most general state of the two qubits should be represented by a pair of state vectors, |a\rangle|b\rangle, with one for each qubit. Indeed, such a state is certainly possible, but there are other states that cannot be expressed in this form. In order to write down the most general state of two qubits we first focus on the basis states.

For a single qubit we have been using the standard basis \{|0\rangle,|1\rangle\}. For two qubits we may choose the following as our standard basis states:68 \begin{array}{cc} |00\rangle \equiv |0\rangle|0\rangle & |01\rangle \equiv |0\rangle|1\rangle \\|10\rangle \equiv |1\rangle|0\rangle &|11\rangle \equiv |1\rangle|1\rangle. \end{array} Within each ket, the first symbol refers to the first qubit, and the second to the second, and we have tacitly assumed that we can distinguish the two qubits by their location, or some other means. Now, the most general state of the two qubits is a normalised linear combination of these four basis states, i.e. a vector of the form |\psi\rangle = c_{00}|00\rangle + c_{01}|01\rangle + c_{10}|10\rangle + c_{11}|11\rangle. Physical interpretation aside, let us count how many real parameters are needed to specify this state. Six, right? We have four complex numbers (eight real parameters) restricted by the normalisation condition, along with the fact that states differing only by a global phase factor are equivalent, which leaves us with six real parameters. Now, by the same line of argument, we need only two real parameters to specify the state of a single qubit, and hence need four real parameters to specify any state of two qubits of the form |a\rangle|b\rangle. So it cannot be the case that every state of two qubits can be expressed as a pair of states |a\rangle|b\rangle, simply for “dimension reasons”.

For example, compare the two states of two qubits, \begin{gathered} \frac{1}{\sqrt 2}|00\rangle + \frac{1}{\sqrt 2}|01\rangle \\\text{and} \\\frac{1}{\sqrt 2}|00\rangle + \frac{1}{\sqrt 2}|11\rangle. \end{gathered} The first one is separable, i.e. we can view it as a pair of state vectors where each one pertains to one of the two qubits: \frac{1}{\sqrt 2}|00\rangle + \frac{1}{\sqrt 2}|01\rangle = \underbrace{\vphantom{\frac{1}{\sqrt 2}}|0\rangle}_{\text{qubit 1}} \;\; \underbrace{\frac{1}{\sqrt 2}(|0\rangle+|1\rangle)}_{\text{qubit 2}}, The second state, however, does not admit such a decomposition: there do not exist any \psi_1,\psi_2 such that \frac{1}{\sqrt 2}|00\rangle + \frac{1}{\sqrt 2}|11\rangle = |\psi_1\rangle|\psi_2\rangle and so we say that it is an entangled state. Informally, any bipartite state that cannot be viewed as a pair of two states pertaining to the constituent subsystems is said to be entangled.

With this discussion in our minds, we can now give more formal account of the states of composite quantum systems.

5.3 Quantum theory, formally (continued)

Recalling Chapter 4, where we said that we were missing a key part of the formalism of quantum theory, we can now fill in this hole. Our mathematical formalism of choice behind the quantum theory of composite systems is based on the tensor product of Hilbert spaces.

5.3.1 Tensor products

Let the states of some system \mathcal{A} be described by vectors in an n-dimensional Hilbert space \mathcal{H}_{\mathcal{A}}, and the states of some system \mathcal{B} by vectors in an m-dimensional Hilbert space \mathcal{H}_{\mathcal{B}}. The combined system of \mathcal{A} and \mathcal{B} is then described by vectors in the (nm)-dimensional tensor product space \mathcal{H}_{\mathcal{A}}\otimes\mathcal{H}_{\mathcal{B}}. Given bases \{|a_1\rangle,\ldots,|a_n\rangle\} of \mathcal{H}_{\mathcal{A}} and \{|b_1\rangle,\ldots,|b_m\rangle\} of \mathcal{H}_{\mathcal{B}}, we form a basis of the tensor product by taking the ordered pairs |a_i\rangle\otimes|b_j\rangle, for i=1,\ldots,n and j=1,\ldots,m. The tensor product space \mathcal{H}_{\mathcal{A}}\otimes\mathcal{H}_{\mathcal{B}} then consists of all linear combination of such tensor product basis vectors:69 |\psi\rangle = \sum_{ij} c_{ij}|a_i\rangle\otimes|b_j\rangle. \tag{5.3.1}

The tensor product operation \otimes is distributive: \begin{gathered} |a\rangle \otimes \left( \beta_1|b_1\rangle + \beta_2|b_2\rangle \right) = \beta_1|a\rangle\otimes|b_1\rangle + \beta_2|a\rangle\otimes|b_2\rangle \\\left( \alpha_1|a_1\rangle + \alpha_2|a_2\rangle \right) \otimes |b\rangle = \alpha_1|a_1\rangle\otimes|b\rangle + \alpha_2|a_2\rangle\otimes|b\rangle. \end{gathered} The tensor product of Hilbert spaces is again a Hilbert space: the inner products on \mathcal{H}_{\mathcal{A}} and \mathcal{H}_{\mathcal{B}} give a natural inner product on \mathcal{H}_{\mathcal{A}}\otimes\mathcal{H}_{\mathcal{B}}, defined for any two product vectors by \left( \langle a'|\otimes\langle b'| \right) \left( |a\rangle\otimes|b\rangle \right) = \langle a'|a\rangle\langle b'|b\rangle and extended by linearity to sums of tensor products of vectors, and, by associativity70, to any number of subsystems. Note that the bra corresponding to the tensor product state |a\rangle\otimes|b\rangle is written as (|a\rangle\otimes|b\rangle)^\dagger = \langle a|\otimes\langle b|, where the order of the factors on either side of \otimes does not change when the dagger operation is applied.

Some joint states of \mathcal{A} and \mathcal{B} can be expressed as a single tensor product, say |\psi\rangle=|a\rangle\otimes|b\rangle (often written as |a\rangle|b\rangle, or |a, b\rangle, or even |ab\rangle), meaning that the subsystem \mathcal{A} is in state |a\rangle, and the subsystem \mathcal{B} in state |b\rangle. If we expand |a\rangle=\sum_i\alpha_i|a_i\rangle and |b\rangle=\sum_i\beta_j|b_j\rangle, then |\psi\rangle=\sum_{ij}\alpha_i\beta_j|a_i\rangle\otimes|b_j\rangle and we see that, for all such states, the coefficients c_{ij} in Equation (5.3.1) are of a rather special form: c_{ij} = \alpha_i\beta_j. We call such states separable (or just product states). States that are not separable are said to be entangled.

A useful fact about tensor products is that \lambda a\otimes b = a\otimes\lambda b (where a and b are vectors, and \lambda is a scalar). This means that we don’t need to worry about brackets, and can write something like \lambda(a\otimes b).

We will also need the concept of the tensor product of two operators. If A is an operator on \mathcal{H}_{\mathcal{A}} and B an operator on \mathcal{H}_{\mathcal{B}}, then the tensor product operator A\otimes B is an operator on \mathcal{H}_{\mathcal{A}}\otimes\mathcal{H}_{\mathcal{B}} defined by its action on product vectors via (A\otimes B)(|a\rangle\otimes|b\rangle) = (A|a\rangle)\otimes (B|b\rangle) and with its action on all other vectors determined by linearity: A\otimes B \left( \sum_{ij} c_{ij}|a_i\rangle\otimes|b_j\rangle \right) = \sum_{ij}c_{ij} A|a_i\rangle\otimes B|b_j\rangle.

5.4 Back to qubits

Let’s see how this formalism works for qubits. The n-fold tensor product of vectors from the standard basis \{|0\rangle,|1\rangle\} represent binary strings of length n. For example, for n=3,71 \begin{aligned} |0\rangle\otimes|1\rangle\otimes|1\rangle & \equiv |011\rangle \\|1\rangle\otimes|1\rangle\otimes|1\rangle & \equiv |111\rangle. \end{aligned} A classical register composed of three bits can store only one of these two binary strings at any time; a quantum register composed of three qubits can store both of them in a superposition.

Indeed, if we start with the state |011\rangle and apply the Hadamard gate to the first qubit (which is the same as applying H\otimes\mathbf{1}\otimes\mathbf{1}), then, given that linear combinations distribute over tensor products, we obtain72 \begin{aligned} |011\rangle \longmapsto &\frac{1}{\sqrt2} \big(|0\rangle + |1\rangle\big) \otimes|1\rangle\otimes|1\rangle \\\equiv &\frac{1}{\sqrt2} \big(|011\rangle + |111\rangle\big). \end{aligned} In fact, we can even prepare this register in a superposition of all eight possible binary strings: if we apply the tensor product operation H\otimes H\otimes H to the state |0\rangle\otimes|0\rangle\otimes|0\rangle = |000\rangle then we get

The resulting state is exactly a superposition of all binary string of length 3, and can also be written as \frac{1}{\sqrt2} \big(|0\rangle + |1\rangle\big) \otimes \frac{1}{\sqrt2} \big(|0\rangle + |1\rangle\big) \otimes \frac{1}{\sqrt2} \big(|0\rangle + |1\rangle\big).

In general, the tensor product operation H^{\otimes n}, which means “apply the Hadamard gate to each of your n qubits”, is known as the Hadamard transform, and it maps product states to product states. Like the Hadamard gate in the typical quantum interference circuit, the Hadamard transform opens and closes a multi-qubit interference.

5.5 Separable or entangled?

Most vectors in \mathcal{H}_a\otimes \mathcal{H}_b are entangled and cannot be written as product states |a\rangle\otimes|b\rangle with |a\rangle\in\mathcal{H}_a and |b\rangle\in\mathcal{H}_b.

In order to see this, let us write any joint state |\psi\rangle of \mathcal{A} and \mathcal{B} in a product basis as \begin{aligned} |\psi\rangle &= \sum_{ij} c_{ij}|a_i\rangle\otimes|b_j\rangle \\&= \sum_i|a_i\rangle\otimes\left(\sum_j c_{ij}|b_j\rangle\right) \\&= \sum_i|a_i\rangle\otimes|\phi_i\rangle \end{aligned} \tag{5.5.1} where the |\phi_i\rangle=\sum_j c_{ij}|b_j\rangle are vectors in \mathcal{H}_{\mathcal{B}} that need not be normalised. For any product state, these vectors have a special form. Indeed, if |\psi\rangle= |a\rangle\otimes|b\rangle then, after expanding the first state in the |a_i\rangle basis, we obtain |\psi\rangle = \sum_{i}|a_i\rangle\otimes\left(\sum_i\alpha_i|b\rangle\right). This expression has the same form as Equation (5.5.1) with |\phi_i\rangle=\alpha_i|b\rangle, i.e. each of the |\phi_i\rangle vectors in this expansion is a multiple of the same vector |b\rangle. Conversely, if |\phi_i\rangle = \alpha_i|b\rangle for all i in Equation (5.5.1), then |\psi\rangle must be a product state.73 So if we want to identify which joint states are product states and which are not, we simply write the joint state according to Equation (5.5.1) and check if all the vectors |\phi_i\rangle are multiples of a single vector. Needless to say, if we choose the states |\phi\rangle randomly, it is very unlikely that this condition is satisfied, and we almost certainly pick an entangled state.

Quantum entanglement is one of the most fascinating aspects of quantum theory. We will now explore some of its implications.

5.6 Controlled-NOT

How do entangled states arise in real physical situations? The short answer is that entanglement is the result of interactions. It is easy to see that tensor product operations U_1\otimes\ldots\otimes U_n map product states to product states:

and so any collection of separable qubits remains separable.

As soon as qubits start interacting with one another, however, they become entangled, and things start to get really interesting. We will describe interactions that cannot be written as tensor products of unitary operations on individual qubits. The most popular two-qubit entangling gate is the controlled-\texttt{NOT} (or \texttt{c-NOT}), also known as the controlled-X gate.74 The gate acts on two qubits: it flips the second qubit (referred to as the target) if the first qubit (referred to as the control) is |1\rangle, and does nothing if the control qubit is |0\rangle. In the standard basis \{|00\rangle,|01\rangle,|10\rangle,|11\rangle\}, it is represented by the following unitary matrix:


We can also represent the \texttt{c-NOT} gate using the circuit notation, as in Figure 5.1.

Where x,y\in\{0,1\}, and \oplus denotes \texttt{XOR}, or addition modulo 2.

Figure 5.1: Where x,y\in\{0,1\}, and \oplus denotes \texttt{XOR}, or addition modulo 2.

Note that this gate does not admit any tensor product decomposition, but can be written as a sum of tensor products:75 \texttt{c-NOT} = |0\rangle\langle 0|\otimes\mathbf{1}+ |1\rangle\langle 1|\otimes X where X is the Pauli bit-flip operation.

The \texttt{c-NOT} gate lets us do many interesting things, and can act in a rather deceptive way. Let us now study some of these things.

5.6.1 The Bell states, and the Bell measurement

We start with the generation of entanglement. Here is a simple circuit that demonstrates the entangling power of \texttt{c-NOT}:76

(Generating entanglement).

In this circuit, the separable input |0\rangle|0\rangle evolves as \begin{aligned} |0\rangle|0\rangle \overset{H}{\longmapsto}& \frac{1}{\sqrt2} (|0\rangle + |1\rangle) |0\rangle \\=& \frac{1}{\sqrt2}|0\rangle|0\rangle + \frac{1}{\sqrt2}|1\rangle|0\rangle \\\overset{\texttt{c-NOT}}{\longmapsto}& \frac{1}{\sqrt2}|0\rangle|0\rangle + \frac{1}{\sqrt2}|1\rangle|1\rangle \end{aligned} resulting in the entangled output \frac{1}{\sqrt 2}(|00\rangle+|11\rangle). In fact, this circuit implements the unitary operation which maps the standard computational basis into the four entangled states, known as the Bell states.

The Bell states, |\psi_{ij}\rangle, as generated by the above circuit: \begin{aligned} |00\rangle &\mapsto |\psi_{00}\rangle = \frac{1}{\sqrt 2}(|00\rangle+|11\rangle) \\|01\rangle &\mapsto |\psi_{01}\rangle = \frac{1}{\sqrt 2}(|01\rangle+|10\rangle) \\|10\rangle &\mapsto |\psi_{10}\rangle = \frac{1}{\sqrt 2}(|00\rangle-|11\rangle) \\|11\rangle &\mapsto |\psi_{11}\rangle = \frac{1}{\sqrt 2}(|01\rangle-|10\rangle) \end{aligned} The more standard notation for these states, however, is the following: \begin{aligned} \Phi^+ &= |\psi_{00}\rangle \\\Psi^+ &= |\psi_{01}\rangle \\\Phi^- &= |\psi_{10}\rangle \\\Psi^-&= |\psi_{11}\rangle \end{aligned} (and this is the notation that we will use from now on).

The Bell states form an orthonormal basis in the Hilbert space \mathcal{H}_1\otimes\mathcal{H}_2 of two qubits. We can perform measurements in the Bell basis: the easiest way to do it in practice is to “rotate” the Bell basis to the standard basis, and then perform the measurement in the standard basis.77 Indeed, if we reverse the circuit, then we get a circuit which maps the Bell state |\psi_{xy}\rangle to the corresponding state |xy\rangle in the standard basis. This unitary mapping allows us to “implement” the projections on Bell states by applying the reversed circuit followed by the usual qubit-by-qubit measurement in the standard basis.

5.6.2 Quantum teleportation

A wonderful fact, that sounds more like science fiction than actual science, is the following: an unknown quantum state can be teleported from one location to another. Consider the following circuit:78

(Quantum teleportation).

The first input qubit (counting from the top) is in some arbitrary state. After the action of the part of the circuit in the first dashed box (counting from the left), the state of the three qubits reads79 \big( \alpha|0\rangle+\beta|1\rangle \big) \big( |00\rangle+|11\rangle \big). By regrouping the terms, but keeping the qubits in the same order, this state can be written as the sum \begin{aligned} &(|00\rangle + |11\rangle) \otimes (\alpha|0\rangle + \beta|1\rangle) \\+ &(|01\rangle + |10\rangle) \otimes (\alpha|1\rangle + \beta|0\rangle) \\+ &(|00\rangle - |11\rangle) \otimes (\alpha|0\rangle - \beta|1\rangle) \\+ &(|01\rangle - |10\rangle) \otimes (\alpha|1\rangle - \beta|0\rangle). \end{aligned} Then the part of the circuit in the second dashed box maps the four Bell states of the first two qubits to the corresponding states from the computational basis: \begin{aligned} &|00\rangle \otimes (\alpha|0\rangle + \beta|1\rangle) \\+ &|01\rangle \otimes (\alpha|1\rangle + \beta|0\rangle) \\+ &|10\rangle \otimes (\alpha|0\rangle - \beta|1\rangle) \\+ &|11\rangle \otimes (\alpha|1\rangle - \beta|0\rangle). \end{aligned} Upon performing the standard measurement and learning the values of x and y, we choose one of the four following transformations depending on these values: \begin{array}{ll} 00 \mapsto \mathbf{1} &\quad 01 \mapsto X \\[1em] 10 \mapsto Z &\quad 11 \mapsto ZX \end{array} \tag{5.6.2} (e.g. if x=0 and y=1, then we choose X). We then apply this transformation to the third qubit, which restores the original state of the first qubit.

If you understand how this circuit works then you are ready for quantum teleportation. Here is a dramatic version.

Suppose three qubits, which look very similar, are initially in the possession of an absent-minded Oxford student, Alice. The first qubit is in a precious quantum state and this state is needed urgently for an experiment in Cambridge. The other two qubits are entangled, in the |\psi_{00}\rangle state. Alice’s colleague, Bob, pops in to collect the qubit. Once he is gone, Alice realises that, by mistake, she gave him not the first but the third qubit: the one which is entangled with the second qubit.

The situation seems to be hopeless — Alice does not know the quantum state of the first qubit, and Bob is now miles away and her communication with him is limited to few bits. However, Alice and Bob are both very clever and they both diligently attended their “Introduction to Quantum Information Science” classes. Can Alice rectify her mistake and save Cambridge science?

Hmmm… (pause for thought)…

Of course: Alice can teleport the state of the first qubit! She performs the Bell measurement on the first two qubits, which gives her two binary digits, x and y. She then broadcasts x and y to Bob, who chooses the corresponding transformation, as in Equation (5.6.2), performs it, and recovers the original state.

This raises a natural “philosophical” question: what do we really mean by teleportation? A key part of this question is understanding what happens to our original qubit when we teleport it. As it turns out, it must necessarily be destroyed, as we now explain.

5.6.3 Thou shalt not clone

Let us now look at something that controlled-\texttt{NOT} seems to be doing but, in fact, isn’t. It is easy to see that the \texttt{c-NOT} can copy the bit value of the first qubit: |x\rangle|0\rangle \overset{\texttt{c-NOT}}{\longmapsto} |x\rangle|x\rangle \qquad\text{(for $x=0,1$)} so one might suppose that this gate could also be used to copy superpositions, such as |\psi\rangle = \alpha|0\rangle+\beta|1\rangle, so that |\psi\rangle|0\rangle \overset{\texttt{c-NOT}}{\longmapsto} |\psi\rangle|\psi\rangle for any |\psi\rangle.

This is not so!

The unitarity of the \texttt{c-NOT} means that it turns superpositions in the control qubit into entanglement of the control and the target: if the control qubit is in the a superposition state |\psi\rangle = \alpha|0\rangle+\beta|1\rangle (with \alpha,\beta\neq0), and the target is in |0\rangle, then the \texttt{c-NOT} gate generates the entangled state \big( \alpha|0\rangle+\beta|1\rangle \big) |0\rangle \overset{\texttt{c-NOT}}{\longmapsto} \alpha|00\rangle + \beta|11\rangle. In fact, it is impossible to clone an unknown quantum state, and we can prove this!

To prove this via contradiction, let’s assume that we could build a universal quantum cloner, and then take any two normalised states |\psi\rangle and |\phi\rangle that are non-identical (i.e. |\langle\psi|\phi\rangle|\neq1) and non-orthogonal (i.e. \langle\psi|\phi\rangle\neq0). If we then run our hypothetical cloning machine we get \begin{aligned} |\psi\rangle|0\rangle|W\rangle &\mapsto |\psi\rangle|\psi\rangle|W'\rangle \\|\phi\rangle|0\rangle|W\rangle &\mapsto |\phi\rangle|\phi\rangle|W''\rangle \end{aligned} where the third system, initially in state |W\rangle, represents everything else (say, the internal state of the cloning machine). For this transformation to be unitary, it must preserve the inner product, and so we require that \langle\psi|\phi\rangle = \langle\psi|\phi\rangle^2 \langle W'|W''\rangle which can only be satisfied if |\langle\psi|\phi\rangle| is equal to 1 or 0, but this contradicts our assumptions!

Thus, states of qubits, unlike states of classical bits, cannot be faithfully cloned. Note that, in quantum teleportation, the original state must therefore be destroyed, since otherwise we would be producing a clone of an unknown quantum state. The no-cloning property of quantum states leads to interesting applications, of which quantum cryptography is one.

Universal quantum cloners are impossible.

5.7 Other controlled gates

5.7.1 Controlled-phase

Needless to say, not everything is about the controlled-\texttt{NOT} gate. Another common two-qubit gate is the controlled-phase gate \texttt{c-}P_\varphi.


We can also represent the \texttt{c-}P_\varphi gate using the circuit notation, as in Figure 5.2.

Where x,y\in\{0,1\}.

Figure 5.2: Where x,y\in\{0,1\}.

Again, the matrix is written in the computational basis \{|00\rangle,|01\rangle,|10\rangle,|11\rangle\}. If we do not specify the phase then we usually assume that \varphi=\pi, in which case we call this operation the controlled-Z gate, which acts as |0\rangle\langle 0|0\otimes\mathbf{1}+ |1\rangle\langle 1|1\otimes Z. Here Z refers again to the Pauli phase-flip \sigma_z\equiv Z operation.

In order to see the entangling power of the controlled-phase shift gate, consider the following circuit.

(Generating entanglement, again).

In this circuit, first the two Hadamard gates prepare the equally-weighted superposition of all states from the computational basis

and then the controlled-Z operation flips the sign in front of |11\rangle

which results in the entangled state.

5.7.2 Controlled-U

Both the above two-qubit controlled gates (i.e. \texttt{c-NOT} and \texttt{c-}P_\varphi) are specific examples of the more general construction of a controlled-U gate: \texttt{c-}U = |0\rangle\langle 0|0\otimes\mathbf{1}+ |1\rangle\langle 1|1\otimes U where U is an arbitrary single-qubit unitary transformation U.


We can also represent the \texttt{c-}U gate using the circuit notation, as in Figure 5.3.

Where x,y\in\{0,1\}.

Figure 5.3: Where x,y\in\{0,1\}.

We can go even further and consider a more general unitary operation: the two-qubit x-controlled-U gate: \sum_x |x\rangle\langle x|\otimes U_x \equiv |0\rangle\langle 0|\otimes U_0 + |1\rangle\langle 1|\otimes U_1 where each U_x is a unitary transformation that is applied to the second qubit only if the first one is in state |x\rangle. In general, an x-controlled-U gate can be defined on two registers of arbitrary size n and m, with x\in\{0,1\}^n and the U_x being (2^m\times 2^m) unitary matrices acting on the second register.

5.7.3 Phase kick-back

Before moving on to the next section, we first describe a simple “trick” — an unusual way of introducing phase shifts that will be essential for our analysis of quantum algorithms. Consider the following circuit.

(Controlled-U interference).

where |u\rangle is an eigenstate of U (that is, U|u\rangle = e^{i\varphi}|u\rangle for some \varphi).

This should look familiar: it is the usual interference circuit, but with the phase gate replaced by a controlled-U gate, which will mimic the phase gate, as we shall soon see. Note that the second qubit is prepared in state |u\rangle, which is required to be an eigenstate of U. The circuit effects the following sequence of transformations (omitting the normalisation factors): \begin{aligned} |0\rangle|u\rangle &\overset{H}{\longmapsto} (|0\rangle+|1\rangle)|u\rangle \\&\quad = |0\rangle|u\rangle + |1\rangle|u\rangle \\&\overset{\texttt{c-}U}{\longmapsto} |0\rangle|u\rangle + |1\rangle U|u\rangle \\&\quad = |0\rangle|u\rangle + e^{i\varphi}|1\rangle|u\rangle \\&\quad = (|0\rangle + e^{i\varphi}|1\rangle) |u\rangle \\&\overset{H}{\longmapsto} \left( \cos\frac{\varphi}{2}|0\rangle - i\sin\frac{\varphi}{2}|1\rangle \right) |u\rangle. \end{aligned} Note that the second qubit does not get entangled with the first one: it remains in its original state |u\rangle. However, the interaction between the two qubits introduces a phase shift on the first qubit. This may look like an unnecessarily complicated way of introducing phase shifts, but, as we shall soon see, this is how quantum computers do it. Let me give you a preview of things to come.

Consider the following x-controlled-U operation: \begin{gathered} \left[ \begin{array}{cccc} \mathbf{1}& 0 & 0 & 0 \\0 & \mathbf{1}& 0 & 0 \\0 & 0 & \mathbf{1}& 0 \\0 & 0 & 0 & X \end{array} \right] \\[1em] =|00\rangle\langle 00|\otimes\mathbf{1} \\+ |01\rangle\langle 01|\otimes\mathbf{1} \\+ |10\rangle\langle 10|\otimes\mathbf{1} \\+ |11\rangle\langle 11|\otimes X. \end{gathered} The first register is of size 2, and the second register is of size 1. If the first register is prepared in state |11\rangle, then the qubit in the second register is flipped (by the Pauli bit-flip X); otherwise, nothing happens. This unitary operation is a quantum version of the Boolean function evaluation: it corresponds to the Boolean function \begin{aligned} f\colon\{0,1\}^2 &\to \{0,1\} \\00&\mapsto0 \\01&\mapsto0 \\10&\mapsto0 \\11&\mapsto1. \end{aligned} If f(x)=1, then we flip the bit value in the second register (with operation X); if f(x)=0, then we do nothing.

Now, prepare the qubit in the second register in state |0\rangle-|1\rangle, which is an eigenstate of X with eigenvalue e^{\pi i}=-1. So whenever X is applied to the second register, the phase factor -1 appears in front of the corresponding term in the first register. If we prepare the first register in the superposition |00\rangle+|01\rangle+|10\rangle+|11\rangle then the result of applying the above x-controlled-U operation is the entangled state |00\rangle+|01\rangle+|10\rangle-|11\rangle. That is, the phase kick-back mechanism introduced a relative phase in the equally-weighted superposition of all binary strings of length two.

Phase kick-back is how we control quantum interference in quantum computation.

We will return to this topic later on, when we discuss quantum evaluation of Boolean functions and quantum algorithms.

5.7.4 Universality revisited

We will come across few more gates in this course, but at this stage you already know all the elementary unitary operations that are needed to construct any unitary operation on any number of qubits:

  • the Hadamard gate,
  • all phase gates, and
  • the \texttt{c-NOT}

These gates form a universal set of gates: with O(4^{n}n) of these gates, we can construct any n-qubit unitary operation.80 We should mention that there are many universal sets of gates. In fact, almost any gate that can entangle two qubits can be used as a universal gate.

We are particularly interested in any finite universal set of gates (such as the one containing the Hadamard, P_{\frac{\pi}{4}} (the T gate), and the \texttt{c-NOT}), which can approximate any unitary operation on n qubits with arbitrary precision. The price to pay is the number of gates — better precision requires more gates. We shall elaborate on this later.

5.7.5 Density operators and the like

The existence of entangled states leads to an obvious question: if we cannot attribute a state vector to an individual qubit, then how can we describe its quantum state? In the next few chapters we will see that, when we limit our attentions to a part of a larger system, states are not represented by vectors, measurements are not described by orthogonal projections, and evolution is not unitary. As a spoiler, here is a dictionary of some of the new concepts that will soon be introduced:

state vectors \longmapsto density operators
unitary evolutions \longmapsto completely-positive trace-preserving maps
orthogonal projectors \longmapsto positive operator-valued measures

5.8 Why qubits, subsystems, and entanglement?

One question that is rather natural to ask at this point is the following:

If entanglement is so fragile and difficult to control, then why bother? Why not perform the whole computation in one physical system that has as many quantum states as we normally have labels for the states of qubits, so that we can label these quantum states in the same way as we normally label the qubits, and give them computational meaning?

This suggestion, although possible, gives a very inefficient way of representing data (it is describing the unary encoding). For serious computations we need subsystems. Here is why.

Suppose you have n physical objects, and each object has k distinguishable states. If you can access each object separately and put it into any of the k states, then, with only n operations, you can prepare any of the k^{n} different configurations of the combined systems. Without any loss of generality, let us take k=2 and refer to each object of this type as a physical bit. We label the two states of a physical bit as 0 and 1. Any collection of n physical bits can be prepared in 2^{n} different configurations, which can be used to store up to 2^{n} messages/binary strings/different numbers. In order to represent numbers from 0 to N-1 we just have to choose n such that N\leqslant 2^n.

Suppose the two states in the physical bit are separated by the energy difference \Delta E. Then a preparation of any particular configuration will cost no more than E=n \Delta E=(\log_2 N)\Delta E units of energy.

In contrast, if we choose to encode N configurations into one chunk of matter, say, into the first N energy states of a single harmonic oscillator with the interstate energy separation \Delta E then, in the worst case, one has to use E=N\Delta E units of energy, e.g. to go from the ground state (labelled as 0) to the most excited state (labelled as N). For large N this gives an exponential gap in the energy expenditure between the binary encoding using physical bits and unary encoding, using energy levels of harmonic oscillators.

One can, of course, try to switch from harmonic oscillators to quantum systems which have a finite spread in the energy spectrum. For example, by operating on the energy states of the hydrogen atom one can encode any number from 0 to N-1, and one is guaranteed not to spend more than E_{\mathrm{max}}= 13.6\,\mathrm{eV} (otherwise the atom is ionised). The snag is that, in this case, some of the electronic states will be separated by the energy difference to the order of E_{\mathrm{max}}/N, and to drive the system selectively from one state to another one has to tune into the frequency E_{\mathrm{max}}/\hbar N, which requires a sufficiently long wave-packet (so that the frequency is well defined), and consequently the interaction time is of order N(\hbar/E_{\mathrm{max}}).

That is, we have to trade energy for time.

It turns out that whichever way we try to represent the number N using the unary encoding (i.e. using N different states of a single chunk of matter), we end up depleting our physical resources (such as energy, time, space) at a much greater rate than in the case when we use subsystems. This plausibility argument indicates that, for efficient processing of information, the system must be divided into subsystems — for example, into physical bits.

5.9 Remarks and exercises

5.9.1 Entangled or not?

Let a joint state of \mathcal{A} and \mathcal{B} be written in a product basis as |\psi\rangle = \sum_{ij} c_{ij}|a_i\rangle\otimes|b_j\rangle. Assume that \mathcal{H}_a and \mathcal{H}_b are of the same dimension.

  1. Show that, if |\psi\rangle is a product state, then \det (c_{ij}) = 0.

  2. Show that the converse (\det(c_{ij})=0 implies that |\psi\rangle is a product state) holds only for qubits. Explain why.

  3. Deduce that the state \frac12\big(|00\rangle + |01\rangle + |10\rangle + (-1)^k|11\rangle\big) is entangled for k=1 and unentangled for k=0. Express the latter case explicitly as a product state.

There is a lot of interesting physics behind the previous innocuous-looking mathematical statement. For example, think again about the state (|00\rangle+|11\rangle)/\sqrt2. What happens if you measure just the first qubit? It is equally likely that you get |0\rangle or |1\rangle, right? But after your measurement the two qubits are either in state |00\rangle or in |11\rangle, i.e. they show the same bit value. Now, why might that be disturbing? Well, imagine the second qubit to be light-years away from the first one. It seems that the measurement of the first qubit affects the second qubit right away, which seems to imply faster-than-light communication! This is what Einstein called “spooky action as a distance”.81 But can you actually use this effect to send a message faster than light? What would happen if you tried?

Hopefully you can see that it would not work, since the result of the measurement is random — you cannot choose the bit value you want to send. We shall return to this, and other related phenomena, later on.


Show that, for any states |\psi_1\rangle and |\psi_2\rangle, the circuit below implements the swap operation |\psi_1\rangle|\psi_2\rangle \mapsto |\psi_2\rangle|\psi_1\rangle.



Show that the circuit below implements the controlled-\texttt{NOT} gate.

(Controlled-\texttt{NOT}, again).


The controlled-\texttt{NOT} gate can act as a measurement gate: if you prepare the target in state |0\rangle then the gate acts as |x\rangle|0\rangle\mapsto|x\rangle|x\rangle, and so the target learns the bit value of the control qubit. If you wish, you can think about a subsequent measurement of the target qubit in the computational basis as an observer learning about the bit value of the control qubit.

Take a look at the circuit below, where M stands for measurement in the standard basis.


Now assume that the top two qubits are in the state |\psi\rangle = \frac{1}{\sqrt3}\big( |01\rangle - |10\rangle + i|11\rangle \big). The measurement M gives two outcomes: 0 and 1. What are the probabilities of each outcome, and what is the post-measurement state in each case? Further, what is actually being measured here?

5.9.5 Arbitrary controlled-U on two qubits

Any unitary operation U on a single qubit can be expressed as U = B^\dagger XBA^\dagger XA where X\equiv\sigma_x is the Pauli bit-flip operator, and A and B are unitaries. Suppose that you can implement any single qubit gate, and that you have a bunch of controlled-\texttt{NOT} gates at your disposal. How would you implement any controlled-U operation on two qubits?

5.9.6 Entangled qubits

Two entangled qubits in the state \frac{1}{\sqrt2}(|00\rangle+|11\rangle) are generated by some source S. One qubit is sent to Alice, and one to Bob, who then both perform measurements in the computational basis.

  1. What is the probability that Alice and Bob will register identical results? Can any correlations they observe be used for instantaneous communication?

  2. Prior to the measurements in the computational basis, Alice and Bob apply unitary operations R_\alpha and R_\beta (respectively) to their respective qubits:

    The gate R_\theta is defined by its action on the basis states: \begin{aligned} |0\rangle &\longmapsto \cos\theta|0\rangle + \sin\theta|1\rangle \\|1\rangle &\longmapsto -\sin\theta|0\rangle + \cos\theta|1\rangle. \end{aligned} Show that the state of the two qubits prior to the measurements is \begin{aligned} &\frac{1}{\sqrt2}\cos(\alpha-\beta)\big( |00\rangle + |11\rangle \big) \\- &\frac{1}{\sqrt2}\sin(\alpha-\beta)\big( |01\rangle - |10\rangle \big). \end{aligned}

  3. What is the probability that Alice and Bob’s outcomes are identical?

5.9.7 Quantum dense coding


5.9.8 Playing with conditional unitaries

The swap gate S on two qubits is defined first on product vectors by S\colon|a\rangle|b\rangle\mapsto|b\rangle|a\rangle and then extended to sums of product vectors by linearity.

  1. Show that the four Bell states \frac{1}{\sqrt2}(|00\rangle\pm|11\rangle) and \frac{1}{\sqrt2}(|01\rangle\pm|10\rangle) are eigenvectors of S that form an orthonormal basis in the Hilbert space associated to two qubits. Which Bell states span the symmetric subspace (i.e. the space spanned by all eigenvectors with eigenvalue 1), and which the antisymmetric one (i.e. that spanned by eigenvectors with eigenvalue -1)? Can S have any eigenvalues apart from \pm1?

  2. Show that P_\pm = \frac12(\mathbf{1}\pm S) are two orthogonal projectors which form the decomposition of the identity and project on the symmetric and antisymmetric subspaces. Decompose the state vector |a\rangle|b\rangle of two qubits into symmetric and antisymmetric components.

  3. Consider the quantum circuit below, composed of two Hadamard gates, one controlled-S operation (also known as the controlled-swap, or Fredkin gate), and the measurement M in the computational basis. Suppose that the state vectors |a\rangle and |b\rangle are normalised but not orthogonal to one another. Step through the execution of this network, writing down the quantum states of the three qubits after each computational step. What are the probabilities of observing 0 or 1 when the measurement M is performed?

(Symmetric and antisymmetric projection).

  1. Explain why this quantum network implements projections on the symmetric and antisymmetric subspaces of the two qubits.

  2. Two qubits are transmitted through a quantum channel which applies the same, randomly chosen, unitary operation U to each of them. Show that the symmetric and antisymmetric subspaces are invariant under U\otimes U.

  3. Polarised photons are transmitted through an optical fibre. Due to the variation of the refractive index along the fibre, the polarisation of each photon is rotated by the same unknown angle. This makes communication based on polarisation encoding unreliable. However, if you can prepare any polarisation state of the two photons then you can still use the channel to communicate without any errors. How can this be achieved?

5.10 Appendix: Tensor products in components

In our discussion of tensor products we have so far taken a rather abstract approach. There are, however, situations in which we have to put numbers in, and write tensor products of vectors and matrices explicitly. For example, here is the standard basis of two qubits written explicitly as column vectors: \begin{aligned} |00\rangle &\equiv |0\rangle\otimes|0\rangle = \begin{bmatrix}1\\0\end{bmatrix} \otimes \begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}1\\0\\0\\0\end{bmatrix} \\[1em] |01\rangle &\equiv |0\rangle\otimes|1\rangle = \begin{bmatrix}1\\0\end{bmatrix} \otimes \begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}0\\1\\0\\0\end{bmatrix} \\[1em] |10\rangle &\equiv |1\rangle\otimes|0\rangle = \begin{bmatrix}0\\1\end{bmatrix} \otimes \begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}0\\0\\1\\0\end{bmatrix} \\[1em] |11\rangle &\equiv |1\rangle\otimes|1\rangle = \begin{bmatrix}0\\1\end{bmatrix} \otimes \begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}0\\0\\0\\1\end{bmatrix} \end{aligned} Given |a\rangle = \alpha_0|0\rangle + \alpha_1|1\rangle and |b\rangle = \beta_0|0\rangle +\beta_1|1\rangle, we write |a\rangle\otimes|b\rangle as \begin{aligned} |a\rangle\otimes|b\rangle &= \begin{bmatrix}\alpha_0\\\alpha_1\end{bmatrix} \otimes \begin{bmatrix}\beta_0\\\beta_1\end{bmatrix} \\&= \begin{bmatrix}\alpha_0\begin{bmatrix}\beta_0\\\beta_1\end{bmatrix}\\\alpha_1\begin{bmatrix}\beta_0\\\beta_1\end{bmatrix}\end{bmatrix} \\&= \begin{bmatrix}\alpha_0\beta_0\\\alpha_0\beta_1\\\alpha_1\beta_0\\\alpha_1\beta_1\end{bmatrix}. \end{aligned} Note that each element of the first vector multiplies the entire second vector. This is often the easiest way to get the tensor products in practice.

The matrix elements of the tensor product operation A\otimes B

are given by (A\otimes B)_{ik,jl} = A_{ij}B_{kl} where ik\in\{00,01,10,11\} labels the rows, and kl\in\{00,01,10,11\} labels columns, when forming the block matrix:82 \begin{aligned} A\otimes B &= \begin{bmatrix}A_{00}&A_{01}\\A_{10}&A_{11}\end{bmatrix} \otimes \begin{bmatrix}B_{00}&B_{01}\\B_{10}&B_{11}\end{bmatrix} \\&= \begin{bmatrix}A_{00}B&A_{01}B\\A_{10}B&A_{11}B\end{bmatrix} \\&= \left[ \, \begin{array}{c|c} \begin{matrix}A_{00}B_{00}&A_{00}B_{01}\\A_{00}B_{10}&A_{00}B_{11}\end{matrix} & \begin{matrix}A_{01}B_{00}&A_{01}B_{01}\\A_{01}B_{10}&A_{01}B_{11}\end{matrix} \\\hline \begin{matrix}A_{10}B_{00}&A_{10}B_{01}\\A_{10}B_{10}&A_{10}B_{11}\end{matrix} & \begin{matrix}A_{11}B_{00}&A_{11}B_{01}\\A_{11}B_{10}&A_{11}B_{11}\end{matrix} \end{array} \, \right] \end{aligned}

The tensor product induces a natural partition of matrices into blocks. Multiplication of block matrices works pretty much the same as regular matrix multiplication (assuming the dimensions of the sub-matrices are appropriate), except that the entries are now matrices rather than numbers, and so may not commute.

  1. Evaluate the following matrix product of (4\times 4) block matrices: \left[ \, \begin{array}{c|c} \mathbf{1}& X \\\hline Y & Z \end{array} \, \right] \left[ \, \begin{array}{c|c} \mathbf{1}& Y \\\hline X & Z \end{array} \, \right] (where X, Y, and Z are the Pauli matrices).

  2. Using the block matrix form of A\otimes B expressed in terms of A_{ij} and B_{ij} (as described above), explain how the following operations are performed on the block matrix:

    • transposition (A\otimes B)^T; partial transpositions A^T\otimes B and A\otimes B^T;
    • trace \operatorname{tr}(A\otimes B); partial traces (\operatorname{tr}A )\otimes B and A\otimes (\operatorname{tr}B).

    Consider the Hadamard transform H\otimes H\otimes H on three qubits, which is described by a (2^3\times2^3) matrix. We know that H = \frac{1}{\sqrt2}\begin{bmatrix}1&1\\1&-1\end{bmatrix} and so we can calculate that H\otimes H = \frac12 \left[ \, \begin{array}{c|c} \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}1&1\\1&-1\end{matrix} \\\hline \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}-1&-1\\-1&1\end{matrix} \end{array} \, \right] and thus that H\otimes H\otimes H = \frac12 \left[ \, \begin{array}{c|c|c|c} \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}1&1\\1&-1\end{matrix} \\\hline \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}-1&-1\\-1&1\end{matrix} & \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}-1&-1\\-1&1\end{matrix} \\\hline \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}-1&-1\\-1&1\end{matrix} & \begin{matrix}-1&-1\\-1&1\end{matrix} \\\hline \begin{matrix}1&1\\1&-1\end{matrix} & \begin{matrix}-1&-1\\-1&1\end{matrix} & \begin{matrix}-1&-1\\-1&1\end{matrix} & \begin{matrix}1&1\\1&-1\end{matrix} \end{array} \, \right]. The rows and columns of H\otimes H\otimes H are labelled by the triples 000,001,\ldots,111. Now, suppose we apply H\otimes H\otimes H to the state |110\rangle:

  1. The output state is a superposition of all binary strings: \sum_x c_x|x\rangle, with x\in\{0,1\}^3. Where in the H^{\otimes 3} matrix will you find the coefficients c_x?

    Now, do you want to write down H\otimes H\otimes H\otimes H? I don’t think so. This is an exponentially growing monster and you may soon run out of space if you actually do try to write it down. Instead, let’s spot the pattern of the entries \pm1 in these matrices.

  2. Consider the Hadamard gate matrix H_{ab}, where a,b=0,1 are the labels for the rows and the columns. Observe that H_{ab}=(-1)^{ab}/\sqrt{2}. (This may look like a fancy way of writing the entries of the Hadamard matrix but it will pay off in a moment). Using the fact that (A\otimes B)_{ik,jl} = A_{ij}B_{kl}, or any other method, analyse the pattern of the \pm1 in the tensor product of Hadamard matrices. What is the entry H^{\otimes 4}_{0101,1110}?

  3. For any two binary strings a=(a_1,\ldots, a_n) and b =(b_1,\ldots , b_n) of the same length we can define their “scalar” product as a\cdot b = (a_1b_1\oplus \ldots \oplus a_n b_n). Show that, up to the constant (1/\sqrt{2})^n, the entry H^{\otimes n}_{a,b}, for any n and for any binary strings a and b of length n, is (-1)^{a\cdot b}.

  4. Show that H^{\otimes n} acts as |a\rangle \longmapsto \left(\frac{1}{\sqrt 2}\right)^n \sum_{b\in\{0,1\}^n} (-1)^{a\cdot b}|b\rangle

  5. A quantum register of 10 qubits holds the binary string 0110101001. The Hadamard Transform is then applied to this register yielding a superposition of all binary strings of length 10. What is the sign in front of the |0101010101\rangle term?

5.11 Appendix: The Schmidt decomposition

An arbitrary vector in the Hilbert space \mathcal{H}_{\mathcal{A}}\otimes \mathcal{H}_{\mathcal{B}} can be expanded in a product basis as |\psi\rangle = \sum_{ij} c_{ij}|a_i\rangle|b_j\rangle. Moreover, for each particular joint state |\psi\rangle, we can find orthonormal bases, \{|\widetilde{a_i}\rangle\} in \mathcal{H}_{\mathcal{A}} and \{|\widetilde{b_j}\rangle\} in \mathcal{H}_{\mathcal{B}}, such that |\psi\rangle can be expressed as |\psi\rangle = \sum_{i} d_{i}|\tilde a_i\rangle|\tilde b_i\rangle, where the coefficients d_i are non-negative numbers. This is known as the Schmidt decomposition of |\psi\rangle. Any bipartite state can be expressed in this form, but remember that the bases used depend on the state being expanded. Indeed, given two bipartite states |\psi\rangle and |\phi\rangle, we usually cannot perform the Schmidt decomposition using the same orthonormal bases in \mathcal{H}_{\mathcal{A}} and \mathcal{H}_{\mathcal{B}}. The number of terms in the Schmidt decomposition is, at most, the minimum of \dim\mathcal{H}_{\mathcal{A}} and \dim\mathcal{H}_{\mathcal{B}}.

The Schmidt decomposition follows from the singular value decomposition or SVD): any (n\times m) matrix C can be written as C = UDV where U and V are (respectively) (n\times n) and (m\times m) unitary matrices and D is an (n\times m) diagonal matrix with real, non-negative elements in descending order d_1\geqslant d_2\geqslant\ldots\geqslant d_{\min{(n,m)}} (and with the rest of the matrix is filled with zeros). The elements d_k are called the singular values of C.

You can visualize the SVD by thinking of C as representing a linear transformation from m-dimensional to n-dimensional Euclidean space: it maps the unit ball in the m-dimensional space to an ellipsoid in the n-dimensional space; the singular values are the lengths of the semi-axes of that ellipsoid; the matrices U and V carry information about the locations of those axes and the vectors in the first space which map into them. Thus SVD tells us that the transformation C is composed of rotating the unit ball (transformation V), stretching the axes by factors d_k, and then rotating the resulting ellipsoid (transformation U).

Using the index notation C_{ij} = \sum_k U_{ik}d_k V_{kj}, we can thus apply SVD to c_{ij}, \begin{aligned} |\psi\rangle &= \sum_{i,j} c_{ij}|a_ib_j\rangle \\&= \sum_{i,j} \sum_k U_{ik}d_k V_{kj}|a_ib_j\rangle \\&= \sum_k d_k \left(\sum_i U_{ik}|a_i\rangle\right)\otimes\left(\sum_j V_{kj}|b_j\rangle\right). \end{aligned} The Schmidt decomposition of a separable state of the form |a\rangle\otimes|b\rangle is trivially just this state. The Bell states \Psi^+ and \Phi^+ are already written in their Schmidt form, whereas \Psi^- and \Phi^- can be easily expressed in the Schmidt form. For example, for |\Psi^-\rangle we have d_1 = d_2 = \frac{1}{\sqrt 2}, and the Schmidt basis is |\bar a_1\rangle =|0\rangle, |\bar a_2\rangle=|1\rangle, |\bar b_1\rangle = |1\rangle, |\bar b_2\rangle=-|0\rangle. The number of non-zero singular values of c_{ij} is called the rank of c_{ij}, or the rank of the corresponding quantum state, or sometimes, the Schmidt number. Clearly, all bipartite states of rank one are separable.

The Schmidt decomposition is almost unique. The ambiguity arises when we have two or more identical singular values, as, for example, in the case of the Bell states. Then any unitary transformation of the basis vectors corresponding to a degenerate singular value, both in \mathcal{H}_a and in \mathcal{H}_b, generates another set of basis vectors.

  1. E. Schrödinger, Discussion of probability relations between separated system, Mathematical Proceedings of the Cambridge Philosophical Society 31 (1935), pp. 555–563.↩︎

  2. It looks like we are defining some sort of “multiplication rule” for kets here, saying that |a\rangle|b\rangle:=|ab\rangle. This is indeed the case, but to talk about this properly we need to introduce the idea of tensor products (which we do very soon).↩︎

  3. If the bases \{|a_i\rangle\} and \{|b_j\rangle\} are orthonormal then so too is the tensor product basis \{|a_i\rangle\otimes|b_j\rangle\}.↩︎

  4. ({\mathcal{H}}_a \otimes {\mathcal{H}}_b)\otimes {\mathcal{H}}_c = {\mathcal{H}}_a \otimes ({\mathcal{H}}_b\otimes {\mathcal{H}}_c).↩︎

  5. We often drop the \otimes symbol, especially when we deal with the standard tensor product basis. For example, a state of a quantum register composed of four qubits holding the binary string 1001 may be written as |1\rangle\otimes|0\rangle\otimes|0\rangle\otimes|1\rangle, as |1\rangle|0\rangle|0\rangle|1\rangle, or even simply as |1001\rangle.↩︎

  6. We often simply write H\otimes H\otimes H as H^{\otimes 3}.↩︎

  7. Even though an entangled state cannot be written as a tensor product, it can always be written as a linear combination of vectors from the tensor product basis. In fact, any state of n qubits |\psi\rangle can be expressed in the standard product basis. In general, given n qubits, we need 2(2^n-1) real parameters to describe their state vector, but only 2n to describe separable states.↩︎

  8. Here the X refers to the Pauli operator \sigma_x\equiv X that implements the bit-flip.↩︎

  9. Make sure that you understand how the Dirac notation is used here. More generally, think why |0\rangle\langle 0|\otimes A + |1\rangle\langle 1|\otimes B means “if the first qubit is in state |0\rangle then apply A to the second one, and if the first qubit is in state |1\rangle then apply B to the second one”. What happens if the first qubit is in a superposition of |0\rangle and |1\rangle?↩︎

  10. John Stewart Bell (1928–1990) was a Northern Irish physicist.↩︎

  11. For any state |\psi\rangle of two qubits, the amplitude \langle\psi_{xy}|\psi\rangle can be written as \langle xy|U^\dagger|\psi\rangle, where U^\dagger is such that |\psi_{xy}\rangle = U|xy\rangle.↩︎

  12. Divide et impera, or “divide and conquer”: a good approach to solving problems in mathematics (and in life). Start with the smaller circuits in the dashed boxes.↩︎

  13. We neglect to write the normalisation factors.↩︎

  14. Recall the big-O asymptotic notation: given a positive function f(n), we write O(f(n)) to mean “bounded above by c\,f(n) for some constant c > 0 (for sufficiently large n)”. For example, 15n^2+4n+7 is O(n^2).↩︎

  15. “Spooky action at a distance” is a loose translation of the German “spukhafte Fernwirkung”, which is the phrase that Albert Einstein used in his 1947 letter to Max Born.↩︎

  16. We always use the lexicographic order 00<01<10<11.↩︎