## Dealing with density operators

Now we return to quantum states, and generalise the notion of trace distance to density operators.

The **trace norm** of an operator is the sum of its singular values:
\|A\|_{\operatorname{tr}} \coloneqq \sum_i s_i(A).
If A is *normal* (e.g. a density operator), then
\|A\|_{\operatorname{tr}} = \sum_{\lambda\in\sigma(A)} |\lambda|.

The induced **trace distance** between two density operators is
d_{\operatorname{tr}}(\rho,\sigma) \coloneqq \frac{1}{2}\|\rho-\sigma\|_{\operatorname{tr}}.
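
As a minimal numerical sketch of these definitions (assuming NumPy; the helper names `trace_norm` and `trace_distance` and the two example states are illustrative choices, not anything fixed by the text), we can compute the trace distance from the singular values and compare it with the eigenvalue formula, which applies here because \rho-\sigma is Hermitian and hence normal:

```python
import numpy as np

def trace_norm(a):
    """Sum of the singular values of a square matrix."""
    return float(np.linalg.svd(a, compute_uv=False).sum())

def trace_distance(rho, sigma):
    """Half the trace norm of rho - sigma."""
    return 0.5 * trace_norm(rho - sigma)

# Two single-qubit density operators, chosen only for illustration.
rho = np.array([[1.0, 0.0], [0.0, 0.0]])    # |0><0|
sigma = np.array([[0.5, 0.5], [0.5, 0.5]])  # |+><+|

# rho - sigma is Hermitian, so its trace norm is also the sum of the
# absolute values of its eigenvalues.
eigs = np.linalg.eigvalsh(rho - sigma)
via_eigenvalues = 0.5 * float(np.abs(eigs).sum())

print(trace_distance(rho, sigma), via_eigenvalues)
```

For this particular pair both numbers should agree and equal 1/\sqrt{2}, since \rho-\sigma has eigenvalues \pm 1/\sqrt{2}.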

This definition raises many questions, such as “how does this relate to the trace distance of probability distributions?” and “how does this trace norm relate to the operator norm from Section 11.4?”. We will answer the first question now, but our answer to the second builds upon the notion of an \ell^p-norm, a discussion that we postpone until Section 11.10.2.

We can simply think of the trace distance for density operators as the natural analogue of the trace distance for probability distributions: it is a tight upper bound on the distances between the probability distributions obtained from \rho and \sigma by a measurement, as we now justify.

Let \{P_k\} be a complete set of orthogonal projectors, defining a projective measurement on some Hilbert space \mathcal{H}.
This measurement gives outcome k with some probability p(k) if the quantum system is in state \rho, and the same outcome with some probability q(k) if the system is in state \sigma.
That is,
\begin{aligned}
p(k) &\coloneqq \operatorname{tr}P_k\rho
\\q(k) &\coloneqq \operatorname{tr}P_k\sigma.
\end{aligned}
Then
\begin{aligned}
d_{\operatorname{tr}}(p,q)
&\coloneqq \frac{1}{2}\sum_k|p(k)-q(k)|
\\&= \frac{1}{2}\sum_k|\operatorname{tr}P_k(\rho-\sigma)|
\\&= \frac{1}{2}\operatorname{tr}((\rho-\sigma)U)
\end{aligned}
where we define
U\coloneqq \sum_k\frac{\operatorname{tr}P_k(\rho-\sigma)}{|\operatorname{tr}P_k(\rho-\sigma)|}P_k
or, in other words, U is the sum of the P_k with each term weighted by the sign of \operatorname{tr}P_k(\rho-\sigma). (If \operatorname{tr}P_k(\rho-\sigma)=0 for some k, then the corresponding coefficient is undefined as written, and we may take it to be +1 without affecting the sum.)

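Since this U is unitary (the P_k are mutually orthogonal and sum to the identity, and each coefficient is \pm1, so U^\dagger U=\sum_k P_k is the identity), and since the trace norm can be written as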
\|A\|_{\operatorname{tr}} = \max_{U\text{ unitary}}|\operatorname{tr}AU|
we finally obtain that
\begin{aligned}
d_{\operatorname{tr}}(p,q)
&\coloneqq \frac{1}{2}\sum_k|p(k)-q(k)|
\\&\leqslant\frac{1}{2}\|\rho-\sigma\|_{\operatorname{tr}}
\\&\eqqcolon d_{\operatorname{tr}}(\rho,\sigma)
\end{aligned}
which says that the trace distance d_{\operatorname{tr}}(\rho,\sigma) gives an upper bound on distances between probability distributions obtained from \rho and \sigma by a measurement.
The fact that this bound is tight (i.e. attainable) is witnessed by the measurement defined by the projectors onto the eigenspaces of \rho-\sigma.
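
The bound and its tightness can be checked numerically. The following is a sketch under stated assumptions: it uses NumPy, restricts attention to rank-one projective measurements given by orthonormal bases, and the helpers `random_density` and `measurement_distance` are hypothetical names introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_density(dim):
    """Random density operator: G G^dagger normalised to unit trace."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    m = g @ g.conj().T
    return m / np.trace(m).real

def trace_distance(rho, sigma):
    """Half the sum of the singular values of rho - sigma."""
    return 0.5 * float(np.linalg.svd(rho - sigma, compute_uv=False).sum())

def measurement_distance(rho, sigma, basis):
    """Classical trace distance between the outcome distributions of the
    measurement with projectors |b_k><b_k|, where b_k are the columns of
    `basis` (assumed orthonormal)."""
    p = np.einsum("ik,ij,jk->k", basis.conj(), rho, basis).real
    q = np.einsum("ik,ij,jk->k", basis.conj(), sigma, basis).real
    return 0.5 * float(np.abs(p - q).sum())

dim = 4
rho, sigma = random_density(dim), random_density(dim)
d_quantum = trace_distance(rho, sigma)

# Random orthonormal bases (Q factor of a random complex matrix):
# the classical distance should never exceed the quantum one.
random_gaps = []
for _ in range(200):
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q_basis = np.linalg.qr(g)[0]
    random_gaps.append(d_quantum - measurement_distance(rho, sigma, q_basis))

# Measuring in an eigenbasis of rho - sigma should attain the bound.
eigvecs = np.linalg.eigh(rho - sigma)[1]
d_eigen = measurement_distance(rho, sigma, eigvecs)

print(min(random_gaps) >= -1e-12, abs(d_eigen - d_quantum) < 1e-10)
```

In an eigenbasis of \rho-\sigma each difference p(k)-q(k) is an eigenvalue of \rho-\sigma, so the classical sum reproduces the eigenvalue formula for the trace norm.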

As an example, consider pure states |u\rangle and |v\rangle.
The trace distance between them is
\frac{1}{2}\||u\rangle\langle u|-|v\rangle\langle v|\|_{\operatorname{tr}}.
We can write |v\rangle as
|v\rangle = \alpha|u\rangle + \beta|\bar{u}\rangle
where |\bar{u}\rangle is some unit vector orthogonal to |u\rangle, and where \alpha=\langle u|v\rangle, with \beta determined by |\alpha|^2+|\beta|^2=1.
Then, in the basis \{|u\rangle,|\bar{u}\rangle\},
\begin{aligned}
|u\rangle\langle u| - |v\rangle\langle v|
&= \begin{bmatrix}1&0\\0&0\end{bmatrix} - \begin{bmatrix}|\alpha|^2&\alpha\beta^\star\\\alpha^\star\beta&|\beta|^2\end{bmatrix}
\\&= \begin{bmatrix}|\beta|^2&-\alpha\beta^\star\\-\alpha^\star\beta&-|\beta|^2\end{bmatrix}
\end{aligned}
(which has eigenvalues \pm|\beta|, and hence trace norm 2|\beta|), and the trace distance is given by
\frac{1}{2}\||u\rangle\langle u|-|v\rangle\langle v|\|_{\operatorname{tr}} = |\beta| = \sqrt{1-|\langle u|v\rangle|^2}
which is exactly \sqrt{1-\text{fidelity}}.
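
A quick numerical check of this formula, as a sketch assuming NumPy (the helpers `random_state` and `projector` are illustrative names). It deliberately works in dimension 3 rather than 2: the difference of the two projectors vanishes outside the span of |u\rangle and |\bar{u}\rangle, so the only extra eigenvalue should be zero.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_state(dim):
    """Random unit vector in C^dim."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def projector(v):
    """Rank-one projector |v><v|."""
    return np.outer(v, v.conj())

u, v = random_state(3), random_state(3)
diff = projector(u) - projector(v)

overlap = abs(np.vdot(u, v))       # |<u|v>|; vdot conjugates its first argument
beta = np.sqrt(1.0 - overlap**2)   # |beta| in the notation of the text

eigs = np.linalg.eigvalsh(diff)    # real eigenvalues, in ascending order
trace_dist = 0.5 * float(np.abs(eigs).sum())

print(eigs, beta, trace_dist)
```

The printed eigenvalues should be -|\beta|, 0, and +|\beta| up to rounding, and the trace distance should coincide with |\beta|.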

As a consequence of this, we see that
\frac{1}{2}\||u\rangle\langle u|-|v\rangle\langle v|\|_{\operatorname{tr}}
\leqslant\|u-v\|
since
\begin{aligned}
1 - |\langle u|v\rangle|^2
&= \Big( 1+|\langle u|v\rangle| \Big) \Big( 1-|\langle u|v\rangle| \Big)
\\&\leqslant 2\Big( 1-|\langle u|v\rangle| \Big)
\\&\leqslant 2\Big( 1-\operatorname{Re}\langle u|v\rangle \Big)
\\&= \|u-v\|^2.
\end{aligned}
So if two state vectors |u\rangle and |v\rangle are \varepsilon-close in the Euclidean norm, then the corresponding states are \varepsilon-close in the trace distance, and hence the probability distributions of outcomes of *any* measurement performed on a physical system in state |u\rangle or |v\rangle will also be \varepsilon-close in the trace distance.
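
The bound \frac{1}{2}\||u\rangle\langle u|-|v\rangle\langle v|\|_{\operatorname{tr}}\leqslant\|u-v\| can also be probed numerically. The sketch below (assuming NumPy; `random_state` and `pure_trace_distance` are illustrative helpers) uses the identity \|u-v\|^2 = 2-2\operatorname{Re}\langle u|v\rangle and sweeps a global phase on |v\rangle: the phase changes the Euclidean distance between the vectors but not the state |v\rangle\langle v|, so the trace distance stays fixed.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_state(dim):
    """Random unit vector in C^dim."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

def pure_trace_distance(u, v):
    """sqrt(1 - |<u|v>|^2), the trace distance between |u><u| and |v><v|."""
    return float(np.sqrt(max(0.0, 1.0 - abs(np.vdot(u, v)) ** 2)))

u, v = random_state(4), random_state(4)
inner = np.vdot(u, v)  # <u|v>

d_tr = pure_trace_distance(u, v)
middle = float(np.sqrt(2.0 * (1.0 - abs(inner))))

# Euclidean distance between u and e^{i*theta} v for a sweep of phases.
phases = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 361))
euclid = np.array([np.linalg.norm(u - ph * v) for ph in phases])

# Choosing the phase that makes <u|v> real and non-negative minimises
# the Euclidean distance, which then equals sqrt(2(1 - |<u|v>|)).
aligned = np.conj(inner) / abs(inner)
euclid_aligned = float(np.linalg.norm(u - aligned * v))

print(d_tr <= middle <= euclid.min() + 1e-12, abs(euclid_aligned - middle) < 1e-12)
```

The Euclidean distance is smallest when the phase of |v\rangle is chosen so that \langle u|v\rangle is real and non-negative, and even then it is no smaller than the trace distance.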