## 4.9 Distinguishability of non-orthogonal states

We have already mentioned that non-orthogonal states cannot be reliably distinguished, and now it is time to make this statement more precise. Suppose Alice sends Bob a message by choosing one of the two non-orthogonal states |s_1\rangle and |s_2\rangle, where both are equally likely to be chosen. What is the probability that Bob will decode the message correctly and what is the best (i.e. the one that maximises this probability) measurement? As a general rule, before you embark on any calculations, check for symmetries, special cases, and anything that may help you to visualise the problem and make intelligent guesses about the solution. One of the most powerful research tools is a good guess. In fact, this is what real research is about: educated guesses that guide your calculations. In this particular case you can use symmetry arguments to guess the optimal measurement (see Figure 4.2). Once you have guessed the answer, you might as well do the calculations.

Figure 4.2: The optimal measurement to distinguish between the two equally likely non-orthogonal signal states |s_1\rangle and |s_2\rangle is described by the two orthogonal vectors |d_1\rangle and |d_2\rangle placed symmetrically around the signal states.

Suppose Bob’s measurement is described by projectors P_1 and P_2 = \mathbf{1}-P_1, with the inference rule “P_1 implies |s_1\rangle; P_2 implies |s_2\rangle”. Then

\begin{aligned} \Pr(\text{success}) &= \frac12\left( \langle s_1|P_1|s_1\rangle + \langle s_2|P_2|s_2\rangle \right) \\&= \frac12\left( \operatorname{tr}P_1|s_1\rangle\langle s_1| + \operatorname{tr}P_2|s_2\rangle\langle s_2| \right) \\&= \frac12\left( \operatorname{tr}P_1|s_1\rangle\langle s_1| + \operatorname{tr}(\mathbf{1}-P_1)|s_2\rangle\langle s_2| \right) \\&= \frac12\left( 1 + \operatorname{tr}P_1\left( |s_1\rangle\langle s_1| - |s_2\rangle\langle s_2| \right) \right). \end{aligned}

Let us look at the operator D = |s_1\rangle\langle s_1| - |s_2\rangle\langle s_2| that appears in the last expression. This operator acts on the subspace spanned by |s_1\rangle and |s_2\rangle; it is Hermitian; and, since \operatorname{tr}D=\langle s_1|s_1\rangle-\langle s_2|s_2\rangle=0, the sum of its two (real) eigenvalues is zero. Let us write D as \lambda(|d_+\rangle\langle d_+| - |d_-\rangle\langle d_-|), where |d_\pm\rangle are the two orthonormal eigenstates of D, and \pm\lambda (with \lambda\geqslant 0) are the corresponding eigenvalues. Now we write

\begin{aligned} \Pr (\text{success}) &= \frac12\left( 1 + \lambda\operatorname{tr}P_1\left( |d_+\rangle\langle d_+|-|d_-\rangle\langle d_-| \right) \right) \\&\leqslant\frac12\left( 1+\lambda \langle d_+|P_1|d_+\rangle \right) \end{aligned}

where we have dropped the term -\lambda\operatorname{tr}P_1|d_-\rangle\langle d_-|, which is never positive. Since \langle d_+|P_1|d_+\rangle\leqslant 1, the probability of success is bounded by \frac12(1+\lambda), and it is easy to see that the bound is attained by choosing P_1 = |d_+\rangle\langle d_+| (and P_2 = |d_-\rangle\langle d_-|). All we have to do now is to find the positive eigenvalue \lambda for the operator D.

We can do this, of course, by solving the characteristic equation for a matrix representation of D, but, as we are now practising the trace operations, we can also notice that \operatorname{tr}D^2 = 2\lambda^2, and then evaluate the trace of D^2. We use the trace identities and obtain

\begin{aligned} \operatorname{tr}D^2 &= \operatorname{tr}\left( |s_1\rangle\langle s_1|-|s_2\rangle\langle s_2| \right) \left( |s_1\rangle\langle s_1|-|s_2\rangle\langle s_2| \right) \\&= \langle s_1|s_1\rangle^2 + \langle s_2|s_2\rangle^2 - 2\langle s_1|s_2\rangle\langle s_2|s_1\rangle \\&= 2-2|\langle s_1|s_2\rangle|^2 \end{aligned}

which gives \lambda = \sqrt{1-|\langle s_1|s_2\rangle|^2}. Bringing it all together we have the final expression:

\Pr (\text{success}) = \frac12\left( 1+ \sqrt{1-|\langle s_1|s_2\rangle|^2} \right).
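
The derivation above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `optimal_discrimination` and the randomly drawn test states are illustrative choices, not part of the text. It builds D for two states, diagonalises it, and compares the positive eigenvalue, \operatorname{tr}D^2, and the success probability of the measurement P_1 = |d_+\rangle\langle d_+| with the closed-form expressions.

```python
import numpy as np

def optimal_discrimination(s1, s2):
    """Hypothetical helper: build D, diagonalise it, compare with the closed forms."""
    s1 = np.asarray(s1, dtype=complex)
    s2 = np.asarray(s2, dtype=complex)
    s1 = s1 / np.linalg.norm(s1)  # normalise the signal states
    s2 = s2 / np.linalg.norm(s2)

    # D = |s1><s1| - |s2><s2|
    D = np.outer(s1, s1.conj()) - np.outer(s2, s2.conj())

    # eigh returns real eigenvalues in ascending order, so the last one is +lambda
    eigvals, eigvecs = np.linalg.eigh(D)
    lam = eigvals[-1]
    d_plus = eigvecs[:, -1]

    # Measurement: P1 = |d+><d+|, P2 = 1 - P1
    P1 = np.outer(d_plus, d_plus.conj())
    P2 = np.eye(len(s1)) - P1
    p_success = 0.5 * (np.vdot(s1, P1 @ s1) + np.vdot(s2, P2 @ s2)).real

    overlap = abs(np.vdot(s1, s2))  # |<s1|s2>|
    return {
        "lambda_numeric": lam,
        "lambda_formula": np.sqrt(1 - overlap**2),
        "trace_D_squared": np.trace(D @ D).real,
        "p_success_numeric": p_success,
        "p_success_formula": 0.5 * (1 + np.sqrt(1 - overlap**2)),
    }

# Two random (generally non-orthogonal) qubit states; the seed is arbitrary.
rng = np.random.default_rng(seed=1)
s1 = rng.normal(size=2) + 1j * rng.normal(size=2)
s2 = rng.normal(size=2) + 1j * rng.normal(size=2)

result = optimal_discrimination(s1, s2)
for name, value in result.items():
    print(f"{name}: {value:.6f}")
```

The numeric and formula entries should agree to within floating-point error, and `trace_D_squared` should equal twice the square of `lambda_numeric`.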

We can parametrise |\langle s_1|s_2\rangle| = \cos\alpha, with 0\leqslant\alpha\leqslant\pi/2, and interpret \alpha as the angle between |s_1\rangle and |s_2\rangle. This allows us to express our findings in a clearer way: given two equally likely states, |s_1\rangle and |s_2\rangle, such that |\langle s_1|s_2\rangle| = \cos\alpha, the probability of correctly identifying the state by a projective measurement is bounded by \Pr (\text{success}) \leqslant \frac12(1 + \sin\alpha), and the optimal measurement that achieves this bound is determined by the eigenvectors of D = |s_1\rangle\langle s_1|-|s_2\rangle\langle s_2| (try to visualise these eigenvectors).
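
One way to visualise the eigenvectors of D is to work with real vectors in the plane. The sketch below assumes NumPy and a particular, arbitrary choice of coordinates: the two signal states are placed symmetrically about the horizontal axis, at angles \pm\alpha/2. The names `angle_deg`, `d_plus`, and `d_minus` are illustrative. It computes the eigenvectors of D and reports the angles they make with the signal states.

```python
import numpy as np

alpha = np.pi / 3  # angle between the two signal states (an arbitrary choice)

# Real signal states placed symmetrically about the horizontal axis.
s1 = np.array([np.cos(alpha / 2),  np.sin(alpha / 2)])
s2 = np.array([np.cos(alpha / 2), -np.sin(alpha / 2)])

# D = |s1><s1| - |s2><s2| for real vectors
D = np.outer(s1, s1) - np.outer(s2, s2)

# eigh sorts eigenvalues in ascending order: column 0 is d_-, column 1 is d_+
eigvals, eigvecs = np.linalg.eigh(D)
d_minus, d_plus = eigvecs[:, 0], eigvecs[:, 1]

def angle_deg(u, v):
    """Angle between the lines spanned by u and v (insensitive to overall sign)."""
    return np.degrees(np.arccos(min(1.0, abs(np.dot(u, v)))))

print("eigenvalues:", eigvals, " sin(alpha):", np.sin(alpha))
print("angle(d_+, s_1):", angle_deg(d_plus, s1))
print("angle(d_-, s_2):", angle_deg(d_minus, s2))
print("angle(d_+, d_-):", angle_deg(d_plus, d_minus))
```

In these coordinates D reduces to \sin\alpha times the Pauli matrix \sigma_x, so the eigenvectors lie along the two diagonals: |d_+\rangle makes an angle of \pi/4-\alpha/2 with |s_1\rangle, and |d_-\rangle makes the same angle with |s_2\rangle. This is the symmetric arrangement that the symmetry argument suggests.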

It makes sense, right? If we try just guessing the state, without any measurement, then we expect \Pr (\text{success}) = \frac12. This is our lower bound, and in any attempt to distinguish the two states we should do better than that. If the two signal states are very close to each other then \sin\alpha is small and we are slightly better off than guessing. As we increase \alpha, the two states become more distinguishable, and, as we can see from the formula, when the two states become orthogonal they also become completely distinguishable.
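
The limiting cases can also be explored with a small Monte Carlo simulation of the Alice and Bob protocol. This is a sketch under stated assumptions: NumPy is used, the signal states are real and placed at angles \pm\alpha/2 from the horizontal axis, Bob measures along the two diagonals (the eigenvectors of D in these coordinates), and the function name `simulate_success_rate`, the number of trials, and the seed are all illustrative.

```python
import numpy as np

def simulate_success_rate(alpha, trials=20000, seed=0):
    """Estimate Bob's success rate by sampling measurement outcomes."""
    rng = np.random.default_rng(seed)

    # Signal states at angles +alpha/2 and -alpha/2 from the horizontal axis.
    s = [np.array([np.cos(alpha / 2),  np.sin(alpha / 2)]),
         np.array([np.cos(alpha / 2), -np.sin(alpha / 2)])]
    # Bob's measurement vector d_+ (d_- is the orthogonal diagonal).
    d_plus = np.array([1.0, 1.0]) / np.sqrt(2)

    # Alice picks state 0 or 1 with equal probability in each trial.
    sent = rng.integers(0, 2, size=trials)

    # Born rule: probability of the outcome d_+ for each possible signal state.
    p_plus = np.array([np.dot(d_plus, s[0]) ** 2, np.dot(d_plus, s[1]) ** 2])

    # Outcome d_+ means Bob guesses state 0; outcome d_- means he guesses state 1.
    guess = np.where(rng.random(trials) < p_plus[sent], 0, 1)
    return np.mean(guess == sent)

for alpha in np.linspace(0.0, np.pi / 2, 7):
    simulated = simulate_success_rate(alpha)
    predicted = 0.5 * (1 + np.sin(alpha))
    print(f"alpha = {np.degrees(alpha):5.1f} deg   "
          f"simulated = {simulated:.3f}   predicted = {predicted:.3f}")
```

The simulated rate should stay close to \frac12 when \alpha is near zero and approach 1 as \alpha approaches \pi/2, with deviations from \frac12(1+\sin\alpha) that shrink as the number of trials grows.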