Can we combine the square roots inside the definition of the fidelity?

Question

The (Uhlmann-Jozsa) fidelity of quantum states $\rho$ and $\sigma$ is defined to be $$F(\rho, \sigma) := \left(\mathrm{tr} \left[\sqrt{\sqrt{\rho} \sigma \sqrt{\rho}} \right]\right)^2.$$ However, as discussed here, the cyclical property of the trace extends to arbitrary analytic functions: $$\mathrm{tr}[f(AB)] \equiv \mathrm{tr}[f(BA)]$$ for any analytic function $f$ whenever either side is well-defined. Letting $f$ be the square root function, this seems to imply that $$F(\rho, \sigma) \equiv \big(\mathrm{tr} \left[\sqrt{\rho \sigma} \right]\big)^2,$$ which is much easier to deal with. (I don't think the branch point at the origin of the square root function is an issue, because the function is still continuous there.)

Am I correct that these two expressions are equivalent?
If so, is there any reason why the much clunkier former expression with nested square roots is always given?

The only benefit of the original definition that I can see is that it makes it clear that the operator inside the trace is Hermitian and positive-semidefinite, so that the resulting fidelity is a non-negative real number.

Related on physics.SE

Jonathan Jones · Answer 1 · 2023-10-04T14:14:54.443

This result is indeed correct and gets rediscovered from time to time. We rediscovered it last year and got it published at Phys Rev A 107, 012427 (2023) (free version at arXiv:2211.02623). But it's certainly not original to us: there's quite a good list of earlier and later versions at this SciRate discussion of a later discovery.

The earliest clear statement of the final fidelity result I have found so far is in equation 5.4 of Some geometric interpretations of quantum fidelity by Jin Li, Rajesh Pereira, and Sarah Plosker (2015). However there are very close statements in other places, and there are plenty of earlier statements of results in contexts other than quantum fidelity, most notably the final equation in Some eigenvalue inequalities for positive semidefinite matrix power products by Bo-Ying Wang and Ming-Peng Gong (1993). If you don't like our proof then I can also recommend A Simplified Expression for Quantum Fidelity by Adrian Müller (2023) which trigged the SciRate discussion mentioned above.

Danylo Y · Accepted Answer · 2020-02-24T18:32:23.910

Square root is not differentiable at 0, so that cyclic property cannot be applied
While $\rho\sigma$ has the same non-negative eigenvalues as $\sqrt\rho\sigma\sqrt\rho$, it's not self-adjoint. Non self-adjoint matrices are not diagonalizable in general, so the square root $\sqrt{\rho\sigma}$ can be not well-defined (see edit below).
Anyway, $\text{Tr}(\sqrt{\sqrt\rho\sigma\sqrt\rho})$ is equal to the sum of square roots of eigenvalues of $\rho\sigma$.

EDIT
It turns out that $\rho\sigma$ is always diagonalizable https://www.sciencedirect.com/science/article/pii/002437959190239S

So, taking principal square root of it is a correct operation. And it is indeed possible to write this shorter formula. Though this is not very well known and not conventional, since $\rho\sigma$ is not self-adjoint.

score 3 · Answer 3 · answered Oct 06 '23 at 12:13

The reasoning of OP seems correct to me. I think it is best understood in terms of eigenvalues.

Quantum fidelity, typically defined as $$ F(\rho,\sigma) = \text{Tr}\left(\sqrt{\sqrt{\rho}\sigma\sqrt{\rho}}\right)^2 $$ is essentially summing up eigenvalues (because $\text{Tr}\, A = \sum_i \lambda_i(A)$). Now, $\sqrt{\rho}\sigma\sqrt{\rho}$ has the same characteristic polynomial as $\rho\sigma$, and thus the same set of eigenvalues. And because the matrix square root is defined as a transformation of the eigenvalues, $\sqrt{\sqrt{\rho}\sigma\sqrt{\rho}}$ has the same set of eigenvalues as $\sqrt{\rho\sigma}$, too, which gets us to $$ F(\rho,\sigma) = \text{Tr}\left(\sqrt{\rho\sigma}\right)^2 $$

A more detailed version of this argument can be found in my preprint arxiv:2309.10565 together with some efficiency analysis comparing with the methods common quantum libraries currently use. Cheers!

tparker · Answer 4 · 2020-02-24T13:20:25.100

Okay, this is a rather subtle situation, but I think I've figured it out. The key is to be very careful about which mathematical results about Hermitian operators do and do not hold for generic operators. Let $H$ represent an arbitrary Hermitian matrix, $N$ an arbitrary normal one, $D$ be a generic diagonalizable matrix, and $M$ an arbitrary matrix, all acting on an $n$-dimensional Hilbert space.

Subtlety 1: For normal $N$, the numerical range $$\left \{ \frac{\langle \psi | N | \psi \rangle}{\langle \psi|\psi\rangle} \right \}$$ for all nonzero $\psi$ in the Hilbert space is the convex hull of the eigenvalues of $N$. (So for $H$ Hermitian, it's the real interval $[\min \lambda, \max \lambda]$.) For generic $M$, the numerical range is still convex and contains the eigenvalues, but is not necessarily a hull for them.

Subtlety 2: Therefore, there are two definitions of "positive (semi-)definite" that are equivalent for Hermitian $H$ but not for generic $M$:

PD1: A matrix is positive (semi-)definite (PD1) if all of its eigenvalues are positive (nonnegative real).

PD2: A matrix $M$ is positive (semi-)definite (PD2) if $\langle \psi | M | \psi \rangle > (\geq)\ 0$ for all nonzero $|\psi\rangle$ in the Hilbert space.

PD1 and PD2 are equivalent for Hermitian $H$, but PD2 (which I think is the "real" meaning of positive-definite) is strictly stronger for generic $M$. See here for an example of a non-Hermitian $M$ that satisfies PD1 but not PD2.

Subtlety 3: There are two inequivalent definitions of a square root of a matrix.

SR1: $R$ is a square root (SR1) of a generic matrix $M$ if $R^2 = M$. Under this definition, a matrix has a finite number of square roots (e.g. $2^n$ if its eigenvalues are all distinct). This definition makes sense for any matrix. I'm not sure whether or not there's generically a natural choice of "principal" square root in this situation (e.g. if $M$ is defective), so the notation $\sqrt{M}$ is not (as far as I know) well defined.

SR2: $R$ is a square root (SR2) of a positive definite Hermitian matrix $P$ if $R^\dagger R = P$. (Since $P$ is Hermitian, we don't need to specify whether we mean PD1 or PD2 for "positive definite".) Under this definition, the set of square roots of a matrix $P$ is isomorphic to the Lie group $U(n)$, because if $R_1$ is a square root of $P$, then $R_2$ is a square root of $P$ iff $R_2 = U R_1$ for some unitary matrix $U$. $R$ is not necessarily Hermitian. But under this definition, we can define the natural "principal" square root of $P$, which we denote by $L = \sqrt{P}$, as the unique square root that is also Hermitian and positive (semi-)definite (again, no need to distinguish PD1 from PD2 here).

Since the principal square root $L$ is Hermitian by definition, it respects both definitions SR1 and SR2, as $L^\dagger L = L^2 = P$. But a generic square root (SR1) of $P$ will not be a square root (SR2) of $P$ or vice versa.

For a Hermitian $H$, the usual power series expansion of the square root function $$\sum_{n=0}^\infty \frac{(-1)^n (2n)!}{(1-2n)(n!)^2 4^n} (H - I)^n$$ will converge to $\sqrt{H}$ iff all the eigenvalues of $H$ lie in the interval $[0,2]$. For a generic diagonalizable matrix $D$, this series will converge to a square root of $D$ iff all of the eigenvalues of $D$ lie in the disk in the complex plane that has that interval as a diameter. (If I recall correctly, the boundary points 0 and 2 are included, but the boundary points with nonzero imaginary part are all excluded.)

Subtlety 4: If either of two generic matrices $A$ or $B$ is invertible, then $AB$ and $BA$ are similar, but if both $A$ and $B$ are singular, then $AB$ and $BA$ are not necessarily similar (see here for a counterexample). But even in this case, $AB$ and $BA$ always have the same eigenvalues and in fact characteristic polynomials, so (for example) their traces and regions of convergence for any formal power series will be the same.

Subtlety 5: If we have two Hermitian positive-definite matrices $P_1$ and $P_2$, then their non-Hermitian product $P_1 P_2$ will satisfy PD1 but not necessarily PD2 (see the first link above for a counterexample), so it may or may not necessarily be "positive definite", depending on your definition.

Now we can finally try to answer my question. The standard definition of the fidelity is unambiguous, because only Hermitian positive-semidefinite operators are ever getting square rooted. Since $\rho \sigma$ is non-Hermitian, its numerical range is generically complex and it does not satisfy PD2. Moreover, we can't talk about its square roots using definition SR2. And generically, the notation $\sqrt{M}$ may not be meaningful for a non-Hermitian $M$ because it implies some natural principal branch.

But we can talk about the square roots (plural) of $\rho \sigma$ under definition SR1, as with any matrix. Moreover, $\rho \sigma$ is a highly non-generic matrix. It satisfies PD1 by Subtlety 5. In fact, by Subtlety 4, $\rho \sigma$ has the same characteristic polynomial (with all roots lying in $[0,1]$) as $\sqrt{\rho} \sigma \sqrt{\rho}$. So because the eigenvalues all lie in this interval (which is obviously not generically true), there is a natural choice of principal square root: the one given by the usual power series expansion above, which is guaranteed to converge in light of its eigenvalue spectrum. So in this particular case, we can get away with defining $\sqrt{\rho \sigma}$ by the formal power series above. Then by the logic outlined in my question, we can indeed cycle the operators inside the square root and always get the right answer.

TLDR: The expression $\sqrt{M}$ is not uniquely defined for a generic matrix $M$ that is not Hermitian and positive semidefinite. But in this case, the special properties of the matrix $\rho \sigma$ guarantee that the formal power series above converges, so we can use that power series to (non-conventionally) define $\sqrt{\rho \sigma}$. If we use that convention, then we will indeed always get the same answer as the traditional definition. However, this is a bit of a hack, and the traditional definition's meaning is clear without needing to make any additional implicit definitions.

Can we combine the square roots inside the definition of the fidelity?

4 Answers4

Linked