Prove the fidelity can be written in terms of Pauli expectation values as ${\rm tr}(\rho\sigma)=\sum_k \chi_\rho(k)\chi_\sigma(k)$

Question

I am reading through "Direct Fidelity Estimation from Few Pauli Measurements", which states that the fidelity between a desired pure state $\rho$ and an arbitrary state $\sigma$ is $\mathrm{tr}(\rho\sigma)$. It then defines a 'characteristic function' $\chi_\rho(k) = \mathrm{tr}(\rho W_k/\sqrt d)$, where the $W_k$ ($k = 1, \dots, d^2$) are all possible Pauli operators ($n$-fold tensor products of $I$, $X$, $Y$ and $Z$). It then states that:

$$\mathrm{tr}(\rho\sigma) = \sum_k \chi_\rho(k) \chi_\sigma(k)$$

This is where I get confused. How are the two sides equal? A proof would be much appreciated.

Additionally, what do they mean by 'characteristic function'? Is it the type of characteristic function defined in probability theory? If so, I don't see how it is derived.

Adam Zalcman · Accepted Answer · 2021-07-03T01:31:27.213

Background

If $v_1, v_2, \dots, v_n$ is an orthonormal basis in the inner product space $V$, then any vector $u\in V$ can be expressed as a linear combination

$$ u = \alpha_1 v_1 + \alpha_2 v_2 + \dots + \alpha_n v_n.\tag1 $$

Moreover, the coefficients can be computed using $\alpha_k=\langle v_k, u\rangle$, as can be seen by applying $\langle v_k, \cdot\rangle$ to both sides of $(1)$.

Fidelity in terms of the characteristic function

The set $L(\mathcal{H})$ of linear operators on a $d$-dimensional Hilbert space $\mathcal{H}$ forms an inner product space with the inner product defined as

$$ \langle A, B\rangle = \mathrm{tr}(A^\dagger B). $$

It is easy to check that the normalized Pauli operators $B_k = W_k/\sqrt{d}$ form an orthonormal basis in $L(\mathcal{H})$ with respect to this inner product. Therefore, any operator $\rho \in L(\mathcal{H})$ can be written as

$$ \rho = \alpha_1 B_1 + \alpha_2 B_2 + \dots + \alpha_{d^2} B_{d^2}\tag{1'} $$

and the coefficients can be computed as

$$ \alpha_k = \langle B_k, \rho\rangle = \mathrm{tr}(B_k^\dagger \rho) = \mathrm{tr}(\rho B_k) = \chi_\rho(k).\tag2 $$
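The two claims used so far (orthonormality of the $B_k$ and the expansion of $\rho$ with coefficients $\chi_\rho(k)$) can be checked numerically. The following is a minimal NumPy sketch for $n=2$ qubits; the helper names `normalized_paulis` and `hs_inner` and the randomly generated test state are my own illustrative choices, not something taken from the paper.

```python
import itertools
import numpy as np

# Single-qubit Pauli matrices.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def normalized_paulis(n):
    """Return the d^2 operators B_k = W_k / sqrt(d) for n qubits (d = 2**n)."""
    d = 2 ** n
    basis = []
    for factors in itertools.product([I2, X, Y, Z], repeat=n):
        W = factors[0]
        for P in factors[1:]:
            W = np.kron(W, P)
        basis.append(W / np.sqrt(d))
    return basis


def hs_inner(A, B):
    """Hilbert-Schmidt inner product <A, B> = tr(A^dagger B)."""
    return np.trace(A.conj().T @ B)


n = 2
d = 2 ** n
B = normalized_paulis(n)

# Orthonormality: the Gram matrix of the B_k should be the identity.
gram = np.array([[hs_inner(Bi, Bj) for Bj in B] for Bi in B])
gram_is_identity = np.allclose(gram, np.eye(d * d))

# A random density matrix (illustrative test input only).
rng = np.random.default_rng(0)
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = G @ G.conj().T
rho = rho / np.trace(rho)

# Coefficients alpha_k = <B_k, rho> = chi_rho(k), then rebuild rho from them.
chi_rho = np.array([hs_inner(Bk, rho) for Bk in B])
rho_rebuilt = sum(c * Bk for c, Bk in zip(chi_rho, B))
reconstruction_ok = np.allclose(rho, rho_rebuilt)
```

Because both $\rho$ and the $W_k$ are Hermitian, the coefficients in `chi_rho` come out real up to rounding error.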

Finally, using $(1')$ and $(2)$, we find

$$ \begin{align} \rho = & \sum_{i=1}^{d^2}\chi_\rho(i)B_i \\ \rho\sigma = & \sum_{i=1}^{d^2}\chi_\rho(i)B_i\sigma \\ \mathrm{tr}(\rho\sigma) = & \mathrm{tr}\left(\sum_{i=1}^{d^2}\chi_\rho(i)B_i\sigma\right) \\ \mathrm{tr}(\rho\sigma) = & \sum_{i=1}^{d^2}\chi_\rho(i)\,\mathrm{tr}\left(B_i\sigma\right) \\ \mathrm{tr}(\rho\sigma) = & \sum_{i=1}^{d^2}\chi_\rho(i)\chi_\sigma(i) \end{align} $$

which is the desired equality.
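As a sanity check, the identity can be verified numerically. This is only a sketch under my own assumptions: the function name `characteristic_function`, the choice $n=3$, and the random pure state $\rho$ and mixed state $\sigma$ are all illustrative.

```python
import itertools
from functools import reduce

import numpy as np

PAULIS = [
    np.eye(2, dtype=complex),                     # I
    np.array([[0, 1], [1, 0]], dtype=complex),    # X
    np.array([[0, -1j], [1j, 0]], dtype=complex), # Y
    np.array([[1, 0], [0, -1]], dtype=complex),   # Z
]


def characteristic_function(state, n):
    """Return chi(k) = tr(state W_k) / sqrt(d) for all d^2 Pauli strings W_k."""
    d = 2 ** n
    values = []
    for factors in itertools.product(PAULIS, repeat=n):
        W = reduce(np.kron, factors)
        # tr(state W) is real when both operators are Hermitian.
        values.append(np.trace(state @ W).real / np.sqrt(d))
    return np.array(values)


rng = np.random.default_rng(1)
n = 3
d = 2 ** n

# Random pure target state rho = |psi><psi|.
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi = psi / np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

# Random mixed state sigma.
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
sigma = G @ G.conj().T
sigma = sigma / np.trace(sigma).real

fidelity_direct = np.trace(rho @ sigma).real
fidelity_pauli = float(
    characteristic_function(rho, n) @ characteristic_function(sigma, n)
)
```

The two numbers `fidelity_direct` and `fidelity_pauli` should agree to within floating-point error.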

I am not aware of any connection between the characteristic function defined in the paper and the characteristic function of a random variable.

score 1 · Answer 2 · answered Apr 09 '24 at 13:22

A more general perspective on this is offered thinking in terms of operator frames.

Suppose we're working in $\mathbb{C}^d$. Let $(\mathcal O_i)_{i=1}^n$ be a set of Hermitian operators that spans the set of all Hermitian operators on $\mathbb{C}^d$. This means that any Hermitian $H$ can be decomposed as $H=\sum_i \alpha_i \mathcal O_i$ for some set of $\alpha_i\in\mathbb{R}$. This implies in particular that $n\ge d^2$.

It follows that $(\mathcal O_i)_{i=1}^n$ is a frame, which means there is another set of (Hermitian) dual operators, written $(\mathcal O_i^\star)_{i=1}^n$, such that any Hermitian $H$ decomposes as $$H=\sum_{i=1}^n \mathcal O_i \langle \mathcal O_i^\star,H\rangle = \sum_{i=1}^n \mathcal O_i^\star \langle \mathcal O_i,H\rangle,$$ where $\langle A,B\rangle\equiv \operatorname{tr}(A^\dagger B)$ is the standard Hilbert-Schmidt inner product of operators. A canonical (but not necessarily unique) way to compute the dual operators is as $$\mathcal O_i^\star = S^{-1}(\mathcal O_i), \quad \text{where}\quad S(X)\equiv \sum_i \langle \mathcal O_i,X\rangle \mathcal O_i.$$

Suppose now that $H$ and $H'$ are a pair of Hermitian operators. It follows from the above decomposition that we can always write their inner product as $$\langle H, H'\rangle \equiv \operatorname{tr}(HH') = \sum_i \langle \mathcal O_i,H\rangle \langle \mathcal O_i^\star,H'\rangle.$$ A particularly simple case occurs when the original frame $(\mathcal O_i)_{i=1}^n$ is not just a frame but an orthonormal basis, meaning $\langle \mathcal O_i,\mathcal O_j\rangle=\delta_{ij}$. This is for example the case with the Pauli operators $W_k/\sqrt d$ as you defined them. In any such case, the dual equals the original frame, $\mathcal O_i^\star=\mathcal O_i$, and thus we get the standard decomposition $$\langle H,H'\rangle = \sum_i \langle \mathcal O_i,H\rangle \langle \mathcal O_i,H'\rangle.$$

Toy example of decomposition for frames that aren't bases

To better illustrate the procedure above, say we're considering the operatorial "basis" (it's not a basis, it's a frame, because its elements aren't linearly independent) $$\{\mathcal O_i\}_{i=1}^6\equiv \{\mathbb{P}_0,\mathbb{P}_1,\mathbb{P}_+,\mathbb{P}_-,\mathbb{P}_L,\mathbb{P}_R\}.$$ In words, these are the projections onto the eigenvectors of the three Pauli operators. Computing the frame operator $S$, vectorising all operators, gives $$S = \begin{pmatrix}2&0&0&1\\0&1&0&0\\0&0&1&0\\1&0&0&2\end{pmatrix}, \quad S^{-1} = \frac13\begin{pmatrix} 2&0&0&-1\\ 0&3&0&0 \\ 0&0&3&0\\ -1&0&0&2 \end{pmatrix}.$$ It follows that the canonical dual frame is the one with operators $$\begin{gathered} \mathcal O_1^\star = \frac13 \begin{pmatrix}2&0\\0&-1\end{pmatrix}, \quad \mathcal O_2^\star = \frac13 \begin{pmatrix}-1&0\\0&2\end{pmatrix}, \quad \mathcal O_3^\star = \frac16 \begin{pmatrix}1&3 \\3&1\end{pmatrix}, \\ \mathcal O_4^\star = \frac16 \begin{pmatrix}1&-3\\-3&1\end{pmatrix}, \quad \mathcal O_5^\star = \frac16 \begin{pmatrix}1&3i \\ -3i & 1 \end{pmatrix}, \quad \mathcal O_6^\star = \frac16 \begin{pmatrix} 1 & -3i \\ 3i & 1 \end{pmatrix}. \end{gathered}$$ You can verify that these give you a more general form of the decomposition in terms of "characteristic functions" that you mentioned, except that the decomposition uses both the $\mathcal O_i$ and the $\mathcal O_i^\star$ operators together.
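The toy example can be reproduced numerically. The sketch below vectorises each operator row by row and builds $S=\sum_i |\mathcal O_i\rangle\langle \mathcal O_i|$ as a $4\times4$ matrix. One assumption to flag: I take $\mathbb{P}_L=(I-Y)/2$ and $\mathbb{P}_R=(I+Y)/2$, which is the sign convention that reproduces the $\mathcal O_5^\star,\mathcal O_6^\star$ listed above; the random Hermitian test operators are illustrative only.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Frame of six projectors. The L/R sign convention is an assumption (see lead-in).
frame = [
    (I2 + Z) / 2, (I2 - Z) / 2,   # P_0, P_1
    (I2 + X) / 2, (I2 - X) / 2,   # P_+, P_-
    (I2 - Y) / 2, (I2 + Y) / 2,   # P_L, P_R
]

# Row-major vectorisation: each 2x2 operator becomes a length-4 vector.
vecs = np.array([O.reshape(-1) for O in frame])

# Frame operator S = sum_i |O_i><O_i| acting on vectorised operators.
S = sum(np.outer(v, v.conj()) for v in vecs)
S_inv = np.linalg.inv(S)

# Canonical dual frame O_i^* = S^{-1}(O_i).
duals = [(S_inv @ v).reshape(2, 2) for v in vecs]


def hs_inner(A, B):
    """Hilbert-Schmidt inner product tr(A^dagger B)."""
    return np.trace(A.conj().T @ B)


# Two random Hermitian operators to test the decompositions.
rng = np.random.default_rng(2)
A1 = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
A2 = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
H, Hp = A1 + A1.conj().T, A2 + A2.conj().T

# H = sum_i O_i <O_i^*, H>
H_rebuilt = sum(O * hs_inner(D, H) for O, D in zip(frame, duals))

# tr(H H') = sum_i <O_i, H> <O_i^*, H'>
inner_direct = np.trace(H @ Hp).real
inner_frame = sum(hs_inner(O, H) * hs_inner(D, Hp) for O, D in zip(frame, duals)).real

# Smallest eigenvalue of each dual operator (negative: not positive semidefinite).
min_eigs = [np.linalg.eigvalsh(D).min() for D in duals]
```

Here `H_rebuilt` should match `H`, and `inner_frame` should match `inner_direct`, while every entry of `min_eigs` is negative.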

It's also interesting to note how the dual operators aren't positive semidefinite. This is a general feature of this type of calculation, which can be tied to the "positivity problems" in linear state tomography (see e.g. this answer), and can be tied to the nonclassicality of quantum states, as discussed e.g. in [FE2007] and [FME2009].

Connection with characteristic functions in probability theory

There is actually a tight connection with the usual meaning of the term "characteristic function" as well. Note that the characteristic function is by definition the inner product between the PDF and the "basis functions" $e_\nu(x) \equiv e^{2\pi i\nu x}$. Furthermore, for integer $\nu$, the set $\{e_\nu\}_{\nu\in\mathbb{Z}}$ is an orthonormal basis for the function space $L_2([0,1])$. See e.g. math.SE:827145, math.SE:2720188, math.SE:4017887. It follows that the frame decomposition for inner products discussed above for generic frames translates in this case into saying that for any $f,g\in L_2([0,1])$ you have $$\begin{gathered} f = \sum_{\nu\in\mathbb{Z}} \langle e_\nu,f\rangle e_\nu, \qquad \langle e_\nu,f\rangle \equiv \int_0^1 dx\, e^{-2\pi i \nu x}f(x), \\ \langle f, g\rangle = \sum_{\nu\in\mathbb{Z}} \overline{\langle e_\nu,f\rangle}\, \langle e_\nu,g\rangle. \end{gathered}$$ You'll recognise the latter relation as nothing but Parseval's theorem.

If $f=p$ is a PDF, then $$\langle p,e_\nu\rangle\equiv \int_0^1 dx\, e^{2\pi i \nu x} p(x) \equiv \mathbb{E}[e^{2\pi i \nu X}]$$ gives you the standard form of the characteristic function in this context.
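A small numerical illustration of this, restricted to integer frequencies $\nu$ (for which the $e_\nu$ are orthonormal on $[0,1]$): the sketch below uses the PDF $p(x)=1+\cos(2\pi x)$ and a second function $g$, both of which are my own illustrative choices. Integrals are replaced by averages over a uniform grid, which is exact for trigonometric polynomials of this low degree.

```python
import numpy as np

N = 64
x = np.arange(N) / N          # uniform grid on [0, 1)
nus = np.arange(-4, 5)        # integer frequencies -4, ..., 4


def fourier_coefficients(f_values):
    """c_nu = <e_nu, f> = int_0^1 exp(-2 pi i nu x) f(x) dx, as a grid average."""
    return np.array(
        [np.mean(np.exp(-2j * np.pi * nu * x) * f_values) for nu in nus]
    )


# p is a valid PDF on [0, 1]: nonnegative and integrating to 1.
p = 1 + np.cos(2 * np.pi * x)
g = 1 + 0.5 * np.sin(4 * np.pi * x) + np.cos(2 * np.pi * x)

c_p = fourier_coefficients(p)
c_g = fourier_coefficients(g)

# Parseval: <p, g> computed directly and from the coefficients.
inner_direct = np.mean(np.conj(p) * g).real
inner_parseval = np.sum(np.conj(c_p) * c_g).real

# Characteristic function E[exp(2 pi i nu X)] = <p, e_nu> = conj(<e_nu, p>).
char_fn = np.conj(c_p)
```

For this choice of $p$ the characteristic function is supported on $\nu\in\{-1,0,1\}$, and both `inner_direct` and `inner_parseval` evaluate $\int_0^1 p(x)g(x)\,dx$.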
