Why does the twirl of a quantum channel give a depolarizing channel?

Question

I would like to understand in detail why the twirl of a quantum channel gives a depolarizing channel, which is the starting point of randomized benchmarking. To be self-contained, let me set up the notation.

Let $\hat{U}$ denote a superoperator that acts on the density matrix as $\hat{U}(\rho)\equiv U\rho U^\dagger$, where $U$ (without the hat) is the corresponding unitary. Let $\hat{\Lambda}$ be a quantum channel such that $\hat{\Lambda}(\rho)=\sum_kA_k\rho A_k^\dagger$, where the $A_k$ are its Kraus operators. We use $\circ$ to denote the composition of superoperators: $\hat{U}_1\circ\hat{U}_2(\rho)=U_1U_2\rho U_2^\dagger U_1^\dagger$. The twirl of a quantum channel is defined as $\hat{\Lambda}_t\equiv\int dU\,\hat{U}\circ\hat{\Lambda}\circ\hat{U}^\dagger$, where the integral is over the Haar measure on the unitary group. It is equal to a depolarizing channel in the sense that

\begin{equation} \hat{\Lambda}_t(X) = (1-p_d)X + \frac{p_d}{D}\text{tr}(X)I \end{equation}

for any operator $X$. I would like to understand the derivation of this fact.

What I know is that the twirl commutes with an arbitrary unitary superoperator, $\hat{U}\circ\hat{\Lambda}_t = \hat{\Lambda}_t\circ\hat{U}$, which hints that I should use some form of Schur's lemma in the natural representation of $\hat{\Lambda}$, but I am not sure how to proceed...

Resources that I have found:

1. Nielsen's paper, but I don't understand his argument below Eq. (10).
2. The original RB paper, where I don't understand their Eq. (46), so I guess I am missing some group theory here.
3. Meier's thesis, essentially following 2 but with a slightly different representation, which I could not follow either.

Any help to fill in the gap is really appreciated!

Markus Heinrich · Answer 1 · 2022-11-07T08:22:07.963

I hope you do not mind if I zoom out a bit and talk about representation theory. I think a more general approach helps understanding the essential bits and will be helpful if you encounter similar expressions in the future.

Let $G$ be a (compact) group equipped with its Haar measure and $\rho$ a (finite-dimensional unitary) representation on a Hilbert space $H$. The operator $$ \Pi_G(X) := \int_G \rho(g)X\rho(g)^\dagger\,\mathrm{d}g, $$ is the orthogonal projection onto the commutant $\rho'$ of $\rho$, i.e. on all operators $X\in L(H)$ which commute with $\rho$. We can write $\Pi_G$ using an orthogonal basis of this subspace.

Clearly, we want to use Schur's lemma but $\rho$ is generally a reducible representation. Hence, let's decompose $\rho$ into irreps $$ \rho = \bigoplus_\lambda \rho_\lambda \otimes \mathrm{id}_{n_\lambda}, $$ where $n_\lambda$ is the multiplicity of the irrep $\lambda$. We can now apply Schur's lemma irrep-wise: any $X\in L(H)$ can be written into matrix blocks as follows $$ X = \bigoplus_{\lambda,\lambda'} X_{\lambda,\lambda'}. $$ If $X$ is in the commutant $\rho'$, $ X_{\lambda,\lambda'} = 0$ if $\lambda\neq\lambda'$ because of Schur's lemma. Moreover, if $\lambda=\lambda'$, then $X$ can still be non-identity on the multiplicity space, hence $$ X = \bigoplus_\lambda \mathrm{id}_\lambda \otimes X_{\lambda}, \qquad X_{\lambda} \in \mathbb{C}^{n_\lambda\times n_\lambda}. $$ Special case: if an irrep is multiplicity-free, $n_\lambda=1$ and thus $X_{\lambda}\in\mathbb C$ is just a number.

Remark: From the above formula it is easy to see that the dimension of the commutant $\rho'$ is $\dim\rho'=\sum_{\lambda}n_\lambda^2$. It is also a one-line proof using a character formula.

Next, writing out the projection onto the commutant is simpler in the multiplicity-free case, so let's do this first. Then, any $X\in\rho'$ is of the form $$ X = \bigoplus_\lambda x_\lambda \mathrm{id}_\lambda, \qquad x_\lambda\in\mathbb C. $$ However, note that the projectors $P_\lambda\in L(H)$ onto the irreps of $\rho$ in $H$ have the form $$ P_\lambda = 0 \oplus \dots \oplus 0 \oplus \mathrm{id}_\lambda \oplus 0 \oplus \dots \oplus 0. $$ Hence, $$ X = \sum_\lambda x_\lambda P_\lambda. $$ Moreover, the $P_\lambda$ are orthogonal and $\|P_\lambda\|_2 = \sqrt{d_\lambda}$, such that $$ X = \sum_\lambda \frac{1}{d_\lambda}\mathrm{tr}(P_\lambda X) P_\lambda. $$ Hence, our formula for the projection onto $\rho'$ is $$ \Pi_G(X) = \sum_\lambda \frac{1}{d_\lambda}\mathrm{tr}(P_\lambda X) P_\lambda. $$

Now, let's apply this to the unitary group. So we have $G=U(d)$ and $\rho(U) = U(\cdot)U^\dagger$ acting on $H=L(\mathbb C^d)$. The irreps of $\rho$ are the trivial one, spanned by $I$, and the traceless subspace of dimension $d^2-1$. Both are multiplicity-free and have the projectors $$ P_1(X) = \frac{1}{d}\mathrm{tr}(X) I, \qquad P_0(X) = X - P_1(X) = X - \frac{1}{d}\mathrm{tr}(X) I. $$ To evaluate the channel twirl of a channel $\Lambda$, we note that, since $\Lambda$ is trace-preserving, $$ \mathrm{tr}(P_1\Lambda) = \frac{1}{d}\mathrm{tr}(\Lambda(I)) = 1, \qquad \mathrm{tr}(P_0\Lambda) = \mathrm{tr}(\Lambda) - 1, $$ and write $p_\Lambda := \mathrm{tr}(P_0\Lambda)/(d^2-1)$. Then: $$ \begin{aligned} \Pi_{U(d)}(\Lambda)(X) &= \mathrm{tr}(P_1\Lambda) P_1(X) + \frac{1}{d^2-1}\mathrm{tr}(P_0\Lambda) P_0(X) \\ &= \mathrm{tr}(X)\frac{I}{d} + p_\Lambda\left(X - \frac{1}{d}\mathrm{tr}(X) I\right)\\ &= \left(1 - p_\Lambda\right)\mathrm{tr}(X) \frac{I}{d} + p_\Lambda X. \end{aligned} $$ This is a depolarizing channel with parameter $p_\Lambda$ (or $1-p_\Lambda$, whichever you prefer).
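As a sanity check of the final formula, here is a minimal numerical sketch. It relies on two assumptions that are not part of the derivation above: it replaces the Haar integral by an average over the 24 single-qubit Clifford unitaries (which form a unitary 2-design, so the channel twirl is the same), and it uses an amplitude damping channel with an arbitrarily chosen `gamma = 0.3` as the example $\Lambda$. All function names are made up for illustration.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
S = np.array([[1, 0], [0, 1j]], dtype=complex)

def strip_phase(U):
    """Fix the global phase so that the first nonzero entry is real and positive."""
    flat = U.ravel()
    z = flat[np.argmax(np.abs(flat) > 1e-9)]
    return U * (abs(z) / z)

def clifford_group():
    """Close {H, S} under multiplication, identifying unitaries up to a phase."""
    group = [np.eye(2, dtype=complex)]
    frontier = list(group)
    while frontier:
        fresh = []
        for U in frontier:
            for G in (H, S):
                V = strip_phase(G @ U)
                if not any(np.allclose(V, W) for W in group):
                    group.append(V)
                    fresh.append(V)
        frontier = fresh
    return group

def amplitude_damping(X, gamma=0.3):
    """Example channel Lambda (an assumption of this sketch), in Kraus form."""
    A0 = np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex)
    A1 = np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)
    return A0 @ X @ A0.conj().T + A1 @ X @ A1.conj().T

def twirl(channel, group, X):
    """Average U Lambda(U^dag X U) U^dag over the group elements."""
    return sum(U @ channel(U.conj().T @ X @ U) @ U.conj().T for U in group) / len(group)

def superoperator_trace(channel, d):
    """tr(Lambda) = sum_ij <i| Lambda(|i><j|) |j>, using matrix units as basis."""
    total = 0
    for i in range(d):
        for j in range(d):
            E = np.zeros((d, d), dtype=complex)
            E[i, j] = 1
            total += channel(E)[i, j]
    return total

d = 2
group = clifford_group()
p = (superoperator_trace(amplitude_damping, d).real - 1) / (d**2 - 1)

# Compare the group average with the depolarizing formula on a random operator X.
rng = np.random.default_rng(1)
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
lhs = twirl(amplitude_damping, group, X)
rhs = p * X + (1 - p) * np.trace(X) * np.eye(d) / d
print(len(group), np.allclose(lhs, rhs))
```

The finite group is used only to keep the check exact and deterministic; sampling Haar-random unitaries would approximate the same average up to statistical error.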

PS: If we have multiplicities, then there is only a canonical decomposition into $\lambda$-isotypes. We have to make a choice of how to decompose these further into copies of $\rho_\lambda$ and define an orthogonal basis for the multiplicity space. The rest stays the same.

Adam Zalcman · Accepted Answer · 2022-01-29T16:24:17.723

Nielsen's paper cited in the question simplifies the arguments originally laid out in two papers by the Horodecki family. This answer sketches the original arguments and is meant to complement the nice explanation based on representation theory written by @Markus Heinrich by requiring less background knowledge and hopefully providing some additional insight into the relationship between depolarizing channels and twirling. It also demonstrates the use of state-channel duality.

High level summary

The argument uses state-channel duality to translate twirling of channels to $U\otimes U^*$ twirling of states. By unitary invariance of the Haar measure, twirling is idempotent, so the Choi matrix of a twirled channel is invariant under $U\otimes U^*$ twirling of states. However, it turns out that the only states invariant under $U\otimes U^*$ twirling of states are the so-called noisy singlets which under state-channel duality correspond to depolarizing channels.

Noisy singlet

Consider two systems with the Hilbert spaces of the same finite dimension $N$. Let $|\psi\rangle:=\frac{1}{\sqrt{N}}\sum_{i=1}^N|i\rangle|i\rangle$. It is easy to check that for any linear operator $A$

$$ (A\otimes I)|\psi\rangle = (I\otimes A^T)|\psi\rangle.\tag1 $$
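Identity $(1)$ is easy to check numerically. The following is a small sketch, assuming the usual Kronecker-product ordering in which the first tensor factor is the slow index; the dimension `N = 3` and the random matrix `A` are arbitrary choices.

```python
import numpy as np

def max_entangled(N):
    """|psi> = (1/sqrt(N)) sum_i |i>|i> as a vector of length N*N."""
    return np.eye(N).reshape(N * N) / np.sqrt(N)

rng = np.random.default_rng(0)
N = 3
A = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))  # arbitrary linear operator
psi = max_entangled(N)
I = np.eye(N)

lhs = np.kron(A, I) @ psi      # (A ⊗ I)|psi>
rhs = np.kron(I, A.T) @ psi    # (I ⊗ A^T)|psi>, transpose without conjugation
print(np.allclose(lhs, rhs))
```

Note that the transpose is taken in the same basis $\{|i\rangle\}$ that defines $|\psi\rangle$, and that it is a plain transpose, not the adjoint.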

Now, for $p\in\left[-\frac{1}{N^2-1},1\right]$ (the range in which the operator below is positive semidefinite), we define the noisy singlet $\rho_p$ to be the bipartite state

$$ \rho_p:=p|\psi\rangle\langle\psi|+(1-p)\frac{I\otimes I}{N^2}.\tag2 $$

Twirling

Twirling of states sends a bipartite state $\rho$ to

$$ \rho_t := \int dU (U\otimes U^*)\rho(U^\dagger\otimes U^T)\tag3 $$

where $U^*$ denotes the complex conjugate of $U$. Using $(1)$, we can show that the Choi matrix $J(\hat\Lambda_t)$ of a twirled channel $\hat\Lambda_t$ is the result of twirling of states applied to the Choi matrix $J(\hat\Lambda)$ of the original channel $\hat\Lambda$

$$ \begin{align} J(\hat\Lambda_t)&=\hat\Lambda_t\otimes\hat{I}(N|\psi\rangle\langle\psi|)\\ &=\left(\int dU\hat{U}\circ\hat{\Lambda}\circ\hat{U}^\dagger\right)\otimes\hat{I}(N|\psi\rangle\langle\psi|)\\ &=\left(\int dU(\hat{U}\otimes\hat{I})\circ(\hat{\Lambda}\otimes\hat{I})\circ(\hat{U}^\dagger\otimes\hat{I})\right)(N|\psi\rangle\langle\psi|)\\ &=\int dU(\hat{U}\otimes\hat{I})\circ(\hat{\Lambda}\otimes\hat{I})\left((U^\dagger\otimes I)N|\psi\rangle\langle\psi|(U\otimes I)\right)\\ &=\int dU(U\otimes I)\left[\hat{\Lambda}\otimes\hat{I}\left((U^\dagger\otimes I)N|\psi\rangle\langle\psi|(U\otimes I)\right)\right](U^\dagger\otimes I)\\ &=\int dU(U\otimes I)\left[\hat{\Lambda}\otimes\hat{I}\left((I\otimes U^*)N|\psi\rangle\langle\psi|(I\otimes U^T)\right)\right](U^\dagger\otimes I)\\ &=\int dU(U\otimes U^*)\left[\hat{\Lambda}\otimes\hat{I}\left(N|\psi\rangle\langle\psi|\right)\right](U^\dagger\otimes U^T)\\ &=\int dU(U\otimes U^*)J(\hat\Lambda)(U^\dagger\otimes U^T)\\ &=J(\hat\Lambda)_t. \end{align}\tag4 $$

Another fact we can easily prove using $(1)$ is that every noisy singlet $(2)$ is invariant under twirling of states $\rho_{p,t}=\rho_p$. In fact, it turns out that noisy singlets are the only states with this property. See section $V$ in this paper for a proof of this fact.
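The invariance of noisy singlets can also be seen numerically: since $(U\otimes U^*)|\psi\rangle=|\psi\rangle$ by $(1)$, each term of the integral $(3)$ already leaves $\rho_p$ unchanged. Below is a minimal sketch; the helper names, the values `N = 3`, `p = 0.4`, and the QR-based sampling of a random unitary are assumptions made for illustration.

```python
import numpy as np

def haar_unitary(N, rng):
    """Sample a unitary via QR of a complex Gaussian matrix, with phase correction."""
    Z = (rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))) / np.sqrt(2)
    Q, R = np.linalg.qr(Z)
    phases = np.diag(R) / np.abs(np.diag(R))
    return Q * phases  # rescales column j of Q by phases[j]

def noisy_singlet(N, p):
    """rho_p = p |psi><psi| + (1 - p) I/N^2 with |psi> maximally entangled."""
    psi = np.eye(N).reshape(N * N) / np.sqrt(N)
    return p * np.outer(psi, psi) + (1 - p) * np.eye(N * N) / N**2

rng = np.random.default_rng(0)
N, p = 3, 0.4
U = haar_unitary(N, rng)
W = np.kron(U, U.conj())          # U ⊗ U^*
rho = noisy_singlet(N, p)
rotated = W @ rho @ W.conj().T    # (U ⊗ U^*) rho (U^dag ⊗ U^T)
print(np.allclose(rotated, rho))
```

This only illustrates the easy direction (noisy singlets are invariant); the converse, that nothing else is invariant, is the part that needs the proof referenced above.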

Depolarizing channel

The depolarizing channel is a CPTP map defined by

$$ \hat\Delta_p(\rho) = p\rho + (1-p)\frac{I}{N}\mathrm{tr}\rho.\tag5 $$

A short calculation shows that the Choi matrix of $\hat\Delta_p$ is

$$ J(\hat\Delta_p)=(\hat\Delta_p\otimes\hat I)(N|\psi\rangle\langle\psi|)=N\rho_p\tag6 $$

where $\rho_p$ is a noisy singlet.
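The "short calculation" behind $(6)$ can be reproduced numerically. This is a sketch under the convention $J(\hat\Phi)=(\hat\Phi\otimes\hat I)(N|\psi\rangle\langle\psi|)=\sum_{ij}\hat\Phi(|i\rangle\langle j|)\otimes|i\rangle\langle j|$ used in $(4)$; the function names and the values `N = 3`, `p = 0.4` are illustrative choices.

```python
import numpy as np

def depolarizing(rho, p):
    """Delta_p(rho) = p rho + (1 - p) tr(rho) I/N, as in Eq. (5)."""
    N = rho.shape[0]
    return p * rho + (1 - p) * np.trace(rho) * np.eye(N) / N

def choi(channel, N):
    """J = sum_ij channel(|i><j|) ⊗ |i><j|, i.e. (channel ⊗ id)(N |psi><psi|)."""
    J = np.zeros((N * N, N * N), dtype=complex)
    for i in range(N):
        for j in range(N):
            E = np.zeros((N, N), dtype=complex)
            E[i, j] = 1.0
            J += np.kron(channel(E), E)
    return J

def noisy_singlet(N, p):
    """rho_p = p |psi><psi| + (1 - p) I/N^2, as in Eq. (2)."""
    psi = np.eye(N).reshape(N * N) / np.sqrt(N)
    return p * np.outer(psi, psi) + (1 - p) * np.eye(N * N) / N**2

N, p = 3, 0.4
J = choi(lambda X: depolarizing(X, p), N)
print(np.allclose(J, N * noisy_singlet(N, p)))
```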

Putting it all together

Finally, unitary invariance of the Haar measure implies that twirling a channel twice yields the same result as twirling it once

$$ (\hat\Lambda_t)_t=\hat\Lambda_t.\tag7 $$

Therefore, by $(4)$

$$ J(\hat\Lambda_t)=J((\hat\Lambda_t)_t)=J(\hat\Lambda_t)_t\tag8 $$

i.e. the Choi matrix of $\hat\Lambda_t$ is invariant under twirling of states. But noisy singlets are the only states with this property. Therefore, $J(\hat\Lambda_t)$ is (a scalar multiple of) a noisy singlet

$$ J(\hat\Lambda_t)=N\rho_p\tag9 $$

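for some $p\in\left[-\frac{1}{N^2-1},1\right]$ (the full range in which $\rho_p$ is a valid state; a negative $p$ occurs e.g. when $\hat\Lambda$ is a Pauli $X$ gate on a qubit). However, $N\rho_p=J(\hat\Delta_p)$, so by injectivity of $J$, we have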

$$ \hat\Lambda_t=\hat\Delta_p\tag{10} $$

which was to be proven.
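The argument above shows that some $p$ exists without computing it. One way to extract it, sketched here as an addition to the argument rather than part of it, uses the fact that $(U\otimes U^*)|\psi\rangle=|\psi\rangle$, so the overlap $F=\langle\psi|J(\hat\Lambda)|\psi\rangle/N$ is unchanged by twirling of states. Comparing with $\langle\psi|\rho_p|\psi\rangle=p+(1-p)/N^2$ gives $p=(N^2F-1)/(N^2-1)$. The bit-flip channel and all function names below are hypothetical examples.

```python
import numpy as np

def choi_from_kraus(kraus):
    """J = sum_k (A_k ⊗ I) N|psi><psi| (A_k ⊗ I)^dag for Kraus operators A_k."""
    N = kraus[0].shape[0]
    omega = np.eye(N).reshape(N * N)           # unnormalized sum_i |i>|i>
    J = np.zeros((N * N, N * N), dtype=complex)
    for A in kraus:
        v = np.kron(A, np.eye(N)) @ omega
        J += np.outer(v, v.conj())
    return J

def depolarizing_parameter(kraus):
    """p = (N^2 F - 1)/(N^2 - 1), with F = <psi| J/N |psi> preserved by the twirl."""
    N = kraus[0].shape[0]
    omega = np.eye(N).reshape(N * N)
    J = choi_from_kraus(kraus)
    F = (omega @ J @ omega).real / N**2
    return (N**2 * F - 1) / (N**2 - 1)

def bit_flip(q):
    """Example channel: apply Pauli X with probability q."""
    X = np.array([[0, 1], [1, 0]], dtype=complex)
    return [np.sqrt(1 - q) * np.eye(2, dtype=complex), np.sqrt(q) * X]

kraus = bit_flip(0.1)
p = depolarizing_parameter(kraus)
# Cross-check against the closed form (sum_k |tr A_k|^2 - 1)/(N^2 - 1).
p_trace = (sum(abs(np.trace(A))**2 for A in kraus) - 1) / (2**2 - 1)
print(np.isclose(p, p_trace))
```

The cross-check works because $\langle\psi|(A\otimes I)|\psi\rangle=\mathrm{tr}(A)/N$, so $N^2F=\sum_k|\mathrm{tr}A_k|^2$. For a pure $X$ gate (`q = 1` in this sketch) the formula gives $p=-1/3$, so the resulting $p$ need not be non-negative.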

narip · Answer 3 · 2022-09-11T05:11:29.727

I will give an extended explanation of Nielsen's proof, i.e. your first reference. The idea is that, since any state can be written as $\rho=\sum_ip_i|i\rangle\langle i|$, it suffices to prove that the twirled channel acts as a depolarizing channel on each $|i\rangle\langle i|$ with the same $p$; the general statement then follows by linearity.

I start after Eq. (10): $$V \mathcal{E}_T(\rho) V^{\dagger}=\mathcal{E}_T\left(V \rho V^{\dagger}\right)\tag{1}$$ For a given $|i\rangle \langle i|$, we can choose $V$ to be block diagonal with respect to $|i\rangle \langle i|$ and $I-|i\rangle \langle i|$. That is, in a basis whose first vector is $|i\rangle$, $V$ has the form $\left( \begin{matrix} a& 0\\ 0& B\\ \end{matrix} \right) $, where $a$ is a phase and $B$ is a unitary matrix. Such a $V$ satisfies $V|i\rangle \langle i|V^{\dagger}=|i\rangle \langle i|$, so Eq. (1) gives $V\mathcal{E} _T\left( |i\rangle \langle i| \right) V^{\dagger}=\mathcal{E} _T\left( |i\rangle \langle i| \right) $, and hence $\left[ V,\mathcal{E} _T\left( |i\rangle \langle i| \right) \right] =0$. I skip the proof that if $\left[ V,\mathcal{E} _T\left( |i\rangle \langle i| \right) \right] =0$ for all block-diagonal unitaries of the form mentioned above, then $\mathcal{E} _T(|i\rangle \langle i|)=\alpha |i\rangle \langle i|+\beta \left( I-|i\rangle \langle i| \right) $. Since $\mathcal{E} _T$ is trace preserving, we can rewrite this as $\mathcal{E} _T(|i\rangle \langle i|)=pI/d+(1-p)|i\rangle \langle i|$ for some $p$. It remains to show that $p$ is the same for different $|i\rangle \langle i|$. To see this, note that any two pure states $|\tilde{i}\rangle $ and $|i\rangle $ are connected by some unitary $U$ such that $|\tilde{i}\rangle \langle \tilde{i}|=U|i\rangle \langle i|U^{\dagger}$. Then, using Eq. (1) again, $$\begin{aligned}\mathcal{E} _T(|\tilde{i}\rangle \langle \tilde{i}|)&=\mathcal{E} _T(U|i\rangle \langle i|U^{\dagger})=U\mathcal{E} _T(|i\rangle \langle i|)U^{\dagger} \\ &=U\left( pI/d+(1-p)|i\rangle \langle i| \right) U^{\dagger} \\ &=pI/d+(1-p)|\tilde{i}\rangle \langle \tilde{i}|\end{aligned}$$ So for $|\tilde{i}\rangle $ we have the same $p$.

Remark: Notice that twirling does not have to be done w.r.t. the unitary group $U(d)$; any group $G$ has its corresponding twirling operation. However, the end of the reasoning above relies on being able to connect any two pure states $|i\rangle$ and $|\tilde i\rangle$ by a group element, which is why this argument uses the twirl w.r.t. $U(d)$.

score 1 · Answer 4 · answered Mar 27 '23 at 14:09

I'll present a slightly different wording for the general approach to the proof discussed in this answer, while being more explicit about how the general formalism is applied to prove that the twirling operator produces depolarizing channels. More specifically, consider the twirling operation $\mathcal T(\mathcal E)\equiv \mathcal E_T$ defined as $$\mathcal T(\mathcal E)\equiv \int dU\, \Phi_{U^\dagger}\circ\mathcal E\circ \Phi_{U},$$ with $\Phi_U(\rho)\equiv U\rho U^\dagger$ the unitary channel associated to $U$. Our goal is to prove that $\mathcal T(\mathcal E)$ is a depolarising channel for any quantum channel $\mathcal E$. More explicitly, this means that $\mathcal T(\mathcal E)(\rho)=p_{\cal E} \operatorname{tr}(\rho) \frac{I}{d} + (1-p_{\cal E}) \rho$, for some $p_{\cal E}$.

Start by observing that $\mathcal T$ is a linear (supersuper)operator acting on the space of quantum maps, and that $\mathcal T^2\equiv \mathcal T\circ\mathcal T=\mathcal T$, so $\mathcal T$ is a projection. It is not hard to see that $\mathcal T$ is also Hermitian, and thus an orthogonal projection. In summary, $\mathcal T$ projects onto the set of quantum maps that commute with the action of the unitary group, meaning $\Phi_V\circ \mathcal T(\mathcal E)=\mathcal T(\mathcal E)\circ\Phi_V$ for all maps $\mathcal E$ and unitaries $V$. Here $$\mathbf U(d)\ni V\mapsto \Phi_V\in\mathbf{GL}(\mathrm{Lin}(\mathbb{C}^d))$$ is a representation of the unitary group into the space of quantum maps.

These observations allow us to use Schur's lemma to characterise $\mathcal T(\mathcal E)$, as described in detail e.g. here. If the representation $U\mapsto \Phi_U$ were irreducible, then we could directly apply the lemma and conclude that $\mathcal T(\mathcal E)$ would have to be a multiple of the identity (here "identity" would mean the identity superoperator). However, this representation is not irreducible, as $\Phi_U(I)=I$ for all $U\in\mathbf U(d)$, which means that $\mathbb{C} I$ is a subspace left invariant by the representation. Similarly you can observe that for any traceless $X\in\operatorname{Lin}(\mathbb{C}^d)$, you have $\operatorname{tr}(\Phi_U(X))=0$, meaning the subspace of traceless operators is also an invariant subspace for the representation. You can then show that these two are both irreducible representations (and inequivalent, since their dimensions $1$ and $d^2-1$ differ for $d\ge 2$), and thus you get the decomposition $$\mathcal T(\mathcal E)= \alpha\, \Pi_I + \beta\, \Pi_0, \qquad \Pi_0 \equiv \operatorname{Id}-\Pi_I$$ where $\alpha,\beta\in\mathbb{R}$ are to be determined, $\operatorname{Id}$ is the identity superoperator, $\Pi_I:X\mapsto \operatorname{tr}(X)I/d$ is the map projecting onto the subspace generated by $I$, and $\Pi_0$ is the map projecting onto the subspace of traceless operators, which can be given the above explicit form.

To find the coefficients $\alpha,\beta$ we now compute the inner product between $\mathcal T(\mathcal E)$ and $\Pi_I,\Pi_0$. For the RHS, we use the relations: $$\langle \Pi_I, \Pi_I \rangle = 1, \qquad\langle \Pi_0,\Pi_0\rangle = \operatorname{tr}(\Pi_0)=d^2-1, \qquad \langle\Pi_I,\Pi_0\rangle=0.$$ For the LHS, we instead use the specific structure of the twirling to observe $$\alpha = \langle \Pi_I, \mathcal T(\mathcal E)\rangle = \frac1{d} \langle \Pi_I(I),\mathcal T(\mathcal E)(I)\rangle = \frac1d\operatorname{tr}[\mathcal T(\mathcal E)(I)] = \frac1d\operatorname{tr}(\mathcal E(I)), \\ (d^2-1)\beta=\langle \Pi_0,\mathcal T(\mathcal E)\rangle =\langle \operatorname{Id},\mathcal T(\mathcal E)\rangle - \langle\Pi_I,\mathcal T(\mathcal E)\rangle = \operatorname{tr}(\mathcal E) - \frac1d\operatorname{tr}(\mathcal E(I)).$$ Note that here we used the inner product between superoperators, defined as $$\langle\Phi,\Psi\rangle\equiv\sum_\sigma \langle\Phi(\sigma),\Psi(\sigma)\rangle\equiv \sum_\sigma \operatorname{tr}[\Phi(\sigma)^\dagger\Psi(\sigma)],$$ for any pair of quantum maps $\Phi,\Psi$ and any orthonormal basis of operators $\{\sigma\}\subset\operatorname{Lin}(\mathbb{C}^d)$. Similarly, $\operatorname{tr}(\mathcal E)$ is the "superoperator trace", equal to $\operatorname{tr}(\mathcal E)=\sum_\sigma \langle\sigma,\mathcal E(\sigma)\rangle$.
We thus conclude that $$\mathcal T(\mathcal E) = \frac{\operatorname{tr}(\mathcal E(I))}{d} \Pi_I + \frac{\operatorname{tr}(\mathcal E)-\operatorname{tr}(\mathcal E(I))/d}{d^2-1} \Pi_0.$$ In the special case where $\mathcal E$ is a quantum channel, it is trace-preserving, so we furthermore have $\operatorname{tr}(\mathcal E(I))=\operatorname{tr}(I)=d$, and thus $$\mathcal T(\mathcal E) = \Pi_I + \frac{\operatorname{tr}(\mathcal E)-1}{d^2-1} \Pi_0.$$ Finally we can rewrite this slightly to get the depolarising channel: observe that $\Pi_0=\operatorname{Id} - \Pi_I$, and thus $$\mathcal E_T\equiv \mathcal T(\mathcal E) = \left(\frac{d^2-\operatorname{tr}(\mathcal E)}{d^2-1}\right)\Pi_I + \left(\frac{\operatorname{tr}(\mathcal E)-1}{d^2-1}\right)\operatorname{Id}, \\ \mathcal E_T(\rho) = \left(\frac{d^2-\operatorname{tr}(\mathcal E)}{d^2-1}\right)\operatorname{tr}(\rho) \frac{I}{d} + \left(\frac{\operatorname{tr}(\mathcal E)-1}{d^2-1}\right)\rho.$$
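To make the bookkeeping concrete, here is a minimal sketch that represents superoperators as $d^2\times d^2$ matrices acting on row-major vectorized operators, so that $X\mapsto AXB$ becomes $A\otimes B^T$ and the inner product above becomes $\operatorname{tr}(S_\Phi^\dagger S_\Psi)$. The vectorization convention, the amplitude damping example with `gamma = 0.3`, and all names are assumptions of this sketch. It evaluates $\alpha$ and $\beta$ from the formulas rather than performing the Haar integral, and then checks that the resulting map commutes with a random $\Phi_V$.

```python
import numpy as np

def liouville(kraus):
    """Matrix of X -> sum_k A_k X A_k^dag acting on row-major vec(X)."""
    return sum(np.kron(A, A.conj()) for A in kraus)

def hs_inner(S1, S2):
    """Inner product of superoperators, <Phi, Psi> = tr(S_Phi^dag S_Psi)."""
    return np.trace(S1.conj().T @ S2)

d = 2
vec_I = np.eye(d).reshape(d * d)
Pi_I = np.outer(vec_I, vec_I) / d      # X -> tr(X) I / d
Id = np.eye(d * d)                     # identity superoperator
Pi_0 = Id - Pi_I                       # projector onto traceless operators

# Example channel (amplitude damping): trace-preserving but not unital.
gamma = 0.3
kraus = [np.array([[1, 0], [0, np.sqrt(1 - gamma)]], dtype=complex),
         np.array([[0, np.sqrt(gamma)], [0, 0]], dtype=complex)]
S_E = liouville(kraus)

# Coefficients from the inner products; <Pi, T(E)> = <Pi, E> for Pi in the commutant.
alpha = hs_inner(Pi_I, S_E).real
beta = hs_inner(Pi_0, S_E).real / (d**2 - 1)
T_E = alpha * Pi_I + beta * Pi_0

# A random unitary V and the matrix of Phi_V, to check the commutation property.
rng = np.random.default_rng(0)
Z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
V, _ = np.linalg.qr(Z)
S_V = np.kron(V, V.conj())

print(np.isclose(alpha, 1.0), np.allclose(S_V @ T_E, T_E @ S_V))
```

The example channel is deliberately non-unital: $\mathcal E(I)\neq I$ here, yet $\alpha=\operatorname{tr}(\mathcal E(I))/d=1$, which is all the final step needs.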