Quantum Probability, what makes quantum characteristic functions quantum?

Question

I'm trying to understand how $[Q,P] \neq 0$ leads to the conclusion that no joint probability distribution can be established for $Q$ and $P$.

Classically if we had two random variables $Q$ and $P$ we could write

\begin{align} \phi_{Q,P}(t_Q,t_P) &= E\left[e^{i(t_Q Q + t_P P)}\right]\\ &= \int e^{i(t_Q q + t_P p)} f_{Q,P}(q,p)\,dq\,dp\\ &= \mathcal{FT}\left[f_{Q,P}(q,p)\right](t_Q, t_P) \end{align}

Here $\phi_{Q,P}$ is the characteristic function and $f_{Q, P}$ is the probability density function for $Q$ and $P$. In particular we have that

\begin{align} f_{Q,P}(q,p) = \mathcal{FT}^{-1}\left[\phi_{Q,P}(t_Q,t_P)\right](q, p) \end{align}

If expectation values like $E\left[ Q^n P^m \right]$ are known for all non-negative integers $n, m$ then $\phi_{Q,P}(t_Q, t_P)$ can in principle be calculated and then Fourier transformed to find $f_{Q,P}(q, p)$. That is, knowledge of expectation values is enough to determine a probability function.
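
One way to make the "in principle" step explicit is the moment expansion of the characteristic function. This is only a sketch, and it assumes that all moments exist and that the series converges:

\begin{align} \phi_{Q,P}(t_Q, t_P) = \sum_{n,m \ge 0} \frac{(i t_Q)^n (i t_P)^m}{n!\, m!} E\left[Q^n P^m\right] \end{align}

The expansion relies on $e^{i(t_Q Q + t_P P)} = e^{i t_Q Q} e^{i t_P P}$, which holds for classical (commuting) random variables.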

Quantum mechanically it is known that this procedure breaks down. There is no way to come up with a probability distribution function for non-commuting observables. My question is where my argument above breaks down in the non-commuting case. Quantum mechanically (at least theoretically) we have access to expectation values of the form $E\left[Q^n P^m\right]$ (and versions of the same with different operator orderings). This means that we can calculate some sort of quantum characteristic function $\phi_{Q, P}(t_Q, t_P)$ for $Q$ and $P$. In principle we should then be able to Fourier transform this characteristic function to get something like a probability distribution for $Q$ and $P$. For some reason, we only get a quasiprobability distribution and not a normal probability distribution. Why not?

I don't know the full answer to this but I have a couple of leads that I will mention.

First, as I mentioned above, $E[QP] \neq E[PQ]$ and $E[Q^nP^m] \neq E[Q^{n-1}P^{m}Q]$ and the like. This means that there is not a unique definition for the characteristic function. Given that there is not a unique characteristic function, it makes sense that there is not a unique probability distribution. The different choices of characteristic function can be related to different quasiprobability distributions such as the Wigner, P, or Q distributions. My question is why NONE of these characteristic functions could ever lead to a valid probability distribution function.

Second, not just any function $\phi$ can be transformed to give a probability distribution function. Probability distribution functions are normalized and non-negative everywhere. It is possible to take a Fourier transform and get something which is not normalized or which is negative somewhere. I believe this is related to Bochner's theorem, but I'm having trouble parsing the theorem because of all of the measure theory. I would really appreciate an answer that explains how we can look at a classical characteristic function and see the properties that guarantee it will Fourier transform to a valid probability distribution function, and then how we can clearly see that the characteristic functions of non-commuting operators do not satisfy these properties, so that we know they won't give valid probability distribution functions.

lcv · Accepted Answer · 2020-06-05T00:42:27.050

I think the simplest way to understand why we cannot have a joint probability distribution for two incompatible observables $A,B$ (meaning non-commuting) is the following. With a slight abuse of notation, the joint probability is defined as:

$$ P_{A,B}(a,b) := \mathrm{Prob}(A=a,B=b) $$

which means, it's the probability of $A$ having value $a$ and $B$ having value $b$. In quantum mechanics this means that there is an eigenstate of $A$ with eigenvalue $a$ which is at the same time an eigenstate of $B$ with eigenvalue $b$. But if $[A,B]\neq 0$ this is in general not possible, because non-commuting observables do not share a complete set of common eigenstates.

Note that, conversely, if $A$ and $B$ commute it is always possible to find a common eigenbasis and so the above prescription works fine.

This argument, however, does not answer the other part of the question, which is:

Why can I not define a bona fide characteristic function (i.e. which is the Fourier transform of a probability density) in case of non-compatible observables? And perhaps, is there any way to amend this?

As correctly pointed out by the OP, this has to do with Bochner's theorem, which states precisely what requirements a characteristic function has to satisfy. I will state Bochner's theorem in the form needed for our purposes.

Theorem (Bochner) (Univariate case) A continuous function $\chi (t)$ with $\chi(0)=1$ is the Fourier transform of a probability density $P(\omega)$ (more generally, of a probability measure; $t,\omega \in\mathbb{R}$) if and only if for any $n$-tuple $t_1,t_2,\ldots, t_n$ ($t_k \in \mathbb{R}, \ k=1,2,\ldots,n$) the $n\times n$ matrix with entries $\chi_{i,j}:= \chi(t_i-t_j)$ is non-negative definite (and hermitian).

Note: for the Multivariate $d$-dimensional generalization simply consider the obvious rephrasing with $t,\omega, t_k \in \mathbb{R}^d$.

A simple way to understand Bochner's theorem is the following. A matrix $\chi$ is non-negative definite if and only if it can be written as $\chi = A A^\dagger$.

Let $P(\omega)$ be the Fourier transform of $\chi$. Then

$$ \chi_{i,j} = \int d\omega e^{i(t_i-t_j) \omega} P(\omega) \ \ \ \ \ (0) $$

which we write as

\begin{align} \chi_{i,j} &= \int d\omega e^{it_i \omega} P(\omega) e^{-it_j \omega} \\ & = (A A^\dagger)_{i,j} \end{align}

with

$$ A_{i,\omega} := e^{it_i \omega} \sqrt{P(\omega)} $$

which we can do since $P(\omega)$ is non-negative. So $P(\omega)$ non-negative means that $\chi_{i,j}$ is a non-negative definite matrix. This characterizes characteristic functions in the classical case.
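
As a numerical illustration of the classical statement, the following sketch builds the matrix $\chi(t_i-t_j)$ for a Gaussian characteristic function at a few random points and checks that its smallest eigenvalue is non-negative up to rounding. The Gaussian example and the helper names `gaussian_cf` and `bochner_matrix` are assumptions made here for illustration; they are not part of the argument above.

```python
import numpy as np

def gaussian_cf(t, mu=0.5, sigma=1.0):
    """Characteristic function E[exp(i t X)] of a normal N(mu, sigma^2) variable."""
    return np.exp(1j * mu * t - 0.5 * (sigma * t) ** 2)

def bochner_matrix(cf, ts):
    """Matrix with entries chi(t_i - t_j) for the sample points ts."""
    ts = np.asarray(ts, dtype=float)
    return cf(ts[:, None] - ts[None, :])

rng = np.random.default_rng(0)
ts = rng.uniform(-3.0, 3.0, size=6)   # arbitrary test points t_1, ..., t_6
chi = bochner_matrix(gaussian_cf, ts)

# eigvalsh treats chi as Hermitian and returns real eigenvalues in ascending order
eigenvalues = np.linalg.eigvalsh(chi)
print("smallest eigenvalue:", eigenvalues.min())
```

Any other valid classical characteristic function can be passed to `bochner_matrix` in the same way.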

Let us now turn to quantum mechanics and consider the multivariate case, i.e., we have several observables which I call $X_1, X_2, \ldots, X_d$, with measured values $\omega_1, \ldots, \omega_d$. The conjugate variables are $t=(t_1,\ldots,t_d)$, and we use the notation

$$tX:=\sum_{k=1}^d t_k X_k$$

Below, $t_i$ and $t_j$ with indices $i,j$ denote different points in $\mathbb{R}^d$, as in Bochner's theorem.

In this case the wannabe characteristic function is

$$ \chi(t):=\mathsf{E}[ e^{itX} ]= \operatorname{Tr} ( e^{itX} \rho ) $$

for some quantum state $\rho$ (a normalized non-negative matrix). We want to check under which conditions $\chi_{i,j}=\chi(t_i-t_j)$ is non-negative definite as a matrix.

If the $X_k$ were mutually commuting operators we would have

$$ e^{i(t_i-t_j) X} = e^{it_i X} e^{-it_j X} \ \ \ \ \ \ (1) $$

and then we could write

\begin{align} \chi_{i,j} &=\operatorname{Tr} \left ( e^{i(t_i-t_j) X} \rho \right) \\ &=\operatorname{Tr} \left ( e^{it_i X} e^{-it_j X} \rho \right ) \\ &=\operatorname{Tr} \left (e^{-it_j X} \sqrt{\rho} \sqrt{\rho} e^{it_i X} \right ) \end{align}

Now define the matrix $A_{j,\xi} := \left ( e^{-it_j X} \sqrt{\rho} \right )_{l,q}$, where $\xi=(l,q)$ is a composite index. Since $\left ( \sqrt{\rho}\, e^{it_i X} \right )_{q,l} = \overline{A_{i,lq}}$, we have

$$ \chi_{i,j} = \sum_{lq} A_{j,lq} \overline{A_{i,lq}} $$

which is of the form $BB^\dagger$ (with $B=\overline{A}$) and proves that $\chi_{i,j}$ is non-negative definite. This whole construction breaks down if the $X_k$ are not mutually commuting, because Eq. (1) no longer holds.
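
The commuting case can be checked numerically. The sketch below assumes a two-qubit example with $X_1=\sigma^z\otimes 1$ and $X_2 = 1\otimes\sigma^z$ and a randomly generated state; this choice, and the helper names `exp_i_hermitian`, `random_state` and `chi`, are illustrative assumptions rather than part of the argument above.

```python
import numpy as np

sz = np.diag([1.0, -1.0])
I2 = np.eye(2)
X1 = np.kron(sz, I2)   # sigma^z on the first qubit
X2 = np.kron(I2, sz)   # sigma^z on the second qubit
assert np.allclose(X1 @ X2, X2 @ X1)   # the two observables commute

def exp_i_hermitian(H):
    """exp(iH) for a Hermitian matrix H, via its eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(1j * w)) @ V.conj().T

def random_state(dim, rng):
    """A random density matrix: non-negative, Hermitian, unit trace."""
    G = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def chi(t, rho):
    """chi(t) = Tr(exp(i (t_1 X1 + t_2 X2)) rho)."""
    return np.trace(exp_i_hermitian(t[0] * X1 + t[1] * X2) @ rho)

rng = np.random.default_rng(1)
rho = random_state(4, rng)
points = rng.uniform(-2.0, 2.0, size=(5, 2))   # five points t_i in R^2
M = np.array([[chi(ti - tj, rho) for tj in points] for ti in points])
min_eig = np.linalg.eigvalsh(M).min()
print("smallest eigenvalue:", min_eig)   # non-negative up to rounding
```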

Strictly speaking (as pointed out correctly by @ACuriousMind) this does not prove that the matrix $\chi_{i,j}$ is not non-negative definite for non-mutually commuting observables. For that one should find a counterexample, i.e. show that $\chi$ has a negative eigenvalue. However, it does show where the argument breaks down.

Added edit

Here I present a counterexample for a single qubit. It can be shown that for $n=2$ the $\chi_{i,j}$ matrix is always non-negative definite. So to look for the first counterexample we must take $n=3$.
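
The $n=2$ statement can be seen directly. As a short sketch, write $t = t_1 - t_2$ and use $\chi(-t)=\overline{\chi(t)}$, which follows from $e^{-itX} = \left(e^{itX}\right)^\dagger$:

\begin{align} \chi = \begin{pmatrix} 1 & \chi(t) \\ \overline{\chi(t)} & 1 \end{pmatrix}, \qquad \text{eigenvalues } 1 \pm |\chi(t)| . \end{align}

Since $e^{itX}$ is unitary and $\rho$ is a state, $|\chi(t)| = |\operatorname{Tr}(e^{itX}\rho)| \le 1$, so both eigenvalues are non-negative.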

Consider the following problem with incompatible (non-commuting) observables given by $\sigma^x$ and $\sigma^z$. As for the state we pick $\rho = | 0\rangle\langle 0|$. Hence the putative characteristic function is

\begin{align} \chi(t_x,t_z) &:= \langle 0| e^{i (t_x \sigma^x +t_z \sigma^z)}| 0 \rangle \\ & = \cos \left (\sqrt{t_x^2 +t_z^2}\right ) + i\frac{t_z}{\sqrt{t_x^2 +t_z^2}} \sin \left ( \sqrt{t_x^2 +t_z^2}\right ) . \end{align}
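
The closed form can be compared against a direct numerical evaluation of the matrix exponential. This is a minimal sketch; the function names `chi_numeric` and `chi_closed` are hypothetical, and the exponential is computed from the eigendecomposition of the Hermitian exponent.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def chi_numeric(tx, tz):
    """<0| exp(i (tx sx + tz sz)) |0> computed from the eigendecomposition."""
    w, V = np.linalg.eigh(tx * sx + tz * sz)
    U = (V * np.exp(1j * w)) @ V.conj().T
    return U[0, 0]

def chi_closed(tx, tz):
    """Closed form cos(r) + i (tz / r) sin(r), with r = sqrt(tx^2 + tz^2)."""
    r = np.hypot(tx, tz)
    if r == 0.0:
        return 1.0 + 0.0j
    return np.cos(r) + 1j * (tz / r) * np.sin(r)

for tx, tz in [(1.0, 0.5), (-0.7, 2.0), (0.0, 1.3), (2.1, 0.0)]:
    print(tx, tz, np.isclose(chi_numeric(tx, tz), chi_closed(tx, tz)))
```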

Now form the matrix $\chi_{i,j}$ for $n=3$. It can be shown that $\chi$ has the form

$$ \chi = 1\!\mathrm{l} + \Gamma $$

where the matrix $\Gamma$ is hermitian and has zeros on the diagonal (because $\chi(0)=1$). The eigenvalues of $\chi$ are $1+\gamma$, with $\gamma$ an eigenvalue of $\Gamma$, so $\chi$ fails to be non-negative definite if $\Gamma$ has an eigenvalue smaller than $-1$.

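For simplicity let's call $a_{ij} = t_x^i-t_x^j$ and $b_{ij} = t_z^i-t_z^j$, so that $\chi_{i,j} = \chi(a_{ij}, b_{ij})$. These differences are not independent: they must satisfy $a_{13} = a_{12}+a_{23}$ and $b_{13} = b_{12}+b_{23}$. A simple consistent choice is $t^1=(0,0)$, $t^2=(\pi/2,0)$, $t^3=(0,\pi/2)$, written as $(t_x,t_z)$, which gives

\begin{align} a_{12} & = -\pi/2 \ \ b_{12} = 0 \\ a_{13} & = 0 \ \ b_{13} = -\pi/2 \\ a_{23} & = \pi/2 \ \ b_{23} = -\pi/2 \end{align}

Then $\Gamma_{12} = \chi(-\pi/2,0) = 0$, $\Gamma_{13} = \chi(0,-\pi/2) = -i$ and $\Gamma_{23} = \chi(\pi/2,-\pi/2) = \cos(\pi/\sqrt{2}) - \frac{i}{\sqrt{2}}\sin(\pi/\sqrt{2}) \approx -0.606 - 0.563\,i$. The eigenvalues of $\Gamma$ are $0$ and $\pm\sqrt{1+|\Gamma_{23}|^2}$, i.e. approximately $\{1.297, \ 0, \ -1.297\}$, which implies $\chi$ is not non-negative definite. This shows that the Fourier transform of $\chi(t_x,t_z)$ is not a (joint) probability distribution. $\square$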
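
A negative eigenvalue can also be exhibited numerically from explicit points. The sketch below uses the three points $t^1=(0,0)$, $t^2=(\pi/2,0)$, $t^3=(0,\pi/2)$ in the $(t_x,t_z)$ plane, so that all differences $t^i-t^j$ are automatically consistent, together with the closed form of $\chi(t_x,t_z)$ for $\rho=|0\rangle\langle 0|$. The variable names are illustrative.

```python
import numpy as np

def chi(t):
    """Closed-form chi(t_x, t_z) for the state |0><0| with sigma^x and sigma^z."""
    tx, tz = t
    r = np.hypot(tx, tz)
    if r == 0.0:
        return 1.0 + 0.0j
    return np.cos(r) + 1j * (tz / r) * np.sin(r)

# Three points t^1, t^2, t^3 in the (t_x, t_z) plane
points = np.array([[0.0, 0.0], [np.pi / 2, 0.0], [0.0, np.pi / 2]])

# Bochner matrix chi_{i,j} = chi(t^i - t^j)
chi_matrix = np.array([[chi(ti - tj) for tj in points] for ti in points])

eigs = np.linalg.eigvalsh(chi_matrix)
print("eigenvalues of chi:", eigs)   # the smallest one is about -0.297
```

A negative eigenvalue for a single admissible set of points is enough to violate the condition in Bochner's theorem.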

score 2 · Answer 2 · answered Feb 02 '20 at 02:35

This is an issue of physical interpretation, not mathematics. For example, the Husimi $Q$-function is a probability distribution in the formal sense that it is a non-singular, normalizable, non-negative function.

We don't call it a probability distribution, however, because quantum measurement doesn't work like it does classically. The quantity $Q(q, p)$ does not physically represent "the probability of the particle having position $q$ and momentum $p$", fundamentally because $[q, p] = i$, so you can't measure both simultaneously.

score 0 · Answer 3 · answered Feb 02 '20 at 02:16

The expectation value $E[\mathrm{e}^{\mathrm{i}(t_QQ+ t_PP)}]$ simply does not exist in quantum mechanics - the operator inside the brackets is not a self-adjoint operator, hence not an observable, hence it does not possess an "expectation value".

ACuriousMind