TL;DR: There may be other reasons, too, but without the square $\kappa$ can fail to be positive semidefinite (PSD).
Summary
As in classical machine learning, the kernel function $\kappa$ is supposed to be an inner product in the latent space. Therefore, the matrix $\kappa(x_i,x_j)$ must be PSD. If $\kappa$ is defined with the square, then this matrix is the Hadamard product of the Gram matrix $\langle\phi(x_i)|\phi(x_j)\rangle$ with its complex conjugate, both of which are PSD. Since the Hadamard product sends a pair of PSD matrices to a PSD matrix, $\kappa$ with the square is PSD. On the other hand, the elementwise absolute value of a PSD matrix may fail to be PSD, so $\kappa$ without the square may fail to be PSD.
Intuitively, the "right" way of obtaining a real inner product from a complex inner product is not taking the absolute value, but multiplying it by its complex conjugate. This is related to the fact that the L2 norm is induced by an inner product while the L1 norm is not: the L1 norm violates the parallelogram law, e.g. for $x=(1,0)$ and $y=(0,1)$ we get $\|x+y\|_1^2+\|x-y\|_1^2=8\neq 4=2(\|x\|_1^2+\|y\|_1^2)$.
Background
The key assumption behind kernel methods is that $\kappa:\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ is an inner product $\langle\cdot,\cdot\rangle_\mathcal{L}$ in the latent space $\mathcal{L}$
$$
\kappa(x, x')=\langle f(x),f(x')\rangle_\mathcal{L}\tag1
$$
where $f:\mathcal{X}\to\mathcal{L}$ is the feature map from the input space $\mathcal{X}$ to the latent space $\mathcal{L}$. However, for $\kappa$ to admit this expression it must induce a PSD matrix for any finite set of $n$ points of $\mathcal{X}$, i.e. $\sum_{i,j=1}^nc_ic_j\kappa(x_i,x_j)\geq 0$ for any $x_1,\dots,x_n\in\mathcal{X}$ and any $c_1,\dots,c_n\in\mathbb{R}$.
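To make this condition concrete, here is a minimal numerical sketch (assuming numpy; the helpers `is_psd` and `kernel_matrix` and the Gaussian kernel `rbf` are illustrative names of mine, not from the question):
```python
# Minimal numerical sketch of the PSD condition (assuming numpy;
# is_psd, kernel_matrix, and rbf are illustrative names).
import numpy as np

def is_psd(K, tol=1e-10):
    """True iff the symmetric matrix K has no eigenvalue below -tol."""
    return bool(np.all(np.linalg.eigvalsh(K) >= -tol))

def kernel_matrix(kappa, xs):
    """Build the n x n matrix with entries kappa(xs[i], xs[j])."""
    return np.array([[kappa(x, y) for y in xs] for x in xs])

rbf = lambda x, y: np.exp(-(x - y) ** 2)      # a well-known PSD kernel
xs = np.random.default_rng(0).normal(size=8)  # any finite sample
print(is_psd(kernel_matrix(rbf, xs)))         # True
```
For a symmetric matrix, checking the eigenvalues is equivalent to checking the quadratic form $\sum_{i,j}c_ic_j\kappa(x_i,x_j)\geq 0$ over all real $c$.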
Good and bad $\kappa$
Now, let's define two functions:
$$
\begin{align}
\kappa_{\text{good}}(x,x')&:=|\langle\phi(x)|\phi(x')\rangle|^2,\tag2\\
\kappa_{\text{bad}}(x,x')&:=|\langle\phi(x)|\phi(x')\rangle|.\tag{2'}
\end{align}
$$
Claim 1. $\kappa_{\text{good}}(x,x')$ is PSD.
Proof
Let $\lambda(x,x'):=\langle\phi(x)|\phi(x')\rangle$. For any finite set of points, the matrix with entries $\lambda(x_i,x_j)$ is a Gram matrix, so it is PSD; its complex conjugate is PSD, too. But then $$\kappa_{\text{good}}(x_i,x_j)=\overline{\lambda(x_i,x_j)}\,\lambda(x_i,x_j)\tag3$$ is the Hadamard (elementwise) product of two PSD matrices. By the Schur product theorem, $\kappa_{\text{good}}(x_i,x_j)$ is PSD. $\square$
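The argument is easy to probe numerically; a sketch assuming numpy, with random complex feature vectors standing in for $|\phi(x_i)\rangle$:
```python
# Numerical illustration of Claim 1 (assuming numpy): the Hadamard
# product of a Gram matrix with its conjugate stays PSD.
import numpy as np

rng = np.random.default_rng(1)
Phi = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))
lam = Phi.conj() @ Phi.T               # lambda(x_i, x_j) = <phi_i|phi_j>, PSD
kappa_good = np.abs(lam) ** 2          # = conj(lam) * lam, elementwise
print(np.linalg.eigvalsh(kappa_good))  # all eigenvalues >= 0 (up to rounding)
```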
We can avoid the use of the Schur product theorem by constructing the inner product space $\mathcal{L}$ and the feature map $f$ such that $\kappa_{\text{good}}(x,x')=\langle f(x),f(x')\rangle_\mathcal{L}$ directly. This can be done by setting $\mathcal{L}=\mathcal{H}\otimes\overline{\mathcal{H}}$, where $\overline{\mathcal{H}}$ is the same as $\mathcal{H}$ except that its inner product is the complex conjugate of the inner product in $\mathcal{H}$, and by defining $f:\mathcal{X}\to\mathcal{L}$ via $f(x)=|\phi(x)\rangle\langle\phi(x)|$, as sketched below.
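As a sketch of this construction (assuming numpy; flattening the outer product is one concrete way to realize $\mathcal{H}\otimes\overline{\mathcal{H}}$), the ordinary Hermitian inner product of the vectorized projectors reproduces $\kappa_{\text{good}}$:
```python
# Sketch of the explicit feature map f(x) = |phi(x)><phi(x)|, vectorized
# (assuming numpy; random states stand in for phi(x), phi(x')).
import numpy as np

rng = np.random.default_rng(2)
phi_x = rng.normal(size=4) + 1j * rng.normal(size=4)
phi_y = rng.normal(size=4) + 1j * rng.normal(size=4)

f = lambda phi: np.outer(phi, phi.conj()).ravel()  # vec(|phi><phi|)
lhs = np.vdot(f(phi_x), f(phi_y))                  # <f(x), f(x')>_L
rhs = np.abs(np.vdot(phi_x, phi_y)) ** 2           # kappa_good(x, x')
print(np.isclose(lhs, rhs))                        # True, and lhs is real
```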
Claim 2. $\kappa_{\text{bad}}(x,x')$ is not necessarily PSD.
Proof. Let $\mathcal{X}=\{0,1,2,3\}$ and define
$$
\begin{align}
|\phi(0)\rangle=\sqrt{\frac23}|0\rangle+\sqrt{\frac16}|1\rangle-\sqrt{\frac16}|3\rangle\tag4\\
|\phi(1)\rangle=\sqrt{\frac16}|0\rangle+\sqrt{\frac23}|1\rangle+\sqrt{\frac16}|2\rangle\tag5\\
|\phi(2)\rangle=\sqrt{\frac16}|1\rangle+\sqrt{\frac23}|2\rangle+\sqrt{\frac16}|3\rangle\tag6\\
|\phi(3)\rangle=-\sqrt{\frac16}|0\rangle+\sqrt{\frac16}|2\rangle+\sqrt{\frac23}|3\rangle.\tag7
\end{align}
$$
Then
$$
\kappa_{\text{bad}}(x_i,x_j)=\begin{bmatrix}
1&\frac23&0&\frac23\\
\frac23&1&\frac23&0\\
0&\frac23&1&\frac23\\
\frac23&0&\frac23&1
\end{bmatrix}\tag8
$$
has $-\frac13$ among its eigenvalues, so it is not PSD.
$\square$
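The counterexample can be verified numerically; a sketch assuming numpy, with the states $(4)$–$(7)$ as rows:
```python
# Numerical check of Claim 2 (assuming numpy): the matrix (8) has a
# negative eigenvalue, so kappa_bad is not PSD.
import numpy as np

a, b = np.sqrt(2 / 3), np.sqrt(1 / 6)
Phi = np.array([[ a, b, 0, -b],    # |phi(0)>, eq. (4)
                [ b, a, b,  0],    # |phi(1)>, eq. (5)
                [ 0, b, a,  b],    # |phi(2)>, eq. (6)
                [-b, 0, b,  a]])   # |phi(3)>, eq. (7)
kappa_bad = np.abs(Phi @ Phi.T)    # |<phi(x_i)|phi(x_j)>|, matrix (8)
print(np.round(np.linalg.eigvalsh(kappa_bad), 6))
# [-0.333333  1.  1.  2.333333]
```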
Credit: the quantum states $(4)$–$(7)$ are derived from the second matrix in this answer.