TL;DR: There may be other reasons, too, but without the square $\kappa$ can fail to be positive semidefinite (PSD).
Summary
As in classical machine learning, the kernel function $\kappa$ is supposed to be an inner product in the latent space. Therefore, the matrix $\kappa(x_i,x_j)$ must be PSD. If $\kappa$ is defined with the square, then this matrix is the Hadamard product of the Gram matrix $\langle\phi(x_i)|\phi(x_j)\rangle$ with its complex conjugate, both of which are PSD. Since the Hadamard product sends a pair of PSD matrices to a PSD matrix, $\kappa$ with the square is PSD. On the other hand, the elementwise absolute value of a PSD matrix may fail to be PSD, so $\kappa$ without the square may fail to be PSD.
Intuitively, the "right" way of obtaining a real inner product from a complex inner product is not taking the absolute value, but multiplying it by its complex conjugate. This is related to the fact that the L2 norm is induced by an inner product while the L1 norm is not: the L1 norm violates the parallelogram law, e.g. for $x=(1,0)$ and $y=(0,1)$ we get $\|x+y\|_1^2+\|x-y\|_1^2=8\neq 4=2(\|x\|_1^2+\|y\|_1^2)$.
Background
The key assumption behind kernel methods is that $\kappa:\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ is an inner product $\langle\cdot,\cdot\rangle_\mathcal{L}$ in the latent space $\mathcal{L}$
$$
\kappa(x, x')=\langle f(x),f(x')\rangle_\mathcal{L}\tag1
$$
where $f:\mathcal{X}\to\mathcal{L}$ is the feature map from the input space $\mathcal{X}$ to the latent space $\mathcal{L}$. However, for $\kappa$ to admit this expression it must induce a PSD matrix for any finite set of $n$ points of $\mathcal{X}$, i.e. $\sum_{i,j=1}^nc_ic_j\kappa(x_i,x_j)\geq 0$ for any $x_1,\dots,x_n\in\mathcal{X}$ and any $c_1,\dots,c_n\in\mathbb{R}$.
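To make this condition concrete, here is a minimal numerical sketch (assuming numpy; the helpers `is_psd` and `kernel_matrix` and the Gaussian kernel `rbf` are illustrative names of mine, not from the question):
```python
# Minimal numerical sketch of the PSD condition (assuming numpy;
# is_psd, kernel_matrix, and rbf are illustrative names).
import numpy as np

def is_psd(K, tol=1e-10):
    """True iff the symmetric matrix K has no eigenvalue below -tol."""
    return bool(np.all(np.linalg.eigvalsh(K) >= -tol))

def kernel_matrix(kappa, xs):
    """Build the n x n matrix with entries kappa(xs[i], xs[j])."""
    return np.array([[kappa(x, y) for y in xs] for x in xs])

rbf = lambda x, y: np.exp(-(x - y) ** 2)      # a well-known PSD kernel
xs = np.random.default_rng(0).normal(size=8)  # any finite sample
print(is_psd(kernel_matrix(rbf, xs)))         # True
```
For a symmetric matrix, checking the eigenvalues is equivalent to checking the quadratic form $\sum_{i,j}c_ic_j\kappa(x_i,x_j)\geq 0$ over all real $c$.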
Good and bad $\kappa$
Now, let's define two functions:
$$
\begin{align}
\kappa_{\text{good}}(x,x')&:=|\langle\phi(x)|\phi(x')\rangle|^2,\tag2\\
\kappa_{\text{bad}}(x,x')&:=|\langle\phi(x)|\phi(x')\rangle|.\tag{2'}
\end{align}
$$
Claim 1. $\kappa_{\text{good}}(x,x')$ is PSD.
Proof
Let $\lambda(x,x'):=\langle\phi(x)|\phi(x')\rangle$. For any finite set of points, the matrix with entries $\lambda(x_i,x_j)$ is a Gram matrix, so it is PSD; its complex conjugate is PSD, too. But then $$\kappa_{\text{good}}(x_i,x_j)=\overline{\lambda(x_i,x_j)}\,\lambda(x_i,x_j)\tag3$$ is the Hadamard (elementwise) product of two PSD matrices. By the Schur product theorem, $\kappa_{\text{good}}(x_i,x_j)$ is PSD. $\square$
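The argument is easy to probe numerically; a sketch assuming numpy, with random complex feature vectors standing in for $|\phi(x_i)\rangle$:
```python
# Numerical illustration of Claim 1 (assuming numpy): the Hadamard
# product of a Gram matrix with its conjugate stays PSD.
import numpy as np

rng = np.random.default_rng(1)
Phi = rng.normal(size=(5, 3)) + 1j * rng.normal(size=(5, 3))
lam = Phi.conj() @ Phi.T               # lambda(x_i, x_j) = <phi_i|phi_j>, PSD
kappa_good = np.abs(lam) ** 2          # = conj(lam) * lam, elementwise
print(np.linalg.eigvalsh(kappa_good))  # all eigenvalues >= 0 (up to rounding)
```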
We can avoid the use of the Schur product theorem by constructing the inner product space $\mathcal{L}$ and the feature map $f$ such that $\kappa_{\text{good}}(x,x')=\langle f(x),f(x')\rangle_\mathcal{L}$ directly. This can be done by setting $\mathcal{L}=\mathcal{H}\otimes\overline{\mathcal{H}}$, where $\overline{\mathcal{H}}$ is the same as $\mathcal{H}$ except that its inner product is the complex conjugate of the inner product in $\mathcal{H}$, and by defining $f:\mathcal{X}\to\mathcal{L}$ via $f(x)=|\phi(x)\rangle\langle\phi(x)|$, as sketched below.
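As a sketch of this construction (assuming numpy; flattening the outer product is one concrete way to realize $\mathcal{H}\otimes\overline{\mathcal{H}}$), the ordinary Hermitian inner product of the vectorized projectors reproduces $\kappa_{\text{good}}$:
```python
# Sketch of the explicit feature map f(x) = |phi(x)><phi(x)|, vectorized
# (assuming numpy; random states stand in for phi(x), phi(x')).
import numpy as np

rng = np.random.default_rng(2)
phi_x = rng.normal(size=4) + 1j * rng.normal(size=4)
phi_y = rng.normal(size=4) + 1j * rng.normal(size=4)

f = lambda phi: np.outer(phi, phi.conj()).ravel()  # vec(|phi><phi|)
lhs = np.vdot(f(phi_x), f(phi_y))                  # <f(x), f(x')>_L
rhs = np.abs(np.vdot(phi_x, phi_y)) ** 2           # kappa_good(x, x')
print(np.isclose(lhs, rhs))                        # True, and lhs is real
```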
Claim 2. $\kappa_{\text{bad}}(x,x')$ is not necessarily PSD.
Proof. Let $\mathcal{X}=\{0,1,2,3\}$ and define
$$
\begin{align}
|\phi(0)\rangle=\sqrt{\frac23}|0\rangle+\sqrt{\frac16}|1\rangle-\sqrt{\frac16}|3\rangle\tag4\\
|\phi(1)\rangle=\sqrt{\frac16}|0\rangle+\sqrt{\frac23}|1\rangle+\sqrt{\frac16}|2\rangle\tag5\\
|\phi(2)\rangle=\sqrt{\frac16}|1\rangle+\sqrt{\frac23}|2\rangle+\sqrt{\frac16}|3\rangle\tag6\\
|\phi(3)\rangle=-\sqrt{\frac16}|0\rangle+\sqrt{\frac16}|2\rangle+\sqrt{\frac23}|3\rangle.\tag7
\end{align}
$$
Then
$$
\kappa_{\text{bad}}(x_i,x_j)=\begin{bmatrix}
1&\frac23&0&\frac23\\
\frac23&1&\frac23&0\\
0&\frac23&1&\frac23\\
\frac23&0&\frac23&1
\end{bmatrix}\tag8
$$
has $-\frac13$ among its eigenvalues, so it is not PSD.
$\square$
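The counterexample can be verified numerically; a sketch assuming numpy, with the states $(4)$–$(7)$ as rows:
```python
# Numerical check of Claim 2 (assuming numpy): the matrix (8) has a
# negative eigenvalue, so kappa_bad is not PSD.
import numpy as np

a, b = np.sqrt(2 / 3), np.sqrt(1 / 6)
Phi = np.array([[ a, b, 0, -b],    # |phi(0)>, eq. (4)
                [ b, a, b,  0],    # |phi(1)>, eq. (5)
                [ 0, b, a,  b],    # |phi(2)>, eq. (6)
                [-b, 0, b,  a]])   # |phi(3)>, eq. (7)
kappa_bad = np.abs(Phi @ Phi.T)    # |<phi(x_i)|phi(x_j)>|, matrix (8)
print(np.round(np.linalg.eigvalsh(kappa_bad), 6))
# [-0.333333  1.  1.  2.333333]
```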
Credit: the quantum states $(4)$–$(7)$ are derived from the second matrix in this answer.