Trouble with position operator in quantum mechanics

Question

I'm having some trouble understanding the derivation of the action of the $X$ operator. The result seems to be an artifact of the notation used rather than a property of the operator itself.

The usual argument is to consider eigenfunctions of the $X$-operator: $X|x\rangle = x|x\rangle$ where $X$ is an operator, $|x\rangle$ is an eigenket of $X$ and $x$ is the corresponding eigenvalue. Then \begin{eqnarray*} \color{red}{\langle x'|}X|x\rangle &=& \color{red}{\langle x'|}x|x\rangle \\ \\ \langle x'|X|x\rangle &=& x\langle x'|x\rangle \\ \\ \langle x'|X|x\rangle &=& x\,\delta(x'-x) \end{eqnarray*} where $\delta$ is Dirac's $\delta$-"function". Then we ask what $X$ does to arbitrary kets like $|f\rangle$: \begin{eqnarray*} (Xf)(x) &=& \langle x | X | f\rangle \\ \\ &=&\int_{-\infty}^{+\infty} \langle x|X|x'\rangle\langle x'|f\rangle~\mathrm dx' \\ \\ &=& \int_{-\infty}^{+\infty}f(x')\color{red}{\langle x|X|x'\rangle}~\mathrm dx' \\ \\ &=& \int_{-\infty}^{+\infty} f(x')\,\color{red}{x'\,\delta(x-x')}~\mathrm dx' \end{eqnarray*} The defining property of the $\delta$-"function" is that $\int_{\mathbb R} f(y)\,\delta(x-y)~\mathrm dy=f(x)$, and so $$(Xf)(x) = x\,f(x)$$

However, if I do this with other symbols, then I can't get the same result. Let's say $X|x\rangle = \lambda |x\rangle$. Then following the same steps gives $\langle x'|X|x\rangle = \lambda \langle x'|x\rangle=\lambda \,\delta(x'-x)$, whence $$(Xf)(x)=\lambda f(x)$$ This is to be expected: $Xf$ is just an eigenvalue multiple of $f$.

It seems that the property that $X : f \mapsto xf$ comes from the fact that we used $x$ to denote the eigenvalue of $X$. What am I missing here?

Perhaps it is because $x$ is a real number and the set of all kets $|x\rangle$ can be identified with the real line via $|x\rangle \mapsto x$, so that $x$ must be the eigenvalue belonging to $|x\rangle$ under this construction?

Emilio Pisanty · Accepted Answer · 2017-06-27T23:19:13.070

By saying $X|x\rangle = \lambda |x\rangle$ and then integrating over $x$ without allowing for the fact that $\lambda$ depends on $x$, you're essentially saying that the action of $X$ on all $|x\rangle$ states is the same, so that $X$ is a constant: $$ X = X\int|x\rangle\langle x|\mathrm dx = \int X|x\rangle\langle x|\mathrm dx = \int \lambda|x\rangle\langle x|\mathrm dx = \lambda\int|x\rangle\langle x|\mathrm dx =\lambda\mathbb I. $$ Allow for $\lambda$ to depend on $x$ (i.e. allow for $X$ to have different eigenvalues for different eigenvectors), say, via notation like $\lambda_x$ or similar, and you'll recover the initial behaviour in the case $\lambda_x = x$.
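As a sketch of that last step, here is the question's calculation redone with an $x$-dependent eigenvalue. The symbol $\lambda_x$ is just the illustrative notation suggested above, and $f(x')=\langle x'|f\rangle$ as in the question:

$$
\begin{aligned}
(Xf)(x) &= \int_{-\infty}^{+\infty} \langle x|X|x'\rangle\langle x'|f\rangle~\mathrm dx' \\
&= \int_{-\infty}^{+\infty} \lambda_{x'}\,\delta(x-x')\,f(x')~\mathrm dx' \\
&= \lambda_x\,f(x).
\end{aligned}
$$

Setting $\lambda_x = x$ gives back $(Xf)(x) = xf(x)$, whereas a constant $\lambda_x=\lambda$ gives $(Xf)(x)=\lambda f(x)$, i.e. the multiple of the identity found above.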

That said, from what you've said in comments, the confusion is a bit more fundamental. To be clear: the property $$ (Xf)(x) = xf(x) $$ is the definition of the $X$ operator, at least when working on the explicit instantiation of the abstract Hilbert space in the position-based $L_2(\mathbb R)$ space; as a definition, it hardly needs any justification.

Shankar's treatment arises because he starts using an explicit coordinate-based construction of finite-dimensional Hilbert spaces and then decides to keep on using the same notation without showing how the Hilbert-space basis is built, or the theorems that guarantee its existence; this is fine for a first pass at the subject but there are better ways to do it once you're accustomed to the basics. (And that means that you shouldn't take that section as representative of the rigorous ways to construct these mathematical structures.)

Frankly, I find that part of Shankar's presentation to be rather confused (not necessarily wrong, as such, but I would encourage you to read other textbooks to get a better feeling for those topics). In particular, Shankar introduces the $X$ operator as the operator "responsible for the $|x\rangle$ basis", but this is not really necessary. You already have that basis; you don't really need an operator to generate it. (It exists, of course, but you don't need it in the way Shankar implies.)

Either way, at some point you need to commit to one or another definition of $X$: the simplest way is to define it via $(Xf)(x) = xf(x)$, but you can also define it, as Shankar does, as the unique operator whose action on the basis states is $$ X|x\rangle=x|x\rangle, $$ i.e. to multiply each state by its label (and one can then show, as Shankar does, that the two definitions are equivalent). Now why would you define something like that? Because you can, and because it works. Is this purely a trick of the notation? No, there's clear physical content to the assertion that $X|x\rangle=x|x\rangle$, and if you want to just "change the labels" then you need to do so consistently: you'd need to say something like $X|\lambda\rangle=\lambda|\lambda\rangle$, while making clear that $|\lambda\rangle$ is the same position basis state as $|x\rangle$ but with altered notation.
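To illustrate what a consistent relabelling looks like, here is a sketch that takes $\lambda$ to be nothing more than a new letter for the position label, so that $\langle\lambda|\lambda'\rangle=\delta(\lambda-\lambda')$ and $f(\lambda)=\langle\lambda|f\rangle$ (these are assumptions of the sketch). The derivation in the question then goes through unchanged:

$$
\begin{aligned}
(Xf)(\lambda) &= \langle\lambda|X|f\rangle \\
&= \int_{-\infty}^{+\infty} \langle\lambda|X|\lambda'\rangle\langle\lambda'|f\rangle~\mathrm d\lambda' \\
&= \int_{-\infty}^{+\infty} \lambda'\,\delta(\lambda-\lambda')\,f(\lambda')~\mathrm d\lambda' \\
&= \lambda\,f(\lambda).
\end{aligned}
$$

The result is again "multiply the wavefunction by its argument"; only the name of the argument has changed.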

And if this was not enough to dig you out, then I would seriously recommend taking up a few more books for alternative perspectives on the topic: there are better ways to do it (or at least, ways that are better in this respect) but there's no space in a Q&A format to develop them fully. Read around. Seriously.

score 3 · Answer 2 · edited Nov 02 '21 at 23:11

I think your difficulty lies in figuring out what aspects of the QM formalism are determined by nature and what are human-constructed. In this regard, it doesn't help that there are two equivalent but incompatible ways to make that judgement.

One way is to begin with state vectors. In this view, $|x\rangle$ is defined to be the state corresponding to position $x$, so that it is fixed by nature. Once you have done this, $X$ is constructed. This is the point of view adopted by @Emilio Pisanty in his answer.

The alternative, more advanced point of view is that $X$ is an aspect of nature, and $|x\rangle$ is human-constructed. You seem to intuitively prefer this description. From this perspective, we define $|x\rangle$ to mean "the eigenvector of $X$ with eigenvalue $x$." So, yes, by definition, $\lambda=x$. (This, of course, assumes that $X$ does not have multiple eigenvectors with eigenvalue $x$. But that assumption seems to hold.)

score 1 · Answer 3 · answered Jun 27 '17 at 18:21

Here is a possible explanation.

The equation $X|x\rangle = x|x\rangle$ means that if we measure the position of the state $|x\rangle$, our measuring device will read the numerical value $x$. Now if you write $X|x\rangle = \lambda(x)|x\rangle$, the equation is valid as long as $\lambda(x)$ uniquely characterizes each state. But the state $|x\rangle$ in the second equation is not the same as the one in the first, because on measuring this $|x\rangle$ you will get the numerical value $\lambda(x)$, not $x$. To avoid confusion, it is useful to write the second equation in a different notation: $$X|x\rangle_\lambda = \lambda(x)|x\rangle_\lambda$$ Note that, in general, $|x\rangle \neq |x\rangle_\lambda$. So there is no reason that $\langle x|X|f\rangle$ should be equal to $_\lambda\langle x|X|f\rangle$.
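As a concrete illustration, suppose $\lambda(x)=2x$; this choice is purely hypothetical and only meant as a sketch. Then $|x\rangle_\lambda$ is the position eigenstate located at $2x$, i.e. $|x\rangle_\lambda = c\,|2x\rangle$ for some normalization constant $c$, and

$$
{}_\lambda\langle x|X|f\rangle = c^*\,\langle 2x|X|f\rangle = c^*\,2x\,f(2x),
\qquad\text{whereas}\qquad
\langle x|X|f\rangle = x\,f(x).
$$

The two expressions differ because the label $x$ refers to different physical states in the two conventions.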
