
In the famous paper "Statistical Distance and the Geometry of Quantum States" by Braunstein and Caves, the authors discuss the problem of estimating an unknown parameter $X$. One considers an estimator $X_\textrm{est}$ of $X$, which is a function of the measurement results. The authors then argue that the appropriate measure of the deviation of the estimator from the parameter is $$\delta X = \frac{X_\textrm{est}}{|d\langle X_{\textrm{est}} \rangle_X/dX|}-X.$$

My question is why we do not simply use $\delta X = X_\textrm{est} - X$. After all, $X_\textrm{est}$ is supposed to be close to $X$ (if it is a good estimator), so its difference from the true value should be a natural way of quantifying the deviation.

Instead, the paper says that the derivative $|d\langle X_{\textrm{est}} \rangle_X/dX|$ removes the local difference in the "units" of the estimator and the parameter. What is meant by the "local difference in the units"? Since $X_\textrm{est}$ is supposed to estimate $X$, mustn't $X_\textrm{est}$ and $X$ have the same units?

Frederik vom Ende
Laplacian

1 Answer


Let us first figure out the meaning of the formula through some examples. The expectation value of the estimator may depend on $X$ in any way, with unbiased estimators, $\langle X_\mathrm{est}\rangle=X$, as a special case. The derivative is then equal to 1, so unbiased estimators have $\delta X=X_\mathrm{est}-X$, as you desire, and $\langle \delta X\rangle=0$.

Next, take an estimator with some fixed offset $\langle X_\mathrm{est}\rangle=X+b$. The offset does nothing to the derivative, leaving us again with $\delta X=X_\mathrm{est}-X$ but this time with $\langle \delta X\rangle=b$. Looks reasonable.

Finally, take an estimator with some other linear dependence $\langle X_\mathrm{est}\rangle=mX+b$, with $m>0$ for simplicity. It appears strange at first that we now must define $\delta X=X_\mathrm{est}/m-X$. But if you look at the expectation value, you find that the $m$ multiplying $X$ cancels, leaving us with $\langle \delta X\rangle=b/m$.
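Spelling out that last step may help. Assuming $m>0$, so that the absolute value in the definition can be dropped, the expectation value works out as $$\langle \delta X\rangle=\frac{\langle X_\mathrm{est}\rangle}{m}-X=\frac{mX+b}{m}-X=\frac{b}{m}.$$ For $m<0$ the absolute value gives $\langle X_\mathrm{est}\rangle/|m|=-X-b/m$ instead, so the cancellation of the $X$ term relies on the estimator increasing with the parameter.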

What does this mean? Take $b=0$, for example. Braunstein and Caves have a definition where $\langle \delta X\rangle=0$, while the average of $X_\mathrm{est}-X$ would be $(m-1)X$: they are saying that the statistical deviation should be zero if $\langle X_\mathrm{est}\rangle\propto X$, regardless of the proportionality constant.

Then, they consider that proportionality constant to be a sort of "unit" that can be corrected. Oops, you used kilograms instead of grams, but we won't penalize you because your model is good otherwise. So then when they report the final bias, they must do it in the appropriate units, and that's why $b$ gets divided by $m$ to go from the estimator's units to the parameter's units.
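The kilograms-versus-grams picture can be checked numerically. The following is only a minimal sketch under assumptions that are not in the paper: the estimator is modelled as Gaussian noise around $mX+b$, and the function name `simulate` and its parameters are hypothetical, chosen for illustration.

```python
import random


def simulate(m, b, x_true, n_samples=100_000, noise=0.5, seed=0):
    """Return (mean of X_est - X, mean of X_est/|m| - X).

    Assumption for illustration: X_est is Gaussian-distributed
    around m * x_true + b with standard deviation `noise`.
    """
    rng = random.Random(seed)
    samples = [m * x_true + b + rng.gauss(0.0, noise) for _ in range(n_samples)]

    # Naive deviation: plain difference between estimate and parameter.
    naive = sum(x - x_true for x in samples) / n_samples

    # Rescaled deviation: divide by |d<X_est>/dX| = |m| before subtracting X.
    rescaled = sum(x / abs(m) - x_true for x in samples) / n_samples
    return naive, rescaled


# "Grams instead of kilograms": pure rescaling, no offset.
naive, rescaled = simulate(m=1000.0, b=0.0, x_true=2.0)

# Rescaling plus an offset: the rescaled mean should sit near b/m.
naive_off, rescaled_off = simulate(m=2.0, b=1.0, x_true=3.0)

print(naive, rescaled)
print(naive_off, rescaled_off)
```

In the first case the naive mean deviation comes out close to $(m-1)X$ while the rescaled one is close to zero; in the second case the rescaled mean deviation comes out close to $b/m$, the offset expressed in the parameter's units.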


One often hears reports of results in terms of a signal being "5 sigma from the mean" or "5 standard deviations from the mean." This is useful because one is interested in more than just the magnitude of the difference between two things: if I measure a result that is 100 kilograms away from the mean, it makes a huge difference whether the standard deviation was 1 kg (so this is very significant) or 1000 kg (so this is a negligible difference from the mean). Here, instead of standard deviations, the rate of change $|d \langle X_\mathrm{est}\rangle/dX|$ is used, which is a reasonable measure of "how quickly does my quantity change as I move away from the mean" when we are not dealing with Gaussian variables. But if that were the whole story, I would have expected the definition $$\delta X=\frac{X_\mathrm{est}-X}{|d \langle X_\mathrm{est}\rangle/dX|}$$ instead, so that can't be the answer. It really must be that Braunstein and Caves consider $X_\mathrm{est}$ and $X$ to possibly have different units when $\langle X_\mathrm{est}\rangle=mX+b$, and they want their definition to correct for those units.
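To see the difference between the two candidate definitions concretely, take $\langle X_\mathrm{est}\rangle=mX$ with $m>0$. Dividing the whole difference by the derivative gives $$\left\langle\frac{X_\mathrm{est}-X}{m}\right\rangle=\frac{(m-1)X}{m},$$ which vanishes only for $m=1$, whereas the definition used in the paper gives $$\left\langle\frac{X_\mathrm{est}}{m}-X\right\rangle=0$$ for every $m>0$. Only the latter treats a pure rescaling of the estimator as no deviation at all.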

Quantum Mechanic