The article looks indeed wrong. In fact, there are two mistakes.
First, you're right that the acceptance-rejection method has to be applied to $\rho(r)$, and not to $m(r)$. To understand how this idea works, suppose we want to generate a one-dimensional normalized distribution function $p(y)$. Now, let's assume we can rewrite this distribution function in terms of a variable $x$, such that it takes the form of a uniform distribution. That is,
$$
p(x) =
\begin{cases}
1& \text{for $0\leqslant x \leqslant 1,$}\\
0& \text{elsewhere}.
\end{cases}
$$
Given $p(y)$, what is $x$? We have the Jacobian transformation
$$
p(y)dy = p(x(y))\left|\frac{dx}{dy}\right|dy = \left|\frac{dx}{dy}\right|dy,
$$
which implies
$$
p(y) = \frac{dx}{dy},
$$
assuming that $x(y)$ is an increasing function. Thus
$$
x = \int_0^y p(y')dy' = F(y).
$$
In other words, the integral of $p(y)$ (or equivalently, the area under the curve) follows a uniform distribution. With this in mind, there are essentially two ways to perform a Monte-Carlo simulation.
The first way is the acceptance-rejection method: plot the curve $p(y)$ and uniformly generate a pair of numbers $(a,b)$ in the interval $([0,y_\max],[0,p_\max])$, where $y_\max$ and $p_\max$ are the upper bounds of $y$ and $p(y)$. If the coordinate $(a,b)$ lies under the curve $p(y)$, accept it; otherwise, reject it. If the coordinate is accepted, $y=a$ is generated point.

There are major drawbacks to this method: $y_\max$ and $p_\max$ can be infinite, so one would need a cut-off. And if $p(y)$ has a sharp peak, one ends up rejecting a lot of points.
A far more efficient method is to uniformly generate $x$, and calculate the corresponding $y$ by inverting $x=F(y)$:
$$
y = F^{-1}(x).
$$
This automatically fills up the area under the curve, without rejecting points.

If the calculation of $F^{-1}(x)$ is too numerically involved, one can use a combination of both methods: introduce another (simpler) function $f(y)$ that lies everywhere above $p(y)$. Apply the inversion method to $f(y)$, generating a point $y$. Then uniformly generate a value $b$ in the interval $[0,f(y)]$. If $b\leqslant p(y)$, accept $y$; otherwise, reject it.
Now, consider the Hernquist distribution. Since it has a cusp at the origin, and the cumulative mass $m(r)$ is a simple function
$$
m(r) = M\frac{r^2}{(a+r)^2},
$$
I'd definitely recommend the inversion method. But there is an important caveat here, and that's the second mistake in the article: $\rho(r)$ is not really a one-dimensional distribution. Instead it is a distribution in 3-dimensional space, and it is only a function of one variable due to spherical symmetry. In order to apply the Monte-Carlo method, we have to express $\rho$ as a truly one-dimensional distribution function, which we can do by expressing it in terms of the volume
$$
y = \frac{4\pi}{3}r^3.
$$
Now we have
$$
p(y) = \rho(y) = \frac{M}{2\pi}\frac{a\,(3y/4\pi)^{-1/3}}{\left[a + (3y/4\pi)^{1/3}\right]^3},\\
F(y) = m(y) = \int_0^y\rho(y')dy' = M\frac{(3y/4\pi)^{2/3}}{\left[a + (3y/4\pi)^{1/3}\right]^2}.
$$
Once we generated a point $y$, the corresponding radius is
$$r=\left(\frac{3y}{4\pi}\right)^{1/3}.$$
There is an important consequence: there are likely more particles at large radii than around the centre, even though $\rho(r)$ is much larger at small radii. The reason is that particles between two radii $r$ and $(r+\Delta r)$ occupy a shell with volume
$$V = \frac{4\pi}{3}\left[(r+\Delta r)^3-r^3\right].$$
The larger the radius $r$, the larger the volume of the shell, which means you need more particles to fill it and get $\rho(r)$. This is obvious in the case of a constant density, but it is also true for general densities.