Making everything dimensionless so that $\hbar=1$:
$$
H = \frac12(p-A)^2\\
A = \frac\omega2(-y,x)
$$
with $\omega$ the cyclotron frequency, the usual theory of Landau level gives the spectrum of the Hamiltonian for $n\in\mathbb N$:
$$
E_n = \omega(n+1/2)
$$
with degeneracy (up to a numerical prefactor):
$$
D_n \propto \omega \mathcal A
$$
with $\mathcal A$ the area of the domain. Therefore, the partition function is calculated just like for the harmonic oscillator:
$$
Z \propto \mathcal A\frac{\omega/2}{\sinh(\beta\omega/2)}
$$
In your case, $\mathcal A$ is infinite, so you expect to have an infinite prefactor. This is just like a free particle which you precisely recover in the $\omega\to0$ limit. In fact, since you know the partition function exactly in this limit, you can deduce the numerical prefactor (the $2\pi$ comes from the definition of the quantum flux defined in terms of $h=2\pi$ in my dimensionless units):
$$
Z = 2\pi\mathcal A\frac{\omega/2}{\sinh(\beta\omega/2)}
$$
If you don't know the formula for the degeneracy, you can rederive it in many ways by regularising the sum. Your approach is not physically transparent. One way of seeing why it works is to realise that $(3),(4)$ both give the Landau levels. However, from (3), you have an infinite "constant" degeneracy (morally the same countable amount given by varying $m$, which captures the extensive degeneracy), while from (4), you only have a linear degeneracy:
$$
E_n = \omega(n+1/2) \\
D_n = n+1
$$
The latter will therefore be completely swamped by the former one, which is why it appears that you are neglecting the second term in (5). The tricky part is precisely to anticipate that $M$, your surface degeneracy, is proportional to $\omega$. Without appealing to the classical limit, you can justify this by looking at the wave function. You wan identify a characteristic length for the wavefunction at large distance:
$$
r_* \propto \sqrt{m+2n}a
$$
(competition between the polynomial and the exponential). Therefore, if you want to localise your wavefunctions on disk of area $\mathcal A$, it amounts to bounding:
$$
m+2n\leq \omega \mathcal A
$$
and then sending $\mathcal A$ to infinity. This regularises your sum and dividing by $\mathcal A$, you get a finite well defined limit.
For completeness, I'll also add other standard regularisation methods.
One approach is to take a large finite domain. If you want to keep using the symmetric gauge, you can take for example a disk with hard boundaries (indeed, there is no issue of flux quantisation in this case). As you've already commented, this is difficult to implement. You would need to investigate the zeros of hypergeometric functions. Plus you can anticipate that the spectrum would be complicated as the hard boundary would generate the discrete version of chiral edge states.
Another approach compatible with the rotation invariance of the symmetric gauge is to add a harmonic trap:
$$
H = \frac12(p-A)^2+\frac{\omega_0^2}2(x^2+y^2)
$$
This has the advantage of not altering your wavefunctions, and it is essentially the same situation up by a reparametrization. Explicitly, your previous spectrum was:
$$
E_{mn} = \omega\left(n+\frac{m+|m|}2+\frac12\right)
$$
it is now changed to:
$$
E_{mn} = \sqrt{\omega_0^2+\left(\frac\omega2\right)^2}\left(2n+|m|+1\right)-m\frac\omega2
$$
You can check that $\omega_0=0$ gives your spectrum, and that the effect of $\omega_0$ is to lift the degeneracy of (3). As a sanity check, you can also verify that you recover the 2D harmony oscillator when $\omega=0$.
Let:
$$
\Omega = \sqrt{\omega_0^2+\left(\frac\omega2\right)^2}
$$
It is not hard to compute the new partition function which is now finite, it's just geometric series:
$$
Z = \frac1{2\sinh(\beta\Omega)}\left(1+\frac1{e^{\beta(\Omega-\omega/2)}-1}+\frac1{e^{\beta(\Omega+\omega/2)}-1}\right)
$$
Sending $\omega_0\to0$, you get:
$$
Z = \frac1{2\sinh(\beta\Omega)}\frac1{\beta(\Omega-\omega/2)} = \frac{\omega/2}{\sinh(\beta\omega/2)}\frac1{\beta\omega_0^2}
$$
While formally, it may seem that just sending $\omega_0\to0$ should do the trick, you'll need to also add a temperature dependence. Indeed, you want to reproduce hard boundaries, whose position is temperature independent. By introducing "soft" boundaries, the effective size of the trap acquires a temperature dependence, which you need to compensate by scaling $\omega_0$ appropriately. Either by thinking classically by assuming that the effective domain is a disk whose radius is the root mean square distance, by dimensional analysis, or just to cancel the extra $\beta$ in the partition function, you obtain the scaling:
$$
\omega_0 \propto \frac{1}{\sqrt{\beta\mathcal A}}
$$
Now sending $\mathcal A\to\infty$ does send $\omega_0\to$, but not uniformly in temperature. You will thus obtain the correct final partition function.
You can already see how the trick works without magnetism $\omega=0$. In this case:
$$
Z = \frac1{(2\sinh(\beta\omega_0/2))^2} \to Z_0 \\
Z_0\propto \frac{\mathcal A}\beta
$$
since I know the exact partition function $Z_0$, I can even figure out the correct numerical prefactor:
$$
\omega_0 = \frac1{\sqrt{2\pi\beta} \mathcal A}
$$
You'll notice that in practice, this method is essentially the same as yours. Instead of using a hard cutoff for the first infinite sum of (5), I instead use an exponential cutoff which is physically more justifiable. In your case, the cutoff parameter $M$ acquires a magnetic dependence, whereas mine picks up a thermal dependence, both of which can be justified physically. The mathematical correspondence for the two is:
$$
M = \frac\omega{\beta\omega_0^2}
$$
so that you have both scalings:
$$
\begin{align}
M &\propto \omega\mathcal A & \omega_0 &\propto \frac{1}{\sqrt{\beta\mathcal A}}
\end{align}
$$
In a similar spirit, a simpler way would be would be to add a harmonic trap not for the position but for the cyclotron orbit centers. They are the independent (unlike position which does not commute with the original Hamiltonian) degrees of freedom responsible for the large degeneracy in the first place. This gives:
$$
\begin{align}
v &= (v_x,v_y) & v_x &= p_x-A_x & v_y &= p_y-A_y \\
R &= (X,Y) & X &= x-\frac{v_y}\omega & Y &= y+\frac{v_x}\omega \\
\end{align}\\
H \to H+\frac{\omega_0^2}2(X^2+Y^2)
$$
This one also can be solvable with your wavefunctions, but has the advantage of being solvable using ladder operators for $v$ and $R$. You'll need the same scaling in temperature for $\omega_0$ to recover the correct limit.
Hope this helps.