You can compute this via purely combinatorial arguments.
The trick is to observe that $aa^\dagger=a^\dagger a +1$ means that you can think of the process of going from $a^{m}a^{\dagger n}$ to $a^{\dagger n}a^m$ as one composed of a number of steps, in each one of which a choice is made as to whether have $a$ "act" on $a^\dagger$, thus having them both disappear, or to have them switch places. You thus build a "tree of possibilities" whose leaves are the normally ordered terms obtained in the process.
The nontrivial part is counting the number of resulting factors of each type.
The possible factors in the final expression are $a^{\dagger (n-k)}a^{(m-k)}$ for $k=0,...,\min(n,m)$. This is easy to see: every time you apply the commutation relation you either maintain all operators or annihilate one pair of them.
The final expression must therefore have the form
$$a^m a^{\dagger n}= \sum_{k=0}^{\min(n,m)}c_k a^{\dagger (n-k)}a^{(m-k)}$$
for some integer coefficients $c_k$. These coefficients are derived by counting the number of ways in which a given term can be obtained.
To do this, we simply need to find the number of ways in which we can pair subsets of size $k$ of $a^\dagger$ and $a$.
For example, $c_0=1$ follows easily from the fact that there is only one way to not pair anything with anything else.
At the same time, $c_1=nm$, because each of the $m$ annihilation operators can interact once with each of the $n$ creation ones.
More generally, we have
$$c_k = k! \binom{n}{k}\binom{m}{k}.$$
To see it, notice that the number of ways to single out a subset of $k$ operators $a$ and $k$ operators $a^\dagger$ (ignoring ordering) is $\binom{n}{k}\binom{m}{k}$. Then, the number of ways in which these $k$ operators can all annihilate each others is $k!$: the first $a$ can interact with any of the $k$ terms on the right, then the second $a$ can interact with any of the remaining $k-1$ terms, etc.
To obtain the final form of the commutator we just remove the $k=0$ term from the RHS:
$$[a^m,a^{\dagger n}]=\sum_{k=1}^{\min(n,m)}k!\binom{n}{k}\binom{m}{k} a^{\dagger(n-k)}a^{(m-k)}.$$
To better visualise what's going with these calculations, I wrote a couple of functions to take an expression, systematically apply the commutation rules, and graph the resulting "tree of possibilities". Here is for example the result for $a^2 a^{\dagger 2}$ (I use $b\equiv a^\dagger$ in the graph for better readability):

Here is the Mathematica code generating this figure (it requires having MaTeX installed):
Get["MaTeX`"]
distribute[args_] := args //.
{HoldPattern[nc[l___, Plus[m__], r___]] :>
Total[(nc[l, #1, r] & ) /@ {m}],
nc[l___, (c_)*nc[m__], r___] :> c*nc[l, m, r],
nc[l___, nc[m__], r___] :> nc[l, m, r], nc[-(a_), b_] :>
-nc[a, b], nc[a_, -(b_)] :> -nc[a, b], nc[nc[l__], r_] :>
nc[l, r], nc[l_, nc[r__]] :> nc[l, r], nc[a_] :> a, nc[] -> 1};
singleStepExpand[expr_] := (Map[distribute])[
expr /. {nc[l___, a, b, r___] :> {nc[l, b, a, r], nc[l, r]}}];
stepExpand[expr_] := (If[Length[#1] > 1, stepExpand /@ #1, #1] & )[
singleStepExpand[expr]];
stepExpandFullStory[expr_] :=
(If[Head[#1] === List, Append[{expr}, stepExpandFullStory /@ #1],
#1] & )[singleStepExpand[expr]];
firstIfList[expr_] := If[Head[expr] === List, First[expr], expr];
nestedListToListOfEdges[expr_] := Cases[expr,
{l:nc[__], {first_, second_}} :> Sequence @@
{DirectedEdge[l, firstIfList[first]], DirectedEdge[l,
firstIfList[second]]}, All];
groupPowers[args_] := args //. {nc[l___, a_, a_, r___] :>
nc[l, a^2, r], nc[l___, (a_)^(n_), a_, r___] :>
nc[l, a^(n + 1), r], nc[l___, a_, (a_)^(n_), r___] :>
nc[l, a^(n + 1), r]};
makeCsToBrackets[expr_] := expr //.
{c[a_, b_] :> StringJoin["[", ToString[makeCsToBrackets[a]], ",",
ToString[makeCsToBrackets[b]], "]"],
(a_)^(n_) :> StringJoin[ToString[makeCsToBrackets[a]], "^",
ToString[n]]};
beautify[expr_] := makeCsToBrackets[expr] //.
{nc[args__] :> MaTeX[StringJoin @@ ToString /@ {args},
Magnification -> 1.5], s_String :>
MaTeX[s, Magnification -> 1.5]};
edgesToGraphWithNiceLabels[edges_] :=
Graph[edges, VertexLabels ->
(#1 -> beautify[groupPowers[#1]] & ) /@ DeleteDuplicates[
Flatten[edges, Infinity, DirectedEdge]],
GraphLayout -> "LayeredDigraphEmbedding"];
nc[a, a, b, b] // stepExpandFullStory // nestedListToListOfEdges // edgesToGraphWithNiceLabels