Inspired by a recent comment from John Watrous, consider a Cayley graph $\Gamma$ of a group $G$ of order $O(\exp n),$ with generating set $R\subseteq G$ having $d=\vert R\vert$ such generators.
Rather than using the machinery of Aharanov and Ta-Shma for sparse simulation to quantumly walk along the graph $\Gamma$ by decomposing the Cayley graph into $(d+1)^2 n^6$ different colors, could we instead edge-color using the $d$ generators themselves (at the cost of maybe increasing the number of qubits needed to store the vertices of the graph)? Further must the generators be involutions, or can we use other generators as long as we know their order and can take the square-roots thereof?
For the symmetric group $S_n$ considered by Gerhardt and Watrous, I think we could simulate a walk along the positive semi-definite Laplacian of the Cayley graph by considering $n$ registers of $n$-dimensional qudits, and partially SWAP'ping the contents of each register based on the generator.
As an example, let $n=6$ and consider a register $|\psi\rangle$ as being in a superposition of six, six-dimensional qudits:
$$|\psi\rangle=|s_1s_2s_3s_4s_5s_6\rangle=\frac{1}{\sqrt 2}\big(|436251\rangle-|154623\rangle\big).$$
We can partially SWAP of the first register $|s_1\rangle$ with the second $|s_2\rangle$, followed by a partial SWAP of the second register $|s_2\rangle$ with the third $|s_3\rangle$, etc. Indeed $(1\:2)$ commutes with $(3\:4)$ and the partial SWAP of the third register with the fourth could be done concurrently with the first and second.
If so then a walk along $S_{20}$ with generating set $(1\:2)(2\:3)\cdots(19\: 20)$ could probably be simulated with about $100$ qubits and about $1000$ or so CSWAPS. $S_{52}$ (for an international deck of playing cards) could maybe be simulated with probably around $320$ qubits and several thousand CSWAPS. Maybe for a walks along $S_{20}$ with $20!\approx 2.4e18$ elements, and certainly $S_{52}$ with $52!\approx 8e67$ elements, are probably classically difficult to manage. Sampling the spectrum of certain states as in $\frac{1}{\sqrt{2}}\big(|\pi_1\rangle-|\pi_2\rangle\big)$ is likely classically challenging, owing to Wocjan and Zhou's proof of the general BQP-completeness thereof.