Scott Aaronson describes quantum mechanics as "statistics but with the L2-norm". States are unit vectors under the L2 norm (the sum of squared amplitudes is 1) instead of unit vectors under the L1 norm (the sum of probabilities is 1), and operations are unitary matrices (which preserve the L2 norm) instead of stochastic matrices (which preserve the L1 norm). In this view, it's clear that the correct rule for how often a measurement result occurs is also going to be described by the L2 norm.
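
As a quick sanity check on that analogy, here is a small numpy sketch (the variable names are purely illustrative) showing that a random unitary preserves the L2 norm of a state vector, while a random column-stochastic matrix preserves the L1 norm of a probability distribution:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random unitary (QR of a complex Gaussian matrix) and a random column-stochastic matrix.
unitary, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
stochastic = rng.random((4, 4))
stochastic /= stochastic.sum(axis=0)    # each column sums to 1

# An L2-unit quantum state and an L1-unit probability distribution.
state = rng.normal(size=4) + 1j * rng.normal(size=4)
state /= np.linalg.norm(state)          # sum of squared amplitude magnitudes is 1
dist = rng.random(4)
dist /= dist.sum()                      # sum of probabilities is 1

print(np.linalg.norm(unitary @ state))  # ~1.0: unitaries preserve the L2 norm
print((stochastic @ dist).sum())        # ~1.0: stochastic matrices preserve the L1 norm
```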
Strictly speaking, you could have a theory with L2-normed states and operations, but with a measurement rule that followed some other norm. You can write a quantum state vector simulator and make it sample measurements any way you want, and nothing comes down and smashes your computer for doing so. There's nothing mathematically impossible about it. However, in experiments, reality does seem to follow the L2 norm. If you measure how much light makes it through a polarizing filter when there's an angle $\theta$ between the filter and the light's polarization, the fraction that gets through is $\cos^2 \theta$ (Malus's law), not $\cos \theta$ or $\cos^3 \theta$ or $e^{\cos \theta}$.
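
For example, here is a minimal sketch of such a simulator, with the measurement norm left as a knob (the `sample_measurements` helper and its `exponent` parameter are invented for illustration). Only the exponent-2 rule reproduces the $\cos^2 \theta$ fraction seen in the lab:

```python
import numpy as np

def sample_measurements(state, exponent, shots, rng):
    """Sample outcomes with probabilities proportional to |amplitude|**exponent."""
    probs = np.abs(state) ** exponent
    probs /= probs.sum()
    return rng.choice(len(state), size=shots, p=probs)

rng = np.random.default_rng(0)
theta = np.pi / 6
# Photon polarized at angle theta relative to the filter, written in the
# filter's pass/block basis: amplitudes (cos theta, sin theta).
state = np.array([np.cos(theta), np.sin(theta)])

for exponent in (1, 2, 3):
    outcomes = sample_measurements(state, exponent, shots=100_000, rng=rng)
    print(exponent, np.mean(outcomes == 0))   # fraction passing the filter
print(np.cos(theta) ** 2)                     # Malus's law: what experiments actually show
```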
More abstractly, if you pick a measurement norm other than L2, you tend to introduce effects that would be experimentally visible: faster-than-light signalling, exponential computational power, quantum teleportation not working right, the deferred measurement principle not working right, quantum error correction not working right, many-worlds interpretations being trivially distinguishable from Copenhagen interpretations, and so on. So if you believe Einstein was correct that there's no faster-than-light signalling, and think your theory should have that property, you're forced to pick the L2 norm. The same goes for many other reasonable "I think my theory should have that" assumptions.
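
To make the signalling point concrete, the sketch below (again with invented helper names) computes Bob's outcome distribution for an entangled pair under a hypothetical $|\text{amplitude}|^p$ measurement rule, once with Alice measuring directly in the Z basis and once with Alice rotating into the X basis before measuring. With $p = 2$ the two distributions agree, so Bob can't tell what Alice did; with $p = 1$ or $p = 3$ they differ, which Alice could exploit to send a signal faster than light:

```python
import numpy as np

def bob_marginal(joint_amps, p):
    """Bob's outcome distribution if probabilities were proportional to |amplitude|**p."""
    weights = np.abs(joint_amps) ** p
    weights /= weights.sum()
    return weights.sum(axis=0)              # sum over Alice's outcomes

# An entangled two-qubit state; amps[a, b] is the amplitude of Alice=a, Bob=b.
amps = np.array([[1.0, 0.0],
                 [1.0, 2.0]])
amps /= np.linalg.norm(amps)                # L2-normalize

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # rotates Alice's qubit into the X basis

for p in (1, 2, 3):
    z_choice = bob_marginal(amps, p)        # Alice measures in Z
    x_choice = bob_marginal(H @ amps, p)    # Alice rotates, then measures
    # Only p=2 prints True: Bob's statistics are independent of Alice's choice.
    print(p, z_choice, x_choice, np.allclose(z_choice, x_choice))
```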