This is a sequel to How does $x^{\frac{r}{2}} \equiv -1 \pmod {p_i^{a_i}}$ follow from "if all these powers of $2$ agree"?
From Polynomial-Time Algorithms for Prime Factorization and Discrete Logarithms on a Quantum Computer (Shor, 1995) (arXiv), p. 16:
The multiplicative group (mod $p^\alpha$) for any odd prime power $p^\alpha$ is cyclic [Knuth 1981], so for any odd prime power $p_i^{a_i}$, the probability is at most 1/2 of choosing an $x_i$ having any particular power of two as the largest divisor of its order $r_i$. Thus each of these powers of 2 has at most a 50% probability of agreeing with the previous ones, so all k of them agree with probability at most $1/2^{kâ1}$, and there is at least a $1 - 1/2^{k-1}$ chance that the $x$ we choose is good. This scheme will thus work as long as $n$ is odd and not a prime power; finding factors of prime powers can be done efficiently with classical methods.
Why is the probability of choosing an $x_i$ having any particular power of two as the largest divisor of its order $r_i$ at most $\frac{1}{2}$?
Why is the probability of each of these powers of 2 agreeing with the previous at most 50%?
Related: Expected repetitions of the quantum part of Shor's algorithm