In a discussion under another question, a user named @Claudiu showed me his own version of the derivation of the CHSH inequality that does not need hidden variables (reproduced below). I wonder whether this proof is valid and, if so, why we aren't introduced to this version of the derivation, which rules out locality and avoids the whole need to discuss hidden variables.
I reproduce the derivation here, but you may skip it and go directly to the next section (it is just to show "previous work").
Derivation
This derivation is based on J.S. Bell's Speakable and Unspeakable in Quantum Mechanics, first edition, p. 156, Appendix 2. The idea in this question is to carry out the same derivation as Bell did in the book, but with all integrals over hidden variables removed. The steps are as follows.
Let us assume (1) statistical independence (no superdeterminism), (2) factorizability and the usual (3) locality, so that we can write the correlations as
$$E(\theta,\phi)=[P_1(\uparrow|\theta)-P_1(\downarrow|\theta)][P_2(\uparrow|\phi)-P_2(\downarrow|\phi) ]=A(\theta)B(\phi)$$ where $A$ and $B$ stand for the first and second expressions in square brackets, respectively. Here $\theta$ and $\phi$ are the measurement angles of the two detectors 1 and 2 in a Bell test with two entangled spin-1/2 particles. The result of each measurement can be $\updownarrow=\uparrow,\downarrow$, and $P_i(\updownarrow|\varphi)$ is the conditional probability of obtaining $\updownarrow$ in detector $i=1,2$ when it measures at angle $\varphi$.
Some of you will be familiar with this expression in the form $\int \rho(\lambda)A(\theta,\lambda)B(\phi,\lambda)d\lambda$; I have simply dropped the hidden variable $\lambda$.
Without assuming that the $P_i$ take the definite values $0$ or $1$ (determinism), we can at least admit $0\leq P_1\leq 1$ and $0\leq P_2\leq 1$. Since $A$ and $B$ are each a difference of two such probabilities that sum to 1, it follows that $|A(\theta)|\leq 1$ and $|B(\phi)|\leq 1$.
We can use that to write $$|E(\theta,\phi)\pm E(\theta,\phi')|=|A(\theta)B(\phi)\pm A(\theta) B(\phi')|\leq |B(\phi)\pm B(\phi')|$$ and analogously we have $$|E(\theta',\phi)\mp E(\theta',\phi')|=|A(\theta')B(\phi)\mp A(\theta') B(\phi')|\leq |B(\phi)\mp B(\phi')|$$ where I took $\theta\to \theta'$ and $\pm\to\mp$.
Summing the last two inequalities, we have $$|E(\theta,\phi)\pm E(\theta,\phi')|+|E(\theta',\phi)\mp E(\theta',\phi')|\leq |B(\phi)\pm B(\phi')|+|B(\phi)\mp B(\phi')|\leq 2$$
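The final "$\leq 2$" is stated without justification, so here is the elementary step I believe is intended, written with the shorthand $x=B(\phi)$ and $y=B(\phi')$ (my notation, not Bell's). For real numbers with $|x|\leq 1$ and $|y|\leq 1$, $$|x+y|+|x-y|=2\max(|x|,|y|)\leq 2$$ The identity can be checked case by case: if $|x|\geq|y|$, then $x+y$ and $x-y$ both have the sign of $x$, so the two absolute values add up to $2|x|$; if $|y|>|x|$, the same argument with the roles exchanged gives $2|y|$. Either way the right-hand side of the chain is bounded by 2.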
This last result includes the famous CHSH inequality $$|E(\theta,\phi)+ E(\theta,\phi')+E(\theta',\phi)- E(\theta',\phi')|\leq 2$$
Nowhere in this derivation did I need to use hidden variables. For the version with the integrals over the hidden variable distribution, check the reference above.
The only "classical" physical assumption above was locality; the rest seems to come from probability theory. Of course, this derivation does not address how you argue for assumption (2) [which is very natural], but that is sometimes justified through "determinism" (fixing some probabilities to be strictly 0 or 1).
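As a sanity check on the algebra only (not on the physical assumptions), here is a minimal numerical sketch. It assumes the factorized form $E(\theta,\phi)=A(\theta)B(\phi)$ used above and samples the four numbers $A(\theta)$, $A(\theta')$, $B(\phi)$, $B(\phi')$ uniformly in $[-1,1]$. The function names and the random-sampling strategy are my own illustration and are not taken from Bell's book.

```python
import random


def chsh(a, a_prime, b, b_prime):
    """CHSH combination for factorized correlations E = A * B.

    a, a_prime stand for A(theta), A(theta'); b, b_prime for B(phi), B(phi').
    """
    return a * b + a * b_prime + a_prime * b - a_prime * b_prime


def max_chsh_factorized(trials=100_000, seed=0):
    """Largest |CHSH| found over random values with |A|, |B| <= 1."""
    rng = random.Random(seed)
    largest = 0.0
    for _ in range(trials):
        a, a_prime, b, b_prime = (rng.uniform(-1.0, 1.0) for _ in range(4))
        largest = max(largest, abs(chsh(a, a_prime, b, b_prime)))
    return largest


if __name__ == "__main__":
    # Random sampling never exceeds the bound of 2.
    print(max_chsh_factorized())
    # The bound is saturated at the extreme values, e.g. A = A' = B = 1, B' = -1.
    print(chsh(1, 1, 1, -1))
```

The sampled maximum stays at or below 2, and the extreme assignment in the last line reaches exactly 2, consistent with the inequality derived above.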
[Assumption (1) is out of the scope of this question].
Comment and question
A derivation of Bell's inequalities without all the integrals associated with hidden variables is, to me, a stronger argument for those who advocate violations of locality, as it avoids the unnecessary discussion about the nature of hidden variables. Maybe somebody with more background could provide a perspective on this issue. It seems that such a version should be taught more in introductory courses, as it is a much simpler demonstration of "quantum nonlocality", if you wish.
Note that this boils down to Occam's razor: why have a more complicated version of the theorem if it adds nothing to the result?
I will try not to ask whether the derivation above is right (as that would probably get the question closed). However, I will ask: is it possible to derive CHSH without ever appealing to hidden variables, as I did here?