In the Hamiltonian formalism, specifically with generating functions, why are the variables $q, p, Q, P$ treated as independent when finding the equations that arise from the generating function?
Let us consider a pedagogical example with just one degree of freedom: the simple harmonic oscillator in one dimension. With $n=1$ there is one coordinate $q$ and one momentum $p$, so phase space is $2n = 2$ dimensional. The Hamiltonian is the familiar
$$
H(q,p) = \frac{1}{2m}p^2 + \frac{k}{2}q^2\;.
$$
In the Hamiltonian formalism we treat $q$ and $p$ as two independent variables. This is exactly what the notation $H(q,p)$, with a pair of parentheses and a comma separating $q$ and $p$, expresses: the Hamiltonian is a function of two independent variables.
I understand that generating functions generate canonical transformations, but I'm not sure why it is valid to treat $q,p,Q,P$ as independent variables during the derivation of transformation equations, since we know that they are indeed dependent on each other - $Q=Q(q,p), P=P(q,p)$ (and the inverse relations),
It is not valid, in general, to treat all four of $q,p,Q,P$ as independent variables. Indeed, they are not all independent: phase space is only $2n=2$ dimensional, so only two of them can be independent at a time.
or even the relations that we get from the generating function (e.g. $\frac{\partial F_1}{\partial q} = p$)?
Here we are talking about a more specific form of generating function, $F_1$, which Goldstein writes as $F_1 = F_1(q,Q)$. In our example it is a function of only two variables: one from the first set (the old $q$) and one from the second set (the new $Q$).
Further, I direct you to page 371 of Goldstein, which reads: "Indeed, F is useful... only when half of the variables... are from the old set and half are from the new." $F_1(q,Q)$ has this form, since half of two is one: one variable, $q$, is from the old set, and one variable, $Q$, is from the new.
Because we started with $2n=2$ independent variables (which we initially agreed were $q$ and $p$), it should not be surprising that we can instead pick a different pair of $2n=2$ variables ($q$ and $Q$) to treat as independent. This works, at least locally, whenever $Q(q,p)$ genuinely depends on $p$, so that $p$ can be recovered from $q$ and $Q$.
More clearly, as said by Arturo Don Juan in the related post: "How are $q_i$ and $Q_i$ separately independent? I thought $Q_i$ was, in general, a function of the original canonical coordinate variables $q_i$, thereby making it explicitly dependent on $q_i$"? (I didn't understand the answer given there)
If $Q$ were a function of $q$ alone, then $q$ and $Q$ could not be taken as independent. But we are not interested in the completely general case; we are interested in specific useful canonical transformations, such as those generated by $F_1(q,Q)$.
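To make the role of independence explicit, here is a brief sketch of the standard derivation for one degree of freedom. It assumes the usual condition (as in Goldstein's Chapter 9) that the old and new integrands of Hamilton's principle differ by a total time derivative, and it allows an explicit time dependence $F_1(q,Q,t)$ purely for generality:
$$
p\,\dot q - H = P\,\dot Q - K + \frac{dF_1}{dt}\;, \qquad \frac{dF_1}{dt} = \frac{\partial F_1}{\partial q}\,\dot q + \frac{\partial F_1}{\partial Q}\,\dot Q + \frac{\partial F_1}{\partial t}\;.
$$
Collecting terms gives
$$
\left(p - \frac{\partial F_1}{\partial q}\right)\dot q - \left(P + \frac{\partial F_1}{\partial Q}\right)\dot Q + K - H - \frac{\partial F_1}{\partial t} = 0\;.
$$
Only if $q$ and $Q$ can be varied independently must the coefficients of $\dot q$ and $\dot Q$ vanish separately, which yields
$$
p = \frac{\partial F_1}{\partial q}\;, \qquad P = -\frac{\partial F_1}{\partial Q}\;, \qquad K = H + \frac{\partial F_1}{\partial t}\;.
$$
Note that $p$ and $P$ are never treated as independent here: they are determined by $q$ and $Q$ through these relations.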
For example, if we take
$$
F_1 = qQ
$$
then the relations $p = \partial F_1/\partial q$ and $P = -\partial F_1/\partial Q$ give
$$
P=-q
$$
and
$$
Q=p\;,
$$
where we see that $Q$ depends only on $p$ and not on $q$, so $q$ and $Q$ can be varied independently in this specific canonical transformation generated by $F_1(q,Q)$.
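As a quick numerical sanity check of this example, the sketch below feeds independently chosen values of $q$ and $Q$ into $F_1 = qQ$ and recovers $p$ and $P$ from the partial derivatives. The helper names (`partial`, `transform_from_F1`) and the sample values are hypothetical, introduced only for illustration, and the derivatives are approximated by central finite differences rather than computed symbolically.

```python
def F1(q, Q):
    """Generating function F_1(q, Q) = q*Q from the example."""
    return q * Q


def partial(f, args, index, h=1e-6):
    """Central-difference partial derivative of f with respect to args[index],
    holding every other argument fixed."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def transform_from_F1(q, Q):
    """Return (p, P) implied by F_1 at the independently chosen point (q, Q)."""
    p = partial(F1, (q, Q), 0)    # p = dF_1/dq at fixed Q
    P = -partial(F1, (q, Q), 1)   # P = -dF_1/dQ at fixed q
    return p, P


# Arbitrary sample values: q and Q are picked independently of each other.
q, Q = 0.7, -1.3
p, P = transform_from_F1(q, Q)
print(p, P)  # p should be close to Q, and P close to -q
```

The point of the exercise is that `q` and `Q` are free inputs, while `p` and `P` come out as functions of them.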
So, perhaps it is better to say that we require $q$ and $Q$ to be independent in our derivation of the $F_1=F_1(q,Q)$ specific canonical transformation. And this is certainly expected to be possible since there are two independent dimensions in phase space and there are two variables in the $F_1$ transformation.
I would be thankful if someone could clarify this, and provide references that deal with this point (if you know to direct me to an explanation for this point in Goldstein, Classical Mechanics it would be best).
The best explanation in Goldstein is probably the one I cited above. I see no fuller discussion there that specifically addresses your point, although all of Chapter 9 is relevant.
To complete our discussion of the simple harmonic oscillator, the canonical transformation generated by
$$
F_1 = qQ
$$
leads to
$$
K = H(q(P),p(Q)) = H(-P, Q) = \frac{1}{2m}Q^2 + \frac{k}{2}P^2\;,
$$
which essentially swaps the roles of what we called "position" and "momentum" (up to a sign).
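A small numerical sketch can confirm that this swap is consistent with the dynamics: with $Q = p$ and $P = -q$, Hamilton's equations for $K$ should reproduce $\dot Q = \dot p$ and $\dot P = -\dot q$ as computed from $H$. The values of `m` and `k`, the sample point, and the helper names are assumptions chosen only for illustration, and derivatives are again approximated by central differences.

```python
m, k = 2.0, 3.0  # hypothetical mass and spring constant, for illustration only


def H(q, p):
    """Old Hamiltonian of the harmonic oscillator."""
    return p**2 / (2 * m) + k * q**2 / 2


def K(Q, P):
    """New Hamiltonian obtained by substituting q = -P, p = Q into H."""
    return Q**2 / (2 * m) + k * P**2 / 2


def partial(f, args, index, h=1e-6):
    """Central-difference partial derivative of f with respect to args[index]."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def old_flow(q, p):
    """(dq/dt, dp/dt) from Hamilton's equations for H."""
    return partial(H, (q, p), 1), -partial(H, (q, p), 0)


def new_flow(Q, P):
    """(dQ/dt, dP/dt) from Hamilton's equations for K."""
    return partial(K, (Q, P), 1), -partial(K, (Q, P), 0)


q, p = 0.4, -1.1   # arbitrary phase-space point
Q, P = p, -q       # the transformation generated by F_1 = q*Q
q_dot, p_dot = old_flow(q, p)
Q_dot, P_dot = new_flow(Q, P)

# Q = p implies dQ/dt = dp/dt; P = -q implies dP/dt = -dq/dt.
# All three printed differences should be close to zero.
print(abs(K(Q, P) - H(q, p)), abs(Q_dot - p_dot), abs(P_dot + q_dot))
```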