Let’s say we have $N$ electrons and we want to derive the Hartree-Fock (HF) equations. The first step would be to define a Slater determinant of $N$ electrons:
$$\psi(x_1,x_2,\dots,x_N) = \frac{1}{\sqrt{N!}}\begin{vmatrix}\phi_{1}(x_1) & \phi_{2}(x_1) & \cdots & \phi_{N}(x_1)\\ \vdots & \vdots & \ddots & \vdots\\ \phi_{1}(x_N) & \phi_{2}(x_N) & \cdots & \phi_{N}(x_N)\end{vmatrix}$$
Then we would minimise the energy using Lagrange multipliers (which keep the orbitals orthonormal) to get the HF equations:
$$f\phi_k = \varepsilon_k\phi_k \qquad \forall k=1,\dots,N,$$
with the Fock operator
$$f := h + \sum_{n=1}^N \left(J_n - K_n\right).$$
We note the following: we have $N$ electrons in our system, so we get an $N\times N$ Slater determinant and $N$ molecular orbitals $\phi_k$.
If we wanted to solve these HF equations, we could simply put $N$ trial orbitals into them and iterate until self-consistency. In practice, however, one approximates the molecular orbitals by a linear combination of, e.g., $M>N$ basis functions: $$\phi_k=\sum_{m=1}^M C_{mk}\,\xi_m$$
If we put this into the HF equations above, we eventually get a matrix equation (the Roothaan equations), $$FC=SC\varepsilon,$$ where $\varepsilon$ is the diagonal matrix of orbital energies. This equation can be solved on a computer.
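To make "solved on a computer" concrete, here is a minimal sketch of one way to solve a matrix equation of this form, written as $FC = SC\varepsilon$ with $\varepsilon$ diagonal. It assumes NumPy, a symmetric $F$ and a symmetric positive-definite $S$. The function name `solve_roothaan` and the $3\times 3$ numbers are made up for illustration and do not come from a real molecule. In an actual SCF calculation $F$ depends on $C$, so this step would be repeated until self-consistency.

```python
import numpy as np

def solve_roothaan(F, S):
    """Solve F C = S C eps for symmetric F and symmetric positive-definite S."""
    # Symmetric (Loewdin) orthogonalisation: X = S^(-1/2)
    s_vals, s_vecs = np.linalg.eigh(S)
    X = s_vecs @ np.diag(s_vals ** -0.5) @ s_vecs.T
    # Transform to an ordinary symmetric eigenproblem F' C' = C' eps
    F_prime = X.T @ F @ X
    eps, C_prime = np.linalg.eigh(F_prime)  # eigenvalues in ascending order
    # Back-transform the coefficients to the original (non-orthogonal) basis
    C = X @ C_prime
    return eps, C

# Toy matrices for M = 3 basis functions (hypothetical values)
F = np.array([[-1.0, -0.4, -0.1],
              [-0.4, -0.6, -0.3],
              [-0.1, -0.3, -0.2]])
S = np.array([[1.0, 0.5, 0.1],
              [0.5, 1.0, 0.4],
              [0.1, 0.4, 1.0]])

eps, C = solve_roothaan(F, S)
print("orbital energies:", eps)
print("coefficient matrix shape:", C.shape)
```

Here `eps` holds $M$ eigenvalues and `C` has $M$ columns, one column of coefficients $C_{mk}$ per solution, normalised so that $C^{T}SC = 1$.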
My question is:
There are $M>N$ expansion coefficients $C_{mk}$ for each orbital. By using these $M$ coefficients we eventually get $2M$ molecular orbitals $\phi_k$ with $k=1,\dots,2M$. Or, put in words: we get two molecular orbitals for every basis function we use (due to spin). The lowest $N$ orbitals are the occupied orbitals; the highest $2M-N$ are the virtual orbitals. These $2M$ molecular orbitals now correspond to a Slater determinant with $2M$ electrons:
$$\psi(x_1,x_2,\dots,x_{2M}) = \frac{1}{\sqrt{(2M)!}}\begin{vmatrix}\phi_{1}(x_1) & \phi_{2}(x_1) & \cdots & \phi_{2M}(x_1)\\ \vdots & \vdots & \ddots & \vdots\\ \phi_{1}(x_{2M}) & \phi_{2}(x_{2M}) & \cdots & \phi_{2M}(x_{2M})\end{vmatrix}$$
But now we have a problem: we wanted to describe an $N$-electron system and started out by assuming an $N$-electron Slater determinant. Then we used the expansion in basis functions and ended up with a Slater determinant describing $2M>N$ electrons. Isn't this somehow unphysical? How do we know that our result even describes an $N$-electron system correctly? Since our Slater determinant has size $2M$, how is it possible to just fill the lowest orbitals and ignore all the others?