In the paper "Efficient Measurement of Quantum Dynamics via Compressive Sensing" by Shabani et al. (2011) [arXiv:0910.5498], full two-photon process tomography is performed as follows: the 16 pairwise combinations of the 4 input states $\{|H\rangle,|V\rangle,|D\rangle,|R\rangle\}$ are prepared and, for each input, 36 two-qubit combinations of the observables $\{|H\rangle,|V\rangle,|D\rangle,|A\rangle,|R\rangle,|L\rangle\}$ are measured, where $|D(A)\rangle=(|H\rangle \pm |V\rangle)/\sqrt{2}$ and $|R(L)\rangle=(|H\rangle \pm i|V\rangle)/\sqrt{2}$. These $16\times 36=576$ input-output configurations form an overcomplete set that allows the best possible estimate of the quantum process, denoted $\chi_{576}$.
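To make the counting concrete, here is a minimal sketch in Python (NumPy) that enumerates the configurations described above and checks how many of the measurement projectors are linearly independent. It is not taken from the paper: the variable names, the choice of $|H\rangle=(1,0)^T$ and $|V\rangle=(0,1)^T$ as the computational basis, and the helper `operator_rank` are my own assumptions for illustration.

```python
import itertools
import numpy as np

s = 1 / np.sqrt(2)

# Single-photon polarization states, assuming |H> = (1, 0) and |V> = (0, 1).
states = {
    "H": np.array([1, 0], dtype=complex),
    "V": np.array([0, 1], dtype=complex),
    "D": s * np.array([1, 1], dtype=complex),
    "A": s * np.array([1, -1], dtype=complex),
    "R": s * np.array([1, 1j], dtype=complex),
    "L": s * np.array([1, -1j], dtype=complex),
}

input_labels = ["H", "V", "D", "R"]
measurement_labels = ["H", "V", "D", "A", "R", "L"]

# Pairwise (two-photon) combinations of the single-photon labels.
inputs = list(itertools.product(input_labels, repeat=2))
measurements = list(itertools.product(measurement_labels, repeat=2))
configurations = list(itertools.product(inputs, measurements))


def operator_rank(label_pairs):
    """Hypothetical helper: rank of the span of the projectors |ab><ab|."""
    rows = []
    for a, b in label_pairs:
        psi = np.kron(states[a], states[b])          # two-photon product state
        rows.append(np.outer(psi, psi.conj()).ravel())  # flattened 4x4 projector
    return np.linalg.matrix_rank(np.array(rows))


rank_inputs = operator_rank(inputs)
rank_measurements = operator_rank(measurements)

print(len(inputs), len(measurements), len(configurations))
print(rank_inputs, rank_measurements)
```

Under these assumptions, the 36 measurement projectors span only the 16-dimensional space of two-qubit operators, which is consistent with the set being described as overcomplete.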
Why are 36 pairs of observables for each input state used here?