Once you accept that (a) the electromagnetic field is decribed by a U(1) gauge field $A^\mu(x)$ (as field variable) and (b) that the (free) field equations should be linear in $A^\mu$, the corresponding action is (up to a normalization factor) uniquely determined by the requirements of Poincare symmetry and gauge invariance:
The field strength tensor $F_{\mu \nu}= \partial_\mu A_\nu-\partial_\nu A_\mu$ is clearly invariant under the U(1) gauge transformation $A_\mu \to A_\mu + \partial_\mu \Lambda$ and possible candidates (bilinear in $A^\mu$) for terms in the Lagrange density (Lagrangian) are the scalar $F_{\mu \nu}F^{\mu \nu} = 2 (\vec{B}\cdot \vec{B}-\vec{E}\cdot \vec{E})$ or the pseudoscalar $\epsilon_{\mu \nu \alpha \beta}F^{\mu \nu} F^{\alpha \beta} = 8 \vec{E} \cdot \vec{B}$. The second possibilty is ruled out by that fact that this term can be written as a 4-gradient $\partial_\mu( \ldots)$, which does not affect the equations of motion. So one finds the Poincare invariant action
$S= k \int d^4 x \, F_{\mu \nu}(x) F^{\mu \nu}(x)$, with
$k =-1/16 \pi c$ in the Gauss system (or $-1/4c$ in the Heaviside system).
Note that a "mass term" $A_\mu A^\mu$ is forbidden as it violates gauge invariance and terms like $(F_{\mu \nu} F^{\mu \nu})^2$ or $(\epsilon_{\mu \nu \alpha \beta} F^{\mu \nu} F^{\alpha \beta})^2$ are excluded by condition (b).
Thus the "intuition" for introducing $\vec{E}^2-\vec{B}^2$ in the Lagrangian is the simple fact that this is a Lorentz invariant and gauge invariant quantity. Interpreting $\vec{E}^2$ and/or $\vec{B}^2$ as "kinetic energy" or a "potential" term is completely misleading. The correct way to identify the energy density (and the momentum density) of the electromagnetic field is to study its energy-momentum tensor.