A first remark
This same phenomenon of 'control' qubits changing states in some circumstances also occurs with controlled-NOT gates; in fact, this is the entire basis of eigenvalue estimation. So not only is it possible, it is an important fact about quantum computation that it is possible. It even has a name: a "phase kick", in which the control qubits (or more generally, a control register) incurs relative phases as a result of acting through some operation on some target register.$\def\ket#1{\lvert#1\rangle}$
The reason why this happens
Why should this be the case? Basically it comes down to the fact that the standard basis is not actually as important as we sometimes describe it as being.
Short version. Only the standard basis states on the control qubits are unaffected. If the control qubit is in a state which is not a standard basis state, it can in principle be changed.
Longer version —
Consider the Bloch sphere. It is, in the end, a sphere — perfectly symmetric, with no one point being more special than any other, and no one axis more special than any other. In particular, the standard basis is not particularly special.
The CNOT operation is in principle a physical operation. To describe it, we often express it in terms of how it affects the standard basis, using the vector representations
$$ \ket{00} \to {\scriptstyle \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}}\,,
\quad
\ket{01} \to {\scriptstyle \begin{bmatrix} 0 \\ 1  \\ 0 \\ 0 \end{bmatrix}}\,,
\quad
\ket{10} \to {\scriptstyle \begin{bmatrix} 0 \\ 0 \\ 1   \\ 0 \end{bmatrix}}\,,
\quad
\ket{11} \to {\scriptstyle \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1   \end{bmatrix}}$$
— but this is just a representation. This leads to a specific representation of the CNOT transformation:
$$
\mathrm{CNOT}
\to
{\scriptstyle \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}}\,.$$
and for the sake of brevity we say that those column vectors are the standard basis states on two qubits, and that this matrix is a CNOT matrix.
Did you ever do an early university mathematics class, or read a textbook, where it started to emphasise the difference between a linear transformation and matrices — where it was said, for example, that a matrix could represent a linear transformation, but was not the same as a linear transformation? The situation with CNOT in quantum computation is one example of how this distinction is meaningful. The CNOT is a transformation of a physical system, not of column vectors; the standard basis states are just one basis of a physical system, which we conventionally represent by $\{0,1\}$ column vectors.
What if we were to choose to represent a different basis — say, the X eigenbasis — by $\{0,1\}$ column vectors, instead? Suppose that we wish to represent $$
\begin{aligned}
\ket{++} \to{}& [\, 1 \;\; 0 \;\; 0 \;\; 0 \,]^\dagger\,,
\\
\ket{+-} \to{}& [\, 0 \;\; 1 \;\; 0 \;\; 0 \,]^\dagger\,,
\\
\ket{-+} \to{}& [\, 0 \;\; 0 \;\; 1 \;\; 0 \,]^\dagger\,,
\\
\ket{--} \to{}& [\, 0 \;\; 0 \;\; 0 \;\; 1 \,]^\dagger \,.
\end{aligned}$$
This is a perfectly legitimate choice mathematically, and because it is only a notational choice, it doesn't affect the physics — it only affects the way that we would write the physics. It is not uncommon in the literature to do analysis in a way equivalent to this (though it is rare to explicitly write a different convention for column vectors as I have done here). We would have to represent the standard basis vectors by:
$$ \ket{00} \to \tfrac{1}{2}\,{\scriptstyle \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}}\,,
\quad
\ket{01} \to \tfrac{1}{2}\,{\scriptstyle \begin{bmatrix} 1 \\ -1  \\ 1 \\ -1 \end{bmatrix}}\,,
\quad
\ket{10} \to \tfrac{1}{2}\,{\scriptstyle \begin{bmatrix} 1 \\ 1 \\ -1   \\ -1 \end{bmatrix}}\,,
\quad
\ket{11} \to \tfrac{1}{2}\,{\scriptstyle \begin{bmatrix} 1 \\ -1 \\ -1 \\ 1   \end{bmatrix}}\,.$$
Again, we're using the column vectors on the right only to represent the states on the left. But this change in representation will affect how we want to represent the CNOT gate.
A sharp-eyed reader may notice that the vectors which I have written on the right just above are the columns of the usual matrix representation of $H \otimes H$. There is a good reason for this: what this change of representation amounts to is a change of reference frame in which to describe the states of the two qubits. In order to describe $\ket{++} = [\, 1 \;\; 0 \;\; 0 \;\; 0 \,]^\dagger$, $\ket{+-} = [\, 0 \;\; 1 \;\; 0 \;\; 0 \,]^\dagger$, and so forth, we have changed our frame of reference for each qubit by a rotation which is the same as the usual matrix representation of the Hadamard operator — because that same operator interchanges the $X$ and $Z$ observables, by conjugation.
This same frame of reference will apply to how we represent the CNOT operation, so in this shifted representation, we would have
$$
\begin{aligned}
\mathrm{CNOT} \to \tfrac{1}{4}{}\,{\scriptstyle
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}
\,\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}\,
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}
}\,
=
\,{\scriptstyle
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}}
\end{aligned}$$
which — remembering that the columns now represent $X$ eigenstates — means that the CNOT performs the transformation
$$ \begin{aligned}
\mathrm{CNOT}\,\ket{++} &= \ket{++} , \\
\mathrm{CNOT}\,\ket{+-} &= \ket{--}, \\
\mathrm{CNOT}\,\ket{-+} &= \ket{-+} , \\
\mathrm{CNOT}\,\ket{--} &= \ket{+-} .
\end{aligned} $$
Notice here that it is only the first, 'control' qubits whose state changes; the target is left unchanged.
Now, I could have shown this same fact a lot more quickly without all of this talk about changes in reference frame. In introductory courses in quantum computation in computer science, a similar phenomenon might be described without ever mentioning the words 'reference frame'. But I wanted to give you more than a mere calculation.  I wanted to draw attention to the fact that a CNOT is in principle not just a matrix; that the standard basis is not a special basis; and that when you strip these things away, it becomes clear that the operation realised by the CNOT clearly has the potential to affects the state of the control qubit, even if the CNOT is the only thing you are doing to your qubits.
The very idea that there is  a 'control' qubit is one centered on the standard basis, and embeds a prejudice about the states of the qubits that  invites us to think of the operation as one-sided. But as a physicist, you should be deeply suspicious of one-sided operations. For every action there is an equal and opposite reaction; and here the apparent one-sidedness of the CNOT on standard basis states is belied by the fact that, for X eigenbasis states, it is the 'target' which unilaterally determines a possible change of state of the 'control'.
You wondered whether there was something at play which was only a mathematical convenience, involving a choice of notation. In fact, there is: the way in which we write our states with an emphasis on the standard basis, which may lead you to develop a non-mathematical intuition of the operation only in terms of the standard basis. But change the representation, and that non-mathematical intuition goes away.
The same thing which I have sketched for the effect of CNOT on X-eigenbasis states, is also going on in phase estimation, only with a different transformation than CNOT. The 'phase' stored in the 'target' qubit is kicked up to the 'control' qubit, because the target is in an eigenstate of an operation which is being coherently controlled by the first qubit. On the computer science side of quantum computation, it is one of the most celebrated phenomena in the field. It forces us to confront the fact that the standard basis is only special in that it is the one we prefer to describe our data with — but not in how the physics itself behaves.