Well, how do we say something is real? One way, the most common and most efficient, is to observe it via EM waves, which travel at the speed of light.
Your "now" is in your locality, and someone else's "now" (someone who is far away but not moving with respect to you, i.e. in the same frame of reference) is in their locality. When you try to communicate this "now" to another observer (or vice versa), the fastest possible communication happens at the speed of light, so the two "nows" will differ by the time that communication takes.
This is the simplest case, and here the loss of simultaneity obviously arises from the finite speed of light. In this case, you will agree on simultaneity only if the event in question is midway between you. The other way is to have synchronized clocks and to check simultaneity after the fact by comparing the clock readings.
When the two observers are not in the same frame of reference, it is not this simple, and the calculation of the "time taken by the communication" will be different. This difference also comes from the fact that the speed of light is finite and the same for all observers.
So overall, the loss of simultaneity arises from the finite speed of communication and, in addition, depends on the relative speed of the two observers and on the location of the event relative to each of them.
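As a rough sketch of that dependence, assuming the standard Lorentz transformation with the second observer moving at speed v along the x axis relative to the first (the symbols Δt, Δx, v and γ are my own labels, not something defined above):

```latex
\begin{align}
\Delta t' &= \gamma\left(\Delta t - \frac{v\,\Delta x}{c^{2}}\right),
&
\gamma &= \frac{1}{\sqrt{1 - v^{2}/c^{2}}} \\
\Delta t = 0 \;&\Rightarrow\; \Delta t' = -\,\gamma\,\frac{v\,\Delta x}{c^{2}}
\end{align}
```

Here Δt and Δx are the time and space separation of two events for the first observer, and Δt' is the time separation for the second. Two events that are simultaneous for the first observer (Δt = 0) are separated in time for the second unless v = 0 (same frame) or Δx = 0 (same place). Note that these are the times each observer assigns to the events once the signal travel time has been accounted for.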
Time dilation affects the rate at which time passes in different frames of reference, but simultaneity is about "checking now", which is limited by the finite speed of light. Checking the clocks always happens after the fact.
For example, two people sitting a couple of meters apart never seem to disagree on simultaneity, even if the event is not equidistant from them. Why? Because the difference in time is negligible (a few nanoseconds at that distance), far too small to notice.
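To put a number on "negligible", here is a minimal sketch for two observers at rest relative to the event. The function name and the distances of 1 m and 3 m are made up for illustration; only the speed of light is a fixed value.

```python
C = 299_792_458.0  # speed of light in m/s (exact by SI definition)


def arrival_time_difference(d1_m, d2_m):
    """Difference (in seconds) between the light-arrival times at two observers.

    d1_m and d2_m are the distances in meters from the event to each observer.
    Both observers are assumed to be at rest relative to the event.
    """
    return abs(d1_m - d2_m) / C


# Hypothetical setup: one observer 1 m from the event, the other 3 m away.
dt = arrival_time_difference(1.0, 3.0)
print(f"{dt * 1e9:.2f} ns")  # prints "6.67 ns"
```

The difference is zero only when both distances are equal, which is the "midway" case. Otherwise it grows by about 3.3 ns per meter of extra path.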