It is actually possible to allow for sleeping nodes without the need for time synchronization. The basic idea is to send a message multiple times until the node finally wakes up. There is of course a lot of room for clever optimization, so there are hundreds of MAC layer approaches based on this idea.
But since your question specifically asks for MAC layers, where a node knows when to transmit in advance, i.e. Time Division Multiple Access (TDMA), I will focus on those approaches.
As you already mentioned, one problem is clock drift, so the devices have to wake up regularly for time synchronization. In the typical short-range wireless applications we are talking about, signal propagation duration itself over a single hop is not a big problem. So it is sufficient that a central coordinator sends a beacon, including the current time, in regular time intervals that are known to the nodes.
In a multi-hop network it gets more complicated. Just forwarding the beacon will not work, because the latency is too high. The solution is that multiple (if not all) nodes send beacons, i.e. receive a beacon from a node closer to the coordinator, correct the own clock drift with it and send out an own beacon with the corrected time. You just have to avoid building circles (been there, done that....).
Since now every node in the network has the same notion of time, there is a second problem: How does a node know when he should wake up to transmit or receive? There are basically four approaches, that can also be combined:
- Common Slot: All nodes wake up at the same time and use a contention based access method to transmit their packets Advantage: Easy (if you know how to do CSMA/CA). Disadvantage: Prone to collisions, lower throughput. 
- Predefined: For a restricted number of nodes you can just assign fixed slots to the nodes. For example, node 2 can send to node 1 in the first timeslot and node 3 can send to node 2 in the second timeslot. Advantage: Dedicated slots and no collisions. Disadvantage: The topology has to be fixed (very difficult in wireless mesh networks). 
- Centralized: A central coordinator requests information from the nodes about the topology, calculates a global schedule and distributes it again to the nodes. Advantage: No predefined topology required. Disadvantage: Scales poorly and is prone to topology changes (the whole process has to be restarted). 
- Decentralized: Two nodes that want to communicate negotiate the slot themselves. It is quite complex, because they have to make sure no neighboring devices transmit at the same time. Advantage: Scales well because the negotiation is local. Disadvantage: Complex to implement. 
There are two related techniques included in the IEEE 802.15.4 standard that currently find much research attention: TSCH and DSME.
TSCH itself is quite basic. It only solves the time synchronization problem, but leaves the slot assignment problem for an upper layer. There is 6TiSCH that tries to fill this gap, but it is still work-in-progress. There are implementations, for example included in Contiki or OpenWSN.
DSME on the other hand already provides a mechanism for decentralized slot negotiation. We have actually build an open-source implementation of this called openDSME. While there is a video tutorial for running a simulation, the hardware implementation is unfortunately still underdocumented. Ask another question or contact us directly if you want to use it.