The Landauer erasure principle states that to erase a bit of information from a system, the entropy of the environment will be increased by at least $k_B\log2$, or equivalently, it costs at least an amount $k_BT\log2$ of energy to erase a bit of information.
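To get a feel for the scale of the bound, here is a minimal sketch that evaluates $k_BT\log2$ numerically. The function name `landauer_bound` and the temperature of 300 K are my own illustrative choices, not something taken from the article.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def landauer_bound(temperature_kelvin: float) -> float:
    """Minimum energy in joules to erase one bit at the given temperature."""
    return K_B * temperature_kelvin * math.log(2)


# T = 300 K is an assumed room temperature, not a value from the text.
print(f"{landauer_bound(300.0):.3e} J")
```

At that assumed temperature the bound comes out at roughly $2.9\times10^{-21}$ J per bit.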
To make this precise, you have to carefully define what erasure of information means. Kurt Jacobs does this, and gives a nice concise derivation in this article. If I understand it correctly, the argument is essentially as follows:
- As the laws of physics are reversible, information is never really destroyed. What we mean by information being destroyed is that it is not accessible through macroscopic variables of the system (or the environment), and is hidden in inaccessible degrees of freedom of the microstate, i.e. essentially in thermal noise.
- Reduce the system to one that carries a single bit of information, and assume that the environment is in one of $N$ equivalent microstates. This gives us an entropy of $S = k_B\log N$.
- Destroying the state of this system means mapping it to a fixed state (e.g. 0) in such a way that no information about the original state can be recovered from macroscopic variables of the system or environment.
- Physically the old state of the system must still be encoded in the environment, along with the environment's original state.
- By an explicit numbering argument, it is argued that there must be at least $2N$ indistinguishable microstates corresponding to the new macrostate of the environment if we want the original system state to be erased.
The last point seems intuitively clear: if we have 2 initial system states and $N$ environment states, we have a total of $2N$ states. If the number of system states is reduced to 1, the environment must end up in one of $2N$ states, and it must not be possible to derive the initial system state from macroscopic properties of the environment.
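Written out, and assuming all $2N$ final environment microstates belong to a single macrostate (which is exactly the assumption I question next), the counting gives

$$\Delta S = k_B\log(2N) - k_B\log N = k_B\log 2,$$

and for an environment at temperature $T$ this corresponds to a heat of at least $T\,\Delta S = k_BT\log 2$.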
However, couldn't these $2N$ states correspond to 2 different macrostates? If half of the original "0" states were mapped to the first macrostate and the other half to the second, and likewise for the "1" states, the entropy would not increase. More precisely, let $a,a'$ be indistinguishable initial microstates of the environment, and let $b,b'$ and $c,c'$ be final pairs of indistinguishable microstates, such that $b,c$ and $b',c'$ are macroscopically distinguishable. Then if the erasure mapping acts as
$$\begin{gather*} (0,a)\mapsto (0,b) \\ (0,a')\mapsto (0,c) \\ (1,a)\mapsto (0,b') \\ (1,a')\mapsto (0,c') \end{gather*}$$
the original system state has become inaccessible, but the entropy hasn't gone up. Note that in this erasure mapping, indistinguishable states, e.g. $(0,a)$ and $(0,a')$, are mapped to distinguishable states $(0,b)$ and $(0,c)$. Since I was uncertain whether that is possible, I asked on this site.
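To make the bookkeeping explicit, here is a small sketch that enumerates the mapping above. The macrostate labels `"B"` and `"C"` and all variable names are hypothetical, introduced only for illustration. It checks only the counting; it says nothing about whether such a map is physically admissible, which is the point I am unsure about.

```python
from collections import Counter

# The proposed erasure map: (system bit, environment microstate) -> same.
erasure = {
    (0, "a"): (0, "b"),
    (0, "a'"): (0, "c"),
    (1, "a"): (0, "b'"),
    (1, "a'"): (0, "c'"),
}

# Hypothetical macrostate labels: b, b' are indistinguishable, as are c, c'.
macrostate = {"b": "B", "b'": "B", "c": "C", "c'": "C"}

# Reversibility: no two initial states may share a final state.
is_injective = len(set(erasure.values())) == len(erasure)

# The system always ends up in the fixed state 0.
system_reset = all(final_bit == 0 for final_bit, _ in erasure.values())

# Collect which original bits lie behind each final macrostate.
bits_behind = {}
for (bit, _), (_, env) in erasure.items():
    bits_behind.setdefault(macrostate[env], set()).add(bit)

# The bit is hidden if every macrostate is compatible with both 0 and 1.
bit_hidden = all(bits == {0, 1} for bits in bits_behind.values())

# Number of environment microstates per macrostate, before and after.
initial_env_microstates = len({env for _, env in erasure})
final_env_microstates = Counter(macrostate[env] for _, env in erasure.values())

print(is_injective, system_reset, bit_hidden)
print(initial_env_microstates, dict(final_env_microstates))
```

In this toy case the initial environment macrostate contains 2 microstates, and each of the two final macrostates also contains 2, which is what I mean by the entropy not going up.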
Is there a flaw in my reasoning, or am I ignoring some assumption that has to be made for Landauer's principle to hold?