It seems then that, in passing two bits into the gate, and getting out only one we have "deleted" one bit of information, corresponding to an entropy decrease $k_b\log(2)$.*
*I suppose a different way of phrasing it is that the macrostate of 2 bits (A and B) has 4 microstates, such that $S=k_b\log(4)$ but passing the OR gate leaves only 1 bit, with 2 microstates. Thus $\Delta S = k_b \left(\log(4) - \log(2)\right) = k_b \log(2)$.
What you're describing is a decrease in the number of logical states: the OR operation plus input erasure maps 4 different logical states onto one of 2 different logical states, so information about the initial state is lost (the input bits are assumed to be erased to a known standard state, e.g. to 1).
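The loss of logical states can be made concrete with a small enumeration. This is only an illustrative sketch: the function name `or_gate_with_erasure` is hypothetical, and resetting both inputs to 1 is the standard-state assumption mentioned above.

```python
from collections import defaultdict
from itertools import product

def or_gate_with_erasure(a, b):
    """Return the final (a, b, out) state: inputs reset to 1, out = a OR b."""
    return (1, 1, a | b)

# Group the four possible input states by the final state they lead to.
preimages = defaultdict(list)
for a, b in product((0, 1), repeat=2):
    preimages[or_gate_with_erasure(a, b)].append((a, b))

for final_state, inputs in sorted(preimages.items()):
    print(final_state, "<-", inputs)
```

Four input states lead to only two final states, and three of them share the same final state, so the map cannot be inverted from the bit values alone.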
But this does not mean that the entropy of the three bits has decreased, whether we talk about information entropy or thermodynamic entropy. The bit values are known both before and after, so there is no uncertainty about their state, and the information entropy of the corresponding probability distribution of bit values remains zero. And if the bit devices have the same thermodynamic entropy in both bit states (we assume the same temperature), then, provided the temperature does not change after the erasure, their thermodynamic entropy remains the same.
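To see the difference between counting logical states and information entropy, here is a minimal sketch that evaluates the Shannon entropy for two hypothetical distributions over the four input states. The helper name `shannon_entropy_bits` is illustrative. The uniform distribution is an assumption added only for contrast; it is the case that the $\log(4)$ count in the question implicitly describes.

```python
import math

def shannon_entropy_bits(probabilities):
    """Shannon entropy H = sum p * log2(1/p), in bits; p == 0 terms are skipped."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

# Bit values are known: all probability sits on one of the four input states.
known_inputs = [1.0, 0.0, 0.0, 0.0]
# Contrast case (assumption): nothing is known, all four states equally likely.
unknown_inputs = [0.25, 0.25, 0.25, 0.25]

print(shannon_entropy_bits(known_inputs))    # 0.0
print(shannon_entropy_bits(unknown_inputs))  # 2.0
```

When the bit values are known, the entropy is zero both before and after the gate, whatever the number of possible logical states.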
This means that the surroundings (the room the computer is in) must have a commensurate increase in entropy, in accordance with the 2nd Law of Thermodynamics, right?
A common argument for this (by Landauer?) works differently, with different assumptions. It is not based on the idea that the thermodynamic entropy of the bits has decreased, and it does not use the 2nd law.
Instead, it uses conservation of the number of states, or of phase volume (the Liouville theorem, valid in reversible systems). The argument goes like this.
Let the system of memory bits, gates and the rest of the computer be a complicated mechanical system that is mechanically reversible. This means that from the microstate at any time, we can retrodict the microstate at any earlier time. From this it is clear that bit values cannot be microstates: it is not possible to retrodict the past bit state from the present bit state, because the computation, although mechanically reversible, is not logically reversible. It does not keep all the necessary bit values; they get erased. So the bit values are actually an (incomplete) description of bit macrostates. One bit value (macrostate) can be compatible with many bit microstates.
Let each macrostate of a bit be compatible with $N$ microstates of the bit; let the macrostate of the rest of the computer before the erasure be compatible with $R$ microstates of the rest, and let the macrostate of the rest of the computer after the erasure be compatible with $R'$ microstates of the rest.
Before the erasure of a bit, irrespective of the bit value, the whole supersystem can be in $NR$ microstates, and the Boltzmann entropy is
$$
S = k_B\ln (NR).
$$
This is true whether the bit has value 0 or 1.
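After the erasure to a default value 1, the supersystem can be in $NR'$ states. This is assumed to be true for all initial microstates compatible with bit value 0 (there are $NR$ of them) and with bit value 1 (there are $NR$ of them too). Reversible dynamics cannot merge distinct microstates, so these $2NR$ initial microstates must end up in $2NR$ distinct final microstates. From the Liouville theorem, we have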
$$
2NR = NR'
$$
so $R' = 2R$. After the erasure to 1, the whole supersystem is in a macrostate with $2NR$ microstates, and thus has entropy
$$
S' = k_B\ln (2NR),
$$
which is greater than $S$ by $k_B\ln 2$ (an additive increase, since $\ln(2NR) = \ln 2 + \ln(NR)$).
So the argument is based on three things:
- the Liouville theorem holds for the supersystem (not necessarily 2nd law);
- the erasure of a bit ends up in the same macrostate of the bit and the same macrostate of the rest, regardless of microstate of the bit;
- all initial microstates of the whole supersystem follow the erasure as intended and end up in the erased state.
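The counting behind these assumptions can be checked on a toy model. This is a sketch under stated assumptions, not Landauer's own construction: the values of `N` and `R` are arbitrary small numbers, and the map `erase` is a hypothetical injective dynamics that pushes the old bit value into the microstate of the rest.

```python
import math

N = 3            # hypothetical: microstates compatible with one bit value
R = 5            # hypothetical: microstates of the rest before erasure
R_AFTER = 2 * R  # microstates of the rest after erasure, as the argument requires

def erase(bit_value, bit_micro, rest_micro):
    """Toy reversible erasure: the bit ends at value 1 and the old value is
    stored in the microstate of the rest, so no two inputs collide."""
    return (1, bit_micro, 2 * rest_micro + bit_value)

# All initial microstates, for both bit values: 2 * N * R of them.
initial = [(b, n, r) for b in (0, 1) for n in range(N) for r in range(R)]
final = {erase(b, n, r) for (b, n, r) in initial}

is_injective = len(final) == len(initial)       # no microstates were merged
fits = all(r < R_AFTER for (_, _, r) in final)  # R' = 2R gives enough room

entropy_before = math.log(N * R)      # S / k_B for a known bit value
entropy_after = math.log(len(final))  # S' / k_B for the erased macrostate
delta = entropy_after - entropy_before
print(is_injective, fits, delta, math.log(2))
```

With only $R$ microstates available to the rest after the erasure, the $2NR$ initial microstates could not all map to distinct final ones, which is the content of $R' = 2R$.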
The second assumption can be challenged: there may be multiple final macrostates of the rest of the computer; see the question "Is it necessary to assume that equivalent microstates cannot get transformed into inequivalent microstates to derive the Landauer principle?".
The third assumption is problematic too. We do not know that all microstates compatible with a computer macrostate follow the erasure procedure; based on experiments, we only know that a negligible fraction of them do. The fact that we observe the erasure to work correctly 1,000,000 times does not mean that all the zillions of microstates compatible with the initial computer macrostate will do so as well. It may well be that only a small fraction of the $2NR$ initial microstates obey the prescribed erasure plan. Then the assumption that there are $2NR$ microstates compatible with the erased macrostate would be wrong, and thermodynamic entropy would not have to increase.
The situation is similar to derivations of the necessity of entropy non-decrease from mechanics (e.g. Jaynes). These assume that if we observe that macrostate $A$ systematically evolves into macrostate $B$, then all microstates compatible with $A$ have to evolve into microstates compatible with $B$, and thus $B$ has to have equal or higher multiplicity. Besides the well-known problem with induction, the conclusion is known to be mathematically false in mechanics. There are microstates compatible with macrostate $A$ which will evolve in a surprising direction, into macrostates $B'$ with lower thermodynamic entropy. We do not observe these initial conditions or the entropy decrease, but mathematically they exist.
This means that the surroundings (the room the computer is in) must have a commensurate increase in entropy
This is why the room must heat up? I'm still confused as to where exactly that energy comes from... Since energy can't be created, for the room to heat up, something else must've lost the energy... Would this simply be the electrons in the wiring/transistors?
Resistive heating I can explain to almost anyone, by analogy of electrons jostling through wires. But as for why going from 2 bits of information to 1 should heat up the room, I struggle to put it simply.
The room need not heat up. Even if there is an increase in entropy (and there typically is, a much greater one), entropy can increase without any heat transfer and without heating the room. Generation of entropy does not require any heat. For example, expansion of a gas into a larger volume increases entropy, but this can be done without any heat transfer with the environment, and without an increase in the temperature of the gas.
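The gas example can be put into numbers. As a sketch, assume an ideal gas expanding freely into vacuum inside an insulated container (this specific setup is an assumption added for illustration): no heat is exchanged, the temperature stays the same, and the entropy change is $\Delta S = nR_{\mathrm{gas}}\ln(V_2/V_1)$. The function name below is hypothetical.

```python
import math

R_GAS = 8.314462618  # molar gas constant in J/(mol K)

def free_expansion_entropy_change(n_moles, v_initial, v_final):
    """Entropy change of an ideal gas in free expansion, in J/K.
    Assumes an insulated container: Q = 0 and the temperature is unchanged."""
    return n_moles * R_GAS * math.log(v_final / v_initial)

# One mole doubling its volume; only the volume ratio matters.
delta_S = free_expansion_entropy_change(1.0, 1.0, 2.0)
print(f"{delta_S:.2f} J/K")  # about 5.76 J/K, with zero heat transferred
```

The entropy goes up by roughly 5.76 J/K even though nothing was heated, which is the sense in which entropy generation does not require heat.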