As far as I remember, the activation functions in our university lecture (which we also called transfer functions) were always (non-strictly) ~~monotonous~~ monotonic (thanks to A.R. for the correction). Modulo is not a monotonic function.
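A minimal sketch of that last claim: sampling `x % m` over increasing `x` shows the values drop back to 0 at every multiple of `m`, so the function is neither non-decreasing nor non-increasing (the modulus `m = 3` is an arbitrary choice for illustration):

```python
m = 3
xs = list(range(10))
vals = [x % m for x in xs]  # 0, 1, 2, 0, 1, 2, ...

# A (non-strictly) monotonic non-decreasing function would need
# vals[i] <= vals[i + 1] everywhere; the wrap-around breaks this.
is_nondecreasing = all(vals[i] <= vals[i + 1] for i in range(len(vals) - 1))
is_nonincreasing = all(vals[i] >= vals[i + 1] for i in range(len(vals) - 1))

print(is_nondecreasing or is_nonincreasing)  # False: e.g. 2 % 3 = 2 but 3 % 3 = 0
```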
The single-layer perceptron (which we also called a connectionist neuron) is indeed incapable of XOR classification if we have a monotonic activation function, because in that case the binary decision boundary is a single straight line or hyperplane in space (a ternary classification would give two parallel hyperplanes, etc.), and the activation function is necessarily a Heaviside step function, an edge detector.
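This can be seen empirically: a hand-rolled single-layer perceptron with a Heaviside step activation, trained with the classic perceptron learning rule on the four XOR points, never classifies more than 3 of the 4 points correctly, because no single line separates the two diagonals (the learning rate and epoch count below are arbitrary illustrative choices):

```python
def step(z):
    """Heaviside threshold: the edge-detector activation."""
    return 1 if z >= 0 else 0

# XOR truth table: diagonal corners share a class.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

w = [0.0, 0.0]
b = 0.0
lr = 0.1
best_acc = 0.0
for epoch in range(100):
    for (x1, x2), t in data:
        y = step(w[0] * x1 + w[1] * x2 + b)
        err = t - y
        w[0] += lr * err * x1
        w[1] += lr * err * x2
        b += lr * err
    acc = sum(step(w[0] * x1 + w[1] * x2 + b) == t
              for (x1, x2), t in data) / len(data)
    best_acc = max(best_acc, acc)

print(best_acc)  # at most 0.75: 3 of 4 points, never all 4
```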
We did not formally define XOR classification, but we took it to mean the capability of dividing space such that 4 disjoint subsets of space are assigned two classes, with diagonally opposite subsets belonging to the same class. (It would make sense to define XOR as the 4 quadrants of a 2D space.) Between every two subsets of the same class there is one subset of the opposite class. Whenever two same-class subsets lie in one half-space, a third subset of the opposite class will lie in that half-space too, so we need at least a second linear classifier in addition to separate both classes in each half.
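The "second linear classifier" can be made concrete with two parallel separators combined by one more linear threshold unit. The specific lines `x + y >= 0.5` and `x + y >= 1.5` below are one hand-picked choice of many; the output unit fires exactly for points in the band between them:

```python
def step(z):
    """Heaviside threshold."""
    return 1 if z >= 0 else 0

def xor_net(x, y):
    h1 = step(x + y - 0.5)      # first separator: at least one input active
    h2 = step(x + y - 1.5)      # second, parallel separator: both inputs active
    return step(h1 - h2 - 0.5)  # fires only between the two parallel lines

for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, y, "->", xor_net(x, y))  # 0, 1, 1, 0
```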
By the way, modulo is not a binary classifier and therefore cannot be considered an XOR or any other binary classification. Binary classifiers produce binary output for real-valued input vectors, and "mod x" certainly has a ~~domain~~ real-valued range (thanks to John Madden for the correction) from 0 (inclusive) to "x" (exclusive).
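A quick check of that range claim (note this uses Python's `%`, which follows floor division, so for a positive modulus the result lands in `[0, m)` even for negative inputs; the sample values are arbitrary):

```python
m = 5.0
samples = [-7.3, -0.1, 0.0, 1.5, 4.99, 12.25]
remainders = [s % m for s in samples]

# Every remainder is real-valued and lies in [0, m) -- not a binary output.
print(all(0 <= r < m for r in remainders))  # True
```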