If I read a book, I can see with my inner eye (or hear with my inner ear) the plot developing. With a random string of letters, with random spacings, this isn't the case. Then why does the former contain less information than the latter?
5 Answers
Firstly, the distinction between information and noise boils down to known coding vs. unknown (or useless) coding of information. So-called 'noise' is just information that we don't know the coding for, don't care about, or both--like information on the particular configuration of the groups of particles (temperature) comprising a transmission line. Normally we'd call that 'noise'. Whether a given piece of 'noise' information is even theoretically decodable (has a meaning that can be resolved, whether we want to know it or not) is the subject of another conversation.
Now I might decide that the letter 'W' represents the entire works of Tolstoy. But somewhere those works must be recorded for that to have meaning: in the original data (as a stack of books), in the compression dictionary, in our brains, or some combination thereof. For example, I might see the uncompressed word 'tomorrow' on a page of a book. Most of that word's information must be in my brain, in the form of a description of that set of letters and a detailed description of what the word means. So 'information' is a measure of the total data needed to describe a concept, not just the encoded symbol for that concept.
Your list of random letters might have been generated by a pseudo-random algorithm given certain input. So if you know that algorithm (i.e., it's in your compression dictionary), then you could resolve the input of that algorithm that generated that 'noise', and thus it's no longer 'noise' in the theoretical sense.
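To make that last point concrete, here is a minimal Python sketch (the generator function, seed value, and string length are purely illustrative): a string that looks like gibberish can be regenerated from a tiny description, provided you know the generating algorithm.

```python
import random
import string

def pseudo_random_letters(seed, length):
    """Deterministically generate 'noise' from a short seed."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + ' '
    return ''.join(rng.choice(alphabet) for _ in range(length))

# 1000 characters of apparent gibberish...
noise = pseudo_random_letters(seed=42, length=1000)
print(noise[:40])

# ...yet anyone who knows the algorithm can reproduce it exactly from
# just the pair (42, 1000), so for them it is no longer 'noise'.
assert noise == pseudo_random_letters(seed=42, length=1000)
```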
Because you cannot compress the string of random letters, whereas you can compress words from a specific language. Example: "axbheodpiurt hspe dirbe siwoebx" versus "one car one man one universe". The first string cannot be written as anything shorter than the sequence of about 30 letters and spaces itself. The second string can be written as "1 car 1 man 1 universe", a total of about 23 letters and spaces, and 23 is less than 30. Of course there has to be an index saying one = 1, but in a book the word "one" will most likely occur many times, so reducing "one" to "1" pays off every time "one" is used.
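If you want to see this numerically, here is a rough sketch using Python's zlib as a stand-in for an ideal compressor (the sample strings are just illustrations, and zlib is a practical compressor, not a theoretical optimum):

```python
import random
import string
import zlib

english = b"one car one man one universe " * 40           # repetitive English-like text
rng = random.Random(0)
noise = ''.join(rng.choice(string.ascii_lowercase + ' ')
                for _ in range(len(english))).encode()     # random letters, same length

print(len(english), '->', len(zlib.compress(english, 9)))  # shrinks dramatically
print(len(noise),   '->', len(zlib.compress(noise, 9)))    # compresses far less: each letter is nearly unpredictable
```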
I ran out of room before completing my comment above, which I cut-and-pasted below, and then continued below that...
Google "Kolmogorov complexity" (or "algorithmic complexity", and note that G.Chaitin has written lots about this) for much additional info along the lines described here. And note that @NeuroFuzzy seems to have, very logically, misread your question which explicitly asks why the non-random string contains more info. And actually, you're wrong, the random string contains more, like everybody said. But the combination (maybe tensor product) of the string's info with info already in your mind is what counts here. And the non-random string, acting as "input", generates much more "output"
...to continue, the preceding input$\to$output analogy is a bit too simplistic, which is why I need more words here. A more accurate analogy is between a Turing machine state and a "brain state". An input string, random or non-random, that you read, takes your brain word-by-word from some initial state to some final state. Reading random words/strings doesn't take your brain much of anywhere, semantically speaking. That is, the "meaning" of your final brain state contains no more information than your initial state. On the other hand, a non-random string takes your brain to a final state with additional semantic meaning.
"Meaning" seems vague, but is actually mathematically precise in Domain Theory, where Denotational Semantics provides what's called a semantic function from syntax (our random or non-random strings) to semantics. And domain elements are characterized by a poset-like ordering that measures information content. But you'd pretty much need a textbook's worth of discussion for a reasonably complete picture. If interested, maybe try https://en.wikipedia.org/wiki/Denotational_semantics for a start.
You might think the written book has more information than the string of letters and spaces. But that's only because you happen to have personal opinions on what is useful information (words) and what is useless information (random letters and spaces). But say you don't have a preference for dictionary words over random words, and look at it from a different point of view. Imagine you were trying to memorize a book vs. a string of random letters and spaces. You'd have an easier time with the book, because it is built from fewer distinct arrangements of letters. It is in this sense that the book is considered to have less information: the book is easier to store (remember) than the random string of letters.
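One way to put a number on "easier to store" is the empirical Shannon entropy per character. Here is a small sketch (the sample sentence is just an illustration, and only single-character frequencies are counted):

```python
import math
import random
import string
from collections import Counter

def entropy_per_char(s):
    """Empirical Shannon entropy in bits per character."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

book_like = "one car one man one universe one more car one more man " * 20
rng = random.Random(0)
gibberish = ''.join(rng.choice(string.ascii_lowercase + ' ')
                    for _ in range(len(book_like)))

print(entropy_per_char(book_like))   # noticeably lower: letter frequencies are highly skewed
print(entropy_per_char(gibberish))   # close to log2(27) ~ 4.75 bits/char for a long sample
```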
The English language contains about 170,000 words. Some words occur more frequently than others. So if we assign the number 0 to the spaces between words (which doesn't reduce the information), the number 1 to the most frequently used word, the number 2 to the second most frequently used one, etc., until we arrive at the least frequently used word, to which we assign the number 170,000, we can replace the string of words by a string of decimal numbers (with a space between them) varying from 0 to 170,000, which contains fewer characters (decimal digits) than the original text has alphabetic characters. For simplicity, I have omitted the grammar (inflected forms), which gives a larger number of word forms (I am, you are, he/she is, etc.).
Of course, you need a "dictionary" in which you can look up which number is assigned to each word, and because of that the text will not be easy to read. But the number of characters is reduced, and so is the information.
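Here is a toy version of that scheme in Python (the sample text and the resulting ranks are purely illustrative): each word is replaced by its frequency rank, with 0 standing for the space between words.

```python
from collections import Counter

text = "one car one man one universe one more car"
words = text.split()

# Most frequent word gets rank 1, the next gets 2, and so on.
ranks = {w: i + 1 for i, (w, _) in enumerate(Counter(words).most_common())}

encoded = " 0 ".join(str(ranks[w]) for w in words)
print(ranks)    # the 'dictionary' you need in order to decode the message
print(encoded)  # 1 0 2 0 1 0 3 0 1 0 4 0 1 0 5 0 2
```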
The meaning of the words (for example the associations I have in my mind when reading the word "woman") is not information inherent in the written text, but (literally) contextual information, which doesn't count as an objective measure.