Binary

T'ai Chi · Oct 25, 2003

If a string, of finite length, of 1's and 0's is constructed randomly, what is the probability that when the string is converted in English, that it contains a real word? Forms a coherent sentence?

Would these probabilities change if the string length was infinite?

Skeptoid · Oct 25, 2003

11000101100100100011110101001111110100101100110010
10000010110111010110000011011011111011010000110101
01100101011010111111110111011010111101000001101000.

100101100.

Cabbage · Oct 25, 2003

For the finite case, I'm not sure. For one thing, how are you translating from binary to alphabetical? There are lots of different ways to do this, and I expect you would probably get different probabilities depending on your translation method. Also, you would have to know things like, What percentage of four letter (alphabetical) strings are actual words? I have no idea offhand.

For the infinite binary string, the chances are 100%. There are such things known as normal numbers, which are real numbers which have every possible finite string of digits (in a particular base) occuring in them (and occuring with the expected frequency associated with a uniformly random distribution). Then there are absolutely normal numbers, which are normal in every base.

What that means is that a number normal in binary (and any absolutely normal number, as well) will have not only words "encoded" in their digits (regardless of whatever encoding system you may use), but also the complete works of Shakespeare (in chronological order), critical reviews of all of Shakespeare's works, all of the various religious texts (and all various versions of each), and, for that matter, any work of literature ever created (and to be created, on this planet and others). For that matter, if you encoded music or images (or anything else) in binary, then all possible pieces of music (and all possible pictures) would be encoded into a single binary normal number.

To my knowledge, we've never found an absolutely normal number (though it's strongly suspected that pi is absolutely normal, this has never been proven). The surprising thing, however, is that almost all of the real numbers are absolutely normal ("almost all" meaning the set of numbers not absolutely normal has Lebesgue measure zero). So, in the Lebesgue measure/probability sense, if you pick a binary number between 0 and 1, you have a perfect 100% probability of picking a number in which is encoded all possible (finitely encodable) information.

T'ai Chi · Oct 25, 2003

Cabbage said:
To my knowledge, we've never found an absolutely normal number (though it's strongly suspected that pi is absolutely normal, this has never been proven).

Actually, I do believe we do know some normal numbers, although, aritifically constructed ones like the Champernowne constant and the Copeland-Erdos constant. I'm sure there are several others that have been constructed to be normal.

I think it is only speculation that the sqrt

, for n=2,3,5,6,7,8, etc. is normal.

(edited to remove my proposed "normal" number 12345678901234567890123... . For some reason I was thinking that a normal number is simply one where only the digits 0-9 appeared with equal frequency.)

Cabbage · Oct 25, 2003

The number .12345678901234567890123... can't be normal because the digits keep repeating in the same order--for example, the string "21" never occurs, so it can't be normal.

Champernowne's constant (0.1234567891011121314151617181920212223... from concatenating the base 10 positive integers) and the Copeland-Erdos constant (0.235711131719232931... from concatenating the base 10 representations of the primes) are both known to be normal in base 10, but to my knowlede, neither has been shown to be normal in every base, which is required for a number to be absolutely normal.

Ziggurat · Oct 25, 2003

There are 10 kinds of people in the world:
those who can count in binary,
and those who can't.

epepke · Oct 25, 2003

I offer the following program. I noticed that a lot of people claim that there are secret messages hidden in the digits of pi or the Bible or whatever. I wrote this as an object lesson in just how easy it is to play these games. It attempts to find a hidden message in the digits of pi.

It can provide literally minutes of mild amusement.

Code:

/* fpi.c
   Pialyzer

   Copyright (C) Eric Pepke 2003

   Attempts to find a string in the digits of pi.  For entertainment
   purposes only.

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License
   as published by the Free Software Foundation version 2.

   This program is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   GNU General Public License for more details.  You can find this
   at [url]www.gnu.org[/url]

Compiling

   fpi.c is written in simple, unassuming ANSI C and should
   compile on any compiler worth its salt.

   Requires a file called "pi" containing some digits of pi as
   ASCII digits '0'-'9' with nothing else (no whitespace, line
   breaks, etc.)  I use a file with the first 50 million digits
   of pi.  These are freely downloadable from several sources;
   do a quick Web search to find some.  This file must be in
   the current directory for fpi to find it.

   Three constants at the beginning of the program can be
   modified.  

   FN is the name of the file containing the digits.

   NDIGITS is the number of digits in the file.

   MAXDIFF is the maximum number of different characters.  
   Values from the size of a typical alphabet (say 32 to 
   handle 26 letters plus a few punctuation marks) up to 
   100 are permitted.  Lower numbers generally increase 
   the odds of finding matches, but not always, and not
   by much.

   PROGNAME is the name of the executable; it's included
   as a constant because some systems don't handle argv[0]
   properly.

Using

   Run the program from the command like, including a single
   quoted argument containing the string to look for.

   For example, ./fpi "Eric Pepke" finds my name at position
   243606. 

Algorithm

   The algorithm tries to match characters to subsequent pairs
   of digits, even- or odd-aligned against a large but not
   exhaustive number of possible substitution ciphers.  The
   number of possible ciphers is MAXDIFF!, but there is also
   a reduncancy factor of 100/MAXDIFF.  Case is ignored, but
   whitespace is significant.

*/

#include <stdio.h>
#include <stdlib.h>
#include <memory.h>

#define FN "pi"                 /* Name of file with pi, better be NDIGITS */
#define NDIGITS 50000000        /* Number of digits of pi */
#define MAXDIFF 100             /* Max # different characters, up to 100 */
#define PROGNAME "fpi"          /* Interactive name of the program */

int main (int argc, char *argv[])
{
    char *s;            /* String to test */
    int sLen;           /* Length of s */
    char *r;            /* Runner for string */
    char used[128];     /* Number of times each char is used */
    int k;
    int n;              /* Number of different characters */
    FILE *in;           /* Input file */

    char *buffer;       /* Input buffer for pi */

    int inputToChar[MAXDIFF];   /* Dynamic mapping of input to result */

    /* Legal stuff */
    printf("Pialyzer, Copyright (C) 2003 Eric Pepke.  ABSOLUTELY NO WARRANTY.\n");
    printf("See version 2 of the GNU Public License for detals ([url]www.gnu.org[/url]).\n\n");

    /* Check to see if there's an arg */
    if (argc != 2)
    {
        printf("Usage: " PROGNAME " inputString\n", argv[1]);
        exit(0);
    }

    s = argv[1];

    /* Reject 0- or 1-length strings, trivial case and might break the alg. */    
    if (strlen(s) == 1)
    {
        fprintf(stderr, PROGNAME ": Input string must have at least 2 characters.\n");
        exit(2);
    }
    else
    {
        sLen = strlen(s);
    }

    if (sLen > NDIGITS)
    {
        fprintf(stderr, PROGNAME ": This is ludicrous.\n");
        exit(1);
    }

    /* Make the input buffer */
    buffer = malloc(sLen * 2);
    if (!buffer)
    {
        fprintf(stderr, PROGNAME ": Out of memory\n");
        exit(1);
    }

    /* Clear the used array */
    for (k = 0; k < 128; ++k) used[k] = 0;

    /* Count the characters */
    for (r = s; *r; ++r)
    {
        char c = toupper(*r);
        ++used[c & 0x7F];
    }
    n = 0;
    for (k = 0; k < 128; ++k)
    {
        if (used[k])
        {
            ++n;
        }
    }

    /* See if too many different characters */
    if (n > MAXDIFF)    
    {
        fprintf(stderr, PROGNAME ": Too many different characters\n");
        exit(1);
    }

    /* Open the file */
    in = fopen(FN, "r");
    if (in)
    {
        /* Fill the buffer */
        for (k = 0; k < sLen * 2; ++k)
        {
            buffer[k] = getc(in);
        }

        /* Now go through the file */
        for (; k < NDIGITS; ++k)
        {
            int i;

            /* Clear the input to result array */
            for (i = 0; i < MAXDIFF; ++i)
            {
                inputToChar[i] = '\0';
            }

            /* Clear the used array */
            for (i = 0; i < sLen; ++i)
            {
                used[toupper(s[i])] = 0;
            }

            /* Now go through the result */
            for (i = 0; i < sLen; ++i)
            {
                int comb = (buffer[i * 2] - '0') * 10 +
                           (buffer[i * 2 + 1] - '0');
                comb = comb % MAXDIFF;
                if (inputToChar[comb] == '\0')
                {
                    if (used[toupper(s[i])])
                    {
                        break;
                    }
                    inputToChar[comb] = toupper(s[i]);
                    used[toupper(s[i])] = 1;
                }
                if (inputToChar[comb] != toupper(s[i]))
                {
                    break;
                }
            }
            if (i >= sLen)
            {
                printf("String found at position %d\n", k - sLen * 2);
                exit(1);
            }

            /* Shift the buffer and move over.  (I know a circular list
                would be faster, but I don't care.) */
            for (i = 0; i < sLen * 2 - 1; ++i)
            {
                buffer[i] = buffer[i + 1];
            }

            /* Read the next digit in the file */
            {
                int c;
                c = getc(in);
                if (c < 0)
                {
                    fprintf(stderr, PROGNAME ": Ran off end of file\n");
                    exit(1);
                }
                buffer[sLen * 2 - 1] = c;
            }
        }
        fclose(in);
    }
    else
    {
        fprintf(stderr, PROGNAME ": Couldn't open %s\n", FN);
    }

    free(buffer);
}

T'ai Chi · Oct 25, 2003

binary trick

There's a silly math magic trick one can do.

Ask someone to write down any number of circles (in a staight line) without you seeing how many they write down.

Tell them, with them still covering up their circles, that you will write down the number of possible outcomes based on the number of coins they drew. Of course it will be super duper amazing because you don't know how many coins they drew.

For example, if they were to draw 1 coin, there would be 2 possible outcomes. If they were to draw 2 coins, there would be 4 possible outcomes. Etc. You might see where this is going...

With their hand still covering up their circles, simply draw a 1 to their left of the first coin they drew.

The resulting number will, in binary.., be the number of possible outcomes based on the number of coins they drew.

Solitaire · Oct 26, 2003

T'ai Chi said:
If a string, of finite length, of 1's and 0's is constructed randomly,
what is the probability that when the string is converted in English,
that it contains a real word? Forms a coherent sentence?

Complex. The string only need be on bit long if the assignment
function is "Word[0] = Andriod" and "Word[1] = Ibex". Of course,
if you like the eight bit letter of ASCII such as "A" or "I", then each
one will occure most probably in a sequence of 256 random bits.
A two letter word such as "An" or "It" will occure once in a sequence
of 65536 bits, likewise a three letter word will occure most likely in
a sequence of 16777216 bits, and so on.

P.S. This is close to random.

BillHoyt · Oct 26, 2003

F'umble Whi'd

Anybody else notice tr'oll chi / whodini has ignored the call for the missing information in his first troll? Whereupon he switched topics, and fumbled that troll as well? When the fumble was called, he corrected only part of it. Whereupon he switched topics again?

T'ai Chi · Oct 26, 2003

Ziggurat said:
There are 10 kinds of people in the world:
those who can count in binary,
and those who can't.

LOL, I love that one.

Binary

T'ai Chi

Penultimate Amazing

Skeptoid

Graduate Poster

Cabbage

Banned

T'ai Chi

Penultimate Amazing

Cabbage

Banned

Ziggurat

Penultimate Amazing

epepke

Philosopher

T'ai Chi

Penultimate Amazing

Solitaire

Neoclinus blanchardi

BillHoyt

Unregistered

T'ai Chi

Penultimate Amazing