More Shakespearean Monkeys


 

While I was in high school, I attended a lecture at Yale University in 1974 about monkeys at a typewriter by William R. Bennet, Jr, one of the inventors of the helium-neon laser.

Quote:

He was also frequently seen at various sites around the Yale campus collecting data for his popular course on "The Computer as a Research Tool." For this course he was named one of the 10 best professors at Yale for many years in a row. His lectures in that course were multi-media events and included demonstrations of firestorms, removal of warts by laser, calculations of how long it would take monkeys sitting at the typewriter to produce phrases recognized from great works of literature, and comparisons of the sound waveforms of the French horn and the garden hose

Keep in mind this was the day of the punch card and paper tape, with the Yale Computer center boasting a computer with 512K RAM! (See this link for a program I wrote for it).

if you have a room full of monkeys, each banging on a typewriter, eventually you’ll get all of Shakespeare’s works. Using a random number generator to generate characters simulates this process with fewer bananas required.

One way to imagine this: an ordinary typewriter has 26 letter keys of equal size. If you made the sizes proportional to their frequency, then you can get more “intelligible” random text. So the ”Q” key would be much smaller than the “E” key.

We can take this a step further by creating multiple keys for 2 letter combinations. There would be a “th” key but no “qz” key because no English language word has  “qz”. This would result in a  maximum of 26^2 =676 keys.

Why limit it to a length of 2? Let’s try 15 letter keys, for which we’d need 26^15 = 1.7E21, a huge number of keys. Thankfully, most of those 15 letter sequences are not necessary.

The program below takes as input some sample text and creates a typewriter with keys of the specified length. If lenPattern = 2, then we have a dictionary of a few hundred entries for Hamlet. (it includes some non-letter characters for embellishment)

For War and Peace (by Leo Tolstoy) and a pattern length of 15, we have almost 3 million keys. That would be a very big typewriter indeed, but just takes some computer memory.

I wrote a program in FoxPro several years ago to simulate the monkeys.

Solving the same problem in C# requires a different approach from FoxPro. C# doesn’t have a built in database that FoxPro has, and using a Dictionary is not the same. My first attempt was fine for Hamlet Act III, but for War And Peace,  and 15 letter pattern lengths, it was taking about 1 seconds per character. See if you can make the program go faster with some minor changes.

Try the C# code below. It includes Hamlet’s famous “To be or not to be” soliloquy from Shakespeare’s Hamlet, Act III.

You can also feed it various texts. Try WarAndPeace (link below) different languages, such as Russian, German, or even try computer programming language text.

I was thinking about porting the code to run on an Arduino or other microcomputer attached to an LCD screen. If I were to do this, I’d have to write my own Dictionary (implement it via a HashTable), which isn’t so hard, but making it generic would be more difficult. Then putting that and the code to write the text to the screen in Arduino memory means lots of memory consumed, way more than SRAM available on an Arduino.

See also

https://blogs.msdn.microsoft.com/vsdata/2004/03/03/calvin-monkeys-tolstoy-and-shakespeare-by-yag/

http://www.calvinhsia.com/default.asp?page=link&file=vfp%5Cmonkeys.htm

WarAndPeace.txt

<foxcode>

SET TALK off
CLEAR ALL
CLEAR
nPatLen=18
CREATE CURSOR letters (chrs c(nPatLen), cnt i)
INDEX ON chrs tag chrs
fd=FOPEN("f:\calvinhsia.com\vfp\warandpeace.txt")
*fd=FOPEN("..\genmenu.prg")
?fd
IF fd <0
      ?"err opening file"
      RETURN
ENDIF
cprior=""
nCnt=0
do while !FEOF(fd) and !CHRSAW() and nCnt<100000
      ch=FREAD(fd,1)
      nCnt=nCnt+1
      nAsc=ASC(LOWER(ch))
      IF (nAsc>96 and nAsc <=122) or nAsc=32 or nasc=13 or nasc=10
            IF LEN(cprior) >= nPatLen
                  cprior=SUBSTR(cprior,2)+ch
                  IF SEEK(cprior)
                        REPLACE cnt with cnt+1
                  ELSE
                        INSERT into letters values (cprior,1)
                  ENDIF
            ELSE
                  cprior=cprior+ch
            ENDIF
      *     ?cprior
      ENDIF
ENDDO
 
 
FCLOSE(fd)
LOCATE
BROWSE last nowa
ACTIVATE WINDOW (PROGRAM())
RAND(1)
RAND()
GO INT(RAND()*RECCOUNT()+1)
cPrior=chrs
 
SET ALTERNATE TO t
SET ALTERNATE on
do while !CHRSAW()
      cLast=SUBSTR(cPrior,2)
*     cLast=RIGHT(cPrior,1)
      SEEK cLast
      SUM cnt while chrs = cLast to nCnt
      nr=INT(RAND()*nCnt+1)
      SEEK cLast
      SCAN while nr >0
            nr=nr-cnt
      ENDSCAN
      SKIP -1
      cNewLet=RIGHT(chrs,1)
      ??cNewLet
      cPrior=SUBSTR(cPrior,2)+cNewLet
ENDDO
SET ALTERNATE to

INKEY()

</foxcode>

<C# Code>


using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace Monkey
{
    class Program
    {
        static string txtHamlet =
@"To be, or not to be--that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune
Or to take arms against a sea of troubles
And by opposing end them. To die, to sleep--
No more--and by a sleep to say we end
The heartache, and the thousand natural shocks
That flesh is heir to. 'Tis a consummation
Devoutly to be wished. To die, to sleep--
To sleep--perchance to dream: ay, there's the rub,
For in that sleep of death what dreams may come
When we have shuffled off this mortal coil,
Must give us pause. There's the respect
That makes calamity of so long life.
For who would bear the whips and scorns of time,
Th' oppressor's wrong, the proud man's contumely
The pangs of despised love, the law's delay,
The insolence of office, and the spurns
That patient merit of th' unworthy takes,
When he himself might his quietus make
With a bare bodkin? Who would fardels bear,
To grunt and sweat under a weary life,
But that the dread of something after death,
The undiscovered country, from whose bourn
No traveller returns, puzzles the will,
And makes us rather bear those ills we have
Than fly to others that we know not of?
Thus conscience does make cowards of us all,
And thus the native hue of resolution
Is sicklied o'er with the pale cast of thought,
And enterprise of great pitch and moment
With this regard their currents turn awry
And lose the name of action. -- Soft you now,
The fair Ophelia! -- Nymph, in thy orisons
Be all my sins remembered.
";
        Dictionary<string, int> _dictData;
        Random _rand = new Random(1);
        int _maxLen = -1;
        int _lenPattern = 2;
        string _txt = txtHamlet;

        static void Main(string[] args)
        {
            // create a new instance of "Program", minimizing
            // the use of statics and making it 
            // much more maintainable and testable
            var prog = new Program();
            prog.DoMain(args);
        }

        internal void DoMain(string[] args)
        {
            ProcessArgs(args);
            var sw = new Stopwatch();
            sw.Start();
            FillDictionaryFromReadingText();
            sw.Stop();
            Console.WriteLine($"Filled dictionary in {sw.Elapsed.TotalSeconds:n2} seconds");

            Console.WriteLine($"Txt len = {_txt.Length:n0} PatLen = {_lenPattern} DictSize = {_dictData.Count:n0}");
            sw.Restart();
            int nChars = GenerateRandomText();
            sw.Stop();
            if (nChars > 0)
            {
                Console.WriteLine($"Generated {nChars} characters at the rate of {(nChars / sw.Elapsed.TotalSeconds):n2} chars per second");
            }
        }
        private void ProcessArgs(string[] args)
        {
            for (int i = 0; i < args.Length; i++)
            {
                switch (args[i][0])
                {
                    case '/':
                    case '-':
                        if (args[i].Length > 1)
                        {
                            switch (args[i][1])
                            {
                                case 'f':
                                    if (++i < args.Length)
                                    {
                                        _txt = System.IO.File.ReadAllText(args[i]);
                                        Console.WriteLine($"Filename {args[i]}");
                                    }
                                    break;
                                case 'm':
                                    if (++i < args.Length)
                                    {
                                        if (int.TryParse(args[i], out _maxLen))
                                        {
                                            Console.WriteLine($"Max len = {_maxLen}");
                                        }
                                    }
                                    break;
                                case 'l':
                                    if (++i < args.Length)
                                    {
                                        if (int.TryParse(args[i], out _lenPattern))
                                        {
                                            Console.WriteLine($"Pat len = {_lenPattern}");
                                        }
                                    }
                                    break;
                            }
                        }
                        break;
                }
            }
            if (_maxLen > 0 && _txt.Length > _maxLen)
            {
                _txt = _txt.Substring(0, _maxLen - 1);
            }
            //_txt = "the quick brown fox jumps over the lazy dog";
            //_lenPattern = 3;
        }

        private void FillDictionaryFromReadingText()
        {
            _dictData = new Dictionary<string, int>();
            var txtLen = _txt.Length;
            var strPrior = " ";
            for (int i = 0; i < txtLen; i++)
            {
                var chrNew = char.ToLower(_txt[i]);
                strPrior += chrNew;
                if (strPrior.Length < _lenPattern)
                {
                    continue;

                }
                if (_dictData.ContainsKey(strPrior))
                {
                    _dictData[strPrior] += 1;
                }
                else
                {
                    _dictData[strPrior] = 1;
                }
                // lop off first char
                strPrior = strPrior.Substring(1);
            }
        }

        private int GenerateRandomText()
        {
            var strPrior = string.Empty;
            int nChars;
            for (nChars = 0; nChars < 1000000 && !Console.KeyAvailable; nChars++)
            {
                if (string.IsNullOrEmpty(strPrior))
                {
                    // we need some sort of initial prior string
                    // so we'll take it from a random entry in the data
                    int numSkip = _rand.Next(Math.Min(10000, _dictData.Count));
                    if (_dictData.Count < 30)
                    {
                        // when testing with very few items
                        numSkip = 0;
                    }
                    strPrior = _dictData
                        .Skip(numSkip)
                        // look for one that is preceded by " "
                        .SkipWhile(
                            d => !d.Key.StartsWith(" "))
                        .First()
                        .Key
                        .Substring(_lenPattern - 1);
                    // we need to print the new prior string. 
                    // indicates it's a restart by using Upper case
                    Console.Write(strPrior.ToUpper());

                }
                // first we find the sum of all the items that start with strPrior
                int sum = _dictData
                    .Where(
                    d => d.Key.StartsWith(strPrior))
                    .Sum(d => d.Value);
                if (sum == 0) // last sequence in text
                {
                    strPrior = " ";
                    continue;
                }
                // now we get a random number from 0 to the sum
                int targetRand = _rand.Next(sum + 1);
                var target = _dictData
                    .Where(d => d.Key.StartsWith(strPrior))
                    // we subtract each entry's value until we reach 0
                    .Where(d => (targetRand -= d.Value) <= 0)
                    .FirstOrDefault();
                var newchar = target.Key.Substring(_lenPattern - 1);
                strPrior = target.Key.Substring(1, _lenPattern - 1);
                Console.Write(newchar);
            }
            Console.WriteLine();
            return nChars;
        }
    }
}

</C# Code>

Comments (0)

Skip to main content