Performance Quiz #6 — Chinese/English Dictionary reader


Raymond Chen is running a series of articles about how to build a Chinese/English dictionary and optimize its startup time.



Truth be told, I got a look at his article quite some time ago, as he was kind enough to ask me for comments well in advance.  At the time I couldn’t resist doing a managed version of the same program to see how it would do.  So I encourage you to watch as Raymond works through the various steps of optimizing his program and see how it comes along.


This managed code is a line-for-line conversion of his initial program, done in the dumbest possible way, with no attempt whatsoever to optimize anything.


And then, the question of the hour:  How does Raymond’s program fare vs. the equivalent managed code below?


Feel free to comment on the code, the problem, or just the unfairness of it all, but please don’t accuse me of concluding too much from the result of just this one benchmark :) :)


using System;
using System.IO;
using System.Text;
using System.Collections;


namespace NS
{
    class Test
    {
        [System.Runtime.InteropServices.DllImport("Kernel32.dll")]
        private static extern bool QueryPerformanceCounter(out long lpPerformanceCount);


        [System.Runtime.InteropServices.DllImport("Kernel32.dll")]
        private static extern bool QueryPerformanceFrequency(out long lpFrequency);


        static void Main(string[] args)
        {
            long startTime, endTime, freq;


            QueryPerformanceFrequency(out freq);
            QueryPerformanceCounter(out startTime);


            Dictionary dict = new Dictionary();        


            QueryPerformanceCounter(out endTime);


            Console.WriteLine("Length: {0}", dict.Length());
            Console.WriteLine("frequency: {0:n0}", freq);
            Console.WriteLine("time: {0:n5}s", (endTime - startTime)/(double)freq);
        }


        class DictionaryEntry
        {
            private string trad;
            private string pinyin;
            private string english;


            static public DictionaryEntry Parse(string line)
            {
                DictionaryEntry de = new DictionaryEntry();
               
                // Traditional characters: everything up to the first space.
                int start = 0;
                int end = line.IndexOf(' ', start);

                if (end == -1) return null;
                de.trad = line.Substring(start, end - start);

                // Pinyin: the text between '[' and ']'.
                start = line.IndexOf('[', end);
                if (start == -1) return null;

                end = line.IndexOf(']', ++start);

                if (end == -1) return null;

                de.pinyin = line.Substring(start, end - start);


                // English: from the first '/' after the pinyin to the last '/' on the line.
                start = line.IndexOf('/', end);

                if (start == -1) return null;
                start++;

                end = line.LastIndexOf('/');
                if (end == -1) return null;
                if (end <= start) return null;

                de.english = line.Substring(start, end - start);


                return de;
            }
        };


        class Dictionary
        {
            ArrayList dict;
           
            public Dictionary()
            {
                StreamReader src = new StreamReader(
                    "cedict.b5",
                    System.Text.Encoding.GetEncoding(950));
                string s;
                DictionaryEntry de;
                dict = new ArrayList();


                while ((s = src.ReadLine()) != null)
                {
                    if (s.Length > 0 && s[0] != '#') {
                        if (null != (de = DictionaryEntry.Parse(s))) {
                            dict.Add(de);
                        }
                    }
                }
            }


            public int Length() { return dict.Count; }      
        };        
    }
}
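
To make the index arithmetic in Parse a little more concrete, here is a small standalone sketch that runs the same steps over one sample line and returns the three fields. The class name ParseDemo, the method Split, and the sample line itself are made up for illustration; the line is only assumed to follow the "characters [pinyin] /english/" layout that Parse looks for, and is not taken from the dictionary file.

```csharp
using System;

// Hypothetical helper for illustration only; not part of the benchmark program.
static class ParseDemo
{
    // Mirrors the steps of DictionaryEntry.Parse, but returns the fields
    // as { trad, pinyin, english } so they can be inspected.
    public static string[] Split(string line)
    {
        // Characters: everything up to the first space.
        int start = 0;
        int end = line.IndexOf(' ', start);
        if (end == -1) return null;
        string trad = line.Substring(start, end - start);

        // Pinyin: the text between '[' and ']'.
        start = line.IndexOf('[', end);
        if (start == -1) return null;
        end = line.IndexOf(']', ++start);
        if (end == -1) return null;
        string pinyin = line.Substring(start, end - start);

        // English: from the first '/' after the pinyin to the last '/' on the line.
        start = line.IndexOf('/', end);
        if (start == -1) return null;
        start++;
        end = line.LastIndexOf('/');
        if (end == -1) return null;
        if (end <= start) return null;
        string english = line.Substring(start, end - start);

        return new string[] { trad, pinyin, english };
    }

    static void Main()
    {
        // Made-up sample line in the assumed layout.
        string[] fields = Split("中國 [Zhong1 guo2] /China/Middle Kingdom/");

        // fields[1] is "Zhong1 guo2" and fields[2] is "China/Middle Kingdom":
        // the inner '/' separators stay inside the english string.
        Console.WriteLine(fields[1]);
        Console.WriteLine(fields[2]);
    }
}
```

Note that a line with several definitions is not split any further: everything between the first and last slash ends up in one string, exactly as in the listing above.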
