Displaying the dictionary, part 1: Naive version


We return briefly to the ongoing Chinese/English dictionary series and write some code to display all the definitions we had worked so hard to collect. (I figure you're anxious to see something on the screen, so I am going to handle the Traditional Chinese/Simplified Chinese issue later. For now, the "Simplified" column will be blank.)

Take the dictionary program we've been developing so far and paste it into our new scratch program. (Delete the main function, of course.) First, search/replace m_hwndChild and change it to m_hwndLV, since our child window is a listview; it's just nicer to say what it is up front, since we're going to be talking about it a lot. Next, make the following additional changes:

class RootWindow : public Window
{
public:
 virtual LPCTSTR ClassName() { return TEXT("Scratch"); }
 static RootWindow *Create();
protected:
 LRESULT HandleMessage(UINT uMsg, WPARAM wParam, LPARAM lParam);
 LRESULT OnCreate();
 const DictionaryEntry& Item(int i) { return m_dict.Item(i); }
 int Length() { return m_dict.Length(); }
private:
 enum {
  IDC_LIST = 1,
 };
 enum {
  COL_TRAD,
  COL_SIMP,
  COL_PINYIN,
  COL_ENGLISH,
 };
private:
 HWND m_hwndLV;
 Dictionary m_dict;
};

LRESULT RootWindow::OnCreate()
{
 m_hwndLV = CreateWindow(WC_LISTVIEW, NULL,
                WS_VISIBLE | WS_CHILD | WS_TABSTOP |
                LVS_NOSORTHEADER |
                LVS_SINGLESEL | LVS_REPORT,
                0, 0, 0, 0,
                m_hwnd,
                (HMENU)IDC_LIST,
                g_hinst,
                NULL);

 if (!m_hwndLV) return -1;

 ListView_SetExtendedListViewStyleEx(m_hwndLV,
                                     LVS_EX_FULLROWSELECT,
                                     LVS_EX_FULLROWSELECT);

 LVCOLUMN lvc;

 lvc.mask = LVCF_TEXT | LVCF_WIDTH;
 lvc.cx = 200;
 lvc.pszText = TEXT("Traditional");
 ListView_InsertColumn(m_hwndLV, COL_TRAD, &lvc);

 lvc.mask = LVCF_TEXT | LVCF_WIDTH;
 lvc.cx = 200;
 lvc.pszText = TEXT("Simplified");
 ListView_InsertColumn(m_hwndLV, COL_SIMP, &lvc);

 lvc.mask = LVCF_TEXT | LVCF_WIDTH;
 lvc.cx = 200;
 lvc.pszText = TEXT("PinYin");
 ListView_InsertColumn(m_hwndLV, COL_PINYIN, &lvc);

 lvc.mask = LVCF_TEXT | LVCF_WIDTH;
 lvc.cx = 800;
 lvc.pszText = TEXT("English");
 ListView_InsertColumn(m_hwndLV, COL_ENGLISH, &lvc);

 ListView_SetItemCount(m_hwndLV, Length());

 for (int i = 0; i < Length(); i++) {
  const DictionaryEntry& de = Item(i);
  LVITEM item;
  item.mask = LVIF_TEXT;
  item.iItem = i;
  item.iSubItem = COL_TRAD;
  item.pszText = const_cast<LPWSTR>(de.m_pszTrad);
  item.iItem = ListView_InsertItem(m_hwndLV, &item);
  if (item.iItem >= 0) {
   item.iSubItem = COL_PINYIN;
   item.pszText = const_cast<LPWSTR>(de.m_pszPinyin);
   ListView_SetItem(m_hwndLV, &item);
   item.iSubItem = COL_ENGLISH;
   item.pszText = const_cast<LPWSTR>(de.m_pszEnglish);
   ListView_SetItem(m_hwndLV, &item);
  }
 }
 return 0;
}

After creating the listview control, we set it into full row select mode and create our columns. Before inserting the words into the listview, we use ListView_SetItemCount to tell the listview how many items we're about to insert. (This is optional; it allows the listview to pre-allocate some structures.) I'm not using an STL iterator because this code is going to be deleted soon. You'll find out why, if you haven't figured it out already.

Compile and run this program. Notice that it takes a ridiculously long time to start up. That's because our loop is inserting 20,000 dictionary entries into the listview, and that can't be fast.

Next time, we'll work on speeding that up.

Comments (14)
  1. James Risto says:

    Finally, I can offer something intelligent! I did this in a .NET program the other day, and there are functions to "suspend updates" to the list while you jam stuff in. Much faster. Off-topic now … I found my program did not display anything if I only inserted 1 item this way … still working on it …

  2. Anthony Wieser says:

    James Risto: You’ll need to invalidate after you’ve SetRedraw(FALSE) to get them to display.

    But, I’ll bet Raymond is going to show us the Virtual list view (in the end anyway) where you don’t have to insert them at all…
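    The suspend-updates idea from the two comments above looks roughly like this in Win32 terms. This is a hedged sketch, not the article's code: the BulkInsertItems helper name is made up, and the listview handle is assumed to be the article's m_hwndLV passed in as a parameter.

    ```cpp
    #include <windows.h>
    #include <windowsx.h>   // SetWindowRedraw wraps WM_SETREDRAW
    #include <commctrl.h>

    // Hypothetical helper: suspend painting while bulk-inserting items so the
    // listview paints once at the end instead of once per insertion.
    void BulkInsertItems(HWND hwndLV)
    {
        SetWindowRedraw(hwndLV, FALSE);     // WM_SETREDRAW: stop painting
        // ... the ListView_InsertItem / ListView_SetItem loop goes here ...
        SetWindowRedraw(hwndLV, TRUE);      // painting allowed again
        InvalidateRect(hwndLV, NULL, TRUE); // force the repaint, as noted above
    }
    ```
    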

  3. If there’s no virtual listview then I’m gonna be disappointed, since he "promised" that in post #1 of the dictionary series [1] :)

    [1] http://blogs.msdn.com/oldnewthing/archive/2005/05/09/415714.aspx

  4. Andy says:

    Cool! I have been doing these as Raymond has been showing them, and I was beginning to wonder when we would see the next one. I check every day to see if you have a new dictionary tutorial post up. These have been some very cool posts/tutorials/explanations; please keep them coming.

  5. AC says:

    I consider the start of Raymond’s "dictionary series" a real masterpiece. Since C++ was introduced (circa the last 12 years) there have been so many "language prophets", and the mentality of most of the "consumers" of these prophecies was always "this new thing is the solution for everything". It was easy for the "prophets", since they didn’t have to really make something that works. They wrote the books. But the "consumers" spent a hell of a lot of energy following the "prophets". And here we have it – a lot of inefficient programs using everything possible just because (STL, RTTI, exceptions all around).

    Somebody has to write the real truth about all the hyped techniques overused in recent years. I haven’t seen anything as effective as Raymond’s articles anywhere. Just like

    http://www.lysator.liu.se/c/bwk-on-pascal.html

    explained the real weaknesses of Pascal, it was about time that somebody demonstrated that STL is not appropriate for really efficient programs. I also liked the response of the .NET fans. Yes, the first .NET version of the "dictionary loader" was faster than STL (I was not surprised – anybody who thinks that STL strings and streams should be widely used has never made any really serious programs, only clumsy Basic equivalents in C++). But then Raymond showed what could be done with good C programming. I’ve never seen any demonstration that .NET can, with optimizations, achieve the same. And I don’t expect to see it, but I’d really appreciate it if somebody from the .NET camp proved me wrong.

  6. Ben Hanson says:

    In response to AC:

    I think it’s unfair to slate the STL. The only real issue is that it’s always slower to allocate thousands of objects than to pre-allocate a big chunk of memory and use pointers into it. This idea goes much deeper than the STL and the STL even supports this approach with custom allocators.

    Any technique can be mis-used and it gets tiring to hear the same old ‘throw the baby out with the bathwater’ arguments…
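    Ben’s pre-allocation point can be shown even without custom allocators. A minimal, self-contained sketch (the function name and entry count are made up for illustration): reserving the final size up front guarantees the bulk insert never triggers a grow-and-copy cycle.

    ```cpp
    #include <cassert>
    #include <string>
    #include <vector>

    // Returns true if appending n entries to a pre-reserved vector caused
    // no reallocation (i.e. the capacity never changed during the loop).
    bool bulk_fill_without_reallocation(std::size_t n)
    {
        std::vector<std::string> entries;
        entries.reserve(n);                       // one allocation, up front
        const std::size_t cap = entries.capacity();
        for (std::size_t i = 0; i < n; i++)
            entries.push_back("entry");           // no grow-and-copy here
        return entries.capacity() == cap;
    }

    int main()
    {
        assert(bulk_fill_without_reallocation(20000)); // same order as the dictionary
        return 0;
    }
    ```
    
    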

  7. Factory says:

    AC:

    While there is no such thing as a silver bullet, that does not indicate that all new techniques since 1992 have been rubbish. (BTW C++ was started in the 80’s)

    And citing inefficiency as a bad thing is the wrong way to think about it. It’s programmers trading off efficiency for other more important attributes, like for example, maintainability.

  8. Joku says:

    The only performance issue I see with .NET is that MS doesn’t ship every edition of VS with the method level instrumentation tool. With easy tools (not only restricted to the VSTS) readily available to find the performance bottlenecks I am sure they’d be used already during developing, not only after people mail you how bad the perf is. It’s likely to take considerable work to fix perf issues after shipping.

  9. Text callbacks let you delay setting data into the listview.
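    A hedged sketch of that text-callback technique, reusing the article’s RootWindow and DictionaryEntry types (the OnGetDispInfo helper and its wiring into WM_NOTIFY are assumptions, not the article’s code): items are inserted with LPSTR_TEXTCALLBACK, and the listview asks for each string only when it needs to display it.

    ```cpp
    // Insertion: the item carries no text, only a promise to supply it later.
    LVITEM item;
    item.mask = LVIF_TEXT;
    item.iItem = i;
    item.iSubItem = COL_TRAD;
    item.pszText = LPSTR_TEXTCALLBACK;   // "ask me when you need it"
    ListView_InsertItem(m_hwndLV, &item);

    // Hypothetical handler, reached from WM_NOTIFY when the notification
    // code is LVN_GETDISPINFO:
    LRESULT RootWindow::OnGetDispInfo(NMLVDISPINFO* pdi)
    {
        if (pdi->item.mask & LVIF_TEXT) {
            const DictionaryEntry& de = Item(pdi->item.iItem);
            switch (pdi->item.iSubItem) {
            case COL_TRAD:    pdi->item.pszText = const_cast<LPWSTR>(de.m_pszTrad);    break;
            case COL_PINYIN:  pdi->item.pszText = const_cast<LPWSTR>(de.m_pszPinyin);  break;
            case COL_ENGLISH: pdi->item.pszText = const_cast<LPWSTR>(de.m_pszEnglish); break;
            }
        }
        return 0;
    }
    ```
    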

  10. AC says:

    To Ben Hanson: Your only real argument would be to write an STL and standard library version of Raymond’s program with comparable performance. You mention custom allocators — please use them and demonstrate your results to all of us. I still doubt that you can use strings and reach that goal.

    To Factory: Just a few examples: 1) It is always easier to maintain and develop code without using too many exceptions (if any at all). That is confirmed even by Stroustrup. And modern standard library functions throw them all around. 2) Imagine that you received two DLLs to maintain. The interface of one is C++: a bunch of class headers, all using STL, exceptions and RTTI; the interface of the second is one header with some plain structures and a few functions operating on them. If both are supposed to do the same work, which would be easier to maintain?

    To Joku: As I said, I’d really like to see a .NET equivalent of Raymond’s last sample with comparable performance.

  11. AC: As Rico Mariani wrote in his entries about this (see http://blogs.msdn.com/ricom/archive/2005/05/19/420158.aspx), it’s impossible for the managed version to be quicker than Raymond’s final solution, because his solution is quicker than the managed overhead alone. But at what cost? Rico got something reasonably fast without too much effort. So in productivity, .NET wins, IMO. But developer productivity is easily traded away for performance when dealing with low-level stuff (e.g. an OS scheduling algorithm) that runs very often and has a huge performance impact on the system.

    Would anyone notice the 0.031-second difference between the final solutions? I doubt it. But the programmer would notice the amount of time spent building the managed vs. the unmanaged version of the application.

  12. Norman Diamond says:

    The question of whether it is necessary to do this kind of code tinkering and get that 0.3 seconds of performance improvement is a question that depends on the application. For starting up Internet Explorer or Word of course it’s not needed. If a web server or SQL server is supposed to respond to some number of requests every second then it’s almost needed, i.e. you might be able to avoid it by doubling your hardware but you probably want to improve the code first. If a controller for some component of a car or airplane is supposed to serve 50 operations per second, you need to tinker with the code this way.

    Although I wouldn’t recommend C++ for kernel mode code under an OS, it did work out all right in an embedded system without an OS. The application had to do a ton of matrix calculations and C++’s overloading of operators made it easy to write the code. But the first bottleneck wasn’t the matrix calculations, it was memory deallocations of temporary objects that were no longer necessary, and the second bottleneck was memory allocations in constructors. The first step towards optimization was to make Lex and Yacc programs that would convert the application to use named static variables for everything, getting rid of the memory allocations. Of course the resulting code is unreadable. Maintenance (bug fixes etc.) still had to be done on the original code. Other kinds of optimizations were done by tweaking the Yacc program to produce more efficient C++ programs using increasingly tailored subsets of the language. Other kinds of optimizations produced speedups by lesser and lesser factors, but the factors all multiplied together. Eventually we got our 50 iterations per second.

    Anyway, automatic garbage collection really is one of several things that are enormously useful in rapid application prototyping. After seeing how the result performs, then you figure out where you need to optimize.

  13. Ben Hanson says:

    Reply to AC:

    I think we all agree that optimised C will always beat C++ techniques, and for sure specialised calls to Windows (say, memory mapped files) will also give a huge performance boost when used well. What I object to is when programmers say "See? This C++ code is slow!" and, instead of trying to improve the speed with better use of the STL etc., go straight back to C. For example, if you are appending a lot of data to a std::vector and your code is slow, you can switch to a std::deque for a big performance increase.

    It’d be great to try to prove the point by doing a version of Raymond’s code using a load of C++ best practices. I’m sure it wouldn’t be faster, but it would be nice to show that it could be competitive. I doubt I will ever have the time to do this – so instead I recommend the C++ In-Depth series (http://www.awprofessional.com/series/series.asp?st=44142&rl=1) for tips on good C++ usage.

    On your DLL comment to Factory:

    I wouldn’t recommend leaking exceptions from a DLL interface. Probably better to trap the exceptions inside the DLL, convert to an error code, and support an extern "C" interface.

    In general I would recommend the approach already mentioned on this thread: code the highest-level C++ you can and then optimise the (noticeably) slow parts, yes, resorting to optimised C if (really) necessary. That way you should end up with code that is easier to understand and maintain. It’s a real shame when C/C++ programmers opt for either/or instead of this approach, as both languages have a lot to offer in the appropriate context.

    Cheers,

    Ben
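    The std::vector-to-std::deque swap Ben describes above can be sketched in a self-contained way (the function name and entry count are illustrative): a deque grows in fixed-size blocks, so appending never copies the existing elements the way a vector reallocation does.

    ```cpp
    #include <cassert>
    #include <deque>
    #include <string>

    // Append n entries to a std::deque and return the final size. Unlike a
    // std::vector, growth never relocates elements that are already inserted.
    std::size_t fill_deque(std::size_t n)
    {
        std::deque<std::string> lines;
        for (std::size_t i = 0; i < n; i++)
            lines.push_back("entry");   // existing elements stay put
        return lines.size();
    }

    int main()
    {
        assert(fill_deque(20000) == 20000);
        return 0;
    }
    ```
    
    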

  14. AC says:

    Ben Hanson: Ok, so you claim that you *believe* it can be made competitive "using best C++ practices", but you’ll never have the time to do it. Does that mean that even "using best C++ practices" takes too much time? Hopefully not as much as reading all the books you link to. I’d be careful: people who write books will always try to "sell" something new, because the old things are common knowledge. Yet that, together with common sense, is what’s missing most of the time in real projects.

Comments are closed.