News, an update, and a tip

If you have an MSN watch, then basketball is now available as a channel. Now if I only liked basketball… Where are baseball and hockey?

 

An update to my last post: as I suspected, the .NET Framework was not the cause of the performance problems in my Visual Studio language service. Scrolling through a text document of about 200 lines of code took a few seconds to redraw on each click in the vertical scroll bar, which is not exactly a well-performing application. A few modifications to the source, and everything is running smoothly now. I am rolling my own tokenizer for the language service, and one method, IsAtBeginning, was causing all the problems. IsAtBeginning is given a text string pszText, a start position nIndex, and a second string pszLookFor, and checks whether the text in pszLookFor appears at position nIndex in pszText. The function also needs to make sure that neither the character just before the match (at nIndex - 1) nor the character just after it (at nIndex + the length of pszLookFor) is a character that would keep the string at that location from being a special token. For example, if pszLookFor is the IL keyword .subsystem, then the function should return true when:

pszText = “.subsystem” and nIndex = 0

pszText = “ .subsystem” and nIndex = 1

pszText = “.subsystem ” and nIndex = 0

pszText = “.subsystem(” and nIndex = 0

pszText = “ (.subsystem” and nIndex = 2

and return false when:

pszText = “X.subsystem” and nIndex = 0

pszText = “X.subsystem” and nIndex = 1

pszText = “.subsystemX” and nIndex = 0

 

After some experimentation (which included commenting out all of my tokenizer and adding bits of code back one by one), I tracked the problem down to the IsAtBeginning function. Examination of that function showed that I was doing some unnecessary string manipulation: I had translated the C++ code “if(!wcsncmp(pszText+nIndex, pszLookFor, wcslen(pszLookFor))) {…}” into a number of C# string.Substring calls, each of which allocates and copies a new string, so it is not as efficient. The moral of the story: I always like to blame my own code before blaming somebody else’s. I could have said it was a problem with the .NET Framework, but after some investigation it turned out to be my fault.

 

Here is the code I use now (I am sure there is more optimization I could perform, but for now this is what I am using):

 

static string specialIDChar = "#$@?_";

bool IsAtBeginning(string text, int index, string lookFor)
{
  // If the character before or after the candidate is a letter, a digit, or one
  // of the special identifier characters, it disqualifies the text as our token.
  // Otherwise (space, tab, punctuation, or the start/end of the text), verify
  // that the text is what we are looking for:
  int lenLookFor = lookFor.Length;
  int lenText = text.Length;
  if((lenText - index) < lenLookFor)
    return false;
  char before = (index == 0) ? ' ' : text[index-1];
  char after = ((lenText - index) > lenLookFor) ? text[index+lenLookFor] : ' ';
  if((specialIDChar.IndexOf(before) != -1) || (specialIDChar.IndexOf(after) != -1) || char.IsLetterOrDigit(before) || char.IsLetterOrDigit(after))
    return false;
  if(text.Substring(index, lenLookFor) == lookFor)
  {
    return true;
  }
  return false;
}
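One further optimization would be to drop the remaining Substring call as well. The following is only a sketch of that idea, not the code from the language service: the class name TokenMatcher, the helper IsIdentifierChar, and the added check for a negative index are assumptions made for illustration. It relies on string.CompareOrdinal, which compares a range of one string against a range of another in place, much like the original wcsncmp call.

```csharp
using System;

// Hypothetical wrapper class; the name is illustrative only.
public static class TokenMatcher
{
    // Same extra identifier characters as in the post.
    private const string SpecialIdChars = "#$@?_";

    // True if c could be part of a longer identifier, which disqualifies the match.
    private static bool IsIdentifierChar(char c)
    {
        return char.IsLetterOrDigit(c) || SpecialIdChars.IndexOf(c) != -1;
    }

    public static bool IsAtBeginning(string text, int index, string lookFor)
    {
        int lenLookFor = lookFor.Length;

        // Reject out-of-range positions and text too short to hold lookFor.
        // (The negative-index check is an addition, not part of the original.)
        if (index < 0 || index > text.Length - lenLookFor)
            return false;

        // Treat the start and the end of the text as if they were a space.
        char before = (index == 0) ? ' ' : text[index - 1];
        int end = index + lenLookFor;
        char after = (end < text.Length) ? text[end] : ' ';
        if (IsIdentifierChar(before) || IsIdentifierChar(after))
            return false;

        // Compare in place, with no temporary substring allocated.
        return string.CompareOrdinal(text, index, lookFor, 0, lenLookFor) == 0;
    }
}
```

Calling TokenMatcher.IsAtBeginning(".subsystem(", 0, ".subsystem") returns true, while TokenMatcher.IsAtBeginning(".subsystemX", 0, ".subsystem") returns false, matching the cases listed earlier. Whether this is measurably faster than the single Substring call would need to be profiled.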

 

 

Now the tip: I was recently working on a pet project, a blog reader. The architecture is an engine that manages syncing and holding blog data, plus a user control that manages display of the data. The engine defines a couple of delegates that the user interface connects to so that it can be informed when a blog is being updated. I have been using serialization to save data to disk, but suddenly I started getting an exception when saving because the control used to display the data was not marked with the [Serializable] attribute. After some investigation, I found that the problem was with the delegates. Once the control had connected to the events raised by the blog engine, serializing the engine also tried to serialize everything connected to those events, because each delegate holds a reference to the object whose method it calls. The solution in this case was to disconnect all events before serializing. This may not work in all cases, but for what I was doing (serializing in the destructor of the class, when the events no longer need to fire), it worked just fine.
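To make the mechanism concrete, here is a minimal sketch of why a subscriber ends up in the engine's object graph and what disconnecting does. The names BlogEngine, BlogView, BlogUpdated, SubscriberTargets, and DisconnectAll are hypothetical stand-ins, not the real project's types, and the sketch inspects the delegate's invocation list rather than running an actual serializer.

```csharp
using System;

// Hypothetical stand-in for the blog engine.
[Serializable]
public class BlogEngine
{
    // The event's backing delegate field holds a reference to every subscriber.
    public event EventHandler BlogUpdated;

    // Returns the objects reachable from the engine through the event.
    public object[] SubscriberTargets()
    {
        if (BlogUpdated == null)
            return new object[0];
        Delegate[] handlers = BlogUpdated.GetInvocationList();
        object[] targets = new object[handlers.Length];
        for (int i = 0; i < handlers.Length; i++)
            targets[i] = handlers[i].Target; // the subscribing object
        return targets;
    }

    // Drop all subscribers so the object graph no longer includes them.
    public void DisconnectAll()
    {
        BlogUpdated = null;
    }
}

// Hypothetical stand-in for the display control; deliberately not [Serializable].
public class BlogView
{
    public void OnBlogUpdated(object sender, EventArgs e) { }
}
```

After engine.BlogUpdated += view.OnBlogUpdated, SubscriberTargets() contains the view itself, which is the object a formatter would then try to serialize along with the engine. After DisconnectAll() the list is empty. If disconnecting is not an option, an alternative worth checking (assuming the formatter-based serialization that honors [Serializable]) is to mark the event's backing field with [field: NonSerialized] so the subscriber list is skipped.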