Improving CLRProfiler 4: Reducing SampleObject memory consumption by 58%

In the previous three posts, we managed to double CLRProfiler’s file-loading speed through profile-guided optimization in three simple steps. Now let’s take a look at reducing CLRProfiler’s memory consumption, making it more useful for real-world applications. I managed to create a 10 GB profile using a performance test. The test program…

Improving CLRProfiler 3: Double the speed of profile loading

In the first step of profile-guided optimization, we reduced the total CPU samples for ReadNewLog.ReadFile from 6,223 to 5,803; the second step reduced it further to 3,982. Wouldn’t it be nice if we could reduce it to below 3,111 samples, essentially doubling the speed of CLRProfiler’s profile loading? The biggest targets now are ReadChar…

Improving CLRProfiler 2: 19.7% in TryGetValue

Browsing the profiler data in Visual Studio 2010’s Function Code View reveals another easy target for improvement: 19.7% in calls to TryGetValue on a Dictionary object. This piece of code is called when processing the ‘c’ command, which carries logged method call information. There are millions of method calls in the profile we’re testing with. The code uses…

Improving CLRProfiler 1: 8.4% Progress Bar Update

As a CLR performance dev, I expect CLR Profiler to be a very useful tool. So now I’m trying to understand it inside out, and maybe even make some improvements to it. Here are the steps I’m taking: Download the CLR Profiler 4.0 source code. Unzip it and open the solution in Visual Studio 2010, making a few…


New Job: CLR performance team

I switched jobs to the CLR performance team last August. Now I’m working on something that has huge customer impact, needs tons of technical depth, and fits my longtime passion as an old practicing programmer (I still have a 30-year-old source code listing from my college senior year to prove it). This should be fun. Related blogs:…

(VS2010 Beta1) Native Parallel Programming: ConcRT example — Debugging TwentyFour

In a previous posting, we presented a simple C++ parallel program, TwentyFour, which is based on the new native parallel programming API (ConcRT) introduced in Visual Studio 2010. In this posting, we’re going to discuss how to understand the running of native parallel programs using Visual Studio 2010. With ConcRT programming, you divide your code…

(VS2010 Beta1) Native Parallel Programming: ConcRT example — TwentyFour

With Microsoft Visual Studio 2010, Microsoft is releasing a new set of APIs for native parallel programming named ConcRT (Concurrency Runtime). The purpose of this blog article is to illustrate the basics of ConcRT programming in VS 2010. We will be using the Visual Studio Beta1 release for the discussion. The problem we’re trying to parallelize here is…


STS: A palindromic word I will remember

Continuing the search for palindromic square numbers turned up a new record, a 55-digit palindromic square: 1373512530649258635292477609^2 = 1886536671850530641991373196913731991460350581766356881. Quite a few records for other palindromic sequences have also been broken by my multi-core searching algorithm. Check http://www.worldofnumbers.com/palrecs.htm for details. But the focus of this posting is really a simple palindromic acronym: STS, which stands for Science Talent Search….
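
For readers who want to check a claim like this themselves, here is a minimal sketch in Python (not the multi-core search program from the post; the helper names `is_palindrome` and `is_palindromic_square` are made up for illustration). It relies only on Python’s built-in arbitrary-precision integers:

```python
def is_palindrome(n: int) -> bool:
    """Return True if the decimal digits of n read the same in both directions."""
    digits = str(n)
    return digits == digits[::-1]


def is_palindromic_square(root: int) -> bool:
    """Return True if root squared is a decimal palindrome."""
    return is_palindrome(root * root)


if __name__ == "__main__":
    # The root quoted in the post; Python integers do not overflow.
    root = 1373512530649258635292477609
    square = root * root
    print("digits in square:", len(str(square)))
    print("palindromic square:", is_palindromic_square(root))
```

This only verifies a candidate; finding new records requires a real search strategy, which is what the original program provides.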

A 53-digit Palindromic Square Number

Six years ago I wrote a program to search for palindromic square numbers and other palindromic numbers, and found some new ones. But as the numbers grew longer, the search took longer and longer, so I finally gave it up. Recently I got a new powerful machine with dual quad-core CPUs, so I brought it home…

XpsStat: A program for gathering statistics information of XPS documents

In the WinHEC presentation on XPS document performance optimization, a simple program, XpsStat, was introduced to analyze XPS documents. Given an XPS document, XpsStat generates four or five tables in HTML format: Container Summary Table: information about each type of part in the document. FixedPage / Remote Resource Dictionary Table: information about each fixed page / remote…
