Performance issues with "String Concatenation" in JScript.

Hello Friends,
I am Jaiprakash and work as a Developer in the Jscript team. There is not much to tell about me except one fact - I can’t recall the last time when any of my teammates called me by my name, they just call me JPJ.

I know, you have seen pretty cool blogs in past from my team mates and I promise this one also not going to disappoint you. So coming to the point, I am going to talk about the famous ‘string append’ problem in Internet Explorer (or Jscript to be more accurate) and how are we going to fix it.

The Problem
let’s do a small experiment to get to it. Copy paste following code in an html file and run in Internet Explorer. Observe the time reported.

<script language="javascript">
var before = new Date().getTime();
var s = "jscript";
x = "";
count = 20000;
for (var i = 0; i < count; i++) {
x += s;
}
var after = new Date().getTime();
var time = after - before;
alert("Time taken to append 'jscript' " + count + " times : " + time + " ms");
</script>

On my machine the time reported is 1922 ms (averaged out three times). As you see this code is doing nothing other than appending a 7 character string 20K times. And for such a trivial operation it is consuming almost 2 seconds. Phewww….
Impressed? Not yet. It still doesn’t appeal you? OK. Try increasing the no of appends and enjoy the show. Here is what I get on my box…

Appends

Time (ms)

10K

260

20K

1922

25K

6125

30K

11578

Well, I am sure you would be impressed by now. And I can feel your curiosity to know how Jscript makes such a trivial looking operation such a big one.

Not much, it uses a very trivial algorithm to append strings to get you to such a great show. Whenever it has to append two strings of size ‘m’ and ‘n’ each, it allocates a new buffer of size (m + n) , copies the strings one by one, and returns this buffer back. Hmmm. This logic looks fine. What’s the deal then?

Let’s say you want to append 3 strings like following...

resultStr = “Jscript” + “String” + “Append”;

To give you the final result, Jscript first appends “Jscript” and “String”. It allocates a buffer of size 13, copies them one by one to the buffer and assigns the buffer to a temporary, let’s say temp1. Next it appends temp1 and “Append”, allocates a buffer of size 19, copies temp1 and “Append”, and assigns the result to resultStr. As you see, to generate final string for you it copied “Jscript” and “String” twice (temp1 and resultStr) and allocated one intermediate buffer. Imagine what happens when you append ‘n’ such strings. It will allocate n-1 buffers, copy each string (n-i) times where ‘i’ is the index at which the string appeared in the operation.

You might be thinking why Jscript was shipped with such an algorithm. Well, Jscript was meant to be used just as glue, not as full-fledge programming language as used in modern WebApps.

 How it affects Real world Scenarios
Gone are the days when web apps used to be static text which was just retrieved from the server and displayed in your browser. Today’s world is AJAX, involves lot more interaction with server, and data received is not just the plain HTML. A modern WebApp fetches data from server; builds the page on your machine using Jscript and DOM APIs, and displays on your browser. And to build the page it has to generate HTML, which is nothing but text. And to generate HTML out of the pieces retrieved from server, it has to append them as strings. And as the applications are becoming smarter and smarter, they tend to minimize the interaction with server, and do as much processing as possible in your browser.

Let me give you some real world examples. You all must have used some or the other AJAX mail client like msn.com, outlook web access or yahoo mail beta. In your inbox, if you scroll down or up, they just update your mail list not the entire page. How it happens? It gets the information about mails from server, builds the table of your mails on the fly and displays in the browser.

Another good example would be Redfin which is actually AJAX based real estate portal. If you search a locality, it shows a map with location of all the results and below the map it shows details of each result. This table it constructs by appending several hundreds of strings.

These are just two. There are many more such. Basically what I want to convey is that string append is a pretty common and basic operation and probability is very high that large number of web apps get affected by this.

So we are fixing it!!!
We are trying an algorithm which avoids those intermediate buffer allocations and repeated copy operations. With this algorithm, I just ran the same test case and here are the numbers for you. Please note that the fix is still under progress so these numbers might change a bit here and there, however they should be enough to give you an idea of improvements.

Appends

Time (ms)

10K

23

20K

47

25K

62

30K

78

Shhhh. Numbers speak for themselvesJ.

JP, I Need this fix on my box, now!!!
I wish I could help you. You will have to wait till next release of Internet Explorer.

At last…
Please help us in helping you. If you have encountered other such script “hairballs”, do leave a note on any of the blogs published here. I want to publish more such blogs for youJ.

 

- JP