Using Jscript based IE Add-On to export data from any website

Often times we browse to a website which has lot of data in tabular format, or we login to secure website which displays data like financial transactions etc. in a neatly formated tabular format. Sometimes we have a need to export such data into another tool, if the website supports such an export good, otherwise we have to manually cut and paste the data into a text file, and painfully convert it into the desired format of the target tool. This process can be very lengthy, painful and error prone. We can use some kind of a script to try to automate some part of this process specially the post processing of the cut-and-pasted data to get into the desired format. But the final process is still not completely automated and has lots of manual steps involved.

What if we can easily write an add-on for the browser which can, not only extract the right set of data from a website we are browsing but can also easily convert it into any format we desire. In this blog post I will give step-by-step instruction on how to develop a Jscript based IE Add-On to achieve exactly such a solution.

In this example we will use an online stocks and mutual funds trading website www.icicidirect.com and write an IE add-on to extract trade transaction details from it, and then to export it in QIF format so that it can be imported into personal finance tools like Microsoft Money.

There are  four major requirements for this add-on:

  1. We should be able to invoke it from IE while browsing a web page and it should be allowed to access the contents of that web page.
  2. It should be able to traverse the contents of the web page easily and locate the fields which are of interest.
  3. It should be able to easily convert the data in the desired output format.
  4. And finally it should be able to write the data at the desired location on user machine.

For the first requirement i.e. attaching it to IE, we will make use of the "Adding Entries to the Standard Context Menu" facility in IE. There are many different ways to extend functionality in IE but this one is one of the simpler ones and it satisfies our requirement perfectly. Only thing we need to do for this is to run the following command:

REG ADD "HKCU\Software\Microsoft\Internet Explorer\MenuExt\icicidirect2qif" /ve /d "file://C:\Program Files\EXTENDIE\icicidirect2qif.htm"

What this does is add registry entry for IE to start showing 'icicidirect2qif' as an item in the context menu which shows up anytime you right click in a web page. And when you click on that item IE will run the file "C:\Program Files\EXTENDIE\icicidirect2qif.htm" in context of that web page. The file icicidirect2qif.htm can contain Jscript code written inside the tags <SCRIPT>...</SCRIPT> e.g.

<script type="text/javascript">

   ...

</script>

For the second requirement, when IE calls into icicidirect2qif.htm, the script code inside this page can use DOM element "external.menuArguments.document" to access any and all the contents of the web page currently being browsed in IE. In your Jscript code you can use one or more of the properties or methods available in IE to traverse through the 'document' object. Our target web page in icicidirect site had named <text> fields containing the transaction details like 'date', 'amount', 'mutual fund name', 'price' etc. so we have used the DOM call "getElementsByName" to locate and get data for the fields we are interested in, for example "var date = doc.getElementsByName("trans_date"+i)[0];" etc.

For the third requirement, Jscript has a number of useful "String" and "Number" APIs to work with the data we have extracted out of the web page and to convert it to any target format we are interested in, for example we have used code like:

        var units = doc.getElementsByName("Trans_Units"+i)[0].value;
        units = units.replace(/,/g, "");
        amount = new Number(amount); units = new Number(units);
        var price = amount/units;

to automatically calculate 'price' information for a transaction given the 'amount' and 'quantity' information extracted from the web page.

For the fourth and final requirement, Jscript has multiple ways to write data out, one of the simplest method is to use the 'document.write()' API to print output in the currently browsed web page itself. Alternatively you can also access the 'clipboard' object from Jscript and write output into the clipboard which can later be pasted into any application you want. We have used 'ActiveXObject("Scripting.FileSystemObject")' to write out our data into a file on user's machine as follows:

    var fso = new ActiveXObject("Scripting.FileSystemObject");
    var FileObject = fso.OpenTextFile("d:\\export.qif",8,true,0);
    FileObject.WriteLine("D"+date[1]+"/"+date[0]+"/"+date[2]);

In fact there is another ActiveXObject Shell.Application, which can be used in the end to automatically invoke Microsoft Money QIF import tool to start and load the QIF file we have just written out in our example we skipped this because we were writing out an 'investment' account which needs to be loaded into Money using 'File->Import->Restore Account' while the standard QIF import tool in Money works for the normal 'banking' account:

    var oShell = new ActiveXObject("Shell.Application");
    oShell.ShellExecute("d:\\export.qif");
    oShell = null;
 

 Here is the code for the complete "icicidirect2qif.htm":

 <script type="text/javascript">
    var activityArray = {
        "P":"Buy",
        "D":"Div",
        "DR":"ReinvDiv",
        "R":"Sell"
    };
    var doc = external.menuArguments.document;
    var i = 0;
    while (doc.getElementsByName("trans_date"+i)[0] != null) i++;
    var fso = new ActiveXObject("Scripting.FileSystemObject");
    fso.DeleteFile("d:\\export.qif");
    var FileObject = fso.OpenTextFile("d:\\export.qif",8,true,0);
    FileObject.WriteLine("!Type:Invst");
    while (i--) {
        var date = doc.getElementsByName("trans_date"+i)[0];
        date = date.value;
        date = date.split(" ")[0];
        date = date.split("-");
        FileObject.WriteLine("D"+date[1]+"/"+date[0]+"/"+date[2]);
        var amount = doc.getElementsByName("Trans_Amt"+i)[0].value;
        amount = amount.replace(/,/g, "");
        FileObject.WriteLine("T"+amount);
        var activity = doc.getElementsByName("trans_cd"+i)[0].value;
        if (activityArray[activity] != null) activity = activityArray[activity];
        FileObject.WriteLine("N"+activity);
        var fund = doc.getElementsByName("Fund_Name"+i)[0].value;
        fund = fund.split(" ")[0];
        var scheme = doc.getElementsByName("Scheme_Name"+i)[0].value;
        FileObject.WriteLine("Y"+fund+" "+scheme);
        var units = doc.getElementsByName("Trans_Units"+i)[0].value;
        units = units.replace(/,/g, "");
        amount = new Number(amount); units = new Number(units);
        var price = amount/units;
        FileObject.WriteLine("I"+price.toFixed(6));
        FileObject.WriteLine("Q"+units.toString());
        if (activity == "Div" || activity == "ReinvDiv") {
            FileObject.WriteLine("LInvestment Income:Dividends");
        } else if (activity == "Sell") {
            FileObject.WriteLine("LInvestment Income:Capital Gains");
        }
        FileObject.WriteLine("^");
    }
    FileObject.close();
    /*var oShell = new ActiveXObject("Shell.Application");
    oShell.ShellExecute("d:\\export.qif");
    oShell = null;*/
</script>