Publishing blog statistics horror story

I received an email from someone at Microsoft this morning about the availability of the Excel spreadsheet for internal Microsoft blogger December hit statistics. Last month I had written a program to show MS Bloggers hit statistics.

So I ran the program including the new December XL file on my test machine, tested it from a few web browser instances, saw that it all worked, then copied the PRG, DBFs to the production server. I just did a simple test on the production server, then sent an email to the MS Bloggers alias announcing the stats through December were available.

Within a few minutes, there were dozens of hits from around the world (note: to run this program you must be on the internal Microsoft corp net).

On a whim, I clicked on one of the generated links, and to my horror, I saw that only a few records of data were showing! The main page showing aggregated stats for all MS Bloggers worked fine, but the detailed month by month pages were missing many months.

I examined the suspect SQL Select statement to see what could be wrong:

SELECT LEFT(CMONTH(bs.date),3)+" 2005" as Month, ;

      TRANSFORM(rss,cPict) as Rss,;

      TRANSFORM(web,cPict) as Web,;

      TRANSFORM(total,cPict) as Total,;

      TRANSFORM(rss/web,"99,999.99") as Ratio,;

      TRANSFORM(pos,"9999")+"/"+TRANSFORM(bm.cnt,cPict)+" = %"+TRANSFORM(INT(100*pos/bm.cnt)) as Position, ;

      IIF(bs.btype='M',"MSDN","Technet") as MSDNTech;

      FROM blogstats bs INNER JOIN blogsum bm ON bs.btype+DTOC(bs.date,1) =bm.btype+DTOC(bm.date,1);

            WHERE blog == cBlogName ORDER BY MSDNTech,bs.date INTO CURSOR foo

Because a few records were missing my first thought was that the INNER JOIN wasn’t working right. Perhaps I had upper/lower case join condition mismatch? How about the date format? Did SET ENGINEBEHAVIOR make a difference?

What version of VFP was running on each machine? Did I have an interim beta build running? I tried another machine, and it reproduced the problem.

Finally, I realized that when I copied the main DBF, I had neglected to copy the associated index file, so VFP’s Rushmore was using an incomplete index for the SQL statement.

See also What is an index anyway?