Binary Files and the File System Object Do Not Mix


OK, back to scripting today.

But before I get back to scripting issues, one brief correction. An attentive reader noted that “The Well-Tempered Clavier” was in fact designed to sound good on a “well tempered” instrument, not an “equally tempered” instrument. The difference is that a “well” temperament is designed so that every key sounds good, but is allowed to have some badly-out-of-tune intervals that must be avoided. (Traditionally these are called “wolf intervals”.)

There was considerable controversy when equal temperament was introduced in Europe. I suppose it was the ridiculous “what is the One True Bracing Style?” issue of its day.

Another commenter pointed out that you could translate my wav-writing program into VBScript by using the File System Object to write out the bytes. To simplify their code down to a program that writes out individual bytes:

' DO NOT DO THIS
Set FSO = CreateObject("Scripting.FileSystemObject")
Set File = FSO.CreateTextFile("c:\test.bin", True)
For i = 0 To 255
  File.Write Chr(i)
Next
File.Close

And sure enough, this writes out a binary file consisting of those bytes.

Please don’t do that. See that line that says “CreateTextFile”? We wrote that method to create a text file, not a binary file. Though this code might appear to work, it actually does not. Text files are more than just binary files that can be interpreted as text. Text files have to conform to certain rules to ensure that they can be sensibly interpreted as text in the local code page. If that’s not 100% clear to you, read Joel’s article on the subject before we go on.

Let me give you an example that clearly fails. What does this program do?

Set FSO = CreateObject("Scripting.FileSystemObject")
Set File = FSO.CreateTextFile("c:\test.bin", True)
For i = 0 To 255
  File.Write Chr(&hE0)
Next
File.Close

If you said “it writes out a binary file consisting of 256 E0 bytes,” bzzt! Sorry, try again. The correct answer is “it writes out a binary file consisting of 256 E0 bytes on any operating system where the user’s default ANSI code page does not define E0 as a lead byte in a DBCS encoding, like, say, Japanese, in which case it writes out 256 zeros.”

In the Japanese code page, a lone E0 byte is not even a legal character; it is only the lead byte of a two-byte character. So Chr(&hE0) turns it into a zero.
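
To make the lead-byte problem concrete, here is a minimal sketch in plain JavaScript (runnable under Node.js, with no FSO involved). It assumes the standard lead-byte ranges for Windows code page 932, 0x81 to 0x9F and 0xE0 to 0xFC; the constant and helper names are made up for illustration. It only shows which byte values cannot stand alone as a character in that code page, not what Chr or the FSO do internally.

```javascript
// Lead-byte ranges for Windows code page 932 (Japanese, Shift-JIS based).
// A byte in one of these ranges is only the first half of a two-byte
// character; on its own it does not encode anything.
const CP932_LEAD_RANGES = [
  [0x81, 0x9f],
  [0xe0, 0xfc],
];

// Hypothetical helper: true if `byte` must be followed by a trail byte.
function isCp932LeadByte(byte) {
  return CP932_LEAD_RANGES.some(([lo, hi]) => byte >= lo && byte <= hi);
}

// Collect every byte value that cannot be written as a one-byte character.
function unsafeSingleBytes() {
  const unsafe = [];
  for (let b = 0; b <= 0xff; b++) {
    if (isCp932LeadByte(b)) unsafe.push(b);
  }
  return unsafe;
}

const unsafe = unsafeSingleBytes();
console.log(isCp932LeadByte(0xe0)); // the byte from the example above
console.log(isCp932LeadByte(0x41)); // plain ASCII "A"
console.log(unsafe.length + " of 256 byte values are lead bytes");
```

Any "binary" file written one character at a time can be damaged at every one of those byte values on a machine whose default code page is 932, which is why a test that passes on a Western-locale machine proves very little.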

If I were whipping up a little one-off program on my own to write out a binary file, well, I’d personally do it in C, but I can see how some people might want to do it in script. But there’s a big difference between writing a one-off program that you’re going to delete in five minutes and writing a general-purpose utility program that you expect people around the world will use. That’s an entirely different standard of robustness and portability. Do not use the FSO to read or write binary files; you’re just asking for a world of hurt as soon as someone in DBCS-land runs your code.

I have been asked many times over the years if I know of a scriptable object that can read and write true binary files in all locales. I do not. Anyone have any suggestions? I would have thought, given the number of people that have asked me, that some third party would have come up with something decent by now.

Comments (33)

  1. Alex Papadimoulis says:

    Eric, I’ve had a lot of luck using the ADO Stream object.

  2. Curtis Hulett says:

    I have used this in the past, but I don’t know if it works in all locales. I have never had to deal with that.

    Function SaveBinaryData(FileName, ByteArray)
        Const adTypeBinary = 1
        Const adSaveCreateOverWrite = 2

        'Create Stream object
        Dim BinaryStream
        Set BinaryStream = CreateObject("ADODB.Stream")

        'Specify stream type - we want to save binary data.
        BinaryStream.Type = adTypeBinary

        'Open the stream and write binary data to the object
        BinaryStream.Open
        BinaryStream.Write ByteArray

        'Save binary data to disk
        BinaryStream.SaveToFile FileName, adSaveCreateOverWrite
    End Function

  3. Dave says:

    Google says:

    http://www.google.com/search?q=vbscript+binary+file

    And those paths generally lead to the Adodb.Stream solution.

  4. Eric Lippert says:

    I’ve heard that, and I’ve also heard from people that it doesn’t work well from script, so I don’t know who to believe. How do you create the binary array? VBScript only supports creation of arrays of variants.

  5. hir says:

    I found that text mode works OK like so.

    Only for writing, though.

    -----------------------------------------
    //JScript version
    var str = WScript.CreateObject("ADODB.Stream");
    str.type = 2; //adTypeText
    str.charset = "iso-8859-1";
    str.open();
    for (var i = 0; i < 0x100; i++) {
        str.writeText(String.fromCharCode(i));
    }
    str.saveToFile("c:\\temp\\bin.bin", 2);
    str.close();
    str = null;

    'VBScript version
    dim str
    set str = WScript.CreateObject("adodb.stream")
    str.type = 2
    str.charset = "iso-8859-1"
    str.open
    for i = 0 to &hff
        str.writeText(ChrW(i)) 'uses ChrW
    next
    str.saveToFile "c:\temp\bin.bin", 2
    str.close
    -----------------------------------------

    There still is a problem when you try to read some of the byte values 0x80 – 0x9f: when you read them they turn into completely different values. I guess this also relates to encoding.

    I heard you could acquire an array of bytes like this (haven’t tried myself):

    Set DM = CreateObject("Microsoft.XMLDOM")
    Set EL = DM.createElement("tmp")
    EL.DataType = "bin.hex"
    EL.Text = [some text in hex format]
    bin = EL.NodeTypedValue

  6. Mike Trinder says:

    Further to this, is there a reason why binary read/write was left out of the File System Object? I would have thought, given the number of people that have asked you, that Microsoft would have come up with something decent by now 🙂

    Surely it’s just a simpler version of the FSO.OpenTextFile code?

  7. Eric Lippert says:

    We certainly considered it. However, there are two main factors. First, and most important, we decided that the Script Team wanted to be in the business of building the script engines themselves, not the objects that those engines would script. We looked around the company and realized that other teams were working on object models for administration (WMI), email (CDONTS), database access (ADO), web servers (IIS), etc. Our tiny team could never do as good a job as those fully staffed and dedicated teams, and to try would have taken away time from stuff that _wasn’t_ a massive duplication of effort. So we finished off the FSO and called it done. (This also explains why we did not add any features to the WScript.Network object, etc, when we inherited the WSH codebase.)

    Second, adding binary file reading/writing is not as straightforward as you might think. Exposing a straightforward array of bytes on disk is only the very first step. To do it right and make it usable, we’d want to provide things like default serialization of all simple data types — strings, ints, doubles, singles, currencies, etc. But once you bite that off — big endian or little endian? Length prefixed? How do you handle seeking? What if the user is reading a file that has a DBCS string embedded in it and wants to translate it into a Unicode string?

    You have to think about the real-world problems that people are going to have to solve with this tool, and there are a LOT of different scenarios for binary files. We didn’t want to bite that off. It didn’t seem like a very "scripty" scenario.
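
To make those design questions concrete, here is a minimal sketch in JavaScript using the Node.js Buffer API. Nothing here is part of the FSO; the function names and the 2-byte length prefix are assumptions chosen purely for illustration. The same 32-bit integer produces different bytes depending on endianness, and a string needs some framing convention before a reader can tell where it ends.

```javascript
// Hypothetical serializers; an FSO binary API would have had to pick
// (or expose) every one of these choices.

// Write a signed 32-bit integer in the requested byte order.
function serializeInt32(value, littleEndian) {
  const buf = Buffer.alloc(4);
  if (littleEndian) {
    buf.writeInt32LE(value, 0);
  } else {
    buf.writeInt32BE(value, 0);
  }
  return buf;
}

// Write a string as UTF-16LE, preceded by a 2-byte little-endian
// byte count. Both the encoding and the prefix size are arbitrary picks.
function serializeString(text) {
  const body = Buffer.from(text, "utf16le");
  const prefix = Buffer.alloc(2);
  prefix.writeUInt16LE(body.length, 0);
  return Buffer.concat([prefix, body]);
}

const le = serializeInt32(0x01020304, true);
const be = serializeInt32(0x01020304, false);
const hi = serializeString("Hi");

console.log(le.toString("hex")); // 04030201
console.log(be.toString("hex")); // 01020304
console.log(hi.toString("hex")); // 040048006900
```

A file written with one set of choices is unreadable with another, so a general-purpose object would need options for each of them, plus seeking and code page conversion on top.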

  8. Chris says:

    You can use SoftArtisans’ FileManager. It’s like FSO, but can handle binary files.

    http://fileup.softartisans.com/fileup-120.aspx

  9. Frederik Slijkerman says:

    Even though writing binary files is not a ‘scripty’ scenario, it is something that people will want to do now and then. I don’t see any reason why you couldn’t have added simple binary read/write functions so you don’t have to muck around with ADO stream objects or CreateTextFile.

  10. I’ve been reading & writing binary data with the "ADODB.Stream" object for years without any problems, but then again I haven’t been using anything other than the UK codepage. But since it’s got native binary handling, surely in this particular case binary is binary is binary?!

  11. ADODB.Recordset is quite popular in… certain communities. Unfortunately it requires that you have created an ADODB.Stream object, and I don’t know how you go about populating that with arbitrary content.

  12. zwetan says:

    JSDB based on spidermonkey

    can read/write binary files

    see http://www.jsdb.org

    and much more than that: database connection, socket server, E4X etc..

    when I feel WSH is limited by something I automatically move to JSDB, both running ECMAScript code, portability made easy :).

  13. Randall K. says:

    I use Perl right now to read files and don’t have any problems at all with binary. It doesn’t require any special methods or variable types or even library includes. It’s native to the language itself. It’s unfortunate that Perl (an ancient scripting language by all comparisons) has always allowed working with binary files, even across different platforms, yet VBScript and JScript developers are left without access to any such basic routines for working with binary file data, even through ActiveX controls (because there apparently aren’t any).

    Keep in mind that ADODB.Stream is not a solution because it’s disabled on most Windows machines now due to the security vulnerabilities it introduced with Internet Explorer. You know, it’s always nice when a workaround is suggested and then it’s not even really available, which I guess defeats the purpose.

    –Randall

  14. berniem says:

    "However, there are two main factors. First, and most important, we decided that the Script Team wanted to be in the business of building the script engines themselves, not the objects that those engines would script."

    Ok, that’s reasonable. Add a few functions to the FSO to support binary byte reads and writes and y’all are done. Simple, eh? 🙂

    "Second, adding binary file reading/writing is not as straightforward as you might think. Exposing a straightforward array of bytes on disk is only the very first step. To do it right and make it usable,… "

    Well, that’s ONE approach. Another, quite simple approach, is to NOT be the end-all and just support reading and writing a series of bytes. If someone needs to make it more "usable", they can do it themselves – that’s the cost of dealing with BLOB data. And that’s exactly why y’all won’t know whether it’s big-endian, Unicode, or dollars. Just let me get at the bytes and I’ll do whatever is necessary to interpret/manipulate the data. 🙂

    As it is now, I seem to be left with a choice of: a) moving to another language, or b) limiting functionality. Neither is a good solution.

    Thanks.

  15. John says:

    Why is this such a big deal MS??   UN*X systems have been doing this from day 1 !!  I guess it all stems from MS (or CP/M) making the decision years ago that text files NEED CR/LF pairs to be called text files rather than the UN*X philosophy that files are a stream of bytes and it’s up to the end-user (or script developer) to decide how to interpret the bytes (or words or quadword …).

  16. Seb says:

    How can vb create a byte array? as.

    bytes = ChrB(1) & ChrB(1)

    ..is still a string.

  17. Eric Lippert says:

    In Visual Basic you create a byte array by, well, creating a byte array.

    dim b() as Byte

    In VBScript there is no way to create a byte array.  VBScript only supports arrays of variants.

    In VBScript if you create a string that contains binary data, and then pass that to an ActiveX object which expects a byte array, the default implementation of IDispatch::Invoke provided by the operating system will turn the binary string into a byte array for you.  So maybe that sneaky trick will work for you.  But my advice would be that if you need to create a byte array, the best thing to do would be to use a language which has byte arrays — VB, C#, C++, etc.

  18. Igor says:

    Eric,

    you mentioned that creating a string that contains binary data would satisfy a COM method that expects a byte array. That is exactly what I am seeking. Could you please give an example of code that would create such string from a conventional VBScript string, say, "Hello World"? I am not "native" to ASP/VB, so maybe it’s common knowledge for those who are in that world; sorry about that.

    Thanks!

  19. Igor says:

    I found out a way to create an array of bytes in VBScript using ADODB.Stream object mentioned above, which resolved the problem I had. Thanks!

  20. ATLANTES says:

    EricLippert said:

    How do you create the binary array?

    One way is to create a Text ADODB.Stream and copy it into a Binary ADODB.Stream.

    Consider the following snippet I recently wrote to update a database connection UDL file:

      Option Explicit

      Dim sServer, sDatabase, sUsername, sPassword
      sServer   = "servername"
      sDatabase = "database"
      sUsername = "username"
      sPassword = "password"

      Dim UDL
      UDL = ReadBinaryFile("Old.udl")
      UDL = SetValue(UDL, "Data Source", sServer)
      UDL = SetValue(UDL, "Initial Catalog", sDatabase)
      UDL = SetValue(UDL, "User ID", sUsername)
      UDL = SetValue(UDL, "Password", sPassword)
      UDL = SaveBinaryData("New.udl", UDL)

    ' =================================================================
    ' Function to read text from a binary (unicode) file
    ' =================================================================
      Function ReadBinaryFile(FileName)
         Const adTypeBinary = 1
         Dim BinaryStream
         Set BinaryStream = CreateObject("ADODB.Stream")
         BinaryStream.Type = adTypeBinary
         BinaryStream.Open
         BinaryStream.LoadFromFile FileName
         ReadBinaryFile = BinaryStream.Read
         BinaryStream.Close
         Set BinaryStream = Nothing
      End Function

    ' =================================================================
    ' Function to write a modified string back to a unicode file
    ' =================================================================
      Function SaveBinaryData(FileName, Text)
         Const adTypeBinary = 1
         Const adTypeText = 2
         Const adSaveCreateOverWrite = 2
         Dim BinaryStream
         Set BinaryStream = CreateObject("ADODB.Stream")
         BinaryStream.Type = adTypeBinary
         BinaryStream.Open
         With CreateObject("ADODB.Stream")
            .Type = adTypeText
            .Open: .WriteText Text
            .Position = 2
            .CopyTo BinaryStream, Len(Text) * 2
            .Close
         End With
         BinaryStream.SaveToFile FileName, adSaveCreateOverWrite
         BinaryStream.Close
         Set BinaryStream = Nothing
      End Function

    ' =================================================================
    ' Function to replace semicolon-delimited values in a unicode string
    ' =================================================================
      Function SetValue(Data, Key, Value)
         Dim Text, Prefix, Suffix, i
         If Len(Value) = 0 Then
            SetValue = Data
         Else
            'Drop leading character
            Text = Mid(Data, 2)
            i = InStr(Text, Key)
            Prefix = Left(Text, i + Len(Key))
            Suffix = Mid(Text, i + Len(Key))
            i = InStr(Suffix, ";")
            If i = 0 Then
               Suffix = ""
            Else
               Suffix = Mid(Suffix, i)
            End If
            'Restore leading character and concatenate new value
            SetValue = Left(Data, 1) + Prefix + Value + Suffix
         End If
      End Function

  21. LA.NET [EN] says:

    I’ve been playing with SideBar gadgets for some time now. Besides some quirks (ok, they’re really bugs

  22. Brahim Raddahi says:

    Here’s how to make a byte array, I extracted this from the example given by Hir

    Function VariantArrayToByteArray(arr)
        Dim DM, EL, bin
        Set DM = CreateObject("Microsoft.XMLDOM")
        Set EL = DM.createElement("tmp")
        EL.DataType = "bin.hex"
        EL.Text = ArrayToHexString(arr)
        bin = EL.NodeTypedValue
        VariantArrayToByteArray = bin
    End Function

  23. Brahim Raddahi says:

    You will also need this function:

    Function ArrayToHexString(arr)
        Dim I, B
        ReDim B(UBound(arr))
        For I = 0 To UBound(arr)
            'Two hex digits per element, zero-padded
            B(I) = Right("0" & Hex(arr(I)), 2)
        Next
        ArrayToHexString = Join(B, "")
    End Function

  24. P.G. says:

    I have ie6 on my machine. I tried using adodb.stream from vbs for manipulating

    binary files. I understand that adodb.stream is not getting recognized and

    I am unable to save or read binary files through this. any ideas??

  25. Blue Streak says:

    Try using an HTA or VBS file.

    IE6 blocks dynamic content (i.e. scripts) in HTM and HTML files.

  26. Ben says:

    Surely it is better to call it MBCS-land… and anyway, aren’t we all in an MBCS land these days? Everything except the most basic of basic text files (that do not include any currency symbols other than the dollar sign ;-))

  27. Ian Freeman says:

    I came across newObjects’ AXPack1 when I found out that Windows CE doesn’t have FSO. It seems to have great support for Binary files. It’s free too.

    Check out this VBS sample:

    Dim fso, file, BD, BoM
    Set fso = CreateObject("newObjects.utilctls.SFMain")
    Set file = fso.OpenFile(filename)
    Set BD = CreateObject("newObjects.utilctls.SFBinaryData")
    BD.Value = file.ReadBin(2)   'read first 2 bytes
    BoM = BD.Data(0,1)   'convert first 2 bytes to byte array
    file.Close()

    BoM now contains the byte order mark of filename in a VT_UI1 | VT_ARRAY byte array.

  28. William Shakespeare says:

    If ADODB.Stream is a solution.

    I don’t know what “much ado about nothing” means.


  29. 12demons says:

    Hem. I just tested that FileSystemObject can be used to READ and WRITE a binary file, only you have to CHEAT a little. Instead of using ReadAll(), just call Read() repeatedly up to the file Size. A little snippet in JScript:

    --- Code Start ---
    var objFileSystem = new ActiveXObject('Scripting.FileSystemObject');
    var objFileIO = objFileSystem.GetFile('pack.exe');
    var streamIO = objFileIO.OpenAsTextStream();
    var strTransform = new Array();
    for (i=0;i<objFileIO.Size;i++) {
        var strContent = streamIO.Read(1);
        strTransform[i] = strContent.charCodeAt(0);
    }
    streamIO.Close();
    objFileIO = objFileSystem.CreateTextFile('dumb.exe', true);
    for (i=0;i<strTransform.length;i++) { objFileIO.Write(strTransform[i]); }
    objFileIO.Close();
    --- Code End ---

    Basically, copy pack.exe into an Array, then write the Array into dumb.exe.

    Surprisingly both are identical and executable. Isn’t that strange?

    All I need now is a smart way to play with the Array ….

  30. 12demons says:

    ups. Sorry … a typo in the code. just replace:

    strTransform[i] = strContent.charCodeAt(0);

    into

    strTransform[i] = strContent;

    I use charCodeAt to play around with ASCII code …..

  31. Michele Fiorantino says:

    Scriptable Byte Array  

    Function CreateByteArray(nsize)
        Dim sBin
        Set sBin = CreateObject("Adodb.Stream")
        sBin.Open
        sBin.Type = 2 ' adTypeText = 2
        sBin.WriteText String(nsize, Chr(0))
        sBin.Position = 0
        sBin.Type = 1 ' adTypeBinary
        CreateByteArray = sBin.Read(nsize)
        sBin.Close
    End Function

  32. Kirby L. Wallace says:

    I use FSO and ADO.Streams for most of my binary file ops already.  But…  A comment here…

    The problem you seem to be describing here doesn’t seem to have so much to do with the FileSystemObject, but rather with a quirky behaviour in the CHR() function.  The fso seems to happily, and binarily, write your binary file without hesitation.  It only broke down when you tried to make the CHR() function do something it isn’t supposed to do.  

    Wouldn’t an ADO stream have done the same thing with that same input?

  33. Gerardo Lima says:

    The code described in [http://www.asp101.com/articles/jacob/scriptupload.asp] does the job just right for uploading files to a server using only vbscript and standard TextStream objects(1). The only catch is that it must run under the following page directive <%@ CODEPAGE="65001"; %>.

    (1) as returned by fso.CreateTextFile;