How To: Concatenate files using MSBuild tasks


This came across the internal MSBuild discussion alias this week:



How can I concatenate a bunch of individual files into a single file during my build process? It looks like the ReadLinesFromFile and WriteLinesToFile tasks will do what I want, but I can’t figure out what kind of batching operators to use.


Dan provided an elegant answer. The first step is to create an ItemGroup with the files you want to concatenate:

<ItemGroup>
   <InFiles Include="1.in;2.in"/>
</ItemGroup>

Then create a target to do the concatenation:


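<Target Name="ConcatenateFiles">
   <!-- Batched on %(InFiles.Identity): runs once per input file -->
   <ReadLinesFromFile File="%(InFiles.Identity)">
      <Output TaskParameter="Lines" ItemName="Lines"/>
   </ReadLinesFromFile>
   <!-- Overwrite="false" appends to test.out if it already exists -->
   <WriteLinesToFile File="test.out" Lines="@(Lines)" Overwrite="false" />
</Target>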


The first task, ReadLinesFromFile, reads a single file at a time. Referencing the %(InFiles.Identity) item metadata makes MSBuild batch the task, so it runs once for each file in the InFiles list. On each run, the Output element appends the lines that were read to a new item list called Lines. WriteLinesToFile then takes the accumulated list and writes it out to test.out.
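
Put together, the two fragments might sit in a project file like the one below. This is a minimal sketch rather than the exact file from the discussion: the file name concat.proj and the DefaultTargets attribute are assumptions added for illustration, and Overwrite is switched to "true" here so that running the target twice replaces test.out instead of appending to it.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"
         DefaultTargets="ConcatenateFiles">

   <!-- The files to concatenate, in order -->
   <ItemGroup>
      <InFiles Include="1.in;2.in"/>
   </ItemGroup>

   <Target Name="ConcatenateFiles">
      <!-- Batched: runs once per file and accumulates into @(Lines) -->
      <ReadLinesFromFile File="%(InFiles.Identity)">
         <Output TaskParameter="Lines" ItemName="Lines"/>
      </ReadLinesFromFile>

      <!-- Not batched: writes the whole accumulated list in one call -->
      <WriteLinesToFile File="test.out" Lines="@(Lines)" Overwrite="true" />
   </Target>

</Project>
```

Assuming the file is saved as concat.proj next to 1.in and 2.in, running msbuild concat.proj from that directory should produce test.out.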


Pretty nifty!


[ Author: Neil Enns ]

Comments (10)

  1. barrkel says:

    Maybe I’m old-fashioned, but I can’t help but think that the old:

    cat a b c > d

    is more elegant and easier to understand. It can easily be extended further, using bash script, easily inserted in a Makefile:

    cat $(find . -iname '*.cs' | sort) > d

    This modern obsession with XML is fine for things that are wrapped up in a UI, but for something that needs programmer thought, you need (at some level) a programming language.

  2. Mike Fourie says:

    That’s a fair point. We’ve actually often wished we had some sort of more straightforward programmability in MSBuild, directly embedded in the project file, for this kind of thing.

    There are other ways of achieving the above. As you mention, cat works, and you can invoke it from the <Exec> task.

    Another option, if you do this often, would be to write a task called <Concatenate> that takes the list of files and the output file name. It would certainly look nicer in the XML!

    The ReadLinesFromFile and WriteLinesToFile tasks were never meant to be used this way, but I thought it was pretty cool that you could combine stuff in MSBuild in a way it was never intended to be used. Even though it’s not a proper programming language, I’m always amazed at what you can do with our existing constructs.

    Neil

  3. Jimmie says:

    How can I reset the task parameter "Lines" if I want to read from and write to another file? I run into a problem when I write to a file in a new target: the output also contains the input from the earlier ReadLinesFromFile call.

  4. James Denning says:

    Just a note – this method strips out tab characters and blank lines

  5. Samir says:

    The code shown here works, but as James pointed out, ReadLinesFromFile strips the tabs and blank lines, so I lose all formatting in the concatenated files.

    Is there a way to prevent the tabs from being stripped?

  6. Neil says:

    Hi

    How can I specify duplicate files in the InFiles item and have them read in?

    i.e.

    <InFiles Include="1.in;2.in;1.in"/>

    Cheers!

    Neil

  7. You can concatenate files like this:

    <ItemGroup>

           <TextFiles Include="*.txt" Exclude="final.txt"/>

    </ItemGroup>

    <Exec Command="echo y| type %(TextFiles.Identity) >> final.txt"/>

  8. Simon Ransom says:

    >> <Exec Command="echo y| type %(TextFiles.Identity) >> final.txt"/>

    This unfortunately doesn’t work if the files are UTF-encoded with a BOM (byte-order mark), since it will add the BOM at each seam.

  9. Minherz says:

    This method trims leading spaces and tabs, so it isn't true file concatenation.
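
Several of the comments above report that the ReadLinesFromFile approach drops blank lines and leading whitespace. One possible way around that, following the <Exec> suggestion in comment 2, is to let the Windows copy command join the files in binary mode. The sketch below is an assumption-laden illustration, not something from the original discussion: the target name ConcatenateBinary is hypothetical, it only works where cmd.exe's copy is available, and because the copy is byte-for-byte, a byte-order mark at the start of a later file still ends up mid-file, as comment 8 describes.

```xml
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"
         DefaultTargets="ConcatenateBinary">

   <ItemGroup>
      <InFiles Include="1.in;2.in"/>
   </ItemGroup>

   <Target Name="ConcatenateBinary">
      <!-- The transform quotes each full path and joins them with '+',
           giving: copy /y /b "path\1.in"+"path\2.in" "test.out" -->
      <Exec Command="copy /y /b @(InFiles->'&quot;%(FullPath)&quot;', '+') &quot;test.out&quot;" />
   </Target>

</Project>
```

Unlike the batched tasks, this runs a single command for the whole list, so the file contents never pass through MSBuild's line handling.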