Custom building and code generators in Visual Studio 2005


I’m a fervent fan of using code generator tools wherever possible to make your life easier. Although they come with issues related to effective building, diagnostics, and debugging, the amount of value they add to your application is immense: they can eliminate entire classes of potential bugs, save you a great deal of effort and time, make your module much easier to extend and maintain, and even yield runtime performance gains. Among the most frequently used code generator tools are the lexer and parser generators GNU Flex and Bison, based on the classic generators lex and yacc. Although I’ll get into the details of how to use these tools effectively at a later time, today what I want to show you is a practical example of how to use the new custom build features of Visual Studio 2005 to effectively incorporate a code generator into your automatic build process.


Here is a very simple Bison grammar for evaluating arithmetic expressions involving addition, subtraction, and multiplication:

/* example.y */
%{
#include <stdio.h>
#define YYSTYPE int
int yylex(void);
int yyerror(char const *msg);
%}
%token PLUS MINUS STAR LPAREN RPAREN NUMBER NEWLINE
%left PLUS MINUS
%left STAR

%%
line : /* empty */
     | line expr NEWLINE { printf("%d\n", $2); }
     ;
expr : LPAREN expr RPAREN { $$ = $2; }
     | expr PLUS expr { $$ = $1 + $3; }
     | expr MINUS expr { $$ = $1 - $3; }
     | expr STAR expr { $$ = $1 * $3; }
     | NUMBER { $$ = $1; }
     ;

%%

/* Called by the generated parser when it encounters a syntax error. */
int yyerror(char const *msg) {
    printf("Error: %s\n", msg);
    return 0;
}

int main() {
    /* yyparse() returns 0 if the whole input was parsed successfully. */
    printf("%d\n", yyparse());
    return 0;
}


The Flex lexer used by this parser looks like this:

/* example.lex */
%{
#include <stdio.h>   /* printf */
#include <stdlib.h>  /* atoi */
#include "example.parser.h"
%}
%option noyywrap

%%

[ \t]+ { /* ignore whitespace */ }
"(" { return LPAREN; }
")" { return RPAREN; }
"+" { return PLUS; }
"-" { return MINUS; }
"*" { return STAR; }
\n { return NEWLINE; }
[0-9]+ { yylval = atoi(yytext); return NUMBER; }
. { printf("Invalid character '%s'", yytext); }

%%
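
To make the effect of the two %left declarations in example.y concrete, here is a minimal Python sketch of an evaluator with the same precedence and associativity. It is not what Bison generates (Bison emits a table-driven parser in C); the names PRECEDENCE, tokenize, and evaluate are invented for this illustration.

```python
import re

# Binding strength mirrors the grammar's declarations: tokens declared on a
# later %left line bind tighter, so STAR (2) outranks PLUS and MINUS (1).
PRECEDENCE = {"+": 1, "-": 1, "*": 2}


def tokenize(text):
    """Split an expression into number, operator, and parenthesis tokens."""
    return re.findall(r"\d+|[-+*()]", text)


def evaluate(text):
    """Evaluate an expression with the grammar's precedence rules."""
    tokens = tokenize(text)
    pos = 0

    def primary():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = expr(1)
            pos += 1  # skip the closing ")"
            return value
        return int(tok)

    def expr(min_prec):
        nonlocal pos
        left = primary()
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # Left associativity: the right operand must bind strictly tighter.
            right = expr(PRECEDENCE[op] + 1)
            if op == "+":
                left += right
            elif op == "-":
                left -= right
            else:
                left *= right
        return left

    return expr(1)


print(evaluate("2+3*4"))    # 14: STAR binds tighter than PLUS
print(evaluate("10-4-3"))   # 3: MINUS groups to the left
print(evaluate("(2+3)*4"))  # 20: parentheses override precedence
```

Typing the same three expressions into the finished parser should print the same three values, one per line.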


If we were writing this parser at a UNIX command line, we might generate the source files and compile the result using this sequence of commands:

bison -v -d example.y -o example.parser.c
flex -oexample.lexer.c example.lex
gcc -o example example.lexer.c example.parser.c

Now say you wanted to build the same application for Windows using Visual Studio 2005. The tools are available on Windows (see flex for Win32, bison for Win32), and you could simply run the same first two commands at the command line and then build the resulting source files in Visual Studio. However, this simple approach sacrifices many of the advantages that Visual Studio provides for its built-in source file types: it doesn’t rebuild the generated source files as needed, it doesn’t allow you to jump to errors that occur during generation, and it doesn’t allow you to configure build options using a nice GUI. Let’s see how we can reclaim these advantages for Flex and Bison files.


Creating a simple custom build type


Our first goal is simply to be able to build Flex and Bison files. First, use the Flex and Bison setup binaries from the links above to install the tools. A bin directory will be created in the installation directory. Add this to your system path. You should be able to execute both flex and bison from a command prompt without specifying a path.


Next, we create a new C++ console application. Uncheck the option to use precompiled headers; I’ll explain how to use these with Flex and Bison later. Remove the main source file created for you by the wizard. Next, right-click the project in Solution Explorer and choose Custom Build Rules. A dialog listing the available build rule files appears.


A build rule establishes how to build a file of a particular type. A group of related build rules is stored in a build rule file, which can be saved, distributed, and reused in many projects. We’ll start by creating a new build rule file for our Flex and Bison rules:



  1. Click New Rule File.
  2. Enter “GNU Tools” for Display Name and File Name.
  3. Choose a suitable directory for the build rule file. If it asks you if you want to add the directory to your search path, say yes.

Now we’ll create a build rule for Bison files:



  1. Click Add Build Rule.
  2. Enter the following values:

    1. Name: Bison
    2. File Extensions: *.y
    3. Outputs: $(InputName).parser.c;$(InputName).parser.h
    4. Command Line: bison -d [inputs] -o $(InputName).parser.c
    5. Execution Description: Generating parser…

  3. Click OK twice, then check the box labelled “GNU Tools” and click OK.
  4. Add the example.y file above to your project. Right-click on the file and choose Compile. You should receive no errors.
  5. Create a new file folder under the project called “Generated Files”. Add the existing file example.parser.c to this folder.

If you build now, you should receive only an error complaining that yylex() is undefined. Now, go back to Custom Build Rules and click Modify Rule File on GNU Tools. Create a rule for Flex:



  1. Click Add Build Rule.
  2. Enter the following values:

    1. Name: Flex
    2. File Extensions: *.lex
    3. Outputs: $(InputName).lexer.c
    4. Command Line: flex -o$(InputName).lexer.c [inputs]
    5. Execution Description: Generating lexer…

  3. Click OK three times.
  4. Add the example.lex file above to your project. Right-click on the file and choose Compile. You should receive no errors.
  5. Add the existing file example.lexer.c to your project.

If you build now, you should receive no errors and be able to run the application successfully. From now on, in any project you can simply check the “GNU Tools” box, add the .lex and .y files to your project, and build.

What happens if you modify example.y and build? Visual Studio runs Bison again and recompiles both example.parser.c, because it was regenerated, and example.lexer.c, because it includes a header file that was regenerated. If you modify the .lex file instead, Flex is rerun and example.lexer.c is recompiled, but example.parser.c is not rebuilt. If you had a larger parser, you’d appreciate how much time this incremental rebuilding saves you.
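
One way to picture this incremental behavior is as a timestamp comparison between each rule's inputs and its declared outputs. The Python sketch below is only a conceptual model under that assumption; it does not describe how Visual Studio actually tracks dependencies, and is_out_of_date is an invented name.

```python
import os


def is_out_of_date(inputs, outputs):
    """Return True if any output is missing or older than the newest input."""
    if not all(os.path.exists(path) for path in outputs):
        return True
    newest_input = max(os.path.getmtime(path) for path in inputs)
    oldest_output = min(os.path.getmtime(path) for path in outputs)
    return newest_input > oldest_output


# Inputs and outputs as declared in the two build rules above.
RULES = {
    "Bison": (["example.y"], ["example.parser.c", "example.parser.h"]),
    "Flex": (["example.lex"], ["example.lexer.c"]),
}

for name, (inputs, outputs) in RULES.items():
    if all(os.path.exists(path) for path in inputs):
        print(name, "needs to run:", is_out_of_date(inputs, outputs))
```

Under this model, touching example.y makes the Bison outputs stale, and because example.lexer.c includes example.parser.h, the lexer's object file becomes stale as well, while touching example.lex affects only the Flex outputs.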


Improving diagnostic support


Delete one of the “%%” marks in the .y file and build. Unsurprisingly, Bison fails. However, the Error List tells you no more than this. It’d be more helpful if you could find out what errors the tool produced. If you look at the output window, Bison did produce some errors, but if you double-click them to visit the error location, it just takes you to the top of the file. What gives?


The reason for this is that Visual Studio only recognizes one error format, that used by its own tools. Here’s an example:

c:\myprojects\myproject\hello.cpp(10) : error C2065: 'i' : undeclared identifier

Bison doesn’t output errors in this format, and so they aren’t parsed. Flex uses yet another format. What to do? The simplest way to deal with this is to have the build rule pipe the output of each tool through a small script that parses it and converts it to the desired format. You can write this script in any language; I wrote mine in C# using the .NET Framework’s regular expressions. Here’s what I wrote inside the Main() function for the Bison converter tool (error checking and such omitted):

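// Requires: using System; using System.IO; using System.Text.RegularExpressions;
string line;
while ((line = Console.In.ReadLine()) != null)
{
    // Bison reports errors as "file:line.columns: message".
    Match match = Regex.Match(line, "([^:]+):([0-9]+)\\.[^:]*: (.*)");
    // Regex.Match never returns null, so test Success rather than comparing to null.
    if (match.Success)
    {
        Console.WriteLine("{0}({1}): error BISON: {2}",
            Path.GetFullPath(match.Groups[1].Value),
            match.Groups[2].Value, match.Groups[3].Value);
    }
    else
    {
        // Pass through anything that is not an error line unchanged.
        Console.WriteLine(line);
    }
}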

I deploy the binary, say it’s called BisonErrorFilter.exe, to the same directory as bison.exe. I then change the Command Line of the Bison build rule to the following (click the arrow at the right of the field to access a multiline text box):

bison.exe -d [inputs] -o $(InputName).parser.c > bison.err 2>&1
BisonErrorFilter < bison.err

If you compile the .y file now, any errors should appear in the Error List, as desired, and you can double-click them to visit their locations. I wrote a similar script for the lexer output. Be careful when doing this, though: Visual Studio looks at the return code of the last command, which here is the filter, so if the filter misses an error, the failed step may be interpreted as a success. A better way to do this would be to wrap the tool in a script that passes its arguments to the tool, collects the tool’s output and return code, converts and prints the output, and then returns the tool’s return code.
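
A minimal sketch of such a wrapper might look like the following. It is written in Python rather than C# purely for brevity, and several details are assumptions: the script name toolwrap.py is hypothetical, the Bison pattern is carried over from the filter above, and the Flex pattern assumes the older "file", line N: message style of diagnostic, which may differ in your version of Flex.

```python
import os
import re
import subprocess
import sys

# Bison style, e.g.: example.y:12.5-9: syntax error, unexpected identifier
BISON_ERROR = re.compile(r"^([^:]+):(\d+)\.[^:]*: (.*)$")
# Assumed older Flex style, e.g.: "example.lex", line 7: unrecognized rule
FLEX_ERROR = re.compile(r'^"([^"]+)", line (\d+): (.*)$')


def convert_line(line, tool_name):
    """Rewrite one diagnostic line as file(line): error TOOL: message."""
    match = BISON_ERROR.match(line) or FLEX_ERROR.match(line)
    if match is None:
        return line  # not a recognized diagnostic; pass it through unchanged
    path, line_number, message = match.groups()
    return "%s(%s): error %s: %s" % (
        os.path.abspath(path), line_number, tool_name.upper(), message)


def run_tool(command):
    """Run the tool, print its converted output, and return its exit code."""
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    tool_name = os.path.splitext(os.path.basename(command[0]))[0]
    for line in result.stdout.splitlines():
        print(convert_line(line, tool_name))
    # Returning the tool's own code means a missed error line still fails the build.
    return result.returncode


# Only act when a tool was actually named on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_tool(sys.argv[1:]))
```

With something like this on the path, the build rule's Command Line could become a single line such as python toolwrap.py bison -d [inputs] -o $(InputName).parser.c, with no temporary bison.err file. Like the original filter, the Bison pattern stops at the first colon, so it expects relative input paths rather than paths with a drive letter.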


I haven’t figured out how, but I believe it’s possible to also create custom help entries for each error message, then have the filter tool produce the right error code for each one. This way, users can get help for each error individually by just clicking on it and pressing F1.


Properties


Properties enable you to control how the command-line tool is executed directly from the properties page for each individual file you wish to build with it. Let’s start with a simple example: a handy lexer switch is -d, which prints out an informative message each time a token is recognized. We don’t want it on all the time, and certainly not in release mode, but it’d be handy to be able to turn on and off as necessary.


To create a property for this, first return to the lexer build rule. Then follow these steps:



  1. Click Add Property.
  2. Choose Boolean for the User Property Type.
  3. Enter the following values:

    1. Name: debug
    2. Display Name: Print debug traces
    3. Switch: -d
    4. Description: Displays an informative message each time a token is recognized.

  4. Click OK. Then, add [debug] right after “flex” in the Command Line field.
  5. Click OK three times.
  6. Right-click on example.lex in Solution Explorer and choose Properties.
  7. In the left pane, click the plus next to Flex. Click General.
  8. You’ll see your property. Click on it and its description will appear at the bottom. Set it to Yes.
  9. Click Command Line in the left pane. You’ll see that the -d flag has been added.
  10. Click OK and build.
  11. Run the app and type an arithmetic expression. You’ll see trace messages.
  12. View the project properties. You’ll see that it now has a Flex node also. Here you can set the default settings for all files of that type in the project which don’t have specific overriding settings set.

Adding more properties is just as simple. You can go through the man page for the tool and add properties for each switch, using the Category field to group them into categories. You can use the other property types for switches accepting arguments. If you want, you can create a detailed help file with additional explanation and examples for each switch. When you’re done you have an impressive-looking property sheet for your files, reminiscent of those for built-in types.


You can also set different settings for debug and release builds. For example, for Flex, it’s good to set table size to slowest and smallest for the Debug version, to speed up compilation, and to set it to the recommended full tables with equivalence classes for the Release version, which is a good tradeoff of table size and speed.
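
As a rough model of how property placeholders and per-configuration settings combine into the final command line, consider the Python sketch below. The debug property is the one created in the steps above; the tables property is hypothetical, standing in for a table-compression setting. The switches themselves are Flex’s own: -Cem is its slowest and smallest setting, and -Cfe is full tables with equivalence classes.

```python
# Hypothetical per-configuration property values. "debug" is the Boolean
# property from the steps above; "tables" is an invented string property.
SETTINGS = {
    "Debug": {"debug": "-d", "tables": "-Cem"},
    "Release": {"debug": "", "tables": "-Cfe"},
}

FLEX_TEMPLATE = "flex [debug] [tables] -o$(InputName).lexer.c [inputs]"


def expand_command_line(template, configuration, input_file):
    """Substitute property switches and macros into a rule's command line."""
    input_name = input_file.rsplit(".", 1)[0]
    command = template.replace("$(InputName)", input_name)
    command = command.replace("[inputs]", input_file)
    for name, switch in SETTINGS[configuration].items():
        command = command.replace("[%s]" % name, switch)
    # A Boolean property set to No expands to nothing; tidy the leftover spaces.
    return " ".join(command.split())


for configuration in ("Debug", "Release"):
    print(configuration, "->",
          expand_command_line(FLEX_TEMPLATE, configuration, "example.lex"))
# Debug -> flex -d -Cem -oexample.lexer.c example.lex
# Release -> flex -Cfe -oexample.lexer.c example.lex
```

This is the same expansion you can inspect on the Command Line page of the file’s properties, where each placeholder is replaced by its switch or dropped.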


Finally, once you’re done adding all the properties you like, you can take the resulting .rules file and give it to everyone on your team, or distribute it on a website, so that everyone can easily integrate the tool into Visual Studio. Perhaps eventually tools like Flex and Bison will ship with a .rules file.


Conclusion


In Visual Studio 2003 you would have had to write a plug-in to come close to achieving this level of integration with a third-party tool. Although it has limitations, I hope the problems solved by these new features help encourage you to incorporate more tools and code generation into your regular development. Now that you know how to use Flex and Bison from the Visual Studio IDE, next time I’ll talk about how to use the tools themselves, going through some of the development and debugging processes that a grammar developer goes through, and show you some similar tools for other .NET languages. Thanks for reading, everyone.

Comments (11)

  1. Don says:

    You should check out CodeSmith. It’s a simply incredible code generation utility that works great with Visual Studio. For me, the learning curve was about 15 minutes (try to find anyone who can say that about Yacc or Bison)

    http://www.codesmithtools.com/

  2. Mike G says:

    Everything you wanted to know about Flex and Bison, but were too afraid to ask the sandal-wearing geeks!

  3. Jonathon Bell says:

    When Visual Studio 2005 saves out a project that references the new rule file ‘gnu.rules’, it saves the path to the file as a relative pathname:

    <ToolFile RelativePath="..blahblahgnu.rules"/>

    This makes it hard to share the rule file amongst the developers at my organization. Is there a special path, environment var, registry entry, or something that I’m missing that tells VS2005 where to look for rule files when loading the projects that refer to them?

    Jonathon

  4. MSDN Archive says:

    Hi Jonathon. I believe the behaviour you describe was chosen so that you could check your .rules file into your version control repository without everyone having to check it out to the same absolute location. I believe you can set an absolute path by editing the project file, possibly even a network path, but I don’t know of any way to set a search path. I hope this helps.

  5. Nathan Yospe says:

    Jonathan,

    I spent far too much time trying to find a way to get around the same issue, for very different reasons.

    I wanted to add the .rules files for a few tools – source generators, resource converters, custom compilers, and so forth – to the source control system, and have done with the pain of managing custom build steps that none of our developers ever added correctly.

    Unfortunately, I wanted to be able to use the same .rules files from all of the projects in the source tree. The .rules files needed to be able to address locations relative to the root of the source tree – and to the location of the .rules file itself – as well as the target files that $(InputPath) provided.

    Visual Studio does not provide this macro.

    I ended up working around this by setting a properties file named root_from_sln.vsprops in each solution directory, and creating a macro for the path relative to the solution.

  6. leon says:

    i’ve only just found your excellent blog. top work.

    because you’re "a fervent fan of using code generator tools" i wanted to show you my "world’s simplest code generator (version 3)", because i think it has a few usability advantages over the big guns and the classics (i.e. bison et al).

    it’s located here:

    http://www.secretgeek.net/w3scg.asp

    best of luck, keep up the excellent work

    lb

  7. Rakesh says:

    How do I use flex and bison tool in C#?

  8. Tracy Whitehead says:

    When I’m compiling at the stage of the first sentence of the last paragraph of section -Creating a simple custom build type-, I receive the numerous warnings and errors:

    Warning 1 warning C4013: ‘atoi’ undefined; assuming extern returning int c:FlexBisonExample.lex 16

    Warning 2 warning C4267: ‘=’ : conversion from ‘size_t’ to ‘int’, possible loss of data c:FlexBisonExample.lexer.c 901

    Warning 3 warning C4244: ‘initializing’ : conversion from ‘__w64 int’ to ‘int’, possible loss of data c:FlexBisonExample.lexer.c 1067

    Warning 4 warning C4996: ‘fileno’ was declared deprecated c:FlexBisonExample.lexer.c 1254

    Warning 5 warning C4013: ‘exit’ undefined; assuming extern returning int c:FlexBisonExample.lexer.c 1448

    Warning 6 warning C4013: ‘malloc’ undefined; assuming extern returning int c:FlexBisonExample.lexer.c 1511

    Warning 7 warning C4312: ‘type cast’ : conversion from ‘int’ to ‘void *’ of greater size c:FlexBisonExample.lexer.c 1511

    Warning 8 warning C4013: ‘realloc’ undefined; assuming extern returning int c:FlexBisonExample.lexer.c 1529

    Warning 9 warning C4312: ‘type cast’ : conversion from ‘int’ to ‘void *’ of greater size c:FlexBisonExample.lexer.c 1529

    Warning 10 warning C4013: ‘free’ undefined; assuming extern returning int c:FlexBisonExample.lexer.c 1539

    Error 11 error LNK2001: unresolved external symbol _yylval Example.lexer.obj

    Error 12 error LNK2019: unresolved external symbol _main referenced in function ___tmainCRTStartup MSVCRTD.lib

    Error 13 fatal error LNK1120: 2 unresolved externals C:FlexBisonDebugFlexBison.exe 1

    What can I be doing wrong?

    TW

  9. Ramana says:

    This process requires compiling each file separately. If I want to build all .asm files at once, what can I do?

  10. Nicholas says:

    I followed these instructions to the letter but I couldn’t get Visual Studio to compile the code bison or flex generated. First of all, there are errors concerning functions in libraries not included in the example. How is it supposed to work "as is" if the programme doesn’t have access to the printf method even?

    Also, I’m getting strange syntax errors, including spare ‘{‘s.

    Is there anything I can do??