Building a Code Analyzer for the Roslyn Analyzer Project


Welcome to our weekly MVP series where we discuss some of the most useful solutions, tips & tricks and how-tos on integrating OSS & Microsoft technologies. If you have any questions, don’t hesitate to drop us a comment below or reach out to us via Twitter and Facebook! Enjoy!

With the official release of Visual Studio 2015 this summer, code analyzers have started to get a lot of attention. Analyzers allow companies and individuals to enforce a given set of rules within a code base.
Broadly speaking there are two kinds of analyzers:

· Analyzers that enforce coding styles and best practices.

· Analyzers that guide individuals using a third party library.

The first set of analyzers may remind you of tools like StyleCop. Indeed there is an open source initiative to port StyleCop rules over into a set of analyzers. Other projects in this category include the Code Cracker project and Refactoring Essentials.

The second set of analyzers is meant to help people understand how to properly use a library. These analyzers detect improper usage and issue warnings to consumers. So what might this look like?

Let’s consider the DateTime API. DateTime values in .NET are immutable, which means there is no way to change a given date once it’s created. Here is what that means in code:

var date = DateTime.UtcNow;

date.AddDays(1); //Wrong: This does nothing.

var newDate = date.AddDays(1); //Good: This assigns a new date based off the old one.

//or

date = date.AddDays(1); //Good: Overwrites the old date with the new one.

So a Roslyn analyzer could be shipped with the DateTime library that catches and warns you about all the places in your code where you're using the API incorrectly. The beauty is that analyzers can catch a lot of errors at build time that you normally wouldn't catch until run time.

Immutability in Roslyn

It turns out that most people run into a similar problem when they use Roslyn for the first time. Much of the API exposes immutable objects and most people can’t figure out why their objects aren’t being updated.

Document doc = default(Document);

var newText = SourceText.From("class C { void M() { } }");

doc.WithText(newText); //Bad: Does not change doc.

var newDoc = doc.WithText(newText); //Good: Assign to new variable.

//or

doc = doc.WithText(newText); //Good: Overwrite our previous reference to document


Basically Roslyn has the same pitfalls as DateTime. So why not write an analyzer to help people catch this bug the moment they type it?

Planning a Roslyn Analyzer

If you're unfamiliar with Roslyn, this next bit might be tough to follow. If you'd like an introduction, I've put together a series on getting started with Roslyn called Learn Roslyn Now.

At a high level, we want our analyzer to look for invocations of methods on immutable objects that might mislead consumers of the API. In my experience these methods usually start with one of:

· Add

· Replace

· Remove

· With

Our previous example dealt with the invocation of the method “WithText()”.

We also want our analyzer to operate only on immutable objects. Issuing a warning on .Add() for a List isn’t going to help anyone. C# doesn’t offer a way to discover immutable objects, so we’re going to have to hard-code a list of immutable types into our analyzer. For starters we’ll analyze any objects whose type is, or inherits from:

· Solution

· Project

· Document

· SyntaxNode

In my experience, the hardest part of writing an analyzer is figuring out exactly what sort of code you want your analyzer to catch. I’ve found that it helps to create a list of situations in which we want our analyzer to create a warning and a list of situations where we do not. For simplicity’s sake, we’ll use the same Document example as before:

var newDoc = doc.WithText(newText); //Good: Assign to a new variable

doc = doc.WithText(newText); //Good: Assign to existing variable

someOtherMethod(doc.WithText(newText)); //Good: Pass the result to a method.

return doc.WithText(newText); //Good: Return the result

doc.WithText(newText); //Bad: Ignoring return value

So from this list we can start reasoning about what kind of syntax we'll be looking for in our analyzer. Our analyzer will look at invocations whose parent is a statement, meaning the call stands alone as a statement all by itself. The one exception is an invocation found directly within a return statement, because there the result is handed back to the caller.
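To make the "parent of the invocation" idea concrete, here is a minimal sketch that parses the five cases above and lists the syntax kind of each invocation's parent. It assumes the Microsoft.CodeAnalysis.CSharp package is referenced; the class name InvocationParentDemo and the wrapper method M are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class InvocationParentDemo
{
    // The five cases from the list above, wrapped in a hypothetical method
    // so that they parse as ordinary statements. Nothing is bound here,
    // so the types do not need to resolve.
    private const string Source = @"
class C
{
    Document M(Document doc, SourceText newText)
    {
        var newDoc = doc.WithText(newText);
        doc = doc.WithText(newText);
        someOtherMethod(doc.WithText(newText));
        doc.WithText(newText);
        return doc.WithText(newText);
    }
}";

    // Pairs each invocation's source text with the syntax kind of its parent node.
    public static List<KeyValuePair<string, SyntaxKind>> GetInvocationParents()
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(Source).GetRoot();
        return root.DescendantNodes()
            .OfType<InvocationExpressionSyntax>()
            .Select(i => new KeyValuePair<string, SyntaxKind>(i.ToString(), i.Parent.Kind()))
            .ToList();
    }

    public static void PrintInvocationParents()
    {
        foreach (var pair in GetInvocationParents())
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

Only the invocations whose parent is an ExpressionStatement survive the syntactic filter. Note that the outer call to someOtherMethod(...) is one of them; it is the later name and type checks that rule it out.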

Writing our Analyzer

Now that we've got some perspective on the problem, let's draft some steps in pseudo-code. Our analyzer should:

· Find all invocations

o Filter down to only invocations whose parent is a complete statement.

o Filter out any invocations whose parent is a return statement.

· For each invocation in our list of invocations:

o Does the name of the invoked method start with "Add", "Remove", "Replace" or "With"? If not, ignore it.

o Is the invocation on an object whose type is, or inherits from, Solution, Project, Document, or SyntaxNode? If not, ignore it.

o Issue a warning on this invocation.

At some point we're going to have to hit the symbol API to figure out if we're looking at a Document or one of the other immutable types. With this in mind I chose to register my analyzer as a CodeBlockAction. CodeBlockAction analyzers run after semantic analysis completes for a given block. This means we won't force any semantic analysis when we need to use the symbol API.
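For readers who have not registered a code block action before, the following is a rough sketch of what the surrounding analyzer class might look like. The class name, the diagnostic ID "DEMO0001", the title, the message wording, and the category are all placeholders of mine, not the values used in the actual project; only the rule field name DoNotIgnoreReturnValueDiagnosticRule and the method name AnalyzeCodeBlockForIgnoredReturnValues are taken from this article.

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;

[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class DoNotIgnoreReturnValueAnalyzer : DiagnosticAnalyzer
{
    // Hypothetical rule definition: ID, wording, and category are placeholders.
    // {0} is meant for the containing type name, {1} for the method name.
    internal static readonly DiagnosticDescriptor DoNotIgnoreReturnValueDiagnosticRule =
        new DiagnosticDescriptor(
            "DEMO0001",
            "Do not ignore the return value of methods on immutable objects",
            "'{0}' is immutable, so calling '{1}' has no effect unless its return value is used",
            "Usage",
            DiagnosticSeverity.Warning,
            true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
        => ImmutableArray.Create(DoNotIgnoreReturnValueDiagnosticRule);

    public override void Initialize(AnalysisContext context)
    {
        // Ask the analyzer driver to call us back for each code block
        // (for example a method body).
        context.RegisterCodeBlockAction(AnalyzeCodeBlockForIgnoredReturnValues);
    }

    public void AnalyzeCodeBlockForIgnoredReturnValues(CodeBlockAnalysisContext context)
    {
        // The analysis logic goes here.
    }
}
```

The two placeholders in the message format line up with the two extra arguments passed to Diagnostic.Create when a diagnostic is reported.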

For starters, we set up the list of immutable objects and problem methods:

private static readonly string s_solutionFullName = @"Microsoft.CodeAnalysis.Solution";
private static readonly string s_projectFullName = @"Microsoft.CodeAnalysis.Project";
private static readonly string s_documentFullName = @"Microsoft.CodeAnalysis.Document";
private static readonly string s_syntaxNodeFullName = @"Microsoft.CodeAnalysis.SyntaxNode";

// A list of known immutable object names
private static readonly List<string> s_immutableObjectNames = new List<string>()
{
    s_solutionFullName,
    s_projectFullName,
    s_documentFullName,
    s_syntaxNodeFullName,
};

private static readonly string s_Add = "Add";
private static readonly string s_Remove = "Remove";
private static readonly string s_Replace = "Replace";
private static readonly string s_With = "With";

// A list of method name prefixes that suggest a new object is returned
private static readonly List<string> s_immutableMethodNames = new List<string>()
{
    s_Add,
    s_Remove,
    s_Replace,
    s_With,
};
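As a small illustration of how a prefix list like this behaves, here is a standalone sketch of the name check on its own. The class and method names (PrefixCheckDemo, LooksLikeImmutableUpdate) are hypothetical, and the ordinal, case-sensitive comparison is my assumption about what is wanted.

```csharp
using System;
using System.Linq;

public static class PrefixCheckDemo
{
    private static readonly string[] s_prefixes = { "Add", "Remove", "Replace", "With" };

    // True when the method name begins with one of the prefixes
    // (case-sensitive, ordinal comparison).
    public static bool LooksLikeImmutableUpdate(string methodName)
    {
        return s_prefixes.Any(p => methodName.StartsWith(p, StringComparison.Ordinal));
    }
}
```

With this sketch, "WithText" and "ReplaceNode" match while "GetText" does not. A name such as "AddressOf" would also match, which is one reason the name check alone is not enough and has to be combined with the check on the containing type.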

Next we build the “brains” of our analyzer according to the pseudo-code we laid out above.

public void AnalyzeCodeBlockForIgnoredReturnValues(CodeBlockAnalysisContext context)
{
    var model = context.SemanticModel;
    var syntaxNode = context.CodeBlock;

    //Find invocations that ignore the return value
    //We're looking for invocations that are children of a statement but not of a return statement.
    var candidateInvocations = syntaxNode.DescendantNodes()
        .OfType<InvocationExpressionSyntax>()
        .Where(n => n.Parent is StatementSyntax && !(n.Parent is ReturnStatementSyntax));

    foreach (var candidateInvocation in candidateInvocations)
    {
        //If we can't find the method symbol, skip this invocation
        var methodSymbol = model.GetSymbolInfo(candidateInvocation).Symbol as IMethodSymbol;
        if (methodSymbol == null)
            continue;

        //If the method doesn't start with something like "With" or "Replace", skip it
        string methodName = methodSymbol.Name;
        if (!s_immutableMethodNames.Any(n => methodName.StartsWith(n, StringComparison.Ordinal)))
            continue;

        //If the method isn't declared on one of the known immutable types (or a type derived from one), skip it
        var parentName = methodSymbol.ContainingType.ToString();
        var baseTypesAndSelf = methodSymbol.ContainingType.GetBaseTypes().Select(n => n.ToString()).ToList();
        baseTypesAndSelf.Add(parentName);
        if (!baseTypesAndSelf.Any(n => s_immutableObjectNames.Contains(n)))
            continue;

        //Report the ignored return value at the location of the invocation
        var location = candidateInvocation.GetLocation();
        var diagnostic = Diagnostic.Create(DoNotIgnoreReturnValueDiagnosticRule, location, methodSymbol.ContainingType.Name, methodSymbol.Name);
        context.ReportDiagnostic(diagnostic);
    }
}
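One detail worth noting: as far as I can tell, GetBaseTypes() is not a member of the public symbol interfaces, so it is presumably an extension helper available inside the project. If you are following along in your own analyzer, a minimal stand-in could look like the sketch below. The names TypeSymbolHelpers and GetBaseTypesAndSelf are mine, and unlike the call above it includes the type itself, so no separate Add is needed.

```csharp
using System.Collections.Generic;
using Microsoft.CodeAnalysis;

public static class TypeSymbolHelpers
{
    // Yields the type itself followed by each of its base types,
    // walking up the inheritance chain until BaseType is null.
    public static IEnumerable<INamedTypeSymbol> GetBaseTypesAndSelf(INamedTypeSymbol type)
    {
        for (INamedTypeSymbol current = type; current != null; current = current.BaseType)
        {
            yield return current;
        }
    }
}
```

With such a helper, the type check could be written as TypeSymbolHelpers.GetBaseTypesAndSelf(methodSymbol.ContainingType).Any(t => s_immutableObjectNames.Contains(t.ToString())).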

The full analyzer is available on my public GitHub fork.

Now we can test it out against our previous examples:
Screenshot: https://i.gyazo.com/732f7437bfcc9d76ad543323df9d7f35.png

Awesome, looks good! 🙂

Submitting a Pull Request to Roslyn Analyzers

Finally, we have to submit our changes back to the original Roslyn Analyzers project. We do this by creating a pull request. A pull request is basically asking the maintainers of the Roslyn Analyzers project to pull our changes into their master branch.

Before doing so, they take some time to review the submitted code. In my case, my initial approach contained some bugs, and other contributors helped me catch them and improve my code. It can be nerve-wracking to put your code on display like that, but it's important to remember that we're all on the same team, contributing to a project in order to help make it better.

You can see the full process over at: https://github.com/dotnet/roslyn-analyzers/pull/301

With luck my pull request will be accepted sometime in the next week. 🙂
