Special connect() issue 2014

Volume 29 Number 12A

C# and Visual Basic - Use Roslyn to Write a Live Code Analyzer for Your API

Alex Turner; issue 2014

For 10 years now, Visual Studio Code Analysis has provided build-time analysis of your C# and Visual Basic assemblies, running a specific set of FxCop rules written for the Microsoft .NET Framework 2.0. These rules are great at helping you avoid general pitfalls in your code, but they’re targeted at the problems developers hit back in 2005. What about today’s new language features and APIs?

With live, project-based code analyzers in Visual Studio 2015, API authors can ship domain-specific code analysis as part of their NuGet packages. Because these analyzers are powered by the .NET Compiler Platform (code-named “Roslyn”), they can produce warnings in your code as you type even before you’ve finished the line—no more waiting to build your code to find out you made a mistake. Analyzers can also surface an automatic code fix through the new Visual Studio light bulb prompt to let you clean up your code immediately.

Many of these live code analyzers weigh in at just 50 to 100 lines of code, and you don’t have to be a compiler jock to write them. In this article I’ll show how to write an analyzer that targets a common coding problem hit by those using .NET regular expressions (regex): How to make sure the regex pattern you wrote is syntactically valid before you run your app. I’ll do this by showing how to write an analyzer that contains a diagnostic to point out invalid regex patterns. I’ll follow up this article with a second installment later, where I’ll show how to add in a code fix to clean up the errors your analyzer finds.

Getting Ready

To get started, make sure you have the necessary Visual Studio 2015 bits:

  • Set up a box with Visual Studio 2015 in one of these two ways from aka.ms/vs2015:
    1. Install Visual Studio 2015 Community.
    2. Install Visual Studio 2015 Enterprise Trial.
  • Install the Visual Studio 2015 SDK. The Visual Studio SDK is now included as an optional component in Visual Studio setup. During installation, select Visual Studio Extensibility Tools under Common Tools to include the SDK. If you have already installed Visual Studio, you can install this SDK by going to the main menu File | New | Project, choosing C# in the left navigation pane, and then choosing Extensibility. When you select the "Install the Visual Studio Extensibility Tools" item, it prompts you to download and install the Visual Studio SDK.
  • Install the .NET Compiler Platform ("Roslyn") SDK from aka.ms/roslynsdktemplates. You can also install this SDK by going to the main menu File | New | Project, navigating to the C# / Extensibility node and choosing "Download the .NET Compiler Platform SDK". Along with project templates, this SDK includes the Roslyn Syntax Visualizer. This extremely useful tool helps you figure out what code model types you should look for in your analyzer. The analyzer infrastructure calls into your code for specific code model types, so your code only executes when necessary and can focus only on analyzing relevant code.
  • You also need to install the .NET Compiler Platform SDK into the experimental Visual Studio sandbox you’ll use to debug your analyzer. I’ll walk through how to do that in the Syntax Visualizer section later.

Exploring the Analyzer Template

Once you’re up and running with Visual Studio 2015, the Visual Studio SDK and the necessary VSIX packages, check out the project template for building an analyzer.

Inside Visual Studio 2015, go to File | New | Project | C# | Extensibility and choose the Analyzer with Code Fix (NuGet + VSIX) template. You can create analyzers in Visual Basic, as well, but for this article I’ll use C#. Be sure the target framework is set to .NET Framework 4.5.2 or higher at the top. Give your project the name RegexAnalyzer and select OK to create the project.

You’ll see a set of three projects the template generates:

  • RegexAnalyzer: This is the main project that builds the analyzer DLL containing the diagnostics and code fix. Building this project also produces a project-local NuGet package (.nupkg file) containing the analyzer.
  • RegexAnalyzer.VSIX: This is the project that bundles your analyzer DLL into a Visual Studio-wide extension package (.vsix file). If your analyzer doesn’t need to add warnings that affect builds, you can choose to distribute this .vsix file rather than the NuGet package. Either way, the .vsix project lets you press F5 and test your analyzer in a separate (debuggee) instance of Visual Studio.
  • RegexAnalyzer.Test: This is a unit test project that lets you make sure your analyzer is producing the right diagnostics and fixes, without running that debuggee instance of Visual Studio each time.

If you open the DiagnosticAnalyzer.cs file in the main project, you can see the default code in the template that produces diagnostics. This diagnostic does something a bit silly—it “squiggles” any type names that have lowercase letters. However, because most programs will have such type names, this lets you easily see the analyzer in action.

Make sure that the RegexAnalyzer.VSIX project is the startup project and press F5. Running the VSIX project loads an experimental sandbox copy of Visual Studio, which lets Visual Studio keep track of a separate set of Visual Studio extensions. This is useful as you develop your own extensions and need to debug Visual Studio with Visual Studio. Because the debuggee Visual Studio instance is a new experimental sandbox, you’ll see the same dialogs you got when you first ran Visual Studio 2015. Click through them as normal. You might also see some delays as you download debugging symbols for Visual Studio itself. After the first time, Visual Studio should cache those symbols for you.

Once your debuggee Visual Studio instance is running, use it to create a C# console application. Because the analyzer you’re debugging is the Visual Studio-wide .vsix extension, you should see a green squiggle appear on the Program class definition within a few seconds. If you hover over this squiggle or look at the Error List, you’ll see the message, “Type name ‘Program’ contains lowercase letters,” as shown in Figure 1. When you click on the squiggle, you’ll see a light bulb icon appear on the left. Clicking on the light bulb icon will show a “Make uppercase” fix that cleans up the diagnostic by changing the type name to be uppercase.

Figure 1 The Code Fix from the Analyzer Template

You’re able to debug the analyzer from here, as well. In your main instance of Visual Studio, set a breakpoint inside the AnalyzeSymbol method within DiagnosticAnalyzer.cs. As you type in the editor, analyzers continually recalculate diagnostics. The next time you type within Program.cs in your debuggee Visual Studio instance, you’ll see the debugger stop at that breakpoint.

Keep the console application project open for now, as you’ll use it further in the next section.

Inspecting the Relevant Code Using the Syntax Visualizer

Now that you’re oriented in the analyzer template, it’s time to start planning what code patterns you’ll look for in the analyzed code to decide when to squiggle.

Your goal is to introduce an error that shows up when you write an invalid regex pattern. First, within the console application’s Main method, add the following line that calls Regex.Match with an invalid regex pattern:

Regex.Match("my text", @"\pXXX");

Looking at this code and hovering on Regex and Match, you can work out the conditions for when you want to generate a squiggle:

  • There’s a call to the Regex.Match method.
  • The Regex type involved is the one from the System.Text.RegularExpressions namespace.
  • The second parameter of the method is a string literal that represents an invalid regular expression pattern. In practice, the expression might also be a variable or constant reference—or a computed string—but for this initial version of the analyzer, I’ll focus first on string literals. It’s often best to get an analyzer working end to end for a simple case before you move on to support more code patterns.

So, how do you translate these simple constraints into .NET Compiler Platform code? The Syntax Visualizer is a great tool to help figure that out.

You’ll want to install the visualizer within the experimental sandbox you’ve been using to debug analyzers. You might have installed the visualizer earlier, but the installer just installs the package into your main Visual Studio.

While you’re still in your debuggee Visual Studio instance, open Tools | Extensions & Updates | Online, and search the Visual Studio Gallery for “syntax visualizer.” Download and install the .NET Compiler Platform SDK, which includes the Syntax Visualizer package. Then choose Restart Now to restart Visual Studio.

Once Visual Studio restarts, open up the same console application project and open the Syntax Visualizer by choosing View | Other Windows | Syntax Visualizer. You can now move the caret around the editor and watch as the Syntax Visualizer shows you the relevant part of the syntax tree. Figure 2 shows the view for the Regex.Match invocation expression you’re interested in here.

Figure 2 The Syntax Visualizer in Action for the Target Invocation Expression

The Parts of the Syntax Tree

As you browse around in the syntax tree, you’ll see various elements.

The blue nodes in the tree are the syntax nodes, representing the logical tree structure of your code after the compiler has parsed the file.

The green nodes in the tree are the syntax tokens, the individual words, numbers and symbols the compiler found when it read the source file. Tokens are shown in the tree under the syntax nodes to which they belong.

The red nodes in the tree are the trivia, representing everything else that’s not a token: whitespace, comments and so on. Some compilers throw this information away, but the .NET Compiler Platform holds onto it, so your code fix can maintain the trivia as needed when your fix changes the user’s code.

By selecting code in the editor, you can see the relevant nodes in the tree, and vice versa. To help visualize the nodes you care about, you can right-click on the InvocationExpression in the Syntax Visualizer tree and choose View Directed Syntax Graph. This will generate a .dgml diagram that visualizes the tree structure below the selected node, as shown in Figure 3.

Figure 3 Syntax Graph for the Target Invocation Expression

In this case, you can see you’re looking for an InvocationExpression that’s a call to Regex.Match, where the ArgumentList has a second Argument node that contains a StringLiteralExpression. If the string value represents an invalid regex pattern, such as “\pXXX,” you’ve found the span to squiggle. You’ve now gathered most of the information needed to write your diagnostic analyzer.
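The shape described above can also be checked outside Visual Studio. The following is a minimal sketch, assuming a project that references the Microsoft.CodeAnalysis.CSharp NuGet package; the SyntaxShapeDemo class and its Describe method are hypothetical names used only for illustration. It parses the sample expression on its own, with no semantic model involved, and reads back the same node kinds the Syntax Visualizer shows:

```csharp
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical helper for illustration; not part of the analyzer template.
public static class SyntaxShapeDemo
{
    // Parses the sample call and returns the kinds and text of the nodes
    // the analyzer will look for.
    public static string[] Describe()
    {
        // Parsing needs no compilation or assembly references; it is syntax only.
        var expr = SyntaxFactory.ParseExpression(
            @"Regex.Match(""my text"", @""\pXXX"")");

        var invocation = (InvocationExpressionSyntax)expr;
        var memberAccess = (MemberAccessExpressionSyntax)invocation.Expression;
        var secondArgument = invocation.ArgumentList.Arguments[1].Expression;
        var literal = (LiteralExpressionSyntax)secondArgument;

        return new[]
        {
            invocation.Kind().ToString(),       // kind of the whole call
            memberAccess.Kind().ToString(),     // kind of "Regex.Match"
            memberAccess.Name.Identifier.Text,  // identifier right of the dot
            secondArgument.Kind().ToString(),   // kind of the second argument
            literal.Token.ValueText             // the pattern text itself
        };
    }
}
```

Describe should return InvocationExpression, SimpleMemberAccessExpression, Match, StringLiteralExpression and \pXXX, in that order. Nothing here says which Regex type or which Match overload is involved, which is why the semantic model is still needed.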

Symbols and the Semantic Model: Going Beyond the Syntax Tree

While the syntax nodes, tokens and trivia shown in the Syntax Visualizer represent the full text of the file, they don’t tell you everything. You also need to know what each identifier in the code is actually referencing. For example, you know this invocation is a call to a Match method on a Regex type with two parameters, but you don’t know what namespace that Regex type is in or which overload of Match is called. Discovering exactly what definitions are referenced requires the compilers to analyze the identifiers in the context of their nearby using directives.

Answering those kinds of questions requires you to ask the semantic model to give you the symbol associated with a given expression node. Symbols represent the logical entities that your code defines, such as your types and methods. The process of figuring out the symbol referenced by a given expression is known as binding. Symbols can also represent the entities you consume from referenced libraries, such as the Regex type from the Base Class Library (BCL).

If you right-click on the InvocationExpression and choose View Symbol, the property grid below fills with information from the method symbol of the invoked method, as shown in Figure 4.

Figure 4 Viewing the Regex.Match Method Symbol in the Syntax Visualizer

In this case you can look at the OriginalDefinition property and see that the invocation refers to the System.Text.RegularExpressions.Regex.Match method, and not a method on some other Regex type. The last piece of the puzzle in writing your diagnostic is to bind that invocation and check that the symbol you get back matches the string System.Text.RegularExpressions.Regex.Match.

Building the Diagnostic

Now that you’ve got your strategy, close the debuggee Visual Studio instance and return to your analyzer project to start building your diagnostic.

The SupportedDiagnostics Property

Open up the DiagnosticAnalyzer.cs file and look at the five string constants near the top. This is where you define the metadata for your diagnostic rule. Even before your analyzer produces any squiggles, the Ruleset Editor and other Visual Studio features will use this metadata to know the details of the diagnostics your analyzer might produce.

Update these strings to match the Regex diagnostic you plan to produce:

public const string DiagnosticId = "Regex";
internal const string Title = "Regex error parsing string argument";
internal const string MessageFormat = "Regex error {0}";
internal const string Description = "Regex patterns should be syntactically valid.";
internal const string Category = "Syntax";

The diagnostic ID and filled-in message string are shown to users in the Error List when this diagnostic is produced. A user can also use the ID in their source code in a #pragma directive to suppress an instance of the diagnostic. In the follow-up article, I’ll show how to use this ID to associate your code fix with this rule.
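As a sketch of what that #pragma suppression could look like in a consuming project, assuming the diagnostic ID is the "Regex" string defined above (the PatternTests class and InvalidPatternThrows method are hypothetical), a caller who passes an invalid pattern on purpose might write:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical consumer code; the class and method names are illustrative only.
public static class PatternTests
{
    // Returns true when Regex.Match rejects the deliberately invalid pattern.
    public static bool InvalidPatternThrows()
    {
        try
        {
            // Suppress the analyzer's "Regex" diagnostic for this one call only.
#pragma warning disable Regex
            Regex.Match("my text", @"\pXXX");
#pragma warning restore Regex
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

The disable/restore pair keeps the suppression scoped to the single call, so other invalid patterns in the same file would still be reported.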

In the line declaring the Rule field, you can also update the severity of the diagnostics you’ll be producing to be errors rather than warnings. If the regex string doesn’t parse, the Match method will definitely throw an exception at run time, and you should block the build as you would for a C# compiler error. Change the rule’s severity to DiagnosticSeverity.Error:

internal static DiagnosticDescriptor Rule =
  new DiagnosticDescriptor(DiagnosticId, Title, MessageFormat,
   Category, DiagnosticSeverity.Error, isEnabledByDefault: true, description: Description);

This is also the line where you decide whether the rule should be enabled by default. Your analyzer can define a larger set of rules that are off by default, and users can choose to opt in to some or all of the rules. Leave this rule enabled by default.

The SupportedDiagnostics property returns this DiagnosticDescriptor as the single element of an immutable array. In this case, your analyzer will only produce one kind of diagnostic, so there’s nothing to change here. If your analyzer can produce multiple kinds of diagnostics, you could make multiple descriptors and return them all from SupportedDiagnostics.

Initialize Method

The main entry point for your diagnostic analyzer is the Initialize method. In this method, you register a set of actions to handle various events the compiler fires as it walks through your code, such as finding various syntax nodes, or encountering a declaration of a new symbol. The silly default analyzer you get from the template calls RegisterSymbolAction to find out when type symbols change or are introduced. In that case, the symbol action lets the analyzer look at each type symbol to see if it indicates a type with a bad name that needs a squiggle.

In this case, you need to register a SyntaxNode action to find out when there’s a new call to Regex.Match. Recall from exploring in the Syntax Visualizer that the specific node kind you’re looking for is InvocationExpression, so replace the Register call in the Initialize method with the following call:

context.RegisterSyntaxNodeAction(AnalyzeNode, SyntaxKind.InvocationExpression);

The regex analyzer only needs to register a syntax node action that produces diagnostics locally. For more complex analysis that gathers data across multiple methods, see “Handling Other Register Methods” in this article.

AnalyzeNode Method

Delete the template’s AnalyzeSymbol method, which you no longer need. In its place, you’ll create an AnalyzeNode method. Click on AnalyzeNode within the RegisterSyntaxNodeAction call and press Ctrl+Dot to pull up the new light bulb menu. From there, choose Generate method to create an AnalyzeNode method with the right signature. In the generated AnalyzeNode method, change the parameter’s name from “obj” to “context.”

Now that you’ve gotten to the core of your analyzer, it’s time to examine the syntax node in question to see if you should surface a diagnostic.

First, you should press Ctrl+F5 to launch the debuggee instance of Visual Studio again (this time without debugging). Open up the console application you were just looking at and make sure the Syntax Visualizer is available. You’ll look at the visualizer a few times to find relevant details as you build the AnalyzeNode method.

Getting the Target Node

Your first step in the AnalyzeNode method is to take the node object you’re analyzing and cast it to the relevant type. To find that type, use the Syntax Visualizer you just opened. Select the Regex.Match call and navigate in the syntax tree to the InvocationExpression node. You can see just above the property grid that InvocationExpression is the kind of node, while the type is InvocationExpressionSyntax.

You can test a node’s type with a type check or test its specific kind with the IsKind method. However, here you can guarantee that a cast will succeed without either test, because you asked for the particular kind in the Initialize method. The node your action analyzes is available from the Node property on the context parameter:

var invocationExpr = (InvocationExpressionSyntax)context.Node;

Now that you have the invocation node, you need to check whether it’s a Regex.Match call that needs a squiggle.

Check No. 1: Is This a Call to a Match Method?

The first check you need is to make sure this invocation is a call to the correct Regex.Match. Because this analyzer will run on every keystroke in the editor, it’s a good idea to perform the quickest tests first and ask more expensive questions of the API only if those initial tests pass.

The cheapest test is to see whether the invocation syntax is a call to a method named Match. That can be determined before the compiler has done any special work to figure out which particular Match method this is.

Looking back at the Syntax Visualizer, you see that the InvocationExpression has two main child nodes, the SimpleMemberAccessExpression and the ArgumentList. By selecting the Match identifier in the editor as shown in Figure 5, you can see that the node you’re looking for is the second IdentifierName within the SimpleMemberAccessExpression.

Figure 5 Finding the Match Identifier in the Syntax Tree

As you build an analyzer, you’ll quite often be digging into syntax and symbols like this to find the relevant types and property values you need to reference in your code. When building analyzers, it’s convenient to keep a target project with the Syntax Visualizer handy.

Back in your analyzer code, you can browse the members of invocationExpr in IntelliSense and find a property that corresponds to each of the InvocationExpression’s child nodes, one named Expression and one named ArgumentList. In this case, you want the property named Expression. Because the part of an invocation that’s before the argument list can have many forms (for example, it might be a delegate invocation of a local variable), this property returns a general base type, ExpressionSyntax. From the Syntax Visualizer, you can see that the concrete type that you expect is a MemberAccessExpressionSyntax, so cast it to that type:

var memberAccessExpr =
  invocationExpr.Expression as MemberAccessExpressionSyntax;

You see a similar breakdown when you dig into the properties on memberAccessExpr. There’s an Expression property that represents the arbitrary expression before the dot, and a Name property that represents the identifier to the right of the dot. Because you want to check to see if you’re calling a Match method, check the string value of the Name property. For a syntax node, getting the string value is a quick way to get the source text for that node. You can use the new C# “?.” operator to handle the case where the expression you’re analyzing wasn’t actually a member access, causing the previous line’s “as” clause to return a null value:

if (memberAccessExpr?.Name.ToString() != "Match") return;

If the method being called isn’t named Match, you simply bail out. Your analysis is complete at minimal cost and there’s no diagnostic to generate.

Check No. 2: Is This a Call to the Real Regex.Match Method?

If the method is named Match, do a more involved check that asks the compiler to determine precisely which Match method the code is calling. Determining the exact Match requires asking the context’s semantic model to bind this expression to get the referenced symbol.

Call the GetSymbolInfo method on the semantic model and pass it the expression for which you want the symbol:

var memberSymbol =
  context.SemanticModel.GetSymbolInfo(memberAccessExpr).Symbol as IMethodSymbol;

The symbol object you get back is the same one you can preview in the Syntax Visualizer by right-clicking the SimpleMemberAccessExpression and choosing View Symbol. In this case, you’re choosing to cast the symbol to the common IMethodSymbol interface. This interface is implemented by the internal PEMethodSymbol type mentioned for that symbol in the Syntax Visualizer.

Now that you have the symbol, you can compare it against the fully qualified name you expect from the real Regex.Match method. For a symbol, getting the string value will give you its fully qualified name. You don’t really care which overload you’re calling yet, so you can just check as far as the word Match:

if (!memberSymbol?.ToString().StartsWith(
  "System.Text.RegularExpressions.Regex.Match") ?? true) return;

As with your previous test, check to see if the symbol matches the name you expect, and if it doesn’t match, or if it wasn’t actually a method symbol, bail out. Though it might feel a bit odd to operate on strings here, string comparisons are a common operation within compilers.

The Remaining Checks

Now your tests are starting to fall into a rhythm. At each step, you dig a bit further into the tree and check either the syntax nodes or the semantic model to test if you’re still in an error situation. Each time, you can use the Syntax Visualizer to see what types and property values you expect, so you know in which case to return and in which case to continue. You’ll follow this pattern to check the next few conditions.

Make sure the ArgumentList has at least two arguments:

var argumentList = invocationExpr.ArgumentList as ArgumentListSyntax;
if ((argumentList?.Arguments.Count ?? 0) < 2) return;

Then make sure the second argument is a LiteralExpression, because you’re expecting a string literal:

var regexLiteral =
  argumentList.Arguments[1].Expression as LiteralExpressionSyntax;
if (regexLiteral == null) return;

Finally, once you know it’s a literal, you can ask the semantic model to give you its constant compile-time value, and make sure it’s specifically a string literal:

var regexOpt = context.SemanticModel.GetConstantValue(regexLiteral);
if (!regexOpt.HasValue) return;
var regex = regexOpt.Value as string;
if (regex == null) return;

Validating the Regex Pattern

At this point, you’ve got all the data you need. You know you’re calling Regex.Match, and you’ve got the string value of the pattern expression. So how do you validate it?

Simply call the same Regex.Match method and pass in that pattern string. Because you’re only looking for parse errors in the pattern string, you can pass an empty input string as the first argument. Make the call within a try-catch block so you can catch the ArgumentException that Regex.Match throws when it sees an invalid pattern string:

try
{
  System.Text.RegularExpressions.Regex.Match("", regex);
}
catch (ArgumentException e)
{
  // The pattern failed to parse; the diagnostic is reported from here.
}

If the pattern string parses without error, your AnalyzeNode method will exit normally and there’s nothing to report. If there’s a parse error, you’ll catch the argument exception—you’re ready to report a diagnostic!
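The same validation step can be pulled into a small helper, which makes it easy to try patterns outside the analyzer. This is only a sketch: the PatternValidator class and GetParseError method are hypothetical names, not part of the template or of the .NET Framework.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class PatternValidator
{
    // Returns null when the pattern parses, or the parser's error message when it doesn't.
    public static string GetParseError(string pattern)
    {
        try
        {
            // The input text is irrelevant; only parsing of the pattern matters.
            Regex.Match("", pattern);
            return null;
        }
        catch (ArgumentException e)
        {
            return e.Message;
        }
    }
}
```

For example, GetParseError(@"\d+") returns null, while GetParseError(@"\pXXX") returns the message that the analyzer would plug into its diagnostic.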

Reporting a Diagnostic

Inside the catch block, you use the Rule object you filled in earlier to create a Diagnostic object, which represents one particular squiggle you want to produce. Each diagnostic needs two main things specific to that instance: the span of code that should be squiggled and the fill-in strings to plug into the message format you defined earlier:

var diagnostic =
  Diagnostic.Create(Rule, regexLiteral.GetLocation(), e.Message);

In this case, you want to squiggle the string literal so you pass in its location as the span for the diagnostic. You also pull out the exception message that describes what was wrong with the pattern string and include that in the diagnostic message.

The last step is to take this diagnostic and report it back to the context passed to AnalyzeNode, so Visual Studio knows to add a row to the Error List and add a squiggle in the editor:

context.ReportDiagnostic(diagnostic);

Your code in DiagnosticAnalyzer.cs should now look like Figure 6.

Figure 6 The Complete Code for DiagnosticAnalyzer.cs

using System;
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

namespace RegexAnalyzer
{
  [DiagnosticAnalyzer(LanguageNames.CSharp)]
  public class RegexAnalyzerAnalyzer : DiagnosticAnalyzer
  {
    // Metadata describing the one diagnostic this analyzer can produce.
    public const string DiagnosticId = "Regex";
    internal const string Title = "Regex error parsing string argument";
    internal const string MessageFormat = "Regex error {0}";
    internal const string Description = "Regex patterns should be syntactically valid.";
    internal const string Category = "Syntax";

    internal static DiagnosticDescriptor Rule =
      new DiagnosticDescriptor(DiagnosticId, Title, MessageFormat,
      Category, DiagnosticSeverity.Error, isEnabledByDefault: true, description: Description);

    public override ImmutableArray<DiagnosticDescriptor>
      SupportedDiagnostics { get { return ImmutableArray.Create(Rule); } }

    public override void Initialize(AnalysisContext context)
    {
      // Ask to be called for every invocation expression.
      context.RegisterSyntaxNodeAction(
        AnalyzeNode, SyntaxKind.InvocationExpression);
    }

    private void AnalyzeNode(SyntaxNodeAnalysisContext context)
    {
      var invocationExpr = (InvocationExpressionSyntax)context.Node;

      // Check 1 (syntax only): is this a call to a method named Match?
      var memberAccessExpr =
        invocationExpr.Expression as MemberAccessExpressionSyntax;
      if (memberAccessExpr?.Name.ToString() != "Match") return;

      // Check 2 (semantic): does the call bind to the real Regex.Match?
      var memberSymbol = context.SemanticModel.
        GetSymbolInfo(memberAccessExpr).Symbol as IMethodSymbol;
      if (!memberSymbol?.ToString().StartsWith(
        "System.Text.RegularExpressions.Regex.Match") ?? true) return;

      // The pattern is the second argument, so there must be at least two.
      var argumentList = invocationExpr.ArgumentList as ArgumentListSyntax;
      if ((argumentList?.Arguments.Count ?? 0) < 2) return;

      // Only literal arguments are handled in this version.
      var regexLiteral =
        argumentList.Arguments[1].Expression as LiteralExpressionSyntax;
      if (regexLiteral == null) return;

      // The literal's constant value must be a string.
      var regexOpt = context.SemanticModel.GetConstantValue(regexLiteral);
      if (!regexOpt.HasValue) return;
      var regex = regexOpt.Value as string;
      if (regex == null) return;

      // Let the regex parser validate the pattern; report its message on failure.
      try
      {
        System.Text.RegularExpressions.Regex.Match("", regex);
      }
      catch (ArgumentException e)
      {
        var diagnostic =
          Diagnostic.Create(Rule, regexLiteral.GetLocation(), e.Message);
        context.ReportDiagnostic(diagnostic);
      }
    }
  }
}

Trying It Out

That’s it: your diagnostic is now complete! To try it, just press F5 (make sure RegexAnalyzer.VSIX is the startup project) and reopen the console application in the debuggee instance of Visual Studio. You should soon see a red squiggle on the pattern expression pointing out why it failed to parse, as shown in Figure 7.

Figure 7 Trying Out Your Diagnostic Analyzer

If you see the squiggle, congratulations! If not, you can set a breakpoint inside the AnalyzeNode method, type a character inside the pattern string to trigger reanalysis, and then step through the analyzer code to see where the analyzer is bailing out early. You can also check your code against Figure 6, which shows the complete code for DiagnosticAnalyzer.cs.

Use Cases for Diagnostic Analyzers

To recap, starting from the analyzer template and writing about 30 lines of new code, you were able to identify and provide a squiggle for a real issue in your users’ code. Most important, doing so didn’t require you to become a deep expert in the operations of the C# compiler. You were able to stay focused on your target domain of regular expressions and use the Syntax Visualizer to guide you to the small set of syntax nodes and symbols relevant to your analysis.

There are many domains in your everyday coding where writing a diagnostic analyzer can be quick and useful:

  • As a developer or lead on a team, you might see others make the same mistakes often when you do code reviews. Now, you can write a simple analyzer that squiggles those anti-patterns and check the analyzer into source control, ensuring that anyone who introduces such a bug will notice it as they’re typing.
  • As the maintainer of a shared layer that defines business objects for your organization, you might have business rules for correct use of these objects that are hard to encode in the type system, especially if they involve numerical values or if they involve steps in a process where some operations should always come before others. Now you can enforce these softer rules that govern use of your layer, picking up where the type system leaves off.
  • As the owner of an open source or commercial API package, you might be tired of answering the same questions repeatedly in the forums. You might even have written white papers and documentation and found that many of your customers continue to hit those same issues, as they haven’t read what you wrote. Now you can bundle your API and the relevant code analysis into one NuGet package, ensuring that everyone using your API sees the same guidance from the start.

Hopefully, this article has inspired you to think about the analyzers you’ll want to build to enhance your own projects. With the .NET Compiler Platform, Microsoft has done the heavy lifting to expose deep language understanding and rich code analysis for C# and Visual Basic—the rest is up to you!

What’s Next?

So now you have Visual Studio showing error squiggles under invalid regex patterns. Can you do more?

If you’ve got the regular expressions domain knowledge to see not just what’s wrong with a pattern string but also how to fix it, you can suggest a fix in the light bulb, as you saw in the template’s default analyzer.

In the next article, I’ll show how to write that code fix, as you learn how to make changes to your syntax trees. Stay tuned!

Handling Other Register Methods

You can dig into the Initialize method’s context parameter to see the full set of Register methods you can call. The methods in Figure A let you hook various events in the compiler’s pipeline.

A key point to note is that a top-level action registered with any Register method should never stash any state in instance fields on the analyzer type. Visual Studio will reuse one instance of that analyzer type for the whole Visual Studio session to avoid repeated allocations. Any state you store and reuse is likely to be stale when analyzing a future compilation and could even be a memory leak if it keeps old syntax nodes or symbols from being garbage collected.

If you need to retain state across actions, you should call RegisterCodeBlockStartAction or RegisterCompilationStartAction and store the state as locals within that action method. The context object passed to those actions lets you register nested actions as lambda expressions, and these nested actions can close over the locals in the outer actions in order to keep state.
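The nested-action pattern might look like the following sketch. It is not the article's regex analyzer: RegexCallAnalyzer, its "RegexCall" ID and its message are hypothetical, and it assumes a recent Microsoft.CodeAnalysis.CSharp package that provides SymbolEqualityComparer. The analyzer looks up the Regex type symbol once per compilation, keeps it in a local, and lets a nested syntax node action close over that local:

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

// Hypothetical analyzer for illustration; it flags every call to a method on Regex.
[DiagnosticAnalyzer(LanguageNames.CSharp)]
public class RegexCallAnalyzer : DiagnosticAnalyzer
{
    internal static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        "RegexCall", "Regex method call", "Call to Regex.{0}",
        "Usage", DiagnosticSeverity.Info, isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
    {
        get { return ImmutableArray.Create(Rule); }
    }

    public override void Initialize(AnalysisContext context)
    {
        context.RegisterCompilationStartAction(compilationContext =>
        {
            // Per-compilation state lives in a local, not in a field on the analyzer.
            INamedTypeSymbol regexType = compilationContext.Compilation
                .GetTypeByMetadataName("System.Text.RegularExpressions.Regex");
            if (regexType == null) return; // Regex isn't referenced; nothing to do.

            // The nested action closes over regexType for this compilation only.
            compilationContext.RegisterSyntaxNodeAction(nodeContext =>
            {
                var invocation = (InvocationExpressionSyntax)nodeContext.Node;
                var method = nodeContext.SemanticModel
                    .GetSymbolInfo(invocation).Symbol as IMethodSymbol;
                if (method == null) return;
                if (!SymbolEqualityComparer.Default.Equals(
                    method.ContainingType, regexType)) return;

                nodeContext.ReportDiagnostic(
                    Diagnostic.Create(Rule, invocation.GetLocation(), method.Name));
            }, SyntaxKind.InvocationExpression);
        });
    }
}
```

Because regexType is recomputed for each compilation, nothing from an old compilation is kept alive between analyses. Comparing symbols rather than strings is a design choice of this sketch, not something the article's analyzer does.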

Figure A Register Methods To Hook Various Events

  • RegisterSyntaxNodeAction: Triggered when a particular kind of syntax node has been parsed
  • RegisterSymbolAction: Triggered when a particular kind of symbol has been analyzed
  • RegisterSyntaxTreeAction: Triggered when the file’s whole syntax tree has been parsed
  • RegisterSemanticModelAction: Triggered when a semantic model is available for the whole file
  • RegisterCodeBlockStartAction, RegisterCodeBlockEndAction: Triggered before/after analysis of a method body or other code block
  • RegisterCompilationStartAction, RegisterCompilationEndAction: Triggered before/after analysis of the entire project

Alex Turner is a senior program manager for the Managed Languages team at Microsoft, where he’s been brewing up C# and Visual Basic goodness on the .NET Compiler Platform (“Roslyn”) project. He graduated with a master’s degree in Computer Science from Stony Brook University and has spoken at Build, PDC, TechEd, TechDays and MIX conferences.

Thanks to the following Microsoft technical experts for reviewing this article: Bill Chiles and Lucian Wischik

Bill Chiles worked on languages (CMU Common Lisp, Dylan, IronPython and C#) and developer tools most of his career. He spent the last 17 years in Microsoft's Developer Division working on everything from core Visual Studio features, to the Dynamic Language Runtime, to C#.

Lucian Wischik is on the Visual Basic/C# language design team at Microsoft, with particular responsibility for Visual Basic. Before joining Microsoft he worked in academia on concurrency theory and async. He’s a keen sailor and long-distance swimmer.