Playing With Roslyn
As a tools guy, I've been fascinated with Roslyn ever since Microsoft previewed it. It looks like it provides lots of power for tools to consume. Now that the first CTP is out, I spent some time over the weekend playing with it.
The first thing it gives you access to is parsing source code, giving you an object model you can play with, and then letting you spit the modified source code back out. Reading source code in is easy:
    var source_text = File.ReadAllText ("input.txt");
    var tree = SyntaxTree.ParseCompilationUnit(source_text);
    var root = (CompilationUnitSyntax)tree.Root;

Now we have an object model we can play with. We can search through the model for specific tokens and replace them with new ones, like this:
    // Replace "Hello World!" with "Goodbye World!"
    var output = root.GetFirstToken (p => p.Kind == SyntaxKind.StringLiteralToken);
    var new_output = Syntax.StringLiteralToken ("\"Goodbye World!\"", "Goodbye World!", output.LeadingTrivia, output.TrailingTrivia);
    root = root.ReplaceToken (output, new_output);

We can do source code formatting:
    // Add a space in front of all open parentheses that have no leading trivia yet
    var parens = root.DescendentTokens ().Where (p => p.Kind == SyntaxKind.OpenParenToken && p.LeadingWidth == 0);
    root = root.ReplaceTokens (parens, (p, q) => p.WithLeadingTrivia (Syntax.WhitespaceTrivia (" ")));

Finally, we can do syntax highlighting:
    private static void OutputNode (CompilationUnitSyntax token)
    {
        var default_color = Console.ForegroundColor;

        foreach (var t in token.DescendentTokens ()) {
            // Make keywords blue
            if (SyntaxFacts.IsKeyword (t))
                Console.ForegroundColor = ConsoleColor.DarkCyan;

            Console.Write (t.ToString ());
            Console.ForegroundColor = default_color;
        }
    }

Putting this all together and running it results in:
Note that "Hello World" has changed to "Goodbye World", open parentheses now have a space in front of them, and the keywords are colored blue, all with very few lines of code.
The tools guy inside of me loves playing with the API, but the Mono guy inside me has to go further and actually play with implementing the API, so the screenshot is actually running on a toy implementation of Roslyn's VB syntax tokenizer that I whipped up. I want to stress that it's just a quick hack that pretty much only parses my sample program and I know nothing about writing parsers, but it was a fun exercise.
The other nice feature of the Roslyn APIs is that they make each compiler step independently testable, so I know my tokenizer produces the same 59 syntax tokens as the MS implementation for my sample program, with the same leading and trailing trivia.
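To give a flavor of what that kind of check can look like, here is a minimal sketch. It does not use Roslyn or the Mokii code at all: `ToyToken` and `TokenStreamCheck` are hypothetical names made up for illustration, and the token kinds are just strings. The idea is only that if each token carries its kind, text, and leading and trailing trivia, then two tokenizers can be compared token by token, and the original source can be rebuilt by concatenating everything in order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a syntax token (NOT Roslyn's SyntaxToken):
// a kind, the token text, and the trivia attached on either side.
public sealed record ToyToken (string Kind, string Text, string LeadingTrivia, string TrailingTrivia);

public static class TokenStreamCheck
{
    // Returns the index of the first token that differs between the two
    // streams, or -1 if they are identical (same kinds, text, and trivia).
    public static int FirstMismatch (IReadOnlyList<ToyToken> expected, IReadOnlyList<ToyToken> actual)
    {
        var shared = Math.Min (expected.Count, actual.Count);

        // Records compare by value, so Equals checks all four fields.
        for (var i = 0; i < shared; i++)
            if (!expected[i].Equals (actual[i]))
                return i;

        // If one stream is a prefix of the other, the mismatch is the first extra token.
        return expected.Count == actual.Count ? -1 : shared;
    }

    // Concatenates leading trivia, text, and trailing trivia of every token,
    // which should reproduce the original source text exactly.
    public static string ToFullString (IEnumerable<ToyToken> tokens)
    {
        return string.Concat (tokens.Select (t => t.LeadingTrivia + t.Text + t.TrailingTrivia));
    }
}
```

With something like this, a test would feed the same source to the reference tokenizer and to the toy one, convert both results to `ToyToken` lists, and assert that `FirstMismatch` returns -1 and that `ToFullString` gives back the input text.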
If you want to play with the code, it's available on GitHub:
https://github.com/jpobst/Mokii
Comments
Seth
I'm wondering whether the actual Microsoft implementation exposes the compiler's own API, or whether it ships with a separate, custom parser?
Is it possible to expose the Mono compiler's API directly?