@JamesTryand
Created November 14, 2023 06:06
roslyn

You: I have a lot of C# source code. There are certain strings in the source text that I would like to extract into YAML. For instance, a line like "class ShoppingCartEmptied : INotification" would become:

messages:
  ShoppingCartEmptied:
    payload:
      type: string
      pattern: '^hello .+$'

How would I do this in C#, using either regexes or Roslyn?

ChatGPT: Extracting such information from your C# source code can be done in two primary ways:

  1. Regular Expressions (Regex): suitable for simpler cases, or when you don't need a full parse tree for the code.
  2. Roslyn: Microsoft's .NET Compiler Platform, which offers APIs for parsing and semantic analysis of C# code. It's the way to go if you want more fine-grained information from your code or if your extraction logic becomes complex.

I'll show you how to do it with both methods:

  1. Using Regular Expressions:

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string sourceCode = "class ShoppingCartEmptied : INotification"; // Your source code
        string pattern = @"class (\w+) : INotification";

        var match = Regex.Match(sourceCode, pattern);
        if (match.Success)
        {
            var className = match.Groups[1].Value;
            // A verbatim string keeps its line breaks and leading spaces,
            // so the YAML lines are written from column 0.
            var yaml = $@"messages:
  {className}:
    payload:
      type: string
      pattern: '^hello .+$'";

            Console.WriteLine(yaml);
        }
    }
}

  2. Using Roslyn: First, install the necessary NuGet packages:

Microsoft.CodeAnalysis.CSharp
Microsoft.CodeAnalysis.CSharp.Workspaces

Now, use Roslyn to analyze the source:

using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public class Program
{
    public static void Main()
    {
        string sourceCode = "class ShoppingCartEmptied : INotification"; // Your source code

        var tree = CSharpSyntaxTree.ParseText(sourceCode);
        var root = tree.GetRoot() as CompilationUnitSyntax;

        foreach (var member in root.Members)
        {
            if (member is ClassDeclarationSyntax classDeclaration)
            {
                // BaseList is null for classes that declare no base types.
                foreach (var baseType in classDeclaration.BaseList?.Types ?? Enumerable.Empty<BaseTypeSyntax>())
                {
                    if (baseType.ToString() == "INotification")
                    {
                        var className = classDeclaration.Identifier.ValueText;
                        var yaml = $@"messages:
  {className}:
    payload:
      type: string
      pattern: '^hello .+$'";

                        Console.WriteLine(yaml);
                    }
                }
            }
        }
    }
}

This example uses Roslyn to parse the source code, and then checks for classes that list INotification in their base list. When it finds one, it extracts its name and produces the desired YAML output.

Choose the method that best suits your needs. If you're only interested in extracting class names that inherit from INotification, the regex solution might suffice. However, if you have more complex requirements or if you think they might evolve in the future, Roslyn is a more robust and scalable choice.
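One limitation of the regex version is worth illustrating: Regex.Match returns only the first hit and the pattern expects exactly one space around the colon. The following is a minimal sketch of a more tolerant variant, not a definitive solution. The class name NotificationRegex and the pattern itself are assumptions made for illustration; the pattern still ignores generic classes and can be fooled by comments or string literals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NotificationRegex
{
    // Hypothetical pattern: allows flexible whitespace and other base types
    // before INotification, stopping at the opening brace of the class body.
    private static readonly Regex Pattern =
        new Regex(@"\bclass\s+(\w+)\s*:\s*[^{]*\bINotification\b", RegexOptions.Compiled);

    // Returns the name of every class in the text whose base list mentions INotification.
    public static IReadOnlyList<string> FindClassNames(string sourceCode) =>
        Pattern.Matches(sourceCode)
               .Cast<Match>()
               .Select(m => m.Groups[1].Value)
               .ToList();
}
```

Calling NotificationRegex.FindClassNames on the text of a file gives one name per matching class, which can then be fed into the same YAML template.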

You: How would I iterate through a folder of files, loading each file with a specific extension ('.cs')?

ChatGPT: To iterate through a folder and load each file of a specific extension (in your case, .cs), you can use the System.IO.Directory class. Below, I'll show you how to do it with both the Regex and Roslyn approaches.

  1. Using Regular Expressions: First, you'll want to list all .cs files and then process each file with the Regex pattern:

using System;
using System.IO;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string folderPath = @"path_to_your_folder"; // replace with your folder path
        string pattern = @"class (\w+) : INotification";

        // GetFiles with this overload only looks at the top level of the folder.
        foreach (var file in Directory.GetFiles(folderPath, "*.cs"))
        {
            var sourceCode = File.ReadAllText(file);
            var match = Regex.Match(sourceCode, pattern);

            if (match.Success)
            {
                var className = match.Groups[1].Value;
                var yaml = $@"messages:
  {className}:
    payload:
      type: string
      pattern: '^hello .+$'";

                Console.WriteLine(yaml);
            }
        }
    }
}

  2. Using Roslyn: Same idea: list all .cs files and then process each file using Roslyn:

using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public class Program
{
    public static void Main()
    {
        string folderPath = @"path_to_your_folder"; // replace with your folder path

        foreach (var file in Directory.GetFiles(folderPath, "*.cs"))
        {
            var sourceCode = File.ReadAllText(file);
            var tree = CSharpSyntaxTree.ParseText(sourceCode);
            var root = tree.GetRoot() as CompilationUnitSyntax;

            foreach (var member in root.Members)
            {
                if (member is ClassDeclarationSyntax classDeclaration)
                {
                    // BaseList is null for classes that declare no base types.
                    foreach (var baseType in classDeclaration.BaseList?.Types ?? Enumerable.Empty<BaseTypeSyntax>())
                    {
                        if (baseType.ToString() == "INotification")
                        {
                            var className = classDeclaration.Identifier.ValueText;
                            var yaml = $@"messages:
  {className}:
    payload:
      type: string
      pattern: '^hello .+$'";

                            Console.WriteLine(yaml);
                        }
                    }
                }
            }
        }
    }
}

In both cases, you should replace @"path_to_your_folder" with the path to the folder you want to process. The programs will then load each .cs file directly inside that folder and extract the desired information to create the YAML content.
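Directory.GetFiles(folderPath, "*.cs") only looks at the top level of the folder. If your sources live in subfolders, a recursive enumeration may be closer to what you want. The sketch below is one way to do that, assuming you also want to skip the bin and obj build output folders; the SourceFiles class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class SourceFiles
{
    // Lazily yields every .cs file under folderPath, including subfolders,
    // except files that sit inside a bin or obj directory.
    public static IEnumerable<string> Enumerate(string folderPath) =>
        Directory
            .EnumerateFiles(folderPath, "*.cs", SearchOption.AllDirectories)
            .Where(path => !IsBuildOutput(Path.GetRelativePath(folderPath, path)));

    // True when any directory segment of the relative path is "bin" or "obj".
    private static bool IsBuildOutput(string relativePath) =>
        relativePath
            .Split(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar)
            .Any(segment =>
                segment.Equals("bin", StringComparison.OrdinalIgnoreCase) ||
                segment.Equals("obj", StringComparison.OrdinalIgnoreCase));
}
```

You could then replace the Directory.GetFiles call in either program with SourceFiles.Enumerate(folderPath) and leave the rest of the loop unchanged.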

You: What NuGet packages would be needed, and how would I register them using the dotnet command?

ChatGPT: To work with Roslyn, you'd need the following NuGet packages:

Microsoft.CodeAnalysis.CSharp
Microsoft.CodeAnalysis.CSharp.Workspaces

You can install these packages using the dotnet command-line tool.

Here's how you can add the packages to your project:

  1. Navigate to your project directory using a terminal or command prompt.
  2. Run the following commands:

dotnet add package Microsoft.CodeAnalysis.CSharp
dotnet add package Microsoft.CodeAnalysis.CSharp.Workspaces

These commands add the required package references to your project file, after which you can use the Roslyn APIs in your C# project.

For the regex approach, no additional NuGet packages are needed since System.Text.RegularExpressions is part of the base .NET libraries.

You: For the Roslyn solution, INotification is an interface rather than a class. How would the solution change?

ChatGPT: If INotification is an interface, the Roslyn solution wouldn't change much in terms of looking for types that implement the interface. In C#, both base classes and interfaces appear in the same BaseList property of the ClassDeclarationSyntax. The difference comes down to semantics: classes are derived (inherited), whereas interfaces are implemented.

In the provided solution, we were simply iterating over the BaseList.Types collection and checking if any of them matched the INotification string. This approach works for both base classes and interfaces.
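One caveat with comparing baseType.ToString() to a string: a base type written with a namespace qualifier would not match. Here is a small, syntax-only sketch that compares the rightmost name instead. The BaseListNames class is hypothetical, and Contracts.INotification in the usage note below is a made-up qualified name used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class BaseListNames
{
    // Reduces a type reference such as Contracts.INotification to its rightmost name.
    public static string SimpleName(TypeSyntax type) => type switch
    {
        QualifiedNameSyntax qualified => qualified.Right.Identifier.ValueText,
        AliasQualifiedNameSyntax aliased => aliased.Name.Identifier.ValueText,
        SimpleNameSyntax simple => simple.Identifier.ValueText,
        _ => type.ToString()
    };

    // True when the class lists a base class or interface with the given simple name.
    public static bool ListsBase(ClassDeclarationSyntax cls, string name) =>
        cls.BaseList?.Types.Any(t => SimpleName(t.Type) == name) ?? false;

    // Names of all classes in the source text that list the given base type.
    public static List<string> ClassesListing(string source, string name) =>
        CSharpSyntaxTree.ParseText(source).GetRoot()
            .DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .Where(c => ListsBase(c, name))
            .Select(c => c.Identifier.ValueText)
            .ToList();
}
```

For a source text containing "class A : Base, Contracts.INotification { }", ClassesListing(source, "INotification") would include A even though the name is qualified. This is still a purely textual check: it cannot tell whether the name refers to an interface or a class.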

However, if you want to differentiate between base classes and interfaces or if you want a more semantically accurate solution (e.g., you want to ensure INotification is indeed an interface), you'd have to dive deeper into semantic analysis with Roslyn.

Here's an enhanced Roslyn example that checks if the base type is actually an interface named INotification:

using System;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.MSBuild;

public class Program
{
    public static async Task Main()
    {
        string folderPath = Path.GetFullPath(@"path_to_your_folder"); // replace with your folder path

        // Register an installed MSBuild before creating the workspace
        // (MSBuildLocator comes from the Microsoft.Build.Locator package).
        MSBuildLocator.RegisterDefaults();

        // Create a workspace and open your solution
        using var workspace = MSBuildWorkspace.Create();
        var solution = await workspace.OpenSolutionAsync(@"path_to_your_solution.sln");

        foreach (var project in solution.Projects)
        {
            var compilation = await project.GetCompilationAsync();
            if (compilation == null) continue;

            // A semantic model can only be requested for a tree that belongs to the
            // compilation, so walk the project's own trees rather than re-parsing files.
            foreach (var tree in compilation.SyntaxTrees)
            {
                if (!tree.FilePath.StartsWith(folderPath, StringComparison.OrdinalIgnoreCase)) continue;

                var root = tree.GetRoot() as CompilationUnitSyntax;
                var semanticModel = compilation.GetSemanticModel(tree);

                foreach (var member in root.Members)
                {
                    if (member is ClassDeclarationSyntax classDeclaration)
                    {
                        foreach (var baseType in classDeclaration.BaseList?.Types ?? Enumerable.Empty<BaseTypeSyntax>())
                        {
                            var typeSymbol = semanticModel.GetSymbolInfo(baseType.Type).Symbol as INamedTypeSymbol;

                            if (typeSymbol?.TypeKind == TypeKind.Interface && typeSymbol.Name == "INotification")
                            {
                                var className = classDeclaration.Identifier.ValueText;
                                var yaml = $@"messages:
  {className}:
    payload:
      type: string
      pattern: '^hello .+$'";

                                Console.WriteLine(yaml);
                            }
                        }
                    }
                }
            }
        }
    }
}

In this example, the semanticModel.GetSymbolInfo(baseType.Type).Symbol as INamedTypeSymbol line allows us to retrieve semantic information about the type in the base list. With that, we can check if it's an interface (typeSymbol?.TypeKind == TypeKind.Interface) and if its name is INotification. The syntax trees come from the compilation itself, because GetSemanticModel rejects a tree that was parsed separately and is not part of that compilation.

Note: This solution assumes you have a solution file (*.sln). If you only have a standalone project, you can adjust it by loading the project instead of the solution.
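As a rough sketch of the standalone-project case, the program below opens a single project with OpenProjectAsync and asks each class for its declared symbol, so interfaces inherited through a base class are found as well. The project path is a placeholder, the ProjectScan name is hypothetical, and the sketch assumes the Microsoft.Build.Locator package is referenced so that MSBuild can be registered first.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.MSBuild;

public static class ProjectScan
{
    public static async Task Main()
    {
        // Assumption: an MSBuild installation or .NET SDK is present on this machine.
        MSBuildLocator.RegisterDefaults();

        using var workspace = MSBuildWorkspace.Create();

        // Placeholder path: replace with your own project file.
        var project = await workspace.OpenProjectAsync(@"path_to_your_project.csproj");

        foreach (var document in project.Documents)
        {
            var root = await document.GetSyntaxRootAsync();
            var model = await document.GetSemanticModelAsync();
            if (root == null || model == null) continue;

            foreach (var cls in root.DescendantNodes().OfType<ClassDeclarationSyntax>())
            {
                // GetDeclaredSymbol gives the type symbol for this declaration;
                // AllInterfaces also covers interfaces implemented by base classes.
                var symbol = model.GetDeclaredSymbol(cls);
                if (symbol != null && symbol.AllInterfaces.Any(i => i.Name == "INotification"))
                {
                    Console.WriteLine(symbol.Name);
                }
            }
        }
    }
}
```

If INotification comes from a package that fails to restore, the symbol may resolve to an error type and the check will not match, so it can be worth inspecting workspace.Diagnostics when results look empty.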

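Lastly, for this more detailed analysis to work, you'd need additional NuGet packages. MSBuildWorkspace lives in the Microsoft.CodeAnalysis.MSBuild namespace, but the package that ships it is named Microsoft.CodeAnalysis.Workspaces.MSBuild. Microsoft.Build.Locator is commonly added alongside it so that an installed MSBuild can be registered at startup:

Microsoft.CodeAnalysis.Workspaces.MSBuild
Microsoft.Build.Locator

You can add them using:

dotnet add package Microsoft.CodeAnalysis.Workspaces.MSBuild
dotnet add package Microsoft.Build.Locator

The first package allows for the creation of the MSBuildWorkspace and the opening of solutions or projects.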

You: The check for 'if (member is ClassDeclarationSyntax classDeclaration)' is being run against members whose type is 'Microsoft.CodeAnalysis.CSharp.Syntax.FileScopedNamespaceDeclarationSyntax'. How would I get to the ClassDeclarationSyntax object?

ChatGPT: The FileScopedNamespaceDeclarationSyntax type corresponds to a feature introduced in C# 10, which allows for file-scoped namespace declarations. It provides a way to avoid an extra level of indentation in a file. The new declaration style looks like this:

namespace MyNamespace;

class MyClass
{
    //...
}

Instead of the traditional:

namespace MyNamespace
{
    class MyClass
    {
        //...
    }
}

Given this, the classes are now members of the FileScopedNamespaceDeclarationSyntax rather than of the compilation unit. You'll need to look for class declarations within this type to get to the ClassDeclarationSyntax objects.

Here's a way to modify the solution:

  1. Check if the member is a FileScopedNamespaceDeclarationSyntax.
  2. If it is, iterate over its members looking for ClassDeclarationSyntax.

foreach (var member in root.Members)
{
    // If we encounter a file-scoped namespace, we look inside it for classes.
    if (member is FileScopedNamespaceDeclarationSyntax fileScopedNamespace)
    {
        foreach (var nsMember in fileScopedNamespace.Members)
        {
            if (nsMember is ClassDeclarationSyntax namespacedClass)
            {
                ProcessClassDeclaration(namespacedClass, semanticModel);
            }
        }
    }
    else if (member is ClassDeclarationSyntax topLevelClass)
    {
        ProcessClassDeclaration(topLevelClass, semanticModel);
    }
}

// ...

// Encapsulate the class processing logic into its own method for clarity and reuse
void ProcessClassDeclaration(ClassDeclarationSyntax classDeclaration, SemanticModel semanticModel)
{
    foreach (var baseType in classDeclaration.BaseList?.Types ?? Enumerable.Empty<BaseTypeSyntax>())
    {
        var typeSymbol = semanticModel.GetSymbolInfo(baseType.Type).Symbol as INamedTypeSymbol;

        if (typeSymbol?.TypeKind == TypeKind.Interface && typeSymbol.Name == "INotification")
        {
            var className = classDeclaration.Identifier.ValueText;
            var yaml = $@"messages:
  {className}:
    payload:
      type: string
      pattern: '^hello .+$'";

            Console.WriteLine(yaml);
        }
    }
}

This modification will ensure that you're able to capture class declarations both inside and outside of file-scoped namespaces.
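If your code base mixes file-scoped namespaces, traditional block namespaces, and nested classes, walking root.Members level by level gets repetitive. A possible alternative, sketched here with a hypothetical ClassFinder helper, is to ask the syntax root for all descendant class declarations regardless of where they sit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class ClassFinder
{
    // Returns every class declaration in the text, in document order,
    // whether it sits at the top level, in any kind of namespace, or inside another class.
    public static List<ClassDeclarationSyntax> FindClasses(string source) =>
        CSharpSyntaxTree.ParseText(source).GetRoot()
            .DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .ToList();

    // Convenience wrapper that returns only the class names.
    public static List<string> ClassNames(string source) =>
        FindClasses(source).Select(c => c.Identifier.ValueText).ToList();
}
```

Each ClassDeclarationSyntax found this way can be passed to ProcessClassDeclaration as before, provided the semantic model was created for the same tree. Note that nested classes are included, which may or may not be what you want.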

You: Each of those classes being converted has get/set properties on it. Could the "type: string pattern: '^hello .+$'" be changed to simply '<name>: <type>'? For example, 'string Cake { get; set; }' would become 'Cake: string'.

ChatGPT: Sure! To achieve this, you'll need to dive deeper into the ClassDeclarationSyntax to get its properties (members of type PropertyDeclarationSyntax) and extract the information you need. Here's how you can modify the Roslyn code to get properties of the classes and format the output as you described:

void ProcessClassDeclaration(ClassDeclarationSyntax classDeclaration, SemanticModel semanticModel)
{
    // Check for INotification first
    foreach (var baseType in classDeclaration.BaseList?.Types ?? Enumerable.Empty<BaseTypeSyntax>())
    {
        var typeSymbol = semanticModel.GetSymbolInfo(baseType.Type).Symbol as INamedTypeSymbol;

        if (typeSymbol?.TypeKind == TypeKind.Interface && typeSymbol.Name == "INotification")
        {
            var className = classDeclaration.Identifier.ValueText;

            var properties = classDeclaration.Members.OfType<PropertyDeclarationSyntax>();

            string yamlProperties = "";
            foreach (var property in properties)
            {
                var propertyName = property.Identifier.ValueText;
                var propertyType = property.Type.ToString();

                // Six spaces place each entry one level below "payload:".
                yamlProperties += $"      {propertyName}: {propertyType}\n";
            }

            var yaml = $@"messages:
  {className}:
    payload:
{yamlProperties}";

            Console.WriteLine(yaml);
        }
    }
}

This modification inspects the members of the class, specifically looking for properties. It then constructs the YAML based on the property name and type as you specified. The type is emitted exactly as written in the source, so 'int Count { get; set; }' becomes 'Count: int'.
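The version above emits every property, including get-only and expression-bodied ones. If you only want properties that have both a getter and a setter, as in your 'string Cake { get; set; }' example, a filter along these lines might work. It is a sketch under that assumption, and the PropertyYaml class and its methods are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class PropertyYaml
{
    // True when the property declares an accessor of the given kind.
    // AccessorList is null for expression-bodied properties.
    private static bool HasAccessor(PropertyDeclarationSyntax property, SyntaxKind kind) =>
        property.AccessorList != null &&
        property.AccessorList.Accessors.Any(a => a.IsKind(kind));

    // Renders "      Name: Type" lines for properties with both get and set accessors.
    public static string Render(ClassDeclarationSyntax cls)
    {
        var sb = new StringBuilder();
        foreach (var property in cls.Members.OfType<PropertyDeclarationSyntax>())
        {
            if (!HasAccessor(property, SyntaxKind.GetAccessorDeclaration)) continue;
            if (!HasAccessor(property, SyntaxKind.SetAccessorDeclaration)) continue;

            sb.Append("      ")
              .Append(property.Identifier.ValueText)
              .Append(": ")
              .Append(property.Type.ToString())
              .Append('\n');
        }
        return sb.ToString();
    }

    // Helper for trying the renderer on a snippet containing a single class.
    public static string RenderFirstClass(string source) =>
        Render(CSharpSyntaxTree.ParseText(source).GetRoot()
            .DescendantNodes().OfType<ClassDeclarationSyntax>().First());
}
```

The result of Render can be dropped into the template in place of yamlProperties. Properties with an init accessor are skipped by this filter; add SyntaxKind.InitAccessorDeclaration to the check if you want them included.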

You: If I wished to extend that to include the methods on the class under a new collection of string values, how would I go about that?

ChatGPT: To include methods on the class, you'd look for members of type MethodDeclarationSyntax within the ClassDeclarationSyntax.

Here's how you can modify the previous example to also include methods:

void ProcessClassDeclaration(ClassDeclarationSyntax classDeclaration, SemanticModel semanticModel)
{
    // Check for INotification first
    foreach (var baseType in classDeclaration.BaseList?.Types ?? Enumerable.Empty<BaseTypeSyntax>())
    {
        var typeSymbol = semanticModel.GetSymbolInfo(baseType.Type).Symbol as INamedTypeSymbol;

        if (typeSymbol?.TypeKind == TypeKind.Interface && typeSymbol.Name == "INotification")
        {
            var className = classDeclaration.Identifier.ValueText;

            // Extracting properties
            var properties = classDeclaration.Members.OfType<PropertyDeclarationSyntax>();
            string yamlProperties = "";
            foreach (var property in properties)
            {
                var propertyName = property.Identifier.ValueText;
                var propertyType = property.Type.ToString();
                yamlProperties += $"      {propertyName}: {propertyType}\n";
            }

            // Extracting methods
            var methods = classDeclaration.Members.OfType<MethodDeclarationSyntax>();
            string yamlMethods = "";
            foreach (var method in methods)
            {
                var methodName = method.Identifier.ValueText;
                yamlMethods += $"      - {methodName}\n";
            }

            // The property and method lines already end with a newline.
            var yaml = $@"messages:
  {className}:
    payload:
{yamlProperties}    methods:
{yamlMethods}";

            Console.WriteLine(yaml);
        }
    }
}

Now, if you have a class that implements INotification with the following structure:

class MyClass : INotification
{
    public string MyProperty { get; set; }
    public void MyMethod() { }
}

The YAML output will be:

messages:
  MyClass:
    payload:
      MyProperty: string
    methods:
      - MyMethod

This code now captures both properties and methods of the class and formats them into the desired YAML structure.
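One thing to watch when running this over a whole folder: each matching class prints its own "messages:" root, so the combined output repeats that key. A possible way around it, sketched below, is to collect the extracted data first and render a single document at the end. The MessageInfo record and YamlWriter class are hypothetical names, and the sketch assumes C# 9 or later for records.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical container for what was extracted from one class.
public sealed record MessageInfo(
    string Name,
    IReadOnlyList<(string Name, string Type)> Properties,
    IReadOnlyList<string> Methods);

public static class YamlWriter
{
    // Renders all messages under a single "messages:" root key.
    public static string Render(IEnumerable<MessageInfo> messages)
    {
        var sb = new StringBuilder("messages:\n");
        foreach (var message in messages)
        {
            sb.Append("  ").Append(message.Name).Append(":\n");

            sb.Append("    payload:\n");
            foreach (var (name, type) in message.Properties)
            {
                sb.Append("      ").Append(name).Append(": ").Append(type).Append('\n');
            }

            sb.Append("    methods:\n");
            foreach (var method in message.Methods)
            {
                sb.Append("      - ").Append(method).Append('\n');
            }
        }
        return sb.ToString();
    }
}
```

Inside ProcessClassDeclaration you would add a MessageInfo to a shared list instead of calling Console.WriteLine, then pass the list to YamlWriter.Render once all files have been processed, for example with File.WriteAllText. A class with no properties or methods leaves the corresponding key empty, which YAML reads as null.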
