@swarn
Last active November 14, 2024 02:59
Using semantic highlighting in neovim

Semantic Highlighting in Neovim

What is Semantic Highlighting?

And how is it different from treesitter highlighting? Here's a small example:

[Image: treesitter and LSP highlights]

In C++, treesitter will highlight member variable declarations with @property and names in the parameter list as @parameter. But when they are used inside the function body, treesitter can't tell the difference between them, so they are all just blue @variable identifiers. Semantic highlighting uses an LSP (clangd, in this case) to show more accurate highlights.

Being able to tell the difference with a glance is useful:

[Image: a possible error]

You know immediately — without seeing any other code — that something strange is going on with z. Maybe it's just poorly named, or maybe it's shadowing another variable.

Semantic highlighting can do much more. Here's another C++ example that highlights functions and variables by scope:

[Image: highlighting scopes]

Seeing variable scope at a glance is so useful that many C++ projects use conventions like "prefix member variables with m_." But there isn't a universal convention, and even if there were, people would make mistakes. If you use semantic highlighting, you can simply assign a specific color to member variables.

Highlighting variables by scope is only one option! Instead, you could choose to highlight mutable variables, or async functions, or anything else that an LSP tells you about your code. You probably care about different properties for each language you write in.

Treesitter and semantic highlighting work great together! Treesitter is a fast, in-process parser. It understands the structure of your code, and it will always handle most of the highlighting. An LSP can add more — or more accurate — highlights for some parts of your code, but it is a slower, separate process.

Default Highlighting

Tokens to Highlights

An LSP server that supports semantic highlighting sends "tokens" to the LSP client. A token is data that describes a piece of text. Each token has a type, and zero or more modifiers.

For this C++ code:

//        Let's look at this token ↓
int function(int const p) { return p; }

The LSP tells us that p has a token with type parameter and two modifiers: readonly and functionScope. The default highlighting will apply five highlights to p:

  • @lsp.type.parameter.cpp
  • @lsp.mod.readonly.cpp
  • @lsp.mod.functionScope.cpp
  • @lsp.typemod.parameter.readonly.cpp
  • @lsp.typemod.parameter.functionScope.cpp

In general, it applies:

  • one @lsp.type.<type>.<ft> highlight for each token
  • one @lsp.mod.<mod>.<ft> highlight for each modifier of each token
  • one @lsp.typemod.<type>.<mod>.<ft> highlight for each modifier of each token
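The naming rule is mechanical enough to write down in a few lines of plain Lua. This is only an illustration: default_groups is a made-up helper, not part of neovim's API, and it takes the modifiers as a list so the output order is predictable.

```lua
-- Hypothetical helper (not a neovim function): list the default highlight
-- group names for one semantic token.
local function default_groups(token_type, modifiers, ft)
  -- One group for the token's type.
  local groups = { ("@lsp.type.%s.%s"):format(token_type, ft) }
  -- One group per modifier.
  for _, mod in ipairs(modifiers) do
    groups[#groups + 1] = ("@lsp.mod.%s.%s"):format(mod, ft)
  end
  -- One group per type + modifier combination.
  for _, mod in ipairs(modifiers) do
    groups[#groups + 1] = ("@lsp.typemod.%s.%s.%s"):format(token_type, mod, ft)
  end
  return groups
end

-- The `p` parameter from the C++ example above:
for _, name in ipairs(default_groups("parameter", { "readonly", "functionScope" }, "cpp")) do
  print(name)
end
```

Running it prints the same five group names listed for p above.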

You can use the :Inspect command to see what semantic highlights are being applied to your code.
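If you want the raw token data rather than the :Inspect summary, a small user command can print it. This is a sketch that assumes your neovim version provides vim.lsp.semantic_tokens.get_at_pos(); the command name ShowSemanticTokens is made up for this example.

```lua
-- Hypothetical command: print the type and modifiers of every semantic
-- token under the cursor.
vim.api.nvim_create_user_command("ShowSemanticTokens", function()
  -- With no arguments, get_at_pos() uses the current buffer and cursor.
  local tokens = vim.lsp.semantic_tokens.get_at_pos() or {}
  if #tokens == 0 then
    vim.notify("No semantic tokens under the cursor")
    return
  end
  for _, token in ipairs(tokens) do
    -- token.modifiers is a set, e.g. { readonly = true }.
    local mods = vim.tbl_keys(token.modifiers)
    table.sort(mods)
    vim.notify(("%s [%s]"):format(token.type, table.concat(mods, ", ")))
  end
end, {})
```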

Changing Highlights

Most popular colorschemes for neovim define colors for the @lsp.type.* highlights. Of course, you can change or add to the colors. To make parameters purple:

hi @lsp.type.parameter guifg=Purple

Or, with equivalent lua:

vim.api.nvim_set_hl(0, '@lsp.type.parameter', { fg='Purple' })

Just like treesitter highlights, if there is no specific-to-C++ @lsp.type.parameter.cpp group, it will fall back to the @lsp.type.parameter group.

Then, if you want everything which is read-only to be italic:

hi @lsp.mod.readonly gui=italic

If you only want parameters which are read-only to be italic:

hi @lsp.typemod.parameter.readonly gui=italic

Note that for a readonly parameter, the typemod.parameter.readonly highlight will have higher priority than the mod.readonly and type.parameter highlights.

To make sure your changes persist after changing colorschemes, wrap them in an autocommand that will reapply them after each colorscheme change:

vim.api.nvim_create_autocmd('ColorScheme', {
  callback = function ()
    vim.api.nvim_set_hl(0, '@lsp.type.parameter', { fg='Purple' })
    vim.api.nvim_set_hl(0, '@lsp.mod.readonly', { italic=true })
  end
})

Be careful to create the autocommand before calling :colorscheme in your init.
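The ordering might look like this in an init.lua. It's a minimal sketch: habamax is just a stand-in for whatever colorscheme you use, and it assumes a neovim version that bundles it.

```lua
-- 1. Register the autocommand first.
vim.api.nvim_create_autocmd('ColorScheme', {
  callback = function()
    vim.api.nvim_set_hl(0, '@lsp.type.parameter', { fg = 'Purple' })
  end,
})

-- 2. Only then load the colorscheme. This fires the ColorScheme event,
--    so the callback above runs right after the colorscheme is applied.
vim.cmd.colorscheme('habamax')
```

If the two steps are swapped, the callback won't run until the next time the colorscheme changes.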

The C++ scopes example above can be created with a handful of highlights:

hi @lsp.type.class      guifg=Aqua
hi @lsp.type.function   guifg=Yellow
hi @lsp.type.method     guifg=Green
hi @lsp.type.parameter  guifg=Purple
hi @lsp.type.variable   guifg=Blue
hi @lsp.type.property   guifg=Green

hi @lsp.typemod.function.classScope  guifg=Orange
hi @lsp.typemod.variable.classScope  guifg=Orange
hi @lsp.typemod.variable.fileScope   guifg=Orange
hi @lsp.typemod.variable.globalScope guifg=Red
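If you keep your config in lua, the same set of highlights could be written as a table and a loop. This is a straight translation of the commands above, with the same color names; the variable name scope_highlights is made up for this example.

```lua
-- Group name -> foreground color, mirroring the :hi commands above.
local scope_highlights = {
  ['@lsp.type.class']                     = 'Aqua',
  ['@lsp.type.function']                  = 'Yellow',
  ['@lsp.type.method']                    = 'Green',
  ['@lsp.type.parameter']                 = 'Purple',
  ['@lsp.type.variable']                  = 'Blue',
  ['@lsp.type.property']                  = 'Green',
  ['@lsp.typemod.function.classScope']    = 'Orange',
  ['@lsp.typemod.variable.classScope']    = 'Orange',
  ['@lsp.typemod.variable.fileScope']     = 'Orange',
  ['@lsp.typemod.variable.globalScope']   = 'Red',
}

for group, color in pairs(scope_highlights) do
  vim.api.nvim_set_hl(0, group, { fg = color })
end
```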

If your colorscheme doesn't define colors for these groups, then neovim will default to treesitter highlight colors for many of them (see :help hi-link). For example, @lsp.type.namespace is default-linked to @module.

Disable Semantic Highlighting

For a Specific LSP

To disable semantic highlighting for a particular server, you can remove semanticTokensProvider from the server_capabilities in the vim.lsp.ClientConfig. For example, to disable semantic highlighting for lua_ls:

require('lspconfig').lua_ls.setup {
  on_init = function(client, _)
    client.server_capabilities.semanticTokensProvider = nil
  end,
  on_attach = ...,
  settings = ...,
}

Read about it at :help vim.lsp.semantic_tokens.start()
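If you don't set up your servers through lspconfig, the same edit can be made from an LspAttach autocommand, which the help entry above also mentions. A minimal sketch, assuming the server registers itself under the name "lua_ls":

```lua
vim.api.nvim_create_autocmd("LspAttach", {
  callback = function(args)
    local client = vim.lsp.get_client_by_id(args.data.client_id)
    -- Only touch the one server we want to silence.
    if client and client.name == "lua_ls" then
      client.server_capabilities.semanticTokensProvider = nil
    end
  end,
})
```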

For All LSPs

As above, but for every LSP:

require ('lspconfig').util.default_config.on_init = function(client, _)
  client.server_capabilities.semanticTokensProvider = nil
end
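To turn semantic highlighting off on demand for the buffer you're in, rather than at startup, vim.lsp.semantic_tokens.stop() is another option. This is a sketch with a made-up command name; it assumes a neovim version that has both vim.lsp.get_clients() and vim.lsp.semantic_tokens.stop().

```lua
-- Hypothetical command: stop semantic highlighting in the current buffer
-- for every LSP client attached to it.
vim.api.nvim_create_user_command("SemanticTokensStop", function()
  local bufnr = vim.api.nvim_get_current_buf()
  for _, client in ipairs(vim.lsp.get_clients({ bufnr = bufnr })) do
    vim.lsp.semantic_tokens.stop(bufnr, client.id)
  end
end, {})
```

Unlike the capability edits above, this only affects one buffer, and only after the server has already attached.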

For a Specific Token

You can avoid highlighting particular tokens by clearing the associated semantic highlighting group. For example, maybe you don't like the semantic highlighting of functions in lua. Disable it with:

vim.api.nvim_set_hl(0, '@lsp.type.function.lua', {})

See Changing Highlights above for more information.
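The same trick extends to every semantic highlight group at once. A sketch along the lines of the example in neovim's own LSP help; note that it only clears the @lsp groups that are defined at the moment it runs.

```lua
-- Clear every currently defined highlight group whose name starts with "@lsp".
for _, group in ipairs(vim.fn.getcompletion("@lsp", "highlight")) do
  vim.api.nvim_set_hl(0, group, {})
end
```

The tokens are still sent and processed in this case; they just don't change any colors.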

Using LspTokenUpdate for Complex Highlighting

You can apply custom highlights based on semantic tokens using the LspTokenUpdate event. This event is triggered every time a visible token is updated. You can write code to inspect the token, then apply a highlight with the vim.lsp.semantic_tokens.highlight_token function. Here are a few examples:

Highlighting Based on More Than One Modifier

What if I want all global variables that aren't read-only to get a special highlight? I can check the modifiers for the semantic tokens and use whatever logic I want:

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = function(args)
    local token = args.data.token
    if
      token.type == "variable"
      and token.modifiers.globalScope
      and not token.modifiers.readonly
    then
      vim.lsp.semantic_tokens.highlight_token(
        token, args.buf, args.data.client_id, "MyMutableGlobalHL")
    end
  end,
})

vim.api.nvim_set_hl(0, 'MyMutableGlobalHL', { fg = 'red' })

By default, this highlight is higher priority than the standard LSP highlights.
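If you need to control the priority yourself, highlight_token accepts an options table with a priority field. The sketch below makes some assumptions: the MyDeprecatedHL group is made up, the server is assumed to send the deprecated modifier, and the base priority is read from vim.hl or vim.highlight depending on which one your neovim version provides.

```lua
-- vim.highlight was renamed to vim.hl in newer versions; use whichever exists.
local priorities = (vim.hl or vim.highlight).priorities

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = function(args)
    local token = args.data.token
    if token.modifiers.deprecated then
      vim.lsp.semantic_tokens.highlight_token(
        token, args.buf, args.data.client_id, "MyDeprecatedHL",
        -- Well above the standard semantic highlights.
        { priority = priorities.semantic_tokens + 10 })
    end
  end,
})

vim.api.nvim_set_hl(0, "MyDeprecatedHL", { strikethrough = true })
```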

Dealing with Ambiguity

Imagine I have these highlights:

hi @lsp.typemod.variable.globalScope     guifg=Red
hi @lsp.typemod.variable.defaultLibrary  guifg=Green

And I have the following C++:

std::cout << "Hello";

The semantic highlights applied to cout will be:

  • @lsp.type.variable.cpp, priority: 125
  • @lsp.mod.defaultLibrary.cpp, priority: 126
  • @lsp.mod.globalScope.cpp, priority: 126
  • @lsp.typemod.variable.defaultLibrary.cpp, priority: 127
  • @lsp.typemod.variable.globalScope.cpp, priority: 127

There are two different highlights (the last two) with the same priority and different colors. Because of that, there's no way to tell whether cout will be red or green.

One way to fix that is to make sure you use composable highlights. If globalScope is red and defaultLibrary is underlined, then cout will be both red and underlined.
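In lua, a composable pair might look like this. It's a minimal sketch using the two groups from this example: one sets only a color and the other sets only an attribute, so the order in which they are applied doesn't matter.

```lua
-- Color only: doesn't touch underline.
vim.api.nvim_set_hl(0, "@lsp.typemod.variable.globalScope", { fg = "Red" })
-- Attribute only: doesn't touch the foreground color.
vim.api.nvim_set_hl(0, "@lsp.typemod.variable.defaultLibrary", { underline = true })
```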

Another alternative is to use a callback to apply the highlight you want at a higher priority:

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = function(args)
    local token = args.data.token
    if token.type == "variable" and token.modifiers.defaultLibrary then
      vim.lsp.semantic_tokens.highlight_token(
        token, args.buf, args.data.client_id, "@lsp.mod.defaultLibrary")
    end
  end,
})

Complex Highlighting

You can write highlighting logic that uses more than just the token type and modifiers. Here's an example that highlights variable names written in ALL_CAPS that aren't constant:

local function show_unconst_caps(args)
  local token = args.data.token
  if token.type ~= "variable" or token.modifiers.readonly then return end

  local text = vim.api.nvim_buf_get_text(
    args.buf, token.line, token.start_col, token.line, token.end_col, {})[1]
  if text ~= string.upper(text) then return end

  vim.lsp.semantic_tokens.highlight_token(
    token, args.buf, args.data.client_id, "Error")
end

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = show_unconst_caps,
})

Controlling When Highlights are Applied

The previous example, which highlighted mutable variables, only makes sense for languages that have some way of marking variables as readonly, like const in C++ and TypeScript. In languages like Lua or Python, where there is no readonly, that highlight won't work correctly.

Thankfully, there are many ways to control how the highlights are applied:

  • :h autocmd-pattern explains how you can filter autocommands based on file name:

    vim.api.nvim_create_autocmd("LspTokenUpdate", {
      pattern = {"*.cpp", "*.hpp"},
      callback = show_unconst_caps,
    })
  • :h LspTokenUpdate tells you that the client_id is in the args, so you can just return early if it's not an LSP server you want to highlight:

    local function show_unconst_caps(args)
      local client = vim.lsp.get_client_by_id(args.data.client_id)
      if client.name ~= "clangd" then return end
    
      local token = args.data.token
      -- etc
    end
  • You can create buffer-local autocommands (:h autocmd-buflocal) whenever an LSP client attaches to a buffer:

    require('lspconfig').clangd.setup {
      on_attach = function(client, buffer)
        vim.api.nvim_create_autocmd("LspTokenUpdate", {
          buffer = buffer,
          callback = show_unconst_caps,
        })
    
        -- other on_attach logic
      end
    }
    
  • You can also create buffer-local autocommands inside an :h LspAttach event callback:

    vim.api.nvim_create_autocmd("LspAttach", {
      callback = function(args)
        local client = vim.lsp.get_client_by_id(args.data.client_id)
        if client.name ~= "clangd" then return end
    
        vim.api.nvim_create_autocmd("LspTokenUpdate", {
          buffer = args.buf,
          callback = show_unconst_caps,
        })
      end
    })
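Yet another option is to check the buffer's filetype inside the callback. Here's a sketch of that idea: only_for_filetypes and MyMutableHL are made-up names, and the inline callback is a simplified stand-in for whatever highlighting logic you actually want.

```lua
-- Hypothetical helper: wrap an LspTokenUpdate callback so it only runs
-- in buffers with one of the listed filetypes.
local function only_for_filetypes(filetypes, callback)
  local allowed = {}
  for _, ft in ipairs(filetypes) do
    allowed[ft] = true
  end
  return function(args)
    if allowed[vim.bo[args.buf].filetype] then
      -- Call without returning the result: a truthy return value would
      -- delete the autocommand.
      callback(args)
    end
  end
end

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = only_for_filetypes({ "c", "cpp" }, function(args)
    local token = args.data.token
    if token.type == "variable" and not token.modifiers.readonly then
      vim.lsp.semantic_tokens.highlight_token(
        token, args.buf, args.data.client_id, "MyMutableHL")
    end
  end),
})

vim.api.nvim_set_hl(0, "MyMutableHL", { underline = true })
```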
@telemachus

Someone in the neovim channel on matrix suggested :help vim.lsp.semantic_tokens.stop(). That looks worth a look, but it's also per client (and maybe even per buffer).

@swarn
Author

swarn commented Sep 26, 2024

Yeah, stop per-buffer would also work.

I think your previous comment may point to a better solution, both for the "disable all servers" problem, and possibly similar for others. I don't know that it makes a huge difference, but changing the client capabilities seems more reasonable than editing the server capabilities.

I need to test, but with lspconfig, I think you could use the capabilities key for per-server setup (rather than the on_init approach), and lspconfig.util.default_config for the overall case, providing an edited capabilities table in both cases.

@swarn
Author

swarn commented Sep 26, 2024

I got nowhere playing with client capabilities, but this seems to work for globally disabling semantic highlights:

require ('lspconfig').util.default_config.on_init = function(client, _)
  client.server_capabilities.semanticTokensProvider = nil
end

@telemachus

telemachus commented Sep 26, 2024

> I think your previous comment may point to a better solution, both for the "disable all servers" problem, and possibly similar for others. I don't know that it makes a huge difference, but changing the client capabilities seems more reasonable than editing the server capabilities.

Like you, I don't know whether (or how much) it matters whether you edit the capabilities of the client or the server. But I think I get why it seems cleaner to edit the client capabilities if you are on the client side. Either way, I'm assuming that if one or the other lacks a capability that they are both smart enough to stop wasting cycles sending (or looking for) messages about that capability. I hope eventually to understand LSP well enough to know more about this.

> I need to test, but with lspconfig, I think you could use the capabilities key for per-server setup (rather than the on_init approach)

Neat idea. The docs for nvim-lspconfig talk only about adding capabilities, but that doesn't mean it couldn't work.

  • {capabilities} table <string, string|table|bool|function>

a table which represents the neovim client capabilities. Useful for
broadcasting to the server additional functionality (snippets, off-protocol
features) provided by plugins.

You can probably use a function for the capabilities field and have that function look for the semanticTokens item, set it to nil, and return all the other fields as is. (I may try that myself.)

> and lspconfig.util.default_config for the overall case, providing an edited capabilities table in both cases.

That also may work, though I'm not sure whether I would want to monkeypatch that table.

By the way, thanks in the first place for writing this all up.

@telemachus

telemachus commented Sep 26, 2024

> I got nowhere playing with client capabilities, but this seems to work for globally disabling semantic highlights:
>
> require ('lspconfig').util.default_config.on_init = function(client, _)
>   client.server_capabilities.semanticTokensProvider = nil
> end

That's great: nicely simple and direct. (Sorry, I was writing my (overly long) reply while you posted this.)

This is a dumb question, but how do you confirm that a given method works? (Do you just load an LSP buffer and then check the client capabilities? Or do you just check what the highlights look like?)

@swarn
Author

swarn commented Sep 26, 2024

I'm using the dumbest method: load an LSP buffer, see if it's adding semantic highlights. I'm not checking, for example, to see if the LSP is still sending tokens that are simply being ignored.

@telemachus

telemachus commented Sep 26, 2024

> I'm using the dumbest method: load an LSP buffer, see if it's adding semantic highlights. I'm not checking, for example, to see if the LSP is still sending tokens that are simply being ignored.

Fair enough. Ultimately, I'd like to make sure that I know how to turn off (parts of) an LSP and make sure that the LSP stops doing that work. I'll probably try different methods with expanded logging this weekend.

@swarn
Author

swarn commented Sep 26, 2024

That sounds good. I appreciate you digging into it!

@telemachus

telemachus commented Sep 28, 2024

(I used vim.lsp.log.set_level(vim.log.levels.DEBUG) for logging. This comment is probably too detailed, but I got curious.)

@swarn: your two methods, which tweak server capabilities, both seem to work best and pretty much equally well. In both cases, semanticTokens shows up (a few times) in initial "announcements" from the server or client, but then there doesn't seem to be any more messaging about this feature in either direction (no matter how long you edit). (The number of initial references is not identical, but the totals are never more than 5. If anyone really cares, the default_config.on_init method seems to have fewer initial references. Maybe that is because setting the provider to nil in configuration happens at an earlier stage in things than when the LSP attaches?)

require ('lspconfig').util.default_config.on_init = function(client, _)
  client.server_capabilities.semanticTokensProvider = nil
end

vim.api.nvim_create_autocmd("LspAttach", {
  callback = function(args)
    local client = vim.lsp.get_client_by_id(args.data.client_id)
    if not client then return end
    client.server_capabilities.semanticTokensProvider = nil
  end,
})

The method I suggested above, which tweaks client capabilities, does not work. The client capabilities can affect the overall results (see below), but it does not seem to work to stop semantic highlighting. That seems counter-intuitive to me, but I didn't pursue it further since other methods are simple and effective.

For individual servers, the on_init method works just as well as the default_config.on_init method does. (No surprise here; just reporting for the sake of completeness.)

Finally, in an effort to reduce initial announcements about semantic tokens to an absolute minimum, I came up with this.

-- Tweak the server capabilities. This is the most important step.
require ('lspconfig').util.default_config.on_init = function(client, _)
  client.server_capabilities.semanticTokensProvider = nil
end

-- Tweak the client capabilities. This does not seem to make a *significant* difference, but it
-- will prevent a small number of extra initial announcements about semantic tokens.
local original_make_client_capabilities = vim.lsp.protocol.make_client_capabilities
function vim.lsp.protocol.make_client_capabilities()
    local caps = original_make_client_capabilities()
    if caps.workspace then
        caps.workspace.semanticTokens = nil
    end
    if caps.textDocument then
        caps.textDocument.semanticTokens = nil
    end

    return caps
end

Using those methods together results in only a single initial announcement about semantic tokens. Without tweaking the client capabilities, you get four or five. (Yes, results varied. Again, I didn't pursue why.) I cannot imagine that the few extra announcements affect anything, so the extra code is probably not worth it. But I got curious, so I kept trying different combinations.

@swarn
Author

swarn commented Oct 9, 2024

@telemachus I meant to respond to this earlier and lost track!

Thanks for really digging in. I'll add the lspconfig default_config note up above; I'm happy to know that it's (after the handshake) working as hoped.

@telemachus

No worries. I basically nerd-sniped myself, but I'm glad if it's useful.
