Using semantic highlighting in neovim

What is Semantic Highlighting?
Default Highlighting
Complex Highlighting
- Controlling When Highlights are Applied

What is Semantic Highlighting?

And, how is it different than treesitter highlighting? Here's a small example:

In C++, treesitter will highlight member variable declarations with @property and names in the parameter list as @parameter. But when they are used inside the function body, treesitter can't tell the difference between them, so they are all just blue @variable identifiers. Semantic highlighting uses an LSP (clangd, in this case) to show more accurate highlights.

Being able to tell the difference with a glance is useful:

You know immediately — without seeing any other code — that something strange is going on with z. Maybe it's just poorly named, or maybe it's shadowing another variable.

Semantic highlighting can do much more. Here's another C++ example that highlights functions and variables by scope:

Seeing variable scope at a glance is so useful, many C++ projects use conventions like "prefix member variables with m_." But there isn't a universal convention, and even if there was, people would make mistakes. If you use semantic highlighting, you can simply assign a specific color to member variables.

Highlighting variables by scope is only one option! Instead, you could choose to highlight mutable variables, or async functions, or anything else that an LSP tells you about your code. You probably care about different properties for each language you write in.

Treesitter and semantic highlighting work great together! Treesitter is a fast, in-process parser. It understands the structure of your code, and it will always handle most of the highlighting. An LSP can add more — or more accurate — highlights for some parts of your code, but it is a slower, separate process.

Default Highlighting

Tokens to Highlights

An LSP server that supports semantic highlighting sends "tokens" to the LSP client. A token is data that describes a piece of text. Each token has a type, and zero or more modifiers.

For this C++ code:

//        Let's look at this token ↓
int function(int const p) { return p; }

The LSP tells us that p has a token with type parameter and two modifiers: readonly and functionScope. The default highlighting will apply five highlights to p:

@lsp.type.parameter.cpp
@lsp.mod.readonly.cpp
@lsp.mod.functionScope.cpp
@lsp.typemod.parameter.readonly.cpp
@lsp.typemod.parameter.functionScope.cpp

In general, it applies:

@lsp.type.<type>.<ft> highlight for each token
@lsp.mod.<mod>.<ft> highlight for each modifier of each token
@lsp.typemod.<type>.<mod>.<ft> highlights for each modifier of each token

You can use the :Inspect command to see what semantic highlights are being applied to your code.

Changing Highlights

Most of these highlight groups will be undefined, so they won't change the appearance of your code. To make parameters purple:

hi @lsp.type.parameter guifg=Purple

Or, with equivalent lua:

vim.api.nvim_set_hl(0, '@lsp.type.parameter', { fg='Purple' })

Just like treesitter highlights, if there is no specific-to-C++ @lsp.type.parameter.cpp group, it will fall back to the @lsp.type.parameter group.

Then, if you want everything which is read-only to be italic:

hi @lsp.mod.readonly gui=italic

If you only want parameters which are read-only to be italic:

hi @lsp.typemod.parameter.readonly gui=italic

To make sure your changes persist after changing colorschemes, wrap them in an autocommand that will reapply them after each colorscheme change:

vim.api.nvim_create_autocmd('ColorScheme', {
  callback = function ()
    vim.api.nvim_set_hl(0, '@lsp.type.parameter', { fg='Purple' })
    vim.api.nvim_set_hl(0, '@lsp.mod.readonly', { italic=true })
  end
})

Be careful to create the autocommand before calling :colorscheme in your init.

The C++ scopes example above can be created with a handful of highlights:

hi @lsp.type.class      guifg=Aqua
hi @lsp.type.function   guifg=Yellow
hi @lsp.type.method     guifg=Green
hi @lsp.type.parameter  guifg=Purple
hi @lsp.type.variable   guifg=Blue
hi @lsp.type.property   guifg=Green

hi @lsp.typemod.function.classScope  guifg=Orange
hi @lsp.typemod.variable.classScope  guifg=Orange
hi @lsp.typemod.variable.fileScope   guifg=Orange
hi @lsp.typemod.variable.globalScope guifg=Red

You probably want to use nicer colors than these!

If your colorscheme doesn't define @lsp.* groups yet, but it does define treesitter highlights, you might find it useful to link the semantic groups to the treesitter groups to get consistent colors:

local links = {
  ['@lsp.type.namespace'] = '@namespace',
  ['@lsp.type.type'] = '@type',
  ['@lsp.type.class'] = '@type',
  ['@lsp.type.enum'] = '@type',
  ['@lsp.type.interface'] = '@type',
  ['@lsp.type.struct'] = '@structure',
  ['@lsp.type.parameter'] = '@parameter',
  ['@lsp.type.variable'] = '@variable',
  ['@lsp.type.property'] = '@property',
  ['@lsp.type.enumMember'] = '@constant',
  ['@lsp.type.function'] = '@function',
  ['@lsp.type.method'] = '@method',
  ['@lsp.type.macro'] = '@macro',
  ['@lsp.type.decorator'] = '@function',
}
for newgroup, oldgroup in pairs(links) do
  vim.api.nvim_set_hl(0, newgroup, { link = oldgroup, default = true })
end

Disabling Highlights

You can disable semantic highlighting by clearing the semantic highlighting groups.

For example, maybe you don't like the semantic highlighting of functions in lua. Disable it with:

vim.api.nvim_set_hl(0, '@lsp.type.function.lua', {})

Or, you can disable all semantic highlights by clearing all the groups:

for _, group in ipairs(vim.fn.getcompletion("@lsp", "highlight")) do
  vim.api.nvim_set_hl(0, group, {})
end

Complex Highlighting

You can do quite a lot with just the default highlights! But if you want more, you can do more complex highlighting by writing an autocommand for the new LspTokenUpdate event. This event is triggered every time a visible token is updated. You can write code to inspect the token, then apply a highlight with the new vim.lsp.semantic_tokens.highlight_token function.

You can apply highlights based on more than one modifier:

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = function(args)
    local token = args.data.token
    if
      token.type == "variable"
      and token.modifiers.globalScope
      and not token.modifiers.readonly
    then
      vim.lsp.semantic_tokens.highlight_token(
        token, args.buf, args.data.client_id, "MyMutableGlobalHL")
    end
  end,
})

vim.api.nvim_set_hl(0, 'MyMutableGlobalHL', { fg = 'red' })

Note that this example uses the globalScope modifier, which is specific to clangd.

You can write highlighting logic that uses more than just the token type and modifiers. Here's an example that highlights variable names written in ALL_CAPS that aren't constant:

local function show_unconst_caps(args)
  local token = args.data.token
  if token.type ~= "variable" or token.modifiers.readonly then return end

  local text = vim.api.nvim_buf_get_text(
    args.buf, token.line, token.start_col, token.line, token.end_col, {})[1]
  if text ~= string.upper(text) then return end

  vim.lsp.semantic_tokens.highlight_token(
    token, args.buf, args.data.client_id, "Error")
end

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  callback = show_unconst_caps,
})

Controlling When Highlights are Applied

The previous example, which highlighted mutable variables, only makes sense for languages that have some way of marking variables as readonly, like const in C++ and Typescript. In languages like Lua or Python, where there is no readonly, that highlight won't work correctly.

Thankfully, there are many ways to control how the highlights are applied:

:h autocmd-pattern explains how you can filter autocommands based on file name:

vim.api.nvim_create_autocmd("LspTokenUpdate", {
  pattern = {"*.cpp", "*.hpp"},
  callback = show_unconst_caps,
})

:h LspTokenUpdate tells you that the client_id is in the args, so you can just return early if it's not an LSP server you want to highlight:

local function show_unconst_caps(args)
  local client = vim.lsp.get_client_by_id(args.data.client_id)
  if client.name ~= "clangd" then return end

  local token = args.data.token
  -- etc
end

You can create buffer-local autocommands (:h autocmd-buflocal) whenever an LSP client attaches to a buffer:

require('lspconfig').clangd.setup {
  on_attach = function(client, buffer)
    vim.api.nvim_create_autocmd("LspTokenUpdate", {
      buffer = buffer,
      callback = show_unconst_caps,
    })

    -- other on_attach logic
  end
}

You can also create buffer-local autocommands inside an :h LspAttach event callback:

vim.api.nvim_create_autocmd("LspAttach", {
  callback = function(args)
    local client = vim.lsp.get_client_by_id(args.data.client_id)
    if client.name ~= "clangd" then return end

    vim.api.nvim_create_autocmd("LspTokenUpdate", {
      buffer = args.buf,
      callback = show_unconst_caps,
    })
  end
})

rockyzhang24/semantic_hl.md