@kamleshchandnani
Last active April 25, 2023 15:50
Understanding "reqrep" in HAProxy config

Let's assume we have the following line in our HAProxy config file:
reqrep ^([^\ :]*)\ /api/v1/api-name/(.*) \1\ /staging/path-name/\2
Here is our sample URL:
https://example.com/api/v1/api-name/

The goal here is to rewrite /api/v1/api-name/ to /staging/path-name/ while leaving everything else unchanged.
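
To make the goal concrete, here is a minimal Python sketch of the intended rewrite. It is an approximation only: it uses Python's re module rather than HAProxy itself, the request lines are hypothetical, and the replacement uses plain spaces because Python replacement templates do not take the "\ " escape used in HAProxy config.

```python
import re

# Pattern copied from the reqrep line; Python's re also reads "\ " as a literal space.
PATTERN = re.compile(r"^([^\ :]*)\ /api/v1/api-name/(.*)")
# Same replacement as in the config, written with plain spaces for Python.
REPLACEMENT = r"\1 /staging/path-name/\2"

# Hypothetical request lines, assumed for illustration only.
request_line = "GET /api/v1/api-name/ HTTP/1.1"
other_line = "GET /some/other/path HTTP/1.1"

rewritten = PATTERN.sub(REPLACEMENT, request_line)
untouched = PATTERN.sub(REPLACEMENT, other_line)

print(rewritten)  # the /api/v1/api-name/ prefix is replaced
print(untouched)  # no match, so the line is returned unchanged
```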

Breaking the regex down and understanding it in parts:
There are basically 3 parts in the regex:

  1. ^([^\ :]*) -- Match a single character not present in the list below [^\ :]* i.e blank space and colon :
  2. /api/v1/api-name/ -- Match the whole path string as it is including forward slashes
  3. (.*) -- Matches anything after the 2nd group, basically for capturing query params.

Coming to our example, when the request reaches HAProxy we'll have it in the following form:

  1. Host: example.com
  2. Path: /api/v1/api-name/
  3. Query params if any

Now let's analyse it with our regex part by part:

  1. ^([^\ :]*) -- Match a single character not present in the list below [^\ :]*.
     • Here it matches our host example.com
  2. /api/v1/api-name/ -- Match the whole path string as it is including forward slashes.
     • Here it matches our path exactly
  3. (.*) -- Matches anything after the 2nd group, basically for capturing query params.
     • We don't have query params so nothing happens here

Let's understand how the replacement works. This is our replacement string: \1\ /staging/path-name/\2. Let's break it down and understand what each part means:

  1. \1 -- Keep part 1 from our regex match as it is when rewriting.
     • Here example.com
  2. /staging/path-name/ -- Rewrite part 2 from the regex match to /staging/path-name/.
     • Here replace /api/v1/api-name/ with /staging/path-name/
  3. \2 -- Keep part 3 of the regex match as it is. It is part 3 in our description, but the regex has only two capture groups (parts 1 and 3). Part 2 is a literal match without parentheses, so part 3 is referenced as \2 here.
     • Here nothing since we don't have query params.

wdoekes commented Oct 21, 2020

Hi there.

Let's say you have an HTTP/1.1 request:

GET /api/v1/api-name/something/?query_string=yes HTTP/1.1
Host: example.com
X-Path: /api/v1/api-name/foo

Then:

reqrep ^([^\ :]*)\ /api/v1/api-name/(.*) \1\ /staging/path-name/\2

in fact matches:

GET /api/v1/api-name/something/?query_string=yes HTTP/1.1

Where the matches are:

  • \1 = GET
  • \2 = something/?query_string=yes HTTP/1.1

(There is no \3, as .* ate it too.)

There is no example.com in that \1!
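
A quick way to check these capture groups is to run the same pattern over each line of the request. This is a rough Python sketch, not HAProxy itself; it assumes Python's re module handles this particular pattern the same way as HAProxy's regex engine, which holds for the simple constructs used here.

```python
import re

# Pattern copied from the reqrep line; "\ " is a literal space in Python's re too.
PATTERN = re.compile(r"^([^\ :]*)\ /api/v1/api-name/(.*)")

# The request from the example above, one entry per line.
request_lines = [
    "GET /api/v1/api-name/something/?query_string=yes HTTP/1.1",
    "Host: example.com",
    "X-Path: /api/v1/api-name/foo",
]

# Map each line to its capture groups, or None when the pattern does not match.
results = {}
for line in request_lines:
    m = PATTERN.match(line)
    results[line] = m.groups() if m else None
    print(repr(line), "->", results[line])
```

Only the request line matches. Both header lines fail because a colon follows the header name before any space appears.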

And generally, you'd see this:

reqrep ^([^\ :]*)\ /api/v1/api-name/([^ ]*)\ (.*) \1\ /staging/path-name/\2\ \3

In which case you would match the HTTP/1.1 in the \3.
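
The three-group variant can be sketched the same way. Again this is Python standing in for HAProxy, and the replacement is written with plain spaces because Python replacement templates do not use the "\ " escape.

```python
import re

# Three capture groups: method, rest of the URI, and the HTTP version.
PATTERN = re.compile(r"^([^\ :]*)\ /api/v1/api-name/([^ ]*)\ (.*)")
# Same replacement as the reqrep line, with "\ " written as a plain space.
REPLACEMENT = r"\1 /staging/path-name/\2 \3"

line = "GET /api/v1/api-name/something/?query_string=yes HTTP/1.1"

groups = PATTERN.match(line).groups()
rewritten = PATTERN.sub(REPLACEMENT, line)

print(groups)
print(rewritten)
```

Here \2 stops at the first space, so the HTTP version lands in its own group instead of being swallowed by the URI group.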

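The colon (:) in the first group's character class ensures you're only matching the request line, and not any headers: a header name is always followed directly by a colon, so the group can never reach the space that the pattern requires next.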

If you were to omit the colon, like so:

reqrep ^([^\ ]*)\ /api/v1/api-name/(.*) \1\ /staging/path-name/\2

then it would also match:

X-Path: /api/v1/api-name/foo
  • \1 = X-Path:
  • \2 = foo
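
The effect of dropping the colon can be checked with a small Python sketch. It is an illustration using Python's re module, not HAProxy, and the replacement uses plain spaces in place of "\ ".

```python
import re

WITH_COLON = re.compile(r"^([^\ :]*)\ /api/v1/api-name/(.*)")
WITHOUT_COLON = re.compile(r"^([^\ ]*)\ /api/v1/api-name/(.*)")
REPLACEMENT = r"\1 /staging/path-name/\2"

header = "X-Path: /api/v1/api-name/foo"

# With the colon excluded from group 1, the header line cannot match.
safe_match = WITH_COLON.match(header)

# Without it, group 1 swallows "X-Path:" and the header value gets rewritten.
loose_groups = WITHOUT_COLON.match(header).groups()
loose_rewrite = WITHOUT_COLON.sub(REPLACEMENT, header)

print(safe_match)
print(loose_groups)
print(loose_rewrite)
```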


Commod0re commented Oct 25, 2021

expression analysis is not correct

wdoekes explained the capture groups pretty clearly but I wanted to point this out too

^([^\ :]*) does not match a single character, it matches 0 or more characters which are neither a space nor a colon (:), starting from the beginning of the string
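
The difference between the starred class and a single-character class shows up in a short Python sketch. The input strings are made up for illustration, and Python's re module is used as a stand-in for HAProxy's regex engine.

```python
import re

ZERO_OR_MORE = re.compile(r"^([^\ :]*)")  # the group as written in the gist
SINGLE_CHAR = re.compile(r"^([^\ :])")    # what "a single character" would look like

def first_group(pattern, text):
    """Return group 1 of a match at the start of text, or None if there is no match."""
    m = pattern.match(text)
    return m.group(1) if m else None

# Hypothetical inputs, chosen to show the three interesting cases.
samples = ["GET /x", "Host: example.com", " leading space"]
starred = [first_group(ZERO_OR_MORE, s) for s in samples]
single = [first_group(SINGLE_CHAR, s) for s in samples]

print(starred)
print(single)
```

The starred form takes the whole run up to the first space or colon, and it still matches (with an empty group) when the string starts with a space.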
