@JeremyTBradshaw
Last active December 19, 2024 11:50
The Power of the Guid for Even Distribution of Large Sets
# Tackle large sets of objects with a Guid property which can be filtered as a string (e.g., ExternalDirectoryObjectId in Exchange Online).
<#
Guids are typically (near) perfectly distributed alphabetically, working from the leftmost character to the rightmost character.
This means that if you have a large set of objects with a Guid property, you can filter them by the first character(s) of the Guid
to split the entire set of objects into groups of nearly equal size.
**NOTE: This assumes all Guids in the set are generated from a common source (e.g., Active Directory/Entra ID, Exchange/EXO, etc.),
and that the source generates them randomly.
For example, if we pick only the 1st character of the Guid, we get 16 groups (0-9, A-F) of objects. Each of the 16 groups
will contain nearly the same number of objects. Check it out for yourself:
- Get-Mailbox -Filter "ExternalDirectoryObjectId -like '0*'" | Measure-Object
- Get-Mailbox -Filter "ExternalDirectoryObjectId -like '1*'" | Measure-Object
- ...
- Get-Mailbox -Filter "ExternalDirectoryObjectId -like 'F*'" | Measure-Object
- The Count values returned by the commands above should be very close to each other.
If we instead pick the first 2 characters of the Guid, we get 256 groups (00-FF) of objects, i.e.:
- Get-Mailbox -Filter "ExternalDirectoryObjectId -like '00*'" | Measure-Object
- ...
- Get-Mailbox -Filter "ExternalDirectoryObjectId -like 'FF*'" | Measure-Object
- Wow!! Amazing, right? Again, the Count values returned by the commands above should be very close to each other.
If we pick the first 3 characters of the Guid, we can get 4096 groups (000-FFF) of objects. And so on... so we can have this many groups:
- 1 character: 16 groups
- 2 characters: 256 groups
- 3 characters: 4096 groups
- 4 characters: 65536 groups
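- There are 32 hex characters in a Guid (not counting the 4 hyphens in its string form), so the maximum number of groups we can have is 16^32 = 3.40282366920938E+38 groups (highly unlikely to ever need this many groups!)
- Since the filter matches the Guid as a string, a prefix longer than 8 characters would presumably need to include the hyphen (e.g., '0123ABCD-4*').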
Now, what to do with this magical knowledge? Well, you can use it to distribute workloads across multiple days of the week, for example.
The following code does just that: it spreads the 256 groups of objects across the 7 days of the week. We use modular arithmetic to
give every day the same base number of groups, and then wrap the remaining groups around the days until all groups are assigned.
#>
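# A quick way to sanity-check the distribution claim without touching Exchange: a minimal sketch that
# generates random Guids locally with [guid]::NewGuid() and counts them by their first character.
# Assumption: locally generated Guids only illustrate the idea. Whether a given directory property is
# distributed the same way should still be confirmed with the Get-Mailbox counts described above.
# $sampleSize and $sampleCounts are illustrative names and are not used by the rest of this script.
$sampleSize = 16000
$sampleCounts = 1..$sampleSize |
    ForEach-Object { [guid]::NewGuid().ToString().Substring(0, 1).ToUpper() } |
    Group-Object |
    Sort-Object -Property Name |
    Select-Object -Property Name, Count
# Each of the 16 rows should show a Count close to $sampleSize / 16 (about 1000 here).
$sampleCounts | Format-Table -AutoSize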
# First, create a list of all 256 possible hex values (00-FF), when we filter by the first 2 characters of a Guid, for example:
$groups = @()
$hexChars = '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
foreach ($hxChar1 in $hexChars) {
    # ^^ First character of the Guid.
    foreach ($hxChar2 in $hexChars) {
        # ^^ Second character of the Guid... and so on (e.g., add more foreach loops for more characters in the Guid).
        $groups += "$hxChar1$hxChar2"
    }
}
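# As an alternative to adding one nested foreach loop per character, the same prefixes can be sketched
# by formatting integers as zero-padded hex. This is only an illustration: $prefixLength, $prefixCount and
# $groupsAlt are hypothetical names, and the rest of this script keeps using $groups from above.
# Assumption: the prefix stays within the first 8 characters of the Guid, i.e., before the first hyphen.
$prefixLength = 2
$prefixCount = [int][math]::Pow(16, $prefixLength)
$groupsAlt = foreach ($n in 0..($prefixCount - 1)) {
    # 'X2' formats an integer as 2 uppercase hex digits, e.g., 10 -> '0A'.
    $n.ToString("X$prefixLength")
}
# With $prefixLength = 2, $groupsAlt holds the same 256 values as $groups ('00' through 'FF').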
# Next, create a hashtable with keys for the 7 days of the week and empty arrays as values:
$days = [ordered]@{
    'Sunday'    = @()
    'Monday'    = @()
    'Tuesday'   = @()
    'Wednesday' = @()
    'Thursday'  = @()
    'Friday'    = @()
    'Saturday'  = @()
}
# Now, we distribute as many of the 256 groups as possible evenly across the 7 days, using modular arithmetic:
# Set the number of groups in total, so that we can optionally multiply this number in the next step.
$groupCount = $groups.count
# Optionally, multiply the number of groups (256 in this case) by the number of times we want to process each group per week, e.g., twice (2):
$groupMultiplier = 1
$groupCount *= $groupMultiplier
# Figure out the remainder when the total number of groups to be processed per week is divided by the number of days in the week:
# E.g., 256 % 7 = 4
# ...or if we multiply the number of groups by 2, then 512 % 7 = 1
$mod = $groupCount % $days.count
# Figure out the base number of groups to be processed per day, without the remainder:
# E.g., (256 - 4) / 7 = 36 groups per day, plus 1 extra group on 4 days of the week.
# E.g., (512 - 1) / 7 = 73 groups per day, plus 1 extra group on one day of the week.
$baseGroupSize = ($groupCount - $mod) / $days.count
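# The arithmetic above can be double-checked with [math]::DivRem, which returns the quotient and hands back
# the remainder through a [ref] variable. A small sketch only: $quotientCheck and $remainderCheck are
# illustrative names and are not used by the rest of this script.
$remainderCheck = 0
$quotientCheck = [math]::DivRem(256, 7, [ref]$remainderCheck)
# For 256 groups over 7 days: $quotientCheck is 36 (the base groups per day) and $remainderCheck is 4 (the extra groups).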
# Distribute the groups across the days, using the base number of groups per day:
$groupCounter = 0
foreach ($d in [array]$days.Keys) {
    # A for loop (rather than 1..$baseGroupSize) also behaves when $baseGroupSize is 0, because 1..0 would iterate twice.
    for ($i = 0; $i -lt $baseGroupSize; $i++) {
        if ($groupCounter -eq $groups.Count) { $groupCounter = 0 } #<--: Loop back through the groups if we're multiplying the frequency.
        $days[$d] += $groups[$groupCounter]
        $groupCounter++
    }
}
# Finally, we wrap the remaining groups around the days until all groups are assigned:
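do {
    foreach ($d in [array]$days.Keys) {
        if ($mod -gt 0) {
            if ($groupCounter -eq $groups.Count) { $groupCounter = 0 } # Same wrap-around guard as in the loop above.
            $days[$d] += $groups[$groupCounter]
            $groupCounter++
            $mod--
        }
    }
}
until ($mod -eq 0)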
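# The two passes above (base allocation, then wrapping the remainder) can also be sketched as a single pass.
# This is a hedged variation, not the method used above: Split-GroupsAcrossBuckets is a hypothetical helper, and
# it gives the first ($total % bucket count) buckets one extra group as part of a contiguous range, instead of
# handing out the tail groups (FC-FF) afterwards. The per-day counts come out the same (37, 37, 37, 37, 36, 36, 36).
function Split-GroupsAcrossBuckets {
    param(
        [Parameter(Mandatory)] [string[]]$Groups,
        [Parameter(Mandatory)] [string[]]$BucketNames,
        [ValidateRange(1, 100)] [int]$Multiplier = 1
    )
    $total = $Groups.Count * $Multiplier
    $remainder = $total % $BucketNames.Count
    $baseSize = ($total - $remainder) / $BucketNames.Count
    $result = [ordered]@{}
    $next = 0
    for ($b = 0; $b -lt $BucketNames.Count; $b++) {
        # The first $remainder buckets take one extra group each.
        $size = $baseSize
        if ($b -lt $remainder) { $size++ }
        $result[$BucketNames[$b]] = @(
            for ($i = 0; $i -lt $size; $i++) {
                # The modulo wraps back to the first group when $Multiplier is greater than 1.
                $Groups[$next % $Groups.Count]
                $next++
            }
        )
    }
    $result
}
# Usage example with illustrative variables ($demoGroups, $demoDays, $demoPlan):
$demoGroups = 0..255 | ForEach-Object { $_.ToString('X2') }
$demoDays = [System.Enum]::GetNames([System.DayOfWeek])
$demoPlan = Split-GroupsAcrossBuckets -Groups $demoGroups -BucketNames $demoDays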
<#
$Days will look like this (for the example of 256 groups distributed across 7 days of the week):
Day       GroupCount Groups
---       ---------- ------
Sunday            37 {00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 0A, 0B, 0C, 0D, 0E, 0F,
                      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 1A, 1B, 1C, 1D, 1E, 1F,
                      20, 21, 22, 23, FC}
Monday            37 {24, 25, 26, 27, 28, 29, 2A, 2B, 2C, 2D, 2E, 2F,
                      30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 3A, 3B, 3C, 3D, 3E, 3F,
                      40, 41, 42, 43, 44, 45, 46, 47, FD}
Tuesday           37 {48, 49, 4A, 4B, 4C, 4D, 4E, 4F,
                      50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 5A, 5B, 5C, 5D, 5E, 5F,
                      60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 6A, 6B, FE}
Wednesday         37 {6C, 6D, 6E, 6F,
                      70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 7A, 7B, 7C, 7D, 7E, 7F,
                      80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 8A, 8B, 8C, 8D, 8E, 8F,
                      FF}
Thursday          36 {90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 9A, 9B, 9C, 9D, 9E, 9F,
                      A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, AA, AB, AC, AD, AE, AF,
                      B0, B1, B2, B3}
Friday            36 {B4, B5, B6, B7, B8, B9, BA, BB, BC, BD, BE, BF,
                      C0, C1, C2, C3, C4, C5, C6, C7, C8, C9, CA, CB, CC, CD, CE, CF,
                      D0, D1, D2, D3, D4, D5, D6, D7}
Saturday          36 {D8, D9, DA, DB, DC, DD, DE, DF,
                      E0, E1, E2, E3, E4, E5, E6, E7, E8, E9, EA, EB, EC, ED, EE, EF,
                      F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, FA, FB}
...or like this if multiplying the frequency by 2:
Day       GroupCount Groups
---       ---------- ------
Sunday            74 {00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 0A, 0B, 0C, 0D, 0E, 0F,
                      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 1A, 1B, 1C, 1D, 1E, 1F,
                      20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 2A, 2B, 2C, 2D, 2E, 2F,
                      30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 3A, 3B, 3C, 3D, 3E, 3F,
                      40, 41, 42, 43, 44, 45, 46, 47, 48, FF}
Monday            73 {49, 4A, 4B, 4C, 4D, 4E, 4F,
                      50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 5A, 5B, 5C, 5D, 5E, 5F,
                      60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 6A, 6B, 6C, 6D, 6E, 6F,
                      70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 7A, 7B, 7C, 7D, 7E, 7F,
                      80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 8A, 8B, 8C, 8D, 8E, 8F,
                      90, 91}
Tuesday           73 {92, 93, 94, 95, 96, 97, 98, 99, 9A, 9B, 9C, 9D, 9E, 9F,
                      A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, AA, AB, AC, AD, AE, AF,
                      B0, B1, B2, B3, B4, B5, B6, B7, B8, B9, BA, BB, BC, BD, BE, BF,
                      C0, C1, C2, C3, C4, C5, C6, C7, C8, C9, CA, CB, CC, CD, CE, CF,
                      D0, D1, D2, D3, D4, D5, D6, D7, D8, D9, DA}
Wednesday         73 {DB, DC, DD, DE, DF,
                      E0, E1, E2, E3, E4, E5, E6, E7, E8, E9, EA, EB, EC, ED, EE, EF,
                      F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, FA, FB, FC, FD, FE, FF,
                      00, 01, 02, 03, 04, 05, 06, 07, 08, 09, 0A, 0B, 0C, 0D, 0E, 0F,
                      10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 1A, 1B, 1C, 1D, 1E, 1F,
                      20, 21, 22, 23}
Thursday          73 {24, 25, 26, 27, 28, 29, 2A, 2B, 2C, 2D, 2E, 2F,
                      30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 3A, 3B, 3C, 3D, 3E, 3F,
                      40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 4A, 4B, 4C, 4D, 4E, 4F,
                      50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 5A, 5B, 5C, 5D, 5E, 5F,
                      60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 6A, 6B, 6C}
Friday            73 {6D, 6E, 6F,
                      70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 7A, 7B, 7C, 7D, 7E, 7F,
                      80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 8A, 8B, 8C, 8D, 8E, 8F,
                      90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 9A, 9B, 9C, 9D, 9E, 9F,
                      A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, AA, AB, AC, AD, AE, AF,
                      B0, B1, B2, B3, B4, B5}
Saturday          73 {B6, B7, B8, B9, BA, BB, BC, BD, BE, BF,
                      C0, C1, C2, C3, C4, C5, C6, C7, C8, C9, CA, CB, CC, CD, CE, CF,
                      D0, D1, D2, D3, D4, D5, D6, D7, D8, D9, DA, DB, DC, DD, DE, DF,
                      E0, E1, E2, E3, E4, E5, E6, E7, E8, E9, EA, EB, EC, ED, EE, EF,
                      F0, F1, F2, F3, F4, F5, F6, F7, F8, F9, FA, FB, FC, FD, FE}
#>
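# A summary like the Day / GroupCount / Groups listing above could be produced with a small helper. This is a sketch
# under assumptions: Get-PlanSummary is a hypothetical function name, and it expects a dictionary shaped like
# $days (day name as key, array of Guid prefixes as value).
function Get-PlanSummary {
    param([Parameter(Mandatory)] [System.Collections.IDictionary]$Plan)
    foreach ($key in @($Plan.Keys)) {
        [pscustomobject]@{
            Day        = $key
            GroupCount = @($Plan[$key]).Count
            Groups     = @($Plan[$key])
        }
    }
}
# Usage example with a tiny illustrative plan (in this script, -Plan $days would be passed instead):
$samplePlan = [ordered]@{
    'Sunday' = @('00', '01', 'FC')
    'Monday' = @('02', '03')
}
$sampleSummary = @(Get-PlanSummary -Plan $samplePlan)
# A simple coverage check: list any group that is not assigned exactly once (this stays empty for the sample plan).
$unexpected = @($sampleSummary | ForEach-Object { $_.Groups } | Group-Object | Where-Object { $_.Count -ne 1 })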
# Now, here's an example of how you can use this to distribute workloads across the week. Imagine the following code is run daily on a schedule.
# We want to process all mailboxes in Exchange Online, but we want to spread the work across the week. We can use the above $days hashtable to determine which group of mailboxes to
# process on any given day of the week:
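# DayOfWeek is an enum, so convert it to a string to be sure it matches the string keys of $days:
$Today = [datetime]::Today.DayOfWeek.ToString()
$GuidFilters = $days[$Today]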
$TodaysMailboxes = foreach ($gf in $GuidFilters) {
    Start-Sleep -Seconds 1
    Get-Mailbox -Filter "ExternalDirectoryObjectId -like '$gf*'" -ResultSize Unlimited
}
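# The loop above issues one Get-Mailbox call per prefix. A possible variation is to combine several prefixes into
# one filter string with -or, which would mean fewer calls per day. This is a sketch under assumptions:
# New-GuidPrefixFilter is a hypothetical helper, the batch size of 10 is arbitrary, and how well the service handles
# long -or filters has not been verified here, so test with a small batch first.
function New-GuidPrefixFilter {
    param(
        [Parameter(Mandatory)] [string[]]$Prefixes,
        [string]$Property = 'ExternalDirectoryObjectId',
        [ValidateRange(1, 50)] [int]$BatchSize = 10
    )
    for ($i = 0; $i -lt $Prefixes.Count; $i += $BatchSize) {
        $last = [math]::Min($i + $BatchSize, $Prefixes.Count) - 1
        # Build one "-like 'XX*'" clause per prefix in this batch, then join the clauses with -or.
        $clauses = foreach ($p in $Prefixes[$i..$last]) { "$Property -like '$p*'" }
        $clauses -join ' -or '
    }
}
# Usage example (illustrative prefixes): 3 prefixes in batches of 2 yield 2 filter strings.
$exampleFilters = @(New-GuidPrefixFilter -Prefixes '00', '01', '02' -BatchSize 2)
# With a connected Exchange Online session, each string could then be passed to Get-Mailbox, e.g.:
# foreach ($f in (New-GuidPrefixFilter -Prefixes $GuidFilters)) { Get-Mailbox -Filter $f -ResultSize Unlimited }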
# Process $TodaysMailboxes:
foreach ($mbx in $TodaysMailboxes) {
    Get-MailboxPermission -Identity $mbx
    Get-MailboxStatistics -Identity $mbx
    Get-MobileDeviceStatistics -Mailbox $mbx
    # etc.
}
# ...and so on, for each day of the week. This way, you can distribute the workload across the week, and be sure to process *all* mailboxes in one week (or twice/multiple times per week).
@MichelZ

MichelZ commented Dec 19, 2024

This is nice, thanks. I was thinking about doing the same thing, but it does only seem to work with this specific GUID property, and none of the others. Glad you've found the one that works

@JeremyTBradshaw
Author

I'm glad you found it and hope you get to make use of it. I've used it to straddle mailboxes across the days of the week for a few tasks in EXO where all mailboxes definitely cannot be processed in one go. Seemed like a universal epiphany once I found it.
