This VBA code library includes several functions and subroutines designed to manage sheet merging and data hashing in an Excel workbook. It also handles the import of CSV files into separate sheets, which are then merged into a single summary sheet.
IMPORT_PREFIX
: The prefix used to identify sheets that need to be imported (default is "imp-").IMPORT_SHEET_NAME
: The name of the merged worksheet where all other sheets' data will be consolidated (default is "Imported").
Important: Make sure to have Net Framework 3.5 installed on your Windows 10 64-bit Office setup to avoid hash routine errors.
Purpose:
Calculates the MD5 hash of a given string.
Arguments:
strToHash
: The string you want to hash.
Returns:
A string representing the MD5 hash of the input string.
Purpose:
Converts a column letter to its corresponding column number.
Arguments:
colLetter
: The column letter (e.g., "A", "Z", "AA").
Returns:
A long representing the column number.
Purpose:
Generates an MD5 hash for an entire row of cells within a given range.
Arguments:
rng
: The range representing a row.
Returns:
A string representing the hash of the entire row's content.
SanitizeSheetName(sheetName As String, Optional customRegexPattern As String = "", Optional customRegexReplace As String = "") As String
Purpose:
Sanitizes sheet names to make them valid Excel worksheet names.
Arguments:
sheetName
: The original sheet name.customRegexPattern
: Custom regular expression pattern to apply.customRegexReplace
: Replacement string for the custom regular expression.
Returns:
A sanitized sheet name as a string.
Purpose:
Merges data from all worksheets with names that start with IMPORT_PREFIX
into a single sheet named IMPORT_SHEET_NAME
.
Operation:
- Identifies the hash column in each worksheet to be merged.
- Compares hashes to avoid duplicate rows.
- Appends new rows to the merged sheet.
- Sorts the merged sheet by date (assumed to be in Column A).
- Deletes all original sheets with data that has been merged.
Purpose:
Imports data from one or more selected CSV files into new worksheets in the workbook.
Operation:
- Shows a file picker dialog for selecting CSV files.
- Creates new worksheets with sanitized names and imports the CSV data into these sheets.
- Adds a "Hash" column to each sheet to store the MD5 hash of each row.
Purpose:
Executes the entire operation of importing CSVs and then merging them into a single worksheet.
Operation:
- Calls
ImportCSVs
to import CSV files. - Calls
MergeSheets
to merge the newly created sheets. - Displays a message indicating that the process is complete.
To import and merge sheets, simply call the ImportAndMerge
subroutine.
Call ImportAndMerge
To only perform sheet merging, you can call the MergeSheets
subroutine.
Call MergeSheets
To import CSV files without merging, use the ImportCSVs
subroutine.
Call ImportCSVs
Const IMPORT_PREFIX As String = "imp-"
Const IMPORT_SHEET_NAME As String = "Imported"
Const IMPORT_SHEET_AMOUNT_COLUMN = "C"
' !!!! IMPORTANT !!!!
' It was found that the hash routines errored in a Windows 10, 64 bit Office setup.
' However, subsequent checking revealed the solution.
' The Windows platform must have intalled the Net Framework 3.5 (includes .Net 2 and .Net 3),
' this older version, and not only the Net Framework 4.8 Advanced Services that was enabled in Turn Windows Features on and off.
' When it was selected there, the routines worked perfectly.
'
' MD5 hash using VBA
Function MD5Hash(strToHash As String) As String
Dim xml As Object
Set xml = CreateObject("MSXML2.ServerXMLHTTP.6.0")
Dim bytes() As Byte
bytes = StrConv(strToHash, vbFromUnicode)
With CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
Dim byteHash() As Byte
byteHash = .ComputeHash_2((bytes))
MD5Hash = ""
For Each b In byteHash
MD5Hash = MD5Hash & Right("0" & Hex(b), 2)
Next
End With
End Function
Public Function CRC32(strToHash As String) As String
Dim i As Long, j As Long
Dim crcValue As Long
Dim byteValue As Byte
Dim polynomial As Long
Dim crcTable(0 To 255) As Long
' Polynomial used in the CRC-32 calculation
polynomial = &HEDB88320
' Generate the CRC table
For i = 0 To 255
crcValue = i
For j = 8 To 1 Step -1
If (crcValue And 1) <> 0 Then
crcValue = ((crcValue And &HFFFFFFFE) \ 2&) And &H7FFFFFFF
crcValue = crcValue Xor polynomial
Else
crcValue = ((crcValue And &HFFFFFFFE) \ 2&) And &H7FFFFFFF
End If
Next j
crcTable(i) = crcValue
Next i
' Initialize CRC value
crcValue = &HFFFFFFFF
' Calculate CRC
For i = 1 To Len(strToHash)
byteValue = Asc(Mid(strToHash, i, 1))
crcValue = crcTable((crcValue And &HFF) Xor byteValue) Xor ((crcValue \ 256&) And &HFFFFFF)
Next i
' Finalize CRC value
crcValue = Not crcValue
' Convert the CRC value to a hex string
CRC32 = Right("00000000" & Hex(crcValue), 8)
End Function
Function ColLetterToNumber(colLetter As String) As Long
ColLetterToNumber = Range(colLetter & "1").Column
End Function
Sub SetCurrencyFormat(wsName As String, col As String)
Dim lastRow As Long
Dim colNum As Long
Dim ws As Worksheet
Set ws = ThisWorkbook.Sheets(wsName)
colNum = ColLetterToNumber(col)
lastRow = ws.Cells(Rows.Count, colNum).End(xlUp).Row
ws.Range(ws.Cells(2, colNum), ws.Cells(lastRow, colNum)).NumberFormat = "$#,##0.00"
End Sub
Function RowHash(rng As Range, Optional ByRef colArray As Variant) As String
Dim i As Long
Dim strToHash As String
Dim ws As Worksheet
Dim rowNumber As Long
Dim cell As Range
Set ws = rng.Worksheet
rowNumber = rng.Row
If IsMissing(colArray) Then
' Use all columns if colArray is empty
For Each cell In rng
strToHash = strToHash & cell.Value
Next cell
Else
' Use specified columns to form the string to hash
For i = LBound(colArray) To UBound(colArray)
strToHash = strToHash & ws.Cells(rowNumber, colArray(i)).Value
Next i
End If
' Use the CRC32 function to get the hash of the concatenated string
RowHash = CRC32(strToHash)
End Function
Sub MergeSheets()
' Variable Declarations
Dim wsMerged As Worksheet ' Worksheet where all other sheets will be merged into
Dim ws As Worksheet ' Worksheet currently being processed
Dim lastRowMerged As Long ' Last row with data in the merged worksheet
Dim lastRowWs As Long ' Last row with data in the worksheet currently being processed
Dim wsToDelete As Collection ' Collection to store names of worksheets to be deleted
Dim wsName As Variant ' Used to loop through wsToDelete collection
Dim lastCol As Long ' Last column with data in the merged worksheet
Dim hashColWs As Long ' Column where the hash values are stored in the worksheet being merged
Dim newRowsMerged As Long ' Counter for the number of new rows added to the merged sheet
Dim wsHash As Range ' Range object to find hash values in merged worksheet
Dim sheetsMerged As Integer ' Counter for the number of sheets merged
' Initialize counter to 0
sheetsMerged = 0
' Initialize the collection to hold the names of the worksheets to delete
Set wsToDelete = New Collection
newRowsMerged = 0
' Initialize or reset the merged sheet
On Error Resume Next
Set wsMerged = ThisWorkbook.Sheets(IMPORT_SHEET_NAME)
On Error GoTo 0
' Create a new sheet if wsMerged does not exist
If wsMerged Is Nothing Then
Set wsMerged = ThisWorkbook.Sheets.Add
wsMerged.Name = IMPORT_SHEET_NAME
End If
' Move the wsMerged to the end
wsMerged.Move After:=ThisWorkbook.Sheets(ThisWorkbook.Sheets.Count)
' Check if the first row is empty in the merged sheet
Dim firstRowIsEmpty As Boolean ' Flag to check if the first row of the merged sheet is empty
firstRowIsEmpty = IsEmpty(wsMerged.Cells(1, 1).Value)
' Find the last column in the merged worksheet
If firstRowIsEmpty Then
lastCol = 1 ' Initialize to 1 if first row is empty
Else
lastCol = wsMerged.Cells(1, wsMerged.Columns.Count).End(xlToLeft).Column
End If
' Loop through each worksheet
For Each ws In ThisWorkbook.Sheets
' Only process sheets that match the import prefix
If ws.Name Like IMPORT_PREFIX & "*" Then
' Find the hash column in the current worksheet
hashColWs = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column
' Copy headers if the merged sheet is empty
If firstRowIsEmpty Then
ws.Rows(1).EntireRow.Copy wsMerged.Rows(1)
firstRowIsEmpty = False ' Mark that the first row is now populated
End If
' Determine last populated row for both sheets
lastRowMerged = wsMerged.Cells(wsMerged.Rows.Count, "A").End(xlUp).Row
lastRowWs = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
' Loop through each row in the current sheet to compare hash and copy if needed
For rw = 2 To lastRowWs
' Search for the hash value in the merged sheet
Set wsHash = wsMerged.Range(wsMerged.Cells(2, lastCol), wsMerged.Cells(lastRowMerged, lastCol)).Find(ws.Cells(rw, hashColWs).Value, LookAt:=xlWhole)
' Copy the row if hash does not exist in the merged sheet
If wsHash Is Nothing Then
ws.Rows(rw).EntireRow.Copy wsMerged.Cells(lastRowMerged + 1, 1)
lastRowMerged = lastRowMerged + 1
newRowsMerged = newRowsMerged + 1
End If
Next rw
' Mark this sheet for deletion
wsToDelete.Add ws.Name
' Increment counter for the number of sheets processed
sheetsMerged = sheetsMerged + 1
End If
Next ws
' Final steps: sort, delete sheets, and display message
If sheetsMerged > 0 Then
' Sort by date (assuming it is in Column A)
wsMerged.Sort.SortFields.Clear
wsMerged.Sort.SortFields.Add Key:=Range("A2:A" & lastRowMerged), SortOn:=xlSortOnValues, Order:=xlAscending, DataOption:=xlSortNormal
With wsMerged.Sort
.SetRange Range("A1:Z" & lastRowMerged)
.Header = xlYes
.Orientation = xlTopToBottom
.Apply
End With
' Delete the sheets marked for deletion
Application.DisplayAlerts = False
For Each wsName In wsToDelete
ThisWorkbook.Sheets(wsName).Delete
Next wsName
Application.DisplayAlerts = True
' Display message indicating how many new rows were merged
MsgBox "Total number of new rows merged: " & newRowsMerged
Else
' If no sheets were merged, display a message
MsgBox "No sheets were merged. Nothing to do."
End If
End Sub
Function SanitizeSheetName(sheetName As String, Optional customRegexPattern As String = "", Optional customRegexReplace As String = "") As String
Dim regEx As New RegExp
Dim validName As String
' Initialize the RegExp object
regEx.Global = True
regEx.IgnoreCase = True
' Apply custom regex replacement if provided
If customRegexPattern <> "" Then
regEx.Pattern = customRegexPattern
sheetName = regEx.Replace(sheetName, customRegexReplace)
End If
' Replace invalid Excel sheet characters with an underscore
regEx.Pattern = "[\/\\\*\[\]:\?]"
validName = regEx.Replace(sheetName, "_")
' Truncate to 31 characters
If Len(validName) > 31 Then
validName = Left(validName, 31)
End If
' Return the sanitized sheet name
SanitizeSheetName = validName
End Function
Sub FormatSheet()
Call SetCurrencyFormat(IMPORT_SHEET_NAME, IMPORT_SHEET_AMOUNT_COLUMN)
End Sub
Sub ImportCSVs()
Dim ws As Worksheet
Dim csvFile As String
Dim csvFilePath As String
Dim fileDialog As fileDialog
Dim selectedFiles() As String ' Declare as a dynamic array
Dim i As Integer
Dim lastRowWs As Long, hashColWs As Long
Dim rw As Long
' Create and show the file picker dialog
Set fileDialog = Application.fileDialog(msoFileDialogFilePicker)
With fileDialog
.Title = "Select CSV Files"
.AllowMultiSelect = True
.Filters.Add "CSV Files", "*.csv", 1
' Show the dialog and proceed if files are selected
If .Show = -1 Then
ReDim selectedFiles(1 To .SelectedItems.Count)
For i = 1 To .SelectedItems.Count
selectedFiles(i) = .SelectedItems(i)
Next i
Else
Exit Sub
End If
End With
' Loop through each selected CSV file
For i = 1 To UBound(selectedFiles)
csvFilePath = selectedFiles(i)
csvFile = Right(csvFilePath, Len(csvFilePath) - InStrRev(csvFilePath, "\"))
' Create a new worksheet with sanitized name
Set ws = ThisWorkbook.Sheets.Add
ws.Name = SanitizeSheetName(IMPORT_PREFIX & Replace(csvFile, ".csv", ""), "^(.*?)\s\d{4}-\d{2}-\d{2}.*", "$1")
' Import the CSV file into the new worksheet
With ws.QueryTables.Add(Connection:="TEXT;" & csvFilePath, Destination:=ws.Range("A1"))
.TextFileParseType = xlDelimited
.TextFileConsecutiveDelimiter = False
.TextFileTabDelimiter = False
.TextFileSemicolonDelimiter = False
.TextFileCommaDelimiter = True
.TextFileTextQualifier = xlTextQualifierDoubleQuote
.TextFileDecimalSeparator = ","
.TextFileThousandsSeparator = "."
.Refresh BackgroundQuery:=False
End With
' Find the last row and the last column in the imported worksheet
lastRowWs = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
hashColWs = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column + 1
' Add "Hash" as the header for the hash column
ws.Cells(1, hashColWs).Value = "Hash"
' Calculate hash for each row in the imported worksheet
For rw = 2 To lastRowWs
' The string to hash are Date, Amount, Counterparty & Description columns concatenated
ws.Cells(rw, hashColWs).Value = RowHash(ws.Rows(rw).Cells(1, 1).Resize(1, hashColWs - 1), Array(1, 3, 5, 7))
Next rw
Next i
End Sub
Sub ImportAndMerge()
Dim originalWsName As String
' Save the name of the currently active worksheet
originalWsName = ActiveSheet.Name
' Call ImportCSVs subroutine to import the CSV files
ImportCSVs
' Call MergeSheets subroutine to merge the sheets
MergeSheets
' Perform formatting on the merged sheet
FormatSheet
' Return to the original worksheet
ThisWorkbook.Sheets(originalWsName).Activate
' Display a message indicating that the process is complete
MsgBox "Import and merge complete."
End Sub