@TripleEmcoder
Last active May 28, 2025 07:16
Clarifying the Core Problem

I'm zeroing in on the crux of the issue: identifying MDF and LDF files that are truly "orphaned", meaning files that exist on disk but are not referenced by any database on a specific SQL Server instance. I'm pinning down this definition first so that the script reports the right files.

Outlining Script Structure

I've mapped out the foundational structure for the script. PowerShell is the primary orchestrator: it runs a T-SQL query against the SQL Server instance to get the list of "used" MDF/LDF files, then scans the file system and compares the two lists to pinpoint the orphans. I'm keeping the script flexible by letting users specify the SQL Server instance and the folders to scan. Paths from both sources need to be in a consistent form before they are compared.
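The file-system half of that plan could look roughly like the sketch below. The function name Get-CandidateDatabaseFile, its parameters, and the default extension list are hypothetical choices made for illustration, not part of any existing script.

```powershell
# Hypothetical helper: list candidate database files under the given folders.
function Get-CandidateDatabaseFile {
    param(
        [Parameter(Mandatory)]
        [string[]] $Folder,

        # Assumed extension list; adjust to match local naming conventions.
        [string[]] $Extension = @('.mdf', '.ndf', '.ldf')
    )

    foreach ($path in $Folder) {
        # Unreadable folders are skipped silently in this sketch.
        Get-ChildItem -LiteralPath $path -File -Recurse -ErrorAction SilentlyContinue |
            Where-Object { $Extension -contains $_.Extension.ToLowerInvariant() } |
            ForEach-Object { $_.FullName }
    }
}
```

A call such as Get-CandidateDatabaseFile -Folder 'D:\Data', 'E:\Logs' (example folders only) returns full paths as strings, ready to be compared against the list from SQL Server.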

Refining File Path Handling

I'm now focusing on making the file path handling robust. The script needs to correctly interpret and compare file paths from the SQL Server query with those found on the file system, accounting for potential differences in case, formatting, and UNC paths. I'm exploring various PowerShell methods to standardize these paths before comparison, ensuring accuracy in identifying orphaned files. This will be key to preventing false positives.
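One possible normalization step is sketched below using plain string operations. ConvertTo-NormalizedPath is a hypothetical name, and the rules it applies (unify separators, trim trailing separators, lower-case) are an assumed minimum rather than a complete treatment.

```powershell
# Hypothetical helper: bring a Windows-style path into a comparable form.
function ConvertTo-NormalizedPath {
    param([Parameter(Mandatory)][string] $Path)

    # Treat forward slashes as backslashes.
    $p = $Path.Trim().Replace('/', '\')

    # Drop trailing separators, but keep a drive root such as 'C:\' intact.
    if ($p.Length -gt 3) { $p = $p.TrimEnd('\') }

    # Lower-case last so that comparisons become case-insensitive.
    return $p.ToLowerInvariant()
}
```

This sketch does not collapse ".." segments or doubled separators, and it does not map a UNC path and a local drive path that point at the same file onto each other; those cases would need extra handling.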

Outlining the Comparison Logic

I'm now focusing on the comparison phase. My approach is to store the database file paths retrieved from SQL Server in a HashSet in PowerShell. I'll then iterate through the files found on the file system and check each path against the HashSet, which is an O(1) lookup on average instead of a scan of the whole list for every file. That matters most when there are many files. I'm also considering logging to help debug.
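A minimal sketch of that lookup follows. The paths are made-up sample values, and it uses a case-insensitive string comparer on the HashSet rather than lower-casing each path, which gives the same case-insensitive match. It assumes PowerShell 5 or later for the ::new syntax.

```powershell
# Paths as they might come back from SQL Server (illustrative values only).
$usedPaths = @('D:\Data\Sales.mdf', 'D:\Logs\Sales_log.ldf')

# Paths as they might be found on disk (illustrative values only).
$diskPaths = @('d:\data\sales.mdf', 'D:\Data\Old_Archive.mdf', 'D:\Logs\Sales_log.ldf')

# Case-insensitive set of paths that SQL Server reports as in use.
$used = [System.Collections.Generic.HashSet[string]]::new([System.StringComparer]::OrdinalIgnoreCase)
foreach ($p in $usedPaths) { [void] $used.Add($p) }

# Anything on disk that is not in the set is an orphan candidate.
$orphans = @($diskPaths | Where-Object { -not $used.Contains($_) })
```

With these sample values, only D:\Data\Old_Archive.mdf ends up in $orphans.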

Implementing the T-SQL Query

Now I'm focusing on crafting the T-SQL query that will retrieve the relevant database file paths from the SQL Server instance. I will use the sys.master_files catalog view, filtering for data (MDF/NDF) and log (LDF) file types. The query's physical_name column will provide the file paths. I'll test the query to ensure accurate results, which will be essential for the comparison step in PowerShell.
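The query could be sketched as below, wrapped in PowerShell. It assumes the SqlServer module (which provides Invoke-Sqlcmd) is installed; Get-UsedDatabaseFilePath is a hypothetical function name, and connection options such as authentication are left out.

```powershell
# T-SQL held in a here-string: one row per data or log file known to the instance.
$query = @'
SELECT DB_NAME(database_id) AS database_name,
       type_desc,
       physical_name
FROM sys.master_files
WHERE type IN (0, 1);  -- 0 = ROWS (data files), 1 = LOG
'@

# Hypothetical wrapper: return just the physical paths for one instance.
function Get-UsedDatabaseFilePath {
    param([Parameter(Mandatory)][string] $ServerInstance)

    Invoke-Sqlcmd -ServerInstance $ServerInstance -Query $query |
        ForEach-Object { $_.physical_name }
}
```

Filtering on the type column rather than on file extension is deliberate in this sketch, since SQL Server does not require data and log files to use the .mdf, .ndf, or .ldf extensions.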

Defining Orphan Criteria

I've completed my initial framework, focusing on the core logic: a direct, case-insensitive string comparison of file paths. I've chosen to lower-case paths before comparison to handle differences in casing. Lower-casing alone does not cover folder paths given with or without a trailing slash, so those need to be trimmed as well. I think this will be an effective, straightforward approach, so I can start testing and refining.

Testing the Comparison

I've finalized the comparison logic, using a HashSet for speed and lowercase normalization so that casing differences do not cause mismatches. I'm now testing the script to verify that it identifies orphaned files quickly and correctly, including when it has to process a large number of files.

Validating Path Normalization

I'm now focused on validating the path normalization strategy. I'm confirming that lower-casing paths before comparison reliably handles differences in casing, and checking separately that formatting differences and trailing slashes are normalized, since lower-casing does not address those. This includes testing with different path variations to ensure the script's accuracy and robustness.

Implementing User-Friendly Outputs

I'm now focusing on user-friendly outputs, adding more details about the processes, using color, and including messages if no orphans are detected. This makes it easier for users to interpret the script's results and ensures a positive user experience.
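The reporting step might look something like the sketch below. Write-OrphanReport, the message wording, and the color choices are all hypothetical.

```powershell
# Hypothetical helper: print the result in color, including the "nothing found" case.
function Write-OrphanReport {
    param(
        [AllowEmptyCollection()]
        [string[]] $Orphan = @()
    )

    if ($Orphan.Count -eq 0) {
        Write-Host 'No orphaned database files found.' -ForegroundColor Green
        return
    }

    Write-Host "Found $($Orphan.Count) orphaned file(s):" -ForegroundColor Yellow
    foreach ($path in $Orphan) {
        Write-Host "  $path" -ForegroundColor Red
    }
}
```

Write-Host is meant for people reading the console; if the list also needs to feed another command, the paths could additionally be emitted as normal pipeline output.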
