Skip to content

Instantly share code, notes, and snippets.

@bjulius
Created November 7, 2022 22:27
Show Gist options
  • Save bjulius/9fdfa9a1ff3029075dd5c9a990e7bc7f to your computer and use it in GitHub Desktop.
Save bjulius/9fdfa9a1ff3029075dd5c9a990e7bc7f to your computer and use it in GitHub Desktop.
Subject Data Cleaning Script for JJ
let
Source = Excel.Workbook(File.Contents("C:\Users\brjul\Downloads\Subjects.xlsx"), null, true),
Sheet1_Sheet = Source{[Item = "Sheet1", Kind = "Sheet"]}[Data],
PromoteHeaders = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars = true]),
SplitByDelimToRows = Table.ExpandListColumn(
Table.TransformColumns(
PromoteHeaders,
{
{
"Subject",
Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv),
let
itemType = (type nullable text) meta [Serialized.Text = true]
in
type {itemType}
}
}
),
"Subject"
),
LetterCheck = Table.AddColumn(
SplitByDelimToRows,
"ContainsLetter",
each if List.ContainsAny(Text.ToList([Subject]), {"A..Z", "a" .. "z"}) then 1 else 0
),
FilterNoLetters = Table.SelectRows(LetterCheck, each ([ContainsLetter] = 1)),
RemoveDupes = Table.Distinct(FilterNoLetters, {"Subject"}),
RemoveLetterCheck = Table.RemoveColumns(RemoveDupes, {"ContainsLetter"})
in
RemoveLetterCheck
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment