Last active
November 12, 2015 02:12
-
-
Save mwinkle/eeb4a5974d2aa2647099 to your computer and use it in GitHub Desktop.
An example of using F# to implement a U-SQL UDO (in this case, a processor).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
// sample U-SQL UDO (Processor) written in F# | |
// Note, currently (11/2015) requires deployment of F#.Core | |
namespace fSharpProcessor | |
open Microsoft.Analytics.Interfaces | |
type myProcessor() = | |
inherit IProcessor() | |
let CountryTranslation = dict [ "Deutschland", "Germany"; | |
"Schwiiz", "Switzerland"; | |
"UK", "United Kingdom"; | |
"USA", "United States of America"; | |
"中国", "PR China"; ] | |
override this.Process(row, output) = | |
let UserId = row.Get("UserId") | |
let Name = row.Get("Name") | |
let Address = row.Get("Address") | |
let City = row.Get("City") | |
let State = row.Get("State") | |
let PostalCode = row.Get("PostalCode") | |
let Country1 = row.Get("Country") | |
let Country = if CountryTranslation.ContainsKey(Country1) then CountryTranslation.[Country1] else Country1 | |
let Phone = row.Get("Phone") | |
output.Set(0, UserId) | |
output.Set(1, Name) | |
output.Set(2, Address) | |
output.Set(3, City) | |
output.Set(4, State) | |
output.Set(5, PostalCode) | |
output.Set(6, Country) | |
output.Set(7, Phone) | |
output.AsReadOnly() | |
Here's another way that is more F#-idiomatic. I've removed the need for the interface - it's now just a free standing function that takes in the row and output row. As long as this signature matches IRow * IUpdatableRow -> IRow
, the code would be valid.
/// This module might contain all my UDOs. No need for a separate class for each one - why do we
/// need a interface that only has a single function?
module MyUDOs =
let countryTranslation =
[ "Deutschland", "Germany"
"Schwiiz", "Switzerland"
"UK", "United Kingdom"
"USA", "United States of America"
"中国", "PR China" ] |> Map
let translateCountryUdo (row:IRow, output:IUpdatableRow) =
let country =
let country = row.["Country"]
countryTranslation.TryFind country
|> defaultArg <| country
output.[0] <- row.["UserId"]
output.[1] <- row.["Name"]
output.[2] <- row.["Address"]
output.[3] <- row.["City"]
output.[4] <- row.["State"]
output.[5] <- row.["PostalCode"]
output.[6] <- country
output.[7] <- row.["Phone"]
output.AsReadOnly()
Here's yet another example, that uses a hypothetical USQL Type Provider. This would parse the USQL file and provide bespoke parsing methods for that file.
/// Create Type Provider for Script.sql
type MyScript = USqlTypeProvider<"Script.usql">
module MyUDO =
let countryTranslation =
[ "Deutschland", "Germany"
"Schwiiz", "Switzerland"
"UK", "United Kingdom"
"USA", "United States of America"
"中国", "PR China" ] |> Map
let translateCountry (row:IRow, output:IUpdatableRow) =
// "Wrap" IRow as a strongly-typed object whose schema is inferred from the
// drivers definition in Script.usql
let row = MyScript.Parsers.driver(row)
let country =
let country = row.Country // row is now strongly typed, can access all properties safely.
countryTranslation.TryFind country |> defaultArg <| country
// provide a strongly-typed generator function for the drivers_CountryName schema
// takes in "output", populates it, and returns it AsReadOnly()
output
|> MyScript.Generators.drivers_CountryName(row.UserId, row.Name, row.Address, row.City, row.State, row.PostalCode, country, row.Phone)
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Here's a first alternative way of the code above (applies to C# as well as F#) - supporting indexers might make for a slightly more readable format: -