Haskell

Google

Presentation about Haskell
- 101: https://www.youtube.com/watch?v=cTN1Qar4HSw&t=3417s
- 102: https://www.youtube.com/watch?v=Ug9yJnOYR4U
Sources: https://github.com/google/haskell-trainings

FULL LANGUAGE DOCS: at Hoogle https://hoogle.haskell.org/ Allows searching for types https://hoogle.haskell.org/?hoogle=Maybe%20Int%20-%3E%20Int%20-%3E%20Int

Universities

https://www.seas.upenn.edu/~cis194/spring13/lectures/01-intro.html

References and books

If you are coming from Imperative Languages (Java, C++), use https://github.com/pvorb/learn-you-a-haskell
Real World Haskell: http://book.realworldhaskell.org/read/
- Small projects in every chapter
https://haskellbook.com/
Created by Haskell Curry
Cabal? Cabal Hell?
Stackage? Stack?
Advanced
- Functors
- Monards
- Monard Transforms

Prerequirements

GHCI: types, codes, debug, test

It is

strongly statically typed
purely functional
Lazily evaluated
General purpose

It is not

A silver bullet
Only for matematitians
Hard?

Purely functional

Everything is a funcion

Data structures
Functions
Structs

Math	Haskell
f : Z -> Z	f :: Int -> Int
f(x) = x + 1	f x = x + 1

Everything is immutable

Evaluation can't be done
Can't mutaate anything in place
Transform data only by creating new

let a = 3 in 
    a := a + 1  ---compile error

Everything is an expression

No loops
No statements
No control flow
Expression is everyhing (if, can be assigned because it's an expression)
- The body is an expression

let a = if somebBool then 1 else 0 in 
    a + 1

Expanding the pressions too
- in is an expression on itself and can be changed

let a = if somebBool then 1 else 0 in 
    a + (let b = 2 in b)

Switch statements as well is an expression

let offset = case color of
                  Red   ->  0
                  Green ->  8 
                  Blue  -> 16
in baseValue + offset

No side effects unless explicitly stated

Can't read/write from databases, IO, etc
- Testing the code is trivial
- Input produces the same output
But we need to access

-- foo :: Int -> String

If we need to talk to external world
- Side effects are in IO
- Functions not in IO are deemed pure
- Functions in IO are deemed impure
- Main method is impure, Dependency Injection is impure, Utility methods are pure

-- readFile :: String -> IO String

Everything pure vs impure
- f :: a -> IO a (from pure to impure)
- f :: IO a -> a (impure to pure) does not exist, no side effects

C++ const functions IO corrupts

Lazy Evaluated

Regular languages: evaluated right away

a = expensive (is stored in memory)

Everything is evaluated and only evaluated when needed
- such as a print function
- delayed and deferred
Same as other languages:
- Second part is only evaluated when obj is not null, deferred and lazely evaluated

if (obj != NULL && obj-> value > 0)

Strict evaluation: inner to outer
Lazy evaluation: outer to inner

add( 12 + 8, 20 + 2)
      12+8 + 20 + 2
          42

Cons of Laziness - Memory Pitfalls

Delayed computations (but escape hatches)
- Accumulators are lazy and will grow,
- becomes a gigantic expression that is growing in memory and
- end up with a stack overflow
- Beginners see this at least once, but can be overcome
Haskell is lazy by default, but not purely lazy
- We can instruct it to be strict, overridding the behavior
- We can make things strict at precise points to avoid memory pitfalls if we know what we are doing (do things non-lazy)

Cons of Laziness - IO and Paralleism

10 threads (making them evaluated non-lazy)
Imagine writing solutions and joing their results to print
- But we can't forget that the whole thing is lazy
- When you do print is when the computations actually happens

spawn 10 threads
join them at B
print B

We can use the hatches we have in the language to say
- I want to be strict here in the thread to make sure it happens when you want it to happen
- By default, it just waits until the last possible moment

Those were the weird, non-intuitive and pitfalls for beginners

Pros - Huge Optimizations

Equation reduction and short-circuiting
No strong guarantee, the compiler will wait until the very last expression to evaluate it
- it does a better judgement in what to do based on the expression
- In eager language, it evaluates and writes it
- The compiler re-arranges the code to make the computation happen whereever it's efficient because it doesn't need to compute it immediately
For example, given a gigantic struct and you need to make a change to it,
- you need to make a copy of it because we never mutate anything in place, right?
- In eager languages, it's horrible because it makes lots of copies of a struct, which would have to be committed in memory because computations have to happen immediately.
- If we have a struct, then modifies it, 10 times, by the end of the function only returns a field of the struct and that function is pure and has no side effect
Why it is efficient? The only thing that matters is:
- since the compiler is not required, the compiler inlines all your code,
- removes everything
- never commits the struct in memory,
- and generates the equation that generates the fields that you want
- it then removes everything else

In short

That is, the compiler has a lot of freedom thanks to laziness
- because it removes most of the constraints associated with computations in other languages.
Purity and laziness work together
- Laziness gives the compiler a lot of freedom
- Purity informs the compiler to know what it is doing
First chapters of parallel programs in Hashell
- Spread multiple threads, several thousands of threads
- Forget about laziness
- compute sudokus, and at the end of the funciotns, to print the length of the list of results
- The compiler realizes that the length of the list of results is the same length of inputs, it removes the entire code that does all tparallel computations, prints the correct result, without spawning a single thread, optimizing all your code away because the compiler is able to verify that the reasult you want only requires that small comutation and non of all what you've written. The compiler will tear through your code.
Haskell is as fast as C, C++, rather being close to Python

Pros - Greater expressivity (infinite structures)

Generating expressions that are infinite
- However, because it's lazily-evaluated, it won't compute all the values, but only the 5 first elements

let naturalNumbers = [0, 1..]
let squreNumbers = map (2^) naturalNumbers
take 5 squareNumbers
[0, 1, 4, 9, 16]

Curried Functions, partial application

Named after Haskell Curry
What's on the left has the type that's on the right

-- f :: Int -> Int -> [Int]

Interpretations in differnet ways
- A function that takes a function and returns a list
- A function that takes two arguments that returns a list
In math, a function with multiple arguments can be represented by a chain of functions of one argument
- Under the hood, every function takes one argument, which is why the arrow for functions is take one thing output one thing

-- f :: Int -> ( Int -> [Int] )
-- f :: Int -> Int -> [Int]

The compiler does not distinguishes these lines as they mean the same
- Although it takes 2, the compiler can only call f with one

-- f     :: Int -> ( Int -> [Int] )
-- f     :: Int ->   Int -> [Int]
-- f 1   ::          Int -> [Int]
-- f 1 2 ::                 [Int]
-- (f 1) 2 ::               [Int]

Here, the same function can be called with any arrangement of parameters because it is as generic as it gets

Quizz

Lowercase Letter: type parameter
It expects any type name, you can depend on properties of a
Here, there are no preconditions whatsoever
This works for absolutely any type

-- ??? :: (a -> b) -> [a] -> [b]

(a -> b)
- It's a function from type A to type B
- This is how you would take a function as a first argument: surround the function in these parenthesis
- Surrounds the arrow function with parenthesis
[a]: list of values of type A
[b]: list of values of type B
- End result

What it turns out is that this is just a function map

By applying the transformation function on everyu single element

-- map :: (a -> b) -> [a] -> [b]

Can we add constraints

We can express level constraints on the types, not on the values, by default
- "I require type A to have such methods"
- The list must have this given size which has a constant value, NOT BY DEFAULT
These are language extensions, non-standards, that give the ability to express such conditions at the type level
- mostly comment level like in java constraints in the beginning
- the compiler checks them
- Liquid Haskell that allows you to prove assertions on the vaues themselves instead of proving assertions on these types, but it is completely non-standards (RECOMMENDED)

-- ???????   ::  (a -> Bool) -> [a] -> [a]
-- filter    ::  (a -> Bool) -> [a] -> [a]

Sad part: Sugar Code - curse of haskell - operators

Developer can create their own operators
You can create and reuse code
- But when horrobly written, it just is hard
Lots of operators in Haskell in the wild
- OSS
- 2 operators commonly used in Haskell and that will be seen in any code

Operators vs Functions

Main difference: the name

Operators: as just functions
- The compiler treats them as funcitons if they contain letters
- the compiler treats them as operators if they contain symbols
You can use an operator as argument to another function, just surruond it with parenthesis
- That's the case when you declare its type
Example: Operator $ (it already exists)
- We would have to surround it with parenthesis
- Based on its type

-- ($)  ::  (a -> b) -> a -> b

You can see it as a function of two arguments,
- takes a function from A to B (a -> b)
- It takes a value from type A -> a
- returns a value of type B -> b
- Since we know nothing about the aguments, the only thing the compiler can do is to apply the function to the value.

-- [ (a -> b) -> a ] -> b

If you see it as function with one argument that returns a function
- It takes a function from A to B (a -> b)
- and returns a function from A to B

-- [ (a -> b) ] -> [a  -> b ]

We don't need the $ function to apply to values
It is just syntatic sugar

$ operator

Function of sum

let a = fun (x + y)

Since function application has the highest priority and binds first
- If I were to remove the parenthesis in this example, it would be fun x + y
- Instead of being what I want, which is fun x + y, which is why we have to surround the argument with parenthesis to make sure it is computed first
- However, haskell doesn't like parenthesis that much, "this is Haskell not List"
- We can use the $ operator, because it simply is function application, but with the lowest priority

let a = fun $ x + y

Everything in the right will be computed as an argument [x + y]
Everything in the left will be computed as a function
And then the function on the left is going to be applied to the argument on the right
- Syntatic sugar for functional applcations without parenthesis =
- It is used absolutely everywhere in Haskell code, which is what you need to know about it.
It has a very nice property:
- If you are calling several functions, if you have parenthesis, you will have five to six parernthesis
- closing parenthesis at the end of the line
If you use dollars, you just need a chain of proper functions from right to left, but still
We can remove most of the parenthesis, not all of them, but most

dot operator (.)

It's not related to the dot operator in imperative OOP programming languages
It is used for composition
Combining functions without using lambda functions or declare new functions explicitly

-- (.)  ::  (b -> c) -> (a -> b) -> (a -> c)

Thsi is mostly applied in filters
- why dot? This was inspired by the circle operator in math

-- (f * g)(x) = f(g(x))

Examples of dot (.)

Function show takes Stuff and returns string
Function length takes a String and returns Int
Composition will take stuff and return Int

--         show   ::  Stuff -> String
-- length         ::  String -> Int
-- length . show  ::  Stuff -> Int

Use of $ and . operators

We build pipelines of functions
- Apply this transformation
- Then apply this transformation
Compining the . and $ sign
- You can evaluate everything on the left side of the dollar operator
- Have the dollar, apply the value of the function
This is already what we do in real world

cat input | grep token | sed stuff | tee output

Quizz with a vengeance

What does this function type mean?

--foldl :: (a -> b -> a) -> a -> [b] -> a

This is a reduce function that combines accumulator and value
- like in reduce
(a -> b -> a): Combines accumulator and value
a: initial accumulator
[b]: list of values
a: result

Data Types

Algebric Data types

type Point = (Int, Int)  -- tuple
type Polygon = [Point].  -- list
Type Map k v = [(k, v)]  -- type parameters

Immutable Data structures

We don't have member functions because we have no inheritance, no classes
No modifiers: No differentiation or anything
We have no setters because we never mutate a thing
We have no private members versus public members, because we dont have no objects

What we have is the following: Constructors

Naming convention, Constructors on how to create values
The compiler has the complete freedom as to how the thing is presented in memory
The way we declare data types is by listing the constructors are either constants or functions that create an expression of our type.

None Data

None :: None

Only possible argument to the expression None is None itself
- It's a convention
It can be used to represent lack of information, because if a function returns None value
- You know it'sonly going to be none because it cannot be anything else
- Similar implementation of the None type in Python, which is a constant none.

You can only call None expression by using it as a value that is itself

data None = None

Named Type

If we want to create values of type Minutes
- It's to call its only constructor, which is also named minutes
- Constructor that takes arguments

-- Data type minutes of type Int
data Minutes = Minutes Int

-- Constructor Minutes, which is a function that takes int and returns and expression of type Minutes
Minutes :: Int -> Minutes

-- Minutes 42 would be a valid expression of type Minutes
-- This kind of tpe could be used if you have a datetime library, or seconds or hours 
-- better abstraction instead of taking int and people making mistakes using it
-- It says no layout in memory, easy valid expression
Minutes 42 :: Minutes

Boolean Type

boolean values cannot be Null

-- Data type 
data Bool

-- Constructor of true
-- True  :: Bool

-- Constructor of false
-- False :: Bool

Maybe Data (Optional)

How do we represent the fact that we might not have a value?
C++ used to have boost optional until it's c++ 14 they introduced sdt optional intestead
Inspired by Optional
May or may not contain a value of a type
- It's a maybe of A
- A can be of any type like strings, lists, lists of lists of ints

-- Nothing   ::      Maybe A
-- Just      :: a -> Maybe A
-- Just 42   ::      Maybe Int
data Maybe a = Nothing | Just a

List Type

Just stores the list
- Linked list

data List  =  a Nil | Cell a (List a)
--- Nil  :: List a
--- Cell :: a -> List a -> List a
--- Cell 0 (Cell 1 (Nil)) :: List Int

default

data [a] = [] | (a:[a])
[]      :: [a]
(:)     :: a -> [a] -> [a]
[0:1]   :: [Int]

Record Syntax

With the simple syntax, we don't know the types of String and Int

data User = User String Int

User   :: String -> Int -> User

The regular value, it creates 2 functions
- Good to prefix the properties as they are represented as functions
- Works similar to getters
- Created on the same namespace and not overloaded the name and the age names

data User = User {
  userName :: String,
  userAge  :: Int
}

User     :: String -> Int -> User 
userName :: User -> String
userAge  :: User -> Int

Not Type

It's good to add the name and type of the function, implemented using regular functions

not :: Bool -> Bool
not x = if x then False else True

We can use Pattern Matching to implement the method

not :: Bool -> Bool
not True  = False
not False = True

Say we want to implement the AND expression

(&&)  ::  Bool -> Bool -> Bool
---x && y = ????
--- Implementation with function
x && y = if x
         then (if y then True else False)
         else False
         
-- Verbose, but also valid with Pattern Matching  
(&&) :: Bool -> Bool -> Bool
False && False = False
False && True = False
True && False = False
True && True = True

We don't need to list every single pattern
Patterns are tested in order
- Deferring the value of the matching can

(&&) :: Bool -> Bool -> Bool
True && True = True
x    && y    = False

Patterns are tested in order
- Deferring the value of the matching can

-- An even shorter version, we reduce the function for equality
(&&) :: Bool -> Bool -> Bool
True && y = y
x    && y = False

Declaring the names and not being used
- In python, we can declare then prefixed with _ to indicate our intent (of not using them, don't care linter)
- In Haskell, we can do the same to indicate the same
Since the compiler interprets from top-down, then the evaluation occurs in the first declaration
- Then, the next ones we don't care much

-- An even shorter version, we reduce the function for equality
(&&) :: Bool -> Bool -> Bool
True && y = y
_    && _ = False

Deconstructors

Pattern matching allows us to resolve functions and deconstruct the arguments
- And access fields inside them

data Minutes = Minutes Int
add :: Minutes -> Minutes - Minutes
-- Regular pattern  matching
add (Minutes x) (Minutes y) = Minutes (x + y)
-- Using the dollar $ to remote the parenthesis
add (Minutes x) (Minutes y) = Minutes $ x + y

More on lists: https://hackage.haskell.org/package/base-4.16.0.0/docs/Data-List.html

Length implementation: Lists are either
- empty list, empty brackets []
- or a cell a:, which is a value column and the rest of the list :[a]

Implememting it with patterns we would do as follows (using recursion)

data [a] = [] | (a:[a])
length :: [a] -> Int
-- length l = ??? how's the implementation?
length [] = 0
length (_:xs) = 1 + length xs

Lambda function \x -> function

Using the replacement of the functitions to write like math

-- Define the add function
let add x y = x + y :: Int

-- Show the type add
:t add
add :: Int -> Int -> Int

-- Use the function
add 4 8
12

-- Define a new function add5
let add5 = add 5

-- view the type
:t add5
add5 :: Int -> Int

-- View the result
add5 6
11

Under the hood, the compiler reduces the syntax to a function of one argument as follows
- Using a lambda function, when using a \
- This is the declaration that takes one argument and returns a function
- That way, the compiler sees no difference

let add x = \y -> x + y :: Int

add 4 5
11

Emurgo Haskell

Deriving Show: :t show shows the type

data FailableDouble -> 
safedDivision :: Double -> Double -> FailableDouble

Note: the :: is just a syntax to show types

fac :: Int -> Int
fac n = product [1..n]

-- recursive model
fac :: Int -> Int
-- base value
fac 0 = 1
-- recursive value
fac n = n * fac (n - 1)

List

product :: (Num a) => [a] -> a
product [] = 1
product (x:xs) = x * product xs

{-
product [2,3,5]
= 2 * product [3, 5]
= 2*3*5*1-
-}

Lists

ins :: Ord a => a -> [a] -> [a]
ins x [] = [x]
ins x (y:ys) | x <y = xy:y:ys
             | otherwise = y : ins x ys

:t zip

zip :: [a] -> [b] -> [(a, b)]
zip [1,2,3] [4,5,6]
[(1,4),(2,5),(3,6)]

Foldr vs Foldl

https://kseo.github.io/posts/2016-12-21-foldl-vs-foldl%27.html#:~:text=foldl'%20is%20always%20what%20you,using%20foldl'%20instead%20of%20foldl%20

Foldr: Works for infinite lists because we can read the values without evaluating the whole thing
Foldl: optioned for performance reaons
- MUST: https://wiki.haskell.org/Foldr_Foldl_Foldl%27
- https://wiki.haskell.org/Fold

Terinal

Filters

Prelude> filter (>100) [1000, 300, 40]
[1000,300]
Prelude>
Leaving GHCi.
🐳 Supercash Base 💰 alpine@3.15.0 🐧 [root@257f5642770d]:haskell-101 # vim 04_abstractions/src/Codelab.hs
🐳 Supercash Base 💰 alpine@3.15.0 🐧 [root@257f5642770d]:haskell-101 # vim 04_abstractions/src/Codelab.hs
🐳 Supercash Base 💰 alpine@3.15.0 🐧 [root@257f5642770d]:haskell-101 # ghci
GHCi, version 8.10.7: https://www.haskell.org/ghc/  :? for help
Prelude> :{
Prelude| gts100:[Integer]->[Integer]
Prelude| gt100:Integer->Bool
Prelude| g100 x = > 100
Prelude| :}

<interactive>:2:17: error: parse error on input ‘->’
Prelude> :{
Prelude| gts100:[Integer]->[Integer]
Prelude| gt100:Integer->Bool
Prelude| gts100:[Integer]->[Integer]
Prelude>
Leaving GHCi.
🐳 Supercash Base 💰 alpine@3.15.0 🐧 [root@257f5642770d]:haskell-101 # ghci
GHCi, version 8.10.7: https://www.haskell.org/ghc/  :? for help
Prelude> :{
Prelude| gt100::Integer->Bool
Prelude| gt100 x = x > 0
Prelude| :}
Prelude> gt100 103
True
Prelude> gt100 39
True
Prelude> :{
Prelude| gt100::Integer->Bool
Prelude| gt100 x = x > 100
Prelude| :}
Prelude> gt100 39
False
Prelude> gt100 103
True
Prelude> gts100 xs = filter gt100 xs
Prelude> rs

<interactive>:14:1: error: Variable not in scope: rs
Prelude> gtd100 [244, 34]

<interactive>:15:1: error:
    • Variable not in scope: gtd100 :: [a0] -> t
    • Perhaps you meant one of these:
        ‘Ghci1.gt100’ (imported from Ghci1), ‘gt100’ (line 9),
        ‘gts100’ (line 13)
Prelude> gts100 xs = filter (\x -> x > 100) xs
Prelude> gts100 [244, 34]
[244]

Foldr

Prelude> :{
Prelude| sum'::[Integer] -> Integer
Prelude| sum' [] = 0
Prelude| sum' (x:xs) = x + sum' xs
Prelude| :}
Prelude> sum' [2,4]
6

Prelude| :}
Prelude> pro
prod'           product         properFraction
Prelude> prod
prod'    product
Prelude> prod' [4,5]
20

Foldr

Prelude> :{
Prelude| foldr' _ v [] = v
Prelude| foldr' f v (x: xs) = f x (foldr' f v xs)
Prelude| :}

Prelude> :t foldr'
foldr' :: (t1 -> t2 -> t2) -> t2 -> [t1] -> t2

Prelude> len = foldr (\_ y -> 1 + y) 0
Prelude> len [3, 4, 5]
3
Prelude> len [3, 4]
2
Prelude> len "Marcello"
8
Prelude> len = foldr (\_ y -> 1 + y) 0

Prelude> foldr
foldr   foldr1
Prelude> foldr (++) [] ["abc", "def"]
"abcdef"
Prelude> :t (++)
(++) :: [a] -> [a] -> [a]
Prelude> foldr (++) "" ["abc", "def"]
"abcdef"
Prelude> foldr (++) " " ["abc", "def"]
"abcdef "
Prelude> foldl (++) " " ["abc", "def"]
" abcdef"
Prelude> foldr (&&) True [False, True]
False
Prelude> foldr (&&) True [True, True]
True
Prelude> foldr (&&) True [True, True, False]
False
Prelude> foldr max [] [[1,2,3], [1,2], [1,2,3,4,5]]
[1,2,3,4,5]
Prelude> foldr max [] ["0000", "00", "000000000"]
"000000000"
Prelude> foldr max 0 [1..10000]
10000
Prelude> foldr max 11110000 [1..10000]
11110000

Recursive function, with computation

Prelude| fun1 :: [Integer] -> Integer
Prelude| fun1 [] = 1
Prelude| fun1 (x:xs)
Prelude|   | even x = (x-2) * fun1 xs
Prelude|   | otherwise = fun1 xs
Prelude| :}

Prelude> fun1 [10, 3, 4, 5]
16
Prelude> fun1 [1,2,3]
0
Prelude> fun1 [1,4,3]
2
Prelude> fun1 [2,2]
0
Prelude> fun1 [4,4]
4
Prelude> fun1 [6,4]
8

Prelude> :{
Prelude| fun1' :: [Integer] -> Integer
Prelude| fun1' = foldr (*) 1 . map (subtract 2) . filter even
Prelude| :}

Prelude> fun1 [6,4]
8
Prelude> map (subtract 2) . filter even $ [4..10]
[2,4,6,8]
Prelude> map (subtract 2) . filter even $ [6,4]
[4,2]

NOTE: We don't need to declare x:xs when desinging folders

Prelude> :{
Prelude| rev :: [a] -> [a]
Prelude| rev [] = []
Prelude| rev (x:xs) = foldr (++) [x] rev xs
Prelude| :}

Prelude> :{
Prelude| rev :: [a] -> [a]
Prelude| rev = foldr (\x xs -> xs ++ [x]) []
Prelude| :}
Prelude> rev [1, 2, 3]
[3,2,1]
Prelude> :{
Prelude| rev' :: [a] -> [a]
Prelude| rev' = foldl (\acc x -> x:acc) []
Prelude| :}
Prelude> rev' [1, 3, 4]
[4,3,1]

Data Structures

Binary Tree

Prelude> data BinaryTree (a, b) = Empty | Node (BinaryTree) a (BinaryTree) b (BinaryTree) | Leaf

Docs

Look for docs in Hoogle

https://hoogle.haskell.org/?hoogle=len%20%3A%3A%20

marcellodesales/haskell-at-google.md