2015-08-20: Go Workshop Notes

Workshop run by Bill Kennedy
Course materials available at https://github.com/ardanstudios/gotraining
This was a one-day distillation of the full 3-day course

Variables and Types

Type gives the compiler two things: size + representation. The compiler guarantees these types
When initialising a variable using var, it is given its zero value
Strings, slices, maps, interfaces are all reference types
Strings are immutable two-word data structures
- First word is a reference (pointer) to an array
- Second word is the number of bytes in the underlying array
Use unsized types (e.g. int, uint) unless you specifically need the size
- int and uint use the word size of the underlying architecture
- There is no unsized float: choose float32 or float64
:= is a short variable declaration operator: it initialises and infers the type from the value on the RHS
- A short variable declaration (SVD) must declare at least one new variable on the LHS (e.g. var r string; r, err := foo() would compile, because err is new)
Go doesn't allow casting; it converts instead (e.g. a := int32(10))
struct allows us to create user-defined types
example{ ... } defines a struct literal (assuming the type example struct has been defined)

type example struct { ... }
e1 := example{} // <-- don't do this
var e1 example  // do this: use `var` when you can (zero-allocation)
                // (exceptions: making code easier to read / reason)

With named types, compiler will enforce type compatibility
With anonymous (unnamed) types, compiler treats as compatible if the type definition matches
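A minimal sketch of zero values, short variable declaration and conversion. The example struct fields and the lookup function are hypothetical, made up for illustration only:

```go
package main

import "fmt"

// example is a hypothetical struct used only for illustration.
type example struct {
	flag    bool
	counter int16
	pi      float32
}

// lookup is a hypothetical function returning a value and an error.
func lookup() (string, error) {
	return "found", nil
}

// describe declares an example with var and formats its zero value.
func describe() string {
	var e example // every field is set to its zero value
	return fmt.Sprintf("%+v", e)
}

func main() {
	fmt.Println(describe()) // {flag:false counter:0 pi:0}

	var r string
	r, err := lookup() // compiles: err is new, r is reused
	fmt.Println(r, err)

	a := int32(10) // conversion, not a cast
	b := int64(a)  // int32 and int64 are distinct types, so convert explicitly
	fmt.Println(a, b)
}
```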

Pointers

Values can exist on the stack or the heap
A value only counts as being allocated when it's on the heap
Addresses in stack space go down
- i.e. a value added to the stack will have a lower address than one added before it
Anything on the stack does not have to be garbage collected
- Stack frames remain allocated on the way back up, and get written / initialised on the way down
Don't think about this when writing code: write for ease of maintenance, then benchmark and tweak
Initial goroutine stack size: 8k in 1.3, reduced to 2k in 1.4 (unchanged in 1.5)
& operator retrieves the address of a value
"Sharing" == pointers
With pointers, we're still passing by value, but the value is an address
The value for a pointer variable (e.g. *int) is always an address
- The address must always point to a value of the correct type (int, in this case)
Confusion: the * operator is used in both type definition and pointer dereferencing
- E.g. func increment(inc *int) { ... } vs. println("Inc:", *inc)
Example of dangers of using addresses / relying on implementation which might not work in the next version:
- Values "escape" from the stack to the heap (escape analysis)
- Use go build -gcflags -m to see escape analysis and heap allocation
- (-gcflags -S shows all the Plan 9 machine code for your code)
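A small sketch of passing by value versus sharing through a pointer. The function names are made up for illustration:

```go
package main

import "fmt"

// incrementValue receives its own copy, so the caller's variable is unchanged.
func incrementValue(n int) {
	n++
}

// incrementPointer receives a copy of the address and writes through it.
func incrementPointer(n *int) {
	*n++
}

func main() {
	count := 10

	incrementValue(count)
	fmt.Println(count) // 10

	incrementPointer(&count) // & takes the address of count
	fmt.Println(count)       // 11
}
```

In both calls the argument is passed by value; in the second call the value just happens to be an address.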

Constants

Constants are a compile-time construct (they only exist at compile time)
They also have a parallel type system
Numeric constants have a minimum precision of 256 bits, and are considered mathematically exact
Untyped constants with a given kind (e.g. const ui = 12345) can be implicitly converted by the compiler
Typed constants (e.g. const ti int = 12345) can't be implicitly converted
You get additional flexibility by using untyped constants
If you hear the phrase "ideal type," whoever said it is talking about constants of a kind
See package time for a really good use of constants + implicit type conversion
- (it's also a good example of where constants can be a helpful part of a package's API)
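A rough sketch of the difference between untyped and typed constants, using the same values as above. The timeout function is a made-up example of the package time style:

```go
package main

import (
	"fmt"
	"time"
)

const untyped = 12345   // untyped constant of kind integer
const typed int = 12345 // typed constant

// timeout multiplies an untyped constant (5) by time.Second, which is a
// typed constant of type time.Duration; the 5 is converted implicitly.
func timeout() time.Duration {
	return 5 * time.Second
}

func main() {
	var f float64 = untyped // allowed: the compiler converts implicitly
	// var g float64 = typed // would not compile: typed is an int
	g := float64(typed) // a typed constant needs an explicit conversion

	fmt.Println(f, g, timeout())
}
```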

Scoping

Three different scopes: package, function local, and block
"Block" scopes include anything delimited by {}, which includes if and for blocks
- if _, err := foo(); err != nil { ... }
Be mindful of variable shadowing: easy to do without realising
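A sketch of how shadowing can sneak in via := inside a block. The fail function is hypothetical and always returns an error:

```go
package main

import (
	"errors"
	"fmt"
)

// fail is a hypothetical function that always returns an error.
func fail() error { return errors.New("boom") }

// shadowed looks like it returns the error from fail, but the := inside the
// block declares a new err that shadows the outer one.
func shadowed() error {
	var err error
	if true {
		err := fail() // new variable, scoped to this block
		_ = err
	}
	return err // still nil
}

// fixed assigns to the outer err by using = instead of :=.
func fixed() error {
	var err error
	if true {
		err = fail()
	}
	return err
}

func main() {
	fmt.Println(shadowed()) // <nil>
	fmt.Println(fixed())    // boom
}
```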

Functions

Every package can have an init() function, which will execute before main()
- (you can technically have multiple init() functions in the same package...)
- Importing a package using a blank identifier means the imported package's init() functions will be called
  - E.g. import _ "foo", often used with database driver packages
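A minimal sketch showing that init() functions run before main(), and that a single file may contain more than one. The order variable is only there to record what ran:

```go
package main

import "fmt"

// order records which functions have run, in sequence.
var order []string

func init() { order = append(order, "first init") }

func init() { order = append(order, "second init") }

func main() {
	order = append(order, "main")
	fmt.Println(order) // [first init second init main]
}
```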

Panicking

Programs should not panic, and if they do you probably want the stack trace (so don't handle panics)
Three ways to terminate your program: panic() (to shut down and get the stack trace), log.Fatal or os.Exit
If you need to handle a panic, use defer + recover() (it's the only way to capture a panic)
- defer isn't free: setup + a heap allocation (which gets cleaned up, so doesn't require GC)
- See http://play.golang.org/p/eg14ClW4_y for an example of capturing a stack trace
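A sketch of the defer + recover() pattern, assuming you really do need to handle the panic. safeDivide is a made-up helper, not taken from the course materials:

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// safeDivide is a hypothetical helper that converts a panic into an error.
// The named result err lets the deferred function set the return value.
func safeDivide(a, b int) (result int, err error) {
	defer func() {
		if r := recover(); r != nil {
			// debug.Stack returns the stack trace of the current goroutine.
			err = fmt.Errorf("recovered: %v\n%s", r, debug.Stack())
		}
	}()
	return a / b, nil // integer division by zero panics at run time
}

func main() {
	fmt.Println(safeDivide(10, 2))

	_, err := safeDivide(1, 0)
	fmt.Println(err != nil) // true
}
```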

Arrays

The array is the core data structure behind slices, strings, etc
Pointer arithmetic is disallowed, e.g. trying to add to the location of an array
Iteration: use for ... range because it's safe (you can't go outside the range of the array)

for i, fruit := range strings {  // fruit is a copy of each element
	fmt.Println(i, fruit)
}

The size of an array is part of its type: [4]int is of a different type to [5]int
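A small sketch of two points above: the range value is a copy of each element, and the array size is part of the type. Function names are made up for illustration:

```go
package main

import "fmt"

// doubleCopies modifies only the per-iteration copy, so a is unchanged.
func doubleCopies(a [4]int) [4]int {
	for _, v := range a {
		v *= 2 // v is a copy of the element
		_ = v
	}
	return a
}

// doubleInPlace indexes into the array, so the elements are updated.
func doubleInPlace(a [4]int) [4]int {
	for i := range a {
		a[i] *= 2
	}
	return a // a is itself a copy of the caller's array
}

func main() {
	a := [4]int{1, 2, 3, 4}
	fmt.Println(doubleCopies(a))  // [1 2 3 4]
	fmt.Println(doubleInPlace(a)) // [2 4 6 8]
	fmt.Println(a)                // [1 2 3 4]: arrays are passed by value

	// var b [5]int = a // would not compile: [4]int and [5]int are different types
}
```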
"An array is just a slice waiting to happen"

Slices

Slices are backed by arrays
It is the core data structure, you're unlikely to write any program without them
A slice is a reference type, like a string, but with 3 elements:
- pointer: pointer to the memory location of the backing array
- length: the number of elements in the slice
- capacity: the total capacity of the underlying array
Use make() to create a slice, e.g. slice := make([]string, 5) for a slice of 5 zero-valued elements
- The two-argument form of make() sets both length and capacity to the same value; make([]string, 0, 5) sets them separately
Use slices of values, not pointers, because the data will be contiguous in memory
A nil slice (var s []T) has both a length and capacity of 0
Use append() to add elements to a slice
- If the length and capacity are equal, a new underlying array is created, values are copied across, new element is appended, and the new slice header (with correct pointer, length and capacity) is returned
- When capacity is below roughly 1,000 elements, the capacity is doubled; after that, it grows by ~20-40% (a runtime implementation detail that may change between versions)
You can create slices of slices, which is a new view of the underlying array
- s2 := s1[2:4]: the starting index is 2; the second index is an exclusive end index
- Think of the second index as start_index + required_length
- The length of the new slice will be the requested range
- The capacity of the new slice will be cap(s1) - starting_index
Writing to multiple slices off the same backing array is unsafe (values can be overwritten)
For safe appends, use a three-index slice to set the capacity to the same as the length: s2 := s1[2:4:4]
- Appending to s2 will require a new backing array to be created, making the write safe
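A sketch of the two-index versus three-index behaviour described above. The helper names are made up, and the slice contents are arbitrary:

```go
package main

import "fmt"

// twoIndex returns s[2:4], which keeps the spare capacity of s.
func twoIndex(s []string) []string { return s[2:4] }

// threeIndex returns s[2:4:4], whose capacity equals its length.
func threeIndex(s []string) []string { return s[2:4:4] }

func main() {
	s1 := []string{"a", "b", "c", "d", "e"}

	s2 := twoIndex(s1)
	fmt.Println(len(s2), cap(s2)) // 2 3 (cap is cap(s1) - 2)

	s2 = append(s2, "X") // fits in the spare capacity, so it overwrites s1[4]
	fmt.Println(s1)      // [a b c d X]

	s3 := threeIndex(s1)
	fmt.Println(len(s3), cap(s3)) // 2 2

	s3 = append(s3, "Y") // no spare capacity: a new backing array is created
	fmt.Println(s1)      // [a b c d X], unchanged by the append
	fmt.Println(s3)      // [c d Y]
}
```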
You can create a slice over all of an array via s := a[:]
You can omit the start_index if you want a slice starting from the beginning: s2 := s1[:4]
Nil slice: var data []string; Empty slice: data := []string{}
- Use an empty slice only if required, e.g. for JSON serialisation (an empty slice is encoded as [], a nil slice as null)
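A quick sketch of how encoding/json treats a nil slice versus an empty slice. The encode helper is made up for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode marshals a string slice to JSON text.
func encode(s []string) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	var nilSlice []string
	emptySlice := []string{}

	fmt.Println(encode(nilSlice))   // null
	fmt.Println(encode(emptySlice)) // []

	fmt.Println(nilSlice == nil, emptySlice == nil) // true false
}
```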
Parlour trick: a := []string{100:""} <-- creates a slice of length 101 with a[100] == ""
If you know the exact size of the slice, it's more efficient to declare that capacity up front
- But that will involve an apparently magic number, so don't do it unless you really need to

Methods

A function becomes a method when a receiver is attached to it
Receivers can be a value or pointer receiver:
- value: operates on its own copy of the receiver
- pointer: operates on a shared value
Value type will automatically be adjusted if necessary:
- If you call a method that has a value receiver on a pointer, Go will automatically dereference the pointer
- If you call a method that has a pointer receiver on an addressable value, Go will automatically take its address
All this is really just syntactic sugar; the receiver is effectively the first parameter to the function
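A sketch of value and pointer receivers, including the automatic adjustment and the "receiver is the first parameter" view. The counter type is made up for illustration:

```go
package main

import "fmt"

// counter is a hypothetical type used only for illustration.
type counter struct{ n int }

// incValue has a value receiver: it operates on its own copy.
func (c counter) incValue() { c.n++ }

// incPointer has a pointer receiver: it operates on the shared value.
func (c *counter) incPointer() { c.n++ }

func main() {
	c := counter{}
	c.incValue()     // increments a copy; c is unchanged
	c.incPointer()   // Go takes the address: (&c).incPointer()
	fmt.Println(c.n) // 1

	p := &c
	p.incValue()     // Go dereferences: (*p).incValue()
	p.incPointer()
	fmt.Println(c.n) // 2

	// Method expression: the receiver is passed as the first parameter.
	(*counter).incPointer(&c)
	fmt.Println(c.n) // 3
}
```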

Interfaces

Interfaces provide polymorphism, just like any other OO language
Interface type values are reference values: 1st word is the referred-to type; 2nd word is pointer
The nil value for an interface has a nil type and a nil pointer
There is no implements keyword or similar; a type just needs to declare methods with the correct signatures
Receiver type (value or pointer) is very important
Values of the incorrect type will not be adjusted automatically
Only values that implement the interface can be used
Think of it from the receiver point of view
- If you implement an interface using a pointer receiver, only pointers satisfy the interface
We can't always take the address of T, so we can't include the pointer receiver in the method set
- E.g. a plain numeric value might be stuck straight in a register, so has no memory address
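A sketch of the pointer-receiver rule: when the method uses a pointer receiver, only a pointer satisfies the interface. The notifier and user names are made up for illustration:

```go
package main

import "fmt"

// notifier is a hypothetical interface with a single method.
type notifier interface {
	notify() string
}

// user is a hypothetical concrete type.
type user struct{ name string }

// notify uses a pointer receiver, so only *user has it in its method set.
func (u *user) notify() string { return "notifying " + u.name }

// send accepts any value that implements notifier.
func send(n notifier) string { return n.notify() }

func main() {
	u := user{name: "bill"}

	// send(u) // would not compile: user does not implement notifier
	fmt.Println(send(&u)) // notifying bill

	var n notifier        // nil interface: nil type and nil pointer
	fmt.Println(n == nil) // true
}
```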

Concurrency: Goroutines

Concurrency is about managing lots of things at once
Parallelism is about doing lots of things at once
Any function or method can be launched as a goroutine
By default there is one logical processor (one per core in >=1.5), and each logical processor is given one OS thread
- Keep the minimum number of OS threads as busy as possible
- Don't launch more logical processors than you have cores
Scheduler uses a number of triggers to decide when to do the next thing (blocking calls, GC, function calls...)
A system call may result in the scheduler starting up a new OS thread while waiting for the system call to return
Don't code to the above: code assuming every goroutine is currently running
- Avoids race conditions (use go build -race or go test -race to detect race conditions)
- Calls running / returning in different orders
- Reading + writing data at the same time
Any function or method call (including anonymous functions) can be made into a goroutine by using the go keyword
More than one goroutine == chaos.
If main() ends, the program ends, so use sync.WaitGroup to ensure all goroutines are finished
runtime.GOMAXPROCS() can be used to set the number of logical processors, to allow goroutines to run in parallel
No concept of affinity between logical processors
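A minimal sketch of using sync.WaitGroup to wait for goroutines before returning. The squares function is made up for illustration:

```go
package main

import (
	"fmt"
	"sync"
)

// squares computes i*i for 0..n-1, using one goroutine per element.
func squares(n int) []int {
	out := make([]int, n)

	var wg sync.WaitGroup
	wg.Add(n) // one unit of work per goroutine

	for i := 0; i < n; i++ {
		go func(i int) {
			defer wg.Done()
			out[i] = i * i // each goroutine writes a distinct element, so no data race
		}(i)
	}

	wg.Wait() // block until every goroutine has called Done
	return out
}

func main() {
	fmt.Println(squares(5)) // [0 1 4 9 16]
}
```

Without wg.Wait() the function could return (and main could exit) before any of the goroutines had run.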

Concurrency: Channels

A channel is not a queue (even if it can act like one, that's not its purpose)
It's a way of creating guarantees in your code, and switching responsibility between goroutines
Unbuffered channel: guaranteed that the recipient has received the message
- This means that sender and/or recipient may be blocked
Buffered channels have a performance benefit, and may avoid the blocking, but there are risks associated with it
- Data may not be processed immediately, and if something goes wrong then the data may be lost
- There's no guarantee that data is going to come out the other end
- What happens when you reach the limit of the buffer? Then you do get blocking
Suggested rule: don't use buffered channels, especially when the buffer is >1
- If you can guarantee something will never block, then fine
- Use an unbuffered channel to ensure you understand what happens if the channel does block, then add buffering
Unbuffered channel: the receive happens before the send completes; for a buffered channel, the send happens before the corresponding receive completes
Being able to measure things like back-pressure is important
Worker pools are a good, measurable pattern
Sending or receiving on a nil channel will block forever, and closing a nil channel will panic (so use make(chan T))
close(c) will set state on the channel
- Closing a channel does not mark it for GC, which will happen if there are no references left to it
- A channel cannot be closed more than once
- A send on a closed channel will panic
- Any goroutines receiving on the channel will be immediately notified (once any buffered values are drained, they receive the zero value for the type)
- A second param on a channel receive (x, ok := <-ch) is only needed if a channel is being explicitly closed
A channel with a type struct{} can be used just for notifications
select + case allows us to receive on multiple channels at the same time

Type Assertions

Also called boxing + unboxing
A method of pulling a concrete type out of an interface
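A minimal sketch of the two forms of type assertion. The asString helper is made up for illustration:

```go
package main

import "fmt"

// asString uses the two-value form, which reports failure via ok
// instead of panicking.
func asString(v interface{}) (string, bool) {
	s, ok := v.(string)
	return s, ok
}

func main() {
	var v interface{} = "hello"

	s := v.(string) // single-value form: panics if v does not hold a string
	fmt.Println(s)  // hello

	fmt.Println(asString(v))  // hello true
	fmt.Println(asString(42)) // (empty string) false
}
```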

Notes, Tips and Suggestions

Lots of Go's idioms are based around mechanical sympathy (working with the hardware)
Go's syntax is "done" - going forward it's all about performance
Start by understanding type, then once you've got that, move on to behaviour and API design
Don't use the built-in println; use fmt.Println
Don't pass pointers to reference types
- The pass-by-value semantics will copy the slice header, which is effectively a pointer
Empty structs are zero-allocation
Go has closures
Sending a SIGQUIT (Ctrl-\) to a Go program will instantly quit it and print a stack trace
Define variables as close to where they're used (locally scoped if possible)
The closer the definition, the shorter the variable name can be
Anonymous structs are useful for avoiding type pollution where a type is only used locally
- Example of use: JSON deserialisation
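A sketch of an anonymous struct used for JSON deserialisation. The payload shape and the parseName helper are hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseName pulls one field out of a JSON document using an anonymous
// struct, so no package-level type is needed.
func parseName(data []byte) (string, error) {
	var payload struct {
		Name string `json:"name"`
	}
	if err := json.Unmarshal(data, &payload); err != nil {
		return "", err
	}
	return payload.Name, nil
}

func main() {
	name, err := parseName([]byte(`{"name": "gopher", "age": 5}`))
	fmt.Println(name, err) // gopher <nil>
}
```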
