
@mindscratch
Last active May 15, 2018 15:34
One Day Advanced Ultimate Go Notes

Still need to organize and cleanup.


  • pointers provide efficiency b/c we can reference things on the heap; the tradeoff is having to have a GC

  • value semantics mean values can stay on the stack instead of going to the heap

  • data segment - globals, literals

  • stack - one per goroutine (starts at 2 KB per goroutine)

  • size of every frame is calculated at compile time; if we don't know the size of something at compile time, the heap is used. example: when you make a slice with a size that isn't known at compile time, the compiler can't know how big it'll be; if you provide a literal size, make([]string, 10), it can then maybe use the stack

  • go is pass-by-value: what gets copied is either the addr of something or a value

  • zero values for variables provides integrity

  • pointers are for sharing (across a program boundary, such as function call), pointer lets you access memory outside of your frame (indirectly)
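A minimal sketch of pass-by-value versus sharing with a pointer. The function names `incrementValue` and `incrementPointer` are hypothetical, chosen only for this illustration.

```go
package main

import "fmt"

// incrementValue receives its own copy of count; the caller's variable is untouched.
func incrementValue(count int) int {
	count++
	return count
}

// incrementPointer receives a copy of an address, so it can reach outside
// its own frame and change the caller's value indirectly.
func incrementPointer(count *int) {
	*count++
}

func main() {
	count := 10

	incrementValue(count)
	fmt.Println("after incrementValue:", count)

	incrementPointer(&count)
	fmt.Println("after incrementPointer:", count)
}
```

Both calls copy something: the first copies the int, the second copies the address of the int.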

  • * - a pointer is always 4 or 8 bytes, depending on arch.

  • return &u -- makes a copy of the value's address and passes it up. the compiler performs escape analysis to determine where the value lives (stack or heap). when you see &, think share

// we can use u as a value; however, it lives on the heap (so it's a pointer underneath),
// b/c we return a pointer to it
u := user{...}
return &u

  • use & at the return for readability, to show you're sharing. don't do this: u := &user{....}; return u. do this: u := user{...}; return &u
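As a rough sketch of the two factory styles, assuming a hypothetical `user` type and hypothetical function names `createUserV1` and `createUserV2`. Where each value ends up is decided by the compiler's escape analysis, so the comments describe the typical outcome, not a guarantee.

```go
package main

import "fmt"

// user is a hypothetical type used only for illustration.
type user struct {
	name  string
	email string
}

// createUserV1 uses value semantics: the caller gets its own copy,
// so u typically does not need to leave this function's frame.
func createUserV1() user {
	u := user{name: "Bill", email: "bill@example.com"}
	return u
}

// createUserV2 uses pointer semantics: u is shared up the call stack,
// so escape analysis typically places it on the heap.
func createUserV2() *user {
	u := user{name: "Bill", email: "bill@example.com"}
	return &u // the & at the return makes the sharing visible
}

func main() {
	u1 := createUserV1()
	u2 := createUserV2()
	fmt.Println(u1.name, u2.name)
}
```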

  • only take addr of slice/map if passing it down, or marshal/unmarshal

  • numeric, bool, string, slice, map all use value semantics

  • factory function is most important, it tells you if we're using value or pointer semantics

  • do not mix value and pointer semantics with your types (aka structs)

  • when a factory function uses pointer semantics, its telling you its not safe to copy the value of it (ex: os.File)

  • go build -gcflags "-m -m" -- tells you what's going on with escape analysis (not documented)

  • values on the stack can potentially move: when a stack has to grow, go (which uses contiguous stacks) allocates a larger one and copies the values over, a one-time hit per expansion. the GC checks all stacks; if a goroutine is using less than 25% of its stack, the GC shrinks it. a stack is only accessible to the goroutine that owns it, so we share via the heap.

  • look at training in github, language/pointers/readme.md (see GC diagram)

  • pacing algorithm - figure out minimal heap we need at the right pace to have enough memory when we need it

  • 25% performance hit when the GC runs, b/c the GC has its own goroutines, so application goroutines have to move around to make room

  • go 1.8 removed mark phase 2, which gave GC performance boost

  • struct type, think "concrete" type. language divides state and behavior. there are concrete types and interfaces

type data struct {}
d := data{...}
d.displayName()
// this is what go does underneath
data.displayName(d)

// function variable will get its own copy of d because the method is using a value receiver
f1 := d.displayName
f1()
d.name = "foo"
// call the method via the variable, we don't see the change
f1()

// function variable will get the address of d because the method is using a pointer receiver
f2 := d.setAge
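
A runnable sketch of the method-value behavior described in the snippet above. The fields on `data` and the method bodies are assumptions filled in for illustration.

```go
package main

import "fmt"

type data struct {
	name string
	age  int
}

// displayName uses a value receiver: it operates on a copy of data.
func (d data) displayName() string {
	return d.name
}

// setAge uses a pointer receiver: it operates on the shared value.
func (d *data) setAge(age int) {
	d.age = age
}

func main() {
	d := data{name: "bill"}

	f1 := d.displayName // f1 captures a copy of d at this point
	d.name = "foo"
	fmt.Println(f1())            // the copy still holds the old name
	fmt.Println(d.displayName()) // a direct call sees the change

	f2 := d.setAge // f2 captures the address of d
	f2(45)
	fmt.Println(d.age) // the change made through f2 is visible on d
}
```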

type reader interface { 
    // lets the caller manage memory (the caller supplies b)
    read(b []byte) (int, error)
}
var r reader // zero value is valueless: [nil | nil] (both words are nil)
  • polymorphism means a program or function behaves differently depending on the data it operates on (tom kurtz)

  • first word in an interface is a pointer, that points to iTable, the second word points to a copy of the data.

  • when you put data into an interface, it lives on the heap. cost of interface is allocation, and indirect call to the method. benefit is decoupling.
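A small sketch of what an interface value holds, assuming a hypothetical `printer` interface and `user` type. It shows the nil zero value, the copy made when a value is stored, and the sharing that happens when an address is stored instead.

```go
package main

import "fmt"

// printer is a hypothetical interface for this sketch.
type printer interface {
	print() string
}

type user struct {
	name string
}

func (u user) print() string {
	return u.name
}

func main() {
	var p printer
	fmt.Println(p == nil) // zero-value interface: both words are nil

	u := user{name: "bill"}
	p = u // the interface stores its own copy of u
	u.name = "foo"
	fmt.Println(p.print()) // still the copy made at assignment

	p = &u // storing the address shares u instead
	u.name = "bar"
	fmt.Println(p.print())
}
```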

  • method sets: when you store a value (not its address) in an interface, only the methods declared with a value receiver exist on that data, so these are the only methods that are legal to call through the interface.

  • may not use pointer semantics when using value semantics.

  • literal (constant) values aren't addressable, so you can't take their address.

type duration int
duration(42) // has no address
  • when creating a type, if you're not sure which semantics to use, choose pointer semantics.
// entities := []printer{u, &u}
// e is a copy of every element in the range (value semantics)
for _, e := range entities { ... }
  • you need to know which semantic you want when using range:
for _, val := range users { val }  // value semantics
for i := range users { users[i] }  // pointer semantics
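
A short sketch of the two forms of range, assuming a hypothetical `user` type with a `likes` counter. The value form works on a copy of each element; the index form works on the element in the slice.

```go
package main

import "fmt"

type user struct {
	name  string
	likes int
}

// addLikeValue ranges with value semantics: u is a copy of each element.
func addLikeValue(users []user) {
	for _, u := range users {
		u.likes++ // changes the copy only
		fmt.Println("copy:", u.name, u.likes)
	}
}

// addLikePointer ranges with pointer semantics: users[i] is the element itself.
func addLikePointer(users []user) {
	for i := range users {
		users[i].likes++
	}
}

func main() {
	users := []user{{name: "ann"}, {name: "bob"}}

	addLikeValue(users)
	fmt.Println("after value range:", users)

	addLikePointer(users)
	fmt.Println("after index range:", users)
}
```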

software design

composition

  • start with concrete types, then decouple/abstract with interfaces

  • go doesn't have subtyping, instead they want you to group things by "what we do" (behavior, capability) instead of group by what they are, this allows code to be decoupled

  • Bill's rule: you're "done" when tests cover 100% of the happy path and 80% of everything else

  • once you're "done", ask what can change, figure out what we can decouple.

  • if it takes 2 days to get something into prod, ask for two weeks (so you have time to fix and refactor); otherwise, once it's in prod, mgmt doesn't let you touch it again

  • layer your apis

  • if a function creates a value and passes it up the stack, it goes on the heap, this is not ideal

// choose this
func (*Xenia) Pull(d *Data) error { ... }

// over this
func (*Xenia) Pull() (*Data, error) { ... }
  • never embed a type for its state, only for its behavior

  • every time you perform a conversion, think "intent"

  • tdd - forces you to think about api before you even know the design/ideas will work, bill likes to prototype first

  • a library doesn't need to provide a mock/interface; your client code can declare the interface it wants to use. it's on the app to create it, if it needs it.
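
A sketch of the application declaring its own interface over a library's concrete type. Everything here is hypothetical: `Client` stands in for a type from some library, and `fetcher`, `mockClient`, and `lookup` are names invented for the example.

```go
package main

import "fmt"

// Client stands in for a concrete type exported by a hypothetical library.
// The library ships no interface and no mock.
type Client struct {
	host string
}

// Fetch pretends to retrieve a value from the remote host.
func (c *Client) Fetch(key string) (string, error) {
	return "value-of-" + key + "@" + c.host, nil
}

// fetcher is declared by the application, listing only the behavior it needs.
type fetcher interface {
	Fetch(key string) (string, error)
}

// mockClient is the application's own test double.
type mockClient struct{}

func (mockClient) Fetch(key string) (string, error) {
	return "mock-" + key, nil
}

// lookup is application code decoupled from the concrete library type.
func lookup(f fetcher, key string) (string, error) {
	return f.Fetch(key)
}

func main() {
	fromLib, _ := lookup(&Client{host: "example.com"}, "id")
	fromMock, _ := lookup(mockClient{}, "id")
	fmt.Println(fromLib, fromMock)
}
```

Because Go interfaces are satisfied implicitly, the library never has to know that `fetcher` exists.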

package oriented design

  • in go, directories are turned into static libraries, which get flattened out.

  • packages must provide, not contain. you should be able to name and tell what it provides.

  • packages must be purposeful and portable.

  • do not put common types into packages.

  • less is more, single repo is a workflow mgmt solution

  • see goinggo.net (for package design) (posts feb 20, and feb 24)

  • every company should have a "kit" repo, packages needed by all apps go here.

  • rules:

  • packages at the same level cannot import one another. if they must, use the hierarchy of the tree to show the relationship.

  • these packages can't have opinions

application/
    cmd/  -- have a folder for every binary you're building, can import anything it wants (can import down, imports going up is bad)
    internal/ -- things reusable by multiple commands (directories can't import one another), internal is used b/c the compiler prevents anyone outside this project from importing them.
        platform/
    vendor/ -- third party dependencies, this includes the "kit" package
  • have a single repo with multiple components in it

  • error handling on design/packaging/README.md

  • logging in go is not cheap, b/c it's using the heap (20-30% performance hit); only write to logs when something is actionable

  • never log an error and then return it

  • error handling, see example6.go. look into using pkg/errors (dave cheney)

  • the lower down the call stack you can handle an error, the better.
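
A sketch of "never log an error and then return it". The notes point to pkg/errors; to stay self-contained this example swaps in the standard library's `%w` wrapping (Go 1.13+) as a stand-in, which is an assumption and not what example6.go does. The function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

var errNotFound = errors.New("not found")

// findUser is the lowest level: it returns the error without logging it.
func findUser(id int) (string, error) {
	if id != 1 {
		return "", errNotFound
	}
	return "bill", nil
}

// loadProfile adds context and returns; it does not log either.
func loadProfile(id int) (string, error) {
	name, err := findUser(id)
	if err != nil {
		return "", fmt.Errorf("load profile %d: %w", id, err)
	}
	return "profile:" + name, nil
}

// isNotFound reports whether err wraps errNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	// The error is logged exactly once, at the point where it is handled.
	if _, err := loadProfile(2); err != nil {
		log.Printf("handled: %v", err)
	}
}
```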

  • in cmd, add integration tests, such as: cmd/xenia/tests

concurrency

  • Bill doesn't use channels if he doesn't have to, only when they're needed

  • it's better to let a goroutine do everything from beginning to end. handing out work to workers is not scalable.

  • channels are slow

  • cancellation, deadlines and timeouts are a beautiful use case for go routines

  • the go scheduler is a cooperative scheduler, b/c go runs in user space

  • 4 classes of events where the scheduler gets to make decisions:

  • use go keyword

  • GC

  • system calls (such as fmt.Println)

  • i/o (network, disk, etc)

  • you can have up to 10,000 threads in a single go program (the runtime's default limit)

  • more threads than cores means more context switches and more load; we don't want this

  • unknown latency with context switches

  • if you're doing i/o bound work, having more go routines than cores can be useful b/c threads go to sleep during i/o

  • if you're doing cpu work, more go routines than cpu isn't helping b/c the go routines are never going to sleep

  • rule: do not create a goroutine unless you know how and when it will end, w/o relying on the go program terminating

  • create goroutines that are stateless and/or don't need to talk to each other
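
One way to sketch both rules: each goroutine does its piece of work from start to finish without talking to the others, and the function that starts them knows exactly when they end. `sumSquares` is a hypothetical name, and `sync.WaitGroup` is one common way to do this, not something the notes prescribe.

```go
package main

import (
	"fmt"
	"sync"
)

// sumSquares starts one goroutine per input and does not return until
// every one of them has finished.
func sumSquares(nums []int) int {
	results := make([]int, len(nums))

	var wg sync.WaitGroup
	wg.Add(len(nums))
	for i, n := range nums {
		go func(i, n int) {
			defer wg.Done()
			results[i] = n * n // each goroutine writes only its own slot
		}(i, n)
	}
	wg.Wait() // every goroutine has ended by the time this returns

	total := 0
	for _, r := range results {
		total += r
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4}))
}
```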

  • channels are not queues, serve one purpose: signaling. you can signal with or without data.

  • an unbuffered channel allows you to signal another goroutine, with a guarantee that the signal was received

  • benefit: guarantee your signal was received (receive happens first, that's how you get the guarantee)

  • cost: unknown latency

  • buffered channel, allows us to send a signal with data

  • cost: no guarantee

  • benefit: reduced latency

  • if you have an unknown # of goroutines that have to perform a send on the same channel, you cannot use a buffered channel with a size > 1

  • why? a buffered channel of size 1 gives enough throughput and a (delayed) guarantee of receipt, b/c you can't send the next value on the channel until the previous one has been received

  • have to prove you need it to be larger (by measuring), will still be a small number
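
A sketch of the difference in guarantees, using a non-blocking send to make the behavior observable. `trySend` is a hypothetical helper written for this example.

```go
package main

import "fmt"

// trySend attempts a send without blocking and reports whether it went through.
func trySend(ch chan string, v string) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	unbuffered := make(chan string)
	// No goroutine is receiving, so the send cannot complete.
	fmt.Println(trySend(unbuffered, "a"))

	buffered := make(chan string, 1)
	// The first send lands in the buffer with no receiver present.
	fmt.Println(trySend(buffered, "a"))
	// The second cannot proceed until the first value has been received.
	fmt.Println(trySend(buffered, "b"))
	<-buffered
	fmt.Println(trySend(buffered, "b"))
}
```

With the unbuffered channel a send only completes when a receiver is there, which is the guarantee. With a buffer of 1 the sender moves on right away, and only learns on its next send that the previous value was picked up.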

  • daniel data science (giving a talk)

  • see example1.go - func selectRecv: if you change ch to be unbuffered, you could get a goroutine (memory) leak b/c the sending goroutine cannot move on
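
A sketch modeled on the description above, not a copy of example1.go. The names `recvWithTimeout`, `fast`, `slow`, `recvFast`, and `recvSlow` are hypothetical, and the durations are arbitrary values picked for the demo.

```go
package main

import (
	"fmt"
	"time"
)

// recvWithTimeout waits for work to finish, but gives up after timeout.
// The channel has a buffer of 1 so the worker can always complete its send
// and exit, even when nobody is left to receive.
func recvWithTimeout(timeout time.Duration, work func() string) (string, bool) {
	ch := make(chan string, 1)
	go func() {
		ch <- work()
	}()

	select {
	case v := <-ch:
		return v, true
	case <-time.After(timeout):
		return "", false
	}
}

// fast and slow are hypothetical workloads for the demo.
func fast() string { return "done" }

func slow() string {
	time.Sleep(200 * time.Millisecond)
	return "late"
}

func recvFast() (string, bool) { return recvWithTimeout(time.Second, fast) }
func recvSlow() (string, bool) { return recvWithTimeout(20*time.Millisecond, slow) }

func main() {
	fmt.Println(recvFast())
	fmt.Println(recvSlow())
}
```

If `ch` were unbuffered, the worker in the timed-out case would block on its send forever, since the receiver has already returned.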

  • use empty struct to signal we're signalling without data

  • v, ok := <-ch: if ok is true, you received a signal with data; if false, you received a signal w/o data (the channel was closed)

  • when you close a channel, all receivers are notified immediately

  • calling close on a channel is state change, it does not clean anything up
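
A sketch of close used as a broadcast signal without data. `broadcast` is a hypothetical function; the mutex-protected counter exists only so the example can report how many receivers saw the close.

```go
package main

import (
	"fmt"
	"sync"
)

// broadcast starts n goroutines that wait on done, closes done once,
// and returns how many goroutines observed the close.
func broadcast(n int) int {
	done := make(chan struct{}) // empty struct: a signal without data

	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		notified int
	)
	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			_, ok := <-done // ok is false: the signal carries no data
			if !ok {
				mu.Lock()
				notified++
				mu.Unlock()
			}
		}()
	}

	close(done) // a state change that every receiver observes
	wg.Wait()
	return notified
}

func main() {
	fmt.Println(broadcast(5))
}
```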

profiling
