Still need to organize and cleanup.
-
pointers provide efficiency b/c we can reference things on the heap; the tradeoff is needing a GC
-
value semantics mean values live on the stack, not the heap
-
data segment - globals, literals
-
stack - one per go routine (2k initial size per go routine)
-
size of every frame is calculated at compile time; if we don't know the size of something at compile time, the heap is used. example: when you make a slice without setting a size, the compiler can't know how big it'll be; with a literal size like make([]string, 10), it may be able to use the stack
-
go is pass-by-value: what's copied is either the addr of something or the value itself
-
zero values for variables provide integrity
-
pointers are for sharing (across a program boundary, such as a function call); a pointer lets you indirectly access memory outside of your frame
-
pointers are always 4 or 8 bytes, depending on arch.
-
return &u -- makes a copy of the value's address and passes it up. the compiler performs escape analysis to determine where the value lives (stack or heap). when you see &, think share
u := user{...}
return &u
// the caller can use this as a value, but it lives on the heap b/c we returned its address (it escaped)
-
use & for readability to show you're sharing. don't do this:
u := &user{...}
return u
do this:
u := user{...}
return &u
-
only take the addr of a slice/map if passing it down the call stack, or for marshal/unmarshal
-
numeric, bool, string, slice, map all use value semantics
-
the factory function is the most important function: it tells you whether the type uses value or pointer semantics
-
do not mix value and pointer semantics with your types (aka structs)
-
when a factory function uses pointer semantics, it's telling you it's not safe to copy the value (ex: os.File)
-
go build -gcflags "-m -m" -- tells you what's going on with escape analysis (the output is not documented)
-
values on a stack can potentially move: when a stack has to grow (go uses contiguous stacks), there's a one-time hit to perform the expansion. the GC checks all stacks; if a go routine is using less than 25% of its stack, the stack is shrunk. a stack is only accessible to the go routine that owns it, so we share via the heap.
-
look at training in github, language/pointers/readme.md (see GC diagram)
-
pacing algorithm - figure out minimal heap we need at the right pace to have enough memory when we need it
-
25% performance hit when GC runs, b/c the GC has its own go routines, so your go routines have to move around to make room
-
go 1.8 removed mark phase 2, which gave GC performance boost
-
struct type: think "concrete" type. the language divides state and behavior: there are concrete types, and there are interfaces
type data struct {}
d := data{...}
d.displayName()
// this is what go does underneath
data.displayName(d)
// the function variable gets its own copy of d because the method uses a value receiver
f1 := d.displayName
f1()
d.name = "foo"
// calling the method via the variable, we don't see the change
f1()
// function variable will get the address of d because the method is using a pointer receiver
f2 := d.setAge
type reader interface {
// lets the caller manage the memory for the read
read(b []byte) (int, error)
}
var r reader -- r is valueless; its two internal words are both nil: `[nil | nil]`
-
polymorphism means a program or function behaves differently depending on the data it operates on (tom kurtz)
-
the first word of an interface value is a pointer to an iTable; the second word is a pointer to a copy of the data.
-
when you put data into an interface, it lives on the heap. the cost of an interface is the allocation and the indirect method call; the benefit is decoupling.
-
if a type implements its methods with value receivers, those methods operate on a copy of the data; for a value of that type, these are the only methods in its method set (the only ones legal to call through an interface).
-
don't use pointer semantics when you've chosen value semantics for a type.
-
literal values aren't in memory, so they don't have an address.
type duration int
duration(42) // has no address
- when creating a type, if you're not sure which semantics to use, choose pointer semantics.
// entities := []printer{u, &u}
// e is a copy of every element in the range (value semantics)
for _, e := range entities { ... }
you need to know which semantics you want when using range:
for _, u := range users { ... u ... } <-- value semantics (u is a copy)
for i := range users { ... users[i] ... } <-- pointer semantics
-
start with concrete types, then decouple/abstract with interfaces
-
go doesn't have subtyping; instead it wants you to group things by "what we do" (behavior, capability) instead of by what they are, which keeps code decoupled
-
bill's rule: you're "done" when tests cover 100% of the happy path and 80% of everything else
-
once you're "done", ask what can change, figure out what we can decouple.
-
it may take 2 days to get to prod, but ask for two weeks (so you have time to fix/refactor); otherwise, once it's in prod, mgmt doesn't let you touch it again
-
layer your apis
-
if a function creates a value and passes it up the stack, the value goes on the heap; this is not ideal
// choose this
func (*Xenia) Pull(d *Data) error { ... }
// over this
func (*Xenia) Pull() (*Data, error) { ... }
-
never embed a type for its state, only for its behavior
-
every time you perform a conversion, think "intent"
-
tdd - forces you to think about the api before you even know if the design/ideas will work; bill likes to prototype first
-
a library doesn't need to provide a mock/interface; your client code can declare the interface it needs. it's on the app to create it, if it needs it.
-
in go, each package directory compiles into a static library (archive), and these get flattened out at link time.
-
packages must provide, not contain. from a package's name you should be able to tell what it provides.
-
packages must be purposeful and portable.
-
do not put common types into packages.
-
less is more, single repo is a workflow mgmt solution
-
see goinggo.net (for package design) (posts feb 20, and feb 24)
-
every company should have a "kit" repo, packages needed by all apps go here.
-
rules:
-
a package at the same level as another can not import it. if they must be related, use the hierarchy of the tree to show the relationship.
-
these packages can't have opinions
application/
cmd/ -- have a folder for every binary you're building, can import anything it wants (can import down, imports going up is bad)
internal/ -- things reusable by multiple commands (these directories can't import one another); "internal" is used b/c the compiler prevents anyone outside this project from importing them.
platform/
vendor/ -- third party dependencies, this includes the "kit" package
-
have a single repo with multiple components in it
-
error handling on design/packaging/README.md
-
logging in go is not cheap b/c it uses the heap (20-30% performance hit); only write to logs when something is actionable
-
never log an error and then return it
-
error handling, see example6.go. look into using pkg/errors (dave cheney)
-
the lower down the call stack you can handle an error, the better.
-
in cmd, add integration tests, such as:
cmd/xenia/tests
-
bill doesn't use channels if he doesn't have to, only if they're needed
-
it's better to let a go routine do everything from beginning to end. handing out work to workers is not scalable.
-
channels are slow
-
cancellation, deadlines and timeouts are a beautiful use case for go routines
-
the go scheduler is a cooperating scheduler, b/c go runs in user space
-
4 classes of events let the scheduler make decisions:
- use of the go keyword
- GC
- system calls (such as fmt.Println)
- i/o (network, disk, etc)
-
you can have up to 10,000 threads in a single go program
-
more threads than cores means more context switches and more load; we don't want this
-
unknown latency with context switches
-
if you're doing i/o bound work, having more go routines than cores can be useful b/c threads go to sleep during i/o
-
if you're doing cpu-bound work, more go routines than cores doesn't help b/c the go routines never go to sleep
-
rule: do not create a go routine unless you know how and when it will end (w/o the go program terminating)
-
create go routines that are stateless and/or don't need to talk to each other
-
channels are not queues; they serve one purpose: signaling. you can signal with or without data.
-
an unbuffered channel allows you to signal another go routine with a guarantee that the signal was received
-
benefit: guarantee your signal was received (receive happens first, that's how you get the guarantee)
-
cost: unknown latency
-
buffered channel, allows us to send a signal with data
-
cost: no guarantee
-
benefit: reduced latency
-
if you have an unknown # of go routines that have to perform a send on the same channel, you can not use a buffered channel with capacity > 1
-
why? a buffered channel of size 1 gives enough throughput plus a delayed guarantee: your next send can only complete once the previous signal has been received
-
have to prove you need it to be larger (by measuring), will still be a small number
-
daniel data science (giving a talk)
-
see example1.go
-
func selectRecv: if you change ch to be unbuffered, you could get a memory leak b/c the go routine can not move on
use the empty struct (struct{}) to signal without data
-
if "ok" is true, you received a signal with data, if false, you received a signal w/o data
v, ok := <- ch -
when you close a channel, all receivers are notified immediately
-
calling close on a channel is a state change; it does not clean anything up