Go doesn't need a version bump, it needs a macro preprocessor

Macros, not version bumps

An analysis and proposal for dealing with the gotchas in Go 1 other than a breaking change in grammar and syntax

by Loki Verloren, 16-04-2019, Novi Sad, Serbia

Abstract

Go's architecture is aimed towards a specific type of application, namely multi-threaded, low-latency, high-throughput network systems.

On one hand, a whole slew of bug types, including ones that are particularly important for security, simply disappear: the garbage collector takes over managing the allocation and freeing of memory, and static types shift bugs from runtime to compile time.

Garbage collection is very much work for a machine. It does not need to be extremely precise; it just needs to ensure there is a chunk available big enough to satisfy the next request.

Never send a machine to do the job of a man, or a man to do the job of a machine

Sure, people can be meticulous enough to take care of memory allocation and freeing themselves. People who have cultivated this skill feel powerless against the 'stop the world' pauses that happen from time to time with Go's scheduler and garbage collector.

What kinds of programmers like Go?

The kinds of programmers who like Go are those with previous experience in interpreted languages like Python, Ruby and Node.js. They fall in love with it at first because of how much faster it is in execution, while the turnaround from changing the code to testing it stays nearly as quick as they are used to.

Some specifically like the ultra-lightweight multithreading of goroutines and channels; others like how, in common with those interpreted languages, Go's compiler can produce binaries for almost every computer architecture there is.

However, this type of programmer will also tend to have a lot of problems with the peculiar grammar and the various idioms of how to implement solutions to specific types of problems.

Some of the bug-traps I will describe are virtually custom-made to catch exactly this kind of mind. With Go, you can skate very close to the edge of completely disorganised, and still end up with something useful.

More informally, from my limited contact with a number of Go programmers, I put on my psychiatrist's hat and say 'hmm, looks like AD/HD'.

Such people tend to become overwhelmed at complexity, but if they can maintain momentum, their executive processing system is keyed up and the capacity to maintain perspective is thereby improved.

Go, like many of the Niklaus Wirth-inspired languages, takes precise, explicit control of memory management away from the programmer, which comes as a shock to those used to the different patterns found in C, C++ and other imperative, procedural, non-garbage-collected languages.

It does this for a reason: memory leaks and the associated types of bugs involving incorrect memory accesses are the most common kind of bug found in C, and lead to stack overflows, privilege escalations and race conditions.

Go is specifically aimed at rapid prototyping: almost every part of the development process is streamlined and minimalistic, and it adopts a substantial part of the Niklaus Wirth grammar patterns, which are both easy to parse into syntax trees and simpler to understand, hence the frequency of Pascal's appearance in beginner programming courses.

Go's immediacy and the cognitive style of the programmers who favour it

I can definitely say that my capacity to visualise and even elaborate the algorithms, data structures and protocols of a larger application, is greatly improved with Go.

I always had the problem of not being able to keep the superstructure in mind as I flesh out the details of the components of the system.

With Go, I seem to be able to construct much larger structures, and I can easily go from hello world to cryptocurrency peer node, in a matter of months of work, and with very little real preparation or memorisation.

But then, at a certain point, it's not so much that it becomes a mess as that I have to step away and relax a bit, in order to come back and start with a fresh mind.

In this article my intention is to elaborate the problem domain, and the types of strategies that suit their resolution.

Where Go breaks currently:

Variable scope shadowing

Shadowed variables declared via := are a common cause of problems, and this is exacerbated by the tendency for narrow scope variables to be named as single letters or very short words.

Frequently they are single letters; I don't know about others, but I always want to call my iterators i and my strings s.

For example, in Go, this is perfectly valid and we see the very same symbol represents two different variables:

var twoDeeArray [][]string
for i, x := range twoDeeArray { // outer i and x
    for i, x := range x { // the inner i and x shadow the outer pair
        _, _ = i, x // any use of i or x in here sees only the inner pair
    }
    _ = i // back on this side of the brace, i is the outer variable again
}

The compiler will not flag an error at this. The value of x will depend on which side of the second for statement you are looking from.

The linter will maybe inform you about it, but what happens when the value we expect is three screenfuls back, and our code is on the wrong side of a declaration scope boundary?

Overly verbose, and ad-hoc error/exception handling

Error and exception handling are very verbose and clunky in Go. Originally, error was not even a built-in type.

The multiple return values are not interchangeable with structured values: slices, arrays and structs. Even though the pattern matches, a slice of interface{} cannot receive Go's multiple return values.

The error is a specific type of variable that has to be manually returned from everything, and this exacerbates the need for ever more local variable declarations.

The 'idiom' dictates extensive use of the pseudo-tuple returns, which then results in the need to make a lot of local variable declarations (and thus the risk of two variables appearing to be one to the reader).

The error handling idiom is extremely simple. Variables receive the returns from a function, and the handling process is completely arbitrary.

It almost doesn't need to be illustrated, as it is one of the first features of Go most coders grasp. This is because you see it so much:

result, err := pkg.SomeFunction(someParameter)
if err != nil {
    panic(err)
}

Go is a quite low level language, on par with C in some aspects of its grammar, and like C, a small simple set of rules can produce a diversity of valid approaches to many problems.

100% of the Go standard library is written in an arbitrary manner, and the style of them varies quite widely, though perhaps not quite as broadly as you will find with a survey of repositories on Github.

There isn't really anything stopping anyone from rewriting the stdlib, nor, should the Go compiler ever be deposed from its first-class position within Google's framework, from rewriting the so-called 'Go Idiom' along with it.

To illustrate, above is the idiomatic pattern for error handling. However, it is quite simple to implement other models.

With some wordy interface specifications and a lot of boilerplate you can have something like this instead:

pkg.New().HandleSomeFunctionError(pkg.SomeFunction())

However, it takes a lot of boilerplate to do it this way. A lot of copy and paste, and then for more than one, an unmaintainable mess.

Or, in other words, you need a complete, if basic, standard library, that is also written in this type of pattern.
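
As a rough sketch of the scaffolding a single call site already needs, here is a hypothetical handler type that makes the chained call above legal (Go permits f(g()) when g's return values exactly match f's parameters); all names here are invented:

type Handler struct{}

func New() *Handler { return &Handler{} }

// HandleSomeFunctionError consumes the (result, error) pair in one call.
func (h *Handler) HandleSomeFunctionError(result string, err error) string {
    if err != nil {
        panic(err) // or log it, retry, and so on
    }
    return result
}

// usage: value := New().HandleSomeFunctionError(SomeFunction(param))
// and a variant of this wrapper is needed for every distinct return signature.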

It is fortunate that the multi-cursor text editor appeared around the same time as Go, because you are a masochist if you program in Go and don't make use of multiple cursors to spit out the boilerplate required by the static typing and lack of generics.

Idiomatic Go code is inherently repetitive and wordy

Maybe you can push 4GB/s back and forth from your SSD, but no, you have to manually handle the incredibly advanced task of text substitution on the machine's behalf, in its own code, just to swap the types between variables that even share all of the same symbolic operators.

First rule you have to abandon to make any substantially sized Go application is DRY.

You can use interfaces but they require type switches, when implementing orthogonal and even nearly word for word identical algorithms, differing only in the type in the returns and declarations.

This is why go:generate was introduced. But you have to separately invoke it, it kinda seems a bit silly to me. Plus! it basically is a super simple text substitution engine. Ya know? Like a cut down macro processor.
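
For reference, this is all a go:generate directive is: a comment that the toolchain only acts on when you run go generate separately (stringer here is the real golang.org/x/tools example tool):

//go:generate stringer -type=Weekday
type Weekday int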

The idiomatic grammar is not just verbose, it is also clumsy. Multiple returns look really cool for a few minutes, until you start to do things they prevent (like pipelining).

If those comma separated values were commutable to []interface{}, or interface{}, it wouldn't be so bad, because they then can become the parameters for another function.

But they are not. They are not tuples either. If you put commas between two variables, it expects variables or literals of the same number as the values to be on the other side of the assignment operator.

They are not a type, at least not one whose existence you can declare other than as the return values of a function; once they are accessible in scope, they are identical to individual variables declared at any scope level. They are not structured.
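
A minimal sketch of that restriction, using a hypothetical twoValues function:

func twoValues() (int, error) { return 42, nil }

func demo() {
    n, err := twoValues() // fine: exactly two variables on the left
    _, _ = n, err

    // vs := []interface{}{twoValues()}
    // does not compile: a multiple-value expression in a single-value context
}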

Zero != Nil

Nil pointer panics

In Go, there is a huge difference between zero, nil, and empty.

  • Zero means the value is initialised and ready to load with new data.

  • Empty means there is a slice management structure there, but there are zero elements.

  • Nil means there is no structure at all, and the pointer is invalid (zero pointer).

Simple numerical values can be zero, but they cannot be empty or nil.

Pointers can be nil, which is the pointer version of zero, but they are not containers, there is no 'empty' pointer. A pointer is really just an integer, a coordinate for the memory map.

Structs, Maps and Slices, can all be Empty and Nil.
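
A minimal sketch of the three states for a slice:

var s []int            // nil: no backing structure at all, s == nil is true
t := []int{}           // empty: a slice header exists with zero elements, t == nil is false
u := make([]int, 0, 8) // empty, with room for 8 elements already allocated
// len(s) == len(t) == len(u) == 0, but only s is nil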

And then, to not leave anything out, there is the interface, interface{}, which is like a pointer with a second value: the type of the variable.

To access an interface, you have to identify the type, and then declare a variable to refer to it.

Pointer handling is nice and safe, for the most part, but you can put nil into parameters (usually inadvertently, and frequently) and POW!!! nil pointer panic.

I'd estimate that the ratio between incidents of nil pointer panics in Go versus memory leaks in C is probably about 1:1, comparing beginner programmers of similar levels of experience.

It is good to shift errors from runtime to compile time, in deference to the entire goal of all this, the service to the user, but this is only a half step.

But...

It is not complicated to get Go to do anything. It just sometimes can take a lotta lotta code because you have to repeat everything to implement the same algorithms on a different type.

Or there will be a type switch in the middle of it, also not terribly concise.

Don't get me wrong, interfaces are awesome. Typed pointers. Java was the most famous user of interfaces, and it raised their profile as a programming language grammar element. Several early OOP languages had them.

Interfaces are basically a collection of functions that share an internal, and inaccessible data storage representation, but that use a more generic type of data that is derived from whatever representation is hidden inside the black box.

Interfaces are basically the computer program version of a protocol. Protocols are singular, but implementations can be manifold.

Interfaces are used extensively in the go standard library to enable functions to accept many types of values, but to implement this, the type has to first be resolved, in order to be able to load it into a slot and refer to it with a typed symbol.

But if I implement a simple iterator type, one for 32 bit floating point numbers, and another for 64 bit signed integers, I can literally copy and paste and just change the type name.

Yet, the simplest solution is simple text substitution - almost exactly the same as the Printf function!

Copy and paste, search and replace, compile and voila.
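
To make the duplication concrete, here is a hypothetical pair of types that differ only in the element type; every bug fix has to be applied to both:

type Float32Stack struct{ items []float32 }

func (s *Float32Stack) Push(v float32) { s.items = append(s.items, v) }
func (s *Float32Stack) Pop() float32 {
    v := s.items[len(s.items)-1]
    s.items = s.items[:len(s.items)-1]
    return v
}

type Int64Stack struct{ items []int64 }

func (s *Int64Stack) Push(v int64) { s.items = append(s.items, v) }
func (s *Int64Stack) Pop() int64 {
    v := s.items[len(s.items)-1]
    s.items = s.items[:len(s.items)-1]
    return v
}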

But what happens when you need 20 different ones and they need to implement several interfaces each? Sure, copy and paste.

But then you spin up five of them, and then notice a bug.

You have to correct all five versions, even though they only differ in the type name, and not in any other part of the code.

This is the polar opposite of DRY (Don't repeat yourself) and it makes the structure of Go applications brittle once they get to a certain size.

You can use interfaces and type switches to reduce some of the repetition, but a lot of it is replaced with the various necessary steps to resolve interfaces.

Interfaces are even 'cooler' with zero, nil and empty than any other type, and I mean that sarcastically. There are typed nils, and there is the interface nil. A typed nil will not switch on an untyped nil case, and vice versa.
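
A minimal sketch of the typed-nil trap:

var p *int            // a typed nil
var i interface{} = p // i now holds the pair (type *int, value nil)
// p == nil is true, but i == nil is false: the interface itself is not nil,
// so an untyped `case nil:` in a type switch will not match it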

Slice/Array Bounds errors

Closely related to passing nil pointers are slice bounds errors.

These are serious stop-the-world errors too: Go checks every slice and array access against its bounds at runtime, and a failed check panics, which, unless recovered, shuts the program down right there.

Bounds errors are runtime errors

There is no way to catch them in compilation, as they depend entirely upon the data, and whether the iterators operating on it are ensuring they don't try to overstep.

Only mocks and test data are predictable, and only after their behaviour has been logged.

This is a case where a human is doing a computer's job. What should be happening here is that the syntax of the accessors needs to be human-friendly:

x = customListType.New()

    x.New, x.Next, x.Previous, x.Get, x.Put, x.Peek(direction, distance).

and don't forget the error handling!

x.Err(somename.handleIt)

Very often when iterating lists one has to compute relationships between items in the slice, which means writing an iterator, but the built-in for loop's range statement runs from zero to the end: no offsets, no anything.

So you have to add an if block to the loop to break out once the offsets used in the algorithm would fall outside the bounds of the array. The ways you can construct these for loops are quite limited too: you have the option of visiting every element, or the clunky old C syntax with init;condition;increment clauses, with yet more local variable declarations that this implies.
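
For example, a sketch of that guard (process and data are hypothetical names), comparing each element with the one two positions ahead:

for i := range data {
    if i+2 >= len(data) {
        break // stop before the offset would fall outside the slice
    }
    process(data[i], data[i+2])
}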

An aside:

Regarding variable declaration in C and similarly low-level 'high-level' old languages: variables go at the top of the function block, and all symbols in C must be declared before they can be used.

The C compiler does no work for you: a variable declaration usually only reserves a single machine word, a pointer must be manually dereferenced every time, and only machine-word types are directly accessible as values.

Then, if your type is a struct type, you have to call malloc() with a sizeof value to get the byte-size of the struct. For arrays, it is exactly the same, except even worse, you must pass the start and end boundaries to any function, to iterate it you have to increment the index manually and manually check that it stays within bounds, because only the kernel will care, C couldn't give a damn.

Anyone who understands how you do things in Go will see from the foregoing that Go is actually not that demanding.

You can append elements to slices of anything to your heart's content, and the built-in storage methods for the types are intentionally optimised to be dense and have a high degree of cache locality.

You don't have to think about freeing that memory, as the GC will pass through every so often and mark all the space that has fallen out of use back into the free list, you don't have to think about zeroing the values, that's also done for you.

In Go, the big hassle is that Go acts like an overly protective father, trying to protect you, youngling, from the dangerous roads of allocation, dereferencing and iteration without guardrails.

NO! you may not freely cast everything to void* and then to whatever you like. That's ...

unsafe.

Yes, you still can jump on that rail and slide down like a maniac, but the old grandpa is just not quitting with the lecture!

Go away, old man!

Don't get me wrong, I greatly admire the Go developers, given Pike's and Wirth's efforts at teaching the world to program, the wonderful operating systems both of them have already created, and the several languages.

No.

For me, Go is like the proper successor of both the C and Pascal families of languages, but this is the most important thing:

  1. Compiled
  2. Minimalistic
  3. Fast (not the fastest, but if that were #1 priority I would write in a Macro Assembler language)
  4. Builds and runs on every hardware platform
  5. Has the most sane and simple build system I have ever seen. No scripts to write, just a couple of lines in each source code file and a folder structure.

Since I first had access to a computer, I have wanted a programming language that fits all those criteria.

The last language that did this for me before was a funny old language called Amiga E, which had in common with Go a very simple keyword based and line-based syntax, something half way in between C and Pascal.

It was also the very first actual compile-to-machine-code language system I got my hands on, and it came with a sufficient amount of documentation that you could get up and running directly.

Despite all the flaws I am pointing out in this document, my primary intent is to show that the language, the core, the parser, the compiled-but-dynamic allocation system... I just don't want to have to read or type so much to express an algorithm, nor have to copy it nearly letter for letter simply to enable a 32 bit variant...

Slice variable race errors

The 5th and final frequent class of bugs I have observed in my Go code is caused by failing to copy slices before sending the same variable in two directions.

One caller changes it, and like quantum entanglement, the slice mysteriously changes under the observation of the non-modifying party as well, and then suddenly an array index is out of bounds and it's game over, start again.

Golang slices are implemented with the primary aim to minimise unnecessary and expensive copy operations.

This is great, and all, but leaving out one copy statement can turn the output into garbage as one process is modifying it as part of a parse and another process is expecting a static, immutable value.
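
A minimal sketch of the aliasing in question:

a := []int{1, 2, 3}
b := a            // no copy: b shares a's backing array
b[0] = 99         // ...so this write is visible through a as well
// a[0] is now 99

c := make([]int, len(a))
copy(c, a)        // the explicit copy is what breaks the 'entanglement'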

Go basically forces you to suck up the problem of mutability, in concert with concurrency, for almost all of its types. Constants can basically only be booleans, numbers and strings.

Everything else is mutable, so you can't go handing it out to callers without first copying it, or imposing the requirement to take a lock to get a turn at reading its data; even though you don't want to change it, you can, so it has to be mutex locked.

Copy-on-write is a perfect fit for the mark-and-sweep GC in Go: the copied values fall out of reference as soon as the caller has modified them to requirements and returned them to the factory.

This also eliminates the thorny problem of shared memory locking and the bottleneck this represents, and then the Goroutines and Channels can really show people what they are all about.

CoW and Go's concurrency model together would have incredibly low response latency, as you would no longer require mutexes, and the only times that blocking and waiting have to take place would be not because of the shared resources but because of waiting on data from an external source.

An alternative could be where a server process is the guardian and conflict resolution agent, and instead of mutating data directly, the callers write a journal entry of how they want to change the data, and these changes are merged if possible, or if not, returned to the caller whose changes were rejected, who can then attempt to generate a mergeable change in a second attempt (and so on).

With this arrangement, no longer are we concerned with copying big chunks of memory just to allow read access for the majority of the users. Nobody has to wait for anything, except confirmation of a conflict free change.

The server notifies all of its clients holding the read-only copies via a value in a channel indicating a change has occurred, so the goroutines that work with the data know to repeat their processing with the modified data.

To my mind, the one writer many readers model is always the right way to delegate access to shared resources, like a window gives view inside, but prevents direct physical contact (mutability).

The viewer then describes how they want the data changed, and the owner changes it so long as nobody else sent a mutation to them in conflict before a change is confirmed and readers notified.
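
A minimal sketch, under the assumptions above, of such a custodian: it owns the slice, hands out private copies to readers, and applies 'journal entry' mutations one at a time, so no mutex is needed. All the names here are hypothetical.

package main

import "fmt"

type readReq chan []int         // the custodian replies with a private copy
type writeReq func([]int) []int // a "journal entry": how to change the data

func custodian(reads chan readReq, writes chan writeReq) {
    var data []int
    for {
        select {
        case r := <-reads:
            c := make([]int, len(data))
            copy(c, data) // readers only ever see copies
            r <- c
        case w := <-writes:
            data = w(data) // mutations are applied serially, one at a time
        }
    }
}

func main() {
    reads := make(chan readReq)
    writes := make(chan writeReq)
    go custodian(reads, writes)

    writes <- func(d []int) []int { return append(d, 42) }

    r := make(readReq)
    reads <- r
    fmt.Println(<-r) // [42]
}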

The Mutability problem

Go does not have any mechanism for explicitly specifying immutability: simply, some things are immutable and others cannot be made so. Anything that is constructed from mutable data cannot then be made immutable (despite this being a feature of CPUs for over 20 years).

If you stay strictly within the bounds of Go Idiom ™ then you slather your code with waitgroups and mutexes, or you have to strictly implement a pass-by-copy, a copy-on-write with a journal, or similar.

For this reason too, a mutex-free copy-on-write factory model would be essential for the macro language. Read-only locks are expensive to implement permissioning for.

In operating system kernels, read/write control is integrally tied up with processes, who have an owner, and the kernel can make write access for one process but not for others.

However, with Go, this is not the case. In Go, all goroutines are still the same operating system process, with the same owner, so there cannot be any selectivity from the use of the kernel memory write lock system.

However, instead, you could use a channel as a serial FIFO queue, with built in concurrency safety (messages on a channel are not exclusive, and are not constrained to represent the canonical view of a piece of memory).

Go puts no blocks in the path of creating a channel/goroutine based shared resource system, except perhaps that to implement it, every request has to include a copy operation. The owner could checksum the data, and recompute when a copy is returned, and if it is unchanged, and the original also, the next client can receive this unmutated, current copy also.

I believe that such an arrangement would also have the convenient property of producing a scheduling pattern in which latency is reduced to the latency of the CPU's interrupt queue.

Servers would maintain a multi-copy cache that synchronises, and automatically subscribes callers to notifications (perhaps through a channel in the passed struct) that a change has been confirmed by another client, with the message containing all the data required to turn the stale copy into the current one.

Concurrency modelling and the real world:

The most efficient, in terms of resource allocation, method of sharing information resources is to grant exclusive control to one entity, and for the custodian to cache and distribute copies to requesters, and act as sole arbitrator over mutations of the data.

Mutual exclusion locks are like the locks and engaged indicators on public toilets:

  • The lock is no real lock, you barely have to jump at a toilet stall door to break the often plastic lock.

  • The toilet itself, has 3 functions, number 1, number 2 and washing the hands.

  • Each of these three processes can be more efficiently performed with separate, special purpose objects, namely, WC for sitting, (unisex) urinal for number 1s, and;

  • A washing basin is more hygienic anyway when not in the same physical room as water that is frequently disturbed and into which highly biohazardous materials are regularly deposited.

So you can see from the analogy I have made that mutex locking is a highly inefficient resource sharing method, and that general purpose containers are less efficient than special purpose ones.

My favourite part is that exclusive ownership of resources has superior capacity for optimal utilisation, just the same as markets are better than bureaucracies for finding the most efficient use of a resource.

If I follow the ablution facilities analogy and apply it to the channel/custodian concurrency model, it would be like the customers each arriving in an RV with a private toilet, and the server being the porta-potty hire company, emptying and cleaning them between usage cycles: the shared service.

Or in other words, what is shared is not the data, but the state of the data, and its changes. It's your own private loo, but you share the service that cleans it and refills its sanitising agent.

It is even one of the ten commandments of Go: share by communicating, don't communicate by sharing. This maxim refers to the fact that communication is cheap, while shared resources require exclusion of concurrent access; writing more so, but reading while writing is occurring causes race conditions, where two concurrent threads have conflicting views of the data.

With careful design, Go's channels and goroutines can eliminate almost all of the overhead from arbitrating shared resources. Channels don't have race conditions, because they are serial, only one thing can occupy the slot at once.

Concurrent software is sharing data, not processes, so in the custodian model, the costs are in the copying and message queue processing, instead of offloading the job to the kernel, whose absurdly exclusive access patterns are more like a toilet that automatically opens and closes, self-cleans and stops the whole queue if it can't kick out the current occupant.

Strings

Go strings are immutable. The compiler rejects any attempt to write to them, and string data backed by constants lives in write-protected memory, so violating that (with unsafe tricks) can cause the kernel to kill your application.

The strings are immutable for performance reasons. Mutability forces the need for mutual exclusion between writers. For readers it does not matter.

But almost everything else except integers, floats and strings is mutable, and you can in no way make it immutable by anything less than calls directly to the kernel, in C, presumably.

The concurrency model of Preemptive Multitasking

The kernel-based arbitration of access to memory, to achieve per-user exclusion, is intimately tied to preemptive multitasking scheduling, which has a time granularity of about 1 millisecond on modern Linux kernels.

To implement, basically the kernel unlocks the memory as part of the context switch, making it writeable, and other processor threads running on different memory maps see only read-only memory.

But this exclusion is somewhat illusory, because actually the kernel is just flipping the write-enable switch before it resumes the permissioned thread, and then when it yields or has the processor taken back from it, the switch is toggled again.

Switching between processes is a very expensive operation, in time, and in memory resources.

The biggest problem with preemptive scheduling is that this is literally a millionth of the precision of the CPU clock, and, for example, an SHA256 hash costs about 500ns on one thread of a Ryzen 5 1600.

An external request from a client may only require a few microseconds of work, but with a 1 millisecond preemption ticker, the CPU gives a millisecond to one thread, and then the message arrives just after the thread is locked out.

With 100 other processes, it could be as long as another 100ms before the process is back and can then process that data that it just missed last time, and then probably it would just yield control to the next in the queue, after consuming a few microseconds for a tiny request.

Back of the envelope estimates comparing preemptive vs cooperative multithreading latency

If that process, for example, was an infrared laser fly killer, flies can perceive about 400 changes per second, and our modern preemptive multitasking kernel's ticker precision is only just over double the precision of the fly.

Probably the fly will evade vaporisation for some time: with even only one competing process taking about 15% of the total time, plus half going to the competitor, the fly and the CPU end up roughly 1:1 in their precision to respond.

In contrast, a channel-based lightweight thread with per-thread memory pools can track to a precision of around 10,000 slices per second and respond within 10 of them. It can observe data within 100ns, and if we say the processing cost of accurately targeting the fly and getting it first time is about 300ns per cycle, the preemptive two-process case simply doesn't compare to the Go concurrency model, which can kill one fly per 300ns and control about 100 lasers.

The problem domain

These are the complete set of what I have observed to be the most common types of bugs that manifest in a typical Golang application. They are all resolvable issues but they were not fully accounted for by the designers.

Likely there are other areas where the solutions are not so complicated as a fairly extensive rewrite of the parser, but the ones listed above should be simple to solve with a macro/preprocessor that makes many implicit things explicit while taking the bulk of the typing work away from the programmer.

Mitigating these gotchas without requiring any rewrite of the compiler

It's not that you can't write code that properly deals with all of the gotchas above, it's just that after it's all working and tested, more than half your code is going to be 'if nil then make' and 'panic/defer', with the root level brimming with all your iterator and working-buffer variables and structures.

Preventing scope shadowing bugs

A pre-processor can transpose the location of declaration of a variable - think macro expansion combined with pretty printing processing.

It will then compile as though the declaration within your for loop was for a variable declared in root scope (because it moved it there), meaning this allocation and initialisation precedes the application reporting it is ready to process data.
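
A hypothetical before/after of that transposition (the generated names are invented for this sketch):

// What you write:
func sumWritten(xs []int) (total int) {
    for i, x := range xs {
        total += i * x
    }
    return
}

// What the preprocessor would emit: the loop variables hoisted to the root
// scope under generated, non-conflicting names, so nothing can be shadowed.
var sumGenerated_i int
var sumGenerated_x int

func sumGenerated(xs []int) (total int) {
    for sumGenerated_i, sumGenerated_x = range xs {
        total += sumGenerated_i * sumGenerated_x
    }
    return
}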

Shadows and edge cases

When it compiles this way, the name conflicts become visible and the potential error possibilities are eliminated from this vector.

The scoping as well as shifting the site of allocation and initialisation to the pre-init() phase, regardless of where you wrote it in the code, is done invisibly.

Explicit allocations are not the problem, it is only the arbitrary and implicit ones that are of concern.

The first reason to move variable allocation to initialisation isn't performance

It's to prevent invisible walls blocking, and invisible tubes carrying information in ways that you don't intend. This is the error scenario that has the worst prognosis, once it comes up.

Pure human error, caused by identical labels for the variables, but location in the code separates the two.

Knowing from where a black swan can show up is key to avoiding one.

I can't think off the top of my head the ways in which this kind of bug can be exploited, but it's basically a leak between the scopes of different parts of the program, and these might provide a way to escalate privilege, for example.

All exploits are rooted in bugs

If the logic involved is malformed, it is possible the overlap may never be noticed because of the rarity of the conditions that cause a problem to manifest. This is the very same kind of bug that leads to massive losses in business, as hackers exploit it to attack a person or organisation.

The times when such bugs are not just incidental and inconsequential will inevitably be when you least expect it, yet if the variable was file scoped (declared at the root of the source file, not inside a function) it could not possibly conflict.

This is a bug masquerading as a feature.

Go lets you literally cause those declarations to occur mid-stream during processing, in that part of source code where it tends to be one letter variables, especially iterators and counters.

It seems probable to me that the Go implementation in actual fact also shifts the declarations to the initial variable initialisation at the start of execution also, in many cases, but probably some cases there is a nontrivial amount of processing and waiting on the kernel.

...Right in the middle of trying to decrypt or decode some processing-heavy bit of data, the garbage collector wakes up and stops all the goroutines; the devices receiving data see their buffers fill up, acknowledgements don't get sent back soon enough (hypothetical first person shooter game, for example), and you're back at the spawn.

And if that block coincides with a recycling pass, and that event had a 2ms deadline, obviously, a not that insignificant amount of times, the machine is going to fail to produce a timely response, only because of the coincidence of unloading the thread that then receives a message barely nanoseconds later.

Note that controls are exposed that can be used to modify the behaviour of the GC; it can even be completely disabled, and a program can function entirely on manual malloc()/free() calls via the C embedding system called Cgo.

If latency 100% of the time is an issue, you can turn off the GC, though it should be pointed out that the cost is otherwise confined to those brief 2-5 millisecond pauses.

Preemptive multitasking is for badly written code that has a low time precision requirement

Preemptive multitasking mainly just ensures one process does not block the schedule indefinitely, mainly because this can happen fairly often with buggy software containing errors in the memory allocation/freeing, and releasing locks on devices and other system shared resources.

The kernel isn't listening to the interrupt queue while it context switches (as the switch has to be performed atomically and the process can't start until its environment is correct), and so, if the timing demand of a process is high enough (such as the targeting process in the laser fly killer analogy), you are going to get quite a high rate of tardy responses.

How software is written when latency is everything

Of course, a real computer controlled fly killer, would not be running a kernel at all. It would be synchronised exactly, within under scores of nanoseconds of precision, to the input and output devices (motion sensor/triangulator/radar).

So its switching is far below the time period in which the fly can observe and respond, and move and likely could do a Zorro slash in the exact eye-baubles, and vaporise its central nerve nexus instantaneously.

Almost always, multitasking executives in kernels will cost latency for realtime processing, and this is probably where most of the difference between Go's and other languages' performance comes in, not just the garbage collector: even if Go's individual threads are capable of responding within 100 microseconds or less, all of them can be kicked off the processor just as a big load lands in the network device buffer, and 20-100 milliseconds pass.

It's just that Go tries to parallelise its goroutines as much as possible, and often a core escapes a context switch, and a goroutine is awake to greet the visitors from the internet.

On the fly variable allocations and initialisation are potential sources of latency in realtime processing and during peak load times.

They should be avoided at all times, with the exception of on extremely limited resource devices.

In this, I am not concerned about the garbage collector stopping everything to release a bunch of memory; I'm talking about where a bunch of goroutines are all processing a big stream of data (let's say, blocks from a blockchain).

In the middle of their processing loops, a goroutine sends off a request for memory, without which it cannot run, and in the middle of heavy processing it sits twiddling its rosary beads and staring into space blankly.


About scheduling latency

One of the core benefits of Go is the ultra light weight concurrency primitives.

The scheduler for this is completely cooperative, in that if anyone is running late, everyone is running late.

But this cooperative multitasking has a benefit not present in the preemptive scheduling used in operating system kernels: it can react to new input faster and won't be stopped in the middle of a job.

The Linux kernel, for example, has a 1000Hz timer. This means the CPU is gonna stop everything and switch every millisecond.

If over the time of running of an application, events are coming in at a rate of more than a few hundred per tick, the Linux kernel is going to be unresponsive while at least several if not scores up to hundreds of requests just sit in the buffer.

Go's concurrency system allows response times down around the low microseconds range, assuming the kernel doesn't carbonite it and come back a score or three milliseconds later.

In a network system like a cryptocurrency, or especially a very busy forex exchange or similar, these kinds of timespans can sometimes be very expensive.


Variable Initialisation - When a good time?

The best time for doing work that prevents anything else happening, is before anything is happening.

An aspect of the memory management problems of C, C++ and other non-GC languages that is not often as deeply addressed is the initial state of variables. Using a variable with an indeterminate initial state will cause nondeterministic failures.

So in Go, the initialisation is not just allocation and bookkeeping, it is also iterating over the bytes (probably 64 bits at a time) to zero them out.

Static typing and dynamic allocation

Memory allocation is a fairly time-expensive operation. On a typical current generation machine, grabbing a gigabyte of memory can take up to a second.

Whether or not the initialisation happens concurrent to or has any timing influence on the execution of loops inside those inner scopes, is not fixed, and varies according to the context of the types, the configuration of the OS kernel, and other factors.

So, if response time is important to you, you don't want to be waiting until that moment to ask the kernel for memory.

The only way to ensure it isn't raising your application's response latency under load is to pre-allocate, not allocate on demand.

In C, all of these allocations and initialisations are fully manual and require the programmer to take care about when, where, and how much.

When you automate these things, the automation can't be as careful as you can be in the peace and quiet of your office as a programmer; it is just heuristics, entirely at the mercy of many things that can't be controlled, so one has to do each type of action at the right part of the process.

Otherwise other things get delayed, and it only takes a few clocks going out of sync before the timetable becomes a work of aspirational fiction.

Error Handling and Exceptions

Though an error in a computer system is always a definite and negative thing, it need not be a stop-the-world type of issue.

There has been a gradual evolution of computer languages towards reducing the burden of handling errors, as it is one of the least favourite and least productive parts of programming (and more so, the more clumsy the language's grammar for handling them).

So, there are several aspects here, and this is how I will propose they be dealt with:

  1. Make all variables structured, of a common singular base type (error handler), assign/copy the variables to variables in one's temporary variable pool declared in the source file root, generate the formulae out of these localised variables, and then copy the result back in, or point to the generated result as the new storage location referred to by the pointer to the structure.

The methods will be something like SetErr, Err, OK, and they will be tightly coupled to the logging system.

Further, each type can also have a set of error conditions specified in a concise declaration along with the type definition, and in the actual handler sections of the functions, a simple switch statement is used to trigger the correct response.

Don't think of them as errors, but rather as state

You don't have to separately handle the error from a function, instead you can just call the variable's error methods, get/set and nil test, directly from the result variable in the next line.

Basically, it allows you to unclutter the core of your program logic, being the location in which the bugs are hiding. Mess is harder to find in the middle of a mess.
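
A minimal sketch of such a structured value, assuming the SetErr/Err/OK methods named above (all identifiers here are hypothetical):

type Int struct {
    value int
    err   error
}

func (i *Int) SetErr(e error) *Int { i.err = e; return i }
func (i *Int) Err() error          { return i.err }
func (i *Int) OK() bool            { return i.err == nil }

// At the call site the algorithm reads straight through, with no throwaway
// err variable:
//
//     n := parseInt(input)   // parseInt returns *Int; hypothetical
//     if !n.OK() {
//         log.Println(n.Err())
//     }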

However: the boilerplate that is required for this type of implementation is onerous in Go

The benefits of good error handling go far beyond reducing the stress of a programmer trying to implement a complex algorithm on a deadline. Correctly handled errors don't cause unpredictable behaviour.

It is fortunate it is not hard to write text processors in Go.

Yet, somehow, in the 9 years since its first release, everyone has happily played along with the 'idiomatic' code you have to write, though there have been a few efforts.

  2. Types that are known directly by CPUs (the 8, 16, 32 and 64 bit numeric types, including the floating point types of the same sizes implemented in hardware) do not need to store explicit state and would severely suffer in performance if encapsulated in a structure.

So, the generated code relating to these types when using the macro processor will hide the implementation difference from you.

You will be able to create a type as an alias of, say, int, but using it to encode, for example, 32 bit rational numbers (eg 22/7), and rather than have to separately process and give an address to the return value from it, separately catching the error value, you can just stick the result, whole, into another function. The error handling can be encapsulated in the methods, and not taking up 30%+ of your algorithms.

In Go, you would not be able to use the embedded integer value of a type without the dot operator, however, with a macro processor, you can add and subtract this type with ANY other variable carrying precisely the payload of int. This is what it would look like:

i = structuredInt * 22/7

vs the generated code:

temp1 = structuredInt.int
roottempvarint1 = temp1 * 22/7
i = roottempvarint1

The macro typing declarations would create metadata in both of the types' 'hidden' root-level package source files, which the preprocessor understands to mean that the embedded type's native infix/prefix operator syntax can be used, and the macro processor rewrites the code so that the result is ordinary old Go.

Implicit error type for arithmetic

Arithmetic and logic types in Go do not have any kind of error status. It would be excessive overhead in the form of dereferencing of pointers and, worse, function calls and their context switch burden, to also encapsulate machine level data types.

Instead, the CPU has a status register, and the results of a computation are immediately available as a pure (bitwise) boolean value in the condition code register.

Arithmetic and binary logic operators cause these bits to flip when various conditions arise in the execution of the computations. For reasons of performance, the macro processor pre-allocates at initialisation, and the code it generates eliminates dereferences as much as possible prior to commencing a long processing loop.

Thus, the additional time required to request the memory pointed to by the structured variable type generated by the macro processor is performed before starting the calculations, and then once results are generated, stored back in the variable.

Thus, here is another reason why to use a preprocessor rather than complicate the grammar of the language.

It is possible to manually write code that does all this careful preallocation and at least in the reading of the main processing loop, but it is, again, a job for a machine, not a human being. A computer is like a dumb, and obedient bureaucrat. It doesn't care how tedious or repetitive its work is. Thus, it doesn't make mistakes, assuming errors were not left inside it.

Error/exception handling block

  3. An additional block section at the end of function definitions, similar to struct literals in that one set of braces wraps the function body and a second one follows it; this second block is what runs when something inside the function block panics.

By adding the error handling function as a little pod at the end of a function, it is clear to what it relates, it isn't directly interspersed with the actual function, and it allows the error state to become

Again, all of these things can be manually implemented by a programmer, but the amount of details that must be attended to necessarily mean a human is prone to missing something and causing a bug, and all the worse if the bug manages to not cause a problem for a long time.

All the macro preprocessor really would be doing is declaring an accessory function directly after the function, which is invoked by a defer inserted at the top of the function; a sketch of this expansion follows. Likewise, a macro system could insert the same kind of scaffolding for other recurring patterns.
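
A hypothetical sketch of that expansion, assuming a parseHeader function and that the standard log package is imported:

type header struct{}

func parseHeader(b []byte) (h header) {
    defer func() {
        if r := recover(); r != nil {
            // the body of the trailing error block would be pasted in here
            log.Println("parseHeader recovered:", r)
        }
    }()
    // ... function body that may panic ...
    _ = b
    return
}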

  4. Conditional branch operator. When it comes down to it, the if err != nil blocks are basically conditional branches. However, they require two lines to express, unlike assembler, where the condition is implicit in the condition code register. The proper way to position exception/error handling code is exactly the same as it is with the hardware.

Currently, one has to stitch together an if statement and a return statement to break out of a function. What this particular language element does is absurdly simple, and you do have to wonder what the logic against it is, since it's something you have to write on roughly every tenth line. Basically, it is an optional second parameter, in the same way the range operator works in for statements.

So, you basically can say in one line,

return <returnvalue>, <condition>

as opposed to:

if <condition> {
    return <returnvalue>
}

With a macro processor you could do anything, really, but the idea would be to build in a set of sensible primitives that cover the basics.

How the macro preprocessor will handle multiple return values

The syntax will look the same superficially, but it will act like a permeable scope boundary instead, where the value has a symbol on both sides of the scope boundary (and likely, it will be declared in the root scope).

So where before you had to put up with the in-situ declaration of the variables that store the error state of a function after execution, the so-called 'helper' variables, now that pair of variables can be replaced with one variable, which has the error methods to read and write its state.

And the variable is initialised before the performance and latency critical period of runtime begins.

The substitution performed by the preprocessor will be that the variable will be a struct, instead of this half-tuple half-nothing thing, but it will let you declare separately named variables for the fields of the struct, and write the assignment for you, like a good robot.
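
A hypothetical sketch of that substitution, with someFunction standing in for pkg.SomeFunction from the earlier example:

// someFunction stands in for pkg.SomeFunction from the earlier example.
func someFunction(s string) (int, error) { return len(s), nil }

// The preprocessor would replace the (result, err) pair with a single struct,
// declared once at root scope under a generated name.
var someFunctionResult struct {
    value int
    err   error
}

func caller(someParameter string) {
    someFunctionResult.value, someFunctionResult.err = someFunction(someParameter)
    if someFunctionResult.err != nil {
        // handled through the variable's own state; no throwaway err variable
    }
}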

Personally, I think the phobia towards preprocessors is unwarranted. It is true that a preprocessor can let you do really stupid things, but almost every one of the things that Go sorely needs are most simply provided by a preprocessor.

The solution I propose is a pure text substitution scheme.

In keeping with the minimalistic keyword set and syntax construction rules of Go, the symbols will be constructed so as to be both visually and computationally disjoint with the Go grammar, to ensure that the reader is in no way unclear it is not Go.

It should not be as ugly or programmable as CPP, however. I will elaborate this in the next section.

Go Macro Preprocessor

It is true, as I mentioned earlier, that the Go compiler suite includes a code generator, which basically does that search and replace on type names and copies code into files to be compiled, before compilation.

But you have to invoke it separately. It's not just go build and it first generates your generics, you have to go go generate first. Why????

The purpose of the preprocessor is about automating the generation of boilerplate code that is necessitated by the grammar of the language.

The library of transformations from source text to generated source code can be built up bit by bit, replacing the boilerplate parts a piece at a time and delivering the benefit of readability at the same time as the safety of bug-free implementations, which in native Go are most efficiently written by code generation from a library of macro definitions: defining the body, the slots for the variables and constant parameters, implementing necessary interfaces, and so on.

It is not the objective to sugar Go's syntax, but rather, to eliminate repetitive and thus labor intensive and error prone code, and allow a simple and distinctive substitution mapping syntax that works something like a form letter printer, taking a list of names and addresses and inserting them into a data structure that is otherwise the same except for what you put into the slots.

The macros will automatically inject error handling code into functions and generate generic variable definitions based on type templates (e.g. iterator, binary tree, linked list, filesystem, message protocol, whatever).

You will still write those for i := range x {} function-scope variable declarations, but the macro processor will rename the variable and move its declaration to the root type block, so it will be impossible to have shadowed a variable and ended up with a mysteriously wrong value when it appears everything should be correct.

Then it turns out there's a variable in scope that you didn't realise was in scope and the one you meant it to be is behind the scope boundary, and thus inaccessible where it's being used.

The purpose of this preprocessing is to automate all the tedious things that you are supposed to do, but can be neglected due to the complexity of the repetitive boilerplate this requires.

Implementation issues

There are existing macro processor systems, and even a Go one is available. Macro processing is usually fairly simple: basically, forms, slots and mappings. The macro user writes, for example, a generic array iterator that operates on byte slices.

The macro processor then can accept a specific command, the label of the particular macro, it reads the input variables out of its parameter list, and then it plops the input values into the expanded version of the code, all correct and bug free, and then sends it to the Go compiler to generate the binary.
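
As a crude but concrete illustration of 'forms, slots and mappings', the standard library's text/template can already do the expansion half of the job; the form and the names below are invented for this sketch:

package main

import (
    "os"
    "text/template"
)

// The form: a generic iterator with two slots, Name and Elem.
const iteratorForm = `
type {{.Name}}Iterator struct {
    items []{{.Elem}}
    pos   int
}

func (it *{{.Name}}Iterator) Next() ({{.Elem}}, bool) {
    if it.pos >= len(it.items) {
        var zero {{.Elem}}
        return zero, false
    }
    v := it.items[it.pos]
    it.pos++
    return v, true
}
`

func main() {
    t := template.Must(template.New("iterator").Parse(iteratorForm))
    // The mapping: fill the slots, producing ordinary Go to hand to the compiler.
    if err := t.Execute(os.Stdout, map[string]string{"Name": "Float32", "Elem": "float32"}); err != nil {
        panic(err)
    }
}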

Well, enough from me about this. I am not even sure exactly where to start!


l0k18 commented Feb 18, 2019

Thinking it through a little more, what I mean by macros is not macros but pattern substitutions that replace things. If a variable is a pointer, the preprocessor will insert a nil guard with a panic.

When a function that opens a resource is present, it automatically places a deferred close unless an explicit close exists for the handle, in scope.

When a map or other makeable variable is declared and used, if a make is missing, it is inserted before the first assignment.
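
A hypothetical before/after of that repair:

// What you write: panics at runtime with "assignment to entry in nil map".
func written() {
    var ages map[string]int
    ages["alice"] = 30
}

// What the preprocessor would emit: a make inserted before the first write.
func generated() {
    var ages map[string]int
    ages = make(map[string]int)
    ages["alice"] = 30
}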

Some things will be substitutions, essentially new keywords. The conditional return is basically a new keyword, which expands into an 'if condition { return }' block.

When a := appears, it is changed to = and the declaration placed at the start of the source. When a := appears in front of a variable name that already has an explicit declaration, the preprocessor will remove the : and add a comment to the end of the line indicating it was corrected.

The recovery block attached to the ends of functions is moved into the block and wrapped in a closure declaration and invocation on a defer. Obviously, this means the programmer will stop thinking about panic() as a show-stopper and instead take it to mean that the recovery block will run (which is the intention, but the syntax is bloated and no fixed location exists for placing the handlers).

The nil guard feature will stop nil panics, but it has to have some kind of default handling that the programmer will expect. For this then, a special distinctive variable prefix (probably @) will indicate that the programmer doesn't want the automatic allocation inserted into the code, and that they must implement the recovery code.
