haf · December 10, 2015 10:49 · haf · Jan 1, 2013
 Inconsistency Robustness in Software Systems of the Future

 It is a known, fundamental problem for people who program for a living that
 programming languages do not match the nature of the universe in which they
 compute. Computer languages are based on a rather romantic notion of sequential
 processing that is not in line with how the real world operates: the real world
 operates in a continuum of space-time, with multiple concurrent threads of
 reality always ongoing and being acted upon by actors.

 As such we must strive to model our information systems and programming 
 languages in the same shape.

 Writing programs means dealing with information and information processing:
 algorithms operating on data. That data, however, can come from external parties
 over the network, over direct memory access with interrupts, or by some other
 means of out-of-band delivery, and need not be present at the start of the
 computation.

 This means that we as programmers not only need a programming language that can
 handle the concurrent events of reality, but also a way to reason about the
 state (the data that our algorithms process) while that data is being
 concurrently updated.

 If a programming language is not explicit about what time is, it will lead its
 programmers into the pit of despair, because they won't be able to reason about
 events from the outside.

 Yet most programming languages of today don't let programmers readily reason
 about how time passes, nor about what data the other actors they interact with
 have seen or created. And when hardware fails, there are vague presumptions
 about having atomicity and consistency, as in ACID, on that part of the system:
 presumptions that are hard to test and reason about.

 Those of us who are language designers, software architects, framework builders
 and plain old programmers need to provide ways to reason about the invariants of
 our systems as they change over time, and therefore need a way to know when
 points of known values occur. But there is no programming language out there
 that does it.

 Instead the knowledge is embedded in the brains of whoever is architect or
 lead developer at the moment, subject to office politics and plain old 
 human mistakes.

 The actor model gives us supervisor trees, actor linking and a way of handling
 environmental chaos (failing hard drives, failing network cards, etc.). Software
 transactional memory lets us reason about the program state when an actor or a
 tree of actors fails, and reliance on fsync to disk lets us reason about
 transaction logs in crash-only systems.

 I posit that we need a few features not already seen in programming languages:

 * A compare operator that acts on data's temporal nature, as in "given data a, b:
   a < b if b was based on the data in a" - similar to what interval tree clocks
   can give us
 * A feature that facilitates a logical fiber of computation that will only let out
   time-stamped data
 * The "regular actor framework" with guaranteed delivery on a message level,
   supervision trees, fibers and context switching, software transactional memory,
   'send my last will (containing this data)'
 * A core library that gives actors/objects a strong convergent-data-type flavor 
   as seen in Bloom/Bud
 * A core library that gives us strong insight into the execution context with 
   metrics; timers, counters, gauges, histograms in a distributed setting
 * A core library that gives us a way of expressing two "sorts" of monads:
   explicit monads, similar to the Haskell monads, that allow the compiler to
   reason about side effects and do magic -- and -- an implicit, ML-style IO monad
   that lets programmers easily step outside to do some side effects (logging,
   metrics, plugin architectures and p/invoke/ffi are obvious examples of this)
   -- while giving the programmer the power to switch between these modalities
   of writing code.
 * A profileable asynchronous language core, akin to F#'s async, that lets the
   programmer run the software with a profiler attached that also automatically
   correlates everything from before the trampoline (before the async call is
   made) to after the trampoline (when the IO completion port/epoll/kqueue
   signals, or an async exception happens), and that works on the same explicit
   message/unit-of-work identity that would be in a message between processes.
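
To make the first bullet a little more concrete, here is a minimal sketch of a
"was based on" comparison. It uses plain vector clocks as a stand-in, not interval
tree clocks, and the names Stamped, derive and concurrent are hypothetical, made up
for this illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Stamped:
    """A value plus a vector clock (hypothetical type, for illustration only)."""
    value: object
    clock: Dict[str, int] = field(default_factory=dict)

    def derive(self, actor: str, value: object) -> "Stamped":
        """Return a new value that `actor` produced based on this one."""
        clock = dict(self.clock)
        clock[actor] = clock.get(actor, 0) + 1
        return Stamped(value, clock)

    def __le__(self, other: "Stamped") -> bool:
        # Every event this value has seen is also known to `other`.
        return all(n <= other.clock.get(a, 0) for a, n in self.clock.items())

    def __lt__(self, other: "Stamped") -> bool:
        # a < b: b was (possibly transitively) based on the data in a.
        return self <= other and self.clock != other.clock


def concurrent(a: Stamped, b: Stamped) -> bool:
    """Neither value was based on the other."""
    return not (a <= b) and not (b <= a)


root = Stamped("draft")
a = root.derive("alice", "v1")
b = a.derive("bob", "v2")      # based on a
c = a.derive("carol", "v3")    # also based on a, but unaware of b
```

With these definitions a < b and a < c hold, while b and c are concurrent: the
order is partial, which is exactly the point where a language would have to make
the programmer decide what to do.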

 The paper will discuss these features in depth and suggest how they might overlap,
 using examples from existing research and production systems and languages.
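
As one possible reading of the convergent-data-type bullet above, here is a small
sketch of a grow-only counter, a standard convergent replicated data type. It is
not Bloom/Bud's API; the class name GCounter and its methods are assumptions made
for the sake of the example.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id: str) -> None:
        self.replica_id = replica_id
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        # Each replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Max is commutative, associative and idempotent, so replicas
        # converge regardless of the order or number of merges.
        for replica, n in other.counts.items():
            self.counts[replica] = max(self.counts.get(replica, 0), n)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


left, right = GCounter("left"), GCounter("right")
left.increment(2)
right.increment(3)
left.merge(right)
right.merge(left)
right.merge(left)  # merging again changes nothing
```

Both replicas end up with the same value without any coordination, which is the
property the bullet asks the core library to make the default.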
   
 ***
   
 Happy new year 2013!
 Henrik


 ***

 Post Scriptum:
 These are some of the ideas floating around inside my head; I'd like some feedback 
 -- if anyone would like me to continue down this path, I'll spend some time gathering 
 references and improving the abstract.