(Algebraic Infrastructure company) ==> Major takeaway: why dependent types (DT) matter
- Express things we couldn't express otherwise. A type system proves something about our program through syntactic rules, so we prove things @ compile time. Typically we think of a type in terms of how it is laid out in memory, but a type is really about guaranteeing behavior. E.g. a NonEmptyList (NEL) has the same shape as a List, but gives one extra guarantee!
Aim: prove properties through the type system (Curry-Howard): types are propositions (statements), and values are proofs of those propositions. Functions are implications: if the input is of type A, the output is of type B => inputs are preconditions, outputs are postconditions.
Implicit resolution uses the compiler to verify proofs:
- implicit val: axiom
- implicit def: theorem
- compilation: proof
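A minimal sketch of this correspondence (Maps and its instances are illustrative, not from the talk):

```scala
// Proposition: "A maps to B". A value of this type is a proof.
trait Maps[A, B]

// implicit val = axiom
implicit val intMapsString: Maps[Int, String] = new Maps[Int, String] {}

// implicit def = theorem: from a proof of Maps[A, B], derive Maps[List[A], List[B]]
implicit def listMaps[A, B](implicit ev: Maps[A, B]): Maps[List[A], List[B]] =
  new Maps[List[A], List[B]] {}

// compilation = proof: this line compiles only if the evidence can be derived
implicitly[Maps[List[Int], List[String]]]
```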
A dependent type is a type that depends on a value. Values don't exist @ compile time yet, so we don't know them until runtime, but we can still refer to them, like in any function. 2 kinds of DT:
- Dependent function (Pi-type): the return type depends on the argument => it's a forall
- Dependent pair (Sigma-type): the type of the second component depends on the value of the first => it's an exists
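A minimal sketch of both in Scala via path-dependent types (Sigma, pair and pi are illustrative names):

```scala
// Dependent pair: the type Snd depends on the particular value
trait Sigma {
  type Snd
  def snd: Snd
}

def pair[A](a: A): Sigma { type Snd = A } =
  new Sigma { type Snd = A; def snd = a }

// Dependent function: the result type depends on the argument value s
def pi(s: Sigma): s.Snd = s.snd

val i: Int = pi(pair(42))       // Snd = Int for this value
val s: String = pi(pair("hi"))  // Snd = String for this one
```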
With type members we can hide the type when necessary, and specialize it by extension, GADT-style.
Encode invariants, enforce things that we know hold for sure. E.g. a red-black tree can declare these properties through the type! This makes dynamic programming type safe.
Peano numbers look like a list (a base case and an inductive case): Z and S[N <: Nat].
Nat in this case is a kind; kinds are groups of types.
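A sketch of the encoding (the numeric aliases are just for readability):

```scala
sealed trait Nat
sealed trait Z extends Nat            // base case: zero
sealed trait S[N <: Nat] extends Nat  // inductive case: successor

type _0 = Z
type _1 = S[_0]
type _2 = S[_1]
```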
Type parameter blocks start to look like value parameter blocks:
```scala
trait Vec[T] {
  type Length <: Nat
}
```
2 different Vecs can have a different Length refinement.
We can make the constraint depend on the input type, and have it checked @ compile time. E.g. enforcing that the output of map has the same length as the input (see the sketch below).
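A sketch of such a signature, reusing Nat from above (the refinement-returning map is an assumption, not the talk's exact API):

```scala
trait Vec[T] { self =>
  type Length <: Nat
  // the refinement pins the output's Length to the input's, checked @ compile time
  def map[U](f: T => U): Vec[U] { type Length = self.Length }
}
```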
Can lift booleans to the type level.
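E.g. a minimal type-level boolean encoding (illustrative names):

```scala
sealed trait Bool
sealed trait True extends Bool
sealed trait False extends Bool
```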
Miles: Option is Some(_) | None. f(Some[T](t)) is actually f(Option[T]) where the compiler checks that this Option is a Some. Therefore a subtype is a dependent type.
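A tiny sketch of that idea: a function that only accepts the Some subtype, so the check happens @ compile time:

```scala
def get[T](s: Some[T]): T = s.get

get(Some(1))              // compiles: the argument is statically a Some
// get(Option.empty[Int]) // does not compile: Option[Int] is not a Some[Int]
```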
Each stage has a materialized value. The to combinator keeps the mat value of the left side; to keep the right one we use toMat(_)(Keep.right).
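A minimal sketch (Akka 2.4-era API; the ActorSystem/materializer setup is boilerplate, not from the talk):

```scala
import akka.Done
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Keep, Sink, Source}
import scala.concurrent.Future

implicit val system: ActorSystem = ActorSystem("demo")
implicit val materializer: ActorMaterializer = ActorMaterializer()

// plain `to` would keep the source's NotUsed; Keep.right keeps the sink's Future[Done]
val done: Future[Done] =
  Source(1 to 10).toMat(Sink.foreach(println))(Keep.right).run()
```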
throttle(nr, 1.second, maximumBurst, ThrottleMode.Shaping) => if we get more than nr elements in 1 second, it reacts according to the mode (Shaping delays elements, Enforcing fails the stream).
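E.g., shaping to at most 5 elements per second (signature as in Akka 2.4, reusing the setup above):

```scala
import akka.stream.ThrottleMode
import scala.concurrent.duration._

val throttled =
  Source(1 to 100)
    .throttle(elements = 5, per = 1.second, maximumBurst = 5, mode = ThrottleMode.Shaping)
```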
Akka SSE: the problem is that the connection might drop, but I want to reconnect. This can be handled through akka-streams flows. Check the EventSource class, and unmarshal the response to an EventSource.
How do we feed back to the server the last acknowledged event id? GraphDSL; Source.fromGraph if we ultimately want to create a Source. How do we know the last value of the stream? We need to fan out to a sink that holds that last event: eventSource.alsoToMat(Sink.lastOption). Check the scan operator on streams, kind of a foldLeft that emits every intermediate result (see the sketch below).
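A sketch of both pieces, reusing the setup above (the Event type and sample data are made up):

```scala
import akka.NotUsed
import akka.stream.scaladsl.{Keep, Sink, Source}
import scala.concurrent.Future

final case class Event(id: String)

val eventSource: Source[Event, NotUsed] =
  Source(List(Event("1"), Event("2"), Event("3")))

// alsoToMat fans out to a second sink; Keep.right keeps that sink's materialized value
val lastEvent: Future[Option[Event]] =
  eventSource.alsoToMat(Sink.lastOption)(Keep.right).to(Sink.ignore).run()

// scan is like foldLeft, but emits every intermediate accumulator
val count: Source[Int, NotUsed] = eventSource.scan(0)((n, _) => n + 1)
```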
The server completes the request with a Source of marshalled events. N.B.: any complete goes through a ToResponseMarshallable.
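A sketch of the server side; this assumes the SSE marshalling that nowadays ships with akka-http (at the time of the talk it came from Heiko Seeberger's akka-sse):

```scala
import akka.http.scaladsl.marshalling.sse.EventStreamMarshalling._
import akka.http.scaladsl.model.sse.ServerSentEvent
import akka.http.scaladsl.server.Directives._
import akka.stream.scaladsl.Source

// the implicit marshaller turns Source[ServerSentEvent, _] into a ToResponseMarshallable
val route =
  path("events") {
    get {
      complete(Source(1 to 3).map(n => ServerSentEvent(n.toString)))
    }
  }
```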
TODO Piero: Check documentation for http://doc.akka.io/docs/akka/2.4/scala/stream/stream-dynamic.html#Dynamic_fan-in_and_fan-out_with_MergeHub_and_BroadcastHub
With SOA we have many standards: EDI -> SOAP -> JSON. Methodologies went from too structured to anarchy. Now we do microservices. Recipes:
- Bulkheading: don't have microservices depending on each other. E.g. recommendations at Netflix are an accessory service, so streaming videos does not depend on them; if recommendations are down, stream anyway.
- Instead of service A calling B, we can have B communicating data to A when changes occur, so A is up to date anyway.
- Own your state: we should know who knows our service, and only our service owns its data.
They are based on Req-Res => we can have counters and timers on this. Logs are descriptive; we can backtrack. Use a ThreadLocal for transferring context (see the sketch below).
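A sketch of a ThreadLocal context carrier (RequestContext is a hypothetical helper):

```scala
object RequestContext {
  private val current = new ThreadLocal[Option[String]] {
    override def initialValue(): Option[String] = None
  }
  def set(correlationId: String): Unit = current.set(Some(correlationId))
  def get: Option[String] = current.get
  def clear(): Unit = current.remove()
}

// set it where the request enters, read it wherever you log on the same thread
RequestContext.set("req-42")
println(s"[${RequestContext.get.getOrElse("-")}] handling request")
```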
Sometimes even looking at the source isn't enough: we have to jump asynchronous boundaries. Lightbend has ActorInstrumentation that injects pointcuts that get invoked; in particular there are actorTold/Received/Completed hooks.
When things are distributed, we want to trace ActorA -> msg -> ActorB. ActorA receives and notes that it received (with time); when ActorB receives, it does the same. Log each event with its time, and send all events to a collector. BUT we can't be sure that time is the same across the nodes! Time means nothing. How do we achieve order? Google uses atomic clocks for the F1 database; there is also CockroachDB (TODO: check). How can we use causal relationships to trace distributed systems? We add (currentId, parentId, traceId). TraceId is our correlationId; parentId is what happened before, so the first event is the one without a parent. This is how crossing boundaries is traced (see the sketch below).
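A sketch of that trace metadata (names are illustrative):

```scala
import java.util.UUID

final case class TraceContext(
  traceId: String,          // correlation id for the whole request
  currentId: String,        // id of this event
  parentId: Option[String]  // the causally preceding event; None for the first one
) {
  def child: TraceContext =
    TraceContext(traceId, UUID.randomUUID().toString, Some(currentId))
}

val root = TraceContext(UUID.randomUUID().toString, UUID.randomUUID().toString, None)
val hop  = root.child // same traceId, parent points at root: order by causality, not clocks
```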
- Handling scale: if all our systems are logging to the same system, we can avoid choking it through sampling: log one in every 100 occurrences, or don't log if no delta occurred in the state. Another approach is pull, like Prometheus.
- Ephemerality: e.g. AWS Lambda, fully stateless. I want to recognize my 100K instances of my function as one.
- Staying cost effective: how are we sure that monitoring doesn't introduce too much overhead? We can e.g. not monitor in isolation actors that die soon; better to monitor them collectively.
- Check spikes: they typically mean GC. The monitoring system should be aware of what happens on the orchestration tool and not call someone for nothing at 3am.
- Machine learning can detect anomalies and ask the system to provide more information. ATTENTION: this might trigger a positive feedback loop that brings the system over the cliff.
- Higher availability: the monitoring system has to be more available than the systems it monitors.
InfluxDB for time series. Riemann as stream analyzer.
It's best to have different thread pools and contexts/dispatchers, e.g. one for the DB, one for HTTP, etc. (see the sketch below). WIP: Future context propagation, Akka HTTP monitoring. Requires a license for production usage.
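A sketch of dedicated pools (sizes are made up):

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

val dbEc: ExecutionContext =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(16)) // blocking DB work
val httpEc: ExecutionContext =
  ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4))  // request handling

// pass the right context explicitly where the Future is created
def loadUser(id: Long): Future[Long] = Future { /* blocking DB call */ id }(dbEc)
```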
Build a DSL for command and event handlers.
In CQRS, instead of addItem/removeItem methods attached to a case class, we have commands, so the methods themselves become case classes! Command = AddItem(,,) | RemoveItem(,). Then we have events, but there is no formal way to say these commands and events refer to the same aggregate (Item). In OOP I would use an onItemEvent() -> ItemEffect, and an onItemEffect(ItemEffect) -> Item'.
We want to write this only once!
In general: we have a command handler to create and one to update an existing aggregate, both producing a List of events, plus an onEvent for creation and one for update. So we can generalize over [A, C, E] (see the sketch below).
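A sketch of that generalization (method names are illustrative, not the talk's exact API):

```scala
// A = aggregate, C = command, E = event
trait Behavior[A, C, E] {
  def onCreateCommand(cmd: C): List[E]          // command handler for creation
  def onCreateEvent(evt: E): A                  // event handler for creation
  def onUpdateCommand(agg: A, cmd: C): List[E]  // command handler for an existing aggregate
  def onUpdateEvent(agg: A, evt: E): A          // event handler for an existing aggregate
}
```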
If we try to implement this, it looks like shit.
It gets better if we define a DSL: I send commands to an AggregateRef. To summon the AggregateRef we have to pass all the type parameters: [Order, OrderCommand, OrderEvent]. The worst thing is that this is not typechecked: we could define [Order, ItemCommand, OrderEvent].
So we define a protocol where the commands and events are declared as type members (see the sketch below).
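A sketch of such a protocol (AggregateProtocol and AggregateRef are illustrative names):

```scala
trait AggregateProtocol {
  type Aggregate
  type Command
  type Event
}

final case class Order(items: List[String])
sealed trait OrderCommand
sealed trait OrderEvent

object OrderProtocol extends AggregateProtocol {
  type Aggregate = Order
  type Command   = OrderCommand
  type Event     = OrderEvent
}

// the ref only accepts its protocol's own Command: mixing in ItemCommand can't typecheck
class AggregateRef[P <: AggregateProtocol](val protocol: P) {
  def !(cmd: protocol.Command): Unit = ()
}

val ref = new AggregateRef(OrderProtocol)
// ref ! someItemCommand // does not compile: expects OrderProtocol.Command
```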
Type projection might go away with Dotty => just use a companion object with type members, and access the type through . instead of #
Trick: a trait can expose itself as an implicit value of its own singleton type: implicit def selfImplicit: this.type = this
Advantage: Persistence is completely out of Akka
Used for fraud detection. It consumes 15 different APIs that have different response times; most of the time it waits.
Write requests to Kafka. All requests go via Kafka
Gains: long-running Tx, persistence, debugging, upgrades of the versions of transactions from Kafka, and ScalaCheck for the workflow.
Transaction: Stream of ApiResponse(value: String)
We could model the workflow as an FSM (Moore machine), but that is difficult to combine. Another way is through fs2: it can be resumed but has no backtracking. Define the algebra of combinators for workflow steps (see the sketch below).
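A minimal sketch of such an algebra on fs2 (step/andThen and the ApiResponse shape are assumptions):

```scala
import cats.effect.IO
import fs2.Stream

final case class ApiResponse(value: String)

// one workflow step: a stand-in for calling one of the external APIs
def step(name: String): Stream[IO, ApiResponse] =
  Stream.eval(IO(ApiResponse(s"$name-ok")))

// sequential composition: run a step, feed its response into the next one
def andThen(a: Stream[IO, ApiResponse])(
    f: ApiResponse => Stream[IO, ApiResponse]): Stream[IO, ApiResponse] =
  a.flatMap(f)

val workflow: Stream[IO, ApiResponse] =
  andThen(step("creditCheck"))(r => step(s"fraudScore-${r.value}"))
```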
We can push JSON directly into Kafka and examine the topic. Check Haskell's machines library.