Peer to Peer Cloud Computing

Current cloud services and traditional Web platforms like Amazon, Azure, Google, or Facebook are all centralized from the management point of view, and thus they pose a threat to the privacy and security of our data, as well as to our computing power, as big as a single super-mainframe would. Amazon could paralyze the entire world. Facebook can be perfect for political campaigns if the right "leaks" happen when they are needed; we know this very well.

The sources of concern are:

  • The centralized databases that consolidate the data of all users, instead of letting each user control where to store his own data, whether locally or on someone else's nodes, and define his own access policies. Although centralized databases are partly a consequence of the other concerns described below, centralization of data is also desired in itself, since it is very profitable for legal and, I am afraid, criminal uses of such data.

  • The methods for obtaining scalability apparently demand farms of computers, again centrally managed and controlled. Instead, applications can scale in a truly distributed way among the users, in a peer-to-peer fashion, in exchange for some computing or storage credit that can be sold or used later. For example, my PC, or my node at a third-party provider, can execute a process from another company in my spare time if I wish. In exchange, I can multiply my computing power and storage when I really need it, or sell it to others. A cloud computing platform that is architecture independent is desirable for this purpose.

  • Big, centrally managed frameworks for content management, Big Data, machine learning, event management, etc. require complex configuration and maintenance. They are pre-configured and set up for clients at the price of dependence on a remote, centralized infrastructure. Simpler self-installing libraries and services, with functional composition instead of configuration, can reduce this complexity to almost zero.

  • The shortcomings of HTTP and, in general, of all network protocols whose addresses identify particular machines and folders instead of identifying contents. This rigid, location-dependent addressing limits flexibility and tends to concentrate resources in a few well-known internet locations. Addressing by content (for example IPFS) or addressing by software service (for example transient services) allows programming access to resources and computing power completely abstracted from physical locations. With IPFS you can address a piece of content by its hash and download it from a dozen locations at the same time, in parallel (a toy sketch of this idea follows this list). You can also call a dozen instances of a computing service by its description, independently of where it is located and of whether it is already installed or not. The main service that Amazon brings to companies is location independence within amazon.com, through a primitive form of addressing resources by service definition. What is needed is true location independence, not tied to any single provider.

  • The huge increase in complexity created by the distribution of resources makes truly distributed applications unfeasible. This plays in favour of the centralized farm model: a single huge database replicated across the network, plus a single-node application also replicated among the nodes, is the standard way to scale across the net. Suppose instead a library that allows coding natively distributed applications without defining specific interfaces in the form of JSON APIs or special IDL specifications, and that can execute a remote query as if it were local by simply adding a prefix: runAt remoteNode query (see the sketch after this list). That would make the centralization of resources unnecessary, and would allow any application to access any resource on any other node without an increase in design, coding, or testing effort.
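
To make the runAt idea above concrete, here is a minimal, self-contained sketch in Haskell. It is not the API of any real library (in particular it is not the transient-universe API, where the combinator lives in a Cloud monad and the computation really travels over the network); Node, Query, runAt and listFiles are illustrative names, and the "remote" hop is only simulated in-process.

```haskell
import qualified Data.Map as Map

-- a node is just an address in this toy; a real library would hold connections
newtype Node = Node String deriving (Eq, Ord, Show)

-- a query is an action that runs in the context of some node
type Query a = Node -> IO a

-- runAt: run the query "on" the given node. Here the remote hop is
-- simulated locally; a real implementation would ship the computation
-- to the remote machine and stream the result back.
runAt :: Node -> Query a -> IO a
runAt node query = do
  putStrLn ("-- hop to " ++ show node)
  query node

-- stand-in for each node's local store
localStore :: Map.Map Node [String]
localStore = Map.fromList
  [ (Node "alice.example", ["photo-2018-07.jpg"])
  , (Node "bob.example",   ["invoice-42.pdf"])
  ]

listFiles :: Query [String]
listFiles node = pure (Map.findWithDefault [] node localStore)

main :: IO ()
main = runAt (Node "alice.example") listFiles >>= print
```

The point is the shape of the call site: the remote query is an ordinary expression prefixed by runAt, so it composes like any other local code.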
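
The addressing-by-content idea from the IPFS bullet can be sketched just as briefly. This toy keeps blocks in in-memory maps and uses the non-cryptographic hash from the hashable package as a stand-in for a real content hash; ContentHash, Replica, store and fetch are illustrative names, not the IPFS API, which uses cryptographic multihashes and a distributed hash table.

```haskell
import Control.Applicative ((<|>))
import Data.Hashable (hash)   -- non-cryptographic stand-in for a content hash
import qualified Data.Map as Map

type ContentHash = Int
type Replica     = Map.Map ContentHash String   -- one replica's local block store

-- store a block in a replica and return the hash that now addresses it
store :: String -> Replica -> (ContentHash, Replica)
store content replica = (h, Map.insert h content replica)
  where h = hash content

-- fetch by hash from whichever replica happens to hold the block;
-- the caller says *what* it wants, never *where* it lives
fetch :: ContentHash -> [Replica] -> Maybe String
fetch h = foldr ((<|>) . Map.lookup h) Nothing

main :: IO ()
main = do
  let (h, replica1) = store "hello, content-addressed world" Map.empty
  print (fetch h [Map.empty, replica1])   -- Just "hello, content-addressed world"
```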

Centralization is not bad in itself. It can even be very good sometimes, but it should be a reversible option, not a requirement for using applications. At any moment I could decide to store my own data on my own computer and decide for myself what others do with what is mine, and how.

It is not even about centralization versus decentralization; it is about having freedom. Perhaps you use an application that runs on Amazon, but you want to store your data on Azure. No problem: whatever requests your data from the application would be pointed to the right place, without changing the application or reconfiguring it. The invocation of a single primitive in your script would do the migration.

Coding distributed programs using network addresses is like programming by addressing individual CPU registers. Configuring a distributed application by identifying resources by location is equivalent to configuring the registers of a chip. Programming by identifying resources by their content and service characteristics is equivalent to programming in a high-level language. Substituting composable network library primitives for configurations and locations is like dropping the assembler routine that multiplies two registers, together with the code that loads the numbers into those registers, and instead of all that complication writing the expression a * b in a high-level language.

Making distributed computing easy and composable opens the door to amazing abstractions. For example, caching records on other nodes, and invalidating those caches, become almost trivial (a sketch follows below). Consensus algorithms can be packed into configuration-free library calls. There are many, many distributed abstractions that can be turned into library primitives and raise cloud programming to a new level.
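
As a hedged illustration of that claim, here is a small, self-contained Haskell sketch of "caching as a library call": a combinator that wraps any keyed remote lookup with a local cache and an invalidation handle. The names cachedLookup and invalidate are hypothetical, and a real distributed cache would also propagate invalidations to the other nodes holding copies, which this toy does not do.

```haskell
import Data.IORef
import qualified Data.Map as Map

-- wrap a keyed remote lookup with a local cache; returns the cached
-- lookup plus an invalidation action for when the remote data changes
cachedLookup :: Ord k => (k -> IO v) -> IO (k -> IO v, k -> IO ())
cachedLookup remote = do
  ref <- newIORef Map.empty
  let get k = do
        m <- readIORef ref
        case Map.lookup k m of
          Just v  -> pure v                 -- hit: no network round trip
          Nothing -> do
            v <- remote k                   -- miss: one remote call
            modifyIORef' ref (Map.insert k v)
            pure v
      invalidate k = modifyIORef' ref (Map.delete k)
  pure (get, invalidate)
```

A program would obtain (get, invalidate) once, use get wherever it used the remote query before, and call invalidate when it learns that a record has changed; no cache server, no configuration.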
