Caliopen Architecture Guide

Context

In this document, Caliopen refers to the technical messaging infrastructure developed to achieve the aims of the Caliopen project.

Caliopen is designed as a scalable online platform for managing messages across many protocols. Mail and its related protocols are the first to be implemented.

Security is a main concern, particularly the security of personal data.

Design

Caliopen is basically a three-layer architecture:

Storage layer, with a data engine and an index engine. Cassandra and Elasticsearch are currently supported.
Core layer. This is where all logic is built. All clients of Caliopen must use this layer.
Protocol layers. Protocol-specific logic is built in these layers, which use the core layer to manage data. The REST API, the LMTP MDA and SMTP are such layers.
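
The layering above can be sketched as follows. This is only an illustration, not Caliopen's actual code: the class names (`InMemoryStorage`, `Core`, `RestLayer`) and their methods are hypothetical, and plain dicts stand in for the Cassandra and Elasticsearch engines.

```python
class InMemoryStorage:
    """Storage layer stand-in: a data engine and an index engine (dicts here)."""

    def __init__(self):
        self.data = {}   # plays the role of the data engine
        self.index = {}  # plays the role of the index engine

    def save(self, user_id, key, value):
        self.data[(user_id, key)] = value
        self.index.setdefault(user_id, {})[key] = value

    def load(self, user_id, key):
        return self.data[(user_id, key)]


class Core:
    """Core layer: the only entry point to storage for every client."""

    def __init__(self, storage):
        self._storage = storage

    def store_message(self, *, user_id, message_id, text):
        # Inputs are validated and cleaned before they reach storage.
        if not text.strip():
            raise ValueError("text must not be empty")
        self._storage.save(user_id, message_id, text.strip())

    def get_message(self, *, user_id, message_id):
        return self._storage.load(user_id, message_id)


class RestLayer:
    """Protocol layer: protocol-specific logic only; data goes through Core."""

    def __init__(self, core):
        self._core = core

    def handle_post(self, user_id, payload):
        self._core.store_message(
            user_id=user_id, message_id=payload["id"], text=payload["text"]
        )
        return {"status": 201}

    def handle_get(self, user_id, message_id):
        text = self._core.get_message(user_id=user_id, message_id=message_id)
        return {"status": 200, "text": text}
```

In this sketch `RestLayer` never touches `InMemoryStorage` directly, so a second protocol layer could reuse the same `Core` unchanged.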

Code repositories reflect this architecture. It is a protocol-component-oriented architecture, not a microservice one.

Every protocol layer must use state-of-the-art security mechanisms. For example, when Caliopen does not operate with a third-party HTTP(S) service, HTTPS ciphers must be enforced at a very high security level (an A+ grade on the SSL Labs site).
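
As an illustration of what enforcing ciphers can look like, here is a minimal sketch using Python's standard `ssl` module. The function name and the cipher policy are assumptions made for this example. The sketch does not by itself guarantee any particular SSL Labs grade, which also depends on the certificate and the HTTP configuration.

```python
import ssl


def build_server_tls_context():
    """Return a server-side TLS context limited to modern protocols and ciphers."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Refuse every protocol version older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # For TLS 1.2, keep only forward-secret AEAD suites (assumed policy).
    context.set_ciphers("ECDHE+AESGCM:ECDHE+CHACHA20")
    return context


context = build_server_tls_context()
# A real server would also load its certificate and key, for example:
# context.load_cert_chain("server.crt", "server.key")  # hypothetical paths
```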

Layers

Storage layer

All data is stored in a Cassandra cluster, encrypted with the user's main encryption key whenever possible. Unencrypted data is stored in an index engine to permit fast per-user retrieval of information.

In most cases a model is related to a single user, and each user has their own index holding the models that can be indexed. It must be possible to rebuild a user's index at any time from the Cassandra data alone. As a consequence, updatable data often has to be written to both Cassandra and Elasticsearch (tags are a good example).
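
The double write and the rebuild requirement can be illustrated with a toy sketch. The `UserStore` class and its methods are hypothetical; two dicts stand in for Cassandra (`data`) and Elasticsearch (`index`), and no real driver is involved.

```python
class UserStore:
    """Toy stand-in for the two engines: `data` for Cassandra, `index` for Elasticsearch."""

    def __init__(self):
        self.data = {}   # (user_id, tag_name) -> tag document; the source of truth
        self.index = {}  # user_id -> {tag_name: tag document}; one index per user

    def save_tag(self, user_id, name, label):
        doc = {"name": name, "label": label}
        # Updatable data is written to both engines.
        self.data[(user_id, name)] = doc
        self.index.setdefault(user_id, {})[name] = dict(doc)

    def rebuild_index(self, user_id):
        """Replace the user's index with one rebuilt from the data engine alone."""
        self.index[user_id] = {
            name: dict(doc)
            for (owner, name), doc in self.data.items()
            if owner == user_id
        }
        return self.index[user_id]
```

Because `rebuild_index` reads only from `data`, losing or corrupting a user's index never loses information in this model.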

Core layer

This layer, and only this one, must be used to read from and write to storage. Every model has an equivalent in this layer, which must be used to manage the related data.

This layer must not call storage-specific object methods; that logic belongs strictly to the storage layer (NB: the code does not currently comply with this).

All inputs must be validated and cleaned beforehand. All methods in this layer must be strictly declarative and must not accept anonymous arguments, especially when querying indexed data.
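
One way to read this rule in Python is to declare every parameter by name and reject positional or catch-all (`*args`, `**kwargs`) arguments. The sketch below follows that interpretation, which is an assumption; the function name `search_messages` and its parameters are hypothetical.

```python
def search_messages(*, user_id: str, tag: str, limit: int = 20) -> dict:
    """Build a query description from explicitly named, validated arguments."""
    if not isinstance(user_id, str) or not user_id.strip():
        raise ValueError("user_id must be a non-empty string")
    if not isinstance(tag, str) or not tag.strip():
        raise ValueError("tag must be a non-empty string")
    if not isinstance(limit, int) or not 1 <= limit <= 100:
        raise ValueError("limit must be an integer between 1 and 100")
    # Only cleaned values are returned; nothing is forwarded untouched.
    return {"user_id": user_id.strip(), "tag": tag.strip().lower(), "limit": limit}
```

The bare `*` makes every parameter keyword-only, so a call such as `search_messages("alice", "work")` raises `TypeError` instead of silently binding values by position.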

Protocols layer

This layer consists of several packages, one per protocol.

The protocols currently supported are:

HTTPS REST API
LMTP Mail Delivery Agent
SMTP for mail sending
vCard import and export for address book management

More are to come (XMPP, Twitter, Facebook, LinkedIn, ...).

Models

The storage layer defines many models. Most are related to a single user, but some are shared by the whole platform.

User management and configuration

User
~~~~

All users are stored using this model. This model is used for user authentication
and lookup.

Counter
~~~~~~~

This model stores all counters related to one user.

Tag
~~~

This model stores all tags defined by a user.

FilterRule
~~~~~~~~~~

This model stores the filtering rules defined by one user. The current design is not correct: it stores executable Python code directly. That was needed for the proof of concept, and a better solution is required.
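
One possible direction, shown here only as a sketch and not as a decided design, is to store rules as plain data and evaluate them with a small fixed set of operators, so that no user-supplied code is ever executed. The rule fields (`field`, `operator`, `value`, `tag`) and the function names are hypothetical.

```python
import operator

# Whitelisted operators: a rule can only name one of these, never supply code.
OPERATORS = {
    "equals": operator.eq,
    "contains": lambda field_value, expected: expected in field_value,
}


def rule_matches(rule: dict, message: dict) -> bool:
    """Return True if the message field satisfies the rule's condition."""
    compare = OPERATORS[rule["operator"]]  # KeyError for unknown operators
    return compare(message.get(rule["field"], ""), rule["value"])


def apply_rules(rules: list, message: dict) -> list:
    """Return the tags of every rule that matches the message, in rule order."""
    return [rule["tag"] for rule in rules if rule_matches(rule, message)]
```

A rule such as `{"field": "subject", "operator": "contains", "value": "invoice", "tag": "billing"}` can then be stored like any other row.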

Message management

RawMessage
~~~~~~~~~~

Stores any message in raw format. A raw message can be related to
many users. A received or sent message must not be modified before
it is stored in this model.

Message
~~~~~~~

A message processed for one user. Only the data useful for display
is stored in this model.

Thread
~~~~~~~

Threads related to one user. Every user message belongs to a new or
an existing thread.

MessageLookup
~~~~~~~~~~~~~

Lookup table (index) to retrieve a user message by its external id.
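
How the four message models relate can be sketched with in-memory structures. Everything below is illustrative: the field names, the `Mailbox` helper and the threading rule (a message joins the thread of the parent it references, if that parent is known) are assumptions, not the actual schema.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class RawMessage:
    raw_id: str
    data: bytes  # stored exactly as received or sent


@dataclass
class Message:
    user_id: str
    message_id: str
    thread_id: str
    external_id: str
    raw_id: str   # points back to the unmodified RawMessage
    subject: str  # only what is useful for display


@dataclass
class Mailbox:
    """In-memory stand-in for one user's Message, Thread and MessageLookup tables."""
    user_id: str
    messages: dict = field(default_factory=dict)  # message_id -> Message
    threads: dict = field(default_factory=dict)   # thread_id -> [message_id, ...]
    lookup: dict = field(default_factory=dict)    # external_id -> message_id

    def deliver(self, raw, external_id, subject, parent_external_id=None):
        # Join the parent's thread when the parent is known, else start a new one.
        parent_id = self.lookup.get(parent_external_id)
        if parent_id is not None:
            thread_id = self.messages[parent_id].thread_id
        else:
            thread_id = uuid.uuid4().hex
        message = Message(self.user_id, uuid.uuid4().hex, thread_id,
                          external_id, raw.raw_id, subject)
        self.messages[message.message_id] = message
        self.threads.setdefault(thread_id, []).append(message.message_id)
        self.lookup[external_id] = message.message_id
        return message

    def by_external_id(self, external_id):
        """MessageLookup role: resolve an external id to the user's message."""
        return self.messages[self.lookup[external_id]]
```

The same `RawMessage` can be delivered to several `Mailbox` instances, which mirrors a raw message being related to many users while each `Message` belongs to exactly one.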

Contact management

XXX
