- DNS lookup
- TCP connection open
- TLS handshake
- HTML parsing
- JavaScript parsing
- API call (AJAX)
Container technology, e.g. Docker, provides OS-level virtualization in packages called containers, which eliminates cross-OS incompatibilities. Containers greatly reduce development and deployment friction: developers can work on different host OSes (e.g. macOS vs Windows vs Linux) BUT run their code in the same container environment as production.
For example, if the production environment is an Ubuntu Docker image, then developers working on either Windows or Mac machines can develop, run, and test their work against that same Ubuntu image environment.
Beyond software development, Docker, when used properly, can also maximize a machine's resource utilization. Instead of setting up a separate VM for each task, which leaves resources idle, Docker can pack many containers onto the same host. One computer can run several VMs, and those VMs in turn run many containers, saving both cost and time when configuring these systems (because Docker configuration can be automated and replicated easily).
Caching is a technique that stores information on faster storage (e.g. CPU cache or RAM, versus the much slower disk) to improve I/O performance, such as access time or query latency.
There's a catch, however. With certain caching strategies such as write-behind, data sits in volatile memory for a while before being committed to non-volatile storage, so losing power during that window means losing data.
Also, faster memory like CPU cache and RAM always costs more than slower, non-volatile storage like SSDs and hard disks. This remains true even with today's cloud infrastructure, like AWS ElastiCache. This means we must carefully determine an optimal amount of cache that is high-performing while not being too expensive.
One of the tools used for caching is Redis, an in-memory key-value database and cache that is typically placed in front of the actual database because it is very fast.
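As a minimal illustration of the cache-aside pattern Redis is commonly used for, here is a sketch in Go; the `slowLookup` function is a hypothetical stand-in for a real database query:

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a minimal in-memory key-value cache illustrating the
// cache-aside pattern that Redis typically serves in production.
type Cache struct {
	mu   sync.Mutex
	data map[string]string
}

func NewCache() *Cache {
	return &Cache{data: make(map[string]string)}
}

// GetOrLoad returns the cached value for key, calling load (the slow
// backing store, e.g. the real database) only on a cache miss.
func (c *Cache) GetOrLoad(key string, load func(string) string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.data[key]; ok {
		return v // cache hit: no round-trip to the backing store
	}
	v := load(key) // cache miss: query the backing store and remember the result
	c.data[key] = v
	return v
}

func main() {
	calls := 0
	slowLookup := func(key string) string { // stands in for a database query
		calls++
		return "value-for-" + key
	}
	c := NewCache()
	fmt.Println(c.GetOrLoad("user:1", slowLookup)) // miss: hits the "database"
	fmt.Println(c.GetOrLoad("user:1", slowLookup)) // hit: served from memory
	fmt.Println("backing-store calls:", calls)     // backing-store calls: 1
}
```

A real Redis deployment adds the things this sketch omits: eviction policies, TTLs, and the volatility trade-off discussed above.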
4. Write a function (in any language) that reverses the element order of an array of strings without using any built-in functions or libraries.
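A possible answer in Go, reversing in place with a two-pointer swap (only `len` is used, which Go requires for slice bounds):

```go
package main

import "fmt"

// reverseStrings reverses the order of elements in place using a
// two-pointer swap, with no library helpers.
func reverseStrings(xs []string) []string {
	for i, j := 0, len(xs)-1; i < j; i, j = i+1, j-1 {
		xs[i], xs[j] = xs[j], xs[i]
	}
	return xs
}

func main() {
	fmt.Println(reverseStrings([]string{"a", "b", "c", "d"})) // [d c b a]
}
```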
5. In which cases should we choose MongoDB over Postgres, and in which cases should we choose Postgres over MongoDB?
We should choose MongoDB if the data we're dealing with needs extra flexibility (i.e. unstructured data), doesn't fit in a table, or is not very relational. This is because MongoDB is essentially a document database and doesn't resemble traditional SQL tables at all. Vanilla MongoDB doesn't even have foreign keys or linking mechanisms like relations.
In MongoDB, we can also store data without first defining a schema, which means the data model can change at any time (something a SQL schema forbids without a migration). And because it is not a tabular database, we can easily store unstructured data, at the cost of not being able to do things like the JOINs that enable very complex queries in a traditional SQL database.
In terms of performance, compared to SQL databases, MongoDB is generally faster at writing new data because there are no schemas or constraints to slow down write operations, but it is generally slower at queries (reads) because there is no strictly defined structure the engine can traverse quickly. So if the use case is write-heavy, MongoDB may be a better choice.
MongoDB also handles many concurrent connections well, and it can be scaled out to multiple servers more easily because there are no tables to keep in sync across machines. In my understanding, it is also less expensive to run and maintain. Many people pair Redis with MongoDB to improve read performance via caching.
We should choose Postgres if the data is highly relational and tabular, if the use case is very read-heavy, or if it is legacy data inherited from legacy RDBMS systems. Postgres also has extra features such as JSON storage (the JSON/JSONB column types), which can be queried in a MongoDB-like way, so we can have the best of both worlds when we mainly need a SQL database but also want to store document-like (JSON) data.
And because Postgres has been around for a long time, we may be forced to use it (or another SQL database) when we have legacy code that can't talk to MongoDB.
GraphQL is a data query language for APIs, developed and released as an open-source project by Facebook. It allows clients to define the structure of the requested data (client-driven), allowing more flexibility and other potential benefits when contrasted with RESTful APIs.
REST servers expose multiple endpoints (URL-driven) for clients to make requests to, while GraphQL exposes just one URL for POST requests, whose body contains the query (query-driven). Both GraphQL and REST usually exchange JSON, although payloads other than JSON (e.g. a file) can also be used.
If the use case is an ad-hoc public API whose usage we can't predict, or one that will be used differently by different consumers, then we should use GraphQL, because each client can request exactly the fields it needs instead of processing extra information it will not use.
If we are building a very specific API, like stock prices, then it may be better to use REST. If the API has only a single consumer (e.g. our own web page), it may also be better to use REST.
Database indexing can increase query (read) performance in high-traffic situations because we can look up data in a sorted structure instead of scanning every row, but it can decrease insert (write) performance, because the index must be updated on every insert. These trade-offs have to be taken into consideration when deciding on database indexes.
To securely store passwords in a database, we first hash the plaintext passwords before storing them. But just using a fast secure hashing algorithm like SHA is not enough, because attackers may have prepared rainbow tables precomputed over huge numbers of password candidates. We must mix our own randomness into each plaintext password before hashing it.
To protect the hashed passwords against rainbow-table attacks, we introduce a per-password random value (a salt), which produces a different hash every time even when the input passwords are identical. Bcrypt and PBKDF2 are examples of password-hashing algorithms that build in a salt.
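To illustrate just the salting idea, here is a sketch in Go using only the standard library. Note this is a teaching sketch: SHA-256 is fast, so real systems should use a deliberately slow algorithm like bcrypt, PBKDF2, or Argon2 instead.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
	"fmt"
)

// hashPassword returns hex(salt) and hex(SHA-256(salt || password)).
// Illustrates salting only; production code should use bcrypt/Argon2.
func hashPassword(password string) (saltHex, hashHex string, err error) {
	salt := make([]byte, 16)
	if _, err = rand.Read(salt); err != nil {
		return "", "", err
	}
	sum := sha256.Sum256(append(salt, []byte(password)...))
	return hex.EncodeToString(salt), hex.EncodeToString(sum[:]), nil
}

// verifyPassword recomputes the salted hash and compares in constant time.
func verifyPassword(password, saltHex, hashHex string) bool {
	salt, err := hex.DecodeString(saltHex)
	if err != nil {
		return false
	}
	sum := sha256.Sum256(append(salt, []byte(password)...))
	return subtle.ConstantTimeCompare([]byte(hex.EncodeToString(sum[:])), []byte(hashHex)) == 1
}

func main() {
	salt1, hash1, _ := hashPassword("hunter2")
	_, hash2, _ := hashPassword("hunter2")
	fmt.Println(hash1 != hash2)                          // true: same password, different salts
	fmt.Println(verifyPassword("hunter2", salt1, hash1)) // true
	fmt.Println(verifyPassword("wrong", salt1, hash1))   // false
}
```

The key point: because the salt differs per user, identical passwords no longer produce identical stored hashes, which defeats precomputed rainbow tables.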
- int64: 0
- string: "" (the empty string)
- bool: false
- *bool: nil
- struct { a bool; b int64; c *string }: {false, 0, nil}
- interface{}: nil
- []string: nil (a nil slice, which prints as [])
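The zero values above can be checked with a short Go program:

```go
package main

import "fmt"

// Demonstrates Go's zero values for the types listed above.
func main() {
	var i int64
	var s string
	var b bool
	var pb *bool
	var st struct {
		a bool
		b int64
		c *string
	}
	var iface interface{}
	var xs []string

	fmt.Println(i, s == "", b, pb == nil) // 0 true false true
	fmt.Println(st.a, st.b, st.c == nil)  // false 0 true
	fmt.Println(iface == nil, xs == nil)  // true true
	fmt.Printf("%v\n", xs)                // [] (a nil slice prints as [])
}
```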
10. Write a program that encrypts and decrypts a string with RSA-OAEP and SHA-256. Use the public/private key pair from the environment variables RSA_PUB_KEY and RSA_PRIV_KEY. The program will have functions encrypt(plaintext) ciphertext and decrypt(ciphertext) plaintext.
The nature of RSA also means a key pair can only encrypt a message of limited length (bounded by the key and padding sizes). If we really want to use asymmetric encryption for larger messages (due to key-distribution or other concerns), we can mix symmetric encryption (e.g. AES) into our scheme:
- We first use a symmetric key, for example `s_key`, to encrypt the large plaintext symmetrically.
- Then we asymmetrically encrypt `s_key` with the receiver's public key.
- Send our symmetrically encrypted ciphertext and the asymmetrically encrypted `s_key` to our receivers.
- When the receivers receive the data (ciphertext and encrypted `s_key`), they can use their private key to asymmetrically decrypt the encrypted `s_key`.
- The receivers can then use the decrypted `s_key` to symmetrically decrypt the whole message.
For AES, I recommend AES-256-GCM (which includes message authentication), or AES-256-CTR for very, very large messages.
Typical GCM implementations must hold the entire message in memory during encryption, so GCM may not suit very large files on memory-limited devices, while CTR encrypts/decrypts the message sequentially in chunks, ruling out the memory concern. However, CTR has no message authentication (it should be paired with a MAC). We must carefully weigh the trade-offs between these two AES modes.