- hackmd version: https://hackmd.io/Iwe0MVlwRpOMZ70W1ZwPtQ?view
- gist version: https://gist.github.com/bluet/5ebdfb65f486e01c13175dcdedf27906
[TOC]
This is a HOWTO for database record/value version control, not for schema.
With MariaDB, now we can Versioning database records (values) WITH SYSTEM VERSIONING
.
Once we have tables configured correctly, Data Versions will be generated automatically everytime we have new data inserted or values updated.
Also note that when new values are as the same as the old ones, no modifications will be done and no new versions will be created; it just works like a automatic data dedup (de-duplication) feature which avoids adding repeating data.
Unless you really need to log every metric data down (ex, system monitoring), and in that case you might want to check TSDB (time series databases) like InfluxDB, saving repeating unchanged records is just a waste of disk and memory and could also drag system performance down. That's why data dedup is important.
- MariaDB version 10.3.4 or up (better with 10.4 or up, for Application-time Periods)
- With docker:
docker run --name mariadb -e MYSQL_ROOT_PASSWORD=password -d mariadb
- With docker:
Connect with client shipped with docker image Docker Hub
docker exec -it mariadb mysql -u root -p
Create test database:
CREATE DATABASE Company;
use Company;
Create a Temporal table with versioning support for all columns:
CREATE TABLE Person (
Id int(11) NOT NULL AUTO_INCREMENT,
FirstName varchar(50) NOT NULL,
LastName varchar(50) NOT NULL,
Gender char(1) NOT NULL,
DepartmentId int(11) NOT NULL,
PRIMARY KEY (Id),
CONSTRAINT con_gender CHECK (Gender in ('f','m')))
WITH SYSTEM VERSIONING;
For excluding columns from versioning, we can append
WITHOUT SYSTEM VERSIONING
in column definition. Reference
CREATE TABLE t (
x INT,
y INT WITHOUT SYSTEM VERSIONING
) WITH SYSTEM VERSIONING;
For inclusive column declaration, we can append
WITH VERSIONING
. Reference
CREATE TABLE t (
x INT WITH SYSTEM VERSIONING,
y INT
);
If most queries on that table use only current data in stead of historical data, partitioning can reduce the size of the table, gain performance on data queries (table scans and index searches). Reference
CREATE TABLE t (x INT) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME (
PARTITION p_hist HISTORY,
PARTITION p_cur CURRENT
);
INSERT
always creates new version, as it creates a new data of values as a new record.
MariaDB [Company]> INSERT INTO Person (FirstName, LastName, Gender, DepartmentId) VALUES ('Rasmus', 'Johansson', 'm', 1);
Query OK, 1 row affected (0.002 sec)
MariaDB [Company]> SELECT * FROM Person;
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
| 1 | Rasmus | Johansson | m | 1 |
+----+-----------+-----------+--------+--------------+
1 row in set (0.001 sec)
Above command created the first row of records, and it's timestamps.
MariaDB [Company]> SELECT *,ROW_START, ROW_END FROM Person FOR SYSTEM_TIME ALL;
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| Id | FirstName | LastName | Gender | DepartmentId | ROW_START | ROW_END |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| 1 | Rasmus | Johansson | m | 1 | 2020-02-19 22:04:48.565979 | 2020-02-19 22:05:13.094584 |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
1 rows in set (0.001 sec)
UPDATE
doesn't always generates new version:
- If values are really being changed, it will create a new version.
- If values are the same, the
UPDATE
won't modify any value, so no new version.
MariaDB [Company]> UPDATE Person SET DepartmentId = 2 WHERE Id = 1;
Query OK, 1 row affected (0.050 sec)
Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0
New version shows up.
MariaDB [Company]> SELECT *,ROW_START, ROW_END FROM Person FOR SYSTEM_TIME ALL;
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| Id | FirstName | LastName | Gender | DepartmentId | ROW_START | ROW_END |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| 1 | Rasmus | Johansson | m | 1 | 2020-02-19 22:04:48.565979 | 2020-02-19 22:05:13.094584 |
| 1 | Rasmus | Johansson | m | 2 | 2020-02-19 22:05:13.094584 | 2020-02-19 22:05:39.879380 |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
2 rows in set (0.001 sec)
MariaDB [Company]> UPDATE Person SET DepartmentId = 2 WHERE Id = 1;
Query OK, 0 rows affected (0.001 sec)
Rows matched: 1 Changed: 0 Inserted: 0 Warnings: 0
No new version created.
MariaDB [Company]> SELECT *,ROW_START, ROW_END FROM Person FOR SYSTEM_TIME ALL;
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| Id | FirstName | LastName | Gender | DepartmentId | ROW_START | ROW_END |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| 1 | Rasmus | Johansson | m | 1 | 2020-02-19 22:04:48.565979 | 2020-02-19 22:05:13.094584 |
| 1 | Rasmus | Johansson | m | 2 | 2020-02-19 22:05:13.094584 | 2020-02-19 22:05:39.879380 |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
2 rows in set (0.001 sec)
MariaDB [Company]> UPDATE Person SET DepartmentId = 3 WHERE Id = 1;
Query OK, 1 row affected (0.011 sec)
Rows matched: 1 Changed: 1 Inserted: 1 Warnings: 0
INSERT INTO ins_duplicate VALUES (1,'Antelope') ON DUPLICATE KEY UPDATE animal='Antelope';
MariaDB [Company]> SELECT * FROM Person;
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
| 1 | Rasmus | Johansson | m | 3 |
+----+-----------+-----------+--------+--------------+
1 row in set (0.001 sec)
MariaDB [Company]> SELECT * FROM Person FOR SYSTEM_TIME ALL;
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
| 1 | Rasmus | Johansson | m | 1 |
| 1 | Rasmus | Johansson | m | 2 |
| 1 | Rasmus | Johansson | m | 3 |
+----+-----------+-----------+--------+--------------+
3 rows in set (0.001 sec)
MariaDB [Company]> SELECT *,ROW_START, ROW_END FROM Person FOR SYSTEM_TIME ALL;
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| Id | FirstName | LastName | Gender | DepartmentId | ROW_START | ROW_END |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| 1 | Rasmus | Johansson | m | 1 | 2020-02-19 22:04:48.565979 | 2020-02-19 22:05:13.094584 |
| 1 | Rasmus | Johansson | m | 2 | 2020-02-19 22:05:13.094584 | 2020-02-19 22:05:39.879380 |
| 1 | Rasmus | Johansson | m | 3 | 2020-02-19 22:05:39.879380 | 2038-01-19 03:14:07.999999 |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
3 rows in set (0.001 sec)
MariaDB [Company]> SELECT * FROM Person FOR SYSTEM_TIME AS OF TIMESTAMP '2020-02-19 22:05:20';
+----+-----------+-----------+--------+--------------+
| Id | FirstName | LastName | Gender | DepartmentId |
+----+-----------+-----------+--------+--------------+
| 1 | Rasmus | Johansson | m | 2 |
+----+-----------+-----------+--------+--------------+
1 row in set (0.001 sec)
MariaDB [Company]> SELECT *, ROW_START, ROW_END FROM Person FOR SYSTEM_TIME AS OF TIMESTAMP '2020-02-19 22:05:20';
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| Id | FirstName | LastName | Gender | DepartmentId | ROW_START | ROW_END |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
| 1 | Rasmus | Johansson | m | 2 | 2020-02-19 22:05:13.094584 | 2020-02-19 22:05:39.879380 |
+----+-----------+-----------+--------+--------------+----------------------------+----------------------------+
1 row in set (0.001 sec)
In some situations it might not work as expected
- When enable versioning (timestamp) for whole table except one column, using upsert (insert ... on duplicate update) to update the unversioning column, might still create new version records.
- https://hackmd.io/WNz_xg9pRYiQrL69QkufrQ?view#on-duplicate-key-update-with-primary-key--multi-unique-key
- a workaround is to use txrid (transaction id) based versioning
- but it doesn't support partitions to separate current and historic data