Dev:UnrealDB

From UnrealIRCd documentation wiki

This page describes the "UnrealDB" database design to developers. It is not really meant for end-users (admins).

Encrypted databases

UnrealDB supports both unencrypted and encrypted databases. Below we explain the choice of the cipher and key derivation function for encrypted databases.

Cipher

UnrealDB uses the XChaCha20 cipher provided by libsodium. ChaCha20 is one of the two ciphers available in TLSv1.3 (the other one being AES). The XChaCha20 variant uses a larger nonce (192 bits), allowing it to be used safely for writing lots of data.

Key derivation

A cipher such as ChaCha20 requires a key. UnrealIRCd uses Argon2id as a key derivation function (KDF) to convert a password into a suitable key. The exact parameters of the Argon2id call, such as the time/memory/parallel cost and some other parameters, are stored in each database file. By default UnrealIRCd uses quite strong Argon2id parameters, namely 4 for time, 32 MiB for memory and 2 for parallelism. With these defaults, one hash takes about 100 ms on common hardware in the year 2021 (Intel 2.1GHz), so only about 10 hashes per second can be checked per 2 CPU cores. This "slowness" is intentional and an important defense mechanism against password cracking.
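To give a feeling for what "about 10 hashes per second per 2 CPU cores" means for an attacker, here is a small back-of-the-envelope sketch. The function name and the attacker scenarios are purely illustrative assumptions, and the rate is only the rough figure quoted above, not a measurement.

```c
#include <assert.h>

/* Estimate how many seconds it takes to test `candidates` passwords.
 * `hashes_per_sec_per_pair` is the rate for 2 CPU cores (this page
 * quotes roughly 10 for the default Argon2id parameters).
 * Hypothetical helper, not part of UnrealIRCd. */
double crack_seconds(double candidates, double hashes_per_sec_per_pair, int cores)
{
    double pairs;

    assert(cores > 0 && hashes_per_sec_per_pair > 0.0);
    pairs = cores / 2.0;                 /* rate is given per 2 cores */
    return candidates / (hashes_per_sec_per_pair * pairs);
}
```

For example, crack_seconds(1e6, 10.0, 64) gives 3125 seconds for a one million word dictionary on 64 cores, while a password outside any dictionary quickly becomes impractical to brute force at this rate.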

In the configuration file you don't enter a key directly in a set:: item. Instead you refer to a secret block. The secret { } block either stores the password or tells UnrealIRCd how to retrieve it (eg from some external file, keyboard input or, in future versions, a URL).

As mentioned, only about 10 hashes can be generated per second per 2 CPU cores, which is rather slow (and simply too slow for things like channeldb). Therefore, to speed things up, UnrealIRCd caches the Argon2id result and converges on using the same parameters, including the same salt, for the same secret block. The main reason for the salt to exist is to prevent the case where (lots of) machines use the same known parameters/hashes, since that would allow precomputation attacks. Reusing a salt on a single machine is no problem: with a different salt per machine, precomputation is still not worth the cost.
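The caching idea can be sketched roughly as follows. This is not the UnrealIRCd implementation: the struct, the function names and the fixed-size table are assumptions made for illustration, and slow_kdf_placeholder() is only a stand-in for the real Argon2id call (it is NOT a secure KDF). A real cache would also have to remember the salt and cost parameters that belong to the cached key.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEYLEN     32
#define MAX_CACHED 8

/* One cached derivation per secret block (hypothetical layout). */
typedef struct {
    char    secret_name[64];
    uint8_t key[KEYLEN];
    int     used;
} CachedKey;

static CachedKey cache[MAX_CACHED];
int kdf_calls = 0;  /* counts how often the "slow" path runs */

/* Placeholder for the expensive Argon2id call. NOT a real KDF. */
static void slow_kdf_placeholder(const char *password, uint8_t out[KEYLEN])
{
    size_t n = strlen(password);
    size_t i;

    kdf_calls++;
    for (i = 0; i < KEYLEN; i++)
        out[i] = n ? (uint8_t)(password[i % n] ^ (uint8_t)i) : (uint8_t)i;
}

/* Return the key for a secret block, deriving it only on first use.
 * Returns NULL if the cache table is full. */
const uint8_t *get_key(const char *secret_name, const char *password)
{
    CachedKey *free_slot = NULL;
    int i;

    assert(secret_name != NULL && password != NULL);
    for (i = 0; i < MAX_CACHED; i++) {
        if (cache[i].used && strcmp(cache[i].secret_name, secret_name) == 0)
            return cache[i].key;            /* cache hit: no hashing */
        if (!cache[i].used && free_slot == NULL)
            free_slot = &cache[i];
    }
    if (free_slot == NULL)
        return NULL;
    slow_kdf_placeholder(password, free_slot->key);
    strncpy(free_slot->secret_name, secret_name, sizeof(free_slot->secret_name) - 1);
    free_slot->secret_name[sizeof(free_slot->secret_name) - 1] = '\0';
    free_slot->used = 1;
    return free_slot->key;
}
```

With such a cache, opening many databases that refer to the same secret block costs one slow derivation instead of one per file.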

Header

This describes the header that is present at the beginning of the database file.

raw v0

This is used for backwards compatibility with databases written using the old functions in UnrealIRCd 5.0.0 through 5.0.9 that only support unencrypted operations. This means there is NO header (zero bytes).

plaintext v1

  • 32 bytes: the string "UnrealIRCd-DB-v1" followed by zeroes (0x00)
  • 8 bytes: creation/update time of the database

After these 40 bytes the data follows in the same format that the previous read_int32/read_int64/etc functions of UnrealIRCd 5.0.0-5.0.9 used.

So, if you ever need to convert plaintext v1 back to raw v0, then just chop off the first 40 bytes.
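As an illustration of the layout above, the following sketch detects a plaintext v1 header in a memory buffer and returns the part that corresponds to raw v0. The function names are hypothetical and only the header layout described on this page is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DB_MAGIC_LEN  32
#define DB_HEADER_LEN 40   /* 32 byte magic + 8 byte timestamp */

/* Returns 1 if buf starts with "UnrealIRCd-DB-v1" padded with 0x00
 * to 32 bytes and is long enough to hold the full 40 byte header. */
int is_plaintext_v1(const unsigned char *buf, size_t len)
{
    /* The remaining bytes of the array are zero-filled by the initializer. */
    unsigned char magic[DB_MAGIC_LEN] = "UnrealIRCd-DB-v1";

    assert(buf != NULL);
    return len >= DB_HEADER_LEN && memcmp(buf, magic, DB_MAGIC_LEN) == 0;
}

/* Returns a pointer to the raw v0 payload inside buf and stores its
 * length in *out_len, or returns NULL if buf is not plaintext v1. */
const unsigned char *strip_v1_header(const unsigned char *buf, size_t len,
                                     size_t *out_len)
{
    if (!is_plaintext_v1(buf, len))
        return NULL;
    *out_len = len - DB_HEADER_LEN;
    return buf + DB_HEADER_LEN;
}
```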

crypto v1

  • 32 bytes: the string "UnrealIRCd-DB-Crypted-v1" followed by zeroes (0x00)
  • 2 bytes: key derivation function (always 0x1 at the moment for Argon2id)
  • 2 bytes: time cost for Argon2id, that is: number of rounds
  • 2 bytes: memory cost for Argon2id, as a power of two in KiB, eg 15 means 2^15 KiB = 32 MiB
  • 2 bytes: parallel cost for Argon2id, that is number of threads
  • 2 bytes: salt length in bytes (16 at the moment)
  • variable size: salt as a string (of size saltlen as indicated earlier)
  • 2 bytes: cipher (always 0x1 at the moment for XChaCha20)
  • 2 bytes: key length in bytes (32 at the moment)
  • variable size: key as a string (of size keylen as indicated earlier)
  • 24 bytes: 192bit nonce (libsodium crypto_secretstream_xchacha20poly1305_HEADERBYTES)
  • (From here on everything is encrypted)
  • A 2 byte string length field followed by the string "UnrealIRCd-DB-Crypted-Now" to verify that the password is correct, see unrealdb_read_str() on how this works.
  • 8 bytes: creation/update time of the database
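The unencrypted part of the crypto v1 header could be parsed along these lines. This is a sketch under assumptions: the struct and function names are made up, and the 2 byte fields are read as little-endian, which this page does not state. Check the real read functions before relying on the byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NONCE_LEN 24

typedef struct {
    uint16_t kdf, t_cost, m_cost_log2, p_cost;
    uint16_t saltlen;
    const unsigned char *salt;   /* points into the input buffer */
    uint16_t cipher;
    uint16_t keylen;
    const unsigned char *key;    /* points into the input buffer */
    const unsigned char *nonce;  /* NONCE_LEN bytes */
    size_t header_len;           /* offset where encrypted data starts */
} CryptoV1Header;

/* Read a 2 byte field (little-endian is an assumption). */
static int take_u16(const unsigned char *buf, size_t len, size_t *pos, uint16_t *out)
{
    if (len - *pos < 2)
        return 0;
    *out = (uint16_t)(buf[*pos] | (buf[*pos + 1] << 8));
    *pos += 2;
    return 1;
}

/* Hand out a pointer to the next n bytes, with bounds checking. */
static int take_bytes(const unsigned char *buf, size_t len, size_t *pos,
                      size_t n, const unsigned char **out)
{
    if (len - *pos < n)
        return 0;
    *out = buf + *pos;
    *pos += n;
    return 1;
}

/* Returns 1 on success, 0 if the magic is wrong or the buffer is too short. */
int parse_crypto_v1(const unsigned char *buf, size_t len, CryptoV1Header *h)
{
    unsigned char magic[32] = "UnrealIRCd-DB-Crypted-v1"; /* zero padded */
    size_t pos = 32;

    assert(buf != NULL && h != NULL);
    if (len < 32 || memcmp(buf, magic, 32) != 0)
        return 0;
    if (!take_u16(buf, len, &pos, &h->kdf) ||
        !take_u16(buf, len, &pos, &h->t_cost) ||
        !take_u16(buf, len, &pos, &h->m_cost_log2) ||
        !take_u16(buf, len, &pos, &h->p_cost) ||
        !take_u16(buf, len, &pos, &h->saltlen) ||
        !take_bytes(buf, len, &pos, h->saltlen, &h->salt) ||
        !take_u16(buf, len, &pos, &h->cipher) ||
        !take_u16(buf, len, &pos, &h->keylen) ||
        !take_bytes(buf, len, &pos, h->keylen, &h->key) ||
        !take_bytes(buf, len, &pos, NONCE_LEN, &h->nonce))
        return 0;
    h->header_len = pos;
    return 1;
}
```

With the current values (16 byte salt, 32 byte key) the unencrypted part is 32 + 10 + 16 + 4 + 32 + 24 = 118 bytes long. Everything after that has to go through the cipher.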

Data

Strings

When a string is written, first the length is written as a 2 byte field, followed by the actual string (without NUL terminator). If the length field has the magic value 0xffff it means a NULL pointer (which is not the same as length 0, which means a pointer to an empty "\0" string). The maximum size of a string is therefore 65534 bytes, which should be no problem in UnrealIRCd.

Functions: unrealdb_read_str(), unrealdb_write_str()
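The string encoding can be illustrated with a small in-memory encoder and decoder. This is only a sketch of the format described above, not the code behind unrealdb_read_str() and unrealdb_write_str(). The function names are hypothetical and the little-endian byte order of the length field is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define STR_NULL_MARKER 0xffff  /* length value that means "NULL pointer" */
#define STR_MAX_LEN     65534

/* Encode s into out. Returns the number of bytes written, or 0 if the
 * string is too long or the output buffer is too small. */
size_t encode_str(const char *s, unsigned char *out, size_t cap)
{
    size_t n;

    assert(out != NULL);
    if (s == NULL) {
        if (cap < 2)
            return 0;
        out[0] = 0xff;
        out[1] = 0xff;
        return 2;
    }
    n = strlen(s);
    if (n > STR_MAX_LEN || cap < n + 2)
        return 0;
    out[0] = (unsigned char)(n & 0xff);   /* low byte first (assumed) */
    out[1] = (unsigned char)(n >> 8);
    memcpy(out + 2, s, n);                /* no NUL terminator on disk */
    return n + 2;
}

/* Decode one string. Returns the number of bytes consumed, or 0 on
 * truncated input or allocation failure. *out is set to NULL for the
 * NULL marker, otherwise to a malloc()ed NUL-terminated copy. */
size_t decode_str(const unsigned char *in, size_t len, char **out)
{
    size_t n;

    assert(in != NULL && out != NULL);
    *out = NULL;
    if (len < 2)
        return 0;
    n = (size_t)in[0] | ((size_t)in[1] << 8);
    if (n == STR_NULL_MARKER)
        return 2;
    if (len - 2 < n)
        return 0;
    *out = malloc(n + 1);
    if (*out == NULL)
        return 0;
    memcpy(*out, in + 2, n);
    (*out)[n] = '\0';
    return n + 2;
}
```

Note how a NULL pointer and an empty string both take 2 bytes on disk but decode to different values.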

Integers

The database routines support reading and writing unsigned integers of 16, 32 and 64 bits. Timestamps are always read and written as 64 bit integers to avoid the year 2038 problem (Y2K38).

Functions: unrealdb_read_int16(), unrealdb_read_int32(), unrealdb_read_int64(), unrealdb_write_int16(), unrealdb_write_int32(), unrealdb_write_int64()
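A fixed-width integer can be written in a way that does not depend on the CPU's native byte order, for example as below. This is a sketch, not the UnrealIRCd code: the helper names are hypothetical and little-endian on-disk order is an assumption that this page does not confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store the low nbytes of v into out, least significant byte first. */
void put_le(unsigned char *out, uint64_t v, size_t nbytes)
{
    size_t i;

    assert(out != NULL && nbytes <= 8);
    for (i = 0; i < nbytes; i++)
        out[i] = (unsigned char)(v >> (8 * i));
}

/* Read nbytes from in, least significant byte first. */
uint64_t get_le(const unsigned char *in, size_t nbytes)
{
    uint64_t v = 0;
    size_t i;

    assert(in != NULL && nbytes <= 8);
    for (i = 0; i < nbytes; i++)
        v |= (uint64_t)in[i] << (8 * i);
    return v;
}
```

Calling put_le(buf, ts, 8) and get_le(buf, 8) round-trips a timestamp such as 4102444800, which is beyond the signed 32 bit limit of 2147483647 and would not fit in a 32 bit time field.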

Other data

It is possible to write other binary structures to the database, but this is generally not recommended as it is very easy to make a mistake when doing so (eg: altering a struct { } and not realizing it changes the binary format on-disk, non-packed structs, writing partially uninitialized memory, incompatibility when moving between architectures with a different word size, etc.).

Functions: unrealdb_read_data(), unrealdb_write_data()
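To illustrate why dumping a struct is fragile, consider the hypothetical Record below. Its fields need 11 bytes, but sizeof(Record) is usually larger because the compiler inserts padding, and the exact amount depends on the platform. Writing the fields one by one gives a fixed layout instead. All names here are made up for the example and the little-endian order is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory structure. The compiler is free to add
 * padding after `flag` and after `count`, so sizeof(Record) is
 * platform dependent and typically larger than 11. */
typedef struct {
    uint8_t  flag;
    uint64_t timestamp;
    uint16_t count;
} Record;

#define RECORD_WIRE_SIZE (1 + 8 + 2)  /* what actually needs storing */

/* Store the low nbytes of v, least significant byte first (assumed order). */
static void put_le(unsigned char *out, uint64_t v, size_t nbytes)
{
    size_t i;

    for (i = 0; i < nbytes; i++)
        out[i] = (unsigned char)(v >> (8 * i));
}

/* Write each field explicitly: no padding bytes and no uninitialized
 * memory end up in the output. Returns the number of bytes written. */
size_t record_serialize(const Record *r, unsigned char out[RECORD_WIRE_SIZE])
{
    assert(r != NULL && out != NULL);
    out[0] = r->flag;
    put_le(out + 1, r->timestamp, 8);
    put_le(out + 9, r->count, 2);
    return RECORD_WIRE_SIZE;
}
```

In UnrealDB terms this corresponds to using the integer and string functions per field rather than passing the whole struct to unrealdb_write_data().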