1. Introduction
ChaCha [1] is a popular cryptographic algorithm that is efficient, secure, and known by its versatility.
Developed as an improvement over its predecessor, Salsa20 [2], ChaCha is a stream cipher [3, 4] designed to provide a high level of encryption while being faster and more resistant to certain types of attacks. It was created by Daniel J. Bernstein in 2008 and has since become a widely used choice for securing data in various applications, including secure communication, disk encryption, and internet protocols.
One of the advantages of ChaCha is its simplicity and ease of implementation. It operates as a pseudorandom number generator, producing a stream of pseudorandom numbers, also known as a keystream, that is operated with the data to produce ciphertext. The design of ChaCha incorporates a 256-bit (32 byte) key, which ensures a high security against brute force attacks.
The ChaCha was developed to address some concerns and vulnerabilities found in the original Salsa20 stream cipher, which includes improvements in its resistance to linear and differential cryptanalysis attacks. These attacks attempt to exploit patterns and biases in the encryption process, aiming to recover the key or plaintext.
2. Xor Operator
The XOR (exclusive or) operator, often denoted as \(\oplus\), is a fundamental operation in cryptography. XOR is a binary operation that takes two bits as input and produces a single bit as output, following the rules:
\begin{array}{c|c|c} \text{Input} A & \text{Input} B & \text{Output} (A \oplus B) \\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \\ \end{array}The XOR operation can be succinctly described as follows: if the two input bits are different, the result is 1; if they are the same, the result is 0. This simple operation have big implications in cryptography.
In cipher algorithms, XOR is used for bitwise manipulation of data. It allows the encryption and decryption processes to be reversible, as XOR'ing the ciphertext with the same key produces the original plaintext. This property is known as the "self-inverting" property of XOR.
3. Key Stream
Key stream generation is one of the most critical component of many stream ciphers algorithms. The key stream is a sequence of pseudorandom bits that are combined (usually using the XOR operator) with the plaintext to produce the ciphertext. This process ensures the confidentiality and security of the data being transmitted or stored.
The key stream should exhibit the properties of true randomness, making it harder for an attacker to predict the next bit in the sequence without knowledge of the secret key. This unpredictability is essential for thwarting various types of attacks, including brute force and statistical attacks.
One difficulty in key stream generation is ensuring that the key stream does not repeat within a reasonable timeframe. If the key stream were to repeat, it would introduce vulnerabilities, as attackers could potentially exploit this repetition. Cipher algorithms are carefully designed to have an extremely long or even an effectively infinite key stream period, making it highly unlikely for repetition to occur.
The XOR operator is often used in stream ciphers along with their key streams. The advantage of using XOR in this context is that it provides a simple and efficient way to introduce the randomness and unpredictability (and reversibility) of the key stream into the encryption process.
It is of great importance that the key stream, generated with a given key, is easily reproducible. Reproducibility ensures that both the sender and receiver can independently regenerate the same key stream, enabling them to synchronize their encryption and decryption processes seamlessly. This reproducibility simplifies the implementation and operation of cipher algorithms, but also guarantees the consistency and integrity of encrypted data between different parties. Without it, secure communication would be impractical and the encryption system would be ineffective.
4. ChaCha
As previously mentioned, ChaCha is a widely-used stream cipher designed for encryption and decryption purposes. The algorithm used is based in the original work and takes indo account the following components:
- Constant: A 256-bit (16 byte) fixed constant number, usually called tag.
- Key: ChaCha uses a 256-bit (32 byte) key. The security of the cipher heavily relies on the secrecy and randomness of the key.
- Nonce: A 64-bit (8 byte) nonce is used in conjunction with the key to create a unique keystream for each message.
- Counter: A 64-bit (8 byte) counter that can be initialized to anything. This counter is incremented with each block of keystream generated.
- Stream Matrix: The stream matrix is a 4x4 matrix (elements of 4 bytes, 64 bytes in total) that holds the key stream to be used initialized with zeros.
- Key position: The index in the stream matrix in which the byte will be retrivied
- Block Matrix: The block matrix is a 4x4 matrix (elements of 4 bytes, 64 bytes in total) filled with the constant, key, nonce, and counter according to:
The nonce component ideally should be set uniquely for each message. However, in cases where setting a nonce at decryption time is not feasible, a fixed nonce is employed, and the achieved security level relies on other aspects of the cipher. Similarly, the constants and the counter are also fixed. The counter serves the purpose of tracking the number of rounds the block matrix undergoes, although this information is not utilized in practice and can be set to any value.
4.1. Initialization
Considering the aforementioned components and their structure, it is necessary to have a function whose responsibility is to prepare the entire initial state of the algorithm:
- Have defined within its body the constant, key, nonce, and counter.
- Fill both matrices (block and stream) with 0.
- Set the key position to 64 (invalid position).
- Copy the constant to positions [0, 16) bytes of the block matrix.
- Copy the key to positions [16, 48) bytes of the block matrix.
- Copy the counter to positions [48, 56) bytes of the block matrix.
- Copy the nonce to positions [56, 64) bytes of the block matrix.
4.2. Hash block
From the block matrix, it is necessary to generate the keystream, which is stored in the stream matrix. To do this, an endomorphic function \(f: \mathbb{Z}^{4\times4} \mapsto \mathbb{Z}^{4\times4}\) is considered, which takes the block matrix as input and repeatedly and deterministically applies a binary diffusion operation.
Initially, in \(f\), we copy the content of the block matrix to the stream matrix. Next, we repeatedly apply the same binary diffusion operation. This operation takes four elements from the matrix as input and also returns four elements, overwriting the ones that were taken as input from the matrix. The same 8 sets of 4 elements are chosen for each repetition of the application. These 8 sets consist of: each column of the matrix, from left to right, and each diagonal in the same direction as the main diagonal, including the main diagonal itself, from left to right, wrapping around when necessary.
- Columns:
\vspace{0.4cm}
\begin{array}{cccc} tag & \textbf{tag} & tag & tag \\ key & \textbf{key} & key & key \\ key & \textbf{key} & key & key \\ counter & \textbf{counter} & nonce & nonce \end{array}\vspace{0.4cm}
\begin{array}{cccc} tag & tag & \textbf{tag} & tag \\ key & key & \textbf{key} & key \\ key & key & \textbf{key} & key \\ counter & counter & \textbf{nonce} & nonce \end{array}\vspace{0.4cm}
\begin{array}{cccc} tag & tag & tag & \textbf{tag} \\ key & key & key & \textbf{key} \\ key & key & key & \textbf{key} \\ counter & counter & nonce & \textbf{nonce} \end{array}- Diagonals:
\vspace{0.4cm}
\begin{array}{cccc} tag & \textbf{tag} & tag & tag \\ key & key & \textbf{key} & key \\ key & key & key & \textbf{key} \\ \textbf{counter} & counter & nonce & nonce \end{array}\vspace{0.4cm}
\begin{array}{cccc} tag & tag & \textbf{tag} & tag \\ key & key & key & \textbf{key} \\ \textbf{key} & key & key & key \\ counter & \textbf{counter} & nonce & nonce \end{array}\vspace{0.4cm}
\begin{array}{cccc} tag & tag & tag & \textbf{tag} \\ \textbf{key} & key & key & key \\ key & \textbf{key} & key & key \\ counter & counter & \textbf{nonce} & nonce \end{array}At this point, we have the stream matrix as a copy of the block matrix operated with binary diffusion. To finalize the generation of the stream matrix, we simply add the block matrix to it. Binary overflow due to this addition is possible, but it can be ignored since the decimal representation is not relevant, and therefore, arithmetic concerns are not taken into account.
Finally, as a purely illustrative simplification, the generation of the keystream is given by:
\(StreamMatrix = BlockMatrix + \text{RepeatedBinaryDifusion}(BlockMatrix)\)
4.3. Quarter-Round
Previously, we mentioned the existence of a binary diffusion function that operates on 4 elements of the stream matrix, and we need to define it. However, before that, we need to learn about two more bitwise operations, \texttt{leftRotate} and \texttt{rightRotate}, which are binary rotations. Their definitions extend the binary shift operation present in the C language, with the difference that instead of adding leading zeros to the beginning or end of the bits, the bits rotate, with the first bit becoming the last, or vice versa for the other operation.
We call the binary diffusion function \texttt{qRoundF}, and it takes 4 in-out arguments:
\begin{lstlisting} qRoundF(a, b, c, d) is auxA = a + b auxB = b auxD = (leftRotate(d $\oplus$ auxA), 16) auxC = c + auxD auxB = leftRotate(auxB $\oplus$ auxC, 16) auxB = rightRotate(auxB, 4) auxA = auxA + auxB auxD = leftRotate(auxD $\oplus$ auxC, 8) auxC = auxC + auxD auxB = leftRotate(auxB $\oplus$ auxC, 8) auxB = rightRotate(auxB, 1) return auxA, auxB, auxC, auxD end \end{lstlisting}Essa função é uma variação do procedimento original, em que \texttt{<<<=} represeta \texttt{leftRotation}:
This function is a variation of the original procedure that is not used, where \texttt{<<<=} represents \texttt{leftRotation}:
a += b; d ^= a; d <<<= 16; c += d; b ^= c; b <<<= 12; a += b; d ^= a; d <<<= 8; c += d; b ^= c; b <<<= 7;
Viewing the stream matrix as a sequential list in the usual order, and with \texttt{xi} representing the i-th element within this list, the following sequence is applied repeatedly, updating the same input elements with the output of the function:
qRoundF(x0, x4, x8, x12) qRoundF(x1, x5, x9, x13) qRoundF(x2, x6, x10, x14) qRoundF(x3, x7, x11, x15) qRoundF(x0, x5, x10, x15) qRoundF(x1, x6, x11, x12) qRoundF(x2, x7, x8, x13) qRoundF(x3, x4, x9, x14)
The number of times this sequence is applied is typically referred to as ``rounds''.
4.4. Usage
For the use of what has been described so far, it is only necessary to apply the stream matrix to any message and generate a new stream matrix from the block matrix.
For the application of the stream matrix to the message, it is first considered that the matrix can be interpreted as an array of 64 bytes (4 columns, 4 rows, 4 bytes per element). With this in mind, two cursors are implemented, one on top of the message and another on top of the stream matrix, both initialized at zero. In this way, the first byte is operated with the first byte of the stream matrix, and so on, incrementing both cursors by 1.
If the message cursor reaches the end, the operation is complete, but the stream matrix cursor will be stored for the next use, considering that the message may not arrive in its entirety. If the message has been completely received, it is the responsibility of the user of this algorithm to reset it to its initial state.
In the other case, when the key matrix cursor reaches the end but the message cursor does not, it is only necessary to regenerate a new stream matrix. However, it is necessary to modify the block matrix since generating the new stream matrix would result in the same matrix. Ideally, the keystream (obtained from the key matrix) should be a pseudo-random sequence and not periodic.
Therefore, whenever all the bytes of the stream matrix are used up, a new one should be generated from the current state of the block matrix, and after it is generated, update the block matrix, increasing its entropy. The chosen method increments, byte by byte, the values contained in the counter segment. The responsible for the increment is called the ``mixer'', which has an initial value of 1 and is 16 bits in size. We add the first byte of the counter to the mixer and overwrite the first byte of the counter with the last 8 bits of the result. Note that we deliberately do not use the 8 most significant bits of the mixer because our next step is to apply an 8-bit right shift to the mixer, transforming the most significant bits into the least significant bits. From there, we move on to the next byte of the counter, with the mixer not necessarily equaling 1, and repeat the same process. If you observe this behavior carefully, you will notice that the first byte of the counter will show a counter of how many times the key matrix was generated (considering the subtraction of the initial value), and from the second byte onwards, a more explosive count, quickly reaching the limit of the representation of a single byte and increasing the overall entropy of the block matrix as a whole.
To minimize the difficulty, the existence of functions encapsulating basic file handling functionalities should also be considered.
The first functionality should concern itself with indicating, at the moment when a file is opened for reading, whether it is known to be encrypted or not. This can be easily done by attempting to read the first 4 bytes at the beginning of the file, and if those 4 bytes match the format defined for it (commonly referred to as ``MAGIC''), the file is considered encrypted, and these first bytes read should be discarded. For reading, with the file open, you can read as many bytes as desired, and if it is indicated that the file in context is encrypted, you simply apply the procedure previously elucidated using the stream matrix, resulting in decrypted bytes. Finally, for file closure, just position the cursor at the end of the stream matrix (forcing it to generate a new one for a future file), remove the indication that the file in context is encrypted, and close the file normally, as usual.
5. References
\noindent [1] BERNSTEIN, Daniel J. et al. ChaCha, a variant of Salsa20. In: Workshop record of SASC. 2008. p. 3-5.
\vspace{0.3cm}
\noindent [2] BERNSTEIN, Daniel J. The Salsa20 family of stream ciphers. In: New stream cipher designs: the eSTREAM finalists. Berlin, Heidelberg: Springer Berlin Heidelberg, 2008. p. 84-97.
\vspace{0.3cm}
\noindent [3] ROBSHAW, M. Stream Ciphers. RSA Laboratories Technical Report TR-701. Redwood City, CA, 1995.
\vspace{0.3cm}
\noindent [4] KLEIN, Andreas et al. Stream ciphers. London: Springer, 2013.