Introduction
BankBI strongly advises that no personally identifiable information (PII) is uploaded to our cloud analytics platform. There are, however, circumstances where it is desirable to be able to cross reference the output of a BankBI report with additional data stored within the organisation’s source systems.
To simplify the process of uploading such data to BankBI, our Upload Agent has been enhanced to automatically encrypt specified columns before they are uploaded to the BankBI cloud analytics platform. In addition, the report viewer can enter the encryption password in the browser to reveal the original values entirely within the secure boundary of the organisation’s infrastructure.
This document describes how the data are encrypted and decrypted, and the algorithms used, to assist with the compliance assessment of IT security.
Deterministic Encryption
By default, randomised data are used during the encryption process so that the same input results in differing output. This makes it harder for a malicious actor to derive any useful information from the encrypted data. However, where the value needs to be consistent to allow sorting, grouping, or joining between different files, it must be ensured that the same input always produces the same output.
This is achieved by marking such fields in configuration as requiring deterministic encryption. For fields so marked, random values are not used. Instead, the value that would normally be random (the Initialisation Vector) is derived by applying a one-way hash function to the input data. Inputs A and B will therefore still use different Initialisation Vectors, but A will always encrypt to X and B will always encrypt to Y.
Generating the IV from the HMAC of the input value provides more uncertainty in the output than using a fixed IV. Additionally, using an HMAC with the column tag as a key prevents the same input data in two unrelated (differently tagged) columns from generating the same ciphertext.
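The derivation described above can be sketched as follows. This is a minimal illustration rather than the Upload Agent's actual code: the function name, the UTF-8 encoding of the tag and value, and the truncation of the 32-byte HMAC SHA-256 digest to a 12-byte IV are all assumptions, as this document does not specify them.

```python
import hashlib
import hmac

IV_LENGTH = 12  # AES-GCM IV length, as listed in the binary format table


def deterministic_iv(column_tag: str, value: str) -> bytes:
    """Derive a repeatable IV from the plaintext, keyed by the column tag."""
    # Assumption: tag and value are UTF-8 encoded, and the 32-byte digest
    # is truncated to its first 12 bytes. The source specifies neither.
    digest = hmac.new(
        column_tag.encode("utf-8"), value.encode("utf-8"), hashlib.sha256
    ).digest()
    return digest[:IV_LENGTH]


# The same value in the same column always yields the same IV ...
same = deterministic_iv("account_id", "12345") == deterministic_iv("account_id", "12345")
# ... while a different column tag yields an unrelated IV for the same value.
differs = deterministic_iv("account_id", "12345") != deterministic_iv("branch_id", "12345")
```

Because the column tag is the HMAC key, two columns holding the same value only share an IV if they are deliberately given the same tag, which is what allows joining between files.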
Client-Side Decryption
A key requirement in the design of this encryption process is to ensure that the encryption password never needs to be communicated to BankBI. Indeed, it never needs to leave the confines of the organisation’s network.
When encrypted data are included in reports in the BankBI cloud reporting application, JavaScript running entirely within the web browser, using native functionality offered by the web platform, can decrypt the data when an authorised user enters the encryption password. Values in encrypted columns remain encrypted, and opaque to BankBI, both when stored in the BankBI cloud analytics platform and when transmitted across the internet.
For regulatory reporting applications, decryption is managed by an Excel plugin created by BankBI. The plugin uses the same decryption functionality as the web application, so decryption takes place entirely within the Excel application. Only encrypted data are downloaded in the Excel report, and they are decrypted without further communication with BankBI.
Algorithm Selection
Symmetric Encryption
Data are encrypted with the AES-GCM authenticated encryption algorithm using a 256-bit key. The Advanced Encryption Standard (AES) is the symmetric encryption algorithm recommended by NIST[1] and ENISA[2]. The Galois/Counter Mode (GCM) of operation provides both confidentiality and integrity, and defends against chosen-ciphertext attacks.
Master Encryption Key Derivation
To ensure a user-friendly decryption process, a Master Encryption Key (MEK) is derived from an encryption password that can be entered into the user interface by an authorised user. The encryption password is passed, along with a random salt, to the password-based key derivation function PBKDF2 to generate the MEK. PBKDF2 is recommended by NIST[3]; although that recommendation dates from 2010, it remains current and is referenced by a much later publication[4].
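As a rough sketch of this step, Python's standard library exposes PBKDF2 directly. The underlying hash (SHA-256), the 32-byte output length, the example password and the iteration count below are assumptions for illustration; this document states only that the salt is 16 bytes and that the iteration count is stored in two bytes.

```python
import hashlib
import secrets


def derive_mek(password: str, salt: bytes, iterations: int) -> bytes:
    """Derive a Master Encryption Key from the encryption password."""
    # Assumption: PBKDF2 with HMAC SHA-256 and a 32-byte output.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


# Hypothetical values: the salt is generated once at installation and stored.
mek_salt = secrets.token_bytes(16)
mek = derive_mek("example encryption password", mek_salt, 50_000)
```

Anyone holding the password, the salt and the iteration count can repeat this derivation, which is why the salt and iteration count travel with each encrypted value while the password does not.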
Content Encryption Key Derivation
The MEK is not used directly to encrypt any content. Instead, it acts as the input key material to a key-based key derivation function, HKDF[5], which optionally combines the MEK with a random salt and a column-specific tag to produce a Content Encryption Key (CEK). For deterministic encryption, the salt is omitted and a column-specific tag is required. For randomised encryption, the salt is a pseudo-random value generated once per file upload.
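The sketch below implements the HKDF extract-and-expand construction with the standard library and applies it to both cases. The choice of SHA-256, the 32-byte CEK length, the 16-byte salt length, the helper names and the example tags are assumptions for illustration, not details confirmed by this document.

```python
import hashlib
import hmac
import secrets

HASH = hashlib.sha256  # assumption: the HKDF hash is not stated in this document


def hkdf(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF: extract a pseudo-random key, then expand it to `length` bytes."""
    # Extract: an omitted (empty) salt is treated as a block of zero bytes.
    prk = hmac.new(salt or bytes(HASH().digest_size), ikm, HASH).digest()
    # Expand: chain HMAC blocks over the previous block, the info and a counter.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_cek(mek: bytes, column_tag: str = "", file_salt: bytes = b"") -> bytes:
    """Derive a Content Encryption Key for one column from the MEK."""
    return hkdf(mek, file_salt, column_tag.encode("utf-8"))


mek = bytes(32)  # placeholder MEK for the example only
file_salt = secrets.token_bytes(16)  # randomised mode: generated once per upload
randomised_cek = derive_cek(mek, "customer_name", file_salt)
deterministic_cek = derive_cek(mek, "account_id")  # no salt, tag required
```

With no salt, the deterministic CEK depends only on the MEK and the column tag, so it is identical across uploads; the randomised CEK changes whenever a new file salt is generated.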
Asymmetric Encryption
Asymmetric encryption is used only to wrap the MEK and store it securely on disk. Unwrapping the MEK requires access to the private key and therefore loss of the configuration file from the disk of the Upload Agent machine is inconsequential without the associated private key.
RSA-OAEP is used as the wrapping encryption algorithm, using an RSA certificate stored in the local machine’s certificate store.
Whilst Elliptic Curve asymmetric encryption may be the preferred choice for new projects, the presence of an existing RSA key management infrastructure within BankBI used for client authentication by the Upload Agent drove the selection of RSA-OAEP in the absence of any known vulnerabilities. As the use of this algorithm is constrained to the securing of the MEK on the UA machine, it can be modified in the future if required without impacting the rest of the system design.
Web Crypto API
As well as being industry standard algorithms, those selected have also been constrained to those available in the W3C Web Crypto API standard[6]. This ensures that the decryption process can be managed entirely within the web browser.
Ciphertext Definition
As a consequence of the encryption process detailed above, decrypting and presenting the original value requires several pieces of information in addition to the encryption password and the ciphertext. These are encapsulated within the value written to the encrypted CSV and uploaded to the BankBI cloud analytics platform.
Binary Format (Version 0x0)
Field | Length (bytes) | Description |
Version | 1 | Encrypted field layout version |
MEK Salt | 16 | PBKDF2 salt |
MEK Iterations | 2 | Iterations used by PBKDF2 (Little Endian) |
CEK Salt Length | 1 | Length of the CEK Salt in bytes (ns) |
CEK Salt | ns | HKDF salt |
CEK Info Length | 1 | Length of the CEK Info in bytes (ni) |
CEK Info | ni | HKDF info parameter |
Initialisation Vector | 12 | AES Initialisation Vector |
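To make the layout concrete, the following sketch packs and parses a value using the field order in the table. The type and function names are hypothetical, and treating every byte after the Initialisation Vector as ciphertext is an assumption based on the encryption process described later, since the table itself does not list a ciphertext field.

```python
import struct
from typing import NamedTuple

FIXED_HEADER = "<B16sH"  # version, MEK salt, MEK iterations (little endian)
IV_LENGTH = 12


class EncryptedField(NamedTuple):
    version: int
    mek_salt: bytes
    mek_iterations: int
    cek_salt: bytes
    cek_info: bytes
    iv: bytes
    ciphertext: bytes  # assumption: all bytes following the IV


def pack_field(f: EncryptedField) -> bytes:
    """Serialise the fields in the order given by the table."""
    if len(f.mek_salt) != 16 or len(f.iv) != IV_LENGTH:
        raise ValueError("MEK salt must be 16 bytes and IV must be 12 bytes")
    return b"".join([
        struct.pack(FIXED_HEADER, f.version, f.mek_salt, f.mek_iterations),
        bytes([len(f.cek_salt)]), f.cek_salt,  # ns, then the HKDF salt
        bytes([len(f.cek_info)]), f.cek_info,  # ni, then the HKDF info
        f.iv,
        f.ciphertext,
    ])


def parse_field(data: bytes) -> EncryptedField:
    """Read the fields back, using the two length bytes to find each boundary."""
    version, mek_salt, iterations = struct.unpack_from(FIXED_HEADER, data, 0)
    pos = struct.calcsize(FIXED_HEADER)
    ns = data[pos]
    cek_salt = data[pos + 1:pos + 1 + ns]
    pos += 1 + ns
    ni = data[pos]
    cek_info = data[pos + 1:pos + 1 + ni]
    pos += 1 + ni
    iv = data[pos:pos + IV_LENGTH]
    if len(iv) != IV_LENGTH:
        raise ValueError("value is too short to contain a full IV")
    return EncryptedField(version, mek_salt, iterations, cek_salt, cek_info,
                          iv, data[pos + IV_LENGTH:])


# A deterministic-mode value: empty CEK salt (ns = 0) and a column tag as info.
example = EncryptedField(0, bytes(16), 50_000, b"", b"account_id", bytes(12), b"\x01\x02\x03")
assert parse_field(pack_field(example)) == example
```

Note that a two-byte iteration field limits the PBKDF2 iteration count in this layout version to 65,535.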
Text Format
To safely write these binary data to CSV and subsequently store and display in reports, the data are encoded to text using Base64 encoding.
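A short illustration of why the encoding is needed: raw bytes can include commas, quotes and line breaks, all of which would corrupt a CSV row. The sketch assumes the standard Base64 alphabet with padding, as this document does not say whether a URL-safe variant is used; the function names are hypothetical.

```python
import base64


def to_text(binary_value: bytes) -> str:
    """Encode the binary representation as CSV-safe ASCII text."""
    return base64.b64encode(binary_value).decode("ascii")


def from_text(text_value: str) -> bytes:
    """Recover the binary representation; rejects characters outside the alphabet."""
    return base64.b64decode(text_value, validate=True)


# Bytes that would break a CSV row if written raw: NUL, comma, newline, quote.
raw = bytes([0x00, 0x2C, 0x0A, 0x22, 0xFF])
text = to_text(raw)
assert from_text(text) == raw
```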
Encryption Process
Upload Agent Installation
- User enters the Encryption Password into the installer
- The installer generates the MEK salt and stores this in configuration
- The installer uses the Encryption Password and MEK salt to derive the Master Encryption Key using PBKDF2
- The MEK is encrypted with the authentication certificate’s public key and saved in configuration
File Upload
- The authentication certificate’s private key is used to decrypt the MEK from configuration
- For each column configured for encryption, a Content Encryption Key is derived using HKDF
  - For randomised encryption, generate a random salt; for deterministic encryption, do not use a salt
  - If configured, the column tag is used as the info parameter to the HKDF implementation
- For each record in a column configured for encryption
  - Generate the initialisation vector
    - For randomised encryption, generate a random initialisation vector
    - For deterministic encryption, generate the IV from the HMAC SHA-256 of the input using the column tag as the key
  - Encrypt the input value using AES-GCM with the column’s content encryption key and the current value’s initialisation vector
  - Produce the binary representation of the various salts, IV and ciphertext
  - Write the Base64 encoded binary representation
Report View
- User enters the Encryption Password into the report application
- For each encrypted value encountered
  - Extract the MEK salt from the encrypted value and use this and the entered Encryption Password to derive the MEK using PBKDF2
  - Use the CEK salt and info to derive the CEK
  - Decrypt the value by passing the derived CEK, along with the IV and ciphertext from the encrypted value, to the AES-GCM algorithm
  - If the password is incorrect or the encrypted data have been modified, decryption fails, so the encrypted value continues to be displayed rather than incorrectly decrypted data
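The fallback behaviour in the last step can be sketched as below. The decrypt callable stands in for the key derivation and AES-GCM steps above and is entirely hypothetical, as are the exception and function names; the report application itself performs this step in JavaScript.

```python
from typing import Callable


class DecryptionError(Exception):
    """Raised by the hypothetical decrypt callable when authentication fails."""


def display_value(stored: str, decrypt: Callable[[str], str]) -> str:
    """Return the plaintext if decryption succeeds, else the stored value."""
    try:
        return decrypt(stored)
    except DecryptionError:
        # Wrong password or tampered data: keep showing the encrypted text.
        return stored


def always_fails(_: str) -> str:
    raise DecryptionError("authentication tag mismatch")


shown = display_value("QUJD", always_fails)
```

Because AES-GCM is authenticated, a wrong key is detected as a failure rather than producing plausible-looking garbage, which is what makes this fallback safe.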
[1] FIPS-197 https://csrc.nist.gov/publications/detail/fips/197/final
[2] Algorithms, key size and parameters report 2014 https://www.enisa.europa.eu/publications/algorithms-key-size-and-parameters-report-2014
[3] Recommendation for Password-Based Key Derivation https://csrc.nist.gov/publications/detail/sp/800-132/final
[4] Recommendation for Cryptographic Key Generation https://csrc.nist.gov/publications/detail/sp/800-133/rev-1/final