A Simple Introduction to the Architecture of Salesforce Platform Encryption

The architecture of the Salesforce Platform Encryption solution is described here.

I thought I’d have a go at writing a simplified version in a way that’s easy for me to understand, starting with the encryption of the data and then moving out to key management.

Encryption Basics

In this post I’m going to assume a certain amount of knowledge about encryption but let’s start with some simplified basics.

Symmetric encryption is where you have the same key to both encrypt (for privacy) and decrypt the data. This is the fastest way to encrypt/decrypt but it is also the easiest to crack and if you lose the key then you’re in trouble e.g. if you encrypt something with the key and someone else wants to decrypt it then they need to have the same key and then there’s nothing to stop them imitating you.

Symmetric Encryption

Public key encryption (PKI) addresses this issue using key pairs. The key that does the encryption is different to the key that does the decryption. The key that does the encryption (the public key) can be made public, anyone can use it to encrypt but only the holder of the other half of the pair (the private key) will be able to decrypt it.

Public Key Encryption

The same public key technology can be used for signing (for authentication). Someone can use their private key to sign something and people with the corresponding public key will be able to verify that the sender used that private key. Public key encryption is sometimes called asymmetric because the encrypting/decrypting keys are different.

Public Key Authentication

Asymmetric security is more secure than symmetric because you don’t have to share the encrypting key and it takes longer to crack but it also takes longer to encrypt and so sometimes the performance impact can be too high. So, typically, a combination of the two is used. The symmetric key is used for the encryption/decryption but its distribution and storage is protected using the public key technology.

Salesforce Security

Salesforce has always been a very secure platform, using a range of services such as encryption of the data in transit, two factor authentication, verification of login address, profiles, permissions and penetration tests. They are now adding to this a new feature called Platform Encryption which allows customers to optionally encrypt some fields at rest i.e. while they are stored in the Salesforce database.

How does Salesforce Platform Encryption Work?

Salesforce uses a symmetric encryption key to encrypt the customer data that it stores. (The symmetric encryption used is AES with 256-bit keys using CBC mode, PKCS5 padding, and random initialization vector (IV).) The symmetric mode gives the performance benefit but means that the key needs to be closely protected. For this reason the Data Encryption Key (which is also the decryption key) is never transmitted or even written to disk (persisted). It is created/derived in the Salesforce platform and never leaves. It is created in a component of the platform called the Key Derivation Server.

Platform Encryption Architecture

So this brings us to the question of how is it created, and how can we ensure that it’s the same when it’s recreated to do the decryption? Also, given that this is a multi-tenant environment, what is the customer specific component? The answer is that the encryption key is derived/created from a combination of a Salesforce component and customer/tenant specific component. These are called secrets. Sometimes they are also referred to as key fragments.

The encryption key is generated from the master secret (Salesforce component) and the tenant secret (customer component) using PBKDF2 (Password-Based Key Derivation Function 2). The derived data encryption key is then securely passed to the encryption service and held in the cache of an application server.

Key Derivation Server

The Write Process

So, to write an encrypted record, Salesforce retrieves the Data Encryption Key from the cache and performs the encryption. As well as writing the encrypted data into the record it also stores the IV and the id of the tenant secret.

The Read Process

Similarly, to decrypt the data Salesforce reads the encrypted data from the database and if the encryption (decryption) key is not in the cache then it needs to derive it again using the associated tenant secret, and then it decrypts using the key and the associated IV.

So, we’ve established that the data can’t be accessed without the data encryption key and that this key can’t be accessed without the master and tenant secrets, but how do we know that the secrets are secure?

Generation of Secrets

Remember that for this discussion, there is one master secret for Salesforce itself, and a tenant secret and key derivation server for each customer. Actually these secrets are regularly replaced, which is why we need to keep their ids.

The master secret is created by a dedicated air gapped HSM. It is then encrypted using the key derivation server’s public key (tenant wrapping key) and signed with the HSM’s private key (master wrapping key) and transported to the key derivation server where it is stored.

Master HSM

The tenant secret is created on the key derivation server, with a different HSM. This is initiated by the customer who connects using their usual transport level security. It is then encrypted with the tenant wrapping key (public key) and stored in the database. The tenant secret never leaves the key derivation server and can only be accessed using the tenant wrapping key private key which also never leaves the key derivation server.

The Transit Key

A unique transit key is generated on the a Salesforce Platform application server each time it boots up. The transit key is used to encrypt the derived data encryption key before it’s sent back from the key derivation server to the encryption service. The transit key is a symmetric key but itself is encrypted with an asymmetric key, created by the master HSM, to get it to the key derivation server.

That’s Enough For Now

There’s a lot more that can be explained. There are more keys for more parts of the process. There are more distribution processes, and processes for updating the keys and keeping the system working using updated keys. There are processes for archiving data and keys, and for destroying the archives. But for now, I think I’ve understood enough to be comfortable with the way platform encryption works and the extra layer of security that it provides. Please let me know if you spot any glaring errors. For more detail please see the original document or suggest future posts.