Posts Tagged 'security'

Closing the holes in MQ security

In choosing the default settings for MQSeries, IBM has had to strike a balance between making the product easy to use as quickly as possible and making it secure straight out of the box. In more recent releases, they have put more emphasis on ease of use and so relaxed the default security settings. This is one of the reasons why administrators must now reconfigure their systems if they require them to be secure. This article examines some of the potential security holes of which administrators should be aware, and also describes ways in which administrators can close these holes.

Default channel definitions

There are a number of objects, such as SYSTEM.DEF.SVRCONN and SYSTEM.DEFAULT.LOCAL.QUEUE, that are created by default when you install and configure a queue manager. These are really intended only as definitions to be cloned for their default attributes in the creation of new objects. However, a potential infiltrator can exploit the fact that they are also well-defined objects that probably exist on your system.

Originally, on distributed platforms, the definition of channel SYSTEM.DEF.SVRCONN had its MCAUSER parameter set to ‘nobody’. IBM had so many complaints from users who couldn’t get clients connected that it has now changed this parameter to blank (‘ ’).

The MCAUSER parameter specifies the userid that is checked when an inbound message is put on a queue. Setting this field to blank means that the authority of the userid running the channel (usually ‘mqm’) is checked. In other words, messages are always authorized to be put on all queues.

The thinking behind putting ‘nobody’ in this field is that no one should be allowed to put messages on queues unless the administrator actually changes settings to allow them to do so. Unfortunately this default setting was not documented and so users could not work out how they were required to change things.

There are many users who don’t need client channels and so haven’t even read this section of the manual. They’re unaware that nowadays, with default settings in place, anyone who can connect to their machine (for instance, someone on the same LAN) can start a client channel to them called SYSTEM.DEF.SVRCONN and have access to put messages on any of their queues and – often more importantly – to get messages from any of their queues.

This is not an entirely new problem – even the original systems suffered from it, as there are other channels, such as SYSTEM.DEF.RECEIVER and SYSTEM.DEF.REQUESTER, that have always had a blank MCAUSER. With a little effort, users have always been able to connect to these and put messages on queues using full authority. If the queue manager is the default one, the infiltrator needs no prior knowledge of the system.

As previously mentioned, these definitions are used to provide defaults for the creation of new channels. This means that, in many systems, newly created channels also have MCAUSER set to blank.

It is recommended that the following commands be executed using RUNMQSC to close this loophole:

alter chl(SYSTEM.DEF.SVRCONN) chltype(SVRCONN) trptype(LU62) +
      mcauser(NOBODY)
alter chl(SYSTEM.DEF.RECEIVER) chltype(RCVR) trptype(LU62) +
      mcauser(NOBODY)
alter chl(SYSTEM.DEF.REQUESTER) chltype(RQSTR) trptype(LU62) +
      mcauser(NOBODY)
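To make the effect of MCAUSER concrete, here is a toy model in Python (not real MQ code; the permission table and queue name are invented for illustration) of how a blank MCAUSER falls back to the authority of the userid running the channel:

```python
# Toy model (not real MQ code) of how the MCAUSER setting changes which
# userid the queue manager checks for an inbound channel. The permission
# table and queue name below are invented for illustration.

PERMISSIONS = {
    "mqm": {"*"},      # the MQ administration id may put to every queue
    "NOBODY": set(),   # a dummy id with no authority at all
}

def effective_user(channel_userid, mcauser):
    """A blank MCAUSER falls back to the userid running the channel."""
    return channel_userid if mcauser.strip() == "" else mcauser

def can_put(channel_userid, mcauser, queue):
    allowed = PERMISSIONS.get(effective_user(channel_userid, mcauser), set())
    return "*" in allowed or queue in allowed

# Default SYSTEM.DEF.SVRCONN: blank MCAUSER, channel runs as mqm.
print(can_put("mqm", " ", "PAYROLL.QUEUE"))      # True: full access for anyone
# After ALTER ... MCAUSER(NOBODY): denied until explicitly authorized.
print(can_put("mqm", "NOBODY", "PAYROLL.QUEUE")) # False
```

This is why the ALTER commands above close the hole: nothing connecting over the default channels is authorized until the administrator deliberately grants access.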

Do not start MQ using root

It’s worth noting that much of this section is described in Unix terms, though it’s applicable to most platforms, once Unix terms are substituted with their equivalents.

All MQSeries components should be started using the MQSeries administration userid (mqm). Many system administrators like to make the system administration userid (root) a member of the mqm group. This is understandable, as they can then run all of their administration commands, not all of which are for MQ, as root. However, this is a very dangerous thing for them to do as they are effectively giving root authority to all of the members of the mqm group.

For example, if the trigger monitor of the default queue manager is started by root using default parameters, a member of the mqm group whose workstation has the network address ‘myhost’ can enter the following commands using RUNMQSC:

DEFINE QL(MYQUEUE) TRIGGER PROCESS(MYPROCESS) +
       INITQ(SYSTEM.DEFAULT.INITIATION.QUEUE)
DEFINE PROCESS(MYPROCESS) APPLICID('xterm -display myhost:0 &')

and then enter the command:

echo hello | amqsput MYQUEUE

This causes a terminal to appear on their screen giving them a command line with root authority from which they have full control of the system.

Similarly, if a channel is started by root, or the channel initiator starts a channel and the channel initiator is started by root, then any exits called by the channel will run as root. So the mqm member could write and install an exit that again spawns a root-authorized xterm.

The receiver channel could have the same problems, for example, if started as root by the listener, inetd, or Communications Manager.

A good start to overcoming this problem is to remove root from the mqm group. However, on some systems root will still have access to the strmqm command and, while it may look as though it has started the queue manager, there may be unexpected errors later when it performs commands for which the OAM checks authority.

The system administrator may find it useful to create commands that only root is authorized to run, which switch to the mqm userid before performing the instruction. For example, the following shell script could be called strmqm and placed higher in root’s PATH than the real strmqm.

#!/bin/ksh
# Switch to the mqm userid before running the real strmqm
su - mqm -c "/usr/lpp/mqm/bin/strmqm $1"

Only use groups with the Unix OAM

The setmqaut command is used to set access to MQSeries objects. Among its parameters you may specify ‘-p PrincipalName’ or ‘-g GroupName’ to indicate to which users you intend this command to apply.

For example, the following command specifies that all members of the group tango are to be allowed to put messages on queue orange.queue on queue manager saturn.queue.manager:

setmqaut -m saturn.queue.manager -n orange.queue -t queue -g tango +put

Similarly, the command:

setmqaut -m saturn.queue.manager -n orange.queue -t queue -p theuser +put

specifies that the userid theuser should be allowed to put messages on queue orange.queue on queue manager saturn.queue.manager. On most platforms this works fine. However, the implementation on Unix systems is that:

setmqaut -m saturn.queue.manager -n orange.queue -t queue -p theuser +put

specifies that all of the members of theuser’s primary group are allowed to put messages on queue orange.queue on queue manager saturn.queue.manager.

This can be very dangerous, as a system administrator can give access to a particular user unaware that in doing so he has accidentally also given access to many other users. User theuser may also be unhappy to be blamed by administrators for actions that they believe only he is authorized to have carried out.

The way around this problem is never to use the ‘-p’ parameter on Unix. The same effect can be obtained by specifying ‘-g PrimaryGroup’, which is a lot clearer.
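A toy model (not real OAM code; the users and groups are invented) makes the trap obvious: on Unix a grant made with ‘-p user’ is recorded against the user’s primary group, so everyone sharing that primary group silently gains the same access.

```python
# Toy model (not real OAM code) of the Unix behaviour described above:
# a grant made with '-p user' is recorded against the user's PRIMARY GROUP,
# so everyone sharing that primary group silently gains the same access.
# The users and groups here are invented for illustration.

PRIMARY_GROUP = {"theuser": "staff", "colleague": "staff", "outsider": "team"}
queue_acl = set()  # groups granted +put on orange.queue

def setmqaut_p(principal):
    """Model of 'setmqaut ... -p principal +put' on Unix."""
    queue_acl.add(PRIMARY_GROUP[principal])

def can_put(user):
    return PRIMARY_GROUP[user] in queue_acl

setmqaut_p("theuser")        # administrator intends to authorize theuser only
print(can_put("theuser"))    # True - as intended
print(can_put("colleague"))  # True - unintended: same primary group 'staff'
print(can_put("outsider"))   # False - different primary group
```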

Only create objects as mqm on Unix

As described above, MQSeries on Unix performs all of its security checks using the primary group of a userid rather than, as you might expect, the userid itself. This has other knock-on effects.

When a queue is created, access to it is automatically granted to the mqm group and to the primary group of the userid that created it. It’s quite reasonable for someone designing the security of an MQSeries infrastructure to assume that access to all queues has been forbidden to all users except members of the mqm group. From here, the administrator would specify additional security settings that need to be made.

This works fine when queues are created either by the mqm user or by someone whose primary group is mqm. The problem arises when another user whose primary group is, for instance, staff, but who is also a member of mqm, defines the queue. In this case authority is also granted automatically and unintentionally to all members of the staff group.

This also applies to the creation of queue managers. If a queue manager is created by a userid whose primary group is staff, then all members of staff by default have access to the queue manager.

The simplest solution to this problem is to enforce a policy whereby no userid other than mqm may create MQSeries objects or queue managers. An alternative policy is never to make a userid a member of the mqm group unless this is its primary group.

OAM uses union

The Object Authority Manager uses the union of the authority settings that it finds. So, to take the example above a step further, suppose a queue, orange.queue, is created by a userid whose primary group is staff. At some point later it is found that another userid, worker, who shouldn’t have access to the queue, is nevertheless able to access it. worker is a member of staff but has team as his primary group. To resolve this problem an administrator might try running:

setmqaut -m saturn.queue.manager -n orange.queue -t queue -p worker -all

However, this will not solve the problem. While it will remove team from the authorization list, members of staff, including worker, still have access to the queue.

This also applies to other platforms, such as NT, that implement the ‘-p’ parameter. Although the problem of primary groups is not present, it should be realized that, while:

setmqaut -m saturn.queue.manager -n orange.queue -t queue -p worker +all

gives full access to worker,

setmqaut -m saturn.queue.manager -n orange.queue -t queue -p worker -all

only forbids all access if worker is not a member of any authorized groups.
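The union behaviour can be sketched as a toy model (not real OAM code; names follow the example above): a user is authorized if any of their groups appears in the list, so revoking one group’s entry achieves nothing while another remains.

```python
# Toy model of the OAM's union behaviour: a user is authorized if ANY of
# their groups appears in the authorization list, so revoking one group's
# entry achieves nothing while another of the user's groups remains listed.

GROUPS = {"worker": {"team", "staff"}}  # team is worker's primary group

acl = {"staff", "team"}  # both groups ended up authorized, as in the text

def can_access(user):
    return bool(GROUPS[user] & acl)  # any overlap grants access

print(can_access("worker"))  # True

acl.discard("team")          # effect of 'setmqaut ... -p worker -all' on Unix
print(can_access("worker"))  # True - still authorized through 'staff'

acl.discard("staff")         # only removing every group closes the access
print(can_access("worker"))  # False
```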

Caching

On some platforms, such as Unix, group membership is cached by MQSeries. This means that, if a new user joins a group and needs access to MQSeries objects, the queue manager needs to be restarted. Similarly (and probably more importantly), if a user leaves the team or company, it is not sufficient just to remove them from the group. The user retains access to objects until such a time as the queue manager is restarted.
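The consequence of the cache can be sketched as a toy model (not real MQ code; the users are invented): membership is snapshotted at queue manager start, so removals take effect only after a restart.

```python
# Toy model of the caching pitfall: group membership is snapshotted when the
# queue manager starts, so removing a user from a group takes effect only
# after a restart rebuilds the cache. Names are invented for illustration.

os_groups = {"mqm": {"alice", "bob"}}  # the operating system's current view

class QueueManager:
    def __init__(self):
        # membership is copied once at startup, like the OAM's cache
        self.cache = {g: set(members) for g, members in os_groups.items()}

    def is_admin(self, user):
        return user in self.cache["mqm"]

qmgr = QueueManager()
os_groups["mqm"].remove("bob")  # bob leaves the company...
print(qmgr.is_admin("bob"))     # True - the stale cache still authorizes him

qmgr = QueueManager()           # only a restart picks up the change
print(qmgr.is_admin("bob"))     # False
```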

Only enable things if you need them

This is no more than common sense, and the defaults are such that this won’t cause problems, but for the sake of completeness the following points are worth mentioning:

  • Automatic channel definition

Enabling the automatic definition of channels increases the ability of machines to connect to your queue manager with little prior knowledge of your system, so this should be enabled only if definitely required.

  • Command server

The command server is very powerful and can render weak security even weaker. For instance, on a system running MQSeries version 2 in which users do not have the authority to use the client channel, they could still connect using a sender channel called SYSTEM.DEF.RECEIVER. Through this they could put messages on the command server’s input queue requesting it to create a channel and transmission queue back out, which could then be used for further breaches of security. If you’re not confident of your system’s security, it’s advisable to start the command server only when it is needed and to grant users only the minimum required levels of authority to it.

 

Sam Garforth

SJG Consulting Ltd (UK)                                                     © S Garforth 1999


Using RACF and the OAM for end-to-end security

It’s a commonly held view that MQSeries is not a secure product and that to install it in your network infrastructure is to give hackers free rein. In this article I’ll demonstrate that this isn’t necessarily so.
Security is a general term that covers such tasks as sender and receiver authentication, encryption and privacy, non-repudiation, and message integrity and data authentication.
When communications between companies occur, MACs, digital signatures, and public key encryption may be employed to enforce security, perhaps by means of third-party products, such as Baltimore Secure MQ from Baltimore. However, many companies consider that, when it comes to communication within an enterprise, such measures are not required as all machines in their infrastructure are managed by administrators whom they trust. All that’s required is to provide administrators with the means to prevent unauthorised users from accessing the network and creating messages, while still providing access to authorised users.
This can be done using third-party products, many of which have the potential to secure communications completely. Nevertheless, there is a lot that can be achieved using just the security mechanisms provided by MQSeries itself – that is, using RACF on MVS and the Object Authority Manager (OAM) on distributed platforms.
This article covers the policies that a company would need to put in place and the configuration that administrators would need to implement in order to establish an acceptable level of security in an environment where administrators are trusted but users are not.
EXAMPLE ENVIRONMENT
Consider the following example:
For a number of years a large company has successfully used techniques such as file transfer (FTP) to carry out point-to-point communication. However, in order to improve speed of development they decide to move to an MQSeries-based infrastructure, also deciding to use a central hub managed by a trusted group, as this yields benefits in manageability and allows new connections to be added quickly.
Most of the machines are based in a secure machine room (while it’s possible to log on to the machines from outside the room, a discussion of how to secure this type of access is beyond the scope of this article). Each business unit owns one machine. Business units don’t trust users (who could be disgruntled employees), and they don’t trust administrators of machines belonging to other business units, though they do trust their own administrators. Most security problems, such as ‘sniffers’ on the communication lines, were addressed when FTP was set up (possible solutions include using encryption at the communications layer, splitting SNA packets into so many parts that they are virtually impossible to read, and using security calls within applications themselves).
Consider a situation in which A and B need to communicate with each other using MQSeries, as do C and D (see Figure 1). Most of the security issues that exist in this environment also apply to FTP, though a major new one is introduced.

[Figure 1: A and B, and C and D, communicating through the central hub]

With the environment shown above, if business A decides that it needs to talk to business D, the infrastructure is already in place and only the application development needs to be done. This is a very strong reason for using MQSeries and a hub environment. However, it also introduces the problem that an unauthorised person on A could send a message to D.
In order to secure the end-to-end connection, including preventing the generation of unauthorised messages, it is necessary to carry out the measures detailed in this article.
COMMUNICATIONS LAYER SECURITY
Firstly, security needs to be set up at the level of the communication layer. SNA bind or session-level security can be used to ensure that, when an SNA bind request comes from A, the hub knows that the request really does originate at A. This is a default with most communications packages (but not ones from Tandem) and involves providing the same password at each end. Obviously the password must be kept secret and should be accessible only by the machines’ administrators. Something similar can be done for TCP/IP using secure domain units or some form of Virtual Private Network.
CHANNEL INITIATION SECURITY
We’ve now ensured that no boxes are connected to the hub that shouldn’t be connected to it. However, it’s still possible for a user on A to define a queue manager called C and a channel on A called C.TO.HUB (for example, by knowing the naming convention or by querying the hub’s command server), and then connect to the hub by impersonating C and having messages routed to D.
If the channel is a sender/receiver channel, the only way around this is to use a security exit provided by a third-party product (such as Baltimore, mentioned earlier). However, if A is a secure machine, users won’t have the authority necessary to add these definitions to the system. An alternative is to use requester/sender channels. This is similar to a call-back system: the hub acts as a requester and thus needs to initiate the conversation. It calls out to the known LU/IP address stating that it wants to start a channel. A, acting as the sender, would then initiate the channel back to the hub. If A were to try to start a channel to the wrong requester, the request would not be accepted. Similarly D, acting as a requester, could initiate a conversation asking the hub, as a sender, to call it back. As the hub’s sender channel contains D’s CONNAME and calls it back, the most that A could do in this set-up is to get the hub to call D.
ROUTING SECURITY
So now we have a system where we can be confident that all messages coming to the hub on any channel are from the machine that they should be from.
The hub is merely a queue manager, looking after transmission queues and running the associated channels. Each transmission queue is named after the queue manager that it points to. The next problem, as mentioned above, is that a user on A could, by default, do an MQPUT specifying as its target the queue manager of D. The message would be put on A’s default transmission queue (to the hub); when it reaches the hub, it would automatically be put on transmission queue D, and thus get to a destination that it shouldn’t be able to reach.
The way around this is to specify the MCAUSER parameter on the receiver/requester channel definitions. By default the inbound channel at the hub puts messages on its target transmission queue using the userid running the channel. This userid has full access to put messages on all queues. However, if you change the channel’s MCAUSER parameter, the message will be put on the queue using the userid specified by the parameter.
So, define one userid for each inbound channel on the hub. For example, define a userid called A for the channel from A, a userid called C for the channel from C, etc. Alter the inbound channels to put messages on queues using their corresponding userid – for example:
ALTER CHL(A.TO.HUB) CHLTYPE(RQSTR) TRPTYPE(LU62) MCAUSER(A)
ALTER CHL(C.TO.HUB) CHLTYPE(RQSTR) TRPTYPE(LU62) MCAUSER(C)
Next set the permissions on the hub’s transmission queues to accept only messages from authorised channels. How you do this depends on your set-up – on distributed platforms, use the following commands:
setmqaut -m HUB -t qmgr -p A +connect +setall
setmqaut -m HUB -t qmgr -p C +connect +setall
setmqaut -m HUB -t q -n B -p A +put +setall
setmqaut -m HUB -t q -n D -p C +put +setall
If you use RACF, then the following commands are needed:
RDEFINE MQQUEUE HUB.B UACC(NONE)
PERMIT HUB.B CLASS(MQQUEUE) ID(A) ACCESS(UPDATE)
RDEFINE MQQUEUE HUB.D UACC(NONE)
PERMIT HUB.D CLASS(MQQUEUE) ID(C) ACCESS(UPDATE)
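The routing control can be summarized in a toy model (not real MQ code; channel and queue names follow the example): each inbound channel runs under its own userid, and each transmission queue accepts puts only from the partner entitled to use it.

```python
# Toy model (not real MQ code) of the hub's routing control: each inbound
# channel runs under its own userid (MCAUSER), and each transmission queue
# accepts puts only from the userid entitled to use it.

MCAUSER = {"A.TO.HUB": "A", "C.TO.HUB": "C"}  # inbound channel -> userid
XMITQ_ACL = {"B": {"A"}, "D": {"C"}}          # xmitq -> userids with +put

def can_route(channel, target_qmgr):
    user = MCAUSER[channel]
    return user in XMITQ_ACL.get(target_qmgr, set())

print(can_route("A.TO.HUB", "B"))  # True  - A is authorized to reach B
print(can_route("A.TO.HUB", "D"))  # False - A cannot have messages routed to D
print(can_route("C.TO.HUB", "D"))  # True
```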
TARGET QUEUE SECURITY
So now B and D can be confident that all the messages they receive are from authorised queue managers. The next problem is to make sure that the right messages go to the right queues. For example, user UserX on A might be allowed to send messages to queue QueueQ on B, and user UserY on A might be allowed to send messages to queue QueueR on B (see Figure 2). However, we need to ensure that UserX cannot send messages to QueueR. To do so without either using security exits or changing applications, business B needs to trust A’s administrator (but not C’s, etc). Also a system-wide naming convention of userids needs to be enforced.

[Figure 2: UserX and UserY on A sending, via the hub, to QueueQ and QueueR on B]

By default, when a user on A sends a message, the user’s userid is put in the USERID field of the message descriptor. The user is not allowed to change this. Also, by default the inbound channel at the receiver (for instance, B) puts messages on its target queue using the userid that’s used to run the channel. This userid has sufficient access rights to put messages on all queues. If you change the PUTAUT parameter of the channel from PUTAUT(DEF) to PUTAUT(CTX), messages are placed on the queue using the authority of the userid specified in the message descriptor.
So queues can now be secured by defining userids on the receiving machines that have the same names as the userids on the sending machines. The receiving userids do not need authority to log on. So, in this example:
Define two users, UserX and UserY, on B.
ALTER CHL(HUB.TO.B) CHLTYPE(RQSTR) TRPTYPE(LU62) PUTAUT(CTX)
On distributed platforms, issue the following commands:
setmqaut -m B -t qmgr -p UserX +connect
setmqaut -m B -t q -n QueueQ -p UserX +put +setall
setmqaut -m B -t qmgr -p UserY +connect
setmqaut -m B -t q -n QueueR -p UserY +put +setall
If you use RACF, the following commands are needed (assuming MQM runs the channel):
RDEFINE MQQUEUE B.QueueQ UACC(NONE)
PERMIT B.QueueQ CLASS(MQQUEUE) ID(MQM) ACCESS(UPDATE)
PERMIT B.QueueQ CLASS(MQQUEUE) ID(UserX) ACCESS(UPDATE)
RDEFINE MQQUEUE B.QueueR UACC(NONE)
PERMIT B.QueueR CLASS(MQQUEUE) ID(MQM) ACCESS(UPDATE)
PERMIT B.QueueR CLASS(MQQUEUE) ID(UserY) ACCESS(UPDATE)

As mentioned previously, security is a bit more complex on Tandem systems. On Tandem, the ‘userid’ in the message descriptor is actually a groupid. For the above to work when a Tandem system is the receiver, it is necessary to use a group that’s defined and authorised with the same name as the sending userid. When a Tandem system is the sender, the receiver needs a userid defined and authorised with the same name as the sending group.
LOCAL QUEUE SECURITY AT THE SENDER
If users on the sender machine do not trust one another, some additional work is necessary to set up security.
If QREMOTE queues are not used, and users specify the target queue manager in the MQPUT call, then messages from UserX and UserY on A are put directly on the transmission queue and there is no way for MQSeries to stop them specifying one another’s target queues. It is also possible, when the channel is not running, for them to remove one another’s messages before the messages are sent.
The best way to solve this problem is to restrict access to transmission queues (this is the default) and to allow users to put messages only on QREMOTE queues that point to the target queues. Using this approach, a secure structure can be set up such that UserX and UserY cannot put messages on one another’s queues.
For instance, using RUNMQSC, enter the following definitions:
DEFINE QR(TO.Q.ON.B) RNAME(QueueQ) RQMNAME(B)
DEFINE QR(TO.R.ON.B) RNAME(QueueR) RQMNAME(B)
The commands below are the ones to use on distributed platforms.
setmqaut -m A -t qmgr -p UserX +connect
setmqaut -m A -t qmgr -p UserY +connect
setmqaut -m A -t q -n TO.Q.ON.B -p UserX +put
setmqaut -m A -t q -n TO.R.ON.B -p UserY +put
The equivalent commands for RACF are:
PERMIT A.BATCH CLASS(MQCONN) ID(UserX) ACCESS(READ)
PERMIT A.BATCH CLASS(MQCONN) ID(UserY) ACCESS(READ)
RDEFINE MQQUEUE A.TO.Q.ON.B UACC(NONE)
RDEFINE MQQUEUE A.TO.R.ON.B UACC(NONE)
PERMIT A.TO.Q.ON.B CLASS(MQQUEUE) ID(UserX) ACCESS(UPDATE)
PERMIT A.TO.R.ON.B CLASS(MQQUEUE) ID(UserY) ACCESS(UPDATE)
However, if you are happy to allow applications to write to the transmission queue, you could use either the following commands on distributed platforms:
setmqaut -m A -t q -n B -p UserX +put
setmqaut -m A -t q -n B -p UserY +put
or this one with RACF:
RDEFINE MQQUEUE A.B UACC(UPDATE)
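The sender-side scheme (QREMOTE definitions plus a restricted transmission queue) can be sketched as a toy model, with queue and user names following the example above:

```python
# Toy model of the sender-side scheme: users may put only to QREMOTE
# definitions naming their own target, while direct puts to the transmission
# queue are refused. Queue and user names follow the example above.

QREMOTE = {"TO.Q.ON.B": ("B", "QueueQ"), "TO.R.ON.B": ("B", "QueueR")}
PUT_AUTH = {"UserX": {"TO.Q.ON.B"}, "UserY": {"TO.R.ON.B"}}

def can_open_for_put(user, queue):
    if queue in QREMOTE:  # resolves onto the transmission queue internally
        return queue in PUT_AUTH.get(user, set())
    return False          # direct access to the xmitq (here 'B') is denied

print(can_open_for_put("UserX", "TO.Q.ON.B"))  # True  - own remote queue
print(can_open_for_put("UserX", "TO.R.ON.B"))  # False - UserY's queue
print(can_open_for_put("UserX", "B"))          # False - xmitq itself
```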
LOCAL QUEUE SECURITY AT THE RECEIVER
If users on the receiving machine do not trust one another, then it’s necessary to set up some additional security.
Say UserQ is able to read messages on queue QueueQ and UserR is able to read messages on queue QueueR. If the users are not considered trustworthy, then one needs to guard against the possibility that UserR puts a message on queue QueueQ and UserQ receives it believing it to have come from A. Similarly, UserR could get messages from queue QueueQ before UserQ gets them. To prevent this, it is necessary to run the following OAM commands:
setmqaut -m B -t qmgr -p UserQ +connect
setmqaut -m B -t qmgr -p UserR +connect
setmqaut -m B -t q -n QueueQ -p UserQ +get
setmqaut -m B -t q -n QueueR -p UserR +get
With RACF, the following commands would be needed:
PERMIT B.BATCH CLASS(MQCONN) ID(UserQ) ACCESS(READ)
PERMIT B.BATCH CLASS(MQCONN) ID(UserR) ACCESS(READ)
RDEFINE MQQUEUE B.QueueQ UACC(NONE)
RDEFINE MQQUEUE B.QueueR UACC(NONE)
PERMIT B.QueueQ CLASS(MQQUEUE) ID(UserQ) ACCESS(UPDATE)
PERMIT B.QueueR CLASS(MQQUEUE) ID(UserR) ACCESS(UPDATE)
Note that on MVS a problem still remains. UserQ (or perhaps a member of the same group) can run an application that puts messages on queue QueueQ that the main UserQ application then reads off in the belief that they came from A. On distributed platforms, the OAM command SETMQAUT can be used to ensure that UserQ can get messages from a queue but not put them on it. RACF does not have this facility. A user is either able to both get and put messages on a queue or neither. One solution to this is to use ‘alias’ queues.
For example:
DEFINE QA(ACCESS.BY.USERQ) TARGQ(QueueQ) PUT(DISABLED)
RACF could then be used to prevent UserQ from directly accessing queue QueueQ while giving the user full access to the ACCESS.BY.USERQ alias queue. The PUT(DISABLED) attribute ensures that the user can’t put messages on the queue. Note that the PUT(DISABLED) attribute could not have been used directly on queue QueueQ, as this would have stopped the channel from being able to write messages.
While this method works, it’s a bit of overkill. As it’s common in MVS for a user to have read/write access to a dataset, allowing them also to have read/write access to a queue is usually seen as a natural extension. Another consideration is that, in MVS, it’s less likely that the same userid is used to run different applications.
Note that, on all platforms, such measures are unnecessary if the administrator has secured the machine so that users cannot add or run their own applications.

Sam Garforth
SJG Consulting Ltd (UK) © S Garforth 1999

Updates

The following comments have been made by a reader more recently and need to be incorporated into the article:

The statement is made that “If A were to try to start a channel to the wrong requester, the request would not be accepted.” This is not true. In the diagram, A, B, C, or D could all start the same requester on the hub (although not more than one at a time).

Also, if you give the MCAUSER +setall authority and set PUTAUT(CTX), what is to prevent a malicious user on A from sending messages as mqm to the command server (or any arbitrary queue) at the hub or any of the other machines? Each company has to trust the MQ administrators at all the other companies using that hub. That’s a lot to ask.

You might want to add a note in the document for TCP/IP users. It’s good practice to have the users coming in on different ports from each other and from the one you use internally. If, for example, you use 1414 internally and A, B, C and D use 1415, 1416, 1417 and 1418 respectively you gain a lot of control. You can stop all external traffic while allowing internal traffic to continue by shutting down listeners on all ports other than 1414. Or you can stop traffic from one business partner without affecting the others.

There are a bunch of other measures that can and probably should be taken to secure a hub QMgr that talks to several different external partners. These include adjusting the channel retry values to enforce some flood control, disabling QMgr resolution by not naming XMit queues after the QMgr they serve, and guarding against SVR channel behavior that can be exploited in a hub environment.

Sam Garforth
SJG Consulting Ltd (UK) © S Garforth 2003

A Simple Introduction to the Architecture of Salesforce Platform Encryption

The architecture of the Salesforce Platform Encryption solution is described here.

I thought I’d have a go at writing a simplified version in a way that’s easy for me to understand, starting with the encryption of the data and then moving out to key management.

Encryption Basics

In this post I’m going to assume a certain amount of knowledge about encryption but let’s start with some simplified basics.

Symmetric encryption is where you have the same key to both encrypt (for privacy) and decrypt the data. This is the fastest way to encrypt and decrypt, but it is also the easiest to crack, and if you lose the key then you’re in trouble. For example, if you encrypt something with the key and someone else wants to decrypt it, they need to have the same key, and then there’s nothing to stop them imitating you.

Symmetric Encryption
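A minimal sketch of the symmetric round trip: one shared key both scrambles and recovers the data. The cipher here is a toy XOR keystream built from SHA-256, good enough to show the symmetry but emphatically not a secure cipher.

```python
# A minimal demonstration that one shared key both encrypts and decrypts.
# The cipher is a toy XOR keystream built from SHA-256 - good enough to
# illustrate the symmetry, NOT a secure cipher.
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def crypt(key: bytes, data: bytes) -> bytes:
    # XOR is its own inverse, so one function both encrypts and decrypts
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

key = b"shared secret key"
ciphertext = crypt(key, b"hello")
print(ciphertext != b"hello")   # True - the data is scrambled
print(crypt(key, ciphertext))   # b'hello' - the same key recovers it
```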

Public key encryption addresses this issue using key pairs (this is the technology underlying PKI, public key infrastructure). The key that does the encryption is different from the key that does the decryption. The key that does the encryption (the public key) can be made public: anyone can use it to encrypt, but only the holder of the other half of the pair (the private key) will be able to decrypt it.

Public Key Encryption

The same public key technology can be used for signing (for authentication). Someone can use their private key to sign something and people with the corresponding public key will be able to verify that the sender used that private key. Public key encryption is sometimes called asymmetric because the encrypting/decrypting keys are different.

Public Key Authentication

Asymmetric encryption is more secure than symmetric because you don’t have to share the encrypting key and it takes longer to crack, but it also takes longer to compute, so sometimes the performance impact can be too high. Typically, therefore, a combination of the two is used: the symmetric key is used for the encryption/decryption, but its distribution and storage are protected using the public key technology.

Salesforce Security

Salesforce has always been a very secure platform, using a range of services such as encryption of the data in transit, two-factor authentication, verification of login address, profiles, permissions, and penetration tests. They are now adding to this a new feature called Platform Encryption, which allows customers optionally to encrypt some fields at rest, i.e. while they are stored in the Salesforce database.

How does Salesforce Platform Encryption Work?

Salesforce uses a symmetric encryption key to encrypt the customer data that it stores. (The symmetric encryption used is AES with 256-bit keys in CBC mode, with PKCS5 padding and a random initialization vector (IV).) The symmetric mode gives the performance benefit but means that the key needs to be closely protected. For this reason the Data Encryption Key (which is also the decryption key) is never transmitted or even written to disk (persisted). It is created/derived in the Salesforce platform and never leaves. It is created in a component of the platform called the Key Derivation Server.

Platform Encryption Architecture

So this brings us to the questions of how the key is created, and how we can ensure that it’s the same when it’s recreated to do the decryption. Also, given that this is a multi-tenant environment, what is the customer-specific component? The answer is that the encryption key is derived from a combination of a Salesforce component and a customer/tenant-specific component. These are called secrets. Sometimes they are also referred to as key fragments.

The encryption key is generated from the master secret (Salesforce component) and the tenant secret (customer component) using PBKDF2 (Password-Based Key Derivation Function 2). The derived data encryption key is then securely passed to the encryption service and held in the cache of an application server.
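The derivation step can be sketched with Python’s standard PBKDF2 implementation. Salesforce’s real parameters (the salt, the iteration count, and exactly how the two secrets are combined) are not stated in the article, so the values below are illustrative assumptions only.

```python
# A sketch of deriving a per-tenant data encryption key with PBKDF2, as
# described above. Salesforce's real parameters (salt, iteration count, and
# exactly how the two secrets are combined) are not public here, so the
# values below are illustrative assumptions only.
import hashlib

def derive_key(master_secret: bytes, tenant_secret: bytes) -> bytes:
    return hashlib.pbkdf2_hmac(
        "sha256",
        master_secret + tenant_secret,  # assumed combination of the secrets
        b"illustrative-salt",
        100_000,
        dklen=32,  # 256 bits, matching the AES-256 key size
    )

k1 = derive_key(b"master-secret", b"tenant-1")
print(len(k1) * 8)                                      # 256
print(k1 == derive_key(b"master-secret", b"tenant-1"))  # True: re-derivable
print(k1 == derive_key(b"master-secret", b"tenant-2"))  # False: per tenant
```

The key property the article relies on is visible here: the same pair of secrets always re-derives the same key (so decryption works after the cached key is lost), while a different tenant secret yields an unrelated key.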

Key Derivation Server

The Write Process

So, to write an encrypted record, Salesforce retrieves the Data Encryption Key from the cache and performs the encryption. As well as writing the encrypted data into the record, it also stores the IV and the id of the tenant secret.

The Read Process

Similarly, to decrypt the data, Salesforce reads the encrypted record from the database. If the decryption key is not in the cache, it derives it again using the associated tenant secret, and then decrypts using that key and the stored IV.
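A rough sketch of this write/read round trip. A toy XOR stream cipher stands in for AES-256-CBC, and the field names and cache structure are invented for the illustration:

```python
import hashlib
import secrets

def toy_cipher(key: bytes, iv: bytes, data: bytes) -> bytes:
    # Toy stream cipher keyed on (key, IV); Salesforce actually uses AES-256-CBC.
    # XOR is its own inverse, so the same function encrypts and decrypts.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

key_cache = {}  # tenant_secret_id -> derived data encryption key

def write_record(tenant_secret_id: str, plaintext: bytes) -> dict:
    key = key_cache[tenant_secret_id]   # derived key held in the app-server cache
    iv = secrets.token_bytes(16)        # fresh random IV per record
    return {
        "ciphertext": toy_cipher(key, iv, plaintext),
        "iv": iv,                              # stored alongside the data
        "tenant_secret_id": tenant_secret_id,  # so the right key can be re-derived
    }

def read_record(record: dict) -> bytes:
    tsid = record["tenant_secret_id"]
    if tsid not in key_cache:
        # In the real system the key would be re-derived here from the tenant secret.
        raise KeyError("key not cached; re-derivation needed")
    return toy_cipher(key_cache[tsid], record["iv"], record["ciphertext"])

key_cache["tenant-1"] = secrets.token_bytes(32)
rec = write_record("tenant-1", b"Account: Acme Corp")
recovered = read_record(rec)
```

Note that the record carries only the IV and the tenant secret’s id, never any key material.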

So, we’ve established that the data can’t be accessed without the data encryption key and that this key can’t be accessed without the master and tenant secrets, but how do we know that the secrets are secure?

Generation of Secrets

Remember that, for this discussion, there is one master secret for Salesforce itself, and a tenant secret and a key derivation server for each customer. In fact, these secrets are regularly replaced, which is why their ids need to be kept.

The master secret is created by a dedicated, air-gapped HSM. It is then encrypted using the key derivation server’s public key (the tenant wrapping key), signed with the HSM’s private key (the master wrapping key), and transported to the key derivation server where it is stored.

Master HSM

The tenant secret is created on the key derivation server, using a different HSM. This is initiated by the customer, who connects using their usual transport-level security. The secret is then encrypted with the tenant wrapping key (a public key) and stored in the database. The tenant secret never leaves the key derivation server in the clear, and can only be accessed using the tenant wrapping key’s private key, which also never leaves the key derivation server.

The Transit Key

A unique transit key is generated on a Salesforce Platform application server each time it boots up. The transit key is used to encrypt the derived data encryption key before it’s sent back from the key derivation server to the encryption service. The transit key is a symmetric key, but it is itself encrypted with an asymmetric key, created by the master HSM, to get it to the key derivation server.

That’s Enough For Now

There’s a lot more that can be explained. There are more keys for more parts of the process. There are more distribution processes, and processes for updating the keys and keeping the system working using updated keys. There are processes for archiving data and keys, and for destroying the archives. But for now, I think I’ve understood enough to be comfortable with the way platform encryption works and the extra layer of security that it provides. Please let me know if you spot any glaring errors. For more detail please see the original document or suggest future posts.

A Guide to IBM Bluemix Resiliency and Security

This post was originally published on ThoughtsOnCloud on February 7th, 2015.

I’m pleased to say that it was also published for the 20,000 attendees at IBM Interconnect on Feb 26th.

IBM Bluemix is suitable for high performance, high input/output (I/O), high availability or latency-sensitive production applications, as well as development and test deployments. This is due to the IBM Bluemix configuration of Cloud Foundry within its data centers and the underlying strength of the IBM SoftLayer cloud infrastructure platform.

All Bluemix applications have their infrastructure automatically deployed as required and in real time. For example, if an application is dynamically scaled because it requires extra capacity, Bluemix handles it automatically. There is a full web-based management console and programmable management interfaces, which enable completely flexible monitoring of users’ applications.

IBM Bluemix configures Cloud Foundry in a highly available topology within the IBM SoftLayer data center. All Cloud Foundry components have been replicated to avoid any single point of failure (SPOF). These components include Droplet Execution Agent (DEA), Cloud Controller, router, Health Manager and login server. If any component fails it will be restarted within the data center while the remaining components provide continued availability. Other deployments can become available for the purposes of disaster recovery for IBM Bluemix applications.

IBM Bluemix exploits the IBM SoftLayer cloud infrastructure platform, hosted in data centers with Tier 3 resiliency. IBM SoftLayer provides a compelling set of service level agreements (SLAs) which in turn provide a strong platform for IBM Bluemix technology.

IBM Bluemix is able to exploit IBM SoftLayer’s triple network, which isolates public Internet, private application traffic and infrastructure management traffic. Together with highly redundant servers, each of which has five network cards, and the ability to seamlessly integrate with secure client private networks, IBM Bluemix applications benefit from a highly available and resilient network.

A large catalog of application services is available, each of which typically provides an appropriate range of priced service levels. Each service plan documents the priced service levels as well as the free service tier. While the free tier lets developers try out the functional behavior, the priced levels provide increasing operational quality of service. The service plan is fully documented with the details of the service performance and capacity, as well as specifying high availability and disaster recovery options. This flexible service approach enables departments to match their development and operations with the appropriate service plan to ensure the most economical mix of service levels.

The IBM approach to information assurance is to provide evidence against the UK government’s 14 cloud security principles. IBM Bluemix and its underlying cloud platform infrastructure, IBM SoftLayer, are designed to comply with these 14 principles across all security elements, including people, process and technology.

The IBM SoftLayer cloud infrastructure platform has already demonstrated compliance with SOC2 Type II, EU Safe Harbor, and the CSA STAR CAIQ and CCM self-assessments, as well as the ISO 9000 quality assurance standard. These standards represent an ongoing commitment to European Commission data privacy requirements.

From an engineering and support perspective, IBM Bluemix and its underlying cloud infrastructure technologies undergo continuous rigorous security testing in accordance with IBM Secure Engineering development practices. If a security exposure is identified by IBM or a third party, then IBM Support will use the IBM Product Security Incident Response Team (PSIRT) process to apply appropriate and timely updates to ensure the overall system security and integrity is maintained.

As you can see, the security and compliance offered by Bluemix is attractive and comprehensive. Do you think Bluemix is right for you?

What is Cloud Computing? Is everything cloud?

Cloud is a consumption model. It’s the idea of taking away all the IT skills and effort required by a user and letting them focus on their actual functional requirements; all the IT detail is hidden from them in The Cloud. Smart phones and tablets have really helped consumers understand this concept. They’ve become liberated: knowing very little about IT, they have become empowered with self-service IT to access functionality on demand. Within seconds they can decide that they want a business application, find it on an app store, buy and install it themselves, and be up and running using it. When they’ve finished, they can delete it.

CIOs are asking themselves why it can still take IT many months to get their business project up and running when in their personal lives they can have what they want when they want it.

The Cloud doesn’t take away the need for IT; for hardware, software, and systems management. It just encapsulates it. It puts it in the hands of the specialists working inside the cloud, and by centralising the IT and the skills costs can be reduced, risk can be reduced, businesses can focus on their core skills and have improved time to market and business agility.

It is confusing to talk about cloud without explaining whose point of view you’re looking at it from. Different people want different levels of complexity outsourced to the cloud.

Many users see cloud as a way of outsourcing all their IT. Some go even further and outsource the whole business process. I think the jury is out on whether cloud has to involve IT at all. Business Process as a Service (BPaaS) is talked about as one of the cloud offerings. I think the important thing is to let the customer get on with their core business and take away any activity that is not a differentiator for them.

Software as a Service (SaaS) is the area that most people think of first when they hear the word cloud. People have been using web-based email for over 10 years. They don’t need to worry about maintaining a high-spec PC and all the associated software; as long as they have a web browser they’re up and running. There is a move, and a demand, to make many, if not all, computer software applications available on the cloud via simple consoles – not unlike the idea of thin clients 15 years ago or mainframe terminals 40 years ago.

Moving down the stack a little further we come to a different group of users; the application developers. The people who want to be involved in IT, who want to create the business applications that run on the cloud. They still want to focus on business value though. They still want someone to take away the effort of writing the middleware. The code that is the same in 90% of all applications. The communication systems, the database, the interaction with the user. They want Platform as a Service (PaaS). An environment that’s just there, up and running, as and when they need it.

Finally we come to Infrastructure as a Service (IaaS). This is for real programmers and system administrators – people who just want the base operating system to install or write their applications on, like they did in the old days. These people like the paradigm of having a computer that’s all theirs. In the old days, when their CIO wanted an environment for a new project, someone would have to find some data centre space, buy a PC, install it in the data centre with power, cooling and so on, install the operating system, and then hand it over six months later so that project development could start. Now they don’t need to worry about the physical world. They can just request the infrastructure as a service – that is, access to a brand new operating system install – and be up and running in minutes.

Of course, these things can all run on top of one another: the business process can run on the software, which runs on the platform, which runs on the infrastructure, all provided as a service. But they don’t have to. The whole point is that the user doesn’t need to worry about what’s happening inside their cloud. There could be an infinite number of monkeys with an infinite number of typewriters working away inside the cloud; as long as the user is getting the service that they’re looking for, they don’t care.

Which brings us to the other side of the picture: the cloud service providers. These can be traditional Managed Service Providers (MSPs), system integrators, or the in-house data centre offering the IT service to the lines of business. These guys are already taking the IT effort away from businesses; they’re already encapsulating and obfuscating the details of IT. But they’re in a competitive market, driven by the new expectations of the consumer, and so they need to work smarter. They need to adopt some of these new architectures to be able to pass on the cost benefits and speed of delivery that their customers expect.

This is where some of the other terms associated with cloud computing come in – virtualisation, automation, standardisation. They’re not essential for cloud computing – the monkeys could do the job – but they really make it a lot easier. To make a step-change improvement in delivery speed, IT departments need to share environments on the same computer: instead of having hundreds of servers running at 50% capacity, they can have one bigger one and schedule who’s using the capacity when. Instead of manually installing the application and all its dependencies, and bug-fixing and testing each part individually and together, they can standardise, automate and use virtual appliances to remove room for error. By introducing virtualisation, automation and self-service, a private data centre is moving towards, and enabling, cloud. Similarly, pay-as-you-go and sharing services between companies are not cloud per se, but they are drivers towards, and benefits of, cloud.

So it can be confusing. People are talking about the same thing, but from different points of view. When people talk about cloud, they might be talking about the hardware and automation in the data centre, or about the complete absence of hardware through business process outsourcing; they might be talking about handing all their data over to another company, or about making their private data more accessible to their own users.

So cloud covers a lot, but not everything.
