IBM MQSeries Professional Certification

INTRODUCTION
The value of professional qualifications in the computing industry is a contentious issue, but there is no doubt that they are here to stay. Five years ago IBM introduced the Certified MQSeries Engineer qualification. This has since mutated into three other qualifications, and more have arrived too.
IBM’s motivation for introducing the MQSeries certification is debated. Back in 1995 it was difficult for them to achieve sales, as customers were rightly concerned about the availability of skilled support staff. Also, there was concern that Microsoft’s imminent MSMQ could become serious competition. Microsoft already had a certification programme which enabled them to boast about the number of people with skills in their products. IBM’s introduction of the MQSeries certification helped alleviate this difficulty. However, Peter Goss of MQSeries Product Marketing responds to this suggestion saying, “Your statement is not correct. The tests in the MQSeries Family Certification programme – all 8 of them – were introduced to allow individuals to validate the success of the education they have taken and experience they have gained working with the products. And it allows Business Partners and customers to validate that their employees have gained the education, knowledge and experience necessary to do their jobs.”
There is also debate over who should pay for the sitting of the tests. It is in IBM’s interests to have lots of qualified people in the market. If employers demand that job candidates are certified then people have no choice but to take the tests and often to attend supporting IBM education. In order for a software vendor or consulting company to receive support from IBM it must become an IBM Business Partner. One of the requirements for a company to be a member of IBM’s PartnerWorld programme is that they have a certain number of employees who are certified.
Peter goes on to say, “Tests therefore have real value to an individual or to a company, and so it’s no surprise that it costs money to take the test.”
Another area of concern is that employers could use the fact that an employee has failed the test as justification for dismissal. For this reason some employees refuse to take the test until they are absolutely sure that they are ready, but this can cause conflict.
Having said all this, the Certification Programme is here to stay and if you achieve the qualification you are undoubtedly in a better situation than if you do not.
WHAT IS THE IBM PROFESSIONAL CERTIFICATION PROGRAMME?
The IBM Professional Certification Programme is the roadmap provided by IBM such that you can achieve internationally recognised certification by attending education classes and taking examinations that demonstrate your abilities and experience. IBM says that its Professional Certification Programme “offers a business solution for skilled technical professionals seeking to demonstrate their expertise to the world.” The programme is designed to objectively validate skills and demonstrate proficiency in the latest IBM technology and solutions. They say that by giving you and your employer confidence that your skills have been tested, the certification can help you excel at your job, delivering higher levels of service and technical expertise and thus move you on a faster career track.
For optimum benefit, the certification tests must reflect the critical tasks required for a job, the skill level needed for each task, and the frequency with which a task needs to be performed. IBM has comprehensive, documented processes which ensure that the certification tests remain relevant to the work environment of potential certification candidates.

IS IT WORTH THE MONEY?
A typical price for a test is £105 plus VAT, although it is sometimes possible to get a “You Pass We Pay” deal depending upon your position (e.g. if you work for a business partner). Sometimes, at IBM conferences, it is possible to sit the test for free. The prerequisite IBM MQSeries courses typically cost around £300 plus VAT per day and last between one and five days.
IBM justifies the cost of taking the test and associated education with a Return on Investment study. They say that the study indicated positive business results for the companies whose staff took the test, such as improved Revenue (Profitability), Efficiency (Productivity), and Customer Satisfaction (Credibility).

THE PATH TO CERTIFICATION
[Figure: the path to certification]

WHAT MQSERIES RELATED QUALIFICATIONS ARE AVAILABLE?
At Level 1 you can take tests to qualify as an IBM Certified Specialist in MQSeries, MQSeries Integrator or MQSeries Integrator V2. This title certifies you to perform basic operational services such as basic planning, configuration, installation, support, management, and maintenance, with limited assistance or to perform administration of the product, with limited assistance.
At Level 2 you can qualify as an IBM Certified Solutions Expert in the same products and also in MQSeries Workflow. This title demonstrates breadth of basic operational services skills in more than one environment or demonstrates depth of advanced operational services skills such as customising, integrating, migrating and tuning, in one environment.
Also at level 2 is the IBM Certified Developer qualification for MQSeries. It demonstrates the capability to plan and design an application requirement and build a prototype.
As an example, the people who should consider applying to be IBM Certified Solutions Experts in MQSeries are people responsible for planning and architecting software solutions and designing applications based on MQSeries. They should have the knowledge available from attending the MQSeries Technical Introduction course along with the Application Programming course, the Domino Connections course, and the Advanced System Design and Connections course. They should also have practical experience of implementing MQ with transaction management and database products, systems management, prototyping IT solutions, basic programming concepts, IT security concepts, plus the ability to gather business requirements and to translate those requirements into IT solutions, and skills in implementing systems on multiple computing platforms.

WHICH TESTS ARE REQUIRED
Each of the qualifications requires the taking of at least one test, with the recommended prerequisite study and experience.
The mapping between the qualifications and test names is not immediately obvious but it does help you to understand what the qualifications signify:

Certification                       Test                                             Prerequisite qualification
MQ Specialist                       MQ Installation and Configuration                -
MQ Solutions Expert                 MQ System Planning and Design                    -
MQ Developer                        MQ Application Design                            -
MQSI Specialist                     MQSI Installation and Configuration              MQ Specialist
MQSI Solutions Expert               MQSI System Planning and Design                  MQ Solutions Expert
MQSI V2 Specialist                  MQSI V2 Implementation                           MQ Specialist
MQSI V2 Solutions Expert            MQSI V2 System Architecture and Design           Any MQ certification
MQSeries Workflow Solutions Expert  MQSeries Workflow System Planning and Modelling  -

Details of the individual test objectives, describing the topics potentially covered on the tests, can be found at http://www.ibm.com/education/certify/tests/index.phtml

WHAT STUDY IS AVAILABLE
Each test specifies as a prerequisite course the MQSeries Technical Introduction, which is available from IBM Learning Services as a class lesson or as a computer-based training CD-ROM. They each require at least familiarity with the MQ manuals and publications and knowledge of basic MQSeries functions and facilities. Each certification also has other specific courses associated with it, the details of which are available on the tests web page.

EXPERIENCE
This of course is the contentious part. The qualification is designed to show that you have adequate experience to cope with real life situations. Taking courses designed to get you through the test and missing out the real life experience is defeating the object.

SAMPLE TESTS
When you think you may be ready to take the test it is worth trying a sample one. This way you will know whether it’s worth spending the time, money and stress on the real thing. Also you’ll get a feel for the style of questions and may even learn more through the experience. Sample tests are available from the certification web site.

Example Question
Here is an example question. It is a very simple question but the answer is not so easy. See the web site for the answer.

 An MQSeries application has created a queue with the following conditions specified on the DEFINE QLOCAL command:
 DEFPRTY(0)
 MSGDLVSQ(FIFO)
 TRIGMPRI(5)
 TRIGTYPE(DEPTH)
 TRIGDPTH(10)
 TRIGGER

When will a trigger message be generated?

 A. No trigger messages will be generated
 B. when the queue contains 5 messages
 C. when the queue contains 10 messages
 D. when the queue contains 5 priority messages
 E. when the queue contains 10 priority 5 messages

People commonly say that IBM has simply got the answer wrong, or that there is more than one correct answer. This is a common criticism of the tests in general, but it usually shows that the person has not fully understood the question or the reason for the answer. Questions like this are a good way to identify which people deserve the certification, but unfortunately they give the test a bad reputation.

ENROLLING FOR THE TEST
Tests can be scheduled with either an IBM Learning Centre or Prometric, Inc. (formerly Sylvan Prometric). IBM Learning Services can be contacted on 0845 758 1329. It may be possible to register as late as the day before the test. You will be required to pay at the time of scheduling and if time permits, you will be sent a letter of confirmation.

TAKING THE TEST ITSELF
You will need to take two forms of identification. At the conclusion of the test you will receive a full score report with section analyses. If you fail you will be able to reregister for a test. This test should have different questions.

THE CONTENT OF THE TESTS
Each of the MQSeries tests has questions that relate to MQSeries Version 5.1 for distributed platforms and MQSeries for OS/390 Version 2.1. Each test has between 60 and 80 questions and is of multiple-choice, closed book format. Most last 90 minutes but some are 75 minutes.
Some people say it is possible to pass and still know very little, or to study, cram and pass the test and then forget it. Some object to the tests covering areas that are not relevant to their work. However, you do not need to get 100% to pass, so it is not expected that you will know all the areas of the product; a pass shows that you have a reasonable percentage of knowledge. A common criticism of the tests is that they test retained knowledge, and people say that in real life they would not need to know it, that they would look up the information in a manual or call IBM. The certification specifications specifically say that they certify that a person is self-sufficient, performing the tasks with limited assistance from peers, product documentation and vendor support services. There is only so much that can be achieved by someone who needs to rely on documentation and support. In the crisis situation of a live production problem there may not be the time to look things up or to wait for a call to be returned. An employer will be looking for someone who can cope under this sort of pressure. Also, it is necessary to have complete understanding and retention of fundamentals to be able to understand the more complex MQ and application scenarios. For example, it would not be possible to master MQSI if you had to keep referring back to MQSeries itself. However, the tests also cater for the other point of view by asking where you might expect to look up certain information.

THE WELCOME PACKAGE
Three to six weeks after you pass, IBM will send you the Certification Agreement and a welcome package. After accepting the Certification Agreement you may open the package and use the IBM Professional Certification title and trademark. The package contains a certificate, a wallet sized certificate, a lapel pin and details of the certification logo and how you can use it. The logo looks like this:

[Image: IBM Professional Certification logo]

You are asked to keep the testing centre informed of any changes in your personal details. They in turn keep IBM informed. Apparently this means that they can keep you informed of all programme information. However, I have never received anything other than the original pack.

CONCLUSION
MQSeries skills are still scarce at the moment and so people may not feel the need for certification, but as more people learn MQ skills, certification may become more useful. It is in the interests of both employers and good employees to find a way to filter out the people who claim abilities that they do not have. The certification system is by no means perfect but it is still a good step towards achieving this.
Sam Garforth
IBM Certified Specialist – MQSeries
IBM Certified Developer – MQSeries
IBM Certified Solutions Expert – MQSeries
© S Garforth 2000

Beyond WebSphere Family Monitoring

The integrated application processes that drive real-time business must flow smoothly, efficiently and without interruption. These processes are comprised of a series of interrelated complex events which must be tightly managed to ensure the health of your business processes. To achieve this, deep visibility into the core metrics of business processes is crucial. But this level of insight is impossible when you’re limited to static events. This paper explores the next level in dealing with the increasing complexity of inter-application communication in the real-time enterprise: the ability to dynamically create events based on actual conditions and related data elements across the enterprise.

Stanford University Emeritus Professor David Luckham has spent decades studying event processing. In his book, The Power of Events, he says that we need to “view enterprise systems based on how we use them – not in terms of how we build them.” This is an important paradigm shift which can move us towards the goal of business activity monitoring.

Currently we have Enterprise systems management (ESM) with a foundation of event based monitoring (EBM). It provides availability with automation using event based instrumentation, with threshold based performance and alerts and notifications. This is good but not good enough for the evolving needs of real time business. Gartner says that event driven applications are the next big thing.

Typically in event based monitoring, events come from different middlewares – web servers, databases, applications, network devices, mobile devices – through the IT infrastructure of middleware, application servers, applications and the network. The events can be seen by the data centre. The idea is that the ESM system will monitor, detect, notify and take corrective action either automatically or with manual intervention. But there are different business users with their own perspectives.

Event based ESMs can’t really take good corrective action as they can’t correlate the event with the effect.

Here’s a typical example of a business activity. The head office requests a price change in all the stores. It updates the price in the master database and checks the inventory levels and then transmits the change to all the stores. But what happens if something goes wrong?

You, or your customer, will be asking lots of questions. 1. From an IT perspective you’ll want to know where the problem or slowdown is, i.e. which queue or channel has the problem that caused the change not to happen. But there are business level questions too. 2. You want to know why the change didn’t take place at all the stores. 3. You need to ask yourself from a business perspective what the impact of this is. 4. You may know that a channel’s gone down or you may know that a price change hasn’t happened, but do you know what else has been affected?

These are problems that you don’t really have time to address. EBMs don’t know what to do about this out of the box as they are designed at a technology level, and configuring them to understand the business is too hard. With constantly changing business and application needs you can’t adapt your monitoring and automation fast enough.

Here is another real life example: a stock trade or share purchase. The customer says they want to buy something. You check their account, then the stock availability and price, and then they agree they want to buy at that price. Then you process and complete the trade and update the stock price.

This is straight through processing. The transaction has to be done atomically as one unit of work within a certain time. These are serially dependent transactions. But what happens if they don’t complete in time? Again you will ask yourself the questions. From an IT perspective you will want to know what the cause of the problem was. But also the business units who were affected will want to know about it. You will want to know which business units and transactions were affected. The correct, and only the correct, business guys will want to know. You will want to know the business impact of this – to see the problem and impact from a business perspective.

So what is the business impact? At a high level it’s loss of money. You’ll know that the transaction didn’t take place but you won’t know the real root cause, so everyone will be blaming everyone else, wasting time and damaging morale and relationships. During this time you are not delivering the business service. You have damaged your relationship with the customer as you haven’t delivered what they needed and you can’t even explain why. So you’re going to lose your competitive advantage.

To give this root cause analysis and business impact analysis, customers normally have to put a lot of resource into customising an event based solution or just developing their own monitoring solution, but this is not flexible enough. It is not feasible in an increasingly complex technology environment. So we have to ask ourselves: how can we have a monitoring solution that is flexible enough to keep our business systems productive, rapidly and constantly adapting to incorporate a changing IT and business environment?

So in summary, the big questions are: Why is it so hard to detect and prevent these situations? How can we make the transition to real-time on-demand monitoring? How can we align our IT environment with the business priorities to achieve the business goals?

These problems arise because we’re using event based monitoring. Monitoring at an IT or technology level is preventing us from achieving business activity monitoring.

David Luckham refines this more to talk about Business Impact Analysis – overcoming IT blindness. We should be looking at the complex events, correlating or aggregating the various events and metrics to see the business impact. He talks about the event hierarchy and the processing of complex events. MQ has about 40 static events like queue full, channel stopped etc. But there are events from WAS, DB2 etc, and there are metrics like channel throughput, CPU usage and time. There are also home grown apps which need monitoring and there are business events and metrics. All these need to be taken into account to give a higher level complex event. For example, if a queue is filling up at a certain rate you can calculate that in a certain amount of time you will receive a static simple queue high event. But by that time it will be too late. You need to aggregate the metrics queue fill rate, queue depth, maximum depth and time to generate a complex event.
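The queue-fill example above can be sketched in a few lines of Python. This is only an illustration of aggregating fill rate, current depth and maximum depth into one predictive event. The function names, the event name PREDICTED_QUEUE_FULL and the 300 second warning window are assumptions made for the sketch, not part of MQ or of any monitoring product.

```python
def seconds_until_full(current_depth, max_depth, fill_rate_per_sec):
    """Estimate how long until the queue reaches its maximum depth.

    Returns None when the queue is not filling (rate <= 0).
    """
    if fill_rate_per_sec <= 0:
        return None
    remaining = max_depth - current_depth
    if remaining <= 0:
        return 0.0
    return remaining / fill_rate_per_sec


def predicted_full_event(queue, current_depth, max_depth,
                         fill_rate_per_sec, warn_within_sec=300.0):
    """Build a hypothetical complex event if the queue is predicted to fill soon."""
    eta = seconds_until_full(current_depth, max_depth, fill_rate_per_sec)
    if eta is not None and eta <= warn_within_sec:
        return {"queue": queue, "event": "PREDICTED_QUEUE_FULL", "eta_seconds": eta}
    return None  # no event: not filling, or full too far in the future to matter


# A queue at depth 4000 of 5000, growing by 10 messages per second
print(predicted_full_event("ORDERS", 4000, 5000, 10))
```

The point of the design is that the event is raised from metrics before the static queue full event would fire, leaving time to act.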

So the problems with the current state of event based monitoring are:

Event causality – there’s not enough information to identify the root cause of the problem. The price update didn’t happen, but why? MQ failed, but why? Maybe it was a disk space problem. Maybe it was caused by something in a different part of the environment – a different computer or application.

Interpretation – looking at it at a simple technology level we don’t have enough information to see the effect of this simple problem on the different parts of the enterprise – to see the effect from the perspectives of the different users, and to notify them and resolve the problems it causes for them.

Agility – Out of the box ESM or EBM solutions cannot possibly know the business requirements. They require a lot of customisation when you initially set them up to be able to understand the effects of different problems on the different users and then constant customisation as the technological and business environment constantly changes. They are constantly playing a game of catch up that they can never win.

Awareness – Because they are only looking at individual points of technology they have a blindness to the end to end transaction. They cannot know how a simple technology problem affects the rest of the technologies or businesses.

Another shortcoming of the current generation of system management is false positives. This is a big problem with simple event based monitoring. You have a storm of alerts. The data centre sees an event saying the queue is full. They call the MQ team who say not to worry about it; it’s just a test queue, or a queue for an application that hasn’t gone live yet. After the first 24 times that this happens the data centre stops paying attention to queue full events. Then the 25th one happens which is for an important transaction which needs to be dealt with immediately and they just ignore it. The company loses business etc and it’s as if they didn’t have a monitoring solution at all. So what we need is a high level of granularity on the queue monitoring based not just on whether a queue is full but what queue it is, who the application owner is, what time of day it is, what applications are reading from the queue etc.
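As a rough sketch of that kind of granularity, the Python below decides whether a queue full event deserves an alert and who should receive it. Everything here is hypothetical: the QueueFullEvent fields, the list of test queues, the business hours and the policy itself are assumptions chosen to illustrate the idea, and a real installation would define its own rules.

```python
from dataclasses import dataclass
from datetime import time


@dataclass
class QueueFullEvent:
    queue: str     # name of the queue that filled
    owner: str     # team that owns the application using the queue
    at: time       # time of day the event was raised
    readers: int   # number of applications currently reading the queue


def route_alert(event, test_queues, business_hours=(time(8, 0), time(18, 0))):
    """Return the owner to notify, or None if the event can be ignored."""
    if event.queue in test_queues:
        return None  # test queues filling up is expected noise
    start, end = business_hours
    in_hours = start <= event.at <= end
    # Assumed policy: out of hours, a full queue matters only if nothing is reading it
    if in_hours or event.readers == 0:
        return event.owner
    return None


event = QueueFullEvent("PAYMENTS.IN", "payments-team", time(10, 30), 1)
print(route_alert(event, test_queues={"TEST.Q"}))
```

Because the rules are data and a small function rather than a fixed vendor event, they can be changed as applications go live or queues change hands.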

It’s not enough to provide monitoring data; it has to be information, interpreted in a way that is useful. What we need is dynamic metric based monitoring. This is the difference between events and metrics, or between data and information. You need metric based monitoring to create complex events: in-context, user-specific events that are pre-emptive, raised before a real business problem happens, and actionable. The problem isn’t getting events; it’s event correlation with rules etc. You need to watch more than the vendor gives you, because that alone can’t be enough.

There is something called the ‘aha phenomenon’. When a problem occurs you spend ages trying to identify the cause, looking at all the queues and middlewares and applications. All the time you’re looking, the technology’s not running and the business is losing money. Eventually you find it and say ‘aha!’ Then what happens? Can you easily adapt your monitoring environment to make sure it doesn’t happen again, or that you at least don’t have to search again when it does happen? In other words you need dynamic monitoring – where the monitoring environment of event correlation, metric selection and rules application can be constantly updated.

So let’s expand the vision of what we need. We need a unified approach like the service oriented architectures that are so popular for applications i.e. a reusable monitoring architecture. We don’t need a silo or isolated tool – the antithesis of SOAs. It needs to be a business oriented on demand solution. It needs to be modular, extensible, adaptable, scalable and reusable. We need instrumentation for all the different applications and middlewares. And the environment status needs to be shown to all the different stakeholders from their own perspective for their own roles and responsibilities.

By applying the service oriented architecture principles we can achieve the Business Activity Monitoring and business agility that we really need. A business centric solution aligning IT to the business processes so the business can actually benefit from the technology rather than being constrained by it. Using this you can see the impact of a problem from all perspectives and you can rapidly adapt to the changing business and technological environment learning from mistakes. Currently 80% of IT resource is consumed by maintaining the technology. Using this architecture we can free the resources to other products, develop the business and make more money.

In summary, this unified model gives business and technology continuity and automatic recovery. It gives very granular complex events allowing root cause analysis and business impact analysis by being aware of the business processes affected by the technology and displaying the information in a business context giving an improved quality of service.

Of course there are pros and cons to being standards based. Some Service Oriented Architectures such as .Net and WebServices are still in flux. We need unified SOA security across all platforms. To be proactive in the way that is needed will require polling which needs to be configured to avoid performance problems.

But anyway, what I’ve proposed here is a unified model, a base for business activity monitoring. As David Luckham says, “The challenge is not to restrict communication flexibility, but to develop new technologies to understand it.” So I propose that the key to dealing with complexity and delivering true business activity monitoring solutions is a unified model based on a service oriented architecture. This doesn’t happen out of the box, as no vendor or developer can know all your requirements, but it is a framework which is modular, extensible, adaptable, scalable and reusable enough to facilitate what we need.

© Sam Garforth   2005

Closing the holes in MQ security

In choosing the default settings for MQSeries, IBM has had to strike a balance between making the product easy to use as quickly as possible and making it secure straight out of the box. In more recent releases, they have put more emphasis on ease of use and so relaxed the default security settings. This is one of the reasons why administrators must now reconfigure their systems if they require them to be secure. This article examines some of the potential security holes of which administrators should be aware, and also describes ways in which administrators can close these holes.

Default channel definitions

There are a number of objects, such as SYSTEM.DEF.SVRCONN and SYSTEM.DEFAULT.LOCAL.QUEUE, that are created by default when you install and configure a queue manager. These are really intended only as definitions to be cloned for their default attributes in the creation of new objects. However, a potential infiltrator can exploit the fact that they are also well-defined objects that probably exist on your system.

Originally, on distributed platforms, the definition of channel SYSTEM.DEF.SVRCONN had its MCAUSER parameter set to ‘nobody’. IBM had so many complaints from users who couldn’t get clients connected that it has now changed this parameter to blank (‘ ’).

The MCAUSER parameter specifies the userid that is checked when an inbound message is put on a queue. Setting this field to blank means that the authority of the userid running the channel (usually ‘mqm’) is checked. In other words, messages are always authorized to be put on all queues.

The thinking behind putting ‘nobody’ in this field is that no one should be allowed to put messages on queues unless the administrator actually changes settings to allow them to do so. Unfortunately this default setting was not documented and so users could not work out how they were required to change things.

There are many users who don’t need client channels and so haven’t even read this section of the manual. They’re unaware that nowadays, with default settings in place, anyone who can connect to their machine (for instance, someone on the same LAN) can start a client channel to them called SYSTEM.DEF.SVRCONN and have access to put messages on any of their queues and – often more importantly – to get messages from any of their queues.

This is not an entirely new problem – even the original systems suffered from it, as there are other channels, such as SYSTEM.DEF.RECEIVER and SYSTEM.DEF.REQUESTER, that have always had a blank MCAUSER. With a little effort, users have always been able to connect to these and put messages on queues using full authority. If the queue manager is the default one, the infiltrator needs no prior knowledge of the system.

As previously mentioned, these definitions are used to provide defaults for the creation of new channels. This means that, in many systems, newly created channels also have MCAUSER set to blank.

It is recommended that the following commands be executed using RUNMQSC to close this loophole:

alter chl(SYSTEM.DEF.SVRCONN) chltype(SVRCONN) trptype(LU62) +
mcauser(NOBODY)

alter chl(SYSTEM.DEF.RECEIVER) chltype(RCVR) trptype(LU62) +
mcauser(NOBODY)

alter chl(SYSTEM.DEF.REQUESTER) chltype(RQSTR) trptype(LU62) +
mcauser(NOBODY)
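Because newly created channels inherit the blank MCAUSER, it may be worth checking every channel rather than only the defaults. The Python below is a minimal sketch of that check. It assumes you have already collected each channel's name, type and MCAUSER into a list of tuples (how you gather them is up to you and is not shown), and the function name is hypothetical. It only generates ALTER CHANNEL commands as text for review; it does not talk to a queue manager.

```python
def blank_mcauser_fixes(channels):
    """Return MQSC ALTER commands for channels whose MCAUSER is blank.

    `channels` is an assumed input format: (name, chltype, mcauser) tuples.
    """
    commands = []
    for name, chltype, mcauser in channels:
        if mcauser.strip() == "":
            # Blank MCAUSER: inbound work runs with the channel's own authority
            commands.append(
                f"ALTER CHANNEL({name}) CHLTYPE({chltype}) MCAUSER(NOBODY)"
            )
    return commands


# Illustrative data only; APP1.SVRCONN and app1user are made-up names
channels = [
    ("SYSTEM.DEF.SVRCONN", "SVRCONN", " "),
    ("SYSTEM.DEF.RECEIVER", "RCVR", ""),
    ("APP1.SVRCONN", "SVRCONN", "app1user"),
]
for command in blank_mcauser_fixes(channels):
    print(command)
```

The generated commands should be reviewed before being run through RUNMQSC, since some channels may legitimately need a specific userid rather than NOBODY.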

Do not start MQ using root

It’s worth noting that much of this section is described in Unix terms, though it’s applicable to most platforms, once Unix terms are substituted with their equivalents.

All MQSeries components should be started using the MQSeries administration userid (mqm). Many system administrators like to make the system administration userid (root) a member of the mqm group. This is understandable, as they can then run all of their administration commands, not all of which are for MQ, as root. However, this is a very dangerous thing for them to do as they are effectively giving root authority to all of the members of the mqm group.

For example, if the trigger monitor of the default queue manager is started by root using default parameters, a member of the mqm group whose workstation has the hostname ‘myhost’ can enter the following commands using RUNMQSC:

DEFINE QL(MYQUEUE) TRIGGER PROCESS(MYPROCESS) +
INITQ(SYSTEM.DEFAULT.INITIATION.QUEUE)

DEFINE PROCESS(MYPROCESS) APPLICID('xterm -display myhost:0 &')

and then enter the command:

echo hello | amqsput MYQUEUE

This causes a terminal to appear on their screen giving them a command line with root authority from which they have full control of the system.

Similarly, if a channel is started by root, or the channel initiator starts a channel and the channel initiator is started by root, then any exits called by the channel will run as root. So the mqm member could write and install an exit that again spawns a root-authorized xterm.

The receiver channel could have the same problems, for example, if started as root by the listener, inetd, or Communications Manager.

A good start to overcoming this problem is to remove root from the mqm group. However, on some systems root will still have access to the strmqm command and, while it may look as though it has started the queue manager, there may be unexpected errors later when it performs commands for which the OAM checks authority.

The system administrator may find it useful to create commands that only root is authorized to run which switch to the mqm userid before performing the instruction. For example the following shell script could be called strmqm and put higher in root’s path than the real strmqm.

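#!/bin/ksh
# Wrapper for strmqm: switch to the mqm userid before starting the queue manager.
# The command is quoted so that the queue manager name ($1) reaches strmqm.
su - mqm -c "/usr/lpp/mqm/bin/strmqm $1"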

Only use groups on UNIX OAM

The setmqaut command is used to set access to MQSeries objects. Among its parameters you may specify ‘-p PrincipalName’ or ‘-g GroupName’ to indicate to which users you intend this command to apply.

For example, the following command specifies that all members of the group tango are to be allowed to put messages on queue orange.queue on queue manager saturn.queue.manager:

setmqaut -m saturn.queue.manager -n orange.queue -t queue -g tango +put

Similarly, the command:

setmqaut -m saturn.queue.manager -n orange.queue -t queue -p theuser +put

specifies that the userid theuser should be allowed to put messages on queue orange.queue on queue manager saturn.queue.manager. On most platforms this works fine. However, the implementation on Unix systems is that:

setmqaut -m saturn.queue.manager -n orange.queue -t queue -p theuser +put

specifies that all of the members of theuser’s primary group are allowed to put messages on queue orange.queue on queue manager saturn.queue.manager.

This can be very dangerous, as a system administrator can give access to a particular user unaware that in doing so he has accidentally also given access to many other users. User theuser may also be unhappy to be blamed by administrators for actions that they believe only he is authorized to have carried out.

The way around this problem is never to use the ‘-p’ parameter on Unix. The same effect can be obtained by specifying ‘-g PrimaryGroup’, which is a lot clearer.
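The Unix behaviour described above can be illustrated with a small model. The following Python sketch is not MQSeries code: the class UnixOam, its methods, and the userids colleague and outsider are hypothetical, and only primary groups are modelled. It simply shows how a grant made with ‘-p’ ends up attached to the principal’s primary group.

```python
class UnixOam:
    """Toy model of the Unix OAM behaviour described in the text (simplified)."""

    def __init__(self, primary_group):
        # userid -> primary group name
        self.primary_group = dict(primary_group)
        # (queue, group) -> set of authorities
        self.grants = {}

    def grant_group(self, queue, group, authority):
        """In the spirit of 'setmqaut ... -g group +authority'."""
        self.grants.setdefault((queue, group), set()).add(authority)

    def grant_principal(self, queue, user, authority):
        """In the spirit of 'setmqaut ... -p user +authority' on Unix:
        the grant is recorded against the user's primary group."""
        self.grant_group(queue, self.primary_group[user], authority)

    def is_allowed(self, queue, user, authority):
        group = self.primary_group[user]
        return authority in self.grants.get((queue, group), set())


oam = UnixOam({"theuser": "staff", "colleague": "staff", "outsider": "guests"})
oam.grant_principal("orange.queue", "theuser", "put")

print(oam.is_allowed("orange.queue", "theuser", "put"))    # True: the intended grant
print(oam.is_allowed("orange.queue", "colleague", "put"))  # True: same primary group
print(oam.is_allowed("orange.queue", "outsider", "put"))   # False
```

Calling grant_group directly with the primary group name gives the same result, which is why ‘-g’ is the clearer choice.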

Only create objects as mqm on Unix

As described above, MQSeries on Unix bases all of its security on the primary group of a userid rather than, as you might expect, on the userid itself. This has other knock-on effects.

When a queue is created, access to it is automatically granted to the mqm group and to the primary group of the userid that created it. It’s quite reasonable for someone designing the security of an MQSeries infrastructure to assume that access to all queues has been forbidden to all users except members of the mqm group. From here, the administrator would specify additional security settings that need to be made.

This works fine when queues are created either by the mqm user or by someone whose primary group is mqm. The problem arises when another user whose primary group is, for instance, staff, but who is also a member of mqm, defines the queue. In this case authority is also granted automatically and unintentionally to all members of the staff group.

This also applies to the creation of queue managers. If a queue manager is created by a userid whose primary group is staff, then all members of staff by default have access to the queue manager.

The simplest solution to this problem is to enforce a policy whereby no userid other than mqm may create MQSeries objects or queue managers. An alternative policy is never to make a userid a member of the mqm group unless this is its primary group.
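As a rough illustration of the automatic grants described in this section, the sketch below records which groups receive access when a queue is created. It is a toy model, not MQSeries code; the function create_queue, the userid alice, and the queue names are hypothetical.

```python
def create_queue(creator, primary_group, acl, queue):
    """Record the automatic grants made at creation time (toy model):
    the mqm group plus the creator's primary group."""
    acl[queue] = {"mqm", primary_group[creator]}
    return acl[queue]


# alice is assumed to be a member of mqm, but her primary group is staff
primary_group = {"mqm": "mqm", "alice": "staff"}
acl = {}

print(sorted(create_queue("mqm", primary_group, acl, "safe.queue")))     # ['mqm']
print(sorted(create_queue("alice", primary_group, acl, "leaky.queue")))  # ['mqm', 'staff']
```

The second queue is open to every member of staff, which is the unintended grant that either policy above is designed to prevent.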

OAM uses union

The Object Authority Manager uses the union of the authority settings that it finds. So, to take the example above a step further, suppose a queue, orange.queue, is created by a userid whose primary group is staff. At some point later it is found that another userid, worker, who shouldn’t have access to the queue, is nevertheless able to access it. worker is a member of staff but has team as his primary group. To resolve this problem an administrator might try running:

setmqaut -m saturn.queue.manager -n orange.queue -t queue

‰  -p worker -all

However, this will not solve the problem. While it will remove team from the authorization list, members of staff, including worker, still have access to the queue.

This also applies to other platforms, such as NT, that implement the ‘-p’ parameter. Although the problem of primary groups is not present, it should be realized that, while:

setmqaut -m saturn.queue.manager -n orange.queue -t queue

‰  -p worker +all

gives full access to worker,

setmqaut -m saturn.queue.manager -n orange.queue -t queue

‰  -p worker -all

only forbids all access if worker is not a member of any authorized groups.
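The union rule can be sketched in a few lines. The Python below is an illustrative model only; the function effective_authority and the particular authorities held by staff and team are assumptions made for the example.

```python
def effective_authority(user, memberships, group_grants):
    """Union of the authorities held by every group the user belongs to."""
    result = set()
    for group in memberships[user]:
        result |= group_grants.get(group, set())
    return result


# The first entry in each list is taken to be the primary group (assumption)
memberships = {"worker": ["team", "staff"]}
group_grants = {"staff": {"put", "get"}, "team": {"put"}}

# Model of 'setmqaut ... -p worker -all' on Unix:
# only the primary group's grants are cleared
primary = memberships["worker"][0]
group_grants[primary] = set()

print(sorted(effective_authority("worker", memberships, group_grants)))  # ['get', 'put']
```

Access disappears only once every group that worker belongs to has had its authority removed.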

Caching

On some platforms, such as Unix, group membership is cached by MQSeries. This means that, if a new user joins a group and needs access to MQSeries objects, the queue manager needs to be restarted. Similarly (and probably more importantly), if a user leaves the team or company, it is not sufficient just to remove them from the group. The user retains access to objects until such a time as the queue manager is restarted.
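The effect of the cache can be pictured with the following sketch. It is a toy model: the class CachingQueueManager is hypothetical, and it assumes that the cache is a snapshot taken at start-up, whereas the text above states only that a restart is needed for changes to take effect.

```python
class CachingQueueManager:
    """Toy model of a queue manager that caches group membership."""

    def __init__(self, os_groups):
        self.os_groups = os_groups  # live view of the operating system's groups
        self.restart()

    def restart(self):
        # Assumption: membership is copied when the queue manager starts
        self._cached = {g: set(members) for g, members in self.os_groups.items()}

    def is_member(self, user, group):
        return user in self._cached.get(group, set())


os_groups = {"payments": {"leaver", "stayer"}}
qmgr = CachingQueueManager(os_groups)

os_groups["payments"].discard("leaver")   # user removed from the group
before_restart = qmgr.is_member("leaver", "payments")
qmgr.restart()
after_restart = qmgr.is_member("leaver", "payments")

print(before_restart, after_restart)  # True False
```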

Only enable things if you need them

This is no more than common sense, and the defaults are such that this won’t cause problems, but for the sake of completeness the following points are worth mentioning:

  • Automatic channel definition

Enabling the automatic definition of channels increases the ability of machines to connect to your queue manager with little prior knowledge of your system, so this should be enabled only if definitely required.

  • Command server

The command server is very powerful and can render weak security even weaker. For instance, on a system running MQSeries version 2 in which users do not have the authority to use the client channel, they could still connect using a sender channel called SYSTEM.DEF.RECEIVER. This could put messages on the command server’s input queue requesting it to create a channel and transmission queue back out. This could then be used for further breaches of security. If you’re not confident of your system’s security, it’s advisable to start the command server only when it is needed and to grant users only the minimum required levels of authority to it.

 

Sam Garforth

SJG Consulting Ltd (UK) © S Garforth 1999

Using RACF and the OAM for end-to-end security

It’s a commonly held view that MQSeries is not a secure product and that to install it in your network infrastructure is to give hackers free rein. In this article I’ll demonstrate that this isn’t necessarily so.

Security is a general term that covers such tasks as sender and receiver authentication, encryption and privacy, non-repudiation, and message integrity and data authentication.
When communication takes place between companies, MACs, digital signatures, and public key encryption may be employed to enforce security, perhaps by means of third-party products, such as Baltimore Secure MQ from Baltimore. However, many companies consider that, when it comes to communication within an enterprise, such measures are not required as all machines in their infrastructure are managed by administrators whom they trust. All that’s required is to provide administrators with the means to prevent unauthorised users from accessing the network and creating messages, while still providing access to authorised users.

This can be done using third-party products, many of which have the potential to secure communications completely. Nevertheless, there is a lot that can be achieved using just the security mechanisms provided by MQSeries itself – that is, using RACF on MVS and the Object Authority Manager (OAM) on distributed platforms.

This article covers the policies that a company would need to put in place and the configuration that administrators would need to implement in order to establish an acceptable level of security in an environment where administrators are trusted but users are not.

EXAMPLE ENVIRONMENT
Consider the following example:
For a number of years a large company has successfully used techniques such as file transfer (FTP) to carry out point-to-point communication. However, in order to improve speed of development they decide to move to an MQSeries-based infrastructure, also deciding to use a central hub managed by a trusted group, as this yields benefits in manageability and allows new connections to be added quickly.

Most of the machines are based in a secure machine room (while it’s possible to log on to the machines from outside the room, a discussion of how to secure this type of access is beyond the scope of this article). Each business unit owns one machine. Business units don’t trust users (who could be disgruntled employees), and they don’t trust administrators of machines belonging to other business units, though they do trust their own administrators. Most security problems, such as ‘sniffers’ on the communication lines, were addressed when FTP was set up (possible solutions include using encryption at the communications layer, splitting SNA packets into so many parts that they are virtually impossible to read, and using security calls within applications themselves).

Consider a situation in which A and B need to communicate with each other using MQSeries, as do C and D (see Figure 1). Most of the security issues that exist in this environment also apply to FTP, though a major new one is introduced.

[Figure 1]

With the environment shown above, if business A decides that it needs to talk to business D, the infrastructure is already in place and only the application development needs to be done. This is a very strong reason for using MQSeries and a hub environment. However, it also introduces the problem that an unauthorised person on A could send a message to D.
In order to secure the end-to-end connection, including preventing the generation of unauthorised messages, it is necessary to carry out the measures detailed in this article.

COMMUNICATIONS LAYER SECURITY
Firstly, security needs to be set up at the level of the communication layer. SNA bind or session-level security can be used to ensure that, when an SNA bind request comes from A, the hub knows that the request really does originate at A. This is a default with most communications packages (but not ones from Tandem) and involves providing the same password at each end. Obviously the password must be kept secret and should be accessible only by the machines’ administrators. Something similar can be done for TCP/IP using secure domain units or some form of Virtual Private Network.

CHANNEL INITIATION SECURITY
We’ve now ensured that no boxes are connected to the hub that shouldn’t be connected to it. However, it’s still possible for a user on A to define a queue manager called C and a channel on A called C.TO.HUB (for example, by knowing the naming convention or by querying the hub’s command server), and then connect to the hub by impersonating C and having messages routed to D.

If the channel is a sender/receiver channel, the only way around this is to use a security exit provided by a third-party product (such as Baltimore, mentioned earlier). However, if A is a secure machine, users won’t have the authority necessary to add these definitions to the system. An alternative is to use requester/sender channels. This is similar to a call-back system: the hub acts as a requester and thus needs to initiate the conversation. It calls out to the known LU/IP address stating that it wants to start a channel. A, acting as the sender, would then initiate the channel back to the hub. If A were to try to start a channel to the wrong requester, the request would not be accepted. Similarly D, acting as a requester, could initiate a conversation asking the hub, as a sender, to call it back. As the hub’s sender channel contains D’s CONNAME and calls it back, the most that A could do in this set-up is to get the hub to call D.
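The call-back idea can be reduced to one observation: the address that the hub calls comes from its own channel definition, not from whoever asked. The sketch below is a minimal model of that point only; the channel name HUB.TO.D, the host names, and the function call_back_target are hypothetical.

```python
def call_back_target(hub_sender_channels, channel_name):
    """Return the address the hub calls for this channel.
    It is taken from the hub's own definition (CONNAME), never from the caller."""
    return hub_sender_channels[channel_name]["CONNAME"]


# Hypothetical definition held on the hub
hub_sender_channels = {"HUB.TO.D": {"CONNAME": "d.example.com"}}

# Whoever asks for the channel to be started, the hub calls D
for requester_address in ("d.example.com", "a.example.com"):
    print(requester_address, "->", call_back_target(hub_sender_channels, "HUB.TO.D"))
```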

ROUTING SECURITY
So now we have a system where we can be confident that all messages coming to the hub on any channel are from the machine that they should be from.
The hub is merely a queue manager, looking after transmission queues and running the associated channels. Each transmission queue is named after the queue manager that it points to. The next problem, as mentioned above, is that a user on A could, by default, do an MQPUT specifying as its target the queue manager of D. The message would be put on A’s default transmission queue (to the hub); when it reaches the hub, it would automatically be put on transmission queue D, and thus get to a destination that it shouldn’t be able to reach.
The way around this is to specify the MCAUSER parameter on the receiver/requester channel definitions. By default the inbound channel at the hub puts messages on its target transmission queue using the userid running the channel. This userid has full access to put messages on all queues. However, if you change the channel’s MCAUSER parameter, the message will be put on the queue using the userid specified by the parameter.
So, define one userid for each inbound channel on the hub. For example, define a userid called A for the channel from A, a userid called C for the channel from C, etc. Alter the inbound channels to put messages on queues using their corresponding userid – for example:

ALTER CHL(A.TO.HUB) CHLTYPE(RQSTR) TRPTYPE(LU62) MCAUSER(A)
ALTER CHL(C.TO.HUB) CHLTYPE(RQSTR) TRPTYPE(LU62) MCAUSER(C)

Next set the permissions on the hub’s transmission queues to accept only messages from authorised channels. How you do this depends on your set-up – on distributed platforms, use the following commands:

setmqaut -m HUB -t qmgr -p A +connect +setall
setmqaut -m HUB -t qmgr -p C +connect +setall
setmqaut -m HUB -t q -n B -p A +put +setall
setmqaut -m HUB -t q -n D -p C +put +setall
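The effect of these settings on routing can be modelled as a simple lookup. The Python below is an illustrative sketch rather than anything the hub actually runs; the dictionaries mirror the channel and authority definitions above, and the function hub_accepts is hypothetical.

```python
# Inbound channel -> userid set in its MCAUSER parameter
CHANNEL_MCAUSER = {"A.TO.HUB": "A", "C.TO.HUB": "C"}

# Userid -> transmission queues it may put to (each named after its target queue manager)
HUB_PUT_GRANTS = {"A": {"B"}, "C": {"D"}}


def hub_accepts(channel, target_qmgr):
    """Would a message arriving on `channel` be put on the transmission
    queue named after `target_qmgr`?"""
    mcauser = CHANNEL_MCAUSER[channel]
    return target_qmgr in HUB_PUT_GRANTS.get(mcauser, set())


print(hub_accepts("A.TO.HUB", "B"))  # True: A is authorised to reach B
print(hub_accepts("A.TO.HUB", "D"))  # False: A has no put authority on queue D
print(hub_accepts("C.TO.HUB", "D"))  # True
```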

If you use RACF, then the following commands are needed:

RDEFINE MQQUEUE HUB.B UACC(NONE)
PERMIT HUB.B CLASS(MQQUEUE) ID(A) ACCESS(UPDATE)
RDEFINE MQQUEUE HUB.D UACC(NONE)
PERMIT HUB.D CLASS(MQQUEUE) ID(C) ACCESS(UPDATE)

TARGET QUEUE SECURITY

So now B and D can be confident that all the messages they receive are from authorised queue managers. The next problem is to make sure that the right messages go to the right queues. For example, user UserX on A might be allowed to send messages to queue QueueQ on B, and user UserY on A might be allowed to send messages to queue QueueR on B (see Figure 2). However, we need to ensure that UserX cannot send messages to QueueR. To do so without either using security exits or changing applications, business B needs to trust A’s administrator (but not C’s, etc). Also a system-wide naming convention of userids needs to be enforced.

[Figure 2]

By default, when a user on A sends a message, the user’s userid is put in the USERID field of the message descriptor. The user is not allowed to change this. Also, by default the inbound channel at the receiver (for instance, B) puts messages on its target queue using the userid that’s used to run the channel. This userid has sufficient access rights to put messages on all queues. If you change the PUTAUT parameter of the channel from PUTAUT(DEF) to PUTAUT(CTX), messages are placed on the queue using the authority of the userid specified in the message descriptor.

So queues can now be secured by defining userids on the receiving machines that have the same names as the userids on the sending machines. The receiving userids do not need authority to log on. So, in this example, define two users, UserX and UserY, on B, and set PUTAUT(CTX) on the inbound channel:

ALTER CHL(HUB.TO.B) CHLTYPE(RQSTR) TRPTYPE(LU62) PUTAUT(CTX)

On distributed platforms, issue the following commands:

setmqaut -m B -t qmgr -p UserX +connect
setmqaut -m B -t q -n QueueQ -p UserX +put +setall
setmqaut -m B -t qmgr -p UserY +connect
setmqaut -m B -t q -n QueueR -p UserY +put +setall

If you use RACF, the following commands are needed (assuming MQM runs the channel):

RDEFINE MQQUEUE B.QueueQ UACC(NONE)
PERMIT B.QueueQ CLASS(MQQUEUE) ID(MQM) ACCESS(UPDATE)
PERMIT B.QueueQ CLASS(MQQUEUE) ID(UserX) ACCESS(UPDATE)
RDEFINE MQQUEUE B.QueueR UACC(NONE)
PERMIT B.QueueR CLASS(MQQUEUE) ID(MQM) ACCESS(UPDATE)
PERMIT B.QueueR CLASS(MQQUEUE) ID(UserY) ACCESS(UPDATE)
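The difference between PUTAUT(DEF) and PUTAUT(CTX) comes down to which userid is checked when the channel puts a message. The following Python is a simplified model under that reading; the function put_allowed is hypothetical, and MQM is assumed to be the userid running the channel, as in the RACF example above.

```python
# Queue -> userids authorised to put to it on B
QUEUE_PUT_GRANTS = {"QueueQ": {"UserX"}, "QueueR": {"UserY"}}

CHANNEL_USERID = "MQM"  # assumption: runs the channel and may put to any queue


def put_allowed(putaut, message_userid, queue):
    """Decide whether the inbound channel may put a message on `queue`."""
    if putaut == "DEF":
        checked = CHANNEL_USERID    # the channel's own userid is checked
    elif putaut == "CTX":
        checked = message_userid    # the userid from the message descriptor is checked
    else:
        raise ValueError(f"unsupported PUTAUT value: {putaut}")
    if checked == CHANNEL_USERID:
        return True
    return checked in QUEUE_PUT_GRANTS.get(queue, set())


print(put_allowed("DEF", "UserX", "QueueR"))  # True: any message gets through
print(put_allowed("CTX", "UserX", "QueueR"))  # False: UserX may not use QueueR
print(put_allowed("CTX", "UserX", "QueueQ"))  # True
```

Note that the model still accepts a message whose descriptor carries the channel’s own userid, so the scheme relies on senders being unable to set that field themselves.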

As mentioned previously, security is a bit more complex on Tandem systems. On Tandem, the ‘userid’ in the message descriptor is actually a groupid. For the above to work when a Tandem system is the receiver, it is necessary to use a group that’s defined and authorised with the same name as the sending userid. When a Tandem system is the sender, the receiver needs a userid defined and authorised with the same name as the sending group.

LOCAL QUEUE SECURITY AT THE SENDER

If users on the sender machine do not trust one another, some additional work is necessary to set up security.

If QREMOTE queues are not used, and users specify the target queue manager in the MQPUT call, then messages from UserX and UserY on A are put directly on the transmission queue and there is no way for MQSeries to stop them specifying one another’s target queues. It is also possible, when the channel is not running, for them to remove one another’s messages before the messages are sent.

The best way to solve this problem is to restrict access to transmission queues (this is the default) and to allow users to put messages only on QREMOTE queues that point to the target queues. Using this approach, a secure structure can be set up such that UserX and UserY cannot put messages on one another’s queues.

For instance, using RUNMQSC, enter the following definitions:

DEFINE QR(TO.Q.ON.B) RNAME(QueueQ) RQMNAME(B)
DEFINE QR(TO.R.ON.B) RNAME(QueueR) RQMNAME(B)

The commands below are the ones to use on distributed platforms:

setmqaut -m A -t qmgr -p UserX +connect
setmqaut -m A -t qmgr -p UserY +connect
setmqaut -m A -t q -n TO.Q.ON.B -p UserX +put
setmqaut -m A -t q -n TO.R.ON.B -p UserY +put

The ones below are for use with RACF:

PERMIT A.BATCH CLASS(MQCONN) ID(UserX) ACCESS(READ)
PERMIT A.BATCH CLASS(MQCONN) ID(UserY) ACCESS(READ)
RDEFINE MQQUEUE A.TO.Q.ON.B UACC(NONE)
RDEFINE MQQUEUE A.TO.R.ON.B UACC(NONE)
PERMIT A.TO.Q.ON.B CLASS(MQQUEUE) ID(UserX) ACCESS(UPDATE)
PERMIT A.TO.R.ON.B CLASS(MQQUEUE) ID(UserY) ACCESS(UPDATE)
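Where many users and QREMOTE queues are involved, the distributed-platform commands can be generated from a table rather than typed by hand. The sketch below shows one possible way of doing this; the function setmqaut_commands and the shape of its input are assumptions, and it only prints command lines using the same flags as the examples above, without running them.

```python
def setmqaut_commands(qmgr, put_rights):
    """Build setmqaut command lines from a mapping of userid -> QREMOTE queues.
    The commands are returned as text; nothing is executed."""
    commands = []
    for user in sorted(put_rights):
        commands.append(f"setmqaut -m {qmgr} -t qmgr -p {user} +connect")
    for user in sorted(put_rights):
        for queue in put_rights[user]:
            commands.append(f"setmqaut -m {qmgr} -t q -n {queue} -p {user} +put")
    return commands


rights = {"UserX": ["TO.Q.ON.B"], "UserY": ["TO.R.ON.B"]}
for line in setmqaut_commands("A", rights):
    print(line)
```

Keeping the table under change control gives administrators a single record of who may put to which remote queue.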

However, if you are happy to allow applications to write to the transmission queue, you could use either the following commands on distributed platforms:

setmqaut -m A -t q -n B -p UserX +put
setmqaut -m A -t q -n B -p UserY +put

or this one with RACF:

RDEFINE MQQUEUE A.B UACC(UPDATE)

LOCAL QUEUE SECURITY AT THE RECEIVER

If users on the receiving machine do not trust one another, then it’s necessary to set up some additional security.

Say UserQ is able to read messages on queue QueueQ and UserR is able to read messages on queue QueueR. If the users are not considered trustworthy, then one needs to guard against the possibility that UserR may put a message on queue QueueQ and that UserQ may receive it, believing it to have come from A. Similarly, UserR could get messages from queue QueueQ before UserQ gets them. To prevent this, it is necessary to run the following OAM commands:

setmqaut -m B -t qmgr -p UserQ +connect
setmqaut -m B -t qmgr -p UserR +connect
setmqaut -m B -t q -n QueueQ -p UserQ +get
setmqaut -m B -t q -n QueueR -p UserR +get

With RACF, the following commands would be needed:

PERMIT B.BATCH CLASS(MQCONN) ID(UserQ) ACCESS(READ)
PERMIT B.BATCH CLASS(MQCONN) ID(UserR) ACCESS(READ)
RDEFINE MQQUEUE B.QueueQ UACC(NONE)
RDEFINE MQQUEUE B.QueueR UACC(NONE)
PERMIT B.QueueQ CLASS(MQQUEUE) ID(UserQ) ACCESS(UPDATE)
PERMIT B.QueueR CLASS(MQQUEUE) ID(UserR) ACCESS(UPDATE)

Note that on MVS a problem still remains. UserQ (or perhaps a member of the same group) can run an application that puts messages on queue QueueQ that the main UserQ application then reads off in the belief that they came from A. On distributed platforms, the OAM command SETMQAUT can be used to ensure that UserQ can get messages from a queue but not put them on it. RACF does not have this facility. A user is either able to both get and put messages on a queue or neither. One solution to this is to use ‘alias’ queues.

For example:

DEFINE QA(ACCESS.BY.USERQ) TARGQ(QueueQ) PUT(DISABLED)

RACF could then be used to prevent UserQ from directly accessing queue QueueQ while giving the user full access to the ACCESS.BY.USERQ alias queue. The PUT(DISABLED) attribute ensures that the user can’t put messages on the queue. Note that the PUT(DISABLED) attribute could not have been used directly on queue QueueQ, as this would have stopped the channel from being able to write messages.
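The way the alias queue combines with all-or-nothing access can be modelled as two separate checks. The Python below is a toy illustration, not a description of RACF internals; the function can and the data layout are hypothetical, and MQM is assumed to be the userid running the channel.

```python
QUEUES = {
    "QueueQ": {"put_enabled": True},
    "ACCESS.BY.USERQ": {"target": "QueueQ", "put_enabled": False},  # PUT(DISABLED)
}

# All-or-nothing access per queue name, as described for RACF in the text
ACCESS = {"UserQ": {"ACCESS.BY.USERQ"}, "MQM": {"QueueQ"}}


def can(user, operation, queue):
    """operation is 'get' or 'put'."""
    if queue not in ACCESS.get(user, set()):
        return False                      # no access to this name at all
    if operation == "put" and not QUEUES[queue]["put_enabled"]:
        return False                      # the queue attribute blocks puts
    return True


print(can("UserQ", "get", "ACCESS.BY.USERQ"))  # True
print(can("UserQ", "put", "ACCESS.BY.USERQ"))  # False: PUT(DISABLED) on the alias
print(can("UserQ", "put", "QueueQ"))           # False: no access to the real queue
print(can("MQM", "put", "QueueQ"))             # True: the channel can still deliver
```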

While this method works, it’s a bit of overkill. As it’s common in MVS for a user to have read/write access to a dataset, allowing them also to have read/write access to a queue is usually seen as a natural extension. Another consideration is that, in MVS, it’s less likely that the same userid is used to run different applications.

Note that, on all platforms, such measures are unnecessary if the administrator has secured the machine so that users cannot add or run their own applications.

Sam Garforth
SJG Consulting Ltd (UK) © S Garforth 1999

Updates

The following comments have been made by a reader more recently and need to be incorporated into the article:

The statement is made that “If A were to try to start a channel to the wrong requester, the request would not be accepted.” This is not true. In the diagram, any of A, B, C, or D could start the same requester on the hub (although not more than one at a time).

Also, if you give the MCAUSER +setall authority and set PUTAUT(CTX), what is to prevent a malicious user on A from sending messages as mqm to the command server (or any arbitrary queue) at the hub or any of the other machines? Each company has to trust the MQ administrators at all the other companies using that hub. That’s a lot to ask.

You might want to add a note in the document for TCP/IP users. It’s good practice to have the users coming in on different ports from each other and from the one you use internally. If, for example, you use 1414 internally and A, B, C and D use 1415, 1416, 1417 and 1418 respectively you gain a lot of control. You can stop all external traffic while allowing internal traffic to continue by shutting down listeners on all ports other than 1414. Or you can stop traffic from one business partner without affecting the others.
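The control gained from separate ports can be shown with a simple lookup. The sketch below uses the port numbers from the comment above; the function ports_to_stop is hypothetical, and it only works out which listeners would need to be shut down without touching any real listener.

```python
# Port -> who connects on it (numbers taken from the example above)
LISTENER_PORTS = {1414: "internal", 1415: "A", 1416: "B", 1417: "C", 1418: "D"}


def ports_to_stop(block):
    """Return the listener ports to shut down.
    `block` is 'external' (all partners) or a single partner name."""
    if block == "external":
        return sorted(p for p, owner in LISTENER_PORTS.items() if owner != "internal")
    return sorted(p for p, owner in LISTENER_PORTS.items() if owner == block)


print(ports_to_stop("external"))  # [1415, 1416, 1417, 1418]
print(ports_to_stop("C"))         # [1417]
```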

There are a number of other measures that can and probably should be taken to secure a hub QMgr that talks to several different external partners. These include adjusting the channel retry settings to enforce some flood control, disabling QMgr resolution by not naming XMit queues after the QMgr they serve, and guarding against SVR channel behavior that can be exploited in a hub environment.

Sam Garforth
SJG Consulting Ltd (UK) © S Garforth 2003