Coupling Facility
In IBM mainframe computers, a Coupling Facility or CF is a piece of computer hardware which allows multiple processors to access the same data.
A Parallel Sysplex relies on one or more Coupling Facilities (CFs). A coupling facility is a mainframe processor that runs in its own LPAR, with a dedicated physical CP defined through the Hardware Management Console (HMC). It has memory, special channels (CF links), and a specialised operating system called Coupling Facility Control Code (CFCC). It has no I/O devices other than the CF links. Because CFCC is not a virtual memory operating system, the information in the CF resides entirely in memory. A CF typically has a large memory, of the order of several gigabytes. In principle, any IBM mainframe can serve as a coupling facility. The CF runs no application software.
When originally introduced, the CFCC executed in a separate mainframe unit that was essentially a processor without I/O facilities other than the CF links. Later, IBM enabled the use of an Internal Coupling Facility, where the CFCC runs in a logical partition (LPAR) defined in a standard processor complex and communicates over internal links within that processor complex's hardware. Links to another processor unit are over copper cables. More than one CF is typically configured in a Sysplex cluster for reliability and availability. Recovery support in the z/OS operating system allows structures to be rebuilt in the alternate CF in the event of a failure.
Supported by CFs, a Sysplex cluster scales very well up to several hundred CPUs (in up to 32 members, each with up to 64 CPUs) running transaction and database applications. Using the CF links, data can be exchanged directly between the CF memory and the memory of the attached systems, using a mechanism similar to direct memory access, without interrupting a running program. Systems in a Sysplex cluster store CF information in local memory in an area called a bit vector. This enables them to query critical state information of other systems in the Sysplex locally, without issuing requests to the CF. The System z Architecture includes 18 special machine instructions and additional hardware features supporting CF operation.
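The role of the local bit vector can be illustrated with a toy model. The sketch below is not the z/OS or CFCC interface: the classes ToyCouplingFacility and ToySystem and all of their methods are hypothetical, and the model assumes a cache-coherency use in which each system registers interest in a named item and the CF clears that system's bit when another system changes the item.

```python
class ToyCouplingFacility:
    """Hypothetical stand-in for a CF: remembers who registered interest in what."""

    def __init__(self):
        self.interest = {}  # item name -> set of (system, bit index) pairs

    def register(self, system, item, bit_index):
        self.interest.setdefault(item, set()).add((system, bit_index))
        system.bit_vector[bit_index] = True  # the system's local copy is now valid

    def write(self, writer, item):
        # Clear the bit of every other registered system; in this model the
        # target system runs no code of its own while its bit is updated.
        for system, bit_index in self.interest.get(item, set()):
            if system is not writer:
                system.bit_vector[bit_index] = False


class ToySystem:
    """Hypothetical attached system holding a local bit vector."""

    def __init__(self, name, vector_size=8):
        self.name = name
        self.bit_vector = [False] * vector_size

    def local_copy_is_valid(self, bit_index):
        # Purely local test: no request is sent to the CF.
        return self.bit_vector[bit_index]


cf = ToyCouplingFacility()
sys_a, sys_b = ToySystem("SYSA"), ToySystem("SYSB")
cf.register(sys_a, "PAGE1", 0)
cf.register(sys_b, "PAGE1", 3)
cf.write(sys_a, "PAGE1")  # SYSA changes the shared item
print(sys_a.local_copy_is_valid(0), sys_b.local_copy_is_valid(3))  # True False
```

In this model SYSB learns that its copy is stale by testing one bit in its own memory, which is the point of keeping the vector local.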
Coupling Facility structures
A CF is used for three purposes:
- Locking information that is shared among all attached systems
- Cache information (such as for a database) that is shared among all attached systems (or used to maintain coherency between local buffer pools in each system)
- Data list information that is shared among all attached systems
These three purposes are catered for by three types of structure:
- Lock
- Cache
- List (and the variant Serialised List)
A structure is a dedicated portion of CF memory. CF-exploiting applications on the coupled z/OS systems are said to connect to it. A typical Parallel Sysplex contains several structures of each type, and each software exploiter may use several structures of each type. For example, each DB2 Data Sharing Group uses one Lock structure, one List structure and several Cache structures (one for each Group Buffer Pool (GBP)).
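The DB2 example can be written down as a small data model. This is a sketch under stated assumptions, not a CF policy definition: the names Structure, StructureType and data_sharing_group_structures, the group name GROUP1, the structure names and the 64 MB size are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class StructureType(Enum):
    LOCK = "lock"
    CACHE = "cache"
    LIST = "list"


@dataclass(frozen=True)
class Structure:
    name: str
    type: StructureType
    size_mb: int  # illustrative figure only, not a sizing recommendation


def data_sharing_group_structures(group, buffer_pools, size_mb=64):
    """Build one Lock, one List and one Cache structure per group buffer pool."""
    structures = [
        Structure(f"{group}_LOCK", StructureType.LOCK, size_mb),
        Structure(f"{group}_LIST", StructureType.LIST, size_mb),
    ]
    structures += [
        Structure(f"{group}_{pool}", StructureType.CACHE, size_mb)
        for pool in buffer_pools
    ]
    return structures


structures = data_sharing_group_structures("GROUP1", ["GBP0", "GBP1", "GBP2"])
counts = Counter(s.type for s in structures)
print(counts[StructureType.LOCK], counts[StructureType.LIST], counts[StructureType.CACHE])  # 1 1 3
```

A group with three group buffer pools therefore occupies five structures in this model, each a separate portion of CF memory.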
Structure duplexing
Structures may be duplexed across different CFs, allowing two copies of the same structure to be kept synchronised. Duplexing is often used as part of an installation's drive to remove single points of failure, with the aim of reducing the incidence and duration of application outages. In the event of the failure of one CF, the other copy of the structure is used to satisfy all requests.
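The effect of duplexing can be sketched as follows. This is a toy model assuming the simplest possible scheme, in which every update is applied to both copies and reads are served from whichever copy survives. The classes ToyStructureCopy and DuplexedStructure and the CF names CF01 and CF02 are hypothetical, and the sketch says nothing about how the real duplexing protocol keeps the copies synchronised.

```python
class ToyStructureCopy:
    """One copy of a structure, held in one (hypothetical) CF."""

    def __init__(self, cf_name):
        self.cf_name = cf_name
        self.data = {}
        self.available = True


class DuplexedStructure:
    """Keeps two copies in step and hides the loss of either one."""

    def __init__(self, primary, secondary):
        self.copies = [primary, secondary]

    def _live_copies(self):
        live = [copy for copy in self.copies if copy.available]
        if not live:
            raise RuntimeError("no surviving copy of the structure")
        return live

    def write(self, key, value):
        # Apply the update to every copy that is still available.
        for copy in self._live_copies():
            copy.data[key] = value

    def read(self, key):
        # Serve the request from the first surviving copy.
        return self._live_copies()[0].data[key]


primary, secondary = ToyStructureCopy("CF01"), ToyStructureCopy("CF02")
structure = DuplexedStructure(primary, secondary)
structure.write("lock:ACCT42", "held by SYSA")
primary.available = False  # simulate the failure of CF01
print(structure.read("lock:ACCT42"))  # held by SYSA
```

The read after the simulated failure is answered from the copy in CF02 without any rebuild step, which is the benefit duplexing aims for.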
Coupling Facility requests
A request to a CF structure is of one of two kinds:
- Synchronous (sync) requests. When a z/OS system issues a request, it waits for the request to complete, actively "spinning" on one of its own processors. Sync requests are quick, but the issuing processor is unavailable for other work for the whole of the response time. Sync requests are therefore relatively expensive in CPU terms from the coupled system's perspective.
- Asynchronous (async) requests. When a z/OS system issues a request, it does not wait for the request to complete. Async requests are slower than sync requests (as they have a lower priority in the CF) but do not leave the coupled system's processor "spinning".
Exploiting z/OS applications explicitly issue CF requests as sync or async.
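The difference between the two kinds of request can be illustrated with ordinary Python threads. This is an analogy only: the single-worker thread pool stands in for the CF, the function names cf_service, issue_sync and issue_async are hypothetical, the 10 ms service time is arbitrary, and the model does not reproduce the lower priority that async requests have in the CF.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# A single worker thread stands in for the CF (an assumption of this sketch).
cf_executor = ThreadPoolExecutor(max_workers=1)


def cf_service(request, service_time=0.01):
    """Pretend CF work: wait for the service time, then answer."""
    time.sleep(service_time)
    return f"done:{request}"


def issue_sync(request):
    """Issue a request and spin until it completes, counting wasted iterations."""
    future = cf_executor.submit(cf_service, request)
    spins = 0
    while not future.done():  # busy-wait: the caller does no useful work here
        spins += 1
    return future.result(), spins


def issue_async(request):
    """Issue a request and return at once; the caller collects the result later."""
    return cf_executor.submit(cf_service, request)


result, spins = issue_sync("LOCK-REQ")
pending = issue_async("LIST-REQ")
other_work = sum(range(1000))  # the caller stays productive while the CF works
print(result, spins, other_work, pending.result())
```

The spin count printed for the sync request is the model's version of the CPU cost described above; the async caller pays no such cost but must come back for its answer.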
Dynamic Request Conversion
In z/OS Release 2, the "Dynamic Request Conversion" heuristic algorithm was introduced. It uses sampled response times to decide whether to convert Sync requests to Async. These decisions are based on such criteria as coupled processor speed. The greater the distance between the coupled z/OS system and the CF, the greater the likelihood that requests will be converted from Sync to Async.
Async requests are never converted to Sync.
This heuristic algorithm complements a pre-existing algorithm that automatically (but not heuristically) converted requests, based on conditions such as path busy and on request data size. The difference is that the new algorithm samples response times dynamically.
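The general shape of such a heuristic can be sketched in a few lines. This is not the z/OS algorithm, whose thresholds and sampling rules are not given here. It is a minimal model assuming a moving average of recent sync response times compared against a fixed cut-off; the class ToyRequestConverter, the 30 microsecond threshold and the sample values are all hypothetical.

```python
from collections import deque


class ToyRequestConverter:
    """Decides sync or async from a moving average of sampled sync response times."""

    def __init__(self, threshold_us, window=16):
        self.threshold_us = threshold_us  # hypothetical cut-off in microseconds
        self.samples = deque(maxlen=window)  # only the most recent samples count

    def record_sync_response(self, response_us):
        self.samples.append(response_us)

    def average_us(self):
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def choose_mode(self, requested_mode):
        if requested_mode == "async":
            return "async"  # async requests are never converted to sync
        if self.average_us() > self.threshold_us:
            return "async"  # spinning would cost more than it is worth
        return "sync"


near = ToyRequestConverter(threshold_us=30)  # CF close to the z/OS system
far = ToyRequestConverter(threshold_us=30)   # CF at a greater distance
for sample in (8, 10, 12):
    near.record_sync_response(sample)
for sample in (60, 70, 80):
    far.record_sync_response(sample)
print(near.choose_mode("sync"), far.choose_mode("sync"), near.choose_mode("async"))
# sync async async
```

In this model a faster coupled processor would be represented by a lower threshold, since each microsecond of spinning then wastes more potential work. The sketch also leaves out how samples continue to be collected once requests are being converted.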
CFs are unique to S/390, zSeries and System z mainframes. They are key to Parallel Sysplex technology.
Coupling Facility Levels and Exploiting Software Levels
CFCC is released in "Levels", each usually denoted by its "CFLEVEL". For example, CFLEVEL 15 was announced in April 2007. Each level brings new function and sometimes improved performance. In most cases the new function or performance improvement requires a corequisite release of z/OS and perhaps new function in some subsystem (such as DB2). One such example is Coupling Facility Structure Duplexing. (Sometimes support from the operating system and subsystems is available via PTFs rather than a full release.)