Building high-performance computer systems is all about finding and removing bottlenecks. Unfortunately, this is a never-ending game. As fast as you identify and deal with one bottleneck, another appears. For a while, the problem was the raw performance of the four key elements in a computer system: processor, memory, network and storage.
Over the last decade, however, the problem has moved away from these four components. The amount of compute power inside modern processors continues to grow. Today’s processors are faster, use less power and deliver vastly better performance than the previous generation. They are no longer constricted by the amount of memory available to them, as memory chips continue to increase in density and speed. Storage and networking are also delivering higher performance at lower cost.
The next war is system interfaces
The problem today is how those elements are all connected together. This is where IBM and Intel are beginning to diverge in how they address the problem. To complicate matters, the refresh cycle of the system bus that everything is connected to is slowing. It is also out of step with the refresh cycles of the hardware, which puts a lot of pressure on manufacturers to find alternative solutions.
PCI-Express: the workhorse of intersystem communication
The current king of the system bus protocols is PCI-Express (PCI-E). It has been around for 15 years, having evolved from the earlier PCI standard. The current version is PCI-E 3.0, which can be found throughout the data centre. Virtually all plug-in cards are designed to work with this version. Both Intel and IBM support PCI-E 3.0 with their existing processor families – XEON (Intel) and POWER (IBM).
The PCI-E 4.0 specification has already been ratified and this is where IBM and Intel diverge with their support. Intel has made it clear that its next generation of processors, recently announced, will not support PCI-E 4.0. In fact, only when pressed will Intel say that it currently expects to see support in 2019. Not only is this a long way off, it is very close to the expected release of PCI-E 5.0.
IBM, however, has already said that its POWER9 processor, due to ship in late 2017, will support PCI-E 4.0. How many peripheral manufacturers will announce cards to support this is unclear. It is important not to see this as a unilateral move by IBM. The OpenPOWER Foundation has a large number of members who are likely to follow IBM’s lead. As well as shipping their own POWER9-based systems, they are likely to put PCI-E 4.0 slots on their motherboards.
Will backing from the ecosystem influence Intel’s hand?
Among the IBM system partners are the likes of Supermicro, Tyan, Google, Boston, Gigabyte, MSI, NEC, Wistron and Rackspace. It will be interesting to see how many of these produce motherboards with PCI-E 4.0 slots.
There is also a very large number of partners who produce plug-in boards, including Mellanox, Xilinx, NVIDIA, Infineon and QLogic. While PCI-E 4.0 slots will support earlier generations, it is likely that some of these vendors will quickly announce their own PCI-E 4.0 products.
All of these vendors have a history with Intel and do business with it. If they are supporting PCI-E 4.0, there may be pressure on Intel to do something about its XEON processors. It will be interesting to see how quickly Intel responds if that is the case.
Coherent Accelerator Processor Interface (CAPI) – a new speed differentiator?
With the POWER8 processor, IBM introduced CAPI, which allows third-party hardware to talk directly to the processor. While CAPI devices are plugged into PCI-E 3.0 slots, they do not carry the overhead of the PCI-E protocol. Instead, they talk directly to the CPU, which reduces transaction overhead. IBM says that the difference is a reduction of around 95% in traffic. This leaves more capacity for traffic between the CAPI device and the CPU.
The first devices to take advantage of CAPI were flash memory cards. This allowed the POWER8 processor to support large in-memory databases faster than Intel XEON driven systems. A number of OpenPOWER Foundation partners, such as Xilinx, have helped third parties build other accelerators using CAPI. In the past two years this has helped IBM develop a viable ecosystem around CAPI.
With POWER9, IBM is introducing two new versions of CAPI. CAPI 2.0 has been developed by IBM. IBM also gave the original CAPI code away to an open source project called OpenCAPI. This has drawn in a number of companies that are not OpenPOWER Foundation partners, including AMD, Dell EMC and Hewlett Packard Enterprise. There are no announced plans from these three big x86 supporters as to what they will do with OpenCAPI. However, the much larger bandwidth that it provides could result in them adding OpenCAPI support to their own technology.
Intel has not yet joined OpenCAPI, nor does it have any announced plans to provide a similar capability in its XEON processors. If the OpenCAPI consortium becomes very active and support grows, it raises the question of how long Intel can remain on the sidelines.
Addressing the GPU – a versatile workhorse
Graphics Processing Units (GPUs) have become extremely interesting in computing terms. Look beyond gaming and playing videos, and GPUs are used for a range of very complex compute needs. One of these is mining cryptocurrencies (e.g. Bitcoin). Mining cryptocurrencies means solving increasingly complex problems that require access to large amounts of compute power.
GPUs are ideally suited to cryptocurrency mining as they are designed for parallel processing. This allows them to break up problems, work on each piece at the same time and then reassemble the pieces. Parallel processing is also important to engineering, chemical and drug design, deep learning and analytics.
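The split, work and reassemble pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea: it uses CPU threads from the standard library as a stand-in for GPU cores, and the function names (split, work, parallel_sum_of_squares) are hypothetical. It shows the shape of the pattern, not the performance a GPU would deliver.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, pieces):
    """Break the problem into at most `pieces` roughly equal chunks."""
    size = max(1, -(-len(data) // pieces))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def work(chunk):
    """The per-piece computation: here, a sum of squares."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, pieces=4):
    """Split the data, work on every chunk concurrently, then reassemble."""
    chunks = split(data, pieces)
    with ThreadPoolExecutor(max_workers=pieces) as pool:
        partials = list(pool.map(work, chunks))
    # Reassemble: combine the partial results into one answer.
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10))))
```

The approach only works when the pieces are independent of one another, which is why workloads such as mining, deep learning and analytics suit it so well.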
With POWER8, NVIDIA and IBM designed an interface, NVLink, that allows the POWER8 processor to talk directly to the GPU. This allows the POWER8 processor to pass instructions and data to the GPU for it to execute. In 2016, IBM refreshed the POWER8 range to include motherboards and systems that support up to 4 GPUs talking to a pair of POWER8 CPUs.
With POWER9, the two companies will introduce NVLink 2.0. The link is designed to be at least twice as fast as NVLink 1.0 and will support NVIDIA’s latest generation of high-speed GPUs. The currently announced plans will see support for up to 4 GPUs in a system, although it is likely that when POWER9 refreshes in 2018 there will be support for 6 or even 8 NVIDIA GPUs.
Intel requires GPUs to be plugged in to the PCI-E bus. There is no direct CPU to GPU connection such as NVLink, and it is unlikely, at the moment, that Intel will add such a technology. With POWER8, the direct connection results in a 3x speed improvement for IBM over Intel. Until POWER9 ships, it is not possible to know what speed improvement NVLink 2.0 will deliver. However, as Intel is staying with PCI-E 3.0, its ability to transfer data to and from the GPU will see no obvious performance improvement.
Buyer’s choice: What does this all mean?
With both IBM and Intel having announced their processor plans for the next two years and with silicon due to go into production, there are no more big changes expected until 2018 at the very earliest. This means that, in terms of accelerators and system interfaces, IBM has stolen a march on Intel.
Do not expect this to go unnoticed. Intel does have the ability to change its plans in 2018 when it will refresh its announced XEON line. At that point it could introduce support for PCI-E 4.0. However, it might choose to wait until 2019 when PCI-E 5.0 should be ratified. PCI-E 5.0 is expected to offer a 4x improvement in bandwidth over PCI-E 3.0. This will be the biggest jump in performance for PCI-E in its history.
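The 4x figure follows from the signalling rates. As a rough sketch, assuming the published per-lane rates of 8 GT/s for PCI-E 3.0 and 16 GT/s for PCI-E 4.0, the announced 32 GT/s target for PCI-E 5.0, and the 128b/130b encoding used since 3.0, the theoretical bandwidth can be worked out as follows. The function names are hypothetical, and the results are theoretical maximums rather than measured throughput.

```python
# Per-lane signalling rate in GT/s and encoding efficiency per generation.
# The 5.0 entry is the announced target, treated here as an assumption.
GENERATIONS = {
    "3.0": (8.0, 128 / 130),
    "4.0": (16.0, 128 / 130),
    "5.0": (32.0, 128 / 130),
}


def lane_gbytes_per_sec(gen):
    """Theoretical one-way bandwidth of a single lane, in GB/s."""
    rate_gt, efficiency = GENERATIONS[gen]
    return rate_gt * efficiency / 8  # 8 bits per byte


def slot_gbytes_per_sec(gen, lanes=16):
    """Theoretical one-way bandwidth of a slot with `lanes` lanes."""
    return lane_gbytes_per_sec(gen) * lanes


def speedup(new, old):
    """How many times faster one generation is than another."""
    return lane_gbytes_per_sec(new) / lane_gbytes_per_sec(old)


if __name__ == "__main__":
    for gen in GENERATIONS:
        print(f"PCI-E {gen} x16: {slot_gbytes_per_sec(gen):.2f} GB/s")
    print(f"5.0 vs 3.0: {speedup('5.0', '3.0'):.1f}x")
```

Each generation doubles the per-lane rate, so skipping PCI-E 4.0 and going straight from 3.0 to 5.0 compounds two doublings into a single step.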
It will be interesting to see what IBM does over the next two years. Will it slow down or even stop its own CAPI development and throw everything into OpenCAPI? That is unlikely, but IBM does know that it will need to move quickly to ensure that whatever version of CAPI is around in POWER10 (due 2020) is more than capable of beating PCI-E 5.0.
At the moment, IBM supports Linux only on Power Systems and z Systems. Customers using that operating system will need to think about their workloads and what is important. If they are looking for better file and print servers to support their office productivity tools, Power-based solutions are likely to be overkill.
If they are looking for solutions where parallel processing or access to very large amounts of memory is key, then POWER9 is worth a look.