Watson finally gets to play Holmes

IBM has announced plans to strengthen its security product line with Watson for Cyber Security.

Augmented skills: a new path towards closing the gap

Part of the challenge for many companies when it comes to cyber security has been the lack of staff. Even when companies do find qualified staff, those people are often poached by security service providers, who have much deeper pockets than their customers.

It is interesting that IBM has led with closing the skills gap as a key part of this announcement. In this sense, it is taking a proven approach – and one it has used elsewhere.

When Watson was introduced to assist doctors in finding probable causes of illness, IBM talked about the amount of new information being produced every year and the inability of the medical profession to keep up with the changes.

The same is true here. IBM points out the number of security events each day and the problem of improving accuracy. It also talks about the speed and frequency with which new vulnerabilities are discovered and reported. It concludes, rightly, that no single researcher can absorb all that data, let alone put it to good use.

Smart application, but a question remains

Bringing Watson into this arena makes sense but also raises some questions. At CIC we’ve spent a lot of time with the IBM Security Division. They’ve introduced transactional databases to security teams in order to manage the volume of data. QRadar, IBM’s Security Information and Event Management (SIEM) tool, now sits on IBM mainframes, allowing it to bring vast amounts of computing power to bear in order to identify and eliminate most of the false positives long before they get to a person.

IBM has also invested heavily in threat intelligence tools and behavioural analytics. All of these are, in the main, automated solutions that speed up the processing of data and the discovery of new attacks. It hasn’t yet added support for graph database technology to that mix, and with Watson it may not need to.

With all of that, the question for many will be “what exactly is Watson going to add?”

The Watson factor

It’s a good question and one that goes to the complexity of data and the core of Watson itself.

While the flow of data from machine logs and network traces is highly structured, a lot of the data used to initiate attacks isn’t. Spear phishing, steganography (the art of hiding things in images), video, macros hidden inside documents and social engineering attacks are all based on what is termed unstructured data.

The same is true for information about security. Articles, academic papers, blogs, vlogs and video from conferences, along with the constant diet of podcasts and webcasts, add to the unstructured problem, but this time in terms of the data that security experts need to know.

Overall, unstructured data makes up around 80% of all the data organisations possess and security data, despite all its structure from logs, is no different. IBM is hoping that Watson will be able to learn quickly what is right and what isn’t and therefore detect patterns in data.

Learned support in strengthening Watson’s ability

To help Watson extend its body of knowledge, IBM has enlisted the aid of eight universities. Students will prepare documents for Watson, annotating them to speed up its understanding, and then see how well Watson has absorbed the knowledge. What will be particularly challenging here is how to teach Watson to tell a false positive from a real attack. This will require some careful thinking from both the Watson team and those feeding it the data.

The next step will be to add customer data into this mix. This will be messy and will be a real test of Watson’s ability to determine fact from fiction, attack from feint, bad from good. We would expect to see this happening in parallel with some of the university work as it will speed up the creation of a working taxonomy for Watson.

Expanding the picture to crack the crime

One of the areas where this announcement has real potential is in the way Watson is able to use its cognitive abilities to make sense of what it has found. Advanced Persistent Threats (APTs) can take years to discover if crafted carefully enough. One of the reasons that they are missed is that the signs of an attack are spread over a very wide attack surface. That surface is often so wide that no individual or team sees more than one part of the attack until it is too late.
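To make the point about scattered signs concrete, here is a minimal sketch in Python. It is not a description of how Watson works. The event sources, host names and observations are all hypothetical, invented purely for illustration. It shows why a combined view matters: each event looks unremarkable to the team that owns its source, but grouping by host reveals one machine that appears in several places.

```python
from collections import defaultdict

# Hypothetical, hand-made events: (source, host, observation).
# In practice each source would typically be watched by a different team.
EVENTS = [
    ("email gateway", "ws-114", "attachment with macro opened"),
    ("endpoint agent", "ws-114", "unsigned binary launched"),
    ("proxy logs", "ws-114", "slow upload to unfamiliar domain"),
    ("endpoint agent", "ws-207", "unsigned binary launched"),
]


def correlate(events, min_sources=2):
    """Group events by host; keep hosts reported by at least min_sources distinct sources."""
    by_host = defaultdict(list)
    for source, host, observation in events:
        by_host[host].append((source, observation))
    return {
        host: seen
        for host, seen in by_host.items()
        if len({source for source, _ in seen}) >= min_sources
    }


suspicious = correlate(EVENTS)
for host, seen in suspicious.items():
    print(host, "->", [source for source, _ in seen])
```

The threshold of two distinct sources is an arbitrary assumption. A real system would also have to weigh timing, severity and the unstructured material discussed earlier, which is where IBM is positioning Watson.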

A boost for IBM

As Watson for Cyber Security develops, we look forward to hearing IBM talk about how Watson has identified unknown APTs and other attacks. This will be a real bonus and will be something that no other IBM technology can do.

This announcement is also important for IBM if it wants to stay at the front of cyber security research. Other vendors are already beginning to show off their machine learning and artificial intelligence based solutions. With Watson for Cyber Security, IBM will be hoping that it can bring a solution to market before Microsoft, HP and Google, all of whom are active in this area.

It will also be interesting to see if IBM acquires some of the new players in this space, such as Cylance, which is using machine learning for end-user device protection. Such an acquisition would certainly be a good fit for the IBM Trusteer team.

Another area, and one where we believe Watson could make a significant impact, is code analysis. The best vendors struggle to get better than 0.5 defects per thousand lines of code (kloc). With IBM Bluemix seeing rapid growth in its customer base, there is a serious opportunity here for Watson to look through the code, apply what it is learning about attacks and exploits, and then help determine the security risk of an API. This would be a huge step forward for the industry.
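To put the 0.5 defects per kloc figure in perspective, the short Python sketch below applies it to a few code base sizes. The sizes and labels are hypothetical, chosen only to show the arithmetic, and the result is a rough estimate rather than a measurement of any real product.

```python
def expected_defects(lines_of_code: int, defects_per_kloc: float = 0.5) -> float:
    """Estimate residual defects from a defect density given per 1,000 lines of code."""
    return lines_of_code / 1000 * defects_per_kloc


# Hypothetical code base sizes, for illustration only.
EXAMPLES = [
    ("small API", 20_000),
    ("mid-sized service", 250_000),
    ("large platform", 2_000_000),
]

for name, loc in EXAMPLES:
    print(f"{name}: about {expected_defects(loc):.0f} defects at 0.5 per kloc")
```

Even at the best-in-class rate, a two-million-line platform would carry around 1,000 defects on this estimate, which is why automated help in finding the exploitable ones is attractive.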

We believe that Watson for Cyber Security is a major step change for IBM, provided it can move quickly from the learning phase to the production phase.

The Press release: