
Multi-layered embeddable graph representation for Cyber Threat Intelligence data (STIX)

Prototype of a graphing framework (as an embeddable JS library) for visualising CTI graphs in various publications based on STIX data model.

Written by Sergey Polzunov

Sketch 1 (1,500 characters)

CTI (Cyber Threat Intelligence) is very much about telling stories. Information becomes intelligence when it is complemented with context and placed in a story. These stories are usually crystallised in reports written by an intelligence provider and disseminated to customers. If the intelligence provider cares about structured, machine-readable CTI, the reports will be supplemented with STIX2 bundles. There is a gap between the story narrated in a report and the structured CTI snapshot represented by a STIX2 bundle. The objective of the proposed graphing framework is to provide easily embeddable STIX2 graphs with the necessary level of interactivity and semantics, so that the CTI community can create informative and engaging stories. The prototype of the framework is an embeddable JS library, "stixview" (https://github.com/traut/stixview). An example of a story told with the stixview library is available as a demo (https://traut.github.io/stixview/dist/demos/story.html). The library can be used in blog posts, reports, briefings, etc. to visualise STIX2 bundles in an interactive and engaging way. By emphasising the structured data, intelligence providers enable machine-readable exchange while saving resources: they can focus on creating well-defined STIX2 graphs that can be easily integrated into customers' knowledge bases (threat intelligence platforms).

Sketch 2 (1,500 characters)

The customers of cyber threat intelligence are CTI analysts and stakeholders. CTI analysts rely on purchased or gathered intelligence to provide context and leads for their analysis or investigations. Stakeholders usually prioritise business interests, so when consuming CTI products they are interested in the strategic view, the high-level picture. Understanding the full context is what takes a lot of time, and I believe proper visualisation can help with that. My goal is to take cyber threat intelligence expressed in STIX2 format (https://oasis-open.github.io/cti-documentation/) and present it in a clear, intuitively understandable format. The challenge is how to display multiple abstraction layers on the same graph: intelligence data (STIX entities), metadata (groupings, classifications) and structural characteristics (relationships, hyper-connectedness). I'm using various visual cues for this:

1) Different entity shapes, combined with a distinctive icon per type, help classify a CTI entity at first glance.
2) The size of a node represents the number of relations the entity has. It can be configured to represent the number of neighbours 1 or 2 hops away, showing the "popularity" of the node.
3) Opacity emphasises the context: entities that are many hops away from the one in question (and so are less relevant) are faded out, focusing the user's attention on the area with the most relevant information.
4) Colour-coded groups represent metadata layers.
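The size and opacity cues can be sketched in a few lines of Python. This is an illustrative sketch only, not stixview's actual code: the miniature bundle uses shortened, non-valid ids, and the function names and the `base`, `step`, `fade` and `floor` parameters are assumptions made up for the example. The only STIX2 detail relied on is that relationship objects carry `source_ref` and `target_ref`.

```python
from collections import deque

# Hypothetical miniature STIX2-like bundle. Ids are shortened for readability
# and are NOT valid STIX identifiers.
bundle = {
    "type": "bundle",
    "objects": [
        {"type": "threat-actor", "id": "threat-actor--1"},
        {"type": "campaign", "id": "campaign--1"},
        {"type": "malware", "id": "malware--1"},
        {"type": "indicator", "id": "indicator--1"},
        {"type": "relationship", "id": "relationship--1",
         "relationship_type": "attributed-to",
         "source_ref": "campaign--1", "target_ref": "threat-actor--1"},
        {"type": "relationship", "id": "relationship--2",
         "relationship_type": "uses",
         "source_ref": "campaign--1", "target_ref": "malware--1"},
        {"type": "relationship", "id": "relationship--3",
         "relationship_type": "indicates",
         "source_ref": "indicator--1", "target_ref": "malware--1"},
    ],
}


def build_adjacency(bundle):
    """Map each non-relationship object id to the set of ids it is related to."""
    adjacency = {
        obj["id"]: set()
        for obj in bundle["objects"]
        if obj["type"] != "relationship"
    }
    for obj in bundle["objects"]:
        if obj["type"] == "relationship":
            source, target = obj["source_ref"], obj["target_ref"]
            # Treat relationships as undirected for sizing and fading.
            adjacency.setdefault(source, set()).add(target)
            adjacency.setdefault(target, set()).add(source)
    return adjacency


def hop_distances(adjacency, start):
    """Breadth-first search: number of hops from `start` to each reachable node."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def node_size(adjacency, node, hops=1, base=20, step=5):
    """Node size grows with the number of neighbours within `hops` hops."""
    reachable = hop_distances(adjacency, node)
    count = sum(1 for d in reachable.values() if 1 <= d <= hops)
    return base + step * count


def node_opacity(distance, fade=0.3, floor=0.1):
    """Fade nodes out with distance from the focused node, never below `floor`."""
    if distance is None:  # not connected to the focused node at all
        return floor
    return round(max(floor, 1.0 - fade * distance), 2)


adjacency = build_adjacency(bundle)
focus = "threat-actor--1"  # the entity "in question"
distances = hop_distances(adjacency, focus)
styles = {
    node: {
        "size": node_size(adjacency, node, hops=2),
        "opacity": node_opacity(distances.get(node)),
    }
    for node in adjacency
}
```

With the threat actor in focus, the campaign one hop away stays almost fully opaque while the indicator three hops away fades to the floor value. Switching `hops` between 1 and 2 changes which nodes look "popular".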

What have you learned through this sketching process? (1,000 characters)

One important finding for me was that, to make a visual representation of CTI easily comprehensible, the amount of information must be limited. Humans can easily grasp a structure only up to a certain level of complexity. That means the problem is not just how to represent the information, but also what information to show. That's where smart groupings and abstractions can help: we can minimise the complexity of the representation while still conveying the big picture and leaving the details a few clicks away. This satisfies business stakeholders (it answers strategic questions right away) and gives CTI professionals the ability to dig deep for the details (by opening up the groups, reading descriptions and walking the graph). It seems that the better a CTI graph visualisation is at highlighting abstractions, the better it serves both audiences.
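The grouping idea can be illustrated with a small Python sketch. It is a hypothetical example, not the way stixview implements groups: the `collapse_groups` function, the group names and the shortened ids are all assumptions. Each collapsed group is drawn as a single node, edges inside a collapsed group are hidden, and opening a group restores its members.

```python
def collapse_groups(edges, groups, expanded=frozenset()):
    """Return the set of edges to draw when collapsed groups are single nodes.

    edges    -- iterable of (source_id, target_id) pairs
    groups   -- dict mapping a group name to the ids of its members
    expanded -- names of groups the user has opened up
    """
    member_to_group = {
        member: name
        for name, members in groups.items()
        if name not in expanded
        for member in members
    }
    visible = set()
    for source, target in edges:
        s = member_to_group.get(source, source)
        t = member_to_group.get(target, target)
        if s != t:  # edges inside a collapsed group are hidden
            visible.add((s, t))
    return visible


# Hypothetical data with shortened, non-valid ids.
edges = [
    ("campaign--1", "threat-actor--1"),
    ("campaign--1", "malware--1"),
    ("indicator--1", "malware--1"),
    ("indicator--2", "malware--1"),
]
groups = {
    "group:campaign": ["campaign--1", "threat-actor--1"],
    "group:indicators": ["indicator--1", "indicator--2"],
}

overview = collapse_groups(edges, groups)
details = collapse_groups(edges, groups, expanded={"group:indicators"})
```

Here `overview` holds two edges for the strategic picture, while `details` shows the individual indicators again once that group is opened.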

Tell us more about you. (1,000 characters)

I'm a backend engineer at EclecticIQ. I've been involved in the design and development of the EclecticIQ Threat Intelligence Platform, working on CTI data ingestion, processing, storage and conversion. I'm the initial author of the OpenTAXII (https://github.com/eclecticiq/OpenTAXII/) and Cabby (https://github.com/eclecticiq/cabby) libraries. With STIX2 released, I'm excited to see how it can benefit users and producers of CTI. I made the stixview (https://github.com/traut/stixview) library as a playground for advanced CTI graphing experiments.

Why are you participating in this Challenge? (750 characters)

I feel we're just scratching the surface of CTI visual representation. I'm excited to try different approaches, challenge my old assumptions and experiment with storytelling via CTI graphs.

Website(s)

- https://polzunov.com
- https://www.linkedin.com/in/polzunov
- The graph of autonomous systems for IPv4 and IPv6 spaces ("The skeleton of the Internet"): https://polzunov.com/as-graph-ipv4/ and https://polzunov.com/as-graph-ipv6/
- stixview storytelling demo: https://traut.github.io/stixview/dist/demos/story.html

What is your experience with the field of cybersecurity?

  • I have considerable experience and/or knowledge in the cybersecurity field.

What best describes you?

  • I’m a cybersecurity professional with an interest in visuals.

How did you hear about this OpenIDEO Challenge?

  • Someone in my network (word of mouth)

Location: City

Amsterdam

Location: Country

  • Netherlands

7 comments

DeletedUser

I definitely support having the graph build itself interactively, depending on the viewer's reported level from beginner to expert, like an onion. Very interesting submission.

As a fellow entrant, I invite you to visit my submission and let me know how you think it could be improved: https://challenges.openideo.com/challenge/cybersecurity-visuals/review/what-is-appropriate-encryption

Best of luck.

John Messing

Sergey Polzunov

Hi John. Thank you for your reply! You have a very interesting submission as well.

Patrizia Russ

Hi Sergey, I do understand what you might want to show with that graph and I like the way you approached the topic. Maybe it is easier for a person who does not yet understand a lot about this topic to start with one information layer and build on it step by step (e.g. a movie/animation format would fit very well for that).

Sergey Polzunov

Hi Patrizia. Definitely, a gradual, interactive graph build-up will be the next thing I try with this framework. I already experimented with a "storytelling" approach using step-by-step growth of the same graph (https://traut.github.io/stixview/dist/demos/story.html), but interactivity and animation will make it even more engaging.

Ben Banks

Sergey, I think this is interesting stuff. I do understand a reasonable amount of this, but there is a curse-of-the-expert issue. I think that a non-security person needs a visually simpler language. Perhaps different icons? Or a way of displaying only indicative connections? I think a chat might help develop something.

Sergey Polzunov

@Ben Banks I agree. My thinking was: if the visualisation approach can handle expert-level information density, entry-level density will be much easier to represent. The view can be tuned for the user's level - simpler icons, animation with pop-up tips in the graph, etc.

The idea I have for the entry-level outline is a graph that builds itself step by step (like a wizard), with popup explanations and links to news / coverage, arriving at the full structured STIX graph that represents an event (breach, malware attack, etc.).

Ben Banks

The place I always start is the Information Visualisation Mantra: “Overview first, zoom and filter, details on demand” (Ben Shneiderman). That implies (as you suggest) a layered visual technical language that rolls up, at one level, to one which anyone can understand but is extensible for specialists to gain insights from the same data set. The trick is finding a compromise that will do both.