When a Picture is Worth a Thousand Network Packets and System Logs

Awalin Sopan, FireEye Inc.

Audience level: Intermediate
Topic area: Misc

Description

A typical Security Operation Center (SOC) employs security analysts who monitor security logs from heterogeneous devices. By analyzing that data, the analysts determine whether there is a security threat and how to respond to it. Visualizing this large-scale data in a succinct, human-digestible form can reduce their cognitive load and enable them to operate more efficiently.

SLIDES: https://speakerdeck.com/dataintelligence/when-a-picture-is-worth-a-thousand-network-packets-and-system-logs

Abstract:

While defending against cyber attacks, security analysts detect intrusions (recommend a solution, gather evidence of the attack and insight into the threat), prevent them (block suspected traffic), and often perform forensic analysis (create rules to prevent future attacks, gather knowledge about the nature of the attack). At each step, they analyze data from various sources using various tools; in this sense, they are like detectives trying to catch a thief or stop a potential crime before it happens. The sheer volume of information they have to process in a very short time makes it very challenging to carry out mission-critical tasks. Missing critical cues may mean a threat to our national security. We need to provide them with a better analytic workflow and reduce their cognitive overload, so that they can defend our cyberspace effectively. This is where data visualization and visual analytics come in: visually representing relevant information and providing useful interactions can help analysts uncover patterns, trends, and anomalies in such data, and thus help them make better decisions.

One of the key challenges in analyzing cyber threats is that the data is heterogeneous, coming from multiple sources and in multiple formats. Computing devices, sensors, actuators, and the interconnections among such machines make up our cyberspace, generating terabytes of information. We cannot expect analysts to read through pages of system logs and rows of network information. Some data is multivariate, such as network packets and TCP dumps, with attributes like IP address, port, packet size, and time. Another very important source of information is the network topology: the flow of information through the network (used in network-based intrusion detection systems). This is relational data and can be represented as a node-link diagram showing the network topology and the communication among the nodes.
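As a minimal sketch of the relational view described above, the following Python snippet aggregates flow records into the nodes and weighted edges that a node-link diagram would draw. The record layout (source IP, destination IP, bytes), the sample addresses, and the function names are assumptions made for illustration; they are not taken from the talk.

```python
from collections import Counter

# Hypothetical flow records: (source IP, destination IP, bytes transferred).
FLOWS = [
    ("10.0.0.1", "10.0.0.5", 1200),
    ("10.0.0.2", "10.0.0.5", 300),
    ("10.0.0.1", "10.0.0.5", 800),
    ("10.0.0.5", "192.0.2.7", 5000),
]


def build_node_link(flows):
    """Aggregate flow records into nodes and weighted directed edges."""
    nodes = set()
    edges = Counter()  # (src, dst) -> total bytes sent from src to dst
    for src, dst, nbytes in flows:
        nodes.update((src, dst))
        edges[(src, dst)] += nbytes
    return sorted(nodes), dict(edges)


def degree(edges):
    """Count the distinct neighbors of each node, ignoring edge direction."""
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
    return {node: len(peers) for node, peers in neighbors.items()}


nodes, edges = build_node_link(FLOWS)
for (src, dst), total in edges.items():
    print(f"{src} -> {dst}: {total} bytes")
```

The resulting node list and edge weights could be handed to any graph-drawing tool; a node with an unusually high degree or edge weight is the kind of cue such a diagram is meant to surface.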

While analyzing event logs for malicious activity, it is important to know the time and sequence of the events; we call this temporal data. No matter what format the data is in, for visualization we need to transform the data attributes into visual attributes, such as color, shape, position, orientation, size, and glyphs, and organize them in a meaningful way. Multivariate data can be presented in tabular form, as well as with scatter plots and parallel coordinates plots, which help in understanding the correlation among the variables. Histograms show the distribution of certain variables; for example, if a certain IP address is getting a lot of hits, we can see it easily using a real-time histogram of the data coming to that address. To understand the temporal aspects of event logs, we can use time series and timeline visualizations, where the horizontal axis indicates time. Such diagrams are very useful for uncovering anomalous behavior. To understand the hierarchy of data, we can use a tree visualization, whether a treemap view or a node-link tree diagram. A geographical map view is also often used to show the location of IP addresses and see where attacks are coming from. Visualizing clusters of similar data can also reveal anomalies and outliers. All of these visualizations can be used in different phases of the analysis, depending on the analysis workflow. This talk summarizes the types of data and visualization techniques that can be essential for cyber threat analysis.
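To illustrate the histogram and time series ideas above, here is a small Python sketch that counts hits per destination IP and buckets events by minute, then renders the counts as a text histogram. The event format (ISO timestamp, destination IP), the sample data, and the function names are hypothetical; a real tool would read from actual logs and use a plotting library.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event log entries: (ISO 8601 timestamp, destination IP).
EVENTS = [
    ("2024-01-01T10:00:05", "10.0.0.5"),
    ("2024-01-01T10:00:40", "10.0.0.5"),
    ("2024-01-01T10:01:10", "10.0.0.9"),
    ("2024-01-01T10:01:15", "10.0.0.5"),
    ("2024-01-01T10:03:59", "10.0.0.5"),
]


def hits_per_ip(events):
    """Histogram data: number of events seen for each destination IP."""
    return Counter(ip for _, ip in events)


def events_per_minute(events):
    """Time series data: number of events in each one-minute bucket."""
    buckets = Counter()
    for ts, _ in events:
        minute = datetime.fromisoformat(ts).replace(second=0, microsecond=0)
        buckets[minute] += 1
    return dict(sorted(buckets.items()))


def text_histogram(counts):
    """Render counts as bars of '#', largest count first."""
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], str(kv[0])))
    return "\n".join(f"{key}  {'#' * n}  {n}" for key, n in ordered)


print(text_histogram(hits_per_ip(EVENTS)))
for minute, n in events_per_minute(EVENTS).items():
    print(minute.isoformat(), n)
```

Here the data attribute (event count) is mapped to a visual attribute (bar length), which is the same transformation a graphical histogram or timeline performs.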