Building a decent PCAP analysis engine has turned out to be a lot of work. Since my last post I decided to scrap the Postgres database and RESTful API design in favor of an Elasticsearch backend. The decision was primarily motivated by how poorly the entire setup scaled. Every time I wanted to add a new log I had to create a corresponding model. Initially this seemed simple, but defining constraints became a huge issue. For example, DNS queries sometimes returned hundreds of lines of response data, which would break roughly 1 in 10,000 INSERTs. A schema-based backend forced me to define fields for every potential value, but often only a fraction of the fields were populated, leaving tons of null data in my database. Another issue was the absolute shit data-transport protocol I improvised. Moving the data from analysis nodes to storage nodes often took twice as long as the actual analysis.
Switching to an Elastic backend was a huge pain, but it ended up being the perfect solution for the unstructured data I was storing. The switch paid off because I no longer have to define new log sources. Instead, the processing nodes translate each extracted log into a list of dictionaries. Each dictionary represents a row in the log. The result is wrapped in a JSON object, given a type, and then stored in an analysis index on my Elastic cluster.
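The translation step can be sketched roughly as follows. This is a minimal sketch, not the actual node code: it assumes the logs are in BRO's default tab-separated format with a #fields header line, and the names parse_bro_log and wrap_log, as well as the "type" and "rows" keys of the wrapper, are made up for illustration.

```python
import json

def parse_bro_log(text):
    """Turn the text of a tab-separated BRO log into a list of dicts, one per row."""
    fields = []
    rows = []
    for line in text.splitlines():
        if not line:
            continue
        if line.startswith("#fields"):
            # The #fields header names the columns, tab-separated after the tag.
            fields = line.split("\t")[1:]
            continue
        if line.startswith("#"):
            # Other header/footer lines (#separator, #types, #close, ...) are skipped.
            continue
        values = line.split("\t")
        # Drop unset ("-") and empty ("(empty)") values so sparse rows
        # do not carry null fields into the stored document.
        row = {k: v for k, v in zip(fields, values) if v not in ("-", "(empty)")}
        rows.append(row)
    return rows

def wrap_log(log_type, rows):
    """Wrap the rows in a single JSON-serializable object tagged with a type (hypothetical layout)."""
    return {"type": log_type, "rows": rows}

# A tiny hand-written sample, not real capture data.
sample = "\n".join([
    "#separator \\x09",
    "#fields\tts\tuid\tid.orig_h\tquery\tanswers",
    "#types\ttime\tstring\taddr\tstring\tvector[string]",
    "1500000000.0\tCabc123\t10.0.0.5\texample.com\t-",
])
doc = wrap_log("dns", parse_bro_log(sample))
print(json.dumps(doc, indent=2))
```

Because unset values are dropped instead of stored, a row only carries the fields that were actually populated, which is exactly what the schema-based setup could not do.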
I cut analysis times from 2 minutes to 8 seconds by reading, not skimming, the BRO IDS documentation (really important to do that). Up to this point I had assumed BRO operated only in live mode, and did not realize that -r would read PCAP files in offline mode, generating logs without reading directly off the network card. Previously I had been using tcpreplay to replay the PCAP over a physical network interface at max speed. This was fairly inefficient: even with PF_RING kernel modules installed, the process took almost 2 minutes. I replaced the tcpreplay method with bro -r and could get results almost instantaneously.
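For anyone wiring this into a processing node, the offline run could look something like the sketch below. It assumes the bro binary is on the PATH and relies on BRO writing its logs into the current working directory; build_bro_command and analyze_pcap are illustrative names, not part of my actual code.

```python
import subprocess
from pathlib import Path

def build_bro_command(pcap_path):
    """Command line for an offline run: bro -r <pcap>."""
    return ["bro", "-r", str(pcap_path)]

def analyze_pcap(pcap_path, out_dir):
    """Run BRO against a PCAP file and return the log files it produced."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # BRO writes its logs to the working directory, so run it inside out_dir
    # and pass the PCAP as an absolute path.
    command = build_bro_command(Path(pcap_path).resolve())
    subprocess.run(command, cwd=out, check=True)
    return sorted(out.glob("*.log"))
```

Calling analyze_pcap("capture.pcap", "logs/capture") would then leave conn.log, dns.log and friends in that directory, ready to be parsed.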
Another area I spent a lot of time on was the UI itself, which got a complete redesign. As a submission moves through the stages of the analysis process, the interface rearranges itself to display only the information relevant to the current stage. When analysis of the PCAP file is complete, only the analysis panel and a small tools interface are available. I also incorporated several jQuery-UI widgets to allow drag-and-drop and resizing of panels.
I took advantage of lobipanel.js’s built-in full-screen mode in case a user wants to focus on one specific panel.
Another concept I have been experimenting with is row-specific tools. The idea is that each row contains data which may warrant further analysis. I decided to categorize each potential cell value as a datatype using various regex patterns. When a user clicks “tools” on the left of any entry, that row is parsed, and fields which were assigned a datatype are extracted. I then generate a set of tools which can be used to provide further information about the extracted row data.
Two row-tools I’ve built so far are a simple IP2Geo tool as well as a whois lookup.
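The datatype idea can be sketched in a few lines. The patterns, the datatype names, and the tool mapping below are illustrative assumptions rather than the exact ones the UI uses; the only parts taken from above are the IP2Geo and whois tools.

```python
import re

# Ordered (datatype, pattern) pairs; the first pattern that matches wins.
DATATYPE_PATTERNS = [
    ("ipv4", re.compile(
        r"^(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)$")),
    ("md5", re.compile(r"^[0-9a-fA-F]{32}$")),
    ("domain", re.compile(
        r"^(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}$")),
]

# Hypothetical mapping from datatype to the tools offered for it.
TOOLS_BY_DATATYPE = {
    "ipv4": ["ip2geo", "whois"],
    "domain": ["whois"],
}

def classify(value):
    """Return the datatype of a cell value, or None if no pattern matches."""
    for name, pattern in DATATYPE_PATTERNS:
        if pattern.match(value):
            return name
    return None

def tools_for_row(row):
    """List (tool, field, value) triples for every typed field in a row."""
    tools = []
    for field, value in row.items():
        datatype = classify(str(value))
        for tool in TOOLS_BY_DATATYPE.get(datatype, []):
            tools.append((tool, field, value))
    return tools

row = {"uid": "Cabc123", "id.orig_h": "10.0.0.5", "query": "example.com"}
for tool, field, value in tools_for_row(row):
    print(f"{tool}: {field} = {value}")
```

Keeping the patterns in an ordered list means a new datatype is one more entry, and a new tool is one more name in the mapping.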
I also plan on adding tools which make it easy to pivot between corresponding connections in various BRO logs (those sharing a connection UID).
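Since this is not built yet, the following is only a rough idea of how the pivot could work: index every row from every log by its uid field, then look up a UID to see each log entry for that connection. The function names and the sample rows are hypothetical.

```python
from collections import defaultdict

def index_by_uid(logs):
    """Build a UID -> [(log name, row), ...] index.

    logs maps a log name (e.g. "conn", "dns") to its list of row dicts.
    """
    index = defaultdict(list)
    for log_name, rows in logs.items():
        for row in rows:
            uid = row.get("uid")
            if uid:
                index[uid].append((log_name, row))
    return index

def pivot(index, uid):
    """Return every (log name, row) pair that shares the given connection UID."""
    return index.get(uid, [])

# Hand-written sample rows, not real capture data.
logs = {
    "conn": [{"uid": "Cabc123", "id.orig_h": "10.0.0.5", "proto": "udp"}],
    "dns": [{"uid": "Cabc123", "query": "example.com"},
            {"uid": "Cxyz789", "query": "example.org"}],
}
index = index_by_uid(logs)
for log_name, row in pivot(index, "Cabc123"):
    print(log_name, row)
```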
The last major improvement came with incorporating Suricata into my analysis nodes. BRO is great at extracting protocol information and giving you a good idea of the content of the PCAP file. This provides context and is necessary for any decent PCAP analysis. However, out of the box BRO is not very good at telling you whether PCAP data contains indicators of malicious activity. Suricata, on the other hand, uses Emerging Threats signatures and can instantly tell whether malicious binaries, suspicious HTTP requests, or other IOCs exist within the capture.
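As a rough illustration of pulling Suricata's findings into the same pipeline, the sketch below assumes Suricata is configured to write its EVE JSON output (one JSON event per line) and keeps only the alert events. The function name, the fields I chose to keep, and the sample event with its made-up signature name are all illustrative.

```python
import json

def extract_alerts(eve_lines):
    """Pick the alert events out of an iterable of EVE JSON lines."""
    alerts = []
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # EVE mixes many event types (flow, dns, http, ...); keep only alerts.
        if event.get("event_type") != "alert":
            continue
        alert = event.get("alert", {})
        alerts.append({
            "signature": alert.get("signature"),
            "severity": alert.get("severity"),
            "src_ip": event.get("src_ip"),
            "dest_ip": event.get("dest_ip"),
        })
    return alerts

# Hand-written sample events; the signature name is invented for this example.
sample_lines = [
    json.dumps({"event_type": "dns", "src_ip": "10.0.0.5", "dest_ip": "10.0.0.1"}),
    json.dumps({"event_type": "alert", "src_ip": "10.0.0.5", "dest_ip": "203.0.113.7",
                "alert": {"signature": "EXAMPLE made-up signature", "severity": 1}}),
]
alerts = extract_alerts(sample_lines)
print(alerts)
```

The resulting list of dictionaries has the same shape as a parsed BRO log, so it can be wrapped and indexed the same way.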
The next steps of the project are about making this tool actually useful. Up to this point I have been capturing a ton of data about individual PCAPs but ultimately throwing the PCAP away once analysis is complete. I want to allow the user to download the PCAP as well as artifacts extracted from it; for this I am considering several large-scale storage options.
Hopefully, my next update will be weeks, not months, from now.