Arkime: open source, large scale, full packet capturing, indexing, and database system
Arkime
Arkime is an open-source, large-scale, full packet capturing, indexing, and database system. Arkime augments your current security infrastructure by storing and indexing network traffic in standard PCAP format, providing fast, indexed access. A simple, intuitive web interface is provided for PCAP browsing, searching, and exporting. Arkime exposes APIs that allow PCAP data and JSON-formatted session data to be downloaded and consumed directly. Because Arkime stores and exports all packets in standard PCAP format, you can also use your favorite PCAP-ingesting tools, such as Wireshark, in your analysis workflow.
Access to Arkime is protected by HTTPS with digest passwords, or by a web server proxy that provides authentication. All PCAPs are stored on the sensors and are accessed only through the Arkime interface or API. Arkime is not meant to replace an IDS; instead, it works alongside one to store and index all network traffic in standard PCAP format, providing fast access. Arkime is built to be deployed across many systems and can scale to handle tens of gigabits/sec of traffic. PCAP retention is based on available sensor disk space. Metadata retention is based on the scale of the Elasticsearch cluster. Both can be increased at any time and are under your complete control.
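Since PCAP retention follows from sensor disk space, a rough back-of-the-envelope estimate can help with sizing. The sketch below is not part of Arkime; the function name and its parameters are hypothetical, and it assumes a sustained average traffic rate, decimal units, and no allowance for file overhead, compression, or disk space kept free.

```python
def estimate_pcap_retention_days(disk_tb: float, avg_gbps: float) -> float:
    """Rough number of days of PCAP that fit on a sensor's disks.

    Assumptions (illustrative only): traffic is a sustained average of
    avg_gbps gigabits/sec, every packet is written in full, and all of
    disk_tb (decimal terabytes) is available for PCAP.
    """
    if disk_tb <= 0 or avg_gbps <= 0:
        raise ValueError("disk_tb and avg_gbps must be positive")
    disk_bytes = disk_tb * 1e12
    # gigabits/sec -> bytes/sec -> bytes/day
    bytes_per_day = avg_gbps * 1e9 / 8 * 86_400
    return disk_bytes / bytes_per_day


if __name__ == "__main__":
    # Hypothetical sensor: 100 TB of disk, 1 Gbit/sec average traffic.
    print(f"{estimate_pcap_retention_days(100, 1):.1f} days")
```

Under these assumptions, retention scales linearly with disk space and inversely with traffic rate, so doubling the disk on a sensor roughly doubles how long its PCAP is kept.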
Architecture
Here are some sample deployments of Arkime for different network architectures. Most folks will probably run a hybrid of the following, since no one solution fits all. Capture can be scaled horizontally by adding more capture machines, vertically by adding more CPUs and disks, or both. We usually recommend scaling horizontally, with a network packet broker in front of multiple machines, unless you are physically space-constrained. However, it is possible to use big machines with lots of CPU and disk and run Arkime-capture with more threads.
Legend/Info:
- A box represents a physical machine.
- It is possible to run multiple capture processes per machine or have a single capture process listen on multiple interfaces – (FAQ Answer)
- “Big Data” style boxes are recommended for capture – (FAQ Answer)
- Run multiple Elasticsearch processes per machine, since each ES node should be configured with at most 30G – (FAQ Answer)
- Except for single-host deployments, it is recommended that all operator access flow through a single Apache/viewer combination, which can provide better authentication, logging, and a single choke point – (FAQ Answer)
Security
- All ES instances should be protected with iptables rules for ports 9200-920N and 9300-930N, where N is the number of ES instances per machine, that allow only the other Elasticsearch, capture, and viewer machines to connect
- All viewer hosts, except the Apache/viewer box, should have iptables rules for port 8005 that allow only other viewer machines to connect. The viewer must listen on an OS network interface, not just localhost, if using multiple machines
- The shared viewer instances can listen on localhost, since only Apache talks to them
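As an illustration of the Elasticsearch rule above, the following sketch generates iptables commands for a given number of ES instances and a list of permitted hosts. The helper name, the example addresses, and the choice to append to the INPUT chain are assumptions made for illustration; it only builds command strings and does not apply them, so review the output against your own firewall policy before use.

```python
from typing import List


def es_firewall_rules(num_instances: int, allowed_hosts: List[str]) -> List[str]:
    """Build iptables commands restricting the ES HTTP and transport ports.

    Follows the port ranges given in the text: 9200-920N and 9300-930N,
    where N is the number of ES instances on the machine.
    """
    if num_instances < 1:
        raise ValueError("num_instances must be at least 1")
    http_ports = f"9200:{9200 + num_instances}"
    transport_ports = f"9300:{9300 + num_instances}"
    ports = f"{http_ports},{transport_ports}"

    # Accept traffic from each permitted Elasticsearch, capture, or viewer host.
    rules = [
        f"iptables -A INPUT -p tcp -s {host} -m multiport --dports {ports} -j ACCEPT"
        for host in allowed_hosts
    ]
    # Drop everything else aimed at the ES ports.
    rules.append(f"iptables -A INPUT -p tcp -m multiport --dports {ports} -j DROP")
    return rules


if __name__ == "__main__":
    # Hypothetical addresses for the other machines in the cluster.
    for rule in es_firewall_rules(2, ["10.0.0.5", "10.0.0.6"]):
        print(rule)
```

The ACCEPT rules are emitted before the final DROP because iptables evaluates a chain in order. The same pattern could be adapted for the viewer's port 8005 on the non-Apache viewer hosts.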
Notes:
- Using a Network Packet Broker (NPB) allows traffic to be load balanced and recombined. This is especially useful in HA or asymmetric routing cases
- By using an NPB, other security devices can see the same traffic Arkime sees
- When running multiple Arkime-capture processes on the same host, make sure the I/O doesn’t overwhelm the disk and other subsystems
- Use a TAP with high traffic networks since many mirror ports drop traffic under heavy load
- Operators use an Apache-fronted viewer and don’t hit the other viewers directly. Apache provides authentication.
- Lock down ES and the Arkime viewer with iptables
Multiple Clusters
Notes:
- It is possible to use a single ES cluster for multiple Arkime clusters by using the prefix= ini configuration setting
- Operators use Apache-fronted viewers and don’t hit the other viewers directly. Apache provides authentication and can use virtual paths to route to different clusters.
- NPBs are recommended for high traffic networks
Download && Tutorial
Copyright 2012-2017 AOL Inc