System requirements for a NetFlow collector are a lot higher than those of the average program. While I am still the “new guy” in support, I am already seeing some trends here at Plixer. The majority of cases I have worked on involve servers that seem to run slowly despite being on top-of-the-line equipment. Nearly every time this issue comes up, it is caused by an improper hard disk configuration.
One of the most overlooked NetFlow collector system requirements, and one shared by any write-heavy database server, is disk IOPS (input/output operations per second). Remember that a spinning disk is very limited in how many writes it can complete per second, and if the collector cannot write to the disk fast enough, it can fall behind on incoming flows and the whole server will seem slow. This has been the root cause of a lot of slow NetFlow collectors.
A large network can generate thousands of flows per second, and each one of those has to be written to a storage device. The real problem is that, on average, a standard 7,200 RPM drive will struggle to reach 100 IOPS. Simply upgrading to a 15K RPM drive can more than double that performance, but even a 15K RPM drive can be brought to its knees by a high flow rate.
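To put rough numbers on this, here is a back-of-the-envelope sketch in Python. The per-drive figures are ballpark assumptions based on the numbers above (about 100 IOPS for a 7,200 RPM drive, and a conservative 200 for a 15K RPM drive). The `flows_per_write` batching factor is a hypothetical parameter: a real collector and its database will usually commit several flows per disk write, and the actual ratio depends on the product and how it is configured.

```python
import math

# Ballpark random-write IOPS per spinning drive (assumptions, not measurements).
DRIVE_IOPS = {"7200rpm": 100, "15k": 200}


def drives_needed(flows_per_second, flows_per_write, drive_type):
    """Estimate how many drives are needed to keep up with a flow rate.

    flows_per_write is a hypothetical batching factor: how many flow
    records the collector manages to commit per disk write.
    RAID overhead and read traffic are deliberately ignored here.
    """
    writes_per_second = flows_per_second / flows_per_write
    return max(1, math.ceil(writes_per_second / DRIVE_IOPS[drive_type]))


if __name__ == "__main__":
    for drive in DRIVE_IOPS:
        print(drive, drives_needed(5000, 10, drive))
```

With these assumed numbers, 5,000 flows per second batched ten per write works out to five 7,200 RPM drives or three 15K RPM drives, before any RAID overhead is counted.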
Many of you already know that a RAID array can help boost NetFlow collector performance. But with so many RAID configurations, how do you choose which one to use? First, let me explain the two most common and well-known levels.
- RAID 0: block-level striping across two or more drives with no redundancy. This results in improved performance and additional storage but no fault tolerance.
- RAID 1: mirroring of two drives for redundancy, with very little effect on performance.
With most NetFlow collector hardware, you ideally want the qualities of both: additional speed and storage, but with redundancy to keep historical data safe. This is where RAID 10 comes in.
RAID 10: The recommended RAID level for a NetFlow collector. It is a RAID 0 stripe across two or more RAID 1 mirrored pairs, so it needs an even number of drives (at least four). This gives you both additional performance and fault tolerance. This is the configuration you will want to use to comply with the NetFlow Collector Recommended Specs.
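The storage trade-off between these three levels can be sketched as a small calculation. This is a minimal illustration that assumes equal-size drives and the textbook layouts described above; the function name and the example drive size are made up for the example.

```python
def raid_usable_capacity(level, drive_count, drive_size_gb):
    """Return usable capacity in GB for RAID 0, 1, or 10 (equal-size drives)."""
    if level == 0:
        if drive_count < 2:
            raise ValueError("RAID 0 needs at least two drives")
        # Striping only: every drive contributes its full capacity.
        return drive_count * drive_size_gb
    if level == 1:
        if drive_count != 2:
            raise ValueError("this sketch models RAID 1 as a two-drive mirror")
        # Mirroring: the second drive is a copy of the first.
        return drive_size_gb
    if level == 10:
        if drive_count < 4 or drive_count % 2 != 0:
            raise ValueError("RAID 10 needs an even number of drives, at least four")
        # A stripe across mirrored pairs: half the drives hold copies.
        return (drive_count // 2) * drive_size_gb
    raise ValueError("unsupported RAID level in this sketch")


if __name__ == "__main__":
    for level, count in [(0, 4), (1, 2), (10, 4)]:
        print(f"RAID {level} with {count} drives:",
              raid_usable_capacity(level, count, 600), "GB")
```

Four hypothetical 600 GB drives give 2,400 GB in RAID 0 with no protection, or 1,200 GB in RAID 10 with every block mirrored.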
Now that I have explained some points on what can help performance, let me share a few things that could hurt it.
While we are still on the subject of RAID, the next most popular configuration is RAID 5. RAID 5 is block-level striping across three or more drives with distributed parity. This increases disk space, redundancy, and read speed. The issue with RAID 5 for a NetFlow application is the need to write parity information alongside every data write. You do not gain write speed with this configuration, so the same four 15K RPM hard drives will perform much better in a collector server as RAID 10 than as RAID 5.
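The gap between RAID 10 and RAID 5 can be estimated with the common "write penalty" rule of thumb: each logical write costs about two disk operations on a mirror (one per copy) and about four on RAID 5 (read data, read parity, write data, write parity). The sketch below applies that rule. The penalty values are rule-of-thumb assumptions, and the 200 IOPS per 15K RPM drive is a ballpark figure, so treat the results as a rough comparison and not a benchmark; controller caches and full-stripe writes can change real-world numbers.

```python
# Rule-of-thumb disk operations per logical write (assumptions, not measurements).
WRITE_PENALTY = {"raid0": 1, "raid1": 2, "raid10": 2, "raid5": 4}


def effective_write_iops(level, drive_count, iops_per_drive):
    """Estimate the random-write IOPS an array can sustain."""
    raw_iops = drive_count * iops_per_drive
    return raw_iops / WRITE_PENALTY[level]


if __name__ == "__main__":
    # Four 15K RPM drives at an assumed 200 IOPS each.
    for level in ("raid10", "raid5"):
        print(level, effective_write_iops(level, 4, 200))
```

Under these assumptions, the same four drives sustain roughly 400 write IOPS as RAID 10 but only about 200 as RAID 5.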
What if you are expecting a very high flow rate and need a guaranteed solution that can handle it without the fuss of setting it up? Well, Plixer can help even with the largest of networks with its new High Performance NetFlow Collector Appliance. The High Speed Appliance comes preconfigured and can be scaled to collect 100,000 flows per second. It provides reassurance that the collector is not dropping flow data, which would decrease the accuracy of reporting and threat detection.
If an appliance is not what you are looking for, then make sure you get the disk configuration right for disk I/O the first time, before it costs you more down the line.