next up previous
Next: Analysis of operational costs Up: Experimental analysis Previous: Experimental analysis

Packet sniffing performance

Figure 11 shows the overhead incurred by running the packet filters at increasing bitrates for 1500 byte packets (600Mbps is the maximum rate we are able to generate reliably from a single source). While not shown in the figure, we verified that packet size plays no role in this experiment, only the packet rate. In general, we can see that FFPF makes more efficient use of the system than LSF, as the amount of FFPF idle time for high rates may exceed that of LSF by a factor of two, depending on the capture length. Unlike LSF, FFPF performance, while always better than LSF for high rates, depends strongly on the maximum packet capture length. Varying the number of slots in the packet buffer has a similar effect on performance, which leads us to conclude that it is probably caused by memory access and cache behaviour. As LSF lacks a circular buffer, cache misses will be rare. Larger caches, or tweaking of buffer size helps to alleviate the dependency. However, this was not done in these experiments. The drop rate for FFPF in all of these configurations is negligible, while for LSF at 600 Mbps the drop rate is 2-3%, depending on the capture size.

Figure 11: System idle time for FFPF and LSF as a function of the bandwidth for different capture lengths
viewport=15 10 600 360,clip

The use of shared buffers in FFPF reduces copying and context switching, especially if the number of applications increases. It is our hypothesis that network monitoring will be increasingly important and that multiple different applications will want to filter overlapping traffic (e.g., for intrusion detection, traffic engineering, a sysadmin interested in an overview of activity of protocols, etc.).

Figure 12 shows, for high bitrate, how the two frameworks scale when starting an increasing number of tcpdump applications with overlapping flows. Since LSF duplicates much of the work for each application, it quickly saturates. We should point out that for reasons unknown to us, OProfile never reports 0% idle time (the minimum is always 2-3%). Even with just two simultaneous applications LSF reaches maximum system load and consequently starts dropping packets. With 6 client applications LSF drops between 64% (LSF-100) and 75% (LSF-800) of all incoming packets. FFPF, on the other hand, drops 10% (FFPF-100) to 15% (FFPF-800). Interestingly, as the CPU load never reaches 100% for FFPF, the drop cannot be attributed to starvation. Rather it is caused by buffer overflow: by keeping the number of \ensuremath{PBuf} slots constant throughout the experiments (1000 slots), there were not enough slots to support six parallel client applications. Increasing the number of slots will decrease the droprate due to buffer overflow, but will increase overall system load due to cache and line misses. However, even without tweaking FFPF clearly outperforms LSF. Its more gradual performance degradation can be expected as little work is duplicated. Filtering is handled in the kernel and duplicate tasks are merged. The remaining performance penalty is therefore related only to userspace data output and the remaining context switches.

Figure 12: Idle time as function of the no. of concurrent applications for various capture lengths at 600Mbps

next up previous
Next: Analysis of operational costs Up: Experimental analysis Previous: Experimental analysis
Herbert Bos 2004-10-06