Ruler - overview

Ruler is a language for matching and rewriting strings. It was designed for the anonymization of network packets, but it is flexible enough to support a wide range of applications. One of its key features is a highly efficient regular-expression matching and rewrite algorithm. Expressing the anonymization process as rewrite rules on network packets yields a clear and flexible specification.

A single rewrite operation is called a rule. A rule consists of a match pattern and a result pattern. The match pattern specifies the layout and specific values of the string to be matched. A string that matches is accepted as-is, rejected, or rewritten into a newly constructed output string.

In addition to an accept, reject, or rewrite action on a packet, a rule can also specify a classification code for a pattern. This code is used in some contexts to specify the way the packet should be handled, e.g. how it should be routed.

A filter consists of a list of rewrite rules. The rules are matched in the order that they are specified: top to bottom. Any string that doesn't match a rule is rejected, but it is easy to add a `fallback' rule to catch all unmatched strings.
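The top-to-bottom, first-match behaviour described above can be modelled in a few lines of Python. This is only a sketch of the semantics, not Ruler code: the names `Rule`, `apply_filter`, `rules`, and `fallback` are hypothetical, and a rule is reduced to a match predicate plus an action that returns the output bytes, or `None` to reject.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical model of a rule: (match predicate, action).
# The action returns the output bytes, or None to reject the input.
Rule = Tuple[Callable[[bytes], bool], Callable[[bytes], Optional[bytes]]]


def apply_filter(rules: List[Rule], data: bytes) -> Optional[bytes]:
    """Try the rules top to bottom; the first rule that matches decides."""
    for matches, action in rules:
        if matches(data):
            return action(data)
    return None  # no rule matched: the input is rejected


rules: List[Rule] = [
    (lambda d: d.startswith(b"GET "), lambda d: b"GET <redacted>"),  # rewrite
    (lambda d: d.startswith(b"DROP"), lambda d: None),               # reject
]

# A fallback rule matches everything and accepts it as-is.
fallback: Rule = (lambda d: True, lambda d: d)
```

With `rules` alone, an input such as `b"other"` is rejected because no rule matches it; with `rules + [fallback]` the same input is passed through unchanged.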

For example, the filter

include "layouts.rli"

filter www
  eh:Ethernet_IPv4 ih:IPv4_TCP th:TCP with [dest=80~2] p:*
     =>
  eh with [e_dest=0#6,e_src=0#6]
  ih with [src=0~4,dest=0~4]
  th
  p;

when applied to a stream of Ethernet packets, matches packets that start with an Ethernet header containing the IPv4 protocol number, followed by an IPv4 header with TCP as protocol, followed by a TCP header with port 80 as destination port, followed by an arbitrary span of bytes. This filter selects all TCP traffic to port 80 (the standard HTTP port), and rewrites each packet to a version with zeroed source and destination addresses in both the Ethernet and IPv4 headers, but with an untouched TCP header and payload.
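For comparison, the effect of this filter on a single frame can be written out by hand in Python. This is a sketch under stated assumptions, not a description of how Ruler works: the function name `anonymize_http` is hypothetical, the byte offsets are those of the standard Ethernet II, IPv4, and TCP headers (the actual definitions in `layouts.rli` are not shown here), and no checksum is recomputed.

```python
import struct
from typing import Optional

ETHERTYPE_IPV4 = 0x0800  # EtherType value for IPv4
IPPROTO_TCP = 6          # IPv4 protocol number for TCP


def anonymize_http(frame: bytes) -> Optional[bytes]:
    """Return the frame with zeroed MAC and IPv4 addresses, or None if it does not match."""
    if len(frame) < 14 + 20 + 20:      # minimal Ethernet + IPv4 + TCP headers
        return None
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return None
    ip = 14                            # IPv4 header follows the Ethernet header
    ihl = (frame[ip] & 0x0F) * 4       # IPv4 header length in bytes
    if ihl < 20 or frame[ip + 9] != IPPROTO_TCP:
        return None
    tcp = ip + ihl
    if len(frame) < tcp + 20:
        return None
    (dest_port,) = struct.unpack_from("!H", frame, tcp + 2)
    if dest_port != 80:
        return None
    out = bytearray(frame)
    out[0:12] = bytes(12)              # Ethernet destination and source, 6 bytes each
    out[ip + 12:ip + 20] = bytes(8)    # IPv4 source and destination, 4 bytes each
    return bytes(out)                  # TCP header and payload are left untouched
```

The Ruler version states the same intent in six lines and leaves the offset arithmetic to the layout definitions, which is the clarity the rewrite-rule approach is meant to provide.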

The Ruler language operates on patterns consisting of sequences of bytes and/or bits. In its simplest form, a pattern only describes the layout of a sequence of bits and bytes. Any sequence of bytes with the correct length matches such a pattern, and the pattern only serves to give names to the elements of the pattern. It is also possible to specify values for some or all bits or bytes in the pattern, in which case the pattern only matches byte sequences with the specified data. Finally, the pattern may contain spans of arbitrary length.
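The three kinds of pattern described above (layout only, layout with required values, and layout ending in an arbitrary span) can be illustrated with a small Python model. It is a simplified sketch, not Ruler's matching algorithm: the names `Field` and `match` are hypothetical, fields are whole bytes only (no bit fields), and a span is assumed to appear only as the last field.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical field description: (name, length in bytes, required value).
# A length of None marks a trailing span of arbitrary length.
# A value of None means any bytes of the right length are accepted.
Field = Tuple[str, Optional[int], Optional[bytes]]


def match(pattern: List[Field], data: bytes) -> Optional[Dict[str, bytes]]:
    """Return the named fields if data matches the pattern, else None."""
    fields: Dict[str, bytes] = {}
    pos = 0
    for name, length, value in pattern:
        if length is None:
            chunk = data[pos:]             # arbitrary span: take the rest
        else:
            chunk = data[pos:pos + length]
            if len(chunk) != length:
                return None                # input too short for this field
        if value is not None and chunk != value:
            return None                    # required value not present
        fields[name] = chunk
        pos += len(chunk)
    # Without a trailing span, the input must have exactly the pattern's length.
    return fields if pos == len(data) else None


layout = [("kind", 1, None), ("port", 2, None)]          # layout only
valued = [("kind", 1, None), ("port", 2, b"\x00\x50")]   # port must be 80
spanned = valued + [("payload", None, None)]             # plus arbitrary span
```

Here `layout` matches any three-byte input and merely names its parts, `valued` additionally requires the port bytes to equal 80, and `spanned` accepts any number of bytes after the fixed fields.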


Last modified Thursday 29 March 2007 09:12:19 UT by Kees van Reeuwijk.