TY - GEN
T1 - An improved algorithm to accelerate regular expression evaluation
AU - Becchi, Michela
AU - Crowley, Patrick
PY - 2007
Y1 - 2007
N2 - Modern network intrusion detection systems need to perform regular expression matching at line rate in order to detect the occurrence of critical patterns in packet payloads. While deterministic finite automata (DFAs) allow this operation to be performed in linear time, they may exhibit prohibitive memory requirements. In [9], Kumar et al. propose Delayed Input DFAs (D 2FAs), which provide a trade-off between the memory requirements of the compressed DFA and the number of states visited for each character processed, which corresponds directly to the memory bandwidth required to evaluate regular expressions. In this paper we introduce a general compression technique that results in at most 2N state traversals when processing a string of length N. In comparison to the D2FA approach, our technique achieves comparable levels of compression, with lower provable bounds on memory bandwidth (or greater compression for a given bandwidth bound). Moreover, our proposed algorithm has lower complexity, is suitable for scenarios where a compressed DFA needs to be dynamically built or updated, and fosters locality in the traversal process. Finally, we also describe a novel alphabet reduction scheme for DFA-based structures that can yield further dramatic reductions in data structure size.
AB - Modern network intrusion detection systems need to perform regular expression matching at line rate in order to detect the occurrence of critical patterns in packet payloads. While deterministic finite automata (DFAs) allow this operation to be performed in linear time, they may exhibit prohibitive memory requirements. In [9], Kumar et al. propose Delayed Input DFAs (D 2FAs), which provide a trade-off between the memory requirements of the compressed DFA and the number of states visited for each character processed, which corresponds directly to the memory bandwidth required to evaluate regular expressions. In this paper we introduce a general compression technique that results in at most 2N state traversals when processing a string of length N. In comparison to the D2FA approach, our technique achieves comparable levels of compression, with lower provable bounds on memory bandwidth (or greater compression for a given bandwidth bound). Moreover, our proposed algorithm has lower complexity, is suitable for scenarios where a compressed DFA needs to be dynamically built or updated, and fosters locality in the traversal process. Finally, we also describe a novel alphabet reduction scheme for DFA-based structures that can yield further dramatic reductions in data structure size.
KW - deep packet inspection
KW - DFA
KW - regular expressions
UR - https://www.scopus.com/pages/publications/77954026099
U2 - 10.1145/1323548.1323573
DO - 10.1145/1323548.1323573
M3 - Conference contribution
AN - SCOPUS:77954026099
SN - 9781595939456
T3 - ANCS'07 - Proceedings of the 2007 ACM Symposium on Architecture for Networking and Communications
SP - 145
EP - 154
BT - ANCS'07 - Proceedings of the 2007 ACM Symposium on Architecture for Networking and Communications
T2 - 3rd ACM/IEEE Symposium on Architectures for Networking and Communications Systems, ANCS 2007
Y2 - 3 December 2007 through 4 December 2007
ER -