Automatic Signature Generation in Traffic

    Network applications such as peer-to-peer (P2P) systems, multicast download, and worms produce a large share of the traffic and connections in today's Internet. These newly emerged applications are difficult to detect because they often communicate over dynamic ports or even reuse well-known ports to avoid filtering. This is a major problem for network management, since network managers cannot identify the traffic. Signature generation is a technique that discovers patterns in packet payloads, and it is now widely used in worm detection and early warning systems. Several algorithms and systems, such as EarlyBird, Autograph, and Polygraph, have been designed to discover worm signatures accurately and efficiently. However, when these methods are applied to normal traffic, it is difficult to validate the efficiency and quality of the resulting signatures, since no complete dataset is available for each application. Moreover, these methods depend heavily on address dispersion and content repetition, so they may fail to detect "low speed" worms that do not exhibit wide address dispersion.

Figure 1. The Implementation of Signature Generation

    In this work, we propose a method for application signature generation. We designed a scheme that refines signatures in three steps: it first measures the quality of each signature, then computes the similarity between signatures with a refined Needleman-Wunsch algorithm, and finally merges signatures that are similar enough into a new signature of higher quality. The method aims at a more accurate signature generation process that discovers more information through gradual learning. Unlike worms or attack code, for which complete samples are available for study and which stand out through wide address dispersion and large traffic volume, application traffic is difficult to collect completely, so application signatures often suffer from small training data.
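    The similarity and merging steps can be illustrated with a minimal sketch. It uses the textbook Needleman-Wunsch global alignment, not the refined variant of this work, and everything specific in it is an assumption made for illustration: the scoring values (match=1, mismatch=-1, gap=-1), the similarity measure (fraction of matching alignment columns), the merge rule (runs of differing columns collapse into one "*" wildcard), the 0.8 threshold, and the function names. The quality measurement step is omitted because its definition is not given here.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of two strings.

    Returns (score, aligned_a, aligned_b); the aligned sequences are
    lists of characters in which None marks a gap.
    Scoring values are illustrative assumptions.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(None)
            i -= 1
        else:
            out_a.append(None); out_b.append(b[j - 1])
            j -= 1
    out_a.reverse(); out_b.reverse()
    return score[n][m], out_a, out_b


def similarity(a, b):
    """Fraction of alignment columns in which both signatures agree."""
    _, xa, xb = needleman_wunsch(a, b)
    if not xa:
        return 1.0
    same = sum(1 for x, y in zip(xa, xb) if x is not None and x == y)
    return same / len(xa)


def merge(a, b, wildcard="*"):
    """Keep agreeing characters; collapse each differing run into one wildcard."""
    _, xa, xb = needleman_wunsch(a, b)
    tokens = []  # characters, with None standing for a wildcard
    for x, y in zip(xa, xb):
        if x is not None and x == y:
            tokens.append(x)
        elif not tokens or tokens[-1] is not None:
            tokens.append(None)
    return "".join(wildcard if t is None else t for t in tokens)


def merge_if_similar(a, b, threshold=0.8):
    """Return the merged signature, or None if the pair is not similar enough."""
    return merge(a, b) if similarity(a, b) >= threshold else None


s1 = "GET /a.html HTTP/1.1"
s2 = "GET /b.html HTTP/1.1"
print(similarity(s1, s2))        # 0.95
print(merge_if_similar(s1, s2))  # GET /*.html HTTP/1.1
```

    In this sketch the two request lines differ in one of 20 aligned positions, so they pass the threshold and merge into a single pattern with a wildcard in place of the differing byte. A real system would operate on raw payload bytes and would need a wildcard representation that cannot collide with literal payload content.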

Reports & Papers

  1. Mingwei Zhang, Daiping Liu. "Accurate and Scalable Application Signature Discovery". In IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application (PACIIA 2008), Wuhan, China, December 2008.