Data Interception
and ANAlysis

Computer networks such as the Internet, intranets, wireless networks and company networks generate very large streams of data. These data streams can be analyzed in real time, in combination with data from other sources such as transactional databases, CRM systems and log files. The results of these analyses can be used for various purposes, such as network intrusion detection, detection of fraud in electronic transactions, gathering up-to-date marketing information and other "business intelligence", deploying systems that interact with customers dynamically based on changed or newly expected behaviour, and detecting misuse of information systems.

Research in "data stream mining"

The development of successful applications that use such data streams is at present severely limited by a few important factors:
  • The research area "mining data streams" has only just begun to be explored, and there are as yet no widely accepted general-purpose algorithms for its problems;
  • Integration with other systems: processing data streams demands a great deal of computing capacity, which strongly affects the performance of the other systems involved;
  • Maintainability: systems based on data mining techniques are very difficult to maintain;
  • Adaptability and interpretability: the world changes continuously, so models need to be adapted regularly and changes need to be interpreted correctly; in the context of data streams this is a formidable challenge;
  • Complexity of implementation: implementing "data mining" systems requires not only the introduction of analysis systems but also modifications to the systems whose data is to be analyzed.

Project goal

The project aims to develop the scientific and technological expertise needed to design efficient solutions for the aforementioned problems. We will base our study on the way in which these problems manifest themselves at the project participants BKWI, Moniforce, Robeco Direct and Interpay, so that the practical applicability of the research results is assured.

New algorithms for "data stream mining"

One of the goals of the project is the development of new algorithms to study data streams. This area has been studied worldwide for only a few years and is still largely unexplored.
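To make the constraint on such algorithms concrete: a data stream is typically too large to store, so each item can be inspected only once and the memory used must stay small. The sketch below illustrates this with Welford's well-known one-pass method for mean and variance. It is only an illustration of the single-pass style, not an algorithm developed by the project; the class name RunningStats is hypothetical.

```python
class RunningStats:
    """One-pass mean and sample variance (Welford's method).

    Illustrative only: keeps three numbers in memory regardless of
    how many stream items have been seen.
    """

    def __init__(self):
        self.n = 0        # number of items seen so far
        self.mean = 0.0   # running mean
        self._m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one stream item into the statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        """Sample variance; 0.0 until at least two items were seen."""
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


if __name__ == "__main__":
    stats = RunningStats()
    for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        stats.update(value)
    print(stats.mean, stats.variance)
```

Each item is processed in constant time and then discarded, which is the property that matters when the stream cannot be replayed.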

Adaptivity

Alongside the development of the algorithms, the "adaptivity" of the technologies to be used will be studied. Adaptivity here means the capacity to adapt dynamically to changes in the statistical properties of the observed data.
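A minimal sketch of what adaptivity in this sense can look like, assuming a single numeric measurement per stream item: an exponentially weighted mean and variance give recent data more weight, so the estimates follow the stream when its statistical properties shift, while items far from the current estimate are flagged. The class AdaptiveMonitor and its parameters alpha, threshold and warmup are hypothetical choices made for illustration; the project text does not prescribe any particular technique.

```python
import math


class AdaptiveMonitor:
    """Exponentially weighted mean/variance with a simple deviation flag.

    Illustrative assumptions: alpha controls how quickly old data is
    forgotten, threshold is the number of standard deviations that counts
    as unusual, and warmup is the number of items seen before flagging.
    """

    def __init__(self, alpha=0.1, threshold=3.0, warmup=10):
        self.alpha = alpha
        self.threshold = threshold
        self.warmup = warmup
        self.n = 0
        self.mean = 0.0
        self.var = 0.0

    def update(self, x):
        """Fold in one item; return True if it looked unusual beforehand."""
        self.n += 1
        if self.n == 1:
            self.mean = x  # the first item initialises the estimate
            return False
        deviation = x - self.mean
        std = math.sqrt(self.var)
        unusual = (
            self.n > self.warmup
            and std > 0.0
            and abs(deviation) > self.threshold * std
        )
        # Exponentially weighted updates: recent items dominate,
        # so the estimates drift along with the stream.
        self.mean += self.alpha * deviation
        self.var = (1.0 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return unusual


if __name__ == "__main__":
    monitor = AdaptiveMonitor()
    stream = [9.0, 11.0] * 25 + [99.0, 101.0] * 100  # level shifts from 10 to 100
    flags = [monitor.update(x) for x in stream]
    print("first flagged index:", flags.index(True))
    print("mean after adapting:", round(monitor.mean, 2))
```

In this sketch the first item after the shift is flagged, after which the estimates move to the new level and the flags stop. A smaller alpha makes the monitor more stable but slower to adapt.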

Modelling and maintainability

Maintainability indicates the ease with which models can be adapted when the structure of the data changes. Where models at present need to be adapted manually when the content of the data changes, the research will investigate how models can be adapted dynamically when the "world" they observe changes.

Research platforms

Another important goal is the development of research platforms. These platforms will use protocol analysis techniques to prepare and preprocess the data streams so that they are ready for analysis, without interfering with the data streams themselves (i.e. by means of "eavesdropping"). The research platforms can therefore be deployed without any adaptation of the systems responsible for the data stream exchange. Also, the use of generic algorithms implies that only a moderate amount of configuration and modelling will be necessary before deployment.




 © 2004-2005 DIANA project - All rights reserved