MAGDA - Alarm supervision in Telecommunication Networks

Eric Fabre, Albert Benveniste, Claude Jard, Laurie Ricker, Mark Smith

MAGDA (Modélisation et Apprentissage pour une Gestion Distribuée des Alarmes) is a research project supported by the French national research programme in telecommunications, RNRT (Réseau National de la Recherche en Télécommunications). The challenge for the MAGDA project is twofold: (1) adapting and combining different formalisms towards the general objective of fault diagnosis in telecommunication networks, using alarm correlation, and (2) building an experimental platform to validate this approach. The MAGDA project is a collaboration involving two academic research centres (Irisa/Inria, Rennes, and Université de Paris-Nord) and three industrial companies (France-Telecom/CNET, Alcatel/CRC, and Ilog).

Today, there are no widely recognized standards for tools that support alarm correlation. It is therefore worth mentioning some key points of this research that could be regarded as useful achievements in themselves:


All the above-mentioned points are very likely to lead to innovations in alarm correlation and network monitoring. We now focus on the part of the project devoted to distributed diagnosis algorithms.

Modelling

We regard a telecommunication network as a network of asynchronously interacting finite-state machines. Such systems are subject to spontaneous faults, occurrences of which may trigger alarms. In addition, network elements obtain services from other elements and, in turn, provide services to several other network elements. This causes both faults and alarms to propagate throughout the network. As a first approach, we have proposed modelling such a situation as follows:

Mathematical framework

Alarm interpretation is then regarded as the problem of inferring, from the observation of alarms, the hidden state history of the Petri net (PN). As some of the events (in particular, spontaneous faults) are typically random in nature, we consider a probabilistic version of our PN. To prepare for distributed diagnosis, our requirement was that stochastic independence should match concurrency: if two transitions of the PN are concurrent, all interleavings of their reception by the supervisor should have equal likelihood.
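The requirement above can be illustrated with a toy check. This is only a sketch under a stated assumption, not the PSPN construction itself: we assume that the likelihood of a run is the product of the likelihoods of its transitions, which is what independence of concurrent transitions would give. The transition names and numerical values are hypothetical.

```python
from itertools import permutations
from math import isclose, prod

# Hypothetical likelihoods of two concurrent transitions (illustrative values only).
likelihood = {"fault_a": 0.3, "fault_b": 0.05}


def run_likelihood(run):
    """Likelihood of one interleaving.

    Assumption: concurrent transitions are stochastically independent,
    so the likelihood of a run is the product of its transition likelihoods.
    """
    return prod(likelihood[t] for t in run)


# Enumerate every order in which the supervisor could receive the two transitions.
scores = {run: run_likelihood(run) for run in permutations(likelihood)}

# Under the product assumption, the order of reception does not matter.
values = list(scores.values())
all_equal = all(isclose(v, values[0]) for v in values)
```

Because a product does not depend on the order of its factors, both interleavings receive the same likelihood, which is the property required of concurrent transitions.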

We have proposed a new class of stochastic PN that satisfies the above property: Partially Stochastic Petri Nets (PSPN). The well-known theory of Hidden Markov Models (HMM) has been extended to them. HMMs are stochastic automata for which one wishes to infer the most likely hidden state (or transition) sequence from an observed sequence of transition signatures. This machinery is very popular in pattern recognition and in speech recognition. The basic algorithm is the so-called Viterbi algorithm, which computes the most likely state history on-line. In our PSPN framework, transitions are associated with "tiles" that describe local state changes. The Viterbi algorithm then reconstructs hidden trajectories by concatenating tiles that match the observations. It was therefore renamed the "Viterbi puzzle".
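For readers unfamiliar with it, the classical Viterbi algorithm for an ordinary HMM can be sketched as follows. This is the standard sequential version, not the "Viterbi puzzle" on PSPN described above. The two-state model ("ok", "faulty"), the observation symbols ("silent", "alarm") and all probabilities are hypothetical values chosen for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log-probability, path) of the most likely hidden state sequence."""

    def log(p):
        # Log-probabilities avoid numerical underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[s] = (log-probability of the best path ending in s, that path)
    best = {
        s: (log(start_p[s]) + log(emit_p[s][observations[0]]), [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Choose the best predecessor r for state s.
            score, path = max(
                ((best[r][0] + log(trans_p[r][s]), best[r][1]) for r in states),
                key=lambda candidate: candidate[0],
            )
            new_best[s] = (score + log(emit_p[s][obs]), path + [s])
        best = new_best
    return max(best.values(), key=lambda candidate: candidate[0])


# Hypothetical toy model: one network element that is either "ok" or "faulty".
states = ("ok", "faulty")
start_p = {"ok": 0.9, "faulty": 0.1}
trans_p = {
    "ok": {"ok": 0.9, "faulty": 0.1},
    "faulty": {"ok": 0.2, "faulty": 0.8},
}
emit_p = {
    "ok": {"silent": 0.9, "alarm": 0.1},
    "faulty": {"silent": 0.2, "alarm": 0.8},
}

log_p, path = viterbi(["silent", "alarm", "alarm"], states, start_p, trans_p, emit_p)
# path is ["ok", "faulty", "faulty"]
```

At each step the algorithm keeps, for every state, only the best path ending in that state, so it can process observations as they arrive. With these toy numbers, two consecutive alarms are best explained by a fault occurring after the first silent observation.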

Further issues for current research

The following topics require further development for MAGDA and are currently under investigation: