Typical examples of distributed and asynchronous Discrete Event Systems are networked systems, such as telecommunications networks. In such systems, the sensor system is distributed, it involves several local sensors, attached to some nodes of the network. Each sensor has only a partial view of the overall system. The different sensors possess their own local time, but they are not synchronized. Alarms are reported to the global supervisor asynchronously, and this supervisor performs diagnosis. This is the typical architecture in telecommunications network management systems today, our motivating application (see the Magda project). Events may be correctly ordered by each individual sensor, but communicating alarm events via the network causes a loss of synchronization, and results in a nondeterministic and possibly unbounded interleaving at the supervisor. Hence, the right picture, for what the supervisor collects, is not a sequence of alarms, but rather a partially ordered set of alarms.
In addition, for future telecommunications networks, services, and systems, supervision will not be centralized, but rather distributed. This means that each supervisor will have only a partial view of the overall system, observes only its local alarms, and has only to report local diagnosis. But, clearly, the different local diagnoses delivered by the different supevisors need to be coherent, i.e., they must be local projections of a unique overall diagnosis, this is what we mean by distributed diagnosis. Clearly, the distributed diagnosers need to cooperate at establishing distributed diagnosis. See [MagdaPaper2002] for a description of this application.
Eric Fabre et al. [FBJ2001]
have proposed to regard distributed and asynchronous Discrete Event
Systems
as Bayesian networks of dynamical systems. Eric Fabre has further
elaborated
this approach in [Fabre_2002_a]
and [Fabre_2002_b]
. Such models can be randomized or alternatively be simply
nondeterministic.
Diagnosis is then formulated as the distributed state trajectory
reconstruction,
from distributed and asynchronous sensor observations. This approach to
distributed diagnosis can be decomposed into
several subproblems : 1/ an abstract algorithmic setting for
distributed constraint solving [Fabre2003-2] , and 2/ a factorized and
modular representation for trajectories of
distributed systems, using so-called net
unfolding
techniques [Fabre2003-1] [Fabre2004]
. Putting everything
together to solve the distributed diagnosis
problem yields papers [BFHJ2003-a]
and [Concur2003].
In 2004, Eric Fabre has proposed an alternative representation for
runs of
distributed systems, by means of trellis
processes instead of unfoldings. Trellis processes generalize to
distributed and concurrent systems the
usual notion of trellis, as it is commonly used in dynamic programming
or Viterbi decoding, for automata. They
enjoy the same nice algebraic properties as unfoldings. See slides on trellis nets for a
preliminary presentation.
Since 2004, Stefan Haar is leading a new research to extend our diagnosis techniques to dynamically reconfigurable systems. Static models such as Petri nets are not suitable any more. Graph Grammars can be conveniently used instead. See [HaaBFJ2005]
See slides
on modeling and slides
on distributed diagnosis for
a comprehensive presentation.
[HaaBFJ2005] A. Benveniste, S. Haar, E. Fabre, and C. Jard. ``Fault diagnosis for distributed asynchronous dynamically reconfigured discrete event systems''. In Proc. of the 16th IFAC World Congress, Prague. July 2005.
[Fabre2004] E. Fabre. Factorization of Unfoldings for Distributed Tile Systems Part 2: General Case. INRIA Res., Rep. 5186. May 2004.
[Concur2003]
A. Benveniste, S. Haar, E. Fabre, and C. Jard. ``Distributed monitoring
of concurrent and asynchronous systems''. Plenary address at
CONCUR'2003,
Proceedings
of CONCUR'2003.
Informal presentation published in Proc. of
IEEE Control and Decision Conference, Dec. 2003.
Final and improved version published in Discrete Event Dynamic Systems, 15(1), 33-84, Mar. 2005.
Extended version available as Irisa
Research
Report 1636, July 2004 and INRIA Research Report 4842 V2, July 2004.
[Fabre2003-2] E. Fabre. Convergence of Turbo Algorithms for Systems Defined by Local Constraints. INRIA Res., Rep. 4860. July 2003.
[Fabre2003-1]
E. Fabre. Factorization of Unfoldings for Distributed Tile Systems
Part 1 : Reduced Interaction Case. INRIA Res., Rep. 4829. May 2003.
[MagdaPaper2002] A. Aghasaryan, C. Dousson, E. Fabre, Y. Pencolé, A. Osmani. Modeling Fault Propagation in Telecommunications Networks for Diagnosis Purposes. XVIII World Telecommunications Congress, 22-27 September 2002 - Paris, France.
[FBJ2001]
E. Fabre, A. Benveniste, C. Jard. ``Distributed Diagnosis for Large
Discrete
Event Dynamic Systems''.
In Proc of the IFAC world congress, July 2002.
[Fabre_2002_a]
E. Fabre. Compositional models of distributed and asynchronous
dynamical
systems. In Proc of the 2002 IEEE Conf. on Decision and Control,
1--6, Dec. 2002, Las Vegas, 2002.
[Fabre_2002_b] E. Fabre. Monitoring distributed systems with distributed algorithms. In Proc of the 2002 IEEE Conf. on Decision and Control, 411--416, Dec. 2002, Las Vegas, 2002.
[BFHJ2003-a]
A. Benveniste, E. Fabre, C. Jard, and S. Haar. ``Diagnosis of
asynchronous
discrete event systems, a net unfolding approach''. IEEE
Trans.
on Automatic Control, 48(5), 714-727, May 2003.