
A DSM-based structural programming environment
for distributed and parallel processing

Lionel Brunie and Laurent Lefèvre
Laboratoire de l'Informatique du Parallelisme
Ecole Normale Supérieure de Lyon
69364 LYON Cedex 07, France
(lbrunie, llefevre)@lip.ens-lyon.fr

Abstract:

This paper describes an original programming environment based on the DOSMOS system. Its implementation is built on a structural approach to parallel programming. By combining this structural model with weak consistency protocols, we improve the performance of the DSM system and provide a programming model that mixes message passing and virtually shared objects. Moreover, by integrating a development platform consisting of process mapping, shared object management, a pre-processing step and monitoring facilities, the DOSMOS system has been designed to provide an efficient, user-friendly programming environment.

Introduction

Distributed Shared Memory (DSM) systems have been designed to implement, on top of a distributed memory architecture, a programming model that allows transparent manipulation of virtually shared data. In practice, a DSM system therefore has to handle all communications and maintain the coherence of the shared data.

This paper describes an original programming environment called DOSMOS. The system is based on a structural approach to parallel programming: DOSMOS lets the user hierarchically structure processes into groups and sub-groups of processes that share the same set of variables. Combined with weak consistency protocols, this feature reduces the amount of communication required to manage the shared data and, as a consequence, improves the efficiency and scalability of applications.

Moreover, to suit various kinds of programmers, DOSMOS allows two programming models to be mixed in the same application: message-passing code (PVM [GBD+93]) and DSM code (DOSMOS).

To provide a useful development platform on top of DOSMOS, we have added a set of tools that forms a complete programming environment. This Tool Kit supports the design and execution of DOSMOS applications in terms of process mapping (DOSMOS-Map tool), shared object management and a pre-processing step. To complete the programming environment, DOSMOS integrates a dedicated monitoring tool (called DOSMOS-Trace), which helps the user understand and analyse the behaviour of DOSMOS applications. Finally, this programming environment has been designed to run both on distributed systems and on parallel machines. To ensure the portability of both the system and the applications, DOSMOS has therefore been developed on top of PVM.

This paper is organized as follows. After a short description of previous work on DSM systems (section 2), we describe the basics of the DOSMOS DSM system in terms of coherence protocols, shared objects, programming model and process structuring (section 3). We then describe the DSM-based programming environment (section 4). Finally, section 5 discusses the basic features of this programming environment and the implementation choices, and points out future developments.

Purpose of Distributed Shared Memory systems and previous work

By allowing the programmer to share "memory objects" (i.e. program variables) transparently, Distributed Shared Memory (DSM) systems offer an interesting trade-off between the ease of programming of shared memory machines and the efficiency and scalability of distributed memory systems. Basically, a DSM system is a mechanism that allows application processes to access shared data transparently. In other words, a DSM system relieves the programmer of the management of all inter-process communications.

Both hardware and software implementations have been proposed. The main systems rely on an additional software layer:

Virtual Shared Memory systems
(VSM) allow pages of data to be shared, i.e. they merge into a single wide address space a set of memory pages distributed over the network. Early systems like IVY [Li88] or KOAN [LP92] were dedicated to specific parallel machines. Other systems like MIRAGE [FP89], MUNGI [HERV93] or MUNIN [CBZ91] have to deal with problems specific to operating systems.

Object-based Distributed Shared Memory systems
(DSM) work at the program level, i.e. they implement a software layer that automatically generates, on the user's behalf, all the communications required to manipulate shared data. In other words, instead of defining (and writing in the code) the inter-process communications, the programmer only specifies which data are actually shared and can then use these data as if they were local. For its part, the DSM system takes charge of all the necessary communications (as a message-passing programmer would do). DSM systems of this kind, like ORCA [TKB92] or CLOUDS [RAK89], have been implemented on parallel machines. The DOSMOS [BL94, BL96] system belongs to this class of systems.

Basics of DOSMOS

DOSMOS is an object-based DSM system (cf. section 2), i.e. it allows processes to share transparently a set of passive objects (i.e. program variables) distributed over the network.

However, DOSMOS integrates several novel features:

DOSMOS Processes
 : Basically, a DOSMOS application is composed of three kinds of processes: application processes (APs), memory processes (MPs) and link processes (LPs).

Array allocation
 : DOSMOS can manipulate both basic-type variables (integer, float, char...) and distributed arrays. These arrays are split into several ``system objects'', distributed over the network. Various splittings are provided: by row, by column, by block and by cyclic block. The system ensures transparent access to arrays, whatever the splitting used.
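As a minimal sketch of the four splittings, the Python function below maps an element of a two-dimensional array to the index of the system object that would hold it. The function name owner, the scheme names and the layout rules (contiguous bands, a rectangular grid of blocks, row blocks dealt round-robin) are assumptions made for illustration; they are not DOSMOS primitives, and the layout actually used by DOSMOS may differ.

```python
from math import ceil


def owner(i, j, rows, cols, parts, scheme, block=1):
    """Return the index of the system object holding element (i, j).

    Hypothetical helper: scheme names and layout rules are assumptions.
    For "block", `parts` is a pair (grid_rows, grid_cols); otherwise it is
    the number of system objects.
    """
    if not (0 <= i < rows and 0 <= j < cols):
        raise IndexError("element outside the array")
    if scheme == "row":            # contiguous bands of rows
        return i // ceil(rows / parts)
    if scheme == "column":         # contiguous bands of columns
        return j // ceil(cols / parts)
    if scheme == "block":          # rectangular grid of blocks
        grid_rows, grid_cols = parts
        block_row = i // ceil(rows / grid_rows)
        block_col = j // ceil(cols / grid_cols)
        return block_row * grid_cols + block_col
    if scheme == "cyclic_block":   # row blocks of size `block`, dealt round-robin
        return (i // block) % parts
    raise ValueError(f"unknown scheme: {scheme}")
```

With an 8 x 8 array, owner(5, 0, 8, 8, 4, "row") places row 5 in the third band, whereas the cyclic variant with block=1 places it in the second system object. A transparent access layer only needs such a mapping to decide which system object to fetch.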

Weak consistency protocols
 : For efficiency and scalability, DOSMOS allows shared objects to be replicated, and these replicas have to be kept coherent. Most currently implemented models are oriented towards strong consistency. DOSMOS implements a weak protocol: release consistency. This model [] provides two synchronization operators: acquire and release. These operators allow processes that want to modify shared objects to lock and unlock them (in other words, these routines implement mutual exclusion on accesses to the shared objects).
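The idea behind release consistency can be sketched with a toy Python model in which every process holds a replica of one shared object. The class SharedObject and its methods are hypothetical names, not the DOSMOS API, and the sketch assumes that the final value is pushed to the other replicas at release time; an invalidation-based variant would send invalidations instead.

```python
import threading


class SharedObject:
    """Toy shared object with one replica per process (illustrative only)."""

    def __init__(self, value, process_ids):
        self._lock = threading.Lock()
        self.replicas = {pid: value for pid in process_ids}
        self.messages = 0  # coherence messages sent to other replicas

    def read(self, pid):
        # Ordinary reads use the local replica and need no communication.
        return self.replicas[pid]

    def acquire(self, pid):
        # Enter the critical section: mutual exclusion on the object.
        self._lock.acquire()

    def write(self, pid, value):
        # Inside the critical section, only the writer's replica changes.
        self.replicas[pid] = value

    def release(self, pid):
        # Leave the critical section: propagate the final value once.
        value = self.replicas[pid]
        for other in self.replicas:
            if other != pid:
                self.replicas[other] = value
                self.messages += 1
        self._lock.release()


obj = SharedObject(0, ["P0", "P1", "P2"])
obj.acquire("P0")
for v in range(1, 6):
    obj.write("P0", v)             # five writes, no message yet
stale = obj.read("P1")             # P1 still sees the old value
obj.release("P0")
fresh = obj.read("P1")             # P1 sees the value written last
```

The five writes cost two coherence messages in total (one per remote replica), instead of two per write as a strongly consistent protocol would require in this toy model.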

Hierarchical structuring of the application processes
 : Previous DSM systems have always proposed ``flat'' models in which any shared object is accessible from any process. Such ``anarchic'' models cannot scale. In DOSMOS, processes can be organized into groups and sub-groups in order to optimize the management of data coherence.

When one observes the behaviour of a DSM application, and more particularly the behaviour of one process taking part in it, it appears that some shared data are accessed intensively by this process, whereas others are accessed very rarely or never. This leads us to introduce some definitions (see Fig. 1):

Figure 1: Example of accesses distribution

Let P be a process. We have:

displaymath362

Usually, in previous systems, when an object O is modified, an invalidation message is sent to all the processes P such that tex2html_wrap_inline368 . As noted before, this prevents good scalability. By using a hierarchical grouping of processes, DOSMOS limits the invalidation messages to processes such that tex2html_wrap_inline370 .
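The effect of grouping on the number of invalidation messages can be illustrated with a small Python sketch. It assumes that a flat model notifies every other process holding a copy, and that a grouped model notifies only the holders in the writer's group; copies held outside the group, which DOSMOS reaches through link processes, are not modelled. The function invalidation_targets and its parameters are hypothetical.

```python
def invalidation_targets(writer, holders, group_of=None):
    """Processes to notify when `writer` modifies an object.

    holders:  processes holding a copy of the object.
    group_of: optional mapping process -> group name; None means a flat model.
    """
    targets = {p for p in holders if p != writer}
    if group_of is not None:
        # Grouped model (assumption): only same-group holders are notified.
        targets = {p for p in targets if group_of[p] == group_of[writer]}
    return targets


processes = [f"P{k}" for k in range(8)]
group_of = {p: ("G1" if k < 4 else "G2") for k, p in enumerate(processes)}

flat = invalidation_targets("P0", processes)
grouped = invalidation_targets("P0", processes, group_of)
```

With eight processes split into two groups of four, a write by P0 triggers seven messages in the flat case and three in the grouped case.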

Figure 2: Hierarchical grouping of processes

Basically, DOSMOS structures the application into hierarchical groups of processes sharing the same objects (Figure 2). In practice, a group is defined by a set of processes and a set of shared objects. Processes of the same group share all the objects attached to the group, i.e. if they request an object, they receive a copy of this object, which is automatically updated by the system.

DOSMOS also allows processes to access extra-group shared objects. For this purpose, in each group, a dedicated memory process, called the Link Process (LP), acts as the link between groups (see Fig. 3). These special MPs thus take charge of all communications between groups.

Figure 3: Groups and link processes
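The role of the link processes can be sketched as a routing decision. The Python code below is only an illustration: the Group record, the route function and the assumed path (requester, then the LP of its own group, then the LP of the group owning the object) are hypothetical and are not taken from the DOSMOS implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    name: str
    members: frozenset
    objects: frozenset
    link_process: str


def route(requester, obj, groups):
    """Return the chain of processes involved in serving `obj` to `requester`."""
    home = next((g for g in groups if requester in g.members), None)
    if home is None:
        raise KeyError(f"{requester!r} does not belong to any group")
    if obj in home.objects:
        return [requester]                  # served inside the group
    owner = next((g for g in groups if obj in g.objects), None)
    if owner is None:
        raise KeyError(f"{obj!r} is not attached to any group")
    # Extra-group access: only the two link processes talk across groups.
    return [requester, home.link_process, owner.link_process]


groups = [
    Group("G1", frozenset({"P0", "P1"}), frozenset({"x"}), "LP1"),
    Group("G2", frozenset({"P2", "P3"}), frozenset({"y"}), "LP2"),
]
intra = route("P0", "x", groups)
extra = route("P0", "y", groups)
```

In this sketch an intra-group request never leaves the group, while an extra-group request involves exactly one link process on each side, whatever the size of the groups.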

This model presents two important advantages :

Experiments with the DOSMOS system on networks of workstations and parallel machines have shown great improvements when hierarchical groups are used (see [Lef96]).

Programming model

As soon as the Use_Dosmos() primitive has been executed, the user can access the shared objects transparently. However, DOSMOS, like any DSM system, does not claim to be efficient in all situations. To let the user optimize specific applications, DOSMOS therefore allows different programming models to be combined. Three programming models are available:

Programming environment

The implementation of DOSMOS is based on several layers, which handle pre-processing of the code, management of the shared objects, creation of the groups and of the various processes involved in the execution of the application (APs, MPs, LPs), and monitoring of the execution (see Figure 4):

Figure 4: Dosmos Environment

Pre-processing level

This layer analyses the user's application in order to detect accesses to shared objects and generate the code that performs them. It makes the system ``transparent'' by transforming all accesses to shared objects.

DOSMOS Tool Kit

This set of tools allows the user to graphically design and execute a DOSMOS application (see Figure 7). Various features are provided to the user:

DOSMOS primitives

Because it adds only a few new primitives, the DOSMOS system remains well suited to beginners. All accesses (except exclusive ones) are totally transparent to the user.

Figure 5: DOSMOS primitives

Use_Dosmos() and End_Dosmos() mark the beginning and the end of the sharing of the objects. Data accesses do not have to be specified by the user, but two operators, Acquire and Release, are provided to access shared objects exclusively. The barrier routine synchronizes all processes sharing a given object or all processes of a given group.
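The typical ordering of these primitives can be mimicked with Python threads. This is only an analogy under stated assumptions: the threads stand in for application processes, a threading.Lock stands in for Acquire and Release, a threading.Barrier stands in for the barrier routine, and the positions where Use_Dosmos() and End_Dosmos() would be called are marked in comments. The real signatures of the DOSMOS primitives are not reproduced here.

```python
import threading

N_PROCESSES = 4
INCREMENTS = 100

counter = {"value": 0}                      # stands in for one shared object
lock = threading.Lock()                     # stands in for Acquire / Release
barrier = threading.Barrier(N_PROCESSES)    # stands in for the barrier routine
seen = [None] * N_PROCESSES


def worker(rank):
    # (Use_Dosmos() would be called here to start sharing.)
    for _ in range(INCREMENTS):
        with lock:                          # Acquire ... Release: exclusive access
            counter["value"] += 1
    barrier.wait()                          # every process has finished writing
    seen[rank] = counter["value"]           # plain read, no explicit locking
    # (End_Dosmos() would be called here to stop sharing.)


threads = [threading.Thread(target=worker, args=(r,)) for r in range(N_PROCESSES)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each worker reaches the barrier only after its last exclusive update, every worker reads the same final value after the barrier.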

DOSMOS-Trace monitoring environment  

The only way a user can influence the behaviour of an application is by modifying the structure of the shared variable space. From the user's point of view, monitoring facilities should therefore report precisely the ``activity'' of the shared variables (as in Figure 6).

The purpose of the DOSMOS-Trace [BLR96] monitoring environment is to provide such information in a scalable and weakly intrusive way. The DOSMOS-Trace tool is based on a set of dedicated processes that collect information during execution. This data collection is completely transparent to the user. The tool provides several visualizations and reports about the execution, such as statistics on shared objects, histories...

Such diagrams are extremely useful for analysing problematic situations. They make it easy to isolate ping-pong effects, over-accessed variables, excessive extra-group accesses, bottlenecks, variables that are not actually shared, etc.

Figure 6: Number and origin of the read accesses performed on an object vs execution time (in black: inter-group accesses)
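The kind of statistics such a monitoring tool can derive from collected events may be sketched as follows. The trace format (time, process, object, operation), the function summarize and the use of writer alternation as a ping-pong indicator are assumptions made for illustration; they do not describe the actual DOSMOS-Trace data format.

```python
from collections import Counter


def summarize(trace, group_of, home_group):
    """Summarize a trace of (time, process, obj, op), op in {"read", "write"}.

    Returns three counters indexed by object: total accesses, extra-group
    accesses, and changes of writer (a rough ping-pong indicator).
    """
    accesses = Counter()
    extra_group = Counter()
    writer_switches = Counter()
    last_writer = {}
    for _time, process, obj, op in sorted(trace):
        accesses[obj] += 1
        if group_of[process] != home_group[obj]:
            extra_group[obj] += 1
        if op == "write":
            if obj in last_writer and last_writer[obj] != process:
                writer_switches[obj] += 1
            last_writer[obj] = process
    return accesses, extra_group, writer_switches


trace = [
    (0, "P0", "x", "write"),
    (1, "P2", "x", "write"),
    (2, "P0", "x", "write"),
    (3, "P2", "x", "write"),
    (4, "P1", "y", "read"),
]
group_of = {"P0": "G1", "P1": "G1", "P2": "G2"}
home_group = {"x": "G1", "y": "G1"}
accesses, extra_group, writer_switches = summarize(trace, group_of, home_group)
```

In this example, object x changes writer three times and half of its accesses come from another group, which suggests either moving P2 into the group of x or restructuring the shared variable.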

Discussion and future work

This paper has described a novel DSM-based programming environment, the DOSMOS system. In comparison with previous work, this system integrates original functionalities: structuring of the application processes into hierarchical groups, mixing of message-passing code and DSM code, weak consistency protocols, design and execution facilities, and monitoring tools.

The whole system has been designed to be as efficient and scalable as possible. Process grouping, in conjunction with weak consistency protocols, reduces the amount of communication required to manage the DSM system.

Open to various programming models and designed to be efficient both on parallel machines and on distributed systems, DOSMOS provides a portable development platform. Moreover, by adding only a few new primitives and by providing graphical interfaces to design applications and analyse their execution, DOSMOS is a user-friendly programming environment that is accessible to users who are not experts in parallel programming. Tests have shown the effectiveness of the approach developed in DOSMOS. For more information about how to use a DSM system like DOSMOS, see [Lef96].

We are also continuing to improve our programming environment by adding a new distributed tool to DOSMOS that will make it possible to debug the code of DOSMOS applications in a distributed way.

Acknowledgments

The authors wish to thank Olivier Reymann, Sebastien Tixier and Jérôme Bolliet for their help.

 

Figure 7: DOSMOS Tool Kit





Laurent Lefevre
Fri Jan 31 19:32:03 MET 1997