Lionel Brunie and Laurent Lefèvre
Laboratoire de l'Informatique du Parallelisme
Ecole Normale Supérieure de Lyon
69364 LYON Cedex 07, France
This paper describes an original programming environment based on the DOSMOS system. Its implementation is built on a structural approach to parallel programming. By combining this structural model with weak consistency protocols, we improve the performance of the DSM system and provide a programming model that mixes message passing and virtually shared objects. Moreover, by integrating a development platform consisting of process mapping, shared object management, a pre-processing step and monitoring facilities, the DOSMOS system has been designed to provide an efficient, user-friendly programming environment.
Distributed Shared Memory systems (DSM) have been designed to implement, above a distributed memory architecture, a programming model allowing a transparent manipulation of virtually shared data. Thus, in practice, a DSM system has to handle all communications and to maintain coherence of the shared data.
This paper describes an original programming environment called DOSMOS. This system is based on a structural approach to parallel programming: DOSMOS lets the user structure processes hierarchically into groups and sub-groups of processes that share the same set of variables. Combined with weak consistency protocols, this feature reduces the amount of communication required to manage the shared data and, as a consequence, makes applications more efficient and scalable.
Moreover, to suit various kinds of programmers, DOSMOS allows two programming models, message passing (PVM [GBD 93]) and DSM (DOSMOS), to be mixed in the same application.
To provide a useful development platform above DOSMOS, we have added a set of tools that form a complete programming environment. This Tool Kit supports the design and execution of DOSMOS applications in terms of process mapping (DOSMOS-Map tool), shared object management and a pre-processing step. To complete the programming environment, DOSMOS integrates a dedicated monitoring tool (called DOSMOS-Trace) for understanding and analysing the behaviour of DOSMOS applications. Finally, this programming environment has been designed to run both on distributed systems and on parallel machines. Thus, to ensure the portability of both the system and the applications, DOSMOS has been developed on top of PVM.
The rest of this paper is organised as follows. After a short description of previous work on DSM systems (section 2), we describe the basics of the DOSMOS DSM system in terms of coherence protocols, shared objects, programming model and process structuring (section 3). Section 4 then describes the DSM-based programming environment. Finally, section 5 discusses both the basic features of this programming environment and the implementation choices, and points out future developments.
Purpose of Distributed Shared Memory systems and previous work
By allowing the programmer to share "memory objects" (i.e. programming variables) in a transparent way, Distributed Shared Memory (DSM) systems offer an interesting trade-off between the ease of programming of shared memory machines and the efficiency and scalability of distributed memory systems. Basically, a Distributed Shared Memory system is a mechanism that allows application processes to access shared data transparently. In other words, a DSM system relieves the programmer of the management of all inter-process communications.
Both hardware and software implementations have been proposed. The main systems require an additional software layer to be implemented:
Basics of DOSMOS
DOSMOS is an object-based DSM system (cf. section 2), i.e. it allows processes to transparently share a set of passive objects (i.e. programming variables) distributed over the network.
However, DOSMOS integrates novel features:
When one observes the behaviour of a DSM application, and more particularly that of a process participating in the application, it appears that while some shared data are intensively accessed by this process, others are accessed very rarely or never. This leads us to introduce some definitions (see Figure 1):
Figure 1: Example of accesses distribution
Let P be a process. We have:
Usually, in previous systems, when an object O is modified, an invalidation message is sent to all the processes P such that . As noted before, this prevents good scalability. By using a hierarchical grouping of processes, DOSMOS limits the invalidation messages to processes such that .
Figure 2: Hierarchical grouping of processes
Basically, DOSMOS proposes to structure the application into hierarchical groups of processes sharing the same objects (Figure 2). In practice, a group is defined by a set of processes and a set of shared objects. Processes of the same group share all the objects attached to the group, i.e. if they request an object, they receive a copy of it, which is automatically kept up to date by the system.
DOSMOS also allows processes to access shared objects outside their group. For this purpose, in each group, a dedicated memory process (MP), called the Link Process (LP), acts as the link between groups (see Figure 3). These special MPs thus handle all communications between groups.
Figure 3: Groups and link processes
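The effect of grouping on invalidation traffic can be illustrated with a minimal Python sketch. It is not the DOSMOS implementation: the names `Group`, `flat_invalidation_targets` and `grouped_invalidation_targets` are hypothetical, the writer is assumed to belong to the group that owns the object, and it is assumed that a group holding extra-group copies is notified through a single message to its link process.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """A group: a set of processes, a set of shared objects, one link process."""
    name: str
    processes: set
    objects: set
    link_process: str
    # Assumption: names of objects owned by other groups that members
    # of this group currently hold copies of.
    extra_group_copies: set = field(default_factory=set)


def flat_invalidation_targets(groups, writer):
    """Baseline without grouping: every other process is notified."""
    everyone = set()
    for g in groups:
        everyone |= g.processes
    return everyone - {writer}


def grouped_invalidation_targets(groups, writer, obj):
    """With grouping: notify the other members of the owning group, plus
    the link process of each other group that holds a copy of obj."""
    owner = next(g for g in groups if obj in g.objects)
    targets = set(owner.processes) - {writer}
    for g in groups:
        if g is not owner and obj in g.extra_group_copies:
            targets.add(g.link_process)
    return targets


# Hypothetical layout: three groups, object "x" owned by G1 and copied in G2.
g1 = Group("G1", {"p1", "p2", "p3"}, {"x"}, "lp1")
g2 = Group("G2", {"p4", "p5", "p6"}, {"y"}, "lp2", extra_group_copies={"x"})
g3 = Group("G3", {"p7", "p8"}, {"z"}, "lp3")
groups = [g1, g2, g3]

flat = flat_invalidation_targets(groups, "p1")
grouped = grouped_invalidation_targets(groups, "p1", "x")
print(len(flat), "messages without grouping,", len(grouped), "with grouping")
```

In this toy layout, group G3 never touches `x` and therefore receives no message at all, which is the scalability argument made above.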
This model has two important advantages:
As soon as the Use_Dosmos() primitive has been executed, the user can access the shared objects transparently. However, DOSMOS, like any DSM system, does not claim to be efficient in all situations. Consequently, to let the user optimize specific applications, DOSMOS allows different programming models to be combined. Three programming models are available:
Programming environment
The implementation of DOSMOS is based on several layers which handle pre-processing of the code, management of shared objects, creation of the groups and of the various processes involved in the execution of the application (APs, MPs, LPs), and monitoring of the execution (see Figure 4):
Figure 4: Dosmos Environment
Pre-processing level
This layer analyses the user's application in order to detect accesses to shared objects and generate the code that performs them. By transforming all accesses to shared objects, it makes the system "transparent".
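To give an idea of what such a source-to-source step can look like, here is a minimal sketch that uses Python's standard `ast` module on Python source as a stand-in for the user's application code. It is not the DOSMOS pre-processor: the runtime entry points `dsm_read` and `dsm_write`, the class `SharedAccessRewriter` and the function `preprocess` are hypothetical names, and only plain reads and simple assignments are handled.

```python
import ast


class SharedAccessRewriter(ast.NodeTransformer):
    """Rewrite reads and writes of declared shared names into runtime calls."""

    def __init__(self, shared_names):
        self.shared = set(shared_names)

    def visit_Name(self, node):
        # A read of a shared variable becomes dsm_read("name").
        if isinstance(node.ctx, ast.Load) and node.id in self.shared:
            call = ast.Call(
                func=ast.Name(id="dsm_read", ctx=ast.Load()),
                args=[ast.Constant(value=node.id)],
                keywords=[],
            )
            return ast.copy_location(call, node)
        return node

    def visit_Assign(self, node):
        # Rewrite the right-hand side first: it may read shared variables.
        self.generic_visit(node)
        target = node.targets[0]
        if (len(node.targets) == 1 and isinstance(target, ast.Name)
                and target.id in self.shared):
            # "name = value" becomes dsm_write("name", value).
            call = ast.Call(
                func=ast.Name(id="dsm_write", ctx=ast.Load()),
                args=[ast.Constant(value=target.id), node.value],
                keywords=[],
            )
            return ast.copy_location(ast.Expr(value=call), node)
        return node


def preprocess(source, shared_names):
    """Return source with accesses to shared_names made explicit."""
    tree = SharedAccessRewriter(shared_names).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later


print(preprocess("total = total + local", {"total"}))
```

The user keeps writing ordinary assignments; only the generated code mentions the runtime, which is what makes the shared accesses transparent from the user's point of view.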
DOSMOS Tool Kit
This set of tools allows the user to graphically design and execute a DOSMOS application (see Figure 7). Various features are provided:
DOSMOS primitives
By adding only a few new primitives, the DOSMOS system remains well suited to beginners. All accesses (except exclusive ones) are totally transparent to the user.
Figure 5: DOSMOS primitives
Use_Dosmos() and End_Dosmos() mark the beginning and the end of object sharing. Data accesses do not have to be specified by the user; however, to access shared objects exclusively, two operators are provided: Acquire and Release. The barrier routine synchronizes all processes sharing a given object or all processes of a given group.
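The way these primitives fit together can be sketched on a single machine, with threads playing the role of processes. Everything below is an illustrative assumption rather than the DOSMOS API: `DosmosSketch` and its methods are hypothetical Python analogues of Use_Dosmos, End_Dosmos, Acquire, Release and the barrier routine, the barrier covers a single group, and the explicit `read` and `write` calls stand for the accesses that the pre-processing layer would generate.

```python
import threading


class DosmosSketch:
    """Toy stand-in for a DSM runtime: threads play the role of processes."""

    def __init__(self, n_processes, initial_objects):
        self._objects = dict(initial_objects)
        self._locks = {name: threading.Lock() for name in initial_objects}
        self._barrier = threading.Barrier(n_processes)

    def use_dosmos(self):
        """Analogue of Use_Dosmos(): sharing starts (no setup needed here)."""

    def end_dosmos(self):
        """Analogue of End_Dosmos(): sharing ends."""

    def acquire(self, name):
        """Analogue of Acquire: exclusive access to one shared object."""
        self._locks[name].acquire()

    def release(self, name):
        """Analogue of Release: give up exclusive access."""
        self._locks[name].release()

    def barrier(self):
        """Wait until every process of the (single) group has arrived."""
        self._barrier.wait()

    def read(self, name):
        return self._objects[name]

    def write(self, name, value):
        self._objects[name] = value


def worker(dsm, increments):
    dsm.use_dosmos()
    for _ in range(increments):
        dsm.acquire("counter")      # exclusive section
        dsm.write("counter", dsm.read("counter") + 1)
        dsm.release("counter")
    dsm.barrier()                   # all updates are done past this point
    dsm.end_dosmos()


def run(n_processes=4, increments=1000):
    dsm = DosmosSketch(n_processes, {"counter": 0})
    threads = [threading.Thread(target=worker, args=(dsm, increments))
               for _ in range(n_processes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return dsm.read("counter")


print(run())
```

Because every increment happens between `acquire` and `release`, no update is lost and the final counter equals the number of processes times the number of increments.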
DOSMOS-Trace monitoring environment
The only way a user can influence the behaviour of an application is to modify the structure of the shared variable space. So, from the user's point of view, monitoring facilities should make it possible to know precisely the "activity" of the shared variables (as in Figure 6).
The purpose of the DOSMOS-Trace [BLR96] monitoring environment is to provide such information in a scalable and weakly intrusive way. The DOSMOS-Trace tool is based on a set of dedicated processes which collect information during execution. This data collection is completely transparent to the user. The tool provides several visualizations and information about the execution, such as statistics on shared objects, histories, etc.
Such diagrams are extremely useful for analysing problematic situations. Indeed, they make it very easy to isolate ping-pong effects, over-accessed variables, excessive extra-group accesses, bottlenecks, variables that are not actually shared, etc.
Figure 6: Number and origin of the read accesses performed on an object vs execution time (in black: inter-group accesses)
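The kind of statistics behind such a diagram can be sketched in a few lines of Python. This is not DOSMOS-Trace: the record type `Access`, the functions `read_histogram` and `writer_switches`, and the trace itself are hypothetical, and the ping-pong indicator used here (counting changes of writer between consecutive writes) is only one crude way to spot the effect.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    time: float    # time of the access, in arbitrary units
    process: str   # accessing process
    group: str     # group of the accessing process
    obj: str       # shared object accessed
    kind: str      # "read" or "write"


def read_histogram(trace, obj, owner_group, bin_width):
    """Count reads of obj per time bin, split into intra-group and
    inter-group accesses (the two series of the diagram)."""
    intra, inter = Counter(), Counter()
    for a in trace:
        if a.obj == obj and a.kind == "read":
            time_bin = int(a.time // bin_width)
            if a.group == owner_group:
                intra[time_bin] += 1
            else:
                inter[time_bin] += 1
    return intra, inter


def writer_switches(trace, obj):
    """Count how often two consecutive writes to obj come from different
    processes; a high value hints at a ping-pong effect."""
    writes = sorted((a for a in trace if a.obj == obj and a.kind == "write"),
                    key=lambda a: a.time)
    writers = [a.process for a in writes]
    return sum(1 for prev, cur in zip(writers, writers[1:]) if prev != cur)


# Hypothetical trace for an object "x" owned by group G1.
trace = [
    Access(0.1, "p1", "G1", "x", "read"),
    Access(0.4, "p4", "G2", "x", "read"),
    Access(1.2, "p1", "G1", "x", "write"),
    Access(1.5, "p2", "G1", "x", "write"),
    Access(1.7, "p1", "G1", "x", "write"),
    Access(2.3, "p2", "G1", "x", "read"),
]
intra, inter = read_histogram(trace, "x", "G1", bin_width=1.0)
print(dict(intra), dict(inter), writer_switches(trace, "x"))
```

A variable whose inter-group series dominates is a candidate for being attached to another group, which is exactly the kind of restructuring decision left to the user.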
Discussion and future work
This paper has described a novel DSM-based programming environment, the DOSMOS system. In comparison with previous work, this system integrates original functionalities: the structuring of application processes into hierarchical groups, the mixing of message-passing code and DSM code, weak consistency protocols, design and execution facilities, and monitoring tools.
The whole system has been designed to be as efficient and scalable as possible. In particular, process grouping, in conjunction with weak consistency protocols, reduces the amount of communication required to manage the DSM system.
Open to various programming models and designed to be efficient on both parallel machines and distributed systems, DOSMOS provides a portable development platform. Moreover, by adding only a few new primitives and by providing graphical interfaces for designing applications and analysing their execution, DOSMOS is a user-friendly programming environment that is accessible to users who are not experts in parallel programming. Tests have shown the effectiveness of the approach developed in DOSMOS. For more information about how to use a DSM system like DOSMOS, see [Lef96].
We are currently continuing to improve our programming environment by adding a new distributed tool to DOSMOS for debugging, in a distributed way, the code of DOSMOS applications.
Acknowledgments
The authors wish to thank Olivier Reymann, Sebastien Tixier and Jérôme Bolliet for their help.
Figure 7: DOSMOS Tool Kit