The PeerWare coordination model is centered around the notion of a global virtual data structure (GVDS), which is a generalization of the Lime coordination model. According to this model, peers interact by accessing a data space that is transiently shared and dynamically built out of the data spaces provided by each accessible peer. From the point of view of the user accessing this GVDS, the content of the data structure is automatically and dynamically reconfigured according to changes occurring in the system, typically induced by changes in connectivity among peers.
The data structure managed by PeerWare is organized as a directed graph composed of labelled nodes and documents, collectively referred to as items. Nodes are organized in an unrooted tree, while documents represent the leaves of the graph and are linked to one or more nodes. This graph represents a containment relation used to structure and classify the documents managed through the middleware. It resembles a filesystem in which directories play the role of nodes, files are the documents, and Unix-like hard links are allowed only on documents. The figure below shows an example of this data structure.
Each peer is associated with a local data structure, organized as described above, whose content is assumed to be stored locally to the peer. At any time, the local data structures held by the peers connected to PeerWare are made available to the other peers as part of the GVDS managed by PeerWare. The GVDS has the same structure as the local data structure (i.e., it complies with the above definition), and its content is obtained by "superimposing" all the local data structures belonging to the peers currently connected, as shown in the figure below.
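The superimposition described above can be illustrated with a minimal sketch. It is not PeerWare code: the class GvdsSketch, its method names, and the "/"-separated node paths are assumptions made for illustration. It also assumes that a document is identified by its name, so that equally named documents linked to the same node on different peers collapse into one entry.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: local data structures and their superimposition into a GVDS view. */
final class GvdsSketch {
    /** A local data structure: node path -> names of the documents linked to that node. */
    static Map<String, Set<String>> localSpace() {
        return new TreeMap<>();
    }

    /** Links a document to a node; the same document may be linked to several nodes. */
    static void link(Map<String, Set<String>> space, String nodePath, String doc) {
        space.computeIfAbsent(nodePath, k -> new TreeSet<>()).add(doc);
    }

    /** Superimposes the local structures of the connected peers: union of nodes and, per node, of documents. */
    static Map<String, Set<String>> superimpose(List<Map<String, Set<String>>> connected) {
        Map<String, Set<String>> gvds = new TreeMap<>();
        for (Map<String, Set<String>> local : connected) {
            for (Map.Entry<String, Set<String>> e : local.entrySet()) {
                gvds.computeIfAbsent(e.getKey(), k -> new TreeSet<>()).addAll(e.getValue());
            }
        }
        return gvds;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> peerA = localSpace();
        link(peerA, "/projects/alpha", "spec.xml");
        link(peerA, "/shared", "spec.xml");          // hard-link-like: one document, two nodes
        Map<String, Set<String>> peerB = localSpace();
        link(peerB, "/projects/alpha", "notes.xml");
        link(peerB, "/projects/beta", "plan.xml");

        // Both peers connected: the view contains the union of their content.
        System.out.println(superimpose(Arrays.asList(peerA, peerB)));
        // Peer B disconnects: its contribution disappears from the view.
        System.out.println(superimpose(Arrays.asList(peerA)));
    }
}
```

Recomputing the view from the list of connected peers mirrors the idea that the GVDS is never stored as a whole; it is only the result of combining what is reachable at a given moment.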
Changes in connectivity among peers (e.g., caused by mobility, or simply by logging in to and out of the PeerWare network) change the content of the GVDS managed by PeerWare, as local data structures may become available or disappear. Nevertheless, this reconfiguration is completely hidden from the peers accessing the GVDS, which only need to be aware that its content and structure may change over time.
The goal in designing PeerWare was to develop a flexible and extensible middleware with a minimal set of primitives that could support both a proactive and a reactive style of interaction among peers. At the model level we pursued this goal by introducing only three main primitives, which can be applied either to the local data structure associated with a peer or to the GVDS: the first proactively operates on the data managed by PeerWare, the second is used to subscribe to events occurring on such data, and the third atomically combines the first two. The operation actually performed by these primitives is not encoded within them but is provided as a parameter: the action A in the descriptions below.
I=execute(Fn, Fi, A). This primitive takes a node filter Fn, an item filter Fi, and an action A, and executes the action on the projection of the data structure identified by Fn and Fi to determine a set of items I, which is returned to the caller. In particular, Fn determines a set of matching nodes and Fi filters the content of such nodes to determine the set of items handed to action A.
subscribe(Fn, Fi, Fe, C). This primitive subscribes to events that match the event filter Fe and are published within the projection of the data structure identified by the filters Fn and Fi. When such an event occurs, the callback C is executed locally to the caller.
I=executeAndSubscribe(Fn, Fi, Fe, A, C). This primitive executes the action A on the projection of the data structure identified by Fn and Fi, similarly to the execute primitive. In the same atomic step, it also subscribes to events that match Fe and occur within the same projection of data, specifying the callback C that must be executed locally to the caller when one of these events occurs.
Although the signature of these operations is identical for both local and global data structures, their effect is limited in scope by the nature of the data structure they are applied to. The semantics of the operations is also affected by this choice. In particular, the semantics of a global operation can be regarded as equivalent to a distributed execution of the corresponding operation on the local data structures of the peers currently connected.
Atomicity is guaranteed when the operations are invoked on the local data structure of a single peer. When they are executed globally, PeerWare guarantees atomicity only for the execution of the corresponding operation on each local data structure, which, as noted above, is an integral part of the global execution.
As a final remark, we may observe that the executeAndSubscribe operation extends execute with the ability to "hook" on some information. It allows the realization of schemes providing strong consistency on such information, by retrieving some data and monitoring the events occurring on them. For instance, a programmer might want to retrieve the content of a node and be notified if any new document appears in that node, e.g., to build a graphical browser of the GVDS. The same behavior cannot be obtained by simply invoking execute followed by subscribe. In fact, given the inherently distributed and asynchronous nature of the system, a peer could publish a relevant event right in between the execute and the subscribe. Such an event would not be captured by the subscription, and the notification would never show up, thus leading to an inconsistent state.
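The race described above can be made concrete with a small sketch of a single local data structure. This is not the PeerWare API: the class LocalSpaceSketch is hypothetical, the filters Fn, Fi, and Fe are reduced to a single node name, the action is fixed to "return the documents of the node", and the only event is the appearance of a new document. Atomicity is obtained here with a single Java monitor, which is an assumption about one possible local implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch of the three primitives on one local data structure. */
final class LocalSpaceSketch {
    private final Map<String, List<String>> docsByNode = new HashMap<>();
    private final Map<String, List<Consumer<String>>> callbacksByNode = new HashMap<>();

    /** Adds a document to a node and notifies subscribers, all under the same lock. */
    synchronized void publish(String node, String doc) {
        docsByNode.computeIfAbsent(node, k -> new ArrayList<>()).add(doc);
        List<Consumer<String>> callbacks = callbacksByNode.getOrDefault(node, Collections.emptyList());
        for (Consumer<String> c : callbacks) {
            c.accept(doc);
        }
    }

    /** Simplified execute: returns a snapshot of the documents in the node. */
    synchronized List<String> execute(String node) {
        return new ArrayList<>(docsByNode.getOrDefault(node, Collections.emptyList()));
    }

    /** Simplified subscribe: the callback fires for documents published from now on. */
    synchronized void subscribe(String node, Consumer<String> callback) {
        callbacksByNode.computeIfAbsent(node, k -> new ArrayList<>()).add(callback);
    }

    /** Snapshot and subscription in one critical section: no publish can slip in between. */
    synchronized List<String> executeAndSubscribe(String node, Consumer<String> callback) {
        List<String> snapshot = execute(node); // Java monitors are reentrant
        subscribe(node, callback);
        return snapshot;
    }

    public static void main(String[] args) {
        // Separate calls: a document published in between is in neither the snapshot nor the notifications.
        LocalSpaceSketch racy = new LocalSpaceSketch();
        List<String> missed = new ArrayList<>();
        List<String> first = racy.execute("/inbox");
        racy.publish("/inbox", "late.xml");          // stands in for a concurrent peer
        racy.subscribe("/inbox", missed::add);
        System.out.println("snapshot=" + first + " notified=" + missed);

        // Combined call: every document is seen exactly once, in the snapshot or as a notification.
        LocalSpaceSketch safe = new LocalSpaceSketch();
        List<String> notified = new ArrayList<>();
        List<String> snapshot = safe.executeAndSubscribe("/inbox", notified::add);
        safe.publish("/inbox", "late.xml");
        System.out.println("snapshot=" + snapshot + " notified=" + notified);
    }
}
```

In the first half of main the document late.xml is lost to the caller, which is exactly the inconsistency that the combined primitive is meant to rule out. Invoking callbacks while holding the lock keeps the sketch short; a real implementation would more likely queue notifications.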
Other operations are included in the model, namely to create new items, destroy existing items, and notify the occurrence of events. For further details on these operations, we suggest reading the papers describing PeerWare or going directly to the API.
The PeerWare model naturally suggests a middleware implementation that is intrinsically peer-to-peer, where each peer hosts a repository that contains its local data structure. An operation on the GVDS managed by the middleware, e.g., a global execute, is then performed by disseminating to the connected peers the request for a local invocation of the corresponding primitive, and sending the results back to the caller. Hence, each peer needs to host a run-time support to manage the routing of system messages, like event notifications and requests for operations.
Nevertheless, the model does not prescribe anything about how such
routing must be performed, e.g., what is the topology of the network interconnecting
the peers, and what algorithms are used to perform routing on top of it. On
the other hand, the PeerWare model includes several choices that have been made
on purpose to open up opportunities to improve efficiency and scalability of
any PeerWare implementation, independently from the underlying architecture.
In particular, the hierarchical nature of the data structure chosen happens
to provide a natural way to restrict the scope of the operations performed over
the GVDS, and thus to allow optimizations of the processing involved. For instance,
the distribution of requests for an execute
should always be somehow
"steered" only towards the peers that actually contain the nodes that
are targeted by this operation.
Moreover, the mechanism of actions not only allows programmers to define dynamically the exact behavior of the primitives through which they access the GVDS, but also allows computation to be moved close to resources, thus opening up interesting opportunities to efficiently implement complex operations over documents. At the architectural level this involves the use of mobile code technology to implement the shipping and fetching of the code of actions.
Finally, the model leaves unspecified the nature of the languages used to specify the filters Fn, Fi, and Fe. Here, the tradeoffs are between the expressive power placed in the hands of the programmer and the burden of added complexity and overhead placed on the middleware run-time support.
As mentioned, the current PeerWare implementation is meant to support the development of peer-to-peer applications for collaborative work, in a typical enterprise domain in which users are connected through wired or wireless links to a medium-sized fixed network. In this scenario, the fixed network may provide a backbone of permanently active peers, taking care of processing and routing the control messages related to requests for operations, as well as subscriptions to and notifications of events. Other peers may be permanently or discontinuously attached as leaves of this backbone, including a dynamic fringe of mobile peers, whose connectivity is enabled by wireless devices.
To support portability and platform independence, Java was chosen as the implementation language. Moreover, we designed the PeerWare run-time in a way that is independent from the underlying repository, by decoupling the two through the use of an adaptation layer, represented by a Java interface, which specifies the operations the PeerWare run-time needs to perform on the underlying repository. As a consequence of this choice, we prescribe very little about the data filtering language or the document format. Documents are managed by the PeerWare run-time as opaque data returned by the repository, whose processing may be delegated further to actions. In the current implementation, we chose a simple, open source XML repository, thus data filters are XQL queries and documents are XML data.
As for the node filtering language, we adopted a simplified form of regular expression, similar to the one used by Unix shells, to reduce the effort needed to interpret and evaluate node filters. In this language, the wildcard "*" may appear only at the end of an expression, so that a filter selects either a specific node or all the subnodes of a given one.
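A matcher for a filter language of this kind can be sketched in a few lines. The class NodeFilterSketch and the "/"-separated path syntax are assumptions made for illustration; the actual syntax accepted by PeerWare may differ. The sketch treats a trailing "*" as "any path starting with the preceding prefix" and rejects a "*" anywhere else.

```java
/** Hypothetical sketch: node filters in which "*" may appear only as the last character. */
final class NodeFilterSketch {
    /** Returns true if nodePath is selected by filter. */
    static boolean matches(String filter, String nodePath) {
        int star = filter.indexOf('*');
        if (star < 0) {
            return filter.equals(nodePath);              // no wildcard: exactly one node
        }
        if (star != filter.length() - 1) {
            throw new IllegalArgumentException("'*' may appear only at the end: " + filter);
        }
        String prefix = filter.substring(0, star);
        return nodePath.startsWith(prefix);              // trailing wildcard: prefix match
    }

    public static void main(String[] args) {
        System.out.println(matches("/projects/alpha", "/projects/alpha"));
        System.out.println(matches("/projects/*", "/projects/alpha/docs"));
        System.out.println(matches("/projects/*", "/archive/2001"));
    }
}
```

Because evaluation reduces to an equality test or a prefix test, a filter can be checked against a node in time linear in the length of the path, which matches the stated goal of keeping node filter evaluation cheap.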
For events and the related filters we borrowed from our previous experience in implementing and using Jedi, a distributed publish/subscribe middleware. Accordingly, PeerWare events are characterized by fields, each one having a name and a string value, as in Jedi. In this scheme, the event filtering language allows programmers to specify which fields must be present in the events they are interested in and, through a regular expression, what their values must be.
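The event model above lends itself to a compact sketch. The class EventFilterSketch and its methods are hypothetical, events are represented as plain maps from field name to string value, and the regular expression dialect is that of java.util.regex with whole-value matching, which is an assumption rather than the documented behavior of PeerWare or Jedi.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical sketch: an event filter listing required fields and a regex for each value. */
final class EventFilterSketch {
    private final Map<String, Pattern> required = new LinkedHashMap<>();

    /** Requires that the field is present and that its whole value matches valueRegex. */
    EventFilterSketch require(String field, String valueRegex) {
        required.put(field, Pattern.compile(valueRegex));
        return this;
    }

    /** An event matches if every required field is present with a matching value; extra fields are ignored. */
    boolean matches(Map<String, String> event) {
        for (Map.Entry<String, Pattern> e : required.entrySet()) {
            String value = event.get(e.getKey());
            if (value == null || !e.getValue().matcher(value).matches()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        EventFilterSketch filter = new EventFilterSketch()
                .require("type", "created")
                .require("node", "/projects/.*");
        Map<String, String> event = new HashMap<>();
        event.put("type", "created");
        event.put("node", "/projects/alpha");
        event.put("author", "alice");                    // not mentioned by the filter
        System.out.println(filter.matches(event));
    }
}
```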
Jedi also inspired the design of the mechanisms currently used to route messages (i.e., events as well as requests for execute, subscribe, and executeAndSubscribe operations), which are based on a hierarchical architecture where peers are arranged in an unrooted tree. Given the characteristics of the domain we targeted and the requirements we set, we decided that this architecture offers the best tradeoff: it makes it possible to use a fixed backbone of permanently active peers, it avoids the potential for routing loops, and it keeps the routing algorithms simple and efficient. To enhance flexibility, the tree of peers is allowed to change dynamically.
Information about the peers that host each node is dynamically maintained by the PeerWare run-time, and it is used to steer execute, subscribe, and executeAndSubscribe operations only towards the peers that actually host the data required to perform the operation.
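The steering idea can be sketched as a lookup table from nodes to hosting peers. This is a deliberate simplification: the class SteeringSketch and its methods are hypothetical, the table is flat and kept in one place, whereas the PeerWare run-time maintains this information across a tree of peers. Node filters follow the trailing-wildcard convention described earlier, with "/"-separated paths assumed.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch: forwarding a request only to the peers hosting matching nodes. */
final class SteeringSketch {
    private final Map<String, Set<String>> peersByNode = new TreeMap<>();

    /** Records that a peer hosts a node, e.g., when the peer connects. */
    void advertise(String peer, String nodePath) {
        peersByNode.computeIfAbsent(nodePath, k -> new TreeSet<>()).add(peer);
    }

    /** Forgets everything hosted by a peer, e.g., when the peer disconnects. */
    void withdraw(String peer) {
        peersByNode.values().forEach(peers -> peers.remove(peer));
        peersByNode.values().removeIf(Set::isEmpty);
    }

    /** Peers that must receive a request carrying the given node filter. */
    Set<String> targets(String nodeFilter) {
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : peersByNode.entrySet()) {
            if (matches(nodeFilter, e.getKey())) {
                result.addAll(e.getValue());
            }
        }
        return result;
    }

    /** Trailing-wildcard match: exact path, or prefix when the filter ends with "*". */
    private static boolean matches(String filter, String nodePath) {
        return filter.endsWith("*")
                ? nodePath.startsWith(filter.substring(0, filter.length() - 1))
                : filter.equals(nodePath);
    }

    public static void main(String[] args) {
        SteeringSketch table = new SteeringSketch();
        table.advertise("peerA", "/projects/alpha");
        table.advertise("peerB", "/projects/beta");
        table.advertise("peerC", "/archive/2001");
        System.out.println(table.targets("/projects/*"));   // peerC is never contacted
        table.withdraw("peerB");
        System.out.println(table.targets("/projects/*"));
    }
}
```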
As in the case of the repository, security and access control are not handled directly by PeerWare but are delegated to an external security module, thus allowing different policies and protocols to be supported. PeerWare provides the hooks needed to establish secure communication channels and to perform authentication and access control on top of it.
For further details on these issues, we suggest reading the papers describing PeerWare or referring to the documentation of the PeerWare API.