CSDocs Architecture Sketch

Compound/Structured-Document Extensions to DMA 1.0

These pages provide a sketch of the overall architecture for CSDocs, the extensions to the DMA 1.0 model for Compound/Structured Documents.

This is work in progress.  It is organized in this fashion to have more flexibility in annotation, cross linking, and other operations while the architectural sketch is being drafted, explored, validated, and revised.  The HTML is used at a level that is easily incorporated back into one or more DMA Architecture Change Proposals later in the process.

This is an intermediate top-level piece to the sketch: a sketch-of-the-sketch.  This more-informal presentation is circulated and posted in order to obtain early review and give all Technical Committee and DMA participants a sense of the current direction.  The key ideas are presented and developed in the Architectural Approach section.  This is the roadmap being followed.   Later sections expand on the roadmap and provide current information, location of specification drafts, and so on.  The What's New? section indicates where the authors and subcommittee are concentrating in current work on the proposal.  The About This Material section provides information on how to obtain the latest information on the sketch and other CSDocs activities.

The CSDocs Foundation draft proposal is now available in HTML and Word format.

This is sketch version 1.8 created on 2001-05-03-17:50 -0700 (pdt)


Content

Architectural Approach
    CSDocs Modularity
    CSDocs Foundation
    Advanced Features
    Postponed CSDocs Extensions

CSDocs Foundation
    Basic Compound/Structured Document Model
    CSRoot Renditions of Compound/Structured Documents
    Unique Identifiability of Persistent DMA Elements
    Specifying the Content Dependencies of Compound/Structured Documents

CSDocs Advanced Features
    Direct Navigation to CSDocs Elements
    Virtual Elements

Miscellaneous Topics
    Access to DMA Elements by URL
    Metadata for Description of Policy Mechanisms
    Specifying Referential-Integrity Strength
    Following New Versions with Relationships


What's New?

Following the June 15-17, 1999 DMA Technical Committee meeting in Costa Mesa, California, it was agreed to pursue the development of the proposal for the CSDocs Foundation.  On the June 29, 1999 CSDocs subcommittee conference call, there were three topics (in two areas) to be deepened under the Foundation:

  1. How the level of abstraction of a CSRoot DocVersion is determined by a CSRoot-aware application.
    A brief position statement on CSRoot Renditions is included in the section on CSRoot Renditions, below.
  2. Where the InstanceId property is introduced into the DMA class hierarchy.
    An expanded position statement on InstanceId in general, and on the location of the property, is now available here.
  3. How InstanceId properties are formed, providing a basis for referring to specific elements within an independently-persistent DMA object.
    An expanded position statement and discussion of the formation of InstanceId values is now available here.

The foundation proposal, promised for July 13, was not available at that time.  Discussion continued and the first draft of a complete foundation proposal was made available on July 28.

While the CSDocs Foundation is being reviewed and advanced toward recommendation by the CSDocs subcommittee for acceptance by the DMA Technical Committee, discussion and proposal of Advanced Features will begin in parallel, via e-mail.

About This Material


Download Locations

You are viewing a copy of the CSDocs Architecture Sketch.  It is not necessarily the latest version.

To obtain the latest version, look in 


Related Materials and Resources


Architectural Approach


It is intended that the CSDocs architecture be worked through and approved in layers.  The first level is for the foundation.  Then there is an opportunity to propose advanced extensions for a number of topics.  Some developers require some or all of the advanced extensions to deliver their complete compound-structured document models.  Other developers, and many applications, will operate at the simpler foundation level, with more enforcement of the compounding rules by application agreement instead of DocSpace object-model and policy mechanisms.

There are expected to be several architecture-change proposals as part of achieving the CSDocs extensions.  The foundation will be produced in a single proposal, followed by supplements added to the foundation.


CSDocs Modularity

The table below shows the general modularization and progression of CSDocs features. 

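Level      | CSDocs Extensions                                                    | Generic DMA Extensions
-----------+----------------------------------------------------------------------+------------------------------------------------------------
Foundation | Basic Compound/Structured Document Model                             | -
           | Overall Renditions of Compound/Structured Documents                  | -
           | Specifying the Content Dependencies of Compound/Structured Documents | Unique Identifiability of Persistent DMA Elements
-----------+----------------------------------------------------------------------+------------------------------------------------------------
Advanced   | Direct Navigation to CSDocs Elements                                 | Direct Reference and Navigation to Persistent DMA Elements*
           | Virtual Components in CSDocs                                         | DMA Extensions for Virtual Elements
           | Miscellaneous                                                        | Access to DMA Elements by URL
           |                                                                      | Metadata for Description of Policy Mechanisms
           |                                                                      | Specifying Referential-Integrity Strength
           | Following New Versions of Components                                 | Following New Versions with Relationships*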

* these two topics are not expanded upon in this sketch.  The potential generalization beyond the CSDocs-specific cases is straightforward.


CSDocs Foundation

The first three topics are candidates for inclusion into a single CSDocs foundation proposal:

  1. The Basic Compound Document Model establishes a new Relationship class that represents the dependencies of root documents on component documents.  This is used for all compound-document dependencies under CSDocs.   The nomenclature around roots and components, and structures of them, is defined to be application neutral.
  2. The Overall Renditions approach addresses the way that renditions, if any, of a root document are related to the overall content of the compound document that is represented.
  3. The relationship between a root and a component is often a relationship between an element of the root's content structure (rendition or content element) and an element of the component's content structure.  The third foundational extension augments the basic model so that the particular internal elements involved in the relationship are explicitly identifiable.


Advanced Features

The advanced extensions for CSDocs are often best accomplished by introducing generic extensions to DMA.

The advanced-level extensions are:

  1. Direct navigation to dependently-persistent elements
  2. Virtual elements in compound relationships
  3. Accessing DMA elements by URL
  4. Metadata for specifying constraints and policies
  5. Referential-integrity strength
  6. Following new versions of components 


Postponed CSDocs Extensions

There are also potential interactions with CBSearch that are not addressed in the current sketch.

There are also potentially-complex interactions with extensions of the DMA 1.0 Versioning model to support complex configurations (especially branching).

Features of the current sketch might also be postponed to accelerate initial trial use and agreement on the basic features of DMA CSDocs extensions.


Basic Compound/Structured Document Model

The basic compound document model establishes how the constituents of a compound document can be represented in structures of separate DMA objects.  The interdependencies among the objects are established with DMA relationship objects:

Figure 1. UML scheme for Basic Compound Documents (a) with typical instance (b)

1. The CSDocs extension depends on a new optionally-supported subclass of the DMA 1.0 dmaClass_Relationship class. This is the dmaClass_CSRelationship class and its subclasses. 

2. The reflective property of a dmaClass_CSRelationship dmaProp_Tail property is always a dmaProp_CSComponents enumeration property. 

3. The reflective property of a dmaClass_CSRelationship dmaProp_Head property is always a dmaProp_CSRoots enumeration property. 

In the basic CSDocs model, an object can be both a CSRoot and a CSComponent.  A DocSpace may support a variety of different families of CSDocs structures by use of subclassing and relationship restrictions.
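
As an illustration only, the basic model can be sketched in a few lines of Python.  The class and attribute names below are hypothetical stand-ins rather than DMA interfaces: `cs_components` and `cs_roots` model the reflective dmaProp_CSComponents and dmaProp_CSRoots enumerations, and the sketch assumes (consistent with the traversal described later in this sketch) that the tail of a CSRelationship is the CSRoot and the head is the CSComponent.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class DocVersion:
    """Hypothetical stand-in for an independently-persistent DocVersion."""
    name: str
    # Reflective enumerations of the relationships this object takes part in.
    cs_components: list = field(default_factory=list)  # models dmaProp_CSComponents
    cs_roots: list = field(default_factory=list)       # models dmaProp_CSRoots


@dataclass(eq=False)
class CSRelationship:
    """Hypothetical stand-in for dmaClass_CSRelationship."""
    tail: DocVersion  # models dmaProp_Tail (assumed to be the CSRoot)
    head: DocVersion  # models dmaProp_Head (assumed to be the CSComponent)

    def __post_init__(self):
        # Keep the reflective properties in step with Tail and Head.
        self.tail.cs_components.append(self)
        self.head.cs_roots.append(self)


book = DocVersion("book")
chapter = DocVersion("chapter")
figure = DocVersion("figure")
CSRelationship(tail=book, head=chapter)
CSRelationship(tail=chapter, head=figure)

# "chapter" is a CSComponent of "book" and, at the same time, a CSRoot for "figure".
print([r.head.name for r in chapter.cs_components])  # ['figure']
print([r.tail.name for r in chapter.cs_roots])       # ['book']
```

The relationship object, not the DocVersion, carries the dependency; the DocVersions only see it through their reflective enumerations.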


CSRoot Renditions of Compound/Structured Documents

The basic compound document model does not establish anything about the application of the CSDocs structures.  Anything about what a CSDocs structure is for and how one employs the DocVersions that comprise it depends on application-agreement, published profiles, and other practices around use of the basic compound document model.

The CSRoot Renditions principles are the first that ground CSDocs for specific, practical application: Determining the relationship between the semantics of a CSRoot and of any renditions it has.

This portion of the foundation represents the CSDocs response to the following questions:  

Current Position Statement

Here is the current position on renditions of CSRoot DocVersions.  It is reflected in the initial CSDocs Foundation proposal.  Further discussion will be held to establish the final position included in the CSDocs Foundation proposal.

  1. The renditions of a CSRoot document, as for any DocVersion, are at the same level of abstraction.  Either all of the renditions are for the overall compound document or none of them are.
  2. The CSRoot Style property is necessary to specify when the CSRoot renditions are not complete renditions of the overall document of which the CSRoot is the root.
  3. A CSRoot can be a CSComponent.  It can contribute its elements as components of another CSRoot in the same way as any CSComponent DocVersion (see Specifying Content Dependencies, below).


Unique Identifiability of Persistent DMA Elements

The extensions that CSDocs depends on to accomplish identification of and navigation to dependently-persistent components are of value in all areas of DMA 1.0.  Because of that, it is recommended that a general proposal also be produced.  The related CSDocs extension then specifies precisely how those extensions are exploited for CSDocs.


Current Position Statement

This is a generic extension to DMA 1.0: 

  1. It  adds the optionally-supported dmaProp_InstanceId property to dmaClass_DMA.
  2. dmaProp_InstanceId is an implementation-optional, system-derived, read-only, and value-not-required property.
  3. The value of dmaProp_InstanceId is of type DmaId.
  4. The property is meaningful only on persistent DMA objects.  On a DMA object instance for which there is (presently) no corresponding persistent element, any dmaProp_InstanceId property must have no value.
  5. If the property is supported on a persistent DMA object, it will have a unique value from the time the persistent form is created until that particular persistent form no longer exists.  The value is never reused for another object.
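
A minimal sketch of the lifetime rules in items (2), (4), and (5), offered as an illustration and not as a DMA binding.  The class `DmaObject` and its methods are hypothetical, and a random UUID is assumed here as a stand-in for a DmaId value.

```python
import uuid


class DmaObject:
    """Hypothetical stand-in for a DMA object that supports dmaProp_InstanceId."""

    def __init__(self):
        # No persistent form yet, so the property has no value.
        self._instance_id = None

    @property
    def instance_id(self):
        # Read-only to callers: there is deliberately no setter.
        return self._instance_id

    def make_persistent(self):
        # System-derived: assigned once, when the persistent form is created.
        # (A random UUID is an assumption standing in for a DmaId.)
        if self._instance_id is None:
            self._instance_id = uuid.uuid4()

    def delete_persistent(self):
        # The persistent form is gone; the old value is not handed out again.
        self._instance_id = None


obj = DmaObject()
print(obj.instance_id)      # None
obj.make_persistent()
first = obj.instance_id
obj.make_persistent()       # already persistent: the value does not change
print(obj.instance_id == first)
```

An attempt to assign `obj.instance_id` raises `AttributeError`, which is how the sketch models the read-only rule.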

The CSDocs subcommittee is currently discussing position elements (1) and (3). These are discussed and explored further in separate position statements on the InstanceId.

This property is a valuable adjunct in query and in the unique identification of elements of independently-persistent objects.  The property permits globally-unique, unambiguous identification regardless of the context of an element and does not depend on applications to provide supplemental properties or introduce other practices to allow elements to be uniquely identified. 

This property is also valuable in the property list of an independently-persistent object.  The dmaProp_OIID property is a string and it includes location information and a DocSpace-specific object identifier (the ObjectId field) as part of that URL-format string.  When an independently-persistent object has an InstanceId, it is appropriate for this to be encoded in the OIID as the value of either the IdmaOIID::GetObjectIdText method or the IdmaOIID::GetGUID method.

The current methods for accessing objects by their dmaProp_OIID property values are confined to independently-persistent objects.  It would be inconsistent and disruptive to have dependently-persistent elements suddenly be returned by such an operation performed by a DMA 1.0 client.  To allow direct access to a dependently-persistent element that is uniquely identifiable requires further extension to the DMA 1.0 model. 


Extending Navigation to Identify Specific Elements

DMA relationship Head and Tail properties can only have independently-persistent objects as their values.  In general, all navigation from one independent object to another is to an independently-persistent object and not a dependently-persistent element of any object.  For compound documents and other applications, it is useful to supplement navigational properties to identify specific elements as the object of navigation.

The idea is that the property for navigating to the independently-persistent object be supplemented by additional properties to make an "extended reference" that provides a unique path to a desired internal element.  The scheme involves general principles requiring that an unambiguous path to the element be determined.

In many cases, as in CSDocs, a specific path is employed.  If a generic path to an arbitrary element of a document is needed, it can be satisfied by a dmaClass_ListOfId valued property that has a sequence of alternating property identifications and dmaProp_InstanceId values: propid[1], instid[1], propid[2], instid[2], ..., propid[n], instid[n].  (This "element path" could even thread through more than one object to its target.)
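
The generic element path can be illustrated with a short Python sketch.  Objects are modeled here as plain dictionaries with an `instance_id` and a `props` mapping of list-valued properties; that representation, the function name, and the sample identifiers are all assumptions made for the example.

```python
def resolve_element_path(start, path):
    """Follow alternating (property id, instance id) pairs from start to the target."""
    if len(path) % 2 != 0:
        raise ValueError("element path must alternate property ids and instance ids")
    current = start
    # Pair up propid[k] with instid[k].
    for prop_id, inst_id in zip(path[0::2], path[1::2]):
        candidates = current["props"][prop_id]  # a list-valued property
        matches = [e for e in candidates if e["instance_id"] == inst_id]
        if len(matches) != 1:
            raise LookupError(f"no unique element {inst_id!r} under {prop_id!r}")
        current = matches[0]
    return current


content = {"instance_id": "ce-7", "props": {}}
rendition = {"instance_id": "r-2", "props": {"ContentElements": [content]}}
doc = {"instance_id": "d-1", "props": {"Renditions": [rendition]}}

target = resolve_element_path(doc, ["Renditions", "r-2", "ContentElements", "ce-7"])
print(target["instance_id"])  # ce-7
```

Requiring exactly one match at each step is how the sketch enforces the "unambiguous path" principle.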


Specifying the Content Dependencies of Compound/Structured Documents

The next level of specialization extends CSRelationship subclasses to provide more information about the way in which a root document depends on its component documents.

The capabilities for specifying dependently-persistent components do not alter the fundamental characteristics of the DMA relationship object model.  Relationships are always between independently-persistent objects.  This extension allows the relationship to also identify the specific elements that participate in the relationship when it is meaningful to do so.

The extension has two parts:

  1. Elements of CSDocs DocVersions have optional properties that give them a unique identification.  This allows another object to carry DmaId-valued properties for locating a specific, unique element within a CSRoot or CSComponent object.
  2. Additional properties are provided on a CSRelationship to qualify the dmaProp_Head and dmaProp_Tail.  These are used to determine which elements of the CSRoot and CSComponent are involved in the particular compound dependency relationship that a CSRelationship object represents.


The specialization of CSRelationship to have accurate identification of elements is as follows.

The rules of interpretation for these properties are as follows:

The additional properties dmaProp_TailRenditionId and dmaProp_TailContentElementId are introduced and used in the same way to provide refined identification of that part of a CSRoot that depends on the component identified in the particular relationship object.

[It is not expected that every case of dependency is meaningful. That is, having a content element at the tail and a rendition at the head might not make sense.  On the other hand, the scheme is perfectly general and users and systems can implement the cases that matter for the CSDocs structures being used.  The CSDocs model does not say what the CSRelationship dependency is, it says that there is one.  Subclassing and additional edge data can be used to deal with specialized cases to the degree that is essential to a particular kind of CSDocs structure. -- dh:99-05-25]


Direct Navigation to CSDocs Elements

With the CSDocs provisions for identifying specific elements, the next level of extensions involves direct navigation to an element as the value of an object-valued property.  That is, a navigational property can be a direct "short-cut" that allows direct navigation to the target element.

The ordinary way to navigate from some object to an identified element is as follows:

  1. Traverse through the object-valued property (e.g., dmaProp_Head) that locates the independently-persistent object having the desired element.
  2. Use the path of dmaProp_InstanceId property values in any supplemental properties to navigate to the precise element.

For example, to find a CSComponent element by traversal from a CSRelationship object, the operations are as follows:

  1. Get the CSComponent DocVersion by traversing the dmaProp_Head property of the CSRelationship object.
  2. If the CSRelationship object has a dmaProp_HeadRenditionId property with a value, traverse to the CSComponent DocVersion's dmaProp_Renditions list-valued property.   Search the list and stop on the Rendition object whose dmaProp_InstanceId property value matches that given by the dmaProp_HeadRenditionId property value.
  3. If the CSRelationship object also has a dmaProp_HeadContentElementId property with a value, use the Rendition object found in step 2 and traverse to its dmaProp_ContentElements list-valued property.  Walk through the list and stop on the Content Element whose dmaProp_InstanceId property matches that given by the dmaProp_HeadContentElementId property value.
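
The three traversal steps above can be sketched as follows.  This is an illustrative Python model, not DMA API code: the classes are hypothetical, and their attributes stand in for dmaProp_Head, dmaProp_HeadRenditionId, dmaProp_HeadContentElementId, dmaProp_Renditions, dmaProp_ContentElements, and dmaProp_InstanceId.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ContentElement:
    instance_id: str


@dataclass
class Rendition:
    instance_id: str
    content_elements: list = field(default_factory=list)


@dataclass
class DocVersion:
    renditions: list = field(default_factory=list)


@dataclass
class CSRelationship:
    head: DocVersion
    head_rendition_id: Optional[str] = None
    head_content_element_id: Optional[str] = None


def _find(items, instance_id):
    """Return the item whose instance_id matches, or fail loudly."""
    for item in items:
        if item.instance_id == instance_id:
            return item
    raise LookupError(f"no element with InstanceId {instance_id!r}")


def find_head_element(rel):
    target = rel.head                                    # step 1
    if rel.head_rendition_id is None:
        return target                                    # the DocVersion itself
    target = _find(rel.head.renditions, rel.head_rendition_id)          # step 2
    if rel.head_content_element_id is None:
        return target
    return _find(target.content_elements, rel.head_content_element_id)  # step 3


ce = ContentElement("ce-7")
rend = Rendition("r-2", [ContentElement("ce-1"), ce])
component = DocVersion([Rendition("r-1"), rend])
rel = CSRelationship(component, head_rendition_id="r-2", head_content_element_id="ce-7")
print(find_head_element(rel).instance_id)  # ce-7
```

When neither identifier has a value, the function returns the CSComponent DocVersion itself, which matches the case where the whole component is the target.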

Direct navigation has the exact same semantics.   The necessary InstanceId values must be provided in the CSRelationship object.  There are also optional direct-navigation properties, the dmaProp_HeadElement object-valued property and the dmaProp_TailElement object-valued property.

Direct-navigation properties have no reflective property.   The object-valued property in (1) and the individual path property elements in (2) may also be made system derived and read-only if direct navigation is the required way to set up the path to the element.

A direct navigation is created by inserting the target element into the element property using IdmaEditProperties::PutPropValObjectBy... in the usual way.  When the CSRelationship object is made persistent, the proper values are "snapped" into the object-valued property for the independently-persistent object having the element and for the InstanceId-valued properties that represent the path to the element.

When an object having such a reference is used, the identifying elements are enough to allow the element to be found by the traversal method (1-2) above.  Alternatively, the dmaProp_...Element property can be traversed directly and an element will be provided by direct navigation as if the traversal had been performed silently and only the ultimate element then returned.  The result is exactly the same.
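
The "snapping" behavior can be pictured with a small Python sketch.  Everything here is hypothetical: the `Element` class with its `parent` link, the `kind` labels, and the `make_persistent` method are assumptions used to show how the independently-persistent object and the InstanceId path might be derived from a directly supplied element.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    kind: str                       # "docversion", "rendition", or "content_element"
    instance_id: str
    parent: Optional["Element"] = None


@dataclass
class CSRelationship:
    head_element: Element                       # models dmaProp_HeadElement
    head: Optional[Element] = None              # models dmaProp_Head (derived)
    head_rendition_id: Optional[str] = None     # models dmaProp_HeadRenditionId (derived)
    head_content_element_id: Optional[str] = None  # models dmaProp_HeadContentElementId (derived)

    def make_persistent(self):
        # Walk outward from the target element and snap in the derived values.
        node = self.head_element
        while node is not None:
            if node.kind == "content_element":
                self.head_content_element_id = node.instance_id
            elif node.kind == "rendition":
                self.head_rendition_id = node.instance_id
            else:
                self.head = node
            node = node.parent


doc = Element("docversion", "d-1")
rend = Element("rendition", "r-2", parent=doc)
ce = Element("content_element", "ce-7", parent=rend)

rel = CSRelationship(head_element=ce)
rel.make_persistent()
print(rel.head.instance_id, rel.head_rendition_id, rel.head_content_element_id)
# d-1 r-2 ce-7
```

If the DocVersion itself is supplied as the element, the derived head is that same instance and both identifier properties stay empty.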

[Side note: When direct navigation is supported, the properties for navigation to the independently-persistent object (e.g., dmaProp_Head) and for direct navigation to the element (e.g., dmaProp_HeadElement), will appear to be bound simultaneously, including the path from the independently-persistent object to the intended element. In the case when it is the CSComponent DocVersion that is the target (with no rendition or content element identified), the two navigations are the same and bound to the same instance. --dh:99-06-02]

To introduce direct navigation in the CSRelationship model, the basic extension involves adding optionally-implemented properties dmaProp_HeadElement and dmaProp_TailElement. 

Any CSRelationship subclass that implements dmaProp_HeadElement will usually have dmaProp_Head, dmaProp_HeadRenditionId and (if implemented) dmaProp_HeadContentElementId as system-derived and read-only.  Any CSRelationship subclass that implements dmaProp_TailElement will have dmaProp_Tail, dmaProp_TailRenditionId and (if implemented) dmaProp_TailContentElementId as system-derived and read-only.

Further streamlining of operation is supported in query if the dmaProp_HeadElement and dmaProp_TailElement properties are selectable and are usable in query expressions (e.g., are properties usable in join relationships).


Virtual Elements

Virtual elements are an extension to DMA for having elements that are derived from other, shared elements.  The relationship between the virtual element and the shared element on which it is based is established with dmaClass_Relationship subclasses.  Access to the virtual element automatically yields the information of the shared element, as if the shared element were accessed instead. 

Virtual elements have many valuable applications independent of CSDocs.  They provide for sharing of the same content material among different DocVersions, including between versions of the same or different document.

DMA Extensions for Virtual Elements

In addition to the use of the unique identification model and related extensions for identification and navigation of elements, a new optional interface is added to dmaClass_DMA (to have it available above all persistent elements that might be virtualized).

This interface, IdmaVirtualElement, works like IdmaVersionable.  The method IdmaVirtualElement::SetDerivation takes a dmaClass_Relationship subclass object as its operand, and it establishes the head of that relationship as the source for the implementation of the virtually-derived element.  The current element (the one with the IdmaVirtualElement interface) is made a virtual element when it is successfully made persistent.

A simple kind of virtual derivation is when the derivation provides the head object as the literal implementation of the virtual element with at-most trivial embellishments.
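
The simple kind of derivation can be sketched as below.  This is a hedged Python illustration, not the IdmaVirtualElement interface: `set_derivation` stands in for IdmaVirtualElement::SetDerivation, and the `Relationship` and `Element` classes, along with their `content` attribute, are hypothetical.

```python
class Relationship:
    """Hypothetical stand-in for a dmaClass_Relationship subclass object."""

    def __init__(self, head):
        self.head = head  # the shared element that supplies the content


class Element:
    """Hypothetical element that may be made virtual."""

    def __init__(self, content=None):
        self._content = content
        self._derivation = None
        self._persistent = False

    def set_derivation(self, relationship):
        # Models IdmaVirtualElement::SetDerivation: record the source relationship.
        self._derivation = relationship

    def make_persistent(self):
        self._persistent = True

    @property
    def is_virtual(self):
        # The element becomes virtual only once it is successfully made persistent.
        return self._persistent and self._derivation is not None

    @property
    def content(self):
        # Access to a virtual element yields the shared element's information.
        if self.is_virtual:
            return self._derivation.head.content
        return self._content


shared = Element("standard warranty clause")
shared.make_persistent()

virtual = Element()
virtual.set_derivation(Relationship(head=shared))
print(virtual.content)   # None (not yet persistent, so not yet virtual)
virtual.make_persistent()
print(virtual.content)   # standard warranty clause
```

Because the virtual element reads through to the head on every access, a later change to the shared element is visible through every element derived from it.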

CSDocs Virtually-Derived Components

A CSRelationship can be a virtual-derivation relationship.   All that is required is for the appropriate elements of the CSRoot to offer IdmaVirtualElement interfaces and for there to be a CSRelationship subclass that is compatible for use in holding the virtual-derivation relationship.


Miscellaneous Topics

There are a number of additional extensions that are important for advanced CSDocs features and for the satisfaction of all of the agreed CSDocs requirements:

Access to DMA Elements by URL

There is also a requirement to provide for direct access to elements within DMA objects by URL.  This becomes possible as a consequence of unique identifiability, the provision of paths to components, and the ability to navigate directly to elements.

There is also supplemental machinery required in order to establish an appropriate URL protocol (part of dma://) or protocols (part of http:// as well, etc.).


Metadata for Description of Policy Mechanisms

The exploration of CSDocs has exposed a number of places where it is important to explicitly add policy information about DMA classes and properties. 

For example, it is important to be able to determine whether an object-valued property requires an independently-persistent object for its value or whether it treats all supplied values as sources for creation of a dependently-persistent value.  There are other places where supplemental description is needed to support users in understanding the conditions to be satisfied when creating a new object of some class, when supplying values for properties, and so on.

This extension involves addition of a new metadata subclass for identified and registerable policies, using aliasable metadata identifications.  The properties of the object employ at least the standard display name and description capabilities to provide human usable explanatory material and supplemental documentation.

The rules for these properties in creation of merged scopes are the same as if these properties were unknown user-defined extensions of metadata.

Figure 2. Introduction of Constraint Metadata Class

Constraint metadata are added as the value of dmaProp_Constraints list-of-object properties of class and property descriptions.  Typical usage of constraints is to specify a policy that is enforced on the usage of a class or on the value of a property.  For example,

There will be a number of well-known constraints that are registered along with the introduction of constraint metadata.


Specifying Referential-Integrity Strength

The variety of ways that CSDocs structures are used increases the importance of addressing ways to control the strength of the CSRelationship and how insertion, replacement, and deletion of the CSRelationship, of the CSRoot, and of the CSComponent are handled. 

This is a valuable place for policy metadata also, when the DocSpace has a fixed, non-specifiable policy concerning insertion, deletion, and replacement of CSRelationship objects and the involved CSRoot and CSComponent.

In addition to being able to tell what the insertion, deletion, and replacement/update policies are for a CSRelationship object, it is desirable to provide ways for a requester to specify what the rules are to be for a given CSRelationship.  This might be through properties specified on the CSRelationship or by some other means, such as offering different subclasses for the different supported constraints.


Following New Versions of Components

There is a general requirement in compound/structured documents, and in other use of versioned objects (e.g., container elements), to have a form of relationship that follows objects in a changing version series.  A Head or Tail property will automatically leap to a new current version of the Version Series being tracked, for example. 

This mechanism allows an application to publish or manage a compound document that is simply the latest view formed with all of the latest versions of the components.   This is expected to be in addition to the simple form of CSDocs versions that always have specific components for a specific root, regardless of how any of them occur as versions under one or more configuration histories.
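
The difference between the two forms can be shown with a short Python sketch.  The class names `VersionSeries`, `PinnedRelationship`, and `FollowingRelationship` are hypothetical and are not part of DMA 1.0; the sketch only assumes that a version series has an identifiable current version.

```python
class VersionSeries:
    """Hypothetical version series; the last checked-in version is current."""

    def __init__(self):
        self.versions = []

    def check_in(self, doc_version):
        self.versions.append(doc_version)

    @property
    def current(self):
        return self.versions[-1]


class PinnedRelationship:
    """Simple form: the head is one specific version, fixed at creation."""

    def __init__(self, tail, head):
        self.tail = tail
        self.head = head


class FollowingRelationship:
    """Following form: the head is resolved against the series on each access."""

    def __init__(self, tail, head_series):
        self.tail = tail
        self.head_series = head_series

    @property
    def head(self):
        return self.head_series.current


series = VersionSeries()
series.check_in("chapter v1")

pinned = PinnedRelationship("book v1", series.current)
following = FollowingRelationship("book v1", series)

series.check_in("chapter v2")
print(pinned.head)     # chapter v1
print(following.head)  # chapter v2
```

A compound document assembled through following relationships is the "latest view"; one assembled through pinned relationships always has the same specific components.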



created 1999-04-30-14:09 -0700 (pdt) by Dennis E. Hamilton
$$Author: Orcmid $
$$Date: 01-05-03 17:59 $
$$Revision: 27 $