dm_notes: Documentum Notes

June 26, 2008

Role of UCF in Documentum Clients

Filed under: dfc, notes, ucf, wdk, webtop — Tags: — Raj V @ 10:32 pm

In the early days, all Documentum clients were thick clients (either DFC based or DMCL based), which meant a 2-tier architecture. Every machine that accessed the Documentum Repository needed either DFC or a DFC based client installed, or alternatively had to use the legacy DMCL library based tools (IAPI, IDQL).

With the advent of WDK (a web based repository access manager), the thick client is no longer required on individual clients. This was made possible by moving the Documentum Repository client layer (DFC) to the middle tier, that is, by going from the traditional Client/Server 2-tier architecture to a 3-tier architecture.

Moving DFC to the middle tier will enable the Application server to access the Repository. But how can the end client access the content in the Repository locally, where there is no footprint of Documentum?

When you perform content management operations, the content is retrieved by DFC on behalf of WDK from the Content Server and is transferred to the Application server where DFC resides. But how do we transport the content from the App Server to the end client?

To answer this, Documentum came up with an HTTP based content transfer program that runs within the context of the client browser.
This program is a Java applet that transfers content from the App Server to the client (outbound operations) and vice versa (inbound operations).
But because of the applet's limitations in processing complex document structures such as XML links and OLE links, this transport mechanism was limited to basic content management functionality.

These limitations led to a new, more robust and extensible transport mechanism called UCF (Unified Client Facilities).

We can enable UCF in WDK based applications (5.3 or later) through a configuration parameter.

How can you identify which transport facility (http or ucf) is used in your WDK based application?

It is defined in the app.xml file of your application (default entry: wdk\app.xml).
The config element below defines the mode:

<contentxfer>
<default-mechanism>ucf</default-mechanism>

</contentxfer>
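If you want to check the mode from a script instead of opening the file by hand, something like the sketch below could work. It uses only the standard Java XML parser; the class name ContentXferMode and its method are made up for this note and are not part of WDK, and it assumes the element names shown above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not a WDK class): reads the transfer mode from app.xml-style content.
final class ContentXferMode {

    // Returns the trimmed text of <default-mechanism> inside <contentxfer>, or null if absent.
    static String defaultMechanism(String appXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(appXml.getBytes(StandardCharsets.UTF_8)));
            NodeList xfers = doc.getElementsByTagName("contentxfer");
            for (int i = 0; i < xfers.getLength(); i++) {
                NodeList mechanisms = ((Element) xfers.item(i))
                        .getElementsByTagName("default-mechanism");
                if (mechanisms.getLength() > 0) {
                    return mechanisms.item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            // Malformed XML or parser setup problems end up here.
            throw new IllegalArgumentException("Could not parse app.xml content", e);
        }
    }

    public static void main(String[] args) {
        String sample = "<contentxfer><default-mechanism>ucf</default-mechanism></contentxfer>";
        System.out.println("Transfer mode: " + defaultMechanism(sample));
    }
}
```

The lookup goes by element name, so it does not matter how deeply the contentxfer element is nested in your own app.xml.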

What is UCF?

At a very high level, UCF is composed of two components: the UCF Server and the UCF Client.

The UCF Server plays two different roles. It presents itself as an end client to the DFC layer that communicates with the Repository, and it presents itself as a server to the UCF Client (the end client).

The broader communication channel is as below:

Content Server <--> DFC <--> WDK/UCF Server <--> UCF Client

What are the benefits of UCF over the HTTP transfer mode, and why is it being made the de facto standard in Documentum based applications (from D6)?

  • Performance and Throughput
    • There is a perception that the earlier HTTP based transport is faster than the UCF transport. In general, standard HTTP based uploads perform better than HTTP based downloads. I am not sure why. Maybe the server is busier responding to clients, since most users are reading from the application and only a few are writing back to the server (uploading). (Just a thought)
    • I do agree that there have been a few performance issues, but there have been a lot of improvements over time and it has improved considerably lately.
    • UCF provides client information to the server as and when required, accesses the registry, optimizes the content transfer, and so on. Compared to the applet based transport, there is a delay in launching and initializing the UCF client. This delay comes from the JVM startup, the UCF client making its initial connection to the App Server, and protocol negotiation with the server.
  • Extensibility
    • UCF is extensible. You can add your own requests (server) and handlers (client), plug them in to the UCF infrastructure, and have it perform your custom tasks.
  • Recovery
    • UCF supports recovery. If the socket connection breaks while you are exporting or viewing a content file, UCF retries the operation from where it left off and attempts to complete it.
  • DFC based analysis for the client
    • Because UCF has only a small footprint on the client, it sends the content file to the server. The server then analyzes the content and opens a 2-way communication channel with the client, which lets the server and client perform the content analysis together.
  • Client information available on the server (for DFC and WDK components)
    • The UCF client makes all the required client information available at the server, so that the UCF Server can present itself as a client.

None of the above benefits was available with the HTTP based transport implementation, or they were available only with limited support.
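To picture the recovery point, here is a generic resume-from-offset loop in plain Java. This is only an illustration of the idea and not UCF's actual protocol or API; the ResumableCopy class and its ChunkSource interface are invented for this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: a generic resume-from-offset loop, not UCF's actual wire protocol.
final class ResumableCopy {

    // Hypothetical source of bytes that can be read starting at any offset.
    interface ChunkSource {
        // Returns up to maxLen bytes starting at offset; an empty array means end of content.
        byte[] read(long offset, int maxLen) throws IOException;
    }

    // Copies everything from source into out, asking again from the last good offset after a failure.
    static void copy(ChunkSource source, ByteArrayOutputStream out, int chunkSize, int maxRetries) {
        int failures = 0;
        while (true) {
            try {
                // out.size() is the number of bytes already received, i.e. the resume offset.
                byte[] chunk = source.read(out.size(), chunkSize);
                if (chunk.length == 0) {
                    return;
                }
                out.write(chunk, 0, chunk.length);
                failures = 0; // progress was made, so reset the retry budget
            } catch (IOException e) {
                if (++failures > maxRetries) {
                    throw new UncheckedIOException(e);
                }
                // Otherwise loop again; the next read starts from out.size().
            }
        }
    }
}
```

The point is that nothing already transferred is fetched again: the bytes received so far define where the retry starts.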

June 19, 2008

Query to find list of objects in folder along with its Folder Path

Filed under: dfc, documentum, dql — Tags: , , — Raj V @ 3:42 pm

Occasionally we need to find all the objects in a folder and also retrieve their exact folder path within the same query.

This can be achieved easily through a DFC program. But it's a little tricky when you want to do it through a DQL query, because dm_sysobject stores only the folder id (i_folder_id) of the object instead of the folder path.

The folder path is held in the dm_folder object as a repeating attribute (r_folder_path), so we need to query dm_folder for r_folder_path. The issue in DQL is that you can’t select repeating attributes when you join multiple types; you will hit DM_QUERY2_E_REPEAT_TYPE_JOIN if you do so.

Let's see what DFC can do and how to approach the same with DQL.

DFC code snippet looks as below:

IDfFolder folder = (IDfFolder) session.getObjectByPath(<<folderpath>> );
if (folder != null) {
     List<String> docs = new ArrayList<String>();
     getContents(session, folder, docs);
     System.out.println("Total Number of Files : " + docs.size());
}
 ....

// Walks the folder recursively and adds one tab-separated line per non-folder object to docs.
private void getContents(IDfSession session, IDfFolder folder, List<String> docs) throws DfException {
   // get all the r_object_id
   IDfCollection collection = folder.getContents("r_object_id");
   if (collection == null) {
      return;
   }
   try {
      while (collection.next()) {
         String objectId = collection.getString("r_object_id");
         IDfSysObject object = (IDfSysObject) session.getObject(new DfId(objectId));
         if (object.getTypeName().equals("dm_folder") || object.getType().isSubTypeOf("dm_folder")) {
            // sub-folder: descend into it
            getContents(session, (IDfFolder) object, docs);
         } else {
            // i_folder_id is a repeating attribute; index 0 is the first folder the object is linked to
            IDfFolder folderObj = (IDfFolder) session.getObjectByQualification(
                  "dm_folder where r_object_id = '" + object.getRepeatingString("i_folder_id", 0) + "'");
            if (folderObj != null) {
               // r_folder_path is repeating as well; take its first value
               docs.add(object.getObjectName() + "\t" + object.getOwnerName() + "\t"
                     + folderObj.getRepeatingString("r_folder_path", 0) + "\t" + object.getModifyDate());
            }
         }
      }
   } finally {
      // always release the collection, even if an exception is thrown
      collection.close();
   }
}

In DQL the same can be achieved as below:

DQL> select A.r_object_id, A.object_name, B.r_folder_path from dm_document A, dm_folder_r B where any A.i_folder_id = B.r_object_id and B.r_folder_path like '%/System/%';

What we are doing here is joining the dm_document type, through its repeating attribute ‘i_folder_id’, to dm_folder_r, the underlying table that stores the repeating attribute values of dm_folder. This way r_folder_path is read as an ordinary table column rather than selected as a repeating attribute of a type. If we had queried the dm_folder type (instead of dm_folder_r), we would have hit the DQL restriction and the DM_QUERY2_E_REPEAT_TYPE_JOIN error. Querying the underlying table lets us get past the DQL translator for the _r table (much like the registered table concept).
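If you need the same query for different folder paths, a small helper can build the string for you. The sketch below is plain Java with no DFC dependency; FolderPathQuery is a made-up name, and it assumes the usual DQL convention of doubling a single quote inside a string literal.

```java
// Hypothetical helper: builds the folder-path join query shown above for a given LIKE pattern.
final class FolderPathQuery {

    // Doubles single quotes so the pattern cannot end the DQL string literal early.
    static String escape(String literal) {
        return literal.replace("'", "''");
    }

    // Returns the join of dm_document and dm_folder_r restricted to the given path pattern.
    static String build(String pathPattern) {
        return "select A.r_object_id, A.object_name, B.r_folder_path"
                + " from dm_document A, dm_folder_r B"
                + " where any A.i_folder_id = B.r_object_id"
                + " and B.r_folder_path like '" + escape(pathPattern) + "'";
    }

    public static void main(String[] args) {
        System.out.println(build("%/System/%"));
    }
}
```

The resulting string can then be handed to whatever you use to run DQL (IDQL, or an IDfQuery in DFC).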

October 16, 2007

Object Fetch Vs DQL Fetch Performance

Filed under: dfc, notes — Tags: , , — Raj V @ 7:22 am

An object fetch call retrieves all the attribute information of the object from the server; this information is then cached in the client side DMCL cache.

A DQL query will only retrieve the attributes specified in the “select” statement of the query.

A dm_document object has 70+ attributes. If you are only interested in a few attributes of an object, you should use a DQL statement to avoid retrieving unnecessary information. This becomes significant especially in a low bandwidth environment.

Object fetch should be used when most attributes of an object are wanted, and/or when that attribute information is needed repeatedly in multiple places.

Note that a DFC IDfSession.getObject() call is effectively a fetch call. You should avoid creating an IDfSysObject with session.getObject() just to look at a couple of attributes of the object; instead, use query.execute(session, IDfQuery.DF_READ_QUERY) and specify the attributes of interest in the select statement of the query.

October 15, 2007

How to find if a document is a Virtual Document?

Filed under: dfc, notes, vdm, virtual document — Tags: , , , — Raj V @ 1:01 pm

A document is a virtual document if its “r_is_virtual_doc” attribute is set to “1”.
If the value is “0”, it is not a virtual document unless the “r_link_cnt” attribute of the object is greater than “0” and the object is not of type “dm_folder” or “dm_cabinet”.

If r_link_cnt is > 0, the object is either a virtual document or a folder/cabinet.
For a virtual document, “r_link_cnt” will never have a value of 1, because the parent of the virtual document is also considered a child. So if you add one component, r_link_cnt=2. If you add a second component, r_link_cnt=3, and so on. The “r_link_cnt” attribute also applies to dm_folder/dm_cabinet, where it indicates how many objects are linked to the folder.
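The rule above boils down to a small check. Here it is as a plain Java sketch that takes the attribute values as arguments, so it needs no DFC session. VirtualDocCheck is a made-up name, and the sketch assumes only the two exact type names given above (subtypes of dm_folder are not handled).

```java
// Hypothetical helper: applies the r_is_virtual_doc / r_link_cnt rule to already-fetched values.
final class VirtualDocCheck {

    // rIsVirtualDoc: true when r_is_virtual_doc is 1
    // rLinkCnt:      value of r_link_cnt
    // typeName:      value of r_object_type, e.g. "dm_document"
    static boolean isVirtualDocument(boolean rIsVirtualDoc, int rLinkCnt, String typeName) {
        if (rIsVirtualDoc) {
            return true;
        }
        // For folders and cabinets r_link_cnt counts linked objects, not components.
        boolean container = "dm_folder".equals(typeName) || "dm_cabinet".equals(typeName);
        return rLinkCnt > 0 && !container;
    }

    public static void main(String[] args) {
        System.out.println(isVirtualDocument(false, 2, "dm_document"));
        System.out.println(isVirtualDocument(false, 5, "dm_folder"));
    }
}
```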

Virtual Documents in detail

Filed under: dfc, vdm, virtual document — Tags: , , — Raj V @ 9:49 am

I came across a good explanation of Virtual Documents in the EMC Documentum Forums.

Excerpt from the forum (copied as is)
——————————

Inherently Virtual documents are complex. Assemblies add a little to that but here goes as simple as possible:

From the server object perspective, the dmr_containment object is the key to understanding Virtual Documents
dm_sysobjects
+ r_is_virtual_doc- A dm_sysobject in the Docbase becomes a Virtual Document if r_is_virtual_doc is set to 1.
+ r_link_cnt- Defines the number of components of the Virtual Document.
+ resolution_label- Defines the the version label for late-bound nodes of the Virtual Document. We will discuss this shortly.

dmr_containment objects
+ A dmr_containment object defines the link between the parent and child nodes.
+ parent_id- The Object ID of the parent node.
+ component_id- This is the chronicle_id, or original version id, of the child node.
+ version_label- Version label for the child component. This overrides parent’s resolution_label.
+ use_node_vers_label- If set to TRUE for early-bound components, the server uses the early-bound symbolic label to resolve late-bound descendants of the component during assembly.
+ follow_assembly- If set to TRUE, directs the system to resolve a component using the component’s Assembly (if the component has an Assembly).
Note that only the root node and the dmr_containment objects have attributes related to Virtual Documents. Because of this, the dm_sysobject can be the child of “thousands” of Virtual Documents without the need for setting “thousands” of attribute values for the object. The connection between the parent and child is maintained in the dmr_containment object.

When a parent node is versioned, new containment objects are created between the existing child node documents and the new parent version.
When a child node is versioned, no containment object is created. This is because the containment object links the version of the parent to “any” version of the child node. (Usually, the “CURRENT” version is requested but other options are available as will be described in the coming Binding section). This is accomplished by the containment object referring to the “chronicle_id” of the child node. The chronicle_id is the r_object_id of the first, original version of that SysObject.
The parent “edge” object contains the link information that relates the parent to “any” version of the child.

To DFC programmers, an “edge” is referenced using an IDfContainment or IDfAssembly reference.

Assembly
+ An Assembly is a “snapshot” of a version of a Virtual Document at the time it was assembled
+ Assembly Objects= dm_assembly on the server
+ An Assembly is a record or “snapshot” of a version of a Virtual Document as it existed at the moment of assembling versions of its parts. Each Assembly object (dm_assembly) contains static binding information between a specific version of a parent node and a specific version of a child node of a Virtual Document. Assemblies will be discussed in greater detail at the end of this module

There are two wrapper interfaces in the fc.client package that publicly expose the operations required to access the services of the vdm package. These two Virtual Document wrapper interfaces are:
+ IDfVirtualDocument – provides a tree representation of a Virtual Document. A Virtual Document tree is comprised of a collection of nodes. Each node object corresponds to a sub-component of a Virtual Document. The IDfVirtualDocument object tracks node changes and manages tree node dynamics as changes are made to the Virtual Document graph. (A Virtual Document is modeled as an “acyclic directed graph”. An object cannot directly or indirectly have itself as a child).
+ IDfVirtualDocumentNode – Represents a single node in a Virtual Document graph.

IDfAssembly – This class provides the functionality for the client to interact with dm_assembly objects in the Docbase.
IDfContainment – Represents a dmr_containment object.
IDfSysObject – This class provides the functionality for the client to interact with dm_sysobject objects in the Docbase. Use the asVirtualDocument() method to turn a dm_sysobject into a Virtual Document.

Fetching a Virtual Document using asVirtualDocument( “preferredVersion”,t/f), causes the system to populate a node tree corresponding to the binding criteria encountered when fetching the descendants of the Virtual Document. This procedure is called, “assembling” a Virtual Document. Until now we have avoided using that term to prevent confusion with this topic. “Assembling” is the act of populating a node tree according to the 6 binding factors mentioned earlier. The purpose is to obtain a “version” of the Virtual Document. The node tree is a “versioned” node tree.
Assemblies allow you to “record” the collection of ObjectIDs from a versioned node tree. You would want to do this if:
A) You would like a “shortcut” to a version you reproduce often.
B) You need to lock certain child object versions from being deleted in order to guarantee the ability to reproduce the same version of the child nodes in the future.
Note: Although the current version of DFC does not support creating assemblies of XML documents, you may find them useful to create a Virtual Document from various Docbase objects.
Tip: An alternative approach to creating assemblies is to use version labels (like, “Assembly 1”) and binding to retrieve specific past versions of the graph. This hint may be useful until DFC supports assemblies of XML documents.

Assembly Objects
An Assembly, therefore, is a recording (or “snapshot”) of a version of a Virtual Document. It is comprised of a collection of Assembly object elements. Each Assembly object in the Assembly collection references one version of one component (sysObject) of the Virtual Document. For each node in the Assembly collection, an “Assembly object” exists in the Docbase server. An Assembly object is actually an instance of “dm_assembly” and contains the absolute ObjectID of the version of the sysObject referenced by that node in the versioned Virtual Document.
Each “dm_assembly” object has fields that are used when fetching the Assembly collection in the future. All Assembly objects of the same Assembly have the same value for their “book_id” field. The book_id is the objectID of a unique sysObject chosen to act as a “handle” for that snapshot.
You can associate the collection Assembly objects with a single sysObject by calling:

Assembly Document
The Assembly collection is conceptually accessed from the Docbase server as a single unit using a unique sysObject we term an “Assembly Document”. An Assembly Document is a unique sysObject (specific version of a document) that refers to that “snapshot” (collection of Assembly objects). It is the sysObject passed to the node.assemble() method.
An Assembly Document can only reference one collection of Assembly objects at a time. Therefore, make sure that the sysObject passed to the assemble() method is not already associated with an Assembly or you will lose the previous Assembly.
The Assembly Document (or “book“) can be either the sysObject referenced by the Assembly’s root node or some other sysObject. However, since an Assembly Document can only reference a single Assembly, a separate Assembly Document would be necessary when multiple assemblies from the same Virtual Document are required in order to avoid overwrite.

Re-Assembling
When the Assembly is requested in the future, the system looks for the dm_assembly objects that share the same “book_id” value and fetches the sysObjects referenced by the “component_id” of that dm_assembly object.

IDfSysObject doc   = (IDfSysObject) sess.getObject( new DfId( "09001234567890ab" ) );
IDfSysObject aDoc1 = (IDfSysObject) sess.getObject( new DfId( "0900ba0987654321" ) );
IDfSysObject aDoc2 = (IDfSysObject) sess.getObject( new DfId( "0900ba0987654322" ) );

In this example, three documents are fetched from the Docbase. “doc” is a Virtual Document. aDoc1 and aDoc2 are generic document objects.
The “doc” object refers to a Virtual Document but does not actually populate the node tree until asked to do so.

IDfVirtualDocument vDoc =  doc.asVirtualDocument (  "Released" ,  false  );
IDfVirtualDocumentNode node =  vDoc.getRootNode ();
 node.assemble ( aDoc1 );

Here, the node tree is finally populated using the binding criteria starting with “Released” as the preferred version. As you can see from the table on the left, the different nodes did not all retrieve the “Released” version of each child. This is because of the binding criteria as well as settings in the nodes (such as early binding) may override the preferred version. Once all the child nodes are assembled, that collection of objects is recorded by calling node.assemble( aDoc1 ). This sets a value in the aDoc1 object as a reference to the collection of the bound version of the object at each node.
aDoc1 is the Assembly Document for the “Released” version of Virtual Document “A”.

IDfVirtualDocument vDoc =  doc.asVirtualDocument (  "CURRENT" ,  false  );
IDfVirtualDocumentNode node =  vDoc.getRootNode ();
 node.assemble ( aDoc2 );

Here, another version of the node tree is populated (assembled) using the binding criteria starting with “CURRENT” as the preferred version. As you can see from the table on the right, the different nodes did not all retrieve the “CURRENT” version of each child. This is because of the binding criteria as well as settings in the nodes (such as early binding) may override the preferred version. Once all the child nodes are assembled, that collection of objects is recorded by calling node.assemble( aDoc2 ). This sets a value in the aDoc2 object as a reference to the collection of the bound version of the object at each node.
aDoc2 is the Assembly Document for the “CURRENT” version of Virtual Document “A”.

Sometimes a Virtual Document nests an Assembly within its hierarchy. When fetching a Virtual Document that potentially has sub-assemblies within it, it is often necessary to obtain the sub-assemblies.
When you include an Assembly Document as a node in a Virtual Document, you must decide if the Assembly components should replace that node, and any of its containment components, with the whole Assembly, or to ignore Assembly altogether.
+ The default is that the Assembly components should be ignored and instead the server uses the binding criteria to fetch the components referenced in the dmr_containment objects;
+ The node can optionally override the binding criteria so the server fetches the Assembly components based on the dm_assembly collection instead of the node or any of its virtual children.
To direct the server to use the node’s Assembly components, call node.setFollowAssembly(true).
If the node is not a Virtual Document, or it does not have children, or it does not have an Assembly, this setting on the node is ignored.

Freeze
+ Cannot add or delete components from the Assembly
+ That version of the component will be frozen
– Can’t change contents or attributes
– Assembly Document will be frozen
– Can’t change contents or attributes

Unfreeze
+ May add or delete from the Assembly again
+ That version of the component is unfrozen
– r_frzn_assembly_cnt = 0
– Can change contents and attributes
+ Only works if it is not nested within another frozen Assembly

Disassemble()
To remove the Assembly status from a sysObject and to remove all its Assembly objects, either call sysObject.disassemble() or node.disassemble(). You must have at least VERSION permission for the sysObject in order to delete its Assembly.
Deleting an Assembly will remove all the Assembly objects but does not remove the sysObject components of the Assembly objects.

Note: The ownership of the article lies with the Original Author and is here only as a reference.
The copyright laws applicable to this note are as defined by EMC Forums.

October 10, 2007

Documentum embraces Developers with Eclipse plugins

Filed under: dfc, wdk, webtop — Tags: , — Raj V @ 12:09 pm

Eclipse is the freely available, favorite Java editor for many (my own preference goes to IntelliJ over any other), and there are certainly many people who use Eclipse for Documentum projects. EMC Documentum has embraced the Eclipse community with a few plugins.

Next Step: EMC Documentum is coming up with Documentum Composer, which will enable all sorts of Documentum development activities.

Configuring Eclipse for DFC projects

Filed under: dfc, documentum — Tags: , — Raj V @ 9:05 am

Description

To set up a DFC development environment you will need at a minimum the dfc.jar and dfcbase.jar files. However, we recommend that you reference all the jar files in the folder that contains the dfc.jar file.
On Windows, this folder typically is c:\Program Files\Documentum\Shared.
In addition, you will need any class folders and jars of your custom business objects. Another important component is the folder that contains the dfc.properties and log4j.properties files. This folder needs to be in the classpath too.

Creating a Java project for DFC development in Eclipse 3.1

 

  1. Click on menu item File->New->Project
  2. From the ‘New Project’ window select Java->Java Project and then click on ‘Next’
  3. Enter a project name (example – ‘Simple DFC Project’). For sake of keeping things simple, keep the defaults for other options on this screen. Click on ‘Next’.
  4. Select the ‘Libraries’ Tab and click on the ‘Add External Jars’ button.
  5. In the ‘Jar Selection’ dialog navigate to the folder that contains the dfc.jar file. On Windows, this is typically c:\Program Files\Documentum\Shared. Select all (CTRL+A) the jars in this folder and then click on ‘Open’. (Fig: Add DFC Jars)
  6. Now, click on the ‘Add Class Folder’ button and then click on ‘Create New Folder’. (Fig: Add Class Folder)
  7. In the ‘New Folder’ dialog give some name (e.g. ‘dmConfig’) and click on ‘Advanced’
  8. Select the ‘Link to folder in filesystem’ checkbox and choose the folder that contains the dfc.properties file. On Windows this is typically C:\Documentum\config. (Fig: Link to the folder containing the dfc.properties file)
  9. Click on ‘Ok’ in both the ‘New Folder’ and ‘Add Class Folder’ dialogs
  10. Again for the sake of simplicity, keep the default project settings and click on ‘Finish’.
  11. Eclipse will create a project with the default source and output folders. You can now create Java classes that use DFC in the project. To create a new class right-click on the source folder(or package) and select New->Class. To execute a Java class right-click on the class and select Run As->Java Application. For more details related to the Java development options, please refer to the Eclipse documentation.
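Once the project is created, a quick way to confirm that the linked class folder really is on the classpath is to run a small check class like the one below as a Java Application. It uses only the standard class loader API; DfcClasspathCheck is a made-up name and is not part of DFC.

```java
import java.net.URL;

// Hypothetical check class: reports whether the DFC configuration files are visible on the classpath.
final class DfcClasspathCheck {

    // Returns the location of the named resource on the classpath, or null if it cannot be found.
    static URL locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return loader.getResource(resourceName);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"dfc.properties", "log4j.properties"}) {
            URL url = locate(name);
            System.out.println(name + " -> " + (url == null ? "NOT on classpath" : url.toString()));
        }
    }
}
```

If either file is reported as missing, revisit steps 6 to 9 and make sure the linked folder is the one that actually contains the file.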

Reference : Developer Center
