Why do DQL queries sometime return blank duplicate rows?

October 5, 2007

Why do DQL queries sometime return blank duplicate rows?

Filed under: support note — Tags: documentum, dql, duplicate rows — Raj V @ 12:39 pm

This may happen if you select from a repeating attribute (which is stored in the same underlying database table) and use the ORDER BY clause on the repeating column. Therefore, if one repeating attribute does not repeat as many times as another, it will have a “null” value for all rows where it has no value until it reaches the maximum number of rows for all repeating attributes of the object type.Note: Even if you specify “DISTINCT” in the query, there may appear to be duplicate rows, containing blank values as part of the repeating attribute column. However, this display is not the result of an error with the query. “DISTINCT” operates on all of the columns you specify within the “select” action; each of those columns in the result set, therefore is unique for the values in each row returned.

When executing a DQL query, sorting on a repeating value (e.g. “r_folder_path”) may cause “duplicates” of the objects to be returned such as those returned by the sample query below:

select object_name, r_folder_path from dm_folder ORDER BY r_folder_path or

select object_name, r_version_label from dm_document ORDER BY object_name

Returns “duplicate” rows of results in WorkSpace. The first half of the return values have the path field filled in. The second half of the returned rows do not. However, as explained above, the second half of the return can also be considered unique rows if the values in their corresponding columns are compared (e.g. a blank value is not the same as an actual value).

This is a side effect of the way DQL/SQL joins or maps the single and repeating-value tables when you employ the “ORDER BY” clause on the repeating field. The way DQL was intended is that every object in the Docbase has both an explicit “r_version_label” (e.g.: ‘draft’, ‘approved’, ‘new’) and an implicit “r_version_label” (e.g.: 1.0, 2.0, etc.) When the single and repeating tables are joined in a one-to-many relationship, at least two rows will result for each object.

To avoid this side effect, include in your “select” statement the “r_object_id” attribute and “order by r_object_id” clause. This pairing will force the master-detail relationship to be invoked since this attribute is common to both database tables and will result in only one row per object being returned. Notice that “r_object_id” in the example code below does not need to be first in either the “select” statement or the “order by” clause, but only be present somewhere in both elements:

select object_name, r_object_id, r_version_label from dm_document where owner_name = ‘charless’ order by object_name, r_object_id

This returns the object names, all sorted, with one object per row (no duplicates) and the “r_version” label information all concatenated onto one line. Instead of a row with “1.0” and another row with “CURRENT”, we now have “1.0,CURRENT” This is a feature of WorkSpace, which intelligently combines the results of the result set to remove duplicates if “r_object_id” is specified as an “order by” object.

Note: This only works if a single-valued attribute or “r_object_id” is first in the “order by” clause. If the repeating value is first, then all the values, including duplicates or blanks, are displayed in the result set.

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

dm_notes: Documentum Notes

October 5, 2007