Thursday, December 3, 2015

How to write BREX context rules (part 3): Learning by Example

Part 3 of "How to write BREX context rules" introduces the <structureObjectRule> element, which defines a structure object rule using XPath. If you are not familiar with XPath, you can review How to write BREX context rules (part 2): XPath Primer.

Since the S1000D specification already provides the element and attribute breakdown of the <structureObjectRule> element (Chap 4.10.2.2, Para 2.1.1 of Issue 4.1), this post takes an example-based approach to learning how to write structured rules.

Example Rule 1: All procedural steps must have an ID

For our first example, the following business rule has been decided upon:

All procedural steps must have an authored ID.

For the purposes of this post, we are not really concerned about the reason for this rule. But if you are curious why a project may have such a rule, and where a project can document the reasons behind a rule, see my post, What is the difference between brDoc and BREX?

Attempt 1

At initial glance, writing the structured rule for this decision seems straightforward, so here is our first try at writing the rule:

Note:
The following is not how the rule should be written. I will explain later on how this illustrates a mistake a beginning BREX rule author may make.
1: <structureObjectRule>
2:  <objectPath allowedObjectFlag="1">//proceduralStep[@id]</objectPath>
3:  <objectUse>A procedural step must have an authored ID.</objectUse>
4: </structureObjectRule>

I will describe each line of the rule:

Line 1

The start tag: The rule itself is defined by child elements.

Line 2

<objectPath> specifies the actual XPath expression and how that expression is applied via the allowedObjectFlag attribute. When allowedObjectFlag is "1", it indicates that the object identified by the XPath expression is required and must exist in the data.

The XPath expression is the textual content of the <objectPath> element, with the given expression matching any <proceduralStep> element in a data module that has an ID.

Line 3

The <objectUse> element provides a human-readable description of the rule. It is highly likely that your IETP authors are not well-versed in XPath expressions, so it is important that you provide a human-readable description of the rule.

Line 4

The end tag.

Problem with Attempt 1

Attempt 1 is based on a misunderstanding of the allowedObjectFlag attribute. The attribute applies to the entire expression, so when it has a value of "1", the expression must always evaluate to a true condition.

What does that mean for the rule we created?

Any data module we validate against the rule will fail if the DM does not have a <proceduralStep> with an id attribute. Therefore, all non-procedural DMs will always fail validation. We either need a way to restrict the rule to apply to procedural DMs only, or a rule that allows us to satisfy the business rule requirement without causing false failures.
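
As an illustration, consider a trimmed-down descriptive data module fragment. The element names should follow the descriptive schema, but the IDs and text are invented for this example and the fragment is not schema-complete. Because it contains no <proceduralStep> at all, //proceduralStep[@id] selects nothing, and with allowedObjectFlag="1" the rule from Attempt 1 reports a failure even though the business rule has not been broken.

```xml
<!-- Hypothetical descriptive DM content, heavily trimmed for illustration -->
<content>
  <description>
    <levelledPara id="par-0001">
      <title>General</title>
      <para>The pump supplies hydraulic pressure to the system.</para>
    </levelledPara>
  </description>
</content>
<!-- //proceduralStep[@id] selects no nodes here, so a rule with
     allowedObjectFlag="1" fails this data module. -->
```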

Attempt 2: Context-specific rule

As mentioned in Part 1, rules can be contextualized by schema type (but note the deficiencies of context-specific rules described in Part 1). Using this capability, we can have the following:

<contextRules
    rulesContext="http://www.s1000d.org/S1000D_4-1/xml_schema_flat/proced.xsd">
  <structureObjectRuleGroup>
    <structureObjectRule>
      <objectPath allowedObjectFlag="1">//proceduralStep[@id]</objectPath>
      <objectUse>A procedural step must have an authored ID.</objectUse>
    </structureObjectRule>
  </structureObjectRuleGroup>
</contextRules>

This addresses the problem of the rule being applied to non-procedural data modules. However, it still fails to adequately enforce the business rule.

Problem with Attempt 2

The rule in Attempt 2 will evaluate to true if there is at least one <proceduralStep> with an id attribute. If a data module has other steps that lack an id, the rule will still pass, because the expression only needs to match at least one node.

The business rule requires that all steps have an ID. So we need a way to verify that every <proceduralStep> instance in a DM has an id attribute.
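
To see the gap, here is a small procedural fragment. It is a hypothetical sketch: the IDs and step text are invented, and everything outside <mainProcedure> is omitted. The first step is enough to satisfy //proceduralStep[@id], so the rule from Attempt 2 passes even though the second step breaks the business rule.

```xml
<!-- Hypothetical procedural DM content, trimmed for illustration -->
<mainProcedure>
  <!-- Has an id: matched by //proceduralStep[@id], so the rule is satisfied -->
  <proceduralStep id="stp-0001">
    <para>Remove the access panel.</para>
  </proceduralStep>
  <!-- No id: violates the business rule, but Attempt 2 never reports it -->
  <proceduralStep>
    <para>Disconnect the electrical connector.</para>
  </proceduralStep>
</mainProcedure>
```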

Attempt 3: Match what is not allowed

When it comes to writing structured rules, you will find that the following guiding principle makes writing rules much easier:

It is generally easier to match what is not allowed vs what is allowed.

Or, in geek-speak:

Applying inverse logic to a business rule can result in a simpler structured object rule.

As a BREX rules developer, when I read the following business rule, "All procedural steps must have an authored ID," I immediately apply inverse logic and translate it to the following, equivalent statement, "Procedural steps with no authored ID are not allowed."

With the inverse rule, the BREX rule can be expressed as follows:

<structureObjectRule>
  <objectPath allowedObjectFlag="0">//proceduralStep[not(@id)]</objectPath>
  <objectUse>A procedural step must have an authored ID.</objectUse>
</structureObjectRule>

The markup is very much like our first attempt, but with the following key changes:

  • The allowedObjectFlag value is set to "0". "0" indicates that any object matched by the expression is not allowed. If a data module contains content that matches a rule with an allowedObjectFlag of "0", the DM will fail validation.

  • The XPath expression has been modified to match nodes that should not be present, in this case, steps with no ID attribute.
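
As a sketch of what the inverted expression selects, consider the hypothetical fragment below (invented IDs and text, trimmed to the steps only). Because // searches at any depth, //proceduralStep[not(@id)] also selects a nested step that lacks an id, and a single match is enough for a rule with allowedObjectFlag="0" to fail the data module.

```xml
<!-- Hypothetical procedural DM content, trimmed for illustration -->
<mainProcedure>
  <!-- Has an id: not selected by //proceduralStep[not(@id)] -->
  <proceduralStep id="stp-0001">
    <para>Remove the access panel.</para>
    <!-- Nested step with no id: selected, so the DM fails validation -->
    <proceduralStep>
      <para>Loosen the four fasteners.</para>
    </proceduralStep>
  </proceduralStep>
  <!-- Has an id: not selected -->
  <proceduralStep id="stp-0002">
    <para>Disconnect the electrical connector.</para>
  </proceduralStep>
</mainProcedure>
```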

It is worth noting that the <objectUse> text is left unchanged. Inverse statements may make writing BREX rules easier, but they are not necessarily easier for IETP authors to understand. An inverse rule tends to use negation, which can be harder to comprehend. For example, which of the following equivalent business rules do you think most people will understand more easily?

All procedural steps must have an authored ID.
Or,
Procedural steps with no authored ID are not allowed.

Problems with Attempt 3

There are no problems! The rule addresses the following deficiencies of previous attempts:

  • Non-procedural DMs are unaffected by the rule since they will never match an expression containing <proceduralStep> (a problem with attempt 1).
  • With inverse logic, we know a single step with an ID cannot mask steps without IDs (a problem with attempt 2).
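
For completeness, the rule from Attempt 3 could be placed in a rule group the same way the rule in Attempt 2 was, but without the rulesContext attribute. This is only a sketch, and it assumes that a <contextRules> element with no rulesContext attribute applies to data modules of every schema type. Confirm that behavior against the specification issue your project uses before relying on it.

```xml
<!-- Sketch: assumes contextRules without rulesContext applies to all schema types -->
<contextRules>
  <structureObjectRuleGroup>
    <structureObjectRule>
      <!-- "0": any node selected by the expression is not allowed -->
      <objectPath allowedObjectFlag="0">//proceduralStep[not(@id)]</objectPath>
      <objectUse>A procedural step must have an authored ID.</objectUse>
    </structureObjectRule>
  </structureObjectRuleGroup>
</contextRules>
```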