IDM Proven Practices: Efficient IDM Input/Output Transformation Value Mappings

I've been asked a few times about the purpose of the various policysets: why do we have this one vs. that one? Why Event vs. Command, or Creation vs. Placement? There are reasons for all of them, and if you follow them properly life can be much simpler than if you lump everything under one policyset and hope for the best.

Today I was preparing for a training and noticed the following snippet of XML in a Command Transformation policyset, which works very well, and is still totally wrong according to proven or best practices:
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE policy PUBLIC "policy-builder-dtd" "/root/designer/plugins/com.novell.idm.policybuilder_4.0.0.201505010256/DTD/dirxmlscript4.5.1.dtd"><policy>
<rule>
<description>If Active, set employeeStatus to A</description>
<conditions>
<and>
<if-op-attr mode="nocase" name="employeeStatus" op="changing-to">Active</if-op-attr>
</and>
</conditions>
<actions>
<do-reformat-op-attr name="employeeStatus">
<arg-value type="string">
<token-text xml:space="preserve">A</token-text>
</arg-value>
</do-reformat-op-attr>
</actions>
</rule>
<rule>
<description>If Inactive, set employeeStatus to I</description>
<conditions>
<and>
<if-op-attr mode="nocase" name="employeeStatus" op="changing-to">Inactive</if-op-attr>
</and>
</conditions>
<actions>
<do-reformat-op-attr name="employeeStatus">
<arg-value type="string">
<token-text xml:space="preserve">L</token-text>
</arg-value>
</do-reformat-op-attr>
</actions>
</rule>
<rule>
<description>If Terminated, set employeeStatus to T</description>
<conditions>
<and>
<if-op-attr mode="nocase" name="employeeStatus" op="changing-to">Terminated</if-op-attr>
</and>
</conditions>
<actions>
<do-reformat-op-attr name="employeeStatus">
<arg-value type="string">
<token-text xml:space="preserve">T</token-text>
</arg-value>
</do-reformat-op-attr>
<do-veto/>
</actions>
</rule>
</policy>

This code works just fine transforming values from their application versions ('Terminated', 'Active', and 'Inactive') to their Identity Vault equivalents ('T', 'A', and 'L', respectively). Nothing too crazy or slow, although the veto at the end there makes the last transformation to 'T' completely irrelevant as it will never see the light of day. Ironically, that may be the only bit of code I think should remain in this entire policy.

That comment may not be glowing praise, but it's not the condemnation I gave earlier when I said it was all wrong. The reason this code is wrong is less about its zeroes and ones and more about its placement in the driver configuration object. In this case, the rules came from the Command Transformation Policyset, and while they work there, they are, by definition, code meant to transform values from the application-specific versions ('Active', etc.) to their Vault values ('A', etc.). Unless something is really holding it back, this code should be in the Input Transformation Policyset (ITP), and the code that transforms back to the application version of things should be in the Output Transformation Policyset (OTP). There are some big reasons why.

  1. Maintenance: This code assumes that all logic comes from the application to the Vault, and in this case that's probably valid. This is a simple JDBC driver connected to a simple HR application, so it is meant to get data from the application and send it on into the Vault without further ado. In most cases, it does just that, very nicely even, and life goes on. Testing creates in the HR application works perfectly, and besides wasting cycles on the Terminate case (reformatting before vetoing) the results are ideal. The problem, though, is everything else that may happen in the future.

    Pretend that sometime in the future, when you have forgotten all about this, you need to add some new logic to set some important attributes on the object on a create. One of those is going to rely on this attribute, or something else from the application which is being similarly transformed, and so it will query for values back in the application to see what is happening there before (for example) adding a user to a group based on employeeStatus (the Vault attribute schema-mapped to the application's hr_status). What could be easier?

    Planning out this policy you'll look at the values in the Vault, see things like 'A' and 'T', and write some logic to handle those values. That logic will not work, and after reading traces you'll realize it is because the values in the application are actually 'Active' and 'Terminated'. No biggie, you can change those having only wasted an hour of your life wondering why it was not working as expected, and now you have two places in your driver config that deal with application-specific values even though they are past the Schema Mapping policyset, where application-specific attribute names are no longer used. Your traces now refer to things like employeeStatus and hr_status interchangeably, or so you hope, and the complexity has only begun to build. It was such a simple thing, such an easy modification, but you'll regret both of these for quite a while.

  2. Maintenance: Direct mappings should be done with mapping tables. The only thing I really do not like about the actual code is the over-use of if statements, as if they were a switch/case or select/case statement. If values are one-to-one, just use a mapping table. Building one takes a couple of minutes longer, but you can reuse it over and over, and being two-dimensional (with multiple rows and columns available) you can do a lot with a properly-designed table. In this case, we have a very simple one, and it may not be worth the effort in the short term for a single rule, but it definitely will be worth it in the long run. The table can be called something like 'employee-state' and can have two columns, 'appvalue' and 'vaultvalue', with rows that pair the values: 'Terminated' and 'T', 'Active' and 'A'. The call to use a mapping table can be made anywhere in this driver (or in another driver in the same eDirectory environment, for that matter) and the result is a single authority to capture and maintain the application-to-vault value mappings for this attribute.

    Even if you implemented only this one thing, and ignored suggestion #1 above by putting the logic in multiple places throughout your driver config, you would still have simpler-to-understand logic. At least now when you need to change things, you do so in one place, deploy one object, and do not even need to restart the driver (mapping tables are read on the fly). Need to add a new type 'F' for 'Furloughed'? Fine, that's a (literally) two-minute change, and since you're not writing any policy there is no need to restart the driver, no need to test new policies/rules/conditions, and you can be relatively certain things will work. Maybe other policies depend on these values later, but at least this one will just work.

  3. Maintenance: Consider the case of a resync for any reason (cache corruption, object restore in the Vault, replicas added to the IDM engine server, or even an intentional migrate). When a resync happens, unexpected things happen. Experienced Identity folks get in the habit of testing for the common Create, Read, Update, and Delete test cases, but Identity Manager has more than that, and one of those is the ability to merge values between the Vault and the Application. How should single-to-multiple value merging happen? Which system is the authority for the e-mail address (probably not the Vault, but probably the e-mail system, even though the e-mail system is not authoritative for anything else sent to it)? What if some kind of hardware failure causes downtime, and you want to resynchronize objects to get them back in shape? What if you have a create event and it matches an existing user in the Vault? Merge, merge, merge, merge.

    A merge is interesting because it does not get kicked off from policy directly, but instead happens because of the Merge Processor (what an odd name for such a thing, eh?). Based on the rules defined in the filter, the Merge Processor will read values from here (Vault), and from there (Application), and then squish things together in the way you have defined in the filter. Where in that process do you tell it to convert from things like 'Terminated' to 'T', or vice versa? Good luck with that, and now the system may go ahead, since the HR system is authoritative after all, and write values like 'Terminated' into employeeStatus. Suddenly your ten-minute single-user restore is going to end up being days of figuring out what changed, and why. Once you realize it was IDM that did it, trying to reverse-engineer your rules spread amongst all of the wrong policysets to transform values from Application to Vault values will be horrible.

    If you had that Mapping Table (or one per attribute) you could at least find those and fix them semi-quickly, for one user. Consider, though, if you had to restore a few users, or hundreds of users. How much manual pain and suffering are you willing to endure?

    Also, do not forget about simple queries that you kick off yourself (calls to attr, src-attr, or dest-attr) which also need to be transformed. The Input and Output Transformation Policysets are not really in either individual channel, but instead are there for both channels to use depending on what is needed. If you send a query, it will go through these policysets alone and "out of band", so having them be correct saves tons of time later. Programmers have long written about modularity, and code reuse; this is a great way to reuse code.

  4. Maintenance: Noticing a trend with maintenance here? Anybody can do Identity Manager work because the framework is just that powerful and makes it so easy, but keeping it going efficiently over time, through the organization's inevitable business changes, takes a little thought, just like creating the initial system did. Testing is a lot easier when you have properly-created Input and Output Transformation Policyset policies. Once set up, you can basically write your tests to watch this value here, or that value there, and send changes through and just know they'll work, mostly because you do not need to worry about mismatched logic in one policy vs. another (#1), or mismatched values in one policy or another (#2), or this type of event vs. that type of event (#3), or anything like that. Testing just works, because the transformation from this to that, or that to this, is done once, in one place, and reliably reused by everything the system ever does.

  5. Maintenance: Policysets are named for a purpose, and these are no different. Input and Output Transformation Policysets are meant to transform values for applications as they come into, or leave, the Vault. Your new trainee, or best-friend-and-successor, or whoever will come in and want to understand the logic, and if it is in the right place there are automatic clues about the logic's purpose. What do these policies in this 'Input Transformation Policyset' thing do? Maybe this is a policyset to transform input to the Vault (I was once told that to figure out something's complex name, just reverse the words and it's a bit more self-explanatory, so that's what I did there in case it was not obvious). Now you may argue that every single policyset is transforming input in some way, and if you want to be pedantic about it that's true, but the 'Inputted Attribute Value Transformation Policyset' is just way too long to write, or type, or say, so we'll give those naming folks a break here. These policysets exist outside of either channel because they are meant to be used by both for operations (transformations of values) relevant to both channels.

  6. Maintenance: Doing this once, with the reformat-op-attr action, makes it work both for operation documents that make a change (add, modify, delete) and for those that convey information to one side or the other (query), because reformat-op-attr is just that useful.
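
To make point 2 concrete, here is a minimal sketch of what the 'employee-state' mapping table's XML might look like. The table and column names come from the description above and the row values from the original rules; the col-def type of 'nocase' is my assumption, so adjust it to suit your data.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<mapping-table>
  <!-- One column per system; policies pick source and destination by name. -->
  <col-def name="appvalue" type="nocase"/>
  <col-def name="vaultvalue" type="nocase"/>
  <!-- Rows mirror the three hard-coded rules from the original policy. -->
  <row>
    <col>Active</col>
    <col>A</col>
  </row>
  <row>
    <col>Inactive</col>
    <col>L</col>
  </row>
  <row>
    <col>Terminated</col>
    <col>T</col>
  </row>
</mapping-table>
```

Adding the 'Furloughed' state later would just mean one more row pairing 'Furloughed' with 'F'; no policy changes at all.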


Those are some reasons to do things the right way, just off the top of my head, but now let's see how this simple bit of logic could be reworked to be a bit more flexible, and powerful, and useful to everybody.

Note: I wrote this to be as simple for reuse as possible, and thus it is not nearly as simple as the original. With that said, the original was broken, could never be reused, and was not that helpful outside of one, very specific, use case, so we're moving from a Ford Model T to a Tesla Model S; it may feel like moving backward, but you also can go 210 kph, enjoy air conditioning, a great stereo system, airbags, and be in one of the safest cars in the world, all without directly polluting the atmosphere around you.
<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE policy PUBLIC "policy-builder-dtd" "/root/designer/plugins/com.novell.idm.policybuilder_4.0.0.201505010256/DTD/dirxmlscript4.5.1.dtd"><policy>
<rule>
<description>reformat employeeStatus</description>
<comment xml:space="preserve">An attribute needs to be transformed from the application value to the vault value on adds, removes, and query responses.

Valid values should pass through, invalid values should be stripped out. If all values stripped out, the attribute itself should also go away.

If multiple values go through, all valid values should pass through, no invalid values should make it, and if any single value is stripped for being invalid, it alone should be stripped (leaving valid values alone).

0.1.20150701063000 - Initial version

Test cases with hr_status attribute:

&lt;nds dtdversion="2.0" ndsversion="8.x" xmlns:jdbc="urn:dirxml:jdbc">
&lt;source>
&lt;product build="20150417_0410" instance="employee-hr" version="4.0.1.0">DirXML Driver for JDBC&lt;/product>
&lt;contact>NetIQ Corporation&lt;/contact>
&lt;/source>
&lt;output>
&lt;instance class-name="hr_pers" event-id="0" src-dn="HR_PERSID=248,table=HR_PERS">
&lt;association state="associated">HR_PERSID=248,table=HR_PERS&lt;/association>
&lt;attr attr-name="hr_status">
&lt;value type="string">Active&lt;/value>
&lt;value type="string">Junk&lt;/value>
&lt;/attr>
&lt;/instance>
&lt;status event-id="0" level="success">&lt;/status>
&lt;/output>
&lt;/nds>

&lt;nds dtdversion="2.0" ndsversion="8.x" xmlns:jdbc="urn:dirxml:jdbc">
&lt;source>
&lt;product build="20150417_0410" instance="employee-hr" version="4.0.1.0">DirXML Driver for JDBC&lt;/product>
&lt;contact>NetIQ Corporation&lt;/contact>
&lt;/source>
&lt;input>
&lt;modify class-name="hr_pers" event-id="0" src-dn="HR_PERSID=248,table=HR_PERS">
&lt;association state="associated">HR_PERSID=248,table=HR_PERS&lt;/association>
&lt;modify-attr attr-name="hr_status">
&lt;remove-value>
&lt;value type="string">Junk&lt;/value>
&lt;/remove-value>
&lt;add-value>
&lt;value type="string">Active&lt;/value>
&lt;/add-value>
&lt;/modify-attr>
&lt;/modify>
&lt;status event-id="0" level="success">&lt;/status>
&lt;/input>
&lt;/nds></comment>
<comment name="author" xml:space="preserve">Aaron Burgemeister</comment>
<comment name="version" xml:space="preserve">0.1.20150701063000</comment>
<comment name="lastchanged" xml:space="preserve">2015-07-01T06:30:00</comment>
<conditions>
<and/>
</conditions>
<actions>
<do-if>
<arg-conditions>
<and>
<if-class-name mode="nocase" op="equal">hr_pers</if-class-name>
</and>
</arg-conditions>
<arg-actions>
<do-set-local-variable name="map-attr-name" scope="policy">
<arg-string>
<token-text xml:space="preserve">hr_status</token-text>
</arg-string>
</do-set-local-variable>
<do-if>
<arg-conditions>
<and>
<if-op-attr name="$map-attr-name$" op="changing"/>
</and>
</arg-conditions>
<arg-actions>
<do-reformat-op-attr name="$map-attr-name$">
<arg-value type="string">
<token-map default-value="~drv.employee-hr.map-value-not-found~" dest="vaultvalue" src="appvalue" table="employee-state">
<token-local-variable name="current-value"/>
</token-map>
</arg-value>
</do-reformat-op-attr>
<do-if>
<arg-conditions>
<or>
<if-op-attr mode="nocase" name="$map-attr-name$" op="changing-from">~drv.employee-hr.map-value-not-found~</if-op-attr>
<if-op-attr mode="nocase" name="$map-attr-name$" op="changing-to">~drv.employee-hr.map-value-not-found~</if-op-attr>
</or>
</arg-conditions>
<arg-actions>
<do-trace-message level="1">
<arg-string>
<token-text xml:space="preserve">A value coming from the application was not found in the mapping table which means the mapping table must be updated. For now cleaning out the invalid value(s) to avoid problems later in this operation.</token-text>
</arg-string>
</do-trace-message>
<do-strip-xpath expression="*[@attr-name=$map-attr-name]/value[text()='~drv.employee-hr.map-value-not-found~']"/>
<do-strip-xpath expression="*[@attr-name=$map-attr-name]/add-value[value='~drv.employee-hr.map-value-not-found~']"/>
<do-strip-xpath expression="*[@attr-name=$map-attr-name]/remove-value[value='~drv.employee-hr.map-value-not-found~']"/>
<do-strip-xpath expression="*[@attr-name=$map-attr-name and not(*)]"/>
</arg-actions>
<arg-actions/>
</do-if>
</arg-actions>
<arg-actions/>
</do-if>
</arg-actions>
<arg-actions/>
</do-if>
</actions>
</rule>
</policy>

First, the top big weird-looking XML-ish bit has testing documents, so you can ignore those lines starting with '&lt;' since they are just encoded XML to avoid interfering with the actual XML policy, while keeping it with the policy for simplicity and documentation. If you want to test your deployed rule, then those XML documents are what you should use after decoding them back to regular XML.

Next, as I mentioned, the goal of writing this policy this way was to make its rule as reusable as possible. With very little work this should be able to be reused for any attribute with a direct mapping to/from an application. Creating a new mapping table ('department-values') with the same columns but different data, and then changing the first local variable to have the new attribute name (in the application's schema, so perhaps 'hr_dept') is all you really need to do in order to make this work for another attribute. Change the name of each duplicated rule to keep things less confusing of course, but that is it.
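
As a rough sketch of that reuse, these are the only two spots that would change for the 'hr_dept' attribute and 'department-values' table mentioned above; both names are just the hypothetical examples from the previous paragraph. The wrapping arg-actions element is only there to keep the fragment well-formed, and the rest of the rule stays exactly as it is.

```xml
<arg-actions>
  <!-- Change 1: point the local variable at the other application attribute. -->
  <do-set-local-variable name="map-attr-name" scope="policy">
    <arg-string>
      <token-text xml:space="preserve">hr_dept</token-text>
    </arg-string>
  </do-set-local-variable>
  <!-- Change 2: look values up in the other mapping table (same column names). -->
  <do-reformat-op-attr name="$map-attr-name$">
    <arg-value type="string">
      <token-map default-value="~drv.employee-hr.map-value-not-found~" dest="vaultvalue" src="appvalue" table="department-values">
        <token-local-variable name="current-value"/>
      </token-map>
    </arg-value>
  </do-reformat-op-attr>
</arg-actions>
```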

Next, there is logic in there to handle unknown values. Pretend you add that new 'Furloughed' type of state in the business and the HR system, but forget to add it to IDM's mapping table; the logic above will strip out unknown values so that they basically do not exist when returned. That may, or may not, be what you want, but it is the reason for the logic that strips out values when they match the mapping table's default ("I could not find what you gave me to find in the table.") returned value. If that logic does not suit you, remove the first three (of four) do-strip-xpath actions, or maybe just comment them out until you realize you do want them, and then redeploy the rule. The rule will also write a trace message just before that, so you'll have some warning if you are watching in the development/testing phase. If you wanted to you could add audit events, or other actions that would warn you about things that you do not deem possible (missed opportunities, or missed changes, or bugs, or whatever).
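
The rule depends on a global configuration value (GCV) named drv.employee-hr.map-value-not-found, which is what the map token returns when no row matches. A definition along these lines should do; the display name, description, and the value itself are placeholders I made up for illustration, so pick a value that could never be a legitimate status.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration-values>
  <definitions>
    <!-- The name must match the ~...~ references used in the rule. -->
    <definition display-name="Mapping value not found marker" name="drv.employee-hr.map-value-not-found" type="string">
      <description>Placeholder returned by mapping table lookups when no row matches.</description>
      <!-- Hypothetical marker value; anything that cannot collide with real data works. -->
      <value>MAP-VALUE-NOT-FOUND</value>
    </definition>
  </definitions>
</configuration-values>
```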

Finally, the logic above also cleans up XML that is valueless because of the things removed just before it. It's weird, and an edge case at best, to have a <modify-attr attr-name="hr_status"/> element; there's no value, so strip it out entirely, and the last action does that.

There is the case of the Output Transformation Policyset, which should have a similar (but opposite) rule. Simple enough, use the same logic above, but reverse the columns in the mapping table call, so you're mapping from vault to application (instead of application to vault). Pretty simple, yes. Your testing documents will also need to change values because they're written for application values coming in, but that's also trivial.
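
A minimal sketch of that Output Transformation rule might look like the following, assuming the same 'employee-state' table and GCV as above. The attribute is still called hr_status here because the Output Transformation Policyset sits on the application side of Schema Mapping. I left out the unknown-value cleanup to keep it short, so treat this as a starting point rather than a finished rule.

```xml
<policy>
  <rule>
    <description>reformat hr_status for the application (sketch)</description>
    <conditions>
      <and>
        <if-class-name mode="nocase" op="equal">hr_pers</if-class-name>
        <if-op-attr name="hr_status" op="changing"/>
      </and>
    </conditions>
    <actions>
      <do-reformat-op-attr name="hr_status">
        <arg-value type="string">
          <!-- Same table as the input rule, with src and dest swapped. -->
          <token-map default-value="~drv.employee-hr.map-value-not-found~" dest="appvalue" src="vaultvalue" table="employee-state">
            <token-local-variable name="current-value"/>
          </token-map>
        </arg-value>
      </do-reformat-op-attr>
    </actions>
  </rule>
</policy>
```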

What about that Veto at the end of the original bit of code? If there is a desire to veto Terminate events before they go to the Vault, that seems pretty odd, but you are allowed to do whatever you want. If the rest of the driver configuration does not care about, and should not act on, a Terminate event, then move the rule with the veto (and nothing else from this policy) to the Publisher's Event Transformation Policyset (ETP); if the other policies and rules in the driver configuration do act on a Terminate, but just should not continue past this point, clean up the other actions from this rule and leave it here to veto things that match at this point.
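
For the first option, the relocated rule could be as small as the sketch below. It assumes the Input Transformation rule above has already turned 'Terminated' into 'T' and that Schema Mapping has renamed hr_status to employeeStatus by the time the Publisher's ETP sees the event, which is why it tests the Vault value instead of the application value.

```xml
<policy>
  <rule>
    <description>Veto events that set employeeStatus to T (sketch)</description>
    <conditions>
      <and>
        <!-- Vault-side name and value; transformation happened earlier in the ITP. -->
        <if-op-attr mode="nocase" name="employeeStatus" op="changing-to">T</if-op-attr>
      </and>
    </conditions>
    <actions>
      <do-veto/>
    </actions>
  </rule>
</policy>
```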

The points to take away from this are the ones we all learned back in the early days of schooling, whether at home or during some sort of formal education:

  1. Do it nice, or do it twice (or thrice). Arguably the second way is much more complex, but it will also do a lot for you while being easily reusable, or extended, or whatever.

  2. Everything has its place; going from beginner to intermediate with Identity Manager implies knowing where things belong, and then keeping them there. Otherwise we need to keep cleaning up our room.

  3. Haste Makes Waste: Think long term. Whether you're a fan of Emotional Quotient (EQ) topics, or you're a social scientist with newer/better/faster/stronger models to apply, short-term, do-it-now, quick thinking will abuse you if you do not let yourself do these jobs correctly.

  4. Accuracy is more important than speed. In Identity this is much more true than in other areas of IT. Delete one row of a large retailer's inventory, meaning one item is now missing from millions in history, and nobody may notice; mess up one user's account in Identity, and you could be on the receiving end of a pink slip. Accuracy matters.

  5. Location, Location, Location. Maybe that was the real estate guy more than the school teacher, but we all know it's true. Where something happens may be as important as when or if; do it in the right spot.


Happy Computing!
