Mapping notes

This document (from Ryan Brooks) contains implementation notes to aid maintenance of the transform.

Requirements

This transformation is built with the Talend Open Studio Data Integration suite.

  • Talend Open Studio Data Integration, >~5.6.2.
  • Git

Transformation steps

The transform is broken down into 5 steps:

  1. Extract the Excel spreadsheet.
  2. Map the spreadsheet columns to NHIC data elements.
  3. Replace spurious values with nulls.
  4. Replace input strings with their enumerated counterparts.
  5. Load the output XML file.

Replacements

The primary role of the replacements is to swap input values for valid enumeratied values in the XML, for example swapping Positive for Pos.

There are also many values which are non-values, for example Not tested, Awaiting result, and Unknown, when valid values in the XML are Positive and Negative.

NHIC_TRA_43-1

  • Unknown -> 0 (Not known)

NHIC_TRA_44-1

  • Burns -> 17 (Other)
  • Cardiovascular - type unclassified -> 17 (Other)
  • Chronic pulmonary disease -> 17 (Other)
  • Liver failure (not self poisoning) -> 17 (Other)
  • Pneumonia -> 17 (Other)
  • Respiratory - type unclassified (inc smoke inhalation) ->
  • Respiratory failure ->
  • Septicaemia -> 17 (Other)
  • Other drug overdose -> 6 (Drug overdose)
  • Not reported -> 18 (Unknown)
  • Infections - type unclassified -> 8 (Infection)
  • Trauma - RTA - car -> 16 (Trauma)
  • Trauma - RTA - motorbike -> 16 (Trauma)
  • Trauma - RTA - pedestrian -> 16 (Trauma)
  • Trauma - RTA - pushbike -> 16 (Trauma)
  • Trauma - RTA - unknown type -> 16 (Trauma)
  • Other trauma - accident -> 16 (Trauma)
  • Other trauma - unknown cause -> 16 (Trauma)
  • Other trauma - suicide -> 16 (Trauma)

NHIC_TRA_45-1

  • Mixed -> G (Any other mixed background)
  • Other -> S (Any other ethnic group)

NHIC_TRA_56

  • Peritoneal dialysis -> PERITONEAL DIALYSIS (CAPD/APD)
  • Haemodialysis -> UNIT HAEMODIALYSIS ** THIS IS A MASSIVE ASSUMPTION **

NHIC_TRA_70-1

  • Living unrelated - Altruistic -> LIVING UNRELATED
  • Living unrelated - Pooled -> LIVING UNRELATED
  • Unknown -> removed

NHIC_TRA_71-1

  • ALTRUISTIC NON-DIRECTED DONOR -> OTHER LIVING NON-RELATED DONOR - PLEASE SPECIFY
  • POOLED DONOR -> OTHER LIVING NON-RELATED DONOR - PLEASE SPECIFY
  • LIVING NON-RELATED DONOR - PARTNER -> LIVING NON-RELATED DONOR - SPOUSE

m attributes

The modification time (m) attributes have been omitted from the transformation, because each transformation will include the entire NHSBT extract and will therefore supercede any prior values. The NHSBT extract doesn't contain timestamps for modifications, so it would only be possible to add the m value of the extract or transformation date and apply that to all elements.