Data transformation

Aim

To transform measured variables into variables that can be used for analysis (analysis variables).

 

Requirements

  • To record the transformations that have been carried out;
  • To ensure reversibility by keeping the original variable intact.

 

Documentation

  • Note the transformations that have been carried out, for example in the syntax file.

 

Responsibilities

Executing researcher: To execute the data transformation properly.
Project leaders: To advise the executing researcher on transforming the data after data cleaning.
Research assistant: N.a.

 

How To

Examples of data transformation are recoding an income variable into a new income class variable, or calculating a new Body Mass Index variable from the height and weight variables. As the name suggests, such variables are often created with the data transformation commands in SPSS, such as recode, compute, etc. It is important that these transformations are carried out properly and that they are documented. If the transformations are carried out in SPSS via the syntax window, it is sufficient to save the syntax file as a logbook. If they are carried out via the menu, the documentation may consist of the SPSS log file, annotated in MS Word.
It is recommended to draw up a data transformation schedule before making any modifications, with columns for the measured variable(s), the process(es), the name of the resulting variable and a description of the resulting variable (see details). Appropriate variable labels, value labels and, where applicable, missing value definitions need to be assigned to the new variables. Standard practice is to calculate the frequency distribution of each new variable in order to check for odd values and outliers.
When recoding a variable, it is important that the original variable is left intact. Therefore, always create a new variable.
Once the transformations have been made, the files should be stored under a new name. These files are referred to as the working data: the cleaned files, including the derived variables.
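
The steps described above can be sketched in a few lines of code. The example below is a minimal illustration in Python rather than SPSS syntax. The variable names, the sample records and the income cut-offs are all assumptions made for the illustration and are not prescribed by this guideline. It derives an income class and a Body Mass Index as new variables, leaves the measured variables untouched, and calculates a frequency distribution of the new income class variable.

```python
from collections import Counter

# Hypothetical measured data; names and values are illustrative only.
records = [
    {"id": 1, "income": 18000, "height_cm": 170.0, "weight_kg": 65.0},
    {"id": 2, "income": 32000, "height_cm": 182.0, "weight_kg": 90.0},
    {"id": 3, "income": 55000, "height_cm": 160.0, "weight_kg": 51.2},
    {"id": 4, "income": None, "height_cm": 175.0, "weight_kg": None},
]

# Value labels for the derived variable (assumed classes, not from the guideline).
INCOME_CLASS_LABELS = {1: "low", 2: "middle", 3: "high"}


def income_class(income):
    """Recode income into a class code; a missing input stays missing."""
    if income is None:
        return None
    if income < 20000:   # assumed cut-off
        return 1
    if income < 40000:   # assumed cut-off
        return 2
    return 3


def bmi(weight_kg, height_cm):
    """Body Mass Index: weight in kg divided by squared height in metres."""
    if weight_kg is None or height_cm is None:
        return None
    height_m = height_cm / 100
    return round(weight_kg / height_m ** 2, 1)


def transform(rows):
    """Return new rows with derived variables added; the input is not modified."""
    working = []
    for row in rows:
        new_row = dict(row)  # copy, so the measured variables stay intact
        new_row["income_class"] = income_class(row["income"])
        new_row["bmi"] = bmi(row["weight_kg"], row["height_cm"])
        working.append(new_row)
    return working


# The working data: the original records plus the derived variables.
working_data = transform(records)

# Frequency distribution of the new variable, to check for odd values.
income_class_freq = Counter(r["income_class"] for r in working_data)
for code, count in income_class_freq.items():
    print(INCOME_CLASS_LABELS.get(code, "missing"), count)
```

Because `transform` writes the derived values to copies of the records, the original data can always be recovered, which is the reversibility requirement stated above. In SPSS the same effect is obtained by always writing the result to a new variable.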

Example data transformation

 

Appendices/references/links

 

Audit questions

  1. Has the data transformation been carried out correctly?
  2. How has the data transformation been documented?
  3. Has care been taken to ensure that the process is reversible, that is, can the file be restored to its status prior to transformation, if required?

 

V3.0: 1 December 2016: Revision guideline
V2.0: 12 May 2015: Revision format
V1.1: 1 Jan 2010: English translation
V1.0: 31 Mar 2004: Data reduction has been changed to data transformation