Quality Handbook
To promote structured and targeted data analysis
An analysis plan should be created prior to the data analyses. The analysis plan contains a description of the research question and what the various steps in the analysis are going to be. The analysis plan is intended as a starting point for the analysis. It ensures that the analysis can be undertaken in a targeted manner.
However, both the research questions and the analyses may be revised during the data analysis. Certain options may also not yet be clear before the data analysis starts, and exploratory data analysis is also possible. The findings and decisions made during the analyses may be documented in the analysis plan at a later stage, so that the analysis plan becomes a dynamic document. Alternatively, findings and decisions made during the data analysis can be documented in SPSS syntax (see guideline 1.4-05 Documentation of data analysis), in which case the analysis plan serves only as the starting point.
The first step in the analysis plan is to formulate the concrete research question, that is, the question the analyses are intended to answer. Concrete research questions may be defined using the acronym PICO: Population, Intervention, Comparison, Outcomes. A question such as “What are the risk factors for back pain?” is too general. An example of a concrete question could be: “Does frequent bending at work lead to an elevated risk of lower back pain occurring in employees?” (Population = Employees; Intervention = Frequent bending; Comparison = Infrequent bending; Outcome = Occurrence of back pain). Concrete research questions are essential for determining the analyses required.
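As an illustration, the PICO breakdown can be written down as a small data structure, which makes it easy to see which elements of a question are still unspecified. This is only a minimal sketch: the class name PicoQuestion and its method are hypothetical, and treating the general question as having three blank elements is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicoQuestion:
    """A research question broken down into its PICO elements."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def missing_elements(self):
        """Return the names of the PICO elements that were left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]


# The concrete question from the text: every element is filled in.
concrete = PicoQuestion(
    population="Employees",
    intervention="Frequent bending",
    comparison="Infrequent bending",
    outcome="Occurrence of back pain",
)

# The overly general question: only the outcome is known (assumed reading).
too_general = PicoQuestion(
    population="", intervention="", comparison="", outcome="Back pain"
)

print(concrete.missing_elements())
print(too_general.missing_elements())
```

A question with no missing elements is specific enough to start choosing analyses for.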
The analysis plan should then describe which statistical techniques are to be used to analyse the data. The following issues need to be considered in this process and described where applicable:
A statistician may need to be consulted regarding the choice of statistical techniques. See the example of an analysis plan below.
It can be quite efficient to create a number of empty tables, as they are to appear in the article, before starting the data analysis. This is often very helpful in deciding exactly which analyses are required to analyse the data in a targeted manner.
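One possible way to prepare such an empty table is sketched here, using the five psychosocial factors from the example plan in this guideline as rows. The column headings are an assumed layout (crude and adjusted hazard ratios, since the example plan uses Cox regression), and the function name empty_table is hypothetical.

```python
FACTORS = [
    "Quantitative job demands",
    "Skill discretion",
    "Decision authority",
    "Supervisor support",
    "Co-worker support",
]

# Assumed column layout; adapt to the tables planned for the article.
COLUMNS = ["Crude HR (95% CI)", "Adjusted HR (95% CI)"]


def empty_table(rows, columns, placeholder="TBD"):
    """Return a plain-text table shell with every result cell left open."""
    width = max(len(row) for row in rows)
    header = " | ".join(["Factor".ljust(width)] + columns)
    lines = [header, "-" * len(header)]
    for row in rows:
        cells = [placeholder.ljust(len(column)) for column in columns]
        lines.append(" | ".join([row.ljust(width)] + cells))
    return "\n".join(lines)


shell = empty_table(FACTORS, COLUMNS)
print(shell)
```

Each open cell corresponds to one result that the analyses must deliver, which helps in checking that the plan covers everything the article will report.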
EXAMPLE OF AN ANALYSIS PLAN
Work-related psychosocial risk factors in relation to the occurrence of neck complaints.
Research question
What is the influence of the following psychosocial factors on the occurrence of neck complaints within 1 year in symptom-free employees?
1. Quantitative job demands
2. Skill discretion
3. Decision authority
4. Supervisor support
5. Co-worker support
Population
All 977 individuals who were symptom-free at baseline measurement and had a full follow-up.
Outcome measure (dependent variable)
Dichotomous variable: Presence (1) or absence (0) of neck complaints
Time variable: Time in days until a neck complaint arises (minimum of 1 day)
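The two outcome variables could be derived per employee roughly as follows. This is a sketch under stated assumptions: "within 1 year" is taken as 365 days, employees without a complaint are assigned the full follow-up time, and the function name outcome_variables and the dates are hypothetical. None of these details is specified in the plan itself.

```python
from datetime import date

FOLLOW_UP_DAYS = 365  # assumption: "within 1 year" is taken as 365 days


def outcome_variables(baseline, onset):
    """Return (event, time_in_days) for one employee.

    event is 1 if a neck complaint arose during follow-up, otherwise 0.
    time is the number of days from baseline to onset, with a minimum
    of 1 day. Employees without a complaint during follow-up get the
    full follow-up time (an assumed rule, not stated in the plan).
    """
    if onset is None:
        return 0, FOLLOW_UP_DAYS
    days = (onset - baseline).days
    if days > FOLLOW_UP_DAYS:
        return 0, FOLLOW_UP_DAYS
    return 1, max(days, 1)


# Hypothetical employees, for illustration only.
print(outcome_variables(date(2001, 1, 1), date(2001, 1, 11)))
print(outcome_variables(date(2001, 1, 1), date(2001, 1, 1)))
print(outcome_variables(date(2001, 1, 1), None))
```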
Independent variables:
All independent variables and confounders are dimensions of the Job Content Questionnaire (Karasek questionnaire).
1. Quantitative job demands
2. Skill discretion
3. Decision authority
4. Supervisor support
5. Co-worker support
Confounders:
1. Qualitative job demands
2. Job security
For each analysis with 1 central psychosocial factor, the other 4 will be analysed as potential confounders.
Other potential confounders
Statistical analysis
One regression model for each psychosocial factor:
- First, univariate Cox regressions, with neck complaints as the dependent variable and the central psychosocial factor as the independent variable
Confounding
- Univariate Cox regressions of all potential confounders. Potential confounders with p > 0.25 are dropped from further consideration.
- Multivariable Cox regressions, each with 1 central psychosocial factor and 1 potential confounder that passed the p < 0.25 screening. When the regression coefficient of the central psychosocial factor changes by about 10% or more, the potential confounder is regarded as a true confounder and is included in the multivariable analysis.
- Add the potential confounders one at a time: if the regression coefficient changes by 10% or more, keep the confounder in the model; otherwise it can be excluded.
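The confounder selection rules above can be sketched as two small checks applied to results that have already been obtained from the Cox regressions. The function names are hypothetical and the p-values and coefficients are made-up numbers for illustration; the sketch does not fit any model itself. A p-value of exactly 0.25 is kept, which is an assumption, since the plan only states that p > 0.25 is dropped.

```python
P_SCREEN = 0.25      # screening threshold from the plan
CHANGE_LIMIT = 0.10  # 10% change-in-estimate criterion from the plan


def screen_candidates(p_values):
    """Keep potential confounders whose univariate p-value is not above 0.25."""
    return [name for name, p in p_values.items() if p <= P_SCREEN]


def relative_change(crude_coef, adjusted_coef):
    """Relative change in the central factor's coefficient after adjustment."""
    if crude_coef == 0:
        raise ValueError("crude coefficient is zero; relative change is undefined")
    return abs(adjusted_coef - crude_coef) / abs(crude_coef)


def true_confounders(crude_coef, adjusted_coefs):
    """Return the candidates that change the coefficient by 10% or more."""
    return [
        name
        for name, coef in adjusted_coefs.items()
        if relative_change(crude_coef, coef) >= CHANGE_LIMIT
    ]


# Made-up univariate p-values for two potential confounders.
p_values = {"Qualitative job demands": 0.04, "Job security": 0.40}
candidates = screen_candidates(p_values)

# Made-up coefficients of the central factor: crude, and adjusted
# for each remaining candidate in turn.
crude = 0.50
adjusted = {"Qualitative job demands": 0.42}
confounders = true_confounders(crude, adjusted)

print(candidates)
print(confounders)
```

In this made-up example the coefficient moves from 0.50 to 0.42, a change of 16%, so the candidate would be kept in the model.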
Effect modification
- Sex: Create a sex * psychosocial factor interaction term. Add the interaction term to the final model (with confounders). If the interaction is significant, effect modification is present.
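As a rough sketch, the interaction term is the product of the sex variable and the psychosocial factor score, and its significance can be judged with a two-sided Wald test on the fitted coefficient. The function names are hypothetical, sex is assumed to be coded 0/1, and the 0.05 significance level is an assumption because the plan does not state one. The coefficient and standard error are expected to come from the fitted final model.

```python
import math


def interaction_term(sex, factor_score):
    """Product term for a sex * psychosocial factor interaction (sex coded 0/1)."""
    return sex * factor_score


def wald_p_value(coef, std_err):
    """Two-sided p-value of a Wald z-test for one regression coefficient."""
    z = coef / std_err
    return math.erfc(abs(z) / math.sqrt(2))


def effect_modification(coef, std_err, alpha=0.05):
    """True if the interaction coefficient is significant at level alpha."""
    return wald_p_value(coef, std_err) < alpha


# Made-up interaction coefficients and standard errors, for illustration only.
print(effect_modification(0.9, 0.3))
print(effect_modification(0.1, 0.3))
```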
Analysis plan: a stepwise plan created prior to the actual data analysis
V1.2: 1 Jan 2010: English translation.
V1.1: 21 Jan 2008: Text in guideline has been re-written with more emphasis on a flexible approach.