Initial data analysis

Aim

To get a first impression of the data; to evaluate the randomisation procedure; to evaluate missing values and outliers, and impute them where appropriate; to get an impression of the distributional properties of the continuous variables and of the numbers in the subgroups; and to explore the validity and reliability of the measurement instruments.

Description

Exploratory analyses

This type of analysis helps in assessing whether there are missing values and/or outliers and whether categories need to be combined.

First impression
It is advisable to always review the distribution of all the variables to be used. Frequencies are reviewed for all categorical variables (e.g. marital status, education). Descriptive statistics (percentage missing values, average, “trimmed” average, standard deviation, median, other possible percentiles, minimum, maximum, skewness and kurtosis) are calculated for continuous variables (e.g. body weight, blood pressure). It is advisable to create figures, e.g. boxplots or histograms in order to review the distribution.
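The descriptive statistics listed above can be sketched as follows, using only the Python standard library. The function name describe, the 10% trimming fraction and the body weights are assumptions made for illustration; in practice a statistical package would normally be used.

```python
import statistics

def describe(values, trim=0.1):
    """Summarise a continuous variable; None marks a missing value."""
    observed = sorted(v for v in values if v is not None)
    n = len(observed)
    k = int(n * trim)  # number of observations cut from each tail
    trimmed = observed[k:n - k] if k else observed
    return {
        "n": n,
        "pct_missing": 100 * (len(values) - n) / len(values),
        "mean": statistics.mean(observed),
        "trimmed_mean": statistics.mean(trimmed),
        "sd": statistics.stdev(observed),
        "median": statistics.median(observed),
        "min": observed[0],
        "max": observed[-1],
    }

# Hypothetical body weights in kg; None marks a missing value.
weights = [62.0, 70.5, 68.0, None, 81.0, 74.5, 59.0, 77.0, 66.0, 72.0, 180.0]
summary = describe(weights)
```

In this made-up sample the mean (81.0) lies well above the trimmed mean (71.375) and the median (71.25), which is a first hint that an extreme value is present.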

Outliers
So-called outliers may occur in continuous variables. These are values that, theoretically, are not “out of range”, but are extremely unlikely given the observed distribution. Reviewing averages and standard deviations is not enough to discover outliers; a frequency table or boxplot will need to be generated for this.
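The following sketch illustrates why averages and standard deviations alone can miss an outlier. It compares the 1.5 × IQR rule that underlies most boxplots with a "mean plus or minus 3 SD" rule. Both cut-offs, the function names and the weights are assumptions for illustration, not prescriptions of this guideline.

```python
import statistics

def boxplot_outliers(values, k=1.5):
    """Values more than k * IQR beyond the quartiles (the usual boxplot fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]

def sd_outliers(values, z=3.0):
    """Values more than z standard deviations from the mean."""
    mean, sd = statistics.mean(values), statistics.stdev(values)
    return [v for v in values if abs(v - mean) > z * sd]

# Hypothetical body weights in kg, including one extreme value.
weights = [62.0, 70.5, 68.0, 81.0, 74.5, 59.0, 77.0, 66.0, 72.0, 180.0]
flagged_by_boxplot = boxplot_outliers(weights)
flagged_by_sd = sd_outliers(weights)
```

In this small sample the boxplot rule flags the value of 180 kg, whereas the SD rule flags nothing, because the extreme value itself inflates the standard deviation.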

Odd combinations
Cross-tabulations can be generated for categorical variables (e.g. gender x ADL limitations) in order to assess whether odd combinations are present. Scatterplots can be created for continuous variables to reveal any unlikely combinations (simply reviewing correlations is not sufficient). For instance, a weight of 120 kg and a height of 1.50 metres will be an outlier in most populations. When it has been decided that a certain value or combination of values is an outlier and the true value cannot be recovered from the raw data, it needs to be recoded as “missing”.
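As a complement to a scatterplot, the weight and height example can be screened numerically. The sketch below flags records whose body mass index falls outside an assumed plausible range and recodes them as missing. The range of 12 to 50, the field names and the records are hypothetical, and recoding should only follow once the raw data have been checked.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2

def unlikely_combinations(records, plausible_bmi=(12.0, 50.0)):
    """Ids of records whose weight/height combination gives an implausible BMI."""
    low, high = plausible_bmi
    return [r["id"] for r in records
            if not low <= bmi(r["weight_kg"], r["height_m"]) <= high]

def recode_as_missing(records, ids, variables):
    """Return copies of the records with the given variables set to None for ids."""
    return [
        {**r, **{v: None for v in variables}} if r["id"] in ids else dict(r)
        for r in records
    ]

# Hypothetical records; id 2 is the 120 kg / 1.50 m example from the text.
records = [
    {"id": 1, "weight_kg": 70.0, "height_m": 1.75},
    {"id": 2, "weight_kg": 120.0, "height_m": 1.50},
    {"id": 3, "weight_kg": 58.0, "height_m": 1.62},
]
flagged = unlikely_combinations(records)
# Only after the true values proved unrecoverable from the raw data:
cleaned = recode_as_missing(records, set(flagged), ["weight_kg", "height_m"])
```

The original records are left untouched, so the decision to recode remains traceable.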

Missing values
Also carefully review missing values when evaluating the distributions. Often specific codes (e.g. -1 or 9) are used for missing values. Note whether these codes have been defined as missing values. If there are missing values, consider whether these need to be imputed (filled in). There are a number of methods for this; please consult a statistician.
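A minimal sketch of defining missing-value codes per variable is given below. The variable names, the codes and the data are hypothetical. Codes are declared per variable because a value such as 9 may mean "missing" for one variable and be a valid score for another.

```python
def recode_missing(rows, missing_codes):
    """Replace the declared missing-value codes of each variable by None."""
    return [
        {var: (None if val in missing_codes.get(var, ()) else val)
         for var, val in row.items()}
        for row in rows
    ]

def percent_missing(rows, var):
    """Percentage of rows in which var is missing (None)."""
    return 100 * sum(row[var] is None for row in rows) / len(rows)

# Hypothetical coding: 9 = missing marital status, -1 = missing pain score.
MISSING_CODES = {"marital_status": {9}, "pain_score": {-1}}
rows = [
    {"marital_status": 1, "pain_score": 9},   # 9 is a valid pain score here
    {"marital_status": 9, "pain_score": 4},
    {"marital_status": 2, "pain_score": -1},
    {"marital_status": 3, "pain_score": 7},
]
clean = recode_missing(rows, MISSING_CODES)
```

If the codes were not defined as missing, values such as -1 and 9 would silently enter averages and frequencies as if they were real observations.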

Normal distribution
If there is a requirement that the variables are normally distributed for a given analysis, it is advisable to evaluate whether a variable is in fact normally distributed. Graphs can be used for this, such as histograms or Q-Q plots. If it is apparent that the variable is not normally distributed, then a transformation could be considered (for instance a logarithm transformation) to see whether this improves matters.
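Graphs remain the primary tool here, but the effect of a transformation can also be checked numerically. The sketch below computes the sample skewness before and after a logarithm transformation. The data are hypothetical, and a logarithm can only be taken of strictly positive values.

```python
import math

def skewness(values):
    """Sample skewness: third central moment divided by the variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed, strictly positive measurements.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]
logged = [math.log(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)
```

For these made-up values the skewness is clearly closer to zero after the transformation, although a histogram or Q-Q plot of the transformed variable should still be reviewed.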

Distribution of categories
Categories can be combined if the numbers in one or more categories are too small. The need for this is not always evident from an ordinary frequency distribution. However, it can be apparent from a cross-tabulation.
For instance, in a study where there is stratification by gender and education, the cross-tabulation of education by gender shows that for men the lowest category “not completed primary education” rarely occurs, whereas for women the highest category “completed university education” rarely occurs. The lowest and next lowest categories can then be added together, as well as the highest and second highest.
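The education by gender example can be sketched as follows. The category labels, the cell counts and the minimum cell size of 5 are assumptions for illustration only.

```python
from collections import Counter

def crosstab(rows, row_var, col_var):
    """Cell counts for every observed combination of two categorical variables."""
    return Counter((r[row_var], r[col_var]) for r in rows)

def small_cells(table, minimum=5):
    """Observed cells with fewer than `minimum` cases (empty cells are not listed)."""
    return sorted(cell for cell, count in table.items() if count < minimum)

# Hypothetical cell counts, expanded into one row per participant.
cell_counts = {
    ("men", "none"): 1, ("men", "primary"): 8, ("men", "secondary"): 12,
    ("men", "higher"): 9, ("men", "university"): 7,
    ("women", "none"): 6, ("women", "primary"): 10, ("women", "secondary"): 11,
    ("women", "higher"): 8, ("women", "university"): 1,
}
rows = [{"gender": g, "education": e}
        for (g, e), n in cell_counts.items() for _ in range(n)]

# Combine the two lowest and the two highest education categories.
MERGE = {"none": "primary or less", "primary": "primary or less",
         "higher": "higher or university", "university": "higher or university"}
merged_rows = [{**r, "education": MERGE.get(r["education"], r["education"])}
               for r in rows]

before = small_cells(crosstab(rows, "gender", "education"))
after = small_cells(crosstab(merged_rows, "gender", "education"))
```

In this example the overall frequencies of the lowest and highest categories (7 and 8) look acceptable; only the cross-tabulation reveals the two cells with a single case.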

Evaluating the randomisation procedure
In order to evaluate whether the randomisation has been “successful”, the distribution of all the relevant (prognostic) variables needs to be reviewed separately for each treatment arm. Descriptive statistics (percentages, averages, median, standard deviation, range) can be used for this. Differences between groups can be tested (e.g. chi-square or t-test), although it needs to be remembered that due to the randomisation procedure any differences found are, by definition, due to chance.
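A baseline comparison per treatment arm can be sketched as follows. The arm labels, the variable and the ages are hypothetical, and only the descriptive part is shown.

```python
import statistics
from collections import defaultdict

def baseline_by_arm(rows, variable, arm_key="arm"):
    """Descriptive statistics of one continuous variable, per treatment arm."""
    groups = defaultdict(list)
    for row in rows:
        if row[variable] is not None:  # skip missing values
            groups[row[arm_key]].append(row[variable])
    return {
        arm: {
            "n": len(values),
            "mean": statistics.mean(values),
            "sd": statistics.stdev(values),
            "median": statistics.median(values),
            "range": (min(values), max(values)),
        }
        for arm, values in groups.items()
    }

# Hypothetical ages in two treatment arms.
ages = {"A": [54, 61, 58, 49, 66], "B": [57, 60, 52, 63, 55]}
rows = [{"arm": arm, "age": age} for arm, values in ages.items() for age in values]
table = baseline_by_arm(rows, "age")
```

The same function can be applied to each prognostic variable in turn to build up the baseline table.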

Scale scores
Prior to the items in a scale being summed, the way in which the items behave in the sample needs to be evaluated. The first step in this process is a frequency plot of the items in a scale. Usually there are “positive” and “negative” items. It may be necessary to reverse-score the positive or negative items prior to summing the items to a sum score, to ensure all items are scored in the same direction. The items can then be summed, possibly once the response categories have been combined (e.g. “very severe” and “severe”).
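Reverse-scoring and summing can be sketched as follows, assuming a hypothetical four-item scale scored from 1 to 5 in which items q2 and q4 are worded in the opposite direction.

```python
def reverse_score(score, low=1, high=5):
    """Mirror a score within its response range, e.g. 1 <-> 5 and 2 <-> 4."""
    return low + high - score

def sum_score(responses, reversed_items, low=1, high=5):
    """Sum all items after reverse-scoring the items listed in reversed_items."""
    return sum(
        reverse_score(score, low, high) if item in reversed_items else score
        for item, score in responses.items()
    )

# Hypothetical responses of one participant.
responses = {"q1": 4, "q2": 2, "q3": 5, "q4": 1}
REVERSED = {"q2", "q4"}
total = sum_score(responses, REVERSED)
```

Which items need to be reversed, and the response range, should be taken from the scoring manual of the instrument concerned.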
The second step is a reliability and/or principal components analysis. The principal components or factor analysis can be used to explore which items belong to which (sub)scales. Cronbach’s alpha can be used to determine the internal consistency (homogeneity) of the scale. It is advisable to always determine the Cronbach’s alpha for the scales and, if possible, a principal components analysis (refer to the guideline Questionnaires, selecting, translating and validating) to evaluate whether the expected scales are also evident in the data.
If it is apparent from the study that a given item does not fit the scale (e.g. the item-total correlation is too low or it does not load adequately onto the principal component), then it needs to be considered whether this item should be excluded from the sum score. This does, of course, have consequences for the comparability of scores with other studies and should therefore be considered carefully. In general it is not advisable to modify frequently used scales. It is better to use the original scales and report the results found (e.g. low alpha or low item-total correlations) in the discussion of the article.
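Cronbach’s alpha and item-total correlations can be sketched as follows. The scores are hypothetical (six respondents, four items, the fourth deliberately unrelated to the others), and the correlation shown is the "corrected" variant, in which each item is correlated with the sum of the remaining items; this is one common choice, not necessarily the one a given package reports.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(statistics.variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance / statistics.variance(totals))

def pearson(x, y):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def item_rest_correlations(items):
    """Correlation of each item with the sum of the remaining items."""
    result = []
    for i, item in enumerate(items):
        rest = [sum(scores) - scores[i] for scores in zip(*items)]
        result.append(pearson(item, rest))
    return result

# Hypothetical item scores; the fourth item does not follow the others.
items = [
    [1, 2, 3, 4, 5, 5],
    [1, 2, 3, 4, 4, 5],
    [2, 2, 3, 3, 5, 4],
    [3, 5, 1, 4, 2, 3],
]
alpha_all = cronbach_alpha(items)
alpha_without_last = cronbach_alpha(items[:3])
item_rest = item_rest_correlations(items)
```

For these made-up scores alpha is about 0.63 with all four items and about 0.96 without the fourth, whose item-rest correlation is negative. Such a comparison is diagnostic only; whether to exclude an item remains subject to the considerations above.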

V1.1: 1 Jan 2010: English translation.
V1.0: 23 Apr 2007.

    • Has the distribution of all the variables been reviewed?
    • Were there variables with a high percentage of missing values? If so, how were these dealt with?
    • Have outliers been explored? If so, how?
    • Have the cell numbers for central variables been taken into consideration?
    • Where relevant: How were (large) deviations from normality handled?
    • Has it been assessed whether the items belonging to a scale actually fit the scale?