Pilot study



The aim of pilot studies is to explore certain issues before undertaking a large-scale study (1). Pilot studies are rarely just a small version of a larger study; their objectives differ from those of the large-scale study.

Aims of a pilot study
Certain aspects of the large-scale study can be tested out in a pilot study. This may involve testing feasibility in practice or improving the methodological quality of parts of the study. Examples of both are provided below. Occasionally people undertake pilot studies in order to obtain an estimate of the effect size, with a view to deciding whether it is worth the effort to conduct a large-scale study. Pilot studies are not suitable for this purpose, as will be explained below in detail.

Examples of questions about feasibility

  • What is the anticipated number of patients/participants that can be included? Patient inclusion usually turns out to be much lower than the total number of eligible patients, and also much lower than the numbers doctors (and other recruiters) promise to deliver. For instance, in some recent trials approximately 60% of company doctors (2) and physiotherapists (3) who had promised to provide patients had not registered a single patient, and approximately 20% of “recruiters” had recruited only a single patient.
  • How many of the potential patients satisfy the inclusion and exclusion criteria? You may consider relaxing some of the criteria in the large-scale study.
  • Will the eligible participants agree to be randomised for comparison of the interventions? This is especially important for certain populations, for instance, the elderly or (parents of) small children or groups where experience from other studies is lacking.
  • Why do some participants not want to take part, and what are the reasons for dropping out? You might be able to interview these people to understand their reasons (these may be reasons that could easily be overcome in the large-scale study).
  • Are all medical files or information from other data sources retrievable?
  • Have all the necessary details been recorded in the patient records (doctors are sometimes overly positive about this issue), and is the quality of the data adequate?
  • Can all measurements be easily implemented and are they not too bothersome for the research participants?
  • What is the best recruitment method? For instance, is it beneficial to involve the GP, or would this have the opposite effect?
  • Is the intervention appropriate for the intended target group (e.g. language use, type of intervention)?
  • Is it possible to carry out the intervention in the target group with respect to acceptability and compliance?
  • Is it possible to implement the intervention and/or study in the organisation?
  • How much time does the measurement procedure take, and how many times can this be undertaken per day?
  • What is the best way to approach the target group?
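The recruitment questions above come down to simple arithmetic once the pilot has produced counts. The following is a minimal sketch, assuming a pilot that recorded how many patients were screened, how many were eligible and how many consented; the function name `required_screening` and all counts are hypothetical and not taken from the trials cited above. Because the pilot is small, the resulting proportions are themselves uncertain and should be read as a rough indication only.

```python
def required_screening(target_randomised, screened, eligible, consented):
    """Project how many patients must be screened to randomise the target number.

    Assumes the large-scale study will see the same eligibility and consent
    proportions as the pilot (hypothetical counts, illustration only).
    """
    if not (0 < consented <= eligible <= screened):
        raise ValueError("expected 0 < consented <= eligible <= screened")
    eligibility_rate = eligible / screened
    consent_rate = consented / eligible
    # Integer ceiling of target * screened / consented, avoiding float rounding.
    needed = -(-target_randomised * screened // consented)
    return {
        "eligibility_rate": eligibility_rate,
        "consent_rate": consent_rate,
        "patients_to_screen": needed,
    }


# Hypothetical pilot: 200 screened, 80 eligible, 40 consented to randomisation.
projection = required_screening(300, screened=200, eligible=80, consented=40)
print(projection["patients_to_screen"])  # 1500
```

If the projected number of patients to screen exceeds what the participating centres can realistically deliver, that is an argument for reconsidering the inclusion criteria or the recruitment method before the large-scale study starts.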

Examples of questions about methodological quality

  • Is it possible to blind the patients to the type of intervention?
  • Are the measurement instruments valid?
  • Are the measurements reproducible, i.e. are your research assistants performing the measurements in exactly the same way?
  • Are the questionnaires being completed properly? Is the wording of the questions clear and comprehensible?
  • What is the most efficient measurement method for reproducible measurements? (See, for instance, “Observations: Who, when and how often?”.)
  • Is exposure (for instance in a cohort study) being properly measured?

Pilot studies are less suitable for estimates of effect size
This applies both to the efficacy of an intervention and to the strength of association in an observational study. The reason is that pilot studies are too small to yield a reliable estimate of the effect. In other words, the confidence interval around the observed effect size will be very wide, and every value within the confidence interval is a plausible value of the actual effect size. The effect size determined in a pilot study is often used to calculate the number of participants required in a large trial in order for the effect to be statistically significant. This is not justified. The erroneous assumption here is that the effect size found represents the true effect. Beurskens et al. (1) have presented an illustrative example of how misleading a pilot study can be. The correct interpretation is that all values in the confidence interval could represent the actual effect, meaning that when the confidence interval includes the neutral value (0 for differences and 1 for ratios), the effect could be either positive or negative. However, pilot studies can be used to provide an indication of the variance in measurement, that is, the standard deviation, which can be used for the power calculation for the main study.
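The argument can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not a prescribed procedure: it assumes two equally sized groups, a continuous outcome, and a normal approximation (a t-based interval would be even wider for small groups). The observed difference, the standard deviation and the relevant difference `delta` are hypothetical; in particular, `delta` is assumed to be chosen in advance on clinical grounds rather than taken from the pilot.

```python
import math
from statistics import NormalDist


def ci_difference_in_means(diff, sd, n_per_group, level=0.95):
    """Approximate confidence interval for a difference between two group means."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = sd * math.sqrt(2 / n_per_group)
    return diff - z * standard_error, diff + z * standard_error


def sample_size_per_group(sd, delta, alpha=0.05, power=0.80):
    """Participants per group needed to detect a difference `delta` (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Hypothetical pilot: observed difference 5 points, SD 20, 15 patients per group.
low, high = ci_difference_in_means(diff=5, sd=20, n_per_group=15)
print(f"pilot, n=15 per group:  {low:.1f} to {high:.1f}")   # interval includes 0

# The same observed difference in a study with 200 patients per group.
low, high = ci_difference_in_means(diff=5, sd=20, n_per_group=200)
print(f"large, n=200 per group: {low:.1f} to {high:.1f}")   # interval excludes 0

# Power calculation that uses only the SD from the pilot, with a
# pre-specified relevant difference of 10 points (an assumption).
print(sample_size_per_group(sd=20, delta=10))  # 63
```

With 15 patients per group the interval runs from a clearly negative to a clearly positive difference, so the pilot says very little about the size or even the direction of the effect. The standard deviation, by contrast, feeds directly into the sample size formula.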

Organisation: Internal or external pilot studies
External pilot:
Pilot studies may precede larger studies and may include a clear go/no go decision. If the research aim/question of the pilot study is formulated in concrete, measurable terms, then a clear decision can be taken on the basis of the results. The decision may be, for instance, that specific modifications to the study design are required or that the larger study is not feasible.
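Formulating the pilot's aims in measurable terms amounts to writing down criteria and thresholds before the pilot starts. A minimal sketch of such a pre-specified check follows; the criterion names, thresholds and observed values are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One pre-specified feasibility criterion with its threshold."""
    name: str
    observed: float
    threshold: float
    higher_is_better: bool = True

    def met(self) -> bool:
        if self.higher_is_better:
            return self.observed >= self.threshold
        return self.observed <= self.threshold


def go_no_go(criteria):
    """Return the decision and the names of the criteria that were not met."""
    failed = [c.name for c in criteria if not c.met()]
    return ("go" if not failed else "no go"), failed


# Hypothetical criteria fixed before the pilot, with hypothetical pilot results.
criteria = [
    Criterion("patients recruited per month", observed=6, threshold=8),
    Criterion("proportion consenting to randomisation", observed=0.55, threshold=0.50),
    Criterion("proportion lost to follow-up", observed=0.10, threshold=0.20,
              higher_is_better=False),
]
decision, failed = go_no_go(criteria)
print(decision, failed)
```

A "no go" on a single criterion does not have to end the project; it identifies the part of the design (here, recruitment) that needs modification before the larger study begins.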

Internal pilot:
If the pilot study exactly replicates the large study, e.g. a trial, then it may turn out that nothing needs to be altered based on the pilot study. So why shouldn’t the patients in the pilot study be included as the first participants in the larger study? There is nothing to stop you from doing this. However, because you want to add the patients of the pilot study to the larger study, you do need to make sure that the decision not to change the procedures is actually made. It helps if clear agreements about this have been made in advance.

Observations: Who, when and how often?
Suppose you want to measure employees’ physical exertion in a cohort study on “musculoskeletal complaints”. The decision is made to create video recordings, which are reviewed at a later stage by research assistants who score certain parameters. A feasibility study (4) can be carried out in order to judge whether these recordings need to be made and/or scored for all employees (with the same role), whether whole working days (8 hours per day) need to be scored, for how many days in the week, and whether the research assistants are scoring in the same way. A study such as this is aimed at finding the most efficient and reliable way of collecting the data.
All potential variations can be included in the pilot study (e.g. different research assistants, different durations, different days, different employees with the same role). Subsequently, the resulting variance components can be used to determine the most efficient and preferable method to be used in the larger study. This example can, of course, also be translated to other situations that include observations, e.g. observations of patients or their carers, or certain types of child behaviour.
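How variance components guide the design can be sketched with the generalisability coefficient from generalisability theory (4). The example assumes a fully crossed design in which every employee is recorded on every day and scored by every research assistant; the variance components are hypothetical numbers, whereas in practice they would be estimated from the pilot data.

```python
def generalisability_coefficient(var_employee, var_employee_day,
                                 var_employee_rater, var_residual,
                                 n_days, n_raters):
    """Reliability of an employee's mean score over n_days and n_raters.

    Assumes a crossed employee x day x rater design (an assumption of this sketch).
    """
    if n_days < 1 or n_raters < 1:
        raise ValueError("n_days and n_raters must be at least 1")
    # Averaging over more days and raters shrinks each error component.
    error = (var_employee_day / n_days
             + var_employee_rater / n_raters
             + var_residual / (n_days * n_raters))
    return var_employee / (var_employee + error)


# Hypothetical variance components estimated from a pilot study.
components = dict(var_employee=4.0, var_employee_day=2.0,
                  var_employee_rater=0.5, var_residual=3.0)

for n_days, n_raters in [(1, 1), (3, 1), (3, 2), (5, 1)]:
    coefficient = generalisability_coefficient(**components,
                                               n_days=n_days, n_raters=n_raters)
    print(f"{n_days} day(s), {n_raters} rater(s): {coefficient:.2f}")
```

With these made-up components most of the error comes from day-to-day variation, so adding recording days improves reliability more than adding a second research assistant. Different components would lead to a different choice, which is exactly what the pilot is meant to find out.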

  (1) Beurskens AJHM, de Vet HCW, Kant IJ. Dwalingen in de methodologie (Methodological errors). VIII. Pilot onderzoeken: zin en onzin (Pilot studies: sense and nonsense). Ned T Geneesk 1998; 142: 2142-2145.
  (2) Steenstra I. Back pain management in Dutch occupational health care. Thesis VU Amsterdam 2004 (in general discussion).
  (3) Pool J. Neck pain, a pain in the neck? A study on therapeutic modalities and clinimetrics. Thesis VU Amsterdam 2007 (in general discussion).
  (4) Streiner DL, Norman GR. Health measurement scales. Chapter 9: Generalisability theory. 3rd ed, Oxford University Press, 2003.

V1.1:  1 Jan 2010: Translated into English.
V1.0:   14 Feb 2007.