Blinded experiment

A blind — or blinded — experiment is an experiment in which information about the test is masked (kept) from the participant, to reduce or eliminate bias, until after a trial outcome is known.^[1] It is understood that bias may be intentional or unconscious, thus no dishonesty is implied by blinding. If both tester and subject are blinded, the trial is called a double-blind experiment.

Blind testing is used wherever items are to be compared without influences from testers' preferences or expectations, for example in clinical trials to evaluate the effectiveness of medicinal drugs and procedures without placebo effect, observer bias, or conscious deception; and comparative testing of commercial products to objectively assess user preferences without being influenced by branding and other properties not being tested.

Blinding can be imposed on researchers, technicians, or subjects. The opposite of a blind trial is an open trial. Blind experiments are an important tool of the scientific method, in many fields of research—medicine, psychology and the social sciences, natural sciences such as physics and biology, applied sciences such as market research, and many others. In some disciplines, such as medicinal drug testing, blind experiments are considered essential.

In some cases, while blind experiments would be useful, they are impractical or unethical; an example is in the field of developmental psychology: although it would be informative to raise children under arbitrary experimental conditions, such as on a remote island with a fabricated enculturation, it is a violation of ethics and human rights.

The terms blind (adjective) or to blind (transitive verb) when used in this sense are figurative extensions of the literal idea of blindfolding someone. The terms masked or to mask may be used for the same concept; this is commonly the case in ophthalmology, where the word 'blind' is often used in the literal sense.

History

The French Academy of Sciences originated the first recorded blind experiments in 1784: the Academy set up a commission to investigate the claims of animal magnetism proposed by Franz Mesmer. Headed by Benjamin Franklin and Antoine Lavoisier, the commission carried out experiments asking mesmerists to identify objects that had previously been filled with "vital fluid", including trees and flasks of water. The subjects were unable to do so. The commission went on to examine claims involving the curing of "mesmerized" patients. These patients showed signs of improved health, but the commission attributed this to the fact that these patients believed they would get better—the first scientific suggestion of the now well-known placebo effect.^[2]

In 1799 the British chemist Humphry Davy performed another early blind experiment. In studying the effects of nitrous oxide (laughing gas) on human physiology, Davy deliberately did not tell his subjects what concentration of the gas they were breathing, or whether they were breathing ordinary air.^[2]^[3]

Blind experiments went on to be used outside of purely scientific settings. In 1817, a committee of scientists and musicians compared a Stradivarius violin to one with a guitar-like design made by the naval engineer François Chanot. A well-known violinist played each instrument while the committee listened in the next room to avoid prejudice.^[4]^[5]

One of the first essays advocating a blinded approach to experiments in general came from Claude Bernard in the latter half of the 19th century, who recommended splitting any scientific experiment between the theorist who conceives the experiment and a naive (and preferably uneducated) observer who registers the results without foreknowledge of the theory or hypothesis being tested. This suggestion contrasted starkly with the prevalent Enlightenment-era attitude that scientific observation can only be objectively valid when undertaken by a well-educated, informed scientist.^[6]

Double-blind methods came into especial prominence in the mid-20th century.^[7]

Single-blind trials

Single-blind describes experiments where information that could introduce bias or otherwise skew the result is withheld from the participants, but the experimenter will be in full possession of the facts.

In a single-blind experiment, the individual subjects do not know whether they are so-called "test" subjects or members of an "experimental control" group. Single-blind experimental design is used where the experimenters either must know the full facts (for example, when comparing sham to real surgery) and so the experimenters cannot themselves be blind, or where the experimenters will not introduce further bias and so the experimenters need not be blind. However, there is a risk that subjects are influenced by interaction with the researchers – known as the experimenter's bias. Single-blind trials are especially risky in psychology and social science research, where the experimenter has an expectation of what the outcome should be, and may consciously or subconsciously influence the behavior of the subject.

A classic example of a single-blind test is the Pepsi Challenge. A tester, often a marketing person, prepares two sets of cups of cola labeled "A" and "B". One set of cups is filled with Pepsi, while the other is filled with Coca-Cola. The tester knows which soda is in which cup but is not supposed to reveal that information to the subjects. Volunteer subjects are encouraged to try the two cups of soda and polled for which ones they prefer. One of the problems with a single-blind test like this is that the tester can unintentionally give subconscious cues which influence the subjects. In addition, it is possible the tester could intentionally introduce bias by preparing the separate sodas differently (e.g., by putting more ice in one cup or by pushing one cup closer to the subject). If the tester is a marketing person employed by the company which is producing the challenge, there's always the possibility of a conflict of interest where the marketing person is aware that future income will be based on the results of the test.

Double-blind trials

"Double blind" redirects here. It is not to be confused with double bind.

Double-blind describes an especially stringent way of conducting an experiment which attempts to eliminate subjective, unrecognized biases carried by an experiment's subjects (usually human) and conductors. Double-blind studies were first used in 1907 by W. H. R. Rivers and H. N. Webber in the investigation of the effects of caffeine.^[8]

In most cases, double-blind experiments are regarded to achieve a higher standard of scientific rigor than single-blind or non-blind experiments.

In these double-blind experiments, neither the participants nor the researchers know which participants belong to the control group, nor the test group. Only after all data have been recorded (and, in some cases, analyzed) do the researchers learn which participants were which. Performing an experiment in double-blind fashion can greatly lessen the power of preconceived notions or physical cues (e.g., the placebo effect, observer bias, experimenter's bias) to distort the results (by making researchers or participants behave differently from in everyday life). Random assignment of test subjects to the experimental and control groups is a critical part of any double-blind research design. The key that identifies the subjects and which group they belonged to is kept by a third party, and is not revealed to the researchers until the study is over.

Double-blind methods can be applied to any experimental situation in which there is a possibility that the results will be affected by conscious or unconscious bias on the part of researchers, participants, or both. For example, in animal studies, both the carer of the animals and the assessor of the results have to be blinded; otherwise the carer might treat control subjects differently and alter the results.^[9]

Computer-controlled experiments are sometimes also erroneously referred to as double-blind experiments, since software may not cause the type of direct bias between researcher and subject. Development of surveys presented to subjects through computers shows that bias can easily be built into the process. Voting systems are also examples where bias can easily be constructed into an apparently simple machine based system. In analogy to the human researcher described above, the part of the software that provides interaction with the human is presented to the subject as the blinded researcher, while the part of the software that defines the key is the third party. An example is the ABX test, where the human subject has to identify an unknown stimulus X as being either A or B.

Triple-blind trials

A triple-blind study is an extension of the double-blind design; the committee monitoring response variables is not told the identity of the groups. The committee is simply given data for groups A and B. A triple-blind study has the theoretical advantage of allowing the monitoring committee to evaluate the response variable results more objectively. This assumes that appraisal of efficacy and harm, as well as requests for special analyses, may be biased if group identity is known. However, in a trial where the monitoring committee has an ethical responsibility to ensure participant safety, such a design may be counterproductive since in this case monitoring is often guided by the constellation of trends and their directions. In addition, by the time many monitoring committees receive data, often any emergency situation has long passed.^[10]

Use

In medicine

Double-blinding is relatively easy to achieve in drug studies, by formulating the investigational drug and the control (either a placebo or an established drug) to have identical appearance (color, taste, etc.). Patients are randomly assigned to the control or experimental group and given random numbers by a study coordinator, who also encodes the drugs with matching random numbers. Neither the patients nor the researchers monitoring the outcome know which patient is receiving which treatment, until the study is over and the random code is revealed.

Effective blinding can be difficult to achieve where the treatment is notably effective (indeed, studies have been suspended in cases where the tested drug combinations were so effective that it was deemed unethical to continue withholding the findings from the control group, and the general population),^[11]^[12] or where the treatment is very distinctive in taste or has unusual side-effects that allow the researcher and/or the subject to guess which group they were assigned to. It is also difficult to use the double blind method to compare surgical and non-surgical interventions (although sham surgery, involving a simple incision, might be ethically permitted). A good clinical protocol will foresee these potential problems to ensure blinding is as effective as possible. It has also been argued^[13] that even in a double-blind experiment, general attitudes of the experimenter such as skepticism or enthusiasm towards the tested procedure can be subconsciously transferred to the test subjects.

Evidence-based medicine practitioners prefer blinded randomised controlled trials (RCTs), where that is a possible experimental design. These are high on the hierarchy of evidence; only a meta analysis of several well designed RCTs is considered more reliable.^[14]

In physics

Modern nuclear physics and particle physics experiments often involve large numbers of data analysts working together to extract quantitative data from complex datasets. In particular, the analysts want to report accurate systematic error estimates for all of their measurements; this is difficult or impossible if one of the errors is observer bias. To remove this bias, the experimenters devise blind analysis techniques, where the experimental result is hidden from the analysts until they've agreed—based on properties of the data set other than the final value—that the analysis techniques are fixed.

One example of a blind analysis occurs in neutrino experiments, like the Sudbury Neutrino Observatory, where the experimenters wish to report the total number N of neutrinos seen. The experimenters have preexisting expectations about what this number should be, and these expectations must not be allowed to bias the analysis. Therefore, the experimenters are allowed to see an unknown fraction f of the dataset. They use these data to understand the backgrounds, signal-detection efficiencies, detector resolutions, etc.. However, since no one knows the "blinding fraction" f, no one has preexisting expectations about the meaningless neutrino count N' = N × f in the visible data; therefore, the analysis does not introduce any bias into the final number N which is reported. Another blinding scheme is used in B meson analyses in experiments like BaBar and CDF; here, the crucial experimental parameter is a correlation between certain particle energies and decay times—which require an extremely complex and painstaking analysis—and particle charge signs, which are fairly trivial to measure. Analysts are allowed to work with all the energy and decay data, but are forbidden from seeing the sign of the charge, and thus are unable to see the correlation (if any). At the end of the experiment, the correct charge signs are revealed; the analysis software is run once (with no subjective human intervention), and the resulting numbers are published. Searches for rare events, like electron neutrinos in MiniBooNE or proton decay in Super-Kamiokande, require a different class of blinding schemes.

The "hidden" part of the experiment—the fraction f for SNO, the charge-sign database for CDF—is usually called the "blindness box". At the end of the analysis period, one is allowed to "unblind the data" and "open the box".

In forensics

In a police photo lineup, an officer shows a group of photos to a witness or crime victim and asks him or her to pick out the suspect. This is basically a single-blind test of the witness's memory, and may be subject to subtle or overt influence by the officer. There is a growing movement in law enforcement to move to a double-blind procedure in which the officer who shows the photos to the witness does not know which photo is of the suspect.^[15]^[16]

In music

In recruiting musicians to perform in orchestras and so on, blind auditions are now routinely done: the musicians perform behind a screen so that their physical appearance and gender cannot prejudice the listener judging the performance.

References

↑ Oxford English Dictionary, 2nd ed.
1 2 Holmes, Richard (2009). The Age of Wonder: How the Romantic Generation Discovered the Beauty and Terror of Science.
↑ http://www.pharmaclinicalresearch.com/
↑ Fétis, François-Joseph (1868). Biographie Universelle des Musiciens et Bibliographie Générale de la Musique, Tome 1 (Second ed.). Paris: Firmin Didot Frères, Fils, et Cie. p. 249. Retrieved 2011-07-21.
↑ Dubourg, George (1852). The Violin: Some Account of That Leading Instrument and its Most Eminent Professors... (Fourth ed.). London: Robert Cocks and Co. pp. 356–357. Retrieved 2011-07-21.
↑ Daston, Lorraine (2005). "Scientific Error and the Ethos of Belief". Social Research. 72 (1): 18.
↑ Alder, Ken (2006), Kramer, Lloyd S.; Maza, Sarah C., eds., "A Companion to Western Historical Thought", The History of Science, Or, an Oxymoronic Theory of Relativistic Objectivity, Blackwell Companions to History, Wiley-Blackwell, p. 307, ISBN 978-1-4051-4961-7, retrieved 2012-02-11, Shortly after the start of the Cold War [...] double-blind reviews became the norm for conducting scientific medical research, as well as the means by which peers evaluated scholarship, both in science and in history.
↑ Rivers WHR and Webber HN. “The action of caffeine on the capacity for muscular work” Journal of Physiology 36: 33-47: 1907.
↑ Aviva Petrie; Paul Watson (28 February 2013). Statistics for Veterinary and Animal Science. Wiley. pp. 130–131. ISBN 978-1-118-56740-1.
↑ Friedman, L.M., Furberg, C.D., DeMets, D.L. (2010). Fundamentals of Clinical Trials. New York: Springer, pp. 119-132. ISBN 9781441915856
↑ "Male circumcision 'cuts' HIV risk". BBC News. 2006-12-13. Retrieved 2009-05-18.
↑ McNeil Jr, Donald G. (2006-12-13). "Circumcision Reduces Risk of AIDS, Study Finds". The New York Times. Retrieved 2009-05-18.
↑ "Skeptical Comment About Double-Blind Trials". The Journal of Alternative and Complementary Medicine. Retrieved 2010-05-04.
↑
↑ Psychological sleuths – Accuracy and the accused on apa.org
↑ Under the Microscope – For more than 90 years, forensic science has been a cornerstone of criminal law. Critics and judges now ask whether it can be trusted.

External links

"control group study" – The Skeptic’s Dictionary – More on why double blind is important.
PharmaSchool JargonBuster Clinical Trial Terminology Dictionary

Clinical research and experimental design

Overview	Clinical trial Trial protocols Adaptive clinical trial Academic clinical trials Clinical study design

Controlled study (EBM I to II-1; A to B)	Randomized controlled trial Scientific experiment Blind experiment Open-label trial

Observational study (EBM II-2 to II-3; B to C)	Cross-sectional study vs. Longitudinal study, Ecological study Cohort study Retrospective Prospective Case-control study (Nested case-control study) Case series Case study Case report

Epidemiology/ methods	occurrence: Incidence (Cumulative incidence) Prevalence Point Period association: absolute (Absolute risk reduction, Attributable risk, Attributable risk percent) relative (Relative risk, Odds ratio, Hazard ratio) other: Clinical endpoint Virulence Infectivity Mortality rate Morbidity Case fatality rate Specificity and sensitivity Likelihood-ratios Pre/post-test probability

Trial/test types	In vitro In vivo Animal testing Animal testing on non-human primates First-in-man study Multicenter trial Seeding trial Vaccine trial

Analysis of clinical trials	Risk–benefit ratio Systematic review Replication Meta-analysis

Interpretation of results	Selection bias Survivorship bias Correlation does not imply causation Null result

Category Glossary List of topics

Statistics

Descriptive statistics

Continuous data

Center	Mean arithmetic geometric harmonic Median Mode

Dispersion	Variance Standard deviation Coefficient of variation Percentile Range Interquartile range

Shape	Moments Skewness Kurtosis L-moments

Count data

Index of dispersion

Summary tables

Dependence

Graphics

Data collection

Study design	Population Statistic Effect size Statistical power Sample size determination Missing data

Survey methodology	Sampling Standard error stratified cluster Opinion poll Questionnaire

Controlled experiments	Design control optimal Controlled trial Randomized Random assignment Replication Blocking Interaction Factorial experiment

Uncontrolled studies	Observational study Natural experiment Quasi-experiment

Statistical inference

Statistical theory

Frequentist inference

Point estimation	Estimating equations Maximum likelihood Method of moments M-estimator Minimum distance Unbiased estimators Mean-unbiased minimum-variance Rao–Blackwellization Lehmann–Scheffé theorem Median unbiased Plug-in

Interval estimation	Confidence interval Pivot Likelihood interval Prediction interval Tolerance interval Resampling Bootstrap Jackknife

Testing hypotheses	1- & 2-tails Power Uniformly most powerful test Permutation test Randomization test Multiple comparisons

Parametric tests	Likelihood-ratio Wald Score

Specific tests

Z (normal) Student's t-test F

Goodness of fit	Chi-squared Kolmogorov–Smirnov Anderson–Darling Normality (Shapiro–Wilk) Likelihood-ratio test Model selection Cross validation AIC BIC

Rank statistics	Sign Sample median Signed rank (Wilcoxon) Hodges–Lehmann estimator Rank sum (Mann–Whitney) Nonparametric anova 1-way (Kruskal–Wallis) 2-way (Friedman) Ordered alternative (Jonckheere–Terpstra)

Bayesian inference

Correlation	Pearson product–moment Partial correlation Confounding variable Coefficient of determination

Regression analysis	Errors and residuals Regression model validation Mixed effects models Simultaneous equations models Multivariate adaptive regression splines (MARS)

Linear regression	Simple linear regression Ordinary least squares General linear model Bayesian regression

Non-standard predictors	Nonlinear regression Nonparametric Semiparametric Isotonic Robust Heteroscedasticity Homoscedasticity

Generalized linear model	Exponential families Logistic (Bernoulli) / Binomial / Poisson regressions

Partition of variance	Analysis of variance (ANOVA, anova) Analysis of covariance Multivariate ANOVA Degrees of freedom

Categorical / Multivariate / Time-series / Survival analysis

Categorical

Multivariate

Time-series

General	Decomposition Trend Stationarity Seasonal adjustment Exponential smoothing Cointegration Structural break Granger causality

Specific tests	Dickey–Fuller Johansen Q-statistic (Ljung–Box) Durbin–Watson Breusch–Godfrey

Time domain	Autocorrelation (ACF) partial (PACF) Cross-correlation (XCF) ARMA model ARIMA model (Box–Jenkins) Autoregressive conditional heteroskedasticity (ARCH) Vector autoregression (VAR)

Frequency domain	Spectral density estimation Fourier analysis Wavelet

Survival

Survival function	Kaplan–Meier estimator (product limit) Proportional hazards models Accelerated failure time (AFT) model First hitting time

Hazard function	Nelson–Aalen estimator

Test	Log-rank test

Applications

Biostatistics	Bioinformatics Clinical trials / studies Epidemiology Medical statistics

Engineering statistics	Chemometrics Methods engineering Probabilistic design Process / quality control Reliability System identification

Social statistics	Actuarial science Census Crime statistics Demography Econometrics National accounts Official statistics Population statistics Psychometrics

Spatial statistics	Cartography Environmental statistics Geographic information system Geostatistics Kriging

Category
Portal
Commons
WikiProject

Design of experiments

Scientific method	Scientific experiment Statistical design Control Internal and external validity Experimental unit Blinding Optimal design: Bayesian Random assignment Randomization Restricted randomization Replication versus subsampling Sample size

Treatment and blocking	Treatment Effect size Contrast Interaction Confounding Orthogonality Blocking Covariate Nuisance variable

Models and inference	Linear regression Ordinary least squares Bayesian Random effect Mixed model Hierarchical model: Bayesian Analysis of variance (Anova) Cochran's theorem Manova (multivariate) Ancova (covariance) Compare means Multiple comparison

Designs Completely randomized	Factorial Fractional factorial Plackett-Burman Taguchi Response surface methodology Polynomial and rational modeling Box-Behnken Central composite Block Generalized randomized block design (GRBD) Latin square Graeco-Latin square Orthogonal array Latin hypercube Repeated measures design Crossover study Randomized controlled trial Sequential analysis Sequential probability ratio test

Glossary Category Statistics portal Statistical outline Statistical topics

This article is issued from Wikipedia - version of the 11/13/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.