Results paragraph: The obsession with P
We first took random text samples from the Results sections and figure captions of 18 articles in 5 journals (Nature, Neuron, Gastroenterology, Biological Psychiatry, British Journal of Pharmacology). The articles and journals were chosen at random, and a total of 2868 words were used. The word-cloud generator automatically excluded words that occurred very rarely.
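For readers who want to reproduce this kind of count themselves, here is a minimal sketch in Python of the counting step behind a word cloud. It is not how the word generator works internally, which we do not know: the tokenisation rule, the `min_count` threshold and the sample sentence are all assumptions made up for illustration.

```python
import re
from collections import Counter


def word_frequencies(text, min_count=2):
    """Count case-sensitive tokens and drop those seen fewer than min_count times.

    min_count is an assumed threshold, not the generator's real cut-off.
    """
    # Keep dots, hyphens and apostrophes inside a token so that
    # "0.05", "t-test" and "Student's" each survive as a single word.
    tokens = re.findall(r"[A-Za-z0-9]+(?:[.\-'][A-Za-z0-9]+)*", text)
    counts = Counter(tokens)
    return {word: n for word, n in counts.items() if n >= min_count}


# Invented sample text, not taken from any of the sampled articles.
sample = (
    "Differences were significant (P < 0.05, t-test). "
    "Data are mean ± SEM; P < 0.05 was considered significant (ANOVA)."
)

freqs = word_frequencies(sample, min_count=2)
print(freqs)
```

Counting is case-sensitive on purpose, so that "p" and "P" stay separate words, as they do in the cloud.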
There is no need to take out your old ruler and measure the size of the words. It is clear from this cloud that researchers in biology frequently use "p", "P" and "0.05" and often refer to "significant" results. This points to an apparent collective obsession with p-values below 0.05 as the sole readout of null hypothesis significance testing (NHST). But p-values are not the only recurrent preoccupation highlighted in this cloud. The term "SEM" (Standard Error of the Mean) prevails over other error bars. Note also the apparent monopoly of "t-test" over the other visible tests, namely "ANOVA" and the non-parametric "Mann-Whitney" (we’ll let you find all these words in this stats version of Where's Wally!).
Statistics paragraph: Journal specificities
We are aware that not all statistical tests are cited in the Results sections; many appear only in the Statistics paragraphs of the Methods sections. Therefore, we also created the following word clouds, each corresponding to an individual journal and built from the Statistics paragraphs of 4 articles (small/simple words not related to statistics were excluded).
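The exclusion of small/simple words and the grouping by journal could be sketched as follows. This is only an illustration under our own assumptions: the stop list, the function name `journal_term_counts`, the placeholder journal names and the snippets are all hypothetical, and the words actually excluded from the clouds are not listed here.

```python
from collections import Counter

# Hypothetical stop list; the words really excluded are not given in the text.
STOP_WORDS = {"the", "a", "of", "was", "were", "and", "to", "in", "by", "with", "for", "as"}


def journal_term_counts(paragraphs_by_journal, stop_words=STOP_WORDS):
    """Return one Counter per journal, skipping stop words (matched case-insensitively)."""
    result = {}
    for journal, paragraphs in paragraphs_by_journal.items():
        counts = Counter()
        for paragraph in paragraphs:
            for raw in paragraph.split():
                # Strip surrounding punctuation but keep internal hyphens and apostrophes.
                word = raw.strip(".,;:()")
                if word and word.lower() not in stop_words:
                    counts[word] += 1
        result[journal] = counts
    return result


# Invented snippets standing in for real Statistics paragraphs.
paragraphs = {
    "Journal A": ["Student's t-test was used.", "ANOVA followed by Student's t-test."],
    "Journal B": ["Mann-Whitney and Kruskal-Wallis tests were used."],
}

clouds = journal_term_counts(paragraphs)
for journal, counts in clouds.items():
    print(journal, counts.most_common(3))
```

Keeping one counter per journal is what allows each cloud to be read on its own and then compared with the others.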
Nature and Science:
In Nature and Science, parametric tests (Student's t-tests and ANOVA) strikingly dominate non-parametric tests, with "log-rank", "Wilcoxon" and "Mann-Whitney" written in very small characters. Once again, the terms "significant", "significance", "p-values" (especially in Science), "0.05" and "Student's t-tests" are the most prominent.
Neuron:
Many of the comments made above also apply here, for example regarding "Student's t-tests", "SEM", "significance" or "0.05". However, the Neuron cloud is characterised by a wealth of new statistical words. Many tests (parametric or non-parametric) are present, such as "Kruskal-Wallis" or "Fisher's", to cite a few.
British Journal of Pharmacology:
In doing this, we were not aiming to carry out a scientific demonstration of the misuse of biostatistics. Our sample of a few articles from a handful of journals is too small for proper descriptive statistics, and no real quantitative investigation was performed. In addition, the clouds are based only on disclosed information, which makes flaws of non-disclosure invisible. This latter point is critical because a comprehensive, quantitative study that we are currently conducting on hundreds of articles shows that the absence of disclosure is a major flaw in biomedical publishing. Nevertheless, some intriguing features emerged from our clouds, and many of the words highlighted here tend to be in line with the results we obtained in our quantitative studies. In particular, the quest for "significant" "p-values" below the "0.05" threshold using parametric "Student's t-tests" or "ANOVA" is pervasive, even when parametric assumptions are not respected.
The Biotelligences team
All word clouds were generated on www.wordle.net