Researchers from the University of Kentucky College of Medicine have found a statistical issue that may result in the misinterpretation of research findings.
Their work, “Balancing after randomization in orthopedic trials: Are we even or even paying attention?” was published in the November 2022 edition of the Journal of Orthopaedic Research.
When OTW asked David Landy, M.D., an orthopedic surgeon and co-author, what led to this study, he stated, “Anecdotally, we would see readers and sometimes researchers say groups were balanced because the P values comparing variables across them were not significant. Sometimes this was said even when there was a modest size difference or imbalance in an important variable. We were curious whether this was a common issue and wondered how often people were ignoring imbalances after randomization in orthopedic surgery research just because the differences were not statistically significant.”
The researchers identified all randomized controlled trials (RCTs) published between July 2019 and June 2020 in four leading orthopedic journals (Am J Sports Med, J Bone Joint Surg, Bone Joint J, and Clin Orthop Relat Res).
They looked at whether the articles contained a discussion of balancing, meaning a statement that the groups “appeared similar, were comparable, or which noted differences.”
“Of 86 orthopedic RCTs [randomized controlled trials] reviewed,” wrote the authors, “59 (69%) assessed balancing and 50 (58%) used statistical significance testing to compare baseline characteristics. Of 74 articles specifying a primary outcome, 33 (45%) used a PROM [patient-reported outcome measure] with 23 (70%) reporting baseline PROM values. Of these articles, 17 (74%) had a difference of less than 0.25 standard deviations (SDs) between groups, 4 (17%) had a difference of between 0.25 and 0.50 SDs, and 3 (13%) had a difference greater than 0.5 SDs.”
Dr. Landy told OTW, “Orthopaedic research too frequently relies on P values to assess balancing after randomization even though these P values do not inform readers of either the magnitude of the difference or the importance of the variable which are the two pieces of information needed to understand an imbalance.”
“We were pleasantly surprised that over half of articles commented on whether the groups appeared balanced after randomization. With that said, our other findings were unfortunately not surprising including an overreliance on P values to assess balancing and that smaller trials, which are common in orthopaedics, were more vulnerable to imbalance.”
“Assessing balancing after randomization requires significant attention and is still relatively subjective. We need to figure out how best to help readers home in on and understand imbalances without overburdening them.”
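For readers who want to see what a baseline difference expressed in standard deviations looks like in practice, here is a minimal sketch in Python. It is an illustration only, not the authors' method: the scores are made up, the function names are hypothetical, and dividing by the pooled standard deviation is an assumed (though common) way to standardize a difference. The bands mirror the thresholds quoted from the paper.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_difference(group_a, group_b):
    """Absolute difference in means, in units of the pooled standard deviation.

    Assumption: the pooled SD is used as the denominator; the study
    itself may have standardized differently.
    """
    n_a, n_b = len(group_a), len(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return abs(mean(group_a) - mean(group_b)) / pooled_sd


def imbalance_band(smd):
    """Place a standardized difference into the bands quoted in the article."""
    if smd < 0.25:
        return "less than 0.25 SDs"
    if smd <= 0.50:
        return "between 0.25 and 0.50 SDs"
    return "greater than 0.5 SDs"


# Hypothetical baseline PROM scores for two small randomized groups.
treatment = [40, 45, 50, 55, 60]
control = [43, 48, 53, 58, 63]

smd = standardized_difference(treatment, control)
print(f"Standardized difference: {smd:.2f} ({imbalance_band(smd)})")
```

With groups this small, a significance test on the same made-up scores would not flag the gap, yet the standardized difference still shows how large the gap is relative to the spread of the scores. Whether that gap matters depends on how important the variable is to the outcome, which no calculation can decide.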

