How ‘Fragile’ Are P-Values? More Than We Realize, Apparently

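Peer-reviewed, published studies typically report whether their findings are statistically significant, and that is usually judged by the p value: the probability of seeing a result at least as extreme as the one observed if there were no real effect. By convention, a p value below 0.05 counts as significant.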

If a study does not clear the magic p < 0.05 threshold, it will likely not make the cut for publication in a peer-reviewed journal.

But, asked a group of intrepid researchers, how robust are those peer-reviewed, published p values, specifically in the hip arthroscopy literature?

Their detailed answers, which were derived from a multicenter systematic review, can be found in the published paper “The Fragility of Significance in the Hip Arthroscopy Literature: A Systematic Review,” which appears in the October-December 2021 edition of The Journal of Bone and Joint Surgery.

Robert Parisien, M.D., an orthopedic surgeon at Mount Sinai Health System in New York City and co-author on this work, the first of its kind, told OTW, “The field of hip preservation has picked up rapidly over the years, since Ganz first coined the term femoroacetabular impingement (FAI) in 2003. This has also been fueled by new minimally invasive arthroscopic techniques and literature supporting the natural history of untreated symptomatic FAI as leading to labral and chondral damage and ultimately osteoarthritis of the hip.”

Why Data Quality Matters

“Furthermore, as in any field of investigation, the quality of data and reporting standards are of utmost importance. With hip arthroscopy as a burgeoning clinical field and area of research, we thought it wise to analyze the statistical stability of this body of literature.”

“Simultaneously, fragility analysis research has become increasingly utilized, with most evaluations generated by this research group, to evaluate the statistical robustness of study findings. With a standardized threshold p value of 0.05, fragility analysis seeks to determine the true robustness of that statistical finding. This is important as statistically significant outcome events with p values < 0.05 inform clinical decision-making. As such, fragility analysis provides two additional data points, the Fragility Index and the Fragility Quotient, in addition to the p value to better inform clinical decision-making.”

The Results of the Study

Dr. Parisien and his team evaluated 150 outcome events across 52 comparative trials in order to calculate the Fragility Index and Fragility Quotient (FQ) behind the reported p values. What they found, according to Dr. Parisien, is that the fragility analysis “demonstrated an overall Fragility Index of just 3.5 and FQ of 0.032. Meaning, a change in just 3.5 outcome events, or 3.2 out of 100 patients, is all that is required to change study significance.”

“Although threshold Fragility Index and the Fragility Quotient values have yet to be determined, this analysis demonstrates that statistical significance hinges on only a few outcome events and therefore lacks statistical robustness.”
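To make the two measures concrete, here is a minimal Python sketch of how a Fragility Index and Fragility Quotient could be computed for a single two-group outcome. It rests on assumptions that the article does not confirm: that significance is tested with a two-sided Fisher exact test, that outcomes are flipped from non-event to event in the group with fewer events until p reaches 0.05 or higher, and that the Fragility Quotient is the Fragility Index divided by the total sample size. The function names and the trial numbers (8 of 10 events versus 1 of 6) are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalised hypergeometric weight of x events in the first group,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more likely than the observed one.
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= observed)
    return extreme / comb(row1 + row2, col1)


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Count the outcome flips needed before p rises to alpha or above.

    Returns None when the result is not significant to begin with, or when
    the modified group runs out of non-events first (assumed convention).
    """
    # Order the groups so that `few` belongs to the group with fewer events.
    (few, n_few), (many, n_many) = sorted([(events_1, n_1), (events_2, n_2)])
    if fisher_exact_p(few, n_few - few, many, n_many - many) >= alpha:
        return None
    flips = 0
    while fisher_exact_p(few, n_few - few, many, n_many - many) < alpha:
        if few == n_few:
            return None
        few += 1  # flip one patient from non-event to event
        flips += 1
    return flips


def fragility_quotient(index, total_patients):
    """Fragility Index expressed as a fraction of the total sample size."""
    return index / total_patients


# Hypothetical trial: 8 of 10 events in one arm, 1 of 6 in the other.
fi = fragility_index(8, 10, 1, 6)
print(fi, fragility_quotient(fi, 16))  # prints: 1 0.0625
```

In this made-up example the initial p value is about 0.035, and changing the outcome of a single patient pushes it to about 0.118, so one patient is enough to erase statistical significance.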

“In a sub-analysis, randomized controlled trials proved to be more statistically stable than non-randomized controlled trials, with a Fragility Index of 6 versus 3, respectively. Furthermore, 42.5% of studies either failed to report lost-to-follow-up data or reported lost to follow-up greater than the overall Fragility Index, demonstrating a lack of standardization of lost-to-follow-up reporting in the hip arthroscopy literature and the possibility of a reversal of significance simply by maintaining follow-up.”

What Is the Value of a P Value?

“Fragility analysis with the inclusion of Fragility Index and the Fragility Quotient values provides additional objective data to better inform clinical decision-making.” Such objective data are, by definition, supposed to form the foundation of evidence-based medicine but, as OTW has noted in prior articles, are also subject to systemic bias.

“For example, p < 0.05 may represent a statistically significant finding but, with the addition of Fragility Index and the Fragility Quotient values, the clinician is now armed with more objective data to determine if that finding is robust enough for them to possibly make a change in their clinical practice. Fragility analyses help to provide a greater context to statistical findings, thus allowing clinicians to make a more informed clinical decision, with the intent to improve the quality of care delivered to each and every patient.”

For more information, see also: Dump the P-Value; American Statistical Association Issues P-Value Warning; and Systemic Bias in Clinical Research.
