Abstract
Objective
To evaluate the correlation and agreement between the fracture risk assessment tool (FRAX) and the Garvan fracture risk calculator (GFRC) in estimating the 10-year hip fracture risk in postmenopausal Turkish women with osteoporosis.
Materials and Methods
A retrospective cross-sectional study was conducted at Üsküdar State Hospital. Medical records of 347 postmenopausal women aged between 50 and 90 years were analyzed. Data on clinical risk factors were collected, and fracture probabilities were calculated using the FRAX and GFRC tools. Spearman’s rank correlation test was used to assess correlation, the Wilcoxon signed-rank test was used to compare the paired scores, and the intraclass correlation coefficient (ICC) was calculated using a two-way mixed effects model (ICC3) to evaluate agreement.
Results
FRAX 10-year hip fracture risk scores were significantly higher than GFRC 10-year hip fracture scores (p<0.001). The Spearman’s correlation between FRAX and GFRC 10-year hip fracture risk scores was found to be strong (r=0.821, p<0.001). The ICC3 value was 0.054 [95% confidence interval (0.02, 0.11)].
Conclusion
In its current form, GFRC should be used complementarily rather than interchangeably for fracture risk prediction in postmenopausal Turkish women and should be reserved for specific patient populations. The results underline the need for population-specific calibrations for GFRC to improve predictive accuracy and clinical utility.
Introduction
Osteoporosis is a bone disorder that develops due to the loss of bone mass and the alteration of bone structure, resulting in more fragile and fracture-prone bones (1). The disease affects more than 200 million people and concerns patients, health professionals, and insurers alike because of its clinical consequences and economic costs (2). Moreover, the burden of the disease extends to increased healthcare expenditures, a decline in productivity, and reduced quality of life for the affected population, primarily the elderly (3). This growing challenge, which is expected to worsen as the population ages, needs to be addressed by policymakers (4).
Osteoporotic fractures, the most severe manifestations of osteoporosis, include hip, spine, and wrist fractures, each of which causes considerable morbidity, mortality, and healthcare costs (5). Among them, hip fractures are the most important and usually result in long-term disability and higher mortality rates. The assessment of fracture risk is an essential part of preventive healthcare strategies, and predictive algorithms are of paramount importance in the early detection of individuals who are at risk (6).
The fracture risk assessment tool (FRAX) is one widely used algorithm designed to estimate the 10-year probability of fractures based on clinical risk factors and bone mineral density (BMD) (7). FRAX’s utility lies in its integration of patient-specific risk factors with global epidemiological data, enabling clinicians to make informed treatment decisions. However, its reliability varies by population, emphasizing the need for validation studies across diverse demographics (8). Although FRAXplus holds great promise for the future, it is currently in beta testing and there is a risk that its widespread use may be limited by its cost (9).
The Garvan fracture risk calculator (GFRC), another predictive tool, includes additional variables such as fall history to enhance fracture risk assessment. While it offers a broader risk profile, the literature suggests that its clinical application requires further validation to confirm its generalizability and accuracy across different cultural settings (10).
Evaluating the correlation and agreement between different fracture prediction methods, such as FRAX and GFRC, within specific cultural contexts is essential for improving the precision of osteoporosis management globally. Such studies have the potential to refine risk stratification and ensure the appropriateness of interventions tailored to specific populations (11). In light of this information, the aim of this study was to investigate the agreement and correlation levels of FRAX and GFRC in the Turkish population. To the best of our knowledge, this is the first study to compare the FRAX and GFRC methods among the specific population of postmenopausal Turkish women with osteoporosis.
Materials and Methods
This retrospective cross-sectional study was conducted at Üsküdar State Hospital. The study was conducted in accordance with the Declaration of Helsinki and was approved by the University of Health Sciences Türkiye, Zeynep Kamil Women and Children Diseases Training and Research Hospital Ethics Committee (approval no: 11, date: 11.01.2023).
Patients under the age of 50 were excluded because the GFRC cannot be calculated for them, and patients above the age of 90 were excluded because the FRAX score cannot be calculated for them. Due to potential differences in technicians and equipment, T-score and BMD were not used in the risk calculations. The medical records of postmenopausal female patients diagnosed with osteoporosis and followed up between September 1, 2023 and September 1, 2024, were reviewed. The data collected from the patient files included age, height, weight, current smoking status, high alcohol consumption (≥3 units/day), history of previous fragility fractures, number of falls in the past year, parental history of hip fracture, current or past use of glucocorticoids, presence of rheumatoid arthritis, type 1 diabetes, untreated long-standing hyperthyroidism, hypogonadism or premature menopause, chronic malnutrition or malabsorption, and chronic liver disease.
Before the calculations, the browser was operated in incognito mode to mitigate potential interference and bias. FRAX score computations were performed using the online tool at https://frax.shef.ac.uk/FRAX/tool.aspx?lang=tu, and the 10-year probability of hip fracture scores were recorded. Similarly, GFRC calculations were carried out via the website https://fractureriskcalculator.com.au/calculator/, and the 10-year risk of hip fracture scores were documented. Because differences in time horizon and in the fracture sites covered could adversely affect the analysis, the 10-year probability of major osteoporotic fracture from FRAX and the 5-year risk scores and the risk of any fracture from GFRC were excluded from the study. The browser was restarted for each calculation to ensure consistency and avoid bias.
Statistical Analysis
Quantitative variables were summarized using measures of central tendency and dispersion (mean ± standard deviation). The Wilcoxon signed-rank test was used to compare the distribution of continuous variables between two dependent groups with non-normally distributed data. The Shapiro-Wilk test was used to determine the normality of the distribution. Spearman’s rank correlation test was used to investigate a possible correlation. A correlation between 0.10 and 0.39 is considered weak, between 0.40 and 0.69 is considered moderate, between 0.70 and 0.89 is considered strong, and between 0.90 and 1 is considered very strong (12). For intraclass correlation coefficient (ICC) values, values less than 0.5 are considered indicative of poor reliability, values between 0.5 and 0.75 are indicative of moderate reliability, values between 0.75 and 0.9 are indicative of good reliability, and values greater than 0.90 are indicative of excellent reliability (13). A significance level of 0.05 was set for all analyses. Statistical analyses were conducted using the IBM SPSS (Statistical Package for the Social Sciences, Version 27.0, Armonk, NY, IBM Corp.) software package and Python version 3.11.10 (Python Software Foundation), utilizing the “pandas”, “pyreadstat” and “pingouin” libraries for statistical analysis. The “Plotly” library was used for visualizations.
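The interpretive bands quoted above can be written down as two small helper functions. The following is only a minimal sketch, not the code used in the study: the function names are hypothetical, the label for correlations below 0.10 is an assumption (the text gives none), and values falling in the gaps between the quoted bands (for example 0.395) are assigned to the lower band by assumption.

```python
def classify_correlation(r):
    """Label a correlation coefficient using the bands quoted in the text (12)."""
    magnitude = abs(r)  # the bands are applied to the absolute value
    if magnitude >= 0.90:
        return "very strong"
    if magnitude >= 0.70:
        return "strong"
    if magnitude >= 0.40:
        return "moderate"
    if magnitude >= 0.10:
        return "weak"
    return "negligible"  # assumption: the text does not name this band


def classify_icc(icc):
    """Label an ICC value using the reliability bands quoted in the text (13)."""
    if icc > 0.90:
        return "excellent"
    if icc >= 0.75:  # assumption: boundary values go to the higher band
        return "good"
    if icc >= 0.50:
        return "moderate"
    return "poor"


# The two headline values reported in this study
print(classify_correlation(0.821))  # strong
print(classify_icc(0.054))          # poor
```

Applied to the values reported in this study, the sketch labels the Spearman coefficient as strong and the ICC3 value as poor.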
Results
A total of 347 patients were included in this study. Table 1 and Table 2 summarize the demographic and disease-specific data of the participants.
The Shapiro-Wilk test results demonstrated that FRAX 10-year probability of hip fracture and GFRC 10-year risk of hip osteoporotic fracture scores deviated from a normal distribution (p<0.001, p<0.001, respectively).
Figure 1 visually compares the distribution of FRAX 10-year probability of hip fracture and GFRC 10-year risk of hip osteoporotic fracture scores, with corresponding kernel density estimates providing a smoothed representation of the data. The results of the Wilcoxon signed-rank test indicated a statistically significant difference between the two measures, with FRAX scores being significantly higher (p<0.001).
Figure 2 presents a scatterplot illustrating the relationship between FRAX 10-year probability of hip fracture and GFRC 10-year risk of hip osteoporotic fracture scores, with a locally weighted scatterplot smoothing (LOWESS) curve applied to enhance visualization. The Spearman rank correlation analysis demonstrated a strong positive correlation between the two risk scores (r=0.821, p<0.001). The LOWESS curve revealed a non-linear relationship and variation across different score ranges.
An ICC calculation using a two-way mixed effects model (ICC3) was performed to assess the agreement between the FRAX 10-year probability of hip fracture scores and the GFRC 10-year risk of hip osteoporotic fracture scores. The ICC3 value was determined to be 0.054 [95% confidence interval (0.02, 0.11)] (p<0.001).
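A strong rank correlation and a very low ICC3 can coexist, because Spearman’s coefficient depends only on the ordering of the scores, whereas ICC3 depends on how closely the values themselves track each other. The sketch below illustrates this with the standard library only. It is not the analysis code of this study (which used SPSS and the “pingouin” library); the function names are hypothetical, the ICC3 formula is the usual single-rater consistency form derived from a two-way ANOVA, and the score lists are synthetic values invented purely for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def icc3(ratings):
    """ICC3 (two-way mixed, single measurement, consistency).

    ratings: one row per subject, one column per tool.
    ICC3 = (MS_rows - MS_error) / (MS_rows + (k - 1) * MS_error)
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


# Synthetic risk percentages, NOT study data: the second tool is a
# monotone but non-linear transform of the first, so ordering is preserved.
lower_scores = [0.2, 0.4, 0.7, 1.1, 1.6, 2.4, 3.5, 5.0]
higher_scores = [3 * s * s + 1 for s in lower_scores]

rho = spearman(higher_scores, lower_scores)
icc = icc3(list(zip(higher_scores, lower_scores)))
print(f"Spearman rho = {rho:.3f}, ICC3 = {icc:.3f}")
```

In this toy example both tools rank every subject identically, yet the ICC3 falls in the poor band because the two sets of scores differ in scale and spread. Note that ICC3 measures consistency, so a purely constant offset between two tools would not lower it by itself.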
Discussion
The data from the current study revealed a strong rank-order correlation between the FRAX and GFRC instruments in postmenopausal Turkish women with osteoporosis, while the agreement between the fracture risks estimated by the two instruments was poor. In addition, GFRC yielded lower scores than FRAX. Based on these findings, the Garvan fracture risk score should be used with the utmost care when determining fracture risk in this group of women, only for those who show specific characteristics, such as an increased risk of falls, and as a supplementary measure.
The FRAX tool is an important instrument for assessing fracture risk in the Turkish population, as it predicts the 10-year probability of major osteoporotic and hip fractures using clinical risk factors, with or without BMD inputs. It is beneficial for risk stratification and cost-efficient population screening, especially in settings where diagnostic tools like DEXA scans are not widely available (14). A study on Turkish postmenopausal women with osteopenia demonstrated moderate agreement between FRAX predictions made with and without BMD, showcasing its utility even in resource-constrained environments (15). Additionally, FRAX models are calibrated to country-specific fracture and mortality rates, ensuring their relevance to local populations like Türkiye (16). Globally, FRAX is a validated and widely recognized reference tool, often incorporated into clinical guidelines to guide treatment strategies (17). It provides a robust framework for assessing fracture risk and planning prevention strategies for the Turkish population.
On the other hand, the GFRC is another vital instrument for detecting individual fracture risk since, unlike FRAX, it incorporates clinical risk factors such as fall history and the number of prior fractures. Because of this, the GFRC is especially helpful for evaluating patients who are vulnerable to falls, like the elderly, in whom FRAX may be less accurate (18). Also, the literature suggests that GFRC may be more advantageous than FRAX in some cases because of its flexibility, as it offers both a 10-year and a 5-year risk assessment, which can be a plus for individuals (19). GFRC has also been reported in the literature to be more sensitive than FRAX in predicting future fractures in people with a fracture history (20). In such cases, GFRC might be a better option for the “any fracture” prediction, even if it is usually better for predicting hip fractures (21). By addressing specific limitations of FRAX, such as the omission of fall risk, and by offering personalized fracture predictions, GFRC stands as a valuable alternative for population subgroups in fracture risk evaluations.
The discrepancy between the fracture risk estimates of the FRAX and GFRC tools among Turkish postmenopausal women is in line with earlier research. However, contrary to previous literature, FRAX scores were significantly higher than GFRC scores. This difference may be ascribed to several variables, such as methodological or demographic-specific factors. For example, FRAX omits some risk factors, like a history of falls, while GFRC includes them, which makes it more sensitive in fracture risk prediction in populations where falls are prevalent (22). This has been shown in a study conducted with an Australian cohort, in which GFRC was more precise in forecasting fracture risk in patients with a history of falls (21). Conversely, FRAX, being aligned with national fracture epidemiology, usually yields conservative risk estimates, while GFRC might overestimate fracture risk as it is based on generalizations that are not accurate in certain cases, which has been observed in a New Zealand cohort (23).
Another reason for the low agreement between the two models could be unmeasured variables that are important for the prediction of fractures. Both FRAX and GFRC overlook cortical bone properties, which have been shown to independently predict fractures (24). In addition, comorbidities, medications, and physical activity levels may also affect fracture risk, yet they are inconsistently considered in both models (23, 25). Markers of bone turnover may also have the potential to make predictions more accurate; however, they have not yet been included in routine assessments (26).
The mismatch between FRAX and GFRC predictions could be reduced through several possible improvements, such as making the models more inclusive and more adaptable. For instance, adding extra risk factors such as fall history, cortical bone properties, and bone turnover markers to FRAX and GFRC could improve the predictive accuracy of these models (18, 24, 26). Also, calibrating the GFRC model to the national population’s epidemiological data, similar to the design of FRAX, could potentially minimize differences in the risk estimates for certain communities (25). Moreover, incorporating other imaging technologies, such as trabecular bone score, would not only enhance the information but also provide a small-scale assessment of skeletal health alongside the existing models (22, 27). Regular validation of these tools with recent data, as well as the implementation of machine learning for the dynamic improvement of algorithms, may also help to address the discrepancies (21, 28).
Implementing country-specific calibrations in osteoporotic risk calculation tools is necessary because fracture risks and mortality rates differ among populations due to factors such as genetics, lifestyle, and healthcare infrastructure. For example, different rates of fracture are observed in different regions of the world, and these differences are attributed to the levels of calcium and vitamin D intake, physical activity, and genetic predisposition, which can lead to under- or overestimation if tools are not calibrated locally (29). In addition, mortality rates also impact the predictive value of fracture risk calculations. For instance, the FRAX tool incorporates competing death risks into its model, which vary across regions based on the healthcare available and economic factors (20). Besides that, cultural and environmental factors, such as fall risks, smoking, alcohol consumption, and sunlight exposure, also affect fracture incidence; thus, localized correction becomes necessary (23). Without country-specific data, fracture risk tools may misclassify patients, leading to inappropriate treatment decisions or resource allocation. Therefore, calibrations tailored to specific populations ensure more accurate and clinically relevant risk predictions and interventions.
Study Limitations
This study has several limitations. The calculations might have been error-prone due to the lack of BMD values. The research included only postmenopausal women with osteoporosis who were older than 50 years, which might limit its applicability to broader demographic groups. Moreover, the retrospective nature of this investigation might introduce biases due to uncertain data, reliance on records, and the difficulty of proving causation between the exposure and the outcome. Furthermore, the fact that FRAXplus was not used in this study represents an additional limitation, as its inclusion could have contributed to broader results. Future studies should adopt prospective designs and include BMD measurements to increase the reliability of their findings. Research in larger and more diverse populations is necessary for the generalizability and relevance of the findings. Additionally, incorporating other risk calculation tools like FRAXplus in future investigations could play a significant role in obtaining broader and more generalizable results.
Conclusion
In conclusion, the strong correlation yet low concordance between FRAX and GFRC may indicate their complementary nature rather than interchangeability. Under the current circumstances, the GFRC should be used as an additional tool and applied only in a limited way to specialized patient groups among postmenopausal Turkish patients. The results may also hint at the potential need for local calibrations to ensure the accuracy and relevance of predictive algorithms within specific populations, such as that of Türkiye. Overcoming these limitations through specific modifications and broader studies has the potential to improve the application of the Garvan tool to the Turkish population and thereby support more accurate and culturally relevant osteoporosis management strategies.


