Impact of iron deficiency on attention among school-aged adolescents in Hong Kong

Hong Kong Med J 2025 Apr;31(2):139–47 | Epub 9 Apr 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
YT Cheung, PhD1; Dorothy FY Chan, MB, ChB2; CK Lee, MB, BS, MD3; WC Tsoi, MB, ChB3; CW Lau, MB, ChB3; Jennifer NS Leung, MB, BS3; Jason CC So, MB, BS4; Stella TY Tsang, PhD5; Chris LP Wong, PhD6; Yvonne YL Chu, MB7; CK Li, MB, BS, MD7
1 School of Pharmacy, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
2 Department of Paediatrics, Prince of Wales Hospital, Hong Kong SAR, China
3 Hong Kong Red Cross Blood Transfusion Service, Hospital Authority, Hong Kong SAR, China
4 Department of Pathology, Hong Kong Children’s Hospital, Hong Kong SAR, China
5 Department of Pathology, Hong Kong Molecular Pathology Diagnostic Centre, Hong Kong SAR, China
6 Amber Medical Group Limited, Hong Kong SAR, China
7 Department of Paediatrics, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
 
Corresponding author: Prof CK Li (ckli@cuhk.edu.hk)
 
 
Abstract
Introduction: Adolescence is a critical period for higher-order cognitive function development. The adverse effects of low iron reserves on attention are particularly relevant to school-aged students. Based on our previous study identifying an 11.1% prevalence of iron deficiency (ID) among Chinese school-aged adolescents aged 16 to 19 years in Hong Kong, the present study examined the association between iron status and attention outcomes in these adolescents.
 
Methods: This cross-sectional study recruited 523 adolescents (65.0% female; mean age=17.5 years) from 16 local schools. Serum ferritin levels and complete blood counts were measured. Iron deficiency was defined as serum ferritin concentration <15 μg/L. The Conners Continuous Performance Test Third Edition was administered to assess impairments in three attention domains, namely, sustained attention, inattention, and impulsivity. Multivariable analyses, conducted both for the overall cohort and stratified by sex, were used to evaluate the associations between serum ferritin levels and attention outcomes, adjusting for fatigue and dietary patterns.
 
Results: In the overall cohort, a lower serum ferritin concentration was significantly associated with sustained attention impairment (risk ratio [RR]=0.825, 95% confidence interval [95% CI]=0.732-0.946; P=0.040). Among female participants, those with sustained attention impairment had significantly lower serum ferritin concentrations than those with intact attention function (median=40.0 μg/L; interquartile range [IQR]=18.8-52.1 vs median=48.5 μg/L; IQR=21.8-73.8; P=0.038). Multivariable analysis showed a similar trend, though the association was not statistically significant (RR=0.954, 95% CI=0.904-1.005; P=0.073). Among male adolescents, iron reserves were not significantly associated with attention outcomes.
 
Conclusion: These findings highlight the importance of timely ID screening and correction in school-aged adolescents, particularly among female adolescents.
 
 
New knowledge added by this study
  • The prevalence of iron deficiency among Chinese school-aged adolescents aged 16 to 19 years in Hong Kong is 11.1%.
  • Lower serum ferritin reserves were associated with sustained attention impairment in the overall cohort.
Implications for clinical practice or policy
  • The consequences of low iron reserves on health and functional outcomes should be emphasised among school-aged adolescents.
  • Adolescents with low ferritin concentrations should receive counselling on the consumption of iron-rich foods and iron supplementation.
  • Future research should evaluate the effects of iron supplementation on functional outcomes.
 
 
Introduction
Adolescence marks a critical stage of physical growth, lean body mass development, and pubertal maturation. These biological and physiological changes increase the demand for micronutrients. In particular, iron deficiency (ID) remains a global public health concern.1 Iron deficiency is the most common nutritional deficiency and the leading cause of iron deficiency anaemia (IDA). Because dietary intake is the primary source of iron for most individuals, inadequate dietary iron intake is the main cause of IDA, particularly in adolescents, who are more likely to have poor dietary patterns.2 The Global Burden of Disease 2020 report estimated that approximately 60% of the total global burden of anaemia in 2019 arose from inadequate dietary iron intake.3 Consequently, ID was identified as the most important cause of anaemia-related disability.3 4
 
In addition to its essential role in haemoglobin synthesis, iron is a key element in brain metabolism and is vital for multiple cellular processes, including neurotransmitter synthesis, neuron myelination, and mitochondrial function.5 Studies in young children have demonstrated that ID during early life adversely affects psychomotor development, concentration, memory, and learning ability.6 7 Notably, the attention domain has received considerable research interest because iron plays a crucial role in the regulation of dopaminergic activity, which is implicated in the pathogenesis and symptoms of attention-deficit hyperactivity disorder (ADHD). Some studies have detected lower ferritin concentrations in children diagnosed with ADHD than in non-ADHD controls.8 9 However, many cognitive studies regarding ID have involved children aged ≤15 years.8 10 Few population-based studies have examined the effect of iron status on cognitive outcomes in adolescents and young adults, and no such studies have been conducted in Chinese populations.
 
Our previous study11 reported a prevalence of 11.1% for ID among Chinese school-aged adolescents aged 16 to 19 years in Hong Kong, with ID and IDA affecting 17.1% and 10.9% of girls, respectively, while no male participants were affected. More than one-third of these adolescents reported regularly skipping at least one meal per day.11 Notably, lower serum ferritin concentrations were observed in adolescents who skipped meals, reported infrequent intake of iron-rich foods, or had heavy menstrual bleeding.11 Consistent with findings from other studies, poor iron reserves were associated with greater self-reported fatigue, reduced physical functioning, and worse school performance.11 Adolescence represents the second most critical period for the development of higher-order cognitive functions, including attention, self-control, and executive function. The adverse effects of low iron reserves on attention span and attentiveness are particularly relevant to upper secondary students in Hong Kong, who are expected to excel academically and prepare for the Hong Kong Diploma of Secondary Education Examination, the city’s university entrance examination. This study aimed to examine the association between iron status and attention outcomes among school-aged adolescents in Hong Kong.
 
Methods
This cross-sectional study recruited healthy adolescent students through the Hong Kong Red Cross Blood Transfusion Service blood donation campaigns at 16 secondary schools between October 2020 and December 2021. The detailed methodology was described in our previous report,11 which aimed to identify the risk factors of ID and IDA in this cohort to facilitate future association studies on health and functional outcomes. In the present study, the dataset was used to delineate the impact of iron reserves on performance-based attention functioning, which is distinct from the self-reported daily functioning outcomes presented in the previous report.11
 
Study population
Students eligible for this study were aged ≥16 years and had agreed to participate in blood donation screening. Students were excluded if they exhibited signs or symptoms of an active infection, reported a history of anaemia, or were receiving treatment for anaemia. Students who did not pass the blood donation screening were still permitted to participate in the study.
 
Prevalences of iron deficiency and iron deficiency anaemia
A serum ferritin concentration <15 μg/L was considered indicative of ID in both male and female participants, based on the World Health Organization definition.12 Iron deficiency anaemia was defined as the presence of both ID and anaemia. In accordance with the recommendations of the World Health Organization, anaemia was defined as a haemoglobin concentration <12 g/dL in female participants and <13 g/dL in male participants.13 All assays were conducted on the same day in the Department of Pathology Laboratory at Hong Kong Children’s Hospital. The specifications of the instruments and tests have been reported in our prior study.11
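The definitions above can be expressed as a short classification rule. The following Python sketch is illustrative only and is not the study's analysis code; the function name, the category labels, and the sex coding ("female"/"male") are assumptions introduced here, whereas the cut-offs are those stated in the text.

```python
# Illustrative sketch of the ID/IDA definitions stated above.
# Function name, labels, and sex coding are hypothetical, not from the study.

FERRITIN_ID_CUTOFF_UG_L = 15.0  # ID: serum ferritin <15 ug/L (both sexes)
HB_ANAEMIA_CUTOFF_G_DL = {"female": 12.0, "male": 13.0}  # anaemia: Hb below cut-off


def classify_iron_status(ferritin_ug_l: float, haemoglobin_g_dl: float, sex: str) -> str:
    """Classify one participant using the thresholds described in the text."""
    iron_deficient = ferritin_ug_l < FERRITIN_ID_CUTOFF_UG_L
    anaemic = haemoglobin_g_dl < HB_ANAEMIA_CUTOFF_G_DL[sex]
    if iron_deficient and anaemic:
        return "IDA"  # both ID and anaemia present
    if iron_deficient:
        return "ID without anaemia"
    if anaemic:
        return "anaemia without ID"
    return "normal"


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise each branch.
    print(classify_iron_status(10.0, 11.6, "female"))
    print(classify_iron_status(10.0, 12.5, "female"))
    print(classify_iron_status(136.17, 14.0, "male"))
```

Keeping the ferritin and haemoglobin criteria as separate flags mirrors the text, in which IDA requires both conditions whereas ID depends on ferritin alone.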
 
Attention outcomes
Before blood donation, participants completed the Conners Continuous Performance Test Third Edition (CPT-III), a validated assessment commonly used in clinical and research settings to evaluate attention.14 The CPT-III requires 14 minutes to complete and generates specific CPT attention measures (online supplementary Tables 1 to 3). Raw scores for each CPT measure were converted into T-scores based on normative samples (mean=50, standard deviation [SD]=10). Each CPT measure was classified as indicating no/mild (T-score <1 SD from the normative mean), moderate (T-score 1-2 SDs from the normative mean), or severe (T-score >2 SDs from the normative mean) impairment.
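The T-score conversion and severity bands described above can be sketched as follows. This is a minimal illustration under stated assumptions: the normative mean and SD passed to the function are placeholders (the actual CPT-III norms are proprietary and depend on age and sex), higher T-scores are assumed to indicate worse performance, and the handling of values exactly on a boundary (60 or 70) is a choice made here, not one stated in the text.

```python
# Minimal sketch of T-score standardisation and impairment banding.
# Assumptions (not from the source): higher T = worse performance;
# boundary values 60 and 70 are assigned to the "moderate" band.

def t_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Rescale a raw score to a T-score (normative mean=50, SD=10)."""
    return 50.0 + 10.0 * (raw - norm_mean) / norm_sd


def classify_impairment(t: float) -> str:
    """Band a T-score by its distance above the normative mean of 50."""
    if t < 60.0:   # less than 1 SD above the mean
        return "no/mild"
    if t <= 70.0:  # 1 to 2 SDs above the mean
        return "moderate"
    return "severe"  # more than 2 SDs above the mean


if __name__ == "__main__":
    # Hypothetical raw score and norms, for illustration only.
    t = t_score(raw=12.0, norm_mean=10.0, norm_sd=4.0)
    print(t, classify_impairment(t))
```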
 
Based on the CPT-III manual and the clinical discretion of a developmental specialist (the second author), attention measures were categorised into three clinically relevant attention domains of interest,14 namely, sustained attention impairment (inability to maintain attention), inattention (inability to focus or concentrate), and impulsivity (difficulty with response inhibition).
 
Covariates
Fatigue, a recognised risk factor for diminished neurocognitive function, is associated with ID.11 15 Participants completed the PedsQL Multidimensional Fatigue Scale, which has been validated in young adults up to 25 years of age.16 Each item was scored on a 100-point reverse scale, where lower scores indicated more severe fatigue. The Traditional Chinese version of the PedsQL Multidimensional Fatigue Scale has demonstrated good internal consistency, reliability, and content validity in the Chinese population.17 18
 
We previously reported that dietary patterns are associated with iron reserves in Hong Kong adolescents.11 All participants self-reported their dietary patterns, including meal-skipping habits (breakfast, lunch, or dinner) and the frequency of consuming common iron-rich foods, namely, seafood, meat, iron-fortified cereal, leafy vegetables, beans, nuts, dried fruits, and eggs.11
 
Statistical analyses
The demographic and haematological characteristics of the cohort, along with their attention outcomes, were summarised using descriptive analysis.
 
The primary outcome was attention impairment. Serum ferritin concentration was used as the predictor of interest, rather than a comparison of attention outcomes between participants with and without ID or IDA, considering that clinical thresholds for diagnosing ID and IDA may not be applicable when evaluating the effect of iron on functional outcomes. Even if an adolescent is not clinically diagnosed with ID or IDA, a low-to-normal ferritin concentration may affect functional outcomes; previous studies have shown that the impact of ID on neurodevelopment may occur before ID manifests as clinical anaemia.19 20 The Mann-Whitney U test was utilised to compare serum ferritin concentrations between participants with normal attention function (ie, those who did not exhibit impairment in any of the three attention domains) and those with moderate or severe impairment in sustained attention, inattention, or impulsivity.
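For readers unfamiliar with the Mann-Whitney U test, the comparison of ferritin concentrations between two groups can be sketched in plain Python. This is an illustration, not the SAS procedure used in the study: it uses the tie-corrected normal approximation without a continuity correction, which may differ slightly from the output of statistical packages, and the ferritin values in the usage example are hypothetical.

```python
import math

# Sketch of a two-sided Mann-Whitney U test (normal approximation,
# tie-corrected, no continuity correction). Not the study's SAS code.

def mann_whitney_u(x, y):
    """Return (U statistic for sample x, two-sided P value)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0  # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of 1-based ranks i+1..j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # all observations identical: no evidence of a difference
        return u_x, 1.0
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


if __name__ == "__main__":
    # Hypothetical ferritin values (ug/L), for illustration only.
    impaired = [18.8, 25.0, 40.0, 47.3, 52.1]
    intact = [21.8, 44.0, 48.5, 60.2, 73.8, 90.1]
    print(mann_whitney_u(impaired, intact))
```

A rank-based test is a natural choice here because serum ferritin distributions are typically right-skewed, which is also why medians and IQRs are reported.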
 
Multivariable analysis using a log-binomial regression model was conducted, with serum ferritin concentration, fatigue, dietary pattern, and dietary iron intake as predictors. Models were adjusted for age and sex. Risk ratios (RRs) and 95% confidence intervals (95% CIs) were calculated.
 
Given that previous studies have shown a positive association between iron reserves and functional outcomes regardless of sex,8 15 20 21 we first conducted all analyses in the overall cohort. Subsequently, analyses were performed separately for male and female participants.
 
The significance threshold was set at P<0.05. All statistical analyses were performed using SAS 9.4 (SAS Institute, Cary [NC], US) and were two-tailed.
 
Results
As reported in our previous study,11 a total of 523 students were recruited (participation rate: 70%). Twenty-nine students were deferred from blood donation due to low haemoglobin concentrations but still completed the study procedures. Two-thirds of participants were female (n=340, 65.0%). The demographics of the study cohort, stratified by sex, are presented in Table 1.
 
The median ferritin concentration in male participants was 136.17 μg/L (interquartile range [IQR]=89.89-219.83; Fig a); no male participants were diagnosed with ID. Among female participants diagnosed with ID (n=58/340, 17.1%), the median haemoglobin concentration was 11.6 g/dL (IQR=11.1-12.2; Fig b). Among female participants with normal serum ferritin concentrations (n=282/340, 82.9%), the median serum ferritin concentration was 56.07 μg/L (IQR=33.82-84.11; Fig c).
 

Table 1. Demographics and dietary characteristics of participants (n=523)
 

Figure. Distribution of serum ferritin level among participants stratified by sex. (a) Male participants. (b) Female participants diagnosed with iron deficiency. (c) Female participants with normal serum ferritin concentrations
 
Attention outcomes
Overall, 249 participants (47.6%) exhibited normal function in all three attention domains. Approximately one-quarter of the participants demonstrated moderate-to-severe impairment in sustained attention (n=131/523, 25.0%), inattention (n=145/523, 27.7%), and impulsivity (n=157/523, 30.0%).
 
Among female participants with ID, the rates of moderate-to-severe impairment in sustained attention, inattention, and impulsivity were 36.2% (n=21/58), 27.6% (n=16/58), and 37.9% (n=22/58), respectively. The rates of moderate-to-severe impairment in inattention and impulsivity among female participants with IDA were numerically higher at 43.5% (n=10/23 for both domains). Among male participants, the rates of moderate-to-severe impairment in sustained attention, inattention, and impulsivity were 18.0% (n=33/183), 23.5% (n=43/183), and 22.4% (n=41/183), respectively (Table 2).
 

Table 2. Attention outcomes stratified by sex and iron deficiency status
 
Association between iron reserves and attention outcomes in the overall cohort
In the overall cohort, participants with sustained attention impairment had significantly lower serum ferritin concentrations relative to those with intact attention function (median=51.2 μg/L, IQR=27.1-106.8 vs median=73.9 μg/L, IQR=37.8-138.0; P=0.020). Although the associations were not statistically significant, trends of lower serum ferritin concentrations were also observed in participants with impulsivity impairment (median=68.1 μg/L, IQR=29.0-114.8 vs median=73.9 μg/L, IQR=37.8-138.0; P=0.067) and inattention impairment (median=69.9 μg/L, IQR=32.0-110.8 vs median=73.9 μg/L, IQR=37.8-138.0; P=0.142) relative to those with intact attention function.
 
Pooled analysis of the overall cohort, adjusted for age and sex, showed a significant association between lower serum ferritin concentration and sustained attention impairment (RR=0.825, 95% CI=0.732-0.946; P=0.040), suggesting that each 10 μg/L increase in serum ferritin concentration was associated with a 17.6% decrease in the risk of sustained attention impairment. A higher level of fatigue was associated with impairment in sustained attention (RR=0.772, 95% CI=0.652-0.926; P=0.004), inattention (RR=0.824, 95% CI=0.733-0.942; P=0.016), and impulsivity (RR=0.792, 95% CI=0.683-0.922; P=0.004). Serum ferritin concentration was not significantly associated with risks of impairment in inattention or impulsivity (Table 3).
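The step from a risk ratio to a percentage change in risk can be made explicit. The sketch below assumes, as the text states, that the reported RR corresponds to a 10 μg/L increase in serum ferritin and that the model is log-linear, so that the RR for other increments scales multiplicatively; the function names are introduced here for illustration. Note that the rounded RR of 0.825 yields a reduction of about 17.5%; the published figure of 17.6% presumably reflects the unrounded estimate.

```python
# Sketch: interpreting a risk ratio (RR) from a log-linear model.
# Function names are hypothetical; the RR value is taken from the text.

def percent_change_in_risk(rr: float) -> float:
    """Percentage change in risk implied by an RR (negative = reduction)."""
    return (rr - 1.0) * 100.0


def rr_for_increment(rr_per_step: float, step: float, increment: float) -> float:
    """Rescale an RR reported per `step` units to a different increment.

    Under a log-linear model, log(RR) is proportional to the increment,
    so RR_increment = RR_step ** (increment / step).
    """
    return rr_per_step ** (increment / step)


if __name__ == "__main__":
    rr_per_10 = 0.825  # reported RR per 10 ug/L increase in ferritin
    print(round(percent_change_in_risk(rr_per_10), 1))
    # Implied RR for a 20 ug/L increase, assuming log-linearity holds.
    print(round(rr_for_increment(rr_per_10, step=10.0, increment=20.0), 3))
```

The same arithmetic applies to the fatigue estimates, although their interpretation depends on the direction in which the fatigue scale is scored.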
 

Table 3. Factors associated with attention impairment stratified by sex and overall cohort
 
Association between iron reserves and attention outcomes stratified by sex
Female participants with sustained attention impairment had marginally lower serum ferritin concentrations relative to those with intact attention function (median=40.0 μg/L, IQR=18.8-52.1 vs median=48.5 μg/L, IQR=21.8-73.8; P=0.038). Although the association was not statistically significant, a trend for lower serum ferritin concentrations was also observed in participants with impulsivity impairment (median=43.0 μg/L, IQR=19.5-63.2 vs median=48.5 μg/L, IQR=21.8-73.8; P=0.071) relative to those with intact attention function. No significant difference was observed for inattention impairment. Additionally, no significant association was detected between iron reserves and attention impairment in male participants.
 
Multivariable analysis revealed that the association between iron reserves and sustained attention impairment in female participants was attenuated and not statistically significant (RR=0.954, 95% CI=0.904-1.005; P=0.073). A higher level of fatigue was associated with an increased risk of sustained attention impairment (RR=0.793, 95% CI=0.652-0.964; P=0.021). Among male participants, iron reserves were not significantly associated with attention outcomes, but fatigue was associated with impulsivity impairment (RR=0.712, 95% CI=0.548-0.942; P=0.018). Dietary patterns were not significantly associated with attention outcomes in either male or female participants (Table 3).
 
Discussion
In the overall cohort, a lower serum ferritin concentration was associated with a higher risk of sustained attention impairment, consistent with previous reports that iron reserves play an essential role in functional performance in adolescents.6 7 8 21 When the analysis was stratified by sex, a similar but modest association between low iron reserves and sustained attention impairment was observed in female school-aged adolescents. This finding is supported by studies regarding the neurobiology of attention-related developmental disorders associated with ID.6 7 9 10 A meta-analysis of 10 studies, comprising 2191 healthy children and 1196 children with ADHD, showed that serum ferritin concentrations were 0.4-fold lower in children with ADHD than in those without developmental disorders.8 Iron deficiency may be associated with disruptions in monoamine synthesis and monoamine signal transduction, which manifest as attention deficits.10 22 Adequate iron intake and iron stores may, therefore, be important factors influencing the onset of attention problems in the developing brain. This finding should be prospectively validated in larger cohorts with a comprehensive assessment of cognitive domains beyond attention. However, from a developmental perspective, sustained attention is closely related to performance on targeted assessments, such as mathematical fluency and reading comprehension, as well as broader academic measures in national standardised examinations.23 24 This relationship is particularly relevant because the Hong Kong educational system is well known for its examination-dominated culture. Most examinations range from 2 to 3 hours, requiring students to maintain a high level of sustained attention. Therefore, these findings may have long-term implications for students’ academic success. Future research should investigate the effects of ID and IDA on subsequent academic achievement in Hong Kong adolescents.
 
Evidence regarding the effectiveness of iron supplementation in terms of improving neurocognitive function in children and adolescents has been inconclusive. Furthermore, iron supplements are associated with gastrointestinal symptoms and constipation, which contribute to non-adherence, particularly in adolescents.25 A systematic review of 14 randomised controlled trials indicated that iron supplementation improved attention and intelligence quotient in anaemic older children and adults.26 However, these effects were inconsistent across studies; they were influenced by socio-economic factors, participant age, and the clinical thresholds used to define ID and IDA.20 25 26 The benefits for cognitive development in older adolescents remain uncertain and warrant further investigation.26
 
In this study, we found that students who reported higher levels of fatigue were more likely to have worse attention outcomes. We also previously reported that lower serum ferritin concentrations are associated with self-reported fatigue in adolescents.11 Evidence supporting the role of iron supplementation in fatigue reduction is more consistent than its effects on cognitive function in young adults, particularly among non-anaemic menstruating women with low ferritin concentrations.21 27 Notably, iron supplementation has been associated with reductions in subjective measures of fatigue among non-anaemic iron-deficient adults.21 The present findings suggest that ID correction in adolescents could reduce fatigue levels, which may indirectly improve attention outcomes. Using a serum ferritin concentration threshold of 15 μg/L to diagnose clinical ID, some researchers have demonstrated that iron supplementation can improve fatigue and physical performance among individuals with serum ferritin concentrations at the lower end of the normal range (30-50 μg/L).21 Collectively, the known health risks of ID, including impaired physical growth, fatigue, and reduced fitness in adolescents, underscore the need to educate students about maintaining a balanced diet with adequate iron intake. Adolescents with low ferritin concentrations should receive counselling focused on the consumption of iron-rich foods and iron supplementation to alleviate fatigue, even in the absence of documented anaemia.
 
Dietary patterns and self-reported intake of iron-rich foods were not directly associated with attention outcomes in the multivariable analysis, likely because neurocognitive function is a multifactorial and complex phenotype influenced by both nutritional and non-nutritional factors. Additionally, we did not use a comprehensive measure of dietary iron intake. However, we previously showed that skipping at least one meal per day or exhibiting low dietary iron intake was associated with lower iron reserves.11 Iron deficiency prevention in adolescents requires effective management of knowledge gaps related to food nutrition, dieting, and body image. Collectively, these findings highlight the importance of developing nutrition education programmes to encourage proactive adoption of dietary and other nutrition-related behaviours that promote health and well-being.
 
Limitations
Despite the relatively large cohort of school-aged adolescents and the well-characterised haematological assessments, this study had several important limitations. First, the participation rate in the blood donation programme was affected by the coronavirus disease 2019 pandemic and school closures. This change in participation rate may have introduced sampling bias because students with worse health statuses may have been more likely to decline blood donation. Second, we only assessed attention measures in this study. It was not feasible to administer a full neurocognitive test battery, which typically requires >1 hour, in a school-based environment with limited time, space, and supervisory personnel. Future studies should include a more comprehensive evaluation of neurocognitive function. Finally, we did not evaluate factors potentially associated with the causes of anaemia and cognitive function, such as markers of socio-economic status, family functioning, living environment, and physical activity.28 29 Nevertheless, our findings regarding the association between iron status and attention outcomes provide valuable local population data and guidance for future iron supplementation initiatives.
 
Conclusion
Lower serum ferritin concentrations and self-reported fatigue were associated with an increased risk of sustained attention impairment among school-aged adolescents in Hong Kong. The potential health consequences of ID without anaemia, particularly its effects on physical well-being and school performance, should be effectively communicated to the Hong Kong population, especially to female adolescents. Dietary interventions should target
 
Author contributions
Concept or design: All authors.
Acquisition of data: CK Lee, WC Tsoi, CW Lau, JNS Leung, STY Tsang, CLP Wong, YYL Chu, CK Li.
Analysis of data: YT Cheung, DFY Chan.
Interpretation of data: All authors.
Drafting of the manuscript: YT Cheung.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank the principals and staff of the participating schools, as well as Mr Calvin Lam from the Department of Paediatrics of The Chinese University of Hong Kong for assistance with data collection.
 
Declaration
Part of the results was presented at the Joint Annual Scientific Meeting 2022 (hybrid meeting) of The Hong Kong Paediatric Society, Hong Kong College of Paediatricians, Hong Kong Paediatric Nurses Association, and Hong Kong College of Paediatric Nursing in Hong Kong on 26 September 2022.
 
Funding/support
This research was funded by the Health and Medical Research Fund, the former Food and Health Bureau, Hong Kong SAR Government (Ref No.: 17180441). The funder had no role in study design, data collection, analysis, interpretation, or manuscript preparation.
 
Ethics approval
This research was approved by the Joint Chinese University of Hong Kong—New Territories East Cluster Clinical Research Ethics Committee, Hong Kong (Ref No.: 2019.107). Participants aged ≥18 years provided written informed consent, whereas those aged <18 years provided written assent along with informed consent from a parent or legal guardian.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. Powers JM, O’Brien S, Berlan ED, Hoppin AG, editors. Iron requirements and iron deficiency in adolescents. UpToDate. Available from: https://www.uptodate.com/contents/iron-requirements-and-iron-deficiency-in-adolescents. Accessed 1 Apr 2025.
2. Camaschella C, Girelli D. The changing landscape of iron deficiency. Mol Aspects Med 2020;75:100861.
3. Safiri S, Kolahi AA, Noori M, et al. Burden of anemia and its underlying causes in 204 countries and territories, 1990-2019: results from the Global Burden of Disease Study 2019. J Hematol Oncol 2021;14:185.
4. Institute for Health Metrics and Evaluation, University of Washington. GBD results. 2020. Available from: https://vizhub.healthdata.org/gbd-results/. Accessed 6 May 2023.
5. Hare D, Ayton S, Bush A, Lei P. A delicate balance: iron metabolism and diseases of the brain. Front Aging Neurosci 2013;5:34.
6. Jáuregui-Lobera I. Iron deficiency and cognitive functions. Neuropsychiatr Dis Treat 2014;10:2087-95.
7. Pivina L, Semenova Y, Doşa MD, Dauletyarova M, Bjørklund G. Iron deficiency, cognitive functions, and neurobehavioral disorders in children. J Mol Neurosci 2019;68:1-10.
8. Wang Y, Huang L, Zhang L, Qu Y, Mu D. Iron status in attention-deficit/hyperactivity disorder: a systematic review and meta-analysis. PLoS One 2017;12:e0169145.
9. Bener A, Kamal M, Bener H, Bhugra D. Higher prevalence of iron deficiency as strong predictor of attention deficit hyperactivity disorder in children. Ann Med Health Sci Res 2014;4(Suppl 3):S291-7.
10. Tseng PT, Cheng YS, Yen CF, et al. Peripheral iron levels in children with attention-deficit hyperactivity disorder: a systematic review and meta-analysis. Sci Rep 2018;8:788.
11. Cheung YT, Chan DF, Lee CK, et al. Iron deficiency among school-aged adolescents in Hong Kong: prevalence, predictors, and effects on health-related quality of life. Int J Environ Res Public Health 2023;20:2578.
12. World Health Organization. WHO guideline on use of ferritin concentrations to assess iron status in individuals and populations. 2020. Available from: https://apps.who.int/iris/handle/10665/331505. Accessed 6 Oct 2023.
13. World Health Organization. Haemoglobin concentrations for the diagnosis of anaemia and assessment of severity. 2011 May 31. Available from: https://www.who.int/publications/i/item/WHO-NMH-NHD-MNM-11.1. Accessed 6 Oct 2023.
14. Conners CK, Sitarenios G. Conners’ Continuous Performance Test (CPT). In: Kreutzer JS, DeLuca J, Caplan B, editors. Encyclopedia of Clinical Neuropsychology. New York: Springer; 2011: 681-3.
15. Sulheim D, Fagermoen E, Sivertsen ØS, Winger A, Wyller VB, Øie MG. Cognitive dysfunction in adolescents with chronic fatigue: a cross-sectional study. Arch Dis Child 2015;100:838-44.
16. Varni JW, Limbers CA. The PedsQL Multidimensional Fatigue Scale in young adults: feasibility, reliability and validity in a university student population. Qual Life Res 2008;17:105-14.
17. Yeung NC, Lau JT, Yu X, et al. Psychometric properties of the Chinese version of the Pediatric Quality of Life Inventory 4.0 Generic Core Scales among pediatric cancer patients. Cancer Nurs 2013;36:463-73.
18. Hao Y, Tian Q, Lu Y, Chai Y, Rao S. Psychometric properties of the Chinese version of the Pediatric Quality of Life Inventory 4.0 Generic Core Scales. Qual Life Res 2010;19:1229-33.
19. Camaschella C. Iron deficiency. Blood 2019;133:30-9.
20. Hermoso M, Vucic V, Vollhardt C, et al. The effect of iron on cognitive development and function in infants, children and adolescents: a systematic review. Ann Nutr Metab 2011;59:154-65.
21. Houston BL, Hurrie D, Graham J, et al. Efficacy of iron supplementation on fatigue and physical capacity in non-anaemic iron-deficient adults: a systematic review of randomised controlled trials. BMJ Open 2018;8:e019240.
22. Kim J, Wessling-Resnick M. Iron and mechanisms of emotional behavior. J Nutr Biochem 2014;25:1101-7.
23. Gallen CL, Schaerlaeken S, Younger JW; Project iLEAD Consortium; Anguera JA, Gazzaley A. Contribution of sustained attention abilities to real-world academic skills in children. Sci Rep 2023;13:2673.
24. Schmengler H, Peeters M, Stevens GW, et al. Educational level, attention problems, and externalizing behaviour in adolescence and early adulthood: the role of social causation and health-related selection—the TRAILS study. Eur Child Adolesc Psychiatry 2023;32:809-24.
25. Finkelstein JL, Herman HS, Guetterman HM, Peña-Rosas JP, Mehta S. Daily iron supplementation for prevention or treatment of iron deficiency anaemia in infants, children, and adolescents. Cochrane Database Syst Rev 2018;2018:CD013227.
26. Falkingham M, Abdelhamid A, Curtis P, Fairweather-Tait S, Dye L, Hooper L. The effects of oral iron supplementation on cognition in older children and adults: a systematic review and meta-analysis. Nutr J 2010;9:4.
27. Vaucher P, Druais PL, Waldvogel S, Favrat B. Effect of iron supplementation on fatigue in nonanemic menstruating women with low ferritin: a randomized controlled trial. CMAJ 2012;184:1247-54.
28. Hess SY, Owais A, Jefferds ME, Young MF, Cahill A, Rogers LM. Accelerating action to reduce anemia: review of causes and risk factors and related data needs. Ann N Y Acad Sci 2023;1523:11-23.
29. Meredith WJ, Cardenas-Iniguez C, Berman MG, Rosenberg MD. Effects of the physical and social environment on youth cognitive performance. Dev Psychobiol 2022;64:e22258.

Migrant workers’ well-being after the rampant sweep of the Omicron wave in Hong Kong

Hong Kong Med J 2025 Apr;31(2):130–8 | Epub 9 Apr 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
Kitty KY Lai, BSc1; Hong Qiu, BSc, PhD1,2; Eliza LY Wong, MPH, PhD1,2
1 The Jockey Club School of Public Health and Primary Care, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
2 Centre for Health Systems and Policy Research, The Jockey Club School of Public Health and Primary Care, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
 
Corresponding author: Prof Eliza LY Wong (lywong@cuhk.edu.hk)
 
 
Abstract
Introduction: The impact of the coronavirus disease 2019 pandemic has rendered migrant workers a vulnerable population susceptible to psychological distress. This cross-sectional study aimed to estimate the prevalence of anxiety and examine associations of perceived social support and working conditions with anxiety among Filipina domestic workers (FDWs) after the peak of the Omicron wave in Hong Kong.
 
Methods: In total, 370 female FDWs were recruited through convenience sampling in Central, Hong Kong, during holiday gatherings from June to August 2022; social normalcy had begun to return during this period after the peak of the Omicron wave. Anxiety levels were assessed using the Generalised Anxiety Disorder-7 (GAD-7) scale. Perceived social support and working conditions were measured using validated instruments. Socio-demographic characteristics and health-related information were recorded for consideration as covariates.
 
Results: The estimated prevalence of anxiety (GAD-7 score ≥10) was 8.6% (95% confidence interval [CI]=5.8%-11.5%). Multivariable logistic regression demonstrated that greater satisfaction with compensation and salary (adjusted odds ratio [aOR]=0.825, 95% CI=0.728-0.935), increased free time and rest periods (aOR=0.878, 95% CI=0.780-0.987), and higher satisfaction with value orientation (aOR=0.887, 95% CI=0.796-0.989) were associated with lower anxiety risk.
 
Conclusion: Migrant workers constitute a vital workforce but are often neglected in preventive care. Based on these findings, preventive measures such as labour protection, compensation for overtime work, adequate rest periods, and improved working conditions are crucial in mitigating anxiety. This study highlights key areas for policy refinement and governmental support to enhance migrant workers’ well-being.
 
 
New knowledge added by this study
  • Overall, 8.6% of Filipina domestic workers (FDWs) experienced probable anxiety after the Omicron wave of the coronavirus disease 2019 pandemic in Hong Kong.
  • Associations between anxiety and working conditions were identified, indicating potential factors that influence the mental well-being of FDWs.
  • No significant association was observed between anxiety and perceived social support.
Implications for clinical practice or policy
  • The Hong Kong government could prioritise refining policies to support favourable working conditions for migrant workers, including negotiation of an increase in meal allowances and strict enforcement of regular working hours.
  • Non-governmental organisations could tailor psychological interventions to migrant workers to address diverse mental health needs.
 
 
Introduction
Declared a public health emergency of international concern by the World Health Organization, coronavirus disease 2019 (COVID-19) has continuously posed a threat to both physical and psychological health.1 Beginning in December 2021, the Omicron variant triggered the fifth wave of the pandemic in Hong Kong, endangering psychological well-being.1 2 Filipina domestic workers (FDWs), the primary group of migrant domestic workers, constitute >2.5% of the Hong Kong population3 and are considered a vulnerable population. Before the Hong Kong government reiterated the rights of migrant workers, many FDWs faced mistreatment, including abuse, exploitation, and illegal dismissal upon infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).4 5 6 Filipina domestic workers were susceptible to both direct and indirect consequences of the COVID-19 pandemic.
 
Migrant workers often experience poor psychosocial conditions and substandard working environments.4 5 6 7 8 However, few studies have consistently examined the well-being of FDWs.8 9 10 Anxiety, a key indicator of well-being, commonly coexists with other psychological conditions. Considering the large number of domestic workers in Hong Kong, efforts to safeguard the psychological health of this minority population are essential to prevent excessive strain on the healthcare system.11 Additionally, various aspects of working conditions should be investigated in relation to anxiety.12
 
This study aimed to estimate the prevalence of anxiety and examine its relationships with perceived social support and decent work among FDWs after the peak of the Omicron wave during the COVID-19 pandemic in Hong Kong. Insights regarding the psychosocial conditions encountered by FDWs during the aftermath of the pandemic may contribute to existing literature.
 
Methods
Study design
A cross-sectional survey, written in English, was administered between June and August 2022. The target population comprised FDWs. Eligibility criteria included age ≥18 years, ability to read and understand English, and ability to provide informed consent. Domestic workers who began employment in Hong Kong on or after 1 February 2022, as well as male domestic workers from the Philippines, were excluded from the present study. Because the majority of domestic workers from the Philippines are women (97.8%), the inclusion of a small sample of men could compromise representativeness.3
 
Convenience sampling was utilised. Recruitment was conducted at gathering places in Central, Hong Kong, where a large proportion of FDWs spend their days off. Data collection was performed on rest days (Sundays and statutory holidays). Support and clarifications were provided to respondents who required assistance in understanding the questions. Respondents were offered a gratuity of HK$20 in cash as a token of appreciation for their time and assistance. According to Yeung et al,10 the prevalence of anxiety among FDWs in Hong Kong at the beginning of the pandemic was 25%. At a 95% confidence level with a desired margin of error of ±5%, the minimum required sample size was estimated to be 289.
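The minimum sample size reported above is consistent with the standard formula for estimating a single proportion, n = z²p(1−p)/d². The sketch below reproduces the figure under the assumption that this formula was used; the article does not name its method, and the function name is illustrative only.

```python
import math


def sample_size_for_proportion(p, margin, z=1.96):
    """Minimum n needed to estimate a proportion p within +/- margin.

    Uses n = z^2 * p * (1 - p) / margin^2, rounded up to a whole person.
    z = 1.96 corresponds to a 95% confidence level.
    ASSUMPTION: the article does not state which formula it applied.
    """
    n = (z ** 2) * p * (1 - p) / (margin ** 2)
    return math.ceil(n)


# Expected prevalence of 25% (Yeung et al) and a margin of error of 5%
print(sample_size_for_proportion(0.25, 0.05))  # 289
```

The unrounded value is approximately 288.1, so rounding up gives 289. The 370 respondents recruited exceeded this minimum.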
 
Data collection tool and measurement
The questionnaire consisted of four sections, namely, anxiety, perceived social support, working conditions, and potential covariates (eg, socio-demographic and health-related factors). The questionnaire was developed based on validated instruments and a literature review of similar contexts.12 13 14 15 16 17
 
The Generalised Anxiety Disorder-7 (GAD-7) scale was adopted to assess anxiety levels.13 The total score ranges from 0 to 21; a threshold score of ≥10 to identify self-reported anxiety provides optimal sensitivity (89%) and specificity (82%).13 The GAD-7 has demonstrated high internal consistency in the general population (Cronbach’s alpha=0.92) and among FDWs working in Chinese regions (Cronbach’s alpha=0.80).18 19
 
The Multidimensional Scale of Perceived Social Support, using a 7-point Likert scale, was used to measure perceived social support across three domains, namely, significant others, family, and peers.14 Each domain comprises four items. We calculated a mean score for each domain ranging from 1 to 7 and a total mean score averaged from the three concerned domains to represent the total score of perceived social support. A higher score indicates a greater level of perceived social support. The authors of the scale proposed multiple approaches for interpreting perceived social support, one of which involved analysing continuous data for the three domains and the overall score.14 This scale has been validated with high internal consistency among Southeast Asian domestic workers in Hong Kong (Cronbach’s alpha=0.96).16 20
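The scoring described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the domain keys and function name are hypothetical, and the mapping of individual questionnaire items to domains is left to the caller because the text does not list it.

```python
# Domain labels follow the wording used in the text above.
DOMAINS = ("significant_others", "family", "peers")


def score_mspss(responses):
    """Return the mean score per domain and the overall mean (each 1 to 7).

    responses: dict mapping each domain name to its four item ratings,
    each rated on a 7-point Likert scale (1 to 7).
    """
    scores = {}
    for domain in DOMAINS:
        items = responses[domain]
        if len(items) != 4 or any(not 1 <= x <= 7 for x in items):
            raise ValueError(f"{domain}: expected four ratings from 1 to 7")
        scores[domain] = sum(items) / len(items)
    # Total score: average of the three domain means
    scores["total"] = sum(scores[d] for d in DOMAINS) / len(DOMAINS)
    return scores


example = {
    "significant_others": [6, 6, 5, 7],
    "family": [7, 6, 6, 5],
    "peers": [5, 5, 6, 4],
}
print(score_mspss(example))
```

Because every domain has the same number of items, averaging the three domain means gives the same result as averaging all 12 items.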
 
The Decent Work Scale was adopted to evaluate working conditions. It comprises 15 items grouped into five components, namely, physically and interpersonally safe working conditions, access to essential healthcare support, sufficient income, adequate rest time, and alignment of working settings with social values.12 Each item is scored from 1 to 7, resulting in component scores ranging from 3 to 21 and a total score ranging from 15 to 105, with higher scores indicating better working conditions. This scale has been validated with high internal consistency among the working population in the US (Cronbach’s alpha=0.86).12
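The score ranges stated above follow directly from the structure of the scale: five components of three items each, with every item rated from 1 to 7. A minimal sketch is shown below. The component keys and function name are hypothetical, and the code assumes that any reverse-worded items have already been recoded so that higher ratings always indicate better conditions.

```python
# Component labels are illustrative paraphrases of the five components in the text.
COMPONENTS = (
    "safe_working_conditions",
    "access_to_healthcare",
    "adequate_compensation",
    "free_time_and_rest",
    "value_alignment",
)


def score_decent_work(responses):
    """Return component sums (3 to 21) and the total score (15 to 105).

    responses: dict mapping each component to its three item ratings (1 to 7).
    ASSUMPTION: reverse-worded items were recoded before being passed in.
    """
    scores = {}
    for component in COMPONENTS:
        items = responses[component]
        if len(items) != 3 or any(not 1 <= x <= 7 for x in items):
            raise ValueError(f"{component}: expected three ratings from 1 to 7")
        scores[component] = sum(items)
    scores["total"] = sum(scores[c] for c in COMPONENTS)
    return scores


lowest = score_decent_work({c: [1, 1, 1] for c in COMPONENTS})
highest = score_decent_work({c: [7, 7, 7] for c in COMPONENTS})
print(lowest["total"], highest["total"])  # 15 105
```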
 
Data analyses
Statistical analysis was performed using SPSS (Windows version 27.0; IBM Corp, Armonk [NY], US). Confidence intervals were established at the 95% level, and P values <0.05 were considered statistically significant. We computed 95% CIs for anxiety prevalence. Socio-demographic variables were compared between anxiety statuses using the Chi squared test, whereas scores for perceived social support and working conditions were compared using the independent samples t test.
 
Odds ratios (ORs) with 95% CIs were computed using a binary logistic regression model. For univariable analysis, simple logistic regression was conducted; perceived social support and working conditions constituted the main independent variables. Multivariable logistic regression analysis was performed to estimate the independent effects of these variables while adjusting for potential confounders.
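For readers unfamiliar with how logistic regression output becomes an odds ratio, the sketch below converts a model coefficient and its standard error into an OR with a Wald-type 95% CI. The coefficient and standard error are hypothetical values chosen for illustration; they are not taken from the study's tables, and the analysis itself was run in SPSS rather than with this code.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient into an odds ratio.

    beta: estimated log-odds change per one-unit increase in the predictor
    se:   standard error of beta
    Returns (odds_ratio, lower_limit, upper_limit) using a Wald interval.
    """
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# HYPOTHETICAL inputs for illustration only
or_, lower, upper = odds_ratio_with_ci(beta=-0.19, se=0.064)
print(f"OR={or_:.2f}, 95% CI={lower:.2f}-{upper:.2f}")
```

An OR below 1 with a CI that excludes 1 indicates that each one-point increase in the predictor is associated with lower odds of the outcome.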
 
The GAD-7 scores were categorised into four levels of anxiety severity: minimal (0-4), mild (5-9), moderate (10-14), and severe (15-21).13 We conducted sensitivity analysis using the GAD-7 score as an ordinal outcome and constructed an ordinal logistic regression model to assess the robustness of previously identified anxiety-associated factors.
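The two ways of coding the outcome (binary at the ≥10 threshold for the main analysis, four ordered levels for the sensitivity analysis) can be sketched as follows. The function names are illustrative; the cut-offs are those stated in the text.

```python
def gad7_severity(total):
    """Map a GAD-7 total score (0 to 21) to one of four severity levels."""
    if not 0 <= total <= 21:
        raise ValueError("GAD-7 total score must be between 0 and 21")
    if total <= 4:
        return "minimal"
    if total <= 9:
        return "mild"
    if total <= 14:
        return "moderate"
    return "severe"


def probable_anxiety(total, cutoff=10):
    """Binary outcome used in the main analysis: total score >= 10."""
    return total >= cutoff


for score in (3, 9, 10, 17):
    print(score, gad7_severity(score), probable_anxiety(score))
```

The binary threshold coincides with the boundary between the mild and moderate levels, so every respondent classified as having probable anxiety falls into the moderate or severe category of the ordinal outcome.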
 
Results
Among the 441 FDWs approached, 71 declined to participate, yielding a response rate of 83.9% (Fig 1). Primary reasons for refusal were survey length and time constraints. The distribution of GAD-7 scores was positively skewed (Fig 2). The estimated prevalence of probable anxiety (GAD-7 score ≥10) was 8.6% (95% CI=5.8%-11.5%). Among the 370 respondents, approximately half were aged 35 to 44 years (51.1%) and approximately half were married (48.4%). Most respondents had attained a university-level education or higher (60.3%), reported a monthly income ranging from HK$4630 to HK$4999 (68.6%), and had children residing in their home country (82.7%). Respondents were evenly distributed across Hong Kong Island, Kowloon, and the New Territories. The median duration of employment in Hong Kong was 5.0 years (interquartile range, 3.0-9.0). Most respondents had no history of COVID-19 (81.9%) and no chronic diseases (97.8%) [Table 1]. Table 2 shows that the mean scores for the three domains of perceived social support ranged from 5.5 to 5.7 out of 7, whereas the mean score for decent work was 78.1 out of 105. Among the five components of working conditions measured by the Decent Work Scale, the lowest mean score was observed for rest periods (14.1); access to healthcare had the highest mean score (17.1).
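The response rate and the prevalence interval reported above can be checked with simple arithmetic. The sketch below rests on two assumptions that the text does not state: the number of respondents with probable anxiety is taken as 32 (inferred from 8.6% of 370), and the interval is computed with the normal-approximation (Wald) method.

```python
import math


def wald_ci(cases, n, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% CI."""
    p = cases / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


approached, declined = 441, 71
respondents = approached - declined          # 370
response_rate = respondents / approached

# ASSUMPTION: 32 cases is inferred from the reported 8.6%; it is not stated in the text.
prevalence, lower, upper = wald_ci(32, respondents)

print(f"Response rate: {response_rate:.1%}")                         # 83.9%
print(f"Prevalence: {prevalence:.1%} ({lower:.1%} to {upper:.1%})")  # 8.6% (5.8% to 11.5%)
```

Under these assumptions, the output matches the reported response rate and confidence limits.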
 

Figure 1. Participants’ recruitment
 

Figure 2. Distribution of Generalised Anxiety Disorder-7 scores
 

Table 1. Demographic characteristics of participants
 

Table 2. Perceived social support and working conditions among participants
 
Participants with probable anxiety had a higher proportion of chronic diseases relative to those without anxiety (9.4% vs 1.5%; P=0.024) [Table 1]. Respondents with probable anxiety reported worse perceptions of social support and working conditions; they had lower scores across all domains relative to those of respondents without anxiety (Table 2).
 
Associations of perceived social support and working conditions with anxiety
Simple logistic regression analysis indicated that one domain of perceived social support and multiple subscales of working conditions were significantly associated with anxiety (Table 3). Filipina domestic workers with higher perceived social support from significant others, better access to healthcare, greater satisfaction with compensation and salary, increased free time and rest periods, and higher satisfaction with their employer’s value orientation exhibited a lower likelihood of experiencing probable anxiety. Multivariable logistic regression analysis—adjusted for all relevant socio-demographic variables, health status, and subscales of perceived social support and working conditions—identified three variables that remained statistically significant (Table 3). Greater satisfaction with compensation and salary (adjusted odds ratio [aOR]=0.825, 95% CI=0.728-0.935), increased free time and rest periods (aOR=0.878, 95% CI=0.780-0.987), and higher satisfaction with value orientation (aOR=0.887, 95% CI=0.796-0.989) were associated with lower anxiety risk. Sensitivity analysis, which examined the four levels of anxiety as an ordinal outcome using an ordinal logistic regression model, showed that effect estimates were slightly attenuated. However, the findings confirmed the association between anxiety levels and inadequate compensation, while also identifying a history of chronic diseases as a risk factor for increased anxiety severity (Table 4).
 

Table 3. Associations of socio-demographic characteristics, health status, perceived social support, and working conditions of participants (n=370)
 

Table 4. Sensitivity analysis for the associations of socio-demographic characteristics, health status, perceived social support, and working conditions of participants (n=370)
 
Discussion
Estimated prevalence of anxiety
The observed prevalence of anxiety among FDWs was 8.6%, which is lower than the estimates reported in previous studies.10 11 21 22 The Omicron variant led to an unprecedented surge in cases, which peaked in early March 2022. Compared with a local study conducted at the onset of the COVID-19 pandemic,10 the prevalence of probable anxiety among FDWs declined from 25% to 8.6%. The present estimate was also markedly lower than those obtained using the official cut-off score of ≥7 for the Anxiety subscale of the Depression, Anxiety, and Stress Scale-21 Items (DASS-21-A) in the general populations of Hong Kong (14%)11 and the Philippines (38.4%).21 In Singapore, 17.5% of migrant workers exhibited probable anxiety (DASS-21-A score ≥8).22 The discrepancy in anxiety prevalence across studies may be attributed to differences in study contexts and timeframes. Although the fifth wave of COVID-19 had nearly subsided in Hong Kong during the present study period, other regions were still experiencing high caseloads. The relatively low prevalence of anxiety among FDWs may indicate the development of psychological resilience after the Omicron wave. Additionally, information dissemination and vaccine availability were more established compared with the second and third waves of the pandemic.10
 
In response to the fifth wave of the COVID-19 pandemic, the local government implemented comprehensive public health policies to safeguard rights and facilitate risk communication among minority populations in Hong Kong. Information on COVID-19 and vaccination was made available in multiple languages, including Tagalog and English, thereby improving access to formal and accurate health information for FDWs. Access to adequate and accurate health information is essential for mitigating psychological distress and reducing anxiety levels associated with the pandemic, as demonstrated by the findings of a study conducted in the Philippines.21
 
Access to COVID-19 vaccines may partially explain the findings. In Hong Kong, domestic workers were designated as a priority group for vaccination within 1 month of launching the COVID-19 vaccination programme.23 Furthermore, the initial procurement of 22.5 million vaccine doses ensured sufficient supply for the entire population, allowing domestic workers to choose between Sinovac and BioNTech vaccines at no cost. The high effectiveness of COVID-19 vaccination may have contributed to anxiety reduction. As of August 2021, the majority of sampled domestic workers (80%) had received at least one dose of COVID-19 vaccine.24 A study by McMenamin et al25 demonstrated the substantial protective effect of COVID-19 vaccines against severe or fatal outcomes (BioNTech: two doses=83.9%; three doses=97.9%). Vaccination significantly reduces the risk of severe COVID-19 complications, hospitalisation, and mortality, which may have indirectly alleviated probable anxiety among FDWs. This assumption is supported by the results of a study examining the psychological impact of COVID-19 vaccination, which revealed lower anxiety levels among vaccinated individuals.26 By contrast, the aforementioned local10 11 20 and Singapore22 studies assessing the anxiety of migrant workers were conducted during periods when no pharmaceutical preventive measures were available. Therefore, access to COVID-19 vaccines is a plausible explanation for the lower prevalence of probable anxiety among FDWs.
 
Additionally, job security may explain the decline in probable anxiety. Some FDWs expressed concerns regarding job insecurity and experienced distress due to job loss.4 Amid increasing reports of illegal contract terminations, the government intervened to uphold FDWs’ employment rights.27 On 5 March 2022, a government spokesperson emphasised zero tolerance for employers who illegally dismissed FDWs exhibiting SARS-CoV-2 infection.27 Any violation of the Employment Ordinance and related laws was subject to prosecution and fines.27 Filipina domestic workers exhibiting SARS-CoV-2 infection or identified as close contacts of individuals with COVID-19 receive the same assistance and support as other Hong Kong citizens, including quarantine and isolation arrangements.27 Greater institutional support for their employment may have contributed to the lower prevalence of anxiety among FDWs.
 
Perceived social support and anxiety
The significant others domain of perceived social support was negatively associated with anxiety in univariable analysis, but the association was no longer significant in multivariable regression. Significant others are individuals whom respondents regard as special.12 This finding contrasts with previous studies that identified perceived social support as an essential factor in coping with psychological distress among migrant workers.8 9 This discrepancy may be attributable to the small sample size. However, the finding is consistent with results from a local study conducted in a similar context.10
 
Filipina domestic workers migrate to foreign countries to support their families’ livelihoods; they are often portrayed as resilient and independent figures by the Philippine Government. This narrative may subtly reinforce the perception among FDWs that they are the sole breadwinners responsible for their families’ well-being.28 Consequently, although FDWs may seek informal social support from significant others, their self-disclosure remains selective. Psychological concerns, in particular, may be considered sensitive topics, leading to avoidance of such discussions in an effort to protect their self-esteem. This avoidance may explain the absence of an observed association between perceived social support and anxiety.
 
Working conditions and anxiety
Another key finding was that better working conditions (including greater satisfaction with compensation and salary, increased free time and rest periods, and higher satisfaction with value orientation) were associated with a lower likelihood of probable anxiety. Working conditions are recognised as social determinants of mental health. Findings from the World Health Organization suggest that jobs offering high rewards and a greater sense of control serve as protective factors for mental well-being, thereby reinforcing the importance of favourable working conditions for employees.29 Consistent with previous findings,30 high and regular monetary compensation was linked to lower probable anxiety in our study. According to the Occupational Wages Survey in the Philippines,30 the median monthly income was PHP13 646 (HK$1865, US$239), whereas the minimum allowable monthly wage for foreign domestic helpers in Hong Kong was HK$4630 (US$594) during the study period.31 Filipina domestic workers in Hong Kong therefore earned at least 2.48 times the median monthly income in the Philippines (HK$4630 vs HK$1865). Higher monthly earnings are often allocated toward property purchases in the Philippines, meeting family obligations, and fulfilling roles and responsibilities. Thus, greater satisfaction with compensation and salary may have contributed to lower probable anxiety among FDWs. Although this factor may explain the observed association, a qualitative study would provide deeper insights into the relationship between higher compensation and reduced psychological distress.
 
Additionally, increased free time and rest periods were associated with a lower risk of probable anxiety. An occupational health study32 established an inverse relationship between working hours and sleep duration, where anxiety and depression scores were higher among individuals working longer hours. These findings suggest that increased free time and rest periods can help reduce anxiety risk.
 
Notably, greater alignment between FDWs’ working environments and their social values was associated with lower anxiety risk. Value orientation refers to the principles an individual upholds, including ethics, morality, and attitudes toward work. In the workplace, each aspect of the working environment is interconnected with FDWs and their employers, influencing the likelihood of psychological distress. Employers are encouraged to engage in discussions with FDWs regarding working conditions—such as job demands and task restructuring—to ensure alignment in value orientation between both parties.
 
Other covariates
Although chronic disease was not a statistically significant predictor of anxiety in the multivariable logistic regression model, sensitivity analysis using an ordinal outcome identified it as a risk factor for increased anxiety severity. Despite these inconclusive findings, a systematic review33 indicated that a history of chronic diseases is linked to higher anxiety levels. The presence of chronic diseases has a negative impact on mental health.33
 
Limitations and strengths
Some limitations were inherent in our sampling method and study design. First, we could not establish causality. Because cross-sectional study designs provide only short-term data regarding associations, longitudinal studies are needed to examine temporal sequences and causal relationships. Second, the use of convenience sampling may introduce selection bias; therefore, generalisations of the findings to the entire FDW population should be made with caution. However, this bias is likely minimal because all FDWs were approached, and none were selectively invited based on specific characteristics; also, the demographic distribution of the sample closely resembled that of domestic workers recorded in the Hong Kong Population Census.34 The age distributions in the Census data34 and the study sample were comparable: 18-34 years (29.8% vs 27.8%), 35-44 years (48.2% vs 51.1%), and ≥45 years (22.0% vs 21.1%). Additionally, the respondents’ residence areas were evenly distributed across Hong Kong Island, Kowloon, and the New Territories. These findings suggest high representativeness and generalisability in the study sample. Furthermore, monetary incentives were provided, which may have contributed to higher-quality responses.
 
Conclusion
This study identified associations between optimal working conditions and lower probable anxiety among FDWs. The findings update the estimated prevalence of anxiety in this population and suggest that favourable working conditions may serve as protective factors. The study provides insights for the development and refinement of public health measures and occupational policies related to migrant workers, including compensation for overtime work, job security, and adequate rest periods. Psychological interventions tailored to domestic workers should be developed to address diverse mental health needs while incorporating labour protection. Regular review and refinement of occupational policies may be necessary. The Labour Department could consider conducting large-scale quantitative surveys and qualitative interviews with domestic workers to assess and accommodate their occupational needs. Future studies should aim to include domestic workers of various nationalities and other migrant worker populations.
 
Author contributions
Concept or design: KKY Lai, ELY Wong.
Acquisition of data: KKY Lai.
Analysis or interpretation of data: All authors.
Drafting of the manuscript: KKY Lai.
Critical revision of the manuscript for important intellectual content: ELY Wong.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank Prof Marc KC Chong, Ms Annie WL Cheung and Mr Jonathan CH Ma from the Centre for Health Systems and Policy Research, The Jockey Club School of Public Health and Primary Care, The Chinese University of Hong Kong for their valuable comments on the study and support in data analysis. The authors also thank all study respondents for their valuable time in completing the questionnaires and for their contributions as migrant workers in Hong Kong.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Survey and Behavioural Research Ethics Committee of The Chinese University of Hong Kong, Hong Kong (Ref No.: 018-22). The study was conducted in accordance with the principles of the Declaration of Helsinki. Informed consent was obtained from the participants prior to commencement of the survey.
 
References
1. Centre for Health Protection and Hospital Authority, Hong Kong SAR Government. Statistics on 5th wave of COVID-19 (from 31 Dec 2021 up till 31 May 2022 00:00). Available from: https://www.coronavirus.gov.hk/pdf/5th_wave_statistics/5th_wave_statistics_20220531.pdf. Accessed 5 Dec 2022.
2. Xiong J, Lipsitz O, Nasri F, et al. Impact of COVID-19 pandemic on mental health in the general population: a systematic review. J Affect Disord 2020;277:55-64. Crossref
3. Census and Statistics Department, Hong Kong SAR Government. 2021 Population Census. Main Results. Available from: https://www.census2021.gov.hk/doc/pub/21c-main-results.pdf. Accessed 1 Apr 2025.
4. Chow Y. No home away from home for domestic workers terminated after contracting coronavirus amid Hong Kong’s fifth wave. Young Post. South China Morning Post; 2022 May 16. Available from: https://www.scmp.com/yp/discover/news/hong-kong/article/3177657/no-home-away-home-domestic-workers-terminated-after. Accessed 4 Dec 2022.
5. Cheung JT, Tsoi VW, Wong KH, Chung RY. Abuse and depression among Filipino foreign domestic helpers. A cross-sectional survey in Hong Kong. Public Health 2019;166:121-7. Crossref
6. Choy CY, Chang L, Man PY. Social support and coping among female foreign domestic helpers experiencing abuse and exploitation in Hong Kong. Front Commun 2022;7:1015193. Crossref
7. Sterud T, Tynes T, Mehlum IS, et al. A systematic review of working conditions and occupational health among immigrants in Europe and Canada. BMC Public Health 2018;18:770. Crossref
8. Ioannou M, Kassianos AP, Symeou M. Coping with depressive symptoms in young adults: perceived social support protects against depressive symptoms only under moderate levels of stress. Front Psychol 2019;9:2780. Crossref
9. Straiton ML, Aambø AK, Johansen R. Perceived discrimination, health and mental health among immigrants in Norway: the role of moderating factors. BMC Public Health 2019;19:325. Crossref
10. Yeung NC, Huang B, Lau CY, Lau JT. Feeling anxious amid the COVID-19 pandemic: psychosocial correlates of anxiety symptoms among Filipina domestic helpers in Hong Kong. Int J Environ Res Public Health 2020;17:8102. Crossref
11. Choi EP, Hui BP, Wan EY. Depression and anxiety in Hong Kong during COVID-19. Int J Environ Res Public Health 2020;17:3740. Crossref
12. Duffy RD, Allan BA, England JW, et al. The development and initial validation of the Decent Work Scale. J Couns Psychol 2017;64:206-21. Crossref
13. Spitzer RL, Kroenke K, Williams JB, Löwe B. A brief measure for assessing generalized anxiety disorder: the GAD-7. Arch Intern Med 2006;166:1092-7. Crossref
14. Zimet GD, Dahlem NW, Zimet SG, Farley GK. The Multidimensional Scale of Perceived Social Support. J Pers Assess 1988;52:30-41. Crossref
15. International Organization for Migration. The Determinants of Migrant Vulnerability. Geneva: United Nations; 2019. Available from: https://www.iom.int/sites/g/files/tmzbdl486/files/our_work/DMM/MPA/1-part1-thedomv.pdf. Accessed 5 Dec 2022.
16. Garabiles MR, Lao CK, Yip P, Chan EW, Mordeno I, Hall BJ. Psychometric validation of PHQ-9 and GAD-7 in Filipino migrant domestic workers in Macao (SAR), China. J Pers Assess 2020;102:833-44. Crossref
17. Mendoza NB, Mordeno IG, Latkin CA, Hall BJ. Evidence of the paradoxical effect of social network support: a study among Filipino domestic workers in China. Psychiatry Res 2017;255:263-71. Crossref
18. Yeung NC, Kan KK, Wong AL, Lau JT. Self-stigma, resilience, perceived quality of social relationships, and psychological distress among Filipina domestic helpers in Hong Kong: a mediation model. Stigma Health 2021;6:90-9. Crossref
19. Pan American Health Organization. Questionnaire to Assess the Diagnosis and Treatment of Chronic Diseases. Geneva: World Health Organization. Available from: https://www.paho.org/hq/dmdocuments/2009/cncd_mgt_questionnaire.pdf. Accessed 4 Dec 2022.
20. Leung DD, Tang EY. Correlates of life satisfaction among Southeast Asian foreign domestic workers in Hong Kong: an exploratory study. Asian Pac Migr J 2018;27:368-77. Crossref
21. Tee ML, Tee CA, Anlacan JP, et al. Psychological impact of COVID-19 pandemic in the Philippines. J Affect Disord 2020;277:379-91. Crossref
22. Saw YE, Tan EY, Buvanaswari P, Doshi K, Liu JC. Mental health of international migrant workers amidst large-scale dormitory outbreaks of COVID-19: a population survey in Singapore. J Migr Health 2021;4:100062. Crossref
23. Labour Department, Hong Kong SAR Government. Foreign domestic helpers. Vaccination priority groups to be expanded to cover people aged 30 or above. 2021 Mar 15. Available from: https://www.fdh.labour.gov.hk/en/news_detail.html?year=2021&n_id=190. Accessed 28 Mar 2025.
24. Sumerlin TS, Kim JH, Wang Z, Hui AY, Chung RY. Determinants of COVID-19 vaccine uptake among female foreign domestic workers in Hong Kong: a cross-sectional quantitative survey. Int J Environ Res Public Health 2022;19:5945.
25. McMenamin ME, Nealon J, Lin Y, et al. Vaccine effectiveness of one, two, and three doses of BNT162B2 and CoronaVac against COVID-19 in Hong Kong: a population-based observational study. Lancet Infect Dis 2022;22:1435-43.
26. Babicki M, Malchrzak W, Hans-Wytrychowska A, Mastalerz-Migas A. Impact of vaccination on the sense of security, the anxiety of COVID-19 and quality of life among Polish. A nationwide online survey in Poland. Vaccines (Basel) 2021;9:1444.
27. Hong Kong SAR Government. Government’s response on situation of foreign domestic helpers affected by COVID-19 (with photos) [press release]. 2022 Mar 5. Available from: https://www.info.gov.hk/gia/general/202203/05/P2022030500399.htm. Accessed 4 Dec 2022.
28. Rich GJ. Filipina migrant domestic workers in Asia: mental health and resilience. In: Rich GJ, Jaafar JL, Barron D, editors. Psychology in Southeast Asia: Sociocultural, Clinical, and Health Perspectives. London: Routledge, Taylor & Francis Group; 2020.
29. World Health Organization. Social determinants of mental health. 2014 May 18. Available from: https://www.who.int/publications/i/item/9789241506809. Accessed 1 Apr 2025.
30. Mapa DS. Average monthly wage rates of selected occupations: 2018 and 2020 [Internet]. 2020 Occupational Wages Survey (OWS). Philippine Statistics Authority; 2022. Available from: https://psa.gov.ph/statistics/occupational-wages-survey/node/168472. Accessed 4 Dec 2022.
31. Hong Kong SAR Government. Minimum allowable wage and food allowance for foreign domestic helpers [press release]. 2021 Sep 30. Available from: https://www.info.gov.hk/gia/general/202109/30/P2021093000329.htm. Accessed 4 Dec 2022.
32. Afonso P, Fonseca M, Pires JF. Impact of working hours on sleep and mental health. Occup Med (Lond) 2017;67:377-82.
33. Clarke DM, Currie KC. Depression, anxiety and their relationship with chronic diseases: a review of the epidemiology, risk and treatment evidence. Med J Aust 2009;190:S54-60.
34. Census and Statistics Department, Hong Kong SAR Government. 2021 Population Census: Summary Results. 2022. Available from: https://www.censtatd.gov.hk/en/data/stat_report/product/B1120106/att/B11201062021XXXXB01.pdf. Accessed 4 Dec 2022.

Mask-wearing intention after the removal of the mandatory mask-wearing requirement in Hong Kong: application of the protection motivation theory and the theory of planned behaviour

Hong Kong Med J 2025 Apr;31(2):119–29 | Epub 7 Apr 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
Tommy KC Ng, MSc1; Ben YF Fong, MPH, FHKAM (Community Medicine)1; Vincent TS Law, DBA, PMgr2; Pimtong Tavitiyaman, PhD3; WK Chiu, PhD1
1 Division of Science, Engineering and Health Studies, College of Professional and Continuing Education, Hong Kong Polytechnic University, Hong Kong SAR, China
2 Division of Social Sciences, Humanities and Design, College of Professional and Continuing Education, Hong Kong Polytechnic University, Hong Kong SAR, China
3 Division of Business and Hospitality Management, College of Professional and Continuing Education, Hong Kong Polytechnic University, Hong Kong SAR, China
 
Corresponding author: Mr Tommy KC Ng (tommy.ng@cpce-polyu.edu.hk)
 
Abstract
Introduction: The mandatory mask-wearing requirement, which had been in place for nearly 1000 days in Hong Kong, was lifted on 1 March 2023. Little is known about the intention to continue wearing a mask after the removal of the mandate in the city. This study aimed to examine predictors of mask-wearing intention after the mandate was lifted, using the protection motivation theory (PMT) and the theory of planned behaviour (TPB).
 
Methods: A conceptual model was developed to depict the relationships between the constructs of PMT and TPB in predicting continued mask-wearing intention after the removal of the mandate. A cross-sectional study was conducted using an online questionnaire from 8 to 20 March 2023. Partial least squares structural equation modelling was utilised to examine relationships between the constructs.
 
Results: In total, 483 responses were included in the data analysis. Perceived severity (β=0.089; P=0.017), perceived self-efficacy (β=0.253; P<0.001), subjective norms (β=0.289; P<0.001), and attitude (β=0.325; P<0.001) had significant positive effects on the intention to continue wearing a mask. In contrast, the perceived reward of maladaptive behaviours had a significant negative effect on mask-wearing intention (β=-0.071; P=0.012). Perceived vulnerability, perceived response efficacy, perceived response cost, and perceived behavioural control were not significantly associated with mask-wearing intention.
 
Conclusion: The findings indicate that attitude towards continued mask-wearing was the strongest predictor of mask-wearing intention, followed by subjective norms and perceived self-efficacy. Insights from this study may inform public health policymaking regarding mask-wearing practices in future health crises.
 
 
New knowledge added by this study
  • More than half of the respondents (53.6%) consistently wore a mask after the mandatory mask-wearing requirement had been lifted in Hong Kong.
  • Attitude towards continued mask-wearing was the strongest predictor of mask-wearing intention, followed by subjective norms and perceived self-efficacy.
Implications for clinical practice or policy
  • A high frequency of mask-wearing was observed after the mandatory mask-wearing requirement had been lifted. The progress of Hong Kong citizens in returning to pre-pandemic norms requires further evaluation.
  • The positive attitude towards mask-wearing among Hong Kong citizens suggests that they are prepared for future health crises.
 
 
Introduction
The coronavirus disease 2019 (COVID-19) pandemic has had extensive global social and health impacts. It triggered an international health and economic crisis that has profoundly altered people’s lives, perceptions, and behaviours. As of 13 March 2025, about 778 million confirmed cases of COVID-19 and around 7.1 million deaths had been reported worldwide.1 Various levels of non-pharmaceutical interventions, including frequent handwashing, mask-wearing, and social distancing, were implemented in most countries.2 These interventions played important roles in reducing community transmission of COVID-19.3 However, the stringent measures also led to negative consequences, such as economic slowdown, disrupted education, and increased social isolation and psychological stress.4 5 Many countries lifted non-pharmaceutical interventions while the number of cases was still increasing. In England, all COVID-19–related restrictions were lifted on 22 February 2022 under the ‘Living with COVID’ strategy,6 although the number of cases increased in subsequent months. By contrast, Australia, Singapore, and Hong Kong adopted a ‘Zero-COVID’ strategy.7 In Australia, all mandatory mask-wearing requirements on public transport were lifted in mid-September 2022.8 Singapore lifted such requirements on 9 February 2023.9 Hong Kong, a leading international business and financial centre, finally lifted all mandatory mask-wearing requirements on 1 March 2023,10 nearly 1000 days after the requirement was introduced in 2020. Since then, the city has been transitioning towards the post–COVID-19 era.
 
During the COVID-19 pandemic, many governments mandated mask-wearing in public areas. Mask-wearing behaviour was largely a response to legal restrictions and requirements. Obedience, as a form of social influence, played a role in mask adherence; individuals sought to avoid social punishment, including fines or imprisonment. Additionally, normative social influence emerged as a means of curbing the spread of COVID-19. A positive correlation was observed between social norms regarding mask-wearing and mask uptake, such that individuals were more likely to wear a mask if their friends and relatives did so.11 Furthermore, individuals’ beliefs about engaging in the right behaviour were associated with their behavioural intentions. Personal norms regarding mask-wearing were significantly associated with mask-wearing intention.12 In the post–COVID-19 era, individuals may continue mask-wearing even after governments have lifted mandatory requirements, potentially due to self-motivation for health protection. This study aimed to identify predictors of mask-wearing intentions and practices after the mandatory mask-wearing requirement had been lifted in Hong Kong by integrating the protection motivation theory (PMT) with the theory of planned behaviour (TPB). This integration provides a comprehensive framework for evaluating mask-wearing intentions by examining key factors influencing health behaviours, including perceived severity, perceived vulnerability, attitudes, and subjective norms. This approach may offer a nuanced understanding of predictors of mask-wearing intentions after the mandatory mask-wearing requirement had been lifted.
 
Protection motivation theory
Protection motivation theory has been widely used as a framework for predicting the adoption of health-protective behaviours.13 This theory assumes that the adoption of protective behaviour against health threats depends on personal motivation for self-protection. Rooted in expectancy-value theory, PMT explains the social and cognitive processes underlying protective behaviours. According to PMT, the decision to counteract a health threat, and thus behavioural intention, is determined by two primary processes: threat appraisal and coping appraisal.14 Threat appraisal consists of three components: perceived vulnerability, perceived severity, and the perceived reward of maladaptive behaviours. Perceived vulnerability refers to an individual’s assessment of the likelihood of experiencing a health threat or developing a health condition. Perceived severity concerns the perceived seriousness of potential consequences associated with the condition. Consistent with this framework, perceptions of COVID-19 severity and vulnerability to the disease have been reported to predict adherence to protective measures.15 Perceived reward of maladaptive behaviours refers to beliefs regarding the benefits associated with engaging in risky behaviours. Patients with COVID-19 may experience long COVID symptoms, including increased fatigue, depressive symptoms, and reduced mental acuity.16 In this context, individuals may continue wearing masks due to concerns about long-COVID severity. Thus, perceived vulnerability and perceived severity are expected to be positively associated with the intention to continue wearing a mask in the post–COVID-19 era, whereas the perceived reward of maladaptive behaviours is expected to be negatively associated with this intention. Three hypotheses were proposed in relation to these elements (H1 to H3 in the online supplementary Table).
 
Coping appraisal comprises perceived response efficacy, perceived self-efficacy, and perceived response cost. Perceived response efficacy refers to belief in the effectiveness of the recommended behaviour with respect to mitigating or preventing potential harm.17 Perceived self-efficacy denotes an individual’s confidence in overcoming barriers to implementing the recommended behaviour.18 Perceived response cost refers to perceived costs associated with the behaviour. Perceived response efficacy has been positively associated with social distancing behaviours, a non-pharmaceutical intervention for COVID-19, among Hong Kong adults.19 Three hypotheses were derived in relation to these elements (H4 to H6 in the online supplementary Table).
 
Theory of planned behaviour
The TPB is a well-established model for explaining health-related behavioural intentions, which are influenced by subjective norms (perceived expectations from significant others regarding the behaviour), attitude (personal feelings and beliefs about the behaviour), and perceived behavioural control (perceived ability to perform the behaviour). Individuals with a more positive attitude towards non-pharmaceutical interventions exhibit a greater intention to implement such interventions.20 Similarly, subjective norms and perceived behavioural control have demonstrated positive associations with the intention to adopt interventions against COVID-19.20 Five hypotheses were formulated in relation to these elements (H7 to H11 in the online supplementary Table).
 
Integration of protection motivation theory and theory of planned behaviour
The integration of PMT and TPB has been utilised to predict behavioural intention in various research contexts, such as adherence to COVID-19 behavioural guidelines,21 behavioural intention towards COVID-19 booster vaccination,22 and factors affecting preventive behaviours during the COVID-19 pandemic.23 In this study, the attitude component of TPB was used to assess an individual’s attitude towards continuing to wear a mask. Attitudes may be influenced by an individual’s protection motivation. A meta-analysis identified perceived importance, perceived benefits, perceived effectiveness, and perceived barriers to preventive behaviour as key attitudinal factors influencing such behaviour.24 Therefore, a conceptual model was developed to illustrate relationships between the constructs of PMT and TPB in predicting continued mask-wearing after the announcement that all mandatory mask-wearing requirements had been lifted. Fourteen hypotheses were formulated in relation to these elements (H12 to H25 in the online supplementary Table).
 
Methods
Participant recruitment
This cross-sectional study was conducted using an online questionnaire between 8 and 20 March 2023. Participants were recruited through a non-probability snowball sampling method that had been used in a previous study.3 The target sample size was determined based on the requirement that it should be 10 times the maximum number of measurement items associated with a single construct in the partial least squares path model.25 In this study, 37 items measured ten constructs, resulting in a target sample size of 370 (10 × 37). The online questionnaire was distributed via email and WhatsApp, a widely used social media platform in Hong Kong. Using the researchers’ personal social networks, eligible individuals of various ages and educational backgrounds were invited to participate. They were also encouraged to share the questionnaire link with suitable colleagues and friends. Additionally, the researchers contacted the heads of local community colleges to seek collaboration and support. Upon receiving approval from directors or presidents, the researchers sent the online questionnaire to those leaders for recruitment of eligible participants. Individuals were included in this study if they were Hong Kong residents aged ≥18 years and had access to the internet via a smartphone or computer. Participants read a statement on the survey’s background, anonymity, and participation agreement before providing consent. To prevent duplicate submissions, the prefix and first three digits of the Hong Kong Identity Card were collected and later removed prior to data analysis.
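The sample-size arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the calculation as reported (10 × 37 items); the function name `ten_times_rule_target` is hypothetical, and the figure of 483 valid responses is taken from the Results section.

```python
def ten_times_rule_target(n_items: int, multiplier: int = 10) -> int:
    """Target sample size as applied in this study: multiplier x number of items."""
    if n_items <= 0:
        raise ValueError("n_items must be positive")
    return multiplier * n_items


# 37 questionnaire items were reported as measuring the ten constructs.
target = ten_times_rule_target(37)

# 483 valid responses were reported in the Results.
valid_responses = 483
meets_target = valid_responses >= target

print(f"target={target}, valid={valid_responses}, meets_target={meets_target}")
```

The comparison simply confirms that the number of valid responses exceeded the stated target.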
 
Measures within the questionnaire
The questionnaire, consisting of four sections, was designed to assess perceived vulnerability, perceived severity, perceived reward of maladaptive behaviours, perceived response efficacy, perceived self-efficacy, perceived response cost, attitude, perceived behavioural control, subjective norms, and intention to continue wearing a mask after the mandatory mask-wearing requirement had been lifted. The first section included two questions focused on mask-wearing frequency after the mandatory requirement had been lifted and on verification of Hong Kong residency. The second section examined respondents’ adoption of health-protective behaviours, based on PMT.26 27 The third section measured variables related to respondents’ intention to continue wearing a mask, based on TPB.3 27 All items in the second and third sections were assessed using a five-point Likert scale (1=strongly disagree to 5=strongly agree). The final section collected demographic information, such as age, gender, education level, economic status, and self-reported health status, through close-ended questions.
 
Data analysis
Partial least squares structural equation modelling was utilised to examine the conceptual framework in this study. The SmartPLS 3.0 statistical software (SmartPLS GmbH, Bönningstedt, Germany) was used to assess both the reflective measurement model and the structural model. Study reliability and validity were evaluated by assessing internal consistency and convergent validity in the reflective measurement model.25 Convergent validity was considered acceptable if the outer loadings of the measurement items exceeded 0.5 and the average variance extracted for each construct was >0.5.25 28 Internal reliability was evaluated using composite reliability, which was recommended to exceed 0.708, and Cronbach’s alpha, which should be >0.6.25 Path coefficients were assessed within the structural model. A P value <0.05 was considered significant.
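As a rough illustration of the reliability and validity indices named above, the following Python sketch computes the average variance extracted and composite reliability from standardised outer loadings, and Cronbach’s alpha from raw item scores, using their standard formulas. It is not the SmartPLS implementation; the function names, loadings, and Likert scores are hypothetical, and only the thresholds (0.5, 0.708, and 0.6) come from the text.

```python
from statistics import variance


def average_variance_extracted(loadings):
    """AVE: mean of the squared standardised outer loadings."""
    return sum(l * l for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    total = sum(loadings) ** 2
    error = sum(1 - l * l for l in loadings)
    return total / (total + error)


def cronbach_alpha(item_scores):
    """item_scores: one list of respondent scores per item (equal lengths)."""
    k = len(item_scores)
    totals = [sum(resp) for resp in zip(*item_scores)]
    item_var = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / variance(totals))


# Hypothetical loadings and five-point Likert scores for one construct.
loadings = [0.82, 0.78, 0.75]
scores = [[4, 5, 3, 4, 2, 5], [4, 4, 3, 5, 2, 5], [5, 5, 2, 4, 3, 4]]

ave = average_variance_extracted(loadings)
cr = composite_reliability(loadings)
alpha = cronbach_alpha(scores)

# Thresholds as stated in the Data analysis section.
acceptable = min(loadings) > 0.5 and ave > 0.5 and cr > 0.708 and alpha > 0.6
print(round(ave, 3), round(cr, 3), round(alpha, 3), acceptable)
```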
 
Results
Participant characteristics
In total, 483 valid responses were included in the data analysis. Table 1 presents the participants’ demographic characteristics. The largest proportion of respondents belonged to the 18-25 age-group (28.2%), followed by the 56-65 (18.4%), the 66-75 (13.7%), and the 36-45 (13.0%) age-groups. The mean age was 43.56 years. Among the participants, 269 (55.7%) were men and 214 (44.3%) were women. Most respondents (59.0%) had attained a degree-level education or higher; more than two-fifths of respondents were employed. Additionally, approximately half of the respondents (46.6%) rated their health status as good. More than half of the respondents (53.6%) reported always wearing a mask after the mandatory mask-wearing requirement had been lifted. The median number of COVID-19 vaccine doses received was three (interquartile range=1).
 

Table 1. Participant demographic characteristics (n=483)
 
Measurement model
Table 2 presents the reliability and validity of the measurement model. Loadings >0.7 indicate a satisfactory level of item reliability.25 29 The outer loadings of all items exceeded 0.7, except for one item related to perceived behavioural control; consequently, this item was removed. Internal consistency reliability was considered satisfactory because composite reliability and Cronbach’s alpha exceeded the threshold value of 0.7. The average variance extracted for all constructs was >0.5, suggesting good convergent validity after the removal of five items: one item each from perceived severity, perceived response efficacy, perceived self-efficacy, attitude, and behavioural intention. The variance inflation factor for each item was <5, indicating no critical levels of collinearity. Table 3 presents the results of the assessment of discriminant validity. Given the adequacy of indicator reliability, internal consistency reliability, convergent validity, and discriminant validity, evaluation of the structural model could proceed.29
 

Table 2. Construct validity and reliability of the measurement model
 

Table 3. Values of construct correlations, square roots of average variance extracted (italic font), and heterotrait-monotrait ratio of correlations (grey shades)
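Table 3 reports construct correlations alongside the square roots of the average variance extracted (AVE), which is the layout used for the Fornell-Larcker criterion: discriminant validity is supported when the square root of each construct’s AVE exceeds its correlations with the other constructs. The sketch below shows that comparison under assumed inputs; the function name and all numerical values are hypothetical and are not taken from Table 3.

```python
import math


def fornell_larcker_ok(ave, correlations):
    """Return True if sqrt(AVE) of both constructs exceeds |r| for every pair.

    ave: dict mapping construct name -> AVE
    correlations: dict mapping (construct_a, construct_b) -> correlation
    """
    for (a, b), r in correlations.items():
        if abs(r) >= min(math.sqrt(ave[a]), math.sqrt(ave[b])):
            return False
    return True


# Hypothetical values for three constructs.
ave = {"attitude": 0.72, "subjective_norms": 0.65, "self_efficacy": 0.70}
correlations = {
    ("attitude", "subjective_norms"): 0.61,
    ("attitude", "self_efficacy"): 0.58,
    ("subjective_norms", "self_efficacy"): 0.49,
}

discriminant_validity = fornell_larcker_ok(ave, correlations)
print(discriminant_validity)
```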
 
Structural model
Table 4 displays the results of direct effects in the structural model. Of the 17 hypotheses, 10 were supported based on the results generated through a bootstrapping procedure with 5000 resamples. Four constructs—perceived severity, perceived self-efficacy, subjective norms, and attitude—had significant positive effects on the intention to continue wearing a mask. In contrast, perceived reward of maladaptive behaviours had a significant negative effect on mask-wearing intention. Consequently, hypotheses H2, H3, H5, H7, and H8 were supported. However, perceived vulnerability, perceived response efficacy, perceived response cost, and perceived behavioural control were not significantly associated with the intention to continue wearing a mask. Thus, hypotheses H1, H4, H6, and H9 were not supported.
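The bootstrapping step mentioned above can be illustrated with a simplified stand-in. The sketch below does not fit a PLS-SEM model; it swaps in a single standardised bivariate coefficient (Pearson’s r) and a percentile confidence interval, and uses synthetic data. Only the figure of 5000 resamples comes from the text; the function names, seeds, and data are assumptions.

```python
import math
import random


def pearson_r(xs, ys):
    """Standardised coefficient for a single predictor (Pearson correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(xs, ys, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample respondents with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(pearson_r([xs[i] for i in idx], [ys[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic data: 200 respondents, outcome moderately related to the predictor.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(200)]
y = [0.5 * xi + data_rng.gauss(0, 1) for xi in x]

beta_hat = pearson_r(x, y)
lower, upper = bootstrap_ci(x, y)

# The path is treated as significant if the interval excludes zero.
significant = lower > 0 or upper < 0
print(round(beta_hat, 3), round(lower, 3), round(upper, 3), significant)
```

In a full PLS-SEM analysis the whole model is re-estimated on each resample, but the resampling logic is the same.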
 

Table 4. Direct effects of the structural model
 
Furthermore, subjective norms, perceived severity, perceived response efficacy, and perceived self-efficacy had significant positive effects on attitude, whereas perceived reward of maladaptive behaviours had a significant negative effect on attitude. Therefore, hypotheses H10, H13, H14, H15, and H16 were supported. However, no significant relationships were observed between perceived behavioural control and attitude, perceived vulnerability and attitude, or perceived response cost and attitude. These findings did not support hypotheses H11, H12, and H17 (Table 4). The results of the structural model are depicted in the Figure.
 

Figure. Depiction of the structural model
 
Table 5 shows the results of the mediation model. Attitude had a partial mediating effect on the relationships of perceived self-efficacy, perceived reward of maladaptive behaviours, subjective norms, and perceived severity with the intention to continue wearing a mask. These results partially supported hypotheses H19, H20, H21, and H22. Additionally, attitude had a full mediating effect on the relationship between perceived response efficacy and the intention to continue wearing a mask, supporting hypothesis H24. However, no mediating effect of attitude was observed in the relationships of perceived response cost, perceived vulnerability, and perceived behavioural control with the intention to continue wearing a mask. These results did not support hypotheses H18, H23, and H25.
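The distinction between partial, full, and no mediation used above can be expressed as a small decision rule. The sketch below assumes the common convention that the indirect effect is the product of the two constituent paths, and classifies mediation from the significance of the direct and indirect effects. The function names and the value of the first path are hypothetical; the attitude-to-intention coefficient (0.325) is the one reported in the abstract.

```python
def indirect_effect(a: float, b: float) -> float:
    """Indirect effect = (predictor -> mediator) x (mediator -> outcome)."""
    return a * b


def classify_mediation(direct_significant: bool, indirect_significant: bool) -> str:
    """Classify mediation from the significance of direct and indirect effects."""
    if not indirect_significant:
        return "none"
    return "partial" if direct_significant else "full"


# Hypothetical path from a predictor to attitude.
a_path = 0.20
# Path from attitude to intention, as reported in the abstract.
b_path = 0.325

effect = indirect_effect(a_path, b_path)

# Patterns described in the text: both effects significant (e.g. perceived
# severity), indirect effect only (perceived response efficacy), and no
# significant indirect effect (e.g. perceived vulnerability).
partial = classify_mediation(direct_significant=True, indirect_significant=True)
full = classify_mediation(direct_significant=False, indirect_significant=True)
none = classify_mediation(direct_significant=False, indirect_significant=False)
print(round(effect, 3), partial, full, none)
```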
 

Table 5. Mediating effects of the structural model
 
Discussion
Most respondents continued wearing masks during the 3 weeks after the mandatory mask-wearing requirement had been lifted. Perceived severity, perceived self-efficacy, subjective norms, and attitude were positively associated with the intention to continue wearing a mask, whereas the perceived reward of maladaptive behaviours was negatively associated with this intention. The positive effect of perceived severity suggests that individuals remained concerned about the consequences of contracting COVID-19. Given that COVID-19 had influenced daily life and behaviour for 3 years, it is understandable that perceived severity remained a motivator for continued mask-wearing as a protective measure. Furthermore, some individuals may have experienced anxiety and sought to minimise the risk of infection. Thus, concern about the pandemic may have outweighed their desire to return to pre-pandemic norms.30 Additionally, the effect of perceived self-efficacy indicates that individuals with confidence in their ability to wear a mask effectively were more likely to continue doing so. Personal protective measures can reduce the risk of infectious diseases31; mask-wearing is considered a feasible and acceptable method for preventing and reducing the spread of influenza-like illnesses.32 During the COVID-19 pandemic, some studies showed that perceived severity and perceived self-efficacy were significantly associated with intentions to comply with COVID-19 preventive behaviours.17 33 34 Individuals perceived that contracting COVID-19 posed a serious threat and that mask-wearing remained a feasible and effective strategy for preventing transmission, even after the mandatory mask-wearing requirement had been lifted.
 
Notably, the perceived reward of maladaptive behaviours had a significant negative effect on the intention to continue wearing a mask. This finding suggests that individuals who perceived benefits from not wearing a mask were less likely to express an intention to continue mask-wearing. The decision not to wear a mask may be attributed to various factors, including concerns about social judgement, the inconveniences associated with preventive measures, and daily hassles.35 36 The prolonged COVID-19 pandemic led to pandemic fatigue, which may have contributed to a perception among some individuals that the pandemic had ended once the mandatory mask-wearing requirement was lifted, thereby reducing their motivation to continue wearing a mask.
 
Attitudes and subjective norms had significant positive effects on the intention to continue wearing a mask. This observation indicates that individuals who held a favourable attitude towards mask-wearing and perceived social pressure or influence from others to wear a mask were more likely to express an intention to continue this practice. Attitudes and subjective norms were previously identified as predictors of mask-wearing intention during the COVID-19 pandemic.3 Before the pandemic, the local population in Hong Kong exhibited a positive attitude towards mask-wearing. For example, patients and caregivers in outpatient settings generally wore face masks; protecting others was a primary motivation for this approach.37 Individuals with a positive attitude towards mask-wearing may have been influenced by government-led promotion of preventive behaviours since the severe acute respiratory syndrome epidemic in 2003, which caused mask-wearing to become a social norm within the community.38 The present findings indicate that higher levels of perceived self-efficacy, subjective norms, and perceived severity not only directly increased the intention to continue wearing a mask but also fostered a more favourable attitude, which in turn increased this intention. Conversely, the perceived reward of maladaptive behaviours reduced the intention both directly and through a less favourable attitude. These results provide empirical evidence supporting the role of attitude as a mediator in the intention to continue wearing a mask. Thus, the relationships of perceived self-efficacy, perceived reward of maladaptive behaviours, subjective norms, and perceived severity with the intention to continue mask-wearing can also be explained by individuals’ attitudes.
 
In the present study, perceived vulnerability did not directly predict the intention to continue wearing a mask. A previous study also showed no significant association between perceived vulnerability and the adoption of preventive behaviours.39 A possible explanation is that the prolonged COVID-19 pandemic led individuals to consider themselves less vulnerable than they had been during the early stages of the pandemic. The removal of government restrictions may have further reinforced the perception of reduced vulnerability to COVID-19.40 Additionally, the results of this study did not demonstrate a statistically significant direct effect of perceived response efficacy on the intention to continue wearing a mask. However, a mediating role for attitude was identified in this relationship, indicating that perceived response efficacy influenced attitude, which in turn influenced the intention to continue wearing a mask.
 
Implications
This study highlights the importance of understanding the predictors of mask-wearing intention after the mandatory mask-wearing requirement was lifted. A high frequency of mask-wearing was observed after the removal of the requirement. This finding has implications for future research regarding the long-term effects of habitual mask use and its impact on public health. From a practical perspective, the findings indicate that attitude towards continued mask-wearing was the strongest predictor of mask-wearing intention, suggesting that citizens are prepared for future health crises. Policymakers can utilise these insights to develop guidelines encouraging mask use during influenza seasons.
 
Limitations
This study had certain limitations. First, the sampling method relied on non-probability snowball sampling, which may affect the representativeness of the sample. Second, participation was limited to individuals with access to email and social media, leading to overrepresentation of younger and more educated individuals. Younger participants may consider themselves less likely to experience severe health consequences if they contract COVID-19. Consequently, the findings may not be generalisable to the entire population.
 
Conclusion
To our knowledge, this is one of the first studies to use an online questionnaire to identify the predictors of mask-wearing intention after the mandatory mask-wearing requirement in Hong Kong was lifted in March 2023. Attitude towards continued mask-wearing, subjective norms, and perceived self-efficacy exhibited strong positive effects on the intention to continue wearing a mask. Regarding research implications, this study provides new insights into the evaluation of Hong Kong citizens’ transition to a post-pandemic era. The high frequency of mask-wearing observed may be attributed to concerns about COVID-19 and the establishment of mask-wearing as an accepted and habitual behaviour within the local population. Furthermore, the findings suggest that Hong Kong citizens are well prepared for future health crises, such as severe acute respiratory syndrome and additional COVID-19 outbreaks. The positive attitude towards mask-wearing reflects recognition of its feasibility and effectiveness as a durable non-pharmaceutical public health intervention to reduce airborne disease transmission.
 
Author contributions
Concept or design: TKC Ng, BYF Fong.
Acquisition of data: All authors.
Analysis or interpretation of data: TKC Ng, BYF Fong.
Drafting of the manuscript: TKC Ng, BYF Fong.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Research Committee of the College of Professional and Continuing Education of Hong Kong Polytechnic University, Hong Kong (Ref No.: RC/ETH/H/133). Informed consent was obtained from all participants prior to the study and for the publication of this research.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. World Health Organization. WHO COVID-19 Dashboard. Available from: https://covid19.who.int/. Accessed 13 Mar 2025.
2. Lison A, Banholzer N, Sharma M, et al. Effectiveness assessment of non-pharmaceutical interventions: lessons learned from the COVID-19 pandemic. Lancet Public Health 2023;8:e311-7.
3. Duan Y, Shang B, Liang W, et al. Predicting hand washing, mask wearing and social distancing behaviors among older adults during the COVID-19 pandemic: an integrated social cognition model. BMC Geriatr 2022;22:91.
4. Diallo I, Ndejjo R, Leye MM, et al. Unintended consequences of implementing non-pharmaceutical interventions for the COVID-19 response in Africa: experiences from DRC, Nigeria, Senegal, and Uganda. Global Health 2023;19:36.
5. ÓhAiseadha C, Quinn GA, Connolly R, et al. Unintended consequences of COVID-19 non-pharmaceutical interventions (NPIs) for population health and health inequalities. Int J Environ Res Public Health 2023;20:5223.
6. Limb M. COVID-19: scientists and medics warn that it is too soon to lift all restrictions in England. BMJ 2022;376:o469.
7. Zhan Z, Li J, Cheng ZJ. Zero-COVID strategy: what’s next? Int J Health Policy Manag 2023;12:6757.
8. Dunstan J. Victoria to ease COVID-19 mask mandate on public transport from 11:59pm Thursday. ABC News [newspaper on the Internet]. 2022 Sep 21. Available from: https://www.abc.net.au/news/2022-09-21/victoria-mask-mandate-ends-trains-trams-buses-transport/101460606. Accessed 7 Jul 2024.
9. Lin C. Singapore relaxes COVID travel curbs, mask rules further. Reuters [newspaper on the Internet]. 2023 Feb 9. Available from: https://www.reuters.com/world/asia-pacific/singapore-relaxes-covid-travel-curbs-mask-rules-further-2023-02-09/. Accessed 7 Jul 2024.
10. Hong Kong SAR Government. Government lifts all mandatory mask-wearing requirements [press release]. 2023 Feb 28. Available from: https://www.info.gov.hk/gia/general/202302/28/P2023022800677.htm. Accessed 13 Mar 2025.
11. Barceló J, Sheen GC. Voluntary adoption of social welfare-enhancing behavior: mask-wearing in Spain during the COVID-19 outbreak. PloS One 2020;15:e0242764. Crossref
12. Lipsey NP, Losee JE. Social influences on mask-wearing intentions during the COVID-19 pandemic. Soc Pers Psychol Compass 2023;17:e12817. Crossref
13. Ezati Rad R, Mohseni S, Kamalzadeh Takhti H, et al. Application of the protection motivation theory for predicting COVID-19 preventive behaviors in Hormozgan, Iran: a cross-sectional study. BMC Public Health 2021;21:466. Crossref
14. Fischer-Preßler D, Bonaretti D, Fischbach K. A protection-motivation perspective to explain intention to use and continue to use mobile warning systems. Bus Inf Syst Eng 2022;64:167-82. Crossref
15. González-Castro JL, Ubillos-Landa S, Puente-Martínez A, Gracia-Leiva M. Perceived vulnerability and severity predict adherence to COVID-19 protection measures: the mediating role of instrumental coping. Front Psychol 2021;12:674032. Crossref
16. Bierbauer W, Lüscher J, Scholz U. Illness perceptions in long-COVID: a cross-sectional analysis in adults. Cogent Psychol 2022;9:2105007. Crossref
17. Lahiri A, Jha SS, Chakraborty A, Dobe M, Dey A. Role of threat and coping appraisal in protection motivation for adoption of preventive behavior during COVID-19 pandemic. Front Public Health 2021;9:678566. Crossref
18. Bandura A. The growing centrality of self-regulation in health promotion and disease prevention. Eur Health Psychol 2005;7:11-2.
19. Yu Y, Lau JT, Lau MM. Competing or interactive effect between perceived response efficacy of governmental social distancing behaviors and personal freedom on social distancing behaviors in the Chinese adult general population in Hong Kong. Int J Health Policy Manag 2022;11:498-507. Crossref
20. Ohnmacht T, Hüsser AP, Thao VT. Pointers to interventions for promoting COVID-19 protective measures in tourism: a modelling approach using domain-specific risk-taking scale, theory of planned behaviour, and health belief model. Front Psychol 2022;13:940090. Crossref
21. Nudelman G. Predicting adherence to COVID-19 behavioural guidelines: a comparison of protection motivation theory and the theory of planned behaviour. Psychol Health 2024;39:1689-705. Crossref
22. Zhou M, Liu L, Gu SY, et al. Behavioral intention and its predictors toward COVID-19 booster vaccination among Chinese parents: applying two behavioral theories. Int J Environ Res Public Health 2022;19:7520. Crossref
23. Khaday S, Li KW, Dorloh H. Factors affecting preventive behaviors for safety and health at work during the COVID-19 pandemic among Thai construction workers. Healthcare (Basel) 2023;11:426. Crossref
24. Liang W, Duan Y, Li F, et al. Psychosocial determinants of hand hygiene, facemask wearing, and physical distancing during the COVID-19 pandemic: a systematic review and meta-analysis. Ann Behav Med 2022;56:1174-87. Crossref
25. Hair Jr JF, Hult GT, Ringle CM, Sarstedt M. A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM). Los Angeles [CA]: Sage Publications; 2021.
26. Youn SY, Lee JE, Ha-Brookshire J. Fashion consumers’ channel switching behavior during the COVID-19: protection motivation theory in the extended planned behavior framework. Cloth Text Res J 2021;39:139-56. Crossref
27. Zhang X, Liu S, Wang L, Zhang Y, Wang J. Mobile health service adoption in China: integration of theory of planned behavior, protection motivation theory and personal health differences. Online Inf Rev 2020;44:1-23. Crossref
28. Ting MS, Goh YN, Isa SM. Determining consumer purchase intentions toward counterfeit luxury goods in Malaysia. Asia Pac Manage Rev 2016;21:219-30. Crossref
29. Sarstedt M, Ringle CM, Hair JF. Partial least squares structural equation modeling. In: Homburg C, Klarmann M, Vomberg AE, editors. Handbook of Market Research. Switzerland: Springer International Publishing; 2021: 587-632. Crossref
30. Mo PK, Yu Y, Lau MM, Ling RH, Lau JT. Time to lift up COVID-19 restrictions? Public support towards living with the virus policy and associated factors among Hong Kong general public. Int J Environ Res Public Health 2023;20:2989. Crossref
31. Masai AN, Akin L. Practice of COVID-19 preventive measures and risk of acute respiratory infections: a longitudinal study in students from 95 countries. Int J Infect Dis 2021;113:168-74. Crossref
32. Polonsky JA, Bhatia S, Fraser K, et al. Feasibility, acceptability, and effectiveness of non-pharmaceutical interventions against infectious diseases among crisis-affected populations: a scoping review. Infect Dis Poverty 2022;11:14. Crossref
33. Acar D, Kıcali ÜÖ. An integrated approach to COVID-19 preventive behaviour intentions: protection motivation theory, information acquisition, and trust. Soc Work Public Health 2022;37:419-34. Crossref
34. Kwok KO, Li KK, Chan HH, et al. Community responses during early phase of COVID-19 epidemic, Hong Kong. Emerg Infect Dis 2020;26:1575-9. Crossref
35. Lai DW, Jin J, Yan E, Lee VW. Predictors and moderators of COVID-19 pandemic fatigue in Hong Kong. J Infect Public Health 2023;16:645-50. Crossref
36. Rieger MO. To wear or not to wear? Factors influencing wearing face masks in Germany during the COVID-19 pandemic. Asian J Soc Health Behav 2020;3:50-4. Crossref
37. Ho HS. Use of face masks in a primary care outpatient setting in Hong Kong: knowledge, attitudes and practices. Public Health 2012;126:1001-6. Crossref
38. Mo PK, Lau JT. Illness representation on H1N1 influenza and preventive behaviors in the Hong Kong general population. J Health Psychol 2015;20:1523-33. Crossref
39. Zancu SA, Măirean C, Diaconu-Gherasim LR. The longitudinal relation between time perspective and preventive behaviors during the COVID-19 pandemic: the mediating role of risk perception. Curr Psychol 2024;43:12981-9. Crossref
40. Stefanczyk MM, Rokosz M, Białek M. Changes in perceived vulnerability to disease, resilience, and disgust sensitivity during the pandemic: a longitudinal study. Curr Psychol 2024;43:23412-24. Crossref

Willingness to pay and preferences for mindfulness-based interventions among patients with chronic low back pain in the Hong Kong public healthcare sector

Hong Kong Med J 2025 Apr;31(2):108–18 | Epub 14 Apr 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE  CME
Mengting Zhu, PhD1; Phoenix KH Mo, PhD1; Kailu Wang, PhD1; Hermione HM Lo, MSc1; YK Choi, PGDip2; SW Law, MSc3; Regina WS Sit, MD1
1 The Jockey Club School of Public Health and Primary Care, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
2 Department of Family Medicine, The New Territories East Cluster, Hospital Authority, Hong Kong SAR, China
3 Department of Orthopaedics and Traumatology, Alice Ho Miu Ling Nethersole Hospital, Hong Kong SAR, China
 
Corresponding author: Prof Regina WS Sit (reginasit@cuhk.edu.hk)
 
 
Abstract
Introduction: Low back pain (LBP) is a leading cause of disability worldwide. Mindfulness-based interventions (MBIs) are effective for LBP management when combined with medication and physical therapy. An understanding of patients’ willingness to pay (WTP) and preferences is needed to integrate MBIs into standard LBP care. We examined WTP and preferences for MBIs, as well as associated factors, among patients with chronic LBP in the Hong Kong public healthcare sector.
 
Methods: A cross-sectional survey was conducted in two Hong Kong public hospitals. We used the payment card method to assess patients’ WTP for MBIs and performed a discrete choice experiment to examine patients’ preferences for MBIs. Tobit regression was utilised to analyse factors associated with WTP for MBIs. Patients’ relative preferences for MBIs were estimated through a mixed logit model.
 
Results: Mean WTP for an eight-session course of MBIs was HK$258.75±508.11. Higher pain scores, monthly family income >HK$30 000, high school education, higher treatment expenses, and stronger belief in MBIs were associated with greater WTP. Patients were more likely to choose MBIs with lower costs, greater improvements in pain relief and the ability to perform daily activities, and a face-to-face delivery mode.
 
Conclusion: Patients with chronic LBP exhibited low WTP for MBIs. Strategies to improve education and awareness may enhance WTP; affordability and accessibility should be considered for individuals from diverse socio-economic backgrounds. The identified preferences provide insights for designing MBIs that align with patient needs. These findings offer valuable methodological references for other healthcare evaluations.
 
 
New knowledge added by this study
  • Patients with chronic low back pain have a low willingness to pay for mindfulness-based interventions (MBIs).
  • Individuals experiencing more severe pain and possessing greater financial capacity are more willing to pay for MBIs.
  • Patients prefer MBIs with lower costs, greater treatment effectiveness, and a face-to-face delivery mode.
Implications for clinical practice or policy
  • These findings have practical implications for the future implementation of MBIs in chronic pain management.
  • This study provides a methodological reference that could be adapted for evaluation of similar treatments in diverse international settings.
 
 
Introduction
Low back pain (LBP) is a prevalent health condition that can have disabling effects on individuals of all ages.1 This condition also imposes substantial socio-economic costs, as evidenced by studies demonstrating its impacts on healthcare systems and workforce productivity worldwide.2 3
 
Psychological treatments, particularly when combined with medication and physical therapy, are effective in managing LBP.4 Mindfulness-based interventions (MBIs; ie, evidence-based psychological approaches) have been shown to reduce pain, disability, and psychological distress associated with LBP.5 Moreover, studies have emphasised the cost-effectiveness of MBIs in reducing chronic pain–related healthcare expenses and productivity losses.6 7 Although the exact mechanisms through which MBIs alleviate pain have not been elucidated, there is evidence that they may alter pain signal processing in the brain, fostering acceptance and non-judgemental awareness. These outcomes enhance pain tolerance and reduce emotional reactivity to pain.8
 
Other commonly used social and psychotherapeutic modalities include cognitive-behavioural therapy and acceptance and commitment therapy. Cognitive-behavioural therapy targets maladaptive thought patterns and behaviours,9 whereas acceptance and commitment therapy focuses on promoting psychological flexibility despite the presence of pain.10 Mindfulness-based interventions uniquely emphasise cultivating present-moment awareness and acceptance of pain sensations.11 Key advantages of MBIs include their accessibility and cost-effectiveness: they can be efficiently delivered in group settings (either online or face-to-face), facilitating scalability for public healthcare initiatives.12 13 Moreover, they have the potential to enhance self-management skills for sustainable pain management.14 Acceptance and commitment therapy has limited empirical support and mixed results regarding its effectiveness in terms of improving pain intensity among patients with chronic pain.15 16 Cognitive-behavioural therapy is a widely used and well-researched therapeutic approach for chronic pain.12 However, it is considered suitable for one-on-one (rather than group-based) formats because it requires personalised treatment plans that address the unique needs and concerns of each patient.17 Furthermore, MBIs have demonstrated greater cost-effectiveness relative to cognitive-behavioural therapy among patients with chronic LBP.18
 
In Hong Kong, approximately 90% of specialist and inpatient care services and 30% of primary care services are provided by the public sector.19 Given the absence of universal health insurance or co-payment, the majority of chronic diseases (eg, LBP) are managed within the public healthcare system.20 The incorporation of MBIs into standard LBP treatment within this system requires an understanding of patients’ willingness to pay (WTP) and preferences. Relatively few studies have explored WTP or preferences for MBIs among patients with chronic LBP. An understanding of WTP is crucial for efforts to assess the perceived value of healthcare interventions, inform policy decisions, and guide resource allocation.21 22 Consideration of patient preferences in healthcare service decisions can improve uptake, adherence, efficiency, and patient satisfaction while reducing costs.23 24
 
This study aimed to estimate WTP and preferences for MBIs among patients with chronic LBP in the public healthcare sector and to explore factors associated with WTP and preferences for MBIs.
 
Chronic LBP is significantly influenced by psychological factors; social determinants play a crucial role in the interpretation of chronic LBP and the ways that individuals seek and receive pain treatment.25 26 The socio-psychobiological model of chronic pain represents a paradigmatic shift from the conventional biopsychosocial model.27 28 Whereas the latter model recognises the interplay of social, psychological, and biological factors, it tends to prioritise biological determinants over social and psychological aspects.27 28 In contrast, the socio-psychobiological model primarily emphasises social determinants, followed by psychological and biological factors.27 28
 
Our research, which assesses WTP and preferences for MBIs in the context of chronic LBP, aligns with the socio-psychobiological model for pain management. The examination of WTP and preferences can provide valuable insights into the socio-economic backgrounds of individuals with chronic LBP, which may strongly influence their experiences of pain and responses to pain management interventions. The findings may also clarify patients’ abilities to access and afford pain management strategies.29 This aspect is particularly important because it underscores the social dimensions of chronic pain management, highlighting disparities and barriers that may exist in pain experiences and access to effective interventions. Furthermore, MBIs constitute a psychological and group-based approach to chronic pain management, addressing both psychological and social factors emphasised within the socio-psychobiological model.12 30 By fostering mindfulness practices, MBIs equip individuals with coping skills to manage the psychological distress often associated with chronic LBP, while also enhancing social support and connectivity within group settings.31 32
 
Methods
Study design and setting
We conducted a prospective cross-sectional survey using convenience sampling to recruit eligible patients with chronic LBP from two Hong Kong public hospitals between September 2022 and February 2023. We utilised a discrete choice experiment (DCE) design to examine preferences for MBIs. This study adhered to the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines.
 
Participants
The inclusion criteria for this study were age ≥18 years, chronic non-specific LBP, and the ability to speak and understand Chinese. Chronic non-specific LBP was defined as pain in the lumbosacral region, with or without sciatica, that persisted for >3 months and lacked a clearly identifiable cause or pathology based on clinical evaluation, imaging, or laboratory tests. Exclusion criteria were chronic LBP with a specific identifiable cause or pathology, such as inflammatory diseases, tumours, infections, fractures, structural abnormalities, or other spinal pathologies evident on clinical evaluation, imaging, or laboratory tests. Patients who did not provide written informed consent, were pregnant, or were <6 months postpartum or post-weaning were also excluded.
 
Sample size calculation
To determine the sample size for evaluating WTP, we used the payment card elicitation format sample size formula established by Mitchell and Carson.33 The formula is:

 

where n is the minimum required sample size, Z1-α/2 is the standard normal value corresponding to the desired confidence level, Z1-β is the standard normal value corresponding to the desired power, V denotes the coefficient of variation (ie, ratio of estimated standard deviation of WTP to estimated mean WTP), and D is the designed effect (ie, percentage difference between true WTP and mean of estimated WTP bids). For this study, assuming α=0.05, β=0.20, V=0.98 (based on a previous study evaluating WTP for reduced pain intensity among patients with chronic pain),34 and D=0.20, the calculated minimum sample size was 470 after allowing for a 20% non-response rate.
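The formula itself is not reproduced in this text version. As a rough check only, the sketch below assumes the form n = 2 × [(Z1-α/2 + Z1-β) × V / D]², which is one form consistent with the symbol definitions and with the reported figure. This form, the function name `wtp_sample_size`, and the inflation for non-response by dividing by (1 − 0.20) are assumptions, not the authors' stated method.

```python
from statistics import NormalDist


def wtp_sample_size(alpha, beta, v, d, non_response=0.0):
    """Assumed form: n = 2 * ((z_{1-alpha/2} + z_{1-beta}) * v / d) ** 2.

    The result is inflated by dividing by (1 - non_response).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided confidence level
    z_beta = NormalDist().inv_cdf(1 - beta)        # power = 1 - beta
    n = 2 * ((z_alpha + z_beta) * v / d) ** 2
    return n / (1 - non_response)


# Inputs stated in the text: alpha=0.05, beta=0.20, V=0.98, D=0.20, 20% non-response
n = wtp_sample_size(0.05, 0.20, 0.98, 0.20, non_response=0.20)
print(round(n))
```

Under these assumptions the result is about 471, close to the reported minimum of 470; the small difference may reflect rounding of the Z values.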
 
To explore preferences for MBI receipt using a DCE design, we applied the rule of thumb described by Orme35 and Johnson and Orme.36 The minimum sample size required for the main effects was calculated as follows:

n ≥ 500c / (t × a)

where c is the largest number of levels for any attribute, t is the number of choice tasks (scenarios) per patient, and a is the number of alternatives per task. Under conditions of two alternatives, a maximum of four attribute levels and eight scenarios per patient, a minimum of 125 patients was required. Considering two subgroups with different characteristics and a 20% non-response rate, the adjusted minimum sample size was 312.
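The arithmetic behind these figures can be sketched as follows, using the rule of thumb commonly attributed to Johnson and Orme, n ≥ 500c / (t × a), with c the largest number of levels for any attribute, t the number of choice tasks, and a the number of alternatives. The function name and the way the subgroup and non-response adjustments are applied (doubling, then dividing by 0.8) are assumptions made for illustration.

```python
import math


def dce_min_sample(max_levels, tasks, alternatives):
    """Rule-of-thumb minimum sample size for main effects: 500c / (t * a)."""
    return math.ceil(500 * max_levels / (tasks * alternatives))


base = dce_min_sample(max_levels=4, tasks=8, alternatives=2)

# Assumed adjustment: two subgroups, then inflation for 20% non-response
adjusted = base * 2 / (1 - 0.20)

print(base, adjusted)
```

With these inputs the base requirement is 125 and the adjusted value is 312.5, in line with the reported 312.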
 
Survey data
A self-administered questionnaire was used to collect data. An onsite research assistant invited patients in the clinic waiting area to participate in the survey and was available to provide assistance if needed.
 
Independent variables
The independent variables of the study are as follows:
  1. Socio-demographic characteristics: Age, gender, education level, employment status, and personal and family income were recorded.
  2. General self-reported health status: A single-item self-rated health scale was used to assess participants’ self-rated health, with response options ranging from ‘Very good’ to ‘Very poor.’37 Studies have shown that this scale is associated with patients’ WTP for pain treatments.38 39 40 41
  3. Knowledge and usage of MBIs: Knowledge of mindfulness was assessed using two items adapted from a previous study that investigated health professionals’ and health profession students’ knowledge of and attitudes toward mindfulness.42 The items were as follows: (1) What is the extent of your knowledge of MBIs? (2) Might MBIs be useful for treating chronic pain? Usage of MBIs was determined using two items adapted from a previous study that evaluated employees’ preferences for accessing MBIs.43 The items were as follows: (1) Have you ever participated in mindfulness courses? (2) How many mindfulness sessions have you attended?
  4. Pain-related characteristics: Pain-related characteristics included pain duration, pain intensity, disability, and frequency of treatment for chronic LBP. Pain intensity was measured using an 11-point Numeric Rating Scale (NRS).44 Disability was assessed using the Roland-Morris Disability Questionnaire.45 Pain duration was determined by asking participants to report the number of months they had experienced an ongoing LBP problem. Frequency of treatment was evaluated by asking participants to report how many times they had consulted a doctor or other healthcare professional for LBP in the past 12 months.
  5. Satisfaction with current treatment: An item was adapted from a previous study that assessed treatment satisfaction in patients with osteoarthritis and LBP.46 This item asked participants to rate their satisfaction with the effectiveness of current treatment in controlling LBP.
  6. Monthly expenses on current treatment: Participants were asked to report their monthly expenses with respect to chronic LBP treatment.
 
Dependent variables
Willingness to pay and preferences for MBIs were the two dependent variables of the current study. The payment card method was used to assess WTP for MBIs.47 This approach minimises starting point bias and reduces the high rate of item non-response relative to other elicitation methods.48 To ensure that participants were familiar with MBIs, we provided an introduction using a text description and a video before each participant responded to the WTP question (online supplementary Fig). Participants were presented with a range of monetary values (HK$0 to HK$10 000) and asked to select the value that best represented the amount they would be willing to pay for MBIs. Additionally, WTP for pain reduction was evaluated using two items adapted from a previous study that assessed WTP for reductions in chronic LBP and neck pain using the payment card method.49 These items asked participants to indicate the amount they would be willing and able to pay out-of-pocket per month for their chronic LBP to be reduced by half or entirely eliminated. Participants unwilling to pay any amount were asked to specify their reasons.
 
Participants were invited to respond to eight choice sets evaluating patient preferences for MBIs. In each choice task, they were asked to select their most preferred option from two hypothetical MBIs with different attribute levels. To ensure comprehension, we included a test scenario with a dominant alternative. If participants did not choose the dominant option, research staff provided clarification. Internal validity was assessed by including a choice set with dominant pairs, in which one alternative was clearly superior across all attributes.
 
Statistical analyses
Complete-case analysis was utilised for the dependent variable of WTP for MBIs. The Tobit regression model was used to estimate the associated factors.50 This model was selected because WTP measures exhibited left-censoring (ie, a substantial proportion of zero values [46.6% of the sample]; the remaining responses indicated positive WTP for MBIs). Multicollinearity was examined using tolerance and the variance inflation factor (VIF). Continuous variables were presented as mean±standard deviation. The level of statistical significance was set at 5%.51
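To illustrate why a Tobit model suits left-censored WTP data, the following minimal sketch writes out the log-likelihood for a model censored at zero: censored observations contribute the probability that the latent WTP falls at or below zero, and positive observations contribute the usual normal density. The function name, the data, and the parameter values are hypothetical; this is not the authors' estimation code, and a real analysis would maximise this function over the coefficients and sigma using statistical software.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def tobit_loglik(y, xb, sigma, lower=0.0):
    """Log-likelihood of a Tobit model left-censored at `lower`.

    y     : observed outcomes (values at `lower` are treated as censored)
    xb    : linear predictors (x_i' beta) for each observation
    sigma : standard deviation of the latent error term
    """
    total = 0.0
    for yi, mu in zip(y, xb):
        if yi <= lower:
            # Censored: probability that the latent outcome is <= lower
            total += math.log(_STD_NORMAL.cdf((lower - mu) / sigma))
        else:
            # Uncensored: normal density of the observed outcome
            total += math.log(_STD_NORMAL.pdf((yi - mu) / sigma) / sigma)
    return total


# Hypothetical WTP values (HK$) and linear predictors, for illustration only
wtp = [0.0, 0.0, 200.0, 500.0]
xb = [-50.0, 100.0, 250.0, 400.0]
print(tobit_loglik(wtp, xb, sigma=300.0))
```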
 
Study design
A DCE design was used in this study to examine the preferences of individuals with chronic LBP for MBIs. The DCE comprised four key steps: (1) conducting a literature review to identify conceptual attributes and levels; (2) conducting qualitative research to determine contextual attributes and levels; (3) integrating attributes and levels into choice sets, conducting pilot tests, and refining the questionnaire; and (4) collecting experimental data and performing data analysis.
 
Systematic review
A systematic review of DCEs examining patient preferences for non-surgical treatments in chronic musculoskeletal pain was conducted.52 Studies that used DCEs to evaluate patient preferences for the management of chronic musculoskeletal pain were included.
 
Qualitative research
Participants with chronic LBP were invited to discuss characteristics of MBIs they might consider valuable when deciding whether to participate in MBIs. These valued characteristics were summarised. A panel of experts from relevant fields (chronic pain, DCE methodology, and psychology) then reviewed and refined the attributes and levels, selecting six to eight attributes for inclusion.
 
Generation of choice sets, piloting, and refinement of the questionnaire
A D-efficient experimental design was used to generate choice sets, which were randomly assigned to five blocks. A pilot DCE survey was conducted to assess cognitive difficulty and questionnaire length. Twenty patients with chronic LBP participated in the pilot study; they provided feedback and suggestions for improvement.
 
Experimental data collection and data analysis
Discrete choice experiment data were collected as part of the cross-sectional survey. Respondents’ relative preferences were estimated using a mixed logit model with panel specification to adjust for correlated choices within individuals. The coefficients of four variables—‘improvement in capacity to perform daily life activities’, ‘risk of adverse events’, ‘improvement in pain relief’, and ‘out-of-pocket costs’—were assumed to be random, following a zero-bounded triangular distribution because the distribution of these random parameters should comprise only positive or negative values. ‘Out-of-pocket costs’ was specified as a continuous variable in the mixed logit model. The marginal WTP for different levels within each attribute was calculated through division of the negative estimated beta coefficient for each level by the estimated beta coefficient for ‘out-of-pocket costs’. The log-likelihood and adjusted McFadden’s pseudo–R-squared were calculated to assess model goodness of fit. Higher log-likelihood and adjusted McFadden’s pseudo–R-squared values indicate a better-fitting model.53 54 Subgroup analyses were conducted to assess preference heterogeneity across characteristics, including age, gender, family monthly income, and education.
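The marginal WTP and goodness-of-fit calculations described above can be sketched as follows. The coefficient values and log-likelihoods are hypothetical and are not taken from the study; the adjusted pseudo-R-squared is written in its common form, 1 − (LL_model − K) / LL_null, where K is the number of estimated parameters, which is assumed here to match the authors' definition.

```python
def marginal_wtp(beta_level, beta_cost):
    """Marginal WTP for an attribute level: -beta_level / beta_cost."""
    if beta_cost == 0:
        raise ValueError("cost coefficient must be non-zero")
    return -beta_level / beta_cost


def adjusted_mcfadden_r2(ll_model, ll_null, n_params):
    """Adjusted McFadden pseudo-R-squared: 1 - (LL_model - K) / LL_null."""
    return 1 - (ll_model - n_params) / ll_null


# Hypothetical coefficients: utility gain of 0.40 for an attribute level,
# and a cost coefficient of -0.002 per HK$ (negative: higher cost lowers utility)
wtp_level = marginal_wtp(beta_level=0.40, beta_cost=-0.002)

# Hypothetical log-likelihoods and parameter count
r2_adj = adjusted_mcfadden_r2(ll_model=-1500.0, ll_null=-2250.0, n_params=15)

print(wtp_level, r2_adj)
```

In this illustration, a level with a positive coefficient and a negative cost coefficient yields a positive marginal WTP, meaning respondents would pay more for an option offering that level.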
 
Results
Participant characteristics
Of the 589 participants invited, 488 questionnaires were returned, yielding a response rate of 82.9%. The study sample had a mean age of 60.06±12.72 years; 69.5% of the participants were women. The average pain duration was 6.46±8.16 years; mean NRS and Roland-Morris Disability Questionnaire scores were 4.70±2.12 and 7.58±5.63, respectively. Participant characteristics are summarised in Table 1.
 

Table 1. Background characteristics of patients (n=488)
 
Knowledge and usage of mindfulness-based interventions
Regarding knowledge and usage of MBIs, 77.3% of participants were unfamiliar with MBIs, 84.5% were uncertain about their effectiveness in treating chronic LBP, and 94.5% had never attended an MBI session. Knowledge and usage of MBIs are summarised in Table 2.
 

Table 2. Knowledge and usage of mindfulness-based interventions among patients (n=488)
 
Willingness to pay for pain reduction and mindfulness-based interventions
The mean monthly WTP values for MBIs to reduce pain by half and to entirely eliminate pain were HK$684.68±1347.43 and HK$1102.70±1983.83, respectively. The overall mean WTP for an eight-session MBI programme was HK$258.75±508.11. Among the participants, 237 were not willing to pay for MBIs, citing reasons such as limited knowledge of MBIs, unwillingness to spend money on treatment, lack of time, and scepticism regarding MBI effectiveness (online supplementary Table 1).
 
Results of multicollinearity tests
Multicollinearity among the independent variables was assessed; all tolerance values were >0.25 and VIF values were <4, except for two similar variables (ie, usage of MBIs measured as a binary variable [‘Yes’ or ‘No’] and number of MBI sessions attended). Given that only a small number of participants had attended MBIs, the variable measuring the number of MBI sessions was selected for inclusion in the Tobit regression model (online supplementary Table 2).
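The two thresholds are equivalent by definition: tolerance for a predictor is 1 − R², where R² comes from regressing that predictor on all other predictors, and VIF is the reciprocal of tolerance. A minimal sketch, with hypothetical function names and an illustrative R² value:

```python
def tolerance_from_r2(r_squared):
    """Tolerance = 1 - R^2 from regressing one predictor on the others."""
    return 1 - r_squared


def vif_from_r2(r_squared):
    """Variance inflation factor = 1 / tolerance."""
    return 1 / tolerance_from_r2(r_squared)


# Illustrative boundary case: R^2 = 0.75 gives tolerance 0.25 and VIF 4
print(tolerance_from_r2(0.75), vif_from_r2(0.75))
```

A predictor therefore meets the tolerance >0.25 criterion exactly when its VIF is <4.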
 
Factors associated with willingness to pay for mindfulness-based interventions
Factors associated with WTP for MBIs are summarised in Table 3. Participants with a higher NRS score (β=81.26; P=0.003), family monthly income of ≥HK$30 000 (β=320.1; P=0.035), high school education (β=242.94; P=0.045), and higher monthly expenses on chronic LBP treatment (β=0.11; P=0.003) were more willing to pay for MBIs. Conversely, participants who did not believe in the usefulness of MBIs (β=-528.88; P=0.033) were less willing to pay for them.
 

Table 3. Factors associated with willingness to pay for mindfulness-based interventions according to Tobit regression (n=488)
 
Evaluation of patient preferences for mindfulness-based interventions
Conceptual attributes and levels identified through literature review
In total, 15 eligible studies were included.52 The attributes most frequently cited were ‘capacity to realize daily life activities’, ‘risk of adverse events’, ‘effectiveness in pain reduction’, and ‘out-of-pocket costs’, which were also ranked among the top three most important attributes. Other attributes, cited less frequently but revealing important preferences, included ‘treatment frequency’ and ‘onset of treatment efficacy’.52
 
Contextual attributes and levels identified through qualitative research
Eight patients with chronic LBP participated in this stage of developing contextual attributes through patient-public involvement. Two focus group interviews were conducted to identify contextual attributes. Valued characteristics of MBIs were summarised, including effectiveness in pain reduction, mood regulation, and sleep improvement; treatment environment; reliability of mindfulness instructors; reputation of the organisation; safety; affordability; flexibility (availability of online resources at all times); availability of follow-up courses; and a group-based course format. Three experts finalised the selection of seven attributes for inclusion (Table 4).
 

Table 4. Attributes and levels included in the final discrete choice experiment
 
Pilot study of discrete choice experiment
Only minor changes in terminology were applied to attribute levels after the pilot study. This pilot study verified the attributes and their levels, as presented in Table 4. The pilot study also indicated that most patients understood the instructions and attributes. Only minor layout adjustments were made—some participants reported that the font size was too small.
 
Factors associated with patients’ preferences for mindfulness-based interventions
After the exclusion of participants who declined to answer DCE questions due to difficulties in comprehension or unwillingness to respond (n=69, 14.1%) and those with missing DCE responses (n=4, 0.8%), the final participant count was reduced to 415. Among these participants, six (1.4%) did not pass the dominance test; thus, 409 participants were included in the analysis. The results of the DCE examining factors associated with patients’ preferences for MBIs are presented in Table 5. Participants were more likely to choose MBIs with lower out-of-pocket costs, higher levels of pain relief, and greater improvements in capacity to perform daily life activities. Face-to-face treatment modes were preferred over online formats. Regarding model fit, the log-likelihood and adjusted McFadden’s pseudo–R-squared for the mixed logit model were -1502.8 and 0.330, respectively.
 

Table 5. Factors influencing patients’ preferences for mindfulness-based interventions according to a mixed logit model (n=409)
 
Subgroup analyses
The results of subgroup analyses are presented in online supplementary Tables 3 to 6. Preferences differed substantially between age groups, family income levels, and education levels, but showed no significant differences by gender. Improvement in the capacity to perform daily life activities was an important attribute when selecting MBIs for older participants, those with lower family monthly income, and those with higher education level; this attribute was not important for younger participants and those with higher family monthly income and lower education level. Group size was an important attribute for younger participants and those with higher family monthly income but not for older participants or those with lower family monthly income. Younger participants and those with higher family monthly income preferred MBIs with a group size of one person, rather than 7 to 12 people. Treatment mode was an important attribute for participants with lower family monthly income and higher education level but not for those with higher family monthly income and lower education. Participants with lower family monthly income and higher education preferred face-to-face treatment over online treatment. Furthermore, participants with lower family monthly income and older age placed greater priority on out-of-pocket costs for MBIs, as indicated by substantially larger regression coefficients for out-of-pocket costs in subgroup analyses.
 
Discussion
Consistent with previous studies,34 49 we found that patients with higher pain scores, higher family income, and higher monthly expenses on LBP treatment were more willing to pay for MBIs. Comparison of WTP for MBIs in this study to a national survey on WTP for complementary and alternative medicine treatments in England55 revealed that participants in the present study had a lower WTP. One possible explanation for this discrepancy is that complementary and alternative medicine practices, such as acupuncture and herbal medicine, are more established in some cultures; MBIs are relatively new and may be less familiar to our study population.
 
In Hong Kong’s public healthcare system, physiotherapy and occupational therapy for chronic pain cost HK$80 per visit. If MBIs followed this fee structure, eight sessions would cost a total of HK$640. However, the current WTP for MBIs is HK$258.75, approximately 40% of this cost. Notably, WTP was calculated in a population with limited knowledge of MBIs. Increased awareness of their efficacy may enhance WTP, aligning it more closely with the existing fee structure.
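The comparison in the preceding paragraph rests on simple arithmetic, sketched below. All figures are taken from the text; the variable names are illustrative only.

```python
# Figures quoted in the paragraph above (Hong Kong dollars).
fee_per_visit = 80    # public physiotherapy/occupational therapy fee per visit
sessions = 8          # number of MBI sessions considered in the text
wtp = 258.75          # reported willingness to pay (WTP) for MBIs

# Total cost if MBIs followed the existing per-visit fee structure.
total_cost = fee_per_visit * sessions

# WTP expressed as a percentage of that hypothetical total cost.
wtp_share = round(100 * wtp / total_cost, 1)

print(total_cost, wtp_share)
```

The result reproduces the HK$640 total and shows that the reported WTP corresponds to roughly 40% of that amount.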
 
Our study evaluating preferences for MBIs confirmed previous findings that chronic pain treatment preferences are significantly influenced by treatment effectiveness and out-of-pocket costs.52 56 57 However, in contrast to prior studies,52 56 57 we found that the risk of adverse events was not an attribute considered important by patients with chronic LBP during MBI selection. One possible explanation is that the risk of adverse events from psychological interventions is lower and less severe than the risk of such events associated with pharmacological or exercise-based interventions.58 59 60 Additionally, we observed that treatment mode constituted an important attribute of MBIs, consistent with investigations of exercise therapy preferences among patients with chronic pain.39
 
Our study focused on assessing WTP and preferences for MBIs in chronic LBP, following the socio-psychobiological model that prioritises social and psychological factors over biological factors.27 28 This approach provides insights into the socio-economic backgrounds of patients with chronic LBP and highlights their pain experiences and access to pain management strategies, emphasising the social dimension of chronic pain management. Mindfulness-based interventions, as a psychological and group-based approach, equip individuals with skills to manage psychological distress related to chronic LBP while fostering social support and connectivity through group interaction.
 
The current approach to chronic pain care often results in the underutilisation of high-value care (eg, psychological therapies) and overuse of low-value care, including invasive procedures and opioid medications.4 28 The adoption and implementation of a socio-psychobiological model could serve as an effective strategy for establishing pain care systems that prioritise high-value care.27 28
 
Despite the recognised value of MBIs in chronic pain management, their limited integration into clinical practice may be attributed to patients’ unfamiliarity and lack of knowledge about these interventions, coupled with insufficient investment in primary care resources. Additionally, economic incentives often favour high-volume practice models in primary care settings.28 Thus, there is an urgent need for educational initiatives to enhance awareness and knowledge of MBIs among individuals with chronic LBP, as well as increased investment in primary care resources.
 
This study provided critical insights into the integration of MBIs for chronic LBP management within the Hong Kong public healthcare system. In the context of Hong Kong’s public healthcare settings, we propose integrating MBIs as an intermediary step between primary care and specialist care for chronic LBP management. Primary care providers could identify patients experiencing psychological and social distress who may benefit from MBIs and facilitate their referral for MBI treatment. Patients whose condition does not improve after an MBI could then be referred to specialist clinics. This approach could substantially reduce waiting times for chronic LBP treatment within the Hong Kong public healthcare system.
 
Strengths and limitations
This study has several strengths. To our knowledge, it is the first investigation to assess WTP and preferences for MBIs in chronic pain management; it included a comprehensive list of independent variables covering key factors that influence WTP. Additionally, the study utilised a mixed logit model to consider preference heterogeneity within the sample. Furthermore, a rigorous systematic review and qualitative interviews informed the attributes and levels used in the DCE. However, certain limitations should be acknowledged. First, participants’ limited knowledge of MBIs may have influenced WTP and preferences. Second, participants were recruited through convenience sampling from outpatient clinics in two Hong Kong public hospitals, which may have introduced selection bias that skewed the sample composition and limited its representativeness. This limitation may affect the generalisability of the findings beyond the specific group sampled. Third, the cross-sectional design of the study precluded establishment of causal relationships between WTP and preferences for MBIs, as well as associated factors.
 
Although WTP and preferences are essential considerations for MBI implementation, they should not be the sole determinants. Factors such as cost-effectiveness, impact on quality of life, and infrastructure availability must also be considered. Further research is required to provide additional evidence for implementation within the Hong Kong public healthcare system. Nevertheless, this study established a rationale for assessing WTP and preferences for MBIs, with a methodology that can be adapted for healthcare evaluations in other countries.
 
Conclusion
This study highlights the need to increase awareness of MBIs for chronic LBP management within the public healthcare system. The findings indicate low WTP among participants, suggesting a gap in understanding and utilisation. Notably, individuals with higher pain scores, higher family income, and higher monthly LBP treatment expenses, as well as a stronger belief in MBIs, were more willing to pay for such interventions; these observations indicate targeted demand. Patient preferences favoured lower costs, face-to-face treatment, and enhanced effectiveness. These findings provide practical insights for designing patient preference–aligned MBIs and will serve as valuable references for future healthcare evaluations.
 
Author contributions
Concept or design: M Zhu, PKH Mo, RWS Sit.
Acquisition of data: M Zhu.
Analysis or interpretation of data: All authors.
Drafting of the manuscript: M Zhu.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
As an editor of the journal, RWS Sit was not involved in the peer review process. Other authors have disclosed no conflicts of interest.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Joint Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee, Hong Kong (Ref. No.: 2022.279). The research was conducted in accordance with the Declaration of Helsinki. All participants provided written informed consent before completing the questionnaire.
 
Data availability
The datasets generated during and/or analysed during the current study are not publicly available due to ethics restrictions. A request for the code can be made directly to the corresponding author.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. Hoy D, March L, Brooks P, et al. The global burden of low back pain: estimates from the Global Burden of Disease 2010 study. Ann Rheum Dis 2014;73:968-74.
2. Dieleman JL, Cao J, Chapin A, et al. US health care spending by payer and health condition, 1996-2016. JAMA 2020;323:863-84.
3. Hong J, Reed C, Novick D, Happich M. Costs associated with treatment of chronic low back pain: an analysis of the UK General Practice Research Database. Spine (Phila Pa 1976) 2013;38:75-82.
4. Foster NE, Anema JR, Cherkin D, et al. Prevention and treatment of low back pain: evidence, challenges, and promising directions. Lancet 2018;391:2368-83.
5. Anheyer D, Haller H, Barth J, Lauche R, Dobos G, Cramer H. Mindfulness-based stress reduction for treating low back pain: a systematic review and meta-analysis. Ann Intern Med 2017;166:799-807.
6. Herman PM, Anderson ML, Sherman KJ, Balderson BH, Turner JA, Cherkin DC. Cost-effectiveness of mindfulness-based stress reduction versus cognitive behavioral therapy or usual care among adults with chronic low back pain. Spine (Phila Pa 1976) 2017;42:1511-20.
7. Pérez-Aranda A, D’Amico F, Feliu-Soler A, et al. Cost-utility of mindfulness-based stress reduction for fibromyalgia versus a multicomponent intervention and usual care: a 12-month randomized controlled trial (EUDAIMON study). J Clin Med 2019;8:1068.
8. Day MA, Jensen MP, Ehde DM, Thorn BE. Toward a theoretical model for mindfulness-based pain management. J Pain 2014;15:691-703.
9. Ehde DM, Dillworth TM, Turner JA. Cognitive-behavioral therapy for individuals with chronic pain: efficacy, innovations, and directions for research. Am Psychol 2014;69:153-66.
10. Hayes SC, Luoma JB, Bond FW, Masuda A, Lillis J. Acceptance and commitment therapy: model, processes and outcomes. Behav Res Ther 2006;44:1-25.
11. Zhang D, Lee EK, Mak EC, Ho CY, Wong SY. Mindfulness-based interventions: an overall review. Br Med Bull 2021;138:41-57.
12. Khoo EL, Small R, Cheng W, et al. Comparative evaluation of group-based mindfulness-based stress reduction and cognitive behavioural therapy for the treatment and management of chronic pain: a systematic review and network meta-analysis. Evid Based Ment Health 2019;22:26-35.
13. Liu Z, Jia Y, Li M, et al. Effectiveness of online mindfulness-based interventions for improving mental health in patients with physical health conditions: systematic review and meta-analysis. Arch Psychiatr Nurs 2022;37:52-60.
14. Khusid MA, Vythilingam M. The emerging role of mindfulness meditation as effective self-management strategy, part 2: clinical implications for chronic pain, substance misuse, and insomnia. Mil Med 2016;181:969-75.
15. Veehof MM, Trompetter HR, Bohlmeijer ET, Schreurs KM. Acceptance- and mindfulness-based interventions for the treatment of chronic pain: a meta-analytic review. Cogn Behav Ther 2016;45:5-31.
16. Hughes LS, Clark J, Colclough JA, Dale E, McMillan D. Acceptance and commitment therapy (ACT) for chronic pain: a systematic review and meta-analyses. Clin J Pain 2017;33:552-68.
17. Gryesten JR, Poulsen S, Moltu C, Biering EB, Møller K, Arnfred SM. Patients’ and therapists’ experiences of standardized group cognitive behavioral therapy: needs for a personalized approach. Adm Policy Ment Health 2024;51:617-33.
18. Zhang L, Lopes S, Lavelle T, et al. Economic evaluations of mindfulness-based interventions: a systematic review. Mindfulness (N Y) 2022;13:2359-78.
19. Census and Statistics Department, Hong Kong SAR Government. Thematic Household Survey Report No. 50. Jan 2013. Available from: https://www.statistics.gov.hk/pub/B11302502013XXXXB0100.pdf. Accessed 7 Apr 2025.
20. Health Bureau, Hong Kong SAR Government. The healthcare challenges in Hong Kong. 2022. Available from: https://www.primaryhealthcare.gov.hk/bp/en/supplementary-documents/challenges/. Accessed 18 Mar 2025.
21. Abbas SM, Usmani A, Imran M. Willingness to pay and its role in health economics. JBUMDC 2019;9:62-6.
22. Liu S, Yam CH, Huang OH, Griffiths SM. Willingness to pay for private primary care services in Hong Kong: are elderly ready to move from the public sector? Health Policy Plan 2013;28:717-29.
23. Krist AH, Tong ST, Aycock RA, Longo DR. Engaging patients in decision-making and behavior change to promote prevention. Stud Health Technol Inform 2017;240:284-302.
24. Ostermann J, Brown DS, de Bekker-Grob EW, Mühlbacher AC, Reed SD. Preferences for health interventions: improving uptake, adherence, and efficiency. Patient 2017;10:511-4.
25. Alhowimel A, AlOtaibi M, Radford K, Coulson N. Psychosocial factors associated with change in pain and disability outcomes in chronic low back pain patients treated by physiotherapist: a systematic review. SAGE Open Med 2018;6:2050312118757387.
26. Karran EL, Grant AR, Moseley GL. Low back pain and the social determinants of health: a systematic review and narrative synthesis. Pain 2020;161:2476-93.
27. Carr DB, Bradshaw YS. Time to flip the pain curriculum? Anesthesiology 2014;120:12-4.
28. Mardian AS, Hanson ER, Villarroel L, et al. Flipping the pain care model: a sociopsychobiological approach to high-value chronic pain care. Pain Med 2020;21:1168-80.
29. Allen-Watts K, Sims AM, Buchanan TL, et al. Sociodemographic differences in pain medication usage and healthcare provider utilization among adults with chronic low back pain. Front Pain Res (Lausanne) 2022;2:806310.
30. Majeed MH, Ali AA, Sudak DM. Mindfulness-based interventions for chronic pain: evidence and applications. Asian J Psychiatr 2018;32:79-83.
31. Smith SL, Langen WH. A systematic review of mindfulness practices for improving outcomes in chronic low back pain. Int J Yoga 2020;13:177-82.
32. Petrucci G, Papalia GF, Russo F, et al. Psychological approaches for the integrative care of chronic low back pain: a systematic review and metanalysis. Int J Environ Res Public Health 2021;19:60.
33. Mitchell RC, Carson RT. Using Surveys to Value Public Goods: The Contingent Valuation Method. New York and London: Resources for the Future; 1989.
34. Chuck A, Adamowicz W, Jacobs P, Ohinmaa A, Dick B, Rashiq S. The willingness to pay for reducing pain and pain-related disability. Value Health 2009;12:498-506.
35. Orme BK. Sample size issues for conjoint analysis studies. In: Orme BK, editor. Getting Started with Conjoint Analysis: Strategies for Product Design and Pricing Research. 4th ed. Madison [WI]: Research Publishers LLC; 1998: 57-65.
36. Johnson R, Orme B. Sawtooth Software Research Paper Series. Getting the most from CBC. WA: Sawtooth Software; 2003. Available from: https://sawtoothsoftware.com/resources/technical-papers/getting-the-most-from-cbc. Accessed 24 Mar 2025.
37. Hanmer J. Measuring population health: association of self-rated health and PROMIS measures with social determinants of health in a cross-sectional survey of the US population. Health Qual Life Outcomes 2021;19:221.
38. Copsey B, Buchanan J, Fitzpatrick R, Lamb SE, Dutton SJ, Cook JA. Duration of treatment effect should be considered in the design and interpretation of clinical trials: results of a discrete choice experiment. Med Decis Making 2019;39:461-73.
39. Cranen K, Groothuis-Oudshoorn CG, Vollenbroek-Hutten MM, IJzerman MJ. Toward patient-centered telerehabilitation design: understanding chronic pain patients’ preferences for web-based exercise telerehabilitation using a discrete choice experiment. J Med Internet Res 2017;19:e26.
40. Ferreira GE, Howard K, Zadro JR, O’Keeffe M, Lin CC, Maher CG. People considering exercise to prevent low back pain recurrence prefer exercise programs that differ from programs known to be effective: a discrete choice experiment. J Physiother 2020;66:249-55.
41. Laba TL, Brien JA, Fransen M, Jan S. Patient preferences for adherence to treatment for osteoarthritis: the MEdication Decisions in Osteoarthritis Study (MEDOS). BMC Musculoskelet Disord 2013;14:160.
42. McKenzie SP, Hassed CS, Gear JL. Medical and psychology students’ knowledge of and attitudes towards mindfulness as a clinical intervention. Explore (NY) 2012;8:360-7.
43. Lau MA, Colley L, Willett BR, Lynd LD. Employee’s preferences for access to mindfulness-based cognitive therapy to reduce the risk of depressive relapse—a discrete choice experiment. Mindfulness 2012;3:318-26.
44. Atisook R, Euasobhon P, Saengsanon A, Jensen MP. Validity and utility of four pain intensity measures for use in international research. J Pain Res 2021;14:1129-39.
45. Yamato TP, Maher CG, Saragiotto BT, Catley MJ, McAuley JH. The Roland-Morris Disability Questionnaire: one or more dimensions? Eur Spine J 2017;26:301-8.
46. Turk D, Boeri M, Abraham L, et al. Patient preferences for osteoarthritis pain and chronic low back pain treatments in the United States: a discrete-choice experiment. Osteoarthritis Cartilage 2020;28:1202-13.
47. Tian X, Yu X, Holst R. Applying the payment card approach to estimate the WTP for green food in China. In: IAMO Forum 2011; 2011 Jun 23-24; Halle, Germany; 2011: No.23.
48. Soeteman L, van Exel J, Bobinac A. The impact of the design of payment scales on the willingness to pay for health gains. Eur J Health Econ 2017;18:743-60.
49. Herman PM, Luoto JE, Kommareddi M, Sorbero ME, Coulter ID. Patient willingness to pay for reductions in chronic low back pain and chronic neck pain. J Pain 2019;20:1317-27.
50. Pavel MS, Chakrabarty S, Gow J. Assessing willingness to pay for health care quality improvements. BMC Health Serv Res 2015;15:43.
51. Kanter G, Komesu Y, Qaedan F, Rogers R. 5: Mindfulness-based stress reduction as a novel treatment for interstitial cystitis/bladder pain syndrome: a randomized controlled trial [abstract]. Am J Obstet Gynecol 2016;214(4 Suppl 1):S457-8.
52. Zhu M, Dong D, Lo HH, Wong SY, Mo PK, Sit RW. Patient preferences in the treatment of chronic musculoskeletal pain: a systematic review of discrete choice experiments. Pain 2023;164:675-89.
53. UCLA: Statistical Consulting Group. FAQ: How are the likelihood ratio, wald, and lagrange multiplier (score) tests different and/or similar? Available from: https://stats.oarc.ucla.edu/other/mult-pkg/faq/general/faqhow-are-the-likelihood-ratio-wald-and-lagrange-multiplier-score-tests-different-andor-similar/#:~:text=The%20log%20likelihood%20(i.e.%2C%20the,model%20with%20a%20likelihood%20function. Accessed 12 May 2024.
54. Hu B, Shao J, Palta M. Pseudo-R² in logistic regression model. Stat Sin 2006;16:847-60.
55. Sharp D, Lorenc A, Morris R, et al. Complementary medicine use, views, and experiences: a national survey in England. BJGP Open 2018;2:bjgpopen18X101614.
56. Al-Omari B, McMeekin P, Bate A. Systematic review of studies using conjoint analysis techniques to investigate patients’ preferences regarding osteoarthritis treatment. Patient Prefer Adherence 2021;15:197-211.
57. Poder TG, Beffarat M. Attributes underlying non-surgical treatment choice for people with low back pain: a systematic mixed studies review. Int J Health Policy Manag 2021;10:201-10.
58. Ho EK, Chen L, Simic M, et al. Psychological interventions for chronic, non-specific low back pain: systematic review with network meta-analysis. BMJ 2022;376:e067718.
59. Els C, Jackson TD, Kunyk D, et al. Adverse events associated with medium- and long-term use of opioids for chronic non-cancer pain: an overview of Cochrane Reviews. Cochrane Database Syst Rev 2017;10:CD012509.
60. Geneen LJ, Moore RA, Clarke C, Martin D, Colvin LA, Smith BH. Physical activity and exercise for chronic pain in adults: an overview of Cochrane Reviews. Cochrane Database Syst Rev 2017;4:CD011279.

Filicide (child homicide by parents) in Hong Kong

Hong Kong Med J 2025 Apr;31(2):99–107 | Epub 1 Apr 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE  CME
Filicide (child homicide by parents) in Hong Kong
Yuen Dorothy Yee Tang, MB, BS, FHKAM (Psychiatry)1; Jessica PY Lam, MB, BS, FHKAM (Psychiatry)2; Amy CY Liu, MB, ChB, FHKAM (Psychiatry)1; Bonnie WM Siu, MB, ChB, FHKAM (Psychiatry)1
1 Department of Forensic Psychiatry, Castle Peak Hospital, Hong Kong SAR, China
2 Department of Psychiatry, Queen Mary Hospital, Hong Kong SAR, China
 
Corresponding author: Dr Yuen Dorothy Yee Tang (tyy551@ha.org.hk)
 
 
Abstract
Introduction: Filicide refers to an act in which a parent or stepparent kills a child. This retrospective study provides the first comprehensive analysis of filicides in Hong Kong over a 15-year period.
 
Methods: The study explored the local epidemiology, differences between maternal and paternal filicides, associated mental illnesses, and the criminal responsibility of the perpetrators.
 
Results: Among 81 filicide cases (43 female victims, 37 male victims, and 1 victim of unknown gender), the incidence rate was 0.7 per 100 000 population. Mothers were responsible for two-thirds (66.7%) of the cases, fathers for 19.8%, and the remainder involved both parents. Victims aged <1 year (n=40) were nearly equal in number to those aged between 1 and 17 years (n=41). Mental illness was diagnosed in 31.0% of the perpetrators, predominantly depression and psychotic disorders. Paternal perpetrators exhibited a higher prevalence of mental illness and were more frequently involved in filicide-suicides. One-third (33%) of perpetrators with mental illness invoked the psychiatric defence of diminished responsibility, resulting in Hospital Order sentencing. Reduced culpability due to mental illness and the application of infanticide provisions provided legal protections for mothers who killed their children aged <1 year.
 
Conclusion: Understanding the local epidemiology of filicide and the mental health conditions of perpetrators may help identify at-risk populations and develop effective intervention strategies.
 
 
New knowledge added by this study
  • The epidemiology, differences between maternal and paternal filicides, associated mental illnesses, and the criminal responsibility of the perpetrators in Hong Kong from 2003 to 2017 were explored.
  • Maternal perpetrators were disproportionately responsible for infanticides, highlighting the protective legal provisions applied to mothers who kill their children aged <1 year.
Implications for clinical practice or policy
  • Understanding the local epidemiology of filicide and the mental health conditions of perpetrators may help identify at-risk populations and develop effective intervention strategies.
  • Enhanced mental health screening and support for parents, particularly mothers of infants, could potentially prevent cases of filicide.
 
 
Introduction
Child homicide represents a rare but important global issue with devastating consequences for families and communities. The global homicide rate among children aged 0 to 17 years was 1.6 per 100 000 population in 2016,1 and approximately 95 000 children are murdered annually.2 A 2017 review by Stöckl et al3 found that the majority of child homicides were committed by a family member; parents were responsible for over half of the cases involving child victims.3
 
Filicide
Filicide refers to the act of killing one’s own child. Subcategories of filicide include neonaticide, a term introduced by Resnick4 to describe the murder of a child within the first 24 hours after birth, and infanticide, which applies when the victim is aged <1 year. Resnick4 identified various motives for filicide. In altruistic filicide, the parent believes that the act is in the child’s best interests. An acutely psychotic parent may kill a child under the influence of severe mental illness. In unwanted child filicide, a parent kills a child who is perceived as a hindrance. Accidental/fatal maltreatment describes the unintentional death of a child due to parental abuse or neglect. Spouse revenge filicide occurs when a child is killed as a means of exacting revenge upon the spouse or the other parent. Bourget and Bradford5 later emphasised the importance of the perpetrator’s gender by introducing paternal filicide as a distinct category.
 
Victim and perpetrator characteristics vary in cases of filicide. The first year of life is a critical period, and the highest risk of filicide occurs within the first 24 hours. Neonaticides are predominantly committed by mothers,6 and mothers are overrepresented across the entire spectrum of filicide.4 5 However, contradictory results have been reported.5 7 8 The gender distribution of victims also varies. Male children aged <1 year are at greater risk in high-income Western countries, such as the US9 and the UK10; the opposite trend has been observed in India and China.11 Some studies have shown that boys are overrepresented among victims,7 12 whereas others have identified comparable numbers of male and female filicide victims.13
 
Maternal and paternal perpetrators of filicide exhibit distinct characteristics.14 15 Maternal perpetrators tend to be younger and have younger victims compared with fathers.15 Younger maternal perpetrators are often poor, experience psychosocial stress, and lack family and community support, whereas older maternal perpetrators frequently have mental illnesses and lack criminal histories.13 14 16 In contrast, paternal perpetrators are more commonly driven by anger, jealousy, or marital and life discord.15 Fatal abuse and acts of retaliation are more prevalent among paternal perpetrators than among maternal perpetrators.17 Fathers are also more likely to attempt or die by suicide12 17 18 when committing filicide.14 18 Additionally, fathers typically use more violent methods to cause death.19
 
Filicide and mental illness
Pathological filicide, characterised by altruistic or actively psychotic motives, constitutes one of the most common categories of filicide.17 Psychiatric factors are involved in 36% to 85% of all filicide cases.5 16 20 21 22 Maternal perpetrators are more likely to have a history of mental illness and to exhibit symptoms at the time of the offence.22 The most frequent diagnosis among maternal perpetrators is major depressive disorder, followed by schizophrenia.5 16 20 23 Personality disorders and substance use are more often associated with paternal filicides.8
 
The criminal justice system and infanticide laws
Filicide presents unique challenges for the criminal justice system. Societal attitudes regarding parents who kill their children are often ambivalent, balancing the need for justice due to loss of innocent life against calls for mercy towards offenders who may require care rather than punishment.
 
Legal systems worldwide acknowledge that filicide should be treated differently from other forms of homicide. The UK enacted the Infanticide Act in 1922 (amended in 1938)24 to recognise the biological vulnerability of women to psychiatric illnesses during the perinatal period. The Act mandated sentences of probation and psychiatric treatment for offenders.24 By the late 20th century, 29 countries had revised penalties for infanticide to consider unique biological and psychological changes associated with childbirth.25
 
In Hong Kong, perpetrators with mental illnesses can invoke psychiatric defences, including insanity or diminished responsibility. The insanity defence is based on the M’Naghten principles, which hold that it is unjust to punish an individual for an action performed without the mental capacity to control it. The defence of diminished responsibility applies when the offender demonstrates abnormal mental function arising from a recognised medical condition, which has substantially impaired their ability to either understand the nature of their conduct, form a rational judgement, or exercise self-control (or any combination of these impairments). Perpetrators with mental illnesses who are found not guilty by reason of insanity, or who successfully raise the partial defence of diminished responsibility—thereby reducing the charge from murder to manslaughter—may be sentenced to a Hospital Order at the Correctional Services Department Psychiatric Centre (Siu Lam Psychiatric Centre [SLPC]), under Section 75 of the Criminal Procedure Ordinance26 or Section 45 of the Mental Health Ordinance,27 respectively, for psychiatric observation and management.
 
A separate legal provision exists for mothers who kill their children aged <1 year. Hong Kong has adopted the UK concept of infanticide, in which mothers experiencing vulnerability after childbirth are charged with infanticide rather than murder, under Section 47C of the Offences against the Person Ordinance.28
 
A study has shown that the local homicide rate in Hong Kong is lower than global averages (0.32 vs 6.1 victims per 100 000 population in 2017),29 but no filicide-specific data are available. The underlying hypothesis in this study was that the incidence of filicide would be lower in Hong Kong than in Western countries, consistent with the lower local homicide rate and the protective effects of cultural factors. The objectives of this study were to describe the epidemiology of filicide in Hong Kong, examine the characteristics of victims and perpetrators (including associated mental illnesses), and evaluate the local criminal justice system’s response to infanticide and other forms of filicide.
 
Methods
Data were obtained from the Hong Kong Police Force regarding child homicide cases that occurred from 2003 to 2017. These data included the age and gender of the victim, relationship of the perpetrator to the victim, mode of death, year of offence, and charges against the defendant along with corresponding outcomes and sentences. Medical records from the Hospital Authority and the SLPC of the Correctional Services Department were reviewed to determine any history of mental illness. Psychiatric diagnoses of the perpetrators, based on the International Classification of Diseases, Tenth Revision, were documented during forensic psychiatric assessments conducted by two psychiatrists, at least one of whom was a specialist. For the minority of defendants who were not sent to psychiatric hospitals or SLPC after the offences, the presence or absence of mental illness was cross-referenced using newspaper articles. Charges and sentences were verified through judgements available on the Judiciary’s official website.
 
All statistical analyses were performed using SPSS software (Windows version 21.0; IBM Corp, Armonk [NY], US). Data were analysed with descriptive statistics, including the mean, median, standard deviation, 95% confidence interval, and percentages for categorical variables. Differences between groups in demographic characteristics were assessed using t tests and univariate analysis of variance for continuous data. The Kruskal–Wallis test was used for non-normally distributed or ordinal data, and the Chi squared test was used for nominal data.
 
Results
Epidemiology of child homicide
From 2003 to 2017, 107 child homicide victims were recorded in Hong Kong, equating to approximately 0.70 deaths per 100 000 population per year, based on a population of 1 024 000 children aged <18 years in 2010.30 Among these victims, 81 (75.7%) were killed by their parents (Fig).
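The rate quoted above can be reproduced with the short calculation below. This is a sketch under two assumptions that the text implies but does not spell out: the rate is an annual average over the 15 calendar years from 2003 to 2017, and the 2010 child population serves as the denominator for every year.

```python
# Figures reported in the text.
child_homicide_victims = 107   # victims recorded from 2003 to 2017
filicide_victims = 81          # victims killed by their parents
child_population = 1_024_000   # children aged <18 years in 2010

# Assumption: 2003 to 2017 inclusive is treated as 15 years of observation.
years = 2017 - 2003 + 1

# Average annual child homicide rate per 100 000 children.
annual_rate = child_homicide_victims / years / child_population * 100_000

# Share of child homicide victims who were killed by a parent.
filicide_share = 100 * filicide_victims / child_homicide_victims

print(round(annual_rate, 2), round(filicide_share, 1))
```

Under these assumptions, the calculation agrees with the reported rate of approximately 0.70 per 100 000 and with the 75.7% of victims killed by their parents.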
 

Figure. Child homicide cases in Hong Kong from 2003 to 2017
 
Characteristics of victims and perpetrators
Among the filicide victims (n=81), 53.1% were female, 45.7% were male, and the gender of the remaining victim was unknown. There was no significant correlation between the gender of the victim and the gender of the perpetrator (χ2=0.13; P=0.82). The median age of the victims was 6 years (interquartile range [IQR]=0-8) [Table 1].
 

Table 1. Demographics of victims in infanticide and other filicide cases
 
Of the 81 filicide victims, 54 (66.7%) were killed by their mothers, 16 (19.8%) by their fathers, and 11 (13.6%) by both parents. The median age of victims varied across perpetrator groups; the paternal group victims had a median age of 7.5 years (IQR=5-10.25), compared with 0 year (IQR=0-3.5) for the maternal group and 2 years (IQR=0-5) for the parental couple group (H2=14.31; P<0.001).
 
Characteristics of infanticide and other filicide cases
Forty victims aged <1 year were killed by 44 perpetrators, and 41 victims aged ≥1 year were killed by 40 perpetrators. No significant gender differences were observed among the victims (Table 1).
 
The median age of paternal perpetrators, 43.5 years, was significantly older than the median ages of maternal and parental couple perpetrators (H2=16.50; P<0.001). The median age of offenders in the infanticide group was younger than that of offenders in the other filicide group. In the infanticide group, nine mothers (26.5%) were <20 years, and all pregnancies had been concealed. These infants were killed immediately after birth. Single offenders were more prevalent in the infanticide group, whereas married offenders were more common in the other filicide group. Biological mothers were the main perpetrators in both groups; similar to paternal and couple perpetrators, maternal perpetrators were younger in the infanticide group (Table 2). The maternal group was responsible for 40% of victims aged <4 years, compared with 7.1% in the paternal group. A higher prevalence of mental illness was identified among perpetrators, particularly mothers, in the other filicide group. Among perpetrators in the infanticide group, depression (40%) was the most common diagnosis, followed by a psychotic disorder (20%), mental and behavioural disorders due to psychoactive substance use (20%), and mental retardation (20%). The only biological father in the infanticide group was diagnosed with harmful use of alcohol. In the other filicide group, among maternal perpetrators, 25.0% had a psychotic disorder, 18.8% had depression, 6.4% had bipolar affective disorder, and the remainder had unknown diagnoses. Among paternal perpetrators, 18.0% had depression, 9.1% had a psychotic disorder, and the remainder had undocumented diagnoses.
 

Table 2. Demographics of offenders in infanticide and other filicide cases
 
Suffocation or strangulation was the most common mode of death in infanticides, occurring in 95.7% of cases with maternal perpetrators. In contrast, paternal perpetrators (100%) and couples (50%) caused death mainly by bashing, throwing, or shaking the infants. The two most common modes of death across all filicides were drug overdose or poisoning (including charcoal burning) and stabbing. Drug overdose or poisoning was most frequently performed by maternal perpetrators (36.8%) and couples (57.1%), whereas paternal perpetrators most often engaged in stabbing (57.1%).
 
Excluding the four perpetrators who died by suicide, 80.0% of perpetrators in the infanticide group faced criminal charges and were convicted. The most common convictions were concealing the birth of a child, manslaughter, and infanticide (Table 2). In the other filicide group, excluding the 18 perpetrators who died by suicide, 95.5% of perpetrators were charged and convicted; manslaughter was the most common conviction, followed by murder. Sentences significantly differed between the infanticide and other filicide groups. Noncustodial sentences were more frequent in the infanticide group than in the other filicide group. Given the higher prevalence of mental illness in the other filicide group, 33.3% (5/15) of the perpetrators were convicted of manslaughter under diminished responsibility and sentenced to a Hospital Order, compared with 6.3% in the infanticide group (Table 2). Among paternal and couple perpetrators, 80% in the infanticide group and 92.3% in the other filicide group received prison sentences, ranging from 3 to 10 years and 18 months to life imprisonment, respectively. Similar proportions of maternal perpetrators in both groups—41.0% in the infanticide group and 42.9% in the other filicide group—were imprisoned. Among maternal perpetrators in the infanticide group, all but one received prison sentences of <1 year; the exception received an 8-year sentence. In the other filicide group, maternal perpetrators received sentences of 4 to 7 years.
 
Filicide-suicide is defined as the perpetrator dying by suicide within 24 hours of committing filicide. A significantly greater proportion of filicide-suicides occurred in the other filicide group. In the infanticide group, all perpetrators were biological mothers. In contrast, within the other filicide group, half of maternal perpetrators and 66.7% of paternal perpetrators had a diagnosed mental illness. The difference in mental illness prevalence between the two groups was not statistically significant (Table 3).
 

Table 3. Characteristics of perpetrators in filicide-suicide cases (n=22)
 
Mental illness of filicide offenders
Of the 84 filicide perpetrators, 26 (31.0%) were diagnosed with mental illness. No mental illness was reported in the parental couple group. A higher prevalence of mental illness was observed among paternal perpetrators (58.3%) than among maternal perpetrators (38.0%), although the difference was not statistically significant. Depression was the most common diagnosis, followed by psychotic disorder. In cases of filicide-suicide, mental illness prevalence was higher among paternal perpetrators; this difference was not statistically significant (Table 4).
 
Excluding perpetrators who died by suicide, 41.7% of maternal perpetrators with mental illness received a Hospital Order for an unspecified period. Among the three paternal perpetrators with mental illness who did not die by suicide, only one (33.3%) was sentenced to a Hospital Order for an unspecified period.
 

Table 4. Mental illness in filicide perpetrators (n=26)
 
Discussion
The incidence of child homicide in Hong Kong, at 0.7 per 100 000 population, is lower than the global average (1.6 per 100 000 population)1 and lower than that of Asian countries with similar socio-economic status, such as South Korea (1.03 per 100 000 population).31 The protective influence of traditional Confucian cultural values may play a prominent role in Hong Kong.32 An idiom from the Sung dynasty, ‘even a vicious tiger would not eat its cubs’, continues to be taught in modern primary schools. This cultural ethos could explain why the incidence of child maltreatment in Hong Kong, at <0.14%,30 remains lower than the global rate of 0.3% to 0.4%.33 Consistent with studies worldwide,3 most child homicides in Hong Kong were perpetrated by parents. Mothers were the predominant perpetrators in filicides. The typical profile of an infanticidal perpetrator was a young, single mother who suffocated or strangled the infant. Some cases may represent neonaticides, as suggested by charges of concealing the birth of a child. Among cases involving the filicide of older children, perpetrator characteristics were more heterogeneous. Perpetrators tended to be older and married; they used methods such as overdosing, poisoning, or stabbing. The profiles of perpetrators and victims in this group also differed. The median age of maternal perpetrators was younger and their victims tended to be younger. Mothers most often caused death through overdosing or poisoning, whereas fathers were more likely to kill by stabbing.
 
Mental illness in filicides
In the present study, 31.0% of filicidal perpetrators had a diagnosed mental illness, a lower rate compared with other population studies.8 20 22 23 This discrepancy could be attributed to the lower prevalence of mental illness in Hong Kong. The Hong Kong Mental Morbidity Survey (2010-2013) revealed a 13.3% prevalence of mental disorders among Chinese adults,34 compared with 18.5% among adults in the US in 2013.35 It is also plausible that some perpetrators, especially those involved in filicide-suicide cases, had no prior contact with mental health services and may have had undiagnosed psychiatric illnesses. Mental illness prevalence was higher among paternal perpetrators than among maternal perpetrators in our filicide sample. This finding may be related to the small sample size or could reflect societal changes, such as fathers assuming greater childcare responsibilities.17 Consistent with some studies,20 22 depression was the most common diagnosis, followed by psychotic disorder.
 
Filicide-suicides
Substantial proportions of filicide perpetrators (23.0% of maternal and 34.8% of paternal) died by suicide during or after committing the act. Charcoal burning was the most common method, with a frequency comparable to that of jumping from height. Charcoal burning is a relatively recent suicidal method,36 which has spread as a contagious phenomenon in other Asian countries; it is often portrayed as a ‘peaceful way of dying’ and has been used during >10% of suicides in the region.37 The proportion of filicide-suicides observed in this study was lower than that reported in other studies.17 23 This difference may be related to the lower prevalence of mental illness in our sample, the relatively lower lethality of charcoal burning in Hong Kong compared with firearm use in Western countries, or the possibility that attempted suicides not resulting in death were not captured in our data. Filicide-suicide events were more frequent in cases involving older children than in infanticides, potentially due to differences in underlying motives. Half of the filicide-suicide perpetrators in the present study had a history of mental illness, suggesting that altruistic motives may have been involved. Depression was the most frequently diagnosed condition in these cases.18 20
 
The local law and filicides
The majority of perpetrators with mental illness were convicted of manslaughter under diminished responsibility and sentenced to a Hospital Order at SLPC for an unspecified period under Section 45 of the Mental Health Ordinance.27 No insanity pleas were recorded in our sample. Consistent with international studies,38 maternal perpetrators in Hong Kong received more lenient outcomes relative to paternal perpetrators. Some young mothers who killed their children aged <1 year were released without charge; among those convicted, a few received noncustodial sentences. In contrast, all fathers who killed their children were imprisoned, with the exception of one who was sentenced to a Hospital Order at SLPC.
 
Hong Kong developed its legislation based on the UK law, including the British Infanticide Act of 1938.21 24 Section 47C of the Offences against the Person Ordinance28 defines the offence of infanticide as follows: “Where a woman by any wilful act or omission causes the death of her child being a child under the age of 12 months but at the time of the act or omission the balance of her mind was disturbed by reason of her not having fully recovered from the effect of giving birth to the child or by reason of the effect of lactation consequent upon the birth of the child, then, notwithstanding that the circumstances were such that but for the provisions of this section the offence would have amounted to murder, she shall be guilty of infanticide, and shall be liable to be punished as if she were guilty of manslaughter.” In the present study, eight mothers who killed their children aged <1 year were convicted under the infanticide provision. There appears to be considerable application of this provision in Hong Kong; lenient noncustodial sentences are issued to mothers in such cases.
 
Limitations
First, information provided by the Police was restricted to arrest cases; thus, the study may underreport the true incidence of filicides in Hong Kong. Second, although multiple sources of information were utilised, details regarding the perpetrators’ and victims’ abuse or victimisation histories, involvement with social services, or autopsy reports were unavailable. Third, the classification of neonaticides was challenging, although charges of concealing the birth of a child may indicate the death of a victim within 24 hours of birth. Fourth, although most diagnoses of offenders with mental illnesses were accessible, the availability of psychiatric records was limited. Information for a small number of cases (<5) was obtained from newspaper reports. Fifth, the absence of critical details, such as the onset of mental illness, symptomatology, and medication adherence, impeded a thorough exploration of the relationship between mental illness and filicides. A more comprehensive approach, such as conducting psychological autopsies, particularly in filicide-suicide cases, would provide deeper insights. Finally, the sample size was insufficient to allow for robust comparisons among perpetrators in maternal, paternal, parental couple, and stepparent filicide groups.
 
Conclusion
In this study, most child homicides were perpetrated by parents; mothers committed filicide more frequently than fathers. Maternal perpetrators and their victims were younger than their counterparts in the paternal perpetrator group. Mental illness was prevalent among filicidal perpetrators of both genders, with a higher prevalence in paternal perpetrators. Filicide-suicide is a substantial problem. Psychiatrists should remain vigilant in identifying depressed or psychotic parents and in eliciting self-harm or filicidal ideations among both mothers and fathers. Social support and child protection services should be actively offered to young single mothers. In Hong Kong, a comprehensive child development service has been in place since 2005,39 with the aim of identifying and intervening early in cases that involve children and mothers in need; this service seeks to improve health outcomes for children and families. However, no local policies specifically address the needs of fathers. A multidisciplinary approach involving mental health professionals and social workers is recommended to screen fathers experiencing mental illness or distress and to identify early warning signs of risk. Finally, given the high prevalence of mental illness among filicidal perpetrators, forensic psychiatrists and related professionals should maintain a high index of suspicion for the presence of mental illness when evaluating filicidal offenders.
 
Author contributions
Concept or design: All authors.
Acquisition of data: YDY Tang.
Analysis or interpretation of data: YDY Tang, JPY Lam.
Drafting of the manuscript: YDY Tang.
Critical revision of the manuscript for important intellectual content: YDY Tang.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the New Territories West Cluster Research Ethics Committee of the Hospital Authority, Hong Kong (Ref No.: NTWC/REC/19021). A waiver for informed patient consent was granted by the Committee due to the retrospective nature of the research.
 
References
1. United Nations Office on Drugs and Crime. Global Study on Homicide. Killing of Children and Young Adults. 2019. Available from: https://www.unodc.org/documents/data-and-analysis/gsh/Booklet_6new.pdf. Accessed 6 Apr 2023.
2. UNICEF. Hidden in plain sight: a statistical analysis of violence against children. New York: United Nations International Children’s Emergency Fund. 2014 Sep 4. Available from: https://data.unicef.org/resources/hidden-in-plain-sight-a-statistical-analysis-of-violence-against-children/. Accessed 6 Apr 2023.
3. Stöckl H, Dekel B, Morris-Gehring A, Watts C, Abrahams N. Child homicide perpetrators worldwide: a systematic review. BMJ Paediatr Open 2017;1:e000112.
4. Resnick PJ. Child murder by parents: a psychiatric review of filicide. Am J Psychiatry 1969;126:325-34.
5. Bourget D, Bradford JM. Homicidal parents. Can J Psychiatry 1990;35:233-8.
6. Wilson RF, Klevens J, Fortson B, Williams D, Xu L, Yuan K. Neonaticides in the United States—2008-2017. Acad Forensic Pathol 2022;12:3-14.
7. Vanamo T, Kauppi A, Karkola K, Merikanto J, Räsänen E. Intra-familial child homicide in Finland 1970-1994: incidence, causes of death and demographic characteristics. Forensic Sci Int 2001;117:199-204.
8. Bourget D, Gagné P. Paternal filicide in Québec. J Am Acad Psychiatry Law 2005;33:354-60.
9. Mariano TY, Chan HC, Myers WC. Toward a more holistic understanding of filicide: a multidisciplinary analysis of 32 years of U.S. arrest data. Forensic Sci Int 2014;236:46-53.
10. Brookman F, Nolan J. The dark figure of infanticide in England and Wales: complexities of diagnosis. J Interpers Violence 2006;21:869-89.
11. Sahni M, Verma N, Narula D, Varghese RM, Sreenivas V, Puliyel JM. Missing girls in India: infanticide, feticide and made-to-order pregnancies? Insights from hospital-based sex-ratio-at-birth over the last century. PLoS One 2008;3:e2224.
12. Dawson M. Canadian trends in filicide by gender of the accused, 1961–2011. Child Abuse Negl 2015;47:162-74.
13. Camperio Ciani AS, Fontanesi L. Mothers who kill their offspring: testing evolutionary hypothesis in a 110-case Italian sample. Child Abuse Negl 2012;36:519-27.
14. Harris GT, Hilton NZ, Rice ME, Eke AW. Children killed by genetic parents versus stepparents. Evol Hum Behav 2007;28:85-95.
15. West SG, Friedman SH, Resnick PJ. Fathers who kill their children: an analysis of the literature. J Forensic Sci 2009;54:463-8.
16. Friedman SH, Horwitz SM, Resnick PJ. Child murder by mothers: a critical analysis of the current state of knowledge and a research agenda. Am J Psychiatry 2005;162:1578-87.
17. Bourget D, Grace J, Whitehurst L. A review of maternal and paternal filicide. J Am Acad Psychiatry Law 2007;35:74-82.
18. Hatters Friedman S, Hrouda DR, Holden CE, Noffsinger SG, Resnick PJ. Filicide-suicide: common factors in parents who kill their children and themselves. J Am Acad Psychiatry Law 2005;33:496-504.
19. West SG, Hatters Friedman S. Filicide: a research update. In: Browne RC, editor. Forensic Psychiatry Research Trends. New York: Nova Science Publishers; 2007: 29-62.
20. Bourget D, Gagné P. Maternal filicide in Québec. J Am Acad Psychiatry Law 2002;30:345-51.
21. Hatters Friedman S, Resnick PJ. Child murder by mothers: patterns and prevention. World Psychiatry 2007;6:137-41.
22. Flynn SM, Shaw JJ, Abel KM. Filicide: mental illness in those who kill their children. PLoS One 2013;8:e58981.
23. Kauppi A, Kumpulainen K, Karkola K, Vanamo T, Merikanto J. Maternal and paternal filicides: a retrospective review of filicides in Finland. J Am Acad Psychiatry Law 2010;38:229-38.
24. Legislation.gov.uk. Infanticide Act 1938. Available from: https://www.legislation.gov.uk/ukpga/Geo6/1-2/36. Accessed 25 Mar 2025.
25. Oberman M. Mothers who kill: coming to terms with modern American infanticide. Am Crim L Rev 1996;34:1-110.
26. Hong Kong SAR Government. Criminal Procedure Ordinance (Cap 221). Available from: https://www.elegislation.gov.hk/hk/cap221. Accessed 17 Mar 2025.
27. Hong Kong SAR Government. Mental Health Ordinance (Cap 136). Available from: https://www.elegislation.gov.hk/hk/cap136. Accessed 6 Apr 2023.
28. Hong Kong SAR Government. Offences against the Person Ordinance (Cap 212). Available from: https://www.elegislation.gov.hk/hk/cap212. Accessed 6 Apr 2023.
29. Hong Kong Police Force. Crime statistics comparison. 2017. Available from: https://www.police.gov.hk/ppp_en/09_statistics/csc.html. Accessed 6 Apr 2023.
30. Child Fatality Review Panel, Social Welfare Department, Hong Kong SAR Government. Second report for child death cases in 2010-2011. July 2015. Available from: https://www.swd.gov.hk/storage/asset/section/655/en/fcw/CFRP2R-Eng.pdf. Accessed 6 Apr 2023.
31. Jung K, Kim H, Lee E, et al. Cluster analysis of child homicide in South Korea. Child Abuse Negl 2020;101:104322.
32. Lassi N. A Confucian theory of crime [dissertation]. University of North Dakota; 2018.
33. Stoltenborgh M, Bakermans-Kranenburg MJ, Alink LR, van IJzendoorn MH. The prevalence of child maltreatment across the globe: review of a series of meta-analyses. Child Abuse Rev 2015;24:37-50.
34. Lam LC, Wong CS, Wang MJ, et al. Prevalence, psychosocial correlates and service utilization of depressive and anxiety disorders in Hong Kong: the Hong Kong Mental Morbidity Survey (HKMMS). Soc Psychiatry Psychiatr Epidemiol 2015;50:1379-88.
35. Center for Behavioral Health Statistics and Quality, Substance Abuse and Mental Health Services Administration, US Department of Health and Human Services. Results from the 2013 National Survey on Drug Use and Health: Mental Health Findings. November 2014. Available from: https://www.samhsa.gov/data/sites/default/files/NSDUHmhfr2013/NSDUHmhfr2013.pdf. Accessed 6 Apr 2023.
36. The Hong Kong Jockey Club Centre for Suicide Research and Prevention, The University of Hong Kong. Statistics of suicide data in Hong Kong (by year). Distribution of method of suicide by age group in Hong Kong. 2020. Available from: https://www.csrp.hku.hk/statistics/. Accessed 6 Apr 2023.
37. Chang SS, Chen YY, Yip PS, Lee WJ, Hagihara A, Gunnell D. Regional changes in charcoal-burning suicide rates in East/Southeast Asia from 1995 to 2011: a time trend analysis. PLoS Med 2014;11:e1001622.
38. Porter T, Gavin H. Infanticide and neonaticide: a review of 40 years of research literature on incidence and causes. Trauma Violence Abuse 2010;11:99-112.
39. Education Bureau, Hong Kong SAR Government. Comprehensive Child Development Service. 2021. Available from: https://www.edb.gov.hk/en/edu-system/preprimary-kindergarten/comprehensive-child-development-service/index.html. Accessed 18 Mar 2025.

Use of pronase in screening for early cancers of the upper gastrointestinal tract

© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE (HEALTHCARE IN CHINA)
Use of pronase in screening for early cancers of the upper gastrointestinal tract
Zhengqi Wu, BSc1; Shihua Li, BSc2; Linzhi Lu, BSc2; Zhiyi Zhang, BSc2; Guiqi Wang, BSc, MD3; Tianyan Qin, MSc2; Guangyuan Zhao, MSc2; Jindian Liu, MSc2
1 Department of Gastroenterology, Wuwei Liangzhou Hospital, Wuwei, China
2 Department of Gastroenterology, Wuwei Tumor Hospital, Wuwei, China
3 Department of Endoscopy, Cancer Hospital, Chinese Academy of Medical Sciences, Beijing, China
 
Corresponding author: Prof Zhengqi Wu (wzqwwzl@163.com)
 
Abstract
Introduction: This study aimed to investigate the effectiveness of pronase in improving the detection rate of early cancer and enhancing visual field clarity during gastroscopy in China.
 
Methods: In total, 1450 patients who participated in an early diagnosis and treatment programme of upper gastrointestinal cancer in Wuwei, Gansu Province between 2020 and 2021 were enrolled. Cluster randomisation was utilised at the community level. All patients underwent endoscopy and biopsy. The experimental group (n=725) received pronase granules and dimethicone prior to gastroscopy; the control group (n=725) received dimethicone alone. Endoscopic visibility scores, examination durations, and lesion detection rates were recorded for both groups.
 
Results: Visibility scores for all regions of the stomach were significantly lower in the experimental group than in the control group (P<0.001). This finding remained consistent after adjustment for confounding factors in multiple linear regression analysis. The detection rate of precancerous lesions and early cancer was significantly higher in the experimental group than in the control group (77.5% vs 62.5%; P<0.001). Binary logistic regression analysis indicated that the likelihood of detecting early cancer was greater in the experimental group, with an odds ratio of 3.840 (95% confidence interval=1.204-12.241; P=0.023). Also, average gastroscopy time was significantly shorter in the experimental group than in the control group (6.52±2.51 min vs 10.03±1.23 min, t=33.81; P=0.001).
 
Conclusion: The administration of pronase prior to gastroscopy enhances visual field clarity, reduces examination time, and increases the detection rates of precancerous lesions and early cancer.
 
 
New knowledge added by this study
  • Pronase enhances visual field clarity during gastroscopy and reduces examination time.
  • Pronase can enhance diagnostic precision by minimising misdiagnoses and missed lesions.
Implications for clinical practice or policy
  • Pronase improves the detection rates of precancerous lesions and early cancer. The results provide a strong scientific foundation for using pronase in endoscopic screening during clinical diagnostic examinations.
  • The findings support adoption of pronase as a standard adjunct in gastroscopy to improve diagnostic accuracy and procedural efficiency.
 
 
Introduction
The implementation of early gastric cancer screening in community populations and performance of endoscopic examinations in high-risk groups represent a feasible, cost-effective, and efficient strategy to address the challenges of gastric cancer diagnosis and treatment in China.1 More than 80% of early-stage gastric cancer cases are identified in asymptomatic community populations aged ≥40 years. Thus, community-based screening programmes are important for increased detection of early-stage cancer. Gastroscopy remains the gold standard for diagnosing upper gastrointestinal diseases. High-quality intragastric visibility is essential for ensuring diagnostic accuracy, minimising the risks of misdiagnosis and missed diagnosis, and improving the detection of minimal-change gastric lesions. However, air bubbles and mucus in the stomach often reduce gastroscopic field visibility, leading to missed diagnoses and prolonged examination times. Pretreatment with defoaming agents and mucolytic agents enhances gastroscopic field visibility.2 Pronase, a proteolytic enzyme isolated from the culture filtrate of Streptomyces griseus, effectively cleaves the peptide bonds of glycoproteins, thereby dissolving and eliminating gastric mucus.3 This study aimed to evaluate the impact of pronase on the detection rate of precancerous lesions and early cancer, clarifying its utility in early gastric cancer screening. The findings will provide foundational evidence for the incorporation of pronase in endoscopic screening for upper gastrointestinal tract cancers and clinical diagnostic examinations.
 
Methods
Participants
This study enrolled 1450 individuals aged 40 to 70 years from a community population who participated in the 2020-2021 Upper Gastrointestinal Cancer Screening Programme in Wuwei, Gansu Province, China. The inclusion criteria were: (1) ability to cooperate with the gastroscopic procedure; (2) ability to discontinue anticoagulant medications 1 week prior to endoscopy; and (3) voluntary participation and provision of written informed consent. The exclusion criteria were: (1) contraindications to gastroscopy; (2) severe heart disease or heart failure; (3) severe respiratory disease; (4) posterior pharyngeal abscess or severe spinal deformity; (5) other serious illnesses or physical conditions that precluded tolerance of endoscopy; and (6) bleeding tendency.
 
Gastroscopy examinations
Using a random number table, all 1450 participants from the community population were randomly assigned to either an experimental group (n=725) or a control group (n=725). All participants underwent gastroscopy and tissue biopsy. In the experimental group, 1 sachet (20 000 U) of pronase (Beijing Tide Pharmaceutical, Beijing, China) and 1 g of sodium bicarbonate were dissolved in 50 to 80 mL of drinking water (20-40°C) by shaking. The solution was orally administered 15 to 30 minutes before gastroscopy (GIF-H290; Olympus, Tokyo, Japan). Dimethicone was also given orally to lubricate the cavity and remove gastric bubbles. To ensure that pronase reached all areas of the stomach, participants lay flat on a bed under a nurse’s guidance, then turned sideways three to five times. Subsequently, routine gastroscopy was performed. In the control group, participants received oral dimethicone 15 to 30 minutes before routine gastroscopy (GIF-H290).
 
The gastroscopy examinations were performed by two physicians holding the title of associate chief physician or higher, each having >10 years of experience in gastroscopy. The visibility of each part of the visual field was evaluated during the procedure; pathological examinations were conducted on tissue biopsies collected from minimal-change lesions.
 
Observation indicators
Endoscopic visibility scores were compared between the two groups. Scoring criteria were as follows4: 1 point, no mucus; 2 points, a small amount of mucus but no blurring of the visual field; 3 points, a large amount of mucus with a blurred visual field, requiring <30 mL of water for rinsing; and 4 points, very thick mucus with a blurred visual field, requiring ≥30 mL of water for rinsing. Lower scores indicated better endoscopic visibility. To minimise errors during the scoring process, each visibility score was recorded as the average of scores assigned by the two physicians who performed gastroscopy. The lesion detection rate was defined as the percentage of subjects within a group in whom lesions were identified. Gastroscopy time was measured from entry of the gastroscope into the oesophagus until its removal. Adverse reactions included nausea, vomiting, difficulty breathing, facial flushing, and other symptoms.
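The scoring rules above can be expressed as a small lookup, which may help when recording scores consistently. This is a minimal sketch rather than the study's actual data-entry procedure: the function names and the mucus labels ("none", "small", "large", "very thick") are hypothetical, and observations that fit none of the four criteria are rejected instead of being guessed.

```python
def visibility_score(mucus: str, blurred: bool, rinse_ml: float = 0.0) -> int:
    """Map one observation to the 1-4 visibility score (lower is better).

    `mucus` uses hypothetical labels: "none", "small", "large", "very thick".
    """
    if mucus == "none":
        return 1
    if mucus == "small" and not blurred:
        return 2
    if mucus == "large" and blurred and rinse_ml < 30:
        return 3
    if mucus == "very thick" and blurred and rinse_ml >= 30:
        return 4
    raise ValueError("observation does not match any scoring criterion")


def recorded_score(score_a: int, score_b: int) -> float:
    """Average of the scores assigned by the two physicians."""
    return (score_a + score_b) / 2


def detection_rate(n_with_lesions: int, n_subjects: int) -> float:
    """Percentage of subjects in a group in whom lesions were identified."""
    return 100 * n_with_lesions / n_subjects
```

For example, if one physician assigns 2 points and the other assigns 3 points to the same region, `recorded_score(2, 3)` yields 2.5.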
 
Statistical analyses
R software (version 4.0.5) was used for statistical analysis. Quantitative data were expressed as mean±standard deviation; intergroup differences were analysed using independent sample t tests. Qualitative data were expressed as frequency and percentage; intergroup differences were assessed using the Chi squared test or Fisher’s exact test. Multivariable linear regression analysis was performed to evaluate the effect of group assignment on visibility scores after adjustment for confounding factors. Differences in early cancer detection rates between the two groups were analysed using multivariable binary logistic regression analysis. All statistical tests were two-sided, and P values <0.05 were considered statistically significant.
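The independent sample t test can be illustrated from summary statistics alone. The sketch below uses Python rather than R and takes the gastroscopy times reported in the Abstract (6.52±2.51 min vs 10.03±1.23 min, with 725 participants per group). The function name is illustrative, and the unequal-variance (Welch) form of the standard error is an assumption; with equal group sizes it gives the same statistic as the pooled form.

```python
from math import sqrt


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic computed from means, SDs, and group sizes."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


# Control group first, so that the statistic is positive.
t = t_from_summary(10.03, 1.23, 725, 6.52, 2.51, 725)
print(f"t = {t:.2f}")  # prints: t = 33.81
```

The result agrees with the t value of 33.81 reported in the Abstract.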
 
Results
A summary of the baseline characteristics of the experimental and control groups is provided in Table 1. Among the 1450 patients in the cohort, 416 (28.7%) had a family history of gastrointestinal disease, 172 (11.9%) had a history of smoking, 91 (6.3%) had a history of alcohol consumption, and 335 (23.1%) had a history of gastrointestinal disease. Significant differences between the two groups were observed in the proportions of patients with a history of smoking, alcohol consumption, and gastrointestinal disease.
 

Table 1. Baseline characteristics of the study groups
 
Average visibility scores for the oesophagus, cardia, gastric fundus, gastric body, gastric antrum, gastric angle, and duodenum were significantly lower in the experimental group than in the control group (P<0.001 for all comparisons) [Table 2]. The visibility of different regions of the stomach under gastroscopy substantially differed between the two groups (Fig).
 

Table 2. Gastroscopy visibility scores of the study groups
 

Figure. Images of each part of the stomach under gastroscopy: (a) oesophagus, (b) cardia, (c) fundus, (d) corpus, and (e) duodenum. Upper and lower images show experimental and control groups, respectively
 
Effect of pronase on visibility score
Multiple linear regression analysis was performed with the visibility score for each site as the dependent variable and group assignment as the independent variable; adjustments were conducted for sex, age, marital status, education level, smoking status, alcohol consumption, history of gastrointestinal disease, and family history of gastrointestinal disease. After adjustment for these confounding factors, the visibility scores for all regions of the stomach remained significantly higher in the control group than in the experimental group (P<0.001 for all visibility scores) [Table 3].
 

Table 3. Effect of pronase on visibility score
 
Lesion and early cancer detection rates
Chi squared test analyses revealed that the detection rates of precancerous lesions (including atrophic gastritis, intestinal metaplasia, and low-grade intraepithelial neoplasia5) and early cancer were significantly higher in the experimental group than in the control group (77.5% vs 62.5%; P<0.001) [Table 4].
 

Table 4. Rates of lesion detection in the study groups
 
Multivariable binary logistic regression analysis was performed with early cancer detection as the dependent variable and group assignment as the independent variable; adjustments were conducted for sex, age, marital status, education level, smoking status, alcohol consumption, history of gastrointestinal disease, and family history of gastrointestinal disease. The likelihood of early cancer detection was significantly higher in the experimental group compared with the control group, with an odds ratio of 3.840 (95% confidence interval=1.204-12.241; P=0.023) [Table 5].
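As a rough consistency check, the reported P value can be approximated from the odds ratio and its confidence interval alone. The sketch below assumes that the interval is a Wald interval, symmetric on the log-odds scale, which is the usual output of logistic regression software but is not stated in the text. The function name is hypothetical.

```python
import math

def wald_check(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error, z statistic, and two-sided P value
    from an odds ratio and its 95% confidence interval.

    Assumption: the interval is exp(log(OR) +/- z_crit * SE).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value

se, z, p_value = wald_check(3.840, 1.204, 12.241)
# For a Wald interval, the odds ratio is the geometric mean of the limits
geometric_mean = math.sqrt(1.204 * 12.241)
```

Under this assumption, the recovered two-sided P value is approximately 0.023, and the geometric mean of the interval limits is close to 3.84, so the three reported numbers are mutually consistent.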
 

Table 5. Comparison of early cancer detection rates between the study groups
 
Examination time
Average gastroscopy time was significantly shorter in the experimental group than in the control group (6.52±2.51 vs 10.03±1.23 minutes; t=33.81; P<0.001).
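The t statistic can be approximated from the summary statistics. Group sizes are not given in this section, so the sketch below assumes, purely for illustration, that the 1450 patients were split equally (725 per group). With equal group sizes, the pooled-variance and unequal-variance forms of the independent-samples t test give the same statistic.

```python
import math

def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic computed from means, SDs, and sizes."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Control group minus experimental group; n = 725 per group is an assumption
t_stat = t_from_summary(10.03, 1.23, 725, 6.52, 2.51, 725)
```

Under the equal-split assumption, the result is close to the reported t=33.81, a value far beyond the threshold for P<0.001.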
 
Adverse reactions
No adverse reactions, such as nausea, vomiting, dyspnoea, or facial flushing, were reported in either group.
 
Discussion
Currently, approximately 90% of primary gastric cancers in China are diagnosed at an advanced stage.6 The prognosis of affected patients is closely related to the timing of diagnosis and treatment. Despite surgical intervention, the 5-year survival rate for patients with advanced gastric cancer remains <30%.7 After treatment, the 5-year survival rate for patients with early gastric cancer exceeds 90%, and cure may be achieved.8 However, the rates of early diagnosis and treatment of gastric cancer in China are <10%, substantially lower than rates reported in Japan (70%) and South Korea (50%).9 In Wuwei, the incidence and mortality rates of gastric cancer remain among the highest in the country; gastric cancer ranks first among malignant tumours in the city.10 Screening for upper gastrointestinal cancer is one of the most effective methods for population-level detection of early-stage cancer. Since 2010, Wuwei Tumour Hospital has implemented an upper gastrointestinal cancer screening programme (endoscopy combined with tissue biopsy) in Wuwei. Improvements in the detection rates of precancerous lesions and upper gastrointestinal cancer are key objectives of this screening initiative.
 
Gastroscopy is currently a widely used method for the clinical diagnosis and treatment of gastrointestinal diseases. A clear endoscopic field of vision is essential for accurate diagnosis and effective treatment by endoscopists. To optimise gastroscopy outcomes and enhance visibility within the stomach, bubbles and mucus must be removed. The use of pronase in combination with defoaming agents is recommended by the Consensus on Early Gastric Cancer Screening and Endoscopic Diagnosis and Treatment in China11 and the Guidelines for Endoscopic Diagnosis of Early Gastric Cancer (2019 edition) developed by the Japan Gastroenterological Endoscopy Society.12
 
Lee et al13 demonstrated that administering pronase 10 to 20 minutes before gastroscopy significantly improved the visibility of the endoscopic visual field and reduced the number of water washes required. Similarly, a multicentre randomised controlled study by Liu et al14 indicated that the combination of pronase and dimethicone significantly enhanced the visibility of the upper gastrointestinal mucosa. Pronase has also been utilised in narrow-band imaging endoscopy. A randomised controlled study by Cha et al15 compared the effects of orally administering pronase and simethicone 10 minutes before narrow-band imaging endoscopy on mucosal visibility and diagnostic performance. The results showed that mucosal visibility within the proximal stomach was significantly better in the pronase group than in the simethicone group.15 In the present study, the visibility scores for all sites in patients who received pronase were approximately 1 point, indicating minimal mucus adhesion. After adjustment for confounding factors, multiple linear regression analysis confirmed that visibility scores remained significantly lower in the pronase group than in the control group at all sites; this finding further validated the effectiveness of pronase. The present study also revealed that the average endoscopic examination time was significantly shorter (by approximately 3.5 minutes) in the pronase group than in the control group. This reduced examination time was attributed to the near-complete absence of mucus adhesion after pronase administration, which decreased the number of rinses needed during the procedure. A shorter examination may also enhance patient comfort and increase compliance with subsequent screenings.
 
Zhang et al16 and Gao et al17 conducted retrospective analyses of 25 314 patients who underwent gastroscopy at Nanfang Hospital of Southern Medical University and 166 260 patients at Bazhong Central Hospital, revealing early cancer detection rates of 0.2% and 0.62%, respectively. Zhang et al1 performed a follow-up analysis of individuals in Liangzhou District in Wuwei who underwent upper gastrointestinal cancer screening in 2017; they observed an early cancer detection rate of 2.8%.1 In the present study, lesion detection rates for the experimental and control groups were 77.5% and 62.5%, respectively; corresponding early cancer detection rates were 3.0% and 2.1%. These percentages align with findings from the previous study in Wuwei1 and are substantially higher than those reported for other regions.16 17 The present results suggest that in Wuwei, a region displaying one of the highest incidences of upper gastrointestinal cancer in China, early cancer screening should be actively promoted. Furthermore, the detection rates of precancerous lesions and early cancer can be improved by using endoscopy combined with tissue biopsy.
 
The efficacy of pronase in improving the endoscopic visual field is well established, but studies investigating its impacts on the detection rates of precancerous lesions and early cancer have yielded inconsistent results.14 18 19 Chen et al18 conducted a randomised controlled trial that enrolled older patients undergoing gastroscopy; they found that the detection rate of minimal-change lesions was higher in the pronase group than in the control group (45.2% vs 27.5%; P<0.05).18 Lee et al19 demonstrated that the use of pronase when rinsing a lesion during endoscopy significantly increased the tissue depth of endoscopic biopsies and improved the anatomical localisation of biopsy sites, thereby enhancing the accuracy of disease diagnosis. In the present study, the detection rates of precancerous lesions and early cancer were significantly higher in the experimental group than in the control group (P<0.001). After adjustment for confounding factors, multivariable logistic regression showed that the likelihood of detecting early cancer was significantly greater in the experimental group than in the control group (odds ratio=3.840; P=0.023) [Table 5]. This finding indicates that pronase pretreatment before gastroscopy can enhance the detection rates of precancerous lesions and early cancer. The enhancement may be attributed to the clear visual field provided by pronase, which facilitates accurate selection of biopsy sites and improves recognition of minimal-change lesions. Gastroscopy physicians have substantial daily workloads and manage large numbers of patients requiring treatment. The use of pronase reduced the time required for endoscopy, potentially easing this workload and improving patient compliance with endoscopic examination.
 
Limitations
As an early cancer screening study, this investigation had a relatively small sample size; therefore, the findings require further validation in large-scale clinical studies. Cluster randomisation was used in this study, leading to baseline differences between groups; however, adjustments for these factors were included in the statistical analyses. The gastroscopy procedures were performed by highly skilled endoscopists. The generalisability of the findings to all endoscopists warrants additional investigation.
 
Conclusion
Pronase pretreatment before gastroscopy improves visual field clarity, reduces examination time, increases the detection rates of precancerous lesions and early cancer, and demonstrates good safety. This approach is beneficial for early cancer screening in regions with a high incidence of upper gastrointestinal cancer. The practical value of this method requires confirmation in large-scale clinical studies.
 
Author contributions
Concept or design: Z Wu, S Li, G Wang.
Acquisition of data: L Lu, G Zhao, J Liu, S Li.
Analysis or interpretation of data: T Qin.
Drafting of the manuscript: Z Zhang.
Critical revision of the manuscript for important intellectual content: Z Wu.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Funding/support
This research was supported by the National Key Research and Development Program of China (Ref No.: 2017YFC0908302). The funder had no role in study design, data collection, analysis, interpretation, or manuscript preparation.
 
Ethics approval
This research was approved by the Medical Ethics Committee of Wuwei Cancer Hospital, Wuwei, Gansu, China (Ref No.: 2019-Ethical review-11). The trial was registered with the Chinese Clinical Trial Registry (Ref No.: ChiCTR2200064855). Informed consent was obtained from all study participants, including consent for the publication of their anonymised data and clinical photos.
 
References
1. Zhang Z, Wu Z, Lu L, et al. Analysis of the upper gastrointestinal cancer screening and follow-up results in Liangzhou District of Wuwei City from 2009 to 2017 [in Chinese]. Chin J Cancer Prev Treat 2019;23:1750-5.
2. Choi IJ. Gastric preparation for upper endoscopy. Clin Endosc 2012;45:113-4.
3. Kim GH, Cho YK, Cha JM, Lee SY, Chung IK. Effect of pronase as mucolytic agent on imaging quality of magnifying endoscopy. World J Gastroenterol 2015;21:2483-9.
4. Beg S, Ragunath K, Wyman A, et al. Quality standards in upper gastrointestinal endoscopy: a position statement of the British Society of Gastroenterology (BSG) and Association of Upper Gastrointestinal Surgeons of Great Britain and Ireland (AUGIS). Gut 2017;66:1886-99.
5. Gomceli I, Demiriz B, Tez M. Gastric carcinogenesis. World J Gastroenterol 2012;18:5164-70.
6. Committee of Laboratory Medicine of Chinese Association of Integrative Medicine. Chinese Expert Consensus on Detection Technologies for Early-stage Gastric Cancer Screening [in Chinese]. Chin J Lab Med 2023;46:347-59.
7. Katai H, Ishikawa T, Akazawa K, et al. Five-year survival analysis of surgically resected gastric cancer cases in Japan: a retrospective analysis of more than 100,000 patients from the nationwide registry of the Japanese Gastric Cancer Association (2001-2007). Gastric Cancer 2018;21:144-54.
8. Sumiyama K. Past and current trends in endoscopic diagnosis for early-stage gastric cancer in Japan. Gastric Cancer 2017;20(Suppl 1):20-7.
9. Ren W, Yu J, Zhang Z, Song Y, Li Y, Wang L. Missed diagnosis of early gastric cancer or high-grade intraepithelial neoplasia. World J Gastroenterol 2013;19:2092-6.
10. Lu L, Nie P, Zhang Z. Analysis of incidence and mortality of stomach cancer from 2011 to 2015 in Wuwei City, Gansu Province [in Chinese]. China Cancer 2020;29:677-81.
11. Chinese Society of Digestive Endoscopy; Chinese Anti-Cancer Association The Society of Tumor Endoscopy. Chinese Consensus on Screening and Endoscopic Diagnosis and Management of Early Gastric Cancer (Changsha, April 2014) [in Chinese]. Chin J Gastroenterol 2014;19:408-27.
12. Yao K, Uedo N, Kamada T, et al. Guidelines for endoscopic diagnosis of early gastric cancer. Dig Endosc 2020;32:663-98.
13. Lee GJ, Park SJ, Kim SJ, Kim HH, Park MI, Moon W. Effectiveness of premedication with pronase for visualization of the mucosa during endoscopy: a randomized, controlled trial. Clin Endosc 2012;45:161-4.
14. Liu X, Guan CT, Xue LY, et al. Effect of premedication on lesion detection rate and visualization of the mucosa during upper gastrointestinal endoscopy: a multicenter large sample randomized controlled double-blind study. Surg Endosc 2018;32:3548-56.
15. Cha JM, Won KY, Chung IK, Kim GH, Lee SY, Cho YK. Effect of pronase premedication on narrow-band imaging endoscopy in patients with precancerous conditions of stomach. Dig Dis Sci 2014;59:2735-41.
16. Zhang Q, Chen Z, Chen C, et al. Training in early gastric cancer diagnosis improves the detection rate of early gastric cancer: an observational study in China. Medicine (Baltimore) 2015;94:e384.
17. Gao Z, Liang S, Li M, et al. Clinicopathological features and trends of 1025 cases of early gastric cancer, 2006-2020 [in Chinese]. J Cancer Control Treat 2021;34:649-54.
18. Chen L, Feng Y, Wang W, Zheng P. Clinical value of pronase combined with sodium bicarbonate in gastroscopy of elderly patients [in Chinese]. Zhejiang JITCWM 2018;28:225-7.
19. Lee SY, Han HS, Cha JM, Cho YK, Kim GH, Chung IK. Endoscopic flushing with pronase improves the quantity and quality of gastric biopsy: a prospective study. Endoscopy 2014;46:747-53.

Liver- and tumour-specific indicators predicting suboptimal survival following repeat transarterial chemoembolisation in patients with hepatocellular carcinoma

© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
LM Chen, PhD1,2,3,4; Simon CH Yu, MB, BS, FHKAM (Radiology)1,2; Leung Li, MB, ChB, FRCP5; Edwin P Hui, MB, ChB, FHKAM (Medicine)5; Winnie Yeo, MB, BS, FHKAM (Medicine)5,6; Stephen L Chan, MB, BS, FHKAM (Medicine)5,6
1 Department of Imaging and Interventional Radiology, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
2 Vascular and Interventional Radiology Foundation Clinical Science Centre, Prince of Wales Hospital, The Chinese University of Hong Kong, Hong Kong SAR, China
3 Department of Medical Ultrasonics, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
4 Biomedical Innovation Center, The Sixth Affiliated Hospital, Sun Yat-Sen University, Guangzhou, China
5 Department of Clinical Oncology, Prince of Wales Hospital, The Chinese University of Hong Kong, Hong Kong SAR, China
6 State Key Laboratory of Translational Oncology, China
 
Corresponding author: Prof Simon CH Yu (simonyu@cuhk.edu.hk)
 
Abstract
Introduction: This study explored liver- and tumour-specific indicators predictive of suboptimal survival outcomes following repeat transarterial chemoembolisation (TACE) in intermediate-stage hepatocellular carcinoma (HCC) patients after an initial TACE.
 
Methods: This study included 300 HCC patients who underwent TACE treatment. Patients were divided into PABD and non-PABD groups according to whether persistent albumin–bilirubin (ALBI) grade deterioration (PABD), defined as a shift in ALBI grade to a higher grade from baseline without recovery within 90 days, occurred after the initial TACE. Overall survival in the non-PABD and PABD groups, stratified into subgroups by baseline ALBI grade and tumour burden, was compared with that of patients receiving only sorafenib or supportive care during the same period.
 
Results: Repeat TACE provided a survival benefit over systemic therapy or supportive care for patients in all post-TACE non-PABD subgroups and most PABD subgroups, regardless of baseline liver condition (ie, modified albumin–bilirubin [mALBI] grade and tumour burden). This benefit was absent in two subgroups among patients who developed PABD after the initial TACE, namely, (1) those with a baseline liver condition of mALBI grade 1 or 2a and tumour burden exceeding the up-to-11 criteria, and (2) those with a baseline liver condition of mALBI grade 2b, regardless of tumour burden.
 
Conclusion: Repeat TACE is not recommended for patients with persistent liver function deterioration after the initial TACE, particularly those exhibiting suboptimal baseline liver function or excessive tumour burden. Understanding the liver condition and tumour burden in HCC patients may assist clinicians in planning optimal treatment strategies, leading to better prognosis.
 
 
New knowledge added by this study
  • The identification of objective and specific indicators predictive of suboptimal survival outcomes following repeat transarterial chemoembolisation (TACE) would be clinically valuable.
  • The survival benefit of repeat TACE was not significant in two subgroups of patients who developed persistent albumin–bilirubin (ALBI) grade deterioration after the initial TACE, namely, (1) those with a baseline liver condition of modified albumin–bilirubin (mALBI) grade 1 or 2a and tumour burden exceeding the up-to-11 criteria, and (2) those with a baseline liver condition of mALBI grade 2b, regardless of tumour burden.
Implications for clinical practice or policy
  • Liver function changes after initial TACE combined with tumour burden could serve as indicators to select patients suitable for repeat TACE.
  • Repeat TACE is not recommended for patients with persistent liver function deterioration and a baseline liver condition of mALBI grade 1 or 2a and tumour burden exceeding the up-to-11 criteria, or for those with a baseline liver condition of mALBI grade 2b, regardless of tumour burden.
 
 
Introduction
Hepatocellular carcinoma (HCC) imposes a substantial cancer burden worldwide; its incidence rate in 2020 was ranked seventh, whereas its mortality rate was ranked second.1 Transarterial chemoembolisation (TACE) is commonly used as a first-line treatment for patients with intermediate-stage HCC, preserved liver function, and good performance status.2 3
 
Liver function deterioration occurs in 15.1% to 52% of patients after TACE4 5 6 7 8 9; among these patients, 3% to 31% experience chronic or irreversible liver function deterioration.5 6 7 9 Patients with post-TACE liver function deterioration may have a suboptimal long-term prognosis.5 8 10 Repeat TACE is indicated when residual tumour remains or when a new tumour is detected after the initial TACE.2 Patients with tumours refractory to TACE are preferably treated with systemic therapy; switching to such therapy has demonstrated a survival benefit and better liver function preservation relative to continued TACE.11 12
 
Liver condition is crucial to the clinical outcome of repeat TACE. Patients with suboptimal liver function are more likely to experience irreversible liver function deterioration after repeat TACE, leading to suboptimal survival outcomes. Such patients also exhibit risks of reduced treatment efficacy and compromised safety during subsequent treatment with systemic therapy. In patients with HCC, liver condition is inevitably linked to tumour burden; liver function deterioration occurs more frequently in those with a high tumour burden.5 13
 
The identification of objective and specific indicators predictive of suboptimal survival outcomes following repeat TACE would be clinically valuable because such indicators could guide decisions regarding whether to pursue repeat TACE or switch to systemic therapy. We hypothesised that specific indicators based on liver condition and tumour burden, predictive of suboptimal survival outcomes following repeat TACE, could be identified. In this study, we sought to identify liver- and tumour-specific indicators predictive of suboptimal survival outcomes with repeat TACE relative to sorafenib or supportive care (SC) in patients who had received an initial TACE.
 
Methods
All patients presenting to our institution with unresectable HCC between January 2005 and December 2019 who met the eligibility criteria were recruited. Inclusion criteria consisted of treatment-naïve unresectable HCC confirmed by biopsy or contrast-enhanced imaging demonstrating typical enhancement features, Barcelona Clinic Liver Cancer stage B disease, and treatment with one of three options: TACE, sorafenib, or SC. Exclusion criteria were age <18 years, intrahepatic tumours with vascular invasion, extrahepatic metastases, liver function classified as albumin–bilirubin (ALBI) grade 3, or incomplete post-TACE liver function data. According to standard practice at our institution during the study period, patients with unresectable intermediate-stage HCC and no contraindication to TACE were prioritised for TACE. Patients who refused TACE were treated with sorafenib; those who declined both treatments received SC.
 
Liver condition indicator
Liver condition was assessed using the modified albumin–bilirubin (mALBI) grade.14 The grade was defined by the ALBI score, which was calculated using the following equation: ALBI score = (log10 bilirubin [in μmol/L] × 0.66) + (albumin [in g/L] × -0.085). Patients were categorised into four grades: 1 (ALBI score ≤-2.60), 2a (ALBI score >-2.60 and ≤-2.27), 2b (ALBI score >-2.27 and ≤-1.39), and 3 (ALBI score >-1.39). Post-TACE liver condition was classified into three categories based on post-TACE ALBI grade deterioration, defined as a shift to a higher grade from baseline following TACE, such as from grade 1 to grade 2a, 2b, or 3; from grade 2a to grade 2b or 3; or from grade 2b to grade 3. No ALBI grade deterioration (NABD) was defined as the lack of a shift to a higher ALBI grade after TACE. Temporary ALBI grade deterioration (TABD) was defined as ALBI grade deterioration that resolved within 90 days after TACE. Persistent ALBI grade deterioration (PABD) was defined as ALBI grade deterioration that did not resolve within 90 days after TACE. Patients in the NABD and TABD groups were combined into the non-PABD group.
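The score, grade, and deterioration categories described above can be sketched as follows. The function names are hypothetical. The rule for deciding whether a deterioration "resolved" (the last measurement within 90 days is at or below the baseline grade) is an assumption made for this sketch, because the text does not specify how resolution was determined from serial measurements.

```python
import math

# Ordering of mALBI grades from best to worst liver function
GRADE_ORDER = {"1": 0, "2a": 1, "2b": 2, "3": 3}

def albi_score(bilirubin_umol_l, albumin_g_l):
    """ALBI score from bilirubin (umol/L) and albumin (g/L)."""
    return math.log10(bilirubin_umol_l) * 0.66 + albumin_g_l * (-0.085)

def malbi_grade(score):
    """Modified ALBI grade using the cut-offs given in the text."""
    if score <= -2.60:
        return "1"
    if score <= -2.27:
        return "2a"
    if score <= -1.39:
        return "2b"
    return "3"

def deterioration_status(baseline_grade, followup):
    """Classify post-TACE liver condition as NABD, TABD, or PABD.

    followup: list of (days_after_tace, grade) pairs.
    Assumption: only measurements within 90 days are considered, and a
    deterioration counts as resolved if the last of them is at or below
    the baseline grade.
    """
    base = GRADE_ORDER[baseline_grade]
    window = sorted((d, g) for d, g in followup if 0 < d <= 90)
    worsened = any(GRADE_ORDER[g] > base for _, g in window)
    if not worsened:
        return "NABD"
    last_grade = window[-1][1]
    return "TABD" if GRADE_ORDER[last_grade] <= base else "PABD"

# Hypothetical patient: bilirubin 10 umol/L, albumin 40 g/L at baseline
baseline = malbi_grade(albi_score(10, 40))
status = deterioration_status(baseline, [(30, "2a"), (80, "1")])
```

In this hypothetical example, the baseline score is 0.66 - 3.40 = -2.74 (grade 1), and the temporary shift to grade 2a that recovers by day 80 is classified as TABD, which places the patient in the non-PABD group.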
 
Tumour burden indicators
Tumour burden was assessed using the up-to-7 and up-to-11 criteria, defined as the sum of the tumour number and the largest tumour diameter in centimetres, with thresholds set at 7 and 11, respectively. Tumour burden was subclassified into four categories: within or beyond the up-to-7 or up-to-11 criteria.
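A minimal sketch of this classification is shown below. The function names are hypothetical, and treating a sum exactly equal to the threshold as "within" the criteria is an assumption, since the text does not state how boundary values were handled.

```python
def tumour_burden_sum(n_tumours, largest_diameter_cm):
    """Sum of tumour number and largest tumour diameter (cm)."""
    return n_tumours + largest_diameter_cm

def burden_category(n_tumours, largest_diameter_cm):
    """Report whether a patient is within the up-to-7 and up-to-11 criteria.

    Assumption: a sum equal to the threshold counts as within the criteria.
    """
    total = tumour_burden_sum(n_tumours, largest_diameter_cm)
    return {
        "sum": total,
        "within_up_to_7": total <= 7,
        "within_up_to_11": total <= 11,
    }

# Hypothetical patient with 4 tumours, the largest measuring 5 cm
example = burden_category(4, 5.0)
```

The hypothetical patient has a sum of 9 and is therefore beyond the up-to-7 criteria but within the up-to-11 criteria.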
 
Study design
At our institution, it was standard practice for patients initially treated with TACE to receive repeat TACE if residual or recurrent intrahepatic tumours were present, until a contraindication to TACE occurred. Contraindications included an Eastern Cooperative Oncology Group performance status score >2 or a Child-Pugh score >7, regardless of liver condition changes following the initial TACE. Because patients with PABD after the initial TACE were assumed to have a higher risk of further liver damage and worse survival outcomes if subjected to repeat TACE, this study focused on such patients. The overall survival (OS) of patients with or without PABD after the initial TACE was compared with the OS of patients receiving only sorafenib or SC during the same period. Among patients with or without post-TACE PABD, we identified subgroups, defined by baseline mALBI grade and tumour burden, that showed no survival benefit over sorafenib or SC; these patients were considered unsuitable for repeat TACE. Overall survival was calculated from the date of TACE or sorafenib initiation to the date of death from any cause. For patients who received SC, OS was calculated from the date of HCC diagnosis to the date of death from any cause. Censoring was applied to patients who were lost to follow-up, underwent subsequent liver resection, or were last known to be alive.
 
Transarterial chemoembolisation
The TACE procedure was performed under local anaesthesia and guided by digital subtraction angiography. An emulsion consisting of aqueous cisplatin (Platosin; Pharmachemie BV, Haarlem, the Netherlands) and ethiodised oil in a 1:1 volume ratio was delivered transarterially into the tumour vasculature until flow stagnation occurred or a maximum dose of 40 mL emulsion was reached. Tumour-feeding arteries were subsequently embolised using 5 to 10 mL of gelatin sponge. The completeness of the procedure was verified using digital subtraction angiography, with or without non-contrast multiplanar computed tomography (CT).
 
Systemic therapy
Oral sorafenib was administered twice daily at a standard dose of 400 mg. Dose adjustments or drug discontinuation were performed at the discretion of the oncologist based on patient tolerance.
 
Statistical analysis
Categorical variables are presented as numbers (percentages) and continuous variables are presented as medians (interquartile ranges). The Chi squared test was used to compare categorical data. The Mann-Whitney U test or Kruskal–Wallis test was performed for comparisons of continuous data. Differences in OS between subgroups were analysed using the log-rank test and hazard ratios (HRs) with 95% confidence intervals (CIs). Interaction terms were included to evaluate whether the survival benefit of the post-TACE PABD or non-PABD group over the sorafenib or SC group varied across subgroups. P values <0.05 were considered statistically significant. Data analysis was performed using SPSS (Windows version 25.0; IBM Corp, Armonk [NY], United States).
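For readers unfamiliar with the log-rank test, the following is a minimal pure-Python sketch of the two-group version. It is an illustration only and does not reproduce the SPSS analysis used in the study; the function name and the example data are hypothetical.

```python
import math
from collections import Counter

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value).

    times_*: follow-up durations; events_*: 1 = death observed, 0 = censored.
    """
    deaths_a = Counter(t for t, e in zip(times_a, events_a) if e)
    deaths_b = Counter(t for t, e in zip(times_b, events_b) if e)
    observed_a = expected_a = variance = 0.0
    for t in sorted(set(deaths_a) | set(deaths_b)):
        # Patients still under observation just before time t
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        n = n_a + n_b
        d = deaths_a[t] + deaths_b[t]
        observed_a += deaths_a[t]
        # Deaths expected in group A if both groups shared the same hazard
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical survival times in months, with all deaths observed
chi2, p_value = logrank_test([1, 2, 3, 4, 5], [1] * 5,
                             [6, 7, 8, 9, 10], [1] * 5)
```

The test compares the observed number of deaths in one group with the number expected under a common hazard at each event time; censored patients contribute to the at-risk counts until their last follow-up.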
 
Results
Study participants
In total, 300 treatment-naïve patients with HCC received TACE. The median age was 65 years (interquartile range, 56-72); the cohort included 255 men and 45 women. After the first TACE, 235 of 300 patients experienced ALBI grade deterioration: 154 exhibited TABD and 81 displayed PABD. The demographics of patients with NABD, TABD, and PABD are listed in Table 1. The OS was similar for patients with NABD and TABD (22.40 vs 23.83 months), indicating that TABD did not adversely affect treatment outcomes. Therefore, patients with NABD and TABD were combined into the non-PABD group. The demographics of patients in the non-PABD and PABD groups were compared with those of patients in the sorafenib and SC groups, as listed in Table 2. Patients in the non-PABD and PABD groups had significantly better OS than those in the sorafenib and SC groups (23.13, 8.03, 5.11, and 2.57 months, respectively).
 

Table 1. Demographics of patients with different albumin–bilirubin deterioration statuses after the initial transarterial chemoembolisation
 

Table 2. Demographics of patients with different albumin–bilirubin deterioration statuses after the initial transarterial chemoembolisation relative to those receiving sorafenib or supportive care
 
Overall survival
Patients with post–transarterial chemoembolisation persistent albumin–bilirubin grade deterioration versus sorafenib in subgroups
Online supplementary Figure 1 illustrates the median OS of patients with post-TACE PABD relative to patients treated with sorafenib. Patients receiving TACE who developed post-TACE PABD had significantly longer median OS than those receiving sorafenib in subgroups within and beyond the up-to-7 criteria (19.63 vs 5.17 months; P=0.019 and 7.63 vs 5.11 months; P=0.030, respectively).
 
A significantly longer median OS was observed in patients receiving TACE who developed post-TACE PABD relative to those receiving sorafenib in the subgroup within the up-to-11 criteria (10.20 vs 5.37 months; P=0.016). However, this difference was not significant in the subgroup beyond the up-to-11 criteria (8.00 vs 4.94 months; P=0.083). Similarly, OS was significantly improved in the post-TACE PABD group relative to the sorafenib group within the mALBI grade 1 or 2a subgroup (11.50 vs 6.60 months; P=0.001). However, no significant difference was observed in the mALBI grade 2b subgroup (3.47 vs 4.39 months; P=0.517) [online supplementary Fig 1].
 
Based on stratification according to mALBI grade and the up-to-7 criteria, patients receiving TACE who developed post-TACE PABD had significantly longer median OS relative to those receiving sorafenib in the subgroup with mALBI grade 1 or 2a and within the up-to-7 criteria (29.57 vs 5.17 months; P=0.003) and the subgroup with mALBI grade 1 or 2a and beyond the up-to-7 criteria (10.57 vs 6.60 months; P=0.020). However, OS was not significantly improved in the subgroup with mALBI grade 2b and within the up-to-7 criteria (6.40 vs 4.39 months; P=0.071) or in the subgroup with mALBI grade 2b and beyond the up-to-7 criteria (3.07 vs 4.39 months; P=0.891). The interaction between treatment effect and subgroup, stratified according to mALBI grade and the up-to-7 criteria, did not reach significance at the 5% level, although there was a trend towards an interaction that warrants further study (P=0.058) [online supplementary Fig 1].
 
Based on stratification according to mALBI grade and the up-to-11 criteria, patients receiving TACE who developed post-TACE PABD had significantly longer median OS relative to those receiving sorafenib in the subgroup with mALBI grade 1 or 2a and within the up-to-11 criteria (13.37 vs 5.76 months; P=0.004). However, OS was not significantly improved in the subgroup with mALBI grade 1 or 2a and beyond the up-to-11 criteria (11.50 vs 6.60 months; P=0.061), the subgroup with mALBI grade 2b and within the up-to-11 criteria (5.07 vs 4.52 months; P=0.313), or the subgroup with mALBI grade 2b and beyond the up-to-11 criteria (3.07 vs 4.10 months; P=0.316). The interaction between treatment effect and subgroup, stratified according to mALBI grade and the up-to-11 criteria, did not reach significance at the 5% level, although there was a trend towards an interaction that warrants further study (P=0.071) [online supplementary Fig 1].
 
Patients with post–transarterial chemoembolisation persistent albumin–bilirubin grade deterioration versus supportive care in subgroups
The median OS of patients who developed post-TACE PABD relative to those receiving SC is shown in online supplementary Figure 2. Patients receiving TACE who developed post-TACE PABD had significantly longer median OS compared with those receiving SC in the subgroup with mALBI grade 1 or 2a and within the up-to-7 criteria (29.57 vs 15.38 months; P=0.036) and the subgroup with mALBI grade 1 or 2a and beyond the up-to-7 criteria (10.57 vs 3.32 months; P<0.001). However, no significant improvement in OS was observed in the subgroup with mALBI grade 2b and within the up-to-7 criteria (6.40 vs 5.40 months; P=0.266) or in the subgroup with mALBI grade 2b and beyond the up-to-7 criteria (3.07 vs 2.18 months; P=0.051).
 
Patients receiving TACE who developed post-TACE PABD also had significantly longer median OS relative to those receiving SC in the subgroup with mALBI grade 1 or 2a and within the up-to-11 criteria (13.37 vs 4.29 months; P=0.035) and the subgroup with mALBI grade 1 or 2a and beyond the up-to-11 criteria (11.50 vs 3.32 months; P=0.001). However, no significant improvement in OS was observed in the subgroup with mALBI grade 2b and within the up-to-11 criteria (5.07 vs 2.57 months; P=0.084) or in the subgroup with mALBI grade 2b and beyond the up-to-11 criteria (3.07 vs 2.08 months; P=0.269) [online supplementary Fig 2].
 
Patients with post–transarterial chemoembolisation non-persistent albumin–bilirubin grade deterioration versus sorafenib or supportive care in subgroups
In all subgroups stratified according to the various criteria, patients in the non-PABD group after TACE had significantly longer median OS than those receiving sorafenib (all P<0.001) [online supplementary Fig 3] or SC (all P<0.001, except for the subgroup with mALBI grade 1 or 2a and within the up-to-7 criteria, for which P=0.012) [online supplementary Fig 4].
 
Discussion
Principal findings
This study demonstrated that repeat TACE provided a survival benefit over systemic therapy or SC for patients who developed TABD or PABD after the first TACE, regardless of baseline liver condition (according to ALBI grade, tumour burden, or liver function). However, this benefit was absent in the following two subgroups among patients who developed PABD after the first TACE: (1) those with a baseline liver condition of mALBI grade 1 or 2a and tumour burden exceeding the up-to-11 criteria, and (2) those with a baseline liver condition of mALBI grade 2b, regardless of tumour burden. These two subgroups could serve as specific indicators to guide the decision against prescribing repeat TACE for individual patients, based on their baseline liver condition, tumour burden, and occurrence of PABD after the initial TACE. In such cases, the treatment outcomes of repeat TACE are unlikely to differ from those of sorafenib or SC. Notably, the interaction between treatment effect and subgroup did not reach statistical significance at the 5% level, although there was a trend towards significance that warrants further study.
 
Current knowledge from previous studies
Liver function deterioration after TACE is associated with worsened long-term survival.5 8 10 Patients with no increase in Child-Pugh score 1 month after TACE had significantly better survival rates than those with an increased Child-Pugh score at the same time point (84.5% vs 44.4%, 43.75% vs 18.5%, and 8.3% vs 0% for 1-year, 2-year, and 3-year survivals, respectively).8 The extent of liver function deterioration after TACE also impacts survival outcomes. The median OS was significantly longer in patients with ALBI grade migration to grade 2 than in patients with migration to grade 3 during both the acute phase (30.9 months vs 8.9 months; P<0.001) and the chronic phase (30.9 months vs 5.7 months; P<0.001).5 Higher tumour burden is linked to liver function deterioration and worse survival outcomes after TACE.15 16 17 Based on the 7-11 criteria, patients with high tumour burden experienced significantly higher rates of liver function deterioration (24.4% vs 14.9% or 14.4%) and shorter median survival (11.9 vs 22.3 or 33.1 months) relative to those with low or intermediate tumour burden.17 Currently, there are no reports in the literature concerning studies that identified liver- and tumour-specific indicators to predict survival benefits of repeat TACE.
 
Implications for clinical practice
Repeat TACE can damage liver function and worsen long-term survival. If a patient’s liver function is irreversibly and severely impaired by repeat TACE, the opportunity to switch to systemic therapy may be missed. To maximise survival benefits, the decision to repeat TACE, discontinue TACE, or transition to systemic therapy should be carefully considered and individualised. Two scoring systems have been developed to guide retreatment strategies,18 19 but universal validation of their predictive value is needed. Studies have shown that these systems are ineffective in terms of supporting decision-making for sequential treatment.20 21
 
Most patients who develop TABD are able to spontaneously recover their baseline liver function. In this study, similar median OS was observed among patients with TABD and NABD (23.83 vs 22.40 months). Transarterial chemoembolisation provided a statistically significant survival benefit for patients within the non-PABD group, regardless of tumour burden, relative to those receiving sorafenib or SC. This finding suggests that TABD has minimal impact on survival benefit or long-term prognosis after TACE, and repeat TACE remains feasible in these patients with reversed or reversible liver function. Based on the present findings, repeat TACE is not recommended for patients with PABD and a baseline liver condition of mALBI grade 2b, regardless of tumour burden, because survival outcomes in this subgroup are unlikely to be superior to those achieved with sorafenib or SC. For the same reason, repeat TACE is not recommended for patients with PABD, a baseline liver condition of mALBI grade 1 or 2a, and tumour burden beyond the up-to-11 criteria. Systemic therapy is preferred for this subgroup, considering that its effectiveness is likely maximised in patients with better liver function (eg, those with ALBI grade 1 or mALBI grade 2a, as stated in an expert consensus).22
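 
The recommendations in the preceding paragraph can be written out as a simple rule, shown here only as an illustrative sketch and not as a validated clinical tool. The class and function names are hypothetical. The definition of the up-to-11 criteria used below (largest tumour diameter in centimetres plus number of tumours not exceeding 11) is an assumption based on the usual formulation of such criteria and is not restated in this section.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    malbi_grade: str          # baseline mALBI grade: "1", "2a" or "2b"
    largest_tumour_cm: float  # diameter of the largest tumour, in cm
    tumour_count: int         # number of tumours
    pabd: bool                # persistent ALBI grade deterioration after the first TACE


def within_up_to_11(p: Patient) -> bool:
    # Assumed definition: largest diameter (cm) + tumour count <= 11.
    return p.largest_tumour_cm + p.tumour_count <= 11


def repeat_tace_discouraged(p: Patient) -> bool:
    """Return True for the two subgroups in which the text advises against repeat TACE."""
    if p.malbi_grade not in {"1", "2a", "2b"}:
        raise ValueError("mALBI grade outside the range discussed in the text")
    if not p.pabd:
        # Non-PABD (including TABD): repeat TACE remained beneficial in all subgroups.
        return False
    if p.malbi_grade == "2b":
        # PABD with mALBI grade 2b, regardless of tumour burden.
        return True
    # PABD with mALBI grade 1 or 2a: discouraged only beyond the up-to-11 criteria.
    return not within_up_to_11(p)
```

For example, a hypothetical patient with PABD, baseline mALBI grade 2a, a 5-cm largest tumour and three tumours falls within the up-to-11 criteria, so the rule returns False and repeat TACE is not discouraged.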
 
Limitations
We acknowledge that sorafenib is no longer first-line systemic therapy for HCC. Regimens such as lenvatinib23 or atezolizumab-bevacizumab24 have been associated with significantly better OS relative to sorafenib. We recognise that the use of sorafenib as a control was a limitation of this study. However, no alternative was available because a sufficiently large database with long-term clinical outcomes for newer systemic therapies was not accessible for the local population. The primary objective of this study was not to evaluate the role of sorafenib compared with TACE, but to use sorafenib as a control to identify specific liver and tumour indicators predictive of suboptimal survival outcomes after repeat chemoembolisation. These indicators are intended to serve as contraindications for repeat TACE in patients with the corresponding liver and tumour conditions. The use of a systemic drug with lower OS benefit, such as sorafenib, as a control might lead to overestimation of the value of repeat TACE and, consequently, to the identification of indicators under worse liver and tumour conditions. However, this observation does not compromise the validity of these indicators as criteria for contraindicating repeat TACE.
 
Other limitations of the study include the relatively small sample size in patient groups receiving sorafenib or SC. Patient numbers were further reduced in some subgroups after stratification according to liver function and tumour burden, which could introduce bias in survival comparisons. Serum alpha-fetoprotein (AFP) levels and tumour response after TACE were not analysed in this study. Considering that elevated AFP levels have been associated with ALBI deterioration, AFP may be partially represented in the baseline ALBI grade. The median time to Child-Pugh deterioration was significantly longer in patients who responded to the initial TACE than in those who were refractory to the initial TACE (55.9 vs 19.6 months).25 Most patients (22/27, 81.5%) ineligible for repeat TACE due to hepatic decompensation exhibited tumour progression at the time of TACE discontinuation.26 Target lesion progression has been associated with no survival improvement and an increased risk of liver dysfunction after repeat TACE.27 Based on findings in the above studies, poor tumour response may eventually lead to liver function deterioration. Although tumour response was not analysed in this study, it is reasonable to assume that tumour response varies according to treatment effectiveness. Given that treatment effectiveness is assumed to remain consistent under the same treatment protocol within a single centre, it may be argued that the overall effect of tumour response in individual patients was reflected in liver function deterioration.
 
Conclusion
The findings of this study suggest that repeat TACE should not be recommended for patients with persistent liver function deterioration after the initial TACE, particularly those exhibiting suboptimal baseline liver function or excessive tumour burden. Understanding the liver condition and tumour burden in HCC patients may assist clinicians in planning optimal treatment strategies and improving patient prognosis.
 
Author contributions
Concept or design: SCH Yu.
Acquisition of data: LM Chen, L Li, EP Hui, W Yeo, SL Chan.
Analysis or interpretation of data: LM Chen, SCH Yu.
Drafting of the manuscript: LM Chen, SCH Yu.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Funding/support
This research was funded by the Vascular and Interventional Radiology Foundation. The funding body was not involved in the study design, data collection, analysis, interpretation, or manuscript preparation.
 
Ethics approval
This research was approved by the Joint Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee, Hong Kong (Ref No.: 2020.672). The research was conducted in accordance with the Declaration of Helsinki and the International Conference on Harmonisation, Good Clinical Practice. The requirement for written informed patient consent was waived by the Committee due to the retrospective nature of the research.
 
Supplementary material
The supplementary material was provided by the authors, and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021;71:209-49.
2. Cheung TT, Kwok PC, Chan S, et al. Hong Kong consensus statements for the management of unresectable hepatocellular carcinoma. Liver Cancer 2018;7:40-54.
3. European Association for the Study of the Liver. EASL Clinical Practice Guidelines: management of hepatocellular carcinoma. J Hepatol 2018;69:182-236.
4. Min YW, Kim J, Kim S, et al. Risk factors and a predictive model for acute hepatic failure after transcatheter arterial chemoembolization in patients with hepatocellular carcinoma. Liver Int 2013;33:197-202.
5. Chi CT, Lee IC, Lee RC, et al. Effect of transarterial chemoembolization on ALBI grade in intermediate-stage hepatocellular carcinoma: criteria for unsuitable cases selection. Cancers (Basel) 2021;13:4325.
6. Hiraoka A, Kumada T, Kudo M, et al. Hepatic function during repeated TACE procedures and prognosis after introducing sorafenib in patients with unresectable hepatocellular carcinoma: multicenter analysis. Dig Dis 2017;35:602-10.
7. Miksad RA, Ogasawara S, Xia F, Fellous M, Piscaglia F. Liver function changes after transarterial chemoembolization in US hepatocellular carcinoma patients: the LiverT study. BMC Cancer 2019;19:795.
8. Kohla MA, Abu Zeid MI, Al-Warraky M, Taha H, Gish RG. Predictors of hepatic decompensation after TACE for hepatocellular carcinoma. BMJ Open Gastroenterol 2015;2:e000032.
9. Park KH, Kim JH, Choe WH, et al. Risk factors for liver function deterioration after transarterial chemoembolization refractoriness in Child-Pugh class A hepatocellular carcinoma patients. Korean J Gastroenterol 2020;75:147-56.
10. Sun Z, Li G, Ai X, et al. Hepatic and biliary damage after transarterial chemoembolization for malignant hepatic tumors: incidence, diagnosis, treatment, outcome and mechanism. Crit Rev Oncol Hematol 2011;79:164-74.
11. Ogasawara S, Ooka Y, Koroki K, et al. Switching to systemic therapy after locoregional treatment failure: definition and best timing. Clin Mol Hepatol 2020;26:155-62.
12. Piscaglia F, Ogasawara S. Patient selection for transarterial chemoembolization in hepatocellular carcinoma: importance of benefit/risk assessment. Liver Cancer 2018;7:104-19.
13. Yasui Y, Tsuchiya K, Kurosaki M, et al. Up-to-seven criteria as a useful predictor for tumor downstaging to within Milan criteria and Child-Pugh grade deterioration after initial conventional transarterial chemoembolization. Hepatol Res 2018;48:442-50.
14. Hiraoka A, Michitaka K, Kumada T, et al. Validation and potential of albumin–bilirubin grade and prognostication in a nationwide survey of 46,681 hepatocellular carcinoma patients in Japan: the need for a more detailed evaluation of hepatic function. Liver Cancer 2017;6:325-36.
15. Khisti R, Patidar Y, Garg L, Mukund A, Thomas SS, Sarin SK. Correlation of baseline portal pressure (hepatic venous pressure gradient) and indocyanine green clearance test with post–transarterial chemoembolization acute hepatic failure. J Clin Exp Hepatol 2019;9:447-52.
16. Siriwardana RC, Niriella MA, Dassanayake AS, et al. Factors affecting post-embolization fever and liver failure after trans-arterial chemo-embolization in a cohort without background infective hepatitis–a prospective analysis. BMC Gastroenterol 2015;15:96.
17. Hung YW, Lee IC, Chi CT, et al. Redefining tumor burden in patients with intermediate-stage hepatocellular carcinoma: the seven-eleven criteria. Liver Cancer 2021;10:629-40.
18. Sieghart W, Hucke F, Pinter M, et al. The ART of decision making: retreatment with transarterial chemoembolization in patients with hepatocellular carcinoma. Hepatology 2013;57:2261-73.
19. Adhoute X, Penaranda G, Naude S, et al. Retreatment with TACE: the ABCR SCORE, an aid to the decision-making process. J Hepatol 2015;62:855-62.
20. Arizumi T, Ueshima K, Iwanishi M, et al. Evaluation of ART scores for repeated transarterial chemoembolization in Japanese patients with hepatocellular carcinoma. Oncology 2015;89 Suppl 2:4-10.
21. Kloeckner R, Pitton MB, Dueber C, et al. Validation of clinical scoring systems ART and ABCR after transarterial chemoembolization of hepatocellular carcinoma. J Vasc Interv Radiol 2017;28:94-102.
22. Kudo M, Han KH, Ye SL, et al. A changing paradigm for the treatment of intermediate-stage hepatocellular carcinoma: Asia-Pacific primary liver cancer expert consensus statements. Liver Cancer 2020;9:245-60.
23. Kudo M, Finn RS, Qin S, et al. Lenvatinib versus sorafenib in first-line treatment of patients with unresectable hepatocellular carcinoma: a randomised phase 3 non-inferiority trial. Lancet 2018;391:1163-73.
24. Cheng AL, Qin S, Ikeda M, et al. Updated efficacy and safety data from IMbrave150: atezolizumab plus bevacizumab vs. sorafenib for unresectable hepatocellular carcinoma. J Hepatol 2022;76:862-73.
25. Maesaka K, Sakamori R, Yamada R, et al. Initial treatment response to transarterial chemoembolization as a predictive factor for Child-Pugh class deterioration prior to refractoriness in hepatocellular carcinoma. Hepatol Res 2020;50:1275-83.
26. Labeur TA, Takkenberg RB, Klümpen HJ, van Delden OM. Reason of discontinuation after transarterial chemoembolization influences survival in patients with hepatocellular carcinoma. Cardiovasc Intervent Radiol 2019;42:230-8.
27. Zhang YF, Guo RP, Ouyang HY, et al. Target lesion response predicts survival of patients with hepatocellular carcinoma retreated with transarterial chemoembolization. Liver Int 2016;36:1516-24.

Success rate of induction of labour in twin pregnancies relative to singleton pregnancies in a predominantly Chinese population

Hong Kong Med J 2025 Feb;31(1):24–31 | Epub 12 Feb 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE  CME
CK Wong, MB, ChB, FHKAM (Obstetrics and Gynaecology); Catherine MW Hung, MB, ChB, FHKAM (Obstetrics and Gynaecology); Vivian KS Ng, MB, ChB, FHKAM (Obstetrics and Gynaecology); WK Yung, MB, BS, FHKAM (Obstetrics and Gynaecology); WC Leung, MD, FHKAM (Obstetrics and Gynaecology); WL Lau, MB, BS, FHKAM (Obstetrics and Gynaecology)
Department of Obstetrics and Gynaecology, Kwong Wah Hospital, Hong Kong SAR, China
 
Corresponding author: Dr CK Wong (wck936@ha.org.hk)
 
Abstract
Introduction: This study assessed the efficacy of induction of labour in twin pregnancies relative to singleton pregnancies within a predominantly Chinese patient population.
 
Methods: This retrospective case-matched cohort study included patients with twin pregnancies who underwent induction of labour at our institution in Hong Kong between 2012 and 2020. Patients with twin pregnancies were matched one-to-one with singleton pregnancies based on parity, maternal age, and the indication for induction of labour. The primary outcome was the mode of delivery. Secondary outcomes included the time from oxytocin infusion to delivery, indications for caesarean or instrumental delivery, and maternal and neonatal outcomes.
 
Results: In total, 160 women with twin pregnancies met the inclusion criteria and were matched with 160 singleton pregnancies. Caesarean section was performed in 42 patients (26.3%) with twin pregnancies and 27 patients (16.9%) with singleton pregnancies undergoing induction of labour. Patients with twin pregnancies had a significantly higher risk of caesarean section relative to those with singleton pregnancies (odds ratio=2.14, 95% confidence interval=1.14-4.04; P=0.024). Internal podalic version was required in 13.6% of cases for the vaginal delivery of the second twin. There was no significant difference between the groups in the time from oxytocin administration to vaginal delivery (P=0.143).
 
Conclusion: Despite a higher induction failure rate, about three quarters of twin pregnancy patients achieved successful vaginal deliveries. Our findings inform decision making for patients and obstetricians, emphasising the importance of training for internal podalic version to aid second twin delivery and reduce caesarean rates in twin pregnancies.
 
 
New knowledge added by this study
  • Approximately three-quarters of patients with twin pregnancies who underwent induction of labour achieved successful vaginal deliveries.
  • The failure rate of induction of labour was higher in twin pregnancies than in singleton pregnancies.
  • The probability of requiring a caesarean section for the second twin when the first twin is delivered vaginally is only 0.8% if experts in twin vaginal delivery are available. Internal podalic version was necessary in 13.6% of cases for the vaginal delivery of the second twin.
Implications for clinical practice or policy
  • Patients with twin pregnancies undergoing induction of labour should be counselled regarding the increased risk of unsuccessful labour induction relative to singleton pregnancies.
  • Proficient obstetricians skilled in internal podalic version should be readily available during the delivery of the second twin to improve the success rate of vaginal delivery for the second twin.
  • Greater emphasis should be placed on implementing training opportunities, simulation models, and practices for junior obstetricians to enhance proficiency in internal podalic version. These measures can facilitate second twin delivery and reduce the need for caesarean sections in second twin births.
 
 
Introduction
The global twin birth rate has increased by one-third since the 1980s, rising from 9.1 to 12 per 1000 deliveries, resulting in approximately 1.6 million twin pairs born annually.1 One major factor contributing to this trend is the growing use of assisted reproductive techniques in recent decades.1 Relative to singleton pregnancies, twin pregnancies are associated with higher incidences of maternal and fetal complications, which may require earlier delivery.2 3 Even in uncomplicated cases, the National Institute for Health and Care Excellence (NICE) guidelines recommend delivery at 37 weeks for dichorionic twin pregnancies and 36 weeks for monochorionic twin pregnancies.4 Consequently, earlier delivery is frequently required in twin pregnancies. The Twin Birth Study,5 a large multicentre randomised controlled trial published in 2013, demonstrated the safety of both vaginal and caesarean birth in twin pregnancies where the first twin presented in cephalic position at 32 weeks of gestation or later. These findings have supported an increase in vaginal deliveries for twin pregnancies through induction of labour.
 
The success rate, benefits, and complications associated with induction of labour in singleton pregnancies have been extensively studied.6 7 However, only a limited number of studies have compared the success rate of induction of labour in twin pregnancies relative to singleton pregnancies.8 9 10 11 According to Loscul et al8 and Okby et al,9 induction of labour in twin pregnancies increases the likelihood of caesarean section. In contrast, Fausett et al10 and Taylor et al11 reported that the risk of caesarean delivery in twin pregnancies is comparable to the risk in singleton pregnancies undergoing induction of labour. These conflicting findings may arise from variations in induction methods, ethnicity-related factors, selection biases, and differences in study designs.
 
To provide appropriate counselling to patients, obstetricians must understand the likelihood of vaginal delivery after induction of labour. Reliance on data from singleton deliveries to estimate this likelihood for twin pregnancies may be inappropriate due to inherent differences between twin and singleton pregnancies.12 Considering the current lack of robust evidence regarding induction of labour in twin pregnancies, this study aimed to evaluate the rate of caesarean section (including classical or lower segment caesarean sections) and associated outcomes in twin pregnancies undergoing induction of labour, compared with singleton pregnancies.
 
Methods
Study design
Our institution, a regional public hospital in Hong Kong, provides obstetric services for 3000 to 5000 deliveries annually. All cases of twin pregnancies are recorded in a specialised twin pregnancy clinic registry and managed by a dedicated team of obstetricians and midwives in the Twin Pregnancy Clinic. The medical professionals overseeing this clinic have specialised expertise in maternal fetal medicine.13 In accordance with departmental protocol, patients attend regular follow-up appointments and undergo ultrasound examinations. When a patient approaches term or requires earlier delivery, the attending obstetrician discusses the mode of delivery with the patient. For uncomplicated dichorionic-diamniotic and monochorionic-diamniotic twin pregnancies, vaginal delivery is encouraged if the first twin presents in cephalic position.
 
The same induction of labour protocol is applied to both twin and singleton pregnancies. Patients are admitted to the hospital and a cervical examination is conducted to assess the modified Bishop score. If the cervix is unfavourable with a modified Bishop score (Calder score) <6, cervical priming is performed using either dinoprostone tablets or a cervical ripening balloon (Cook Medical, Bloomington [IN], United States). If the cervix is favourable with a modified Bishop score ≥6, the patient is transferred to the labour ward for artificial rupture of membranes and administration of synthetic oxytocin. The induction of labour protocol used in this study aligns with NICE recommendations, except that the modified Bishop score was used instead of the Bishop score to assess cervical readiness for induction.14 15
 
Medical records of patients with twin pregnancies who underwent induction of labour and delivered at our institution between January 2012 and December 2020 were retrospectively identified using the International Classification of Diseases codes through the Clinical Data Analysis and Reporting System of Hospital Authority. The identified medical records were individually reviewed. Study participants were required to meet all of the following inclusion criteria: gestational age ≥24 weeks, intact membranes, and planned induction of labour. Exclusion criteria included premature rupture of membranes, labour in the latent or active phase, threatened preterm labour resulting in spontaneous labour, and intrauterine fetal death.
 
Each twin pregnancy patient who underwent induction of labour was matched with a singleton pregnancy patient at a 1:1 ratio in the same hospital during the same study period. Matching was based on specific criteria, including parity (nulliparous or multiparous),16 maternal age (advanced maternal age of ≥35 years, or not),16 17 and the indication for induction of labour.18 These criteria were selected to minimise confounding factors that could affect the success rate of induction of labour. To further reduce the impact of variations in medical practice during the study period, the singleton pregnancy patient with the delivery date closest to that of the twin pregnancy patient was selected.
 
For both twin and singleton pregnancies, demographic data, past obstetric history, parity, modified Bishop score, method of cervical priming, indication for induction of labour, and use of epidural analgesia were recorded. The primary outcome was the mode of delivery. Secondary outcomes included the time from oxytocin infusion to vaginal delivery or caesarean section, indications for caesarean section or instrumental delivery, and maternal and neonatal outcomes. Postpartum haemorrhage was defined as blood loss of ≥500 mL, regardless of the mode of delivery. Patients who underwent caesarean section for the second twin after vaginal delivery of the first twin were considered to have undergone caesarean section.
 
Statistical analyses
To calculate the required sample size, it was assumed that the incidences of caesarean section were 25% in the control group and 40% in the study group. The proportion of discordant pairs was assumed to be 0.45. A two-sided significance level of 0.05 was selected, and the study aimed to achieve a 1:1 comparison between the groups. Calculations in G*Power software (version 3.1.9.6; Erdfelder, Faul, & Buchner, Germany) indicated that a total sample size of 160 pairs would provide 80% power for the analysis.
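 
As a rough cross-check of this calculation, the number of pairs can be approximated with a standard normal-approximation formula for McNemar's test. This is a sketch under stated assumptions and does not reproduce the G*Power procedure: the function name is hypothetical, and the formula used is the common large-sample approximation n = (z_α/2·√p_d + z_β·√(p_d − δ²))² / δ², where p_d is the proportion of discordant pairs and δ is the difference between the assumed caesarean section rates.

```python
from math import ceil, sqrt
from statistics import NormalDist


def mcnemar_pairs(p_control: float, p_study: float, p_discordant: float,
                  alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate number of matched pairs for a two-sided McNemar's test."""
    delta = p_study - p_control                   # difference in marginal proportions
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    n = (z_alpha * sqrt(p_discordant)
         + z_beta * sqrt(p_discordant - delta ** 2)) ** 2 / delta ** 2
    return ceil(n)


# Assumptions stated in the text: 25% vs 40% caesarean section, 45% discordant pairs.
pairs = mcnemar_pairs(0.25, 0.40, 0.45)
print(pairs)
```

With the stated assumptions, this approximation gives about 155 pairs, which is close to the 160 pairs reported. A small gap is expected because the approximation is not the method implemented in G*Power.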
 
Categorical variables are reported as numerator and denominator values (%), whereas continuous variables are presented as mean±standard deviation. McNemar’s test was used to analyse the primary outcome for twin pregnancies and their matched controls. Data for the matched pairs are presented in a 2×2 table, showing concordant and discordant study pairs. Odds ratios (ORs) were calculated as the ratio of the two types of discordant pairs, and the test statistic was derived from McNemar’s test. For the remaining outcomes, paired t tests, Wilcoxon signed-rank tests, and analyses of variance were used to analyse continuous variables, whereas McNemar’s tests or Fisher’s exact tests were utilised for categorical variables when comparing the case and control groups. For non-matched data, the Chi-squared test was applied for categorical data, and unpaired t tests were used for normally distributed continuous data.
 
Statistical analyses of data using McNemar’s test were performed with Epi Info (version 7.2.5.0; Centers for Disease Control and Prevention, Atlanta [GA], US). All other analyses were conducted with SPSS (Windows version 27.0; IBM Corp, Armonk [NY], US). Two-sided P values <0.05 were considered statistically significant.
 
Results
During the study period, 760 women with twin pregnancies were recorded out of 42 280 maternity cases. Of these, 160 women met the inclusion criteria for this study. The incidence of twin pregnancies was 1.8% and the rate of induction of labour in twin pregnancies was 21.1%.
 
The study group consisted of women with twin pregnancies who underwent induction of labour, whereas the control group comprised women with singleton pregnancies who delivered at the same hospital during the same period. The two groups were matched in terms of age, parity and indication for induction of labour, and were well balanced with respect to these matching factors. Patients with twin pregnancies had a significantly lower body mass index (21.1 kg/m² vs 22.2 kg/m²; P=0.008) and a significantly higher modified Bishop score (6.2 vs 5.4; P<0.001) relative to the control group. The mean gestational age at delivery was significantly earlier in twin pregnancies than in the control group (37.1 weeks vs 40.2 weeks; P<0.001). Other baseline characteristics, including ethnicity, prior caesarean section, and use of epidural anaesthesia, did not significantly differ between the two groups (Table 1).
 

Table 1. Baseline maternal characteristics
 
Success rate of induction of labour
Out of 160 pairs, 44 were discordant (ie, one member of the pair had a caesarean section and the other had a vaginal delivery), and 116 were concordant (ie, both members of the pair had either a caesarean section or a vaginal delivery) [Fig]. Patients with twin pregnancies who underwent induction of labour had a significantly higher risk of caesarean section relative to those with singleton pregnancies (OR=2.14, 95% confidence interval [CI]=1.14-4.04; P=0.024).
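 
The matched-pair calculation behind these figures can be sketched as follows. The split of the 44 discordant pairs into 30 (twin caesarean, singleton vaginal) and 14 (twin vaginal, singleton caesarean) is not stated in the text; it is inferred here because it reproduces the reported OR, and it should be read as an assumption. The Wald interval on the log scale and the continuity-corrected McNemar statistic are likewise assumptions about how the reported values were obtained, and the function name is hypothetical.

```python
from math import erfc, exp, log, sqrt


def matched_pair_summary(b: int, c: int, z: float = 1.959964):
    """Conditional OR, Wald 95% CI and continuity-corrected McNemar P value.

    b: discordant pairs with the outcome in the case only (assumed split)
    c: discordant pairs with the outcome in the control only (assumed split)
    """
    odds_ratio = b / c                         # ratio of the two discordant counts
    se_log_or = sqrt(1 / b + 1 / c)            # standard error of log(OR)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)     # McNemar statistic with continuity correction
    p_value = erfc(sqrt(chi2 / 2))             # upper tail of chi-squared with 1 df
    return odds_ratio, (lower, upper), p_value


odds_ratio, (lower, upper), p_value = matched_pair_summary(30, 14)
print(f"OR={odds_ratio:.2f}, 95% CI={lower:.2f}-{upper:.2f}, P={p_value:.3f}")
```

Under these assumptions the sketch returns values that round to the reported OR of 2.14, 95% CI of 1.14-4.04, and P value of 0.024.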
 

Figure. Mode of delivery results in matched twin and singleton pregnancies
 
Among the patients with twin pregnancies, there were 118 vaginal deliveries, 41 caesarean sections, and one case in which the first twin was delivered vaginally and the second twin was delivered by caesarean section. Among the patients with singleton pregnancies, there were 133 vaginal deliveries and 27 caesarean sections. Instrumental deliveries were performed in 32 patients with twin pregnancies and eight patients with singleton pregnancies (Table 2).
 

Table 2. Maternal outcomes
 
There was no significant difference between the groups in the time from oxytocin administration to vaginal delivery (P=0.143) or the time from oxytocin administration to caesarean section (P=0.054). In total, eight patients with twin pregnancies and three patients with singleton pregnancies delivered after dinoprostone insertion without requiring artificial rupture of membranes or oxytocin infusion. Three patients with twin pregnancies and one patient with a singleton pregnancy had an unfavourable cervix after repeated doses of dinoprostone; thus, caesarean section was performed (Table 2).
 
Obstetric outcomes
Twin pregnancies were associated with significantly greater blood loss relative to singleton pregnancies (median: 400 mL vs 250 mL; P<0.001) and a higher incidence of postpartum haemorrhage (35.0% vs 10.6%; P<0.001). However, there was no significant difference between the groups in blood transfusion rates (8.1% vs 3.8%; P=0.115). The aetiology of postpartum haemorrhage and need for second-line treatments were also comparable between the two groups (Table 2).
 
Mode of delivery in twin pregnancies
Among the 118 vaginal deliveries, 45 had vaginal cephalic deliveries of both twins, whereas 42 had a vaginal cephalic delivery of the first twin followed by a vaginal breech delivery of the second twin. Additionally, 32 patients required instrumental delivery with vacuum or forceps for at least one twin (Table 3).
 

Table 3. Mode of delivery in twin pregnancies (n=160)
 
In cases where vaginal breech delivery was required for the second twin, most babies were in breech presentation. Internal podalic version was performed in 16 patients (13.6%) to facilitate delivery of the second twin (Tables 3 and 4). Notably, even in five cases where the second twin was in cephalic presentation, internal podalic version was performed by manual upward displacement of the fetal head to expedite delivery due to fetal bradycardia or cord presentation.
 

Table 4. Vaginal delivery of the second twin (n=118)
 
Neonatal outcomes
When neonatal outcomes were compared between the two groups, no significant differences were observed in Apgar scores at 1 and 5 minutes or in rates of admission to neonatal intensive care units. However, both the first and second twins were significantly lighter in weight relative to neonates in singleton pregnancies (Table 5).
 

Table 5. Neonatal outcomes
 
Discussion
Primary outcomes
This case-control study, utilising matched controls, demonstrated that the rate of failed induction of labour was significantly higher in twin pregnancies than in singleton pregnancies. Nevertheless, 73.8% of patients with twin pregnancies achieved successful vaginal deliveries. This study represents the largest cohort investigation of its kind in a predominantly Chinese population and differs from previous studies conducted in Western countries.8 9 10 11 A previous study19 revealed that ethnic variation can influence the success of induction of labour; thus, our findings provide valuable insights for counselling and managing Chinese patients with twin pregnancies.
 
Comparison with previous studies
The literature on induction of labour in twin pregnancies compared with singleton pregnancies remains limited. The present findings are consistent with those reported by Loscul et al8 and Okby et al,9 both of which identified an increased risk of caesarean delivery after induction of labour in twin pregnancies. Loscul et al8 reported an adjusted OR of 1.8 (95% CI=1.4-2.2), whereas Okby et al9 reported an adjusted OR of 2.2 (95% CI=1.7-2.7). Similarly, the current study demonstrated an OR of 2.14, reinforcing the notion that induction of labour in twin pregnancies is associated with a higher rate of caesarean section relative to singleton pregnancies. However, limitations existed in these studies. For instance, the large cohort study by Loscul et al,8 which included 1995 twin deliveries and 2771 singleton deliveries, did not consider chorionicity, and the methods of induction were not described; considering the multicentre retrospective design of that study, interhospital variations may have existed in terms of induction methods, intrapartum assessment, and decisions regarding caesarean section. Furthermore, Okby et al9 included 191 twin deliveries and 25 913 singleton deliveries, but did not provide details regarding induction methods, cervical status prior to induction, Bishop score, or the chorionicity of twin pregnancies. Conversely, Fausett et al10 and Taylor et al11 included smaller cohorts of twin pregnancies (62 and 100 patients, respectively), and their findings may have been influenced by the small sample sizes. The method of random patient selection used in the control group of the study by Taylor et al11 may have introduced potential bias. Another factor potentially contributing to differences in findings among these studies is ethnic variation.
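For readers comparing the odds ratios quoted above, a crude (unadjusted) OR and its Wald 95% CI can be computed from a 2×2 table roughly as sketched below. The counts are hypothetical placeholders and are not data from this study or from the cited studies; the adjusted ORs reported by Loscul et al and Okby et al additionally require regression modelling, which is not shown. The function name is illustrative.

```python
import math

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts for illustration only (NOT taken from any study)
or_, lo, hi = odds_ratio_wald_ci(30, 70, 15, 85)
print(f"OR={or_:.2f} (95% CI={lo:.2f}-{hi:.2f})")
```

A confidence interval that excludes 1 corresponds to a statistically significant crude association at the 5% level.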
 
Physiological explanations
Physiological differences in the myometrium between twin and singleton pregnancies may explain the higher incidence of failed labour induction in twin pregnancies. Research has shown that myometrial activity in twin pregnancies is characterised by shorter and more frequent contractions compared with singleton pregnancies, particularly at term.20 Shortened contraction duration may result in ineffective and dysfunctional contractions, increasing the likelihood of failed labour induction. Additionally, the uterus undergoes greater distension and stretching in twin pregnancies. Physiological studies have indicated that increased myometrial stretching is associated with reduced uterine contraction in response to oxytocin stimulation.21 One potential mechanism for this phenomenon is that prolonged stretching enhances the expression or activity of TWIK-related K+ channels, which subsequently diminish myometrial contraction in response to oxytocin.21 Further physiological and molecular investigations are warranted to explore the differences between singleton and twin pregnancies in greater detail.
 
Clinical implications of secondary outcomes
In conjunction with the primary outcomes, the secondary outcomes of this study provide important clinical insights and have substantial implications. Notably, 26.3% of twin pregnancies delivering vaginally required a vaginal breech delivery for the second twin (Tables 3 and 4). Therefore, we recommend that senior obstetricians with expertise in internal podalic version and breech extraction be present during such deliveries.
 
This study identified one patient who required a caesarean section for the second twin after the first twin had been delivered vaginally. When the first twin was delivered vaginally in our cohort, the probability of caesarean section for the second twin was 0.8%. Patients should be informed of this potential risk prior to induction of labour. Previous studies have shown that the risk of caesarean section for the second twin after vaginal delivery of the first twin ranges from 4.3% to 10.7%.5 22 23 24 In our cohort, the percentage of caesarean deliveries for second twins was much lower than that in other series, including another retrospective study conducted in Hong Kong with the same ethnic population.22 23 24 This discrepancy may be attributed to selection bias because the present study included only patients undergoing induction of labour, whereas other studies included patients with both induction of labour and spontaneous onset of labour. Furthermore, elective induction of labour for twin pregnancies in our unit is typically scheduled during daytime hours. This practice ensures the availability of experienced staff proficient in internal podalic version, potentially improving the likelihood of successful vaginal delivery for the second twin. Our findings showed that 13.6% of cases involving second twin deliveries required internal podalic version, primarily due to transverse or oblique lie (Table 4). Even when the second twin presented in cephalic position (as observed in five cases), internal podalic version was required to expedite delivery because of complications such as cord presentation or fetal bradycardia. The presence of experienced staff skilled in performing internal podalic version can significantly increase the likelihood of achieving successful vaginal delivery for the second twin. At our hospital, vaginal twin deliveries during daytime hours are typically supervised by experienced obstetric consultants or associate consultants.
 
Strengths and limitations
To minimise the impact of variations in medical practice during the study period, we utilised a rigorous matching approach in which the singleton pregnancy patient with the delivery date closest to that of the twin pregnancy patient was selected. This approach effectively reduced the potential for confounding factors, including variations in medical practices, and ensured that the same induction of labour protocol was applied to both patient groups. Also, we recorded detailed information concerning chorionicity, indications for induction of labour, and the modified Bishop score. Finally, the induction of labour protocol used in this study aligns with NICE recommendations; thus, the results are applicable to other centres that use a similar protocol.
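The matching rule described above (for each twin pregnancy, select the singleton pregnancy with the closest delivery date) can be sketched as follows. This is a minimal illustration rather than the authors' actual procedure: the record layout, the identifiers, and the choice to match without replacement are assumptions, and the dates are invented.

```python
from datetime import date

def match_nearest_delivery(twin_cases, singleton_pool):
    """For each twin case, pick the unused singleton with the closest delivery date.

    Both arguments are lists of (patient_id, delivery_date) tuples.
    Matching without replacement is an assumption of this sketch.
    """
    available = list(singleton_pool)
    matches = {}
    for twin_id, twin_date in twin_cases:
        if not available:
            break  # no controls left to match
        best = min(available, key=lambda s: abs((s[1] - twin_date).days))
        matches[twin_id] = best[0]
        available.remove(best)
    return matches

# Invented example records
twins = [("T1", date(2020, 3, 10)), ("T2", date(2020, 6, 1))]
singles = [("S1", date(2020, 3, 12)), ("S2", date(2020, 5, 30)), ("S3", date(2020, 1, 1))]
matched = match_nearest_delivery(twins, singles)
print(matched)
```

Because controls are drawn from deliveries close in time to each case, both groups are likely to have been managed under the same prevailing protocol.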
 
However, this study had some limitations. The earlier gestational age at delivery in the twin pregnancy group, as recommended by international guidelines4 even for uncomplicated twin pregnancies, may have affected the efficacy of induction of labour and influenced the overall outcomes. Additionally, patients with twin pregnancies had a higher initial modified Bishop score relative to those with singleton pregnancies. This difference may be due to selection bias because obstetricians often discourage vaginal delivery in patients with low initial modified Bishop scores; instead, they recommend caesarean section. Despite the higher initial modified Bishop score in twin pregnancies, the success rate of vaginal delivery remained lower in this group than in singleton pregnancies, suggesting that this factor did not significantly influence the study’s results.
 
Although the induction of labour protocol was consistent for both twin and singleton pregnancies, variations in obstetricians’ assessments of cervical dilatation, labour progression, and confidence in managing vaginal twin deliveries may have influenced study outcomes. Obstetricians with less experience or confidence in vaginal twin delivery may have been more likely to diagnose failed induction of labour and proceed with caesarean section. However, no statistically significant difference was observed in the time from initiation of oxytocin to selection of caesarean section between the two groups. The retrospective nature of the study introduced potential biases, and the limited incidence of twin pregnancies in a single regional hospital restricted the sample size. A larger sample size and multicentre design would enhance the generalisability of the findings. Furthermore, because the study primarily included Chinese patients, the applicability of these conclusions to other ethnic groups is limited; there is a need for further research in this area.
 
Conclusion
The failure rate of induction of labour was higher in twin pregnancies than in singleton pregnancies. Nevertheless, 73.8% of patients with twin pregnancies achieved successful vaginal deliveries; approximately 20% required instrumental delivery for at least one twin. Furthermore, twin pregnancies were associated with a higher incidence of postpartum haemorrhage. These findings can help facilitate informed decision making for patients and obstetricians when considering induction of labour and selecting the most appropriate mode of delivery for patients with twin pregnancies.
 
Author contributions
Concept or design: CK Wong, WL Lau.
Acquisition of data: CK Wong.
Analysis or interpretation of data: CK Wong.
Drafting of the manuscript: CK Wong, CMW Hung.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank the Kwong Wah Hospital Clinical Research Centre and Mr Steven Lau, Biostatistician from the Centre for Clinical Research and Biostatistics of The Chinese University of Hong Kong, for their statistical advice.
 
Declaration
This work was posted on Authorea as a registered online preprint (https://doi.org/10.22541/au.169320065.53300017/v1).
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Kowloon Central Cluster/Kowloon East Cluster Research Ethics Committee of Hospital Authority, Hong Kong (Ref No.: KC/KE-22-0113/ER-3). The requirement for informed patient consent was waived by the Committee due to the retrospective nature of the research and the use of anonymised data in the research.
 
References
1. Monden C, Pison G, Smits J. Twin peaks: more twinning in humans than ever before. Hum Reprod 2021;36:1666-73.
2. Santana DS, Cecatti JG, Surita FG, et al. Twin pregnancy and severe maternal outcomes: the World Health Organization multicountry survey on maternal and newborn health. Obstet Gynecol 2016;127:631-41.
3. Cheong-See F, Schuit E, Arroyo-Manzano D, et al. Prospective risk of stillbirth and neonatal complications in twin pregnancies: systematic review and meta-analysis. BMJ 2016;354:i4353.
4. National Institute for Health and Care Excellence. NICE guideline [NG137]. Twin and triplet pregnancy. London: National Institute for Health and Care Excellence; 2019.
5. Barrett JF, Hannah ME, Hutton EK, et al. A randomized trial of planned cesarean or vaginal delivery for twin pregnancy. N Engl J Med 2013;369:1295-305.
6. Sotiriadis A, Petousis S, Thilaganathan B, et al. Maternal and perinatal outcomes after elective induction of labor at 39 weeks in uncomplicated singleton pregnancy: a meta-analysis. Ultrasound Obstet Gynecol 2019;53:26-35.
7. Stock SJ, Ferguson E, Duffy A, Ford I, Chalmers J, Norman JE. Outcomes of elective induction of labour compared with expectant management: population-based study. BMJ 2012;344:e2838.
8. Loscul C, Schmitz T, Blanc-Petitjean P, Goffinet F, Le Ray C; JUMODA and MEDIP study groups. Risk of cesarean after induction of labor in twin compared to singleton pregnancies. Eur J Obstet Gynecol Reprod Biol 2019;237:68-73.
9. Okby R, Shoham-Vardi I, Ruslan S, Sheiner E. Is induction of labor risky for twins compare to singleton pregnancies? J Matern Fetal Neonatal Med 2013;26:1804-6.
10. Fausett MB, Barth WH Jr, Yoder BA, Satin AJ. Oxytocin labor stimulation of twin gestations: effective and efficient. Obstet Gynecol 1997;90:202-4.
11. Taylor M, Rebarber A, Saltzman DH, Klauser CK, Roman AS, Fox NS. Induction of labor in twin compared with singleton pregnancies. Obstet Gynecol 2012;120:297-301.
12. Amikam U, Hiersch L, Barrett J, Melamed N. Labour induction in twin pregnancies. Best Pract Res Clin Obstet Gynaecol 2022;79:55-69.
13. Yung WK, Liu AL, Lai SF, et al. A specialised twin pregnancy clinic in a public hospital. Hong Kong J Gynaecol Obstet Midwifery 2012;12:21-32.
14. National Institute for Health and Care Excellence. NICE guideline [NG207]. Inducing labour. London: National Institute for Health and Care Excellence; 2021.
15. Thomas J, Kavanagh J, Kelly A, editors. RCOG Evidence-based Clinical Guidelines Induction of labour. RCOG Press; 2001.
16. Batinelli L, Serafini A, Nante N, Petraglia F, Severi FM, Messina G. Induction of labour: clinical predictive factors for success and failure. J Obstet Gynaecol 2018;38:352-8.
17. Jeong Y, Choo SP, Yun J, Kim EH. Effect of maternal age on maternal and perinatal outcomes including cesarean delivery following induction of labor in uncomplicated elderly primigravidae. Medicine (Baltimore) 2021;100:e27063.
18. Chan YY, Lo TK, Yu EL, Ho LF. Indications for induction of labour and mode of delivery in nulliparous term women with an unfavourable cervix. Hong Kong J Gynaecol Obstet Midwifery 2021;21:69-75.
19. Papoutsis D, Antonakou A, Tzavara C. The effect of ethnic variation on the success of induced labour in nulliparous women with postdates pregnancies. Scientifica (Cairo) 2016;2016:9569725.
20. Turton P, Arrowsmith S, Prescott J, et al. A comparison of the contractile properties of myometrium from singleton and twin pregnancies. PLoS One 2013;8:e63800.
21. Yin Z, He W, Li Y, et al. Adaptive reduction of human myometrium contractile activity in response to prolonged uterine stretch during term and twin pregnancy. Role of TREK-1 channel. Biochem Pharmacol 2018;152:252-63.
22. Mok SL, Lo TK. Vaginal delivery of second twins: factors predictive of failure and adverse perinatal outcomes. Hong Kong Med J 2022;28:376-82.
23. Tang HT, Liu AL, Chan SY, et al. Twin pregnancy outcomes after increasing rate of vaginal twin delivery: retrospective cohort study in a Hong Kong regional obstetric unit. J Matern Fetal Neonatal Med 2016;29:1094-100.
24. Kong CW, To WW. The predicting factors and outcomes of caesarean section of the second twin. J Obstet Gynaecol 2017;37:709-13.

Prevalence, risk factors, and outcomes of systemic sclerosis–associated interstitial lung disease in a Chinese population

Hong Kong Med J 2025 Feb;31(1):16–23 | Epub 12 Feb 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE  CME
Dennis TH Chan, MRCP (UK), FHKAM (Medicine)1; Lydia HP Tam, MRCP (UK), FHKAM (Medicine)2; Tommy TO Lam, MRCP (UK), FHKAM (Medicine)2; Jacqueline So, MRCP (UK), FHKAM (Medicine)2; LY Ho, MRCP (UK), FHKAM (Medicine)2; LS Tam, MRCP (UK), FRCP (Lond)2; Ho So, FHKAM (Medicine), FRCP (Lond)2
1 Department of Medicine, Alice Ho Miu Ling Nethersole Hospital, Hong Kong SAR, China
2 Department of Medicine and Therapeutics, Prince of Wales Hospital, The Chinese University of Hong Kong, Hong Kong SAR, China
3 Department of Medicine and Geriatrics, Tai Po Hospital, Hong Kong SAR, China
 
Corresponding author: Dr Dennis TH Chan (cdt978@ha.org.hk)
 
 
Abstract
Introduction: Systemic sclerosis–associated interstitial lung disease (SSc-ILD) is a leading cause of mortality among systemic sclerosis (SSc) patients. This multicentre cohort study sought to determine the prevalence of SSc-ILD, identify risk factors for ILD development in SSc patients, and explore poor prognostic factors in SSc-ILD patients.
 
Methods: Medical records were retrospectively reviewed for Chinese patients who met the 2013 American College of Rheumatology/European League Against Rheumatism classification criteria for SSc. Univariable and multivariable analyses were performed to compare SSc patients with and without ILD, as well as SSc-ILD patients with and without disease progression. Survival analysis was also conducted.
 
Results: The study cohort comprised 223 SSc patients with a median follow-up duration of 8.1 years. The prevalence of ILD was 49.8%. A history of bibasal crackles (hazard ratio [HR]=2.813; P=0.001) was independently associated with ILD development. Among ILD patients, 64.1% exhibited progressive disease. An elevated C-reactive protein (CRP) level at ILD diagnosis (HR=1.064; P=0.002) constituted an independent predictor of ILD progression. The overall mortality rate was 24.2% and pneumonia was the most common cause of death. Predictors of mortality included age at SSc diagnosis (HR=1.101; P=0.002), history of smoking (HR=5.173; P=0.028), and CRP level at SSc diagnosis (HR=1.103; P=0.009).
 
Conclusion: Interstitial lung disease was prevalent among SSc patients in this cohort and the majority exhibited disease progression. Comprehensive clinical assessment, supported by investigations such as CRP level measurement, is essential to identify predictors of poor prognosis.
 
 
New knowledge added by this study
  • Interstitial lung disease (ILD) is common and often progressive among systemic sclerosis (SSc) patients in the Hong Kong Chinese population.
  • Baseline C-reactive protein level is independently associated with ILD progression and mortality in SSc patients.
Implications for clinical practice or policy
  • Interstitial lung disease screening is recommended for all SSc patients.
  • C-reactive protein level may serve as a predictor of ILD progression and mortality in SSc patients.
  • Prospective studies are necessary to develop personalised monitoring and treatment strategies.
 
 
Introduction
Systemic sclerosis (SSc) is a heterogeneous connective tissue disorder involving multiple organ systems. Its subtypes comprise limited cutaneous SSc (lcSSc) and diffuse cutaneous SSc (dcSSc).1 Common features include Raynaud’s phenomenon, skin sclerosis, and musculoskeletal inflammation. Organ-based manifestations, such as interstitial lung disease (ILD), pulmonary hypertension (PH), and scleroderma renal crisis, are particularly important because they substantially affect patient quality of life and survival. Systemic sclerosis–associated interstitial lung disease (SSc-ILD) is the leading cause of mortality in SSc, contributing to 35% of disease-related deaths.2 In Hong Kong, SSc has one of the highest standardised mortality ratios among rheumatic diseases.3
 
Systemic sclerosis–associated interstitial lung disease arises from chronic microinjuries to lung endothelial and epithelial cells, which activate the immune system and lead to the recruitment and transformation of fibroblasts into myofibroblasts that secrete excessive collagen-rich extracellular matrix.4 5 This process causes lung stiffness and architectural disruption, producing restrictive lung disease through reductions in lung compliance and volume.6
 
Ethnic disparity in SSc-ILD is well established; prevalence rates vary considerably among ethnic groups, ranging from 25% to 90%.7 The prevalence of SSc-ILD is reportedly higher in Asian populations than in Western populations.8 However, data concerning the prevalence and predictive factors of SSc-ILD in Southern Chinese individuals remain limited. A prospective case-control study investigating functioning and health-related quality of life in Hong Kong showed that among 78 SSc patients recruited, 24 (30.8%) had ILD.9
 
The clinical course of SSc-ILD ranges from asymptomatic presentation to rapidly progressive disease, which can lead to mortality. Severe disease develops in approximately 25% to 33% of SSc-ILD patients.4 Thus, it is essential to identify patients with early-stage SSc who are asymptomatic but at risk of ILD development and progression. This approach enables closer monitoring and facilitates timely treatment. Numerous risk factors for ILD development and progression in SSc patients have been reported.8 10 11 According to the 2020 European consensus statements on the identification and management of ILD in SSc,10 predictive factors include respiratory symptoms, smoking history, ethnicity (eg, Native American or African heritage), dcSSc, presence of anti-topoisomerase antibody (ATA), and male sex. However, most of these findings were based on studies conducted in Western populations.10
 
To improve the identification and management of SSc patients at risk of ILD development or progression, we conducted a multicentre study that aimed to assess the prevalence of SSc-ILD in the Hong Kong Chinese population, investigate associated risk factors, and identify potential indicators of poor prognosis. The findings of this study are expected to enhance early detection and monitoring of ILD in SSc patients, enabling timely and effective interventions.
 
Methods
Study design and patients
This retrospective longitudinal study included SSc patients who attended Alice Ho Miu Ling Nethersole Hospital, Prince of Wales Hospital, and North District Hospital. These patients were identified via the Clinical Data Analysis and Reporting System, a database established for record keeping and research purposes in Hong Kong, which has been utilised in epidemiological studies.12 The International Classification of Diseases, Ninth Revision, Clinical Modification code 710.1 (Systemic sclerosis) was used to identify SSc patients within the Clinical Data Analysis and Reporting System. The search period spanned from January 2008 to December 2022. Clinical information for each patient was reviewed in the electronic health record. Patients were included if they had attended more than one follow-up appointment and met the 2013 American College of Rheumatology/European League Against Rheumatism classification criteria for SSc.13 Exclusion criteria were age at onset <18 years, overlap syndrome, and non-Chinese ethnicity. Patients with ILD were identified based on radiologists’ reports of high-resolution computed tomography (HRCT) of the thorax. For patients without HRCT records, chest radiographs were reviewed to identify evidence of ILD. Investigations, treatments, and the frequency of follow-ups were determined by the treating physicians.
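The inclusion and exclusion criteria above can be expressed as a single eligibility check. The sketch below is illustrative only: the record fields are hypothetical names chosen for this example and do not correspond to actual Clinical Data Analysis and Reporting System fields.

```python
from dataclasses import dataclass

@dataclass
class PatientRecord:
    # All field names are hypothetical, chosen for illustration
    follow_up_visits: int
    meets_2013_acr_eular: bool
    age_at_onset: float
    overlap_syndrome: bool
    chinese_ethnicity: bool

def is_eligible(p: PatientRecord) -> bool:
    """Apply the stated inclusion criteria, then the stated exclusion criteria."""
    included = p.follow_up_visits > 1 and p.meets_2013_acr_eular
    excluded = p.age_at_onset < 18 or p.overlap_syndrome or not p.chinese_ethnicity
    return included and not excluded

# Invented example records
adult_case = PatientRecord(5, True, 42.0, False, True)
juvenile_case = PatientRecord(5, True, 16.0, False, True)
single_visit = PatientRecord(1, True, 50.0, False, True)
print(is_eligible(adult_case), is_eligible(juvenile_case), is_eligible(single_visit))
```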
 
Clinical variable collection
Demographic variables, including sex, smoking history, age at symptom onset, age at SSc diagnosis, and age at ILD diagnosis, were recorded. The first clinical symptoms attributed to SSc, as judged by the treating physicians, and symptoms observed during the follow-up period were documented. These symptoms included Raynaud’s phenomenon, puffy fingers, sclerodactyly, digital ulcers, oesophageal dysmotility, arthralgia, dyspnoea, and cough.14 The presence of bibasal crackles on physical examination by the treating physicians was also documented. The status of PH was recorded based on findings from echocardiography or right heart catheterisation. Disease duration was defined as the time from onset of the first symptom to the last visit. The SSc subtype was categorised as dcSSc or lcSSc based on the extent of skin involvement, using criteria established by LeRoy and Medsger.1
 
Laboratory data, including autoantibodies, C-reactive protein (CRP), and erythrocyte sedimentation rate (ESR) levels, were recorded. C-reactive protein and ESR levels at baseline and at ILD diagnosis were documented. Pulmonary function test (PFT) results at baseline and at the latest available assessment were retrieved. Forced expiratory volume in 1 second, forced vital capacity (FVC), and diffusing capacity of the lungs for carbon monoxide (DLCO) were recorded. In ILD cases, the radiological pattern on HRCT, including non-specific interstitial pneumonitis, usual interstitial pneumonia, or other patterns, was noted.
 
Systemic sclerosis–associated interstitial lung disease outcomes were assessed based on ILD progression and mortality. Disease progression was defined as an increase in ILD extent on serial HRCT, as reported by radiologists, or a decline in FVC of ≥10% from baseline. Alternatively, progression was defined as an FVC decline of 5% to 9% accompanied by a DLCO decline of ≥15%.15 Causes of death were categorised as SSc-related or SSc-unrelated, based on assessment by the treating physicians (when available) or the authors. Clinical variables with >20% missing data were excluded from statistical analyses.
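The progression definition above can be written as a short decision rule. The following is a minimal sketch under stated assumptions: declines are computed as relative percentage changes from baseline (the text does not specify relative versus absolute change), the "5% to 9%" band is treated as at least 5% and below 10%, and all function and parameter names are illustrative.

```python
def relative_decline(baseline: float, latest: float) -> float:
    """Percentage decline from baseline (positive means the value fell)."""
    return (baseline - latest) / baseline * 100

def is_progressive(hrct_extent_increased: bool,
                   fvc_baseline: float, fvc_latest: float,
                   dlco_baseline: float, dlco_latest: float) -> bool:
    fvc_drop = relative_decline(fvc_baseline, fvc_latest)
    dlco_drop = relative_decline(dlco_baseline, dlco_latest)
    # Criterion 1: radiological increase in ILD extent, or FVC decline of >=10%
    if hrct_extent_increased or fvc_drop >= 10:
        return True
    # Criterion 2: FVC decline of 5% to 9% together with DLCO decline of >=15%
    return 5 <= fvc_drop < 10 and dlco_drop >= 15

# Invented values (percent predicted) for illustration
print(is_progressive(False, 80, 70, 60, 60))  # FVC fell by 12.5%
print(is_progressive(False, 80, 75, 60, 50))  # FVC fell by 6.25%, DLCO by about 16.7%
print(is_progressive(False, 80, 75, 60, 58))  # neither criterion met
```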
 
Statistical analyses
Descriptive data for continuous variables were presented as mean±standard deviation or median (interquartile range [IQR]), as appropriate. Categorical variables were presented as numbers with percentages. Student’s t test or the Mann-Whitney U test was used for comparisons of continuous variables, depending on the data distribution. Categorical variables were compared using Fisher’s exact test or the Chi-squared test. Patients with and without ILD were compared using univariable and multivariable Cox regression analyses to identify risk factors associated with the development of SSc-ILD. Among SSc-ILD patients, those displaying progressive ILD were compared with those lacking progression via univariable and multivariable analyses to identify risk factors for disease progression. The univariable effects of covariates on survival were evaluated using Kaplan–Meier curves; the log-rank test was utilised to assess differences in survival. Multivariable Cox regression analyses were conducted to identify independent predictors of adverse outcomes. Variables with P value <0.2 in univariable analyses were included in the multivariable Cox regression analysis. All statistical analyses were performed using SPSS (Windows version 27.0; IBM Corp, Armonk [NY], United States). P values <0.05 were considered statistically significant.
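To illustrate the Kaplan–Meier method mentioned above, the product-limit estimate can be computed in a few lines. The analyses in this study were performed in SPSS; the plain-Python sketch below is only an illustration of the estimator, and the follow-up times and event indicators are invented.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: 1 if the event (eg, death) occurred at that time, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t
        n_at_risk -= leaving
    return curve

# Invented toy data: years of follow-up and event indicators
km_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(km_curve)
```

Censored patients contribute to the number at risk until their last follow-up but never reduce the survival estimate directly, which is why the step at year 3 in the toy data is larger than the step at year 1.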
 
Results
Demographics and clinical characteristics
In total, 223 SSc patients were included in this study (Fig). Table 1 summarises the patients’ baseline characteristics. The median follow-up duration was 8.1 years (IQR=4.0-10.2) and the total cumulative follow-up period was 1951 person-years. The majority of patients were female (86.1%). The median age at SSc diagnosis was 55 years (IQR=48-64). A majority of patients (86.5%) underwent HRCT scans during the follow-up period. Among those without HRCT, none had chest radiographs suggestive of ILD. Limited cutaneous SSc was the most common subtype, displayed by 71.3% of the cohort. Anti-topoisomerase antibody was the most frequently detected autoantibody, present in 39.0% of patients.
 

Figure. Patient recruitment
 

Table 1. Baseline characteristics of systemic sclerosis patients in this study
 
The overall prevalence of ILD among SSc patients was 49.8%. The age at ILD diagnosis ranged from 20 to 85 years, with a median of 57 years. Most patients in the SSc-ILD subgroup were female (86.5%) and non-smokers (86.5%); these characteristics did not significantly differ relative to patients without ILD (Table 1). The median interval from onset of the first SSc symptom to ILD diagnosis was 2.4 years (IQR=1.3-5.4). Among ILD cases, 51.3% were diagnosed within the first 3 years after SSc symptom onset, and 64.0% were diagnosed within 5 years. Of the ILD patients, 18.9% were asymptomatic, whereas symptomatic patients experienced a median interval of 2.4 years (IQR=1.2-6.3) from respiratory symptom onset to ILD diagnosis.
 
The frequency of dcSSc was significantly greater in patients with ILD than in patients without ILD (39.6% vs 16.1%; P<0.001). Conversely, lcSSc was more common in patients without ILD than in patients with ILD (83.0% vs 59.5%; P<0.001). In the ILD group, ATA was the most frequently detected autoantibody (57.7%), whereas anti-centromere antibody (ACA) was more common in the non-ILD group (49.1%) [Table 1].
 
The frequencies of non-respiratory clinical features were comparable between the ILD and non-ILD groups. However, respiratory features, including dyspnoea, cough, and bibasal crackles, significantly differed between the two groups, both at presentation and during follow-up (P<0.001 for all comparisons). Pulmonary hypertension was significantly more frequent in the ILD group throughout the follow-up period (19.8% vs 2.7%; P<0.001). The ILD group also exhibited a numerically higher baseline ESR, with a median of 21.5 mm/hr (IQR=14-40.5), whereas the non-ILD group displayed a median of 18 mm/hr (IQR=11-30; P=0.074) [online supplementary Table 1].
 
Associative factors of interstitial lung disease development
Univariable analysis showed that several factors were associated with the presence of ILD (online supplementary Table 2). These included dcSSc, ATA, history of dyspnoea, history of cough, history of bibasal crackles, history of PH, and baseline ESR level. Conversely, ACA and lcSSc were negatively associated with ILD development. According to multivariable Cox regression analysis, a history of bibasal crackles was independently associated with the presence of ILD, and a history of dyspnoea showed a trend towards significance.
 
Predictors of interstitial lung disease progression
Among patients with ILD, 64.1% exhibited progression during follow-up. Patients with progressive ILD were younger at ILD diagnosis, displaying a mean age of 54 years (range, 20-85) compared with 60 years (range, 31-81; P=0.051) in patients with non-progressive ILD. The proportions of dcSSc and lcSSc were similar between the progressive and non-progressive ILD groups. Anti-topoisomerase antibody was the predominant autoantibody in both groups, with proportions of 62.7% and 54.5%, respectively (P=0.444) [online supplementary Table 3]. Regarding clinical characteristics, only a history of digital ulcers showed a significant difference; its prevalence was higher in the progressive ILD group (42.4% vs 15.2%; P=0.008) [online supplementary Table 4].
 
Table 2 compares laboratory and PFT results between the progressive and non-progressive ILD groups. C-reactive protein levels at both SSc diagnosis and ILD diagnosis were higher in the progressive ILD group; however, only CRP level at ILD diagnosis showed a trend towards significance (P=0.130). The latest values for the predicted percentages of forced expiratory volume in 1 second, FVC, and DLCO were significantly lower in the progressive ILD group (all P≤0.001), but baseline values did not differ between the groups. Regarding HRCT patterns, no significant differences were observed between the two groups.
 

Table 2. Laboratory and pulmonary function test results of systemic sclerosis patients with progressive and non-progressive interstitial lung disease
 
The results of the Cox regression analysis for ILD progression are presented in Table 3. In the univariable analysis, factors associated with ILD progression included CRP level at ILD diagnosis (hazard ratio [HR]=1.504; P=0.005) and the latest predicted percentage of DLCO (HR=0.962; P<0.001). Multivariable analysis identified CRP level at ILD diagnosis (HR=1.064; P=0.002) as an independent factor associated with ILD progression, whereas a history of digital ulcers (HR=1.874; P=0.076) showed a trend towards significance.
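As an aid to interpreting the hazard ratios above, a Cox model HR for a continuous covariate scales multiplicatively with the size of the increment. The worked relation below assumes that the reported HR is expressed per one-unit increase in CRP (the unit is not stated in this section) and that the log-linear form of the model holds across the range considered; the 10-unit increment is chosen purely for illustration.

```latex
\[
h(t \mid x) = h_0(t)\, e^{\beta x}, \qquad
\mathrm{HR}_{k} = \frac{h(t \mid x + k)}{h(t \mid x)} = e^{\beta k} = \left(\mathrm{HR}_{1}\right)^{k}
\]
\[
\mathrm{HR}_{1} = 1.064 \;\Rightarrow\; \mathrm{HR}_{10} = 1.064^{10} \approx 1.86
\]
```

Under these assumptions, an HR that looks close to 1 per unit can still correspond to a substantially higher hazard across a clinically meaningful difference in CRP.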
 

Table 3. Univariable and multivariable Cox regression for predictors of interstitial lung disease progression
 
Mortality
The overall mortality rate in the cohort during the follow-up period was 24.2%; a higher rate was observed in the SSc-ILD group relative to the non-ILD group (29.7% vs 18.8%, P=0.056) [online supplementary Fig]. Among the causes of death, infections were most common, followed by malignancy (online supplementary Table 5). In patients with ILD, 63.6% of deaths resulted from pneumonia; this proportion was 42.9% among patients without ILD. The univariable analysis indicated that factors associated with mortality were older age at SSc diagnosis, male sex, history of smoking, presence of PH, baseline CRP level, and baseline ESR level. Multivariable analysis revealed the following independent predictors of mortality: older age at SSc diagnosis (HR=1.101; P=0.002), history of smoking (HR=5.173; P=0.028), and higher baseline CRP level (HR=1.103; P=0.009) [Table 4].
 

Table 4. Univariable and multivariable Cox regression for predictors of mortality
 
Discussion
We observed an SSc-ILD prevalence of 49.8% in a multicentre cohort of Southern Chinese SSc patients. Among Asian countries, the reported prevalence of SSc-ILD varies. In Korea, prevalence rates range from 40% to 58%,16 17 whereas in Japan, they range from 42% to 51%.18 19 However, considerably higher prevalence estimates of 63% to 85% have been reported in centres from Northern China.20 21 These findings suggest ethnic or geographic variations in the prevalence of SSc-ILD within the Asian population. Given the high prevalence in our cohort and the observed delay between respiratory symptom onset and ILD diagnosis (median=2.4 years), early universal screening for ILD is necessary among SSc patients. This is particularly important because a substantial proportion of patients (18.9%) were asymptomatic.
 
Consistent with previous studies, our findings confirmed that in Chinese SSc patients, the dcSSc subtype and presence of ATA were associated with a higher likelihood of ILD development, whereas the lcSSc subtype and presence of ACA were inversely related to ILD risk.11 22 23 Also, our study showed that a history of bibasal crackles was independently associated with ILD development, similar to the findings of a retrospective cohort study in South Africa.24 However, it is important to recognise that the presence of crackles often reflects established disease, and the new onset of respiratory symptoms may indicate ILD development. Irrespective of the presence of respiratory symptoms, all SSc patients are recommended to undergo screening for ILD via HRCT and PFTs, as specified by expert consensuses.10 25 Regular auscultation for bibasal crackles during follow-up is equally important because it facilitates the identification of individuals who may require repeat investigations.
 
C-reactive protein has been proposed as a biomarker for predicting SSc-ILD progression.26 Similar to our findings, a retrospective cohort study in France27 revealed a significant difference in CRP levels between SSc patients with and without ILD (P=0.003). The multivariate analysis in that study also demonstrated a negative correlation between CRP levels and FVC.27 C-reactive protein production is driven by interleukin 6, and interleukin 6 inhibitors have shown efficacy in preserving lung function among SSc-ILD patients during a phase three randomised controlled trial.28 These findings provide a mechanistic rationale for using CRP levels to identify SSc-ILD patients who may benefit from early investigation and treatment.
 
The cumulative survival rates reported in our study align with those observed in Western populations.11 29 However, a European Scleroderma Trials and Research Group cohort study conducted in China20 demonstrated a higher cumulative survival rate of 87.8% at 10 years, a lower overall mortality rate of 8.9%, and fewer SSc-ILD–related deaths (2.5%). This disparity may be attributed to the higher frequency of infection-related deaths and the greater proportion of patients with progressive disease in our cohort. Although assessments of treatment regimen and response were beyond the scope of our study, due to the confounding by indication involved in its retrospective design, immunosuppressive agents commonly used in the past may have predisposed patients to infections. Indeed, infection has previously been identified as the leading cause of death in local SSc patients.3 Considering the high rate of infection-related mortality, recently available antifibrotic treatments may be preferable to immunosuppressive therapy in selected patients who exhibit increased infection risk. Furthermore, consistent with well-established evidence, we identified increased age15 30 31 32 and elevated CRP levels at SSc diagnosis33 34 as predictors of mortality. It remains unclear whether more aggressive early treatment in patients with elevated baseline CRP levels would improve survival; further investigation is warranted.
 
Limitations
This study has several limitations. The data were extracted from the electronic health record, making undercoding of diagnoses unavoidable. Due to the retrospective study design, some clinical data essential to this study might not have been fully documented, and disease progression monitoring was not systematic, which could introduce bias. The presence of symptoms and ILD was assessed by the treating physicians and radiologists, respectively; these assessments potentially lacked specificity or sensitivity. Follow-up investigations were primarily ordered based on clinical judgement, leading to potential selection bias. Our analyses did not adjust for patients with progressive disease who may have received treatment leading to ILD stabilisation, which could have resulted in classification of their ILD as non-progressive. Furthermore, no standardised criteria currently exist for defining SSc-ILD progression. Quantitative assessments of ILD involvement on HRCT, such as percentage involvement or the Warrick score,35 and the extent of skin disease using the modified Rodnan skin score,36 were also unavailable.
 
Conclusion
This is the first multicentre cohort study to investigate SSc-ILD in Hong Kong. Our findings demonstrated a high prevalence of ILD among Chinese SSc patients, with a significant proportion of these patients exhibiting disease progression. Universal ILD screening is recommended for SSc patients, with particular attention to those who develop respiratory symptoms and signs. In addition to imaging and PFTs, CRP levels could serve as a biomarker for ILD progression and poor prognosis.
 
Author contributions
Concept or design: DTH Chan, H So.
Acquisition of data: DTH Chan, LHP Tam.
Analysis or interpretation of data: DTH Chan, H So.
Drafting of the manuscript: DTH Chan, H So.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Declaration
The results of this study were presented as a poster at the 26th Asia-Pacific League of Associations for Rheumatology Congress 2024 in Singapore, 21-25 August 2024.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Joint Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee, Hong Kong (Ref No.: CREC-2023-393). The requirement for informed patient consent was waived by the Committee due to the retrospective nature of the research.
 
Supplementary material
The supplementary material was provided by the authors, and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. LeRoy EC, Medsger TA Jr. Criteria for the classification of early systemic sclerosis. J Rheumatol 2001;28:1573-6.
2. Tyndall AJ, Bannert B, Vonk M, et al. Causes and risk factors for death in systemic sclerosis: a study from the EULAR Scleroderma Trials and Research (EUSTAR) database. Ann Rheum Dis 2010;69:1809-15.
3. Mok CC, Kwok CL, Ho LY, Chan PT, Yip SF. Life expectancy, standardized mortality ratios, and causes of death in six rheumatic diseases in Hong Kong, China. Arthritis Rheum 2011;63:1182-9.
4. Khanna D, Tashkin DP, Denton CP, Renzoni EA, Desai SR, Varga J. Etiology, risk factors, and biomarkers in systemic sclerosis with interstitial lung disease. Am J Respir Crit Care Med 2020;201:650-60.
5. Mostmans Y, Cutolo M, Giddelo C, et al. The role of endothelial cells in the vasculopathy of systemic sclerosis: a systematic review. Autoimmun Rev 2017;16:774-86.
6. Khanna D, Lescoat A, Roofeh D, et al. Systemic sclerosis–associated interstitial lung disease: how to incorporate two Food and Drug Administration–approved therapies in clinical practice. Arthritis Rheumatol 2022;74:13-27.
7. Qiu M, Nian X, Pang L, Yu P, Zou S. Prevalence and risk factors of systemic sclerosis–associated interstitial lung disease in East Asia: a systematic review and meta-analysis. Int J Rheum Dis 2021;24:1449-59.
8. Chan DT, So H. Systemic sclerosis–associated interstitial lung disease: prevalence and risk factors. J Clin Rheumatol Immunol 2023;23:15-24.
9. Chan PT, Mok CC, Chan KL, Ho LY. Functioning and health-related quality of life in Chinese patients with systemic sclerosis: a case-control study. Clin Rheumatol 2014;33:659-66.
10. Hoffmann-Vold AM, Maher TM, Philpot EE, et al. The identification and management of interstitial lung disease in systemic sclerosis: evidence-based European consensus statements. Lancet Rheumatol 2020;2:e71-83.
11. Nihtyanova SI, Schreiber BE, Ong VH, et al. Prediction of pulmonary complications and long-term survival in systemic sclerosis. Arthritis Rheumatol 2014;66:1625-35.
12. So H, So J, Lam TT, et al. Performance of the 2017 European Alliance of Associations for Rheumatology/American College of Rheumatology classification criteria in patients with idiopathic inflammatory myopathy and anti–melanoma differentiation–associated protein 5 positivity. Arthritis Rheumatol 2022;74:1588-92.
13. van den Hoogen F, Khanna D, Fransen J, et al. 2013 classification criteria for systemic sclerosis: an American College of Rheumatology/European League against Rheumatism Collaborative Initiative. Arthritis Rheum 2013;65:2737-47.
14. van den Hombergh WM, Carreira PE, Knaapen-Hans HK, van den Hoogen FH, Fransen J, Vonk MC. An easy prediction rule for diffuse cutaneous systemic sclerosis using only the timing and type of first symptoms and auto-antibodies: derivation and validation. Rheumatology (Oxford) 2016;55:2023-32.
15. Goh NS, Hoyles RK, Denton CP, et al. Short-term pulmonary function trends are predictive of mortality in interstitial lung disease associated with systemic sclerosis. Arthritis Rheumatol 2017;69:1670-8.
16. Jung E, Suh CH, Kim HA, Jung JY. Clinical characteristics of systemic sclerosis with interstitial lung disease. Arch Rheumatol 2018;33:322-7.
17. Kim J, Park SK, Moon KW, et al. The prognostic factors of systemic sclerosis for survival among Koreans. Clin Rheumatol 2010;29:297-302.
18. Sekiguchi A, Inoue Y, Yamazaki S, et al. Prevalence and clinical characteristics of earlobe crease in systemic sclerosis: possible association with vascular dysfunction. J Dermatol 2020;47:870-5.
19. Aozasa N, Hatano M, Saigusa R, et al. Clinical significance of endothelial vasodilatory function evaluated by EndoPAT in patients with systemic sclerosis. J Dermatol 2020;47:609-14.
20. Hu S, Hou Y, Wang Q, Li M, Xu D, Zeng X. Prognostic profile of systemic sclerosis: analysis of the clinical EUSTAR cohort in China. Arthritis Res Ther 2018;20:235.
21. Wang J, Assassi S, Guo G, et al. Clinical and serological features of systemic sclerosis in a Chinese cohort. Clin Rheumatol 2013;32:617-21.
22. Sánchez-Cano D, Ortego-Centeno N, Callejas JL, et al. Interstitial lung disease in systemic sclerosis: data from the Spanish scleroderma study group. Rheumatol Int 2018;38:363-74.
23. Gelber AC, Manno RL, Shah AA, et al. Race and association with disease manifestations and mortality in scleroderma: a 20-year experience at the Johns Hopkins Scleroderma Center and review of the literature. Medicine (Baltimore) 2013;92:191-205.
24. Ashmore P, Tikly M, Wong M, Ickinger C. Interstitial lung disease in South Africans with systemic sclerosis. Rheumatol Int 2018;38:657-62.
25. Rahaghi FF, Hsu VM, Kaner RJ, et al. Expert consensus on the management of systemic sclerosis–associated interstitial lung disease. Respir Res 2023;24:6.
26. Distler O, Assassi S, Cottin V, et al. Predictors of progression in systemic sclerosis patients with interstitial lung disease. Eur Respir J 2020;55:1902026.
27. Chikhoune L, Brousseau T, Morell-Dubois S, et al. Association between routine laboratory parameters and the severity and progression of systemic sclerosis. J Clin Med 2022;11:5087.
28. Khanna D, Lin CJ, Furst DE, et al. Long-term safety and efficacy of tocilizumab in early systemic sclerosis–interstitial lung disease: open-label extension of a phase 3 randomized controlled trial. Am J Respir Crit Care Med 2022;205:674-84.
29. Pokeerbux MR, Giovannelli J, Dauchet L, et al. Survival and prognosis factors in systemic sclerosis: data of a French multicenter cohort, systematic review, and meta-analysis of the literature. Arthritis Res Ther 2019;21:86.
30. Volkmann ER, Tashkin DP, Sim M, et al. Short-term progression of interstitial lung disease in systemic sclerosis predicts long-term survival in two independent clinical trial cohorts. Ann Rheum Dis 2019;78:122-30.
31. Volkmann ER, Saggar R, Khanna D, et al. Improved transplant-free survival in patients with systemic sclerosis–associated pulmonary hypertension and interstitial lung disease. Arthritis Rheumatol 2014;66:1900-8.
32. Takei R, Arita M, Kumagai S, et al. Radiographic fibrosis score predicts survival in systemic sclerosis–associated interstitial lung disease. Respirology 2018;23:385-91.
33. Liu X, Mayes MD, Pedroza C, et al. Does C-reactive protein predict the long-term progression of interstitial lung disease and survival in patients with early systemic sclerosis? Arthritis Care Res (Hoboken) 2013;65:1375-80.
34. Le Gouellec N, Duhamel A, Perez T, et al. Predictors of lung function test severity and outcome in systemic sclerosis–associated interstitial lung disease. PLoS One 2017;12:e0181692.
35. Warrick JH, Bhalla M, Schabel SI, Silver RM. High resolution computed tomography in early scleroderma lung disease. J Rheumatol 1991;18:1520-8.
36. Khanna D, Furst DE, Clements PJ, et al. Standardization of the modified Rodnan skin score for use in clinical trials of systemic sclerosis. J Scleroderma Relat Disord 2017;2:11-8.

Artificial intelligence–based computer-aided diagnosis for breast cancer detection on digital mammography in Hong Kong

Hong Kong Med J 2024 Dec;30(6):468–77 | Epub 19 Dec 2024
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
SM Yu, MB, BS, FHKAM (Radiology)1; Catherine YM Young, MB, BS, FRCR1; YH Chan, MB, ChB, FRCR1; YS Chan, MB, ChB, FHKAM (Radiology)1; Carita Tsoi, MB, ChB, FHKAM (Radiology)1; Melinda NY Choi, MHSc, GCB1; TH Chan, BSc, MSc1; Jason Leung, MSc2; Winnie CW Chu, MB, ChB, FHKAM (Radiology)1; Esther HY Hung, MB, ChB, FHKAM (Radiology)1; Helen HL Chau, MB, ChB, FHKAM (Radiology)
1 Department of Imaging and Interventional Radiology, Prince of Wales Hospital, Hong Kong SAR, China
2 The Jockey Club Centre for Osteoporosis Care and Control, The Chinese University of Hong Kong, Hong Kong SAR, China
 
Corresponding author: Dr SM Yu (ysm687@ha.org.hk)
 
 
Abstract
Introduction: Research concerning artificial intelligence in breast cancer detection has primarily focused on population screening. However, Hong Kong lacks a population-based screening programme. This study aimed to evaluate the potential of an artificial intelligence–based computer-assisted diagnosis (AI-CAD) program in symptomatic clinics in Hong Kong and to analyse the impact of radio-pathological breast cancer phenotype on AI-CAD performance.
 
Methods: In total, 398 consecutive patients with 414 breast cancers were retrospectively identified from a local, prospectively maintained database managed by two tertiary referral centres between January 2020 and September 2022. The full-field digital mammography images were processed using a commercial AI-CAD algorithm. An abnormality score <30 was considered a false negative, whereas a score of ≥90 indicated a high-score tumour. Abnormality scores were analysed with respect to the clinical and radio-pathological characteristics of breast cancer, tumour-to–breast area ratio (TBAR), and tumour distance from the chest wall for cancers presenting as a mass.
 
Results: The median abnormality score across the 414 breast cancers was 95.6; sensitivity was 91.5% and specificity was 96.3%. High-score cancers were more often palpable, invasive, and presented as masses or architectural distortion (P<0.001). False-negative cancers were smaller, more common in dense breast tissue, and presented as asymmetrical densities (P<0.001). Large tumours with extreme TBARs and locations near the chest wall were associated with lower abnormality scores (P<0.001). Several strengths and limitations of AI-CAD were observed and discussed in detail.
 
Conclusion: Artificial intelligence–based computer-assisted diagnosis shows potential value as a tool for breast cancer detection in the symptomatic setting, which could provide substantial benefits to patients.
 
 
New knowledge added by this study
  • With a threshold score of 30, a commercially available artificial intelligence–based computer-assisted diagnosis (AI-CAD) program showed high sensitivity and specificity for breast cancer detection on digital mammography in symptomatic settings, offering a valuable diagnostic adjunct.
  • The performance of AI-CAD varied according to the radio-pathological characteristics of breast cancer. Notably, the program demonstrated promising accuracy in detecting breast cancers that exhibit architectural distortion, which remains a diagnostic challenge.
  • Observed limitations of AI-CAD, such as underscoring cancers that present as large masses or exhibit nipple retraction as well as its inability to compare with previous studies, highlight concerns regarding standalone use of AI for triage in symptomatic clinics.
Implications for clinical practice or policy
  • Artificial intelligence–based computer-assisted diagnosis exhibits substantial potential for detecting breast cancers in symptomatic settings.
  • To make study findings clinically viable, larger validation studies are needed.
 
 
Introduction
Mammography is the principal modality used for breast cancer screening and detection in women worldwide.1 However, 10% to 30% of breast cancers may be undetected during mammography due to factors such as dense breast tissue, poor imaging technique, perceptual error, and subtle mammographic abnormalities.2
 
Conventional computer-aided diagnosis systems have been developed for more than two decades; however, large-scale studies have shown no significant benefit of such systems in enhancing radiologists’ diagnostic performance.3 4 Such systems do not facilitate differentiation between benign and malignant breast lesions, resulting in numerous false-positive results that require radiologist review, which may lead to reader fatigue and unnecessary additional investigations.
 
Currently, artificial intelligence–based computer-assisted diagnosis (AI-CAD) is widely implemented in mammography to improve diagnostic accuracy and reduce radiologist workload.5 6 The AI-CAD systems developed using deep-learning algorithms make independent decisions and self-learn without the need for feature engineering and computation.7 Artificial intelligence algorithms have been applied to multiple aspects of breast cancer screening, including risk stratification, triage, lesion interpretation, and patient recall.8 As of 2022, the US Food and Drug Administration has approved >15 AI tools for mammography applications, including density assessment, triage, lesion detection, and classification.9 Most commercial AI-CAD programs provide heatmaps with abnormality scores. Generally, higher abnormality scores indicate more suspicious radiological features and a greater likelihood of cancer.
 
Most existing evidence in the literature is derived from population-based screening studies.5 10 11 However, unlike other developed Asian countries such as Singapore and Korea, Hong Kong lacks a large-scale population screening programme.12 13 Our patient population primarily consists of symptomatic individuals. Evidence concerning the application of AI-CAD in symptomatic breast imaging is limited. This study aimed to evaluate the potential of AI-CAD in Hong Kong, focusing on the impact of radio-pathological phenotypes of breast cancer on AI-CAD performance. We analysed the distinctive characteristics of high-score versus low-score breast cancers. We also discuss observed strengths and limitations of AI-CAD in identifying breast cancer.
 
Methods
Study population
In total, 488 consecutive patients with histology-confirmed breast cancers were identified from a prospectively maintained database managed by two tertiary referral centres in Hong Kong during the period between January 2020 and September 2022. In our centres, all patients referred for diagnostic mammography were symptomatic, presenting with various clinical symptoms. We included patients with breast cancers confirmed by core needle biopsy under ultrasound guidance or stereotactic-guided vacuum-assisted breast biopsy performed at our centres. We excluded patients with diagnostic mammography performed at outside facilities (n=6), chest wall recurrence after mastectomy (n=14), cancers identified only in axillary nodes (n=3), tumour locations not feasible for mammography (n=3), and mammographically occult breast cancers undetectable by both reporting radiologists and AI-CAD (n=64) [Fig 1]. Finally, 398 patients with 414 breast cancers and 347 unaffected breasts were included in the study. Sixteen patients were diagnosed with bilateral breast cancer. Among the 382 patients with unilateral breast cancer, 35 had previously undergone contralateral mastectomy.
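The cohort counts in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and it assumes that each patient with bilateral disease contributes two cancers and each patient with unilateral disease contributes one.

```python
# Counts taken from the Study population paragraph.
identified = 488
exclusions = {
    "mammography performed at outside facilities": 6,
    "chest wall recurrence after mastectomy": 14,
    "cancer identified only in axillary nodes": 3,
    "tumour location not feasible for mammography": 3,
    "mammographically occult cancer": 64,
}
included_patients = identified - sum(exclusions.values())

# Assumption: two cancers per bilateral patient, one per unilateral patient.
bilateral_patients = 16
unilateral_patients = included_patients - bilateral_patients
cancers = 2 * bilateral_patients + unilateral_patients

# Unilateral patients with a prior contralateral mastectomy have no unaffected breast.
prior_contralateral_mastectomy = 35
unaffected_breasts = unilateral_patients - prior_contralateral_mastectomy

print(included_patients, unilateral_patients, cancers, unaffected_breasts)
# prints: 398 382 414 347
```

The derived totals agree with the 398 patients, 414 breast cancers, and 347 unaffected breasts reported in the text.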
 

Figure 1. Inclusion and exclusion criteria
 
Image acquisition and analysis
Full-field digital mammography (MAMMOMAT Inspiration; Siemens, Erlangen, Germany or Selenia Dimensions; Hologic, Newark [DE], US) was performed prior to each biopsy. The included mammograms were exported and processed by a commercial AI-CAD program (INSIGHT MMG, version 1.10.2; Lunit, Seoul, South Korea), which is approved by the US Food and Drug Administration for lesion detection and classification in breast imaging.9
 
The AI-CAD algorithm used in the current study was developed and validated through multinational studies.14 15 This algorithm provides a heatmap that highlights mammographic abnormalities and generates a score ranging from 0 to 100 for each view (craniocaudal and mediolateral oblique views). The abnormality score is the maximum value for each breast, reflecting the likelihood of malignancy.
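The per-breast scoring rule described above (one score per view, with the breast-level score taken as the maximum) can be sketched as follows. This is an illustration of the rule as stated, not the vendor's implementation; the function name and the example view scores are hypothetical.

```python
from typing import Mapping


def breast_abnormality_score(view_scores: Mapping[str, float]) -> float:
    """Return the per-breast score as the maximum of the per-view scores (0 to 100)."""
    if not view_scores:
        raise ValueError("at least one view score is required")
    for view, score in view_scores.items():
        if not 0 <= score <= 100:
            raise ValueError(f"score for view {view!r} is outside 0 to 100")
    return max(view_scores.values())


# Hypothetical scores for the craniocaudal (CC) and mediolateral oblique (MLO) views.
example_views = {"CC": 42.0, "MLO": 97.0}
print(breast_abnormality_score(example_views))  # prints: 97.0
```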
 
All mammograms were interpreted by radiologists subspecialising in breast radiology (with 4 to 20 years of experience in breast imaging). Mammography reports from the time of breast cancer diagnosis were retrieved from the radiology information system and retrospectively reviewed for breast density, dominant mammographic features of breast cancer, and any axillary lymphadenopathy. The clinical findings, pathological results, and molecular profiles of breast cancers were also recorded. Breast density was categorised from 1 to 4 using the BI-RADS (Breast Imaging Reporting and Data System) classification.16 The cancers were classified according to their dominant mammographic features as asymmetrical density, mass (with or without calcifications), calcifications alone, or architectural distortion.
 
For breast cancers presenting as a mass without calcifications, the tumour distances from the chest wall and the tumour-to–breast area ratio (TBAR) were measured in mammograms using the picture archiving and communication system by a radiologist with 2 years of experience in breast imaging. Tumour distance from the chest wall was defined as the shortest distance between the tumour and the pectoralis major in the mediolateral oblique view (Fig 2a). Tumours partially visible within the lower breast in the mediolateral oblique view, where the pectoralis muscle is not discernible, were assigned a chest wall distance of 0 cm. The TBAR was calculated via division of the tumour area by the breast area, as measured using the freehand region-of-interest tool (Fig 2b).
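The two measurements defined above reduce to a ratio and a simple rule, sketched below. The function names, the boolean flag, and the example areas are assumptions made for illustration; in the study both quantities were measured manually on the picture archiving and communication system.

```python
def tumour_to_breast_area_ratio(tumour_area: float, breast_area: float) -> float:
    """TBAR as a percentage: tumour area divided by breast area."""
    if breast_area <= 0:
        raise ValueError("breast area must be positive")
    if tumour_area < 0 or tumour_area > breast_area:
        raise ValueError("tumour area must lie between 0 and the breast area")
    return 100.0 * tumour_area / breast_area


def chest_wall_distance_cm(measured_cm: float, pectoralis_discernible: bool) -> float:
    """Shortest tumour-to-pectoralis distance on the mediolateral oblique view, in cm.

    Following the rule in the text, a tumour lying where the pectoralis
    muscle is not discernible is assigned a distance of 0 cm.
    """
    if not pectoralis_discernible:
        return 0.0
    if measured_cm < 0:
        raise ValueError("distance cannot be negative")
    return measured_cm


# Hypothetical region-of-interest areas in cm^2.
tbar = tumour_to_breast_area_ratio(tumour_area=12.0, breast_area=150.0)
print(f"TBAR = {tbar:.1f}%")  # prints: TBAR = 8.0%
```

Both areas must be drawn on the same view and in the same units for the ratio to be meaningful.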
 

Figure 2. (a) Index cancer (white arrows) and measurement of tumour distance from the chest wall on the picture archiving and communication system (PACS) [double arrow]. (b) Measurement of tumour-to–breast area ratio by freehand region-of-interest on the PACS, indicated by curved arrow (tumour area) and open arrows (breast area)
 
The radiologists matched the index lesion to the AI-CAD heatmap to determine whether the AI-CAD correctly localised the known cancer. When the cancer was correctly localised by the AI-CAD, an abnormality score of ≥30 was regarded as a true positive, whereas a score <30 was considered a false negative. When the cancer was undetected or incorrectly localised by the AI-CAD, this result was also regarded as a false negative. Breast cancers with abnormality scores of ≥90 and <30 were designated as ‘high-score tumour’ and ‘low-score tumour’, respectively.
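The classification rule in this paragraph can be written as a small function. This is a sketch of the rule as described, with illustrative names. The label "intermediate" for scores from 30 to below 90 is our own placeholder, and the score band is assigned from the score alone because the text does not specify how a mislocalised lesion with a high score would be banded.

```python
from typing import NamedTuple

DETECTION_THRESHOLD = 30.0   # score of >=30 counts as detected, if correctly localised
HIGH_SCORE_THRESHOLD = 90.0  # score of >=90 defines a high-score tumour


class Assessment(NamedTuple):
    detection: str   # "true positive" or "false negative"
    score_band: str  # "high-score tumour", "low-score tumour", or "intermediate"


def assess_cancer(score: float, correctly_localised: bool) -> Assessment:
    """Classify a known cancer from its abnormality score and heatmap localisation."""
    if not 0 <= score <= 100:
        raise ValueError("abnormality score must be between 0 and 100")
    detected = correctly_localised and score >= DETECTION_THRESHOLD
    detection = "true positive" if detected else "false negative"
    if score >= HIGH_SCORE_THRESHOLD:
        band = "high-score tumour"
    elif score < DETECTION_THRESHOLD:
        band = "low-score tumour"
    else:
        band = "intermediate"  # placeholder label, not a term used in the study
    return Assessment(detection, band)


print(assess_cancer(97.0, True))
print(assess_cancer(12.0, True))
print(assess_cancer(45.0, False))
```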
 
Statistical analysis
Abnormality scores are presented as medians with interquartile ranges. The scores were analysed according to patient symptoms, breast density, mammographic findings, cancer histology, and molecular profile using the Mann–Whitney U test or Kruskal–Wallis H test. The AI-CAD abnormality scores were divided into three intervals: 0 to <30, 30-90, and >90 to 100. The Chi-squared test and Mantel–Haenszel test for trend were used to analyse associations with different factors. For cancers presenting as a mass, mean abnormality scores across various TBARs and distances to the chest wall were evaluated using analysis of variance with pairwise comparisons. Statistical analyses were performed using SPSS (Windows version 26; IBM Corp, Armonk [NY], US). P values <0.05 were considered statistically significant.
 
Results
In total, 398 patients (mean age, 62.4 years; range, 35-100) with 414 breast cancers and 347 unaffected breasts were included in the study. The cohort consisted of two men and 396 women. Among the 414 breast cancer cases, 284 (68.6%) were palpable (Table 1).
 

Table 1. Median abnormality scores assigned by artificial intelligence–based computer-assisted diagnosis according to clinical, radiological, and pathological phenotypes of breast cancers (n=414)
 
Distribution of abnormality scores
The median and mean abnormality scores for the 414 breast cancers were 95.6 and 80.6, respectively (range, 0.4-99.9). The distribution of breast cancers according to abnormality score interval is presented in Figure 3. The sensitivity of the AI-CAD algorithm in detecting breast cancers was 91.5%, based on breast cancer identification using an abnormality score of ≥30. Overall, 65.7% of breast cancers were classified as high-score tumours, whereas 8.5% were classified as low-score tumours with abnormality scores <30; these low-score tumours were regarded as false-negative cases. Table 1 presents the medians and interquartile ranges of abnormality scores according to clinical, radiological, and pathological phenotypes.
 

Figure 3. Distribution of breast cancers according to abnormality score interval (n=414)
 
Palpable lesions, cancers in entirely fatty or scattered fibroglandular breasts, cancers presenting as masses with or without calcifications and architectural distortion, and larger cancers were associated with higher abnormality scores (all P<0.001) [Table 1]. Invasive cancers had higher abnormality scores compared with ductal carcinoma in situ (P=0.010). Axillary nodal status (P=0.078) and cancer molecular subtype (P=0.820) were not associated with abnormality scores (Table 2).
 

Table 2. Comparison of clinical, radiological, and pathological phenotypes of breast cancers between false-negative and true-positive results of artificial intelligence–based computer-assisted diagnosis (n=414)
 
Phenotypic features of high-score breast cancer
High-score breast cancers had higher prevalences of palpable disease, cancers presenting as masses with or without calcifications, invasive cancers, and larger cancers (>1 cm) [Table 2].
 
Phenotypic features of low-score, false-negative breast cancer
The false-negative rate for AI-CAD was 8.5% (35/414). These cancers had higher prevalences of non-palpable disease, cancers presenting as asymmetrical densities, small cancers (<1 cm), and locations in heterogeneously dense or extremely dense breast tissue.
 
Impact of tumour-to–breast area ratio and tumour distance from chest wall on abnormality score
Overall, 158 cancers presenting as masses without calcifications were included in this analysis. The mean abnormality score for cancers with a TBAR of ≥30% was significantly lower than for those with a TBAR of <30% (54.4 vs 86.7; P<0.001). Tumours bordering the chest wall (ie, distance of 0 cm from chest wall) demonstrated significantly lower abnormality scores compared with those located 1 cm and ≥2 cm away from the chest wall (mean, 65.5 vs 89.2 vs 87.2; P<0.001).
 
Distribution of abnormality scores for unaffected breasts
In the analysis of 347 unaffected breasts (regarded as negative findings by reporting radiologists), the median abnormality score was 0 (mean, 3.5; range, 0-81). Using a threshold score of 30, the false-positive rate was 3.7% (13/347), indicating 96.3% specificity. Most of these false positives (11/13) scored between 30 and 50; none scored >90. One case with known postoperative changes after breast-conserving surgery, in which mammographic findings had been stable for 10 years, was scored 81 by AI-CAD. One case with a breast cyst, confirmed via fine needle aspiration cytology, was scored 73.
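The reported sensitivity and specificity follow directly from the counts given in the Results. The sketch below reproduces them at the threshold score of 30; the helper name is illustrative. The negatives here are breasts regarded as unaffected by the reporting radiologists, as stated above.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


# Counts reported in the Results at a threshold score of 30.
cancers = 414
false_negatives = 35
true_positives = cancers - false_negatives            # 379
unaffected_breasts = 347
false_positives = 13
true_negatives = unaffected_breasts - false_positives  # 334

sensitivity = percent(true_positives, cancers)
false_negative_rate = percent(false_negatives, cancers)
specificity = percent(true_negatives, unaffected_breasts)
false_positive_rate = percent(false_positives, unaffected_breasts)

print(f"Sensitivity {sensitivity:.1f}%, false-negative rate {false_negative_rate:.1f}%")
print(f"Specificity {specificity:.1f}%, false-positive rate {false_positive_rate:.1f}%")
# prints: Sensitivity 91.5%, false-negative rate 8.5%
# prints: Specificity 96.3%, false-positive rate 3.7%
```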
 
Discussion
Performance and potentials
Most AI-CAD algorithms provide heatmaps with abnormality scores ranging from 0 to 100; a higher score generally implies a greater likelihood of cancer. Previous AI-CAD studies have used various threshold scores; some set a threshold of 10 for population screening,18 19 20 whereas Weigel et al21 set a threshold of 28 for detecting malignant calcifications. However, the clinical implications of the abnormality score itself have not been clarified; a score range from 10 to 100 may be too broad for distinguishing malignancies in clinical practice. These aspects highlight the need for further validation of the appropriate reference score provided by AI-CAD algorithms. In this study, we set the threshold at 30 because, unlike population screening approaches, our patients were symptomatic individuals. A higher threshold score appears more practical in the clinical setting of symptomatic patients.
 
In our study, the AI-CAD algorithm detected 91.5% (379/414) of breast cancers with an abnormality score of >30; of these 379 cancers, 71.7% exhibited a high abnormality score of >90. The false-negative rate of 8.5% is comparable to the previously reported rate for this AI-CAD algorithm.5
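
The detection and false-negative rates quoted above are complementary and can be checked from the same two counts. The following is a minimal sketch under that assumption, using only the reported figures (379 detected of 414 cancers); the function and key names are hypothetical.

```python
def detection_summary(detected: int, total_cancers: int) -> dict:
    """Summarise detection performance from simple counts."""
    if total_cancers <= 0 or not 0 <= detected <= total_cancers:
        raise ValueError("counts must satisfy 0 <= detected <= total_cancers, total > 0")
    missed = total_cancers - detected
    return {
        "missed": missed,
        "detection_rate": detected / total_cancers,
        "false_negative_rate": missed / total_cancers,
    }


# Counts taken from the text: 379 of 414 cancers scored above the threshold.
summary = detection_summary(379, 414)

print(summary["missed"])                          # 35
print(f"{summary['detection_rate']:.1%}")         # 91.5%
print(f"{summary['false_negative_rate']:.1%}")    # 8.5%
```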
 
All cancers presenting as architectural distortion in our study were correctly localised by the AI-CAD, with abnormality scores >30; 87.5% of them were assigned high abnormality scores of >90 (Fig 4a and b). Unlike cancers presenting as masses or calcifications, cancers presenting as architectural distortion remain challenging for radiologists to detect and interpret.22 23 24 Wan et al25 showed that a standalone AI algorithm did not outperform radiologists; however, with AI assistance, junior radiologists demonstrated significant improvements in diagnostic accuracy for architectural distortion.
 

Figure 4. Cases illustrating the strengths of artificial intelligence–based computer-assisted diagnosis (AI-CAD). (a) A 52-year-old woman presenting with a right breast mass. The mediolateral-oblique mammographic view shows an architectural distortion (white arrow) in the upper right breast. (b) The AI-CAD program successfully detected this asymmetrical distortion within heterogeneously dense breast tissue, assigning a high abnormality score of 97. (c) A 55-year-old woman with a subtle asymmetrical density, identified as ductal carcinoma in situ on biopsy. The mediolateral-oblique mammographic view shows a subtle asymmetrical density (white arrow) in the upper left breast. The reporting radiologist did not detect the lesion on mammography but detected it via concurrent diagnostic ultrasound. (d) The AI-CAD program detected the subtle asymmetrical density, assigning an abnormality score of 68. (e) A 50-year-old woman with bilateral polyacrylamide gel implants presenting with a small lump in the left breast. The mediolateral-oblique mammographic view shows that the gel had been injected into various layers of the anterior chest wall (behind and within breast tissue, subcutaneous layer, and muscle). A subtle group of amorphous calcifications is visible in the upper left breast (white arrow). (f) The AI-CAD program detected these grouped calcifications in the context of breast augmentation, assigning a high abnormality score of 91
 
One case of breast cancer presenting as asymmetric density in heterogeneously dense breast tissue was missed by the reporting radiologist but detected by AI-CAD, which assigned an abnormality score of 68. The cancer was later identified by the radiologist via ultrasound, which is part of routine workup for symptomatic patients in our centre. Retrospective review indicated that the asymmetric density was visible on mammography (Fig 4c and d). In a study by Kim et al,26 40 of 128 mammographically occult breast cancers were correctly identified by the AI algorithm, demonstrating its added value in detecting such cancers.
 
The 64 cases of mammographically occult breast cancer not detected by either the AI-CAD or the radiologists were excluded from the study. Of these cases, 84.3% were found in heterogeneously dense and extremely dense breast tissue (BI-RADS 3 and 4).16 Dense breast tissue is recognised as a significant feature associated with mammographically occult and missed cancers.27 28 29 30 We suspect that mammographic signs of cancer are masked or obscured by dense breast parenchyma, thus evading detection by the AI-CAD. Conversely, both radiologists and the AI-CAD tended to detect cancers more effectively in fatty breasts.18
 
In our study, the AI-CAD correctly localised a small breast cancer with a high abnormality score (>90) in a patient with polyacrylamide hydrogel (PAAG)–injected augmentation mammoplasty (Fig 4e and f). The diagnosis of breast cancer after PAAG-injected augmentation mammoplasty is challenging. Lesion visualisation may be masked by the presence of polyacrylamide gel, and extravasated polyacrylamide gel may mimic a lesion on mammography, potentially delaying early cancer detection. In such cases, assessments of suspicious calcifications and parenchymal distortion within visible breast parenchyma are considered the main goals of screening mammography.31 32 The effectiveness of AI-CAD in detecting breast cancer among patients with augmentation mammoplasty remains uncertain, warranting further studies.
 
Detection challenges and future directions
Isolated cases of large, clearly visible lesions that evaded AI detection have been described by Lång et al33 and Choi et al.18 To our knowledge, our study is the first to investigate factors contributing to such evasion. In this study, the AI algorithm tended to assign lower abnormality scores to cancers presenting as large masses (Fig 5a and b). Cancers with a TBAR of ≥30% had significantly lower mean abnormality scores relative to those with a ratio of <30%. Tumours bordering the chest wall (0 cm distance) also showed significantly lower abnormality scores than those located away from the chest wall. The underlying cause remains unclear; however, these findings highlight concerns regarding the use of AI-CAD as a standalone tool for triaging cases in symptomatic populations. We also noted that the AI-CAD missed certain cancers with obvious findings, such as nipple retraction and diffuse dermal thickening (Fig 5c to f).
 

Figure 5. Cases illustrating the limitations of artificial intelligence–based computer-assisted diagnosis (AI-CAD). (a) A 48-year-old woman presenting with a left breast mass. The craniocaudal mammographic view shows a retracted left breast mostly replaced by a large, irregular, high-density mass with dermal infiltration and suspected pectoralis involvement. (b) The AI-CAD program detected the tumour but assigned it a low abnormality score of 30. (c) A 48-year-old woman presenting with a right breast mass. The mediolateral-oblique mammographic view shows an irregular mass with indistinct margins in the periareolar region of the right breast with nipple retraction (white arrow). (d) The AI-CAD program correctly localised the right breast mass but assigned a low abnormality score of 19, despite the presence of nipple retraction. (e) A 57-year-old woman presenting with a right breast mass. The mediolateral-oblique mammographic view shows a large right breast mass with diffuse skin thickening (white arrows). (f) The AI-CAD program detected the breast mass but assigned a low abnormality score of 32, despite the presence of diffuse skin thickening. (g) A 62-year-old woman with a history of breast-conserving surgery for breast cancer exhibited local recurrence on surveillance mammography. The previous mediolateral-oblique mammographic view shows postoperative changes and macrocalcification in the upper right breast; no suspicious lesion was identified. (h) The follow-up mediolateral-oblique mammographic view shows a newly developed small, irregular mass (white arrow) in the upper right breast adjacent to the macrocalcification; biopsy confirmed invasive carcinoma. (i) The AI-CAD program did not detect this lesion, assigning a low abnormality score of 8.
 
Moreover, the inability of AI-CAD to compare mammograms with previous studies may hinder its effectiveness in specific scenarios, such as the detection of subtle developing asymmetries and identification of early recurrence in postoperative cases (Fig 5g to i). In contrast, radiologists can compare mammograms with previous studies, improving mammogram interpretation accuracy.
 
Studies have shown that the diagnostic performances of AI algorithms are comparable to those of radiologists in terms of assessing screening mammograms; the use of AI to triage screening mammograms could potentially reduce radiologists’ workload.5 34 35 We identified potential limitations and weaknesses of AI-CAD in diagnosing breast cancers under certain conditions, highlighting the need for further large-scale studies to investigate clinical applications of AI-CAD in symptomatic patients.
 
Strengths and limitations
This study had several key strengths. To our knowledge, it is the first to evaluate AI-CAD for breast cancer detection in Hong Kong, using an AI-CAD system that had not previously been exposed to images from our centres during its product development. Additionally, all digital mammograms were obtained before biopsies, avoiding any biopsy-related changes that could potentially affect AI-CAD performance. Limitations of the study include its retrospective design and inclusion of cancer-enriched datasets, which may lead to overestimation of AI-CAD performance; the use of a single AI vendor, hindering applicability to other AI algorithms; and the lack of BI-RADS correlation. Furthermore, there was a lack of information concerning progression in unaffected breasts over an extended follow-up interval (≥2 years), which could impact the false-positive rate of the AI-CAD. An extended observation period is needed to identify potential malignancies that may have been initially missed by radiologists.
 
Conclusion
Unlike other developed cities or countries, Hong Kong does not have population-based screening programmes. The adoption and implementation of AI programs in Hong Kong for breast imaging remain in the early stages, mainly due to ongoing debates about efficacy and a lack of sufficient local data to support widespread application. Current literature is almost entirely based on population screening data, which may not be applicable to cities without screening programmes. In our study, AI-CAD demonstrated promising accuracy in detecting breast cancers within symptomatic settings; its performance varied according to radio-pathological characteristics. To translate these research findings into practical clinical applications, further validation studies with larger sample sizes are required; these would confirm the reliability of AI-CAD systems. The development of protocols for integrating AI-CAD into existing clinical workflows, formulation of usage guidelines, and initiation of training programmes for radiologists to effectively utilise AI as a second reader are essential elements of this process. Collaborations with information technology departments and hospital management are necessary to ensure successful integration. Although further investigation is needed, this study provides encouraging evidence to support the use of AI-CAD as a breast cancer detection tool in symptomatic settings, ultimately benefitting patients.
 
Author contributions
Concept or design: SM Yu, MNY Choi, EHY Hung, HHL Chau.
Acquisition of data: SM Yu, MNY Choi, TH Chan, CYM Young, YH Chan, YS Chan, C Tsoi.
Analysis or interpretation of data: SM Yu, TH Chan, J Leung.
Drafting of the manuscript: SM Yu, CYM Young.
Critical revision of the manuscript for important intellectual content: SM Yu, CYM Young, WCW Chu, EHY Hung, HHL Chau.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the New Territories East Cluster Research Ethics Committee/Institutional Review Board of Hospital Authority, Hong Kong (Ref No.: NTEC-2023-074). The requirement for informed patient consent was waived by the Committee due to the retrospective nature of the research.
 
References
1. Tabár L, Vitak B, Chen HH, Yen MF, Duffy SW, Smith RA. Beyond randomized controlled trials: organized mammographic screening substantially reduces breast carcinoma mortality. Cancer 2001;91:1724-31.
2. Majid AS, de Paredes ES, Doherty RD, Sharma NR, Salvador X. Missed breast carcinoma: pitfalls and pearls. Radiographics 2003;23:881-95.
3. Fenton JJ, Taplin SH, Carney PA, et al. Influence of computer-aided detection on performance of screening mammography. N Engl J Med 2007;356:1399-409.
4. Lehman CD, Wellman RD, Buist DS, et al. Diagnostic accuracy of digital screening mammography with and without computer-aided detection. JAMA Intern Med 2015;175:1828-37.
5. Dembrower K, Wåhlin E, Liu Y, et al. Effect of artificial intelligence–based triaging of breast cancer screening mammograms on cancer detection and radiologist workload: a retrospective simulation study. Lancet Digit Health 2020;2:e468-74.
6. Raya-Povedano JL, Romero-Martín S, Elías-Cabot E, Gubern-Mérida A, Rodríguez-Ruiz A, Álvarez-Benito M. AI-based strategies to reduce workload in breast cancer screening with mammography and tomosynthesis: a retrospective evaluation. Radiology 2021;300:57-65.
7. Erickson BJ, Korfiatis P, Kline TL, Akkus Z, Philbrick K, Weston AD. Deep learning in radiology: does one size fit all? J Am Coll Radiol 2018;15(3 Pt B):521-6.
8. Schünemann HJ, Lerda D, Quinn C, et al. Breast cancer screening and diagnosis: a synopsis of the European breast guidelines. Ann Intern Med 2020;172:46-56.
9. Bahl M. Artificial intelligence: a primer for breast imaging radiologists. J Breast Imaging 2020;2:304-14.
10. Hickman SE, Woitek R, Le EP, et al. Machine learning for workflow applications in screening mammography: systematic review and meta-analysis. Radiology 2022;302:88-104.
11. Leibig C, Brehmer M, Bunk S, Byng D, Pinker K, Umutlu L. Combining the strengths of radiologists and AI for breast cancer screening: a retrospective analysis. Lancet Digit Health 2022;4:e507-19.
12. Intelligence Unit, The Economist. Breast cancer in Asia—the challenge and response. A report from the Economist Intelligence Unit. 2016. Available from: https://www.eiuperspectives.economist.com/sites/default/files/EIU Breast Cancer in Asia_Final.pdf. Accessed 19 Nov 2017.
13. Lim YX, Lim ZL, Ho PJ, Li J. Breast cancer in Asia: incidence, mortality, early detection, mammography programs, and risk-based screening initiatives. Cancers (Basel) 2022;14:4218.
14. Salim M, Wåhlin E, Dembrower K, et al. External evaluation of 3 commercial artificial intelligence algorithms for independent assessment of screening mammograms. JAMA Oncol 2020;6:1581-8.
15. Kim HE, Kim HH, Han BK, et al. Changes in cancer detection and false-positive recall in mammography using artificial intelligence: a retrospective, multireader study. Lancet Digit Health 2020;2:e138-48.
16. D’Orsi CJ, Sickles EA, Mendelson EB, et al. ACR BI-RADS® Atlas, Breast Imaging Reporting and Data System. Reston, VA: American College of Radiology; 2013.
17. McKinney SM, Sieniek M, Godbole V, et al. International evaluation of an AI system for breast cancer screening. Nature 2020;577:89-94.
18. Choi WJ, An JK, Woo JJ, Kwak HY. Comparison of diagnostic performance in mammography assessment: radiologist with reference to clinical information versus standalone artificial intelligence detection. Diagnostics (Basel) 2022;13:117.
19. Lee SE, Han K, Yoon JH, Youk JH, Kim EK. Depiction of breast cancers on digital mammograms by artificial intelligence–based computer-assisted diagnosis according to cancer characteristics. Eur Radiol 2022;32:7400-8.
20. Koch HW, Larsen M, Bartsch H, Kurz KD, Hofvind S. Artificial intelligence in BreastScreen Norway: a retrospective analysis of a cancer-enriched sample including 1254 breast cancer cases. Eur Radiol 2023;33:3735-43.
21. Weigel S, Brehl AK, Heindel W, Kerschke L. Artificial intelligence for indication of invasive assessment of calcifications in mammography screening [in English, German]. Rofo 2023;195:38-46.
22. Suleiman WI, McEntee MF, Lewis SJ, et al. In the digital era, architectural distortion remains a challenging radiological task. Clin Radiol 2016;71:e35-40.
23. Babkina TM, Gurando AV, Kozarenko TM, Gurando VR, Telniy VV, Pominchuk DV. Detection of breast cancers represented as architectural distortion: a comparison of full-field digital mammography and digital breast tomosynthesis. Wiad Lek 2021;74:1674-9.
24. Alshafeiy TI, Nguyen JV, Rochman CM, Nicholson BT, Patrie JT, Harvey JA. Outcome of architectural distortion detected only at breast tomosynthesis versus 2D mammography. Radiology 2018;288:38-46.
25. Wan Y, Tong Y, Liu Y, et al. Evaluation of the combination of artificial intelligence and radiologist assessments to interpret malignant architectural distortion on mammography. Front Oncol 2022;12:880150.
26. Kim HJ, Kim HH, Kim KH, et al. Mammographically occult breast cancers detected with AI-based diagnosis supporting software: clinical and histopathologic characteristics. Insights Imaging 2022;13:57.
27. Lian J, Li K. A review of breast density implications and breast cancer screening. Clin Breast Cancer 2020;20:283-90.
28. Freer PE. Mammographic breast density: impact on breast cancer risk and implications for screening. Radiographics 2015;35:302-15.
29. Arora N, King TA, Jacks LM, et al. Impact of breast density on the presenting features of malignancy. Ann Surg Oncol 2010;17 Suppl 3:211-8.
30. Ma L, Fishell E, Wright B, Hanna W, Allan S, Boyd NF. Case-control study of factors associated with failure to detect breast cancer by mammography. J Natl Cancer Inst 1992;84:781-5.
31. Cheng NX, Liu LG, Hui L, Chen YL, Xu SL. Breast cancer following augmentation mammaplasty with polyacrylamide hydrogel (PAAG) injection. Aesthetic Plast Surg 2009;33:563-9.
32. Teo SY, Wang SC. Radiologic features of polyacrylamide gel mammoplasty. AJR Am J Roentgenol 2008;191:W89-95.
33. Lång K, Dustler M, Dahlblom V, Åkesson A, Andersson I, Zackrisson S. Identifying normal mammograms in a large screening population using artificial intelligence. Eur Radiol 2021;31:1687-92.
34. Rodriguez-Ruiz A, Lång K, Gubern-Merida A, et al. Stand-alone artificial intelligence for breast cancer detection in mammography: comparison with 101 radiologists. J Natl Cancer Inst 2019;111:916-22.
35. Rodríguez-Ruiz A, Krupinski E, Mordang JJ, et al. Detection of breast cancer with mammography: effect of an artificial intelligence support system. Radiology 2019;290:305-14.
