Improving efficiency and effectiveness of workplace-based assessment workshop in postgraduate medical education using a conjoint design

Hong Kong Med J 2025;31:Epub 9 Dec 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
HY So, FHKAM (Anaesthesiology), MHPE1; Eddy WY Wong, FHKCORL, FRCSEd (ORL)2; Albert KM Chan, FHKCA, MHPE1; George KC Wong, MD, FCSHK1; Jessica YP Law, FHKCOG, MHQS (Harvard)3; PT Chan, FHKCOS, MMEd1; CM Ngai, FHKCORL, FRCS (Edin)2
1 The Jockey Club Institute for Medical Education and Development, Hong Kong Academy of Medicine, Hong Kong SAR, China
2 The Hong Kong College of Otorhinolaryngologists, Hong Kong SAR, China
3 Department of Obstetrics and Gynaecology, Pamela Youde Nethersole Eastern Hospital, Hong Kong SAR, China
 
Corresponding author: Dr HY So (sohingyu@fellow.hkam.hk)
 
 
Abstract
Introduction: Faculty development for trainers and nurturing feedback literacy in trainees are crucial for effective workplace-based assessments (WBAs) to support trainee competency development. Separate training sessions for trainers and trainees can be challenging when resources are limited. Combined training can optimise resources and foster mutual understanding, although such approaches face challenges related to power dynamics. This study aimed to evaluate the effectiveness of a conjoint WBA workshop in enhancing trainer engagement, improving trainee feedback literacy, and exploring the benefits and challenges of integrating trainers and trainees in a shared learning environment.
 
Methods: A mixed-methods study was conducted with 13 trainers and five trainees from the Hong Kong College of Otorhinolaryngologists. Quantitative data were collected using the Feedback Literacy Behaviour Scale for trainees and the Continuing Professional Development–Reaction Questionnaire for trainers. Pre- and post-intervention comparisons were analysed using paired t tests. Qualitative data from focus group interviews were thematically analysed.
 
Results: Quantitative analysis showed statistically significant increases in trainee feedback literacy (P<0.001) and improvements in trainers’ beliefs about capabilities and engagement intentions (P<0.05). The qualitative analysis supported these findings and identified three key factors: mutual understanding, clarification of the WBA purpose, and effective instructional design. Participants valued the mutual understanding fostered in the conjoint setting, which aligned expectations and created a supportive learning environment.
 
Conclusion: Conjoint WBA workshops may effectively promote trainer engagement and trainee feedback literacy, aligning expectations and fostering a positive feedback culture. Further research is needed to explore the longitudinal impact and applicability to other specialties.
 
 
New knowledge added by this study
  • Trainers and trainees learning together in the same workplace-based assessment (WBA) workshop facilitates effective mutual learning.
  • Despite potential power dynamics, psychological safety can be maintained in this setting.
  • Collaboration strengthens trainees’ trust in the value of WBA as a tool for learning.
Implications for clinical practice or policy
  • Conjoint training can be considered an alternative for organising WBA workshops.
  • The Hong Kong Academy of Medicine should support further studies on this design to enhance the effectiveness of WBA workshops.
 
 
Introduction
Competency-based medical education (CBME) emphasises the assessment of trainees through direct observation and feedback using workplace-based assessments (WBA).1 These assessments are designed to support continuous learning and competency development through meaningful feedback.2 Effective implementation of WBA requires trainers who are willing and able to provide constructive feedback,3 4 5 and trainees who are motivated to seek and use feedback. This active engagement with feedback is the essence of feedback literacy, defined by Carless and Boud6 as “the understandings, capacities, and dispositions needed to make sense of information and use it to enhance work or learning strategies”. The construct of intention, based on the theory of planned behaviour, highlights that an individual’s willingness to perform a behaviour is influenced by their attitudes, subjective norms, and perceived behavioural control.7 Intention is emphasised as the best predictor of behaviour, especially where constraints or barriers exist. In the context of WBA, focusing on intention helps us understand the underlying motivations and readiness of trainers and trainees to engage in feedback practices. Trainers’ intentions are shaped by their beliefs about the value of feedback, the expectations of peers, and their confidence in their ability to provide that feedback. Dawson et al,8 building on the works of Carless and Boud6 and Molloy et al,9 conceptualised feedback literacy as five key skills: seeking feedback, making sense of information, using feedback, managing emotional responses, and providing feedback. Based on this framework, effective training is essential for fostering engagement and capability in meaningful feedback practices.
 
Faculty development is often implemented to enhance trainers’ skills, whereas separate sessions aim to build feedback literacy among trainees. However, specialties with small numbers of trainers and trainees face unique challenges in implementing WBA, including limited opportunities to conduct separate training sessions. A conjoint WBA workshop, where both groups train together, may offer an innovative solution to these constraints. Potential benefits include promoting mutual understanding, aligning feedback practices, and fostering a consistent approach to WBA implementation.10 However, concerns regarding power imbalances and psychological safety in mixed-group settings could undermine its effectiveness.11 Thus far, there have been no studies regarding such conjoint workshops; the actual participant experience, including potential advantages and disadvantages, remains unexplored.
 
Therefore, this study aimed to address the following research questions:
  1. Can conjoint training improve the intention of trainers to participate in WBA?
  2. Can conjoint training improve the feedback literacy of trainees?
  3. What are the experiences of trainers in a conjoint training setting?
  4. What are the experiences of trainees in a conjoint training setting?
 
Methods
This study was designed according to the requirements of the SQUIRE-EDU (Standards for QUality Improvement Reporting Excellence in Education) guidelines for educational improvement.12
 
Study setting
The study was conducted with trainers and trainees of the Hong Kong College of Otorhinolaryngologists (HKCORL), a specialty college under the Hong Kong Academy of Medicine. The HKCORL is responsible for training and accrediting specialists in otorhinolaryngology, and has been integrating WBAs into its training curriculum since 2021. The College currently has a total of 206 fellows, 57 of whom are trainers. In May 2023, 20 trainers participated in a WBA workshop specifically designed for them. During the first 2 years, basic surgical trainees are under the Hong Kong Intercollegiate Board of Surgical Colleges and rotate through different surgical specialties. Specialist training in otorhinolaryngology takes place only during the 4 years of higher training. Over the past 5 years, the annual intake of higher trainees has ranged from four to 11. Currently, there are 31 higher trainees, 26 of whom participated in a WBA workshop for trainees held in September 2023. Relationships among fellows and trainees are strengthened through regular training courses, academic lectures, workshops, and an annual scientific meeting, complemented by active participation from the Young Fellows Chapter to enhance engagement in College activities. Camaraderie is also fostered through sports activities and social events.
 
Participant sampling and recruitment
All participants in the workshop were invited by email to participate in this study on a voluntary basis. All 13 trainers and five trainees enrolled in the workshop volunteered to participate in the study. The cohort of trainers was relatively young; 11 were within 10 years of obtaining their fellowship, and seven had only 1 to 2 years of experience as specialists.
 
Instructional design
The 4-hour workshop was designed based on the first principles of instruction, emphasising task-centred learning as the core instructional approach.13 Participants engaged in two authentic learning tasks: procedural-based assessment and case-based discussion, each followed by guided reflection. These tasks provided opportunities to practise giving and receiving feedback, which was the main focus of the workshop.
 
To prepare for these tasks, participants first completed a pre-course e-learning module consisting of five interactive videos (total duration: 53 minutes). These videos introduced essential concepts, including CBME, self-regulated learning, feedback literacy, and the procedures of WBA. The workshop began with an activity to establish psychological safety, following the recommendations of Rudolph et al,14 ensuring that participants felt comfortable to learn and engage openly. Subsequently, participants’ knowledge was reactivated through interactive lectures and demonstrations, effectively preparing them for the practice activities.
 
Quantitative measures
  1. Trainee feedback literacy: The Feedback Literacy Behaviour Scale was used to assess changes in trainees’ feedback literacy. It measures five subscales: Seeking Feedback, Making Sense of Feedback, Using Feedback, Providing Feedback, and Managing Affect.8
  2. Trainer engagement in WBA: Trainers’ engagement was measured using the Continuing Professional Development (CPD)–Reaction Questionnaire, based on social cognitive theories (theory of planned behaviour and Triandis’ theory of interpersonal behaviour). It measures intention, social influence, beliefs about capabilities, beliefs about consequences, and moral norms.7 15 16
 
Both surveys were administered before participants began their e-learning and repeated after completion of the workshop.
 
Statistical analysis
Paired t tests were utilised to compare pre- and post-intervention scores for both groups because this method offers more precise estimates of the effect and improved control over confounding variables compared with an unpaired t test, particularly given the small sample size. Descriptive statistics, including means, standard deviations, and Cohen’s d effect sizes, were calculated for each measure using Jamovi (desktop version 2.3.28).17
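
The paired comparison described above can be illustrated with a minimal sketch. The study itself used Jamovi; the Python code below is only a stand-in, and the function name `paired_summary` and the scores are hypothetical and are not the study data. It assumes that differences are taken as pre minus post and that Cohen's d is the mean difference divided by the standard deviation of the differences. This convention would be consistent with negative d values accompanying score increases, but the source does not state which convention was applied.

```python
from math import sqrt
from statistics import mean, stdev


def paired_summary(pre, post):
    """Return the paired t statistic, degrees of freedom and Cohen's d.

    Assumption: differences are computed as pre - post, so an increase
    after the intervention yields negative t and d values.
    """
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("pre and post must have equal length with n >= 2")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD of the differences (n - 1 denominator)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t and d are undefined")
    t_stat = mean_diff / (sd_diff / sqrt(n))  # paired t statistic
    cohens_d = mean_diff / sd_diff  # effect size for paired samples
    return {"t": t_stat, "df": n - 1, "d": cohens_d}


# Hypothetical scores for illustration only; NOT the study data.
pre_scores = [10, 12, 11, 13, 9]
post_scores = [12, 15, 13, 14, 12]
result = paired_summary(pre_scores, post_scores)
print(result)
```

The P value follows from comparing the t statistic with a t distribution on n − 1 degrees of freedom, which a statistical package would normally report directly.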
 
Qualitative data collection and analysis
Separate focus group interviews were conducted in Cantonese for trainers and trainees immediately after the workshop. The two moderators were research staff trained by the authors. Semi-structured interviews were conducted using an interview guide created by the authors (online Appendix). The interviews were audio-recorded, anonymised, and transcribed verbatim. Transcripts were analysed using Braun and Clarke’s thematic analysis approach,18 assisted by ATLAS.ti software (version 8.4.5; ATLAS.ti Scientific Software Development, Berlin, Germany).19
 
Member checking
To enhance the credibility of the qualitative findings, results were sent back to participants after thematic analysis to confirm whether they agreed with the interpretation and whether they wished to share additional views.
 
Reflexivity
The first author, an intensivist and educationist with a Master’s degree in Health Professions Education, played a key role in designing the conjoint workshop and framing WBA as a learning tool. The second author, a consultant otorhinolaryngologist and CBME advocate, proposed the joint training concept to address challenges in organising separate trainer and trainee sessions. Support from the seventh author, president of HKCORL, was critical for workshop implementation. Other authors contributed diverse clinical and educational expertise: the third author, a consultant anaesthetist and faculty development chair of the Jockey Club Institute for Medical Education and Development of the Hong Kong Academy of Medicine; the fifth author, an obstetrics and gynaecology consultant with expertise in healthcare quality and simulation; the sixth author, an orthopaedic surgeon and former college censor; and the fourth author, a neurosurgeon experienced in WBA workshops.
 
Their collective advocacy for CBME and WBA informed the study design and interpretation. While offering rich, multifaceted insights into WBA, this commitment may have influenced the emphasis on the conjoint workshop’s benefits, shaping research questions and conclusions accordingly.
 
Results
Quantitative findings
Among the trainees, the total Feedback Literacy Score significantly increased (pre=96.8 ± 4.04, post=125.2 ± 9.93; P<0.001), associated with a large effect size (d= –3.488). There was no statistically significant difference in the subscales of the Feedback Literacy Score (Table 1).
 

Table 1. Trainee feedback literacy scores
 
Among the trainers, the CPD–Reaction Scores showed statistically significant improvement in intention (pre=10.27 ± 1.65, post=11.09 ± 1.88; P=0.036), beliefs about capabilities (pre=15.55 ± 2.01, post=16.73 ± 2.25; P=0.015), beliefs about consequences (pre=10.27 ± 1.65, post=11.45 ± 1.88; P=0.049), and total score (pre=60.18 ± 5.04, post=65.82 ± 5.93; P=0.008). The effect sizes were moderate to large for intention (d= –0.750), moderate for beliefs about capabilities (d= –0.543) and beliefs about consequences (d= –0.631), and large for the total score (d= –0.801) [Table 2].
 

Table 2. Trainer Continuing Professional Development–Reaction Scores
 
Qualitative findings
Trainee focus group analysis
Four themes were identified: understanding WBA assessment, enhancing feedback literacy, presence of trainers in the workshop, and workshop design and delivery. Subthemes and quotations under each theme are listed in online supplementary Table 1.
 
Trainer focus group analysis
Four themes were identified: perceptions of WBA, improvement in feedback skills, presence of trainees in the workshop, and workshop design and delivery. Subthemes and quotations under each theme are listed in online supplementary Table 2.
 
Discussion
This mixed-methods study evaluated the impact of a conjoint WBA workshop designed to enhance both trainer intention to participate in WBA and trainee feedback literacy. The quantitative and qualitative data converged to show that the conjoint workshop improved trainer intention and appreciation of feedback skills; it also enhanced trainee feedback literacy and confidence in managing feedback during their learning process. Specifically, the quantitative results showed statistically significant improvement in trainer intention to participate in WBA as measured by the CPD–Reaction Questionnaire, and in trainee feedback literacy as measured by the Feedback Literacy Behaviour Scale. Moreover, the qualitative findings suggested that trainers appreciated the use of open-ended questions and integration of feedback into micro-moments as valuable strategies, whereas trainees reported increased confidence in managing feedback and constructively applying it to their learning processes.
 
Through analysis of the qualitative data, we also identified three key factors that contributed to these findings: mutual understanding between trainers and trainees, clarification of the purpose of WBA, and effective instructional design.
 
Mutual understanding between trainers and trainees
A key finding of this study was the positive reception of the mixed-group learning experience. Both trainers and trainees valued the opportunity to directly engage with each other, which fostered mutual understanding of the assessment process and reduced discrepancies in feedback practices. Notably, the absence of prominent power dynamics was striking. This may be partially attributed to the relatively young cohort of trainers, which likely fostered a more collaborative atmosphere. Although previous literature suggests that hierarchical structures can hinder open communication in feedback settings,11 the present study demonstrated that in contexts with flatter hierarchies, conjoint workshops can be highly effective. Trainees indicated that the emphasis on psychological safety during the workshop helped prepare them for meaningful participation. Adherence to the recommendations of Rudolph et al14 to establish a safe environment likely contributed to this positive outcome. The close relationships already present between trainers and trainees within this small specialty could also have contributed. Existing literature supports the importance of trainer–trainee relationships in WBA.4 20 Interactions within this psychologically safe environment facilitated a more unified understanding of assessment standards and expectations, which helped minimise discrepancies in feedback practices. This alignment fostered trust that both trainers and trainees were working towards the shared goal of using WBA for learning purposes.
 
Our qualitative findings indicated that both groups reported a highly positive experience. The distinction lay in the focus: trainees emphasised gains in feedback literacy and confidence, whereas trainers valued new practical strategies and enhanced mutual understanding. According to the conceptual model of Castanelli et al,21 the level of trust in supervisors influences trainees’ perceptions of WBA. When trust is low, WBAs are regarded as performance evaluations, leading trainees to adopt risk-minimising strategies.22 Conversely, when trust is high, trainees perceive WBA as an assessment for learning, making them more willing to embrace vulnerability. Our findings suggest that, with appropriate measures to ensure psychological safety, a combined workshop setting may help align expectations, create a shared understanding of WBA practices, and strengthen trainees’ trust in their trainers.
 
Clarification of the purpose of workplace-based assessment
Both trainers and trainees recognised that WBA serves as a formative tool that guides reflective practice and enhances clinical competence. This understanding is crucial because it aligns with the principles of adult learning, particularly the notion that adults are self-directed learners who take responsibility for their own education.23 When both trainers and trainees appreciate that WBA facilitates reflective practice, they engage in self-directed learning by utilising feedback to critically analyse their clinical performance. This process empowers them to identify areas for improvement and take actionable steps towards enhancing their skills. Moreover, adults are motivated to learn when the material is directly relevant to their professional needs.23 In this context, WBA’s role in guiding clinical competence is highly pertinent because it connects seamlessly with daily practice. Thus, WBA not only fosters a culture of continuous improvement but also effectively motivates adult learners by linking assessment to professional development. However, motivation alone is insufficient. Participants also noted barriers such as time constraints in the clinical setting and the need for effective evaluation of outcomes. These issues must be addressed to ensure that motivation remains long-lasting and that trainees continue to meaningfully engage with WBAs in their everyday practice.
 
Effective instructional design
The workshop was designed based on the first principles of instruction, an evidence-based model that emphasises moving beyond memorisation to active knowledge application through real-world tasks.13 24 This approach encourages learners to engage in practice, which is often challenging and requires specific support. To address this, support is twofold: cognitive and affective. Cognitive support helps learners understand key concepts through pre-course e-learning, reactivation of prior knowledge, demonstration, and facilitated reflection.13 Affective support focuses on ensuring psychological safety, which is crucial for effective engagement in practice.14 While overall improvement reflects the combined effect of e-learning and the workshop, the qualitative data indicate that the interactive, conjoint nature of the workshop itself was the primary catalyst for enhancing mutual understanding and feedback skills. Our analysis revealed that participants valued this design and highlighted two additional elements that supported their learning: cognitive aids and peer feedback.
 
During the course, we used cognitive aids to remind participants of a six-step framework for WBA (Fig), and they found the use of such a framework effective. Workplace-based assessments consist of recurrent constituent skills (the steps to follow) and non-recurrent constituent skills (eg, how to respond in the debriefing conversation). The use of a structured framework and just-in-time information, such as cognitive aids, has been shown to effectively support the learning of recurrent skills.25
 

Figure. Cognitive aid: the six steps of workplace-based assessment
 
During the guided reflection, we also engaged participants in peer feedback. Our analysis showed that participants found this practice enhanced their learning. Peer feedback enhances metacognitive perceptions by encouraging learners to reflect on their understanding and performance in relation to their peers. This fosters self-awareness as learners evaluate their work against others’, facilitating deeper insights into strengths and areas for improvement.26 There is evidence demonstrating the effectiveness of peer feedback in enhancing feedback literacy.27 28
 
Nonetheless, participants noted that the workshop could be improved by providing clearer instructions for role-playing exercises and using more medical-related cases for demonstration. Effective instruction is important. According to cognitive load theory, ineffective guidance can increase extraneous cognitive load and impair learning, especially when the task itself is already demanding.29 We used a movie-based scenario not related to medicine to make the activity fun and interesting. However, the participants’ comment is valid, considering evidence that similarity between demonstration and practice is crucial for effective learning. When demonstrations closely resemble real-life applications, learners can better understand and apply concepts. This alignment enhances procedural knowledge, enabling learners to transition from observation to imitation and, eventually, autonomous practice. Furthermore, relevant demonstrations foster engagement and allow immediate feedback, which reinforces learning.30 31 Future workshops should focus on improving these aspects for better learning outcomes.
 
Limitations and future directions
This study had some limitations. The quantitative findings are constrained by the small sample size, particularly among trainees (n=5), which limits statistical power. Furthermore, although participation in the workshop was encouraged by the College, the sample may still reflect a group more engaged in training initiatives, potentially affecting generalisability. While the qualitative data provided rich insights into participants’ experiences, a larger cohort could offer a broader understanding of the impact of this educational intervention. Additionally, the study did not assess long-term changes in behaviour or practice, which are needed to determine sustained effects of the conjoint training on WBA implementation. Future studies could explore the longitudinal impact of such workshops and investigate their applicability in larger specialties where power dynamics might differ. It would also be valuable to assess the scalability of conjoint workshops in different contexts, particularly those with more complex hierarchical structures, to better understand their potential for broader implementation.
 
Conclusion
This study provides evidence that conjoint WBA workshops for trainers and trainees may effectively enhance trainee feedback literacy and trainer engagement in CBME. The mixed-group learning experience promoted mutual understanding and aligned feedback practices without creating significant power imbalances, fostering positive trainer–trainee interactions and enhancing trust, provided measures are taken to ensure psychological safety. Despite the positive outcomes, the study’s limitations, including its small sample size and lack of long-term follow-up, should be considered. Future research could explore the longitudinal impact of conjoint workshops and their applicability in larger specialties with more complex power dynamics.
 
Author contributions
Concept or design: HY So, EWY Wong.
Acquisition of data: HY So, CM Ngai.
Analysis or interpretation of data: HY So, AKM Chan, GKC Wong.
Drafting of the manuscript: HY So.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank Mr CF Chan and Ms Cathy Ma of the Jockey Club Institute for Medical Education and Development of Hong Kong Academy of Medicine for valuable assistance in moderating the focus group discussions and preparing the transcripts. The authors also appreciate the logistical support provided by Ms Cindy Leung of The Hong Kong College of Otorhinolaryngologists, as well as Mr CF Chan, Ms Cathy Ma, and Ms Jojo Lee of the Jockey Club Institute for Medical Education and Development of Hong Kong Academy of Medicine in organising the workshop. Additionally, the authors wish to express their heartfelt thanks to Professor Jack Pun from the Department of English at The Chinese University of Hong Kong and Professor Stanley Sau-ching Wong from the Department of Anaesthesiology at The University of Hong Kong for insightful contributions to the preparation of the manuscript.
 
Declaration
Findings from this study were presented at AMEE 2025 of the International Association for Health Professions Education, 23-27 August 2025, Barcelona, Spain.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Survey and Behavioural Research Ethics Committee of The Chinese University of Hong Kong, Hong Kong (Ref No.: SBRE-23-0855). Information sheets regarding the study were provided to all participants, and signed consent was obtained from each participant prior to the study.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. So HY, Choi YF, Chan PT, Chan AK, Ng GW, Wong GK. Workplace-based assessments: what, why, and how to implement? Hong Kong Med J 2024;30:250-4.
2. Holmboe ES, Sherbino J, Long DM, Swing SR, Frank JR. The role of assessment in competency-based medical education. Med Teach 2010;32:676-82.
3. Anderson HL, Kurtz J, West DC. Implementation and use of workplace-based assessment in clinical learning environments: a scoping review. Acad Med 2021;96:S164-74.
4. Massie J, Ali JM. Workplace-based assessment: a review of user perceptions and strategies to address the identified shortcomings. Adv Health Sci Educ Theory Pract 2016;21:455-73.
5. Lörwald AC, Lahner FM, Mooser B, et al. Influences on the implementation of Mini-CEX and DOPS for postgraduate medical trainees’ learning: a grounded theory study. Med Teach 2019;41:448-56.
6. Carless D, Boud D. The development of student feedback literacy: enabling uptake of feedback. Assess Eval High Educ 2018;43:1315-25.
7. Ajzen I. The theory of planned behaviour. Organ Behav Hum Decis Processes 1991;50:179-211.
8. Dawson P, Yan Z, Lipnevich A, Tai J, Boud D, Mahoney P. Measuring what learners do in feedback: the Feedback Literacy Behaviour Scale. Assess Eval High Educ 2023;49:348-62.
9. Molloy E, Boud D, Henderson M. Developing a learning-centred framework for feedback literacy. Assess Eval High Educ 2020;45:527-40.
10. Illingworth P, Chelvanayagam S. Benefits of interprofessional education in health care. Br J Nurs 2007;16:121-4.
11. Brooks AK. Power and the production of knowledge: collective team learning in work organizations. Hum Resour Dev Q 1994;5:213-35.
12. Ogrinc G, Armstrong GE, Dolansky MA, Singh MK, Davies L. SQUIRE-EDU (Standards for QUality Improvement Reporting Excellence in Education): publication guidelines for educational improvement. Acad Med 2019;94:1461-70.
13. Merrill MD. First principles of instruction. In: Reigeluth CM, Carr-Chellman AA, editors. Instructional Design Theories and Models: Building a Common Knowledge Base. Vol III. New York: Routledge Publishers; 2009: 43-59.
14. Rudolph JW, Raemer DB, Simon R. Establishing a safe container for learning in simulation: the role of the presimulation briefing. Simul Healthc 2014;9:339-49.
15. Légaré F, Borduas F, Freitas A, et al. Development of a simple 12-item theory-based instrument to assess the impact of continuing professional development on clinical behavioral intentions. PLoS One 2014;9:e91013.
16. Triandis HC. Values, attitudes, and interpersonal behaviour. In: Howe HE Jr, Page MM, editors. Nebraska Symposium on Motivation. Lincoln: University of Nebraska Press; 1979: 195-259.
17. Jamovi Project. Jamovi (desktop version 2.3.28 for Mac). 2024. Available from: https://dev.jamovi.org. Accessed 25 Oct 2024.
18. Clarke V, Braun V. Thematic analysis. In: Teo T, editor. Encyclopedia of Critical Psychology. New York: Springer; 2014: 1947-52.
19. ATLAS.ti Scientific Software Development GmbH. ATLAS.ti (software version 8.4.5). 2024. Available from: https://atlasti.com. Accessed 25 Oct 2024.
20. Baboolal SO, Singaram VS. Specialist training: workplace-based assessments impact on teaching, learning and feedback to support competency-based postgraduate programs. BMC Med Educ 2023;23:941.
21. Castanelli DJ, Weller JM, Molloy E, Bearman M. Trust, power and learning in workplace-based assessment: the trainee perspective. Med Educ 2022;56:280-91.
22. Gaunt A, Patel A, Rusius V, Royle TJ, Markham DH, Pawlikowska T. ‘Playing the game’: how do surgical trainees seek feedback using workplace-based assessment? Med Educ 2017;51:953-62.
23. Knowles MS, Holton EF III, Swanson RA. The Adult Learner: The Definitive Classic in Adult Education and Human Resource Development, 6th ed. Amsterdam: Elsevier; 2005.
24. Francom GM, Gardner J. What is task-centered learning? TechTrends 2014;58:27-35.
25. van Merriënboer JJ, Kirschner PA. Ten Steps to Complex Learning: A Systematic Approach to Four-Component Instructional Design. 3rd ed. New York: Routledge Publisher; 2018.
26. Lerchenfeldt S, Kamel-ElSayed S, Patino G, Loftus S, Thomas DM. A qualitative analysis on the effectiveness of peer feedback in team-based learning. Med Sci Educ 2023;33:893-902.
27. Man D, Kong B, Chau MH. Developing student feedback literacy through peer review training. RELC J 2024;55:408-21.
28. Little T, Dawson P, Boud D, Tai J. Can students’ feedback literacy be improved? A scoping review of interventions. Assess Eval High Educ 2023;49:39-52.
29. van Merriënboer JJ, Sweller J. Cognitive load theory in health professional education: design principles and strategies. Med Educ 2010;44:85-93.
30. McLain M. Developing perspectives on ‘the demonstration’ as a signature pedagogy in design and technology education. Int J Tech Design Educ 2021;31:3-26.
31. Grossman R, Salas E, Pavlas D, Rosen MA. Using instructional features to enhance demonstration-based training in management education. Acad Manag Learn Educ 2012;12:219-43.

Specific indicators of unsuitability for transarterial chemoembolisation in patients with intermediate-stage hepatocellular carcinoma according to thresholds of tumour burden and liver function as judged by survival benefit over sorafenib

Hong Kong Med J 2025;31:Epub 5 Dec 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
LM Chen, PhD1,2,3; Simon CH Yu, MB, BS, MD1; Leung Li, MB, ChB, MD4; Edwin P Hui, MB, ChB, MD4; Winnie Yeo, MB, BS, MD4,5; Stephen L Chan, MB, BS, MD4,5
1 Department of Imaging and Interventional Radiology, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
2 Department of Medical Ultrasonics, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
3 Biomedical Innovation Center, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
4 Department of Clinical Oncology, Prince of Wales Hospital, The Chinese University of Hong Kong, Hong Kong SAR, China
5 State Key Laboratory of Translational Oncology, China
 
Corresponding author: Dr Simon CH Yu (simonyu@cuhk.edu.hk)
 
Abstract
Introduction: This study aimed to define specific indicators of unsuitability for transarterial chemoembolisation (TACE) in patients with intermediate-stage hepatocellular carcinoma (HCC) in Hong Kong using thresholds of tumour burden and liver function, as judged by survival benefit over sorafenib.
 
Methods: Patients with treatment-naïve and unresectable HCC who received TACE or sorafenib from 2005 to 2019 and met the eligibility criteria were enrolled. Overall survival (OS) was compared between the TACE and sorafenib groups using the log-rank test and hazard ratios (HRs) in all subgroups classified according to baseline modified albumin–bilirubin (mALBI) grade and tumour burden, including the up-to-7, up-to-11, and N3-S5-S10 criteria.
 
Results: Overall survival was significantly longer in TACE subgroups than in sorafenib subgroups when stratified by mALBI grade and either the up-to-7 or the up-to-11 criteria (all P<0.05). When applying the N3-S5-S10 criteria, OS did not significantly differ between the TACE and sorafenib groups in subgroups with mALBI grade 2b and tumours with number >3 and size >5 cm but ≤10 cm, or tumours with number >3 and size >10 cm (HR=0.550 and 0.965, respectively; both P>0.05). Sensitivity analysis showed non-significant survival benefits in two additional subgroups: those with mALBI grade 2b and tumours with number ≤3 and size >10 cm, and those with mALBI grade 1 or 2a and tumours with number >3 and size >10 cm (HR=0.474 and 0.418, respectively; both P>0.05).
 
Conclusion: More precise criteria for TACE unsuitability are required. The combination of mALBI grade and the N3-S5-S10 criteria may better identify patients with intermediate-stage HCC who are unlikely to benefit from TACE. Validation in a larger cohort is warranted.
 
 
New knowledge added by this study
  • Patients regarded as unsuitable for transarterial chemoembolisation (TACE) under existing criteria may achieve better survival outcomes with TACE than those with systemic therapy.
  • To determine true TACE unsuitability, more precise criteria based on clinical evidence demonstrating improved survival with alternative treatments are required. Modified albumin–bilirubin (mALBI) grade 2b and tumours with number >3 and size >5 cm, or tumours with number ≤3 and size >10 cm, as well as mALBI grade 1 or 2a and tumours with number >3 and size >10 cm, could serve as better indicators of TACE unsuitability in patients with intermediate-stage hepatocellular carcinoma.
Implications for clinical practice or policy
  • Within the framework of TACE unsuitability, the use of more precise discriminatory criteria is crucial to ensure that patients are not inappropriately excluded from the potential benefits of TACE.
  • The integration of mALBI grade with the N3-S5-S10 tumour burden criteria may offer a practical framework for clinicians to individualise treatment selection, optimising outcomes by identifying patients more likely to benefit from TACE versus systemic therapy.
 
 
Introduction
Hepatocellular carcinoma (HCC) is one of the leading malignancies worldwide. At diagnosis, up to 30% of patients have intermediate-stage HCC according to the Barcelona Clinic Liver Cancer system.1 Transarterial chemoembolisation (TACE) has emerged as the first-line treatment for intermediate-stage HCC, supported by two randomised controlled trials2 3 and a meta-analysis4 that demonstrated superior survival outcomes compared with best supportive care or suboptimal therapies.
 
Because patients with intermediate-stage HCC comprise a heterogeneous group characterised by a wide range of tumour burdens and liver function, the effectiveness of TACE as first-line treatment may not be universal, particularly in subgroups with high tumour burden or suboptimal liver function. To address this issue, sub-staging of intermediate-stage HCC based on tumour burden and liver function has been proposed in several criteria, including the Bolondi,5 Kinki,6 and MICAN (Modified Intermediate Stage of Liver Cancer) criteria.7 These criteria have demonstrated discriminative prognostic value in identifying subgroups of patients with intermediate-stage HCC.7 8 Given that survival outcomes of patients treated with TACE can vary across substages of intermediate-stage HCC, it is clinically essential to identify thresholds of tumour burden and liver function that preclude the use of TACE according to survival benefit.
 
Sorafenib has been established as the standard of care for advanced HCC since 2007, based on the demonstration of its significant survival superiority over placebo.9 10 11 Subgroup analyses of clinical trials have shown that sorafenib exerts positive therapeutic efficacy in intermediate-stage HCC, with reported overall survival (OS) ranging from 14.5 to 20.6 months,9 12 13 which is comparable to the OS achieved with TACE. Sorafenib treatment can serve as a benchmark for evaluating the survival benefit of TACE. If TACE does not provide a significant survival benefit compared with sorafenib, it may not be appropriate to subject patients to TACE rather than systemic therapy, given that TACE is invasive and potentially harmful to the liver. Patients may benefit from systemic therapy before liver function becomes suboptimal.
 
It has been hypothesised that specific baseline parameters of tumour burden and liver function, at which TACE fails to show superior survival benefit compared with sorafenib, could be defined as indicators of TACE unsuitability. This study aimed to define specific indicators of TACE unsuitability at baseline in patients with intermediate-stage HCC according to thresholds of tumour burden and liver function, as judged by the survival benefit of TACE over sorafenib.
 
Methods
Study design
Due to the limited number of eligible participants, all available cases with complete clinical data were included. All patients with unresectable HCC who received TACE or sorafenib therapy from January 2005 to December 2019 at Prince of Wales Hospital were enrolled in the study, provided they met all eligibility criteria. Unresectability of intermediate-stage HCC was determined by a multidisciplinary team comprising a surgeon, an interventional radiologist, and an oncologist. Inclusion criteria were treatment-naïve, Barcelona Clinic Liver Cancer-B stage HCC diagnosed by biopsy or a typical vascular pattern on cross-sectional imaging; intrahepatic disease without vascular invasion; and an Eastern Cooperative Oncology Group performance status score of 0 or 1. Exclusion criteria included age under 18 years or Eastern Cooperative Oncology Group performance status score of 2 or above; prior treatment before initial TACE; receipt of hepatectomy, liver transplantation, or local therapy after initial TACE; and any imaging evidence from computed tomography (CT), magnetic resonance imaging, or positron emission tomography/CT showing vascular invasion by tumour (including portal vein tumour thrombus) or extrahepatic metastasis (Fig 1). To identify thresholds for TACE unsuitability, OS of patients treated with TACE was compared with that of patients treated with sorafenib within subgroups defined by baseline tumour burden and liver function. Overall survival was defined as the interval between the initiation of TACE or sorafenib and death from any cause. Patients who were alive or lost to follow-up were censored.
 

Figure 1. Study recruitment and patient subgrouping for transarterial chemoembolisation
 
Study participants
In total, 420 patients were enrolled in the study: 358 received TACE and 62 received sorafenib (Table 1). Patients in the TACE group were significantly older, and a significantly higher proportion of them were female. The median tumour size was significantly larger in the sorafenib group than in the TACE group. No significant differences were observed between the two groups in terms of the modified albumin–bilirubin (mALBI) grade distribution or tumour multiplicity. Among patients initially treated with TACE, the median number of TACE sessions was two (range, 1-4); 124 patients received one session, 78 received two sessions, 53 received three sessions, and 103 received more than three sessions. After developing refractoriness to TACE, 60 patients subsequently received systemic agents; of these, 35 received sorafenib, eight received adriamycin, four received doxorubicin, six received lenvatinib, and seven received other agents.
 

Table 1. Demographics of patients (n=420)
 
Patient subgrouping
Patients were classified into six subgroups according to baseline tumour burden and liver function. Tumour burden was subcategorised using the up-to-7, up-to-11, and N3-S5-S10 criteria. The up-to-7 and up-to-11 criteria were derived from the sum of the maximum tumour size (in cm) and the tumour number, with cut-off values of 7 or 11, respectively. Accordingly, patients were categorised as within or beyond the up-to-7 and up-to-11 criteria. In the N3-S5-S10 system, tumour burden was subcategorised according to the combination of tumour number and maximum tumour size; three tumour nodules and 5 cm or 10 cm in size served as the respective cut-off values. This categorisation resulted in the following six subgroups: (1) tumour number ≤3, tumour size ≤5 cm; (2) tumour number ≤3, tumour size >5 cm to ≤10 cm; (3) tumour number ≤3, tumour size >10 cm; (4) tumour number >3, tumour size ≤5 cm; (5) tumour number >3, tumour size >5 cm to ≤10 cm; and (6) tumour number >3, tumour size >10 cm (Fig 1).
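
As an illustration only, the tumour burden rules described above can be sketched in Python. The function names are hypothetical, and treating a sum exactly equal to the cut-off as within the criteria is an assumption, because the boundary case is not spelled out in the text.

```python
def within_up_to(max_size_cm: float, tumour_number: int, cutoff: int) -> bool:
    """Return True when (maximum tumour size in cm + tumour number) <= cutoff.

    Use cutoff=7 for the up-to-7 criteria and cutoff=11 for the up-to-11
    criteria. Counting a sum equal to the cut-off as 'within' is an assumption.
    """
    return max_size_cm + tumour_number <= cutoff


def n3_s5_s10_subgroup(max_size_cm: float, tumour_number: int) -> int:
    """Return the N3-S5-S10 subgroup (1 to 6) in the order listed in the text."""
    if tumour_number < 1 or max_size_cm <= 0:
        raise ValueError("tumour number must be >= 1 and size must be > 0")
    # Size band: 0 for <=5 cm, 1 for >5 to <=10 cm, 2 for >10 cm
    if max_size_cm <= 5:
        size_band = 0
    elif max_size_cm <= 10:
        size_band = 1
    else:
        size_band = 2
    # Subgroups 1-3 have tumour number <=3; subgroups 4-6 have tumour number >3
    offset = 0 if tumour_number <= 3 else 3
    return offset + size_band + 1


# Hypothetical patient: four nodules, largest 7.5 cm
print(within_up_to(7.5, 4, 7))       # beyond up-to-7 (sum is 11.5)
print(within_up_to(7.5, 4, 11))      # also beyond up-to-11
print(n3_s5_s10_subgroup(7.5, 4))    # subgroup 5: number >3, size >5 to <=10 cm
```

The example patient falls beyond both sum-based criteria yet lands in a single, narrower N3-S5-S10 subgroup, which is the finer granularity the subgrouping is meant to provide.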
 
Liver function subgroups were classified according to the mALBI grade.14 The mALBI grades were determined using the ALBI score, calculated as (log10 [bilirubin level (μmol/L)] × 0.66) + (albumin level [g/L] × –0.085). Based on three cut-off ALBI scores, grades were defined as follows: grade 1 (≤–2.60), grade 2a (>–2.60 to ≤–2.27), grade 2b (>–2.27 to ≤–1.39), and grade 3 (>–1.39). Because the sample size of patients receiving sorafenib with mALBI grade 1 or 2a was relatively small, these two subgroups were combined for analysis. Additionally, given that no patient with mALBI grade 3 received sorafenib, this subgroup was excluded from the analysis (Fig 1).
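
The ALBI score and mALBI grade cut-offs given above can be expressed as a short Python sketch. The function names and the example laboratory values are hypothetical and serve only to illustrate the arithmetic.

```python
import math


def albi_score(bilirubin_umol_l: float, albumin_g_l: float) -> float:
    """ALBI score = (log10 bilirubin [umol/L] x 0.66) + (albumin [g/L] x -0.085)."""
    return math.log10(bilirubin_umol_l) * 0.66 + albumin_g_l * -0.085


def malbi_grade(score: float) -> str:
    """Map an ALBI score to the mALBI grade using the three cut-offs in the text."""
    if score <= -2.60:
        return "1"
    if score <= -2.27:
        return "2a"
    if score <= -1.39:
        return "2b"
    return "3"


# Hypothetical values: bilirubin 20 umol/L, albumin 35 g/L
score = albi_score(20, 35)
print(round(score, 2), malbi_grade(score))  # about -2.12, grade 2b
```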
 
Transarterial chemoembolisation
The TACE procedures were performed using digital subtraction angiography equipment via a femoral approach under local anaesthesia.15 16 In brief, a microcatheter was used to catheterise tumour-feeding arteries at the lobar, segmental, or subsegmental level, depending on tumour size. An emulsion of cisplatin–ethiodised oil (Platosin; Pharmachemie BV, Haarlem, the Netherlands), consisting of up to 20 mg aqueous cisplatin (20 mL) and up to 20-mL ethiodised oil mixed in a 1:1 volume ratio, was administered until flow stasis occurred or a maximum dose of 40-mL emulsion was delivered. Digital subtraction angiography, with or without non-contrast multiplanar CT, was used to confirm treatment completeness. A gelatin sponge (5-10 mL) was used to embolise the feeding arteries.
 
Postprocedure monitoring included blood tests for liver function and tumour markers within 2 days, at 2 weeks, and then every 1 to 3 months, as well as CT imaging every 3 months. Systemic therapy was administered to patients with well-preserved liver function who developed TACE refractoriness, as indicated by continuous elevation of tumour markers and CT evidence of tumour progression.
 
Systemic therapy
According to the customary protocol at Prince of Wales Hospital, The Chinese University of Hong Kong during the study period, patients with unresectable intermediate-stage HCC and no contraindications to TACE were prioritised for TACE treatment. Patients who declined TACE were treated with sorafenib; as a result, some patients in the sorafenib group had smaller tumours or fewer tumour nodules. Sorafenib was administered orally at a prescribed dose of 400 mg twice daily. In the event of intolerable side-effects or serious adverse events, oncologists could adjust the treatment by reducing the dose or discontinuing the drug.
 
Statistical analysis
Categorical variables were presented as numbers (percentages), while continuous variables were summarised as median (interquartile range) or median (95% confidence interval [95% CI]), depending on the results of normality testing. The Chi-squared test was used to compare categorical data, and the Mann-Whitney U test was performed for continuous data. Kaplan-Meier curves and Cox proportional hazards models were used to compare OS values among subgroups. The log-rank test and hazard ratio (HR) were utilised to assess survival differences between subgroups. A sensitivity analysis of survival outcomes was conducted, excluding participants who received systemic therapy after TACE. A P value <0.05 was considered statistically significant. Statistical analyses were performed using SPSS (Windows version 25.0; IBM Corp, Armonk [NY], United States).
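
For readers unfamiliar with how a median OS is read from a Kaplan-Meier curve with censored observations, a minimal Python sketch follows. It is not the SPSS procedure used in this study; the function names and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time.

    times  : follow-up in months from treatment initiation
    events : 1 if the patient died, 0 if censored (alive or lost to follow-up)
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


def median_survival(curve):
    """First time at which the survival estimate is <= 0.5, or None if never reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data (hypothetical): six patients, two of them censored
toy_times = [2, 3, 3, 5, 8, 12]
toy_events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(toy_times, toy_events)
print(curve)
print(median_survival(curve))
```

Censored patients contribute to the number at risk up to their last follow-up but never trigger a drop in the curve, which is why they are handled separately from deaths in the loop.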
 
Results
Comparison of overall survival between transarterial chemoembolisation and sorafenib
The median OS of all patients who received TACE was significantly longer than that of patients who received sorafenib (19.37 [16.89-21.85] months vs 5.12 [4.37-5.84] months, P<0.001; Fig 2a). When stratified by mALBI grade, patients with mALBI grade 1 or 2a had significantly longer median OS in the TACE group compared with the sorafenib group (23.83 [18.53-29.13] months vs 6.60 [3.61-9.59] months, P<0.001; Fig 2b). Similarly, patients with mALBI grade 2b had significantly longer median OS in the TACE group than in the sorafenib group (16.20 [11.91-20.49] months vs 4.39 [3.44-5.35] months, P<0.001; Fig 2c).
 

Figure 2. Kaplan-Meier overall survival curves for patients with hepatocellular carcinoma who received transarterial chemoembolisation (TACE) or sorafenib. The median overall survival of all patients who received TACE was significantly longer than that of those who received sorafenib (a). Transarterial chemoembolisation subgroups were associated with significantly longer survival compared with sorafenib subgroups in patients with modified albumin–bilirubin (mALBI) grade 1 or 2a (b) and in those with mALBI grade 2b (c)
 
Overall survival by modified albumin–bilirubin grade and tumour burden in sorafenib-treated patients
The median OS of patients treated with sorafenib, stratified by mALBI grade and tumour burden, is summarised in Table 2. As the sorafenib subgroups with tumour number ≤3 had a relatively small sample size (n=8) according to the N3-S5-S10 criteria, these patients were not further subdivided based on tumour size. Instead, they were combined into a single subgroup with tumour number ≤3 to increase the sample size for comparison with the TACE group. Consequently, OS in the combined sorafenib subgroup (tumour number ≤3, any tumour size) was used for comparison with OS in the three tumour-size TACE subgroups of tumour number ≤3 (Table 2).
 

Table 2. Overall survival of patients receiving sorafenib (n=62)
 
The distribution of sample sizes was uneven across the sorafenib subgroups with tumour number >3 based on the N3-S5-S10 criteria, which may have introduced bias in the survival outcomes, such as a lower tumour burden being associated with worse OS. To avoid underestimation of OS in any tumour-size subgroup when comparing with the TACE subgroups, the longest OS among the subgroups with tumour number >3 was utilised as the OS value for all these subgroups in the analysis, irrespective of tumour size (Table 2). As no patients with mALBI grade 2 were present in the tumour burden subgroup defined as within up-to-7, the OS of patients with tumour burden beyond up-to-7 (Table 2) who were treated with sorafenib was used as the control.
 
Overall survival in modified albumin–bilirubin grade 1 or 2a: transarterial chemoembolisation versus sorafenib
Table 3 presents the median OS of patients treated with TACE or sorafenib, stratified by mALBI grade 1 or 2a and tumour burden. Across all subgroups defined by various tumour burden criteria, patients who received TACE achieved significantly longer OS than those who received sorafenib (all P<0.05), with HRs favouring TACE (ranging from 0.130 to 0.331). Sensitivity analysis showed that survival was not significantly different between TACE and sorafenib in the subgroup with tumour number >3 and tumour size >10 cm (HR=0.418 [95% CI=0.147-1.171]; P=0.097).
 

Table 3. Overall survival of patients with liver function classified as modified albumin–bilirubin grade 1 or 2a
 
Overall survival in modified albumin–bilirubin grade 2b: transarterial chemoembolisation versus sorafenib
In subgroups with mALBI grade 2b, defined by either the up-to-7 or up-to-11 criteria, patients who received TACE exhibited significantly longer median OS than those who received sorafenib across all subgroups (all P<0.05; Table 4). However, when using the N3-S5-S10 criteria, TACE resulted in a significantly longer median OS than sorafenib only in the subgroups with tumour number ≤3 (any tumour size) and in the subgroup with tumour number >3 and tumour size ≤5 cm (both P<0.05; Table 4). In the subgroups with tumour number >3 and tumour size >5 cm to ≤10 cm, and those with tumour number >3 and tumour size >10 cm, although TACE subgroups demonstrated longer median OS than sorafenib subgroups (6.07 vs 3.74 months and 7.73 vs 3.74 months, respectively), the differences were not statistically significant (Table 4). Sensitivity analysis showed that survival was also not significantly different between TACE and sorafenib in the additional subgroup with tumour number ≤3 and tumour size >10 cm (HR=0.474 [95% CI=0.185-1.261]; P=0.120).
 

Table 4. Overall survival of patients with liver function classified as modified albumin–bilirubin grade 2b
 
Due to the small sample size, it was difficult to demonstrate a clear survival benefit of TACE over sorafenib; thus, the risk of overestimating the survival benefit of TACE, due to potential bias from more advanced disease in the sorafenib group, was likely minimised. For example, given the limited number of patients in the subgroups with tumour number >3 and tumour size >5 cm to ≤10 cm and those with tumour number >3 and tumour size >10 cm, these two subgroups were combined into one subgroup (tumour number >3 and tumour size >5 cm). In this combined subgroup, TACE (n=38) still yielded no significant survival benefit over sorafenib (n=14), with OS values of 6.07 months (4.10-8.03) and 3.74 months (1.71-5.78), respectively (HR=0.586 [95% CI=0.325-1.054]; P=0.071).
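
As a rough plausibility check on the combined-subgroup result above, an approximate two-sided P value can be back-calculated from a reported HR and its 95% CI using a Wald approximation on the log scale. This sketch is an assumption-laden illustration: the function names are hypothetical, and because the published P values may derive from the log-rank test, only approximate agreement should be expected.

```python
import math


def wald_p_from_hr_ci(hr: float, lower: float, upper: float,
                      z_crit: float = 1.959964) -> float:
    """Approximate two-sided P value from a hazard ratio and its 95% CI.

    Assumes the CI is symmetric on the log scale (Wald interval).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))


def ci_excludes_one(lower: float, upper: float) -> bool:
    """A 95% CI that excludes 1 corresponds to P < 0.05 under the same approximation."""
    return upper < 1 or lower > 1


# Values reported for the combined subgroup (tumour number >3 and tumour size >5 cm)
p = wald_p_from_hr_ci(0.586, 0.325, 1.054)
print(round(p, 3))                    # close to, but not identical with, the reported P=0.071
print(ci_excludes_one(0.325, 1.054))  # False: the interval crosses 1
```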
 
Discussion
Results of subgroup analysis
Subgroup analysis in this study revealed that, within the limitations of the data, TACE probably did not confer a statistically significant survival benefit over sorafenib for patients with mALBI grade 2b and a high tumour burden (number >3 and size >5 cm, or number ≤3 and size >10 cm), or for patients with mALBI grade 1 or 2a and tumour burden of number >3 and size >10 cm. In contrast, TACE did provide a survival benefit when the beyond up-to-7 or beyond up-to-11 criteria were applied. These findings suggest that the use of more precise criteria to define tumour burden and liver function could help identify specific subgroups unsuitable for TACE. Such criteria highlight the threshold at which TACE no longer provides a survival advantage over sorafenib, thereby indicating TACE unsuitability. These indicators would be valuable in guiding the clinical management of intermediate-stage HCC. The small sample size in the sorafenib group may have limited the statistical power to detect a survival benefit of TACE in subgroups with tumour number >3 and size >5 cm. Given that the overall results showed a consistent trend favouring TACE, validation through further studies with larger sample sizes is warranted.
 
Sorafenib as a control
In recent years, systemic therapy for HCC has undergone rapid development, leading to the emergence of new drugs after sorafenib. The combination of certain agents has shown significant improvements in survival compared with sorafenib alone. The IMbrave150 study demonstrated that treatment with atezolizumab plus bevacizumab resulted in a significantly longer median OS than sorafenib alone (19.2 vs 13.4 months).17 Similarly, both sintilimab plus a bevacizumab biosimilar18 and tremelimumab plus durvalumab19 provided significant survival benefits over sorafenib in patients with unresectable HCC. Nevertheless, sorafenib remains the first-line standard treatment and the most effective single agent for advanced HCC. It serves as a benchmark for newer single-agent therapies such as lenvatinib, nivolumab, and durvalumab, which have shown statistical non-inferiority in survival compared with sorafenib.19 20 21 Therefore, the use of sorafenib as the control arm versus TACE in this study is reasonable. With the rapid advancement of systemic agents, novel treatment strategies—such as switching to systemic therapy22 or initiating systemic therapy upfront followed by curative conversion23—have been advocated for patients with intermediate-stage HCC who may not benefit from TACE or repeated TACE. In such cases, it is important to define specific indicators of TACE unsuitability among patients with intermediate-stage HCC, in whom systemic therapy may potentially improve survival.
 
Deficiencies of conventional criteria of unsuitability for transarterial chemoembolisation
The concept of TACE unsuitability has emerged in conjunction with the development and availability of systemic therapies.24 In patients with intermediate-stage HCC, TACE unsuitability has been defined as the presence of mALBI grade 2b and tumour burden beyond the up-to-7 criteria.25 26 This definition was based on worse survival in patients with mALBI grade 2b and tumour burden beyond the up-to-7 criteria relative to patients displaying better liver function and lower tumour burden, without addressing the potential survival benefit of TACE over alternative treatment options in this subgroup. However, this definition has two key limitations. First, it lacks clinical evidence demonstrating greater survival benefit from alternative treatments when TACE is withheld. Second, there remains controversy regarding the optimal criteria for defining high tumour burden. If tumour burden beyond the up-to-7 criteria is used as the criterion for TACE unsuitability, the majority of patients with intermediate-stage HCC would be considered unsuitable, which is both unrealistic and unsupported. In the present study, 79% of patients had high tumour burden beyond up-to-7, comparable to the 70% reported by Hung et al.27
 
Limitations of conventional sub-staging systems
The sub-staging system using the up-to-11 criteria has shown better discriminatory power than the up-to-7 criteria for predicting survival after TACE.28 29 Nonetheless, in this study, neither the up-to-7 nor the up-to-11 criteria were able to identify TACE unsuitability. The findings indicated that both the patient subgroup with mALBI grade 2b and tumour burden beyond the up-to-7 criteria and the subgroup with mALBI grade 2b and tumour burden beyond the up-to-11 criteria still derived survival benefits from TACE compared with sorafenib, indicating that these subgroups should not be considered TACE unsuitable. The lack of discriminatory power may be attributed to the persistently high heterogeneity among patients classified as having high tumour burden under these two criteria. Worse survival after TACE in these subgroups, compared with patients displaying better liver function and lower tumour burden, does not justify entirely abandoning TACE in these patients.
 
We propose using the N3-S5-S10 criteria to define tumour burden, as these criteria allow for more specific subgrouping and enable the identification of TACE unsuitability with greater precision, thereby reducing the likelihood of denying patients a potentially beneficial treatment (TACE). Our findings demonstrate that the proposed criteria can identify TACE unsuitability precisely in specific subgroups where the up-to-7 or up-to-11 criteria fail to distinguish survival differences. Based on these findings, we recommend that physicians assess intermediate-stage HCC using both the mALBI grade and the N3-S5-S10 criteria—a more rigorous framework—to determine TACE unsuitability. To our knowledge, this is the first study to demonstrate the survival benefit of TACE over sorafenib in patients with intermediate-stage HCC stratified by both liver function and tumour burden, as well as to identify TACE unsuitability within these subgroups.
 
Limitations
This study provided a larger sample size than previous studies comparing survival benefits between TACE and sorafenib. However, several limitations should be noted. First, the retrospective design of this study inevitably introduced patient selection bias between the TACE and sorafenib groups. Although there were significant differences in age, sex, and tumour size between the groups, such disparities in overall patient demographics might not have critically affected the validity of the survival comparisons, given that these were based on subgroup analyses. Second, the sample size was exceedingly small in some sorafenib subgroups with low tumour burden. The substantial disparity in patient numbers may have contributed to non-significant differences in OS between subgroups. We attempted to mitigate this limitation by combining subgroups with very small sample sizes. Third, some patients in the TACE group received systemic therapy after disease progression. Consequently, survival in the TACE group may have been overestimated as it reflected outcomes of TACE with or without systemic therapy, rather than TACE alone. Nonetheless, ‘TACE followed by systemic therapy’ represents standard clinical practice aimed at achieving the greatest patient benefit, and isolating a TACE-alone group for analysis would not be realistic. Notably, ‘TACE followed by systemic therapy’ accurately reflects real-world treatment practice and does not conflict with the study’s primary objective, which was to define specific indicators of TACE unsuitability at baseline rather than at the point when TACE becomes unsuitable. Finally, no power calculation was performed in the statistical analysis.
 
Conclusion
More precise criteria for TACE unsuitability are required. The combination of mALBI grade and N3-S5-S10 criteria may serve as a better indicator of TACE unsuitability than the beyond up-to-7 or beyond up-to-11 criteria for patients with intermediate-stage HCC. TACE likely offers no survival benefit compared with sorafenib beyond these thresholds. However, validation in a larger cohort is warranted.
 
Author contributions
Concept or design: SCH Yu.
Acquisition of data: LM Chen, L Li, EP Hui, W Yeo, SL Chan.
Analysis or interpretation of data: LM Chen, SCH Yu.
Drafting of the manuscript: LM Chen, SCH Yu.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Funding/support
This research was funded by the Vascular and Interventional Radiology Foundation, Hong Kong. The funding body was not involved in the design of the study, collection of data, analysis/interpretation of data, or writing of the manuscript.
 
Ethics approval
This research was approved by The Chinese University of Hong Kong–New Territories East Cluster Ethics Committee, Hong Kong (Ref No.: 2020.672). It was conducted in accordance with the Declaration of Helsinki and the International Conference on Harmonisation–Good Clinical Practice guidelines. The requirement for written informed patient consent was waived by the Committee due to the retrospective nature of the research.
 
References
1. Sung H, Ferlay J, Siegel RL, et al. Global Cancer Statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin 2021;71:209-49.
2. Llovet JM, Real MI, Montaña X, et al. Arterial embolisation or chemoembolisation versus symptomatic treatment in patients with unresectable hepatocellular carcinoma: a randomised controlled trial. Lancet 2002;359:1734-9.
3. Lo CM, Ngan H, Tso WK, et al. Randomized controlled trial of transarterial lipiodol chemoembolization for unresectable hepatocellular carcinoma. Hepatology 2002;35:1164-71.
4. Llovet JM, Bruix J. Systematic review of randomized trials for unresectable hepatocellular carcinoma: chemoembolization improves survival. Hepatology 2003;37:429-42.
5. Bolondi L, Burroughs A, Dufour JF, et al. Heterogeneity of patients with intermediate (BCLC B) hepatocellular carcinoma: proposal for a subclassification to facilitate treatment decisions. Semin Liver Dis 2012;32:348-59.
6. Kudo M, Arizumi T, Ueshima K, Sakurai T, Kitano M, Nishida N. Subclassification of BCLC B stage hepatocellular carcinoma and treatment strategies: proposal of modified Bolondi’s subclassification (Kinki criteria). Dig Dis 2015;33:751-8.
7. Hiraoka A, Kumada T, Nouso K, et al. Proposed new sub-grouping for intermediate-stage hepatocellular carcinoma using albumin–bilirubin grade. Oncology 2016;91:153-61.
8. Arizumi T, Ueshima K, Iwanishi M, et al. Validation of Kinki criteria, a modified substaging system, in patients with intermediate stage hepatocellular carcinoma. Dig Dis 2016;34:671-8.
9. Bruix J, Raoul JL, Sherman M, et al. Efficacy and safety of sorafenib in patients with advanced hepatocellular carcinoma: subanalyses of a phase III trial. J Hepatol 2012;57:821-9.
10. Cheng AL, Kang YK, Chen Z, et al. Efficacy and safety of sorafenib in patients in the Asia-Pacific region with advanced hepatocellular carcinoma: a phase III randomised, double-blind, placebo-controlled trial. Lancet Oncol 2009;10:25-34.
11. Llovet JM, Ricci S, Mazzaferro V, et al. Sorafenib in advanced hepatocellular carcinoma. N Engl J Med 2008;359:378-90.
12. Iavarone M, Cabibbo G, Piscaglia F, et al. Field-practice study of sorafenib therapy for hepatocellular carcinoma: a prospective multicenter study in Italy. Hepatology 2011;54:2055-63.
13. Marrero JA, Kudo M, Venook AP, et al. Observational registry of sorafenib use in clinical practice across Child-Pugh subgroups: the GIDEON study. J Hepatol 2016;65:1140-7.
14. Hiraoka A, Michitaka K, Kumada T, et al. Validation and potential of albumin–bilirubin grade and prognostication in a nationwide survey of 46,681 hepatocellular carcinoma patients in Japan: the need for a more detailed evaluation of hepatic function. Liver Cancer 2017;6:325-36.
15. Yu SC, Hui JW, Hui EP, et al. Unresectable hepatocellular carcinoma: randomized controlled trial of transarterial ethanol ablation versus transcatheter arterial chemoembolization. Radiology 2014;270:607-20.
16. Yu SC, Hui JW, Li L, et al. Comparison of chemoembolization, radioembolization, and transarterial ethanol ablation for huge hepatocellular carcinoma (≥10 cm) in tumour response and long-term survival outcome. Cardiovasc Intervent Radiol 2022;45:172-81.
17. Cheng AL, Qin S, Ikeda M, et al. Updated efficacy and safety data from IMbrave150: atezolizumab plus bevacizumab vs. sorafenib for unresectable hepatocellular carcinoma. J Hepatol 2022;76:862-73.
18. Ren Z, Xu J, Bai Y, et al. Sintilimab plus a bevacizumab biosimilar (IBI305) versus sorafenib in unresectable hepatocellular carcinoma (ORIENT-32): a randomised, open-label, phase 2-3 study. Lancet Oncol 2021;22:977-90.
19. Abou-Alfa GK, Chan SL, Kudo M, et al. Phase 3 randomized, open-label, multicenter study of tremelimumab (T) and durvalumab (D) as first-line therapy in patients (pts) with unresectable hepatocellular carcinoma (uHCC): HIMALAYA. J Clin Oncol 2022;40(4_suppl):379.
20. Yau T, Park JW, Finn RS, et al. Nivolumab versus sorafenib in advanced hepatocellular carcinoma (CheckMate 459): a randomised, multicentre, open-label, phase 3 trial. Lancet Oncol 2022;23:77-90. Crossref
21. Kudo M, Finn RS, Qin S, et al. Lenvatinib versus sorafenib in first-line treatment of patients with unresectable hepatocellular carcinoma: a randomised phase 3 non-inferiority trial. Lancet 2018;391:1163-73. Crossref
22. Ogasawara S, Ooka Y, Koroki K, et al. Switching to systemic therapy after locoregional treatment failure: definition and best timing. Clin Mol Hepatol 2020;26:155-62. Crossref
23. Kudo M. A novel treatment strategy for patients with intermediate-stage HCC who are not suitable for TACE: upfront systemic therapy followed by curative conversion. Liver Cancer 2021;10:539-44. Crossref
24. Kudo M. Extremely high objective response rate of lenvatinib: its clinical relevance and changing the treatment paradigm in hepatocellular carcinoma. Liver Cancer 2018;7:215-24. Crossref
25. Kudo M, Han KH, Ye SL, et al. A changing paradigm for the treatment of intermediate-stage hepatocellular carcinoma: Asia-Pacific Primary Liver Cancer Expert Consensus Statements. Liver Cancer 2020;9:245-60. Crossref
26. Kudo M, Kawamura Y, Hasegawa K, et al. Management of hepatocellular carcinoma in Japan: JSH Consensus Statements and Recommendations 2021 update. Liver Cancer 2021;10:181-223. Crossref
27. Hung YW, Lee IC, Chi CT, et al. Redefining tumor burden in patients with intermediate-stage hepatocellular carcinoma: the seven-eleven criteria. Liver Cancer 2021;10:629-40. Crossref
28. Kim JH, Shim JH, Lee HC, et al. New intermediate-stage subclassification for patients with hepatocellular carcinoma treated with transarterial chemoembolization. Liver Int 2017;37:1861-8. Crossref
29. Lee IC, Hung YW, Liu CA, et al. A new ALBI-based model to predict survival after transarterial chemoembolization for BCLC stage B hepatocellular carcinoma. Liver Int 2019;39:1704-12. Crossref

Incidence, risk factors, and clinical outcomes of peripartum cardiomyopathy in Hong Kong

Hong Kong Med J 2025;31:Epub 27 Nov 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
Liliana SK Law, MB, ChB1; LT Kwong, MB, BS1; KH Siong, MB, BS1; Sani TK Wong, MB, ChB2; WL Chan, MB, ChB3; KY Tse, MB, BS4; Yannie YY Chan, MB, BS5; KS Eu, MB, BS6; CY Chow, MB, ChB7; Joan KO Wai, LMCHK8; HC Mok, MB, BS1; PL So, MB, BS1
1 Department of Obstetrics and Gynaecology, Tuen Mun Hospital, Hong Kong SAR, China
2 Department of Obstetrics and Gynaecology, Prince of Wales Hospital, The Chinese University of Hong Kong, Hong Kong SAR, China
3 Department of Obstetrics and Gynaecology, Kwong Wah Hospital, Hong Kong SAR, China
4 Department of Obstetrics and Gynaecology, Queen Elizabeth Hospital, Hong Kong SAR, China
5 Department of Obstetrics and Gynaecology, Princess Margaret Hospital, Hong Kong SAR, China
6 Department of Obstetrics and Gynaecology, Pamela Youde Nethersole Eastern Hospital, Hong Kong SAR, China
7 Department of Obstetrics and Gynaecology, United Christian Hospital, Hong Kong SAR, China
8 Department of Obstetrics and Gynaecology, Queen Mary Hospital, The University of Hong Kong, Hong Kong SAR, China
 
Corresponding author: Dr Liliana SK Law (lawskliliana@gmail.com)
 
Abstract
Introduction: Peripartum cardiomyopathy (PPCM) is an uncommon but serious form of heart failure affecting women during late pregnancy or early postpartum. This territory-wide multicentre retrospective study aimed to evaluate the local incidence, risk factors, and clinical outcomes, including subsequent pregnancies, in Hong Kong.
 
Methods: Medical records were retrospectively reviewed for women who delivered at all public hospitals between 1 January 2013 and 31 December 2022 and met the 2010 European Society of Cardiology Working Group criteria for PPCM. Regression analysis was performed to investigate maternal risk factors.
 
Results: Thirty Asian women were diagnosed with PPCM, corresponding to an incidence of 1 in 11 179 live births. Eleven (36.7%) had antepartum onset of symptoms, and 25 (83.3%) were diagnosed after childbirth, most presenting with severe symptoms (90%). The median left ventricular ejection fraction was 30% (range, 10%-44%). Notable complications included cardiogenic shock (10%), respiratory failure (23.3%), acute renal failure (23.3%), and thromboembolism (23.3%). Most women received guideline-directed heart failure therapy. At 12 months, all-cause mortality was 6.7%, and cardiac recovery occurred in 60%. Eleven women had 13 subsequent pregnancies (three miscarriages, five terminations, and five live births). There were no maternal deaths or cases of recurrent PPCM. Genetic testing identified potentially pathogenic variants in at least 10% of women. Antenatal anaemia (adjusted odds ratio [OR]=13.04; 95% confidence interval [95% CI]=3.72-45.70) and hypertensive disorders of pregnancy (adjusted OR=38.00; 95% CI=9.66-149.52) were associated with higher odds of PPCM.
 
Conclusion: This study highlights the substantial morbidity and mortality associated with PPCM. Genetic testing may aid in risk stratification and prognostication.
 
 
New knowledge added by this study
  • Peripartum cardiomyopathy (PPCM) is an uncommon but potentially fatal disease in Hong Kong.
  • Genetic testing by next-generation sequencing identified 10% of women with PPCM as carriers of potentially pathogenic variants in cardiomyopathy-associated genes.
  • Antenatal anaemia and hypertensive disorders of pregnancy are independent clinical risk factors for PPCM.
Implications for clinical practice or policy
  • Screening for and prevention of anaemia and pre-eclampsia during pregnancy may help reduce the incidence of PPCM.
  • The integration of genetic testing in PPCM management may support personalised medical care.
 
 
Introduction
Peripartum cardiomyopathy (PPCM) is a rare form of heart failure that occurs in relation to pregnancy, resulting in substantial morbidity and mortality.1 In 2010, the Heart Failure Association of the European Society of Cardiology (ESC) defined PPCM as “an idiopathic cardiomyopathy presenting with heart failure secondary to left ventricular systolic dysfunction towards the end of pregnancy or in the months following delivery, where no other cause of heart failure is found”.2 Globally, its incidence varies widely, ranging from 1 in 100 live births in Nigeria3 to 1 in 20 000 live births in Japan.4
 
The exact pathogenesis of PPCM is not yet fully understood; the current hypothesis proposes a ‘two-hit’ model involving an initial vascular insult caused by vasculotoxic hormonal effects, including soluble FMS-like tyrosine kinase-1 and prolactin, followed by a second hit of underlying predisposition—such as genetic susceptibility and other risk factors—that limits some women’s ability to withstand this vasculotoxic insult.1 Genetic or familial predisposition to PPCM has been supported by multiple reports.5 6 7 8 Additionally, well-recognised risk factors for PPCM include advanced maternal age, African American ancestry, multiple pregnancies, hypertension, and pre-eclampsia.9
 
Peripartum cardiomyopathy is a potentially life-threatening myocardial disease that affects women of all ethnic groups10 and can have long-term health consequences.11 Until now, there has been a lack of information regarding the clinical phenotype and outcomes of this disease in Hong Kong. The present population-based study was conducted to evaluate the local incidence, clinical presentation, management, complications, 12-month outcomes, and subsequent pregnancies in women with PPCM. Additionally, we examined potential risk factors by comparing the clinical characteristics of women with and without PPCM to provide a basis for future preventive strategies.
 
Methods
Study design
This was a population-based retrospective study of all women with PPCM who delivered in public hospitals in Hong Kong between 1 January 2013 and 31 December 2022. Cases were identified through the Clinical Data Analysis and Reporting System, which captures obstetric data and hospitalisation diagnoses from eight public hospitals providing obstetric services. First, all women who delivered during the study period and had a diagnosis code for heart failure from the third trimester to 6 months postpartum were identified. Each woman’s medical record was systematically reviewed by two authors to determine whether the following criteria for PPCM were met: development of cardiac failure (with left ventricular ejection fraction [LVEF] <45% on echocardiography) during the third trimester or within 6 months postpartum without an identifiable cause. Women were excluded if LVEF was ≥45%, a recognised cause of heart failure was identified, or there was no physician-confirmed diagnosis of PPCM.
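The case definition above can be expressed as a single decision rule. The following is a minimal sketch only: the `CandidateRecord` structure and its field names are hypothetical and do not reflect the schema of the Clinical Data Analysis and Reporting System, and the actual case review was performed manually by two authors.

```python
from dataclasses import dataclass
from typing import Optional

LVEF_CUTOFF_PERCENT = 45.0      # cases require LVEF < 45% on echocardiography
POSTPARTUM_WINDOW_MONTHS = 6.0  # onset window after delivery


@dataclass
class CandidateRecord:
    """Hypothetical record for a woman flagged by a heart failure diagnosis code."""
    lvef_percent: float
    onset_in_third_trimester: bool
    onset_months_postpartum: Optional[float]  # None when onset was antepartum
    other_cause_identified: bool
    physician_confirmed_ppcm: bool


def meets_ppcm_criteria(record: CandidateRecord) -> bool:
    """Return True only if every inclusion criterion holds and no exclusion applies."""
    postpartum_onset = (
        record.onset_months_postpartum is not None
        and 0 <= record.onset_months_postpartum <= POSTPARTUM_WINDOW_MONTHS
    )
    in_window = record.onset_in_third_trimester or postpartum_onset
    return (
        record.lvef_percent < LVEF_CUTOFF_PERCENT
        and in_window
        and not record.other_cause_identified
        and record.physician_confirmed_ppcm
    )


# LVEF 30%, onset 1 month postpartum, no other cause, physician-confirmed
included = meets_ppcm_criteria(CandidateRecord(30.0, False, 1.0, False, True))
# LVEF exactly 45% is excluded because the criterion is strictly below 45%
excluded = meets_ppcm_criteria(CandidateRecord(45.0, True, None, False, True))
print(included, excluded)
```

The boundary handling follows the text: an LVEF of exactly 45% is excluded, whereas onset at exactly 6 months postpartum is treated as within the window, which is an assumption about how "within 6 months" was applied.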
 
Clinical variable collection
Baseline characteristics (including socio-demographics, preexisting health conditions, and obstetric history) at the time of PPCM diagnosis were obtained from medical records. Clinical presentation and initial investigations, including electrocardiography, chest radiography, echocardiography, and laboratory results, were collected. All in-hospital complications and reported outcomes during follow-up were recorded, including all-cause mortality and cardiac recovery determined by echocardiography at 12 months. Management strategies were documented, including admission to the intensive care unit or cardiac care unit, use of mechanical ventilation or circulatory support, medications prescribed at hospital discharge, pacemaker insertion, and heart transplantation. Complete recovery of cardiac function was defined as LVEF ≥50%. Some patients underwent genetic evaluation, and their reports were analysed.
 
Obstetric outcomes at the time of the PPCM event were assessed, including hypertensive disorders of pregnancy; gestational diabetes; thyroid disease; antenatal anaemia (defined as a haemoglobin level <10.5 g/dL); use of tocolytics; placenta accreta spectrum; placental abruption; fetal growth restriction; preterm delivery; assisted vaginal delivery or caesarean section; primary postpartum haemorrhage (blood loss ≥500 mL); and caesarean hysterectomy. Neonatal outcomes were examined, including stillbirth, sex, birth weight, small for gestational age, Apgar scores, admission to the neonatal intensive care unit, and death within 28 days of life. Data from the territory-wide electronic healthcare database were also extracted regarding outcomes of subsequent pregnancies, including LVEF before, during, and after pregnancy. The interval between the PPCM pregnancy and the first subsequent pregnancy was recorded.
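The numerical thresholds used in these definitions differ in whether the boundary value is included. The sketch below restates them as simple predicates; the function and constant names are illustrative assumptions, not part of the study's data dictionary.

```python
ANAEMIA_HB_G_PER_DL = 10.5   # antenatal anaemia: haemoglobin strictly below 10.5 g/dL
PPH_BLOOD_LOSS_ML = 500.0    # primary postpartum haemorrhage: blood loss of 500 mL or more
RECOVERY_LVEF_PERCENT = 50.0  # complete cardiac recovery: LVEF of 50% or more


def has_antenatal_anaemia(haemoglobin_g_per_dl: float) -> bool:
    return haemoglobin_g_per_dl < ANAEMIA_HB_G_PER_DL


def has_primary_pph(blood_loss_ml: float) -> bool:
    return blood_loss_ml >= PPH_BLOOD_LOSS_ML


def has_complete_recovery(lvef_percent: float) -> bool:
    return lvef_percent >= RECOVERY_LVEF_PERCENT


# Boundary values: 10.5 g/dL is not anaemia; 500 mL is haemorrhage; 50% is recovery
print(has_antenatal_anaemia(10.5), has_primary_pph(500.0), has_complete_recovery(50.0))
```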
 
To investigate risk factors for PPCM, women who gave birth during the same period but did not develop heart failure were selected as the control group, with a PPCM-to-control ratio of 1:4. Demographic and clinical characteristics were compared between women with and without PPCM.
 
Statistical analysis
Data analysis was conducted using SPSS (Windows version 26.0; IBM Corp, Armonk [NY], United States). The incidence rate was calculated by dividing the total number of PPCM cases by the total number of live births during the study period. Descriptive data for continuous variables were presented as mean ± standard deviation or median (range or interquartile range), and categorical data were presented as numbers with percentages. Comparisons between women with and without PPCM were performed using Student’s t test or the Mann-Whitney U test for continuous variables, and the chi-squared test or Fisher’s exact test for categorical variables. Risk factors associated with PPCM were assessed using univariable and multivariable logistic regression analyses, with results expressed as odds ratios (ORs) and 95% confidence intervals (95% CIs). A P value of <0.05 was considered statistically significant. The STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines were followed in the preparation of this article.
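For readers less familiar with odds ratios, an unadjusted OR and its 95% CI can be computed directly from a 2×2 table. The sketch below uses the log (Woolf) method rather than the logistic regression run in SPSS, so it illustrates the quantity being reported, not the study's actual procedure. The case counts reflect the 50% anaemia prevalence among the 30 women with PPCM; the control counts are hypothetical placeholders, chosen only to match the 1:4 ratio (120 controls), and are not study data.

```python
import math


def odds_ratio_with_ci(exposed_cases, unexposed_cases,
                       exposed_controls, unexposed_controls, z_crit=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of ln(OR) is the root of the summed reciprocal cell counts
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    lower = math.exp(math.log(odds_ratio) - z_crit * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z_crit * se_log_or)
    return odds_ratio, lower, upper


# Cases: 15 of 30 with antenatal anaemia (50%, as reported in this study).
# Controls: 12 of 120 with anaemia is a HYPOTHETICAL figure for illustration only.
odds_ratio, lower, upper = odds_ratio_with_ci(15, 15, 12, 108)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

The Woolf method requires all four cells to be non-zero; with sparse tables, an exact method or a continuity correction would be more appropriate.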
 
Results
Incidence of peripartum cardiomyopathy in Hong Kong
During the 10-year study period, 30 women with PPCM delivered in public hospitals (Fig 1). Over the same period, there were 335 376 live births, yielding an estimated PPCM incidence of 1 in 11 179 live births in Hong Kong.
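The incidence estimate follows directly from the two counts reported above. The short calculation below reproduces it and also expresses the same figure per 100 000 live births; the variable names are illustrative.

```python
ppcm_cases = 30
live_births = 335_376

# Live births per PPCM case, the "1 in N" form used in this article
births_per_case = live_births / ppcm_cases
# The same incidence expressed per 100 000 live births
incidence_per_100k = ppcm_cases / live_births * 100_000

print(f"1 in {round(births_per_case)} live births")
print(f"{incidence_per_100k:.1f} per 100 000 live births")
```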
 

Figure 1. Identification of study population
 
Demographics, clinical characteristics, and investigations
Detailed characteristics are listed in Table 1. All women in this study were Asian. The mean age was 33.5 years and the median body mass index was 22.0 kg/m2. One woman had a positive family history of heart failure of unknown cause; no women had a previous history of PPCM or cardiac disease.
 

Table 1. Maternal socio-demographic characteristics, medical history, and obstetric history (n=30)
 
Symptoms began antepartum in 36.7% of women and postpartum in 63.3%; PPCM was predominantly diagnosed postpartum (83.3%). The median time from symptom onset to diagnosis was 3.5 days (range, 0-107). At diagnosis, 90% of women had severe symptoms (New York Heart Association functional class III/IV), most commonly comprising shortness of breath, peripheral oedema, and desaturation. Common electrocardiographic findings included sinus tachycardia and prolonged QTc interval. At the first echocardiographic assessment, the median LVEF was 30% (range, 10-44). More than half of the women had abnormal chest radiographs showing congestive lung fields, cardiomegaly, and pleural effusion (Table 2).
 

Table 2. Clinical presentation and investigations (n=30)
 
Complications, management, and cardiac recovery
Detailed results are presented in Table 3. Of the 30 women with PPCM, 19 (63.3%) were managed in the intensive care unit or cardiac care unit. Cardiogenic shock occurred in 10% of cases, and respiratory failure and acute renal failure each occurred in 23.3%. Inotropic support, mechanical ventilation, extracorporeal membrane oxygenation, and renal replacement therapy were used during acute treatment.
 

Table 3. Management, complications, and cardiac recovery during hospitalisation and follow-up (n=30)
 
At hospital discharge, most women were prescribed angiotensin-converting enzyme inhibitors (ACEis) or angiotensin receptor blockers (ARBs) and beta-blockers. Four women received prophylactic low–molecular-weight heparin for venous thromboembolism prevention after the event; another four required warfarin for the treatment of cerebral venous thrombosis, brachial artery thromboembolism, pulmonary embolism, or deep vein thrombosis (Table 3).
 
One woman experienced decompensated heart failure requiring an intra-aortic balloon pump and a left ventricular assist device 9 months after diagnosis, followed by heart transplantation 1 year after the event. Two women underwent implantable cardioverter-defibrillator insertion due to symptomatic premature ventricular contractions and poor LVEF recovery. Seven women (23.3%) experienced nine thromboembolic events within 1 year of the PPCM episode, including left ventricular thrombi, ischaemic stroke, and pulmonary embolism. The median follow-up duration after PPCM was 47 months (range, 3-140). At 12 months, all-cause in-hospital mortality was 6.7%; causes of death were myocardial infarction and pulmonary embolism. Overall, recovery of left ventricular function (LVEF ≥50%) occurred in 60% of women (Table 3).
 
Antenatal co-morbidities, obstetric outcomes, and neonatal outcomes
Prior to PPCM, 80% of women received antenatal care. Four women (13.3%) had twin pregnancies. Antenatal anaemia was present in 50% of women. Hypertensive disorders of pregnancy occurred in 56.7%, whereas gestational diabetes was noted in 13.3%. Complications related to pre-eclampsia included haemolysis, elevated liver enzymes, and low platelets syndrome in 3.3%; eclampsia in 3.3%; and placental abruption in 6.7%. No women received tocolytics during pregnancy. The median gestational age at delivery was 37 weeks (range, 28-41). The caesarean section rate was 53.3%, and the most frequent indication was unstable maternal condition (31.3%). Primary postpartum haemorrhage occurred in 30% of cases; one woman required hysterectomy for placenta accreta spectrum. Among the 34 newborns, 32 (94.1%) were born alive; two were stillborn in the third trimester (5.9%) due to placental abruption and trisomy 18. The median birth weight was 2745 g, and 11.8% of newborns were small for gestational age. Four newborns (11.8%) had an Apgar score below 7 at 5 minutes, and nine (26.5%) required admission to a neonatal intensive care unit. There were no cases of early neonatal death (Table 4).
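The neonatal percentages use the number of newborns, not the number of women, as the denominator. The sketch below reconstructs that denominator and the reported percentages from the counts in the text, assuming that the 26 pregnancies other than the four twin pregnancies were singletons (an inference from the total of 34 newborns).

```python
women = 30
twin_pregnancies = 4

# Each twin pregnancy contributes one additional newborn
newborns = women + twin_pregnancies

neonatal_counts = {
    "live births": 32,
    "stillbirths": 2,
    "Apgar score <7 at 5 minutes": 4,
    "neonatal intensive care unit admission": 9,
}

neonatal_percentages = {
    outcome: round(100 * count / newborns, 1)
    for outcome, count in neonatal_counts.items()
}

for outcome, percentage in neonatal_percentages.items():
    print(f"{outcome}: {percentage}% of {newborns} newborns")
```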
 

Table 4. Antenatal co-morbidities, obstetric outcomes, and neonatal outcomes
 
Outcomes of subsequent pregnancies
The obstetric and cardiac outcomes of the 11 women with subsequent pregnancies are shown in Figure 2. The median interval between the PPCM-affected pregnancy and the next pregnancy was 17 months (range, 4-60). There were 13 subsequent pregnancies (three miscarriages, five terminations, and five live births). Of the five terminations, two were advised due to poor cardiac condition; the remaining three were elective for maternal anxiety or social reasons. There were no maternal deaths or cases of recurrent PPCM.
 

Figure 2. Obstetric and cardiac outcomes of subsequent pregnancies
 
Cases with genetic testing
Genetic analysis using a dilated cardiomyopathy (DCM) panel by next-generation sequencing was requested by physicians in three cases (online supplementary Table 1). In Case 1, a woman with a family history of heart failure, testing revealed a pathogenic variant in the FLNC gene. Case 2 was a patient with a history of cancer-related chemotherapy who had no signs of heart failure before pregnancy but developed refractory postpartum heart failure requiring heart transplantation 1 year after PPCM diagnosis; genetic testing identified two pathogenic variants in the TTN and MYBPC3 genes. Case 3 involved a woman with chronic kidney disease who exhibited persistent left ventricular systolic dysfunction 4 years after PPCM diagnosis. Genetic evaluation was pursued due to her young-onset multisystem disease and revealed a variant in the NEXN gene. This variant, associated with autosomal dominant monogenic DCM, was absent from population databases but showed conflicting results on in silico prediction algorithms; therefore, it was classified as a variant of uncertain significance. Overall, potentially pathogenic genetic variants were identified in at least 10% of women with PPCM.
 
Maternal factors associated with peripartum cardiomyopathy
Compared with the control group, univariable logistic regression analysis showed that factors associated with PPCM included advanced maternal age (≥40 years), smoking, hypertensive disorders of pregnancy, and antenatal anaemia. In multivariable regression analysis, PPCM was independently associated with hypertensive disorders of pregnancy (adjusted OR=38.00; 95% CI=9.66-149.52; P<0.001) and antenatal anaemia (adjusted OR=13.04; 95% CI=3.72-45.70; P<0.001) [online supplementary Table 2].
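The wide confidence intervals reflect the small number of cases. As an illustration, the standard error of each log odds ratio can be back-calculated from the reported interval, assuming the intervals are Wald-type (symmetric on the log scale), which is an assumption about the SPSS output and is not stated in the text. Under that assumption, the geometric mean of the interval limits should equal the point estimate, which provides a consistency check.

```python
import math


def summarise_wald_interval(odds_ratio, lower, upper, z_crit=1.96):
    """Back-calculate SE, z statistic, and two-sided P value from an OR and its 95% CI."""
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    # For a log-symmetric interval, sqrt(lower * upper) recovers the point estimate
    geometric_mean = math.sqrt(lower * upper)
    z_statistic = math.log(odds_ratio) / se_log_or
    # Two-sided P value from the standard normal distribution
    p_value = math.erfc(abs(z_statistic) / math.sqrt(2))
    return se_log_or, geometric_mean, z_statistic, p_value


# Adjusted ORs and 95% CIs as reported in this article
reported = {
    "hypertensive disorders of pregnancy": (38.00, 9.66, 149.52),
    "antenatal anaemia": (13.04, 3.72, 45.70),
}

summaries = {
    factor: summarise_wald_interval(*values) for factor, values in reported.items()
}

for factor, (se, gm, z, p) in summaries.items():
    print(f"{factor}: SE(ln OR) = {se:.2f}, geometric mean of CI = {gm:.2f}, "
          f"z = {z:.2f}, P < 0.001: {p < 0.001}")
```

Both back-calculated P values fall below 0.001, in agreement with the reported results, and the standard errors of roughly 0.6 to 0.7 on the log scale indicate that the point estimates should be interpreted with caution.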
 
Discussion
Time from symptom onset to diagnosis
Over the 10-year study period, we observed a PPCM incidence of 1 in 11 179 live births in Hong Kong. Worldwide variation in PPCM incidence may relate to ethnic and socio-economic factors12; rates are expected to increase because of advancing maternal age,13 multiple pregnancies, and obesity. About one-third of our patients developed symptoms before delivery, a finding comparable to the Asia-Pacific group in the ESC EURObservational Research Programme registry.10 Overall, 30% of women were diagnosed more than 7 days after symptom onset. Among those with antepartum-onset symptoms, 54.5% were diagnosed after delivery. This diagnostic delay may be attributed to the difficulty in distinguishing PPCM from normal physiological changes of pregnancy—its symptoms often mimic those of late gestation and may only be recognised postpartum when they become more pronounced. Delayed diagnosis has been associated with lower rates of left ventricular recovery.14 Early recognition and awareness among both pregnant women and healthcare professionals are crucial to enable prompt initiation of heart failure therapy, which may improve cardiac recovery. To support early detection and facilitate timely specialist referral for diagnostic evaluation, serum biomarkers can be measured to rule out heart failure with high probability during pregnancy or the postpartum period.15
 
Pre-eclampsia and peripartum cardiomyopathy
In our study, approximately half of the cases involved pre-eclampsia, a finding consistent with the Asia-Pacific cohort in the ESC EURObservational Research Programme registry.10 A meta-analysis of 22 studies demonstrated a fourfold higher prevalence of pre-eclampsia among women with PPCM relative to the general obstetric population (22% vs 5%).16 Our multivariable regression analysis confirmed that hypertensive disorders of pregnancy constituted an independent risk factor for PPCM. The association between pre-eclampsia and PPCM may be explained by their shared pathophysiological mechanism, namely systemic vascular angiogenic imbalance.1 15 17 Pre-eclampsia and PPCM might represent a single disease spectrum with substantial overlap.17 Low-dose aspirin is generally used for the prevention of pre-eclampsia and its associated morbidity and mortality.18 Although aspirin use for PPCM prevention is not supported by evidence-based guidelines, it could theoretically provide benefit due to the shared vascular dysfunction pathways. Consequently, the use of aspirin for pre-eclampsia prevention may indirectly reduce the risk of PPCM in high-risk women.
 
Anaemia and peripartum cardiomyopathy
We found that antenatal anaemia was independently associated with PPCM. A systematic review and meta-analysis previously indicated that women with anaemia had up to fivefold higher odds of developing PPCM compared with women exhibiting normal haemoglobin levels.19 The precise nature of this association remains unclear; iron deficiency may contribute by impairing myocardial contractile function.20 Anaemia screening and correction during pregnancy may help reduce the risk of PPCM.
 
Management of peripartum cardiomyopathy
A multidisciplinary approach involving cardiologists, obstetricians, intensivists, cardiac surgeons, anaesthesiologists, neonatologists, and nurses is essential for the management of PPCM.21 In severe cases with haemodynamic instability, acute management (including immediate resuscitation and mechanical respiratory or circulatory support) may be required.15 Urgent caesarean section should be considered for advanced heart failure that persists despite optimal medical therapy. According to international consensus, the main treatment should follow guideline-directed medical therapy for heart failure with reduced ejection fraction in non-pregnant patients, while respecting contraindications for certain drugs during pregnancy.6 22 23 24 25 Standard therapies include diuretics, ACEis or ARBs, mineralocorticoid receptor antagonists, vasodilators (hydralazine/nitrates), digoxin, beta-blockers, and anticoagulants. A 2022 meta-analysis of global data demonstrated that frequent prescription of beta-blockers, ACEis/ARBs, and bromocriptine or cabergoline was associated with lower all-cause mortality and better left ventricular recovery at 12 months.26 In our study, most patients received ACEis/ARBs and beta-blockers; fewer were prescribed bromocriptine at discharge. The rationale for using dopamine agonists to inhibit prolactin secretion lies in the proposed pathophysiological mechanism involving 16-kDa prolactin, an oxidative stress-mediated cleavage product that damages cardiovascular tissue.27 Regarding prolactin inhibition in women with PPCM, a meta-analysis reported that those treated with bromocriptine had higher odds of left ventricular recovery, without a significant difference in all-cause mortality.28 However, bromocriptine use is associated with an increased risk of thromboembolic complications. The 2019 ESC–Heart Failure Association position statement issued a weak recommendation for bromocriptine use, advising that it should always be accompanied by at least prophylactic anticoagulation.15 Future randomised controlled trials and registry data with longer follow-up are needed to provide stronger evidence supporting its use. For women who do not recover from PPCM within 1 year, the American College of Cardiology/American Heart Association Joint Committee and the ESC recommend implantable cardioverter-defibrillator therapy for the primary prevention of sudden cardiac death due to ventricular tachyarrhythmia.22 29 30 Cardiac transplantation may be required for patients with refractory severe heart failure despite maximal medical therapy, as occurred in one of our cases.
 
Cardiac recovery and mortality
Estimates of left ventricular recovery and mortality in PPCM vary considerably across geographic regions,26 presumably due to differences in medical therapy, access to healthcare services, and follow-up duration. A 2022 meta-analysis of 4875 patients from 60 countries reported overall 12-month rates of left ventricular recovery and all-cause mortality of 58.7% and 9.8%, respectively.26 In our cohort, 60% of women achieved cardiac recovery; two patients (6.7%) died of myocardial infarction and pulmonary embolism within 12 months of diagnosis. Both had poor social support and did not adhere to treatment or attend follow-up visits, which likely contributed to their adverse outcomes. These findings highlight the need for greater public awareness, improved medication compliance, and stronger social support systems. We recommend enhanced nursing outreach and structured patient education, along with post-discharge monitoring, to optimise outcomes.
 
Prevention of thromboembolic complications
Thromboembolism, a potentially life-threatening complication of PPCM, affected 23.3% of women in our cohort. This high rate may be attributed to the hypercoagulable state of pregnancy, impaired circulation, and blood stasis from cardiac failure. Our incidence was higher than the global rate of 6.1% reported in a recent international study.26 Therapeutic anticoagulation is recommended for patients with intracardiac thrombus or systemic embolism. In our study, 13.3% of patients received low–molecular-weight heparin for thromboembolism prophylaxis. Both the American Heart Association and the ESC recommend anticoagulation in PPCM cases involving severe left ventricular dysfunction (LVEF below a threshold of 30% to 35%) during the peripartum period and up to 8 weeks postpartum.29 31 Despite the high thromboembolic risk in PPCM, anticoagulation remains a subject of ongoing debate.32 Our data support prophylactic anticoagulation for all women with PPCM, given the high incidence observed. Ultimately, individual assessment of thromboembolic risk, considering the extent of left ventricular dysfunction, caesarean delivery, immobility, and ventricular dilatation, may help identify patients most likely to benefit from thromboprophylaxis.
 
Relapse of peripartum cardiomyopathy in subsequent pregnancies
Relapse of PPCM and associated mortality in subsequent pregnancies are not uncommon; rates range from 5.3% to 29.5% and 0% to 55.5%, respectively.33 In our study, nine of 11 patients (81.8%) had confirmed recovery of cardiac function before conception. There were no maternal deaths or PPCM recurrences during pregnancy. A recent meta-analysis showed that women with persistent left ventricular dysfunction prior to a subsequent pregnancy had a higher risk of mortality and worsening function compared to women whose cardiac function had recovered.33 However, recovered left ventricular function does not guarantee an uncomplicated subsequent pregnancy.34 35 It is crucial to monitor cardiac function throughout pregnancy—and up to 6 months postpartum—to detect subclinical left ventricular dysfunction or PPCM recurrence. Women with a history of PPCM should be counselled regarding the risks of future pregnancies, including irreversible ventricular deterioration, maternal death, and fetal loss.36 Subsequent pregnancy is not recommended if LVEF fails to normalise. Contraceptive counselling should begin early after the acute event; reliable methods with minimal thromboembolic risk are preferred.37
 
Genetic assessment
A study has demonstrated a genetic contribution to PPCM in at least 15% of cases.38 The most commonly affected gene is TTN, which encodes the large sarcomeric protein titin.39 The relative prevalence of truncating variants in these genes is nearly identical between PPCM and DCM.39 In our study, three of 30 patients (10%), all in the non-recovery group, underwent genetic testing, and each carried a variant in a cardiomyopathy-related gene (TTN, FLNC, MYBPC3, or NEXN), suggesting that at least 10% had a genetic predisposition to PPCM. The American College of Cardiology/American Heart Association Joint Committee recommends that patients with non-ischaemic cardiomyopathy undergo genetic counselling and testing for inherited cardiomyopathies to facilitate early cardiac disease detection and timely initiation of treatments that reduce heart failure progression and sudden death risk.22 The identification of pathogenic genetic variants can provide valuable prognostic information and clarify associated risks (eg, arrhythmic complications linked to FLNC and DSP mutations), thereby guiding decisions on preventive measures, including implantable defibrillator placement and exercise recommendations. Furthermore, cascade genetic testing for relatives enables closer pregnancy monitoring, informed reproductive decisions (including prenatal or preimplantation genetic diagnosis), and lifelong cardiovascular surveillance to improve outcomes.40 The value of routine genetic testing remains limited by low penetrance, variable clinical expression, and uncertain variant significance. It may also lead to patient anxiety, potential genetic discrimination, and substantial resource implications. Careful patient selection with thorough pre- and post-test counselling is essential. Because the clinical presentation of PPCM closely resembles that of DCM, the ESC suggests that genetic testing be considered in PPCM cases with a positive family history,15 where clinically actionable findings are most likely to be identified.
 
Limitations
This study had several limitations. Because PPCM is a rare condition, a small sample size was inevitable. The retrospective nature of data collection over a 10-year period may have resulted in incomplete information. Outcomes could also have been influenced by variations in heart failure management over time and across hospitals. Furthermore, some PPCM cases managed in the private sector or outside Hong Kong might not have been captured. The long-term impact of PPCM on women’s overall health was not assessed. The establishment of a local PPCM registry would facilitate a better understanding of the condition, identification of outcome determinants, and optimisation of clinical care in Hong Kong.
 
Conclusion
Peripartum cardiomyopathy is an uncommon but potentially life-threatening medical condition affecting women worldwide. Genetic factors contribute to disease susceptibility in at least 10% of cases. Genetic testing may offer a valuable tool to guide prognosis and management in affected women.
 
Author contributions
Concept or design: LSK Law, LT Kwong, PL So.
Acquisition of data: LSK Law, KH Siong, HC Mok, STK Wong, JKO Wai, CY Chow, WL Chan, KY Tse, YYY Chan, KS Eu, PL So.
Analysis or interpretation of data: LSK Law, PL So.
Drafting of the manuscript: LSK Law, PL So.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank all staff in the Statistics Department at Tuen Mun Hospital for their assistance with data collection.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Central Institutional Review Board of Hospital Authority, Hong Kong (Ref No.: CIRB-2023-114-3). The requirement for informed patient consent was waived by the Board due to the retrospective nature of the research. All data used in the research were anonymised and unidentifiable.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. Davis MB, Arany Z, McNamara DM, Goland S, Elkayam U. Peripartum cardiomyopathy: JACC state-of-the-art review. J Am Coll Cardiol 2020;75:207-21.
2. Sliwa K, Hilfiker-Kleiner D, Petrie MC, et al. Current state of knowledge on aetiology, diagnosis, management, and therapy of peripartum cardiomyopathy: a position statement from the Heart Failure Association of the European Society of Cardiology Working Group on peripartum cardiomyopathy. Eur J Heart Fail 2010;12:767-78.
3. Isezuo SA, Abubakar SA. Epidemiologic profile of peripartum cardiomyopathy in a tertiary care hospital. Ethn Dis 2007;17:228-33.
4. Kamiya CA, Kitakaze M, Ishibashi-Ueda H, et al. Different characteristics of peripartum cardiomyopathy between patients complicated with and without hypertensive disorders. -Results from the Japanese Nationwide survey of peripartum cardiomyopathy-. Circ J 2011;75:1975-81.
5. Pierce JA, Price BO, Joyce JW. Familial occurrence of postpartal heart failure. Arch Intern Med 1963;111:651-5.
6. Morales A, Painter T, Li R, et al. Rare variant mutations in pregnancy-associated or peripartum cardiomyopathy. Circulation 2010;121:2176-82.
7. van Spaendonck-Zwarts KY, van Tintelen JP, van Veldhuisen DJ, et al. Peripartum cardiomyopathy as a part of familial dilated cardiomyopathy. Circulation 2010;121:2169-75.
8. van Spaendonck-Zwarts KY, Posafalvi A, van den Berg MP, et al. Titin gene mutations are common in families with both peripartum cardiomyopathy and dilated cardiomyopathy. Eur Heart J 2014;35:2165-73.
9. Honigberg MC, Givertz MM. Peripartum cardiomyopathy. BMJ 2019;364:k5287.
10. Sliwa K, Petrie MC, van der Meer P, et al. Clinical presentation, management, and 6-month outcomes in women with peripartum cardiomyopathy: an ESC EORP registry. Eur Heart J 2020;41:3787-97.
11. Koerber D, Khan S, Kirubarajan A, et al. Meta-analysis of long-term (>1 year) cardiac outcomes of peripartum cardiomyopathy. Am J Cardiol 2023;194:71-7.
12. Karaye KM, Ishaq NA, Sai’du H, et al. Disparities in clinical features and outcomes of peripartum cardiomyopathy in high versus low prevalent regions in Nigeria. ESC Heart Fail 2021;8:3257-67.
13. Kolte D, Khera S, Aronow WS, et al. Temporal trends in incidence and outcomes of peripartum cardiomyopathy in the United States: a nationwide population-based study. J Am Heart Assoc 2014;3:e001056.
14. Lewey J, Levine LD, Elovitz MA, Irizarry OC, Arany Z. Importance of early diagnosis in peripartum cardiomyopathy. Hypertension 2020;75:91-7.
15. Bauersachs J, König T, van der Meer P, et al. Pathophysiology, diagnosis and management of peripartum cardiomyopathy: a position statement from the Heart Failure Association of the European Society of Cardiology Study Group on peripartum cardiomyopathy. Eur J Heart Fail 2019;21:827-43.
16. Bello N, Rendon IS, Arany Z. The relationship between pre-eclampsia and peripartum cardiomyopathy: a systematic review and meta-analysis. J Am Coll Cardiol 2013;62:1715-23.
17. Parikh P, Blauwet L. Peripartum cardiomyopathy and preeclampsia: overlapping diseases of pregnancy. Curr Hypertens Rep 2018;20:69.
18. Henderson JT, Vesco KK, Senger CA, Thomas RG, Redmond N. Aspirin use to prevent preeclampsia and related morbidity and mortality: updated evidence report and systematic review for the US Preventive Services Task Force. JAMA 2021;326:1192-206.
19. Cherubin S, Peoples T, Gillard J, Lakhal-Littleton S, Kurinczuk JJ, Nair M. Systematic review and meta-analysis of prolactin and iron deficiency in peripartum cardiomyopathy. Open Heart 2020;7:e001430.
20. Anand IS, Gupta P. Anemia and iron deficiency in heart failure: current concepts and emerging therapies. Circulation 2018;138:80-98.
21. Sigauke FR, Ntsinjana H, Tsabedze N. Peripartum cardiomyopathy: a comprehensive and contemporary review. Heart Fail Rev 2024;29:1261-78.
22. Heidenreich PA, Bozkurt B, Aguilar D, et al. 2022 AHA/ACC/HFSA Guideline for the Management of Heart Failure: a report of the American College of Cardiology/American Heart Association Joint Committee on Clinical Practice Guidelines. Circulation 2022;145:e895-1032.
23. Arany Z. Peripartum cardiomyopathy. N Engl J Med 2024;390:154-64.
24. Azibani F, Sliwa K. Peripartum cardiomyopathy: an update. Curr Heart Fail Rep 2018;15:297-306.
25. Maddox TM, Januzzi JL Jr, Allen LA, et al. 2024 ACC Expert Consensus Decision Pathway for treatment of heart failure with reduced ejection fraction: a report of the American College of Cardiology Solution Set Oversight Committee. J Am Coll Cardiol 2024;83:1444-88.
26. Hoevelmann J, Engel ME, Muller E, et al. A global perspective on the management and outcomes of peripartum cardiomyopathy: a systematic review and meta-analysis. Eur J Heart Fail 2022;24:1719-36.
27. Hilfiker-Kleiner D, Kaminski K, Podewski E, et al. A cathepsin D–cleaved 16 kDa form of prolactin mediates postpartum cardiomyopathy. Cell 2007;128:589-600.
28. Kumar A, Ravi R, Sivakumar RK, et al. Prolactin inhibition in peripartum cardiomyopathy: systematic review and meta-analysis. Curr Probl Cardiol 2023;48:101461.
29. Bauersachs J, Arrigo M, Hilfiker-Kleiner D, et al. Current management of patients with severe acute peripartum cardiomyopathy: practical guidance from the Heart Failure Association of the European Society of Cardiology Study Group on peripartum cardiomyopathy. Eur J Heart Fail 2016;18:1096-105.
30. McDonagh TA, Metra M, Adamo M, et al. 2021 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure: developed by the Task Force for the diagnosis and treatment of acute and chronic heart failure of the European Society of Cardiology (ESC). With the special contribution of the Heart Failure Association (HFA) of the ESC. Eur J Heart Fail 2022;24:4-131.
31. Bozkurt B, Colvin M, Cook J, et al. Current diagnostic and treatment strategies for specific dilated cardiomyopathies: a scientific statement from the American Heart Association. Circulation 2016;134:e579-646.
32. Radakrishnan A, Dokko J, Pastena P, Kalogeropoulos AP. Thromboembolism in peripartum cardiomyopathy: a systematic review. J Thorac Dis 2024;16:645-60.
33. Wijayanto MA, Myrtha R, Lukas GA, et al. Outcomes of subsequent pregnancy in women with peripartum cardiomyopathy: a systematic review and meta-analysis. Open Heart 2024;11:e002626.
34. Pachariyanon P, Bogabathina H, Jaisingh K, Modi M, Modi K. Long-term outcomes of women with peripartum cardiomyopathy having subsequent pregnancies. J Am Coll Cardiol 2023;82:16-26.
35. Fett JD, Shah TP, McNamara DM. Why do some recovered peripartum cardiomyopathy mothers experience heart failure with a subsequent pregnancy? Curr Treat Options Cardiovasc Med 2015;17:354.
36. Sliwa K, van der Meer P, Petrie MC, et al. Corrigendum to ‘Risk stratification and management of women with cardiomyopathy/heart failure planning pregnancy or presenting during/after pregnancy: a position statement from the Heart Failure Association of the European Society of Cardiology Study Group on Peripartum Cardiomyopathy’ [Eur J Heart Fail 2021;23:527-540]. Eur J Heart Fail 2022;24:733.
37. Sliwa K, Petrie MC, Hilfiker-Kleiner D, et al. Long-term prognosis, subsequent pregnancy, contraception and overall management of peripartum cardiomyopathy: practical guidance paper from the Heart Failure Association of the European Society of Cardiology Study Group on Peripartum Cardiomyopathy. Eur J Heart Fail 2018;20:951-62.
38. Ware JS, Li J, Mazaika E, et al. Shared genetic predisposition in peripartum and dilated cardiomyopathies. N Engl J Med 2016;374:233-41.
39. Goli R, Li J, Brandimarto J, et al. Genetic and phenotypic landscape of peripartum cardiomyopathy. Circulation 2021;143:1852-62.
40. Arany Z. It is time to offer genetic testing to women with peripartum cardiomyopathy. Circulation 2022;146:4-5.

Use of 18F-fluorodeoxyglucose positron emission tomography coupled with computed tomography in early breast cancer management: consensus-based local recommendations by the Hong Kong Breast Cancer Foundation PET/CT Study Group

Hong Kong Med J 2025;31:Epub 12 Nov 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
Carol CH Kwok, MB, ChB, FHKAM (Radiology)# † 1; Henry CY Wong, MB, BS, FHKAM (Radiology)# † 1; Catherine YH Wong, MB, BS, FHKAM (Radiology)† 2; LW Yuen, MS, MA3; CC Yau, MB, BS, FHKAM (Radiology)† 3; Polly SY Cheung, MB, BS, FHKAM (Surgery)† 3
1 Department of Oncology, Princess Margaret Hospital, Hong Kong SAR, China
2 Department of Nuclear Medicine, Hong Kong Sanatorium & Hospital, Hong Kong SAR, China
3 Hong Kong Breast Cancer Foundation, Hong Kong SAR, China
# Equal contribution
† Members of the Hong Kong Breast Cancer Foundation PET/CT Study Group
 
Corresponding author: Dr Carol CH Kwok (kwokch@ha.org.hk)
 
Abstract
Introduction: 18F-fluorodeoxyglucose positron emission tomography coupled with computed tomography (PET/CT) has been incorporated into breast cancer management. In Hong Kong, PET/CT use is increasing. This study aimed to establish consensus-based recommendations on the use of PET/CT in the management of early breast cancer.
 
Methods: A literature search was conducted in September 2023 using the keywords “breast cancer” and “PET/CT” within PubMed to identify research articles related to the use of PET/CT in early breast cancer. Guidelines from major international cancer agencies were also reviewed. Ten recommendation statements were drafted. A two-round modified Delphi consensus process was conducted over a 3-month period (19 December 2023 to 29 February 2024).
 
Results: A total of 76 experts consented to participate in the first round, of whom 71 completed the second round and were included as members of the expert panel, yielding a second-round response rate of 93.4%. The panel comprised oncologists (n=30, 42.3%), surgeons (n=35, 49.3%), and radiologists (including nuclear medicine radiologists) [n=6, 8.5%]. Experts from the Hospital Authority (n=37, 52.1%) and the private sector (n=32, 45.1%) were well represented. Two experts (2.8%) were from one of the two local university medical faculties. Over 75% of expert panel members had at least 15 years of clinical experience. Of the ten statements, consensus was achieved on seven in the first round and one additional statement in the second round.
 
Conclusion: Through the consensus process, the proposed recommendations are expected to gain wider acceptance and recognition among local healthcare professionals as guidance for the use of PET/CT in early breast cancer management.
 
 
New knowledge added by this study
  • First-of-its-kind local consensus-based recommendations on the use of positron emission tomography coupled with computed tomography (PET/CT) in early breast cancer were established.
  • The proposed recommendations were based on the most extensive and up-to-date evidence available, reflecting updated international guideline recommendations.
  • The consensus-establishing process provided a platform for exchange and sharing among multidisciplinary teams in resolving controversial aspects of clinical practice.
Implications for clinical practice or policy
  • Local recommendations on the use of PET/CT for early breast cancer patients have been proposed in light of the increasing availability of PET/CT facilities in Hong Kong.
  • These consensus recommendations cover important and relevant clinical settings, including screening, preoperative assessment of multifocality, axillary staging, pretreatment staging, evaluation of tumour response and axillary nodal status in the neoadjuvant setting before surgery, re-staging in recurrence, and follow-up for surveillance.
  • Through the consensus process, the proposed recommendations are expected to gain wider acceptance and recognition among local healthcare professionals as guidance on the use of PET/CT in early breast cancer management.
 
 
Introduction
Diagnostic imaging plays an important role in the screening, diagnosis, staging, and follow-up of patients affected by breast cancer. Mammography and breast ultrasound are the current standards of care for screening, diagnosis, and surveillance. For patients with locally advanced disease, guidelines recommend contrast-enhanced computed tomography (CT) scans and bone scans to detect distant metastases. In recent years, 18F-fluorodeoxyglucose (18F-FDG) positron emission tomography coupled with CT (PET/CT) has been introduced as an important imaging modality in oncological care. It is a powerful tool that combines the spatial resolution of a CT scan with information regarding biological processes within the scanned region. Positron emission tomography coupled with CT has the potential to identify malignant disease that may otherwise be missed or classified as benign based on size or morphological features in conventional imaging modalities.
 
In 2021, the Hong Kong Breast Cancer Foundation (HKBCF) analysed the utilisation of PET/CT among patients enrolled in the Hong Kong Breast Cancer Registry since 2007. Among the 4154 patients studied, the utilisation rate of PET/CT was 40.4% (online supplementary Fig 1). There was an increasing trend in PET/CT scan use for breast cancer staging over the past two decades. The overall utilisation of PET/CT increased from 23.3% in 2006-2010, to 48.5% in 2011-2015, and to 61.6% in the 2016-2021 cohort across all cancer stages (online supplementary Fig 2). This trend largely reflected the increasing availability of PET/CT facilities in Hong Kong. Over the past two decades, multiple PET/CT scanning facilities have been established in both the public and private sectors, making the service more accessible. Overall, usage of PET/CT was correlated with higher pathological stages of disease. Notably, PET/CT was used in up to 13.8% of stage 0 cases and 21.0% of stage I cases (online supplementary Fig 3).
 
Given the relatively high costs, concerns regarding radiation exposure, and the possibility of false-negative results, it is important to provide local recommendations on which groups of patients would benefit from the use of PET/CT in breast cancer. Through this study, we aimed to develop a local guideline regarding the use of PET/CT for early breast cancer to assist healthcare professionals in making evidence-based recommendations.
 
Methods
The objective of this study was to develop local recommendations on how to utilise PET/CT in the screening, diagnosis, staging, treatment response assessment, and surveillance of early breast cancer. A study group consisting of five members from the HKBCF (first, second, third, fifth and sixth authors) was convened. Study Group members were involved in performing the literature search, constructing the Delphi survey, analysing data, interpreting findings, and providing final approval of the recommendations.
 
To construct the survey, a literature search was performed in September 2023 by the Study Group using the keywords “breast cancer” and “PET/CT” in PubMed to identify research articles related to the use of PET/CT in early breast cancer. Systematic reviews and randomised controlled trials were prioritised to form the evidence base for the proposed statements. Guidelines from major international cancer agencies, including the National Comprehensive Cancer Network (NCCN) and the European Society for Medical Oncology, were reviewed. Ten statements were drafted based on the literature and international guidelines.
 
Delphi consensus process
A two-round modified Delphi consensus process was conducted over a period of 3 months (19 December 2023 to 29 February 2024). Surveys were developed using Google Forms, a web-based development tool. Responses provided by individual participants were anonymised to protect confidentiality. This study did not involve any patients as participants. Only individuals who took part in the first round were invited to participate in the second round.
 
Experienced physicians with an interest in breast cancer, working in the medical faculties of The University of Hong Kong and The Chinese University of Hong Kong, the Hospital Authority, and the private sector, were identified by the Study Group and invited to participate in the Delphi process. Additionally, members of the Hong Kong Breast Cancer Registry Steering Committee, the Hong Kong Breast Oncology Group, and the Hong Kong Society of Breast Surgeons were invited. Emails were sent to all potential participants by the Study Group to confirm their interest in participating.
 
After providing informed consent, participants were directed to an online survey for completion. In the first round, participants were provided with a summary of evidence corresponding to each of the ten statements in the survey (online Appendix 1). Participants were asked to indicate the extent of their agreement or disagreement on a five-point Likert scale (‘Completely agree’, ‘Agree’, ‘Neutral’, ‘Disagree’, and ‘Completely disagree’) for each statement. Respondents who selected ‘Disagree’ or ‘Completely disagree’ were asked to provide reasons for their choice in a free-text field within the survey. In accordance with published recommendations, statements that achieved agreement (‘Completely agree’ or ‘Agree’) from more than 75% of participants were considered to have reached consensus.
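The consensus rule described above can be illustrated with a minimal Python sketch. It tallies the five-point Likert responses for a single statement and checks whether the combined proportion of ‘Completely agree’ and ‘Agree’ exceeds 75%. The function name, constants, and example votes are hypothetical and are not drawn from the study data or the survey tool.

```python
from collections import Counter

# Response options counted as agreement under the rule described in the text.
AGREE_LEVELS = {"Completely agree", "Agree"}
# Consensus requires agreement from MORE than 75% of participants.
CONSENSUS_THRESHOLD = 0.75


def reached_consensus(responses):
    """Return (agreement_fraction, consensus_flag) for one statement."""
    if not responses:
        raise ValueError("no responses supplied")
    counts = Counter(responses)
    agree = sum(counts[level] for level in AGREE_LEVELS)
    fraction = agree / len(responses)
    return fraction, fraction > CONSENSUS_THRESHOLD


# Hypothetical votes for illustration only (not study data).
example_votes = (
    ["Completely agree"] * 30
    + ["Agree"] * 28
    + ["Neutral"] * 10
    + ["Disagree"] * 8
)
fraction, consensus = reached_consensus(example_votes)
print(f"{fraction:.1%} agreement; consensus reached: {consensus}")
```

Because the comparison is strict, a statement with exactly 75% agreement would not count as having reached consensus under this reading of the rule.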
 
Following participant voting, the Study Group compiled and prepared the results from the first round. Statements that did not reach consensus were reviewed and amended based on participant feedback. For the second round, statements that did not reach consensus, or were newly created or modified based on participant feedback, were sent as a survey to the same participants. Participants were shown the results of the first round and informed where amendments had been made to statements in the second round.
 
Consensus statement disclaimer
The recommendations provided in this publication reflect the majority opinion of the expert panel. Although the recommendations are intended to guide clinical decision-making, they should not be regarded as the sole indications for utilising PET/CT in early breast cancer management. These consensus-based recommendations are designed to provide guidance for oncologists, surgeons, general practitioners, radiologists, and other physicians involved in the care of patients with early breast cancer. Treatment decisions for individual patients should ultimately be made at the discretion of the treating clinician, in conjunction with the patient’s unique needs and through shared decision-making.
 
Results
Two Delphi consensus rounds were completed. Among the 270 invited experts, 76 consented to participate in the first round, of whom 71 completed the second round and were included as members of the expert panel (online Appendix 2). The response rate for the second round was 93.4%. The panel comprised oncologists (n=30, 42.3%), surgeons (n=35, 49.3%), and radiologists (including nuclear medicine radiologists) [n=6, 8.5%]. Experts from the Hospital Authority (n=37, 52.1%) and the private sector (n=32, 45.1%) were well represented. Two experts (2.8%) were from one of the two medical faculties of the local universities. Over 75% of expert panel members had at least 15 years of clinical experience.
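As a simple arithmetic check, the reported percentages can be reproduced from the counts given above. The short Python sketch below uses only the figures stated in this paragraph; the variable and function names are illustrative.

```python
# Counts as reported in the Results; names are illustrative only.
FIRST_ROUND = 76  # experts who consented to participate in the first round
PANEL = 71        # experts who completed the second round (the expert panel)

specialty = {"oncologists": 30, "surgeons": 35, "radiologists": 6}
affiliation = {
    "Hospital Authority": 37,
    "private sector": 32,
    "university medical faculty": 2,
}


def pct(count, total):
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Each breakdown should account for every panel member.
assert sum(specialty.values()) == PANEL
assert sum(affiliation.values()) == PANEL

print(f"Second-round response rate: {pct(PANEL, FIRST_ROUND)}%")
for group, count in {**specialty, **affiliation}.items():
    print(f"{group}: n={count}, {pct(count, PANEL)}%")
```

Both breakdowns sum to 71, and the rounded values match the percentages reported in the text.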
 
Of the ten statements, consensus was achieved on seven in the first round. Three statements were returned to the expert panel for rating in the second round, of which one achieved consensus (Fig). The results of the final consensus on the recommendation statements after the two-round Delphi consensus process are listed in the Table.
 

Figure. Modified Delphi process
 

Table. Results of the final consensus on the recommendation statements after a two-round Delphi consensus process
 
Discussion
In recent years, driven by increasing demand and easier access to PET/CT services, there has been a substantial increase in the use of PET/CT for breast cancer patients. Currently, there are 33 PET/CT machines across public, private, and academic institutions in Hong Kong. While PET/CT has the capability to enhance the detection of occult malignant disease, it also carries the risk of identifying false-positives and incidental findings, which could lead to unnecessary investigations and potentially delay curative-intent treatments. Although the utility of PET/CT in various breast cancer settings has been widely studied, there remains a lack of large prospective randomised studies comparing it with other imaging modalities. Given that PET/CT is costly and poses concerns about increased radiation exposure compared with other imaging techniques, such as contrast-enhanced CT scans, the development of local guidance and recommendations regarding its indications is clinically relevant and essential. To our knowledge, this consensus-based guideline is the first to provide practical recommendations on the use of PET/CT for breast cancer management.
 
Of the ten recommendation statements proposed, seven achieved consensus in the first round, suggesting that the indications for PET/CT in these areas are clear-cut and less controversial. These statements covered areas related to the screening, diagnosis, staging, and surveillance of breast cancer. Overall, the majority of local experts agreed that PET/CT should only be utilised in situations where patients have a high risk of distant metastases. This approach includes staging patients with advanced clinical stage disease or aggressive tumour biology and evaluating cancer survivors with suspicious clinical signs and symptoms suggestive of recurrence. Conversely, PET/CT should not be used in situations where the likelihood of detecting malignant disease is low, such as staging of ductal carcinoma in situ or stage I disease, screening asymptomatic women for breast cancer, and routine surveillance of cancer survivors. Increased 18F-FDG avidity of malignant cells forms the basis of 18F-FDG-PET in breast cancer imaging. Tumour characteristics that limit the sensitivity of 18F-FDG-PET in breast cancer imaging include small tumour size, low tumour grade, low proliferation, high expression of hormone receptors (particularly luminal A phenotype), and lobular histological type.1 2 3 Positron emission tomography coupled with CT therefore has limited sensitivity in detecting subcentimetre tumours,4 5 micrometastases, and small lymph node metastases in a clinically negative axilla relative to sentinel lymph node biopsy (SLNB).6 7 Additionally, the specificity of PET/CT is affected—some benign tumours and infectious or inflammatory conditions can demonstrate 18F-FDG uptake.8 Positron emission tomography coupled with CT has limited spatial resolution in assessing the multifocality of breast cancer.9
 
In contrast to its low sensitivity for detecting axillary nodal metastases, 18F-FDG PET/CT demonstrates high sensitivity in detecting extra-axillary lymph node involvement, including internal mammary, infraclavicular, and supraclavicular nodes10 11; distant metastases; and other unsuspected synchronous malignancies during initial breast cancer staging, which can potentially lead to upstaging and ultimately modification of planned treatment.12 13 14 The detection of extra-axillary lymph node involvement aids in selecting candidates for neoadjuvant chemotherapy and may guide subsequent radiotherapy planning to ensure adequate coverage of nodal involvement sites.11 15 16 In contrast to stage 0 and stage I disease, where the likelihood of distant metastasis is low, there is a growing body of evidence that PET/CT may outperform conventional imaging (contrast-enhanced CT of the thorax, abdomen, and pelvis; and bone scan).17 18 Furthermore, high-grade and poor-risk cancer subtypes may exhibit increased 18F-FDG uptake, thereby enhancing the diagnostic yield of PET/CT in staging these tumours.19 20 21 Our recommendations align with those of the NCCN22 and the French working group,23 which recently updated their guidance in this regard.
 
Controversies
The two recommendation statements that did not reach consensus after the Delphi rounds related to post–neoadjuvant therapy evaluation of tumour response to guide surgery to the primary tumour and axilla. In recent years, neoadjuvant chemotherapy has been increasingly used to downstage disease, facilitate surgery, and provide an opportunity for in vivo tumour response assessment to guide individualised treatment escalation or de-escalation after surgery. This approach has become the standard of care for patients with larger tumours who wish to undergo breast-conserving therapy and for stage II and III patients with aggressive tumour biology (eg, triple-negative and human epidermal growth factor receptor 2–positive breast cancer).22 Current studies on post-neoadjuvant chemotherapy tumour response assessment have mainly focused on the prediction of pathological complete response.24 25 26 27 Previous studies have shown that magnetic resonance imaging (MRI) may exhibit higher sensitivity, whereas PET/CT demonstrates higher specificity in predicting the pathological response after neoadjuvant chemotherapy, indicating the complementary value of combining these modalities to improve diagnostic performance.28
 
The method of assessing primary tumour response during neoadjuvant therapy has varied across clinical trials. For example, in the NeoSphere trial, which evaluated the addition of neoadjuvant pertuzumab to docetaxel and trastuzumab, clinical response was assessed via physical examination.29 Other trials have supplemented clinical assessment with diagnostic imaging during treatment. In the PREDIX HER2 trial, which compared neoadjuvant docetaxel, trastuzumab and pertuzumab versus trastuzumab emtansine, investigators routinely utilised mammography, ultrasound, or MRI after the second, fourth, and sixth cycles for response assessment.30 Positron emission tomography coupled with CT was performed at baseline, then repeated after the second and final cycles at the investigators’ discretion.30 Currently, international guidelines vary in their recommendations of preferred assessment modality. The 2024 European Society for Medical Oncology guideline31 recommends the use of MRI to assess local response if pretreatment MRI data are available. The NCCN guidelines22 suggest that assessment should include physical examination and imaging studies, with the choice of imaging modality determined by a multidisciplinary team. The differing opinions within our expert panel reflect these variations in existing evidence and guidelines. Clinicians should individualise their assessment strategy based on the patient’s clinical status and access to imaging modalities.
 
It has long been the standard of care to offer axillary lymph node dissection to patients with a clinically positive axillary lymph node to ensure adequate tumour clearance. However, given the introduction of neoadjuvant systemic therapies, ongoing studies are evaluating alternative approaches to axillary management to reduce the risk of arm lymphoedema. In patients who have converted from clinically node-positive to clinically node-negative disease after systemic therapy, SLNB and targeted axillary lymph node dissection are currently recommended by international guidelines (instead of routine axillary lymph node dissection).22 Our Delphi study surveyed the views of local experts on whether PET/CT should be recommended as an additional imaging modality to screen for occult residual axillary disease. While recognising that PET/CT may yield false-positive results, some experts reported using PET/CT to guide whether axillary lymph node dissection could be undertaken directly without a positive SLNB, particularly in patients with initially bulky axillary disease. This approach aligns with the latest NCCN guidelines,22 which caution against the use of SLNB in pre-chemotherapy clinical N2 stage disease. The statement that PET/CT is not recommended to guide the decision for axillary lymph node dissection in patients with clinically node-positive disease who become node-negative on clinical examination and ultrasound and/or MRI after neoadjuvant systemic therapy remains open. Further studies regarding the accuracy of PET/CT in this context may help resolve the controversy. The management approach for the axilla after neoadjuvant therapy is constantly evolving. For example, axillary radiation is currently being tested as an alternative to axillary lymph node dissection in the ongoing Alliance A011202 randomised trial among patients with a positive SLNB.32 The timing and role of PET/CT will need to be re-evaluated within this ever-changing paradigm of axillary management in the neoadjuvant setting.
 
Positron emission tomography coupled with CT is often presumed to involve high radiation exposure. However, when used appropriately for breast cancer staging with low-dose, non-contrast CT, the radiation exposure can be considerably lower than that of whole-body, high-resolution contrast CT combined with a bone scan. Previous international guidelines have suggested that PET/CT can be performed in situations where standard staging studies are equivocal or suspicious.22 31 Such a sequential approach may not be cost-effective in the clinical scenarios outlined by our expert panel and may expose patients to unnecessary radiation from multiple whole-body imaging examinations. The use of PET/CT as a one-stop assessment enables quicker evaluation of disease status and can facilitate earlier initiation of appropriate treatment.33
 
Strengths and limitations
A strength of our Delphi consensus study is that it involved a large group of experienced specialists representing multiple disciplines and both the public and private sectors. This consensus exercise provided a valuable platform in which clinical experiences, practices, ideas, and opinions were shared and exchanged anonymously. It also helped resolve controversial issues and achieve consensus, particularly in areas where high-level evidence is absent. Recommendations that have achieved consensus should receive wider acceptance and recognition when incorporated into clinical practice.
 
However, our study had notable limitations. First, expert panellists were invited by the Study Group, and thus the consensus results may not fully reflect the views of all local practitioners involved in treating breast cancer patients. Nevertheless, our sample size of more than 70 participants is considered large for Delphi studies, and we achieved balanced representation of participants from various backgrounds. Second, the initial statements were devised based on recently published articles selected by the Study Group, which could introduce bias compared with a formal systematic review. However, the Study Group prioritised reviewing meta-analyses and randomised controlled trials when drafting the initial statements to ensure they reflected the most up-to-date, high-level evidence.
 
Conclusion
Based on the results of this Delphi consensus study, the HKBCF PET/CT Study Group provides recommendations on the use of PET/CT for early breast cancer in areas of screening, diagnosis, staging, and surveillance. These recommendations are intended to guide the appropriate use of PET/CT in the local population across both public and private healthcare settings. Breast cancer management is rapidly advancing, and the management paradigm is continually evolving as new evidence becomes available. As technology progresses, more innovative imaging modalities, such as PET/MRI and PET scans with new radiotracers, are expected to play an increasing role.14 34 35 The Study Group will review and update these recommendation guidelines at regular intervals based on emerging evidence, particularly in relation to response assessment during and after neoadjuvant systemic therapy.
 
Author contributions
Concept or design: PSY Cheung, CC Yau, CCH Kwok, HCY Wong, CYH Wong.
Acquisition of data: CCH Kwok, HCY Wong.
Analysis or interpretation of data: HCY Wong, CCH Kwok, LW Yuen.
Drafting of the manuscript: CCH Kwok, HCY Wong.
Critical revision of the manuscript for important intellectual content: CCH Kwok, HCY Wong, CYH Wong, CC Yau, PSY Cheung.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank all participants who contributed to this research.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Breast Cancer Research Centre Research Committee of the Hong Kong Breast Cancer Foundation. The requirement for informed consent from patients was waived by the Committee as patient data collection by the Hong Kong Breast Cancer Registry was approved by respective participating hospitals and centres. The present study does not involve patient participation and there was no new patient data collection.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. Groheux D, Giacchetti S, Moretti JL, et al. Correlation of high 18F-FDG uptake to clinical, pathological and biological prognostic factors in breast cancer. Eur J Nucl Med Mol Imaging 2011;38:426-35. Crossref
2. Buck A, Schirrmeister H, Kühn T, et al. FDG uptake in breast cancer: correlation with biological and clinical prognostic parameters. Eur J Nucl Med Mol Imaging 2002;29:1317-23. Crossref
3. Humbert O, Berriolo-Riedinger A, Cochet A, et al. Prognostic relevance at 5 years of the early monitoring of neoadjuvant chemotherapy using 18F-FDG PET in luminal HER2-negative breast cancer. Eur J Nucl Med Mol Imaging 2014;41:416-27. Crossref
4. Avril N, Rosé CA, Schelling M, et al. Breast imaging with positron emission tomography and fluorine-18 fluorodeoxyglucose: use and limitations. J Clin Oncol 2000;18:3495-502. Crossref
5. Kumar R, Chauhan A, Zhuang H, Chandra P, Schnall M, Alavi A. Clinicopathologic factors associated with false negative FDG-PET in primary breast cancer. Breast Cancer Res Treat 2006;98:267-74. Crossref
6. Peare R, Staff RT, Heys SD. The use of FDG-PET in assessing axillary lymph node status in breast cancer: a systematic review and meta-analysis of the literature. Breast Cancer Res Treat 2010;123:281-90. Crossref
7. Cooper KL, Harnan S, Meng Y, et al. Positron emission tomography (PET) for assessment of axillary lymph node status in early breast cancer: a systematic review and meta-analysis. Eur J Surg Oncol 2011;37:187-98. Crossref
8. Adejolu M, Huo L, Rohren E, Santiago L, Yang WT. False-positive lesions mimicking breast cancer on FDG PET and PET/CT. AJR Am J Roentgenol 2012;198:W304-14. Crossref
9. Ergul N, Kadioglu H, Yildiz S, et al. Assessment of multifocality and axillary nodal involvement in early-stage breast cancer patients using 18F-FDG PET/CT compared to contrast-enhanced and diffusion-weighted magnetic resonance imaging and sentinel node biopsy. Acta Radiol 2015;56:917-23. Crossref
10. Aukema TS, Straver ME, Peeters MJ, et al. Detection of extra-axillary lymph node involvement with FDG PET/CT in patients with stage II–III breast cancer. Eur J Cancer 2010;46:3205-10. Crossref
11. Seo MJ, Lee JJ, Kim HO, et al. Detection of internal mammary lymph node metastasis with 18F-fluorodeoxyglucose positron emission tomography/computed tomography in patients with stage III breast cancer. Eur J Nucl Med Mol Imaging 2014;41:438-45. Crossref
12. Rong J, Wang S, Ding Q, Yun M, Zheng Z, Ye S. Comparison of 18FDG PET-CT and bone scintigraphy for detection of bone metastases in breast cancer patients. A meta-analysis. Surg Oncol 2013;22:86-91. Crossref
13. Sun Z, Yi YL, Liu Y, Xiong JP, He CZ. Comparison of whole-body PET/PET-CT and conventional imaging procedures for distant metastasis staging in patients with breast cancer: a meta-analysis. Eur J Gynaecol Oncol 2015;36:672-6.
14. Han S, Choi JY. Impact of 18F-FDG PET, PET/CT, and PET/MRI on staging and management as an initial staging modality in breast cancer: a systematic review and meta-analysis. Clin Nucl Med 2021;46:271-82. Crossref
15. Groheux D, Espié M, Giacchetti S, Hindié E. Performance of FDG PET/CT in the clinical management of breast cancer. Radiology 2013;266:388-405. Crossref
16. Borm KJ, Voppichler J, Düsberg M, et al. FDG/PET-CT–based lymph node atlas in breast cancer patients. Int J Radiat Oncol Biol Phys 2019;103:574-82. Crossref
17. Caresia Aroztegui AP, García Vicente AM, Alvarez Ruiz S, et al. 18F-FDG PET/CT in breast cancer: evidence-based recommendations in initial staging. Tumor Biol 2017;39:1010428317728285. Crossref
18. Dayes IS, Metser U, Hodgson N, et al. Impact of 18F-labeled fluorodeoxyglucose positron emission tomography–computed tomography versus conventional staging in patients with locally advanced breast cancer. J Clin Oncol 2023;41:3909-16. Crossref
19. de Mooij CM, Ploumen RA, Nelemans PJ, Mottaghy FM, Smidt ML, van Nijnatten TJ. The influence of receptor expression and clinical subtypes on baseline [18F]FDG uptake in breast cancer: systematic review and meta-analysis. EJNMMI Res 2023;13:5. Crossref
20. Basu S, Chen W, Tchou J, et al. Comparison of triple-negative and estrogen receptor–positive/progesterone receptor–positive/HER2-negative breast carcinoma using quantitative fluorine-18 fluorodeoxyglucose/positron emission tomography imaging parameters: a potentially useful method for disease characterization. Cancer 2008;112:995-1000. Crossref
21. Ulaner GA, Castillo R, Goldman DA, et al. 18F-FDG-PET/CT for systemic staging of newly diagnosed triple-negative breast cancer. Eur J Nucl Med Mol Imaging 2016;43:1937-44. Crossref
22. Gradishar WJ, Moran MS, Abraham J, et al. NCCN Guidelines® Breast Cancer Version 4.2023. J Natl Compr Canc Netw 2023;21:594-608. Crossref
23. Groheux D, Hindie E. Breast cancer: initial workup and staging with FDG PET/CT. Clin Transl Imaging 2021;9:221-31. Crossref
24. Elsayed B, Alksas A, Shehata M, et al. Exploring neoadjuvant chemotherapy, predictive models, radiomic, and pathological markers in breast cancer: a comprehensive review. Cancers 2023;15:5288. Crossref
25. Imbriaco M, Ponsiglione A. Predicting pathologic complete response after neoadjuvant chemotherapy. Radiology 2021;299:301-2. Crossref
26. Romeo V, Accardo G, Perillo T, et al. Assessment and prediction of response to neoadjuvant chemotherapy in breast cancer: a comparison of imaging modalities and future perspectives. Cancers (Basel) 2021;13:3521. Crossref
27. Lafci O, Resch D, Santonocito A, Clauser P, Helbich T, Baltzer PA. Role of imaging-based response assessment for adapting neoadjuvant systemic therapy for breast cancer: a systematic review. Eur J Radiol 2025;187:112105. Crossref
28. Caracciolo M, Castello A, Urso L, et al. Comparison of MRI vs. [18F]FDG PET/CT for treatment response evaluation of primary breast cancer after neoadjuvant chemotherapy: literature review and future perspectives. J Clin Med 2023;12:5355. Crossref
29. Gianni L, Pienkowski T, Im YH, et al. Efficacy and safety of neoadjuvant pertuzumab and trastuzumab in women with locally advanced, inflammatory, or early HER2-positive breast cancer (NeoSphere): a randomised multicentre, open-label, phase 2 trial. Lancet Oncol 2012;13:25-32. Crossref
30. Hatschek T, Foukakis T, Bjöhle J, et al. Neoadjuvant trastuzumab, pertuzumab, and docetaxel vs trastuzumab emtansine in patients with ERBB2-positive breast cancer: a phase 2 randomized clinical trial. JAMA Oncol 2021;7:1360-7. Crossref
31. Loibl S, André F, Bachelot T, et al. Early breast cancer: ESMO Clinical Practice Guideline for diagnosis, treatment and follow-up. Ann Oncol 2024;35:159-82. Crossref
32. National Library of Medicine, National Center for Biotechnology Information, US. Comparison of axillary lymph node dissection with axillary radiation for patients with node-positive breast cancer treated with chemotherapy. Available from: https://clinicaltrials.gov/study/NCT01901094. Accessed 13 Jan 2025.
33. Hyland CJ, Varghese F, Yau C, et al. Use of 18F-FDG PET/CT as an initial staging procedure for stage II–III breast cancer: a multicenter value analysis. J Natl Compr Canc Netw 2020;18:1510-7. Crossref
34. Ming Y, Wu N, Qian T, et al. Progress and future trends in PET/CT and PET/MRI molecular imaging approaches for breast cancer. Front Oncol 2020;10:1301. Crossref
35. Zhang-Yin J. State of the art in 2022 PET/CT in breast cancer: a review. J Clin Med 2023;12:968. Crossref

Parental depression in the relationship between parental stress and child health among low-income families in Hong Kong

Hong Kong Med J 2025 Oct;31(5):374–83 | Epub 23 Sep 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
Parental depression in the relationship between parental stress and child health among low-income families in Hong Kong
Esther YT Yu, FRACGP, FHKAM (Family Medicine)1; Eric YF Wan, PhD, CStat1,2; Rosa SM Wong, PhD3; Ivy L Mak, PhD1; Kiki SN Liu, PhD1; Caitlin HN Yeung, MB, BS, MPH1; Patrick Ip, FRCPCH, FHKAM (Paediatrics)4,5; Agnes FY Tiwari, PhD, FAAN6; Weng Y Chin, FRACGP1; Emily TY Tse, FRACGP, FHKAM (Family Medicine)1; Carlos KH Wong, PhD1,2,7; Vivian Y Guo, PhD8; Cindy LK Lam, MD, FHKAM (Family Medicine)1
1 Department of Family Medicine and Primary Care, The University of Hong Kong, Hong Kong SAR, China
2 Department of Pharmacology and Pharmacy, The University of Hong Kong, Hong Kong SAR, China
3 Department of Special Education and Counselling, The Education University of Hong Kong, Hong Kong SAR, China
4 Department of Paediatrics and Adolescent Medicine, The University of Hong Kong, Hong Kong SAR, China
5 Department of Paediatrics and Adolescent Medicine, Hong Kong Children’s Hospital, Hong Kong SAR, China
6 School of Nursing, Hong Kong Sanatorium & Hospital, Hong Kong SAR, China
7 Laboratory of Data Discovery for Health Limited, Hong Kong Science Park, Hong Kong SAR, China
8 Department of Epidemiology, School of Public Health, Sun Yat-sen University, Guangzhou, China
 
Corresponding author: Dr Eric YF Wan (yfwan@hku.hk)
 
Abstract
Introduction: Low-income families face increased exposure to stressors, including material hardship and limited social support, which contribute to poor health outcomes. The poor health and behavioural problems in children from these families may exacerbate parental stress. This study explored the bidirectional relationship between parental stress and child health, along with its mediators and moderators, among low-income families in Hong Kong.
 
Methods: In total, 217 families were recruited from two less affluent communities between 2016 and 2017; they were followed up at 12 and 24 months. Each parent-child pair was assessed using parent-completed questionnaires on socio-demographics, medical history, parental stress, health-related quality of life, child health and behaviour, family harmony, parenting style, and neighbourhood cohesion.
 
Results: Thirty-eight parents (17.5%) reported significant stress. These individuals were more likely to be single parents (41.2% vs 18.5%), victims of intimate partner abuse (23.7% vs 10.9%), have a household income below 50% of the Hong Kong population median (50.0% vs 29.9%), and be diagnosed with mental illnesses (23.7% vs 5.1%). A bidirectional inverse relationship was observed between parental stress and child health at respective time points, with cross-effects from baseline child health to later parental stress, and from baseline parental stress to later child health. The relationship was mediated by the level of parental depression.
 
Conclusion: Parental stress both precedes and results from child health and behavioural problems, with reciprocal short-term and long-term effects. Screening and intervention for parental depression are needed to mitigate the impacts of stress on health among parents and children.
 
 
New knowledge added by this study
  • Single parents, victims of intimate partner abuse, individuals with mental illnesses, and/or those living in poverty reported significantly higher levels of stress compared to other low-income parents in Hong Kong.
  • A bidirectional inverse relationship was observed between general parental stress and child health over a 24-month period among low-income families in Hong Kong.
  • Parental depression mediated the relationship between parental stress and child health.
Implications for clinical practice or policy
  • Active screening for parental depression among at-risk parents in low-income communities is urgently needed to enable early intervention and reduce long-term negative impacts on child health.
 
 
Introduction
Low-income families face increased exposure to stressors,1 2 such as material hardship, dispossession, limited social support,3 4 trauma, and violence,1 5 which subsequently affect family relationships and the physical and mental health of parents,6 7 8 contributing to household-wide feelings of stigma, isolation, and exclusion. These stressors are particularly relevant to Hong Kong, where approximately one-fifth of the population lives below the poverty line.9 Adults from low-income families in Hong Kong have reported significantly lower health-related quality of life (HRQOL) than age- and sex-matched individuals from the general population; low income is significantly associated with poorer mental health.10
 
Stressors may persist across the life course and affect the next generation, resulting in intergenerational socio-economic inequality and health disparities. Early caregiving experiences have been linked to later-life child health outcomes through physiological stress responses.11 Moreover, poor mental health in parents may lead to family disharmony and maladaptive parenting practices, which can increase a child’s risk of adverse health outcomes.7 8 Specifically, children of parents with depression tend to exhibit more difficult temperaments and diminished psychosocial functioning.12 13 Children from low-income families in Hong Kong have reported poorer health and more behavioural problems relative to population norms for similar age-groups.14 15 Without adequate parental care and guidance, such children may be more vulnerable to academic difficulties and behavioural problems, thereby exacerbating parental stress. A bidirectional relationship between parental stress and child health has been documented in Western studies6 8 but not within the Chinese context.
 
Stress coping can be mediated or moderated by various social factors.16 17 For instance, stressed parents may contribute to family disharmony, which mediates diminished child health. Neighbourhood cohesion may moderate this relationship by alleviating parental stress and enhancing children’s well-being. The identification of mediators and moderators that may influence the relationship between parental stress and child health enables development of targeted interventions and policy recommendations. Despite strong associations of parental depression with stress18 and child health,12 13 its mediating role in this relationship remains unclear. A recent study demonstrated mediation between parental stress and parent-infant bonding,19 but evidence concerning overall child health is lacking. This study aimed to explore whether a bidirectional relationship exists between parental stress and child health and to identify its mediators and moderators, with the goal of promoting health among parents and children from low-income families in Hong Kong. We hypothesised that parental stress precedes and results from child health, with mediating and moderating effects exerted by factors illustrated in Figure 1.
 

Figure 1. Study concept map based on existing knowledge of the associations of parental, child and family factors with parental stress and child health
 
Methods
Study design
This prospective cohort study involved 217 parent-child pairs in which at least one parent was the primary caregiver and at least one parent was employed, with a monthly household income lower than 75% of the Hong Kong median at baseline. This income criterion included working poor families who lived above the poverty line (50% of the population median) and received limited government support. Families were recruited by research staff when attending health assessments during our previous cohort study20 performed in two less affluent Hong Kong communities between May 2016 and October 2017. Parents unable to communicate in Chinese, as well as children born prematurely and/or with congenital deformities, were excluded. All parents provided written informed consent for themselves and their child to participate in the study. Sample size was determined based on the need to detect a difference in Child Health Questionnaire (CHQ) scores between children of parents with high and low stress levels, classified according to the Depression Anxiety Stress Scales (DASS) stress subscale scores. Our previous cohort study showed that average CHQ general health perceptions subscale scores in children of parents with high and low DASS stress subscale scores were 59 (standard deviation [SD]=17) and 65 (SD=16), respectively20 (effect size=0.4). A sample size of 200 (100 per group) parent-child pairs was required to detect a difference of 6 points in CHQ general health perceptions subscale score between groups using an independent t test with 80% power and a 5% level of significance.
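The sample size reasoning above can be approximated in a few lines. The sketch below is illustrative only: it uses a normal approximation rather than the exact t-based calculation, assumes equal group sizes, and the function names are hypothetical rather than taken from the study's analysis code.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardised mean difference using the pooled SD of two equal-sized groups."""
    pooled_sd = sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group n for a two-sided, two-sample comparison (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the significance level
    z_power = z.inv_cdf(power)           # quantile corresponding to the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Subscale means and SDs quoted in the paragraph above
d = cohens_d(59, 17, 65, 16)
print(f"Cohen's d = {d:.2f}")            # rounds to 0.4 at one decimal place
print("n per group at d = 0.4:", n_per_group(0.4))
```

Under these assumptions, the approximation lands just below the 100 pairs per group stated above; an exact calculation based on the t distribution gives a slightly larger figure.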
 
Data collection
Each parent-child pair was invited to complete a comprehensive questionnaire survey at three time points (ie, baseline, 12 months, and 24 months) covering parental stress, HRQOL, and mental health; child’s general health, HRQOL, and behaviour; family harmony; parenting style; and neighbourhood cohesion, as reported by the parent. Potential confounders were recorded at baseline, including parental age, gender, education level, marital status, employment status, household income, smoking habits, and alcohol consumption, as well as the child’s age, gender, estimated intelligence quotient, and special education needs. Physical and mental co-morbidities in parents and children were recorded at all three time points.
 
Study instruments
Exposure
Parental stress was measured using the stress subscale of the DASS–21 items questionnaire.21 A cut-off score of ≥15 indicated the presence of significant parental stress.21 The scale has been validated in a Chinese population.22
 
Primary outcome
Child health was measured using the general health perceptions subscale score from the CHQ–Parent Form 50.23 A higher score indicates better perceived physical and psychological HRQOL in the child based on parental proxy report. The Chinese version has demonstrated good psychometric properties in local Chinese children.20
 
Potential mediators/moderators
The Patient Health Questionnaire–9 (PHQ-9)24 was used to screen for parental depression. A cut-off score of ≥10 was regarded as clinically significant depression. The Chinese version of the PHQ-9 was validated and used in our previous study.20 Family harmony was measured using the Family Harmony Scale–Short Form (FHS-5).25 Higher single-factor harmony scores reflect greater harmony. The Chinese version has demonstrated good psychometric properties in local Chinese households.25 Parent-child interaction was assessed using the Child Physical Assault and Neglect subscales of the Parent–Child Conflict Tactics Scale (CTSPC).26 Higher scores indicate higher frequencies of respective issues in the past 12 months. The translated traditional Chinese version has demonstrated good psychometric properties.27 Parenting style was assessed using the Authoritative Parenting Style subscale of the short version of the Parenting Style and Dimensions Questionnaire.28 A higher score indicates a stronger tendency towards authoritative parenting. The questionnaire has been validated in the Chinese cultural setting.29 Neighbourhood support was measured using the Neighbourhood Collective Efficacy Scale.30 Higher scores indicate greater neighbourhood cohesion. The scale has been tested in Chinese in a local study.31
 
Data analysis
Baseline characteristics of parent-child pairs and their households were summarised using descriptive statistics. Differences between groups according to parental stress level were assessed using independent t tests for continuous variables and the Chi squared test for categorical variables.
 
The longitudinal bidirectional relationship between parental stress and child health was assessed using a cross-lagged panel model. Multiple indicators were utilised to evaluate model goodness-of-fit. A statistically non-significant Chi squared P value, Comparative Fit Index and Tucker-Lewis Index >0.95, root mean square error of approximation ≤0.05, and standardised root mean residual <0.08 were considered indicative of desirable goodness-of-fit. The final model was selected using root mean square error of approximation–based forward stepwise selection.
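To illustrate what the cross-lagged paths in such a model represent, the following sketch estimates them for two waves using ordinary regression on simulated data. It is a simplification and not the structural equation model fitted in lavaan: it covers only two time points, ignores measurement error and fit indices, and all data, variable names, and function names are hypothetical.

```python
import random


def standardise(values):
    """Rescale to mean 0 and (population) SD 1."""
    m = sum(values) / len(values)
    sd = (sum((v - m) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - m) / sd for v in values]


def corr(a, b):
    """Pearson correlation of two already standardised lists."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def two_predictor_betas(outcome, pred1, pred2):
    """Standardised OLS coefficients for outcome ~ pred1 + pred2."""
    r1, r2, r12 = corr(outcome, pred1), corr(outcome, pred2), corr(pred1, pred2)
    denom = 1 - r12 ** 2
    return (r1 - r2 * r12) / denom, (r2 - r1 * r12) / denom


def cross_lagged_paths(stress_t1, health_t1, stress_t2, health_t2):
    s1, h1, s2, h2 = map(standardise, (stress_t1, health_t1, stress_t2, health_t2))
    stress_auto, health_to_stress = two_predictor_betas(s2, s1, h1)
    health_auto, stress_to_health = two_predictor_betas(h2, h1, s1)
    return {
        "stress_autoregressive": stress_auto,
        "health_autoregressive": health_auto,
        "health_to_stress": health_to_stress,   # cross-lagged path
        "stress_to_health": stress_to_health,   # cross-lagged path
    }


# Simulated data with negative cross-lagged effects built in (not study data)
rng = random.Random(1)
n = 1000
stress_t1 = [rng.gauss(0, 1) for _ in range(n)]
health_t1 = [-0.3 * s + rng.gauss(0, 1) for s in stress_t1]
stress_t2 = [0.5 * s - 0.4 * h + rng.gauss(0, 1) for s, h in zip(stress_t1, health_t1)]
health_t2 = [0.5 * h - 0.4 * s + rng.gauss(0, 1) for s, h in zip(stress_t1, health_t1)]

paths = cross_lagged_paths(stress_t1, health_t1, stress_t2, health_t2)
for name, value in paths.items():
    print(f"{name}: {value:.2f}")
```

Each cross-lagged coefficient is the effect of one variable at the earlier wave on the other variable at the later wave, after adjusting for the later variable's own earlier value. A bidirectional inverse relationship corresponds to both cross-lagged coefficients being negative.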
 
A mediation model was used to evaluate candidate mediators. Model estimates were obtained using 5000 bootstrapping samples. A statistically significant indirect effect, along with a reduced direct effect magnitude relative to the total effect, indicated that a given mediator explained the relationship between parental stress and child health.32 A multi-mediator model was constructed; differences in indirect effects between mediators were estimated via pairwise comparison.
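As a rough illustration of the bootstrapped indirect effect described above, the sketch below computes a single-mediator indirect effect (the product of the stress-to-mediator path and the mediator-to-outcome path) with a percentile bootstrap confidence interval. It is an assumption-laden simplification: it uses simulated data, one mediator, no confounders, and hypothetical function names, and it is not the multi-mediator lavaan model used in the study.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def indirect_effect(x, m, y):
    """Product a*b: a is the slope of m on x; b is the slope of y on m, adjusting for x."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int(n_boot * alpha / 2)]
    upper = estimates[int(n_boot * (1 - alpha / 2)) - 1]
    return lower, upper


# Simulated scores only (not study data)
rng = random.Random(7)
stress = [rng.gauss(0, 1) for _ in range(217)]
depression = [0.6 * s + rng.gauss(0, 1) for s in stress]
child_health = [-0.6 * d - 0.1 * s + rng.gauss(0, 1) for s, d in zip(stress, depression)]

estimate = indirect_effect(stress, depression, child_health)
# 1000 resamples keep the demonstration quick; the analysis above used 5000
ci_low, ci_high = bootstrap_ci(stress, depression, child_health, n_boot=1000)
print(f"indirect effect = {estimate:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An indirect effect is conventionally regarded as statistically significant when the bootstrap confidence interval excludes zero.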
 
Potential moderating effects of neighbourhood cohesion and parenting style on the relationship between parental stress and child health were examined by multivariable linear regression. A statistically significant interaction term coefficient indicated a moderation effect. All variables were centred to a mean of zero to reduce multicollinearity related to interaction terms. Confounders were included to improve model goodness-of-fit; R2 and adjusted R2 values were used to evaluate model performance.
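The rationale for mean-centring before forming interaction terms can be shown with a small simulation. The sketch below compares the correlation between a predictor and its product term before and after centring; the scores are simulated, the scale means are arbitrary, and the helper names are hypothetical.

```python
import random


def pearson(a, b):
    """Pearson correlation of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def centre(values):
    """Subtract the mean so the variable has a mean of zero."""
    m = sum(values) / len(values)
    return [v - m for v in values]


# Simulated scores with non-zero means (not study data)
rng = random.Random(0)
stress = [rng.gauss(12, 4) for _ in range(500)]
cohesion = [rng.gauss(30, 6) for _ in range(500)]

raw_product = [s * c for s, c in zip(stress, cohesion)]
stress_c, cohesion_c = centre(stress), centre(cohesion)
centred_product = [s * c for s, c in zip(stress_c, cohesion_c)]

r_raw = pearson(stress, raw_product)
r_centred = pearson(stress_c, centred_product)
print(f"correlation with interaction term, raw scores: {r_raw:.2f}")
print(f"correlation with interaction term, centred scores: {r_centred:.2f}")
```

With raw scores, the product term is strongly correlated with its component variables. Centring removes most of that overlap without changing the interaction coefficient itself.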
 
All descriptive analyses were performed using Stata 16 (StataCorp LLC, College Station [TX], US); all model analyses were carried out using the lavaan package33 version 0.6-6, in R version 4.0.1 (R Foundation for Statistical Computing, Vienna, Austria). Data completion rates are presented in online supplementary Table 1. Complete case analyses were conducted. All tests were two-tailed; P values <0.05 were considered statistically significant.
 
Results
Among the 217 parent-child pairs recruited at baseline, 175 (80.6%) and 184 (87.6%) pairs attended the 12-month and 24-month follow-ups, respectively (online supplementary Fig 1). Their characteristics at each of the three time points are detailed in Table 1.
 

Table 1. Socio-demographics, co-morbidities, and outcome measures
 
Baseline characteristics of parent-child pairs
At baseline, the ages of parents and children (mean ± SD) were 42.4 ± 6.2 years and 10.7 ± 2.0 years, respectively. Approximately half of the children were girls (47.5%), whereas the parents involved were predominantly mothers (91.7%). The majority (75.2%) of parents had completed secondary education. Approximately 39.8% of primary parents were employed, and 57.2% of families had a monthly household income below 75% of the 2016 Hong Kong median (ie, HK$25 000).34
 
Thirty-eight parents (17.5%) experienced significant stress, indicated by a DASS stress subscale score of 15 or above at baseline. Considerable differences were evident in their baseline characteristics compared with parents who were not stressed. Stressed parents were more likely to be single parents (41.2% vs 18.5%) and to have a household income below 50% of the Hong Kong median (50.0% vs 29.9%). A greater proportion of stressed parents reported being victims of intimate partner abuse (23.7% vs 10.9%). Diagnosed mental illnesses (23.7% vs 5.1%) and depression, indicated by a PHQ-9 score ≥10 (21.1% vs 2.4%), were more prevalent among these parents (Table 2). Both their physical and mental HRQOL were significantly worse (physical component score=42.5 ± 9.9 vs 49.1 ± 8.2; mental component score=38.1 ± 10.0 vs 55.5 ± 8.7; P<0.001).
 

Table 2. Baseline characteristics stratified by parental stress group (n=217)
 
Compared with children of parents who were not stressed, children of stressed parents were younger (age=10.0 ± 2.6 years vs 10.8 ± 1.8 years; P=0.020) and had worse general health and HRQOL, as reflected by lower scores in every subscale of the CHQ–Parent Form 50 except bodily pain and self-esteem. In particular, large differences were observed in four subscales: parental impact—emotional, parental impact—time, family activities, and family cohesion.
 
Moreover, stressed parents reported lower scores in family harmony (FHS-5) and neighbourhood cohesion (Neighbourhood Collective Efficacy Scale). Although parenting style did not differ significantly, stressed parents showed a greater tendency for physical punishment, as reflected by higher scores on the CTSPC–physical assault subscale, and for neglect, as indicated by higher CTSPC–neglect subscale scores, compared with parents who were not stressed (Table 2).
 
Relationship between parental stress and child health over time
Figure 2 shows the cross-lagged panel model examining the bidirectional relationship between parental stress and child health. A bidirectional relationship between child health and parental stress was confirmed. Significant associations were observed between parental stress and child health at each time point (estimates: baseline=-0.22, 12 months=-0.21, 24 months=-0.47); between baseline child health and parental stress at 12 months (estimate=-0.40) and 24 months (estimate=-0.42); and between baseline parental stress and child health at 12 months (estimate=-0.57) and 24 months (estimate=-0.10).
 

Figure 2. Cross-lagged panel model between parental stress and child health
 
Mediators and moderators of the parent-child health relationship over time
The multi-mediation model results generated by bootstrapping are illustrated in Figure 3; the model estimates and goodness-of-fit statistics are presented in online supplementary Table 2. The total effect of the relationship between parental stress and child health was reduced when mediators were included in the model. Significant positive associations of parental stress were observed with the PHQ-9 score, as well as the physical assault and neglect subscales of the CTSPC. A significant negative association was noted between parental stress and the FHS-5 score. Among mediators, only the PHQ-9 exerted a significant negative effect on child health.
 

Figure 3. Multi-mediation model between parental stress and child health
 
Table 3 presents the moderation model. Neither neighbourhood cohesion nor parenting style demonstrated a moderating effect on the relationship between parental stress and child health. Estimates for the interaction terms were negligible. The R2 values were around 0.21, and the adjusted R2 values were slightly lower (0.11-0.13), indicating modest explanatory power of the model after adjusting for confounders.
 

Table 3. Moderation effects of the relationship between parental stress and child health
 
Discussion
Our study demonstrated that a substantial proportion of low-income parents experienced stress (17.5%), which was associated with multiple stressors including poverty, marital problems, intimate partner abuse, family disharmony, and reduced neighbourhood support. Children of stressed parents reported worse general health and HRQOL, as well as more behavioural problems. A short-term and long-term bidirectional inverse relationship between parental stress and child health was confirmed; this relationship was partially mediated by the level of parental depression.
 
Compared with the general Hong Kong population, the parent-child pairs in this study were more exposed to various known stressors in addition to low income. The prevalences of single-parent families (22.3% vs 9.8%35) and intimate partner abuse (13.2% vs 7.2%36) were higher, and more parents reported regular alcohol consumption (17.4% vs 8.7%37). Therefore, it is not surprising that a considerably greater proportion of parents in this study experienced elevated levels of stress (17.5% vs 5.2%38) and depression (5.9% vs 1.2%37). The persistently high level of parental stress observed during the study period may be attributed either to repeated exposure to different stressors over time or to constant exposure to chronic stressors. Both scenarios highlight the urgent need to ensure assessment and intervention for these disadvantaged parents.
 
Previous studies have demonstrated bidirectional interactions between parental stress and child health in relation to both internalising and externalising behaviours.6 8 Increases in behavioural problems have been shown to raise parental stress over time, which in turn exacerbates behavioural issues in children.39 Our study adds to this body of evidence by confirming significant bidirectional effects between general parental stress and child health at each time point. Cross-effects were observed from baseline child health to later parental stress, and from baseline parental stress to later child health at both 12 and 24 months. These findings suggest that parental stress both precedes and results from child health, with reciprocal short-term and long-term influences.40
 
In our attempt to identify pathways through which parental stress affects child health, we observed that only parental depression significantly mediated the relationship. This result is consistent with previous findings that maternal depression and perceived stress directly and negatively influence child development.41 One possible explanation is that depressed mothers may lack the energy or capacity to provide adequate care and support for their child’s health. Research into this mediation effect remains limited; however, one recent study reported similar outcomes regarding the indirect impact of work-related stress on child health, mediated by maternal depression.42 The implementation of screening and intervention for parental depression is both imperative and urgent to counteract the adverse effects of stress on parental and child health. Medical and social service providers should collaborate to actively screen at-risk parents from low-income families in the community. Early intervention through lifestyle-based care (such as physical activity, relaxation techniques, and mindfulness-based therapies) can help to prevent43 44 and manage45 46 depression, thus mitigating long-term negative impacts on child health.
 
However, it must be noted that parents with depression may be biased towards over-reporting their child’s problems,47 compared with other informants such as teachers and the children themselves.48 Further research is warranted to identify individual and family characteristics that may influence discrepancies between informants. Other potential factors examined in previous studies, such as household structure (dual- vs multi-generational), parental rearing behaviours, and confidant and affective social support, might also contribute to the relationship between parental stress and child health; they should be explored in future studies with larger sample sizes.
 
Strengths and limitations
This is one of the first studies to examine the longitudinal relationship between general parental stress and child health, enabling assessment of possible causal relationships between the two outcomes. Specifically, we recruited vulnerable families with substantial socio-economic disadvantages who experience high levels of stress and would benefit most from future interventions. Furthermore, a high response rate was maintained throughout the study, ensuring adequate power for the analyses.
 
However, the findings of our study must be interpreted in light of the following limitations. First, although we conducted a comprehensive analysis of factors related to parental stress and child health, the outcomes were based on self-reported assessments, which are susceptible to respondent bias. Only three measurements, taken 1 year apart, were performed in this study due to concerns regarding practicality and the burden on participating families. Therefore, caution should be exercised in generalising the results with respect to longitudinal trends, given that substantial intra-individual fluctuations may have occurred but were not captured in this study. Second, both parental stress and child health were assessed using parent-report questionnaires, which may contribute to increased shared method variance. Additionally, aspects of the child’s health or behaviour considered problematic by the parent may not align with assessments made by other individuals (eg, teachers). As mentioned earlier, parents with depression may be biased towards over-reporting problems and are more likely to report behavioural issues in their child compared with other informants.47 48 The validity of parent-perceived measures of child health—particularly in relation to parental depression—and their agreement with other caregivers should be examined in future trials specifically designed for this purpose. Third, there were unmeasured confounders in this observational study, such as exercise and social functioning. Moreover, certain socio-demographic factors, including marital and employment statuses, were assumed to be static throughout the study. It remains uncertain whether changes in these factors, if any, may have influenced the observed results. Additional information regarding participant characteristics, observational measures of child behaviour, or objective indicators of child health (eg, cortisol levels) could improve the reliability of the findings.
 
Conclusion
This study showed that a substantial proportion of parents from low-income families in Hong Kong experienced general stress due to multiple stressors, which was negatively associated with their child’s health. A bidirectional relationship was observed between parental stress and child health over time, which may be partly mediated by parental depression. Prompt screening and appropriate intervention are necessary to prevent adverse health outcomes for parents and children in low-income families.
 
Author contributions
Concept or design: EYT Yu, RSM Wong, AFY Tiwari, CKH Wong, VY Guo, CLK Lam.
Acquisition of data: RSM Wong, KSN Liu.
Analysis or interpretation of data: EYT Yu, EYF Wan, RSM Wong, IL Mak, AFY Tiwari, CKH Wong, VY Guo, CLK Lam.
Drafting of the manuscript: EYT Yu, RSM Wong, IL Mak, CHN Yeung.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
As advisors of the journal, EYT Yu and CKH Wong were not involved in the peer review process. Other authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors are grateful for the support from Kerry Group Kuok Foundation (Hong Kong) Limited in conducting this study on participants of the Trekkers Family Enhancement Scheme. The authors’ sincere gratitude goes to the Neighbourhood Advice-Action Council, Hong Kong Sheng Kung Hui Lady MacLehose Centre, and Shek Lei Community Hall for their assistance in participant recruitment and provision of venues for data collection, respectively. The authors thank the Social Science Research Centre of The University of Hong Kong (HKU) for their timely completion of the telephone surveys, and Department of Paediatrics and Adolescent Medicine of HKU for performing the assays for DNA extraction and telomere length measurement. The authors also thank their research staff for their hard work in data collection and analysis.
 
Declaration
The study results were disseminated through a poster presentation at the Health Research Symposium 2021 (23 November 2021, hybrid conference), entitled “In-depth exploration of a bidirectional parent-child health relationship and its mediating and moderating factors among low-income families in Hong Kong”.
 
Funding/support
This research was supported by the Health and Medical Research Fund of the Health Bureau, Hong Kong SAR Government (Ref No.: HMRF 14151571). The funder had no role in the study design, data collection/analysis/interpretation, or manuscript preparation.
 
Ethics approval
This research was approved by the Institutional Review Board of The University of Hong Kong/Hospital Authority Hong Kong West Cluster, Hong Kong (Ref No.: UW 16-415). Informed consent was obtained from patients when baseline data were collected.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. Santiago CD, Kaltman S, Miranda J. Poverty and mental health: how do low-income adults and children fare in psychotherapy? J Clin Psychol 2013;69:115-26.
2. Smith MV, Mazure CM. Mental health and wealth: depression, gender, poverty, and parenting. Annu Rev Clin Psychol 2021;17:181-205.
3. Evans GW, Kim P. Childhood poverty, chronic stress, self-regulation, and coping. Child Dev Perspect 2013;7:43-8.
4. Adjei NK, Jonsson KR, Straatmann VS, et al. Impact of poverty and adversity on perceived family support in adolescence: findings from the UK Millennium Cohort Study. Eur Child Adolesc Psychiatry 2024;33:3123-32.
5. Alto ME, Warmingham JM, Handley ED, Manly JT, Cicchetti D, Toth SL. The association between patterns of trauma exposure, family dysfunction, and psychopathology among adolescent females with depressive symptoms from low-income contexts. Child Maltreat 2023;28:130-40.
6. van Dijk W, de Moor MH, Oosterman M, Huizink AC, Matvienko-Sikar K. Longitudinal relations between parenting stress and child internalizing and externalizing behaviors: testing within-person changes, bidirectionality and mediating mechanisms. Front Behav Neurosci 2022;16:942363.
7. Neece CL, Green SA, Baker BL. Parenting stress and child behavior problems: a transactional relationship across time. Am J Intellect Dev Disabil 2012;117:48-66.
8. Stone LL, Mares SH, Otten R, Engels RC, Janssens JM. The co-development of parenting stress and childhood internalizing and externalizing problems. J Psychopathol Behav Assess 2016;38:76-86.
9. Economic Analysis Division Economic Analysis and Business Facilitation Unit Financial Secretary’s Office; Census and Statistics Department, Hong Kong SAR Government. Hong Kong Poverty Situation Report 2013. Oct 2014. Available from: https://www.commissiononpoverty.gov.hk/eng/pdf/poverty_report13_rev2.pdf. Accessed 31 Jul 2023.
10. Lam CL, Guo VY, Wong CK, Yu EY, Fung CS. Poverty and health-related quality of life of people living in Hong Kong: comparison of individuals from low-income families and the general population. J Public Health (Oxf) 2017;39:258-65.
11. Luecken LJ, Lemery KS. Early caregiving and physiological stress responses. Clin Psychol Rev 2004;24:171-91.
12. Hanington L, Ramchandani P, Stein A. Parental depression and child temperament: assessing child to parent effects in a longitudinal population study. Infant Behav Dev 2010;33:88-95.
13. Associations between depression in parents and parenting, child health, and child psychological functioning. In: England MJ, Sim LJ, editors. Depression in Parents, Parenting, and Children: Opportunities to Improve Identification, Treatment, and Prevention. Washington (DC): National Academies Press (US); 2009: 119-82.
14. Lee SL, Cheung YF, Wong HS, Leung TH, Lam T, Lau YL. Chronic health problems and health-related quality of life in Chinese children and adolescents: a population-based study in Hong Kong. BMJ Open 2013;3:e001183.
15. Chan KL, Lo CK, Ho FK, Chen Q, Chen M, Ip P. Modifiable factors for the trajectory of health-related quality of life among youth growing up in poverty: a prospective cohort study. Int J Environ Res Public Health 2021;18:9221.
16. Asok A, Bernard K, Roth TL, Rosen JB, Dozier M. Parental responsiveness moderates the association between early-life stress and reduced telomere length. Dev Psychopathol 2013;25:577-85.
17. Evans GW, Kim P, Ting AH, Tesher HB, Shannis D. Cumulative risk, maternal responsiveness, and allostatic load among young adolescents. Dev Psychol 2007;43:341-51.
18. Hammen C. Stress and depression. Annu Rev Clin Psychol 2005;1:293-319.
19. Power C, Weise V, Mack JT, Karl M, Garthus-Niegel S. Does parental mental health mediate the association between parents’ perceived stress and parent-infant bonding during the early COVID-19 pandemic? Early Hum Dev 2024;189:105931.
20. Fung CS, Yu EY, Guo VY, et al. Development of a Health Empowerment Programme to improve the health of working poor families: protocol for a prospective cohort study in Hong Kong. BMJ Open 2016;6:e010015.
21. Lovibond SH, Lovibond PF; Psychology Foundation of Australia. Manual for the Depression Anxiety Stress Scales. Sydney: Sydney Psychology Foundation; 1995.
22. Wang K, Shi HS, Geng FL, et al. Cross-cultural validation of the Depression Anxiety Stress Scale–21 in China. Psychol Assess 2016;28:e88-100.
23. Landgraf JM. Child Health Questionnaire (CHQ). In: Maggino F, editor. Encyclopedia of Quality of Life and Well-being Research. Cham: Springer; 2020: 1-6.
24. Kroenke K, Spitzer RL, Williams JB. The PHQ-9: validity of a brief depression severity measure. J Gen Intern Med 2001;16:606-13.
25. Kavikondala S, Stewart SM, Ni MY, et al. Structure and validity of Family Harmony Scale: an instrument for measuring harmony. Psychol Assess 2016;28:307-18.
26. Straus MA. Measuring intrafamily conflict and violence: the Conflict Tactics (CT) Scales. J Marriage Fam 1979;41:75-88.
27. Chan KL, Brownridge DA, Fong DY, Tiwari A, Leung WC, Ho PC. Violence against pregnant women can increase the risk of child abuse: a longitudinal study. Child Abuse Negl 2012;36:275-84.
28. Robinson CC, Mandleco B, Olsen SF, Hart CH. The Parenting Styles and Dimensions Questionnaire (PSDQ). In: Perlmutter BF, Touliatos J, Holden GW, editors. Handbook of Family Measurement Techniques: Vol 3. Instruments & Index. Thousand Oaks: Sage; 2001: 319-21.
29. Wu P, Robinson CC, Yang C, et al. Similarities and differences in mothers’ parenting of preschoolers in China and the United States. Int J Behav Dev 2002;26:481-91.
30. Sampson RJ, Raudenbush SW, Earls F. Neighborhoods and violent crime: a multilevel study of collective efficacy. Science 1997;277:918-24.
31. Chou KL. Perceived discrimination and depression among new migrants to Hong Kong: the moderating role of social support and neighborhood collective efficacy. J Affect Disord 2012;138:63-70.
32. Baron RM, Kenny DA. The moderator–mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J Pers Soc Psychol 1986;51:1173-82.
33. Rosseel Y. lavaan: an R package for structural equation modeling. J Stat Softw 2012;48:1-36.
34. Census and Statistics Department, Hong Kong SAR Government. Hong Kong 2016 Population By-census–Thematic Report: Household Income Distribution in Hong Kong. Jun 2017. Available from: https://www.censtatd.gov.hk/en/data/stat_report/product/B1120096/att/B11200962016XXXXB0100.pdf. Accessed 25 Aug 2025.
35. Census and Statistics Department, Hong Kong SAR Government. 2021 Population Census–Thematic Report: Children. Feb 2023. Available from: https://www.census2021.gov.hk/doc/pub/21c-Children.pdf. Accessed 25 Aug 2025.
36. Chan KL. Intimate partner violence in Hong Kong. In: Chan KL, editor. Preventing Family Violence: A Multidisciplinary Approach. Hong Kong: Hong Kong University Press; 2012: 19-58.
37. Non-Communicable Disease Branch, Centre for Health Protection, Hong Kong SAR Government. Report of Population Health Survey 2020-22 (Part I); 2022. Available from: https://www.chp.gov.hk/files/pdf/dh_phs_2020-22_part_1_report_eng_rectified.pdf. Accessed 31 Jul 2023.
38. Chan SM, Wong H, Chung RY, Au-Yeung TC. Association of living density with anxiety and stress: a cross-sectional population study in Hong Kong. Health Soc Care Community 2021;29:1019-29.
39. Baker BL, McIntyre LL, Blacher J, Crnic K, Edelbrock C, Low C. Pre-school children with and without developmental delay: behaviour problems and parenting stress over time. J Intellect Disabil Res 2003;47:217-30.
40. Motrico E, Bina R, Kassianos AP, et al. Effectiveness of interventions to prevent perinatal depression: an umbrella review of systematic reviews and meta-analysis. Gen Hosp Psychiatry 2023;82:47-61.
41. Vameghi R, Amir Ali Akbari S, Sajedi F, Sajjadi H, Alavi Majd H. Path analysis association between domestic violence, anxiety, depression and perceived stress in mothers and children’s development. Iran J Child Neurol 2016;10:36-48.
42. Xu L, Xu J. The impact of maternal occupation on children’s health: a mediation analysis using the parametric G-formula. Soc Sci Med 2024;343:116602.
43. Bellón JÁ, Conejo-Cerón S, Sánchez-Calderón A, et al. Effectiveness of exercise-based interventions in reducing depressive symptoms in people without clinical depression: systematic review and meta-analysis of randomised controlled trials. Br J Psychiatry 2021;219:578-87.
44. Newland P, Bettencourt BA. Effectiveness of mindfulness-based art therapy for symptoms of anxiety, depression, and fatigue: a systematic review and meta-analysis. Complement Ther Clin Pract 2020;41:101246.
45. Marx W, Manger SH, Blencowe M, et al. Clinical guidelines for the use of lifestyle-based mental health care in major depressive disorder: World Federation of Societies for Biological Psychiatry (WFSBP) and Australasian Society of Lifestyle Medicine (ASLM) taskforce. World J Biol Psychiatry 2023;24:333-86.
46. Recchia F, Leung CK, Chin EC, et al. Comparative effectiveness of exercise, antidepressants and their combination in treating non-severe depression: a systematic review and network meta-analysis of randomised controlled trials. Br J Sports Med 2022;56:1375-80.
47. Chi TC, Hinshaw SP. Mother–child relationships of children with ADHD: the role of maternal depressive symptoms and depression-related distortions. J Abnor Child Psychol 2002;30:387-400.
48. Richters JE. Depressed mothers as informants about their children: a critical review of the evidence for distortion. Psychol Bull 1992;112:485-99.

Clinical and imaging patterns of child abuse in Hong Kong: a 10-year review from a tertiary centre

Hong Kong Med J 2025 Oct;31(5):347–54 | Epub 19 Sep 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE  CME
Catherine YM Young, MB, BS, FRCR1; CH Yiu1; Kathleen CH Tsoi, MB, ChB, MRCPCH2; Dorothy FY Chan, MB, ChB, FRCPCH2; Ki Wang, MB, BS, FRCR1; Winnie CW Chu, MB, ChB, MD1
1 Department of Imaging and Interventional Radiology, Prince of Wales Hospital, The Chinese University of Hong Kong, Hong Kong SAR, China
2 Department of Paediatrics, Prince of Wales Hospital, The Chinese University of Hong Kong, Hong Kong SAR, China
 
Corresponding author: Dr Catherine YM Young (youngymc@connect.hku.hk)
 
Abstract
Introduction: Child abuse, a pressing medical and social issue in Hong Kong, requires high vigilance for prompt identification and early management. The Mandatory Reporting of Child Abuse Ordinance has recently been gazetted, establishing a mandatory obligation for suspected injury reporting to protect children’s rights. This study aimed to describe the incidence and patterns of child abuse in Hong Kong to draw attention to this key issue.
 
Methods: A retrospective review of all reported child abuse cases admitted to Prince of Wales Hospital over a 10-year period (2014-2023) was performed.
 
Results: In total, 503 cases of child abuse were retrieved from the hospital’s electronic system, revealing an increasing trend over the years. Of these, 341 cases (67.8%) were attributed to physical abuse. Most cases involved trivial soft tissue injuries, apart from two limb fracture cases, which represented 0.4% of all reported child abuse cases (n=503) and 0.6% of all reported physical child abuse cases (n=341). Abusive head trauma (n=3) constituted 0.6% of all reported child abuse cases and 0.9% of all reported physical child abuse cases. Two cases of severe abusive head trauma required paediatric intensive care, and one case warranting neurosurgical intervention subsequently exhibited gross motor delay.
 
Conclusion: Most child abuse cases in Hong Kong present with minor clinical manifestations. Imaging evidence of skeletal or neurological injury is present in a small proportion of patients. Abusive head injury is uncommon but carries far-reaching consequences; early recognition is essential to protect affected children from further harm. Paediatric radiologists play a pivotal role in making the diagnosis.
 
 
New knowledge added by this study
  • Fractures resulting from non-accidental injury are less common in Hong Kong, which has a predominantly Chinese population, than in Western countries; the fracture patterns differ.
  • The overall incidence of abusive head trauma is low; however, a substantial proportion of patients with non-accidental injury who undergo further neuroimaging display positive findings.
Implications for clinical practice or policy
  • Interpretation of plain radiographs in cases of non-accidental injury should not solely rely on classical textbook fracture patterns; correlations with a compatible clinical history are particularly important.
  • Neuroimaging is essential for children under 1 year of age with clinical suspicion of non-accidental injury, particularly those showing abnormal neurological signs, to detect abusive head trauma.
 
 
Introduction
Child abuse is a prevalent yet frequently overlooked condition in paediatric patients worldwide, affecting between 4% and 16% of the paediatric population.1 It may manifest as physical abuse, neglect, sexual abuse, or psychological abuse,2 all of which carry substantial long-term medical and psychological consequences. Clinical presentation is often vague, requiring a high degree of clinical suspicion by both clinicians and radiologists to ensure early activation of child protection services. Multidisciplinary input is needed for timely intervention and prevention of recurrence.
 
While clinical evaluation is crucial for identifying apparent or superficial injuries, radiological imaging also plays a vital role in detecting old or clinically occult injuries. John Caffey, a paediatric radiologist, was among the first to describe the association between long bone fractures and chronic subdural haematoma in infants, introducing the concept of non-accidental injury.3 Since then, a growing body of literature has emerged concerning the radiological features of non-accidental injury, contributing to increased global awareness. Various guidelines have also been developed, including those by The Royal College of Radiologists4 and the American College of Radiology,5 which recommend appropriate imaging modalities in suspected cases to protect children’s welfare while balancing the risks of radiation exposure.
 
Various retrospective studies in Western populations have examined the epidemiology, injury patterns, and outcomes of non-accidental paediatric injuries in their respective regions6 7 8 9; however, limited research has been conducted in Asia, particularly within Hong Kong. This study aimed to describe the incidence, clinical presentation, imaging features, and treatment outcomes of child abuse in a tertiary regional hospital in Hong Kong, with the goal of raising awareness of this commonly overlooked condition.
 
Methods
This retrospective study included all reported cases of child abuse involving paediatric patients (aged 0-18 years) admitted to Prince of Wales Hospital, a tertiary regional hospital in Hong Kong, over a 10-year period (from January 2014 to December 2023). All suspected or confirmed cases of child abuse were identified from the Clinical Data Analysis and Reporting System, an electronic health registry managed by the Hospital Authority of Hong Kong. The search utilised key terms under the International Classification of Diseases, Ninth Revision coding, including “Child maltreatment syndrome”, “Child and adult battering and other maltreatment”, “Child abuse”, and “Child maltreatment syndrome, shaken infant syndrome”. Clinical records of all reported cases were reviewed. Cases were excluded if they were inappropriately categorised (aged >18 years), erroneously coded (ie, unrelated to child abuse), or duplicate entries of the same episode (Fig 1).
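The selection steps described above can be sketched as a simple filter. This is a minimal illustration only: the `Record` fields, the `select_cases` function, and the demo rows are hypothetical, because the registry's actual export format is not described in the text, and the `abuse_related` flag stands in for the manual review of clinical records.

```python
from dataclasses import dataclass

# Search terms quoted in the Methods text.
SEARCH_TERMS = {
    "Child maltreatment syndrome",
    "Child and adult battering and other maltreatment",
    "Child abuse",
    "Child maltreatment syndrome, shaken infant syndrome",
}


@dataclass(frozen=True)
class Record:
    # Hypothetical fields; the real registry schema is not given in the text.
    patient_id: str
    episode_id: str
    age_years: float
    diagnosis_term: str
    abuse_related: bool  # outcome of manual review of the clinical record


def select_cases(records):
    """Apply the stated exclusions: age >18 years, unrelated coding, duplicates."""
    seen = set()
    included = []
    for r in records:
        if r.diagnosis_term not in SEARCH_TERMS:
            continue  # not retrieved by the search
        if r.age_years > 18:
            continue  # inappropriately categorised
        if not r.abuse_related:
            continue  # erroneously coded
        key = (r.patient_id, r.episode_id)
        if key in seen:
            continue  # duplicate entry of the same episode
        seen.add(key)
        included.append(r)
    return included


# Made-up demo rows, not study data.
demo = [
    Record("P1", "E1", 3, "Child abuse", True),
    Record("P1", "E1", 3, "Child abuse", True),
    Record("P2", "E2", 19, "Child abuse", True),
    Record("P3", "E3", 7, "Child maltreatment syndrome", False),
    Record("P4", "E4", 0.5, "Child maltreatment syndrome, shaken infant syndrome", True),
]
print([r.patient_id for r in select_cases(demo)])
```

Keying duplicates on patient and episode together keeps genuine repeat admissions of the same child, which the study reports separately.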
 

Figure 1. Patient recruitment
 
Clinical data including patient demographics (age at presentation and sex), clinical presentation, type of abuse, imaging performed, multidisciplinary case conferences (MDCCs) held, management strategies, and any long-term adverse outcomes were reviewed from electronic patient records and case notes. Relevant imaging studies were reviewed by the primary investigator (5 years of radiology experience) and cross-checked against the original reports. In cases of discrepancy, images were re-interpreted through consensus reading with an experienced paediatric radiologist (20 years of radiology experience).
 
Results
Patient demographics and clinical presentation
In total, 503 reported cases of child abuse were included in the study. The number of reported cases showed an upward trend over the 10-year period, from 23 cases in 2014 to 50 cases in 2023 (Fig 2).
 

Figure 2. Trend of reported child abuse cases at Prince of Wales Hospital from 2014 to 2023
 
The case distribution is presented in Table 1. The cohort comprised 265 (52.7%) girls and 238 (47.3%) boys. The mean age was 8.25 years (range, 0-17), with 55 cases (10.9%) involving infants under 1 year of age. Physical abuse was the most common type at presentation, accounting for 341 cases (67.8%). The vast majority (>99%) of patients presented with erythematous marks, bruises, or lacerations. Other presenting symptoms included seizures, loss of consciousness, and vomiting. Sexual abuse was the second most common type (n=87, 17.3%), followed by child neglect (n=75, 14.9%).
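The percentages in this paragraph can be reproduced from the stated counts. The sketch below uses only figures given in the text; the helper name `pct` and the variable names are illustrative. It also shows how the choice of denominator changes the figure for the three abusive head trauma cases reported in this cohort.

```python
def pct(count, total):
    """Percentage of total, rounded to one decimal place as in the text."""
    return round(100 * count / total, 1)


TOTAL_CASES = 503  # all reported cases in the cohort

# Counts by type of abuse, as stated in the Results.
by_type = {"physical": 341, "sexual": 87, "neglect": 75}
assert sum(by_type.values()) == TOTAL_CASES  # the three types partition the cohort

shares = {name: pct(n, TOTAL_CASES) for name, n in by_type.items()}
by_sex = {"girls": pct(265, TOTAL_CASES), "boys": pct(238, TOTAL_CASES)}
infants = pct(55, TOTAL_CASES)

# Same numerator, two denominators: all cases vs physical abuse cases only.
aht_of_all = pct(3, TOTAL_CASES)
aht_of_physical = pct(3, by_type["physical"])

print(shares, by_sex, infants, aht_of_all, aht_of_physical)
```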
 

Table 1. Distribution of various types of reported child abuse by age and sex (n=503)
 
More than half of the cases (n=263, 52.3%) were admitted via the Accident and Emergency (A&E) Department. The vast majority of these patients presented directly to our hospital, and only two were transferred from adjacent acute hospitals: one involving abusive head trauma requiring neurosurgical intervention, and another with a suspected vaginal tear necessitating input from obstetricians and gynaecologists. Most of these patients (254 cases, 96.6%) were referred due to clinical suspicion of abuse raised by non-offending parents (n=137), social workers (n=78), the patients themselves (n=22), or witnesses (n=17). In the remaining nine cases (3.4%), suspicion was first raised by medical staff either in the Emergency Department/General Outpatient Clinic (n=4) or after admission (n=5). Although medical staff identified a relatively small proportion of these cases, many were severe, including three abusive head trauma cases initially presenting with seizures. In such cases, abuse was only suspected after imaging.
 
The remaining 240 cases (47.7%) were admitted through other channels, including referral by social workers (n=203), neonatal admission (n=28), abnormalities identified by medical staff during follow-up or screening (n=8), and sibling screening (n=1).
 
Imaging modalities and findings
Imaging was performed for 100 patients (19.9%), including 86 cases with skeletal imaging, 24 with neurological imaging, and one with abdominal imaging. Among the 24 patients who underwent neuroimaging, 10 also had skeletal imaging, while 14 received neuroimaging only.
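Because some patients had more than one type of imaging, the modality counts overlap. A short inclusion-exclusion sketch, using only the counts stated above, shows how they reconcile; the function and variable names are illustrative. Note that the result matches the reported 100 patients only if the single patient with abdominal imaging also had skeletal or neurological imaging, which is an inference and is not stated in the text.

```python
def union_size(a, b, both):
    """Inclusion-exclusion: patients in group A or B, counting overlaps once."""
    return a + b - both


skeletal, neuro, both = 86, 24, 10  # counts stated in the Results

neuro_only = neuro - both        # neuroimaging without skeletal imaging
skeletal_only = skeletal - both  # skeletal imaging without neuroimaging
imaged = union_size(skeletal, neuro, both)

print(skeletal_only, both, neuro_only, imaged)
```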
 
Of the 86 patients who underwent skeletal imaging, 77 had plain radiographs of the targeted region as initial screening, and nine received a complete skeletal survey. Most patients had minor soft tissue injuries. Fractures were identified in two patients: a supracondylar fracture in a 3-year-old boy and a foot fracture in a 13-year-old girl, representing 2.3% of all skeletal imaging cases (Fig 3). Both fractures were detected on dedicated radiographs directed at regions of pain, as indicated by the patients. In another case, initial radiographs in a 13-year-old boy showed no obvious fracture, but magnetic resonance imaging (MRI) for persistent wrist pain subsequently revealed a mild ligamentous sprain.
 

Figure 3. (a) Anteroposterior plain radiograph of the right elbow showing a linear transverse supracondylar fracture of the right humerus (arrow). (b) Anteroposterior plain radiograph of the left fifth toe showing cortical buckling over the lateral aspect of the shaft of the left fifth metatarsal bone (arrow)
 
Computed tomography (CT) was the initial imaging modality in 24 cases evaluated for suspected intracranial injury; five cases (20.8%) showed positive findings. Three cases (12.5%) demonstrated alarming features suggestive of shaken baby syndrome on initial brain CT scans, including subdural haemorrhage (n=3) and cerebral oedema (n=1), prompting further evaluation by MRI. Shaken baby syndrome was confirmed in all three cases on MRI, which showed subdural haemorrhage (n=3) and brain parenchymal injuries, including diffuse axonal injury (n=3) and hypoxic-ischaemic injury (n=2) [Fig 4]. These patients, aged between 2 and 7 months, presented with non-specific symptoms such as seizures (n=3), vomiting (n=2), and loss of consciousness (n=1). Fundoscopic examination confirmed multilayered retinal haemorrhages in all three cases, whereas skeletal surveys were unremarkable (Table 2). The remaining two CT-positive cases included one with a scalp haematoma and another with a mildly depressed parietal skull fracture; both lacked intracranial findings.
 

Figure 4. Representative case of shaken baby syndrome. (a, b) Computed tomography of the brain shows mixed-density subdural haematoma along bilateral cerebral convexities, extending into the interhemispheric space (white arrows in [a]). A large hypodense area with loss of grey-white differentiation in the right parieto-occipital region (black arrows) suggests cerebral oedema or hypoxic-ischaemic injury. (c-h) Magnetic resonance imaging of the brain confirms subdural collections of varying intensities over bilateral cerebral convexities and the interhemispheric space (white arrows in [c] and [d]), as well as a large area of restricted diffusion in the right parieto-occipital lobe (black arrows in [e] to [h]), consistent with hypoxic-ischaemic injury. Restricted diffusion in the splenium of the corpus callosum (white arrowheads in [g] and [h]) indicates diffuse axonal injury. (c) T1-weighted imaging. (d) T2-weighted imaging. (e, g) Diffusion-weighted imaging. (f, h) Apparent diffusion coefficient mapping
 

Table 2. Clinical presentation, radiological findings, and clinical outcomes of the three cases of shaken baby syndrome
 
Ultrasound of the abdomen and pelvis was performed in one patient with persistent abdominal pain; no clinically significant solid organ injury was identified.
 
Multidisciplinary case conference assessment and long-term adverse outcomes
In 44 cases (8.7%), the MDCC was dismissed for various reasons, such as cross-border status, family refusal, or discharge against medical advice. Of the remaining 459 cases (91.3%) evaluated by MDCC, documentation was not retrievable from clinical records in 45 cases (8.9% of all 503 cases).
 
Among the 414 cases with available MDCC documentation or conclusions, child abuse was confirmed in 199 cases (48.1%), comprising physical abuse (n=95), child neglect (n=63), and sexual abuse (n=41). Another 84 cases (20.3%) were categorised as high-risk, involving suspected physical abuse (n=81) or sexual abuse (n=3). Child abuse was not established in the remaining 131 cases (31.6%); these were considered to have low or moderate risk of recurrence.
 
Of the 89 cases in which MDCC was dismissed or notes were unavailable, more than half (n=63, 70.8%) had presented with suspected physical abuse, followed by sexual abuse (n=22, 24.7%) and neglect (n=4, 4.5%). All cases were deemed minor, with no clinically or radiologically significant findings. No specific treatment or long-term follow-up was required.
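The case-flow figures above can be cross-checked with a short tally. The sketch below uses only the counts reported in the text; the overall total is not stated in this section and is inferred here as the sum of dismissed and MDCC-evaluated cases, and the variable names are illustrative.

```python
# Case-flow tally using the counts reported in the text above.
dismissed = 44        # cases in which MDCC was dismissed
evaluated = 459       # cases evaluated by MDCC
notes_missing = 45    # evaluated cases whose documentation was not retrievable

total = dismissed + evaluated            # inferred total of admitted cases
documented = evaluated - notes_missing   # cases with MDCC notes or conclusions
undocumented = dismissed + notes_missing # cases without an MDCC conclusion


def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


outcomes = {"confirmed": 199, "high_risk": 84, "not_established": 131}
outcome_pct = {name: pct(n, documented) for name, n in outcomes.items()}

undocumented_pct = {
    "physical": pct(63, undocumented),
    "sexual": pct(22, undocumented),
    "neglect": pct(4, undocumented),
}
```

Under these assumptions the three MDCC outcome groups sum to the 414 documented cases, and the percentages reproduce those quoted in the text.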
 
The majority of cases exhibited minor severity and were managed conservatively without long-term adverse outcomes.
 
A long arm cast was applied for one patient with a supracondylar fracture, whereas a resting splint was prescribed for another patient with a ligamentous wrist sprain. Both patients recovered uneventfully after short-term follow-up (1 year) by the orthopaedics team, with no residual impact on daily functioning.
 
Two patients with severe abusive head trauma required admission to the paediatric intensive care unit. One of these patients warranted multiple neurosurgical interventions, including bilateral burr hole drainage and placement of a ventriculoperitoneal shunt. The remaining two cases of abusive head trauma were managed conservatively. At the most recent follow-up, one patient—the most severely affected—demonstrated gross motor delay at 19 months of age. All other patients showed no neurological deficits or developmental delay to date. No mortality was recorded in this cohort.
 
Repeated admissions for suspected child abuse were identified in 22 cases. Of these, 16 were recurrent, established cases of child abuse. In 14 of these 16 cases, the type of abuse remained consistent across episodes, whereas two cases involved different types of abuse in separate incidents. Four cases were initially classified as established child abuse, but subsequent admissions were considered non-established, with recurrence risk ranging from low to high. Two cases were categorised as non-established child abuse on both occasions but were considered to have moderate or high risk of recurrence.
 
Discussion
This retrospective 10-year study documented a significant rise in reported child maltreatment cases, emphasising that child abuse remains an ongoing medical and social concern. This issue persists despite concerted efforts by the government and various organisations to prevent child maltreatment by providing social support to new mothers and at-risk families.
 
Types of child abuse
Physical abuse was the most common type of presentation in our study, consistent with data from the Child Protection Registry10 and similar findings from Singapore.11 The high prevalence of physical abuse in Hong Kong may reflect cultural differences in parenting practices, such that corporal punishment remains more commonly accepted in Chinese households than in Western contexts.12 Over 50% of families in Hong Kong use physical punishment as part of child-rearing.13 In moments of anger or impulsiveness, the line between ineffective parenting and child abuse may easily be crossed.
 
Pattern of injury and imaging findings
The majority of cases in our study were considered mild in nature, with no serious long-term consequences after clinical evaluation and appropriate imaging. Fractures were infrequent, comprising 0.4% of all reported child abuse cases and 0.6% of all reported physical child abuse cases. These rates are slightly lower than those reported in previous Asian studies, which revealed fractures in 1% of all reported physical child abuse cases11 and 3.6% to 7% of all reported child abuse cases.14 15 The present rates are substantially lower than the 28% observed in a Western population.6 The fracture detection rate among patients who underwent imaging in our study (2.3%) was also considerably lower than that in Western populations (24%-32%).7 8 Compared with a previous Hong Kong study in 2005,15 our findings suggest a decline in the overall fracture rate despite an overall increase in reported child maltreatment cases, implying a trend towards milder injuries in recent years. This trend may reflect increased societal awareness of the consequences of severe child abuse, potentially leading parents to move away from traditional forms of physical punishment (eg, caning) and towards less injurious methods, such as striking with the hand. Greater awareness may also facilitate earlier detection and reporting, thereby preventing escalation.
 
No fractures were identified on skeletal surveys in the few cases of confirmed shaken baby syndrome in our cohort. One case of parietal bone fracture was documented—the parietal bone is among the most commonly fractured skull bones, according to current literature.14 16 The other identified fractures—supracondylar and foot fractures—do not reflect the classical abuse-specific fracture types described in the literature, such as posteromedial rib fractures or metaphyseal corner fractures.16 However, these findings align with previous studies in Singapore, where the humerus was the most frequently fractured bone.11 14 Our results also differ from the findings of Fong et al,15 who reported that forearm and rib fractures were most common in Hong Kong. With the exception of rib fractures, the sites noted in our study are not typically associated with non-accidental injury. This highlights potential differences in injury severity and fracture patterns between Asian and Western populations and underscores the importance of maintaining clinical suspicion for non-accidental injury, even in the absence of classical fracture sites or textbook imaging findings.16
 
Abusive head trauma is the leading cause of morbidity and mortality among children subjected to abuse, with an estimated morbidity rate of up to 80% and a mortality rate ranging from 15% to 30%.17 18 Despite the deceptively low overall occurrence of abusive head trauma in our study (0.6% of all reported physical child abuse cases and 0.9% of all reported child abuse cases), compared with Western counterparts (up to 40%-50%),6 9 it is notable that 20.8% of our imaged cases showed positive findings, and shaken baby syndrome was confirmed in 12.5% via MRI. All confirmed cases involved infants under 1 year of age, whose relatively oedematous brains, immature intracranial vasculature, and poor neck muscle control render them more susceptible to the effects of abusive head trauma.19 It is therefore imperative that neuroimaging be performed for all children under 1 year of age with suspected non-accidental injury, particularly those with abnormal neurological signs, such as seizures or coma.4 Bilateral subdural haemorrhages of varying densities, focal and diffuse brain parenchymal injuries (eg, diffuse axonal injury or cerebral oedema), and multilayered retinal haemorrhages on fundoscopy, as demonstrated in our study, are consistent with cardinal features of abusive head trauma described in the literature.17 20 Our study also revealed more favourable morbidity (33%) and mortality (0%) outcomes compared with current literature reports,2 17 possibly due to the relatively small number of cases.
 
Current practice in the management of cases of suspected child abuse
At present, suspected child maltreatment presents to our hospital via two main pathways: attendance at the A&E Department for suspicious injuries, and referral by social workers who observe unusual behaviour or injuries.21 For cases requiring inpatient care, the paediatric team conducts history taking and physical examination, documents findings (including clinical photographs), and manages the injuries.21 Relevant parties—such as social workers, clinical psychologists, and police officers—are informed as necessary.21 Minor cases may be assessed and discharged directly from the A&E Department.21 An MDCC is typically convened within 10 days of presentation, involving doctors, social workers, school personnel, clinical psychologists, and police officers to determine the nature of the incident, assess the risk of future maltreatment, and recommend preventive measures.21
 
Radiologists play an active role in the multidisciplinary management of child abuse—not only in assessing the full extent of injuries but also in detecting subtle, suspicious findings, alerting the clinical team, and proactively contributing to early intervention and the reduction of long-term adverse outcomes. The reporting of suspicious injuries is currently conducted on a voluntary basis, guided by recommendations from the Social Welfare Department.22 However, the recently gazetted Mandatory Reporting of Child Abuse Ordinance,23 which becomes effective in January 2026, will impose a legal obligation on professionals to report suspected injuries, thereby strengthening safeguards for children.
 
Strengths and limitations
To the best of our knowledge, this is the largest retrospective study to investigate the clinical and radiological features of child abuse in a regional hospital in Hong Kong over the past decade. It provides an updated local overview while drawing comparisons with Western data to highlight distinguishing features and emphasise the need for greater attention to this critical issue.
 
This study had several limitations. First, it was a retrospective analysis based on voluntarily reported cases, and some instances of child abuse may have been under-recognised or underreported by attending clinicians. A small number of cases also lacked accessible MDCC notes or conclusions due to record loss over time. Second, our dataset includes only admitted cases from a single regional hospital, which may have introduced selection bias because minor cases discharged directly from A&E were excluded. The generalisability of our findings is limited, given that the distribution of child maltreatment cases varies substantially across Hong Kong districts. Sha Tin accounted for approximately 6.2% of all reported child maltreatment cases from 2014 to 2023, whereas Yuen Long accounted for 12%.24 Variations in demographic and socio-economic backgrounds across districts may also influence clinical presentation and severity of injuries; further investigation is warranted. Third, despite the large cohort of child abuse cases included in our series, the proportion of positive imaging findings remains relatively small. Larger-scale studies are needed to better characterise local injury patterns. Finally, due to the extended retrospective recruitment period, follow-up durations varied widely, from 15 months in recent cases to 9 years in earlier cases. Consequently, the long-term effects of abusive head trauma may not yet be evident in patients with shorter follow-up, highlighting the need for further longitudinal assessment into later childhood.
 
Conclusion
This study provides an updated overview of the clinical and radiological features of child abuse in Hong Kong, revealing patterns that differ from those described in Western literature. Although most cases involved only minor clinical manifestations, a small proportion of patients exhibited positive imaging findings of skeletal or neurological injury, which may carry serious long-term consequences. Radiologists play a critical role in the multidisciplinary management of child abuse, both in flagging suspicious injuries to alert clinicians and in evaluating the full extent of trauma to protect children from further harm.
 
Author contributions
Concept or design: CYM Young, WCW Chu.
Acquisition of data: All authors.
Analysis or interpretation of data: CYM Young, WCW Chu.
Drafting of the manuscript: CYM Young.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was conducted in accordance with the Declaration of Helsinki. Ethics approval was obtained from the Joint Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee, Hong Kong (Ref No.: 2024.071). The requirement for informed patient consent was waived by the Committee due to the retrospective design of the research.
 
References
1. Gilbert R, Widom CS, Browne K, Fergusson D, Webb E, Janson S. Burden and consequences of child maltreatment in high-income countries. Lancet 2009;373:68-81.
2. Guastaferro K, Shipe SL. Child maltreatment types by age: implications for prevention. Int J Environ Res Public Health 2023;21:20.
3. Caffey J. Multiple fractures in the long bones of infants suffering from chronic subdural hematoma. Am J Roentgenol Radium Ther 1946;56:163-73.
4. The Society and College of Radiographers; The Royal College of Radiologists. The Radiological Investigation of Suspected Physical Abuse in Children (Revised First Edition). London: The Royal College of Radiologists; 2018. Available from: https://www.rcr.ac.uk/media/nznl1mv4/rcr-publications_the-radiological-investigation-of-suspected-physical-abuse-in-children-revised-first-edition_november-2018.pdf. Accessed 1 Oct 2024.
5. Wootton-Gorges SL, Soares BP, Alazraki AL, et al. ACR Appropriateness Criteria® suspected physical abuse—child. J Am Coll Radiol 2017;14:S338-49.
6. Ward A, Iocono JA, Brown S, Ashley P, Draus JM Jr. Non-accidental trauma injury patterns and outcomes: a single institutional experience. Am Surg 2015;81:835-8.
7. Day F, Clegg S, McPhillips M, Mok J. A retrospective case series of skeletal surveys in children with suspected non-accidental injury. J Clin Forensic Med 2006;13:55-9.
8. Loos MH, Ahmed T, Bakx R, van Rijn RR. Prevalence and distribution of occult fractures on skeletal surveys in children with suspected non-accidental trauma imaged or reviewed in a tertiary Dutch hospital. Pediatr Surg Int 2020;36:1009-17.
9. Rosenfeld EH, Johnson B, Wesson DE, Shah SR, Vogel AM, Naik-Mathuria B. Understanding non-accidental trauma in the United States: a national trauma databank study. J Pediatr Surg 2020;55:693-7.
10. Social Welfare Department, Hong Kong SAR Government. Child Protection Registry Statistical Report 2023. 2024. Available from: https://www.swd.gov.hk/storage/asset/section/654/Annual%20CPR%20Report%202023_Biligual_Final.pdf. Accessed 1 Oct 2024.
11. Chew YR, Cheng MH, Goh MC, Shen L, Wong PC, Ganapathy S. Five-year review of patients presenting with non-accidental injury to a children’s emergency unit in Singapore. Ann Acad Med Singap 2018;47:413-9.
12. Liu W, Guo S, Qiu G, Zhang SX. Corporal punishment and adolescent aggression: an examination of multiple intervening mechanisms and the moderating effects of parental responsiveness and demandingness. Child Abuse Negl 2021;115:105027.
13. Tang CS. Corporal punishment and physical maltreatment against children: a community study on Chinese parents in Hong Kong. Child Abuse Negl 2006;30:893-907.
14. Gera SK, Raveendran R, Mahadev A. Pattern of fractures in non-accidental injuries in the pediatric population in Singapore. Clin Orthop Surg 2014;6:432-8.
15. Fong CM, Cheung HM, Lau PY. Fractures associated with non-accidental injury—an orthopaedic perspective in a local regional hospital. Hong Kong Med J 2005;11:445-51.
16. Offiah A, van Rijn RR, Perez-Rossello JM, Kleinman PK. Skeletal imaging of child abuse (non-accidental injury). Pediatr Radiol 2009;39:461-70.
17. Sidpra J, Chhabda S, Oates AJ, Bhatia A, Blaser SI, Mankad K. Abusive head trauma: neuroimaging mimics and diagnostic complexities. Pediatr Radiol 2021;51:947-65.
18. Karibe H, Kameyama M, Hayashi T, Narisawa A, Tominaga T. Acute subdural hematoma in infants with abusive head trauma: a literature review. Neurol Med Chir (Tokyo) 2016;56:264-73.
19. Hung KL. Pediatric abusive head trauma. Biomed J 2020;43:240-50.
20. Sun DT, Zhu XL, Poon WS. Non-accidental subdural haemorrhage in Hong Kong: incidence, clinical features, management and outcome. Childs Nerv Syst 2006;22:593-8.
21. So EC, Chan D. Management of Child Maltreatment (Abuse). Hong Kong: Hospital Authority New Territories East Cluster Prince of Wales Hospital Department of Paediatrics; 2024.
22. Social Welfare Department, Hong Kong SAR Government. Protecting Children from Maltreatment Procedural Guide for Multi-disciplinary Co-operation (Revised 2020). Jan 2020. Available from: https://www.swd.gov.hk/storage/asset/section/652/en/Procedural_Guide_Core_Procedures_(Revised_2020)_Eng_2Nov2021.pdf. Accessed 1 Oct 2024.
23. Legislative Council, Hong Kong SAR Government. Mandatory Reporting of Child Abuse Ordinance. 2024. Available from: https://www.legco.gov.hk/yr2024/english/ord/2024ord023-e.pdf. Accessed 29 Oct 2024.
24. Social Welfare Department, Hong Kong SAR Government. Statistics on child protection, spouse/cohabitant battering and sexual violence cases captured by the Child Protection Registry (CPR) and the Central Information System on Spouse/Cohabitant Battering Cases and Sexual Violence Cases (CISSCBSV). Social Welfare Department; 2025. Available from: https://data.gov.hk/en-data/dataset/hk-swd-fcw-ca-scb-sv-stat/resource/6229e2b4-73d0-4285-a892-838c683c9966. Accessed 8 Aug 2025.

The scope and impact of original clinical research by Hong Kong public healthcare professionals

Hong Kong Med J 2025 Oct;31(5):363–73 | Epub 12 Sep 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
Peter YM Woo, MMedSc, FRCS1,2; Desiree KK Wong, MB, BS1; Queenie HW Wong, MB, ChB1; Calvin KL Leung, MB, BS1; Yuki HK Ip, MB, BS1; Danise M Au, MB, BS1; Tiger SF Shek, MB, ChB1; Bertrand Siu, MB, BS1; Charmaine Cheung, MB, BS1; Kevin Shing, MB, BS1; Anson Wong, MB, ChB1; Yuti Khare, MB, BS1; Omar WK Tsui, MB, BS1; Noel HY Tang, BSc3; KM Kwok, MSc3; MK Chiu, MSc3; YF Lau, MPhil3; Keith HM Wan, MB, BS, FRCS4; WC Leung, MD, FRCOG5
1 Department of Neurosurgery, Kwong Wah Hospital, Hong Kong SAR, China
2 Department of Neurosurgery, Prince of Wales Hospital, Hong Kong SAR, China
3 Centre for Clinical Research and Biostatistics, The Chinese University of Hong Kong, Hong Kong SAR, China
4 Department of Orthopaedics and Traumatology, Kwong Wah Hospital, Hong Kong SAR, China
5 Department of Obstetrics and Gynaecology, Kwong Wah Hospital, Hong Kong SAR, China
 
Corresponding author: Dr Peter YM Woo (wym307@ha.org.hk)
 
 
Abstract
Introduction: This study reviewed the landscape of clinical research conducted by public hospital clinicians in Hong Kong. It also explored whether an association exists between academic productivity and clinical performance.
 
Methods: This was a territory-wide retrospective study of peer-reviewed original clinical research conducted by clinicians providing acute medical care at non-university public hospitals between 2016 and 2021. Citations were retrieved from the MEDLINE biomedical literature database. Scientometric analysis was performed by collecting journal-level, article-level, and author-level performance indicators. Clinical performance was assessed using crude mortality rate, inpatient hospitalisation duration, and the number of 30-day unplanned readmissions.
 
Results: In total, 3142 peer-reviewed studies were published, of which 29.3% (n=921) were conducted by non-university hospital public healthcare professionals. The most productive specialty was clinical oncology, with 0.56 articles published per clinician. The overall mean journal impact factor and Eigenfactor score were 2.34 ± 3.72 and 0.01 ± 0.07, respectively. At the article level, the mean total number of citations was 6.33 ± 24.17, the mean Field Citation Ratio was 3.37 ± 2.04, and the mean Relative Citation Ratio (RCR) was 0.82 ± 3.32. A significant negative correlation was observed between crude mortality rate and RCR (r=-0.63; P=0.022). A negative correlation was also identified between 30-day readmissions and RCR (r=-0.72; P=0.006).
 
Conclusion: Clinicians in Hong Kong’s public healthcare system are research-active and have achieved a substantial degree of influence in their respective fields. Research performance was correlated with hospital crude mortality rates and 30-day unplanned readmissions.
 
 
New knowledge added by this study
  • More than 10% of clinicians at non-university public hospitals in Hong Kong have engaged in original clinical research as principal investigators.
  • In total, 29.3% of clinical research published in Hong Kong was conducted by professionals from non-university public hospitals.
  • The quality of the research undertaken was encouraging. All medical specialties achieved a Field Citation Ratio greater than 1.00, indicating that their article citation rates exceeded those of counterparts in the same research field.
Implications for clinical practice or policy
  • Clinical research activity is correlated with reductions in hospital crude mortality rates and 30-day unplanned readmissions.
  • The establishment of a research-supportive infrastructure and dedicated funding for non-university public hospitals may contribute to improved patient outcomes.
 
 
Introduction
Clinical research is fundamental to the advancement of medicine. More than a quarter of a century on, evidence-based medicine—which began as a nascent movement in the early 1990s—has revolutionised healthcare by producing trustworthy observations that support better-informed clinical decision-making and health policy.1 2 Research forms the foundation of evidence-based medicine and plays an important role in understanding disease, thereby contributing to the development of novel therapeutic strategies.3 This contribution has translated into quantifiable outcomes: participation in clinical research can lead to significant reductions in patient mortality and inpatient length of stay (LOS).4 5 6 7 8 9 Clinical research benefits individual patients and drives socio-economic growth. The UK National Institute for Health and Care Research (NIHR) observed that every 1.0 GBP invested in clinical trials yielded a return of up to 7.6 GBP in economic benefit.10 However, a substantial proportion of frontline clinicians typically do not engage in research activities relevant to their daily practice. 
A cross-sectional survey in North America revealed that 32% of respondents did not know how to participate in research.11 A similar study among Hong Kong family physicians indicated that 27% had no previous research experience.3 Hong Kong is an ideal location for conducting clinical research due to its world-class universal healthcare infrastructure, electronic medical records system, use of English in medical documentation, and the presence of a pool of internationally reputable investigators.12 13 Additionally, the Hospital Authority (HA), a statutory body responsible for managing all public hospitals in the city, provides more than 90% of all inpatient bed-days, and the patient follow-up rate is comparably high.14 Despite these favourable factors, according to the Our Hong Kong Foundation (a non-governmental, non-profit public policy institute), the number of clinical trials conducted in Hong Kong declined by 22% between 2015 and 2021, compared with a mean increase of 48% in developed countries and 285% in Mainland China.15 No comprehensive review of the clinical research activity of Hong Kong public healthcare professionals has been conducted. Apart from the UK and Spain, no other region has evaluated the influence of clinician engagement in research on key performance indicators within a universal healthcare system.4 5 6 This study was performed to determine Hong Kong’s research productivity in terms of peer-reviewed published clinical studies, its scholarly impact, and its influence on outcomes for hospitalised patients, including LOS, crude mortality, and 30-day unplanned readmission. A comparative analysis of research productivity and quality across medical disciplines was also performed. Findings from this review could inform health policy by providing a stronger foundation for the evidence-based allocation of resources to support an efficient and sustainable research ecosystem within the HA.
 
Methods
This was a territory-wide retrospective observational study of peer-reviewed original clinical research conducted by HA medical staff at general acute care hospitals, in which the staff member served as principal investigator. The review included articles published in the biomedical literature from 1 January 2016 to 30 April 2021. Research articles from non-university institutions—comprising 30% (13/43) of all HA hospitals—were included. Citations covering this 5.5-year period were retrieved from MEDLINE, the United States National Library of Medicine’s bibliographic database. The database was queried via the PubMed Advanced Search Builder for all studies published within the review period, where the first author’s stated affiliation was a Hong Kong hospital. The internet-based library search package RISmed was used to extract author affiliation data from the PubMed search results into R, an open-source statistical software tool.16 17 The list of citations was then manually reviewed to confirm that the study had been conducted by a clinician from an HA hospital. Published abstracts were categorised according to study design, article type, and corresponding medical specialty (Table 1).18 Systematic reviews performed in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) 2020 statement were regarded as original research.19 20 Articles not subject to peer review—such as academic conference proceedings, trial protocols, editorials, letters to the editor, and erratum or corrigendum statements—were excluded. Preclinical studies and secondary research articles, including clinical practice guidelines, position statements, book chapters, and narrative topical reviews, were also excluded. Finally, collaborative studies in which the principal investigator was not employed by the HA were excluded.
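The screening rules described above can be sketched as a simple filter. This is an illustration only: the study used RISmed in R followed by manual review, whereas the Python record structure, field names, affiliation heuristic, and sample records below are all hypothetical.

```python
from dataclasses import dataclass

# Review window stated in the Methods (ISO dates compare correctly as strings).
REVIEW_START, REVIEW_END = "2016-01-01", "2021-04-30"

# Article types excluded in the Methods (labels here are illustrative).
EXCLUDED_TYPES = {
    "conference proceeding", "trial protocol", "editorial", "letter",
    "erratum", "corrigendum", "preclinical study", "practice guideline",
    "position statement", "book chapter", "narrative review",
}


@dataclass
class Citation:
    title: str
    first_author_affiliation: str
    article_type: str
    published: str            # "YYYY-MM-DD"
    pi_employed_by_ha: bool   # established by manual review in the study


def is_eligible(c: Citation) -> bool:
    """Apply the inclusion and exclusion rules to one citation."""
    in_window = REVIEW_START <= c.published <= REVIEW_END
    affiliation = c.first_author_affiliation.lower()
    # Crude stand-in for the affiliation query and manual confirmation.
    hk_hospital = "hong kong" in affiliation and "hospital" in affiliation
    return (in_window and hk_hospital and c.pi_employed_by_ha
            and c.article_type not in EXCLUDED_TYPES)


# Hypothetical records, not data from the study.
records = [
    Citation("Cohort study A", "Department of Surgery, Example Hospital, Hong Kong",
             "retrospective cohort", "2018-06-01", True),
    Citation("Editorial B", "Example Hospital, Hong Kong",
             "editorial", "2019-02-10", True),
    Citation("Trial C", "Example Hospital, Hong Kong",
             "randomised controlled trial", "2021-06-15", True),
    Citation("Case report D", "Example University, London",
             "case report", "2017-03-03", False),
    Citation("Systematic review E", "Example Hospital, Hong Kong",
             "systematic review", "2020-11-20", True),
]
eligible_titles = [c.title for c in records if is_eligible(c)]
```

In this toy set, only the cohort study and the systematic review pass: the editorial is an excluded type, the trial falls outside the review window, and the case report has neither a Hong Kong hospital affiliation nor an HA-employed principal investigator.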
 

Table 1. Clinical research publication categories
 
The primary study endpoint was research productivity, measured by the total number of original research studies published, with comparisons made between university- and non–university-affiliated HA hospitals. The hypothesis was that university-affiliated hospitals would produce more original clinical research studies than non-university hospitals because of their access to tertiary education institution resources. Secondary endpoints included research productivity across medical specialties. To control for workforce discrepancies across medical disciplines, the mean number of full-time clinicians for each specialty from 2016 to 2021 was determined. The number of articles per clinician and the proportion of the workforce acting as principal investigator for each specialty were then established. The quality of the research, as reflected by the scientometric performance of each published article, was also assessed. It was hypothesised that research quality from university-affiliated hospitals would be superior to that of their non-university counterparts. Another secondary endpoint was patient outcomes for each non–university-affiliated acute care hospital from 2016 to 2021: crude mortality rate per 100 000 hospitalised patients, length of inpatient stay, and annual number of unplanned readmissions within 30 days of discharge. It was hypothesised that increased research productivity would translate to improved patient outcomes, and an inter-hospital comparison of these key performance indicators was performed. Patient outcome data were collected from the HA’s Clinical Data Analysis and Reporting System and the HA Management Information System.
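The workforce adjustment described above amounts to dividing each specialty's output by its mean headcount over the study years. A minimal sketch follows, assuming hypothetical headcounts, article counts, and principal investigator counts for two unnamed specialties; none of these figures come from the study.

```python
from statistics import mean

# Hypothetical full-time clinician headcounts for 2016 to 2021.
headcount_by_year = {
    "specialty_a": [40, 42, 44, 44, 46, 48],
    "specialty_b": [200, 205, 210, 210, 215, 220],
}
# Hypothetical article and principal investigator counts for the same period.
articles = {"specialty_a": 22, "specialty_b": 63}
principal_investigators = {"specialty_a": 11, "specialty_b": 21}


def productivity(specialty):
    """Return workforce-adjusted productivity indicators for one specialty."""
    workforce = mean(headcount_by_year[specialty])
    return {
        "mean_workforce": workforce,
        "articles_per_100_clinicians": round(100 * articles[specialty] / workforce, 1),
        "pi_share_pct": round(100 * principal_investigators[specialty] / workforce, 1),
    }


summary = {name: productivity(name) for name in headcount_by_year}
```

Here the smaller specialty publishes fewer articles in absolute terms yet has the higher per-clinician output, which is the kind of discrepancy the adjustment is intended to expose.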
 
To evaluate research quality, a multi-level scientometric approach was utilised by collecting journal-, article-, and individual author-level data. For journal-level scientometric assessment, two indices were determined: the journal’s mean impact factor (IF) and Eigenfactor score (ES) from 2016 to 2021. These indices were obtained from Clarivate (London, UK), a bibliometric analytics company that manages the Science Citation Index, an online indexing database containing academic journal citation data.1 A journal’s IF is a scientometric index reflecting the mean number of citations received per article in that journal during the preceding 2 years.21 This metric constitutes a reasonable indicator of research quality for general medical journals.22 The ES ranks journals using eigenvector centrality statistics to evaluate the importance of citations within a scholarly network.23 Utilising an algorithm similar to Google’s PageRank (Alphabet Inc, Mountain View [CA], US), the ES considers the number of citations received and the prestige of the citing journal. For article-level metrics, the total number of citations per article (TNC), Relative Citation Ratio (RCR), Field Citation Ratio (FCR), and National Institutes of Health (NIH) percentile attained were documented (Table 2). Author-level data were collected by determining the h-index of the principal investigator (Table 2).24 All scientometric data were censored on 30 June 2023.
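The two journal-level indices can be illustrated with a small numerical sketch. The impact factor function follows the standard two-year definition. The second function is a simplified PageRank-style power iteration over a hypothetical three-journal citation matrix; it conveys the idea of weighting citations by the standing of the citing journal but is not Clarivate's actual Eigenfactor algorithm, and the damping value and all counts are assumptions.

```python
def impact_factor(citations_this_year, citable_items_prev_two_years):
    """Citations received this year to items from the previous 2 years,
    divided by the number of citable items published in those 2 years."""
    return citations_this_year / citable_items_prev_two_years


def citation_network_scores(cites, damping=0.85, iterations=100):
    """PageRank-style scores for journals.

    cites[i][j] is the number of citations from journal i to journal j.
    Self-citations are ignored. Scores sum to 1.
    """
    n = len(cites)
    # Convert each journal's outgoing citations into proportions.
    out = []
    for i, row in enumerate(cites):
        row = [0 if i == j else count for j, count in enumerate(row)]
        total = sum(row)
        out.append([count / total for count in row] if total else [1 / n] * n)
    scores = [1 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * out[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


# Hypothetical journal: 234 citations this year to 100 items from the prior 2 years.
example_if = impact_factor(234, 100)

# Hypothetical network: journals 0 and 2 cite only journal 1;
# journal 1 splits its citations evenly between journals 0 and 2.
scores = citation_network_scores([[0, 10, 0], [5, 0, 5], [0, 20, 0]])
```

In this toy network journal 1 receives the highest score because both other journals direct all of their citations to it, while journals 0 and 2 receive equal scores even though their raw outgoing citation counts differ.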
 

Table 2. Multi-level scientometric assessment of original clinical research
 
Independent-samples t tests and chi-squared tests were conducted to compare variables. Spearman’s rank analysis was performed to assess correlations between research and hospitalised patient outcomes. P values of less than 0.05 were considered statistically significant. All statistical tests were performed using SPSS (Windows version 21.0; IBM Corp, Armonk [NY], US) and R (version 4.5.0; R Foundation, Vienna, Austria).17
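For readers unfamiliar with Spearman's rank analysis, the coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below is a plain Python illustration rather than the SPSS or R routines used in the study, and the hospital-level values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical hospital-level values, not data from the study.
mean_rcr = [0.4, 0.6, 0.8, 1.0, 1.2]
crude_mortality = [950, 900, 920, 800, 780]
rho = spearman_rho(mean_rcr, crude_mortality)
```

With these illustrative values the coefficient is negative, meaning hospitals with a higher citation ratio tend to have lower mortality in the toy data. Significance testing is omitted from the sketch.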
 
Results
Overall original clinical research productivity in Hong Kong
During the 5.5-year period, 4511 peer-reviewed articles were published by Hong Kong medical researchers from acute care public hospitals. Of these, 3142 (69.7%) were original clinical research studies. In total, 29.3% (n=921) of the articles were authored by non-university hospital investigators—a significantly smaller proportion than that published by their university hospital counterparts (independent-samples t test, P<0.001) [Fig 1a]. Throughout the review period, the annual number of publications by non-university hospital investigators remained consistent, with a mean of 167 ± 8 per year (t test, P=0.24) [Fig 1b]. Overall, the medical specialties that produced the most articles were internal medicine, representing 23.4% (n=735) of published studies, and general surgery, representing 16.5% (n=520) [Fig 2].
 

Figure 1. (a) MEDLINE-cited medical research articles by Hong Kong acute care public hospital staff between 2016 and 2021. (b) Comparison of the annual number of original clinical research articles published between university and non-university public hospitals
 

Figure 2. Distribution of original clinical research articles published by clinicians in Hong Kong acute care public hospitals across 17 medical specialties (2016-2021)
 
The majority of excluded articles were narrative reviews that did not meet PRISMA criteria, followed by letters to the editor and editorials (Fig 1a). Regarding study design, most research articles were case reports, followed by retrospective cohort and prospective cohort studies (Fig 3). A significantly larger proportion of case reports were published by non-university hospital staff compared with university hospital staff (P<0.001). In addition to retrospective studies, university hospital investigators were significantly more likely to publish higher level-of-evidence research articles (Table 3).
 

Figure 3. Distribution of original clinical research study designs adopted by non-university public hospital investigators (n=921)
 

Table 3. Comparison of original clinical research study designs produced by non-university and university hospitals
 
Original clinical research productivity among non-university general acute care hospitals and comparisons between medical specialties
The majority of non-university hospital principal investigators were clinicians, with 3.7% (n=34) of studies conducted by nurses or allied healthcare professionals. Among the 887 articles authored by clinicians, 544 individuals were identified, yielding an author-to-article ratio of 1:1.6. These researchers comprised 10.8% of the 5056 full-time non-university hospital clinicians employed during the study period. The most research-productive specialties among non-university hospitals were orthopaedics, followed by internal medicine and obstetrics and gynaecology (Table 4). Among all the medical specialties, the mean number of articles per clinician (×100) was 17.5 ± 22.3 (range, 1.8-56.4). After controlling for workforce discrepancies between disciplines, clinical oncology, orthopaedics and traumatology, and obstetrics and gynaecology constituted the most productive specialties in terms of the mean number of articles published per clinician (Table 4). Collectively, these three specialties published significantly more studies than the other disciplines (independent-samples t test, P<0.001) [Table 4]. The most research-active specialty was obstetrics and gynaecology, where one-third of clinicians acted as principal investigators—this proportion was significantly larger relative to other medical disciplines (t test, P=0.03) [Table 4].
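The workforce-normalised figures quoted above can be reproduced with simple arithmetic. The sketch below uses only the counts stated in the text; the helper `articles_per_100_clinicians` is a hypothetical name, and it assumes that "articles per clinician (×100)" means the article count divided by the number of clinicians, multiplied by 100. The specialty values passed to it are invented for illustration.

```python
# Counts quoted in the text above.
clinician_articles = 887
clinician_authors = 544
full_time_clinicians = 5056

# Author-to-article ratio, reported as 1:1.6.
articles_per_author = clinician_articles / clinician_authors

# Share of the clinician workforce acting as principal investigators.
investigator_share = 100 * clinician_authors / full_time_clinicians


def articles_per_100_clinicians(n_articles, n_clinicians):
    """Assumed definition of the 'articles per clinician (x100)' scale."""
    if n_clinicians <= 0:
        raise ValueError("n_clinicians must be positive")
    return 100 * n_articles / n_clinicians


print(f"{articles_per_author:.1f} articles per author")
print(f"{investigator_share:.1f}% of clinicians were principal investigators")

# Hypothetical specialty with 30 articles and 200 clinicians.
print(articles_per_100_clinicians(30, 200))
```

Normalising by headcount in this way is what allows a small specialty to rank above a larger one that publishes more articles in absolute terms.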
 

Table 4. Original clinical research productivity across medical specialties among non-university hospital clinicians (2016-2021)
 
Original clinical research quality among non-university general acute care hospitals and comparisons between medical specialties
Regarding scientometric performance, the overall mean journal IF and ES were 2.34 ± 3.72 and 0.01 ± 0.07, respectively. No statistically significant difference was observed between the principal investigator’s medical specialty and the journal IF (independent-samples t test, P=0.31). However, with respect to the ES, clinical oncologists published their research in journals with a significantly higher score relative to other medical disciplines (P<0.001) [Table 5].
 

Table 5. Original clinical research quality by scientometric indices across medical specialties among non-university hospital clinicians (2016-2021)
 
For studies performed by non-university clinicians during this period, at the individual article level, the mean TNC per study was 6.33 ± 24.17, the mean number of citations per year was 1.81 ± 9.52, the mean RCR was 0.82 ± 3.32, the mean FCR was 3.37 ± 2.04, and the mean NIH percentile achieved was 28.73 ± 25.85. Combined, radiology and otorhinolaryngology research articles had a significantly higher TNC per study (P<0.01) and a higher total number of annual citations (P<0.001) compared with other medical disciplines (Table 5). Articles in anaesthesiology, ophthalmology, otorhinolaryngology, and radiology had a mean RCR exceeding 1.00, indicating that their articles received a higher citation rate than their co-citation network. In particular, anaesthesiology and ophthalmology studies achieved the highest mean NIH percentile rankings: their research outperformed 47% of all NIH-associated publications. Combined, anaesthesiology and ophthalmology studies also had a significantly higher NIH percentile ranking than other medical disciplines (P=0.001). All medical specialties had an FCR exceeding 1.00, indicating that their article citation rates were higher than those of their counterparts in the same research field. Oncology research had a significantly higher mean FCR (5.82 ± 2.41) compared with other disciplines (P<0.001) [Table 5].
 
In terms of author-level scientometric performance, 18.0% (98/544) of authors did not have a documented h-index. The mean h-index for the remaining researchers was 7.54 ± 10.98. Anaesthesiologists had a significantly higher h-index relative to other specialties (independent-samples t test, P=0.01) [Table 5]. A comparison of scientometric outcomes between university and non-university clinical research also demonstrated uniformly superior performance by academic institution investigators (Table 6).
 

Table 6. Comparison of scientometric performance of clinician-led research articles (2016-2021) between university and non-university hospitals
 
Original clinical research and patient outcomes
For non-university-affiliated hospitals, the overall mean crude mortality rate per 100 000 hospitalised patients was 27 722 ± 5208. Spearman’s rank analysis identified significant negative correlations of mortality rate with TNC (r=-0.69; P=0.01) and RCR (r=-0.63; P=0.022). The overall annual mean number of unplanned readmissions within 30 days of discharge was 1408 ± 756. Similarly, there were significant negative correlations of readmissions with TNC (r=-0.76; P=0.02) and RCR (r=-0.72; P=0.006). The overall mean LOS was 11.7 ± 3.1 days. No significant correlations between LOS and TNC (r=-0.32; P=0.29) or LOS and RCR (r=-0.36; P=0.23) were detected. None of the other scientometric indices were associated with the crude mortality rate, number of unplanned readmissions, or LOS.
 
Discussion
This study reviewed the breadth and quality of original clinical research conducted by Hong Kong’s public healthcare professionals. It is encouraging to observe that, despite the heavy workload of frontline clinicians employed in non-university public hospitals, more than 10% of the workforce engaged in original research as principal investigators. Their endeavours contributed to nearly one-third of peer-reviewed publications produced in the territory. A multi-level scientometric approach was adopted to assess research quality, and our findings indicate that the studies undertaken met the standards of their respective fields. Although the IF and ES values of the published research were not high, all medical specialties achieved a mean FCR of over 1.00. Notably, anaesthesiology, ophthalmology, otorhinolaryngology, and radiology articles attained an RCR exceeding 1.00.
 
Assessing research impact: the Relative Citation Ratio
Introduced in 2016, the RCR is a relatively novel article-level metric that measures a publication’s relevance within the biomedical literature.24 It was developed in response to the limitations of conventional indicators of scientific quality, such as the IF25 and h-index.26 For example, as multidisciplinary collaborations have become more common, researchers in disparate fields may have unequal access to high-profile journals, undermining the IF as a reliable reflection of a study’s performance.21 24 25 Conversely, the h-index does not consider an author’s total number of citations and instead reflects cumulative output, which can disadvantage early-career researchers. Despite their limitations, the IF25 and h-index26 remain pivotal scientometric indices in decisions related to funding and career progression. Given that citations are widely recognised as a form of acknowledging a researcher’s contribution to the field, efforts have been made to formalise this practice into a quantifiable metric. Endorsed by the NIH, the RCR harnesses an article’s co-citation network, normalising the number of citations received according to the article’s publication time and field of expertise. It is calculated as the ratio of the article’s actual citation rate—derived from the FCR—to the expected rate, benchmarked against NIH-funded publications issued in the same year and specialty.24 In recent years, the RCR has gained recognition as a more reliable indicator of an article’s performance within its peer comparison group and is increasingly cited in research grant applications.27 28 29
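The ratio described above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the NIH implementation: the expected citation rate is passed in as a given number, whereas in practice it is derived from the article's co-citation network and a benchmark of NIH-funded publications. The function names and all input values are hypothetical.

```python
def article_citation_rate(total_citations, years_since_publication):
    """Citations per year received by a single article."""
    if years_since_publication <= 0:
        raise ValueError("years_since_publication must be positive")
    return total_citations / years_since_publication


def relative_citation_ratio(actual_rate, expected_rate):
    """Ratio of an article's actual citation rate to its expected rate.

    A value above 1.0 means the article is cited more often than the
    benchmark for its field and publication year.
    """
    if expected_rate <= 0:
        raise ValueError("expected_rate must be positive")
    return actual_rate / expected_rate


# Hypothetical article: 12 citations over 4 years, against an assumed
# expected rate of 2.5 citations per year for its field and year.
acr = article_citation_rate(total_citations=12, years_since_publication=4)
rcr = relative_citation_ratio(acr, expected_rate=2.5)
print(f"ACR = {acr:.2f} citations/year, RCR = {rcr:.2f}")
```

Because the denominator is specific to field and year, two articles with identical raw citation counts can have different ratios, which is the property that makes the metric comparable across specialties.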
 
Comparisons with university-affiliated hospitals
The present study showed that university hospitals not only outperformed non-university hospitals in terms of research productivity, but also demonstrated greater influence across all scientometric outcomes. In addition to resource consolidation and the employment of clinician-scientists, another reason for this discrepancy might be the type of studies produced. Approximately half of the articles from non-university hospitals were case reports or technical notes, which provide a lower level of evidence in the evidence-based medicine hierarchy and consequently tend to receive fewer citations. Nonetheless, this form of research is more accessible to junior clinicians and can serve as a gateway to medical writing in resource-limited settings.30 Case reports offer valuable insights into the real-world implications of clinical practice—findings that well-designed randomised controlled trials may fail to capture. They can also stimulate others to report similar observations, serving as a hypothesis-generating opportunity for subsequent systematic enquiry.31
 
Translating research impact into real-world patient outcomes
Few studies have tested the hypothesis that research activity results in improved patient outcomes.4 5 7 8 32 We observed that non-university hospitals whose staff engaged in clinical research had lower crude mortality rates and annual 30-day unplanned readmissions. These findings are supported by reports that patients treated at hospitals participating in clinical trials fared better in terms of 30-day post-intervention mortality and overall survival, relative to those treated at hospitals without such arrangements. This trend has been observed for conditions including acute myocardial infarction, small-cell lung cancer, colorectal cancer, breast cancer, and ovarian cancer.5 33 34 35 36 37 The possibility of a trial effect was reinforced by a systematic review of 13 studies, which attributed this phenomenon to healthcare providers’ greater adherence to clinical practice guidelines and their inclination to adopt evidence-based practices.8 A subsequent systematic review of 33 studies further demonstrated that research activity improved healthcare system performance—reflected by reductions in LOS and risk-adjusted mortality, as well as improvements in patient satisfaction.9 In contrast, few studies have quantitatively analysed peer-reviewed scientometric data and its relationship with patient outcomes. For specific disease conditions, a negative correlation was observed between acute myocardial infarction–related risk-adjusted mortality and a weighted citations ratio among 50 Spanish hospitals.7 A review of 147 National Health Service trusts in the UK demonstrated a negative correlation between the number of research article citations per admission and standardised mortality ratios.5 Econometric modelling using data from 189 Spanish hospitals detected a significant reduction in LOS among institutions that published more clinical research articles or had a higher TNC per article.6
 
Encouraging public hospital healthcare professionals to become principal investigators
There is increasing evidence that clinical research engagement improves patient outcomes, but several barriers to participation remain. First, clinicians have demanding responsibilities that often prohibit involvement in this time-consuming and resource-intensive activity.38 39 40 Clinicians are under-recognised for their overtime efforts—when such work is typically undertaken—and are overburdened with administrative procedures. Research-supportive policies that provide protected time or incentivise clinicians through career advancement could help foster a more scholastic environment.40 Second, Hong Kong has a lengthy and duplicative clinical trial approval process. In a survey of 250 clinician-researchers, 90% reported that approval for a phase I first-in-human study certificate from the Hong Kong SAR Government’s Department of Health required over 3 months.15 Additionally, for HA Clinical Research Ethics Committee study approvals, 50% of respondents reported that the process typically lasted more than 3 months, whereas multi-centre trials frequently required over a year to begin recruitment.15 The establishment of a primary review authority for investigative drug registration—similar to the United States Food and Drug Administration, European Medicines Agency, or China’s National Medical Products Administration—could help streamline regulatory pathways. Third, most funding agencies favour academician-led research over community clinician-led efforts.38 For example, the existing Hong Kong SAR Government’s Health and Medical Research Fund and the Health Care and Promotion Fund—with a combined annual budget of US$530 million—have primarily been allocated to academicians with access to robust research infrastructure. 
The lack of financial support for community hospitals to develop research capabilities can have clinical implications.4 38 39 40 A review of funding allocations from the NIHR revealed that National Health Service trusts receiving relatively lower levels of research funding had higher risk-adjusted mortality.4 A survey of healthcare professionals in Ontario, Canada, showed that 46% were dissatisfied with their research involvement, although 83% agreed it benefited their careers.39 The major barriers identified were a lack of mentorship and institutional stewardship.39 The establishment of a clinical research institute and academy dedicated to supporting early-career clinician-scientists could help address these challenges.15 Modelled after the NIHR, the provision of publicly funded administrative services to accelerate translational research—by facilitating grant applications for non–university-affiliated hospitals, offering biostatistical support, training research support staff, and nurturing partnerships in a multi-stakeholder ecosystem—can be transformative.40 Following the introduction of NIHR services, there was a tenfold rise in publications, accompanied by a significant increase in mean citation ratios.41 A survey of NIHR stakeholders—including clinicians, nurses, and allied health professionals—also revealed that its training programmes enhanced their research capacity and strengthened individual career development.42
 
Limitations
This study had several limitations. First, we retrieved studies only from the MEDLINE database and not from other sources such as Scopus, Web of Science, Google Scholar, PsycINFO (psychology), CINAHL (nursing and allied health), or HMIC (healthcare management, administration, and policy). MEDLINE was selected because it is the only freely accessible primary source for interrogating the biomedical literature without requiring an institutional user account. While MEDLINE focuses primarily on medicine and the biomedical sciences, other databases cover broader disciplines. Inclusion of these databases would have been ideal, but resource constraints prevented manual review for relevance. Second, a comparison of patient outcomes between university and non-university hospitals was not performed as we were unable to determine whether the principal investigator at teaching hospitals was HA-employed or university-affiliated. Third, only crude mortality rates and LOS were evaluated. A more comprehensive review of public healthcare system key performance indicators—such as risk-adjusted or standardised mortality rates, symptom-to-intervention durations, incremental cost-effectiveness ratios, and patient satisfaction survey results—would have provided greater insight if such data were available.6 43 Important confounding factors were also not assessed, including each hospital’s annual operational income; differences in catchment population size and demographics; and variations in the scope of acute clinical services provided. For example, some institutions are recognised as level-one trauma centres or infectious disease centres. Finally, clinical research from specialties such as psychiatry and family medicine was likely under-represented, as most clinicians in these fields work in dedicated psychiatric hospitals or general outpatient clinics, which are outside the scope of this study focused on general acute care hospitals.
 
Conclusion
This study revealed that clinicians in Hong Kong’s public healthcare system produced nearly one-third of the original peer-reviewed clinical research articles published from the territory. Although the majority of these articles were case reports or retrospective studies, they achieved a relatively high degree of research influence within their respective medical specialties. Research productivity appears to be associated with improved patient outcomes, particularly in terms of crude mortality rates and 30-day unplanned readmissions. Future studies using more refined key performance indicator endpoints and adjustments for confounding factors are necessary to ascertain whether research-active institutions consistently deliver better patient outcomes.
 
Author contributions
Concept or design: PYM Woo, DKK Wong, YF Lau.
Acquisition of data: PYM Woo, DKK Wong, QHW Wong, CKL Leung, YHK Ip, DM Au, TSF Shek, B Siu, C Cheung, K Shing, A Wong, Y Khare, OWK Tsui, NHY Tang, KM Kwok, MK Chiu.
Analysis or interpretation of data: PYM Woo, DKK Wong.
Drafting of the manuscript: PYM Woo, DKK Wong.
Critical revision of the manuscript for important intellectual content: PYM Woo, DKK Wong, KHM Wan, WC Leung.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank Kwong Wah Hospital’s Clinical Research Centre, The Hong Kong Student Association of Neuroscience and the Hong Kong Olympia Academy for Clinical Neuroscience Research for providing essential secretarial support and assisting with data acquisition.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
The research was approved by the Joint Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee, Hong Kong (Ref No.: 2025.256).
 
References
1. Djulbegovic B, Guyatt GH. Progress in evidence-based medicine: a quarter century on. Lancet 2017;390:415-23.
2. Ioannidis JP. Why most clinical research is not useful. PLoS Med 2016;13:e1002049.
3. Chin WY, Wong WC, Yu EY. A survey exploration of the research interests and needs of family doctors in Hong Kong. Hong Kong Pract 2019;41:29-38.
4. Ozdemir BA, Karthikesalingam A, Sinha S, et al. Research activity and the association with mortality. PLoS One 2015;10:e0118253.
5. Bennett WO, Bird JH, Burrows SA, Counter PR, Reddy VM. Does academic output correlate with better mortality rates in NHS trusts in England? Public Health 2012;126 Suppl 1:S40-3.
6. García-Romero A, Escribano Á, Tribó JA. The impact of health research on length of stay in Spanish public hospitals. Res Policy 2017;46:591-604.
7. Pons J, Sais C, Illa C, et al. Is there an association between the quality of hospitals’ research and their quality of care? J Health Serv Res Policy 2010;15:204-9.
8. Clarke M, Loudon K. Effects on patients of their healthcare practitioner’s or institution’s participation in clinical trials: a systematic review. Trials 2011;12:16.
9. Boaz A, Hanney S, Jones T, Soper B. Does the engagement of clinicians and organisations in research improve healthcare performance: a three-stage review. BMJ Open 2015;5:e009415.
10. National Institute for Health and Care Research. NIHR Annual Report 2022/23. 2024. Available from: https://www.nihr.ac.uk/reports/nihr-annual-report-202223/34501. Accessed 18 Oct 2024.
11. Ciemins EL, Mollis BL, Brant JM, et al. Clinician engagement in research as a path toward the learning health system: a regional survey across the northwestern United States. Health Serv Manage Res 2020;33:33-42.
12. Cheung BM, Yau HK. Clinical therapeutics in Hong Kong. Clin Ther 2019;41:592-7.
13. Sek AC, Cheung NT, Choy KM, et al. A territory-wide electronic health record—from concept to practicality: the Hong Kong experience. Stud Health Technol Inform 2007;129:293-6.
14. Kong X, Yang Y, Gao J, et al. Overview of the health care system in Hong Kong and its referential significance to mainland China. J Chin Med Assoc 2015;78:569-73.
15. Our Hong Kong Foundation; Hong Kong Science and Technology Parks. Developing Hong Kong into Asia’s leading clinical innovation hub. November 2023. Available from: https://ourhkfoundation.org.hk/sites/default/files/media/pdf/OHKF_BioTech_Report_2023_EN.pdf. Accessed 25 Oct 2024.
16. Kovalchik S. RISmed: download content from NCBI Databases. R package version 2.3.0; 2021. Available from: https://CRAN.R-project.org/package=RISmed. Accessed 2 Jan 2023.
17. The R Foundation. R: a language and environment for statistical computing. Vienna, Austria; 2022. Available from: https://www.R-project.org/. Accessed 2 Jan 2023.
18. Aquilina J, Neves JB, Tran MG. An overview of study designs. Br J Hosp Med (Lond) 2020;81:1-6.
19. Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. Syst Rev 2021;10:89.
20. Krnic Martinic M, Meerpohl JJ, von Elm E, Herrle F, Marusic A, Puljak L. Attitudes of editors of core clinical journals about whether systematic reviews are original research: a mixed-methods study. BMJ Open 2019;9:e029704.
21. Garfield E. The history and meaning of the journal impact factor. JAMA 2006;295:90-3.
22. Suelzer EM, Jackson JL. Measures of impact for journals, articles, and authors. J Gen Intern Med 2022;37:1593-7.
23. Bergstrom CT, West JD. Assessing citations with the Eigenfactor metrics. Neurology 2008;71:1850-1.
24. Hutchins BI, Yuan X, Anderson JM, Santangelo GM. Relative Citation Ratio (RCR): a new metric that uses citation rates to measure influence at the article level. PLoS Biol 2016;14:e1002541.
25. Casadevall A, Fang FC. Impacted science: impact is not importance. mBio 2015;6:e01593-15.
26. Kreiner G. The slavery of the h-index—measuring the unmeasurable. Front Hum Neurosci 2016;10:556.
27. Reddy V, Gupta A, White MD, et al. Assessment of the NIH-supported relative citation ratio as a measure of research productivity among 1687 academic neurological surgeons. J Neurosurg 2020;134:638-45.
28. Didzbalis CJ, Avery Cohen D, Herzog I, Park J, Weisberger J, Lee ES. The Relative Citation Ratio: a modern approach to assessing academic productivity within plastic surgery. Plast Reconstr Surg Glob Open 2022;10:e4564.
29. Gupta A, Meeter A, Norin J, Ippolito JA, Beebe KS. The Relative Citation Ratio (RCR) as a novel bibliometric among 2511 academic orthopedic surgeons. J Orthop Res 2023;41:1600-6.
30. Balinska MA, Watts RA. The value of case reports in democratising evidence from resource-limited settings: results of an exploratory survey. Health Res Policy Syst 2020;18:84.
31. Suvvari TK. Are case reports valuable? Exploring their role in evidence-based medicine and patient care. World J Clin Cases 2024;12:5452-5.
32. Selby P, Autier P. The impact of the process of clinical research on health service outcomes. Ann Oncol 2011;22 Suppl 7:vii5-9.
33. Majumdar SR, Roe MT, Peterson ED, Chen AY, Gibler WB, Armstrong PW. Better outcomes for patients treated at hospitals that participate in clinical trials. Arch Intern Med 2008;168:657-62.
34. Rich AL, Tata LJ, Free CM, et al. How do patient and hospital features influence outcomes in small-cell lung cancer in England? Br J Cancer 2011;105:746-52.
35. Rochon J, du Bois A. Clinical research in epithelial ovarian cancer and patients’ outcome. Ann Oncol 2011;22 Suppl 7:vii16-9.
36. Janni W, Kiechle M, Sommer H, et al. Study participation improves treatment strategies and individual patient care in participating centers. Anticancer Res 2006;26:3661-7.
37. Downing A, Morris EJ, Corrigan N, et al. High hospital research participation and improved colorectal cancer survival outcomes: a population-based study. Gut 2017;66:89-96.
38. DiDiodato G, DiDiodato JA, McKee AS. The research activities of Ontario’s large community acute care hospitals: a scoping review. BMC Health Serv Res 2017;17:566.
39. Senecal JB, Metcalfe K, Wilson K, Woldie I, Porter LA. Barriers to translational research in Windsor Ontario: a survey of clinical care providers and health researchers. J Transl Med 2021;19:479.
40. Gehrke P, Binnie A, Chan SP, et al. Fostering community hospital research. CMAJ 2019;191:E962-6.
41. Kiparoglou V, Brown LA, McShane H, Channon KM, Shah SG. A large National Institute for Health Research (NIHR) Biomedical Research Centre facilitates impactful cross-disciplinary and collaborative translational research publications and research collaboration networks: a bibliometric evaluation study. J Transl Med 2021;19:483.
42. Burkinshaw P, Bryant LD, Magee C, et al. Ten years of NIHR research training: perceptions of the programmes: a qualitative interview study. BMJ Open 2022;12:e046410.
43. Jonker L, Fisher SJ, Dagnan D. Patients admitted to more research-active hospitals have more confidence in staff and are better informed about their condition and medication: results from a retrospective cross-sectional study. J Eval Clin Pract 2020;26:203-8.

Presentation, management, and clinical outcomes of von Hippel–Lindau syndrome

Hong Kong Med J 2025 Oct;31(5):355–62 | Epub 28 Aug 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE  CME
Presentation, management, and clinical outcomes of von Hippel–Lindau syndrome
Athena YH Lee, MB, ChB1,2 #; David KW Leung, MB, ChB, FRCS1 #; CH Leung, MSc1; Kelly HY Tsang1; Alvina Yiu1; Chloe YK Ho1; Jason MK Ho, FHKAM (Surgery), FRCSEd (Neurosurgery)3; CF Ng, MD, FHKAM (Surgery)1,4
1 Division of Urology, Department of Surgery, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong SAR, China
2 Cardio-Oncology Research Unit, Cardiovascular Analytics Group, Hong Kong, China–UK Collaboration, Hong Kong SAR, China
3 Division of Neurosurgery, Department of Surgery, Tuen Mun Hospital, Hong Kong SAR, China
4 SH Ho Urology Centre, The Chinese University of Hong Kong, Hong Kong SAR, China
# Equal contribution
 
Corresponding author: Prof CF Ng (ngcf@surgery.cuhk.edu.hk)
 
 
Abstract
Introduction: von Hippel–Lindau (VHL) syndrome is a rare autosomal dominant genetic disorder that typically leads to the development of multiple tumours in various organs. This study describes the lifetime journey of VHL patients in terms of their hospitalisation, surgery, and functional impairment, and aims to examine the local presentation patterns, treatment courses, and clinical outcomes associated with the condition.
 
Methods: Thirty-two patients with VHL syndrome (mean age=27.9 ± 12.6 years) were retrospectively identified from five local public hospitals managed between 1 January 1993 and 30 September 2024, with a follow-up duration of 18.0 ± 10.8 years. Patient demographics, disease presentation, length of hospital stay, and treatments received were recorded and analysed.
 
Results: Over a total of 575.9 person-years, 17 patients (53.1%) developed renal tumours and 10 (31.3%) underwent partial or radical nephrectomy. Twenty-four patients (75.0%) underwent central nervous system (CNS) surgery for haemangioblastoma. Eleven patients (34.4%) had phaeochromocytoma, and eight (25.0%) underwent adrenalectomy. Nine patients (28.1%) had retinal haemangioblastoma. During the study period, 368 emergency department visits, 1209 inpatient admissions, 192 intensive care unit days, and 5635 hospitalisation days were recorded. In total, 116 surgeries were performed involving the kidneys (n=17), pancreas (n=6), adrenal glands (n=10), and CNS (n=83). Six patients required dialysis; 4373 dialysis sessions were performed. Fifteen patients died. Among the nine who died of VHL syndrome, eight had developed cerebral haemangioblastoma, three had phaeochromocytoma, and four had renal tumours.
 
Conclusion: Patients with VHL syndrome often experience early-onset and recurrent diseases affecting multiple organ systems, leading to substantial morbidity and mortality. A multidisciplinary approach, along with the introduction of novel treatments, may improve disease control and clinical outcomes.
 
 
New knowledge added by this study
  • This study examined the disease journey of von Hippel–Lindau (VHL) patients in Hong Kong, providing insights into disease presentation patterns, the number of treatments and procedures required, treatment outcomes, and morbidity data.
  • The study analysed the substantial healthcare costs incurred in managing VHL syndrome, highlighting the economic burden on healthcare systems due to repeated admissions, multidisciplinary care, long-term follow-up, surgeries, and other interventions, notably VHL syndrome–related renal cell carcinoma treatment and kidney dialysis.
  • The study emphasises the potential benefits of novel treatments such as belzutifan in managing VHL syndrome among local patients, with promising results that could transform the treatment landscape for this rare genetic disorder, thus reducing disease burden and improving the quality of life of patients.
Implications for clinical practice or policy
  • Given the cross-specialty manifestations of VHL syndrome, the study underscores the importance of a multidisciplinary approach in its management, thereby demonstrating the value of collaborative care in improving clinical outcomes.
  • The study’s findings may prompt policymakers to re-evaluate existing healthcare policies related to rare genetic disorders such as VHL syndrome, particularly in expanding access to innovative treatments by adding belzutifan to the Hospital Authority Drug Formulary.
  • The study highlights the need for dedicated funding to establish local VHL syndrome registries, thereby supporting further clinical trials and large-scale research. The creation of patient support programmes may also contribute to a healthcare environment that addresses the unique challenges faced by VHL patients and fosters a holistic approach to care.
 
 
Introduction
von Hippel–Lindau (VHL) syndrome is a rare autosomal dominant genetic disorder characterised by benign and malignant tumours, including clear cell renal cell carcinoma (RCC), adrenal phaeochromocytoma, pancreatic neuroendocrine tumour, and retinal and central nervous system haemangioblastoma (CNS-Hb).1 According to a 2017 study, its incidence is estimated to be one in 27 300 live births.2 The multi-system manifestations of VHL typically require repeated admissions, multidisciplinary care, and long-term follow-up, placing a substantial socio-economic burden on healthcare systems. Recently, belzutifan, a second-generation hypoxia-inducible factor (HIF)-2α inhibitor, has shown promising results in a phase 2 study involving Western populations.3 However, its applications and benefits for Asian patients remain poorly understood.
 
This multi-centre retrospective cohort study investigated VHL patients to examine local presentation patterns, treatment courses, and clinical and functional outcomes. The findings aim to provide insight into the presentation and management of VHL in Asian patients and, more importantly, to inform resource allocation.
 
Methods
This study identified patients with VHL syndrome from five local public hospitals—Prince of Wales Hospital, Alice Ho Miu Ling Nethersole Hospital, North District Hospital, Tuen Mun Hospital, and Pok Oi Hospital—managed between 1 January 1993 and 31 December 2023, with follow-up data collected up to 30 September 2024. The Clinical Data Analysis and Reporting System, a local online platform recording clinical data from all public hospitals in Hong Kong, was used for patient identification. Patient demographics and clinical information regarding disease course and treatment outcomes were retrieved from the Clinical Management System, an online database storing electronic patient records for public hospitals in Hong Kong. The following data were collected for each included patient: demographic factors (age, sex, body mass index, performance status, and co-morbidities); disease characteristics (initial presentation, time of diagnosis, lag time to diagnosis, number and size of renal and extrarenal lesions, and response or recurrence patterns); treatment details (number and frequency of surgical or ablative interventions, hospital length of stay, intensive care unit [ICU] admissions, associated costs, and resultant complications and disabilities); and health outcomes (health-adjusted life years, quality of life estimates, and economic parameters related to hospitalisations, outpatient services, and medical and surgical care).
 
The study endpoints included rates of VHL-spectrum disease (CNS-Hb, choroid plexus papilloma, retinal haemangioblastoma, endolymphatic sac tumour, RCC, renal cyst, renal angiomyolipoma, phaeochromocytoma, paraganglioma, pancreatic cyst, pancreatic neuroendocrine tumour, pancreatic adenocarcinoma, and liver cyst), emergency department (ED) attendance, admissions, surgeries, and functional outcomes (independent in activities of daily living, wheelchair-bound, or bedbound). According to the local public healthcare system in Hong Kong, the mean cost per ambulatory emergency attendance and per hospitalisation day was HK$750 (US$96.2) and HK$3440 (US$441), respectively.4 The total cost of hospital attendance was defined as the sum of ED and inpatient attendance costs. Descriptive statistics, including mean, standard deviation, median, and interquartile range, were used to summarise the data.
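As a rough illustration of the cost definition above, the following Python sketch multiplies attendance counts by the two unit costs quoted in the text. The function name and the example counts are hypothetical, and the sketch assumes that inpatient cost accrues per hospitalisation day; it is not the authors' actual calculation.

```python
# Unit costs quoted in the text (HK$); all other values are illustrative.
COST_PER_ED_ATTENDANCE = 750
COST_PER_INPATIENT_DAY = 3440


def total_attendance_cost(ed_attendances: int, inpatient_days: int) -> int:
    """Sum of ED and inpatient attendance costs in HK$ (hypothetical helper)."""
    if ed_attendances < 0 or inpatient_days < 0:
        raise ValueError("counts must be non-negative")
    ed_cost = ed_attendances * COST_PER_ED_ATTENDANCE
    inpatient_cost = inpatient_days * COST_PER_INPATIENT_DAY
    return ed_cost + inpatient_cost


# Hypothetical patient: 4 ED attendances and 10 days in hospital.
example_cost = total_attendance_cost(4, 10)
print(example_cost)  # 4 * 750 + 10 * 3440 = 37400
```

How such totals are annualised or averaged per patient is not specified in the text, so that step is left out of the sketch.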
 
Results
Demographics
Initially, 87 patients were identified. After manual review of the medical records, 52 were excluded due to incorrect diagnoses (three non-VHL, two Cowden syndrome, 17 Peutz–Jeghers syndrome, 28 Sturge–Weber syndrome, one hamartoma, and one duplicate record). Two additional patients were excluded due to incomplete data, and one further duplicate was removed. The incorrect diagnoses were likely due to similarities and overlaps in the diagnostic codes used for these conditions.
 
In total, 32 patients were deemed eligible for inclusion, of whom 21 (65.6%) were male. The mean age at first presentation was 27.9 ± 12.6 years and the mean follow-up duration was 18.0 ± 10.8 years. All patients developed tumours. Seventeen patients (53.1%) had renal tumours, and 10 (31.3%) underwent partial or radical nephrectomy. Twenty-four patients (75.0%) underwent CNS surgery for haemangioblastoma. Eleven patients (34.4%) had phaeochromocytoma, and eight (25.0%) underwent adrenalectomy. Retinal haemangioblastoma occurred in nine patients (28.1%). Demographic and disease prevalence data within the VHL syndrome spectrum are summarised in Table 1.
 

Table 1. Demographic and prevalence data of lesions in von Hippel–Lindau syndrome (n=32)
 
von Hippel–Lindau syndrome–related mortality
Over a total of 575.9 person-years, 15 patients died. Causes of death were VHL syndrome in nine (60%), pneumonia in three (20%), metastatic lung cancer in one (6.7%), sepsis in one (6.7%), and congestive heart failure in one (6.7%). Among those who died of VHL syndrome–related tumours, eight had CNS haemangioblastoma, three had phaeochromocytoma, and four had renal tumours. Even in patients whose causes of death were not directly related to VHL, strong associations were observed with the sequelae of VHL-spectrum diseases and treatments. All three patients who died of chest infections were wheelchair-bound after neurosurgical treatment of CNS-Hb; one of them required long-term steroids following bilateral adrenalectomy. The patient who died of sepsis had paraplegia after spinal surgery and end-stage renal failure (ESRF) requiring peritoneal dialysis. The source of sepsis was likely peritoneal dialysis–related peritonitis. All-cause mortality and VHL syndrome–related mortality over time since presentation are shown in Figures 1 and 2, respectively.
 

Figure 1. Kaplan-Meier curve demonstrating all-cause mortality over time since presentation
 

Figure 2. Kaplan-Meier curve demonstrating von Hippel–Lindau syndrome–related mortality over time since presentation
 
von Hippel–Lindau syndrome–related morbidity
Nine patients (28.1%) developed chronic kidney disease, of whom six progressed to ESRF (estimated glomerular filtration rate <15 mL/min/1.73 m²). All six (18.8%) required renal replacement therapy: three underwent haemodialysis, one received peritoneal dialysis, and two began peritoneal dialysis before switching to haemodialysis.
 
By the last follow-up, 15 patients had died, whereas 17 remained independent in their activities of daily living. None of the 17 surviving patients were wheelchair-bound or bedbound.
 
Belzutifan usage
Belzutifan was prescribed to three patients. The mean age at presentation was 26.3 years, with the youngest at 17 years and the oldest at 35 years. The average duration from initial presentation to the initiation of belzutifan therapy was 22.9 years. All three patients had CNS haemangioblastoma, with one experiencing multiple recurrences. One patient also had phaeochromocytoma, and another had a renal tumour. Patient characteristics are summarised in Table 2. The duration of belzutifan therapy ranged from 1 to 7.6 months. Of the three patients, two required dose reductions due to adverse events—specifically, anaemia and deranged liver function.
 

Table 2. Characteristics of belzutifan users
 
von Hippel–Lindau syndrome–attributable healthcare costs
During the study period, a total of 368 ED visits, 1209 inpatient admissions, and 5635 days of hospitalisation were recorded. In total, 21 patients had ICU stays, amounting to 192 ICU days. These utilisation patterns translated to an annualised per-patient ED visit–related cost of HK$8625 and an annualised per-patient inpatient admission–related cost of HK$129 968.4 Six patients required dialysis, and 4373 dialysis sessions were performed during the study period, resulting in a total cost of HK$28.8 million (HK$6580 per dialysis session).4 For the belzutifan patient cohort, no ED visits or inpatient admissions were recorded after initiation of belzutifan therapy, likely due to the short follow-up duration after the prescription of this drug newly approved by the United States Food and Drug Administration. Consequently, we could not directly compare the healthcare cost burden between belzutifan users and non-users.
 
The pattern of tumour-related surgeries and accident and emergency admissions in VHL patients was highly variable; some patients experienced periods of intense activity followed by quieter phases, suggesting non-linear disease progression. Tumour-related operations and deaths since diagnosis are shown in Figure 3, whereas accident and emergency admissions are presented in Figure 4, highlighting individual disease burden. Monitoring and management should be tailored to address these fluctuating needs.
 

Figure 3. Event plot showing tumour-related operations for individual patients with von Hippel–Lindau syndrome since diagnosis
 

Figure 4. Event plot showing accident and emergency visits for individual patients with von Hippel–Lindau syndrome since diagnosis
 
Discussion
From this review, we observed that VHL-spectrum diseases emerged at a young age and recurred throughout patients’ lives, leading to considerable morbidity and mortality. This finding is consistent with existing literature. There is a pressing need to improve the current care of VHL syndrome in Hong Kong to enhance patients’ life trajectories and quality of life.
 
Pathophysiology
The VHL protein normally functions as an E3 ubiquitin ligase that facilitates ubiquitination of the alpha subunit of HIF, leading to its proteolysis.5 In VHL patients, genetic alterations reduce VHL protein activity, thereby disinhibiting HIF-mediated transcription. Consequently, the overexpression of vascular endothelial growth factor, cyclin D1, glucose transporter 1, and erythropoietin promotes neoplastic growth.5 6 The resultant tissue overgrowth leads to early-onset, recurrent, and multi-system benign and malignant neoplasms.1
 
Functional impairment in patients
Patients with VHL syndrome experience a lifelong journey with the disease, characterised by substantial morbidity and mortality.
 
An Italian study of 128 VHL patients showed that the natural history varied according to disease manifestations.7 For RCC, the median age at first presentation was 31 years,7 similar to our cohort, which had a median age of 27.4 years. The first progression typically occurred after 7 to 8 years; a second progression followed 1 to 2 years later. von Hippel–Lindau syndrome–related cerebellar haemangioblastomas generally developed at a median age of 30 years and progressed relatively consistently every 3.5 years. The cumulative incidences of disability were 26.5% for CNS involvement, 16.4% for visual disturbance, 12.5% for hearing loss, 10.9% for adrenergic dysfunction, 4.6% for pancreatic morbidity, and 1.5% for renal impairment.7 One patient died of metastatic RCC (0.8%), another entered a vegetative state after a CNS procedure (0.8%), and five died of postoperative complications (3.9%).7 Overall, the average Karnofsky performance status was 80% at the end of follow-up.7
 
In contrast, in our cohort, the nine patients who died of VHL syndrome–related tumours succumbed to the disease itself, rather than postoperative complications. This highlights the substantial impact of such tumours on patient mortality, underscoring the need for vigilant monitoring and comprehensive management strategies to improve outcomes.
 
Surgery and radiosurgery for von Hippel–Lindau syndrome–related tumours
Central nervous system haemangioblastomas represent a major and disabling manifestation of VHL syndrome. A prospective natural history study focusing on stereotactic radiosurgery for CNS-Hb in VHL patients reported outcomes from 20 individuals treated for 44 lesions.8 Most lesions were located in the cerebellum (n=39), and five in the brainstem. The mean age at treatment was 37.5 ± 12.0 years.8 All patients were alive at a mean follow-up interval of 8.5 years. Tumours (mean volume: 0.5 ± 0.7 cm³) were treated with a mean prescription dose of 18.9 Gy (range, 12-24) to the tumour margin, resulting in local control rates of 91%, 83%, 61%, and 51% at 2, 5, 10, and 15 years, respectively.8 Despite the favourable early response to stereotactic radiosurgery, VHL syndrome–related haemangioblastomas tend to progress during long-term follow-up.
 
With respect to the treatment of VHL syndrome–related RCC (ie, VHL-RCC), the rule of thumb is to strike a balance between oncological control and renal function preservation to avoid or delay ESRF. Common strategies include nephron-sparing surgery and ablative therapies. A retrospective review of VHL-RCC by Duffey et al9 suggested that 3 cm was a reasonable cutoff, beyond which metastasis may occur earlier; therefore, nephron-sparing surgery would be indicated. In a cohort of 54 VHL patients who underwent nephron-sparing surgery, nephrectomy, or thermal ablation for RCC,10 97 kidney treatments were performed. Nephron-sparing surgery was adopted in 96% of first and 67% of second interventions. The probabilities of a second surgery were 21% at 5 years and 42% at 10 years. The overall survival and cancer-specific survival rates were 82.5% and 90.5%, respectively, at the 10-year follow-up. No metastasis was observed for RCCs with a maximum diameter smaller than 4 cm.10
 
Systemic therapies for von Hippel–Lindau syndrome
With greater understanding of the genetics and pathophysiology of VHL syndrome, researchers have been actively developing effective systemic therapies. The advent of belzutifan has revolutionised systemic therapy for VHL syndrome. This HIF-2α inhibitor demonstrated satisfactory objective response rates for RCC (49%), pancreatic lesions (77%), and CNS-Hb (30%), along with an acceptable safety profile; anaemia and fatigue were the most common side-effects.3 On 13 August 2021, belzutifan was approved by the United States Food and Drug Administration for use in adult VHL patients who need treatment for associated RCC, CNS-Hb, or pancreatic neuroendocrine tumours not requiring immediate surgery.11
 
The LITESPARK-004 (MK-6482-004) phase 2 study further supports the clinical benefits of belzutifan in patients with VHL syndrome.12 With over 2 years of follow-up data, the study demonstrated sustained efficacy in reducing tumour burden across multiple organs.12 Objective response rates were consistent with earlier findings: 49% for RCC, 77% for pancreatic lesions, and 30% for CNS-Hb.12 Notably, the responses were durable, with many patients experiencing prolonged disease control without surgical intervention. The safety profile remained acceptable; anaemia and fatigue were the most common adverse events.12 These findings reinforce belzutifan’s potential as a transformative systemic therapy, offering a non-invasive alternative to repeated surgeries and improving patient quality of life. Continued research and access to such therapies, particularly in Asian populations, are essential.
 
Socio-economic impact
von Hippel–Lindau syndrome–related RCC is a notable malignancy within the disease spectrum. In our cohort, the annualised per-patient ED visit–related cost for VHL-RCC patients was HK$2070, and the annualised inpatient admission cost was HK$23 965. In comparison, an American study reported that VHL-RCC patients (n=160) incurred US$36 450 more annually than the control group (n=800), including US$21 123 more for RCC management.13 Among complications, ESRF was the most costly, requiring US$65 338 over 6 months post-nephrectomy.13 Similarly, our cohort incurred approximately HK$28.8 million during the study period for repeated dialysis in six patients with ESRF.
 
Another claims-based study showed that CNS-Hb and pancreatic neuroendocrine tumours due to VHL syndrome similarly increased annual healthcare costs by US$49 645 compared with the control group.14 These findings underscore the importance of novel therapies that can alleviate both clinical and economic burdens.
 
In our local hospital system, the estimated annualised per-patient ED visit–related cost was HK$8625, and the annualised per-patient inpatient admission–related cost was HK$129 968. Additionally, dialysis for the six patients with ESRF required an additional HK$28.8 million. We did not include calculations for the surgical treatment of all tumours and related management due to the practical difficulties of cost estimation within the public hospital system. Nevertheless, we expect these costs to be substantial. Although the current drug cost for belzutifan is high (estimated at around CAD$17 920 per 28 days15), the medical expenses associated with the natural course of VHL syndrome are also considerable. Evidence regarding the cost-effectiveness of medical therapies, including belzutifan, is still emerging; it is important to consider the composite outcomes of mortality, healthcare-related costs, irreversible morbidities, and social dysfunction. Further economic studies are warranted to quantify the potential cost savings associated with this novel treatment.
 
Future directions to optimise care
von Hippel–Lindau syndrome greatly affects patients’ clinical outcomes and quality of life. Frequent hospitalisations, repeated medical and surgical therapies, and recurrent tumours contribute to cumulative morbidities and mortality. The need for multidisciplinary care, ongoing surveillance for recurrence, and genetic counselling further add to the disease burden. Thus, VHL patients require improved access to novel medications.
 
As our results suggest, the management of VHL syndrome should be holistic. Patients with multiple VHL syndrome–related conditions should be discussed at multidisciplinary meetings to facilitate treatment prioritisation. A sensible approach would be to address the most life-threatening and symptomatic disease first.
 
The initial local experience of using belzutifan was promising, with manageable toxicity profiles. With the advent of its coverage by the Samaritan Fund for eligible patients,16 the role of belzutifan is expected to rise in local VHL management. While its safety and efficacy have been demonstrated in Western populations, its benefits for Asian patients remain to be fully defined. This retrospective study showed that one belzutifan user in the cohort developed fewer new-onset VHL syndrome–related conditions than non-users. However, the small sample size (three belzutifan users among 32 VHL patients) limits generalisability. Nevertheless, the encouraging initial results of belzutifan in controlling tumour growth in the kidneys, CNS, retina, and pancreas support the need for coordinated efforts in resource allocation and the establishment of subsidy schemes.3 With increased use of the medication, overall healthcare costs are expected to decline due to reductions in surgeries and hospitalisations. Given the rarity of VHL syndrome, future clinical trials should ideally be multi-national and multi-centre. Local registries should also be established to facilitate long-term follow-up, clinical trial enrolment, and policy development for this patient group. Additionally, patient support groups, social support initiatives, and increased attention to psychological well-being would help provide holistic care for VHL patients. Addressing the financial and disease-related burdens faced by this vulnerable population is essential to improving their quality of life and long-term outcomes.
 
Limitations
Our cohort did not include all VHL patients in Hong Kong. Assuming a total population of 7 million in Hong Kong18 and an incidence of one in 27 300,17 the estimated number of VHL patients in this locality is approximately 250. Our cohort excluded patients who did not present to the participating hospitals and those whose follow-up data were unavailable. Nevertheless, our study offers the first insight into the clinical journey of local VHL patients.
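The arithmetic behind the estimate of approximately 250 patients can be sketched as follows. The function name is hypothetical, and applying a birth incidence directly to the whole population is a simplification carried over from the text rather than a formal prevalence model.

```python
def estimated_cases(population: int, one_in: int) -> float:
    """Expected number of affected people if one in `one_in` is affected."""
    if one_in <= 0:
        raise ValueError("one_in must be positive")
    return population / one_in


# Figures quoted in the text: 7 million residents, incidence of one in 27 300.
estimate = estimated_cases(7_000_000, 27_300)
print(round(estimate))  # 256, i.e. approximately 250
```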
 
Conclusion
Overall, VHL patients experience early-onset and recurrent multi-systemic illness, with a substantial risk of irreversible morbidity and mortality. Multidisciplinary care and the promotion of effective treatments such as belzutifan may improve the management of this rare but important disease.
 
Author contributions
Concept or design: AYH Lee, DKW Leung.
Acquisition of data: CH Leung, KHY Tsang, A Yiu, CYK Ho.
Analysis or interpretation of data: AYH Lee, DKW Leung, CH Leung.
Drafting of the manuscript: AYH Lee, DKW Leung.
Critical revision of the manuscript for important intellectual content: JMK Ho, CF Ng.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
As an editor of the journal, CF Ng was not involved in the peer review process. Other authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank the following contributors for their expertise and support in the research: Dr Jeffrey SK Chan and Dr Esther TW Cheng of the Cardio-Oncology Research Unit, Cardiovascular Analytics Group, Hong Kong, China–UK Collaboration; and Dr Brian WH Siu, Dr Ivan CH Ko, Dr Chris HM Wong, and Dr Alex Liu of the Division of Urology, Department of Surgery, Faculty of Medicine, The Chinese University of Hong Kong.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was approved by the Joint Chinese University of Hong Kong–New Territories East Cluster Clinical Research Ethics Committee, Hong Kong (Ref No.: 2024.435). A waiver of patient consent was granted by the Committee due to the retrospective nature of the research.
 
References
1. Couch V, Lindor NM, Karnes PS, Michels VV. von Hippel–Lindau disease. Mayo Clin Proc 2000;75:265-72.
2. Binderup ML, Galanakis M, Budtz-Jørgensen E, Kosteljanetz M, Luise Bisgaard M. Prevalence, birth incidence, and penetrance of von Hippel–Lindau disease (vHL) in Denmark. Eur J Hum Genet 2017;25:301-7.
3. Jonasch E, Donskov F, Iliopoulos O, et al. Belzutifan for renal cell carcinoma in von Hippel–Lindau disease. N Engl J Med 2021;385:2036-46.
4. Hospital Authority, Hong Kong. Hospital Authority Annual Report 2007-2008. Available from: https://www.ha.org.hk/ho/corpcomm/Annual%20Report/2007-08.pdf. Accessed 10 Oct 2024.
5. Choueiri TK, Kaelin WG Jr. Targeting the HIF2-VEGF axis in renal cell carcinoma. Nat Med 2020;26:1519-30.
6. Haase VH. The VHL tumor suppressor: master regulator of HIF. Curr Pharm Des 2009;15:3895-903.
7. Feletti A, Anglani M, Scarpa B, et al. von Hippel–Lindau disease: an evaluation of natural history and functional disability. Neuro Oncol 2016;18:1011-20.
8. Asthagiri AR, Mehta GU, Zach L, et al. Prospective evaluation of radiosurgery for hemangioblastomas in von Hippel–Lindau disease. Neuro Oncol 2010;12:80-6.
9. Duffey BG, Choyke PL, Glenn G, et al. The relationship between renal tumor size and metastases in patients with von Hippel–Lindau disease. J Urol 2004;172:63-5.
10. Jilg CA, Neumann HP, Gläsker S, et al. Nephron sparing surgery in von Hippel–Lindau associated renal cell carcinoma; clinicopathological long-term follow-up. Fam Cancer 2012;11:387-94.
11. United States Food and Drug Administration. FDA approves belzutifan for cancers associated with von Hippel–Lindau disease. Available from: https://www.fda.gov/drugs/resources-information-approved-drugs/fda-approves-belzutifan-cancers-associated-von-hippel-lindau-disease. Accessed 6 Aug 2025.
12. Jonasch E, Iliopoulos O, Kimryn Rathmell W, et al. LITESPARK-004 (MK-6482-004) phase 2 study of belzutifan, an oral hypoxia-inducible factor 2α inhibitor (HIF-2α), for von Hippel–Lindau (VHL) disease: update with more than two years of follow-up data. J Clin Oncol 2022;40 (Suppl):4546.
13. Jonasch E, Song Y, Freimark J, et al. Epidemiology and economic burden of von Hippel–Lindau disease–associated renal cell carcinoma in the United States. Clin Genitourin Cancer 2023;21:238-47.
14. Jonasch E, Song Y, Freimark J, et al. Epidemiology and economic burden of von Hippel–Lindau disease–associated central nervous system hemangioblastomas and pancreatic neuroendocrine tumors in the United States. Orphanet J Rare Dis 2024;19:73.
15. Belzutifan (Welireg): CADTH Reimbursement Review: Therapeutic area: von Hippel–Lindau disease–associated tumours [Internet]. Ottawa (ON): Canadian Agency for Drugs and Technologies in Health; 2023 Nov. Pharmacoeconomic Review. Available from: https://www.ncbi.nlm.nih.gov/books/NBK599999/. Accessed 10 Oct 2024.
16. Samaritan Fund. Items supported by the Samaritan Fund. Available from: https://www.ha.org.hk/haho/ho/sf/SF_Items_en.pdf. Accessed 18 Aug 2025.
17. Rare Disease Hong Kong. von Hippel–Lindau Disease. About Rare Diseases Rare Disease Wiki. Available from: https://rdhk.org/post/data?mid=15&id=13471&lang=en. Accessed 10 Oct 2024.
18. Census and Statistics Department, Hong Kong SAR Government. Year-end Population for 2023 [20 Feb 2024]. Available from: https://www.censtatd.gov.hk/en/press_release_detail.html?id=5386. Accessed 10 Oct 2024.

Machine learning model for prediction of coronavirus disease 2019 within 6 months after three doses of BNT162b2 in Hong Kong

Hong Kong Med J 2025 Aug;31(4):296–304 | Epub 23 Jun 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
Jing Tong Tan, BSc1; Ruiqi Zhang, PhD1; KH Chan, PhD2; Jian Qin, PhD3; Ivan FN Hung, MD1; KS Cheung, MD, MPH1,4
1 Department of Medicine, School of Clinical Medicine, The University of Hong Kong, Queen Mary Hospital, Hong Kong SAR, China
2 Department of Microbiology, School of Clinical Medicine, The University of Hong Kong, Queen Mary Hospital, Hong Kong SAR, China
3 Department of Medicine, Yulin Traditional Chinese Medicine Hospital, Guangxi, China
4 Department of Medicine, The University of Hong Kong–Shenzhen Hospital, Shenzhen, China
 
Corresponding authors: Prof Ivan FN Hung (ivanhung@hku.hk); Prof KS Cheung (cks634@hku.hk)
 
 
Abstract
Introduction: We aimed to develop a machine learning (ML) model to predict the risk of coronavirus disease 2019 (COVID-19) among three-dose BNT162b2 vaccine recipients in Hong Kong.
 
Methods: A total of 304 individuals who had received three doses of BNT162b2 were recruited from three vaccination centres in Hong Kong between May and August 2021. The dataset was randomly divided into training (n=184) and testing (n=120) sets in a 6:4 ratio. Demographics, co-morbidities and medications, blood tests (complete blood count, liver and renal function tests, glycated haemoglobin level, lipid profile, and presence of hepatitis B surface antigen), and controlled attenuation parameter (CAP) were used to develop six ML models (logistic regression, linear discriminant analysis, random forest, naïve Bayes, neural network [NN], and extreme gradient boosting models) to predict COVID-19 risk. Model performance was assessed using area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV).
 
Results: Among the study population (median age: 50.9 years [interquartile range=43.6-57.8]; men: 30.9% [n=94]), 27 participants (8.9%) developed COVID-19 within 6 months. Fifteen clinical variables were used to train the models. The NN model achieved the best performance, with an AUC of 0.74 (95% confidence interval [95% CI]=0.60-0.88). Using the optimal cut-off value based on the maximised Youden index, sensitivity, specificity, PPV, and NPV were 90% (95% CI=55%-100%), 58% (95% CI=48%-68%), 16% (95% CI=8%-29%), and 98% (95% CI=92%-100%), respectively. The top predictors in the NN model included age, prediabetes/diabetes, CAP, alanine aminotransferase level, and aspartate aminotransferase level.
 
Conclusion: An NN model integrating 15 clinical variables effectively identified individuals at low risk of COVID-19 following three doses of BNT162b2.
 
 
New knowledge added by this study
  • A neural network model is a useful tool that effectively predicts coronavirus disease 2019 (COVID-19) risk in individuals who have received three doses of the BNT162b2 vaccine.
  • Metabolic risk factors, including prediabetes/diabetes, non-alcoholic fatty liver disease, and steatohepatitis, play key roles in vaccine immunogenicity.
Implications for clinical practice or policy
  • Clinicians can use the model to identify high-risk patients for booster doses and preventive strategies.
  • Our findings can guide targeted educational campaigns and resource allocation by identifying demographic and clinical factors associated with higher COVID-19 risk despite vaccination.
  • The identification of key variables such as age, prediabetes/diabetes, and liver enzyme levels can prompt further studies to understand the underlying mechanisms and to develop more effective interventions.
 
 
Introduction
The severe acute respiratory syndrome coronavirus 2 pandemic has been a global health crisis, resulting in substantial morbidity and mortality worldwide, with over 13 billion vaccine doses administered.1 To mitigate the risk of breakthrough infections by dominant Omicron variants, a third-dose booster following two doses of BNT162b2 vaccine (BioNTech-Pfizer, Mainz, Germany) has been rolled out. Compared with a two-dose schedule, a third dose significantly reduces the risk of infection, hospitalisation, and severe disease.2 3 However, waning anti-Omicron neutralising antibody and T cell responses have been reported even after the booster dose,4 and sustained long-term immunogenicity remains uncertain.
 
Advanced machine learning (ML) algorithms, such as random forest, artificial neural network (NN), and gradient boosting, have been increasingly utilised to develop prognostic models that can identify individuals at high risk of coronavirus disease 2019 (COVID-19). These models offer potential to improve risk stratification and inform targeted prevention and intervention strategies. Numerous studies have demonstrated the development of such models, which integrate various clinical, demographic, and routine laboratory variables to predict risks of COVID-19, hospitalisation, and mortality.5 6 7 8 9 However, these previous studies did not stratify patients by vaccination status, leading to heterogeneous cohorts of both vaccinated and unvaccinated individuals. This may introduce limitations and biases in model performance, given that vaccination status can substantially affect COVID-19 risk and disease severity.10 11
 
This study focused on individuals who had received three doses of BNT162b2, aiming to identify the ML algorithm with optimal performance for predicting COVID-19 risk using clinically available data. We also sought to identify key predictors used by the model to stratify individuals who may be more susceptible to COVID-19 despite vaccination.
 
Methods
Study design and study population
This multi-centre, prospective cohort study recruited individuals aged 18 years or above who had received three doses of BNT162b2 vaccine from three vaccination centres in Hong Kong, namely, Sun Yat Sen Memorial Park Sports Centre, Queen Mary Hospital, and Sai Ying Pun Jockey Club Polyclinic, between May and August 2021. Participants volunteered for the study after being informed through flyers and announcements at the vaccination sites. All participants were screened by a trained research assistant using a checklist form (online Appendix) to confirm the absence of both active COVID-19 and a history of the disease. Exclusion criteria included prior COVID-19 infection identified through serological testing for antibodies to the nucleocapsid protein of severe acute respiratory syndrome coronavirus 2, gastrointestinal surgery, inflammatory bowel disease, immunocompromised status (including post-transplantation, use of immunosuppressants, or receipt of chemotherapy), other medical conditions (malignancy, haematological, rheumatological or autoimmune diseases), and fewer than 14 days between the booster dose and either the study endpoint or the date of COVID-19 diagnosis.
 
Demographic and clinical information—including age, sex, body mass index (BMI), waist-to-hip ratio, smoking status, alcohol use, co-morbidities (hypertension, diabetes mellitus, and prediabetes), and recent medication use within 6 months of vaccination (proton pump inhibitors, statins, metformin, antibiotics,12 antidepressants, steroids, probiotics or prebiotics)—was collected. Additional data included blood pressure; blood test results (complete blood count, liver and renal function tests,13 glycated haemoglobin [HbA1c] level, lipid profile, and presence of hepatitis B surface antigen); controlled attenuation parameter (CAP) to measure liver fat14; and liver stiffness measured by transient elastography15 using FibroScan (Echosens, Paris, France). We also cross-checked the Hospital Authority’s database (eg, Clinical Management System) to verify participants’ co-morbidity conditions.
 
The primary outcome was COVID-19. All participants were prospectively followed from the date of their third vaccine dose until either a COVID-19 diagnosis or the end of the study (18 May 2022), whichever occurred first. Monthly follow-ups were conducted via phone calls or messages to inquire about participants’ COVID-19 status, especially during the fifth COVID-19 outbreak in Hong Kong in early 2022,16 when face-to-face meetings were not recommended. Participants were also instructed to notify the study team if they tested positive. COVID-19 diagnosis was based on self-reported symptoms followed by either a rapid antigen test or deep throat saliva reverse transcription polymerase chain reaction test.
 
Model development
This was a binary classification task using supervised learning algorithms, aiming to predict COVID-19 status after three vaccine doses. Predicted outcomes were labelled as ‘0’ (negative) or ‘1’ (positive). The dataset was randomly divided into training and testing sets in a 6:4 ratio.
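A random 6:4 partition of this kind can be sketched in Python as below. The function name and seed are hypothetical. A plain 60% cut of 304 participants gives 182 training records rather than the 184 reported in the Abstract, so the authors' partitioning presumably differed in detail (for example, stratification by outcome); that explanation is an assumption, not something stated in the text.

```python
import random


def split_indices(n_records: int, train_fraction: float = 0.6, seed: int = 0):
    """Shuffle record indices and cut them into training and held-out sets."""
    indices = list(range(n_records))
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(indices)
    n_train = round(n_records * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, held_out_idx = split_indices(304)
print(len(train_idx), len(held_out_idx))  # 182 122
```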
 
Data preprocessing included three steps: missing data imputation, feature engineering, and data transformation. First, variables with more than 20% missing data were dropped because high levels of missingness can hinder the accuracy and reliability of imputation methods.17 18 Remaining missing values were imputed using the MICE (Multivariate Imputation by Chained Equations) package in R software (version 4.2.1, R Foundation for Statistical Computing, Vienna, Austria). Second, new features were extracted from existing variables (ie, transforming numerical variables into categorical groups and combining similar variables). Third, continuous variables were standardised through centring and scaling, whereas categorical variables were processed using one-hot encoding to ensure data compatibility for different ML algorithms.
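The missingness filter and the transformation step can be illustrated with a short Python sketch. The study itself used R; the helper names and example values below are hypothetical, the use of the sample standard deviation for scaling is an assumption, and the chained-equations imputation is not reproduced here.

```python
from statistics import mean, stdev


def keep_variable(values, max_missing=0.20):
    """Keep a variable unless more than 20% of its values are missing (None)."""
    return values.count(None) / len(values) <= max_missing


def standardise(values):
    """Centre and scale a continuous variable (sample standard deviation)."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        raise ValueError("cannot scale a constant variable")
    return [(value - centre) / spread for value in values]


def one_hot(labels):
    """Return the sorted category names and one 0/1 indicator row per label."""
    categories = sorted(set(labels))
    rows = [[1 if label == category else 0 for category in categories]
            for label in labels]
    return categories, rows


# Hypothetical values, not study data.
print(keep_variable([1.2, None, 0.9, 1.1, 1.0]))  # True (exactly 20% missing)
print(standardise([40, 50, 60]))  # [-1.0, 0.0, 1.0]
print(one_hot(["never", "current", "never"]))
# (['current', 'never'], [[0, 1], [1, 0], [0, 1]])
```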
 
Feature selection involved correlation analysis between variables and the dependent variable, the Boruta package in R,19 literature review, and expert consultation. A total of 37 variables were selected and ranked based on their overall importance using the aforementioned methods. Male sex, age ≥60 years, hepatitis B virus surface antigen positivity, diabetes/prediabetes, and recent medication use (antibiotics, proton pump inhibitors, probiotics/prebiotics, metformin, statins) were regarded as categorical variables (online supplementary Table 1).
 
Six frequently used supervised ML models were selected: logistic regression, linear discriminant analysis, random forest, naïve Bayes, NN, and extreme gradient boosting (XGBoost) [online supplementary Table 2]. Due to the imbalance in the dataset, with relatively few COVID-19 cases, multiple models were explored to assess different strategies for handling class imbalance. Hyperparameter tuning was performed using the caret package in R with grid search (grid size of 3^p, where p represents the number of hyperparameters) and three-fold cross-validation. The dataset was divided into three equal subsets: the model was trained on two subsets and validated on the third; the process was repeated three times, with the validation subset rotated each time. Hyperparameters yielding the highest area under the receiver operating characteristic curve (AUC) on the validation set were selected. A loop function was implemented to iteratively train the model while removing a single variable from the end of the ranked list of variables. By evaluating model performance with different variable combinations, we identified the most predictive variables.
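The size of the search grid and the rotation of validation subsets can be illustrated with a short standard-library Python sketch (the study used the caret package in R, which is not reproduced here). The hyperparameter names and candidate values are hypothetical, and the folds are built without shuffling or stratification for brevity.

```python
from itertools import product

def build_grid(candidates):
    """All combinations of candidate values; 3 values for each of p hyperparameters gives 3**p rows."""
    names = list(candidates)
    return [dict(zip(names, combo))
            for combo in product(*(candidates[n] for n in names))]

def k_fold_indices(n, k=3):
    """Yield (train_idx, val_idx) pairs so that each fold is held out exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for held_out in range(k):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, folds[held_out]

# Hypothetical candidate values for p = 2 hyperparameters.
candidates = {"max_depth": [2, 4, 6], "eta": [0.01, 0.1, 0.3]}
grid = build_grid(candidates)             # 3**2 = 9 combinations
splits = list(k_fold_indices(184, k=3))   # 184 = size of the training set
print(len(grid), len(splits))             # prints: 9 3
```

Each grid row would be fitted once per split, and the row with the highest mean validation AUC retained.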
 
A sensitivity analysis was conducted by excluding variables not routinely available in clinical practice (eg, CAP and liver stiffness).
 
Evaluation and comparison of model performance
To compare the performance of the ML models, we calculated AUCs and used DeLong’s test to assess statistical significance among the AUCs. We estimated the best cut-off point for each model using the Youden index, selecting the threshold that maximised the sum of sensitivity and specificity. Using these cut-off points, we calculated performance metrics including sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), positive likelihood ratio (PLR), and negative likelihood ratio (NLR) to identify the best model. We also compared the miss rate (false negative rate) across models. Given the imbalanced nature of the dataset, precision-recall curves and F1 scores were used. Higher F1 scores indicate better balance between precision and recall.
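Cut-off selection by the Youden index can be sketched as below in standard-library Python. The index is J = sensitivity + specificity − 1, so maximising J is equivalent to maximising the sum of sensitivity and specificity. The predicted scores and labels are invented toy values, and the function name `youden_cutoff` is hypothetical; when several thresholds tie, this sketch keeps the lowest.

```python
def youden_cutoff(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1."""
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_j = None, -1.0
    for t in sorted(set(scores)):
        # A case is predicted positive when its score is at or above the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / positives + tn / negatives - 1
        if j > best_j:  # strict inequality keeps the lowest threshold on ties
            best_threshold, best_j = t, j
    return best_threshold, best_j

# Toy predicted probabilities and true outcomes (1 = COVID-19 positive).
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, j = youden_cutoff(scores, labels)
print(threshold, j)  # prints: 0.4 0.75
```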
 
All statistical analyses were conducted using R, with packages such as caret, randomForest, naivebayes, nnet, xgboost, pROC, and SHAPforxgboost for model building, evaluation, and interpretation. The DTComPair package was used to compare performance metrics.20 Continuous variables were summarised as medians and interquartile ranges (IQRs), with comparisons performed using the Wilcoxon rank-sum test. Categorical variables were presented as counts and percentages, and compared using Pearson’s chi-squared test or Fisher’s exact test, applying Bonferroni correction for multiple comparisons. SHapley Additive exPlanations (SHAP) analysis was utilised to interpret complex models by generating SHAP values to determine feature impact.
 
Results
Patient characteristics
A total of 304 three-dose BNT162b2 recipients were identified between May and August 2021 (Fig 1a). The median age was 50.9 years (IQR=43.6-57.8), and 94 participants (30.9%) were men. Over a median follow-up of 2.6 months (IQR=1.8-3.1; up to 5.1 months), 27 participants (8.9%) tested positive for COVID-19. The dataset was randomly split into training and testing sets, comprising 184 (60.5%) and 120 (39.5%) participants, respectively. Table 1 summarises baseline characteristics, stratified by the outcome of interest (COVID-19 status) and by training and testing sets. Baseline characteristics prior to imputation are shown in online supplementary Table 3.
 

Figure 1. Machine learning model development. (a) Participant selection process. (b) Model development and validation on the testing set
 

Table 1. Baseline participant characteristics based on the status of coronavirus disease 2019 and train-test dataset (n=304)
 
The COVID-19–positive participants had worse medical conditions than those who tested negative. Specifically, they were older (with a higher proportion aged ≥60 years: 22.2% vs 16.2%), more often male (33.3% vs 30.7%), had greater liver fat content (median CAP: 249.0 dB/m vs 227.0 dB/m), and were more frequently diagnosed with prediabetes/diabetes (55.6% vs 38.3%). Both the training and testing sets had a comparable proportion of COVID-19–positive cases (8.3%-9.2%). Most independent variables were similarly distributed between sets (P>0.05), although CAP differed significantly (Table 1).
 
Performance of different machine learning models
We trained six different ML algorithms on the training set to predict COVID-19. Model performance was evaluated using three-fold cross-validation (Fig 1b). Concerning the testing set, performance metrics for each model are reported in Table 2 and AUCs are summarised in Figure 2. A comparison of AUCs between training and testing sets for the six ML models is presented in online supplementary Figure 1. All models showed a slight decrease in AUC from the training to the testing set, indicating some degree of overfitting. Notably, the NN model did not exhibit a significant AUC reduction, suggesting it was less susceptible to overfitting than other models.
 

Table 2. Performance metrics of different machine learning models
 
Of the six ML models evaluated, the NN algorithm performed best (AUC: 0.74, 95% CI=0.60-0.88), followed by XGBoost (AUC: 0.62, 95% CI=0.42-0.82) [Fig 2]. Using the optimal cut-off value estimated by the maximum Youden index, performance metrics are summarised in Table 2. The 2×2 confusion matrix tables, which summarise the numbers of true positives, true negatives, false positives, and false negatives for each model’s predictions, are shown in online supplementary Table 4. Multiple comparisons between the NN and other models in terms of performance metrics are presented in online supplementary Table 5.
 

Figure 2. Receiver operating characteristic curves of different machine learning models using the testing set
 
The NN and linear discriminant analysis models achieved the highest sensitivity, with values of 90% (95% CI=55%-100%) and 80% (95% CI=44%-97%), respectively. The random forest model had the best specificity (72%, 95% CI=62%-80%). The NN model also had the highest NPV (98%, 95% CI=92%-100%) and the best likelihood ratios (PLR: 2.15, 95% CI=1.59-2.91; NLR: 0.17, 95% CI=0.03-1.11) [Table 2]. It classified 45.8% of participants as high risk for COVID-19, with a miss rate or false negative rate of 10% (Table 3). Precision-recall curves and F1 scores for all models are shown in online supplementary Figure 2, offering a more precise evaluation of model performance in the context of an imbalanced dataset. With a precision baseline of 0.092, naïve Bayes and random forest models recorded AUC values of around 0.10, reflecting modest discrimination ability under class imbalance. The NN model achieved an F1 score of 0.277, highlighting a better balance between precision and recall.
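The relationships among these metrics can be checked with a short standard-library Python sketch. The 2×2 counts used here (9 true positives, 46 false positives, 1 false negative, and 64 true negatives among 120 participants) are an assumption: they were inferred as one set of counts consistent with the reported NN figures and were not taken from online supplementary Table 4, which remains the authoritative source. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy metrics from a 2x2 confusion matrix."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "plr": sens / (1 - spec),          # positive likelihood ratio
        "nlr": (1 - sens) / spec,          # negative likelihood ratio
        "miss_rate": fn / (tp + fn),       # false negative rate
        "predicted_positive": (tp + fp) / (tp + fp + fn + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Assumed counts, inferred from the reported metrics (not from the supplement).
m = diagnostic_metrics(tp=9, fp=46, fn=1, tn=64)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, the sketch gives a sensitivity of 0.90, NPV of 0.98, PLR of 2.15, NLR of 0.17, a predicted-positive proportion of 0.458, and an F1 score of 0.277, in line with the values reported above. It also shows why the F1 score stays low despite high sensitivity: with few true cases, false positives dominate the predicted-positive group.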
 

Table 3. Number and proportion of predicted positive cases of coronavirus disease 2019 and miss rates or false negative rates by different machine learning models (n=120)
 
Crucial risk factors associated with coronavirus disease 2019 in the neural network model
According to the best-performing model (the NN model), the five most important predictors of COVID-19 risk were CAP, alanine aminotransferase level, age (≥60 years), presence of prediabetes/diabetes, and aspartate aminotransferase (AST) level, with relative importance values of 14.9%, 10.1%, 9.4%, 8.4%, and 7.9%, respectively (Fig 3). These were further confirmed by SHAP analysis, a method specifically compatible with ensemble algorithms (ie, XGBoost) that quantifies the contribution of each input variable to the model’s prediction. When SHAP analysis was applied to the second best-performing model (XGBoost), leading variables remained similar to those in the NN model, except for BMI which ranked highest in importance (with a mean absolute SHAP value of 0.992) in the XGBoost model (online supplementary Fig 3). The SHAP analysis in online supplementary Figure 3b also provided deeper insights into the contribution of each variable to the model’s prediction. Among leading variables in the XGBoost model, higher CAP (red dots), lower BMI (blue dots), and age ≥60 years (red dots) had a positive impact (right side of the plot) on COVID-19 prediction. In terms of high-density lipoprotein (HDL) and AST levels, the SHAP plot showed a wide distribution with mixed colours, suggesting that HDL and AST levels had diverse impacts on COVID-19 prediction.
 

Figure 3. Relative importance of risk factors in predicting the risk of coronavirus disease 2019 by the neural network model
 
Sensitivity analysis excluding non-routine clinical variables
Excluding CAP and liver stiffness, XGBoost achieved the best performance (AUC: 0.66, 95% CI=0.50-0.82), followed by naïve Bayes, logistic regression, linear discriminant analysis, random forest, and NN models (AUCs: 0.49-0.63) [online supplementary Fig 4]. The top five predictors in the XGBoost model were BMI, alanine aminotransferase level, HDL level, HbA1c level, and age ≥60 years (online supplementary Fig 5). In the NN model, the top predictors were AST level, HbA1c level, HDL level, hepatitis B virus antigen positivity, and alanine aminotransferase level (online supplementary Fig 6).
 
Discussion
In this study involving three-dose BNT162b2 recipients, the NN model achieved satisfactory performance in predicting COVID-19 using baseline clinical data. The leading predictors identified were age ≥60 years, presence of prediabetes/diabetes, CAP, alanine aminotransferase level, and AST level, highlighting the need for vigilance among fully vaccinated individuals, especially those with concomitant co-morbidities.
 
Advanced age, prediabetes/diabetes, and abnormal liver condition (ie, high fatty liver content and abnormal liver function test results) were significant predictors of high infection risk, consistent with previous studies.21 22 23 24 25 A meta-analysis of 18 studies revealed a higher prevalence of diabetes (11.5%) among hospitalised COVID-19 patients21 compared to the general population (9.3%).26 Studies have found that the presence of preexisting diabetes or hyperglycaemia is associated with higher risks of severe illness, mortality, and complications in COVID-19 patients.22 23 This elevated risk is likely due to impaired immune function, chronic inflammation, and common cardiovascular and metabolic co-morbidities in diabetic patients.27 28 Individuals with liver diseases or abnormal liver function test results also exhibit higher risks of severe COVID-19 and complications.24 25
 
This study is among the few that have developed ML models to predict COVID-19 in recipients of three doses of BNT162b2. No prior studies have developed COVID-19 prognostic models with clear information on vaccination status, type, and number of doses. A study from Hong Kong11 showed that a timely third vaccine dose strongly protected against Omicron BA.2 variant infections, the dominant strain in Hong Kong during our study period. The effectiveness of vaccination against infection declined over time after two doses but was restored to a high level after a third dose, resulting in significantly lower risks of infection, hospitalisation, and severe illness compared with those who received only two doses.2 3 By including only three-dose vaccinated patients in the development of ML models, the resulting models may be more accurate in predicting COVID-19 risk and severity among vaccinated individuals. This can be particularly important in settings where vaccination rates are high and breakthrough infections are a concern; it may help identify individuals with higher infection risk who could benefit from additional precautions or interventions.
 
Strengths and limitations
Our study offers practical value by enabling risk stratification, allowing healthcare providers to focus resources on higher-risk populations. It informs public health strategies by identifying factors associated with increased COVID-19 risk despite vaccination, guiding targeted campaigns and resource allocation. Additionally, an understanding of risk predictors in vaccinated individuals supports tailored booster strategies. The identification of key variables such as age, prediabetes/diabetes, and liver enzyme levels also encourages further research into underlying mechanisms and potential interventions.
 
However, this study had some limitations. First, the small sample size (~300 participants) may affect model performance and generalisability. The dataset size was constrained by specific inclusion criteria, but this represented the maximum size available for model training. We believe that selection of high-quality data maximises training efficacy. Second, we did not include gut microbiota data, which may be associated with COVID-19 vaccine immunogenicity.29 A focus on readily available clinical data facilitates practical and clinically relevant predictive models. Third, our dataset exhibited significant class imbalance, such that only 8.9% of participants developed COVID-19 within 6 months. Whereas receiver operating characteristic curve analysis provides an optimistic assessment, we also used precision-recall curves and F1 scores for a more realistic evaluation. Fourth, although missing values for certain variables might introduce error into the prediction models, the small percentage of missing data and the use of multiple imputation likely had minimal impact on model accuracy. Fifth, COVID-19 cases were self-reported and confirmed by either rapid antigen or polymerase chain reaction tests. In Hong Kong, rapid antigen tests have a false negative rate of approximately 15% (sensitivity: 85%)30 but a high specificity of 99.93%,30 indicating very few false positives. Although some cases may have gone unreported or untested, we believe that the majority adhered to testing requirements as mandated by law. Additionally, we did not grade infection severity, and there were no hospitalised cases in our cohort, limiting our ability to predict hospitalisation outcomes in this study. Sixth, the NN model—our best-performing model—is complex and has low interpretability. We used a variable importance plot to visualise and identify the most influential features, enhancing its practical application. 
It should be noted that the other models demonstrated suboptimal performance, with AUCs below 0.7. The NN model’s superior performance is likely due to its ability to capture complex patterns and interactions. Simpler models struggled with the dataset’s complexity, class imbalance, non-linear relationships, and outliers. Finally, although this study offers insights into the use of advanced ML models to predict COVID-19 outcomes, its generalisability is limited. Overfitting remains a concern despite mitigation techniques (eg, regularisation, pruning, and ensemble methods). The complexity of our models and the dataset hinder generalisability. Variability in vaccines, booster intervals, doses, demographics, and study design further impacts the generalisability of our model. Future studies should include diverse populations and vaccine types to enhance applicability. External validation of our results in other centres is also warranted.
 
Conclusion
The NN model is a useful tool for identifying individuals at high risk of COVID-19 within 6 months after receiving three doses of BNT162b2. Key features selected by the model highlight the central role of metabolic risk factors (prediabetes/diabetes, non-alcoholic fatty liver disease, and steatohepatitis) in vaccine immunogenicity.
 
Author contributions
Concept or design: JT Tan, KS Cheung.
Acquisition of data: JT Tan, R Zhang, KH Chan.
Analysis or interpretation of data: JT Tan, KS Cheung.
Drafting of the manuscript: JT Tan.
Critical revision of the manuscript for important intellectual content: KS Cheung, IFN Hung.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
All authors have disclosed no conflicts of interest.
 
Funding/support
This research was funded by the Health and Medical Research Fund of the former Food and Health Bureau, Hong Kong SAR Government (Ref No.: COVID1903010, Project 16). The funder had no role in the study design, data collection/analysis/interpretation, or manuscript preparation.
 
Ethics approval
The research was approved by the Institutional Review Board of The University of Hong Kong/Hospital Authority Hong Kong West Cluster, Hong Kong (Ref No.: UW 21-216). Participants provided written informed consent to participate in this study.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. World Health Organization. WHO Coronavirus (COVID-19) Dashboard. 2023. Available from: https://data.who.int/dashboards/covid19/cases. Accessed 4 May 2023.
2. Andrews N, Stowe J, Kirsebom F, et al. Effectiveness of COVID-19 booster vaccines against COVID-19–related symptoms, hospitalization and death in England. Nat Med 2022;28:831-37.
3. Andrews N, Stowe J, Kirsebom F, et al. COVID-19 vaccine effectiveness against the Omicron (B.1.1.529) variant. N Engl J Med 2022;386:1532-46.
4. Peng Q, Zhou R, Wang Y, et al. Waning immune responses against SARS-CoV-2 variants of concern among vaccinees in Hong Kong. EBioMedicine 2022;77:103904.
5. Willette AA, Willette SA, Wang Q, et al. Using machine learning to predict COVID-19 infection and severity risk among 4510 aged adults: a UK Biobank cohort study. Sci Rep 2022;12:7736.
6. Wynants L, Van Calster B, Collins GS, et al. Prediction models for diagnosis and prognosis of COVID-19: systematic review and critical appraisal. BMJ 2020;369:m1328.
7. Subudhi S, Verma A, Patel AB, et al. Comparing machine learning algorithms for predicting ICU admission and mortality in COVID-19. NPJ Digit Med 2021;4:87.
8. Brinati D, Campagner A, Ferrari D, Locatelli M, Banfi G, Cabitza F. Detection of COVID-19 infection from routine blood exams with machine learning: a feasibility study. J Med Syst 2020;44:135.
9. Yao H, Zhang N, Zhang R, et al. Severity detection for the coronavirus disease 2019 (COVID-19) patients using a machine learning model based on the blood and urine tests. Front Cell Dev Biol 2020;8:683.
10. Baden LR, El Sahly HM, Essink B, et al. Efficacy and safety of the mRNA-1273 SARS-CoV-2 vaccine. N Engl J Med 2021;384:403-16.
11. Zhou R, Liu N, Li X, et al. Three-dose vaccination–induced immune responses protect against SARS-CoV-2 Omicron BA.2: a population-based study in Hong Kong. Lancet Reg Health West Pac 2023;32:100660.
12. Cheung KS, Lam LK, Zhang R, et al. Association between recent usage of antibiotics and immunogenicity within six months after COVID-19 vaccination. Vaccines (Basel) 2022;10:1122.
13. Cheung KS, Mok CH, Mao X, et al. COVID-19 vaccine immunogenicity among chronic liver disease patients and liver transplant recipients: a meta-analysis. Clin Mol Hepatol 2022;28:890-911.
14. Cheung KS, Lam LK, Hui RW, et al. Effect of moderate-to-severe hepatic steatosis on neutralising antibody response among BNT162b2 and CoronaVac recipients. Clin Mol Hepatol 2022;28:553-64.
15. Cheung KS, Lam LK, Mao X, et al. Effect of moderate to severe hepatic steatosis on vaccine immunogenicity against wild-type and mutant virus and COVID-19 infection among BNT162b2 recipients. Vaccines (Basel) 2023;11:497.
16. Cheung PH, Chan CP, Jin DY. Lessons learned from the fifth wave of COVID-19 in Hong Kong in early 2022. Emerg Microbes Infect 2022;11:1072-8.
17. Little RJ, Rubin DB. Statistical Analysis with Missing Data, 3rd edition. New York [NY]: John Wiley & Sons; 2019.
18. Dong Y, Peng CY. Principled missing data methods for researchers. Springerplus 2013;2:222.
19. Kursa MB, Rudnicki WR. Feature selection with the Boruta package. J Stat Softw 2010;36:1-13.
20. Kitcharanant N, Chotiyarnwong P, Tanphiriyakun T, et al. Development and internal validation of a machine-learning–developed model for predicting 1-year mortality after fragility hip fracture. BMC Geriatr 2022;22:451.
21. Singh AK, Gillies CL, Singh R, et al. Prevalence of co-morbidities and their association with mortality in patients with COVID-19: a systematic review and meta-analysis. Diabetes Obes Metab 2020;22:1915-24.
22. Zhu L, She ZG, Cheng X, et al. Association of blood glucose control and outcomes in patients with COVID-19 and pre-existing type 2 diabetes. Cell Metab 2020;31:1068-77.e3.
23. Yang JK, Feng Y, Yuan MY, et al. Plasma glucose levels and diabetes are independent predictors for mortality and morbidity in patients with SARS. Diabet Med 2006;23:623-8.
24. Singh S, Khan A. Clinical characteristics and outcomes of coronavirus disease 2019 among patients with preexisting liver disease in the United States: a multicenter research network study. Gastroenterology 2020;159:768-771.e3.
25. Simon TG, Hagström H, Sharma R, et al. Risk of severe COVID-19 and mortality in patients with established chronic liver disease: a nationwide matched cohort study. BMC Gastroenterol 2021;21:439.
26. Saeedi P, Petersohn I, Salpea P, et al. Global and regional diabetes prevalence estimates for 2019 and projections for 2030 and 2045: results from the International Diabetes Federation Diabetes Atlas, 9th edition. Diabetes Res Clin Pract 2019;157:107843.
27. Pal R, Bhadada SK. COVID-19 and diabetes mellitus: an unholy interaction of two pandemics. Diabetes Metab Syndr 2020;14:513-7.
28. Azar WS, Njeim R, Fares AH, et al. COVID-19 and diabetes mellitus: how one pandemic worsens the other. Rev Endocr Metab Disord 2020;21:451-63.
29. Ng HY, Leung WK, Cheung KS. Association between gut microbiota and SARS-CoV-2 infection and vaccine immunogenicity. Microorganisms 2023;11:452.
30. Zee JS, Chan CT, Leung AC, et al. Rapid antigen test during a COVID-19 outbreak in a private hospital in Hong Kong. Hong Kong Med J 2022;28:300-5.

Spectrum of inherited eye disorders at Hong Kong Children’s Hospital: insights into the local genetic landscape and experience with ocular genetic services

Hong Kong Med J 2025 Aug;31(4):287–95 | Epub 4 Aug 2025
© Hong Kong Academy of Medicine. CC BY-NC-ND 4.0
 
ORIGINAL ARTICLE
Shirley SW Cheng, MB, ChB, FHKAM (Paediatrics)1; Stephanie Cheung, FCOphthHK, FHKAM (Ophthalmology)2,3; TC Ko, FRCS (Edin), FCOphth3,4,5; TL Lee, MB, BS, FHKAM (Paediatrics)6; Jason Yam, FCOphthHK, FHKAM (Ophthalmology)2,3,7; HM Luk, MD, FHKAM (Paediatrics)1
1 Department of Clinical Genetics, Hong Kong Children’s Hospital, Hong Kong SAR, China
2 Clinical Services Department, Hong Kong Eye Hospital, Hong Kong SAR, China
3 Department of Ophthalmology, Hong Kong Children’s Hospital, Hong Kong SAR, China
4 Department of Ophthalmology, Tung Wah Eastern Hospital, Hong Kong SAR, China
5 Department of Ophthalmology, Pamela Youde Nethersole Eastern Hospital, Hong Kong SAR, China
6 Hospital Chief Executive Office, Hong Kong Children’s Hospital, Hong Kong SAR, China
7 Department of Ophthalmology and Visual Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
 
Corresponding authors: Dr Jason Yam (yamcheuksing@cuhk.edu.hk); Dr HM Luk (lukhm@ha.org.hk)
 
Abstract
Introduction: Inherited eye disorders (IEDs) are a leading cause of visual impairment. However, local data and information about the genetic landscape of IEDs in Hong Kong remain limited. This study aimed to examine the diagnostic yield, mutational spectrum, and clinical utility of genomic testing in patients with IEDs at a major local centre.
 
Methods: This retrospective observational study included 130 patients with suspected IEDs who attended the genetic counselling clinic at the Department of Clinical Genetics of the Hong Kong Children’s Hospital between December 2021 and October 2023. Analyses were conducted on the spectrum of ocular genetic disorders, genetic variants, diagnostic yields, and clinical utility of genomic testing.
 
Results: The overall diagnostic yield of genomic testing was 51.5%. Inherited retinal disorders accounted for approximately 60% of positive results. Patients with syndromic features and a positive family history were significantly more likely to receive a molecular diagnosis (P<0.05). Clinical utility of genomic testing was observed in over 70% of patients with positive results. With genetic counselling, a confirmed molecular diagnosis contributed to disease prognostication, avoided unnecessary investigations, guided clinical management, and facilitated reproductive planning and family cascade screening.
 
Conclusion: There is a growing demand for the application of genomic medicine in patients with IEDs. Genetic testing is widely accepted and demonstrates high diagnostic and clinical utilities. The multidisciplinary team clinic service model is the global trend for integrating genomic testing into routine care. Hong Kong Children’s Hospital is implementing this model to meet the evolving needs of this patient population.
 
 
New knowledge added by this study
  • The local diagnostic yield of genomic testing in patients with inherited eye disorders (IEDs) is 51.5%.
  • Molecular confirmation of IEDs in more than 70% of patients demonstrated the clinical utility of genomic testing.
Implications for clinical practice or policy
  • Incorporation of genetic testing into routine IED workup is imperative.
  • Implementation of a multidisciplinary team or combined clinic model—including ophthalmologists, geneticists, genetic counsellors, optometrists, and nurses—enables personalised and timely management of IED patients.
 
 
Introduction
According to World Health Organization estimates, approximately 19 million children under the age of 15 years are visually impaired, with 1.4 million exhibiting irreversible impairment.1 Among cases of severe visual impairment diagnosed before the age of 1 year, around one-third are attributable to genetic causes.2
 
Substantial proportions of childhood and adult-onset visual impairments are caused by inherited eye disorders (IEDs), which include anterior segment dysgenesis; inherited retinal disorders (IRDs); microphthalmia, anophthalmia, and coloboma; ocular tumours; congenital cataracts; and albinism. Over the past three decades, more than 450 genes have been associated with IEDs.2 3 Genetic diagnosis in such cases is challenging due to both clinical and genetic heterogeneity.
 
Ocular genetics has rapidly evolved over the past decade, from identifying inheritance patterns of IEDs to establishing genotype-phenotype correlations for disease prognostication and enrolling patients in gene therapy trials. In 2017, the United States Food and Drug Administration approved the first ocular gene therapy, Luxturna, for the treatment of RPE65-related inherited retinal disease.4 In 2012, the American Academy of Ophthalmology published diagnostic guidelines encouraging the routine use of genetic testing for IEDs.5 Multiple genes can now be assessed simultaneously through a single genomic test, which is particularly useful for identifying heterogeneous single-gene disorders and resolving cases where a clinical diagnosis is difficult to establish.6 Advances in sequencing technologies are uncovering the molecular aetiologies of various disorders. Consequently, the genomic approach to IEDs is gaining popularity, highlighting the need for more sophisticated genomic testing and comprehensive ocular genetic services.
 
In Hong Kong, the Retinitis Pigmentosa Registry—the first of its kind among Chinese populations globally—was established in 1995. Its main objectives are to provide detailed ophthalmic and genetic examinations for patients with inherited retinal degenerative diseases and to build a database for future scientific, medical, and sociological research.7 However, local data remain limited and the genetic landscapes of other IEDs are still unclear.
 
Hong Kong Children’s Hospital (HKCH) serves as the tertiary referral centre for complex, serious, and uncommon paediatric cases requiring multidisciplinary management, providing diagnosis, treatment, and rehabilitation services across the territory. In 2021, the Clinical Genetics Service Unit (CGSU) at HKCH was established as the first clinical genetics branch under the Hospital Authority. In July 2023, the Clinical Genetic Service (CGS) of the Department of Health (DH)—the former government-funded tertiary genetic referral centre providing genetic counselling and laboratory services to the entire Hong Kong population—was integrated with the CGSU and renamed the Department of Clinical Genetics (DCG) under the Hospital Authority. As a major clinical genetics service provider in Hong Kong, the DCG now offers genetic counselling services territory-wide.
 
Acknowledging the knowledge gap in the local genetic landscape and the lack of a comprehensive service model for patients with IEDs in Hong Kong, we conducted this retrospective review to analyse the local mutational spectrum across various IED subtypes and the corresponding diagnostic yield in our institution. Our aim was to better understand the clinical utility of genomic testing in IED patients and to formulate a comprehensive ocular genetic service model that addresses the needs of local patients.
 
Methods
Study design and population
Patients presenting with eye manifestations were retrospectively identified by querying records between 1 December 2021 and 30 October 2023 through the Hospital Authority Teams database under the CGSU/DCG at HKCH. The database included all patients who had attended genetic counselling clinics under the CGSU/DCG. Clinical geneticists and ophthalmologists reviewed all clinical notes, genetic reports, and electronic health records in the Clinical Management System, as well as paper records.
 
Patients’ phenotypes were reviewed and categorised by ophthalmologists into the following nine groups: (a) anterior segment dysgenesis; (b) IRDs; (c) cataract and lens disorders; (d) microphthalmia, anophthalmia, and coloboma spectrum; (e) neuro-ophthalmology (eg, optic atrophy); (f) ocular albinism or oculocutaneous albinism; (g) high myopia; (h) ocular tumours; and (i) others.
 
Patients with inconclusive eye phenotypes were excluded. Relevant history (including consanguinity, ethnicity, and family history), physical examination findings (dysmorphism and involvement of other systems), ophthalmological assessments and examinations, other relevant investigations (eg, magnetic resonance imaging of the brain and renal imaging), and previous genetic test reports were reviewed. A positive family history was defined as the presence of related eye phenotypes in a first-degree relative, or in two or more second- or third-degree relatives with the same condition.
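The definition of a positive family history above can be expressed as a simple rule, sketched here in Python for illustration. The function name and the input encoding (one kinship degree per affected relative) are hypothetical, and the sketch assumes that all listed relatives share the same related eye condition.

```python
def has_positive_family_history(affected_relative_degrees):
    """Apply the stated rule to a list of kinship degrees (1, 2, or 3),
    one entry per relative with a related eye phenotype."""
    first_degree = sum(1 for d in affected_relative_degrees if d == 1)
    second_or_third = sum(1 for d in affected_relative_degrees if d in (2, 3))
    # Positive if any first-degree relative is affected, or if at least two
    # second- or third-degree relatives are affected.
    return first_degree >= 1 or second_or_third >= 2

print(has_positive_family_history([1]))     # one affected parent or sibling
print(has_positive_family_history([2]))     # a single affected grandparent
print(has_positive_family_history([2, 3]))  # an affected uncle and cousin
```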
 
All patients underwent comprehensive dysmorphology evaluations and genetic counselling, including pre-test and post-test consultations with the clinical genetics team. Prior to providing informed consent for genomic testing, patients were counselled on the indications, limitations, diagnostic yield, variants of uncertain clinical significance, and the ethical, social, and legal implications of genomic testing. Informed consent was obtained from affected patients or their legal guardians before undergoing diagnostic genomic testing.
 
Genomic testing
According to clinical indications, patients were offered various genomic tests, including single-gene sequencing, array comparative genomic hybridisation, multiplex ligation–dependent probe amplification, whole-exome sequencing–based panels, medical exome sequencing, and mitochondrial sequencing. DNA was extracted from peripheral blood ethylenediaminetetraacetic acid samples. For mitochondrial sequencing, mitochondrial DNA extracted from urine-derived cells was used. All tests were performed in one of two accredited laboratories: the Genetic Laboratory of DH (which became a combined service with the Hospital Authority after July 2023), or the Genetics and Genomics Laboratory at HKCH, in accordance with laboratory-specific protocols and guidelines. Inheritance and phasing were determined via targeted Sanger sequencing of parental samples.
 
Data collection and analysis
Clinical characteristics were collected from electronic records and, when available, hospital case notes and CGS paper records. These characteristics included age at onset, age at first encounter, sex, ethnicity, consanguinity, laterality of ocular involvement, severity of visual impairment, family history of ocular conditions, syndromic features, and other associated system involvement. Genetic testing results were retrieved from the Clinical Management System, CGS database, and paper records. Additionally, reproductive planning (for either the index patient or their parents) and other subspecialty referrals after a substantiated molecular diagnosis—as documented in genetic counselling notes—were recorded for clinical utility analysis. All clinical data are presented as percentages or means ± standard deviations, unless otherwise specified.
 
Molecular and clinical data from all recruited individuals were analysed using SPSS (Windows version 26.0; IBM Corp, Armonk [NY], United States). Categorical variables (eg, syndromic vs non-syndromic presentation, presence of family history) were compared using Fisher’s exact test, while continuous variables were compared using the independent samples t test. P values of less than 0.05 were considered statistically significant. This article was written in compliance with the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) reporting guidelines.
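
For readers unfamiliar with Fisher's exact test, the sketch below shows how a two-sided P value is obtained for a 2×2 table. The study analysis was run in SPSS; this standard-library Python version only illustrates the calculation, and the counts in the example are hypothetical and are not taken from the study tables.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2

    def table_probability(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = table_probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every possible table that is no more probable than the observed one.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts, NOT from the study:
# rows = family history yes / no, columns = test positive / negative.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")
```

Summing the tables that are no more probable than the observed one is one common convention for the two-sided test; other software may define the two-sided P value slightly differently.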
 
Results
Between December 2021 and October 2023, 3653 patients were registered at the HKCH genetic counselling clinic. Of these, 148 symptomatic patients from 147 families met the inclusion criteria for this study. Approximately 4% of patients presented to the genetics clinic with ophthalmological diseases. Overall, 130 (87.8%) patients consented to genomic testing (Fig).
 

Figure. Patient selection for analysis
 
Patient demographics
Among the 130 patients, approximately 92% were Chinese, with a male-to-female ratio of 3:2. The mean age (±standard deviation) at onset was 12.5±16.2 years. Within this cohort, 53.1% of patients were classified under IRDs, 14.6% under neuro-ophthalmology, and 13.8% under cataract/lens disorders.
 
Fifteen patients (11.5% of those tested) presented with more than one ocular phenotype. The majority of patients (>80%) exhibited bilateral ocular involvement. Detailed demographics, family history, and disease categories of the 130 patients who underwent genetic testing are presented in Table 1.
 

Table 1. Patient demographics (n=130) and disease categories
 
Molecular findings and diagnostic yield
The diagnostic yield of genomic testing was defined as the proportion of individuals with pathogenic or likely pathogenic molecular variants or structural variants contributing to the clinical phenotype. A whole-exome sequencing–based virtual panel was requested for 78 (60%) of the 130 patients, based on their presenting phenotypes (online supplementary Table 1). Using this panel-based approach, the diagnostic yield was 51.3%. Twenty-three patients (17.7%) underwent single-gene testing based on highly specific phenotypes without molecular heterogeneity, such as those associated with RB1, CHD7, NF1, and RS1 (online supplementary Table 2). This single-gene approach successfully diagnosed 14 patients (60.9%). Medical exome sequencing was offered to 22 patients with multiple congenital anomalies or suspected syndromes, achieving a diagnostic yield of 50% (11/22). Two patients were diagnosed through copy number variation analysis (online supplementary Table 2).
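
As a cross-check of the arithmetic, the per-test figures in this paragraph can be pooled. The sketch below is illustrative only. The count of 40 panel diagnoses is inferred from the reported 51.3% of 78 rather than quoted directly, and the sketch assumes each diagnosed patient is counted under exactly one testing approach.

```python
# (diagnosed, tested) per approach, as stated in the paragraph above.
# 40 panel diagnoses is an inferred value (51.3% of 78), not a quoted one.
# No denominator is stated for copy number variation analysis, hence None.
results = {
    "WES-based virtual panel": (40, 78),
    "single-gene testing": (14, 23),
    "medical exome sequencing": (11, 22),
    "copy number variation analysis": (2, None),
}
COHORT_SIZE = 130  # patients who consented to genomic testing


def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


total_diagnosed = sum(diagnosed for diagnosed, _ in results.values())
overall_yield = percent(total_diagnosed, COHORT_SIZE)

for name, (diagnosed, tested) in results.items():
    if tested is not None:
        print(f"{name}: {diagnosed}/{tested} = {percent(diagnosed, tested)}%")
print(f"overall: {total_diagnosed}/{COHORT_SIZE} = {overall_yield}%")
```

Under these assumptions the pooled total is 67 diagnoses, which agrees with the n=67 in Table 5 and the overall yield of 51.5% reported for the cohort.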
 
The overall diagnostic yield for this cohort was 51.5% (Table 2). As mentioned earlier, 15 patients exhibited overlapping phenotypes across disease categories, with inherited retinal disorders and cataracts being the most common co-existing phenotypes. The microphthalmia, anophthalmia, and coloboma spectrum demonstrated the highest diagnostic yield at 100%. All five patients in this category presented with bilateral eye involvement and were syndromic (eg, two with CHARGE syndrome) [online supplementary Table 2]. Among the 69 IRD patients who underwent testing, 40 had confirmed molecular diagnoses, resulting in a diagnostic yield of 58% for the IRD group. The most commonly identified genes were USH2A, ABCA4, COL2A1, RP1L1, and RS1 (online supplementary Table 2). No significant differences in diagnostic yield were detected across disease categories (Table 2).
 

Table 2. Diagnostic characteristics of genetic testing by disease category
 
Patients presenting with IRDs and neuro-ophthalmological conditions generally exhibited a later age at onset and age at first encounter compared with other categories, although these differences were not statistically significant (Table 2).
 
In total, 25 novel variants were identified in 25 patients across 20 genes. Of these, four remained of uncertain clinical significance despite further phasing and segregation analysis (online supplementary Table 2). Five variants were found in trans with another likely pathogenic variant in the same gene, consistent with autosomal recessive inheritance.8 Following detailed phenotypic correlation and variant curation, 16 novel variants were confirmed to contribute to molecular diagnoses within this cohort.
 
A significant difference in the proportion of positive genetic test results was observed between patients with and without a family history of ocular conditions (P=0.037). However, among patients with a family history, the interval between symptom onset and the first visit to the genetics clinic was significantly longer. Positive molecular diagnoses were also more likely to be achieved in syndromic patients (P=0.0014) [Table 3].
 

Table 3. Genetic testing outcomes and time lapse between onset and first encounter
 
As shown in Table 4, individuals with bilateral eye involvement had a greater proportion of positive genetic test results (54.9%), although this difference was not statistically significant (P=0.07). Additionally, no significant difference in diagnostic yield was observed according to age at onset (P=0.29).
 

Table 4. Diagnostic yield of genetic testing by ocular disease site and age at onset
 
Diagnostic and clinical utilities
Genomic testing is increasingly recognised as an important tool for establishing new diagnoses or confirming existing ones, particularly in the context of rare conditions, which are often complex and costly to diagnose, leading to prolonged diagnostic odysseys. Molecular findings may offer additional clinical utility, including: (1) avoidance of unnecessary investigations or treatments; (2) improved prognostic certainty or redirection of clinical care; (3) enhanced surveillance or timely referral for extraocular manifestations; (4) provision of pre-symptomatic or cascade testing for potentially affected family members; and (5) support for reproductive planning.
 
In total, 14 patients received revised diagnoses after genomic testing, representing 21% of positive cases (Table 5). These new diagnoses were related to syndromic conditions, such as CTNNB1-related neurodevelopmental disorders, or involved extraocular features, such as pantothenate kinase–associated neurodegeneration (online supplementary Table 2).
 

Table 5. Diagnostic and clinical utilities of genetic testing (n=67)
 
Through medical record review, we determined that approximately 10% of test-positive patients were able to avoid unnecessary investigations and treatments. In two cases, metabolic workups for congenital cataract were discontinued after diagnostic confirmation. One patient with a pathogenic ABCA4 variant was advised to withhold vitamin A supplementation. In another case, a syndromic diagnosis of SOX2-related microphthalmia eliminated the need for repeated magnetic resonance imaging of the brain and prompted clinicians to monitor for other potential systemic associations, enabling timely intervention. Overall, 74.6% of test-positive patients experienced at least one clinical benefit as a result of genomic testing. More than 70% of test-positive patients benefited from improved prognostic certainty or a redirection of care. Approximately 30% of patients (or their carrier or affected parents) were offered options for reproductive planning through either prenatal confirmatory testing or preimplantation genetic testing. Table 5 summarises the clinical and diagnostic utilities observed in this study.
 
Discussion
Molecular findings and diagnostic yield
In this cohort, we reviewed 130 patients who attended the HKCH genetic counselling clinic over a 23-month period. This review offers a snapshot of the local genomic landscape of IEDs. The overall diagnostic yield of molecular testing was 51.5%, which is comparable to previously reported yields, ranging from 25% to 70% depending on phenotype and testing methodology.6 9 10 11 12 13 14 15 16 17 18 19 20
 
Among IRDs, a highly heterogeneous group, the diagnostic yield was 58%. This finding is consistent with a recent systematic review which reported a yield of 61.3% (95% confidence interval=57.8%-64.7%) across 51 studies of mixed IRD phenotypes.21 Several studies have demonstrated that well-curated gene panels are as effective as medical exome sequencing in detecting pathogenic variants in patients with IRDs.16 19 20 21 22
 
In our cohort of ocular tumours, 29.4% of patients received germline molecular diagnoses; most of these patients had unilateral retinoblastoma with no family history. Neither routine next-generation sequencing nor Sanger sequencing is typically capable of detecting low-level mosaicism. A previous study reported germline RB1 mutation detection rates ranging from 10% to 55% in unilateral retinoblastoma, which are substantially lower than those observed in bilateral cases.23 In the present study, the oculocutaneous albinism/ocular albinism group had the lowest diagnostic yield at 25%. This low yield may be attributed to the small sample size and the predominance of ocular albinism cases, for which previous research has shown a considerably lower molecular diagnostic yield than oculocutaneous albinism.24
 
Four recurrent variants were identified in this cohort (online supplementary Table 2):
  1. NM_000350.3 (ABCA4): c.1804C>T, p.(Arg602Trp). This variant is present at a very low frequency in the Genome Aggregation Database25 (gnomAD v2.1.1: 11 in 250 870 alleles), with a predominance in East Asian populations (gnomAD v2.1.1: 5 in 18 364 alleles). The exact carrier risk in our locality requires further research.
  2. NM_000330.4 (RS1): c.214G>A, p.(Glu72Lys). A missense variant located in exon 4 of the RS1 gene. This variant is well documented in Chinese populations, where it accounts for 9.2% of variants in individuals with X-linked retinoschisis.26
  3. NM_178857.6 (RP1L1): c.133C>T, p.(Arg45Trp). This hotspot mutation, located in exon 2 of RP1L1, is associated with occult macular dystrophy. Although its allele frequency is not particularly enriched in the Chinese population, it has been mentioned in case reports.27 28
  4. NM_206933.3 (USH2A): c.5572+1G>A. A splice-site variant in intron 27 of the USH2A gene, which has been documented in the literature.29 It has a relatively high allele frequency in East Asians (gnomAD v2.1.1: 3 in 249 996 alleles; East Asian subset: 3 in 18 382 alleles).30
 
Another variant, NM_153638.4 (PANK2): c.655G>A, p.(Gly219Ser), is a rare missense variant absent from the general population. It was detected in our local database and reported in 2023.31 Neurodegeneration with brain iron accumulation 1A (OMIM #234200) is caused by biallelic pathogenic variants in PANK2. This rare condition is characterised by early-onset retinal degeneration or pigmented retinopathy, followed by subtle neurological deficits such as tremor and extrapyramidal symptoms. Both ocular and neurological features follow a progressive course. Notably, two unrelated patients in our database carried the same PANK2 variant. A large, population-based study is warranted to determine whether this variant represents a founder mutation in our locality.
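
The gnomAD allele counts quoted for the recurrent variants can be converted into allele frequencies and, for genes associated with autosomal recessive disease such as ABCA4 and USH2A, into rough carrier frequencies. The sketch below assumes Hardy-Weinberg equilibrium (carrier frequency 2pq). This is a textbook approximation and not an analysis performed in this study, and the function names are illustrative. As noted above, the actual local carrier risk requires further research.

```python
def allele_frequency(allele_count, allele_number):
    """Observed allele frequency from gnomAD-style allele counts."""
    return allele_count / allele_number


def carrier_frequency(p):
    """Heterozygote frequency 2pq, assuming Hardy-Weinberg equilibrium."""
    return 2 * p * (1 - p)


# East Asian allele counts quoted in the text (gnomAD v2.1.1).
east_asian_counts = {
    "ABCA4 c.1804C>T": (5, 18364),
    "USH2A c.5572+1G>A": (3, 18382),
}

for variant, (count, number) in east_asian_counts.items():
    p = allele_frequency(count, number)
    carriers = carrier_frequency(p)
    print(f"{variant}: AF = {p:.2e}, about 1 carrier in {round(1 / carriers)}")
```

With counts this small the estimates carry wide uncertainty, so they are best read as orders of magnitude rather than precise local carrier rates.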
 
Diagnostic and clinical utilities
Genomic testing has advanced considerably over the past decade. As next-generation sequencing technologies (eg, whole-genome sequencing and multi-omics analysis) become more prevalent, diagnostic yields continue to improve.32 Given the availability of existing therapies, such as voretigene neparvovec for RPE65-related diseases, clinical trials are increasingly investigating gene-based therapies, including gene replacement through viral vectors, mutation suppression via small molecules, and splice modulation using antisense oligonucleotides.33 34 In addition to ending the diagnostic odyssey, a molecular diagnosis informs clinical management, facilitates access to other clinical services, initiates surveillance for extraocular manifestations, and supports family planning.6 12 35
 
Disease prognostication is a key aspect of clinical utility, most commonly reported in the IRD group. For example, COL2A1-related Stickler syndrome carries a high risk of retinal detachment, which may be mitigated through prophylactic cryotherapy or laser retinopexy.36 37 38 Among patients with unilateral retinoblastoma, those harbouring germline variants require closer surveillance of the contralateral eye and enhanced vigilance for the potential development of other cancers later in life.39 40
 
Among the 67 test-positive patients, 35 were diagnosed with autosomal dominant conditions (Table 5), six of which were inherited from an affected parent. Approximately one-third of positive findings were attributed to autosomal recessive conditions, with both parents identified as heterozygous carriers. Nine patients had X-linked conditions; in nearly all cases, the mothers were confirmed as heterozygous carriers, except for two who declined genetic testing. In this context, molecular diagnosis is clearly beneficial for cascade screening and reproductive planning. In practice, however, the extent to which patients report these benefits is often influenced by age and family circumstances within the study cohort. As a result, direct comparisons of reported utility across studies remain challenging.
 
Limitations and strengths
In Hong Kong, our genetic counselling department serves as the major referral centre, receiving patients from both public and private sectors. Individuals with more severe phenotypes are more likely to be referred, resulting in potential ascertainment bias.
 
Genomic testing was recommended by clinical geneticists based on the clinical phenotype. However, due to resource limitations, not all patients underwent the full spectrum of available tests, which may have resulted in an underestimation of the diagnostic yield. Additionally, certain clinical subgroups (eg, microphthalmia, anophthalmia, and coloboma) had limited sample sizes, potentially affecting diagnostic yield outcomes. Despite these limitations, this pilot study provides a reliable estimate of the mutational spectrum and diagnostic yield among local IED patients.
 
To our knowledge, this is the first retrospective study of IED patients to examine both the local genetic landscape and the clinical utility of genomic testing. Our findings highlight the importance of integrating modern genomic technologies into the management of patients with IEDs. They also underscore the need for an enhanced service model through a multidisciplinary team approach, implemented via a combined ocular genetics clinic.
 
Ideally, clinical utility should be assessed through a randomised controlled trial, which maximises internal validity and controls for confounding variables. However, the level of evidence required varies according to clinical indication and type of genetic test. The data presented in this retrospective observational study, collected over nearly 2 years, are considered representative of real-world clinical scenarios. Future research involving multicentre collaborations over a longer period (eg, 10 years) will provide a more comprehensive understanding.
 
Ocular genetics clinic: a new service model in Hong Kong
Interestingly, patients with a family history experienced a longer interval between symptom onset and their first encounter at the clinical genetics clinic (Table 3). This finding emphasises the importance of raising public awareness about the role of genomic medicine in managing IEDs.
 
The multidisciplinary team clinic model, comprising ophthalmologists, genetic counsellors, geneticists, and genetic nurses, is increasingly used worldwide to integrate genomic testing into clinical care pathways. It has been reported to be effective, particularly when IRDs are used as the model condition.41 42 Similar models have also been adopted in other specialties, such as neurogenetics and cardiogenetics clinics.
 
At HKCH, a combined ocular genetics clinic commenced service in May 2022. The team includes ophthalmologists, genetic counsellors, clinical geneticists, optometrists, and nurses. Patients are referred from both public and private sectors for a variety of indications, such as atypical eye phenotypes, suspected syndromic conditions, or complex counselling needs (eg, variants of uncertain clinical significance detected in previous genomic tests conducted locally or overseas). This one-stop combined clinic enables joint discussions among specialists to formulate comprehensive management plans and reduces the need for repeated hospital visits, saving patients valuable time.
 
Conclusion
Approximately 4% of patients attending our genetic clinic had ocular disorders. The overall diagnostic yield of genomic testing was 51.5%, and positive results were more likely among patients with syndromic presentations or a positive family history.
 
This study demonstrates high clinical utility of genomic testing in over 70% of patients with confirmed molecular diagnoses. There is a global shift towards managing IED patients through a multidisciplinary team clinic service model. To meet the growing demand for genomic medicine in IEDs, future studies should incorporate prospective, population-wide sampling, long-term follow-up, and multicentre collaboration.
 
Author contributions
Concept or design: SSW Cheng, HM Luk.
Acquisition of data: SSW Cheng.
Analysis or interpretation of data: SSW Cheng, SSL Cheung.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
 
All authors had full access to the data, contributed to the study, approved the final version for publication, and take responsibility for its accuracy and integrity.
 
Conflicts of interest
As an editor of the journal, JCS Yam was not involved in the peer review process. Other authors have disclosed no conflicts of interest.
 
Acknowledgement
The authors thank the patients and their families for contributing the clinical data used in this study.
 
Declaration
Part of the research data was presented at the Kowloon Central Cluster Convention, 25 October 2024, Hong Kong.
 
Funding/support
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.
 
Ethics approval
This research was performed in compliance with the Declaration of Helsinki. Ethics approval was granted by the Hospital Authority Central Institutional Review Board, Hong Kong (Ref No.: PAED-2023-076). A waiver of patient consent was obtained from the Committee due to the retrospective nature of the research.
 
Supplementary material
The supplementary material was provided by the authors and some information may not have been peer reviewed. Accepted supplementary material will be published as submitted by the authors, without any editing or formatting. Any opinions or recommendations discussed are solely those of the author(s) and are not endorsed by the Hong Kong Academy of Medicine and the Hong Kong Medical Association. The Hong Kong Academy of Medicine and the Hong Kong Medical Association disclaim all liability and responsibility arising from any reliance placed on the content.
 
References
1. National Academies of Sciences, Engineering, and Medicine; Health and Medicine Division; Board on Health Care Services; Board on the Health of Select Populations; Committee on the Evidence Base for Genetic Testing. An Evidence Framework for Genetic Testing. Washington (DC): National Academies Press (US); 2017.
2. Rahi JS, Cable N; British Childhood Visual Impairment Study Group. Severe visual impairment and blindness in children in the UK. Lancet 2003;362:1359-65. Crossref
3. Chen HY, Lehmann OJ, Swaroop A. Genetics and therapy for pediatric eye diseases. EBioMedicine 2021;67:103360. Crossref
4. FDA approves hereditary blindness gene therapy. Nat Biotechnol 2018;36:6. Crossref
5. Stone EM, Aldave AJ, Drack AV, et al. Recommendations for genetic testing of inherited eye diseases: report of the American Academy of Ophthalmology task force on genetic testing. Ophthalmology 2012;119:2408-10. Crossref
6. Burdon KP. The utility of genomic testing in the ophthalmology clinic: a review. Clin Exp Ophthalmol 2021;49:615-25. Crossref
7. Lam ST, To CH, Leung KW, Yip SP, Lo IF, Tsang KP. Lessons learnt from a genetic disease registry in Hong Kong. Hong Kong Med J 2021;27:226-8. Crossref
8. Richards S, Aziz N, Bale S, et al. Standards and guidelines for the interpretation of sequence variants: a joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet Med 2015;17:405-24. Crossref
9. Ma A, Grigg JR, Flaherty M, et al. Genome sequencing in congenital cataracts improves diagnostic yield. Hum Mutat 2021;42:1173-83. Crossref
10. Haug P, Koller S, Maggi J, et al. Whole exome sequencing in coloboma/microphthalmia: identification of novel and recurrent variants in seven genes. Genes (Basel) 2021;12:65. Crossref
11. García Bohórquez B, Aller E, Rodríguez Muñoz A, Jaijo T, García García G, Millán JM. Updating the genetic landscape of inherited retinal dystrophies. Front Cell Dev Biol 2021;9:645600. Crossref
12. Lenassi E, Clayton-Smith J, Douzgou S, et al. Clinical utility of genetic testing in 201 preschool children with inherited eye disorders. Genet Med 2020;22:745-51. Crossref
13. Wang P, Li S, Sun W, et al. An ophthalmic targeted exome sequencing panel as a powerful tool to identify causative mutations in patients suspected of hereditary eye diseases. Transl Vis Sci Technol 2019;8:21. Crossref
14. Patel A, Hayward JD, Tailor V, et al. The oculome panel test: next-generation sequencing to diagnose a diverse range of genetic developmental eye disorders. Ophthalmology 2019;126:888-907. Crossref
15. Martin-Merida I, Avila-Fernandez A, Del Pozo-Valero M, et al. Genomic landscape of sporadic retinitis pigmentosa: findings from 877 Spanish cases. Ophthalmology 2019;126:1181-8. Crossref
16. Wang L, Zhang J, Chen N, et al. Application of whole exome and targeted panel sequencing in the clinical molecular diagnosis of 319 Chinese families with inherited retinal dystrophy and comparison study. Genes (Basel) 2018;9:360. Crossref
17. Lasseaux E, Plaisant C, Michaud V, et al. Molecular characterization of a series of 990 index patients with albinism. Pigment Cell Melanoma Res 2018;31:466-74. Crossref
18. Haer-Wigman L, van Zelst-Stams WA, Pfundt R, et al. Diagnostic exome sequencing in 266 Dutch patients with visual impairment. Eur J Hum Genet 2017;25:591-9. Crossref
19. Saudi Mendeliome Group. Comprehensive gene panels provide advantages over clinical exome sequencing for Mendelian diseases. Genome Biol 2015;16:134. Crossref
20. Consugar MB, Navarro-Gomez D, Place EM, et al. Panel-based genetic diagnostic testing for inherited eye diseases is highly accurate and reproducible, and more sensitive for variant detection, than exome sequencing. Genet Med 2015;17:253-61. Crossref
21. Britten-Jones AC, Gocuk SA, Goh KL, Huq A, Edwards TL, Ayton LN. The diagnostic yield of next generation sequencing in inherited retinal diseases: a systematic review and meta-analysis. Am J Ophthalmol 2023;249:57-73. Crossref
22. Hayman T, Millo T, Hendler K, et al. Whole exome sequencing of 491 individuals with inherited retinal diseases reveals a large spectrum of variants and identification of novel candidate genes. J Med Genet 2024;61:224-31. Crossref
23. Gupta H, Malaichamy S, Mallipatna A, et al. Retinoblastoma genetics screening and clinical management. BMC Med Genomics 2021;14:188. Crossref
24. Chan KS, Bohnsack BL, Ing A, et al. Diagnostic yield of genetic testing for ocular and oculocutaneous albinism in a diverse United States pediatric population. Genes (Basel) 2023;14:135. Crossref
25. Genome Aggregation Database. Available from: https://gnomad.broadinstitute.org/. Accessed 1 Sep 2024.
26. Huang L, Sun L, Wang Z, et al. Clinical manifestation and genetic analysis in Chinese early onset X-linked retinoschisis. Mol Genet Genomic Med 2020;8:e1421. Crossref
27. Qi YH, Gao FJ, Hu FY, et al. Next-generation sequencing–aided rapid molecular diagnosis of occult macular dystrophy in a Chinese family. Front Genet 2017;8:107. Crossref
28. Xiao S, Sun W, Xiao X, et al. Clinical and genetic features of retinoschisis in 120 families with RS1 mutations. Br J Ophthalmol 2023;107:367-72. Crossref
29. Lin YW, Huang YS, Lin CY, et al. High prevalence of exon-13 variants in USH2A-related retinal dystrophies in Taiwanese population. Orphanet J Rare Dis 2024;19:238. Crossref
30. Chau JF, Yu MH, Chui MM, et al. Comprehensive analysis of recessive carrier status using exome and genome sequencing data in 1543 Southern Chinese. NPJ Genom Med 2022;7:23. Crossref
31. Wong EW, Cheng SS, Woo TT, Lam RF, Lai FH. Concurrent PANK2 and OCA2 variants in a patient with retinal dystrophy, hypopigmented irides and neurodegeneration. Ophthalmic Genet 2023;44:403-7. Crossref
32. Weisschuh N, Mazzola P, Zuleger T, et al. Diagnostic genome sequencing improves diagnostic yield: a prospective single-centre study in 1000 patients with inherited eye diseases. J Med Genet 2024;61:186-95. Crossref
33. Drag S, Dotiwala F, Upadhyay AK. Gene therapy for retinal degenerative diseases: progress, challenges, and future directions. Invest Ophthalmol Vis Sci 2023;64:39. Crossref
34. Tan F, Li X, Wang Z, Li J, Shahzad K, Zheng J. Clinical applications of stem cell–derived exosomes. Signal Transduct Target Ther 2024;9:17. Crossref
35. Sahu A, Kaur S, Sukhija J, Srivastava P, Kaur A. Spectrum of congenital and inherited ocular disorders seen in a genetic clinic: experience of a developing ocular genetic service. Indian J Ophthalmol 2023;71:935-40. Crossref
36. Fincham GS, Pasea L, Carroll C, et al. Prevention of retinal detachment in Stickler syndrome: the Cambridge prophylactic cryotherapy protocol. Ophthalmology 2014;121:1588-97. Crossref
37. Savarirayan R, Bompadre V, Bober MB, et al. Best practice guidelines regarding diagnosis and management of patients with type II collagen disorders. Genet Med 2019;21:2070-80. Crossref
38. Khanna S, Rodriguez SH, Blair MA, Wroblewski K, Shapiro MJ, Blair MP. Laser prophylaxis in patients with Stickler syndrome. Ophthalmol Retina 2022;6:263-7. Crossref
39. Tonorezos ES, Friedman DN, Barnea D, et al. Recommendations for long-term follow-up of adults with heritable retinoblastoma. Ophthalmology 2020;127:1549-57. Crossref
40. Abramson DH. Re: Skalet et al.: Screening children at risk for retinoblastoma: consensus report from the American Association of Ophthalmic Oncologists and Pathologists (Ophthalmology. 2018;125:453-458). Ophthalmology 2018;125:e63-4. Crossref
41. Davison N, Payne K, Eden M, et al. Exploring the feasibility of delivering standardized genomic care using ophthalmology as an example. Genet Med 2017;19:1032-9. Crossref
42. Black GC, Sergouniotis P, Sodi A, et al. The need for widely available genomic testing in rare eye diseases: an ERN-EYE position statement. Orphanet J Rare Dis 2021;16:142. Crossref
