Article intro - taxonomy for surgical gestures
Abstract
Introduction Artificial intelligence (AI) for surgical workflow analysis often fails to generalize because surgical actions lack a standardized, fine-grained representation. Gesture-level “tokenization” of surgery, capturing instrument–tissue interactions as the smallest intentional functional units, offers greater technical specificity than phase- or step-level labels and has demonstrated associations with proficiency and clinical outcomes. However, the field remains fragmented by heterogeneous gesture terminology, limiting dataset interoperability and model reproducibility.
Methods We conducted a SAGES-led, accelerated Delphi consensus process to establish a standardized surgical gesture taxonomy. Starting with 270 literature-derived gesture terms, we employed a novel hybrid pipeline combining large language model (LLM)-assisted semantic clustering with multi-round expert review. The process involved two Delphi surveys (open-ended, then structured agreement) with a predefined ≥ 80% agreement threshold, a pilot interactive video-based validation task where participants labeled 30 surgical clips, and a final in-person consensus meeting with live anonymous polling.
Results Across iterative refinement, the taxonomy evolved from 106 gestures in 11 clusters to a hierarchical framework of Clusters, Gestures, and Sub-gestures, which, after consolidation and pilot annotation, reached a final consensus taxonomy comprising 10 clusters, 24 gestures, and 46 sub-gestures. The panel rejected dominant-instrument-only labeling, supporting multi-instrument annotation to capture assisting actions critical to surgical quality. Video-based validation demonstrated high agreement for multiple gestures (e.g., coagulate, suction, irrigate, staple, clip, needle drive), while identifying predictable ambiguities among semantically proximate actions (e.g., cut vs seal; grasp vs clamp; dissect vs spread), informing final revisions.
Conclusion This work establishes a standardized, hierarchical taxonomy for surgical gestures, providing a foundational language for surgical data science. This framework is designed to reduce annotation variability, enable reliable cross-study comparisons, and accelerate the development of scalable video-based assessment, computer vision, and autonomous systems. Defining temporal boundaries for these gestures was identified as the next critical step.
Keywords: Surgical gestures · Minimally invasive surgery · Delphi consensus
Source: LinkedIn
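
Illustrative sketch (not from the paper): the abstract describes a three-level hierarchy (Clusters, Gestures, Sub-gestures), a predefined ≥ 80% agreement threshold, and multi-instrument annotation. The Python below shows one way those ideas might be represented. All class names, field names, the cluster and sub-gesture labels, and the instrument names are hypothetical placeholders; only the gesture names "grasp", "clamp", and "cut" and the 80% figure come from the abstract. How the panel actually computed agreement is not stated there, so the simple share-of-votes rule is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Gesture:
    name: str
    sub_gestures: list[str] = field(default_factory=list)


@dataclass
class Cluster:
    name: str
    gestures: list[Gesture] = field(default_factory=list)


@dataclass
class ClipAnnotation:
    """One video clip labeled per instrument, not only the dominant one."""
    clip_id: str
    # Maps a (hypothetical) instrument name to the gesture it performs.
    gesture_by_instrument: dict[str, str] = field(default_factory=dict)


def reaches_consensus(votes: list[bool], threshold_pct: int = 80) -> bool:
    """True when the share of agreeing votes is at least threshold_pct.

    Integer arithmetic avoids floating-point edge cases at exactly 80%.
    Assumption: agreement is a simple share of panelist votes.
    """
    if not votes:
        return False
    return sum(votes) * 100 >= threshold_pct * len(votes)


def count_levels(clusters: list[Cluster]) -> tuple[int, int, int]:
    """Return (number of clusters, gestures, sub-gestures)."""
    n_gestures = sum(len(c.gestures) for c in clusters)
    n_sub = sum(len(g.sub_gestures) for c in clusters for g in c.gestures)
    return len(clusters), n_gestures, n_sub


# Placeholder taxonomy fragment: the cluster and sub-gesture names are
# invented for illustration and are NOT the published taxonomy.
example_taxonomy = [
    Cluster(
        name="placeholder-cluster",
        gestures=[
            Gesture("grasp", sub_gestures=["placeholder-sub-gesture"]),
            Gesture("clamp"),
        ],
    )
]

# Two instruments in the same clip each receive their own gesture label.
example_clip = ClipAnnotation(
    clip_id="clip-001",
    gesture_by_instrument={"left-instrument": "grasp", "right-instrument": "cut"},
)
```

Under these assumptions, a 10-member panel with 8 agreeing votes would meet the threshold and one with 7 would not; the same counting helper applied to the full taxonomy would be expected to return the 10, 24, and 46 reported in the abstract.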


