Summary
Modeling and recognition of surgical activities in video poses an interesting research problem. Although a number of recent works have studied automatic recognition of surgical activities, the generalizability of these approaches across different tasks and datasets remains a challenge.
For example, tying a knot might occur during a Suturing on Tissue task and also during the more specific and challenging task of Urethrovesical Anastomosis (UVA), which involves stitching two anatomical structures back together. If we rely heavily on image cues from the surgical scene, the representations of the same activity vary greatly across these contexts.
Moreover, there is a growing need for representations with greater expressive power that can be used not only to recognize surgical activities but also to bridge the gap between recognition and control in autonomous systems.
In this project we will explore different modalities, spatial and temporal representations, and saliency and attention models for surgical activity understanding. A good knowledge of fundamental topics in computer vision and deep learning, along with strong Python coding skills, is expected. Experience with advanced deep learning topics is preferred.
The project offers the chance to explore different ideas and topics, and to work with collaborators from a range of disciplines (medicine, robotics, surgical education, and cognition).
