Welcome to my website (prof.irfanessa.com). Here you will find information related to my academic pursuits, including updates on my research projects, a list of publications, the classes I teach, and my collaborators and students. If you would like to contact me, please see the FAQ first. Students who want to contact me about working with me are highly encouraged to read the FAQ. My bio is also available. Use the menu bar above, or the TAGS and CATEGORIES listed in the columns, to find relevant information.
A. Zia, Y. Sharma, V. Bettadapura, E. Sarin, and I. Essa (2017), “Video and Accelerometer-Based Motion Analysis for Automated Surgical Skills Assessment,” in Proceedings of Information Processing in Computer-Assisted Interventions (IPCAI), 2017. [PDF][BIBTEX]
@InProceedings{ 2017-Zia-VAMAASSA,
author = {A. Zia and Y. Sharma and V. Bettadapura and E. Sarin
and I. Essa},
booktitle = {Proceedings of Information Processing in
Computer-Assisted Interventions (IPCAI)},
month = {June},
pdf = {http://www.cc.gatech.edu/~irfan/p/2017-Zia-VAMAASSA.pdf},
title = {Video and Accelerometer-Based Motion Analysis for
Automated Surgical Skills Assessment},
year = {2017}
}
Abstract
Purpose: Basic surgical skills of suturing and knot tying are an essential part of medical training. An automated system for surgical skills assessment could save experts time and improve training efficiency. There have been some recent attempts at automated surgical skills assessment using either video analysis or acceleration data. In this paper, we present a novel approach for automated assessment of OSATS-based surgical skills and provide an analysis of different features on multi-modal data (video and accelerometer data).
Methods: We conduct the largest study, to the best of our knowledge, of basic surgical skills assessment, on a dataset containing video and accelerometer data for suturing and knot-tying tasks. We introduce “entropy-based” features: Approximate Entropy (ApEn) and Cross-Approximate Entropy (XApEn), which quantify the predictability and regularity of fluctuations in time-series data. The proposed features are compared to existing methods of Sequential Motion Texture (SMT), Discrete Cosine Transform (DCT), and Discrete Fourier Transform (DFT) for surgical skills assessment.
Results: We report the average performance of different features across all applicable OSATS criteria for the suturing and knot-tying tasks. Our analysis shows that the proposed entropy-based features outperform previous state-of-the-art methods using video data. For accelerometer data, our method performs better for suturing only. We also show that fusion of video and acceleration features can improve overall performance, with the proposed entropy features achieving the highest accuracy.
Conclusions: Automated surgical skills assessment can be achieved with high accuracy using the proposed entropy features. Such a system can significantly improve the efficiency of surgical training in medical schools and teaching hospitals.
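Approximate Entropy is a standard measure defined independently of this paper. As a rough illustration of the idea only (not the authors' implementation), the sketch below computes ApEn for a 1-D motion time series; the embedding dimension m and tolerance r are hypothetical parameter choices, not the paper's settings.

```python
import numpy as np

def approximate_entropy(x, m=2, r=None):
    """Approximate Entropy (ApEn) of a 1-D time series.

    Illustrative sketch only; the parameters (m, r) are assumptions,
    not the values used in the paper.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    if r is None:
        r = 0.2 * np.std(x)  # common heuristic tolerance

    def phi(m):
        # All overlapping windows of length m.
        windows = np.array([x[i:i + m] for i in range(n - m + 1)])
        # Chebyshev distance between every pair of windows.
        dists = np.max(np.abs(windows[:, None, :] - windows[None, :, :]), axis=2)
        # Fraction of windows within tolerance r of each window (self-matches included).
        counts = np.mean(dists <= r, axis=1)
        return np.mean(np.log(counts))

    return phi(m) - phi(m + 1)

# A regular signal yields lower ApEn than an irregular one.
t = np.linspace(0, 10, 500)
print(approximate_entropy(np.sin(2 * np.pi * t)))   # low: predictable
print(approximate_entropy(np.random.randn(500)))    # high: irregular
```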
Aneeq Zia was awarded the “Young Investigator Travel Award,” given to young investigators (including Ph.D. and M.Sc. students and junior researchers) with accepted papers at IPCAI, to attend IPCAI/CARS 2017.
This paper was also one of the 12 papers voted by the audience, based on the five-minute short presentations given by all authors on the first day, for a 25-minute oral presentation and discussion session on the last day of the conference.
A. Zia, Y. Sharma, V. Bettadapura, E. L. Sarin, T. Ploetz, M. A. Clements, and I. Essa (2016), “Automated video-based assessment of surgical skills for training and evaluation in medical schools,” International Journal of Computer Assisted Radiology and Surgery, vol. 11, iss. 9, pp. 1623-1636, 2016. [WEBSITE] [DOI][BIBTEX]
@Article{ 2016-Zia-AVASSTEMS,
author = {Zia, Aneeq and Sharma, Yachna and Bettadapura,
Vinay and Sarin, Eric L and Ploetz, Thomas and
Clements, Mark A and Essa, Irfan},
doi = {10.1007/s11548-016-1468-2},
journal = {International Journal of Computer Assisted
Radiology and Surgery},
month = {September},
number = {9},
pages = {1623--1636},
publisher = {Springer Berlin Heidelberg},
title = {Automated video-based assessment of surgical skills
for training and evaluation in medical schools},
url = {http://link.springer.com/article/10.1007/s11548-016-1468-2},
volume = {11},
year = {2016}
}
Abstract
[Figure: Sample frames from our video dataset]
Purpose: Routine evaluation of basic surgical skills in medical schools requires considerable time and effort from supervising faculty. For each surgical trainee, a supervisor has to observe the trainee in person. Alternatively, supervisors may use training videos, which reduces some of the logistical overhead. All of these approaches, however, are still highly time-consuming and subject to human bias. In this paper, we present an automated system for surgical skills assessment by analyzing video data of surgical activities.
Method: We compare different techniques for video-based surgical skill evaluation. We use techniques that capture motion information at a coarser granularity using symbols or words, extract motion dynamics using textural patterns in a frame kernel matrix, and analyze fine-grained motion information using frequency analysis.
Results: We were able to classify surgeons into different skill levels with high accuracy. Our results indicate that fine-grained analysis of motion dynamics via frequency analysis is most effective in capturing the skill-relevant information in surgical videos.
Conclusion: Our evaluations show that frequency features perform better than motion-texture features, which in turn perform better than symbol/word-based features. Put succinctly, skill classification accuracy increases with the granularity of the motion features, as demonstrated by our results on two challenging video datasets.
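As a hedged illustration of the coarsest of the three feature families compared above, a bag of motion words (this is a generic sketch, not the paper's exact pipeline), the code below quantizes per-frame motion descriptors with k-means and represents a video as a histogram over the learned vocabulary. The vocabulary size and descriptor dimensionality are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical per-frame motion descriptors from a set of training videos,
# stacked into one matrix (dimensions are placeholders).
train_descriptors = np.random.randn(5000, 32)

# Learn a "motion word" vocabulary by clustering the descriptors.
vocab_size = 50
vocab = KMeans(n_clusters=vocab_size, n_init=10).fit(train_descriptors)

def bag_of_motion_words(video_descriptors):
    """Histogram of motion-word occurrences for one video (coarse representation)."""
    words = vocab.predict(video_descriptors)
    hist = np.bincount(words, minlength=vocab_size).astype(float)
    return hist / hist.sum()

# One video's fixed-length feature vector, ready for any standard classifier.
video_feat = bag_of_motion_words(np.random.randn(300, 32))
```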
A. Zia, Y. Sharma, V. Bettadapura, E. Sarin, M. Clements, and I. Essa (2015), “Automated Assessment of Surgical Skills Using Frequency Analysis,” in International Conference on Medical Image Computing and Computer Assisted Interventions (MICCAI), 2015. [PDF][BIBTEX]
@InProceedings{ 2015-Zia-AASSUFA,
author = {A. Zia and Y. Sharma and V. Bettadapura and E.
Sarin and M. Clements and I. Essa},
booktitle = {International Conference on Medical Image Computing
and Computer Assisted Interventions (MICCAI)},
month = {October},
pdf = {http://www.cc.gatech.edu/~irfan/p/2015-Zia-AASSUFA.pdf},
title = {Automated Assessment of Surgical Skills Using
Frequency Analysis},
year = {2015}
}
Abstract
We present an automated framework for visual assessment of the expertise level of surgeons using the OSATS (Objective Structured Assessment of Technical Skills) criteria. A video analysis technique for extracting motion quality via frequency coefficients is introduced. The framework is tested in a case study involving analysis of videos of medical students with different expertise levels performing basic surgical tasks in a surgical training lab setting. We demonstrate that transforming the sequential time data into frequency components effectively extracts the information that differentiates between the surgeons' skill levels. The results show significant performance improvements using DFT and DCT coefficients over known state-of-the-art techniques.
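As a rough sketch of frequency-domain motion features (the paper's exact feature construction is not reproduced here), the code below keeps the low-order DCT coefficients and DFT magnitudes of a 1-D motion signal as a fixed-length descriptor; the number of coefficients kept is an assumption.

```python
import numpy as np
from scipy.fftpack import dct

def frequency_features(signal, k=20):
    """First k DCT and DFT-magnitude coefficients of a 1-D motion signal.

    Sketch only: k and the choice of low-order coefficients are
    illustrative assumptions, not the paper's settings.
    """
    signal = np.asarray(signal, dtype=float)
    dct_coeffs = dct(signal, type=2, norm='ortho')[:k]
    dft_coeffs = np.abs(np.fft.rfft(signal))[:k]
    return np.concatenate([dct_coeffs, dft_coeffs])

# Example: feature vector for a synthetic motion trace.
trace = np.cumsum(np.random.randn(1000))  # stand-in for a tracked motion signal
features = frequency_features(trace)       # feed to any standard classifier
```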
Participated in the Dagstuhl Workshop on “Modeling and Simulation of Sport Games, Sport Movements, and Adaptations to Training” at the Dagstuhl Castle, September 13 – 16, 2015.
Motivation
Computational modeling and simulation are essential for analyzing human motion and interaction in sports science. Applications range from game analysis and issues in training science, such as the training load-adaptation relationship and motor control & learning, to biomechanical analysis. The motivation of this seminar is to enable an interdisciplinary exchange between sports scientists and computer scientists to advance modeling and simulation technologies in selected fields of application: sport games, sport movements, and adaptations to training. In addition, contributions to the epistemic basics of modeling and simulation are welcome.
E. Thomaz, I. Essa, and G. D. Abowd (2015), “A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing,” in Proceedings of ACM International Conference on Ubiquitous Computing (UBICOMP), 2015. [PDF][BIBTEX]
@InProceedings{ 2015-Thomaz-PAREMWWIS,
author = {Edison Thomaz and Irfan Essa and Gregory D. Abowd},
booktitle = {Proceedings of ACM International Conference on
Ubiquitous Computing (UBICOMP)},
month = {September},
pdf = {http://www.cc.gatech.edu/~irfan/p/2015-Thomaz-PAREMWWIS.pdf},
title = {A Practical Approach for Recognizing Eating Moments
with Wrist-Mounted Inertial Sensing},
year = {2015}
}
Abstract
Recognizing when eating activities take place is one of the key challenges in automated food intake monitoring. Despite progress over the years, most proposed approaches have been largely impractical for everyday use, requiring multiple on-body sensors or specialized devices such as neck collars for swallow detection. In this paper, we describe the implementation and evaluation of an approach for inferring eating moments based on 3-axis accelerometry collected with a popular off-the-shelf smartwatch. Trained with data collected in a semi-controlled laboratory setting with 20 subjects, our system recognized eating moments in two free-living condition studies (7 participants, 1 day; 1 participant, 31 days), with F-scores of 76.1% (66.7% precision, 88.8% recall) and 71.3% (65.2% precision, 78.6% recall). This work represents a contribution towards a practical, automated system for everyday food intake monitoring, with applicability in areas ranging from health research to food journaling.
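For reference, the reported F-scores follow from the standard harmonic-mean definition; the quick check below reproduces the two numbers from the precision and recall values quoted above.

```python
def f1(precision, recall):
    # Standard F1 score: harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

print(f1(0.667, 0.888))  # ~0.761 -> 76.1% (7 participants, 1 day)
print(f1(0.652, 0.786))  # ~0.713 -> 71.3% (1 participant, 31 days)
```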
D. Castro, S. Hickson, V. Bettadapura, E. Thomaz, G. Abowd, H. Christensen, and I. Essa (2015), “Predicting Daily Activities from Egocentric Images Using Deep Learning,” in Proceedings of International Symposium on Wearable Computers (ISWC), 2015. [PDF][WEBSITE] [arXiv][BIBTEX]
@InProceedings{ 2015-Castro-PDAFEIUDL,
arxiv = {http://arxiv.org/abs/1510.01576},
author = {Daniel Castro and Steven Hickson and Vinay
Bettadapura and Edison Thomaz and Gregory Abowd and
Henrik Christensen and Irfan Essa},
booktitle = {Proceedings of International Symposium on Wearable
Computers (ISWC)},
month = {September},
pdf = {http://www.cc.gatech.edu/~irfan/p/2015-Castro-PDAFEIUDL.pdf},
title = {Predicting Daily Activities from Egocentric Images
Using Deep Learning},
url = {http://www.cc.gatech.edu/cpl/projects/dailyactivities/},
year = {2015}
}
Abstract
We present a method to analyze images taken from a passive egocentric wearable camera, along with contextual information such as time and day of the week, to learn and predict the everyday activities of an individual. We collected a dataset of 40,103 egocentric images over a 6-month period with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person’s activity across the 19 activity classes. We also demonstrate promising results from two additional users by fine-tuning the classifier with one day of training data.
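A minimal sketch of the late-fusion idea described above (not the authors' released code): combine the CNN's per-class softmax scores with simple contextual features such as hour of day and day of week, and train a second-stage classifier on the concatenation. The classifier choice, the time encodings, and the synthetic data below are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

NUM_CLASSES = 19  # activity classes, as in the paper

def fuse_features(cnn_probs, hour, weekday):
    """Concatenate CNN class probabilities with simple time context.

    cnn_probs: length-19 softmax output for one image (assumed given).
    hour, weekday: capture-time metadata for the same image.
    """
    # Cyclic encoding of time of day; one-hot day of week (assumed encodings).
    time_feat = [np.sin(2 * np.pi * hour / 24), np.cos(2 * np.pi * hour / 24)]
    day_feat = np.eye(7)[weekday]
    return np.concatenate([cnn_probs, time_feat, day_feat])

# Hypothetical training set: per-image CNN outputs plus context, with labels.
X = np.stack([fuse_features(np.random.dirichlet(np.ones(NUM_CLASSES)),
                            hour=np.random.randint(24),
                            weekday=np.random.randint(7))
              for _ in range(200)])
y = np.random.randint(NUM_CLASSES, size=200)

# Second-stage (fusion) classifier trained on the fused features.
fusion_clf = LogisticRegression(max_iter=1000).fit(X, y)
```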
E. Thomaz, C. Zhang, I. Essa, and G. D. Abowd (2015), “Inferring Meal Eating Activities in Real World Settings from Ambient Sounds: A Feasibility Study,” in Proceedings of ACM Conference on Intelligent User Interfaces (IUI), 2015. (Best Short Paper Award)[PDF][BIBTEX]
@InProceedings{ 2015-Thomaz-IMEARWSFASFS,
author = {Edison Thomaz and Cheng Zhang and Irfan Essa and
Gregory D. Abowd},
awards = {(Best Short Paper Award)},
booktitle = {Proceedings of ACM Conference on Intelligent User
Interfaces (IUI)},
month = {May},
pdf = {http://www.cc.gatech.edu/~irfan/p/2015-Thomaz-IMEARWSFASFS.pdf},
title = {Inferring Meal Eating Activities in Real World
Settings from Ambient Sounds: A Feasibility Study},
year = {2015}
}
Abstract
Dietary self-monitoring has been shown to be an effective method for weight loss, but it remains an onerous task despite recent advances in food journaling systems. Semi-automated food journaling can reduce the effort of logging, but it often requires that eating activities be detected automatically. In this work we describe results from a feasibility study conducted in the wild, where eating activities were inferred from ambient sounds captured with a wrist-mounted device; twenty participants wore the device during one day for an average of 5 hours while performing normal everyday activities. Our system was able to identify meal eating with an F-score of 79.8% in a person-dependent evaluation, and with 86.6% accuracy in a person-independent evaluation. Our approach is intended to be practical, leveraging off-the-shelf devices with audio sensing capabilities, in contrast to systems for automated dietary assessment based on specialized sensors.
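The person-dependent versus person-independent distinction above maps onto standard cross-validation choices. A minimal sketch of the person-independent (leave-one-participant-out) protocol follows, assuming precomputed audio feature vectors with a participant ID per sample; the classifier, feature dimensions, and data are placeholders, not the paper's setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# Hypothetical precomputed ambient-audio features with eating/non-eating labels
# and the participant each sample came from; shapes are placeholders.
X = np.random.randn(400, 40)              # audio feature vectors
y = np.random.randint(2, size=400)        # 1 = eating, 0 = not eating
participants = np.random.randint(20, size=400)

# Person-independent evaluation: hold out one participant at a time.
logo = LeaveOneGroupOut()
scores = cross_val_score(RandomForestClassifier(), X, y,
                         groups=participants, cv=logo)
print(scores.mean())
```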
V. Bettadapura, E. Thomaz, A. Parnami, G. Abowd, and I. Essa (2015), “Leveraging Context to Support Automated Food Recognition in Restaurants,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF][WEBSITE] [DOI] [arXiv][BIBTEX]
@InProceedings{ 2015-Bettadapura-LCSAFRR,
arxiv = {http://arxiv.org/abs/1510.02078},
author = {Vinay Bettadapura and Edison Thomaz and Aman
Parnami and Gregory Abowd and Irfan Essa},
booktitle = {Proceedings of IEEE Winter Conference on
Applications of Computer Vision (WACV)},
doi = {10.1109/WACV.2015.83},
month = {January},
pdf = {http://www.cc.gatech.edu/~irfan/p/2015-Bettadapura-LCSAFRR.pdf},
publisher = {IEEE Computer Society},
title = {Leveraging Context to Support Automated Food
Recognition in Restaurants},
url = {http://www.vbettadapura.com/egocentric/food/},
year = {2015}
}
Abstract
The pervasiveness of mobile cameras has resulted in a dramatic increase in food photos, pictures reflecting what people eat. In this paper, we study how taking pictures of what we eat in restaurants can be used to automate food journaling. We propose to leverage the context of where the picture was taken, along with additional information about the restaurant available online, coupled with state-of-the-art computer vision techniques, to recognize the food being consumed. To this end, we demonstrate image-based recognition of foods eaten in restaurants by training a classifier with images from restaurants’ online menu databases. We evaluate the performance of our system in unconstrained, real-world settings with food images taken in 10 restaurants across 5 different types of food (American, Indian, Italian, Mexican, and Thai).
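A minimal sketch of the context-leveraging step described above (the paper's actual pipeline, menus, and scores are not reproduced): given the geolocated restaurant's online menu, restrict the food classifier's decision to the dishes that restaurant serves. The dish names and scores below are hypothetical.

```python
# Hypothetical classifier scores over a large food vocabulary for one photo.
all_scores = {
    "pad thai": 0.31, "green curry": 0.22, "lasagna": 0.27,
    "tacos": 0.12, "samosa": 0.08,
}

# Menu of the restaurant inferred from the photo's location (assumed lookup).
menu = {"pad thai", "green curry", "tom yum soup"}

# Restrict the decision to dishes the restaurant actually serves.
candidates = {dish: s for dish, s in all_scores.items() if dish in menu}
prediction = max(candidates, key=candidates.get)
print(prediction)  # "pad thai"
```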
V. Bettadapura, I. Essa, and C. Pantofaru (2015), “Egocentric Field-of-View Localization Using First-Person Point-of-View Devices,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. (Best Paper Award)[PDF][WEBSITE] [DOI] [arXiv][BIBTEX]
@InProceedings{ 2015-Bettadapura-EFLUFPD,
arxiv = {http://arxiv.org/abs/1510.02073},
author = {Vinay Bettadapura and Irfan Essa and Caroline
Pantofaru},
awards = {(Best Paper Award)},
booktitle = {Proceedings of IEEE Winter Conference on
Applications of Computer Vision (WACV)},
doi = {10.1109/WACV.2015.89},
month = {January},
pdf = {http://www.cc.gatech.edu/~irfan/p/2015-Bettadapura-EFLUFPD.pdf},
publisher = {IEEE Computer Society},
title = {Egocentric Field-of-View Localization Using
First-Person Point-of-View Devices},
url = {http://www.vbettadapura.com/egocentric/localization/},
year = {2015}
}
Abstract
We present a technique that uses images, videos, and sensor data taken from first-person point-of-view devices to perform egocentric field-of-view (FOV) localization. We define egocentric FOV localization as capturing the visual information from a person’s field of view in a given environment and transferring this information onto a reference corpus of images and videos of the same space, hence determining what a person is attending to. Our method matches images and video taken from the first-person perspective with the reference corpus and refines the results using the person’s head orientation obtained from the device sensors. We demonstrate single- and multi-user egocentric FOV localization in different indoor and outdoor environments, with applications in augmented reality, event understanding, and the study of social interactions.
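The matching of first-person frames against a reference corpus can be illustrated with generic local-feature matching; the sketch below uses ORB features in OpenCV purely as a stand-in, not the descriptors or refinement used in the paper, and the file paths are placeholders.

```python
import cv2

# Placeholder paths to a first-person frame and one reference image.
query = cv2.imread("egocentric_frame.jpg", cv2.IMREAD_GRAYSCALE)
reference = cv2.imread("reference_view.jpg", cv2.IMREAD_GRAYSCALE)
assert query is not None and reference is not None, "supply real image files"

# Detect and describe local features (ORB chosen here only for illustration).
orb = cv2.ORB_create(nfeatures=1000)
kp_q, des_q = orb.detectAndCompute(query, None)
kp_r, des_r = orb.detectAndCompute(reference, None)

# Match descriptors; more good matches -> better candidate reference view.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_q, des_r), key=lambda m: m.distance)
print(len(matches), "matches")
# Score each reference image this way and keep the best; device-sensor head
# orientation could then prune or refine the candidate set.
```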
Y. Sharma, V. Bettadapura, T. Ploetz, N. Hammerla, S. Mellor, R. McNaney, P. Olivier, S. Deshmukh, A. McCaskie, and I. Essa (2014), “Video Based Assessment of OSATS Using Sequential Motion Textures,” in Proceedings of Workshop on Modeling and Monitoring of Computer Assisted Interventions (M2CAI), 2014. (Best Paper Honorable Mention Award)[PDF][BIBTEX]
@InProceedings{ 2014-Sharma-VBAOUSMT,
author = {Yachna Sharma and Vinay Bettadapura and Thomas
Ploetz and Nils Hammerla and Sebastian Mellor and
Roisin McNaney and Patrick Olivier and Sandeep
Deshmukh and Andrew McCaskie and Irfan Essa},
awards = {(Best Paper Honorable Mention Award)},
booktitle = {{Proceedings of Workshop on Modeling and Monitoring
of Computer Assisted Interventions (M2CAI)}},
month = {September},
pdf = {http://www.cc.gatech.edu/~irfan/p/2014-Sharma-VBAOUSMT.pdf},
title = {Video Based Assessment of OSATS Using Sequential
Motion Textures},
year = {2014}
}
Abstract
A fully automated framework for video-based surgical skill assessment is presented that incorporates the sequential and qualitative aspects of surgical motion in a data-driven manner. The Objective Structured Assessment of Technical Skills (OSATS) assessment is replicated, providing both an overall and a detailed evaluation of the basic suturing skills required of surgeons. Video analysis techniques are introduced that incorporate sequential motion aspects into motion textures. Significant performance improvement over standard bag-of-words and motion analysis approaches is demonstrated. The framework is evaluated in a case study that involved medical students with varying levels of expertise performing basic surgical tasks in a surgical training lab setting.
This paper was awarded the Best Paper Honorable Mention (2nd place) Award at M2CAI 2014.