Paper (WACV 2016) “Discovering Picturesque Highlights from Egocentric Vacation Videos”

March 7th, 2016 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Daniel Castro, PAMI/ICCV/CVPR/ECCV, Vinay Bettadapura

Paper

  • D. Castro, V. Bettadapura, and I. Essa (2016), “Discovering Picturesque Highlights from Egocentric Vacation Video,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2016. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2016-Castro-DPHFEVV,
      arxiv    = {http://arxiv.org/abs/1601.04406},
      author  = {Daniel Castro and Vinay Bettadapura and Irfan
          Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      month    = {March},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2016-Castro-DPHFEVV.pdf},
      title    = {Discovering Picturesque Highlights from Egocentric
          Vacation Video},
      url    = {http://www.cc.gatech.edu/cpl/projects/egocentrichighlights/},
      year    = {2016}
    }

Abstract

We present an approach for identifying picturesque highlights from large amounts of egocentric video data. Given a set of egocentric videos captured over the course of a vacation, our method analyzes the videos and looks for images that have good picturesque and artistic properties. We introduce novel techniques to automatically determine aesthetic features such as composition, symmetry, and color vibrancy in egocentric videos and rank the video frames based on their photographic qualities to generate highlights. Our approach also uses contextual information such as GPS, when available, to assess the relative importance of each geographic location where the vacation videos were shot. Furthermore, we specifically leverage the properties of egocentric videos to improve our highlight detection. We demonstrate results on a new egocentric vacation dataset, which includes 26.5 hours of video taken over a 14-day vacation spanning many famous tourist destinations, and also provide results from a user study to assess our approach.
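
As a rough illustration of the kind of per-frame aesthetic scoring and ranking described above, the sketch below computes two toy proxies (color vibrancy and left-right symmetry) and ranks frames by a weighted score. The feature definitions and weights are illustrative assumptions, not the model from the paper.

    import numpy as np

    def color_vibrancy(frame):
        """Toy vibrancy proxy: mean saturation of an RGB frame in [0, 1]."""
        mx, mn = frame.max(axis=2), frame.min(axis=2)
        return np.where(mx > 0, (mx - mn) / (mx + 1e-8), 0.0).mean()

    def symmetry(frame):
        """Toy symmetry proxy: 1 minus the mean absolute difference between
        the left half and the mirrored right half."""
        w = frame.shape[1] // 2
        return 1.0 - np.abs(frame[:, :w] - frame[:, -w:][:, ::-1]).mean()

    def score_frame(frame, weights=(0.5, 0.5)):
        """Weighted sum of the illustrative aesthetic features (weights are made up)."""
        return weights[0] * color_vibrancy(frame) + weights[1] * symmetry(frame)

    def rank_highlights(frames, top_k=5):
        """Return indices of the top-k frames by aesthetic score."""
        scores = np.array([score_frame(f) for f in frames])
        return np.argsort(scores)[::-1][:top_k]

    # Toy usage on random frames; real input would be video frames as RGB arrays in [0, 1].
    frames = [np.random.rand(120, 160, 3) for _ in range(100)]
    print(rank_highlights(frames))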

 


Paper in MICCAI (2015): “Automated Assessment of Surgical Skills Using Frequency Analysis”

October 6th, 2015 Irfan Essa Posted in Activity Recognition, Aneeq Zia, Eric Sarin, Mark Clements, Medical, MICCAI, Papers, Vinay Bettadapura, Yachna Sharma

Paper

  • A. Zia, Y. Sharma, V. Bettadapura, E. Sarin, M. Clements, and I. Essa (2015), “Automated Assessment of Surgical Skills Using Frequency Analysis,” in International Conference on Medical Image Computing and Computer Assisted Interventions (MICCAI), 2015. [PDF] [BIBTEX]
    @InProceedings{    2015-Zia-AASSUFA,
      author  = {A. Zia and Y. Sharma and V. Bettadapura and E.
          Sarin and M. Clements and I. Essa},
      booktitle  = {International Conference on Medical Image Computing
          and Computer Assisted Interventions (MICCAI)},
      month    = {October},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Zia-AASSUFA.pdf},
      title    = {Automated Assessment of Surgical Skills Using
          Frequency Analysis},
      year    = {2015}
    }

Abstract

We present an automated framework for visual assessment of the expertise level of surgeons using the OSATS (Objective Structured Assessment of Technical Skills) criteria. We introduce a video analysis technique for extracting motion quality via frequency coefficients. The framework is tested in a case study involving analysis of videos of medical students with different expertise levels performing basic surgical tasks in a surgical training lab setting. We demonstrate that transforming the sequential time data into frequency components effectively extracts the information that differentiates between the skill levels of the surgeons. The results show significant performance improvements using DFT and DCT coefficients over known state-of-the-art techniques.
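
The sketch below illustrates the general idea of frequency-domain skill features: a 1-D motion signal is summarized by its leading DFT magnitudes and DCT coefficients, which are then fed to an off-the-shelf classifier. The number of coefficients, the synthetic data, and the choice of classifier here are assumptions for illustration, not the protocol from the paper.

    import numpy as np
    from scipy.fft import rfft, dct
    from sklearn.ensemble import RandomForestClassifier

    def frequency_features(signal, n_coeffs=16):
        """Concatenate the leading DFT magnitudes and DCT coefficients
        of a 1-D motion signal (e.g., one accelerometer or tool-position axis)."""
        dft_mag = np.abs(rfft(signal))[:n_coeffs]
        dct_coeffs = dct(signal, norm='ortho')[:n_coeffs]
        return np.concatenate([dft_mag, dct_coeffs])

    # Toy example: separate "smooth" from "jerky" synthetic motion traces.
    rng = np.random.default_rng(0)
    t = np.linspace(0, 10, 500)
    smooth = [np.sin(np.pi * t) + 0.05 * rng.standard_normal(t.size) for _ in range(40)]
    jerky = [np.sin(np.pi * t) + 0.5 * rng.standard_normal(t.size) for _ in range(40)]
    X = np.array([frequency_features(s) for s in smooth + jerky])
    y = np.array([0] * 40 + [1] * 40)

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print(clf.score(X, y))  # training accuracy on the toy data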


2015 C+J Symposium

October 2nd, 2015 Irfan Essa Posted in Computational Journalism, Nick Diakopoulos

Data and computation drive our world, often without sufficient critical assessment or accountability. Journalism is adapting responsibly, finding and creating new kinds of stories that respond directly to our new societal condition. Join us for a two-day conference exploring the interface between journalism and computing. October 2-3, New York, NY. #CJ2015

Source: 2015 C+J Symposium

Participated in the 4th Computation+Journalism Symposium, October 2-3, in New York, NY, at The Brown Institute for Media Innovation, Pulitzer Hall, Columbia University. Keynotes were given by Lada Adamic (Facebook) and Chris Wiggins (Columbia, NYT), with 2 curated panels and 5 sessions of peer-reviewed papers.

Past symposia were held in:

  • Atlanta, GA (CJ 2008, hosted by Georgia Tech),
  • Atlanta, GA (CJ 2013, hosted by Georgia Tech), and
  • New York, NY (CJ 2014, hosted by Columbia University).

The next symposium will be hosted by Stanford and held in Palo Alto, CA.

Paper in Ubicomp 2015: “A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing”

September 8th, 2015 Irfan Essa Posted in ACM UIST/CHI, Activity Recognition, Behavioral Imaging, Edison Thomaz, Gregory Abowd, Health Systems, Machine Learning, Mobile Computing, Papers, UBICOMP, Ubiquitous Computing

Paper

  • E. Thomaz, I. Essa, and G. D. Abowd (2015), “A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing,” in Proceedings of ACM International Conference on Ubiquitous Computing (UBICOMP), 2015. [PDF] [BIBTEX]
    @InProceedings{    2015-Thomaz-PAREMWWIS,
      author  = {Edison Thomaz and Irfan Essa and Gregory D. Abowd},
      booktitle  = {Proceedings of ACM International Conference on
          Ubiquitous Computing (UBICOMP)},
      month    = {September},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Thomaz-PAREMWWIS.pdf},
      title    = {A Practical Approach for Recognizing Eating Moments
          with Wrist-Mounted Inertial Sensing},
      year    = {2015}
    }

Abstract

Recognizing when eating activities take place is one of the key challenges in automated food intake monitoring. Despite progress over the years, most proposed approaches have been largely impractical for everyday usage, requiring multiple on-body sensors or specialized devices such as neck collars for swallow detection. In this paper, we describe the implementation and evaluation of an approach for inferring eating moments based on 3-axis accelerometry collected with a popular off-the-shelf smartwatch. Trained with data collected in a semi-controlled laboratory setting with 20 subjects, our system recognized eating moments in two free-living condition studies (7 participants, 1 day; 1 participant, 31 days), with F-scores of 76.1% (66.7% precision, 88.8% recall) and 71.3% (65.2% precision, 78.6% recall). This work represents a contribution toward the implementation of a practical, automated system for everyday food intake monitoring, with applicability in areas ranging from health research to food journaling.
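
For readers curious about the general shape of such a pipeline, the sketch below segments a 3-axis accelerometer stream into windows, computes simple per-window statistics, and trains a classifier; it also checks the reported F-score against the standard F1 definition. The window length, features, and classifier are assumptions, not the system from the paper.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def window_features(window):
        """Simple statistics over one (n_samples, 3) accelerometer window."""
        return np.concatenate([window.mean(axis=0),
                               window.std(axis=0),
                               np.abs(np.diff(window, axis=0)).mean(axis=0)])

    def segment(acc, win=150, step=75):
        """Slice a (n_samples, 3) accelerometer stream into overlapping windows."""
        return [acc[i:i + win] for i in range(0, len(acc) - win + 1, step)]

    # Toy data: pretend eating windows show more wrist-motion variance.
    rng = np.random.default_rng(1)
    eating, other = 1.5 * rng.standard_normal((3000, 3)), 0.5 * rng.standard_normal((3000, 3))
    windows = segment(eating) + segment(other)
    X = np.array([window_features(w) for w in windows])
    y = np.array([1] * len(segment(eating)) + [0] * len(segment(other)))
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    # The reported F-scores follow the usual definition, e.g. for the 7-person study:
    p, r = 0.667, 0.888
    print(2 * p * r / (p + r))  # ~0.76, matching the reported 76.1% F-score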


Paper in ISWC 2015: “Predicting Daily Activities from Egocentric Images Using Deep Learning”

September 7th, 2015 Irfan Essa Posted in Activity Recognition, Daniel Castro, Gregory Abowd, Henrik Christensen, ISWC, Machine Learning, Papers, Steven Hickson, Ubiquitous Computing, Vinay Bettadapura

Paper

  • D. Castro, S. Hickson, V. Bettadapura, E. Thomaz, G. Abowd, H. Christensen, and I. Essa (2015), “Predicting Daily Activities from Egocentric Images Using Deep Learning,” in Proceedings of International Symposium on Wearable Computers (ISWC), 2015. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2015-Castro-PDAFEIUDL,
      arxiv    = {http://arxiv.org/abs/1510.01576},
      author  = {Daniel Castro and Steven Hickson and Vinay
          Bettadapura and Edison Thomaz and Gregory Abowd and
          Henrik Christensen and Irfan Essa},
      booktitle  = {Proceedings of International Symposium on Wearable
          Computers (ISWC)},
      month    = {September},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Castro-PDAFEIUDL.pdf},
      title    = {Predicting Daily Activities from Egocentric Images
          Using Deep Learning},
      url    = {http://www.cc.gatech.edu/cpl/projects/dailyactivities/},
      year    = {2015}
    }

Abstract

We present a method to analyze images taken from a passive egocentric wearable camera, along with contextual information such as the time and day of the week, to learn and predict the everyday activities of an individual. We collected a dataset of 40,103 egocentric images over a 6-month period with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person’s activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.
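
A minimal sketch of one common form of late fusion is shown below: per-class scores from an image model (a stand-in for the CNN outputs) are blended with scores from a classifier trained on contextual features such as hour of day and day of week. The fusion weight and classifiers are placeholders; the paper's ensemble may combine the streams differently.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Late fusion: blend per-class scores from an image model (a stand-in for
    # CNN softmax outputs) with scores from a classifier over contextual features.
    rng = np.random.default_rng(0)
    n, n_classes = 500, 19
    cnn_probs = rng.dirichlet(np.ones(n_classes), size=n)   # placeholder CNN outputs
    context = np.column_stack([rng.integers(0, 24, n),      # hour of day
                               rng.integers(0, 7, n)])      # day of week
    y = rng.integers(0, n_classes, n)                       # toy activity labels

    ctx_clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(context, y)
    ctx_probs = ctx_clf.predict_proba(context)

    alpha = 0.7  # fusion weight; an assumption, not a value from the paper
    fused = alpha * cnn_probs + (1 - alpha) * ctx_probs
    print((fused.argmax(axis=1) == y).mean())               # toy accuracy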


Fall 2015 Teaching: Computer Vision and Computational Photography for Online MSCS.

August 15th, 2015 Irfan Essa Posted in Aaron Bobick, Computational Photography, Computer Vision

In the fall 2015 term, I am teaching two classes, both for Georgia Tech’s Online MSCS program: Computer Vision and Computational Photography.


Paper in ACM IUI15: “Inferring Meal Eating Activities in Real World Settings from Ambient Sounds: A Feasibility Study”

April 1st, 2015 Irfan Essa Posted in ACM ICMI/IUI, Activity Recognition, Audio Analysis, Behavioral Imaging, Edison Thomaz, Gregory Abowd, Health Systems, Machine Learning, Multimedia

Paper

  • E. Thomaz, C. Zhang, I. Essa, and G. D. Abowd (2015), “Inferring Meal Eating Activities in Real World Settings from Ambient Sounds: A Feasibility Study,” in Proceedings of ACM Conference on Intelligent User Interfaces (IUI), 2015. (Best Short Paper Award) [PDF] [BIBTEX]
    @InProceedings{    2015-Thomaz-IMEARWSFASFS,
      author  = {Edison Thomaz and Cheng Zhang and Irfan Essa and
          Gregory D. Abowd},
      awards  = {(Best Short Paper Award)},
      booktitle  = {Proceedings of ACM Conference on Intelligent User
          Interfaces (IUI)},
      month    = {May},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Thomaz-IMEARWSFASFS.pdf},
      title    = {Inferring Meal Eating Activities in Real World
          Settings from Ambient Sounds: A Feasibility Study},
      year    = {2015}
    }

Abstract

Dietary self-monitoring has been shown to be an effective method for weight loss, but it remains an onerous task despite recent advances in food journaling systems. Semi-automated food journaling can reduce the effort of logging, but often requires that eating activities be detected automatically. In this work we describe results from a feasibility study conducted in the wild where eating activities were inferred from ambient sounds captured with a wrist-mounted device; twenty participants wore the device during one day for an average of 5 hours while performing normal everyday activities. Our system was able to identify meal eating with an F-score of 79.8% in a person-dependent evaluation, and with 86.6% accuracy in a person-independent evaluation. Our approach is intended to be practical, leveraging off-the-shelf devices with audio sensing capabilities, in contrast to systems for automated dietary assessment based on specialized sensors.
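
The sketch below shows a generic version of such an audio pipeline: ambient sound is framed, each frame is summarized by log band energies, and a classifier separates eating from non-eating frames. The frame length, features, and classifier are illustrative assumptions rather than the system evaluated in the paper.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def band_energies(frame, n_bands=20):
        """Log energy in equally spaced frequency bands of one audio frame."""
        spectrum = np.abs(np.fft.rfft(frame)) ** 2
        return np.log(np.array([b.sum() for b in np.array_split(spectrum, n_bands)]) + 1e-10)

    def frame_audio(audio, frame_len=400, hop=200):
        """Split a mono audio signal into overlapping frames."""
        return [audio[i:i + frame_len] for i in range(0, len(audio) - frame_len + 1, hop)]

    # Toy data at 16 kHz: "eating" as broadband noise, "other" as a quiet tone.
    rng = np.random.default_rng(2)
    eating = rng.standard_normal(16000)
    other = 0.1 * np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
    X = np.array([band_energies(f) for f in frame_audio(eating) + frame_audio(other)])
    y = np.array([1] * len(frame_audio(eating)) + [0] * len(frame_audio(other)))
    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    print(clf.score(X, y))  # training accuracy on the toy frames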


Participated in the KAUST Conference on Computational Imaging and Vision 2015

March 1st, 2015 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Daniel Castro, Presentations

I was invited to participate and present at the King Abdullah University of Science & Technology (KAUST) Conference on Computational Imaging and Vision (CIV), held:

March 1-4, 2015
Building 19 Level 3, Lecture Halls
Visual Computing Center (VCC)

Invited speakers included:

  • Shree Nayar – Columbia University
  • Daniel Cremers – Technical University of Munich
  • Rene Vidal – The Johns Hopkins University
  • Wolfgang Heidrich – VCC, KAUST
  • Jingyi Yu – University of Delaware
  • Irfan Essa – The Georgia Institute of Technology
  • Mubarak Shah – University of Central Florida
  • Larry Davis – University of Maryland
  • David Forsyth – University of Illinois
  • Gordon Wetzstein – Stanford University
  • Brian Barsky – University of California
  • Yi Ma – ShanghaiTech University

among others.

This event was hosted by the Visual Computing Center (Wolfgang Heidrich, Bernard Ghanem, and Ganesh Sundaramoorthi).

Daniel Castro also attended and presented a poster at the meeting.



Paper in WACV (2015): “Semantic Instance Labeling Leveraging Hierarchical Segmentation”

January 6th, 2015 Irfan Essa Posted in Computer Vision, Henrik Christensen, PAMI/ICCV/CVPR/ECCV, Papers, Robotics, Steven Hickson

Paper

  • S. Hickson, I. Essa, and H. Christensen (2015), “Semantic Instance Labeling Leveraging Hierarchical Segmentation,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [DOI] [BIBTEX]
    @InProceedings{    2015-Hickson-SILLHS,
      author  = {Steven Hickson and Irfan Essa and Henrik
          Christensen},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.147},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Hickson-SILLHS.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Semantic Instance Labeling Leveraging Hierarchical
          Segmentation},
      year    = {2015}
    }

Abstract

Most approaches for indoor RGBD semantic labeling focus on using pixels or superpixels to train a classifier. In this paper, we implement a higher-level segmentation using a hierarchy of superpixels to obtain a better segmentation for training our classifier. By focusing on meaningful segments that conform more directly to objects, regardless of size, we train a random forest of decision trees as a classifier using simple features such as the 3D size, LAB color histogram, width, height, and shape as specified by a histogram of surface normals. We test our method on the NYU V2 depth dataset, a challenging dataset of cluttered indoor environments. Our experiments show that our method achieves state-of-the-art results on both the general semantic labeling introduced by the dataset (floor, structure, furniture, and objects) and a more object-specific semantic labeling. We show that training a classifier on a segmentation from a hierarchy of superpixels yields better results than training directly on superpixels, patches, or pixels as in previous work.
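
A minimal sketch of the per-segment classification idea is given below: each segment is described by its 3D extent and size, a LAB color histogram, and a histogram of surface normals, and a random forest is trained on these descriptors. The binning, normalization, and toy segments are assumptions; only the feature families and the classifier follow the abstract.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def segment_features(points_3d, lab_pixels, normals, bins=4):
        """Per-segment descriptor in the spirit of the abstract: 3D extent and size,
        a LAB color histogram, and a histogram of surface normals.
        All inputs are (N, 3) arrays for one segment; the binning is an assumption."""
        extent = points_3d.max(axis=0) - points_3d.min(axis=0)
        size = np.array([float(len(points_3d))])
        color_hist, _ = np.histogramdd(lab_pixels, bins=bins,
                                       range=[(0, 100), (-128, 127), (-128, 127)])
        normal_hist, _ = np.histogramdd(normals, bins=bins, range=[(-1, 1)] * 3)
        return np.concatenate([extent, size,
                               color_hist.ravel() / len(lab_pixels),
                               normal_hist.ravel() / len(normals)])

    # Toy training loop over fake segments with random labels.
    rng = np.random.default_rng(3)
    X = np.array([segment_features(rng.random((200, 3)),
                                   rng.uniform([0, -128, -128], [100, 127, 127], (200, 3)),
                                   rng.uniform(-1.0, 1.0, (200, 3)))
                  for _ in range(60)])
    y = rng.integers(0, 4, 60)  # e.g. floor / structure / furniture / objects
    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
    print(clf.score(X, y))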


Paper in IEEE WACV (2015): “Finding Temporally Consistent Occlusion Boundaries using Scene Layout”

January 6th, 2015 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers, S. Hussain Raza, Uncategorized

Paper

  • S. H. Raza, A. Humayun, M. Grundmann, D. Anderson, and I. Essa (2015), “Finding Temporally Consistent Occlusion Boundaries using Scene Layout,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [DOI] [BIBTEX]
    @InProceedings{    2015-Raza-FTCOBUSL,
      author  = {Syed Hussain Raza and Ahmad Humayun and Matthias
          Grundmann and David Anderson and Irfan Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.141},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Raza-FTCOBUSL.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Finding Temporally Consistent Occlusion Boundaries
          using Scene Layout},
      year    = {2015}
    }

Abstract

We present an algorithm for finding temporally consistent occlusion boundaries in videos to support segmentation of dynamic scenes. We learn occlusion boundaries in a pairwise Markov random field (MRF) framework. We first estimate the probability of a spatiotemporal edge being an occlusion boundary using appearance, flow, and geometric features. Next, we enforce occlusion boundary continuity in an MRF model by learning pairwise occlusion probabilities using a random forest. Then, we temporally smooth the boundaries to remove temporal inconsistencies in the occlusion boundary estimates. Our framework provides an efficient approach to finding temporally consistent occlusion boundaries in video by exploiting causality, the redundancy in videos, and the semantic layout of the scene. We have developed a dataset with fully annotated ground-truth occlusion boundaries for over 30 videos (∼5000 frames). This dataset is used to evaluate temporal occlusion boundaries and provides a much-needed baseline for future studies. We perform experiments to demonstrate the role of scene layout and temporal information for occlusion reasoning in videos of dynamic scenes.
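
As a small illustration of just the temporal-smoothing step mentioned above, the sketch below applies a centered moving average to per-boundary occlusion probabilities over time, assuming boundaries have already been matched across frames. The MRF with learned pairwise terms, which does the real work in the paper, is not reproduced here.

    import numpy as np

    def smooth_boundary_probabilities(probs, window=5):
        """Temporally smooth per-boundary occlusion probabilities with a centered
        moving average. `probs` is (n_frames, n_boundaries), assuming boundaries
        are already matched across frames (a simplification of the paper's MRF)."""
        kernel = np.ones(window) / window
        padded = np.pad(probs, ((window // 2, window // 2), (0, 0)), mode='edge')
        return np.stack([np.convolve(padded[:, j], kernel, mode='valid')
                         for j in range(probs.shape[1])], axis=1)

    # Toy example: noisy per-frame probabilities for 3 tracked boundaries.
    rng = np.random.default_rng(4)
    raw = np.clip(0.7 + 0.3 * rng.standard_normal((50, 3)), 0, 1)
    smoothed = smooth_boundary_probabilities(raw)
    print(raw.std(axis=0), smoothed.std(axis=0))  # smoothing reduces frame-to-frame jitter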
