Paper (WACV 2016) “Discovering Picturesque Highlights from Egocentric Vacation Video”

Paper

  • D. Castro, V. Bettadapura, and I. Essa (2016), “Discovering Picturesque Highlights from Egocentric Vacation Video,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2016. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{2016-Castro-DPHFEVV,
      author    = {Daniel Castro and Vinay Bettadapura and Irfan Essa},
      title     = {Discovering Picturesque Highlights from Egocentric Vacation Video},
      booktitle = {Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV)},
      month     = {March},
      year      = {2016},
      arxiv     = {http://arxiv.org/abs/1601.04406},
      pdf       = {http://www.cc.gatech.edu/~irfan/p/2016-Castro-DPHFEVV.pdf},
      url       = {http://www.cc.gatech.edu/cpl/projects/egocentrichighlights/}
    }

Abstract

We present an approach for identifying picturesque highlights from large amounts of egocentric video data. Given a set of egocentric videos captured over the course of a vacation, our method analyzes the videos and looks for images that have good picturesque and artistic properties. We introduce novel techniques to automatically determine aesthetic features such as composition, symmetry, and color vibrancy in egocentric videos and rank the video frames based on their photographic qualities to generate highlights. Our approach also uses contextual information such as GPS, when available, to assess the relative importance of each geographic location where the vacation videos were shot. Furthermore, we specifically leverage the properties of egocentric videos to improve our highlight detection. We demonstrate results on a new egocentric vacation dataset, which includes 26.5 hours of video taken over a 14-day vacation that spans many famous tourist destinations, and also provide results from a user study to assess our results.
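
To make the frame-ranking idea concrete, here is a minimal Python sketch of one such aesthetic feature, scoring color vibrancy as brightness-weighted saturation and ranking frames by it. The scoring function is an illustrative proxy, not the paper's actual feature computation:

    import cv2
    import numpy as np

    def vibrancy_score(frame_bgr):
        # Saturation weighted by brightness is a crude stand-in for
        # "color vibrancy" (hypothetical proxy, not the paper's feature).
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV).astype(np.float32) / 255.0
        saturation, value = hsv[..., 1], hsv[..., 2]
        return float((saturation * value).mean())

    def rank_frames_by_vibrancy(frames):
        # Return frame indices sorted from most to least vibrant;
        # top-ranked frames are candidate highlights.
        scores = [vibrancy_score(f) for f in frames]
        return sorted(range(len(frames)), key=lambda i: -scores[i])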

 

Categories: Computational Photography and Video, Computer Vision, Daniel Castro, PAMI/ICCV/CVPR/ECCV, Vinay Bettadapura | Date: March 7th, 2016 | By: Irfan Essa




Spring 2016 Teaching

My teaching activities for Spring 2016 are:

Categories: Computational Photography, Computational Photography and Video, Computer Vision | Date: January 10th, 2016 | By: Irfan Essa




Paper in MICCAI (2015): “Automated Assessment of Surgical Skills Using Frequency Analysis”

Paper

  • A. Zia, Y. Sharma, V. Bettadapura, E. Sarin, M. Clements, and I. Essa (2015), “Automated Assessment of Surgical Skills Using Frequency Analysis,” in International Conference on Medical Image Computing and Computer Assisted Interventions (MICCAI), 2015. [PDF] [BIBTEX]
    @InProceedings{2015-Zia-AASSUFA,
      author    = {A. Zia and Y. Sharma and V. Bettadapura and E. Sarin and M. Clements and I. Essa},
      title     = {Automated Assessment of Surgical Skills Using Frequency Analysis},
      booktitle = {International Conference on Medical Image Computing and Computer Assisted Interventions (MICCAI)},
      month     = {October},
      year      = {2015},
      pdf       = {http://www.cc.gatech.edu/~irfan/p/2015-Zia-AASSUFA.pdf}
    }

Abstract

We present an automated framework for visual assessment of the expertise level of surgeons using the OSATS (Objective Structured Assessment of Technical Skills) criteria. A video analysis technique for extracting motion quality via frequency coefficients is introduced. The framework is tested in a case study that involved analysis of videos of medical students with different expertise levels performing basic surgical tasks in a surgical training lab setting. We demonstrate that transforming the sequential time data into frequency components effectively extracts the useful information differentiating between different skill levels of the surgeons. The results show significant performance improvements using DFT and DCT coefficients over known state-of-the-art techniques.
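
As an illustration of the frequency-analysis idea, the sketch below computes DCT coefficients of a 1-D motion trace with SciPy; truncating to the first k coefficients yields a compact motion-quality descriptor. The signal choice, k, and feature design here are assumptions for illustration, not the paper's exact pipeline:

    import numpy as np
    from scipy.fftpack import dct

    def frequency_features(motion_signal, k=20):
        # Keep the first k DCT coefficients of a 1-D motion trace
        # (e.g., tool-tip speed over time). Low-order coefficients
        # summarize smooth, deliberate motion; energy in higher ones
        # indicates jerkier movement.
        coeffs = dct(np.asarray(motion_signal, dtype=float), norm='ortho')
        return coeffs[:k]

    # Example on two synthetic traces of different smoothness.
    t = np.linspace(0, 10, 500)
    smooth_trace = np.sin(t)                              # expert-like motion
    jerky_trace = np.sin(t) + 0.3 * np.random.randn(500)  # novice-like motion
    f_expert = frequency_features(smooth_trace)
    f_novice = frequency_features(jerky_trace)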

Categories: Activity Recognition, Aneeq Zia, Eric Sarin, Mark Clements, Medical, MICCAI, Papers, Vinay Bettadapura, Yachna Sharma | Date: October 6th, 2015 | By: Irfan Essa




2015 C+J Symposium

Data and computation drive our world, often without sufficient critical assessment or accountability. Journalism is adapting responsibly, finding and creating new kinds of stories that respond directly to our new societal condition. Join us for a two-day conference exploring the interface between journalism and computing. October 2-3, New York, NY. #CJ2015

Source: 2015 C+J Symposium

Participated in the 4th Computation+Journalism Symposium, October 2-3, in New York, NY, at The Brown Institute for Media Innovation, Pulitzer Hall, Columbia University. Keynotes were given by Lada Adamic (Facebook) and Chris Wiggins (Columbia, NYT), with two curated panels and five sessions of peer-reviewed papers.

Past symposia were held in

  • Atlanta, GA (CJ 2008, hosted by Georgia Tech),
  • Atlanta, GA (CJ 2013, hosted by Georgia Tech), and
  • NYC, NY (CJ 2014, hosted by Columbia U).

The next symposium will be hosted by Stanford in Palo Alto, CA.

Categories: Computational Journalism, Nick Diakopoulos | Date: October 2nd, 2015 | By: Irfan Essa




Presentation at Max-Planck-Institut für Informatik in Saarbrücken (2015): “Video Analysis and Enhancement”

Video Analysis and Enhancement: Spatio-Temporal Methods for Extracting Content from Videos and Enhancing Video Output

Irfan Essa (prof.irfanessa.com)

Georgia Institute of Technology
School of Interactive Computing

Hosted by the Max-Planck-Institut für Informatik in Saarbrücken (Bernt Schiele, Director of Computer Vision and Multimodal Computing)

Abstract 

In this talk, I will begin by describing the pervasiveness of image and video content and how such content is growing with the ubiquity of cameras. I will use this to motivate the need for better tools for the analysis and enhancement of video content. I will start with some of our earlier work on temporal modeling of video, then lead up to our current work and describe two main projects: (1) our approach for a video stabilizer, currently implemented and running on YouTube, and its extensions; and (2) a robust and scalable method for video segmentation.

I will describe, in some detail, our video stabilization method, which generates stabilized videos and is in wide use on YouTube, with millions of users. Then I will describe an efficient and scalable technique for spatiotemporal segmentation of long video sequences using a hierarchical graph-based algorithm, along with the videosegmentation.com site that we have developed to make this system available for wide use.

Finally, I will follow up with some recent work on image and video analysis in the mobile domain. I will also make some observations about the ubiquity of imaging and video in general and the need for better tools for video analysis.

Categories: Computational Journalism, Computational Photography and Video, Computer Vision, Presentations, Ubiquitous Computing | Date: September 14th, 2015 | By: Irfan Essa




Dagstuhl Workshop 2015: “Modeling and Simulation of Sport Games, Sport Movements, and Adaptations to Training”

Participated in the Dagstuhl Workshop on “Modeling and Simulation of Sport Games, Sport Movements, and Adaptations to Training” at Dagstuhl Castle, September 13-16, 2015.

Motivation

Computational modeling and simulation are essential to analyze human motion and interaction in sports science. Applications range from game analysis, issues in training science like training load-adaptation relationship, motor control & learning, to biomechanical analysis. The motivation of this seminar is to enable an interdisciplinary exchange between sports and computer scientists to advance modeling and simulation technologies in selected fields of applications: sport games, sport movements and adaptations to training. In addition, contributions to the epistemic basics of modeling and simulation are welcome.

Source: Schloss Dagstuhl : Seminar Homepage

Past seminars on this topic are listed on the seminar homepage.

Categories: Activity Recognition, Behavioral Imaging, Computer Vision, Human Factors, Modeling and Animation, Presentations | Date: September 13th, 2015 | By: Irfan Essa




Presentation at Max-Planck-Institute for Intelligent Systems in Tübingen (2015): “Data-Driven Methods for Video Analysis and Enhancement”

Data-Driven Methods for Video Analysis and Enhancement

Irfan Essa (prof.irfanessa.com)
Georgia Institute of Technology

Thursday, September 10, 2 pm,
Max Planck House Lecture Hall (Spemannstr. 36)
Hosted by the Max-Planck-Institute for Intelligent Systems (Michael Black, Director of Perceiving Systems)

Abstract

In this talk, I will begin by describing the pervasiveness of image and video content and how such content is growing with the ubiquity of cameras. I will use this to motivate the need for better tools for the analysis and enhancement of video content. I will start with some of our earlier work on temporal modeling of video, then lead up to our current work and describe two main projects: (1) our approach for a video stabilizer, currently implemented and running on YouTube, and its extensions; and (2) a robust and scalable method for video segmentation.

I will describe, in some detail, our video stabilization method, which generates stabilized videos and is in wide use. Our method allows for video stabilization beyond the conventional filtering that only suppresses high-frequency jitter. It also supports the removal of rolling-shutter distortions common in modern CMOS cameras, which capture the frame one scan line at a time, resulting in non-rigid image distortions such as shear and wobble. Our method does not rely on a priori knowledge and works on video from any camera or on legacy footage. I will showcase examples of this approach and also discuss how this method is launched and running on YouTube, with millions of users.
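
As a toy illustration of trajectory-smoothing stabilization (not the optimized camera-path method running on YouTube), the following sketch estimates per-frame translation with pyramidal Lucas-Kanade tracking, low-pass filters the accumulated camera path, and warps each frame by the residual:

    import cv2
    import numpy as np

    def stabilize(frames, radius=15):
        # Estimate per-frame translation by tracking corners with
        # pyramidal Lucas-Kanade optical flow.
        path = [np.zeros(2)]
        prev = cv2.cvtColor(frames[0], cv2.COLOR_BGR2GRAY)
        for frame in frames[1:]:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            pts = cv2.goodFeaturesToTrack(prev, maxCorners=200,
                                          qualityLevel=0.01, minDistance=30)
            nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, gray, pts, None)
            good = status.ravel() == 1
            shift = np.median((nxt - pts)[good], axis=0).ravel()
            path.append(path[-1] + shift)
            prev = gray
        path = np.array(path)
        # Low-pass filter the camera path with a moving average; the
        # production method optimizes the path rather than filtering it.
        kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
        smooth = np.column_stack([np.convolve(path[:, i], kernel, mode='same')
                                  for i in range(2)])
        # Warp each frame by the difference between smooth and actual path.
        h, w = frames[0].shape[:2]
        out = []
        for frame, (dx, dy) in zip(frames, smooth - path):
            M = np.float32([[1, 0, dx], [0, 1, dy]])
            out.append(cv2.warpAffine(frame, M, (w, h)))
        return out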

Then I will describe an efficient and scalable technique for spatiotemporal segmentation of long video sequences using a hierarchical graph-based algorithm. This hierarchical approach generates high-quality segmentations, and we demonstrate its use as users interact with the video, enabling efficient annotation of objects within the video. I will also show some recent work on how this segmentation and annotation can be used for dynamic scene understanding.
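
A minimal sketch of the core grouping step in graph-based segmentation follows: voxels (or supervoxels) are nodes, spatiotemporal neighbors are edges, and cheap edges are merged via union-find. The fixed merge threshold is a simplification; the actual algorithm uses an adaptive, per-region criterion and stacks levels into a hierarchy:

    import numpy as np

    class UnionFind:
        def __init__(self, n):
            self.parent = list(range(n))
        def find(self, x):
            while self.parent[x] != x:
                self.parent[x] = self.parent[self.parent[x]]  # path halving
                x = self.parent[x]
            return x
        def union(self, a, b):
            self.parent[self.find(a)] = self.find(b)

    def segment(features, edges, threshold=0.1):
        # `features`: (N, d) per-voxel color descriptors; `edges`: (i, j)
        # pairs connecting spatiotemporal neighbors. Merge regions joined
        # by a cheap edge, cheapest first. This is a single toy level with
        # a fixed threshold, not the full hierarchical algorithm.
        uf = UnionFind(len(features))
        cost = lambda e: np.linalg.norm(features[e[0]] - features[e[1]])
        for i, j in sorted(edges, key=cost):
            if cost((i, j)) < threshold:
                uf.union(i, j)
        return [uf.find(i) for i in range(len(features))]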

I will then follow up with some recent work on image and video analysis in the mobile domain. I will also make some observations about the ubiquity of imaging and video in general and the need for better tools for video analysis.

Categories: Computational Photography and Video, Computer Vision, Machine Learning, Presentations | Date: September 10th, 2015 | By: Irfan Essa




Paper in Ubicomp 2015: “A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing”

Paper

  • E. Thomaz, I. Essa, and G. D. Abowd (2015), “A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing,” in Proceedings of ACM International Conference on Ubiquitous Computing (UBICOMP), 2015. [PDF] [BIBTEX]
    @InProceedings{2015-Thomaz-PAREMWWIS,
      author    = {Edison Thomaz and Irfan Essa and Gregory D. Abowd},
      title     = {A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing},
      booktitle = {Proceedings of ACM International Conference on Ubiquitous Computing (UBICOMP)},
      month     = {September},
      year      = {2015},
      pdf       = {http://www.cc.gatech.edu/~irfan/p/2015-Thomaz-PAREMWWIS.pdf}
    }

Abstract

Recognizing when eating activities take place is one of the key challenges in automated food-intake monitoring. Despite progress over the years, most proposed approaches have been largely impractical for everyday use, requiring multiple on-body sensors or specialized devices such as neck collars for swallow detection. In this paper, we describe the implementation and evaluation of an approach for inferring eating moments based on 3-axis accelerometry collected with a popular off-the-shelf smartwatch. Trained with data collected in a semi-controlled laboratory setting with 20 subjects, our system recognized eating moments in two free-living condition studies (7 participants, 1 day; 1 participant, 31 days), with F-scores of 76.1% (66.7% precision, 88.8% recall) and 71.3% (65.2% precision, 78.6% recall). This work represents a contribution towards the implementation of a practical, automated system for everyday food-intake monitoring, with applicability in areas ranging from health research to food journaling.
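
A minimal sketch of this kind of pipeline, with assumed sampling rate, window length, features, and classifier (the paper's exact configuration differs):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def window_features(accel, fs=25, win_s=6):
        # Slice a (T, 3) wrist accelerometer stream into non-overlapping
        # windows and compute simple per-axis statistics. The sampling
        # rate, window length, and features are illustrative assumptions.
        n = fs * win_s
        feats = []
        for start in range(0, len(accel) - n + 1, n):
            w = accel[start:start + n]
            feats.append(np.concatenate([
                w.mean(axis=0),                            # average posture
                w.std(axis=0),                             # movement intensity
                np.abs(np.diff(w, axis=0)).mean(axis=0),   # jerkiness
            ]))
        return np.array(feats)

    # Train on labeled lab data, then flag eating vs. non-eating windows
    # in free-living data (arrays and labels below are placeholders).
    clf = RandomForestClassifier(n_estimators=100)
    # clf.fit(window_features(lab_accel), lab_labels)
    # predictions = clf.predict(window_features(free_living_accel))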

Categories: ACM UIST/CHI, Activity Recognition, Behavioral Imaging, Edison Thomaz, Gregory Abowd, Health Systems, Machine Learning, Mobile Computing, Papers, UBICOMP, Ubiquitous Computing | Date: September 8th, 2015 | By: Irfan Essa




Paper in ISWC 2015: “Predicting Daily Activities from Egocentric Images Using Deep Learning”

Paper

  • D. Castro, S. Hickson, V. Bettadapura, E. Thomaz, G. Abowd, H. Christensen, and I. Essa (2015), “Predicting Daily Activities from Egocentric Images Using Deep Learning,” in Proceedings of International Symposium on Wearable Computers (ISWC), 2015. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{2015-Castro-PDAFEIUDL,
      author    = {Daniel Castro and Steven Hickson and Vinay Bettadapura and Edison Thomaz and Gregory Abowd and Henrik Christensen and Irfan Essa},
      title     = {Predicting Daily Activities from Egocentric Images Using Deep Learning},
      booktitle = {Proceedings of International Symposium on Wearable Computers (ISWC)},
      month     = {September},
      year      = {2015},
      arxiv     = {http://arxiv.org/abs/1510.01576},
      pdf       = {http://www.cc.gatech.edu/~irfan/p/2015-Castro-PDAFEIUDL.pdf},
      url       = {http://www.cc.gatech.edu/cpl/projects/dailyactivities/}
    }

Abstract

We present a method to analyze images taken from a passive egocentric wearable camera, along with contextual information such as time and day of the week, to learn and predict the everyday activities of an individual. We collected a dataset of 40,103 egocentric images over a 6-month period with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person’s activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.
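
A minimal sketch of the late-fusion idea: average per-class probabilities from the image CNN with probabilities from a context model. The fusion weight and averaging scheme here are assumptions, not the paper's exact ensemble:

    import numpy as np

    def late_fusion_predict(cnn_probs, context_probs, w=0.7):
        # Weighted average of per-class probabilities from the image CNN
        # and from a simple context model (e.g., time of day, day of week).
        # The weight w is a hypothetical choice for illustration.
        fused = w * cnn_probs + (1.0 - w) * context_probs
        return fused.argmax(axis=-1)

    # Example with the paper's 19 activity classes (random stand-ins
    # for real model outputs, 4 test images).
    cnn_probs = np.random.dirichlet(np.ones(19), size=4)
    context_probs = np.random.dirichlet(np.ones(19), size=4)
    predicted_labels = late_fusion_predict(cnn_probs, context_probs)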

Categories: Activity Recognition, Daniel Castro, Gregory Abowd, Henrik Christensen, ISWC, Machine Learning, Papers, Steven Hickson, Ubiquitous Computing, Vinay Bettadapura | Date: September 7th, 2015 | By: Irfan Essa




Fall 2015 Teaching: Computer Vision and Computational Photography for Online MSCS.

In the Fall 2015 term, I am teaching two classes, both for Georgia Tech’s Online MSCS program: Computer Vision and Computational Photography.

Categories: Aaron Bobick, Computational Photography, Computer Vision | Date: August 15th, 2015 | By: Irfan Essa
