Paper (WACV 2016) “Discovering Picturesque Highlights from Egocentric Vacation Videos”

March 7th, 2016 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Daniel Castro, PAMI/ICCV/CVPR/ECCV, Vinay Bettadapura


  • D. Castro, V. Bettadapura, and I. Essa (2016), “Discovering Picturesque Highlights from Egocentric Vacation Video,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2016. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2016-Castro-DPHFEVV,
      arxiv    = {},
      author  = {Daniel Castro and Vinay Bettadapura and Irfan
          Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      month    = {March},
      pdf    = {},
      title    = {Discovering Picturesque Highlights from Egocentric
          Vacation Video},
      url    = {},
      year    = {2016}
    }


We present an approach for identifying picturesque highlights from large amounts of egocentric video data. Given a set of egocentric videos captured over the course of a vacation, our method analyzes the videos and looks for images that have good picturesque and artistic properties. We introduce novel techniques to automatically determine aesthetic features such as composition, symmetry, and color vibrancy in egocentric videos and rank the video frames based on their photographic qualities to generate highlights. Our approach also uses contextual information such as GPS, when available, to assess the relative importance of each geographic location where the vacation videos were shot. Furthermore, we specifically leverage the properties of egocentric videos to improve our highlight detection. We demonstrate results on a new egocentric vacation dataset which includes 26.5 hours of videos taken over a 14-day vacation that spans many famous tourist destinations, and also provide results from a user study to assess our results.
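The abstract names the aesthetic features only at a high level. As a rough illustration (an assumption on our part, not the paper's actual feature), a color-vibrancy score for a frame could be computed from mean pixel saturation weighted by brightness:

```python
import numpy as np

def color_vibrancy(frame):
    """Score a frame's color vibrancy in [0, 1] from mean saturation.

    frame: H x W x 3 float array with RGB values in [0, 1].
    This is an illustrative heuristic, not the paper's exact feature.
    """
    mx = frame.max(axis=2)
    mn = frame.min(axis=2)
    # Saturation as in HSV: (max - min) / max, defined as 0 for black pixels.
    sat = np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-8), 0.0)
    # Weight by brightness so dark, noisy pixels contribute less.
    return float((sat * mx).mean())

# A pure-red frame is maximally vibrant; a gray frame scores zero.
red = np.zeros((4, 4, 3)); red[..., 0] = 1.0
gray = np.full((4, 4, 3), 0.5)
print(color_vibrancy(red), color_vibrancy(gray))
```

Frames could then be ranked by combining several such scores, as the abstract describes for composition and symmetry as well.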



Paper in ISWC 2015: “Predicting Daily Activities from Egocentric Images Using Deep Learning”

September 7th, 2015 Irfan Essa Posted in Activity Recognition, Daniel Castro, Gregory Abowd, Henrik Christensen, ISWC, Machine Learning, Papers, Steven Hickson, Ubiquitous Computing, Vinay Bettadapura


  • D. Castro, S. Hickson, V. Bettadapura, E. Thomaz, G. Abowd, H. Christensen, and I. Essa (2015), “Predicting Daily Activities from Egocentric Images Using Deep Learning,” in Proceedings of International Symposium on Wearable Computers (ISWC), 2015. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2015-Castro-PDAFEIUDL,
      arxiv    = {},
      author  = {Daniel Castro and Steven Hickson and Vinay
          Bettadapura and Edison Thomaz and Gregory Abowd and
          Henrik Christensen and Irfan Essa},
      booktitle  = {Proceedings of International Symposium on Wearable
          Computers (ISWC)},
      month    = {September},
      pdf    = {},
      title    = {Predicting Daily Activities from Egocentric Images
          Using Deep Learning},
      url    = {},
      year    = {2015}
    }


We present a method to analyze images taken from a passive egocentric wearable camera, along with contextual information such as time and day of the week, to learn and predict the everyday activities of an individual. We collected a dataset of 40,103 egocentric images over a 6-month period with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person’s activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.
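As a hedged sketch of the late-fusion idea (the fixed weight and the toy probability vectors are illustrative assumptions; the paper's ensemble learns the fusion), image-based and context-based class probabilities can be combined as a weighted mixture:

```python
import numpy as np

def late_fusion(cnn_probs, context_probs, weight=0.5):
    """Fuse CNN class probabilities with a contextual prior.

    Both inputs are length-C probability vectors; `weight` trades off
    image evidence against context. Illustrative only -- the paper's
    late fusion ensemble learns this combination from data.
    """
    fused = weight * cnn_probs + (1.0 - weight) * context_probs
    return fused / fused.sum()

cnn = np.array([0.6, 0.3, 0.1])   # the image alone favors class 0
ctx = np.array([0.1, 0.8, 0.1])   # at this time of day, class 1 is typical
print(np.argmax(late_fusion(cnn, ctx)))
```

The point of the example: weak image evidence can be overturned by a strong contextual prior, which is how time and day of the week raise accuracy.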


Participated in the KAUST Conference on Computational Imaging and Vision 2015

March 1st, 2015 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Daniel Castro, Presentations

I was invited to participate and present at the King Abdullah University of Science & Technology (KAUST) Conference on Computational Imaging and Vision (CIV).

March 1-4, 2015
Building 19 Level 3, Lecture Halls
Visual Computing Center (VCC)

Invited Speakers included

  • Shree Nayar – Columbia University
  • Daniel Cremers – Technical University of Munich
  • Rene Vidal – The Johns Hopkins University
  • Wolfgang Heidrich – VCC, KAUST
  • Jingyi Yu – University of Delaware
  • Irfan Essa – The Georgia Institute of Technology
  • Mubarak Shah – University of Central Florida
  • Larry Davis – University of Maryland
  • David Forsyth – University of Illinois
  • Gordon Wetzstein – Stanford University
  • Brian Barsky – University of California
  • Yi Ma – ShanghaiTech University
  • etc.

This event was hosted by the Visual Computing Center (Wolfgang Heidrich, Bernard Ghanem, Ganesh Sundaramoorthi).

Daniel Castro also attended and presented a poster at the meeting.



At ICVSS (International Computer Vision Summer School) 2013, in Calabria, ITALY (July 2013)

July 11th, 2013 Irfan Essa Posted in Computational Photography, Computational Photography and Video, Daniel Castro, Matthias Grundmann, Presentations, S. Hussain Raza, Vivek Kwatra

Teaching at the ICVSS 2013, in Calabria, Italy, July 2013 (Programme)

Computational Video: Post-processing Methods for Stabilization, Retargeting and Segmentation

Irfan Essa
(This work in collaboration with
Matthias Grundmann, Daniel Castro, Vivek Kwatra, Mei Han, S. Hussain Raza).


We address a variety of challenges in the analysis and enhancement of computational video. We present novel post-processing methods to bridge the quality gap between professional footage and the casually shot videos most often seen on online sites. Our research presents solutions to three well-defined problems: (1) video stabilization and rolling shutter removal in casually shot, uncalibrated videos; (2) content-aware video retargeting; and (3) spatio-temporal video segmentation to enable efficient video annotation. We showcase several real-world applications building on these techniques.

We start by proposing a novel algorithm for video stabilization that generates stabilized videos by employing L1-optimal camera paths to remove undesirable motions. We compute camera paths that are optimally partitioned into constant, linear, and parabolic segments, mimicking the camera motions employed by professional cinematographers. To achieve this, we propose a linear programming framework to minimize the first, second, and third derivatives of the resulting camera path. Our method allows for video stabilization beyond conventional filtering, which only suppresses high-frequency jitter. An additional challenge in videos shot from mobile phones is rolling shutter distortion. Modern CMOS cameras capture the frame one scanline at a time, which results in non-rigid image distortions such as shear and wobble. We propose a solution based on a novel mixture model of homographies parametrized by scanline blocks to correct these rolling shutter distortions. Our method does not rely on a priori knowledge of the readout time, nor does it require prior camera calibration. Our novel video stabilization and calibration-free rolling shutter removal have been deployed on YouTube, where they have successfully stabilized millions of videos. We also discuss several extensions to the stabilization algorithm and present technical details behind the widely used YouTube Video Stabilizer.
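A minimal sketch of the L1 camera-path optimization, assuming a 1-D path, a fixed crop-window half-width `w`, and only the first-derivative term (the paper's full LP also penalizes second and third derivatives to obtain constant, linear, and parabolic segments):

```python
import numpy as np
from scipy.optimize import linprog

def stabilize_path(c, w):
    """L1-smooth a 1-D camera path (illustrative sketch, not the full method).

    Minimizes the total first-derivative magnitude of the new path p,
    keeping each p[i] within +/- w of the original path c[i] (the crop
    window). Absolute values become slack variables e[i] in the LP.
    """
    n = len(c)
    m = n - 1                       # one slack per first difference
    cost = np.concatenate([np.zeros(n), np.ones(m)])
    A, b = [], []
    for i in range(m):
        # Encode |p[i+1] - p[i]| <= e[i] as two linear inequalities.
        for sign in (1.0, -1.0):
            row = np.zeros(n + m)
            row[i], row[i + 1], row[n + i] = -sign, sign, -1.0
            A.append(row); b.append(0.0)
    bounds = [(ci - w, ci + w) for ci in c] + [(0, None)] * m
    res = linprog(cost, A_ub=np.array(A), b_ub=np.array(b), bounds=bounds)
    return res.x[:n]

# A jittery path with one real camera move; the result stays within
# +/- 5 of the input while removing almost all of the jitter.
shaky = [0.0, 1.0, -1.0, 0.5, 0.0, 10.0, 9.0, 11.0, 10.0]
print(np.round(stabilize_path(shaky, w=5.0), 2))
```

Because the objective is an L1 norm, the optimal path is piecewise constant wherever the crop window allows it, which is exactly the "cinematographic" behavior the LP formulation is after.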

We address the challenge of changing the aspect ratio of videos by proposing algorithms that retarget videos to fit the form factor of a given device without stretching or letter-boxing. Our approaches use all of the screen’s pixels while striving to deliver as much of the original video content as possible. First, we introduce a new algorithm that uses discontinuous seam-carving in both space and time for resizing videos. Our algorithm relies on a novel appearance-based temporal coherence formulation that allows for frame-by-frame processing and results in temporally discontinuous seams, as opposed to geometrically smooth and continuous seams. Second, we present a technique that builds on the above-mentioned video stabilization approach. We effectively automate classical pan-and-scan techniques by smoothly guiding a virtual crop window via saliency constraints.
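For context, the classic per-frame seam-carving dynamic program that the discontinuous formulation builds on can be sketched as follows (illustrative only; the paper's contribution is the appearance-based temporal coherence that lets seams jump between frames):

```python
import numpy as np

def min_vertical_seam(energy):
    """Find the minimum-energy vertical seam in an energy map (H x W).

    Standard dynamic program: cumulative cost M[i, j] = energy[i, j] +
    min of the three neighbors in the row above, then backtrack from
    the cheapest bottom cell. One column index is returned per row.
    """
    h, w = energy.shape
    M = energy.astype(float).copy()
    for i in range(1, h):
        left = np.r_[np.inf, M[i - 1, :-1]]
        right = np.r_[M[i - 1, 1:], np.inf]
        M[i] += np.minimum(np.minimum(left, M[i - 1]), right)
    seam = [int(np.argmin(M[-1]))]
    for i in range(h - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam.append(lo + int(np.argmin(M[i, lo:hi])))
    return seam[::-1]          # seam[i] = column removed in row i

# The cheap (low-energy) cells form a zig-zag, which the seam follows.
energy = np.array([[9, 1, 9],
                   [9, 9, 1],
                   [9, 1, 9]], dtype=float)
print(min_vertical_seam(energy))
```

Removing one such seam per frame narrows the video by one pixel column; the discontinuous variant scores seams frame by frame instead of forcing a geometrically continuous surface through time.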

Finally, we introduce an efficient and scalable technique for spatio-temporal segmentation of long video sequences using a hierarchical graph-based algorithm. We begin by over-segmenting a volumetric video graph into space-time regions grouped by appearance. We then construct a region graph over the obtained segmentation and iteratively repeat this process over multiple levels to create a tree of spatio-temporal segmentations. This hierarchical approach generates high-quality segmentations and allows subsequent applications to choose from varying levels of granularity. We demonstrate the use of spatio-temporal segmentation as users interact with the video, enabling efficient annotation of objects within the video.
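A toy sketch of the hierarchical merging idea, assuming a 1-D signal and mean-value similarity in place of the paper's spatio-temporal video graph and appearance histograms:

```python
def hierarchical_segment(values, thresholds):
    """Toy hierarchical segmentation of a 1-D signal.

    At each level, adjacent regions whose mean values differ by less
    than that level's threshold are merged, and the coarser result
    feeds the next level -- mirroring the paper's idea of building a
    region graph over the previous segmentation. Returns the list of
    region sizes produced at each level.
    """
    regions = [[v] for v in values]          # each region holds its member values
    levels = []
    for tau in thresholds:
        merged = [regions[0]]
        for reg in regions[1:]:
            prev = merged[-1]
            if abs(sum(prev) / len(prev) - sum(reg) / len(reg)) < tau:
                prev.extend(reg)             # merge into the previous region
            else:
                merged.append(reg)
        regions = merged
        levels.append([len(r) for r in regions])
    return levels

# Fine level keeps three groups; the coarse level merges the two
# similar ones, leaving the outlier region separate.
signal = [1, 1, 2, 8, 9, 9, 20]
print(hierarchical_segment(signal, thresholds=[1.5, 8.0]))
```

Each level of the output corresponds to one node depth in the segmentation tree, which is what lets downstream applications pick their granularity.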

Part of this talk will expose attendees to the Video Stabilizer on YouTube and to our video segmentation system. Please find appropriate videos to test the systems.

Part of the work described above was done at Google, where Matthias Grundmann, Vivek Kwatra, and Mei Han are, and where Professor Essa works as a consultant. Other parts stem from the research of Matthias Grundmann, Daniel Castro, and S. Hussain Raza as students at GA Tech.


Summer in Barcelona. Teaching classes and such.

May 13th, 2013 Irfan Essa Posted in CnJ, Computational Photography, Daniel Castro, Study Abroad

I am here in Barcelona for my 4th summer, participating in the Georgia Tech, College of Computing’s International Study Abroad Programs in Barcelona, Spain. I am teaching two classes and spending time with 64 student participants, 4 teaching assistants, and 6 faculty from GA Tech, as well as with faculty at the Facultat d’Informàtica de Barcelona – UPC, our hosts here in Barcelona.

  • CS 4464: Computational Journalism: This class is aimed at understanding the computational and technological advancements in the area of journalism. Primary focus is on the study of technologies for developing new tools for (a) sense-making from diverse news information sources, (b) the impact of more and cheaper networked sensors (c) collaborative human models for information aggregation and sense-making, (d) mashups and the use of programming in journalism, (e) the impact of mobile computing and data gathering, (f) computational approaches to information quality, (g) data mining for personalization and aggregation, and (h) citizen journalism. 
  • CS 4475: Computational Photography: This class explores perceptual and technical aspects of pictures, and more precisely the capture and depiction of reality on a 2D medium. The scientific, perceptual, and artistic principles behind image-making will be emphasized. Topics include the relationship between pictorial techniques and the human visual system; intrinsic limitations of 2D representations and their possible compensations; and technical issues involving depiction. Technical aspects of image capture and rendering, and exploration of how such a medium can be used to its maximum potential, will be examined. Material from the recent Coursera offering of Computational Photography will be leveraged in this class.

Paper in IEEE ICCP 2012: “Calibration-Free Rolling Shutter Removal”

April 28th, 2012 Irfan Essa Posted in Computational Photography and Video, Daniel Castro, ICCP, Matthias Grundmann, Vivek Kwatra

Calibration-Free Rolling Shutter Removal

  • M. Grundmann, V. Kwatra, D. Castro, and I. Essa (2012), “Calibration-Free Rolling Shutter Removal,” in Proceedings of IEEE Conference on Computational Photography (ICCP), 2012. (Best Paper Award) [PDF] [WEBSITE] [VIDEO] [DOI] [BLOG] [BIBTEX]
    @InProceedings{    2012-Grundmann-CRSR,
      author  = {Matthias Grundmann and Vivek Kwatra and Daniel
          Castro and Irfan Essa},
      awards  = {(Best Paper Award)},
      blog    = {},
      booktitle  = {Proceedings of IEEE Conference on Computational
          Photography (ICCP)},
      doi    = {10.1109/ICCPhot.2012.6215213},
      pdf    = {},
      publisher  = {IEEE Computer Society},
      title    = {Calibration-Free Rolling Shutter Removal},
      url    = {},
      video    = {},
      year    = {2012}
    }


We present a novel algorithm for efficient removal of rolling shutter distortions in uncalibrated streaming videos. Our proposed method is calibration-free, as it does not need any knowledge of the camera used, nor does it require calibration using specially recorded calibration sequences. Our algorithm can perform rolling shutter removal under varying focal lengths, as in videos from CMOS cameras equipped with an optical zoom. We evaluate our approach across a broad range of cameras and video sequences, demonstrating robustness, scalability, and repeatability. We also conducted a user study, which demonstrates a preference for the output of our algorithm over other state-of-the-art methods. Our algorithm is computationally efficient, easy to parallelize, and robust to challenging artifacts introduced by various cameras with differing technologies.
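As a simplified illustration of the mixture-model idea (using per-block translations in place of the paper's full homographies; the function names and numbers are illustrative assumptions), each point's correction can be interpolated across scanlines from the motions estimated for neighboring scanline blocks:

```python
import numpy as np

def scanline_warp(points, block_motions, height):
    """Warp points using per-scanline motion interpolated between blocks.

    block_motions[k] is the 2-D motion estimated for the k-th scanline
    block of a `height`-row frame; each point's correction is linearly
    interpolated from the two nearest block centers, so the applied
    motion varies smoothly down the frame -- the essence of modeling
    rolling shutter with a per-scanline mixture. The paper estimates
    full homographies per block; translations keep this sketch short.
    """
    motions = np.asarray(block_motions, dtype=float)
    k = len(motions)
    centers = (np.arange(k) + 0.5) * height / k    # block-center rows
    out = []
    for x, y in points:
        dx = np.interp(y, centers, motions[:, 0])
        dy = np.interp(y, centers, motions[:, 1])
        out.append((x - dx, y - dy))               # undo the estimated motion
    return out

# Two blocks in a 100-row frame: top rows shifted 2 px right, bottom 6 px,
# as a fast pan would produce with a rolling shutter.
motions = [(2.0, 0.0), (6.0, 0.0)]
print(scanline_warp([(10.0, 25.0), (10.0, 75.0), (10.0, 50.0)], motions, 100))
```

Because later scanlines are exposed later, they accumulate more of the camera's motion; interpolating the correction by row is what removes the shear and wobble the abstract describes.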

Presented at IEEE International Conference on Computational Photography, Seattle, WA, April 27-29, 2012.


