Paper in AAAI’s ICWSM (2017) “Selfie-Presentation in Everyday Life: A Large-Scale Characterization of Selfie Contexts on Instagram”

May 18th, 2017 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Computer Vision, Face and Gesture, Julia Deeb-Swihart, Papers, Social Computing

Paper

  • J. Deeb-Swihart, C. Polack, E. Gilbert, and I. Essa (2017), “Selfie-Presentation in Everyday Life: A Large-Scale Characterization of Selfie Contexts on Instagram,” in Proceedings of the International AAAI Conference on Web and Social Media (ICWSM), 2017. [PDF] [BIBTEX]
    @InProceedings{    2017-Deeb-Swihart-SELLCSCI,
      author  = {Julia Deeb-Swihart and Christopher Polack and Eric
          Gilbert and Irfan Essa},
      booktitle  = {Proceedings of the International AAAI Conference
          on Web and Social Media (ICWSM)},
      month    = {May},
      organization  = {AAAI},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2017-Deeb-Swihart-SELLCSCI.pdf},
      title    = {Selfie-Presentation in Everyday Life: A Large-Scale
          Characterization of Selfie Contexts on Instagram},
      year    = {2017}
    }

Abstract

Carefully managing the presentation of self via technology is a core practice on all modern social media platforms. Recently, selfies have emerged as a new, pervasive genre of identity performance. In many ways unique, selfies bring us full circle to Goffman—blending the online and offline selves together. In this paper, we take an empirical, Goffman-inspired look at the phenomenon of selfies. We report a large-scale, mixed-method analysis of the categories in which selfies appear on Instagram—an online community comprising over 400M people. Applying computer vision and network analysis techniques to 2.5M selfies, we present a typology of emergent selfie categories which represent emphasized identity statements. To the best of our knowledge, this is the first large-scale, empirical research on selfies. We conclude, contrary to common portrayals in the press, that selfies are really quite ordinary: they project identity signals such as wealth, health and physical attractiveness common to many online media, and to offline life.


Paper in IJCNN (2017) “Towards Using Visual Attributes to Infer Image Sentiment Of Social Events”

May 18th, 2017 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Computer Vision, Machine Learning, Papers, Unaiza Ahsan

Paper

  • U. Ahsan, M. D. Choudhury, and I. Essa (2017), “Towards Using Visual Attributes to Infer Image Sentiment Of Social Events,” in Proceedings of The International Joint Conference on Neural Networks, Anchorage, Alaska, US, 2017. [PDF] [BIBTEX]
    @InProceedings{    2017-Ahsan-TUVAIISSE,
      address  = {Anchorage, Alaska, US},
      author  = {Unaiza Ahsan and Munmun De Choudhury and Irfan
          Essa},
      booktitle  = {Proceedings of The International Joint Conference
          on Neural Networks},
      month    = {May},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2017-Ahsan-TUVAIISSE.pdf},
      publisher  = {International Neural Network Society},
      title    = {Towards Using Visual Attributes to Infer Image
          Sentiment Of Social Events},
      year    = {2017}
    }

Abstract

Widespread and pervasive adoption of smartphones has led to instant sharing of photographs that capture events ranging from mundane to life-altering happenings. We propose to capture sentiment information of such social event images leveraging their visual content. Our method extracts an intermediate visual representation of social event images based on the visual attributes that occur in the images, going beyond sentiment-specific attributes. We map the top predicted attributes to sentiments and extract the dominant emotion associated with a picture of a social event. Unlike recent approaches, our method generalizes to a variety of social events and even to unseen events, which are not available at training time. We demonstrate the effectiveness of our approach on a challenging social event image dataset and our method outperforms state-of-the-art approaches for classifying complex event images into sentiments.
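The attribute-to-sentiment mapping step can be illustrated with a toy sketch. Everything here (the attribute names, the lexicon, and the scores) is hypothetical and stands in for the paper's learned attribute predictions; it only shows the idea of aggregating sentiment from the top predicted attributes:

```python
# Sketch: map top-predicted visual attributes to an overall image sentiment.
# The attribute names, lexicon, and scores here are hypothetical illustrations,
# not the paper's actual vocabulary or model outputs.

# Hypothetical mapping from visual attributes to sentiment polarity in [-1, 1].
ATTRIBUTE_SENTIMENT = {
    "smiling": 1.0,
    "balloons": 0.8,
    "candles": 0.5,
    "crowd": 0.1,
    "rubble": -0.9,
    "smoke": -0.7,
}

def image_sentiment(attribute_scores, top_k=3):
    """Aggregate sentiment from the top-k predicted attributes.

    attribute_scores: dict mapping attribute name -> classifier confidence.
    Returns a confidence-weighted polarity over the top-k attributes.
    """
    top = sorted(attribute_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    weighted = [(ATTRIBUTE_SENTIMENT.get(name, 0.0), conf) for name, conf in top]
    total_conf = sum(conf for _, conf in weighted)
    if total_conf == 0:
        return 0.0
    return sum(pol * conf for pol, conf in weighted) / total_conf

preds = {"smiling": 0.9, "balloons": 0.7, "crowd": 0.4, "smoke": 0.1}
print(round(image_sentiment(preds), 3))  # confidence-weighted positive score
```

The key design choice in the paper is that the attribute vocabulary is general-purpose rather than sentiment-specific, which is what lets the approach transfer to unseen event types.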


Paper in IEEE WACV (2017): “Complex Event Recognition from Images with Few Training Examples”

March 27th, 2017 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Computer Vision, PAMI/ICCV/CVPR/ECCV, Papers, Unaiza Ahsan

Paper

  • U. Ahsan, C. Sun, J. Hays, and I. Essa (2017), “Complex Event Recognition from Images with Few Training Examples,” in IEEE Winter Conference on Applications of Computer Vision (WACV), 2017. [PDF] [arXiv] [BIBTEX]
    @InProceedings{    2017-Ahsan-CERFIWTE,
      arxiv    = {https://arxiv.org/abs/1701.04769},
      author  = {Unaiza Ahsan and Chen Sun and James Hays and Irfan
          Essa},
      booktitle  = {IEEE Winter Conference on Applications of Computer
          Vision (WACV)},
      month    = {March},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2017-Ahsan-CERFIWTE.pdf},
      title    = {Complex Event Recognition from Images with Few
          Training Examples},
      year    = {2017}
    }

Abstract

We propose to leverage concept-level representations for complex event recognition in photographs given limited training examples. We introduce a novel framework to discover event concept attributes from the web and use that to extract semantic features from images and classify them into social event categories with few training examples. Discovered concepts include a variety of objects, scenes, actions and event subtypes, leading to a discriminative and compact representation for event images. Web images are obtained for each discovered event concept and we use (pre-trained) CNN features to train concept classifiers. Extensive experiments on challenging event datasets demonstrate that our proposed method outperforms several baselines using deep CNN features directly in classifying images into events with limited training examples. We also demonstrate that our method achieves the best overall accuracy on a data set with unseen event categories using a single training example.
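The concept-representation idea can be sketched, under simplifying assumptions, as nearest-prototype scoring on CNN features: represent each discovered concept by the mean feature of its web images and score query images by cosine similarity. The random vectors below stand in for real pre-trained CNN features and the concept names are invented; the paper trains proper concept classifiers rather than using mean prototypes:

```python
import numpy as np

# Sketch of the concept-based pipeline: represent each discovered event concept
# by the mean (pre-trained) CNN feature of its web images, then score a query
# image by cosine similarity to every concept prototype. The random vectors
# below stand in for real CNN features; dimensions and names are illustrative.

rng = np.random.default_rng(0)
feat_dim = 128

# Hypothetical web-image features for three discovered concepts.
concept_images = {
    "birthday-cake": rng.normal(size=(20, feat_dim)),
    "wedding-dress": rng.normal(size=(20, feat_dim)),
    "parade-float": rng.normal(size=(20, feat_dim)),
}

# One prototype per concept: the mean feature of its web images.
prototypes = {name: feats.mean(axis=0) for name, feats in concept_images.items()}

def concept_scores(image_feat):
    """Cosine similarity of one image feature to every concept prototype."""
    scores = {}
    for name, proto in prototypes.items():
        scores[name] = float(
            image_feat @ proto / (np.linalg.norm(image_feat) * np.linalg.norm(proto))
        )
    return scores

# A query image near the "birthday-cake" prototype should score highest there.
query = prototypes["birthday-cake"] + rng.normal(scale=0.1, size=feat_dim)
scores = concept_scores(query)
print(max(scores, key=scores.get))
```

The resulting vector of concept scores is the kind of compact semantic feature that makes few-shot event classification tractable: the burden of learning shifts from the small event dataset to the freely available web images per concept.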


Paper (ACM MM 2016) “Leveraging Contextual Cues for Generating Basketball Highlights”

October 18th, 2016 Irfan Essa Posted in ACM MM, Caroline Pantofaru, Computational Photography and Video, Computer Vision, Papers, Sports Visualization, Vinay Bettadapura

Paper

  • V. Bettadapura, C. Pantofaru, and I. Essa (2016), “Leveraging Contextual Cues for Generating Basketball Highlights,” in Proceedings of ACM International Conference on Multimedia (ACM-MM), 2016. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2016-Bettadapura-LCCGBH,
      arxiv    = {http://arxiv.org/abs/1606.08955},
      author  = {Vinay Bettadapura and Caroline Pantofaru and Irfan
          Essa},
      booktitle  = {Proceedings of ACM International Conference on
          Multimedia (ACM-MM)},
      month    = {October},
      organization  = {ACM},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2016-Bettadapura-LCCGBH.pdf},
      title    = {Leveraging Contextual Cues for Generating
          Basketball Highlights},
      url    = {http://www.vbettadapura.com/highlights/basketball/index.htm},
      year    = {2016}
    }

Abstract


The massive growth of sports videos has resulted in a need for automatic generation of sports highlights that are comparable in quality to the hand-edited highlights produced by broadcasters such as ESPN. Unlike previous works that mostly use audio-visual cues derived from the video, we propose an approach that additionally leverages contextual cues derived from the environment that the game is being played in. The contextual cues provide information about the excitement levels in the game, which can be ranked and selected to automatically produce high-quality basketball highlights. We introduce a new dataset of 25 NCAA games along with their play-by-play stats and the ground-truth excitement data for each basket. We explore the informativeness of five different cues derived from the video and from the environment through user studies. Our experiments show that for our study participants, the highlights produced by our system are comparable to the ones produced by ESPN for the same games.
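The cue-combination step can be illustrated with a small sketch: weight a few normalized excitement cues per scoring play, rank the plays, and keep the top ones in chronological order. The cue names, weights, and play data below are hypothetical illustrations of the idea, not the paper's model:

```python
# Sketch: rank scoring plays by a combined excitement score and keep the top
# few as the highlight reel. The cue names, weights, and play data are
# hypothetical illustrations of the cue-combination idea, not the paper's model.

CUE_WEIGHTS = {
    "crowd_noise": 0.4,       # audio energy around the basket
    "commentator_pitch": 0.3,
    "score_margin": 0.2,      # closer games -> more exciting baskets
    "game_clock": 0.1,        # late-game baskets matter more
}

def excitement(play):
    """Weighted sum of normalized cue scores (each cue in [0, 1])."""
    return sum(CUE_WEIGHTS[cue] * play["cues"][cue] for cue in CUE_WEIGHTS)

def make_highlights(plays, top_k=2):
    """Return the top-k most exciting plays, re-sorted chronologically."""
    ranked = sorted(plays, key=excitement, reverse=True)[:top_k]
    return sorted(ranked, key=lambda p: p["time"])

plays = [
    {"time": 120, "label": "early layup",
     "cues": {"crowd_noise": 0.2, "commentator_pitch": 0.3,
              "score_margin": 0.5, "game_clock": 0.1}},
    {"time": 1750, "label": "go-ahead three",
     "cues": {"crowd_noise": 0.9, "commentator_pitch": 0.8,
              "score_margin": 0.9, "game_clock": 0.9}},
    {"time": 2350, "label": "buzzer beater",
     "cues": {"crowd_noise": 1.0, "commentator_pitch": 1.0,
              "score_margin": 1.0, "game_clock": 1.0}},
]

print([p["label"] for p in make_highlights(plays)])
```

The re-sort by timestamp at the end reflects how broadcast highlight reels are assembled: selection is by excitement, but presentation stays chronological.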


Research Blog: Motion Stills – Create beautiful GIFs from Live Photos

June 7th, 2016 Irfan Essa Posted in Computational Photography and Video, Computer Vision, In The News, Interesting, Matthias Grundmann, Projects

Kudos to the team from Machine Perception at Google Research that just launched the Motion Stills app, which turns Live Photos into stabilized clips on an iOS device. This work builds in part on efforts like Video Textures and Video Stabilization, among others.

Today we are releasing Motion Stills, an iOS app from Google Research that acts as a virtual camera operator for your Apple Live Photos. We use our video stabilization technology to freeze the background into a still photo or create sweeping cinematic pans. The resulting looping GIFs and movies come alive, and can easily be shared via messaging or on social media.

Source: Research Blog: Motion Stills – Create beautiful GIFs from Live Photos


Paper (WACV 2016) “Discovering Picturesque Highlights from Egocentric Vacation Videos”

March 7th, 2016 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Daniel Castro, PAMI/ICCV/CVPR/ECCV, Vinay Bettadapura

Paper

  • D. Castro, V. Bettadapura, and I. Essa (2016), “Discovering Picturesque Highlights from Egocentric Vacation Video,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2016. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2016-Castro-DPHFEVV,
      arxiv    = {http://arxiv.org/abs/1601.04406},
      author  = {Daniel Castro and Vinay Bettadapura and Irfan
          Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      month    = {March},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2016-Castro-DPHFEVV.pdf},
      title    = {Discovering Picturesque Highlights from Egocentric
          Vacation Video},
      url    = {http://www.cc.gatech.edu/cpl/projects/egocentrichighlights/},
      year    = {2016}
    }

Abstract

We present an approach for identifying picturesque highlights from large amounts of egocentric video data. Given a set of egocentric videos captured over the course of a vacation, our method analyzes the videos and looks for images that have good picturesque and artistic properties. We introduce novel techniques to automatically determine aesthetic features such as composition, symmetry, and color vibrancy in egocentric videos and rank the video frames based on their photographic qualities to generate highlights. Our approach also uses contextual information such as GPS, when available, to assess the relative importance of each geographic location where the vacation videos were shot. Furthermore, we specifically leverage the properties of egocentric videos to improve our highlight detection. We demonstrate results on a new egocentric vacation dataset, which includes 26.5 hours of video taken over a 14-day vacation that spans many famous tourist destinations, and also provide results from a user study to assess our results.
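One of the aesthetic cues, color vibrancy, can be approximated in a few lines. This sketch scores frames by mean pixel saturation and ranks them by that score; it is only an illustration of a single cue, while the paper combines several (composition, symmetry, vibrancy) with contextual information:

```python
import numpy as np

# Sketch of one aesthetic cue from the highlight pipeline: color vibrancy,
# approximated here as mean pixel saturation. Frames are then ranked by this
# score. The formula and the synthetic frames are illustrative only.

def saturation(rgb):
    """Per-pixel saturation for an RGB image in [0, 1]: (max - min) / max."""
    cmax = rgb.max(axis=-1)
    cmin = rgb.min(axis=-1)
    return np.where(cmax > 0, (cmax - cmin) / np.maximum(cmax, 1e-8), 0.0)

def vibrancy(rgb):
    """Mean saturation over the frame: a crude color-vibrancy score."""
    return float(saturation(rgb).mean())

# Two tiny synthetic "frames": one gray (dull), one saturated red (vibrant).
gray_frame = np.full((4, 4, 3), 0.5)
red_frame = np.zeros((4, 4, 3))
red_frame[..., 0] = 1.0

ranked = sorted([("gray", gray_frame), ("red", red_frame)],
                key=lambda kv: vibrancy(kv[1]), reverse=True)
print([name for name, _ in ranked])
```

In the actual system, per-frame scores like this would be computed over hours of egocentric footage and the top-ranked frames surfaced as candidate highlights.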

 


Spring 2016 Teaching

January 10th, 2016 Irfan Essa Posted in Computational Photography, Computational Photography and Video, Computer Vision

My teaching activities for Spring 2016 are:


Presentation at Max-Planck-Institut für Informatik in Saarbrücken (2015): “Video Analysis and Enhancement”

September 14th, 2015 Irfan Essa Posted in Computational Journalism, Computational Photography and Video, Computer Vision, Presentations, Ubiquitous Computing

Video Analysis and Enhancement: Spatio-Temporal Methods for Extracting Content from Videos and Enhancing Video Output

Irfan Essa (prof.irfanessa.com)

Georgia Institute of Technology
School of Interactive Computing

Hosted by Max-Planck-Institut für Informatik in Saarbrücken (Bernt Schiele, Director of Computer Vision and Multimodal Computing)

Abstract 

In this talk, I will start by describing the pervasiveness of image and video content, and how such content is growing with the ubiquity of cameras. I will use this to motivate the need for better tools for analysis and enhancement of video content. I will start with some of our earlier work on temporal modeling of video, then lead up to some of our current work and describe two main projects: (1) our approach for a video stabilizer, currently implemented and running on YouTube, and its extensions; and (2) a robust and scalable method for video segmentation.

I will describe, in some detail, our video stabilization method, which generates stabilized videos and is in wide use on YouTube, with millions of users. Then I will describe an efficient and scalable technique for spatiotemporal segmentation of long video sequences using a hierarchical graph-based algorithm. I will describe the videosegmentation.com site that we have developed to make this system available for wide use.

Finally, I will follow up with some recent work on image and video analysis in the mobile domain. I will also make some observations about the ubiquity of imaging and video in general and the need for better tools for video analysis.


Presentation at Max-Planck-Institute for Intelligent Systems in Tübingen (2015): “Data-Driven Methods for Video Analysis and Enhancement”

September 10th, 2015 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Machine Learning, Presentations

Data-Driven Methods for Video Analysis and Enhancement

Irfan Essa (prof.irfanessa.com)
Georgia Institute of Technology

Thursday, September 10, 2 pm,
Max Planck House Lecture Hall (Spemannstr. 36)
Hosted by Max-Planck-Institute for Intelligent Systems (Michael Black, Director of Perceiving Systems)

Abstract

In this talk, I will start by describing the pervasiveness of image and video content, and how such content is growing with the ubiquity of cameras. I will use this to motivate the need for better tools for analysis and enhancement of video content. I will start with some of our earlier work on temporal modeling of video, then lead up to some of our current work and describe two main projects: (1) our approach for a video stabilizer, currently implemented and running on YouTube, and its extensions; and (2) a robust and scalable method for video segmentation.

I will describe, in some detail, our video stabilization method, which generates stabilized videos and is in wide use. Our method allows for video stabilization beyond the conventional filtering that only suppresses high-frequency jitter. It also supports the removal of rolling shutter distortions common in modern CMOS cameras, which capture each frame one scan-line at a time, resulting in non-rigid image distortions such as shear and wobble. Our method does not rely on a priori knowledge and works on video from any camera or on legacy footage. I will showcase examples of this approach and also discuss how this method is launched and running on YouTube, with millions of users.
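The conventional filtering baseline contrasted above can be sketched as moving-average smoothing of an estimated 1D camera path, with the per-frame correction given by the difference between the smoothed and raw paths. This is only the baseline idea under simplifying assumptions; the actual YouTube stabilizer optimizes cinematographic camera paths and handles rolling shutter, which this sketch does not:

```python
import numpy as np

# Sketch of the conventional baseline: low-pass filter an estimated camera
# path to suppress high-frequency jitter, then offset each frame by
# (smoothed - original). Paths here are synthetic 1D trajectories.

def smooth_path(path, radius=2):
    """Moving-average smoothing of a 1D camera trajectory."""
    kernel = np.ones(2 * radius + 1) / (2 * radius + 1)
    padded = np.pad(path, radius, mode="edge")  # replicate ends to keep length
    return np.convolve(padded, kernel, mode="valid")

# Jittery horizontal camera path: slow intended pan plus per-frame shake.
rng = np.random.default_rng(1)
frames = np.arange(60)
pan = 0.5 * frames                        # intended camera motion
jitter = rng.normal(scale=3.0, size=60)   # handheld shake
path = pan + jitter

smoothed = smooth_path(path)
correction = smoothed - path              # per-frame warp offset to apply

# The smoothed path should deviate less from the intended pan than the raw one.
print(np.abs(smoothed - pan).mean() < np.abs(path - pan).mean())
```

The limitation of this baseline is visible in the code itself: a moving average can only attenuate jitter, whereas an optimized path can lock the camera still or follow clean linear and parabolic segments, which is what gives professional-looking results.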

Then I will describe an efficient and scalable technique for spatiotemporal segmentation of long video sequences using a hierarchical graph-based algorithm. This hierarchical approach generates high-quality segmentations, and we demonstrate the use of this segmentation as users interact with the video, enabling efficient annotation of objects within the video. I will also show some recent work on how this segmentation and annotation can be used for dynamic scene understanding.
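The greedy graph-based merging at the heart of such segmentation can be sketched on a 1D "scanline": sort edges between neighbors by difference and merge regions whose connecting edge is cheap. This is a minimal fixed-threshold pass in the style of Felzenszwalb-Huttenlocher graph segmentation, while the real system builds hierarchical spatiotemporal graphs over video volumes:

```python
# Sketch of graph-based merging for segmentation: sort edges between
# neighboring positions by difference and greedily union regions whose
# connecting edge is below a threshold. A toy 1D analogue of the idea.

def segment_scanline(values, threshold):
    """Merge adjacent positions into segments when |difference| < threshold."""
    parent = list(range(len(values)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges between neighbors, cheapest first (greedy merging order).
    edges = sorted((abs(values[i + 1] - values[i]), i, i + 1)
                   for i in range(len(values) - 1))
    for weight, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb and weight < threshold:
            parent[rb] = ra

    # Relabel roots to consecutive segment ids, left to right.
    labels, out = {}, []
    for i in range(len(values)):
        r = find(i)
        labels.setdefault(r, len(labels))
        out.append(labels[r])
    return out

# Two flat regions separated by a jump: expect two segments.
print(segment_scanline([10, 11, 10, 50, 51, 52], threshold=5))
```

Running the same merge pass repeatedly with increasing thresholds on region-level graphs is what yields the hierarchy of coarse-to-fine segmentations.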

I will then follow up with some recent work on image and video analysis in the mobile domain. I will also make some observations about the ubiquity of imaging and video in general and the need for better tools for video analysis.


Participated in the KAUST Conference on Computational Imaging and Vision 2015

March 1st, 2015 Irfan Essa Posted in Computational Photography and Video, Computer Vision, Daniel Castro, Presentations

I was invited to participate and present at the King Abdullah University of Science & Technology (KAUST) Conference on Computational Imaging and Vision (CIV):

March 1-4, 2015
Building 19 Level 3, Lecture Halls
Visual Computing Center (VCC)

Invited Speakers included

  • Shree Nayar – Columbia University
  • Daniel Cremers – Technical University of Munich
  • Rene Vidal – The Johns Hopkins University
  • Wolfgang Heidrich – VCC, KAUST
  • Jingyi Yu – University of Delaware
  • Irfan Essa – The Georgia Institute of Technology
  • Mubarak Shah – University of Central Florida
  • Larry Davis – University of Maryland
  • David Forsyth – University of Illinois
  • Gordon Wetzstein – Stanford University
  • Brian Barsky – University of California
  • Yi Ma – ShanghaiTech University
  • and others

This event was hosted by the Visual Computing Center (Wolfgang Heidrich, Bernard Ghanem, and Ganesh Sundaramoorthi).

Daniel Castro also attended and presented a poster at the meeting.

