Paper in AISTATS 2013 “Beyond Sentiment: The Manifold of Human Emotions”

April 29th, 2013 Irfan Essa Posted in AAAI/IJCAI/UAI, Behavioral Imaging, Computational Journalism, Machine Learning, Papers, WWW

  • S. Kim, F. Li, G. Lebanon, and I. A. Essa (2013), “Beyond Sentiment: The Manifold of Human Emotions,” in Proceedings of AISTATS, 2013. [PDF] [BIBTEX]
    @InProceedings{    2012-Kim-BSMHE,
      author  = {Seungyeon Kim and Fuxin Li and Guy Lebanon and
          Irfan A. Essa},
      booktitle  = {Proceedings of AI STATS},
      pdf    = {http://arxiv.org/pdf/1202.1568v1},
      title    = {Beyond Sentiment: The Manifold of Human Emotions},
      year    = {2013}
    }

Abstract

Sentiment analysis predicts the presence of positive or negative emotions in a text document. In this paper we consider higher-dimensional extensions of the sentiment concept, which represent a richer set of human emotions. Our approach goes beyond previous work in that our model contains a continuous manifold rather than a finite set of human emotions. We investigate the resulting model, compare it to psychological observations, and explore its predictive capabilities. Besides obtaining significant improvements over a baseline without the manifold, we are also able to visualize different notions of positive sentiment in different domains.
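To make the idea of a continuous emotion manifold concrete, here is a minimal, hypothetical sketch (not the authors' pipeline): mood-tagged documents are represented as TF-IDF vectors and embedded into a low-dimensional space with Isomap, so each document receives continuous coordinates rather than a single discrete emotion label. The tiny corpus, the mood tags, and the choice of Isomap are all illustrative assumptions.

    # Hypothetical illustration of embedding mood-tagged documents on a
    # low-dimensional manifold; NOT the paper's actual model or data.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.manifold import Isomap

    docs = ["what a wonderful sunny day", "so tired and sad tonight",
            "thrilled about the new job", "everything feels hopeless"]
    moods = ["happy", "sad", "happy", "sad"]        # weak emotion tags

    tfidf = TfidfVectorizer().fit_transform(docs)   # sparse bag-of-words features
    dense = TruncatedSVD(n_components=3).fit_transform(tfidf)
    coords = Isomap(n_neighbors=3, n_components=2).fit_transform(dense)

    for mood, (x, y) in zip(moods, coords):
        print(f"{mood:>5}: ({x:+.2f}, {y:+.2f})")   # continuous manifold coordinates

In this toy setting a new document could be projected into the same space and its emotion read off as a position on the surface rather than as a class label.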

via [arXiv.org 1202.1568] Beyond Sentiment: The Manifold of Human Emotions.


Paper in ICCP 2013 “Post-processing approach for radiometric self-calibration of video”

April 19th, 2013 Irfan Essa Posted in Computational Photography and Video, ICCP, Matthias Grundmann, Papers, Sing Bing Kang

  • M. Grundmann, C. McClanahan, S. B. Kang, and I. Essa (2013), “Post-processing Approach for Radiometric Self-Calibration of Video,” in Proceedings of IEEE International Conference on Computational Photography (ICCP), 2013. [PDF] [WEBSITE] [VIDEO] [DOI] [BIBTEX]
    @InProceedings{    2013-Grundmann-PARSV,
      author  = {Matthias Grundmann and Chris McClanahan and Sing
          Bing Kang and Irfan Essa},
      booktitle  = {{Proceedings of IEEE International Conference on
          Computational Photography (ICCP)}},
      doi    = {10.1109/ICCPhot.2013.6528307},
      month    = {April},
      organization  = {IEEE Computer Society},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2013-Grundmann-PARSV.pdf},
      title    = {Post-processing Approach for Radiometric
          Self-Calibration of Video},
      url    = {http://www.cc.gatech.edu/cpl/projects/radiometric},
      video    = {http://www.youtube.com/watch?v=sC942ZB4WuM},
      year    = {2013}
    }

Abstract

We present a novel data-driven technique for radiometric self-calibration of video from an unknown camera. Our approach self-calibrates radiometric variations in video and is applied as a post-process; there is no need to access the camera, and in particular it is applicable to internet videos. The technique builds on empirical evidence that in video the camera response function (CRF) should be regarded as time-variant, as it changes with scene content and exposure, rather than being modeled as a single fixed response. We show that a time-varying mixture of responses produces better accuracy and consistently reduces the error in mapping intensity to irradiance when compared to a single response model. Furthermore, our mixture model counteracts the effects of possible nonlinear exposure-dependent intensity perturbations and white-balance changes caused by proprietary camera firmware. We further show how radiometrically calibrated video improves the performance of other video analysis algorithms, enabling a video segmentation algorithm to be invariant to exposure and gain variations over the sequence. We validate our data-driven technique on videos from a variety of cameras and demonstrate the generality of our approach by applying it to internet video.
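The core self-calibration cue can be sketched in a few lines. The toy example below is not the paper's algorithm: it assumes registered pixel correspondences between two frames whose exposures differ by a known factor k, and fits the weights of a small gamma-curve basis so that the mixed inverse response g satisfies g(I2) ≈ k · g(I1). The basis, the known exposure ratio, and the synthetic data are all simplifying assumptions made for illustration.

    # Toy sketch of the brightness-transfer constraint behind radiometric
    # self-calibration; the gamma-curve basis and known exposure ratio are
    # simplifying assumptions, not the paper's formulation.
    import numpy as np

    def basis(intensity, gammas=(1.0, 1.8, 2.4)):
        """Evaluate a small 'inverse response' basis at the given intensities."""
        return np.stack([intensity ** g for g in gammas], axis=1)

    rng = np.random.default_rng(0)
    true_w = np.array([0.2, 0.5, 0.3])        # hidden mixture weights
    k = 2.0                                   # assumed known exposure ratio

    irradiance = rng.uniform(0.05, 0.45, size=200)
    grid = np.linspace(0.0, 1.0, 1001)
    g_true = basis(grid) @ true_w             # true inverse response on a lookup table

    def to_intensity(e):
        """Camera forward model: irradiance -> recorded intensity."""
        return np.interp(e, g_true, grid)

    I1, I2 = to_intensity(irradiance), to_intensity(k * irradiance)

    # Least-squares fit of mixture weights with a soft constraint g(1) = 1.
    A = np.vstack([basis(I2) - k * basis(I1), 100.0 * basis(np.array([1.0]))])
    b = np.concatenate([np.zeros(len(I1)), [100.0]])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    print("recovered mixture weights:", np.round(w, 3))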

via IEEE Xplore – Post-processing approach for radiometric self-calibration of video.


Paper in ECCV Workshop 2012: “Weakly Supervised Learning of Object Segmentations from Web-Scale Videos”

October 7th, 2012 Irfan Essa Posted in Activity Recognition, Awards, Google, Matthias Grundmann, Multimedia, PAMI/ICCV/CVPR/ECCV, Papers, Vivek Kwatra, WWW

Weakly Supervised Learning of Object Segmentations from Web-Scale Videos

  • G. Hartmann, M. Grundmann, J. Hoffman, D. Tsai, V. Kwatra, O. Madani, S. Vijayanarasimhan, I. Essa, J. Rehg, and R. Sukthankar (2012), “Weakly Supervised Learning of Object Segmentations from Web-Scale Videos,” in Proceedings of ECCV 2012 Workshop on Web-scale Vision and Social Media, 2012. [PDF] [DOI] [BIBTEX]
    @InProceedings{    2012-Hartmann-WSLOSFWV,
      author  = {Glenn Hartmann and Matthias Grundmann and Judy
          Hoffman and David Tsai and Vivek Kwatra and Omid
          Madani and Sudheendra Vijayanarasimhan and Irfan
          Essa and James Rehg and Rahul Sukthankar},
      booktitle  = {Proceedings of ECCV 2012 Workshop on Web-scale
          Vision and Social Media},
      doi    = {10.1007/978-3-642-33863-2_20},
      pdf    = {http://www.cs.cmu.edu/~rahuls/pub/eccv2012wk-cp-rahuls.pdf},
      title    = {Weakly Supervised Learning of Object Segmentations
          from Web-Scale Videos},
      year    = {2012}
    }

Abstract

We propose to learn pixel-level segmentations of objects from weakly labeled (tagged) internet videos. Specifically, given a large collection of raw YouTube content, along with potentially noisy tags, our goal is to automatically generate spatiotemporal masks for each object, such as “dog”, without employing any pre-trained object detectors. We formulate this problem as learning weakly supervised classifiers for a set of independent spatio-temporal segments. The object seeds obtained using segment-level classifiers are further refined using graph cuts to generate high-precision object masks. Our results, obtained by training on a dataset of 20,000 YouTube videos weakly tagged into 15 classes, demonstrate automatic extraction of pixel-level object masks. Evaluated against a ground-truthed subset of 50,000 frames with pixel-level annotations, we confirm that our proposed methods can learn good object masks just by watching YouTube.
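As a rough illustration of the refinement step only, the sketch below assumes a per-pixel object probability map has already been produced by weak, segment-level classifiers and cleans it into a binary mask with a graph cut. The PyMaxflow package and the synthetic probability map are stand-ins chosen here for illustration, not the authors' implementation.

    # Hypothetical graph-cut refinement of a noisy per-pixel object probability
    # map (the map itself is synthetic); uses the PyMaxflow package.
    import numpy as np
    import maxflow

    rng = np.random.default_rng(0)
    prob = 0.15 + 0.1 * rng.random((120, 160))                 # background-ish pixels
    prob[40:80, 60:120] = 0.75 + 0.2 * rng.random((40, 60))    # likely object region

    eps = 1e-6
    cost_obj = -np.log(prob + eps)        # unary cost of labelling a pixel "object"
    cost_bg = -np.log(1.0 - prob + eps)   # unary cost of labelling it "background"

    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(prob.shape)
    g.add_grid_edges(nodes, 2.0)                  # 4-connected smoothness term
    g.add_grid_tedges(nodes, cost_obj, cost_bg)   # data terms
    g.maxflow()
    mask = g.get_grid_segments(nodes)             # True where the pixel is "object"
    print("object pixels:", int(mask.sum()))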

Presented at: ECCV 2012 Workshop on Web-scale Vision and Social Media, October 7-12, 2012, in Florence, Italy.

Awarded the BEST PAPER AWARD!

 


AT UBICOMP 2012 Conference, in Pittsburgh, PA, September 5 – 7, 2012

September 4th, 2012 Irfan Essa Posted in Edison Thomaz, Grant Schindler, Gregory Abowd, Papers, Presentations, Thomas Ploetz, UBICOMP, Ubiquitous Computing, Vinay Bettadapura

At the ACM-sponsored 14th International Conference on Ubiquitous Computing (Ubicomp 2012), Pittsburgh, PA, September 5-7, 2012.

Here are the highlights of my group’s participation in Ubicomp 2012.

  • E. Thomaz, V. Bettadapura, G. Reyes, M. Sandesh, G. Schindler, T. Ploetz, G. D. Abowd, and I. Essa (2012), “Recognizing Water-Based Activities in the Home Through Infrastructure-Mediated Sensing,” in Proceedings of ACM International Conference on Ubiquitous Computing (UBICOMP), 2012. [PDF] [WEBSITE] (Oral Presentation at 2pm on Wednesday September 5, 2012).
  • J. Wang, G. Schindler, and I. Essa (2012), “Orientation Aware Scene Understanding for Mobile Camera,” in Proceedings of ACM International Conference on Ubiquitous Computing (UBICOMP), 2012. [PDF][WEBSITE] (Oral Presentation at 2pm on Thursday September 6, 2012).

In addition, my colleague Gregory Abowd has a position paper, “What next, Ubicomp? Celebrating an intellectual disappearing act,” in the Wednesday 11:15am session, and my colleague and collaborator Thomas Ploetz has a paper, “Automatic Assessment of Problem Behavior in Individuals with Developmental Disabilities,” with co-authors Nils Hammerla, Agata Rozga, Andrea Reavis, Nathan Call, and Gregory Abowd, in the 9:15am session on Friday, September 7.


Paper in IEEE CVPR 2012: “Detecting Regions of Interest in Dynamic Scenes with Camera Motions”

June 16th, 2012 Irfan Essa Posted in Activity Recognition, Kihwan Kim, Machine Learning, PAMI/ICCV/CVPR/ECCV, Papers, PERSEAS, Visual Surveillance

Detecting Regions of Interest in Dynamic Scenes with Camera Motions

  • K. Kim, D. Lee, and I. Essa (2012), “Detecting Regions of Interest in Dynamic Scenes with Camera Motions,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012. [PDF] [WEBSITE] [VIDEO] [DOI] [BLOG] [BIBTEX]
    @InProceedings{    2012-Kim-DRIDSWCM,
      author  = {Kihwan Kim and Dongreyol Lee and Irfan Essa},
      blog    = {http://prof.irfanessa.com/2012/04/09/paper-cvpr2012/},
      booktitle  = {Proceedings of IEEE Conference on Computer Vision
          and Pattern Recognition (CVPR)},
      doi    = {10.1109/CVPR.2012.6247809},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2012-Kim-DRIDSWCM.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Detecting Regions of Interest in Dynamic Scenes
          with Camera Motions},
      url    = {http://www.cc.gatech.edu/cpl/projects/roi/},
      video    = {http://www.youtube.com/watch?v=19BMwDMCSp8},
      year    = {2012}
    }

Abstract

We present a method to detect the regions of interest in moving camera views of dynamic scenes with multiple moving objects. We start by extracting a global motion tendency that reflects the scene context by tracking movements of objects in the scene. We then use Gaussian process regression to represent the extracted motion tendency as a stochastic vector field. The generated stochastic field is robust to noise and can handle a video from an uncalibrated moving camera. We use the stochastic field for predicting important future regions of interest as the scene evolves dynamically.

We evaluate our approach on a variety of videos of team sports and compare the detected regions of interest to the camera motion generated by actual camera operators. Our experimental results demonstrate that our approach is computationally efficient and provides better predictions than previously proposed RBF-based approaches.
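To give a flavor of the central representation, the sketch below regresses sparse, synthetic motion vectors into a dense vector field with scikit-learn's Gaussian process regressor, whose predictive standard deviation supplies the "stochastic" part of the field. The synthetic tracks, kernel, and grid are illustrative assumptions, not the paper's setup.

    # Hedged sketch: sparse tracked motions -> dense stochastic vector field via
    # Gaussian process regression (scikit-learn used as a stand-in).
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(1)
    pts = rng.uniform(0, 100, size=(40, 2))          # sparse track positions (x, y)
    vel = np.stack([np.cos(pts[:, 1] / 20.0),        # synthetic motion vectors (u, v)
                    np.sin(pts[:, 0] / 20.0)], axis=1)
    vel += 0.05 * rng.standard_normal(vel.shape)     # tracking noise

    kernel = RBF(length_scale=15.0) + WhiteKernel(noise_level=0.01)
    gp_u = GaussianProcessRegressor(kernel=kernel).fit(pts, vel[:, 0])
    gp_v = GaussianProcessRegressor(kernel=kernel).fit(pts, vel[:, 1])

    # Dense grid: mean flow plus predictive uncertainty at every location.
    gx, gy = np.meshgrid(np.linspace(0, 100, 20), np.linspace(0, 100, 20))
    grid = np.column_stack([gx.ravel(), gy.ravel()])
    u, u_std = gp_u.predict(grid, return_std=True)
    v, v_std = gp_v.predict(grid, return_std=True)
    uncertainty = np.hypot(u_std, v_std).reshape(gx.shape)
    print("most uncertain cell:", np.unravel_index(uncertainty.argmax(), gx.shape))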

Presented at: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2012, Providence, RI, June 16-21, 2012


Award (2012): Best Computer Vision Paper Award by Google Research

March 22nd, 2012 Irfan Essa Posted in Computational Photography and Video, Matthias Grundmann, Papers, Vivek Kwatra

Our following paper was just awarded an Excellent Paper Award for 2011 in Computer Vision by Google Research.

  • M. Grundmann, V. Kwatra, and I. Essa (2011), “Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011. [PDF] [WEBSITE] [VIDEO] [DEMO] [DOI] [BLOG] [BIBTEX]
    @InProceedings{    2011-Grundmann-AVSWROCP,
      author  = {M. Grundmann and V. Kwatra and I. Essa},
      blog    = {http://prof.irfanessa.com/2011/06/19/videostabilization/},
      booktitle  = {Proceedings of IEEE Conference on Computer Vision
          and Pattern Recognition (CVPR)},
      demo    = {http://www.youtube.com/watch?v=0MiY-PNy-GU},
      doi    = {10.1109/CVPR.2011.5995525},
      month    = {June},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2011-Grundmann-AVSWROCP.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Auto-Directed Video Stabilization with Robust L1
          Optimal Camera Paths},
      url    = {http://www.cc.gatech.edu/cpl/projects/videostabilization/},
      video    = {http://www.youtube.com/watch?v=i5keG1Y810U},
      year    = {2011}
    }

Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. Most professionally shot videos, on the other hand, use carefully designed camera configurations, specialized equipment such as tripods or camera dollies, and ease-in and ease-out transitions. Our stabilization technique automatically converts casual shaky footage into more pleasant and professional-looking videos by mimicking these cinematographic principles. The original, shaky camera path is divided into a set of segments, each approximated by either constant, linear, or parabolic motion, using an algorithm based on robust L1 optimization. The stabilizer has been part of the YouTube Editor (youtube.com/editor) since March 2011.

via Research Blog.


Paper in ICCV 2011: “Gaussian Process Regression Flow for Analysis of Motion Trajectories”

October 28th, 2011 Irfan Essa Posted in Activity Recognition, DARPA, Kihwan Kim, PAMI/ICCV/CVPR/ECCV, Papers

Gaussian Process Regression Flow for Analysis of Motion Trajectories

  • K. Kim, D. Lee, and I. Essa (2011), “Gaussian Process Regression Flow for Analysis of Motion Trajectories,” in Proceedings of IEEE International Conference on Computer Vision (ICCV), 2011. [PDF] [WEBSITE] [VIDEO] [BIBTEX]
    @InProceedings{    Kim2011-GPRF,
      author  = {K. Kim and D. Lee and I. Essa},
      booktitle  = {Proceedings of IEEE International Conference on
          Computer Vision (ICCV)},
      month    = {November},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2011-Kim-GPRFAMT.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Gaussian Process Regression Flow for Analysis of
          Motion Trajectories},
      url    = {http://www.cc.gatech.edu/cpl/projects/gprf/},
      video    = {http://www.youtube.com/watch?v=UtLr37hDQz0},
      year    = {2011}
    }

Abstract

Analysis and recognition of the motions and activities of objects in videos require effective representations for analyzing and matching motion trajectories. In this paper, we introduce a new representation specifically aimed at matching motion trajectories. We model a trajectory as a continuous dense flow field from a sparse set of vector sequences using Gaussian process regression. Furthermore, we introduce a random sampling strategy for learning stable classes of motions from limited data.

Our representation allows for incrementally predicting possible paths and detecting anomalous events from online trajectories. This representation also supports matching of complex motions with acceleration changes and pauses or stops within a trajectory. We use the proposed approach for classifying and predicting motion trajectories in traffic monitoring domains and test on several data sets. We show that our approach works well on various types of complete and incomplete trajectories from a variety of video data sets with different frame rates.
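To make the matching and anomaly-detection use concrete, here is a small self-contained sketch in which a toy analytic flow field stands in for a fitted Gaussian process: each step of an observed trajectory is scored against the flow's predictive distribution at its current position, so trajectories that follow the flow score high and erratic ones score low. The field, the isotropic predictive standard deviation, and the example trajectories are all illustrative assumptions.

    # Hedged sketch: scoring trajectories against a (toy) predictive flow field.
    import numpy as np

    def flow(p, std=0.15):
        """Predictive mean and std of the motion at position p (GP stand-in)."""
        mean = np.array([np.cos(p[1] / 20.0), np.sin(p[0] / 20.0)])
        return mean, std

    def trajectory_score(points):
        """Average per-step Gaussian log-likelihood of a trajectory under the flow."""
        logps = []
        for p, q in zip(points[:-1], points[1:]):
            mean, std = flow(p)
            step = q - p
            logps.append(-0.5 * np.sum(((step - mean) / std) ** 2)
                         - step.size * np.log(std * np.sqrt(2.0 * np.pi)))
        return float(np.mean(logps))

    pos, normal = np.zeros(2), [np.zeros(2)]
    for _ in range(30):                       # a trajectory that follows the flow
        pos = pos + flow(pos)[0]
        normal.append(pos.copy())
    normal = np.array(normal)
    erratic = np.cumsum(np.random.default_rng(2).normal(size=(31, 2)), axis=0)

    print("normal trajectory score :", round(trajectory_score(normal), 2))
    print("erratic trajectory score:", round(trajectory_score(erratic), 2))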


Paper (2011) in IEEE CVPR: “Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths”

June 19th, 2011 Irfan Essa Posted in Computational Photography and Video, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers, Vivek Kwatra

Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths

  • M. Grundmann, V. Kwatra, and I. Essa (2011), “Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011. [PDF] [WEBSITE] [VIDEO] [DEMO] [Google Research Blog] [BIBTEX]
    @InProceedings{    2011-Grundmann-AVSWROCP,
      author  = {M. Grundmann and V. Kwatra and I. Essa},
      booktitle  = {Proceedings of IEEE Conference on Computer Vision
          and Pattern Recognition (CVPR)},
      month    = {June},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2011-Grundmann-AVSWROCP.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Auto-Directed Video Stabilization with Robust L1
          Optimal Camera Paths},
      url    = {http://www.cc.gatech.edu/cpl/projects/videostabilization/},
      video    = {http://www.youtube.com/watch?v=i5keG1Y810U},
      year    = {2011}
    }

Abstract

We present a novel algorithm for automatically applying constrainable, L1-optimal camera paths to generate stabilized videos by removing undesired motions. Our goal is to compute camera paths that are composed of constant, linear and parabolic segments mimicking the camera motions employed by professional cinematographers. To this end, our algorithm is based on a linear programming framework to minimize the first, second, and third derivatives of the resulting camera path. Our method allows for video stabilization beyond the conventional filtering of camera paths that only suppresses high frequency jitter. We incorporate additional constraints on the path of the camera directly in our algorithm, allowing for stabilized and retargeted videos. Our approach accomplishes this without the need for user interaction or costly 3D reconstruction of the scene, and works as a post-process for videos from any camera or from an online source.
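The optimization at the heart of the method can be sketched compactly. The snippet below uses cvxpy as a convenient convex-programming stand-in (the paper formulates the problem as a linear program) and a synthetic one-dimensional shaky path; the derivative weights and the crop-window bound are illustrative values, not the paper's.

    # Condensed sketch: a smooth path p stays near the shaky path c while the L1
    # norms of its first three derivatives are minimized (cvxpy as a stand-in
    # solver; weights and bounds are illustrative).
    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(3)
    n = 200
    c = np.cumsum(rng.normal(0.0, 1.0, n)) + 5.0 * np.sin(np.arange(n) / 15.0)

    p = cp.Variable(n)
    objective = (10.0 * cp.norm1(cp.diff(p, 1)) +     # favors constant segments
                  1.0 * cp.norm1(cp.diff(p, 2)) +     # favors linear segments
                100.0 * cp.norm1(cp.diff(p, 3)))      # favors parabolic segments
    constraints = [cp.abs(p - c) <= 20.0]             # keep the crop window in frame
    cp.Problem(cp.Minimize(objective), constraints).solve()

    print("second-difference magnitude: %.1f (original) -> %.1f (smoothed)"
          % (np.abs(np.diff(c, 2)).sum(), np.abs(np.diff(p.value, 2)).sum()))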


Paper (2011) in Virtual Reality: “Augmenting aerial earth maps with dynamic information from videos”

February 2nd, 2011 Irfan Essa Posted in Computational Photography and Video, Kihwan Kim, Papers, Sangmin Oh

Augmenting aerial earth maps with dynamic information from videos

  • K. Kim, S. Oh, J. Lee, and I. Essa (2011), “Augmenting aerial earth maps with dynamic information from videos,” Journal of Virtual Reality, Special Issue on Augmented Reality, vol. 15, iss. 2-3, pp. 1359-4338, 2011. [PDF] [WEBSITE] [VIDEO] [DOI] [SpringerLink] [BIBTEX]
    
    @article{2011-Kim-AAEMWDIFV, 
     Author = {K. Kim and S. Oh and J. Lee and I. Essa}, 
     Doi = {10.1007/s10055-010-0186-2}, 
     Journal = {Journal of Virtual Reality, Special Issue on Augmented Reality}, 
     Number = {2-3}, 
     Pages = {1359-4338}, 
     Pdf = {http://www.cc.gatech.edu/~irfan/p/2011-Kim-AAEMWDIFV.pdf}, 
     Title = {Augmenting aerial earth maps with dynamic information from videos}, 
     Url = {http://www.cc.gatech.edu/cpl/projects/augearth}, 
     Video = {http://www.youtube.com/watch?v=TPk88soc2qw}, 
     Volume = {15}, 
     Year = {2011}}

Abstract

We introduce methods for augmenting aerial visualizations of Earth (from tools such as Google Earth or Microsoft Virtual Earth) with dynamic information obtained from videos. Our goal is to make Augmented Earth Maps that visualize plausible live views of dynamic scenes in a city. We propose different approaches to analyze videos of pedestrians and cars in real situations, under differing conditions, to extract dynamic information. Then, we augment Aerial Earth Maps (AEMs) with the extracted live and dynamic content. We also analyze natural phenomena (skies, clouds) and project information from these onto the AEMs to add to the visual reality. Our primary contributions are: (1) Analyzing videos with different viewpoints, coverage, and overlaps to extract relevant information about view geometry and movements, with limited user input. (2) Projecting this information appropriately to the viewpoint of the AEMs and modeling the dynamics in the scene from observations to allow inference (in case of missing data) and synthesis. We demonstrate this over a variety of camera configurations and conditions. (3) The modeled information from videos is registered to the AEMs to render appropriate movements and related dynamics. We demonstrate this with traffic flow, people movements, and cloud motions. All of these approaches are brought together as a prototype system for a real-time visualization of a city that is alive and engaging.
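One small ingredient of such a pipeline can be sketched directly: given a handful of ground-plane correspondences between a video frame and the aerial map (hard-coded, hypothetical values below), tracked positions from the video can be re-projected into map coordinates with a homography. OpenCV is used here purely as a stand-in for the authors' implementation.

    # Hedged sketch: re-projecting tracked video positions onto an aerial map
    # with a homography (OpenCV as a stand-in; correspondences are hypothetical).
    import numpy as np
    import cv2

    # Hypothetical ground-plane correspondences: frame pixel (x, y) -> map (x, y).
    frame_pts = np.float32([[100, 400], [540, 410], [620, 120], [80, 130]])
    map_pts = np.float32([[310, 220], [420, 225], [430, 120], [300, 115]])
    H, _ = cv2.findHomography(frame_pts, map_pts, cv2.RANSAC)

    # Tracked car positions in the video frame (e.g., from a detector/tracker).
    tracks = np.float32([[[200, 300]], [[350, 280]], [[500, 250]]])
    on_map = cv2.perspectiveTransform(tracks, H)
    print(np.round(on_map.squeeze(), 1))   # positions to render on the aerial map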

Augmented Earth


Poster STS 2011: “3-Dimensional Visualization of the Operating Room Using Advanced Motion Capture: A Novel Paradigm to Expand Simulation-Based Surgical Education”

February 2nd, 2011 Irfan Essa Posted in Computational Photography and Video, Eric Sarin, Health Systems, Kihwan Kim, Papers, Uncategorized, William Cooper

3-Dimensional Visualization of the Operating Room Using Advanced Motion Capture: A Novel Paradigm to Expand Simulation-Based Surgical Education

  • E. L. Sarin, K. Kim, I. Essa, and W. A. Cooper (2011), “3-Dimensional Visualization of the Operating Room Using Advanced Motion Capture: A Novel Paradigm to Expand Simulation-Based Surgical Education,” in Proceedings of Society of Thoracic Surgeons Annual Meeting, Society of Thoracic Surgeons, 2011. [BLOG] [BIBTEX]
    
    @incollection{2011-Sarin-3VORUAMCNPESSE,
      Author = {E. L. Sarin and K. Kim and I. Essa and W. A. Cooper},
      Blog = {http://prof.irfanessa.com/2011/02/02/sts-2011/},
      Booktitle = {Proceedings of Society of Thoracic Surgeons Annual Meeting},
      Month = {January},
      Publisher = {Society of Thoracic Surgeons},
      Title = {3-Dimensional Visualization of the Operating Room Using Advanced Motion Capture: A Novel Paradigm to Expand Simulation-Based Surgical Education},
      Type = {Poster and Video Presentation},
      Year = {2011}}

A collaborative project between the School of Interactive Computing, Georgia Institute of Technology, Atlanta, Georgia; the Division of Cardiothoracic Surgery, Emory University School of Medicine, Atlanta, Georgia; and the Inova Heart and Vascular Institute, Fairfax, Virginia. This was a video and poster presentation at the Society of Thoracic Surgeons Annual Meeting in San Diego, CA, in January 2011.

Poster for the Society of Thoracic Surgeons Annual Meeting
