Going Live on YouTube (2011): Lights, Camera… EDIT! New Features for the YouTube Video Editor

March 21st, 2011 Irfan Essa Posted in Computational Photography and Video, Google, In The News, Matthias Grundmann, Multimedia, Vivek Kwatra, WWW

via YouTube Blog: Lights, Camera… EDIT! New Features for the YouTube Video Editor.

  • M. Grundmann, V. Kwatra, and I. Essa (2011), “Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011. [PDF] [WEBSITE] [VIDEO] [DEMO] [DOI] [BLOG] [BIBTEX]
    @InProceedings{    2011-Grundmann-AVSWROCP,
      author  = {M. Grundmann and V. Kwatra and I. Essa},
      blog    = {http://prof.irfanessa.com/2011/06/19/videostabilization/},
      booktitle  = {Proceedings of IEEE Conference on Computer Vision
          and Pattern Recognition (CVPR)},
      demo    = {http://www.youtube.com/watch?v=0MiY-PNy-GU},
      doi    = {10.1109/CVPR.2011.5995525},
      month    = {June},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2011-Grundmann-AVSWROCP.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Auto-Directed Video Stabilization with Robust L1
          Optimal Camera Paths},
      url    = {http://www.cc.gatech.edu/cpl/projects/videostabilization/},
      video    = {http://www.youtube.com/watch?v=i5keG1Y810U},
      year    = {2011}
    }

Lights, Camera… EDIT! New Features for the YouTube Video Editor

Nine months ago we launched our cloud-based video editor. It was a simple product built to provide our users with basic editing tools. Although it didn’t have all the features available in paid desktop editing software, the idea was that the vast majority of people’s video editing needs are pretty basic and straightforward, and we could provide these features with a free editor available on the Web. Since launch, hundreds of thousands of videos have been published using the YouTube Video Editor, and we’ve regularly pushed out new feature enhancements to the product, including:

  • Video transitions (crossfade, wipe, slide)
  • The ability to save projects across sessions
  • Increased clips allowed in the editor from 6 to 17
  • Video rotation (from portrait to landscape and vice versa – great for videos shot on mobile)
  • Shape transitions (heart, star, diamond, and Jack-O-Lantern for Halloween)
  • Audio mixing (AudioSwap track mixed with original audio)
  • Effects (brightness/contrast, black & white)

A new user interface and project menu for multiple saved projects

While many of these are familiar features also available on desktop software, today, we’re excited to unveil two new features that the team has been working on over the last couple of months that take unique advantage of the cloud:

Stabilizer

Ever shoot a shaky video that’s so jittery it’s actually hard to watch? Professional cinematographers use stabilization equipment such as tripods or camera dollies to keep their shots smooth and steady. Our team mimicked these cinematographic principles by automatically determining the best camera path for you through a unified optimization technique. In plain English, you can smooth some of those unsteady videos with the click of a button. We also wanted you to be able to preview these results in real time, before publishing the finished product to the Web. We do this by harnessing the power of the cloud: the computation required to stabilize the video is split into chunks and distributed across different servers. This allows us to use the power of many machines in parallel, computing and streaming the stabilized results quickly into the preview. You can check out the paper we’re publishing, entitled “Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths.” Want to see the stabilizer in action? You can test it out for yourself, or check out these two videos. The first is without the stabilizer.

And now, with the stabilizer:
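The chunk-and-distribute scheme described above can be sketched in a few lines. This is purely an illustrative sketch, not YouTube's implementation: the per-chunk "smoothing" here is a simple moving average standing in for the per-chunk L1 optimal path solve, and a local thread pool stands in for a fleet of servers.

```python
from concurrent.futures import ThreadPoolExecutor

def stabilize_chunk(chunk):
    """Per-chunk smoothing: a 3-tap moving average over the camera-path
    samples in this chunk (a stand-in for the per-chunk L1 path solve)."""
    smoothed = []
    for i in range(len(chunk)):
        window = chunk[max(0, i - 1):i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed

def stabilize(path, chunk_size=4):
    """Split a 1D camera path into chunks, smooth them in parallel,
    then stitch the results back together in order."""
    chunks = [path[i:i + chunk_size] for i in range(0, len(path), chunk_size)]
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(stabilize_chunk, chunks))  # order preserved
    return [v for chunk in results for v in chunk]
```

Note that a real chunked stabilizer also has to overlap chunks or constrain their boundaries so the smoothed path stays continuous where chunks meet; that detail is omitted here.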


Paper in CVPR (2010): “Motion Fields to Predict Play Evolution in Dynamic Sport Scenes”

June 13th, 2010 Irfan Essa Posted in Activity Recognition, Jessica Hodgins, Kihwan Kim, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers, Sports Visualization

Kihwan Kim, Matthias Grundmann, Ariel Shamir, Iain Matthews, Jessica Hodgins, Irfan Essa (2010), “Motion Fields to Predict Play Evolution in Dynamic Sport Scenes,” in Proceedings of IEEE Computer Vision and Pattern Recognition Conference (CVPR), San Francisco, CA, USA, June 2010. [PDF] [Website] [DOI] [Video (YouTube)]

Abstract

Videos of multi-player team sports provide a challenging domain for dynamic scene analysis. Player actions and interactions are complex as they are driven by many factors, such as the short-term goals of the individual player, the overall team strategy, the rules of the sport, and the current context of the game. We show that constrained multi-agent events can be analyzed and even predicted from video. Such analysis requires estimating the global movements of all players in the scene at any time, and is needed for modeling and predicting how the multi-agent play evolves over time on the field. To this end, we propose a novel approach to detect the locations of where the play evolution will proceed, e.g. where interesting events will occur, by tracking player positions and movements over time. We start by extracting the ground level sparse movement of players in each time-step, and then generate a dense motion field. Using this field we detect locations where the motion converges, implying positions towards which the play is evolving. We evaluate our approach by analyzing videos of a variety of complex soccer plays.
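To make the "detecting where motion converges" idea concrete, here is a toy sketch (not the paper's method, and all names here are illustrative): given a dense 2D motion field over the ground plane, locations the play is evolving toward show up as cells of strongly negative divergence, i.e. where surrounding motion vectors flow inward.

```python
def divergence(field):
    """Discrete divergence of a 2D vector field (a grid of (vx, vy) tuples),
    using central differences; boundary cells are left at zero."""
    h, w = len(field), len(field[0])
    div = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dvx = (field[y][x + 1][0] - field[y][x - 1][0]) / 2.0
            dvy = (field[y + 1][x][1] - field[y - 1][x][1]) / 2.0
            div[y][x] = dvx + dvy
    return div

def convergence_point(field):
    """Cell toward which motion converges most strongly:
    the cell of most negative divergence."""
    div = divergence(field)
    h, w = len(div), len(div[0])
    return min(((x, y) for y in range(h) for x in range(w)),
               key=lambda p: div[p[1]][p[0]])

def toward(c, p):
    """Unit-clamped velocity component pointing from coordinate p toward c."""
    return max(-1, min(1, c - p))

# Toy dense field on a 5x5 pitch: every cell's motion points at (2, 2).
field = [[(toward(2, x), toward(2, y)) for x in range(5)] for y in range(5)]
```

In the paper the dense field is interpolated from sparse tracked player movements; here the toy field is constructed directly so the convergence detector has something to find.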

CVPR 2010 Paper on Play Evolution


Paper in CVPR (2010): “Discontinuous Seam-Carving for Video Retargeting”

June 13th, 2010 Irfan Essa Posted in Computational Photography and Video, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers, Vivek Kwatra

Discontinuous Seam-Carving for Video Retargeting

  • M. Grundmann, V. Kwatra, M. Han, and I. Essa (2010), “Discontinuous Seam-Carving for Video Retargeting,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010. [PDF] [WEBSITE] [DOI] [BIBTEX]
    @InProceedings{    2010-Grundmann-DSVR,
      author  = {M. Grundmann and V. Kwatra and M. Han and I. Essa},
      booktitle  = {Proceedings of IEEE Conference on Computer Vision
          and Pattern Recognition (CVPR)},
      doi    = {10.1109/CVPR.2010.5540165},
      month    = {June},
      pdf    = {http://www.cc.gatech.edu/cpl/projects/videoretargeting/cvpr2010_videoretargeting.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Discontinuous Seam-Carving for Video Retargeting},
      url    = {http://www.cc.gatech.edu/cpl/projects/videoretargeting/},
      year    = {2010}
    }

Abstract

We introduce a new algorithm for video retargeting that uses discontinuous seam-carving in both space and time for resizing videos. Our algorithm relies on a novel appearance-based temporal coherence formulation that allows for frame-by-frame processing and results in temporally discontinuous seams, as opposed to geometrically smooth and continuous seams. This formulation optimizes the difference in appearance of the resultant retargeted frame to the optimal temporally coherent one, and allows for carving around fast moving salient regions.

Additionally, we generalize the idea of appearance-based coherence to the spatial domain by introducing piece-wise spatial seams. Our spatial coherence measure minimizes the change in gradients during retargeting, which preserves spatial detail better than minimization of color difference alone. We also show that per-frame saliency (gradient- based or feature-based) does not always produce desirable retargeting results and propose a novel automatically computed measure of spatio-temporal saliency. As needed, a user may also augment the saliency by interactive region-brushing. Our retargeting algorithm processes the video sequentially, making it conducive for streaming applications.
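For readers unfamiliar with seam carving, a minimal single-frame sketch may help; this is the generic textbook dynamic program, not the paper's algorithm, whose contribution (temporally discontinuous seams with appearance-based coherence) builds on it.

```python
def min_vertical_seam(energy):
    """Minimum-energy vertical seam (one column index per row), found with
    the classic dynamic program: each pixel extends the cheapest of its
    three upper neighbours."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            cost[y][x] += min(cost[y - 1][max(0, x - 1):x + 2])
    # Backtrack from the cheapest bottom cell.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        x = min(range(max(0, x - 1), min(w, x + 2)),
                key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam

def carve(frame, seam):
    """Drop one pixel per row along the seam, narrowing the frame by one."""
    return [row[:sx] + row[sx + 1:] for row, sx in zip(frame, seam)]
```

Applying this per frame independently is exactly what causes the temporal jitter the paper addresses: its appearance-based coherence term lets each frame's seam differ from the previous frame's, while keeping the carved result visually consistent.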

Examples from our CVPR 2010 Paper


Paper in CVPR (2010): “Efficient Hierarchical Graph-Based Video Segmentation”

June 13th, 2010 Irfan Essa Posted in Computational Photography and Video, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Vivek Kwatra

Matthias Grundmann, Vivek Kwatra, Mei Han, Irfan Essa (2010) “Efficient Hierarchical Graph-Based Video Segmentation” in Proceedings of IEEE Computer Vision and Pattern Recognition Conference (CVPR), San Francisco, CA, USA, June 2010 [PDF] [Website] [DOI] [Video (YouTube)].

Abstract

We present an efficient and scalable technique for spatio-temporal segmentation of long video sequences using a hierarchical graph-based algorithm. We begin by over-segmenting a volumetric video graph into space-time regions grouped by appearance. We then construct a “region graph” over the obtained segmentation and iteratively repeat this process over multiple levels to create a tree of spatio-temporal segmentations. This hierarchical approach generates high quality segmentations, which are temporally coherent with stable region boundaries, and allows subsequent applications to choose from varying levels of granularity. We further improve segmentation quality by using dense optical flow to guide temporal connections in the initial graph.

We also propose two novel approaches to improve the scalability of our technique: (a) a parallel out-of-core algorithm that can process volumes much larger than an in-core algorithm, and (b) a clip-based processing algorithm that divides the video into overlapping clips in time, and segments them successively while enforcing consistency.

We demonstrate hierarchical segmentations on video shots as long as 40 seconds, and even support a streaming mode for arbitrarily long videos, albeit without the ability to process them hierarchically.
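A toy sketch of graph-based merging at multiple granularities may clarify the hierarchy idea (illustrative only; the paper merges by appearance over a volumetric video graph and rebuilds region graphs per level, which this sketch does not do). Treating increasing thresholds on edge weight as hierarchy levels, regions connected by cheap edges merge first:

```python
class DSU:
    """Union-find over region ids."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # path halving
            a = self.parent[a]
        return a
    def union(self, a, b):
        self.parent[self.find(b)] = self.find(a)

def segment_levels(n, edges, thresholds):
    """edges: (weight, u, v) region-graph edges, where weight is the
    appearance difference between regions u and v. For each threshold,
    merge every edge at or below it and return one label per region;
    larger thresholds give coarser levels of the hierarchy."""
    levels = []
    for t in thresholds:
        dsu = DSU(n)
        for w, u, v in sorted(edges):
            if w <= t:
                dsu.union(u, v)
        levels.append([dsu.find(i) for i in range(n)])
    return levels
```

Because every level is derived from the same edge weights, coarser labelings are guaranteed to be unions of finer ones, which is the property that lets applications pick a granularity after the fact.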

VideoSegmentation Teaser


Paper in CVPR (2010): “Player Localization Using Multiple Static Cameras for Sports Visualization”

June 13th, 2010 Irfan Essa Posted in Activity Recognition, Jessica Hodgins, Kihwan Kim, Machine Learning, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Raffay Hamid, Sports Visualization

Raffay Hamid, Ram Krishan Kumar, Matthias Grundmann, Kihwan Kim, Irfan Essa, Jessica Hodgins (2010), “Player Localization Using Multiple Static Cameras for Sports Visualization” In Proceedings of IEEE Computer Vision and Pattern Recognition Conference (CVPR), San Francisco, CA, USA, June 2010 [PDF][Website][DOI][Video (Youtube)].

Abstract

We present a novel approach for robust localization of multiple people observed using multiple cameras. We use this location information to generate sports visualizations, which include displaying a virtual offside line in soccer games, and showing players’ positions and motion patterns. Our main contribution is the modeling and analysis for the problem of fusing corresponding players’ positional information as finding minimum weight K-length cycles in complete K-partite graphs. To this end, we use a dynamic programming based approach that varies over a continuum of being maximally to minimally greedy in terms of the number of paths explored at each iteration. We present an end-to-end sports visualization framework that employs our proposed algorithm-class. We demonstrate the robustness of our framework by testing it on 60,000 frames of soccer footage captured over 5 different illumination conditions, play types, and team attire.
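The "continuum of maximally to minimally greedy" search can be illustrated with a beam-search sketch over a complete K-partite graph (hypothetical code, not the paper's algorithm): each part holds one camera's candidate detections for a player, a beam width of 1 is maximally greedy, and a large beam approaches exhaustive search.

```python
def manhattan(a, b):
    """Toy edge weight between two detections (x, y)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def min_weight_cycle(parts, dist, beam=3):
    """Find a low-cost K-length cycle visiting one node per part of a
    complete K-partite graph. `beam` bounds how many partial paths
    survive each step, interpolating greedy -> exhaustive."""
    paths = [(0.0, [p]) for p in parts[0]]
    for part in parts[1:]:
        extended = [(c + dist(path[-1], q), path + [q])
                    for c, path in paths for q in part]
        extended.sort(key=lambda e: e[0])
        paths = extended[:beam]  # keep only the `beam` cheapest chains
    # Close each surviving chain into a cycle and take the cheapest.
    cost, nodes = min(paths, key=lambda e: e[0] + dist(e[1][-1], e[1][0]))
    return cost + dist(nodes[-1], nodes[0]), nodes
```

With K cameras, a cheap cycle groups detections that agree on one player's ground position; the cycle constraint is what makes the correspondences mutually consistent across all cameras rather than only pairwise.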

Teaser Image from CVPR 2010 paper


CVPR 2010: Accepted Papers

April 1st, 2010 Irfan Essa Posted in Activity Recognition, Computational Photography and Video, Jessica Hodgins, Kihwan Kim, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers, Vivek Kwatra

We have the following 4 papers that have been accepted for publication in IEEE CVPR 2010. More details, with links, forthcoming.
  • Matthias Grundmann, Vivek Kwatra, Mei Han, and Irfan Essa (2010) “Discontinuous Seam-Carving for Video Retargeting” (a GA Tech, Google Collaboration)
  • Matthias Grundmann, Vivek Kwatra, Mei Han, and Irfan Essa (2010) “Efficient Hierarchical Graph-Based Video Segmentation” (a GA Tech, Google Collaboration)
  • Kihwan Kim, Matthias Grundmann, Ariel Shamir, Iain Matthews, Jessica Hodgins, and Irfan Essa (2010) “Motion Fields to Predict Play Evolution in Dynamic Sport Scenes” (a GA Tech, Disney Collaboration)
  • Raffay Hamid, Ramkrishan Kumar, Matthias Grundmann, Kihwan Kim, Irfan Essa, and Jessica Hodgins (2010) “Player Localization Using Multiple Static Cameras for Sports Visualization” (a GA Tech, Disney Collaboration)

Paper: ICPR (2008) “3D Shape Context and Distance Transform for Action Recognition”

December 8th, 2008 Irfan Essa Posted in Activity Recognition, Aware Home, Face and Gesture, Franzi Meier, Matthias Grundmann, PAMI/ICCV/CVPR/ECCV, Papers

M. Grundmann, F. Meier, and I. Essa (2008) “3D Shape Context and Distance Transform for Action Recognition”, In Proceedings of International Conference on Pattern Recognition (ICPR) 2008, Tampa, FL. [Project Page | DOI | PDF]

ABSTRACT

We propose the use of 3D (2D+time) Shape Context to recognize the spatial and temporal details inherent in human actions. We represent an action in a video sequence by a 3D point cloud extracted by sampling 2D silhouettes over time. A non-uniform sampling method is introduced that gives preference to fast moving body parts using a Euclidean 3D Distance Transform. Actions are then classified by matching the extracted point clouds. Our proposed approach is based on a global matching and does not require specific training to learn the model. We test the approach thoroughly on two publicly available datasets and compare to several state-of-the-art methods. The achieved classification accuracy is on par with or superior to the best results reported to date.
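The distance-transform-based sampling preference can be sketched on a toy 2D case (illustrative only; the paper uses an efficient Euclidean 3D distance transform over 2D+time silhouettes, and the helper names here are made up): silhouette points in the current frame that lie far from the previous frame's silhouette belong to fast-moving body parts, so they get larger sampling weights.

```python
import math

def distance_transform(grid):
    """Brute-force Euclidean distance transform: for each cell, distance to
    the nearest foreground (1) cell. Fine for toy grids; real systems use
    linear-time algorithms."""
    fg = [(y, x) for y, row in enumerate(grid) for x, v in enumerate(row) if v]
    h, w = len(grid), len(grid[0])
    return [[min(math.hypot(y - fy, x - fx) for fy, fx in fg)
             for x in range(w)] for y in range(h)]

def motion_weights(prev_sil, curr_sil):
    """Weight each current-silhouette point by its distance to the previous
    silhouette: far from where the body was = fast-moving part."""
    dt = distance_transform(prev_sil)
    h, w = len(curr_sil), len(curr_sil[0])
    return [[dt[y][x] if curr_sil[y][x] else 0.0
             for x in range(w)] for y in range(h)]
```

Sampling points with probability proportional to these weights yields the non-uniform 3D point cloud the abstract describes, concentrated on the limbs rather than the torso.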
