
Coursera Course on Computational Photography, NOW LIVE and RUNNING!

March 23rd, 2013 Irfan Essa Posted in Computational Photography, Coursera, Denis Lantsman | 1 Comment »

We are live!

Welcome to the course website!

As you take a look around, there are a few things we want to bring to your attention:

1. Do note that while this is an introductory class, we do require a working knowledge of college-level mathematics, including concepts from Linear Algebra and Calculus. In addition, the programming assignments will require access to a computer with Python and OpenCV. Instructions for installing this software are available here, and are also linked in the navigation bar of the site. The responsibility for getting your computer system working with this software is entirely yours. The diversity of computer platforms and software does pose challenges in setting up such systems, and we encourage folks to use the forums to help each other with the challenges they face. Please get started with the software installation as soon as possible!

2. These online courses are still rather mysterious, and we want to learn as much as we can about them: what works, and what doesn't? Accordingly, we invite you to participate in occasional surveys that are part of the research study. The surveys will tell us about you, and how the course is working for you. The information you provide will help us make the course better and improve our ability to provide more intellectually engaging content. Participation in the surveys is completely optional and will not affect your grade in the course in any way. For your reference, here is a link to the Consent Form (a PDF document) that describes the surveys and study in more detail. If you would like to participate, begin by filling out the background survey here. Also, keep an eye on the syllabus page and the weekly announcements for end-of-week surveys.

3. The syllabus page will be updated every week to reflect all of the course content as it becomes available. Please consult that page if you are unsure about what to do next. The page also contains useful information about class logistics, policies, and frequently asked questions.

4. Note that the lecture videos page contains links to the subtitles, slides, and video file downloads for each of the lecture videos.

Thank you for your attention, and I hope you will have fun in the following few weeks!

-Denis

via Announcements | Computational Photography.

Computational Photography via Coursera


Matthias Grundmann’s PhD Thesis Defense (2013): “Computational Video: Post-processing Methods for Stabilization, Retargeting and Segmentation”

February 4th, 2013 Irfan Essa Posted in Computational Photography and Video, Matthias Grundmann, PhD | No Comments »

Title: Computational Video: Post-processing Methods for Stabilization, Retargeting and Segmentation

Matthias Grundmann
School of Interactive Computing
College of Computing
Georgia Institute of Technology

Date: February 04, 2013 (Monday)
Time: 3:00p – 6:00p EST
Location: Nano building, 116-118

Abstract:

In this thesis, we address a variety of challenges for analysis and enhancement of Computational Video. We present novel post-processing methods to bridge the difference between professional and casually shot videos mostly seen on online sites. Our research presents solutions to three well-defined problems: (1) Video stabilization and rolling shutter removal in casually-shot, uncalibrated videos; (2) Content-aware video retargeting; and (3) spatio-temporal video segmentation to enable efficient video annotation. We showcase several real-world applications building on these techniques.

We start by proposing a novel algorithm for video stabilization that generates stabilized videos by employing L1-optimal camera paths to remove undesirable motions. We compute camera paths that are optimally partitioned into constant, linear and parabolic segments mimicking the camera motions employed by professional cinematographers. To achieve this, we propose a linear programming framework to minimize the first, second, and third derivatives of the resulting camera path. Our method allows for video stabilization beyond conventional filtering, which only suppresses high-frequency jitter. An additional challenge in videos shot from mobile phones is rolling shutter distortion. Modern CMOS cameras capture the frame one scanline at a time, which results in non-rigid image distortions such as shear and wobble. We propose a solution based on a novel mixture model of homographies parametrized by scanline blocks to correct these rolling shutter distortions. Our method neither relies on a priori knowledge of the readout time nor requires prior camera calibration. Our novel video stabilization and calibration-free rolling shutter removal have been deployed on YouTube, where they have successfully stabilized millions of videos. We also discuss several extensions to the stabilization algorithm and present technical details behind the widely used YouTube Video Stabilizer.
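To make the stabilization objective above concrete, here is a minimal sketch that only evaluates a weighted L1 cost over the first, second, and third finite differences of a path. It does not solve the linear program and is not the thesis implementation. The one-dimensional path, the function names, and the equal default weights are all assumptions made for illustration; a real camera path would be a sequence of frame-to-frame transforms.

```python
def differences(path, order):
    """Return the order-th forward finite difference of a 1-D sequence."""
    values = list(path)
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values


def l1_path_cost(path, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the L1 norms of the 1st, 2nd and 3rd differences.

    The weights are hypothetical; the abstract does not state any values.
    """
    return sum(
        w * sum(abs(d) for d in differences(path, order))
        for order, w in enumerate(weights, start=1)
    )


# Toy 1-D "camera paths" (assumed data, for illustration only).
constant = [5.0, 5.0, 5.0, 5.0, 5.0, 5.0]      # static camera
linear = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]        # constant-velocity pan
parabolic = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]   # constant-acceleration ease
jittery = [0.0, 1.4, 1.7, 3.5, 3.6, 5.0]       # shaky hand-held pan

print(l1_path_cost(constant))                        # every term is zero
print(l1_path_cost(linear, weights=(0.0, 1.0, 1.0)))
print(l1_path_cost(parabolic, weights=(0.0, 0.0, 1.0)))
print(l1_path_cost(jittery) > l1_path_cost(linear))
```

A constant path has zero first difference, a linear path has zero second difference, and a parabolic path has zero third difference. Because an L1 penalty tends to drive many terms exactly to zero, minimizing such a cost plausibly favors paths built from these three kinds of segments, which matches the description in the abstract.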

We address the challenge of changing the aspect ratio of videos by proposing algorithms that retarget videos to fit the form factor of a given device without stretching or letter-boxing. Our approaches use all of the screen’s pixels, while striving to deliver as much of the original video content as possible. First, we introduce a new algorithm that uses discontinuous seam-carving in both space and time for resizing videos. Our algorithm relies on a novel appearance-based temporal coherence formulation that allows for frame-by-frame processing and results in temporally discontinuous seams, as opposed to geometrically smooth and continuous seams. Second, we present a technique that builds on the above-mentioned video stabilization approach. We effectively automate classical pan and scan techniques by smoothly guiding a virtual crop window via saliency constraints.
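For readers new to seam carving, the sketch below shows the classical single-image version: a dynamic program that finds one connected, minimum-energy vertical seam and removes it. This is the continuous baseline that the abstract contrasts against, not the discontinuous spatio-temporal algorithm of the thesis. The energy grid and the function names are assumptions made for illustration.

```python
def find_vertical_seam(energy):
    """Return one column index per row for the minimum-energy connected seam."""
    rows, cols = len(energy), len(energy[0])
    # cost[r][c] = cheapest total energy of any seam ending at (r, c).
    cost = [list(energy[0])]
    for r in range(1, rows):
        prev = cost[-1]
        row = []
        for c in range(cols):
            lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
            row.append(energy[r][c] + min(prev[lo:hi + 1]))
        cost.append(row)

    # Backtrack from the cheapest cell in the last row.
    c = min(range(cols), key=lambda j: cost[-1][j])
    seam = [c]
    for r in range(rows - 1, 0, -1):
        lo, hi = max(c - 1, 0), min(c + 1, cols - 1)
        c = min(range(lo, hi + 1), key=lambda j: cost[r - 1][j])
        seam.append(c)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete one pixel per row, narrowing the image by one column."""
    return [row[:c] + row[c + 1:] for row, c in zip(image, seam)]


# Assumed toy energy map: a cheap diagonal path through expensive pixels.
energy = [
    [9, 1, 9, 9],
    [9, 9, 1, 9],
    [9, 9, 9, 1],
]
seam = find_vertical_seam(energy)
print(seam)                       # [1, 2, 3]
print(remove_seam(energy, seam))  # three rows of [9, 9, 9]
```

In this baseline each seam pixel must sit within one column of the pixel in the row above. The thesis relaxes that kind of connectivity across time, which is why its seams are described as temporally discontinuous.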

Finally, we introduce an efficient and scalable technique for spatio-temporal segmentation of long video sequences using a hierarchical graph-based algorithm. We begin by over-segmenting a volumetric video graph into space-time regions grouped by appearance. We then construct a “region graph” over the obtained segmentation and iteratively repeat this process over multiple levels to create a tree of spatio-temporal segmentations. This hierarchical approach generates high quality segmentations, and allows subsequent applications to choose from varying levels of granularity. We demonstrate the use of spatio-temporal segmentation as users interact with the video, enabling efficient annotation of objects within the video.
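The idea of a tree of segmentations can be illustrated with a much simplified stand-in: a union-find structure that merges graph nodes across edges whose weight falls below a rising sequence of thresholds, so that each level is a coarsening of the one before. The fixed thresholds, the raw edge weights, and all names here are assumptions made for illustration. The thesis builds a region graph with its own region descriptors and merge criterion, which this sketch does not reproduce.

```python
def find(parent, x):
    """Return the representative of x, shortening the chain as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def segment_hierarchy(num_nodes, edges, thresholds):
    """Merge nodes level by level.

    edges: iterable of (weight, u, v) tuples.
    Returns one list of region labels per threshold, from fine to coarse.
    """
    parent = list(range(num_nodes))
    ordered = sorted(edges)
    levels = []
    for t in sorted(thresholds):
        for weight, u, v in ordered:
            if weight > t:
                break  # remaining edges are too dissimilar at this level
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[rv] = ru
        levels.append([find(parent, n) for n in range(num_nodes)])
    return levels


# Assumed toy data: six voxels on a chain with an appearance value each.
appearance = [10, 11, 12, 50, 52, 90]
edges = [
    (abs(appearance[i] - appearance[i + 1]), i, i + 1)
    for i in range(len(appearance) - 1)
]
for labels in segment_hierarchy(len(appearance), edges, thresholds=[1, 5, 40]):
    print(len(set(labels)), labels)
```

Because regions are only ever merged and never split, every coarser level nests inside the finer ones. That nesting is what lets an application pick whichever level of granularity suits it.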

Committee:

  • Dr. Irfan Essa (Advisor, School of Interactive Computing, Georgia Tech)
  • Dr. Jim Rehg (School of Interactive Computing, Georgia Tech)
  • Dr. Frank Dellaert (School of Interactive Computing, Georgia Tech)
  • Dr. Michael Black (Perceiving Systems Department, Max Planck Institute for Intelligent Systems)
  • Dr. Sing Bing Kang (Adjunct Faculty, Georgia Tech; Microsoft Research, Microsoft Corp.)
  • Dr. Vivek Kwatra (Google Research, Google Inc.)


Videos from the Computational Journalism Symposium (Jan 31 – Feb 1, 2013).

February 1st, 2013 Irfan Essa Posted in Computational Journalism, Events, Presentations | No Comments »

The Computation + Journalism Symposium 2013, held Jan 31 – Feb 1, 2013, at Georgia Institute of Technology, Atlanta, GA, USA, was a huge success. Please see the videos of all the sessions here. See me discuss computational journalism with Phil Meyer, and my slides and take-away points from the closing session.


Computational Photography Classes at Georgia Tech | CS 4475 Spring 2013

January 7th, 2013 Irfan Essa Posted in Teaching | No Comments »

I am teaching my usual Computational Photography Class at Georgia Tech | CS 4475 for Spring 2013. This site should have all the external materials for the class. All internal GA Tech materials will be available via the T-square site. I am also hoping to use the Piazza site.


Computation + Journalism Symposium 2013 on Jan 31 – Feb 1, at GA Tech.

January 2nd, 2013 Irfan Essa Posted in Brad Stenger, CnJ, Computational Journalism, Events, Interesting, Nick Diakopoulos | No Comments »

Join us for the 2nd Computation + Journalism Symposium 2013 in Atlanta, GA on Jan 31 – Feb 1, 2013

What role does computation have in the practice of journalism today and in the near future? As computer-driven forces like automation and aggregation increasingly alter the role of journalists and journalism in society, how can computation become a force of deliberate, positive social impact in journalism and civic life? Five years after the first Computation and Journalism symposium at Georgia Tech, this event brings together leaders in both journalism and computation to discuss and debate current trends and future opportunities.

Join us for the second Symposium on Computation + Journalism, to be held at the Georgia Institute of Technology in Atlanta on Jan 31 – Feb 1, 2013. Visit this site for additional details.


Best Wishes for 2013

January 2nd, 2013 Irfan Essa Posted in Uncategorized | No Comments »

Happy 2013

Happy New Year

Best Wishes for 2013 and Beyond!


A Wordle of all Title/Abstracts from 2012

January 2nd, 2013 Irfan Essa Posted in Wordle | No Comments »

A Wordle I generated using Titles/Abstracts from all my papers/talks/presentations from 2012. Click here for the original wordle.

 

 


Congrats to Joy Buolamwini, Rhodes Scholar 2012

November 19th, 2012 Irfan Essa Posted in Joy Buolamwini | No Comments »

Georgia Tech Alumna Named Rhodes Scholar | College of Computing.

Georgia Tech alumna Joy Buolamwini has been named a Rhodes scholar. She will attend the University of Oxford, where she plans to pursue degrees in African studies and global governance and diplomacy.

A 2012 computer science graduate from Memphis, Tenn., Buolamwini plans to use her skills to lower barriers in communication with the goal of increasing commerce, keeping governments accountable and improving the quality of life for millions of people.

“My mission is to show compassion through computation,” Buolamwini said. “The heart of computing is humanity, and as a Rhodes scholar, I will have an unprecedented opportunity to gain a deeper understanding of developing nations and global governance while connecting with world leaders who are committed to fighting the world’s fight – making sure each individual can reach her human potential.”


Presentation (2012): CMU Robotics Institute Seminar

October 19th, 2012 Irfan Essa Posted in Computational Photography and Video, Matthias Grundmann, Presentations, Vivek Kwatra | No Comments »

Video Analysis and Enhancement: Video Stabilization and Rolling Shutter Removal on YouTube

Irfan Essa
Georgia Tech
School of Interactive Computing
GVU and RIM @ GT Centers

October 19, 2012, 3:30 PM, NSH 1305

Abstract

In this talk, I will discuss a variety of approaches my group is working on for video analysis and enhancement. In particular, I will describe our approach for a video stabilizer, currently implemented and running on YouTube, and its extensions.

This method generates stabilized videos by employing L1-optimal camera paths to remove undesirable motions [1]. We compute camera paths that are optimally partitioned into constant, linear and parabolic segments mimicking the camera motions employed by professional cinematographers. We propose a linear programming framework to minimize the first, second, and third derivatives of the resulting camera path. Our method allows for video stabilization beyond the conventional filtering that only suppresses high-frequency jitter. An additional challenge in videos shot from mobile phones is rolling shutter distortion. Modern CMOS cameras capture the frame one scan-line at a time, which results in non-rigid image distortions such as shear and wobble. I will demonstrate a solution based on a novel mixture model of homographies parametrized by scan-line blocks to correct these rolling shutter distortions [2]. Our method neither relies on a priori knowledge of the readout time nor requires prior camera calibration. A thorough evaluation based on a user study and direct comparisons to other approaches demonstrates a general preference for our algorithm.

I will conclude the talk by showcasing a live demo of the stabilizer. This work is in collaboration with Matthias Grundmann and Vivek Kwatra at Google, and appears in the following two papers.

Time permitting, I will discuss some other projects we are working on, including video segmentation and retargeting.

[1] Matthias Grundmann, Vivek Kwatra, Irfan Essa, CVPR 2011, www.cc.gatech.edu/cpl/projects/videostabilization

[2] Matthias Grundmann, Vivek Kwatra, Daniel Castro, Irfan Essa, ICCP 2012, Best paper, www.cc.gatech.edu/cpl/projects/rollingshutter

Host: Takeo Kanade

via Robotics Institute: Talks and Seminars.


Paper in ECCV Workshop 2012: “Weakly Supervised Learning of Object Segmentations from Web-Scale Videos”

October 7th, 2012 Irfan Essa Posted in Activity Recognition, Awards, Google, Matthias Grundmann, Multimedia, PAMI/ICCV/CVPR/ECCV, Papers, Vivek Kwatra, WWW | No Comments »

Weakly Supervised Learning of Object Segmentations from Web-Scale Videos

  • G. Hartmann, M. Grundmann, J. Hoffman, D. Tsai, V. Kwatra, O. Madani, S. Vijayanarasimhan, I. Essa, J. Rehg, and R. Sukthankar (2012), “Weakly Supervised Learning of Object Segmentations from Web-Scale Videos,” in Proceedings of ECCV 2012 Workshop on Web-scale Vision and Social Media, 2012. [PDF] [BIBTEX]
    @inproceedings{2012-Hartmann-WSLOSFWV,
      Author = {Glenn Hartmann and Matthias Grundmann and Judy Hoffman and David Tsai and Vivek Kwatra and Omid Madani and Sudheendra Vijayanarasimhan and Irfan Essa and James Rehg and Rahul Sukthankar},
      Booktitle = {Proceedings of ECCV 2012 Workshop on Web-scale Vision and Social Media},
      Date-Added = {2012-10-23 15:03:18 +0000},
      Date-Modified = {2012-10-23 15:07:04 +0000},
      Pdf = {http://www.cs.cmu.edu/~rahuls/pub/eccv2012wk-cp-rahuls.pdf},
      Title = {Weakly Supervised Learning of Object Segmentations from Web-Scale Videos},
      Year = {2012}}

Abstract

We propose to learn pixel-level segmentations of objects from weakly labeled (tagged) internet videos. Specifically, given a large collection of raw YouTube content, along with potentially noisy tags, our goal is to automatically generate spatiotemporal masks for each object, such as “dog”, without employing any pre-trained object detectors. We formulate this problem as learning weakly supervised classifiers for a set of independent spatio-temporal segments. The object seeds obtained using segment-level classifiers are further refined using graphcuts to generate high-precision object masks. Our results, obtained by training on a dataset of 20,000 YouTube videos weakly tagged into 15 classes, demonstrate automatic extraction of pixel-level object masks. Evaluated against a ground-truthed subset of 50,000 frames with pixel-level annotations, we confirm that our proposed methods can learn good object masks just by watching YouTube.

Presented at: ECCV 2012 Workshop on Web-scale Vision and Social Media, 2012, October 7-12, 2012, in Florence, ITALY.

Awarded the BEST PAPER AWARD!

 
