<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>prof.irfanessa.com &#187; 2002</title>
	<atom:link href="http://prof.irfanessa.com/tag/2002/feed/" rel="self" type="application/rss+xml" />
	<link>http://prof.irfanessa.com</link>
	<description>Irfan Essa&#039;s Academic Activities</description>
	<lastBuildDate>Wed, 25 Jan 2012 23:42:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>Funding: NSF/ITR (2002) &#8220;Analysis of Complex Audio-Visual Events Using Spatially Distributed Sensors&#8221;</title>
		<link>http://prof.irfanessa.com/2002/10/01/funding-nsfitr-2002-analysis-of-complex-audio-visual-events-using-spatially-distributed-sensors/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=funding-nsfitr-2002-analysis-of-complex-audio-visual-events-using-spatially-distributed-sensors</link>
		<comments>http://prof.irfanessa.com/2002/10/01/funding-nsfitr-2002-analysis-of-complex-audio-visual-events-using-spatially-distributed-sensors/#comments</comments>
		<pubDate>Tue, 01 Oct 2002 14:56:34 +0000</pubDate>
		<dc:creator>Irfan Essa</dc:creator>
				<category><![CDATA[0205507]]></category>
		<category><![CDATA[Funding]]></category>
		<category><![CDATA[James Rehg]]></category>
		<category><![CDATA[2002]]></category>
		<category><![CDATA[Audio Analysis]]></category>
		<category><![CDATA[Computer Vision]]></category>
		<category><![CDATA[NSF]]></category>

		<guid isPermaLink="false">http://academics.irfanessa.com/2002/10/01/funding-nsfitr-2002-analysis-of-complex-audio-visual-events-using-spatially-distributed-sensors/</guid>
		<description><![CDATA[Award#0205507 &#8211; ITR: Analysis of Complex Audio-Visual Events Using Spatially Distributed Sensors ABSTRACT We propose to develop a comprehensive framework for the joint analysis of audio-visual signals obtained from spatially distributed microphones and cameras. We desire solutions to the audio-visual sensing problem that will scale to an arbitrary number of cameras and microphones and can [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://nsf.gov/awardsearch/showAward.do?AwardNumber=0205507">Award#0205507 &#8211; ITR: Analysis of Complex Audio-Visual Events Using Spatially Distributed Sensors</a></p>
<p style="text-align: center;"><strong>ABSTRACT</strong></p>
<p style="text-align: justify;">We propose to develop a comprehensive framework for the joint analysis of audio-visual signals obtained from spatially distributed microphones and cameras. We desire solutions to the audio-visual sensing problem that will scale to an arbitrary number of cameras and microphones and can address challenging environments in which there are multiple speech and nonspeech sound sources and multiple moving people and objects. Recently it has become relatively inexpensive to deploy tens or even hundreds of cameras and microphones in an environment. Many applications could benefit from the ability to sense in both modalities. There are two levels at which joint audio-visual analysis can take place. At the signal level, the challenge is to develop representations that capture the rich dependency structure in the joint signal and deal successfully with issues such as variable sampling rates and varying temporal delays between cues. At the spatial level, the challenge is to compensate for the distortions introduced by the sensor location and pool information across sensors to recover 3-D information about the spatial environment. For many applications, it is highly desirable if the solution method is self-calibrating and does not require an extensive manual calibration process every time a new sensor is added or an old sensor is moved or replaced. Removing the burden of manual calibration also makes it possible to exploit ad hoc sensor networks, which could arise, for example, from wearable microphones and cameras. We propose to address the following four research topics: 1. Representations and learning methods for signal level fusion. 2. Volumetric techniques for fusing spatially distributed audio-visual data. 3. Self-calibration of distributed microphone-camera systems. 4. Applications of audio-visual sensing. For example, this proposal includes considerable work on lip and facial analysis to improve voice communications.</p>
]]></content:encoded>
			<wfw:commentRss>http://prof.irfanessa.com/2002/10/01/funding-nsfitr-2002-analysis-of-complex-audio-visual-events-using-spatially-distributed-sensors/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Paper AAAI (2002): &#8220;Recognizing Multitasked Activities from Video using Stochastic Context-Free Grammar&#8221;</title>
		<link>http://prof.irfanessa.com/2002/09/29/paper-aaai-2002-recognizing-multitasked-activities-from-video-using-stochastic-context-free-grammar/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=paper-aaai-2002-recognizing-multitasked-activities-from-video-using-stochastic-context-free-grammar</link>
		<comments>http://prof.irfanessa.com/2002/09/29/paper-aaai-2002-recognizing-multitasked-activities-from-video-using-stochastic-context-free-grammar/#comments</comments>
		<pubDate>Sun, 29 Sep 2002 15:13:50 +0000</pubDate>
		<dc:creator>Irfan Essa</dc:creator>
				<category><![CDATA[AAAI/IJCAI/UAI]]></category>
		<category><![CDATA[Activity Recognition]]></category>
		<category><![CDATA[Darnell Moore]]></category>
		<category><![CDATA[Intelligent Environments]]></category>
		<category><![CDATA[Papers]]></category>
		<category><![CDATA[2002]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Computer Vision]]></category>

		<guid isPermaLink="false">http://prof.irfanessa.com/?p=631</guid>
		<description><![CDATA[D. Moore and I. Essa (2002). &#8220;Recognizing multitasked activities from video using stochastic context-free grammar&#8221;, in Proceedings of AAAI 2002. [PDF &#124; Project Site] Abstract In this paper, we present techniques for recognizing complex, multitasked activities from video. Visual information like image features and motion appearances, combined with domain-specific information, like object context, is [...]]]></description>
			<content:encoded><![CDATA[<p>D. Moore and I. Essa (2002). &#8220;Recognizing multitasked activities from video using stochastic context-free grammar&#8221;, in Proceedings of AAAI 2002. [<a href="http://www.aaai.org/Papers/AAAI/2002/AAAI02-116.pdf">PDF</a> | <a href="http://www.cc.gatech.edu/cpl/projects/objectspaces/" target="_blank">Project Site</a>]</p>
<p style="text-align: center;"><strong>Abstract</strong></p>
<p style="text-align: justify;">In this paper, we present techniques for recognizing complex, multitasked activities from video. Visual information like image features and motion appearances, combined with domain-specific information, like object context, is used initially to label events. Each action event is represented with a unique symbol, allowing for a sequence of interactions to be described as an ordered symbolic string. Then, a model of stochastic context-free grammar (SCFG), which is developed using underlying rules of an activity, is used to provide the structure for recognizing semantically meaningful behavior over extended periods. Symbolic strings are parsed using the Earley-Stolcke algorithm to determine the most likely semantic derivation for recognition. Parsing substrings allows us to recognize patterns that describe high-level, complex events taking place over segments of the video sequence. We introduce new parsing strategies to enable error detection and recovery in stochastic context-free grammar and methods of quantifying group and individual behavior in activities with separable roles. We show through experiments, with a popular card game, the recognition of high-level narratives of multi-player games and the identification of player strategies and behavior using computer vision.</p>
<p style="text-align: justify;">
<div class="wp-caption aligncenter" style="width: 396px"><a href="http://lh3.ggpht.com/_ukXHDWz1Yr0/SujBSmeQr9I/AAAAAAAA2YI/5Lp-GeSp28Q/OS-bjack.jpg"><img class=" " title="Recognizing Black Jack" src="http://lh3.ggpht.com/_ukXHDWz1Yr0/SujBSmeQr9I/AAAAAAAA2YI/5Lp-GeSp28Q/OS-bjack.jpg" alt="Recognizing Black Jack" width="386" height="183" /></a><p class="wp-caption-text">Recognizing Black Jack</p></div>
]]></content:encoded>
			<wfw:commentRss>http://prof.irfanessa.com/2002/09/29/paper-aaai-2002-recognizing-multitasked-activities-from-video-using-stochastic-context-free-grammar/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Paper: ICPR (2002) &#8220;Learning video processing by example&#8221;</title>
		<link>http://prof.irfanessa.com/2002/08/11/ieeexplore-learning-video-processing-by-example/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=ieeexplore-learning-video-processing-by-example</link>
		<comments>http://prof.irfanessa.com/2002/08/11/ieeexplore-learning-video-processing-by-example/#comments</comments>
		<pubDate>Sun, 11 Aug 2002 19:33:08 +0000</pubDate>
		<dc:creator>Irfan Essa</dc:creator>
				<category><![CDATA[Antonio Haro]]></category>
		<category><![CDATA[Collaborators]]></category>
		<category><![CDATA[Computational Photography and Video]]></category>
		<category><![CDATA[Numerical Machine Learning]]></category>
		<category><![CDATA[PAMI/ICCV/CVPR/ECCV]]></category>
		<category><![CDATA[2002]]></category>
		<category><![CDATA[Computational Photography]]></category>
		<category><![CDATA[Computer Vision]]></category>

		<guid isPermaLink="false">http://academics.irfanessa.com/?p=267</guid>
		<description><![CDATA[Haro, A. and Essa, I. (2002), &#8220;Learning video processing by example&#8221; In Proceedings of 16th International Conference on Pattern Recognition, 2002, 11-15 Aug. 2002 Volume: 1, page(s): 487 &#8211; 491 vol.1, Number of Pages: 4 vol.(xxix 834 xxxv 1116 xxxiii 1068 xxv 418), ISSN: 1051-4651, ISBN: 0-7695-1695-X, [Digital Object Identifier: 10.1109/ICPR.2002.1044771][IEEEXplore#] Abstract We present an algorithm [...]]]></description>
			<content:encoded><![CDATA[<p>Haro, A. and Essa, I. (2002), &#8220;Learning video processing by example&#8221; <em>In Proceedings of 16th International Conference on Pattern Recognition, 2002</em>, 11-15 Aug. 2002 Volume: 1, page(s): 487 &#8211; 491 vol.1, Number of Pages: 4 vol.(xxix 834 xxxv 1116 xxxiii 1068 xxv 418), ISSN: 1051-4651, ISBN: 0-7695-1695-X, [Digital Object Identifier: 10.1109/ICPR.2002.1044771][<a href="http://ieeexplore.ieee.org/search/freesrchabstract.jsp?arnumber=1044771&amp;isnumber=22378&amp;punumber=8091&amp;k2dockey=1044771@ieeecnfs&amp;query=((essa)%3Cin%3Eau+)&amp;pos=5&amp;access=yes">IEEEXplore#</a>]</p>
<p style="text-align: center;"><strong>Abstract</strong></p>
<p style="text-align: justify;">We present an algorithm that approximates the output of an arbitrary video processing algorithm based on a pair of input and output exemplars. Our algorithm relies on learning the mapping between the input and output exemplars to model the processing that has taken place. We approximate the processing by observing that pixel neighborhoods similar in appearance and motion to those in the exemplar input should result in neighborhoods similar to the exemplar output. Since there are not many pixel neighborhoods in the exemplars, we use techniques from texture synthesis to generalize the output of neighborhoods not observed in the exemplars. The same algorithm is used to learn such processing as motion blur, color correction, and painting.</p>
]]></content:encoded>
			<wfw:commentRss>http://prof.irfanessa.com/2002/08/11/ieeexplore-learning-video-processing-by-example/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Talk: Invited Speaker at CMU&#8217;s Robotics Institute (2002): &#8220;Temporal Reasoning from Video to Temporal Synthesis of Video&#8221;</title>
		<link>http://prof.irfanessa.com/2002/02/12/talk-at-cmus-robotics-institute-2002/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=talk-at-cmus-robotics-institute-2002</link>
		<comments>http://prof.irfanessa.com/2002/02/12/talk-at-cmus-robotics-institute-2002/#comments</comments>
		<pubDate>Wed, 13 Feb 2002 00:25:49 +0000</pubDate>
		<dc:creator>Irfan Essa</dc:creator>
				<category><![CDATA[Activity Recognition]]></category>
		<category><![CDATA[Aware Home]]></category>
		<category><![CDATA[Computational Photography and Video]]></category>
		<category><![CDATA[Presentations]]></category>
		<category><![CDATA[2002]]></category>
		<category><![CDATA[Computational Photography]]></category>

		<guid isPermaLink="false">http://irfan.essa.org/wp/2002/02/12/talk-at-cmus-robotics-institute-2002/</guid>
		<description><![CDATA[Irfan Essa, &#8220;Temporal Reasoning from Video to Temporal Synthesis of Video&#8221; CMU&#8217;s Robotics Institute: Seminar, February 15, 2002 Temporal Reasoning from Video to Temporal Synthesis of Video Abstract In this talk, I will present some ongoing work on extracting spatio-temporal cues from video for both synthesis of novel video sequences, and recognition of complex activities. [...]]]></description>
			<content:encoded><![CDATA[<ul>
<li>Irfan Essa, &#8220;Temporal Reasoning from Video to Temporal Synthesis of Video<strong>&#8221; </strong><a href="http://www.cs.cmu.edu/~ri-seminar/archives/2002.spring/2002.Feb.15.html">CMU&#8217;s Robotics Institute: Seminar, February 15, 2002</a></li>
</ul>
<p align="center"><strong>Temporal Reasoning from Video to Temporal Synthesis of Video</strong></p>
<p align="center">Abstract</p>
<p style="text-align: justify;">In this talk, I will present some ongoing work on extracting spatio-temporal cues from video for both synthesis of novel video sequences and recognition of complex activities. First I will discuss (in brief) our work on Video Textures, where repeating information is extracted to generate extended sequences of videos. I will then describe some of our extensions to this approach that allow for controlled generation of animations of video sprites. We have developed various learning and optimization techniques that allow for video-based animations of photo-realistic characters. Then I will describe our new approach for image and video synthesis that builds on optimal patch-based copying of samples. I will show how our method allows for iterative refinement and extends to synthesis of both images and video from very limited samples. In the next part of my talk, I will describe how a similar analysis of video can be used to recognize what a person is doing in a scene. Such an analysis of video, aimed at recognition, requires more contextual information about the environment. I will show how we leverage contextual information shared between actions and objects to recognize what is happening in complex environments. I will also show that by adding some form of grammar (we use Stochastic Context-Free Grammar) we can recognize very complex, multi-tasked activities. Finally, I will describe (very briefly) the Aware Home project at Georgia Tech, which is one primary area of ongoing and future research for me and my group. Further information on my work with videos is available from my webpage at http://www.cc.gatech.edu/~irfan</p>
]]></content:encoded>
			<wfw:commentRss>http://prof.irfanessa.com/2002/02/12/talk-at-cmus-robotics-institute-2002/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

