Paper in Ubicomp 2015: “A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing”

September 8th, 2015 Irfan Essa Posted in ACM UIST/CHI, Activity Recognition, Behavioral Imaging, Edison Thomaz, Gregory Abowd, Health Systems, Machine Learning, Mobile Computing, Papers, UBICOMP, Ubiquitous Computing

Paper

  • E. Thomaz, I. Essa, and G. D. Abowd (2015), “A Practical Approach for Recognizing Eating Moments with Wrist-Mounted Inertial Sensing,” in Proceedings of ACM International Conference on Ubiquitous Computing (UBICOMP), 2015. [PDF] [BIBTEX]
    @InProceedings{    2015-Thomaz-PAREMWWIS,
      author  = {Edison Thomaz and Irfan Essa and Gregory D. Abowd},
      booktitle  = {Proceedings of ACM International Conference on
          Ubiquitous Computing (UBICOMP)},
      month    = {September},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Thomaz-PAREMWWIS.pdf}
          ,
      title    = {A Practical Approach for Recognizing Eating Moments
          with Wrist-Mounted Inertial Sensing},
      year    = {2015}
    }

Abstract

Recognizing when eating activities take place is one of the key challenges in automated food intake monitoring. Despite progress over the years, most proposed approaches have been largely impractical for everyday usage, requiring multiple on-body sensors or specialized devices such as neck collars for swallow detection. In this paper, we describe the implementation and evaluation of an approach for inferring eating moments based on 3-axis accelerometry collected with a popular off-the-shelf smartwatch. Trained with data collected in a semi-controlled laboratory setting with 20 subjects, our system recognized eating moments in two free-living condition studies (7 participants, 1 day; 1 participant, 31 days), with F-scores of 76.1% (66.7% precision, 88.8% recall) and 71.3% (65.2% precision, 78.6% recall). This work represents a contribution towards the implementation of a practical, automated system for everyday food intake monitoring, with applicability in areas ranging from health research to food journaling.
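
The reported F-scores follow from precision P and recall R as F = 2PR/(P + R); for example, 2 × 0.667 × 0.888 / (0.667 + 0.888) ≈ 0.761. As a rough sketch of the kind of pipeline the abstract describes, the code below windows a 3-axis accelerometer stream, extracts simple per-axis statistics, and trains a classifier; the 25 Hz sampling rate, 6-second windows, feature set, and Random Forest are illustrative assumptions rather than the paper's configuration.

    # Minimal sketch of eating-moment detection from wrist accelerometry.
    # Assumptions (not from the paper): 25 Hz sampling, 6 s windows,
    # mean/std/RMS features per axis, and a RandomForest classifier.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    FS = 25          # assumed sampling rate (Hz)
    WIN = 6 * FS     # assumed window length: 6 seconds

    def windows(acc, labels):
        """Slice an (N, 3) accelerometer stream into fixed-length windows."""
        for start in range(0, len(acc) - WIN + 1, WIN):
            seg = acc[start:start + WIN]
            # A window counts as "eating" if most of its samples are labeled eating.
            yield seg, int(labels[start:start + WIN].mean() > 0.5)

    def features(seg):
        """Simple per-axis statistics: mean, std, RMS -> 9-dim feature vector."""
        return np.concatenate([seg.mean(axis=0),
                               seg.std(axis=0),
                               np.sqrt((seg ** 2).mean(axis=0))])

    def train(acc, labels):
        X, y = zip(*[(features(s), l) for s, l in windows(acc, labels)])
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(np.array(X), np.array(y))
        return clf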


Paper in IEEE WACV (2015): “Leveraging Context to Support Automated Food Recognition in Restaurants”

January 6th, 2015 Irfan Essa Posted in Activity Recognition, Computer Vision, Edison Thomaz, First Person Computing, Gregory Abowd, Mobile Computing, PAMI/ICCV/CVPR/ECCV, Papers, Ubiquitous Computing, Uncategorized, Vinay Bettadapura

Paper

  • V. Bettadapura, E. Thomaz, A. Parnami, G. Abowd, and I. Essa (2015), “Leveraging Context to Support Automated Food Recognition in Restaurants,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [WEBSITE] [DOI] [arXiv] [BIBTEX]
    @InProceedings{    2015-Bettadapura-LCSAFRR,
      arxiv    = {http://arxiv.org/abs/1510.02078},
      author  = {Vinay Bettadapura and Edison Thomaz and Aman
          Parnami and Gregory Abowd and Irfan Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.83},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Bettadapura-LCSAFRR.pdf}
          ,
      publisher  = {IEEE Computer Society},
      title    = {Leveraging Context to Support Automated Food
          Recognition in Restaurants},
      url    = {http://www.vbettadapura.com/egocentric/food/},
      year    = {2015}
    }

 

Abstract

The pervasiveness of mobile cameras has resulted in a dramatic increase in food photos, pictures of what people eat. In this paper, we study how taking pictures of what we eat in restaurants can be used for the purpose of automating food journaling. We propose to leverage the context of where the picture was taken, together with information about the restaurant that is available online, coupled with state-of-the-art computer vision techniques to recognize the food being consumed. To this end, we demonstrate image-based recognition of foods eaten in restaurants by training a classifier with images from restaurants' online menu databases. We evaluate the performance of our system in unconstrained, real-world settings with food images taken in 10 restaurants across 5 different types of food (American, Indian, Italian, Mexican and Thai).
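
Concretely, "leveraging context" can be read as restricting the classifier's label space to the dishes on the nearby restaurant's online menu before scoring the photo. The sketch below illustrates that idea only; classify_image and menu_for_restaurant are hypothetical helpers supplied by the caller, and this is not the system evaluated in the paper.

    # Illustrative sketch: use the restaurant's menu (looked up from the photo's
    # location) as a prior that restricts which food labels are even considered.
    # `classify_image` and `menu_for_restaurant` are hypothetical helpers.
    def recognize_dish(image, gps_location, classify_image, menu_for_restaurant):
        menu = set(menu_for_restaurant(gps_location))   # candidate labels from context
        scores = classify_image(image)                  # dict: label -> confidence
        # Keep only labels that appear on this restaurant's menu.
        contextual = {label: s for label, s in scores.items() if label in menu}
        if not contextual:                              # fall back to the full classifier
            contextual = scores
        return max(contextual, key=contextual.get)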


Paper in WACV (2015): “Egocentric Field-of-View Localization Using First-Person Point-of-View Devices”

January 6th, 2015 Irfan Essa Posted in Activity Recognition, Caroline Pantofaru, Computer Vision, First Person Computing, Mobile Computing, PAMI/ICCV/CVPR/ECCV, Papers, Vinay Bettadapura

Paper

  • V. Bettadapura, I. Essa, and C. Pantofaru (2015), “Egocentric Field-of-View Localization Using First-Person Point-of-View Devices,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. (Best Paper Award) [PDF] [WEBSITE] [DOI] [arXiv] [BIBTEX]
    @InProceedings{    2015-Bettadapura-EFLUFPD,
      arxiv    = {http://arxiv.org/abs/1510.02073},
      author  = {Vinay Bettadapura and Irfan Essa and Caroline
          Pantofaru},
      awards  = {(Best Paper Award)},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.89},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Bettadapura-EFLUFPD.pdf}
          ,
      publisher  = {IEEE Computer Society},
      title    = {Egocentric Field-of-View Localization Using
          First-Person Point-of-View Devices},
      url    = {http://www.vbettadapura.com/egocentric/localization/}
          ,
      year    = {2015}
    }

Abstract

We present a technique that uses images, videos and sensor data taken from first-person point-of-view devices to perform egocentric field-of-view (FOV) localization. We define egocentric FOV localization as capturing the visual information from a person’s field-of-view in a given environment and transferring this information onto a reference corpus of images and videos of the same space, hence determining what a person is attending to. Our method matches images and video taken from the first-person perspective with the reference corpus and refines the results using the first-person’s head orientation information obtained using the device sensors. We demonstrate single and multi-user egocentric FOV localization in different indoor and outdoor environments with applications in augmented reality, event understanding and studying social interactions.
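
A minimal sketch of the matching-plus-orientation idea: match an egocentric frame against a reference corpus using local features, keeping only reference views whose known heading agrees with the device compass. The ORB features, the 45-degree tolerance, and the (image, heading, label) corpus format are assumptions for illustration, not the method's actual pipeline.

    # Sketch: match an egocentric frame to reference images, then filter the
    # candidates by head orientation (device compass heading, in degrees).
    # The reference corpus format and the 45-degree tolerance are assumptions.
    import cv2

    def localize_fov(query_img, references, query_heading, tol_deg=45.0):
        """references: list of (image, heading_deg, label)."""
        orb = cv2.ORB_create()
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        _, q_desc = orb.detectAndCompute(query_img, None)
        best_label, best_score = None, -1
        for ref_img, ref_heading, label in references:
            # Discard views facing a very different direction than the wearer.
            diff = abs((query_heading - ref_heading + 180) % 360 - 180)
            if diff > tol_deg:
                continue
            _, r_desc = orb.detectAndCompute(ref_img, None)
            if q_desc is None or r_desc is None:
                continue
            matches = matcher.match(q_desc, r_desc)
            if len(matches) > best_score:
                best_label, best_score = label, len(matches)
        return best_label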


Paper in ACM Ubicomp 2013 “Technological approaches for addressing privacy concerns when recognizing eating behaviors with wearable cameras”

September 14th, 2013 Irfan Essa Posted in Activity Recognition, Computational Photography and Video, Edison Thomaz, Gregory Abowd, ISWC, Mobile Computing, Papers, UBICOMP, Ubiquitous Computing

  • E. Thomaz, A. Parnami, J. Bidwell, I. Essa, and G. D. Abowd (2013), “Technological Approaches for Addressing Privacy Concerns when Recognizing Eating Behaviors with Wearable Cameras,” in Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp), 2013. [PDF] [DOI] [BIBTEX]
    @InProceedings{    2013-Thomaz-TAAPCWREBWWC,
      author  = {Edison Thomaz and Aman Parnami and Jonathan Bidwell
          and Irfan Essa and Gregory D. Abowd},
      booktitle  = {Proceedings of the ACM International Joint
          Conference on Pervasive and Ubiquitous Computing
          (UbiComp)},
      doi    = {10.1145/2493432.2493509},
      month    = {September},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2013-Thomaz-TAAPCWREBWWC.pdf}
          ,
      title    = {Technological Approaches for Addressing Privacy
          Concerns when Recognizing Eating Behaviors with
          Wearable Cameras},
      year    = {2013}
    }

Abstract

First-person point-of-view (FPPOV) images taken by wearable cameras can be used to better understand people’s eating habits. Human computation is a way to provide effective analysis of FPPOV images in cases where algorithmic approaches currently fail. However, privacy is a serious concern. We provide a framework, the privacy-saliency matrix, for understanding the balance between the eating information in an image and its potential privacy concerns. Using data gathered by 5 participants wearing a lanyard-mounted smartphone, we show how the framework can be used to quantitatively assess the effectiveness of four automated techniques (face detection, image cropping, location filtering and motion filtering) at reducing the privacy-infringing content of images while still maintaining evidence of eating behaviors throughout the day.
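
One way to picture the privacy-saliency matrix is as a 2x2 tally of images by whether they contain eating evidence and whether they contain privacy-sensitive content, computed before and after a filtering technique is applied. The sketch below shows such a tally and a simple effectiveness score; the score and the has_eating / has_private predicates are illustrative assumptions, not the paper's exact formulation.

    # Illustrative privacy-saliency tally: counts images by (has eating evidence,
    # has privacy-sensitive content), before and after applying a filter such as
    # face detection or location filtering. The effectiveness score is an assumption.
    import numpy as np

    def privacy_saliency_matrix(images, has_eating, has_private):
        """Return a 2x2 matrix M where M[e, p] counts images with
        eating evidence e (0/1) and privacy-sensitive content p (0/1)."""
        m = np.zeros((2, 2), dtype=int)
        for img in images:
            m[int(has_eating(img)), int(has_private(img))] += 1
        return m

    def filter_effectiveness(before, after):
        """Fraction of privacy-infringing images removed vs. fraction of
        eating-evidence images retained (both in [0, 1])."""
        privacy_removed = 1 - after[:, 1].sum() / max(before[:, 1].sum(), 1)
        eating_kept = after[1, :].sum() / max(before[1, :].sum(), 1)
        return privacy_removed, eating_kept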

via ACM DL: Technological approaches for addressing privacy concerns when recognizing eating behaviors with wearable cameras.


DEMO (2011): Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths – from Google Research Blog

June 20th, 2011 Irfan Essa Posted in Computational Photography and Video, In The News, Matthias Grundmann, Mobile Computing, PAMI/ICCV/CVPR/ECCV, Vivek Kwatra

via Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths – Google Research Blog.

Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths
Posted by Matthias Grundmann, Vivek Kwatra, and Irfan Essa

Earlier this year, we announced the launch of new features on the YouTube Video Editor, including stabilization for shaky videos, with the ability to preview them in real-time. The core technology behind this feature is detailed in this paper, which will be presented at the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2011).

Casually shot videos captured by handheld or mobile cameras suffer from a significant amount of shake. Existing in-camera stabilization methods dampen high-frequency jitter but do not suppress low-frequency movements and bounces, such as those observed in videos captured by a walking person. On the other hand, most professionally shot videos rely on carefully designed camera configurations, using specialized equipment such as tripods or camera dollies, and employ ease-in and ease-out for transitions. Our goal was to devise a completely automatic method for converting casual shaky footage into more pleasant and professional-looking videos.

Our technique mimics the cinematographic principles outlined above by automatically determining the best camera path using a robust optimization technique. The original, shaky camera path is divided into a set of segments, each approximated by either a constant, linear or parabolic motion. Our optimization finds the best of all possible partitions using a computationally efficient and stable algorithm.
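
For intuition, the optimization can be written as minimizing the L1 norms of the first, second, and third differences of the smoothed path, which favor piecewise constant, linear, and parabolic segments respectively, while keeping the new path within a corridor of the original. Below is a minimal 1-D sketch using cvxpy; the weights, corridor size, and single-dimension formulation are assumptions, and the production system described here solves a richer, streaming version of the problem.

    # 1-D sketch of L1-optimal camera path smoothing: minimize L1 norms of the
    # path's 1st/2nd/3rd differences (piecewise constant / linear / parabolic
    # segments) while staying within a corridor of the original shaky path.
    # Weights and corridor size are illustrative assumptions.
    import numpy as np
    import cvxpy as cp

    def smooth_path(shaky, corridor=20.0, w1=10.0, w2=1.0, w3=100.0):
        n = len(shaky)
        p = cp.Variable(n)
        objective = cp.Minimize(w1 * cp.norm(cp.diff(p, k=1), 1) +
                                w2 * cp.norm(cp.diff(p, k=2), 1) +
                                w3 * cp.norm(cp.diff(p, k=3), 1))
        constraints = [cp.abs(p - shaky) <= corridor]
        cp.Problem(objective, constraints).solve()
        return np.asarray(p.value)

    # Example: smooth a noisy horizontal translation track (pixels per frame).
    shaky_x = np.cumsum(np.random.randn(200) * 5)
    smooth_x = smooth_path(shaky_x)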

To achieve real-time performance on the web, we distribute the computation across multiple machines in the cloud. This enables us to provide users with a real-time preview and interactive control of the stabilized result. Above we provide a video demonstration of how to use this feature on the YouTube Editor. We will also demo this live at Google’s exhibition booth in CVPR 2011.

For more details, see the Project Site, the YouTube video of the system, the paper in PDF, and a technical video of the work.


Paper: ISWC (2008) “Localization and 3D Reconstruction of Urban Scenes Using GPS”

September 28th, 2008 Irfan Essa Posted in ISWC, Kihwan Kim, Mobile Computing, Papers, Thad Starner

Kihwan Kim, Jay Summet, Thad Starner, Daniel Ashbrook, Mrunal Kapade, and Irfan Essa (2008), “Localization and 3D Reconstruction of Urban Scenes Using GPS,” in Proceedings of the IEEE International Symposium on Wearable Computers (ISWC), 2008 (To Appear). [PDF]

Abstract


Using off-the-shelf Global Positioning System (GPS) units, we reconstruct buildings in 3D by exploiting the reduction in signal-to-noise ratio (SNR) that occurs when the buildings obstruct the line-of-sight between the moving units and the orbiting satellites. We measure the size and height of skyscrapers as well as automatically construct a density map representing the location of multiple buildings in an urban landscape. If deployed on a large scale, via a cellular service provider's GPS-enabled mobile phones or GPS-tracked delivery vehicles, the system could provide an inexpensive means of continuously creating and updating 3D maps of urban environments.
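
As a rough illustration of the idea, when the SNR reported for a satellite drops well below an open-sky baseline, the line of sight from the receiver to that satellite is likely blocked, and that blockage can be accumulated along the ray's ground projection into a 2-D density map. The sketch below assumes a local east-north grid, simplified geometry, and an arbitrary SNR threshold; it is not the authors' reconstruction pipeline.

    # Rough sketch: accumulate evidence of building blockage into a 2-D grid.
    # When a satellite's SNR is well below an open-sky baseline, rasterize the
    # ground projection of the receiver->satellite ray and bump every cell it
    # crosses. Grid size, cell size, and the SNR threshold are assumptions.
    import numpy as np

    GRID = 200          # cells per side
    CELL = 5.0          # meters per cell
    SNR_DROP_DB = 10.0  # assumed drop that indicates an obstructed line of sight

    def accumulate_blockage(density, rx_en, sat_azimuth_deg, snr_db, baseline_db,
                            max_range_m=300.0):
        """rx_en: receiver position (east, north) in meters; density: (GRID, GRID)."""
        if baseline_db - snr_db < SNR_DROP_DB:
            return  # signal looks unobstructed; nothing to record
        az = np.radians(sat_azimuth_deg)
        direction = np.array([np.sin(az), np.cos(az)])   # azimuth measured from north
        for r in np.arange(0.0, max_range_m, CELL):
            e, n = np.asarray(rx_en) + r * direction
            i, j = int(n / CELL) + GRID // 2, int(e / CELL) + GRID // 2
            if 0 <= i < GRID and 0 <= j < GRID:
                density[i, j] += 1.0

    # Example: density = np.zeros((GRID, GRID)); call accumulate_blockage once
    # per (receiver fix, satellite) observation as the unit moves through the city.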


Funding (2007): NSF “Web on Demand – Bridging the Gap Between Social Networks and Ad Hoc Networking”

September 1st, 2008 Irfan Essa Posted in Computational Journalism, Kishore Ramachandran, Mobile Computing

Award #0834545 – CSR-DMSS, SM: Web on Demand – Bridging the Gap Between Social Networks and Ad Hoc Networking

Investigator(s): Umakishore Ramachandran, (Principal Investigator), Irfan Essa (Co-Principal Investigator)

Dates: September 1, 2008 – August 31, 2009 (Estimated)

Abstract

From the western world to the third world, the use of handheld devices (cellphones, PDAs) has proliferated. The world of users is becoming both wireless and mobile. Web 2.0 has ushered in an age wherein the web is viewed as a provider of services and not just a repository of documents and/or information. Despite this advance, the web remains just that, a single web with an inherent assumption that a powerful computing and communication infrastructure supports it. Couldn’t mobile wireless devices in close proximity form a web of their own? This is the vision behind this project, the Web on Demand (WoD). WoD aims at bridging the gap between social networks and ad hoc networking. In other words, it aims to rethink the system software stack all the way from application to networking to allow the creation and management of social networks without any assumption of infrastructure support. The core of the research is to develop software technologies for mobile devices that would allow the dynamic creation of thematic ad hoc overlay networks, enabling (a) mobile people with similar interests (e.g., weather forecasts), (b) friends and family (e.g., in a theme park), and (c) participants in mission-critical applications (e.g., search and rescue) to stay connected. WoD complements the World Wide Web (WWW) and leverages it when it is available, for example by exploiting the ambient computing infrastructure to enhance user experience and by managing the dynamic creation of User Generated Content (UGC) by mobile users. The vision behind this project is to democratize access to services that are currently offered through the WWW. In this sense, the results from this research can have far-reaching technological and societal consequences. Most importantly, the research will help breed a new class of computer scientists who are connected with societal causes in addition to advancing technology.
