Paper in ISWC 2015: “Predicting Daily Activities from Egocentric Images Using Deep Learning”

September 7th, 2015 Irfan Essa Posted in Activity Recognition, Daniel Castro, Gregory Abowd, Henrik Christensen, ISWC, Machine Learning, Papers, Steven Hickson, Ubiquitous Computing, Vinay Bettadapura

Paper

  • D. Castro, S. Hickson, V. Bettadapura, E. Thomaz, G. Abowd, H. Christensen, and I. Essa (2015), “Predicting Daily Activities from Egocentric Images Using Deep Learning,” in Proceedings of International Symposium on Wearable Computers (ISWC), 2015. [PDF] [WEBSITE] [arXiv] [BIBTEX]
    @InProceedings{    2015-Castro-PDAFEIUDL,
      arxiv    = {http://arxiv.org/abs/1510.01576},
      author  = {Daniel Castro and Steven Hickson and Vinay
          Bettadapura and Edison Thomaz and Gregory Abowd and
          Henrik Christensen and Irfan Essa},
      booktitle  = {Proceedings of International Symposium on Wearable
          Computers (ISWC)},
      month    = {September},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Castro-PDAFEIUDL.pdf},
      title    = {Predicting Daily Activities from Egocentric Images
          Using Deep Learning},
      url    = {http://www.cc.gatech.edu/cpl/projects/dailyactivities/},
      year    = {2015}
    }

Abstract

We present a method to analyze images taken from a passive egocentric wearable camera along with contextual information, such as time and day of the week, to learn and predict everyday activities of an individual. We collected a dataset of 40,103 egocentric images over a 6-month period with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person’s activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.
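To make the idea of late fusion concrete, here is a minimal sketch of one simple fusion rule: per-class image scores are multiplied by a context-conditioned prior and renormalized. This is not the ensemble from the paper, which is not specified in the abstract. The function names, the three toy classes, the hour-of-day context, and the product rule are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def context_prior(samples, num_classes, smoothing=1.0):
    """Estimate P(class | hour) from (hour, class_index) pairs.

    Uses Laplace smoothing so unseen classes keep a small probability.
    (Hypothetical helper; hour-of-day is an assumed context feature.)
    """
    counts = defaultdict(Counter)
    for hour, label in samples:
        counts[hour][label] += 1

    def prior(hour):
        c = counts[hour]
        total = sum(c.values()) + smoothing * num_classes
        return [(c[k] + smoothing) / total for k in range(num_classes)]

    return prior


def late_fusion(image_probs, context_probs):
    """Fuse two per-class probability vectors: elementwise product, then renormalize."""
    fused = [p * q for p, q in zip(image_probs, context_probs)]
    total = sum(fused)
    return [f / total for f in fused]


# Toy data with 3 hypothetical classes: 0 = cooking, 1 = working, 2 = driving.
train = [(8, 2), (8, 2), (8, 0), (14, 1), (14, 1), (14, 1)]
prior = context_prior(train, num_classes=3)

cnn_output = [0.40, 0.35, 0.25]  # stand-in for CNN softmax scores on one image
fused = late_fusion(cnn_output, prior(14))
predicted = max(range(3), key=fused.__getitem__)
```

In this toy case the image scores alone slightly favor class 0, while the fused scores favor class 1 because every training sample at hour 14 carried that label. The point is only that the two models are trained separately and combined at the decision level.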

Paper in WACV (2015): “Semantic Instance Labeling Leveraging Hierarchical Segmentation”

January 6th, 2015 Irfan Essa Posted in Computer Vision, Henrik Christensen, PAMI/ICCV/CVPR/ECCV, Papers, Robotics, Steven Hickson

Paper

  • S. Hickson, I. Essa, and H. Christensen (2015), “Semantic Instance Labeling Leveraging Hierarchical Segmentation,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [DOI] [BIBTEX]
    @InProceedings{    2015-Hickson-SILLHS,
      author  = {Steven Hickson and Irfan Essa and Henrik
          Christensen},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.147},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Hickson-SILLHS.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Semantic Instance Labeling Leveraging Hierarchical
          Segmentation},
      year    = {2015}
    }

Abstract

Most approaches to indoor RGBD semantic labeling focus on using pixels or superpixels to train a classifier. In this paper, we implement a higher-level segmentation using a hierarchy of superpixels to obtain a better segmentation for training our classifier. By focusing on meaningful segments that conform more directly to objects, regardless of size, we train a random forest of decision trees as a classifier using simple features such as the 3D size, LAB color histogram, width, height, and shape as specified by a histogram of surface normals. We test our method on the NYU V2 depth dataset, a challenging dataset of cluttered indoor environments. Our experiments show that our method achieves state-of-the-art results on both the general semantic labeling introduced by the dataset (floor, structure, furniture, and objects) and a more object-specific semantic labeling. We show that training a classifier on a segmentation from a hierarchy of superpixels yields better results than training directly on superpixels, patches, or pixels as in previous work.
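As a rough illustration of the kind of per-segment features listed in the abstract, the sketch below computes a segment's bounding-box extents and a histogram of surface-normal angles. The exact feature definitions used in the paper are not given here, so the function name, the choice of +y as the up axis, the bin count, and the bounding-box volume as "3D size" are all assumptions. The LAB color histogram and the random forest itself are omitted.

```python
import math


def segment_features(points, normals, bins=4):
    """Build a simple feature vector for one segment (hypothetical helper).

    points  : list of (x, y, z) coordinates
    normals : list of unit surface normals (nx, ny, nz), one per point
    Returns [width, height, depth, volume] followed by a normalized
    histogram of the angle between each normal and the assumed up axis (+y).
    """
    xs, ys, zs = zip(*points)
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    depth = max(zs) - min(zs)

    hist = [0] * bins
    for _nx, ny, _nz in normals:
        # Angle to the up axis, in [0, pi]; clamp to guard against rounding.
        angle = math.acos(max(-1.0, min(1.0, ny)))
        idx = min(int(angle / math.pi * bins), bins - 1)
        hist[idx] += 1
    hist = [h / len(normals) for h in hist]

    return [width, height, depth, width * height * depth] + hist


# Toy segment: a flat 1 x 2 floor patch whose normals all point straight up.
floor_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
floor_normals = [(0.0, 1.0, 0.0)] * 4
feats = segment_features(floor_points, floor_normals)
```

For the floor patch, all normals fall in the first histogram bin and the height is zero, which is the sort of signal a classifier could use to separate floors from furniture. Feature vectors like this, one per segment, would then be passed to a random forest implementation of your choice.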

Four Papers at IEEE Winter Conference on Applications of Computer Vision (WACV 2015)

January 5th, 2015 Irfan Essa Posted in Computational Photography and Video, Computer Vision, PAMI/ICCV/CVPR/ECCV, Papers, S. Hussain Raza, Steven Hickson, Vinay Bettadapura

Four papers accepted at the IEEE Winter Conference on Applications of Computer Vision (WACV) 2015. See you at Waikoloa Beach, Hawaii!

  • V. Bettadapura, E. Thomaz, A. Parnami, G. Abowd, and I. Essa (2015), “Leveraging Context to Support Automated Food Recognition in Restaurants,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [WEBSITE] [DOI] [arXiv] [BIBTEX]
    @InProceedings{    2015-Bettadapura-LCSAFRR,
      arxiv    = {http://arxiv.org/abs/1510.02078},
      author  = {Vinay Bettadapura and Edison Thomaz and Aman
          Parnami and Gregory Abowd and Irfan Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.83},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Bettadapura-LCSAFRR.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Leveraging Context to Support Automated Food
          Recognition in Restaurants},
      url    = {http://www.vbettadapura.com/egocentric/food/},
      year    = {2015}
    }
  • S. Hickson, I. Essa, and H. Christensen (2015), “Semantic Instance Labeling Leveraging Hierarchical Segmentation,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [DOI] [BIBTEX]
    @InProceedings{    2015-Hickson-SILLHS,
      author  = {Steven Hickson and Irfan Essa and Henrik
          Christensen},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.147},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Hickson-SILLHS.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Semantic Instance Labeling Leveraging Hierarchical
          Segmentation},
      year    = {2015}
    }
  • S. H. Raza, A. Humayun, M. Grundmann, D. Anderson, and I. Essa (2015), “Finding Temporally Consistent Occlusion Boundaries using Scene Layout,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. [PDF] [DOI] [BIBTEX]
    @InProceedings{    2015-Raza-FTCOBUSL,
      author  = {Syed Hussain Raza and Ahmad Humayun and Matthias
          Grundmann and David Anderson and Irfan Essa},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.141},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Raza-FTCOBUSL.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Finding Temporally Consistent Occlusion Boundaries
          using Scene Layout},
      year    = {2015}
    }
  • V. Bettadapura, I. Essa, and C. Pantofaru (2015), “Egocentric Field-of-View Localization Using First-Person Point-of-View Devices,” in Proceedings of IEEE Winter Conference on Applications of Computer Vision (WACV), 2015. (Best Paper Award) [PDF] [WEBSITE] [DOI] [arXiv] [BIBTEX]
    @InProceedings{    2015-Bettadapura-EFLUFPD,
      arxiv    = {http://arxiv.org/abs/1510.02073},
      author  = {Vinay Bettadapura and Irfan Essa and Caroline
          Pantofaru},
      awards  = {(Best Paper Award)},
      booktitle  = {Proceedings of IEEE Winter Conference on
          Applications of Computer Vision (WACV)},
      doi    = {10.1109/WACV.2015.89},
      month    = {January},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2015-Bettadapura-EFLUFPD.pdf},
      publisher  = {IEEE Computer Society},
      title    = {Egocentric Field-of-View Localization Using
          First-Person Point-of-View Devices},
      url    = {http://www.vbettadapura.com/egocentric/localization/},
      year    = {2015}
    }

The last paper also won the Best Paper Award (see http://wacv2015.org/). More details coming soon.

Paper in CVPR 2014: “Efficient Hierarchical Graph-Based Segmentation of RGBD Videos”

June 22nd, 2014 Irfan Essa Posted in Computer Vision, Henrik Christensen, Papers, Steven Hickson

  • S. Hickson, S. Birchfield, I. Essa, and H. Christensen (2014), “Efficient Hierarchical Graph-Based Segmentation of RGBD Videos,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014. [PDF] [WEBSITE] [BIBTEX]
    @InProceedings{    2014-Hickson-EHGSRV,
      author  = {Steven Hickson and Stan Birchfield and Irfan Essa
          and Henrik Christensen},
      booktitle  = {{Proceedings of IEEE Conference on Computer Vision
          and Pattern Recognition (CVPR)}},
      month    = {June},
      organization  = {IEEE Computer Society},
      pdf    = {http://www.cc.gatech.edu/~irfan/p/2014-Hickson-EHGSRV.pdf},
      title    = {Efficient Hierarchical Graph-Based Segmentation of
          RGBD Videos},
      url    = {http://www.cc.gatech.edu/cpl/projects/4dseg},
      year    = {2014}
    }

Abstract

We present an efficient and scalable algorithm for segmenting 3D RGBD point clouds by combining depth, color, and temporal information using a multistage, hierarchical graph-based approach. Our algorithm processes a moving window over several point clouds to group similar regions over a graph, resulting in an initial over-segmentation. These regions are then merged to yield a dendrogram using agglomerative clustering via a minimum spanning tree algorithm. Bipartite graph matching at a given level of the hierarchical tree yields the final segmentation of the point clouds by maintaining region identities over arbitrarily long periods of time. We show that a multistage segmentation with depth then color yields better results than a linear combination of depth and color. Due to its incremental processing, our algorithm can process videos of any length and in a streaming pipeline. The algorithm’s ability to produce robust, efficient segmentation is demonstrated with numerous experimental results on challenging sequences from our own as well as public RGBD data sets.
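The merging step described above, agglomerative clustering via a minimum spanning tree, can be sketched with Kruskal's algorithm and a union-find structure. This is a generic single-linkage illustration, not the paper's implementation. The function names, the edge-weight meaning (dissimilarity between adjacent regions), and the toy graph are assumptions.

```python
def _make_find(parent):
    """Return a find() with path halving over the given parent list."""
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    return find


def mst_dendrogram(num_regions, edges):
    """Kruskal-style agglomerative merging (hypothetical helper).

    edges: list of (weight, region_a, region_b), where weight is the
    dissimilarity between two adjacent regions.
    Returns the accepted merges in increasing weight order. These are the
    minimum spanning tree edges, and their order defines a dendrogram.
    """
    parent = list(range(num_regions))
    find = _make_find(parent)
    merges = []
    for weight, a, b in sorted(edges):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # joining two different clusters
            parent[root_b] = root_a
            merges.append((weight, a, b))
    return merges


def cut(num_regions, merges, threshold):
    """Label regions by applying only merges with weight <= threshold."""
    parent = list(range(num_regions))
    find = _make_find(parent)
    for weight, a, b in merges:
        if weight <= threshold:
            parent[find(b)] = find(a)
    return [find(i) for i in range(num_regions)]


# Toy region graph: 4 over-segmented regions with pairwise dissimilarities.
edges = [(0.1, 0, 1), (0.9, 1, 2), (0.2, 2, 3), (0.5, 0, 3)]
merges = mst_dendrogram(4, edges)
fine = cut(4, merges, threshold=0.3)    # lower level of the hierarchy
coarse = cut(4, merges, threshold=1.0)  # top level: everything merged
```

Cutting the dendrogram at different thresholds gives segmentations at different levels of the hierarchy: at 0.3 the four regions form two segments, and at 1.0 they form one.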