Unified crowd segmentation

Tu, P., Sebastian, T., Doretto, G., Krahnstoever, N., Rittscher, J., and Yu, T.
Unified crowd segmentation
In Proceedings of the European Conference on Computer Vision (ECCV), pp. 691–704, 2008.

Abstract

This paper presents a unified approach to crowd segmentation. A global solution is generated using an Expectation Maximization framework. Initially, a head-and-shoulder detector is used to nominate an exhaustive set of person locations, and these form the person hypotheses. The image is then partitioned into a grid of small patches, each of which is assigned to one of the person hypotheses. A key idea of this paper is that while whole-body monolithic person detectors can fail due to occlusion, a partial response to such a detector can be used to evaluate the likelihood of a single patch being assigned to a hypothesis. This captures local appearance information without having to learn specific appearance models. The likelihood of a pair of patches being assigned to a person hypothesis is evaluated based on low-level image features such as uniform motion fields and color constancy. During the E-step, the single and pairwise likelihoods are used to compute a globally optimal set of assignments of patches to hypotheses. In the M-step, parameters that enforce global consistency of assignments are estimated. This can be viewed as a form of occlusion reasoning. The final assignment of patches to hypotheses constitutes a segmentation of the crowd. The resulting system provides a global solution that does not require background modeling and is robust with respect to clutter and partial occlusion.
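
The alternation described in the abstract can be illustrated with a small toy sketch. This is not the paper's algorithm: the abstract does not specify how the globally optimal assignment is computed or which consistency parameters the M-step estimates, so the sketch below substitutes a simple soft-assignment update and uses per-hypothesis mixing weights as a placeholder parameter. The function name `assign_patches`, the array layouts, and the example values are all hypothetical.

```python
import numpy as np

def assign_patches(unary, pairwise, n_iters=20):
    """Toy EM-style assignment of patches to person hypotheses.

    unary:    (P, K) array, log-likelihood of patch p belonging to
              hypothesis k (assumed to come from a detector response).
    pairwise: (P, P) symmetric, non-negative affinity between patches
              (assumed to reflect similar motion or colour).
    Returns (labels, q), where q is the (P, K) soft assignment.
    """
    unary = np.asarray(unary, dtype=float)
    pairwise = np.asarray(pairwise, dtype=float).copy()
    np.fill_diagonal(pairwise, 0.0)        # a patch does not support itself
    n_patches, n_hyp = unary.shape
    q = np.full((n_patches, n_hyp), 1.0 / n_hyp)
    weights = np.full(n_hyp, 1.0 / n_hyp)  # placeholder "M-step" parameter

    for _ in range(n_iters):
        # E-step-like update: combine the single-patch term, the support
        # from similar patches, and the current hypothesis weights.
        scores = unary + pairwise @ q + np.log(weights)
        scores -= scores.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(scores)
        q /= q.sum(axis=1, keepdims=True)

        # M-step-like update: re-estimate the hypothesis weights from
        # the soft assignments (an assumption, not the paper's parameters).
        weights = np.clip(q.mean(axis=0), 1e-9, None)
        weights /= weights.sum()

    return q.argmax(axis=1), q

# Four patches, two hypotheses. Patch 2 is ambiguous on its own but is
# strongly tied to patch 3 through the pairwise term.
unary = np.array([[0.0, -3.0], [0.0, -3.0], [-1.0, -1.0], [-3.0, 0.0]])
pairwise = np.zeros((4, 4))
pairwise[0, 1] = pairwise[1, 0] = 2.0
pairwise[2, 3] = pairwise[3, 2] = 2.0
labels, q = assign_patches(unary, pairwise)
```

In this toy setup the pairwise term is what resolves the ambiguous patch, which mirrors the role the abstract gives to motion and colour cues between patch pairs.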

BibTeX

@InProceedings{tuSDKRY08eccv,
  Title                    = {Unified crowd segmentation},
  Author                   = {Tu, P. and Sebastian, T. and Doretto, G. and Krahnstoever, N. and Rittscher, J. and Yu, T.},
  Booktitle                = eccv,
  Year                     = {2008},
  Pages                    = {691--704},
  Abstract                 = {This paper presents a unified approach to crowd segmentation. A global solution is generated using an Expectation Maximization framework. Initially, a head-and-shoulder detector is used to nominate an exhaustive set of person locations, and these form the person hypotheses. The image is then partitioned into a grid of small patches, each of which is assigned to one of the person hypotheses. A key idea of this paper is that while whole-body monolithic person detectors can fail due to occlusion, a partial response to such a detector can be used to evaluate the likelihood of a single patch being assigned to a hypothesis. This captures local appearance information without having to learn specific appearance models. The likelihood of a pair of patches being assigned to a person hypothesis is evaluated based on low-level image features such as uniform motion fields and color constancy. During the E-step, the single and pairwise likelihoods are used to compute a globally optimal set of assignments of patches to hypotheses. In the M-step, parameters that enforce global consistency of assignments are estimated. This can be viewed as a form of occlusion reasoning. The final assignment of patches to hypotheses constitutes a segmentation of the crowd. The resulting system provides a global solution that does not require background modeling and is robust with respect to clutter and partial occlusion.},
  Bib2html_pubtype         = {Refereed Conferences},
  Bib2html_rescat          = {Video Surveillance, People Detection, Integral Image Computations, People Tracking},
  File                     = {tuSDKRY08eccv.pdf:doretto\\conference\\tuSDKRY08eccv.pdf:PDF},
  Owner                    = {doretto},
  Timestamp                = {2008.01.16}
}