Chia-Chih Chen : C.-C. Chen

Home Resume Publications Research Personal Contact

Audio-visual modeling of human activity

There are several analogies between activity and speech signals with regard to the way they are generated, propagated, and perceived. Despite all the “acting apparatus” being directly visible, there is little in the way of research exploring the temporal signals emitted from body parts for activity recognition. We propose a novel action representation, the action spectrogram, which is inspired by a common spectrographic representation of speech. Different from sound spectrogram, an action spectrogram is a space-time-frequency representation which characterizes the short-time spectral properties of body parts’ movements. We transform an activity sequence into the occurrence likelihood series of action associated space-time interest points (STIP); therefore, activities are recognized by not only local video content but also occurrence likelihood spectra of action associated STIP.

Related publication

Chia-Chih Chen and J. K. Aggarwal,

Modeling Human Activities as Speech,

IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

[BibTex]

[Video]

[Supplementary]

Low-resolution human-vehicle interaction recognition

We propose a novel framework to recognize human-vehicle interactions from aerial video. In this scenario, the object resolution is low, the visual cues are vague, and the detection and tracking of objects are less reliable as a consequence. To address these issues, we present a temporal logic based approach which does not require training from event examples. At the low-level, we employ dynamic programming to perform fast model fitting between the tracked vehicle and the rendered 3-D vehicle models. At the semantic-level, given the localized event region of interest (ROI), we verify the time series of human-vehicle relationships with the pre-specified event definitions in a piecewise fashion.

Related publication

Jong Taek Lee*, Chia-Chih Chen*, and J. K. Aggarwal,

Recognizing Human-Vehicle Interactions from Aerial Video without Training,

Workshop of Aerial Video Processing in conjunction with CVPR (WAVP)

*Equal contribution authorship.

[BibTex]

Low-resolution human action recognition

Recognition of human actions from a distant view is a challenging problem in computer vision. The available visual cues are particularly sparse and vague under this scenario. We proposed an action descriptor which is composed of time series of subspace projected histogram-based features. Our descriptor combines both human poses and motion information and performs feature selection in a supervised fashion. Our method has been tested on several public datasets and achieves very promising accuracy.

Related publications

Chia-Chih Chen and J. K. Aggarwal,

Recognizing Human Action from a Far Field of View,

IEEE Workshop on Motion and Video Computing (WMVC)

[BibTex]

M. S. Ryoo, Chia-Chih Chen, J. K. Aggarwal, and Amit Roy-Chowdhury,

An Overview of Contest on Semantic Description of Human Activities (SDHA) 2010,

International Conference on Pattern Recognition (ICPR) Contests

[BibTex]

Chia-Chih Chen, M. S. Ryoo, and J. K. Aggarwal,

UT-Tower Dataset: Aerial View Activity Classification Challenge,

[BibTex]

Kinect applications - human detection

In this paper, we present a novel human detection method using depth information taken by Kinect for Xbox 360. We propose a model based approach, which detects humans using a 2D head contour model and a 3D head surface model. We propose a segmentation scheme to segment humans from the surroundings and extract the complete contours of the figures based on the detected ROI. We also explore the tracking algorithm based on detection results. Our method is tested on a database taken by Kinect in our lab and presents superior results.

Related publication

Lu Xia, Chia-Chih Chen, J. K. Aggarwal,

Human Detection Using Depth Information by Kinect,

Workshop on Human Activity Understanding from 3D Data in conjunction with CVPR (HAU3D)

[BibTex]

Kinect applications - human action recognition

We present a novel approach for human action recognition with histograms of 3D joint locations (HOJ3D) as a compact representation of postures. We extract the 3D skeletal joint locations from Kinect depth maps using Shotton et al.’s method. The HOJ3D computed from the action depth sequences are reprojected using LDA and then clustered into k posture visual words, which represent the prototypical poses of actions. The temporal evolutions of those visual words are modeled by discrete hidden Markov models (HMMs). In addition, due to the design of our spherical coordinate system and the robust 3D skeleton estimation from Kinect, our method demonstrates significant view invariance on our 3D action dataset.

Related publication

Lu Xia, Chia-Chih Chen, J. K. Aggarwal,

View Invariant Human Action Recognition Using Histograms of 3D Joints,

Workshop on Human Activity Understanding from 3D Data in conjunction with CVPR (HAU3D)

[BibTex]

Detection of object abandonment

Public security is a crucial issue in our world today. In this work, we describe a smart threat detection system that captures, exploits and interprets the temporal flow of sub-events related to the abandonment of an object. Our representational framework applies temporal interval logic to define the relationships between actions, events, and their effects. An event is defined as having occurred if and only if a given sequence of observations matches the formal event representation and meets the pre-specified temporal constraints.

Related publications

M. Bhargava, Chia-Chih Chen, M. S. Ryoo, and J. K. Aggarwal,

Detection of Object Abandonment Using Temporal Logic,

Journal of Machine Vision and Applications (MVA)

[BibTex]

M. Bhargava, Chia-Chih Chen, M. S. Ryoo, and J. K. Aggarwal,

Detection of Abandoned Objects in Crowded Environments,

IEEE International Conference on Advanced Video and Signal based Surveillance (AVSS)

[BibTex]

Human cast shadow removal

The success in human shadow removal leads to the accurate recognition of activities and the robust tracking of people. We present a shadow removal technique which effectively eliminates human shadow cast from an unknown direction of light source. A multi-cue shadow descriptor is proposed to characterize the distinctive properties of shadow. We employ a 3-stage process to detect then remove shadow. Our algorithm further improves the shadow detection accuracy by imposing the spatial constraint between the foreground subregions of human and shadow.

Related publication

Chia-Chih Chen and J. K. Aggarwal,

Human Shadow Removal with Unknown Light Source,

International Conference on Pattern Recognition (ICPR)

[BibTex]

Recognition of box-like objects

In this work, we propose a template-based algorithm for recognizing box-like objects. Our technique is invariant to scale, rotation and translation as well as robust to patterned surfaces and moderate occlusions. We reassemble box-like segments from an over-segmented image. The shapes and inner edges of the merged box-like segments are verified seprartly with the corresponding templates. We formulate the process of template matching into an optimization problem, and propose a combined metric to evaluate the similarity.

Related publication

Chia-Chih Chen and J. K. Aggarwal,

Recognition of Box-like Objects by Fusing Cues of Shape and Edges,

International Conference on Pattern Recognition (ICPR)

[BibTex]

Video background initialization

Background subtraction is an essential element in most object tracking and video surveillance systems. The success of this low-level processing step is highly dependent on the quality of the background model maintained. We present a background initialization algorithm which identifies stable intervals of intensity values at each pixel, and determines which interval is most likely to display background. Our method is adaptive to the scale of foregound motion and is able to equalize the uneven effect caused by different object depths. Our method achieves promising results even with complex foreground contents.

Related publication

Chia-Chih Chen and J. K. Aggarwal,

An Adaptive Background Model Initialization Algorithm with Objects Moving at Different Depths,

IEEE International Conference on Image Processing (ICIP)

[BibTex]

Last update 04/05/2011