Applying Mid-Level Vision Techniques for Video Data Compression
and Manipulation
John Y. A. Wang, Edward H. Adelson, and Ujjaval Desai
Published in
Proceedings of SPIE on Digital Video Compression on Personal
Computers: Algorithms and Technologies, vol. 2187 (pp. 116-127)
San Jose, February 1994.
Most image coding systems rely on signal processing concepts such as
transforms, vector quantization (VQ), and motion compensation. To achieve
significantly lower bit rates, it will be necessary to devise encoding
schemes that involve mid-level and high-level computer vision. Model-based systems have
been described, but these are usually restricted to some special class of
images such as head-and-shoulders sequences. We propose to use mid-level
vision concepts to achieve a decomposition that can be applied to a wider
domain of image material. In particular, we describe a coding scheme based
on a set of overlapping layers. The layers, which are ordered in depth and
move over one another, are composited in a manner similar to traditional
"cel" animation. The decomposition (the vision problem) is challenging,
but we have attained promising results on simple sequences. Once the
decomposition has been achieved, the synthesis is straightforward.
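
To illustrate why the synthesis step is simple once the layers are known, the following is a minimal sketch of back-to-front layer compositing. It is not the authors' implementation. It assumes that each layer carries an intensity map, an opacity (alpha) map, and a pure integer translation; the names `shift` and `composite` and the tuple layout are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

Image = List[List[float]]  # grayscale image stored as rows of intensities
Layer = Tuple[Image, Image, Tuple[int, int]]  # (intensity, alpha, (dx, dy))


def shift(img: Image, dx: int, dy: int, fill: float = 0.0) -> Image:
    """Translate an image by integer offsets (dx to the right, dy down)."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def composite(layers: List[Layer]) -> Image:
    """Composite layers in depth order, farthest layer first.

    Assumption: motion is a pure translation and alpha lies in [0, 1].
    Pixels shifted in from outside a layer get alpha 0 (transparent).
    """
    h, w = len(layers[0][0]), len(layers[0][0][0])
    frame = [[0.0] * w for _ in range(h)]
    for intensity, alpha, (dx, dy) in layers:
        moved_i = shift(intensity, dx, dy)
        moved_a = shift(alpha, dx, dy)
        for y in range(h):
            for x in range(w):
                a = moved_a[y][x]
                # Nearer layers occlude what is already in the frame.
                frame[y][x] = a * moved_i[y][x] + (1.0 - a) * frame[y][x]
    return frame


# Usage: an opaque static background and a one-pixel foreground moved right by 2.
background = [[0.2] * 4 for _ in range(2)]
bg_alpha = [[1.0] * 4 for _ in range(2)]
foreground = [[0.9, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
fg_alpha = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]

frame = composite([(background, bg_alpha, (0, 0)),
                   (foreground, fg_alpha, (2, 0))])
```

Each frame of a sequence could then be regenerated from the same layer maps by changing only the per-layer motion parameters, which is where the potential bit-rate savings of such a representation would come from.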