Perceptual Organization

{ Home | Research | People | Awards | Publications }

A Visual-Attention Model Using Earth Mover's Distance based Saliency Measurement and Nonlinear Feature Combination
Y. Lin, Y. Y. Tang, B. Fang, Z. Shang, Y. Huang, S. Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2013
PDF
Abstract

This paper introduces a new computational visual-attention model for static and dynamic saliency maps. First, we use the Earth Mover's Distance (EMD) to measure the center-surround difference in the receptive field, instead of using the Difference-of-Gaussian filter that is widely used in many previous visual-attention models. Second, we propose to take two steps of biologically-inspired nonlinear operations for combining different features: combining subsets of basic features into a set of super features using the $L^m$-norm and then combining the super features using the Winner-Take-All mechanism. Third, we extend the proposed model to construct dynamic saliency maps from videos, by using EMD for computing the center-surround difference in the spatio-temporal receptive field. We evaluate the performance of the proposed model on both static image data and video data. Comparison results show that the proposed model outperforms several existing models under a unified evaluation setting.

CrackTree: Automatic Crack Detection from Pavement Images
Q. Zou, Y. Cao, Q. Li, Q. Mao, S. Wang
Pattern Recognition Letters, 2012
PDF
Abstract

Pavement cracks are important information for evaluating the road condition and conducting the necessary road maintenance. In this paper, we develop CrackTree, a fully-automatic method to detect cracks from pavement images. In practice, crack detection is a very challenging problem because of (1) low contrast between cracks and the surrounding pavement, (2) intensity inhomogeneity along the cracks, and (3) possible shadows with similar intensity to the cracks. To address these problems, the proposed method consists of three steps. First, we develop a geodesic shadow-removal algorithm to remove the pavement shadows while preserving the cracks. Second, we build a crack probability map using tensor voting, which enhances the connection of the crack fragments with good proximity and curve continuity. Finally, we sample a set of crack seeds from the crack probability map, represent these seeds by a graph model, derive minimum spanning trees from this graph, and conduct recursive tree-edge pruning to identify desirable cracks. We evaluate the proposed method on a collection of 206 real pavement images and the experimental results show that the proposed method achieves a better performance than several existing methods.

Superedge Grouping for Object Localization by Combining Appearance and Shape Information
Z. Zhang, S. Fidler, J. Waggoner, Y. Cao, S. Dickinson, J. Siskind, S. Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012
PDF
Abstract

Both appearance and shape play important roles in object localization and object detection. In this paper, we propose a new superedge grouping method for object localization by incorporating both boundary shape and appearance information of objects. Compared with the previous edge grouping methods, the proposed method does not subdivide detected edges into short edgels before grouping. Such long, unsubdivided superedges not only facilitate the incorporation of object shape information into localization, but also increase the robustness against image noise and reduce computation. We identify and address several important problems in achieving the proposed superedge grouping, including gap filling for connecting superedges, accurate encoding of region-based information into individual edges, and the incorporation of object-shape information into object localization. In this paper, we use the bag of visual words technique to quantify the region-based appearance features of the object of interest. We find that the proposed method, by integrating both boundary and region information, can produce better localization performance than previous subwindow search and edge grouping methods on most of the 20 object categories from the VOC 2007 database. Experiments also show that the proposed method is roughly 50 times faster than the previous edge grouping method.

Feature Matching in Underwater Environments using Sparse Linear Combinations
K. Oliver, W. Hou, S. Wang
IEEE Workshop on Object Tracking and Classification Beyond and in the Visible Spectrum, 2010
PDF
Abstract

Feature matching is a key, underlying component in many approaches to object detection, localization, and recognition. In many cases, feature matching is accomplished by nearest neighbor methods on extracted feature descriptors. This methodology works well for clean, out-of-water images; however, when imaging underwater, even an image of the same object can be drastically different due to varying water conditions. As a result, descriptors of the same point on an object may be completely different between the clean and underwater images, and between different underwater images taken under varying imaging conditions. This makes feature matching between such images a very challenging problem. In this paper, we present a new method for feature matching by first synthetically constructing a feature codebook for all template features by simulating different underwater imaging conditions. We then approximate the target feature by a sparse linear combination of the features in the constructed codebook. The optimal sparse linear combination is found by compressive sensing algorithms. In the experiments, we show that the proposed method can produce better feature matching performance than the nearest neighbor approach and associated naïve extensions.

Image Feature Detection and Matching in Underwater Conditions
K. Oliver, W. Hou, S. Wang
Proceedings of SPIE Volume 7678 (Ocean Sensing and Monitoring II), 2010
PDF
Abstract

The main challenge in underwater imaging and image analysis is to overcome the effects of blurring due to the strong scattering of light by the water and its constituents. This blurring adds complexity to already challenging problems like object detection and localization. The current state-of-the-art approaches for object detection and localization normally involve two components: (a) a feature detector that extracts a set of feature points from an image, and (b) a feature matching algorithm that tries to match the feature points detected from a target image to a set of template features corresponding to the object of interest. A successful feature matching indicates that the target image also contains the object of interest. For underwater images, the target image is taken in underwater conditions while the template features are usually extracted from one or more training images that are taken out-of-water or in different underwater conditions. In addition, the objects in the target image and the training images may show different poses, including rotation, scaling, translation transformations, and perspective changes. In this paper we investigate the effects of various underwater point spread functions on the detection of image features using many different feature detectors, and how these functions affect the capability of these features when they are used for matching and object detection. This research provides insight to further develop robust feature detectors and matching algorithms that are suitable for detecting and localizing objects from underwater images.

Free-Shape Subwindow Search for Object Localization
Z. Zhang, Y. Cao, D. Salvi, K. Oliver, J. Waggoner, S. Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010
PDF
Abstract

Object localization in an image is usually handled by searching for an optimal subwindow that tightly covers the object of interest. However, the subwindows considered in previous work are limited to rectangles or other specified, simple shapes. With such specified shapes, no subwindow can cover the object of interest tightly. As a result, the desired subwindow around the object of interest may not be optimal in terms of the localization objective function, and cannot be detected by a subwindow search algorithm. In this paper, we propose a new graph-theoretic approach for object localization by searching for an optimal subwindow without pre-specifying its shape. Instead, we require the resulting subwindow to be well aligned with edge pixels that are detected from the image. This requirement is quantified and integrated into the localization objective function based on the widely-used bag of visual words technique. We show that the ratio-contour graph algorithm can be adapted to find the optimal free-shape subwindow in terms of the new localization objective function. In the experiment, we test the proposed approach on the PASCAL VOC 2006 and VOC 2007 databases for localizing several categories of animals. We find that its performance is better than the previous efficient subwindow search algorithm.

Globally Optimal Grouping for Symmetric Closed Boundaries by Combining Boundary and Region Information
J. S. Stahl, S. Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2008
PDF
Abstract

Many natural and man-made structures have a boundary that shows a certain level of bilateral symmetry, a property that plays an important role in both human and computer vision. In this paper, we present a new grouping method for detecting closed boundaries with symmetry. We first construct a new type of grouping token in the form of symmetric trapezoids by pairing line segments detected from the image. A closed boundary can then be achieved by connecting some trapezoids with a sequence of gap-filling quadrilaterals. For such a closed boundary, we define a unified grouping cost function in a ratio form: the numerator reflects the boundary information of proximity and symmetry and the denominator reflects the region information of the enclosed area. The introduction of the region-area information in the denominator is able to avoid a bias toward shorter boundaries. We then develop a new graph model to represent the grouping tokens. In this new graph model, the grouping cost function can be encoded by carefully designed edge weights and the desired optimal boundary corresponds to a special cycle with a minimum ratio-form cost. We finally show that such a cycle can be found in polynomial time using a previous graph algorithm. We implement this symmetry-grouping method and test it on a set of synthetic data and real images. The performance is compared to two previous grouping methods that do not consider symmetry in their grouping cost functions.

Open Boundary Capable Edge Grouping with Feature Maps
J. S. Stahl, K. Oliver, S. Wang
IEEE Computer Society Workshop on Perceptual Organization in Computer Vision (POCV), 2008
PDF
Abstract

Edge grouping methods aim at detecting the complete boundaries of salient structures in noisy images. In this paper, we develop a new edge grouping method that exhibits several useful properties. First, it combines both boundary and region information by defining a unified grouping cost. The region information of the desirable structures is included as a binary feature map that is of the same size as the input image. Second, it finds the globally optimal solution of this grouping cost. We extend a prior graph-based edge grouping algorithm to achieve this goal. Third, it can detect both closed boundaries, where the structure of interest lies completely within the image perimeter, and open boundaries, where the structure of interest is cropped by the image perimeter. Given this capability for detecting both open and closed boundaries, the proposed method can be extended to segment an image into disjoint regions in a hierarchical way. Experimental results on real images are reported, with a comparison against a prior edge grouping method that can only detect closed boundaries.

Edge Grouping Combining Boundary and Region Information
J. S. Stahl, S. Wang
IEEE Transactions on Image Processing (TIP), 2007
PDF
Abstract

This paper introduces a new edge-grouping method to detect perceptually salient structures in noisy images. Specifically, we define a new grouping cost function in a ratio form, where the numerator measures the boundary proximity of the resulting structure and the denominator measures the area of the resulting structure. This area term introduces a preference towards detecting larger-size structures and, therefore, makes the resulting edge grouping more robust to image noise. To find the optimal edge grouping with the minimum grouping cost, we develop a special graph model with two different kinds of edges and then reduce the grouping problem to finding a special kind of cycle in this graph with a minimum cost in ratio form. This optimal cycle-finding problem can be solved in polynomial time by a previously developed graph algorithm. We implement this edge-grouping method, test it on both synthetic data and real images, and compare its performance against several available edge-grouping and edge-linking methods. Furthermore, we discuss several extensions of the proposed method, including the incorporation of the well-known grouping cues of continuity and intensity homogeneity, introducing a factor to balance the contributions from the boundary and region information, and the prevention of detecting self-intersecting boundaries.

Global Detection of Salient Convex Boundaries
S. Wang, J. S. Stahl, A. Bailey, M. Dropps
International Journal of Computer Vision (IJCV), 2007
PDF
Abstract

As an important geometric property of many structures or structural components, convexity plays an important role in computer vision and image understanding. In this paper, we describe a general approach that can force various edge-grouping algorithms to detect only convex structures from a set of boundary fragments. The basic idea is to remove some fragments and fragment connections so that, on the remaining ones, a prototype edge-grouping algorithm that detects closed boundaries without the convexity constraint can only produce convex closed boundaries. We show that this approach takes polynomial time and preserves the grouping optimality by not excluding any valid convex boundary from the search space. Choosing the recently developed ratio-contour algorithm as the prototype grouping algorithm, we develop a new convex-grouping algorithm, which can detect convex salient boundaries with good continuity and proximity in a globally optimal fashion. To facilitate the application of this convex-grouping algorithm, we develop a new fragment-connection method based on four-point Bezier curves. We demonstrate the performance of this convex-grouping algorithm by conducting experiments on both synthetic and real images. In addition, we provide a comparison with some prior edge-grouping algorithms. Finally, we show that the proposed convex-grouping algorithm can be further extended to detect convex open boundaries, derive region-based image hierarchies, and incorporate some simple human-computer interactions.

Globally Optimal Grouping for Symmetric Boundaries
J. S. Stahl, S. Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2006
PDF
Abstract

Many natural and man-made structures have a boundary that shows certain level of bilateral symmetry, a property that has been used to solve many computer-vision tasks. In this paper, we present a new grouping method for detecting closed boundaries with symmetry. We first construct a new type of grouping token in the form of a symmetric trapezoid, with which we can flexibly incorporate various boundary and region information into a unified grouping cost function. Particularly, this grouping cost function integrates Gestalt laws of proximity, closure, and continuity, besides the desirable boundary symmetry. We then develop a graph algorithm to find the boundary that minimizes this grouping cost function in a globally optimal fashion. Finally, we test this method by some experiments on a set of natural and medical images.

Evaluating Edge Detection Through Boundary Detection
S. Wang, F. Ge, T. Liu
EURASIP Journal on Applied Signal Processing (Special Issue on Performance Evaluation in Image Processing), 2006
PDF
Abstract

Edge detection has been widely used in computer vision and image processing. However, the performance evaluation of the edge-detection results is still a challenging problem. A major dilemma in edge-detection evaluation is the difficulty to balance the objectivity and generality: a general-purpose edge-detection evaluation independent of specific applications is usually not well defined, while an evaluation on a specific application has weak generality. Aiming at addressing this dilemma, this paper presents new evaluation methodology and a framework in which edge detection is evaluated through boundary detection, that is, the likelihood of retrieving the full object boundaries from this edge-detection output. Such a likelihood, we believe, reflects the performance of edge detection in many applications since boundary detection is the direct and natural goal of edge detection. In this framework, we use the newly developed ratio-contour algorithm to group the detected edges into closed boundaries. We also collect a large data set (1030) of real images with unambiguous ground-truth boundaries for evaluation. Five edge detectors (Sobel, LoG, Canny, Rothwell, and Edison) are evaluated in this paper and we find that the current edge-detection performance still has scope for improvement by choosing appropriate detectors and detector parameters.

Salient Closed Boundary Extraction with Ratio Contour
S. Wang, T. Kubota, J. M. Siskind, J. Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005
PDF
Abstract

We present ratio contour, a novel graph-based method for extracting salient closed boundaries from noisy images. This method operates on a set of boundary fragments that are produced by edge detection. Boundary extraction identifies a subset of these fragments and connects them sequentially to form a closed boundary with the largest saliency. We encode the Gestalt laws of proximity and continuity in a novel boundary-saliency measure based on the relative gap length and average curvature when connecting fragments to form a closed boundary. This new measure attempts to remove a possible bias toward short boundaries. We present a polynomial-time algorithm for finding the most-salient closed boundary. We also present supplementary preprocessing steps that facilitate the application of ratio contour to real images. We compare ratio contour to two closely related methods for extracting closed boundaries: Elder and Zucker's method based on the shortest-path algorithm and Williams and Thornber's method based on spectral analysis and a strongly-connected-components algorithm. This comparison involves both theoretic analysis and experimental evaluation on both synthesized data and real images.

Convex Grouping Combining Boundary and Region Information
J. S. Stahl, S. Wang
IEEE International Conference on Computer Vision (ICCV), 2005
PDF
Abstract

Convexity is an important geometric property of many natural and man-made structures. Prior research has shown that it is imperative to many perceptual-organization and image-understanding tasks. This paper presents a new grouping method for detecting convex structures from noisy images in a globally optimal fashion. Particularly, this method combines both region and boundary information: the detected structural boundary is closed and well aligned with detected edges while the enclosed region has good intensity homogeneity. We introduce a ratio-form cost function for measuring the structural desirability, which avoids a possible bias to detect small structures. A new fragment-pruning algorithm is developed to achieve the structural convexity. The proposed method can also be extended to detect open boundaries, which correspond to the structures that are partially cropped by the image perimeter, and incorporate a human-computer interaction for detecting a convex boundary around a specified point. We test the proposed method on a set of real images and compare it with the Jacobs¿ convex-grouping method.

From Fragments to Salient Closed Boundaries: An In-Depth Study
S. Wang, J. Wang, T. Kubota
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2004
PDF
Abstract

This paper conducts an in-depth study on a classical perceptual-organization problem: finding salient closed boundaries from a set of boundary fragments detected in a noisy image. In this problem, a saliency boundary is formed by identifying and connecting a subset of fragments according to the simple Gestalt laws of closure, continuity, and proximity. Our specific interest is focused on the methods that aim to achieve boundary closure, an important global property of perceptual salient boundaries. In this paper, we analyze and compare three such methods that are developed in recent years: (a) Elder and Zucker's method based on the shortest-path algorithm, (b) Williams and Thornber's method combining the spectral-analysis and the strongly-connected-component algorithms, and (c) Wang, Kubota, and Siskind's method based on ratio-contour algorithm. Both theoretic analysis and experimental study show that, with a unified setting of fragment saliency, Wang, Kubota and Siskind's method more appropriately constrains the search space for the closed boundaries, and usually produces better performance than or at least comparable performance as the other two methods. Particularly, Wang, Kubota, and Siksind's method can always guarantee the boundary closure and simplicity, which may not be always hold in the other two methods. We construct and collect a variety of synthesized and real images for this comparison.

Salient Boundary Detection Using Ratio Contour
S. Wang, J. Wang, T. Kubota
Neural Information Processing Systems Conference (NIPS), 2003
PDF
Abstract

This paper presents a novel graph-theoretic approach, named ratio contour, to extract perceptually salient boundaries from a set of noisy boundary fragments detected in real images. The boundary saliency is defined using the Gestalt laws of closure, proximity, and continuity. This paper first constructs an undirected graph with two different set of edges: solid edges and dashed edges. The weights of solid and dashed edges measure the local saliency in and between boundary fragments, respectively. Then the most salient boundary is detected by searching for an optimal cycle in this graph with minimum average weight. The proposed approach guarantees the global optimality without introducing any biases related to region area or boundary length. We collect a variety of images for testing the proposed approach with encouraging results.