-
Bhavik R. Bakshi and George Stephanopoulos.
Reasoning in Time: Modelling, Analysis, and Pattern Recognition of Temporal Process Trends.
In Advances in Chemical Engineering,
volume 22,
pages 485--548.
1995.
Keywords:
Wavelets,
Multiscale Analysis,
Artificial Intelligence.
| Abstract: |
[...] To improve the computer's ability to reason efficiently in time, we must first establish new forms for the representation of temporal behaviours. It is the purpose of this chapter to examine the engineering needs for temporal decision making and to propose specific models that encapsulate the requisite temporal characteristics of individual variables and composite processes. Through a combination of analytical techniques, such as {\sl scale-space filtering} and {\sl wavelet-based, multiresolution decomposition of functions}, and modelling paradigms from {\sl artificial intelligence} (AI), we have developed a concise framework that can be used to model, analyze, and synthesize the temporal trends of process operations. Within this framework, the modelling needs for logical reasoning in time can be fully satisfied, while maintaining consistency with the numerical task carried out at the same time. Thus, through the modelling paradigms of this chapter one may put together intelligent systems that use consistent representations for their logical and numerical tasks. |
-
James C. Bezdek.
Hybrid modelling in pattern recognition and control.
KBS,
8(6):359--371,
1995.
Keywords:
Clustering,
Fuzzy Clustering,
Neural Networks.
| Abstract: |
Four techniques are discussed: fuzzy pattern recognition (numerical and syntactic), computational neural networks, and fuzzy control. The paper assesses the maturation of these disciplines by giving two examples of crossfertilization between control and pattern recognition. First, the use of pattern recognition (fuzzy clustering and feed forward neural networks) to help develop and represent fuzzy controllers is illustrated. Second, an example of waveform analysis by syntactic pattern recognition that uses the fuzzy control paradigm is given. The paper concludes with some ideas about what should happen next, and what may happen next, in terms of hybridization between the four disciplines. |
-
James C. Bezdek,
Richard J. Hathaway,
and Nikhil R. Pal.
Norm-Induced Shell-Prototypes (NISP) Clustering.
NPSC,
3:431--450,
1995.
Keywords:
Clustering,
Similarity Measures,
Fuzzy c-Means.
| Abstract: |
Recent generalizations of the fuzzy c-means clustering criterion include algorithms that find and represent data with hyperspherical shell (boundary) prototypes. Alternating optimization (AO) algorithms have been successful because inner product norms are used as the criterion of (dis)similarity between the data and the shell prototypes, and they are differentiable. We extend c-shells clustering to arbitrary, non-differentiable norms (such as the Minkowski family); and we show that algorithms similar to the inner product norm based shell prototypes developed by Dave and Krishnapuram can be generalized to any norm on $\RR^s$ for the hard, fuzzy and possibilistic cases. Two non-AO methods for solution of the minimization problem associated with each case are discussed. Finally, we exemplify the two methods for the fuzzy case with two numerical examples that seek ``diamond'' shaped shell prototypes induced by the city block or 1-norm. |
-
Susan M. Bridges,
Cynthia Higginbotham,
James M. McKinion,
and Julia E. Hodges.
Fuzzy Descriptors of Time-Varying Data: Theory and Applications.
AI Applications,
9(2):1--14,
1995.
Keywords:
Classification.
| Abstract: |
Many AI applications must reason about large quantities of time-varying numerical data. Weather data, for example, are usually one of the most important inputs for computer systems that deal with natural and agricultural resource management. Crop simulation systems that integrate numeric and symbolic computing must often make decisions based on an imprecise qualitative characterization of precise quantitative input data or simulation results. These imprecise estimators can be viewed as fuzzy descriptions of precise data. We develop a method for representing time-varying quantitative data in a qualitative form using fuzzy sets. Certain fuzzy sets that are used to represent the classification of quantitative over specified time intervals are called fuzzy attributes. Many issues associated with deriving fuzzy attributes were addressed, including: identification of attributes and terms describing these attributes, determination of appropriate time intervals, and development of membership functions for the fuzzy sets. Membership functions were used to assign descriptive terms to specific data sets; these descriptors, in turn, have been used to generate natural language descriptions of the data and to select data with specified attributes for use with a numerical simulation. |
-
Fernand S. Cohen,
Zhaohui Huang,
and Zhengwei Yang.
Invariant Matching and Identification of Curves Using $B$-Splines Curves Representation.
TIP,
4(1):1--10,
1995.
Keywords:
Classification,
Sequential/Temporal Data,
Image Data,
Splines.
| Abstract: |
There have been many techniques for curve shape representation and analysis, ranging from Fourier descriptors, to moments, to implicit polynomials, to different geometry features, to time series models, to $B$-splines, etc. The $B$-splines stand as one of the most efficient curve (surface) representations and possess very attractive properties such as spatial uniqueness, boundedness and continuity, local shape controllability, and invariance to affine transformations. These properties made them very attractive for curve representation, and consequently, they have been extensively used in computer-aided design and computer graphics. Very little work, however, has been devoted to them for recognition purposes. One possible reason might be due to the fact that the $B$-spline curve is not uniquely described by a single set of parameters (control points), which made the curve matching (recognition) process difficult when comparing the respective parameters of the curves to be matched. This paper is an attempt to find matching solutions despite this limitation, and as such, it deals the problem of using $B$-splines for shape recognition and identification from curves, with an emphasis on the following applications: affine invariant matching and classification of 2-D curves with applications in identification of aircraft types based on image silhouettes and writer-identification based on handwritten text. |
-
David Eppstein.
Fast Hierarchical Clustering and Other Applications of Dynamic Closest Pairs.
Disc. Comp. Geom.,
13:111--122,
1995.
Keywords:
Clustering,
Similarity Measures.
| Abstract: |
We develop data structures for dynamic closest pair problems with arbitrary (not necessarily geometric) distance functions, based on a technique previously used by the author for Euclidean closest pairs. We show how to insert and delete objects from an n-object set, maintaining the closest pair, in $O(n \log^2 n)$ time per update and $O(n)$ space. With quadratic space, we can instead use a quadtree-like structure to achieve an optimal time bound, $O(n)$ per update. We apply these data structures to hierarchical clustering, greedy matching and TSP heuristics, and discuss other potential applications in machine learning. Gr\"obner bases, and local improvement algorithms for partition and placement problems. Experiments show our new method to be faster in practice then previously used heuristics. |
-
J.A. Garcia,
J. Fdez-Valdivia,
F. J. Cortijo,
and R. Molina.
A dynamic approach for clustering data.
SP,
44(2):181--196,
1995.
Keywords:
Piecewise Linear Representations,
Clustering,
Similarity Measures,
Sequential/Temporal Data.
| Abstract: |
This paper introduces a new method for clustering data using a dynamic scheme. An appropriate partitioning is obtained based on both a dissimilarity measure between pairs of entities as well as a dynamic procedure of splitting. A dissimilarity function is defined by using the cost of the optimum path from a datum to each entity on a graph, with the cost of the path being defined as the greatest distance between two successive vertices on the path. The procedure of clustering is dynamic in the sense that the initial problem of determining a partition into an unknown number of natural groupings has been reduced to a sequence of only two class splitting stages. Having arisen from any particular application, the proposed approach could be effective for many domains, and is especially successful to identify clusters if there is lack of prior knowledge about the data set. The usefulness of the dynamic approach to deal with elongated or non-piecewise linear separable clusters as well as sparse and dense groupings is demonstrated with several data sets. |
-
Isak Gath and Dan Hoory.
Fuzzy clustering of elliptic ring-shaped clusters.
PRL,
16:727--741,
1995.
Keywords:
Clustering,
Fuzzy Clustering,
Image Data.
| Abstract: |
A new method for the detection and parameter estimation of elliptic ring-shaped clusters is developed. The algorithm is based on optimization of an objective function in which the ellipse prototype parameters (2 foci and a radius) are included in the optimization scheme. The performance of the Fuzzy k-Ellipses (FKE) algorithm has been evaluated on various forms of synthetic data, such as multiple intersecting ellipses, concentric ellipses, incomplete ellipses, ellipses with hidden lines, and combinations of the above, and has been found to be very good. The algorithm was employed for the detection of the outer and inner contours of the heart's left ventricle from MRI images of transverse sections of the thorax. Reconstruction of 3D danymic images(systole-diastole) of the left ventricle was obtained from the 2D images, taken at different points in time during the cardiac cycle. |
-
Ardeshir Goshtasby and Hai-Lun Shyu.
Edge detection by curve fitting.
IVC,
13(3):169--177,
1995.
Keywords:
Image Data.
| Abstract: |
Edge detection is formulated as a curve fitting problem. First, high-gradient pixels are grouped into elongated regions and then a curve is fitted to each. The curve fitting method used in this work does not require solving a system of equations, and therefore is fast. Examples of edge detection by curve fitting on synthetic and real images are presented, and results obtained are compared with those determined by the Laplacian of Gaussian operator. |
-
Amara Graps.
An Introduction to Wavelets.
CSE,
2(2),
1995.
Keywords:
Wavelets,
Multiscale Analysis,
Image Data.
| Abstract: |
Wavelets are mathematical functions that cut up data into different frequency components, and then study each component with a resolution matched to its scale. They have advantages over traditional Fourier methods in analyzing physical situations where the signal contains discontinuities and sharp spikes. Wavelets were developed independently in the fields of mathematics, quantum physics, electrical engineering, and seismic geology. Interchanges between these fields during the last ten years have led to many new wavelet applications such as image compression, turbulence, human vision, radar, and earthquake prediction. This paper introduces wavelets to the interested technical person outside of the digital signal processing field. I describe the history of wavelets beginning with Fourier, compare wavelet transforms with Fourier transforms, state properties and other special aspects of wavelets, and finish with some interesting applications such as image compression, musical tones, and de-noising noisy data. |
-
Richard J. Hathaway and James C. Bezdek.
Optimization of Clustering Criteria by Reformulation.
TFS,
3(2):241--245,
1995.
Keywords:
Clustering,
Fuzzy Clustering.
| Abstract: |
Various hard, fuzzy and possibilistic clustering criteria (objective functions) are useful as bases for a variety of pattern recognition problems. At present, many of these criteria have customized individual optimization algorithms. Because of the specialized nature of these algorithms, experimentation with new and existing criteria can be very inconvenient and costly in terms of development and implementation time. This note shows how to reformulate some clustering criteria so that specialized algorithms can be replaced by general optimization routines found in commercially available software. We prove that the original and reformulated versions of each criterion are fully equivalent. Finally, two numerical examples are given to illustrate reformulation. The second one shows that reformulated hard c-means avoids an unappealing local extrema that traps the unreformulated version of the ubiquitous IRIS data. |
-
C.V. Jawahar,
P.K. Biswas,
and A.K. Ray.
Detection of clusters of distinct geometry: A step towards generalised fuzzy clustering.
PRL,
16:1119--1123,
1995.
Keywords:
Clustering,
Fuzzy Clustering,
Image Data.
| Abstract: |
This paper presents a method of identification of clusters even if they are of distinctly different geometrical categories. An optimal fuzzy partition is achieved by minimisation of a modified objective function. This technique is an elegant one and works well in segmentation of 3D images. |
-
Willi Klösgen.
Efficient Discovery of Interesting Statements in Databases.
JIIS,
4:53--69,
1995.
Keywords:
Speed-up Issues.
| Abstract: |
The Explora system supports {\sl Discovery in Databases} by large scale search for interesting instances of statistical patterns. In this paper we describe how Explora assesses {\sl interestingness} and achieves {\sl computational efficiency}. These problems arise because of of the variety of aptterns and the immense combinatorial possibilities of generating instances when studying relations between variables in subsets of data. To restrict the search with respect to the analysis goals, the user can focus each discovery task performed during an interactive and iterative exploration process. Some basic organization principles of search can further limit the search effort. One principle is to organize search hierarchically and to evaluate first the statistical or information theoretic evidence of the general hypothesis. Then more special hypotheses can be eliminated from further search, if a more general hypothesis was already verified. But this approach has some drawbacks and even in moderately sized data does not prevent large sets of findings. Therefore, in a second evaluation phase, further aspects of interestingness are assessed. A refinement strategy selects the most interesting of the statistically significant statements. A second problem for discovery systems is efficiency. Each hypothesis evaluation requires many data accesses. We describe strategies that reduce data accesses and speed-up computation. |
-
Raghu Krishnapuram,
Hichem Frigui,
and Olfa Nasraoui.
Fuzzy and Possibilistic Shell Clustering Algorithms and Their Application to Boundary Detection and Surface Approximation -- Part II.
TFS,
3(1):44--60,
1995.
Keywords:
Clustering,
Cluster Validity Measures,
Image Data.
| Abstract: |
Shell clustering algorithms are ideally suited for computer vision tasks such as boundary detection and surface approximation, particularly when the boundaries have jagged or scattered edges and when the range data is sparse. This is because shell clustering is insensitive to local aberrations, it can be performed directly in image space, and unlike traditional approaches it does assume dense data and does not use additional features such as curvatures and surface normals. The shell clustering algorithms introduced in Part I of this paper assume that the number of clusters is known, however, which is not the case in many boundary and surface approximation applications. This problem can be overcome by considering cluster validity. In this paper we introduce a validity measure called surface density which is explicitly meant for the type of applications considered in this paper. We show through theoretical derivations that surface density is relatively invariant to size and partiality (incompleteness) of the cluster. We describe unsupervised clustering algorithms that use the surface density measure and other measures to determine the optimum number of shell clusters automatically, and illustrate the application of the proposed algorithms to boundary detection in the case of intensity images and to surface approximation in the case of range images. |
-
Raghu Krishnapuram,
Hichem Frigui,
and Olfa Nasraoui.
Fuzzy and Possibilistic Shell Clustering Algorithms and Their Application to Boundary Detection and Surface Approximation -- Part I.
TFS,
3(1):29--43,
1995.
Keywords:
Noise Handling,
Clustering,
Fuzzy Clustering,
Fuzzy c-Means.
| Abstract: |
Traditionally, prototype-based fuzzy clustering algorithms such as the Fuzzy c-Means (FCM) algorithm have been used to find ``compact'' or ``filled'' clusters. Recently, there have been attempts to generalize such algorithms to the case of hollow or ``shell-like'' clusters, i.e., clusters that lie in subspaces of the feature space. The shell clustering approach provides a powerful means to solve the hitherto unsolved problem of simultaneously fitting multiple curves/surfaces to unsegmented, scattered and sparse data. In this paper, we present several fuzzy and possibilistic algorithms to detect linear and quadric shell clusters. We also introduce generalizations of these algorithms in which the prototypes represent sets of higher-order polynomial functions. The suggested algorithms provide a good trade-off between computational complexity and performance. Since the objective function used in these algorithms is the sum of sqaured distacnes, the clustering in sensitive to noise and outliers. We show that by using a possibilistic approach to clustering, one can make the proposed algorithm robust. |
-
Nikhil R. Pal and James C. Bezdek.
On Cluster Validity for the Fuzzy c-Means Model.
TFS,
3(3):370--379,
1995.
Keywords:
Clustering,
Cluster Validity Measures,
Fuzzy c-Means.
| Abstract: |
Many functionals have been proposed for validation of partitions of object data produced by the fuzzy c-means (FCM) clustering algorithm. We examine the role a subtle but important parameter -- the weighting exponent $m$ of the FCM model -- plays in determining the validity of FCM partitions. The functionals considered are the partition coefficient and entropy indexes of Bezedek, the Xie-Beni, and extended Xie-Beni indexes, and the Fukuyama-Sugeno index. Limit analysis indicates, and numerical experiments confirm, that the Fukuyama-Sugeno index is sensitive to both high and low values of $m$ and may be unreliable because of this. Of the indexes tested, the Xie-Beni index provided the best response over a wide range of choices for the number of clusters, (2-10), and for $m$ from 1.01-7. Finally, our calculations suggest that the best choice form $m$ is probably in the interval $[1.5,2.5]$, whose mean and midpoint, $m=2$, have often been the preferred choice for many users of FCM.\newline (Remark: Some errors in the tables are corrected in \cite{Pal:TFS:5:1}) |
-
Paul L. Rosin and Geoff A. W. West.
Nonparametric Segmentation of Curves into Various Representations.
TPAMI,
17(12):1140--1153,
1995.
| Abstract: |
This paper describes and demonstrates the operation and performance of an algorithm for segmenting connected points into a combination of representations such as lines, circular, elliptical and superelliptical arcs, and polynomials. The algorithm has a number of interesting properties including being scale invariant, nonparameteric, general purpose, and efficient. |
|