Home Kontakt Lehre Research Software


Publications of year 1995

Books and proceedings

  • Mark Stefik. Introduction to Knowledge Systems. 1995.
    Keywords: Classification.
    Abstract: Symbol Systems, Search and Problem Solving, Knowledge and Software Engineering, Reasoning about Time, Reasoning about Space, Reasoning about Uncertainty and Vagueness, Classification, Configuration, Diagnosis and Troubleshoothing, and lots of excercises incl. selected answers

Thesis

  • Volker Breuer. Zeitreihenanalyse mit Stochastischen Automaten: Klassifikation und Modellbildung. PhD thesis, 1995.

Articles in journal or book's chapters

  • Bhavik R. Bakshi and George Stephanopoulos. Reasoning in Time: Modelling, Analysis, and Pattern Recognition of Temporal Process Trends. In Advances in Chemical Engineering, volume 22, pages 485--548. 1995.
    Keywords: Wavelets, Multiscale Analysis, Artificial Intelligence.
    Abstract: [...] To improve the computer's ability to reason efficiently in time, we must first establish new forms for the representation of temporal behaviours. It is the purpose of this chapter to examine the engineering needs for temporal decision making and to propose specific models that encapsulate the requisite temporal characteristics of individual variables and composite processes. Through a combination of analytical techniques, such as {\sl scale-space filtering} and {\sl wavelet-based, multiresolution decomposition of functions}, and modelling paradigms from {\sl artificial intelligence} (AI), we have developed a concise framework that can be used to model, analyze, and synthesize the temporal trends of process operations. Within this framework, the modelling needs for logical reasoning in time can be fully satisfied, while maintaining consistency with the numerical task carried out at the same time. Thus, through the modelling paradigms of this chapter one may put together intelligent systems that use consistent representations for their logical and numerical tasks.
  • James C. Bezdek. Hybrid modelling in pattern recognition and control. KBS, 8(6):359--371, 1995.
    Keywords: Clustering, Fuzzy Clustering, Neural Networks.
    Abstract: Four techniques are discussed: fuzzy pattern recognition (numerical and syntactic), computational neural networks, and fuzzy control. The paper assesses the maturation of these disciplines by giving two examples of crossfertilization between control and pattern recognition. First, the use of pattern recognition (fuzzy clustering and feed forward neural networks) to help develop and represent fuzzy controllers is illustrated. Second, an example of waveform analysis by syntactic pattern recognition that uses the fuzzy control paradigm is given. The paper concludes with some ideas about what should happen next, and what may happen next, in terms of hybridization between the four disciplines.
  • James C. Bezdek, Richard J. Hathaway, and Nikhil R. Pal. Norm-Induced Shell-Prototypes (NISP) Clustering. NPSC, 3:431--450, 1995.
    Keywords: Clustering, Similarity Measures, Fuzzy c-Means.
    Abstract: Recent generalizations of the fuzzy c-means clustering criterion include algorithms that find and represent data with hyperspherical shell (boundary) prototypes. Alternating optimization (AO) algorithms have been successful because inner product norms are used as the criterion of (dis)similarity between the data and the shell prototypes, and they are differentiable. We extend c-shells clustering to arbitrary, non-differentiable norms (such as the Minkowski family); and we show that algorithms similar to the inner product norm based shell prototypes developed by Dave and Krishnapuram can be generalized to any norm on $\RR^s$ for the hard, fuzzy and possibilistic cases. Two non-AO methods for solution of the minimization problem associated with each case are discussed. Finally, we exemplify the two methods for the fuzzy case with two numerical examples that seek ``diamond'' shaped shell prototypes induced by the city block or 1-norm.
  • Susan M. Bridges, Cynthia Higginbotham, James M. McKinion, and Julia E. Hodges. Fuzzy Descriptors of Time-Varying Data: Theory and Applications. AI Applications, 9(2):1--14, 1995.
    Keywords: Classification.
    Abstract: Many AI applications must reason about large quantities of time-varying numerical data. Weather data, for example, are usually one of the most important inputs for computer systems that deal with natural and agricultural resource management. Crop simulation systems that integrate numeric and symbolic computing must often make decisions based on an imprecise qualitative characterization of precise quantitative input data or simulation results. These imprecise estimators can be viewed as fuzzy descriptions of precise data. We develop a method for representing time-varying quantitative data in a qualitative form using fuzzy sets. Certain fuzzy sets that are used to represent the classification of quantitative over specified time intervals are called fuzzy attributes. Many issues associated with deriving fuzzy attributes were addressed, including: identification of attributes and terms describing these attributes, determination of appropriate time intervals, and development of membership functions for the fuzzy sets. Membership functions were used to assign descriptive terms to specific data sets; these descriptors, in turn, have been used to generate natural language descriptions of the data and to select data with specified attributes for use with a numerical simulation.
  • Fernand S. Cohen, Zhaohui Huang, and Zhengwei Yang. Invariant Matching and Identification of Curves Using $B$-Splines Curves Representation. TIP, 4(1):1--10, 1995.
    Keywords: Classification, Sequential/Temporal Data, Image Data, Splines.
    Abstract: There have been many techniques for curve shape representation and analysis, ranging from Fourier descriptors, to moments, to implicit polynomials, to different geometry features, to time series models, to $B$-splines, etc. The $B$-splines stand as one of the most efficient curve (surface) representations and possess very attractive properties such as spatial uniqueness, boundedness and continuity, local shape controllability, and invariance to affine transformations. These properties made them very attractive for curve representation, and consequently, they have been extensively used in computer-aided design and computer graphics. Very little work, however, has been devoted to them for recognition purposes. One possible reason might be due to the fact that the $B$-spline curve is not uniquely described by a single set of parameters (control points), which made the curve matching (recognition) process difficult when comparing the respective parameters of the curves to be matched. This paper is an attempt to find matching solutions despite this limitation, and as such, it deals the problem of using $B$-splines for shape recognition and identification from curves, with an emphasis on the following applications: affine invariant matching and classification of 2-D curves with applications in identification of aircraft types based on image silhouettes and writer-identification based on handwritten text.
  • David Eppstein. Fast Hierarchical Clustering and Other Applications of Dynamic Closest Pairs. Disc. Comp. Geom., 13:111--122, 1995.
    Keywords: Clustering, Similarity Measures.
    Abstract: We develop data structures for dynamic closest pair problems with arbitrary (not necessarily geometric) distance functions, based on a technique previously used by the author for Euclidean closest pairs. We show how to insert and delete objects from an n-object set, maintaining the closest pair, in $O(n \log^2 n)$ time per update and $O(n)$ space. With quadratic space, we can instead use a quadtree-like structure to achieve an optimal time bound, $O(n)$ per update. We apply these data structures to hierarchical clustering, greedy matching and TSP heuristics, and discuss other potential applications in machine learning. Gr\"obner bases, and local improvement algorithms for partition and placement problems. Experiments show our new method to be faster in practice then previously used heuristics.
  • J.A. Garcia, J. Fdez-Valdivia, F. J. Cortijo, and R. Molina. A dynamic approach for clustering data. SP, 44(2):181--196, 1995.
    Keywords: Piecewise Linear Representations, Clustering, Similarity Measures, Sequential/Temporal Data.
    Abstract: This paper introduces a new method for clustering data using a dynamic scheme. An appropriate partitioning is obtained based on both a dissimilarity measure between pairs of entities as well as a dynamic procedure of splitting. A dissimilarity function is defined by using the cost of the optimum path from a datum to each entity on a graph, with the cost of the path being defined as the greatest distance between two successive vertices on the path. The procedure of clustering is dynamic in the sense that the initial problem of determining a partition into an unknown number of natural groupings has been reduced to a sequence of only two class splitting stages. Having arisen from any particular application, the proposed approach could be effective for many domains, and is especially successful to identify clusters if there is lack of prior knowledge about the data set. The usefulness of the dynamic approach to deal with elongated or non-piecewise linear separable clusters as well as sparse and dense groupings is demonstrated with several data sets.
  • Isak Gath and Dan Hoory. Fuzzy clustering of elliptic ring-shaped clusters. PRL, 16:727--741, 1995.
    Keywords: Clustering, Fuzzy Clustering, Image Data.
    Abstract: A new method for the detection and parameter estimation of elliptic ring-shaped clusters is developed. The algorithm is based on optimization of an objective function in which the ellipse prototype parameters (2 foci and a radius) are included in the optimization scheme. The performance of the Fuzzy k-Ellipses (FKE) algorithm has been evaluated on various forms of synthetic data, such as multiple intersecting ellipses, concentric ellipses, incomplete ellipses, ellipses with hidden lines, and combinations of the above, and has been found to be very good. The algorithm was employed for the detection of the outer and inner contours of the heart's left ventricle from MRI images of transverse sections of the thorax. Reconstruction of 3D danymic images(systole-diastole) of the left ventricle was obtained from the 2D images, taken at different points in time during the cardiac cycle.
  • Ardeshir Goshtasby and Hai-Lun Shyu. Edge detection by curve fitting. IVC, 13(3):169--177, 1995.
    Keywords: Image Data.
    Abstract: Edge detection is formulated as a curve fitting problem. First, high-gradient pixels are grouped into elongated regions and then a curve is fitted to each. The curve fitting method used in this work does not require solving a system of equations, and therefore is fast. Examples of edge detection by curve fitting on synthetic and real images are presented, and results obtained are compared with those determined by the Laplacian of Gaussian operator.
  • Amara Graps. An Introduction to Wavelets. CSE, 2(2), 1995.
    Keywords: Wavelets, Multiscale Analysis, Image Data.
    Abstract: Wavelets are mathematical functions that cut up data into different frequency components, and then study each component with a resolution matched to its scale. They have advantages over traditional Fourier methods in analyzing physical situations where the signal contains discontinuities and sharp spikes. Wavelets were developed independently in the fields of mathematics, quantum physics, electrical engineering, and seismic geology. Interchanges between these fields during the last ten years have led to many new wavelet applications such as image compression, turbulence, human vision, radar, and earthquake prediction. This paper introduces wavelets to the interested technical person outside of the digital signal processing field. I describe the history of wavelets beginning with Fourier, compare wavelet transforms with Fourier transforms, state properties and other special aspects of wavelets, and finish with some interesting applications such as image compression, musical tones, and de-noising noisy data.
  • Richard J. Hathaway and James C. Bezdek. Optimization of Clustering Criteria by Reformulation. TFS, 3(2):241--245, 1995.
    Keywords: Clustering, Fuzzy Clustering.
    Abstract: Various hard, fuzzy and possibilistic clustering criteria (objective functions) are useful as bases for a variety of pattern recognition problems. At present, many of these criteria have customized individual optimization algorithms. Because of the specialized nature of these algorithms, experimentation with new and existing criteria can be very inconvenient and costly in terms of development and implementation time. This note shows how to reformulate some clustering criteria so that specialized algorithms can be replaced by general optimization routines found in commercially available software. We prove that the original and reformulated versions of each criterion are fully equivalent. Finally, two numerical examples are given to illustrate reformulation. The second one shows that reformulated hard c-means avoids an unappealing local extrema that traps the unreformulated version of the ubiquitous IRIS data.
  • C.V. Jawahar, P.K. Biswas, and A.K. Ray. Detection of clusters of distinct geometry: A step towards generalised fuzzy clustering. PRL, 16:1119--1123, 1995.
    Keywords: Clustering, Fuzzy Clustering, Image Data.
    Abstract: This paper presents a method of identification of clusters even if they are of distinctly different geometrical categories. An optimal fuzzy partition is achieved by minimisation of a modified objective function. This technique is an elegant one and works well in segmentation of 3D images.
  • Willi Klösgen. Efficient Discovery of Interesting Statements in Databases. JIIS, 4:53--69, 1995.
    Keywords: Speed-up Issues.
    Abstract: The Explora system supports {\sl Discovery in Databases} by large scale search for interesting instances of statistical patterns. In this paper we describe how Explora assesses {\sl interestingness} and achieves {\sl computational efficiency}. These problems arise because of of the variety of aptterns and the immense combinatorial possibilities of generating instances when studying relations between variables in subsets of data. To restrict the search with respect to the analysis goals, the user can focus each discovery task performed during an interactive and iterative exploration process. Some basic organization principles of search can further limit the search effort. One principle is to organize search hierarchically and to evaluate first the statistical or information theoretic evidence of the general hypothesis. Then more special hypotheses can be eliminated from further search, if a more general hypothesis was already verified. But this approach has some drawbacks and even in moderately sized data does not prevent large sets of findings. Therefore, in a second evaluation phase, further aspects of interestingness are assessed. A refinement strategy selects the most interesting of the statistically significant statements. A second problem for discovery systems is efficiency. Each hypothesis evaluation requires many data accesses. We describe strategies that reduce data accesses and speed-up computation.
  • Raghu Krishnapuram, Hichem Frigui, and Olfa Nasraoui. Fuzzy and Possibilistic Shell Clustering Algorithms and Their Application to Boundary Detection and Surface Approximation -- Part II. TFS, 3(1):44--60, 1995.
    Keywords: Clustering, Cluster Validity Measures, Image Data.
    Abstract: Shell clustering algorithms are ideally suited for computer vision tasks such as boundary detection and surface approximation, particularly when the boundaries have jagged or scattered edges and when the range data is sparse. This is because shell clustering is insensitive to local aberrations, it can be performed directly in image space, and unlike traditional approaches it does assume dense data and does not use additional features such as curvatures and surface normals. The shell clustering algorithms introduced in Part I of this paper assume that the number of clusters is known, however, which is not the case in many boundary and surface approximation applications. This problem can be overcome by considering cluster validity. In this paper we introduce a validity measure called surface density which is explicitly meant for the type of applications considered in this paper. We show through theoretical derivations that surface density is relatively invariant to size and partiality (incompleteness) of the cluster. We describe unsupervised clustering algorithms that use the surface density measure and other measures to determine the optimum number of shell clusters automatically, and illustrate the application of the proposed algorithms to boundary detection in the case of intensity images and to surface approximation in the case of range images.
  • Raghu Krishnapuram, Hichem Frigui, and Olfa Nasraoui. Fuzzy and Possibilistic Shell Clustering Algorithms and Their Application to Boundary Detection and Surface Approximation -- Part I. TFS, 3(1):29--43, 1995.
    Keywords: Noise Handling, Clustering, Fuzzy Clustering, Fuzzy c-Means.
    Abstract: Traditionally, prototype-based fuzzy clustering algorithms such as the Fuzzy c-Means (FCM) algorithm have been used to find ``compact'' or ``filled'' clusters. Recently, there have been attempts to generalize such algorithms to the case of hollow or ``shell-like'' clusters, i.e., clusters that lie in subspaces of the feature space. The shell clustering approach provides a powerful means to solve the hitherto unsolved problem of simultaneously fitting multiple curves/surfaces to unsegmented, scattered and sparse data. In this paper, we present several fuzzy and possibilistic algorithms to detect linear and quadric shell clusters. We also introduce generalizations of these algorithms in which the prototypes represent sets of higher-order polynomial functions. The suggested algorithms provide a good trade-off between computational complexity and performance. Since the objective function used in these algorithms is the sum of sqaured distacnes, the clustering in sensitive to noise and outliers. We show that by using a possibilistic approach to clustering, one can make the proposed algorithm robust.
  • Nikhil R. Pal and James C. Bezdek. On Cluster Validity for the Fuzzy c-Means Model. TFS, 3(3):370--379, 1995.
    Keywords: Clustering, Cluster Validity Measures, Fuzzy c-Means.
    Abstract: Many functionals have been proposed for validation of partitions of object data produced by the fuzzy c-means (FCM) clustering algorithm. We examine the role a subtle but important parameter -- the weighting exponent $m$ of the FCM model -- plays in determining the validity of FCM partitions. The functionals considered are the partition coefficient and entropy indexes of Bezedek, the Xie-Beni, and extended Xie-Beni indexes, and the Fukuyama-Sugeno index. Limit analysis indicates, and numerical experiments confirm, that the Fukuyama-Sugeno index is sensitive to both high and low values of $m$ and may be unreliable because of this. Of the indexes tested, the Xie-Beni index provided the best response over a wide range of choices for the number of clusters, (2-10), and for $m$ from 1.01-7. Finally, our calculations suggest that the best choice form $m$ is probably in the interval $[1.5,2.5]$, whose mean and midpoint, $m=2$, have often been the preferred choice for many users of FCM.\newline (Remark: Some errors in the tables are corrected in \cite{Pal:TFS:5:1})
  • Paul L. Rosin and Geoff A. W. West. Nonparametric Segmentation of Curves into Various Representations. TPAMI, 17(12):1140--1153, 1995.
    Abstract: This paper describes and demonstrates the operation and performance of an algorithm for segmenting connected points into a combination of representations such as lines, circular, elliptical and superelliptical arcs, and polynomials. The algorithm has a number of interesting properties including being scale invariant, nonparameteric, general purpose, and efficient.

Conference's articles

  • Rakesh Agrawal, King-Li Lin, Harpreet S. Sawhney, and Kyuseok Shim. Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. In VLDB95, Zürich, Switzerland, 1995.
    Keywords: Noise Handling, Similarity Measures, Sequential/Temporal Data.
    Abstract: We introduce a new model of similarity of time sequences that captures the intuitive notion that two sequences should be considered similar if they have enough non-overlapping time-ordered pairs of subsequences that are similar. The model allows the amplitude of one of the two sequences to be scaled by any suitable amount and its offset adjusted appropriately. Two subsequences are considered similar if one can be enclosed within an envelope of a specified width drawn around the other. The model also allows non-matching gaps in the matching sequences. The matching subsequences need not be aligned along the time axis.\newline Given this model of similarity, we present fast search techniques for discovering all similar sequences in a set of sequences. These techniques can also be used to find all (sub)sequences similar to a given sequence. We applied this matching system to the U.S.\ mutual funds data and discovered interesting matches.
  • Rakesh Agrawal and Ramakrishnan Srikant. Mining Sequential Patterns. In ICDE95, Taipei, Taiwan, pages 3--14, 1995.
    Keywords: Association Rules, Sequential/Temporal Patterns.
    Abstract: We are given a large database of customer transactions, where each transaction consists of customer-id, transaction time, and the items bought in the transaction. We introduce the problem of mining sequential patterns over such databases. We present three algorithms to solve this problem, and empirically evaluate their performance using synthetic data. Two of the proposed algorithms, AprioriSome and AprioriAll, have comparable performance, albeit AprioriSome performs a little better when the minimum number of customer that must support a sequential pattern is low. Scale-up experiments show that both AprioriSome and AprioriAll scale linearly with the number of customer transactions. They also have excellent scale-up properties with respect to the number of transactions per customer and the number of items in a transaction.
  • Tai Wai Cheng, Dimitry B. Goldgof, and Lawrence O. Hall. Fast Clustering with Application to Fuzzy Rule Generation. In Proceedings of the 4th IEEE International Conference on Fuzzy Systems, Yokohama, Japan, pages 2289--2295, 1995.
    Keywords: Sampling, Classification, Clustering, Fuzzy c-Means, Rule Generation, Image Data, Speed-up Issues.
    Abstract: This paper presents a multistage random sampling fuzzy c-means based clustering algorithm, which significantly reduces the computation time required to partition a data set into $c$ classes. A series of subsets of the full data set are used for classification in order to provide an approximation to the final cluster centers. The quality of the final partitions is equivalent to that of the fuzzy c-means. The speed-up is normally a factor of 2-3 times, which is especially significant for high dimensional spaces and large data sets. Recently clustering has been used to generate fuzzy rules for control. In this paper, we show that the multistage random sampling fuzzy c-means based clustering algorithm (mrFCM) can be effectively used to create fuzzy rules in the domain of magnetic resonance images where over 60,000 patterns and 3 features or attributes are common.
  • Luc Dehaspe and Luc De Raedt. Mining Association Rules in Multiple Relations. In Proc. 7th Int. Workshop on ILP, volume 1297 of LNCS, pages 125--132, 1995.
    Keywords: Inductive Logic Programming, Association Rules.
    Abstract: The application of algorithms for efficiently generating association rules is so far restricted to cases where information is put together in a single relation. We describe how this restriction can be overcome through the combination of the available algoritms with standard techniques from the field of inductive logic programming. We present the system WARMR, which extends APRIORI to min association rules in multiple relations. We apply WARMR to the natural language processing task of mining part-of-speech tagging rules in a large corpus of English.
  • Jiawei Han and Yongjian Fu. Discovery of Multiple-Level Association Rules from Large Databases. In VLDB95, pages 420--431, 1995.
    Keywords: Association Rules.
    Abstract: Previous studies on mining association rules find rules at single concept level, however, mining association rules at multiple concept levels may lead to the discovery of more specific and conrete knowledge from data. In this study, a top-down progressive deepening method is developed for mining multiple-level association rules from large transaction databases by extension of some existing association rule techniques. A group of variant algorithms are proposed based on the ways of sharing intermediate results, with the relative performacne tested on different kinds of data. Relaxation of the rule conditions for finding ``level-crossing'' association rules is also discussed in the paper.
  • Marta Karczewicz, Matti Hämäläinen, and Moncef Gabbouj. Multiple knot spline approximation for ECG data compression. In Proceedings NSIP, 1995.
    Keywords: Splines, Medical Applications.
    Abstract: This paper presents a new adaptive compression method for ECG signals based on B-spline expansion, allowing multiple knots at the same location. The algorithm tries to remove as many of these knots as possible to yield a high compression ratio. Low bit rates on the order of 160-200 bit/s are achieved with very good quality of the reconstructed signal. The algorithm is compared with other transform based schemes.
  • Frank Klawonn and Rudolf Kruse. Derivation of Fuzzy Classification Rules from Multidimensional Data. In G.E. Lasker and X. Liu, editors, Advances in Intelligent Data Analysis, Windsor, Ontario, pages 90--94, 1995.
    Keywords: Classification, Clustering, Fuzzy Clustering.
    Abstract: This paper describes techniques for deriving fuzzy classification rules based on special modified fuzzy clustering algorithms. The basic idea is that each fuzzy cluster induces a fuzzy classification rule. The fuzzy sets appearing in a rule associated with a fuzzy cluster are obtained by projecting the cluster to the one-dimensional coordinate spaces. In order to allow clusters of varying shape and size we derive special fuzzy clustering algorithms which are searching for clusters in the form of axes-parallel hyper-ellipsoids. Our method can be applied to classification tasks where the classification of the sample data is known as well as when it is not known.
  • Krzysztof Koperski and Jiawei Han. Discovery of Spatial Association Rules in Geographic Information Systems. In SSD95, volume 951 of LNCS, pages 47-66, 1995.
    Keywords: Association Rules.
    Abstract: Spatial data mining, i.e., discovery of interesting implicit knowledge in spatial databases, is an important task for understanding and use of spatial data- and knowledge-bases. In this paper, an efficient method for mining strong spatialassociation rules in geographic information databases is proposed and studied. A spatial association rule is a rule indicating certain association relationships among a set of spatial and possibly some nonspatial predicates. A strong rule indicates that the patterns in the rule have relatively frequent occurrences in the database and strong implication relationships. Several optimizatin techniques are explored, including a two-step spatial computation technique (approximate computation on large sets, and refined computations on small promising patterns), shared processing in the derivation of large predicates at multiple concept levels, etc. Our analysis shows that interesting association rules can be discovered efficiently in large spatial databases.
  • Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo. Discovering frequent episodes in sequences. In KDD95, Menlo Park, Calif., pages 210--215, 1995.
    Keywords: Sequential/Temporal Data.
    Abstract: Sequences of events describing the behaviour and actions of users or systems can be collected in several domains. In this paper we consider the problem of recognizing frequent episodes in such sequences of events. An episode is defined to be a collection of events that occur within time intervals of a given size in a given partial order. Once such episodes are known, one can produce rules for describing or predicting the behaviour of the sequence. We describe an efficient algorithm for the discovery of all frequent episodes from a given class of episodes, and present experimental results.
  • Ramakrishnan Srikant and Rakesh Agrawal. Mining Generalized Association Rules. In VLDB95, pages 407--419, 1995.
    Keywords: Association Rules.
    Abstract: We introduce the problem of mining generalized association rules. Given a large database of transactions, where each transaction consists of a set of items, and a taxonomy ({\sl is-a} hierarchy) on the items, we find associations between items at any level of the taxonomy. For example, given a taxonomy that says thatjackets {\sl is-a} outerwear {\sl is-a} clothes, we may infer a rule that ``people who buy outerwear tend to buy shoes''. This rule may hold even if rules that ``people who buy jackets tend to buy shoes'', and ``people who buy clothes tend to buy shoes'' do not hold. An obvious solution to the problem is to add all ancestors of each item in a transaction to the transaction, and then run any of the algorithms for mining association rules on these ``extended transactions''. However, this ``Basic'' algorithm is not very fast; we present two algorithms Cumulate and EstMerge, which run 2 to 5 times faster than Basic (and more than 100 times faster on real-life datasets). We also present a new interest-measure for rules which uses the information in the taxonomy. Given a user-specified ``minimum interest-level'', this measure prunes a alarge number of redundant rules; 40\27777756214 60\10004716220f all the rules were pruned on two real-life datasets.

Internal reports

  • Hagit Shatkay. Approximate Queries and Representations for Large Data Sequences. Technical report 3, 1995.
    Keywords: Sequential/Temporal Data.
    Abstract: [Excert from introduction:] In this paper we present a notion of approximation that is appropriate for long data sequences and time series. We discuss approximation both in terms of queries and in terms of representation. We generalize the previous notion of approximate queries, and suggest a strategy for handling this kind of approximation, without committing ourselves to any specific query language or form.

Disclaimer

This list of publications is neither official nor complete, but a personal compilation.

Copyright and all rights therein are retained by authors or by other copyright holders. All person copying this information are expected to adhere to the terms and constraints invoked by each author's copyright.

This document was translated from BibTEX by bibtex2html

Home © F. Höppner last update: Tue Dec 7 08:49:56 CET 2004