

Publications of year 1997

Books and proceedings

  • Enrique Castillo, José Gutiérrez, and Ali S. Hadi. Expert Systems and Probabilistic Network Models. 1997.
  • Frank Höppner, Frank Klawonn, and Rudolf Kruse. Fuzzy-Clusteranalyse, Computational Intelligence. Braunschweig, Germany, 1997.
    Keywords: Clustering.
    Abstract: This book gives a methodical introduction to the numerous fuzzy clustering algorithms and their applications in data analysis, the generation of rules for fuzzy controllers, and classification and approximation problems, together with a detailed presentation of shell clustering for recognizing geometric contours in images.
  • Tom M. Mitchell. Machine Learning. 1997.
    Keywords: Decision Trees, Neural Networks.
    Abstract: Chapter Headings: Introduction, Concept Learning and the General-to-Specific Ordering, Decision Tree Learning, Artificial Neural Networks, Evaluating Hypotheses, Bayesian Learning, Computational Learning Theory, Instance-Based Learning, Genetic Algorithms, Learning Sets of Rules, Analytical Learning, Combining Inductive and Analytical Learning, Reinforcement Learning
  • James O. Ramsay and B. W. Silverman. Functional Data Analysis, Springer Series in Statistics. 1997.
    Abstract: Chapter headings: Introduction, Notation and techniques, Representing functional data as smooth functions, The roughness penalty approach, The registration and display of functional data, Principal components analysis for functional data, Regularized principal components analysis, Principal components analysis of mixed data, Functional linear models, Functional linear models for scalar responses, Functional linear models for functional responses, Canonical correlation and discriminant analysis, Differential operators in functional data analysis, Principal differential analysis, More general roughness penalties, Some perspectives on FDA, Appendix: Some algebraic and functional techniques.
  • Mika Sato, Yoshiharu Sato, and Lakhmi C. Jain. Fuzzy Clustering Models and Applications, volume 9 of Studies in Fuzziness and Soft Computing. Heidelberg, Germany, 1997.
    Keywords: Clustering, Fuzzy Clustering.

Articles in journals or book chapters

  • Padhraic Smyth. Clustering sequences with hidden Markov models. In Advances in Neural Information Processing 9. 1997.
    Keywords: Clustering, Sequential/Temporal Data.
    Abstract: This paper discusses a probabilistic model-based approach to clustering sequences, using hidden Markov models (HMMs). The problem can be framed as a generalization of the standard mixture model approach to clustering in feature space. Two primary issues are addressed. First, a novel parameter initialization procedure is proposed, and second, the more difficult problem of determining the number of clusters $K$ from the data is investigated. Experimental results indicate that the proposed techniques are useful for revealing hidden cluster structure in data sets of sequences.
  • Eike Anklam, Maria Rosa Bassani, Thomas Eiberger, Stefan Kriebel, Markus Lipp, and Reinhard Matissek. Characterization of cocoa butters and other vegetable fats by pyrolysis-mass spectrometry. Fresenius J. Anal. Chem., 357:981--984, 1997.
    Keywords: Neural Networks.
    Abstract: Pyrolysis Mass Spectrometry (Py-MS) was used for the discrimination of cocoa butters from other vegetable fats. Mass spectra ranging from 50 amu to 250 amu were analyzed by principal components analysis (PCA) and with neural nets. The application of neural nets leads to a good discrimination between the two classes. Detailed analysis of the nets revealed that only the first 60 masses were used within the net. The use of PCA requires a careful selection of the number of masses included in the calculation. Canonical variance analysis was applied to determine the significant masses. Optimal performance of PCA was observed only using the first 22 significant masses. Most of these masses were different from the ones used by the neural net. It seems that the mass spectra obtained by Py-MS contain sufficient information for the discrimination of pure cocoa butter from other vegetable fats, but none of the methods seems to be able to extract all information available. Neural nets provide a very robust method for this task, and no prior data selection was necessary.
  • Rajesh N. Davé and Raghu Krishnapuram. Robust Clustering Methods: A Unified View. TFS, 5(2):270--293, 1997.
    Keywords: Clustering.
    Abstract: Clustering methods need to be robust if they are to be useful in practice. In this paper, we analyze several popular robust clustering methods and show that they have much in common. We also establish a connection between fuzzy set theory and robust statistics and point out the similarities between robust clustering methods and statistical methods such as the weighted least-squares (LS) technique, the M estimator, the minimum volume ellipsoid (MVE) algorithm, cooperative robust estimation (CRE), minimization of probability of randomness (MINPRAN), and the epsilon contamination model. By gleaning the common principles upon which the methods proposed in the literature are based, we arrive at a unified view of robust clustering methods. We define several general concepts that are useful in robust clustering, state the robust clustering problem in terms of the defined concepts, and propose generic algorithms and guidelines for clustering noisy data. We also discuss why the generalized Hough transform is a suboptimal solution to the robust clustering problem.
  • Thomas Drakengren and Peter Jonsson. Eight Maximal Tractable Subclasses of Allen's Algebra with Metric Time. JAIR, 7:25--45, 1997.
    Keywords: Temporal Reasoning.
    Abstract: This paper combines two important directions of research in temporal reasoning: that of finding maximal tractable subclasses of Allen's interval algebra, and that of reasoning with metric temporal information. Eight new maximal tractable subclasses of Allen's interval algebra are presented, some of them subsuming previously reported tractable algebras. The algebras allow for metric temporal constraints on interval starting or ending points, using the recent framework of Horn DLRs. Two of the algebras can express the notion of sequentiality between intervals, being the first such algebras admitting both qualitative and metric time.
  • A. Famili, Wei-Min Shen, Richard Weber, and Evangelos Simoudis. Data Preprocessing and Intelligent Data Analysis. IDA, 1(1), 1997. [ URL ]
    Abstract: This paper first provides an overview of data preprocessing, focusing on problems of real world data. These are primarily problems that have to be carefully understood and solved before any data analysis process can start. The paper discusses in detail two main reasons for performing data preprocessing: (i) problems with the data and (ii) preparation for data analysis. The paper continues with details of data preprocessing techniques achieving each of the above mentioned objectives. A total of 14 techniques are discussed. Two examples of data preprocessing applications from two of the most data rich domains are given at the end. The applications are related to semiconductor manufacturing and aerospace domains, where large amounts of data are available and are fairly reliable. Future directions and some challenges are discussed at the end.
  • Frank Höppner. Fuzzy Shell Clustering Algorithms in Image Processing: Fuzzy c-Rectangular and 2-Rectangular Shells. TFS, 5(4):599--613, 1997. [ PDF ]
    Keywords: Clustering, Image Data.
    Abstract: Objective function-based clustering has been generalized recently to detect contours of circles and ellipses or even hyperbolas in a set of binary data vectors. Although there are special algorithms to discover lines, the detection of rectangles needs further treatment. A simple line-detection algorithm is not sufficient for rectangles, since for identifying four lines as one rectangle, additional information such as the length of the lines and whether they are parallel or meet at a right angle is necessary. In this paper, a special fuzzy shell-clustering algorithm for rectangular contours is developed. The principal idea behind it can be generalized for other polygons so we also derive an algorithm that is capable of detecting rectangles and other polygons as well as approximating circles, ellipses and lines.
  • Nicolaos B. Karayiannis and James C. Bezdek. An Integrated Approach to Fuzzy Learning Vector Quantization and Fuzzy c-Means Clustering. TFS, 5(4):622--628, 1997.
    Keywords: Discretization, Classification, Clustering, Fuzzy c-Means.
    Abstract: This letter derives a new interpretation for a family of competitive learning algorithms and investigates their relationship to fuzzy c-means and fuzzy learning vector quantization. These algorithms map a set of feature vectors into a set of prototypes associated with a competitive network that performs unsupervised learning. Derivation of the new algorithms is accomplished by minimizing an average generalized distance between the feature vectors and prototypes using gradient descent. A close relationship between the resulting algorithms and fuzzy c-means is revealed by investigating the functionals involved. It is also shown that the fuzzy c-means and fuzzy learning vector quantization algorithms are related to the proposed algorithms if the learning rate at each iteration is selected to satisfy a certain condition.
  • Frank Klawonn and Rudolf Kruse. Constructing a fuzzy controller from data. FSS, 85:177--193, 1997.
  • László T. Kóczy and Alesandro Zorat. Fuzzy systems and approximation. FSS, 85:203--222, 1997.
    Abstract: The basic motivation for using fuzzy rule-based systems, especially for control purposes, is to deduce simple and fast approximations of unknown or overly complicated models. Fuzzy rule-based systems have become very popular because of their transparency and ease of tuning and modification. Recently, some results concerning the explicit functions implemented by realistic fuzzy controllers presented the class of functions that could be implemented this way. Some parallel results, on the other hand, attempted to prove that the main advantage of using fuzzy systems was the suitability for approximation with arbitrary accuracy in their universality. The explicit formulas and some very recent theoretical results made it clear, however, that fuzzy systems were not really good approximators, as realistic fuzzy controllers could generate only very rough approximations of given transference functions. In connection with approximation, the question can be asked whether there is an optimal fineness/roughness of a fuzzy rule-base controlling a certain action, that is, a roughness that gives minimal time complexity. As an example, a target tracking problem was chosen ("Cat and Mouse", or "Hawk and Sparrow" problem) where the antagonistic criteria of minimizing inference time by the given rule-base and minimizing action time (search for the target, with given uncertainty provided by the rule model) were examined. Under certain assumptions the solution of this optimization problem leads to nontrivial rule-base sizes. These results also have practical applicability, since if a fine enough model of the system is known it is always possible to generate a rougher version of the same by applying the model transformation technique offered by rule interpolation with $\alpha$-levels.
  • Craig G. Nevill-Manning and Ian W. Witten. Identifying Hierarchical Structure in Sequences: A linear-time algorithm. JAIR, 7:67--82, 1997.
    Keywords: Sequential/Temporal Data.
    Abstract: SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence, which offers insights into its lexical structure. The algorithm is driven by two constraints that reduce the size of the grammar, and produce structure as a by-product. SEQUITUR breaks new ground by operating incrementally. Moreover, the method's simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 50000 symbols per second and has been applied to an extensive range of real world sequences.
  • Nikhil R. Pal and James C. Bezdek. Corrections to "On Cluster Validity for the Fuzzy c-Means Model". TFS, 5(1):152--153, 1997.
    Keywords: Clustering, Cluster Validity Measures, Fuzzy c-Means.
    Abstract: Validation of partitions produced by the fuzzy c-means clustering algorithm was discussed by Pal and Bezdek in the original paper. Two tables in that paper contain erroneous values. This erratum reports the correct values and notes that the conclusions drawn in the original paper, based on these simulations, remain unchanged.
  • Witold Pedrycz and James Waletzky. Fuzzy Clustering with Partial Supervision. SMCB, 27(5):787--795, 1997.
    Keywords: Classification, Clustering, Fuzzy Clustering.
    Abstract: Presented here is a problem of fuzzy clustering with partial supervision, i.e., unsupervised learning completed in the presence of some labeled patterns. The classification information is incorporated additively as a part of an objective function utilized in the standard fuzzy ISODATA. The algorithms proposed in the paper embrace two specific learning scenarios of complete and incomplete class assignment of the labeled patterns. Numerical examples including both synthetic and real-world data arising in the realm of software engineering are also provided.
  • Padhraic Smyth. Belief Networks, Hidden Markov Models, and Markov Random Fields: A Unifying View. PRL, 18:1261--1268, 1997.
    Keywords: Image Data.
    Abstract: The use of graphs to represent independence structure in multivariate probability models has been pursued in a relatively independent fashion across a wide variety of research disciplines since the beginning of this century. This paper provides a brief overview of the current status of such research with particular attention to recent developments which have served to unify such seemingly disparate topics as probabilistic expert systems, statistical physics, image analysis, genetics, decoding of error-correcting codes, Kalman filters, and speech recognition with Markov models.

Conference articles

  • Béla Bollobás, Gautam Das, Dimitrios Gunopulos, and Heikki Mannila. Time-Series Similarity Problems and Well-Separated Geometric Sets. In Proc. of ACM Symp. on Computational Geometry (SOCG), 1997.
    Keywords: Noise Handling, Similarity Measures, Sequential/Temporal Data.
    Abstract: Given a pair of nonidentical complex objects, defining (and determining) how similar they are to each other is a nontrivial problem. In data mining applications, one frequently needs to determine the similarity between two time series. We analyze a model of time-series similarity that allows outliers, and different scaling functions. We present deterministic and randomized algorithms for computing this notion of similarity. The algorithms are based on nontrivial tools and methods from computational geometry. In particular, we use properties of families of well-separated geometric sets. The randomized algorithm has provably good performance and also works extremely efficiently in practice.
  • Sarah Boyd. Detecting and Describing Patterns in Time-Varying Data Using Wavelets. In IDA97, LNCS 1280, pages 585--596, 1997.
    Keywords: Wavelets, Multiscale Analysis.
    Abstract: Reasoning effectively about time-varying data requires sophisticated pattern detection mechanisms. This paper describes techniques developed for detecting patterns in time-varying data with the ultimate aim of generating textual descriptions of the data. Preliminary experiments are described in which the visually significant features in weather data are extracted and compared against hand-written expert descriptions.
  • Gautam Das, Rudolf Fleischer, Leszek Gasieniec, Dimitrios Gunopulos, and Juha Kärkkäinen. Episode Matching. In Proceedings of Conference on Combinatorial Pattern Matching, 1997.
    Keywords: Sequential/Temporal Data.
    Abstract: Given two words, text $T$ of length $n$ and episode $P$ of length $m$, the episode matching problem is to find all minimal length substrings of text $T$ that contain episode $P$ as a subsequence. The respective optimization problem is to find the smallest number $w$, such that text $T$ has a subword of length $w$ which contains episode $P$. In this paper, we introduce a few efficient off-line as well as on-line algorithms for the entire problem, where by on-line algorithms we mean algorithms which search from left to right consecutive text symbols only once. We present two alphabet independent algorithms which work in time $O(nm)$. The off-line algorithm operates in $O(1)$ additional space while the on-line algorithm pays for its property with $O(m)$ additional space. Two other on-line algorithms have subquadratic time complexity. One of them works in time $O(nm/\log m)$ and $O(m)$ additional space. The other one gives a time/space trade-off, i.e., it works in time $O(n+s+nm\log\log s/\log(s/m))$ when additional space is limited to $O(s)$. Finally we present two approximation algorithms for the optimization problem. The off-line algorithm is alphabet independent, it has superlinear time complexity $O(n/\epsilon + n \log\log (n/m))$ and it uses only constant space. The on-line algorithm works in time $O(n/\epsilon+n)$ and uses space $O(m)$. Both approximation algorithms achieve $1+\epsilon$ approximation ratio, for any $\epsilon>0$.
  • Gautam Das, Dimitrios Gunopulos, and Heikki Mannila. Finding Similar Time Series. In Proceedings of the Conference on Principles of Knowledge Discovery and Data Mining, 1997.
    Keywords: Sampling, Noise Handling, Similarity Measures, Sequential/Temporal Data.
    Abstract: Similarity of objects is one of the crucial concepts in several applications, including data mining. For complex objects, similarity is nontrivial to define. In this paper we present an intuitive model for measuring the similarity between two time series. The model accounts for outliers, different scaling functions, and variable sampling rates. Using methods from computational geometry, we show that this notion of similarity can be computed in polynomial time. Using statistical approximation techniques, the algorithms can be sped up considerably. We give preliminary experimental results that show the naturalness of the notion.
  • Rajesh N. Davé and Sumit Sen. On Generalizing the Noise Clustering Algorithms. In Proceedings of the 7th Fuzzy Systems Association World Congress (IFSA'97), volume III, pages 205--210, 1997.
    Keywords: Noise Handling, Clustering, Fuzzy c-Means.
    Abstract: In this paper, Davé's noise clustering (NC) algorithm is revisited. The original NC algorithm considered noise to be a separate class, and represented it by a prototype that has the same distance $\delta$ from all the data points. Although this concept has been successful in developing a class of NC algorithms to detect a variety of cluster shapes in noisy data, use of the same constant value of the noise distance $\delta$ for all the feature vectors in the data set makes it somewhat limited in its scope. By allowing $\delta$ to take different values for different feature vectors, the algorithm can be generalized. It can also be shown that the membership generated by the NC algorithm is a product of two terms: one is the original fuzzy c-means (FCM) membership, and the other is a generalized possibilistic membership. In this light, it is shown that the NC technique is a generalization of the possibilistic technique.
  • David J. Hand. Intelligent Data Analysis: Issues and Opportunities. In IDA97, number 1280 of LNCS, pages 1--14, 1997.
    Abstract: [has no abstract] Reflections on the term (un)intelligent data analysis, the change of data, problems, and models over time, and the role of humans and computers in solving the problems.
  • T. Honkela, S. Kaski, K. Lagus, and T. Kohonen. WEBSOM -- Self-Organizing Maps of Document Collections. In Proc. of Workshop on Self-Organizing Maps (WSOM), June 4-6, pages 310--315, 1997. [ URL ]
    Abstract: Searching for relevant text documents has traditionally been based on keywords and Boolean expressions of them. Often the search results show high recall and low precision, or vice versa. Considerable efforts have been made to develop alternative methods, but their practical applicability has been low. Powerful methods are needed for the exploration of miscellaneous document collections. The WEBSOM method organizes a document collection on a map display that provides an overview of the collection and facilitates interactive browsing. Interesting documents can be retrieved by a content addressable search of interesting map locations. The interesting locations could also be marked as filters for collecting interesting new documents.
  • Eamonn J. Keogh and Padhraic Smyth. A probabilistic approach to fast pattern matching in time series databases. In KDD97, pages 20--24, 1997.
    Keywords: Piecewise Linear Representations, Similarity Measures, Sequential/Temporal Data.
    Abstract: The problem of efficiently and accurately locating patterns of interest in massive time series data sets is an important and non-trivial problem in a wide variety of applications, including diagnosis and monitoring of complex systems, biomedical data analysis, and exploratory data analysis in scientific and business time series. In this paper a probabilistic approach is taken to this problem. Using piecewise linear segmentations as the underlying representation, local features (such as peaks, troughs, and plateaus) are defined using a prior distribution on expected deformations from a basic template. Global shape information is represented using another prior on the relative locations of the individual features. An appropriately defined probabilistic model integrates the local and global information and directly leads to an overall distance measure between sequence patterns based on prior knowledge. A search algorithm using this distance measure is shown to efficiently and accurately find matches for a variety of patterns on a number of data sets, including engineering sensor data from space shuttle mission archives. The proposed approach provides a natural framework to support user-customizable "query by content" on time series data, taking prior domain information into account in a principled manner.
  • Eamonn J. Keogh. A Fast and Robust Method for Pattern Matching in Time Series Databases. In Proceedings of 9th Int. Conf. on Tools with AI (TAI 97), 1997.
    Keywords: Noise Handling, Piecewise Linear Representations, Similarity Measures, Sequential/Temporal Data.
    Abstract: The problem of finding patterns of interest in time series databases (query by content) is an important one, with applications in virtually every field of science. A variety of approaches have been suggested. These approaches are robust to noise, offset translation, and amplitude scaling to varying degrees. However, they are all extremely sensitive to scaling in the time axis (longitudinal scaling). We present a method for similarity search that is robust to scaling in the time axis, in addition to noise, offset translation, and amplitude scaling. The method has been tested on medical, financial, space telemetry and artificial data. Furthermore, the method is exceptionally fast, with the predicted 2 to 4 orders of magnitude speedup actually observed. The method uses a piecewise linear representation of the original data. We also introduce a new algorithm which both decides the optimal number of linear segments to use, and produces the actual linear representation.
  • Paul R. Kersten. Implementation Issues in the Fuzzy c-Medians Algorithm. In Proceedings of the 6th International Conference on Fuzzy Systems, Barcelona, Spain, pages 957--962, 1997.
    Keywords: Noise Handling, Clustering, Fuzzy c-Means.
    Abstract: The fuzzy c-Median (FcMED) clustering algorithm is an alternating optimization (AO) method of solving the fuzzy c-Means (FCM) clustering algorithm using the $L_1$ norm. This algorithm is more resistant to outliers than the FCM-AO algorithm using the $L_2$ norm. The robustness of the FcMED does not come for free, since the fuzzy median is the cluster-centering statistic and exact evaluation of the fuzzy median usually involves ordering the sample values. The efficiency of calculating the fuzzy median is an important implementation issue. Two other evaluation methods are considered for the fuzzy median: The first is the remedian, which statisticians use to simplify the estimation of the median. A fuzzy remedian is defined and used to approximate the fuzzy median. The second method finds the root of the derivative of the functional equation defining the fuzzy median. Both approaches are described and illustrated in this paper.
  • Frank Klawonn and Erich-Peter Klement. Mathematical Analysis of Fuzzy Classifiers. In X. Liu, P. Cohen, and M. Berthold, editors, IDA97, Berlin, pages 359--370, 1997.
    Keywords: Classification.
    Abstract: We examine the principal capabilities and limits of fuzzy classifiers that are based on a finite set of fuzzy if-then rules like those used for fuzzy controllers, except that the conclusion of a rule specifies a discrete class instead of a (fuzzy) real output value. Our results show that in the two-dimensional case, for classification problems that can only be solved approximately by crisp classification rules, very simple fuzzy rules provide an exact solution. However, in the multi-dimensional case, even for linearly separable problems, max-min rules are not sufficient.
  • R. Lowen and W. Peeters. On various classes of semi-pseudometrics used in pattern recognition. In IFSA97, Prague, pages 232--237, 1997.
    Abstract: A general framework of using semi-pseudometrics instead of pseudometrics for comparing distances between fuzzy sets [...] will be described. We will consider various kinds of semi-pseudometrics, depending on a range of parameters and external conditions, and summarize the relationships between them.
  • R. J. Miller and Y. Yang. Association Rules over Interval Data. In MD97, Tucson, Arizona, USA, pages 452--461, 1997.
    Keywords: Association Rules.
    Abstract: We consider the problem of mining association rules over interval data (that is, ordered data for which the separation between data points has meaning). We show that the measures of what rules are most important (also called rule interest) that are used for mining nominal and ordinal data do not capture the semantics of interval data. In the presence of interval data, support and confidence are no longer intuitive measures of the interest of a rule. We propose a new definition of interest for association rules that takes into account the semantics of interval data. We developed an algorithm for mining association rules under the new definition and overview our experience using the algorithm on large real-life datasets.
  • Detlef Nauck and Rudolf Kruse. Neuro-Fuzzy Systems for Function Approximation. In 4th International Workshop Fuzzy-Neuro Systems 97, Soest, 1997.
    Keywords: Classification.
    Abstract: We propose a neuro-fuzzy architecture for function approximation based on supervised learning. The learning algorithm is able to determine the structure and the parameters of a fuzzy system. The approach is an extension to our already published NEFCON and NEFCLASS models which are used for control or classification purposes. The proposed extended model, which we call NEFPROX, is more general and can be used for any application based on function approximation.
  • Nikhil R. Pal, Kuhu Pal, and James C. Bezdek. A Mixed c-Means Clustering Model. In FUZZIEEE97, pages 11--21, 1997.
    Keywords: Noise Handling, Clustering, Fuzzy c-Means.
    Abstract: We justify the need for computing both membership and typicality values when clustering unlabeled data. Then we propose a new model called the fuzzy-possibilistic c-means (FPCM) model. FPCM simultaneously produces both memberships and possibilities, along with the usual point prototypes or cluster centers for each cluster. We show that FPCM solves the noise sensitivity defect of FCM, and also overcomes the coincident clusters problem of PCM. Then we derive first order necessary conditions for extrema of the FPCM objective function, and use them as the basis for a standard alternating optimization approach to finding local minima. Three numerical examples are given that compare FCM to FPCM. Our calculations show that FPCM compares favorably to FCM.
  • Lars Pickert, Frank Klawonn, and Edgar Wingender. Fuzzy Cluster Analysis for Identification of Gene Regulation Regions. In IFSA97, Academia, Prague, pages 56--61, 1997.
    Keywords: Clustering, Cluster Validity Measures, Fuzzy Clustering, Fuzzy c-Means.
    Abstract: The main approach of this work is the implementation of a cluster analysis program for identification of regulatory regions in genomes. These regions are important parts of the genetic pool in higher developed organisms. They are composed of several basic elements, so-called transcription factor sites, which can be identified by special analysis tools more or less vaguely. The program we have developed is able to search for two-dimensional clusters in the results of such analysis tools to give hints on gene regulatory regions. For this purpose, two fuzzy clustering algorithms have been implemented: the fuzzy c-means (FCM) and the Gath and Geva fuzzy clustering algorithm (GG), with two conventional cluster validity methods and one which has been developed especially for this application. All results of the cluster analysis program can be visualized and documented automatically.
  • Sani Susanto, R. D. Kennedy, and J. W. H. Price. A preliminary study of a fuzzy clustering and assignment problem-based cell formation algorithm. In Proc. of Int. Conf. on Manufacturing Automation, Hong Kong, pages 95--104, 1997. [ PDF ]
    Keywords: Clustering, Fuzzy Clustering.

Internal reports

  • Heikki Mannila, Hannu Toivonen, and A. Inkeri Verkamo. Discovery of frequent episodes in event sequences. Technical report 15, 1997.
    Keywords: Sequential/Temporal Data.
    Abstract: Sequences of events describing the behaviour and actions of users or systems can be collected in several domains. We consider the problem of discovering frequently occurring episodes in such sequences. An episode is defined to be a collection of events that occur relatively close to each other in a given partial order. Once such episodes are known, one can produce rules for describing or predicting the behaviour of the sequence. We give efficient algorithms for the discovery of all frequent episodes from a given class of episodes, and present extensive experimental results. The methods are in use in telecommunication alarm management.

Disclaimer

This list of publications is neither official nor complete, but a personal compilation.

Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright.

This document was translated from BibTEX by bibtex2html

© F. Höppner, last update: Tue Dec 7 08:49:56 CET 2004