Session Program


  • 12 July 2017
  • 02:30PM - 04:30PM
  • Room: Auditorium
  • Chair: Francesco Marcelloni

Fuzzy Data Science I

Abstract - Data of a sensory profile represent human evaluations of, and feelings about, the intensities of several criteria used to describe and compare different products (e.g., the criterion "odor of red fruits" for a red wine). This article concerns the representation and querying of such data. It is shown that sensory data are intrinsically imprecise because of this human evaluation (possibilistic data), especially for untrained assessors. Their treatment can take advantage of querying with user preferences (flexible querying with fuzzy predicates). The classical approach to evaluating a fuzzy predicate on a possibility distribution is based on the possibility and necessity measures of a fuzzy event, and it is shown that this approach may not be convenient. A new expression for the evaluation of a fuzzy predicate on a possibility distribution is then introduced. More complex flexible queries on possibilistic data are defined, and methods to rank the answers are also proposed.
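The classical evaluation the abstract refers to can be sketched as follows; this is the standard textbook possibility/necessity of a fuzzy event, not the paper's new expression, and the predicate, rating, and domain below are purely illustrative assumptions.

```python
# Sketch of the classical evaluation of a fuzzy predicate on a
# possibility distribution: Pi(F) = sup_x min(mu_F(x), pi(x)),
# N(F) = inf_x max(mu_F(x), 1 - pi(x)).

def possibility(mu_pred, pi, domain):
    """Possibility of fuzzy event F given distribution pi."""
    return max(min(mu_pred(x), pi(x)) for x in domain)

def necessity(mu_pred, pi, domain):
    """Necessity of fuzzy event F given distribution pi."""
    return min(max(mu_pred(x), 1 - pi(x)) for x in domain)

# Illustrative example: an intensity scale 0..10, a fuzzy predicate
# "high", and an imprecise (possibilistic) sensory rating around 6.
domain = [i / 2 for i in range(21)]                 # 0.0, 0.5, ..., 10.0
high = lambda x: min(1.0, max(0.0, (x - 4) / 3))    # 0 below 4, 1 above 7
rating = lambda x: max(0.0, 1 - abs(x - 6) / 2)     # triangular around 6

print(possibility(high, rating, domain))   # -> 0.75
print(necessity(high, rating, domain))     # -> 0.5
```

The gap between the two measures (0.75 vs. 0.5) reflects the imprecision of the rating itself, which is one motivation for studying how fuzzy predicates behave on possibilistic data.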
Abstract - An effective way to implement a fuzzy database is on top of a classical Relational Database Management System (RDBMS). In this sense, we have proposed a Fuzzy Object-Relational Database Management System (FORDBMS) built on top of the Oracle RDBMS. To enhance the performance of queries based on possibility, we have carried out a study to adapt the indexing techniques available in classical RDBMSs to fuzzy retrieval. This paper presents the implementation of the best of these indexing techniques on our FORDBMS and evaluates and compares their performance. The results show that the best of these techniques improve query execution time by several orders of magnitude with respect to sequential retrieval.
Abstract - The Fuzzy Extreme Learning Machine (F-ELM) constructs a fuzzy neural network by embedding fuzzy membership functions and rules into the hidden layer of an extreme learning machine (ELM); that is, it can be interpreted as a fuzzy system with the structure of a neural network. Although F-ELM learns its model parameters quickly, it is not robust on small and noisy datasets, since the parameters connecting the hidden layer to the output layer are optimized by least squares (LS). To overcome this challenge, a Ridge Regression based Extreme Learning Fuzzy System (RR-EL-FS) is presented in this study, which introduces ridge regression into F-ELM to enhance robustness. The experimental results validate that RR-EL-FS outperforms F-ELM and some related methods on small and noisy datasets.
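The difference between the two estimators can be illustrated in a few lines; this is a generic sketch of LS vs. ridge output-weight solving in an ELM-style model, not the authors' code, and the matrix sizes, noise level, and regularization strength `lam` are all assumptions.

```python
import numpy as np

# H plays the role of the hidden-layer output matrix, y the targets.
rng = np.random.default_rng(0)
H = rng.normal(size=(50, 20))                    # 50 samples, 20 hidden units
beta_true = rng.normal(size=20)
y = H @ beta_true + 0.1 * rng.normal(size=50)    # noisy targets

# Least squares (as in F-ELM): beta = pinv(H) y
beta_ls = np.linalg.pinv(H) @ y

# Ridge regression (as in RR-EL-FS): beta = (H^T H + lam I)^{-1} H^T y
lam = 1.0
beta_ridge = np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)
```

The ridge term `lam * np.eye(...)` shrinks the weight vector, which is what makes the solution less sensitive to noise in `y` when the sample is small.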
Abstract - A large number of applications manage uncertain data. Usually, users expect high-quality results when they pose queries with strict conditions over these data. However, as they may not be familiar with the contents of the databases that contain such data, these queries may fail, i.e., they may return no result or results that do not satisfy the expected degree of certainty. In this paper, we address this problem by proposing an approach that identifies the parts of the failing query, called Minimal Failing Subqueries (MFSs), that are responsible for its failure. Our approach also computes, at the same time, a set of Maximal Succeeding Subqueries (XSSs), which represent non-failing queries with a maximal number of predicates of the initial query. We demonstrate the impact of our proposal with a set of experiments on synthetic and real datasets.
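The notions of MFS and XSS can be made concrete with a toy brute-force enumeration over the subquery lattice; this is not the paper's algorithm (which avoids exhaustive enumeration), and the predicates and the failure oracle `fails` below are illustrative assumptions.

```python
from itertools import combinations

def mfs_xss(predicates, fails):
    """Enumerate Minimal Failing Subqueries (MFSs) and Maximal
    Succeeding Subqueries (XSSs) of a conjunctive query.
    `fails(S)` is an assumed oracle: True if subquery S returns no result."""
    subs = [frozenset(c) for r in range(1, len(predicates) + 1)
            for c in combinations(predicates, r)]
    failing = {s for s in subs if fails(s)}
    succeeding = set(subs) - failing
    # MFS: failing subqueries with no proper failing subset.
    mfs = {s for s in failing if not any(t < s for t in failing)}
    # XSS: succeeding subqueries with no proper succeeding superset.
    xss = {s for s in succeeding if not any(s < t for t in succeeding)}
    return mfs, xss

# Example: four predicates; assume a subquery fails when it contains
# p3, or both p1 and p2 (an illustrative failure pattern).
preds = ["p1", "p2", "p3", "p4"]
fails = lambda s: {"p1", "p2"} <= s or "p3" in s
mfs, xss = mfs_xss(preds, fails)
```

Here the MFSs are {p3} and {p1, p2}, pinpointing the causes of failure, while the XSSs {p1, p4} and {p2, p4} are the largest relaxations of the query that still succeed.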
Abstract - The aggregation of multiple information sources has a long history and ranges from sensor fusion to the aggregation of individual algorithm outputs and human knowledge. A popular approach to such aggregation is the fuzzy integral (FI), which is defined with respect to a fuzzy measure (FM) (i.e., a normal, monotone capacity). In practice, the discrete FI aggregates the information contributed by a discrete number of sources through a weighted aggregation (post-sorting), where the weights are captured by an FM that models the typically subjective 'worth' of subsets of the overall set of sources. While the combination of FI and FM has been very successful, challenges remain, both with regard to the behavior of the resulting aggregation operators (which, for example, do not produce symmetrically mirrored outputs for symmetrically mirrored inputs) and in a manifest difference between the intuitive interpretation of a stand-alone FM and its actual role and impact when used as part of information fusion with an FI. This paper elucidates these challenges and introduces a novel family of recursive average (RAV) operators as an alternative to the FI for aggregation with respect to an FM, focusing specifically on the arithmetic recursive average. The RAV is designed to address the above challenges while also facilitating fine-grained analysis of the resulting aggregation of different combinations of sources. We provide the mathematical foundations of the RAV and include initial experiments and comparisons with the FI for both numeric and interval-valued data.
Abstract - Clustering is an unsupervised classification method. Typical clustering methods are constructed by optimizing a given objective function, and many are formulated as optimization problems with typical objective functions and constraints. The objective function itself also serves as a guideline for evaluating clustering results. Together with its theoretical extensibility, this makes the optimization framework a highly advantageous setting in which to construct clustering methods. From this viewpoint, some of the authors proposed an Even-sized Clustering method Based on Optimization (ECBO), which imposes tight constraints on cluster sizes, and constructed several variations of it. ECBO is considered to have advantages over similar methods in terms of clustering accuracy, cluster size, and its optimization framework. However, its cluster-size constraint is strict, which may be inconvenient when some margin in cluster size is allowed. Moreover, clustering algorithms in which each cluster size can be controlled are expected to handle a wider variety of datasets. From this viewpoint, we proposed two new clustering algorithms based on ECBO: COntrolled-sized Clustering Based on Optimization (COCBO) and an extension referred to as COntrolled-sized Clustering Based on Optimization++ (COCBO++). Each cluster size can be controlled in these algorithms. However, they still have some problems. In this paper, we describe various types of COCBO that solve these problems and evaluate the methods on several numerical examples.