National Repository of Grey Literature: 23 records found, showing records 1 - 10
Prediction of inpatient mortality for patients with myocardial infarction
Kratochvíl, Václav ; Kružík, H. ; Tůma, P. ; Vomlel, Jiří ; Somol, Petr
The topic of this paper is the standardization of inpatient mortality for patients with myocardial infarction, based on discovered correlations between risk factors and mortality.
Conditioning and Flexibility in Compositional Models
Kratochvíl, Václav
Reasoning by cases or assumptions is a common form of human reasoning. In probabilistic reasoning, it is modeled by conditioning a multidimensional probability distribution. Compositional models are defined as multidimensional distributions assembled from a (so-called generating) sequence of low-dimensional probability distributions with the help of operators of composition. In this case, the conditioning process can be viewed as a transformation of one generating sequence into another. It appears that conditioning is simple when the conditioning variable appears among the arguments of the first distribution of the corresponding generating sequence. That is why we introduce so-called flexible sequences: sequences that can be reordered in many ways so that each variable can appear among the arguments of the first distribution. In this paper, we study the problem of flexibility in light of the very recent solution of the equivalence problem.
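The claim that conditioning is simple when the conditioning variable sits in the first distribution can be checked on a two-element generating sequence. The sketch below assumes the usual definition of the operator of right composition, (π ▷ κ)(x) = π(x_K) · κ(x_L) / κ(x_{K∩L}); the function names, the dictionary representation of distributions, and the numbers are illustrative assumptions, not taken from the paper.

```python
def marginal(dist, variables, keep):
    """Marginalize a table {state_tuple: prob} onto the variables in `keep`."""
    idx = [variables.index(v) for v in keep]
    out = {}
    for states, p in dist.items():
        key = tuple(states[i] for i in idx)
        out[key] = out.get(key, 0.0) + p
    return out


def compose(pi, pi_vars, kappa, kappa_vars):
    """Right composition pi |> kappa = pi * kappa / kappa(shared variables)."""
    shared = [v for v in pi_vars if v in kappa_vars]
    extra_vars = [v for v in kappa_vars if v not in pi_vars]
    kappa_shared = marginal(kappa, kappa_vars, shared)
    result = {}
    for x, p in pi.items():
        for y, q in kappa.items():
            # Only combine states that agree on the shared variables.
            if any(x[pi_vars.index(v)] != y[kappa_vars.index(v)] for v in shared):
                continue
            denom = kappa_shared[tuple(y[kappa_vars.index(v)] for v in shared)]
            if denom > 0:
                extra = tuple(y[kappa_vars.index(v)] for v in extra_vars)
                result[x + extra] = p * q / denom
    return result, pi_vars + extra_vars


def condition(dist, variables, var, value):
    """Condition on var == value; the variable stays in the table (degenerate)."""
    i = variables.index(var)
    kept = {s: p for s, p in dist.items() if s[i] == value}
    z = sum(kept.values())
    return {s: p / z for s, p in kept.items()}


# Illustrative low-dimensional distributions (made-up numbers).
pi = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}        # over A, B
kappa = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.1, (1, 1): 0.4}   # over B, C

# Route 1: build the full model first, then condition it on A = 1.
model, model_vars = compose(pi, ["A", "B"], kappa, ["B", "C"])
conditioned_model = condition(model, model_vars, "A", 1)

# Route 2: condition only the first distribution, then compose.
conditioned_first, _ = compose(condition(pi, ["A", "B"], "A", 1),
                               ["A", "B"], kappa, ["B", "C"])
```

In this small case both routes give the same table, because every distribution after the first enters the model only as a conditional factor. Whether a given sequence can be reordered to bring a chosen variable into the first distribution is the flexibility question the abstract raises, and it is not addressed by this sketch.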
Feature Selection - A Very Compact Survey Over the Diversity of Existing Approaches
Somol, Petr ; Novovičová, Jana ; Pudil, Pavel ; Kittler, J.
Feature Selection has been a subject of extensive research that nowadays extends far beyond the boundaries of statistical pattern recognition. We provide a concise yet wide view of the topic including representative references in an attempt to point out that important results can be easily overlooked or duplicated in a variety of – even indirectly related – research fields.
Fast Dependency-Aware Feature Selection in Very-High-Dimensional Pattern Recognition Problems
Somol, Petr ; Grim, Jiří
The paper addresses the problem of making dependency-aware feature selection feasible in pattern recognition problems of very high dimensionality. The idea of individually best ranking is generalized to evaluate the contextual quality of each feature in a series of randomly generated feature subsets. Each random subset is evaluated by a criterion function of arbitrary choice (permitting functions of high complexity). Eventually, the novel dependency-aware feature rank is computed, expressing the average benefit of including a feature in feature subsets. The method is efficient and generalizes well, especially in very-high-dimensional problems, where traditional context-aware feature selection methods fail due to prohibitive computational complexity or to over-fitting. The method is shown to be capable of outperforming the commonly applied individual ranking, which ignores important contextual information contained in the data.
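One possible reading of the ranking described above is sketched here: each feature is scored by the mean criterion value over random subsets that contain it minus the mean over subsets that do not. The function name, the uniform choice of subset size, and the toy criterion are assumptions made for illustration; the paper's exact sampling scheme and criterion may differ.

```python
import random


def dependency_aware_rank(n_features, criterion, n_subsets=2000, seed=0):
    """Score each feature by mean criterion with it minus mean criterion without it."""
    rng = random.Random(seed)
    sum_in = [0.0] * n_features
    cnt_in = [0] * n_features
    sum_out = [0.0] * n_features
    cnt_out = [0] * n_features
    for _ in range(n_subsets):
        # Assumption: subset size drawn uniformly, never empty and never full.
        size = rng.randint(1, n_features - 1)
        subset = frozenset(rng.sample(range(n_features), size))
        value = criterion(subset)
        for f in range(n_features):
            if f in subset:
                sum_in[f] += value
                cnt_in[f] += 1
            else:
                sum_out[f] += value
                cnt_out[f] += 1
    scores = []
    for f in range(n_features):
        mean_in = sum_in[f] / cnt_in[f] if cnt_in[f] else 0.0
        mean_out = sum_out[f] / cnt_out[f] if cnt_out[f] else 0.0
        scores.append(mean_in - mean_out)
    return scores


def toy_criterion(subset):
    """Hypothetical criterion: features 0 and 1 help only as a pair, feature 2 alone."""
    value = 0.0
    if 0 in subset and 1 in subset:
        value += 1.0
    if 2 in subset:
        value += 0.5
    return value


scores = dependency_aware_rank(6, toy_criterion)
ranking = sorted(range(6), key=lambda f: scores[f], reverse=True)
```

With this toy criterion, features 0 and 1 each score zero when evaluated alone, so individual ranking cannot separate them from the irrelevant features 3 to 5, while the contextual score places them well above those features.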
Introduction to Feature Selection Toolbox 3 – The C++ Library for Subset Search, Data Modeling and Classification
Somol, Petr ; Vácha, Pavel ; Mikeš, Stanislav ; Hora, Jan ; Pudil, Pavel ; Žid, Pavel
We introduce a new standalone, widely applicable software library for feature selection (also known as attribute or variable selection), capable of reducing problem dimensionality to maximize the accuracy of data models and the performance of automatic decision rules, as well as to reduce data acquisition cost. The library can be exploited by users in research as well as in industry. Less experienced users can experiment with the different provided methods and their application to real-life problems; experts can implement their own criteria or search schemes, taking advantage of the toolbox framework. In this paper we first provide a concise survey of a variety of existing feature selection approaches. Then we focus on a selected group of methods of good general performance as well as on tools surpassing the limits of existing libraries. We build a feature selection framework around them and design an object-based generic software library. We describe the key design points and properties of the library.
Sequential Retreating Search Methods in Feature Selection
Somol, Petr ; Pudil, Pavel
Inspired by Floating Search, our new pair of methods, the Sequential Forward Retreating Search (SFRS) and Sequential Backward Retreating Search (SBRS), is exceptionally suitable for Wrapper-based feature selection. (Conversely, the methods cannot be used with monotonic criteria.) Unlike most other known sub-optimal search methods, both SFRS and SBRS are parameter-free deterministic sequential procedures that incorporate in the optimization process both the search for the best subset and the determination of the best subset size. The subset yielded by either of the two new methods is expected to be closer to the optimum than the best of all subsets yielded in one run of the Floating Search. Retreating Search time complexity is expected to be slightly worse than, but of the same order of magnitude as, that of the Floating Search. In addition to introducing the new methods, we provide a testing framework to evaluate them with respect to other existing tools.
The Problem of Missing Data in the Census - Which Respondents Did Not Answer [Problém chybějících dat při sčítání lidu - kteří respondenti neodpověděli]
Hora, Jan
The confidentiality of census data is known to be rather restrictive for economic and social research. To improve the availability of census information, we recently proposed a new method of interactive presentation of census results by means of statistical models. The method is based on estimation of the joint probability distribution of data records in the form of a distribution mixture. The estimated mixture model can be used as the knowledge base of a probabilistic expert system, and in this way we can derive statistical information from the distribution mixture without any further access to the original database. The statistical model does not contain the original data, and therefore the final interactive software product can be made freely available via the internet without any confidentiality concerns.
Motivation for Different Characterizations of Equivalent Persegrams [Motivace různých charakterizací ekvivalentních persegramů]
Kratochvíl, Václav
In this paper we give the motivation for and an introduction to the indirect characterization of equivalence. We have found three operations on a persegram that preserve the induced independence model. By combining them, one can generate a class of equivalent models. We are not sure whether one can generate the whole class. This problem is closely connected with the above-mentioned problem of invariant properties.
Application of Bayesian Networks in the Game of Minesweeper [Aplikace bayesovských sítích ve hře Minesweepe]
Vomlelová, M. ; Vomlel, Jiří
We use the computer game of Minesweeper to illustrate a few modeling tricks utilized when applying Bayesian network (BN) models in real applications. Among others, we apply rank-one decomposition (ROD) to conditional probability tables (CPTs) representing addition. Typically, this transformation helps to reduce the computational complexity of probabilistic inference with the BN model. However, in this paper we will see that (except for the total sum node) when ROD is applied to the whole CPT it does not bring any savings for the BN model of Minesweeper. Actually, in order to gain from ROD we need minimal rank-one decompositions of CPTs when the state of the dependent variable is observed. But this is not known, and it is a topic for our future research.
Experimental Comparison of Triangulation Heuristics on Transformed BN2O Networks [Experimentální srovnání triangulačních heuristik na transformovaných sítích BN2O]
Vomlel, Jiří ; Savický, Petr
In this paper we present the results of experimental comparisons of several triangulation heuristics on bipartite graphs. Our motivation for testing heuristics on the family of bipartite graphs is the rank-one decomposition of BN2O networks. A BN2O network is a Bayesian network having the structure of a bipartite graph with all edges directed from the top level toward the bottom level and where all conditional probability tables are noisy-or gates. After applying the rank-one decomposition, which adds an extra level of auxiliary nodes between the top and bottom levels, and after removing simplicial nodes of the bottom level, we get the so-called BROD graph. This is an undirected bipartite graph. For efficient inference, it is desirable to find a triangulation of the BROD graph in which the sum of table sizes over all cliques of the triangulated graph is as small as possible. From this point of view, the minfill heuristic performs on average better than the other tested heuristics (minwidth, h1, and mcs).
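The objective named above, the sum of table sizes over the cliques of the triangulated graph, can be illustrated with a minimal sketch of the standard min-fill elimination heuristic. The function name, the alphabetical tie-breaking, and the four-node example graph are assumptions for illustration; this is not the experimental setup of the paper, and the example is a plain bipartite cycle, not an actual BROD graph.

```python
from itertools import combinations
from math import prod


def min_fill_triangulate(adjacency, n_states):
    """Greedy min-fill elimination; returns fill-in edges, maximal cliques, total table size."""
    adj = {v: set(nbrs) for v, nbrs in adjacency.items()}
    fill_edges = []
    cliques = []

    def missing_edges(v):
        # Pairs of neighbours of v that are not yet connected to each other.
        return [(a, b) for a, b in combinations(sorted(adj[v]), 2) if b not in adj[a]]

    while adj:
        # Eliminate the node needing the fewest fill-in edges (ties: alphabetical order).
        v = min(sorted(adj), key=lambda u: len(missing_edges(u)))
        for a, b in missing_edges(v):
            adj[a].add(b)
            adj[b].add(a)
            fill_edges.append((a, b))
        cliques.append(frozenset(adj[v] | {v}))
        for u in adj[v]:
            adj[u].discard(v)
        del adj[v]

    # Keep only maximal cliques, then sum the sizes of their probability tables.
    maximal = [c for c in cliques if not any(c < other for other in cliques)]
    total = sum(prod(n_states[v] for v in c) for c in maximal)
    return fill_edges, maximal, total


# Bipartite 4-cycle a-b-c-d-a with parts {a, c} and {b, d}; all variables binary.
graph = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
states = {v: 2 for v in graph}
fill_edges, maximal_cliques, total_table_size = min_fill_triangulate(graph, states)
```

On this example one fill-in edge between b and d is added, giving the two cliques {a, b, d} and {b, c, d} with a total table size of 8 + 8 = 16. Other heuristics differ only in the key used to pick the next node to eliminate.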
