Data mining tasks

A very important data mining task is the discovery of characteristic descriptions for subsets of data: descriptions that characterize the members of a subset and distinguish it from other subsets. Descriptions can, for example, be the output of statistical methods such as the average or the variance. [Pg.474]
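As a minimal sketch of such a statistical description, the following Python snippet groups records into subsets and reports the average and variance of one property per subset. The function name `describe_subsets`, the field names, and the numeric values are all hypothetical and chosen only for illustration; they are not measurements and do not come from the excerpt.

```python
from statistics import mean, pvariance


def describe_subsets(records, key, value):
    """Group records by `key` and summarize `value` within each group."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[key], []).append(rec[value])
    # Population variance is used here; sample variance would be an
    # equally reasonable choice for a description.
    return {
        label: {"mean": mean(vals), "variance": pvariance(vals)}
        for label, vals in groups.items()
    }


# Hypothetical data set: a class label and one numeric property.
compounds = [
    {"cls": "alcohol", "prop": 60.0},
    {"cls": "alcohol", "prop": 80.0},
    {"cls": "alkane", "prop": 30.0},
    {"cls": "alkane", "prop": 70.0},
]

summary = describe_subsets(compounds, "cls", "prop")
for label, stats in summary.items():
    print(label, stats)
```

The per-subset mean and variance together form a simple description that can be compared across subsets.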

The paper is organized as follows. The next section introduces data mining tasks and models, followed by a quick tour of some theoretical results. Next, a review of the recent advances is presented, followed by challenges and a summary. [Pg.31]

Predictive modeling is another data mining task that is addressed by statistical methods. The most common type of predictive model used in statistics is linear regression, where we describe one variable as a linear combination of other known variables. A number of other tasks that involve the analysis of several variables for various purposes are categorized by statisticians under the umbrella term multivariate analysis. [Pg.85]
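The simplest case of linear regression, with a single known variable and an intercept, can be sketched in plain Python as ordinary least squares. The function name `fit_line` and the data are assumptions made for illustration; the general case with several known variables needs a linear solver and is not shown here.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
predicted = slope * 5.0 + intercept  # prediction for an unseen x
print(slope, intercept, predicted)
```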

The input file or data set formats that Tanagra accepts for performing different data mining tasks are .txt, .arff, and .xls; the sparse formats include .dat and .data. [Pg.153]

Preprocessing. The main goals of spectral preprocessing can be summarized as follows: (1) improvement of the robustness and accuracy of subsequent classification analysis, (2) improved interpretability, (3) detection and removal of outliers and trends, and (4) reduction of the dimensionality of subsequent data-mining tasks. This step often involves the removal of irrelevant and/or redundant information by feature selection (Lasch 2012). [Pg.207]
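One very simple form of feature selection is a variance filter, which drops spectral channels that barely change across samples. The sketch below is an illustration of that idea only; it is not the procedure of the cited reference, and the function names, the threshold, and the toy spectra are assumptions.

```python
from statistics import pvariance


def select_informative_features(spectra, min_variance=1e-6):
    """Return indices of channels whose variance across samples exceeds the threshold."""
    n_features = len(spectra[0])
    keep = []
    for j in range(n_features):
        column = [row[j] for row in spectra]
        if pvariance(column) > min_variance:
            keep.append(j)
    return keep


def keep_columns(spectra, keep):
    """Build the reduced data set containing only the selected channels."""
    return [[row[j] for j in keep] for row in spectra]


# Hypothetical spectra: three samples, three channels.
# Channel 0 is constant and therefore carries no discriminating information.
spectra = [
    [0.10, 0.50, 0.90],
    [0.10, 0.70, 0.40],
    [0.10, 0.20, 0.60],
]

keep = select_informative_features(spectra)
reduced = keep_columns(spectra, keep)
print(keep, reduced)
```

A variance filter only removes uninformative channels; it does not detect redundancy between correlated channels, which would need a different criterion.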

It extends the usage of statistical methods and combines them with machine learning methods and the application of expert systems. The visualization of the results of data mining is an important task, as it facilitates the interpretation of the results. Figure 9-32 plots the different disciplines that contribute to data mining. [Pg.472]

This section will therefore focus on the aims and tasks of data mining and refer to the methods where applicable. A thorough description of data mining is given in Ref. [20]. [Pg.472]

Data mining can fulfill various tasks such as classification, clustering and similarity detection, prediction, estimation, or description retrieval, which are described in Sections 9.8.1-9.8.5. [Pg.472]

A most important task in the handling of molecular data is the evaluation of "hidden" information in large chemical data sets. One of the differences between data mining techniques and conventional database queries is the generation of new data that are used subsequently to characterize molecular features in a more general way. Generally, it is not possible to hold all the potentially important information in a data set of chemical structures. Thus, the extraction of relevant information and the production of reliable secondary information are important topics. [Pg.515]

A variety of methods have been developed by mathematicians and computer scientists to address this task, which has become known as data mining (see Chapter 9, Section 9.8). Fayyad defined and described the term data mining as the nontrivial extraction of implicit, previously unknown and potentially useful information from data, or the search for relationships and global patterns that exist in databases [16]. In order to extract information from huge quantities of data and to gain knowledge from this information, the analysis and exploration have to be performed by automatic or semi-automatic methods. Methods applicable for data analysis are presented in Chapter 9. [Pg.603]

The user interface for the operators was based on the formalized ontology for the production process, as developed so far. The various tasks, interventions, error categories and error details, and the available countermeasures were all retrieved from this ontology. This made it possible to record the user activities during production and to store these data with their appropriate context, enabling context-sensitive data mining across all available data sources. To-... [Pg.684]

Data mining can be defined as a process of exploration of large amounts of data in search of consistent patterns, correlations, and other systematic relationships between queries and database objects. The tasks of a data mining engine can be divided into the following classes. [Pg.336]

If statistical methods fail to solve a chemical problem, artificial neural networks can be used for analyzing especially nonlinear and complex relationships between descriptors. The important tasks for neural networks in data mining are as follows ... [Pg.337]

It is convenient to categorize data mining into types of tasks corresponding to the different objectives. The categorization below is not unique and underlines only the most dominant tasks encountered in drug discovery applications. [Pg.677]

In comparative data mining, one distinguishes between overlay analysis and the retrieval of patterns from one or several data sets given a set of patterns of interest. The latter task is also known as retrieval by content [8]. Typical manifestations of comparative data mining in drug discovery are, for example, similarity analysis of chemical compounds [33] or the comparison of large chemical libraries [34]. ... [Pg.679]
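Retrieval by content over chemical compounds can be sketched as ranking library entries by their similarity to a query pattern. The example below uses the Tanimoto coefficient on sets of fingerprint bits, a common similarity measure for compounds; the excerpt does not say which measure its references use, so this choice, the function names, and the toy fingerprints are all assumptions.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def retrieve_by_content(query, library, top_k=2):
    """Return the top_k library entries most similar to the query fingerprint."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical library: compound name -> set of 'on' fingerprint bits.
library = {
    "cmpd_A": {1, 4, 7, 9},
    "cmpd_B": {1, 4, 8},
    "cmpd_C": {2, 3, 5},
}

hits = retrieve_by_content({1, 4, 7}, library)
print(hits)
```

In practice the fingerprints would come from a cheminformatics toolkit rather than being written by hand.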
