KDD Nuggets 95:17, e-mailed 95-07-28

Contents:
* W. Ziarko, Special Issue of Computational Intelligence
* D. Silver, Data Transformation Tools -- Summary
* C. Sturrock, Random vs. network-selected sample partitioning
* P. Smyth, NASA data sets at http://nssdc.gsfc.nasa.gov
* M. Bramer, Research Fellowships in KBS/Machine Learning
  see http://www.sis.port.ac.uk/FELLOWSHIP.HTML
* GPS, ComputerWorld: Sears Mines Data with multidimensional tools
* GPS, New Siftware in KD Mine
  http://info.gte.com/~kdd/what-is-new.html

The KDD Nuggets is a moderated mailing list for news and information
relevant to Knowledge Discovery in Databases (KDD), also known as Data
Mining, Knowledge Extraction, etc. Relevant items include tool
announcements and reviews, summaries of publications, information
requests, interesting ideas, clever opinions, etc. Please include a
DESCRIPTIVE subject line in your submission. Nuggets frequency is
approximately bi-weekly.

Back issues of Nuggets, a catalog of S*i*ftware (data mining tools),
references, FAQ, and other KDD-related information are available at the
Knowledge Discovery Mine, URL http://info.gte.com/~kdd/
or by anonymous ftp to ftp.gte.com, cd /pub/kdd, get README

E-mail add/delete requests to kdd-request@gte.com
E-mail contributions to kdd@gte.com

-- Gregory Piatetsky-Shapiro (moderator)

********************* Official disclaimer ***********************************
* All opinions expressed herein are those of the writers (or the moderator) *
* and not necessarily of their respective employers (or GTE Laboratories)   *
*****************************************************************************

~~~~~~~~~~~~ Quotable Quote ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We are going to keep meeting until we can figure out why nothing
gets done around here!
        A frustrated CEO

To be on time, risk being early.
        A time management tip

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: Wojtek Ziarko
Subject: Special Issue of Computational Intelligence
To: kdd@gte.com
Date: Thu, 6 Jul 1995 15:26:41 -0600 (CST)
==================================================

Computational Intelligence, An International Journal, has recently
published a Special Issue on Rough Sets and Knowledge Discovery. The
issue, edited by Wojciech Ziarko, contains a selection of expanded
articles following the workshop on the same subject held in Banff,
Canada in 1993.

Interested persons can order copies of the journal from Blackwell
Publishers, 238 Main Str., Cambridge, MA 02142, tel. 1-800-835-6770,
fax. 617-547-0789. The cost of a single issue is $28 in North America
and $32 elsewhere.

What follows is an introduction to the special issue, which should give
an idea of its contents.

     INTRODUCTION TO THE SPECIAL ISSUE ON
     ROUGH SETS AND KNOWLEDGE DISCOVERY

     Wojciech Ziarko
     Computer Science Department
     University of Regina
     Regina, SK Canada S4S 0A2

1. Introduction

The theory of rough sets is a relatively new research direction
concerned with the analysis and modelling of classification and
decision problems involving vague, imprecise, uncertain or incomplete
information. The methodology stems from the premise that the
classification acts of an empirical observation and subsequent decision
making are fundamental features of intelligent behaviour. Consequently,
the methodology models observations as classifications and decision
problems as sets expressed approximately in terms of lower and upper
bounds constructed using the classifications (Pawlak 1991).
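[For illustration only (not part of the article): a minimal Python
sketch of the lower and upper bounds just described. Objects are
grouped into indiscernibility classes by their attribute values; a
target set is then bounded from below by the classes wholly contained
in it and from above by the classes that intersect it. The toy table
and attribute names are invented.]

from collections import defaultdict

def equivalence_classes(table, attributes):
    # Group object ids by their values on the chosen attributes
    # (the indiscernibility classes).
    classes = defaultdict(set)
    for obj_id, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj_id)
    return list(classes.values())

def lower_upper(table, attributes, target):
    # Lower/upper approximation of the set `target` of object ids.
    lower, upper = set(), set()
    for cls in equivalence_classes(table, attributes):
        if cls <= target:      # class lies entirely inside the target set
            lower |= cls
        if cls & target:       # class overlaps the target set
            upper |= cls
    return lower, upper

# Toy table (invented): flu diagnosis from two symptoms.
table = {
    1: {"headache": "yes", "temp": "high",   "flu": "yes"},
    2: {"headache": "yes", "temp": "high",   "flu": "yes"},
    3: {"headache": "no",  "temp": "high",   "flu": "yes"},
    4: {"headache": "no",  "temp": "high",   "flu": "no"},   # conflicts with 3
    5: {"headache": "no",  "temp": "normal", "flu": "no"},
}
X = {o for o, r in table.items() if r["flu"] == "yes"}
low, up = lower_upper(table, ["headache", "temp"], X)
print("lower:", sorted(low))   # [1, 2]       -- certainly flu
print("upper:", sorted(up))    # [1, 2, 3, 4] -- possibly flu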
The rigorous mathematical approach to dealing with such problems in a
formal way was originally proposed by Zdzislaw Pawlak and later
investigated in detail by logicians, mathematicians and computer
scientists. The theory of rough sets gave rise to new formal approaches
to approximate reasoning, digital logic analysis and reduction, control
algorithm acquisition, machine learning algorithms and pattern
recognition.

One particularly attractive application area for this methodology is
knowledge discovery, or database mining. Initiated by Gregory
Piatetsky-Shapiro (Piatetsky-Shapiro and Frawley 1991), knowledge
discovery has grown into an important research and application subfield
within AI. It is concerned with the identification of non-trivial data
patterns or relationships normally hidden in databases. The
relationships typically assume the form of data dependencies, either
functional, partially functional or probabilistic, and their discovery
and characterization is possible only by using special software tools.

In this context, the existing research results derived from the basic
model of rough sets provide a wealth of techniques applicable to the
knowledge discovery problem. In particular, the theory of rough sets is
used to model and analyze the data at various levels of abstraction to
better expose the data regularities; it provides techniques to analyze
data dependencies, to identify fundamental factors and to discover both
deterministic and non-deterministic rules from data.

This special issue represents a spectrum of research results in both
the rough sets and knowledge discovery areas, with particular emphasis
on applications. The applications described in this issue fall into the
following categories: medical research, data preprocessing and
reduction for neural networks, control algorithm acquisition, database
systems and machine learning. More theoretical articles deal with such
problems as the representation of uncertain information and formal
reasoning with such information, concept formation, and database
storage and retrieval of vague or imprecise information.

Almost all the articles included in this collection were chosen from
among the fifty papers presented at the International Workshop on Rough
Sets and Knowledge Discovery held in Banff, Canada, in October 1993.
The workshop was organized by the Department of Computer Science at the
University of Regina in Regina, Saskatchewan, Canada.

2. Review of the Contents of the Special Issue

In what follows, we briefly review the contents of the papers included
in this issue.

2.1 Vagueness versus Uncertainty

The first paper, written by Zdzislaw Pawlak, the originator of rough
sets, deals with the fundamental issues of vagueness and uncertainty of
information. The author makes a clear distinction between these two
notions in the context of the rough sets model. The main point of the
article is to demonstrate that the vagueness of information can be seen
as a property of imprecisely specified sets (concepts), whereas
uncertainty can be attributed to set elements through the use of the
rough membership function, similar to the fuzzy membership function
(Zadeh 1965).

2.2 Rule Discovery

The majority of papers included in this issue deal with applications of
rough sets. Most of the applications are concerned with the discovery
and utilization of new knowledge extracted from empirical data. In
particular, a number of papers study the problem of rule discovery from
data in the rough sets framework.
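[Again for illustration only: a small Python sketch of two rough-set
notions that several of the papers reviewed below build on -- the
degree of dependency between condition attributes and a decision
attribute, and a (brute-force) reduct, i.e. a minimal attribute subset
preserving that degree. The toy table is invented; the papers in the
issue use far more refined algorithms.]

from collections import defaultdict
from itertools import combinations

def partition(table, attributes):
    # Indiscernibility classes with respect to the given attributes.
    blocks = defaultdict(set)
    for obj_id, row in table.items():
        blocks[tuple(row[a] for a in attributes)].add(obj_id)
    return list(blocks.values())

def gamma(table, condition, decision):
    # Degree of dependency: fraction of objects whose condition class
    # falls entirely within a single decision class (no conflicts).
    decision_blocks = partition(table, [decision])
    positive = set()
    for cls in partition(table, condition):
        if any(cls <= d for d in decision_blocks):
            positive |= cls
    return len(positive) / len(table)

def reduct(table, condition, decision):
    # Exhaustive search for a smallest attribute subset that preserves
    # the dependency degree of the full condition set.
    full = gamma(table, condition, decision)
    for k in range(1, len(condition) + 1):
        for subset in combinations(condition, k):
            if gamma(table, list(subset), decision) == full:
                return list(subset)
    return list(condition)

# Toy table (invented): three condition attributes, one decision.
table = {
    1: {"headache": "yes", "temp": "high",   "tired": "yes", "flu": "yes"},
    2: {"headache": "yes", "temp": "normal", "tired": "yes", "flu": "no"},
    3: {"headache": "no",  "temp": "high",   "tired": "no",  "flu": "yes"},
    4: {"headache": "no",  "temp": "normal", "tired": "yes", "flu": "no"},
    5: {"headache": "no",  "temp": "high",   "tired": "no",  "flu": "yes"},
}
print(gamma(table, ["headache", "temp", "tired"], "flu"))   # 1.0 (consistent table)
print(reduct(table, ["headache", "temp", "tired"], "flu"))  # ['temp']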
The paper by Tsumoto and Tanaka describes the application of a
probabilistic extension of the basic rough sets model to the discovery
of diagnostic rules from medical data. The authors are medical
professionals, and the data they used in this application was
accumulated through their clinical practice.

The article written by Skowron proposes methods of extracting
potentially useful rules from among the possible rules identified by a
rule discovery system. This is a very important problem, since data
relationships of the form "if A then B" do not necessarily reflect real
rules of the application domain, and some kind of screening is needed
to eliminate likely incorrect rules.

The paper by Hu and Cercone deals with techniques for data
generalization, using the concept hierarchy approach to new concept
formation and the rough sets approach to data analysis and reduction.
The generalization of the original information is an essential step in
rough-sets-based data analysis. It helps to uncover repetitive data
patterns and to produce rules with a high degree of support in terms of
the available data objects.

A related problem is studied by Xiang, Wong and Cercone. The authors
combine the concept hierarchy approach, probability theory and the
rough sets methodology in the construction of a rule discovery
algorithm with associated estimates of error probabilities.

The empirical comparison of some rule extraction algorithms is the
subject of the paper by D. Grzymala-Busse and J. Grzymala-Busse. The
authors compare the sensitivity of the predictive accuracy of the rules
produced by these algorithms with respect to missing attributes in the
training data set. The comparison involves methods based on the
well-known ID3 technique and rough-set-based methods implemented in the
authors' system LERS.

The acquisition and incremental adaptation of rules discovered from
data is discussed in the article by Shan and Ziarko. The authors
present a technique for finding all maximally general rules and
reducts, i.e. minimal subsets of attributes preserving the degree of
dependency with a target concept. A method for the subsequent
incremental modification of rules and reducts as new data becomes
available is also presented. In their presentation, the authors use an
extension of the original rough sets definition, called the variable
precision rough sets model, to be able to capture strong data patterns
in the boundary area of the target concept (Katzberg and Ziarko 1994).

Plonka and Mrozek describe experiments oriented towards the application
of rough set methods to control algorithm acquisition from empirical
data. The objective of the experiments was to use data representing
sensor readings and human operator actions, accumulated while
controlling a complex device, to automatically produce a control
algorithm which could eventually be substituted for the operator. To
log the operation data, the authors developed a system to manually
control an inverted pendulum in a balanced position. They demonstrated
that, through the acquisition of rules from the sampled operation data,
it is possible to achieve high stability and fully balanced automatic
control of the pendulum.

2.3 Data Preprocessing

An application of rough sets to training set reduction for neural
networks is described in the paper written by Jelonek, Krawiec and
Slowinski. The authors present and verify by experimentation a method
for information-preserving reduction of attributes and their domains.
The method leads to a significant speedup of network training at the
expense of a small increase in the network classification error rate.

2.4 Storage and Retrieval of Imprecise Information

An interesting application of rough sets to a generalization of the
standard relational database model is presented in the article written
by Beaubouef, Petry and Buckles. The generalization allows for
representing information in terms of imprecisely specified concepts,
i.e. concepts given by their lower and upper approximations. The
authors develop the complete extended model, including a generalization
of relational algebra for manipulating and querying such databases.

2.5 Representation and Reasoning with Uncertain Information

Modelling uncertain information and reasoning with such information in
the context of rough sets is the subject of two papers. The article by
Herment and Orlowska presents a basic logical framework, called
information logic, for formal reasoning with imprecise and vague
concepts specified by their lower and upper approximations. The authors
also present the implementation of a graphical proof editor for the
construction and syntactic verification of formal proofs in information
logic.

The paper by Wong, Wang and Yao is primarily concerned with the problem
of representation of uncertain information. The authors propose a
representation technique called an interval structure and study its
relationship to rough set theory and the theory of evidence.

2.6 Concept Formation

Concept formation and the representation of inter-concept relationships
is an important part of knowledge discovery and machine learning
research. Two papers in this collection deal with specific issues
related to this problem.

The problem of concept formation from data representing observations is
discussed by Godin, Missaoui and Alaoui in the framework of Galois
lattice theory. The authors use the notion of a Galois lattice to
represent the identified concepts and the relationships among them.
They also present incremental algorithms for modifying the structure of
the lattice as new data objects become available.

Hamilton and Fudger discuss the problem of concept formation in the
context of the discovery system DBLEARN. They look at this problem from
the application perspective of trying to find good heuristic measures
for evaluating the potential usefulness of the intermediate concepts
created in the concept hierarchy. They propose several heuristic
concept evaluation methods based on properties of concept trees and
present the results of empirical testing of the methods using four
commercial databases.

References

Pawlak, Z. 1991. Rough sets: Theoretical aspects of reasoning about
data. Kluwer Academic Publishers.

Piatetsky-Shapiro, G. and Frawley, W. J. (eds.) 1991. Knowledge
discovery in databases. AAAI/MIT Press.

Katzberg, J. and Ziarko, W. 1994. Variable precision rough sets with
asymmetric bounds. In Ziarko, W. (ed.), Rough sets, fuzzy sets and
knowledge discovery. Springer-Verlag, Workshops in Computing series.

Zadeh, L. 1965. Fuzzy sets. Information and Control, 8:338-353.

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[ From Neuron Digest [Volume 95 Issue 32], GPS]

Subject: Summary of responses on data transformation tools
From: "Danny L. Silver"
Date: Fri, 07 Jul 1995 09:35:45 -0400

Some time ago (May/95), I requested additional information on data
transformation tools:

> Many of us spend hours preparing data files for acceptance by
> machine learning systems.
> Typically, I use awk or C code to transform ASCII records into
> numeric or symbolic attribute tuples for a neural net, inductive
> decision tree, etc. Before re-inventing the wheel, has anyone
> developed a general tool for performing some of the more common
> transformations? Any related suggestions would be of great use to
> many on the network.

Below is a summary of the most informative responses I received.
Sorry for the delay.

.. Danny

--
=========================================================================
Daniel L. Silver       University of Western Ontario, London, Canada
N6A 3K7 - Dept. of Comp. Sci. - Office: MC27b
dsilver@csd.uwo.ca     H: (519)473-6168   O: (519)679-2111 (ext.6903)
=========================================================================

From: A. Famili

I have done quite a bit of work in this area, on data preparation and
data pre-processing, and also on rule post-processing in induction. As
part of the induction system that we have built, we have added some
data pre-processing capabilities. I am also organizing and will be
chairing a panel on the "Role of Data Pre-processing in Intelligent
Data Analysis" at the IDA-95 Symposium (Intelligent Data Analysis
Symposium, to be held in Germany in Aug. 1995).

The most common tool on the market is NeuralWare's Data Sculptor (I
have only seen the brochure and a demo). It is claimed to be a general
purpose tool. Others are covered in the short report that I send you
below.

A. Famili, Ph.D.
Senior Research Scientist
Knowledge Systems Lab.
IIT - NRC, Bldg. M-50
Montreal Rd.
Ottawa, Ont. K1A 0R6 Canada
Phone: (613) 993-8554
Fax:   (613) 952-7151
email: famili@ai.iit.nrc.ca
---------------------------

A. Famili
Knowledge Systems Laboratory
Institute for Information Technology
National Research Council Canada

1.0 Introduction

This report outlines a comparison of three commercial data
pre-processing tools that are available on the market. The purpose of
the study was to identify useful features of these tools that could be
helpful in the intelligent filtering and data analysis of the IDS
project. The comparison does not involve use and evaluation of any of
the tools on real data. Two of these tools (LabView and OS/2
Visualizer) are available in the KSL.

2.0 Data Sculptor

Data Sculptor was developed by NeuralWare on the premise that in neural
network data analysis applications, 80 percent of the time is spent on
data preprocessing. The tool was developed to handle any type of
transformation or manipulation of data before the data is analysed. The
graphics capabilities include: histograms, bar charts, line, pie and
scatter plots. There are several statistical functions to be used on
the data. There are also options to create new variables (attribute
vectors) based on transformations of other variables. Following are
some important specifications, as explained in the fact sheets and demo
version:

- Input Data Formats: DBase, Excel, Paradox, Fixed Field, ASCII, and
  Binary.
- Output Data Formats: Fixed Field, Delimited ASCII and Binary.
- General Data Transformations: Sorting, File Merge, Field Comparison,
  Sieve and Duplicate, and Neighborhood.
- Math Transformations: Arithmetic, Trigonometric, and Exponential.
- Special Transformations: encodings of the type One-of-N, Fuzzy
  One-of-N, R-of-N, Analog R-of-N, Thermometer, Circular, and Inverse
  Thermometer; Normalizing Functions; Fast Fourier Transformations; and
  more.
- Stat. Functions: Count, Sum, Mean, Chi-square, Min, Max, STD,
  Variance, Correlation and more.
- Graph Formats: Bar chart, Histogram, Scatter Plot, Pie, etc.
- Spreadsheet: Data Viewing, and Search Function.

A data pre-processing application can be built by using (or defining)
icons and assembling the entire application in the Data Sculptor
environment, which is quite easy to use. There are a number of demo
applications that came with the demo diskettes. An on-line hypertext
help facility is also available. Data Sculptor runs under Windows.
Information on Data Sculptor comes from the literature and two demo
diskettes.

3.0 LabView and Data Engine

LabView (Laboratory Virtual Instrument Engineering Workbench) is a
product developed by National Instruments. It is, however, available
with Data Engine, a data analysis product developed by MIT in Germany.
LabView, a high-level programming environment, has been developed to
simplify scientific computation, process-control analysis, and test and
measurement applications. It is far more sophisticated than other data
pre-processing systems. Unlike other programming systems that are text
based, LabView is graphics based and lets users create data viewing and
simulation programs in block diagram form. LabView also contains
application-specific libraries for data acquisition, data analysis,
data presentation, and data storage. It even comes with its own GUI
builder facility (called the front panel), so that the application can
be monitored and run to simulate the panel of a physical instrument.
There are also a number of LabView companion products that have been
developed by users or suppliers of this product.

4.0 OS/2 Visualizer

The Visualizer comes with OS/2 and is installed on the PCs of the IDS
project. Its main function is support for data visualization, and it
consists of three modules: (i) Charts, (ii) Statistics, and (iii)
Query.

The Visualizer Charts module provides support for a variety of
chart-making requirements. Examples are: line, pie, bar, scatter,
surface, mixed, etc.

The Visualizer Statistics module provides support for 57 statistical
methods in seven categories: (i) Exploratory methods, (ii)
Distributions, (iii) Relations, (iv) Quality control, (v) Model
fitting, (vi) Analysis of variance, and (vii) Tests. Each of the above
categories consists of several features that are useful for statistical
analysis of data.

The Visualizer Query module provides support for a number of query
tasks to be performed on the data. These include means to access and
work with the data that is currently used, creating and storing new
tables in the database, combining data from many tables, and more. It
is not evident from the documentation whether or not we can perform
some form of data transformation or preprocessing on the queried data
so that a preprocessed data file is created for further analysis.

================================================================
From: Matthijs Kadijk

I personally think that AWK is the best and most general tool for these
purposes, but for those who want something less general but easy to
use, I suggest dm (a data manipulator), which is part of Gary Perlman's
UNIX|STAT package. It should be no problem to find it on the net. I
also use the UNIX|STAT programs to analyse the results of simulations
with my NN programs. I'll attach the dm tutorial to this mail (DLS: not
included in this summary).

Matthijs Kadijk                      email: kkm@bouw.tno.nl
TNO-Bouw, Postbus 49                 www:   http://www.bouw.tno.nl
NL-2600 AA Delft                     tel:   +31 - 15 - 842 195
                                     fax:   +31 15 843975
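[For illustration only: a minimal Python sketch of the kind of
record-to-tuple transformation discussed in this thread, with invented
field names and two of the encodings mentioned above (one-of-N coding
and crude rescaling); awk or dm would express the same steps.]

COLORS = ["red", "green", "blue"]      # categories for one-of-N coding

def one_of_n(value, categories):
    # One-of-N ("one-hot") encoding of a symbolic attribute.
    return [1.0 if value == c else 0.0 for c in categories]

def transform(record):
    # Hypothetical comma-separated record: age, income, color, label.
    age, income, color, label = record.strip().split(",")
    return ([float(age) / 100.0,          # crude rescaling to roughly [0, 1]
             float(income) / 100000.0]
            + one_of_n(color, COLORS)
            + [1.0 if label == "yes" else 0.0])

for rec in ["34,52000,red,yes", "61,23000,blue,no"]:
    print(transform(rec))
# [0.34, 0.52, 1.0, 0.0, 0.0, 1.0]
# [0.61, 0.23, 0.0, 0.0, 1.0, 0.0]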
=====================================================================

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Tue, 11 Jul 1995 11:36:24 +0100
From: Charles.Sturrock@mtm.kuleuven.ac.be (Charles Sturrock)
To: kdd%eureka@gte.com
Subject: random vs. network-selected sample partitioning
Cc: saswss@unx.sas.com

The recent excerpt from Warren Sarle posted to KDD Nuggets triggered
the following question, which I think is applicable to any method of
model development (not just neural networks) or knowledge discovery
from databases.

Suppose one has a sample of "real world" input-output data to be used
to model some physical process. To simplify the problem, suppose the
output is binary. Like all real-world samples, the data are noisy. A
Kohonen Self-Organizing Map (SOM) can be used to identify situations
where the data are in conflict, i.e., a set of cases with identical or
nearly identical inputs is approximately evenly distributed between
True and False values of the output.

Suppose one uses a SOM to partition a sample into training and test
cases as follows. Where the data are in conflict, only one case from
each conflicting group is chosen for the training set. Where the data
are not in conflict, only one case is chosen for the training set. All
remaining cases are put into the test set. This process creates a test
set that "mirrors" the training set, and hence yields a very low error
rate when the ensuing model is evaluated.

Now this sounds very much like "testing on the training cases", a
common error in performance analysis. However, the SOM seems to create
a training set that is most representative of the entire population,
i.e., the set of all cases that one wants to be able to generalize to.
The fact that the test set mirrors this particular training set in this
case isn't a problem. Or is it?

I would appreciate any comments or literature references from anyone
that address the consequences of partitioning a sample in this manner.
The literature I have read all seems to emphasize resampling (multiple
random partitioning and averaging the results) as the safest means of
accurate error rate estimation. However, resampling is computationally
very expensive, and yields multiple models. If the research goal is to
find the one single model with the lowest error rate, then does the
SOM-derived sample partitioning method not achieve this goal much more
cheaply?

C.P. Sturrock
charles@mtm.kuleuven.ac.be

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Mon, 17 Jul 95 09:12:05 PDT
From: pjs@aig.jpl.nasa.gov (Padhraic J. Smyth)
Subject: WWW page for NASA data sets

Useful WWW page with information on NASA data
===============================================

The National Space Science Data Center (NSSDC) now has a WWW homepage
with a significant amount of information about all of NASA's data sets
from planetary exploration, space and solar physics, life sciences, and
astrophysics, including many links to other sites. This is probably the
most useful place to start for people interested in data mining of
NASA's vast scientific data sets.
The address is: http://nssdc.gsfc.nasa.gov

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
From: bramerma@cv.port.ac.uk
Date: Tue, 25 Jul 1995 12:07:02 EDT
Subject: Research Fellowships/Senior Research Fellowships in Knowledge Based Systems/Machine Learning

UNIVERSITY OF PORTSMOUTH
DEPARTMENT OF INFORMATION SCIENCE

RESEARCH FELLOWSHIP/SENIOR RESEARCH FELLOWSHIP IN
KNOWLEDGE BASED SYSTEMS/MACHINE LEARNING

Salary up to 24,000 pounds per annum

As part of a major programme of expansion entitled 'Investing in
Research Excellence', the University of Portsmouth is inviting
applications for Research Fellows/Senior Research Fellows in either of
the above areas. Appointments at both levels are being offered with
earmarked support for five years which, following successful
completion, could lead to a subsequent position as Senior or Principal
Lecturer, Reader or Professor.

Fellows will join the Artificial Intelligence Research Group in the
Department of Information Science, which is the largest and most active
of the Department's research groups. Its activities can be broadly
divided into two areas: Knowledge Based Systems and Machine Learning.
Fellows will be expected to make a substantial contribution to one of
these areas.

Successful candidates will conduct research under the direction of
Professor Max Bramer, who is the leader of the AI Research Group and
Head of the Department of Information Science. Professor Bramer has
been actively involved in research in Artificial Intelligence since the
early 1970s and since 1988 has been chairman of the BCS Specialist
Group on Expert Systems (SGES), a member organisation of ECCAI. His
current research interests include inductive learning, case-based
reasoning, model-based approaches to diagnostic reasoning and
methodologies for knowledge engineering.

Other leading members of the group include Professor Tom Addis, who
also has over 20 years' involvement in the field and is a former
chairman of the Artificial Intelligence Professional Group of the
Institution of Electrical Engineers. He is currently conducting
research into graphical programming interfaces and into theoretical
aspects of the representation and use of knowledge.

Fellows will normally need to have a PhD degree as well as a track
record of successful published research in either Machine Learning or
Knowledge Based Systems. They will be expected to have an active area
of ongoing research which fits closely with the Department's current
interests. In addition, they will be expected to contribute to the
overall development of the Department's research culture. Specifically,
they will:

- Undertake and publish research in their chosen subject area
- Maintain expertise in their discipline and disseminate it to others
- Support and facilitate the research of other members of the AI
  Research Group
- Contribute to the Department's PhD training programme for its AI
  cohort of research students
- Prepare bids for funding from UK and European Union research
  initiatives and from industry, in collaboration with other members of
  the AI Research Group as appropriate.

Further particulars are available from: The Personnel Office,
University of Portsmouth, University House, Winston Churchill Avenue,
Portsmouth PO1 2UP, England. Telephone +44-(0)1705-843421 (24-hour
answerphone). Quote Reference RF1G.

NOTE: THE CLOSING DATE IS AUGUST 7th 1995.
For an informal discussion prior to application, contact Professor Max
Bramer on +44-(0)1705-844444 (email: bramerma@csovax.portsmouth.ac.uk)
or Simon Thompson on +44-(0)1705-844097 (email: sgt@sis.port.ac.uk).

Further information about the AI Research Group and the appointment is
also available on the World-Wide Web at:
HTTP://WWW.SIS.PORT.AC.UK/FELLOWSHIP.HTML

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Tue, 25 Jul 1995 13:17:49 -0400
From: gps0 (Gregory Piatetsky-Shapiro)
Subject: ComputerWorld: Sears Mines Data with multidimensional tools

The June 26, 1995 Computerworld has a short article on Sears using
multidimensional technology from Arbor Software Corp. for data mining.
Arbor's Essbase -- a multidimensional database engine with a
spreadsheet-like front-end -- will be used by Sears's financial
analysts, along with store and merchandise managers. Sears expects that
eventually upward of 3000 employees will use the system.

Analysts said that one key reason why Arbor is winning accounts such as
Sears is that it designed its front-end to resemble spreadsheets, one
of the most commonly used desktop analysis tools.

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Date: Fri, 28 Jul 1995 08:29:22 -0400
From: kdd (KDD Nuggets Moderator)
Subject: New Siftware in KD Mine

July 27, 1995

* In Snob, MML-based program for clustering and unsupervised
  classification of multivariate data.
* In Cornerstone, an integrated statistical and data visualization
  package.
* In Inspect, a system for the visualization and interpretation of
  data, using statistical, KNN, and neural network methods.
* In KATE-tools, a set of tools for decision-tree building and
  case-based reasoning.
* In Siftware section, new entry for

July 6, 1995

* In Information Harvester, a hybrid system combining decision trees,
  statistics and fuzzy reasoning.
* In JMP, a stand-alone statistical analysis product developed by the
  SAS Institute for personal computing environments. JMP software
  offers over 270 statistical features, can assist in the design and
  analysis of experiments, and is capable of both visual and text
  reporting.

>~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~