by Alfred Inselberg, aiisreal at tau.ac.il
the inventor of Parallel Coordinates
Here is a multi-dimensional visualization
The software is based on Parallel Coordinates, a methodology
for the unambiguous (i.e. no loss of information) visualization of multivariate
data and RELATIONS. The discovery of multivariate/multidimensional relations
in a dataset is transformed into a 2-D pattern recognition problem. The
software's unique interface, queries, and boolean operators enable the
visual, interactive discovery of complex relations in multivariate
datasets and, in turn, of the effect these relations have on various
objectives. Unexpected relations have been discovered in datasets
with more than 100 variables, revealing sensitivities, repetitive patterns, other
trends and salient properties. The visualization helps not only
the discovery process but ALSO the presentation and EXPLANATION of the
results.
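To make the idea of turning relations into 2-D patterns concrete, here is a minimal sketch of the basic parallel-coordinates mapping. It is an illustration only, not code from the software; the function names and the `spacing` parameter are assumptions. Each N-dimensional point becomes a polyline with one vertex per axis, and points lying on a line x2 = m*x1 + b (m != 1) become segments that all pass through one common point between or beside the two axes.

```python
def to_polyline(point, spacing=1.0):
    """Map an N-dimensional point to the vertices of its polyline.

    Axis i is the vertical line at horizontal position i * spacing,
    and the i-th coordinate is the height of the vertex on that axis.
    """
    return [(i * spacing, value) for i, value in enumerate(point)]


def segment_height(x1, x2, x, spacing=1.0):
    """Height at horizontal position x of the segment that represents
    the 2-D point (x1, x2), i.e. the segment from (0, x1) to (spacing, x2)."""
    return x1 + (x / spacing) * (x2 - x1)


def dual_point(m, b, spacing=1.0):
    """Common intersection point of the segments representing all points
    on the line x2 = m * x1 + b. Undefined for m == 1 (parallel segments)."""
    if m == 1:
        raise ValueError("segments are parallel when m == 1")
    return (spacing / (1 - m), b / (1 - m))


if __name__ == "__main__":
    print(to_polyline((3, 1, 4)))
    # Three points on the line x2 = -x1 + 4 share one intersection point.
    x, y = dual_point(-1, 4)
    for a in (0, 1, 3):
        print(a, segment_height(a, -a + 4, x), y)
```

A linear relation between two adjacent variables therefore shows up as a visible crossing point, which is the kind of 2-D pattern the text refers to.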
Recently, very efficient classifiers based on the methodology were found
and have now been implemented in the software. Specifically, let a dataset
consist of N categories (i.e. subsets). The classifiers then:
- Discover explicit rules (if they exist -- i.e. if there is sufficient
information in the data) which distinguish a category from the others.
- Find the MINIMAL subset of parameters which suffices to specify the
rule WITHOUT LOSS OF INFORMATION (i.e. this is NOT an approximation).
This has achieved a tremendous reduction (i.e. 1 to 5 or more) in the
dimensionality of the problem.
- Order these variables so as to optimize the separation between the
categories -- this provides a very useful rating of the importance of the
parameters.
- Provide the rule VISUALLY.
- Provide information on the geometrical distribution of the data.
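The announcement does not describe how the classifier works internally, so the following is only a simplified stand-in to illustrate what "an explicit rule on a small, ordered subset of variables" can look like. It assumes a rule made of per-variable intervals and uses a greedy search, which is a different technique from the software's classifier and does not guarantee a minimal subset. The function name `interval_rule` and the data layout (lists of equal-length tuples) are hypothetical.

```python
def interval_rule(target, others):
    """Greedy sketch (NOT the software's algorithm): choose variables whose
    [min, max] range over `target` excludes every point in `others`.

    Returns a list of (variable_index, (low, high)) in the order chosen,
    or None if no such interval rule separates the category.
    """
    n_vars = len(target[0])
    # Range of each variable over the target category.
    bounds = [
        (min(p[j] for p in target), max(p[j] for p in target))
        for j in range(n_vars)
    ]
    remaining = list(others)  # other-category points not yet excluded
    chosen = []

    while remaining:
        def excluded(j):
            low, high = bounds[j]
            return sum(1 for p in remaining if not low <= p[j] <= high)

        candidates = [j for j in range(n_vars) if j not in chosen]
        if not candidates:
            return None
        best = max(candidates, key=excluded)
        if excluded(best) == 0:
            return None  # no remaining variable excludes anything
        low, high = bounds[best]
        remaining = [p for p in remaining if low <= p[best] <= high]
        chosen.append(best)

    return [(j, bounds[j]) for j in chosen]


if __name__ == "__main__":
    target = [(1, 10, 5), (2, 12, 7), (3, 11, 6)]
    others = [(2, 30, 6), (9, 11, 5), (2, 31, 9)]
    print(interval_rule(target, others))
```

In this toy data the rule uses two of the three variables, and the order in which they are picked gives a rough ranking of how much each one contributes to the separation.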
The classifier's speed enables it to be used ADAPTIVELY, i.e. with the rule
derived and updated in real time as the data flows in. The classifier is
better suited to handling numerical data, though it can be applied to
datasets where no more than 20% of the variables are categorical. The
software also has provisions for handling datasets with missing
values.
Price, ordering and technical information
Please contact Alfred Inselberg, aiisreal at tau.ac.il, and mention that
you saw it in KDnuggets.