KDnuggets : Newsletter : 1999 Issues : 99:01 Contents :

KDnuggets 99:01, item 7, Tools and Services:

Date: Tue, 5 Jan 1999 09:41:10 -0500 (EST)
From: Gregory Piatetsky-Shapiro gps
Subject: Discovery of Sequences 

I have received quite a few responses to Ismail Parsa's request for tools 
for Sequential Associations.  

Robert St. Amant stamant@eos.ncsu.edu points to a UMass web page for
multi-stream dependency detection. 
Inge Jonassen inge@ii.uib.no points to tools for mining biosequences
(DNA, RNA, and protein sequences) for conserved patterns.
Yves Chauvin yves@netid.com describes their software HMMpro (available
for download) which uses machine learning for knowledge discovery in molecular 
biology.
Jonathan D. Becher becher@neovista.com describes how the NeoVista DecisionAR
mining engine finds sequential and temporal patterns.

--
Date: Wed, 16 Dec 1998 08:41:12 -0500 (EST)
From: Robert St. Amant stamant@eos.ncsu.edu

Adele Howe, at Colorado State, developed techniques some time ago for
detecting dependencies in symbolic traces of plan behavior.  Paul
Cohen and his students at UMass extended these techniques to do
multi-stream dependency detection, incremental dependency detection,
and other variations. Paul's Web page, http://www-eksl.cs.umass.edu,
has pointers to the relevant papers.

The Lisp code was developed under a DARPA grant, and may be available for
the asking (contact Tim Oates at oates@cs.umass.edu).
--
Date: Wed, 16 Dec 1998 15:19:21 +0100 (MET)
From: Inge Jonassen inge@ii.uib.no

This may not be exactly what you asked for, but it is closely related.
There are many tools available for mining biosequences (DNA, RNA,
and protein sequences) for conserved patterns. See
http://www.ebi.ac.uk/~brazma/patterns.html (collection of links) and
http://www.ii.uib.no/~inge/patterns.html (short introductory text).

-- 
Date: Wed, 16 Dec 1998 10:04:06 -0800
From: Yves Chauvin yves@netid.com

You may want to check our web site: http://www.netid.com

We use (mostly) HMMs for sequence data analysis applied to biological sequences.
Our software, HMMpro, is available for download.
HMMpro uses machine learning for knowledge discovery in molecular biology.

All of the HMM concepts can obviously be extended
to other types of sequence analysis.

--
Date: Thu, 17 Dec 1998 12:08:37 -0800
From: Jonathan D. Becher becher@neovista.com
Subject: Neovista Tools for Sequence Data Analysis

The NeoVista DecisionAR mining engine does indeed find sequential patterns, described as temporal analysis in the NeoVista literature. In temporal analysis, association rules are computed for items purchased by the same customer in different visits over a period of time. Temporal analysis takes all transactions for a single customer, and considers them to be a single transaction sequence. The data must include a customer ID, and a date and/or time value for each transaction. Associations are computed for pairs of items. Associations are always forward in time. DecisionAR has three different options for temporal analysis:

For temporal analysis by occurrence, DecisionAR measures the percentage of customers who purchase item X and also purchase item Y at least once within a specified period of time. In this form of temporal analysis subsequent purchases of items X and Y by the same customer are disregarded. For example:
        Visit1  Visit2  Visit3  Visit4  Visit5  Count(XY)
Cust1   X       Y       X       Y       X       1
Cust2   X       X       Y       Y       X       1
Cust3   X       Z       Y       X       Y       1
Cust4   XX      YY      Y       Y       X       1
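
The occurrence rule can be illustrated with a short Python sketch. It is not DecisionAR code: the function name `count_by_occurrence` and the data layout are hypothetical, and the sketch assumes that all five visits fall within the specified period and that Y must be bought in a later visit than X (associations are forward in time).

```python
def count_by_occurrence(visits, x, y):
    """Return 1 if a visit containing x is followed by a later visit
    containing y, else 0. Repeat purchases are disregarded."""
    for i, basket in enumerate(visits):
        if x in basket:
            # The earliest x leaves the most later visits in which y can occur.
            return int(any(y in later for later in visits[i + 1:]))
    return 0


# Visits in chronological order; each character is one purchased item
# (an assumed encoding of the example table, e.g. "XX" = two X purchases).
customers = {
    "Cust1": ["X", "Y", "X", "Y", "X"],
    "Cust2": ["X", "X", "Y", "Y", "X"],
    "Cust3": ["X", "Z", "Y", "X", "Y"],
    "Cust4": ["XX", "YY", "Y", "Y", "X"],
}

counts = {name: count_by_occurrence(v, "X", "Y") for name, v in customers.items()}
print(counts)  # {'Cust1': 1, 'Cust2': 1, 'Cust3': 1, 'Cust4': 1}
```

Each customer contributes at most 1, which is why every row in the example has Count(XY) = 1.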

Temporal analysis by subsequent visits answers a different question: what is the probability that a purchase of item Y will follow the purchase of item X in subsequent visits within a specified time period? For example:
        Visit1  Visit2  Visit3  Visit4  Visit5  Count(XY)
Cust1   X       Y       X       Y       X       2
Cust2   X       X       Y       Y       X       1
Cust3   X       Z       Y       X       Y       2
Cust4   XX      YY      Y       Y       X       1
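
One counting rule that reproduces the Count(XY) column above is sketched below in Python. The rule is inferred from the example table, not taken from NeoVista documentation: an X purchase opens a "pending" state, the first Y in a later visit closes it and adds one to the count, and further X or Y purchases in between do not add to the count. The name `count_by_subsequent_visits` is hypothetical, and the sketch assumes all visits fall within the specified time period.

```python
def count_by_subsequent_visits(visits, x, y):
    """Count x -> y pairs where y occurs in some later visit than x.
    Each y closes at most one pending x (rule inferred from the example)."""
    count = 0
    pending = False  # True while an x purchase is waiting for a later y
    for basket in visits:
        if pending and y in basket:
            count += 1
            pending = False
        if x in basket:
            # An x in this visit can only be matched by a y in a later visit.
            pending = True
    return count


# Visits in chronological order; each character is one purchased item
# (an assumed encoding of the example table).
customers = {
    "Cust1": ["X", "Y", "X", "Y", "X"],
    "Cust2": ["X", "X", "Y", "Y", "X"],
    "Cust3": ["X", "Z", "Y", "X", "Y"],
    "Cust4": ["XX", "YY", "Y", "Y", "X"],
}

counts = {name: count_by_subsequent_visits(v, "X", "Y") for name, v in customers.items()}
print(counts)  # {'Cust1': 2, 'Cust2': 1, 'Cust3': 2, 'Cust4': 1}
```

Cust3 shows the difference from the next-visit variant: the Z visit between X and Y does not prevent the pair from being counted.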

In temporal analysis by next visit, the purchase of item Y must occur in the very next visit by the same customer in order to be counted. For example:
        Visit1  Visit2  Visit3  Visit4  Visit5  Count(XY)
Cust1   X       Y       X       Y       X       2
Cust2   X       X       Y       Y       X       1
Cust3   X       Z       Y       X       Y       1
Cust4   XX      YY      Y       Y       X       1
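
The next-visit rule amounts to checking adjacent pairs of visits. A minimal Python sketch follows; the name `count_by_next_visit` and the data encoding are hypothetical, and the time window is left out on the assumption that consecutive visits fall within it.

```python
def count_by_next_visit(visits, x, y):
    """Count visits containing x whose immediately following visit contains y."""
    return sum(
        1
        for current, following in zip(visits, visits[1:])
        if x in current and y in following
    )


# Visits in chronological order; each character is one purchased item
# (an assumed encoding of the example table).
customers = {
    "Cust1": ["X", "Y", "X", "Y", "X"],
    "Cust2": ["X", "X", "Y", "Y", "X"],
    "Cust3": ["X", "Z", "Y", "X", "Y"],
    "Cust4": ["XX", "YY", "Y", "Y", "X"],
}

counts = {name: count_by_next_visit(v, "X", "Y") for name, v in customers.items()}
print(counts)  # {'Cust1': 2, 'Cust2': 1, 'Cust3': 1, 'Cust4': 1}
```

For Cust3 the first X is followed by Z, so only the X in Visit4 followed by Y in Visit5 is counted.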

For more information, contact NeoVista Software at www.neovista.com.

Jonathan D. Becher
VP, Applications and Technology
becher@neovista.com


Copyright © 1999 KDnuggets