KDnuggets Home » News » 2011 » Aug » Software » JT on IBM In-Database Analytics  ( < Prev | 11:n19 | Next > )

JT on IBM In-Database Analytics


 
  
Three main features help achieve in-database execution: SQL Pushback, direct access to the database's analytic modeling routines, and model deployment/scoring options.


James Taylor, August 2, 2011

IBM SPSS has supported in-database analytic modeling for some time. The objective is to let analysts run the complete data mining process end-to-end in-database, from data access through data transformation to model building and scoring. In particular, IBM tries to let analysts push data transformation and data preparation into the database, since these typically account for a large part of a data mining project. To achieve in-database execution, the product provides three main features: SQL Pushback, direct access to a database's own analytic modeling routines, and model deployment/scoring options.

To build a predictive analytic model in IBM SPSS Modeler, an analyst creates an analytic workflow. This consists of multiple tasks, or nodes, that read, merge or transform data; split data into different test sets; apply modeling algorithms; and more. SQL Pushback takes the nodes in this workflow that relate to data access and transformation and pushes them to the database. The tool generates the SQL needed for these steps and executes it on the database from which the data was sourced. For the main supported databases (IBM DB2, Microsoft SQL Server, Netezza, Oracle, Teradata) the generated SQL is database-specific; for other databases, generic SQL is available for many nodes.
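The idea behind SQL Pushback can be illustrated with a toy sketch. This is not IBM SPSS Modeler's API or its actual SQL generator: the node tuples, the `compile_to_sql` helper, and the `customers` table are all hypothetical, and SQLite stands in for the database. The sketch folds a source node, a filter node and a derive node into a single SELECT, so the transformation runs where the data lives instead of in the client.

```python
import sqlite3

# Hypothetical workflow nodes as (kind, argument) pairs.
# These names are illustrative only, not SPSS Modeler node types.
workflow = [
    ("source", "customers"),
    ("filter", "age >= 30"),
    ("derive", "income / 1000.0 AS income_k"),
]


def compile_to_sql(nodes):
    """Fold source/filter/derive nodes into one SELECT statement."""
    table, filters, derived = None, [], []
    for kind, arg in nodes:
        if kind == "source":
            table = arg
        elif kind == "filter":
            filters.append(f"({arg})")
        elif kind == "derive":
            derived.append(arg)
        else:
            # Anything else would have to run outside the database.
            raise ValueError(f"node cannot be pushed back: {kind}")
    if table is None:
        raise ValueError("workflow has no source node")
    columns = ", ".join(["*"] + derived)
    sql = f"SELECT {columns} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    return sql


# An in-memory SQLite database plays the role of the source database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, age INTEGER, income REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 25, 40000.0), (2, 35, 52000.0), (3, 50, 90000.0)],
)

sql = compile_to_sql(workflow)
rows = conn.execute(sql).fetchall()  # only filtered, derived rows leave the DB
print(sql)
print(rows)
```

The sketch deliberately ignores ordering subtleties (for example, a filter that refers to a derived column) and database-specific SQL dialects, both of which a real tool would have to handle.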

IBM SPSS Modeler also reorders work streams to maximize the effectiveness of this SQL, particularly by keeping the data in the database. For instance, if several nodes that can be executed in-database are separated by one that cannot, the nodes are reordered, where possible, so that the in-database nodes are grouped together.
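A minimal sketch of this kind of reordering follows. It is an illustration under stated assumptions, not the algorithm SPSS Modeler uses: the `Node` class, its `in_db` and `movable` flags, and the `group_in_db` function are hypothetical, and "where this is possible" is reduced to a simple flag saying a node may safely swap places with its neighbour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    in_db: bool            # can this node be expressed as SQL?
    movable: bool = True   # assumption: safe to swap with a neighbour


def group_in_db(nodes):
    """Move in-database nodes ahead of movable nodes that must run elsewhere."""
    ordered = list(nodes)
    changed = True
    while changed:
        changed = False
        for i in range(len(ordered) - 1):
            left, right = ordered[i], ordered[i + 1]
            # Swap only when a non-database node blocks an in-database one
            # and both nodes are flagged as safe to reorder.
            if not left.in_db and right.in_db and left.movable and right.movable:
                ordered[i], ordered[i + 1] = right, left
                changed = True
    return ordered


stream = [
    Node("read", in_db=True),
    Node("filter", in_db=True),
    Node("custom_script", in_db=False),
    Node("merge", in_db=True),
    Node("build_model", in_db=False),
]

reordered = [n.name for n in group_in_db(stream)]
print(reordered)
```

In this example the `merge` node moves ahead of `custom_script`, so the first three nodes form one contiguous in-database group. A node marked `movable=False` stays where it is and keeps splitting the group, which mirrors the "where this is possible" caveat.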

Read more.


