Data Marts as an indispensable analytical tool
An analytical Data Mart is an effective and user-friendly tool for reporting, analyses, and modeling. Explore how Data Marts can provide a time-saving, less error-prone, and streamlined solution to your business problems.
By Aleksandra Besinska, Algolytics.
Information about services, customers, and transactions may be stored in different database systems and data warehouses, depending on how a company operates.
Because of such arrangements, even the simplest analysis or report may require a significant amount of time, as well as in-depth knowledge of the database systems and their availability.
For an analyst, this situation is a frequent source of difficulties: a lack of the required information, or of time to analyze the source data, may lead to errors in the resulting analyses and, in consequence, to financial losses.
In a situation like this it is convenient to build Data Marts. A Data Mart is a table or a collection of tables containing only the information that analysts need to do their job. This data is pulled from multiple sources, processed in a uniform manner, documented, and optimized.
Data Marts frequently contain data aggregated at the customer level, such as the average number of transactions in the last 6 months, the number of cash loans drawn by the client during the last 12 months, etc. When the computed aggregate values are available, it is much easier to prepare reports.
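As a minimal sketch of what such customer-level aggregation might look like, the Python/pandas snippet below derives example aggregates from a raw transaction table. The table and column names (transactions, customer_id, amount, tx_date) and the 6-month window are hypothetical, not taken from any particular system.

```python
import pandas as pd

# Hypothetical raw transaction table: one row per transaction.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2, 3],
    "amount": [120.0, 80.0, 45.0, 300.0, 15.0, 60.0],
    "tx_date": pd.to_datetime([
        "2023-01-15", "2023-03-02", "2023-02-20",
        "2023-04-11", "2023-05-30", "2022-11-05",
    ]),
})

# Keep only transactions from the last 6 months relative to a reference date.
reference_date = pd.Timestamp("2023-06-01")
recent = transactions[
    transactions["tx_date"] >= reference_date - pd.DateOffset(months=6)
]

# Aggregate to one row per customer: the building block of an analytical Data Mart.
customer_mart = recent.groupby("customer_id").agg(
    tx_count_6m=("amount", "size"),
    avg_amount_6m=("amount", "mean"),
    total_amount_6m=("amount", "sum"),
).reset_index()

print(customer_mart)
```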
Data Marts are built only once, at the start of the analytical process, and are then updated cyclically and automatically, so that they always contain all the relevant information about customers/products/transactions for a given period of time.
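What such a cyclic refresh looks like depends on the tooling in use. As one possible sketch, the script below recomputes a customer-level mart from a raw transaction table and could be scheduled to run daily (e.g., via cron). The SQLite database and the transactions/customer_mart table names are assumptions for illustration only.

```python
import sqlite3

def refresh_customer_mart(db_path="analytics.db"):
    """Recompute the customer-level Data Mart from the raw transaction table.

    Meant to be run on a schedule (e.g., a daily cron job) so the mart always
    reflects the most recent 6-month window. All names are illustrative.
    """
    with sqlite3.connect(db_path) as conn:
        conn.executescript("""
            DROP TABLE IF EXISTS customer_mart;
            CREATE TABLE customer_mart AS
            SELECT customer_id,
                   COUNT(*)    AS tx_count_6m,
                   AVG(amount) AS avg_amount_6m,
                   SUM(amount) AS total_amount_6m
            FROM transactions
            WHERE tx_date >= date('now', '-6 months')
            GROUP BY customer_id;
        """)

if __name__ == "__main__":
    refresh_customer_mart()
```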
Fig. 1. Data Mart in an analytical project – an example
Benefits of using Data Marts
- Streamlining of marketing and sales analyses, report generation, and predictive model creation and management
- Time savings – the analyst does not need to acquire and process data from a number of sources whenever a new model has to be built
- Lower risk of errors in the analysis, which makes the results more credible
- Availability of the most up-to-date analytical information, thanks to cyclic updates of the Data Mart
Practical example of Data Mart application
Building a Data Mart can be especially useful in corporate projects where there are many different distributed data sources and the amount of data is very large. This was the case in one of our projects: the company had a couple of million customers, dozens of tables (data on transactions, customers, products, etc.), and over ten billion records in total.
The data structure was too complicated to build scoring models effectively (for churn prediction, product recommendations, offer targeting, etc.). It was necessary to build an analytical Data Mart so that all the information required to build the models would be available in one place. As a result:
- We significantly reduced data complexity – only one record per client
- Data management during the analytical process became much easier (e.g., the number of calls a customer made to a competitor's Customer Service Center in the last month was available "on the spot" at the client level; see the sketch below)
Moreover, the ability to update the Data Mart on a daily basis enabled us to use it for marketing campaigns. To construct the Data Mart in this project, we used our own AdvancedMiner system.
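The flattening to one record per client mentioned above can be illustrated with a small pandas sketch: several hypothetical source tables (customer master data, transaction aggregates, call-center statistics) are joined into a single wide mart table keyed by customer_id. This is only an illustration of the idea, not the actual AdvancedMiner workflow.

```python
import pandas as pd

# Hypothetical source tables, each keyed by customer_id.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "segment": ["retail", "retail", "business"],
})
tx_aggregates = pd.DataFrame({
    "customer_id": [1, 2],
    "tx_count_6m": [2, 3],
    "avg_amount_6m": [100.0, 120.0],
})
calls_last_month = pd.DataFrame({
    "customer_id": [2, 3],
    "competitor_calls_1m": [4, 1],
})

# Flatten everything into a single wide table: exactly one record per client.
mart = (
    customers
    .merge(tx_aggregates, on="customer_id", how="left")
    .merge(calls_last_month, on="customer_id", how="left")
    .fillna({"tx_count_6m": 0, "avg_amount_6m": 0.0, "competitor_calls_1m": 0})
)

print(mart)
```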
Is one Data Mart enough?
Depending on the business requirements, it may turn out that more than one Data Mart is necessary. Before deciding how many and what kind of Data Marts to build, it is necessary to formulate concrete business requirements. In other words, one must determine what information is needed and what kinds of analyses, models and reports will be made.
Summing up
An analytical Data Mart is an effective and user-friendly tool for reporting, analyses, and modeling. It can also serve as a basis for further development of the ETL process, facilitating advanced analyses such as risk assessment, automation of data quality control, and verification of the effectiveness of deployed analytical models.