A Brief Introduction to the Concept of Data

Every aspiring data scientist must know the concept of data and the kind of analysis they can run. This article introduces the concept of data (quantitative and qualitative) and the types of analysis.



By Angelica Lo Duca, Institute of Informatics and Telematics of the National Research Council

According to the Cambridge Dictionary [1], data is information, especially facts or numbers, collected to be examined and considered and used to help decision-making, or information in an electronic form that can be stored and used by a computer. In other words, data is a set of variables, each of which can be quantitative or qualitative [2,3].

 

Data Types

 


Data can be either quantitative or qualitative. Understanding the difference between the two is very important, because they are treated and analyzed in different ways: for example, you cannot compute numerical statistics such as the mean for qualitative data, and Natural Language Processing techniques are not designed for quantitative data.


 

Quantitative Data

 


Quantitative data are data that can be expressed as numbers, so they can be measured, counted and analysed through statistical computations. Quantitative data can be used to describe and analyze a phenomenon, in order to discover trends, compare differences and make predictions. Quantitative data are often already structured, which makes further analysis fairly easy.

Quantitative data include:

 
1. Continuous data, which can take any numeric value.

Examples of continuous data are:

  • the average temperature over the years (e.g. 35°C or 84.2 °F)
  • price of a product over a month (e.g. $ 23.50 or 45.00 €)

Continuous data usually fall within an interval, which may span both negative and positive values (interval data).

 
2. Discrete data, which can take only certain numeric values.

Examples of discrete data are:

  • the score of an exam (e.g. 18 or 30)
  • shoe size (e.g. 42 EU)

Usually discrete data are equidistant and non-negative (ratio data).

 

Qualitative Data

 


Qualitative data cannot be measured through standard computation techniques, because they express feelings, sensations and experiences. Qualitative data can be used to understand the context around a given phenomenon and discover new aspects. Often, qualitative data are unstructured, thus they require additional techniques to extract meaningful information.

Qualitative data include:

 
1. Nominal data, which are used to label quantities which cannot be measured, without following a specific order. Usually, nominal data group similar objects.

Examples of nominal data include:

  • the languages spoken by a person (e.g. English, Italian, French)
  • Colour palette (e.g. Red, Green)

 
2. Ordinal data, which differ from nominal data only in that they can be ordered.

Examples of ordinal data include:

  • the opinion regarding a given product (e.g. poor quality, medium quality, high quality)
  • the time of day (e.g. morning, afternoon, night)
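The difference between nominal and ordinal data can be sketched in a few lines of Python. This is only an illustration: the sample values and the `QUALITY_RANK` mapping are assumptions made up for the example, not data from any study. Nominal labels support counting and the mode, while ordinal labels additionally support sorting and order-based measures such as the median.

```python
from collections import Counter

# Nominal data: labels with no inherent order (hypothetical sample).
languages = ["English", "Italian", "French", "Italian", "English", "Italian"]

# Counting and the mode are meaningful for nominal data.
language_counts = Counter(languages)
most_common_language = language_counts.most_common(1)[0][0]

# Ordinal data: labels with a meaningful order.
# The order is made explicit through a rank mapping (an assumption of this sketch).
QUALITY_RANK = {"poor quality": 0, "medium quality": 1, "high quality": 2}
opinions = [
    "high quality",
    "poor quality",
    "medium quality",
    "high quality",
    "medium quality",
]

# Sorting by rank makes order-based measures such as the median possible.
sorted_opinions = sorted(opinions, key=QUALITY_RANK.get)
median_opinion = sorted_opinions[len(sorted_opinions) // 2]  # odd-length sample

print(most_common_language)
print(median_opinion)
```

Note that averaging the ranks would silently assume equal distances between the categories, which ordinal data do not guarantee.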

 

Types of Data Analysis

 


The goal of data analysis is to discover hidden trends, patterns and relationships in the data.

According to the data type, different analyses can be performed: quantitative and qualitative analysis for quantitative and qualitative data, respectively.


 

Quantitative Analysis

 


Quantitative analysis [4] applies to quantitative data and includes the classical statistical techniques:

 
1. Descriptive Statistics

Descriptive Statistics [5] analyses the past, by describing the basic features of data.

Descriptive Statistics is based on the calculation of some measures:

  • Frequency (count, percentage)
  • Central tendency (mean, median, mode)
  • Variability (maximum, minimum, range, quartile, variance)
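The measures listed above can be computed with Python's standard `statistics` module. The following is a minimal sketch on a hypothetical list of exam scores; the data are invented for illustration only.

```python
import statistics
from collections import Counter

# Hypothetical exam scores (discrete quantitative data).
scores = [18, 22, 25, 25, 27, 30, 30, 30]

# Frequency: absolute counts and percentages for each distinct value.
counts = Counter(scores)
percentages = {score: 100 * n / len(scores) for score, n in counts.items()}

# Central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Variability.
minimum, maximum = min(scores), max(scores)
value_range = maximum - minimum
quartiles = statistics.quantiles(scores, n=4)  # three cut points: Q1, Q2, Q3
variance = statistics.variance(scores)         # sample variance (n - 1 denominator)

print(mean, median, mode)
print(minimum, maximum, value_range)
print(quartiles, variance)
```

`statistics.variance` returns the sample variance; `statistics.pvariance` would be the choice if the list were the whole population rather than a sample.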

 
2. Inferential Statistics

Inferential Statistics draws conclusions about a wider population from a sample of data, and aims at building predictive models to understand the trend of a given phenomenon.

Inferential Statistics includes the following types of analysis:

  • Hypothesis testing (ANOVA, t-test, …)
  • Confidence interval estimation
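As a rough illustration of both kinds of analysis, the sketch below estimates a confidence interval for a mean and computes Welch's t statistic for two samples, using only the standard library. Several things here are assumptions of the example: the price samples are invented, the function names are hypothetical, and the interval uses a normal approximation rather than the t distribution, which makes it slightly too narrow for small samples.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error


def welch_t_statistic(a, b):
    """Welch's t statistic for the difference between two sample means."""
    var_a = statistics.variance(a) / len(a)
    var_b = statistics.variance(b) / len(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(var_a + var_b)


# Hypothetical prices of the same product in two shops.
prices_shop_a = [23.5, 24.0, 22.8, 23.9, 24.4, 23.1, 23.7, 24.2]
prices_shop_b = [25.1, 24.8, 25.6, 24.9, 25.3, 25.0, 25.7, 24.6]

low, high = mean_confidence_interval(prices_shop_a)
t_stat = welch_t_statistic(prices_shop_a, prices_shop_b)

print(low, high)
print(t_stat)
```

Turning the t statistic into a p-value requires the t distribution, which the standard library does not provide; a library such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) covers that step.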

 

Qualitative Analysis

 


Qualitative analysis [6] exploits qualitative data and tries to understand the context of the data. Since qualitative data cannot be measured directly, the following strategies can be adopted to analyse them:

 
1. Deductive Analysis

In Deductive Analysis, the researcher formulates some a-priori structures or questions to investigate the data. This approach can be used when the researcher has at least a minimal overview of the data.

 
2. Inductive Analysis

Inductive Analysis starts by looking at the data in the hope of extracting some useful information. This kind of analysis is quite time-consuming, since it requires a deep investigation of the data. Inductive Analysis is used when the researcher has little or no prior knowledge of the data.
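The two strategies can be contrasted with a toy example on free-text reviews. This is a simplified sketch, not a full qualitative-analysis method: the reviews, the `CODEBOOK` categories, the `STOPWORDS` list and the function names are all hypothetical. The deductive function starts from categories defined in advance, while the inductive function starts from the text and lets frequent terms suggest candidate themes.

```python
import re
from collections import Counter

# Hypothetical free-text product reviews (qualitative, unstructured data).
reviews = [
    "The price is too high but the quality is great",
    "Fast delivery and great quality",
    "Delivery was slow and the price keeps rising",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# Deductive: an a-priori codebook, defined before reading the data.
CODEBOOK = {
    "cost": {"price", "expensive", "cheap"},
    "logistics": {"delivery", "shipping"},
}


def deductive_coding(texts, codebook):
    """Count how many texts mention each predefined category."""
    counts = Counter()
    for text in texts:
        tokens = set(tokenize(text))
        for category, keywords in codebook.items():
            if tokens & keywords:
                counts[category] += 1
    return counts


# Inductive: no categories up front; frequent terms suggest candidate themes.
STOPWORDS = {"the", "is", "and", "but", "was", "too"}


def inductive_terms(texts, top_n=3):
    """Return the most frequent non-stopword terms across all texts."""
    words = [w for text in texts for w in tokenize(text) if w not in STOPWORDS]
    return Counter(words).most_common(top_n)


category_counts = deductive_coding(reviews, CODEBOOK)
candidate_themes = inductive_terms(reviews)

print(dict(category_counts))
print(candidate_themes)
```

In practice the inductive step is iterative: the researcher reads the frequent terms in context, groups them into themes, and may then reuse those themes as a codebook for a later deductive pass.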

 

Conclusion

 
This article has introduced the basic concept of data, which includes quantitative and qualitative data. Quantitative analysis focuses on numbers, while qualitative analysis focuses on categories. A great deal of effort has gone into both types of analysis, but research in this area is still open.

 

References

 
[1] Cambridge Dictionary: Definition of Data:
https://dictionary.cambridge.org/dictionary/english/data

[2] Qualitative vs Quantitative Data: Definitions, Analysis, Examples:
https://www.intellspot.com/qualitative-vs-quantitative-data/

[3] How to Understand the Quantitative and Qualitative Data in Your Business:
https://laconteconsulting.com/2020/02/14/quantitative-qualitative-data/

[4] Quantitative Data: Definition, Types, Analysis and Examples:
https://www.questionpro.com/blog/quantitative-data/

[5] A Gentle Introduction to Descriptive Analytics:
https://medium.com/analytics-vidhya/a-gentle-introduction-to-descriptive-analytics-8b4e8e1ad238

[6] Qualitative Data – Definition, Types, Analysis and Examples:
https://www.questionpro.com/blog/qualitative-data/

 
Bio: Angelica Lo Duca (Medium) works as a post-doc at the Institute of Informatics and Telematics of the National Research Council (IIT-CNR) in Pisa, Italy. She is Professor of "Data Journalism" for the Master's degree course in Digital Humanities at the University of Pisa. Her research interests include Data Science, Data Analysis, Text Analysis, Open Data, Web Applications and Data Journalism, applied to the fields of society, tourism and cultural heritage. She used to work on Data Security, Semantic Web and Linked Data. Angelica is also an enthusiastic tech writer.
