KDnuggets Home » News » 2021 » Jun » Products, Services » The Data Matters: Choosing the right data to analyze can make or break your analysis ( 21:n22 )

The Data Matters: Choosing the right data to analyze can make or break your analysis


We started Nomad Data to help data scientists and business analysts quickly find the right commercial datasets to match their specific use case. We catalog use cases of data and use machine learning and AI to match analysis goals with datasets.



Sponsored Post.

Nomad Data Components
Photo by Jackson Simmer on Unsplash

We are living in a world where microchips and sensors are being added to devices from televisions to keychains. To say the amount of data being created is exploding doesn’t do justice to the sheer acceleration under way. This data revolution, however, is a double-edged sword. The data that would best serve the analysis you’re trying to do probably exists, but finding it is difficult and getting harder. Choosing the wrong data can doom your analysis from the start.

Imagine you’re working as a data scientist for an auto loan company in late 2017, a few months after Hurricane Maria struck Puerto Rico, devastating the island. Your boss in the risk department asks you to analyze whether people are leaving the island permanently, which would increase the risk that the firm’s auto loans go into default. The first step in tackling the problem is deciding what data is a good proxy for population migration. Using Nomad Data, you would quickly be connected to:

  • Consumer Transaction Data - This allows you to see near real-time consumer spend at stores on the island.
  • Geolocation Data - Using signals from anonymous consumer cell phones, you can see the number of active phones on the island and how that number has changed since before the hurricane.
  • Consumer Credit Data - Using data from the three credit bureaus, you can see if the volume of residents paying their bills has changed since before the hurricane.
  • Airline Manifest Data - Data from online travel agencies shows the volume of airline tickets being purchased to and from the island.

Most data scientists don’t know the different types of external data sources that exist: there are thousands, and new ones are being created daily. Even if they do know, confirming that a given dataset can answer a very particular question requires investigation, which means spending time and money. If you went through and tested the above sources, you would learn that only one of them is unbiased by the infrastructure challenges the island was facing. By searching with Nomad Data, you save valuable time by testing only the most relevant data.
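
As a rough illustration of what testing one of these sources might involve, the sketch below compares a monthly series against its pre-hurricane average. The function name `change_vs_baseline`, the `"YYYY-MM"` keys, and the index values are all hypothetical placeholders chosen for the example, not real measurements and not part of any Nomad Data product.

```python
from statistics import mean


def change_vs_baseline(series, event_month):
    """Percent change of each month from event_month onward,
    relative to the average of the months before event_month.

    series: dict mapping "YYYY-MM" strings to numeric values.
    event_month: "YYYY-MM" string; earlier months form the baseline.
    """
    baseline_values = [v for m, v in series.items() if m < event_month]
    if not baseline_values:
        raise ValueError("no pre-event months to form a baseline")
    baseline = mean(baseline_values)
    if baseline == 0:
        raise ValueError("baseline is zero; percent change is undefined")
    return {
        month: round(100.0 * (value - baseline) / baseline, 1)
        for month, value in sorted(series.items())
        if month >= event_month
    }


# Hypothetical index of active phones (placeholder numbers, NOT real data).
active_phones = {
    "2017-07": 1000,
    "2017-08": 1000,
    "2017-09": 800,
    "2017-10": 700,
    "2017-11": 750,
}

changes = change_vs_baseline(active_phones, "2017-09")
for month, pct in changes.items():
    print(month, pct)
```

The same function could be run on each candidate series in turn. A drop in any single series does not by itself show that people have left: a source that depends on local power or network infrastructure can fall for reasons unrelated to migration, which is why the choice of dataset matters.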