
Top 10 Data Analysis Tools for Business


Ten free, easy-to-use, and powerful tools to help you analyze and visualize data, analyze social networks, do optimization, search more efficiently, and solve your data analysis problems.



By Alex Jones, June 2014.

As a graduate student in Business Analytics, I have spent the better part of a year becoming a giant nerd, Excel wizard, big data ninja, data scientist, and predictive analytics architect. While the skills I have developed have been invaluable, taking a year of computer science, advanced mathematics, engineering, and business classes is simply not feasible for most people.

Although the challenge of collecting and analyzing "Big Data" requires some complex and technical solutions, the fact is that most businesses do not realize what they are already capable of.

Specifically, there are a number of exceptionally powerful analytical tools that are free and open source that you can leverage today to enhance your business and develop skills that can genuinely propel your career.

Rather than leave you to navigate the giant and frightening world of IT tools and software, I have put together a list of what I see as the Top 10 Data Analysis Tools for Business. I picked these because of their free availability (for personal use), ease of use (no coding and intuitively designed), powerful capabilities (beyond basic Excel), and well-documented resources (if you get stuck, you can Google your way through).

  1. Tableau Public: Tableau democratizes visualization in an elegantly simple and intuitive tool. It is exceptionally powerful in business because it communicates insights through data visualization. Although great alternatives exist, Tableau Public's million row limit provides a great playground for personal use and the free trial is more than long enough to get you hooked. In the analytics process, Tableau's visuals allow you to quickly investigate a hypothesis, sanity check your gut, and just go explore the data before embarking on a treacherous statistical journey.

  2. OpenRefine: Formerly Google Refine, OpenRefine is data cleaning software that gets your data ready for analysis. What do I mean by that? Let's look at an example. Recently, I was cleaning up a database of chemical names and noticed that rows had different spellings, capitalization, spacing, etc., which made the data very difficult for a computer to process. Fortunately, OpenRefine contains a number of clustering algorithms (which group together similar entries) and makes quick work of an otherwise messy problem.
    **Tip- Increase the Java heap space to run large files (Google the tip for exact instructions!)

  3. KNIME: KNIME allows you to manipulate, analyze, and model data in an incredibly intuitive way through visual programming. Essentially, rather than writing blocks of code, you drop nodes onto a canvas and drag connection points between activities. More importantly, KNIME can be extended to run R, Python, text mining, chemistry data, etc., which gives you the option to dabble in more advanced, code-driven analysis.
    **Tip- Use "File Reader" instead of "CSV Reader" for CSV files. Strange quirk of the software.

  4. RapidMiner: Much like KNIME, RapidMiner operates through visual programming and is capable of manipulating, analyzing, and modeling data. Most recently, RapidMiner won the KDnuggets software poll, demonstrating that data science does not need to be a counter-intuitive coding endeavor.

  5. Google Fusion Tables: Meet Google Spreadsheets' cooler, larger, and much nerdier cousin. Google Fusion Tables is an incredible tool for data analysis, large data-set visualization, and mapping. Not surprisingly, Google's incredible mapping software plays a big role in pushing this tool onto the list. Take for instance this map, which I made to look at oil production platforms in the Gulf of Mexico. With just a quick upload, Google Fusion Tables recognized the latitude and longitude data and got to work.

  6. NodeXL: NodeXL is visualization and analysis software for networks and relationships. Think of the giant friendship maps you see that represent LinkedIn or Facebook connections. NodeXL takes that a step further by providing exact calculations. If you're looking for something a little less advanced, check out the node graph on Google Fusion Tables, or for a little more visualization try out Gephi.

  7. Import.io: Web scraping and pulling information off of websites used to be something reserved for the nerds. Now with Import.io, everyone can harvest data from websites and forums. Simply highlight what you want and in a matter of minutes Import.io walks you through and "learns" what you are looking for. From there, Import.io will dig, scrape, and pull data for you to analyze or export.

  8. Google Search Operators: Google is an undeniably powerful resource, and search operators take it a step further. Operators essentially allow you to quickly filter Google results to get to the most useful and relevant information. For instance, say you're looking for a data science report published this year by ABC Consulting. If we presume that the report will be a PDF, we can search
    "Data Science Report" site:ABCConsulting.com filetype:pdf

    then, underneath the search bar, use "Search Tools" to limit the results to the past year. The operators can be even more useful for discovering new information or doing market research.

  9. Solver: Solver is an optimization and linear programming tool in Excel that allows you to set constraints (don't spend more than this many dollars, be completed in that many days, etc.). Although advanced optimization may be better suited to another program (such as R's optim function), Solver will make quick work of a wide range of problems.

  10. WolframAlpha: Wolfram Alpha's search engine is one of the web's hidden gems and helps to power Apple's Siri. Beyond snarky remarks, Wolfram Alpha is the nerdy Google: it provides detailed responses to technical searches and makes quick work of calculus homework. For business users, it presents information in charts and graphs, and is excellent for high-level pricing history, commodity information, and topic overviews.
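For readers curious about what the clustering mentioned under OpenRefine (item 2) does with messy names, here is a minimal Python sketch of the general "key collision" idea: normalize each entry into a key, then group entries whose keys match. It is loosely modeled on that idea and is not OpenRefine's actual implementation; the function names and the sample chemical names are invented for illustration.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Build a normalized key so near-duplicate spellings collide."""
    # Strip accents, lowercase, and trim surrounding whitespace.
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii").lower().strip()
    # Replace punctuation with spaces, then split into tokens.
    tokens = re.sub(r"[^\w\s]", " ", text).split()
    # Sort and de-duplicate tokens so word order does not matter.
    return " ".join(sorted(set(tokens)))


def cluster(values):
    """Group raw values that share the same fingerprint key."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    # Keep only keys that have more than one distinct spelling.
    return {key: vals for key, vals in groups.items() if len(set(vals)) > 1}


# Hypothetical chemical names, invented for illustration.
names = [
    "Sodium Chloride",
    "sodium chloride ",
    "Chloride, Sodium",
    "Ethanol",
    "ethanol",
    "Acetone",
]
clusters = cluster(names)
for key, variants in clusters.items():
    print(key, "<-", variants)
```

Each printed group is a set of spellings that probably refer to the same thing, which you could then merge into one canonical value. A key like this only catches differences in case, spacing, punctuation, and word order; genuine misspellings need a fuzzier comparison.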
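The kind of constrained problem described under Solver (item 9) can also be sketched in a few lines of Python. The example below is not how Solver works internally: it simply tries every whole-number combination and keeps the best one that respects a budget and a deadline, which is only practical for very small problems. All figures and names are hypothetical.

```python
from itertools import product

# Hypothetical figures: two kinds of projects, each unit with a cost,
# a duration, and a payoff.
COST = {"a": 400, "b": 700}      # dollars per unit
DAYS = {"a": 2, "b": 3}          # days per unit
PAYOFF = {"a": 500, "b": 900}    # value per unit
BUDGET = 5000                    # don't spend more than this many dollars
DEADLINE = 24                    # be completed in this many days


def best_plan(max_units: int = 20):
    """Exhaustively search unit counts; return (payoff, units_a, units_b)."""
    best = (0, 0, 0)
    for a, b in product(range(max_units + 1), repeat=2):
        # Skip combinations that break either constraint.
        if a * COST["a"] + b * COST["b"] > BUDGET:
            continue
        if a * DAYS["a"] + b * DAYS["b"] > DEADLINE:
            continue
        payoff = a * PAYOFF["a"] + b * PAYOFF["b"]
        if payoff > best[0]:
            best = (payoff, a, b)
    return best


plan = best_plan()
print("payoff, units of a, units of b:", plan)
```

In Solver you would express the same thing by pointing the objective at the payoff cell, the variable cells at the two unit counts, and adding the budget and deadline as constraints.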

One of my favorite data related quotes is:

Data matures like wine, applications like fish

---James Governor, Founder of RedMonk.

Although these tools make analysis easier, they're only as valuable as the information you put in and the analysis that you conduct. So take a moment to learn a few new tricks, challenge yourself, and let these tools enhance and complement the logic and reasoning skills that you already have.


Alex Jones is a Graduate Student at U. Texas McCombs School of Business.
