- We Don’t Need Data Engineers, We Need Better Tools for Data Scientists - May 7, 2021.
In today's data science jobs landscape, a variety of roles are being filled from specialized engineering positions to the more generalized data scientist. However, is it possible that some of these job types are duplicative or misdirected, such as that of the Data Engineer, which might exist as we know it because of a lack of adequate tooling for Data Scientists?
- Improving model performance through human participation - Apr 23, 2021.
Certain industries, such as medicine and finance, are sensitive to false positives. Using human input in the model inference loop can increase the final precision and recall. Here, we describe how to incorporate human feedback at inference time, so that Machines + Humans = Higher Precision & Recall.
- Data Observability, Part II: How to Build Your Own Data Quality Monitors Using SQL - Feb 23, 2021.
Using schema and lineage to understand the root cause of your data anomalies.
- Data Observability: Building Data Quality Monitors Using SQL - Feb 16, 2021.
To trigger an alert when data breaks, data teams can leverage a tried and true tactic from our friends in software engineering: monitoring and observability. In this article, we walk through how you can create your own data quality monitors for freshness and distribution from scratch using SQL.
- Data Catalogs Are Dead; Long Live Data Discovery - Dec 28, 2020.
Why data catalogs aren’t meeting the needs of the modern data stack, and how a new approach – data discovery – is needed to better facilitate metadata management and data reliability.
- A Tour of End-to-End Machine Learning Platforms - Jul 29, 2020.
An end-to-end machine learning platform needs a holistic approach. If you’re interested in learning more about a few well-known ML platforms, you’ve come to the right place!
- What I learned from looking at 200 machine learning tools - Jul 21, 2020.
While hundreds of machine learning tools are available today, the ML software landscape may still be underdeveloped with more room to mature. This review considers the state of ML tools, existing challenges, and which frameworks are addressing the future of machine learning software.
- Data Science Tools Popularity, animated - Jun 25, 2020.
Watch the evolution of the top 10 most popular data science tools based on KDnuggets software polls from 2000 to 2019.
- Lynx Analytics is open-sourcing LynxKite, its Complete Graph Data Science Platform - Jun 25, 2020.
Check out this article for a brief summary of what LynxKite is, where it comes from, and how it can help with your data science projects.
- Count, the data notebook everyone can use - Jun 9, 2020.
Dashboards have been the primary weapon of choice for distributing data over the last few decades, but they have brought with them a new set of problems. To increasingly democratise access to data we need to think again.
- State of the Machine Learning and AI Industry - Apr 16, 2020.
Enterprises are struggling to launch machine learning models that encapsulate the optimization of business processes. These are now the essential components of data-driven applications and AI services that can improve legacy rule-based business processes, increase productivity, and deliver results. In the current state of the industry, many companies are turning to off-the-shelf platforms to increase expectations for success in applying machine learning.
- Domino named a Visionary in Gartner Magic Quadrant for completeness of vision and ability to execute - Mar 10, 2020.
From a product perspective, we believe three aspects of the Domino platform, in particular, are foundational to earning this illustrious moniker: openness, collaboration, and reproducibility.
- Leaders, Changes, and Trends in Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms - Feb 24, 2020.
The Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms has the largest number of leaders ever. We examine the leaders and changes and trends vs previous years.
- Manual Coding or Automated Data Integration – What’s the Best Way to Integrate Your Enterprise Data? - Aug 19, 2019.
What’s the best way to execute your data integration tasks: writing manual code or using an ETL tool? Find out the approach that best fits your organization’s needs and the factors that influence it.
- KDnuggets™ News 19:n22, Jun 12: The Modern Open-Source Data Science/Machine Learning Ecosystem; Simplifying the Data Visualisation Process in Python - Jun 12, 2019.
The 6 tools in the modern open-source Data Science ecosystem; Simplifying the Data Visualisation Process in Python; The Infinity Stones of Data Science; Best resources for developers transitioning into data science.
- KDnuggets™ News 19:n21, Jun 5: Transitioning your Career to Data Science; 11 top Data Science, Machine Learning platforms; 7 Steps to Mastering Intermediate ML w. Python - Jun 5, 2019.
The results of KDnuggets 20th Annual Software Poll; How to transition to a Data Science career; Mastering Intermediate Machine Learning with Python; Understanding Natural Language Processing (NLP); Backprop as applied to LSTM, and much more.
- Make better decisions with data in every corner of your business - Mar 22, 2019.
Mode is the data science platform that helps you get data in every corner of your business and create a single source of truth. Free your data science team, automate everything, and create a single source of truth.
- Gainers, Losers, and Trends in Gartner 2019 Magic Quadrant for Data Science and Machine Learning Platforms - Feb 11, 2019.
We compare Gartner 2019 MQ for Data Science, Machine Learning Platforms to its previous versions and identify notable changes for leaders and challengers, including RapidMiner, KNIME, TIBCO, Alteryx, Dataiku, SAS, and MathWorks.
- Get the latest analyst research on data science platforms - Jan 30, 2019.
Access a complimentary copy of the Gartner 2019 Magic Quadrant for Data Science and Machine-Learning Platforms to discover the latest trends and see why Dataiku was named a "Challenger" in the industry.
- SQL, Python, and R in One Platform - Nov 27, 2018.
Stop jumping between applications. Get a complete analytical toolkit.
- The 2018 Data Scientist Report is Here - Aug 23, 2018.
Learn about the data and tools that data scientists are working with in 2018, ethical issues around AI, algorithmic bias, job satisfaction, and more.
- Top KDnuggets tweets, Aug 15-21: How to Set Up a Free Data Science Environment on Google Cloud - Aug 22, 2018.
Also: Unveiling Mathematics Behind XGBoost; Causation in a Nutshell; Introduction to Fraud Detection Systems.
- KDnuggets™ News 18:n21, May 23: Python eats away at R; Top 2018 Analytics, Data Science, Machine Learning tools; 9 Must-have skills for a Data Scientist - May 23, 2018.
Also: How to Implement a YOLO (v3) Object Detector from Scratch in PyTorch; Frameworks for Approaching the Machine Learning Process.
- Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018: Trends and Analysis - May 22, 2018.
Python continues to eat away at R, RapidMiner gains, SQL is steady, TensorFlow advances, pulling along Keras, Hadoop drops, Data Science platforms consolidate, and more.
- KNIME US Data Science Roadshow 2018 - Mar 13, 2018.
The Data Science Learnathon is coming to the US, coast to coast. Learn how to use open source, GUI driven KNIME Analytics Platform. We’ll provide datasets, jump-start workflows, solutions, and of course data science experts. Sign-up now.
- KDnuggets™ News 18:n09, Feb 28: Gartner 2018 MQ for Data Science/ML – Gainers and Losers; Comparative Analysis of Top 6 BI/Data Viz Tools - Feb 28, 2018.
A Comparative Analysis of Top 6 BI and Data Visualization Tools; A Tour of The Top 10 Algorithms for Machine Learning Newbies; A Guide to Hiring Data Scientists.
- Gainers and Losers in Gartner 2018 Magic Quadrant for Data Science and Machine Learning Platforms - Feb 27, 2018.
We compare Gartner 2018 Magic Quadrant for Data Science, Machine Learning Platforms vs its 2017 version and identify notable changes for leaders and challengers, including IBM, SAS, RapidMiner, KNIME, Alteryx, H2O.ai, and Domino.
- Gartner 2018 Magic Quadrant for Data Science and Machine Learning – Read the report - Feb 23, 2018.
Read Gartner 2018 Magic Quadrant for Data Science and Machine Learning Platforms, courtesy of Domino, and learn which data science platform is right for your organization and why Domino was named a Visionary.
- How (& Why) Data Scientists and Data Engineers Should Share a Platform - Nov 17, 2017.
Sharing one platform has some obvious benefits for Data Science and Data Engineering teams, but technical, language, and process differences often make this difficult. Learn how one company implemented a single cloud platform for R, Python, and other workloads, and some of the unexpected benefits they discovered along the way.
- KDnuggets™ News 17:n40, Oct 18: Want to Become a Data Scientist? Read This!; Natural Stupidity is more Dangerous than Artificial Intelligence - Oct 18, 2017.
Want to Become a Data Scientist? Read This Interview First; Natural Stupidity is more Dangerous than Artificial Intelligence; Random Forests(r), Explained; Key Trends and Takeaways from RE-WORK Deep Learning Summit Montreal; An Overview of 3 Popular Courses on Deep Learning
- [Webinar] Data Science for Big Data with Anaconda Enterprise, Oct 19 - Oct 12, 2017.
This Team Anaconda webinar, Oct 19, will demonstrate how easily the Anaconda Enterprise data science platform integrates with Hadoop and Spark clusters, giving your data scientists access to the libraries they need and empowering you to extract the most value from your Big Data.
- Introducing R-Brain: A New Data Science Platform - Oct 11, 2017.
R-Brain is a next-generation platform for data science built on top of JupyterLab with Docker. It supports not only R but also Python and SQL, and has integrated IntelliSense, debugging, packaging, and publishing capabilities.
- An opinionated Data Science Toolbox in R from Hadley Wickham, tidyverse - Oct 10, 2017.
Get your productivity boosted with Hadley Wickham's powerful R package, tidyverse. It has all you need to start developing your own data science workflows.
- Putting Machine Learning in Production - Sep 22, 2017.
In machine learning, going from research to production environment requires a well designed architecture. This blog shows how to transfer a trained model to a prediction server.
- Cool Vendor status for CrowdFlower means SF best ice cream for you - Sep 19, 2017.
Schedule a time to see a demo of the CrowdFlower platform and see how we empower data scientists to train, test, and tune machine learning for a human world. We’ll hook you up with some of San Francisco's best ice cream!
- Unveiling Anaconda Enterprise 5, Sep 19 Webinar - Sep 15, 2017.
Want to see the new Anaconda Enterprise 5 features in action? Register now for our new webinar, Unveiling Anaconda Enterprise 5—The Enterprise-Ready Data Science Platform.
- Python overtakes R, becomes the leader in Data Science, Machine Learning platforms - Aug 28, 2017.
While Python did not "swallow" R, in 2017 the Python ecosystem overtook R as the leading platform for Analytics, Data Science, and Machine Learning, and it is pulling users from other platforms.
- New Poll: Python vs R vs rest: What did you use in 2016-17 for Analytics, Data Science, Machine Learning tasks? - Aug 15, 2017.
Python vs R vs Other - What did you use for Analytics, Data Science, Machine Learning work in 2016-17? Vote and we will analyze and report results and trends.
- KDnuggets™ News 17:n21, May 31: Python Machine Learning Workflows from Scratch; Machine Learning Crash Course - May 31, 2017.
Machine Learning Workflows in Python from Scratch Part 1: Data Preparation; Machine Learning Crash Course: Part 1; An Introduction to the MXNet Python API; How A Data Scientist Can Improve Productivity; Data science platforms are on the rise and IBM is leading the way
- Data science platforms are on the rise and IBM is leading the way - May 25, 2017.
Download the 2017 Gartner Magic Quadrant for Data Science Platforms today to learn why IBM is named a leader in data science and to find out why data science, analytics, and machine learning are the engines of the future.
- Data Science & Machine Learning Platforms for the Enterprise - May 8, 2017.
A resilient Data Science Platform is a necessity for every centralized data science team within a large corporation. It helps them centralize, reuse, and productionize their models at petascale.
- New Poll: What software you used for Analytics, Data Mining, Data Science, Machine Learning projects in the past 12 months? - May 5, 2017.
Vote in KDnuggets 18th Annual Poll: What software you used for Analytics, Data Mining, Data Science, Machine Learning projects in the past 12 months? We will clean, analyze, visualize, and publish the results.
- DataScience.com New Update Aims to Be Industry-Leading Enterprise Data Science Platform - May 4, 2017.
DataScience.com’s enterprise data science platform can now be deployed on-premises or in the cloud. New features include scalable infrastructure, intuitive project organization, and task automation.
- Dataiku: The Complete Data Sheet - Apr 20, 2017.
Whether your everyday tool is Scala, Python, R, or Excel, you can now use one tool, Dataiku, to transform raw data into predictions without the hassle. Discover the platform!
- KDnuggets™ News 17:n15, Apr 19: Forrester vs Gartner on Data Science/Analytics Platforms; 5 Machine Learning Projects You Can No Longer Overlook - Apr 19, 2017.
Also: Top mistakes data scientists make when dealing with business people; New Online Data Science Tracks for 2017; Cartoon: Why AI needs help with taxes.
- Forrester vs Gartner on Data Science Platforms and Machine Learning Solutions - Apr 14, 2017.
Who leads in Data Science, Machine Learning, and Predictive Analytics? We compare the latest Forrester and Gartner reports for this industry for 2017 Q1, identify gainers and losers, and strong leaders vs contenders.
- Webinar: R with RStudio, Spark, sparklyr in Minutes, April 26 - Apr 11, 2017.
Find out how to expand R capabilities with RStudio + sparklyr on Apache Spark on a fast cloud platform, and how simple it is to get started in the cloud with Cazena Data Science Sandbox as a Service.
- Gartner Data Science Platforms – A Deeper Look - Mar 3, 2017.
Thomas Dinsmore's critical examination of the Gartner 2017 MQ for Data Science Platforms, including which vendors are out, in, or have big changes, Hadoop and Spark integration, open source software, and what Data Scientists actually use.
- Gartner 2017 Magic Quadrant for Data Science Platforms: gainers and losers - Feb 23, 2017.
We compare Gartner 2017 Magic Quadrant for Data Science Platforms vs its 2016 version and identify notable changes for leaders and challengers, including IBM, SAS, RapidMiner, KNIME, MathWorks, Microsoft, and Quest.
- KDnuggets™ News 17:n06, Feb 15: So What is Big Data? 52 Useful Machine Learning APIs; Data Science finds Perfect Valentines Dates - Feb 15, 2017.
Also: Making Python Speak SQL with pandasql; 52 Useful Machine Learning & Prediction APIs, updated; New Poll: Do you support Trump Immigration Ban?
- Forrester Study: Companies Using Data Science Platforms Are Surpassing The Competition - Feb 8, 2017.
Companies that regularly exceed shareholder expectations have something in common: 88% of them use a fully functional platform to do data science work. Get the white paper from Forrester to learn more.
- R-Brain Platform for Data Science: R, Python, sharing, security, and marketplace - Dec 7, 2016.
R-Brain IDE enables data scientists to use both R and Python with full language support. It enables sharing and has a marketplace for models. Try it free.
- R, Python Duel As Top Analytics, Data Science software – KDnuggets 2016 Software Poll Results - Jun 6, 2016.
R remains the leading tool, with 49% share, but Python grows faster and almost catches up to R. RapidMiner remains the most popular general Data Science platform. Big Data tools used by almost 40%, and Deep Learning usage doubles.
- Poll: What software you used for Analytics, Data Mining, Data Science, Machine Learning projects in the past 12 months? - May 14, 2016.
Vote in KDnuggets 17th Annual Poll: What software you used for Analytics, Data Mining, Data Science, Machine Learning projects in the past 12 months? We will clean and analyze the results and publish our analysis afterwards.
- Angoss 9.6 Data Science Software Suite - Apr 29, 2016.
Angoss software provides users with comprehensive scorecard building functionality that is fast, reliable, accurate, and business centric.
- KDnuggets™ News 16:n14, Apr 20: Top 15 Frameworks for Machine Learning Experts; How to Grow Your Own Data Scientists - Apr 20, 2016.
Top 15 Frameworks for Machine Learning Experts; How to Grow Your Own Data Scientists; Association Rules and the Apriori Algorithm: A Tutorial; Automated Machine Learning: Changing the Game.
- Salford Predictive Modeler 8: Faster. More Machine Learning. Better results - Apr 4, 2016.
Take a giant step forward with SPM 8: download the just-released version 8, try it for yourself, and get better results.
- Exclusive Interview with Alexander Gray, Skytree CEO: Fast, Automated, Machine Learning Software for Free? - Mar 17, 2016.
We discuss how Skytree compares with the competition, how it performs relative to expert Data Scientists, how Skytree Automodel compares to Deep Learning, and more.
- Automated Data Science and Data Mining - Mar 4, 2016.
Automated Data Science is becoming more popular. Here is our initial list of automated Data Science and Data Mining platforms.
- New Salford Predictive Modeler 8 - Mar 1, 2016.
Salford Predictive Modeler software suite: Faster. More Comprehensive Machine Learning. More Automation. Better results. Take a giant step forward in your data science productivity with SPM 8. Download and try it today!
- Hackerday – Stay Updated in your Career through Hands-On Projects - Nov 19, 2015.
Hackerday is a platform at DeZyre that lets you come together as a group to code and work on day-long hackathons, guided by an industry expert as you code. Next Hackerday session Nov 21.
- FlyElephant supports R, Python, and public API - Nov 9, 2015.
FlyElephant is the Platform-as-a-Service for data analysis and simulations of processes. It supports elastic multi-core systems, HPC and GPU clusters, R, Python, and more. Meet CEO Dmitry Spodarets in Silicon Valley until Nov 12 and in Austin, Nov 13 to Nov 20.
- Dataiku Data Science Studio, now also runs on Apache Spark - Sep 29, 2015.
Dataiku Data Science Studio version 2.1 has many useful features for Data Scientists, including integration with Apache Spark.
- Anaconda Data Science Platform for R, Python, or both - Aug 18, 2015.
Got R, Python, or both? Download conda, the leading package and environment manager for data science, which works with both R and Python packages.
- Dataiku Data Science Studio – intuitive solution for data professionals - Jul 8, 2015.
Data Science Studio (DSS) from Dataiku is an intuitive software solution that lets data professionals harness the power of big data. The latest version, DSS 2.0, brings predictive analytics to a whole new level in terms of collaboration and usability.
- Poll: What Predictive Analytics, Data Mining, Data Science software/tools you used in the past 12 months? - May 7, 2015.
Vote in KDnuggets 16th Annual Poll: What Analytics, Data Mining, Data Science software/tools you used in the past 12 months for a real project? We will clean and analyze the results and publish our trend analysis afterwards.
- New Poll: Computing platform for your analytics, data mining, data science work or research - Mar 14, 2015.
New KDnuggets Poll is asking: What computing platform you use for analytics, data mining, data science work or research? Please vote.
- Wrangling Public Bike Share Data with The Free Trial of Trifacta - Mar 6, 2015.
A free trial of Trifacta is a good opportunity for data analysts to start wrangling data sets of different shapes and sizes. We give an example of wrangling Bay Area Bike Share data to better understand biking around San Francisco.
- BigML machine learning platform Winter 2015 Release, Feb 11 - Feb 2, 2015.
See the latest in BigML's continuously evolved machine learning platform with its emphasis on consumability, programmability, and scalability. Feb 11 webinar at 9 am PT and 5 pm PT.
- RapidMiner Academia makes RapidMiner freely available to students, academics worldwide - Jan 20, 2015.
Students, professors, and researchers can now use the latest version of the RapidMiner platform free of charge for education and academic research.
- Top KDnuggets tweets, Jan 14-15: 10 FB likes predicts personality better than a co-worker; A Deep Dive into Recurrent Neural Nets - Jan 16, 2015.
A Deep Dive into Recurrent Neural Nets #DeepLearning; SOASTA announces #DataScience Workbench for insights from user experience; What's Wrong with this Picture? The Art of Honest Visualizations; Deep Learning can be easily fooled.
- KDnuggets™ News 15:n01, Jan 7: Clever methods of overfitting; 5 Analytics Rules to cut thru the Hype - Jan 7, 2015.
11 Clever Methods of Overfitting and how to avoid them, Data Mining and Text Analytics of World Cup 2014, iMath Cloud Data Science Platform beta, Platfora CEO on Insightful Analytics for Big Data, and more analytics, big data, data science, and data mining stories.
- iMath Cloud Data Science Platform beta - Jan 6, 2015.
iMathResearch presents a Data Science platform, offering development in Python, R or Octave, cloud-based collaboration, private computational instances and visualization from the browser.
- iMathCloud, Python Data Science Platform - Nov 10, 2014.
iMathResearch presents its first tool for big data analysis, offering easy access to computational tools, a simple Python-based interface, cloud-based collaboration, and private computational instances.
- Making Sense of Public Data – Wrangling Jeopardy - Oct 7, 2014.
Trifacta’s Alon Bartur & Will Davis detail their process for transforming or “wrangling” publicly available Jeopardy data found on the web for downstream analysis.
- Dear CIO, what you have is NOT a Data Lake - Jul 17, 2014.
Data Lakes are often the ideal structure of a company's big data, but the reality is that data is often split into data puddles. Xurmo seeks to eliminate this by integrating Data Virtualization into the Data Lake.
- Domino – A Platform For Modern Data Analysis - Jun 26, 2014.
Tools that facilitate data science best practices have not yet matured to match their counterparts in the world of software engineering. Domino is a platform built from the ground up to fill in these gaps and accelerate modern analytical workflows.
- Data Lakes vs Data Warehouses - Jun 7, 2014.
Data Warehouses, traditionally popular for business intelligence tasks, are being replaced by less-structured Data Lakes which allow more flexibility.
- Top KDnuggets tweets, May 7-8: 30 Simple Tools for Data Visualization; Did Target Really Predict Pregnancy? - May 9, 2014.
30 Simple Tools for Data and Geo-Visualization: iCharts, Fusion, Modest Maps, Raw ...; Did Target Really Predict a Teen's Pregnancy? The Inside Story; Sense, new Data Science startup, builds a Data Science Platform of the Future; Analytics Experts on #BigData Misconceptions.
- Top KDnuggets tweets, Mar 21-23: Machine Learning in Parallel with SVM; Good Data Sets for Data Science Practice - Mar 24, 2014.
Machine Learning in Parallel with SVM, GLM; Good Data Sets for Data Science Practice: Big enough, requires data engineering, rich; Cartoon: Why Madame Zaza, Fortune Teller, changes to Predictive Analytics; Top 45 #BigData Tools and Platforms for Developers
- Aunsight: New Data Science Platform from Aunalytics - Feb 5, 2014.
Aunsight, a powerful new Data Science Platform, will let data scientists easily design and customize powerful workflows, integrate disparate data sources, add new algorithms, and focus on solving big data problems.