- How to think like a data scientist to become one - Mar 23, 2017.
The author went from securities analyst to Head of Data Science at Amazon. He describes what he learned in his journey and gives 4 useful rules based on his experience.
- KDnuggets™ News 17:n11, Mar 22: 50 Companies Leading The AI Revolution; 17 More Must-Know Data Science Q&A, part 3 - Mar 22, 2017.
Also 7 Types of Data Scientist Job Profiles; Email Spam Filtering: An Implementation with Python and Scikit-learn.
- The Most Underutilized Function in SQL - Mar 20, 2017.
Find out why md5() is an SQL function that's used surprisingly often, and find out how -- and why -- you can use it yourself.
- Grunion, Query Optimization Tool for Data Science and Big Data - Mar 14, 2017.
Grunion is a patent-pending query optimization, translation, and federation framework built to help bridge the gap between data science and data engineering teams. Read more to request access.
- KDnuggets™ News 17:n06, Feb 15: So What is Big Data? 52 Useful Machine Learning APIs; Data Science finds Perfect Valentines Dates - Feb 15, 2017.
Also Making Python Speak SQL with pandasql; 52 Useful Machine Learning & Prediction APIs, updated; New Poll: Do you support Trump Immigration Ban?
- Making Python Speak SQL with pandasql - Feb 8, 2017.
Want to wrangle Pandas data like you would SQL using Python? This post serves as an introduction to pandasql, and details how to get it up and running inside of Rodeo.
- A Funny Look at Big Data and Data Science - Dec 27, 2016.
A less than serious look at Big Data and Data Science. If you can laugh at all cartoons, then your Data Science skills are in good shape.
- How to Make Your Database 200x Faster Without Having to Pay More - Nov 22, 2016.
Waiting a long time for a BI query to execute? It’s frustrating, and it’s a major bottleneck in the day-to-day life of a Data Analyst or BI expert. Learn some easy-to-use solutions, with a clear explanation of why to use them, along with other advanced technological solutions.
- Evaluating HTAP Databases for Machine Learning Applications - Nov 2, 2016.
Businesses are producing a greater number of intelligent applications, which traditional databases are unable to support. A new class of databases, Hybrid Transactional and Analytical Processing (HTAP) databases, offers a variety of capabilities with specific strengths and weaknesses to consider. This article aims to give application developers and data scientists a better understanding of the HTAP database ecosystem so they can make the right choice for their intelligent application.
- Top KDnuggets tweets, Sep 28-Oct 4: 7 Steps to Mastering SQL for #DataScience; Biggest Issues in #DataScience - Oct 5, 2016.
7 Steps to Mastering SQL for #DataScience; New Andrew Ng #MachineLearning #Book Under Construction, #Free Draft Chapters; Top #DataScientist Claudia Perlich on Biggest Issues in #DataScience; Awesome Public Datasets on GitHub
- O’Reilly Live Training–Real-time. Real experts. Real learning. - Sep 26, 2016.
Get intensive, hands-on training from O'Reilly's expert network on critical data topics - from SQL fundamentals to distributed computing; enterprise strategy to data science at scale.
- Doing Statistics with SQL - Aug 2, 2016.
This post covers how to perform some basic in-database statistical analysis using SQL.
- Database Key Terms, Explained - Jul 28, 2016.
Interested in a survey of important database concepts and terminology? This post defines 16 essential database key terms concisely and accurately.
- 5 Big Data Projects You Can No Longer Overlook - Jul 21, 2016.
Check out 5 Big Data projects that you are not likely to have seen before, but which may be useful to you, and perhaps even scratch an itch you didn't know you had.
- KDnuggets™ News 16:n22, Jun 22: Data Science Blog Contest; Free Machine Learning Ebook; Master SQL for Data Science - Jun 22, 2016.
Data Science Blog Contest; New Free Andrew Ng Machine Learning Book Under Construction; 7 Steps to Mastering SQL for Data Science; A Visual Explanation of the Back Propagation Algorithm; Mining Twitter Data with Python Part 1: Collecting Data
- 7 Steps to Mastering SQL for Data Science - Jun 16, 2016.
Follow these 7 steps to go from SQL data science newbie to seasoned practitioner quickly. No nonsense, just the necessities.
- Morpace: SQL Programmer - Jun 10, 2016.
Seeking an SQL Programmer to design, implement and maintain a relational database and reporting system. Will collaborate with other programmers and cross-functional teams to assist in designing and advancing the system in an agile environment.
- R, Python Duel As Top Analytics, Data Science software – KDnuggets 2016 Software Poll Results - Jun 6, 2016.
R remains the leading tool, with 49% share, but Python grows faster and almost catches up to R. RapidMiner remains the most popular general Data Science platform. Big Data tools used by almost 40%, and Deep Learning usage doubles.
- Spark 2.0 Preview Now on Databricks Community Edition: Easier, Faster, Smarter - May 17, 2016.
The preview of Spark 2.0 is here, and it promises to be easier, faster, and smarter.
- Practical skills that practical data scientists need - May 13, 2016.
Long story short, a data scientist needs to be capable of solving business analytics problems. Learn more about the skill set you need to master to do so.
- The MBA Data Science Toolkit: 8 resources to go from the spreadsheet to the command line - Apr 18, 2016.
A great guide for the MBA, or any relatively non-technical convert, for getting comfortable with the command line and other technical skills required to excel in data science.
- Fastest Growing Programming Languages and Computing Frameworks - Mar 7, 2016.
A new model for ranking programming languages and predicting the growth of user adoption. Includes current language rankings and predictions.
- Webinar: Driving Data Democracy: Hadoop and Redshift, Mar 16 - Mar 4, 2016.
The Hadoop ecosystem has improved markedly over the past few years. MPP databases allow analytics teams to easily query massive structured data sets. Learn how these pipelines work on March 16.
- Data Science Skills for 2016 - Feb 12, 2016.
As demand for the hottest job grows in the new year, the skill set required for it is getting larger. Here, we discuss the skills that will be in high demand for data scientists, including data visualization, Apache Spark, R, Python, and many more.
- Will Balkanization of Data Science lead to one Empire or many Republics? - Nov 30, 2015.
We examine the “Technoslavia” of the Big Data and Data Science market and consider whether it is likely to lead to a unified empire or a federation of independent republics.
- Top KDnuggets tweets, Oct 27 – Nov 02: A Framework for Distributed Deep Learning Layer Design in Python - Nov 3, 2015.
A Framework for Distributed #DeepLearning Layer Design in Python; SQL vs. NoSQL- What You Need to Know; Great Tutorial: A Neural Network in 11 lines of #Python; Data Scientist - 2nd Best IT and Engineering Job.
- Spark + SETI: Amping up Spark SQL with Parquets - Oct 21, 2015.
Spark SQL is a great component for data scientists as it simplifies querying large distributed datasets. Learn how to integrate it with Parquets, which we have found to significantly improve the performance of sparse-column queries.
- Easier Data Prep and Analysis for Data Scientists, Oct 20 Webinar - Oct 6, 2015.
Rapid Insight will show tools that make the data preparation and analysis process significantly faster, without losing the flexibility of advanced programming or SQL tools.
- Dataiku Data Science Studio, now also runs on Apache Spark - Sep 29, 2015.
Dataiku Data Science Studio version 2.1 has many useful features for Data Scientists, including integration with Apache Spark.
- Spark SQL for Real Time Analytics – Part Two - Sep 22, 2015.
Apache Spark is the hottest topic in Big Data. Part 2 of this series covers basic concepts of Stream Processing for Real Time Analytics and for the next frontier – Internet of Things (IoT).
- Data Science for Internet of Things – practitioner course - Sep 14, 2015.
Created by Data Science and IoT professionals, the course covers infrastructure (Hadoop – Spark), Programming / Modelling (R / Time series), and IoT. Course starts Nov 2015, delivered online, and will have limited participants.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Sep 8 and beyond - Sep 7, 2015.
The Future of Data Science, Ensuring Business Value from Analytics, Apache Ignite, Text Analytics, Best Practices of Data Science, Forecasting With Predictive Analytics, and more.
- Spark SQL for Real-Time Analytics - Sep 4, 2015.
Apache Spark is the hottest topic in Big Data. This tutorial discusses why Spark SQL is becoming the preferred method for Real Time Analytics and for the next frontier, IoT (Internet of Things).
- 60+ Free Books on Big Data, Data Science, Data Mining, Machine Learning, Python, R, and more - Sep 4, 2015.
Here is a great collection of eBooks written on the topics of Data Science, Business Analytics, Data Mining, Big Data, Machine Learning, Algorithms, Data Science Tools, and Programming Languages for Data Science.
- How to become a Data Scientist for Free - Aug 28, 2015.
Here are the most required skills for a data scientist position, based on ReSkill’s analyses of thousands of job posts, along with free resources to learn each skill.
- A Beginner’s Guide to SQL - Aug 27, 2015.
SQL is one of the core skills of a data engineer and data scientist. This mini-tutorial explains the four fundamental SQL functions: Create, Read, Update, and Delete, using a fun example of a movie quotes database.
- Apache Drill Makes Big Data Analysis Easier for Everyone - Aug 18, 2015.
Apache Drill is an open source query engine that provides interactive and secure SQL analytics at the scale of petabytes. It provides data querying and exploring capabilities across varied NoSQL databases and file formats.
- To Code or Not to Code with KNIME - Jul 22, 2015.
Find out how KNIME allows us to integrate analytical languages, such as R and Python, and the visual design of SQL code. Also, learn to integrate your Hadoop, visualization, and ETL systems with KNIME.
- Emacs for Data Science - Jul 10, 2015.
Data science nowadays demands a polyglot developer, and choosing the right code editor is a worthy investment. Here we present important features of Emacs and its advantages over other editors.
- Which Big Data, Data Mining, and Data Science Tools go together? - Jun 11, 2015.
We analyze the associations between the top Big Data, Data Mining, and Data Science tools based on the results of the 2015 KDnuggets Software Poll. Download anonymized data and analyze it yourself.
- R leads RapidMiner, Python catches up, Big Data tools grow, Spark ignites - May 25, 2015.
R is the most popular overall tool among data miners, although Python usage is growing faster. RapidMiner continues to be the most popular suite for data mining/data science. Hadoop/Big Data tools usage grew to 29%, propelled by 3x growth in Spark. Other tools with strong growth include H2O (0xdata), Actian, MLlib, and Alteryx.
- Top KDnuggets tweets, Apr 14-20: Modern Methods for Sentiment Analysis; Basics of SQL, RDBMS – must have skills - Apr 21, 2015.
Great overview: Modern Methods for Sentiment Analysis #word2vec; Basics of SQL and RDBMS - must have skills for data science; The 7 Most Unusual Applications of Big Data; Extensive, but a little confusing site: Understanding Data Visualization.
- KDnuggets™ News 15:n09, Mar 25: Deep Learning from Scratch; 10 steps to Kaggle Success; US CDS DJ Patil Cartoon - Mar 25, 2015.
Deep Learning for Text Understanding from Scratch; New Poll: Computing platform; 10 Steps to Success in Kaggle Data; Cartoon: US Chief Data Scientist Most Difficult Challenge; SQL-like Query Language for Real-time Streaming Analytics.
- Interview: Dave McCrory, Basho on Distributed Database Needs of a Future Enterprise - Mar 16, 2015.
We discuss the future of distributed storage for the enterprise, Scale-up vs. Scale-out, software design patterns in the Cloud era, the microservices model, and the place for legacy databases in modern enterprise IT.
- SQL-like Query Language for Real-time Streaming Analytics - Mar 12, 2015.
We need a SQL-like query language for Realtime Streaming Analytics that is expressive, short, and fast, defines core operations that cover 90% of problems, and is easy to follow and learn.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Mar 10 and beyond - Mar 9, 2015.
Data Wrangling and the Art of Big Data Discovery, Data Mining: Failure to Launch, The State of Hadoop Adoption, Addressing the Challenges of Data Variety, and more.
- Top KDnuggets tweets, Feb 23-25: Microsoft is building fast, low-power Deep Learning networks; Lucrative tech careers: Data Scientist, Data Engineer - Feb 26, 2015.
5 lucrative tech careers in 2015: Data Scientist ($150K), Data Engineer ($148K); Which SQL on Hadoop? Gartner Poll Still Says "Whatever" But DBMS Providers Gain; 10 Most-Funded #BigData #Startups; DataRPM 8 runs in #Hadoop, uses #MachineLearning to find insights.
- Analyzing Analysts to Build Better Analysis Software - Feb 10, 2015.
Our study of how analysts used Mode led to major updates designed to fit how data analysts and business analysts actually use data - there's no one-size-fits-all tool and analysis doesn't end with the analyst.
- Most Demanded Data Science and Data Mining Skills - Dec 15, 2014.
Our analysis of the most demanded data scientist skills shows that Data Science is a team effort focused on business analytics, with the top 5 platform skills being SQL, Python, R, SAS, and Hadoop.
- If programming languages were vehicles, what would be R, Python, SAS, and SQL? - Dec 6, 2014.
We expand on the idea "If programming languages were vehicles" and examine what would be the main languages for data science: R, Python, SAS, and SQL?
- Mode Playbook for Open Source Analytics - Dec 5, 2014.
Mode Analytics is open-sourcing their internal analysis and data visualizations which can be tailored to common data structures in SQL databases.
- SlamData Open Source Analytics Tool for MongoDB - Dec 4, 2014.
SlamData is an open source SQL-based tool designed to make accessing data in MongoDB easy for developers and non-developers alike with the goal of making application intelligence easier.
- SQL School tackles the data analyst shortage - Nov 17, 2014.
SQL School is a free, interactive tutorial from Mode Analytics, written by analysts for aspiring analysts. Check it out!
- Top KDnuggets tweets, Oct 24-26: Why Deep Learning is likely to make other Machine Learning algorithms obsolete - Oct 27, 2014.
Why Deep Learning is likely to make other Machine Learning algorithms obsolete; Open Source Distributed Analytics Engine with SQL interface; Data Mining Reveals How News Coverage Varies Around the World; 3 Great (and Free) Data Science Books You Can Read Now.
- Four main languages for Analytics, Data Mining, Data Science - Aug 18, 2014.
New KDnuggets Poll shows the growing dominance of four main languages for Analytics, Data Mining, and Data Science: R, SAS, Python, and SQL - used by 91% of data scientists - and a decline in popularity of other languages, except for Julia and Scala.
- Top KDnuggets tweets, Aug 6-7: Becoming a Data Scientist: MS Program, Bootcamp, or MOOCs? - Aug 8, 2014.
Becoming a Data Scientist: MS Program, Bootcamp, or MOOCs?; Statistics is the *least* important part of data science; New Poll: What languages you used for analytics / data mining in 2014; If you love Pizza and #DataScience, here is a unique job for you.
- KDnuggets 15th Annual Analytics, Data Mining, Data Science Software Poll: RapidMiner Continues To Lead - Jun 7, 2014.
With over 3,000 data miners taking part in the KDnuggets 15th Annual Software Poll, RapidMiner continues to lead. Free software is used much more outside the US, and Hadoop usage grows fastest in Asia.
- Upcoming Webcasts on Analytics, Big Data, Data Science – Jun 2 and beyond - Jun 2, 2014.
SQL-on-Hadoop, BigML, ClearStory, Analytic Maturity with Dean Abbott and TIBCO, Just Enough Math, Analytically Speaking with Dan Ariely, Data Mining FTL, and more.
- Top KDnuggets tweets, May 23-25: Data Science vs. Statistics: one big difference; A SQL query walks into a bar - May 26, 2014.
Data Science vs. Statistics: one big difference in Data Science focus; TGIF: A SQL query walks into a bar, approaches two girls at two tables ...; Amazing demo - IBM #Watson analyzes topic, presents a speech, can debate opponents; Microsoft #Kinect as Inexpensive #BigData Tool.
- Uppd8: An Engine for the Wisdom of Crowds - May 15, 2014.
What people think matters. Uppd8 focuses on crowd sentiment analysis and provides tag-scored data based on different user types. Basic services will be provided for free.
- Top KDnuggets tweets, May 12-13: Guide to Data Science Cheat Sheets; How to analyze Facebook Networks using R - May 14, 2014.
Guide to Data Science Cheat Sheets; Clever hack: How to analyze Facebook Networks using R; Very useful - Introduction to #SQL for Data Scientists; Planning a late career shift to Analytics /Data Science? Be prepared.
- Guide to Data Science Cheat Sheets - May 12, 2014.
Selection of the most useful Data Science cheat sheets, covering SQL, Python (including NumPy, SciPy and Pandas), R (including Regression, Time Series, Data Mining), MATLAB, and more.
- 3 Key Trends in the DBMS Market - May 3, 2014.
The top 3 trends in DBMS include market consolidation, moving beyond OLTP, and distributed computing - we examine them in detail.
- Top KDnuggets tweets, Apr 28-29 - Apr 30, 2014.
9 Free Books for Learning Data Mining; Cartoon: Data Scientist Salary Negotiation; statsTeachR - great free resource; What every Data Scientist needs to know about SQL.
- Online Data Science Certificates: Analytics and Programming for Data Science - Mar 1, 2014.
Statistics.com, a leading provider of online education in statistics and analytics, announces two new online certificates for Data Science - "Analytics for Data Science" and "Programming for Data Science".
- Method3: Experienced Big Data Software Engineer - Feb 27, 2014.
Method3, a leader in human capital, RPO, and technology solutions, is seeking an experienced Big Data Software Engineer for a large Big4 client in the Irvine, CA area.
- Top stories for Jan 5-11: MADlib: Big Data Machine Learning in SQL; Rock Stars of Big Data - Jan 12, 2014.
MADlib: Big Data Machine Learning in SQL for Data Scientists; IEEE Rock Stars of Big Data Presentations; Hadoop Elephants in the Cloud.