- Data Science Tools Illustrated Study Guides - Aug 25, 2020.
These data science tools illustrated guides are broken up into four distinct categories: data retrieval, data manipulation, data visualization, and engineering tips. Both online and PDF versions of these guides are available.
- Better Blog Post Analysis with googleAnalyticsR - Jul 24, 2020.
In this post, we'll walk through using googleAnalyticsR for better blog post analysis, so you can run the same analysis for yourself!
- Wrapping Machine Learning Techniques Within AI-JACK Library in R - Jul 17, 2020.
The article shows an approach to solving the problem of selecting the best technique in machine learning. This can be done in R using just one library called AI-JACK, and the article shows how to use this tool.
- Understanding Time Series with R - Jul 9, 2020.
Analyzing time series is such a useful resource for essentially any business that data scientists entering the field should bring with them a solid foundation in the technique. Here, we decompose the logical components of a time series using R to better understand how each plays a role in this type of analysis.
- An Introduction to Statistical Learning: The Free eBook - Jun 29, 2020.
This week's free eBook is a classic of data science, An Introduction to Statistical Learning, with Applications in R. If interested in picking up elementary statistical learning concepts, and learning how to implement them in R, this book is for you.
- Practical Markov Chain Monte Carlo - Jun 26, 2020.
This is a slightly more intricate MCMC example than the many that use a fairly simple model with a single predictor (maybe two) and not much else. It highlights a couple of issues and tricks worth noting for a handwritten implementation.
- Data Science Tools Popularity, animated - Jun 25, 2020.
Watch the evolution of the top 10 most popular data science tools based on KDnuggets software polls from 2000 to 2019.
- Build a Branded Web Based GIS Application Using R, Leaflet and Flexdashboard - Jun 24, 2020.
By using R, Flexdashboard and Leaflet, we can build a customized and branded web application to showcase location based data interactively across the organization. Instead of crowding the application with many widgets, we use menu tabs and pages to separate the interactive aspects.
- modelStudio and The Grammar of Interactive Explanatory Model Analysis - Jun 19, 2020.
modelStudio is an R package that automates the exploration of ML models and allows for interactive examination. It works in a model-agnostic fashion and is therefore compatible with most ML frameworks.
- Fighting Disease with Data: Q&A with Epidemiologist Amrish Baidjoe - Jun 11, 2020.
Data science tools are powerful for investigating the current pandemic and other outbreaks, when accurate and actionable data are crucial. Epidemiologist and R Epidemics Consortium leader Amrish Baidjoe shared his insights into using data science to fight disease, from modeling to automation to new technologies.
- Python for data analysis… is it really that simple?!? - Apr 2, 2020.
The article addresses a simple data analytics problem, comparing a Python and Pandas solution to an R solution (using plyr, dplyr, and data.table), as well as kdb+ and BigQuery solutions. Performance improvement tricks for these solutions are then covered, as are parallel/cluster computing approaches and their limitations.
- Time Series Classification Synthetic vs Real Financial Time Series - Mar 18, 2020.
This article discusses distinguishing between real financial time series and synthetic time series using XGBoost.
- Decision Boundary for a Series of Machine Learning Models - Mar 13, 2020.
I train a series of Machine Learning models using the iris dataset, construct synthetic data from the extreme points within the data, and test a number of Machine Learning models in order to draw the decision boundaries from which the models make predictions in a 2D space. This is useful for illustrative purposes and for understanding how different Machine Learning models make predictions.
- KDnuggets™ News 20:n09, Mar 4: When Will AutoML replace Data Scientists (if ever) – vote; 20 AI, DS, ML Terms You Need to Know (part 2) - Mar 4, 2020.
- Python and R Courses for Data Science - Feb 26, 2020.
Since Python and R are a must for today's data scientists, continuous learning is paramount. Online courses are arguably the best and most flexible way to upskill throughout one's career.
- KDnuggets™ News 20:n08, Feb 26: Gartner 2020 Magic Quadrant for Data Science & Machine Learning Platforms; Will AutoML Replace Data Scientists? - Feb 26, 2020.
This week in KDnuggets: The Death of Data Scientists - will AutoML replace them?; Leaders, Changes, and Trends in Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms; Hand labeling is the past. The future is #NoLabel AI; The Forgotten Algorithm; Getting Started with R Programming; and much, much more.
- Getting Started with R Programming - Feb 19, 2020.
An end to end Data Analysis using R, the second most requested programming language in Data Science.
- Introduction to Geographical Time Series Prediction with Crime Data in R, SQL, and Tableau - Feb 14, 2020.
When reviewing geographical data, it can be difficult to prepare the data for an analysis. This article helps by covering importing data into a SQL Server database; cleansing and grouping data into a map grid; adding time data points to the set of grid data and filling in the gaps where no crimes occurred; importing the data into R; and running an XGBoost model to determine where crimes will occur on a specific day.
- Basics of Audio File Processing in R - Feb 11, 2020.
This post provides basic information on audio processing using R as the programming language. It also walks through some basics of sound and digital audio.
- Serverless Machine Learning with R on Cloud Run - Feb 4, 2020.
Expedite the deployment of your machine learning models using serverless cloud infrastructure. In this tutorial, we explore creating and deploying a model which scrapes real-time Twitter data and returns an interactive visualization using R.
- Classify A Rare Event Using 5 Machine Learning Algorithms - Jan 15, 2020.
Which algorithm works best for unbalanced data? Are there any tradeoffs?
- Beginner’s Guide to K-Nearest Neighbors in R: from Zero to Hero - Jan 3, 2020.
This post presents a pipeline of building a KNN model in R with various measurement metrics.
- Plotnine: Python Alternative to ggplot2 - Dec 12, 2019.
Python's plotting libraries such as matplotlib and seaborn do allow the user to create elegant graphics as well, but the lack of a standardized syntax for implementing the grammar of graphics, compared to the simple, readable and layering approach of ggplot2 in R, makes it more difficult to implement in Python.
- Data Science for Managers: Programming Languages - Nov 19, 2019.
In this article, we are going to talk about popular languages for Data Science and briefly describe each of them.
- How to Visualize Data in Python (and R) - Nov 14, 2019.
Producing accessible data visualizations is a key data science skill. The following guidelines will help you create the best representations of your data using R and Python's Pandas library.
- KDnuggets™ News 19:n43, Nov 13: Dynamic Reports in Python and R; Creating NLP Vocabularies; What is Data Science? - Nov 13, 2019.
On KDnuggets this week: Orchestrating Dynamic Reports in Python and R with Rmd Files; How to Create a Vocabulary for NLP Tasks in Python; What is Data Science?; The Complete Data Science LinkedIn Profile Guide; Set Operations Applied to Pandas DataFrames; and much, much more.
- Orchestrating Dynamic Reports in Python and R with Rmd Files - Nov 8, 2019.
Do you want to extract csv files with Python and visualize them in R? How does preparing everything in R and making conclusions with Python sound? Both are possible if you know the right libraries and techniques. Here, we’ll walk through a use-case using both languages in one analysis.
- Customer Segmentation for R Users - Sep 26, 2019.
This article shows you how to separate your customers into distinct groups based on their purchase behavior. For the R enthusiasts out there, I demonstrated what you can do with r/stats, ggradar, ggplot2, animation, and factoextra.
- Scikit-Learn vs mlr for Machine Learning - Sep 10, 2019.
How does the scikit-learn machine learning library for Python compare to the mlr package for R? Follow along with a machine learning workflow through each approach, and see if you can gain a competitive advantage by knowing both frameworks.
- KDnuggets™ News 19:n33, Sep 4: Data Science Skills Poll; Object-oriented Programming for Data Scientists - Sep 4, 2019.
This week: Object-oriented programming for data scientists; Deep Learning Next Step: Transformers and Attention Mechanism; R Users' Salaries from the 2019 Stackoverflow Survey; Types of Bias in Machine Learning; 4 Tips for Advanced Feature Engineering and Preprocessing; and much more!
- R Users’ Salaries from the 2019 Stackoverflow Survey - Aug 30, 2019.
Let’s take a look at what R users are saying about their salaries. Note that the following results could be biased because of unrepresentative and in some cases small samples.
- Coding Random Forests® in 100 lines of code* - Aug 7, 2019.
There are dozens of machine learning algorithms out there. It is impossible to learn all their mechanics; however, many algorithms sprout from the most established algorithms, e.g. ordinary least squares, gradient boosting, support vector machines, tree-based algorithms and neural networks.
- KDnuggets™ News 19:n29, Aug 7: What 70% of Data Science Learners Do Wrong; Pytorch Cheat Sheet for Beginners - Aug 7, 2019.
This week on KDnuggets: What 70% of Data Science Learners Do Wrong; Pytorch Cheat Sheet for Beginners and Udacity Deep Learning Nanodegree; How a simple mix of object-oriented programming can sharpen your deep learning prototype; Can we trust AutoML to go on full autopilot?; Ten more random useful things in R you may not know about; 25 Tricks for Pandas; and much more!
- Ten more random useful things in R you may not know about - Jul 31, 2019.
I had a feeling that R has developed as a language to such a degree that many of us are using it now in completely different ways. This means that there are likely to be numerous tricks, packages, functions, etc that each of us use, but that others are completely unaware of, and would find useful if they knew about them.
- Kaggle Kernels Guide for Beginners: A Step by Step Tutorial - Jul 23, 2019.
This is an attempt to hold the hands of a complete beginner and walk them through the world of Kaggle Kernels — for them to get started.
- The Evolution of a ggplot - Jul 18, 2019.
A step-by-step tutorial showing how to turn a default ggplot into an appealing and easily understandable data visualization in R.
- How to Make Stunning 3D Plots for Better Storytelling - Jul 17, 2019.
3D Plots built in the right way for the right purpose are always stunning. In this article, we’ll see how to make stunning 3D plots with R using ggplot2 and rayshader.
- Modelplotr v1.0 now on CRAN: Visualize the Business Value of your Predictive Models - Jun 21, 2019.
Explaining the business value of your predictive models to your business colleagues is a challenging task. Using Modelplotr, an R package, you can easily create stunning visualizations that clearly communicate the business value of your models.
- Ten random useful things in R that you might not know about - Jun 20, 2019.
Because the R ecosystem is so rich and constantly growing, people can often miss out on knowing about something that can really help them in a task that they have to complete.
- KDnuggets™ News 19:n23, Jun 19: Useful Stats for Data Scientists; Python, TensorFlow & R Winners in Latest Job Report - Jun 19, 2019.
This week on KDnuggets: 5 Useful Statistics Data Scientists Need to Know; Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS; How to Learn Python for Data Science the Right Way; The Machine Learning Puzzle, Explained; Scalable Python Code with Pandas UDFs; and much more!
- Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS - Jun 17, 2019.
Data science jobs continue to grow in 2019, and this report shares the change and spread of jobs by software over recent years.
- What you need to know: The Modern Open-Source Data Science/Machine Learning Ecosystem - Jun 10, 2019.
We identify the 6 tools in the modern open-source Data Science ecosystem, examine the Python vs R question, and determine which tools are used the most with Deep Learning and Big Data.
- The Whole Data Science World in Your Hands - Jun 5, 2019.
Testing MatrixDS capabilities on different languages and tools: Python, R and Julia. If you work with data you have to check this out.
- Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis - May 30, 2019.
Python continues to lead the top Data Science platforms, but R and RapidMiner hold their share; Almost 50% have used Deep Learning tools; SQL is steady; Consolidation continues.
- How to correctly select a sample from a huge dataset in machine learning - May 1, 2019.
We explain how choosing a small, representative dataset from a large population can improve model training reliability.
- Powerful like your local notebook. Sharable like a Google Doc. - Apr 30, 2019.
Mode is the only analytics platform with native Python and R Notebooks. Get everyone up and running in minutes by delivering Notebook-powered results right in your browser. Now anyone on your team can re-run R- and Python-powered reports themselves—without ever touching code.
- KDnuggets™ News 19:n16, Apr 24: Data Visualization in Python with Matplotlib & Seaborn; Getting Into Data Science: The Ultimate Q&A - Apr 24, 2019.
Best Data Visualization Techniques for small and large data; The Rise of Generative Adversarial Networks; Approach pre-trained deep learning models with caution; How Optimization Works; Building a Flask API to Automatically Extract Named Entities Using SpaCy
- The Mueller Report Word Cloud: A brief tutorial in R - Apr 22, 2019.
Word clouds are simple visual summaries of the most frequently used words in a text, presenting essentially the same information as a histogram but somewhat less precise and vastly more eye-catching. Get a quick sense of the themes in the recently released Mueller Report and its 448 pages of legal content.
- Because analysis is more than just dashboards - Apr 11, 2019.
Where traditional BI tools often make it easy to build dashboards, Mode makes it easy for you to answer any follow-up questions when you see changes in those dashboards. Choose the level of abstraction you want for a given dataset and quickly get to the story behind the change.
- R vs Python for Data Visualization - Mar 25, 2019.
This article demonstrates creating similar plots in R and Python using two of the most prominent data visualization packages on the market, namely ggplot2 and Seaborn.
- Top R Packages for Data Cleaning - Mar 15, 2019.
Data cleaning is one of the most important and time-consuming tasks for data scientists. Here are the top R packages for data cleaning.
- Advanced Analytics & Data Gov Training in Chicago: ML, DL, Self-Service, Strategy, and more - Mar 13, 2019.
Learn effective data governance practices and how to successfully implement advanced analytics by attending our industry leading training at TDWI Chicago, April 28 - May 3, and take your projects to the next level.
- Who is a typical Data Scientist in 2019? - Mar 11, 2019.
We investigate what a typical data scientist looks like and see how this differs from this time last year, looking at skill set, programming languages, industry of employment, country of employment, and more.
- Don’t do analysis in a vacuum - Feb 22, 2019.
Traditional tools force analysts to play the import-and-export game, so it's difficult to keep data fresh and accessible. Every Mode report or dashboard lives at a unique URL for future sharing, iterating, and building upon. Mode brings your entire team together in one platform.
- Running R and Python in Jupyter - Feb 19, 2019.
The Jupyter Project began in 2014 for interactive and scientific computing. Fast forward 5 years and now Jupyter is one of the most widely adopted Data Science IDEs on the market and gives the user access to Python and R.
- Understanding Gradient Boosting Machines - Feb 6, 2019.
Despite its massive popularity, many professionals still use gradient boosting as a black box. As such, the purpose of this article is to lay an intuitive framework for this powerful machine learning technique.
- Using Caret in R to Classify Term Deposit Subscriptions for a Bank - Feb 4, 2019.
This article uses direct marketing campaign data from a Portuguese banking institution to predict if a customer will subscribe for a term deposit. We’ll be working with R’s Caret package to achieve this.
- Airbnb Rental Listings Dataset Mining - Jan 28, 2019.
An Exploratory Analysis of Airbnb’s Data to understand the rental landscape in New York City.
- 2018’s Top 7 R Packages for Data Science and AI - Jan 22, 2019.
This is a list of the best packages that changed our lives this year, compiled from my weekly digests.
- Deep learning in Satellite imagery - Dec 26, 2018.
This article outlines possible sources of satellite imagery, what its properties are and how this data can be utilised using R.
- Exploring the Data Jungle Free eBook - Dec 18, 2018.
This free eBook by Brian Godsey will provide you with real-world examples in Python, R, and other languages suitable for data science.
- Automated Web Scraping in R - Dec 11, 2018.
How to automatically web scrape periodically so you can analyze timely/frequently updated data.
- Data Science Projects Employers Want To See: How To Show A Business Impact - Dec 4, 2018.
The best way to create better data science projects that employers want to see is to provide a business impact. This article highlights the process using customer churn prediction in R as a case-study.
- Best Machine Learning Languages, Data Visualization Tools, DL Frameworks, and Big Data Tools - Dec 3, 2018.
We cover a variety of topics, from machine learning to deep learning, from data visualization to data tools, with comments and explanations from experts in the relevant fields.
- SQL, Python, and R in One Platform - Nov 27, 2018.
Stop jumping between applications. Get a complete analytical toolkit.
- SQL, Python, & R in One Platform - Oct 26, 2018.
No more jumping between applications. Mode Studio combines a SQL editor, Python and R notebooks, and a visualization builder in one platform.
- Apache Spark Introduction for Beginners - Oct 18, 2018.
An extensive introduction to Apache Spark, including a look at the evolution of the product, use cases, architecture, ecosystem components, core concepts and more.
- SQL, Python, & R: All in One Platform - Oct 11, 2018.
Mode Studio connects a SQL editor, Python and R notebooks, and a visualization builder in one platform. Sign up now for access.
- Evaluating the Business Value of Predictive Models in Python and R - Oct 11, 2018.
In these blogs for R and Python we explain four valuable evaluation plots to assess the business value of a predictive model. We show how you can easily create these plots and help you to explain your predictive model to non-techies.
- KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R - Oct 3, 2018.
Also: Introducing VisualData: A Search Engine for Computer Vision Datasets; Raspberry Pi IoT Projects for Fun and Profit; Recent Advances for a Better Understanding of Deep Learning; Basic Image Data Analysis Using Python - Part 3; Introduction to Deep Learning
- Introducing Path Analysis Using R - Sep 27, 2018.
Path analysis is an extension of multiple regression. It allows for the analysis of more complicated models.
- Optimization 101 for Data Scientists - Aug 8, 2018.
We show how to use optimization strategies to make the best possible decision.
- From Data to Viz: how to select the right chart for your data - Aug 1, 2018.
We offer an interactive, decision tree-style tool, which examines the data you have and proposes a set of potentially appropriate visualizations to represent your dataset.
- Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks - Jul 27, 2018.
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around.
- Dimensionality Reduction: Does PCA really improve classification outcome? - Jul 13, 2018.
In this post, I test whether Principal Component Analysis (PCA) improves the classification performance of a neural network over a dataset.
- 5 of Our Favorite Free Visualization Tools - Jul 5, 2018.
5 key free data visualization tools that can provide flexible and effective data presentation.
- [ebook] Apache Spark™ Under the Hood - Jun 27, 2018.
Learn how to install and run Spark yourself; a summary of Spark core architecture and concepts; Spark's powerful language APIs and how you can use them.
- KDnuggets™ News 18:n25, Jun 27: 5 Clustering Algorithms Data Scientists Need to Know; Detecting Sarcasm with Deep Convolutional Neural Networks? - Jun 27, 2018.
Also: 30 Free Resources for Machine Learning, Deep Learning, NLP; 7 Simple Data Visualizations You Should Know in R.
- Stagraph – a general purpose R GUI, for data import, wrangling, and visualization - Jun 25, 2018.
Stagraph is a new simple visual interface for R, which focuses on data import, data wrangling and data visualization.
- How to Execute R and Python in SQL Server with Machine Learning Services - Jun 25, 2018.
Machine Learning Services in SQL Server eliminates the need for data movement - you can install and run R/Python packages to build Deep Learning and AI applications on data in SQL Server.
- 7 Simple Data Visualizations You Should Know in R - Jun 22, 2018.
This post presents a selection of 7 essential data visualizations, and how to recreate them using a mix of base R functions and a few common packages.
- KDnuggets™ News 18:n23, Jun 13: Did Python declare victory over R?; Master the Netflix Interview; Deep Learning Projects DIY Style - Jun 13, 2018.
Also: Command Line Tricks For Data Scientists; How (dis)similar are my train and test data?; 5 Machine Learning Projects You Should Not Overlook, June 2018; Introduction to Game Theory; Human Interpretable Machine Learning
- The 6 components of Open-Source Data Science/ Machine Learning Ecosystem; Did Python declare victory over R? - Jun 6, 2018.
We find 6 tools form the modern open source Data Science / Machine Learning ecosystem; examine whether Python declared victory over R; and review which tools are most associated with Deep Learning and Big Data.
- KDnuggets™ News 18:n22, Jun 6: 10 More Free Must-Read Books for Machine Learning and Data Science; Beginner Guide to Data Science Pipeline - Jun 6, 2018.
Summer. Time to sit back and unwind. Or get your hands on some free machine learning and data science books and learn! Here is a great selection to get started.
- Using Linear Regression for Predictive Modeling in R - Jun 1, 2018.
In this post, we’ll use linear regression to build a model that predicts cherry tree volume from metrics that are much easier for folks who study trees to measure.
- Virtual Training Events Without Leaving Your Desk - May 30, 2018.
Check out our lineup of upcoming virtual seminars, online learning courses, and customized training in your office. Space is limited, so reserve your seat early and score the best savings!
- Top 20 R Libraries for Data Science in 2018 - May 25, 2018.
We have prepared an infographic of the Top 20 R packages for data science, which covers the libraries' main features and GitHub activities, as all of the libraries are open-source.
- Modelling Time Series Processes using GARCH - May 25, 2018.
To go into the turbulent seas of volatile data and analyze it in a time changing setting, ARCH models were developed.
- How to tackle common data cleaning issues in R - May 24, 2018.
R is a great choice for manipulating, cleaning, summarizing, producing probability statistics, and so on. In addition, it's not going away anytime soon; it is platform independent, so what you create will run almost anywhere; and it has awesome help resources.
- Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018: Trends and Analysis - May 22, 2018.
Python continues to eat away at R, RapidMiner gains, SQL is steady, Tensorflow advances pulling along Keras, Hadoop drops, Data Science platforms consolidate, and more.
- Optimization Using R - May 18, 2018.
Optimization is a technique for finding the best possible solution to a given problem among all the possible solutions. Optimization uses a rigorous mathematical model to find the most efficient solution to the given problem.
- R Fundamentals: Building a Simple Grade Calculator - Mar 19, 2018.
In this tutorial, we'll teach you the basics of R by building a simple grade calculator. While we do not assume any R-specific knowledge, you should be familiar with general programming concepts.
- New Book: Credit risk analytics, The R Companion - Mar 16, 2018.
Credit risk analytics in R will enable you to build credit risk models from start to finish. With access to real credit data on the accompanying website, you will master a wide range of applications.
- KDnuggets™ News 18:n11, Mar 14: Two sides of getting a job as a Data Scientist; 5 things to know about Machine Learning - Mar 14, 2018.
Also 18 Inspiring Women In AI, Big Data, Data Science, Machine Learning; Great Data Scientists Don't Just Think Outside the Box; Favorite Data Science / Machine Learning Blog; Text Processing in R.
- Choropleth Maps in R - Mar 12, 2018.
Choropleth maps provide a simple, easy-to-understand way to visualize a measurement across different geographical areas, be it states or countries.
- Text Processing in R - Mar 9, 2018.
There are good reasons to want to use R for text processing, namely that we can do it, and that we can fit it in with the rest of our analyses. Furthermore, there is a lot of very active development going on in the R text analysis community right now.
- TDWI Chicago, May 6-11: Get Your Hands Dirty With Data – KDnuggets Offer - Mar 2, 2018.
Attend the Hands-on Lab series and bring practical skills back from Chicago. Save 30% through March 16 with priority code KD30.
- Control Structures in R: Using If-Else Statements and Loops - Feb 23, 2018.
Control structures allow you to specify the order in which your code executes. They are extremely useful if you want to run a piece of code multiple times, or if you want to run a piece of code only if a certain condition is met.
- Building a Daily Bitcoin Price Tracker with Coindeskr and Shiny in R - Feb 7, 2018.
This tutorial is to help an R user build his/her own Daily Bitcoin Price Tracker using three packages, Coindeskr, Shiny and Dygraphs.
- Data Science vs Addiction: Estimating Opioid Abuse by Location - Jan 26, 2018.
Data science can help find the optimal locations for drug treatment facilities, even in the face of major data challenges.
- Deep Learning in H2O using R - Jan 22, 2018.
This article is about implementing Deep Learning (DL) using the H2O package in R. We start with a background on DL, followed by some features of H2O's DL framework, followed by an implementation using R.
- Propensity Score Matching in R - Jan 18, 2018.
Propensity scores are an alternative method to estimate the effect of receiving treatment when random assignment of treatments to subjects is not feasible.
- KDnuggets™ News 18:n03, Jan 17: Top 10 TED Talks on Data Science, Machine Learning; How Docker Can Help You Become A More Effective Data Scientist - Jan 17, 2018.
Also: A Primer on Web Scraping in R; Elasticsearch for Dummies; Generative Adversarial Networks, an overview.
- Topological Data Analysis for Data Professionals: Beyond Ayasdi - Jan 16, 2018.
We review recent developments and tools in topological data analysis, including applications of persistent homology to psychometrics and a recent extension of piecewise regression, called Morse-Smale regression.
- A Primer on Web Scraping in R - Jan 12, 2018.
If you are a data scientist who wants to capture data from such web pages then you wouldn’t want to be the one to open all these pages manually and scrape the web pages one by one. To push away the boundaries limiting data scientists from accessing such data from web pages, there are packages available in R.
- 10 Tools to Help You Learn R - Jan 4, 2018.
There are several tools to help you grasp the foundational principles and more. The list below gives you an idea of what’s available and how much it costs.
- Simple Ways Of Working With Medium To Big Data Locally - Dec 27, 2017.
An overview of the installation and implementation of simple techniques for working with large datasets on your machine.
- How (& Why) Data Scientists and Data Engineers Should Share a Platform - Nov 17, 2017.
Sharing one platform has some obvious benefits for Data Science and Data Engineering teams, but technical, language and process challenges often make this difficult. Learn how one company implemented a single cloud platform for R, Python and other workloads – and some of the unexpected benefits they discovered along the way.
- Extracting Tweets With R - Nov 14, 2017.
This article will give you a great, brief overview for extracting Tweets using R.
- Tips for Getting Started with Text Mining in R and Python - Nov 8, 2017.
This article opens up the world of text mining in a simple and intuitive way and provides great tips to get started with text mining.
- Process Mining with R: Introduction - Nov 2, 2017.
In the past years, several niche tools have appeared to mine organizational business processes. In this article, we’ll show you that it is possible to get started with “process mining” using well-known data science programming languages as well.
- KDnuggets™ News 17:n41, Oct 25: Learning git not enough to become data scientist; Peak Data Scientist Demand? Top Machine Learning w. R videos - Oct 25, 2017.
Becoming a data scientist after a science PhD; New Poll: When will demand for Data Scientists/Machine Learning experts peak? It Only Takes One Line of Code to Run Regression.
- Top 10 Machine Learning with R Videos - Oct 24, 2017.
A complete video guide to Machine Learning in R! This great compilation of tutorials and lectures is an amazing recipe to start developing your own Machine Learning projects.
- Data Science Bootcamp in Zurich, Switzerland, January 15 – April 6, 2018 - Oct 12, 2017.
Come to the land of chocolate and Data Science where the local tech scene is booming and the jobs are a plenty. Learn the most important concepts from top instructors by doing and through projects. Use code KDNUGGETS to save.
- Best practices of orchestrating Python and R code in ML projects - Oct 12, 2017.
Instead of arguing about Python vs R I will examine the best practices of integrating both languages in one data science project.
- Introducing R-Brain: A New Data Science Platform - Oct 11, 2017.
R-Brain is a next-generation platform for data science built on top of Jupyterlab with Docker. It supports not only R but also Python and SQL, and has integrated intellisense, debugging, packaging, and publishing capabilities.
- Learn Generalized Linear Models (GLM) using R - Oct 11, 2017.
In this article, we aim to discuss various GLMs that are widely used in the industry. We focus on: a) log-linear regression b) interpreting log-transformations and c) binary logistic regression.
- An opinionated Data Science Toolbox in R from Hadley Wickham, tidyverse - Oct 10, 2017.
Get your productivity boosted with Hadley Wickham's powerful R package, tidyverse. It has all you need to start developing your own data science workflows.
- Find Out What Celebrities Tweet About the Most - Oct 5, 2017.
The word cloud is a popular data visualisation method. Here we show how to use R to create Twitter word clouds of celebrities and politicians.
- Top 10 Videos on Machine Learning in Finance - Sep 29, 2017.
Talks, tutorials and playlists – you could not get a more gentle introduction to Machine Learning (ML) in Finance. Got a quick 4 minutes or ready to study for hours on end? These videos cover all skill levels and time constraints!
- Visualizing High Dimensional Data In Augmented Reality - Sep 25, 2017.
When Data Scientists first get a data set, they often use a matrix of 2D scatter plots to quickly see the contents and relationships between pairs of attributes. But for data with lots of attributes, such analysis does not scale.
- The Easy Button for R & Python on Spark, Webinar Oct 18 - Sep 22, 2017.
Learn five solid reasons to use managed services for Cloudera for R, Python and other advanced analytics on Spark & Hadoop in the cloud.
- 30 Essential Data Science, Machine Learning & Deep Learning Cheat Sheets - Sep 22, 2017.
This collection of data science cheat sheets is not a cheat sheet dump, but a curated list of reference materials spanning a number of disciplines and tools.
- A Solution to Missing Data: Imputation Using R - Sep 21, 2017.
Handling missing values is one of the worst nightmares a data analyst dreams of. In such situations, a wise analyst ‘imputes’ the missing values instead of dropping them from the data.
- KDnuggets™ News 17:n35, Sep 13: Putting the “Science” Back in Data Science; Python vs. R: And the leader is… - Sep 13, 2017.
Putting the "Science" Back in Data Science; Python vs R - Who Is Really Ahead in Data Science, Machine Learning; I built a chatbot in 2 hours and this is what I learned; Are Data Lakes Fake News?; Python Overtaking R?
- Videos for Business Analytics using Data Mining course - Sep 12, 2017.
Here we present links to very useful videos for the Business Analytics using Data Mining course.
- Python vs R – Who Is Really Ahead in Data Science, Machine Learning? - Sep 12, 2017.
We examine Google Trends, job trends, and more and note that while Python has only a small advantage among current Data Science and Machine Learning related jobs, this advantage is likely to increase in the future.
- Python vs R for Artificial Intelligence, Machine Learning, and Data Science - Sep 11, 2017.
This is a summary (with links) of a three-part article series that's intended to be an in-depth overview of the considerations, tradeoffs, and recommendations associated with selecting between Python and R for programmatic data science tasks.
- Next Generation Data Manipulation with R and dplyr - Aug 31, 2017.
The idea behind the dplyr package is to do one thing at a time. dplyr has separate functions for every task which make its implementation crisp and easy to understand.
- KDnuggets™ News 17:n33, Aug 30: Python Overtakes R in Machine Learning; Data Science in 42 Steps; Deep Learning not AI’s Future - Aug 30, 2017.
Also: KDnuggets part-time, paid internship in Data Science/Machine Learning Journalism; How To Write Better SQL Queries: The Definitive Guide; Understanding overfitting: an inaccurate meme in Machine Learning; How to Become a Data Scientist: The Definitive Guide