- Three R Libraries Every Data Scientist Should Know (Even if You Use Python) - Dec 20, 2021.
Check out these powerful R libraries built by the world’s biggest tech companies.
- Four Different Pipes for R with magrittr - Oct 6, 2021.
The magrittr package supplies the pipe operator (%>%), but it turns out that the package actually contains four pipe operators in total. Let's go into them a bit.
- Path to Full Stack Data Science - Sep 27, 2021.
Start your journey toward mastering all aspects of the field of Data Science with this focused list of in-depth self-learning resources. Curated with the beginner in mind, these recommendations will help you learn efficiently, and can also offer existing professionals useful highlights for review or help filling in any gaps in skills.
- ebook: Learn Data Science with R – free download - Sep 7, 2021.
Check out this new book for data science beginners with many practical examples that covers statistics, R, graphing, and machine learning. As a source to learn the full breadth of data science foundations, "Learn Data Science with R" starts at the beginner level and gradually progresses into expert content.
- Introduction to Statistical Learning Second Edition - Aug 13, 2021.
The second edition of the classic "An Introduction to Statistical Learning, with Applications in R" was published very recently, and is now freely-available via PDF on the book's website.
- 5 Tips for Writing Clean R Code - Aug 9, 2021.
This article summarizes the most common mistakes to avoid and outlines best practices to follow in programming in general. Follow these tips to speed up the code review iteration process and be a rockstar developer in your reviewer’s eyes!
- The Most In-Demand Skills for Data Scientists in 2021 - Apr 15, 2021.
If you are preparing to make a career as a Data Scientist or are looking for opportunities to skill-up in your current role, this analysis of in-demand skills for 2021, based on over 15,000 Data Scientist job postings, should offer you a good idea as to which programming languages and software tools are increasing and decreasing in importance.
- Data Science Curriculum for Professionals - Mar 25, 2021.
If you are looking to expand or transition your current professional career that is buried in spreadsheet analysis into one powered by data science, then you are in for an exciting but complex journey with much to explore and master. To begin your adventure, follow this complete road map, which will guide you from a gnome in the forest of spreadsheets to an AI wizard known far and wide throughout the kingdom.
- Support Vector Machine for Hand Written Alphabet Recognition in R - Jan 27, 2021.
We attempt to break down a problem of hand written alphabet image recognition into a simple process rather than using heavy packages. This is an attempt to create the data and then build a model using Support Vector Machines for Classification.
- Creating Good Meaningful Plots: Some Principles - Jan 12, 2021.
Here are some thought starters to help you create meaningful plots.
- 15 Free Data Science, Machine Learning & Statistics eBooks for 2021 - Dec 31, 2020.
We present a curated list of 15 free eBooks compiled in a single location to close out the year.
- Undersampling Will Change the Base Rates of Your Model’s Predictions - Dec 17, 2020.
In classification problems, the proportion of cases in each class largely determines the base rate of the predictions produced by the model. Therefore if you use sampling techniques that change this proportion, there is a good chance you will want to rescale / calibrate your predictions before using them in the wild.
- KDnuggets™ News 20:n47, Dec 16: A Rising Library Beating Pandas in Performance; R or Python? Why Not Both? - Dec 16, 2020.
Also: 10 Python Skills They Don't Teach in Bootcamp; Data Science Volunteering: Ways to Help; A Journey from Software to Machine Learning Engineer; Data Science and Machine Learning: The Free eBook
- R or Python? Why Not Both? - Dec 9, 2020.
Do you use both R and Python, either in different projects or in the same? Check out prython, an IDE designed to handle your needs.
- Simple & Intuitive Ensemble Learning in R - Dec 2, 2020.
Read about metaEnsembleR, an R package for heterogeneous ensemble meta-learning (classification and regression) that is fully-automated.
- Top 6 Data Science Programs for Beginners - Nov 20, 2020.
Udacity has the best industry-leading programs in data science. Here are the top six data science courses for beginners to help you get started.
- Behavior Analysis with Machine Learning and R: The free eBook - Oct 22, 2020.
Check out this new free ebook to learn how to leverage the power of machine learning to analyze behavioral patterns from sensor data and electronic records using R.
- KDnuggets™ News 20:n40, Oct 21: fastcore: An Underrated Python Library; Goodhart’s Law for Data Science: what happens when a measure becomes a target? - Oct 21, 2020.
fastcore: An Underrated Python Library; Goodhart's Law for Data Science and what happens when a measure becomes a target?; Text Mining with R: The Free eBook; Free From MIT: Intro to Computational Thinking and Data Science; How to ace the data science coding challenge
- Text Mining with R: The Free eBook - Oct 15, 2020.
This freely-available book will show you how to perform text analytics in R, using packages from the tidyverse.
- Data Science Minimum: 10 Essential Skills You Need to Know to Start Doing Data Science - Oct 1, 2020.
Data science is ever-evolving, so mastering its foundational technical and soft skills will help you be successful in a career as a Data Scientist, as well as pursue advanced concepts, such as deep learning and artificial intelligence.
- Data Science Tools Illustrated Study Guides - Aug 25, 2020.
These data science tools illustrated guides are broken up into four distinct categories: data retrieval, data manipulation, data visualization, and engineering tips. Both online and PDF versions of these guides are available.
- Better Blog Post Analysis with googleAnalyticsR - Jul 24, 2020.
In this post, we'll walk through using googleAnalyticsR for better blog post analysis, so you can run the same analysis for yourself!
- Wrapping Machine Learning Techniques Within AI-JACK Library in R - Jul 17, 2020.
The article shows an approach to solving the problem of selecting the best technique in machine learning. This can be done in R using just one library called AI-JACK, and the article shows how to use this tool.
- Understanding Time Series with R - Jul 9, 2020.
Analyzing time series is so useful for essentially any business that data scientists entering the field should bring with them a solid foundation in the technique. Here, we decompose the logical components of a time series using R to better understand how each plays a role in this type of analysis.
- An Introduction to Statistical Learning: The Free eBook - Jun 29, 2020.
This week's free eBook is a classic of data science, An Introduction to Statistical Learning, with Applications in R. If interested in picking up elementary statistical learning concepts, and learning how to implement them in R, this book is for you.
- Practical Markov Chain Monte Carlo - Jun 26, 2020.
This is a slightly more intricate example of MCMC, compared to many with a fairly simple model, a single predictor (maybe two), and not much else, which highlights a couple of issues and tricks worth noting for a handwritten implementation.
- Data Science Tools Popularity, animated - Jun 25, 2020.
Watch the evolution of the top 10 most popular data science tools based on KDnuggets software polls from 2000 to 2019.
- Build a Branded Web Based GIS Application Using R, Leaflet and Flexdashboard - Jun 24, 2020.
By using R, Flexdashboard and Leaflet, we can build a customized and branded web application to showcase location based data interactively across the organization. Instead of crowding the application with many widgets, we use menu tabs and pages to separate the interactive aspects.
- modelStudio and The Grammar of Interactive Explanatory Model Analysis - Jun 19, 2020.
modelStudio is an R package that automates the exploration of ML models and allows for interactive examination. It works in a model agnostic fashion, therefore is compatible with most of the ML frameworks.
- Fighting Disease with Data: Q&A with Epidemiologist Amrish Baidjoe - Jun 11, 2020.
Data science tools are powerful for investigating the current pandemic and other outbreaks, when accurate and actionable data are crucial. Epidemiologist and R Epidemics Consortium leader Amrish Baidjoe shared his insights into using data science to fight disease, from modeling to automation to new technologies.
- Python for data analysis… is it really that simple?!? - Apr 2, 2020.
The article addresses a simple data analytics problem, comparing a Python and Pandas solution to an R solution (using plyr, dplyr, and data.table), as well as kdb+ and BigQuery solutions. Performance improvement tricks for these solutions are then covered, as are parallel/cluster computing approaches and their limitations.
- Time Series Classification Synthetic vs Real Financial Time Series - Mar 18, 2020.
This article discusses distinguishing between real financial time series and synthetic time series using XGBoost.
- Decision Boundary for a Series of Machine Learning Models - Mar 13, 2020.
I train a series of Machine Learning models using the iris dataset, construct synthetic data from the extreme points within the data and test a number of Machine Learning models in order to draw the decision boundaries from which the models make predictions in a 2D space, which is useful for illustrative purposes and understanding on how different Machine Learning models make predictions.
- KDnuggets™ News 20:n09, Mar 4: When Will AutoML replace Data Scientists (if ever) – vote; 20 AI, DS, ML Terms You Need to Know (part 2) - Mar 4, 2020.
- Python and R Courses for Data Science - Feb 26, 2020.
Since Python and R are a must for today's data scientists, continuous learning is paramount. Online courses are arguably the best and most flexible way to upskill throughout one's career.
- KDnuggets™ News 20:n08, Feb 26: Gartner 2020 Magic Quadrant for Data Science & Machine Learning Platforms; Will AutoML Replace Data Scientists? - Feb 26, 2020.
This week in KDnuggets: The Death of Data Scientists - will AutoML replace them?; Leaders, Changes, and Trends in Gartner 2020 Magic Quadrant for Data Science and Machine Learning Platforms; Hand labeling is the past. The future is #NoLabel AI; The Forgotten Algorithm; Getting Started with R Programming; and much, much more.
- Getting Started with R Programming - Feb 19, 2020.
An end-to-end data analysis using R, the second most requested programming language in Data Science.
- Introduction to Geographical Time Series Prediction with Crime Data in R, SQL, and Tableau - Feb 14, 2020.
When reviewing geographical data, it can be difficult to prepare the data for an analysis. This article helps by covering importing data into a SQL Server database; cleansing and grouping data into a map grid; adding time data points to the set of grid data and filling in the gaps where no crimes occurred; importing the data into R; and running an XGBoost model to determine where crimes will occur on a specific day.
- Basics of Audio File Processing in R - Feb 11, 2020.
This post provides basic information on audio processing using R as the programming language. It also walks through some basics of sound and digital audio.
- Serverless Machine Learning with R on Cloud Run - Feb 4, 2020.
Expedite the deployment of your machine learning models using serverless cloud infrastructure. In this tutorial, we explore creating and deploying a model which scrapes real-time Twitter data and returns an interactive visualization using R.
- Classify A Rare Event Using 5 Machine Learning Algorithms - Jan 15, 2020.
Which algorithm works best for unbalanced data? Are there any tradeoffs?
- Beginner’s Guide to K-Nearest Neighbors in R: from Zero to Hero - Jan 3, 2020.
This post presents a pipeline of building a KNN model in R with various measurement metrics.
- Plotnine: Python Alternative to ggplot2 - Dec 12, 2019.
Python's plotting libraries such as matplotlib and seaborn do allow the user to create elegant graphics as well, but the lack of a standardized syntax for implementing the grammar of graphics, compared to the simple, readable, layered approach of ggplot2 in R, makes it more difficult to implement in Python.
- Data Science for Managers: Programming Languages - Nov 19, 2019.
In this article, we are going to talk about popular languages for Data Science and briefly describe each of them.
- How to Visualize Data in Python (and R) - Nov 14, 2019.
Producing accessible data visualizations is a key data science skill. The following guidelines will help you create the best representations of your data using R and Python's Pandas library.
- KDnuggets™ News 19:n43, Nov 13: Dynamic Reports in Python and R; Creating NLP Vocabularies; What is Data Science? - Nov 13, 2019.
On KDnuggets this week: Orchestrating Dynamic Reports in Python and R with Rmd Files; How to Create a Vocabulary for NLP Tasks in Python; What is Data Science?; The Complete Data Science LinkedIn Profile Guide; Set Operations Applied to Pandas DataFrames; and much, much more.
- Orchestrating Dynamic Reports in Python and R with Rmd Files - Nov 8, 2019.
Do you want to extract csv files with Python and visualize them in R? How does preparing everything in R and drawing conclusions with Python sound? Both are possible if you know the right libraries and techniques. Here, we’ll walk through a use-case using both languages in one analysis.
- Customer Segmentation for R Users - Sep 26, 2019.
This article shows you how to separate your customers into distinct groups based on their purchase behavior. For the R enthusiasts out there, I demonstrated what you can do with r/stats, ggradar, ggplot2, animation, and factoextra.
- Scikit-Learn vs mlr for Machine Learning - Sep 10, 2019.
How does the scikit-learn machine learning library for Python compare to the mlr package for R? Follow along with a machine learning workflow through each approach, and see if you can gain a competitive advantage by knowing both frameworks.
- KDnuggets™ News 19:n33, Sep 4: Data Science Skills Poll; Object-oriented Programming for Data Scientists - Sep 4, 2019.
This week: Object-oriented programming for data scientists; Deep Learning Next Step: Transformers and Attention Mechanism; R Users' Salaries from the 2019 Stackoverflow Survey; Types of Bias in Machine Learning; 4 Tips for Advanced Feature Engineering and Preprocessing; and much more!
- R Users’ Salaries from the 2019 Stackoverflow Survey - Aug 30, 2019.
Let’s take a look at what R users are saying about their salaries. Note that the following results could be biased because of unrepresentative and in some cases small samples.
- Coding Random Forests® in 100 lines of code* - Aug 7, 2019.
There are dozens of machine learning algorithms out there. It is impossible to learn all their mechanics; however, many algorithms sprout from the most established algorithms, e.g. ordinary least squares, gradient boosting, support vector machines, tree-based algorithms and neural networks.
- KDnuggets™ News 19:n29, Aug 7: What 70% of Data Science Learners Do Wrong; Pytorch Cheat Sheet for Beginners - Aug 7, 2019.
This week on KDnuggets: What 70% of Data Science Learners Do Wrong; Pytorch Cheat Sheet for Beginners and Udacity Deep Learning Nanodegree; How a simple mix of object-oriented programming can sharpen your deep learning prototype; Can we trust AutoML to go on full autopilot?; Ten more random useful things in R you may not know about; 25 Tricks for Pandas; and much more!
- Ten more random useful things in R you may not know about - Jul 31, 2019.
I had a feeling that R has developed as a language to such a degree that many of us are using it now in completely different ways. This means that there are likely to be numerous tricks, packages, functions, etc. that each of us uses, but that others are completely unaware of, and would find useful if they knew about them.
- Kaggle Kernels Guide for Beginners: A Step by Step Tutorial - Jul 23, 2019.
This is an attempt to hold the hands of a complete beginner and walk them through the world of Kaggle Kernels — for them to get started.
- The Evolution of a ggplot - Jul 18, 2019.
A step-by-step tutorial showing how to turn a default ggplot into an appealing and easily understandable data visualization in R.
- How to Make Stunning 3D Plots for Better Storytelling - Jul 17, 2019.
3D Plots built in the right way for the right purpose are always stunning. In this article, we’ll see how to make stunning 3D plots with R using ggplot2 and rayshader.
- Modelplotr v1.0 now on CRAN: Visualize the Business Value of your Predictive Models - Jun 21, 2019.
Explaining the business value of your predictive models to your business colleagues is a challenging task. Using Modelplotr, an R package, you can easily create stunning visualizations that clearly communicate the business value of your models.
- Ten random useful things in R that you might not know about - Jun 20, 2019.
Because the R ecosystem is so rich and constantly growing, people can often miss out on knowing about something that can really help them in a task that they have to complete.
- KDnuggets™ News 19:n23, Jun 19: Useful Stats for Data Scientists; Python, TensorFlow & R Winners in Latest Job Report - Jun 19, 2019.
This week on KDnuggets: 5 Useful Statistics Data Scientists Need to Know; Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS; How to Learn Python for Data Science the Right Way; The Machine Learning Puzzle, Explained; Scalable Python Code with Pandas UDFs; and much more!
- Data Science Jobs Report 2019: Python Way Up, TensorFlow Growing Rapidly, R Use Double SAS - Jun 17, 2019.
Data science jobs continue to grow in 2019, and this report shares the change and spread of jobs by software over recent years.
- What you need to know: The Modern Open-Source Data Science/Machine Learning Ecosystem - Jun 10, 2019.
We identify the 6 tools in the modern open-source Data Science ecosystem, examine the Python vs R question, and determine which tools are used the most with Deep Learning and Big Data.
- The Whole Data Science World in Your Hands - Jun 5, 2019.
Testing MatrixDS capabilities on different languages and tools: Python, R and Julia. If you work with data you have to check this out.
- Python leads the 11 top Data Science, Machine Learning platforms: Trends and Analysis - May 30, 2019.
Python continues to lead the top Data Science platforms, but R and RapidMiner hold their share; Almost 50% have used Deep Learning tools; SQL is steady; Consolidation continues.
- Powerful like your local notebook. Sharable like a Google Doc. - Apr 30, 2019.
Mode is the only analytics platform with native Python and R Notebooks. Get everyone up and running in minutes by delivering Notebook-powered results right in your browser. Now anyone on your team can re-run R- and Python-powered reports themselves—without ever touching code.
- KDnuggets™ News 19:n16, Apr 24: Data Visualization in Python with Matplotlib & Seaborn; Getting Into Data Science: The Ultimate Q&A - Apr 24, 2019.
Best Data Visualization Techniques for small and large data; The Rise of Generative Adversarial Networks; Approach pre-trained deep learning models with caution; How Optimization Works; Building a Flask API to Automatically Extract Named Entities Using SpaCy
- The Mueller Report Word Cloud: A brief tutorial in R - Apr 22, 2019.
Word clouds are simple visual summaries of the most frequently used words in a text, presenting essentially the same information as a histogram in a form that is somewhat less precise and vastly more eye-catching. Get a quick sense of the themes in the recently released Mueller Report and its 448 pages of legal content.
- Because analysis is more than just dashboards - Apr 11, 2019.
Where traditional BI tools often make it easy to build dashboards, Mode makes it easy for you to answer any follow-up questions when you see changes in those dashboards. Choose the level of abstraction you want for a given dataset and quickly get to the story behind the change.
- R vs Python for Data Visualization - Mar 25, 2019.
This article demonstrates creating similar plots in R and Python using two of the most prominent data visualization packages on the market, namely ggplot2 and Seaborn.
- Top R Packages for Data Cleaning - Mar 15, 2019.
Data cleaning is one of the most important and time-consuming tasks for data scientists. Here are the top R packages for data cleaning.
- Advanced Analytics & Data Gov Training in Chicago: ML, DL, Self-Service, Strategy, and more - Mar 13, 2019.
Learn effective data governance practices and how to successfully implement advanced analytics by attending our industry leading training at TDWI Chicago, April 28 - May 3, and take your projects to the next level.
- Who is a typical Data Scientist in 2019? - Mar 11, 2019.
We investigate what a typical data scientist looks like and see how this differs from this time last year, looking at skill set, programming languages, industry of employment, country of employment, and more.
- Don’t do analysis in a vacuum - Feb 22, 2019.
Traditional tools force analysts to play the import-and-export game, so it's difficult to keep data fresh and accessible. Every Mode report or dashboard lives at a unique URL for future sharing, iterating, and building upon. Mode brings your entire team together in one platform.
- Running R and Python in Jupyter - Feb 19, 2019.
The Jupyter Project began in 2014 for interactive and scientific computing. Fast forward 5 years and Jupyter is now one of the most widely adopted Data Science IDEs on the market, giving the user access to Python and R.
- Understanding Gradient Boosting Machines - Feb 6, 2019.
Despite the massive popularity of gradient boosting, many professionals still use this algorithm as a black box. As such, the purpose of this article is to lay an intuitive framework for this powerful machine learning technique.
- Using Caret in R to Classify Term Deposit Subscriptions for a Bank - Feb 4, 2019.
This article uses direct marketing campaign data from a Portuguese banking institution to predict if a customer will subscribe for a term deposit. We’ll be working with R’s Caret package to achieve this.
- Airbnb Rental Listings Dataset Mining - Jan 28, 2019.
An Exploratory Analysis of Airbnb’s Data to understand the rental landscape in New York City.
- 2018’s Top 7 R Packages for Data Science and AI - Jan 22, 2019.
This is a list of the best packages that changed our lives this year, compiled from my weekly digests.
- Deep learning in Satellite imagery - Dec 26, 2018.
This article outlines possible sources of satellite imagery, what its properties are and how this data can be utilised using R.
- Exploring the Data Jungle Free eBook - Dec 18, 2018.
This free eBook by Brian Godsey will provide you with real-world examples in Python, R, and other languages suitable for data science.
- Automated Web Scraping in R - Dec 11, 2018.
How to automatically web scrape periodically so you can analyze timely/frequently updated data.
- Data Science Projects Employers Want To See: How To Show A Business Impact - Dec 4, 2018.
The best way to create better data science projects that employers want to see is to provide a business impact. This article highlights the process using customer churn prediction in R as a case-study.
- Best Machine Learning Languages, Data Visualization Tools, DL Frameworks, and Big Data Tools - Dec 3, 2018.
We cover a variety of topics, from machine learning to deep learning, from data visualization to data tools, with comments and explanations from experts in the relevant fields.
- SQL, Python, and R in One Platform - Nov 27, 2018.
Stop jumping between applications. Get a complete analytical toolkit.
- SQL, Python, & R in One Platform - Oct 26, 2018.
No more jumping between applications. Mode Studio combines a SQL editor, Python and R notebooks, and a visualization builder in one platform.
- Apache Spark Introduction for Beginners - Oct 18, 2018.
An extensive introduction to Apache Spark, including a look at the evolution of the product, use cases, architecture, ecosystem components, core concepts and more.
- SQL, Python, & R: All in One Platform - Oct 11, 2018.
Mode Studio connects a SQL editor, Python and R notebooks, and a visualization builder in one platform. Sign up now for access.
- Evaluating the Business Value of Predictive Models in Python and R - Oct 11, 2018.
In these blogs for R and Python we explain four valuable evaluation plots to assess the business value of a predictive model. We show how you can easily create these plots and help you to explain your predictive model to non-techies.
- KDnuggets™ News 18:n37, Oct 3: Mathematics of Machine Learning; Effective Transfer Learning for NLP; Path Analysis with R - Oct 3, 2018.
Also: Introducing VisualData: A Search Engine for Computer Vision Datasets; Raspberry Pi IoT Projects for Fun and Profit; Recent Advances for a Better Understanding of Deep Learning; Basic Image Data Analysis Using Python - Part 3; Introduction to Deep Learning
- Introducing Path Analysis Using R - Sep 27, 2018.
Path analysis is an extension of multiple regression. It allows for the analysis of more complicated models.
- Optimization 101 for Data Scientists - Aug 8, 2018.
We show how to use optimization strategies to make the best possible decision.
- From Data to Viz: how to select the right chart for your data - Aug 1, 2018.
We offer an interactive, decision tree-style tool, which examines the data you have and proposes a set of potentially appropriate visualizations to represent your dataset.
- Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks - Jul 27, 2018.
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around.
- Dimensionality Reduction : Does PCA really improve classification outcome? - Jul 13, 2018.
In this post, I am going to verify this statement using a Principal Component Analysis ( PCA ) to try to improve the classification performance of a neural network over a dataset.
- 5 of Our Favorite Free Visualization Tools - Jul 5, 2018.
5 key free data visualization tools that can provide flexible and effective data presentation.
- [ebook] Apache Spark™ Under the Hood - Jun 27, 2018.
Learn how to install and run Spark yourself; A summary of Spark core architecture and concepts; Spark's powerful language APIs and how you can use them.
- KDnuggets™ News 18:n25, Jun 27: 5 Clustering Algorithms Data Scientists Need to Know; Detecting Sarcasm with Deep Convolutional Neural Networks? - Jun 27, 2018.
Also 30 Free Resources for Machine Learning, Deep Learning, NLP ; 7 Simple Data Visualizations You Should Know in R.
- Stagraph – a general purpose R GUI, for data import, wrangling, and visualization - Jun 25, 2018.
Stagraph is a new simple visual interface for R, which focuses on data import, data wrangling and data visualization.
- How to Execute R and Python in SQL Server with Machine Learning Services - Jun 25, 2018.
Machine Learning Services in SQL Server eliminates the need for data movement - you can install and run R/Python packages to build Deep Learning and AI applications on data in SQL Server.
- 7 Simple Data Visualizations You Should Know in R - Jun 22, 2018.
This post presents a selection of 7 essential data visualizations, and how to recreate them using a mix of base R functions and a few common packages.
- KDnuggets™ News 18:n23, Jun 13: Did Python declare victory over R?; Master the Netflix Interview; Deep Learning Projects DIY Style - Jun 13, 2018.
Also: Command Line Tricks For Data Scientists; How (dis)similar are my train and test data?; 5 Machine Learning Projects You Should Not Overlook, June 2018; Introduction to Game Theory; Human Interpretable Machine Learning
- The 6 components of Open-Source Data Science/ Machine Learning Ecosystem; Did Python declare victory over R? - Jun 6, 2018.
We find 6 tools form the modern open source Data Science / Machine Learning ecosystem; examine whether Python declared victory over R; and review which tools are most associated with Deep Learning and Big Data.
- KDnuggets™ News 18:n22, Jun 6: 10 More Free Must-Read Books for Machine Learning and Data Science; Beginner Guide to Data Science Pipeline - Jun 6, 2018.
Summer. Time to sit back and unwind. Or get your hands on some free machine learning and data science books and learn! Here is a great selection to get started.
- Using Linear Regression for Predictive Modeling in R - Jun 1, 2018.
In this post, we’ll use linear regression to build a model that predicts cherry tree volume from metrics that are much easier for folks who study trees to measure.
- Virtual Training Events Without Leaving Your Desk - May 30, 2018.
Check out our lineup of upcoming virtual seminars, online learning courses, and customized training in your office. Space is limited, so reserve your seat early and score the best savings!
- Top 20 R Libraries for Data Science in 2018 - May 25, 2018.
We have prepared an infographic of the Top 20 R packages for data science, which covers the libraries' main features and GitHub activities, as all of the libraries are open-source.
- Modelling Time Series Processes using GARCH - May 25, 2018.
To go into the turbulent seas of volatile data and analyze it in a time changing setting, ARCH models were developed.
- How to tackle common data cleaning issues in R - May 24, 2018.
R is a great choice for manipulating, cleaning, summarizing, producing probability statistics, and so on. In addition, it's not going away anytime soon; it is platform independent, so what you create will run almost anywhere; and it has awesome help resources.
- Python eats away at R: Top Software for Analytics, Data Science, Machine Learning in 2018: Trends and Analysis - May 22, 2018.
Python continues to eat away at R, RapidMiner gains, SQL is steady, Tensorflow advances pulling along Keras, Hadoop drops, Data Science platforms consolidate, and more.
- Optimization Using R - May 18, 2018.
Optimization is a technique for finding the best possible solution to a given problem from among all possible solutions. Optimization uses a rigorous mathematical model to find the most efficient solution to the given problem.
- R Fundamentals: Building a Simple Grade Calculator - Mar 19, 2018.
In this tutorial, we'll teach you the basics of R by building a simple grade calculator. While we do not assume any R-specific knowledge, you should be familiar with general programming concepts.
- New Book: Credit risk analytics, The R Companion - Mar 16, 2018.
Credit risk analytics in R will enable you to build credit risk models from start to finish. With access to real credit data on the accompanying website, you will master a wide range of applications.
- KDnuggets™ News 18:n11, Mar 14: Two sides of getting a job as a Data Scientist; 5 things to know about Machine Learning - Mar 14, 2018.
Also 18 Inspiring Women In AI, Big Data, Data Science, Machine Learning; Great Data Scientists Don't Just Think Outside the Box; Favorite Data Science / Machine Learning Blog; Text Processing in R.
- Choropleth Maps in R - Mar 12, 2018.
Choropleth maps provide a simple, easy-to-understand visualization of a measurement across different geographical areas, be it states or countries.
- Text Processing in R - Mar 9, 2018.
There are good reasons to want to use R for text processing, namely that we can do it, and that we can fit it in with the rest of our analyses. Furthermore, there is a lot of very active development going on in the R text analysis community right now.
- TDWI Chicago, May 6-11: Get Your Hands Dirty With Data – KDnuggets Offer - Mar 2, 2018.
Attend the Hands-on Lab series and bring practical skills back from Chicago. Save 30% through March 16 with priority code KD30.
- Control Structures in R: Using If-Else Statements and Loops - Feb 23, 2018.
Control structures allow you to control the flow of execution of your code. They are extremely useful if you want to run a piece of code multiple times, or if you want to run a piece of code only when a certain condition is met.
- Building a Daily Bitcoin Price Tracker with Coindeskr and Shiny in R - Feb 7, 2018.
This tutorial helps R users build their own Daily Bitcoin Price Tracker using three packages: Coindeskr, Shiny, and Dygraphs.
- Data Science vs Addiction: Estimating Opioid Abuse by Location - Jan 26, 2018.
Data science can help find the optimal locations for drug treatment facilities, even in the face of major data challenges.
- Deep Learning in H2O using R - Jan 22, 2018.
This article is about implementing Deep Learning (DL) using the H2O package in R. We start with a background on DL, then cover some features of H2O's DL framework, and finish with an implementation using R.
- Propensity Score Matching in R - Jan 18, 2018.
Propensity scores are an alternative method to estimate the effect of receiving treatment when random assignment of treatments to subjects is not feasible.
- KDnuggets™ News 18:n03, Jan 17: Top 10 TED Talks on Data Science, Machine Learning; How Docker Can Help You Become A More Effective Data Scientist - Jan 17, 2018.
Also: A Primer on Web Scraping in R; Elasticsearch for Dummies; Generative Adversarial Networks, an overview.
- Topological Data Analysis for Data Professionals: Beyond Ayasdi - Jan 16, 2018.
We review recent developments and tools in topological data analysis, including applications of persistent homology to psychometrics and a recent extension of piecewise regression, called Morse-Smale regression.
- A Primer on Web Scraping in R - Jan 12, 2018.
If you are a data scientist who wants to capture data from web pages, you won't want to open all of those pages manually and scrape them one by one. R offers packages that remove this barrier to accessing data from web pages.
- 10 Tools to Help You Learn R - Jan 4, 2018.
There are several tools to help you grasp the foundational principles and more. The list below gives you an idea of what’s available and how much it costs.
- Simple Ways Of Working With Medium To Big Data Locally - Dec 27, 2017.
An overview of the installation and implementation of simple techniques for working with large datasets on your machine.
- How (& Why) Data Scientists and Data Engineers Should Share a Platform - Nov 17, 2017.
Sharing one platform has some obvious benefits for Data Science and Data Engineering teams, but technical, language, and process challenges often make this difficult. Learn how one company implemented a single cloud platform for R, Python, and other workloads, and some of the unexpected benefits they discovered along the way.
- Extracting Tweets With R - Nov 14, 2017.
This article gives you a brief overview of extracting Tweets using R.
- Tips for Getting Started with Text Mining in R and Python - Nov 8, 2017.
This article opens up the world of text mining in a simple and intuitive way and provides great tips to get started with text mining.
- Process Mining with R: Introduction - Nov 2, 2017.
In recent years, several niche tools have appeared to mine organizational business processes. In this article, we’ll show you that it is possible to get started with “process mining” using well-known data science programming languages as well.