Using Python and R together: 3 main approaches

Well, if Data Science and Data Scientists cannot decide on what data to choose to help them decide which language to use, here is an article on how to use BOTH.



By Ajay Ohri, DecisionStats.


Why would anyone want to use R and Python in the same software? Have we not seen enough debates on the internet about which language is better? Even on this website (an industry bellwether) I could count four such articles: one by DataCamp, one by Stitch Fix, a debate, and a KDnuggets poll. Well, if Data Science and Data Scientists cannot decide on what data to choose to help them decide which language to use, here is an article on how to use BOTH.

Data Scientists:

Think Python AND R

and not just PYTHON OR R

Here are some reasons to do so:

  1. Both are good, stable languages with interesting complementary qualities. You can find much better packages for a task in one language and then stitch them together with data from the other. An example is doing time series forecasting (forecast::auto.arima) and decision trees (http://www.statmethods.net/advstats/cart.html) in R while doing the data munging in Python.
  2. Both languages borrow from each other. Even seasoned package developers like Hadley Wickham (RStudio) borrow from Beautiful Soup (Python) to make rvest for web scraping. Yhat borrows from sqldf to make pandasql. Rather than reinvent the wheel in the other language, developers can focus on innovation.
  3. The customer does not care which language the code was written in; the customer cares about insights.
  4. You are less likely to be tied down by a bug, a feature request, or a version compatibility issue.
  5. It is sexier for data scientists to be skilled in both (or cooler, as we older guys liked to say before Harvard Business Review proclaimed data scientist the sexiest job).
  6. There are only four main languages within Data Science (~91% by KDnuggets Poll) and everyone can use SQL from their own language. There is no debate on SQL.

So how do you do it? As of December 2015, there are three principal ways to use BOTH Python and R:

  1. Use the Python package rpy2 to use R within Python. You can also use Python from within R using the rPython package.
  2. Use Jupyter with the IR kernel: the Jupyter project is named after Julia, Python, and R, and makes the interactivity of IPython available to other languages.
  3. Use Beaker Notebook: inspired by Jupyter, Beaker Notebook lets you switch from one language in one code block to another language in the next, with a streamlined way to pass shared objects (data) between them.
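
To make the first approach concrete, here is a minimal sketch of calling R from Python through rpy2. It assumes that both R and the rpy2 package are installed. The helper name r_mean is hypothetical and only there for illustration, and the example sticks to base R's mean() so no extra R packages are needed. If rpy2 cannot be loaded, the sketch simply returns None instead of failing.

```python
# Sketch only: call base R's mean() from Python via rpy2.
# Assumption: R and the rpy2 package are both installed and discoverable.
try:
    import rpy2.robjects as ro
except Exception:  # rpy2 missing, or R itself could not be located
    ro = None


def r_mean(values):
    """Hypothetical helper: average a list of Python floats using R."""
    if ro is None:
        return None  # R bridge unavailable; nothing to compute
    r_vector = ro.FloatVector(values)  # convert the Python list to an R numeric vector
    r_mean_fn = ro.r["mean"]           # look up R's mean() function by name
    return r_mean_fn(r_vector)[0]      # R returns a length-1 vector; take its only element


result = r_mean([1.0, 2.0, 3.0, 4.0])
print("Mean computed by R:", result)
```

The same pattern (convert the data to an R vector, look up an R function by name, call it) should carry over to functions from installed R packages, though the details of converting richer objects such as data frames are not shown here.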

I wish I could say all these methods are simple and streamlined, but they are not. I enjoy Python’s power in data munging and I enjoy R’s huge library of packages and functions for statistics. Will using R and Python together grow in the future? Let’s see. But in case you did not know, you can also use the SAS language with R and Python (using Java). Now that is cool for sure.

Bio: Ajay Ohri is the founder of analytics startup DECISIONSTATS. He is the author of “R for Business Analytics” and “R for Cloud Computing”.
