- 15 common mistakes data scientists make in Python (and how to fix them) - Mar 3, 2021.
Writing Python code that works for your data science project and performs the task you expect is one thing. Ensuring your code is readable by others (including your future self), reproducible, and efficient are entirely different challenges that can be addressed by minimizing common bad practices in your development.
- Build Your First Data Science Application - Feb 4, 2021.
Check out these seven Python libraries to make your first data science MVP application.
- JupyterLab 3 is Here: Key reasons to upgrade now - Jan 8, 2021.
Read about these 3 reasons for checking out JupyterLab 3 today.
- Best Python IDEs and Code Editors You Should Know - Jan 8, 2021.
Developing machine learning algorithms requires implementing countless libraries and integrating many supporting tools and software packages. All this magic must be written by you in yet another tool -- the IDE -- that is fundamental to all your code work and can drive your productivity. These top Python IDEs and code editors are among the best tools available for you to consider, and are reviewed with their noteworthy features.
- Stop Running Jupyter Notebooks From Your Command Line - Oct 28, 2020.
Instead, run your Jupyter Notebook as a standalone web app.
- How to be a 10x data scientist - Oct 12, 2020.
If you are a Data Scientist looking to make it to the next level, then there are many opportunities to up your game and your efficiency to stand out from the others. Some of these recommendations that you can follow are straightforward, and others are rarely followed, but they will all pay back in dividends of time and effectiveness for your career.
- Here are the Most Popular Python IDEs/Editors - Oct 6, 2020.
Jupyter Notebook continues to lead as the most popular Python IDE, but its share has declined since the last poll. The top 4 contenders have remained the same, but only one has significantly improved its share. We also examine the breakdown by employment and region.
- 4 Tools to Speed Up Your Data Science Writing - Sep 9, 2020.
This article covers how you can achieve your writing goals with these 4 tools.
- Data Science Meets Devops: MLOps with Jupyter, Git, and Kubernetes - Aug 21, 2020.
An end-to-end example of deploying a machine learning product using Jupyter, Papermill, Tekton, GitOps and Kubeflow.
- Netflix’s Polynote is a New Open Source Framework to Build Better Data Science Notebooks - Aug 5, 2020.
The new notebook environment provides substantial improvements to streamline experimentation in machine learning workflows.
- A Complete Guide To Survival Analysis In Python, part 3 - Jul 30, 2020.
Concluding this three-part series covering a step-by-step review of statistical survival analysis, we look at a detailed example implementing the Kaplan-Meier fitter based on different groups, a Log-Rank test, and Cox Regression, all with examples and shared code.
- Apache Spark Cluster on Docker - Jul 22, 2020.
Build your own Apache Spark cluster in standalone mode on Docker with a JupyterLab interface.
- A Complete guide to Google Colab for Deep Learning - Jun 16, 2020.
Google Colab is a widely popular cloud service for machine learning that features free access to GPU and TPU computing. Follow this detailed guide to help you get up and running fast to develop your next deep learning algorithms with Colab.
- Count, the data notebook everyone can use - Jun 9, 2020.
Dashboards have been the primary weapon of choice for distributing data over the last few decades, but they have brought with them a new set of problems. To increasingly democratise access to data we need to think again.
- Interactive Machine Learning Experiments - May 26, 2020.
Dive into experimenting with machine learning techniques using this open-source collection of interactive demos built on multilayer perceptrons, convolutional neural networks, and recurrent neural networks. Each package consists of ready-to-try web browser interfaces and fully-developed notebooks for you to fine tune the training for better performance.
- 5 Great New Features in Scikit-learn 0.23 - May 15, 2020.
Check out 5 new features of the latest Scikit-learn release, including the ability to visualize estimators in notebooks, improvements to both k-means and gradient boosting, some new linear model implementations, and sample weight support for a pair of existing regressors.
- Coding habits for data scientists - May 14, 2020.
While the core machine learning algorithms might only take up a few lines of code, it's the rest of your program that can get messy fast. Learn about some techniques for identifying bad coding habits in ML that add to complexity in code, as well as for starting new habits that can help partition complexity.
- The Super Duper NLP Repo: 100 Ready-to-Run Colab Notebooks - Apr 24, 2020.
Check out this repository of more than 100 freely-accessible NLP notebooks, curated from around the internet, and ready to launch in Colab with a single click.
- Dockerize Jupyter with the Visual Debugger - Apr 17, 2020.
A step-by-step guide to enabling and using visual debugging in Jupyter in a Docker container.
- Better notebooks through CI: automatically testing documentation for graph machine learning - Apr 16, 2020.
In this article, we’ll walk through the detailed and helpful continuous integration (CI) that supports us in keeping StellarGraph’s demos current and informative.
- The 4 Best Jupyter Notebook Environments for Deep Learning - Mar 19, 2020.
Many cloud providers and other third-party services see the value of a Jupyter notebook environment, which is why many companies now offer cloud-hosted notebooks. Let's have a look at 3 such environments.
- 5 Google Colaboratory Tips - Mar 2, 2020.
Are you looking for some tips for using Google Colab for your projects? This article presents five you may find useful.
- Introducing fastpages: An easy to use blogging platform with extra features for Jupyter Notebooks - Feb 27, 2020.
This article introduces the easy-to-use blogging platform fastpages. fastpages relies on GitHub Pages for hosting and GitHub Actions to automate the creation of your blog, and contains extra features for Jupyter Notebooks.
- How to Optimize Your Jupyter Notebook - Jan 30, 2020.
This article walks through some simple tricks on improving your Jupyter Notebook experience, and covers useful shortcuts, adding themes, automatically generated table of contents, and more.
- Alternative Cloud Hosted Data Science Environments - Dec 19, 2019.
Over the years, new alternative providers have risen to provide a single data science environment hosted on the cloud for data scientists to analyze, host and share their work.
- The Notebook Anti-Pattern - Nov 21, 2019.
This article aims to explain why the drive towards the use of notebooks in production is an anti-pattern, giving some suggestions along the way.
- Automatic Version Control for Data Scientists - Sep 24, 2019.
How can you keep your machine learning models and data organized so you can collaborate effectively? Discover this new tool set available for better version control designed for the data scientist workflow.
- The Easy Way to Do Advanced Data Visualisation for Data Scientists - Aug 13, 2019.
Creating effective data visualisations is a core skill for data scientists. This tutorial will guide you through how to easily develop interactive visualisations using the Python library plotly.
- Easy, One-Click Jupyter Notebooks - Jul 24, 2019.
All of the setup for software, networking, security, and libraries is automatically taken care of by the Saturn Cloud system. Data scientists can then focus on the actual data science and not the tedious infrastructure work that surrounds it.
- 10 Simple Hacks to Speed up Your Data Analysis in Python - Jul 11, 2019.
This article lists some curated tips for working with Python and Jupyter Notebooks, covering topics such as easily profiling data, formatting code and output, debugging, and more. Hopefully you can find something useful within.
- Why do we need AWS SageMaker? - Jun 26, 2019.
Today, there are several platforms available in the industry that help software developers, data scientists, and laypeople alike develop and deploy machine learning models in no time.
- How to select rows and columns in Pandas using [ ], .loc, .iloc, .at and .iat - Jun 19, 2019.
Subset selection is one of the most frequently performed tasks while manipulating data. Pandas provides different ways to efficiently select subsets of data from your DataFrame.
- How to Learn Python for Data Science the Right Way - Jun 14, 2019.
The biggest mistake you can make while learning Python for data science is to learn Python programming from courses meant for programmers. Avoid this mistake, and learn Python the right way by following this approach.
- Top KDnuggets Tweets, Jun 5 – 11: A New Extension to Organize your Code on Jupyter Notebooks; Data Science Cheat Sheet - Jun 12, 2019.
Also: Cognitive Biases are Making Sure You Aren’t So Smart; 3 Machine Learning Books that Helped me Level Up as a Data Scientist; Mastering Intermediate Machine Learning with Python
- Overview of Different Approaches to Deploying Machine Learning Models in Production - Jun 12, 2019.
Learn the different methods for putting machine learning models into production, and to determine which method is best for which use case.
- Jupyter Notebooks: Data Science Reporting - Jun 6, 2019.
Jupyter does bring us some benefits in being able to organize code, but many of us still find ourselves with messy and unnecessary code chunks. Here are some ways, including a new extension, that anyone can use to begin organizing code in notebooks.
- The Whole Data Science World in Your Hands - Jun 5, 2019.
Testing MatrixDS capabilities on different languages and tools: Python, R and Julia. If you work with data you have to check this out.
- Running R and Python in Jupyter - Feb 19, 2019.
The Jupyter Project began in 2014 for interactive and scientific computing. Fast forward 5 years, and Jupyter is now one of the most widely adopted data science IDEs on the market, giving the user access to Python and R.
- 7 Steps to Mastering Basic Machine Learning with Python — 2019 Edition - Jan 29, 2019.
With a new year upon us, I thought it would be a good time to revisit the concept and put together a new learning path for mastering machine learning with Python. With these 7 steps you can master basic machine learning with Python!
- 3 More Google Colab Environment Management Tips - Jan 2, 2019.
This is a short collection of lessons learned using Colab as my main coding learning environment for the past few months. Some tricks are Colab specific, others are general Jupyter tips, and still more are filesystem related, but all have proven useful for me.
- Here are the most popular Python IDEs / Editors - Dec 7, 2018.
We report on the most popular IDEs and editors, based on our poll. Jupyter is the favorite across all regions and employment types, but there is competition for the no. 2 and no. 3 spots.
- Best Machine Learning Languages, Data Visualization Tools, DL Frameworks, and Big Data Tools - Dec 3, 2018.
We cover a variety of topics, from machine learning to deep learning, from data visualization to data tools, with comments and explanations from experts in the relevant fields.
- What is the Best Python IDE for Data Science? - Nov 14, 2018.
Before you start learning Python, choose the IDE that suits you the best. We examine many available tools, their pros and cons, and suggest how to choose the best Python IDE for you.
- [Download] Real-Life ML Examples + Notebooks - Nov 13, 2018.
In this eBook, we will walk you through four Machine Learning use cases on Databricks: Loan Risk Use Case; Advertising Analytics & Prediction Use Case; Market Basket Analysis Problem at Scale; Suspicious Behavior Identification in Video Use Case. Get your copy now!
- Best Practices for Using Notebooks for Data Science - Nov 8, 2018.
Are you interested in implementing notebooks for data science? Check out these 5 things to consider as you begin the process.
- How to Set Up a Free Data Science Environment on Google Cloud - Aug 15, 2018.
In this post, we'll walk through how to set up a data science environment on Google Cloud Platform (GCP). Because of the economy of scale that cloud hosting companies provide, individuals or teams can affordably access powerful computers.
- Data Scientist guide for getting started with Docker - Aug 14, 2018.
Docker is an increasingly popular way to create and deploy applications through virtualization, but can it be useful for data scientists? This guide should help you quickly get started.
- KDnuggets™ News 18:n29, Aug 1: Building an Awesome Data Science Portfolio; Data Science + DevOps = Taming the Unicorn - Aug 1, 2018.
Also: A Practitioner's Guide to Processing & Understanding Text: Data Retrieval with Web Scraping; Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks; Best Deal in the Galaxy? Win KDnuggets Free Pass to Strata Data Conference NYC
- Remote Data Science: How to Send R and Python Execution to SQL Server from Jupyter Notebooks - Jul 27, 2018.
Did you know that you can execute R and Python code remotely in SQL Server from Jupyter Notebooks or any IDE? Machine Learning Services in SQL Server eliminates the need to move data around.
- 5 Data Science Projects That Will Get You Hired in 2018 - Jun 26, 2018.
A portfolio of real-world projects is the best way to break into data science. This article highlights the 5 types of projects that will help land you a job and improve your career.
- JupyterCon – Exclusive KDnuggets Offer - May 3, 2018.
JupyterCon returns to New York August 21-24. Save an additional 20% on individual Gold, Silver and Bronze passes with the code KDN20 before May 18.
- Jupyter Notebook for Beginners: A Tutorial - May 1, 2018.
The Jupyter Notebook is an incredibly powerful tool for interactively developing and presenting data science projects. Although it is possible to use many different programming languages within Jupyter Notebooks, this article will focus on Python as it is the most common use case.
- Creating a simple text classifier using Google CoLaboratory - Mar 15, 2018.
Google CoLaboratory is Google’s latest contribution to AI, wherein users can code in Python using a Chrome browser in a Jupyter-like environment. In this article, I share a method, and code, to create a simple binary text classifier using Scikit-learn within the Google CoLaboratory environment.
- Top 5 Best Jupyter Notebook Extensions - Mar 13, 2018.
Check out these 5 Jupyter notebook extensions to help increase your productivity.
- 5 Things to Know About Machine Learning - Mar 7, 2018.
This post will point out 5 things to know about machine learning: 5 things which you may not know, may not have been aware of, or may have once known and now forgotten.
- Jupyter Pop-up coming to Boston on March 21 - Feb 28, 2018.
Attend a day-long exploration of Jupyter's best practices and practical use cases in business and industry.
- Fast.ai Lesson 1 on Google Colab (Free GPU) - Feb 8, 2018.
In this post, I will demonstrate how to use Google Colab for fastai. You can use GPU as a backend for free for 12 hours at a time. GPU compute for free? Are you kidding me?
- Top KDnuggets tweets, Jan 3-9: A collection of Jupyter notebooks NumPy, Pandas, matplotlib, basic #Python #MachineLearning - Jan 10, 2018.
Artificial General Intelligence (AGI) in less than 50 years; Top KDnuggets tweets: 10 Free Must-Read Books for #MachineLearning and #DataScience; The Art of Learning #DataScience; Supercharging Visualization with Apache Arrow; Docker for #DataScience
- Introducing R-Brain: A New Data Science Platform - Oct 11, 2017.
R-Brain is a next-generation platform for data science built on top of JupyterLab with Docker. It supports not only R but also Python and SQL, and has integrated IntelliSense, debugging, packaging, and publishing capabilities.
- Top KDnuggets tweets, Sep 27 – Oct 03: Introduction to #Blockchains & What It Means to #BigData; 7 More Steps to Mastering #MachineLearning With #Python - Oct 4, 2017.
Also: Jupyter Notebooks are Breathtakingly Featureless - Use Jupyter Lab; The 4 Types of Data #Analytics; Aspiring Data Scientists! Learn the basics with these 7 books.
- From Notebooks to JupyterLab – The Evolution of Data Science IDEs - Aug 16, 2017.
This live webinar (Aug 22) will discuss the impact that the notebook experience has had on data science, and how JupyterLab - the next generation data science IDE - has evolved from the classic notebooks.
- JupyterCon – Collaborative Data Science, New York, August 22-25 - Jul 10, 2017.
Bloomberg, Microsoft, Netflix and others found how Jupyter Notebook, the new front end for collaborative data science, makes data a competitive advantage. Save an extra 20% with code PCKDNG.
- Exploratory Data Analysis in Python - Jul 7, 2017.
We view EDA very much like a tree: there is a basic series of steps you perform every time you perform EDA (the main trunk of the tree) but at each step, observations will lead you down other avenues (branches) of exploration by raising questions you want to answer or hypotheses you want to test.
- Getting Started with Python for Data Analysis - Jul 5, 2017.
A guide for beginners to Python for getting started with data analysis.
- How Feature Engineering Can Help You Do Well in a Kaggle Competition – Part 3 - Jul 4, 2017.
In this last post of the series, I describe how I used more powerful machine learning algorithms for the click prediction problem, as well as the ensembling techniques that took me up to the 19th position on the leaderboard (top 2%).
- How Feature Engineering Can Help You Do Well in a Kaggle Competition – Part 2 - Jun 27, 2017.
In this post, I describe the competition evaluation, the design of my cross-validation strategy, and my baseline models using statistics and tree ensembles.
- How Feature Engineering Can Help You Do Well in a Kaggle Competition – Part I - Jun 8, 2017.
As I scrolled through the leaderboard page, I found my name in the 19th position, which was in the top 2% of nearly 1,000 competitors. Not bad for the first Kaggle competition I had decided to put real effort into!
- Data Science for Newbies: An Introductory Tutorial Series for Software Engineers - May 31, 2017.
This post summarizes and links to the individual tutorials which make up this introductory look at data science for newbies, mainly focusing on the tools, with a practical bent, written by a software engineer from the perspective of a software engineering approach.
- KDnuggets™ News 17:n13, Apr 5: What makes a great data scientist? Best R Packages for Machine Learning - Apr 5, 2017.
Also: Best R Packages for Machine Learning; Deep Stubborn Networks - A Breakthrough Advance Towards Adversarial Machine Learning; A Short Guide to Navigating the Jupyter Ecosystem.
- A Short Guide to Navigating the Jupyter Ecosystem - Mar 31, 2017.
This post presents a no-nonsense overview of the Jupyter ecosystem, and a few tips, tricks and concepts you may find useful for navigating it.
- Top KDnuggets tweets, Mar 22-28: Big #DataScience: Expectation vs. Reality - Mar 29, 2017.
Also: A Gentle Introduction To Graph Theory; An Overview of #Python #DeepLearning Frameworks; The Great Algorithm Tutorial Roundup.
- Moving from R to Python: The Libraries You Need to Know - Feb 24, 2017.
Are you considering making a move from R to Python? Here are the libraries you need to know, how they stack up to their R contemporaries, and why you should learn them.
- Jupyter Notebook Best Practices for Data Science - Oct 20, 2016.
Check out this overview of Jupyter notebook best practices as they pertain to data science. Novice or expert, you may find something of use here.
- Top KDnuggets tweets, Aug 17-23: Approaching (Almost) Any #MachineLearning Problem; #Database Nirvana – can one query language rule them all? - Aug 24, 2016.
In Search of #Database Nirvana - can one query language rule them all? Google Cloud Datalab: #Jupyter meets #TensorFlow, #cloud meets local deployment; Approaching (Almost) Any #MachineLearning Problem; The Gentlest Introduction to Tensorflow Part 1.
- Visualizing 1 Billion Points of Data: Doing It Right – Aug 18 Webinar - Aug 11, 2016.
Join Continuum Analytics on August 18 for a webinar on Big Data visualization with the datashader library. Save your spot today!
- Top KDnuggets tweets, Jul 13 – Jul 19: Bayesian #MachineLearning, Explained; Introducing JupyterLab - Jul 20, 2016.
Bayesian #MachineLearning, Explained; JupyterLab: the next generation of the #Jupyter Notebook; On the importance of democratizing #ArtificialIntelligence
- Statistical Data Analysis in Python - Jul 18, 2016.
This tutorial will introduce the use of Python for statistical data analysis, using data stored as Pandas DataFrame objects, taking the form of a set of IPython notebooks.
- Top KDnuggets tweets, Jul 6 – Jul 12: Statistical Data Analysis #Python #Jupyter Notebooks; Modern Pandas Notebooks - Jul 13, 2016.
Statistical Data Analysis in #Python (#Jupyter Notebooks); Modern Pandas: idiomatic Pandas notebook collection; New (free) book by @rdpeng: #rstats Programming for #DataScience
- Jupyter+Spark+Mesos: An “Opinionated” Docker Image - May 31, 2016.
Check out "opinionated" Docker-based stacks for Jupyter, including one that combines Jupyter and Spark right out of the gate.
- R or Python? Consider learning both - Mar 8, 2016.
The key to becoming a data science professional is understanding the underlying data science concepts and working towards expanding your programming toolbox as much as you can. Hence, one should understand when to use Python and when to pick R, rather than mastering just one language.
- Using Python and R together: 3 main approaches - Dec 10, 2015.
Well, if data science and data scientists cannot decide on what data to choose to help them decide which language to use, here is an article on using both.
- Building and Sharing R packages made easy with Anaconda - Sep 15, 2015.
The "R Essentials" bundle comes with Jupyter, IRKernel, and over 80 of the most used R packages and dependencies for data science. You also get Conda, the leading package manager for data science. Free Anaconda download.