Search results for "s3"

376 documents found out of 7186 total.

  • Get Interactive Plots Directly With Pandas (Silver Blog)

    Telling a story with data is a core function for any Data Scientist, and creating data visualizations that are simultaneously illuminating and appealing can be challenging. This tutorial reviews how to create Plotly and Bokeh plots directly through Pandas plotting syntax, which will help you convert static visualizations into interactive counterparts -- and take your analysis to the next level.

    https://www.kdnuggets.com/2021/06/interactive-plots-directly-pandas.html

  • How I Doubled My Income with Data Science and Machine Learning (Gold Blog)

    Many career opportunities exist in the ever-expanding domain of data. Finding your place -- and finding your salary -- is largely up to your dedication, focus, and drive to learn. If you are an aspiring Data Scientist or have already started your professional journey, there are multiple strategies for maximizing your earning potential.

    https://www.kdnuggets.com/2021/06/double-income-data-science-machine-learning.html

  • State of Mathematical Optimization Report, 2021

    Download your copy of Gurobi's first-ever "State of Mathematical Optimization Report," which is based on data from a survey of commercial mathematical optimization users. Get yours now.

    https://www.kdnuggets.com/2021/05/gurobi-state-mathematical-optimization-report-2021.html

  • Awesome list of datasets in 100+ categories

    With an estimated 44 zettabytes of data in existence in our digital world today and approximately 2.5 quintillion bytes of new data generated daily, there is a lot of data out there you could tap into for your data science projects. It's pretty hard to curate through such a massive universe of data, but this collection is a great start. Here, you can find data from cancer genomes to UFO reports, as well as years of air quality data to 200,000 jokes. Dive into this ocean of data to explore as you learn how to apply data science techniques or leverage your expertise to discover something new.

    https://www.kdnuggets.com/2021/05/awesome-list-datasets.html

  • Animated Bar Chart Races in Python

    A quick, step-by-step beginner's project to create an animated bar graph for an amazing Covid dataset.

    https://www.kdnuggets.com/2021/05/animated-race-bar-charts-python.html

  • Super Charge Python with Pandas on GPUs Using Saturn Cloud

    Saturn Cloud is a tool that gives you 10 hours of GPU computing and 3 hours of Dask Cluster computing a month for free. In this tutorial, you will learn how to use these free resources to process data using Pandas on a GPU. The experiments show that Pandas is over 1,000,000% slower on a CPU compared to running Pandas on a Dask cluster of GPUs.

    https://www.kdnuggets.com/2021/05/super-charge-python-pandas-gpus-saturn-cloud.html

  • Applying Python’s Explode Function to Pandas DataFrames (Silver Blog)

    Read this applied Python method to solve the issue of accessing columns by date/year using the Pandas library and the functions lambda(), list(), map() & explode().

    https://www.kdnuggets.com/2021/05/applying-pythons-explode-function-pandas-dataframes.html

  • 10 Must-Know Statistical Concepts for Data Scientists

    Statistics is a building block of data science. If you are working or plan to work in this field, then you will encounter the fundamental concepts reviewed for you here. Certainly, there is much more to learn in statistics, but once you understand these basics, then you can steadily build your way up to advanced topics.

    https://www.kdnuggets.com/2021/04/10-statistical-concepts-data-scientists.html

  • Deep Learning Recommendation Models (DLRM): A Deep Dive

    The currency in the 21st century is no longer just data. It's the attention of people. This deep dive article presents the architecture and deployment issues experienced with the deep learning recommendation model, DLRM, which was open-sourced by Facebook in March 2019.

    https://www.kdnuggets.com/2021/04/deep-learning-recommendation-models-dlrm-deep-dive.html

  • Automated Text Classification with EvalML

    Learn how EvalML leverages Woodwork, Featuretools and the nlp-primitives library to process text data and create a machine learning model that can detect spam text messages.

    https://www.kdnuggets.com/2021/04/automated-text-classification-evalml.html

  • How to deploy Machine Learning/Deep Learning models to the web (Gold Blog)

    The full value of your deep learning models comes from enabling others to use them. Learn how to deploy your model to the web and access it as a REST API, and begin to share the power of your machine learning development with the world.

    https://www.kdnuggets.com/2021/04/deploy-machine-learning-models-to-web.html

  • Top YouTube Machine Learning Channels

    These are the top 15 YouTube channels for machine learning as determined by our stated criteria, along with some additional data on the channels to help you decide if they may have some content useful for you.

    https://www.kdnuggets.com/2021/03/top-youtube-machine-learning-channels.html

  • DeepMind’s AlphaFold & the Protein Folding Problem

    Recently, DeepMind's AlphaFold made impressive headway in the protein structure prediction problem. Read this for an overview and explanation.

    https://www.kdnuggets.com/2021/03/deepmind-alphafold-protein-folding-problem.html

  • Dask and Pandas: No Such Thing as Too Much Data

    Do you love pandas, but don't love it when you reach the limits of your memory or compute resources? Dask provides you with the option to use the pandas API with distributed data and computing. Learn how it works, how to use it, and why it’s worth the switch when you need it most.

    https://www.kdnuggets.com/2021/03/dask-pandas-data.html

  • Machine Learning Systems Design: A Free Stanford Course (Gold Blog)

    This freely-available course from Stanford should give you a toolkit for designing machine learning systems.

    https://www.kdnuggets.com/2021/02/machine-learning-systems-design-free-stanford-course.html

  • The Difficulty of Graph Anonymisation

    Lessons from network science and the difficulty of graph anonymization. A data scientist's take on the difficulty of striking a balance between privacy and utility in anonymizing connected data.

    https://www.kdnuggets.com/2021/02/difficulty-graph-anonymisation.html

  • Powerful Exploratory Data Analysis in just two lines of code (Gold Blog)

    EDA is a fundamental early process for any Data Science investigation. Typical approaches for visualization and exploration are powerful, but can be cumbersome for getting to the heart of your data. Now, you can get to know your data much faster with only a few lines of code... and it might even be fun!

    https://www.kdnuggets.com/2021/02/powerful-exploratory-data-analysis-sweetviz.html

  • Feature Store as a Foundation for Machine Learning

    With so many organizations now taking the leap into building production-level machine learning models, many lessons learned are coming to light about the supporting infrastructure. For a variety of important types of use cases, maintaining a centralized feature store is essential for higher ROI and faster delivery to market. In this review, the current feature store landscape is described, and you can learn how to architect one into your MLOps pipeline.

    https://www.kdnuggets.com/2021/02/feature-store-foundation-machine-learning.html

  • Past 2021 Meetings / Online Events on AI, Analytics, Big Data, Data Science, and Machine Learning

    https://www.kdnuggets.com/meetings/past-meetings-2021.html

  • What is Graph Theory, and Why Should You Care?

    Go from graph theory to path optimization.

    https://www.kdnuggets.com/2021/01/graph-theory-why-care.html

  • Machine learning is going real-time

    Extracting immediate predictions from machine learning algorithms on the spot based on brand-new data can offer a next level of interaction and potential value to its consumers. The infrastructure and tech stack required to implement such real-time systems is also next level, and many organizations -- especially in the US -- seem to be resisting. But, what even is real-time ML, and how can it deliver a better experience?

    https://www.kdnuggets.com/2021/01/machine-learning-real-time.html

  • The Ultimate Scikit-Learn Machine Learning Cheatsheet (Gold Blog)

    Given the power and popularity of scikit-learn for machine learning in Python, this library is a foundation of any practitioner's toolset. Preview its core methods with this review of predictive modelling, clustering, dimensionality reduction, feature importance, and data transformation.

    https://www.kdnuggets.com/2021/01/ultimate-scikit-learn-machine-learning-cheatsheet.html

  • 10 Underappreciated Python Packages for Machine Learning Practitioners (Gold Blog)

    Here are 10 underappreciated Python packages covering neural architecture design, calibration, UI creation and dissemination.

    https://www.kdnuggets.com/2021/01/10-underappreciated-python-packages-machine-learning-practitioners.html

  • How to Get a Job as a Data Engineer

    Data engineering skills are currently in high demand. If you are looking for career prospects in this fast-growing profession, then these 10 skills and key factors will help you prepare to land an entry-level position in this field.

    https://www.kdnuggets.com/2021/01/get-job-as-data-engineer.html

  • Model Experiments, Tracking and Registration using MLflow on Databricks

    This post covers how StreamSets can help expedite operations at some of the most crucial stages of Machine Learning Lifecycle and MLOps, and demonstrates integration with Databricks and MLflow.

    https://www.kdnuggets.com/2021/01/model-experiments-tracking-registration-mlflow-databricks.html

  • Six Tips on Building a Data Science Team at a Small Company

    When a company decides that they want to start leveraging their data for the first time, it can be a daunting task. Many businesses aren’t fully aware of all that goes into building a data science department. If you're the data scientist hired to make this happen, we have some tips to help you face the task head-on.

    https://www.kdnuggets.com/2021/01/six-tips-building-data-science-team-small-company.html

  • How to easily check if your Machine Learning model is fair?

    Machine learning models deployed today -- as will many more in the future -- impact people and society directly. With that power and influence resting in the hands of Data Scientists and machine learning engineers, taking the time to evaluate and understand if model results are fair will become the linchpin for the future success of AI/ML solutions. These are critical considerations, and using a recently developed fairness module in the dalex Python package is a unified and accessible way to ensure your models remain fair.

    https://www.kdnuggets.com/2020/12/machine-learning-model-fair.html

  • Production Machine Learning Monitoring: Outliers, Drift, Explainers & Statistical Performance

    A practical deep dive on production monitoring architectures for machine learning at scale using real-time metrics, outlier detectors, drift detectors, metrics servers and explainers.

    https://www.kdnuggets.com/2020/12/production-machine-learning-monitoring-outliers-drift-explainers-statistical-performance.html

  • Crack SQL Interviews (Gold Blog)

    SQL is an essential programming language for data analysis and processing. So, SQL questions are always part of the interview process for data science-related jobs, including data analysts, data scientists, and data engineers. Become familiar with these common patterns seen in SQL interview questions and follow our tips on how to neatly handle each with SQL queries.

    https://www.kdnuggets.com/2020/12/crack-sql-interviews.html

  • Industry 2021 Predictions for AI, Analytics, Data Science, Machine Learning

    We bring you industry predictions from 12 innovative companies - what key trends they expect in 2021 in AI, Analytics, Data Science, and Machine Learning?

    https://www.kdnuggets.com/2020/12/industry-2021-predictions-ai-data-science-machine-learning.html

  • Data Compression via Dimensionality Reduction: 3 Main Methods

    Lift the curse of dimensionality by mastering the application of three important techniques that will help you reduce the dimensionality of your data, even if it is not linearly separable.

    https://www.kdnuggets.com/2020/12/data-compression-dimensionality-reduction.html

  • Pruning Machine Learning Models in TensorFlow

    Read this overview to learn how to make your models smaller via pruning.

    https://www.kdnuggets.com/2020/12/pruning-machine-learning-models-tensorflow.html

  • 14 Data Science projects to improve your skills

    There's a lot of data out there and so many data science techniques to master or review. Check out these great project ideas from easy to advanced difficulty levels to develop new skills and strengthen your portfolio.

    https://www.kdnuggets.com/2020/12/14-data-science-projects-improve-skills.html

  • The Rise of the Machine Learning Engineer (Gold Blog)

    The evolution of Big Data into machine learning applications ushered in an exciting era of new roles and skillsets that became necessary to implement these technologies. With the Machine Learning Engineer being such a crucial component today, where the evolution of this field will take us tomorrow should be fascinating.

    https://www.kdnuggets.com/2020/11/rise-machine-learning-engineer.html

  • Computer Vision at Scale With Dask And PyTorch

    A tutorial on conducting image classification inference at scale with the Resnet50 deep learning model, using GPU clusters on Saturn Cloud. The results: 40x faster computer vision that made a 3+ hour PyTorch model run in just 5 minutes.

    https://www.kdnuggets.com/2020/11/computer-vision-scale-dask-pytorch.html

  • 5 Most Useful Machine Learning Tools every lazy full-stack data scientist should use

    If you consider yourself a Data Scientist who can take any project from data curation to solution deployment, then you know there are many tools available today to help you get the job done. The trouble is that there are too many choices. Here is a review of five sets of tools that should turn you into the most efficient full-stack data scientist possible.

    https://www.kdnuggets.com/2020/11/5-useful-machine-learning-tools.html

  • How to deploy PyTorch Lightning models to production

    A complete guide to serving PyTorch Lightning models at scale.

    https://www.kdnuggets.com/2020/11/deploy-pytorch-lightning-models-production.html

  • Data Science in the Cloud with Dask

    Scaling large data analyses for data science and machine learning is growing in importance. Dask and Coiled are making it easy and fast for folks to do just that. Read on to find out how.

    https://www.kdnuggets.com/2020/10/data-science-cloud-dask.html

  • The Insiders’ Guide to Generative and Discriminative Machine Learning Models

    In this article, we will look at the difference between generative and discriminative models and how they contrast with one another.

    https://www.kdnuggets.com/2020/09/insiders-guide-generative-discriminative-machine-learning-models.html

  • An Introduction to NLP and 5 Tips for Raising Your Game

    This article is a collection of things the author would like to have known when they started out in NLP. Perhaps it will be useful for you.

    https://www.kdnuggets.com/2020/09/introduction-nlp-5-tips-raising-your-game.html

  • 4 Tools to Speed Up Your Data Science Writing

    This article covers how you can achieve your writing goals with these 4 tools.

    https://www.kdnuggets.com/2020/09/4-tools-speed-up-data-science-writing.html

  • How to Evaluate the Performance of Your Machine Learning Model (Silver Blog)

    You can train your supervised machine learning models all day long, but unless you evaluate their performance, you can never know if your models are useful. This detailed discussion reviews the various performance metrics you must consider, and offers intuitive explanations for what they mean and how they work.

    https://www.kdnuggets.com/2020/09/performance-machine-learning-model.html

  • 4 ways to improve your TensorFlow model – key regularization techniques you need to know (Gold Blog)

    Regularization techniques are crucial for preventing your models from overfitting and enable them to perform better on your validation and test sets. This guide provides a thorough overview with code of four key approaches you can use for regularization in TensorFlow.

    https://www.kdnuggets.com/2020/08/tensorflow-model-regularization-techniques.html

  • Getting Started with Feature Selection

    For machine learning, more data is always better. What about more features of data? Not necessarily. This beginners' guide with code examples for selecting the most useful features from your data will jump start you toward developing the most effective and efficient learning models.

    https://www.kdnuggets.com/2020/08/getting-started-feature-selection.html

  • Awesome Machine Learning and AI Courses (Gold Blog)

    Check out this list of awesome, free machine learning and artificial intelligence courses with video lectures.

    https://www.kdnuggets.com/2020/07/awesome-machine-learning-ai-courses.html

  • A Tour of End-to-End Machine Learning Platforms

    An end-to-end machine learning platform needs a holistic approach. If you’re interested in learning more about a few well-known ML platforms, you’ve come to the right place!

    https://www.kdnuggets.com/2020/07/tour-end-to-end-machine-learning-platforms.html

  • How to Handle Dimensions in NumPy

    Learn how to deal with Numpy matrix dimensionality using np.reshape, np.newaxis and np.expand_dims, illustrated with Python code.

    https://www.kdnuggets.com/2020/07/numpy-handle-dimensions.html

  • Free From Stanford: Ethical and Social Issues in Natural Language Processing

    Perhaps it's time to take a look at this relatively new offering from Stanford, Ethical and Social Issues in Natural Language Processing (CS384), an advanced seminar course covering ethical and social issues in NLP.

    https://www.kdnuggets.com/2020/07/ethical-social-issues-natural-language-processing.html

  • Spam Filter in Python: Naive Bayes from Scratch

    In this blog post, learn how to build a spam filter using Python and the multinomial Naive Bayes algorithm, with a goal of classifying messages with a greater than 80% accuracy.

    https://www.kdnuggets.com/2020/07/spam-filter-python-naive-bayes-scratch.html

  • Tools to Spot Deepfakes and AI-Generated Text

    The technologies that generate deepfake content are at the forefront of manipulating humans. While the research developing these algorithms is fascinating and will lead to powerful tools that enhance the way people create and work, in the wrong hands, these same tools drive misinformation at a scale we can't yet imagine. Stopping these bad actors using awesome tools is in your hands.

    https://www.kdnuggets.com/2020/06/dont-click-this-how-spot-deepfakes.html

  • Build Dog Breeds Classifier Step By Step with AWS Sagemaker

    This post takes you through the basic steps for creating a cloud-based deep learning dog classifier, with everything accomplished from the AWS Management Console.

    https://www.kdnuggets.com/2020/06/build-dog-breeds-classifier-aws-sagemaker.html

  • Skills to Build for Data Engineering (Silver Blog)

    This article jumps into the latest skill set observations in the Data Engineering Job Market which could definitely add a boost to your existing career or assist you in starting off your Data Engineering journey.

    https://www.kdnuggets.com/2020/06/skills-build-data-engineering.html

  • Machine Learning in Power BI using PyCaret

    Check out this step-by-step tutorial for implementing machine learning in Power BI within minutes.

    https://www.kdnuggets.com/2020/05/machine-learning-power-bi-pycaret.html

  • What You Need to Know About Deep Reinforcement Learning

    How does deep learning solve the challenges of scale and complexity in reinforcement learning? Learn how combining these approaches will make more progress toward the notion of Artificial General Intelligence.

    https://www.kdnuggets.com/2020/05/deep-reinforcement-learning.html

  • Getting Started with Spectral Clustering

    This post will unravel a practical example to illustrate and motivate the intuition behind each step of the spectral clustering algorithm.

    https://www.kdnuggets.com/2020/05/getting-started-spectral-clustering.html

  • Five Cool Python Libraries for Data Science (Gold Blog)

    Check out these 5 cool Python libraries that the author has come across during an NLP project, and which have made their life easier.

    https://www.kdnuggets.com/2020/04/five-cool-python-libraries-data-science.html

  • Introducing Brain Simulator II: A New Platform for AGI Experimentation

    A growing consensus of researchers contends that new algorithms are needed to transform narrow AI to AGI. Brain Simulator II is free software for developing new algorithms targeted at AGI; you can experiment with it and participate in its development.

    https://www.kdnuggets.com/2020/04/brain-simulator-new-platform-agi.html

  • Announcing PyCaret 1.0.0

    An open source low-code machine learning library in Python. PyCaret is an alternate low-code library that can be used to replace hundreds of lines of code with only a few words. This makes experiments exponentially fast and efficient.

    https://www.kdnuggets.com/2020/04/announcing-pycaret.html

  • State of the Machine Learning and AI Industry

    Enterprises are struggling to launch machine learning models that encapsulate the optimization of business processes. These are now the essential components of data-driven applications and AI services that can improve legacy rule-based business processes, increase productivity, and deliver results. In the current state of the industry, many companies are turning to off-the-shelf platforms to increase expectations for success in applying machine learning.

    https://www.kdnuggets.com/2020/04/machine-learning-ai-industry.html

  • Why and How to Use Dask with Big Data

    The Pandas library for Python is a game-changer for data preparation. But, when the data gets big, really big, then your computer needs more help to efficiently handle all that data. Learn more about how to use Dask and follow a demo to scale up your Pandas to work with Big Data.

    https://www.kdnuggets.com/2020/04/dask-big-data.html

  • Visualizing Decision Trees with Python (Scikit-learn, Graphviz, Matplotlib)

    Learn about how to visualize decision trees using matplotlib and Graphviz.

    https://www.kdnuggets.com/2020/04/visualizing-decision-trees-python.html

  • A Layman’s Guide to Data Science. Part 2: How to Build a Data Project

    As Part 2 in a Guide to Data Science, we outline the steps to build your first Data Science project, including how to ask good questions to understand the data first, how to prepare the data, how to develop an MVP, reiterate to build a good product, and, finally, present your project.

    https://www.kdnuggets.com/2020/04/guide-data-science-build-data-project.html

  • ModelDB 2.0 is here!

    We are excited to announce that ModelDB 2.0 is now available! We have learned a lot since building ModelDB 1.0, so we decided to rebuild from the ground up.

    https://www.kdnuggets.com/2020/03/verta-modeldb-20.html

  • The Most Useful Machine Learning Tools of 2020

    This article outlines 5 sets of tools every lazy full-stack data scientist should use.

    https://www.kdnuggets.com/2020/03/most-useful-machine-learning-tools-2020.html

  • Software Interfaces for Machine Learning Deployment

    While building a machine learning model might be the fun part, it won't do much for anyone else unless it can be deployed into a production environment. How to implement machine learning deployments is a special challenge with differences from traditional software engineering, and this post examines a fundamental first step -- how to create software interfaces so you can develop deployments that are automated and repeatable.

    https://www.kdnuggets.com/2020/03/software-interfaces-machine-learning-deployment.html

  • Python Pandas For Data Discovery in 7 Simple Steps

    Just getting started with Python's Pandas library for data analysis? Or, ready for a quick refresher? These 7 steps will help you become familiar with its core features so you can begin exploring your data in no time.

    https://www.kdnuggets.com/2020/03/python-pandas-data-discovery.html

  • 50 Must-Read Free Books For Every Data Scientist in 2020 (Silver Blog)

    In this article, we list some excellent data science books that cover a wide variety of topics under Data Science.

    https://www.kdnuggets.com/2020/03/50-must-read-free-books-every-data-scientist-2020.html

  • Decision Tree Intuition: From Concept to Application

    While the use of Decision Trees in machine learning has been around for a while, the technique remains powerful and popular. This guide first provides an introductory understanding of the method and then shows you how to construct a decision tree, calculate important analysis parameters, and plot the resulting tree.

    https://www.kdnuggets.com/2020/02/decision-tree-intuition.html

  • Audio Data Analysis Using Deep Learning with Python (Part 2)

    This is a followup to the first article in this series. Once you are comfortable with the concepts explained in that article, you can come back and continue with this.

    https://www.kdnuggets.com/2020/02/audio-data-analysis-deep-learning-python-part-2.html

  • Getting Started with R Programming

    An end-to-end data analysis using R, the second most requested programming language in Data Science.

    https://www.kdnuggets.com/2020/02/getting-started-r-programming.html

  • Scaling the Wall Between Data Scientist and Data Engineer

    The educational and research focuses of machine learning tend to highlight the model building, training, testing, and optimization aspects of the data science process. To bring these models into use requires a suite of engineering feats and organization, a standard for which does not yet exist. Learn more about a framework for operating a collaborative data science and engineering team to deploy machine learning models to end-users.

    https://www.kdnuggets.com/2020/02/scaling-wall-data-scientist-data-engineer.html

  • Introduction to Geographical Time Series Prediction with Crime Data in R, SQL, and Tableau

    When reviewing geographical data, it can be difficult to prepare the data for an analysis. This article helps by covering importing data into a SQL Server database; cleansing and grouping data into a map grid; adding time data points to the set of grid data and filling in the gaps where no crimes occurred; importing the data into R; and running an XGBoost model to determine where crimes will occur on a specific day.

    https://www.kdnuggets.com/2020/02/introduction-geographical-time-series-crime-r-sql-tableau.html

  • Basics of Audio File Processing in R

    This post provides basic information on audio processing using R as the programming language. It also walks through some basics of sound and digital audio.

    https://www.kdnuggets.com/2020/02/basics-audio-file-processing-r.html

  • Intent Recognition with BERT using Keras and TensorFlow 2

    TL;DR Learn how to fine-tune the BERT model for text classification. Train and evaluate it on a small dataset for detecting seven intents. The results might surprise you!

    https://www.kdnuggets.com/2020/02/intent-recognition-bert-keras-tensorflow.html

  • Audio File Processing: ECG Audio Using Python

    In this post, we will look into an application of audio file processing for a good cause: analyzing ECG heartbeats, with code written in Python.

    https://www.kdnuggets.com/2020/02/audio-file-processing-ecg-audio-python.html

  • Past 2020 Meetings / Online Events on AI, Analytics, Big Data, Data Science, and Machine Learning

    https://www.kdnuggets.com/meetings/past-meetings-2020.html

  • Schema Evolution in Data Lakes

    Whereas a data warehouse will need rigid data modeling and definitions, a data lake can store different types and shapes of data. In a data lake, the schema of the data can be inferred when it’s read, providing the aforementioned flexibility. However, this flexibility is a double-edged sword.

    https://www.kdnuggets.com/2020/01/schema-evolution-data-lakes.html

  • Applying Occam’s razor to Deep Learning

    Finding a deep learning model to perform well is an exciting feat. But, might there be other -- less complex -- models that perform just as well for your application? A simple complexity measure based on the statistical physics concept of Cascading Periodic Spectral Ergodicity (cPSE) can help us be computationally efficient by considering the least complex during model selection.

    https://www.kdnuggets.com/2020/01/occams-razor-deep-learning.html

  • The Most In Demand Tech Skills for Data Scientists

    By the end of this article you’ll know which technologies are becoming more popular with employers and which are becoming less popular.

    https://www.kdnuggets.com/2019/12/most-demand-tech-skills-data-scientists.html

  • Alternative Cloud Hosted Data Science Environments

    Over the years, new alternative providers have risen to provide a solitary data science environment hosted on the cloud for data scientists to analyze, host and share their work.

    https://www.kdnuggets.com/2019/12/alternative-cloud-data-science-environments.html

  • Deploying a pretrained GPT-2 model on AWS

    This post attempts to summarize my recent detour into NLP, describing how I exposed a Huggingface pre-trained Language Model (LM) on an AWS-based web application.

    https://www.kdnuggets.com/2019/12/deploying-pretrained-gpt-2-model-aws.html

  • Intro to Grafana: Installation, Configuration, and Building the First Dashboard

    One of the biggest highlights of Grafana is the ability to bring several data sources together in one dashboard with adding rows that will host individual panels. Let's look at installing, configuring, and creating our first dashboard using Grafana.

    https://www.kdnuggets.com/2019/12/intro-grafana-installation-configuration-building-first-dashboard.html

  • Why software engineering processes and tools don’t work for machine learning

    While AI may be the new electricity, significant challenges remain to realize AI’s potential. Here we examine why data scientists and teams can’t rely on software engineering tools and processes for machine learning.

    https://www.kdnuggets.com/2019/12/comet-software-engineering-machine-learning.html

  • Python Tuples and Tuple Methods

    Brush up on your Python basics with this post on creating, using, and manipulating tuples.

    https://www.kdnuggets.com/2019/11/python-tuples-methods.html

  • [Silver Blog] [Platinum Blog] Everything a Data Scientist Should Know About Data Management

    For full-stack data science mastery, you must understand data management along with all the bells and whistles of machine learning. This high-level overview is a road map for the history and current state of the expansive options for data storage and infrastructure solutions.

    https://www.kdnuggets.com/2019/10/data-scientist-data-management.html

  • Beyond Word Embedding: Key Ideas in Document Embedding

    This literature review on document embedding techniques thoroughly covers the many ways practitioners develop rich vector representations of text -- from single sentences to entire books.

    https://www.kdnuggets.com/2019/10/beyond-word-embedding-document-embedding.html

  • Introduction to Artificial Neural Networks

    In this article, we’ll try to cover everything related to Artificial Neural Networks or ANN.

    https://www.kdnuggets.com/2019/10/introduction-artificial-neural-networks.html

  • [Silver Blog] How AI will transform healthcare (and can it fix the US healthcare system?)

    This thorough review focuses on the impact of AI, 5G, and edge computing on the healthcare sector in the 2020s as well as a look at quantum computing's potential impact on AI, healthcare, and financial services.

    https://www.kdnuggets.com/2019/09/ai-transform-healthcare.html

  • [Silver Blog] 6 bits of advice for Data Scientists

    As a data scientist, you can get lost in your daily dives into the data. Follow this advice in your work to be more diligent and more impactful for your organization.

    https://www.kdnuggets.com/2019/09/advice-data-scientists.html

  • The thin line between data science and data engineering

    Today, as companies have finally come to understand the value that data science can bring, more and more emphasis is being placed on the implementation of data science in production systems. And as these implementations have required models that can perform on larger and larger datasets in real-time, an awful lot of data science problems have become engineering problems.

    https://www.kdnuggets.com/2019/09/thin-line-between-data-science-data-engineering.html

  • Applying Data Science to Cybersecurity Network Attacks & Events

    Check out this detailed tutorial on applying data science to the cybersecurity domain, written by an individual with backgrounds in both fields.

    https://www.kdnuggets.com/2019/09/applying-data-science-cybersecurity-network-attacks-events.html

  • [Silver Blog] My journey path from a Software Engineer to BI Specialist to a Data Scientist

    The career path of the Data Scientist remains a hot target for many with its continuing high demand. Becoming one requires developing a broad set of skills including statistics, programming, and even business acumen. Learn more about one person's experience making this journey, and discover the many resources available to help you find your way into a world of data science.

    https://www.kdnuggets.com/2019/09/journey-software-engineer-bi-data-scientist.html

  • Version Control for Data Science: Tracking Machine Learning Models and Datasets

    I am a Git god, why do I need another version control system for Machine Learning Projects?

    https://www.kdnuggets.com/2019/09/version-control-data-science-tracking-machine-learning-models-datasets.html

  • [Silver Blog] There is No Free Lunch in Data Science

    There is no such thing as a free lunch in life or data science. Here, we'll explore some science philosophy and discuss the No Free Lunch theorems to find out what they mean for the field of data science.

    https://www.kdnuggets.com/2019/09/no-free-lunch-data-science.html

  • Emoji Analytics

    Emoji is becoming a global language understandable by anyone who expresses... emotion. With the pervasiveness of these little Unicode blocks, we can perform analytics on their use throughout social media to gain insight into sentiments around the world.

    https://www.kdnuggets.com/2019/08/emoji-analytics.html

  • [Gold Blog] Object-oriented programming for data scientists: Build your ML estimator

    Implement some of the core OOP principles in a machine learning context by building your own Scikit-learn-like estimator, and making it better.

    https://www.kdnuggets.com/2019/08/object-oriented-programming-data-scientists-estimator.html

  • An Overview of Python’s Datatable package

    Modern machine learning applications need to process a humongous amount of data and generate multiple features. Python’s datatable module was created to address this issue. It is a toolkit for performing big data (up to 100GB) operations on a single-node machine, at the maximum possible speed.

    https://www.kdnuggets.com/2019/08/overview-python-datatable-package.html

  • Learn how to use PySpark in under 5 minutes (Installation + Tutorial)

    Apache Spark is one of the hottest and largest open source projects in data processing, a framework with rich high-level APIs for programming languages such as Scala, Python, Java, and R. It realizes the potential of bringing together both Big Data and machine learning.

    https://www.kdnuggets.com/2019/08/learn-pyspark-installation-tutorial.html

  • Introduction to Image Segmentation with K-Means clustering

    Image segmentation is the classification of an image into different groups. Many kinds of research have been done in the area of image segmentation using clustering. In this article, we will explore using the K-Means clustering algorithm to read an image and cluster different regions of the image.

    https://www.kdnuggets.com/2019/08/introduction-image-segmentation-k-means-clustering.html

  • [Silver Blog] What is Benford’s Law and why is it important for data science?

    Benford’s law is a little-known gem for data analytics. Learn about how this can be used for anomaly or fraud detection in scientific or technical publications.

    https://www.kdnuggets.com/2019/08/benfords-law-data-science.html

  • [Silver Blog] Deep Learning for NLP: ANNs, RNNs and LSTMs explained!

    Learn about Artificial Neural Networks, Deep Learning, Recurrent Neural Networks and LSTMs like never before and use NLP to build a Chatbot!

    https://www.kdnuggets.com/2019/08/deep-learning-nlp-explained.html

  • Lagrange multipliers with visualizations and code

    In this story, we’re going to take an aerial tour of optimization with Lagrange multipliers. When do we need them? Whenever we have an optimization problem with constraints.

    https://www.kdnuggets.com/2019/08/lagrange-multipliers-visualizations-code.html

  • Can we trust AutoML to go on full autopilot?

    We put an AutoML tool to the test on a real-world problem, and the results are surprising. Even with automatic machine learning, you still need expert data scientists.

    https://www.kdnuggets.com/2019/07/automl-full-autopilot.html

  • [Platinum Blog] Top 13 Skills To Become a Rockstar Data Scientist

    Education, coding, SQL, big data platforms, storytelling and more. These are the 13 skills you need to master to become a rockstar data scientist.

    https://www.kdnuggets.com/2019/07/top-13-skills-become-rockstar-data-scientist.html

  • Pre-training, Transformers, and Bi-directionality

    Bidirectional Encoder Representations from Transformers BERT (Devlin et al., 2018) is a language representation model that combines the power of pre-training with the bi-directionality of the Transformer’s encoder (Vaswani et al., 2017). BERT improves the state-of-the-art performance on a wide array of downstream NLP tasks with minimal additional task-specific training.

    https://www.kdnuggets.com/2019/07/pre-training-transformers-bi-directionality.html

  • 10 Gradient Descent Optimisation Algorithms + Cheat Sheet

    Gradient descent is an optimization algorithm used for minimizing the cost function in various ML algorithms. Here are some common gradient descent optimisation algorithms used in the popular deep learning frameworks such as TensorFlow and Keras.

    https://www.kdnuggets.com/2019/06/gradient-descent-algorithms-cheat-sheet.html

  • The Data Fabric for Machine Learning – Part 2: Building a Knowledge-Graph

    Before we can develop a Data Fabric, we need to build a Knowledge-Graph. In this article I’ll lay the groundwork for creating it; in the next article we’ll put it into practice.

    https://www.kdnuggets.com/2019/06/data-fabric-machine-learning-building-knowledge-graph.html

  • [Gold Blog] Understanding Cloud Data Services

    Ready to move your systems to a cloud vendor or just learning more about big data services? This overview will help you understand big data system architectures, components, and offerings with an end-to-end taxonomy of what is available from the big three cloud providers.

    https://www.kdnuggets.com/2019/06/understanding-cloud-data-services.html

  • [Silver Blog] Random Forests® vs Neural Networks: Which is Better, and When?

    Random Forests and Neural Networks are two widely used machine learning algorithms. What is the difference between the two approaches? When should one use a Neural Network or a Random Forest?

    https://www.kdnuggets.com/2019/06/random-forest-vs-neural-network.html

  • [Gold Blog] How to choose a visualization

    Visualizations based on the structure of data are needed during analysis, which might be different than for the end user. A new guide for choosing the right visualization helps you flexibly understand the data first.

    https://www.kdnuggets.com/2019/06/how-choose-visualization.html

  • Animations with Matplotlib

    Animations make even more sense when depicting time series data like stock prices over the years, climate change over the past decade, or seasonalities and trends, since we can then see how a particular parameter behaves with time.

    https://www.kdnuggets.com/2019/05/animations-with-matplotlib.html

  • Boost Your Image Classification Model

    Check out this collection of tricks to improve the accuracy of your classifier.

    https://www.kdnuggets.com/2019/05/boost-your-image-classification-model.html

  • Analyzing Tweets with NLP in Minutes with Spark, Optimus and Twint

    Social media has been gold for studying the way people communicate and behave. In this article I’ll show you the easiest way of analyzing tweets without the Twitter API, in a way that scales to Big Data.

    https://www.kdnuggets.com/2019/05/analyzing-tweets-nlp-spark-optimus-twint.html

  • Which Data Science / Machine Learning methods and algorithms did you use in 2018/2019 for a real-world application?

    Which Data Science / Machine Learning methods and algorithms did you use in 2018/2019 for a real-world application? Take part in the latest KDnuggets survey and have your say.

    https://www.kdnuggets.com/2019/04/poll-data-science-machine-learning-methods-algorithms-use-2018-2019.html

  • [Gold Blog] [Platinum Blog] Top 10 Coding Mistakes Made by Data Scientists

    Here is a list of 10 common mistakes that a senior data scientist, who is ranked in the top 1% on Stack Overflow for Python coding and who works with a lot of (junior) data scientists, frequently sees.

    https://www.kdnuggets.com/2019/04/top-10-coding-mistakes-data-scientists.html

  • Data Pipelines, Luigi, Airflow: Everything you need to know

    This post focuses on the workflow management system (WMS) Airflow: what it is, what you can do with it, and how it differs from Luigi.

    https://www.kdnuggets.com/2019/03/data-pipelines-luigi-airflow-everything-need-know.html

  • Feature Reduction using Genetic Algorithm with Python

    This tutorial discusses how to use the genetic algorithm (GA) for reducing the feature vector extracted from the Fruits360 dataset in Python mainly using NumPy and Sklearn.

    https://www.kdnuggets.com/2019/03/feature-reduction-genetic-algorithm-python.html

  • Deploy your PyTorch model to Production

    This tutorial aims to teach you how to deploy your recently trained model in PyTorch as an API using Python.

    https://www.kdnuggets.com/2019/03/deploy-pytorch-model-production.html

  • [Platinum Blog] Artificial Neural Networks Optimization using Genetic Algorithm with Python

    This tutorial explains the usage of the genetic algorithm for optimizing the network weights of an Artificial Neural Network for improved performance.

    https://www.kdnuggets.com/2019/03/artificial-neural-networks-optimization-genetic-algorithm-python.html

  • [eBook] Standardizing the Machine Learning Lifecycle

    We explore what makes the machine learning lifecycle so challenging compared to regular software, and share the Databricks approach.

    https://www.kdnuggets.com/2019/03/databrocks-ebook-machine-learning-lifecycle.html

  • Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention

    In this post, the author shows how BERT can mimic a Bag-of-Words model. The visualization tool from Part 1 is extended to probe deeper into the mind of BERT, to expose the neurons that give BERT its shape-shifting superpowers.

    https://www.kdnuggets.com/2019/03/deconstructing-bert-part-2-visualizing-inner-workings-attention.html

  • On Building Effective Data Science Teams

    We take a look at the qualities that make a successful data team in order to help business leaders and executives create better AI strategies.

    https://www.kdnuggets.com/2019/03/building-effective-data-science-teams.html

  • What are Some “Advanced” AI and Machine Learning Online Courses?

    Where can you find not-so-common, but high-quality online courses (Free) for ‘advanced’ machine learning and artificial intelligence?

    https://www.kdnuggets.com/2019/02/some-advanced-ai-machine-learning-online-courses.html

  • [Gold Blog] Artificial Neural Network Implementation using NumPy and Image Classification

    This tutorial builds an artificial neural network from scratch in Python using NumPy in order to create an image classification application for the Fruits360 dataset.

    https://www.kdnuggets.com/2019/02/artificial-neural-network-implementation-using-numpy-and-image-classification.html

  • Comparing Machine Learning Models: Statistical vs. Practical Significance

    Is model A or B more accurate? Hmm… In this blog post, I’d love to share my recent findings on model comparison.

    https://www.kdnuggets.com/2019/01/comparing-machine-learning-models-statistical-vs-practical-significance.html

  • 10 Exciting Ideas of 2018 in NLP

    We outline a selection of exciting developments in NLP from the last year, and include useful recent papers and images to help further assist with your learning.

    https://www.kdnuggets.com/2019/01/10-exciting-ideas-2018-nlp.html

  • [Silver Blog] How to solve 90% of NLP problems: a step-by-step guide

    Read this insightful, step-by-step article on how to use machine learning to understand and leverage text.

    https://www.kdnuggets.com/2019/01/solve-90-nlp-problems-step-by-step-guide.html

  • Practical Apache Spark in 10 Minutes

    Check out this series of articles on Apache Spark. Each part is a 10 minute tutorial on a particular Apache Spark topic. Read on to get up to speed using Spark.

    https://www.kdnuggets.com/2019/01/practical-apache-spark-10-minutes.html

  • The Role of the Data Engineer is Changing

    The role of the data engineer in a startup data team is changing rapidly. Are you thinking about it the right way?

    https://www.kdnuggets.com/2019/01/role-data-engineer-changing.html

  • Comparison of the Top Speech Processing APIs

    There are two main tasks in speech processing. The first is to transform speech to text. The second is to convert text into human speech. We will describe the general aspects of each API and then compare their main features in a table.

    https://www.kdnuggets.com/2018/12/activewizards-comparison-speech-processing-apis.html
