Search results for s3

    Found 343 documents, 5946 searched:

  • The Rise of the Machine Learning Engineer (Gold Blog)

    The evolution of Big Data into machine learning applications ushered in an exciting era of new roles and skillsets that became necessary to implement these technologies. With the Machine Learning Engineer being such a crucial component today, where the evolution of this field will take us tomorrow should be fascinating.

    https://www.kdnuggets.com/2020/11/rise-machine-learning-engineer.html

  • Computer Vision at Scale With Dask And PyTorch

    A tutorial on running image classification inference with the Resnet50 deep learning model at scale, using GPU clusters on Saturn Cloud. The result: 40x faster computer vision that made a 3+ hour PyTorch model run in just 5 minutes.

    https://www.kdnuggets.com/2020/11/computer-vision-scale-dask-pytorch.html

  • 5 Most Useful Machine Learning Tools every lazy full-stack data scientist should use

    If you consider yourself a Data Scientist who can take any project from data curation to solution deployment, then you know there are many tools available today to help you get the job done. The trouble is that there are too many choices. Here is a review of five sets of tools that should turn you into the most efficient full-stack data scientist possible.

    https://www.kdnuggets.com/2020/11/5-useful-machine-learning-tools.html

  • How to deploy PyTorch Lightning models to production

    A complete guide to serving PyTorch Lightning models at scale.

    https://www.kdnuggets.com/2020/11/deploy-pytorch-lightning-models-production.html

  • Data Science in the Cloud with Dask

    Scaling large data analyses for data science and machine learning is growing in importance. Dask and Coiled are making it easy and fast for folks to do just that. Read on to find out how.

    https://www.kdnuggets.com/2020/10/data-science-cloud-dask.html

  • The Insiders’ Guide to Generative and Discriminative Machine Learning Models

    In this article, we will look at the difference between generative and discriminative models and how they contrast with one another.

    https://www.kdnuggets.com/2020/09/insiders-guide-generative-discriminative-machine-learning-models.html

  • An Introduction to NLP and 5 Tips for Raising Your Game

    This article is a collection of things the author would like to have known when they started out in NLP. Perhaps it will be useful for you.

    https://www.kdnuggets.com/2020/09/introduction-nlp-5-tips-raising-your-game.html

  • 4 Tools to Speed Up Your Data Science Writing

    This article covers how you can achieve your writing goals with these 4 tools.

    https://www.kdnuggets.com/2020/09/4-tools-speed-up-data-science-writing.html

  • How to Evaluate the Performance of Your Machine Learning Model (Silver Blog)

    You can train your supervised machine learning models all day long, but unless you evaluate their performance, you can never know if your models are useful. This detailed discussion reviews the various performance metrics you must consider, and offers intuitive explanations for what they mean and how they work.

    https://www.kdnuggets.com/2020/09/performance-machine-learning-model.html

  • 4 ways to improve your TensorFlow model – key regularization techniques you need to know (Gold Blog)

    Regularization techniques are crucial for preventing your models from overfitting and enable them to perform better on your validation and test sets. This guide provides a thorough overview, with code, of four key approaches you can use for regularization in TensorFlow.

    https://www.kdnuggets.com/2020/08/tensorflow-model-regularization-techniques.html

  • Getting Started with Feature Selection

    For machine learning, more data is always better. What about more features of data? Not necessarily. This beginners' guide with code examples for selecting the most useful features from your data will jump start you toward developing the most effective and efficient learning models.

    https://www.kdnuggets.com/2020/08/getting-started-feature-selection.html

  • Awesome Machine Learning and AI Courses (Gold Blog)

    Check out this list of awesome, free machine learning and artificial intelligence courses with video lectures.

    https://www.kdnuggets.com/2020/07/awesome-machine-learning-ai-courses.html

  • A Tour of End-to-End Machine Learning Platforms

    An end-to-end machine learning platform needs a holistic approach. If you’re interested in learning more about a few well-known ML platforms, you’ve come to the right place!

    https://www.kdnuggets.com/2020/07/tour-end-to-end-machine-learning-platforms.html

  • How to Handle Dimensions in NumPy

    Learn how to deal with Numpy matrix dimensionality using np.reshape, np.newaxis and np.expand_dims, illustrated with Python code.

    https://www.kdnuggets.com/2020/07/numpy-handle-dimensions.html

  • Free From Stanford: Ethical and Social Issues in Natural Language Processing

    Perhaps it's time to take a look at this relatively new offering from Stanford, Ethical and Social Issues in Natural Language Processing (CS384), an advanced seminar course covering ethical and social issues in NLP.

    https://www.kdnuggets.com/2020/07/ethical-social-issues-natural-language-processing.html

  • Spam Filter in Python: Naive Bayes from Scratch

    In this blog post, learn how to build a spam filter using Python and the multinomial Naive Bayes algorithm, with a goal of classifying messages with a greater than 80% accuracy.

    https://www.kdnuggets.com/2020/07/spam-filter-python-naive-bayes-scratch.html

  • Tools to Spot Deepfakes and AI-Generated Text

    The technologies that generate deepfake content are at the forefront of manipulating humans. While the research developing these algorithms is fascinating and will lead to powerful tools that enhance the way people create and work, in the wrong hands, these same tools drive misinformation at a scale we can't yet imagine. Stopping these bad actors using awesome tools is in your hands.

    https://www.kdnuggets.com/2020/06/dont-click-this-how-spot-deepfakes.html

  • Build Dog Breeds Classifier Step By Step with AWS Sagemaker

    This post takes you through the basic steps for creating a cloud-based deep learning dog classifier, with everything accomplished from the AWS Management Console.

    https://www.kdnuggets.com/2020/06/build-dog-breeds-classifier-aws-sagemaker.html

  • Skills to Build for Data Engineering (Silver Blog)

    This article jumps into the latest skill set observations in the Data Engineering Job Market which could definitely add a boost to your existing career or assist you in starting off your Data Engineering journey.

    https://www.kdnuggets.com/2020/06/skills-build-data-engineering.html

  • Machine Learning in Power BI using PyCaret

    Check out this step-by-step tutorial for implementing machine learning in Power BI within minutes.

    https://www.kdnuggets.com/2020/05/machine-learning-power-bi-pycaret.html

  • What You Need to Know About Deep Reinforcement Learning

    How does deep learning solve the challenges of scale and complexity in reinforcement learning? Learn how combining these approaches will make more progress toward the notion of Artificial General Intelligence.

    https://www.kdnuggets.com/2020/05/deep-reinforcement-learning.html

  • Getting Started with Spectral Clustering

    This post will unravel a practical example to illustrate and motivate the intuition behind each step of the spectral clustering algorithm.

    https://www.kdnuggets.com/2020/05/getting-started-spectral-clustering.html

  • Five Cool Python Libraries for Data Science (Gold Blog)

    Check out these 5 cool Python libraries that the author has come across during an NLP project, and which have made their life easier.

    https://www.kdnuggets.com/2020/04/five-cool-python-libraries-data-science.html

  • Introducing Brain Simulator II: A New Platform for AGI Experimentation

    A growing consensus of researchers contends that new algorithms are needed to transform narrow AI to AGI. Brain Simulator II is free software for developing new algorithms targeted at AGI; you can experiment with it and participate in its development.

    https://www.kdnuggets.com/2020/04/brain-simulator-new-platform-agi.html

  • Announcing PyCaret 1.0.0

    An open source low-code machine learning library in Python. PyCaret is an alternate low-code library that can be used to replace hundreds of lines of code with just a few words. This makes experiments exponentially faster and more efficient.

    https://www.kdnuggets.com/2020/04/announcing-pycaret.html

  • State of the Machine Learning and AI Industry

    Enterprises are struggling to launch machine learning models that encapsulate the optimization of business processes. These are now the essential components of data-driven applications and AI services that can improve legacy rule-based business processes, increase productivity, and deliver results. In the current state of the industry, many companies are turning to off-the-shelf platforms to increase expectations for success in applying machine learning.

    https://www.kdnuggets.com/2020/04/machine-learning-ai-industry.html

  • Why and How to Use Dask with Big Data

    The Pandas library for Python is a game-changer for data preparation. But when the data gets big, really big, your computer needs more help to efficiently handle all that data. Learn more about how to use Dask and follow a demo to scale up your Pandas to work with Big Data.

    https://www.kdnuggets.com/2020/04/dask-big-data.html

  • Visualizing Decision Trees with Python (Scikit-learn, Graphviz, Matplotlib)

    Learn about how to visualize decision trees using matplotlib and Graphviz.

    https://www.kdnuggets.com/2020/04/visualizing-decision-trees-python.html

  • A Layman’s Guide to Data Science. Part 2: How to Build a Data Project

    As Part 2 in a Guide to Data Science, we outline the steps to build your first Data Science project, including how to ask good questions to understand the data first, how to prepare the data, how to develop an MVP, how to iterate to build a good product, and, finally, how to present your project.

    https://www.kdnuggets.com/2020/04/guide-data-science-build-data-project.html

  • ModelDB 2.0 is here!

    We are excited to announce that ModelDB 2.0 is now available! We have learned a lot since building ModelDB 1.0, so we decided to rebuild from the ground up.

    https://www.kdnuggets.com/2020/03/verta-modeldb-20.html

  • The Most Useful Machine Learning Tools of 2020

    This article outlines 5 sets of tools every lazy full-stack data scientist should use.

    https://www.kdnuggets.com/2020/03/most-useful-machine-learning-tools-2020.html

  • Software Interfaces for Machine Learning Deployment

    While building a machine learning model might be the fun part, it won't do much for anyone else unless it can be deployed into a production environment. How to implement machine learning deployments is a special challenge with differences from traditional software engineering, and this post examines a fundamental first step -- how to create software interfaces so you can develop deployments that are automated and repeatable.

    https://www.kdnuggets.com/2020/03/software-interfaces-machine-learning-deployment.html

  • Python Pandas For Data Discovery in 7 Simple Steps

    Just getting started with Python's Pandas library for data analysis? Or, ready for a quick refresher? These 7 steps will help you become familiar with its core features so you can begin exploring your data in no time.

    https://www.kdnuggets.com/2020/03/python-pandas-data-discovery.html

  • 50 Must-Read Free Books For Every Data Scientist in 2020 (Silver Blog)

    In this article, we list some excellent data science books that cover a wide variety of topics under Data Science.

    https://www.kdnuggets.com/2020/03/50-must-read-free-books-every-data-scientist-2020.html

  • Decision Tree Intuition: From Concept to Application

    While Decision Trees have been used in machine learning for a while, the technique remains powerful and popular. This guide first provides an introductory understanding of the method and then shows you how to construct a decision tree, calculate important analysis parameters, and plot the resulting tree.

    https://www.kdnuggets.com/2020/02/decision-tree-intuition.html

  • Audio Data Analysis Using Deep Learning with Python (Part 2)

    This is a followup to the first article in this series. Once you are comfortable with the concepts explained in that article, you can come back and continue with this.

    https://www.kdnuggets.com/2020/02/audio-data-analysis-deep-learning-python-part-2.html

  • Getting Started with R Programming

    An end to end Data Analysis using R, the second most requested programming language in Data Science.

    https://www.kdnuggets.com/2020/02/getting-started-r-programming.html

  • Scaling the Wall Between Data Scientist and Data Engineer

    The educational and research focuses of machine learning tend to highlight the model building, training, testing, and optimization aspects of the data science process. Bringing these models into use requires a suite of engineering feats and organization, a standard for which does not yet exist. Learn more about a framework for operating a collaborative data science and engineering team to deploy machine learning models to end-users.

    https://www.kdnuggets.com/2020/02/scaling-wall-data-scientist-data-engineer.html

  • Introduction to Geographical Time Series Prediction with Crime Data in R, SQL, and Tableau

    When reviewing geographical data, it can be difficult to prepare the data for an analysis. This article helps by covering importing data into a SQL Server database; cleansing and grouping data into a map grid; adding time data points to the set of grid data and filling in the gaps where no crimes occurred; importing the data into R; and running an XGBoost model to determine where crimes will occur on a specific day.

    https://www.kdnuggets.com/2020/02/introduction-geographical-time-series-crime-r-sql-tableau.html

  • Basics of Audio File Processing in R

    This post provides basic information on audio processing using R as the programming language. It also walks through some basics of sound and digital audio.

    https://www.kdnuggets.com/2020/02/basics-audio-file-processing-r.html

  • Intent Recognition with BERT using Keras and TensorFlow 2

    TL;DR Learn how to fine-tune the BERT model for text classification. Train and evaluate it on a small dataset for detecting seven intents. The results might surprise you!

    https://www.kdnuggets.com/2020/02/intent-recognition-bert-keras-tensorflow.html

  • Audio File Processing: ECG Audio Using Python

    In this post, we will look into an application of audio file processing for a good cause: analysis of the ECG heartbeat, with code written in Python.

    https://www.kdnuggets.com/2020/02/audio-file-processing-ecg-audio-python.html

  • Past 2020 Meetings / Online Events on AI, Analytics, Big Data, Data Science, and Machine Learning

    https://www.kdnuggets.com/meetings/past-meetings-2020.html

  • Schema Evolution in Data Lakes

    Whereas a data warehouse will need rigid data modeling and definitions, a data lake can store different types and shapes of data. In a data lake, the schema of the data can be inferred when it’s read, providing the aforementioned flexibility. However, this flexibility is a double-edged sword.

    https://www.kdnuggets.com/2020/01/schema-evolution-data-lakes.html

  • Applying Occam’s razor to Deep Learning

    Finding a deep learning model that performs well is an exciting feat. But, might there be other -- less complex -- models that perform just as well for your application? A simple complexity measure based on the statistical physics concept of Cascading Periodic Spectral Ergodicity (cPSE) can help us be computationally efficient by considering the least complex model during model selection.

    https://www.kdnuggets.com/2020/01/occams-razor-deep-learning.html

  • The Most In Demand Tech Skills for Data Scientists

    By the end of this article you’ll know which technologies are becoming more popular with employers and which are becoming less popular.

    https://www.kdnuggets.com/2019/12/most-demand-tech-skills-data-scientists.html

  • Alternative Cloud Hosted Data Science Environments

    Over the years, new alternative providers have arisen to provide a solitary data science environment hosted on the cloud for data scientists to analyze, host and share their work.

    https://www.kdnuggets.com/2019/12/alternative-cloud-data-science-environments.html

  • Deploying a pretrained GPT-2 model on AWS

    This post attempts to summarize my recent detour into NLP, describing how I exposed a Huggingface pre-trained Language Model (LM) on an AWS-based web application.

    https://www.kdnuggets.com/2019/12/deploying-pretrained-gpt-2-model-aws.html

  • Intro to Grafana: Installation, Configuration, and Building the First Dashboard

    One of the biggest highlights of Grafana is the ability to bring several data sources together in one dashboard with adding rows that will host individual panels. Let's look at installing, configuring, and creating our first dashboard using Grafana.

    https://www.kdnuggets.com/2019/12/intro-grafana-installation-configuration-building-first-dashboard.html

  • Why software engineering processes and tools don’t work for machine learning

    While AI may be the new electricity, significant challenges remain to realize AI's potential. Here we examine why data scientists and teams can’t rely on software engineering tools and processes for machine learning.

    https://www.kdnuggets.com/2019/12/comet-software-engineering-machine-learning.html

  • Python Tuples and Tuple Methods

    Brush up on your Python basics with this post on creating, using, and manipulating tuples.

    https://www.kdnuggets.com/2019/11/python-tuples-methods.html

  • Everything a Data Scientist Should Know About Data Management (Silver Blog, Platinum Blog)

    For full-stack data science mastery, you must understand data management along with all the bells and whistles of machine learning. This high-level overview is a road map for the history and current state of the expansive options for data storage and infrastructure solutions.

    https://www.kdnuggets.com/2019/10/data-scientist-data-management.html

  • Beyond Word Embedding: Key Ideas in Document Embedding

    This literature review on document embedding techniques thoroughly covers the many ways practitioners develop rich vector representations of text -- from single sentences to entire books.

    https://www.kdnuggets.com/2019/10/beyond-word-embedding-document-embedding.html

  • Introduction to Artificial Neural Networks

    In this article, we’ll try to cover everything related to Artificial Neural Networks or ANN.

    https://www.kdnuggets.com/2019/10/introduction-artificial-neural-networks.html

  • How AI will transform healthcare (and can it fix the US healthcare system?) (Silver Blog)

    This thorough review focuses on the impact of AI, 5G, and edge computing on the healthcare sector in the 2020s as well as a look at quantum computing's potential impact on AI, healthcare, and financial services.

    https://www.kdnuggets.com/2019/09/ai-transform-healthcare.html

  • 6 bits of advice for Data Scientists (Silver Blog)

    As a data scientist, you can get lost in your daily dives into the data. Consider following this advice in your work to be more diligent and more impactful for your organization.

    https://www.kdnuggets.com/2019/09/advice-data-scientists.html

  • The thin line between data science and data engineering

    Today, as companies have finally come to understand the value that data science can bring, more and more emphasis is being placed on the implementation of data science in production systems. And as these implementations have required models that can perform on larger and larger datasets in real-time, an awful lot of data science problems have become engineering problems.

    https://www.kdnuggets.com/2019/09/thin-line-between-data-science-data-engineering.html

  • Applying Data Science to Cybersecurity Network Attacks & Events

    Check out this detailed tutorial on applying data science to the cybersecurity domain, written by an individual with backgrounds in both fields.

    https://www.kdnuggets.com/2019/09/applying-data-science-cybersecurity-network-attacks-events.html

  • My journey path from a Software Engineer to BI Specialist to a Data Scientist (Silver Blog)

    The career path of the Data Scientist remains a hot target for many with its continuing high demand. Becoming one requires developing a broad set of skills including statistics, programming, and even business acumen. Learn more about one person's experience making this journey, and discover the many resources available to help you find your way into a world of data science.

    https://www.kdnuggets.com/2019/09/journey-software-engineer-bi-data-scientist.html

  • Version Control for Data Science: Tracking Machine Learning Models and Datasets

    I am a Git god, why do I need another version control system for Machine Learning Projects?

    https://www.kdnuggets.com/2019/09/version-control-data-science-tracking-machine-learning-models-datasets.html

  • There is No Free Lunch in Data Science (Silver Blog)

    There is no such thing as a free lunch in life or data science. Here, we'll explore some science philosophy and discuss the No Free Lunch theorems to find out what they mean for the field of data science.

    https://www.kdnuggets.com/2019/09/no-free-lunch-data-science.html

  • Emoji Analytics

    Emoji is becoming a global language understandable by anyone who expresses... emotion. With the pervasiveness of these little Unicode blocks, we can perform analytics on their use throughout social media to gain insight into sentiments around the world.

    https://www.kdnuggets.com/2019/08/emoji-analytics.html

  • Object-oriented programming for data scientists: Build your ML estimator (Gold Blog)

    Implement some of the core OOP principles in a machine learning context by building your own Scikit-learn-like estimator, and making it better.

    https://www.kdnuggets.com/2019/08/object-oriented-programming-data-scientists-estimator.html

  • An Overview of Python’s Datatable package

    Modern machine learning applications need to process a humongous amount of data and generate multiple features. Python’s datatable module was created to address this issue. It is a toolkit for performing big data (up to 100GB) operations on a single-node machine, at the maximum possible speed.

    https://www.kdnuggets.com/2019/08/overview-python-datatable-package.html

  • Learn how to use PySpark in under 5 minutes (Installation + Tutorial)

    Apache Spark is one of the hottest and largest open source projects in data processing, a framework with rich high-level APIs for programming languages like Scala, Python, Java and R. It realizes the potential of bringing together both Big Data and machine learning.

    https://www.kdnuggets.com/2019/08/learn-pyspark-installation-tutorial.html

  • Introduction to Image Segmentation with K-Means clustering

    Image segmentation is the classification of an image into different groups. Many kinds of research have been done in the area of image segmentation using clustering. In this article, we will explore using the K-Means clustering algorithm to read an image and cluster different regions of the image.

    https://www.kdnuggets.com/2019/08/introduction-image-segmentation-k-means-clustering.html

  • What is Benford’s Law and why is it important for data science? (Silver Blog)

    Benford’s law is a little-known gem for data analytics. Learn about how this can be used for anomaly or fraud detection in scientific or technical publications.

    https://www.kdnuggets.com/2019/08/benfords-law-data-science.html

  • Deep Learning for NLP: ANNs, RNNs and LSTMs explained! (Silver Blog)

    Learn about Artificial Neural Networks, Deep Learning, Recurrent Neural Networks and LSTMs like never before and use NLP to build a Chatbot!

    https://www.kdnuggets.com/2019/08/deep-learning-nlp-explained.html

  • Lagrange multipliers with visualizations and code

    In this story, we’re going to take an aerial tour of optimization with Lagrange multipliers. When do we need them? Whenever we have an optimization problem with constraints.

    https://www.kdnuggets.com/2019/08/lagrange-multipliers-visualizations-code.html

  • Can we trust AutoML to go on full autopilot?

    We put an AutoML tool to the test on a real-world problem, and the results are surprising. Even with automatic machine learning, you still need expert data scientists.

    https://www.kdnuggets.com/2019/07/automl-full-autopilot.html

  • Top 13 Skills To Become a Rockstar Data Scientist (Platinum Blog)

    Education, coding, SQL, big data platforms, storytelling and more. These are the 13 skills you need to master to become a rockstar data scientist.

    https://www.kdnuggets.com/2019/07/top-13-skills-become-rockstar-data-scientist.html

  • Pre-training, Transformers, and Bi-directionality

    Bidirectional Encoder Representations from Transformers BERT (Devlin et al., 2018) is a language representation model that combines the power of pre-training with the bi-directionality of the Transformer’s encoder (Vaswani et al., 2017). BERT improves the state-of-the-art performance on a wide array of downstream NLP tasks with minimal additional task-specific training.

    https://www.kdnuggets.com/2019/07/pre-training-transformers-bi-directionality.html

  • 10 Gradient Descent Optimisation Algorithms + Cheat Sheet

    Gradient descent is an optimization algorithm used for minimizing the cost function in various ML algorithms. Here are some common gradient descent optimisation algorithms used in the popular deep learning frameworks such as TensorFlow and Keras.

    https://www.kdnuggets.com/2019/06/gradient-descent-algorithms-cheat-sheet.html

  • The Data Fabric for Machine Learning – Part 2: Building a Knowledge-Graph

    Before we can develop a Data Fabric, we need to build a Knowledge-Graph. In this article I’ll lay the groundwork for creating one; in the next article we’ll put it into practice.

    https://www.kdnuggets.com/2019/06/data-fabric-machine-learning-building-knowledge-graph.html

  • Understanding Cloud Data Services (Gold Blog)

    Ready to move your systems to a cloud vendor or just learning more about big data services? This overview will help you understand big data system architectures, components, and offerings with an end-to-end taxonomy of what is available from the big three cloud providers.

    https://www.kdnuggets.com/2019/06/understanding-cloud-data-services.html

  • Random Forests® vs Neural Networks: Which is Better, and When? (Silver Blog)

    Random Forests and Neural Networks are two widely used machine learning algorithms. What is the difference between the two approaches? When should one use a Neural Network or a Random Forest?

    https://www.kdnuggets.com/2019/06/random-forest-vs-neural-network.html

  • How to choose a visualization (Gold Blog)

    Visualizations based on the structure of data are needed during analysis, which might be different than for the end user. A new guide for choosing the right visualization helps you flexibly understand the data first.

    https://www.kdnuggets.com/2019/06/how-choose-visualization.html

  • Animations with Matplotlib

    Animations make even more sense when depicting time series data like stock prices over the years, climate change over the past decade, seasonalities and trends since we can then see how a particular parameter behaves with time.

    https://www.kdnuggets.com/2019/05/animations-with-matplotlib.html

  • Boost Your Image Classification Model

    Check out this collection of tricks to improve the accuracy of your classifier.

    https://www.kdnuggets.com/2019/05/boost-your-image-classification-model.html

  • Analyzing Tweets with NLP in Minutes with Spark, Optimus and Twint

    Social media has been gold for studying the way people communicate and behave. In this article I’ll show you the easiest way to analyze tweets without the Twitter API, in a way that scales to Big Data.

    https://www.kdnuggets.com/2019/05/analyzing-tweets-nlp-spark-optimus-twint.html

  • Which Data Science / Machine Learning methods and algorithms did you use in 2018/2019 for a real-world application?

    Which Data Science / Machine Learning methods and algorithms did you use in 2018/2019 for a real-world application? Take part in the latest KDnuggets survey and have your say.

    https://www.kdnuggets.com/2019/04/poll-data-science-machine-learning-methods-algorithms-use-2018-2019.html

  • Top 10 Coding Mistakes Made by Data Scientists (Gold Blog, Platinum Blog)

    Here is a list of 10 common mistakes frequently seen by a senior data scientist who is ranked in the top 1% on Stack Overflow for Python coding and who works with a lot of (junior) data scientists.

    https://www.kdnuggets.com/2019/04/top-10-coding-mistakes-data-scientists.html

  • Data Pipelines, Luigi, Airflow: Everything you need to know

    This post focuses on the workflow management system (WMS) Airflow: what it is, what you can do with it, and how it differs from Luigi.

    https://www.kdnuggets.com/2019/03/data-pipelines-luigi-airflow-everything-need-know.html

  • Feature Reduction using Genetic Algorithm with Python

    This tutorial discusses how to use the genetic algorithm (GA) for reducing the feature vector extracted from the Fruits360 dataset in Python mainly using NumPy and Sklearn.

    https://www.kdnuggets.com/2019/03/feature-reduction-genetic-algorithm-python.html

  • Deploy your PyTorch model to Production

    This tutorial aims to teach you how to deploy your recently trained model in PyTorch as an API using Python.

    https://www.kdnuggets.com/2019/03/deploy-pytorch-model-production.html

  • Artificial Neural Networks Optimization using Genetic Algorithm with Python (Platinum Blog)

    This tutorial explains the usage of the genetic algorithm for optimizing the network weights of an Artificial Neural Network for improved performance.

    https://www.kdnuggets.com/2019/03/artificial-neural-networks-optimization-genetic-algorithm-python.html

  • [eBook] Standardizing the Machine Learning Lifecycle

    We explore what makes the machine learning lifecycle so challenging compared to regular software, and share the Databricks approach.

    https://www.kdnuggets.com/2019/03/databrocks-ebook-machine-learning-lifecycle.html

  • Deconstructing BERT, Part 2: Visualizing the Inner Workings of Attention

    In this post, the author shows how BERT can mimic a Bag-of-Words model. The visualization tool from Part 1 is extended to probe deeper into the mind of BERT, to expose the neurons that give BERT its shape-shifting superpowers.

    https://www.kdnuggets.com/2019/03/deconstructing-bert-part-2-visualizing-inner-workings-attention.html

  • On Building Effective Data Science Teams

    We take a look at the qualities that make a successful data team in order to help business leaders and executives create better AI strategies.

    https://www.kdnuggets.com/2019/03/building-effective-data-science-teams.html

  • What are Some “Advanced” AI and Machine Learning Online Courses?

    Where can you find not-so-common but high-quality free online courses for ‘advanced’ machine learning and artificial intelligence?

    https://www.kdnuggets.com/2019/02/some-advanced-ai-machine-learning-online-courses.html

  • Artificial Neural Network Implementation using NumPy and Image Classification

    This tutorial builds an artificial neural network from scratch in Python using NumPy and applies it to image classification on the Fruits360 dataset.

    https://www.kdnuggets.com/2019/02/artificial-neural-network-implementation-using-numpy-and-image-classification.html

  • Comparing Machine Learning Models: Statistical vs. Practical Significance

    Is model A or B more accurate? Hmm… In this blog post, I’d love to share my recent findings on model comparison.

    https://www.kdnuggets.com/2019/01/comparing-machine-learning-models-statistical-vs-practical-significance.html

  • 10 Exciting Ideas of 2018 in NLP

    We outline a selection of exciting developments in NLP from the last year, and include useful recent papers and images to help further assist with your learning.

    https://www.kdnuggets.com/2019/01/10-exciting-ideas-2018-nlp.html

  • How to solve 90% of NLP problems: a step-by-step guide

    Read this insightful, step-by-step article on how to use machine learning to understand and leverage text.

    https://www.kdnuggets.com/2019/01/solve-90-nlp-problems-step-by-step-guide.html

  • Practical Apache Spark in 10 Minutes

    Check out this series of articles on Apache Spark. Each part is a 10 minute tutorial on a particular Apache Spark topic. Read on to get up to speed using Spark.

    https://www.kdnuggets.com/2019/01/practical-apache-spark-10-minutes.html

  • The Role of the Data Engineer is Changing

    The role of the data engineer in a startup data team is changing rapidly. Are you thinking about it the right way?

    https://www.kdnuggets.com/2019/01/role-data-engineer-changing.html

  • Comparison of the Top Speech Processing APIs

    There are two main tasks in speech processing. The first is to transform speech to text. The second is to convert text into human speech. We describe the general aspects of each API and then compare their main features in a table.

    https://www.kdnuggets.com/2018/12/activewizards-comparison-speech-processing-apis.html

  • Introduction to Named Entity Recognition

    Named Entity Recognition is a tool that invariably comes in handy when we do Natural Language Processing tasks. Read on to find out how.

    https://www.kdnuggets.com/2018/12/introduction-named-entity-recognition.html

  • Best Machine Learning Languages, Data Visualization Tools, DL Frameworks, and Big Data Tools

    We cover a variety of topics, from machine learning to deep learning, from data visualization to data tools, with comments and explanations from experts in the relevant fields.

    https://www.kdnuggets.com/2018/12/machine-learning-data-visualization-deep-learning-tools.html

  • [Download] Real-Life ML Examples + Notebooks

    In this eBook, we will walk you through four Machine Learning use cases on Databricks: Loan Risk Use Case; Advertising Analytics & Prediction Use Case; Market Basket Analysis Problem at Scale; Suspicious Behavior Identification in Video Use Case. Get your copy now!

    https://www.kdnuggets.com/2018/11/databricks-ebook-machine-learning-use-cases.html

  • The Most in Demand Skills for Data Scientists

    Data scientists are expected to know a lot — machine learning, computer science, statistics, mathematics, data visualization, communication, and deep learning. How should data scientists who want to be in demand by employers spend their learning budget?

    https://www.kdnuggets.com/2018/11/most-demand-skills-data-scientists.html

  • Building an Image Classifier Running on Raspberry Pi

    The tutorial starts by building the physical network that connects the Raspberry Pi to the PC via a router. After their IPv4 addresses are configured, an SSH session is created for remote access to the Raspberry Pi. After the classification project is uploaded using FTP, clients can access it from web browsers to classify images.

    https://www.kdnuggets.com/2018/10/building-image-classifier-running-raspberry-pi.html

  • Linear Regression in the Wild

    We take a look at how to use linear regression when the dependent variables have measurement errors.

    https://www.kdnuggets.com/2018/10/linear-regression-wild.html

  • More Effective Transfer Learning for NLP

    Until recently, the natural language processing community was lacking its ImageNet equivalent — a standardized dataset and training objective to use for training base models.

    https://www.kdnuggets.com/2018/10/more-effective-transfer-learning-nlp.html

  • Introduction to Deep Learning

    I decided to begin to put some structure in my understanding of Neural Networks through this series of articles.

    https://www.kdnuggets.com/2018/09/introduction-deep-learning.html

  • Power Laws in Deep Learning

    In pretrained, production-quality DNNs, the weight matrices for the Fully Connected (FC) layers display fat-tailed power law behavior.

    https://www.kdnuggets.com/2018/09/power-laws-deep-learning.html

  • Free resources to learn Natural Language Processing

    An extensive list of free resources to help you learn Natural Language Processing, including explanations on Text Classification, Sequence Labeling, Machine Translation and more.

    https://www.kdnuggets.com/2018/09/free-resources-natural-language-processing.html

  • Deep Learning for NLP: An Overview of Recent Trends

    A new paper discusses some of the recent trends in deep learning based natural language processing (NLP) systems and applications. The focus is on the review and comparison of models and methods that have achieved state-of-the-art (SOTA) results on various NLP tasks and some of the current best practices for applying deep learning in NLP.

    https://www.kdnuggets.com/2018/09/deep-learning-nlp-overview-recent-trends.html

  • DynamoDB vs. Cassandra: from “no idea” to “it’s a no-brainer”

    DynamoDB vs. Cassandra: have they got anything in common? If yes, what? If no, what are the differences? We answer these questions and examine performance of both databases.

    https://www.kdnuggets.com/2018/08/dynamodb-vs-cassandra.html

  • Comparison of the Most Useful Text Processing APIs

    Comparing different APIs helps clarify their key pros and cons and when it is better to use one API instead of another. Let us proceed with the comparison.

    https://www.kdnuggets.com/2018/08/comparison-most-useful-text-processing-apis.html

  • Optimization 101 for Data Scientists

    We show how to use optimization strategies to make the best possible decision.

    https://www.kdnuggets.com/2018/08/optimization-101-data-scientists.html

  • Programming Best Practices For Data Science

    In this post, I'll go over the two mindsets most people switch between when doing programming work specifically for data science: the prototype mindset and the production mindset.

    https://www.kdnuggets.com/2018/08/programming-best-practices-data-science.html

  • Cookiecutter Data Science: How to Organize Your Data Science Project

    A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.

    https://www.kdnuggets.com/2018/07/cookiecutter-data-science-organize-data-project.html

  • Introduction to Apache Spark

    This is the first blog in this series to analyze Big Data using Spark. It provides an introduction to Spark and its ecosystem.

    https://www.kdnuggets.com/2018/07/introduction-apache-spark.html

  • fast.ai Machine Learning Course Notes

    This post is a collection of fantastic notes on the fast.ai machine learning MOOC, freely available online, as written and shared by a student. These notes are a valuable learning resource, either as a supplement to the courseware or on their own.

    https://www.kdnuggets.com/2018/07/suenaga-fast-ai-machine-learning-notes.html

  • Using Topological Data Analysis to Understand the Behavior of Convolutional Neural Networks

    Neural Networks are powerful but complex and opaque tools. Using Topological Data Analysis, we can describe the functioning and learning of a convolutional neural network in a compact and understandable way.

    https://www.kdnuggets.com/2018/06/topological-data-analysis-convolutional-neural-networks.html

  • Why the Data Lake Matters

    This post explains why the data lake matters, outlining the complexity of a data lake and looking at its evolution over time.

    https://www.kdnuggets.com/2018/06/why-data-lake-matters.html

  • IoT on AWS: Machine Learning Models and Dashboards from Sensor Data

    I developed my first IoT project using my notebook as an IoT device and AWS IoT as infrastructure, with this "simple" idea: collect CPU Temperature from my Notebook running on Ubuntu, send to Amazon AWS IoT, save data, make it available for Machine Learning models and dashboards.

    https://www.kdnuggets.com/2018/06/zimbres-iot-aws-machine-learning-dashboard.html

  • Using Linear Regression for Predictive Modeling in R

    In this post, we’ll use linear regression to build a model that predicts cherry tree volume from metrics that are much easier for folks who study trees to measure.

    https://www.kdnuggets.com/2018/06/linear-regression-predictive-modeling-r.html

  • 9 Must-have skills you need to become a Data Scientist, updated

    Check out this collection of 9 (plus some additional freebies) must-have skills for becoming a data scientist.

    https://www.kdnuggets.com/2018/05/simplilearn-9-must-have-skills-data-scientist.html

  • Complete Guide to Build ConvNet HTTP-Based Application using TensorFlow and Flask RESTful Python API

    In this tutorial, a CNN is built, trained, and tested on the CIFAR10 dataset. To make the model remotely accessible, a Flask web application is created in Python to receive an uploaded image and return its classification label over HTTP.

    https://www.kdnuggets.com/2018/05/complete-guide-convnet-tensorflow-flask-restful-python-api.html

  • Data Augmentation: How to use Deep Learning when you have Limited Data

    This article is a comprehensive review of Data Augmentation techniques for Deep Learning, specific to images.

    https://www.kdnuggets.com/2018/05/data-augmentation-deep-learning-limited-data.html

  • Data Science Interview Guide

    Traditionally, Data Science would focus on mathematics, computer science and domain expertise. While I will briefly cover some computer science fundamentals, the bulk of this blog will cover the mathematical basics one might need to brush up on (or even take an entire course on).

    https://www.kdnuggets.com/2018/04/data-science-interview-guide.html

  • Presto for Data Scientists – SQL on anything

    Presto enables data scientists to run interactive SQL across multiple data sources. This open source engine supports querying anything, anywhere, and at large scale.

    https://www.kdnuggets.com/2018/04/presto-data-scientists-sql.html

  • Don’t learn Machine Learning in 24 hours

    When it comes to machine learning, there's no quick way of teaching yourself - you're in it for the long haul.

    https://www.kdnuggets.com/2018/04/dont-learn-machine-learning-24-hours.html

  • Exploring DeepFakes

    In this post, I explore the capabilities of this tech, describe how it works, and discuss potential applications.

    https://www.kdnuggets.com/2018/03/exploring-deepfakes.html

  • A Beginner’s Guide to Data Engineering – Part II

    In this post, I share more technical details on how to build good data pipelines and highlight ETL best practices. Primarily, I will use Python, Airflow, and SQL for our discussion.

    https://www.kdnuggets.com/2018/03/beginners-guide-data-engineering-part-2.html

  • Logistic Regression: A Concise Technical Overview

    Interested in learning the concepts behind Logistic Regression (LogR)? Looking for a concise introduction to LogR? This article is for you. Includes a Python implementation and links to an R script as well.

    https://www.kdnuggets.com/2018/02/logistic-regression-concise-technical-overview.html

  • Comparing Machine Learning as a Service: Amazon, Microsoft Azure, Google Cloud AI

    A complete and unbiased comparison of the three most common Cloud Technologies for Machine Learning as a Service.

    https://www.kdnuggets.com/2018/01/mlaas-amazon-microsoft-azure-google-cloud-ai.html

  • Propensity Score Matching in R

    Propensity scores are an alternative method to estimate the effect of receiving treatment when random assignment of treatments to subjects is not feasible.

    https://www.kdnuggets.com/2018/01/propensity-score-matching-r.html
