Search results for cost function

    Found 970 documents, 5944 searched:

  • Top Programming Languages and Their Uses

    The landscape of programming languages is rich and expanding, which can make it tricky to focus on just one or another for your career. We highlight some of the most popular languages that are modern, widely used, and come with loads of packages or libraries that will help you be more productive and efficient in your work.

    https://www.kdnuggets.com/2021/05/top-programming-languages.html

  • How to Process a DataFrame with Millions of Rows in Seconds

    TLDR; process it with a new Python Data Processing Engine in the Cloud.

    https://www.kdnuggets.com/2022/01/process-dataframe-millions-rows-seconds.html

  • A Deep Look Into 13 Data Scientist Roles and Their Responsibilities

    Any modern company of any significant size around the world has a data science department, and a data engineer at one company might have the same responsibilities as a marketing scientist at another company. Data science jobs are not well-labeled, so make sure to cast a wide net.

    https://www.kdnuggets.com/2022/01/deep-look-13-data-scientist-roles-responsibilities.html

  • What Makes Python An Ideal Programming Language For Startups

    In this blog, we will discuss what makes Python so popular, its features, and why you should consider Python as a programming language for your startup.

    https://www.kdnuggets.com/2021/12/makes-python-ideal-programming-language-startups.html

  • 11 Best Companies to Work for as a Data Scientist

    This list of best data science companies aims to go beyond the usual and expected. Some great and perhaps underrated options to get a job as a data scientist.

    https://www.kdnuggets.com/2021/12/11-best-companies-work-data-scientist.html

  • Explainable Forecasting and Nowcasting with State-of-the-art Deep Neural Networks and Dynamic Factor Model

    Review this detailed tutorial with code and revisit a decades-old problem using a democratized and interpretable AI framework: how precisely can we anticipate the future and understand its causal factors?

    https://www.kdnuggets.com/2021/12/sota-explainable-forecasting-and-nowcasting.html

  • Data Science & Analytics Industry Main Developments in 2021 and Key Trends for 2022

    We have solicited insights from experts at industry-leading companies, asking: "What were the main AI, Data Science, Machine Learning Developments in 2021 and what key trends do you expect in 2022?" Read their opinions here.

    https://www.kdnuggets.com/2021/12/developments-predictions-data-science-analytics-industry.html

  • What Is AI Model Governance?

    How exactly does AI model governance help tackle these issues? And how can you ensure you’re using it to best fit your needs? Read on.

    https://www.kdnuggets.com/2021/12/ai-model-governance.html

  • Data Labeling for Machine Learning: Market Overview, Approaches, and Tools

    So much of data science and machine learning is founded on having clean and well-understood data sources that it is unsurprising that the data labeling market is growing faster than ever. Here, we highlight many of the top players in this industry and the techniques they use to help you consider which might make a good partner for your needs.

    https://www.kdnuggets.com/2021/12/data-labeling-ml-overview-and-tools.html

  • Main 2021 Developments and Key 2022 Trends in AI, Data Science, Machine Learning Technology

    Our panel of leading experts reviews 2021 main developments and examines the key trends in AI, Data Science, Machine Learning, and Deep Learning Technology.

    https://www.kdnuggets.com/2021/12/trends-ai-data-science-ml-technology.html

  • Deep Neural Networks Don’t Lead Us Towards AGI

    Machine learning techniques continue to evolve with increased efficiency for recognition problems. But, they still lack the critical element of intelligence, so we remain a long way from attaining AGI.

    https://www.kdnuggets.com/2021/12/deep-neural-networks-not-toward-agi.html

  • How to Get Certified as a Data Scientist

    If you are early in your journey to becoming a Data Scientist, an interesting option is to earn certification by DataCamp, and this guide offers tips that will help beginners complete the challenges.

    https://www.kdnuggets.com/2021/12/get-certified-data-science.html

  • 5 Practical Data Science Projects That Will Help You Solve Real Business Problems for 2022

    This curated list of data science projects offers real-life problems that will help you master skills to demonstrate that you are technically sound and know how to conduct data science projects that add business value.

    https://www.kdnuggets.com/2021/12/5-practical-data-science-projects.html

  • Movie Recommendations with Spark Collaborative Filtering

    Not sure what movie to watch? Ask your recommender system.

    https://www.kdnuggets.com/2021/12/movie-recommendations-spark-collaborative-filtering.html

  • Sentiment Analysis API vs Custom Text Classification: Which one to choose?

    In this article, we are going to compare the sentiment extraction performance between Sentiment Analysis engines and Custom Text classification engines. The idea is to show pros and cons of these two types of engines on a concrete dataset.

    https://www.kdnuggets.com/2021/11/sentiment-analysis-api-custom-text-classification.html

  • Where NLP is heading

    Natural language processing research and applications are moving forward rapidly. Several trends have emerged from this progress and point to a future of more exciting possibilities and interesting opportunities in the field.

    https://www.kdnuggets.com/2021/11/where-nlp-is-heading.html

  • How I Redesigned over 100 ETL into ELT Data Pipelines

    Learn how to level up your Data Pipelines!

    https://www.kdnuggets.com/2021/11/redesigned-over-100-etl-elt-data-pipelines.html

  • Anecdotes from 11 Role Models in Machine Learning

    The skills needed to create good data are also the skills needed for good leadership.

    https://www.kdnuggets.com/2021/11/anecdotes-11-role-models-machine-learning.html

  • The Best Ways for Data Professionals to Market AWS Skills in 2022

    Knowing your way around Amazon Web Services (AWS) is increasingly useful. Here are five ways to market your AWS skills in today’s job market.

    https://www.kdnuggets.com/2021/11/best-ways-data-professionals-market-aws-skills.html

  • Design Patterns for Machine Learning Pipelines

    ML pipeline design has undergone several evolutions in the past decade with advances in memory and processor performance, storage systems, and the increasing scale of data sets. We describe how these design patterns changed, what processes they went through, and their future direction.

    https://www.kdnuggets.com/2021/11/design-patterns-machine-learning-pipelines.html

  • Advanced PyTorch Lightning with TorchMetrics and Lightning Flash

    In this tutorial we will be diving deeper into two additional tools you should be using: TorchMetrics and Lightning Flash. TorchMetrics unsurprisingly provides a modular approach to define and track useful metrics across batches and devices, while Lightning Flash offers a suite of functionality facilitating more efficient transfer learning and data handling, and a recipe book of state-of-the-art approaches to typical deep learning problems.

    https://www.kdnuggets.com/2021/11/advanced-pytorch-lightning-torchmetrics-lightning-flash.html

  • ETL and ELT: A Guide and Market Analysis

    ETL and related techniques remain a powerful and foundational tool in the data industry. We explain what ETL is and how ETL and ELT processes have evolved over the years, with a close eye toward how third-generation ETL tools are about to disrupt standard data processing practices.

    https://www.kdnuggets.com/2021/10/etl-elt-guide-market-analysis.html

  • How to Build Data Frameworks with Open Source Tools to Enhance Agility and Security

    Let’s take a look at how to harness open source tools to build your data frameworks.

    https://www.kdnuggets.com/2021/10/build-data-frameworks-open-source-tools-agility-security.html

  • A Guide to 14 Different Data Science Jobs

    The field of data science is growing into one that features a variety of job titles. This guide reviews different positions available for you to consider if you have a data science background.

    https://www.kdnuggets.com/2021/10/guide-14-different-data-science-jobs.html

  • Machine Learning Model Development and Model Operations: Principles and Practices

    ML model management and the delivery of a high-performing model are as important as the initial build of the model with the right dataset. The concepts of model retraining, model versioning, model deployment, and model monitoring are the basis for machine learning operations (MLOps), which helps data science teams deliver high-performing models.

    https://www.kdnuggets.com/2021/10/machine-learning-model-development-operations-principles-practice.html

  • Guide To Finding The Right Predictive Maintenance Machine Learning Techniques

    What happens to a life so dependent on machines, when that particular machine breaks down? This is precisely why there’s a dire need for predictive maintenance with machine learning.

    https://www.kdnuggets.com/2021/10/guide-right-predictive-maintenance-machine-learning-techniques.html

  • Introduction to AutoEncoder and Variational AutoEncoder (VAE)

    Autoencoders and their variants are interesting and powerful artificial neural networks used in unsupervised learning scenarios. Learn how autoencoders perform in their different approaches and how to implement them with Keras on the instructional data set of the MNIST digits.

    https://www.kdnuggets.com/2021/10/introduction-autoencoder-variational-autoencoder-vae.html

  • Data Scientist vs Data Engineer Salary

    What are the differences between these two popular tech roles?

    https://www.kdnuggets.com/2021/10/data-scientist-data-engineer-salary.html

  • Serving ML Models in Production: Common Patterns

    Over the past couple years, we've seen 4 common patterns of machine learning in production: pipeline, ensemble, business logic, and online learning. In the ML serving space, implementing these patterns typically involves a tradeoff between ease of development and production readiness. Ray Serve was built to support these patterns by being both easy to develop and production ready.

    https://www.kdnuggets.com/2021/10/serving-ml-models-production-common-patterns.html

  • How to calculate confidence intervals for performance metrics in Machine Learning using an automatic bootstrap method

    Are your model performance measurements very precise due to a “large” test set, or very uncertain due to a “small” or imbalanced test set?

    https://www.kdnuggets.com/2021/10/calculate-confidence-intervals-performance-metrics-machine-learning.html

  • Will Your Job be Replaced by a Machine?

    Yes! It will happen. However, you can pivot and thrive in this disruptive time by becoming a Citizen Developer!

    https://www.kdnuggets.com/2021/10/job-replaced-machine.html

  • How I Built A Perfect Model And Got Into Trouble

    Data-driven decisions, actionable insights, business impact—you've seen these buzzwords in data science job descriptions. But, just focusing on these terms doesn't automatically lead to the best results. Learn from this real-world scenario that followed data-driven indecisiveness, found misleading insights, and initially created a negative business impact.

    https://www.kdnuggets.com/2021/10/perfect-model-trouble.html

  • AutoML: An Introduction Using Auto-Sklearn and Auto-PyTorch

    AutoML is a broad category of techniques and tools for applying automated search to your automated search and learning to your learning. In addition to Auto-Sklearn, the Freiburg-Hannover AutoML group has also developed an Auto-PyTorch library. We’ll use both of these as our entry point into AutoML in the following simple tutorial.

    https://www.kdnuggets.com/2021/10/automl-introduction-auto-sklearn-auto-pytorch.html

  • 38 Free Courses on Coursera for Data Science

    There are so many online resources for learning data science, and a great deal of it can be used at no cost. This collection of free courses hosted by Coursera will help you enhance your data science and machine learning skills, no matter your current level of expertise.

    https://www.kdnuggets.com/2021/10/38-free-courses-coursera-datascience.html

  • Data science SQL interview questions from top tech firms

    As a data scientist, there is one thing you really need to understand and know how to handle: data. With SQL being a foundational technical approach for working with data, it should not be surprising that the top tech companies will ask about your SQL skills during an interview. Here, we cover the key concepts tested so you can best prepare for your next data science interview.

    https://www.kdnuggets.com/2021/10/data-science-sql-interview-questions.html

  • Surpassing Trillion Parameters and GPT-3 with Switch Transformers – a path to AGI?

    Ever larger models churning on increasingly faster machines suggest a potential path toward smarter AI, such as with the massive GPT-3 language model. However, new, more lean, approaches are being conceived and explored that may rival these super-models, which could lead to a future with more efficient implementations of advanced AI-driven systems.

    https://www.kdnuggets.com/2021/10/trillion-parameters-gpt-3-switch-transformers-path-agi.html

  • MLOps and ModelOps: What’s the Difference and Why it Matters

    These two terms are often used interchangeably. However, there are key distinctions between the functionality and features each provide, and the AI value and scalability at your organization depend on them.

    https://www.kdnuggets.com/2021/09/mlops-modelops-difference.html

  • Computer Vision in Agriculture

    Deep learning isn’t just for placing ads or identifying cats anymore. Instead, a slew of young startups have started to incorporate the advances in computer vision made possible through larger and larger neural networks into real working robots in the fields.

    https://www.kdnuggets.com/2021/09/computer-vision-agriculture.html

  • How To Deal With Imbalanced Classification, Without Re-balancing the Data

    Before considering oversampling your skewed data, try adjusting your classification decision threshold, in Python.

    https://www.kdnuggets.com/2021/09/imbalanced-classification-without-re-balancing-data.html

  • Messy Data is Beautiful

    Once these types of data have been cleaned, they do more than show organized data sets. They reveal unlimited possibilities, and AI analytics can reveal these possibilities faster and more efficiently than ever before.

    https://www.kdnuggets.com/2021/09/sparkbeyond-messy-data-is-beautiful.html

  • GitHub Copilot and the Rise of AI Language Models in Programming Automation

    Read on to learn more about what makes Copilot different from previous autocomplete tools (including TabNine), and why this particular tool has been generating so much controversy.

    https://www.kdnuggets.com/2021/09/github-copilot-rise-ai-language-models-programming-automation.html

  • How to be a Data Scientist without a STEM degree

    Breaking into data science as a professional does require technical skills, a well-honed knack for problem-solving, and a willingness to swim in oceans of data. Maybe you are coming in as a career change or ready to take a new learning path in life--without having previously earned an advanced degree in a STEM field. Follow these tips to find your way into this high-demand and interesting field.

    https://www.kdnuggets.com/2021/09/data-scientist-without-stem-degree.html

  • Adventures in MLOps with Github Actions, Iterative.ai, Label Studio and NBDEV

    This article documents the authors' experience building their custom MLOps approach.

    https://www.kdnuggets.com/2021/09/adventures-mlops-github-actions-iterative-ai-label-studio-and-nbdev.html

  • Introduction to Automated Machine Learning

    AutoML enables developers with limited ML expertise (and coding experience) to train high-quality models specific to their business needs. For this article, we will focus on AutoML systems which cater to everyday business and technology applications.

    https://www.kdnuggets.com/2021/09/introduction-automated-machine-learning.html

  • Smart Ingestion: Using ontology-driven AI

    Imagine data that organizes itself to power your decision-making.

    https://www.kdnuggets.com/2021/09/smart-ingestion-ontology-driven-ai.html

  • Fast AutoML with FLAML + Ray Tune

    Microsoft Researchers have developed FLAML (Fast Lightweight AutoML) which can now utilize Ray Tune for distributed hyperparameter tuning to scale up FLAML’s resource-efficient & easily parallelizable algorithms across a cluster.

    https://www.kdnuggets.com/2021/09/fast-automl-flaml-ray-tune.html

  • CSV Files for Storage? No Thanks. There’s a Better Option

    Saving data to CSVs is costing you both money and disk space. It’s time to end it.

    https://www.kdnuggets.com/2021/08/csv-files-storage-better-option.html

  • How causal inference lifts augmented analytics beyond flatland

    In our quest to better understand and predict business outcomes, traditional predictive modeling tends to fall flat. However, causal inference techniques along with business analytics approaches can unravel what truly changes your KPIs.

    https://www.kdnuggets.com/2021/08/causal-inference-augmented-analytics-beyond-flatland.html

  • Coding Ethics for AI & AIOps: Designing Responsible AI Systems

    AIOps has taken human-machine collaboration to the next level, where humans and machines are not just coexisting but collaborating and working together like team members.

    https://www.kdnuggets.com/2021/08/coding-ethics-ai-aiops-designing-responsible-ai-systems.html

  • 11 Best Data Science Education Platforms

    We cover the 11 best data science education platforms for 11 different use cases, ranging from specific languages to hands-on learning to the best free option.

    https://www.kdnuggets.com/2021/08/11-best-data-science-education-platforms.html

  • Essential Features of An Efficient Data Integration Solution

    This blog highlights the essential features of a data integration solution that help an organization generate consistent and accurate data to keep the business running smoothly.

    https://www.kdnuggets.com/2021/08/essential-features-efficient-data-integration-solution.html

  • Django’s 9 Most Common Applications

    Django is a Python web application framework enjoying widespread adoption in the data science community. But what else can you use Django for? Read this article for 9 use cases where you can put Django to work.

    https://www.kdnuggets.com/2021/08/django-9-common-applications.html

  • Model Drift in Machine Learning – How To Handle It In Big Data

    Rendezvous Architecture helps you run and choose outputs from a Champion model and many Challenger models running in parallel without many overheads. The original approach works well for smaller data sets, so how can this idea adapt to big data pipelines?

    https://www.kdnuggets.com/2021/08/model-drift-machine-learning-big-data.html

  • Agile Data Labeling: What it is and why you need it

    The notion of Agile in software development has made waves across industries with its revolution for productivity. Can the same benefits be applied to the often arduous task of annotating data sets for machine learning?

    https://www.kdnuggets.com/2021/08/agile-data-labeling.html

  • Writing Your First Distributed Python Application with Ray

    Using Ray, you can take Python code that runs sequentially and transform it into a distributed application with minimal code changes. Read on to find out why you should use Ray, and how to get started.

    https://www.kdnuggets.com/2021/08/distributed-python-application-ray.html

  • MLOps And Machine Learning Roadmap

    A 16–20 week roadmap to review machine learning and learn MLOps.

    https://www.kdnuggets.com/2021/08/mlops-machine-learning-roadmap.html

  • DeepMind’s New Super Model: Perceiver IO is a Transformer that can Handle Any Dataset

    The new transformer-based architecture can process audio, video and images using a single model.

    https://www.kdnuggets.com/2021/08/deepmind-new-super-model-perceiver-io-transformer.html

  • Practising SQL without your own database

    SQL is a very important skill for data analysts and data scientists. However, when you are just starting out learning in the field, how can you practice querying with SQL if you don’t have any data stored in a database?

    https://www.kdnuggets.com/2021/08/sql-without-own-database.html

  • Including ModelOps in your AI strategy

    The strategic power of AI has been established thoroughly across many industries and companies, leading to surges in model creation. Investments in the people, processes, and tools for operationalizing models, referred to as ModelOps, lag. This function of operationalizing, integrating, and deploying AI models in line with business value expectations is growing into a core business capability as global use of AI matures.

    https://www.kdnuggets.com/2021/08/modelops-ai-strategy.html

  • Using Twitter to Understand Pizza Delivery Apprehension During COVID

    Analyzing customer sentiment and capturing any specific differences in emotion about ordering Domino's pizza in India during lockdown.

    https://www.kdnuggets.com/2021/08/twitter-understand-pizza-delivery-covid.html

  • Bootstrap a Modern Data Stack in 5 minutes with Terraform

    What is a Modern Data Stack and how do you deploy one? This guide will motivate you to start on this journey with setup instructions for Airbyte, BigQuery, dbt, Metabase, and everything else you need using Terraform.

    https://www.kdnuggets.com/2021/08/bootstrap-modern-data-stack-terraform.html

  • How To Become A Freelance Data Scientist – 4 Practical Tips

    If you are a nerd-ish data scientist who wants to start working as an independent (remote) freelance data scientist, then these four practical tips can help you transition from the traditional 9-to-5 job to a dynamic experience as a remote contractor, just as the author did three years ago.

    https://www.kdnuggets.com/2021/08/how-become-freelance-data-scientist.html

  • How To 2x Your Data Analytics Consulting Rates (Overnight)

    Looking to up your data analytics consulting rates? Learn exactly what most freelancers are charging, and the rates you SHOULD be charging as a business intelligence and analytics consultant. This post will show you what you need to know to achieve maximum results for your data consulting career.

    https://www.kdnuggets.com/2021/08/2x-data-analytics-consulting-rates-overnight.html

  • Development & Testing of ETL Pipelines for AWS Locally

    Typically, ETL pipelines are developed and tested on real environments or clusters, which are time-consuming to set up and require maintenance. This article focuses on the development and testing of ETL pipelines locally with the help of Docker and LocalStack. The solution gives flexibility to test in a local environment without setting up any services on the cloud.

    https://www.kdnuggets.com/2021/08/development-testing-etl-pipelines-aws-locally.html

  • 10 Machine Learning Model Training Mistakes

    These common ML model training mistakes are easy to overlook but costly to fix.

    https://www.kdnuggets.com/2021/07/10-machine-learning-model-training-mistakes.html

  • MLOps Best Practices

    Many technical challenges must be overcome to achieve successful delivery of machine learning solutions at scale. This article shares best practices we encountered while architecting and applying a model deployment platform within a large organization, including required functionality, the recommendation for a scalable deployment pattern, and techniques for testing and performance tuning models to maximize platform throughput.

    https://www.kdnuggets.com/2021/07/mlops-best-practices.html

  • Not Only for Deep Learning: How GPUs Accelerate Data Science & Data Analytics

    Modern AI/ML systems’ success has been critically dependent on their ability to process massive amounts of raw data in a parallel fashion using task-optimized hardware. Can we leverage the power of GPU and distributed computing for regular data processing jobs too?

    https://www.kdnuggets.com/2021/07/deep-learning-gpu-accelerate-data-science-data-analytics.html

  • WHT: A Simpler Version of the fast Fourier Transform (FFT) you should know

    The fast Walsh Hadamard transform is a simple and useful algorithm for machine learning that was popular in the 1960s and early 1970s. This useful approach should be more widely appreciated and applied for its efficiency.

    https://www.kdnuggets.com/2021/07/wht-simpler-fast-fourier-transform-fft.html

  • High-Performance Deep Learning: How to train smaller, faster, and better models – Part 5

    Training efficient deep learning models with any software tool is nothing without an infrastructure of robust and performant compute power. Here, current software and hardware ecosystems are reviewed that you might consider in your development when the highest performance possible is needed.

    https://www.kdnuggets.com/2021/07/high-performance-deep-learning-part5.html

  • Pushing No-Code Machine Learning to the Edge

    Discover the power of no-code machine learning, and what it can accomplish when pushed to edge devices.

    https://www.kdnuggets.com/2021/07/pushing-no-code-machine-learning-edge.html

  • How Can You Distinguish Yourself from Hundreds of Other Data Science Candidates?

    A few easy (and not-so-easy) ways to prove to employers that your skills and attitudes place you in a higher bracket.

    https://www.kdnuggets.com/2021/07/distinguish-yourself-hundreds-other-data-science-candidates.html

  • Exploring the SwAV Method

    This post discusses the SwAV (Swapping Assignments between multiple Views of the same image) method from the paper “Unsupervised Learning of Visual Features by Contrasting Cluster Assignments” by M. Caron et al.

    https://www.kdnuggets.com/2021/07/swav-method.html

  • High-Performance Deep Learning: How to train smaller, faster, and better models – Part 4

    With the right software, hardware, and techniques at your fingertips, your capability to effectively develop high-performing models now hinges on leveraging automation to expedite the experimental process and building with the most efficient model architectures for your data.

    https://www.kdnuggets.com/2021/07/high-performance-deep-learning-part4.html

  • How To Transition From Data Freelancer to Data Entrepreneur (Almost Overnight)

    Data freelancers trade hours for dollars while data entrepreneurs have found a way to make money while they sleep. Ready to make the transition? Keep reading to learn how to do it as SEAMLESSLY and PROFITABLY as possible.

    https://www.kdnuggets.com/2021/07/transition-data-freelancer-data-entrepreneur-overnight.html

  • Predict Customer Churn (the right way) using PyCaret

    A step-by-step guide on how to predict customer churn the right way using PyCaret that actually optimizes the business objective and improves ROI.

    https://www.kdnuggets.com/2021/07/pycaret-predict-customer-churn-right-way.html

  • From Scratch: Permutation Feature Importance for ML Interpretability

    Use permutation feature importance to discover which features in your dataset are useful for prediction — implemented from scratch in Python.

    https://www.kdnuggets.com/2021/06/from-scratch-permutation-feature-importance-ml-interpretability.html

  • Computational Complexity of Deep Learning: Solution Approaches

    Why has deep learning been so successful? What is the fundamental reason that deep learning can learn from big data? Why cannot traditional ML learn from the large data sets that are now available for different tasks as efficiently as deep learning can?

    https://www.kdnuggets.com/2021/06/computational-complexity-deep-learning-solution-approaches.html

  • In-Warehouse Machine Learning and the Modern Data Science Stack

    As your organization matures its data science portfolio and capabilities, establishing a modern data stack is vital to enabling such growth. Here, we overview various in-data warehouse machine learning services, and discuss each of their benefits and requirements.

    https://www.kdnuggets.com/2021/06/in-warehouse-machine-learning-modern-data-science-stack.html

  • Data Careers in Demand: Crowd Solutions Architect Explained

    How can crowdsourcing support the applications of data teams at an organization? With an ever-increasing demand for more and higher quality data, a new role of the Crowd Solutions Architect (CSA) can leverage the potential of the masses to bring an advantage to a business's capability to deliver effective AI-driven solutions.

    https://www.kdnuggets.com/2021/06/data-careers-crowd-solutions-architect.html

  • Overview of AutoNLP from Hugging Face with Example Project

    AutoNLP is a beta project from Hugging Face that builds on the company’s work with its Transformer project. With AutoNLP you can get a working model with just a few simple terminal commands.

    https://www.kdnuggets.com/2021/06/overview-autonlp-hugging-face-example-project.html

  • High Performance Deep Learning, Part 1

    Advancing deep learning techniques continue to demonstrate incredible potential to deliver exciting new AI-enhanced software and systems. But, training the most powerful models is expensive--financially, computationally, and environmentally. Increasing the efficiency of such models will have profound impacts in many ways, so developing future models with this intention in mind will only help to further expand the reach, applicability, and value of what deep learning has to offer.

    https://www.kdnuggets.com/2021/06/efficiency-deep-learning-part1.html

  • 7 Data Security Best Practices for 2021

    Here are seven data security best practices to adopt this year.

    https://www.kdnuggets.com/2021/06/7-data-security-best-practices-2021.html

  • Facebook Launches One of the Toughest Reinforcement Learning Challenges in History

    The FAIR team just launched the NetHack Challenge as part of the upcoming NeurIPS 2021 competition. The objective is to test new RL ideas using one of the toughest game environments in the world.

    https://www.kdnuggets.com/2021/06/facebook-launches-toughest-reinforcement-learning-challenges.html

  • Feature Selection – All You Ever Wanted To Know

    Although your data set may contain a lot of information about many different features, selecting only the "best" of these to be considered by a machine learning model can mean the difference between a model that performs well--with better performance, higher accuracy, and more computational efficiency--and one that falls flat. The process of feature selection guides you toward working with only the data that may be the most meaningful, and to accomplish this, a variety of feature selection types, methodologies, and techniques exist for you to explore.

    https://www.kdnuggets.com/2021/06/feature-selection-overview.html

  • The 7 Best Open Source AI Libraries You May Not Have Heard Of

    AI researchers today have many exciting options for working with specialized tools. Although starting original projects from scratch is often not necessary, knowing which existing library to leverage remains a challenge. This list of generally unknown yet awesome, open-source libraries offers an interesting collection to consider for state-of-the-art research that spans from automatic machine learning to differentiable quantum circuits.

    https://www.kdnuggets.com/2021/06/7-open-source-ai-libraries.html

  • 5 Tips for Picking an Edge AI Platform

    Edge Analytics isn’t just coding and tools. The different environment outside the datacenter or cloud means a purpose-built platform is the best way to deliver consistent results. We discuss 5 different considerations for an edge platform to support your training and deployment.

    https://www.kdnuggets.com/2021/06/5-tips-edge-ai-platform.html

  • 4 Tips for Dataset Curation for NLP Projects

    You have heard it before, and you will hear it again. It's all about the data. Curating the right data is more important than just curating any data. When dealing with text data, many hard-earned lessons have been learned by others over the years, and here are four data curation tips that you should be sure to follow during your next NLP project.

    https://www.kdnuggets.com/2021/05/4-tips-dataset-curation-nlp-projects.html

  • Choosing the Right BI Tool for Your Business

    Here are six questions to ask as you search for the best BI tool for your specific needs.

    https://www.kdnuggets.com/2021/05/choosing-right-bi-tool-business.html

  • Great New Resource for Natural Language Processing Research and Applications

    The NLP Index is a brand new resource for NLP code discovery, combining and indexing more than 3,000 paper and code pairs at launch. If you are interested in NLP research and locating the code and papers needed to understand and implement the latest research, you should check it out.

    https://www.kdnuggets.com/2021/05/great-new-resource-natural-language-processing-research-applications.html

  • Budgeting For Your AI Training Data: Consider These 3 Factors

    Before you even plan to procure the data, one of the most important considerations is determining how much you should spend on your AI training data. In this article, we will give you insights to develop an effective budget for AI training data.

    https://www.kdnuggets.com/2021/05/shaip-budgeting-ai-training-data.html

  • Data Validation in Machine Learning is Imperative, Not Optional

    Before we reach model training in the pipeline, there are various components like data ingestion, data versioning, data validation, and data pre-processing that need to be executed. In this article, we will discuss data validation, why it is important, its challenges, and more.

    https://www.kdnuggets.com/2021/05/data-validation-machine-learning-imperative.html

  • 6 Business Trends Benefiting Data Scientists

    Here are six business trends making data scientists even more in-demand.

    https://www.kdnuggets.com/2021/05/6-business-trends-data-scientists.html

  • How to pitch to VCs, explained: The Deck We Used to Raise Capital For Our Open-Source ELT Platform

    Winning seed funding from venture capitalists is a daunting task, and the pitch is key. Learn how one effective slide deck resulted in a successful early funding round for an open-source start-up, Airbyte.

    https://www.kdnuggets.com/2021/05/vc-pitch-deck-open-source-elt-platform.html

  • Awesome list of datasets in 100+ categories

    With an estimated 44 zettabytes of data in existence in our digital world today and approximately 2.5 quintillion bytes of new data generated daily, there is a lot of data out there you could tap into for your data science projects. It's pretty hard to curate through such a massive universe of data, but this collection is a great start. Here, you can find data from cancer genomes to UFO reports, as well as years of air quality data to 200,000 jokes. Dive into this ocean of data to explore as you learn how to apply data science techniques or leverage your expertise to discover something new.

    https://www.kdnuggets.com/2021/05/awesome-list-datasets.html

  • Machine Translation in a Nutshell

    Marketing scientist Kevin Gray asks Dr. Anna Farzindar of the University of Southern California for a snapshot of machine translation. Dr. Farzindar also provided the original art for this article.

    https://www.kdnuggets.com/2021/05/machine-translation-nutshell.html

  • Best Python Books for Beginners and Advanced Programmers

    Let's take a look at nine of the best Python books for both beginners and advanced programmers, covering topics such as data science, machine learning, deep learning, NLP, and more.

    https://www.kdnuggets.com/2021/05/best-python-books-beginner-advanced.html

  • Rebuilding My 7 Python Projects (Silver Blog)

    This is how I rebuilt my Python projects: Data Science, Web Development & Android Apps.

    https://www.kdnuggets.com/2021/05/rebuilding-7-python-projects.html

  • Deploy a Dockerized FastAPI App to Google Cloud Platform

    A short guide to deploying a Dockerized Python app to Google Cloud Platform using Cloud Run and a SQL instance.

    https://www.kdnuggets.com/2021/05/deploy-dockerized-fastapi-app-google-cloud-platform.html

  • Introducing The NLP Index

    The NLP Index is a brand new resource for NLP code discovery, combining and indexing more than 3,000 paper and code pairs at launch. If you are interested in NLP research and locating the code and papers needed to understand and implement the latest research, you should check it out.

    https://www.kdnuggets.com/2021/04/nlp-index.html

  • Using Data Science to Predict and Prevent Real World Problems

    Do you have an interest in data science but lack an understanding of what, exactly, it can be used to accomplish in the real world? Read this article for a few examples of just how helpful data science can be for predicting and preventing real world problems.

    https://www.kdnuggets.com/2021/04/data-science-predict-prevent-real-world-problems.html

  • Top 3 Challenges for Data & Analytics Leaders

    The author shares the 3 top challenges faced as they led and established a data & analytics function, as well as ways in which these challenges were addressed. How have you solved the one challenge which has remained elusive to the author?

    https://www.kdnuggets.com/2021/04/top-3-challenges-data-analytics-leaders.html

  • Improving model performance through human participation

    Certain industries, such as medicine and finance, are sensitive to false positives. Using human input in the model inference loop can increase the final precision and recall. Here, we describe how to incorporate human feedback at inference time, so that Machines + Humans = Higher Precision & Recall.

    https://www.kdnuggets.com/2021/04/improving-model-performance-through-human-participation.html

  • How to ace A/B Testing Data Science Interviews (Silver Blog)

    Understanding the process of A/B testing and knowing how to discuss this approach during data science job interviews can give you a leg up over other candidates. This mock interview provides a step-by-step guide through how to demonstrate your mastery of the key concepts and logical considerations.

    https://www.kdnuggets.com/2021/04/ab-testing-data-science-interviews.html

  • Build an Effective Data Analytics Team and Project Ecosystem for Success

    Apply these techniques to create a data analytics program that delivers solutions that delight end-users and meet their needs.

    https://www.kdnuggets.com/2021/04/build-effective-data-analytics-team-project-ecosystem-success.html

  • Free From Stanford: Machine Learning with Graphs

    Check out the freely-available Stanford course Machine Learning with Graphs, taught by Jure Leskovec, and see how a world renowned researcher teaches their topic of expertise. Accessible materials include slides, videos, and more.

    https://www.kdnuggets.com/2021/04/free-stanford-machine-learning-graphs.html

  • ETL in the Cloud: Transforming Big Data Analytics with Data Warehouse Automation

    Today, organizations are increasingly implementing cloud ETL tools to handle large data sets. With data sets becoming larger by the day, unified ETL tools have become crucial for data integration needs of enterprises.

    https://www.kdnuggets.com/2021/04/etl-cloud-transforming-big-data-analytics-data-warehouse-automation.html

  • Continuous Training for Machine Learning – a Framework for a Successful Strategy

    A basic appreciation by anyone who builds machine learning models is that the model is not useful without useful data. This doesn't change after a model is deployed to production. Effectively monitoring and retraining models with updated data is key to maintaining valuable ML solutions, and can be accomplished with effective approaches to production-level continuous training that is guided by the data.

    https://www.kdnuggets.com/2021/04/continuous-training-machine-learning.html

  • Deep Learning Recommendation Models (DLRM): A Deep Dive

    The currency in the 21st century is no longer just data. It's the attention of people. This deep dive article presents the architecture and deployment issues experienced with the deep learning recommendation model, DLRM, which was open-sourced by Facebook in March 2019.

    https://www.kdnuggets.com/2021/04/deep-learning-recommendation-models-dlrm-deep-dive.html

  • A/B Testing: 7 Common Questions and Answers in Data Science Interviews, Part 2

    In this second article in this series, we’ll continue to take an interview-driven approach by linking some of the most commonly asked interview questions to different components of A/B testing, including selecting ideas for testing, designing A/B tests, evaluating test results, and making ship or no ship decisions.

    https://www.kdnuggets.com/2021/04/ab-testing-7-common-questions-answers-data-science-interviews-2.html

  • Learning from machine learning mistakes

    Read this article and discover how to find weak spots of a regression model.

    https://www.kdnuggets.com/2021/03/learning-from-machine-learning-mistakes.html

  • Introducing dbt, the ETL and ELT Disrupter

    Moving and processing data is happening 24/7/365 world-wide at massive scales that only get larger by the hour. Tools exist to introduce efficiencies in how data can be extracted from sources, transformed through calculations, and loaded into target data repositories. However, on their own, these tools can introduce some restrictions in the processing, especially for the needs of data analytics and data science.

    https://www.kdnuggets.com/2021/03/dbt-etl-elt-disrupter.html

  • Forget Telling Stories; Help People Navigate

    When designing reporting & visualizations, think of them as part of a navigation framework rather than stand-alone information.

    https://www.kdnuggets.com/2021/03/forget-telling-stories-help-people-navigate.html

  • Must Know for Data Scientists and Data Analysts: Causal Design Patterns (Silver Blog)

    Industry is a prime setting for observational causal inference, but many companies are blind to causal measurement beyond A/B tests. This formula-free primer illustrates analysis design patterns for measuring causal effects from observational data.

    https://www.kdnuggets.com/2021/03/causal-design-patterns.html

  • A Beginner’s Guide to the CLIP Model

    CLIP is a bridge between computer vision and natural language processing. I'm here to break CLIP down for you in an accessible and fun read! In this post, I'll cover what CLIP is, how CLIP works, and why CLIP is cool.

    https://www.kdnuggets.com/2021/03/beginners-guide-clip-model.html

  • How to Speed Up Pandas with Modin

    The Modin library has the ability to scale your pandas workflows by changing one line of code and integration with the Python ecosystem and Ray clusters. This tutorial goes over how to get started with Modin and how it can speed up your pandas workflows.

    https://www.kdnuggets.com/2021/03/speed-up-pandas-modin.html

  • DeepMind’s AlphaFold & the Protein Folding Problem

    Recently, DeepMind's AlphaFold made impressive headway in the protein structure prediction problem. Read this for an overview and explanation.

    https://www.kdnuggets.com/2021/03/deepmind-alphafold-protein-folding-problem.html

  • Speeding up Scikit-Learn Model Training

    If your scikit-learn models are taking a bit of time to train, then there are several techniques you can use to make the processing more efficient. From optimizing your model configuration to leveraging libraries to speed up training through parallelization, you can build the best scikit-learn model possible in the least amount of time.

    https://www.kdnuggets.com/2021/03/speed-up-scikit-learn-model-training.html

  • Dask and Pandas: No Such Thing as Too Much Data

    Do you love pandas, but don't love it when you reach the limits of your memory or compute resources? Dask provides you with the option to use the pandas API with distributed data and computing. Learn how it works, how to use it, and why it’s worth the switch when you need it most.

    https://www.kdnuggets.com/2021/03/dask-pandas-data.html

  • Google’s Model Search is a New Open Source Framework that Uses Neural Networks to Build Neural Networks (Gold Blog)

    The new framework brings state-of-the-art neural architecture search methods to TensorFlow.

    https://www.kdnuggets.com/2021/03/google-model-search-open-source-framework.html

  • 5 Supporting Skills That Can Help You Get a Data Science Job

    If you want to stand out among your fellow applicants, here are some supporting skills you should develop.

    https://www.kdnuggets.com/2021/02/5-supporting-skills-data-science-job.html

  • An overview of synthetic data types and generation methods

    Synthetic data can be used to test new products and services, validate models, or test performance because it mimics the statistical properties of production data. Today you'll find different types of structured and unstructured synthetic data.

    https://www.kdnuggets.com/2021/02/overview-synthetic-data-types-generation-methods.html

  • Feature Store as a Foundation for Machine Learning

    With so many organizations now taking the leap into building production-level machine learning models, many lessons learned are coming to light about the supporting infrastructure. For a variety of important types of use cases, maintaining a centralized feature store is essential for higher ROI and faster delivery to market. In this review, the current feature store landscape is described, and you can learn how to architect one into your MLOps pipeline.

    https://www.kdnuggets.com/2021/02/feature-store-foundation-machine-learning.html

  • 6 Data Science Certificates To Level Up Your Career

    Anyone looking to obtain a data science certificate to prove their ability in the field will find a range of options exist. We review several valuable certificates to consider that will definitely pump up your resume and portfolio to get you closer to your dream job.

    https://www.kdnuggets.com/2021/02/6-data-science-certificates.html

  • 10 resources for data science self-study

    Many resources exist for the self-study of data science. In our modern age of information technology, an enormous amount of free learning resources are available to anyone, and with effort and dedication, you can master the fundamentals of data science.

    https://www.kdnuggets.com/2021/02/10-resources-data-science-self-study.html

  • Hugging Face Transformers Package – What Is It and How To Use It

    The rapid development of Transformers has brought a new wave of powerful tools to natural language processing. These models are large and very expensive to train, so pre-trained versions are shared and leveraged by researchers and practitioners. Hugging Face offers a wide variety of pre-trained transformers as open-source libraries, and you can incorporate these with only one line of code.

    https://www.kdnuggets.com/2021/02/hugging-face-transformer-basics.html

  • IBM Uses Continual Learning to Avoid The Amnesia Problem in Neural Networks

    Using continual learning might avoid the famous catastrophic forgetting problem in neural networks.

    https://www.kdnuggets.com/2021/02/ibm-continual-learning-avoid-amnesia-problem-neural-networks.html

  • How to Speed up Scikit-Learn Model Training

    Scikit-Learn is an easy-to-use Python library for machine learning. However, sometimes scikit-learn models can take a long time to train. The question becomes, how do you create the best scikit-learn model in the least amount of time?

    https://www.kdnuggets.com/2021/02/speed-up-scikit-learn-model-training.html

  • Machine Learning – it’s all about assumptions

    Just as with most things in life, assumptions can directly lead to success or failure. Similarly in machine learning, appreciating the assumed logic behind machine learning techniques will guide you toward applying the best tool for the data.

    https://www.kdnuggets.com/2021/02/machine-learning-assumptions.html

  • A Critical Comparison of Machine Learning Platforms in an Evolving Market

    There’s a clear inclination towards the MLaaS model across industries, given the fact that companies today have an option to select from a wide range of solutions that can cater to diverse business needs. Here is a look at 3 of the top ML platforms for data excellence.

    https://www.kdnuggets.com/2021/02/critical-comparison-machine-learning-platforms-evolving-market.html

  • How to Get Data Science Interviews: Finding Jobs, Reaching Gatekeepers, and Getting Referrals (Silver Blog)

    In this post, the author shares what to do to get job interviews efficiently. Find answers to these questions: Where should I look for data science jobs? How do I reach out to the gatekeeper? How do I get referrals? What makes a good data science resume?

    https://www.kdnuggets.com/2021/02/data-science-interviews-finding-jobs-reaching-gatekeepers-getting-referrals.html
