WTF is a Parameter?!?

Demystifying the concept of a parameter in machine learning: what parameters are, how many a model has, and what could possibly go wrong when learning them.




Introduction

Machine learning systems consist, in essence, of models (decision trees, linear regressors, or neural networks, among many others) that have been trained on a set of data examples to learn patterns or relationships, for instance, to predict the price of an apartment in sunny Seville (Spain) based on its attributes. But a machine learning model's quality or performance on the task it has been trained for largely depends on its internal configuration. Even two models of the same type, for example two linear regression models, might perform very differently from each other depending on one key aspect: their parameters.

This article demystifies the concept of a parameter in machine learning models: what parameters are, how many a model has (spoiler alert: it depends!), and what can go wrong when setting a model's parameters during training. Let's explore these core components.


Demystifying Parameters in Machine Learning Models

Parameters are like the internal dials and knobs of a machine learning model: they define the behavior of your model. Just as the quality of a barista's brew depends on the quality of the coffee beans the machine grinds, the values a model's parameters end up taking depend on the nature, and to a large extent the quality, of the training data examples used to learn to perform a task.

For example, back to the case of predicting apartment prices: if the training dataset of apartment examples with known prices contains noisy, irrelevant, or biased information, the training process may yield a model whose parameters (remember: internal settings) capture misleading patterns or input-output relationships, resulting in poor price predictions. If, instead, the dataset contains clean, representative, and high-quality examples, chances are the training process will produce a model whose parameters are finely tuned to the real factors that influence housing prices, leading to accurate predictions.

Did you notice that I used italics to emphasize the word "internal" several times? That was entirely intentional, to distinguish machine learning model parameters from hyperparameters. In contrast to a parameter, a hyperparameter is like a dial, knob, button, or switch that is adjusted externally and manually rather than learned from the data, typically by a human, but sometimes through an automated search process that looks for the best hyperparameter configuration for your model. You can learn more about hyperparameters in this Machine Learning Mastery article.


Parameters are like the internal dials and knobs of a machine learning model — they define the "personality" or "behavior" of the model, namely, what aspects of the data it attends to, and to what extent.

Now that we have a better understanding of machine learning model parameters, two questions naturally arise:

  1. What do parameters look like?
  2. How many parameters exist in a machine learning model?

Parameters are normally numerical values: weights that, in some model types, are constrained to a range such as 0 to 1, and in others can take any real value. This is why, in machine learning jargon, the terms parameter and weight are often used to refer to the same concept, especially in neural network-based models. The larger a weight's magnitude, the more strongly that "knob" inside the model influences the outcome or prediction. In simpler machine learning models, like linear regression models, each parameter is associated with an input data feature.

For instance, suppose we want to predict the price of an apartment based on four attributes: size in square meters, proximity to the city center, number of bedrooms, and age of the building in years. A linear regression model trained for this predictive task would have four parameters, one linked to each input predictor, plus one extra parameter called the bias term (or intercept). The bias term is not linked to any input feature of your data, but it gives the model a baseline value to shift predictions up or down, which many machine learning models need to effectively learn from diverse data. Each parameter or weight's value thus indicates how strongly its associated input feature influences the model's predictions. If the highest weight is the one for "proximity to the city center", that means apartment prices in Seville are largely driven by how far an apartment is from the city center.

More generally, and in mathematical terms, parameters in a simple model like a multiple linear regression model are denoted by \( \theta_i \) in an equation like this:
\[
\hat{y} = \theta_0 + \theta_1x_1 + \dots + \theta_nx_n
\]
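
To see these five parameters in practice, here is a minimal sketch that fits a linear regression on synthetic apartment data and prints the four learned weights plus the bias term. It assumes Python with NumPy and scikit-learn (tools this article does not prescribe), and the feature ranges and "true" pricing relationship are invented purely for illustration.

```python
# Minimal sketch: learn the parameters theta_0..theta_4 of the pricing model.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(42)

# 200 synthetic apartments: size (m2), distance to center (km),
# number of bedrooms, building age (years). All values are illustrative.
X = np.column_stack([
    rng.uniform(40, 200, 200),   # size
    rng.uniform(0.5, 15, 200),   # distance to the city center
    rng.integers(1, 6, 200),     # bedrooms
    rng.uniform(0, 80, 200),     # age
])

# A made-up "true" relationship used to generate prices, plus noise
true_weights = np.array([2000.0, -8000.0, 15000.0, -500.0])
y = 50_000 + X @ true_weights + rng.normal(0, 10_000, 200)

model = LinearRegression().fit(X, y)

print("weights (theta_1..theta_4):", model.coef_)  # one per input feature
print("bias term (theta_0):", model.intercept_)    # not tied to any feature
```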

Of course, only the simplest types of machine learning models have such a small number of parameters. As data complexity grows, so does the need for larger, more sophisticated models like support vector machines, random forest ensembles, or neural networks, which introduce additional layers of structural complexity to learn challenging relationships and patterns. As a result, larger models have a much higher number of parameters, no longer linked only to inputs but to complex, abstract interrelationships between inputs, stacked and built up across the model's innards. A deep neural network, for instance, can have from hundreds to millions of parameters, and some of the largest machine learning models today (the transformer architectures behind large language models, or LLMs) typically have billions of learnable parameters inside them!
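
To get a feel for how quickly parameter counts grow, the short sketch below (assuming PyTorch, a library this article does not otherwise use) builds a tiny feed-forward network on the same four apartment features and counts its learnable parameters; even this toy model already has thousands.

```python
# Minimal sketch: count the learnable parameters of a small neural network.
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 64),   # 4 inputs -> 64 units: 4*64 weights + 64 biases
    nn.ReLU(),
    nn.Linear(64, 64),  # 64*64 weights + 64 biases
    nn.ReLU(),
    nn.Linear(64, 1),   # 64 weights + 1 bias
)

n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 320 + 4160 + 65 = 4545 learnable parameters
```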

 

Learning Parameters and Addressing Potential Issues

When the process of training a machine learning model starts, its parameters are usually initialized to random values. The model then makes predictions on training examples with known outcomes, e.g. apartments with known prices, measures the error it makes, and adjusts its parameters accordingly to gradually reduce that error. This is how, example after example, machine learning models learn: parameters are progressively and iteratively updated during training, becoming more and more tailored to the set of training examples the model is exposed to.
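
The loop below sketches this idea in plain NumPy. It is an illustrative gradient descent implementation for a linear model, not a method this article prescribes, and the learning rate and epoch count are arbitrary choices.

```python
# Minimal sketch: iteratively adjust a linear model's parameters to reduce
# the mean squared error on the training examples.
import numpy as np

def train_linear_model(X, y, lr=0.01, epochs=1000):
    n_samples, n_features = X.shape
    theta = np.random.randn(n_features)  # parameters start as random values
    bias = 0.0
    for _ in range(epochs):
        y_pred = X @ theta + bias        # predict with the current parameters
        error = y_pred - y               # how wrong is the model right now?
        # Gradients of the mean squared error w.r.t. each parameter
        grad_theta = (2 / n_samples) * (X.T @ error)
        grad_bias = (2 / n_samples) * error.sum()
        theta -= lr * grad_theta         # nudge the parameters to reduce error
        bias -= lr * grad_bias
    return theta, bias
```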

Unfortunately, some difficulties may arise in practice when training a machine learning model, that is, while gradually setting its parameters' values. Common issues include overfitting and its counterpart, underfitting, both of which manifest as learned parameters that end up far from their best possible values, yielding a model that makes poor predictions. These issues may also partly stem from human choices, like selecting a model that is too complex or too simple for the training data at hand, i.e. a model with too many or too few parameters. A model with too many parameters might become slow, expensive to train and use, harder to control if it degrades over time, and prone to memorizing noise in the training data (overfitting). Meanwhile, a model with too few parameters lacks the flexibility to learn useful patterns from the data (underfitting).
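
A quick way to observe both failure modes is to fit the same noisy data with models of increasing parameter counts, as in the sketch below (again assuming scikit-learn, with purely illustrative polynomial degrees). Typically, the too-simple model scores poorly on both seen and unseen data, while the too-complex one scores far better on the data it saw than on data it did not.

```python
# Minimal sketch: underfitting vs. overfitting as the parameter count grows.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, (100, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.2, 100)  # noisy nonlinear target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for degree in (1, 4, 20):  # too few, reasonable, and too many parameters
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    print(f"degree {degree:2d} | "
          f"train R^2: {model.score(X_train, y_train):.3f} | "
          f"test R^2: {model.score(X_test, y_test):.3f}")
```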

 

Wrapping Up

This article provided a simple, friendly explanation of an essential element of machine learning models: parameters. They are like the DNA of your model, and understanding what they are, how they are learned, and how they relate to model behavior and performance is a vital step towards becoming machine learning-savvy.

Iván Palomares Carrascosa is a leader, writer, speaker, and adviser in AI, machine learning, deep learning & LLMs. He trains and guides others in harnessing AI in the real world.

