# Linear Programming and Discrete Optimization with Python using PuLP

Knowledge of such optimization techniques is extremely useful for data scientists and machine learning (ML) practitioners as discrete and continuous optimization lie at the heart of modern ML and AI systems as well as data-driven business analytics processes.


### Solving the problem and printing the solution

PuLP has quite a few choices of solver algorithms (e.g. COIN_MP, Gurobi, CPLEX, etc.). For this problem, we do not specify any choice and let the program default to its own choice depending on the problem structure.

```python
prob.solve()
```
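As a quick sketch (assuming a recent PuLP version, 2.0 or later), you can list the solvers PuLP can find on your machine and pass one explicitly to `solve()`; the bundled CBC solver is the usual default:

```python
import pulp

# List the solver backends PuLP detects on this machine;
# the bundled CBC solver ('PULP_CBC_CMD') is normally present.
print(pulp.listSolvers(onlyAvailable=True))

# To pick a solver explicitly, pass it to solve(), e.g. the bundled
# CBC solver with its log output silenced:
# prob.solve(pulp.PULP_CBC_CMD(msg=0))
```

If no solver is specified, PuLP simply falls back to its default, which is what the code above relies on.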

We can print the **status** of the solution. Note, although the status is *optimal* in this case, it does not need to be so. In case the problem is ill-formulated or there is not sufficient information, the solution may be *infeasible* or *unbounded*.

```python
# The status of the solution is printed to the screen
print("Status:", LpStatus[prob.status])
```

```
Status: Optimal
```
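To see a non-optimal status in action, here is a minimal sketch (the problem and variable names are illustrative) of a deliberately contradictory problem, which CBC reports as infeasible:

```python
from pulp import LpMinimize, LpProblem, LpStatus, LpVariable, PULP_CBC_CMD

# A deliberately ill-posed problem: x must be both >= 5 and <= 3
prob = LpProblem("contradiction_demo", LpMinimize)
x = LpVariable("x", lowBound=0)
prob += x          # objective: minimize x
prob += x >= 5
prob += x <= 3     # contradicts the previous constraint

prob.solve(PULP_CBC_CMD(msg=0))
print("Status:", LpStatus[prob.status])  # Status: Infeasible
```

Checking `LpStatus[prob.status]` before using the solution values is a good habit, since variable values are meaningless when the status is not `Optimal`.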

The full solution contains all the variables including the ones with zero weights. But to us, **only those variables are interesting which have non-zero coefficients** i.e. which should be included in the optimal diet plan. So, we can scan through the problem variables and print out only if the variable quantity is positive.

```python
for v in prob.variables():
    if v.varValue > 0:
        print(v.name, "=", v.varValue)
```

```
Food_Frozen_Broccoli = 6.9242113
Food_Scrambled_Eggs = 6.060891
Food__Baked_Potatoes = 1.0806324
```

So, the optimal solution is to eat 6.92 servings of frozen broccoli, 6.06 servings of scrambled eggs, and 1.08 servings of a baked potato!

You are welcome to download the whole notebook, the data file, and experiment with various constraints to change your diet plan. The code is here in my Github repository.

Finally, we can print the **objective function i.e. cost of the diet** in this case,

```python
obj = value(prob.objective)
print("The total cost of this balanced diet is: ${}".format(round(obj, 2)))
```

```
The total cost of this balanced diet is: $5.52
```

### What if we want a solution with whole numbers?

As we can see, the optimal result came back with fractional numbers of servings for the food items. This may not be practical, and we may want to force the solution to have only integer quantities as servings.

This brings us to the technique of **integer programming**. The algorithm used for the previous optimization is simple linear programming, where the variables were allowed to assume any real number value. **Integer programming forces some or all of the variables to assume only integer values.**
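A tiny sketch (a made-up one-variable problem, not from the diet data) makes the difference concrete: relaxing the integer restriction gives a fractional optimum, while forcing the variable to be integer rounds it down to the best feasible whole number.

```python
from pulp import LpMaximize, LpProblem, LpVariable, PULP_CBC_CMD, value

# Maximize x subject to 2x <= 7.
# The continuous LP gives x = 3.5; declaring x Integer gives x = 3.
for cat in ('Continuous', 'Integer'):
    prob = LpProblem("relaxation_demo", LpMaximize)
    x = LpVariable("x", lowBound=0, cat=cat)
    prob += x            # objective
    prob += 2 * x <= 7   # single constraint
    prob.solve(PULP_CBC_CMD(msg=0))
    print(cat, value(x))
```

Note that the integer optimum is not always the rounded continuous optimum in general, which is part of why integer programs are harder to solve.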

In fact, integer programming is a harder computational problem than linear programming. Integer variables make an optimization problem non-convex, and therefore far more difficult to solve. Memory and solution time may rise exponentially as you add more integer variables.

Fortunately, PuLP can solve an optimization problem with this kind of restriction too.

The code is almost identical to before, so it is not repeated here. The only difference is that the variables are defined as belonging to the **Integer** category as opposed to **Continuous**.

```python
food_integer = LpVariable.dicts("Food", food_items, 0, cat='Integer')
```

For this problem, it changes the optimal solution slightly, adding iceberg lettuce to the diet and increasing the cost by $0.06. You will also notice a **perceptible increase in the computation time** for the solution process.

Therefore, the optimal balanced diet with whole servings consists of,

```
Food_Frozen_Broccoli = 7.0
Food_Raw_Lettuce_Iceberg = 1.0
Food_Scrambled_Eggs = 6.0
Food__Baked_Potatoes = 1.0
```

A cool application of integer programming is solving a **driver-scheduling problem** which can be an NP-hard problem. See this article (also note in the article, how they compute the costs of various actions and use them in the optimization problem),

### How to incorporate binary decisions in a linear programming problem?

Often, we want to include some kind of "if-then-else" decision logic in the optimization problem.

What if we don’t want both broccoli and iceberg lettuce to be included in the diet (but only one of them is fine)?

How do we represent such decision logic in this framework?

It turns out that for this kind of logic, you need to introduce another type of variable called an **indicator variable**. Indicator variables are binary in nature and can indicate the presence or absence of a variable in the optimal solution.

But for this particular problem, there is an apparent difficulty with using indicator variables. Ideally, you want the cost/nutritional value of a food item to be included in the constraint equation if the indicator variable is 1, and ignored if it is zero. Mathematically, it is intuitive to write this as a **product of the original term (involving the food item) and the indicator variable**. But the moment you do that, you are multiplying two variables and making the problem nonlinear! It falls under the domain of **quadratic programming** (QP) in that case (*quadratic* because the terms are now the product of two linear terms).

The popular machine learning technique Support Vector Machine essentially solves a quadratic programming problem.

However, this general concept of using an indicator variable to express binary logic in a linear programming problem is also extremely useful. We have given a link to a problem of solving a Sudoku puzzle with LP in the next section, where this trick is used.

It turns out that there is a clever trick to incorporate such binary logic in this LP without making it a QP problem.

We can denote the binary variables as `food_chosen` and instantiate them as `Integer` with lower and upper bounds of 0 and 1.

```python
food_chosen = LpVariable.dicts("Chosen", food_items, 0, 1, cat='Integer')
```

Then we write a special piece of code to link the usual `food_vars` and the binary `food_chosen`, and add this constraint to the problem.

```python
for f in food_items:
    prob += food_vars[f] >= food_chosen[f] * 0.1
    prob += food_vars[f] <= food_chosen[f] * 1e5
```

If you stare at the code long enough, you will realize this effectively means that we give `food_vars` importance only if the corresponding `food_chosen` indicator variable is 1. But this way we avoid the direct multiplication and keep the problem structure linear.

To incorporate the either/or condition of broccoli and iceberg lettuce, we just put a simple code,

```python
prob += food_chosen['Frozen Broccoli'] + food_chosen['Raw Iceberg Lettuce'] <= 1
```

This ensures the sum of these two binary variables is at most 1, which means only one of them can be included in the optimal solution but not both.
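Putting these pieces together, here is a minimal self-contained sketch of the trick (with only two foods, hypothetical per-serving costs, and a made-up serving requirement, not the article's full data set):

```python
from pulp import (LpMinimize, LpProblem, LpStatus, LpVariable,
                  PULP_CBC_CMD, lpSum, value)

food_items = ['Frozen Broccoli', 'Raw Iceberg Lettuce']
costs = {'Frozen Broccoli': 0.16, 'Raw Iceberg Lettuce': 0.02}  # hypothetical

prob = LpProblem("either_or_demo", LpMinimize)
food_vars = LpVariable.dicts("Food", food_items, 0, cat='Continuous')
food_chosen = LpVariable.dicts("Chosen", food_items, 0, 1, cat='Integer')

# Objective: minimize total cost
prob += lpSum([costs[f] * food_vars[f] for f in food_items])
# A made-up requirement: at least 5 servings in total
prob += lpSum([food_vars[f] for f in food_items]) >= 5
# Linking constraints tying each food to its indicator variable
for f in food_items:
    prob += food_vars[f] >= food_chosen[f] * 0.1
    prob += food_vars[f] <= food_chosen[f] * 1e5
# Either broccoli or lettuce, but not both
prob += food_chosen['Frozen Broccoli'] + food_chosen['Raw Iceberg Lettuce'] <= 1

prob.solve(PULP_CBC_CMD(msg=0))
print(LpStatus[prob.status])
for f in food_items:
    print(f, value(food_vars[f]), value(food_chosen[f]))
```

With these numbers, the solver fills the whole requirement with the cheaper item and leaves the other one (and its indicator) at zero, satisfying the either/or condition.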

### More applications of linear/integer programming

In this article, we showed the basic flow of setting up and solving a simple linear programming problem with Python. However, if you look around, you will find countless examples of engineering and business problems which can be transformed into some form of LP and then solved using efficient solvers. Following are some of the canonical examples to get you started thinking,

- Solving Sudoku as an LP problem
- Maximizing return on the long-term investment as an LP problem
- LP applied to production planning
- Solving warehouse location problem using ILP

Many machine learning algorithms also use a general class of optimization of which linear programming is a subset: **convex optimization**. See the following article for more information about it,

### Summary and conclusion

In this article, we illustrated solving a simple diet optimization problem with linear and integer programming techniques using the Python package PuLP. It is noteworthy that even the widely used SciPy has a linear optimization method (`scipy.optimize.linprog`) built in. Readers are encouraged to try various other Python libraries and choose a good method for themselves.
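For comparison, here is a minimal sketch of the same kind of diet problem in SciPy's `linprog` (made-up numbers: two foods, one minimum-calorie requirement). `linprog` minimizes `c @ x` subject to `A_ub @ x <= b_ub`, so a "at least" constraint must be negated:

```python
from scipy.optimize import linprog

c = [0.16, 0.27]          # hypothetical cost per serving of each food
A_ub = [[-70.0, -100.0]]  # negated calories per serving
b_ub = [-500.0]           # i.e. 70*x1 + 100*x2 >= 500 calories

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.status, round(res.fun, 4))  # status 0 means success
```

Unlike PuLP, `linprog` works directly on matrices rather than named variables and constraints, which is more compact but less readable for large models.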

If you have any questions or ideas to share, please contact the author at **tirthajyoti[AT]gmail.com**. Also, you can check the author's **GitHub repositories** for other fun code snippets in Python, R, or MATLAB and machine learning resources. If you are, like me, passionate about machine learning/data science, please feel free to add me on LinkedIn or follow me on Twitter.

**Bio: Tirthajyoti Sarkar** is the Senior Principal Engineer at ON Semiconductor working on Deep Learning/Machine Learning based design automation projects.

Original. Reposted with permission.

**Related:**

- How Optimization Works
- Optimization Using R
- Optimization in Machine Learning: Robust or global minimum?
