Why to study Linear Optimization?
In data science, we will be dealing frequently with optimization problems as our main motivation is to minimize(or maximize) a certain objective we are pursuing. Linear optimization(or linear programming) is a subfield of a much broader optimization field called convex optimization and is a great place to start because of its easiness and intuitiveness.
For you who are interested in studying LP but don’t have any preliminary knowledge about it, I try to explicate it as translucent as possible so that you can study in depth with much more confidence. I follow the book Introduction to Linear Optimization by Dimitris Bertsimas (Author), John N. Tsitsiklis. I intented to do this because some core concepts are so crucial that if you cannot internalize them, you are likely to have lots of headache ahead.
The Basic Concepts That Everything Else is Built Upon
Think of mathematics as a game that has certain rules to be played. Those rules in math might be axioms, definitions, assumptions, or any other thing that is agreed upon beforehand. So LP is a mathematical game that has certain rules and definitons to be played consistently. Consistent is the key word here because any inconsistency between rules, definitions, theories and realities leads to a pointless, confusing and ambiguous game. Let’s start.
Some Verbal Definitions
In LP, we have some variables which we call decision variables. Using those decision variables, we try to minimize or maximize some objective function we are interested in. In LP this objective function is linear in decision variables, that is why it is called linear optimization. We have also some constraints to be met. A vector x of our decision variables satisfying all of the constraints is called a feasible solution and the set of all feasible solutions is called the feasible region. If we find a feasible solution so that no other feasible solution in the feasible region provides improvement in our objective function further (depending on whether we are minimizing or maximizing), then this solution is called optimal solution, and the corresponding objective function value is called optimal objective.
Hyperplanes and Halfspaces
These are the fundamental concepts and definitions of our LP game that we always use. Formal definition of them is the following:
And a typical hyperplane looks like this:
Pay attention to the formal definition of hyperplane and halfspace. A hyperplane is an equation whereas a halfspace is an inequality. For example a1*x1 + a2*x2 = b is a linear equation in two dimensional space with coordinates x1 and x2. This is also known as line equation. So a hyperplane in 2-dimensional space is line. Similarly, in 3-dimensions a1*x1 + a2*x2 + a3*x3 = b is a linear equation and it is known as plane. In more dimension than 3, the geometric object constructed by a linear equation is called hyperplane.
Halfspace is similar to hyperplane, but it covers the area on the one side of a hyperplane. In the above picture, the side red dots are in is a halfspace, and the side green dots are in is another halfspace and hey are separated by a hyperplane. In other words, a hyperplane is the boundary of a corresponding halfspace.
We will play this game with objects called polyhedrons. The formal definition of polyhedron is something like that:
A typical polyhedron looks like these:
What the formal definition actually says is that a ployhedron is a collection of a finite number of inequalities. Every row of matrix A multiplied by vector x is an inequality that is greater than or equal to the corresponding row of vector b. Since linear inequalities construct halfspaces (the definition of halfspace), a polyhedron is the intersection of a finite number of halfspaces as the picture below illustrates:
So far we have seen how hyperplane, halfspace and polyhedron are related to each other. There are among the most important definitions we need to keep in mind.
This is another important concept we shoul know. The formal definition is the following:
A typical convex and non-convex set look like this:
A linear combination of any two points in a set with lambda between 0 and 1 (both inclusive) like in the definition is a linear combination of those two point. So this corresponds to a line segment joining the two points. We can go along this line with varying values of lambda. So the definition says that if the line segment joining any two points in a set is also contained in that set, then it is called a convex set. That is why the left plot is convex and the right plot is non-convex.
The convex combination of many vectors (points) with coefficients summing to 1 is called convex combination of those vectors and all convex combinations of those vectors is called the convex hull of them.
With hyperplane, halfspace and polyhedron, convexity is the crucial concept we need to know to go further. After introducing the most basic, fundamental concepts and definitions of LP, we can go further to define some other things we will encounter.
Extreme point of a polyhedron is the point which cannot be expressed as a convex combination of two other points in the polyhedron. The following is its formal definition:
The illustration below shows a typical polyhedron with its extreme points marked with blue circles. Those blue circles are extreme points because they cannot be expressed as a convex combination of any two points different from extreme point itself.
The lexical meaning of vertex is ‘the highest point of something; apex; summit; top.‘. In this sense, it is the highest point of a polyhedron from a specific perspective. The formal definition is as following:
x is actually a constant point and we can think of c’x as a constant term. So contant < c’y is indeed a halfspace whose boundary line is c’y (non-inclusive) as we have already learned in the previous post. Hence, the formal definition says that if we can find a hyperplane which meets P only at the point x so that any other point in the P resides on one side of that hyperplane, then x is a vertex of P.
Basic and Basic Feasible Solution
This concept includes some fundamental concepts of linear algebra, especially spaces, basis, and linear independence. The typical definition of them is as follows:
With the word “active”, it is meant that all equality costraints are met with x*.
From now on, we can start to play the LP game with those definitions.
Starting LP Game
First Game: With the definitions above, we can play an LP game to prove the followings:
(a) For example, two convext sets go to a bar to drink some beer and have fun. After a couple of hours they asked for the bill and the bill was $10. The set A has $20 with him and the set B has $30 with her. The intersection of their money is $20. $20 can be thought as a line joining 0 and 20 on number line and $30 can be thought as a line joining 0 and 30 on number line.
So as long as the bill does not exceed $20, they can pay the bill with convex combination of their intersection. For example they can pay $20 with lambda = 0.5. So each one pays 20*0.5 = 10 dollar. But it is not possible to pay $25 with convex combination of their intersection.
So the proof of a is about playing with convex combination.
(b) The proof relies on the definition of halfspace and convex combination.
(c) and (d) can be easily proven after (a) and (b).
Second Game: We have seen three so-called different concepts which are extreme point, vertex and basic feasible solution. In essence, those three concepts are equivalent:
Vertex => Extreme Point: Their equivalence can be shown with convex combination of two points in a convex set.
Extreme Point => Basic Feasible Solution: Their equivalence can be shown by the help of basis and linear independence concept in linear algebra.
Basic Feasible Solution => Vertex: This can be shown by the help of the definitions of halfspace, basic solution and basic feasible solution.
The Key Conclusion
The proofs of those can be found on the book. My point here is that if you have difficulty proving these theorems, always remember:
-In order to prove a mathematical concept, we should always pay attention to the definitions, rules, and axioms related to that concept. For example, in order to prove the theorems above, we should know the definitions of polyhedron, convex set, convex combination and convex hull. If we keep in mind those definitions and pay enough attention to them, we can start to establish some relationships that help us prove the theorems.
-Always start from the simplest case .
The concepts discussed in this post is the most crucial ones in LP and everything else is largely built upon those ideas. So mastering those ideas is extremely important to understand furter topics in LP.
Another Important Point
The surprising fact is that just a few learnable strategies of thinking can make you more effective in the classroom, the boardroom, and the living room. You can personally choose to become more successful by adopting five learnable habits. So when learning LP try these.
The 5 Elements of Effective Thinking:
1: Understand deeply
- Don’t face complex issues head-on; first understand simple ideas deeply. Clear the clutter and expose what is really important. Be brutally honest about what you know and don’t know. Then see what’s missing, identify the gaps, and fill them in. Let go of bias, prejudice, and preconceived notions. There are degrees to understanding (it’s not just a yes-or-no proposition) and you can always heighten yours. Rock solid understanding is the foundation for success.
2: Make mistakes
- Fail to succeed. Intentionally get it wrong to inevitably get it even more right. Mistakes are great teachers—they highlight unforeseen opportunities and holes in your understanding. They also show you which way to turn next and they ignite your imagination.
3: Raise questions
- Constantly create questions to clarify and extend your understanding. What’s the real question? Working on the wrong questions can waste a lifetime. Ideas are in the air—the right questions will bring them out and help you see connections that otherwise would have been invisible.
4: Follow the flow of ideas
- Look back to see where ideas came from and then look ahead to discover where those ideas may lead. A new idea is a beginning, not an end. Ideas are rare—milk them. Following the consequences of small ideas can result in big payoffs.
- The unchanging element is change—By mastering the first four elements, you can change the way you think and learn. You can always improve, grow, and extract more out of your education, yourself, and the way you live your life. Change is the universal constant that allows you to get the most out of living and learning.
-Introduction to Linear Optimization by Dimitris Bertsimas (Author), John N. Tsitsiklis (Author), John Tsitsiklis (Author)