December 26, 2009

Using Airline Miles to Buy Magazines: a Hidden Deeper Lesson

Last month, I received a letter from American Airlines’ Rewards Processing Center saying that I have 4735 miles that are about to expire and asking whether I’d be interested in redeeming them for some magazine subscriptions. These were the choices:

Magazine Issues Miles Needed
Afar 6 600
Architectural Digest 12 800
Barron’s 26 1700
Business Week 50 1600
Cat Fancy 12 500
Cigar Aficionado 6 400
Daily Variety 252 5500
Diabetes Forecast 12 500
ESPN the Magazine 26 500
Ebony and Jet 38 800
Elle 12 400
Entertainment Weekly 55 1300
Essence 12 500
Essence 2yrs 24 800
Fortune 25 1400
Golf Digest 12 700
Golf Magazine 12 700
Golfweek 45 1300
Martha Stewart Living 24 1400
Money 12 800
New York Magazine 46 700
Sports Illustrated 56 1400
SI and Golf Mag 68 1500
Sports Illustrated KIDS 12 1000
The Economist 51 3200
The Wall Street Journal 190 2800
US News & World Report 12 700
Variety 50 5500
Wine Spectator 15 900
Wired 12 400

Wow! 30 different magazines! How could I possibly decide…oh, wait! Maybe I can use some analytical techniques to help me make a better decision…what’s that thing called again…O.R.!!!

First, what do I want to accomplish?

a) Get as many issues of whatever magazine as possible: this is what a dentist’s office or any place with a waiting room might want to do. Note that the WSJ subscription provides 190 issues, but let’s say I only want to consider actual magazines. One possible answer is to subscribe to ESPN the Magazine, Ebony and Jet, Entertainment Weekly, New York Magazine, and Sports Illustrated. That will buy me 221 issues and use 4700 of my 4735 miles. Just out of curiosity, I could have gotten 286 issues if I hadn’t excluded the WSJ.

b) Get as many different subscriptions as possible: This one is easy. Just pick the magazines that require the least number of miles. One solution is Afar, Cat Fancy, Cigar Aficionado, Diabetes Forecast, ESPN the Magazine, Elle, Essence, US News & World Report, and Wired. That’s a total of 9 subscriptions, using 4500 of my 4735 miles (a waste of 235 miles).

c) Get the “best” 9 magazines you can afford: Since I know I can get 9 subscriptions, what are the 9 that make me use the most out of my 4735 miles? Maybe they’ll constitute a better set of 9 than those I found in item (b) above (positive correlation between quality and miles needed?). In this case the answer is to substitute New York Magazine for Essence in the set of 9 found in item (b). This alternative totals 144 issues and wastes only 35 miles.

d) Get the 9 magazines that provide the largest total number of issues: In that case I’d want to go with Cat Fancy, Cigar Aficionado, Diabetes Forecast, ESPN the Magazine, Ebony and Jet, Elle, Essence, New York Magazine and Wired. That’s a total of 176 issues and a waste of 35 miles as well.

For those of you who are thinking “who cares?”, don’t let appearances fool you. Say that I am United Way and 4735 is my budget (in thousands of US$). There are 30 charities (“magazines”) asking me for money (the “miles needed”) and telling me that they will help a certain number of people (the “issues”, in thousands). If I fund the charities in rows 9, 10, 12, 21 and 22 of the above table, I’ll spend US$ 4,700,000 and help 221 thousand people. That’s the best I can do! (if I am not allowed to fund the charity in row 26).

Lots of other problems can be cast into this framework. Another example: NASA is sending a spaceship to Mars. My available miles are the spaceship’s cargo capacity, the magazines are scientific experiments to be conducted in space (each requiring equipment to be loaded into the spaceship (the “miles needed”) and promising a certain (quantifiable) scientific benefit (the “issues”). This a generic resource allocation problem and the mathematical model we have used here is called The 0-1 Knapsack Problem.

Technical Details:

Want to solve your own budget allocation problem? Here’s how you can do it. We have n projects that require funding and a budget B . For each project i = 1,\ldots,n , let m_i be how much money it needs and let p_i be its payoff (e.g. lives saved). Let the binary variable x_i be equal to 1 if we decide to allocate money to project i , and equal to zero otherwise.

The objective is to maximize the payoff \sum_{i=1}^n p_i x_i .

The constraint is to respect the budget: \sum_{i=1}^n m_i x_i \leq B .

This model also allows you to include some useful side constraints. For example: “if I fund project i , then I must also fund project j ” would be written as x_i \leq x_j . And “if I fund project i , then I cannot fund project j “, would become x_i + x_j \leq 1 . I’ve talked about these kinds of logical conditions in another post.

Here’s the AMPL file I used to solve the magazine problem and its variations. Enjoy!

P.S.: Being an Economics major and an amazing cook, my wife requested The Economist and Martha Stewart Living; that’s what her objective function told her to do, I guess.

December 21, 2009

Using a Mathematical Model to Predict the Likelihood of Insurgent Attacks

University of Miami physicist Neil Johnson and his co-authors have found that there is a “generic way in which humans carry out insurgency and terrorism when faced by a large powerful state force, and this is irrespective of background history, motivation, ideology, politics, and location”. Their study is featured as the cover of the December 17, 2009 issue of Nature:

We have found a unified model of modern insurgent wars that shows a fundamental pattern in the apparent chaos of wars,” says Johnson. “In practical terms, our analysis can be used to create and explore scenarios, make predictions, and assess risks for present and future wars.

Here’s a link to the full story that appeared in UM’s E-Veritas publication. This paper will make for some nice holiday reading.

December 15, 2009

Let’s Join O.R. Forces to Crack the 17×17 Challenge

I recently came across this post by bit-player, which refers to this post by William Gasarch, on a very interesting feasibility problem: given an m \times n grid, assign one of c colors to each position on the grid so that no rectangle ends up with the same color in all of its vertices (see the original post for applications of this problem). Many results are known for different values of m , n , and c , but the challenge (which comes with a reward of US$289) is to decide whether a feasible solution exists when m=n=17 and c = 4 .

Apparently, some attempts have been made to use Integer Programming (IP) to solve this problem:

SAT-solvers and IP-programs have been used but have not worked— however, I don’t think they were tried that seriously.

By writing this post, I hope to get enough O.R. people excited to brainstorm together in search of a solution. Given this is a feasibility problem, another method of choice here would be Constraint Programming (more on that later).

I decided to begin with the first IP formulation that came to mind. I didn’t expect it to close the deal, but I wanted to see how far it would take me. Here it is: let the binary variable x_{ijt} equal 1 if the cell in row i and column j receives color t . There are two groups of constraints.

Every cell must have a color:

\sum_{t=1}^c x_{ijt} = 1, \enspace \forall \; i,j

Every rectangle can have at most 3 vertices with the same color:

x_{ijt} + x_{i+a,jt} + x_{i,j+b,t} + x_{i+a,j+b,t} \leq 3, \enspace \forall \; i,j,a,b,t

where

1 \leq i \leq m-1 , 1 \leq j \leq n-1 , 1 \leq a \leq m-i , and 1 \leq b \leq n-j .

The objective function does not matter (in theory), but my limited experiments indicate that it’s better to minimize the sum of all x_{ijt} than it is to maximize it. Gasarch also provides this diagram with a rectangle-free subset of size 74 for a 17 x 17 grid. I then added constraints of the form x_{ij1}=1 for every cell (i,j) with an “R” in the diagram. This may be too restrictive, since it’s not clear to me whether a feasible solution to the 17 x 17 grid must contain that diagram as a subset. If that’s true, the latter constraints help with symmetry breaking and also substantially reduce the problem size. To reduce some of the color symmetry in the problem, I also arbitrarily chose cell (1,1) to contain color 2, i.e. x_{112}=1 . Note that this still allows colors 3 and 4 to be swapped. I could have set x_{123}=1 , but that would mean taking a risk.

Finally, for the 17 x 17 grid, it is suspected that each line and each row will have three colors used four times and one color used five times, hence, I also included the following constraints:

4 \leq \sum_{j=1}^n x_{ijt} \leq 5, \enspace \forall \; i,t

4 \leq \sum_{i=1}^m x_{ijt} \leq 5, \enspace \forall \; j,t

Once again, if my understanding is correct, there’s no formal proof that the above constraints are valid.

Here’s a ZIMPL model that can be used to  generate the .LP files and here’s the 17 x 17 LP. I set CPLEX’s MIP emphasis to 4 (look for hidden feasible solutions) and first ran a 14 x 14 as a warm-up. CPLEX 12.1 finds a feasible solution in under 5 seconds (I substituted 1 for 4 in the color usage constraints above). I stopped the 15 x 15 problem after 12 fruitless hours of search, so 14 x 14 is apparently the largest of the easy instances.

I started the 17 x 17 run last Thursday. The initial IP has 1156 variables and 74,620 constraints. Pre-processing reduces that to 642 variables and 15,775 constraints. After approximately 122 hours and 77,000 nodes on a dual-core machine (3.79 GHz), no solution was found.

Now it’s time for smarter ideas. Here are a few candidates:

1) Try a column-generation approach. It’s easy to find solutions for 8×8 and 9×9 grids, which will appear as sub-grids of a 17×17 solution. So it may be possible to write a set partitioning formulation (with side constraints) that has the 8×8 and 9×9 solutions as variables.

2) Try a meta-heuristic approach (e.g. simulated annealing). This problem is probably one of those for which an almost-feasible solution (like this one) can be very far from a feasible solution.

3) Try constraint programming. I think the main difficulty here will be finding a good variable and value selection heuristic. I started building a Comet model for the problem using the global constraint cardinality; here it is. It still does not include the constraints that fix the color of the rectangle free subset provided by Gasarch, but those are very easy to add (see the code that is commented out).

Let me know what your thoughts and hunches are!

November 25, 2009

Technorati Code Verification Only; Please Ignore

CAJHQZTVGAA3

November 20, 2009

Facebook’s Farmville: What’s the Fastest Way to Get Rich?

One can’t help but notice that a lot of people have been playing Farmville on Facebook lately. Some of my relatives, friends, students and co-workers have taken the plunge. I don’t have time to play Farmville, but I thought it would be interesting to study it. It’s a simple and fun game, easy to explain, and apparently addictive.

You begin the game with a 12 by 12 piece of land. That gives you 144 one by one squares in which to plant crops. To simplify the explanation, I’m going to (1) restrict my attention to the four crops initially available (strawberries, eggplant, wheat, and soybeans); (2) assume crops will be harvested as soon as they are ripe (that is, ignore wilting); (3) disregard the random bonuses and gifts that show up every now and then. Blogger Mark Newheiser has a nice post about Farmville in which he includes a useful table. The cost column includes the cost of plowing the land (15 coins) plus the cost of purchasing the seeds. Because the harvest times of all four crops are multiples of four hours, we can think in terms of 4-hour periods.

Question: What’s the largest amount of cash one can obtain in 6 days? (need to rest on the 7th day).

You initially have 200 coins, one ripe and one half-grown square of strawberries, one ripe and one half-grown square of eggplant. By harvesting the two ripe squares, your fortune grows to 323 coins. I’ll also go ahead and clean (delete) the two half-grown squares so that we can begin with a clean slate. It turns out that the answer to the above question is to plant strawberries only, and as much as you can. Here’s the planting schedule:

Period Strawberry Squares
0 12
1 17
2 24
3 34
4 47
5 66
6 92
7 129

At period 8 (after 32 hours), you’ll have enough money to fill all your 144 squares with strawberries and you just continue to plow, plant and harvest them all every 4 hours. After 6 days (36 four-hour periods), you’ll have 44,913 coins (adjusted to take into account that 4 squares were already plowed at time zero). This solution is not very exciting and, moreover, easy to predict. Taking a second look at the cost table, we can see that the profit per hour of strawberries, eggplant, wheat, and soybeans are 2.5, 1, 0.903, and 1.375, respectively. This fact, combined with the fact that strawberries are the fastest to grow, yields the above result. As one progresses through the game, other crops become available and strawberries’ dominance goes away. However, we can show that things can get very complicated by changing the numbers a little bit. If we increase the cost of strawberries to 31 (including plowing), their profit per hour changes to 1 (worse than soybeans). Here’s the new optimal planting schedule for 6 days, which yields a profit of 15,725 coins:

Period Strawberry Squares Soybean Squares
0 2 8
1 3
6 1 15
7 7 1
8 6 2
9 7
12 5 26
13 8
14 13
15 29
16 25 8
17 2 27
18 57
22 16
23 75
24 69
29 16 59
30 85
35 59

Note that the best course of action now is to mix strawberries and soybeans and, at least to me, the proper way of doing it isn’t so obvious (hence the value of using Operations Research). If you’re now wondering whether there would be a situation in which the best idea is to use three different crops, the answer is yes! If we decrease the cost of eggplant to 19 (including plowing), for instance, the optimal planting schedule results in a profit of 18,157 coins and looks like this:

Period Strawberry Squares Soybean Squares Eggplant Squares
0 17
12 1 37 18
13 1
14 1 1
15 1
16 1
17 2
18 123
24 18
26 1
27 1
28 1
29 3
30 123
35 3

In conclusion, I believe that Farmville is a rich decision-making environment that can be used to illustrate some of the analytical techniques from the field of Operations Research. This simple example shows how complicated things can become, even with only four crops!

One possibility of extending this analysis would be to consider that one isn’t willing to log in every 4 hours of so in order to harvest crops, so here’s another question:

Question: Given a schedule of off-line hours (i.e. not available for harvesting), how much money can you make in 6 days?

Finally, I haven’t actually answered the question that appears in the title of this blog post: what’s the fastest way to obtain X coins? I’ll leave these last two questions as an exercise.

Technical Details:

This section of the post is dedicated to those who want to know more about how this analysis was conducted. I used an integer programming model written in AMPL (I think dynamic programming would work too). My variables are: P_{it} = number of squares planted with crop i in period t , H_{it} = number of squares of crop i harvested in period t , A_t = area in use in period t , M_t = money available in period t . Periods go from 0 to T (in our example, T=36 ). So we are interested in maximizing M_T .

The constraints update the values of A_t and M_t based on what you do in period t . Here they are (I’m omitting the details of boundary conditions):

\displaystyle A_t = A_{t-1} + \sum_{i} P_{it} - \sum_{i} H_{it} \enspace \forall \; t

\displaystyle M_t = M_{t-1} - \sum_{i} c_i P_{it} + \sum_{i} r_i H_{it} \enspace \forall \; t

where c_i and r_i are, respectively, the cost and revenue of crop i . Because we don’t consider wilting, H_{it}=P_{i(t-g_i)} , where g_i is the time it takes crop i to grow and ripen. Therefore, the H_{it} variables can be eliminated. Finally, we require M_t \geq 0 and 0 \leq A_t \leq 144 . Here’s the mixed-integer programming (MIP) model in AMPL in case anyone wants to play with it. When it is optimal to use more than one crop, the MIP can get pretty hard to solve, considering its size.

November 6, 2009

YouTube Videos on Solving LPs with Excel

This had been in the back burner for a long while. Each time I taught my O.R. class (either undergrad or MBA), I kept thinking that the students would like to be able to see the Excel setup of an LP model over and over again. Many of them are not proficient in Excel, and my class is their first contact with things like absolute cell references and SUMPRODUCT. I watched a number of YouTube videos on the topic, but I wasn’t happy with any of them. Besides, I wanted the video to be about the same example that I use in the classroom. So I decided to bite the bullet and go for it. I think the outcome was decent: not great, but not terrible either. My wife said I sound a little stilted. I agree. I was only willing to do it once, no second takes, which means I was a little nervous :-)

Before I give you the link to the videos, I need to bring your attention to two disclaimers:

Disclaimer 1: I used the demo version of iShowU to record my screen action. That means you’ll see a green watermark on top of the video. I know it’s ugly, but I didn’t want to pay $29.95 for it. At least not until I get some feedback to convince me that I’ll be doing more of these videos.

Disclaimer 2: I shamelessly use the same example that they (used to?) use at the Tepper School in my classes: the “famous” farmer problem (sorry Javier :-). I was the head TA for that class a number of times and I literally sat through it at least 3 or 4 times.

Here they are:

Linear Programming: The Farmer Problem, Part I (9:07 min)

Linear Programming: The Farmer Problem, Part II (8:57 min)

I hope these videos turn out to be useful to my students and to anyone who wants to learn about linear programming and Excel Solver. I’m thinking about doing more videos on the diet problem, transportation, sensitivity analysis, etc. Let’s see how things go.

September 29, 2009

Introducing CuSPLIB

A couple of colleagues and I are doing research on single-machine cumulative scheduling problems (CuSP). As part of this effort, we’ll have to create some benchmark instances on which to test our algorithms. Some time ago, I searched around for problem instances and could not find any. People seem to be more interested in the Resource Constrained Project Scheduling Problem (RCPSP), of which the CuSP is a special case/subproblem. One of the experts in the area told me that he was unaware of any standard CuSP benchmarks and that difficult instances were hard to generate. Therefore, I decided to make our instances public, hoping that (i) this could be helpful/useful to someone else out there, and (ii) this could attract more attention to this problem. As a result, CuSPLIB is born! It includes a few pieces of code (instance generator, MIP and CP models), an initial set of 10 instances, and some discussion about integer programming models for the problem. I intend to talk about other (alternative) models and include some references in the near future. The preliminary computational results are interesting and make me believe that it’s not that difficult to find challenging instances. Let me know what you think, and feel free to contribute to CuSPLIB! I’ll be updating it little by little as our research progresses.

September 22, 2009

Cited by Competing on Analytics, by mistake…

I just found out that one of my papers (Building Efficient Product Portfolios at John Deere and Company, co-authored with D. Napolitano, A. Scheller-Wolf and Sridhar Tayur), has been cited in chapter 2 of the book “Competing on Analytics: The New Science of Winning“, by Thomas Davenport and Jeanne Harris. The chapter is entitled “What Makes an Analytical Competitor?” (link to PDF). Another interesting coincidence was that today I listened to the Science of Better Podcast featuring Thomas Davenport! I was excited to see the context of the citation, but it all turned out to be a mistake :-( The authors refer to the work on “direct derivative estimation of non-stationary inventory” (which was also done by SmartOps, saving Deere 1.2 billion dollars over 5 years) and cite our paper as a reference for it. Could it have been because our paper has the words “John” and “Deere” in the title? On a positive note, however, this may drive some readers to our paper, which is, IMHO, a great read anyway.

September 14, 2009

Optima Newsletter

optimaThe main focus of my research so far has been on the integration of optimization techniques. Although I have used Integer Programming theory, I hadn’t (until recently) done research on creating new IP theory. Perhaps as a result of that, I spent many years unaware of Optima, the newsletter of the Mathematical Programming Society. Thanks to my recent trip to the ISMP, however, I was able to get a hold of issues 76 through 79 of Optima (they’re also available online). A quick browse through the pages left me with a very good impression. Every single issue had a discussion or short paper on a topic that interested me, and I can’t wait to sit down to read them more carefully (see for instance Javier Peña’s article on Nash Equilibria Computation via Smoothing Techniques (Nov 2008 issue); Texas Hold’em poker anyone?). I particularly like the fact that the short papers are written at a more accessible level (at least to me), when compared to papers published in Mathematical Programming; it’s a great way to get introduced to a new topic. Another great feature are the Discussion Columns, which give expert opinions and historical perspectives about  a variety of topics.

I’m glad I came across Optima and did not shy away from grabbing some of the free issues available at ISMP. I strongly recommend it to anyone interested in the field of mathematical programming.

August 30, 2009

ISMP ‘09: Friday (the end)

John and I attended the morning technical sessions on the COIN-OR project [add link]. I enjoyed the overview talk by Robert Fourer [add link] (“COIN-OR Triennial Update”). It helped me understand what is necessary to get a project into COIN (both technical and legal requirements). This made me wonder if that’s the right way to go with SIMPL [add link]. My first reaction is to say “yes”, but I’ll have to think about it and talk to my co-authors. The second talk was given by Laszlo Ladanyi [add link], and it was entitled “Bigger, Better, Faster: Update on the World’s Fastest Open-Source MIP Solver”. Although the latest results I’ve seen [add link] place CBC [add link] behind SCIP [add link], I guess it’s OK for them to say that because, although SCIP ’s code is freely available, it’s only free for academic use. This means SCIP cannot call itself “open-source” (Ah…the world of legal terminology). The improvements to CBC have indeed been impressive: new heuristics, 2x increase in preprocessing speed, and 55% improvement on the geometric mean of solution times for problems in the Mittelmann benchmarks [add link], to name a few. The final talk in the COIN-OR session was given by Bjarni Kristjansson [add link] of Maximal Software [add link]. He talked about new developments to CoinMP, which is a library that gives a high-level user easy access to many COIN solvers from both Windows and Linux applications.
An added (positive) side effect of attending the COIN-OR session was that I had a chance to visit Gleacher Center [add link] at the University of Chicago [add link]. Until Thursday, all the sessions I attended were held at the Marriott hotel. The talks were in what appeared to be an MBA classroom. Pretty neat room, nice tables and nice chairs, but no leather. If you want to get an excellent MBA degree, while attending classrooms with leather chairs, you will have to come to the University of Miami [add link] :-)
Unfortunately, there was also a negative side-effect: I had to miss Michel Goeman’s [add link] talk on the “Smallest Compact Formulation of the Permutahedron”. The permutahedron in n dimensions is the convex hull of the points that represent all possible permutations of the numbers from 1 to n. Michel found a formulation with O(nlog(n)) variables and O(nlog(n)) constraints and proved that no smaller compact formulation exists (up to constant factors). This may have some implications in hybrid methods because the permutahedron is related to the alldifferent global constraint of Constraint Programming. I’ll have to see if I can find Michel’s paper online somewhere, or ask him for a copy. In his abstract, he mentions that this result generalizes easily to variants of the permutahedron, so I’m curious to see what those variants are.
During the lunch break I took some time off to go to Millenium Park [add link]. It’s gorgeous. My trip to Chicago wouldn’t have been complete without a visit to The Bean [add link], the X Fountain [add link], and the Y stadium [add link]. I’ve included some photos below. On my way back, I stopped for a quick lunch (paninni + soup) and bought a 3-pack of Starbucks Via [add link] (their brand of instant coffee). I know what you’re going to say: instant coffee tastes terrible. However, there’s been so much talk about Via that I wanted to give it a try.
After lunch I went to François Soumis [add link] talk on “Dantzig-Wolfe Decomposition Putting in the Subproblem the Degenerated Constraints of a Linear Problem”. He explained the Improved Primal Simplex (IPS) [add link] algorithm and how it can be used in a column-generation style. The results are really impressive. For highly degenerate problems, the LPs solve 10x faster (compared to CPLEX’s primal simplex) when only lower bounds are taken into account, and 20x faster when both lower and upper bounds are used. For a class of degenerate set partitioning problems, LPs solve 20 to 40 times faster, while the IPs solve 50 to 100 times faster.
They’re working on a number of extensions, including generalizations of the same technique to non-linear problems (through active set methods).
The last two talks I attended had one thing in common: both dealt with scheduling problems with long time horizons. Although the typical time-indexed MIP formulation for machine scheduling (x_jt = 1 if job j starts at time t) provides strong lower bounds for short time horizons, it is ineffective at solving large horizon problems because the number of variables blows up. In the first talk of that session, Christos Maravelias [add link] talked about “The Polyhedral Properties of a Discrete-Time MIP Formulation for Production Planning”. Christos’s models were suffering from the “curse of long time horizons” and they tried many variants of the model to no avail. Their successful idea consists of looking at the problem as the intersection of two polyhedra P1 and P2 and trying to strengthen the individual P1 and P2 formulations. It turns out that the coefficient matrix of P1 was totally unimodular [add link] (hence no strengthening needed), but the matrix of P2 wasn’t. Nevertheless, by playing around with the basic time unit of their time horizon (and properly adjusting the right-hand side coefficients), they managed to rescale the coefficient matrix of P2 into a k-regular matrix [add link] (k-regularity is a generalization of total unimodularity). Interestingly, the rank-1 closure (w.r.t. mod-k cuts) can be obtained easily for k-regular matrices. In addition, that k-regularity had other positive effects on the way that CPLEX treats the problem (I didn’t fully get that part of the talk). The result: problems that used to take hundreds of thousands of nodes to solve were being solved with just a few nodes, and some times at the root node. The final talk, entitled “Single-Machine Scheduling over Long Time Horizons by Logic-based Benders Decomposition”, was given by Elvin Coban. She implemented three different models for the problem (MIP, CP and Benders) and looked at three objectives: feasibility only, min makespan, and min tardiness. The MIP and CP models were pretty simple and straightforward. The Benders decomposition model consisted of a MIP master problem that assigned groups of jobs to time intervals in the scheduling horizon, and a subproblem for each interval that took care of scheduling the jobs assigned to it. Both the master and the subproblems had their relaxations enriched with extra constraints. The Benders cuts were strengthened by looking at smaller-than-obvious causes of infeasibility (or bounds, for the makespan and tardiness objectives) through multiple solves of each subproblem. The results were interesting: for problems in which the jobs have tight time windows, CP is the best approach, followed by Benders and then (far, far away, 100 times slower) MIP. As the time horizon increases and/or job time windows become looser, the Benders approach remains robust and becomes better than CP, which goes all over the place (MIP is still far, far away). Here’s an idea of the sizes of their problems: # jobs: 60 to 120, time horizon: 170 to 4000 units. I was surprised to learn that they simply divided the time horizon into 10 equal intervals. I suspect that a more careful choice of intervals can make a significant difference (consider, for instance, the release dates and due dates as possible breakpoints).
That was the end of the conference. I headed back to my hotel, picked up my bag and took the Red Line -> Orange Line -> Midway. Total cost of ground transportation for the whole trip = $4.50. (The folks paying my reimbursement will be proud of me.) Unfortunately, upon arrival at the airport, I found out that my Southwest flight was delayed by 2 hours! Thankfully, I managed to arrive home safe and sound (and tired). I had a great time in Chicago, but I’m happy to be back home in sunny south Florida. I hope these ISMP blog posts have been useful to my blog readers. It was a pleasure to contribute my two cents to the O.R. community.

John and I attended the morning technical sessions on the COIN-OR project. I enjoyed the overview talk by Robert Fourer (“COIN-OR Triennial Update”). It helped me understand what is necessary to get a project into COIN (both technical and legal requirements). This made me wonder if that’s the right way to go with SIMPL. My first reaction is to say “yes”, but I’ll have to think about it and talk to my co-authors. The second talk was given by Laszlo Ladanyi, and it was entitled “Bigger, Better, Faster: Update on the World’s Fastest Open-Source MIP Solver”. Although the latest results I’ve seen place CBC behind SCIP, I guess it’s OK for them to say that because, although SCIP ’s code is freely available, it’s only free for academic use. This means SCIP cannot call itself “open-source” (Ah…the world of legal terminology). The improvements to CBC have indeed been impressive: new heuristics, 2x increase in pre-processing speed, and 55% improvement on the geometric mean of solution times for problems in the Mittelmann benchmarks, to name a few. The final talk in the COIN-OR session was given by Bjarni Kristjansson of Maximal Software. He talked about new developments to CoinMP, which is a library that gives a high-level user easy access to many COIN solvers from both Windows and Linux applications.

An added (positive) side effect of attending the COIN-OR session was that I had a chance to visit Gleacher Center at the University of Chicago. Until Thursday, all the sessions I attended were held at the Marriott hotel. The talks were in what appeared to be an MBA classroom. Pretty neat room, nice tables and nice chairs, but no leather. If you want to get an excellent MBA degree, while attending classrooms with leather chairs, you will have to come to the University of Miami :-)

Unfortunately, there was also a negative side-effect: I had to miss Michel Goemans’s talk on the “Smallest Compact Formulation of the Permutahedron”. The permutahedron in n dimensions is the convex hull of the points that represent all possible permutations of the numbers from 1 to n. Michel found a formulation with O(nlog(n)) variables and O(nlog(n)) constraints and proved that no smaller compact formulation exists (up to constant factors). This may have some implications in hybrid methods because the permutahedron is related to the alldifferent global constraint of Constraint Programming. I’ll have to see if I can find Michel’s paper online somewhere, or ask him for a copy. In his abstract, he mentions that this result generalizes easily to variants of the permutahedron, so I’m curious to see what those variants are.

During the lunch break I took some time off to go to Millenium Park. It’s gorgeous. My trip to Chicago wouldn’t have been complete without a visit to the Cloud Gate (a.k.a the bean), the Crown Fountain, and the Jay Pritzker Pavilion. I’ve included some photos below. On my way back, I stopped for a quick lunch (paninni + soup) and bought a 3-pack of Starbucks Via (their brand of instant coffee; only sold in Chicago, Seattle and London). I know what you’re going to say: instant coffee tastes terrible. However, there’s been so much talk about Via that I wanted to give it a try.

After lunch I went to François Soumis’s talk on “Dantzig-Wolfe Decomposition Putting in the Subproblem the Degenerated Constraints of a Linear Problem”. He explained the Improved Primal Simplex (IPS) algorithm and how it can be used in a column-generation style. The results are really impressive. For highly degenerate problems, the LPs solve 10x faster (compared to CPLEX’s primal simplex) when only lower bounds are taken into account, and 20x faster when both lower and upper bounds are used. For a class of degenerate set partitioning problems, LPs solve 20 to 40 times faster, while the IPs solve 50 to 100 times faster. They’re working on a number of extensions, including generalizations of the same technique to non-linear problems (through active set methods).

The last two talks I attended had one thing in common: both dealt with scheduling problems with long time horizons. Although the typical time-indexed MIP formulation for machine scheduling (x_jt = 1 if job j starts at time t) provides strong lower bounds for short time horizons, it is ineffective at solving large horizon problems because the number of variables grows too much. In the first talk of that session, Christos Maravelias talked about “The Polyhedral Properties of a Discrete-Time MIP Formulation for Production Planning”. Christos’s models were suffering from the “curse of long time horizons” and they tried many variants of the model to no avail. Their successful idea consists of looking at the problem as the intersection of two polyhedra P1 and P2 and trying to strengthen the individual P1 and P2 formulations. It turns out that the coefficient matrix of P1 was totally unimodular (hence no strengthening needed), but the matrix of P2 wasn’t. Nevertheless, by playing around with the basic time unit of their time horizon (and properly adjusting the right-hand side coefficients), they managed to rescale the coefficient matrix of P2 into a k-regular matrix (k-regularity is a generalization of total unimodularity). Interestingly, the rank-1 closure (w.r.t. mod-k cuts) can be obtained easily for k-regular matrices. In addition, that k-regularity had other positive effects on the way that CPLEX treats the problem (I didn’t fully get that part of the talk). The result: problems that used to take hundreds of thousands of nodes to solve were being solved with just a few nodes, and some times at the root node. The final talk, entitled “Single-Machine Scheduling over Long Time Horizons by Logic-based Benders Decomposition”, was given by Elvin Coban. She implemented three different models for the problem (MIP, CP and Benders) and looked at three objectives: feasibility only, min makespan, and min tardiness. The MIP and CP models were pretty simple and straightforward. The Benders decomposition model consisted of a MIP master problem that assigned groups of jobs to time intervals in the scheduling horizon, and a subproblem for each interval that took care of scheduling the jobs assigned to it. Both the master and the subproblems had their relaxations enriched with extra constraints. The Benders cuts were strengthened by looking at smaller-than-obvious causes of infeasibility (or bounds, for the makespan and tardiness objectives) through multiple solves of each subproblem. The results were interesting: for problems in which the jobs have tight time windows, CP is the best approach, followed by Benders and then (far, far away, 100 times slower) MIP. As the time horizon increases and/or job time windows become looser, the Benders approach remains robust and becomes better than CP, which goes all over the place (MIP is still far, far away). Here’s an idea of the sizes of their problems: # jobs: 60 to 120, time horizon: 170 to 4000 units. I was surprised to learn that they simply divided the time horizon into 10 equal intervals. I suspect that a more careful choice of intervals can make a significant difference (consider, for instance, the release dates and due dates as possible breakpoints).

That was the end of the conference. I headed back to my hotel, picked up my bag and took the Red Line -> Orange Line -> Midway. Total cost of ground transportation for the whole trip = $4.50. (The folks paying my reimbursement will be proud of me.) Unfortunately, upon arrival at the airport, I found out that my Southwest flight was delayed by 2 hours! Thankfully, I managed to arrive home safe and sound (and tired). I had a great time in Chicago, but I’m happy to be back home in sunny south Florida. I hope these ISMP blog posts have been useful to my blog readers. It was a pleasure to contribute my two cents to the O.R. community.

By the way, if you want a view of the ISMP more focused on Interior Point Methods, take a look at Nathan Brixius blog.

DSC06528

america

DSC06541

DSC06550

fountain

DSC06558