# Category Archives: Motivation

## Optimally Resting NBA Players

To celebrate the start of the 2013-2014 NBA season this past Tuesday, I decided to write a post on basketball. More specifically, on the important issue of how to give players some much needed rest in an “optimal” way. My inspiration came from an article by Michael Wallace published on ESPN.com on October 19. Here are some relevant excerpts:

After playing in the Miami Heat’s first five preseason games, LeBron James sat out Saturday night’s 121-96 victory over the San Antonio Spurs to rest…James said the decision to sit was part of the team’s “maintenance” process. Heat teammate Dwyane Wade played Saturday and scored 25 points in 26 minutes, but previously skipped three preseason games…”No, no injuries — just not suiting up,” James said. “It’s OK for LeBron to take one off.”

The key term here is maintenance process. You may also recall that, back in November 2012, the Spurs were fined \$250,000 by the league after coach Popovich sent Duncan, Parker, Ginobili, and Green home right before a game against the Miami Heat.

So we want to rest our players to keep them healthy, but this cannot come at the expense of losing games. There are many factors to be taken into account here, such as players’ current physical condition, strength and tightness of schedule, and match-ups (how well a team stacks up against another team), to name a few. This is definitely not an easy problem. However, some insight is better than no insight at all. Therefore, let’s see what we can do with a simple O.R. model, and then we can talk about the strengths and weaknesses of our initial approach. (Here’s where you, dear reader, are supposed to chime in!)

Let’s begin with two simple assumptions: (i) when it comes to resting, we have to take players’ individual needs into account, i.e., we’ll use player-specific data; and (ii) when it comes to the likelihood of beating an opposing team, it’s better to think in terms of full lineups, rather than in terms of individual players, i.e., we’ll use lineup-specific data. The data in assumption (i) comes from doctors, players’ medical records, and coaches’ strategies. In essence, it boils down to one number: how many minutes, at most, should each player play in each game, under ideal circumstances. A useful measure of the strength of a lineup is its adjusted plus-minus score (see, for example, the work of Wayne Winston and his book Mathletics). In summary, it’s a number that tells you how many points a given lineup plays above (or below) an average lineup in the league over 48 minutes (or over 100 possessions, or another metric of reference).

For the sake of explanation, I’ll pretend to be in charge of resting Miami Heat players (surprise!). I’ll refer to a generic lineup by the letter $i$ ($i=1,\ldots,8$), to a generic player by the letter $j$ ($j=$ LeBron, D-Wade, …, Andersen (Bird Man)), and to a generic game by the letter $k$.

What are the decisions to be made? Let’s consider a planning horizon that consists of the next 7 games (or pick your favorite number). So $k=1,\ldots,7$. For the Heat, the first 7 games of the 2013-2014 season are against the following teams: Bulls, 76ers, Nets, Wizards, Raptors, Clippers, and Celtics. For each one of my potential lineups $i$ and each game $k$, I want to figure out the number of minutes I should use lineup $i$ during game $k$. Because this is an unknown number right now, it’s a variable in the model. Let’s call it $x_{ik}$. Note it’s also OK to think of $x_{ik}$ as a percentage, rather than minutes. I’ll adopt the latter interpretation.

What are the constraints in this problem? There are three main constraints to worry about: (a) make sure to pick enough lineups to play each game in its entirety; (b) make sure your lineups are good enough to hopefully beat your opponents in each game; (c) keep track of players’ minutes, and don’t let them get out of hand. The next step is to represent each constraint mathematically.

Constraint (a): Pick enough lineups to completely cover each game. For every game $k$, we want to impose the following constraint:

$\displaystyle \sum_{i=1}^{10} x_{ik}=1$

This means that if we sum the percentage of time each lineup is used during game $k$, we reach 100%.

Constraint (b): Choose your lineups so that you expect to score enough points in every game to beat your opponents. In this example, I’ll focus on plus-minus scores, but as a coach you could focus on any metric that matters to you. Given a lineup $i$, let $p_i$ be its adjusted plus-minus score. For example, the lineup of LeBron, Wade, Bosh, Chalmers, and Allen in the 2012-2013 season had the amazing $p_i$ score of +36.9 (you can obtain these numbers, and many other neat statistics, from the web site stats.nba.com). Now let’s say you have the plus-minus score of your opponent in game $k$, which we’ll call $P_k$. One way to increase your chances of victory is by requiring that the expected plus-minus score of your lineup combination in game $k$ exceed $P_k$ by a certain amount. Therefore, for every game $k$, we write the following constraint:

$\displaystyle \sum_{i=1}^{10} p_i x_{ik} \geq P_k + 0.5$

I want to emphasize two things. First, $p_i$ can be any measure of goodness of your lineup, and it can take into account the specific opponent in game $k$. Likewise, $P_k$ can be any measure of goodness of team $k$, as long as it’s consistent with $p_i$. Second, you’re not restricted to having only one of these constraints. If many measures of goodness matter to you, add them all in. For example, if you’re playing a team that’s particularly good at rebounding and you believe that rebounding is the key to beating them (e.g. Heat vs. Pacers), then either replace the constraint above with the analogous rebounding version, or include the rebounding version in addition to the constraint above. Finally, note that I picked 0.5 as a fixed amount by which to exceed $P_k$, but it could be any number you wish, of course. It can even be a number that varies depending on the opponent.

Constraint (c): Keep track of how many minutes your players are playing above and beyond what you’d like them to play. For any given player $j$ and any given game $k$, let $m_{jk}$ be $j$‘s ideal number of playing minutes in game $k$ (make it zero if you want the player to sit out). When it’s not possible to match $m_{jk}$ exactly, we need to know how many minutes player $j$ played under or over $m_{jk}$. Let’s call these two unknown numbers (variables) $u_{jk}$ and $o_{jk}$, respectively. So, for every player $j$ and game $k$, we write the following constraint:

$\displaystyle 48\left(\sum_{i \text{ that includes } j} x_{ik}\right) + u_{jk} - o_{jk}=m_{jk}$

The expression “$i$ that includes $j$” under the summation means that we’re summing variables $x_{ik}$ for all lineups of which $j$ is a member. We’re multiplying the summation by 48 minutes because $x_{ik}$ is in percentage points and $m_{jk}$ is in minutes.

What is our goal? (a.k.a. objective function) It’s simple: we don’t want players to play too many minutes above $m_{jk}$. Because this overage amount is captured by variable $o_{jk}$, we can write our goal as:

$\displaystyle \text{minimize } \sum_{j=1}^{9} \sum_{k=1}^{7} o_{jk}$

This minimizes the total overage in playing minutes. For a more balanced solution, it’s also possible to minimize the maximum overage over all players, or add weights in front of the $o_{jk}$ variables to give preference to some players.

Now what? Well, the next step would be to solve this model and see what happens. I created a Microsoft Excel spreadsheet that can be solved with Excel Solver or OpenSolver. You can download it from here. Feel free to adapt it to your own needs and play around with it (this is the fun part!). Because my model was limited in size (I can’t use OpenSolver on my Mac at home), the solution isn’t very good (too many overage minutes). However, by adding more players and more lineups, the quality will certainly improve (use OpenSolver to break free from limits on model size). Here are some notes to help you understand the spreadsheet:

• Variables $x_{ik}$ are in the range B18:H25.
• Variables $u_{jk}$ and $o_{jk}$ are in ranges B56:J62 and B65:J71, respectively.
• Constraints (a) are implemented in rows 27, 28, 29.
• Constraints (b) are implemented in rows 33, 34, 35.
• The left-hand side of constraints (c) are in the range B74:J80. This range is required to be equal to the range B47:J53 (where the $m_{jk}$ are) inside the Solver window.
• The objective function whose formula appears above is in cell J21.

What are the pros and cons of this model? Can you make it better? No model is perfect. There are always real-life details that get omitted. The art of modeling is creating a model that is detailed enough to provide useful answers, but not too detailed to the point of requiring an unreasonable amount of time to solve. The definitions of “detailed enough” and “unreasonable amount of time” are mostly client-specific. (What would please Erik Spoelstra and his coaching staff?) What do you think are the main strengths and weaknesses in the model I describe above? What would you change? Good data is a big issue in this particular case. If you don’t like my data, can you propose alternative sources that are practical? I believe there’s plenty to talk about in this context, and I’m looking forward to receiving your feedback. Maybe we can converge to a model that is good enough for me to go knocking on the Miami Heat’s door! (Don’t worry. In the unlikely event they open the door, I’ll share the consulting fees.)

Filed under Applications, Linear Programming, Modeling, Motivation, Sports

## Enforcing a Restricted Smoking Policy on the UM Campus: a TSP Variant

The Coral Gables campus of the University of Miami is slowly transitioning into a smoke-free campus. (I can’t wait for that to happen.) Presently, there are a number of designated smoking areas (DSAs) around campus and nobody is supposed to smoke anywhere else. Here’s a map of campus with red dots representing DSAs (right-click on it and open it in a new tab to see a larger version):

Unfortunately, enforcement of this smoking policy is nowhere to be seen. The result? Lots of students smoking wherever they want and, even worse, smoking while walking around campus, which is a great way to maximize their air pollution effect. Don’t you love people who live in the universe of me, myself, and I? But let me stop ranting and return to operations research…

As someone who does not enjoy (and is allergic to) cigarette smoke, I started thinking about how to use OR to help with the enforcement effort. Let’s say there will be an enforcer (uniformed official) whose job is to walk around campus in search of violators. Based on violation reports submitted by students, faculty and staff, the University can draw a second set of colored dots, say black, on the above map. These black dots represent the non-smoking areas in which violations have been reported most often. For simplicity, let’s call them violation areas, or VAs.

In possession of the VA map, what is the enforcer supposed to do? You probably answered “walk around campus visiting each VA”. If you’re now thinking about the Traveling Salesman Problem (TSP), you’re on the right track. The enforcer has to visit each VA and return to his/her starting point. However, this is not quite like a pure TSP. Let me explain why. First of all, unlike the pure TSP, the enforcer has to make multiple passes through the VAs on a single day. Secondly, it’s also likely that some VAs are more popular than others. Therefore, we’d like the enforcer to visit them more often. Finally, we want the multiple visits to each VA to be spread throughout the day. With these considerations in mind, let me define the Smoking Policy Enforcement Problem (SPEP): We are given a set of $n$ locations on a map. For each location $i$, let $v_i$ be the minimum number of times the enforcer has to visit $i$ during the day, and let $s_i$ be the minimum separation between consecutive visits to location $i$. In other words, each time the enforcer visits $i$, he/she has to visit at least $s_i$ other locations before returning to $i$. The goal is to find a route for the enforcer that satisfies the visitation requirements ($v_i$ and $s_i$) while minimizing the total distance traveled.

After a few Google searches, I discovered that the SPEP is not a new problem. This shouldn’t have come as a surprise, given the TSP is one of the most studied problems in the history of OR. The article I found, written by R. Cheng, M. Gen, and M. Sasaki, is entitled “Film-copy Deliverer Problem Using Genetic Algorithms” and appeared in Computers and Industrial Engineering 29(1), pp. 549-553, 1995. Here’s how they define the problem:

There are a few minor differences with respect to the SPEP. In the above definition, $s_i=1$ for every location $i$. What they call $d_i$ is what I call $v_i$, and they require exactly $d_i$ visits, whereas I require at least $v_i$ visits.

I wasn’t aware of this TSP variant and I think it’s a very interesting problem. I’m happy to have found yet another application for it. Can you think of other contexts in which this problem appears? Let me know in the comments.