# How to Build the Best Fantasy Football Team, Part 2

UPDATE on 10/5/2015: Explained how to model a requirement of baseball leagues (Requirement 4).

UPDATE on 10/8/2015: Explained how to model a different objective function (Requirement 5).

Yesterday, I wrote a post describing an optimization model for picking a set of players for a fantasy football team that maximizes the teams’ point projection, while respecting a given budget and team composition constraints. In this post I’ll assume you’re familiar with that model. (If you are not, please spend a few minutes reading this first.)

Fellow O.R. blogger and Analytics expert Matthew Galati pointed out that my model did not include all of the team-building constraints that appear on popular fantasy football web sites. Therefore, I’m writing this follow-up post to address this issue. (Thanks, Matthew!) My MBA student Kevin Bustillo was kind enough to compile a list of rules from three sites for me. (Thanks, Kevin!) After looking at them, it seems my previous model fails to deal with three kinds of requirements:

1. Rosters must include players from at least $N_1$ different NFL teams ($N_1=2$ for Draft Kings and $N_1=3$ for both Fan Duel and Yahoo!).
2. Rosters cannot have more than $N_2$ players from the same team ($N_2=4$ for Fan Duel and $N_2=6$ for Yahoo! Draft Kings does not seem to have this requirement).
3. Players in the roster must represent at least $N_3$ different football games (Only Draft Kings seems to have this requirement, with $N_3=2$).

Let’s see what the math would look like for each of the three requirements above. (Converting this math into Excel formulas shouldn’t be a problem if you follow the methodology I used in my previous post.) I’ll be using the same variables I had before (recall that binary variable $x_i$ indicates whether or not player $i$ is on the team).

Requirement 1

Last time I checked, the NFL had 32 teams, so let’s index them with the letter $j=1,2,\ldots,32$ and create 32 new binary variables called $y_j$, each of which is equal to 1 when at least one player from team $j$ is on our team, and equal to zero otherwise. The requirement that our team must include players from at least $N_1$ teams can be written as this constraint:

$\displaystyle \sum_{j=1}^{32} y_j \geq N_1$

The above constraint alone, however, won’t do anything unless the $y_j$ variables are connected with the $x_i$ variables via additional constraints. The behavior that we want to enforce is that a given $y_j$ can only be allowed to equal 1, if at least one of the players from team $j$ has its corresponding $x$ variable equal to 1. To make this happen, we add the constraint below for each team $j$:

$\displaystyle y_j \leq \sum_{\text{all players } i \text{ that belong to team } j} x_i$

For example, if the Miami Dolphins are team number 1 and their players are numbered from 1 to 20, this constraint would look like this: $y_1 \leq x_1 + x_2 + \cdots + x_{20}$

Requirement 2

Repeat the following constraint for every team $j$:

$\displaystyle \sum_{\text{all players } i \text{ that belong to team } j} x_i \leq N_2$

Assuming again that the first 2o players represent all the players from the Miami Dolphins, this constraint on Fan Duel would look like this: $x_1 + x_2 + \cdots + x_{20} \leq 4$

Requirement 3

My understanding of this requirement is that it applies to short-term leagues that get decided after a given collection of games takes place (it could even be a single-day league). This could be implemented in a way that’s very similar to what I did for requirement 1. Create one binary $z_g$ variable for each game $g$. It will be equal to 1 if your team includes at least one player who’s participating in game $g$, and equal to zero otherwise. Then, you need this constraint

$\displaystyle \sum_{\text{all games } g} z_g \geq N_3$

as well as the constraint below repeated for each game $g$:

$\displaystyle z_g \leq \sum_{\text{all players } i \text{ that participate in game } g} x_i$

I earlier claimed that this model can be adapted to fit fantasy leagues other than football. So here’s a question I received from one of my readers:

For fantasy baseball, some players can play multiple positions. E.g. Miguel Cabrera can play 1B or 3B. I currently use OpenSolver for DFS and haven’t found a good way to incorporate this into my model. Any ideas?

Let’s call this…

Requirement 4: What if some players can be added to the team at one of several positions?

Here’s how to take care of this. Given a player $i$, let the index $t=1,2,\ldots,T_i$ represent the different positions he/she can play. Instead of having a binary variable $x_i$ representing whether or not $i$ is on the team, we have binary variables $x_{it}$ (as many as there are possible values for $t$) representing whether or not player $i$ is on the team at position $t$. Because a player can either not be picked or picked to play one position, we need the following constraint for each of these multi-position players:

$\displaystyle \sum_{t=1}^{T_i} x_{it} \leq 1$

Because we have replaced $x_i$ with a collection of $x_{it}$‘s, we need to replace all occurrences of $x_i$ in our model with $(x_{i1} + x_{i2} + \cdots + x_{iT_i})$.

In the Miguel Cabrera example above, let’s say Cabrera’s player ID (the index $i$) is 3, and that $t=1$ represents the first-base position, and $t=2$ represents the third-base position. The constraint above would become

$x_{31} + x_{32} \leq 1$

And we would replace all occurrences of $x_3$ in our model with $(x_{31} + x_{32})$.

That’s it!

I was wondering, is there a way to add an additional constraint that maximizes the minimum rating of the chosen players, if each player has some rating score. I tried to think that out, but can’t seem to get it to be linear.

Let’s call this…

Requirement 5: What if I want to maximize the point projection of the worst player on the team? (In other words, how do I make my worst player as good as possible?)

It’s possible to write a linear model to accomplish this. Technically speaking, we would be changing the objective function from maximizing the total point projection of all players on the team to maximizing the point projection of the worst player on the team. (There’s a way to do both together (sort of). I’ll say a few words about that later on.)

Here we go. Because we don’t know what the projection of the worst player is, let’s create a variable to represent it and call it $z$. The objective then becomes:

$\max z$

You might have imagined, however, that this isn’t enough. We defined in words what we want $z$ to be, but we still need formulas to make $z$ behave the way we want. Let $M$ be the largest point projection among all players that could potentially be on our team. It should be clear to you that the constraint $z\leq M$ is a valid ceiling on the value of $z$. In fact, the value of $z$ will be limited above by 9 values/ceilings: the 9 point projections of the players on the team. We want the lowest of these ceilings to be as high as possible.

When a player $i$ is not on the team ($x_i=0$), his point projection $p_i$ should not interfere with the value of $z$. When player $i$ is on the team ($x_i=1$), we would like $p_i$ to become a ceiling for $z$, by enforcing $z\leq p_i$. The way to make this happen is to write a constraint that changes its behavior depending on the value of $x_i$, as follows:

$z \leq p_ix_i + M(1-x_i)$

We need one of these for each player. To see why the constraint above works, consider the two possibilities for $x_i$. When $x_i=0$ (player not on the team), the constraint reduces to $z\leq M$ (the obvious ceiling), and when $x_i=1$ (player on the team), the constraint reduces to $z\leq p_i$ (the ceiling we want to push up).

BONUS: What if I want, among all possible teams that have the maximum total point projection, the one team whose worst player is as good as possible? To do this, you solve two optimization problems. First solve the original model maximizing the total point projection. Then switch to this $\max z$ model and include a constraint saying that the total point projection of your team (the objective formula of the first model) should equal the total maximum value you found earlier.

That’s it!

And that does it, folks!

Does your league have other requirements I have not addressed here? If so, let me know in the comments. I’m sure most (if not all) of them can be incorporated.

Filed under Analytics, Applications, Integer Programming, Modeling, Motivation, Sports

### 24 responses to “How to Build the Best Fantasy Football Team, Part 2”

1. Andreas Thorsen

Here is a nice story about an employee at FanDuel winning \$350,000 in a contest. It points to another critical element of the optimization problem for those playing contests versus hundreds (or thousands) other contestants: choosing players who are “under-the-radar”. That is, your chance of winning may increase if you have a well-performing, and unique, line-up.

http://fortune.com/2015/10/05/draftkings-fanduel-data-scandal/

It reminds me of the best way to pick lottery numbers. Choosing numbers using birthday dates is a bad strategy because so many people use this approach, making it more likely you would have to share the jackpot with others if you do win.

2. JackR

Thanks for the update on that last section. I think I’ve modeled it correctly so that multiple positions can be considered for each player. However, for the requirements above, I’m having an issue setting the the problem up. I get errors saying the constraints no longer are linear. It sort of makes sense, since there are times when a player is considered and he already has another teammate considered, and so their constraint doesn’t change since adding that player won’t change the y value. There are times, though when he’s the only player considered on his team, and THAT will change the y value. Linear optimization can do a lot, but I think they only way to solve that constraint is brute force, which no one has the time for.

It’s linear to say “No more than n players from one team”, because each time a player is considered, there’s a running tally that he’s contributing toward. But if there are times when he is considered and nothing changes, it messes the optimization up.

Perhaps I didn’t explain myself clearly. Would you be able to model the requirements above in your example spreadsheet from Part 1? That would help a lot.

3. orbythebeach

Jack: the whole model can still be linear even with the requirements above. So I guess there must be a typo somewhere in your model. I’ll update the spreadsheet to show how to do it, but it will take me a few days. If you want to email me your sheet to tallys@miami.edu, I can try to diagnose the issue. Let me know.

4. rs181602

Great read! I was wondering, is there a way to add an additional constraint that maximizes the minimum rating of the chosen players, if each player has some rating score. I tried to think that out, but can’t seem to get it to be linear.

5. orbythebeach

Yes, there is! I’ll show how to do it soon in an update to Part 2.

6. rs181602

Awesome! I look forward to seeing it!! Thank for the fast response :)

7. orbythebeach

rs181602: I’ve updated Part 2 with an explanation of what you want to do. I called it “Requirement 5”.

8. rs181602

thanks so much! That makes complete sense. I had struggled so much to find a way to get rid of the 0s, and had been doing weird stuff to modify all terms but make them irrelevant but totally forgot about Big M! When I tried doing something like this originally, I used If statements in some cells, and had to use evolutionary solver since simplex wouldn’t work and I couldn’t figure out why until I looked up that If makes things non linear. I feel that people familiar with Excel but not so much for using it for linear programming may be prone to using stuff other than Sum and Sumproduct, resulting in non-linear formulations and they won’t get optimal results. These two posts helped me not only with a problem pertaining to a hobby but also with a more concise way of solving other Integer Programs and Linear Programs! Great posts thanks again :)

9. orbythebeach

You’re very welcome. I’m glad it helped.

10. Was wondering if you knew of a way to run solver multiple times, or to have solver output multiple solutions, by either finding different solutions that either equal the same “salary” or increment down after each solve?

11. Andreas Thorsen

I don’t know of a great way of doing it in Excel Solver. However, you can iteratively find the next-best solution (which will likely have the same objective value in these problems, based on my experience) after first solving your model and then adding a constraint of the form:

sum_{i \in T}{(1-x_i)} >= 1

where T are the indices of the players from your original optimal solution. This will guarantee in the next solution that at least one of those players will not be chosen.

Perhaps this could be automated using some VBA code. If you coded it in CPLEX instead, I think there is a better command to generate the multiple optimal solutions.

12. orbythebeach

gerardobrown118: What you are looking for can be achieved with the Solver Table add-in. Just Google for it and you’ll find lots of info about it. Here’s one link that turns up: http://web2.utc.edu/~fgg366/3560/SolverTable-E2010/SolverTableHelp.docx.

13. Thank you both for your responses. Andreas’s response might help me most, as I forgot to explain I was using google spreadsheets, not excel.

Andreas, could u break down what you said as if you were speaking to a complete newbie, please?

Are you saying just to add “sum_{i \in T}{(1-x_i)} >= 1” to my solver constraints?

Also not following what to change T to..

Thanks so much!
Ken

14. orbythebeach

For a more detailed explanation of what Andreas suggested, see this link: http://orinanobworld.blogspot.com/2011/02/finding-multiple-solutions-in-binary.html

15. Thanks again you guys, This seems way over my head though. I have a simple fantasy football form setup that I followed a video tutorial to do ( http://excelbyjoe.com/how-to-use-solver-in-excel-to-create-fantasy-football-lineups/ ). You 2 are obviously a million miles more advanced than me, and I am just not following anything I am reading. lol

16. Andreas Thorsen

The above link gives a good explanation. In this context, let’s say you solve your model, and a team is just three players. The optimization model selected Michael Vick as QB, Adrian Peterson as RB, and Robbie Gould as KICKER. If there are 200 potential players, the binary variables are X1, X2, …, X200. Say Michael Vick corresponds to X3, AP corresponds to X50, and Robbie Gould corresponds to X88. Since that is your solution, those variables are set to 1, and all other players’ binary variables are set to 0.

If you want to find another solution, solve the model again, but with the constraint:

(1-X3) + (1-X50) + (1-X88) >= 1

What does that do? Well, it won’t allow all three of those variables to be equal to 1 anymore – because it would be infeasible to that constraint. So it would find the next-best solution, where at least one of those variables is equal to 0.

Hope that helps.

17. Thanks Andreas! That makes sense to me now!

18. Also, would there be a way of automating that procedure, so that solver just spits out multiple solutions for me?

I currently have my sheet setup so that it looks up the “1’s” and out puts the players selected to a separate sheet.

19. aemil

Great Post!!
Here is a python google or-tools implementation (https://github.com/mrnitrate/Draftkings-Optimal-Lineup-Generator) that can generate multiple unique solutions.

How would you write a constraint for ‘Don’t pick a defense that any offensive player is playing’

20. I’m having a very hard time modelling the two constraints in Requirement 1 into excel. I created a binary column holding each team similar to the how the position columns are set up. Then I used sumproduct to get the number of players on each team showing, also similar to position players. However I’m stuck there, especially in how to make sure a given Yj can only equal 1 or 0 rather than the actual number of players on the optimal team from that specific team.

21. orbythebeach

Hi! To implement requirement 1, you need one Yj variable per team. Let’s say you put them in cells L1 through L32. Then in cell L33 you add the formula =SUM(L1:L32) and in the Solver window you add a constraint saying L33 >= N, where N is typically 2 or 3, depending on the betting site. You also have to add a constraint in Solver to enforce that the Yj variables be binary (can only take the values 0 or 1). This is done in the same way I enforced the “Is on team?” decision variables (the X variables located in column B of the spreadsheet I posted) to be binary. Take a look there. Finally, you’ll need 32 new constraints (one per team) to make sure the X and Y variables work together. For example, using my sheet as a reference, say that Antonio Brown (cell B2), Calvin Johnson (cell B5), and Tony Romo (cell B7) are the only players from team 1 on your list of players. Then you need to add the following constraint in Solver: L1 <= B2 + B5 + B7. The constraints for the remaining 31 teams will follow a similar pattern. This is a tedious process, no doubt. One way to make it simpler is to add a new column which contains the team of each player. Then you could use the SUMIF() formula to produce the right-hand side piece of each of these 32 constraints.

22. Michael M.

Hi Tallys. All your work has been very helpful and informative. Thank you for doing all this. Have you posted a new spreadsheet with all the new additions? I only saw the link to the original file.

23. orbythebeach

Hi Michael. Thank you for your comment. I’m glad you enjoyed this series of posts. I did not implement all the changes I described. I left them as exercises for my readers :-)