# How to Build the Best Fantasy Football Team

Note 1: This is Part 1 of a two-part post on building fantasy league teams. Read this first and then read Part 2 here.

Note 2: Although the title says “Fantasy Football”, the model I describe below can, in principle, be modified to fit any fantasy league for any sport.

I’ve been recently approached by several people (some students, some friends) regarding the creation of optimal teams for fantasy football leagues. With the recent surge of betting sites like Fan Duel and Draft Kings, this has become a multi-million (or should I say, billion?) dollar industry. So I figured I’d write down a simple recipe to help everybody out. We’re about to use Prescriptive Analytics to bet on sports. Are you ready? Let’s do this! I’ll start with the math model and then show you how to make it all work using a spreadsheet.

The Rules

The fantasy football team rules state that a team must consist of:

• 1 quarterback (QB)
• 2 running backs (RB)
• 1 tight end (TE)
• 1 kicker
• 1 defense

Some leagues also have what’s called a “flex player”, which could be either a RB, WR, or TE. I’ll explain how to handle the flex player below. In addition, players have a cost and the person creating the team has a budget, call it $B$, to abide by (usually $B$ is $50,000 or$60,000).

The Data

For each player $i$, we are given the cost mentioned above, call it $c_i$, and a point projection $p_i$. The latter is an estimate of how many points we expect that player to score in a given week or game. When it comes to the defense, although it doesn’t always score, there’s also a way to calculate points for it (e.g. points prevented). How do these point projections get calculated, you may ask? This is where Predictive Analytics come into play. It’s essentially forecasting. You look at past/recent performance, you look at the upcoming opponent, you look at players’ health, etc. There are web sites that provide you with these projections, or you can calculate your own. The more accurate you are at these predictions, the more likely you are to cash in on the bets. Here, we’ll take these numbers as given.

The Optimization Model

The main decisions to be made are simple: which players should be on our team? This can be modeled as a yes/no decision variable for each player. So let’s create a binary variable called $x_i$ which can only take two values: it’s equal to the value 1 when player $i$ is on our team, and it’s equal to the value zero when player $i$ is not on our team. The value of $i$ (the player ID) ranges from 1 to the total number of players available to us.

Our objective is to create a team with the largest possible aggregate value of projected points. That is, we want to maximize the sum of point projections of all players we include on the team. This formula looks like this:

$\max \displaystyle \sum_{\text{all } i} p_i x_i$

The formula above works because when a player is on the team ($x_i=1$), its $p_i$ gets multiplied by one and is added to the sum, and when a player isn’t on the team ($x_i=0$) its $p_i$ gets multiplied by zero and doesn’t get added to the final sum. The mechanism I just described is the main idea behind what makes all formulas in this model work. For example, if the point predictions for the first 3 players are 12, 20, and 10, the maximization function start as: $\max 12x_1 + 20x_2 + 10x_3 + \cdots$

The budget constraint can be written by saying that the sum of the costs of all players on our team has to be less than or equal to our budget $B$, like this:

$\displaystyle \sum_{\text{all }i} c_i x_i \leq B$

For example, if the first 3 players cost 9000, 8500, and 11000, and our budget is 60,000, the above formula would look like this: $9000x_1 + 8500x_2 + 11000x_3 + \cdots \leq 60000$.

To enforce that the team has the right number of players in each position, we do it position by position. For example, to require that the team have one quarterback, we write:

$\displaystyle \sum_{\text{all } i \text{ that are quarterbacks}} x_i = 1$

To require that the team have two running backs and three wide receivers, we write:

$\displaystyle \sum_{\text{all } i \text{ that are running backs}} x_i = 2$

$\displaystyle \sum_{\text{all } i \text{ that are wide receivers}} x_i = 3$

The constraints for the remaining positions would be:

$\displaystyle \sum_{\text{all } i \text{ that are tight ends}} x_i = 1$

$\displaystyle \sum_{\text{all } i \text{ that are kickers}} x_i = 1$

$\displaystyle \sum_{\text{all } i \text{ that are defenses}} x_i = 1$

The Curious Case of the Flex Player

The flex player adds an interesting twist to this model. It’s a player that, if I understand correctly, takes the place of the kicker (meaning we would not have the kicker constraint above) and can be either a RB, WR, or TE. Therefore, right away, we have a new decision to make: what kind of player should the flex be? Let’s create three new yes/no variables to represent this decision: $f_{\text{RB}}$, $f_{\text{WR}}$, and $f_{\text{TE}}$. These variables mean, respectively: is the flex RB?, is the flex WR?, and is the flex TE? To indicate that only one of these things can be true, we write the constraint below:

$f_{\text{RB}} + f_{\text{WR}} + f_{\text{TE}} = 1$

In addition, having a flex player is equivalent to increasing the right-hand side of the constraints that count the number of RB, WR, and TE by one, but only for a single one of those constraints. We achieve this by changing these constraints from the format they had above to the following:

$\displaystyle \sum_{\text{all } i \text{ that are running backs}} x_i = 2 + f_{\text{RB}}$

$\displaystyle \sum_{\text{all } i \text{ that are wide receivers}} x_i = 3 + f_{\text{WR}}$

$\displaystyle \sum_{\text{all } i \text{ that are tight ends}} x_i = 1 + f_{\text{TE}}$

Note that because only one of the $f$ variables can be equal to 1, only one of the three constraints above will have its right-hand side increased from its original value of 2, 3, or 1.

Other Potential Requirements

Due to personal preference, inside information, or other esoteric considerations, one might want to include other requirements in this model. For example, if I want the best team that includes player number 8 and excludes player number 22, I simply have to force the x variable of player 8 to be 1, and the x variable of player 22 to be zero. Another constraint that may come in handy is to say that if player 9 is on the team, then player 10 also has to be on the team. This is achieved by:

$x_9 \leq x_{10}$

If you wanted the opposite, that is if player 9 is on the team then player 10 is NOT on the team, you’d write:

$x_9 + x_{10} \leq 1$

Other conditions along these lines are also possible.

Putting It All Together

If you were patient enough to stick with me all the way through here, you’re eager to put this math to work. Let’s do it using Microsoft Excel. Start by downloading this spreadsheet and opening it on your computer. Here’s what it contains:

• Column A: list of player names.
• Column B: yes/no decisions for whether a player is on the team (these are the x variables that Excel Solver will compute for us).
• Columns C through H: flags indicating whether or not a player is of a given type (0 = no, 1 = yes).
• Columns I and J: the cost and point projections for each player.

Now scroll down so that you can see rows 144 through 150. The cells in column B are currently empty because we haven’t chosen which players to add to the team yet. But if those choices had been made (that is, if we had filled column B with 0’s and 1’s), multiplying column B with column C in a cell-wise fashion and adding it all up would tell you how many quarterbacks you have. I have included this multiplication in cell C144 using the SUMPRODUCT formula. In a similar fashion, cells D144:H144 calculate how many players of each kind we’d have once the cells in column B receive values. The calculations of total team cost and total projected points for the team are analogous to the previous calculations and also use the SUMPRODUCT formula (see cells I144 and J144). You can try picking some players by hand (putting 1’s in some cells of column B) to see how the values of the cells in row 144 will change.

If you now open the Excel Solver window (under the Data tab, if your Solver add-in is active), you’ll see that I already have the entire model set up for you. If you’ve never used Excel Solver before, the following two-part video will get you started with it: part 1 and part 2.

The objective cell is J144, and that’s what we want to maximize. The variables (a.k.a. changing cells) are the player selections in column B, plus the flex-player type decisions (cells D147:F147). The constraints say that: (1) the actual number of players of each type (C144:H144) are equal to the desired number of each type (C146:H146), (2) the total cost of the team (I144) doesn’t exceed the budget (I146), (3) the three flex-player binary variables add up to 1 (D150 = F150), and, (4) all variables in the problem are binary. (I set the required number of kickers in cell G146 to zero because we are using the flex-player option. If you can have both a flex player and a kicker, just type a 1 in cell G146.) If you click on the “Solve” button, you’ll see that the best answer is a team that costs exactly \$50,000 and has a total projected point value of 78.3. Its flex player ended up being an RB.

This model is small enough that I can solve it with the free student version of Excel Solver (which comes by default with any Office installation). If you happen to have more players and your total variable count exceeds 200, the free solver won’t work. But don’t despair! There exists a great Solver add-in for Excel that is also free and has no size limit. It’s called OpenSolver, and it will work with the exact same setup I have here.

That’s it! If you have any questions or remarks, feel free to leave me a note in the comments below.

UPDATE: In a follow-up post, I explain how to model a few additional fantasy-league requirements that are not included in the model above.

Filed under Analytics, Applications, Integer Programming, Modeling, Motivation, Sports

### 18 responses to “How to Build the Best Fantasy Football Team”

1. Don’t forget the team restrictions present on some sites. E.g., DraftKings: “Rosters will consist of 9 players and must include players from at least 2 different NFL teams, and representing at least 2 different football games.”

There are many (just started recently) small analytics companies providing both the predictive and prescriptive side of lineup builders for daily fantasy sports. Some of them are using integer programming to build the lineups (some are just hacking it). One of my favorites is here: https://www.dailyfantasycafe.com/

When you start adding features like “exposure” or “diversity” for generating multiple lineup entries (not just the optimal) – the problem starts to get interesting.

2. Andreas Thorsen

This is a great analytics problem. I was playing FanDuel last season and developed a similar model, and I tested it for about three weeks against friends. An important skill to learn is how to collect data. I simply scraped data from the web using a Chrome extension called ‘Table Capture’.

Next, it is important to understand the uncertain nature of the data. From what I recall, FanDuel gives point predictions for each player, and DraftKings gives average points the player scored that season. FanDuel seemed to have poor predictions, and historical averages are not useful either, so instead I used the predicted points from Yahoo Fantasy Sports website. These predictions at least in some way took into account important determinants of points such as who their next opponent is.

The next issue was that the instances I solved tended to have multiple optimal solutions. If my predicted points per player hadn’t been integer numbers, and instead was measured in tenths or hundredths, perhaps this wouldn’t have been an issue. However, it was nice to be able to look at different optimal solutions which looked very different (whole new lineups).

This model was able to beat my friends about 50% of the time. However, if an appropriate predictive analytics component can be added, as well as using some “subject matter expertise” when analyzing the output, I think there is potential to make some money with this game. Alas, I live in Montana, a state that is banned from participating in the money version of these games, so I will never know.

3. orbythebeach

Thank you for pointing out these omissions, Matthew. Is there a place online where all of them are listed? (Do I have to create an account on one of these sites to see these rules?) I’ll try googling for this and will write a follow-up post showing how to model these extra requirements.

4. orbythebeach

Very good points, Andreas. Thank you!

5. Tallys – I do not know of a central place to find all the different rules. There are MANY sites now (FanDuel, DraftKings, Yahoo, FantasyHub, …..). But you can probably find them easily by googling.

Andreas – due to the uncertainty of the projections — Robust Optimization fits nicely here. But that all depends on which type of game you are playing (head to heads vs tournaments). Obviously, your strategy is different in each case.

6. orbythebeach

Thanks! I’ll try compiling a list with whatever I can find online.

7. JackR

For fantasy baseball, some players can play multiple positions. E.g. Miguel Cabrera can play 1B or 3B. I currently use OpenSolver for DFS and haven’t found a good way to incorporate this into my model. Any ideas?

8. JackR, each player_position should be a separate variable. Then force at most one to be selected. Be careful of some details when doing it this way… like double counting for #players on a team, etc… I even found it easier to model it with an extra player variable = sum{positions} player_position. And then use the player or player_position variables where appropriate.

9. orbythebeach

JackR: Matthew’s answer explains what you need to do. Because I like your question very much, I’ve updated the second part of this post with a detailed explanation of what you need to do to take care of your situation (I refer to it as Requirement 4). Take a look: https://orbythebeach.wordpress.com/2015/09/29/how-to-build-the-best-fantasy-football-team-part-2/

10. Great minds think alike…. how does your approach differ from this one? http://www.opsrules.com/supply-chain-optimization-blog/get-an-edge-in-fantasy-football-with-analytics

11. orbythebeach

Hi Pete. Thank you for the link. Indeed the model described on that page has the same essential ideas that mine has. The main difference I see is that in my post I went all the way down to the detailed model, including all the formulas and a running spreadsheet showing how things are implemented. I also address some extra requirements in the follow-up post (part 2).

12. Jeff

So I reading this and I have looked at your spreadsheet all are great, but I wondering how you would set a lineup based strictly off of data and it not involve salary costs? I ask this because I work more without draftkings, or fanduel and more with just standard leagues.

13. orbythebeach

Hi Jeff. If costs are not part of your version of the problem, you simply remove the constraint that says “don’t spend more money than what you have.” If you give me more details about what you’re trying to do, I can try and explain what changes are needed.

14. Jeff

Thanks for responding back. I only use Yahoo public league that has no salary or auction feature we use standard scoring with the addition of a flex wr/te. I have built a a VBD cheat sheet that goes off of yahoo’s scoring for every position. It worked great but what I couldn’t make it do is use it weekly to decide if I had the best line up a could according to projections given, or if there was a player trending up or down I was missing from historical numbers. So I am wondering if there was a way to take your line builder and use it with projected and historical numbers or one of them to help with a weekly line builder. Now to through this out there I actually did set my line up against Las Vegas projected point odds using Floor, Ceiling avg point and that seemed to work out as well. I hope this helps, let me know if you need more clarification from me.

15. orbythebeach

The model I describe is set up to take a single number for each player to represent his expected performance. From my understanding of what you describe, it looks like you’d be better off running a simulation (there are Excel Add-ins for this, such as Crystal Ball, @Risk, etc.). The idea of the simulation would be to take this same model as a starting point and run it thousands of times, each time changing the players’ expected performance number to a value within the range of values we believe to be adequate for each player. At the end of all these runs you’ll have thousands of “optimal lineups”, one for each combination of input numbers. Then, you could, for example, pick the lineup that most often appeared as the best lineup.