Research Assistant Professor position at University of Miami’s School of Business

My department is hiring! :-) See details below.

The Management Science Department at the University of Miami’s School of Business Administration invites applications for a non-tenure-track Research Assistant Professor position to begin in the Fall of 2015. The Management Science Department is a diverse group of faculty with expertise in several areas within Operations Research and Analytics, including statistics and machine learning, optimization, simulation, and quality management. Duties will include research, teaching at both the graduate and undergraduate levels, and advising undergraduate students seeking majors/minors in Management Science or Business Analytics.

Applicants should possess a PhD in operations research or a related discipline by the start date of employment. Applications should be submitted by e-mail to facultyaffairs@bus.miami.edu, and should include the following: a curriculum vitae, up to three representative publications, brief research and teaching statements, an official graduate transcript, information about teaching experience and performance evaluations, and three letters of recommendation. Applications will be reviewed as they arrive. The position will remain open until filled.

The University of Miami offers a comprehensive benefits package including medical and dental benefits, tuition remission, vacation, paid holidays, and much more. The University of Miami is an Equal Opportunity/Affirmative Action Employer.

Filed under Uncategorized

Tenure-Track Position in Big Data Analytics, University of Miami, School of Business

I’m very happy to announce that the School of Business at the University of Miami is hiring in my department! Details below. This is an exciting time to be involved in Business Analytics!

Tenure-Track Faculty Position in Management Science (Big Data Analytics)

The Management Science Department at the University of Miami’s School of Business Administration invites applications for a tenure-track faculty position at the junior or advanced Assistant Professor level to begin in the Fall of 2015. Exceptional candidates at higher ranks will be considered subject to additional approval from the administration. Salaries are extremely competitive and commensurate with background and experience. This is a nine-month appointment but generous summer research support is anticipated from the School of Business.

Applicants with research interests in all areas of Analytics will be considered, although primary consideration will be given to those with expertise in Big Data Analytics and the computational challenges of dealing with large data sets. Expertise in, or experience with, one or more of the following is particularly welcome: MapReduce/Hadoop, Mahout, Cassandra, cloud computing, mobile/wearable technologies, social media analytics, recommendation systems, data mining and machine learning, and text mining. The Management Science Department is a diverse group of faculty with expertise in several areas within Operations Research and Analytics, including statistics and machine learning, optimization, simulation, and quality management. Duties will include research and teaching at the graduate and undergraduate levels.

Applicants should possess, or be close to completing, a PhD in computer science, operations research, statistics, or a related discipline by the start date of employment. Applications should be submitted by e-mail to facultyaffairs@bus.miami.edu, and should include the following: a curriculum vitae, up to three representative publications, brief research and teaching statements, an official graduate transcript (for the junior Assistant Professor level), information about teaching experience and performance evaluations, and three letters of recommendation. All applications completed by December 1, 2014 will receive full consideration, but candidates are urged to submit all required material as soon as possible. Applications will be accepted until the position is filled.

The University of Miami offers a comprehensive benefits package including medical and dental benefits, tuition remission, vacation, paid holidays, and much more. The University of Miami is an Equal Opportunity/Affirmative Action Employer.

Filed under Analytics

Removing Ligatures in HTML Files Generated from LaTeX Files

I recently had to convert a LaTeX document to HTML and, after looking into several alternatives, decided to go with htlatex. Because my document contains accented characters, I chose to use the UTF-8 encoding as that seems to be the trend. To convert a LaTeX source file called file.tex you can issue the command below, which will create two main files: file.css and file.html (warning: the space before -cunihtf is a must):

htlatex file.tex "xhtml,charset=utf-8" " -cunihtf -utf8"

Overall, I’m very happy with the results produced by htlatex. Nevertheless, as I loaded file.html on my iPhone, I noticed that mobile Safari does not render all ligatures properly. For example, it has no problem with the ‘fi’ ligature, but it displays a hollow square in place of the characters for ‘ff’ and ‘ffi’ ligatures. I have not tested other mobile browsers, so I’m not sure if this is only an issue with mobile Safari. Safari on my desktop computer does not exhibit this problem.

To be safe, I thought I’d be better off removing all ligatures from the HTML file, which led me to search around for their UTF-8 codes and to write a little command-shell script that uses Perl to perform the task. Since this might turn out to be useful to someone else out there, I decided to post my shell script here. Use it at your own risk and enjoy!

perl -pi -e 's/\xef\xac\x80/ff/g' file.html
perl -pi -e 's/\xef\xac\x81/fi/g' file.html
perl -pi -e 's/\xef\xac\x82/fl/g' file.html
perl -pi -e 's/\xef\xac\x83/ffi/g' file.html
perl -pi -e 's/\xef\xac\x84/ffl/g' file.html
perl -pi -e 's/\xc5\x92/OE/g' file.html
perl -pi -e 's/\xc5\x93/oe/g' file.html
perl -pi -e 's/\xc3\x86/AE/g' file.html
perl -pi -e 's/\xc3\xa6/ae/g' file.html
perl -pi -e 's/\xef\xac\x86/st/g' file.html
perl -pi -e 's/\xc4\xb2/IJ/g' file.html
perl -pi -e 's/\xc4\xb3/ij/g' file.html
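
For readers who would rather not call Perl twelve times, here is a minimal Python sketch of the same idea. It is only an illustration: the function names are made up for this example, and it assumes the HTML file is valid UTF-8. It works on Unicode code points instead of raw bytes, but it targets the same twelve characters as the script above.

```python
# Map each Latin ligature code point to its plain-letter expansion.
# These are the same twelve characters handled by the Perl one-liners.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb06": "st",
    "\u0152": "OE",
    "\u0153": "oe",
    "\u00c6": "AE",
    "\u00e6": "ae",
    "\u0132": "IJ",
    "\u0133": "ij",
}

# str.maketrans accepts a dict from single characters to replacement strings.
_TABLE = str.maketrans(LIGATURES)


def remove_ligatures(text: str) -> str:
    """Return text with every ligature above replaced by plain letters."""
    return text.translate(_TABLE)


def remove_ligatures_in_file(path: str) -> None:
    """Rewrite a UTF-8 file in place (hypothetical helper, like perl -pi)."""
    with open(path, encoding="utf-8") as handle:
        cleaned = remove_ligatures(handle.read())
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(cleaned)


print(remove_ligatures("o\ufb03ce \ufb02oor"))
```

One caveat that applies to both versions: characters such as æ and œ are ordinary letters in some languages, so you may want to drop those entries if your document contains, say, French or Danish text.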

By the way, I’m only concerned with Latin ligatures, but you can find UTF-8 codes for other ligatures on this page. Bonus: here’s another useful article related to this topic: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).

Filed under Tips and Tricks

The First Sentence of the Great Analytics Novel

I’ve written many times before about the importance of promoting O.R. to the general public. One of the ideas that’s been suggested by several people is the possibility of writing a work of fiction whose main character (our hero) is an O.R./Analytics person. I still believe this is a great idea, if executed properly.

Today, my wife brought to my attention The Bulwer-Lytton Fiction Contest, which, according to their web page, consists of the following:

Since 1982 the English Department at San Jose State University has sponsored the Bulwer-Lytton Fiction Contest, a whimsical literary competition that challenges entrants to compose the opening sentence to the worst of all possible novels. The contest (hereafter referred to as the BLFC) was the brainchild (or Rosemary’s baby) of Professor Scott Rice, whose graduate school excavations unearthed the source of the line “It was a dark and stormy night.” Sentenced to write a seminar paper on a minor Victorian novelist, he chose the man with the funny hyphenated name, Edward George Bulwer-Lytton, who was best known for perpetrating The Last Days of Pompeii, Eugene Aram, Rienzi, The Caxtons, The Coming Race, and – not least – Paul Clifford, whose famous opener has been plagiarized repeatedly by the cartoon beagle Snoopy. No less impressively, Lytton coined phrases that have become common parlance in our language: “the pen is mightier than the sword,” “the great unwashed,” and “the almighty dollar” (the latter from The Coming Race, now available from Broadview Press).

Just as an awful first sentence can be a good indicator of a terrible book, a great first sentence can be a good indicator of a great one. Take, for example, the first sentence of Stephen King’s The Dark Tower series, which I happen to be reading (and loving) as we speak:

The man in black fled across the desert, and the gunslinger followed.

It’s such a strong, mysterious, and captivating sentence…

…which brings me to the point of this post. If it’s going to be difficult to write The Great Analytics Novel, what if we start by thinking about what would be the perfect, most compelling sentence to start such a novel? Yes, I propose a contest. Let’s use our artistic abilities and suggest starting sentences. Feel free to add them as comments to this post. Who knows? Maybe someone will get inspired and start writing the novel.

Here’s mine:

Upon using the word “mathematical” he knew he had lost the battle for, despite the dramatic cost savings, their logical reasoning was instantly halted, like a snowshoe hare frozen in fear of its chief predator: the Canada lynx.

I can’t wait to read your submissions!

Filed under Analytics, Books, Challenge, INFORMS Public Information Committee, Motivation, Promoting OR

Semantic Typing: When Is It Not Enough To Say That X Is Integer?

Andre Cire, John Hooker, and I recently finished a paper on an interesting, and somewhat controversial, topic that relates to high-level modeling of optimization problems. The paper is entitled “Modeling with Metaconstraints and Semantic Typing of Variables”, and its current version can be downloaded from here.

Here’s the abstract:

Recent research in the area of hybrid optimization shows that the right combination of different technologies, which exploits their complementary strengths, simplifies modeling and speeds up computation significantly. A substantial share of these computational gains comes from better communicating problem structure to solvers. Metaconstraints, which can be simple (e.g. linear) or complex (e.g. global) constraints endowed with extra behavioral parameters, allow for such richer representation of problem structure. They do, nevertheless, come with their own share of complicating issues, one of which is the identification of relationships between auxiliary variables of distinct constraint relaxations. We propose the use of additional semantic information in the declaration of decision variables as a generic solution to this issue. We present a series of examples to illustrate our ideas over a wide variety of applications.

Optimization models typically declare a variable by giving it a name and a canonical type, such as real, integer, binary, or string. However, stating that variable x is integer does not indicate whether that integer is the ID of a machine, the start time of an operation, or a production quantity. In other words, variable declarations say little about what the variable means. In the paper, we argue that giving a more specific meaning to variables through semantic typing can be beneficial for a number of reasons. For example, let’s say you need an integer variable x_j to represent the machine assigned to job j. Instead of writing something like this in your modeling language (e.g. AMPL):

var x{j in jobs} integer;

it would be beneficial to have a language that allows you to write something like this

x[j] is which machine assign(job j);

To see why, take a look at the paper ;-)
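
In the meantime, here is a loose analogy in Python. It is not the paper's syntax and it says nothing about the solver-side benefits discussed there; it only illustrates the point that "integer" alone does not say what a value means. The type names MachineId and StartTime and the helper assign_machine are made up for this sketch.

```python
from typing import Dict, NewType

# Two hypothetical semantic types that are both plain integers underneath.
MachineId = NewType("MachineId", int)
StartTime = NewType("StartTime", int)


def assign_machine(assignment: Dict[str, MachineId], job: str, machine: MachineId) -> None:
    """Record which machine a job is assigned to."""
    assignment[job] = machine


assignment: Dict[str, MachineId] = {}
assign_machine(assignment, "job_1", MachineId(3))

# A static type checker such as mypy would reject the call below, because a
# StartTime is not a MachineId, even though both are integers at run time:
# assign_machine(assignment, "job_2", StartTime(3))

print(assignment)
```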

Filed under Modeling, Research

Optimally Resting NBA Players

To celebrate the start of the 2013-2014 NBA season this past Tuesday, I decided to write a post on basketball. More specifically, on the important issue of how to give players some much needed rest in an “optimal” way. My inspiration came from an article by Michael Wallace published on ESPN.com on October 19. Here are some relevant excerpts:

After playing in the Miami Heat’s first five preseason games, LeBron James sat out Saturday night’s 121-96 victory over the San Antonio Spurs to rest… James said the decision to sit was part of the team’s “maintenance” process. Heat teammate Dwyane Wade played Saturday and scored 25 points in 26 minutes, but previously skipped three preseason games… “No, no injuries – just not suiting up,” James said. “It’s OK for LeBron to take one off.”

The key term here is maintenance process. You may also recall that, back in November 2012, the Spurs were fined $250,000 by the league after coach Popovich sent Duncan, Parker, Ginobili, and Green home right before a game against the Miami Heat.

So we want to rest our players to keep them healthy, but this cannot come at the expense of losing games. There are many factors to be taken into account here, such as players’ current physical condition, strength and tightness of schedule, and match-ups (how well a team stacks up against another team), to name a few. This is definitely not an easy problem. However, some insight is better than no insight at all. Therefore, let’s see what we can do with a simple O.R. model, and then we can talk about the strengths and weaknesses of our initial approach. (Here’s where you, dear reader, are supposed to chime in!)

Let’s begin with two simple assumptions: (i) when it comes to resting, we have to take players’ individual needs into account, i.e., we’ll use player-specific data; and (ii) when it comes to the likelihood of beating an opposing team, it’s better to think in terms of full lineups, rather than in terms of individual players, i.e., we’ll use lineup-specific data. The data in assumption (i) comes from doctors, players’ medical records, and coaches’ strategies. In essence, it boils down to one number: how many minutes, at most, should each player play in each game, under ideal circumstances. A useful measure of the strength of a lineup is its adjusted plus-minus score (see, for example, the work of Wayne Winston and his book Mathletics). In summary, it’s a number that tells you how many points a given lineup plays above (or below) an average lineup in the league over 48 minutes (or over 100 possessions, or another metric of reference).

For the sake of explanation, I’ll pretend to be in charge of resting Miami Heat players (surprise!). I’ll refer to a generic lineup by the letter i (i=1,\ldots,8), to a generic player by the letter j (j= LeBron, D-Wade, …, Andersen (Bird Man)), and to a generic game by the letter k.

We’re now ready to begin. Fasten your seat belts!

What are the decisions to be made? Let’s consider a planning horizon that consists of the next 7 games (or pick your favorite number). So k=1,\ldots,7. For the Heat, the first 7 games of the 2013-2014 season are against the following teams: Bulls, 76ers, Nets, Wizards, Raptors, Clippers, and Celtics. For each one of my potential lineups i and each game k, I want to figure out the number of minutes I should use lineup i during game k. Because this is an unknown number right now, it’s a variable in the model. Let’s call it x_{ik}. Note it’s also OK to think of x_{ik} as a percentage of the game, rather than minutes. I’ll adopt the percentage interpretation, so x_{ik}=0.25 means that lineup i plays 25% of game k.

What are the constraints in this problem? There are three main constraints to worry about: (a) make sure to pick enough lineups to play each game in its entirety; (b) make sure your lineups are good enough to hopefully beat your opponents in each game; (c) keep track of players’ minutes, and don’t let them get out of hand. The next step is to represent each constraint mathematically.

Constraint (a): Pick enough lineups to completely cover each game. For every game k, we want to impose the following constraint:

\displaystyle \sum_{i=1}^{8} x_{ik}=1

This means that if we sum the percentage of time each lineup is used during game k, we reach 100%.

Constraint (b): Choose your lineups so that you expect to score enough points in every game to beat your opponents. In this example, I’ll focus on plus-minus scores, but as a coach you could focus on any metric that matters to you. Given a lineup i, let p_i be its adjusted plus-minus score. For example, the lineup of LeBron, Wade, Bosh, Chalmers, and Allen in the 2012-2013 season had the amazing p_i score of +36.9 (you can obtain these numbers, and many other neat statistics, from the web site stats.nba.com). Now let’s say you have the plus-minus score of your opponent in game k, which we’ll call P_k. One way to increase your chances of victory is by requiring that the expected plus-minus score of your lineup combination in game k exceed P_k by a certain amount. Therefore, for every game k, we write the following constraint:

\displaystyle \sum_{i=1}^{8} p_i x_{ik} \geq P_k + 0.5

I want to emphasize two things. First, p_i can be any measure of goodness of your lineup, and it can take into account the specific opponent in game k. Likewise, P_k can be any measure of goodness of team k, as long as it’s consistent with p_i. Second, you’re not restricted to having only one of these constraints. If many measures of goodness matter to you, add them all in. For example, if you’re playing a team that’s particularly good at rebounding and you believe that rebounding is the key to beating them (e.g. Heat vs. Pacers), then either replace the constraint above with the analogous rebounding version, or include the rebounding version in addition to the constraint above. Finally, note that I picked 0.5 as a fixed amount by which to exceed P_k, but it could be any number you wish, of course. It can even be a number that varies depending on the opponent.
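
To make constraints (a) and (b) concrete, here is a small Python sketch that checks them for a single game. All the numbers are made up for illustration, except the +36.9 score quoted above; the function names are also just placeholders. It only verifies a given choice of lineup percentages, it does not solve the model.

```python
# Hypothetical data for one game k; only the +36.9 score comes from the post.
plus_minus = {1: 36.9, 2: 10.0, 3: -2.0}   # p_i for three lineups
usage = {1: 0.5, 2: 0.25, 3: 0.25}         # x_ik, share of game k per lineup
opponent_score = 5.0                       # P_k (made up)
margin = 0.5                               # amount by which to exceed P_k


def covers_game(usage, tol=1e-9):
    """Constraint (a): the lineup shares must add up to one full game."""
    return abs(sum(usage.values()) - 1.0) <= tol


def expected_plus_minus(plus_minus, usage):
    """Left-hand side of constraint (b): sum of p_i * x_ik over lineups."""
    return sum(plus_minus[i] * usage[i] for i in usage)


def strong_enough(plus_minus, usage, opponent_score, margin, tol=1e-9):
    """Constraint (b): expected plus-minus is at least P_k + margin."""
    lhs = expected_plus_minus(plus_minus, usage)
    return lhs >= opponent_score + margin - tol


print(covers_game(usage))
print(round(expected_plus_minus(plus_minus, usage), 2))
print(strong_enough(plus_minus, usage, opponent_score, margin))
```

With these made-up numbers the expected plus-minus is 0.5(36.9) + 0.25(10.0) + 0.25(-2.0) = 20.45, comfortably above 5.0 + 0.5.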

Constraint (c): Keep track of how many minutes your players are playing above and beyond what you’d like them to play. For any given player j and any given game k, let m_{jk} be j’s ideal number of playing minutes in game k (make it zero if you want the player to sit out). When it’s not possible to match m_{jk} exactly, we need to know how many minutes player j played under or over m_{jk}. Let’s call these two unknown numbers (variables) u_{jk} and o_{jk}, respectively. So, for every player j and game k, we write the following constraint:

\displaystyle 48\left(\sum_{i \text{ that includes } j} x_{ik}\right) + u_{jk} - o_{jk}=m_{jk}

The expression “i that includes j” under the summation means that we’re summing variables x_{ik} for all lineups of which j is a member. We’re multiplying the summation by 48 minutes because x_{ik} is a percentage of the game (written as a fraction, so 100% is 1) and m_{jk} is in minutes.
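
Here is a short Python sketch of the bookkeeping in constraint (c) for one game. The lineups, percentages, and ideal minutes are all hypothetical. For a fixed choice of x_{ik}, it computes the smallest non-negative u_{jk} and o_{jk} that satisfy the equation, which are the values a solver would typically settle on when overage is being minimized.

```python
MINUTES_PER_GAME = 48

# Hypothetical lineups, usage shares, and ideal minutes for a single game k.
lineups = {
    1: {"LeBron", "Wade", "Bosh", "Chalmers", "Allen"},
    2: {"LeBron", "Bosh", "Chalmers", "Allen", "Battier"},
    3: {"Wade", "Cole", "Allen", "Battier", "Andersen"},
}
usage = {1: 0.5, 2: 0.25, 3: 0.25}                 # x_ik
ideal = {"LeBron": 32, "Wade": 36, "Battier": 30}  # m_jk


def minutes_played(player, lineups, usage):
    """48 times the sum of x_ik over the lineups that include the player."""
    share = sum(usage[i] for i, members in lineups.items() if player in members)
    return MINUTES_PER_GAME * share


def under_over(player, lineups, usage, ideal):
    """Smallest (u_jk, o_jk) >= 0 with played + u_jk - o_jk = m_jk."""
    played = minutes_played(player, lineups, usage)
    under = max(0.0, ideal[player] - played)
    over = max(0.0, played - ideal[player])
    return under, over


for player in ideal:
    played = minutes_played(player, lineups, usage)
    print(player, played, under_over(player, lineups, usage, ideal))
```

In this example LeBron appears in lineups 1 and 2, so he plays 48(0.5 + 0.25) = 36 minutes, which is 4 minutes over his ideal 32. Battier plays 24 minutes, 6 under his ideal 30.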

What is our goal? (a.k.a. objective function) It’s simple: we don’t want players to play too many minutes above m_{jk}. Because this overage amount is captured by variable o_{jk}, we can write our goal as:

\displaystyle \text{minimize } \sum_{j=1}^{9} \sum_{k=1}^{7} o_{jk}

This minimizes the total overage in playing minutes. For a more balanced solution, it’s also possible to minimize the maximum overage over all players, or add weights in front of the o_{jk} variables to give preference to some players.
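
The three objective variants can be compared on a given solution with a few lines of Python. The overage values and weights below are invented for illustration. For the min-max variant I am assuming that "maximum overage over all players" means each player's total overage across the planning horizon; taking the maximum over individual games would be an equally reasonable reading.

```python
# Hypothetical overage minutes o_jk, keyed by (player, game).
overage = {
    ("LeBron", 1): 4.0, ("LeBron", 2): 0.0,
    ("Wade", 1): 0.0, ("Wade", 2): 6.0,
    ("Bosh", 1): 2.0, ("Bosh", 2): 1.0,
}
weights = {"LeBron": 1.0, "Wade": 2.0, "Bosh": 1.0}  # made-up priorities


def total_overage(overage):
    """Objective from the post: sum of o_jk over all players and games."""
    return sum(overage.values())


def max_player_overage(overage):
    """Min-max variant: largest total overage of any single player."""
    per_player = {}
    for (player, _game), minutes in overage.items():
        per_player[player] = per_player.get(player, 0.0) + minutes
    return max(per_player.values())


def weighted_overage(overage, weights):
    """Weighted variant: each player's overage scaled by a priority weight."""
    return sum(weights[player] * minutes
               for (player, _game), minutes in overage.items())


print(total_overage(overage))
print(max_player_overage(overage))
print(weighted_overage(overage, weights))
```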

Now what? Well, the next step would be to solve this model and see what happens. I created a Microsoft Excel spreadsheet that can be solved with Excel Solver or OpenSolver. You can download it from here. Feel free to adapt it to your own needs and play around with it (this is the fun part!). Because my model was limited in size (I can’t use OpenSolver on my Mac at home), the solution isn’t very good (too many overage minutes). However, by adding more players and more lineups, the quality should improve (use OpenSolver to break free from limits on model size). Here are some notes to help you understand the spreadsheet:

  • Variables x_{ik} are in the range B18:H25.
  • Variables u_{jk} and o_{jk} are in ranges B56:J62 and B65:J71, respectively.
  • Constraints (a) are implemented in rows 27, 28, 29.
  • Constraints (b) are implemented in rows 33, 34, 35.
  • The left-hand sides of constraints (c) are in the range B74:J80. This range is required to be equal to the range B47:J53 (where the m_{jk} are) inside the Solver window.
  • The objective function whose formula appears above is in cell J21.

What are the pros and cons of this model? Can you make it better? No model is perfect. There are always real-life details that get omitted. The art of modeling is creating a model that is detailed enough to provide useful answers, but not too detailed to the point of requiring an unreasonable amount of time to solve. The definitions of “detailed enough” and “unreasonable amount of time” are mostly client-specific. (What would please Erik Spoelstra and his coaching staff?) What do you think are the main strengths and weaknesses in the model I describe above? What would you change? Good data is a big issue in this particular case. If you don’t like my data, can you propose alternative sources that are practical? I believe there’s plenty to talk about in this context, and I’m looking forward to receiving your feedback. Maybe we can converge to a model that is good enough for me to go knocking on the Miami Heat’s door! (Don’t worry. In the unlikely event they open the door, I’ll share the consulting fees.)

Filed under Applications, Linear Programming, Modeling, Motivation, Sports

INFORMS 2013: Sunday Nuggets

I just published my Sunday blog post at this year’s INFORMS Annual Conference, which is taking place in Minneapolis, MN. Here’s a link to it. Enjoy!

Filed under INFORMS