Project Scheduling using Solver Foundation

I'd like to work through a few examples of how to model real problems using Solver Foundation.  If you download the Solver Foundation Express Edition, you can follow along: start Excel, click the Solver Foundation tab, then click on the Model button to get a blank modeling pane.

I spent my first five years at Microsoft on the Project team, writing the project scheduling engine for Project Server.  Project is a great team and an awesome product, and I still have a soft spot for project scheduling.  My intent is not to reinvent Project - just to make a connection with a familiar problem so I can show off the Solver Foundation platform while introducing some key concepts. For those who don't know Project, the goal is to come up with a schedule for a list of tasks in a project.  Here are a few of the basics:

  • Tasks have a start, finish, and duration.
  • No task starts before a given project start date.
  • The finish date is computed as start + duration.
  • A pair of tasks can be linked by precedence relationships.  When there is a link from a task A to a task B, the B's start must occur on or after A's finish date.
  • The project finish date is defined as the latest finish date of any task in the project.

Given these requirements, we would like to compute the start and finish dates for all of the tasks in the project, and we want to find a schedule ends as soon as possible.  Let's use the Solver Foundation to create a model in OML (Solver Foundation's declarative language) and compute some schedules.  I always find it helpful to work out an example before I try and model a problem.  Here's a simple schedule in Project with five tasks, and two links that connect t0, t1, t2.  To compute the schedule by hand I could use a simple critical path scheduling algorithm.

A simple project 

Right away you can spot a few issues:

  • I haven't taken calendars into account - normally I don't want to schedule work on Saturdays and Sundays.
  • I haven't taken resources into account - in real life, people (or machines, or rooms) are assigned to work on tasks, and their availability may affect my schedule.
  • There are lots of cool Project features that I haven't taken into account - e.g. task constraints, summary tasks, etc.
  • I haven't taken uncertainty into account - in real life, durations aren't known precisely.

I will revisit some of these points in future posts - these are all things that I can address with Solver Foundation.  For now, I am going to stick with a simple version of the problem.  I'd rather get a correct model on a simplified version of the problem than an incorrect model on a more complex one!  In this sense, modeling is no different from writing an essay or designing software.

There are a few questions to ask yourself when building an OML model:

  • What are the key objects or entities that I will be working with? (tasks, links)
  • What do I want the solver to compute? (start and finish dates for tasks)
  • What do I know?  (the durations on the tasks)
  • What are my restrictions? (links enforce an ordering on tasks, start + duration = finish)
  • What do I want to optimize? (the finish date on the project)

If you have good answers to these questions, and start simple, you can usually come up with an OML model that describes the problem.  Let's start by modeling the start and finish dates of the tasks, since that is ultimately what we want to find out.  Each task has a start and a finish - these are dates.  Solver Foundation does not directly support date values, so let's say that start and finish are real numbers, representing the number of days since the project start date.  To express the idea that each start and finish is associated with a task, we define a "set" Parameter.  The start and finish dates are the values that we want the solver to figure out, so these are Decisions:

 Model[
 Parameters[Sets, Tasks],  
 
 Decisions[
     Reals[0, Infinity], Start[Tasks], Finish[Tasks]
  ],

Notice the syntax for Decisions: the first argument indicates the possible values (domain) for the decisions.  We could have used Integers[0, Infinity] for the first argument, but then we would not be able to model fractional days. Loosely speaking, the start and finish dates are the "output".  What's input?  The durations.  Let's suppose that the durations are entered into our spreadsheet like this:

Task Dur
t0 1
t1 2
t2 3
t3 2
t4 4

We will define Duration as a Parameter so that we can use data binding to determine the values.  Just like Start and Finish, Duration is indexed by Task. 

 Parameters[Reals, Duration[Tasks]],

After entering the duration data into Excel I can use the Binding button to add a binding from the range B2:B6 to Duration.  When I solve the model, Solver Foundation will use whatever values are entered in the spreadsheet.  But let's not get ahead of ourselves...next let's express the relation between Start, Finish, and Duration.  Relationships that need to be enforced are described using Constraints.  OML provides a bunch of operators and expressions for describing constraints.  In this case, we have one constraint per task.  This is easy to express using Foreach:

  Constraints[
    Foreach[{t, Tasks}, Start[t] + Duration[t] == Finish[t] ]
 ] 

We almost have a working model - the last thing we need to express is our Goal: that we want the project to finish as soon as possible.  So let's introduce ProjectFinish as another decision:

  Decisions[
     Reals[0, Infinity], Start[Tasks], Finish[Tasks], ProjectFinish
  ]

The ProjectFinish is the maximum finish date of any task in the model.  In other words, it is at least as big as the Finish date of each task, so we add this constraint.

   Foreach[{t, Tasks}, ProjectFinish >= Finish[t] ]

Now we can express our goal: minimize the project finish:

   Goals[ Minimize[finish -> ProjectFinish ] ]

The "->" syntax simply gives a label to our goal, it's optional.  Here's the complete model:

 Model[
  Parameters[Sets, Tasks],  
  Parameters[Reals, Duration[Tasks]],

  Decisions[
     Reals[0, Infinity], Start[Tasks], Finish[Tasks], ProjectFinish
  ],

  Constraints[
    Foreach[{t, Tasks}, Start[t] + Duration[t] == Finish[t] ],
    Foreach[{t, Tasks}, ProjectFinish >= Finish[t] ],
  ],
 
  Goals[ Minimize[ finish -> ProjectFinish ]  ]
]

If you have entered this into the modeling pane, and used the Binding button to define the data binding between the spreadsheet data and Duration, you can hit Solve and observe the Start and Finish dates.  You will see something like this:

Solver Foundation Results 

If you want to turn the starts and finishes into dates, you can enter a formula like "=now() + C5" in the adjacent column.  Actually I put this formula in a column next to my duration values, so I can see the durations, starts, and finishes all in one place.

You should be thoroughly unimpressed by this model (as I am ;)), because it doesn't do all that much.  All I'm doing is letting Solver Foundation compute finish = start + duration, which I could have done very easily in Excel.  I need to model "links": a link causes the start date of the successor task to be later than the finish of the predecessor task.  If you've made it this far, you have all the building blocks you need to model this in OML.  Feel free to post your solution here, or just wait until tomorrow for my solution.

Extra credit: if you use Project, you know that you can define "constraints" on tasks.  For example, I can double click on a task and say that a task must "Start No Earlier Than March 30, 2009".  These constraints need to be honored, in addition to links.  How can I model SNET constraints using OML?