Download framework here.
All posts are here:
- Part I – Workers and ParallelWorkers
- Part II – Agents and control messages
- Part III – Default error management
- Part IV – Custom error management
- Part V – Timeout management
- Part VI – Hot swapping of code
- Part VII – An auction framework
- Part VIII – Implementing MapReduce (user model)
- Part IX – Counting words …
Custom error management
In the last part we saw what the framework does by default when an error occurs. That might not be what you want: you might prefer your own, more sophisticated, distributed error detection and recovery algorithm.
To make such a thing possible each agent has a manager. The manager is an agent that gets called whenever an error occurs in the agent it is monitoring.
let manager = spawnWorker (fun (agent, name:string, ex:Exception, msg:obj,
                                state, initialState) -> printfn "%s restarting ..." name; agent <-- Restart)
counter1 <-- SetManager(manager)
Whenever an error is generated the manager receives a tuple of:
(agent, name, exception, message, currentState, initialState)
This manager prints out something and then restarts the agent. Let’s trigger an error by posting the wrong message:
counter1 <-- "afdaf"
counter1 <-- 2
The expectation is that the counter will restart from 0 whenever an error is triggered. This is what happens:
Bob restarting ...
From 0 to 2
Which is what we expected. Obviously this is not a very sophisticated error recovery algorithm. You might want to do something more meaningful. Hopefully you have enough information to build whatever you need.
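If you want to experiment with the idea outside the framework, here is a minimal sketch of the same manager pattern built on the standard F# MailboxProcessor. It is not the framework's implementation: the Message and ErrorReport types, spawnCounter, the Get query and the use of InvalidCastException are all assumptions made for illustration, and the error information travels as a record rather than a tuple. The sketch also assumes that an agent with a manager pauses after an error until it is told to restart, which may not be how the framework behaves.

```fsharp
open System

// Hypothetical message types; the framework's own types may differ.
type Message =
    | Data of obj                                   // ordinary payload
    | Restart                                       // control: go back to the initial state
    | SetManager of MailboxProcessor<ErrorReport>   // control: install a manager
    | Get of AsyncReplyChannel<int>                 // query the current state (demo only)
// The same six pieces of information the manager receives, as a record.
and ErrorReport =
    { Agent : MailboxProcessor<Message>
      Name : string
      Error : exn
      Msg : obj
      State : int
      InitialState : int }

// A counter agent that adds every int it receives to its state.
let spawnCounter (name : string) (initialState : int) =
    MailboxProcessor<Message>.Start(fun inbox ->
        let rec loop state (manager : MailboxProcessor<ErrorReport> option) = async {
            let! message = inbox.Receive()
            match message with
            | Restart -> return! loop initialState manager
            | SetManager m -> return! loop state (Some m)
            | Get reply ->
                reply.Reply state
                return! loop state manager
            | Data (:? int as n) ->
                printfn "From %d to %d" state (state + n)
                return! loop (state + n) manager
            | Data other ->
                match manager with
                | None ->
                    // No manager installed: drop the message and carry on.
                    printfn "%s: dropping unexpected message %A" name other
                    return! loop state manager
                | Some m ->
                    let ex = InvalidCastException(sprintf "unexpected message %A" other) :> exn
                    let report =
                        { Agent = inbox
                          Name = name
                          Error = ex
                          Msg = other
                          State = state
                          InitialState = initialState }
                    m.Post report
                    // Assumption: pause until the manager sends Restart.
                    // Other messages stay queued in their original order.
                    do! inbox.Scan(function
                            | Restart -> Some (async { return () })
                            | _ -> None)
                    return! loop initialState manager
        }
        loop initialState None)

// A manager that prints a line and restarts the failing agent.
let manager =
    MailboxProcessor<ErrorReport>.Start(fun inbox ->
        let rec loop () = async {
            let! report = inbox.Receive()
            printfn "%s restarting ..." report.Name
            report.Agent.Post Restart
            return! loop ()
        }
        loop ())

let counter1 = spawnCounter "Bob" 0
counter1.Post (SetManager manager)
counter1.Post (Data (box 5))
counter1.Post (Data (box "afdaf"))   // wrong message type: triggers the manager
counter1.Post (Data (box 2))
let finalState = counter1.PostAndReply(fun channel -> Get channel)
printfn "Final state: %d" finalState
```

Because the counter waits for the Restart message before reading anything else, the 2 posted after the bad message is applied to the initial state, not to 5. Dropping the pause would make the sketch simpler, but the restart could then race with messages already in the queue.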
A particularly important class of unexpected event is timeouts. We’ll talk about them next.