Thursday, November 20, 2008

A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing

(One day when I grow up, I want to write a paper with a title longer than this! :P )

This is a beautiful paper describing why TCP is a bad model for achieving reliable multicast. It is an enormous burden, and sometimes impossible, for the sender to keep state for each receiver in the multicast group. Throw in high churn, with receivers joining and dropping out, and you have an obvious scalability problem. The logical solution is to make the receivers keep the state and make them responsible for noticing and recovering from loss.

In-order delivery is the second major selling point of TCP, and it is inarguably achieved at a high premium. This paper proposes doing away with it and letting the application deal with ordering, using the numbering in application data units (ADUs). A simple but effective solution. If a receiver notices a "hole", it sends out a request to repair that hole. Probabilistic delays before these requests and repair messages guard against an implosion: a receiver that hears someone else's request for the same data can hold back its own.
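To make the hole-and-request idea concrete, here is a rough sketch in Python. It is not the paper's code. The names (Receiver, request_delay, first_requester), the constants c1 and c2, and the choice of scaling the random delay by an estimated distance to the source are all my own assumptions for illustration.

```python
import random


class Receiver:
    """Hypothetical receiver-side state for one source: which ADU numbers arrived."""

    def __init__(self):
        self.received = set()
        self.highest = -1  # highest sequence number seen so far

    def on_data(self, seq):
        """Record an ADU and return the holes newly revealed by its arrival."""
        new_holes = list(range(self.highest + 1, seq))
        self.received.add(seq)
        if seq > self.highest:
            self.highest = seq
        return new_holes

    def holes(self):
        """All sequence numbers up to the highest seen that are still missing."""
        return [s for s in range(self.highest + 1) if s not in self.received]


def request_delay(distance, c1=2.0, c2=2.0, rng=None):
    """Pick a random wait before sending a repair request.

    Assumption: the delay is uniform on [c1 * distance, (c1 + c2) * distance],
    where distance is an estimated one-way delay to the source in seconds.
    """
    rng = rng or random
    return rng.uniform(c1 * distance, (c1 + c2) * distance)


def first_requester(delays):
    """Given {receiver_name: delay}, return the receiver whose timer fires first.

    In this simplified model everyone else hears that request and stays quiet.
    """
    return min(delays, key=delays.get)


if __name__ == "__main__":
    r = Receiver()
    for seq in (0, 3, 1):
        print(seq, "-> new holes:", r.on_data(seq))
    print("still missing:", r.holes())

    rng = random.Random(0)
    delays = {name: request_delay(d, rng=rng)
              for name, d in {"A": 0.01, "B": 0.05, "C": 0.20}.items()}
    print("request sent by:", first_requester(delays))
```

Because the wait scales with distance in this sketch, a receiver close to the source tends to ask first, and the far-away ones usually never need to ask at all.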

I liked the overall design philosophy in this paper: one-size-fits-all is not sensible, and a protocol should not try to do too many things. I had my questions about the scalability of this approach, but I am dead sure that multicast is not (and will not be!) deployed on a large scale anywhere. The largest I can imagine is within an enterprise for streaming talks... and these are not particularly challenged networks. I think SRM should more than do the job. Nice and neat!
