Monday, April 27, 2009


Problem: Distributing a large files to multiple users can overload the resources of a hosting machine.

Solution: BitTorrent is a peer-to-peer protocol that lets peers download pieces of the file from various peers and upload other pieces to them. This redistributes the cost of upload to the downloaders. Some of the challenges include figuring out which peers have what parts of the file and high churn rates. The rarest first technique and the choking and unchoking algorithms allow BitTorrent efficiency to mitigate the challenges.
Rarest First: Replicating the rarest pieces as quickly as possible reduces the risk of them getting completely lost as current peers stop uploading. It also makes sure that pieces which are more common are left for later, so the likelihood that a peer which currently is offering upload will later not have anything of interest is reduced.
Chocking/Unchocking: A good choking algorithm should utilize all available resources, provide reasonably consistent download rates for everyone, and be somewhat resistant to peers only downloading and not uploading. To avoid situations in which resources are wasted by rapidly choking and unchoking peers, BitTorrent peers recalculate who they want to choke once every ten seconds, and then leave the situation as is until the next ten second period is up. Ten seconds is a long enough period of time for TCP to ramp up new transfers to their full capacity.

Analysis: If there are few leechers, BitTorrent works very well. However, if a substantial number of peers tried to download without uploading, there would be problems. Not many users are very abusive today which is a real world demonstration of the prisoner’s dilemma.

1 comment:

  1. One interesting question here is how much would one gain from using a distributed reputation scheme instead of a one-to-one reputation scheme. For example, the tracker can collect information about the node behavior and disseminate this information to other nodes. As a result, a malicious node would not get a chance to take advantage of every "good" node it contacts. Of course, this rises other problems such as how does the tracker trusts the information it gets from nodes, but in principle such scheme could work well even in scenarios where the number of malicious nodes is high.