Wednesday, October 7, 2009

Parallel Database Systems

Parallel Database Systems: The Future of High Performance Database Systems
by David DeWitt and Jim Gray

I remember reading this paper years ago. Rereading it right after the MapReduce paper put the parallel DB mindset in a modern light. Anyone who has read the MapReduce paper should read this article too.
MapReduce summary

Read together, these papers give perspective on how the problem has expanded, and how it is being attacked, more than a decade later. The Parallel DB paper goes over three barriers to speedup: startup, interference, and skew. The scale of the problem and the parallelism it needs have increased dramatically since the parallel DB article was written (multi-PB now versus multi-TB then). The paper also showed great foresight about doing computation locally on commodity processors and memory (the shared-nothing architecture).
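To make the three barriers concrete, here is a toy model of elapsed time for a partitioned job. It is only a sketch: the function names, the cost formula, and the parameters (`startup_per_worker`, `interference`) are my own illustrative assumptions, not anything taken from the paper.

```python
def parallel_time(work_per_partition, startup_per_worker=0.0, interference=0.0):
    """Toy elapsed-time model for a job split across workers.

    All parameters are hypothetical knobs chosen for illustration:
      startup_per_worker: fixed cost paid to launch each worker (startup)
      interference: fractional slowdown each extra worker imposes on the
                    others through shared resources (interference)
    Skew shows up because the slowest partition determines the finish time.
    """
    n = len(work_per_partition)
    startup = startup_per_worker * n
    slowest = max(work_per_partition)          # skew: the job waits for the straggler
    slowdown = 1.0 + interference * (n - 1)    # interference grows with worker count
    return startup + slowest * slowdown


def speedup(total_work, work_per_partition, **costs):
    """Speedup relative to one worker doing all the work with no overhead."""
    return total_work / parallel_time(work_per_partition, **costs)


ideal = speedup(100, [25, 25, 25, 25])    # even split, no overhead
skewed = speedup(100, [40, 20, 20, 20])   # one partition holds 40% of the work
```

With an even four-way split the model gives linear speedup, while the skewed split is limited by the 40-unit partition no matter how many workers are added.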

MapReduce has attempted to formalize and attack these problems inside the framework, so that the requester does not have to deal with them when submitting a job. MapReduce can be thought of as a type of parallel SQL. MapReduce attacks each of the barriers to speedup: (see MapReduce summary)
Startup: MapReduce provides a streamlined, general-purpose framework.
Interference: The MapReduce workers are distributed evenly enough and placed as close as possible to the data they operate on.
Skew: MapReduce deals with the straggler problem by launching backup copies of slow tasks.
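
For readers who have not seen the programming model, here is a minimal single-process sketch of the map, shuffle, and reduce steps, using word count as the job. It is roughly what `SELECT word, COUNT(*) ... GROUP BY word` would express in SQL. The helper names (`shuffle`, `run_job`, and so on) are hypothetical and chosen for illustration; a real MapReduce system runs these phases across many machines and handles scheduling and failures itself.

```python
from collections import defaultdict


def word_map(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def word_reduce(word, counts):
    """Reduce: combine all values emitted for one key."""
    return sum(counts)


def shuffle(pairs):
    """Group intermediate values by key, like a GROUP BY."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def run_job(lines, map_fn=word_map, reduce_fn=word_reduce):
    """Run map, shuffle, and reduce in sequence in a single process."""
    intermediate = (pair for line in lines for pair in map_fn(line))
    groups = shuffle(intermediate)
    return {key: reduce_fn(key, values) for key, values in groups.items()}


counts = run_job(["the quick fox", "The lazy dog", "the end"])
```

The requester only writes `word_map` and `word_reduce`; everything in `run_job` stands in for the part the framework takes over.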

Analysis: The Parallel DB paper tried to look ahead at the evolution of high performance DB systems. The authors did not, and could not, have foreseen the exponential expansion of the internet. The scale of demand on DB systems requires a change in mindset. Traditionally, RDBMS speed has been improved behind the scenes while keeping every constraint intact. Today, we accept that there are fundamental tradeoffs (see CAP Theorem). By loosening some of the constraints (such as strict consistency), we can get the performance that modern distributed storage systems provide.
