5/07/2007

why DPF?

This is a topic I posted on Usenet. I hope it draws some detailed discussion so that we can all gain a deeper and broader understanding of this expensive DB2 UDB feature.

Hi gurus, I know many of you are very senior DBAs and experts inside
IBM, so I would really like your advice on this basic topic:
"Why DPF?"

There is a common rumor (well, I believe it's a rumor) that DPF gives
you greater performance, so even for a single server, many IBM
presales people sell the DPF feature with a performance story. And a
DPF license is not cheap :)

I believe DPF is much more for scalability than performance.

I believe you should use DPF only when your data/tables are larger
than a non-DPF system can serve, or when you have to use more than
one server.

What is your opinion? Can you list the reasons you use DPF? Thank
you.

2 comments:

Yonghang Wang said...

With DB2 9 and large rowids, large tablespaces and range partitioning,
scalability is hardly an issue anymore.
It is still correct though that on an SMP box it can be beneficial to
have multiple logical data partitions, rather than using SMP parallelism.
To make a long story short, for a warehouse that requires multiple CPUs
you will see the recommendation of DPF for performance.
For the best practices I recommend looking into Balanced Configuration
Units (BCU):
http://www.db2mag.com/story/showArticle.jhtml?articleID=180206351

.. oh.. and it's not a rumor... for a warehouse, a well-designed DPF
system will achieve near-linear scalability, compared to an SMP approach,
which delivers diminishing returns.

Cheers
Serge
--
Serge Rielau
DB2 Solutions Development
IBM Toronto Lab

Yonghang Wang said...

Yes, Serge. DPF needs more experienced professionals to maintain that
near-linear scalability. In my experience, whenever a customer
complains that DPF delivers worse performance, I tell them it needs
more design and tuning, and then do something to meet their needs.

Actually, we always deliver systems according to the BCU methodology
and get good feedback.

But the question is: with only one server that has many CPUs and
plenty of memory, and if the data/table size is not a limit, what is
the best solution?

To keep it simple: is the advantage of multi-partition over SMP
parallelism (is it certain? is there any direct data or material?)
worth the big effort of DPF design, especially for a firm without
strong skills?