[Developers] EMObject memory management and OD performance

Jaap Glas Jaap.Glas at dgb-group.com
Wed Jul 25 10:04:52 CEST 2007


Dear David Epelboim,

A few additional remarks on your reply to my previous posting about
the performance of OpendTect in relation to EMObject memory management.


> Have you considered that if a horizon takes 45% and grows to 50%, then
> at the time of copying the old array to the new one you need to have
> 95% of your memory allocated?
> 
> I know, it's temporary, because after the copy process has ended you'll
> need only 50%, but in those iterative operations you still need memory
> availability close to 100% many times. Plus the time it takes to copy.

You are right that the computational performance will seriously drop
(swapping/crashing) if one of your horizons covers about half of the
heap space and OpendTect needs to copy it to a bigger array to extend
this horizon. But even if I implement some smart new memory management
strategy where this copying is no longer necessary, there will always
be another horizon (twice as big) that doesn't fit. Scientists tend to
increase the resolution of their data as soon as more storage becomes
available. What I mean to say is that a temporary increase of memory
usage by a factor of two is not our greatest concern.
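
To make those numbers concrete, here is a minimal sketch (plain C++,
not actual OpendTect code) of the grow-by-copy pattern; the comments
mark where the 45% + 50% peak occurs:

    #include <algorithm>
    #include <cstddef>

    float* growArray( float* oldarr, std::size_t oldsz, std::size_t newsz )
    {
        float* newarr = new float[newsz];          // peak: old + new alive
        std::copy( oldarr, oldarr+oldsz, newarr ); // both arrays in memory
        delete[] oldarr;                           // usage drops only here
        return newarr;
    }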


> I'm not in any way involved in coding, so I would like to know what
> you think about using dynamic arrays linked to each other by a pointer
> at the last position. Or using several 2D arrays and "tiling".

This is certainly an option to save some memory, especially if your horizons
are not rectangular but, say, L-shaped. And you are right that it would make
copying unnecessary when a horizon grows. On the other hand, it requires a
lot of extra bookkeeping, and it raises the question of the optimal grain
size of the arrays: small blocks will harm performance, while big blocks
will simply waste memory (and not just temporarily).
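
For completeness, a minimal sketch of such a tiled grid in plain C++.
The class, the 64x64 tile size, the assumption of non-negative indices,
and the use of 1e30 as "undefined" value are all choices I made for this
example; it is not the actual OpendTect implementation:

    #include <map>
    #include <utility>
    #include <vector>

    class TiledGrid
    {
    public:
        // Assumes row and col are non-negative.
        float get( int row, int col ) const
        {
            std::map<Key,Tile>::const_iterator it =
                        tiles_.find( Key(row/tilesz_, col/tilesz_) );
            if ( it == tiles_.end() )
                return 1e30f;                     // "undefined" value
            return it->second[ (row%tilesz_)*tilesz_ + col%tilesz_ ];
        }

        void set( int row, int col, float val )
        {
            Tile& tile = tiles_[ Key(row/tilesz_, col/tilesz_) ];
            if ( tile.empty() )                   // allocate on first use
                tile.resize( tilesz_*tilesz_, 1e30f );
            tile[ (row%tilesz_)*tilesz_ + col%tilesz_ ] = val;
        }

    private:
        typedef std::pair<int,int> Key;
        typedef std::vector<float> Tile;
        static const int tilesz_ = 64;            // the grain-size choice
        std::map<Key,Tile> tiles_;
    };

Growing the horizon now only allocates the new tiles it touches, but
every access pays for a map lookup, which is part of the bookkeeping
cost mentioned above.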

Our general policy is to keep things simple as long as it is not really
necessary to make them complex. And if this simplicity turns out to be a
problem only in specific cases, but not in general, then it is not such
a bad idea to provide the programmer with some "annotation"-like function
calls to actively improve performance in those specific cases. This is
the context in which my previous posting should be read.
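
To illustrate, here is a hypothetical sketch of such an annotation. The
Horizon class below and the name reserveRows() are made up for this
posting; the idea is the same as std::vector::reserve():

    #include <cstddef>
    #include <vector>

    class Horizon
    {
    public:
        Horizon( std::size_t nrcols ) : nrcols_(nrcols) {}

        // Pure hint, exactly like std::vector::reserve(): no visible
        // effect, but later addRow() calls cause no reallocations.
        void reserveRows( std::size_t nrrows )
        { zvals_.reserve( nrrows*nrcols_ ); }

        void addRow( const std::vector<float>& row )
        { zvals_.insert( zvals_.end(), row.begin(), row.end() ); }

    private:
        std::size_t nrcols_;
        std::vector<float> zvals_;
    };

    // Usage: a tracker that knows its target area calls the hint once,
    //     horizon.reserveRows( expectednrrows );
    // and then adds rows without any intermediate copying.

I would like to conclude with a quote by Donald Knuth (paraphrasing
Hoare):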

"We should forget about small efficiencies, say about 97% of the time:
 Premature optimization is the root of all evil. Yet we should not pass
 up our opportunities in that critical 3%."


Best regards,

Jaap Glas.


-- 
-- dr. Jaap C. Glas
-- Software Engineer
-- dGB Earth Sciences
-- Nijverheidstraat 11-2
-- 7511 JM Enschede, The Netherlands
-- jaap.glas at dgb-group.com
-- http://www.dgb-group.com
-- Tel: +31 534315155, Fax: +31 534315104



