[Developers] EMObject memory management and OD performance

Mon Jul 23 11:20:08 CEST 2007

Dear all,

This note is for those still wondering why operations on e.g. horizons can
sometimes become really slow, and for everyone interested in the internals
of OpendTect.

PERFORMANCE OF OPENDTECT IN RELATION TO ITS EMOBJECT MEMORY MANAGEMENT

The usual way to store an EMObject that is not too sparsely distributed
in geometrical space is to keep it in one or more n-dimensional arrays
in memory. This guarantees quick random access to individual geometrical
positions. For example, a 3D horizon will be stored in a 2-dimensional
array, or in several if it consists of more than one section.

The strategy of OpendTect is to keep the sizes of those arrays as small
as possible to restrict memory usage. We will not reserve an array that
covers the whole survey if the EMObject is currently located in a small
part only. The disadvantage of this strategy is, of course, that a growing
object may no longer fit its array. What happens if the function
EMObject::setPos(,,) needs to add a new position outside the area covered
by the current array? A bigger array is allocated, and the contents of the
old array have to be copied to the new array before the old one can be
released. This takes time, and its duration is linearly proportional to
the current size of the object. However, not wasting valuable memory makes
it worth the effort.
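
As a rough illustration of the reallocate-and-copy step described above,
here is a minimal toy sketch in C++. It is not the OpendTect code: the
struct GrowingGrid, its members, and the constant kUndef are hypothetical
names made up for this example. The counter "copied" only records how many
values had to be moved to a bigger array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical undefined marker; not OpendTect's own undef value.
const float kUndef = std::numeric_limits<float>::max();

// Toy 2D grid that covers only the bounding box of the positions set so far.
struct GrowingGrid
{
    int row0 = 0, col0 = 0;     // origin of the covered area
    int nrows = 0, ncols = 0;   // size of the covered area
    std::vector<float> data;    // row-major storage of the covered area
    std::size_t copied = 0;     // values copied during reallocation

    bool covers( int r, int c ) const
    {
        return r >= row0 && r < row0 + nrows &&
               c >= col0 && c < col0 + ncols;
    }

    // Reallocate so that the area [r0,r1] x [c0,c1] is covered as well.
    void expandTo( int r0, int c0, int r1, int c1 )
    {
        if ( nrows > 0 && ncols > 0 )
        {
            // Take the union with the area that is already covered.
            r0 = std::min( r0, row0 );
            c0 = std::min( c0, col0 );
            r1 = std::max( r1, row0 + nrows - 1 );
            c1 = std::max( c1, col0 + ncols - 1 );
        }

        const int nr = r1 - r0 + 1;
        const int nc = c1 - c0 + 1;
        if ( nr == nrows && nc == ncols )
            return;             // nothing to add

        // Allocate the bigger array, then copy the old contents into it.
        std::vector<float> bigger( std::size_t(nr) * nc, kUndef );
        for ( int r = 0; r < nrows; r++ )
        {
            for ( int c = 0; c < ncols; c++ )
            {
                bigger[ std::size_t(r + row0 - r0) * nc + (c + col0 - c0) ]
                    = data[ std::size_t(r) * ncols + c ];
                copied++;
            }
        }

        data.swap( bigger );    // the old array is released here
        row0 = r0; col0 = c0; nrows = nr; ncols = nc;
    }

    void setPos( int r, int c, float z )
    {
        if ( !covers(r,c) )
            expandTo( r, c, r, c );

        assert( covers(r,c) );
        data[ std::size_t(r - row0) * ncols + (c - col0) ] = z;
    }
};
```

Every call to setPos outside the covered area copies all values stored so
far, which is the linear cost per reallocation mentioned above.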

Unfortunately, this economical memory management strategy has one pitfall.
A gradually growing EMObject causes the object to be copied repeatedly to
a new and slightly bigger array. Since each of these copies is linear in
the current size, the total computational effort grows quadratically with
the final size of the object. This can become a real burden in the case of
large EMObjects. Moreover, it is unnecessary if
either the final size or a reasonable maximum size of the object is already
known in advance. When applying the autotracker to extend a 3D horizon for
example, its new maximum size will be bounded by the space covered by the
tracker box. To transfer such knowledge to the ignorant lower parts of
OpendTect, the programmer can use the following functions:

1) void expandWithUdf( RCol& start, RCol& stop ) from Geometry/binidsurface.h

   Reserves memory for a predefined space around the current object in one
   go, and fills the added space with Undefs. At the moment, such a function
   only exists for the class BinIDSurface, in which the different sections
   of e.g. a 3D horizon are stored.

2) virtual void trimUndefParts() from Geometry/geomelement.h

   Frees the memory occupied by all undefined space around the current
   object. To be applied if it is not guaranteed that all reserved space
   has been used by the operation extending the object. For example, the
   autotracker will not always succeed in tracking up to the boundaries
   of the tracker box.
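
To give a feeling for the difference these two calls can make, here is a
toy one-dimensional analogue in C++. It is only a sketch under stated
assumptions: the class ToyLine, its member functions, kUndef, and the
helper copiesWhenGrowing are hypothetical names invented for this example.
They mimic the idea behind expandWithUdf and trimUndefParts, not their
actual implementation in OpendTect.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical undefined marker; not OpendTect's own undef value.
const float kUndef = std::numeric_limits<float>::max();

// Toy 1D analogue of a surface section covering indices [start_, stop()].
class ToyLine
{
public:
    // Set a value, growing the array if idx lies outside the covered range.
    void setPos( int idx, float z )
    {
        if ( data_.empty() || idx < start_ || idx > stop() )
            expandWithUndef( idx, idx );
        data_[ idx - start_ ] = z;
    }

    // Toy counterpart of expandWithUdf: cover [newstart,newstop] in one go.
    void expandWithUndef( int newstart, int newstop )
    {
        if ( !data_.empty() )
        {
            if ( start_ < newstart ) newstart = start_;
            if ( stop() > newstop ) newstop = stop();
            if ( newstart == start_ && newstop == stop() )
                return;
        }
        reallocate( newstart, newstop );
    }

    // Toy counterpart of trimUndefParts: drop undefined values at both ends.
    void trimUndefParts()
    {
        int first = 0;
        int last = int(data_.size()) - 1;
        while ( first <= last && data_[first] == kUndef ) first++;
        while ( last >= first && data_[last] == kUndef ) last--;
        if ( first > last )
            { data_.clear(); return; }
        if ( first != 0 || last != int(data_.size()) - 1 )
            reallocate( start_ + first, start_ + last );
    }

    std::size_t size() const        { return data_.size(); }
    std::size_t nrCopied() const    { return copied_; }

private:
    int stop() const    { return start_ + int(data_.size()) - 1; }

    // Allocate a new array for [ns,ne] and copy the overlapping values.
    void reallocate( int ns, int ne )
    {
        std::vector<float> fresh( std::size_t(ne - ns + 1), kUndef );
        for ( int idx = start_; idx <= stop(); idx++ )
        {
            if ( idx < ns || idx > ne ) continue;
            fresh[ idx - ns ] = data_[ idx - start_ ];
            copied_++;
        }
        data_.swap( fresh );
        start_ = ns;
    }

    int                 start_ = 0;
    std::vector<float>  data_;
    std::size_t         copied_ = 0;
};

// Count the values copied while growing a line to n positions one by one.
std::size_t copiesWhenGrowing( int n, bool preallocate )
{
    ToyLine line;
    if ( preallocate )
        line.expandWithUndef( 0, 2*n - 1 );  // generous "tracker box"
    for ( int idx = 0; idx < n; idx++ )
        line.setPos( idx, 1.0f );
    line.trimUndefParts();
    return line.nrCopied();
}
```

In this toy model, growing to n positions one at a time copies
0 + 1 + ... + (n-1) = n(n-1)/2 values, while reserving the space first
and trimming afterwards copies only the n values kept by the trim.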

A clear example of the application of the two functions above can be found
in Horizon3D::setArray2D(,,) in EarthModel/emhorizon3d.cc, which is called
to provide an existing 3D horizon with new depth values. The performance of
an operation like "Fill Holes" with the option "Extrapolate outward" set
benefits considerably from it. Another example of their application is
hidden in the autotracker code in MPEEngine/autotracker.cc. Note that in
this case the call to expandWithUdf(,) is made by the function
Horizon3DExtender::preallocExtArea().

Best regards.

-- 
-- dr. Jaap C. Glas
-- Software Engineer
-- dGB Earth Sciences
-- Nijverheidstraat 11-2
-- 7511 JM Enschede, The Netherlands
-- jaap.glas at dgb-group.com
-- http://www.dgb-group.com
-- Tel: +31 534315155, Fax: +31 534315104