[Developers] SeparString (and FileMultiString) changes in OD 4.1

Thu Dec 3 09:27:24 CET 2009

Dear fellow programmers,

The new OpendTect 4.1 developers release contains a number of non-trivial
changes to the SeparString (and FileMultiString) classes in Basic/separstr.h
and Basic/separstr.cc. The SeparString class contains a list encoded in a
string where the items are separated by a user chosen separator. The
FileMultiString class is a subclass with the back-quote as separator.

Although the header file in earlier releases already mentioned the
possibility to escape the separation character with a backslash, the
implementation was still a "TODO". Correcting this omission has some
consequences for both old and new code.
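
To make the escaping idea concrete, here is a minimal sketch in plain
C++. It is not the code from Basic/separstr.cc: the helper names
escapeElem() and unescapeElem() are hypothetical, and the assumption
that a backslash inside an element is itself escaped by doubling it is
mine, derived from the description further down.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, NOT the OpendTect API: escape one list element by
// putting a backslash in front of every separator and every backslash.
std::string escapeElem( const std::string& elem, char sep )
{
    std::string res;
    for ( char c : elem )
    {
        if ( c == sep || c == '\\' )
            res += '\\';
        res += c;
    }
    return res;
}

// Inverse operation: drop each escaping backslash, keep the char after it.
std::string unescapeElem( const std::string& esc )
{
    std::string res;
    for ( std::size_t idx=0; idx<esc.size(); idx++ )
    {
        if ( esc[idx] == '\\' && idx+1 < esc.size() )
            idx++;
        res += esc[idx];
    }
    return res;
}

// Round trip for one element holding a backslash and a back-quote.
void checkEscapeRoundTrip()
{
    const std::string elem = "x\\y`z";                // x\y`z
    const std::string esc = escapeElem( elem, '`' );  // x\\y\`z
    assert( esc == "x\\\\y\\`z" );
    assert( unescapeElem(esc) == elem );
}
```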

So if your own plugin uses the SeparString and/or FileMultiString class,
or if you are thinking of using them in the future, the following
information might be of interest.

=======================================================================
One must pay attention when using any SeparString function that takes
or returns an argument of type (char*), (const char*), (BufferString&)
or (const BufferString&). If the argument represents one element of the
SeparString, the argument is expected to be unescaped. If the argument
represents a whole SeparString or a sub-SeparString, the argument is
expected to be escaped.

The add(), operator+=(), operator[](), and indexOf() member functions
deal with unescaped string arguments.

The SeparString() constructor, and the operator=(), from(), buf() and
rep() member functions deal with escaped string arguments.
=======================================================================
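
The rule in the box can be illustrated with a small stand-in class. This
is only a sketch under assumptions: MiniSepStr is a hypothetical class
written for this example, it is not the real SeparString, and it only
mimics add(), operator[]() and buf() as described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for illustration only; NOT the OpendTect class.
class MiniSepStr
{
public:
    explicit MiniSepStr( char sep ) : sep_(sep), empty_(true) {}

    // Element in: unescaped. It is escaped before being stored.
    void add( const std::string& elem )
    {
        if ( !empty_ ) rep_ += sep_;
        for ( char c : elem )
        {
            if ( c == sep_ || c == '\\' ) rep_ += '\\';
            rep_ += c;
        }
        empty_ = false;
    }

    // Element out: unescaped again.
    std::string operator[]( std::size_t nr ) const
    {
        std::string res;
        std::size_t cur = 0;
        for ( std::size_t idx=0; idx<rep_.size(); idx++ )
        {
            char c = rep_[idx];
            if ( c == '\\' && idx+1 < rep_.size() )
                c = rep_[++idx];        // escaped char: take it literally
            else if ( c == sep_ )
                { cur++; continue; }    // real separator: next element
            if ( cur == nr ) res += c;
        }
        return res;
    }

    // Whole list out: escaped.
    const std::string& buf() const { return rep_; }

private:
    std::string rep_;
    char sep_;
    bool empty_;
};

void checkElementVersusWhole()
{
    MiniSepStr fms( '`' );
    fms.add( "a`b" );                    // one element with a back-quote
    fms.add( "c" );
    assert( fms.buf() == "a\\`b`c" );    // escaped whole: a\`b`c
    assert( fms[0] == "a`b" );           // unescaped element
    assert( fms[1] == "c" );
}
```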

In order to handle any SeparString reloaded from old data files, both
the SeparString() constructor and the assignment operator=() have been
given the intelligence to recognize unescaped input arguments and convert
them. This intelligence will only fail if the old string already
contained backslashes, and all of them either appear pairwise or precede
a separation character. This is highly unlikely, but in that case
adaptations to either the old data files or the code reading them are
required. New code should not rely on the conversion intelligence.
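
The failure condition above suggests what such a recognition test might
look like. The following is a guess for illustration, not the actual
code in separstr.cc: looksEscaped() and toEscaped() are hypothetical
names, and the conversion step (doubling backslashes while leaving the
separators alone) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical heuristic: a string counts as "already escaped" if every
// backslash is followed by another backslash or by the separator.
bool looksEscaped( const std::string& str, char sep )
{
    for ( std::size_t idx=0; idx<str.size(); idx++ )
    {
        if ( str[idx] != '\\' ) continue;
        if ( idx+1 >= str.size() ) return false;
        const char next = str[idx+1];
        if ( next != '\\' && next != sep ) return false;
        idx++;  // skip the escaped character
    }
    return true;
}

// Assumed conversion of legacy input: double the backslashes and keep
// the separators as real separators.
std::string toEscaped( const std::string& str, char sep )
{
    if ( looksEscaped(str,sep) ) return str;
    std::string res;
    for ( char c : str )
    {
        if ( c == '\\' ) res += '\\';
        res += c;
    }
    return res;
}

void checkLegacyConversion()
{
    // Old string C:\dir`x : the backslash precedes 'd', so it is legacy.
    assert( !looksEscaped("C:\\dir`x",'`') );
    assert( toEscaped("C:\\dir`x",'`') == "C:\\\\dir`x" );
    // Old string a\`b meant as elements "a\" and "b": the ambiguous case,
    // which is indistinguishable from an escaped single element.
    assert( looksEscaped("a\\`b",'`') );
}
```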

The unescapedStr() function has been added to the SeparString class to
offer an unescaped alternative to the buf() function for those who know
what they are doing. It is only useful under controlled circumstances,
since the distinction between separ-chars and escaped separ-chars will
be lost.
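
A small sketch of why that distinction gets lost. The functions
unescapeAll() and nrElems() are hypothetical helpers working on an
escaped representation; they are not part of the SeparString class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical: strip the escaping backslashes from an escaped list.
std::string unescapeAll( const std::string& esc )
{
    std::string res;
    for ( std::size_t idx=0; idx<esc.size(); idx++ )
    {
        if ( esc[idx] == '\\' && idx+1 < esc.size() )
            idx++;
        res += esc[idx];
    }
    return res;
}

// Hypothetical: count elements by counting the unescaped separators.
int nrElems( const std::string& esc, char sep )
{
    if ( esc.empty() ) return 0;
    int nr = 1;
    for ( std::size_t idx=0; idx<esc.size(); idx++ )
    {
        if ( esc[idx] == '\\' ) idx++;      // skip the escaped character
        else if ( esc[idx] == sep ) nr++;
    }
    return nr;
}

void checkDistinctionLost()
{
    const std::string oneelem = "a\\`b";    // a\`b : one element "a`b"
    const std::string twoelems = "a`b";     // a`b  : elements "a" and "b"
    assert( nrElems(oneelem,'`') == 1 );
    assert( nrElems(twoelems,'`') == 2 );
    // After unescaping, the two lists can no longer be told apart.
    assert( unescapeAll(oneelem) == unescapeAll(twoelems) );
}
```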

A tricky part of this project is locating all places in your old code
where the add() or operator+=() function of a (non-empty) SeparString or
FileMultiString is applied to a (const char*) input argument that
actually contains a whole SeparString. Tail addition and concatenation
are no longer equivalent in the new situation! Cases like this must be
resolved by explicitly converting the input into a SeparString or
FileMultiString first:

sepstr += strarg;  ---->  sepstr += SeparString( strarg, sepstr.sepChar() );

fms += strarg;  ---->  fms += FileMultiString( strarg );

Notice that the add() and operator+=() functions have been overloaded
for the types (const SeparString&), (const FileMultiString&) and
(const BufferStringSet&) to implement concatenation.
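
The difference between the two operations can be sketched on plain
strings. The functions addElem() and concatList() below are hypothetical
and only model the behaviour described above; treating an empty
representation as an empty list is a simplification of mine.

```cpp
#include <cassert>
#include <string>

// Tail addition: 'elem' is ONE unescaped element, so any separator or
// backslash inside it gets escaped.
std::string addElem( std::string rep, const std::string& elem, char sep )
{
    if ( !rep.empty() ) rep += sep;
    for ( char c : elem )
    {
        if ( c == sep || c == '\\' ) rep += '\\';
        rep += c;
    }
    return rep;
}

// Concatenation: 'other' is an escaped list and is appended as-is.
std::string concatList( std::string rep, const std::string& other, char sep )
{
    if ( !rep.empty() && !other.empty() ) rep += sep;
    return rep + other;
}

void checkAddVersusConcat()
{
    // "a`b" added as an element stays one item: x`a\`b
    assert( addElem("x","a`b",'`') == "x`a\\`b" );
    // "a`b" concatenated as a list contributes two items: x`a`b
    assert( concatList("x","a`b",'`') == "x`a`b" );
}
```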

Please contact me if you have any questions about the new semantics
of the SeparString or FileMultiString classes.

Best regards,

Jaap Glas

-- 
-- dr. Jaap C. Glas
-- Software Engineer
-- dGB Earth Sciences
-- Nijverheidstraat 11-2
-- 7511 JM Enschede, The Netherlands
-- jaap.glas at dgbes.com
-- http://www.dgbes.com
-- Tel: +31 534315155, Fax: +31 534315104