[Users] Neural network modeling in OpendTect
Paul de Groot
paul.degroot at opendtect.org
Tue Oct 19 13:44:15 CEST 2004
One of our users posted the following questions:
>I have been trying to use the Neural Network module but I am unable to
>make it work. Part of the problem is that I do not know how to enter
>the output data set.
>When I try unsupervised learning it works fine, but when I try
>supervised learning I get the following error message:
>
> Select target information
>
>Can you tell me where to find information about how neural networks
>work, and from which menu I can load LAS files or logs?
>
>
In case you are struggling with the same questions, please find herewith
a small tutorial on neural network modelling and the implementation in
the (commercial) neural network plugin to OpendTect.
In neural network modelling two basic types of learning approaches are
used: supervised and unsupervised. Supervised means learning by example.
In OpendTect the examples are given in the form of "Picksets", or
extracted along the well track. Picksets are manually picked example
locations that represent the specific geologic object of interest. For
example, to create a chimney cube you manually pick two sets of examples:
one representing chimneys (vertically disturbed seismic response) and
one representing non-chimneys (normal seismic response). Next you pop up
the neural network window, create a new network from Picksets, and
select "supervised" mode. The type of supervised network in OpendTect is
the well-known fully-connected Multi-Layer-Perceptron (MLP). Select the
input attributes and the example picksets on which the network will be
trained (chimneys and non-chimneys, make sure you store the picksets
otherwise you will not see them in the list of available picksets).
Specify a percentage (say 20-50%) that will be used for testing the
network. The examples are randomly split into a training set and a test
set: the training set is used to update the weights of the network,
while the test set is passed through the network to check the
performance and avoid overfitting, without feeding the result back to
update the weights. Training should be stopped when the error on the
test set is minimal. If you continue beyond that point, the network
starts to recognize individual examples from the training set and loses
its generalization capabilities. For supervised networks
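The split-and-monitor procedure can be sketched in a few lines. This is
a toy illustration, not OpendTect code: the synthetic data, the
single-unit logistic "network", and the learning rate are all invented
for the example; only the early-stopping logic mirrors the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for attribute vectors at pickset locations, with a
# binary chimney / non-chimney label.
X = rng.normal(size=(200, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Randomly split the examples into a training set and a test set (30% test).
idx = rng.permutation(len(X))
n_test = int(0.3 * len(X))
X_te, y_te = X[idx[:n_test]], y[idx[:n_test]]
X_tr, y_tr = X[idx[n_test:]], y[idx[n_test:]]

# Minimal one-layer "network" (a logistic unit) trained by gradient descent.
w = np.zeros(X.shape[1])
best_err, best_w = np.inf, w.copy()
for epoch in range(300):
    p_tr = 1.0 / (1.0 + np.exp(-X_tr @ w))
    w -= 0.5 * X_tr.T @ (p_tr - y_tr) / len(X_tr)  # weights updated on training set only
    p_te = 1.0 / (1.0 + np.exp(-X_te @ w))
    test_err = np.mean((p_te - y_te) ** 2)         # test set only monitors performance
    if test_err < best_err:                        # keep the network with the
        best_err, best_w = test_err, w.copy()      # lowest test error so far
```

In a real run you would stop (or restore `best_w`) once `test_err` stops
improving, which is exactly the point where generalization is best.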
OpendTect supports two output modes: classify and probability. We use
probability for two-class problems (e.g. chimney or non-chimney). The
two outputs are mirror images: if the chimney node is high, non-chimney will
be low. When we apply the trained network we output only the "chimney"
node. The higher the value the more probable it is that the point of
evaluation belongs to the chimney class. Hence the final output will be
a chimney "probability" cube (this is what we call TheChimneyCube; the
procedure is described in detail in the user documentation of the dGB
plugins).
When we have more than two outputs for the neural network it makes no
sense to output the value of one neural network output node only. For
example a network with three output nodes (e.g. sand, silt, shale
represented by 3 picksets) tries to predict vectors (1,0,0), (0,1,0) and
(0,0,1). A low value for the first node (sand) means the sample is
probably not sand, but it does not say whether it is silt or shale. So,
for these types of problems (supervised, more than two output classes)
select "classify" instead of "probability" as output mode. In classify
mode the final neural network output vector (e.g. 0.8, 0.2, 0.15) is
compared to the ideal vectors (1,0,0 etc.) to create two new outputs:
class and confidence. Class is the index of the winning class (1, 2, or
3; the example above would return 1, i.e. sand) and confidence is a
measure of how close the real output is to the vector of the winning
class. The confidence is calculated as a normalized Euclidean distance
between the two vectors and ranges from 1 (the real vector is identical
to the ideal vector of the winning class) to 0 (the vectors are
completely dissimilar, i.e. we have no confidence in the classification
result). Classify mode is typically used for seismic facies
classification (e.g. channel, levee, pointbar, etc.).
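A small sketch of how classify mode can derive its two outputs from a
raw output vector. Note that the sqrt(n_classes) normalization below is
my assumption to keep the confidence in a 0-1 range; OpendTect's exact
formula may differ.

```python
import numpy as np

def classify(output):
    """Map a raw network output vector to (class, confidence).

    Confidence is 1 minus a Euclidean distance to the one-hot ideal
    vector of the winning class; dividing by sqrt(n_classes) to keep it
    in range is an assumption, not OpendTect's documented formula.
    """
    output = np.asarray(output, dtype=float)
    winner = int(np.argmax(output))          # index of the winning class
    ideal = np.zeros(len(output))
    ideal[winner] = 1.0                      # the ideal vector, e.g. (1,0,0)
    dist = np.linalg.norm(output - ideal)
    confidence = 1.0 - dist / np.sqrt(len(output))
    return winner + 1, confidence            # classes numbered from 1

cls, conf = classify([0.8, 0.2, 0.15])       # the example vector from the text
```

For the example vector this returns class 1 (sand) with a confidence of
roughly 0.8.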
Since release 1.2 it is also possible to train a supervised neural
network on examples extracted along one or more well tracks to predict
real well log properties. Inputs are seismic attributes (from the
current attribute set and/or existing data volumes). The neural network
output is the selected target log (one only). In this case the examples
are created at every nearest sample position along the track (or 4 times
as many if you select the "all corners" option for more statistics). The
logs are resampled to the seismic sample range along the (deviated)
track using the specified resampling method (average, median, etc.;
"average" means the output is averaged over one seismic sample interval
centred on the sample position). Well data is loaded via Well manage.
For this workflow it is important that you have accurate deviation and
depth-time models.
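The resampling step can be pictured as follows. This is a simplified
sketch (uniform seismic sampling in two-way time, only "average" and
"median" implemented), not the actual OpendTect implementation.

```python
import numpy as np

def resample_log(log_twt, log_vals, seis_times, method="average"):
    """Resample well-log values to seismic sample positions.

    For each seismic time, take all log samples falling within one
    seismic sample interval centred on that time and reduce them with
    the chosen statistic. Positions with no log data stay NaN.
    """
    dt = seis_times[1] - seis_times[0]        # seismic sample interval
    reduce = {"average": np.mean, "median": np.median}[method]
    out = np.full(len(seis_times), np.nan)
    for i, t in enumerate(seis_times):
        in_win = (log_twt >= t - dt / 2) & (log_twt < t + dt / 2)
        if in_win.any():
            out[i] = reduce(log_vals[in_win])
    return out

# A finely sampled synthetic log, resampled to 4 ms seismic sampling.
twt = np.arange(1.000, 1.100, 0.0005)         # log sampled every 0.5 ms (in s)
vals = 1000.0 * twt                           # toy log values
seis = np.arange(1.000, 1.100, 0.004)         # seismic samples every 4 ms
resampled = resample_log(twt, vals, seis)
```

With a deviated track you would first convert log depths to two-way time
through the depth-time model, which is why that model must be accurate.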
Unsupervised learning is a form of competitive learning (aka clustering
or segmentation). The algorithm organizes the data in some way and we
interpret the result afterwards. This is also the main difference
between supervised and unsupervised. In supervised work the
interpretation is done up-front (we pick the examples and tell the
network that these represent chimneys), whilst in unsupervised networks
the interpretation is done afterwards (the network output is in the form
of seismic pattern maps or 3D objects that represent similar seismic
responses; what this means in terms of geological or petro-physical
variations is left to the interpreter). In OpendTect we use a so-called
Unsupervised Vector Quantizer (UVQ) network for this work. This network
is trained on a set of random input vectors (created in Pickset-New) to
find the segment centers. Typically you choose 4-10 segments and stop
training when the average match is around 90%. When applied, the input
is compared to the segment centers and two outputs are generated
(similar to "classify" mode in supervised work): segment (the index of
the winning segment) and match (the confidence, ranging from 1 (input
and winning vector are identical) to 0 (vectors are completely
dissimilar)).
The UVQ is akin to the well-known 1D Kohonen Self-Organizing-Map (SOM).
The difference is in the learning algorithm. The UVQ network updates
only the winning segment centre during training. The Kohonen learning
algorithm instead updates the winning segment centre as well as its
immediate neighbours. The result is a smoothly varying output from one
colour (segment) to the next. The UVQ implementation in OpendTect
arrives at the same smooth end-result by sorting the vectors before
storing the network.
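The winner-take-all update that distinguishes UVQ from Kohonen learning
can be sketched like this. The synthetic data, learning rate, and the
exponential "match" normalization are all invented for the illustration;
a Kohonen SOM would additionally nudge the neighbours of the winner.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic "attribute vectors" drawn around three cluster centres,
# standing in for data extracted at random pick locations.
true_centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.3, size=(100, 2)) for c in true_centres])

# Initialize one segment centre per cluster (here simply one sample from
# each cluster; a real initialization would not know the clusters).
centres = X[[0, 100, 200]].astype(float).copy()

lr = 0.05
for _ in range(5):                      # a few passes over the training vectors
    for x in X:
        winner = np.argmin(np.linalg.norm(centres - x, axis=1))
        centres[winner] += lr * (x - centres[winner])  # move the winner only

# Applying the trained network: segment = winning index, match = how
# close the input is to the winning centre (this normalization is a sketch).
def apply_uvq(x):
    d = np.linalg.norm(centres - x, axis=1)
    winner = int(np.argmin(d))
    return winner, float(np.exp(-d[winner]))  # match near 1 for a close fit
```

After training, each segment centre has drifted to the middle of one
cluster, which is what "finding the segment centers" amounts to.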
UVQ segmentation is often done by segmenting waveforms along a mapped
horizon (as is done, among others, in GDI and Stratimagic). This method works
well if you have a decent horizon to work from. The attribute set
consists of amplitudes extracted at regular sampling interval (see the
default set for UVQ segmentation). When the network has been trained and
stored you can display the segment centres by pressing Info ... (neural
network main window).
Instead of segmenting waveforms along horizons you can also segment
entire 3D volumes. To do that you follow the same procedure: create a
random pick set for the volume or sequence (in between two horizons) you
wish to segment, and train a UVQ network to segment the data into the
user-specified number of segments. In the 3D case you should not be
using waveforms (these only make sense if you have an anchor, i.e. a
horizon, to work from). Instead use only phase-independent attributes
such as energy, frequency attributes, similarity, etc.
I hope this helps.
Best regards,
Paul de Groot.
--
-- Paul de Groot
-- OpendTect Support Team
-- paul.degroot at opendtect.org
-- http://www.opendtect.org
-- Tel: +31 534315155 , Fax: +31 534315104