Digital images and arrays: the same thing?

July 30th, 2009 by fmn | 2 Comments | Filed in Enseignement, Research

Take an image processing book (Morphological Image Analysis, for example). The first pages define the important concepts. An image is often defined as a mapping from D, a subset of \mathbb{Z}^2, to a number (0 or 1 in the binary case). The subset D is the definition domain of the image, defined in this case on a rectangular lattice. The pixel coordinates are thus integers, positive or negative.

Now let’s be practical and write some code with a good library (ImageJ, for example). Often, the documentation says an image is a two-dimensional array. The pixel coordinates range from (0, 0) to (w-1, h-1), where w and h are the image width and height.

The main difference between these two definitions (mathematical and practical) lies in the definition domain. In mathematics, the pixel coordinates can be negative. In practice (via the library), they can’t. Let’s study the consequences by computing a convolution product in Octave (or Matlab). For simplicity, the product is computed in one dimension:

octave> a = [1 2 3 4 5];
octave> b = [1 0 -1];
octave> conv(a, b)
  ans = 1 2 2 2 2 -4 -5
octave> conv(b, a)
  ans = 1 2 2 2 2 -4 -5

Commutativity is respected, since conv(a, b) = conv(b, a), which is a good thing. But there are some problems. What exactly is defined by b = [1 0 -1]? It is a (very bad) derivative filter that can be used to detect edges. But where is the origin of the signal? Is it on the 0? On the -1? We cannot tell, because for Octave/Matlab it is just an array. Yet this is crucial information. If the origin is on the 0, there is no offset between an edge and its detection: if an edge is at x, the filter response is maximal at x. But if the origin is on the 1 or on the -1, the filter response is maximal at x+1 or x-1, depending on which end holds the origin. It is not the same filter.
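To make the ambiguity concrete, here is a minimal Octave sketch. It is only an illustration under stated assumptions: the first sample of the signal sits at position 1, the edge is a short ramp so that the response has a single peak, and first_b is a name introduced here for the assumed position of the first coefficient of b. The array returned by conv is the same in all three cases; only the interpretation of its indices changes.

```octave
a = [0 0 0 1 2 2 2];   % ramp edge centred on position 4 (first sample at position 1)
b = [1 0 -1];
y = conv(a, b);        % full convolution: [0 0 0 1 2 1 0 -2 -2]
[~, k] = max(y);       % k = 5: array index of the strongest response

% Assumed position of b's first coefficient when the origin is
% on the central 0, on the leading 1, or on the trailing -1.
first_b = [-1 0 -2];

% Sample k of the result sits at position 1 + first_b + (k - 1).
peak_pos = k + first_b   % 4 5 3
```

With the convolution definition used by conv, only the centred origin puts the peak on the edge itself (position 4). The other two choices report it one sample to the right or to the left, and a library that correlates instead of convolving would swap those two shifted cases.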

Image processing libraries generally follow a usage convention: the origin of the filter is at the center of the coefficients. But this is only a convention. This information is recorded nowhere, except in the documentation of the library functions. In practice, it is necessary to manipulate signals and images with offsets. This is why the Matlab designers (for example) say in the conv documentation:

C = conv(…,'shape') returns a subsection of the two-dimensional convolution, as specified by the shape parameter:

full
Returns the full two-dimensional convolution (default).

same
Returns the central part of the convolution of the same size as A.

valid
Returns only those parts of the convolution that are computed without the zero-padded edges. Using this option, C has size [ma-mb+1,na-nb+1]; in the one-dimensional case, length(c) is max(length(a)-max(0,length(b)-1),0).

These options are not necessarily clear, especially the valid one. As the offsets are handled by the function and not by the data structure, errors can creep in. I remember, during my Ph.D., some hard bugs due to this imprecision. It is thus mandatory to add comments in the code to retain the offset information. If this information is lost: bug!
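As a rough illustration, the three shapes can be compared on the vectors a and b used earlier. The position ranges given in the comments are an interpretation, not something the documentation states: they assume that the first sample of a sits at position 1 and that the origin of b is on its central coefficient.

```octave
a = [1 2 3 4 5];
b = [1 0 -1];

c_full  = conv(a, b, 'full')    % 1 2 2 2 2 -4 -5   (positions 0..6)
c_same  = conv(a, b, 'same')    % 2 2 2 2 -4        (positions 1..5, same as a)
c_valid = conv(a, b, 'valid')   % 2 2 2             (positions 2..4)
```

Seen this way, 'same' is the option that silently applies the centered-origin convention, and 'valid' keeps only the positions where the filter fits entirely inside the signal. None of the three results carries its own offset, so that information has to be tracked by hand.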

I wonder why image or signal processing libraries don’t provide a good data structure, one that can handle negative coordinates. There would be an increase in running time (a translation for each pixel access), but many errors would be avoided. This is generally why high-level programming languages are used: to avoid dealing with low-level details.
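Such a structure could look like the following sketch in one dimension. Everything here is hypothetical: make_sig, conv_sig and at are names invented for the illustration and exist in neither Octave nor Matlab. The idea is simply to store, next to the samples, the coordinate of the first sample, and to let the convolution update it.

```octave
% Hypothetical helpers, not part of Octave or Matlab.
% A signal is a struct: data holds the samples, first is the coordinate of data(1).
make_sig = @(data, first) struct('data', data, 'first', first);

% Convolution: the samples are convolved, the coordinates of the first samples add up.
conv_sig = @(s, t) make_sig(conv(s.data, t.data), s.first + t.first);

% Read the sample at coordinate x (possibly negative): one translation per access.
at = @(s, x) s.data(x - s.first + 1);

a = make_sig([1 2 3 4 5], 0);   % samples at coordinates 0..4
b = make_sig([1 0 -1], -1);     % origin on the central 0: coordinates -1..1
c = conv_sig(a, b);             % samples at coordinates -1..5

c_first = c.first               % -1
c_at_m1 = at(c, -1)             % 1
c_at_0  = at(c, 0)              % 2
c_at_4  = at(c, 4)              % -4
```

The offset now travels with the data instead of living in a comment, and the cost is the single subtraction in at. A real implementation would also need bounds checking and a two-dimensional version, which this sketch leaves out.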

FMN.
