A Multi-Resolution Texture Image Retrieval Using Fast Fourier Transform

Texture is an important visual property that characterizes a wide range of natural and artificial images, which makes it a useful feature for retrieving images. Several approaches have been proposed to describe the texture contents of an image. In early research works, such as edge histogram-based techniques and co-occurrence-based approaches, texture descriptors were mainly extracted from the spatial domain. Later on, dual spaces (transforms of the spatial domain), such as the frequency space or spaces resulting from Gabor or wavelet transforms, were explored for texture characterization. Recent physiological studies showed that the human visual system can be modeled as a set of independent channels of various orientations and scales; this finding motivated the proliferation of multi-resolution methods for describing texture images. Most of these methods are either wavelet-based or Gabor-based. This paper summarizes our recent study of the use of Fourier-based techniques for characterizing image textures. First, a single-resolution Fourier-based technique is proposed and its performance is compared against that of some classical Fourier-based methods. The proposed technique is then extended into a multi-resolution version. The performance of the modified technique is compared against those of the single-resolution approach and some other multi-resolution approaches recently described in the literature. Two performance indicators were used in this comparison: retrieval accuracy and execution time.


Introduction
Content-Based Image Retrieval (CBIR) has been an active research topic in the last two decades. As a consequence, several experimental and commercial image retrieval systems have been proposed during this period. In a CBIR system, image databases are queried using the visual content of an image, which is usually represented by low-level features such as color, texture, shape, or a combination of some or all of these features.
Texture is widely used in CBIR systems since it plays a crucial role in characterizing real-world images, both natural (such as images of clouds, water, trees, remotely sensed data and medical images) and man-made ones (such as images of bricks, fabrics, and buildings).
Several approaches have been proposed to describe the texture contents of an image. In early works, texture features were mainly extracted from the pixel space itself, using edge density, edge histograms, and co-occurrence-based features to characterize the image texture (Haralick et al. 1973; Conners and Harlow, 1980; Amadasun and King, 1989 and Fountain and Tan, 1998). More recently, various transforms have been used to produce dual spaces from which texture features were extracted. The most common transforms are Fourier (Tsai and Tseng, 1999; Weszka et al. 1976 and Gibson and Gaydecki, 1995), wavelet (Smith and Chang, 1994; Kokare et al. 2007; Huang and Dai, 2003; Huang and Dai, 2004 and Huang et al. 2006) and Gabor transforms (Daugman and Kammen, 1987; Jain and Farrokhnia, 1991 and Bianconi and Fernandez, 2007). Moreover, recent studies showed that the human visual system can be modeled as a set of independent channels of various orientations and scales (Beck et al. 1987); this finding motivated the proliferation of multi-resolution methods for describing texture images. Most of these methods are either wavelet-based or Gabor-based. While the use of Gabor filters is supported by physiological evidence (Beck et al. 1987), it suffers from serious drawbacks such as the need to tune the parameters of the filter and the complexity of the calculations involved (Bianconi and Fernandez, 2007). On the other hand, wavelet-based approaches are much simpler and faster. As a consequence, wavelet-based approaches have gained much more popularity in the computer vision community.
The Fourier transform has also been widely used in characterizing textures. One of the reasons for its popularity is its suitability for describing periodic functions, and texture images are known to usually contain quasi-repetitive patterns. Concentrations of Fourier power spectrum values capture the dominant orientations of the patterns in the image, and their distribution in the frequency space is closely related to the coarseness of the texture (Tsai and Tseng, 1999 and Weszka and Dyer, 1976). These two features (directionality and coarseness of a texture) are of importance in texture analysis (Campbell and Robson, 1968). The main drawback of the Fourier transform is the poor spatial localization it provides. The windowed Fourier transform has been introduced to overcome this problem, at the cost of a significant increase in computations (Yu et al. 2002).
This paper proposes a new single-resolution Fourier-based technique for characterizing texture images and compares its performance against those of some classical Fourier-based techniques. It also describes a multi-resolution version of the technique and evaluates its performance in comparison with those of several other multi-resolution techniques recently described in the literature. The two main performance indicators used in this comparison are the accuracy and the execution time of the techniques.

Summary of the Techniques Considered in this Research Study
This study includes key Fourier-based techniques and some recent multi-resolution approaches (mainly Gabor- and wavelet-based ones) described in the literature.

Fourier-based Techniques
The Fourier transform has been widely used by the image processing research community. It has the very useful property of highlighting the dominant spatial frequencies as well as the dominant orientations of the structures contained in the image. Another advantage of using the Fourier transform is that frequency-domain features are generally less sensitive to noise than spatial-domain features (Tsai and Tseng, 1990).
Weszka and Dyer (1976) partitioned the spectral domain into ring-shaped and wedge-shaped areas. Variance values of the Fourier power spectrum of the ring-shaped regions and wedge-shaped regions were used to describe the coarseness and the directionality of the texture, respectively.
D. Tsai and C. Tseng (1999) proposed the use of the average energy (power spectrum) of 4 concentric areas as a feature descriptor for the roughness of cast surfaces. Bayes and neural network classifiers were evaluated. The authors reported a 100% classification accuracy rate on cast specimens containing nine roughness classes.
D. Gibson and P.A. Gaydecki (1995) used the Fourier moduli to classify histological images. They conducted an experiment in which they compared their technique with the co-occurrence matrix-based approach proposed by Haralick et al. (1973). They concluded that the two approaches worked with approximately equal success but that the Fourier-based one was much faster.
Yu et al. (2002) applied a local Fourier transform to the image and used first and second moments as texture features. The local Fourier transform consisted of applying eight 3x3 templates approximating the Fourier transform of the 3x3 window, resulting in 8 transformed images for which the first and second moments were calculated to describe the texture content of the image. The authors compared their technique with some other related works (color moments, color correlogram) on an image database of 10000 images using 200 queries. They reported the superiority of their Color Texture Moments-based method.

Wavelet-based Techniques
The interest of the computer vision community shifted to wavelet-based approaches when several physiological studies of the visual cortex suggested that the visual systems of primates perform a multi-scale analysis of visual information. Beck et al. (1987), for example, found that the visual cortex can be modeled as a set of independent channels, each with a specific frequency and direction.
Smith and Chang (1994) proposed a method for classification and discrimination of texture based on the energies of image sub-bands. They compared four image decompositions (5-level pyramidal wavelet decomposition, 4x4 uniform sub-band decomposition of the Fourier transform, 4x4 DCT sub-bands and a simple spatial partition of the image into 4x4 blocks). Each sub-band was then represented by its mean and standard deviation. They found that the wavelet-based decomposition and the Fourier techniques produced the best results and that the spatial partition had the worst retrieval performance.
M. Kokare et al. (2007; 2005 and 2006) proposed a set of 2-D rotated filters to which a 5-level discrete wavelet transform was applied. Each sub-band was represented by its energy and standard deviation. Their approach produced a good characterization of diagonally oriented texture. They reported about 8% improvement of the retrieval rate compared with traditional wavelet transforms.
Recently, Huang et al. (2003), Huang and Dai (2004) and Huang et al. (2006) proposed a wavelet-based approach that concatenates the gradient vectors of the sub-band images to obtain a single feature vector called the Composite Sub-band Gradient (CSG) vector. A gradient vector of an image is the histogram that records the total gradient magnitude of the image pixels at different directions. Their experiments showed that their approach outperformed the single-resolution technique using the same texture features. Some other recent works (Wang and Yong, 2008 and Huang and Aviyente, 2008) took into account the correlations that exist between different sub-bands when selecting the texture features.
Wang and Yong (2008) identified the most correlated sub-bands and used a linear regression to estimate the relationship between these sub-bands. Their feature vector consisted of the linear regression parameters ($a_i$, $b_i$) of correlated sub-bands as well as the means and standard deviations of the linear approximation errors. On a dataset made of 40 Brodatz images, their technique performed better than classical methods such as the pyramid-structured wavelet transform, the tree-structured wavelet transform and Gabor filters, and than combinations of these methods with the GLCM (gray-level co-occurrence matrix).
K. Huang and S. Aviyente (2008) used the correlation information to cluster the image sub-bands. The energy of the highest sub-band in each cluster was considered as a texture feature. Their method performed an effective selection of sub-bands, since there was no loss of accuracy compared with the one obtained with the full set of sub-bands.
Z-He et al. (2009) presented a novel wavelet-based method which used non-separable wavelet filters. They claimed that their approach can capture more directions and edge information, which led to better retrieval accuracy.
Recently, the complex wavelet transform became a popular tool for texture characterization. This is because of its shift-invariance property and its better directional selectivity compared with the traditional discrete wavelet transforms (Kokare et al. 2006; Selesnick, 2002; Celik and Tjahjadi, 2009 and Vo and Oraintara, 2009).
M. Kokare et al. (2005) proposed new 2-D Rotated Complex Wavelet Filters that are non-separable; they are 45-degree rotated versions of the Complex Wavelet Filters proposed by Selesnick (2002). The authors conducted experiments on two sets of images (116x16 and 40x16) and reported an improvement in accuracy compared to traditional wavelets and the Gabor-based method, especially when combining the new filters with the dual-tree complex wavelet transform, since this combination allowed the characterization of 12 directions instead of the three obtained with traditional real wavelets and the six obtained with most Gabor wavelet and complex wavelet techniques.
Celik and Tjahjadi (2009) described a multi-scale texture retrieval classifier that used a Gabor-like function and the dual-tree complex wavelet transform with 3 scales and 6 directions. The texture feature vector consisted of the variances and entropies of the sub-bands resulting from the transform. Their experiments on 24 texture images (six from Brodatz album and six from the MIT VisTex database) showed that the new approach outperforms the traditional discrete wavelet transform and is also robust against noise.

Proposed Methods
The proposed single-resolution method is partly inspired by previous energy-based works (Tsai and Tseng, 1999 and Weszka and Dyer, 1976). As in those methods, a discrete Fourier transform is first applied to the original image f(x,y) to obtain the transformed image F(u,v). The discrete 2-D Fourier transform F(u,v) of an MxN image f(x,y) is given by Eq. (1):
$$F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j 2\pi \left( \frac{ux}{M} + \frac{vy}{N} \right)} \tag{1}$$
As in those approaches also, the frequency domain is partitioned into several regions. The difference is that sectors are used instead of rings and wedges, where a sector is the intersection of a wedge and a ring (see Fig. 1). The advantage of using sectors is that a more accurate description of the power spectrum distribution in the frequency domain can be obtained: each sector characterizes a range of orientations and some levels of coarseness of the texture. Moreover, unlike the above-mentioned works, the proposed approach takes advantage of the symmetry property of the Fourier transform and limits the analysis to only half of the frequency space, which reduces the overall processing time.
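The symmetry property mentioned above can be checked numerically. The following is a minimal sketch, assuming NumPy and a small random array standing in for a texture image (both are illustrative choices, not part of the original study); it verifies that the power spectrum of a real image takes equal values at (u,v) and (-u,-v), which is why half of the frequency plane is enough.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((8, 8))            # hypothetical real-valued 8x8 "image"

F = np.fft.fft2(f)                # 2-D DFT (NumPy's unnormalized forward convention)
P = np.abs(F) ** 2                # power spectrum

# For a real image, F(-u, -v) is the complex conjugate of F(u, v),
# so the power spectrum is symmetric about the origin.
# Index (-u) mod M is obtained by reversing an axis and rolling it by one.
P_mirrored = np.roll(P[::-1, ::-1], shift=(1, 1), axis=(0, 1))
symmetric = np.allclose(P, P_mirrored)
print(symmetric)
```

Because every value in one half-plane has a twin in the other, restricting feature extraction to the right half discards no information about the power spectrum.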
Let {f(x,y), x=1,…,n, y=1,…,n} be the texture image and {F(u,v), u=1,…,n, v=1,…,n} be its Fourier transform. First, the origin of the transformed image is shifted to the center of the image, at position (n/2+1, n/2+1). This produces a symmetrical Fourier image. Because of this symmetry, only half of the image is processed to extract texture features. In this implementation the right half of the image is considered; it is defined by Eq. (2):
(2)
The spectral domain is partitioned into half-ring-shaped regions and wedge-shaped regions, where a half-ring-shaped region $R_{r_1,r_2}$ is defined by Eq. (3):
(3)
and a wedge-shaped region is defined by Eq. (4):
(4)
The intersections of these rings and wedges define the sectors $S_{r_1,r_2,\theta_1,\theta_2}$, as shown in formula (5).
In order to reduce the effect of noise, only the significant values of the power spectrum are considered for further processing.
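One plausible way to write the regions that Eqs. (2) to (5) refer to is sketched below. It is an illustration consistent with the verbal description only: the centered polar coordinates $\rho$ and $\theta$, the symbol $H$ for the right half, the name $W_{\theta_1,\theta_2}$ for a wedge, and the choice of closed or open interval ends are all assumptions introduced here.

```latex
% Assumed notation: (u_c, v_c) = (n/2+1, n/2+1) is the shifted origin;
% \rho and \theta are polar coordinates about it, with \theta measured
% from the horizontal frequency axis.
\begin{align*}
\rho(u,v) &= \sqrt{(u-u_c)^2 + (v-v_c)^2} \\
H &= \{(u,v) : -90^\circ \le \theta(u,v) \le 90^\circ\}
  && \text{right half, cf. Eq. (2)} \\
R_{r_1,r_2} &= \{(u,v) \in H : r_1 \le \rho(u,v) < r_2\}
  && \text{half-ring, cf. Eq. (3)} \\
W_{\theta_1,\theta_2} &= \{(u,v) \in H : \theta_1 \le \theta(u,v) < \theta_2\}
  && \text{wedge, cf. Eq. (4)} \\
S_{r_1,r_2,\theta_1,\theta_2} &= R_{r_1,r_2} \cap W_{\theta_1,\theta_2}
  && \text{sector, cf. Eq. (5)}
\end{align*}
```

Under this reading, three half-rings and four wedges give the twelve sectors of Fig. 1, with the "vertical" wedge covering the two 22.5° slices next to the vertical axis.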
The means and standard deviations of the significant values of the power spectrum in these sectors constitute the feature vector.
Such a feature vector describes three aspects of the texture: its roughness, coarseness and directionality. As for texture roughness, Gibson and Gaydecki (1995) indicated that rough textures tend to contain more energy in the high-frequency components than smoother ones. Regarding texture coarseness, Tsai and Tseng (1999) indicated that the magnitude of the power spectrum for frequency components away from the origin drops rapidly to approximately zero for coarse textures. Finally, it is well known that linear structures in the spatial domain at direction θ produce linear structures in the frequency domain with high energy values at direction θ + 90°.
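The 90° relation between spatial and spectral directions can be illustrated with a synthetic pattern. The sketch below assumes NumPy and a hypothetical 64x64 image of vertical stripes (4 cycles across the image); neither comes from the original experiments.

```python
import numpy as np

n = 64
cols = np.arange(n)
# Vertical stripes: intensity varies only along the horizontal (column) axis,
# so the linear structures in the image run vertically (90 degrees).
stripes = np.tile(np.cos(2 * np.pi * 4 * cols / n), (n, 1))

P = np.abs(np.fft.fft2(stripes)) ** 2      # power spectrum, origin at index [0, 0]
peak_row, peak_col = np.unravel_index(np.argmax(P), P.shape)

# The energy sits on the horizontal frequency axis (vertical frequency 0),
# i.e. at 0 degrees, which is 90 + 90 = 180 degrees modulo 180.
print(peak_row, peak_col)
```

The peak lies at vertical frequency 0 and horizontal frequency 4 (or its mirror image), so vertically oriented structures show up as energy along the horizontal direction of the spectrum.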

Feature Extraction Algorithm (Single-resolution)
The feature extraction algorithm used in the single-resolution method can be summarized as follows:
… (consider only (u,v) satisfying formula (5)).
4. Calculate the means ($\mu_i$) and standard deviations ($\sigma_i$) of the n sectors.
The feature vector is defined as follows: $FV = (\mu_1, \sigma_1, \mu_2, \sigma_2, \ldots, \mu_n, \sigma_n)$.
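The single-resolution extraction can be sketched as follows. This is an illustrative reading, not the authors' implementation (which was written in Matlab): the function name `sector_features`, the equal-width rings, the angle convention, and the use of the mean of the power spectrum as the significance threshold are assumptions made here.

```python
import numpy as np

def sector_features(image, n_rings=3, n_wedges=4):
    """Mean and std of significant power-spectrum values per sector (sketch)."""
    n = image.shape[0]                        # assumes a square n x n image
    F = np.fft.fftshift(np.fft.fft2(image))   # origin moved to the centre
    P = np.abs(F) ** 2                        # power spectrum
    threshold = P.mean()                      # assumed: "significant" = at least the mean

    c = n // 2                                # centre index (0-based)
    rows, cols = np.indices(P.shape)
    dy, dx = rows - c, cols - c
    rho = np.hypot(dx, dy)                    # distance from the shifted origin
    theta = np.degrees(np.arctan2(dy, dx))    # angle in (-180, 180]

    right = dx >= 0                           # right half only (symmetry)
    ring = np.minimum((rho / (n / 2) * n_rings).astype(int), n_rings)
    # Shift by half a wedge and wrap into [0, 180) so that the two half-width
    # slices next to the vertical axis fall into the same "vertical" wedge.
    width = 180.0 / n_wedges
    wedge = (((theta + width / 2) % 180.0) // width).astype(int)
    wedge = np.minimum(wedge, n_wedges - 1)

    features = []
    for r in range(n_rings):                  # ring index n_rings = outside, skipped
        for w in range(n_wedges):
            values = P[right & (ring == r) & (wedge == w) & (P >= threshold)]
            mu = values.mean() if values.size else 0.0
            sigma = values.std() if values.size else 0.0
            features.extend([mu, sigma])
    return np.array(features)                 # (mu_1, sigma_1, ..., mu_n, sigma_n)
```

With the default 3 rings and 4 wedges the function returns 24 values, two per sector; for example, `sector_features(np.random.default_rng(0).random((64, 64)))` yields a vector of shape `(24,)`.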
A few local multi-resolution Fourier-based methods have been proposed for texture analysis. They mainly applied windowed Fourier transforms of different sizes (resolutions) to the original image and extracted textural features from the transformed images. The proposed technique adopts a global multi-resolution approach, in the sense that it applies the Fourier transform to images of different sizes and extracts the textural features from the various transformed images. The new multi-resolution method is therefore an extension of the single-resolution method, with a few adjustments introduced to prevent any significant deterioration of the processing time. The experiments we conducted showed that using the mean value alone, instead of the mean and standard deviation, as a discriminating feature does not deteriorate the retrieval performance and slightly improves the processing time. They also showed that, for the multi-resolution approach, using sectors does not significantly improve the performance over using wedges and rings, but slightly increases the processing time. Taking these two results into account, the following feature extraction algorithm is adopted for our multi-resolution method.

Feature Extraction Algorithm (Multi-resolution)
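The feature extraction algorithm can be summarized as follows:
1. Apply the fast Fourier transform to the original image f0(x,y) to get F0(u,v), i.e. F0(u,v) = FFT(f0(x,y)).
2. Set the resolution level k to 0.
3. While k < MAX_LEVEL, compute the level-k feature vector $FV_k = (R_k, W_k)$ from the significant power spectrum values and derive the next resolution image by halving the image size (steps a to d), then:
e. k = k + 1.
4. The feature vector FV is defined as $FV = (FV_1, FV_2, \ldots, FV_L)$, where L is the maximum number of scales considered.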
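The multi-resolution loop can be sketched as below. It is a hedged illustration rather than the authors' Matlab code: the helper names `halve_bilinear`, `level_features` and `multires_features` are hypothetical, the normalization is assumed to divide each group of energies by its sum, the threshold is assumed to be the mean of the power spectrum, and 2x2 block averaging stands in for bilinear resizing (it equals bilinear sampling at block centres, which may differ from a library resize routine).

```python
import numpy as np

def halve_bilinear(image):
    """Halve both dimensions by averaging 2x2 blocks (bilinear at block centres)."""
    h, w = image.shape[0] // 2 * 2, image.shape[1] // 2 * 2
    img = image[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2])

def level_features(image, n_rings=3, n_wedges=4):
    """Normalized mean energies of the right half-rings and wedges for one level."""
    n = image.shape[0]                           # assumes a square image
    P = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    P = np.where(P >= P.mean(), P, 0.0)          # step a: zero out insignificant values
    c = n // 2
    rows, cols = np.indices(P.shape)
    dy, dx = rows - c, cols - c
    rho = np.hypot(dx, dy)
    theta = np.degrees(np.arctan2(dy, dx))
    right = (dx >= 0) & (rho < n / 2)            # right half, inside the outer ring
    ring = (rho / (n / 2) * n_rings).astype(int)
    width = 180.0 / n_wedges
    wedge = np.minimum((((theta + width / 2) % 180.0) // width).astype(int), n_wedges - 1)
    R = np.array([P[right & (ring == r)].mean() for r in range(n_rings)])
    W = np.array([P[right & (wedge == w)].mean() for w in range(n_wedges)])
    R = R / R.sum() if R.sum() > 0 else R        # step b: normalize ring energies
    W = W / W.sum() if W.sum() > 0 else W        # step c: normalize wedge energies
    return np.concatenate([R, W])                # FV_k = (R_k, W_k)

def multires_features(image, max_level=3):
    """Concatenate the per-level vectors over max_level resolutions."""
    parts, k = [], 0
    while k < max_level:
        parts.append(level_features(image))
        image = halve_bilinear(image)            # step d: next resolution image
        k += 1                                   # step e
    return np.concatenate(parts)
```

For a 64x64 input and three levels the result has 7 x 3 = 21 components, matching the 7xL layout of three half-rings and four wedges per level.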

Similarity Measurement
Given two k-dimensional feature vectors f1 and f2, representing two images Im1 and Im2, the dissimilarity between Im1 and Im2 can be estimated using various distance metrics. The simplest and most popular one is the Euclidean distance. This metric assumes that all the components of the feature vector have the same order of magnitude, which is not the case for the feature vectors of the proposed techniques: lower-resolution images have lower energies. A normalized metric that reduces the bias towards high-magnitude components is therefore needed. A few such metrics were tested; the following two were selected because of the good retrieval results they produced (formulas (6) and (7)):
$$d(f_1, f_2) = \sum_{i=1}^{k} \frac{|f_1(i) - f_2(i)|}{|f_1(i)| + |f_2(i)|} \tag{6}$$
This metric is known in the literature as the Canberra metric; it is used to compare feature vectors produced by the single-resolution technique.
(7)
The metric defined by Eq. (7) is used to compare feature vectors produced by the multi-resolution technique, because the experiments we conducted showed that it leads to slightly better results than the Canberra metric.
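The Canberra metric named above is easy to state in code. The sketch below assumes NumPy; the function name `canberra` and the convention that a component contributes zero when both entries are zero are choices made for illustration.

```python
import numpy as np

def canberra(f1, f2):
    """Canberra distance: sum of |a - b| / (|a| + |b|) over the components."""
    f1 = np.asarray(f1, dtype=float)
    f2 = np.asarray(f2, dtype=float)
    num = np.abs(f1 - f2)
    den = np.abs(f1) + np.abs(f2)
    # Components where both entries are zero contribute 0 (assumed convention).
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return terms.sum()

# A large-magnitude and a small-magnitude component with the same relative
# difference contribute equally, unlike with the Euclidean distance.
d = canberra([1000.0, 1.0], [2000.0, 2.0])
```

Each term is bounded by 1, so no single high-energy component can dominate the comparison, which is the property the text asks of a normalized metric.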
Therefore, the feature vector of the single-resolution approach consists of 24 elements containing the means and standard deviations of the twelve sectors $S_{r_1,r_2,\theta_1,\theta_2}$ shown in Fig. 1, and the feature vector of the multi-resolution approach consists of 7xL elements containing the means of the three half-rings and the four wedges of the L images representing the L resolutions (see Fig. 2). The threshold defining significant power spectrum values is set to the mean of the values of the FFT-transformed image. The experiments showed that this value gives good results with all the images that have been tested.

Test Dataset
The dataset used in the experiments is made of 79 grayscale images selected from the Brodatz album, downloaded from [http://www.ux.uis.no/~tranden/brodatz.html].
Images that have uniform textures (i.e. a similar texture over the whole image) were selected. All the images are of size 640 x 640 pixels. Each image is partitioned into 8x8 non-overlapping sub-images, from which 4 sub-images are chosen to constitute the image database (i.e. database = 316 images) and one is used as a query image (i.e. 79 query images).
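The construction of the database and query sets can be sketched as follows. The sketch assumes that "8x8" denotes an 8-by-8 grid (64 tiles of 80 x 80 pixels) and that the five tiles are picked at random; the text does not state how they were chosen, and the array `image` is a placeholder rather than a real Brodatz texture.

```python
import numpy as np

def split_into_tiles(image, grid=8):
    """Cut an image into grid x grid non-overlapping tiles, row by row."""
    h, w = image.shape
    th, tw = h // grid, w // grid
    return [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(grid) for c in range(grid)]

image = np.zeros((640, 640))                 # placeholder for one 640 x 640 texture
tiles = split_into_tiles(image)              # 64 tiles of 80 x 80 pixels

rng = np.random.default_rng(0)
chosen = rng.choice(len(tiles), size=5, replace=False)   # assumed: random selection
database_tiles = [tiles[i] for i in chosen[:4]]          # 4 tiles go to the database
query_tile = tiles[chosen[4]]                            # 1 tile is the query

n_database = 79 * len(database_tiles)        # 79 textures x 4 tiles = 316
```

Since the query tile is distinct from the four database tiles of the same texture, each query has exactly four relevant images in the database.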

Hardware and Software Environment
We conducted all the experiments on an Intel Core 2 (2 GHz) laptop with 1 GB of RAM. The software environment consists of MS Windows XP Professional and Matlab 7.

Performance Evaluation
To evaluate the performance of the proposed approaches, we adopted the well-known efficacy formula (8) introduced by Kankanhalli et al. (1996):
$$\text{Efficacy} = \begin{cases} n/N & \text{if } N \le T \\ n/T & \text{otherwise} \end{cases} \tag{8}$$
where n is the number of relevant images retrieved by the CBIR system, N is the total number of relevant images stored in the database, and T is the number of images displayed on the screen in response to the query.
In the experiments conducted, N = 4 and T = 10, which means Efficacy = n/4. The average time needed to perform feature extraction and compare two images is also recorded; it is the average over the comparisons of each of the 79 query images with the 316 images in the database. This value is used as an indicator of the speed of the methods under investigation.
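The efficacy computation can be sketched as below, assuming the piecewise form usually attributed to Kankanhalli et al. (n/N when N ≤ T, n/T otherwise), which reduces to n/4 for the setting N = 4, T = 10 used here. The function and parameter names are hypothetical.

```python
def efficacy(n_relevant_retrieved, n_relevant_total, n_displayed):
    """Retrieval efficacy: n/N if N <= T, otherwise n/T (assumed piecewise form)."""
    if n_relevant_total <= n_displayed:
        return n_relevant_retrieved / n_relevant_total
    return n_relevant_retrieved / n_displayed

# Setting of this study: N = 4 relevant images per query, T = 10 displayed.
# A query that returns 3 of its 4 relevant sub-images scores 3/4.
score = efficacy(3, 4, 10)
```

Averaging this score over the 79 queries gives the accuracy figures of the kind reported in the results.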
The first set of experiments is designed to compare the results obtained by the proposed single-resolution method with three classical single-resolution Fourier-based techniques. The purpose of this experiment is to show the improvement obtained by adopting a sector-based segmentation of the frequency domain over the classical ring/wedge-based segmentation.
All three classical Fourier-based methods partition the Fourier space into 4 wedges and 3 rings. The first one (referred to as CF(µ)) uses the mean of the power spectrum of the resulting regions as texture feature. The second one (referred to as CF(σ)) uses the standard deviation of the power spectrum of the resulting regions as texture feature. The third one (referred to as CF(µ, σ)) uses both the mean and the standard deviation as texture features.
The second set of experiments is designed to test the performance of the new multi-resolution method. The first experiment evaluates the effect of the interpolation technique (nearest-neighbor, bilinear and bicubic) on both the accuracy and the execution time of the proposed method. The second experiment compares the performance of the multi-resolution approach (referred to as MRFFT) with the single-resolution one (referred to as SRFFT) and five other multi-resolution methods. These methods are: a Dual-Tree Complex Wavelet method using the means and standard deviations of the sub-bands, similar to the one described in (Celik and Tjahjadi, 2009) (referred to as DTCW(µ, σ)); the Rotated Wavelet Filters proposed by (Kokare et al. 2007) (referred to as RWF(µ, σ)); traditional Gabor filters using the means and standard deviations of the different sub-bands, as described in Zhang et al. 2000 (referred to as Gabor(µ, σ)); a 3-level wavelet-based method using the energy of the sub-bands as feature descriptor, similar to the one proposed by (Wouwer et al. 1999) (referred to as MRWEDb4); and a CSG method similar to the one proposed by (Huang et al. 2006) (referred to as CSGDb10).

Results and Discussion
Table 1 summarizes the results of the first set of experiments, designed to evaluate our single-resolution method. The table shows that the proposed method outperforms the classical Fourier methods in terms of accuracy (there is between 4% and 14% improvement). This can be explained by the fact that partitioning the frequency domain into sectors describes the distribution of the power spectrum in the Fourier space more accurately than rings and wedges do. The table also indicates that this gain in accuracy is obtained at the expense of an increase in the processing time required to build and compare the twelve-element-vectors of the new method (there is about 43% increase in processing time). It is worth noting, however, that the extra time is not exorbitant, since the total time is still less than that of many current retrieval techniques (as shown in Table 3).
Legend of Table 1: CF(µ) = classic Fourier method using the mean; CF(σ) = classic Fourier method using the standard deviation; CF(µ, σ) = classic Fourier method using the mean and standard deviation; SRFFT = our single-resolution technique using µ and σ of sectors.
Table 2 summarizes the results obtained in the experiment designed to identify the most appropriate interpolation technique for our multi-resolution method. As expected, nearest-neighbor interpolation is the fastest of the three approaches, but with 80% accuracy it is far behind the two other techniques. Bilinear interpolation is the best in terms of accuracy (95%) and requires less execution time than the bicubic technique (it is about 10% faster). For these reasons, bilinear interpolation was used in the proposed multi-resolution technique.
Table 3 indicates that the use of a multi-resolution approach improves the retrieval performance by 2% (from 93% to 95%) when compared to the single-resolution approach; the price to pay for better performance is an increase in the processing time (about 73%), though the new time is still reasonable.
The best method in terms of processing time is the classical wavelet-based technique (2 times faster than the proposed method), but it has a lower efficacy rate.
Figure 3 shows the first 10 retrieved images obtained by our single-resolution method and the three classical Fourier methods for the query sub-image D96-6. Figure 4 shows the retrieval results for the same query image obtained by our multi-resolution method and four other techniques (namely DTCW(µ, σ), Gabor(µ, σ), RWF and CSGDb10).

Conclusion and Future Work
This paper describes a new single-resolution Fourier-based method for retrieving texture images. The experiment that has been conducted shows that the new method outperforms several existing Fourier-based methods. The paper also describes a multi-resolution version of this method that produces better retrieval results than the single-resolution approach, at the expense of some additional processing time. The multi-resolution method outperforms several existing multi-resolution methods. Currently, two possible improvements to the proposed technique are under investigation:

Figures
Figure 1. The right half of the frequency domain is partitioned into 12 sectors, each representing a specific direction and frequency range. Note how the "vertical" wedges (1, 5 and 9) are each made of two 22.5° wide wedges.

3. while k < MAX_LEVEL do
a. Consider only pixels with significant power spectrum values (i.e. F(u,v) = 0 if F(u,v) < threshold).
b. Calculate $R_k$, the cumulated energies (means) of the right half-ring-shaped regions, and normalize them (i.e. $R_k = (r_1, r_2, \ldots, r_n)$, where $r_i$ = normalized sum of energies for ring-shaped region i).
c. Calculate $W_k$, the cumulated energies (means) of the right wedge-shaped regions, and normalize them (i.e. $W_k = (w_1, w_2, \ldots, w_m)$, where $w_i$ = normalized sum of energies for wedge-shaped region i). The feature vector for level k is therefore defined as $FV_k = (R_k, W_k)$.
d. Calculate the next resolution image by halving the image size and applying an interpolation technique to the pixel values (nearest-neighbor, bilinear, or bicubic).

Figure 2. The seven regions (3 rings and 4 wedges) considered for each resolution level in the frequency domain and their significance (multi-resolution approach)

Figure 3. Retrieval results for our single-resolution method and three versions of the classical Fourier method. Retrieved images are sorted by decreasing value of similarity score, from left to right and top to bottom.

a. Studying the modifications needed to make the proposed methods rotation-invariant, and
b. Exploring ways to perform image segmentation based on the Fourier features used by the proposed methods.