Takagi-Sugeno Neuro-Fuzzy Modeling of a Multivariable Nonlinear Antenna System

This article investigates the use of a clustering-based neuro-fuzzy system for nonlinear dynamic system modeling. It focuses on the Takagi-Sugeno (T-S) modeling procedure and on the use of fuzzy clustering to generate suitable initial membership functions. T-S fuzzy modeling has been applied to a nonlinear antenna dynamic system with two coupled inputs and outputs. Compared to other well-known approximation techniques such as artificial neural networks, the employed neuro-fuzzy system provides a more transparent representation of the nonlinear antenna system under study, mainly because its rules can be interpreted linguistically. The initial membership functions created by clustering are then used to construct suitable T-S models. Furthermore, the T-S fuzzy models have been validated with standard model validation techniques such as correlation functions. This intelligent modeling scheme is particularly useful for making complicated systems linguistically transparent in terms of fuzzy if-then rules.


Takagi-Sugeno Models
Developing mathematical models of real systems is a central topic in many disciplines of engineering and science. Models can be used for simulations, analysis of the system's behavior, better understanding of the underlying mechanisms in the system, design of new processes, or design of controllers. Takagi-Sugeno (T-S) modeling plays an essential role in deriving local linear models of the nonlinear dynamic system of concern (Lo and Chen, 1999; Takagi and Sugeno, 1985). Through the heuristic rules inherent in fuzzy systems, T-S fuzzy models make it possible to obtain a transparent system that is governed by the fuzzy inference system and its rules. Fuzzy modeling concerns methods of describing the characteristics of a system using fuzzy inference rules. Fuzzy modeling methods have a distinguishing feature in that they can express complex nonlinear systems linguistically.
*Corresponding author e-mail: ebrgallaf@eng.uob.bh
In a similar way, fuzzy clustering has been used in data-driven fuzzy modeling, since it provides a methodology for assigning labels to similar data. Such an assignment gives quantitative guidance for shaping the fuzzy membership functions. Model validation and verification is also an important task in the modeling paradigm, because the right model must be chosen from a number of models that may present similar characteristics. Statistical validation, sometimes complemented by probabilistic validation, is used to make a suitable choice of system model. In general, fuzzy control systems can be classified as either linguistic or T-S type (Lo and Chen, 1999; Mamdani and Assilian, 1975). The linguistic type of fuzzy control system is well recognized and accepted by the control community. The T-S type of fuzzy system, which is used in this article, mainly focuses on the modeling aspect. It has been reported that a T-S fuzzy system can exactly model any nonlinear system (Wang et al., 2000). The main drawback of the linguistic model compared with the T-S model is the difficulty of dealing with a multidimensional system, since a large number of fuzzy rules have to be used.
Gorzalczany et al. (2000) briefly presented and compared four neuro-fuzzy systems used for rule-based modeling of dynamic processes (the chaotic Mackey-Glass time series). The following systems were considered: NFMOD (the system they proposed), the well-known ANFIS and NFIDENT systems, and an alternative neuro-fuzzy system already reported in the literature. The main criterion of comparison was performance (modeling accuracy) versus interpretability (transparency and the ability to explain generated decisions, including analysis and pruning of the obtained fuzzy rule bases). Zhang and Knoll (1995) proposed an approach for solving multivariate modeling problems with neuro-fuzzy systems. Instead of using selected input variables, statistical indices are extracted to feed a fuzzy controller. The original input space is transformed into an eigenspace. If a sequence of training data is sampled in a local context, a small number of eigenvectors with the largest eigenvalues provide a good summary of all the original variables. Fuzzy controllers can then be trained to map the input projection in the eigenspace to the outputs. Time-series prediction was used to validate the concept.
The article of Ikonen and Kortela (2000) is concerned with process modeling using fuzzy neural networks. In Distributed Logic Processors (DLP) the rule base is parameterized. The DLP derivatives required by gradient-based training methods are given, and the recursive prediction error method is used to adjust the model parameters. The power of the approach is illustrated with a modeling example that uses NOx-emission data from a full-scale fluidized-bed combustion district heating plant. The method is general and can be applied to other complex processes as well. Bologna (2001) presented a new neuro-fuzzy model denoted the Fuzzy Discretized Interpretable Multi-Layer Perceptron (FDIMLP). Fuzzy rules were extracted in polynomial time with respect to the size of the problem and the size of the network. The model was applied to three public-domain classification problems, and FDIMLP networks compared favorably with the EFUNN and ANFIS neuro-fuzzy systems.
Ning et al. (2001) presented a fuzzy satisfactory clustering algorithm. It starts with two cluster centers and adds a new center when necessary. A system data set is quickly divided into several satisfactory fuzzy clusters by the algorithm, and a T-S type fuzzy model is then identified. Chen and Linkens (1998) introduced a three-layered RBF (Radial Basis Function) network to implement a fuzzy model. Differing from existing clustering-based methods, in their approach the structure identification of the fuzzy model, including input selection and partition validation, was implemented on the basis of a class of sub-clusters created by a self-organizing network instead of raw data. The important input variables that independently and significantly influence the system output can be extracted by a fuzzy neural network, while the optimal number of fuzzy rules can be determined separately via the fuzzy c-means clustering algorithm with a modified fuzzy entropy measure as the criterion of cluster validation. Akkizidis and Roberts (2001) proposed an algorithmic methodology for identifying and modeling nonlinear control strategies. The methodology was based on choices of different fuzzy clustering algorithms, projection of clusters and merging techniques. The best features of well-known clustering methods such as the Gustafson-Kessel and mountain methods were combined. The latter was used to determine the number and approximate positions of the cluster prototypes, whereas the former was used to define the shapes of the clusters according to the data distribution. Projecting the prototypes and variables of the clusters is a recognized approach to extracting the information contained in the data clusters into fuzzy sets. Merging these fuzzy sets, based on proposed guidelines, can minimize the number of rules and make the identified control strategy more transparent. Bossley (1997) looked into the problem of antenna modeling via neuro-fuzzy systems; however, an optimized five-layer neural network was not easily achieved because of the large number of generated fuzzy rules.

Article Contribution
The system under study is typical of the type used for oceanic satellite communication systems and has a highly nonlinear coupling between its two outputs. Hence, transparent sub-models are required. This class of multivariable system has been modeled via a classical neuro-fuzzy system in Bossley (1997). However, that approach resulted in a large number of rules and required a large number of training patterns. This research framework therefore investigates the use of clustered fuzzy rules, which allows training to be completed in less time and with fewer rules. Fuzzy sets in the antecedents of the rules are obtained from the partition matrix by projection onto the antecedent variables. The resulting point-wise fuzzy sets are then approximated by suitable parametric functions. The transparency of the antenna model obtained with this approach may be hindered by redundancy in the form of many overlapping (compatible) membership functions. Similarity measures were therefore used to assess the compatibility (pair-wise similarity) of fuzzy sets in the rule base and to detect sets that can be merged. Fuzzy sets estimated from antenna training data can also be similar to the universal set, thus adding no information to the model. Such sets were removed from the antecedents of the rules, thus reducing the number of fuzzy rules.
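The pair-wise similarity check described above can be illustrated with a small sketch. The article does not state which similarity measure was used, so the Jaccard index (with min as intersection and max as union) is an assumption here, as are the function names, the sampling grid, and the Gaussian shape of the example sets.

```python
import math

def gaussian_mf(x, center, width):
    """Membership degree of x in a Gaussian fuzzy set (illustrative shape)."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def jaccard_similarity(mu_a, mu_b):
    """Jaccard index |A and B| / |A or B| of two fuzzy sets sampled on one grid.

    Assumption: min is used for intersection and max for union.
    """
    inter = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    union = sum(max(a, b) for a, b in zip(mu_a, mu_b))
    return inter / union if union > 0.0 else 0.0

# Hypothetical universe of discourse sampled on [-3, 3].
grid = [i / 100.0 for i in range(-300, 301)]
set_a = [gaussian_mf(x, 0.0, 1.0) for x in grid]
set_b = [gaussian_mf(x, 0.1, 1.0) for x in grid]   # nearly the same as set_a
set_c = [gaussian_mf(x, 2.5, 0.3) for x in grid]   # clearly distinct
universal = [1.0] * len(grid)                      # membership 1 everywhere
wide = [gaussian_mf(x, 0.0, 50.0) for x in grid]   # almost flat set

merge_candidates = jaccard_similarity(set_a, set_b)    # high: merge
keep_separate = jaccard_similarity(set_a, set_c)       # low: keep both
adds_no_information = jaccard_similarity(wide, universal)  # near 1: drop
```

A set whose similarity to another set exceeds a chosen threshold is a candidate for merging, and a set that is nearly identical to the universal set can be dropped from the antecedent; the threshold itself is a design choice that the article does not specify.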

Intelligent Modeling
Fuzzy modeling and control are typical examples of techniques that make use of human knowledge and deductive processes. Various alternative approaches have been proposed, fuzzy logic and fuzzy set theory being among them. Artificial neural networks and fuzzy models are among the most popular model structures used. From the input-output view, fuzzy systems are flexible mathematical functions that can approximate other functions, or just data measurements, with a desired accuracy. Compared to well-known approximation techniques such as neural networks, fuzzy systems provide a more transparent representation of the system under study, mainly because of the possible linguistic interpretation in the form of rules. The logical structure of the rules facilitates the understanding and analysis of the model in a semi-qualitative manner, close to the way humans reason about the real world.
Given the state of a system and a given input, the next state $x(k+1)$ can be determined. In a discrete-time setting this can be written as

$$x(k+1) = f\big(x(k), u(k)\big),$$

where $x(k)$ and $u(k)$ are the state and the input at time $k$, respectively, and $f$ is a static function. Fuzzy models of different types can be used to approximate the state-transition function. As the state of a system is often not measured, input-output modeling is usually applied. The most common is the NARX (Nonlinear Auto-Regressive with eXogenous input) model, defined by

$$y(k+1) = f\big(y(k), \ldots, y(k-n_y+1),\; u(k), \ldots, u(k-n_u+1)\big), \qquad (3)$$

where $y(k), \ldots, y(k-n_y+1)$ and $u(k), \ldots, u(k-n_u+1)$ denote the past model outputs and inputs, respectively, and $n_y$ and $n_u$ are integers related to the model order (usually selected by the designer). For instance, a linguistic fuzzy model of a dynamic system of the type in Eq. (3) may consist of rules of the following form:

$$R_i:\ \text{If } y(k) \text{ is } A_{i1} \text{ and } \ldots \text{ and } y(k-n_y+1) \text{ is } A_{in_y} \text{ and } u(k) \text{ is } B_{i1} \text{ and } \ldots \text{ and } u(k-n_u+1) \text{ is } B_{in_u} \text{ then } y(k+1) \text{ is } C_i,$$

where $A_{ij}$, $B_{ij}$ and $C_i$ are fuzzy sets. In Eq. (3), the input dynamic filter is a simple generator of the lagged inputs and outputs, and no output filter is used. Since fuzzy models can approximate any smooth function to any degree of accuracy, models of the type in Eq. (3) can approximate any observable and controllable modes of a large class of discrete-time nonlinear systems. To facilitate data-driven optimization of fuzzy models (learning), differentiable operators (product, sum) are often preferred to the standard min and max operators. Once the structure is fixed, the performance of a fuzzy model can be fine-tuned by adjusting its parameters. The tunable parameters of linguistic models are the parameters of the antecedent and consequent membership functions (which determine their shape and position) and the rules (which determine the mapping between the antecedent and consequent fuzzy regions).
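As an illustration of how the lagged inputs and outputs of a NARX model are assembled from measured sequences, the following minimal sketch builds the regressor vectors and one-step-ahead targets for a single-input, single-output record. The function name `narx_regressors` and the list-based data layout are assumptions made for the example, not part of the original study.

```python
def narx_regressors(y, u, n_y, n_u):
    """Build regressors for y(k+1) = f(y(k),...,y(k-n_y+1), u(k),...,u(k-n_u+1)).

    y, u : equally long sequences of measured outputs and inputs.
    Returns (rows, targets): rows[i] holds the lagged outputs followed by the
    lagged inputs at some time k, and targets[i] holds y(k+1).
    """
    start = max(n_y, n_u) - 1          # first k for which all lags exist
    rows, targets = [], []
    for k in range(start, len(y) - 1):
        past_y = [y[k - i] for i in range(n_y)]   # y(k), ..., y(k-n_y+1)
        past_u = [u[k - i] for i in range(n_u)]   # u(k), ..., u(k-n_u+1)
        rows.append(past_y + past_u)
        targets.append(y[k + 1])
    return rows, targets

# Small usage example with made-up numbers.
rows, targets = narx_regressors([0, 1, 2, 3, 4], [10, 11, 12, 13, 14], n_y=2, n_u=1)
```

Each row then serves as the antecedent input of the fuzzy model, and the corresponding target is the value the model is asked to predict.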

Neuro-Fuzzy Modeling
Figure 1 shows a typical five-layer neuro-fuzzy system that can be employed to realize a rule network encoding a set of fuzzy if-then rules. Nodes in the first layer compute the membership degrees of the inputs in the antecedent fuzzy sets. The product nodes Π in the second layer represent the antecedent conjunction operator. The normalization node N and the summation node Σ realize the fuzzy-mean operator.
With smooth antecedent membership functions, such as the Gaussian function

$$\mu_{A_{ij}}(x_j; c_{ij}, \tau_{ij}) = \exp\left(-\frac{(x_j - c_{ij})^2}{2\tau_{ij}^2}\right), \qquad (8)$$

the parameters $c_{ij}$ and $\tau_{ij}$ can be adjusted by gradient-descent learning algorithms such as back-propagation. This allows the fuzzy model to be fine-tuned to the available data in order to optimize its prediction accuracy. Many structure/parameter combinations may make the fuzzy model behave satisfactorily. The problem can be formulated as that of finding the structural complexity that gives the best generalization performance. In our approach the number of rules is chosen as the measure of complexity to be tuned on the basis of the available data. We adopt an incremental approach in which architectures of different complexity (i.e. number of rules) are first assessed by cross-validation and then compared in order to select the best one.
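The layer-by-layer computation of such a network can be sketched as follows. This is a minimal illustration that assumes Gaussian antecedents, the product as conjunction, and affine (first-order T-S) consequents; the function name `ts_forward` and the parameter layout are hypothetical and do not reproduce the authors' implementation.

```python
import math

def ts_forward(x, centers, widths, consequents):
    """Evaluate a T-S fuzzy model for one input vector x.

    centers[i][j], widths[i][j] : Gaussian parameters of rule i, input j.
    consequents[i] = (a_i, b_i) : local model y_i = a_i . x + b_i (assumed form).
    Returns the model output and the normalized degrees of fulfillment.
    """
    # Layers 1 and 2: membership degrees and their product per rule.
    betas = []
    for c_i, w_i in zip(centers, widths):
        beta = 1.0
        for x_j, c, w in zip(x, c_i, w_i):
            beta *= math.exp(-((x_j - c) ** 2) / (2.0 * w ** 2))
        betas.append(beta)
    # Layer 3: normalization (assumes at least one rule fires, total > 0).
    total = sum(betas)
    gammas = [b / total for b in betas]
    # Layers 4 and 5: local models weighted by gammas and summed.
    y = 0.0
    for g, (a_i, b_i) in zip(gammas, consequents):
        y += g * (sum(a * x_j for a, x_j in zip(a_i, x)) + b_i)
    return y, gammas

# Two rules on one input; midway between the centers both rules fire equally.
y_mid, g_mid = ts_forward([0.0], [[-1.0], [1.0]], [[1.0], [1.0]],
                          [([0.0], 0.0), ([0.0], 2.0)])
```

Because every operation above is differentiable in the centers and widths, a gradient-based learning rule can adjust them, which is the property the text relies on.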

The Journal of Engineering Research Vol. 2, No. 1 (2005) 12-24

It is assumed that a set of $N$ input-output data pairs $(x_i, y_i)$, $i = 1, \ldots, N$, is available.

Hyper ellipsoidal fuzzy clusters
The architecture is initialized by a hyper-ellipsoidal fuzzy clustering procedure inspired by Babuska and Verbruggen (1995). This procedure clusters the data in the input-output domain, yielding a set of hyper-ellipsoids that give a preliminary, rough representation of the input-output mapping. The parameters of the fuzzy inference system are initialized from the outcome of the fuzzy clustering procedure. Here the axes of the ellipsoids (eigenvectors of the scatter matrix) are used to initialize the parameters of the consequent functions. The clusters are projected onto the input domain to initialize the centers of the antecedents, and the scatter matrix is used to compute the widths of the membership functions. Once the initialization is done, the learning procedure begins. In the case of linear T-S models, the minimization can be decomposed into a least-squares problem, which estimates the linear parameters of the consequent models, and a nonlinear minimization, which finds the parameters of the membership functions. The structural identification loop (the outer one) searches for the best structure, in terms of the optimal number of rules, by gradually increasing the number of local models.
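The least-squares part of this decomposition can be sketched for a single rule. The example below assumes a local (per-rule) weighted least-squares estimate in which each sample is weighted by its membership in the rule; whether the original work used local or global estimation is not stated, and the helper names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z

def local_consequents(x_rows, y, memberships):
    """Weighted least squares for one rule.

    Minimizes sum_k mu_k * (y_k - [x_k, 1] . theta)^2 and returns theta,
    i.e. the slope parameters followed by the offset.
    """
    ext = [list(row) + [1.0] for row in x_rows]      # append offset column
    p = len(ext[0])
    xtwx = [[sum(mu * r[i] * r[j] for mu, r in zip(memberships, ext))
             for j in range(p)] for i in range(p)]
    xtwy = [sum(mu * r[i] * t for mu, r, t in zip(memberships, ext, y))
            for i in range(p)]
    return solve_linear(xtwx, xtwy)

# Data that follow y = 2x + 1 exactly; any positive weights recover [2, 1].
theta = local_consequents([[0.0], [1.0], [2.0], [3.0], [4.0]],
                          [1.0, 3.0, 5.0, 7.0, 9.0],
                          [0.2, 0.9, 1.0, 0.6, 0.1])
```

With the membership functions held fixed, this step is linear in the consequent parameters, which is why it can be separated from the nonlinear tuning of the membership functions.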

Fuzzy Clustering
Identification methods based on fuzzy clustering originate from data analysis and pattern recognition, where the concept of graded membership is employed to represent the degree to which a given object, represented as a vector of features, is similar to some prototypical object. Based on that similarity, feature vectors can be clustered such that vectors within a cluster are as similar as possible, and vectors from different clusters are as dissimilar as possible. This idea of fuzzy clustering is depicted in Fig. 2. The data are clustered into two groups with prototypes $v_1$ and $v_2$, using the Euclidean distance measure. The partitioning of the data is expressed in the fuzzy partition matrix, whose elements $\mu_{ij}$ are the degrees of membership of the data points $(x_i, y_i)$ in the fuzzy cluster with prototype $v_j$. The concept of similarity of data to a given prototype leaves enough room for the choice of an appropriate distance measure and of the character of the prototype itself. Prototypes can be defined as linear subspaces, or the clusters can be ellipsoids with adaptively determined shape (Akkizidis and Roberts, 2001).
From these clusters, the antecedent membership functions and the consequent parameters of the T-S model can be extracted as follows (Bossley, 1997).
Each obtained cluster is represented by one rule in the T-S model. Membership functions for the fuzzy sets $A_1$ and $A_2$ are generated by pointwise projection of the partition matrix onto the antecedent variables. Such pointwise defined fuzzy sets are then approximated by a suitable parametric function.
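The projection and approximation steps can be sketched as follows. The pointwise set for one antecedent variable is simply the pairing of that variable's sample values with the cluster's membership degrees. Approximating it by a Gaussian through the membership-weighted mean and standard deviation is an assumption made for illustration; the article only says that a suitable parametric function is used. The function names are hypothetical.

```python
import math

def project_cluster(data, u_row, j):
    """Pointwise projection of one cluster onto antecedent variable j.

    data  : list of samples, each a list of variable values.
    u_row : membership of every sample in the cluster (one partition-matrix row).
    Returns the sample values of variable j and their membership degrees.
    """
    values = [sample[j] for sample in data]
    return values, list(u_row)

def fit_gaussian_mf(values, memberships):
    """Approximate a pointwise fuzzy set by a Gaussian (moment-matching heuristic).

    Center = membership-weighted mean, width = membership-weighted standard
    deviation. This is one simple choice, not necessarily the authors' method.
    """
    total = sum(memberships)
    center = sum(m * v for m, v in zip(memberships, values)) / total
    var = sum(m * (v - center) ** 2 for m, v in zip(memberships, values)) / total
    return center, math.sqrt(var)

# Three samples of two variables, with memberships peaking at the middle sample.
vals, mus = project_cluster([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]], [0.5, 1.0, 0.5], 0)
center, width = fit_gaussian_mf(vals, mus)
```

The fitted center and width then serve as initial values of the antecedent parameters, to be refined later by learning.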

Fuzzy Clustering Algorithm
Consider a finite set of elements $X = \{x_1, x_2, \ldots, x_n\}$. The task is to partition this collection of elements into $c$ fuzzy sets with respect to a given criterion, where $c$ is a given number of clusters. The criterion is usually to optimize an objective function that acts as a performance index of clustering. The end result of fuzzy clustering can be expressed by a partition matrix $U$ such that

$$U = [u_{ij}], \quad i = 1, \ldots, c, \quad j = 1, \ldots, n. \qquad (10)$$

In Eq. (10), $u_{ij}$ is a numerical value in $[0, 1]$ that expresses the degree to which the element $x_j$ belongs to the $i$th cluster. There are two additional constraints on the value of $u_{ij}$. First, the total membership of the element $x_j \in X$ in all classes is equal to unity, that is,

$$\sum_{i=1}^{c} u_{ij} = 1 \quad \text{for all } j = 1, 2, \ldots, n. \qquad (11)$$

Second, every constructed cluster is nonempty and different from the entire set, that is,

$$0 < \sum_{j=1}^{n} u_{ij} < n \quad \text{for all } i = 1, 2, \ldots, c. \qquad (12)$$

The general form of the objective function used in fuzzy clustering, $J(u_{ij}, v_k)$ in Eq. (13), combines the memberships $u_{ij}$, a prior weight $w(x_j)$ for each $x_j$, and $d(x_j, v_k)$, the degree of dissimilarity between the data $x_j$ and the supplemental element $v_k$, which can be considered as the central vector of the $k$th cluster. The degree of dissimilarity is defined as a measure that satisfies two assumptions:

$$d(x_j, v_k) \ge 0, \qquad d(x_j, v_k) = d(v_k, x_j). \qquad (14)$$

Based on the above background, fuzzy clustering can be precisely formulated as an optimization problem:

$$\text{minimize } J(u_{ij}, v_k), \quad i, k = 1, \ldots, c, \quad j = 1, \ldots, n, \qquad (16)$$

subject to the constraints in Eqs. (11) and (12).

One of the widely employed clustering methods based on Eq. (16) is the Fuzzy C-Means (FCM) algorithm. The objective function of the FCM algorithm is expressed in the form

$$J(u_{ij}, v_i) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, \lVert x_j - v_i \rVert^2, \quad m > 1, \qquad (17)$$

where $m$ is called the exponential weight, which influences the degree of fuzziness of the membership (partition) matrix. To solve this minimization problem, the objective function in Eq. (17) is differentiated with respect to $v_i$ (for fixed $u_{ij}$, $i = 1, \ldots, c$, $j = 1, \ldots, n$) and with respect to $u_{ij}$ (for fixed $v_i$, $i = 1, \ldots, c$), and the condition of Eq. (11) is applied, giving

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m}\, x_j}{\sum_{j=1}^{n} u_{ij}^{m}}, \quad i = 1, \ldots, c, \qquad (18)$$

$$u_{ij} = \frac{\left(1 / \lVert x_j - v_i \rVert^2\right)^{1/(m-1)}}{\sum_{k=1}^{c} \left(1 / \lVert x_j - v_k \rVert^2\right)^{1/(m-1)}}, \quad i = 1, \ldots, c, \quad j = 1, \ldots, n. \qquad (19)$$

The system described by Eqs. (18) and (19) cannot be solved analytically. However, the FCM algorithm provides an iterative approach to approximating the minimum of the objective function starting from a given position.
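The alternating iteration of the FCM algorithm can be sketched in a few lines. The version below is a plain illustration using the standard center and membership updates with the Euclidean distance; the random initial partition, the stopping rule on the largest membership change, and the function name `fcm` are assumptions of the example rather than details taken from the article.

```python
import random

def fcm(data, c, m=2.0, eps=0.01, max_iter=100, seed=0):
    """Fuzzy C-Means: returns (centers, u) with u[i][j] the membership of
    sample j in cluster i. Stops when memberships change by less than eps."""
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])
    # Random initial partition whose columns sum to one.
    u = [[0.0] * n for _ in range(c)]
    for j in range(n):
        col = [rng.random() + 1e-6 for _ in range(c)]
        s = sum(col)
        for i in range(c):
            u[i][j] = col[i] / s
    centers = []
    for _ in range(max_iter):
        # Center update: membership-weighted mean of the data.
        centers = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            tot = sum(w)
            centers.append([sum(w[j] * data[j][d] for j in range(n)) / tot
                            for d in range(dim)])
        # Membership update from inverse squared distances.
        new_u = [[0.0] * n for _ in range(c)]
        for j in range(n):
            d2 = [max(sum((data[j][d] - centers[i][d]) ** 2 for d in range(dim)),
                      1e-12) for i in range(c)]   # guard against zero distance
            inv = [(1.0 / v) ** (1.0 / (m - 1.0)) for v in d2]
            s = sum(inv)
            for i in range(c):
                new_u[i][j] = inv[i] / s
        change = max(abs(new_u[i][j] - u[i][j]) for i in range(c) for j in range(n))
        u = new_u
        if change < eps:
            break
    return centers, u

# Two well-separated groups of points (made-up data).
points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
centers, u = fcm(points, c=2, m=2.2, eps=1e-6, max_iter=200)
```

The result depends on the starting partition, so in practice the algorithm is often run from several initializations and the solution with the lowest objective value is kept.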

T-S Fuzzy Space Model
At each sample time $k$, given an operating-point condition (for example $u(k-1)$ and $y(k-1)$), a local linear fuzzy state-space model can be constructed by calculating the degree of fulfillment $\mu_i(x(k))$ of the antecedents, using the product as the fuzzy logic AND operator. The inference of the entire structure (hierarchy) due to rule $i$ results in a local linear sub-model. In order to employ Quadratic Programming for systems that depend on the current as well as the previous inputs, it is necessary to construct a state-space representation in which the state vector $x(k)$ accommodates not only the state variables appearing in $y(k)$ but also the previous inputs, with the offset as its last element. This results in a system with only current inputs, but leads to a more complex A-matrix. The latter also contains the $\eta$ parameters corresponding to the previous inputs. If the maximal delay in input $i$, $i = 1, \ldots, n_i$, is $u_{i,d\max}$, then the number of additional columns is determined by these maximal delays. The ones in $C$ are positioned such that $y_1(k) = x_1(k)$. At any time index $k$, the control signal $u(k-1)$ is used initially. However, after the optimization, $u(k)$ is available and can be used in subsequent iterations.
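How local linear sub-models are combined at a given operating point can be sketched as follows. The example assumes the common T-S convention in which the local $A$ and $B$ matrices are averaged with the normalized degrees of fulfillment; the exact matrices and state layout of the original model are not recoverable from the text, so the function names and the scalar numbers below are purely illustrative.

```python
def blend_models(local_models, degrees):
    """Blend local (A, B) matrix pairs with normalized degrees of fulfillment.

    local_models : list of (A_i, B_i), each matrix a list of rows.
    degrees      : degree of fulfillment of each rule at the operating point.
    Assumes the degrees do not all vanish.
    """
    total = sum(degrees)
    weights = [d / total for d in degrees]

    def mix(mats):
        rows, cols = len(mats[0]), len(mats[0][0])
        return [[sum(w * mat[r][c] for w, mat in zip(weights, mats))
                 for c in range(cols)] for r in range(rows)]

    a = mix([model[0] for model in local_models])
    b = mix([model[1] for model in local_models])
    return a, b

def step(a, b, x, u):
    """One step of x(k+1) = A x(k) + B u(k) for the blended model."""
    return [sum(a[r][c] * x[c] for c in range(len(x))) +
            sum(b[r][c] * u[c] for c in range(len(u)))
            for r in range(len(a))]

# Two scalar sub-models with made-up parameters, blended 25% / 75%.
a_mat, b_mat = blend_models([([[1.0]], [[0.0]]), ([[0.0]], [[2.0]])], [0.25, 0.75])
x_next = step(a_mat, b_mat, [4.0], [2.0])
```

The blended pair is a linear model valid around the current operating point, which is what makes it usable inside an optimization such as Quadratic Programming.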

Correlation Tests
Traditionally, more rigorous statistical validation tests are employed in which the model residuals are examined; if they are found to be sufficiently correlated with a function of the data, the model is judged inadequate. This is achieved by defining a matrix $Z$, Eq. (31), in which $x_t$ is the observational vector of inputs, outputs and errors seen up to time step $t$, and $m(t)$, Eq. (32), represents the degree of dependency of the two training signals $y(t)$ and $u(t)$. Here $m(t-1)$ is a monomial of the vector $x_t$, Eq. (33). Two hypotheses are then defined: a null hypothesis $H_0$, under which the residuals are uncorrelated with the chosen function of the data, and its alternative $H_1$. The purpose of validation is to use the data to decide whether $H_0$ holds. Two different test statistics have been proposed in the literature. The most common is the standard sample correlation measure $\rho(k)$, Eq. (34), evaluated over the delays up to $t_d$. If $H_0$ holds, this statistic asymptotically approaches a normal distribution, and $H_0$ is accepted if $\rho(k)$ lies within the 95% confidence limits. An alternative statistic $d$ is given by Eq. (35); when $H_0$ holds, it asymptotically follows a $\chi^2(s)$ distribution, where $s$ is the number of delays $t_d$. For a given acceptance level (typically 95%) a critical point is found, and if elements of $d$ are outside this acceptance region, $H_0$ is rejected.

Antenna System (Input-Output Training Pattern)
To test the proposed neuro-fuzzy methodologies further, they are applied to model a realistic nonlinear dynamical system. The system considered is the nonlinear multi-input multi-output (MIMO) dynamics of an antenna system with two coupled inputs and outputs. A data set containing 500 samples of training patterns was produced by applying random torques to the different channels, with a suitable sampling rate and amplitudes drawn from a uniform random distribution in the range (-1.5, +1.5) N·m.

Antenna system
A coupled two-degree-of-freedom satellite dish, typical of the type used for oceanic satellite communication systems, is presented.
The behavior of the antenna is described by nonlinear, idealized, time-invariant state-space equations in which $\phi$ is the azimuth angle, $\psi$ is the elevation angle, $b_\phi$ and $b_\psi$ are the associated friction coefficients, and $T_\phi$ and $T_\psi$ are the torques applied to the axes. To produce a more realistic simulation, the outputs are corrupted by additive Gaussian noise, $[e_\phi(t)\ e_\psi(t)]^T$, representing a crude approximation to measurement noise. The azimuth is permitted to turn through a complete revolution, while end stops restrict the elevation to the interval $[0, \pi]$. In this antenna there are essentially two sources of nonlinearity: the end stops on the elevation, and the non-isotropic moment of inertia tensor. Indeed, when isotropy is present the state-space equations are linear. The strength of this nonlinearity depends on the degree of anisotropy and the angular velocities of the antenna. The applied torques are chosen to emulate typical operating conditions. The block diagram used to produce the identification data was simulated in SIMULINK/MATLAB, using the set of nonlinear differential equations that describe the antenna system. Half of the training pattern was used to model the dynamic system, whereas the other half was used to validate the resulting fuzzy models. The antenna input-output responses for a typical sequence of training data are shown in Fig. 3.
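The excitation and the split of the record into identification and validation halves can be sketched as follows. Only the 500-sample length, the two torque channels, the uniform amplitude range and the half-and-half split come from the text; the function names and the seed are hypothetical, and the antenna dynamics themselves are not reproduced here.

```python
import random

def make_torque_sequence(n_samples=500, amplitude=1.5, seed=1):
    """Random torque pairs (azimuth channel, elevation channel), each drawn
    uniformly from [-amplitude, +amplitude]."""
    rng = random.Random(seed)
    return [(rng.uniform(-amplitude, amplitude), rng.uniform(-amplitude, amplitude))
            for _ in range(n_samples)]

def split_half(samples):
    """First half for identification, second half for validation."""
    mid = len(samples) // 2
    return samples[:mid], samples[mid:]

torques = make_torque_sequence()
train, validate = split_half(torques)
```

In the study these torques would drive the simulated antenna, and the recorded angles (with added measurement noise) would be split in the same way.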

Neuro-Fuzzy Modeling
Neuro-fuzzy modeling is applied to the problem of identifying a discrete model of the antenna. A fuzzy model can be constructed from data by using the output of the clustering algorithm and by constructing regressors that form the inputs to the neuro-fuzzy network. Hence a conventional linear difference model is constructed whose regressors contain previous inputs and outputs, Eqs. (39) and (40). Fuzzy IF-THEN rules can be extracted by projecting the clusters onto the axes, with the membership functions of the fuzzy sets generated by pointwise projection of the partition matrix onto the antecedent variables. The consequent parameters for each rule are then obtained as least-squares estimates. Once an initial structure is obtained through clustering, the membership functions and the consequent parameters are tuned with respect to a chosen cost function through the learning procedure of the neuro-fuzzy network.

Membership Functions and Associated Fuzzy Rules
As a result, the membership functions of all the inputs (regressors) and outputs are shown in Fig. 5 for the azimuth angle. The antenna system has six inputs (in terms of the fuzzy model) and two outputs; hence two groups of seven sets of MFs are shown. Each universe of discourse (set) has three membership functions.
As discussed in Section 3, fuzzy modeling of a dynamical system can be achieved by clustering the training data. Fig. 3 shows the input-output training pattern employed. For this simulation example, clustering was applied to the antenna training pattern. Fig. 4 shows the training pattern after applying the clustering algorithm and clearly illustrates the three clusters and their associated centers. To reduce the number of fuzzy rules while preserving the model accuracy, the number of clusters was chosen to be three. The fuzziness parameter $m$ was kept at 2.2, with a termination criterion $\epsilon = 0.01$. The result of the clustering algorithm is the fuzzy partition matrix and the matrix of cluster centers, which are used to construct the fuzzy model of the antenna system.

Fuzzy Sub-models:
The T-S fuzzy model presented has been used to identify the nonlinear antenna system. As mentioned before, the number of rules in the T-S fuzzy model equals the number of clusters in the product space. The consequent of each rule is a local model that approximates the output of the real function over the range of $x$ for which the rule is applicable. The modeling procedure yields three rules, each with a local linear state-space sub-model, for the azimuth and elevation angles. The C and D matrices are common to all three fuzzy sub-models, and the D matrix is equal to zero. The elevation angle dynamics have the same structure. The antenna simulation system incorporating the three models is shown in Fig. 6, in which the actual antenna output is superimposed on the evaluated fuzzy model output. From the figure, it is apparent how closely the fuzzy model output resembles the actual system output.

Fuzzy Sub-models Validation
Figure 7 displays the cross-correlation function of the error signal with the first input signal of the antenna. The correlation in the figure lies within the confidence interval, which indicates that the two signals are not correlated. To further investigate the constructed local linear sub-models of the antenna, Figure 8 shows the linearized sub-models obtained over the antenna time response. In terms of the antenna's nonlinear behavior, the entire operating region has been sub-divided into a number of local models that could be employed for further control synthesis. As the antenna response shows, fuzzy models are useful for describing the antenna dynamics where the underlying physical mechanisms are not completely known and the antenna behavior is understood only in qualitative terms. An important property of fuzzy models is thus their capability to represent nonlinear dynamic systems. The obtained fuzzy sub-models can therefore also be applied to systems that are well understood but, because of their nonlinearities, are intractable with standard linear methods. The rule-based structure of fuzzy models allows heuristic knowledge to be integrated with information obtained from measurements.
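A cross-correlation check of this kind can be sketched as follows. The example uses the normalized sample cross-correlation between the residuals and an input signal, and the commonly used large-sample 95% band of $\pm 1.96/\sqrt{N}$; the exact normalization and band used for Fig. 7 are not given in the text, so both are assumptions, as are the function names.

```python
import math

def cross_correlation(residuals, signal, max_lag):
    """Normalized sample cross-correlation between signal(t - lag) and
    residuals(t) for lag = 0..max_lag. Both sequences are mean-centered.
    Assumes neither sequence is constant."""
    n = len(residuals)
    mean_e = sum(residuals) / n
    mean_u = sum(signal) / n
    e = [v - mean_e for v in residuals]
    u = [v - mean_u for v in signal]
    norm = math.sqrt(sum(v * v for v in e) * sum(v * v for v in u))
    return [sum(u[t - lag] * e[t] for t in range(lag, n)) / norm
            for lag in range(max_lag + 1)]

def within_band(rho, n):
    """True if every correlation value lies inside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(n)
    return all(abs(r) <= bound for r in rho)

# Made-up sequences: the first pair is orthogonal, the second is identical.
e_seq = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
u_seq = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]
rho_uncorrelated = cross_correlation(e_seq, u_seq, 0)
rho_identical = cross_correlation(u_seq, u_seq, 0)
```

Values that stay inside the band at all lags are consistent with residuals that carry no remaining dependence on the input, which is the interpretation given to Fig. 7.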

Conclusions
This article has concentrated on the modeling of nonlinear dynamic systems using the well-known Takagi-Sugeno (T-S) fuzzy modeling paradigm. T-S models depend heavily on the initial membership centers on the universes of discourse of the fuzzy variables used; these centers have been obtained with a clustering algorithm. Once the centers are computed, the fuzzy system establishes its initial membership functions, which are then updated by a neural-network learning mechanism. One advantage of T-S modeling is that systems can be modeled with few rules, and consequently with few linear sub-models. This overcomes the problem of the large number of rules in fuzzy modeling. The fuzzy models have also been verified and validated with standard validation techniques, which clearly showed the ability of T-S techniques to model nonlinear systems with a good degree of accuracy.
Figure 1. A five-layer neuro-fuzzy network architecture

Figure 5.
Figure 3. Input-Output data training pattern

Figure 7.
Figure 6. Fuzzy model responses (azimuth and elevation angles) compared to the antenna outputs