Determinants of Contraceptive Practice in Oman by Using Multilevel Modeling

In this paper, multilevel logistic regression models are developed for examining the hierarchical effects of contraceptive use and its selected determinants in Oman using the 2008 Oman National Reproductive Health Survey (ONRHS). Single level and multilevel logistic regression models are compared to examine the plausibility of multilevel effects of contraceptive use. The multilevel logistic regression analysis shows that there is real multilevel variation among contraceptive users in Oman. The results indicate that a multilevel logistic regression model fits the data better than ordinary multiple logistic regression models. Generally, this study revealed that women’s age, education, number of living children and region of residence are important factors that affect contraceptive use in Oman. The regional variation in the effects of women's age, women's education and number of living children further implies that there exist considerable differences in modern contraceptive use among regions, and that a model with a random coefficient or slope is more appropriate to explain the regional variation than a model with fixed coefficients or without random effects. The study suggests that researchers should use multilevel models rather than traditional regression methods when their data structure is hierarchical.


Introduction
Stratified multistage cluster sampling designs are the norm for most national level socioeconomic, demographic and health surveys because of cost, time and efficiency considerations. As these survey designs are hierarchical in nature, data collected in these surveys often introduce multilevel dependency or correlation among the observations that can have implications for model-based statistical inference, although these samples provide efficient estimates of the descriptive population quantities [1][2][3][4]. The individual observations in hierarchical data structures are not completely independent, and the results of the analysis can be affected by this clustered structure of the underlying data. For example, households from the same community or geographical area are more homogeneous than households in different communities.
Standard statistical techniques such as regression models have problems dealing with the hierarchical data structure because they assume independent and normally distributed errors with a constant variance (homoscedasticity) [1][5][6][7][8]. Analyzing variables from different levels without taking into account the hierarchical data structure leads to misleading estimation results and interpretation because one faces the problem of heteroscedasticity. To overcome these problems, the methodology of multilevel modeling has been developed; it takes the hierarchical data structure into account and does not rely on the assumptions of independence and homoscedasticity, which leads to a more appropriate analysis and interpretation of the data. Multilevel models are useful in determining the direct effects of the individual and the group explanatory variables. There is an increasing use of multilevel models for the analysis of clustered data. These models are also referred to as mixed effects models, random effects models, or hierarchical models in the literature [1][2][3][4][9][10]. Many popular statistical packages, such as HLM, MLwiN, R, SAS, Stata, and SPSS, have the capacity to fit multilevel models. Methodological work on analyzing multilevel models has been done by many authors [e.g., [5][6][7][11]], who give an illustrative introduction to multilevel models.
The primary objective of this paper is to develop an analytical framework of multilevel analysis for studying contraceptive use in Oman utilizing the 2008 Oman National Reproductive Health Survey (ONRHS) data, which is hierarchical in nature as it followed a multistage cluster sampling design. Although the hierarchical effect of contraceptive use in Oman is still unknown, information regarding this would be very important for policy purposes.

Data
The data used in this study come from the 2008 Oman National Reproductive Health Survey (ONRHS). The 2008 ONRHS was tagged on to the World Health Survey in Oman (OWHS) in 2008. The 2008 OWHS was designed by the World Health Organization (WHO) as part of comprehensive standardized data collection on population health in different countries. Both the ONRHS and OWHS were implemented by the Ministry of Health of Oman. The details of the OWHS may be seen in [12].
The ONRHS used a sub-sample of 3,703 Omani households out of 5,464 households selected for the 2008 OWHS. The ONRHS considered ever-married Omani women of reproductive age (15 to 49 years). Ultimately 4,560 women were successfully interviewed, and they constitute our study sample. The sample was calculated to provide national estimates, as well as separate estimates for both rural and urban areas.
The sample of the 2008 ONRHS was selected following a multistage stratified cluster sampling design. Stratification was based on two factors: level of urbanization (urban/rural) and geographical distribution. Administratively, Oman was divided into 10 regions at the time of the survey (currently 11 regions). The survey covered all 10 regions of Oman, divided into 10 urban and 10 rural strata. Equal samples were selected from all strata, with increased samples from urban Muscat. From each stratum, 10 clusters were selected randomly. The 2003 population census 'Enumeration Areas' (EAs) were considered as clusters. The 2003 census defined the EA as the area assigned to one enumerator. There are a total of 3167 EAs in the country. Each EA contains on average 110 households (both Omani and non-Omani) arranged in two or more enumeration blocks (EBs). Twenty-four households were selected systematically from the selected EAs. EAs were used as the primary sampling units (PSUs).
As the 2008 ONRHS data have a hierarchical structure, the hierarchy for our study treats individuals as level 1 and regions as level 2. The survey considered non-pregnant currently married women as the respondents for current contraceptive use. After excluding currently pregnant and widowed or divorced women, we were left with 3056 women as our study population.

Study Variables
The study considered current use of contraceptive methods by currently married non-pregnant women of reproductive age as the dependent variable. Current use covers both modern and traditional methods of contraception. The variable is categorical, and we dichotomized it as follows: women currently using any of the methods were coded as 1 and those not currently using any method were coded as 0.
The explanatory variables considered for this analysis include age of the women, region of residence, number of living children, level of education, place of residence, type of family and economic status or wealth quintiles. A shortcoming of the dataset used is that it contains no information on household income or wages, household expenditure patterns and community characteristics. However, the dataset contains information about household fixed and durable assets such as TV, refrigerator, car, house etc. These variables were used to construct a proxy index of household economic status, or wealth index, using principal components analysis. It is a composite measure of the cumulative living standard of a household, which places individual households on a continuous scale of relative wealth. The wealth index is divided into four categories, with the lowest one representing the poorest 25 percent and the highest one representing the wealthiest 25 percent of households (see [13]). This wealth index has the advantage of providing a reasonably reliable measure of the household's economic status, and it is not affected by the endogeneity and transitory nature of labour income.
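The construction of an asset-based wealth index of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure applied to the ONRHS data: the asset indicators are simulated, the function name `wealth_quartiles` is hypothetical, and the first principal component is obtained by power iteration on the correlation matrix of the standardized indicators.

```python
import random


def wealth_quartiles(assets, iters=500):
    """Score households on the first principal component and cut into 4 groups."""
    n, k = len(assets), len(assets[0])
    # Standardize each asset indicator (mean 0, standard deviation 1).
    means = [sum(row[j] for row in assets) / n for j in range(k)]
    sds = [(sum((row[j] - means[j]) ** 2 for row in assets) / (n - 1)) ** 0.5
           for j in range(k)]
    z = [[(row[j] - means[j]) / sds[j] for j in range(k)] for row in assets]
    # Correlation matrix of the indicators.
    corr = [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(k)]
            for a in range(k)]
    # Power iteration: converges to the leading eigenvector (first component).
    v = [1.0] * k
    for _ in range(iters):
        w = [sum(corr[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Household score = weighted sum of standardized indicators.
    scores = [sum(v[j] * r[j] for j in range(k)) for r in z]
    # Rank households by score and split the ranking into four equal groups.
    order = sorted(range(n), key=lambda i: scores[i])
    quartile = [0] * n
    for rank, i in enumerate(order):
        quartile[i] = rank * 4 // n + 1  # 1 = poorest 25%, 4 = wealthiest 25%
    return v, scores, quartile


# Simulated (hypothetical) ownership of four assets for 400 households.
random.seed(0)
households = []
for _ in range(400):
    latent_wealth = random.gauss(0, 1)
    households.append([1 if latent_wealth + random.gauss(0, 1) > t else 0
                       for t in (-1.0, -0.5, 0.0, 0.5)])

loadings, scores, quartile = wealth_quartiles(households)
print("loadings:", [round(x, 3) for x in loadings])
print("households per quartile:", [quartile.count(q) for q in (1, 2, 3, 4)])
```

Because many households share the same asset profile, ties in the score are broken by the sort order here; a real analysis would need an explicit rule for ties.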

Multilevel logistic regression model
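As mentioned earlier, our basic data structure has two levels, with individual women as level 1 and region of residence as level 2, so we concentrate on a two-level logistic regression model. Our outcome variable $y_{ij}$ is a binary response for individual $i$ ($i = 1, 2, \ldots, n_j$) in group $j$ ($j = 1, 2, \ldots, k$), and we have one explanatory variable $x_{ij}$ at the individual level. Let $p_{ij} = P(y_{ij} = 1)$ be the probability that the response equals one; $p_{ij}$ can be modeled using a logit link function, where $y_{ij}$ has a Bernoulli distribution. Then we can write the two-level model as
$$\log\left(\frac{p_{ij}}{1 - p_{ij}}\right) = \beta_0 + \beta_1 x_{ij} + u_j. \qquad (1)$$
This is a formal model, where $j$ refers to the level 2 unit (region) and $i$ to the level 1 unit (woman), such that $u_j$ is the random effect at level 2; a standard logistic regression model is the same model without $u_j$. Conditional on $u_j$, the $y_{ij}$ are assumed to be independent, and Model (1) can be written as follows, splitting it into two models, one for level 1 and the other for level 2:
$$\log\left(\frac{p_{ij}}{1 - p_{ij}}\right) = \beta_{0j} + \beta_1 x_{ij} \quad \text{and} \quad \beta_{0j} = \beta_0 + u_j, \quad \text{with } u_j \sim N(0, \sigma^2_{u0}).$$
Model (1) is the simplest possible multilevel model for binary data.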
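The way a region-level random effect shifts individual probabilities in a two-level random intercept logistic model can be illustrated with a short sketch. The parameter values, region labels and function names below are hypothetical and chosen only for illustration; they are not estimates from the ONRHS data.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def p_use(beta0, beta1, x_ij, u_j=0.0):
    """P(y_ij = 1) when logit(p_ij) = beta0 + beta1 * x_ij + u_j."""
    return inv_logit(beta0 + beta1 * x_ij + u_j)


# Hypothetical fixed effects and region-level random effects.
beta0, beta1 = -0.9, 0.5
region_effects = {"region A": -0.3, "region B": 0.0, "region C": 0.3}

for region, u_j in region_effects.items():
    p0 = p_use(beta0, beta1, 0, u_j)
    p1 = p_use(beta0, beta1, 1, u_j)
    print(f"{region}: p(x=0) = {p0:.3f}, p(x=1) = {p1:.3f}")
```

Setting `u_j = 0` for every region gives the standard single level logistic regression, in which two women with the same covariate value have the same probability regardless of region.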

Differentials of contraceptive use: bivariate analysis
Table 1 presents the percentage distribution of currently married and non-pregnant women using any family planning method by selected socio-economic and demographic factors. The survey data indicate that 29% of the eligible women were using a family planning method. Women's age shows a significant (p < 0.01) association with contraceptive use. The association is curvilinear, as contraceptive use increases with the age of women until 35 years and then decreases. The contraceptive use rate is higher among women living in urban areas than among their rural counterparts (30% vs. 26%), but the difference is not statistically significant. However, the contraceptive use rate varies significantly among the 10 geographical regions considered in this study. Musandam (45%) followed by Muscat (38%) show the highest prevalence of contraceptive use and Al-Dhakhlia (18%) the lowest. Women's education and number of living children show significant association with contraceptive use in Oman.

Single level logistic regression analysis
The single level standard logistic regression analysis results are reported in Table 2. The results indicate that women's age, educational level, number of living children, and region of residence are significant predictors of contraceptive use in Oman. After controlling for other factors, the age of women shows a significant negative association with contraceptive use. Young women aged 15-29 are 2 times more likely to use some family planning method than their older counterparts.
Married women who had secondary education were 2.8 times more likely to use contraception than uneducated women. Similarly, women who had college, university or higher education were 3.1 times more likely to use contraception than uneducated women. The likelihood of contraceptive use by married women who had 5-7 children was 3.7 times greater than by married women who had 0-2 children. Similarly, married women who had more than seven children were 3.4 times more likely to use contraception than married women who had 0-2 children.
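Since these figures come from a logistic regression, they are odds ratios, so "times more likely" refers to odds rather than probabilities. The sketch below converts the reported odds ratios to log-odds coefficients and shows their effect on a probability. The baseline probability of 0.20 and the function name `prob_after_odds_ratio` are hypothetical, used only to illustrate the conversion.

```python
import math


def prob_after_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained by multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Odds ratios quoted in the text (relative to the reference categories).
reported = {
    "secondary education": 2.8,
    "college/university or higher": 3.1,
    "5-7 living children": 3.7,
    "more than 7 living children": 3.4,
}

baseline = 0.20  # hypothetical probability in the reference category

for label, odds_ratio in reported.items():
    coef = math.log(odds_ratio)  # coefficient on the log-odds scale
    p = prob_after_odds_ratio(baseline, odds_ratio)
    print(f"{label}: OR = {odds_ratio}, log-odds = {coef:.3f}, "
          f"probability {baseline:.2f} -> {p:.3f}")
```

With this hypothetical baseline, an odds ratio of 2.8 roughly doubles the probability rather than multiplying it by 2.8, which is why the distinction matters when the outcome is not rare.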

Multilevel logistic regression analysis of contraceptive use
In this study we consider multilevel models allowing between-region variance of modern contraceptive use. The data have a two-level hierarchical structure with women at level 1, nested within 10 regions at level 2. The multilevel logistic regression analysis was used for comparisons and analysis of the effects of socio-economic and demographic variables on modern contraceptive use among women, and to estimate any residual correlations across regions in predicting modern contraceptive use in Oman. The multilevel modeling process was performed stepwise. The first step examined the null model (or empty model) of overall probability of modern contraceptive use without adjustment for predictors. The second step included both the analysis of single and multilevel models for random intercept and fixed slope multilevel analysis. The third step considered a model for two-level random intercept and random slope (random coefficient) multilevel logistic regression analysis.

Multilevel logistic regression model comparison
We compare the three possible multilevel logistic regression models (empty model, random intercept model and random slope model) using a deviance-based chi-square value and the Akaike information criterion (AIC). The deviances of the empty model with random intercept (deviance = 3634) and the random intercept and fixed slope model (deviance = 3403) indicate that the random intercept and fixed slope model fits better than the empty model. In addition, the AIC value of the empty model with random intercept (AIC = 3638) is larger than that of the random intercept and fixed coefficient model (AIC = 3419), which implies that the random intercept and fixed slope model is better than the empty model with random intercept in predicting contraceptive use across regions.
The deviances of the random intercept and fixed slope model (deviance = 3403) and the random slope model (deviance = 3386) show that the random slope model is better than the random intercept and fixed slope model. The AIC value of the random coefficient model (AIC = 3406) is smaller than that of the random intercept and fixed coefficient model (AIC = 3419), implying that the random coefficient model is better than the random intercept and fixed slope model in describing contraceptive use status (see Table 3).
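The comparison can be reproduced from the reported deviances and AIC values with a few lines of code. This is a sketch under two assumptions that the text does not state: that the AIC was computed as deviance + 2k, where k is the number of estimated parameters, and that the difference in k gives the degrees of freedom of each deviance test. The helper names are hypothetical.

```python
import math

# Deviance and AIC values quoted in the text, in fitting order.
models = {
    "empty (random intercept only)": {"deviance": 3634, "aic": 3638},
    "random intercept, fixed slope": {"deviance": 3403, "aic": 3419},
    "random coefficient": {"deviance": 3386, "aic": 3406},
}


def implied_params(model):
    """Number of parameters implied by AIC = deviance + 2k (an assumption)."""
    return (model["aic"] - model["deviance"]) // 2


def chi2_sf_even(x, df):
    """Upper-tail chi-squared probability; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


names = list(models)
for smaller, larger in zip(names, names[1:]):
    drop = models[smaller]["deviance"] - models[larger]["deviance"]
    df = implied_params(models[larger]) - implied_params(models[smaller])
    p_value = chi2_sf_even(drop, df)
    print(f"{smaller} -> {larger}: deviance drop = {drop}, df = {df}, "
          f"p = {p_value:.3g}")
```

Under these assumptions the implied parameter counts are 2, 8 and 10, and both deviance drops (231 and 17) are large relative to their degrees of freedom, in line with the ordering given by the AIC.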

Empty random intercept logistic regression analysis
The simplest non-trivial specification of the hierarchical linear model is one in which only the intercept varies between level 2 units and no explanatory variables are entered in the model. A random intercept or variance components model allows the overall probability of contraceptive use to vary across regions. An intercept-only multilevel model has been estimated to predict the probability that a woman uses contraception; this is described in Equation (8), where j refers to the level 2 unit (region) and i to the level 1 unit (woman); without the regional random effect it is still a single level model. The results of the multilevel empty model are presented in Table 4. The odds of using contraception, based on the ML estimate from the standard logit model, are exp(-0.900) = 0.407, which equals the sample ratio of 884 women who used contraception to 2173 women who did not. These are the odds when no predictors are considered in the model. The odds are estimated to be exp(-0.917) = 0.400, exp(-0.938) = 0.391, exp(-0.928) = 0.395, and exp(-0.931) = 0.394 from the multilevel model by the MQL-1, MQL-2, PQL-1 and PQL-2 estimation methods, respectively. Comparing these values shows that, relative to every multilevel method, the standard logistic model has overestimated the odds.
From Table 4, it can be seen that the results of the standard logistic model differ from those of the multilevel logistic model. By failing to take into account the clustering within regions (level 2), the standard logit model has overestimated the intercept (the log-odds) by about 1.85% ((-0.900 - (-0.917)) × 100/0.917), 4.05%, 3.02% and 3.33% compared to the multilevel model estimated by MQL-1, MQL-2, PQL-1 and PQL-2, respectively. The fixed intercepts are similar across all the multilevel estimation methods, and the random quantity, including its standard error, at the cluster level for PQL-1 and PQL-2 is high compared to the other methods. The likelihood ratio statistic for testing the null hypothesis that $\sigma^2_{u0} = 0$ can be calculated by comparing the two-level model with the single-level model without the level 2 random effects. The difference in -2 log-likelihood between the empty model without random effect (from the single level logistic regression analysis) and the empty model with random effect from Table 4 is 3979 - 3634 = 345, which constitutes strong evidence that the between-region variance is non-zero. This implies that the empty model with random effect is better than the empty model without random effects; the decrease in deviance of 345 shows that the multilevel model fits the data considerably better than the standard logistic regression. In addition, the variance of the random intercepts ($\sigma^2_{u0}$ = 0.107, p-value = 0.0116) reveals that there is a significant difference in contraceptive use across the regions.
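The arithmetic behind these comparisons can be checked directly from the reported intercepts. The sketch below uses only the values quoted in the text; the variable names are illustrative.

```python
import math

# Intercept estimates quoted in the text (log-odds scale).
single_level = -0.900
multilevel = {"MQL-1": -0.917, "MQL-2": -0.938, "PQL-1": -0.928, "PQL-2": -0.931}

# Odds implied by the single level intercept versus the raw sample odds.
odds_single = math.exp(single_level)
sample_odds = 884 / 2173
print(f"single level odds = {odds_single:.3f}, sample odds = {sample_odds:.3f}")

# Odds under each multilevel method and the relative gap between intercepts.
pct_gap = {}
for method, b0 in multilevel.items():
    odds = math.exp(b0)
    pct_gap[method] = (single_level - b0) * 100 / abs(b0)
    print(f"{method}: odds = {odds:.3f}, intercept gap = {pct_gap[method]:.2f}%")

# Likelihood ratio statistic: deviance without minus deviance with random effect.
lr_statistic = 3979 - 3634
print("likelihood ratio statistic =", lr_statistic)
```

The percentages are computed on the intercepts (log-odds), not on the exponentiated odds, which is what the formula given in the text evaluates.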
The empty model with random effect also helps to quantify the between-region variation through the intra-class correlation coefficient (ICC). The ICC is estimated by $\mathrm{ICC} = \sigma^2_{u0} / (\sigma^2_{u0} + \pi^2/3)$, where $\pi^2/3 \approx 3.29$ is the variance of the standard logistic distribution. With $\sigma^2_{u0} = 0.107$, ICC = 0.0315, which implies that 3.15% of the variation in contraceptive use is attributable to unobserved regional characteristics.
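The ICC calculation can be verified with a short sketch, assuming the usual latent-variable formulation in which the level 1 variance is fixed at that of the standard logistic distribution, π²/3. The function name is illustrative.

```python
import math


def icc_logistic(var_u0):
    """ICC for a two-level random intercept logistic model.

    The level 1 variance is taken as pi^2 / 3, the variance of the
    standard logistic distribution.
    """
    return var_u0 / (var_u0 + math.pi ** 2 / 3)


icc_empty = icc_logistic(0.107)  # intercept variance of the empty model
print(f"ICC = {icc_empty:.4f}")  # ICC = 0.0315
```

The same function applies to the intercept variance of any later random intercept model, where it gives the residual ICC after adjusting for the covariates.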

Multilevel random intercept and fixed coefficient model analysis
The random intercept and fixed coefficient logistic regression is the same as the empty random intercept logistic regression model, but with additional explanatory variables to allow for the effects of women's age, educational level, region, number of living children, living arrangements and economic status.
In the random intercept and fixed coefficient multilevel logistic regression model, we allowed the probability of contraceptive use to vary across regions, but we assumed that the effects of the explanatory variables are the same for each region. That is, the random intercept varies across regions, but the effects of the woman-level explanatory variables are fixed across regions. The results are presented in Table 5.
For the single level model, place of residence and women's education were found to be significant determinants of variation in contraceptive use in all regions with respect to their reference categories, whereas this is not the case under the multilevel model (see Table 5). The estimated coefficients and odds-ratios have a similar interpretation to those in the ordinary logistic regression discussed above. The p-values differ between the single level and multilevel models for the same explanatory variables, such as place of residence and economic status (poor); these were not significant under the single level model but are significant under the multilevel model. This is because analyzing variables from different levels without taking into account the hierarchical data structure leads to misleading estimation results, as one faces the problem of heteroscedasticity. The individual observations in a hierarchical data structure are not completely independent, and the results of the analysis can be affected by the clustered structure of the underlying data. Women in the same region are more homogeneous than women in different regions.
The results in Table 5 indicate that women's age, educational level, number of children, and living arrangements were found to be significant determinants of variation in contraceptive use in all regions with respect to their reference categories, whereas place of residence was not significant under either model. On the other hand, the 'poor' category of economic status was found to be a significant determinant of variation in contraceptive use in all regions with respect to its reference category under the multilevel model, while it was not significant under the single-level model.
The random part of the random intercept and fixed slope model showed that the intercept variance of the random effect was 0.123, whereas the variance of the intercept for the empty model was 0.107. The intercept variance of the empty multilevel model is thus smaller than that of the random intercept and fixed slope model. This change in the intercept variance was due to the inclusion of fixed explanatory variables. That is, taking into account the fixed independent variables can provide extra predictive value on contraceptive use in each region. The significance of the random intercept variance indicates that there is significant regional variation in contraceptive use among women (see Table 5). This implies that there is still unexplained variation in contraceptive use across regions.
It is observed that there exist significant differences between the $\beta$ coefficients of these two models for each of the explanatory variables. Also, the $\beta$ coefficients of the standard model have been either under- or overestimated in comparison with those of the multilevel model. The difference between the $\beta$ coefficients estimated from the multilevel model and from the standard single level model arises because of the addition of the random effects, and implies that a single level model for this outcome is not appropriate.

Multilevel random slope model analysis
So far we have allowed the probability of contraceptive use to vary across regions, assuming that the effects of the explanatory variables are the same for each region. We now relax this assumption by allowing the difference between urban and rural families within a region to vary across regions. To allow for this effect, we need to introduce a random coefficient for place of residence. The model we fit has the same form as a random slopes model, but since the variable 'place of residence' has only two categories, we use the more general term 'coefficient' rather than 'slope' to describe its effect. The results are presented in Table 6.
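A random coefficient model of this kind can be written out as a sketch. The notation is an assumption made for illustration: $x_{ij}$ is taken to be an indicator of urban residence (1 = urban, 0 = rural), the other covariates are omitted for brevity, $u_{0j}$ denotes the regional random intercept, $u_{1j}$ the regional random coefficient of place of residence, and $\sigma_{u01}$ their covariance.

```latex
\log\!\left(\frac{p_{ij}}{1 - p_{ij}}\right)
  = \beta_0 + \beta_1 x_{ij} + u_{0j} + u_{1j} x_{ij},
\qquad
\begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix}
    \sigma^2_{u0} & \sigma_{u01} \\
    \sigma_{u01}  & \sigma^2_{u1}
  \end{pmatrix}
\right)
```

Under this formulation the urban-rural difference in log-odds in region $j$ is $\beta_1 + u_{1j}$, so $\beta_1$ is the average difference and $\sigma^2_{u1}$ measures how much that difference varies from region to region. Setting $\sigma^2_{u1} = \sigma_{u01} = 0$ recovers the random intercept and fixed coefficient model.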
The results indicate that women's age, educational level and number of children are significant determinants of variation in contraceptive use in all regions with respect to the corresponding reference categories under multilevel modeling (see Table 6). For the random part of the model, the test statistic for the first parameter is 4.883, which we compare to a chi-squared distribution on 1 d.f. We conclude that there is a significant difference between the regions. The test statistic for the second and third parameters is 3.574, which is approximately chi-squared distributed on 2 d.f. (p = 0.167). This joint test does not reach significance at the 10% level, so the Wald test alone gives only weak evidence that the effect of urban versus rural residence varies across regions; the deviance and AIC comparison in Table 3 nevertheless favours the random coefficient model.
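The p-values corresponding to these test statistics can be computed without special libraries, because the chi-squared upper-tail probability has a closed form for 1 and 2 degrees of freedom. The sketch below uses only the two statistics quoted in the text; the function name `chi2_sf` is illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-squared probability for df = 1 or df = 2."""
    if df == 1:
        # A chi-squared(1) variable is the square of a standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        # A chi-squared(2) variable is exponential with mean 2.
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only implements df = 1 and df = 2")


p_first = chi2_sf(4.883, 1)   # test of the first random parameter
p_joint = chi2_sf(3.574, 2)   # joint test of the second and third parameters

print(f"statistic 4.883 on 1 d.f.: p = {p_first:.4f}")
print(f"statistic 3.574 on 2 d.f.: p = {p_joint:.3f}")
```

The joint statistic of 3.574 on 2 d.f. gives p ≈ 0.167, the value quoted in the text.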
On average (after adjusting for the effects of age and the other explanatory variables), the log-odds of using contraception are 0.035 higher for urban areas than for rural areas. Depending on the value of $\sigma^2_{u1}$, the difference in a given region will be larger or smaller than 0.035. The last column of Table 6 shows that the $\beta$ coefficient for place of residence under single level analysis has been underestimated by about 42% compared with the multilevel estimate. Similarly, the $\beta$ coefficient for the women's age group 25-29 in the single level model has been underestimated by almost 31%. The $\beta$ coefficients for all categories of women's age under single level analysis have been underestimated, by different percentages, compared with the multilevel estimates. On the other hand, the $\beta$ coefficient for the women's education group (college/university) in the single level model has been overestimated by almost 13%.
Thus it is evident that if the multilevel effect is not taken into account in multivariate modeling, the estimates will be either underestimated or overestimated considerably. These results imply that a single-level multivariate model for this outcome variable is not appropriate.

Discussion and Conclusion
This study analyzed determinants of contraceptive use in Oman using the multilevel modeling technique with application to the hierarchical data from the 2008 Oman National Reproductive Health Survey. Different models were fitted to the data to identify potential determinants of use of modern contraception among women in the reproductive age group. First, bivariate analysis with the chi-square test and the ordinary logistic regression model were fitted to the data, and significant variables were considered for further investigation in multilevel models. Secondly, the multilevel models were fitted. The multilevel modeling was stepwise: in the first step the intercept-only or empty model was fitted to check whether multilevel effects or heterogeneity exist among the hierarchies. In the next step the random intercept and fixed slope model, usually called the random intercept model, was fitted, and finally the random intercept and random slope model (random slope model) was fitted.
$p_{ij}$ can be defined as the probability that the response equals one, and $p_{ij}$ can be modeled using a logit link function, where $y_{ij}$ has a Bernoulli distribution. The random parameters reported in Table 6 are the variances of the intercept and of the slope of place of residence. The Wald test has been used to test the significance of the parameters.

Table 2. Maximum likelihood estimates for the binary logistic regression predicting contraceptive use.

Table 3. Summary of model selection criteria for the three possible multilevel logistic regression models.

Table 4. Estimation of parameters and standard errors of an intercept-only, fixed effect, and random effect logit model for predicting the probability of contraceptive use.

Table 5. Results of random intercept and fixed coefficient logistic regression analysis.

Table 6. Results of random slope logistic regression model analysis of contraceptive use.