Statistical Signal Characterization of Spectral Analysis of Heart Rate Variability for Screening of Patients with Obstructive Sleep Apnea

A new screening technique for obstructive sleep apnea (OSA) is implemented. This technique is based on finding the statistical signal characterization (SSC) parameters of the spectrum of pre-processed R-R-interval (RRI) data. A single classification factor (CR) is selected to classify between patients with obstructive sleep apnea and normal controls. Both nonparametric spectral analysis techniques (with Welch method as an example) and parametric techniques (with Burg method as an example) are used to estimate the power spectral density. The data tested in this work are drawn from MIT database. Both trial and test (challenge) groups of data are used. Each of these two groups contains 20 OSA and 10 normal records. The trial data is used to set the threshold value of the classification factor, which is then used in identifying the challenge data. The total accuracy of the screening is about 93% and 90% using Burg and Welch algorithms respectively.


Introduction
Sleep apnea is complete or partial cessation of breathing during sleep.Obstructive sleep apnea (OSA) is the common form of apnea that occurs when the upper airways are partially or completely obstructed for 10 seconds or more due to the relaxation of the dilating muscles.The dilating muscles contract during inspiration to prevent the collapse of the upper airways caused by inward air suction (Rodenstein, et al. 1990).Hypoapnea is a form of apnea where the airways partially collapse and _________________________________________ *Corresponding author's e-mail: abhossen@squ.edu.omcause 50% reduction of air accompanied by drops in oxygen desaturation of at least 4% followed by compensating hyperventilation (Rodenstein, et al. 1990;Tsi 1999;AASM Task Force Report, 1999;Penzel, 2000).
The severity of the apnea is measured by Apnea Hypoapnea Index (AHI), which is the number of apnea and hypoapnea episodes per hour.OSA is said to be mild if AHI is between 5 and 15 per hour, moderate if AHI is between 15 and 30 per hour, and severe if AHI is greater than 30 per hour.If AHI is less than 5 per hour, the case is considered normal (AASM Task Force Report, 1999).
The most commonly used technique for sleep analysis is an overnight polysomnographic recording, which is an overnight monitoring of sleep-related body functions, such as brain activity, eye movements, muscle activity, leg and arm movements, heart rate, air snoring, respiration effort, and blood oxygen level.These activities are measured by several special electrodes and sensors attached to the body.Although the polysomnography provides reliable results in the diagnosis of sleep disorders, it is expensive, time consuming and inconvenient to the patient (Boyer and Kapur, 2002).
Usually, patients undergo a primary diagnosis, by primary screening techniques, to evaluate their condition, after which a decision is made if further in depth diagnosis is required through polysomnography (Stevenson, 2003;Pack, 1993).These techniques are also helpful for follow up of treatment.An example of primary diagnosis techniques is the pulse oximetry, which is based on measuring oxygen saturation in the blood via an oximeter attached to a fingertip or ear.The presence of apnea is evaluated by measuring the number of oxygen desaturations per hour, which is also called oxygen desaturation index (ODI).However, the screening results by oximetry methods are inconsistent as reported by Pack (1993) and Netzer (2001).
The development in biomedical signal processing technology has led several researchers to propose other methods for apnea detection and screening by processing and manipulating one or more of the body activities such as the electrocardiogram (ECG) signal, electroencephalogram (EEG) signal and the heart rate variability (HRV) or the R-R intervals (RRI).This study utilized the RRI data, which is extracted from ECG data records, to accomplish the screening process.
The pattern of ECG signal is composed of several waves that are repeated on every beat.These waves were named arbitrarily as P, Q, R, S and T as shown in Fig. 1.The beat is recognized by the QRS waves, which are also called QRS complex.The beat is said to be normal (N) if it is originated from the sino-atrial node of the heart, and ectopic if it is originated elsewhere.The time between peaks of two consecutive R waves of the ECG signal is referred to as R-R interval (RRI), which is used to measure the heart rate.Heart Rate Variability (HRV) implies the variation of heart rate that has been proven to accompany the variation of several physiological activities such as breathing, thermoregulation and blood pressure changes (Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology, 1996).
Theoretically, in OSA the cessation of breathing will cause the respiration center in the brain to activate its autonomic components (sympathetic and parasympathetic) that send feedback impulses to the heart to compensate for the lack of oxygen and low blood pressure.This interaction between the heart and the brain is reflected into the beat-to-beat variation of the heart rate.Therefore, the analysis of HRV should somehow reveal the variations in breathing.
Many algorithms manipulate HRV or RRI data either in frequency-domain or in time-domain, to extract screening parameters.Several studies have examined HRV using spectral analysis techniques under different conditions (Task Force of the European Society of Cardiology and the North American Society of Pacing and Electrophysiology, 1996) and in OSA (Khoo, et al. 1999).In (Drinnan, et al. 2000), the FFT is applied directly on the RRI signal, and the area under the spectral curve between 0.01 and 0.05 Hz is divided by the area between 0.005 and 0.01 Hz.The algorithm used led to the classification accuracy of 93.3 % when applied to the MIT challenge data.Another technique employed the Hilbert transformation of the RRI time series to derive the instantaneous amplitudes and frequencies of the series (Mietus, et al. 2000).Some basic statistics such as the mean and the standard deviation were calculated for every 5-minute sliding window.A comparison is made between the probability distribution of the measured statistics in order to set up detection threshold limits.The algorithm used obtained accuracy of 93.3% on MIT challenge data.A comparison of several time domain and frequency domain algorithms for apnea detection is found in the work reported by Penzel et al. (2002).
In this paper a new technique is presented.This technique computes the statistical signal characterization parameters for the power spectral density (PSD), which is computed using different spectral estimation techniques, of the pre-processed RRI data.So instead of finding the power at different frequency bands in the PSD, statistical parameters are calculated for the estimated PSD.These parameters depend on the morphology (number of maxima and minima) of the PSD.
The paper is organized as follows: Section 2 deals with the pre-processing steps of the data used in this work.In Section 3, Welch and Burg spectral analysis techniques are introduced.The statistical signal characterization method is discussed in Section 4. Data implementation steps are explained in Section 5. Different results of Burg and Welch Methods are listed and compared in Section 6.This section also includes the results of parametric methods other than Burg.Moreover, a comparison with other techniques (Drinnan, et al. 2000;Mietus, et al. 2000) are also provided in this section.Conclusions are given in Section 7.

The Data
The ECG records are drawn from the database of Massachusetts Institute of Technology (MIT) (PhysioNet: An NIH/NCRR Research Resource for Complex Physiologic Signals).The ECG signals (single channel) were extracted from polysomnographics recordings with sampling rate of 100 Hz and 12-bit resolution for an average duration of 8 hours.The database contains 35 trial records and 35 test records.Each set of the records contains 20 OSA records, 10 normal records, and 5 borderline records.Apnea records are those having 100 minutes or more of apnea during the recordings and at least one hour with apnea/hypoapnea index of 10 or more.The normal records are those having less than 5 minutes of apnea during the recording.The borderline records are those having 5 to 99 minutes of apnea during the recording, and at least one hour with apnea/hypoapnea index of 5 or more (PhysioNet: An NIH/NCRR Research Resource for Complex Physiologic Signals).In this study we only used the apnea records and the normal records; borderline records are excluded.The trial data is used to set up the classification factor.This factor is then used to classify the test data.

Pre-processing of Data
RRI series could be exposed to different types of errors throughout its generation process.The original ECG may be exposed to different type of physiological and environmental interferences that could overwhelm the ECG features and lead to detection of false QRS peaks or misdetection of normal QRS peaks (Friesen, et al. 1990).Moreover, the variation of heartbeats is non-periodic and would result in non-periodic variation of the instantaneous RRI and irregular inter-sample spacing.Furthermore, the QRS detection algorithm, regardless of its accuracy, could miss normal peaks or detect false or ectopic beats.Large errors induced by missing values, outliers, non-stationarity and irregular inter-sample spacing could influence the accuracy of any analysis technique.Therefore the RRI data usually undergo several processing and filtering stages before they can be used in an analysis.
Figure 2 shows the pre-processing steps needed before applying the new technique.The QRS detection, which is the primary process for every ECG signal analysis technique, is accomplished by the "ecgpuwave" software (PhysioNet: An NIH/NCRR Research Resource for Complex Physiological Signals).The RRI data are the normal-to-normal (N-N) intervals obtained directly from QRS detector without any smoothing and filtering steps; therefore it could contain false intervals, missed and/or ectopic intervals.
Removing of outliers is achieved by using a 41-points moving average filter (MAF).Re-sampling at 1 Hz and substituting for the missed peaks are then accomplished by simple linear interpolation (Mietus, et al. 2000).The re-sampling and the estimation of missed values are intended to have an equally spaced RRI data and preserve the temporal sequence that is necessary for the frequency domain analysis (Bigger, 2000).
The next processing steps are intended to emphasize the oscillations of RRI data that are commonly found in apnea patients but less common in normal subjects.These oscillations lie in the VLF band of RRI spectrum, especially between 0.01 and 0.04 Hz.A high pass filtering (HPF) is used to remove the oscillations below 0.01 Hz.HPF is implemented by a local de-trending over an 81point moving window, which gives a highpass 3db cutoff at 0.01 Hz.
In other words, a simple linear regression is used to estimate the trend line of the RRI series over a sliding 81point window.The estimated value at the center of each window forms the high pass filtered RRI series (RRI-HF) (Mietus, et al. 2000).This set of data is now ready for the analysis.

Spectral Analysis
Spectral analysis is used to find the PSD, which is the frequency content of a signal or a system based on a finite set of data.The various techniques of PSD estimation can be classified into non-parametric and parametric methods.

Non-Parametric Methods
Nonparametric methods estimate the PSD directly from the signal itself by FFT.The simplest method of such methods is the periodogram, which is the magnitude squared of the FFT of the signal or a section of the signal.Examples of improved versions of the periodogram are Welch's method, which is one of the most popular nonparametric techniques.It is employed in this study as an example of such methods (Kay, 1988).
This method uses the following steps to compute the PSD of a signal: 1. Apply a non-rectangular data window (Hamming) to different sections of the signal.2. Find the periodogram, that is the magnitude squared of the FFT of the windowed section.3. Compute the average of the periodograms of all sections.

Parametric Methods
Parametric methods are based on modeling the signal, to estimate its PSD, as the output of a linear system driven by white noise.These methods firstly estimate the parameters (coefficients) of the linear system that hypothetically construct the signal.The most commonly used linear system model is the all-pole model, a filter with all of its zeroes at the origin in the z-plane.The output of data of such a filter for white noise input is called an autoregressive (AR) process of order P, which represents the number of poles.Examples of parametric methods are the Yule-Walker autoregressive (AR) method, covariance method (Pcov), modified covariance method (Pmcov) and the Burg method (Proakis and Manolakis, 2000).
The Burg method, which is used in this study as an example of parametric techniques, is based on minimizing the forward and backward prediction errors while satisfying the Levinson-Durbin recursion (Marple, 1987).This method avoids the calculation of the autocorrelation function; and instead computes the reflection coefficients.This method has a good accuracy in estimating the AR model for short data length and ensures stable AR model.Simply, the Burg method has the following steps: 1. Select an order to the all-pole filter model.2. Estimate the lattic reflection coefficients from the signal.3. Fit an AR model to the estimated reflection coefficients.Some parametric methods generate frequency component estimates for a signal based on an eigenanalysis or eigendecomposition of the correlation matrix.Examples are the multiple signal classification (MUSIC) method and the eigenvector method, Peig (Proakis and Manolakis, 2000).
In Fig. 3, the spectrums of both normal and OSA data segments are estimated using Welch (with N=128) and Burg (with N =128, P=10) algorithms.

Statistical Signal Characterization
The statistical signal characterization (SSC) is a method that characterizes a waveform not only as a function of the frequency component amplitudes (like FFT) but also as a function of the relative phases of its frequency components (Hirsch, 1992).The input waveform to the SSC process is basically divided into segments where each segment is bounded by two extrema: maxima and minima as shown in Fig. 4. The segment amplitude, A n , and the segment period, T n , are defined as: (1) n=1,2,3, There are four SSC parameters that can be computed from the time-domain signal.The parameters are the amplitude mean M a , the period mean M t , the amplitude mean deviation D a , and the period mean deviation, D t : (3) (4)  , p=14) Methods where A i is the amplitude of the ith segment, and T i is the period of the ith segment and N s is the total number of segments.
The SSC process is usually applied to a time domain signal.In this work, the SSC process is modified by applying it to the spectrum (PSD) of a time domain signal.For example, a similar figure to Fig. 4 can be plotted by considering the maxima and the minima in the spectral signal as in Fig. 5.In Fig. 4 the x-axis represents the time whereas in Fig. 5 it represents the frequency.Hence, corresponding changes have to be applied to Eqs. ( 4) and ( 6) by replacing any time point with a frequency point.
To initially set up the screening algorithm, the SSC were applied to the trial data.This is in order to find the best SSC parameter(s) that screen the OSA patients from normal subjects.Only two parameters (M a and D a ), which are both related to the amplitude of the PSD, are found to be useful in our work.Figure 6 shows the steps of determining the two SSC parameters from the estimated specrum of the (RRI-HF) data.

Performance Measures
The performance of the technique is evaluated by three main metrics: specificity, sensitivity and accuracy as defined below (Rangayyan, 2001)  where TP, TN, FN, FP are defined in Table 1 (Rangayyan, 2001), and T is the total number of data under test.
Sensitivity represents the ability of a classifier to detect the positive cases, e.g.OSA.Specificity indicates the ability of a classifier to detect negative cases, e.g.normal subjects.Accuracy represents the overall performance, which indicates the percentage of correctly classified positives and negative cases from the total cases.

Implementation Procedure
The following procedure is implemented in this work: 1. Estimate the PSD of all successive segments of the RRI-HF data using Welch or Burg method or any other method.2. Compute the SSC parameters (M a and D a ) of the PSD of all successive segments.3. Find the maximum values, Max (M a ) and Max (D a ) among the results of all segments.4. Calculate (CR) as the sum of Max (M a ) and Max (D a ). 5. Find (manually) the threshold value of (CR) to identify the trial data with best performances.This can be done by sketching a line that can separate better between the data results of both groups (normal and apnea).6. Find the performance of the algorithm on test data using the same threshold.7. Compute the overall performance of the screening process.

Results and Discussion
In order to study the effect of the transform length N and the filter order P in the performance of the algorithm, the procedure is repeated for N = 64, 128 and 256 for both Welch and Burg methods, and for P = 10, 14 and 20 for Burg method.Other filter orders were also attempted but the results obtained were low and, therefore, been excluded.

Results
Tables 2-4 show the performance results of Welch method at different transform lengths N. Tables 5-9 show the results for Burg method at different transform length N and filter order P.
The best accuracy obtained for trial data by Welch method is 93.3% with N=128 and 256.For test data the maximum accuracy is 86.7%, obtained with N=256.The overall performance shows that the accuracy increases from 81.7% to 90% with N increasing from 64 to 256.Moreover, the overall specificity is constant at 85%; whereas sensitivity (detection of more OSA cases) improves with increasing N causing the improvement in accuracy.
The maximum accuracy obtained by Burg method for trial and test data are 96.7% and 90% respectively.The best overall accuracy is 93.3%.These results are obtained with N=128 and P = 14.Moreover, by observing the Burg accuracy results in Tables (5,7,9) for P=14, which are 80%, 93.3% and 81.7%, it can be noticed that the increase in N had not always lead to better results.In addition, the results of Burg method with N=128 at different filter order show that the accuracy is not improving consequently with increasing the filter order.Therefore, the results of Burg method depend on selecting an optimum filter order and transform length N. The parametric Burg method results in better screening capability (accuracy=93.3%)than the non-parametric Welch method (accura-cy=90.0%).
Other parametric methods have been tested with N=128     Figures 7 and 8 show the results (computed values of CR) of all data using both Welch and Burg algorithms respectively at FFT length N=128 and filter order P =14 (for Burg only).In these figures an optimum threshold value (CR) is selected manually (see the line sketched between the two groups) from the trial data, with a value of 0.378 for Welch method and 0.63 for Burg method.Any subject with (CR) above the threshold is classified as normal and any subject with (CR) less than the threshold is classified as OSA.

Comparison with other Methods
The new technique is compared with other methods by Drinnan, et al. (2000) and Mietus, et al. (2000).The results of screening of both trial data and test data are shown in Table 11.By applying the new technique with Burg spectral analysis to the trial set, we could correctly classify 29 out of 30 of combined OSA and normal subjects.When this technique is applied to the test set, it correctly classified 27 out of 30 of the subjects.The results show that this method have overall better performances compared to other methods used in the comparison.

Conclusions
A new method for OSA screening is investigated in this paper.It is based on estimating the SSC parameters of the spectral analysis of the pre-processed and filtered RRI data.Two basic algorithms are used for the spectral estimation (Welch and Burg) as examples to non-parametric and parametric spectral analysis techniques respectively.The work shows significant results in screening of subjects with OSA from those of normal controls.
This study has shown that the morphology of the spectrum could be used to screen OSA and normal subjects by deriving relevant statistical parameters.Common spectral methods usually find and compare spectral power between two different frequency bands (Drinnan et al. 2000).The study also revealed that both parametric and non parametric methods have obtained acceptable results (accuracy above 85%).However, parametric method could over-perform non-parametric method if optimum filter order and transform length are selected.
The best sensitivity result of implementing this technique with Burg method (N = 128, P = 14) and Welch method (N = 256) is 92.5% for both.The specificity of       both algorithms are 95% and 85% respectively, while the total accuracy is about 93.33% with Burg algorithm and about 90% with Welch spectral estimation method.The most important advantage of this technique is that a single-parameter is used for screening.We would like to pointout that this is a screening technique and is not purely diagnostic.It helps to short-identify patients who need full polysomnography especially in a very busy center like the one we have at Sultan Qaboos University Hospital in Oman.
Figure 7. Results of all data using Welch method (N=128) Figure 8. Results of all data using Burg method (N=128)

Figure
Figure 2. Block-diagram of pre-processing steps

Figure 3 .
Figure 3. Spectrums of normal and OSA data segment estimated by Welch (N=128) and Burg (N=128, p=14) Methods Figure 5. Segment amplitude and frequency characterization

Segment amplitude and period characterization
.

Statistical signal characterization parameters and
P=14 for performance comparison with Burg.The methods and their results are shown in Table10.The results of Welch and Burg methods are also shown in this table for comparison.The parametric methods resulted in different accuracies above 80%.The second best accuracy after Burg is 90%, which is obtained by Peig method.