Journal homepage: www.jzs.univsul.edu.iq
Journal of Zankoy Sulaimani, Part-A (Pure and Applied Sciences)

Annual Forecasting Using a Hybrid Approach

Qais Mustafa Abdulqader
Technical College of Petroleum and Mineral Sciences, Duhok Polytechnic University, Zakho-Iraq
E-mail: [email protected]

Abstract

In this paper, we use a hybrid method based on wavelet transforms and ARIMA models and apply it to the annual rainfall time series (in millimeters) of the Province of Erbil, Iraq. The sample consists of 45 annual values covering the period 1970-2014. Our aim is to show how the hybrid method can be useful for time series forecasting and how forecast quality can be improved, by applying the method to actual data and comparing the classical ARIMA method with the proposed method on several statistical criteria. The results favor the hybrid method: the forecast error is reduced when the Wavelet-ARIMA technique is applied, which improves on the forecasts of the classical model. In addition, among the wavelet families examined, the Daubechies wavelet of order two with fixed form thresholding and the soft thresholding function was the most suitable for de-noising the data and performed better than the others. The forecasts indicate that the annual rainfall in Erbil in the coming years will be close to 370 millimeters.
Key Words: Forecasting, Time series, ARIMA, Wavelet transforms, De-noising

Introduction

Rainfall forecasting is one of the most challenging forecasting problems. Many algorithms have been developed and proposed, but accurate prediction of rainfall remains very difficult. (Tantanee et al., 2005) presented a new procedure for predicting rainfall based on a combination of wavelet analysis and the conventional autoregressive (AR) model. Their research showed that the wavelet autoregressive procedure gives a better prediction of annual rainfall than the classical AR model. (Al-Safawi et al., 2009) estimated the autoregressive model using wave shrink. Their results showed that the suitable model under the classical ARIMA method was AR(6) and that this model improved when the wave shrink technique was used, especially with the Haar wavelet and a soft threshold, to forecast the annual rainfall in Erbil city for the period 1992-2007. (Al-Shakarchy, 2010) applied factor analysis to forecast two series representing rain rates and relative humidity in Mosul province.
The results showed that the suitable models for the two series were ARIMA(0,0,1) and ARIMA(1,0,0), respectively. (Ali, 2013) used the ARIMA method to analyze and forecast Baghdad rainfall. The seasonal model SARIMA(2,1,3)x(0,1,1) was found to be the best, and the rainfall forecasts it produced for the coming years showed a trend and range similar to those of the real data. (Venkata Ramana et al., 2013) sought a good model for monthly rainfall prediction using a hybrid technique that combines wavelets with an artificial neural network (ANN). The analysis showed that the resulting models performed better than the ANN models alone. (Shoba and Shobha, 2014) analyzed various data mining algorithms used for rainfall prediction. Their study showed that certain algorithms sometimes perform better and are more effective when combined. (Eni and Adeyeye, 2015) applied the seasonal ARIMA method to build a suitable model and forecast rainfall in Warri Town, Nigeria. Their results showed that the seasonal model ARIMA(1,1,1)(0,1,1) is adequate according to several statistical criteria. Recently, (Shafaei et al., 2016) tested the capability of several techniques for predicting monthly precipitation, namely wavelet analysis (WA), the seasonal ARIMA model (SARIMA), and the ANN method.
In examining the effect of the decomposition level on model performance, the study found that going from 2 to 3 decomposition levels increased the correlation between observed and estimated data, but no significant difference was found between the predictions of the 2-level and 3-level models. (Ramesh Reddy et al., 2017) applied the ARIMA model to forecast the monthly mean rainfall of coastal Andhra, India. They found that the best-fitting model is ARIMA(5,0,0)(2,0,0) according to several performance criteria. (Ashley et al., 2017) applied the discrete cosine transform (DCT) and the discrete wavelet transform (DWT) to reduce the dimensionality of rainfall time series observations. Their research concluded that the DWT is superior to the DCT and best preserves and characterizes the observed rainfall records.

From the methods reviewed above, we observe that most of these approaches are applied to short-range forecasting. This paper offers a new technique for long-range forecasting of annual rainfall data. In other words, it combines wavelet transformation with the classical ARIMA methodology to model annual rainfall based on the available data.
The paper is organized as follows. First, we give brief explanations of the ARIMA methodology and wavelet transformation, and then we present the hybrid method. Next, we apply the methods to real data. Finally, we present some conclusions of the study.

ARIMA Methodology, Wavelet Transformation, and Hybrid Method

ARIMA Methodology

Box and Jenkins suggested an approach for analyzing time series data that consists of model identification, parameter estimation, diagnostic checking of the suggested model, and use of the model for forecasting.
The ARIMA model is a mixed model that depends on the parameters p, d, and q, which represent the order of the autoregressive (AR) part, the degree of differencing involved, and the order of the moving average (MA) part, respectively. The model was popularized by (Box et al., 1970) and can be expressed by the formula:

$$Y_t = \phi_1 Y_{t-1} + \dots + \phi_p Y_{t-p} + e_t - \theta_1 e_{t-1} - \dots - \theta_q e_{t-q}$$

Here, $Y_t$ is the value of the series (after differencing, if needed) at time t, p is the non-seasonal autoregressive order, q is the non-seasonal moving average order, $\phi_1, \dots, \phi_p$ are the autoregressive coefficients, $\theta_1, \dots, \theta_q$ are the moving average coefficients, and $e_t$ is a random error. If the data are not stationary, first- or second-order differences have to be taken. To obtain a suitable model, we rely on two functions: the autocorrelation function (ACF) and the partial autocorrelation function (PACF). The patterns in the plots of both functions indicate which of the candidate models is likely to fit best and to be appropriate for prediction, judged by statistical performance measures. In this study, we also apply the Portmanteau test statistic (i.e., Box-Pierce) to test the randomness of the time series. We refer to (Makridakis et al., 1998) for more details.

Wavelet Transformation

Wavelet transformation is an advancing, efficient, and effective tool in signal processing that has attracted great interest since the development of wavelet theory (Grossman and Morlet, 1984). Applications of wavelet analysis have increased in many fields, such as edge detection, image compression, optical engineering, and time series analysis, as an alternative to the classical Fourier transformation for local, non-cyclic, and multi-scale phenomena. Wavelets can localize changes in the dynamic patterns of a sequence, while Fourier transforms focus essentially on frequency; this is the major difference between wavelet analysis and Fourier analysis. In addition, the Fourier transform assumes signals of unlimited length, while the wavelet transform can be applied to time series of any form and size, even when they are not evenly sampled (Antonios and Constantine, 2003). In general, wavelet transforms can be applied to detect and reduce noise and to filter time series data, which supports forecasting and other analyses.
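The localization property described above can be illustrated with a minimal sketch. It uses a one-level Haar transform, the simplest wavelet, purely for brevity; it is not the Daubechies wavelet used later in this study, and the function names and the toy signal are illustrative assumptions.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT for an even-length sequence.

    Returns (approximation, detail): scaled pairwise sums and differences.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original sequence."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


# Toy signal (assumption): flat at 1.0, then a jump to 3.0 after index 4.
signal = [1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
approx, detail = haar_dwt(signal)
rebuilt = haar_idwt(approx, detail)

# Only the detail coefficient of the pair that straddles the jump is non-zero.
jump_positions = [i for i, d in enumerate(detail) if d != 0.0]
```

In this toy case the only non-zero detail coefficient belongs to the pair that contains the jump, so the transform indicates where the change occurs; a Fourier coefficient, by contrast, mixes information from the whole signal.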
The wavelet transform can be written as:

$$W(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt$$

Here, ψ(t) is the mother wavelet (with * denoting the complex conjugate), whose effective length is commonly much shorter than the target time series f(t); 'a' is the scale or dilation factor, which determines the characteristic frequency, so that varying it gives rise to a spectrum; and 'b' is the translation in time, so that varying it represents the 'sliding' of the wavelet over f(t) (Burrus et al., 1998).

Hybrid Method

The proposed method combines the ARIMA methodology with wavelet transforms. Because the wavelet approach is well suited to signal analysis, this study uses it to separate the details (the small differences) from the approximations (the important part) of the data. In wavelet analysis, the approximations are the high-scale, low-frequency components of the signal, and the details are the low-scale, high-frequency components (Fugal, 2009). The decomposition is carried out with the discrete wavelet transform (DWT) because the data of the study are recorded in discrete time. The procedure of the hybrid method is shown in Figure 1.

Figure-1: The process of hybrid method

Application

Information About Erbil City

Erbil, the Kurdish centre, is the capital city of the Kurdistan Region in Iraq.
The city is located at 36°12′17″N 44°20′33″E, about 350 kilometers north of Baghdad. The climate of Erbil is very hot in summer and very cold and wet in winter, and there is more rainfall in winter than in summer. The city receives an average of 300-400 millimeters of rain annually. The city is the administrative center of Erbil province. It is bounded by Turkey and Duhok Province to the north, by Iran and Sulaymaniyah Province to the east, by Kirkuk province to the south, and by Mosul province to the west (Wahab and Khayyat, 2014).

Application Using ARIMA Methodology

The variable used in the analysis is the annual rainfall in Erbil province in the Kurdistan Region of Iraq (in millimeters). The sample consists of 45 observations covering the period 1970-2014 and is shown in Table 1.
The data were obtained from the General Directorate of Meteorology and Seismic Monitoring in Erbil province.

Table-1: Annual data on rain precipitation from 1970 to 2014 (in millimeters)

Year   Amount of Rain   Year   Amount of Rain
1970   255.4            1993   601.6
1971   448.2            1994   583.0
1972   406.4            1995   494.4
1973   261.5            1996   418.9
1974   547.5            1997   441.6
1975   417.2            1998   337.2
1976   452.3            1999   229.2
1977   347.2            2000   272.3
1978   380.1            2001   330.9
1979   375.6            2002   361.5
1980   321.5            2003   587.7
1981   141.8            2004   255.6
1982   444.1            2005   297.5
1983   178.3            2006   514.6
1984   43.9             2007   273.4
1985   463.9            2008   410.7
1986   154.0            2009   411.0
1987   235.9            2010   359.6
1988   626.9            2011   301.6
1989   367.3            2012   366.4
1990   332.0            2013   345.2
1991   344.1            2014   385.2
1992   694.0

Figure 2 shows the time series plot of the rain data for Erbil city. Following the Box-Jenkins procedure, the first step is identification using the ACF and PACF plots, which are shown in Figure 3.

Figure-2: Time series plot of rain data in Erbil province from 1970 to 2014

Figure-3: Autocorrelation function and partial autocorrelation function of rain data

Based on the ACF and PACF plots and on checks for stationarity in mean and variance, the appropriate model for the series was identified as ARIMA(2,1,0), after careful consideration of modelling and fitting and on the basis of two performance measures: the root mean square error (RMSE) and the mean absolute error (MAE). The estimated model is shown in Table 2.

Table-2: Estimation of ARIMA(2,1,0)

Parameter   Estimate    Std. Error   t-ratio    P-value
AR(1)       -0.72091    0.129125     -5.58304   0.000002
AR(2)       -0.540025   0.128616     -4.19875   0.000136

After estimating the ARIMA(2,1,0) model, we check the residuals for randomness. Figure 4 shows the ACF and PACF of the residuals of the classical ARIMA(2,1,0) model, which remain inside the confidence intervals.

Figure-4: ACF and PACF of residuals using ARIMA(2,1,0) on series data.
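As a rough illustration of the estimation and diagnostic steps, the sketch below rebuilds the residuals implied by the Table 2 coefficients on the Table 1 data, computes a Box-Pierce statistic Q = n Σ r_k², and runs the same recursion forward to produce forecasts. It rests on several assumptions that are not stated in the paper: the fitted model has no constant term, the residuals are computed conditionally from the third difference onward, and the number of lags is chosen by the caller. Estimation software may treat start-up values differently, so the output is not expected to reproduce the reported statistics exactly. All function names are hypothetical.

```python
# Annual rainfall in mm for 1970-2014, copied from Table 1.
RAIN = [255.4, 448.2, 406.4, 261.5, 547.5, 417.2, 452.3, 347.2, 380.1, 375.6,
        321.5, 141.8, 444.1, 178.3, 43.9, 463.9, 154.0, 235.9, 626.9, 367.3,
        332.0, 344.1, 694.0, 601.6, 583.0, 494.4, 418.9, 441.6, 337.2, 229.2,
        272.3, 330.9, 361.5, 587.7, 255.6, 297.5, 514.6, 273.4, 410.7, 411.0,
        359.6, 301.6, 366.4, 345.2, 385.2]
PHI1, PHI2 = -0.72091, -0.540025  # AR(1) and AR(2) estimates from Table 2


def difference(y):
    """First differences y[t] - y[t-1]."""
    return [b - a for a, b in zip(y, y[1:])]


def arima210_residuals(y, phi1, phi2):
    """Conditional residuals of ARIMA(2,1,0) without a constant (assumption)."""
    d = difference(y)
    return [d[t] - phi1 * d[t - 1] - phi2 * d[t - 2] for t in range(2, len(d))]


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return ck / c0


def box_pierce_q(x, max_lag):
    """Box-Pierce statistic: n times the sum of squared autocorrelations."""
    return len(x) * sum(autocorr(x, k) ** 2 for k in range(1, max_lag + 1))


def forecast_arima210(y, phi1, phi2, steps):
    """Forecast future levels by setting all future errors to zero."""
    d = difference(y)[-2:]
    level = y[-1]
    forecasts = []
    for _ in range(steps):
        next_diff = phi1 * d[-1] + phi2 * d[-2]
        level += next_diff
        forecasts.append(level)
        d.append(next_diff)
    return forecasts


residuals = arima210_residuals(RAIN, PHI1, PHI2)
q_stat = box_pierce_q(residuals, max_lag=12)  # the lag count 12 is an assumption
raw_forecasts = forecast_arima210(RAIN, PHI1, PHI2, steps=5)
```

Under the null hypothesis of white-noise residuals, Q is compared with a chi-square distribution whose degrees of freedom equal the number of lags minus the number of estimated parameters; a small Q relative to that reference supports randomness. The forecasts here come from the raw-data coefficients, so they need not coincide with forecasts obtained from a de-noised series.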
From Figure 4, none of the ACF and PACF coefficients is significant, which indicates that the residual series is random (i.e., white noise). To check the randomness of the residuals further, we applied the Portmanteau test (Box-Pierce test) mentioned in the theoretical part. The value of the test statistic was 7.326 with a P-value of 0.835, which indicates that the hypothesis of randomness cannot be rejected at the 95% or higher confidence level, and we conclude that the series is random.

Application Using a Hybrid Method

In this part, the original data are converted from the time domain to the frequency domain in order to filter them.
Figure 5 shows the Daubechies wavelet applied with a five-level multiresolution to the 45 sequential observations of rain precipitation. Here s denotes the signal, which is the sum of the approximation and the details; a5 is the approximation at level 5; and d5, d4, d3, d2, and d1 are the details at levels 5 to 1, respectively.

Figure-5: Daubechies wavelet of the rain precipitation using multiresolution of five levels.

Noise was removed from the real rainfall data using a wavelet de-noising procedure (in MATLAB, version 2013) with Daubechies wavelets of orders 2 to 5, as shown in Figure 6. After many empirical experiments, the Daubechies wavelet was found to perform better than the other wavelets in de-noising the rain data.
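The thresholding at the core of this de-noising step can be sketched as follows. The fixed form (universal) threshold is commonly defined as σ√(2 ln n) for a signal of length n, and soft thresholding shrinks every coefficient toward zero by that amount. The noise-scale estimate used here (median absolute detail coefficient divided by 0.6745) is a widely used convention and an assumption on our part; the paper does not state how the noise level was rescaled in MATLAB. The function names are hypothetical.

```python
import math
import statistics


def estimate_sigma(finest_details):
    """Robust noise scale: median absolute coefficient / 0.6745 (assumption)."""
    return statistics.median([abs(d) for d in finest_details]) / 0.6745


def fixed_form_threshold(n, sigma=1.0):
    """Universal threshold sigma * sqrt(2 * ln(n)) for a signal of length n."""
    return sigma * math.sqrt(2.0 * math.log(n))


def soft_threshold(coeffs, lam):
    """Zero coefficients within [-lam, lam]; shrink the rest toward zero by lam."""
    out = []
    for c in coeffs:
        if c > lam:
            out.append(c - lam)
        elif c < -lam:
            out.append(c + lam)
        else:
            out.append(0.0)
    return out


# Threshold for a series of 45 observations with unit noise scale.
lam_45 = fixed_form_threshold(45)

# Example on made-up detail coefficients (not taken from the rainfall data).
shrunk = soft_threshold([3.0, -0.5, -2.0, 1.0], 1.0)
```

In a full de-noising pass, the detail coefficients of each decomposition level would be thresholded in this way and the signal rebuilt from the approximation and the shrunk details; the approximation itself is normally left untouched.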
Figure 7 shows the real and de-noised signals obtained by applying the Daubechies wavelet with the Fixed Form Threshold (Patil and Raskar, 2015).

Figure-6: Daubechies wavelet of order 2, 3, 4, and 5

Figure-7: The original and de-noised signals using Daubechies wavelet with Fixed Form Threshold.

The data were analyzed using five levels of multiresolution for the selected wavelet and then de-noised using the Fixed Form Threshold with soft thresholding. The new series was then modeled again using the ARIMA methodology, and the values of the forecasting criteria were compared with those of the first method. Table 3 presents the values of the two performance measures used to select an optimal model for the original data under the classical ARIMA method and the hybrid method.
Table-3: The values of the performance measures for the original data model using classical ARIMA methodology and hybrid method.

Method                   Kind                                      RMSE      MAE
Classical ARIMA Method   Original (raw) data, ARIMA(2,1,0)         133.937   106.565
Hybrid Method            Fixed Form de-noised data, Daubechies(2)  131.380   104.143
Hybrid Method            Fixed Form de-noised data, Daubechies(3)  131.555   104.553
Hybrid Method            Fixed Form de-noised data, Daubechies(4)  131.593   104.411
Hybrid Method            Fixed Form de-noised data, Daubechies(5)  131.706   104.546

From Table 3, we observe that the best model for the original data was ARIMA(2,1,0).
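The two criteria reported in Table 3 follow their usual definitions: RMSE is the square root of the mean squared error, and MAE is the mean of the absolute errors. The minimal sketch below implements both and, as a worked example, expresses the Daubechies(2) row of Table 3 as a percentage reduction relative to the classical model. The function names are hypothetical, and the helper `pct_reduction` is our own illustration rather than a measure used in the paper.

```python
import math


def rmse(actual, fitted):
    """Root mean square error between two equal-length sequences."""
    errors = [a - f for a, f in zip(actual, fitted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(actual, fitted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - f) for a, f in zip(actual, fitted)) / len(actual)


def pct_reduction(before, after):
    """Relative reduction of an error measure, in percent."""
    return 100.0 * (before - after) / before


# Values taken from Table 3: classical ARIMA versus hybrid with Daubechies(2).
rmse_gain = pct_reduction(133.937, 131.380)  # about 1.9 percent
mae_gain = pct_reduction(106.565, 104.143)   # about 2.3 percent
```

Read this way, the hybrid method with the Daubechies wavelet of order 2 lowers both criteria by roughly two percent on this data set.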
When the hybrid method is applied to the original data, however, the forecast errors decrease for all wavelet orders, and the new models improve on the classical one according to the forecasting measures. Comparing the two procedures, the largest reduction is obtained with Fixed Form Thresholding and the Daubechies wavelet of order 2 (i.e., from Table 3, RMSE falls from 133.937 to 131.380 and MAE from 106.565 to 104.143). Figure 8 presents the original and filtered data using the Daubechies wavelet of order 2.

Figure-8: The original and filtered signals using Daubechies wavelet of order 2

The forecast values of our hybrid method are presented in Table 4, which gives the forecasts of the annual rain precipitation (in millimeters) of Erbil province, Iraq, for the years 2015 to 2030.

Table-4: Forecast values of the annual rain of Erbil province-Iraq using hybrid method

Period   Forecast
2015     367.8
2016     360.3
2017     373.5
2018     368.1
2019     364.9
2020     370.1
2021     368.1
2022     366.7
2023     368.8
2024     368.0
2025     367.4
2026     368.3
2027     368.0
2028     367.7
2029     368.1
2030     368.0

Conclusions

In this research, we offered a hybrid method as a new technique for enhancing the Box-Jenkins ARIMA analysis when forecasting time series data. We concluded that:
1- The appropriate model for forecasting using the classical Box-Jenkins method was ARIMA(2,1,0).
2- The classical model was enhanced and improved by filtering the data with Daubechies wavelets of orders 2 to 5; among them, the Daubechies wavelet of order 2 gave better results than the others.
3- According to the forecasts of our hybrid method for the coming years, Erbil city will receive an average total rainfall of 360-370 millimeters annually.

References

1 Ali S. M., "Time series analysis of Baghdad rainfall using ARIMA method", Iraqi Journal of Science, Vol.54, 1136-1142, (2013).
2 Al-Safawi S., Ali T., and Badal M., "Estimation AR(p) model using wave shrink", Second Scientific Conference of Mathematics - Statistics and Informatics, University of Mosul, 274-299, (2009).
3 Al-Shakarchy DH., "Using factor analysis to forecast of time series with an application on two series rain rates and relative humidity in Mosul city", Tikrit Journal of Administrative and Economic Sciences, Vol.6, 93-108, (2010).
4 Antonios A., and Constantine E. V., "Wavelet exploratory analysis of the FTSE ALL SHARE index", In Proceedings of the 2nd WSEAS international conference on non-linear analysis, Non-linear systems and Chaos, Athens, 1-13, (2003).
5 Ashley W., Walker J. P., Robertson D. E., and Pauwels V. R. N., "A Comparison of the discrete cosine and wavelet transforms for hydrologic model input data reduction", Journal of Hydrology and Earth System Sciences, Vol.3, 1-23, (2017).
6 Box G., Jenkins G., and Reinsel G., "Time series analysis: Forecasting and control", third edition, Prentice-Hall International Inc., New Jersey, USA, (2008).
7 Burrus C., Gopinath R., and Guo H., "Introduction to wavelet and wavelet transforms", Prentice Hall, New Jersey, USA, (1998).
8 Eni D., and Adeyeye F., "Seasonal ARIMA modeling and forecasting of rainfall in Warri Town, Nigeria", Journal of Geoscience and Environment Protection, Vol.3, 91-98, (2015).
9 Fugal D., "Conceptual wavelets in digital signal processing", Space and Signals Technologies LLC, San Diego, California, (2009).
10 Grossman A., and Morlet J., "Decomposition of Hardy functions into square integrable wavelets of constant shape", SIAM, Journal of Mathematical Analysis, Vol.15, 723-736, (1984).
11 Makridakis S., Wheelwright S., and Hyndman R., "Forecasting methods and applications", Third edition, Wiley & Sons, Inc., New York, (1998).
12 Patil P. L., and Raskar V. B., "Image denoising with wavelet thresholding method for different level of decomposition", International Journal of Engineering Research and General Science, Vol.3, 1092-1099, (2015).
13 Ramesh Reddy J. C., Ganesh T., Venkateswaran M., and Reddy P., "Forecasting of monthly mean rainfall in Coastal Andhra", International Journal of Statistics and Applications, Vol.7, 197-204, (2017).
14 Shafaei M., Adamowski J., Fakheri-Fard A., Dinpashoh Y., and Adamowski K., "A wavelet-SARIMA-ANN hybrid model for precipitation forecasting", Journal of Water and Land Development, Vol.28, 27-36, (2016).
15 Shoba G., and Shobha G., "Rainfall prediction using data mining techniques: A survey", International Journal of Engineering and Computer Science, Vol.3, 6206-6211, (2014).
16 Tantanee S., Patamatammakul S., Oki T., Sriboonlue V., and Prempree T., "Coupled wavelet-autoregressive model for annual rainfall prediction", Journal of Environmental Hydrology, Vol.13, 1-8, (2005).
17 Venkata Ramana R., Krishna S., Kumar R., and Pandey N. G., "Monthly rainfall prediction using wavelet neural network analysis", Springer, Water Resource Manage, Vol.27, 3697-3711, (2013).
18 Wahab S., and Khayyat A., "Modeling the suitability analysis to establish new fire stations in Erbil City using the analytic hierarchy process and geographic information systems", Journal of Remote Sensing and GIS, Vol.2, 1-10, (2014).