| English abstract of the article |
Extended Abstract. Introduction and Objective: Long-term hydrometeorological records support water resources planning and management at the basin scale through physical models such as hydrological and hydraulic models. However, these records often contain missing data, which makes analysis difficult or sometimes impossible. Data gaps hamper interpretation, complicate model calibration, and bias statistics. This study evaluates the validity of MissForest, a non-parametric machine learning algorithm based on random forests, for filling gaps in daily streamflow series in a region with scarce data and strong climate variability. Material and Methods: Daily streamflow data from the gauging stations of the Southern Baluchestan catchment were analyzed over a long hydrological period (23 September 1972 to 22 September 2018). First, stations were screened with a conventional criterion that accepts a missing rate of less than 50% in the streamflow data, and the mechanisms and patterns of the missing data were then investigated. This screening reduced the number of gauging stations to seven. Next, the distribution of missing daily streamflows across the months of the year and the relative frequency of gap lengths over the period were examined. The reconstruction algorithm was then challenged with two artificial missing-data scenarios: a) removed contiguous segments, in which a single segment (7, 14, 21, 30, 60, 180, or 365 days long) was randomly removed from the entire record (1972–2018) at each gauge; b) removed single data points, in which observed values (30, 60, 90, 120, 180, or 365 days) were randomly removed from the entire record (1972–2018) at each gauge. MissForest was applied to infill the gaps already present in the records together with the artificial gaps. 
Our analysis includes reconstructions of the 1972–2018 period at each streamflow gauge. Finally, the performance of MissForest at infilling daily streamflow data was tested by comparing the filled series with the observed data using goodness-of-fit (GoF) indicators: the coefficient of determination (R²), the percent bias (PBIAS), and the Kling-Gupta efficiency (KGE). Results: In general, the MissForest algorithm performed satisfactorily, and it makes it possible to reconstruct missing data accurately, reliably, quickly, and automatically. Its performance depends strongly on the number of predictor records, the record length, and the streamflow type. Real gaps in the streamflow data could also be reconstructed with this algorithm. Streamflow series with a natural flow regime were reconstructed well, whereas performance dropped slightly where the flow is altered by water storage and diversion for irrigation, especially downstream of dams. Performance was also less than optimal when filling daily series with severe changes in the flow regime, such as peak discharges. This drop in performance is related more to the hydroclimatic conditions of the studied watershed than to the structure of the algorithm. The reconstructed hydrographs allow analysis of flow variability and its interaction with key climate variables. Conclusion: MissForest is introduced as a machine-learning imputation method with high credibility and performance in reconstructing missing daily streamflow data, and it can be used automatically to fill gaps in river flow records at the daily scale. 
It is suggested that future studies analyze how watersheds with specific hydrological, physical, and climatic characteristics affect the performance of the MissForest algorithm. Other issues for future work include testing the proposed method in other climatic and geographical regions, measuring its sensitivity to rainfall and flow regime, and comparing its performance with other common methods. |
| Authors |
جواد آریان منش | Javad Aryanmanesh, Department of Physical Geography, Faculty of Geography and Environmental Planning, University of Sistan and Baluchestan, Iran
حمید نظری پور | Hamid Nazaripour, Department of Physical Geography, Faculty of Geography and Environmental Planning, University of Sistan and Baluchestan, Iran
پیمان محمودی | Peyman Mahmoodi, Department of Physical Geography, Faculty of Geography and Environmental Planning, University of Sistan and Baluchestan, Iran
پرویز خسروی | Parviz Khosravi, Iran Meteorological Organization
|
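
The evaluation protocol described in the abstract (artificial gaps, then goodness-of-fit on the withheld values) can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, the imputation step itself (MissForest) is left out, and the PBIAS sign convention used here (simulated minus observed, so positive values mean overestimation) is an assumption, since the abstract does not state which convention was applied. KGE follows the commonly used form based on correlation, variability ratio, and bias ratio.

```python
import math
import random


def remove_segment(flow, length, rng):
    """Scenario (a): blank one randomly placed contiguous segment of `length` days."""
    start = rng.randrange(len(flow) - length + 1)
    gapped = list(flow)
    for i in range(start, start + length):
        gapped[i] = None
    return gapped


def remove_points(flow, count, rng):
    """Scenario (b): blank `count` individual days chosen at random."""
    gapped = list(flow)
    for i in rng.sample(range(len(flow)), count):
        gapped[i] = None
    return gapped


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    # Population standard deviation; the ratio used in KGE is the same
    # whichever normalisation is chosen, as long as it is consistent.
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def pearson_r(obs, sim):
    """Linear correlation between observed and infilled values."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    return cov / (_std(obs) * _std(sim))


def r_squared(obs, sim):
    """Coefficient of determination taken as the squared Pearson correlation."""
    return pearson_r(obs, sim) ** 2


def pbias(obs, sim):
    """Percent bias; assumed convention: 100 * sum(sim - obs) / sum(obs)."""
    return 100.0 * sum(s - o for o, s in zip(obs, sim)) / sum(obs)


def kge(obs, sim):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    r = pearson_r(obs, sim)
    alpha = _std(sim) / _std(obs)    # variability ratio
    beta = _mean(sim) / _mean(obs)   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

In use, one would generate a gapped copy of a gauge record with `remove_segment` or `remove_points`, pass it to the imputation method, and then compute the three indicators only on the days that were artificially removed, comparing the infilled values against the withheld observations. Note that the correlation-based indicators are undefined when either series is constant.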