Abstract:
Statistics of extremes has seen much growth both in theory and application since its early theoretical developments almost a century ago in the 1920s and its first major applications to real-life problems pioneered by Emil Gumbel in the early 1940s. Although the theory and applications of extreme value theory (EVT) have been extensively advanced and utilised in most developed countries, in terms of applications little has been done in many developing countries in Africa despite the abundance of areas of applications and raw data in some of these countries. In hydrology, the choice of flood frequency probability distributions for a particular site or region remains the subject of ongoing research. The work contained in this thesis is a contribution towards this area and it addresses this problem in one of the developing and economically challenged countries in Africa, Mozambique, in the lower Limpopo River basin (LLRB). The LLRB is a basin characterised by extreme natural hazards, alternating between extreme floods and severe droughts.
This thesis is based on an extensive application of EVT to extreme flood heights data in the LLRB of Mozambique at three sites: Chokwe, Combomune and Sicacate hydrometric stations. Two fundamental approaches of EVT, block maxima and peaks-over-threshold (POT), are used in this thesis. Recent theoretical results by Ferreira and de Haan (2015) have shown that despite its inefficiency due to data lost as a result of blocking, the block maxima approach is more efficient in a number of situations than the POT approach, and the two approaches are quite comparable for large sample sizes. A number of
candidate distributions are investigated for their goodness-of-fit to the annual daily maximum flood heights in a block maxima realisation at each site. The findings reveal that the GEV distribution is the most appropriate distribution to apply in the LLRB and the distribution can be recommended as the likelihood function for regional and spatial extremes flood frequency analysis in the basin. The thesis addresses the issue of cumulative effects on daily flood heights through a comparative analysis of six annual maxima moving sums. The findings demonstrate that the six annual maxima time series models are not significantly different based on the characteristics considered in this thesis.
In an attempt to reduce uncertainties in the estimates, a Bayesian Markov chain Monte Carlo (MCMC) approach with a conjugate prior and a GEV likelihood function is used to model the tails of the extreme flood heights in the basin. The findings reveal that the addition of prior information in Bayesian MCMC substantially reduces uncertainties in the estimates and improves precision in the predicted extreme floods. The r largest order statistics models developed in this thesis are generally promising and the standard errors of the estimates of the parameters are substantially reduced. In order to account for the impact of climate change, nonstationary models are considered with the long-term trend and the Southern Oscillation Index (SOI) (a meteorological variable indicator) as covariates of the parameters of the GEV distribution and the generalised Pareto distribution (GPD).
Among the major contributions of this thesis is a proposed procedure for the determination of the 8-day window period used in extracting independent r largest order values within the same year for the r largest order statistics approach. A summary of the key findings and contributions of this thesis is given in Chapter 9. Moreover, the contributions on each study topic are given at the end of the corresponding chapter.