Normal approximation and smoothness for sums of means of lattice-valued random variables
Motivated by a problem arising when analysing data from quarantine searches, we explore properties of distributions of sums of independent means of independent lattice-valued random variables. The aim is to determine the extent to which approximations to those sums require continuity corrections. We show that, in cases where there are only two different means, the main effects of distribution smoothness can be understood in terms of the ratio rho_12 = (e_2 n_1)/(e_1 n_2), where e_1 and e_2 are the respective maximal lattice edge widths of the two populations, and n_1 and n_2 are the respective sample sizes used to compute the means. If rho_12 converges to an irrational number, or converges sufficiently slowly to a rational number, and in a number of other cases too (for example, those where rho_12 does not converge), the effects of the discontinuity of lattice distributions are of smaller order than the effects of skewness. However, in other instances, for example where rho_12 converges relatively quickly to a rational number, the effects of discontinuity and skewness are of the same size. We also treat higher-order properties, arguing that cases where rho_12 converges to an algebraic irrational number can be less prone to suffer the effects of discontinuity than cases where the limiting irrational is transcendental. These results are extended to the case of three or more different means, and also to problems where distributions are estimated using the bootstrap. The results have practical interpretation in terms of the accuracy of inference for, among other quantities, the sum or difference of binomial proportions.
Confidence intervals for the difference of two binomial proportions are well known; confidence intervals for the weighted sum of two binomial proportions, however, are less studied. We develop and compare seven methods for constructing confidence intervals for the weighted sum of two independent binomial proportions. The interval estimates are constructed by inverting the Wald test, the score test and the likelihood ratio test. The weights can be negative, so our results generalize those for the difference between two independent proportions. We provide a numerical study showing that these confidence intervals, which are based on large-sample approximations, perform very well even when a relatively small amount of data is available. The intervals based on the inversion of the score test showed the best performance. Finally, we show that, as for the difference of two binomial proportions, adding four pseudo-outcomes to the Wald interval for the weighted sum of two binomial proportions improves its coverage significantly, and we provide a justification for this correction.
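The pseudo-outcome correction mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that the four pseudo-outcomes are split as one success and one failure in each sample, as is commonly done for the difference of two proportions; the abstract does not state the allocation used for general weights, so this split is an assumption. The function name and its parameters are hypothetical.

```python
import math

def adjusted_wald_interval(x1, n1, x2, n2, w1=1.0, w2=-1.0, z=1.959964):
    """Adjusted Wald interval for w1*p1 + w2*p2 (illustrative sketch only).

    Assumption: the four pseudo-outcomes are allocated as one success and
    one failure per sample; the source does not confirm this allocation.
    z defaults to the 97.5% standard normal quantile (95% interval).
    """
    # Shrunken proportion estimates after adding the pseudo-outcomes.
    p1 = (x1 + 1) / (n1 + 2)
    p2 = (x2 + 1) / (n2 + 2)
    estimate = w1 * p1 + w2 * p2
    # Wald standard error of the weighted sum, using the adjusted sample sizes.
    se = math.sqrt(w1 ** 2 * p1 * (1 - p1) / (n1 + 2)
                   + w2 ** 2 * p2 * (1 - p2) / (n2 + 2))
    return estimate - z * se, estimate + z * se
```

With the default weights w1 = 1 and w2 = -1 the sketch reduces to an adjusted interval for the difference of two proportions; other weights, including negative ones, give the weighted-sum case.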
The insurance market is now one of the most rapidly developing sectors of the economy. Its purpose is to protect the property interests of individuals and legal entities when specific events (insured events) occur, at the expense of monetary funds formed from the insurance premiums they pay. The probabilistic nature of insured events, together with uncertainty about when they occur and how severe the losses are, makes it necessary to form loss reserves. Reserves for incurred but not reported claims (hereinafter referred to as IBNR reserves) seem to be the most challenging in terms of actuarial calculations. This article describes various actuarial loss-reserving techniques and gives examples of their application to a real insurance portfolio. Point estimation methods such as Chain Ladder, Bornhuetter-Ferguson and multiplicative techniques are compared with the stochastic bootstrap method, and the most accurate estimate is determined using run-off analysis.
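As a rough illustration of the Chain Ladder point estimate named in the abstract above, the following sketch applies volume-weighted development factors to a cumulative run-off triangle. It is not the article's implementation; the function name and the input layout (row i holds the observed cumulative claims of accident period i, oldest period first) are assumptions made for the example.

```python
def chain_ladder(triangle):
    """Basic Chain Ladder on a cumulative run-off triangle (sketch only).

    triangle: list of rows; row i contains the cumulative claims observed
    so far for accident period i, so later rows are shorter.
    Returns (development factors, projected ultimates, reserves).
    """
    n = len(triangle)
    factors = []
    for j in range(n - 1):
        # Use only rows where both development period j and j + 1 are observed.
        rows = [r for r in triangle if len(r) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    ultimates = []
    for row in triangle:
        value = row[-1]
        # Project the latest observed value through the remaining periods.
        for j in range(len(row) - 1, n - 1):
            value *= factors[j]
        ultimates.append(value)
    # Reserve = projected ultimate minus the latest observed cumulative claim.
    reserves = [u - r[-1] for u, r in zip(ultimates, triangle)]
    return factors, ultimates, reserves
```

For a hypothetical triangle such as `[[100, 150, 165], [110, 165], [120]]`, the sum of the returned reserves is the Chain Ladder estimate of the outstanding claims under this simplified scheme.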
Border-based regulatory inspectors, such as quarantine and customs organizations, intervene at national and state borders. Performance indicators can be used to help assess and compare non-compliance, and also to assess the inspectorate's ability to detect and rectify it. We document a suite of three performance indicators that target inspectorate activity at and before the border, and provide protocols for collecting the data needed to compute point estimates of the indicators. To provide interval estimates, we then discuss the mathematical models that arise in the data collection. We cover three distinct setups, namely the importation of certain classes of air cargo in Australia, both historically and under current policies, and passengers arriving at an international airport. The methodology, developed here using the terminology of quarantine inspections, can be applied in more general settings. Under the model that best describes the way the data are collected, we provide confidence bounds for the performance indicators with coverage close to the nominal level. The methodology is then illustrated on real and fabricated data.
Prediction of the duration of a repair and maintenance project of a gas transport system is an important part of planning activities. There are numerous sources of uncertainty that may result in time overruns, possibly leading to multiple negative consequences. Our experience in planning this work suggests that accepting the stochastic nature of the project duration is a constructive step towards preparing for contingencies and defining penalties for repair companies. To support this approach, one needs to construct probability distributions of the project durations. To address the scarcity of observed data, we suggest using a bootstrap resampling procedure. Gram-Charlier functions and order statistics are employed to approximate the distributions. It is demonstrated how to derive them for a separate repair project and for a larger project consisting of a number of concurrently running subprojects. Following this, guidance is provided on how to decide which duration should define the deadline for completion of the whole work. A simple example is provided.
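The resampling idea in the abstract above can be sketched in a few lines. The sketch deliberately swaps in plain Monte Carlo resampling from the observed durations in place of the Gram-Charlier and order-statistic approximations used by the authors, and it assumes that a project made of concurrent subprojects finishes when its slowest subproject finishes. The function name, parameters and default values are hypothetical.

```python
import random

def bootstrap_deadline(subproject_durations, quantile=0.9,
                       n_resamples=2000, seed=0):
    """Resampling estimate of a deadline for concurrent subprojects (sketch).

    subproject_durations: one list of observed durations per subproject.
    Assumption: the whole project ends when the slowest subproject ends.
    Returns the empirical `quantile` of the simulated project durations.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    simulated = []
    for _ in range(n_resamples):
        # Draw one duration per subproject from its observed sample
        # and take the maximum as the duration of the whole project.
        simulated.append(max(rng.choice(d) for d in subproject_durations))
    simulated.sort()
    index = min(len(simulated) - 1, int(quantile * len(simulated)))
    return simulated[index]
```

Choosing a higher quantile gives a more conservative deadline; how to trade this against penalties is the kind of decision the abstract's guidance addresses.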
This book presents recent non-asymptotic results for approximations in multivariate statistical analysis. The book is unique in its focus on results with the correct error structure for all the parameters involved. Firstly, it discusses the computable error bounds on correlation coefficients, MANOVA tests and discriminant functions studied in recent papers. It then introduces new areas of research in high-dimensional approximations for bootstrap procedures, Cornish–Fisher expansions, power-divergence statistics and approximations of statistics based on observations with random sample size. Lastly, it proposes a general approach for the construction of non-asymptotic bounds, providing relevant examples for several complicated statistics. It is a valuable resource for researchers with a basic understanding of multivariate statistics.
A model for organizing cargo transportation between two node stations connected by a railway line with a certain number of intermediate stations is considered. Cargo moves in one direction. Such a situation may occur, for example, if one of the node stations is located in a region that produces raw material for a manufacturing industry located in another region, where the other node station is situated. Freight traffic is organized by means of a number of technologies. These technologies determine the rules for taking on cargo at the initial node station, the rules of interaction between neighboring stations, and the rule for distributing cargo at the final node station. The cargo transportation process follows a set control rule. For such a model, one must determine the possible modes of cargo transportation and describe their properties. The model is described by a finite-dimensional system of differential equations with nonlocal linear restrictions. The class of solutions satisfying the nonlocal linear restrictions is extremely narrow. This makes it necessary to extend the solutions of the system of differential equations in a “correct” way to a class of quasi-solutions whose distinctive feature is gaps (jumps) at a countable number of points. Using the fourth-order Runge–Kutta method, we were able to construct these quasi-solutions numerically and determine their rate of growth. The main technical difficulty was obtaining quasi-solutions that satisfy the nonlocal linear restrictions. Furthermore, we investigated how the quasi-solutions and, in particular, the sizes of their gaps (jumps) depend on a number of model parameters characterizing the control rule, the cargo transportation technologies and the intensity of cargo supply at a node station.
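For reference, the classical fourth-order Runge–Kutta scheme mentioned in the abstract above can be sketched as follows for a generic system y' = f(t, y). This shows only the standard integrator; it does not reproduce the authors' treatment of the nonlocal linear restrictions or of the jumps in the quasi-solutions. The function names are hypothetical.

```python
def rk4_step(f, t, y, h):
    """One classical RK4 step for y' = f(t, y), with y a list of floats."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    # Weighted average of the four slope estimates.
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def integrate(f, t0, y0, t1, steps):
    """Integrate from t0 to t1 using `steps` equal RK4 steps."""
    h = (t1 - t0) / steps
    t, y = t0, list(y0)
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y
```

A solution with jumps would presumably be built by restarting such an integrator after each jump point with updated initial values, but that step depends on the model's restrictions and is not shown here.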