A Comparison of Regression Models for Count Data in Third Party Automobile Insurance

University essay from KTH/Matematisk statistik

Author: Annelie Johansson; [2014]



In insurance pricing, many statistical tools are used. The number of available
tools is overwhelming, and it is difficult to know which one to choose. In this
thesis, five regression models are compared on how well they fit the number of
reported claims in third party automobile insurance. The models considered are
OLS, Poisson, Negative Binomial and two Hurdle models. The Hurdle models are
based on Poisson regression and Negative Binomial regression respectively, but
account for an additional number of zeros. The AIC and BIC statistics are
considered for all the models, and the predicted number of claims is calculated
and compared to the observed number of claims. A hypothesis test of the null
hypothesis that the Hurdle models are not needed is also performed. The OLS
regression is not suitable for this kind of data, because the number of claims
is not normally distributed: many policyholders never report any claims, so the
data include an excess number of zeros, and the number of claims can never be
negative. The other four models are considerably better, and all of them fit
the data satisfactorily. The model that performs best in one test is inadequate
in another. The Negative Binomial model is slightly better than the other
models, but the model choice is not obvious. The conclusion is not that a
specific model is preferable, but that one needs to choose a model critically.
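To make the comparison criteria concrete, the following minimal sketch fits an intercept-only Poisson model and an intercept-only Hurdle Poisson model to a small set of claim counts, then reports AIC, BIC and the predicted number of zero-claim policies. The counts are hypothetical and not taken from the thesis, the models use no covariates, and the helper names (`fit_truncated_lambda`, `poisson_logpmf`, `aic`, `bic`) are illustrative assumptions rather than the author's implementation.

```python
import math

# Hypothetical claim counts per policy (NOT data from the thesis).
counts = [0] * 900 + [1] * 70 + [2] * 20 + [3] * 10
n = len(counts)


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2 * loglik


def poisson_logpmf(k, lam):
    """Log of the Poisson probability mass function at k."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def fit_truncated_lambda(mean_pos, tol=1e-12):
    """MLE of a zero-truncated Poisson rate: solve lam / (1 - exp(-lam)) = mean_pos.

    Uses bisection; assumes mean_pos > 1 so the root lies in (0, mean_pos).
    """
    lo, hi = 1e-9, mean_pos
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid / (1 - math.exp(-mid)) < mean_pos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Intercept-only Poisson: the MLE of the rate is the sample mean.
lam_pois = sum(counts) / n
ll_pois = sum(poisson_logpmf(k, lam_pois) for k in counts)

# Intercept-only Hurdle Poisson: a binary part for zero vs. positive,
# and a zero-truncated Poisson for the positive counts.
positives = [k for k in counts if k > 0]
n_zero = n - len(positives)
p_zero = n_zero / n
lam_trunc = fit_truncated_lambda(sum(positives) / len(positives))
ll_hurdle = (
    n_zero * math.log(p_zero)
    + len(positives) * math.log(1 - p_zero)
    + sum(
        poisson_logpmf(k, lam_trunc) - math.log(1 - math.exp(-lam_trunc))
        for k in positives
    )
)

# Predicted number of zero-claim policies under each model.
zeros_pois = n * math.exp(-lam_pois)
zeros_hurdle = n * p_zero

for name, ll, n_params, zeros in [
    ("Poisson", ll_pois, 1, zeros_pois),
    ("Hurdle Poisson", ll_hurdle, 2, zeros_hurdle),
]:
    print(
        f"{name}: AIC={aic(ll, n_params):.1f} "
        f"BIC={bic(ll, n_params, n):.1f} predicted zeros={zeros:.1f}"
    )
```

Lower AIC and BIC indicate a better trade-off between fit and model size. The ranking on this made-up sample says nothing about the thesis's data, where covariates and the Negative Binomial variants also enter the comparison.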

Keywords: Regression Models, Insurance, Count Data, Regression Analysis, Hurdle

