The following is by no means an exhaustive list, but it gives an indication of the many projects I've been involved in.
Catastrophe models for Property and Auto
Production-ready models to estimate ultimate average severity for declared CAT events, built to assist PI Actuarial Trend & Claims Analysis in their projections. We estimated event-level and peril-level (Hail, Hurricane, Tornado, ...) severity models, drawing heavily on external data such as CAT-event, weather, demographic and social media sources, and using machine learning techniques such as XGBoost.
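As a rough, hypothetical sketch of the modelling approach (not the production code), a peril-level severity model of this kind can be fitted with XGBoost using a Gamma regression objective; the column names and data below are invented purely for illustration.

```python
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Hypothetical CAT claims data: peril, a couple of weather/exposure features,
# and average claim severity as the target.
rng = np.random.default_rng(0)
n = 5_000
df = pd.DataFrame({
    "peril": rng.choice(["hail", "hurricane", "tornado"], n),
    "wind_gust_mph": rng.gamma(5.0, 10.0, n),
    "roof_age_years": rng.integers(0, 40, n),
    "severity": rng.gamma(2.0, 4_000.0, n),  # placeholder target
})
X = pd.get_dummies(df.drop(columns="severity"), columns=["peril"])
y = df["severity"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gamma regression objective is a natural choice for strictly positive severities.
model = xgb.XGBRegressor(objective="reg:gamma", n_estimators=300,
                         max_depth=4, learning_rate=0.05)
model.fit(X_train, y_train)
print("mean predicted severity:", model.predict(X_test).mean())
```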
Affinity Partnership Analysis
Identified the major drivers of high Property and Auto loss ratios for one of our affinity organizations, using loss ratio, frequency and severity models. Here we used the classical GLM framework with oversampling; it was a pure inference project.
Actuarial Pricing Projects
I was involved in four multivariate pricing (ratemaking) models, built from scratch and for the very first time, for the Auto and Household lines of property and casualty (non-life) insurance companies. Pricing models are the technical heart of an insurance company, i.e. the most important models in this industry.
I worked on these models end to end, from database creation to the final production-ready output and model monitoring. That means I worked on database design, ETL, data wrangling tasks (cleansing, EDA, imputation, outliers, etc.), feature engineering (for example, creating weather features from raw meteorological data), model building (frequency, severity and loss ratio models), model implementation and model monitoring.
In this area I was involved in really cool things such as geo-locating the policies, spatial analysis (territorial ratemaking), the use of GIS (Geographical Information Systems), the creation of meteorological models using spatial techniques such as spatial interpolation, building weather features from scratch, and the use of other kinds of external information such as census, credit, lifestyle, fire station location and environmental data.
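As an illustration of the spatial interpolation step, a weather feature at each policy location can be built from nearby station readings with inverse-distance weighting; the sketch below is a simplified, hypothetical version using planar distances and made-up coordinates.

```python
import numpy as np

def idw_interpolate(station_xy, station_values, policy_xy, power=2.0):
    """Inverse-distance-weighted interpolation of station readings at policy locations."""
    # Pairwise distances between each policy point and each weather station.
    d = np.linalg.norm(policy_xy[:, None, :] - station_xy[None, :, :], axis=2)
    d = np.maximum(d, 1e-9)          # avoid division by zero at exact station matches
    w = 1.0 / d ** power             # closer stations get larger weights
    return (w * station_values).sum(axis=1) / w.sum(axis=1)

# Hypothetical station readings (e.g. annual hail days) and policy coordinates.
stations = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
hail_days = np.array([3.0, 8.0, 5.0])
policies = np.array([[2.0, 1.0], [7.0, 6.0]])

print(idw_interpolate(stations, hail_days, policies))
```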
I used classical GLMs with Gamma and Log-Normal distributions (severity) and Poisson, Negative Binomial and ZIP/ZINB (Zero-Inflated Poisson and Negative Binomial) distributions (frequency), as well as GLMs with mixed effects for cross-sectional and longitudinal datasets.
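For instance, the frequency side of such a model can be sketched as a Poisson GLM with a log(exposure) offset; the example below uses Python's statsmodels and invented variables purely to show the model form (the production models were much richer and partly built in SAS).

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical policy-level data: claim counts, exposure in years, two rating factors.
rng = np.random.default_rng(1)
n = 10_000
df = pd.DataFrame({
    "claims": rng.poisson(0.08, n),
    "exposure": rng.uniform(0.25, 1.0, n),
    "driver_age": rng.integers(18, 80, n),
    "vehicle_group": rng.choice(list("ABCD"), n),
})

# Poisson frequency GLM with log link and log(exposure) offset,
# so fitted rates are on a per-exposure-year basis.
model = smf.glm(
    "claims ~ driver_age + C(vehicle_group)",
    data=df,
    family=sm.families.Poisson(),
    offset=np.log(df["exposure"]),
).fit()
print(model.summary())
```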
Database construction
The database contains around 250 internal and external risk factors. We used public and private external data sources such as census, weather, credit, lifestyle, fire station location and environmental data, among others.
The project involved the use of spatial interpolation to create weather risk factors from raw meteorological data. The use of QGIS and SAS Enterprise Miner was key during the data cleaning, data imputation, data exploration and derived-variable creation tasks.
The external part of this database is available to the rest of the company to enrich the construction of new models in areas such as Analytics, e.g. Fraud, Marketing, etc.
Machine Learning Algorithm implementations
For those SAS clients not using SAS Enterprise Miner (the machine learning tool from SAS), I have implemented several machine learning algorithms from scratch: decision trees, stepwise regression, naïve Bayes, spatial interpolation and smoothing, and clustering for insurance rating territories, among others.
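Those implementations were written in SAS for the clients in question; purely to illustrate the kind of from-scratch work involved, here is a minimal k-means clustering sketch in Python for grouping geographic units into rating territories, with made-up coordinates and loss costs.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain k-means: assign points to the nearest centroid, then move centroids to cluster means."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmin(np.linalg.norm(X[:, None] - centroids[None, :], axis=2), axis=1)
        new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                  else centroids[j] for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical ZIP-code level features: (latitude, longitude, smoothed loss cost).
rng = np.random.default_rng(2)
zips = np.column_stack([rng.uniform(30, 35, 200),
                        rng.uniform(-100, -95, 200),
                        rng.gamma(2.0, 50.0, 200)])
labels, _ = kmeans(zips, k=5)
print("territory sizes:", np.bincount(labels))
```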
Price optimization models using GLM
Price elasticity of demand by homogeneous segment, using logistic models, for the pricing area of an insurance company.
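A minimal sketch of the idea, assuming hypothetical quote data: a logistic demand model relates the probability of accepting a quote to the offered price, and the point elasticity follows from the fitted coefficient.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical quote data for one homogeneous segment: offered premium relative
# to the current premium, and whether the customer accepted (converted).
rng = np.random.default_rng(3)
n = 20_000
price_ratio = rng.uniform(0.8, 1.3, n)
p_accept = 1.0 / (1.0 + np.exp(-(4.0 - 4.5 * price_ratio)))  # simulated demand curve
df = pd.DataFrame({"price_ratio": price_ratio,
                   "accepted": rng.binomial(1, p_accept)})

# Logistic demand model: P(accept) as a function of the price change.
model = smf.logit("accepted ~ price_ratio", data=df).fit()

# Point elasticity at the current price: dP/dx * x / P = beta * (1 - P) * x for a logit model.
beta = model.params["price_ratio"]
p = model.predict(pd.DataFrame({"price_ratio": [1.0]})).iloc[0]
print("estimated elasticity at current price:", beta * (1 - p) * 1.0)
```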
Price optimization using Survival models
Implemented retention and price elasticity of demand by homogeneous segment using not the common logistic models but more refined survival models.
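As a hedged illustration (the data and variable names below are invented), a retention model of this kind can be sketched with a Cox proportional hazards model, which accounts for censored policies that have not yet lapsed, something a plain logistic model ignores.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

# Hypothetical retention data: policy tenure in years, whether the policy lapsed
# (the event of interest) and the last renewal price change.
rng = np.random.default_rng(4)
n = 5_000
df = pd.DataFrame({
    "tenure_years": rng.exponential(4.0, n).round(2) + 0.1,
    "lapsed": rng.binomial(1, 0.4, n),
    "price_change_pct": rng.normal(3.0, 5.0, n),
})

# Cox proportional hazards model: how the renewal price change shifts the lapse hazard,
# taking right-censoring (still-active policies) into account.
cph = CoxPHFitter()
cph.fit(df, duration_col="tenure_years", event_col="lapsed")
cph.print_summary()
```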
Credit Scoring for a credit and surety insurance company
Credit scoring project for one of the world leaders in credit and surety insurance. Development of PD (Probability of Default) and LGD (Loss Given Default) models for an SME credit portfolio using Generalized Linear Models.
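A minimal sketch of the PD side under invented data: a binomial GLM (logistic regression) of the default flag on a few risk drivers; the LGD side would typically be a separate GLM fitted on the losses of defaulted exposures.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical SME portfolio: default flag plus a few financial risk drivers.
rng = np.random.default_rng(5)
n = 8_000
df = pd.DataFrame({
    "default": rng.binomial(1, 0.05, n),
    "leverage": rng.uniform(0.1, 0.9, n),
    "years_trading": rng.integers(1, 40, n),
    "sector": rng.choice(["retail", "construction", "services"], n),
})

# PD model as a binomial GLM (logistic regression) on the risk drivers.
pd_model = smf.glm("default ~ leverage + years_trading + C(sector)",
                   data=df, family=sm.families.Binomial()).fit()
print(pd_model.summary())
```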
Credit Scoring for a credit rating agency
Credit scoring project for a credit rating agency. Development of PD (Probability of Default) and LGD (Loss Given Default) models for an SME credit portfolio using Generalized Linear Models.
Text Mining / Natural Language Processing
Text mining project to separate water losses into weather-related and non-weather-related events. This distinction is essential in building accurate per-peril ratemaking predictive models for weather and non-weather-related water losses for the household portfolio of a property and casualty insurance company.
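A small, hypothetical sketch of this kind of classifier: claim descriptions are turned into TF-IDF features and fed to a simple linear model (the real project's data and pipeline were, of course, different).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical adjuster notes labelled as weather-related (1) or not (0).
notes = [
    "pipe burst in bathroom flooding kitchen below",
    "heavy rain and wind drove water through roof flashing",
    "washing machine hose failed water damage to floor",
    "storm surge flooded basement after hurricane",
]
labels = [0, 1, 0, 1]

# TF-IDF bag-of-words representation feeding a simple linear classifier.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(notes, labels)

print(clf.predict(["wind driven rain entered through damaged shingles"]))
```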
Clustering for client segmentation
I was involved in several client segmentation projects for different insurance companies.
A/B Testing
As part of the price optimization strategy for the online channel, we used A/B testing.
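For illustration, with hypothetical conversion counts, a basic A/B comparison of two price points can be read off a two-proportion z-test.

```python
from statsmodels.stats.proportion import proportions_ztest

# Hypothetical online quote conversions under the current price (A) and a test price (B).
conversions = [460, 510]   # converted quotes in each arm
quotes = [10_000, 10_000]  # total quotes shown in each arm

# Two-sample z-test for a difference in conversion rates between the two arms.
z_stat, p_value = proportions_ztest(count=conversions, nobs=quotes)
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```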
Udacity Nanodegree projects:
As part of the Udacity nanodegrees, I have been working on the following mini projects.
Udacity Self-Driving Car Engineer Program projects
Finding Lane Lines on the Road
When we drive, we use our eyes to decide where to go. The lines on the road that show us where the lanes are act as our constant reference for where to steer the vehicle. Naturally, one of the first things we would like to do in developing a self-driving car is to automatically detect lane lines using an algorithm.
In this project, the aim is to detect lane lines in images using Python and OpenCV. OpenCV stands for "Open-Source Computer Vision"; it is a package with many useful tools for analyzing images.
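A minimal sketch of the classic pipeline for this task, assuming a locally available road image (the path below is a placeholder): grayscale conversion, Gaussian blur, Canny edge detection and a probabilistic Hough transform to pick out line segments.

```python
import cv2
import numpy as np

# Load a road image (placeholder path), convert to grayscale and blur slightly.
image = cv2.imread("test_images/road.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Canny edge detection followed by a probabilistic Hough transform to find line segments.
edges = cv2.Canny(blurred, 50, 150)
lines = cv2.HoughLinesP(edges, rho=2, theta=np.pi / 180, threshold=20,
                        minLineLength=40, maxLineGap=20)

# Draw the detected segments back onto the original image.
if lines is not None:
    for line in lines:
        x1, y1, x2, y2 = line[0]
        cv2.line(image, (x1, y1), (x2, y2), (0, 0, 255), thickness=5)
cv2.imwrite("lane_lines_output.jpg", image)
```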
(more coming soon)
Udacity Deep Learning Nanodegree projects:
(more coming soon)