From the Author:

I first came upon 510 when searching for a thesis subject. They had already worked on Priority Index Models (PIM) for typhoons in the Philippines, with a vision to do the same for all vulnerable communities across the globe. I have long been interested in humanitarian aid, specifically in potential applications of econometrics and data analytics. With a Master's in Business Analytics & Quantitative Marketing, I wanted to use my skills to make an impact in humanitarian aid, and there was an opportunity to continue 510’s work on PIM.

Our team lead was part of the shelter cluster in Malawi during the 2015 flood and understood the challenges on the ground. We decided it would be interesting to research the Red Cross operations during this response in order to improve future aid. Focusing on people living in flood-prone areas, we established three levels of vulnerability. The data to populate the PIM still needed to be collected for Malawi, so we started searching for baseline and flood-related data for the models. The intention was to use four different machine learning techniques, a combination of two decision trees and two random forests, to create accurate forecasts of vulnerability. This became the basis for my thesis.

About the Master Thesis: Priority-Based Humanitarian Aid Modelling for Flood Impact in Malawi

This research forecasts the amount of help that Traditional Authorities (administrative areas) in Malawi would need in case of a flood and looks at the most influential factors.

We collected geographical, infrastructural and socio-demographic data as a baseline. These data sets were combined with event-specific variables, such as the amount of rainfall and the percentage of the area flooded. We engaged the most important stakeholders, such as Malawian Governmental Institutions and the humanitarian sector, to better understand the process of data collection and sharing. Using the 2015 flood as training data, four different machine learning techniques (CART Decision Tree, Conditional Inference Tree, CART Random Forest & Conditional Forest) were used to predict the amount of help needed by each Traditional Authority.
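To give a feel for what a CART-style tree does with such data, here is a minimal sketch of a single split chosen by Gini impurity. It is not the thesis code: the numbers are invented toy values, and the feature names (pct_area_flooded, pct_water_natural) and the three class labels are assumptions made for illustration only. A full CART tree repeats this search recursively, and a random forest averages many such trees grown on resampled data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means all one class)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature_names):
    """Return (feature, threshold, weighted_gini) for the best binary split."""
    best = None
    n = len(rows)
    for j, name in enumerate(feature_names):
        values = sorted({row[j] for row in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (name, threshold, score)
    return best


# Invented toy data, one row per hypothetical area:
# (percentage of area flooded, percentage of drinking water from natural sources)
FEATURES = ["pct_area_flooded", "pct_water_natural"]
ROWS = [(2, 10), (5, 40), (8, 20), (30, 15), (35, 60), (40, 30), (70, 50), (80, 20)]
LABELS = ["low", "low", "low", "medium", "medium", "medium", "high", "high"]

feature, threshold, impurity = best_split(ROWS, LABELS, FEATURES)
print(feature, threshold, round(impurity, 3))
```

On this toy data the first split falls on the flooded-area percentage, which separates the "low" areas from the rest. That outcome is a property of the invented numbers and should not be read as a result of the research.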


In researching this topic, we found that the Random Forest model gives the most accurate predictions, albeit with an accuracy of only 64%. The two most important variables in this research were 1) the percentage of each area that is flooded and 2) the percentage of drinking water coming from natural sources. A flood map can be retrieved from satellite images during or right after the disaster and intuitively gives an idea of the impact on a certain area (although the impact also depends on other factors, such as the elevation and the number of people living there). The percentage of drinking water coming from natural sources may be a proxy indicator for poverty. Although this indicator was selected as one of the most important in this research, the data might not be available for every country or every flood.

In order to develop this research further, a few things need to happen. First of all, the model uses one data set (from one flood). This needs to be increased before practical use. The predictions are too uncertain to be used in the field yet, but by collecting more data we could significantly improve their accuracy and reliability. To better understand the practical implications of the research, we held conversations with Red Cross staff and went on field trips to determine its uses in the field. Here we learnt more about their current EWS and river gauges.

Local Knowledge: Early Warning System (EWS) daily gauge levels in Thukuta

The main points mentioned were:

1) The model needs to be enhanced with other information to become more useful. For example, a map could show our predictions together with information on poverty and the number of people living in an area, indicating women and children. (Over-saturating the map would make the information overwhelming, so it would lose some of its efficiency and usefulness.)

2) These forecasts are very sector-specific. Some areas may need a lot of help with medical support (health) and food assistance (if they lose their livelihoods). We need to be clearer on how the model was created and to define its scope.

A good vehicle for successfully implementing these models in the field would be SIMS support. Using SIMS would ensure the right information is disseminated on the ground and potentially allow for faster forecasting (preferably within 1-3 days). As models are never 100% accurate, it is important to enrich them with local knowledge from the villages, the Malawian Red Cross and other NGOs.

Discussing flood prone areas at Ministry of Water Resources & Irrigation

One way of continuing the dialogue is to involve stakeholders in trainings, such as the ERU and FACT. The outcome would be teams on the ground knowing how to interpret and work with the information. This was also recommended during some of the interviews with Red Cross staff members.

Elements of PIM have already been implemented in humanitarian aid efforts, and we continue to enhance its capabilities in spite of the challenges. The steps taken for this thesis in Malawi were just the beginning; see related projects below.

The thesis underlying this work is available here.




Written by Jurg Wilbrink 510 GEO INFORMATION SERVICES The 510 initiative is working with the Malawi Red Cross to identify the country’s most vulnerable areas. A workshop and mapping exercise were held on data collection and sharing with different stakeholders in Malawi. Vulnerability data contributes to the Red Cross’ data preparedness initiative (see also Method:… Read More
