ORTEC & 510 – Can text mining optimize humanitarian response?

When a disaster strikes, we want to be prepared.

Every year, nearly 160 million people worldwide are affected by natural disasters such as earthquakes, hurricanes, and floods. Thanks to data, an increasing number of natural disasters can be predicted, enabling people to safely evacuate in time. This way, data can make an enormous difference in the impact of disasters on the lives of many.

Recently, the International Federation of Red Cross and Red Crescent Societies (IFRC) has started to introduce more data-driven methods into its daily work. To explore opportunities to continue on this path, 510 – a data initiative of the Netherlands Red Cross – and ORTEC have partnered up. The purpose of the 510 initiative is to improve the speed, quality, and cost-effectiveness of humanitarian aid by using and creating data and digital products. The collaboration involves joint research into the use of text mining for disaster response, thereby collectively creating a larger impact on the humanitarian sector. For this, we use machine learning models. Since a single small decision made while building such a model can affect the lives of many, a solid understanding of, and control over, how machine learning models work is of the utmost importance.

Often the very data these models need in order to function accurately is not easily found, and very often it is not “ready to use”. How and where this data is extracted, cleaned, analyzed, and used for decision-making is just as important.

At 510, a team of data scientists uses data and artificial intelligence (AI) to help with disaster preparedness, early warning, early action, and response. The team converts the data into understanding, e.g. by uncovering the areas that are most likely to be affected and indicating which areas should be prioritized in the humanitarian response. The information derived from the data is shared with humanitarian aid workers, decision-makers, and people affected, enabling them to prepare for and cope with (predicted) disasters and crises.

Text mining to predict the impact of disasters

Part of the data needed could be found by using text mining: a process by which a large set of existing information is examined to generate new knowledge. For the 510 initiative, Luisa Baeskow and Annelies Riezebos, two Master of Science students, are investigating the application of text mining techniques to specific Red Cross documents. The Vulnerability & Capacity Assessment (VCA) reports and the documents relating to the Disaster Relief Emergency Fund (DREF) may contain data on both vulnerability and historical events. Through data and text mining, the application scope of these documents can be widened to disaster response and preparedness (the measures taken to prepare for and reduce the effects of disasters) in order to better shape and target future interventions.

IBF: Impact Based Forecasting

When communities can respond proactively to a disaster, damage, suffering, and emergency aid costs can be reduced. To respond proactively, predictions of disasters are crucial: they make it possible to send out early warnings and improve preparedness. Historical and actual data are collected and integrated to predict the impact of impending disasters on vulnerable people who live in areas prone to natural disasters. The data is included in 510’s Impact-Based Forecasting (IBF) framework: a framework driven by (historical) impact data, as well as by the vulnerability and exposure of affected communities (a predictor of the severity of a disaster’s impact). The framework, presented in the figure above, consists of three steps. It starts with understanding the risk: based on geographic and population data, risk models can be developed. These risk models predict which areas are (most) vulnerable and assess a community’s risk. Text mining techniques are used during step 2 of the Impact-Based Forecasting framework: identifying the impact of impending disasters. This is then followed by forecast-triggered actions, thereby allowing for ‘early warning, early action’: e.g. enabling communities to evacuate at an early stage.


About 510

510, a data initiative of the Netherlands Red Cross, began in 2016 and has been growing ever since. It focuses on the smart use of data for humanitarian aid at a global level: ‘510’ refers to the total surface area of the earth in millions of square kilometers. The mission of 510 is to shape the future of humanitarian aid by converting data into understanding and putting it in the hands of humanitarian aid workers, decision-makers, and people affected, enabling them to better prepare for and cope with disasters and crises. At the beginning of 2019, a unique collaboration started between 510 and ORTEC. The collaboration includes joint research into the use of text mining for disaster response and funding, thereby collectively creating a larger impact on the humanitarian sector.

Technical deep dive

Historical impact information can be mined from the Disaster Relief Emergency Fund documents, since these documents were written after previous disasters struck and provide a situation analysis. Annelies will continue the extraction of impact data from these documents, in the form of sentences that contain impact data.

Not only does this automatic extraction of the most relevant sentences speed up the process of analyzing the documents later, but it can also provide insights into what qualitative data the current Impact-Based Forecasting (IBF) models are missing. A better understanding of disaster impact is key to optimizing the disaster response.
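To make the idea of extracting impact sentences more concrete, here is a minimal sketch in Python. It is not the method used in the research, which is not described in this article. It is a simple rule-based baseline that keeps sentences which contain both a number and an impact-related word. The keyword list, the function names, and the sample text are all invented for illustration.

```python
import re

# Hypothetical impact vocabulary (an assumption, not taken from the research);
# a real project would curate this list or learn it from labelled data.
IMPACT_TERMS = (
    "affected", "displaced", "destroyed", "damaged",
    "killed", "injured", "evacuated",
)

# Naive sentence boundary: whitespace that follows ., ! or ?
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")
DIGIT = re.compile(r"\d")


def split_sentences(text):
    """Split text into sentences with a simple punctuation rule."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def extract_impact_sentences(text):
    """Return sentences that contain a digit and at least one impact term."""
    hits = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        has_number = DIGIT.search(sentence) is not None
        has_term = any(term in lowered for term in IMPACT_TERMS)
        if has_number and has_term:
            hits.append(sentence)
    return hits


# Invented sample text, not a quote from an actual DREF document.
sample = (
    "Heavy rains started on 3 March. "
    "The floods affected 12,000 people and destroyed 850 houses. "
    "The National Society deployed volunteers to the area."
)

for sentence in extract_impact_sentences(sample):
    print(sentence)
```

A rule like this is easy to inspect, which matters when decisions affect people's lives, but it will miss impact described without numbers and will pick up some irrelevant sentences. A trained sentence classifier is the more likely next step.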

The Vulnerability & Capacity Assessment reports contain community-level information about the vulnerability and exposure of the population and about its capacity to respond if a disaster strikes. Luisa will classify the Vulnerability & Capacity Assessment reports according to the information about hazard exposure they contain.
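As an illustration of what classifying reports by hazard exposure could look like, the sketch below tags a report with every hazard that it mentions at least a given number of times. This is an assumed keyword-counting baseline, not the approach taken in the research. The hazard categories, keywords, threshold, and sample text are hypothetical.

```python
import re
from collections import Counter

# Hypothetical hazard categories and keywords (assumptions for illustration).
HAZARD_KEYWORDS = {
    "flood": ("flood", "floods", "flooding", "inundation"),
    "drought": ("drought", "droughts", "dry spell"),
    "cyclone": ("cyclone", "cyclones", "hurricane", "typhoon"),
}


def count_hazard_mentions(text):
    """Count whole-word keyword matches per hazard category."""
    lowered = text.lower()
    counts = Counter()
    for hazard, keywords in HAZARD_KEYWORDS.items():
        for keyword in keywords:
            pattern = r"\b" + re.escape(keyword) + r"\b"
            counts[hazard] += len(re.findall(pattern, lowered))
    return counts


def classify_report(text, min_mentions=2):
    """Return the sorted hazards mentioned at least min_mentions times."""
    counts = count_hazard_mentions(text)
    return sorted(h for h, n in counts.items() if n >= min_mentions)


# Invented sample text, not a quote from an actual VCA report.
report = (
    "The community reported flooding every rainy season. "
    "Floods in the last decade damaged crops. "
    "A drought was mentioned once."
)

print(classify_report(report))
```

The threshold `min_mentions` is there so that a hazard mentioned only in passing does not label the whole report. A supervised text classifier trained on labelled reports would replace the keyword table once enough examples are available.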

While steps have been taken to “digitize” the Vulnerability & Capacity Assessment reports, no automatic analysis of these reports has been done until now. The research is therefore a step towards making the Vulnerability & Capacity Assessment reports fit for the digital age.

Author profile

Luisa Baeskow and Annelies Riezebos are both writing their master’s theses at ORTEC and 510 on the application of text mining techniques to certain Red Cross documents. Luisa Baeskow is a Master of Science student in Communication and Information Sciences at Tilburg University, specializing in Cognitive Science and Artificial Intelligence. Annelies Riezebos is studying for a master’s degree in Statistical Science for the Life and Behavioral Sciences at Leiden University, with a specialization in Data Science.


