Inventory of Case Studies

Once the new parametric database has been created for each selected volcano, the creation of the Inventory of Case Studies will be divided into three tasks (Figure 6):

WP3.1. Identification of the primary volcanic processes.

Based on societal relevance, the volume of material emitted into the atmosphere, geophysical evidence, and other parameters, we will select time periods representing volcanic eruptions or crises for the three selected volcanoes. This search will draw on reports from the relevant volcanological institutions (e.g. Smithsonian, USGS, INGV) and from other sources. These episodes, or Case Studies, must cover both eruptive and non-eruptive episodes. For non-eruptive episodes, the indicator will be a change to orange or red in the “volcanic traffic light system” that did not result in an eruption.

The working team will be led by Prof. Ibáñez, in direct collaboration with Drs. Prudencio, Ontiveros and Feriche of the Research Team and several members of the Working Team.

WP3.2. Determination of the baseline.

Once we have identified the case studies for each selected volcano, we will extract from the new parametric database the time interval covering each period of unrest. Using the remaining portions of the parametric catalogue, the average value of each parameter will be calculated, together with other classical statistical values. This average value and its associated statistical parameters will represent the background baseline. These values will be stored in a D-dimensional vector named the “baseline vector” and will be used to determine when the parameters change.
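The baseline computation described above can be sketched as follows. This is only an illustration under stated assumptions: the catalogue layout (a list of `(time, parameters)` rows), the function name `baseline_vector`, and the choice of the standard deviation as the accompanying statistic are all hypothetical and not part of the project specification.

```python
from statistics import mean, stdev

def baseline_vector(catalogue, unrest_intervals):
    """Return per-parameter mean and standard deviation outside unrest periods.

    catalogue: list of (time, [p_1, ..., p_D]) rows (assumed layout).
    unrest_intervals: list of (t_begin, t_end) pairs, one per case study.
    """
    def in_unrest(t):
        return any(t0 <= t <= t1 for t0, t1 in unrest_intervals)

    # Keep only the rows that fall outside every period of unrest.
    quiet_rows = [params for t, params in catalogue if not in_unrest(t)]
    # Transpose so that each tuple holds the history of one parameter.
    columns = list(zip(*quiet_rows))
    means = [mean(col) for col in columns]
    spreads = [stdev(col) for col in columns]
    return means, spreads

# Toy catalogue with D = 2 parameters; times 2 and 3 belong to an unrest period.
catalogue = [
    (0, [1.0, 10.0]),
    (1, [1.2, 11.0]),
    (2, [5.0, 30.0]),
    (3, [6.0, 35.0]),
    (4, [0.8, 9.0]),
]
baseline, spread = baseline_vector(catalogue, [(2, 3)])
```

Here `baseline` plays the role of the D-dimensional baseline vector, and `spread` is one example of the associated statistical values.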

The working team will be led by Prof. Benítez, in direct collaboration with Drs. Mota and Ibáñez of the Research Team and several members of the Working Team.

WP3.3. Extraction of parameters that evolve before each Case Study.

This sub-task contains two actions:

  • (a) to analyse the existence of concept drifts in the volcanic phenomenon as exhibited in its seismic recordings, i.e. in the parametric database.
  • (b) to use dimensional reduction to find the Minimum Common Number of representative parameters in a volcanic crisis.

The working team will be led by Prof. Benítez, in direct collaboration with Drs. Mota and Ibáñez of the Research Team and several members of the Working Team.

WP3.3.a. Identification of parameters that evolve over time.

The objective is to detect any potential departure from the usual statistical parameter baseline. FEMALE will start by using concepts such as anomaly detection and concept drift, also known as covariate/dataset shift or non-stationarity [41]. An anomaly is defined as a change that diverges so significantly from the defined baseline as to raise suspicion that it was generated by a different or new mechanism [41]. Concept drift appears in non-stationary environments and is defined as the temporal evolution of the target data distribution.

FEMALE will seek single anomalies, i.e. the departure of a parameter from its baseline value. This information will be enriched by fusing single anomalies into collective and contextual anomalies, i.e. anomalous behaviour that occurs during a localized period of time as a consequence of internal volcanological changes. We will apply direct methods to identify the time periods in which these so-called context changes occur. These methods can be summarized as:

  • Supervised anomaly detection: Based on fitting and constructing predictive models. We will start with a straightforward supervised method, the Drift Detection Method (DDM), which evaluates the performance of a classifier through its accuracy over time.
  • Semi-supervised anomaly detection: In this case, the detector is trained with data that contain no anomalies. New anomalies are detected by computing statistical measures or prediction uncertainties.
  • Unsupervised anomaly detection: An unsupervised approach scores the data based solely on intrinsic properties of the dataset. Typically, distances are used to separate normal behaviour from outliers. Recent approaches include the sum of distances generated with the k-nearest neighbour (k-NN) rule [25].
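As an illustration of the unsupervised option, the sketch below scores each observation by the sum of its distances to its k nearest neighbours, so that isolated points receive large scores. It is a minimal brute-force version, assuming Euclidean distance and a small dataset; the function name `knn_scores` and the toy data are hypothetical.

```python
import math

def knn_scores(points, k):
    """Score each point by the sum of distances to its k nearest neighbours."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]))
    return scores

# Four clustered observations and one isolated observation (index 4).
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_scores(points, k=2)
most_anomalous = scores.index(max(scores))
```

A threshold on `scores`, set statistically or by expert opinion, would then separate normal behaviour from outliers.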

When any of the methods described above finds abnormal values exceeding its threshold, the corresponding time is marked. The threshold can be set statistically or through expert opinion, and it can be specific to each volcano. We will create a binary vector (“change vector”) of dimension D, assigning a value of 1 to those parameters that change and 0 to those that do not. In addition, we will record the time at which each parameter starts to change for each Case (t_start).
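One way to build the change vector and the start times is sketched below. It assumes a purely statistical threshold (a deviation from the baseline larger than `n_sigma` standard deviations); the function name, the `n_sigma` parameter, and the toy values are hypothetical, and any of the detection methods listed above could replace this test.

```python
def change_vector(times, window, baseline, spread, n_sigma=3.0):
    """Flag parameters that leave the baseline band during an unrest window.

    times: sample times of the unrest window.
    window: list of rows [p_1, ..., p_D], one per sample time.
    baseline, spread: per-parameter mean and standard deviation.
    Returns (change, t_start): a binary vector and the first time of change.
    """
    dim = len(baseline)
    change = [0] * dim
    t_start = [None] * dim
    for t, row in zip(times, window):
        for d in range(dim):
            exceeds = abs(row[d] - baseline[d]) > n_sigma * spread[d]
            if change[d] == 0 and exceeds:
                change[d] = 1      # parameter d changes in this case
                t_start[d] = t     # first time the threshold is exceeded
    return change, t_start

times = [10, 11, 12]
window = [[1.1, 10.5], [1.9, 10.2], [2.5, 9.8]]
change, t_start = change_vector(times, window, baseline=[1.0, 10.0], spread=[0.2, 1.0])
```

In this toy case only the first parameter leaves its baseline band, so `change` is `[1, 0]` and `t_start` is `[11, None]`.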

The working team will be led by Prof. Benítez, in direct collaboration with Drs. Mota, Alguacil, Prudencio, Feriche and Ibáñez of the Research Team and several members of the Working Team.

WP3.3.b. Dimensional reduction.

The goal of dimensional reduction (DR) algorithms is to extract a common subset of features with useful information for classification according to stated criteria. It is realistic to expect that each case study may have different elements of the change vector with value 1, i.e. the number of parameters that evolve over time may differ from case to case. The dimensional reduction procedure will find a set of data representing the Minimum Common Number of representative parameters for the largest possible number of analysed case studies. Given a set of features, the generic procedure would consist of evaluating the performance of each possible subset and determining the most effective and common subset.
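A very simple reading of the "common subset" idea is sketched below. Rather than evaluating every possible subset, it counts in how many case studies each parameter changes and keeps the parameters that change in at least `min_cases` of them. This frequency count is a stand-in chosen for brevity, not the project's DR algorithm, and the names `common_change_vector` and `min_cases` are hypothetical.

```python
def common_change_vector(change_vectors, min_cases=None):
    """Combine per-case change vectors into one common change vector.

    change_vectors: list of binary vectors of equal dimension D, one per case.
    min_cases: minimum number of cases in which a parameter must change
               (defaults to all cases, i.e. a strict intersection).
    """
    if min_cases is None:
        min_cases = len(change_vectors)
    # Column sums give the number of cases in which each parameter changes.
    counts = [sum(column) for column in zip(*change_vectors)]
    return [1 if count >= min_cases else 0 for count in counts]

cases = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
strict = common_change_vector(cases)                 # changes in all 3 cases
relaxed = common_change_vector(cases, min_cases=2)   # changes in at least 2 cases
```

Lowering `min_cases` trades a larger set of retained parameters against how widely those parameters are shared across case studies.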

The Minimum Common Number of parameters analysis will be carried out in several steps:

  • a) for each initially selected volcano, separating the eruptive and non-eruptive processes;
  • b) for all subsequently selected volcanoes, again separating the eruptive and non-eruptive cases.

FEMALE will create a Preliminary Inventory of Case Studies, i.e. a new database of the N cases, eruptive and non-eruptive, from the initially selected volcanoes, composed of a set of N matrices. Each matrix will be the product of a matrix extracted from the parametric database (covering the time of unrest) and its corresponding change vector or the common change vector. Each new matrix will have the same dimension as the parametric database, with a value of 0 in each column in which no changes were determined and the actual values in the remaining columns. Maintaining the same matrix dimension allows easy access to parameters for further computation, indexing of parameters, or the addition of new parameters in the future. The informative baseline vector, change vector, and common change vector will complete the Inventory.
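The construction of one Inventory matrix can be illustrated as a column-wise mask: every column whose change-vector entry is 0 is zeroed, and the matrix keeps its original shape. The sketch assumes each unrest matrix is stored as a list of rows with D columns; the function name `mask_case` and the toy values are hypothetical.

```python
def mask_case(unrest_matrix, change):
    """Zero the columns of an unrest matrix whose change-vector entry is 0."""
    return [
        [value if flag else 0.0 for value, flag in zip(row, change)]
        for row in unrest_matrix
    ]

# Two time samples, D = 3 parameters; the second parameter did not change.
unrest_matrix = [
    [5.0, 30.0, 0.3],
    [6.0, 35.0, 0.4],
]
inventory_matrix = mask_case(unrest_matrix, change=[1, 0, 1])
```

Because the shape is preserved, column d of every Inventory matrix still refers to the same parameter as column d of the parametric database.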

The working team will be led by Prof. Ibáñez, in direct collaboration with Drs. Mota, Alguacil, Prudencio, Feriche and Benítez of the Research Team and several members of the Working Team.
