As indicated above, the use of ML in volcano seismology makes it possible to enlarge volcano-seismic catalogues with more accurate information. We are pioneers in this task, and our advances, as reported in the introduction section, provide reliable tools for this purpose. We have implemented systems that work in quasi real time on continuous records of volcano-seismic signals. These systems are based on HMM13,21 and RNN19 techniques, and their accuracy has exceeded 80% of correctly detected and labelled events. We have also developed systems that work with isolated events, using HMM techniques as well as Deep Learning techniques such as DNN40, CNN41 and BNN9. These systems are built in two steps: first, the beginning and the end of the event are detected; second, the event is classified. Our group has developed algorithms based on ML and signal processing to detect events embedded in continuous records8,42. The accuracy obtained with this configuration is better than that described above; nevertheless, additional software is needed to isolate each event before its classification. At the output of all the implemented systems, the labels and scores of the event are compared to produce a global score indicating the confidence that can be placed in the label. If this score does not reach a threshold, the final decision is taken by a human expert.
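
The decision logic just described can be sketched as follows. This is an illustrative sketch, not the project's actual code: the function names, the score-fusion rule and the 0.8 threshold are assumptions for illustration only.

```python
def fuse_decisions(predictions, threshold=0.8):
    """Combine the outputs of several classifiers (e.g. HMM, RNN, CNN).

    predictions: list of (label, score) pairs, one per classifier.
    Returns (label, global_score, needs_expert_review).
    """
    # Sum the score mass that each candidate label received across classifiers.
    totals = {}
    for label, score in predictions:
        totals[label] = totals.get(label, 0.0) + score
    best = max(totals, key=totals.get)
    # Global score: fraction of all score mass assigned to the winning label.
    global_score = totals[best] / sum(totals.values())
    # Below the threshold, the final decision is deferred to a human expert.
    return best, global_score, global_score < threshold
```

With unanimous high-scoring votes the event is labelled automatically; when the classifiers disagree, the global score drops below the threshold and the event is routed to an analyst.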

However, the major difficulty we face at present is providing real and fast exportability of the already labelled databases from a studied volcano to a new one, in order to reduce the time and effort spent preparing those databases. For the selected volcanoes (the three preliminary scenarios and the four new test sites) a specific classified database already exists, but with several data-quality shortcomings: potential mistakes, confused classes and non-homogeneous names (see Table 1). In the present WP5 we will tackle a double task:
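
One of the shortcomings named above, non-homogeneous class names, admits a simple first normalisation pass: mapping each observatory's local labels onto a shared vocabulary, with unmapped names flagged for review. The mappings below are invented examples for illustration, not the project's actual taxonomy.

```python
# Invented example mapping from heterogeneous catalogue labels to canonical
# class names; a real table would be built from the catalogues in Table 1.
CANONICAL = {
    "lp": "long-period", "long period": "long-period", "long-period": "long-period",
    "vt": "volcano-tectonic", "volcanotectonic": "volcano-tectonic",
    "trem": "tremor", "tremor": "tremor",
}

def normalise_label(raw):
    key = raw.strip().lower().replace("_", " ")
    # Names absent from the table are returned as "unknown" so an analyst
    # can decide whether they are a new class or a spelling variant.
    return CANONICAL.get(key, "unknown")
```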

WP5.1 Creation of a new and reliable volcano-seismic catalogue of each scenario using ML algorithms.

During recent years, and with the support of several research projects, our team has been developing a set of classifiers based on different ML techniques that provide very good results. The implemented techniques follow the supervised learning approach; that is, an appropriate database must be built to train the algorithms. To obtain this database we need enough labelled representations of the different classes of events; from these data the algorithms learn to distinguish among the different types of events.

In order to create a volcano-seismic catalogue we need to adapt our classification systems to the new scenarios provided by the Kilauea, Etna and Asama volcanoes. It will be necessary to generate a labelled database to train the classifiers properly. This is hard work that has usually been performed manually. In WP5.1 we propose a semi-supervised procedure to generate labelled events for the new volcanoes, with the goal of training the classification systems properly. This procedure implements a parallel architecture that combines our already developed systems and uses the confidence scores of all classifiers as input to a decision block. In this block the quality of the automatically generated label is either confirmed, or the event is transferred to a human expert to validate the assigned label (Figure 8).
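
The semi-supervised procedure can be outlined as below. This is a hedged sketch under assumed names: each classifier is modelled as a function returning a (label, score) pair, and the 0.9 acceptance threshold is a placeholder, not a project parameter.

```python
def build_catalogue(events, classifiers, accept_threshold=0.9):
    """Route each unlabelled event through the parallel classifier bank.

    Unanimous, high-confidence labels are accepted automatically; all other
    events are queued, together with the classifiers' votes, for a human
    expert to validate.
    """
    auto_labelled, expert_queue = [], []
    for event in events:
        votes = [clf(event) for clf in classifiers]   # (label, score) pairs
        labels = {lab for lab, _ in votes}
        mean_score = sum(s for _, s in votes) / len(votes)
        if len(labels) == 1 and mean_score >= accept_threshold:
            auto_labelled.append((event, labels.pop()))
        else:
            expert_queue.append((event, votes))       # decision block defers
    return auto_labelled, expert_queue
```

The automatically accepted labels grow the training database cheaply, while the expert only sees the ambiguous fraction of the stream.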

WP5.2. Exportability of the general seismic data classes from one scenario to another.

Given the vast amount of data, computation and time required to develop reliable systems, an emerging approach is to exploit what has been learned in one domain (where plenty of labelled training data are available) to improve generalization in another domain where data are scarce. This is known as Transfer Learning (TL)43, alluding to the fact that knowledge acquired in one domain is exported to a different one. We propose to apply the TL philosophy to solve the problem of exportability from one volcano-seismic scenario to another. In particular, we will study two approaches:

  • TL for feature extraction: An important goal is the development of robust data-pattern extraction mechanisms able to characterize each type of event properly, emphasizing the mechanism that generates the event and minimizing the effects of the environment. This should make it easier to extend a classification system trained in one scenario to another. Traditionally, feature-engineering approaches based on the knowledge of human experts were used to extract relevant and discriminative information; newer approaches based on deep hierarchical models, however, can learn representative features from raw data, and have become the state of the art in many disciplines, improving on the traditional ones. We propose to apply TL so that feature representations generated by models trained on a different specific problem serve as efficient information for building a system for automatic classification of volcano-seismic events. We propose to retrain the model while keeping the spatial and spectral information extracted by the pre-trained model in a different domain as the input information.
  • TL and uncertainty: From a machine learning perspective, changes within the seismic environment produce probability distributions that differ greatly from the original training data. The insufficient amount of new labelled data, along with the time needed to analyse and retrain the systems, are the main factors limiting exportability across volcanoes. Transfer Learning could alleviate the data-scarcity problem and shorten the time needed to react to these changes. Additionally, Bayesian Deep Learning provides fast uncertainty quantification at the seismic-waveform level9. We propose to link TL and epistemic uncertainty to reduce the generalization-error gap between the training data distribution and data from the new scenarios. This would yield more robust classification systems, able to quantify uncertainty, to detect statistically meaningful changes, and to help analysts build large-scale, high-quality annotated datasets.
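
The feature-extraction variant of TL can be illustrated with a deliberately tiny sketch: a "pre-trained" feature extractor is kept frozen, and only a new classification head is retrained on the scarce labelled data of the new volcano. The extractor, head and toy data below are all invented stand-ins, not the project's networks.

```python
import math

def pretrained_features(waveform):
    # Stand-in for the frozen layers of a network trained on another volcano:
    # two fixed summary statistics of the waveform plus a bias term.
    # "Frozen" means these computations are never updated during retraining.
    energy = sum(x * x for x in waveform) / len(waveform)
    mean_abs = sum(abs(x) for x in waveform) / len(waveform)
    return [energy, mean_abs, 1.0]

def train_head(waveforms, labels, lr=0.5, epochs=300):
    # Only this small logistic-regression "head" is fitted on the new
    # scenario's few labelled events; the feature extractor stays fixed.
    w = [0.0, 0.0, 0.0]
    feats = [pretrained_features(s) for s in waveforms]
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = 1.0 / (1.0 + math.exp(-sum(wi * fi for wi, fi in zip(w, f))))
            w = [wi + lr * (y - p) * fi for wi, fi in zip(w, f)]
    return w

def classify(w, waveform):
    z = sum(wi * fi for wi, fi in zip(w, pretrained_features(waveform)))
    return 1 if z > 0 else 0
```

In a deep-learning setting the same idea corresponds to freezing the convolutional layers of a network trained on the source volcano and retraining only its final classification layers on the target volcano.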
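
The uncertainty-based screening can likewise be sketched in miniature. In the spirit of Monte-Carlo sampling from a Bayesian network, a stochastic model is evaluated several times per event and the spread of the sampled scores serves as an epistemic-uncertainty proxy; high-spread events, which likely come from an unfamiliar distribution, are flagged for the analyst. All names and thresholds below are illustrative assumptions.

```python
import statistics

def mc_predict(stochastic_model, event, n_samples=50):
    # Repeated stochastic forward passes (as in Monte-Carlo dropout);
    # the standard deviation of the scores proxies epistemic uncertainty.
    scores = [stochastic_model(event) for _ in range(n_samples)]
    return statistics.mean(scores), statistics.stdev(scores)

def screen(events, stochastic_model, max_std=0.1):
    # Low-spread events are accepted automatically; high-spread events go
    # to an analyst, helping build a high-quality annotated dataset.
    confident, review = [], []
    for ev in events:
        mean, std = mc_predict(stochastic_model, ev)
        (confident if std <= max_std else review).append((ev, mean, std))
    return confident, review
```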

The working team will be led by Prof. Benítez, in direct collaboration with Drs. Mota, Alguacil, Prudencio, Feriche and Ibáñez of the Research Team and several members of the Working Team.
