Second (Virtual) Workshop of the HPDaSc Project

10:00 -10:30 / 14:00 – 14:30

Title: Data-driven dimensionality reduction of a continuous SEIRD model for COVID-19 - Gabriel F. Barros, Malu Grave and Alvaro L. G. A. Coutinho


The COVID-19 pandemic has caused widespread damage worldwide, in terms of both human lives and international economic weakening. Mathematical epidemiology proposes models that help to understand epidemics and to outline policies for controlling infectious diseases. Disease transmission is usually described by compartmental models, in which the population under study is divided into compartments, with assumptions about the nature and time rate of transfer from one compartment to another. Those models are sets of nonlinear transient ordinary differential equations (ODEs). In this work, we use a partial differential equation (PDE) model to capture the continuous spatio-temporal dynamics of COVID-19. PDE models incorporate spatial information more naturally and allow for capturing the dynamics across all scales. They have a significant advantage over ODE models, whose ability to describe spatial information is limited by the number of geographic compartments. We study a compartmental SEIRD model (susceptible, exposed, infected, recovered, deceased) that incorporates spatial spread through diffusion terms. The model is discretized in space by the finite element method and in time by finite differences. Since fine meshes are required for accuracy, the algebraic systems that result after linearization by Newton's method have a large dimension. To reduce the computational costs, we extract the dominant low-rank structures that can approximate the dynamics of the original high-dimensional problem. For this purpose, we consider Dynamic Mode Decomposition (DMD), a data-driven dimensionality reduction method that is widely used in the context of scientific machine learning and reduced-order modeling. DMD can capture the dynamics of nonlinear systems from observable data (in this case, data generated by the high-fidelity model, or full-order model, FOM), find a basis with reduced dimensions, and eventually compute the evolution of large-dimensional dynamical systems. Our goal is to reconstruct the original FOM dataset with the given reduced basis and to predict short-time future estimates for the continuous SEIRD model, enabling many-query computations such as what-if scenarios, parametric studies, and uncertainty quantification, as well as time-critical applications such as optimal control and test support.
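
The reconstruct-then-predict idea behind DMD can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard "exact DMD" formulation on a small synthetic snapshot matrix instead of finite element SEIRD output, and the function names (dmd, dmd_evolve), the rank, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def dmd(snapshots, rank):
    """Exact DMD of a snapshot matrix (rows: state entries, columns: time steps)."""
    X, Y = snapshots[:, :-1], snapshots[:, 1:]        # pairs (x_k, x_{k+1})
    U, s, Vh = np.linalg.svd(X, full_matrices=False)
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank, :]    # keep the dominant low-rank part
    Y_V_Sinv = (Y @ Vh.conj().T) / s                  # Y V Sigma^{-1}
    A_tilde = U.conj().T @ Y_V_Sinv                   # reduced (rank x rank) operator
    eigvals, W = np.linalg.eig(A_tilde)
    modes = Y_V_Sinv @ W                              # DMD modes in the full space
    # Amplitudes: least-squares fit of the first snapshot in the mode basis
    amplitudes = np.linalg.lstsq(
        modes, snapshots[:, 0].astype(complex), rcond=None
    )[0]
    return modes, eigvals, amplitudes

def dmd_evolve(modes, eigvals, amplitudes, n_steps):
    """Evaluate x_k = modes @ (amplitudes * eigvals**k) for k = 0..n_steps-1."""
    powers = eigvals[:, None] ** np.arange(n_steps)[None, :]   # shape (rank, n_steps)
    return (modes @ (amplitudes[:, None] * powers)).real

# Toy stand-in for FOM data (an assumption): a damped rotation in a
# 2-dimensional latent space, lifted to 50 "mesh" values.
rng = np.random.default_rng(0)
spatial_modes = rng.standard_normal((50, 2))
theta = 0.2
B = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
z = np.array([1.0, 0.5])
latent = []
for _ in range(40):
    latent.append(z)
    z = B @ z
full_data = spatial_modes @ np.array(latent).T        # shape (50, 40)

# Train on the first 30 snapshots, then reconstruct them and forecast 10 more.
training = full_data[:, :30]
modes, eigvals, amplitudes = dmd(training, rank=2)
forecast = dmd_evolve(modes, eigvals, amplitudes, 40)
print("max abs error over 40 steps:", np.max(np.abs(forecast - full_data)))
```

Because the toy data are exactly linear and of rank 2, the rank-2 DMD recovers them almost to machine precision. For snapshots from a nonlinear model such as the SEIRD system, the linear DMD operator is only an approximation, which is consistent with the abstract limiting its claims to short-time predictions.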

10:30 – 11:00 / 15:50 – 16:00

Title: Online event detection - Rebecca Salles and Eduardo Ogasawara


When analyzing time series, it is possible to observe significant changes in the behavior of the observations that frequently characterize the occurrence of events. Events signal non-stationarity and may appear as anomalies, change points, or frequent patterns. Event detection is recognized as a basic function in surveillance and monitoring systems, especially for applications based on online data analysis. Online event detection presents challenges, including the need for incremental learning and the balance between plasticity and stability. The literature offers several methods for event detection. However, finding a suitable method for a given time series is not a simple task, especially considering that the nature of the events is often not known. The improper selection of a detection method can lead to failure to identify events and can affect decision making, possibly causing damage to applications. In this context, this research proposes the exploration of different detection and preprocessing methods for online event detection in time series. This includes the study of machine-learning-based methods and ensemble approaches for event detection and classification. Research results are to be encapsulated in Harbinger, a framework for the integration and analysis of event detection methods.
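
As a concrete illustration of what online detection involves, the sketch below flags anomalies in a stream using a sliding-window z-score. It is only one simple baseline chosen for illustration, not a method taken from this research and not Harbinger's API; the function name, window size, and threshold are assumptions.

```python
from collections import deque
import math

def online_zscore_events(stream, window=20, threshold=3.0):
    """Yield the index of each observation that deviates strongly from the recent window.

    Observations are processed one at a time, and only the last `window`
    values are kept, so memory use stays constant as the stream grows.
    """
    recent = deque(maxlen=window)          # sliding window of past observations
    for i, x in enumerate(stream):
        if len(recent) == window:          # wait until the window is full
            mean = sum(recent) / window
            var = sum((v - mean) ** 2 for v in recent) / window
            std = math.sqrt(var)
            if std > 0 and abs(x - mean) > threshold * std:
                yield i
        recent.append(x)                   # the window forgets the oldest value

# Hypothetical stream: a smooth periodic signal with one injected spike.
series = [math.sin(0.3 * i) for i in range(100)]
series[50] = 10.0
events = list(online_zscore_events(series))
print(events)
```

The window length is where the plasticity/stability balance shows up in this sketch: a short window adapts quickly to new behavior but is more easily disturbed by a single outlier, while a long window is more stable but slower to follow genuine changes.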

11:00 - 11:30 / 16:00 – 16:30

Title: Workflow Scheduling with Privacy restrictions - Daniel de Oliveira, Rodrigo Prado, Yuri Frota and Esther Pacitti.


Computing clouds provide an elastic and highly available environment on demand. Several scientists have already migrated experiments from their local infrastructure to the cloud (either because of the cost or the need for scalability). Such experiments can be modeled as scientific workflows, and many of them are intensive in computing and data production. This data is stored in the cloud by Workflow Management Systems (WfMS) using storage services. A concern arising from this storage in the cloud is data confidentiality, i.e., the risk of unauthorized access to files by malicious users. Knowledge about the results and structure of a workflow can be inferred through improper data access. Various mechanisms, such as data dispersion (files are distributed across many cloud storage services) and encryption, can be adopted to increase data confidentiality. However, these mechanisms cannot be adopted independently of workflow scheduling, since they can increase the workflow's execution time and financial cost. In this presentation, we show the workflow scheduling approach called SaFER (workflow Scheduling with conFidEntiality pRoblem), which considers confidentiality restrictions on the data produced and consumed while reducing the execution time and financial cost of running the workflow. Experiments performed with workflow benchmarks showed promising results.

11:30 – 12:00 / 15:30 – 16:00

Title: E2Clab: Exploring the Computing Continuum through Repeatable, Replicable and Reproducible Edge-to-Cloud Experiments, Daniel Rosendo, Alexandru Costan, Gabriel Antoniu, Patrick Valduriez


Distributed digital infrastructures for computation and analytics are now evolving towards an interconnected ecosystem allowing complex applications to be executed from IoT Edge devices to the HPC Cloud (aka the Computing Continuum, the Digital Continuum, or the Transcontinuum). Understanding end-to-end performance in such a complex continuum is challenging. This comes down to reconciling many, typically contradictory, application requirements and constraints with low-level infrastructure design choices. One important challenge is to accurately reproduce relevant behaviors of a given application workflow and representative settings of the physical infrastructure underlying this complex continuum. In this paper we introduce a rigorous methodology for such a process and validate it through E2Clab. It is the first platform to support the complete analysis cycle of an application on the Computing Continuum: (i) the configuration of the experimental environment, libraries and frameworks; (ii) the mapping between the application parts and machines on the Edge, Fog and Cloud; (iii) the deployment of the application on the infrastructure; (iv) the automated execution; and (v) the gathering of experiment metrics. We illustrate its usage with a prototype Smart Surveillance System application deployed on the Grid'5000 testbed, showing that our framework allows one to understand and improve performance by correlating it with the parameter settings, the resource usage and the specifics of the underlying infrastructure. We also analyze a real-life botanical observation platform named Pl@ntNet and demonstrate how E2Clab helps Pl@ntNet engineers understand the performance of the Pl@ntNet system in order to anticipate how the infrastructure should evolve.

12:00 - 12:30 / 16:00 – 16:30 - Closing