PhD Thesis: Francisco Borges: September 30, 12:00 hrs. 2016. (Assistant Professor at IFBA Instituto Federal de Educação, Ciência e Tecnologia da Bahia, Campus Santo Amaro. Bahia. Brazil)

Title: Care HPS: A High Performance Simulation Methodology for Complex Agent-Based Models.

TDX Source:


This thesis introduces a methodology for research on HPC for complex agent-based models that demand high-performance solutions. This methodology, named Care High Performance Simulation (HPS), enables researchers to: 1) develop techniques and solutions for high-performance parallel and distributed simulation of agent-based models; and 2) study, design and implement complex agent-based models that require high-performance computing solutions. The methodology was designed to make it easy and quick to develop new ABMs, as well as to extend and implement new solutions for the main issues of parallel and distributed simulation, such as synchronization, communication, load and computing balancing, and partitioning algorithms, so that they can be tested and analyzed. In addition, some agent-based models and HPC approaches and techniques are developed which can be used by researchers in HPC for ABMs that require high-performance solutions.

A set of experiments is included with the aim of showing the completeness and functionality of this methodology and of evaluating how its results can be useful. These experiments focus on: 1) presenting the results of the proposed HPC techniques and approaches used in Care HPS; 2) showing that the features of Care HPS reach the proposed aims; and 3) presenting the scalability results of Care HPS. As a result, we show that Care HPS can be used as a scientific instrument for advancing the field of agent-based parallel and distributed simulation.


PhD Thesis: Albert Gutiérrez Millà: July 22, 10:00 hrs. 2016. (Researcher at Barcelona Supercomputing Center. CASE - Fusion Dpt.- Barcelona-Spain)

Title: Crowd Modeling and Simulation on High Performance Architectures.

TDX Source:


Management of security at major events has become crucial in an increasingly populated world. Disasters at crowd events have increased over the last hundred years, and the safety management of attendees has therefore become a key issue. To understand and assess the risks involved in these situations, models and simulators are necessary that allow us to understand the situation and make decisions accordingly.

However, crowd simulation has high computational requirements when we consider thousands of people. Moreover, the same initial situation can produce different results depending on the non-deterministic behavior of the population; for this reason we also need a significant number of statistically reliable simulations. In this thesis we have proposed crowd models and focused on providing a Decision Support System (DSS). The proposed models can reproduce the complexity of agents, psychological factors, the intelligence to find the exit, avoid obstacles or move through the crowd, and recreate internal events of the crowd in cases of high pressure or density.

To model these aspects we use agent-based models and numerical methods. To focus on the applicability of the model, we have developed a workflow that allows the DSS to run in the Cloud, hiding the complexity of the underlying systems from the experts, who only need to provide the configuration. Finally, to test the operation and validate the simulator, we used real and synthetic scenarios to evaluate the performance of the models.


PhD Thesis: Liu Zhengchun: July 22, 12:00 hrs. 2016. (Researcher Argonne National Laboratory. MSC Dpt. USA)

Title: Modeling & Simulation for Healthcare Operations Management Using High Performance Computing & Agent-Based Models.

TDX Source:


Hospital-based emergency departments (EDs) are highly integrated service units that primarily handle the needs of patients arriving without prior appointment and with uncertain conditions. In this context, the analysis and management of patient flows play a key role in developing policies and decision tools for overall performance improvement of the system. However, patient flows in EDs are considered very complex because of the different pathways patients may take and the inherent uncertainty and variability of healthcare processes. Due to the complexity and crucial role of an ED in the healthcare system, the ability to accurately represent, simulate and predict the performance of an ED is invaluable for decision makers seeking to solve operations management problems. One way to meet this requirement is through modeling and simulation.

Armed with the ability to execute a compute-intensive model and analyze huge datasets, the overall goal of this study is to develop tools to better understand the complexity of (explain), evaluate policies for (predict) and improve the efficiency of (optimize) ED units. The two main contributions of this thesis are: (1) an agent-based model for quantitatively predicting and analyzing the complex behavior of emergency departments; and (2) a simulation- and optimization-based methodology for calibrating model parameters under data scarcity.
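The calibration idea of contribution (2) can be illustrated with a minimal sketch: a toy stand-in for the simulator (here a single M/M/1 queue formula, not the thesis's agent-based ED model) and a search over candidate parameter values that minimizes the gap between simulated output and a scarce observation. All names, rates and numbers below are illustrative assumptions.

```python
def simulate_ed(arrival_rate, service_rate):
    """Toy stand-in for the ED simulator: mean time in system for an
    M/M/1 queue. The real model in the thesis is agent-based."""
    if arrival_rate >= service_rate:
        return float("inf")  # unstable queue
    return 1.0 / (service_rate - arrival_rate)

def calibrate(observed_wait, candidates, arrival_rate=0.8):
    """Simulation-and-optimization calibration: pick the service rate
    whose simulated output best matches the (scarce) observation."""
    return min(candidates,
               key=lambda mu: abs(simulate_ed(arrival_rate, mu) - observed_wait))

candidates = [0.9 + 0.05 * i for i in range(10)]  # 0.90 .. 1.35
best = calibrate(observed_wait=4.0, candidates=candidates)
print(round(best, 2))  # 1.05: simulated wait 1/(1.05-0.8) = 4.0
```

In the thesis the inner loop is a full agent-based simulation, so each candidate evaluation is expensive; that is precisely why an HPC-backed optimization layer is needed.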

Starting from simulating emergency departments, our efforts demonstrated the feasibility and suitability of using agent-based modeling & simulation techniques to study the healthcare system.


PhD Thesis: Javier Panadero Martínez: September 28, 12:00 hrs. 2015. (Researcher at Internet Interdisciplinary Institute (IN3) - Universitat Oberta de Catalunya. Barcelona-Spain) 

Title: Performance Prediction: Analysis of the Scalability of Parallel Applications.

TDX Source:


Executing message-passing applications using a large number of resources is not a trivial task. Due to the complex interaction between message-passing applications and the HPC system, many applications may, depending on the system, suffer performance inefficiencies when they scale to a large number of processes. This problem is particularly serious when the application is executed many times over a long period of time.

With the purpose of avoiding these problems and making efficient use of the system, as the main contribution of this thesis we propose the P3S methodology (Prediction of Parallel Program Scalability), which allows us to analyze and predict the strong-scalability behavior of message-passing applications on a given system.

The methodology strives to use a bounded analysis time and a reduced set of resources to predict the application's performance. The P3S methodology is based on analyzing the repetitive behavior of parallel message-passing applications. Such applications are composed of a set of phases, which are repeated throughout the whole application, independently of the number of application processes.
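The repetitive-phase idea can be sketched as follows: given a linear trace of communication events from one process, group it into fixed-size windows and count how often each distinct window repeats. This is only an illustrative simplification — P3S's actual phase detection is far more elaborate — and `find_phases`, the window size and the toy trace are assumptions.

```python
from collections import Counter

def find_phases(trace, window=3):
    """Split a linear event trace into consecutive fixed-size windows
    ("phases") and count how many times each distinct one repeats."""
    windows = [tuple(trace[i:i + window])
               for i in range(0, len(trace) - window + 1, window)]
    return Counter(windows)

# Toy trace of one process: two phases (compute+send+recv and
# compute+bcast+recv) repeated over four iterations.
trace = ["comp", "send", "recv", "comp", "bcast", "recv"] * 4
phases = find_phases(trace, window=3)
for phase, count in phases.items():
    print(phase, "x", count)
```

Because only a handful of distinct phases occur, it suffices to characterize each phase once on a small machine and extrapolate, which is the intuition behind predicting scalability with bounded analysis time.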


PhD Thesis: Adriana Gaudiani. Date: September 11, 12:00 hrs. 2015. (Associate Researcher at Science Institute. Universidad Nacional de General Sarmiento, Buenos Aires, Argentina)

Title: Simulation and Optimization as a Methodology to Improve Prediction Quality in a Hydrographic Simulation Environment.

External source:


This dissertation deals with the role of computation in HPC in improving the quality of simulation results, where computation is used to provide the best possible values for the simulation model parameters.
Flooding is one of the most common natural hazards faced by human society. Modelling and computational simulation provide powerful tools that enable flood event forecasting. Nevertheless, a series of limitations causes a lack of accuracy in forecasting, such as uncertainty in the values of the input parameters to the flood model.

In order to predict flood behaviour, we have developed a methodology focused on enhancing a flood simulator, EZEIZA V (developed by the National Institute of Water (INA), Argentina), to minimize the difference between simulated and observed results by adjusting the input parameters, using a two-phase optimization-via-simulation methodology.

In order to find the “optimum” set of input parameters, we reduced the search space using a “Monte Carlo + K-Means clustering” method. As a result, we achieved improvements of up to 35%, which, for example, represents a significant difference of 0.5 to 1 meters of water level along the whole Paraná River basin.
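As an illustration of the two-stage reduction, the sketch below draws Monte Carlo samples of a single hypothetical parameter and then keeps only a few K-Means cluster centers as representative candidates to actually simulate. The parameter range, sample count and the plain 1-D k-means are all assumptions, not the thesis's implementation.

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means on 1-D samples; returns the k cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Monte Carlo stage: sample candidate values of a hypothetical model
# parameter instead of sweeping its whole range exhaustively.
rng = random.Random(42)
samples = [rng.uniform(0.01, 0.1) for _ in range(500)]

# Clustering stage: keep only k representative candidates to simulate.
representatives = kmeans(samples, k=5)
print(representatives)  # 5 centers spread across [0.01, 0.1]
```

Running the expensive flood simulator only on the cluster centers, rather than on all 500 samples, is what makes the search tractable.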


PhD Thesis: Hai Nguyen Hoang, Date: July 18, 12:00 hrs. 2014. (Lecturer in Information Technology Faculty. Danang University of Education. Danang University. Vietnam)

Title: A Dynamic Link Speed Mechanism for Energy Saving in Interconnection Networks.

TDX Source:


The growing processing power of parallel computing systems requires interconnection networks of higher complexity and higher performance, and thus they consume more energy. A larger amount of energy consumed leads to many problems related to cost, cooling infrastructure and system stability. Link components contribute a substantial proportion of the total energy consumption of these networks.

Several proposals have been approaching a better link power management. In this thesis, we leverage built-in features of current link technology to dynamically adjust the link speed as a function of traffic. By doing this, the interconnection network consumes less energy when traffic is light. We also propose a link speed aware routing policy that favors high-speed links in the process of routing packets to boost the performance of the network when the energy saving mechanism is deployed.
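A minimal sketch of the idea — not the actual mechanism — might map observed link utilization to one of a few discrete speed levels, and have routing prefer the fastest link among equivalent candidates. The speed levels, thresholds and data layout below are invented for illustration.

```python
# Hypothetical discrete link speed levels (Gb/s) and traffic thresholds.
SPEEDS = [2.5, 5.0, 10.0]

def select_speed(utilization):
    """Pick the lowest link speed whose threshold covers the traffic,
    so lightly loaded links run slower and consume less energy."""
    if utilization < 0.3:
        return SPEEDS[0]
    if utilization < 0.7:
        return SPEEDS[1]
    return SPEEDS[2]

def speed_aware_route(candidate_links):
    """Among minimal-path candidates, favor the fastest current link
    to offset the latency cost of slowed-down links."""
    return max(candidate_links, key=lambda link: link["speed"])

links = [{"id": "A", "speed": select_speed(0.1)},   # idle link -> 2.5
         {"id": "B", "speed": select_speed(0.8)}]   # busy link -> 10.0
print(speed_aware_route(links)["id"])  # "B"
```

The trade-off the abstract describes falls out directly: lowering idle links' speeds saves energy, while the speed-aware routing policy steers packets toward full-speed links to limit the latency penalty.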

The evaluation results show that networks deploying our energy saving mechanism reduce energy consumption at the expense of an increase in average packet latency. However, with the proposed link-speed-aware routing policy, our mechanism incurs a smaller increase in average packet latency while achieving similar energy savings, compared with other conventional approaches in the literature.


PhD Thesis: João Artur Dias Lima Gramacho, Date: July 17, 12:00 hrs. 2014. (Software Analyst & Developer at Oracle MySQL Replication Team. Lisbon Area, Portugal).

Title: ARTFUL: Deterministically Assessing the Robustness of Programs against Transient Faults.

TDX Source:


Computer chips are evolving to obtain more performance, using more transistors and becoming denser and more complex. One side effect of this scenario is that processors are becoming less robust than ever against transient faults. As on-chip solutions are expensive or tend to degrade processor performance, efforts to deal with these transient faults at higher levels, such as the operating system or even the programs themselves, are increasing. Software-based fault tolerance approaches for dealing with transient faults often use fault injection experiments to evaluate the behavior of programs with and without their fault detection proposals.

Using fault injection experiments to evaluate program behavior in the presence of transient faults requires running the program under evaluation and injecting a fault (usually by flipping a single bit in a processor register) a sufficient number of times, always observing the program's behavior and whether it ended up presenting the expected result. One problem with this strategy is that the fault injection space is proportional to the number of instructions executed multiplied by the number of bits in the processor architecture's register file.

Instead of being exhaustive (which would be unfeasible), this approach consumes a great deal of CPU time by running or simulating the program being evaluated as many times as necessary to obtain a reasonably valid statistical approximation (usually a few thousand times). So, the time required to evaluate how a single program would behave in the presence of transient faults might be proportional to the time needed to run the program five thousand times.
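The size of the fault injection space, and why statistical sampling still requires a few thousand runs, can be worked through with illustrative numbers. The instruction count, register file size and the standard sample-size formula for estimating a proportion are assumptions for the example, not figures from the thesis.

```python
import math

def injection_space(instructions, registers, bits_per_register):
    """Single-bit-flip fault sites: one per executed instruction,
    per register, per bit position."""
    return instructions * registers * bits_per_register

def sample_size(space, error=0.01):
    """Classic sample size for estimating a proportion at 95%
    confidence with the given margin of error, with finite-population
    correction for the injection space."""
    z = 1.96                       # 95% confidence
    n0 = (z ** 2) * 0.25 / (error ** 2)
    return math.ceil(n0 / (1 + (n0 - 1) / space))

# A modest program: 1M dynamic instructions, 16 registers of 64 bits.
space = injection_space(instructions=10**6, registers=16,
                        bits_per_register=64)
print(space)               # 1024000000 candidate fault sites
print(sample_size(space))  # 9604: a few thousand injections suffice
```

Even for this small example the exhaustive space has over a billion fault sites, while a statistically valid sample needs only thousands of runs — which is still the cost that ARTFUL's deterministic calculation aims to eliminate.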

In this work we present the concept of a program's robustness against transient faults and also present a methodology named ARTFUL (from Assessing the Robustness against Transient Faults) designed to deterministically calculate this robustness, instead of using executions with fault injections, based on the program's execution trace on a given processor architecture and on information about that architecture.


PhD Thesis: Hugo Meyer, Date: July 16, 12:00 hrs. 2014. (Researcher. Barcelona Supercomputing Center. Nexus II Building. Barcelona. Spain)

Title: Fault Tolerance in Multicore Clusters. Techniques to Balance Performance and Dependability. (Outstanding dissertation award)

TDX Source:


In High Performance Computing (HPC) the demand for more performance is satisfied by increasing the number of components. With the growing scale of HPC applications has come an increase in the number of interruptions as a consequence of hardware failures. The remarkable decrease of the Mean Time Between Failures (MTBF) in current systems encourages research into suitable fault tolerance solutions that make it possible to guarantee the successful completion of parallel applications. Therefore, fault tolerance (FT) mechanisms are a valuable feature for providing high availability in HPC clusters.

A widely used strategy to provide FT support is rollback-recovery, which consists of saving the application state periodically. In the presence of failures, applications resume their execution from the most recent saved state. These FT protocols usually introduce overheads during the application's execution. Uncoordinated checkpointing allows processes to save their states independently. By combining this protocol with message logging, problems such as the domino effect and orphan messages are avoided. Message logging techniques are usually responsible for most of the overhead during failure-free executions in uncoordinated solutions. Considering this fact, in this thesis we focus on lowering the impact of message logging techniques, taking into account both failure-free executions and recovery times.

A contribution of this thesis is the Hybrid Message Pessimistic Logging protocol (HMPL). It combines the fast-recovery feature of pessimistic receiver-based message logging with the low overhead introduced by pessimistic sender-based message logging during failure-free executions. The HMPL aims to reduce the overhead introduced by pessimistic receiver-based approaches by allowing applications to continue normally before a received message is properly saved. To guarantee that no message is lost, pessimistic sender-based logging is used to temporarily save messages while the receiver fully saves its received messages.
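A highly simplified sketch of the hybrid idea follows, with synchronous Python objects standing in for distributed processes; all class and method names are invented for illustration, and the real protocol's asynchrony and failure handling are omitted.

```python
class Sender:
    """Sender-based side: keep a temporary copy of each message until
    the receiver confirms it is safely in the receiver's log."""
    def __init__(self):
        self.temp_log = {}

    def send(self, msg_id, payload, receiver):
        self.temp_log[msg_id] = payload        # temporary safety copy
        receiver.deliver(msg_id, payload, self)

    def ack_logged(self, msg_id):
        del self.temp_log[msg_id]              # receiver log is durable

class Receiver:
    """Receiver-based side: the application continues immediately;
    durably logging the message, then acknowledging, happens later
    (here modeled synchronously for simplicity)."""
    def __init__(self):
        self.stable_log = {}

    def deliver(self, msg_id, payload, sender):
        # ... application work proceeds without waiting (low overhead)
        self.stable_log[msg_id] = payload      # durable save completes
        sender.ack_logged(msg_id)              # release the sender copy

s, r = Sender(), Receiver()
s.send(1, "state-update", r)
print(s.temp_log, r.stable_log)  # {} {1: 'state-update'}
```

The key property sketched here: between send and acknowledgment at least one copy of the message always exists (in `temp_log` or `stable_log`), so no message is lost, yet the receiver never blocks the application on the save.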


PhD Thesis: Cristian Tissera, Date: April 11, 8:30 hrs. 2014. (Assistant professor UNSL, Argentina)

Title: A Model Based on Extended Cellular Automata for Designing Evacuation Strategies in Emergency Situations.

To be published at: UNSL-Doctorado en Ciencia de la Computación


Shopping centers, schools and clubs are examples of enclosed spaces frequently used by large numbers of people in their daily activities. The architects of this type of building usually try to maximize the productivity of the available space, but it is also necessary to consider proper planning to ensure people's safety when an emergency evacuation occurs. If the evacuation paths are not properly resolved, any incident can seriously compromise the safety of persons.

For this reason it is necessary to study pedestrian dynamics in situations involving hundreds of people; the simulation of emergency evacuations allows researchers to theorize about the behavior of individuals, with the goal of developing preventive measures for possible emergency scenarios.

In this thesis the development of a new simulation model for the study of emergency situations is presented. The proposed model has a hybrid structure, where the dynamics of fire & smoke spread are modeled by cellular automata and the individual behavior is performed using intelligent agents. This hybrid model, together with the proposed computational method, allows the construction of artificial environments inhabited by autonomous agents; the thesis shows the model formulation and interactions, their verification, and the experimental framework generated as a proof of concept to validate the proposed ideas.
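The cellular-automaton side of such a hybrid model can be illustrated with a minimal fire-spread update on a character grid. This is an invented toy, not the thesis's model: real models add probabilistic ignition, smoke, and coupling with the agents occupying cells.

```python
def spread_fire(grid):
    """One cellular-automaton step: each burning cell ('F') ignites
    the four orthogonal neighbors that still contain fuel ('.')."""
    rows, cols = len(grid), len(grid[0])
    nxt = [row[:] for row in grid]           # next state, computed
    for r in range(rows):                    # from the current one
        for c in range(cols):
            if grid[r][c] == "F":
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols \
                            and grid[nr][nc] == ".":
                        nxt[nr][nc] = "F"
    return nxt

room = [list("...."),
        list(".F.."),
        list("....")]
step1 = spread_fire(room)
for row in step1:
    print("".join(row))  # the fire has reached all four neighbors
```

In the hybrid scheme, agents read this grid each tick (to flee burning cells) while the automaton evolves independently, which is what keeps the two formalisms cleanly separated.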


PhD Thesis supervised by members of the group:

