Research

The IN2P3 Computing Center has hosted a research team in Computer Science since 2008. This team was renamed CCLab in 2016 to reflect the idea of a research lab within a computing center, in which researchers in Computer Science, researchers in Physics, expert engineers of the computing center, and students can collaborate on specific studies. The expertise of the CCLab covers various topics in Computer Science, such as High Performance Computing, scheduling, data management, scientific workflows, and large-scale distributed systems.

Simulation of distributed systems and applications is at the core of the activities of the CCLab. Simulation is a fast, controlled, and reproducible way to evaluate new algorithms for distributed computing platforms under a variety of conditions. However, the realism of simulations is rarely assessed, which calls into question the applicability of a whole range of findings. By participating in the development of the SimGrid and WRENCH toolkits, the CCLab aims to bridge theory and practice via simulation-driven engineering.

The CCLab also aims to build deep partnerships with scientific teams, especially in astroparticle physics. It combines user-driven research with system software development to address the specific challenges these communities face, and uses the experience acquired to inform future research directions.

 

SimGrid: Simulation of Distributed Computer Systems

http://simgrid.org

SimGrid is a toolkit that provides core functionalities for the simulation of distributed applications in heterogeneous distributed environments. The simulation engine uses algorithmic and implementation techniques that enable the fast simulation of large systems on a single machine. The models are theoretically grounded and experimentally validated. The results are reproducible, enabling better scientific practices. Its models of networks, CPUs, and disks are adapted to (Data)Grids, P2P, Clouds, Clusters, and HPC, allowing multi-domain studies. It can be used to simulate algorithms and application prototypes, to emulate real MPI applications through the virtualization of their communications, or to formally assess algorithms and applications that can run in the framework. The formal verification module explores all possible message interleavings in the application, searching for states that violate the provided properties. We recently added the ability to assess liveness properties over arbitrary and legacy codes, thanks to a system-level introspection tool that provides a finely detailed view of the running application to the model checker. This can, for example, be leveraged to verify both safety and liveness properties on arbitrary MPI code written in C, C++, or Fortran.
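To give an intuition of what "exploring all possible message interleavings" means, here is a minimal sketch in Python. It is a toy illustration only and does not use SimGrid or its API: the process model (each process is just a list of messages it sends in order), the `explore` function, and the `init_before_data` property are all hypothetical names introduced for this example. A real model checker works on actual application states and relies on far more elaborate techniques to keep the search tractable.

```python
from typing import Callable, List, Sequence, Tuple

Trace = Tuple[str, ...]


def explore(programs: Sequence[Sequence[str]],
            is_safe: Callable[[Trace], bool]) -> List[Trace]:
    """Depth-first search over every order in which messages can be delivered.

    `programs[i]` is the ordered list of messages sent by toy process i.
    Returns the delivery traces at which the safety property first fails.
    """
    violations: List[Trace] = []
    # A state is (index of the next message of each process, delivered trace).
    initial = (tuple(0 for _ in programs), ())
    stack = [initial]
    seen = set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue  # the same state was already reached by another path
        seen.add(state)
        counters, delivered = state
        if not is_safe(delivered):
            violations.append(delivered)
            continue  # no need to extend a trace that is already unsafe
        # Any process that still has a message to send may go next.
        for pid, program in enumerate(programs):
            if counters[pid] < len(program):
                next_counters = (counters[:pid] + (counters[pid] + 1,)
                                 + counters[pid + 1:])
                stack.append((next_counters,
                              delivered + (program[counters[pid]],)))
    return violations


def init_before_data(delivered: Trace) -> bool:
    """Hypothetical safety property: 'data' never arrives before 'init'."""
    if "data" not in delivered:
        return True
    return "init" in delivered[:delivered.index("data")]


if __name__ == "__main__":
    # Two independent senders: nothing forces 'init' to arrive first.
    print(explore([["init"], ["data"]], init_before_data))
    # A single sender emits both messages in order: the property holds.
    print(explore([["init", "data"]], init_before_data))
```

In this toy setting the first call reports the trace in which `data` is delivered first, while the second call reports no violation because a single sender imposes the order. Liveness properties, which concern what must eventually happen along infinite executions, need a different kind of search and are not covered by this sketch.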

 

WRENCH: Workflow Management System Simulation Workbench

http://wrench-project.org

Capitalizing on recent advances in distributed application and platform simulation technology, WRENCH makes it possible to (1) quickly prototype workflows, Workflow Management System (WMS) implementations, and decision-making algorithms; and (2) evaluate and compare alternative options scalably and accurately for arbitrary, and often hypothetical, experimental scenarios. This project will define a generic and foundational software architecture that is informed by current state-of-the-art WMS designs and planned future designs. The implementations of the components in this architecture, taken together, form a generic “scientific instrument” that can be used by workflow users, developers, and researchers. This scientific instrument will be instantiated for several real-world WMSs and used for a range of real-world workflow applications.

 

HAC-SPECIS: High-performance Application and Computers, Studying PErformance and Correctness In Simulation

http://hacspecis.gforge.inria.fr/

The goal of the HAC SPECIS Inria Project Lab (IPL) is to answer the methodological needs of HPC application and runtime developers, and to enable the study of real HPC systems from both the correctness and performance points of view. To this end, we gather experts from the HPC, formal verification, and performance evaluation communities.

 

Past Funded Projects

  • 2017 - 2019 – PRACE-5IP: PRACE Fifth Implementation Phase project.
  • 2015 - 2018 – POP: H2020 Center of Excellence on Performance, Optimization, and Productivity.
  • 2013 - 2018 – DALHIS: Inria Associate Team on Data Analysis on Large-scale Heterogeneous Infrastructures for Science.
  • 2015 - 2017 – PRACE-4IP: PRACE Fourth Implementation Phase project.
  • 2013 - 2017 – MOEBUS: ANR on Scheduling in HPC.
  • 2012 - 2016 – SONGS: ANR on the Simulation of Next Generation Systems.