FAIR Data Infrastructure for Physics, Chemistry, Materials Science, and Astronomy e.V.

Pillar C: Soft-matter and biomolecular simulations

Spokesperson: Kurt Kremer (Max Planck Institute for Polymer Research, Mainz)
Deputy: Carsten Baldauf (Fritz Haber Institute of the Max Planck Society, Berlin)


The field of biophysical and soft-matter simulations covers a wide range of methodologies, from all-electron simulations through atomistic resolution and coarse graining to finite-element methods, as well as hybrids thereof. It touches related fields such as computational and theoretical chemistry, materials science, and condensed-matter physics. Even within molecular-mechanics-based atomistic simulations alone, a multitude of computer codes are in use, employing many different force fields and parametrizations.


To develop an infrastructure for the upload, storage, and sharing of input and output files of diverse simulation types;
To raise awareness in the community of the importance of sharing simulation outputs publicly and accessibly, in particular to allow data to be stored according to the rules of good scientific practice, but also to allow data sharing to advance science and to increase the reach and visibility of one’s own research work.


At first, Pillar C will have to establish a categorization of simulations and develop a data infrastructure that allows for upload, processing, categorization, normalization, storage, and sharing, following the example of the NOMAD Repository and Archive. We plan to extend the infrastructure developed in NOMAD towards trajectories of calculations (e.g. molecular dynamics) and force fields (including storage of run parameters etc.). Molecular dynamics simulations in particular cover a large range of multiscale biophysics and soft-matter simulations and are thus an essential first step. Parsing input and output files from the main existing MD codes, e.g. GROMACS or LAMMPS, will come first. It is crucial to obtain all parameters relevant for propagating the simulation, including the force field used. This will require some flexibility, given the tendency to use custom force fields, especially in the soft-matter community. The parsed output should contain enough information to perform any type of commonly applied analysis. This includes, but is not limited to:

In any case, the original uploaded files will be stored alongside the normalized (extracted) data. Besides these first fundamental steps, future goals include:

Making the data usable and accessible for analyses based on artificial intelligence methods.
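
As an illustration of the parsing and normalization step described above, the following minimal Python sketch reads run parameters from text in the GROMACS .mdp style (lines of the form "key = value", with ";" starting a comment) and keeps the original text next to the extracted values. The names ParsedRunParameters and parse_mdp are hypothetical and are not part of NOMAD or GROMACS; the key normalization (lowercase, underscores mapped to dashes) is an assumption made for this sketch only.

```python
from dataclasses import dataclass, field


@dataclass
class ParsedRunParameters:
    """Hypothetical container: original upload plus extracted values."""
    raw_text: str                                   # original file content, unchanged
    normalized: dict = field(default_factory=dict)  # extracted key/value pairs


def parse_mdp(text: str) -> ParsedRunParameters:
    """Extract 'key = value' pairs from .mdp-style text (illustrative only)."""
    params = {}
    for line in text.splitlines():
        # Drop everything after ';' (comment) and surrounding whitespace.
        line = line.split(";", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Assumed normalization for this sketch: lowercase keys, '_' -> '-'.
        params[key.strip().lower().replace("_", "-")] = value.strip()
    return ParsedRunParameters(raw_text=text, normalized=params)


example = """
; run control
integrator = md
dt         = 0.002   ; ps
nsteps     = 500000
tcoupl     = v-rescale
ref_t      = 300
"""

result = parse_mdp(example)
print(result.normalized)
```

Values are kept as strings here; a real parser would additionally convert units and types and would need code-specific handling for LAMMPS input scripts and for custom force-field files.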