SDM : Scientific Metadata Management of Construction materials

Technical Committee SDM

General Information

Chair: Dr. Tanja MANNINGER
Deputy Chair: Dr. Fabien GEORGET
Activity starting in: 2024
Cluster D

Subject matter

To simplify adherence to the data management requirements of research funding calls, and to enable researchers and industry specializing in modelling and simulation to reuse data from existing projects, a common data management and metadata strategy is needed. Industry and researchers rely more and more on materials models, AI-related approaches and intelligent tools. This reliance is sensible, as it reduces the workload and costs of creating the reliable infrastructure and sustainable construction approaches that are critical for a functioning society. More advanced simulation tools are steadily being introduced; nevertheless, all these applications rely heavily on copious amounts of high-quality real-world data. At the moment, reusing existing data is an elaborate process that often requires the original creator of the data to take part in reading the data into the application. RILEM TCs usually measure and publish high-quality data. It is essential to make these data reusable by adding metadata, which creates additional value for the scientific community. Open data sharing and adherence to the FAIR principles (Findability, Accessibility, Interoperability, and Reusability of data) are still in their infancy in the concrete research community. To ensure that they become widespread practice for the benefit of the scientific community, and by extension of society as a whole, actions at various levels are required. This TC aims to take the first step forward and provide a unified framework for further developments.

The primary objective of this new TC is to provide a guideline for all RILEM TCs and associated researchers for data handling and metadata creation. Resulting metadata files should contain enough information about separately stored data to enable other researchers to reuse said data. The following topics will be in the scope of the proposed TC:

• Metadata collection and storage

o A main deliverable is the definition of a metadata scheme adapted to the materials covered by RILEM.
o A metadata collection tool adhering to this scheme will also be part of the deliverables.
o A guide to a selection of data hosting platforms will be added (target: help new users store the actual data).
o A file format will be chosen, which will also specify how the collected metadata are structured.

• Adherence to the FAIR principles with regard to data ownership and security concerns

o Handling of data ownership and access control
o A short practical guideline on platforms that enable data storage
o Unique linkage of data to its storage location

• How to format actual data to facilitate machine readability

o Discussion of convenient data formats for storage
o Discussion of convenient structuring of data in aforementioned formats
o How this relates back to the metadata collection

• Make metadata searchable, outlook and additional goals

o The metadata files for all performed tests should be collected for each RILEM TC
o Metadata sheets should be part of additional data for publications
o It should also be possible to collect basic metadata for databases and to create a metadata output file that defines the data found in the database (column order, spacers, delimiters, etc.)
o Explore options for a searchable database in which the collected metadata files are stored

• Dissemination of our results to the RILEM community

o Regular communication with interested committees and coordination of efforts if feasible
o Regular communication with interested parties (e.g. data stewards)
o Organization of workshops and online teaching material
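
As a rough illustration of the metadata topics listed above, the following Python sketch shows what a metadata record for a separately stored data file could look like. The TC has not yet defined the metadata scheme or the file format, so every field name, the choice of JSON as output, and the placeholder storage link are assumptions made only for this example.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class ColumnInfo:
    name: str  # column header as it appears in the data file (hypothetical)
    unit: str  # physical unit of the values in that column


@dataclass
class DatasetMetadata:
    title: str
    creator: str
    material: str
    data_location: str  # link or identifier pointing to the separately stored data
    delimiter: str      # character separating values in the data file
    columns: list = field(default_factory=list)  # ColumnInfo entries in file order

    def to_json(self) -> str:
        # asdict() also converts the nested ColumnInfo entries to plain dicts
        return json.dumps(asdict(self), indent=2)


# All values below are invented placeholders, not real project data.
record = DatasetMetadata(
    title="Compressive strength, hypothetical test series",
    creator="Example Lab",
    material="concrete",
    data_location="https://example.org/placeholder-dataset",
    delimiter=";",
    columns=[ColumnInfo("age_days", "d"), ColumnInfo("strength", "MPa")],
)

print(record.to_json())
```

Keeping the column order and delimiter in the record is what would let another researcher read the data file without asking its creator; the actual set of required fields is for the TC to decide.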

Terms of reference

The TC is anticipated to start at the RILEM Spring Convention in 2024 and is intended to run for five full years.

  • Membership: potential members are located in Europe (The Netherlands, Belgium, France, Germany, Finland, Austria...), North America (USA, Canada, …) and Asia (China), based on the composition of the Data management pre-Group that has existed since 2022. The group will be mainly academic but will also include governmental institutions (research centers etc.). Industry (concrete producers, materials suppliers, monitoring and automation suppliers etc.) might become more involved in the later stage of the TC, once the development of the tool has progressed sufficiently and the concept has been tested.
  • An objective of the TC is to develop one or more tools for metadata collection. The first version will be a sophisticated Excel document that lets the user work on the metadata collection offline, and therefore without problems caused by website access. It will also allow the user to export a metadata document in a machine-readable format without requiring any prior knowledge. Online tools, or guidelines for developing online tools, will be created depending on the interest in the TC and the progress of applications for funding.
  • Experimental work will be implemented in the form of case studies utilizing the developed tool to support metadata collection in ongoing experiments and presenting the work process.
  • Digitalization of the concrete industry is constantly progressing. The findability, accessibility and reusability of data, especially for simulation and modelling, is currently insufficient. By showing a working approach and promoting our tool, progress in this area of digitalization can be accelerated. This will drive innovation in the construction industry.

Detailed working programme

  • Create a tool for metadata collection for cement- and concrete-related experiments and data. The tool must be comprehensible and easy to use. It will have an output option that gives the user a machine-readable metadata file (possibly txt, csv, etc.).
  • The metadata output file format and structuring will be decided (what is defined where, what is the structure)
  • Set up a website derived from the GitLab page with the definitions and structure of the metadata collection tool
  • Collect some use cases
  • Optional goal: an EU application for funding will be attempted. The aim will be the creation of an online metadata collection tool, which will be easier for potential users to operate. It is not intended as the main option because some colleagues face accessibility problems and internet restrictions, and others are subject to restrictions on data and metadata handling. To make the offline Excel tool more accessible as well, the creation of a user-friendly GUI for the offline tool will also be part of the funding application.
  • Optional goal: researchers who want to translate the Excel tool into their own language, so that it can be used by workers not proficient in English, will be supported. The output file format and language will not be changed but kept exactly as in the file from the English tool, thus enabling international sharing of data even where a language barrier initially stands in the way.
  • Issue a short guideline on where to store actual data, including some possibilities and a discussion of security, including personal data protection. After testing and improvement it can become a RILEM recommendation.
  • Issue a short guideline on how to structure actual data and in what format to store it, to make life easier for people working with reused data.
  • Create a state-of-the-art report (STAR) or a state-of-the-art article on the concept of metadata management for construction materials, the development and use of the tool, and the guideline on storing actual data. A fitting format will be chosen depending on the volume of information.
  • Communication with interested committees: regular updates about the ongoing work should be provided (e.g. by e-mail).
  • Coordination of efforts with other organisations can be an aim at later stages of the TC. For example, fib is currently working on several databases. Roman Wan-Wendner chairs TG 4.5 Time-dependent Behavior of Concrete and will act as a liaison to fib for now. It is sensible to keep in contact with this and other groups and committees to discuss our work and gain new ideas for improving the TC.
  • Communication with interested data stewards, departments or connected organizations: regular update e-mails or e-mail newsletters will be prepared to inform these parties about the progress of the TC. A list of data stewards and interested parties will be kept.
  • It is planned to educate the scientific community at RILEM about the use of the tool in seminars, webinars, workshops and at conferences. It will also be presented to the young RILEM community. Another goal is to bring university students into contact with the concept early in their careers, so that they can already use the tool for data management in their BA and MA theses.
  • Increase (open) data sharing among the concrete research community and enable a much bigger pool of reusable data. Propose recommendations with the goal to teach best practices for data management. Based on the STAR, the guideline or recommendations as well as the manual for the tool could also be made available online on a designated website that will be linked to the TC website.
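
The idea in the working programme that a translated tool still produces exactly the same output file as the English one can be sketched as follows. This is only an illustration under stated assumptions: the canonical keys, the German labels, the required-field check and the JSON output are all hypothetical, since the TC has not yet fixed the scheme or the file format.

```python
import json

# Hypothetical canonical (English) keys that every output file would use.
REQUIRED_KEYS = ("title", "creator", "material")

# Hypothetical label translations for a German version of the input form.
LABELS_DE = {"Titel": "title", "Ersteller": "creator", "Material": "material"}


def export_metadata(entries: dict, labels: dict) -> str:
    """Map localized form labels to canonical keys and return JSON text."""
    canonical = {labels[label]: value for label, value in entries.items()}
    # Reject records in which a required field is absent or empty.
    missing = [key for key in REQUIRED_KEYS if not canonical.get(key)]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return json.dumps(canonical, indent=2, sort_keys=True)


# Invented example input, as a user of the translated form might enter it.
form_input = {
    "Titel": "Schwindversuch",
    "Ersteller": "Beispiel-Labor",
    "Material": "Beton",
}
print(export_metadata(form_input, LABELS_DE))
```

Only the labels shown to the user are translated; the keys written to the file stay identical across language versions, which is what keeps the output shareable internationally.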

The state-of-the-art report will contain the work and project parts collected during the committee's tenure. It will be structured into a theoretical description of our work in general, the tool itself, use cases of the tool, and a manual on how to store the actual data to make it as reusable as possible. This will be done in constant consultation with the actual user base of data that is stored and equipped with metadata: the modelling and simulation community. The goal of this document is to provide practical guidelines on metadata and data management for cement and concrete applications. Based on this document, the information will be disseminated to the entire concrete research community. Another important outreach task will be taking part in the upcoming First International RILEM Symposium on Data-Driven Concrete Science, in Europe.

Technical environment

The main relevance of this TC to RILEM’s mission would be:

  • The progress of scientific knowledge, as the metadata will be gathered and linked to the actual data.
  • The metadata collection tool aims to create an output file that is machine readable. Thus, it provides an interface between human and machine.
  • The dissemination and application of this knowledge worldwide will be enabled through the organization of outreach activities/workshops and publications.
  • The project is a basis for actions to promote data sharing in construction chemistry and concrete research.

This is a new subject that has not been handled by any former or existing RILEM committees.

As stated in the proposed terms of reference, the work will be carried out in collaboration with diverse groups from various countries, including practitioners, and people working in modelling and simulation.

Expected achievements

The deliverables of the proposed TC will be the following:

  • Metadata management tool
  • Website for description of underlying ideas and structuring of the tool, and other online teaching materials (e.g. YouTube video to be posted on RILEM channel)
  • Optional: Online tool and GUI for offline Excel tool
  • Optional: Translated offline Excel tools
  • State-of-the-art report on metadata collection tool and format designed for the cement and concrete research community
  • Improved collaboration with researchers in data modelling, simulation and AI through easily reusable datasets, made possible by the additional metadata collection and demonstrated in some test cases.
  • A paper in RILEM Technical Letters at the start and at the end of the life of the TC.

Group of users

The targeted users (“data providers”) are researchers, Ph.D. students, practitioners, and industry experts in the field of construction materials and structures. A second target group (“data users”) consists of researchers in simulation, machine learning and AI training. They benefit from accessible data that can be found easily and read by machine using the information provided in the previously created metadata files.
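
From the data user's side, the benefit can be sketched in a few lines of Python: the metadata file tells the program how to read the data file, so the original creator does not need to be involved. The metadata keys (`delimiter`, `columns`, `header_rows`) and the sample values are hypothetical and serve only as an illustration.

```python
import csv
import io
import json

# Hypothetical metadata file content; the keys are illustrative only.
metadata_text = (
    '{"delimiter": ";", "columns": ["age_days", "strength_MPa"], "header_rows": 1}'
)

# Invented content of the separately stored data file.
data_text = "age;strength\n7;31.5\n28;42.0\n"


def read_with_metadata(data: str, metadata: dict) -> list:
    """Parse delimited text using the delimiter and column names from the metadata."""
    reader = csv.reader(io.StringIO(data), delimiter=metadata["delimiter"])
    # Skip the header rows declared in the metadata.
    rows = list(reader)[metadata["header_rows"]:]
    # Label each numeric value with the column name given in the metadata.
    return [dict(zip(metadata["columns"], map(float, row))) for row in rows]


records = read_with_metadata(data_text, json.loads(metadata_text))
print(records)
```

In this sketch the column names come from the metadata rather than from the data file's own header, so ambiguous headers such as "strength" are replaced by the documented names and units.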

Specific use of the results

The availability of a tool for metadata collection and file output, together with a guide on how to handle actual data, possible storage solutions and formats, will enable researchers in the construction field to meet growing demands for data and metadata management.
Open science and data curated with accompanying metadata are necessary assets to propel further research and publication, especially in the fields of machine learning, simulation and AI training. The implementation of the TC’s recommendations will boost the findability, accessibility, interoperability and, most importantly, reusability of data.