We want to develop a checklist to ensure that we are performing reproducible experiments and analyses. For some inspiration, please check out the following paper from the chemical machine learning community, as well as Stuart Prescott’s talk from the introductory session.
To get things started, we suggest breaking this checklist into two parts: before the experiment and after the experiment.
Please complete the survey on the importance of different aspects of the checklist.
Before the experiment (and the experimental plan)
- Do you have a scientific question or questions that you want to answer?
- Have you consulted an expert/previous literature to determine if reflectivity is able to answer your scientific question?
- Did you ask an instrument scientist whether the facility/instrument/technique is the best place to answer your question?
- Do you know when the proposal deadline is and what writing a proposal involves?
- Do you know how reflectivity data is modelled and analysed?
- Have you performed some basic simulations of what you expect to measure, to get a feel for how the experiment will work (see the sketch after this list)?
- The experiment/analysis is improved by pre-characterisation: can you use, or have you already used, complementary methods to inform your scientific question (XRR/NR/PNR/SQUID/ellipsometry)?
- Can you tune the sample to improve experimental sensitivity (contrast matching/multilayers/larger surface)?
- The answers to questions 1,4-7 should be discussed in your proposal.
- Are you going to measure what you think you are measuring?
- Have you documented how you would make your sample?
- Can you make the sample reproducibly enough (including characterisation of the sample), for the scientific question?
- Are the samples stable for the duration/conditions of the measurement (e.g. beam damage in XRR)?
- Have you considered the effect of sample imperfections on your experiment (sample size/flatness/roughness)?
- Do you need to prepare the sample at the facility, and if so, is the necessary equipment available?
- Do you know if there is a suitable sample environment for your study (including simultaneous characterisation), and if not, can one be developed?
- Have you done your homework (preparing for the beamtime)?
- Does your experimental team have the person power and knowledge to run the experiment?
- Do you know how long the experiments will take (including alignment/temperature ramping/etc.), and do you need to do a test measurement?
- What control and null measurements are necessary for your scientific question/sample environment?
- Do you have a measurement order and a backup plan?
- Have you done all necessary remote training (facility access/software, where possible)?
- Has anything changed since the proposal, or has the sample changed (this may have safety implications)?
- Do you know that the instrument is set up for the measurements you want to do?
- Is there an agreed-upon system for documentation (run numbers/metadata/logbook/calibration measurements), and have you shared it with the instrument scientist?
- Can you answer the question?
- Do you have an understanding of the analysis methodology (ie. the analytical model) that you will use?
- Do you know what a “good” reduced dataset should look like, for example from the simulation?
- Is it possible to perform on the fly/live experimental analysis?
- Do you know how to reduce the data to be suitable for analysis (this might be done for you)?
- Can you load the reduced data into the analysis software of choice?
- Do you have the expertise/capability to analyse the data?
- Can you access all the important metadata for your analysis?
- Do you know the corrections that have been performed on your data?
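As a concrete example of the simulation point above, here is a minimal sketch using the refnx Python package (one of several reflectivity analysis packages); the air/SiO2/Si structure, SLD values, instrument settings and the reduced-data filename are purely illustrative assumptions and should be replaced with ones appropriate to your own system.

```python
import numpy as np
import matplotlib.pyplot as plt
from refnx.reflect import SLD, ReflectModel

# Neutron scattering length densities in 1e-6 Å^-2 (illustrative values)
air = SLD(0.0, name="air")
sio2 = SLD(3.47, name="SiO2")
si = SLD(2.07, name="Si")

# air | 120 Å SiO2 film (3 Å roughness) | Si substrate (3 Å roughness)
structure = air | sio2(120, 3) | si(0, 3)

# Constant background and 5% dQ/Q resolution smearing
model = ReflectModel(structure, bkg=5e-7, dq=5.0)

q = np.linspace(0.008, 0.3, 400)   # Å^-1, a typical specular Q range
r_sim = model(q)                   # simulated reflectivity curve

# Once you have reduced data, overlay it on the simulation
# (assumes a hypothetical three-column Q, R, dR text file; adapt to your format)
# q_exp, r_exp, dr_exp = np.loadtxt("reduced_data.dat", unpack=True)
# plt.errorbar(q_exp, r_exp, yerr=dr_exp, fmt=".")

plt.semilogy(q, r_sim)
plt.xlabel("Q / Å$^{-1}$")
plt.ylabel("Reflectivity")
plt.show()
```

A simulation like this helps you judge, before beamtime, whether the expected features fall inside the measurable Q range and above the background, and the same script can later be reused to check whether the reduced data look sensible.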
After the experiment and analysis
(this is heavily borrowed from the above paper)
- Data sources
  - Are all data sources listed and publicly available?
  - If using an external database, is an access date or version number provided?
  - Are any potential biases in the source dataset reported and/or mitigated?
- Data cleaning
  - Are the data cleaning steps clearly and fully described, either in text or as a code pipeline?
  - Is an evaluation of the amount of removed source data presented (see the data-cleaning sketch below)?
  - Are instances of combining data from multiple sources identified, and potential issues mitigated?
- Data representations
  - Are methods for representing data as features or descriptors clearly articulated, ideally with software implementations?
  - Are comparisons against standard feature sets provided?
- Model choice
  - Is a software implementation of the model provided such that it can be trained and tested with new data?
  - Are baseline comparisons to simple/trivial models (for example, 1-nearest neighbour, random forest, most frequent class) provided (see the baseline sketch below)?
  - Are baseline comparisons to the current state of the art provided?
- Code and reproducibility
  - Is the code or workflow available in a public repository?
  - Are scripts to reproduce the findings in the paper provided?
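A minimal sketch of the kind of data-cleaning report referred to above, assuming a pandas DataFrame read from a hypothetical source_data.csv with hypothetical "thickness" and "roughness" columns; the filtering rules are placeholders for your own cleaning steps.

```python
import pandas as pd

df = pd.read_csv("source_data.csv")                  # hypothetical input file
n_start = len(df)

df = df.dropna(subset=["thickness", "roughness"])    # drop incomplete rows
n_after_na = len(df)

df = df[df["thickness"] > 0]                         # drop unphysical values
n_after_filter = len(df)

# Report how much source data each step removed
print(f"start:           {n_start}")
print(f"after dropna:    {n_after_na} ({n_start - n_after_na} removed)")
print(f"after filtering: {n_after_filter} ({n_after_na - n_after_filter} removed)")
```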
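A minimal sketch of the baseline comparisons referred to above, using scikit-learn; the synthetic dataset, train/test split and accuracy metric are illustrative stand-ins for your own data and evaluation protocol.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic classification data as a stand-in for your featurised dataset
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

baselines = {
    "most frequent class": DummyClassifier(strategy="most_frequent"),
    "1-nearest neighbour": KNeighborsClassifier(n_neighbors=1),
    "random forest": RandomForestClassifier(random_state=0),
}

# Report each baseline's test accuracy alongside your proposed model
for name, model in baselines.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")
```

If a proposed model cannot clearly beat these trivial baselines, that comparison should be reported rather than omitted.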