Material for the Annual General Assembly 2021.
structure of the text data file
- header with meta data
- column description
- data set
- [sub-header]
- [data set]
header with meta data
Wrapped YAML
- each line of the header starts with "# " (hash and space)
- it is structured according to YAML or JSON rules
- all non-YAML entries start with an additional hash ("# # ")
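A minimal sketch of how a reader might apply these prefix rules, assuming the whole file is available as a list of text lines. The function name split_header and the three-way return value are illustrative assumptions, not part of the format. The returned header text could then be handed to any YAML parser.

```python
def split_header(lines):
    """Separate YAML header text, plain comments and data rows (illustrative helper)."""
    yaml_lines, comments, data = [], [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("# #"):
            # additional hash: non-YAML comment, not to be processed as YAML
            comments.append(line[3:].strip())
        elif line.startswith("# "):
            # regular header line: drop the "# " wrapper, keep the YAML indentation
            yaml_lines.append(line[2:])
        elif line == "#":
            # bare hash: empty header line
            yaml_lines.append("")
        elif line.strip():
            # anything else that is not blank is treated as a data row
            data.append(line)
    return "\n".join(yaml_lines), comments, data


sample = [
    "# # ORSO reflectivity data file | 0.1 standard | YAML encoding | https://www.reflectometry.org/",
    "# creator:",
    "#     name : G. User",
    "# # Qz RQz sR sQ",
    "1.03563296e-02 3.88100068e+00 4.33909068e+00 5.17816478e-05",
]
header, comments, data = split_header(sample)
```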
the 1st line
The first line contains information about
- the general content;
- the ORSO file format version (and level of strictness) used;
- the encoding;
- a link to us.
# # ORSO reflectivity data file | 0.1 standard | YAML encoding | https://www.reflectometry.org/
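As a sketch, the four fields of this first line can be recovered by splitting on the "|" character. The function name parse_first_line and the dictionary keys are assumptions made for illustration; the draft does not define them.

```python
def parse_first_line(line):
    """Split the first line into its four '|'-separated fields (illustrative helper)."""
    if not line.startswith("# # "):
        raise ValueError("first line must start with '# # '")
    fields = [field.strip() for field in line[4:].split("|")]
    if len(fields) != 4:
        raise ValueError("expected 4 fields, found %d" % len(fields))
    content, version, encoding, link = fields
    # "0.1 standard" -> version number and level of strictness
    number, _, strictness = version.partition(" ")
    return {
        "content": content,
        "version": number,
        "strictness": strictness,
        "encoding": encoding,
        "link": link,
    }


info = parse_first_line(
    "# # ORSO reflectivity data file | 0.1 standard | YAML encoding | https://www.reflectometry.org/"
)
```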
meta data
Several blocks of meta data follow. Their dictionary is not yet defined, so for now only the format is mandatory, not the content.
The essential categories deal with
- the origin of the file
- the origin and meta information of the data
- the processing steps performed so far
Further categories might be added, e.g.
- sample description (in- or output of analysis program)
- simulation / fitting history
- instrumental details for an improved resolution function
Here we concentrate on the essential ones:
origin / ownership of the file
Information about the one who created this file.
clear text:
created
by Jochen Stahn
from Paul Scherrer Institut
on 2020-04-06T13:21:18
on the computer lnsa17.psi.ch
suggestion for the implementation (= the dictionary)
# creator:
#     name : Jochen Stahn
#     affiliation : Paul Scherrer Institut (PSI)
#     time : 2020-04-06T13:21:18
#     computer : lnsa17.psi.ch
or if automatically generated
# creator:
#     description : automated output for neutron reflectometer Amor at PSI
#     time : 2020-04-06T13:21:18
#     computer : lnsa17.psi.ch
further entries might be
# contact : e-mail / URL
origin and ownership of the data
Here goes all the information about the ownership of the raw data, and where and how they were obtained.
Made-up example:
# data source:
# owner:
# name: T. Proposer
# affiliation: The Institute (TI)
# contact: t_proposer@institute.org
# experiment:
# facility: Paul Scherrer Institut, SINQ
# ID: 2020 0304
# date: 2021-05-12 - 2021-05-15
# title: Generation of input for formatting purposes
# instrument: Amor
# probe: neutrons
# sample:
# name: Ni1000
# description:
# - amb: air
# - layer: {material: Ni, thickness: 100 nm}
# - sub: Si
# measurement:
# scheme: energy-dispersive
# instrument_settings:
# polarisation: +
# wavelength:
# unit: angstrom
# min: 3.0
# max: 12.0
# resolution:
# type: proportional
# value: 0.022 # Delta lambda / lambda
# incident_angle:
# unit: deg
# value: 0.4
# resolution:
# type: constant
# unit: deg
# value: 0.01
# data_files:
# - file : amor2020n001925.hdf
# created : 2020-02-03T14:27:45
# - file : amor2020n001926.hdf
# created : 2020-02-03T14:37:15
or for different angles:
# incident_angle:
# unit: deg
# resolution:
# type: constant
# unit: deg
# value: 0.01
# data_files:
# - file : amor2020n001925.hdf
# created : 2020-02-03T14:27:45
# incident_angle:
# value: 0.4
# - file : amor2020n001926.hdf
# created : 2020-02-03T14:37:15
# incident_angle:
# value: 1.5
history of processing
This section should make it possible to reproduce the data reduction process, i.e. to recreate this file (except for the creator part) from the raw data.
It also mentions the corrections performed so far. There might be a reference to a well-defined algorithm in the future.
# reduction:
# software: eos.py
# call : eos.py -Y 2020 -n 1925-1927 -y 9,55 ni1000 -O -0.2 -r 1064 -s 1 -i -a 0.005 -e
# comment:
# corrections performed by normalisation to measurement on reference sample
# corrections:
# - footprint
# - incident intensity
# - detector efficiency
column description
The last part of the header is the description of the columns to follow.
# columns:
# - {name: Qz, unit: 1/angstrom, dimension: WW transfer}
# - {name: R, dimension: reflectivity}
# - {name: sR, dimension: error-reflectivity}
# - {name: sQz, unit: 1/angstrom, dimension: resolution-WW transfer}
This is a compact notation for
# columns:
#     - name: Qz
#       unit: 1/angstrom
#       dimension: WW transfer
#     - name: R
#       dimension: reflectivity
#     - name: sR
#       dimension: error-reflectivity
#     - name: sQz
#       unit: 1/angstrom
#       dimension: resolution-WW transfer
In case there are multiple data sets (see below) it might be necessary to provide an identifier also for the first data set. Thus there can be the optional line
# data set: <identifier>
Optionally, the last line might be a short-notation description
# # Qz RQz sR sQ
where the second # means that this is a comment line not to be processed.
data set
Each data set consists of a rectangular array of numbers. All entries have the same data type (the default is float). There is no leading space.
1.03563296e-02 3.88100068e+00 4.33909068e+00 5.17816478e-05
1.06717294e-02 1.16430511e+01 8.89252719e+00 5.33586471e-05
...
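The rectangular-array requirement can be checked while reading the rows. The following is a sketch only: the helper name parse_rows is an assumption, and it presumes whitespace-separated columns with float as the data type.

```python
def parse_rows(lines):
    """Convert whitespace-separated data lines into a list of float rows."""
    rows = [[float(token) for token in line.split()] for line in lines if line.strip()]
    # a rectangular array has the same number of entries in every row
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError("data set is not rectangular: row lengths %s" % sorted(widths))
    return rows


rows = parse_rows([
    "1.03563296e-02 3.88100068e+00 4.33909068e+00 5.17816478e-05",
    "1.06717294e-02 1.16430511e+01 8.89252719e+00 5.33586471e-05",
])
```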
Suggestion to define the size of each column according to one of the following rules (with increasing rigidity):
- Each column can have its individual length, but the descriptive header must fit.
- All columns must have the same length
- All columns are 16 spaces wide, i.e. % 16f, % 16g or % 16e
- All columns have the format % 16e
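The strictest rule can be illustrated with Python's printf-style formatting, which accepts the same "% 16e" conversion. The helper name format_row is an assumption for this sketch. Note that padding every column to 16 characters puts blanks in front of the first value, so a reader has to tolerate leading white space under this rule.

```python
def format_row(values, fmt="% 16e"):
    """Format one data row with a fixed-width conversion per column."""
    return "".join(fmt % value for value in values)


row = format_row([1.03563296e-02, 3.88100068e+00])
# each column occupies exactly 16 characters
columns = [row[i:i + 16] for i in range(0, len(row), 16)]
```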
multiple data sets
In case there are several data sets in one file (e.g. for different spin states) the following rules apply:
empty line
This is optional. An empty line is recognized by gnuplot as a separator for 3 dimensional data sets.
separator
If more than one data set is provided, they are separated by a line starting with
# data_set: <identifier>
where the identifier is either a unique name or a number. The default numbering of data sets starts with 0, the first additional one thus gets number 1 and so on.
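A sketch of how a reader might split a file body at these separator lines, applying the default numbering when no identifier is given. The function name split_data_sets and the dictionary layout are assumptions; an identifier for the first data set supplied in the main header is not handled here.

```python
def split_data_sets(lines, key="# data_set:"):
    """Group lines into data sets, starting a new one at each separator line."""
    sets = [{"id": 0, "lines": []}]  # the first data set has the default number 0
    for line in lines:
        if line.startswith(key):
            # use the given identifier, else the running number (1, 2, ...)
            identifier = line[len(key):].strip() or len(sets)
            sets.append({"id": identifier, "lines": []})
        else:
            sets[-1]["lines"].append(line)
    return sets


sets = split_data_sets(["1 2", "# data_set: spin_dn", "3 4", "# data_set:", "5 6"])
```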
overwrite meta data
Below the separator line, meta data might be added. These overwrite the meta data supplied in the main header (i.e. data set 2 does not know anything about the changes made for data set 1).
# data_source:
# measurement:
# polarisation: -
# input_files:
# data_files:
# - file : amor2020n001930.hdf
# created : 2020-02-03T15:27:45
# reduction:
# call : eos.py -Y 2020 -n 1930 -y 9,55 ni1000 -O -0.2 -r 1064 -s 1 -i -a 0.005 -e
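The overwrite rule can be sketched as a recursive merge in which every data set starts from an untouched copy of the main header. The function name merged and the small dictionaries below are illustrative assumptions; they stand in for already parsed YAML.

```python
import copy


def merged(main, override):
    """Return a copy of main with the entries of override applied on top."""
    result = copy.deepcopy(main)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # descend into nested blocks so that untouched entries survive
            result[key] = merged(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


main_header = {"measurement": {"scheme": "energy-dispersive", "polarisation": "+"}}
# each data set is merged against the main header, never against another data set
set_1 = merged(main_header, {"measurement": {"polarisation": "-"}})
set_2 = merged(main_header, {})
```

Because the main header is deep-copied for each data set, a change made for data set 1 is invisible to data set 2.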
repetition of short-version column description
optional
# # Qz RQz sR sQ
next data set
of the same format (number, format and description of columns) as data set 0
1.03563296e-02 1.08100068e+00 4.33909068e+00 5.17816478e-05
1.06717294e-02 1.06430511e+01 8.89252719e+00 5.33586471e-05
...
example
all of the above mentioned lines without comments.
text_example.ort
# # ORSO reflectivity data file | 0.1 standard | YAML encoding | https://www.reflectometry.org/
# creator:
# name : G. User
# affiliation : PSI
# time : 2020-04-06T13:21:18
# computer : lnsa17.psi.ch
# data source:
# owner:
# name: T. Proposer
# affiliation: The Institute (TI)
# contact: t_proposer@institute.org
# experiment:
# facility: Paul Scherrer Institut, SINQ
# ID: 2020 0304
# date: 2021-05-12 - 2021-05-15
# title: Generation of input for formatting purposes
# instrument: Amor
# probe: neutrons
# sample:
# name: Ni1000
# description:
# - amb: air
# - layer: {material: Ni, thickness: 100 nm}
# - sub: Si
# measurement:
# scheme: angle- and energy-dispersive
# instrument_settings:
# sample_rotation:
# alias: mu
# unit: deg
# value: 0.7
# detector_rotation:
# alias: mu
# unit: deg
# value: 1.4
# incident_angle:
# unit: deg
# min: 0.4
# max: 1.0
# resolution:
# type: constant
# unit: deg
# value: 0.01
# wavelength:
# unit: angstrom
# min: 3.0
# max: 12.5
# resolution:
# type: proportional
# value: 0.022 # Delta lambda / lambda
# polarisation: +
# data_files:
# - file : amor2020n001925.hdf
# created : 2020-02-03T14:27:45
# - file : amor2020n001926.hdf
# created : 2020-02-03T14:37:15
# - file : amor2020n001927.hdf
# created : 2020-02-03T14:27:02
# references:
# - file : amor2020n001064.hdf
# created : 2020-02-02T15:38:17
# reduction:
# software: eos.py
# call : eos.py -Y 2020 -n 1925-1927 -y 9,55 ni1000 -O -0.2 -r 1064 -s 1 -i -a 0.005 -e
# comment:
# corrections performed by normalisation to measurement on reference sample
# corrections:
# - footprint
# - incident intensity
# - detector efficiency
# columns:
# - {name: Qz, unit: 1/angstrom, dimension: WW transfer}
# - {name: R, dimension: reflectivity}
# - {name: sR, dimension: error-reflectivity}
# - {name: sQz, unit: 1/angstrom, dimension: resolution-WW transfer}
# data set: spin_up
# # Qz RQz sR sQ
1.03563296e-02 3.88100068e+00 4.33909068e+00 5.17816478e-05
1.06717294e-02 1.16430511e+01 8.89252719e+00 5.33586471e-05
...
# data_set: spin_dn
# data_source:
# measurement:
# instrument_settings:
# polarisation: -
# input_files:
# data_files:
# - file : amor2020n001930.hdf
# created : 2020-02-03T15:27:45
# # Qz RQz sR sQ
1.03563296e-02 1.08100068e+00 4.33909068e+00 5.17816478e-05
1.06717294e-02 1.06430511e+01 8.89252719e+00 5.33586471e-05
...
A much reduced header:
# # ORSO reflectivity data file | 0.1 standard | YAML encoding | https://www.reflectometry.org/
# creator:
# name : G. User
# affiliation : PSI
# time : 2020-04-06T13:21:18
# computer : lnsa17.psi.ch
# data source:
# owner:
# name: T. Proposer
# affiliation: The Institute (TI)
# experiment:
# facility: Paul Scherrer Institut, SINQ
# ID: 2020 0304
# date: 2021-05-12 - 2021-05-15
# title: Generation of input for formatting purposes
# instrument: Amor
# probe: neutrons
# sample:
# name: Ni1000
# measurement:
# scheme: angle- and energy-dispersive
# instrument_settings:
# incident_angle:
# unit: deg
# min: 0.4
# max: 1.0
# wavelength:
# unit: angstrom
# min: 3.0
# max: 12.5
# polarisation: +
# data_files:
# - file : amor2020n001925.hdf
# created : 2020-02-03T14:27:45
# - file : amor2020n001926.hdf
# created : 2020-02-03T14:37:15
# - file : amor2020n001927.hdf
# created : 2020-02-03T14:27:02
# references:
# - file : amor2020n001064.hdf
# created : 2020-02-02T15:38:17
# reduction:
# software: eos.py
# call : eos.py -Y 2020 -n 1925-1927 -y 9,55 ni1000 -O -0.2 -r 1064 -s 1 -i -a 0.005 -e
# columns:
# - {name: Qz, unit: 1/angstrom, dimension: WW transfer}
# - {name: R, dimension: reflectivity}
# - {name: sR, dimension: error-reflectivity}
# - {name: sQz, unit: 1/angstrom, dimension: resolution-WW transfer}
# data set: spin_up
# # Qz RQz sR sQ
1.03563296e-02 3.88100068e+00 4.33909068e+00 5.17816478e-05
1.06717294e-02 1.16430511e+01 8.89252719e+00 5.33586471e-05
...
# data_set: spin_dn
# data_source:
# measurement:
# instrument_settings:
# polarisation: -
# input_files:
# data_files:
# - file : amor2020n001930.hdf
# created : 2020-02-03T15:27:45
# # Qz RQz sR sQ
1.03563296e-02 1.08100068e+00 4.33909068e+00 5.17816478e-05
1.06717294e-02 1.06430511e+01 8.89252719e+00 5.33586471e-05
...
from example towards dictionary
legend:
1_ mandatory (according to the principles)
2_ mandatory if applicable (e.g. proposal ID for large scale facilities)
3_ recommended
4_ optional
_1 easy to realize
_2 probably needs some programming
_3 probably needs some manual work (operator’s name on the lab x-ray machine)
11# # ORSO reflectivity data file | 0.1 standard | YAML encoding | https://www.reflectometry.org/
13# creator: this identifies the person or
program that created this file
13# name:
13# affiliation:
11# time: date and time of file creation,
format: yyyy-mm-ddThh:mm:ss
11# computer:
12# data source: This information should be
provided in the raw data file
If not, one has to find ways to
provide it.
12# owner: This refers to the actual owner
of the data set, i.e. the main
proposer or the person doing the
measurement on a lab reflectometer
12# name:
12# affiliation:
12# experiment:
22# facility:
22# ID: proposal ID
12# date: format: yyyy-mm-dd
22# title: proposal or project title
12# instrument:
12# probe: neutrons or x-rays
12# sample:
12# name:
32# composition: free text notes on the nominal
composition of the sample
12# measurement:
22# scheme: one of
angle- and energy-dispersive
angle-dispersive
energy-dispersive
12# instrument_settings:
22# incident_angle:
unit: rad / deg
min:
max:
22# wavelength:
unit: nm / angstrom
min:
max:
polarisation: one of + / - / ++ / +- / -+ / --
11# data_files:
11# - file: file name or identifier (doi)
12# created: yyyy-mm-ddThh:mm:ss
21# references:
21# - file: file name or identifier (doi)
22# created: yyyy-mm-ddThh:mm:ss
22# reduction:
23# software: name and version of the reduction software
23# call: command line call or similar
11# columns: the 4 leading columns must follow
the scheme for "I(x)":
x with units
I with units if applicable
sigma of I with units (if applicable)
sigma of resolution of x with units
Further columns can be of any type,
content and order. But always with
description and units.
11# - name: Qz
11# unit: 1/angstrom / 1/nm
31# dimension: WW transfer
11# - name: R
31# dimension: reflectivity
11# - name: sR
31# dimension: error-reflectivity
11# - name: sQz
11# unit: 1/angstrom / 1/nm
31# dimension: resolution-WW transfer
41# data set: optional, to provide a name /
identifier for a multi data set
file
41# # Qz RQz sR sQ
1.03563296e-02 3.88100068e+00 4.33909068e+00 5.17816478e-05
1.06717294e-02 1.16430511e+01 8.89252719e+00 5.33586471e-05
...
rectangular matrix
all entries of the same data type