The OpenKIM system includes a collection of Tests, Models, Predictions and Reference Data. A Test is a computer program that couples with a Model (i.e. an interatomic potential) to generate one or more Predictions, each of which is associated with a specific material property. In turn, every material property is associated with a Property Definition that is created by a developer and includes a formal definition stored in a standardized format (see Section 2.2). A Prediction is thus a realization of a Property Definition (referred to as a Property Instance) for a specific case. Similarly, an item of Reference Data is a Property Instance obtained from an experiment or a first principles calculation. Thus, every Property Instance is either a Prediction or an item of Reference Data.
A developer interested in contributing a new Test or Reference Data must first determine whether a suitable Property Definition already exists in OpenKIM by searching the properties page on https://openkim.org. If so, they proceed to use the appropriate definition when writing their Test or uploading the Reference Data. Otherwise, in consultation with the KIM Editor and developers of similar properties, they determine whether any of the existing Property Definitions can be adapted to their need or a new Property Definition is warranted. In the event that a Property Definition is adapted due to corrections or new requirements, all existing Tests and Reference Data associated with it must be revised with a corresponding version update.
Every property in KIM will be associated with the following information:
The name and contact information of the original "contributor" of the property and of the "maintainer" currently responsible for it.
A Property Definition file in EDN format (Section 2.2) which defines the property and from which a template can be generated for Test developers.
A file containing documentation about the property, including a detailed explanation of the property and the variables that appear in the Property Definition.
A Wiki-style web page which provides supplemental documentation of the property and enables community involvement and discussion in its maintenance and evolution. This should include:
A list of interested users involved in the definition and maintenance of the property.
A list of tags specified by the developers that characterize the property. (The openkim.org system will provide recommendations for tags and synonyms based on the property documentation.) Existing nomenclature will be adopted when possible, such as that of the IUPAC (International Union of Pure and Applied Chemistry) available at https://iupac.org/.
Two validators to verify the correctness of a Property Instance. (1) A “Definition Validator” ensures that the Property Instance is valid EDN and conforms to the Property Definition. (2) A “Physics Validator”, provided by the property developer, ensures that the Property Instance contains physically acceptable values.
A Property Definition is stored in a subset of the EDN format as described below. In the following discussion, a map is an unordered set of key-value pairs akin to Perl’s hash, Python’s dictionary, and Java’s Hashtable. A key is a string. Key names can only include lower-case alphanumeric characters and dashes. The names are arbitrary and set by the developer to reflect the meaning of the key. A value is a string, boolean, or a vector of integers and ":" strings. A “;” character encountered outside of a string indicates the start of a comment. The “;” and all subsequent characters up to the next newline are ignored.
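The comment rule can be illustrated with a short Python sketch. It is not part of the OpenKIM tooling: the function name strip_comments is hypothetical, and the sketch assumes strings are delimited by double quotes and use backslash escapes, as in EDN.

```python
def strip_comments(text: str) -> str:
    """Remove ';' comments that start outside of a double-quoted string."""
    out = []
    in_string = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Copy the escaped character too, so \" does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == ";":
            # Skip everything up to (but not including) the next newline.
            while i < len(text) and text[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

A “;” inside a string is left untouched, while one outside a string removes the rest of its line and keeps the newline.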
Note about strings: In strings, each backslash must be escaped by writing it twice. Otherwise it is interpreted as the start of a special character (such as a tab \t or a newline \n). For example, if the LaTeX notation \cos\theta is included in a string, it should be written as \\cos\\theta.
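As a minimal illustration of the escaping rule, the following Python sketch doubles every backslash before the text is placed in a string. The helper name escape_backslashes is hypothetical and is not an OpenKIM function.

```python
def escape_backslashes(raw: str) -> str:
    """Double every backslash so it is read back as a literal backslash."""
    return raw.replace("\\", "\\\\")


# A raw Python string holds the LaTeX source with single backslashes.
latex = r"\cos\theta"
print(escape_backslashes(latex))  # prints \\cos\\theta
```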
A Property Definition must contain the following required key-value pairs:
tag:<email-address>,<date>:property/<property-name>
The fields appearing within <...> stand in for text as defined below.
<property-name> is restricted to lowercase alphanumeric characters and the dash character (“-”).
<date> is the date of establishment of the property in “yyyy-mm-dd” format.
<email-address> is the e-mail address of the property contributor. A contributor has several options available. They may use their own e-mail address, their openkim.org username followed by “@noreply.openkim.org”, an openkim.org organization name followed by “@noreply.openkim.org”, an openkim.org user or organization’s UUID followed by “@noreply.openkim.org”, or in agreement with the KIM Editor they may use “staff@noreply.openkim.org”. Any letters in the range A-Z must be written in lowercase, and the address cannot contain a plus (“+”) character; otherwise any valid e-mail address is accepted. Several examples follow:
(a) Contributor’s email address is “jessie@example.net”: <email-address> = “jessie@example.net”
(b) Contributor’s username “Jessie” in openkim.org: <email-address> = “jessie@noreply.openkim.org”
(c) Contributing organization “ExampleLab” in openkim.org: <email-address> = “examplelab@noreply.openkim.org”
(d) A UUID of “53584e2a-3caf-446f-ba11-b843d3d24a3a” corresponding to a user or organization in openkim.org: <email-address> = “53584e2a-3caf-446f-ba11-b843d3d24a3a@noreply.openkim.org”
(e) “staff@noreply.openkim.org” used in agreement with the KIM Editor: <email-address> = “staff@noreply.openkim.org”
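The rules above for assembling the tag:<email-address>,<date>:property/<property-name> string can be sketched in Python as follows. This is an illustration under stated assumptions, not an official validator: the function name make_property_id is hypothetical, the "@" check is only a minimal sanity test for an e-mail address, and the date used in the usage example is made up.

```python
import re
from datetime import date

# <property-name>: lowercase alphanumeric characters and dashes only.
NAME_RE = re.compile(r"[a-z0-9-]+")


def make_property_id(email: str, established: date, name: str) -> str:
    """Assemble tag:<email-address>,<date>:property/<property-name>."""
    if re.search(r"[A-Z]", email):
        raise ValueError("letters A-Z in the e-mail address must be lowercase")
    if "+" in email:
        raise ValueError("the e-mail address must not contain '+'")
    if "@" not in email:
        raise ValueError("not an e-mail address (assumed minimal check)")
    if not NAME_RE.fullmatch(name):
        raise ValueError("property name allows only a-z, 0-9 and '-'")
    # date.isoformat() yields the required yyyy-mm-dd form.
    return f"tag:{email},{established.isoformat()}:property/{name}"
```

For example, with an illustrative date, make_property_id("jessie@noreply.openkim.org", date(2020, 1, 31), "cohesive-energy-relation-cubic-crystal") returns "tag:jessie@noreply.openkim.org,2020-01-31:property/cohesive-energy-relation-cubic-crystal", while an address such as "Jessie@noreply.openkim.org" is rejected.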
The required fields listed above are followed by an unordered set of key-map pairs. Each key is associated with a map which must contain a standard set of key-value pairs, some of which take the boolean values true or false.
Below is an example of a Property Definition for the cohesive energy relation of a cubic crystal.
Property Instances are either Predictions or items of Reference Data and must conform to the specification in the associated Property Definition. A Property Instance is stored in a subset of the EDN format as described in Section 2.2. Multiple Property Instances in a file may optionally be contained within an array represented by a start bracket ([) at the beginning, and an end bracket (]) at the end of the file:
[
{
⋮
Property Instance 1
⋮
}
{
⋮
Property Instance 2
⋮
}
]
If the brackets are not present, the Property Instances are assumed to be in an array. Multiple Property Instances can only be separated by whitespace or comments (lines beginning with a “;”).
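The optional-bracket rule can be sketched in Python as a small normalization step that adds the enclosing brackets when they are missing. The function name ensure_array is hypothetical, and the sketch assumes that only whitespace and “;” comments may precede the first bracket or Property Instance.

```python
def ensure_array(text: str) -> str:
    """Wrap the file contents in [ ... ] unless the first token is already '['."""
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
        elif text[i] == ";":
            # Skip a leading comment up to the end of its line.
            while i < len(text) and text[i] != "\n":
                i += 1
        else:
            break
    if i < len(text) and text[i] == "[":
        return text
    # The newline before ']' keeps a trailing comment from swallowing the bracket.
    return "[\n" + text + "\n]"
```

After this step, a file with or without the brackets can be handled by the same array-reading code.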
Each Property Instance must contain the following required key-value pairs:
The required fields listed above are followed by an unordered set of key-map pairs for keys included in the Property Definition. Required keys (as indicated in the Property Definition) must be included. Each key is associated with a map containing one or more of the following key-value pairs (required keys are indicated by a star):
All keys beginning with “source” are associated with the physical source-units (if applicable). The keys associated with uncertainty and precision conform to the ISO “Guide to the Expression of Uncertainty in Measurement” and the ThermoML standard notation [2].
If the source-value key is a scalar, the values of the uncertainty and digits keys must be scalars. If the source-value key’s value is an array (EDN vector), the values of the uncertainty and digits keys must be either arrays of the same extent, or scalars in which case they are taken to apply equally to all values in the source-value array.
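The scalar and array rule above can be checked with a short Python sketch. It assumes that an EDN vector has already been read into a rectangular nested Python list; the function names extent and companion_is_consistent are hypothetical and are not part of any OpenKIM validator.

```python
def extent(x):
    """Return the extent of a rectangular nested list as a tuple; () for a scalar."""
    if isinstance(x, list):
        inner = extent(x[0]) if x else ()
        return (len(x),) + inner
    return ()


def companion_is_consistent(source_value, companion) -> bool:
    """Check an uncertainty or digits value against its source-value.

    Scalar source-value: the companion must be a scalar.
    Array source-value: the companion is a scalar (applied to every
    entry) or an array of the same extent.
    """
    if not isinstance(source_value, list):
        return not isinstance(companion, list)
    if not isinstance(companion, list):
        return True
    return extent(companion) == extent(source_value)
```

For instance, companion_is_consistent([1.0, 2.0, 3.0], 5) is True because the scalar applies to all three values, whereas companion_is_consistent([1.0, 2.0, 3.0], [5, 5]) is False because the extents differ.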
Below is a fictitious example of a Property Instance corresponding to the cohesive-energy-relation-cubic-crystal Property Definition given in Section 2.2.
Note that, although the value of the short-name key is a scalar in this case, it is still enclosed in brackets because it was defined as an array in the Property Definition.
Metadata related to a Property Instance will be stored separately in an auxiliary document. For a Prediction, this can include details of the Test calculations. For Reference Data, this can include the source citation, the origin of the data (experimental or first principles), and, for first-principles data, the type of calculation performed and the parameters needed to define it.
[1] R. D. Chirico, M. Frenkel, V. V. Diky, K. N. Marsh, and R. C. Wilhoit. ThermoML – An XML-Based Approach for Storage and Exchange of Experimental and Critically Evaluated Thermophysical and Thermochemical Property Data. 2. Uncertainties. J. Chem. Eng. Data, 48:1344–1359, 2003.
[2] M. Frenkel, R. D. Chirico, V. Diky, Q. Dong, K. N. Marsh, J. D. Dymond, W. A. Wakeham, S. E. Stein, E. Koenigsberger, and A. R. H. Goodwin. XML-Based IUPAC Standard for Experimental, Predicted, and Critically Evaluated Thermodynamic Property Data Storage and Capture (ThermoML). Pure Appl. Chem., 78(3):541–612, 2006.
[3] T. Kindberg and S. Hawke. The 'tag' URI Scheme. RFC 4151. https://www.ietf.org/rfc/rfc4151.txt.