File Format Definition

Introduction

This page defines the F5 file format.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119. (http://www.ietf.org/rfc/rfc2119.txt)

File Elements

F5 will never use attribute names starting with '_'. User extensions SHOULD use names starting with '_' until the extension is included in F5.

F5 will ignore all hdf5 elements (groups, datasets, ...) whose names start with '_'. Applications are free to use such names for anything that is not planned for inclusion in F5.
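The underscore rule can be sketched as a simple name filter. This is only an illustration in Python: the function names are hypothetical, and it assumes the rule applies to the element's own name rather than to its full path.

```python
def is_ignored_by_f5(name: str) -> bool:
    """Return True for names reserved for applications (leading '_')."""
    return name.startswith("_")


def f5_visible(names):
    """Keep only the element names that F5 itself would look at."""
    return [n for n in names if not is_ignored_by_f5(n)]


# Hypothetical listing of the elements found in one hdf5 group.
print(f5_visible(["Fields", "_scratch", "version"]))
```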

Root element

The root element contains a group version with an attribute version whose value is a url indicating the file format version. Using a url is inspired by the dtd naming scheme used by xml. (Note: it is not possible to attach an attribute to the root element itself; otherwise that would be the appropriate place.)

All urls MUST be limited to 511 characters plus one trailing 0, to make it easier for applications to process the files.

Applications SHOULD check if urls respect this limit.
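A minimal sketch of such a check in Python. It assumes that the limit is counted in bytes of the UTF-8 encoded string, which this page does not state; the names MAX_URL_CHARS and url_respects_limit are hypothetical.

```python
MAX_URL_CHARS = 511  # limit from the text; the trailing 0 is stored in addition


def url_respects_limit(url: str) -> bool:
    """Return True if the url fits in 511 characters plus one trailing 0."""
    encoded = url.encode("utf-8")  # assumption: the limit applies to encoded bytes
    # An embedded 0 would terminate the string early in a fixed-size buffer.
    return len(encoded) <= MAX_URL_CHARS and b"\x00" not in encoded
```

An application could call this on every url it reads and warn the user when it returns False.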

For now the version is of the format:

The domain f5.org should be registered.

Toplevel Groups

The root level of an F5 file contains two reserved groups called

Time Slice

Note that time slices are replaced by parameter slices in F5 version 0.1.1 (see below).

Parameter Slice

Grids

Fields

Fields SHOULD use the same naming conventions as grids do.

The name of a field MUST be unique. If fields in more than one grid have the same name, they are different representations of the same data. For example, a field might be stored as a uniform scalar field. At the same time, this field might be evaluated on a surface and stored as a surface field in another grid. The name of these two fields MUST be the same. This enables an application that changes the data of the uniform field to know that it has to change the data of the surface field at the same time. If the application is not able to understand the surface representation, it SHOULD warn the user and MAY remove the surface representation.

Each F5 field might be represented by more than one hdf5 dataset, each with the name of the field, plus additional supporting hdf5 datasets/groups.

hdf5 datasets are stored in fortran order.

The basic structure of the grid is described by the field named 'Positions'. This is a reserved keyword and MUST NOT be used by any other field.

Each field has one or more F5 ContentTypes. A field ContentType describes how the field is represented by hdf5 objects. The field ContentType is a url. ContentTypes defined in the standard will start with the version url of the file (see above). The next part will point to a specific part of the specification together with its version.

User extension ContentTypes use other urls.
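The distinction between standard and user extension ContentTypes can be sketched as a prefix test. This is an illustration only: the function name is hypothetical, and the version url used in the example is an assumption inferred from the ContentType strings shown later on this page.

```python
def is_standard_content_type(content_type: str, version_url: str) -> bool:
    """Return True if the ContentType starts with the file's version url."""
    # Compare against a prefix ending in '/', so that a version url such as
    # ".../F5-0.1.0" does not accidentally match ".../F5-0.1.01/...".
    prefix = version_url if version_url.endswith("/") else version_url + "/"
    return content_type.startswith(prefix)


# Assumed version url, for illustration only.
VERSION_URL = "http://www.zib.de/visual/F5-0.1.0/"
print(is_standard_content_type(
    "http://www.zib.de/visual/F5-0.1.0/RegularGrid-0.1.0/StandardCartesian/Uniform/",
    VERSION_URL,
))
```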

A field ContentType could e.g. be

A field with name "ImageData" with this ContentType is represented as:


Points/                                         Group
Points/StandardCartesianChart3D/                Group
Points/StandardCartesianChart3D/Positions       Group
    Attribute: DataspaceDims {3}
        Type:      32-bit big-endian integer
        Data:  30, 20, 10
    Attribute: base      scalar
        Type:      struct {
                   "x"                +0    IEEE 32-bit big-endian float
                   "y"                +4    IEEE 32-bit big-endian float
                   "z"                +8    IEEE 32-bit big-endian float
               } 12 bytes
        Data:  {-1, -1, -1}
    Attribute: delta     scalar
        Type:      struct {
                   "x"                +0    IEEE 32-bit big-endian float
                   "y"                +4    IEEE 32-bit big-endian float
                   "z"                +8    IEEE 32-bit big-endian float
               } 12 bytes
        Data:  {0.2, 0.1, 0.0666667}
    Attribute: extent    {3}
        Type:      32-bit big-endian integer
        Data:  10, 20, 30
Points/StandardCartesianChart3D/ImageData       Dataset {30/30 20/20 10/10}
    Type:     

For a strict definition of all the attributes, see below. Note that the values in extent are in the reverse order of those in DataspaceDims (which equal the dimensions of the dataset), because the data are stored in fortran order.
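The relation between extent and the dataset dimensions in the listing above can be sketched in Python. The function names are hypothetical. The sketch relies only on the usual meaning of fortran order (the first index varies fastest in storage) and on hdf5 reporting dimensions slowest-varying first.

```python
def dataspace_dims(extent):
    """Reverse (nx, ny, nz) into the dimension order reported for the dataset."""
    return tuple(reversed(extent))


def linear_offset(index, extent):
    """Offset of vertex (i, j, k) when the first index varies fastest."""
    offset = 0
    stride = 1
    for i, n in zip(index, extent):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for extent {n}")
        offset += i * stride
        stride *= n  # the next index advances by a whole block of the previous ones
    return offset


extent = (10, 20, 30)                     # x, y, z as in the 'extent' attribute
print(dataspace_dims(extent))             # dimension order of the dataset
print(linear_offset((0, 0, 1), extent))   # one step in z skips 10 * 20 values
```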

The loosest ContentType is

A field ContentType is just a string that complies with the url naming scheme. It doesn't enforce any hierarchy, though it might be useful to have one.

Users MUST have some kind of access to the domain they're using for user-defined constraints, to avoid namespace pollution.

All ContentTypes of each field (including the Positions) are listed as attributes of a group Fields/Fieldname. F5 uses the content of the hdf5 attributes of this group regardless of their names. The first ContentType should be named ContentType000, the next ContentType001, and so on.
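The naming scheme and the "regardless of their names" rule can be sketched as follows. The function names are hypothetical, and the attributes of a Fields/Fieldname group are modelled here as a plain mapping from attribute name to string value rather than read through an hdf5 library.

```python
def content_type_attribute_names(count: int):
    """Recommended attribute names: ContentType000, ContentType001, ..."""
    return [f"ContentType{i:03d}" for i in range(count)]


def content_types_of(attributes):
    """Collect every attribute value as a ContentType, whatever its name is."""
    return list(attributes.values())


# Hypothetical attribute mapping for one field group.
attrs = dict(zip(
    content_type_attribute_names(2),
    ["http://example.org/ext/A/", "http://example.org/ext/B/"],
))
print(sorted(attrs))
print(content_types_of(attrs))
```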

The group Fields is also useful as a directory of the fields contained in this grid.

An application SHOULD try to deal with all the listed ContentTypes. Think of the full field ContentType as the sum of all the F5 field constraints. During read-only access it might be possible to ignore some of the constraints without losing relevant information (e.g. if a field is stored in two representations, such as cartesian and polar, which can be mapped to each other), but this is not guaranteed. An application SHOULD warn the user about F5 field constraints it is ignoring.

It is not safe to ignore constraints during write access to an existing file (when modifying or appending data). If the application does not understand the semantics of a ContentType, it will not be able to modify the data correctly. If write access is required nonetheless, the application might consider saving the data to a new file that includes only the constraints it is aware of. Another option is to delete all data not used by the constraints the application is aware of. This should only be done after user confirmation.
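One way an application might turn these rules into a decision is sketched below. The function name and the three result labels are hypothetical and not part of F5. The sketch only separates the read-only case (warn) from the write case (do not modify in place).

```python
def check_access(listed, understood, write: bool):
    """Decide how to proceed given the ContentTypes listed for a field.

    Returns a pair (decision, unknown) where decision is one of
    "ok", "warn" (read-only access, some ContentTypes ignored) or
    "refuse" (write access with ContentTypes that are not understood).
    """
    unknown = [ct for ct in listed if ct not in understood]
    if not unknown:
        return ("ok", [])
    if write:
        # Modifying data without understanding every ContentType is not safe.
        return ("refuse", unknown)
    # Read-only: ignoring may lose information, so the user should be warned.
    return ("warn", unknown)
```

On "refuse", the application could offer the options described above, such as writing to a new file, after asking the user.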

The ContentType information for the above example would be:

Fields                                                        Group
Fields/Positions                                              Group
Fields/Positions/                                             Group
    Attribute: ContentType {STRING}
        "http://www.zib.de/visual/F5-0.1.0/RegularGrid-0.1.0/StandardCartesian/Uniform/"
Fields/ImageData                                              Group
Fields/ImageData/                                             Group
    Attribute: ContentType {STRING}
        "http://www.zib.de/visual/F5-0.1.0/RegularGrid-0.1.0/StandardCartesian/Uniform/VertexCentered/"

Basic building blocks

Simple dataset

Examples: curvilinear coordinates, values of a scalar field.

Storing a value for every node in fortran order.

F5 Field ContentTypes

Each list entry describes a template for the Positions part of a field type and the additional data needed to store a field at these positions. The trees reside in an F5 Grid hdf5 group.