File Format Standardization for XAFS Data
Standardization of file formats for data collected at XAS beamlines is a perennial topic of discussion. Although the data from a typical XAS experiment is not especially complicated, and is reasonably well represented by a flat, context-free file, it is attractive to incorporate metadata in a structured way into data files so that they can more readily be exchanged.
There are at least three issues that can be addressed in a discussion of file formats. The first, and most important, is the format of a file containing a single scan of data -- i.e. one spectrum to be processed and analyzed. Here the primary issues are how to identify the various parts of the data, and how to extract meaningful metadata. Much of the metadata can be held in simple strings and small data objects, and so can typically be held in the header of a plain flat file. Deciding what information to include in the header, and how to structure it, is probably the most important task in defining a standard format.
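As an illustration of metadata held in the header of a plain flat file, here is a minimal Python sketch. The header convention it reads ("#" comment lines carrying "Key: value" pairs, followed by a "#" line of column labels) is an assumption made for this example only and is not the format of any existing specification; the function name `read_scan` and the field names and numbers in the sample are likewise hypothetical.

```python
def read_scan(text):
    """Split a flat-file scan into (metadata dict, column labels, data rows).

    Assumed (hypothetical) layout:
      '# Key: value'      -> one metadata entry
      '# name1 name2 ...' -> column labels (a '#' line without a colon)
      anything else       -> one row of whitespace-separated numbers
    """
    metadata, labels, rows = {}, [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith("#"):
            body = line.lstrip("#").strip()
            if ":" in body:
                key, value = body.split(":", 1)
                metadata[key.strip()] = value.strip()
            elif body:
                labels = body.split()
        else:
            rows.append([float(token) for token in line.split()])
    return metadata, labels, rows


# Placeholder content, for illustration only.
example = """\
# Element: Cu
# Edge: K
# energy i0 itrans
8970.0 10000.0 8000.0
8980.0 10010.0 7000.0
"""

metadata, labels, rows = read_scan(example)
print(metadata)
print(labels)
print(rows)
```

Even a convention this simple shows the two decisions a standard has to make: which keys are expected in the header, and how the header identifies each column of data.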
The second possible issue is the storing of data that is related to the individual spectra, but is too large to fit in a reasonable header. Examples would be data for individual fluorescence channels from a multi-element detector, the full MCA spectra from such detectors, or a diffraction pattern measured on the sample. Since the discussion here is focused on XAS data, these topics may be set aside for part or all of this work.
A third possible issue is the storing of multiple spectra that may be tightly or loosely related to one another. This essentially constitutes a Library, such as many researchers keep in any of several ad-hoc ways; such collections are often of very high value.
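One possible way to hold such a Library is in a small SQL database. The sketch below uses Python's standard `sqlite3` module. The table layout, the column names, and the helper functions `open_library`, `add_spectrum`, and `find_spectra` are all hypothetical, chosen only to illustrate the idea; they do not reproduce any schema proposed by the Working Group, and the numbers stored are placeholders.

```python
import json
import sqlite3


def open_library(path=":memory:"):
    """Open (or create) a spectra library; the schema is a hypothetical sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spectrum ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " element TEXT,"
        " edge TEXT,"
        " energy TEXT,"  # JSON-encoded list of energies
        " mu TEXT)"      # JSON-encoded list of absorption values
    )
    return conn


def add_spectrum(conn, name, element, edge, energy, mu):
    """Store one spectrum with its metadata and return its row id."""
    cur = conn.execute(
        "INSERT INTO spectrum (name, element, edge, energy, mu)"
        " VALUES (?, ?, ?, ?, ?)",
        (name, element, edge, json.dumps(energy), json.dumps(mu)),
    )
    conn.commit()
    return cur.lastrowid


def find_spectra(conn, element, edge):
    """Return (name, energy, mu) for every spectrum matching element and edge."""
    rows = conn.execute(
        "SELECT name, energy, mu FROM spectrum"
        " WHERE element = ? AND edge = ? ORDER BY id",
        (element, edge),
    ).fetchall()
    return [(name, json.loads(e), json.loads(m)) for name, e, m in rows]


# Placeholder values, for illustration only.
library = open_library()
add_spectrum(library, "Cu foil", "Cu", "K", [8970.0, 8980.0], [0.22, 0.36])
add_spectrum(library, "Fe foil", "Fe", "K", [7100.0, 7110.0], [0.10, 0.15])
matches = find_spectra(library, "Cu", "K")
print(matches)
```

Keeping the metadata in ordinary columns makes the collection searchable by element or edge, which is the main advantage over a directory of loosely named files.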
These topics are starting to be discussed in detail by an informal Working Group of the IXAS. Members of this group include Bruce Ravel, Armando Sole, James Hester, Gerd Wellenreuther, and Matt Newville, but membership is not closed -- all interested are welcome. Some documents are listed below.
Old Page (c. 2009)
Some time ago, Bruce Ravel and Ken McIvor from MRCAT, APS Sector 10 drafted a proposal for a file format holding a single spectrum together with its metadata. The format was meant to be easy to implement at most extant beamlines and simple to incorporate into most extant data analysis software, and to address most of the metadata concerns that Bruce and Ken were able to identify. Here is their draft of the specification:
Note that the specification calls for the implementation of interface libraries in a variety of common languages as part of the development of the standard.
References, External Links
Matt's ideas on a SQL-based Library format, Matt's development code for this library idea
Bruce's latest code for the XAS Data Interchange format
Talks from January, 2010 Workshop on HDF5 for Synchrotron Data
Q2XAFS 2011 Workshop on Improving Data for XAFS