Quincey Koziol, 02/15/2011 04:25 pm

Ideal Block-Based VFD Characteristics

Some of these characteristics may be mutually exclusive. Which ones, I don't know yet. Let's elaborate as we flesh out what this thing looks like.

What is the ideal, block-based Virtual File Driver to support PMPIO?
  1. Blocks are either pure meta-data (MD) or pure raw data (RD)
  2. MD blocks can be written throughout the file (i.e. MD doesn't have to grow without bound and be written at close)
  3. Block size of RD controlled independently of MD
  4. Produces a single file out the bottom (not one for RD and one for MD)
  5. Can be re-opened correctly by any standard HDF5 VFD (e.g. sec2)
  6. PMPIO baton handoff is performed on the open file.
    • Means the in-memory state of the file is message-passed to the next processor, rather than written on proc i as the result of a close and then read back on proc i+1 as the result of an open. From an API standpoint, it might still look to the application like it closes and later re-opens the file, but the message-passing optimization happens transparently under the covers.
    • Means that whatever state is passed around does NOT grow as the number of processors the baton is passed between grows.
  7. Is informed by HDF5's higher level MD cache of the N hottest hot spots of MD, where N is a variable chosen by caller
  8. Can handle MD I/O asynchronously (and likely RD I/O as well)
  9. A perfect block-based VFD makes dataset chunking irrelevant (i.e. I/O requests don't correlate at all with data chunks)
  10. Computes diagnostic statistics for performance debugging (e.g. as Silo's VFD currently does)
  11. Can use MPI under the covers to aggregate blocks from different MPI tasks' 'files' into a single, shared file on disk.
  12. Option to ship blocks off processor via MPI message to...
    • Other processors sitting idle within MPI_COMM_WORLD but set aside explicitly to handle I/O
    • Special service software running on the actual I/O nodes of the system
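
To make items 1, 3, 4 and 10 above a bit more concrete, here is a minimal accounting sketch in C. It is an illustration only, not a design: every name in it (blk_vfd_t, blk_vfd_write, and so on) is hypothetical and none of it comes from the HDF5 API. It moves no actual data; it only tracks how bytes of each class would fill pure MD or pure RD blocks of independently chosen sizes, and how filled blocks would be appended to one file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; nothing here is part of the HDF5 API. */
typedef enum { BLK_MD = 0, BLK_RD = 1, BLK_NCLASSES = 2 } blk_class_t;

typedef struct {
    size_t block_size[BLK_NCLASSES];     /* item 3: RD block size set independently of MD */
    size_t fill[BLK_NCLASSES];           /* bytes buffered in the currently open block */
    size_t blocks_flushed[BLK_NCLASSES]; /* item 10: simple diagnostic counters */
    size_t file_eof;                     /* item 4: one file, blocks appended as they fill */
} blk_vfd_t;

static void blk_vfd_init(blk_vfd_t *v, size_t md_size, size_t rd_size)
{
    assert(v != NULL && md_size > 0 && rd_size > 0);
    memset(v, 0, sizeof *v);
    v->block_size[BLK_MD] = md_size;
    v->block_size[BLK_RD] = rd_size;
}

/* Account for nbytes of class c. Because each class has its own open block,
 * every block is pure MD or pure RD (item 1). Each time a block fills, it is
 * "written" at the current end of file. Returns the number of blocks flushed
 * by this call. */
static size_t blk_vfd_write(blk_vfd_t *v, blk_class_t c, size_t nbytes)
{
    size_t flushed = 0;

    assert(v != NULL && c < BLK_NCLASSES);
    v->fill[c] += nbytes;
    while (v->fill[c] >= v->block_size[c]) {
        v->fill[c] -= v->block_size[c];
        v->file_eof += v->block_size[c];
        v->blocks_flushed[c]++;
        flushed++;
    }
    return flushed;
}
```

With 4 KiB MD blocks and 1 MiB RD blocks, a 5000-byte MD write flushes one MD block into the file right away, while a 5000-byte RD write just sits in the open RD block. MD therefore lands throughout the file as blocks fill, rather than all at close (item 2).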

Internal HDF5 lib communication with VFD.

The internal parts of the HDF5 lib can communicate directly with the VFD through what amount to out-of-band read/write messages. Currently, each read or write request carries a mem type tag that indicates the type of memory HDF5 is sending to or requesting from the VFD. We could add new types to this enum for messages between the HDF5 lib proper and the VFD. For example, to send information about hot spots in MD, the HDF5 lib could write data to the VFD with a mem_type of MD_HOT_SPOTS. The VFD would advertise to HDF5 whether it supports out-of-band messaging and, if so, which kinds, so HDF5 would only send such messages to VFDs that claim to support them. This way, it's possible for the HDF5 lib proper to communicate with the VFD without changing the existing VFD API.
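
The out-of-band write path described above might look roughly like the following C sketch. It is deliberately a toy and does not use the real HDF5 headers: mem_type_t, toy_vfd_t, OOB_CAP and lib_send_hot_spots are all assumed names invented for illustration, and the capability bitmask is just one possible way a driver could advertise what it supports.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the library's mem type enum, extended with one
 * out-of-band tag. None of these names come from the HDF5 API. */
typedef enum {
    MEM_METADATA = 0,
    MEM_RAW_DATA,
    MEM_OOB_MD_HOT_SPOTS   /* out of band: library -> VFD */
} mem_type_t;

#define OOB_CAP(t) (1u << (t))   /* one capability bit per out-of-band tag */
#define MAX_HOT_SPOTS 8

typedef struct { unsigned long long addr; size_t len; } hot_spot_t;

typedef struct {
    unsigned   oob_caps;            /* what this driver advertises */
    hot_spot_t hot[MAX_HOT_SPOTS];  /* last hot-spot list received */
    size_t     nhot;
    size_t     bytes_written;       /* ordinary (in-band) traffic */
} toy_vfd_t;

/* The write entry point keeps its existing shape: a tag, an address, a size
 * and a buffer. Only the interpretation of the tag is new. */
static int toy_vfd_write(toy_vfd_t *v, mem_type_t type,
                         unsigned long long addr, size_t size, const void *buf)
{
    assert(v != NULL && buf != NULL);
    if (type == MEM_OOB_MD_HOT_SPOTS) {
        size_t n = size / sizeof(hot_spot_t);
        if (!(v->oob_caps & OOB_CAP(type)) || n > MAX_HOT_SPOTS)
            return -1;              /* unsupported or oversized message */
        memcpy(v->hot, buf, n * sizeof(hot_spot_t));
        v->nhot = n;
        return 0;
    }
    (void)addr;                     /* a real driver would write at addr */
    v->bytes_written += size;
    return 0;
}

/* Library side: send only to drivers that claim support.
 * Returns 1 if sent, 0 if skipped, -1 on error. */
static int lib_send_hot_spots(toy_vfd_t *v, const hot_spot_t *spots, size_t n)
{
    if (!(v->oob_caps & OOB_CAP(MEM_OOB_MD_HOT_SPOTS)))
        return 0;
    return toy_vfd_write(v, MEM_OOB_MD_HOT_SPOTS, 0,
                         n * sizeof *spots, spots) == 0 ? 1 : -1;
}
```

A driver that advertises nothing never sees the new tag, so existing drivers would keep working untouched. That is the point of gating on the advertised capabilities.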

Likewise, the HDF5 lib could request information from the VFD by calling the read method with an appropriate mem_type.
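
The read direction could be sketched the same way. In this toy C example the library pulls diagnostic counters out of the driver by issuing a read with an out-of-band tag. As before, nothing here is real HDF5 API: req_type_t, stat_vfd_t, io_stats_t and REQ_OOB_IO_STATS are hypothetical names, and the I/O statistics payload is just an assumed example of information a VFD might hand back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tags; REQ_OOB_IO_STATS is the out-of-band one (VFD -> library). */
typedef enum { REQ_METADATA = 0, REQ_RAW_DATA, REQ_OOB_IO_STATS } req_type_t;

typedef struct { size_t md_bytes_read; size_t rd_bytes_read; } io_stats_t;

typedef struct {
    unsigned   oob_caps;  /* bit (1u << tag) set for each supported tag */
    io_stats_t stats;     /* counters the driver keeps as it works */
} stat_vfd_t;

/* Read entry point with the usual shape: a tag, an address, a size, a buffer.
 * In-band reads fill the buffer with zeros here (there is no real file) and
 * update the counters; the out-of-band read copies the counters out. */
static int stat_vfd_read(stat_vfd_t *v, req_type_t type,
                         unsigned long long addr, size_t size, void *buf)
{
    assert(v != NULL && buf != NULL);
    (void)addr;           /* a real driver would read from addr */
    switch (type) {
    case REQ_OOB_IO_STATS:
        if (!(v->oob_caps & (1u << type)) || size != sizeof(io_stats_t))
            return -1;    /* not advertised, or caller's buffer is the wrong size */
        memcpy(buf, &v->stats, sizeof(io_stats_t));
        return 0;
    case REQ_METADATA:
        memset(buf, 0, size);
        v->stats.md_bytes_read += size;
        return 0;
    case REQ_RAW_DATA:
        memset(buf, 0, size);
        v->stats.rd_bytes_read += size;
        return 0;
    }
    return -1;            /* unknown tag */
}
```

The caller passes a buffer of exactly sizeof(io_stats_t) bytes, and the driver rejects the request if it has not advertised the tag, mirroring the gating on the write side.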
