[MINC-development] [Fwd: data format]

John G. Sled minc-development@bic.mni.mcgill.ca
Fri, 22 Nov 2002 12:40:45 -0500


This is a message that Art Wetzel had sent describing the data layout
they are using at the Pittsburgh Supercomputing Center to serve up the
visual human.  From what I understand, some of the details are still
in flux, but it does give some insight into their approach.

cheers,

John


############################
The basic element of data going to the client is a cube.  Since the
client is in control of the entire interaction every cube is delivered
by explicit client request.  For the Vis Female we currently provide
4 scales of cubes all composed of 8*8*8 = 512 voxels.  The scales are
just multiples of 2 so we have full resolution, 1/2 res, 1/4 res and
1/8 res cubes.  The 1/8 res cubes for example enclose what was originally
a full scale 64*64*64 voxel region.  Therefore, each of the 512 voxels
in a 1/8 scale cube corresponds to the low pass filtering of an 8*8*8
voxel region from the full scale data.  In going from a low res cube
to the next higher res (by a factor of 2) there are 8 possible octant
cubes to choose from.  These subcubes nestle into the corners of the
larger cubes.
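A sketch of the addressing this scheme implies, using the cube size and power-of-2 scales described above (the function names and coordinate conventions are mine, not the actual PSC code):

```python
CUBE = 8  # voxels per cube edge at every scale

def cube_index(voxel_xyz, scale):
    """Which cube (at scale 1, 2, 4 or 8) contains a full-resolution voxel?
    At scale s, one 8*8*8 cube encloses a (CUBE*s)^3 full-res region."""
    return tuple(v // (CUBE * scale) for v in voxel_xyz)

def octant(child_cube_xyz):
    """Which of the 8 corners of its parent a child cube nestles into:
    a 3-bit code built from the low bit of each cube coordinate."""
    x, y, z = child_cube_xyz
    return (x & 1) | ((y & 1) << 1) | ((z & 1) << 2)
```

For example, full-res voxel (100, 200, 300) falls in 1/8-res cube (1, 3, 4), since each such cube covers a 64*64*64 region.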

Parameters such as the number of res levels and the cube size can be
changed, and we will want different choices in some special circumstances.  For
example, with substantially larger data sets we'll want more levels and
possibly larger cubes.  One wants larger cubes in order to use compression
methods with longer supports.  One wants smaller cubes to minimize
the amount of unused data in extracting cutting planes.  Clearly with
the 8*8*8 cubes we actually get 8 times as much data as needed to construct
any cutting plane (ignoring interpolation implications).  This is offset
by two things: A) taking advantage of 3D compression correlations, and
B) the fact that we are presuming interactive navigation with rotations,
or at least a lot of region sweeps, which will eventually use the extra
data that stays in the client-side cache (flushed by LRU criteria).  The
client cache is maintained as an octree which is traversed as needed to
construct the cut plane image.
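The LRU flushing can be sketched in a few lines (this is only the eviction policy; the real client keeps the cached cubes in an octree for cut-plane traversal, which this sketch leaves out):

```python
from collections import OrderedDict

class CubeCache:
    """Client-side cube cache flushed by LRU, keyed by (scale, cube index)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cubes = OrderedDict()

    def get(self, key):
        if key in self.cubes:
            self.cubes.move_to_end(key)   # mark most recently used
            return self.cubes[key]
        return None                        # miss: client must request the cube

    def put(self, key, cube):
        self.cubes[key] = cube
        self.cubes.move_to_end(key)
        while len(self.cubes) > self.capacity:
            self.cubes.popitem(last=False)  # evict least recently used
```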

The client is responsible for requesting those cubes which intersect the
current cutting plane.  We always retrieve the coarsest level of cubes
first to provide both a quick display and also a basis for some of the
hierarchical compression methods.  Every client request therefore consists
of a scale together with a cube number from the full object bounding box.
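The client-side intersection test can be sketched as a plane-versus-box check over the cube grid at one scale (a brute-force illustration, assuming a unit plane normal; the names and the exhaustive loop are mine):

```python
def cubes_on_plane(normal, point, scale, grid_shape, cube=8):
    """Enumerate cube indices at one scale whose volume intersects the
    cutting plane through `point` with unit normal `normal`."""
    nx, ny, nz = normal
    edge = cube * scale                            # full-res voxels per cube edge
    half = edge / 2.0
    radius = half * (abs(nx) + abs(ny) + abs(nz))  # box extent projected onto normal
    hits = []
    for i in range(grid_shape[0]):
        for j in range(grid_shape[1]):
            for k in range(grid_shape[2]):
                cx, cy, cz = (i + 0.5) * edge, (j + 0.5) * edge, (k + 0.5) * edge
                d = nx * (cx - point[0]) + ny * (cy - point[1]) + nz * (cz - point[2])
                if abs(d) <= radius:               # signed distance within box radius
                    hits.append((i, j, k))
    return hits
```

Each hit then becomes one request of (scale, cube number), issued coarsest scale first.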

Now the issue of data organization at the server...

I've run compatible servers using several different data organizations
not to mention compression methods.  For example, when working with
new data that one wants to see quickly, but for just a single user,
it's convenient to just concatenate all the planar transverse (or
whatever base orientation you have) images and let the server extract
individual cubes (which we sometimes call chads) from the solid block.
In that mode the easiest thing given our 64bit server machine is to
just memory map the whole thing and let the server code extract voxels
to compose each cube in real time as requested by the client.  Of course
this mode suffers from poor speed since it is effectively page faulting
new data from disk and will produce very bad disk seek patterns for
some orientations.  Also, it is only compatible with compression
methods that can be done on the fly by the server.  Nevertheless,
this is convenient for a quick check when working with brand new data
before going to the trouble of building a good data structure.

The better method explicitly reorganizes and maintains the data in cube
form on the server.  In the current best variant on this strategy the data
is collected into contiguous blocks corresponding to the coarse cube
size together with all of its finer detail descendents.  Again, the
data is all memory mapped but when a paging op occurs it applies to
all scales of a particular cubic volume at once.  The linear layout
then is something like 8 4 2 11111111 2 11111111 ..... 4 2 11111111 etc,
following a depth-first recursive tree.  Plain text probably isn't the
best way to show it, but I hope you get the idea.  In the uncompressed
4 scale hierarchy this whole block is mostly devoted to the 512 scale 1 cubes
containing a total of 262144 voxels.  It also contains 64 scale 2 cubes,
8 scale 4 cubes and the single level 8 parent cube.  Therefore, without
compression the whole hierarchy is ~1.143 times as big as just the
original scale 1 voxels.  Another advantage of this organization is
that it can use 16 bit offsets to index scale 1 data and only needs
a small number of full 64 bit pointers to tie together the high levels.
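The sizes quoted above check out with a few lines of arithmetic (cube counts per scale taken from the description):

```python
CUBE_VOXELS = 8 * 8 * 8            # 512 voxels per cube at every scale

# Cubes per scale inside one coarse block, scales 8, 4, 2, 1:
levels = [1, 8, 64, 512]
total_voxels = sum(n * CUBE_VOXELS for n in levels)   # all four scales together
scale1_voxels = 512 * CUBE_VOXELS                     # 262144 = 64*64*64
print(total_voxels / scale1_voxels)                   # 1.142578125, i.e. ~1.143
```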

Finally compression ...

I have been working a number of variations on compression techniques
for both lossless and lossy storage and transmission.  Given a fast
network link the system (like the release version you've tested)
operates pretty well with no explicit compression other than the
apparent visual compression produced by the hierarchy and the coarse
to fine retrieval order.  When navigating quickly enough to get
just scale 8 data this is really a crude 512:1 compressive effect.

Next there are methods which operate by simply compressing the data within
each cube without making use of the overall hierarchy.  These include
3D DCT forming a JPEG type blocked transform and also some of the
short support wavelets.  When using the wavelets in this mode the only
hierarchy is the 3 octaves needed to reduce the 8*8*8 cube to a single voxel.
At the most reasonable visual quality levels but where lossy methods
are acceptable the 3D DCT within block compression works well out to
more than 20:1.   In the case of the Visible Female data the reversible
integer wavelets are only able to get ~2:1 as limited by data SNR.
Other than Haar the wavelets also have block bounds issues which limit
their effectiveness in this mode.

The better methods take advantage of the coarse to fine hierarchy in
a multiresolution sense.  Again there's DCT and short wavelets.
Using DCT's and coding each level as a difference between an interpolated
expansion of the coarser parent and the true target we get good results
in excess of 30:1 compression.  In operation the effective rate is
much higher yet when navigating quickly enough to stay at the scale
2 or even scale 4 levels.  This is the version that we'll put out for
the end of November public release upgrade.  I'm working on a variation
of that to deliver lossless data by using an approximate integer
DCT that will give exactly the same result on all machines to be
combined with a final precomputed "lossless overlay".  This gives
the speed advantage of lossy compression for navigation but still can
give lossless delivery for analysis or a soon to be provided data
save facility so you can export data to other applications on the
client end.
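The level-difference idea is simple to sketch: expand the coarser parent data and code only the residual against the true finer cube.  For brevity this sketch uses nearest-neighbor expansion, which keeps everything integer-exact; the actual coder described above interpolates:

```python
import numpy as np

def upsample2(parent_octant):
    """Nearest-neighbor expansion of a 4*4*4 coarse octant to 8*8*8."""
    return parent_octant.repeat(2, axis=0).repeat(2, axis=1).repeat(2, axis=2)

def encode_level(child, parent_octant):
    # Code the finer cube as its difference from the expanded parent.
    return child - upsample2(parent_octant)

def decode_level(residual, parent_octant):
    return upsample2(parent_octant) + residual
```

The residual is what gets compressed; when navigation stays at coarse scales the finer residuals are never fetched at all, which is where the much higher effective rates come from.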

In the hierarchical modes the wavelet methods extend beyond a single
cube resolution on up the full hierarchy.  However, the block size
does create some awkwardness with boundary effects, so it's hard to use
things like the 9/7 filter that is so popular in 2D image compression.
It's much more straightforward to use the TS (2-6) transform, which
is a reversible integer wavelet as used by the Ricoh CREW system.
In that case a low res boundary layer that is missing where a higher
res cube fits into the corner of its coarser parent can be sent
along with the child coefficients to complete the transform support.
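To illustrate what "reversible integer wavelet" means here, the S transform, a simpler relative of the TS (2-6) transform used by CREW, fits in four lines and reconstructs integers exactly (this is not the TS (2-6) filter itself, just its short-support cousin):

```python
def s_forward(a, b):
    """Reversible integer S transform on a sample pair."""
    low = (a + b) >> 1    # floor of the average: the coarse sample
    high = a - b          # integer detail coefficient
    return low, high

def s_inverse(low, high):
    a = low + ((high + 1) >> 1)   # undo the floor exactly
    return a, a - high
```

The floor in the forward average loses nothing because the detail coefficient carries the missing bit, which is what makes lossless delivery possible.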

This explanation leaves out the reversible color space transforms,
both the one from CREW and another that I like better, and also
handling noncubic voxels such as the Visible Male and particularly
the new 70mm film rescan which is 1mm in z but 0.144mm in x & y.
Also I've not done an implementation with embedding methods due
to issues presented by the random access requirement.

You ask about discrete segmentation data.  Eventually I want to put
in a set of (probably 16 bit) code numbers that will label the
voxels.  This will be a 0 terminated octree that fits into the
regular data hierarchy.  While we're still working on segmentation,
however, and the voxel labeling is changing, it's much more convenient
to store it in a separate data structure that drives a separate
server process.  That is what is shown as the "Voxel server software"
in Figure 3 of the Draft PSC Visible Human Overview that I sent
last week.  In the current incarnation however, that returns the
URL for a mesh model of the appropriate tissue rather than a coded
ID number.

As you say this whole area is something we can talk about.  We have
enough flexibility to handle a lot of variants on the theme so long
as the server and client can coordinate.  We're putting in a startup
negotiation phase to be able to run a lot of varieties to test the
effectiveness of different compressions, bandwidths, application
requirements etc.