1. For efficiency inside the server, entire rows (i.e., tuples) are passed around in 32 kbyte row buffers. If the UDT is to be stored in a multi-column table, the sum of the sizes of all the columns must not exceed 32k. This 32k limit does not affect the UDT size limit directly, but it does influence the choice of max length (item #2, next).
2. When you define a variable-length opaque type, you can specify a maximum allowable length with a MAXLEN modifier. If you don't specify anything, the default value is 2 kbytes.
3. If you intend to make your UDT indexable with an Rtree, a copy of the UDT data will be stored in the leaf pages of the index. The Rtree index page size is 2 kbytes, but after allowing for data structure overhead there are approximately 1960 bytes available. Furthermore, the Rtree index will not work if there are not at least two UDTs on each page, so you should restrict the UDT size to less than or equal to 1/2 of this amount. In the example code we have chosen 960 bytes.
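The size arithmetic above can be written down as a small C sketch. The constant and function names are illustrative assumptions and are not taken from the example code; only the numbers come from the text.

```c
#include <stddef.h>

/* Figures quoted in the text; all names here are hypothetical. */
enum {
    ROW_BUFFER_BYTES     = 32 * 1024, /* server row buffer */
    DEFAULT_MAXLEN_BYTES = 2 * 1024,  /* default when MAXLEN is omitted */
    RTREE_PAGE_USABLE    = 1960,      /* approx. usable bytes per Rtree page */
    RTREE_MIN_ENTRIES    = 2,         /* at least two UDTs must fit on a page */
    MYUDT_MAX_BYTES      = 960        /* limit chosen in the example code */
};

/* Largest UDT size that still lets RTREE_MIN_ENTRIES fit on one page. */
static size_t rtree_max_udt_bytes(void)
{
    return (size_t)RTREE_PAGE_USABLE / (size_t)RTREE_MIN_ENTRIES;
}

/* Returns 1 if a UDT of this size respects the chosen limit, else 0. */
static int fits_in_row(size_t udt_bytes)
{
    return udt_bytes <= (size_t)MYUDT_MAX_BYTES;
}
```

Half of 1960 is 980 bytes, so the chosen 960 leaves a small margin below the Rtree bound.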
Because of these size limitations, it is necessary to convert UDTs which are 'too big' into large objects. This means that the variable-length portion of the UDT data will get stored in Smart Blob Space (i.e., sbspace), and a handle to the large object (i.e., lohandle) will get stored in the actual database table (i.e., in-row). This is the concept of multi-representation: the data comprising a UDT can mean different things, depending on how the actual user data is stored.
The DataBlade API contains a certain amount of support for dealing with multirep data, but the structures and library routines are somewhat difficult to use. Hence this example, which shows how to implement a multirep UDT.
MyUdt is a variable-length opaque type which internally comprises an
array of double-precision floating point values.
A bare minimum of SQL functions are provided to show how to deal with multirep data:
MyUdtIn and MyUdtOut, the text input and output routines,
respectively.
Assign, Destroy, LOhandles
NumElements and Element.
AppendElement and ReplaceElement.
The MI_MULTIREP_DATA structure, which is provided in the
public header file (milo.h) and used throughout the code, is
inadequate by itself. Because it is a
simple union, you must supply a tag field outside the structure to
determine which side of the union is being used. This is the purpose of
the mrep_state field in the MyUdtData structure.
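The idea of an external tag can be sketched as follows. This is a minimal illustration under stated assumptions: the union below is a stand-in and does NOT reproduce the real MI_MULTIREP_DATA layout from milo.h, and every name other than MyUdtData and mrep_state is hypothetical.

```c
#include <stddef.h>

#define INROW_CAPACITY 4  /* hypothetical; small only to keep the sketch short */

/* Hypothetical tag values, one per storage form. */
typedef enum {
    MREP_INROW,   /* array data lives inside the UDT itself */
    MREP_BUFFER,  /* array data lives in a separately allocated buffer */
    MREP_LO       /* array data lives in a large object */
} MrepState;

/* Stand-in for the untagged union; not the real milo.h definition. */
typedef union {
    double        *buffer;         /* meaningful only for MREP_BUFFER */
    unsigned char  lo_handle[72];  /* placeholder for a large-object handle */
} MrepUnion;

typedef struct {
    MrepState mrep_state;             /* the tag kept outside the union */
    size_t    num_elements;
    MrepUnion mrep;                   /* which side is valid depends on the tag */
    double    inrow[INROW_CAPACITY];  /* meaningful only for MREP_INROW */
} MyUdtData;

/* Returns the in-memory array, or NULL when the data must be pinned first. */
static double *memory_array(MyUdtData *udt)
{
    switch (udt->mrep_state) {
    case MREP_INROW:  return udt->inrow;
    case MREP_BUFFER: return udt->mrep.buffer;
    case MREP_LO:     return NULL;  /* large object: not in memory yet */
    }
    return NULL;
}
```

Without the tag, code reading the union could not tell a buffer pointer from the bytes of a handle.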
Assign routine is called
to convert a 'big' UDT to a large object.
That is because there is no way to reliably determine if a UDT will be
stored in a table until the server calls your Assign routine.
At that time you can determine what sbspace the large object should go into
by examining information about the column; hence the large object will be
placed in the correct sbspace. If you converted a big UDT to a large
object in the text input routine, you would be making two mistakes:
MyUdtData structure. Strictly speaking, this form is
not needed: you could always place the array data in a separate buffer
(see next form), but if you have a lot of UDT instances containing
small arrays you will have to allocate a buffer for the data in the
text input routine, and then in the Assign routine you will
have to allocate another lvarchar and copy the data into it.
MyUdtData
structure. It is important to allocate this buffer with
PER_COMMAND memory duration so it will survive all
subsequent user-defined routine calls until the Assign
routine is called and the buffer can be converted
to a large object. The reason for this form is to get around the size
restriction (UDT max length, item #2 in the introduction above)
of the lvarchar that is returned to the server.
MyUdtData structure.
To access the large object data, it must first be pinned
(see below). For better performance, the pinned data should be cached,
but that is beyond the scope of this example.
All of the server-callable blade routines (user-defined routines, or UDRs)
must be able to handle any of the above forms. This is why UDRs which need
to access the floating point array call PinMultirepData first.
PinMultirepData call with an
UnpinMultirepData call; otherwise your blade may appear to
leak memory. The pin and unpin calls as implemented here take care of all
forms of multirep data and do all the necessary memory management.
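The pin/unpin pairing can be illustrated with a self-contained sketch. It is not the example's PinMultirepData implementation: the type, the function names, and the use of malloc and a plain array in place of server memory and sbspace are all assumptions made so the code can run outside the server.

```c
#include <stdlib.h>
#include <string.h>

typedef enum { FORM_INROW, FORM_BUFFER, FORM_LO } Form;

typedef struct {
    Form          form;
    size_t        count;      /* number of array elements */
    double        inrow[4];   /* used when form is FORM_INROW */
    double       *buffer;     /* used when form is FORM_BUFFER */
    const double *lo_store;   /* stand-in for large-object contents */
    double       *pinned;     /* temporary copy, owned between pin and unpin */
} Udt;

/* Returns a readable array for any form; may allocate for FORM_LO. */
static const double *pin_array(Udt *u)
{
    switch (u->form) {
    case FORM_INROW:  return u->inrow;
    case FORM_BUFFER: return u->buffer;
    case FORM_LO:
        if (u->count == 0)
            return NULL;
        u->pinned = malloc(u->count * sizeof *u->pinned);
        if (u->pinned != NULL)
            memcpy(u->pinned, u->lo_store, u->count * sizeof *u->pinned);
        return u->pinned;
    }
    return NULL;
}

/* Releases whatever pin_array allocated; safe to call for every form. */
static void unpin_array(Udt *u)
{
    free(u->pinned);
    u->pinned = NULL;
}

/* A routine in the style of a UDR: pin, use the data, always unpin. */
static double sum_elements(Udt *u)
{
    const double *data = pin_array(u);
    double total = 0.0;
    size_t i;

    if (data != NULL)
        for (i = 0; i < u->count; i++)
            total += data[i];
    unpin_array(u);
    return total;
}
```

Because unpin_array is harmless for the in-row and buffer forms, the caller never needs to check which form it was given.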
MyUdtIn) will return an lvarchar containing the actual
data, and the Assign routine will not have to do anything.
Assign routine to insert the UDT into the table, it must
convert the buffer to a large object and store the lohandle in the UDT.
Assign routine will never be called, and the
NumElements routine will operate on a UDT whose data is stored
in a per-command memory buffer.
NumElements
routine will get called to return the number of array elements; finally
the Element routine gets called to return one array element.
The Assign routine will never get called because the UDT
is not stored in a table.
AppendElement, which creates
a new UDT, with the array data stored in a per-command memory buffer.
The Assign routine will convert this buffer to a large object.
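The decision the Assign routine has to make can be sketched as a pure function. The enum names and the function are hypothetical, the 960-byte limit is applied to the array data alone (any header overhead is ignored), and the text does not say what happens to a UDT that is already a large object, so that branch is only a placeholder.

```c
#include <stddef.h>

#define MAX_INROW_BYTES 960u  /* limit chosen in the example code */

typedef enum { FORM_INROW, FORM_BUFFER, FORM_LO } Form;

typedef enum {
    ASSIGN_NOTHING,     /* UDT can be stored as it is */
    ASSIGN_COPY_INROW,  /* copy buffer contents into the returned UDT */
    ASSIGN_CREATE_LO    /* create a large object and store its handle */
} AssignAction;

/* Decides what an Assign-style routine would do for a given UDT. */
static AssignAction assign_action(Form form, size_t num_elements)
{
    size_t bytes = num_elements * sizeof(double);

    switch (form) {
    case FORM_INROW:
        return ASSIGN_NOTHING;
    case FORM_BUFFER:
        return bytes <= MAX_INROW_BYTES ? ASSIGN_COPY_INROW
                                        : ASSIGN_CREATE_LO;
    case FORM_LO:
        return ASSIGN_NOTHING;  /* placeholder: not covered by the text */
    }
    return ASSIGN_NOTHING;
}
```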
ReplaceElement routine.
Even though the size of the UDT is not being changed, you
need to make a copy of the original UDT and let the Assign
routine convert the new UDT back into a large object if necessary. This
is because if the original UDT was a large object, you must pin the old
data to bring it into memory, modify the data, and create a new large
object. A more sophisticated implementation of this routine would
modify the large object data in-place, via mi_lo_writewithseek.
This exercise is left to the reader.
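The copy-then-modify approach described for ReplaceElement can be sketched as below. The function name and signature are assumptions, and malloc stands in for a per-command allocation; the source array represents data that has already been pinned.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of src with one element replaced, or
 * NULL if the index is out of range or the allocation fails. The
 * original array is left untouched; the caller owns the result.
 */
static double *replace_element_copy(const double *src, size_t count,
                                    size_t index, double value)
{
    double *copy;

    if (index >= count)
        return NULL;
    copy = malloc(count * sizeof *copy);
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, count * sizeof *copy);
    copy[index] = value;
    return copy;
}
```

The new array then plays the role of the memory-buffer form, which an Assign-style routine can turn back into a large object if it is too big to store in-row.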