Developer Tips

Installation

Install instructions can be found in the README file.

Hyracks Data Mapping

Hyracks supports several basic data types stored in byte arrays. The byte arrays are accessed through objects referred to as pointables. A pointable tracks where a value's bytes sit within a larger storage array. Some pointables can also convert the bytes into a desired format, such as a numeric type. The most basic pointable stores three values:

  • byte array
  • starting offset
  • length
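The three values can be pictured as a small view object over a shared array. The following is a minimal sketch only: SimplePointable and its methods are hypothetical names for illustration, not the Hyracks classes, and the big-endian byte order used for the numeric view is an assumption.

```java
// Hypothetical sketch: SimplePointable is an illustrative name, not a Hyracks class.
final class SimplePointable {
    private byte[] bytes; // shared storage array (not copied)
    private int start;    // offset of this value's first byte
    private int length;   // number of bytes belonging to this value

    // Point at a slice of an existing array.
    void set(byte[] bytes, int start, int length) {
        if (start < 0 || length < 0 || start + length > bytes.length) {
            throw new IllegalArgumentException("slice out of bounds");
        }
        this.bytes = bytes;
        this.start = start;
        this.length = length;
    }

    int getStartOffset() { return start; }

    int getLength() { return length; }

    // Numeric view of the slice; big-endian byte order is an assumption.
    int getInteger() {
        if (length != 4) {
            throw new IllegalStateException("expected a 4-byte value");
        }
        return ((bytes[start] & 0xff) << 24)
                | ((bytes[start + 1] & 0xff) << 16)
                | ((bytes[start + 2] & 0xff) << 8)
                | (bytes[start + 3] & 0xff);
    }

    public static void main(String[] args) {
        byte[] storage = {9, 9, 0, 0, 1, 2, 9}; // the value occupies bytes 2..5
        SimplePointable p = new SimplePointable();
        p.set(storage, 2, 4);
        System.out.println(p.getInteger()); // prints 258
    }
}
```

Because the object only records where the value lives, the same instance can be pointed at a different slice without allocating or copying.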

In VXQuery, the TaggedValuePointable is used to read a result from this byte array. The first byte defines the data type and tells us which pointable to use for reading the rest of the data.
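The tag-then-dispatch idea can be sketched as follows. This is an illustration under stated assumptions: TaggedValueReader and the tag constants are hypothetical (the real VXQuery tag values are not listed here), and big-endian byte order is assumed.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of reading a tagged value; not the VXQuery implementation.
final class TaggedValueReader {
    // Made-up tag constants for illustration only.
    static final byte TAG_BOOLEAN = 1;
    static final byte TAG_INT = 2;
    static final byte TAG_INTEGER = 3;

    // Reads the tag byte at 'start', then decodes the payload that follows it.
    static Object read(byte[] bytes, int start) {
        byte tag = bytes[start];
        int payload = start + 1;
        ByteBuffer buf = ByteBuffer.wrap(bytes); // big-endian by default
        switch (tag) {
            case TAG_BOOLEAN:
                return bytes[payload] != 0;      // 1-byte payload
            case TAG_INT:
                return buf.getInt(payload);      // 4-byte payload
            case TAG_INTEGER:
                return buf.getLong(payload);     // 8-byte payload
            default:
                throw new IllegalArgumentException("unknown tag " + tag);
        }
    }

    public static void main(String[] args) {
        byte[] data = {TAG_INT, 0, 0, 0, 42};
        System.out.println(read(data, 0)); // prints 42
    }
}
```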

Fixed Length Data

Fixed length data types are stored in a field of a set size. The following table lists each data type, the Hyracks or custom VXQuery pointable used to read it, and the size of the stored value in bytes.

Data Type               Pointable Name        Data Size (bytes)
xs:boolean              BooleanPointable      1
xs:byte                 BytePointable         1
xs:date                 XSDatePointable       6
xs:dateTime             XSDateTimePointable   12
xs:dayTimeDuration      LongPointable         8
xs:decimal              XSDecimalPointable    9
xs:double               DoublePointable       8
xs:duration             XSDurationPointable   12
xs:float                FloatPointable        4
xs:gDay                 XSDatePointable       6
xs:gMonth               XSDatePointable       6
xs:gMonthDay            XSDatePointable       6
xs:gYear                XSDatePointable       6
xs:gYearMonth           XSDatePointable       6
xs:int                  IntegerPointable      4
xs:integer              LongPointable         8
xs:negativeInteger      LongPointable         8
xs:nonNegativeInteger   LongPointable         8
xs:nonPositiveInteger   LongPointable         8
xs:positiveInteger      LongPointable         8
xs:short                ShortPointable        2
xs:time                 XSTimePointable       8
xs:unsignedByte         ShortPointable        2
xs:unsignedInt          LongPointable         8
xs:unsignedLong         LongPointable         8
xs:unsignedShort        IntegerPointable      4
xs:yearMonthDuration    IntegerPointable      4
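Because every size is known in advance, the offset of each fixed length value can be computed with simple arithmetic. The sketch below packs an xs:short, an xs:int, and an xs:double back to back and reads the last one by offset. FixedLengthLayout is a hypothetical helper, and the byte order (Java's big-endian default) is an assumption rather than a statement about the Hyracks encoding.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of laying out fixed length values by size.
final class FixedLengthLayout {
    static final int SHORT_SIZE = 2;  // xs:short
    static final int INT_SIZE = 4;    // xs:int
    static final int DOUBLE_SIZE = 8; // xs:double

    // Writes the three values one after another into a new array.
    static byte[] write(short s, int i, double d) {
        ByteBuffer buf = ByteBuffer.allocate(SHORT_SIZE + INT_SIZE + DOUBLE_SIZE);
        buf.putShort(s).putInt(i).putDouble(d);
        return buf.array();
    }

    // The double starts right after the short and the int.
    static double readDouble(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getDouble(SHORT_SIZE + INT_SIZE);
    }

    public static void main(String[] args) {
        byte[] bytes = write((short) 1, 2, 2.5);
        System.out.println(bytes.length);      // prints 14
        System.out.println(readDouble(bytes)); // prints 2.5
    }
}
```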

Variable Length Data

Some information cannot be stored in a fixed length value. The following data types are stored in variable length values. Because the size varies, the first two bytes store the length of the value in bytes, so the total size is 2 + length. QName is the one exception to this rule because it has three distinct variable length fields. In that case we are storing three strings one right after another.

Please note that all strings are stored in UTF8. The UTF8 characters range in size from one to three bytes. UTF8StringWriter supports writing a character sequence into the UTF8StringPointable format.

Data Type               Pointable Name        Data Size (bytes)
xs:anyURI               UTF8StringPointable   2 + length
xs:base64Binary         XSBinaryPointable     2 + length
xs:hexBinary            XSBinaryPointable     2 + length
xs:NOTATION             UTF8StringPointable   2 + length
xs:QName                XSQNamePointable      6 + length
xs:string               UTF8StringPointable   2 + length
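A two-byte length prefix followed by the encoded bytes is also the layout produced by the standard Java DataOutputStream.writeUTF method, which makes it a convenient way to experiment with the "2 + length" size. The sketch below is not the UTF8StringWriter code, and whether its bytes match the UTF8StringPointable format exactly is an assumption. LengthPrefixedString is a hypothetical helper name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch of a length-prefixed string using standard Java streams.
final class LengthPrefixedString {
    // Encodes the string as a 2-byte length followed by the character bytes.
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            new DataOutputStream(out).writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads the 2-byte length prefix; it counts the bytes after the prefix.
    static int payloadLength(byte[] bytes, int start) {
        return ((bytes[start] & 0xff) << 8) | (bytes[start + 1] & 0xff);
    }

    // Decodes the string whose length prefix begins at 'start'.
    static String decode(byte[] bytes, int start) {
        int total = 2 + payloadLength(bytes, start);
        try {
            return new DataInputStream(
                    new ByteArrayInputStream(bytes, start, total)).readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // One 1-byte, one 2-byte, and one 3-byte character.
        byte[] bytes = encode("a\u00e9\u20ac");
        System.out.println(payloadLength(bytes, 0)); // prints 6
        System.out.println(bytes.length);            // prints 8
    }
}
```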

String Iterators

For many string functions, we use string iterators to traverse the string. The iterator lets the caller ignore the details of byte size and character count. Each call returns the next character or an end of string value. Iterators can be stacked to transform the string into a desired form.
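The iterator idea, including stacking, can be sketched as below. All names here (CharIterator, Utf8BytesIterator, UpperCaseIterator, the END marker) are hypothetical and are not the VXQuery iterator classes. The sketch assumes a two-byte length prefix followed by well-formed characters of one to three bytes.

```java
// Hypothetical sketch of stacked string iterators; not the VXQuery classes.
final class IteratorDemo {
    // Collects every character an iterator produces into a String.
    static String drain(CharIterator it) {
        StringBuilder sb = new StringBuilder();
        for (int c = it.next(); c != CharIterator.END; c = it.next()) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = {0, 3, 'a', (byte) 0xC3, (byte) 0xA9}; // length 3: 'a' + U+00E9
        CharIterator it = new UpperCaseIterator(new Utf8BytesIterator(bytes, 0));
        System.out.println(drain(it).length()); // prints 2
    }
}

interface CharIterator {
    int END = -1; // made-up end of string marker

    int next();   // next character, or END when the string is exhausted
}

// Walks the encoded bytes, hiding how many bytes each character uses.
final class Utf8BytesIterator implements CharIterator {
    private final byte[] bytes;
    private final int end;
    private int pos;

    // 'start' points at the two-byte length prefix.
    Utf8BytesIterator(byte[] bytes, int start) {
        int len = ((bytes[start] & 0xff) << 8) | (bytes[start + 1] & 0xff);
        this.bytes = bytes;
        this.pos = start + 2;
        this.end = start + 2 + len;
    }

    public int next() {
        if (pos >= end) {
            return END;
        }
        int b = bytes[pos++] & 0xff;
        if (b < 0x80) {
            return b;                                                  // 1 byte
        }
        if ((b & 0xE0) == 0xC0) {
            return ((b & 0x1F) << 6) | (bytes[pos++] & 0x3F);          // 2 bytes
        }
        int c = ((b & 0x0F) << 12) | ((bytes[pos++] & 0x3F) << 6);     // 3 bytes
        return c | (bytes[pos++] & 0x3F);
    }
}

// Stacked on top of another iterator to alter what it returns.
final class UpperCaseIterator implements CharIterator {
    private final CharIterator in;

    UpperCaseIterator(CharIterator in) {
        this.in = in;
    }

    public int next() {
        int c = in.next();
        return c == END ? END : Character.toUpperCase(c);
    }
}
```

The function using the iterator never sees byte offsets. It only asks for the next character, so a lower-casing or trimming iterator could be stacked in the same way.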

Array Backed Value Store

The array backed value store is a key design element of Hyracks. The object is used to manage an output array. The system creates an array large enough to hold your output, growing it if necessary. The array can be reused, and it can hold multiple pointable results because each pointable carries its own starting offset.
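The reuse pattern can be sketched with standard Java streams. ValueStoreSketch is a hypothetical stand-in, not the Hyracks class: it uses a ByteArrayOutputStream as the growable array, and its toByteArray method returns a copy where a real store would more likely expose the backing array directly.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical stand-in for an array backed output store.
final class ValueStoreSketch {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(buffer);

    // Forget the current contents but keep the allocated array for reuse.
    void reset() { buffer.reset(); }

    int getLength() { return buffer.size(); }

    byte[] toByteArray() { return buffer.toByteArray(); }

    // Appends a value and returns the offset where it starts.
    int appendInt(int v) {
        int start = buffer.size();
        try {
            out.writeInt(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return start;
    }

    int appendLong(long v) {
        int start = buffer.size();
        try {
            out.writeLong(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return start;
    }

    public static void main(String[] args) {
        ValueStoreSketch store = new ValueStoreSketch();
        int first = store.appendInt(7);    // starts at offset 0
        int second = store.appendLong(9L); // starts at offset 4
        byte[] bytes = store.toByteArray();
        System.out.println(ByteBuffer.wrap(bytes).getInt(first));   // prints 7
        System.out.println(ByteBuffer.wrap(bytes).getLong(second)); // prints 9
        store.reset();                     // ready for the next result
    }
}
```

Each returned offset, together with the value's size, is what a pointable would need to locate one result among several in the same array.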