Galaxy_TOC

Bigintreeindb_t1 Encoding Method

WARNING: This is an untested specification, i.e. it lacks implementations.

This document describes the core of the method, how to encode arbitrary precision integers (hereafter: bigint-s) to a plain, relational, database in a format, that allows to execute queries greater-than and smaller-than efficiently.

General scheme:

BigInts are placed to a binary tree. An example of the tree and the root-originating vertex paths:

 Bigint Value as a String The Number of columns that list the X,0,1 equals with the number of levels in the tree minus 1. In the case of this example: 4 - 1 = 3 . 73 0 1 X 74 X X X 75 1 0 X 76 1 0 1 42 0 X X 50 0 1 0 34 0 0 X 22 0 0 0 1984 1 X X

If the X were to be replaced with 0, then the path encoding of 34 (0,0,X) would not differ from 22 (0,0,0). On the other hand, if prior to the replacement X -> 0, the 0 were to be replaced with a "digit" that is smaller than the 0, for example, -1, then the 22 would have one insignificant "negative digit" more than the 34 has.

 `22==(-1)(-1)(-1) < (-1)(-1)(0)==34`

That encoding has 3 sequential integers as its digits: (-1),0,1. The traditional ternary system uses slightly different digits, the 0,1,2, but all the rest of the contemplation stays the same.

`{(-1),0,1} + 1 = {0,1,2}`

 73 0 2 1 74 1 1 1 75 2 0 1 76 2 0 2 42 0 1 1 50 0 2 0 34 0 0 1 22 0 0 0 1984 2 1 1

The order of the numbers that are formed from the series of digits that form the path matches that of the bigints that have been inserted to the tree, regardless of the value of the bigint. As the traditional relational database engines use decimal systems, the base 3 numbers can be converted to base 10 and stored as classical C/C++ unsigned int values, which in turn can be queried by using the traditional SQL.

As the number of levels in the tree, the maximum path length, grows, new C/C++ unsigned int columns can be added without rewriting any of the previously written data, unless a tree balancing operation is performed. All of the new unsigned int columns must be filled with a default value of 1.