Galaxy_TOC

Link to LightMSGP specifications index.







LightMSGP Specification Version 2 (LightMSGP_v2)


WARINING: This specification is a work in progess and it currently lacks an implementation that tests all of its aspects.


LightMSGP_v2 is meant to fix the fundamentally flawed addressing part of the LightMSGP_v1.



Addressing Scheme



The adressing space consists of an infinitely large graph(inspired by the work of Dave Ackley), where each vertex has a maximum, finite, degree of Deg_max. Each of the vertices has an ID that is a whole number from the range between 0 and Deg_max. That is to say, there are only Deg_max+1 different ID-s. None of the vertices in that infinitely large graph has any immediate neighbours that share an ID or has the same ID with any of its immidiate neighbours.

Addresses are paths from one vertex to another.

2 addresses are considered to be equivalent, when their starting points match and their end points match.
2 addresses are considered to be equal, when the paths match.
Addresses are allowed to contain loops and cycles.
Addresses are allowed to start and end at the same vertex.






Translation of the Addressing Scheme
to Container Based data Structures


Each container becomes a centre vertex of a star graph, where the elements of the container are the leaves of the tree that has the center vertex of the star graph as its root. The elements of a container can be other containers. From software design perspective a container can be a gateway to elements that reside outside of the container.

In the case of object oriented programming, an instance can be in the role of a container and its public methods can be in the role of a set of elements that reside in the container. An operating system process is a container that contains instances, a computer contains operating system processes, the Internet contains computers and all instances in all the computers of the world can send messages, data, to all other instances in all other computers on the wild-wild-internet, including the computers of aliens out of our solar system. Hence the infinite address space.





Routing


Suppose a package is on route with an address of 13-56-34-24. It means that the creator of the package is vertex number 13 and the destination of the package is vertex number 24. Suppose the subgraph of the infinitely large graph that contains the vertex number 13 looks like the following graph:

If the vertex number 56 knows that the address 13-56-34-24 is equivalent to the address 13-56-63-24, then the vertex number 56 has the liberty to reroute the package by passing it on to vertex number 63 in stead of passing it on to the vertex number 34. The vertex number 56 has also the liberty to clone the package by sending a copy of it to both routes, one copy to the vertex number 34 and the other to the vertex number 63. Each package has a package ID and all clones of the package have the same package ID.

To determine the actual path travelled at package arrival, each package has 2 FIFOs as part of its header. One of the FIFOs, the address FIFO, consists of addresses, which are stored as arrays of vertex ID-s and that have the route starting point stored to index 0. The other FIFO is for address cursors and it has exactly as many elements in it as the address FIFO has. The FIFO of address cursors consists of whole numbers that are indices of the arrays that reside at the FIFO of addresses(paths in graph).

If vertex X_0 is directly connected to vertex X_1 and the X_0 sends a package directly, over that connection, to the vertex X_1, then the vertex X_1 receives a package, where the top-most array of the address FIFO describes the current route that the package is currently travelling and the topmost element of the FIFO of address cursors is a whole number that equals to the array index, where the X_1 is marked at the top-most element of the address FIFO. If the X_1 decides to keep the package at its current route, then it only increments the topmost element of the FIFO of address cursors and passes the package on to the next vertex at the address.

If the vertex X_1 changes the route of the package, then the content of the 2 FIFOs is updated only by pushing a new address and a new address cursor to the tops of the FIFOs. Prefix of the new address must match with the prefix of the old address. That way the ultimate destination vertex of the package can read the actual route of the package by reading the topmost element of the address FIFO. The rest of the elements of the 2 FIFOs distribute information about alternative routes between the two end-points of the package route. An illustration:





The Rational


Computers as containers of operating system processes connect and disconnect from the network at random, have changing IP-addresses. A same computer can run some semi-random number of multiple instances of the same application that has instantiated more than one applicatinon specific instances. For example, a computer can run multiple web browser instances that each load the same web page and JavaScript program at more than one tab. The instances that were initialized within the JavaScript programs of all of the tabs need some way for communicating with each other, from tab to tab within the same web browser, from browser to browser within the same computer, with JavaScript instances that run at other web browser instances at other computers, and so on and so forth. All of that should run reliably, with somewhat minimized amount of control signal related communication, and at an environment, where communication channels, for example, internet connections, are un-reliable.

Another class of problems is that "big" computers, which are 64bit computers in 2015, should be able to communicate with the "Internet of Things" 8bit microcontrollers. The microcontrollers at some device should be able to communicate with each-other and all of that should somehow work together without encountering a situation, where it turns out that an adressing scheme that involves multiple computers, does not cover all of the real life situations. The fact that some computers, instances within a program, etc. can be connected with each other by more than one probabilistically existent channels, should also be utilized to increase the probability that there exists at least one channel between the instances. That is the motive behind the very general approach and the search for theoretical limits.

Relative paths are necessary to avoid infinite memory space requirements for either vertex ID-s or absolute paths. If the vertex ID-s were finite, then in the case of an infinite address space some absolute paths would be infinitely large. If the absolute paths were finite, then, for example, if the root node were the center of some inter-galactic star graph, in the case of an infinitely large address space the star graph would have infinitely many leaves and that requires some vertex ID-s to have infinitely many bits. However, if the degree of a vertex is allowed to be at least 3, then it is NOT possible to reach all vertices in a finite number of steps (a show killer for inter-galactic distributed hash tables that implement Tor, GNUnet, Bitmessage, Freenet, I2P, some-alien-UFO-dating-network), but it is possible to take infinitely large shortcuts, as explained at the following illustration:

As a matter of fact, in an graph with infinitely many vertices that have a finite maximum degree there are infinitely many subgraphs that can not be reached in finite number of steps, because the path to any one of the vertices of the sub-graph is infinitely long. To prevent all data from the inter-galactic distributed hash-table from overwhelming a computer that has a finite amount of memory, the distributed hashtable should avoid propagating data to all of its "infinite corners". That is to say, data must stay somewhat local. The way to assure data locality within the infinite graph is to assign a finite number of hops that a data package is allowed to propagate and distributed hashtable nodes that are not spam-bots, decrement that number before passing the data packet on.

That explains one of the reasons, why different search engines are not able to offer the same search results even, if they used the same ranking system, the same settings and the same search engine software.



Data Package Format


The exact representation of the data package can be translated on the fly. For example, it might be "armoured" like it is done by the GNU Privacy Guard, it might be some structured text formalt like XML, JSON, ProgFTE, YAML, it might be in some binary structured format like the HDF, it might be a folder with files that have been compressed by some file compression algorithm, it might be an instance of an application specific class, a runtime hash-table data structure, an array, a line at a relational database table, etc.

The data package is a record with the following mandatory fields:

1_byte_package_maximum_size_class A single byte, where bit endianness is encoded by setting the highest bit to 1 and the lowest bit to 0. The rest of the 8b-2b=6b represent a whole number N that is used for calculating i_package_max_size_in_bytes=2^N with an exception that the value 2^63 stands for package maximum size of 2^63B or more. Data package header size is included to the package size.

The motive for having this field is that it allows a microcontroller to warn a huge alien computer that it is not capable of accepting the package that the huge alien computer offers it. On 2015 planet Earth that field might be useful with connected embedded devices.
1_byte_package_header_encoding_format_class Bit endianness is encoded the same way as it is encoded with the 1_byte_package_maximum_size_class and the meaning of the rest of the bits is a whole number N.

Range [0..20] from the values of the N have been reserved for "standardized" encodings that will be described in this specification.
1_byte_protocol_type_number_byte_0 Bit endianness is encoded the same way as it is encoded with the 1_byte_package_maximum_size_class and the meaning of the rest of the bits is a whole number N.

If N=63, then there exists also 1_byte_protocol_type_number_byte_1 that has the same format as this one. The values of N of all of the 1_byte_protocol_type_number_byte_X must be added to get the encoded version. The values 0..20 are reserved to the various versions of the LightMSGP_vX. The LightMSGP_v2 has the value of 2.

i_protocol_channel_number A positive whole number.

0 stands for an ordinary data package with a request to report all delivery failures.

1 stands for an ordinary data package with a request to omit delivery failure related feedback.

20 stands for a message that informs that one of the vertices on the path of the data package found the data package size (1_byte_package_maximum_size_class) to be too big.

21 stands for a message that informs that one of the vertices on the path of the data package did not support the encoding (1_byte_package_header_encoding_format_class) of the data package.

22 stands for a message that informs that one of the vertices on the path of the data package found that all of its known routes to the destination of the data package are broken due to broken connections.

The mismatch of the 1_byte_protocol_type_number_byte_0 does not deserve to be reported. The useage of this field is explained at the section Handshaking at the "First Contact"

x_session_ID A random number, which might be in a form of a string that might be a GUID.

The routing protocol does not require this field to be part of the package header, but its presence at the header allows packages of closed/old sessions to be dismissed without analysing the data of the package. Its presence at the header is a speed optimization hack that also simplifies the client code a little.
x_msg_ID A random number, which might be in a form of a string that might be a GUID.
x_address_FIFO One of the FIFOs that was described at the Routing section of this specification.
x_address_cursor_FIFO One of the FIFOs that was described at the Routing section of this specification.
i_max_number_of_hops A whole number that offers vertices an opportunity to smartly limit the flooding of data. The number of currently jumped hops equals with the value of the topmost element of the x_address_cursor_FIFO.
x_data_format A random number or a string. Its meaning is application specific.


Optionally applications may add the data to the data package header record by adding a field named x_data. The hahdshaking at the "first contact" includes an exchange of the values of the 1_byte_package_maximum_size_class and the 1_byte_package_header_encoding_format_class.





Handshaking at the "First Contact"

Handshaking can not be standardized in this specification, because there needs to be at least some information available before-hand, how an Earthly microcontroller can communicate with an alien supercomputer. After the physical interfaces, network level protocols, error-correction code formats, encryption issues and the rest of that kind of issues have been taken care of, the 2 vertices are expected to exchange the values of 1_byte_package_maximum_size_class and 1_byte_package_header_encoding_format_class. That exchange can take place repeatedly, because it might be that one of the vertices gets an upgrade, degrades gracefully or recovers from graceful degradation. In many cases the handshaking does not make any sense. For example, in the case of the container based data structures instance methods are seen as vertices and any necessary handshaking on a larger scale might be handled by some hack at the instance or application level.

If the
20<=i_protocol_channel_number
and the vertex decides to adhere to the reccommendations of this specification, then the reccommended way for a fault-detectiong vertex VX_F0 to act is to decrement the uppermost element of the x_address_cursor_FIFO by one, clone the uppermost element of the x_address_FIFO, modify the clone by throwing away the suffix that comes after the ID of the VX_F0, reverse the array so that the ID of the VX_F0 ends up at index 0, create a new data package, where the x_data equals with the edited version of the undelivered data package and where the i_protocol_channel_number indicates the fault and send the newly created data package away, except, when the value of the i_protocol_channel_number of the data package that the VX_F0 failed to deliver, equaled 1. In either case, the VX_F0 is expected not to save any copies of the undeliverable packages, neither is it expected to save any copies of the fault feedback packages.

The vertex, VX_F1, that receives the failure feedback form the VX_F0 is expected to reroute x_data field of the failure feedback carrying data package or just pass the failure feedback data package on to the next vertex at its destination address. If the VX_F1 fails to pass on the failure feedback data package, then it is expected to discard it and forget about it.