
Plaice Series of Hash Functions



Plaice hash functions are designed specifically for web applications. The main difference between classical hash functions (MD5, SHA-256, Tiger, SHAvite-3, BLAKE2, etc.) and Plaice hash functions is that classical hash functions work on bit-streams, whereas Plaice hash functions operate only on characters and on ordinary, CPU-supported, 4-byte integers. Where classical hash functions tend to use XOR, Plaice hash functions tend to use TXOR.




Motivation


Bit-streams depend on the byte endianness and the bit endianness of the CPU. That is to say, there are 4 types of CPUs: big-endian by bytes and little-endian by bits, big-endian by bytes and big-endian by bits, and so on. The optional Unicode byte order mark (BOM) illustrates a case where the same text can have 2 different bit-streams regardless of the endianness of the CPU. For 2 different bit-streams, a bit-stream oriented hash function returns, with overwhelming probability, 2 different hash values. Consequently, a bit-stream oriented hash function can return at least 2 different values for the very same text. (4 CPU types times BOM-or-no-BOM gives up to 8 different hashes.)
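The BOM and byte-order effect can be observed with any bit-stream oriented hash function. The following is a minimal sketch that uses SHA-256 from the Python standard library. The sample text and the choice of encodings are assumptions made for illustration, and the sketch covers only the byte order of the encoded text and the BOM, not the bit endianness of the CPU.

```python
import hashlib

# Hypothetical sample text; any string behaves the same way.
text = "plaice"

# Four byte-level serializations of the very same characters.
encodings = {
    "UTF-8, no BOM": text.encode("utf-8"),
    "UTF-8, with BOM": text.encode("utf-8-sig"),
    "UTF-16 little-endian, no BOM": text.encode("utf-16-le"),
    "UTF-16 big-endian, no BOM": text.encode("utf-16-be"),
}

# A bit-stream oriented hash sees four different inputs.
digests = {
    label: hashlib.sha256(data).hexdigest()
    for label, data in encodings.items()
}

for label, digest in digests.items():
    print(f"{label}: {digest[:16]}...")
```

All four digests differ, although a human reader would consider the four inputs to be the same text.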




Things to Consider when Designing Solutions



If Unicode code points were used for the calculations, a calculation result might equal a code point that has not been assigned to a character. Text editors cannot display code points that are not assigned to a character, string processing routines are likely to fail, etc. The code point of the same character might also have different integer values in different encoding standards. For example, Unicode differs from TRON. Because of these differences, some sort of unifying encoding should be used. The unifying encoding must not contain unassigned code points. To simplify calculations, the lowest code point of the unifying encoding should be 0.

A Possible Sub-solution:

The unifying encoding might be based on an artificial alphabet that is dynamically constructed from the text being hashed. The alphabet assembly algorithm can be part of the hashing algorithm.
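One way to picture such a dynamically constructed alphabet is sketched below. This is not the Plaice_t1 algorithm; the function names build_alphabet and to_indices, and the rule of numbering characters in order of first appearance, are assumptions made only for illustration.

```python
def build_alphabet(text):
    """Map each distinct character of text to a dense index.

    Indices start at 0 and are handed out in order of first
    appearance, so every integer in range(len(alphabet)) is
    assigned to exactly one character.
    """
    alphabet = {}
    for ch in text:
        if ch not in alphabet:
            alphabet[ch] = len(alphabet)
    return alphabet


def to_indices(text, alphabet):
    """Translate text into its sequence of alphabet indices."""
    return [alphabet[ch] for ch in text]


# Hypothetical sample input.
sample = "abracadabra"
alphabet = build_alphabet(sample)
indices = to_indices(sample, alphabet)
print(alphabet)
print(indices)
```

Because the indices are dense and start at 0, any arithmetic carried out modulo the alphabet size yields a value that still corresponds to a character of the alphabet, which avoids the unassigned-code-point problem described above.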






Plaice_t1


Plaice_t1 is the very first, experimental version of the Plaice series of hash functions. Its reference implementation, which is part of the Kibuvits Ruby Library, doubles as its specification: the Ruby code is the specification.









Timestamped 1. version of this document.