Description Usage Arguments Value 128-bit numbers and QQIDs Process Input formats Endianness Author(s) See Also Examples
xlt2qq
converts a vector of 128-bit numbers (hexlets) in hexadecimal
notation, UUID format, IPv6 addresses, or MD5 hashes to QQIDs.
xlt2qq(xlt)
xlt    (character) a vector of UUIDs, MD5 hashes, IPv6 addresses, or generally 32-digit hexadecimal numbers
(character) a vector of QQIDs
UUIDs, IPv6 addresses and MD5 hashes are
specially formatted 128-bit numbers, referred to as
hexlets.
Randomly chosen 128-bit numbers have a collision probability that is small
enough to make them useful as (practically) unique identifiers in
applications where a centralized management of IDs is not feasible or not
desirable. However, since they are long strings of numerals and letters
without overt semantic content, they are hard to distinguish by eye. This
creates difficulties when developing or debugging with structured data, or
when curating ID-tagged information. The qqid package provides tools to
convert the leading 20 bits of a 128-bit number to two "Q-words", and the
remaining 108 bits to a string of 18 Base64 encoded characters. The
"Q-words" (the letter Q evokes the word "cue", i.e. a hint or mnemonic)
define a unique and invertible mapping to the 2^10 integers in (0, 1023).
Thus two Q-words can encode 20 bits, or 5 hexadecimal digits:
.
.       [0-9a-f]    [0-9a-f]    [0-9a-f]    [0-9a-f]    [0-9a-f]
. hex: |--0x[1]--| |--0x[2]--| |--0x[3]--| |--0x[4]--| |--0x[5]--|
. bit: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
.      |----------int[1]-----------| |----------int[2]-----------|
. int:           (0, 1023)                     (0, 1023)
. Q:         (aims, ..., zone)      .      (aims, ..., zone)      .  Base64...
.
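The split shown in the diagram can be sketched in base R. This is only an illustration, not the package's own code: hexHead2ints() is a hypothetical helper name, and it assumes the input is already a plain 32-digit lowercase hexadecimal string.

```r
# Hypothetical helper, not part of the qqid package: split the first five
# hexadecimal digits (20 bits) of a plain hexlet into two 10-bit integers.
hexHead2ints <- function(hex) {
  stopifnot(length(hex) == 1, grepl("^[0-9a-f]{32}$", hex))
  head20 <- strtoi(substr(hex, 1, 5), base = 16L)  # 20-bit integer
  c(head20 %/% 1024L,   # int[1]: the high 10 bits
    head20 %% 1024L)    # int[2]: the low 10 bits
}

# Each value lies in (0, 1023). To index a 1024-element R vector of Q-words
# one would presumably add 1, since R vectors are 1-based (an assumption).
hexHead2ints("0c460ed3b015adc2ab4a01e093364f1f")
```

Integer division and modulo by 1024 are used here instead of bit operations only because they keep the sketch short; the result is the same as shifting and masking.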
Input strings are first converted to plain hexadecimal
strings. A leading "0x" is deleted, the "-" and ":" separators of UUIDs and
IPv6 addresses respectively are deleted, and all letters are converted to
lower case. It is an error if the result is not exactly a 32-digit
hexadecimal string matching "[0-9a-f]{32}". The first five hexadecimal
digits are interpreted as two 10-bit numbers and mapped as indices into
the 1024-element Q-word vector. The QQID has two Q-words as its head,
representing digits 1:5 of the input, and 18 Base64 characters as its
tail, encoding digits 6:32 of the input. Since the mapping is fully
reversible, QQIDs have exactly the same statistical properties as the
input. For details on the QQID format see is.QQID().
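To see why digits 6:32 fit into exactly 18 characters: 27 hexadecimal digits carry 108 bits, and each Base64 character carries 6 bits. The sketch below illustrates the arithmetic in base R. It is not the package's implementation: hexTail2b64() is a hypothetical name, and the standard Base64 alphabet (A-Z, a-z, 0-9, "+", "/") is an assumption, so the characters produced may differ from those in a real QQID.

```r
# Hypothetical helper, not part of the qqid package: encode hexadecimal
# digits 6:32 of a plain hexlet as 18 Base64 characters.
hexTail2b64 <- function(hex) {
  stopifnot(length(hex) == 1, grepl("^[0-9a-f]{32}$", hex))
  # ASSUMPTION: standard Base64 alphabet; the package may use another one.
  alphabet <- c(LETTERS, letters, 0:9, "+", "/")
  tail27 <- substr(hex, 6, 32)                      # 27 hex digits = 108 bits
  # Three hex digits (12 bits) correspond to two Base64 characters.
  triplets <- substring(tail27, seq(1, 25, by = 3), seq(3, 27, by = 3))
  v <- strtoi(triplets, base = 16L)                 # nine 12-bit integers
  idx <- as.vector(rbind(v %/% 64L, v %% 64L))      # eighteen 6-bit integers
  paste(alphabet[idx + 1L], collapse = "")
}

nchar(hexTail2b64("0c460ed3b015adc2ab4a01e093364f1f"))
```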
A hexlet comprises 16 octets and is written in the
hexadecimal numeral convention. A canonical MD5 hash is such a string of 32
hexadecimal characters. To improve readability, separators are inserted
into UUIDs: "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", where "x" is a
hexadecimal digit. A canonical expanded IPv6 address has the form
"xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx", where "x" is a hexadecimal
digit. Conventions exist to omit leading zeros in IPv6 addresses, but such
shortened addresses are treated as an error; it is up to the user to
expand them correctly before processing. There are many representations of
hexadecimal numbers; most commonly they have a prefix of "0x". xlt2qq()
converts all letters to lowercase on input.
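The normalization of these input formats can be sketched as follows. This is a minimal illustration under stated assumptions, not the code xlt2qq() actually runs: normalizeHexlet() is a hypothetical helper name, and NA handling is deliberately left out.

```r
# Hypothetical helper, not part of the qqid package: reduce UUIDs, expanded
# IPv6 addresses, MD5 hashes or "0x"-prefixed strings to 32 plain hex digits.
normalizeHexlet <- function(x) {
  x <- sub("^0x", "", x)      # delete a leading "0x"
  x <- gsub("[-:]", "", x)    # delete UUID and IPv6 separators
  x <- tolower(x)             # convert all letters to lower case
  bad <- !grepl("^[0-9a-f]{32}$", x)
  if (any(bad)) {
    stop("not a 32-digit hexadecimal string: ",
         paste(x[bad], collapse = ", "))
  }
  x
}

normalizeHexlet(c("0c460ed3-b015-adc2-ab4a-01e093364f1f",      # UUID
                  "2001:0db8:85a3:0000:0000:8a2e:0370:7334",   # expanded IPv6
                  "0xD41D8CD98F00B204E9800998ECF8427E"))        # prefixed hash
```

A shortened IPv6 address such as "2001:db8::1" has fewer than 32 digits once the separators are removed, so the sketch stops with an error, in line with the behaviour described above.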
The qqid package uses its own functions to convert to and from bits, and is
not affected by big-endian vs. little-endian processor architecture or
variant byte order. All numbers are interpreted to have their lowest-order
digits on the right.
(c) 2019 Boris Steipe, licensed under MIT (see file LICENSE in this package).
qq2uu() to convert a vector of QQIDs to UUIDs.
# Convert three example UUIDs and one NA to the corresponding QQIDs
xlt2qq( c(xltIDexample(c(1, 3, 5)), NA) )

# A random hex-string is converted into a valid QQID
(x <- paste0(sample(c(0:9, letters[1:6]), 32, replace=TRUE), collapse=""))
(x <- xlt2qq(x))
is.QQID(x) # TRUE

# forward and back again
myID <- "0c460ed3-b015-adc2-ab4a-01e093364f1f"
myID == qq2uu(xlt2qq(myID)) # TRUE

# Confirm that the example hexlets are converted correctly
xlt2qq( xltIDexample(1:5) ) == QQIDexample(1:5) # TRUE TRUE TRUE TRUE TRUE