README.md

HOQCenc

Han Oostdijk, 12May2021

HOQCenc is an R package containing routines for encryption and decryption of character strings and files. The main function is xcode that encrypts an input string by executing a number of operations (specified by argument trans) when argument ed='e'. When ed='d' is specified decryption will be done by executing the operations in reversed order.

Installation

You can install this version from GitHub with:

# install.packages("devtools")
devtools::install_github("HanOostdijk/HOQCenc",build_vignettes=TRUE) 

Example

In the Details section more examples are given and some background. In this first example we show:

key <- "1VerySecretKey"
message1 <- "Line 1 of a two-line message!\nAs expected a second line."
message2 <- "Line 1 of a two-line message.\nAs expected a second line." 

(s <- xcode(message1,key=key,ed='e',trans='cscvp')) # encrypt message1
#> [1] "tSCLY5l2X4eM81f8Q7D3 gbmK9#hRSCtD5VC4M9CpH2LxQqGKKl4oHoKqc3dqOOT"
xcode(s,key=key,ed='d',trans='cscvp')               # show result of decryption
#> [1] "Line 1 of a two-line message!\nAs expected a second line."
xcode(message2,key=key,ed='e',trans='cscvp')        # encrypt message2
#> [1] "iixEWSv881hYenne#d9VnDEMEowLfD1HrwHyAiHK4qzwbwtneecPdwg1k1OioMOT"

Details

Key and alphabet

When we specify key='' an ordered set is constructed consisting of the lower and upper case letters, the digits 0 till 9 and the space and the # character. This ordered set is our ‘alphabet’: all operations will work only with this set with the exception of the e operation that converts our standard symbols to this new alphabet. When we use the key 1VerySecretKey the characters in this key will be inserted in this set before the others so that the following ordered set will be constructed:

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] "1"  "V"  "e"  "r"  "y"  "S"  "c"  "t" 
[2,] "K"  "a"  "b"  "d"  "f"  "g"  "h"  "i" 
[3,] "j"  "k"  "l"  "m"  "n"  "o"  "p"  "q" 
[4,] "s"  "u"  "v"  "w"  "x"  "z"  "A"  "B" 
[5,] "C"  "D"  "E"  "F"  "G"  "H"  "I"  "J" 
[6,] "L"  "M"  "N"  "O"  "P"  "Q"  "R"  "T" 
[7,] "U"  "W"  "X"  "Y"  "Z"  "0"  "2"  "3" 
[8,] "4"  "5"  "6"  "7"  "8"  "9"  " "  "#" 

In almost all operations (but not the e, s and f operation) this ‘alphabet’ will be used.

Operations

The following operations are defined (we describe only what happens in case of encryption because for decryption the reversed action is performed):

The encryption characteristics of these operations are the following:

So it is advised to combine several options as e.g. done in the example below

key <- "1VerySecretKey"

# small differences in text lead to similar outcomes with simple trans
xcode("abcdefghijklmnopqrstuwvxyz",key=key,ed='e',trans='p')
#> [1] "bdrhybhiKqlmnopqmtB1vxwzSx9339  "
xcode("1bcdefghijklmnopqrstuwvxyz",key=key,ed='e',trans='p')
#> [1] "eKrhybhiKqlmnopqmtB1vxwzSx9339  "
xcode("abcdefghijklmnopqrstuwvxy1",key=key,ed='e',trans='p')
#> [1] "bdrhybhiKqlmnopqmtB1vxwzSV9339  "
xcode("ABabcdefghijklmnopqrstuwvxy1",key=key,ed='e',trans='p')
#> [1] "BsbdrhybhiKqlmnopqmtB1vxwzSV9329"

# small differences in text lead to unrelated outcomes with complex trans
xcode("abcdefghijklmnopqrstuwvxyz",key=key,ed='e',trans='cscvp')
#> [1] "DiQFluQYGgAia1Rck6#8ztKqOFCQupSH"
xcode("1bcdefghijklmnopqrstuwvxyz",key=key,ed='e',trans='cscvp')
#> [1] "B2k1QQQRlZ6GwdAT#kqrJ6K1VXZNwVH "
xcode("abcdefghijklmnopqrstuwvxy1",key=key,ed='e',trans='cscvp')
#> [1] "GTdRmtkyUuNwof6lzcieMmxz3TojYC6v"
xcode("ABabcdefghijklmnopqrstuwvxy1",key=key,ed='e',trans='cscvp')
#> [1] "7nCbDxf2eZMDNr Ywl9bqqJ4Mi2GkXdk"

The e operation

Most of the operations used in the xcode function are working only at the 64 characters of our ordered set of the lower and upper case letters, the digits 0 till 9 and the space and the # character. The e operation:

Because the e operation is necessary for most cases it is always prefixed to the list of operations that is specified in the trans argument.

xcode('Deze #!',key='',ed='e',trans='')
#> [1] "Deze #23#21#00  "
xcode('example!',key='',ed='e',trans='')
#> [1] "example#21#00#00"
xcode('example!',key='',ed='e',trans='p')
#> [1] "hueiimh832833899"
xcode('example!',key='',ed='e',trans='a')
#> [1] "f8028741438cf02b46c7c993804538ef"

Only when the noe argument is explicitly set to TRUE this is not done.

xcode('example!',key='',ed='e',trans='',noe=T)
#> [1] "example!"
xcode('example!',key='',ed='e',trans='p',noe=T)
#> Error in playfair(text, ed): invalid characters in text of `p` operation: `!`
xcode('example ',key='',ed='e',trans='p',noe=T) # but with space instead of !
#> [1] "hueiimg8"
xcode('example ',key='',ed='e',trans='a',noe=T)
#> Error in aes$encrypt(text): Text length must be a multiple of 16 bytes

The p operation

With the p operation a Playfair transformation is done that acts on pairs of characters. With key='' the following ‘alphabet’ is used

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] "a"  "b"  "c"  "d"  "e"  "f"  "g"  "h" 
[2,] "i"  "j"  "k"  "l"  "m"  "n"  "o"  "p" 
[3,] "q"  "r"  "s"  "t"  "u"  "v"  "w"  "x" 
[4,] "y"  "z"  "A"  "B"  "C"  "D"  "E"  "F" 
[5,] "G"  "H"  "I"  "J"  "K"  "L"  "M"  "N" 
[6,] "O"  "P"  "Q"  "R"  "S"  "T"  "U"  "V" 
[7,] "W"  "X"  "Y"  "Z"  "0"  "1"  "2"  "3" 
[8,] "4"  "5"  "6"  "7"  "8"  "9"  " "  "#" 

and therefore “ex” is converted to "hu’ because ‘h’ and ‘u’ lie on the opposite corner of the rectangle though ‘e’ and ‘x’. Special cases:

Advantage of Playfair is that a frequency distribution on character level makes little sense because the method works on pairs of characters.

xcode('example ',key='',ed='e',trans='p',noe=T)
#> [1] "hueiimg8"
xcode("hueiimg8",key='',ed='d',trans='p',noe=T)
#> [1] "example "

The s and f operations

The s (shuffle) operation shuffles the characters in such a way that 0123456789abcdef is translated to 0f2d4b6987a5c3e1 and the f (flip) operation flips the characters in such a way that 0123456789abcdef is translated to fedcba9876543210.

xcode('0123456789abcdef',key='',ed='e',trans='s',noe=T)
#> [1] "0f2d4b6987a5c3e1"
xcode('0123456789abcdef',key='',ed='e',trans='f',noe=T)
#> [1] "fedcba9876543210"

The v operation

The v (Vigenère) operation shifts the i-th character a number of places in the ‘alphabet’. That number is determined by the position of the i-th letter in the ‘alphabet’. In the first example the ‘alphabet’ is ABCabcdef ... and therefore the shifts for the letters ‘a’, ‘b’ etc. are 4, 5 etc. resulting in ‘4’, ‘6’ etc. In the second example the ‘alphabet’ is abcdef ... and therefore the shifts for the letters ‘a’, ‘b’ etc. are 1, 2 etc. resulting in ‘1’, ‘3’ etc.

xcode('0123456789abcdef',key='ABC',ed='e',trans='v',noe=T)
#> [1] "468 ACbdfhoqsuwy"
xcode('0123456789abcdef',key='abc',ed='e',trans='v',noe=T)
#> [1] "13579#bdfhlnprtv"

The c operation

The c operation shifts the first character a number of places in the ‘alphabet’. That number is again determined by the position of the first letter in the ‘alphabet’. In the first example the ‘alphabet’ is abcdef ... and therefore the shifts for the letter ‘a’ is 1 resulting in ‘1’ . The shift for the next position is therefore determined by the position of ‘1’ in the ‘alphabet’ and that is 44. So the next position (the ‘1’) is shifted by 44 positions in the ‘alphabet’ resulting in ‘R’ . And so on … The main advantage of this method is that the encryption of a character is determined by all of its predecessors. The weak point is that only the position of the ‘a’ in the ‘alphabet’ has to be guessed or tried by brute force and this is very easy. So this method should never be used on its own.

xcode('0123456789abcdef',key='abc',ed='e',trans='c',noe=T)
#> [1] "1RIAtnieb#acfjou"

The a operation

The a uses the digest::AES function with the mode='ECB' method. The method works in blocks of 16 characters: blocks that are identical with the exception of one character will be encrypted to very unrelated results. However if a block occurs more than once, each block is identically encrypted. So also this method should never be used on its own.

xcode('0123456789abcdef',key='abc',ed='e',trans='a',noe=T)
#> [1] "747f22502381a3fb7eb0cb42cb5f6612"
xcode('1123456789abcdef',key='abc',ed='e',trans='a',noe=T)
#> [1] "85762ea7ba16b3a881fc3144898bcdec"
xcode('0123456789abcdee',key='abc',ed='e',trans='a',noe=T)
#> [1] "4ada7b6fcbeb0f93d9fdb0d731074152"

xcode('0123456789abcdef0123456789abcdef',key='abc',ed='e',trans='a',noe=T)
#> [1] "747f22502381a3fb7eb0cb42cb5f6612747f22502381a3fb7eb0cb42cb5f6612"

The h operation

The h uses the Hill transformation. The method works in blocks of 4 characters: blocks that are identical with the exception of one character will be encrypted to very unrelated results. Each character is transformed to its rank in the ordered set. A linear transformation is then done on these four numbers by applying a matrix multiplication and the resulting numbers (modulo 64) are used as rank numbers in the ordered set to determine the four output characters. The matrix used is a fixed matrix. The examples show (again) that it is necessary to combine this operation with other ones to make it more safe.

xcode('0123456789abcdef',key='abc',ed='e',trans='h',noe=T)
#> [1] "3y 8tOwy3 YazqsE"
xcode('1123456789abcdef',key='abc',ed='e',trans='h',noe=T)
#> [1] "4yb8tOwy3 YazqsE"
xcode('0123456789abcdee',key='abc',ed='e',trans='h',noe=T)
#> [1] "3y 8tOwy3 YayprB"

xcode('0123456789abcdef0123456789abcdef',key='abc',ed='e',trans='h',noe=T)
#> [1] "3y 8tOwy3 YazqsE3y 8tOwy3 YazqsE"

Encrypting raw vectors

The xcode function can also handle raw vectors

text     <- "this will be raw text"
(textraw  <- charToRaw(text))
#>  [1] 74 68 69 73 20 77 69 6c 6c 20 62 65 20 72 61 77 20 74 65 78 74
(s<-xcode(textraw,key='abc',ed='e',trans='cscp'))
#> [1] "wZMbUPjpN2jYeY X6cUZEVuQpUm3ijNKUV7l4FoHp BuNBSYYnbJz5t2pTfS5SYHJUTWDSrP1kqWBaJ#"
(s<-xcode(s,key='abc',ed='d',trans='cscp'))
#>  [1] 74 68 69 73 20 77 69 6c 6c 20 62 65 20 72 61 77 20 74 65 78 74
rawToChar(s) 
#> [1] "this will be raw text"

Reading and writing files

To ease reading and writing of files with and without encryption four functions are defined:

To demonstrate the various possibilities we first define text to work on and generate the name of a test file:

text     <- "This is a 21st century example of an English text."
textraw  <- charToRaw(text)

tfile <- tempfile(pattern = "file", tmpdir = tempdir(), fileext = ".txt")

Example of working with asci text (remember that the encrypted text is always asci):

# write asci with encoding
write_enc_ascfile(text,tfile,encode=T)
# read without decoding
(read_enc_ascfile(tfile,decode=F))
#> [1] "iFLbqk9MTET mpNazXQsSBBJLgPEU3LF1J5iuesj9#dMJsf7aTtkp32Slzk6Tfno"
# read with decoding
(read_enc_ascfile(tfile,decode=T))
#> [1] "This is a 21st century example of an English text."

Again asci text but now using the binary mode functions:

# write asci in binary mode with encoding
write_enc_binfile(text,tfile,encode=T)
# read without decoding and no raw -> character conversion
(read_enc_binfile(tfile,decode=F,raw2char=F))
#>  [1] 69 46 4c 62 71 6b 39 4d 54 45 54 20 6d 70 4e 61 7a 58 51 73 53 42 42 4a 4c
#> [26] 67 50 45 55 33 4c 46 31 4a 35 69 75 65 73 6a 39 23 64 4d 4a 73 66 37 61 54
#> [51] 74 6b 70 33 32 53 6c 7a 6b 36 54 66 6e 6f 00
# read without decoding and with raw -> character conversion
(read_enc_binfile(tfile,decode=F,raw2char=T))
#> [1] "iFLbqk9MTET mpNazXQsSBBJLgPEU3LF1J5iuesj9#dMJsf7aTtkp32Slzk6Tfno"
# read with decoding and no raw -> character conversion
(read_enc_binfile(tfile,decode=T))
#> [1] "This is a 21st century example of an English text."

Now with binary data and therefore the write_enc_binfile and read_enc_binfile functions must be used:

# write raw text binary mode with encoding
write_enc_binfile(textraw,tfile,encode=T)
# read without decoding and no raw -> character conversion
(read_enc_binfile(tfile,decode=F,raw2char=F))
#>   [1] 4c 77 43 78 37 23 56 72 61 68 67 44 32 72 6c 55 50 42 79 46 4d 75 59 35 43
#>  [26] 53 56 6d 42 4f 53 41 5a 6d 57 73 44 53 49 56 4f 4c 58 76 75 39 4f 20 57 36
#>  [51] 20 53 61 4a 43 59 66 54 79 54 67 63 71 36 49 61 23 33 78 46 4a 68 6c 73 6a
#>  [76] 75 4a 78 6d 76 75 64 6c 67 33 38 39 4a 4d 45 79 4e 6d 74 39 30 51 44 48 54
#> [101] 55 72 69 44 32 70 6d 51 44 4d 50 77 44 78 67 6d 7a 4d 4f 37 6d 44 20 69 33
#> [126] 69 56 5a 58 32 4c 5a 57 56 63 42 5a 6c 58 42 35 54 31 74 56 4e 57 38 6e 78
#> [151] 49 63 31 55 74 70 56 4f 56 69 00
# read without decoding and with raw -> character conversion
(read_enc_binfile(tfile,decode=F,raw2char=T))
#> [1] "LwCx7#VrahgD2rlUPByFMuY5CSVmBOSAZmWsDSIVOLXvu9O W6 SaJCYfTyTgcq6Ia#3xFJhlsjuJxmvudlg389JMEyNmt90QDHTUriD2pmQDMPwDxgmzMO7mD i3iVZX2LZWVcBZlXB5T1tVNW8nxIc1UtpVOVi"
# read with decoding and no raw -> character conversion
(read_enc_binfile(tfile,decode=T,raw2char=F))
#>  [1] 54 68 69 73 20 69 73 20 61 20 32 31 73 74 20 63 65 6e 74 75 72 79 20 65 78
#> [26] 61 6d 70 6c 65 20 6f 66 20 61 6e 20 45 6e 67 6c 69 73 68 20 74 65 78 74 2e
# read with decoding and with raw -> character conversion
(read_enc_binfile(tfile,decode=T,raw2char=T))
#> [1] "This is a 21st century example of an English text."


HanOostdijk/HOQCenc documentation built on Dec. 17, 2021, 10:28 p.m.