knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "README-" ) library(utf8splain)
bytes( "hello 🌍" )
If you run it in a crayon compatible terminal, for example
a recent enough version of rstudio, the
Each rune is encoded in a variable number of bytes, depending on how far it is in the table, for example "h" (and all other ascii characters) only need one byte, but 🌍 needs 4 bytes.
utf-8 bytes are organised as follows: - the first byte of a rune starts with as many 1 as the rune needs bytes, followed by a 0, e.g. the first rune for the utf-8 encoded 🌍 starts with "11110", and the only byte of the encoded "h" starts with "0" - the remaining bytes (if any) all start with "10"
All the bits that are not taken are used to store the binary representation of the rune,
for example the 7 bits "1101000" follow the initial "0" in the encoding of "h". 🌍 correspond to the rune U+1F30D, i.e. the rune number
world_decimal <- strtoi( "0x1F30D", base = 16) world_decimal world_binary <- paste( substr(as.character( rev(intToBits(world_decimal)) ), 2, 2 ), collapse = "" ) world_binary world_binary_signif <- sub( "^0+", "", world_binary ) world_binary_signif nchar(world_binary_signif)
So 🌍 needs
r nchar(world_binary_signif) bits, which needs 4 utf-8 bytes.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.