library("RProtoBuf")
options("width" = 90)
\tableofcontents
Protocol Buffers are a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more.
Protocol Buffers offer key features such as an efficient data interchange format that is both language- and operating system-agnostic yet uses a lightweight and highly performant encoding, object serialization and de-serialization, as well as data and configuration management. Protocol Buffers are also forward compatible: updates to the \texttt{proto} files do not break programs built against the previous specification.
While benchmarks are not available, Google states on the project page that in comparison to XML, Protocol Buffers are at the same time \textsl{simpler}, three to ten times \textsl{smaller}, twenty to one hundred times \textsl{faster}, as well as less ambiguous and easier to program.
The Protocol Buffers code is released under an open-source (BSD) license. The Protocol Buffer project (\url{http://code.google.com/p/protobuf/}) contains a C++ library and a set of runtime libraries and compilers for C++, Java and Python.
With these languages, the workflow follows the standard practice of so-called Interface Description Languages (IDL) (cf. \href{http://en.wikipedia.org/wiki/Interface_description_language}{Wikipedia on IDL}). This consists of compiling a Protocol Buffer description file (ending in \texttt{.proto}) into language-specific classes that can be used to create, read, write and manipulate Protocol Buffer messages. In other words, given the \texttt{proto} description file, code is automatically generated for the chosen target language(s). The project page contains a tutorial for each of these officially supported languages: \url{http://code.google.com/apis/protocolbuffers/docs/tutorials.html}
Besides the officially supported C++, Java and Python implementations, several projects have been created to support Protocol Buffers in many other languages. A list of languages known to support Protocol Buffers is maintained on the project page: \url{http://code.google.com/p/protobuf/wiki/ThirdPartyAddOns}
The Protocol Buffer project page contains a comprehensive description of the language: \url{http://code.google.com/apis/protocolbuffers/docs/proto.html}
This section describes how to use the R API to create and manipulate protocol buffer messages in R, and how to read and write the binary \emph{payload} of the messages to files and arbitrary binary R connections.
In contrast to the other languages (Java, C++, Python) that are officially supported by Google, the implementation used by the \texttt{RProtoBuf} package does not rely on the \texttt{protoc} compiler (with the exception of the two functions discussed in the previous section). This means that no initial step of statically compiling the proto file into C++ code that is then accessed by R code is necessary. Instead, \texttt{proto} files are parsed and processed \textsl{at runtime} by the protobuf C++ library---which is much more appropriate for a dynamic language.
The \texttt{readProtoFiles} function allows importing \texttt{proto} files in several ways.
args(readProtoFiles)
Using the \texttt{file} argument, one can specify one or several file paths that ought to be proto files.
pdir <- system.file("proto", package = "RProtoBuf")
pfile <- file.path(pdir, "addressbook.proto")
readProtoFiles(pfile)
With the \texttt{dir} argument, which is ignored if \texttt{file} is supplied, all files in the given directory that match the \texttt{.proto} extension are imported.
dir(pdir, pattern = "\\.proto$", full.names = TRUE)
readProtoFiles(dir = pdir)
Finally, with the \texttt{package} argument (ignored if \texttt{file} or \texttt{dir} is supplied), the function will import all \texttt{.proto} files that are located in the \texttt{proto} sub-directory of the given package. A typical use for this argument is in the \texttt{.onLoad} function of a package.
readProtoFiles( package = "RProtoBuf" )
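The \texttt{.onLoad} use mentioned above might look like the following minimal sketch. The file name \texttt{R/zzz.R} and the client package itself are hypothetical; the sketch assumes that package ships its \texttt{.proto} files in \texttt{inst/proto/} so that they end up in the \texttt{proto} sub-directory once installed, and that it imports \texttt{RProtoBuf}.

```r
# Hypothetical R/zzz.R of a client package that ships .proto files
# in inst/proto/ (an assumption made for this illustration).
.onLoad <- function(libname, pkgname) {
  # Import every .proto file found in the proto/ sub-directory of the
  # installed package, so its descriptors are available after loading.
  RProtoBuf::readProtoFiles(package = pkgname)
}
```

Passing \texttt{pkgname} rather than a hard-coded string keeps the hook valid if the package is renamed.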
Once the proto files are imported, all message descriptors are available in the R search path in the \texttt{RProtoBuf:DescriptorPool} special environment. The underlying mechanism used here is described in more detail in section~\ref{sec-lookup}.
ls("RProtoBuf:DescriptorPool")
The objects contained in the special environment are descriptors for their associated message types. Descriptors will be discussed in detail in another part of this document, but for the purpose of this section, descriptors are just used with the \texttt{new} function to create messages.
p <- new(tutorial.Person, name = "Romain", id = 1)
Once the message is created, its fields can be queried and modified using the dollar operator of R, making protocol buffer messages seem like lists.
p$name
p$id
p$email <- "francoisromain@free.fr"
However, as opposed to R lists, no partial matching is performed and the name must be given entirely.
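The contrast with R lists can be illustrated with a short sketch. It assumes the \texttt{tutorial.Person} descriptor is available after loading the package, as in the examples above; the abbreviated name \texttt{nam} is chosen purely for illustration, and the access is wrapped in \texttt{tryCatch} because the exact behaviour on an unknown name is not specified here.

```r
library(RProtoBuf)

p <- new(tutorial.Person, name = "Romain", id = 1)
l <- list(name = "Romain", id = 1)

# Base R lists resolve the abbreviated name "nam" to "name".
l$nam

# A message does not resolve the abbreviation to the field; whatever
# happens instead (error or otherwise) is captured rather than assumed.
res <- tryCatch(p$nam, error = function(e) conditionMessage(e))
res
```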
The \verb|[[| operator can also be used to query and set fields of a message, supplying either their name or their tag number:
p[["name"]] <- "Romain Francois"
p[[ 2 ]] <- 3
p[[ "email" ]]
Protocol buffers include a 64-bit integer type, but R lacks native 64-bit integer support. A workaround is available and described in Section~\ref{sec:int64} for working with large integer values.
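The limitation can be seen with base R alone. The sketch below does not use any \texttt{RProtoBuf} function; it only shows why values beyond the native ranges need special handling.

```r
# Base R only: no RProtoBuf calls are needed for this illustration.

# Native integers are 32-bit, so this is the largest one available.
.Machine$integer.max

# Larger values are stored as doubles, which have a 53-bit significand:
# 2^53 + 1 cannot be represented and collapses onto 2^53.
big <- 2^53
(big + 1) == big

# The exact decimal expansion of the stored double confirms the loss.
sprintf("%.0f", big + 1)
```

Carrying such values as character strings instead avoids this silent rounding.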
Protocol buffer messages and descriptors implement \texttt{show} methods that provide basic information about the message :
p
For additional information, such as for debugging purposes, the \texttt{as.character} method provides a more complete ASCII representation of the contents of a message.
cat(as.character(p))
However, the main focus of protocol buffer messages is efficiency. Therefore, messages are transported as a sequence of bytes. The \texttt{serialize} method is implemented for protocol buffer messages to serialize a message into the sequence of bytes (a raw vector in R parlance) that represents the message.
serialize( p, NULL )
The same method can also be used to serialize messages to files :
tf1 <- tempfile()
tf1
serialize( p, tf1 )
readBin(tf1, raw(0), 500)
Or to arbitrary binary connections:
tf2 <- tempfile()
con <- file(tf2, open = "wb")
serialize(p, con)
close(con)
readBin(tf2, raw(0), 500)
\texttt{serialize} can also be used in a more traditional object oriented fashion using the dollar operator :
# serialize to a file
p$serialize(tf1)
# serialize to a binary connection
con <- file(tf2, open = "wb")
p$serialize(con)
close(con)
The \texttt{RProtoBuf} package defines the \texttt{read} function to read messages from files, raw vectors (the message payload) and arbitrary binary connections.
args(read)
The binary representation of the message (often called the payload) does not contain information that can be used to dynamically infer the message type, so we have to provide this information to the \texttt{read} function in the form of a descriptor :
message <- read(tutorial.Person, tf1)
cat(as.character(message))
The \texttt{input} argument of \texttt{read} can also be a binary readable R connection, such as a binary file connection:
con <- file(tf2, open = "rb")
message <- read(tutorial.Person, con)
close(con)
cat(as.character(message))
Finally, the payload of the message can be used :
# reading the raw vector payload of the message
payload <- readBin(tf1, raw(0), 5000)
message <- read( tutorial.Person, payload )
\texttt{read} can also be used as a pseudo method of the descriptor object :
# reading from a file
message <- tutorial.Person$read(tf1)
# reading from a binary connection
con <- file(tf2, open = "rb")
message <- tutorial.Person$read(con)
close(con)
# read from the payload
message <- tutorial.Person$read(payload)
The \texttt{RProtoBuf} package uses the S4 system to store information about descriptors and messages, but the information stored in the R object is very minimal and mainly consists of an external pointer to a C++ variable that is managed by the \texttt{proto} C++ library.
str(p)
Using the S4 system allows the \texttt{RProtoBuf} package to dispatch methods that are not generic in the S3 sense, such as \texttt{new} and \texttt{serialize}.
The \texttt{RProtoBuf} package combines the \emph{R typical} dispatch of the form \verb|method( object, arguments)| and the more traditional object oriented notation \verb|object$method(arguments)|.
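The equivalence of the two notations can be checked with a short sketch, assuming the \texttt{tutorial.Person} descriptor is available as in the earlier examples. Only calls already shown in this document are used.

```r
library(RProtoBuf)

p <- new(tutorial.Person, name = "Romain", id = 1)

# Functional form and pseudo-method form produce the same payload.
identical(serialize(p, NULL), p$serialize(NULL))

# The same holds for the text representation.
identical(as.character(p), p$as.character())
```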
Messages are represented in R using the \texttt{Message} S4 class. The class contains the slots \texttt{pointer} and \texttt{type} as described in Table~\ref{Message-class-table}.
\begin{table}[h]
\centering
\begin{tabular}{|cp{10cm}|}
\hline
\textbf{slot} & \textbf{description} \\
\hline
\texttt{pointer} & external pointer to the \texttt{Message} object of the C++ proto library. Documentation for the \texttt{Message} class is available from the protocol buffer project page: \url{http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google.protobuf.message.html#Message} \\
\hline
\texttt{type} & fully qualified path of the message. For example a \texttt{Person} message has its \texttt{type} slot set to \texttt{tutorial.Person} \\
\hline
\end{tabular}
\caption{\label{Message-class-table}Description of slots for the \texttt{Message} S4 class}
\end{table}
Although the \texttt{RProtoBuf} package uses the S4 system, the \verb|@| operator is very rarely used. Fields of the message are retrieved or modified using the \verb|$| or \verb|[[| operators as seen in the previous section, and pseudo-methods can also be called using the \verb|$| operator. Table~\ref{Message-methods-table} describes the methods defined for the \texttt{Message} class:
\begin{table}[h]
\centering
\begin{small}
\begin{tabular}{|ccp{8cm}|}
\hline
\textbf{method} & \textbf{section} & \textbf{description} \\
\hline
\hline
\texttt{has} & \ref{Message-method-has} & Indicates if a message has a given field. \\
\texttt{clone} & \ref{Message-method-clone} & Creates a clone of the message \\
\texttt{isInitialized} & \ref{Message-method-isInitialized} & Indicates if a message has all its required fields set \\
\texttt{serialize} & \ref{Message-method-serialize} & serialize a message to a file or a binary connection or retrieve the message payload as a raw vector \\
\texttt{clear} & \ref{Message-method-clear} & Clear one or several fields of a message, or the entire message \\
\texttt{size} & \ref{Message-method-size} & The number of elements in a message field \\
\texttt{bytesize} & \ref{Message-method-bytesize} & The number of bytes the message would take once serialized \\
\hline
\texttt{swap} & \ref{Message-method-swap} & swap elements of a repeated field of a message \\
\texttt{set} & \ref{Message-method-set} & set elements of a repeated field \\
\texttt{fetch} & \ref{Message-method-fetch} & fetch elements of a repeated field \\
\texttt{setExtension} & \ref{Message-method-setExtension} & set an extension of a message \\
\texttt{getExtension} & \ref{Message-method-getExtension} & get the value of an extension of a message \\
\texttt{add} & \ref{Message-method-add} & add elements to a repeated field \\
\hline
\texttt{str} & \ref{Message-method-str} & the R structure of the message \\
\texttt{as.character} & \ref{Message-method-ascharacter} & character representation of a message \\
\texttt{toString} & \ref{Message-method-toString} & character representation of a message (same as \texttt{as.character}) \\
\texttt{as.list} & \ref{Message-method-aslist} & converts message to a named R list \\
\texttt{update} & \ref{Message-method-update} & updates several fields of a message at once \\
\texttt{descriptor} & \ref{Message-method-descriptor} & get the descriptor of the message type of this message \\
\texttt{fileDescriptor} & \ref{Message-method-fileDescriptor} & get the file descriptor of this message's descriptor \\
\hline
\end{tabular}
\end{small}
\caption{\label{Message-methods-table}Description of methods for the \texttt{Message} S4 class}
\end{table}
\label{Message-method-getfield}
The \verb|$| and \verb|[[| operators allow extraction of field data.
message <- new(tutorial.Person,
  name = "foo", email = "foo@bar.com", id = 2,
  phone = list(
    new(tutorial.Person.PhoneNumber, number = "+33(0)...", type = "HOME"),
    new(tutorial.Person.PhoneNumber, number = "+33(0)###", type = "MOBILE")
  ))
message$name
message$email
message[["phone"]]
# using the tag number
message[[2]] # id
Neither \verb|$| nor \verb|[[| supports partial matching of names. The \verb|$| operator is also used to call methods on the message, and the \verb|[[| operator can use the tag number of the field.
Table~\ref{table-get-types} details the correspondence between the field type and the type of data that is retrieved by \verb|$| and \verb|[[|.
\begin{table}[h]
\centering
\begin{small}
\begin{tabular}{|c|p{5cm}p{5cm}|}
\hline
field type & R type (non repeated) & R type (repeated) \\
\hline
\hline
double & \texttt{double} vector & \texttt{double} vector \\
float & \texttt{double} vector & \texttt{double} vector \\
\hline
uint32 & \texttt{double} vector & \texttt{double} vector \\
fixed32 & \texttt{double} vector & \texttt{double} vector \\
\hline
int32 & \texttt{integer} vector & \texttt{integer} vector \\
sint32 & \texttt{integer} vector & \texttt{integer} vector \\
sfixed32 & \texttt{integer} vector & \texttt{integer} vector \\
\hline
int64 & \texttt{integer} or \texttt{character} vector \footnotemark & \texttt{integer} or \texttt{character} vector \\
uint64 & \texttt{integer} or \texttt{character} vector & \texttt{integer} or \texttt{character} vector \\
sint64 & \texttt{integer} or \texttt{character} vector & \texttt{integer} or \texttt{character} vector \\
fixed64 & \texttt{integer} or \texttt{character} vector & \texttt{integer} or \texttt{character} vector \\
sfixed64 & \texttt{integer} or \texttt{character} vector & \texttt{integer} or \texttt{character} vector \\
\hline
bool & \texttt{logical} vector & \texttt{logical} vector \\
\hline
string & \texttt{character} vector & \texttt{character} vector \\
bytes & \texttt{character} vector & \texttt{character} vector \\
\hline
enum & \texttt{integer} vector & \texttt{integer} vector \\
\hline
message & \texttt{S4} object of class \texttt{Message} & \texttt{list} of \texttt{S4} objects of class \texttt{Message} \\
\hline
\end{tabular}
\end{small}
\caption{\label{table-get-types}Correspondence between field type and R type retrieved by the extractors. \footnotesize{1. R lacks native 64-bit integers, so the \texttt{RProtoBuf.int64AsString} option is available to return large integers as characters to avoid losing precision. This option is described in Section~\ref{sec:int64}}. R also lacks an unsigned integer type.}
\end{table}
\label{Message-method-setfield}
The \verb|$<-| and \verb|[[<-| operators are implemented for \texttt{Message} objects to set the value of a field. The R data is coerced to match the type of the message field.
message <- new(tutorial.Person, name = "foo", id = 2)
message$email <- "foo@bar.com"
message[["id"]] <- 42
message[[1]] <- "foobar"
cat(message$as.character())
Table~\ref{table-message-field-setters} describes the R types that are allowed on the right-hand side depending on the target type of the field.
\begin{table}[h]
\centering
\begin{small}
\begin{tabular}{|p{5cm}|p{7cm}|}
\hline
internal type & allowed R types \\
\hline
\hline
\texttt{double}, \texttt{float} & \texttt{integer}, \texttt{raw}, \texttt{double}, \texttt{logical} \\
\hline
\texttt{int32}, \texttt{int64}, \texttt{uint32}, \texttt{uint64}, \texttt{sint32}, \texttt{sint64}, \texttt{fixed32}, \texttt{fixed64}, \texttt{sfixed32}, \texttt{sfixed64} & \texttt{integer}, \texttt{raw}, \texttt{double}, \texttt{logical}, \texttt{character} \\
\hline
\texttt{bool} & \texttt{integer}, \texttt{raw}, \texttt{double}, \texttt{logical} \\
\hline
\texttt{bytes}, \texttt{string} & \texttt{character} \\
\hline
\texttt{enum} & \texttt{integer}, \texttt{double}, \texttt{raw}, \texttt{character} \\
\hline
\texttt{message}, \texttt{group} & \texttt{S4}, of class \texttt{Message} of the appropriate message type, or a \texttt{list} of \texttt{S4} objects of class \texttt{Message} of the appropriate message type. \\
\hline
\end{tabular}
\end{small}
\caption{\label{table-message-field-setters}Allowed R types depending on internal field types. }
\end{table}
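The coercion described above can be observed directly. The sketch below assumes that the \texttt{id} field of \texttt{tutorial.Person} is declared as \texttt{int32} in \texttt{addressbook.proto}, so that a double assigned from R comes back as an integer.

```r
library(RProtoBuf)

message <- new(tutorial.Person, name = "foo")

# 42 is a double in R; the field is assumed to be declared int32.
message$id <- 42
class(42)

# Reading the field back yields the R type matching the field type.
class(message$id)
message$id
```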
\label{Message-method-has}
The \texttt{has} method indicates if a field of a message is set. For repeated fields, the field is considered set if there is at least one object in the array. For non-repeated fields, the field is considered set if it has been initialized.
The \texttt{has} method is a thin wrapper around the \texttt{HasField} and \texttt{FieldSize} methods of the \texttt{google::protobuf::Reflection} C++ class.
message <- new(tutorial.Person, name = "foo")
message$has("name")
message$has("id")
message$has("phone")
\label{Message-method-clone}
The \texttt{clone} function creates a new message that is a clone of the message. This function is a wrapper around the methods \texttt{New} and \texttt{CopyFrom} of the \texttt{google::protobuf::Message} C++ class.
m1 <- new(tutorial.Person, name = "foo")
m2 <- m1$clone()
m2$email <- "foo@bar.com"
cat(as.character(m1))
cat(as.character(m2))
\label{Message-method-isInitialized}
The \texttt{isInitialized} method quickly checks if all required fields have values set. This is a thin wrapper around the \texttt{IsInitialized} method of the \texttt{google::protobuf::Message} C++ class.
message <- new(tutorial.Person, name = "foo")
message$isInitialized()
message$id <- 2
message$isInitialized()
\label{Message-method-serialize}
The \texttt{serialize} method can be used to serialize the message as a sequence of bytes into a file or a binary connection.
message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
tf1 <- tempfile()
tf1
message$serialize(tf1)
tf2 <- tempfile()
tf2
con <- file(tf2, open = "wb")
message$serialize(con)
close(con)
The (temporary) files \texttt{tf1} and \texttt{tf2} both contain the message payload as a sequence of bytes. The \texttt{readBin} function can be used to read the files as a raw vector in R:
readBin(tf1, raw(0), 500)
readBin(tf2, raw(0), 500)
The \texttt{serialize} method can also be used to directly retrieve the payload of the message as a raw vector:
message$serialize(NULL)
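As a sanity check, the raw vector obtained this way can be fed back to \texttt{read}. The sketch below is a round trip under the assumption that \texttt{tutorial.Person} is loaded as in the previous examples; it compares the text representations of the original and the decoded message.

```r
library(RProtoBuf)

message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)

# Retrieve the payload without touching the file system.
payload <- message$serialize(NULL)
is.raw(payload)

# Decode the payload with the matching descriptor and compare.
copy <- read(tutorial.Person, payload)
identical(as.character(copy), as.character(message))
```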
\label{Message-method-clear}
The \texttt{clear} method clears all fields of a message when called with no argument, or only the named field when one is given.
message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
cat(as.character(message))
message$clear()
cat(as.character(message))
message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
message$clear("id")
cat(as.character(message))
The \texttt{clear} method is a thin wrapper around the \texttt{Clear} method of the \texttt{google::protobuf::Message} C++ class.
\label{Message-method-size}
The \texttt{size} method is used to query the number of objects in a repeated field of a message:
message <- new(tutorial.Person, name = "foo",
  phone = list(
    new(tutorial.Person.PhoneNumber, number = "+33(0)...", type = "HOME"),
    new(tutorial.Person.PhoneNumber, number = "+33(0)###", type = "MOBILE")
  ))
message$size("phone")
size( message, "phone")
The \texttt{size} method is a thin wrapper around the \texttt{FieldSize} method of the \texttt{google::protobuf::Reflection} C++ class.
\label{Message-method-bytesize}
The \texttt{bytesize} method retrieves the number of bytes the message would take once serialized. This is a thin wrapper around the \texttt{ByteSize} method of the \texttt{google::protobuf::Message} C++ class.
message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
message$bytesize()
bytesize(message)
length(message$serialize(NULL))
\label{Message-method-swap}
The \texttt{swap} method can be used to swap elements of a repeated field.
message <- new(tutorial.Person, name = "foo",
  phone = list(
    new(tutorial.Person.PhoneNumber, number = "+33(0)...", type = "HOME" ),
    new(tutorial.Person.PhoneNumber, number = "+33(0)###", type = "MOBILE" )
  ))
message$swap("phone", 1, 2)
cat(as.character(message$phone[[1]]))
cat(as.character(message$phone[[2]]))
swap(message, "phone", 1, 2)
cat(as.character(message$phone[[1]]))
cat(as.character(message$phone[[2]]))
\label{Message-method-set}
The \texttt{set} method can be used to set values of a repeated field.
message <- new(tutorial.Person, name = "foo",
  phone = list(
    new(tutorial.Person.PhoneNumber, number = "+33(0)...", type = "HOME"),
    new(tutorial.Person.PhoneNumber, number = "+33(0)###", type = "MOBILE")
  ))
number <- new(tutorial.Person.PhoneNumber, number = "+33(0)---", type = "WORK")
message$set("phone", 1, number)
cat(as.character( message))
\label{Message-method-fetch}
The \texttt{fetch} method can be used to get values of a repeated field.
message <- new(tutorial.Person, name = "foo",
  phone = list(
    new(tutorial.Person.PhoneNumber, number = "+33(0)...", type = "HOME"),
    new(tutorial.Person.PhoneNumber, number = "+33(0)###", type = "MOBILE" )
  ))
message$fetch("phone", 1)
\label{Message-method-setExtension}
The \texttt{setExtension} method can be used to set an extension field of the Message.
if (!exists("protobuf_unittest.TestAllTypes", "RProtoBuf:DescriptorPool")) {
  unittest.proto.file <- system.file("tinytest", "data", "unittest.proto",
                                     package = "RProtoBuf")
  readProtoFiles(file = unittest.proto.file)
}
## Test setting a singular extension.
test <- new(protobuf_unittest.TestAllExtensions)
test$setExtension(protobuf_unittest.optional_int32_extension, as.integer(1))
\label{Message-method-getExtension}
The \texttt{getExtension} method can be used to get values of an extension.
test$getExtension(protobuf_unittest.optional_int32_extension)
\label{Message-method-add}
The \texttt{add} method can be used to add values to a repeated field.
message <- new(tutorial.Person, name = "foo")
phone <- new(tutorial.Person.PhoneNumber, number = "+33(0)...", type = "HOME")
message$add("phone", phone)
cat(message$toString())
\label{Message-method-str}
The \texttt{str} method gives the R structure of the message. This is rarely useful.
message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
message$str()
str(message)
\label{Message-method-ascharacter}
The \texttt{as.character} method gives the debug string of the message.
message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
cat(message$as.character())
cat(as.character(message))
\label{Message-method-toString}
\texttt{toString} currently is an alias to the \texttt{as.character} function.
message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
cat(message$toString())
cat(toString( message))
\label{Message-method-aslist}
The \texttt{as.list} method converts the message to a named R list.
message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
as.list(message)
The names of the list are the names of the declared fields of the message type, and the content is the same as can be extracted with the \verb|$| operator described in section~\ref{Message-method-getfield}.
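The correspondence between list elements and extracted fields can be checked as follows. This is a small sketch assuming the \texttt{tutorial.Person} descriptor from the earlier examples; the variable name \texttt{fields} is introduced only for illustration.

```r
library(RProtoBuf)

message <- new(tutorial.Person, name = "foo", email = "foo@bar.com", id = 2)
fields <- as.list(message)

# The list is named after the declared fields of the message type.
names(fields)

# Each element matches what the extractors return for that field.
identical(fields$name, message$name)
identical(fields[["id"]], message[["id"]])
```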
\label{Message-method-update}
The \texttt{update} method can be used to update several fields of a message at once.
message <- new(tutorial.Person)
update(message, name = "foo", id = 2, email = "foo@bar.com")
cat(message$as.character())
\label{Message-method-descriptor}
The \texttt{descriptor} method retrieves the descriptor of a message. See section~\ref{subsec-descriptor} for more information about message type descriptors.
message <- new(tutorial.Person)
message$descriptor()
descriptor(message)
\label{Message-method-fileDescriptor}
The \texttt{fileDescriptor} method retrieves the file descriptor of the descriptor associated with a message. See section~\ref{subsec-fileDescriptor} for more information about file descriptors.
message <- new(tutorial.Person)
message$fileDescriptor()
fileDescriptor(message)
\label{subsec-descriptor}
Message descriptors are represented in R with the \emph{Descriptor} S4 class. The class contains the slots \texttt{pointer} and \texttt{type}:
\begin{table}[h]
\centering
\begin{tabular}{|cp{10cm}|}
\hline
\textbf{slot} & \textbf{description} \\
\hline
\texttt{pointer} & external pointer to the \texttt{Descriptor} object of the C++ proto library. Documentation for the \texttt{Descriptor} class is available from the protocol buffer project page: \url{http://code.google.com/apis/protocolbuffers/docs/reference/cpp/google.protobuf.descriptor.html#Descriptor} \\
\hline
\texttt{type} & fully qualified path of the message type. \\
\hline
\end{tabular}
\caption{\label{Descriptor-class-table}Description of slots for the \texttt{Descriptor} S4 class}
\end{table}
Similarly to messages, the \verb|$| operator can be used to extract information from the descriptor, or invoke pseudo-methods. Table~\ref{Descriptor-methods-table} describes the methods defined for the \texttt{Descriptor} class:
\begin{table}[h]
\centering
\begin{small}
\begin{tabular}{|ccp{8cm}|}
\hline
\textbf{Method} & \textbf{Section} & \textbf{Description} \\
\hline
\hline
\texttt{new} & \ref{Descriptor-method-new} & Creates a prototype of a message described by this descriptor. \\
\texttt{read} & \ref{Descriptor-method-read} & Reads a message from a file or binary connection. \\
\texttt{readASCII} & \ref{Descriptor-method-readASCII} & Read a message in ASCII format from a file or text connection. \\
\hline
\texttt{name} & \ref{Descriptor-method-name} & Retrieve the name of the message type associated with this descriptor. \\
\texttt{as.character} & \ref{Descriptor-method-ascharacter} & character representation of a descriptor \\
\texttt{toString} & \ref{Descriptor-method-tostring} & character representation of a descriptor (same as \texttt{as.character}) \\
\texttt{as.list} & \ref{Descriptor-method-aslist} & return a named list of the field, enum, and nested descriptors included in this descriptor. \\
\texttt{asMessage} & \ref{Descriptor-method-asmessage} & return DescriptorProto message. \\
\hline
\texttt{fileDescriptor} & \ref{Descriptor-method-filedescriptor} & Retrieve the file descriptor of this descriptor. \\
\texttt{containing\_type} & \ref{Descriptor-method-containingtype} & Retrieve the descriptor describing the message type containing this descriptor. \\
\texttt{field\_count} & \ref{Descriptor-method-fieldcount} & Return the number of fields in this descriptor. \\
\texttt{field} & \ref{Descriptor-method-field} & Return the descriptor for the specified field in this descriptor. \\
\texttt{nested\_type\_count} & \ref{Descriptor-method-nestedtypecount} & The number of nested types in this descriptor. \\
\texttt{nested\_type} & \ref{Descriptor-method-nestedtype} & Return the descriptor for the specified nested type in this descriptor. \\
\texttt{enum\_type\_count} & \ref{Descriptor-method-enumtypecount} & The number of enum types in this descriptor. \\
\texttt{enum\_type} & \ref{Descriptor-method-enumtype} & Return the descriptor for the specified enum type in this descriptor. \\
\hline
\end{tabular}
\end{small}
\caption{\label{Descriptor-methods-table}Description of methods for the \texttt{Descriptor} S4 class}
\end{table}
The \verb|$| operator, when used on a descriptor object, retrieves descriptors that are contained in the descriptor.
This can be a field descriptor (see section~\ref{subsec-field-descriptor}), an enum descriptor (see section~\ref{subsec-enum-descriptor}) or a descriptor for a nested type.
# field descriptor
tutorial.Person$email
# enum descriptor
tutorial.Person$PhoneType
# nested type descriptor
tutorial.Person$PhoneNumber
# same as
tutorial.Person.PhoneNumber
\label{Descriptor-method-new}
The \texttt{new} method creates a prototype of a message described by the descriptor.
tutorial.Person$new()
new(tutorial.Person)
Passing additional arguments to the method allows directly setting the fields of the message at construction time.
tutorial.Person$new(email = "foo@bar.com")
# same as
update(tutorial.Person$new(), email = "foo@bar.com")
\label{Descriptor-method-read}
The \texttt{read} method is used to read a message from a file or a binary connection.
# start by serializing a message
message <- new(tutorial.Person.PhoneNumber, type = "HOME", number = "+33(0)....")
tf <- tempfile()
serialize(message, tf)
# now read back the message
m <- tutorial.Person.PhoneNumber$read(tf)
cat(as.character(m))
m <- read( tutorial.Person.PhoneNumber, tf)
cat(as.character(m))
\label{Descriptor-method-readASCII}
The \texttt{readASCII} method is used to read a message from a text file or a character vector.
# start by generating the ASCII representation of a message
text <- as.character(new(tutorial.Person, id=1, name="Murray"))
text
# Then read the ASCII representation in as a new message object.
msg <- tutorial.Person$readASCII(text)
\label{Descriptor-method-tostring}
\texttt{toString} currently is an alias to the \texttt{as.character} function.
\label{Descriptor-method-ascharacter}
\texttt{as.character} prints the text representation of the descriptor as it would be specified in the \texttt{.proto} file.
desc <- tutorial.Person
cat(desc$toString())
cat(toString(desc))
cat(as.character(tutorial.Person))
\label{Descriptor-method-aslist}
The \texttt{as.list} method returns a named list of the field, enum, and nested descriptors included in this descriptor.
tutorial.Person$as.list()
\label{Descriptor-method-asmessage}
The \texttt{asMessage} method returns a message of type \texttt{google.protobuf.DescriptorProto} of the Descriptor.
tutorial.Person$asMessage()
\label{Descriptor-method-filedescriptor}
The \texttt{fileDescriptor} method retrieves the file descriptor of the descriptor. See section~\ref{subsec-fileDescriptor} for more information about file descriptors.
desc <- tutorial.Person
desc$fileDescriptor()
fileDescriptor(desc)
\label{Descriptor-method-name}
The \texttt{name} method can be used to retrieve the name of the message type associated with the descriptor.
# simple name
tutorial.Person$name()
# name including scope
tutorial.Person$name(full = TRUE)
\label{Descriptor-method-containingtype}
The \texttt{containing\_type} method retrieves the descriptor describing the message type containing this descriptor.
tutorial.Person$containing_type()
tutorial.Person$PhoneNumber$containing_type()
\label{Descriptor-method-fieldcount}
The \texttt{field\_count} method retrieves the number of fields in this descriptor.
tutorial.Person$field_count()
\label{Descriptor-method-field}
The \texttt{field} method returns the descriptor for the specified field in this descriptor.
tutorial.Person$field(1)
\label{Descriptor-method-nestedtypecount}
The \texttt{nested\_type\_count} method returns the number of nested types in this descriptor.
tutorial.Person$nested_type_count()
\label{Descriptor-method-nestedtype}
The \texttt{nested\_type} method returns the descriptor for the specified nested type in this descriptor.
tutorial.Person$nested_type(1)
\label{Descriptor-method-enumtypecount}
The \texttt{enum\_type\_count} method returns the number of enum types in this descriptor.
tutorial.Person$enum_type_count()
\label{Descriptor-method-enumtype}
The \texttt{enum\_type} method returns the descriptor for the specified enum type in this descriptor.
tutorial.Person$enum_type(1)
\label{subsec-field-descriptor}
The class \emph{FieldDescriptor} represents field descriptors in R. This is a wrapper S4 class around the \texttt{google::protobuf::FieldDescriptor} C++ class. Table~\ref{fielddescriptor-methods-table} describes the methods defined for the \texttt{FieldDescriptor} class.
\begin{table}[h]
\centering
\begin{tabular}{|cp{10cm}|}
\hline
\textbf{slot} & \textbf{description} \\
\hline
\texttt{pointer} & External pointer to the \texttt{FieldDescriptor} C++ variable \\
\hline
\texttt{name} & simple name of the field \\
\hline
\texttt{full\_name} & fully qualified name of the field \\
\hline
\texttt{type} & name of the message type where the field is declared \\
\hline
\end{tabular}
\caption{\label{FieldDescriptor-class-table}Description of slots for the \texttt{FieldDescriptor} S4 class}
\end{table}
\begin{table}[h]
\centering
\begin{small}
\begin{tabular}{|ccp{8cm}|}
\hline
\textbf{method} & \textbf{section} & \textbf{description} \\
\hline
\hline
\texttt{as.character} & \ref{fielddescriptor-method-ascharacter} & character representation of a descriptor \\
\texttt{toString} & \ref{fielddescriptor-method-tostring} & character representation of a descriptor (same as \texttt{as.character}) \\
\texttt{asMessage} & \ref{fielddescriptor-method-asmessage} & return FieldDescriptorProto message. \\
\texttt{name} & \ref{fielddescriptor-method-name} & Return the name of the field descriptor. \\
\texttt{fileDescriptor} & \ref{fielddescriptor-method-filedescriptor} & Return the fileDescriptor where this field is defined. \\
\texttt{containing\_type} & \ref{fielddescriptor-method-containingtype} & Return the containing descriptor of this field. \\
\texttt{is\_extension} & \ref{fielddescriptor-method-isextension} & Return TRUE if this field is an extension. \\
\texttt{number} & \ref{fielddescriptor-method-number} & Gets the declared tag number of the field. \\
\texttt{type} & \ref{fielddescriptor-method-type} & Gets the type of the field. \\
\texttt{cpp\_type} & \ref{fielddescriptor-method-cpptype} & Gets the C++ type of the field. \\
\texttt{label} & \ref{fielddescriptor-method-label} & Gets the label of a field (optional, required, or repeated). \\
\texttt{is\_repeated} & \ref{fielddescriptor-method-isrepeated} & Return TRUE if this field is repeated. \\
\texttt{is\_required} & \ref{fielddescriptor-method-isrequired} & Return TRUE if this field is required. \\
\texttt{is\_optional} & \ref{fielddescriptor-method-isoptional} & Return TRUE if this field is optional. \\
\texttt{has\_default\_value} & \ref{fielddescriptor-method-hasdefaultvalue} & Return TRUE if this field has a default value. \\
\texttt{default\_value} & \ref{fielddescriptor-method-defaultvalue} & Return the default value. \\
\texttt{message\_type} & \ref{fielddescriptor-method-messagetype} & Return the message type if this is a message type field. \\
\texttt{enum\_type} & \ref{fielddescriptor-method-enumtype} & Return the enum type if this is an enum type field. \\
\hline
\end{tabular}
\end{small}
\caption{\label{fielddescriptor-methods-table}Description of methods for the \texttt{FieldDescriptor} S4 class}
\end{table}
\label{fielddescriptor-method-ascharacter}
The \texttt{as.character} method gives the debug string of the field descriptor.
cat(as.character(tutorial.Person$PhoneNumber))
\label{fielddescriptor-method-tostring}
\texttt{toString} is an alias of \texttt{as.character}.
cat(tutorial.Person.PhoneNumber$toString())
\label{fielddescriptor-method-asmessage}
The \texttt{asMessage} method returns a message of type \texttt{google.protobuf.FieldDescriptorProto} of the FieldDescriptor.
tutorial.Person$id$asMessage()
cat(as.character(tutorial.Person$id$asMessage()))
\label{fielddescriptor-method-name}
The \texttt{name} method can be used to retrieve the name of the field descriptor.
# simple name.
name(tutorial.Person$id)
# name including scope.
name(tutorial.Person$id, full=TRUE)
\label{fielddescriptor-method-filedescriptor}
The \texttt{fileDescriptor} method can be used to retrieve the file descriptor of the field descriptor.
fileDescriptor(tutorial.Person$id)
tutorial.Person$id$fileDescriptor()
\label{fielddescriptor-method-containingtype}
The \texttt{containing\_type} method can be used to retrieve the descriptor for the message type that contains this descriptor.
containing_type(tutorial.Person$id)
tutorial.Person$id$containing_type()
\label{fielddescriptor-method-isextension}
The \texttt{is\_extension} method returns TRUE if this field is an extension.
is_extension(tutorial.Person$id)
tutorial.Person$id$is_extension()
\label{fielddescriptor-method-number}
The \texttt{number} method returns the declared tag number of this field.
number(tutorial.Person$id)
tutorial.Person$id$number()
\label{fielddescriptor-method-type}
The \texttt{type} method can be used to retrieve the type of the field descriptor.
type(tutorial.Person$id)
tutorial.Person$id$type()
\label{fielddescriptor-method-cpptype}
The \texttt{cpp\_type} method can be used to retrieve the C++ type of the field descriptor.
cpp_type(tutorial.Person$id)
tutorial.Person$id$cpp_type()
\label{fielddescriptor-method-label}
The \texttt{label} method returns the label of a field (optional, required, or repeated). By default it returns a numeric value, but the optional \texttt{as.string} argument can be set to \texttt{TRUE} to return a human-readable string representation.
label(tutorial.Person$id)
label(tutorial.Person$id, TRUE)
tutorial.Person$id$label(TRUE)
\label{fielddescriptor-method-isrepeated}
The \texttt{is\_repeated} method returns TRUE if this field is repeated.
is_repeated(tutorial.Person$id)
tutorial.Person$id$is_repeated()
\label{fielddescriptor-method-isrequired}
The \texttt{is\_required} method returns TRUE if this field is required.
is_required(tutorial.Person$id)
tutorial.Person$id$is_required()
\label{fielddescriptor-method-isoptional}
The \texttt{is\_optional} method returns TRUE if this field is optional.
is_optional(tutorial.Person$id)
tutorial.Person$id$is_optional()
\label{fielddescriptor-method-hasdefaultvalue}
The \texttt{has\_default\_value} method returns TRUE if this field has a default value.
has_default_value(tutorial.Person$PhoneNumber$type)
has_default_value(tutorial.Person$PhoneNumber$number)
\label{fielddescriptor-method-defaultvalue}
The \texttt{default\_value} method returns the default value of a field.
default_value(tutorial.Person$PhoneNumber$type)
default_value(tutorial.Person$PhoneNumber$number)
\label{fielddescriptor-method-messagetype}
The \texttt{message\_type} method returns the message type if this is a message type field.
message_type(tutorial.Person$phone)
tutorial.Person$phone$message_type()
\label{fielddescriptor-method-enumtype}
The \texttt{enum\_type} method returns the enum type if this is an enum type field.
enum_type(tutorial.Person$PhoneNumber$type)
\label{subsec-enum-descriptor}
The class \emph{EnumDescriptor} is an R wrapper class around the C++ class \texttt{google::protobuf::EnumDescriptor}. Table~\ref{enumdescriptor-methods-table} describes the methods defined for the \texttt{EnumDescriptor} class.
\begin{table}[h]
\centering
\begin{tabular}{|cp{10cm}|}
\hline
\textbf{slot} & \textbf{description} \\
\hline
\texttt{pointer} & External pointer to the \texttt{EnumDescriptor} C++ variable \\
\hline
\texttt{name} & simple name of the enum \\
\hline
\texttt{full\_name} & fully qualified name of the enum \\
\hline
\texttt{type} & name of the message type where the enum is declared \\
\hline
\end{tabular}
\caption{\label{EnumDescriptor-class-table}Description of slots for the \texttt{EnumDescriptor} S4 class}
\end{table}
\begin{table}[h]
\centering
\begin{small}
\begin{tabular}{|ccp{8cm}|}
\hline
\textbf{method} & \textbf{section} & \textbf{description} \\
\hline
\hline
\texttt{as.list} & \ref{enumdescriptor-method-aslist} & return a named integer vector with the values of the enum and their names.\\
\texttt{as.character} & \ref{enumdescriptor-method-ascharacter} & character representation of a descriptor\\
\texttt{toString} & \ref{enumdescriptor-method-tostring} & character representation of a descriptor (same as \texttt{as.character}) \\
\texttt{asMessage} & \ref{enumdescriptor-method-asmessage} & return EnumDescriptorProto message. \\
\texttt{name} & \ref{enumdescriptor-method-name} & Return the name of the enum descriptor.\\
\texttt{fileDescriptor} & \ref{enumdescriptor-method-filedescriptor} & Return the fileDescriptor where this field is defined.\\
\texttt{containing\_type} & \ref{enumdescriptor-method-containingtype} & Return the containing descriptor of this field.\\
\texttt{length} & \ref{enumdescriptor-method-length} & Return the number of constants in this enum.\\
\texttt{has} & \ref{enumdescriptor-method-has} & Return TRUE if this enum contains the specified named constant string.\\
\texttt{value\_count} & \ref{enumdescriptor-method-valuecount} & Return the number of constants in this enum (same as \texttt{length}).\\
\texttt{value} & \ref{enumdescriptor-method-value} & Return the EnumValueDescriptor of an enum value of specified index, name, or number.\\
\hline
\end{tabular}
\end{small}
\caption{\label{enumdescriptor-methods-table}Description of methods for the \texttt{EnumDescriptor} S4 class}
\end{table}
The \verb|$| operator, when used on an EnumDescriptor object, retrieves the EnumValueDescriptors contained in the descriptor.
tutorial.Person$PhoneType$WORK
name(tutorial.Person$PhoneType$value(number=2))
\label{enumdescriptor-method-aslist}
The \texttt{as.list} method creates a named R integer vector that captures the values of the enum and their names.
as.list(tutorial.Person$PhoneType)
\label{enumdescriptor-method-ascharacter}
The \texttt{as.character} method gives the debug string of the enum type.
cat(as.character(tutorial.Person$PhoneType ))
\label{enumdescriptor-method-tostring}
The \texttt{toString} method gives the debug string of the enum type.
```r
cat(toString(tutorial.Person$PhoneType))
```

### The asMessage method
\label{enumdescriptor-method-asmessage}
The \texttt{asMessage} method returns a message of type \texttt{google.protobuf.EnumDescriptorProto} of the EnumDescriptor.
```r
tutorial.Person$PhoneType$asMessage()
cat(as.character(tutorial.Person$PhoneType$asMessage()))
```
\label{enumdescriptor-method-name}
The \texttt{name} method can be used to retrieve the name of the enum descriptor.
# simple name.
name(tutorial.Person$PhoneType)
# name including scope.
name(tutorial.Person$PhoneType, full=TRUE)
\label{enumdescriptor-method-filedescriptor}
The \texttt{fileDescriptor} method can be used to retrieve the file descriptor of the enum descriptor.
fileDescriptor(tutorial.Person$PhoneType)
tutorial.Person$PhoneType$fileDescriptor()
\label{enumdescriptor-method-containingtype}
The \texttt{containing\_type} method can be used to retrieve the descriptor for the message type that contains this enum descriptor.
tutorial.Person$PhoneType$containing_type()
\label{enumdescriptor-method-length}
The \texttt{length} method returns the number of constants in this enum.
length(tutorial.Person$PhoneType)
tutorial.Person$PhoneType$length()
\label{enumdescriptor-method-has}
The \texttt{has} method returns TRUE if this enum contains the specified named constant string.
tutorial.Person$PhoneType$has("WORK")
tutorial.Person$PhoneType$has("nonexistent")
\label{enumdescriptor-method-valuecount}
The \texttt{value\_count} method returns the number of constants in this enum.
value_count(tutorial.Person$PhoneType)
tutorial.Person$PhoneType$value_count()
\label{enumdescriptor-method-value}
The \texttt{value} method extracts an EnumValueDescriptor. Exactly one argument of 'index', 'number', or 'name' must be specified to identify which constant is desired.
tutorial.Person$PhoneType$value(1)
tutorial.Person$PhoneType$value(name="HOME")
tutorial.Person$PhoneType$value(number=1)
\label{subsec-EnumValueDescriptor}
The class \emph{EnumValueDescriptor} is an R wrapper class around the C++ class \texttt{google::protobuf::EnumValueDescriptor}. Table~\ref{enumvaluedescriptor-methods-table} describes the methods defined for the \texttt{EnumValueDescriptor} class.
\begin{table}[h]
\centering
\begin{tabular}{|cp{10cm}|}
\hline
\textbf{slot} & \textbf{description} \\
\hline
\texttt{pointer} & External pointer to the \texttt{EnumValueDescriptor} C++ variable \\
\hline
\texttt{name} & simple name of the enum value \\
\hline
\texttt{full\_name} & fully qualified name of the enum value \\
\hline
\end{tabular}
\caption{\label{EnumValueDescriptor-class-table}Description of slots for the \texttt{EnumValueDescriptor} S4 class}
\end{table}
\begin{table}[h]
\centering
\begin{small}
\begin{tabular}{|ccp{8cm}|}
\hline
\textbf{method} & \textbf{section} & \textbf{description} \\
\hline
\hline
\texttt{number} & \ref{enumvaluedescriptor-method-number} & return the number of this EnumValueDescriptor. \\
\texttt{name} & \ref{enumvaluedescriptor-method-name} & Return the name of the enum value descriptor.\\
\texttt{enum\_type} & \ref{enumvaluedescriptor-method-enumtype} & return the EnumDescriptor type of this EnumValueDescriptor. \\
\texttt{as.character} & \ref{enumvaluedescriptor-method-ascharacter} & character representation of a descriptor. \\
\texttt{toString} & \ref{enumvaluedescriptor-method-tostring} & character representation of a descriptor (same as \texttt{as.character}). \\
\texttt{asMessage} & \ref{enumvaluedescriptor-method-asmessage} & return EnumValueDescriptorProto message. \\
\hline
\end{tabular}
\end{small}
\caption{\label{enumvaluedescriptor-methods-table}Description of methods for the \texttt{EnumValueDescriptor} S4 class}
\end{table}
\label{enumvaluedescriptor-method-number}
The \texttt{number} method can be used to retrieve the number of the enum value descriptor.
number(tutorial.Person$PhoneType$value(number=2))
\label{enumvaluedescriptor-method-name}
The \texttt{name} method can be used to retrieve the name of the enum value descriptor.
# simple name.
name(tutorial.Person$PhoneType$value(number=2))
# name including scope.
name(tutorial.Person$PhoneType$value(number=2), full=TRUE)
\label{enumvaluedescriptor-method-enumtype}
The \texttt{enum\_type} method can be used to retrieve the EnumDescriptor of the enum value descriptor.
enum_type(tutorial.Person$PhoneType$value(number=2))
\label{enumvaluedescriptor-method-ascharacter}
The \texttt{as.character} method gives the debug string of the enum value type.
cat(as.character(tutorial.Person$PhoneType$value(number=2)))
\label{enumvaluedescriptor-method-tostring}
The \texttt{toString} method gives the debug string of the enum value type.
cat(toString(tutorial.Person$PhoneType$value(number=2)))
\label{enumvaluedescriptor-method-asmessage}
The \texttt{asMessage} method returns a message of type \texttt{google.protobuf.EnumValueDescriptorProto} of the EnumValueDescriptor.
tutorial.Person$PhoneType$value(number=2)$asMessage()
cat(as.character(tutorial.Person$PhoneType$value(number=2)$asMessage()))
\label{subsec-fileDescriptor}
File descriptors describe a whole \texttt{.proto} file and are represented in R with the \emph{FileDescriptor} S4 class. The class contains the slots \texttt{pointer}, \texttt{filename}, and \texttt{package}:
\begin{table}[h]
\centering
\begin{tabular}{|cp{10cm}|}
\hline
\textbf{slot} & \textbf{description} \\
\hline
\texttt{pointer} & external pointer to the \texttt{FileDescriptor} object of the C++ proto library. Documentation for the \texttt{FileDescriptor} class is available from the protocol buffer project page: \url{http://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor.html#FileDescriptor} \\
\hline
\texttt{filename} & fully qualified pathname of the \texttt{.proto} file.\\
\hline
\texttt{package} & package name defined in this \texttt{.proto} file.\\
\hline
\end{tabular}
\caption{\label{FileDescriptor-class-table}Description of slots for the \texttt{FileDescriptor} S4 class}
\end{table}
Similarly to messages, the \verb|$| operator can be used to extract fields from the file descriptor (in this case, types defined in the file), or invoke pseudo-methods. Table~\ref{filedescriptor-methods-table} describes the methods defined for the \texttt{FileDescriptor} class.
f <- tutorial.Person$fileDescriptor()
f
f$Person
\begin{table}[h]
\centering
\begin{small}
\begin{tabular}{|ccp{8cm}|}
\hline
\textbf{method} & \textbf{section} & \textbf{description} \\
\hline
\hline
\texttt{name} & \ref{filedescriptor-method-name} & Return the filename for this FileDescriptorProto.\\
\texttt{package} & \ref{filedescriptor-method-package} & Return the file-level package name specified in this FileDescriptorProto.\\
\texttt{as.character} & \ref{filedescriptor-method-ascharacter} & character representation of a descriptor. \\
\texttt{toString} & \ref{filedescriptor-method-tostring} & character representation of a descriptor (same as \texttt{as.character}). \\
\texttt{asMessage} & \ref{filedescriptor-method-asmessage} & return FileDescriptorProto message. \\
\texttt{as.list} & \ref{filedescriptor-method-aslist} & return named list of descriptors defined in this file descriptor.\\
\hline
\end{tabular}
\end{small}
\caption{\label{filedescriptor-methods-table}Description of methods for the \texttt{FileDescriptor} S4 class}
\end{table}
\label{filedescriptor-method-ascharacter}
The \texttt{as.character} method gives the debug string of the file descriptor.
cat(as.character(fileDescriptor(tutorial.Person)))
\label{filedescriptor-method-tostring}
\texttt{toString} is an alias of \texttt{as.character}.
cat(fileDescriptor(tutorial.Person)$toString())
\label{filedescriptor-method-asmessage}
The \texttt{asMessage} method returns a protocol buffer message representation of the file descriptor.
asMessage(tutorial.Person$fileDescriptor())
cat(as.character(asMessage(tutorial.Person$fileDescriptor())))
\label{filedescriptor-method-aslist}
The \texttt{as.list} method creates a named R list that contains the descriptors defined in this file descriptor.
as.list(tutorial.Person$fileDescriptor())
\label{filedescriptor-method-name}
The \texttt{name} method can be used to retrieve the file name associated with the file descriptor. The optional boolean argument can be specified if full pathnames are desired.
name(tutorial.Person$fileDescriptor())
tutorial.Person$fileDescriptor()$name(TRUE)
\label{filedescriptor-method-package}
The \texttt{package} method can be used to retrieve the package scope associated with this file descriptor.
tutorial.Person$fileDescriptor()$package()
\label{subsec-ServiceDescriptor}
Not fully implemented. Needs to be connected to a concrete RPC implementation. The Google Protocol Buffers C++ open-source library does not include an RPC implementation, but this can be connected easily to others.
\label{subsec-MethodDescriptor}
Not fully implemented. Needs to be connected to a concrete RPC implementation. The Google Protocol Buffers C++ open-source library does not include an RPC implementation, but this can be connected easily to others. Now that Google gRPC is released, this is an obvious possibility. Contributions would be most welcome.
The \texttt{asMessage} function uses the standard coercion mechanism of the \texttt{as} method, and so can be used as a shorthand:
# coerce a message type descriptor to a message
asMessage(tutorial.Person)
# coerce an enum descriptor
asMessage(tutorial.Person.PhoneType)
# coerce a field descriptor
asMessage(tutorial.Person$email)
# coerce a file descriptor
asMessage(fileDescriptor(tutorial.Person))
The \texttt{RProtoBuf} package implements the \texttt{.DollarNames} S3 generic function (defined in the \texttt{utils} package) for all classes.
Completion possibilities include pseudo method names for all classes, plus:
\begin{itemize}
\item field names for messages
\item field names, enum types, nested types for message type descriptors
\item names for enum descriptors
\item names for top-level extensions
\item message names for file descriptors
\end{itemize}
In the unlikely event that there is a user-defined field of exactly the same name as one of the pseudo methods, the user-defined field shall take precedence for completion purposes by design, since the method name can always be invoked directly.
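The completion candidates described above can be inspected by calling the generic directly. The following is a minimal sketch, assuming that \texttt{RProtoBuf} is installed and that its bundled \texttt{addressbook.proto} provides \texttt{tutorial.Person}; the variable names are illustrative, and the exact set of names returned may differ between package versions.

```r
library(RProtoBuf)

# Message type from the addressbook.proto file bundled with the package.
person <- new(P("tutorial.Person"), email = "foo@bar.com")

# Completion candidates that start with "em".
utils::.DollarNames(person, pattern = "^em")

# An empty pattern lists every candidate (field names and pseudo methods).
candidates <- utils::.DollarNames(person, pattern = "")
head(candidates)
```

Calling \texttt{.DollarNames} by hand is only useful for inspection; in an interactive session the same candidates are offered when completing after \verb|$|.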
The S3 generic \texttt{with} function is implemented for class \texttt{Message}. It evaluates an R expression in an environment in which the fields of a message can be retrieved and set simply by using their names.
message <- new(tutorial.Person, email = "foo@com")
with(message, {
  ## set the id field
  id <- 2
  ## set the name field from the email field
  name <- gsub("[@]", " ", email)
  sprintf("%d [%s] : %s", id, email, name)
})
The difference between \texttt{with} and \texttt{within} is the value that is returned: \texttt{with} returns the result of the R expression, whereas \texttt{within} returns the message. In both cases, the message is modified because \texttt{RProtoBuf} works by reference.

## identical

The \texttt{identical} method is implemented to compare two messages.
```r
m1 <- new(tutorial.Person, email = "foo@bar.com", id = 2)
m2 <- update(new(tutorial.Person), email = "foo@bar.com", id = 2)
identical(m1, m2)
```
The \verb|==| operator can be used as an alias to \texttt{identical}.
m1 == m2
m1 != m2
Alternatively, the \texttt{all.equal} function can be used, allowing a tolerance when comparing \texttt{float} or \texttt{double} values.
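As an illustration of the tolerance-based comparison, the sketch below compares two messages whose \texttt{double} fields differ by a tiny amount. Since \texttt{tutorial.Person} has no floating-point field, it assumes a hypothetical schema (package \texttt{sketchcmp}, message \texttt{Reading}) written to a temporary file; none of these names come from the package itself.

```r
library(RProtoBuf)

# Hypothetical schema, written to a temporary file for this sketch only.
proto_file <- tempfile(fileext = ".proto")
writeLines(c(
  'syntax = "proto2";',
  'package sketchcmp;',
  'message Reading {',
  '  optional string sensor = 1;',
  '  optional double value = 2;',
  '}'
), proto_file)
readProtoFiles(proto_file)

# Look up the message descriptor by its fully qualified name.
Reading <- P("sketchcmp.Reading")

r1 <- new(Reading, sensor = "s1", value = 1.0)
r2 <- new(Reading, sensor = "s1", value = 1.0 + 1e-10)

# Exact comparison of the two messages.
identical(r1, r2)

# Comparison that allows a small numeric difference (default tolerance).
all.equal(r1, r2)
```

A custom threshold can be passed through the \texttt{tolerance} argument when the default is too strict or too lenient for the data at hand.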
\texttt{merge} can be used to merge two messages of the same type.
m1 <- new(tutorial.Person, name = "foobar")
m2 <- new(tutorial.Person, email = "foo@bar.com")
m3 <- merge(m1, m2)
cat(as.character(m3))
The \texttt{P} function is an alternative way to retrieve a message descriptor using its type name. It is not often used because of the lookup mechanism described in section~\ref{sec-lookup}.
P("tutorial.Person")
new(P("tutorial.Person"))
# but we can do this instead
tutorial.Person
new(tutorial.Person)
\label{sec-extensions}
Extensions allow you to declare a range of field numbers in a message that are available for extension types. This allows others to declare new fields for a given message type possibly in their own \texttt{.proto} files without having to edit the original file. See \url{https://protobuf.dev/docs/proto#extensions}.
Notice that the last line of the \texttt{Person} message schema in \texttt{addressbook.proto} is the following:
extensions 100 to 199;
This specifies that other users in other .proto files can use tag numbers between 100 and 199 for extension types of this message.
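A brief sketch of how an extension field might be set and read from R follows; it is not a full treatment. It assumes the \texttt{setExtension} and \texttt{getExtension} functions of \texttt{RProtoBuf}, and that \texttt{P} can look up a top-level extension by its fully qualified name. The schema (package \texttt{sketchext}, message \texttt{Contact}, extension \texttt{nickname\_count}) is hypothetical and written to a temporary file so that the example does not depend on \texttt{addressbook.proto}.

```r
library(RProtoBuf)

# Hypothetical schema for this sketch: a base message that reserves an
# extension range, plus one top-level extension field in that range.
proto_file <- tempfile(fileext = ".proto")
writeLines(c(
  'syntax = "proto2";',
  'package sketchext;',
  'message Contact {',
  '  optional string name = 1;',
  '  extensions 100 to 199;',
  '}',
  'extend Contact {',
  '  optional int32 nickname_count = 100;',
  '}'
), proto_file)
readProtoFiles(proto_file)

# Look up the message type and the extension field by fully qualified name.
Contact <- P("sketchext.Contact")
nickname_count <- P("sketchext.nickname_count")

# Set the extension on a message, then read it back.
contact <- new(Contact, name = "Ada")
setExtension(contact, nickname_count, 3L)
getExtension(contact, nickname_count)
```

The extension is identified by its field descriptor rather than by name, which is why it is looked up once and then passed to both calls.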
\label{sec:groups}
Groups are a deprecated feature that offered another way to nest information in message definitions. For example, the \texttt{TestAllTypes} message type in \texttt{unittest.proto} includes an OptionalGroup type:
optional group OptionalGroup = 16 {
  optional int32 a = 17;
}
And although the feature is deprecated, it can be used with RProtoBuf:
test <- new(protobuf_unittest.TestAllTypes)
test$optionalgroup$a <- 3
test$optionalgroup$a
cat(as.character(test))
Note that groups simply combine a nested message type and a field into a single declaration. The field type is OptionalGroup in this example, and the field name is converted to lower-case 'optionalgroup' so as not to conflict with the type name.
Saptarshi Guha wrote another package that deals with integration of Protocol Buffer messages with R, taking a different angle: serializing any R object as a message, based on a single catch-all \texttt{proto} file. Saptarshi's package is available at \url{http://ml.stat.purdue.edu/rhipe/doc/html/ProtoBuffers.html}.
Jeroen Ooms took a similar approach influenced by Saptarshi in his \texttt{RProtoBufUtils} package. Unlike Saptarshi's package, RProtoBufUtils depends on RProtoBuf for underlying message operations. This package is available at \url{https://github.com/jeroenooms/RProtoBufUtils}.
Protocol Buffers have a mechanism for remote procedure calls (RPC) that is not yet used by \texttt{RProtoBuf}, but we may one day take advantage of this by writing a Protocol Buffer message R server, and client code as well, probably based on the functionality of the \texttt{Rserve} package. Now that Google gRPC is released, this is an obvious possibility. Contributions would be most welcome.
Extensions have been implemented in RProtoBuf and have been extensively used and tested, but they are not currently described in this vignette. Additional examples and documentation are needed for extensions.
Some of the design of the package is based on the design of the
\texttt{rJava} package by Simon Urbanek (dispatch on \texttt{new}, S4 class
structures using external pointers, etc.). We would like to thank
Simon for his indirect involvement in \texttt{RProtoBuf}.
The user defined table mechanism, implemented by Duncan Temple Lang
for the purpose of the \texttt{RObjectTables} package allowed the
dynamic symbol lookup (see section~\ref{sec-lookup}). Many thanks
to Duncan for this amazing feature.
\renewcommand{\pnasbreak}{\begin{strip}\vskip0pt\end{strip}}
\newpage