formElementHandlers | R Documentation |
These functions are used when parsing HTML pages containing forms to gather a description of the individual forms. The idea is to read the individual elements within an HTML form and provide their details in terms of
1. the element name
2. the default value
3. the set of possible values
4. whether it is visible, i.e. settable by the user, or simply a hidden field
With this information, we can automate the access to the HTML form via an S function that provides the same user specification of options but without the manual operation and without the management of the returned data, e.g. saving it as a file and bringing it into S.
We ignore JavaScript-related operations.
multiFormElementHandlers can handle multiple forms within a single page. formElementHandlers accumulates all the form elements into a single structure and does not observe the boundaries between forms. If you have an HTML page with potentially more than one form, use multiFormElementHandlers.
This is hidden from most users by the function getHTMLFormDescription.
formElementHandlers(url = NULL, checkDynamic = TRUE, dropButtons = TRUE)
multiFormElementHandlers(url = NULL, checkDynamic = TRUE, dropButtons = TRUE)
url |
the URL of the HTML page. This is not necessary for
creating the description of the form elements as this is done
via a call to |
checkDynamic |
a logical value indicating whether to test whether
the form has dynamic elements. If this is |
dropButtons |
a logical value indicating whether to omit button elements in the form description. These are typically Submit or Reset buttons that are not relevant for submitting the form request from R. |
This uses the htmlTreeParse function in the XML parsing package to gather up and process the different HTML form elements in the HTML document. It organizes the information into a more programmatically accessible structure.
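The following is a minimal sketch of the kind of information gathered from a form: the FORM attributes, and for each element its name, default value, possible values, and whether it is hidden. It does not call formElementHandlers and is not the package's implementation; instead of handlers passed to htmlTreeParse, it uses htmlParse and XPath queries from the XML package. The HTML page, the variable names, and the layout of the result are all hypothetical.

```r
library(XML)

# Hypothetical page containing a single form (not taken from the package).
html <- '<html><body>
<form action="/search" method="GET">
  <input type="text" name="q" value="R">
  <input type="hidden" name="lang" value="en">
  <select name="sort">
    <option value="d" selected="selected">Date</option>
    <option value="r">Relevance</option>
  </select>
  <input type="submit" value="Go">
</form>
</body></html>'

doc <- htmlParse(html, asText = TRUE)

# Attributes of the FORM element as a named character vector.
form_attrs <- xmlAttrs(getNodeSet(doc, "//form")[[1]])

# Name, type and default value of every INPUT element.
input_names  <- xpathSApply(doc, "//form//input", xmlGetAttr, "name",  NA_character_)
input_types  <- xpathSApply(doc, "//form//input", xmlGetAttr, "type",  "text")
input_values <- xpathSApply(doc, "//form//input", xmlGetAttr, "value", NA_character_)

# Drop buttons, in the spirit of dropButtons = TRUE.
is_button <- input_types %in% c("submit", "reset", "button")
elements <- data.frame(
  name    = input_names,
  default = input_values,
  hidden  = input_types == "hidden",
  stringsAsFactors = FALSE
)[!is_button, ]

# Possible values of the SELECT element: the values are the option text,
# the names are the value attributes that would be submitted.
opt_path     <- "//select[@name='sort']/option"
sort_options <- structure(
  xpathSApply(doc, opt_path, xmlValue),
  names = xpathSApply(doc, opt_path, xmlGetAttr, "value")
)

# Default of the SELECT element: the option marked as selected.
sort_default <- xpathSApply(doc, paste0(opt_path, "[@selected]"), xmlGetAttr, "value")
```

Here elements has one row per non-button input, and sort_options has the same shape as the select description listed in the value section: a named character vector per select element.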
An object of class HTMLFormDescription.
inputs |
a list describing the different select elements. Each element corresponds to a separate select element and is a named character vector. The values in the character vector are the text for the option elements and the names are the corresponding value attributes, which are submitted if that option is selected. |
textareas |
the names of the TEXT or TEXTAREA elements. |
fixed |
|
form |
the attributes (a named character vector) giving the HTML attributes associated with the FORM element. These describe the action, the URI for submission, the encoding format, etc. |
url |
this is supplied when the handlers are created and allows the complete information about the form(s) to be entirely self-describing, i.e. to resolve relative links, etc. for the POST actions. |
hidden |
a list containing character vectors of length 1 or more. Each element in the list corresponds to an HTML element of type "hidden" with a name. Such elements can have multiple values for the same name, i.e. the name="x" can be repeated and all these values must be sent as part of the form. |
inputdefaults |
|
textareadefaults |
|
selectdefaults |
...
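The shapes described above for select elements and hidden fields can be illustrated in base R. This is a sketch under stated assumptions, not output from the package: the field contents, the variable names select_options and hidden, and the helper submitted_value are hypothetical, and real form values would also need URL-encoding before submission.

```r
# Hypothetical select description: one named character vector per select
# element. Values are the displayed option text; names are the value
# attributes that are submitted.
select_options <- list(sort = c(d = "Date", r = "Relevance"))

# Hypothetical hidden fields: a name may carry more than one value,
# and all of them must be sent.
hidden <- list(session = "abc123", x = c("1", "2"))

# Translate the option text a user chose into the value to submit.
submitted_value <- function(options, label) {
  idx <- match(label, options)
  if (is.na(idx)) stop("unknown option: ", label)
  names(options)[idx]
}

# Expand the hidden fields into name=value pairs, repeating the name
# for every value it carries.
hidden_pairs <- unlist(lapply(names(hidden), function(nm) {
  paste0(nm, "=", hidden[[nm]])
}))

# Combine the user's choice with the hidden fields (no URL-encoding here).
query <- paste(
  c(paste0("sort=", submitted_value(select_options$sort, "Relevance")),
    hidden_pairs),
  collapse = "&"
)
```

With these inputs, query is "sort=r&session=abc123&x=1&x=2": the repeated name x appears once per value.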
Currently, we organize the information from a form into a simple HTMLFormDescription object which is an S3-style class. This maintains the information about the form in separate fields and one must look across these fields to understand an individual element. For example, one would get its
Duncan Temple Lang <duncan@wald.ucdavis.edu>