Man pages for stasvlasov/nstandr
Name Standardization in R

browse_dot_graph	Generates a temporary html file with visualization of given...
check_rows	Assumes that rows (if logical) are same length as x
cockburn_combabbrev	Collapses single character sequences
cockburn_detect_corp	Detect Corporates (code - 'firm')
cockburn_detect_govt	Detect Government Organizations (Non-Corporates group)
cockburn_detect_hosp	Detect Hospitals (Non-Corporates group)
cockburn_detect_indiv	Detect Individuals (Non-Corporates group)
cockburn_detect_inst	Detect Non-profit Institutes (Non-Corporates group)
cockburn_detect_inst_conds	Detects Non-profit institutes with special conditions
cockburn_detect_inst_conds_1	Detects Non-profit institutes with special conditions
cockburn_detect_inst_conds_2	Detects Non-profit institutes with special conditions
cockburn_detect_inst_german	Detects German Non-profit institutes
cockburn_detect_type	Identifies Entity Type
cockburn_detect_univ	Detect Universities (Non-Corporates group)
cockburn_detect_uspto	Special USPTO codes. Codes as 'indiv'
cockburn_remove_standard_names	Creates a so-called stem name (a name with all legal entity...
cockburn_remove_uspto	Removes special USPTO codes.
cockburn_replace_compustat	COMPUSTAT specific standardization for organizational names
cockburn_replace_compustat_names	COMPUSTAT specific standardization for organizational names....
cockburn_replace_derwent	Performs Derwent standardization of organizational names
cockburn_replace_govt	Cleanup Government Organizations (Non-Corporates group)
cockburn_replace_punctuation	Removes punctuation and standardises some symbols.
cockburn_replace_standard_names	Create standard name
cockburn_replace_type	Cleanup Entity Type
cockburn_replace_univ	Cleanup Universities (Non-Corporates group)
defactor	Defactor the object
defactor_vector	Converts factor to character
detect_legal_form	Detect legal form
detect_patterns	Codes strings (e.g., organizational names) based on certain...
escape_regex	Escapes characters that are special in regex
escape_regex_for_type	Escapes special characters for different types of pattern
escape_regex_for_types	Escapes regex special characters conditionally
get_dots	Provides access to arguments of nested functions. Sort of an...
get_standardize_options	Gets 'standardize_options' at point with consistent updates...
get_target	Gets a target vector to standardize.
get_vector	Gets vector by column and defactor if needed. Optionally one...
inset_target	Insets target vector back to input object ('x')
is_empty	Checks if string has something to print
magerman_condense	Condensing names
magerman_detect_characters	Detect candidates for characters that need to be cleaned
magerman_detect_comma_period_irregularities	Detects comma period irregularities
magerman_detect_legal_form	Detect legal form
magerman_detect_legal_form_beginning	Detects legal form at the beginning of a name
magerman_detect_legal_form_end	Detects legal form at the end of a name
magerman_detect_legal_form_middle	Detects legal form in the middle of a name
magerman_detect_umlaut	Detect umlauts
magerman_remove_common_words	Remove common words
magerman_remove_common_words_anywhere	Removes common words anywhere in a name
magerman_remove_common_words_at_the_beginning	Removes common words at the beginning of a name
magerman_remove_common_words_at_the_end	Removes common words at the end of a name
magerman_remove_double_quotation_marks_beginning_end	Removes double quotation irregularities
magerman_remove_double_quotation_marks_irregularities	Removes double quotation irregularities
magerman_remove_double_spaces	Removes double spaces
magerman_remove_html_codes	Removes html codes
magerman_remove_legal_form	Removes legal form
magerman_remove_legal_form_and_clean	Removes legal form
magerman_remove_non_alphanumeric_at_the_beginning	Removes non alphanumeric characters at the beginning of a...
magerman_remove_non_alphanumeric_at_the_end	Removes non alphanumeric characters at the end of a name
magerman_remove_special_characters	Removes special characters
magerman_replace_accented_characters	Replaces accented characters
magerman_replace_comma_period_irregularities	Replaces comma period irregularities
magerman_replace_comma_period_irregularities_all	Replaces comma and period irregularities
magerman_replace_legal_form_beginning	Replaces legal form at the beginning of a name
magerman_replace_legal_form_end	Replaces legal form at the end of a name
magerman_replace_legal_form_middle	Replaces legal form in the middle of a name
magerman_replace_proprietary_characters	Replaces proprietary characters
magerman_replace_sgml_characters	Replaces sgml characters
magerman_replace_spelling_variation	Replaces spelling variation
magerman_replace_umlaut	Replaces Umlauts
make_dot_edges	Makes dot graph edges for visualizing arrows between sequence...
make_dot_graph	Generates graph description for visualizing list of...
make_dot_nodes	Generates description of dot graph nodes.
nstandr-package	nstandr: Name Standardization in R
paste_dot_node	Makes a dot node (as html table) from procedure's attributes.
paste_dot_node_tr_td	Makes TR TD record for dot node TABLE
replace_patterns	A wrapper for string replacement and cbinding some columns.
save_dot_graph_as	Saves dot graph as file using system command 'dot' from...
standardize	Standardizes organizational names. Takes either vector or...
standardize_cockburn	Standardizes strings using exact procedures described in...
standardize_dehtmlize	Converts HTML characters to UTF-8
standardize_detect_enc	Detects string encoding
standardize_is_data_empty	Checks if all elements in vector(s) are either "", NA, NULL...
standardize_magerman	Standardizes strings using exact procedures described in...
standardize_make_procedures_list	Makes list of procedures calls from table.
standardize_omit_empty	Removes elements that are either "", NA, NULL or have zero...
standardize_options	Does nothing but stores (as its own default arguments)...
standardize_remove_brackets	Removes brackets and content in brackets
standardize_remove_quotes	Removes double quotes
standardize_squish_spaces	Removes redundant whitespace
standardize_toascii	Translates non-ASCII symbols to their ASCII equivalents
standardize_toupper	Uppercases vector of interest in the object (table)
standardize_x_split	Splits the object (table) in chunks by rows
unlist_if_possible	If a column in the 'x' table is a list, unlists it if possible
visualize	Visualizes list of procedures.
x_length	Gets lengths of the object
x_width	Gets width of the object