Man pages for stasvlasov/nstandr
Name Standardization in R

browse_dot_graph - Generates a temporary html file with visualization of given...
check_rows - Assumes that rows (if logical) are same length as x
cockburn_combabbrev - Collapses single character sequences
cockburn_detect_corp - Detect Corporates (code - 'firm')
cockburn_detect_govt - Detect Government Organizations (Non-Corporates group)
cockburn_detect_hosp - Detect Hospitals (Non-Corporates group)
cockburn_detect_indiv - Detect Individuals (Non-Corporates group)
cockburn_detect_inst - Detect Non-profit Institutes (Non-Corporates group)
cockburn_detect_inst_conds - Detects Non-profit institutes with special conditions
cockburn_detect_inst_conds_1 - Detects Non-profit institutes with special conditions
cockburn_detect_inst_conds_2 - Detects Non-profit institutes with special conditions
cockburn_detect_inst_german - Detects German Non-profit institutes
cockburn_detect_type - Identifies Entity Type
cockburn_detect_univ - Detect Universities (Non-Corporates group)
cockburn_detect_uspto - Special USPTO codes. Codes as 'indiv'
cockburn_remove_standard_names - Creates so-called stem name (a name with all legal entity...
cockburn_remove_uspto - Removes special USPTO codes.
cockburn_replace_compustat - COMPUSTAT specific standardization for organizational names
cockburn_replace_compustat_names - COMPUSTAT specific standardization for organizational names....
cockburn_replace_derwent - Performs Derwent standardization of organizational names
cockburn_replace_govt - Cleanup Government Organizations (Non-Corporates group)
cockburn_replace_punctuation - Removes punctuation and standardises some symbols.
cockburn_replace_standard_names - Create standard name
cockburn_replace_type - Cleanup Entity Type
cockburn_replace_univ - Cleanup Universities (Non-Corporates group)
defactor - Defactor the object
defactor_vector - Converts factor to character
detect_legal_form - Detect legal form
detect_patterns - Codes strings (e.g., organizational names) based on certain...
escape_regex - Escapes special regex characters
escape_regex_for_type - Escapes special characters for different types of pattern
escape_regex_for_types - Escapes special regex characters conditionally
get_dots - Provides access to arguments of nested functions. Sort of an...
get_standardize_options - Gets 'standardize_options' at point with consistent updates...
get_target - Gets a target vector to standardize.
get_vector - Gets vector by column and defactor if needed. Optionally one...
inset_target - Insets target vector back to input object ('x')
is_empty - Checks if string has something to print
magerman_condense - Condensing names
magerman_detect_characters - Detect candidates for characters that need to be cleaned
magerman_detect_comma_period_irregularities - Detects comma period irregularities
magerman_detect_legal_form - Detect legal form
magerman_detect_legal_form_beginning - Detects legal form at the beginning of a name
magerman_detect_legal_form_end - Detects legal form at the end of a name
magerman_detect_legal_form_middle - Detects legal form in the middle of a name
magerman_detect_umlaut - Detect umlauts
magerman_remove_common_words - Remove common words
magerman_remove_common_words_anywhere - Removes common words anywhere in a name
magerman_remove_common_words_at_the_beginning - Removes common words at the beginning of a name
magerman_remove_common_words_at_the_end - Removes common words at the end of a name
magerman_remove_double_quotation_marks_beginning_end - Removes double quotation irregularities
magerman_remove_double_quotation_marks_irregularities - Removes double quotation irregularities
magerman_remove_double_spaces - Removes double spaces
magerman_remove_html_codes - Removes HTML codes
magerman_remove_legal_form - Removes legal form
magerman_remove_legal_form_and_clean - Removes legal form
magerman_remove_non_alphanumeric_at_the_beginning - Removes non alphanumeric characters at the beginning of a...
magerman_remove_non_alphanumeric_at_the_end - Removes non alphanumeric characters at the end of a name
magerman_remove_special_characters - Removes special characters
magerman_replace_accented_characters - Replaces accented characters
magerman_replace_comma_period_irregularities - Replaces comma period irregularities
magerman_replace_comma_period_irregularities_all - Replaces comma and period irregularities
magerman_replace_legal_form_beginning - Replaces legal form at the beginning of a name
magerman_replace_legal_form_end - Replaces legal form at the end of a name
magerman_replace_legal_form_middle - Replace legal form in the middle of a name
magerman_replace_proprietary_characters - Replaces proprietary characters
magerman_replace_sgml_characters - Replaces SGML characters
magerman_replace_spelling_variation - Replaces spelling variation
magerman_replace_umlaut - Replaces Umlauts
make_dot_edges - Makes dot graph edges for visualizing arrows between sequence...
make_dot_graph - Generates graph description for visualizing list of...
make_dot_nodes - Generates description of dot graph nodes.
nstandr-package - nstandr: Name Standardization in R
paste_dot_node - Makes a dot node (as html table) from procedure's attributes.
paste_dot_node_tr_td - Makes TR TD record for dot node TABLE
replace_patterns - A wrapper for string replacement and cbinding some columns.
save_dot_graph_as - Saves dot graph as file using system command 'dot' from...
standardize - Standardizes organizational names. Takes either vector or...
standardize_cockburn - Standardizes strings using exact procedures described in...
standardize_dehtmlize - Converts HTML characters to UTF-8
standardize_detect_enc - Detects string encoding
standardize_is_data_empty - Checks if all elements in vector(s) are either "", NA, NULL...
standardize_magerman - Standardizes strings using exact procedures described in...
standardize_make_procedures_list - Makes list of procedures calls from table.
standardize_omit_empty - Removes elements that are either "", NA, NULL or have zero...
standardize_options - Does nothing but stores (as its own default arguments)...
standardize_remove_brackets - Removes brackets and content in brackets
standardize_remove_quotes - Removes double quotes
standardize_squish_spaces - Removes redundant whitespace
standardize_toascii - Translates non-ASCII symbols to their ASCII equivalents
standardize_toupper - Uppercases vector of interest in the object (table)
standardize_x_split - Splits the object (table) in chunks by rows
unlist_if_possible - If a column in the 'x' table is a list, unlists it if possible
visualize - Visualizes list of procedures.
x_length - Gets lengths of the object
x_width - Gets width of the object
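
To give a rough sense of what several of the simpler steps listed above do (uppercasing, removing bracketed content, removing double quotes, squishing whitespace), here is a minimal sketch in base R. It does not call nstandr and does not reproduce its implementation or argument conventions; the function name `sketch_standardize` is hypothetical, the order of the steps is an assumption, and only round brackets are handled.

```r
# Hypothetical illustration in base R only; NOT the nstandr API.
# Chains four simple cleaning steps on a character vector of names.
sketch_standardize <- function(x) {
  x <- toupper(x)                          # uppercase the names
  x <- gsub("\\([^()]*\\)", "", x)         # drop round brackets and their content
  x <- gsub("\"", "", x, fixed = TRUE)     # drop double quotes
  x <- gsub("\\s+", " ", trimws(x))        # trim ends, collapse runs of whitespace
  x
}

sketch_standardize('  Acme  "Widgets" (Holdings)  Ltd. ')
# returns "ACME WIDGETS LTD."
```

For the actual procedures and their arguments, consult the individual man pages of the functions in the index.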
stasvlasov/nstandr documentation built on July 27, 2023, 10:29 p.m.