family_name_df | R Documentation |
family_name_df is a data frame of 1,806 Chinese surnames with their frequency and distribution across China. Its 7 variables record whether a surname is compound, its Pinyin initial, the rank of that initial, the estimated number of people bearing the surname and its relative frequency per million for 1930–2008, and a surname uniqueness score. The dataset is useful for sociolinguistic analysis, demography, and historical population studies.
data(family_name_df)
A data frame with 1806 observations and 7 variables:
Chinese surname (character)
Indicates if the surname is compound (numeric)
Initial letter of surname in Pinyin (character)
Rank of the initial letter (numeric)
Estimated number of people with the surname (1930–2008) (numeric)
Relative frequency per million (1930–2008) (numeric)
Surname uniqueness score (numeric)
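The format described above can be checked with a short sketch like the following. It assumes the ChinAPIs package is installed and deliberately uses no column names, since they are not listed here; `str()` prints them as stored in the package.

```r
# Minimal sketch: inspect family_name_df without assuming any column names.
# Assumption: the ChinAPIs package is installed; otherwise the block is skipped.
if (requireNamespace("ChinAPIs", quietly = TRUE)) {
  data("family_name_df", package = "ChinAPIs")

  # The documentation states 1806 observations and 7 variables.
  print(dim(family_name_df))

  # Show the actual column names and their types.
  str(family_name_df)

  # Preview the first few rows.
  print(head(family_name_df))
}
```

If the dimensions printed differ from those documented, the installed package version may not match this help page.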
The dataset name has been kept as 'family_name_df' to avoid confusion with other datasets in the R ecosystem. This naming convention helps distinguish this dataset as part of the ChinAPIs package and assists users in identifying its specific characteristics. The suffix 'df' indicates that the dataset is a data frame. The original content has not been modified in any way.
Data taken from the ChineseNames package, version 2023.8.