README.md

ctnamecleaner, ctpopulator, ctcorrelator

ctnamecleaner

An R package that takes a list of Connecticut hamlets or neighborhoods and adds a column with the matching official town names.

What function ctnamecleaner() does

Let's assume you have a dataframe in R called towncoffeeshops that looks like

Town | Coffeeshops
--- | ---:
Andover | 2
Centerbrook | 5
Yalesville | 1

Run this in R

ctnamecleaner(Town, towncoffeeshops, filename="towncoffeecleaned", case="Upper")

You'll get a new file called towncoffeecleaned.csv that looks like

Town | Coffeeshops | real.town.name
--- | ---: | ---
Andover | 2 | ANDOVER
Centerbrook | 5 | ESSEX
Yalesville | 1 | WALLINGFORD

Alternatively

ctnamecleaner(Town, towncoffeeshops)

The command above will create a dataframe without exporting.
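
To make the idea concrete, here is a minimal base-R sketch of the kind of lookup involved. It is not the package's implementation: the `lookup` table and the `clean_names()` helper are hypothetical and cover only the three places from the example above.

```r
# Example input, matching the towncoffeeshops table above
towncoffeeshops <- data.frame(
  Town = c("Andover", "Centerbrook", "Yalesville"),
  Coffeeshops = c(2, 5, 1),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table: place name -> official town name
lookup <- data.frame(
  name = c("ANDOVER", "CENTERBROOK", "YALESVILLE"),
  real.town.name = c("ANDOVER", "ESSEX", "WALLINGFORD"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: match on upper-cased, trimmed names and
# append the official name as a new column (NA when nothing matches)
clean_names <- function(data, column) {
  key <- toupper(trimws(data[[column]]))
  data$real.town.name <- lookup$real.town.name[match(key, lookup$name)]
  data
}

cleaned <- clean_names(towncoffeeshops, "Town")
print(cleaned)
```

Names that are missing from the lookup come back as `NA` in this sketch, which makes unmatched rows easy to spot with `is.na()`.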

Usage

ctnamecleaner(name, data, filename="nope", case="Title")

Arguments

ctpopulator

An R package that appends the most-recent population of Connecticut towns to a dataframe for efficient per-capita calculations.

What function ctpopulator() does

Let's assume you've collapsed the duplicate town names in the real.town.name column of the CTNAMECLEANED dataframe above and summed or averaged the figures you were working with.
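
One way to do that collapsing step in base R is `aggregate()`. The sketch below is only an illustration: the dataframe is made up, with an extra Ivoryton row (another village in Essex) and an invented count so that two rows share one official town.

```r
# Hypothetical cleaned data: Centerbrook and Ivoryton both map to ESSEX
CTNAMECLEANED <- data.frame(
  Town = c("Andover", "Centerbrook", "Ivoryton", "Yalesville"),
  Coffeeshops = c(2, 5, 3, 1),
  real.town.name = c("ANDOVER", "ESSEX", "ESSEX", "WALLINGFORD"),
  stringsAsFactors = FALSE
)

# Sum the figures so that each official town appears exactly once
collapsed <- aggregate(Coffeeshops ~ real.town.name,
                       data = CTNAMECLEANED, FUN = sum)
print(collapsed)
```

Swap `FUN = sum` for `FUN = mean` if an average suits your data better.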

Run this in R

ctpopulator(real.town.name, CTNAMECLEANED, filename="towncoffeepop")

You'll get a new file called towncoffeepop.csv that looks like the table below. Note: if you omit the filename parameter, no CSV file is written; the function only returns the dataframe, which can be assigned to an object.

Town | Coffeeshops | real.town.name | pop2013
--- | ---: | --- | ---:
Andover | 2 | ANDOVER | 3095
Centerbrook | 5 | ESSEX | 6668
Yalesville | 1 | WALLINGFORD | 45112
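
With the population column in place, a per-capita rate is a single vectorized division. Here is a small sketch using the figures from the table above; the dataframe is rebuilt by hand so the example runs on its own, and the column name `shops.per.10k` is just an illustrative choice.

```r
# Data copied from the table above
towncoffeepop <- data.frame(
  Town = c("Andover", "Centerbrook", "Yalesville"),
  Coffeeshops = c(2, 5, 1),
  real.town.name = c("ANDOVER", "ESSEX", "WALLINGFORD"),
  pop2013 = c(3095, 6668, 45112),
  stringsAsFactors = FALSE
)

# Coffee shops per 10,000 residents
towncoffeepop$shops.per.10k <-
  towncoffeepop$Coffeeshops / towncoffeepop$pop2013 * 10000

print(towncoffeepop[, c("real.town.name", "shops.per.10k")])
```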

Usage

ctpopulator(name, data, filename="nope")

Arguments

ctcorrelator

An R package that takes a town dataframe and checks for correlations between the original data set and 500 different variables, drawn from an ever-growing list that includes demographics, median income, educational attainment, and poverty. Why? Correlation does not mean causation, but a quickly generated list could help point a researcher or journalist in unforeseen directions with respect to the original data.

What function ctcorrelator() does

Let's assume you've collapsed the duplicate town names in the real.town.name column of the CTNAMECLEANED dataframe above and summed or averaged the figures you were working with.

This is a dataframe called ctcoffeeshops.

Town | Coffeeshops
--- | ---:
Andover | 2
Essex | 5
Wallingford | 1

Run this in R

ctcorrelator(ctcoffeeshops, p=.9)

You'll get a new file called array_summary.csv that looks similar to this:

row | correlation | n()
--- | --- | ---:
1 | moderate.negative.correlation | 7
2 | moderate.positive.correlation | 70
3 | no.correlation | 12
4 | strong.negative.correlation | 3
5 | strong.positive.correlation | 103
6 | very.strong.positive.correlation | 8
7 | weak.negative.correlation | 6
8 | weak.positive.correlation | 49

You'll get a new file called strong.very.strong.csv that looks similar to this:

row | column.abbrev | corre | correlation | raw | column.name
--- | --- | --- | --- | --- | ---
1 | below.poverty | 0.947982822 | very.strong.positive.correlation | 0.947982822 | Below poverty
2 | g11 | 0.934302408 | very.strong.positive.correlation | 0.934302408 | Educational Attainment for the Population 25 Years and Over, 11th grade (City)
3 | female.householder.male.partner | 0.931860863 | very.strong.positive.correlation | 0.931860863 | Unmarried-partner Households by Sex of Partner, Female householder and male partner (City)

And then you'll also get a new file called plot.png.
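
The correlation labels in the CSV files above group coefficients into strength bands. The package's actual cutoffs are not stated here, so the sketch below uses assumed thresholds (0.2, 0.4, 0.7, 0.9) purely to illustrate the idea. `label_correlation()` is a hypothetical helper, not part of the package, and the two numeric vectors are made up.

```r
# Hypothetical helper: turn correlation coefficients into labels.
# The cutoffs are assumptions for illustration only.
label_correlation <- function(r) {
  strength <- cut(
    abs(r),
    breaks = c(-Inf, 0.2, 0.4, 0.7, 0.9, Inf),
    labels = c("no", "weak", "moderate", "strong", "very.strong"),
    right = FALSE
  )
  direction <- ifelse(r < 0, "negative", "positive")
  ifelse(
    strength == "no",
    "no.correlation",
    paste(strength, direction, "correlation", sep = ".")
  )
}

# Pearson correlation between two made-up numeric vectors
coffeeshops <- c(2, 5, 1, 4, 3)
below.poverty <- c(10, 24, 6, 21, 14)
r <- cor(coffeeshops, below.poverty)

labels <- label_correlation(c(r, -0.5, 0.1))
print(labels)
```

Binning on the absolute value and attaching the sign separately keeps the positive and negative bands symmetric, so only one set of cutoffs has to be maintained.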

Usage

ctcorrelator(dat_data, p=.9)

Arguments

What you'll need to start

What to run within R or RStudio

Assuming you are starting from scratch:

install.packages("devtools")
library(devtools)

install_github("trendct/ctnamecleaner")
library(ctnamecleaner)

Future versions

Will account for zip codes and census tracts or possibly blocks in Connecticut.

Version

0.3.1

License

MIT



trendct/ctnamecleaner documentation built on May 31, 2019, 7:47 p.m.