knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

When we receive the data frame containing every ad description, we clean it so that the descriptions are in lowercase and contain no accents, faulty characters, or punctuation.

#clean(annonces)
#match_environnement("rue de montreuil", "extrait de la derniere description au 97 rue de montreuil dans un bel immeuble")
#match_environnement("metro passy", "extrait de la derniere description au 4 rue de l'abbe gillet metro passy")
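As an illustration of the cleaning step described above, here is a minimal base R sketch. The helper name `clean_description` is hypothetical and is not the package's `clean()`; it only assumes that cleaning means lowercasing, mapping common French accented letters to plain ones, and replacing punctuation (apostrophes included) with a space.

```r
# Hypothetical helper (an assumption, not the package's clean()):
# normalises a character vector of ad descriptions.
clean_description <- function(x) {
  x <- tolower(x)
  # Map common French accented letters (a/c/e/i/o/u variants) to plain ASCII.
  accented   <- "\u00e0\u00e2\u00e4\u00e7\u00e9\u00e8\u00ea\u00eb\u00ee\u00ef\u00f4\u00f6\u00f9\u00fb\u00fc"
  unaccented <- "aaaceeeeiioouuu"
  x <- chartr(accented, unaccented, x)
  # Replace every character that is not a lowercase letter, digit, or space.
  x <- gsub("[^a-z0-9 ]", " ", x)
  # Collapse runs of spaces and trim both ends.
  trimws(gsub(" +", " ", x))
}

cleaned <- clean_description("Extrait de la derni\u00e8re description, au 97 rue de Montreuil!")
cleaned
#> [1] "extrait de la derniere description au 97 rue de montreuil"

# A literal search for a known street name in the cleaned text.
grepl("rue de montreuil", cleaned, fixed = TRUE)
#> [1] TRUE
```

Cleaning before matching means that street and metro names only need to be stored in one normalised form.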

- If an ad description mentions a street, we use the number function to get the street number, when available, by taking the number that precedes the word "rue" (French for "street") in the description.

#number <- function(castorus)
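The idea of taking the number that precedes "rue" can be sketched with a regular expression. The function `extract_street_number` below is a hypothetical stand-in, not the package's `number()`: it assumes the description has already been cleaned and that the street number is written as digits immediately followed by a space and the word "rue".

```r
# Hypothetical helper (an assumption, not the package's number()):
# returns the integer found just before "rue", or NA when there is none.
extract_street_number <- function(description) {
  # regexec() captures the digits in group 1; regmatches() returns
  # character(0) for descriptions without a match.
  hits <- regmatches(description, regexec("([0-9]+) rue ", description))
  vapply(
    hits,
    function(hit) if (length(hit) == 2) as.integer(hit[2]) else NA_integer_,
    integer(1)
  )
}

descriptions <- c(
  "extrait de la derniere description au 97 rue de montreuil dans un bel immeuble",
  "extrait de la derniere description au 4 rue de l'abbe gillet metro passy",
  "bel appartement proche metro passy"
)
extract_street_number(descriptions)
#> [1] 97  4 NA
```

Only the first occurrence in each description is returned, which may not be the right one if an ad mentions several addresses.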

- After collecting the extracted information, we combine it with the information already present in the scraped data. In the resulting data frame, the scraped data takes priority: the extracted data is used only to fill in values that the scraping returned as NA.

#combine_roads <- function(castorus,roadnames,metros)
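The priority rule described above (keep the scraped value, fall back to the extracted one only when the scraped value is NA) can be sketched in base R. The names `fill_missing`, `listings`, and `extracted_street` are hypothetical and the data is made up for illustration; this is not the body of `combine_roads()`.

```r
# Hypothetical helper (an assumption, not the package's combine_roads()):
# keeps the scraped value and uses the extracted one only where scraped is NA.
fill_missing <- function(scraped, extracted) {
  ifelse(is.na(scraped), extracted, scraped)
}

# Made-up example data: one street column from the scraping step.
listings <- data.frame(
  street = c("rue de montreuil", NA, NA),
  stringsAsFactors = FALSE
)
# Street names extracted from the descriptions, aligned row by row.
extracted_street <- c("rue de charonne", "rue de l abbe gillet", NA)

listings$street <- fill_missing(listings$street, extracted_street)
listings$street
#> [1] "rue de montreuil"     "rue de l abbe gillet" NA
```

The first row keeps its scraped value even though an extracted one exists, and the third row stays NA because neither source provides a street.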


paris-appartemnt-project/apartment_project documentation built on May 6, 2019, 4:33 p.m.