divar: Divar posts dataset

Description

The dataset contains 947635 posts published on the Divar classified ads platform. All posts were published and archived before 2017. The distribution of posts across the different groups does not match the actual distribution on the platform. The posts come from six major Divar categories:

Vehicles
Electronic devices
Businesses
For the home
Personal
Leisure & Hobbies

Usage

Format

A tibble with 947635 rows and 17 variables:

id

Unique id of the post (1246772 ids)

archive_by_user

Whether the post was archived by the user or automatically by the system (True/False)

published_at

Weekday and hour the post was published

cat1

First level category of the post (one of the six major categories listed above, for example vehicles)

cat2

Second level category of the post (can be empty)

cat3

Third level category of the post (can be empty) (for example light)

city

Name of the city the ad was published in (for example Mashhad)

title

Title of the post. All phone numbers are replaced by the $NUM token.

desc

Description of the post. All phone numbers are replaced by the $NUM token.
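
Because the placeholder starts with `$`, which is a regular expression metacharacter, searching for it needs literal matching. The following is a minimal sketch of flagging and counting placeholders; the sample strings and the names `has_phone` and `n_phone` are illustrative assumptions and do not come from the dataset.

```r
# Hypothetical sample descriptions using the documented $NUM placeholder.
desc <- c("Call $NUM after 5 pm", "No calls please", "Text $NUM or $NUM")

# fixed = TRUE matches "$NUM" literally; as a regex, "$" would mean end of string.
has_phone <- grepl("$NUM", desc, fixed = TRUE)

# Count placeholders per post by comparing lengths before and after removing them.
n_phone <- (nchar(desc) - nchar(gsub("$NUM", "", desc, fixed = TRUE))) / nchar("$NUM")
```

The same pattern should apply to the title column, since it uses the same token.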

price

Listed price of the post

image_count

Number of images for the post

platform

The platform the post was submitted from (mobile or web)

mileage

(Only for light vehicles) the mileage of the vehicle posted in kilometers.

brand

(Only for light vehicles and electronic devices) English and Persian names of the brand, separated by :: (two colons)
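
As a rough illustration of the two-part layout, the sketch below splits such values into separate English and Persian vectors using base R. The sample values, the helper `pick`, and the output names `brand_en` and `brand_fa` are assumptions made for the example; how missing brands are actually encoded in the dataset (NA or empty string) is not documented here.

```r
# Hypothetical sample values in the documented "English::Persian" layout.
brand <- c("Samsung::سامسونگ", "Peugeot::پژو", NA)

# Split on the literal two-colon separator; fixed = TRUE avoids regex parsing.
parts <- strsplit(brand, "::", fixed = TRUE)

# Pick the n-th piece of each split, returning NA when it is missing.
pick <- function(p, n) if (length(p) >= n) p[n] else NA_character_

brand_en <- vapply(parts, pick, character(1), n = 1)
brand_fa <- vapply(parts, pick, character(1), n = 2)
```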

year

(Only for light vehicles) production year of the vehicle in the Iranian calendar. The special value <1366 means the vehicle was produced before the year 1366.
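
The special value means this field cannot be converted to a number directly. One possible way to handle it, assuming the column is stored as character, is to keep a separate flag for the open-ended bucket; the sample values and the names `year_before_1366` and `year_num` are hypothetical.

```r
# Hypothetical sample values; "<1366" is the documented special value.
year <- c("1390", "<1366", "1375", NA)

# Flag the open-ended bucket, then strip the "<" before converting to integer.
year_before_1366 <- !is.na(year) & year == "<1366"
year_num <- as.integer(sub("^<", "", year))

# The bucket has no exact year, so keep it missing rather than treating it as 1366.
year_num[year_before_1366] <- NA_integer_
```

Keeping the flag separate preserves the information that the vehicle is old without inventing a production year for it.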

type

(Only for clothing) boys/girls for children's clothing and men/women for adult clothing

Source

Cafebazaar Research Group


mcnakhaee/dadegan documentation built on Sept. 3, 2020, 2:19 a.m.