README.md

manager

Tools for wrangling, managing, and understanding data.

Core tools:

Miscellany:

Tags:

Click here for additional details on winograd() function Each time the function is run, it pulls, via web scraping with rvest, the text of one Winograd schema from here (website created by Ernest Davis; available under a CC 4.0 license).

A Winograd schema is a sentence that includes an ambiguous pronoun that could refer to either of two antecedent nouns. Which noun the pronoun is rightly associated with depends on which of two words/phrases is present elsewhere in the sentence. For example:

I spread the cloth on the table in order to [protect/display] it.

If the sentence is written as "...to protect it," then it refers to the table. If the sentence is written as "...to display it," then it refers to the cloth.

Winograd schemas require commonsense human reasoning, and they're difficult for computers to resolve. Picking a sentence construction (e.g., "...to protect it" or "...to display it") and asking a question that tests one's understanding of the pronoun's identity (e.g., "What is being [protected][displayed]?") can be an effective way to distinguish people and bots in online surveys. (This is especially true if multiple Winograd schemas are presented; the chance of a bot successfully "guessing" its way past three Winograds is just 12.5%.)

Back when I ran survey studies, I implemented Winograd schemas to preserve data quality when collecting responses via Prolific/Reddit/MTurk/etc. My experience is that they can do a bit too good of a job of flagging responses as potential bots: It's not hard to give the wrong response to a Winograd schema, especially if you're moving quickly. But I often preferred to be overly conservative in the face of bot risk/low-attention responses.



jacob-gg/manager documentation built on July 2, 2024, 2:09 a.m.