Michał Bojanowski (Kozminski University), Dominika Czerniawska (University of Manchester and University of Warsaw), Wojciech Fenrich (University of Warsaw)
The authors thank the (Polish) National Science Centre for support through SONATA grant 2012/07/D/HS6/01971 for the project "Dynamics of Competition and Collaboration in Science: Individual Strategies, Collaboration Networks, and Organizational Hierarchies" (http://recon.icm.edu.pl).
This dataset comes from a qualitative study of 40 individual in-depth interviews (IDIs) conducted between April and August 2016 as part of the RECON project on collaboration in Polish science. This repository is an R package, but the data is also stored in portable CSV format so that it can be used with any other analytical software.
The data consists of 40 individual in-depth interviews conducted between April and August 2016 by two interviewers. The interviewees mentioned 333 collaborators in total. The sample consists of 20 female and 20 male scientists from six Polish cities. Respondents represented a broad range of disciplines (natural sciences, social sciences, life sciences, the humanities, engineering, and technology) and career stages, from PhD candidates to professors.
Each interview consisted of several parts, two of which are of relevance here. The resulting data is stored in the `nodes`, `collaboration`, and `resources` tables, described in detail below.

While the collaboration networks include alter-alter ties, the data on resources is available only for ego-alter dyads.
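The distinction between ego-alter and alter-alter ties can be sketched in base R. The example assumes that the ego appears in the edge list under node number 0 (the value documented for the respondent in the `nodes` table). The rows and the names `collab` and `tie_type` are hypothetical and serve only as an illustration.

```r
# Toy edge list in the shape of the `collaboration` table.
# The rows are hypothetical, not taken from the data.
collab <- data.frame(
  id_interview = c(2, 2, 2, 2),
  from = c(0, 0, 1, 2),
  to   = c(1, 2, 2, 3)
)

# The ego has node number 0, so a tie is ego-alter if either endpoint is 0
collab$tie_type <- ifelse(
  collab$from == 0 | collab$to == 0,
  "ego-alter",
  "alter-alter"
)

# Number of ties of each type
table(collab$tie_type)
```

Under this assumption, every row of the `resources` table should classify as an ego-alter tie.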
The data is contained in three tables as shown in the diagram below:
In all tables the `NA` symbol (Not Available) is used to encode missing information.
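For readers working with the CSV files, the following sketch shows how missing values are picked up in base R. It assumes the CSV files spell missing values as the literal string `NA`, which this README does not state explicitly. The three rows are copied from the interview 2 example shown later in this README, and `nodes_toy` is a made-up name.

```r
# A small CSV fragment in the shape of the `nodes` table
# (rows copied from the interview 2 example in this README)
csv_text <- "id_interview,id_node,is_ego,is_polish,department,scidegree,female
2,0,1,1,1,dr,0
2,5,0,1,NA,dr,1
2,7,0,1,NA,NA,NA"

# read.csv() treats the string "NA" as missing by default (na.strings = "NA")
nodes_toy <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Number of missing values in each column
n_missing <- colSums(is.na(nodes_toy))
n_missing

# Share of women among the nodes with known gender
mean(nodes_toy$female, na.rm = TRUE)
```

Most summary functions in R return `NA` when any input is missing, so `na.rm = TRUE` (or an explicit `is.na()` filter) is usually needed.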
The `nodes` table has 374 rows and the following 7 columns:

- `id_interview` – Unique interview identification number.
- `id_node` – Node number unique within each interview. Value `0` corresponds to the respondent (the ego).
- `is_ego` – A binary variable which is equal to `1` for the respondents (the egos) and `0` otherwise.
- `is_polish` – A binary variable which is equal to `1` if the researcher is affiliated with a Polish academic institution and `0` otherwise.
- `department` – A numeric variable indicating whether two persons are affiliated with the same department at the same academic institution. Two researchers (i) and (j) mentioned in the same interview are affiliated with the same department if they both have valid values of `department` and these values are equal.
- `scidegree` – Character variable encoding scientific degree. Values are `"mgr"`=MA, `"dr"`=PhD, `"drhab"`=habilitated PhD, and `"prof"`=full professor.
- `female` – Binary variable which is equal to `1` if the researcher is female and `0` if male.

The `collaboration` table has 1732 rows and the following 3 columns:
- `id_interview` – Unique interview identification number.
- `from` and `to` – Node numbers referencing the `id_node` column of the `nodes` table. As with `id_node` in the `nodes` table, the values are unique within each interview. Each row is a pair of researchers declared as collaborators. For example, a row with `id_interview=2`, `from=1`, and `to=2` indicates that in interview 2 nodes 1 and 2 were mentioned by the respondent as collaborators.

The `resources` table has 1761 rows and the following 4 columns:
- `id_interview` – Unique interview identification number.
- `from` and `to` – Node numbers referencing the `id_node` column of the `nodes` table. As with `id_node` in the `nodes` table, the values are unique within each interview.
- `code` – Character variable indicating what type of resource was declared to flow from researcher `from` to researcher `to` in interview `id_interview`.

Possible values for variable `code` are:
| code                               |
| :--------------------------------- |
| career_development                 |
| conceptualization                  |
| contacts_in_academia               |
| data_analysis                      |
| data_curation                      |
| data_or_other_sources              |
| drafting                           |
| equipment                          |
| formal_administration              |
| funding_acqusition                 |
| investigation                      |
| knowledge_other                    |
| methodology                        |
| motivation                         |
| non_academic_contacts              |
| other_charactersitics              |
| other_input                        |
| prestige                           |
| professional_achievements_formal   |
| project_administration             |
| proofreading                       |
| prototype_construction             |
| software_creation                  |
| supervision_in                     |
| traits_of_character                |
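Because node numbers are unique only within an interview, any join between `resources` and `nodes` has to match on the interview number as well as the node number. A minimal base R sketch follows. The two node rows come from the interview 2 example in this README, while the resource rows and the names `nodes_toy`, `res_toy`, and `flows` are hypothetical.

```r
# Two nodes of interview 2 (values from the example table in this README)
nodes_toy <- data.frame(
  id_interview = c(2, 2),
  id_node      = c(0, 1),
  department   = c(1, 2)
)

# Hypothetical resource flows; the rows are made up for illustration.
# Code values are spelled exactly as they appear in the data.
res_toy <- data.frame(
  id_interview = c(2, 2, 2),
  from = c(0, 1, 1),
  to   = c(1, 0, 0),
  code = c("funding_acqusition", "data_analysis", "drafting"),
  stringsAsFactors = FALSE
)

# Attach sender attributes: match on interview AND node number,
# because id_node is unique only within an interview
flows <- merge(
  res_toy, nodes_toy,
  by.x = c("id_interview", "from"),
  by.y = c("id_interview", "id_node")
)

# Count flows by resource type
table(flows$code)
```

Matching on `from` alone would pair each flow with the same-numbered node of every interview, which is why both key columns are listed.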
Below are example node data from interview 2 and example plots from interview 3.
Node data:
```r
library(dplyr)
library(reconqdata)

nodes %>%
  filter(id_interview == 2) %>%
  knitr::kable()
```
| id_interview | id_node | is_ego | is_polish | department | scidegree | female |
| -----------: | ------: | -----: | --------: | ---------: | :-------- | -----: |
|            2 |       0 |      1 |         1 |          1 | dr        |      0 |
|            2 |       1 |      0 |         1 |          2 | dr        |      0 |
|            2 |       2 |      0 |         1 |          3 | dr        |      1 |
|            2 |       3 |      0 |         1 |          3 | dr        |      1 |
|            2 |       4 |      0 |         1 |          2 | dr        |      1 |
|            2 |       5 |      0 |         1 |         NA | dr        |      1 |
|            2 |       6 |      0 |         0 |         NA | prof      |      0 |
|            2 |       7 |      0 |         1 |         NA | NA        |     NA |
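The same-department rule given for the `department` column can be sketched using the interview 2 values shown above. The helper `same_department()` is hypothetical and not part of the package. It returns `NA` when either department is missing, which is one possible reading of "valid values".

```r
# Department codes for interview 2, copied from the table above
nodes2 <- data.frame(
  id_node    = 0:7,
  department = c(1, 2, 3, 3, 2, NA, NA, NA)
)

# Hypothetical helper: TRUE if nodes i and j of one interview share a
# department, FALSE if they do not, NA if either department is missing
same_department <- function(nodes, i, j) {
  di <- nodes$department[nodes$id_node == i]
  dj <- nodes$department[nodes$id_node == j]
  di == dj
}

same_department(nodes2, 2, 3)  # both have department 3
same_department(nodes2, 0, 1)  # departments 1 and 2
same_department(nodes2, 1, 5)  # department of node 5 is missing
```

The comparison is meaningful only for nodes from the same interview, so the data should be filtered by `id_interview` first.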
Collaboration network:
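```r
library(igraph)

# Collaboration network of interview 3 as an undirected simple graph
g <- collaboration %>%
  filter(id_interview == 3) %>%
  select(-id_interview) %>%
  graph_from_data_frame(directed = FALSE) %>%
  simplify()

# Stress-based layout, reused for the resource plots below
xy <- graphlayouts::layout_with_stress(g)

plot(
  g,
  layout = xy,
  vertex.color = "white",
  edge.color = "black",
  vertex.label.color = "black"
)
```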
Resource flows:
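```r
# Resource flows of interview 3 as a directed graph with edge attribute `code`
edb <- resources %>%
  filter(id_interview == 3) %>%
  select(-id_interview) %>%
  arrange(from, to)
rg <- graph_from_data_frame(edb)
rnames <- sort(unique(E(rg)$code))

# One panel per resource type on a 4 x 4 grid
layout(matrix(1:16, 4, 4))
opar <- par(mar = c(0, 0, 1, 0))
for (r in rnames) {
  # Keep only the edges carrying resource r
  rgs <- delete_edges(rg, E(rg)[code != r])
  plot(
    simplify(rgs),
    layout = xy, # layout computed for the collaboration network
    vertex.color = "white",
    edge.color = "black",
    vertex.label.color = "black",
    main = r
  )
}
par(opar)
layout(1)
```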
This is an R package, but you can download the files in CSV format using links below:
This is an R package so you can install it directly from GitHub using:
```r
remotes::install_github("recon-icm/reconqdata")
```
MIT license, see file `LICENSE.md`.
TBA. Please contact the authors for now.