In Framus94/HierarchiesAndCareers:

options(scipen = 999)
knitr::opts_chunk$set(echo = FALSE,
                      out.height = "89%", out.width = "89%",
                      fig.align = "center")

options(xtable.comment = FALSE)
# data manipulation
library(tidyverse); library(magrittr)
# networks
library(tidygraph); library(ggraph); library(igraph)
# layout
library(wesanderson); library(xtable); library(kableExtra)

# colors for plots
#venue_colors_fill <- scale_fill_manual(values = c(wes_palette("Zissou1")[c(5,4,2)], wes_palette("Darjeeling2")[2]))
venue_colors_fill <- scale_fill_manual(name = "Exhibition\nvenue", 
                                       breaks = c("Collector", "Gallery", "Non Profit", "Museum"),
                                       values = c(wes_palette("FantasticFox1")[5], # dark red
                                                  wes_palette("Zissou1")[c(4,2)], #  orange, light blue
                                                  wes_palette("Darjeeling2")[2])) # dark blue


#venue_colors_color <- scale_color_manual(values = c(wes_palette("Zissou1")[c(5,4,2)], wes_palette("Darjeeling2")[2]))
venue_colors_color <- scale_color_manual(name = "Exhibition\nvenue", 
                                         breaks = c("Collector", "Gallery", "Non Profit", "Museum"),
                                         values = c(wes_palette("FantasticFox1")[5],
                                                    wes_palette("Zissou1")[c(4,2)], 
                                                    wes_palette("Darjeeling2")[2]))

# set up custom color palette
# color_palette <- data_frame(
#     # Correspondents mentioned
#     corres = unique(archive_aggr$Correspondent),
#     # define custom colors for plot and rearrange their order of appearance
#     colours =  c(wes_palette("Darjeeling1")[c(1,2,4,5)],
#                  wes_palette("Darjeeling2")[1:2],
#                  wes_palette("Darjeeling2")[4:5],
# 
#                  wes_palette("IsleofDogs2")[3],
#                  wes_palette("IsleofDogs1")[1],
#                  wes_palette("Cavalcanti1")[4],
#                  wes_palette("FantasticFox1")[2], # snake yellow
#                  wes_palette("Moonrise3")[c(2,4)], # douce pink, douce olive
#                  wes_palette("GrandBudapest1")[2], # magenta
#                  wes_palette("FantasticFox1")[5] # dark red
#                  )[1:length(unique(archive_aggr$Correspondent))]
#   )

drake::loadd()
devtools::load_all()


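# Split the aggregate "Americas" category: the United States, Canada and
# Mexico become "North America", all remaining "Americas" entries "South America"
exhplaces <- exhplaces %>% 
  mutate(continent = if_else(country %in% c("United States", "Canada", "Mexico"), 
                             "North America", continent), 
         continent = if_else(continent == "Americas", "South America", continent))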

\newpage \pagenumbering{arabic}

# Introduction {-}

How do visual artists attain recognition? In this thesis, I argue that the emergence of recognition can be analyzed through the artist's exhibition biography. More specifically, successful and less successful careers can be distinguished by the prestige of the exhibition venues to which artists have access. The thesis thereby contributes to research on the question of in which interactions actors attain status [@yogev2010, 513]. To begin with, analyzing processes of status formation helps to explain how differences in the artistic and economic value of products arise on markets for cultural goods: status acts upon the "perceived product quality" (Ibid) of an actor's outcomes. Similarly, Fabien Accominotti [-@accominotti2018conse, 4] argues that studying processes of status formation is crucial for explaining inequalities between individuals. First, two individuals of equivalent merit are perceived differently, and might also achieve different results, if one of them has been endorsed and the other has not. Second, the outcomes of those who have attained status are less likely to be challenged. Therefore, he argues, the formation of status is also the foundation of legitimate inequalities.

In order to address the question posed above, I first draw on the results of @fraiberger2018, who observe a path dependency between the first five exhibitions and the further course of an artist's career. In other words, a privileged starting position helps to maintain access to other prestigious venues. I obtain similar results using a different dataset but the same methodology. Second, I show that in spite of the observed lock-in effects, trajectories do not follow a regular course. Hence, a high starting position does not imply that artists will exclusively exhibit in the most coveted locations. Finally, I examine whether there are specific patterns of movement between exhibition venues by which artists attain, maintain, or lose status. I show that the opportunity to present a broader selection of works has to be weighed against the possibility of exhibiting together with other artists but in more significant locations. With regard to particular career phases, however, no pattern of movement could be identified that distinguishes successful from less successful trajectories.

This study applies methods of network analysis and descriptive statistics to a dataset that combines exhibition data from the website artist-info.com with artist-related variables from the biographical encyclopedia Allgemeines Künstlerlexikon. In chapter \@ref(hypotheses), I will review the previously mentioned literature, focusing on those aspects that help to explain the course of artists' careers. In this context, I will present the hypotheses derived from these studies. Chapter \@ref(exhibition-venues) is dedicated to developing a plausible and reliable measure for the status of exhibition venues. For this purpose, I will discuss how the status of an exhibition venue varies over time and depends on the available data. Subsequently, I will evaluate whether the status order is consistent with other sources. These steps constitute the foundation for analyzing artists' careers as sequences of affiliations with locations of higher or lower prestige, as presented in chapter \@ref(artist-careers). Finally, I will specify the limitations of this thesis and draw a conclusion from the results in chapters \@ref(limitations) and \@ref(conclusion).

# Hypotheses {#hypotheses}

In a recently published study, @fraiberger2018 argue that the first exhibitions and the prestige of the associated venues are of particular importance for artists' careers. Based on the exhibitions of 31794 individuals, the authors show that the initial exhibitions influence the venues in which artists continue to exhibit during their lifetime. More precisely, 58.6% of the artists whose first five exhibitions take place, on average, in the most prestigious 20% of venues will also have access to the most select venues during their last five exhibitions. In view of their findings, the authors argue that "artistic careers can be interpreted in the context of the institutions to which they have access" (Ibid, 2). Indeed, they find that access to prestigious places not only improves the chances of reaching other important places. The authors also provide evidence for a relation between the artists' first exhibitions and the length of documented careers. Ten years after the fifth exhibition, 39% of the artists who began their careers in prestigious venues continued to exhibit. In contrast, only 14% of the artists beginning in the lower 40% of the exhibition venues have a trajectory documented for longer than 10 years. Moreover, artists whose trajectories begin at select venues have about twice as many exhibitions as those who begin at less prestigious venues. Finally, their works are associated with higher prices both at auctions and in galleries. Hence, these results suggest that the first exhibitions have a strong influence on the artists' further exhibition biography, their artistic recognition and the economic value of their artworks. Thus, @fraiberger2018 not only propose how to empirically operationalize the reputation of artists, as the title of their study indicates. The observed path dependency also contributes to explaining how inequalities among artists emerge. 
Is it possible to replicate the authors' results with a different dataset? As I am particularly concerned with the development of artistic recognition, I test the following hypothesis.

H1: A high prestige of venues hosting the artists' first five exhibitions coincides with a high prestige of the artists' last five exhibition venues.
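
The comparison behind H1 can be illustrated with a minimal sketch in base R. Everything in it is an assumption made for illustration: the data are simulated, the column names `artist`, `show_no` and `prestige` are hypothetical, and the artists are binned into quintiles of their own average prestige, which simplifies the venue-based ranking described above. It is not the analysis reported in this thesis.

```r
# Simulated toy data (hypothetical): 50 artists with 12 exhibitions each.
# `prestige` stands in for a venue prestige score between 0 and 1.
set.seed(1)
n_artists <- 50
n_shows <- 12
exhibitions <- data.frame(
  artist   = rep(sprintf("artist_%02d", seq_len(n_artists)), each = n_shows),
  show_no  = rep(seq_len(n_shows), times = n_artists),
  prestige = runif(n_artists * n_shows)
)

# Make sure exhibitions are in chronological order within each artist
exhibitions <- exhibitions[order(exhibitions$artist, exhibitions$show_no), ]

# Average venue prestige of the first and of the last five exhibitions
first_five <- tapply(exhibitions$prestige, exhibitions$artist,
                     function(x) mean(head(x, 5)))
last_five  <- tapply(exhibitions$prestige, exhibitions$artist,
                     function(x) mean(tail(x, 5)))
careers <- data.frame(artist = names(first_five),
                      first  = as.numeric(first_five),
                      last   = as.numeric(last_five))

# Assign quintiles by rank (1 = lowest 20%, 5 = highest 20% of artists)
to_quintile <- function(x) {
  ceiling(5 * rank(x, ties.method = "first") / length(x))
}
careers$first_q <- to_quintile(careers$first)
careers$last_q  <- to_quintile(careers$last)

# Row-wise shares: for each starting quintile, where do careers end?
transition_shares <- prop.table(
  table(first = careers$first_q, last = careers$last_q),
  margin = 1
)
print(round(transition_shares, 2))
```

Under H1, the share in the bottom-right cell (starting and ending in the highest quintile) should clearly exceed the 20% expected if starting and final positions were unrelated. With the simulated, independent scores used here, no such concentration is to be expected.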

\newpage

For the field of contemporary literature, Dubois and François [-@dubois2013, 515] have shown that "publishing trajectories progressively and cumulatively widen the inequalities between poets". Hence, careers do not only feature a path dependency, that is, superior entry positions coincide with better positions at the end of a trajectory. Instead, the authors find that individuals can even improve on their higher starting positions. Dubois and François [-@dubois2013, 505] obtain these results by first creating three categories of poets that differ in the degree to which the individuals received institutional, critical and academic recognition over the course of their careers. Based on the sequences of the poets' publications with high, mid-range or low status publishers and in different formats such as paperback or others, the authors show that the publishing trajectories leading to one of the three reputation classes differ significantly from each other. More precisely, they apply an optimal matching analysis to the poets' sequences of publications that yields three types of trajectories (Ibid, 509). With a logit model the authors show that belonging to one of the types of trajectories significantly increases the likelihood of belonging to one of the three reputation classes (Ibid, 513). Finally, they show that the proportions of poets publishing in a paperback edition or with a major publisher become increasingly unequal across the three trajectory types (Ibid, 515). In other words, "inequalities, moderate at the beginning of the sequences, thus become progressively greater" (Ibid). \newline Arguably, the tendency of cumulative reputation seems to primarily apply to the trajectory types one and three when comparing the beginning to the end of the sequences. 
Indeed, the proportions of publishing in paperback or with a major publisher increase by a factor of $\frac{75\%}{46\%}=$ `r round(75/46, 2)` and $\frac{51\%}{34\%}=$ `r round(51/34, 2)` respectively.[^prop] In the second type of trajectories, however, the distribution changes only slightly ($\frac{42\%}{40\%}=$ `r round(42/40, 2)`). It is therefore all the more worth asking whether the cumulative quality of reputation applies to the careers of visual artists.

[^prop]: The percentages are given by @dubois2013 in the text on p. 515.

H2: Differences in reputation between artists at the beginning of their career will manifest in more striking differences in later career phases.

\newpage

In other words, I expect individuals to gain access to even more prestigious venues if their artworks have been shown in prestigious venues before. However, this does not imply that careers follow a regular, ideally even linear course. @dubois2013 find that the careers leading to the highest recognition are at least as irregular as those of less recognized poets. Specifically, they "observe that poets go from a major publisher to a minor one, from a mid-range publisher to a paperback edition, from a paperback to publishing an essay of a play with a mid-range publisher, etc" (Ibid, 516). Hence, the careers of poets are irregular although they may show a distinct tendency on average. Consider the most successful trajectories as an example. "[W]hen these poets find themselves in modest publishing states, the chances they will transition back into a dominant state (major publisher or paperback) is each close to 50%" (Ibid, 517). The authors contrast this finding with analyses [@giuffre1999; @menger2009] in which irregular engagements are seen as evidence of low status, since these individuals seem to be incapable of maintaining relationships with high-status actors. Similarly, in opposition to Joel Podolny's [-@podolny2005] model of market competition, they argue that "actors do not associate exclusively with partners of equivalent status, and when reputable poets associate with mid-range or minor publishing houses this does not weaken their position in the hierarchy" [@dubois2013, 518]. In the case of visual artists, I expect that individuals will not exhibit exclusively in high-status venues even if their previous exhibitions took place in the most prestigious venues.

H3: Careers of artists with a high recognition at the beginning of their career show at least as much variability as careers of lower initial recognition.
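
Transition chances such as the roughly 50% reported by @dubois2013 can be read off a first-order transition matrix. The following base R sketch shows how such a matrix could be estimated from career sequences. The three sequences and the state labels `low`, `mid` and `high` are hypothetical, and the helper `transition_matrix()` is introduced here for illustration only; it is a much simpler summary than the optimal matching analysis used by the authors.

```r
# Hypothetical career sequences: each vector lists the status of the
# venues (or publishers) of one individual in chronological order
sequences <- list(
  c("high", "low", "high", "mid", "high"),
  c("low", "low", "mid", "low"),
  c("high", "high", "mid", "high")
)
states <- c("low", "mid", "high")

# Count all moves between consecutive states and convert the counts
# into row-wise shares (rows = origin state, columns = destination state)
transition_matrix <- function(seqs, states) {
  counts <- matrix(0, nrow = length(states), ncol = length(states),
                   dimnames = list(from = states, to = states))
  for (s in seqs) {
    from <- head(s, -1)  # every state except the last one
    to   <- tail(s, -1)  # every state except the first one
    for (i in seq_along(from)) {
      counts[from[i], to[i]] <- counts[from[i], to[i]] + 1
    }
  }
  counts / rowSums(counts)
}

P <- transition_matrix(sequences, states)
print(round(P, 2))
```

In this toy example, `P["mid", "high"]` is the share of moves from a mid-range state back into a high-status state. Comparing such matrices between careers with high and low initial recognition is one possible way to describe how regular or irregular the respective trajectories are.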

Even if actors risk diluting their status when they affiliate with lower-status others, there is a specific motivation for poets to cooperate with minor publishers. @dubois2013 argue that ambitious poetic work requires a considerable amount of time. Simultaneously, there is a need "to stay visible in the social space of poetry via publications" (Ibid, 517). Therefore, even the most renowned poets publish several short collections with minor publishers, as major publishers are unlikely to release at a higher frequency. Rather, major publishers assemble the shorter collections, the poets' writings of several years, and publish them in, for example, paperback format (Ibid). In other words, poets descend in the status hierarchy to realize a particular project that can be accomplished with minor, but not with major publishers. \newline In a similar vein, we can think about exhibitions of visual artists. Initially, it seems reasonable to differentiate by the status of exhibition venues in the sense that an exhibition at a prestigious venue contributes more to an artist's recognition than a show in a less prestigious venue. However, not all exhibitions are of equal importance. A solo exhibition, in which the artist's works are given all the attention, has a different weight than a group exhibition at the same place, in which the works are displayed together with those of other artists. Accordingly, Laura Braden [-@braden2009, 446] notes that solo exhibitions are "the most legitimizing form of exhibition". Exhibition titles often include the artist's name and therefore "indicate a greater level of valorization" (Ibid.). Consequently, it seems reasonable that artists strive for a solo rather than a group exhibition. At the same time, an exhibition that combines the most prestigious venues with the most distinguishing format seems to be a rather exceptional event.

Arguably, the exhibition format that most closely corresponds to the argument of @dubois2013 is the retrospective, a special case of a solo exhibition. Consider, for example, the show "Adrian Piper: A Synthesis of Intuitions 1965-2016", hosted by The Museum of Modern Art. In collaboration with Piper, artworks from five decades have been assembled that represent the artist's various subjects and utilized materials [@momaPiper]. Similar to the case of poets, the most prestigious exhibition venues might display only the most significant contributions of the last few years or even of the artists' entire career. Conversely, lower-status venues could be more likely to exhibit also the more recently produced artworks. Unfortunately, the available dataset does not cover details of the \mbox{exhibited} works and the time period from which they originate. In this thesis, it is therefore not possible to assess whether the requirements for the artists' works differ according to the prestige of the exhibition venues. \newline Nonetheless, it can be ascertained if solo and group exhibitions are equally accessible at all prestige levels. In other words: Is there a trade-off between the significance of the exhibition format and the exhibition venue?

Borkenhagen and Martin [-@borkenhagen2018] even argue that successful and less successful trajectories can be distinguished by the way individuals differentiate, in particular career phases, between the status of organizations and the status of the positions available. The authors analyze this interrelation empirically through the careers of chefs in restaurants. They show that individuals who later become chefs in the most prestigious places differ from those who end their careers in less prestigious organizations with regard to the types of movements throughout their careers. In the early years of their trajectories, future top chefs are more likely to seek positions in prestigious restaurants than to seek promotions, for example, from cook to sous-chef (Ibid, 18). In later phases of their careers these individuals descend in the status order of restaurants to occupy a higher position. In other words, they convert "organizational status" into "occupational status" (Ibid.). Yet, on average, these individuals are still employed in more prestigious restaurants than those who first sought promotions in low-profile restaurants (Ibid, 15). Hence, it is more effective to first occupy a minor position in an esteemed restaurant than to become chef de cuisine or even executive chef somewhere less prestigious. In this sense, future top chefs favor "organizational status" over "occupational status" in the early years of their careers (Ibid, 18). In contrast, individuals who end up in less prestigious restaurants "are much less likely to make initial investments in organizational status" (Ibid). Rather, they tend to "privilege occupational status over organizational status" (Ibid). This raises the question of whether there are certain patterns that distinguish successful and less successful trajectories in the visual arts. More specifically, do the careers differ regarding their movements to solo and group exhibitions in venues of higher or lower prestige?

H4: Successful trajectories differ from less successful trajectories in terms of how they balance organizational and occupational status.

In order to test these hypotheses, I will first operationalize the concept of prestige of exhibition venues in the next chapter. The artists' careers can then be interpreted as a sequence of affiliations with more prestigious or less prestigious exhibition venues. The hypothesized career patterns are then discussed in chapter \@ref(artist-careers).

\newpage

# Status in the fields of cultural production {#exhibition-venues}

Processes of status formation are one of the primary lines of research on the literary field [@rosengren1985; @dubois2013], the film industry [@kapsis1989; @cattani2014] and the visual arts [@braden2009; @yogev2010; @braden2016; @ertug2016; @fraiberger2018]. As stated in the introduction of this thesis, status serves as a key to explaining inequalities between artists and differences in the valuation of artworks. In the first section of this chapter, I develop the argument that access to exhibition venues is an indicator of the status of an artist. Although different terms are in use, such as status, prestige or reputation, I will refer throughout to a "recognition of the esthetic value of an artist by actors other than the artist herself" [@dubois2013, 502]. Based on this, the career of an artist can be understood as a sequence of affiliations with exhibition venues. Regarding the locations, the terms status or prestige refer to a centrality measure for networks, namely Eigenvector centrality. I will introduce this concept in the second section of this chapter.

## Careers as sequences of affiliations {#reputation-careers}

In her literature review on studies of careers in literature and visual arts, Janssen [-@janssen2001, 342] notes that artistic talent is not considered to be the most important variable to explain success. "Rather, artists’ fame and fortune is assumed to be largely dependent on a complex set of producers, distributors." Consider the case of the artist Marcel Duchamp, as an example. He selected functional, often mass-produced objects, gave them a title and declared those to be works of art. By proclaiming that the act of creation merely consists in making choices, he contested the tradition that art is handcrafted and reflects the artists' abilities [@momaDuchamp]. However, attaching a name to an ordinary object would be pointless, if there was not an ensemble of actors who, with reference to the tradition of the arts, assess this act as meaningful and valuable. In other words, the distinction between object and artwork as well as the recognition of individuals as artists is dependent on actors with the propensity to "consecrate", that is, "the injection of meaning and value" [@bourdieu1996rules, 171]. The art dealer, as Bourdieu [-@bourdieu1996rules, 168-9] exemplifies, introduces the artist to the "cycle of consecration and [...] into more and more select company and into more and more rare and exotic places (for example, in the case of a painter, group exhibitions, one-person exhibitions, prestigious collections, museums)". In addition, the artist receives an even more decisive consecration "the more consecrated the merchant [is] himself" (Ibid, 167). Therefore, artistic recognition is proportional to the work of intermediaries who contribute to the consecration of the work and the artist, who, in other words, establish the "belief" (Ibid, 172) in the social significance.

Arguably, there is a risk of underestimating the role of the artist from this perspective. Processes of value formation, the production of meaning and the establishment of careers are associated with intermediaries such as gallerists, critics and museum representatives. Indeed, the concept of the "dealer-critic-system", described in the seminal study of White and White [-@white1993], has been challenged for reducing the visual artist to a mere supplier of images [@graw2008, 129]. Certainly, addressing the question of how the recognition of artists emerges presupposes that the characteristics of the individuals are taken into account. In her extensive case study on Andy Warhol's career, for example, Zahner [-@zahner2006] illustrates that his strategies of self-enactment were strongly related to his artistic success. Consequently, the measurement of the artists' recognition requires an analytical strategy that captures the efforts of both the artists and the intermediaries.

Similar to @fraiberger2018, I argue that the artists' recognition can be interpreted with respect to the exhibition venues to which they have access. More precisely, the selection and the selectivity of these venues are indicators of the artist's recognition. Gallerists and curators evaluate, in view of their accountability towards their audiences, which artists provide contributions that are eligible for an exhibition [@ertug2016, 114]. In other words, they draw a line between those whose works are worth exhibiting and those whose works are not. In this sense, the intermediaries' admission decisions are indicative of the recognition of the artist. At the same time, not all venues are equally selective regarding the artists they exhibit or the venues they exchange artworks with. @fraiberger2018 have shown that for exhibitions in prestigious venues, artists tend to be eligible only if their works have been shown in other prestigious venues before.[^selectivity] Therefore, the decision on which artist is eligible for an exhibition is also an assessment of the significance of previous events in the artist's career.

[^selectivity]: The selectivity regarding the artist's previous affiliations applies to some prestigious venues in particular: "The link weight between Museum of Modern Art (MoMA) and Guggenheim was 33 times higher than expected if artists would move randomly between institutions [...], reflecting a highly concentrated movement of selected artists between a few prominent institutions" [@fraiberger2018, 1].

This makes it possible to interpret a career as a sequence of affiliations with exhibition venues to which artists are invited. This seems reasonable, not only because the artists' chances of accessing exhibition venues are an indicator of the artistic significance attributed to them. In the fields of cultural production, work also tends to consist of short-term projects rather than long-term relationships with the same employer. Therefore, the most common way for individuals to move upward is not promotion but affiliation with higher-status organizations [@faulkner1987, 48]. In the next section, I will introduce an empirical concept to measure the prestige of exhibition venues.

## Towards a network-based definition of status

Where do artists wish to present their works? To whom would gallery owners and museum representatives like to sell or lend works of art? According to Bourdieu's study of the art field, both artists and exhibition venues compete for symbolic capital, that is, appreciation, status and reputation [@bourdieuhandbuch2014kapital, 138]. In this section, I argue that this observation, namely that symbolic capital is the dominant resource in the art field, makes it possible to derive an empirical concept of the prestige of exhibition venues. More precisely, the pursuit of symbolic capital structures the circulation of artworks between exhibition venues. The mode of their circulation gives an indication of the venues' prestige.

From the artist's perspective, to begin with, exhibitions are an essential part of the process of artistic recognition. Consider, for example, the historical study of Oskar Bätschmann [-@batschmann1997, 9]. He explores how, with the disappearance of court patronage and the loss of importance of state and academic authorities, public exhibitions have become the most important forum for the recognition and rejection of visual art. Therefore, it seems reasonable that artists assign particular importance to exhibitions for their careers. At the same time, it is crucial in which venues the artists' works are displayed. The higher the degree of consecration of an exhibition venue, as Bourdieu [-@bourdieu1996rules, 167] argues, the more it will contribute to the artist's recognition. Congruently, as discussed in chapter \@ref(hypotheses), @fraiberger2018 have shown that artists have access to more prestigious venues the more important their previous venues have been. Consequently, artists seem to be more likely to choose high-profile than low-profile venues for their exhibitions.

This does not imply, however, that artists orient their exhibition activities exclusively towards the esteem of the location. As @dubois2013 argue in the case of poets, individuals also affiliate with organizations on the basis of projects that they can realize with lower-status rather than with high-status organizations. Expecting individuals to focus on prestigious venues also does not entail that they can always identify which venues are of a higher profile than others. Fraiberger et al. [-@fraiberger2018sm, 7] have shown that experts can indeed identify the most prestigious venues. Beyond the very top of exhibition sites, however, their assessments differ considerably from each other (see also section \@ref(status-evaluation)). Therefore, I expect only a tendency for artists to orient themselves towards those venues that contribute most to their artistic recognition. In this sense, the movements of artists to specific exhibition venues can be considered an implicit assessment of the significance attributed to the location.

Undoubtedly, it is up to gallery owners, curators and museum representatives rather than to the artists to decide whether and where artworks are exhibited [@yogev2010, 517]. Typically, as noted by Fraiberger et al. [-@fraiberger2018sm, 4], the artworks exhibited are those that curators or gallerists appreciate, in many cases works from the inventory of the exhibition site. Most frequently, artists are invited to participate in an exhibition and, in a few cases, can submit an application to be selected by a jury. Nevertheless, intermediaries also have preferences regarding the other venues in which the artworks at their disposal will be shown. Galleries, for example, benefit from exchanging works of art with museums. In her study on the art market in Israel, Yogev [-@yogev2010, 523] finds that "galleries receive the museum’s seal of approval, complementing their central position; they take advantage of significant additional display spaces and increase the value of their artists’ works". At the same time, she observes that the market is "dominated by a small, well-known elite group dictating the dominant artist tone [...], namely esteemed gallerists and curators" (Ibid, 530). Therefore, access to this circuit of influential players is a particularly beneficial resource for actors on art markets who seek to enhance the value of their artists' works. \newline Even institutions that are rather indifferent to increasing the economic value of their collection may have an interest in exchanging artworks with high-profile actors. Consider this announcement by The International Art Market Studies Association [-@artmarketstudies2017] for a symposium on the exchange of artworks for exhibitions between museums:

\begin{small}“Since the end of the 19th century, the expansion of temporary exhibitions has determined the emergence of an international system for museums, based on the circulation of artworks and objects. For museums, sharing pieces from their collection has become crucial to ensure that they in turn get the loans they need to organise their own exhibitions. Lending artworks to prestigious institutions, particularly foreign ones, also enables curators to guarantee a heightened visibility to their own collections. Where to exhibit, how often, and which pieces can be obtained from which partners: nowadays, these are the fundamental criteria of a museum’s positioning within the international hierarchy of cultural heritage prestige. But loan policy does simply affect an institution’s image: it acts directly on the definition of the objects. The acceptation or refusal of a loan is the result of complex transactions, formulated or not, during which the value of an artwork is negotiated and reviewed. It also reflects the importance and rank of institutions, sometimes even of towns and nations.” \end{small}

\newpage

knitr::include_graphics(
  c("figures/catalogue_sand.jpg",
    "figures/Cooper-Hewitt-Painting-Interior-of-de-Forest-House-label.jpg")
  )

This quote underlines that the exchange of artworks is essential for the realization of exhibition projects. At the same time, the exchange of artworks acts as a status signal. On the one hand, the importance of a museum is reflected in the institutions from which it can obtain exhibition objects. On the other hand, the request of an important museum does not only increase the value of the work of art, but also the "visibility" of the exhibition venue that lends the artwork. For example, lenders may be mentioned in the acknowledgements at the exhibition opening, or, in a less ephemeral way, in exhibition catalogs or on the exhibition labels of the displayed objects. This is illustrated by figure \@ref(fig:exhibition-catalogue-and-label), showing the list of those who provided artworks for the "Zeitgeist" exhibition at Martin-Gropius-Bau, Berlin, and the exhibition label indicating the provenance of an artwork by Walter Launt Palmer. Hence, if there are venues that intermediaries consider to be of particular importance for the valuation of their collections of artworks or their institutions, we would expect artworks to move toward these venues. In this sense, the way artworks circulate between venues through loans or sales can be understood as an evaluation of the significance of exhibition venues.

# create reproducible graph
set.seed(123)
g <- play_erdos_renyi(30, 3/30, directed = T) %>% 
  add_vertices(1, attr = list()) %>% 
  as_tbl_graph() %>% 
  mutate(name = as.character(1:31), 
         eigen_centr_directed = centrality_eigen(directed = T), 
         eigen_centr_undirected = centrality_eigen(directed = F),
         in_degree = centrality_degree(mode = "in"),
         out_degree= centrality_degree(mode = "out"),
         cat = ifelse(eigen_centr_directed == 0, "= 0", "> 0"), 
         is_islolate = node_is_isolated())


# plot 
plot_g = ggraph(g, layout = "fr") + 
  geom_edge_link(arrow = arrow(length = unit(4, 'mm'), angle = 12), 
                 end_cap = circle(2.5, 'mm'), 
                 alpha = 0.3) + 
  theme_bw() +
  theme(axis.title = element_blank(), 
        axis.ticks = element_blank(), 
        axis.text = element_blank()) +
  guides(color = guide_legend(title = "Eigenvector\nCentrality", order = 1),
         size = guide_legend(title= element_blank())) 

ggpubr::ggarrange(common.legend = T,
  plot_g +
    geom_node_point(aes(size = eigen_centr_directed, color = cat)) + 
    scale_color_manual(values = c(wes_palette("Royal1")[2], wes_palette("Royal2")[5])) +
    geom_node_text(aes(label = name), 
                   size = 2, 
                   check_overlap = T, 
                   repel = T) + 
    annotation_custom(grob = grid::grobTree(grid::textGrob("Computed for the\ndirected network", 
                                                           x=0.601,  y=0.95, hjust=0, 
                                                           gp=grid::gpar(col="black", fontsize=11)))),
  plot_g + 
    geom_node_point(aes(size = eigen_centr_undirected, color = is_islolate)) +
    scale_color_manual(values = c(wes_palette("Royal2")[5], wes_palette("Royal1")[2])) +
    geom_node_text(aes(label = name), 
                   size = 2, 
                   check_overlap = T, 
                   repel = T) + 
    annotation_custom(grob = grid::grobTree(grid::textGrob("Computed for the\nundirected network", 
                                                           x=0.565, y=0.95, hjust=0, 
                                                           gp=grid::gpar(col="black", fontsize=11))))
)

g %>% as_data_frame("vertices") %>% arrange(- eigen_centr_undirected) %>% select(- contains("cat"))

In order to empirically identify which places are considered more important than others, I define a network in which a tie connects venue $i$ to venue $j$ whenever an artist was exhibited first in $i$ and subsequently in $j$. Figure \@ref(fig:comparison-eigen-centr-for-directed-and-undirected-graph) illustrates the resulting co-exhibition network. Venues are represented by the vertices, whereas the ties depict the artists moving from one location to the next. In accordance with @fraiberger2018, I calculate the Eigenvector centrality for the undirected network to identify the important nodes. This measure is an extension of degree centrality, i.e. the number of connections a node has with other nodes. In the co-exhibition network, the degree centrality of venue $i$ is equal to the number of artists who were exhibited first in $i$ and then in another venue $j$, or who come from $j$ and are then exhibited in $i$. A node's Eigenvector centrality, in contrast, is proportional to the sum of the Eigenvector centralities of the nodes to which it is connected. Nodes therefore have a high Eigenvector centrality if they are connected to nodes that are themselves well connected, which in turn have many ties with others, and so on [@bonacich1987, 1171]. In the present case, venues have a high Eigenvector centrality if they display the works of artists who are also shown by venues which, in turn, have many co-exhibits as well. When artists move between selected locations, forming a highly interconnected substructure, these venues obtain the highest centrality scores. Following the argument above, I expect some exhibition venues to be more attractive than others. Systematic movements towards these venues will therefore contribute to their centrality. In the next section, after deriving the prestige of venues from exhibition data, I will also evaluate the plausibility of the resulting status order.
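Stated more formally, and using notation that does not appear elsewhere in this text (it is introduced here only for illustration), let $A = (a_{ij})$ denote the adjacency matrix of the undirected network, where $a_{ij}$ is the number of ties between venues $i$ and $j$, and let $x_i$ be the Eigenvector centrality of venue $i$. The standard definition then reads

$$
x_i = \frac{1}{\lambda} \sum_{j} a_{ij}\, x_j \qquad \Longleftrightarrow \qquad A\,\mathbf{x} = \lambda\,\mathbf{x},
$$

where $\lambda$ is the largest eigenvalue of $A$ and $\mathbf{x}$ the corresponding eigenvector with non-negative entries. The centrality of a venue thus depends recursively on the centrality of the venues it is tied to.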

Prior to this, however, attention should be drawn to some particularities of constructing a ranking on the basis of exhibition data. The dataset contains exhibitions that are defined by the artist and not by the artworks exhibited. Fraiberger et al. [-@fraiberger2018sm, 8] mention that a small number of venues, "like Galerie Boisserée, specialize in low value productions by highly regarded artists". It is understandable to highlight this as a potential "weakness" of the exhibition-based ranking, as the status of these venues could be seen as "biased upward". However, a venue's access to resources, even if it is the recognized name rather than an artist's most appreciated artworks, can also indicate the higher status of that venue. Accordingly, in the announcement of The International Art Market Studies Association [-@artmarketstudies2017] quoted above, the ability of museums to obtain exhibits from other important institutions is considered to reflect the significance of a museum. \newline At the same time, this is an argument for calculating Eigenvector centrality for an undirected network. Admittedly, the network consists of sequences of artists participating in exhibitions in chronological order. Conceptually, the ties are thus directional. However, computing Eigenvector centrality for a directed network implies that only those ties that are directed to a node contribute to its centrality. Therefore node 6, on the left side of figure \@ref(fig:comparison-eigen-centr-for-directed-and-undirected-graph), has a centrality value of zero although it is connected to other interconnected nodes. If exhibition venues also derive their status from the venues in which they can have their artworks represented, this method does not seem to be an appropriate choice. 
Rather, as illustrated on the right side of figure \@ref(fig:comparison-eigen-centr-for-directed-and-undirected-graph), I operationalize the prestige of the exhibition locations by the Eigenvector centrality calculated for the undirected network.
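The difference between the two variants can be reproduced with a minimal sketch. The four-venue network below is a toy example invented for illustration, not taken from the data: venues 1 to 3 exchange artists in both directions, while venue 4 only sends artists on to venue 1 and never receives any.

```r
library(igraph)

# Toy network (an assumption for illustration only):
# venues 1-3 are mutually connected, venue 4 has a single outgoing tie to venue 1.
g <- make_graph(c(1, 2,  2, 1,
                  2, 3,  3, 2,
                  1, 3,  3, 1,
                  4, 1),
                directed = TRUE)

# Directed variant: only incoming ties contribute to a node's centrality.
ec_directed <- eigen_centrality(g, directed = TRUE)$vector

# Undirected variant: edge directions are ignored.
ec_undirected <- eigen_centrality(g, directed = FALSE)$vector

round(ec_directed, 3)
round(ec_undirected, 3)
```

In this sketch, venue 4 receives a directed centrality of zero because no tie points to it, whereas it obtains a positive score once the directions are ignored.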

Status hierarchies empirically {#prestige-empirically}

This chapter focuses on Eigenvector centrality as a measure of status. As discussed in section \@ref(reputation-careers), I interpret an artist's career as a sequence of affiliations with exhibition venues. Before I test the hypotheses about the courses of the artists' careers, it is therefore necessary to assess whether Eigenvector centrality is a reliable and valid measure of the venues' status. In the first section, I describe the data used for the analysis. The second section takes into account that the number of documented exhibitions varies from year to year. I begin evaluating the prestige measure by exploring how the status of exhibition venues varies over time. Subsequently, I assess to what extent the network-based ranking of exhibition venues depends on the number of exhibitions used for the calculation. Finally, the top 100 venues according to Eigenvector centrality are compared to the top 100 venues according to the study by @fraiberger2018. In particular, I examine how many of the venues are present in both rankings and whether the order of the rankings coincides.

Dataset

The dataset used for the thesis was compiled from various sources. First, I retrieved data on the exhibitions of visual artists from artist-info.com. This source provides information about which artists exhibited in which venues and during which period of time. Based on the addresses supplied for the exhibition venues, the Google Maps API was used to fetch their geocoordinates and the names of the cities, countries and continents in which they are located. Biographical data on the artists were supplemented with information from the online encyclopedia Allgemeines Künstlerlexikon.

In this thesis I refer extensively to the study by @fraiberger2018 which, at the time of data collection, had not yet been published. The data to reproduce their study are now available [@fraiberger2018ds]. However, the names of the artists and exhibition venues have been anonymized. Therefore, I decided to use the prepared dataset to validate some of the authors' findings and to test further hypotheses derived from the literature. It should also be mentioned that the two data sources differ. The study by @fraiberger2018sm is based on data provided by the company magnus.com which, in turn, relies on various sources. Among these sources, however, the authors do not mention artist-info.com (Ibid., 3). Presumably, the two datasets overlap in the exhibitions that are documented in both sources. At the same time, the data providers may differ in their selectivity with respect to the exhibitions they record. To this extent, using the same methodology, I also examine the external validity of the results of @fraiberger2018.

Evaluating the status order {#status-evaluation}

# extract edges from graph
galsimple_edges <- galsimple %>% 
  igraph::as_data_frame("edges") %>% 
  filter(exh_start_Y_from >= 1945, exh_start_Y_to >= 1945)

exhibitions_per_year <- exhibitions %>% 
  filter(id %in% galsimple_edges$exh_id_from | id %in% galsimple_edges$exh_id_to) %>% 
  distinct(id, .keep_all = T) %>% 
  group_by(year = exh_start_Y) %>% 
  summarise(num_exh_per_year = n())


places_exhibiting_per_type_per_year <- exhibitions %>% 
  filter(exh_place_id %in% galsimple_edges$from | exh_place_id %in% galsimple_edges$to) %>% 
  distinct(exh_place_id, exh_start_Y, .keep_all = T) %>% 
  left_join(exhplaces, by = c("exh_place_id" = "id")) %>% 
  group_by(year = exh_start_Y, type_exhplace) %>% 
  summarise(num_places_exhibiting = n())

places_exhibiting_per_year <- places_exhibiting_per_type_per_year %>% 
  group_by(year) %>% 
  summarise(num_places_exhibiting = sum(num_places_exhibiting))

edges_per_year <- full_join(
  galsimple_edges %>% 
    group_by(year = exh_start_Y_from) %>% 
    summarise(edges_out = n()),
  galsimple_edges %>% 
    group_by(year = exh_start_Y_to) %>% 
    summarise(edges_in = n()), 
  by = "year") %>% 
  full_join(places_exhibiting_per_year, "year")

Is it possible to calculate a reliable and valid status measure of exhibition venues based on these data? At first glance, it seems reasonable to assume that the status of exhibition venues should not vary substantially from one year to the next. To be sure, changes in the attributed significance of certain venues should be reflected by the centrality measure. However, it is unlikely that, over a long period of time, the status of venues swings from the highest to the lowest values in one year, and from the lowest to the highest values in the next year, and so forth. Therefore, high volatility is more likely related to changes in the amount of available data than to shifts in the attributed relevance of a venue.

In fact, figure \@ref(fig:places-and-exhibitions-per-year) shows that the number of documented exhibitions and active venues varies considerably over time. Most exhibitions were documented in the period between 1995 and 2015. This period accounts for `r filter(exhibitions_per_year, year >= 1995, year <= 2015) %$% sum(num_exh_per_year)` out of `r sum(exhibitions_per_year$num_exh_per_year)` total available exhibitions, that is, `r round((filter(exhibitions_per_year, year>= 1995, year <= 2015) %$% sum(num_exh_per_year) / sum(exhibitions_per_year$num_exh_per_year)) * 100, 1)`%. On average, `r filter(exhibitions_per_year, year >= 1995, year <= 2015) %$% mean(num_exh_per_year) %>% round(0)` exhibitions were documented per year in this period. In contrast, for the period before 1995 the dataset provides details for about `r filter(exhibitions_per_year, year <= 1994) %$% mean(num_exh_per_year) %>% round(0)` exhibitions per year. Nevertheless, as the status order of the exhibition venues is derived from the artists' movements between the venues, the stability of the status order is less a matter of the number of exhibitions than of the number of artists' exhibition participations. Consider that one exhibition implies at least one tie between exhibition venues, that is, an artist first exhibits in one venue and subsequently in another. By definition, a solo exhibition can only result in one link between venues, whereas a group show entails at least two ties. To be more precise, the measure of the status of exhibition venues is therefore affected by variation in the number of ties between venues rather than by the number of exhibitions they host.
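How ties follow from exhibition participations can be sketched as below. The table `participations` and its columns `artist_id`, `venue` and `exh_start` are hypothetical names with invented values, chosen only to illustrate the logic; they are not the objects used in the actual analysis.

```r
library(dplyr)

# Hypothetical exhibition participations (toy data): one row per artist and exhibition.
# Artists 1 and 2 take part in the same group show at venue "A".
participations <- tibble::tribble(
  ~artist_id, ~venue, ~exh_start,
  1,          "A",    "1996-03-01",
  1,          "B",    "1998-05-10",
  1,          "C",    "2001-09-15",
  2,          "A",    "1996-03-01",
  2,          "C",    "1999-01-20"
) %>%
  mutate(exh_start = as.Date(exh_start))

# For every artist, link each exhibition venue to the venue of the next exhibition.
ties <- participations %>%
  group_by(artist_id) %>%
  arrange(exh_start, .by_group = TRUE) %>%
  mutate(to = lead(venue)) %>%   # venue of the artist's subsequent exhibition
  ungroup() %>%
  filter(!is.na(to)) %>%         # the last exhibition of a career has no outgoing tie
  select(artist_id, from = venue, to)

ties
```

In this toy example, the group show at venue "A" produces two outgoing ties, one per participating artist, whereas each later solo participation contributes at most one.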

places_exhibiting_per_type_per_year %>% 
  filter(year >= 1945) %>% 
  mutate_at(vars(type_exhplace), 
            ~ str_replace(., "NonProfit", "Non Profit") %>% 
              factor(levels = c("Collector", "Gallery", "Non Profit", "Museum"))) %>% 
  ggplot(aes(x = year)) + 
  geom_col(aes(y = num_places_exhibiting, fill = type_exhplace)) +
  venue_colors_fill +
  geom_line(aes(y = num_exh_per_year, color = "Exhibitions"), data = exhibitions_per_year) +
  scale_x_continuous(breaks = seq(1945, 2015, 10)) + 
  scale_color_manual(name="",  values=c(Exhibitions="black")) +
  theme_bw() +
  labs(x = "Year", color = "Exhibition venue", fill = "Exhibition\nvenue", y = "Number of exhibitions and venues") +
  guides(fill = guide_legend(order = 1))

Figure \@ref(fig:edges-per-year) shows that most movements also occur between 1995 and 2015. In contrast to the number of exhibitions in figure \@ref(fig:places-and-exhibitions-per-year), the period before 1995 is characterized by higher variability in the number of observed edges. Despite this variation, the number of edges tends to increase from 1975 until 1997. A sharp decrease occurs not only at the end of the observation period but also in the late 1990s and early 2000s. When reading figure \@ref(fig:edges-per-year), note that inward edges refer to the number of artists who were previously involved in exhibitions and exhibit again in year $t$. Conversely, outward edges indicate the number of artists whose works are exhibited in year $t$ and will later be displayed elsewhere. Note also that the number of inward edges at the end of the observed time period is higher than the number of outward edges, reflecting that the artists arrive at the last documented exhibition of their trajectories. More importantly, in view of the variation in the number of observations, it is necessary to ascertain whether the status order of the exhibition venues is also changing. In other words, how stable is the position of exhibition venues given that new ones emerge and others disappear?

edges_per_year %>% 
  tidyr::gather(var, val, - year) %>% 
  filter(!is.na(val)) %>% 
  mutate(var = var %>% 
           str_replace("edges_in", "Inward edges") %>% 
           str_replace("edges_out", "Outward edges") %>% 
           str_replace("num_places_exhibiting", "Number of exhibition venues") %>% 
           factor(levels = c("Inward edges", "Outward edges", "Number of exhibition venues"))
         ) %>% 
  ggplot(aes(year, val, color = var)) + 
  geom_line() +
  theme_bw() +
  theme(legend.position = "bottom") +
  scale_x_continuous(breaks = seq(1905, 2015, 10), limits = c(1945, NA)) +
  scale_y_continuous(breaks = seq(0, 10000, 2000)) +
  labs(x = "Year", y = "Number of edges", color = element_blank())

At first sight, figure \@ref(fig:edges-per-year) suggests restricting the period of observation to the years from 1995 onwards, since most edges are observed from this year on. Considering the available data, it also seems reasonable to restrict the study to the years up to and including 2013. Indeed, after calculating the Eigenvector centrality of the exhibition venues in every year $t$, I observed a high level of variation in the centrality scores before 1975 and after 2015. More precisely, I calculated the venues' Eigenvector centrality, scaled between 0 and 1, including all ties resulting from exhibitions that took place up to and including year $t$. Between 1975 and 2015, however, the status order remains rather stable, even in view of the strong increase in available exhibition data around the year 1995. Figure \@ref(fig:eigenvector-centrality-t1) illustrates this finding. Admittedly, from 1975 to 1985 the centrality scores seem to be generally shifted upwards.[^ranking] Apart from this, however, the prestige score does not seem to depend strongly on the varying number of observations between 1975 and 2015. Therefore, I extended the study to this period. This makes it possible to include the careers of artists who began exhibiting within this period of 40 years, which considerably increases the data basis used for the analysis.
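The yearly calculation described above can be sketched as follows. The edge list `edges` and the helper `centrality_up_to()` are hypothetical and serve only as an illustration under simplified assumptions; the objects used for the actual figures are prepared elsewhere.

```r
library(igraph)

# Hypothetical edge list (toy values): an artist moved from venue `from` to venue `to`;
# `year` is the year in which the tie was formed.
edges <- data.frame(
  from = c("A", "B", "A", "C", "D"),
  to   = c("B", "C", "C", "D", "A"),
  year = c(1975, 1975, 1976, 1977, 1977)
)

# Eigenvector centrality of the undirected network that contains
# all ties formed up to and including year t.
centrality_up_to <- function(edges, t) {
  g <- graph_from_data_frame(edges[edges$year <= t, c("from", "to")],
                             directed = FALSE)
  eigen_centrality(g)$vector  # scaled so that the most central venue has a score of 1
}

years <- c(1975, 1976, 1977)
cent_by_year <- lapply(years, function(t) centrality_up_to(edges, t))
names(cent_by_year) <- years
```

Each element of `cent_by_year` holds the scaled scores of the venues that have at least one tie by the respective year, so venues enter the ranking as soon as they are connected.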

[^ranking]: It therefore seems reasonable to neglect the numerical differences of the scaled centrality measure and to use a ranking instead. In accordance with Fraiberger et al. [-@fraiberger2018sm, 8], I have sorted the centrality values for all exhibition locations in the given year in ascending order and calculated the percentile rank. Hence, a venue located in the 90th percentile has a centrality value equal to or greater than 90% of the venues in that year. The most and least prestigious exhibition venues in a given year have a prestige of 100 and 0 respectively. For the further analysis, beginning with the next section, I will use these percentile ranks as the measure of the prestige of the exhibition venues.
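The percentile ranking described in the footnote can be illustrated with a small sketch. The five venues and their centrality values are invented; in the analysis, the same step is applied separately within each year.

```r
library(dplyr)

# Hypothetical centrality scores of five venues in a single year (toy values).
venues <- tibble(
  name      = c("A", "B", "C", "D", "E"),
  eigen_val = c(0.10, 0.40, 0.20, 1.00, 0.05)
) %>%
  # percent_rank() rescales the ascending ranks to [0, 1]; multiplying by 100 gives percentiles
  mutate(venue_percentile = percent_rank(eigen_val) * 100)

venues
```

In this toy example, venue D, with the highest centrality, receives a prestige of 100, and venue E, with the lowest, a prestige of 0.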

nodes_w_t1 <- igraph::as_data_frame(galsimple_eigen, "vertices") 

nodes_l_t1 <- nodes_w_t1 %>% 
  select(name, type_exhplace, name_exhplace, num_of_exh, contains("eigen_")) %>% 
  tidyr::gather(var, eigen_val, - name, - name_exhplace, - type_exhplace, - num_of_exh) %>% 
  mutate(year = var %>% str_extract("[0-9]{4}") %>% as.integer(), 
         type_exhplace = type_exhplace %>% str_replace("NonProfit", "Non Profit") %>% 
           factor(levels = c("Collector", "Gallery", "Non Profit", "Museum")))

nodes_l_t1 %>% 
  filter(!is.na(eigen_val), year <= 2015) %>% 
  group_by(name) %>% 
  ggplot(aes(year, eigen_val, group = name, color = type_exhplace)) + 
  venue_colors_color + 
  geom_line(size = 0.35) +
  theme_bw() +
  theme(legend.position = "bottom") +
  labs(color = "Exhibition venue", y = "Eigenvector Centrality", x = "Year") +
  scale_x_continuous(limits = c(min(nodes_l_t1$year), NA), 
                     breaks = seq(min(nodes_l_t1$year), max(nodes_l_t1$year), 5))

list_t1 <- list(nodes_l_t1,
                galsimple_eigen_past5,
                galsimple_eigen_past10, 
                galsimple_eigen_past25,
                galsimple_eigen_past50, 
                galsimple_eigen_past75,
                galsimple_eigen_past100) %>% 
  purrr::reduce(full_join, by = c("name", "year")) %>% 
  na.omit() %>% 
  left_join(exhplaces %>% select(name = id, contains("yr")), by = c("name")) %>% 

  group_by(year) %>% 
  mutate(rank_past_all = rank(- eigen_val),
         rank_past5 = rank(- eigen_past5_exh),
         rank_past10= rank(- eigen_past10_exh), 
         rank_past25 = rank(- eigen_past25_exh), 
         rank_past75 = rank(- eigen_past75_exh), 
         rank_past100 = rank(- eigen_past100_exh)) %>% 

  ungroup()

corr_matrix <- list_t1 %>% 
  select(contains("rank")) %>% 
  cor(method = "spearman")

colnames(corr_matrix) <- colnames(corr_matrix) %>% 
  str_replace("rank_past_?", ":epsilon ==") %>% 
  str_replace("all", "infinity")

rownames(corr_matrix) <- rownames(corr_matrix) %>% 
  str_replace("rank_past_?", ":epsilon ==") %>% 
  str_replace("all", "infinity")


col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))

corr_matrix %>% 
  corrplot::corrplot(method = "shade", 
                     col = col(200), 
                     type = "upper",  number.cex = .7,
                     cl.lim = c(0,1),
                     addCoef.col = "black", # Add coefficient of correlation
                     tl.col = "black", tl.srt = 90, # Text label color and rotation
                     # show the principal diagonal (diag = F would hide it)
                     diag = T)

min_corr <- data.frame(corr_matrix) %>% 
  mutate(rownames = rownames(.)) %>% 
  filter(rownames == ":epsilon ==infinity") %>% 
  gather(var, val, - rownames) %>% 
  slice(which.min(val)) %>% 
  mutate_if(is.numeric, ~ round(.,2)) %>% 
  mutate(var = str_extract(var, "[0-9]+"))

For the remaining exhibition venues active between 1975 and 2015, i.e. those that have ties with other venues, the prestige measure can be evaluated further. For this purpose, I vary the number of exhibitions per venue used for calculating centrality. Possibly, the important exhibition venues derive their central position from their long exhibition history. Will the same venues remain central even if only their most recent exhibition activity is considered? In order to answer this question, I generate a network for each year, which consists of the ties formed by the last $\epsilon \in \{5, 10, 25, 75, 100\}$ exhibitions of each venue. I then calculate Spearman's Rho to measure how strongly the ranks of the same venue are correlated in each year, given the varying number of exhibitions. The rankings of a venue for varying $\epsilon$ can then be compared with the ranking calculated from all past exhibitions $(\epsilon = \infty)$. Figure \@ref(fig:rank-correlations) shows that the correlation with this benchmark ranking has a minimum of $\rho_s=$ `r min_corr$val` for $\epsilon=$ `r min_corr$var`. The high rank correlation suggests that even the movements resulting from the last $\epsilon$ exhibitions alone contain enough information about the status of the exhibition venues in a given year. In conclusion, the two inquiries illustrated by figures \@ref(fig:eigenvector-centrality-t1) and \@ref(fig:rank-correlations) indicate that the measure is robust to the varying availability of data.
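As a minimal illustration of this comparison, the following sketch computes Spearman's Rho for two rankings of the same five venues. The centrality values are invented, and the names `eigen_all` and `eigen_past5` are hypothetical stand-ins for the scores based on all past exhibitions and on the last five exhibitions respectively.

```r
# Hypothetical centrality scores of five venues in one year (toy values).
eigen_all   <- c(0.90, 0.75, 0.40, 0.20, 0.05)  # all past exhibitions
eigen_past5 <- c(0.70, 0.80, 0.35, 0.05, 0.10)  # last five exhibitions only

# Rank in descending order of centrality: 1 = most central venue.
rank_all   <- rank(-eigen_all)
rank_past5 <- rank(-eigen_past5)

# Spearman's Rho is the Pearson correlation of the ranks.
rho <- cor(rank_all, rank_past5, method = "spearman")
rho
```

Here, two pairs of neighbouring venues swap places between the two rankings, which still leaves a high rank correlation.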

# Order infinity 

# 1975-2015 based on 1965

nodes_w_tinf <- igraph::as_data_frame(galsimple_inf_eigen1975_2015_base1965, "vertices") 

nodes_l_tinf <- nodes_w_tinf %>% 
  select(name, type_exhplace, name_exhplace, num_of_exh, contains("eigen_")) %>% 
  tidyr::gather(var, eigen_val, - name, - name_exhplace, - type_exhplace, - num_of_exh) %>% 
  mutate(year = var %>% str_extract("[0-9]{4}") %>% as.integer())

nodes_l_tinf %>% 
  filter(!is.na(eigen_val), year <= 2015) %>% 
  group_by(name) %>% 
  #filter(max(val, na.rm = T) != 0) %>% 
  mutate(mean_centr = mean(eigen_val, na.rm = T), 
         quartile = ntile(mean_centr, 4),
         # labels aligned with the thresholds actually used; values in between stay NA
         centr_cat = case_when(mean(eigen_val) < 0.025 ~ "ec < 0.025", 
                               mean(eigen_val) >= 0.05 ~ "ec >= 0.05") 
         ) %>% 


  #filter(year >= 1975, year <= 2015) %>% 
  ggplot(aes(year, eigen_val, group = name, color = type_exhplace)) + 
  geom_line() +
  theme_bw() +
  theme(legend.position = "bottom") +
  scale_x_continuous(limits = c(min(nodes_l_tinf$year), NA), 
                     breaks = seq(min(nodes_l_tinf$year), max(nodes_l_tinf$year), 5))

nodes_w <- igraph::as_data_frame(galsimple_inf_eigen1975_2015_base1945, "vertices") 

nodes_l <- nodes_w %>% 
  select(name, type_exhplace, name_exhplace, num_of_exh, contains("eigen_")) %>% 
  tidyr::gather(var, eigen_val, - name, - name_exhplace, - type_exhplace, - num_of_exh) %>% 
  mutate(year = var %>% str_extract("[0-9]{4}") %>% as.integer())

nodes_l %>% 
  filter(!is.na(eigen_val), year <= 2015) %>% 
  group_by(name) %>% 
  #filter(max(val, na.rm = T) != 0) %>% 
  mutate(mean_centr = mean(eigen_val, na.rm = T), 
         quartile = ntile(mean_centr, 4),
         # labels aligned with the thresholds actually used; values in between stay NA
         centr_cat = case_when(mean(eigen_val) < 0.025 ~ "ec < 0.025", 
                               mean(eigen_val) >= 0.05 ~ "ec >= 0.05") 
         ) %>% 


  #filter(year >= 1975, year <= 2015) %>% 
  ggplot(aes(year, eigen_val, group = name, color = type_exhplace)) + 
  geom_line() +
  theme_bw() +
  theme(legend.position = "bottom") +
  scale_x_continuous(limits = c(min(nodes_l$year), NA), 
                     breaks = seq(min(nodes_l$year), max(nodes_l$year), 5))

list_tinf <- list(nodes_l_tinf,
                  galsimple_eigen_inf_past5,
                  galsimple_eigen_inf_past10, 
                  galsimple_eigen_inf_past25,
                  galsimple_eigen_inf_past50, 
                  galsimple_eigen_inf_past75,
                  galsimple_eigen_inf_past100) %>% 
  purrr::reduce(full_join, by = c("name", "year")) %>% 
  na.omit() %>% 
  left_join(exhplaces %>% select(name = id, contains("yr")), by = c("name")) %>% 

  group_by(year) %>% 
  mutate(rank_past_all = rank(- eigen_val),
         rank_past5 = rank(- eigen_past5_exh),
         rank_past10= rank(- eigen_past10_exh), 
         rank_past25 = rank(- eigen_past25_exh), 
         rank_past75 = rank(- eigen_past75_exh), 
         rank_past100 = rank(- eigen_past100_exh)) %>% 

  ungroup()

col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
list_tinf %>% 
  select(contains("rank")) %>% 
  cor(method = "spearman") %>% 
  corrplot::corrplot(method = "color", col = col(200), 
         type = "upper",  number.cex = .7,
         addCoef.col = "black", # Add coefficient of correlation

         tl.col = "black", tl.srt = 90, # Text label color and rotation
         # show the principal diagonal (diag = F would hide it)
         diag = T)

subgraph1985 <- create_subgraph_year(galsimple_eigen, 
                                     .lower_boundary_yr = 1985, .yr_of_interest = 1985) %>% 
  as_tbl_graph() #%>% igraph::as_data_frame("edges") %>% View

ggraph(subgraph1985, layout = "fr") + 
  geom_edge_link(arrow = arrow(length = unit(4, 'mm'), angle = 12), end_cap = circle(3, 'mm'),
                 alpha = 0.3) + 
  geom_node_point(aes(color = type_exhplace, size = eigen_1985_20), alpha = 0.8) +
  geom_node_text(aes(label = name), nudge_y = 0.2, size = 2, check_overlap = T, repel = T) +
  theme_bw() +    
  labs(color = "Exhibition Place")

vlow_status <- 45

ranked_nodes <- list_t1 %>% 
  select(name:year, first_exh_yr, last_exh_yr, eigen_val, rank_past_all) %>% 
  group_by(year) %>% 
  mutate(venue_percentile = percent_rank(eigen_val) * 100, 
         # strict lower bounds, so that fractional percentiles (e.g. 40.5 or 80.3) are not left as NA
         venue_status = case_when(venue_percentile <= 40 ~ "lower\n40%", 
                                  venue_percentile > 40 & venue_percentile <= 80 ~ "mid 41%\nto 80%",
                                  venue_percentile > 80 ~ "top\n20%"), 
         venue_decile     = ntile(venue_percentile,  10),
         venue_quintile   = ntile(venue_percentile,   5)) %>% 
  #mutate_at(vars(matches("[0-9]tile")), factor) %>% 
  ungroup()  




career_paths <- galsimple_edges %>% 

  # convert to factors for better plotting
  mutate_at(vars(contains("exh_type")), 
            ~ str_replace(.,"S$","solo") %>% 
              str_replace("G$","group") %>% 
              factor(levels = c("group", "solo"))) %>% 
  mutate_at(vars(contains("exh_venue")), 
            ~ str_replace(., "NonProfit", "Non Profit") %>% 
              factor(levels = c("Collector", "Gallery", "Non Profit", "Museum"))) %>% 

  # join node data with eigenvector centrality values 
  mutate_at(vars(from, to), as.integer) %>% 
  left_join(ranked_nodes, by = c("from" = "name", "exh_start_Y_from" = "year")) %>% 
  filter(!is.na(eigen_val),
         exh_start_Y_from <= 2015, 
         exh_start_Y_to <= 2015)


# create df of artists with several restrictions for data quality
artists_lean <- create_artist_selection(career_paths, artists,
                                        .max_years_between_birth_first_exh = 50,
                                        .count_only_exh_alive = T,
                                        .period_start = 1975, .period_end = 2015,
                                        .num_exh_in_period = 10)


career_paths <- career_paths %>% 
  # join artists and keep those with more than n exhibitions
  inner_join(artists_lean, by = c("artist_id" = "id")) %>% 

  group_by(artist_id) %>% 

  arrange(artist_id, exh_start_Ymd_from) %>% 

  mutate(
    # indicate number of career step for each artist (overwrite former tie number variable)
    tie_number = row_number(), 

    career_phase = case_when(tie_number <= 5 ~ "first_5_exh", 
                             tie_number >  5 & tie_number <= exh_during_period - 5 ~ "mid_exh",
                             tie_number >  exh_during_period - 5 ~ "last_5_exh"
                             ) %>% factor(levels = c("first_5_exh", "mid_exh", "last_5_exh")), 

    career_phase_deciles = ntile(tie_number, 10) %>% as.character() %>% factor(levels = as.character(seq(1,10,1))), 
    career_phase_quintiles = ntile(tie_number, 5) %>% as.character() %>% factor(levels = as.character(seq(1,5,1))),
    career_phase_quartiles = ntile(tie_number, 4) %>% as.character() %>% factor(levels = as.character(seq(1,4,1))),
    artist_exh_activity = case_when(exh_during_period <= 10 ~ "1-10", 
                                    exh_during_period >= 11 & exh_during_period <= 25 ~ "11-25", 
                                    exh_during_period >= 26 & exh_during_period <= 50 ~ "26-50",
                                    exh_during_period > 50 ~ "50+") %>% 
      factor(levels = c("1-10", "11-25", "26-50", "50+"))) %>% 
  ungroup()

# career_paths %>% 
#   group_by(artist_id, career_phase) %>% 
#   add_count() %>% 
#   arrange(artist_id, career_phase) %>% 
#   select(tie_number, career_phase, exh_during_period, everything()) %>% 
#   filter(tie_number > exh_during_period) %>% 
#   View()

top_100_fraib <- readxl::read_xlsx("../data/Top100_Institutions_Fraiberger_et_al.xlsx") %>% 
  mutate(fraib_location_unclear = fraib_location_unclear %>% if_else(is.na(.), "no", .))


# get ranking for the most current year available
latest_ranked_nodes <- ranked_nodes %>% 
      filter(!year %in% c(2016:2018), 
             !is.na(venue_status)) %>% # exclude messy data
      group_by(name) %>% 
      slice(which.max(year))


top_100_fraib_lean <- top_100_fraib %>% 

  filter(!is.na(artist_info_name1)) %>% 

  fuzzyjoin::stringdist_left_join(latest_ranked_nodes,

    # join manually searched artist-info equivalent to fraib et al 
    # with artist-info name
    by = c(artist_info_name1 = "name_exhplace"), 
    max_dist = 2) %>% 

  # exclude galleries that must have been aggregated by fraiberger
  filter(fraib_location_unclear != "yes") %>% 

  select(- artist_info_other_names, - last_exh_year_artist_info)  %>% 
  mutate(fraib_rranked = rank(Rank), 
         # rank in descending order of percentile, so that 1 = most prestigious in both rankings
         artist_rranked = rank(- venue_percentile)) %>% 
  left_join(exhplaces %>% 
              select(id, country:address), 
            by = c("name" = "id"))



corr_all_cont <- top_100_fraib_lean %>% 
  mutate(Continents = "All continents") %>% 
  group_by(Continents) %>% 
  summarise(n = n(), 
            rho = cor(fraib_rranked, artist_rranked, method = "spearman")) 

corr_grouped_cont <- top_100_fraib_lean %>% 
  mutate(Continents = continent %>% if_else(. == "Europe", ., "Americas and Asia")) %>% 
  group_by(Continents) %>% 
  summarise(n = n(), 
            rho = cor(fraib_rranked, artist_rranked, method = "spearman")) %>% 
  arrange(- rho)


corr_cont <- bind_rows(corr_all_cont, corr_grouped_cont) %>% 
  mutate_if(is.numeric, ~ round(., 3))

top_100_fraib <- readxl::read_xlsx("../data/Top100_Institutions_Fraiberger_et_al.xlsx") %>% 
  mutate(fraib_location_unclear = fraib_location_unclear %>% if_else(is.na(.), "no", .), 
         Institution_continent = Institution_continent %>% na_if("NA"))




# get ranking for the most current year available
latest_ranked_nodes <- ranked_nodes %>% 
      filter(!year %in% c(2016:2018), 
             !is.na(venue_status)) %>% # exclude messy data
      group_by(name) %>% 
      slice(which.max(year))


top_100_fraib_lean <- top_100_fraib %>% 

  filter(!is.na(artist_info_name1)) %>% 

  fuzzyjoin::stringdist_left_join(latest_ranked_nodes,

    # join manually searched artist-info equivalent to fraib et al 
    # with artist-info name
    by = c(artist_info_name1 = "name_exhplace"), 
    max_dist = 2) %>% 

  # exclude galleries that must have been aggregated by fraiberger
  filter(fraib_location_unclear != "yes") %>% 

  select(- artist_info_other_names, - last_exh_year_artist_info)  %>% 
  left_join(exhplaces %>% 
              select(id, country:address), 
            by = c("name" = "id")) %>% 

  arrange(Rank) %>% 
  mutate(fraib_rranked = row_number()) %>% 

  arrange(rank_past_all) %>% 
  mutate(artist_rranked = row_number()) %>%   
  select(Institution, Rank, fraib_rranked, 
         rank_past_all, artist_rranked, venue_percentile, everything()) 

# top_100_fraib %>% 
#   #anti_join(top_100_fraib_lean, by = c("Institution")) %>% 
#   filter(fraib_location_unclear != "yes", !is.na(Institution_continent)) %>% 
#   group_by(Institution_continent) %>% 
#   summarise(n = n())

However, it remains to be answered how valid the ranking is. Is it consistent with other sources; for example, are the same venues considered prestigious? Fraiberger et al. [-@fraiberger2018sm, 7] rely on expert ratings and sales-based rankings of artworks' auction results to validate the network-based ranking. They find that eigenvector centrality has the strongest correlation with the sales-based ranking $(\rho_s = 0.88)$. The expert rating, on the other hand, is most closely related to the nodes' PageRank $(\rho_s=0.51)$. While the experts show a high degree of agreement with regard to the network-based ranking of top institutions, their assessments of lower-ranked venues are widely dispersed. The authors conclude that, "[o]verall, sales data is far more objective information on the perceived value and prestige" (Ibid.). Unfortunately, the dataset prepared for this study does not include these two variables. Moreover, as mentioned before, the data provided for replicating the results are anonymized [@fraiberger2018ds]. It is therefore not possible to supplement these variables by matching the names of exhibition venues from both datasets. Instead, I will use a list of the top 100 exhibition venues provided in the supplementary materials of their study [@fraiberger2018sm, 16]. The network-based ranking of the top institutions seems appropriate as a benchmark since it was shown to be highly consistent with both the expert grading and the sales-based ranking. The purpose is to verify whether the same venues are among the top 100 and whether the order of the placements is similar.

top_100_intersect <- top_100_fraib_lean %>% 
  filter(rank_past_all <= 100) %>% 
  arrange(fraib_rranked)

top_100_intersect %>%
  select(Institution, Rank, rank_past_all, venue_percentile) %>% 
  mutate_if(is.numeric, ~ round(., 1)) %>% 
  kable(format = "latex", booktabs = T, escape = F, align = c("l", rep("c", 3)), 
        caption.short = "Intersection of the top 100 venues",
        caption = "Intersection of the top 100 venues indicated by Fraiberger et al. (2018) and derived from eigenvector centrality computed for the artist-info dataset",
        col.names = linebreak(c("Exhibition Venue", "Rank\n(Fraiberger et al.)", "Rank\n(computed)", "Percentile\n(computed)"), align = "c")) %>% 
  kable_styling(latex_options = c("striped"))

corr_all_cont <- top_100_fraib_lean %>% 
  filter(rank_past_all <= 100) %>% 
  mutate(Continents = "All continents") %>%
  group_by(Continents) %>%
  summarise(n = n(), 
            rho = cor(Rank, rank_past_all, method = "spearman")) 

corr_grouped_cont <- top_100_fraib_lean %>% 
  filter(rank_past_all <= 100) %>% 
  mutate(Continents = continent %>% if_else(. == "Europe", ., "Americas and Asia")) %>% 
  group_by(Continents) %>% 
  summarise(n = n(), 
            rho = cor(Rank, rank_past_all, method = "spearman")) %>% 
  arrange(- rho)


corr_cont <- bind_rows(corr_all_cont, corr_grouped_cont) %>% 
  mutate_if(is.numeric, ~ round(., 2))

corr_all_cont <- top_100_fraib_lean %>% 
  filter(rank_past_all <= 100) %>% 
  mutate(Continents = "All continents") %>% 
  group_by(Continents) %>% 
  summarise(n = n(), 
            rho = cor(fraib_rranked, artist_rranked, method = "spearman")) 

corr_grouped_cont <- top_100_fraib_lean %>% 
  filter(rank_past_all <= 100) %>% 
  mutate(Continents = continent %>% if_else(. == "Europe", ., "Americas and Asia")) %>% 
  group_by(Continents) %>% 
  summarise(n = n(), 
            rho = cor(fraib_rranked, artist_rranked, method = "spearman")) %>% 
  arrange(- rho)


corr_cont <- bind_rows(corr_all_cont, corr_grouped_cont) %>% 
  mutate_if(is.numeric, ~ round(., 3))

In fact, `r nrow(top_100_fraib_lean)` out of `r nrow(top_100_fraib)` venues are also present in the artist-info dataset and sufficiently comparable.[^lack_of_comparability] Table \@ref(tab:table-intersection-of-top-100) presents the intersection of `r nrow(top_100_intersect)` venues listed among the top 100 based on both datasets, measured by eigenvector centrality. In other words, I find that approximately half of the `r nrow(top_100_fraib_lean)` venues that are present in both datasets and suitable for a comparison are also ranked in the top 100. Moreover, the placements in the intersection of the top 100 exhibition venues show a weak positive correlation $(\rho_s=$ `r filter(corr_cont, Continents == "All continents") %>% pull(rho)`), that is, a lower rank indicated by @fraiberger2018sm is indeed associated with a lower calculated rank based on the artist-info dataset.

[^lack_of_comparability]: Presumably, @fraiberger2018 aggregated the exhibition data of certain venues. For example, the artist-info dataset provides information about `r filter(exhplaces, str_detect(name_exhplace, "Gagosian")) %>% nrow()` different locations of the Gagosian Gallery. On the other hand, for some venues no network-based ranking was available in `r max(top_100_fraib_lean$year)`. Excluding these venues yields a subset of `r nrow(top_100_fraib_lean)` venues for comparison.
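The rank correlation reported above can be illustrated with a minimal sketch. The six venue ranks below are hypothetical values chosen for illustration and are not taken from either dataset. The sketch also shows why it makes no difference, in the absence of ties, whether Spearman's $\rho_s$ is computed on the original rank values or on ranks that were re-assigned within the subset: the coefficient depends only on the ordering of the values.

```r
# Hypothetical ranks for six venues (illustrative values only)
benchmark_rank <- c(3, 12, 27, 41, 58, 90)   # rank in the benchmark list
computed_rank  <- c(10, 4, 65, 30, 99, 72)   # rank computed from own data

# Spearman's rho on the original rank values
rho_raw <- cor(benchmark_rank, computed_rank, method = "spearman")

# Re-rank both variables within the subset (1 = best) and correlate again
benchmark_reranked <- rank(benchmark_rank)
computed_reranked  <- rank(computed_rank)
rho_reranked <- cor(benchmark_reranked, computed_reranked, method = "spearman")

# Spearman's rho is the Pearson correlation of the within-sample ranks
rho_pearson_on_ranks <- cor(benchmark_reranked, computed_reranked,
                            method = "pearson")

# All three values coincide
c(raw = rho_raw, reranked = rho_reranked, pearson = rho_pearson_on_ranks)
```

If ties are present and broken arbitrarily during re-ranking, the two variants may differ slightly.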

# create comparative table
rank_comp <- top_100_fraib %>% 

  # average rank top 100 fraiberger
  filter(!is.na(Institution_continent)) %>% 
  group_by(Institution_continent) %>% 
  summarise(avg_r_top_100_fraib = mean(Rank), 
            n_top_100_fraib = n()) %>% 

  # top 100 intersect of artist-info and top 100 list by fraiberger
  full_join(
    top_100_fraib_lean %>% 
      filter(rank_past_all <= 100) %>% 
      group_by(continent) %>% 
      summarise(avg_r_intersect_rank_by_artistinfo = mean(rank_past_all), 
                avg_r_intersect_rank_by_fraib      = mean(Rank),
                n_intersect = n()), 

    by = c("Institution_continent" = "continent")
  ) %>% 

  # top 100 artist-info
  full_join(
    ranked_nodes %>% 
      filter(year == 2015) %>% 
      arrange(rank_past_all) %>% 
      head(100) %>% 
      left_join(exhplaces %>% 
                  select(id, country:address), 
                by = c("name" = "id")) %>% 
      group_by(continent) %>% 
      summarise(avg_r_artist_top100 = mean(rank_past_all), n_artist_top100 = n()), 
    by = c("Institution_continent" = "continent")
  ) %>% 

  full_join(
    ranked_nodes %>% 
      filter(year == 2015) %>% 
      left_join(exhplaces %>% 
                  select(id, country:address), 
                by = c("name" = "id")) %>% 
      group_by(continent) %>% 
      summarise(avg_r_artist_2015_all = mean(rank_past_all), n_artist_2015_all = n()), 
    by = c("Institution_continent" = "continent")
  ) %>% 

  mutate_if(is.numeric, ~ round(., 0)) %>% 
  select(Institution_continent, 
         avg_r_top_100_fraib, 
         avg_r_intersect_rank_by_artistinfo, 
         avg_r_artist_top100, 
         avg_r_artist_2015_all, everything()) %>% 
  arrange(avg_r_top_100_fraib, avg_r_artist_2015_all)

obs_rankings <- rank_comp %>% 
  mutate(separating_col = "                                          ") %>% 
  select(Institution_continent, 
         n_top_100_fraib, 
         n_artist_top100, 
         n_intersect, 
         separating_col,
         n_artist_2015_all) %>% 
  mutate_if(is.numeric, .funs = ~ (. / sum(., na.rm=T) * 100) %>% round(1)) 

obs_rankings %>% 
  mutate_all(~ replace_na(., "")) %>% 
  kable(col.names = c("Continent", 
                      paste0("Fraiberger et al.",
                             footnote_marker_symbol(1, format = "latex"),
                             "\nn=", 
                             sum(rank_comp$n_top_100_fraib, na.rm=T)), 
                      paste0("artist-info\nn=", 
                             sum(rank_comp$n_artist_top100, na.rm=T)), 
                      paste0("Intersection\nn=", 
                             sum(rank_comp$n_intersect, na.rm=T)), 
                      "                                          ",
                      paste0("artist-info\nn=", 
                             sum(rank_comp$n_artist_2015_all, na.rm=T))) %>% linebreak(align = "c"), 
        escape = F, 
        booktabs = T, 
        linesep = "",
        format = "latex", 
        align = c("l", rep("c", 5)), 
        caption = "Geographical comparison of the network-based rankings (number of venues)") %>% 
  kable_styling(latex_options = "striped") %>% 
  add_header_above(header = c(" ", "Top 100" = 3, " ", "Total")) %>% 
  add_header_above(header = c(" ", "Number of observations in %" = 5), 
                   align = "c") %>% 
  footnote(symbol = "Organizations with multiple locations have been excluded")

avg_rank_intersect <- top_100_fraib_lean %>% 
  filter(rank_past_all <= 100) %>% 
  summarise(fraib = mean(Rank), 
            artist = mean(rank_past_all)) %>% 
  mutate_if(is.numeric, round)

Compared to the benchmark ranking, however, I overestimate the status of the exhibition venues listed in table \@ref(tab:table-intersection-of-top-100). On average, venues are ranked `r avg_rank_intersect$artist`th based on the artist-info record, whereas the same locations are only ranked `r avg_rank_intersect$fraib`th in the benchmark list. Given that the network measure in fact yields similar results to the extent specified above, this also seems to be due to the absence of important exhibition locations in the artist-info dataset. For example, exhibition data for the Museum of Modern Art and the Whitney Museum are missing, which is mainly related to the challenges associated with the data collection method applied for this study.[^data_collection] More specifically, it is possible that the artist-info dataset omits data on North American venues more systematically, as table \@ref(tab:obs-ranking-comparison) suggests. Compared to the benchmark list of @fraiberger2018sm, the top 100 derived from artist-info contains considerably fewer North American institutions relative to European ones: North American venues account for `r filter(obs_rankings, Institution_continent == "North America") %>% pull(n_top_100_fraib)` percent of the benchmark list as opposed to `r filter(obs_rankings, Institution_continent == "North America") %>% pull(n_artist_top100)` percent of the artist-info top 100. In the intersection of both datasets, as well as among all venues in the artist-info dataset, about one fifth of the institutions are American. In contrast, almost four fifths of the exhibition venues are located in Europe.

[^data_collection]: The automatic retrieval of exhibition data failed for some venues. Given the commonly extensive exhibition history of these locations, this is presumably due to prolonged server loading times and a deficient configuration of the program used for data retrieval. Manual additions were not possible in the context of this thesis, not least because access to artist-info.com has become subject to payment.

European and North American exhibition venues also differ with regard to the network-based ranking, as shown by table \@ref(tab:avg-ranking-comparison). Consider, once again, the intersection of exhibition venues, i.e. those listed in the top 100 both by Fraiberger et al. and by the network-based ranking for the artist-info dataset. The average rank of North American venues indicated by @fraiberger2018sm is 53, whereas European locations are only ranked 60th (see $Intersection^a$). Conversely, based on the artist-info dataset, these European venues are ranked 43rd, whereas the North American locations only reach the 64th position (see $Intersection^b$). The tendency to overestimate the status of European places also applies when considering the entire top 100 of both datasets. In this case, however, the differences are considerably smaller (see the first two columns of table \@ref(tab:avg-ranking-comparison)). The same geographical divergences apply to the total number of exhibition venues with affiliations in 2015 as well, as shown in the column furthest to the right. Consequently, in comparison to the benchmark, I tend to overestimate the status of European venues.

In summary, I find a geographical bias when comparing the top 100 venues listed by @fraiberger2018sm to the top 100 based on the artist-info dataset, the intersection of these 100 venues and the total number of locations in the dataset at hand. Admittedly, a larger number of locations would have been preferable for a more substantiated comparison. As mentioned above, the limited number of locations is partly due to the procedure of data collection. Nevertheless, this section also revealed that half of the comparable exhibition venues available in both datasets are listed in the top 100. In addition, the order of ranking reflects a weak positive correlation.

avg_rankings <- rank_comp %>% 
  mutate(separating_col = " ") %>% 
  select(Institution_continent, 
         avg_r_top_100_fraib, 
         avg_r_artist_top100, 
         avg_r_intersect_rank_by_fraib,
         avg_r_intersect_rank_by_artistinfo, 
         separating_col,
         avg_r_artist_2015_all) 



avg_rankings %>% 
  mutate_all(~ replace_na(., "")) %>% 
  kable(col.names = c("Continent", 
                      paste0("Fraiberger et al.",
                             footnote_marker_symbol(1, format = "latex"),
                             "\nn=", 
                             sum(rank_comp$n_top_100_fraib, na.rm=T)), 
                      paste0("artist-info\nn=", 
                             sum(rank_comp$n_artist_top100, na.rm=T)), 
                      paste0("Intersection", 
                             footnote_marker_alphabet(1, "latex"), 
                             "\nn=", 
                             sum(rank_comp$n_intersect, na.rm=T)), 
                      paste0("Intersection", 
                             footnote_marker_alphabet(2, "latex"), 
                             "\nn=",  
                             sum(rank_comp$n_intersect, na.rm=T)), 
                      " ",
                      paste0("artist-info\nn=", 
                             sum(rank_comp$n_artist_2015_all, na.rm=T))) %>% linebreak(align = "c"), 
        escape = F, 
        booktabs = T, 
        linesep = "",
        format = "latex", 
        align = c("l", rep("c", 6)), 
        caption = "Geographical comparison of the network-based rankings (average rank)") %>% 
  kable_styling(latex_options = "striped") %>% 
  add_header_above(header = c(" ", "Top 100" = 4, " ", "Total")) %>% 
  add_header_above(header = c(" ", "Average Rank" = 6), 
                   align = "c", escape = F) %>% 
  footnote(symbol = "Organizations with multiple locations have been excluded", 
           alphabet_title = "Calculation of the mean based on: ", 
           alphabet = c("ranks indicated by Fraiberger et al.", 
                        "ranks computed for the artist-info dataset"))

ranked_nodes %>% 
    left_join(exhplaces %>% 
              select(id, country:address), 
              by = c("name" = "id")) %>% 
  group_by(continent, year) %>% 
  summarise(avg_percentile = median(venue_percentile), 
            n = n()) %>% 
  filter(n >= 50) %>% 

  ggplot(aes(year, avg_percentile, color = continent)) + 
  geom_line()

galsimple_eigen %>% 
  filter(num_of_exh >= 10)

Careers in the visual arts {#artist-careers}

career_paths %>% 
  group_by(artist_id, career_phase_quintiles) %>% 
  summarise(n = n()) %>% 
  group_by(career_phase_quintiles) %>% 
  summarise(avg = mean(n), 
            median = median(n),
            sd = sd(n)) %>% 
  mutate_if(is.numeric, ~ round(., 7)) %>% 
  knitr::kable(caption = "Exhibitions per career phase")

This chapter examines the careers of artists, addressing the hypotheses presented in chapter \@ref(hypotheses). In the first section, I analyse to what extent the path dependency, observed by @fraiberger2018, applies to the artists of the dataset at hand. Subsequently, I will explore the artists' trajectories over time to find out if initial differences in recognition increase throughout the careers. The third section describes how regular or irregular careers evolve. In other words, do artists exclusively exhibit in prestigious venues given that their careers began in high-profile locations? Finally, the fourth section examines whether successful and less successful artists can be distinguished by the type of moves in particular phases of their careers.

Path dependencies

# assign reputation class for begin of career according to Fraiberger et al. (2018, 2)

# calculate avg reputation per career phase for each artist
artists_avg_reputation_per_phase <- career_paths %>% 
  group_by(artist_id, career_phase) %>% 
  summarise(avg_venue_perc = mean(venue_percentile)) 

# high initial 
artists_high_init_rep <- artists_avg_reputation_per_phase %>% 
  filter(career_phase == "first_5_exh" & avg_venue_perc >= 80) %>% 
  distinct(artist_id) %>% 
  pull()

# low initial
artists_low_init_rep <- artists_avg_reputation_per_phase %>% 
  filter(career_phase == "first_5_exh" & avg_venue_perc <= 40) %>% 
  distinct(artist_id) %>% 
  pull()

# assign reputation class for end of career 
# high end
artists_high_end_rep <- artists_avg_reputation_per_phase %>% 
  filter(career_phase == "last_5_exh" & avg_venue_perc >= 80) %>% 
  distinct(artist_id) %>% 
  pull()

# low end
artists_low_end_rep <- artists_avg_reputation_per_phase %>% 
  filter(career_phase == "last_5_exh" & avg_venue_perc <= 40) %>% 
  distinct(artist_id) %>% 
  pull()

career_paths <- career_paths %>% 
  mutate(initial_rep_fraib = case_when(artist_id %in% artists_high_init_rep ~ "high initial", 
                                       artist_id %in% artists_low_init_rep ~  "low initial") %>% 
           replace_na("moderate initial") %>% 
           factor(levels = c("low initial", "moderate initial", "high initial")), 

         end_rep_fraib = case_when(artist_id %in% artists_high_end_rep ~ "high final", 
                                   artist_id %in% artists_low_end_rep ~  "low final") %>% 
           replace_na("moderate final") %>% 
           factor(levels = c("low final", "moderate final", "high final")))

artists_avg_reputation_per_phase %>% 
  ggplot(aes(avg_venue_perc)) + 
  geom_histogram() + 
  facet_grid(~ career_phase) +
  theme_bw()

tmatrix <- career_paths %>% 
  distinct(artist_id, .keep_all = T) %$%
  table(initial_rep_fraib, end_rep_fraib) 

# tmatrix %>% 
#   addmargins(2) %>% 
#   xtable::xtable(align = c("l", rep("c", 3), "r"), digits = 0,
#                  caption = "Transition matrix for artists' reputation classes at the beginning and end of their careers indicating number of artists")

Does a high prestige of the venues hosting the artists’ first five exhibitions coincide with a high prestige of the artists’ last five exhibition venues? In order to examine the hypothesized relationship, I created categories of recognition based on the first and the last five exhibits for artists with more than 10 exhibits. I assigned a high initial recognition if the artists' works were on average shown in the top 20% of exhibition venues during the first five exhibits. Conversely, a low initial recognition was assigned if the artists' works were on average shown in the lower 40% of the exhibition venues. Artists who on average exhibited in the intermediate percentiles were classified with a moderate initial recognition. The same procedure was applied to define the recognition at the end of the artists' careers, that is, their last five exhibitions. For the best possible comparability of the results, I have followed the procedure specified by Fraiberger et al. [-@fraiberger2018, 2].
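The classification rule can be sketched in a few lines. The function name `classify_reputation()` and the four average percentiles are hypothetical and serve only as an illustration; the thresholds (at least 80 for high, at most 40 for low) follow the description above.

```r
# Assign a reputation class from the average venue percentile of five exhibits.
# Hypothetical helper for illustration; thresholds follow the text:
#   >= 80 -> "high", <= 40 -> "low", everything in between -> "moderate"
classify_reputation <- function(avg_percentile) {
  label <- ifelse(avg_percentile >= 80, "high",
                  ifelse(avg_percentile <= 40, "low", "moderate"))
  factor(label, levels = c("low", "moderate", "high"))
}

# Illustrative average percentiles of the first five exhibitions of four artists
avg_first_five <- c(artist_a = 91.2, artist_b = 55.0,
                    artist_c = 38.4, artist_d = 80.0)

initial_class <- classify_reputation(avg_first_five)
table(initial_class)
```

Note that an average of exactly 80 is counted as high and an average of exactly 40 as low under this reading of the thresholds.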

The classification of trajectories results in `r rowSums(tmatrix)["high initial"]` high, `r rowSums(tmatrix)["moderate initial"]` moderate and `r rowSums(tmatrix)["low initial"]` low initial careers. Arguably, the category of artists who, on average, had their first five exhibitions in the lower 40% of the venues is sparsely populated. In their study as well, Fraiberger et al. [-@fraiberger2018, 6] observe a disproportionately low number of artists in the lowest initial reputation class. As the authors suggest, this is related to the fact that exhibitions in venues of lower prestige are less frequently documented [@fraiberger2018sm, 3]. Indeed, the artists are assigned to one of these categories based on the average prestige of the first five venues. If fewer exhibitions are recorded for the lower 40% of venues, artists are less likely to be assigned to the lowest reputation class. Rather, their initial recognition is overestimated. To what extent will this affect the results? Fraiberger et al. [-@fraiberger2018sm, 3] consider this to be rather unproblematic, "as the data collection was institution-based, not artist centered". Moreover, since the status of the low initial artists tends to be overestimated, differences to the higher prestige groups are in fact larger. Therefore, even if this group is populated more sparsely, statements about the artists of low initial recognition will still be conservative. Hence, with regard to the hypothesis on the path dependency discussed in this section, disadvantages resulting from the low prestige of the initial exhibition venues are actually underestimated.[^underestimate]

[^underestimate]: A similar argument was set out by @borkenhagen2018. Comparable to the artist-info dataset, their "Chef and Restaurant Database [...] is not only incomplete, but tend[s] to focus on the more interesting and popular cases" (Ibid., 6). However, they argue that "the data’s skew toward high-status chefs produces a sample that very likely understates the true differences between high- and low-status actors, resulting in biases that are conservative for our claims" (Ibid., 7).

tmatrix_p <- tmatrix %>% 
  {(. / rowSums(.)) * 100} %>% 
  round(1)
tmatrix_p %>%
  cbind(rep("Initial", 3), # initialize column to collapse with collapse_rows
        rownames(.) %>% str_remove(" initial"), # initialize column with rownames
        .) %>% # finally the table object

  kable(x = ., format = "latex", 
        align = c(rep("l",2) , rep("c", 3)), 
        booktabs = T, 
        row.names = F,
        col.names = colnames(.) %>% str_remove(" final"),
        caption = "Transition matrix of the artists' first and last five exhibitions") %>% 
  #row_spec(c(0, 2), extra_latex_after = "\\rowcolor{gray!6}") %>% 
  #column_spec(1:2, background = "#FFFFFF") %>%
  collapse_rows(columns = 1:2, latex_hline = 'none') %>% 
  add_header_above(c("", "", "Final" = 3))

In their study, Fraiberger et al. [-@fraiberger2018, 2] report that "[o]f the 4058 high-initial reputation artists, 58.6% remain in high-prestige territory until the end of their recorded career, and only 0.2% had the average prestige of their five most recent exhibits in the bottom 40%". Table \@ref(tab:transition-reputation-perc) shows similar results.[^alluvial] Calculating the transition probabilities reveals that only `r tmatrix_p["high initial", "low final"]`% of the artists who started in high-status positions end up in the lowest reputation class. Among all transitions, this is the most unlikely event. Conversely, `r tmatrix_p["high initial", "high final"]`% of the artists who exhibited in the most prestigious venues during their first exhibitions will also do so by the end of their career. In other words, a high-status position at the beginning of a career has the tendency to be reproduced.

[^alluvial]: Fraiberger et al. [-@fraiberger2018, 2] refer to figure 2f on p. 6 of their main article, that is, an alluvial diagram illustrating the transition probabilities. The transition matrix of table \@ref(tab:transition-reputation-perc) is an equivalent representation.

However, the transition matrix also reflects upward mobility with respect to these two career phases. Although careers of the lowest reputation class have a `r tmatrix_p["low initial", "low final"]`% probability of ending there as well, no less than `r tmatrix_p["low initial", "moderate final"]`% of the artists reach the end of their careers in moderate-prestige exhibition venues. Stated differently, positions of low initial reputation are most likely to be converted into moderate positions. Nevertheless, this upward mobility seems to be restricted to the next higher level of reputation. Notably, only `r tmatrix_p["low initial", "high final"]`% of low initial prestige artists reach the most emblematic locations.
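How such transition percentages are obtained can be shown with a small sketch. The ten artists and their classes below are invented for illustration; only the technique, a cross-tabulation normalized by row, corresponds to the transition matrix discussed here. Base R's `prop.table()` with `margin = 1` is used as a shorthand for dividing each row by its row sum.

```r
# Hypothetical initial and final reputation classes of ten artists
class_levels <- c("low", "moderate", "high")

initial <- factor(c("high", "high", "high", "high",
                    "moderate", "moderate", "moderate",
                    "low", "low", "low"),
                  levels = class_levels)
final   <- factor(c("high", "high", "moderate", "high",
                    "moderate", "high", "low",
                    "moderate", "moderate", "low"),
                  levels = class_levels)

# Cross-tabulate: rows = initial class, columns = final class
counts <- table(initial = initial, final = final)

# Row-wise shares: each row sums to 1 and can be read as
# "probability of ending in class j, given a start in class i"
transition_prob <- prop.table(counts, margin = 1)

# Express as percentages with one decimal
transition_pct <- round(transition_prob * 100, 1)
transition_pct
```

Because the normalization is by row, the percentages are conditional on the initial class and cannot be compared across rows as shares of all artists.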

The purpose of this section was to answer the question whether the observation of @fraiberger2018 also applies to the artists of the artist-info dataset using the same methodology. Indeed, a higher prestige of the first five venues is associated with a higher prestige of the last five locations of the artists' recorded careers. In a next step, we can examine in more detail the course of careers resulting in higher or lower degrees of artistic recognition.

Mobility {#career-mobility}

Will differences in recognition between artists at the beginning of their career manifest in more striking differences in later career phases? In this section, I explore whether the observation of @dubois2013, namely that recognition is cumulative in the field of contemporary poetry, applies to the visual arts as well. In order to demonstrate how the careers of artists of low, moderate and high initial recognition proceed, I consider two representations of the time dimension inherent to the career of an artist. On the one hand, a career can be understood as a sequence of events. In the present case these are represented by the artists' participations in exhibitions. On the other hand, we might wonder how the careers have developed after several years since the first documented exhibition.

Considering the events as a timeline, I first arrange the exhibitions of each artist in chronological order, from the first to the last documented exhibit. Then, I divide the sequence of exhibitions on an individual's trajectory into quintiles. Hence, artists whose record contains only 10 exhibitions, for instance, have only two exhibitions per quintile. Accordingly, the quintiles of trajectories with 50 exhibitions each comprise 10 records. Subsequently, for each quintile, the average percentile of the exhibition places in the status order was calculated. Figure \@ref(fig:plot-career-quintiles)a displays the average trajectories of artists who were assigned an initial reputation class based on their first five exhibits. Conversely, figure \@ref(fig:plot-career-quintiles)b shows the average trajectories of those who were classified by the prestige of the venues of their final five exhibitions. Applying two approaches for creating categories of artists, according to their recognition at the beginning and at the end of their careers, makes it possible to compare the different analytical strategies of the studies discussed in this thesis. @fraiberger2018 focus on how the careers of artists evolve, given the prestige of their first exhibition venues. In contrast, @dubois2013 as well as @borkenhagen2018 analyse the careers that have led individuals to certain positions. In other words, the authors trace back the individuals' careers. Transferring these two perspectives to the present case, we are therefore examining two groups of artists.
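The division into quintiles can be sketched as follows. The helper `assign_quintile()` is hypothetical and shows one plausible way of splitting a chronologically ordered sequence into five consecutive segments; it is an assumption for illustration, not necessarily the procedure used to construct the quintile variable in the data.

```r
# Map the positions 1..n of a chronologically ordered exhibition sequence
# to five consecutive segments (quintiles) of the trajectory.
# Hypothetical helper for illustration only.
assign_quintile <- function(n_exhibitions) {
  position <- seq_len(n_exhibitions)
  ceiling(position * 5 / n_exhibitions)
}

# A trajectory with 10 exhibitions: two exhibitions per quintile
quintiles_10 <- assign_quintile(10)
table(quintiles_10)

# A trajectory with 50 exhibitions: ten exhibitions per quintile
quintiles_50 <- assign_quintile(50)
table(quintiles_50)

# Lengths that are not a multiple of five yield segments of unequal size
table(assign_quintile(12))
```

For trajectories whose length is not divisible by five, the segment sizes differ by at most one exhibition under this scheme.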

ggpubr::ggarrange(
   career_paths %>% 
     group_by(career_phase_quintiles, initial_rep_fraib) %>% 
     summarise(avg_perc = mean(venue_percentile)) %>% 
     ungroup() %>% 
     mutate_at(vars(initial_rep_fraib), ~ str_replace(., " ", "\n"))  %>% 
     ggplot(aes(career_phase_quintiles, avg_perc , color = initial_rep_fraib, group = initial_rep_fraib)) +
     geom_line() + 
     theme_bw() + 
     theme(legend.position = "bottom", legend.text.align = 0.5, 
           legend.title.align = 0.5, legend.title = element_text(size = 10), 
           title = element_text(size = 10.5)) +
     labs(x = NULL, y = NULL, color = NULL) + 
     scale_y_continuous(limits = c(29,89), breaks = seq(30,90,10)) +
     ggtitle("Initial reputation class"),

   career_paths %>% 
     group_by(career_phase_quintiles, end_rep_fraib) %>% 
     summarise(avg_perc = mean(venue_percentile)) %>% 
     ungroup() %>% 
     mutate_at(vars(end_rep_fraib), ~ str_replace(., " ", "\n"))  %>% 
     ggplot(aes(career_phase_quintiles, avg_perc , color = end_rep_fraib, group = end_rep_fraib)) +
     geom_line() + 
     theme_bw() + 
     theme(legend.position = "bottom", legend.text.align = 0.5, 
           title = element_text(size = 10.5)) +
     labs(x = NULL, y = NULL, color = NULL) + 
     scale_y_continuous(limits = c(29,89), breaks = seq(30,90,10)) +
     ggtitle("Final reputation class"), 

   labels = "auto", font.label = list(size = 11.5)) %>%

  ggpubr::annotate_figure(left = ggpubr::text_grob("Average percentile", rot = 90))

Consider figure \@ref(fig:plot-career-quintiles)a, to begin with. It shows that initial differences diminish throughout the artists' careers. In part, this is because the high initial trajectories decline and remain in the top 20% of exhibition venues only until the second quintile. In addition, the artists who were assigned a low or moderate initial recognition improve their initial position. From this perspective, the initial differences do not widen in later phases of the artists' careers. Instead, the average trajectories of the artists in these three groups all result in moderate artistic recognition. Still, the average trajectories do not overlap. In contrast, figure \@ref(fig:plot-career-quintiles)b indicates that initial discrepancies are indeed reinforced, given that reputation classes were defined at the end of careers. Presumably, there are variables that amplify initial differences throughout the course of the artists' careers. In the following, however, I will not elaborate on this further. This thesis is not dedicated to estimating the effect of certain events that increase the chances of a trajectory ending at a certain level of reputation. Instead, I will continue to examine careers descriptively.

Importantly, figure \@ref(fig:plot-career-quintiles) shows that whether the hypothesis discussed in this section can be rejected depends on the analytical strategy. Only when reputation classes are based on the last five exhibitions of the artists do their average trajectories indicate that "inequalities, moderate at the beginning of the sequences, [...] become progressively greater" [@dubois2013, 515]. In contrast, differences have narrowed between the trajectories of artists who were classified by their initial degree of recognition. Does this observation still hold when considering the artistic recognition over the years after the first exhibition?

Analyzing the artists' careers over the years from the first to the last exhibition addresses the problem that the trajectories vary in length.[^length] As figure \@ref(fig:plot-example-careers) illustrates, some artists have many exhibitions within a few years, while others have few exhibitions over several years. Therefore, it seems reasonable to examine the recognition of the artists in the years following the first exhibition. For this purpose, I assign each exhibition on an artist's trajectory a number which corresponds to the years since the first exhibition, counting the year of the first exhibition as year one. If the first exhibition of an artist is recorded in 1975, for example, an exhibition in 1980 takes place in the sixth year of the career of this artist. Subsequently, for each year since the first exhibition in the low, moderate and high initial trajectories, I calculate the average percentile of the exhibition venues. The result is displayed by figure \@ref(fig:plot-career-years).

career_paths <- left_join(
  career_paths,
  career_paths %>% 
  group_by(artist_id) %>% 
  summarise(first_career_yr = min(exh_start_Y_from), 
            last_career_yr = max(exh_start_Y_from)),
  by = "artist_id") 

career_paths <- career_paths %>% 
  mutate(career_yr = exh_start_Y_from - first_career_yr + 1, 
         career_duration = last_career_yr - first_career_yr, 
         career_yrs = case_when(career_yr <= 5 ~ "1-5", 
                                career_yr >= 6 & career_yr <= 10 ~ "6-10", 
                                career_yr >= 11 & career_yr <= 15 ~ "11-15", 
                                career_yr >= 16 & career_yr <= 20 ~ "16-20", 
                                career_yr >= 21 & career_yr <= 25 ~ "21-25", 
                                career_yr >= 26 & career_yr <= 30 ~ "26-30", 
                                career_yr >= 31 & career_yr <= 35 ~ "31-35", 
                                career_yr >= 36 & career_yr <= 41 ~ "36-41") %>% 
           factor(levels = c("1-5","6-10", "11-15", "16-20", "21-25", "26-30", "31-35", "36-41")))

ggpubr::ggarrange(
   career_paths %>% 
     group_by(career_yr, initial_rep_fraib) %>% 
     summarise(avg_perc = mean(venue_percentile)) %>% 
     ungroup() %>% 
     mutate_at(vars(initial_rep_fraib), ~ str_replace(., " ", "\n"))  %>% 
     ggplot(aes(career_yr, avg_perc , color = initial_rep_fraib, group = initial_rep_fraib)) +
     geom_line() + 
     theme_bw() + 
     theme(legend.position = "bottom", legend.text.align = 0.5, 
           legend.title.align = 0.5, legend.title = element_text(size = 10), 
           title = element_text(size = 10.5)) +
     labs(x = NULL, y = NULL, color = NULL) + 
     scale_y_continuous(limits = c(29,89), breaks = seq(30,90,10)) +
     ggtitle("Initial reputation class"),

   career_paths %>% 
     group_by(career_yr, end_rep_fraib) %>% 
     summarise(avg_perc = mean(venue_percentile)) %>% 
     ungroup() %>% 
     mutate_at(vars(end_rep_fraib), ~ str_replace(., " ", "\n"))  %>% 
     ggplot(aes(career_yr, avg_perc , color = end_rep_fraib, group = end_rep_fraib)) +
     geom_line() + 
     theme_bw() + 
     theme(legend.position = "bottom", legend.text.align = 0.5, 
           title = element_text(size = 10.5)) +
     labs(x = NULL, y = NULL, color = NULL) + 
     scale_y_continuous(limits = c(29,89), breaks = seq(30,90,10)) +
     ggtitle("Final reputation class"),

   labels = "auto", font.label = list(size = 11.5)) %>%

  ggpubr::annotate_figure(left = ggpubr::text_grob("Average percentile", rot = 90))

exh_count_2015 <- ranked_nodes %>% 
    filter(year == 2015) %>% 
    group_by(venue_status) %>% 
    summarise(avg = mean(num_of_exh)) %>% 
    mutate_if(is.numeric, round) %>% 
    na.omit()

Compared with the average trajectories based on the quintiles, the lines in figure \@ref(fig:plot-career-years) show a similar trend. Consider the artists who were assigned a reputation class based on their initial exhibitions. As figure \@ref(fig:plot-career-years)a indicates, differences at the beginning of the artists' careers diminish over the years. For careers spanning more than 30 years, the average trajectories even overlap in certain years. Therefore, the graph suggests that initial inequalities tend to decrease rather than increase the longer the careers continue. Similar to figure \@ref(fig:plot-career-quintiles)a, low initial artists attain higher recognition throughout their careers, while high initial artists access venues of increasingly lower prestige on average. For the artists who were assigned a reputation class based on their last five exhibitions, the downward trend in the last years of their careers applies to all reputation classes (see figure \@ref(fig:plot-career-years)b). However, the average trajectories still show increasing disparities in artistic recognition over the course of the artists' careers. Therefore, in line with the conclusion above, the hypothesis of growing inequalities can only be confirmed if the reputation classes are established based on the last five exhibitions. Conversely, the hypothesis must be rejected from the perspective of the low, moderate and high initial trajectories. Their careers tend to converge on average.[^decrease] In other words, reputation is not cumulative in the sense of access to increasingly prestigious exhibition venues.

[^decrease]: It is worth mentioning that figures \@ref(fig:plot-career-years)a and \@ref(fig:plot-career-years)b both show a decline in artistic recognition after the 35th year of a career at the latest. The exhibitions of the low initial trajectories cover a maximum of only 35 years. Still, a high entry level makes it easier to maintain a position in the status order over a long period of time.

[^length]: Consider two artists of "The Pictures Generation" [@walker2010] as an example. The career of Cindy Sherman has been documented with `r filter(career_paths, name == "Sherman, Cindy") %>% nrow()` exhibitions from `r filter(career_paths, name == "Sherman, Cindy") %$% min(exh_start_Y_from)` to `r filter(career_paths, name == "Sherman, Cindy") %$% max(exh_start_Y_from)`. In contrast, `r filter(career_paths, name == "Longo, Robert") %>% nrow()` exhibitions between `r filter(career_paths, name == "Longo, Robert") %$% min(exh_start_Y_from)` and `r filter(career_paths, name == "Longo, Robert") %$% max(exh_start_Y_from)` have been recorded for Robert Longo, an artist of the same generation.

[^sherman]: This involves, for example, the career of Cindy Sherman (*1954) and other artists of "The Pictures Generation" [@walker2010]. In the artist-info dataset, her career has been documented with `r filter(career_paths, name == "Sherman, Cindy") %>% nrow()` exhibitions from `r filter(career_paths, name == "Sherman, Cindy") %$% min(exh_start_Y_from)` to `r filter(career_paths, name == "Sherman, Cindy") %$% max(exh_start_Y_from)`.

Irregularity of careers

# compute standard deviation for initial categories 
sd_initial_careers <- career_paths %>% 
  #filter(career_phase != "first_5_exh") %>% 
  # each artist was assigned exactly one initial reputation class, so
  # group_by(artist_id) is equivalent to group_by(artist_id, initial_rep_fraib)
  group_by(artist_id, initial_rep_fraib) %>% 
  mutate(artist_exh_count = n()) %>% 
  #filter(artist_exh_count >= 10) %>% 
  summarise(sd = sd(venue_percentile), 
            mean_ven_perc = mean(venue_percentile), 
            artist_exh_count = n()) %>% 
  # compute percentiles of standard dev. for each reputation class 
  group_by(initial_rep_fraib) %>% 
  mutate(sd_initial_percentile = ntile(sd, 100)) %>% 

  # compute percentiles of the standard deviation across all trajectories
  ungroup() %>% 
  mutate(sd_all_percentile = ntile(sd, 100)) %>% 
  mutate_at(vars(initial_rep_fraib), ~ str_remove(., " initial")) %>% 
  mutate_at(vars(contains("rep_fraib")),
            ~ factor(., levels = c("low","moderate","high"))) %>% 
  left_join(select(artists, id, name), by = c("artist_id" = "id")) 


# compute standard deviation for end categories 
sd_end_careers <- career_paths %>% 
  filter(career_phase != "last_5_exh") %>% 
  # each artist was assigned exactly one final reputation class, so
  # group_by(artist_id) is equivalent to group_by(artist_id, end_rep_fraib)
  group_by(artist_id, end_rep_fraib) %>% 
  mutate(artist_exh_count = n()) %>%
  #filter(artist_exh_count >= 10) %>% 
  summarise(sd = sd(venue_percentile) %>% round(5),
            mean_ven_perc = mean(venue_percentile), 
            artist_exh_count = n()) %>% 
  # compute percentiles of standard dev. for each reputation class 
  group_by(end_rep_fraib) %>% 
  mutate(sd_end_percentile = ntile(sd, 100)) %>% 
  ungroup() %>% 
  mutate_at(vars(end_rep_fraib), ~ str_remove(., " final")) %>% 
  mutate_at(vars(contains("rep_fraib")),
            ~ factor(., levels = c("low","moderate","high"))) %>% 
  left_join(select(artists, id, name), by = c("artist_id" = "id"))

career_paths %>% 
  group_by(initial_rep_fraib) %>% 
  summarise(sd = sd(venue_percentile), 
            mean = mean(venue_percentile)) %>% 
  mutate(coefficient_of_variation = sd / mean * 100) %>% 
  mutate_if(is.numeric, ~ round(., 2))

career_paths %>% 
  group_by(initial_rep_fraib, career_phase_quintiles) %>% 
  summarise(sd = sd(venue_percentile), 
            mean = mean(venue_percentile)) %>% 
  mutate(coefficient_of_variation = sd / mean * 100) %>% 
  mutate_if(is.numeric, ~ round(., 2)) %>% 
  select(- mean) %>% 
  gather(var, val, - initial_rep_fraib, - career_phase_quintiles) %>% 
  ggplot(aes(career_phase_quintiles, val, fill = initial_rep_fraib)) + 
  geom_col(position = "dodge") + 
  facet_grid(~ var) +
  theme_bw() +
  theme(legend.position = "bottom")

This section explores whether the trajectories of artists with a high initial recognition show a similar variability in the status of exhibition venues as careers of lower initial recognition. In other words, do artists exhibit exclusively in high status venues if their previous exhibitions took place in the most prestigious venues? I address this question by computing the standard deviation of the exhibition venues' prestige on each artist's trajectory. More precisely, each exhibition on a trajectory is associated with an exhibition location that has a prestige value for the year in which the exhibition took place. The prestige value, as introduced in section \@ref(status-evaluation), is equal to the percentile rank of the venue among all locations with ties in that year. For example, a venue located in the 90th percentile has an Eigenvector centrality value equal to or greater than that of 90% of the venues in that year. Calculating the standard deviation of the prestige of exhibition venues on the artists' trajectories reveals in which strata of the status order they have exhibited in the course of their careers. Are the works of certain artists, for example, shown exclusively in the top 20% of exhibition venues or also in locations of lower prestige? It should be mentioned that this section focuses on the careers of artists who were assigned a reputation class based on their first five exhibitions. As I am primarily concerned with the careers of those who initially had the opportunity to present their works in the most prestigious locations, I will no longer examine the careers grouped by their final recognition (see figures \@ref(fig:plot-career-quintiles)b and \@ref(fig:plot-career-years)b).
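To make the two measures concrete, the following base R sketch computes a percentile rank for a handful of venues in a single year and then the standard deviation of one trajectory. All names and scores are hypothetical, and the percentile rule used here (share of venues with a score at or below the venue's own) is an assumption that follows the verbal definition above; the ranking function used to build the thesis data may differ in detail.

```r
# Hypothetical eigenvector centrality scores of five venues in one year
centrality <- c(A = 0.10, B = 0.40, C = 0.25, D = 0.90, E = 0.05)

# Percentile rank: share of venues whose score is at or below the venue's own score
venue_percentile <- sapply(centrality, function(x) mean(centrality <= x) * 100)

# One artist's trajectory, given as the venues of successive exhibitions
trajectory <- c("D", "B", "D", "C")
trajectory_percentiles <- venue_percentile[trajectory]

# Variability of the trajectory across the status order
trajectory_sd <- sd(trajectory_percentiles)
trajectory_sd
```

A trajectory confined to a single stratum of the status order yields a standard deviation near zero, whereas exhibitions spread across the whole order push it upward.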

plot_career <- function(.artist_name){

  # get exhibitions 
  trajectory = career_paths %>% 
    filter(str_detect(name, .artist_name), 
           exh_start_Y_from <= 2015) %>% 
    mutate(exh_start_Ym_from = paste0(exh_start_Ym_from, "01") %>% lubridate::ymd(), 
           mean = mean(venue_percentile)) 

  # define colors for the venue types that occur on this trajectory
  # (tibble() replaces the deprecated data_frame())
  colors = tibble(breaks = c("Collector", "Gallery", "Non Profit", "Museum"),
                  values = c(wes_palette("FantasticFox1")[5],
                             wes_palette("Zissou1")[c(4,2)], 
                             wes_palette("Darjeeling2")[2])) %>% 
    semi_join(distinct(trajectory, exh_venue_from) %>% mutate_all(as.character), 
              by = c("breaks" = "exh_venue_from"))

  # plot 
  trajectory %>% 
    ggplot(aes(exh_start_Ym_from, venue_percentile, color = exh_venue_from, shape = exh_type_from)) +
    geom_point() +
    theme_bw() +
    labs(y = NULL, x = NULL, 
         color = "Exhibition\nvenue", shape = "Exhibition\ntype") +
    ggtitle(.artist_name) +
    theme(title = element_text(size = 10), legend.title.align = 0.5) + 
    scale_y_continuous(limits = c(0, 100), breaks = seq(0,100,20)) +
    scale_color_manual(breaks = colors$breaks, values = colors$values)

}


# ggpubr::ggarrange(
#   plot_career("Sherman, Cindy"), 
#   plot_career("Longo, Robert"), 
#   common.legend = T, legend = "bottom", ncol = 2
# )

sample_median_sd_artists <- sd_initial_careers %>% 
  filter(sd_initial_percentile %in% c(49:50), 
         artist_exh_count >= 15) %>% 
  group_by(initial_rep_fraib) %>% 
  slice(which.max(artist_exh_count)) 

ggpubr::ggarrange(
  # sample_median_sd_artists %>% 
  #   filter(initial_rep_fraib == "high") %>% 
  #   pull(name) %>% 
    plot_career("Weiner, Lawrence"),
  # sample_median_sd_artists %>% 
  #   filter(initial_rep_fraib =="moderate") %>% 
  #   pull(name) %>% 
    plot_career("Clemente, Francesco"),
    plot_career("Zech, Sati"), 
  common.legend = T, legend = "bottom", ncol = 3) %>% 
  ggpubr::annotate_figure(left = ggpubr::text_grob("Venue Percentile", rot = 90)
)

high_initial_career <- sd_initial_careers %>% 
  filter(#sd_initial_percentile %in% 49:51,
         #artist_exh_count >= 40,
         initial_rep_fraib == "high") %>% 
  filter(name == "Kruger, Barbara") %>% 
  pull(name) 

moderate_initial_career <- sd_initial_careers %>% 
  filter(#sd_initial_percentile %in% 49:51,
         #artist_exh_count >= 20, 
         initial_rep_fraib == "moderate", 
         mean_ven_perc >= 50, mean_ven_perc <= 78) %>% 
  filter(name == "Dokoupil, Georg Jîrî") %>% 
  pull(name)



low_initial_career <- sd_initial_careers %>% 
  filter(#sd_initial_percentile %in% 45:50, 
         #artist_exh_count >= 10, 
         initial_rep_fraib == "low") %>% 
  filter(name == "Kraenzlein, Dieter") %>% 
  pull(name) 


ggpubr::ggarrange(
  plot_career(high_initial_career), 
  plot_career(moderate_initial_career), 
  plot_career(low_initial_career),

  common.legend = T, legend = "bottom", ncol = 3) %>% 
  ggpubr::annotate_figure(left = ggpubr::text_grob("Venue Percentile", rot = 90)#,bottom = ggpubr::text_grob("Year of exhibition")
                          )

high_final_career <- sd_end_careers %>% 
  filter(sd_end_percentile %in% 48:52, 
         artist_exh_count >= 40, 
         end_rep_fraib == "high") %>% 
  filter(name == "Grosse, Katharina") %>% 
  pull(name) 

moderate_final_career <- sd_end_careers %>% 
  filter(sd_end_percentile %in% 48:52,
         artist_exh_count >= 20, 
         end_rep_fraib == "moderate", 
         mean_ven_perc >= 50, mean_ven_perc <= 78) %>% 
  filter(name == "Dokoupil, Georg Jîrî") %>% 
  pull(name)

low_final_career <- sd_end_careers %>% 
  filter(sd_end_percentile %in% 48:50, 
         artist_exh_count >= 10, 
         end_rep_fraib == "low") %>% 
  filter(name == "Lauterjung, Michael") %>% 
  pull(name) 

ggpubr::ggarrange(
  plot_career(high_final_career), 
  plot_career(moderate_final_career), 
  plot_career(low_final_career),

  common.legend = T, legend = "bottom", ncol = 3
) %>% 
  ggpubr::annotate_figure(left = ggpubr::text_grob("Venue Percentile", rot = 90))

First, consider the careers of the three artists displayed in figure \@ref(fig:plot-example-careers). Within their respective classes of low, moderate and high initial prestige, they represent trajectories of median variability: about half of the trajectories in the same class have a higher standard deviation and about half a lower one. The vertical axis displays the percentile of the venue in which the exhibition took place in the corresponding year. The exhibitions of Lawrence Weiner, an example of a high initial trajectory, are concentrated in the top 20% of exhibition venues. Nevertheless, this does not result in a homogeneous picture, since other exhibitions are widely dispersed in the status order. The trajectories of Francesco Clemente and Sati Zech display a comparable degree of variability. Importantly, the trajectories vary in different segments of the status order. On average, Clemente and Zech exhibited in the `r filter(sd_initial_careers, name=="Clemente, Francesco") %>% pull(mean_ven_perc) %>% round(0)` and `r filter(sd_initial_careers, name=="Zech, Sati") %>% pull(mean_ven_perc) %>% round(0)` percentile, respectively. The artworks of Weiner, by contrast, were shown in the `r filter(sd_initial_careers, name=="Weiner, Lawrence") %>% pull(mean_ven_perc) %>% round(0)` percentile on average. Although the average prestige of the venues reflects where these three trajectories began, this does not imply that the artists exhibit exclusively in these segments throughout their careers. Rather, as the distribution of exhibitions illustrates, there is a chance that their works are displayed in venues of the same or similar prestige.

sd_initial_careers %>%
    ggplot(aes(initial_rep_fraib, sd, group = initial_rep_fraib)) + 
      geom_boxplot(fill = c(wes_palette("Rushmore1")[3],
                            wes_palette("Darjeeling2")[2],
                            wes_palette("Rushmore1")[5])) +
      labs(x = NULL, y = NULL, fill = NULL) +
      #geom_boxplot(aes(x = "ungrouped", y = sd), data = sd_initial_careers, inherit.aes = F) +
      # `fun` replaces the deprecated `fun.y` argument (ggplot2 >= 3.3.0)
      stat_summary(fun=mean, geom="point", shape=20, size=2, color="grey", fill="red") +
      ggtitle("Initial reputation class") +
      theme_bw() +
      theme(legend.position = "bottom", legend.title = element_blank(),
            title = element_text(size = 10.5)) +
      guides(fill = "none")

# make boxplot
ggpubr::ggarrange(

 sd_initial_careers %>% 
    ggplot(aes(initial_rep_fraib, sd, fill = initial_rep_fraib)) + 
      geom_boxplot() +
      stat_summary(fun=mean, geom="point", shape=20, size=2, color="red", fill="red") +
      labs(x = NULL, y = NULL, fill = NULL) +
      ggtitle("Initial reputation class") +
      theme_bw() +
      theme(legend.position = "bottom", legend.text.align = 0.5, 
            title = element_text(size = 10.5), axis.text.x = element_blank()), 

sd_end_careers %>% 
    ggplot(aes(end_rep_fraib, sd, fill = end_rep_fraib)) + 
      geom_boxplot() +
      stat_summary(fun=mean, geom="point", shape=20, size=2, color="red", fill="red") +
      labs(x = NULL, y = NULL, fill = NULL) +
      ggtitle("Final reputation class") +
      theme_bw() +
      theme(legend.position = "bottom", legend.text.align = 0.5, 
            title = element_text(size = 10.5), axis.text.x = element_blank()) 

)

# compute standard deviation for initial categories 
sd_initial_careers_solo <- career_paths %>% filter(exh_type_from == "solo") %>% 
  filter(career_phase != "first_5_exh") %>% 

  # after restricting to solo shows, some artists have fewer than 10 exhibitions;
  # keep only artists with at least 10 solo exhibitions
  group_by(artist_id) %>% 
  mutate(n_solo_exh = n()) %>% 
  filter(n_solo_exh >= 10) %>% 

  # each artist was assigned exactly one initial reputation class, so
  # group_by(artist_id) is equivalent to group_by(artist_id, initial_rep_fraib)
  group_by(artist_id, initial_rep_fraib) %>% 
  summarise(sd = sd(venue_percentile) %>% round(5), 
            mean_ven_perc = mean(venue_percentile)) %>% 
  # compute percentiles of standard dev. for each reputation class 
  group_by(initial_rep_fraib) %>% 
  mutate(sd_initial_percentile = ntile(sd, 100)) %>% 
  ungroup() %>% 
  mutate_at(vars(initial_rep_fraib), ~ str_remove(., " initial")) %>% 
  mutate_at(vars(contains("rep_fraib")), ~ factor(., levels = c("low","moderate","high"))) 


# compute standard deviation for end categories 
sd_end_careers_solo <- career_paths %>% 

  filter(exh_type_from == "solo", 
         career_phase != "last_5_exh") %>% 

  # after restricting to solo shows, some artists have fewer than 10 exhibitions;
  # keep only artists with at least 10 solo exhibitions
  group_by(artist_id) %>% 
  mutate(n_solo_exh = n()) %>% 
  filter(n_solo_exh >= 10) %>% 

  # each artist was assigned exactly one final reputation class, so
  # group_by(artist_id) is equivalent to group_by(artist_id, end_rep_fraib)
  group_by(artist_id, end_rep_fraib) %>% 
  summarise(sd = sd(venue_percentile) %>% round(5),
            mean_ven_perc = mean(venue_percentile), 
            artist_exh_count = n()) %>% 
  # compute percentiles of standard dev. for each reputation class 
  group_by(end_rep_fraib) %>% 
  mutate(sd_end_percentile = ntile(sd, 100)) %>% 
  ungroup() %>% 
  mutate_at(vars(end_rep_fraib), ~ str_remove(., " final")) %>% 
  mutate_at(vars(contains("rep_fraib")), ~ factor(., levels = c("low","moderate","high"))) 





# make boxplot
ggpubr::ggarrange(

 sd_initial_careers_solo %>% 
    ggplot(aes(initial_rep_fraib, sd, fill = initial_rep_fraib)) + 
      geom_boxplot() +
      stat_summary(fun=mean, geom="point", shape=20, size=2, color="red", fill="red") +
      labs(x = NULL, y = NULL, fill = NULL) +
      ggtitle("Initial reputation class") +
      theme_bw() +
      theme(legend.position = "bottom", legend.text.align = 0.5, 
            title = element_text(size = 10.5), axis.text.x = element_blank()), 

sd_end_careers_solo %>% 
    ggplot(aes(end_rep_fraib, sd, fill = end_rep_fraib)) + 
      geom_boxplot() +
      stat_summary(fun=mean, geom="point", shape=20, size=2, color="red", fill="red") +
      labs(x = NULL, y = NULL, fill = NULL) +
      ggtitle("Final reputation class") +
      theme_bw() +
      theme(legend.position = "bottom", legend.text.align = 0.5, 
            title = element_text(size = 10.5), axis.text.x = element_blank()) 

)

high <- filter(sd_initial_careers, initial_rep_fraib=="high") %>% pull(sd)
mod <- filter(sd_initial_careers, initial_rep_fraib=="moderate") %>% pull(sd)
low <- filter(sd_initial_careers, initial_rep_fraib=="low") %>% pull(sd)

ttest_pvalues <- c(t.test(high, mod)$p.value,
                   t.test(high, low)$p.value,
                   t.test(mod, low)$p.value)

Based on these examples of median variability among trajectories of low, moderate and high initial recognition, we can now consider how regular or irregular the trajectories are at the group level. Figure \@ref(fig:boxplot-initial-careers) displays boxplots of the standard deviations of the trajectories in each initial reputation class. On the one hand, the boxes, which contain the middle 50% of the observations, are not located at completely different levels of the y-axis. Hence, there is a considerable number of trajectories with the same or similar standard deviation in all groups. On the other hand, the positions of the boxes suggest that the standard deviations of these groups are not the same either. Indeed, three pairwise two-sample t-tests (Welch) confirm that the averages of the standard deviations, illustrated by the grey dots, are significantly different from each other (p `r ttest_pvalues %>% ifelse(all(. < .01), "< 0.01", .)`).
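The group comparison can be sketched as follows. Note that base R's `t.test()` with two vectors runs an unpaired Welch two-sample test by default (`paired = FALSE`, `var.equal = FALSE`), which is the appropriate choice here because the three classes contain different artists. The values below are simulated for illustration only and do not reproduce the thesis results; the group means are arbitrary assumptions.

```r
set.seed(1)  # simulated standard deviations, NOT the thesis data

# Hypothetical per-artist standard deviations for each initial reputation class
sd_by_class <- list(
  low      = rnorm(200, mean = 24, sd = 4),
  moderate = rnorm(200, mean = 21, sd = 4),
  high     = rnorm(200, mean = 18, sd = 4)
)

# All pairs of classes: low/moderate, low/high, moderate/high
class_pairs <- combn(names(sd_by_class), 2, simplify = FALSE)

# Welch two-sample t-test for every pair, keeping only the p-value
p_values <- sapply(class_pairs, function(p) {
  t.test(sd_by_class[[p[1]]], sd_by_class[[p[2]]])$p.value
})
names(p_values) <- sapply(class_pairs, paste, collapse = " vs ")

p_values
```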

career_paths <- career_paths %>% 
  #select(- var,- eigen_val, -first_exh_yr,- rank_past_all, -venue_percentile) %>% 
  left_join(ranked_nodes %>% 
              select(from = name, year, 
                     from_venue_perc = venue_percentile, 
                     from_venue_decile= venue_decile,
                     from_venue_quintile= venue_quintile,
                     from_venue_status = venue_status), 
            by = c("from", "exh_start_Y_from" = "year")) %>% 
  left_join(ranked_nodes %>% 
              select(to = name, year, 
                     to_venue_perc = venue_percentile, 
                     to_venue_decile= venue_decile,
                     to_venue_quintile = venue_quintile,
                     to_venue_status = venue_status),
            by = c("to", "exh_start_Y_to" = "year")) %>% 
  mutate_at(vars(contains("venue_status")), 
            ~ factor(., levels = c("lower\n40%", "mid 41%\nto 80%","top\n20%")))

groups_exh_stats <- career_paths %>% 
  #filter(career_phase != "first_5_exh") %>% 
  filter(!is.na(venue_status)) %>% 
  group_by(initial_rep_fraib, venue_status) %>% 
  summarise(exh = n()) %>% 
  group_by(initial_rep_fraib) %>% 
  mutate(n = sum(exh), 
         perc = exh / n) 

 artists_per_repgroup <- career_paths %>% 
  distinct(artist_id, .keep_all = T) %>% 
  group_by(initial_rep_fraib) %>% 
  summarise(artists = n())

venues_per_statusgroup <- ranked_nodes %>% 
  group_by(year, venue_status) %>% 
  summarise(venues = n())

venue_stats <- career_paths %>% 
    group_by(from_venue_status, to_venue_status) %>% 
    summarise(exh = n()) %>% 
    na.omit() %>% 
    group_by(from_venue_status) %>% 
    mutate(n = sum(exh), 
           perc = exh / n) %>% 
    ungroup()

  groups_exh_stats %>%
    ungroup() %>% 
    mutate_at(vars(initial_rep_fraib), 
              ~ str_replace(., " ", "\n") %>% factor(levels = c("low\ninitial","moderate\ninitial","high\ninitial"))) %>% 
    ggplot(aes(initial_rep_fraib, perc, fill = venue_status)) +
    geom_col() +
    scale_fill_manual(values = c(
      wes_palette("Rushmore1")[3],
      wes_palette("Darjeeling2")[2],
      wes_palette("Rushmore1")[5]
      ), labels = c("lower 40%", "mid 41% to 80%","top 20%")
    ) +
    theme_bw() +
    labs(x = NULL, fill = "Status Exhibition Venue", y = NULL) + 

    scale_y_continuous(breaks = seq(0,1,0.2)) +
    theme(legend.title = element_text(size = 10))

ggpubr::ggarrange(
  groups_exh_stats %>%
    ungroup() %>% 
    mutate_at(vars(initial_rep_fraib), 
              ~ str_replace(., " ", "\n") %>% factor(levels = c("low\ninitial","moderate\ninitial","high\ninitial"))) %>% 
    ggplot(aes(initial_rep_fraib, perc, fill = venue_status)) +
    geom_col() +
    # scale_fill_manual(values = c(wes_palette("Rushmore1")[5],
    #                              wes_palette("Rushmore1")[3], 
    #                              wes_palette("Darjeeling2")[2] 
    #                              ), 
    #                   labels = c("top 20%", "mid 41% to 80%", "lower 40%")) +
    theme_bw() +
    labs(x = NULL, fill = "Status Exhibition Venue", y = NULL) + 

    scale_y_continuous(breaks = seq(0,1,0.2)) +
    theme(legend.title = element_text(size = 10)),

  # circulation is a stronger argument
  venue_stats %>% 
    mutate_at(vars(from_venue_status), 
              ~ factor(., levels = c("lower\n40%", "mid 41%\nto 80%", "top\n20%"))) %>% 

    ggplot(aes(from_venue_status, perc, fill = to_venue_status)) +
    geom_col() +
    # scale_fill_manual(values = c(wes_palette("Rushmore1")[5],
    #                              wes_palette("Rushmore1")[3], 
    #                              wes_palette("Darjeeling2")[2] 
    #                              ), 
    #                   labels = c("top 20%", "mid 41% to 80%", "lower 40%")) +
    theme_bw() +
    labs(x = NULL, fill = "Status Exhibition Venue", y = NULL) + 

    scale_y_continuous(breaks = seq(0,1,0.2)) +
    theme(legend.title = element_text(size = 10)),

  ncol = 2, common.legend = T, legend = "bottom", labels = "auto", font.label = list(size = 11) #hjust = 0.06#, vjust = 0.85
) %>% 
  ggpubr::annotate_figure(left = ggpubr::text_grob("Percentage", rot = 90, size = 11))

Differences are most evident when comparing high and low initial trajectories in figure \@ref(fig:exh-status-order-rep-class). In fact, `r filter(groups_exh_stats, initial_rep_fraib=="high initial", venue_status=="top\n20%") %>% {.$perc *100} %>% round()`% of the exhibitions of the high initial group take place in the top 20% of exhibition venues. In addition, works of these artists were displayed in the lower 40% in only `r filter(groups_exh_stats, initial_rep_fraib=="high initial", venue_status=="lower\n40%") %>% {.$perc*100} %>% round()`% of the cases. In other words, more than every second exhibition of the high initial artists takes place in the top 20% and only every 17th in the bottom 40%. This finding suggests that the works of these artists are not only exhibited at more select locations on average. Above all, they are distinguished by their absence from certain places. The inverse applies to low initial trajectories. No less than about 40% of the exhibitions take place in either moderate or low status venues. Despite the upward mobility indicated by figures \@ref(fig:plot-career-quintiles)a and \@ref(fig:plot-career-years)a, the works of these artists are shown in the top 20% of exhibition venues in only `r filter(groups_exh_stats, initial_rep_fraib=="low initial", venue_status=="top\n20%") %>% {.$perc*100} %>% round()`% of cases.

In conclusion, we can reject the hypothesis of similar variability of low and high initial trajectories. This does not imply that belonging to one of these reputation classes strictly excludes exhibitions in other segments of the status order. However, the trajectories of these initial reputation classes differ significantly on average. Moreover, high initial artists are also very unlikely to exhibit in low-profile venues. Conversely, low initial artists are most likely to exhibit in venues of lower or moderate prestige.

# tibble() replaces the deprecated data_frame()
from <- tibble(name = career_paths$from %>% as.character())
to <- tibble(name = career_paths$to %>% as.character()) 

career_nodes <- bind_rows(from, to) %>% pull(name) 

gcareer <- induced_subgraph(galsimple, career_nodes) %>% 
  as_tbl_graph() %>% 
  activate(edges) %>% 
  filter(exh_start_Y_from <= 2015, exh_start_Y_from >= 1975,
         exh_start_Y_to <= 2015, exh_start_Y_to >= 1975)


igraph::as_data_frame(gcareer, "edges") %>% View()


lc <- igraph::largest_cliques(gcareer)[[1]] 

gs1 <- induced_subgraph(graph = gcareer, vids = lc) %>% 
  as.undirected()

#par(mfrow=c(1,2)) # To plot two plots side-by-side
plot(gs1,
     vertex.label = V(gs1)$name_exhplace,
     vertex.label.color = "black", 
     vertex.label.cex = 0.9,
     vertex.size = 0,
     edge.color = 'gray28',
     main = "Largest Clique",
     layout = layout.circle(gs1)
)


clique_nodes <- ranked_nodes %>% filter(name %in% names(lc)) 

clique_nodes %>%  
  summarise(mean(venue_percentile))

Types of moves {#types-of-moves}

career_paths <- career_paths %>% 
  mutate(occ_up =  exh_type_from == "group" & exh_type_to == "solo",
         occ_down= exh_type_from == "solo"  & exh_type_to == "group", 
         occ_equal=exh_type_from == exh_type_to,

         org_up =  from_venue_perc < to_venue_perc, 
         org_down= from_venue_perc > to_venue_perc,
         org_equal=from_venue_perc ==to_venue_perc,

         both_up = occ_up & org_up,
         both_down= occ_down & org_down, 

         occ_for_org = occ_up & org_down,

         org_for_occ = occ_down & org_up)

table_down_up_solo <- career_paths %>% 
  filter(exh_type_to == "solo") %>% 
  mutate(down_and_solo = exh_type_to == "solo" & from_venue_perc > to_venue_perc, 
         up_and_solo   = exh_type_to == "solo" & from_venue_perc < to_venue_perc) %>% 
  summarise(perc_down_and_solo = mean(down_and_solo, na.rm = T), 
            perc_up_and_solo = mean(up_and_solo,   na.rm = T)) %>% 
  mutate_at(vars(contains("perc")), ~ round(. * 100, 1))

solo_group_stats <- career_paths %>%
  group_by(exh_type_from) %>% 
  summarise(solo_group = n(), 
            avg_percentile = mean(venue_percentile)) %>% 
  mutate_if(is.numeric, round)

Is it possible to distinguish successful and less successful artists by the type of moves throughout their careers? This section examines whether the finding of Borkenhagen and Martin [-@borkenhagen2018, 18] applies to the context of the visual arts as well. In their study, the authors observe that future top chefs favor "organizational status" over "occupational status" in the early years of their careers. In contrast, less successful individuals "are much less likely to make initial investments in organizational status". Rather, they tend to "privilege occupational status over organizational status".

Importantly, as discussed in more detail in section \@ref(career-mobility), the authors examine the movements that directed individuals to certain positions. By contrast, this thesis focuses on the development of careers based on the individuals' starting position. Therefore, in view of the results reported so far, this section explores why the trajectories tend to converge given the initial differences (see figures \@ref(fig:plot-career-quintiles)a and \@ref(fig:plot-career-years)a). In the case of the high and moderate initial trajectories, the decline of initial inequalities in artistic recognition could be related to movements toward solo exhibitions in lower status venues (organizational for occupational status). Put differently, the artists favor significant exhibitions over prestigious exhibition places. Conversely, the rise of the low initial trajectories might be associated with a preference for group exhibitions in highly regarded venues over solo shows somewhere less prestigious (occupational for organizational status). In other words, are there patterns of movement that indicate how artists improve, maintain or lose their status?
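The two kinds of trade-off can be illustrated by classifying consecutive exhibition pairs. The sketch below uses base R on a hypothetical data frame `moves`; the column names mirror those used for the career paths in this chapter, while the values and the descriptive labels in `move_type` are my own illustrative choices.

```r
# Hypothetical consecutive exhibition pairs ("from" -> "to"), for illustration only
moves <- data.frame(
  exh_type_from   = c("group", "solo", "group", "solo"),
  exh_type_to     = c("solo", "group", "solo", "solo"),
  from_venue_perc = c(85, 40, 50, 70),
  to_venue_perc   = c(60, 75, 80, 70),
  stringsAsFactors = FALSE
)

# Occupational dimension: exhibition format (group -> solo counts as a gain)
occ_up   <- moves$exh_type_from == "group" & moves$exh_type_to == "solo"
occ_down <- moves$exh_type_from == "solo"  & moves$exh_type_to == "group"

# Organizational dimension: prestige of the venue (higher percentile is a gain)
org_up   <- moves$from_venue_perc < moves$to_venue_perc
org_down <- moves$from_venue_perc > moves$to_venue_perc

# Label each move; anything without a change on both dimensions is "other"
moves$move_type <- ifelse(occ_up & org_down, "solo show at a less prestigious venue",
                   ifelse(occ_down & org_up, "group show at a more prestigious venue",
                   ifelse(occ_up & org_up, "gain on both dimensions",
                   ifelse(occ_down & org_down, "loss on both dimensions", "other"))))

table(moves$move_type)
```

Comparing the shares of the first two move types across the initial reputation classes is one way to check whether the expected trade-offs show up in the data.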

In order to test this hypothesis, it is necessary to first examine whether solo exhibitions tend to take place in less prestigious locations. In other words, the exhibition format and the significance of the places have to involve a trade-off. Indeed, the prestige of the venues hosting solo exhibitions is lower than that of the venues hosting group exhibitions. More precisely, the `r filter(solo_group_stats, exh_type_from=="solo") %>% pull(solo_group)` solo exhibitions in the dataset occur in the `r filter(solo_group_stats, exh_type_from=="solo") %>% pull(avg_percentile)`th percentile of exhibition venues on average. In contrast, the `r filter(solo_group_stats, exh_type_from=="group") %>% pull(solo_group)` group exhibitions happen, on average, in the `r filter(solo_group_stats, exh_type_from=="group") %>% pull(avg_percentile)`th percentile. Congruently, table \@ref(tab:summary-stats-solo-group) shows that the prestige of the exhibition venues also differs between solo and group exhibitions within each of the low, moderate and high initial trajectories.

solo_group_stats %>%      
  group_by(initial_rep_fraib, exh_type_from) %>% 
  summarise(mean = mean(exh_count_sg), 
            min = min(exh_count_sg), 
            max = max(exh_count_sg)) %>% 
  mutate_if(is.numeric, ~ round(.,1))

career_paths %>%
  group_by(initial_rep_fraib, exh_type_from) %>% 
  summarise(solo_group = n(), 
            avg_percentile = mean(venue_percentile)) %>% 
  group_by(initial_rep_fraib) %>% 
  mutate(n = sum(solo_group), 
         perc = (solo_group / n) * 100) %>% 
  mutate_if(is.numeric, ~ round(., 1)) %>% 
  arrange(initial_rep_fraib, n) %>% 
  select(initial_rep_fraib, exh_type_from, solo_group, perc, avg_percentile) %>%  
  kable(x = ., 
        caption = "Summary statistics of solo and group exhibitions for low, moderate and high initial trajectories", 
        caption.short = "Summary statistics of solo and group exhibitions",
        col.names = c("Trajectories", "Type", "Exhibition\nCount", "Percentage", "Average\nPercentile"), 
        booktabs = T, 
        align = c(rep("l",2), rep("c", 3)),
        escape = F) %>% 
  collapse_rows(1) %>% 
  kable_styling(bootstrap_options = "striped")

career_paths_st_years <- career_paths %>% 
  mutate(from_vstatus = case_when(from_venue_perc <= 40 ~ 40,
                                  from_venue_perc >= 41 & from_venue_perc <= 80 ~ 80,
                                  from_venue_perc >= 81 ~ 100),
         to_vstatus = case_when(to_venue_perc <= 40 ~ 40,
                                to_venue_perc >= 41 & to_venue_perc <= 80 ~ 80,
                                to_venue_perc >= 81 ~ 100)
         ) %>% 
  mutate(occ_up =  exh_type_from == "group" & exh_type_to == "solo",
         occ_down= exh_type_from == "solo"  & exh_type_to == "group", 
         occ_equal=exh_type_from == exh_type_to,

         org_up =  from_vstatus < to_vstatus, 
         org_down= from_vstatus > to_vstatus,
         org_equal=from_vstatus ==to_vstatus,

         both_up = occ_up & org_up,
         both_down= occ_down & org_down, 

         occ_for_org = occ_up & org_down,

         org_for_occ = occ_down & org_up) 


moves_st_years <- career_paths_st_years %>% 
  filter(!is.na(from_venue_perc), !is.na(to_venue_perc),
         !is.na(exh_type_from), !is.na(exh_type_to)) %>% 
  group_by(initial_rep_fraib, career_yrs
           ) %>% 
  summarise(n = n(),

            occ_up_ = sum(occ_up, na.rm=T) / n, 
            occ_down_ = sum(occ_down, na.rm=T) / n,
            occ_equal_= sum(occ_equal, na.rm = T) / n,

            org_up_ = sum(org_up, na.rm=T) / n,
            org_down_ = sum(org_down, na.rm=T) / n,
            org_equal_ = sum(org_equal, na.rm=T) / n, 

            occ_for_org_=sum(occ_up & org_down, na.rm=T) /n,
            org_for_occ_=sum(org_up & occ_down, na.rm=T) /n,

            both_up_ = sum(both_up, na.rm = T) /n ,
            both_down_ = sum(both_down, na.rm = T) / n)  %>% 
  ungroup() %>% 
  mutate_at(vars(matches("_$")), ~ round(., 2)) %>% 
  mutate(ratio = round(occ_for_org_ / org_for_occ_, 2), 
         initial_rep_fraib = str_replace(initial_rep_fraib," ", "\n") %>% factor(levels = c("low\ninitial", "moderate\ninitial", "high\ninitial")))

bind_rows(
  moves_st_years %>% 
    mutate_if(is.numeric, as.character) %>% 
    select(initial_rep_fraib, career_yrs, occ_for_org_) %>% 
    spread(career_yrs, occ_for_org_) %>% 
    mutate(var = "Occupational for\norganizational"),


  moves_st_years %>% 
    mutate_if(is.numeric, as.character) %>% 
    select(initial_rep_fraib, career_yrs, org_for_occ_) %>% 
    spread(career_yrs, org_for_occ_) %>% 
    mutate(var = "Organizational for\noccupational"),

  moves_st_years %>% 
    mutate(ratio = cell_spec(ratio, "latex", bold = T)) %>% 
    select(initial_rep_fraib, career_yrs, ratio) %>% 
    spread(career_yrs, ratio) %>% 
    mutate(var = cell_spec("Ratio", "latex", bold = T))
) %>% 
  arrange(initial_rep_fraib) %>% 
  select(initial_rep_fraib, var, everything()) %>% 
  mutate_all(linebreak) %>% 
  mutate_all(~ replace_na(., "-") %>% str_replace("(\\\\)+textbf[{]NaN[}]", "-")) %>% 



  kable(format = "latex", booktabs = T, escape = F, 
        caption.short = "Transitions to solo and group exhibitions",
        caption = "Transitions to solo and group exhibitions in venues of the top 20, lower 40 or intermediate percentiles",
        align = c("l", "l", rep("c", 8)),
        col.names = c("", "", as.character(unique(moves_st_years$career_yrs)))) %>%
  add_header_above(c("", "", "Career Years" = 8)) %>% 
  collapse_rows(1)

moves_st_years %>% 
  select(initial_rep_fraib, career_yrs, n) %>% 
  spread(career_yrs, n) %>% 
  # convert counts to character first so that "-" is a type-compatible replacement
  mutate_if(is.numeric, ~ replace_na(as.character(.), "-")) %>% 
  mutate_all(linebreak) %>% 
  kable(format = "latex", booktabs = T, escape = F, 
        caption.short = "Summary statistics on transitions to solo and group exhibitions",
        caption = "Number of transitions to solo and group exhibitions in venues of the top 20, lower 40 or intermediate percentiles", 
        align = c("l", rep("c", 8)),
        col.names = c("", as.character(unique(moves_st_years$career_yrs)))) %>%
  add_header_above(c("", "Career Years" = 8)) %>% 
  collapse_rows(1)

Borkenhagen and Martin [-@borkenhagen2018, 17] find that moves exchanging occupational status for organizational status are very common, accounting for about a fifth of all transitions in their dataset. In the case of exhibition participations, this type of move represents no more than `r round(mean(career_paths_st_years$occ_for_org, na.rm=T) * 100, 2)`% of all transitions between venues. Similarly, the transitions of artists that involve an exchange of organizational for occupational status account for `r round(mean(career_paths_st_years$org_for_occ, na.rm=T) * 100, 2)`% of all moves. For their argument, however, it is more important to consider when these movements occur than how common they are. In order to demonstrate the difference between successful and less successful trajectories, the authors compare the proportions of the two types of moves throughout the careers of the individuals. More specifically, they compute "a ratio of the proportion of moves [...] that involve trading occupational-for-organizational status to the proportion that involves the reverse trade" (ibid., 18). In the context of exhibition participations, a ratio greater than one implies a higher proportion of transitions from group to solo exhibitions, but to less prestigious venues, i.e. a preference for occupational over organizational status. In contrast, the ratio is smaller than one when solo exhibitions are more frequently followed by a group show, but in more prestigious venues, that is, when organizational status is favored over occupational status. Table \@ref(tab:table-moves-years-status-categories) presents the result of this approach.[^status_categories] For an overall picture, I will first relate the movements involving one of these trades to figure \@ref(fig:plot-career-years)a, which shows the average trajectories of the artists of low, moderate and high initial recognition.
I will then review whether there are moves in certain phases of their careers that are associated with the progression of the average trajectories.
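
The ratio described above can be sketched on a handful of made-up transitions. The data frame `toy_moves` and its values are hypothetical and serve only to illustrate the calculation; the column names and the three status segments (40, 80, 100) follow the code used for the table, and base R is used so that the sketch has no package dependencies.

```r
# Hypothetical toy transitions (not taken from the thesis dataset).
# Venue status is coded by segment: 40 = lower 40, 80 = intermediate, 100 = top 20.
toy_moves <- data.frame(
  exh_type_from = c("group", "group", "solo",  "solo",  "group", "solo"),
  exh_type_to   = c("solo",  "solo",  "group", "group", "group", "solo"),
  from_vstatus  = c(100,     80,      40,      100,     40,      100),
  to_vstatus    = c(80,      40,      80,      80,      80,      100),
  stringsAsFactors = FALSE
)

# Occupational status: up = group -> solo, down = solo -> group
occ_up   <- toy_moves$exh_type_from == "group" & toy_moves$exh_type_to == "solo"
occ_down <- toy_moves$exh_type_from == "solo"  & toy_moves$exh_type_to == "group"

# Organizational status: up/down = move to a higher/lower venue segment
org_up   <- toy_moves$from_vstatus < toy_moves$to_vstatus
org_down <- toy_moves$from_vstatus > toy_moves$to_vstatus

# The two trade-offs, named as in the analysis code
occ_for_org <- occ_up & org_down    # solo show gained, venue prestige given up
org_for_occ <- occ_down & org_up    # venue prestige gained, solo show given up

# Proportions of all moves, and their ratio
prop_occ_for_org <- mean(occ_for_org)
prop_org_for_occ <- mean(org_for_occ)
trade_ratio <- prop_occ_for_org / prop_org_for_occ

print(trade_ratio)
```

In this toy example, two of the six moves trade occupational for organizational status and one makes the reverse trade, so the ratio is 2.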

[^status_categories]: Note that the categories of the top 20, lower 40 and intermediate percentiles of the exhibition venues were used to calculate the proportions of transitions involving one of the two trade-offs. For example, all movements that lead from an exhibition in the top 20 percentiles to one in the lower 40 percentiles have been included. In contrast, movements between exhibitions in the same segment of the status order were excluded. An analysis that also took into account the movements between individual percentiles, for example between the 90th percentile and the 91st, led to similar results.

In the first 10 years of the artists' careers the ratios reflect the trend of the trajectories displayed in figure \@ref(fig:plot-career-years)a. Artists of low and moderate initial recognition gain access to more prestigious venues, as the upward tendency of their average trajectories indicates. This coincides with more transitions from solo to group exhibitions in higher status venues, depicted by ratios smaller than one. Conversely, the ratio of 1.2 for the high initial artists shows that these artists are moving to less prestigious venues but attain more solo exhibitions. This conforms with the decline of their average trajectory in figure \@ref(fig:plot-career-years)a.\newline For the low initial artists the ratios are even smaller from year 21 to 30 compared to the beginning of their careers. In this phase of their careers, these artists are therefore more likely to move toward a group exhibition in a more prestigious venue than toward a solo exhibition in a lower-profile venue. At the same time, their average trajectory in figure \@ref(fig:plot-career-years)a varies considerably and does not indicate a particular trend in these years. Additionally, the ratio for the 11th to 15th year actually contrasts with their average trajectory in this career phase. However, as table \@ref(tab:table-number-of-transitions) suggests, conclusions on the moves of low initial trajectories are restricted by the more limited number of observations, especially in later phases of their careers.

Are there particular patterns of movement that relate to the progression of the artists' careers? First, it is striking that in later phases of their careers the ratios of the high and moderate initial artists are equal to one. This implies that the proportions of movements involving a trade-off are in fact the same. Admittedly, in the case of the high initial artists, the tendency to favor solo exhibitions in less prestigious venues in the early years coincides with the general downward trend in the prestige of their exhibition venues. Ascending even further, though, is understandably more difficult given an already high starting position. Moreover, the proportions resulting in a ratio of 1.2 differ by only one percentage point. Thus, in the case of the high initial artists, the table indicates no specific pattern of movement. In other words, no rationale can be derived to explain why the prestige of exhibition venues diminishes on average over the course of their careers.
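
As a worked check of the point about the ratio of 1.2, the two underlying proportions can be recovered under two assumptions: that the ratio is formed from proportions rounded to two decimals (as in the code that builds the table), and that the gap between them is exactly one percentage point. The symbols $p_{occ}$ and $p_{org}$ are notation introduced here for the share of occupational-for-organizational moves and the share of the reverse trade:

$$
\frac{p_{occ}}{p_{org}} = 1.2, \quad p_{occ} - p_{org} = 0.01 \;\Rightarrow\; 0.2\,p_{org} = 0.01 \;\Rightarrow\; p_{org} = 0.05,\; p_{occ} = 0.06
$$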

A similar conclusion can be drawn for the artists of moderate initial recognition. It is true that the long-term course of their average trajectory coincides with the way they balance the two trade-offs: they initially favor organizational over occupational status and also have long-term access to higher prestige venues. However, the proportions of moves that include one of the two trade-offs also differ by only one percentage point.\newline Although these proportions differ by up to 0.3 percentage points for the low initial artists in the last years of their documented careers, the limited number of observations in these career phases does not allow a substantiated conclusion to be drawn (see table \@ref(tab:table-number-of-transitions)). The proportions in the earlier years of their careers, similar to those of the moderate and high initial artists, are all close to the overall percentage of both types of moves (`r round(mean(career_paths_st_years$occ_for_org, na.rm=T) * 100, 2)`% and `r round(mean(career_paths_st_years$org_for_occ, na.rm=T) * 100, 2)`%, see above). In summary, I therefore reject the hypothesis that patterns of movement differ between successful and less successful careers.

moves <- career_paths %>% 
  filter(!is.na(from_venue_perc), !is.na(to_venue_perc),
         !is.na(exh_type_from), !is.na(exh_type_to)) %>% 
  group_by(initial_rep_fraib, career_phase_quintiles
           ) %>% 
  summarise(n = n(),

            occ_up_ = sum(occ_up, na.rm=T) / n, 
            occ_down_ = sum(occ_down, na.rm=T) / n,
            occ_equal_= sum(occ_equal, na.rm = T) / n,

            org_up_ = sum(org_up, na.rm=T) / n,
            org_down_ = sum(org_down, na.rm=T) / n,
            org_equal_ = sum(org_equal, na.rm=T) / n, 

            occ_for_org_=sum(occ_up & org_down, na.rm=T) /n,
            org_for_occ_=sum(org_up & occ_down, na.rm=T) /n,

            both_up_ = sum(both_up, na.rm = T) /n ,
            both_down_ = sum(both_down, na.rm = T) / n)  %>% 
  ungroup() %>% 
  mutate_at(vars(dplyr::matches("_$")), ~ round(., 2)) %>% 
  mutate(ratio = round(occ_for_org_ / org_for_occ_, 2), 
         initial_rep_fraib = str_replace(initial_rep_fraib," ", "\n") %>% factor(levels = c("low\ninitial", "moderate\ninitial", "high\ninitial")))

bind_rows(
  moves %>% 
    mutate_if(is.numeric, as.character) %>% 
    select(initial_rep_fraib, career_phase_quintiles, occ_for_org_) %>% 
    spread(career_phase_quintiles, occ_for_org_) %>% 
    mutate(var = "Occupational for\norganizational"),


  moves %>% 
    mutate_if(is.numeric, as.character) %>% 
    select(initial_rep_fraib, career_phase_quintiles, org_for_occ_) %>% 
    spread(career_phase_quintiles, org_for_occ_) %>% 
    mutate(var = "Organizational for\noccupational"),

  moves %>% 
    mutate(ratio = cell_spec(ratio, "latex", bold = T)) %>% 
    select(initial_rep_fraib, career_phase_quintiles, ratio) %>% 
    spread(career_phase_quintiles, ratio) %>% 
    mutate(var = cell_spec("Ratio", "latex", bold = T))
) %>% 
  arrange(initial_rep_fraib) %>% 
  select(initial_rep_fraib, var, everything()) %>% 
  mutate_all(linebreak) %>% 



  kable(format = "latex", booktabs = T, escape = F, 
        caption = "Transitions in the status order of exhibition venues (percentiles) and between solo and group exhibitions", 
        align = c("l", "l", rep("c", 5)),
        col.names = c("", "", paste0(seq(1,5,1)))) %>%
  add_header_above(c("", "", "Career Phase" = 5)) %>% 
  collapse_rows(1)

moves_years_perc <- career_paths %>% 
  filter(!is.na(from_venue_perc), !is.na(to_venue_perc),
         !is.na(exh_type_from), !is.na(exh_type_to)
         ) %>% 
  group_by(initial_rep_fraib, career_yrs
           ) %>% 
  summarise(n = n(),

            occ_up_ = sum(occ_up, na.rm=T) / n, 
            occ_down_ = sum(occ_down, na.rm=T) / n,
            occ_equal_= sum(occ_equal, na.rm = T) / n,

            org_up_ = sum(org_up, na.rm=T) / n,
            org_down_ = sum(org_down, na.rm=T) / n,
            org_equal_ = sum(org_equal, na.rm=T) / n, 

            occ_for_org_=sum(occ_up & org_down, na.rm=T) /n,
            org_for_occ_=sum(org_up & occ_down, na.rm=T) /n,

            both_up_ = sum(both_up, na.rm = T) /n ,
            both_down_ = sum(both_down, na.rm = T) / n)  %>% 
  ungroup() %>% 
  mutate_at(vars(dplyr::matches("_$")), ~ round(., 2)) %>% 
  mutate(ratio = round(occ_for_org_ / org_for_occ_, 2), 
         initial_rep_fraib = str_replace(initial_rep_fraib," ", "\n") %>% factor(levels = c("low\ninitial", "moderate\ninitial", "high\ninitial")))


bind_rows(
  moves_years_perc %>% 
    mutate_if(is.numeric, as.character) %>% 
    select(initial_rep_fraib, career_yrs, occ_for_org_) %>% 
    spread(career_yrs, occ_for_org_) %>% 
    mutate(var = "Occupational for\norganizational"),


  moves_years_perc %>% 
    mutate_if(is.numeric, as.character) %>% 
    select(initial_rep_fraib, career_yrs, org_for_occ_) %>% 
    spread(career_yrs, org_for_occ_) %>% 
    mutate(var = "Organizational for\noccupational"),

  moves_years_perc %>% 
    mutate(ratio = cell_spec(ratio, "latex", bold = T)) %>% 
    select(initial_rep_fraib, career_yrs, ratio) %>% 
    spread(career_yrs, ratio) %>% 
    mutate(var = cell_spec("Ratio", "latex", bold = T))
) %>% 
  arrange(initial_rep_fraib) %>% 
  select(initial_rep_fraib, var, everything()) %>% 
  mutate_all(~ replace_na(., "-")) %>% 
  mutate_all(linebreak) %>% 




  kable(format = "latex", booktabs = T, escape = F, 
        caption = "Transitions in the status order of exhibition venues (percentiles) and between solo and group exhibitions by career years", 
        align = c("l", "l", rep("c", 8)),
        col.names = c("", "", as.character(unique(moves_years_perc$career_yrs)))) %>%
  add_header_above(c("", "", "Career Years" = 8)) %>% 
  collapse_rows(1)

career_paths_st_quintiles <- career_paths %>% 
  mutate(from_vstatus = case_when(from_venue_perc <= 40 ~ 40,
                                  from_venue_perc >= 41 & from_venue_perc <= 80 ~ 80,
                                  from_venue_perc >= 81 ~ 100),
         to_vstatus = case_when(to_venue_perc <= 40 ~ 40,
                                to_venue_perc >= 41 & to_venue_perc <= 80 ~ 80,
                                to_venue_perc >= 81 ~ 100)
         ) %>% 
  mutate(occ_up =  exh_type_from == "group" & exh_type_to == "solo",
         occ_down= exh_type_from == "solo"  & exh_type_to == "group", 
         occ_equal=exh_type_from == exh_type_to,

         org_up =  from_vstatus < to_vstatus, 
         org_down= from_vstatus > to_vstatus,
         org_equal=from_vstatus ==to_vstatus,

         both_up = occ_up & org_up,
         both_down= occ_down & org_down, 

         occ_for_org = occ_up & org_down,

         org_for_occ = occ_down & org_up) 


moves_st_quintiles <- career_paths_st_quintiles %>% 
  filter(!is.na(from_venue_perc), !is.na(to_venue_perc),
         !is.na(exh_type_from), !is.na(exh_type_to)) %>% 
  group_by(initial_rep_fraib, career_phase_quintiles
           ) %>% 
  summarise(n = n(),

            occ_up_ = sum(occ_up, na.rm=T) / n, 
            occ_down_ = sum(occ_down, na.rm=T) / n,
            occ_equal_= sum(occ_equal, na.rm = T) / n,

            org_up_ = sum(org_up, na.rm=T) / n,
            org_down_ = sum(org_down, na.rm=T) / n,
            org_equal_ = sum(org_equal, na.rm=T) / n, 

            occ_for_org_=sum(occ_up & org_down, na.rm=T) /n,
            org_for_occ_=sum(org_up & occ_down, na.rm=T) /n,

            both_up_ = sum(both_up, na.rm = T) /n ,
            both_down_ = sum(both_down, na.rm = T) / n)  %>% 
  ungroup() %>% 
  mutate_at(vars(matches("_$")), ~ round(., 2)) %>% 
  mutate(ratio = round(occ_for_org_ / org_for_occ_, 2), 
         initial_rep_fraib = str_replace(initial_rep_fraib," ", "\n") %>% factor(levels = c("low\ninitial", "moderate\ninitial", "high\ninitial")))

bind_rows(
  moves_st_quintiles %>% 
    mutate_if(is.numeric, as.character) %>% 
    select(initial_rep_fraib, career_phase_quintiles, occ_for_org_) %>% 
    spread(career_phase_quintiles, occ_for_org_) %>% 
    mutate(var = "Occupational for\norganizational"),


  moves_st_quintiles %>% 
    mutate_if(is.numeric, as.character) %>% 
    select(initial_rep_fraib, career_phase_quintiles, org_for_occ_) %>% 
    spread(career_phase_quintiles, org_for_occ_) %>% 
    mutate(var = "Organizational for\noccupational"),

  moves_st_quintiles %>% 
    mutate(ratio = cell_spec(ratio, "latex", bold = T)) %>% 
    select(initial_rep_fraib, career_phase_quintiles, ratio) %>% 
    spread(career_phase_quintiles, ratio) %>% 
    mutate(var = cell_spec("Ratio", "latex", bold = T))
) %>% 
  arrange(initial_rep_fraib) %>% 
  select(initial_rep_fraib, var, everything()) %>% 
  mutate_all(linebreak) %>% 



  kable(format = "latex", booktabs = T, escape = F, 
        caption = "Transitions in the status order of exhibition venues (lower 40, intermediate 40, top 20 percent) and between solo and group exhibitions", 
        align = c("l", "l", rep("c", 5)),
        col.names = c("", "", paste0(seq(1,5,1)))) %>%
  add_header_above(c("", "", "Career Phase" = 5)) %>% 
  collapse_rows(1)

Limitations {#limitations}

In this thesis, the careers of artists were examined within the limits of a descriptive approach to exhibition data. No particular statistic was derived to estimate the effects of certain events in the artists' careers on their further trajectories. Instead, the focus was on exploring the progression of the careers given that the artists began in exhibition venues of different prestige. As exhibitions at less notable venues tend to be documented less frequently, the artists' first exhibitions in particular may not have been recorded. Supplementing data on the first exhibitions would therefore enhance the validity of the results presented in chapter \@ref(artist-careers). At the same time, some aspects of the compiled dataset have not yet been considered in detail. For example, it was not examined whether the transitions between exhibition venues depend on the geographical location of the venues and the artists' countries of origin. Moreover, the relational dimension of the dataset was primarily used to operationalize the prestige of the exhibition venues. In further investigations, the co-exhibition network could also be analyzed through visualizations.

career_paths %>% 
  group_by(career_phase_quintiles, initial_rep_fraib, artist_exh_activity) %>% 
  summarise(avg_perc = mean(venue_percentile)) %>% 
  ggplot(aes(career_phase_quintiles, avg_perc , color = initial_rep_fraib, group = initial_rep_fraib)) +
  geom_line() + 
  theme_bw() +
  theme(legend.position = "bottom") +
  facet_grid(~ artist_exh_activity)

ggpubr::ggarrange(
  career_paths %>% 
    group_by(career_phase_quintiles, initial_rep_fraib, exh_type_from) %>% 
    summarise(exh_count = n()) %>% # exhibitions per venue per repclass per trajectory decile
    group_by(career_phase_quintiles, initial_rep_fraib) %>% 
    mutate(sum_exh_decile_repclass = sum(exh_count), 
           perc_exh_decile_repclass = exh_count / sum_exh_decile_repclass) %>% 

    ggplot(aes(career_phase_quintiles, perc_exh_decile_repclass, fill = exh_type_from)) +
      geom_col() + 
      facet_grid(~ initial_rep_fraib) +
      labs(fill = "Exhibition\ntype", x = NULL) +
      theme_bw() + theme(legend.position = "bottom"), 

  career_paths %>% 
    group_by(career_phase_quintiles, end_rep_fraib, exh_type_from) %>% 
    summarise(exh_count = n()) %>% # exhibitions per venue per repclass per trajectory decile
    group_by(career_phase_quintiles, end_rep_fraib) %>% 
    mutate(sum_exh_decile_repclass = sum(exh_count), 
           perc_exh_decile_repclass = exh_count / sum_exh_decile_repclass) %>% 

    ggplot(aes(career_phase_quintiles, perc_exh_decile_repclass, fill = exh_type_from)) +
      geom_col() + 
      facet_grid(~ end_rep_fraib) +
      labs(fill = "Exhibition\ntype", x = NULL) +
      theme_bw() +
      theme(legend.position = "bottom"), 

    legend = "right", common.legend = T, nrow = 2
)

ggpubr::ggarrange(

   career_paths %>% 
    group_by(career_phase_quintiles, initial_rep_fraib, exh_type_from, exh_venue_from) %>% 
    summarise(exh_count = n()) %>% # exhibitions per venue per repclass per trajectory decile
    group_by(career_phase_quintiles, initial_rep_fraib) %>% 
    mutate(sum_exh_decile_repclass = sum(exh_count), 
           perc_exh_decile_repclass = exh_count / sum_exh_decile_repclass) %>% 

    ggplot(aes(career_phase_quintiles, perc_exh_decile_repclass, 
               fill = exh_venue_from, alpha = exh_type_from)) +
      scale_alpha_manual(values = c("solo" = 1, "group" = 0.78)) +
      venue_colors_fill + 
      geom_col() + 
      facet_grid(~ initial_rep_fraib) +
      labs(fill = "Exhibition\nvenue", y = NULL, x = NULL, alpha = "Exhibition\ntype") +
      theme_bw() + theme(legend.position = "bottom", 
                         panel.grid.major = element_blank(),
                         panel.grid.minor = element_blank(), 
                         legend.title.align = 0.5, legend.title = element_text(size = 10)), 


    career_paths %>% 
      group_by(career_phase_quintiles, end_rep_fraib, exh_type_from, exh_venue_from) %>% 
      summarise(exh_count = n()) %>% # exhibitions per venue per repclass per trajectory decile
      group_by(career_phase_quintiles, end_rep_fraib) %>% 
      mutate(sum_exh_decile_repclass = sum(exh_count), 
             perc_exh_decile_repclass = exh_count / sum_exh_decile_repclass) %>% 

      ggplot(aes(career_phase_quintiles, perc_exh_decile_repclass, 
                 fill = exh_venue_from, alpha = exh_type_from)) +
        scale_alpha_manual(values = c("solo" = 1, "group" = 0.78)) +
        venue_colors_fill + 
        geom_col() + 
        facet_grid(~ end_rep_fraib) +
        labs(x = NULL, y = NULL, alpha = "Exhibition\ntype") +
        theme_bw() + theme(legend.position = "bottom", 
                           panel.grid.major = element_blank(),
                           panel.grid.minor = element_blank(), 
                           legend.title.align = 0.5, legend.title = element_text(size = 10)), 

   common.legend = T, nrow = 2, legend = "bottom", 
   labels = "auto", font.label = list(size = 11.5)) %>%

  ggpubr::annotate_figure(left = ggpubr::text_grob("Percentage", rot = 90))

# retrieved from https://www.metmuseum.org/exhibitions/listings/2009/pictures-generation
picture_generation <- c("John Baldessari, Ericka Beckman, Dara Birnbaum, Barbara Bloom, Eric Bogosian, Glenn Branca, Troy Brauntuch, James Casebere, Sarah Charlesworth, Rhys Chatham, Charles Clough, Nancy Dwyer, Jack Goldstein, Barbara Kruger, Louise Lawler, Thomas Lawson, Sherrie Levine, Robert Longo, Allan McCollum, Paul McMahon, MICA-TV (Carole Ann Klonarides & Michael Owen), Matt Mullican, Richard Prince, David Salle, Cindy Sherman, Laurie Simmons, Michael Smith, James Welling, and Michael Zwack") %>% str_split(", ") %>% unlist() %>% str_remove("^and ") %>% 
  tibble(raw = .) %>% # tibble() replaces the deprecated data_frame()
  transmute(first_name = str_extract(raw, "[a-zA-Z]+"), 
            last_name = str_remove(raw, paste0(first_name, " ")), 
            name = paste0(last_name, ", ", first_name))

# 88125 Klonarides, Carole Ann
# 264167 Klonaridi & Owen, [Carol Ann Klonaridi & Michael Owen]
# 244752 MICA-TV, [Carole Ann Klonarides & Michael Owen]
# 88127 Owen, Michael



sd_deciles_same_rep <- inner_join(
  sd_initial_careers %>% 
    group_by(initial_rep_fraib) %>% 
    ungroup() %>% 
    mutate(sd_tile_initial = ntile(sd_initial, 100), 
           initial_rep_fraib = as.character(initial_rep_fraib)),
  sd_end_careers %>% 
    group_by(end_rep_fraib) %>% 
    ungroup() %>%  
    mutate(sd_tile_end = ntile(sd_end, 100), 
           end_rep_fraib = as.character(end_rep_fraib)),
  by = c("name")) %>% 

  mutate_at(vars(contains("rep_fraib")), ~ str_remove(., "\ninitial|\nfinal")) %>% 
  left_join(career_paths %>% 
              group_by(name) %>% 
              summarise(n = n(), 
                        sd = sd(venue_percentile), 
                        mean = mean(venue_percentile), 
                        median = median(venue_percentile)), 
            by = "name")





# sd_deciles_same_rep %>% 
#   filter(name %in% pull(picture_generation, name)) %>% 
#   View()

# high initial and high final 45th
# sd_deciles_same_rep %>% 
#   filter(name == "Sherman, Cindy") %>% 
#   pull(name) %>% 
#   plot_career()
# 
# # moderate, moderate 77th
# sd_deciles_same_rep %>% 
#   filter(name == "Longo, Robert") %>% 
#   pull(name) %>% 
#   plot_career()

# moderate initial and moderate final, 23rd percentile for sd()
# sd_deciles_same_rep %>% 
#   filter(name == "Lawler, Louise") %>% 
#   pull(name) %>% 
#   plot_career()
# 
# 
# # moderate initial and moderate final, 10th percentile for sd()
# sd_deciles_same_rep %>% 
#   filter(name == "Kruger, Barbara") %>% 
#   pull(name) %>% 
#   plot_career()
# 
# 
# sd_deciles_same_rep %>% 
#   filter(name == "Cragg, Tony") %>% 
#   pull(name) %>% 
#   plot_career()
# 
# sd_deciles_same_rep %>% 
#   filter(name == "McCollum, Allan") %>% 
#   pull(name) %>% 
#   plot_career()

# Smith, Michael

ggpubr::ggarrange(   

  career_paths %>% 
    #filter(exh_type_from != "group") %>% 
    group_by(venue_status, exh_type_from) %>% 
    summarise(exhibitions = n()) %>% 
    group_by(venue_status) %>% 
    mutate(n = sum(exhibitions), 
           perc = exhibitions / n,
           nlegend = paste("n=",n)) %>% 

    ggplot(aes(venue_status, perc, fill = exh_type_from)) + 
    geom_col() + 
    scale_fill_manual(values = c(wes_palette("Royal2")[5], wes_palette("Royal1")[2])) +
    theme_bw() +
    labs(fill = "Type", x = NULL, y = NULL) + 
    geom_text(aes(label = nlegend, y = 1), check_overlap = T, nudge_y = 0.018, size = 2.8) +
    #ggtitle(label = "Exhibitions (total)") +
    theme(title = element_text(size = 10)) +
    scale_y_continuous(breaks = seq(0,1,0.2)), 

  career_paths %>% 
    filter(career_phase == "first_5_exh") %>% 
    group_by(venue_status, exh_type_from) %>% 
    summarise(exhibitions = n()) %>% 
    group_by(venue_status) %>% 
    mutate(n = sum(exhibitions), 
           perc = exhibitions / n, 
           nlegend = paste("n=",n)) %>% 

    ggplot(aes(venue_status, perc, fill = exh_type_from)) + 
    geom_col() + 
    scale_fill_manual(values = c(wes_palette("Royal2")[5], wes_palette("Royal1")[2])) +
    theme_bw() +
    labs(fill = "Type", x = NULL, y = NULL) + 
    geom_text(aes(label = nlegend, y = 1), check_overlap = T, nudge_y = 0.018, size = 2.8) +
    #ggtitle(label = "Exhibitions (first five)") +
    theme(title = element_text(size = 10)) +
    scale_y_continuous(breaks = seq(0,1,0.2)), 

  common.legend = T, legend = "right", labels = "auto", font.label = list(size = 11)

) %>% ggpubr::annotate_figure(left = ggpubr::text_grob("Percentage", rot = 90), 
                              bottom = ggpubr::text_grob("Venue Status"))

# Where do solo exhibitions take place?
career_paths %>% 
  distinct(exh_id_from, .keep_all = T) %>% 
  filter(exh_start_Y_from <= 2015) %>% 

  group_by(exh_type_from, from_venue_status, exh_start_Y_from) %>% 
  summarise(exh = n()) %>% 
  group_by(from_venue_status, exh_start_Y_from) %>% 
  mutate(n = sum(exh), 
         perc = exh / n) %>% 

  ggplot(aes(exh_start_Y_from, perc, fill = exh_type_from)) +
  geom_col() +
  theme_bw() +
  facet_grid(~ from_venue_status)

# How are solo exhibitions distributed among groups? 

career_paths %>% 
    group_by(career_phase_quintiles, initial_rep_fraib, exh_type_from) %>% 
    summarise(exh_count = n()) %>% # exhibitions per venue per repclass per trajectory decile
    group_by(career_phase_quintiles, initial_rep_fraib) %>% 
    mutate(sum_exh_decile_repclass = sum(exh_count), 
           perc_exh_decile_repclass = exh_count / sum_exh_decile_repclass) %>% 

    ggplot(aes(career_phase_quintiles, perc_exh_decile_repclass, fill = exh_type_from)) +
      geom_col() + 
      facet_grid(~ initial_rep_fraib) +
      labs(fill = "Exhibition\ntype", x = NULL) +
      theme_bw() + theme(legend.position = "bottom")

moves %>% 
  select(occ_for_org_, org_for_occ_, ratio,
         initial_rep_fraib,  career_phase_quintiles) %>% 
  gather(var, val, - initial_rep_fraib, - career_phase_quintiles) %>% 
  ggplot(aes(career_phase_quintiles, val, 
               color = var, group = var)) +
    geom_line() +
    theme_bw() +
    facet_grid(~ initial_rep_fraib)

ggpubr::ggarrange(
  # both up prob
  moves %>%  
    ggplot(aes(career_phase_quintiles, both_up_, 
               color = initial_rep_fraib, group = initial_rep_fraib)) +
    geom_line() +
    theme_bw(),

  # both down prob
  moves %>% 
    ggplot(aes(career_phase_quintiles, both_down_, 
               color = initial_rep_fraib, group = initial_rep_fraib)) +
    geom_line() +
    theme_bw(), 

  common.legend = T, legend = "bottom"

)

\newpage

Conclusion {#conclusion}

This thesis began by deriving hypotheses from studies on careers in the visual arts, contemporary \mbox{literature} and American restaurants. Despite the diversity of their research objects, these analyses all focused on the affiliations of individuals with organizations of higher or lower status. Their results therefore provided a promising starting point for the exploration of careers. In chapter \@ref(exhibition-venues), I argued that an artist's career can be interpreted as a sequence of affiliations with exhibition venues. The locations to which artists have access are an indicator of their artistic recognition. The locations that artists select for their exhibitions indicate, in turn, the status of the venues. Similarly, intermediaries have preferences regarding the other places where the works at their disposal are exhibited. These implicit assessments made it possible to operationalize the status of exhibition venues based on the artists' participation in exhibitions. The validity and reliability of the resulting status order were then evaluated in chapter \@ref(prestige-empirically). I found that the status measure is not sensitive to variations in the amount of available data. In addition, the status order turned out to be partly consistent with the top 100 exhibition venues reported by @fraiberger2018sm.

Chapter 4 explored the course of the artists' careers. I first showed that the path dependency observed by @fraiberger2018 also applies to the present data set. A higher prestige of the venues of the artists' initial exhibitions coincides with a higher prestige of the last venues on their trajectories. However, whether artistic recognition is cumulative also depends on the analytical strategy. The inequalities between artists who initially received very different levels of recognition have in fact decreased over the course of their careers. Subsequently, I examined where the artists exhibit, given that their careers began at different initial positions. It turned out that belonging to the group of artists with low, medium or high initial recognition does not preclude exhibitions in the highest or lowest segments of the status order. Finally, I explored whether there are specific patterns of movement between exhibition venues by which artists attain, maintain, or lose status. No particular type of movement could be identified. This might be due to the fact that the dataset includes information about where the works of an artist are displayed, but not about the processes that led to the exhibition.

In summary, within the limitations specified above, the analysis provided further evidence for path dependencies in relation to the artists' first exhibitions. However, provided that artists continue to exhibit, initial inequalities in artistic recognition diminish over the course of their careers.

nodes_l_t1 %>% 
    filter(name_exhplace %>% str_detect("Galerie Hans Mayer|Serpentine|Folkwang|Maenz|Denise René$|Michael Werner - Köln|Galerie Schmela|Konrad Fischer|Kunsthalle Düssel"), !is.na(val)) %>% 
  ggplot(aes(year, val, color = name_exhplace)) + 
  geom_line() +
  theme_bw()

career_paths %>% 
    group_by(career_phase_quintiles, initial_rep_fraib, exh_type_from, exh_venue_from) %>% 
    summarise(exh_count = n()) %>% # exhibitions per venue type and exhibition type, per repclass, per career-phase quintile
    group_by(career_phase_quintiles, initial_rep_fraib) %>% 
    mutate(sum_exh_decile_repclass = sum(exh_count), 
           perc_exh_decile_repclass = exh_count / sum_exh_decile_repclass) %>% 

    ggplot(aes(career_phase_quintiles, perc_exh_decile_repclass, 
               fill = exh_venue_from, alpha = exh_type_from)) +
      scale_alpha_manual(values = c("solo" = 0.78, "group" = 1)) +
      venue_colors_fill + 
      geom_col() + 
      facet_grid(~ initial_rep_fraib) +
      labs(fill = "Exhibition\nvenue", y = NULL, x = NULL, alpha = "Exhibition\ntype") +
      theme_bw() + theme(legend.position = "bottom", 
                         panel.grid.major = element_blank(),
                         panel.grid.minor = element_blank(), 
                         legend.title.align = 0.5, legend.title = element_text(size = 10))+
  guides(fill=guide_legend(nrow=2), alpha = guide_legend(nrow = 2))

solo_group_stats <- career_paths %>% 
     group_by(career_phase_quintiles, initial_rep_fraib, exh_type_from) %>% 
     summarise(avg_perc = mean(from_venue_perc), exh_count_sg = n()) %>% 
     ungroup() %>% 
     mutate_at(vars(exh_type_from), ~ factor(., c("solo", "group"))) 

solo_group_stats %>% 
     mutate_at(vars(initial_rep_fraib), ~ str_replace(., " ", "\n"))  %>%
     ggplot(aes(career_phase_quintiles, avg_perc , color = initial_rep_fraib, group = initial_rep_fraib)) +
     geom_line() + 
     theme_bw() + 
     theme(legend.position = "bottom", legend.text.align = 0.5, 
           legend.title.align = 0.5, legend.title = element_text(size = 10), 
           title = element_text(size = 10.5)) +
     labs(x = NULL, y = "Average Percentile", color = NULL) + 
     #scale_y_continuous(limits = c(29,89), breaks = seq(30,90,10)) +
  facet_grid(~ exh_type_from)

solo_group_stats <- career_paths %>% 
     group_by(career_yrs, initial_rep_fraib, exh_type_from) %>% 
     summarise(avg_perc = mean(from_venue_perc), exh_count_sg = n()) %>% 
     ungroup() %>% 
     mutate_at(vars(exh_type_from), ~ factor(., c("solo", "group"))) 

solo_group_stats %>% 
     mutate_at(vars(initial_rep_fraib), ~ str_replace(., " ", "\n"))  %>%
     ggplot(aes(career_yrs, avg_perc , color = initial_rep_fraib, group = initial_rep_fraib)) +
     geom_line() + 
     theme_bw() + 
     theme(legend.position = "bottom", legend.text.align = 0.5, 
           legend.title.align = 0.5, legend.title = element_text(size = 10), 
           title = element_text(size = 10.5), 
           axis.text.x = element_text(size = 9)) +
     labs(x = NULL, y = "Average Percentile", color = NULL) + 
     #scale_y_continuous(limits = c(29,89), breaks = seq(30,90,10)) +
  facet_grid(~ exh_type_from)

knitr::include_graphics("figures/Matching_Algorithmus_AKL.png")

kable_top_venues <- function(.year, .top_n){

  ranked_nodes %>% 
    filter(year == .year) %>% 
    arrange(rank_past_all) %>% 
    head(.top_n) %>% 
    #mutate_if(is.character, ~ str_replace("&"))
    select(rank_past_all, name, num_of_exh, first_exh_yr, venue_percentile) %>% 
    mutate_if(is.numeric, ~ round(.,1)) %>% 

    kable(format = "latex", 
          col.names = linebreak(c("Rank", "Exhibition Venue", "Number of\nExhibitions", "Year of first\nexhibition", "Percentile"), align = "c"),
          align = c("c", "l", rep("c", 2), "r"),
          escape = F,
          booktabs = T, 
          caption = paste0("Top ", .top_n, " exhibition venues in ", .year, ", as computed by eigenvector centrality")) %>% 
    kable_styling(latex_options = "striped", font_size = 6)

}
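
The table caption above refers to eigenvector centrality. As a rough illustration of how such a score orders venues, the sketch below computes it with `igraph` on a toy network. The edge list, the venue names `A` to `D`, and the choice of undirected ties are assumptions made for this example only; the construction of the actual venue network is not reproduced here.

```r
library(igraph)

# Toy edge list (hypothetical): a tie links two venues that exhibited the same artist
toy_edges <- data.frame(
  from = c("A", "A", "A", "B"),
  to   = c("B", "C", "D", "C")
)

# Undirected ties are a simplifying assumption for this sketch
g <- graph_from_data_frame(toy_edges, directed = FALSE)

# A venue scores high if it is tied to venues that themselves score high
status <- eigen_centrality(g)$vector

sort(status, decreasing = TRUE)
```

In this toy network, venue `A` is tied to every other venue and receives the highest score, while `D`, whose only tie is to `A`, receives the lowest. `B` and `C` occupy equivalent positions and therefore score the same.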