-
Notifications
You must be signed in to change notification settings - Fork 1
/
ISGC.Rmd
553 lines (378 loc) · 24.1 KB
/
ISGC.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
---
title: "ISGC"
author: "Marion Maisonobe"
date: "30/04/2021"
output:
pdf_document: default
html_document: default
always_allow_html: yes
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## ISGC Congress
```{r , echo = FALSE, message = F, warning = F}
# tinytex::install_tinytex()
library(tinytex)
library(tidyverse)
library(kableExtra)
library(ggalluvial)
library(knitr)
library(ggraph)
library(RColorBrewer)
library(circlize)
# under-dev package
#library(tnet)
#library(cartography)
#library(usethis)
# install.packages("cartigraph_0.0.0.9000.tar.gz", repos = NULL, type = "source")
library(igraph)
library(cartigraph)
abstracts <- read_tsv("data/abstracts-2015-2019.tsv")
authors_abstracts <- read_tsv("data-net/edges-2015-2019.tsv")
```
ISGC is an international conference of green chemistry held every two years in La Rochelle since 2013.
In the frame of the NETCONF research project we focus on several editions of the conference in order to study its structure and dynamics. we are interested in communications, panels, topics, participants and their geographical origin. We consider studying conferences can bring many insights on emergent scientific communities, both regarding their organisation and dynamics.
## Topics
```{r , echo = FALSE, message = F, warning = F}
# keep OC (Oral Communications) and FC (Flash Communications) only
ocfc <- abstracts %>%
filter(final_status %in% c("OC","FC", "FC - PC", "OC - PC"))
# count number of com per topic
a <- ocfc %>% count(topic_1)
# count number of com per topic per year
# ocfc %>%
# group_by(year, topic_1) %>%
# mutate(n_topic = n()) %>%
# pivot_wider(names_from = year, values_from = n_topic)
# count number of com per topic per year and export this as a table
tab_1 <- ocfc %>%
group_by(year, topic_1) %>%
summarise(n = n()) %>%
mutate(prct = round(n/sum(n)*100, 1)) %>%
pivot_wider(names_from = year, values_from = c(n, prct), values_fill = list(n = 0)) %>%
mutate(total_nb = rowSums(across(starts_with("n")))) %>%
mutate(total_prct = round(total_nb/sum(total_nb)*100, 1)) %>%
mutate(total_prct = replace_na(total_prct, 0),
prct_2015 = replace_na(prct_2015, 0),
prct_2017 = replace_na(prct_2017, 0),
prct_2019 = replace_na(prct_2019, 0))%>%
relocate(n_2015, .before = 2) %>%
relocate(prct_2015, .before = 3) %>%
relocate(prct_2017, .before = 5) %>%
relocate(prct_2019, .before = 7) %>%
arrange(-total_nb)
```
We first look at the topics associated to flash and oral communications given at the ISGC congress over time. We notice that `r tab_1$topic_1[1]` is the more popular topic. It attracted `r tab_1$n_2019[1]` communications at the 2019 edition. However, we notice that certain topics are not represented at every ISGC editions. It is notably the case of `r tab_1$topic_1[1]` which was not present at the 2017 edition. During the 2017 edition, it appears that `r tab_1$topic_1[1]` has been temporarily replaced by another category named `r tab_1$topic_1[2]` (see Table 1 below).
```{r , echo = FALSE, message = F, warning = F}
tab_1 %>%
add_row(summarise(., # add the totals by columns
across(where(is.numeric), sum), # numeric totals
across(where(is.character), ~"Total")) %>% # write "Total" at the end of the 1st column
mutate(across(starts_with("prct_"), round, 0))) %>% # round the %
kbl(col.names = NULL) %>%
add_header_above(c("ISGC topic", rep(c("Nb", "%"), 4))) %>%
add_header_above(c(" ", "2015" = 2, "2017" = 2, "2019" = 2, "Total" = 2)) %>%
add_header_above(c(" ", "Number and % of oral and flash communications" = 8)) %>%
kable_classic(full_width = F, html_font = "Cambria") %>%
row_spec(dim(tab_1)[1]+1, bold = T) %>% # format last row
column_spec(c(3, 5, 7, 9), italic = T) # format first column
```
According to François Jérôme, the main organiser of the ISGC conference, it is possible to associate certain similar topics so that we can better analyse trends in the popularity of ISGC topics over time. I refer to his suggestions to decide which topics can be grouped together on the basis of their similarity. Following this operation, we obtain 16 unique topics.
```{r , echo = FALSE, message = F, warning = F}
tab_2 <- tab_1 %>%
mutate(topic_1 = recode(topic_1, "Renewable carbon" = "Biomass conversion",
"Waste valorization" = "Biomass conversion",
"Chemical valorization of wastes" = "Biomass conversion",
"Catalytic systems" = "Homogenous, heterogenous and biocatalysis",
"Polymers and materials" = "Polymers",
"Environnemental impact and life cycle assessment" = "Life cycle and environmental assessment",
"Atom-economy synthesis" = "Clean reactions",
"Eco-technology" = "New technologies",
"Alternative technologies" = "New technologies",
"Non-thermal activation methods" = "New technologies"
)) %>%
group_by(topic_1) %>%
summarise(across(where(is.numeric), sum)) %>%
arrange(-total_nb)
# Copy the URL of the html window you get
# load webshot library
# library(webshot)
#install phantom:
# webshot::install_phantomjs()
# Make a webshot in pdf : high quality but can not choose printed zone
# webshot("plots/table1.html" , "plots/table.pdf", delay = 0.2)
# write_tsv(tab_2, "data/topic.tsv")
allu_1 <- tab_2 %>%
pivot_longer(starts_with("n_"), names_to = "year", names_prefix = "n_", values_to = "n") %>%
select(topic_1, year, n)
# svg(file = "plots/topics_ISGC.svg", width = 10, height = 6)
# new plot
# plot.new()
ggplot(allu_1,
aes(x = year, stratum = topic_1, alluvium = topic_1,
y = n,
fill = topic_1, label = topic_1)) +
scale_x_discrete(expand = c(.1, .1)) +
geom_flow() +
geom_stratum(color ="gray65") + #alpha = .5,
geom_text(stat = "stratum", size = 3) +
xlab("ISGC edition") + ylab("Number of oral and flash communications") +
theme(legend.position = "none", panel.background = element_rect(fill = "white", colour = "lightgrey")) +
ggtitle("Number of communications per ISGC topic")
# dev.off()
```
We observe that 4 topics are present at each edition of the congress: Biomass conversion (named "Renewable Carbon" at the 2017 Edition), Polymers, Alternative solvents, and New technologies. The 8 remaining topics are only present at one or two editions.
## Participation
There are different ways of approaching the conference participation. It can be analysed using registration data or though the lens of communications' authors i.e. active participants. In the latter case, it is possible to filter the information to consider only the active participants who registered to the conference. Indeed, certain communications' co-authors did not attend the conference.
In what follows, we consider the participation through the lens of all communications' authors, whether they registered or not.
```{r , echo = FALSE, message = F, warning = F}
# keeping everyone including industrials and keynote speakers
d <- authors_abstracts %>%
mutate(year = as.numeric(year))
# how many participations over the 3 conferences?
t <- group_by(d, i) %>%
summarise(t_c = n_distinct(year)) %>%
arrange(-t_c)
```
By counting all the speakers including industrials and keynote speakers over the 3 editions of the congress, we find a total of `r length(t$i)` participants. Among them, `r dim(filter(t, t$t_c %in% 3))[1]` were represented at all editions, `r dim(filter(t, t$t_c %in% 1))[1]` intervened to 1 edition only, (i.e. `r round(dim(filter(t, t$t_c %in% 1))[1]/length(t$i)*100, 2)`%), and `r dim(filter(t, t$t_c %in% 2))[1]` contributed to 2 editions.
```{r , echo = FALSE, message = F, warning = F}
# limiting ourselves to oral communications and flash communications panels
d <- d %>%
filter(final_status %in% c("OC","FC", "FC - PC", "OC - PC"))
# how many participations to OC and FC over the 3 conferences?
t <- group_by(d, i) %>%
summarise(t_c = n_distinct(year)) %>%
arrange(-t_c)
```
If we limit ourselves to participants to oral communications and flash communications panels, we find a total of `r length(t$i)` participants. Among them, `r dim(filter(t, t$t_c %in% 3))[1]` were represented at all editions, `r dim(filter(t, t$t_c %in% 1))[1]` intervened to 1 edition only, (i.e. `r round(dim(filter(t, t$t_c %in% 1))[1]/length(t$i)*100, 2)`%), and `r dim(filter(t, t$t_c %in% 2))[1]` contributed to 2 editions.
```{r , echo = FALSE, message = F, warning = F}
# number of OC and FC panels overall
t <- distinct(d, j, .keep_all = T) %>%
group_by(year) %>%
summarise(n = n())
# add number of panels intervention per conference
# (useful for edge weighting)
d <- group_by(d, year, i) %>%
summarise(n_p = n()) %>%
inner_join(d, ., by = c("year", "i"))
# add total number of panels and total number of conferences with at least one communication
# (useful for vertex subsetting)
d <- group_by(d, i) %>%
summarise(t_p = n_distinct(j), t_c = n_distinct(year)) %>%
inner_join(d, ., by = "i")
```
Overall, there had been `r n_distinct(d$j)` oral and flash communications distinct panels: `r t$n[t$year %in% 2015]` in 2015, `r t$n[t$year %in% 2017]` in 2017, and `r t$n[t$year %in% 2019]` in 2019.
```{r , echo = FALSE, message = F, warning = F}
# nb of authors per paper
nbaut <- d %>%
group_by(id, year) %>%
count(sort = T)
```
In chemistry, most communications are the result of a team work. At ISGC, there is an average of `r round(mean(nbaut$n), 2)` authors per oral and flash communications.
## Networks of panels
In each of the graphs below, the grey dots denote a Congress panel, the blue dots denote participants who attended only one panel, and the red dots denote 'multi-positional' participants who attended more than one Congress panel. Finally, the intensity of the links refers to an inverse weighting that visually accentuates participation in panels with few participants.
Remember that we are counting potential participants here: in practice, some of these participants will not have come to the Congress.
```{r , echo = FALSE, message = F, warning = F}
#2m-net participants - panel 2015
include_graphics("plots/congres-isgc2015-2mode.png") #, dpi = 300
#2m-net participants - panel 2017
include_graphics("plots/congres-isgc2017-2mode.png")
#2m-net participants - panel 2019
include_graphics("plots/congres-isgc2019-2mode.png")
tab_3 <- readr::read_tsv("data-net/table.tsv") %>%
relocate(P, .before = 3) %>%
mutate(ratio = round(P/P0, 1), .after = P) %>%
mutate(P1 = round((P1/P*100), 1),
P2 = round((P2/P*100), 1),
C = C-1)
```
Are there more and more multi-positional participants - degree 2+ in these bipartite graphs - at the ISGC Congress? The answer is yes, slightly more, which can partly result from the opportunities offered by the increased number of panels. From 2015 to 2017, the number of multi-positional participants increased from `r tab_3$P2[tab_3$year %in% 2015]` to `r tab_3$P2[tab_3$year %in% 2019]`%, i.e. an increase of `r round((tab_3$P2[tab_3$year %in% 2019]-tab_3$P2[tab_3$year %in% 2015])/tab_3$P2[tab_3$year %in% 2015]*100, 2)`%. At the same time, the number of panels increased from `r tab_3$P0[tab_3$year %in% 2015]` to `r tab_3$P0[tab_3$year %in% 2019]`, an increase of `r round((tab_3$P0[tab_3$year %in% 2019]-tab_3$P0[tab_3$year %in% 2015])/tab_3$P0[tab_3$year %in% 2015]*100, 2)`% (see Table 2 below). The number of multi-positional participants thus increased at a higher pace than the number of panels.
```{r, echo = FALSE, message = F, warning = F}
tab_3 %>%
kbl(col.names = NULL) %>%
add_header_above(c("ISGC Congress", "Panels" = 1, "Participants" = 1,
"Ratio" = 1, "% degree 1" = 1, "% degree 2+" = 1, "C - 1" = 1)) %>%
kable_classic(full_width = F, html_font = "Cambria") %>%
column_spec(c(5, 6), italic = T) # format first column
```
In the table above, the "Ratio" column gives the ratio of participants to panels, which drops very slightly over time. It is difficult to draw a conclusion on the fragmentation of the conference as the number of panels at the ISGC is the result of the organisers' choices when setting up the programme and not of the participants. Indeed, the participants of the ISGC congress choose to associate their paper with a main topic and it is on this basis that the papers are then grouped into panels.
The number of isolated components of the graph, noted as "C - 1" above because only components not connected to the main component have been counted, corresponds to the number of panels in which no participant took part in any other panel of the Congress. This figure is decreasing over time, from `r tab_3$C[tab_3$year %in% 2015]` in 2015 to `r tab_3$C[tab_3$year %in% 2019]` in 2019, i.e. a drop of `r round((tab_3$C[tab_3$year %in% 2015]-tab_3$C[tab_3$year %in% 2019])/tab_3$C[tab_3$year %in% 2015]*100, 2)`%.
In the one-mode networks below, the blue dots indicate participants who attended only one panel of the Congress edition concerned; the red dots indicate participants who attended several panels.
```{r , echo = FALSE, message = F, warning = F}
#2m-net participants - panel 2015
include_graphics("plots/congres-isgc2015-1mode.png") #, dpi = 300
#2m-net participants - panel 2017
include_graphics("plots/congres-isgc2017-1mode.png")
#2m-net participants - panel 2019
include_graphics("plots/congres-isgc2019-1mode.png")
```
## Linking communications according to common authors and topics
```{r , echo = FALSE, message = F, warning = F}
f <- left_join(d, select(abstracts, Idu, topic_1), by = "Idu" ) %>%
group_by(Idu) %>%
mutate(a_id = cur_group_id()) %>% ungroup() %>%
mutate(rtopic = recode(topic_1, "Renewable carbon" = "Biomass conversion",
"Waste valorization" = "Biomass conversion",
"Chemical valorization of wastes" = "Biomass conversion",
"Catalytic systems" = "Homogenous, heterogenous and biocatalysis",
"Polymers and materials" = "Polymers",
"Environnemental impact and life cycle assessment" = "Life cycle and environmental assessment",
"Atom-economy synthesis" = "Clean reactions",
"Eco-technology" = "New technologies",
"Alternative technologies" = "New technologies",
"Non-thermal activation methods" = "New technologies"
))
# number of participants with at least two communications
nmp <- f %>%
count(i) %>%
filter(n > 1) %>%
nrow(.)
# network of Oral and flash communications with at least one common author
net <- select(f, p = a_id, e = i) %>% distinct()
# From 2-mode to 1-mode, this function transform the result into an igraph object
g <- cartigraph::netproj(net, method = "sum")
V(g)$topic <- as.character(f$topic_1[match(V(g)$name, f$a_id)])
V(g)$rtopic <- as.character(f$rtopic[match(V(g)$name, f$a_id)])
V(g)$year <- as.character(f$year[match(V(g)$name, f$a_id)])
V(g)$label <- paste0(V(g)$name, V(g)$year, sep = "_") #
# V(g)$label <- str_trunc(V(g)$topic, 20, "right") # If necessary add "str_to_title"
```
We now consider the network of oral and flash communications sharing one author at least.
```{r , echo = FALSE, message = F, warning = F}
# Network plots
colorlist <- c(brewer.pal(9, "Set1")[-6], brewer.pal(8, "Set2")[-7], brewer.pal(12, "Paired")[-11], brewer.pal(12, "Set3")[-c(2, 8, 12)])
plot(g, vertex.label = NA, vertex.size = 1)
```
It counts `r length(V(g))` communications (on a total of `r max(f$a_id)`, among which `r length(V(g)[year == "2015"])` from the 2015 edition, `r length(V(g)[year == "2017"])` from the 2017 edition, and `r length(V(g)[year == "2019"])` from the 2019 edition. It involves `r nmp` participants.
```{r , echo = FALSE, message = F, warning = F}
# Keep only the links between communications with different topics
# inspired by https://stackoverflow.com/questions/60279825/calculating-network-statistics-between-attribute-classes-with-igraph-in-r
vattrs <- igraph::vertex_attr(g, name = "topic")
total_el <- igraph::as_edgelist(g, names = FALSE)
# rows from total_el where the attribute of the edge source == attribute of edge target
edges_to_rm <- vattrs[total_el[, 1L]] == vattrs[total_el[, 2L]]
# heterophilic communication network in terms of topics
gh <- subgraph.edges(g, eids = which(!edges_to_rm), delete.vertices = T)
# plot(gh, vertex.label = NA)
```
There are `r length(V(gh))` communications with different topics but at least one common authors. We here consider the original topics' name before the recoding suggested by François Jérôme's expertise. We group all the communications according to their topic and attempt to visualise the relations between topics.
```{r , echo = FALSE, message = F, warning = F}
contracted_g <- contract.vertices(g,
mapping = as.integer(as.factor(V(g)$topic)),
vertex.attr.comb = list(weight = "sum", topic = "first", "ignore"))
# Simplify edges
gcon <- simplify(contracted_g, remove.multiple = TRUE, remove.loops = FALSE,
edge.attr.comb = "sum")
V(gcon)$name <- V(gcon)$topic # I need to make sure I corectly transfer the topic information to the nodes
V(gcon)$label <- as_vector(str_remove_all(str_trunc(V(gcon)$name, 16, "right"), "\\."))
# V(gcon)$label <- 1:25
V(gcon)$name <- V(gcon)$label
# sum(E(gcon)$weight)
el <- get.edgelist(gcon)
df <- data.frame(from = el[,1], to = el[,2], value = as.vector(E(gcon)$weight), stringsAsFactors = F)
#svg(file = "Chord.svg", width = 8, height = 6)
# new plot
# plot.new()
chordDiagram(df, grid.col = colorlist[1:25], annotationTrack = c("grid"))
circos.trackPlotRegion(track.index = 1, panel.fun = function(x, y) {
xlim = get.cell.meta.data("xlim")
ylim = get.cell.meta.data("ylim")
sector.name = get.cell.meta.data("sector.index")
circos.text(mean(xlim), ylim[1] + .1, sector.name, cex = 0.55, facing = "clockwise", niceFacing = TRUE, adj = c(0, 0.5))
}, bg.border = NA)
#circos.info()
#head(df)
# chordDiagram(df, scale = F)
# circos.clear()
# dev.off()
```
The chord diagram with 25 different topics is not easy to read but it confirms the important link between Biomass conservation and Renawable Carbon that we grouped together in the 16 categories topic variable.
Since certain original topics were only present in certain editions we attempt to visualise the relations between topics using an historical layout. This network is derived from the network of communications having at least one common author.
```{r , echo = FALSE, message = F, warning = F}
V(g)$yt <- str_c(V(g)$year, str_remove_all(str_trunc(V(g)$topic, 16, "right"), "\\."), sep = "_")
contracted_g <- contract.vertices(g,
mapping = as.integer(as.factor(V(g)$yt)),
vertex.attr.comb = list(weight = "sum", topic = "first", year = "first", yt = "first", rtopic = "first", "ignore"))
# Simplify edges
gcon <- simplify(contracted_g, remove.multiple = TRUE, remove.loops = T,
edge.attr.comb = "sum")
V(gcon)$name <- V(gcon)$yt # I need to make sure I corectly transfer the topic information to the nodes
V(gcon)$label <- V(gcon)$topic # as_vector(str_remove_all(str_trunc(V(gcon)$name, 16, "right"), "\\."))
# Prepare the historical layout (not legible at the com level) inspired by the histPlot function in the R package {Bibliometrix}
V(gcon)$size <- 2
gcon <- delete_vertices(gcon, igraph::degree(gcon) == 0)
dg <- decompose.graph(gcon)
layout_m <- create_layout(gcon, layout = "nicely")
layout_m$cluster <- 0
rr <- 0
for (k in 1:length(dg)) {
bsk <- dg[[k]]
a <- ifelse(layout_m$name %in% V(bsk)$name, k, 0)
layout_m$cluster <- layout_m$cluster + a
Min <- min(layout_m$y[layout_m$cluster == k]) - 1
layout_m$y[layout_m$cluster == k] <- layout_m$y[layout_m$cluster == k] + (rr - Min)
rr <- max(layout_m$y[layout_m$cluster == k])
}
bsk <- gcon
wp <- membership(cluster_infomap(bsk, modularity = FALSE))
layout_m$x <- layout_m$year
# Historical layout (not legible at the com level)
ghist <- ggraph(layout_m) +
geom_edge_arc(width = 1, strength = 0.05, check_overlap = T, edge_alpha = 0.5, color = "grey") +
geom_node_text(aes(label = V(bsk)$label), size = 2, repel = TRUE, color = colorlist[wp], alpha = 0.8) +
geom_node_point(size = V(bsk)$size, color = colorlist[as.factor(V(bsk)$topic)], alpha = 0.15) +
scale_color_brewer() +
scale_x_discrete(labels = as.character(seq(2015, 2019, by = 2)), expand = c(.1, .1)) + # expand = c(.1, .1)
theme_minimal() +
theme(legend.position = "none",
panel.background = element_rect(fill = "gray97",
color = "grey97"), axis.line.y = element_blank(),
axis.text.y = element_blank(), axis.ticks.y = element_blank(),
axis.title.y = element_blank(), axis.title.x = element_blank(),
panel.grid.minor.y = element_blank(), panel.grid.major.y = element_blank(),
panel.grid.minor.x = element_blank(), axis.text.x = element_text(face = "bold",
angle = 90, size = 6)) + labs(title = "Historical layout")
plot(ghist)
# plot(g, vertex.label = NA)
```
Let now consider the network of relations between ISGC topics using the list of topics recoded into 16 categories.
```{r , echo = FALSE, message = F, warning = F}
contracted_g <- contract.vertices(g,
mapping = as.integer(as.factor(V(g)$rtopic)),
vertex.attr.comb = list(weight = "sum", rtopic = "first", "ignore"))
# Simplify edges
gtop <- simplify(contracted_g, remove.multiple = TRUE, remove.loops = T,
edge.attr.comb = "sum")
V(gtop)$label <- as_vector(str_remove_all(str_trunc(V(gtop)$rtopic, 16, "right"), "\\."))
# Activate the following line if you want a pdf or svg output
# svg(file = "plots/Topics_network_recoded_cartigraph.svg", width = 8, height= 6)
# plot_g <- cartigraph::cartigraph(gtop, inches = 0.05, legend.title.nodes = "Nb of communications\n",
# layout = layout_with_gem(gtop),
# values.rnd.links = 0, legend.title.links = "Nb of joint communications",
# title = "", # Network of links between thematic sessions
# col.nodes = colorlist[as.factor(V(gtop)$rtopic)], frame = F
# ) # change to vertex.label.cex = 0.7 when possible
# plot(gcon, layout = layout_with_gem, vertex.color = colorlist[as.factor(V(gcon)$topic)], edge.width = E(gcon)$weight/10)
# activate the following line if you want to clean up the graphics window
# dev.off()
knitr::include_graphics("plots/Topics_network_recoded_cartigraph.svg") # afficher la figure
```
## Individual trajectory
We focus on the specific case of PNA. A researcher who started his career in Singapore and then moved to Poitiers, in France where he was first hired as a postdoctoral researcher, and latter on succeeded to get a permanent position at the CNRS. Here is a subset of the network of communications involving this author. We notice that certain of his co-authors are not correctly indexed in the ISGC database, which means that we have to be very careful when interpreting coauthorship data derived from this source.
```{r , echo = FALSE, message = F, warning = F}
components <-clusters(g)
second <- sort(as.numeric(components$csize), decreasing = T)[2]
# biggest_cluster_id <- which.max(components$csize)
second_cluster_id <- which(components$csize == second)
# ids
vert_ids <- V(g)[components$membership == second_cluster_id]
V(g)$membership <- components$membership
## extract the component including PNA communications, the p_id of PNA is "1829"
# he contributed to 4th isgc talks with the following a_id: 13, 242, 610 and 693)
vert_ids <- V(g)[components$membership == (V(g)$membership[V(g)$name == "13"])]
# subgraph
cg <- igraph::induced_subgraph(g, vert_ids)
plot(cg, vertex.label = str_c(V(cg)$rtopic, V(cg)$year, sep = "_"))
pnacom <- f %>% filter(a_id %in% c("13", "242", "610", "693"))
```