
@briatte
Last active December 16, 2015 19:39
ESR plots - DPG CJC 2013-04-30

Plots for the CJC.

DATA

stats_esr.csv: extract from RERS 2012, tables 6.4.2 and 8.18.1.

jce.R: extract from the CJC survey "Les jeunes chercheurs étrangers en France. Résultats de l’enquête 2010", 2012, section VII.3. The data are generated directly in the code.

vacataires.tsv: extract from MESR-DGRH, Étude sur la situation des personnels enseignants non permanents de l’enseignement supérieur, 2009, tables 2 and 17. Variables used:

  • Etablissement = institution (or academy, before data preparation)
  • total = total headcount of vacataires (hourly-paid teaching staff)
  • inf96h = headcount below 96h ETD
  • sup96h = headcount above 96h ETD
  • ETECA = tenured teachers and associate teacher-researchers
  • PCNP = non-permanent staff (moniteurs, ATER, visiting/associate staff…) / ETECA
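
As a rough illustration of how these variables relate, the sketch below rebuilds two rows copied from vacataires.tsv and checks that `total` is the sum of `inf96h` and `sup96h`. The data frame name `vac` and the derived column `share_sup96h` are illustrative assumptions and do not appear in the original scripts.

```r
# Two rows copied from vacataires.tsv (AIXMARSEILLE 1 and AVIGNON).
vac <- data.frame(
  Etablissement = c("AIXMARSEILLE 1", "AVIGNON"),
  total  = c(2060, 368),
  inf96h = c(1910, 307),
  sup96h = c(150, 61),
  PCNP   = c(14.5, 18),
  ETECA  = c(1126, 275),
  stringsAsFactors = FALSE
)

# total should equal the sum of the two hour bands.
stopifnot(all(vac$total == vac$inf96h + vac$sup96h))

# Hypothetical derived column: share of vacataires above 96h ETD.
vac$share_sup96h <- vac$sup96h / vac$total
print(vac[, c("Etablissement", "share_sup96h")])
```

The same check could be run on the full file once missing values have been marked, since rows with unreported counts would otherwise pass trivially as zeros.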

PACKAGES

Coded by François in R.
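
A minimal sketch for checking that the packages loaded by the scripts are installed before running them. The package list is taken from the `require()` calls in the code; `maps` is added on the assumption that `map_data("france")` needs it, and the variable names `pkgs` and `missing_pkgs` are illustrative.

```r
# Packages loaded by the scripts, plus maps (assumed dependency of map_data()).
pkgs <- c("Amelia", "ggmap", "ggplot2", "RColorBrewer", "reshape",
          "scales", "zoo", "extrafont", "maps")

# requireNamespace() returns TRUE/FALSE without attaching the package.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are installed.")
}
```

Anything reported as missing can then be installed with `install.packages()` before sourcing the scripts.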

structure(list(Préfecture = structure(c(3L, 2L, 4L, 1L, 3L,
2L, 4L, 1L), .Label = c("Créteil", "Évry", "Paris", "Strasbourg"
), class = "factor"), variable = structure(c(1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L), .Label = c("% de doctorants avec carte de séjour étudiant",
"% de doctorants avec carte de séjour scientifiques-chercheurs"
), class = "factor"), value = c(76, 46, 51, 74, 13, 46, 35, 3
)), .Names = c("Préfecture", "variable", "value"), row.names = c(NA,
-8L), class = "data.frame")
Annee,L,M,D,Croissance.Inscriptions,Theses,Croissance.Soutenances
2004,897069,460426,67041,,10137,
2005,900196,453333,68190,1.6849978,10377,2.31280717
2006,878053,452886,68238,0.070342038,11421,9.14105595
2007,848111,449249,66390,-2.78355174,12006,4.872563718
2008,832140,506817,65419,-1.484278268,12356,2.83263192
2009,851646,527947,64990,-0.660101554,12703,2.731638196
2010,863762,509063,64279,-1.106115528,12883,1.397190095
2011,845212,493043,62132,-3.455546256,,
#! edit this
setwd("~/Documents/ESR/CJC/2013-DPG")
require(Amelia)
# geocode() is provided by ggmap; no separate package needs loading.
require(ggmap)
require(ggplot2)
require(RColorBrewer)
require(reshape)
require(scales)
require(zoo)
# Requires Gill Sans MT.
require(extrafont)
# font_import()
loadfonts()
## EFFECTIFS DOC/DT
# Drop the first year (no growth rate) and keep the year and both growth columns.
x <- read.csv("stats-esr.csv")[-1, c(1, 5, 7)]
names(x) <- c("Année", "Taux de croissance annuel du doctorat (% inscriptions)", "Taux de croissance annuel des docteurs (% soutenances)")
x <- melt(x, id = "Année")
ggplot(data = x, aes(x = Année, y = value)) +
  # Reference line at zero growth (geom_hline takes yintercept, not y).
  geom_hline(yintercept = 0, linetype = "dotted") +
  geom_line(aes(color = variable)) +
  # White discs mask the line behind each label.
  geom_point(color = "white", size = 16) +
  geom_text(aes(color = variable,
                label = paste(ifelse(value > 0, "+", ""), round(value, 1))),
            family = "Gill Sans MT") +
  scale_color_brewer("", palette = "Set1") +
  scale_y_continuous(breaks = -5:10) +
  scale_x_continuous(breaks = min(x$Année):max(x$Année)) +
  labs(y = NULL, x = NULL) + #, title = "Figure 1. Taux de croissance des effectifs doctorants et docteurs") +
  theme_bw() + theme(
    plot.title = element_text(face = "bold"),
    text = element_text(family = "Gill Sans MT", size = 12),
    panel.border = element_rect(color = "white"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    panel.grid.major.x = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks = element_blank(),
    legend.position = "top",
    legend.text = element_text(face = "bold", size = 12),
    legend.direction = "vertical")
ggsave("stats-esr.svg", width = 6, height = 4)
## DISPARITES JCE
y <- data.frame(
P = c("Paris", "Évry", "Strasbourg", "Créteil"),
E = c(76, 46, 51, 74),
S = c(13, 46, 35, 3))
names(y) <- c("Préfecture",
"% de doctorants avec carte de séjour étudiant",
"% de doctorants avec carte de séjour scientifiques-chercheurs")
y <- melt(y, id = "Préfecture")
ggplot(data = y, aes(x = Préfecture, y = value, fill = variable)) +
  geom_bar(aes(group = variable), stat = "identity", position = "dodge") +
  geom_text(aes(group = variable, label = value), family = "Gill Sans MT",
            hjust = -1, position = position_dodge(width = 0.9)) +
  scale_fill_brewer("", palette = "Set1") +
  scale_y_continuous(limits = c(0, 80)) +
  labs(y = NULL) +
  theme_bw() + theme(
    plot.title = element_text(face = "bold"),
    text = element_text(family = "Gill Sans MT", size = 12),
    panel.border = element_rect(color = "white"),
    panel.grid.minor.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    panel.grid.major.x = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_text(size = 12),
    axis.ticks = element_blank(),
    legend.position = "top",
    legend.text = element_text(face = "bold", size = 12),
    legend.direction = "vertical") + coord_flip()
ggsave("stats-esr-jce.svg", width = 6, height = 4)
## VACATAIRES
# Data.
z <- read.table("vacataires.tsv", sep = "\t",
header = TRUE,
stringsAsFactors = FALSE)
# Carry academy names backward (LOCB): each "Academie" row follows its establishments.
z$academie <- ifelse(grepl("Academie", z$Etablissement), z$Etablissement, NA)
z$academie <- na.locf(z$academie, fromLast = TRUE)
z <- subset(z, !grepl("Academie", Etablissement))
# Mark missing values: zeros not flagged as true zeros in the source become NA.
z$truezero <- is.na(z$truezero)
z$total[z$total == 0 & z$truezero] <- NA
z$inf96h[z$inf96h == 0 & z$truezero] <- NA
z$sup96h[z$sup96h == 0 & z$truezero] <- NA
z$truezero <- NULL
# Subset to metropolitan.
z <- z[!grepl("ANTILLES|LA REUNION|POLYNESIE", z$academie), ]
# Subset to measured PCNP and ETECA.
z <- z[!(is.na(z$PCNP) | is.na(z$ETECA)), ]
# Factor units.
z$Etablissement <- factor(z$Etablissement)
z$academie <- factor(z$academie)
# Check result.
str(z)
# Impute missing vacataire counts from PCNP and ETECA (1,000 imputations; inf96h and sup96h bounded below at 0).
a.out <- amelia(x = z[, c(1, 3:6)], idvars = "Etablissement", m = 10^3,
bounds = matrix(c(2, 3, 0, 0, Inf, Inf), nrow = 2))
save(file = "vacataires.amelia.Rda", a.out)
# cbind(z[, c(1, 3)], sapply(1:10, FUN = function(x) { a.out$imputations[[x]]$inf96h }))
# Maximal lon/lat information.
if(!file.exists(file <- "vacataires-fullgeo.Rda")) {
vacataires.fullgeo <- geocode(paste("Universite", z$Etablissement, "France"),
output = "all")
save(file = file, vacataires.fullgeo)
}
# Minimal lon/lat information.
if(!file.exists(file <- "vacataires-geo.txt")) {
l <- geocode(paste("Universite", z$Etablissement, "France"))
write.csv(data.frame(z, l), file)
}
## RESULTS
z <- read.csv("vacataires-geo.txt")
# mark imputed
z$inf96h.imputed <- is.na(z$inf96h)
z$sup96h.imputed <- is.na(z$sup96h)
z$total.imputed <- is.na(z$inf96h) | is.na(z$sup96h)
# apply imputed
z$inf96h <- rowMeans(sapply(1:10^3, FUN = function(x) { a.out$imputations[[x]]$inf96h }))
z$sup96h <- rowMeans(sapply(1:10^3, FUN = function(x) { a.out$imputations[[x]]$sup96h }))
z$total <- z$inf96h + z$sup96h
# imputation ratio:
prop.table(table(z$total.imputed))
# sums and ratios:
inf96h <- tapply(z$inf96h, z$inf96h.imputed, sum)
# FALSE TRUE
# 56129.00 30695.45
inf96h[2] / sum(inf96h)
# TRUE
# 0.3535346
sup96h <- tapply(z$sup96h, z$sup96h.imputed, sum)
# FALSE TRUE
# 9813.00 16689.24
sup96h[2] / sum(sup96h)
# TRUE
# 0.6297294
total <- tapply(z$total, z$total.imputed, sum)
# FALSE TRUE
# 65615.00 47711.69
total[2] / sum(total)
# TRUE
# 0.4210102
tapply(z$total, z$total.imputed, sum)
# FALSE TRUE
# 65615.00 47711.69
# overtime ratio (vacataires above 96h ETD / total)
sum(z$sup96h) / sum(z$total)
# ratio by academy
academies <- data.frame(tapply(z$total, z$academie, sum), tapply(z$sup96h, z$academie, sum))
academies$ratio <- academies[, 2] / academies[, 1]
academies[rev(order(academies$ratio)), 1:3]
# mean number of vacataires per establishment:
sum(total) / nrow(z)
# [1] 786.9909
sum(sup96h) / nrow(z)
# [1] 184.0433
sum(sup96h) / sum(total)
# [1] 0.233857
## MAP
ggplot(map_data("france")) +
geom_polygon(aes(x = long, y = lat, group = group),
fill = "grey95", colour = "grey50") +
geom_point(data = z[z$total > 0, ],
aes(x = lon, y = lat, size = total), colour = "grey10", alpha = .5) +
geom_point(data = z[z$sup96h > 0, ],
aes(x = lon, y = lat, size = sup96h), colour = brewer.pal(3, "Set1")[1],
alpha = .5) +
scale_size_area("Vacataires\n", max_size = 12) +
labs(y = NULL, x = NULL, title = NULL) + # "Vacations en-dessous et au-dessus de 96h ETD")
theme_bw() + theme(
text = element_text(family = "Gill Sans MT", size = 12),
panel.border = element_rect(color = "white"),
axis.text = element_blank(),
axis.ticks = element_blank(),
panel.grid = element_blank(),
legend.text = element_text(face = "bold", size = 12),
legend.key = element_rect(colour = "white")
)
ggsave("stats-esr-vac.svg", width = 6, height = 4)
Etablissement total inf96h sup96h truezero PCNP ETECA
AIXMARSEILLE 1 2060 1910 150 14.5 1126
AIXMARSEILLE 2 0 0 0 14.4 669
AIXMARSEILLE 3 0 0 0 16.5 686
AVIGNON 368 307 61 18 275
AIX IEP 0 0 0 22.6 36
MARSEILLE ECOLE CENTRALE 0 0 0 12.7 73
Academie dAIX 2428 2217 211 15.4 2865
AMIENS 112 7 105 12.2 910
COMPIEGNE UT 0 0 0 18.2 193
Academie dAMIENS 112 7 105 13.4 1103
ANTILLESGUYANE 0 0 0 18 351
GUADELOUPE IUFM 0 28
MARTINIQUE IUFM 40 39 1 3.2 30
GUYANE IUFM 0 23
Academie des ANTILLES 40 39 1 15.3 432
BESANCON 1637 1596 41 11.4 1002
BELFORTMONTBELIARD UT 165 148 17 12.8 132
BESANCON ENS MECA 0 0 0 12.5 57
Academie de BESANCON 1802 1744 58 11.6 1191
BORDEAUX 1 505 485 20 11.1 760
BORDEAUX 2 414 413 1 9.7 400
BORDEAUX 3 931 824 107 19.7 471
BORDEAUX 4 695 674 21 12.6 482
PAU 916 836 80 13.5 581
BORDEAUX IEP 228 204 24 20.6 40
BORDEAUX ENSEIR 0 17.6 63
BORDEAUX ENSCP 135 135 2.7 36
Academie de BORDEAUX 3824 3571 253 13.4 2833
CAEN 873 795 78 12.6 1085
CAEN ISMRA 177 162 15 11.8 60
Academie de CAEN 1050 957 93 12.6 1145
CLERMONT 1 756 717 39 16.6 373
CLERMONT 2 1571 1475 96 12.4 850
CLERMONT ENSC 0 0 0 5.8 31
CLERMONT IFMA 51 36 15 19.8 42
Academie de CLERMONT 2378 2228 150 12.3 1296
CORTE 0 0 0 12.3 222
Academie de CORSE 0 0 0 8.7 222
PARIS 8 0 0 0 17.7 703
PARIS 12 391 343 48 13.7 1004
PARIS 13 1206 1088 118 17.7 744
MARNE LA VALLEE 0 0 0 17.7 349
CACHAN ENS 0 0 0 19.7 191
NOISY LE GD ENS LOUIS LUMIERE 65 53 12 16.4 10
ST OUEN SUP MECA 99 87 12 19.5 47
Academie de CRETEIL 1761 1571 190 26.4 3048
DIJON 2346 2211 135 19.7 1096
Academie de DIJON 2346 2211 135 73.7 1096
GRENOBLE 1 59 59 9.8 1113
GRENOBLE 2 0 0 0 17 572
GRENOBLE 3 245 221 24 19.2 280
GRENOBLE INP 0 0 0 15 381
CHAMBERY 1041 958 83 14 528
GRENOBLE IEP 0 0 0 21 47
Academie de GRENOBLE 1345 1238 107 13.9 2921
LA REUNION 778 667 111 14 400
Academie de LA REUNION 778 667 111 14 400
LILLE 1 1243 1179 64 13.8 1180
LILLE 2 0 0 0 19.6 427
LILLE 3 992 938 54 22.8 595
ARTOIS 779 696 83 9.8 627
LITTORAL 289 269 20 13.6 421
VALENCIENNES 757 648 109 15.4 523
LILLE ECOLE CENTRALE 172 165 7 15.5 94
ROUBAIX ENSAIT 0 0 0 26.2 28
LILLE IEP 0 0 0 36.1 21
LILLE ENS CHIMIE 0 8.2 45
Academie de LILLE 4232 3895 337 15.8 3961
LIMOGES 768 748 20 12.6 637
LIMOGES ENSCI 41 38 3 4 24
Academie de LIMOGES 809 786 23 12.3 661
LYON 1 0 0 0 1 11 1328
LYON 2 1935 1618 317 20.4 631
LYON 3 2653 453 2200 26.8 421
ST ETIENNE 1231 1148 83 14.5 539
LYON IEP 0 0 0 1 20 45
LYON ENS 0 0 0 1 30.7 106
LYON ENS LETTRES 33 33 29.8 95
LYON ENSSIB 0 0 0 1 14.3 9
LYON ECOLE CENTRALE 32 23 9 16.6 122
LYON ENSATT 0 0 0 1 87.2 7
LYON INSA 0 0 0 1 13.6 480
ST ETIENNE ENI 49 45 4 6.5 54
Academie de LYON 5933 3320 2613 17.7 3837
MONTPELLIER 1 781 707 74 15.4 439
MONTPELLIER 2 1632 1425 207 10.6 943
MONTPELLIER 3 0 0 0 17.1 479
NIMES CUFR 658 557 101 22.7 51
PERPIGNAN 0 0 0 17.3 339
MONTPELLIER ENS CHIMIE 0 18.4 41
Academie de MONTPELLIER 3071 2689 382 14.4 2292
NANCY 1 1701 1674 27 9.1 980
NANCY 2 1341 1257 84 16.6 556
NANCY INP 802 767 35 12.4 327
METZ 998 934 64 11.4 644
METZ ENI 93 80 13 13.6 64
Academie de NANCYMETZ 4935 4712 223 11.9 2571
NANTES 1383 1281 102 12.7 1334
LE MANS 302 290 12 13.5 443
ANGERS 2275 2112 163 1 14.7 579
NANTES ECOLE CENTRALE 0 0 0 18.2 104
Academie de NANTES 3960 3683 277 13.6 2460
NICE 0 0 0 14.7 1024
TOULON 0 0 0 16.4 393
NICE OBSERVATOIRE 0 33
Academie de NICE 0 0 0 14.9 1450
ORLEANS 1313 1228 85 11.6 861
TOURS 1715 1610 105 15.4 842
BOURGES ENSI 193 174 19 16 27
BLOIS ENS PAYSAGE 27 19 8 61.3 6
BLOIS ENIVL 69 62 7 14.8 23
Academie dORLEANSTOURS 3317 3093 224 14 1759
POLYNESIE 109 100 9 7.4 71
NOUVELLE CALEDONIE 352 332 20 8.5 68
PACIFIQUE IUFM 0 3.2 30
Academie de POLYNESIE 461 432 29 7.1 169
PARIS 1 0 0 0 23.2 809
PARIS 2 382 368 14 33.2 280
PARIS 3 0 0 0 24.2 433
PARIS 4 172 169 3 24.5 726
PARIS 5 663 614 49 14 750
PARIS 6 0 0 0 17 1438
PARIS 7 0 0 0 18.2 961
PARIS DAUPHINE 663 583 80 23 311
PARIS INALCO 0 0 0 29.9 217
PARIS ENSAM 0 0 0 13.6 354
PARIS EPHE 0 0 0 9 190
PARIS COLLEGE DE FRANCE 0 0 0 37.8 76
PARIS IPG 0 0 0 17.2 50
PARIS ENS CHIMIE 39 34 5 5.6 51
PARIS CNAM 0 0 0 15.8 356
PARIS EHESS 0 0 0 18.7 192
PARIS OBSERVATOIRE 0 0 0 7.5 94
PARIS ENS 0 0 0 18.9 155
PARIS MUSEUM 0 0 0 7.7 229
PARIS EC NAT CHARTES 50 50 12.2 15
PARIS IAE 50 50 15.3 25
PARIS IEP 0 0 0 36.8 69
PARIS PALAIS DECOUV 0 64.7 4
PARIS INRP 0 11
PARIS ECFRSE EXTREME ORIENT 0 39
PARIS INSTITUT DE FRANCE 0 13
Academie de PARIS 2019 1868 151 19.8 7848
POITIERS 1701 1510 191 12.8 1154
LA ROCHELLE 507 478 29 15.5 330
POITIERS ENSMA 0 0 0 1 7.3 57
Academie de POITIERS 2208 1988 220 13.2 1541
REIMS 108 53 55 10.9 1097
TROYES UT 0 0 0 1 22.6 111
Academie de REIMS 108 53 55 12 1208
RENNES 1 1222 1186 36 12.8 1196
RENNES 2 0 0 0 18.9 653
BREST 1186 1139 47 6.3 890
BRETAGNE SUD 241 217 24 13.6 443
RENNES IEP 2944 480 2464 21.9 37
RENNES INSA 247 223 24 16.8 159
RENNES ENS CHIMIE 0 0 0 21.9 43
BREST ENI 41 34 7 17 63
Academie de RENNES 5881 3279 2602 20.5 3487
ROUEN 0 0 0 5.9 1120
LE HAVRE 0 0 0 19.1 461
ROUEN INSA 0 0 0 1 20.8 133
Academie de ROUEN 0 0 0 9.9 1713
STRASBOURG 3087 2618 469 11.6 1919
MULHOUSE 763 720 43 30 506
STRASBOURG INSA 0 0 0 11.1 102
Academie de STRASBOURG 3850 3338 512 12.3 2527
TOULOUSE 1 0 0 0 14.7 460
TOULOUSE 2 1118 941 177 13.2 1074
TOULOUSE 3 1967 1836 131 13.6 1637
TOULOUSE INP 0 0 0 17.8 343
ALBI CUFR 533 480 53 11.8 80
TOULOUSE IEP 149 135 14 14.6 42
TOULOUSE INSA 0 0 0 16.4 235
TARBES ENI 40 34 6 12.8 79
Academie de TOULOUSE 3807 3426 381 13.9 3950
PARIS 10 731 657 74 15.6 1156
PARIS 11 1034 947 87 23.6 1699
EVRY 0 0 0 13.9 490
CERGYPONTOISE 841 710 131 10.7 825
VERSAILLES STQUENTIN 1068 990 78 12.9 572
CERGY ENSEA 112 33 79 13.8 60
PARIS ECOLE CENTRALE 883 828 55 30.5 82
EVRY ENSIEE 97 90 7 14.1 27
SURESNES INSHEA 0 26
Academie de VERSAILLES 4766 4255 511 18.8 4937