Sunday, February 1, 2015

Unemployment

Beginning 2013 I made a post plotting unemployment in Europe. Last year I did the same. Now that the unemployment numbers of December are on Eurostat, I am making them again. The plots shown are unemployment and its first derivative, both smoothed.
You can discuss these data extensively. Eurostat has a page for those who are interested. What I am personally interested in is a feeling for the trend where we are going in the future. In many places things are improving, for which youth unemployment is a good indicator. However, a number of changes have occurred in the last month(s). Cheap oil, dropping Euro prices, quantitative easing starting in the euro zone and ending in the USA, so I probably should not speculate.

Unemployment


First Derivative


 




Code

This code has bee lightly adapted from last years. This had mostly to do with a change in countries, the Euro Area has grown, and it seems last year I managed to get all indicator variables in columns, while this year the values were in three age columns.
library(ggplot2)
library(KernSmooth)
library(plyr)
library(scales) # to access breaks/formatting functions

r1 <- read.csv("une_rt_m.csv",na.strings=':',check.names=FALSE)
names(r1)[2] <- 'GEO'
r1 <- reshape(r1,
    varying=list(names(r1)[3:5]),
    v.names='Value',
    timevar='AGE',
    idvar=names(r1)[1:2],
    times=names(r1)[3:5],
    direction='long')
rownames(r1) <- 1:nrow(r1)
levels(r1$GEO) <- sub(' countries)',')' ,levels(r1$GEO),fixed=TRUE)
levels(r1$GEO) <- sub('European Union','EU' ,levels(r1$GEO))
levels(r1$GEO)[levels(r1$GEO)=='Euro area (EA11-2000, EA12-2006, EA13-2007, EA15-2008, EA16-2010, EA17-2013, EA18-2014, EA19)'] <- "Euro Area"
levels(r1$GEO)[levels(r1$GEO)=='United Kingdom'] <- 'UK'
levels(r1$GEO)[levels(r1$GEO)=='United States'] <- 'US'
levels(r1$GEO)[levels(r1$GEO)=='Germany (until 1990 former territory of the FRG)'] <- 'Germany'
levels(r1$GEO)
grep('12|13|15|16|17|18|25|27',x=levels(r1$GEO),value=TRUE)
r1 <- r1[!(r1$GEO %in% grep('12|13|15|16|17|18|25|27',x=levels(r1$GEO),value=TRUE)),]
r1$GEO <- factor(r1$GEO)
r1$Age <- factor(r1$AGE,levels=unique(r1$AGE))
r1$Date <- as.Date(paste(gsub('M','-',as.character(r1$TIME)),'-01',sep=''))

#
maxi <- aggregate(r1$Value,by=list(GEO=r1$GEO),FUN=max,na.rm=TRUE)
parts <- data.frame(
    low = maxi$GEO[maxi$x<quantile(maxi$x,1/3)]
    ,middle = maxi$GEO[maxi$x>quantile(maxi$x,1/3) & maxi$x<quantile(maxi$x,2/3)]
    ,high = maxi$GEO[maxi$x>quantile(maxi$x,2/3)]
)
#ggplot(r1[r1$GEO %in% low,],aes(x=Date,y=Value,colour=Age)) +
#        facet_wrap( ~ GEO, drop=TRUE) +
#        geom_line()  +
#        theme(legend.position = "bottom")
#        ylab('% Unemployment') + xlab('Year')


r1$class <- interaction(r1$GEO,r1$Age)
head(r1)
r3 <- r1[complete.cases(r1),]
r3$class <- factor(r3$class)
Perc <- ddply(.data=r3,.variables=.(class),
    function(piece,...) {
        lp <- locpoly(x=as.numeric(piece$Date),y=piece$Value,
            drv=0,bandwidth=90)
        sdf <- data.frame(Date=as.Date(lp$x,origin='1970-01-01'),
            sPerc=lp$y,Age=piece$Age[1],GEO=piece$GEO[1])}
    ,.inform=FALSE
)
for (i in c('low','middle','high')) {
    png(paste(i,'.png',sep=''))
    print(
        ggplot(Perc[Perc$GEO %in% parts[,i] ,],
                aes(x=Date,y=sPerc,colour=Age)) +
            facet_wrap( ~ GEO, drop=TRUE) +
            geom_line()  +
            theme(legend.position = "bottom")+
            ylab('% Unemployment') + xlab('Year') +
            scale_x_date(breaks = date_breaks("5 years"),
                labels = date_format("%y"))
    )
    dev.off()
}

dPerc <- ddply(.data=r3,.variables=.(class),
    function(piece,...) {
        lp <- locpoly(x=as.numeric(piece$Date),y=piece$Value,
            drv=1,bandwidth=365/2)
        sdf <- data.frame(Date=as.Date(lp$x,origin='1970-01-01'),
            dPerc=lp$y,Age=piece$Age[1],GEO=piece$GEO[1])}
    ,.inform=FALSE
)

for (i in c('low','middle','high')) {
    png(paste('d',i,'.png',sep=''))
    print(
        ggplot(dPerc[dPerc$GEO %in% parts[,i] ,],
                aes(x=Date,y=dPerc,colour=Age)) +
            facet_wrap( ~ GEO, drop=TRUE) +
            geom_line()  +
            theme(legend.position = "bottom")+
            ylab('Change in % Unemployment') + xlab('Year')+
            scale_x_date(breaks = date_breaks("5 years"),
                labels = date_format("%y"))
    )
    dev.off()
}


1 comment:

  1. Thank you for this nice work . I want to replicate your code in Stock Market sector based return analysis . and your csv file une_rt_m.csv is not present here. Can you post a sample of this data file?

    ReplyDelete