Sunday, June 2, 2013

Cars in Netherlands

I am looking for a new car. So when I saw there was an update on vehicles in Statistics Netherlands I just had to go and look at the data. So, I learned the brown is getting more popular, often the number of cars from a certain construction year is larger at six years of age than five years of age and lighter cars get more popular, especially in these later years in the crisis.

Color of Cars

The data I downloaded contains number of cars by color, build year and reference date. Unfortunately it is all in Dutch, but I did translate all relevant parts. The data actually contains 14 colors, including 'other' but some of the colors were so infrequent, it made all confusing. So I added some colors to other. To plot what is sold in a certain year, I took reference date 1 January, the year after building.
The data shown grey and black as most sold, white and brown do get more popular, blue and green get less popular. There is also a little dent in 2009 (crisis) and a new decrease in 2012. New tells us 2013 is worse.

% change in number of cars is interesting. I was expecting black cars maybe get more in accidents or something, because they are less visible. What I did see is a marked decrease in white cars, especially in the beginning of the previous decade. And there is a marked increase in cars of say 5 or 6 years old. Finally, at say ten years of age the cars start disappearing. The marked increase at 5 or 6 years may be explained by importing older cars. I sometimes read or hear they are bought in Germany. The Dutch tax on imported cars is fairly high, so it is interesting to import a cheaper second hand, which has significantly less tax. For example; Athlon car lease Germany has a Dutch language site especially for this purpose.

Fuel

There are three kinds of frequently used fuels; gasoline (Benzine), diesel and LPG (liquid propane gas). Gasoline is the standard, most expensive fuel, lowest in road tax, diesel is cheaper to run, but more expensive in tax, LPG even more so. It boils down to this, Diesel and LPG if you drive a lot, Gasoline if you drive a bit. It is possible to convert from Gasoline to LPG, which may explain the big increases in number of LPG for older cars.

Weight

Weight is also interesting. It is tax categories, these I translated into continuous values by taking the lower end of the brackets. The first bracket is 0 to 450 kg, then 100 kg increases. Beats the hell out of me what cars are under 650 kg, but there is something. Obviously most cars are between about 700 and 1800 kg. There is also some shift at 750 kg, I you have a good color discrimination you might see what exactly. My eyes won't so the next plot shows that cars 651 to 750 kg are getting less frequent, 751 to 850 much more frequent. There are tax advantages to less fuel consumption, and the economy can only strengthen that. Somewhat heavier cars suffer from a decrease. Above 1500 kg it seems the market works differently.

The final plots show number of  heaver cars are most decreasing when older. Which makes sense, they are still pretty expensive to run in terms of fuel and tax. The final plot I chose because you can see the big increase in love for small cars again, it would seem old small cars don't disappear or their number even increases, while the bigger cars, the 1151 to 1250 bracket has prominent decreases from 7 years old. 

R code.

library(ggplot2)      
col1 <- read.csv2('Motorvoertuigen__per_260513101348.csv',na.strings='-')
col2 <- col1[!is.na(col1$Waarde),]
col2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(col2$Bouwjaren)))
col2$RefYear <- as.numeric(sub(', 1 januari','',as.character(col2$Peildatum)))
col2$Colour <- factor( c("Beige", "Blue", "Brown", "Other", "Yellow", 
        "Grey", "Green", "Other", "Other",
        "Other", "Red", "Other", "White" ,"Black")[col2$Onderwerpen_2])
col3 <- aggregate(col2$Waarde,list(Colour=col2$Colour,BuildYear=col2$BuildYear,
        RefYear=col2$RefYear),sum)
col4 <- col3[col3$RefYear==col3$BuildYear+1,]
colourcode <-  c("#C8AD7F", "Black","Blue", "Brown", "Green" ,
    "Grey" ,  "Purple" , "Red","White" , "Yellow")
png('col1.png')
p <- ggplot(col4, aes(x=BuildYear, y=x, colour=Colour))
p + geom_line() + 
    scale_colour_manual(values=colourcode) +
    scale_y_log10("Numer of vehicles") +
    scale_x_continuous(breaks=seq(2000,2012,2))
dev.off()

##############

lastyeardata <- col3[,c('x','BuildYear','Colour','RefYear')] 
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='x'] <- 'LastYearAmount'
change <- merge(x=col3,y=lastyeardata)
change$Pchange <- with(change,100*(x-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('col2.png')
p <- ggplot(change[change$BuildYear<2010,], aes(x=Age, y=Pchange, colour=Colour))
p + geom_line() + 
    scale_colour_manual(values=colourcode) +
    scale_y_continuous("Numer of vehicles") +
    facet_wrap(~BuildYear,nrow=2)
dev.off()

############
fuel1 <- read.csv2('Motorvoertuigen__per_010613135050.csv',na.strings='-')
fuel2 <- fuel1[!is.na(fuel1$Waarde),]
fuel2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(fuel2$Bouwjaren)))
fuel2$RefYear <- as.numeric(sub(', 1 januari','',as.character(fuel2$Peildatum)))
fuel4 <- fuel2[fuel2$RefYear==fuel2$BuildYear+1,]
png('fuel1.png')
p <- ggplot(fuel4, aes(x=BuildYear, y=Waarde, colour=Onderwerpen_2))
p + geom_line() + 
    scale_y_continuous("Numer of vehicles") +
    scale_x_continuous(breaks=seq(2000,2012,2)) +
    labs(colour="Fuel")
dev.off()
##

lastyeardata <- fuel2[,c('Waarde','BuildYear','Onderwerpen_2','RefYear')] 
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='Waarde'] <- 'LastYearAmount'
change <- merge(x=fuel2,y=lastyeardata)
change$Pchange <- with(change,100*(Waarde-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('fuel2.png')
p <- ggplot(change[change$BuildYear<2010,], 
    aes(x=Age, y=Pchange, colour=Onderwerpen_2))
p + geom_line() + 
    scale_y_continuous("Numer of vehicles") +
    facet_wrap(~BuildYear,nrow=2) +
    labs(colour="Fuel")
dev.off()

##############

weight1 <- read.csv2('Motorvoertuigen__per_010613140907.csv',na.strings='-')
weight2 <- weight1[!is.na(weight1$Waarde),]
weight2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(weight2$Bouwjaren)))
weight2$RefYear <- as.numeric(sub(', 1 januari','',as.character(weight2$Peildatum)))
weightcats <- levels(weight2$Onderwerpen_2)
weightcats <- gsub('en meer','and more',weightcats)
levels(weight2$Onderwerpen_2) <- weightcats 
lweightcats <- as.numeric(gsub('( |-).*$','',weightcats))
weight2$lweight <- lweightcats[weight2$Onderwerpen_2]
weightcats <- weightcats[order(lweightcats)]
weight2$WeightCat <- factor(weight2$Onderwerpen_2,levels=weightcats)
weight4 <- weight2[weight2$RefYear==weight2$BuildYear+1,]
png('weight1.png')
p <- ggplot(weight4, aes(x=lweight, y=Waarde, colour=factor(BuildYear)))
p + geom_line() + 
    scale_y_continuous("Numer of vehicles") +
    labs(colour='Build Year')
dev.off()

png('weight2.png')
p <- ggplot(weight4[weight4$lweight>600& weight4$lweight<1800,], aes(x=BuildYear,y=Waarde))
p + geom_line() + 
    scale_y_continuous("Numer of vehicles") + 
    facet_wrap(~WeightCat)
dev.off()
##

lastyeardata <- weight2[,c('Waarde','BuildYear','Onderwerpen_2','RefYear')] 
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='Waarde'] <- 'LastYearAmount'
change <- merge(x=weight2,y=lastyeardata)
change$Pchange <- with(change,100*(Waarde-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('weight3.png')
p <- ggplot(change[change$lweight>600& change$lweight<2200 & change$BuildYear<2010,]
    , aes(x=Age, y=Pchange, colour=WeightCat))
p + geom_line() + 
    scale_y_continuous("% Chane in Numer of vehicles") +
    facet_wrap(~BuildYear,nrow=2)
dev.off()

png('weight4.png')
p <- ggplot(change[change$lweight>600 & change$lweight<1200,],
    aes(x=RefYear, y=Pchange, colour=factor(Age)))
p + geom_line() + 
    scale_y_continuous("% Change in Number of vehicles") +
    facet_wrap(~WeightCat)
dev.off()

5 comments:

  1. Nice set of plots! Great to get your hands on an interesting data set :)

    ReplyDelete
    Replies
    1. Also, you really need to kill the ads on your blog. Firefox on mac does a total redirect to very spammy looking ads and I have to click 'back' to get back to your blog, had to do that twice. And it is trying to open pop ups.

      Delete
  2. Why would you use png() and dev.off() to save a png of your plots isntead of using ggsave?

    ReplyDelete
    Replies
    1. ggsave is a convenient function for saving a plot. It defaults to saving the last plot that you displayed, and for a default size uses the size of the current graphics device. It also guesses the type of graphics device from the extension. This means the only argument you need to supply is the filename.

      For me, actual on screen size varies quite a lot. Sometimes full screen, sometimes small next to a big editor window, sometimes bigger, next to a small sized editor and console. Moving to the blog, I want in a general a constant size. Png() gives a constant size, which I can have displayed 'actual size' (without deformations) in blogger.

      Delete
  3. The basic difference between leasing a car and buying it is that the former pays for the period the car is used, whereas the latter pays for the entire cost of the car.

    vehicle Leasing & Audi contract hire

    ReplyDelete