Color of Cars
The data I downloaded contains number of cars by color, build year and reference date. Unfortunately it is all in Dutch, but I did translate all relevant parts. The data actually contains 14 colors, including 'other' but some of the colors were so infrequent, it made all confusing. So I added some colors to other. To plot what is sold in a certain year, I took reference date 1 January, the year after building.
The data shown grey and black as most sold, white and brown do get more popular, blue and green get less popular. There is also a little dent in 2009 (crisis) and a new decrease in 2012. New tells us 2013 is worse.
Fuel
There are three kinds of frequently used fuels; gasoline (Benzine), diesel and LPG (liquid propane gas). Gasoline is the standard, most expensive fuel, lowest in road tax, diesel is cheaper to run, but more expensive in tax, LPG even more so. It boils down to this, Diesel and LPG if you drive a lot, Gasoline if you drive a bit. It is possible to convert from Gasoline to LPG, which may explain the big increases in number of LPG for older cars.
Weight
Weight is also interesting. It is tax categories, these I translated into continuous values by taking the lower end of the brackets. The first bracket is 0 to 450 kg, then 100 kg increases. Beats the hell out of me what cars are under 650 kg, but there is something. Obviously most cars are between about 700 and 1800 kg. There is also some shift at 750 kg, I you have a good color discrimination you might see what exactly. My eyes won't so the next plot shows that cars 651 to 750 kg are getting less frequent, 751 to 850 much more frequent. There are tax advantages to less fuel consumption, and the economy can only strengthen that. Somewhat heavier cars suffer from a decrease. Above 1500 kg it seems the market works differently.
The final plots show number of heaver cars are most decreasing when older. Which makes sense, they are still pretty expensive to run in terms of fuel and tax. The final plot I chose because you can see the big increase in love for small cars again, it would seem old small cars don't disappear or their number even increases, while the bigger cars, the 1151 to 1250 bracket has prominent decreases from 7 years old.
R code.
library(ggplot2)col1 <- read.csv2('Motorvoertuigen__per_260513101348.csv',na.strings='-')
col2 <- col1[!is.na(col1$Waarde),]
col2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(col2$Bouwjaren)))
col2$RefYear <- as.numeric(sub(', 1 januari','',as.character(col2$Peildatum)))
col2$Colour <- factor( c("Beige", "Blue", "Brown", "Other", "Yellow",
"Grey", "Green", "Other", "Other",
"Other", "Red", "Other", "White" ,"Black")[col2$Onderwerpen_2])
col3 <- aggregate(col2$Waarde,list(Colour=col2$Colour,BuildYear=col2$BuildYear,
RefYear=col2$RefYear),sum)
col4 <- col3[col3$RefYear==col3$BuildYear+1,]
colourcode <- c("#C8AD7F", "Black","Blue", "Brown", "Green" ,
"Grey" , "Purple" , "Red","White" , "Yellow")
png('col1.png')
p <- ggplot(col4, aes(x=BuildYear, y=x, colour=Colour))
p + geom_line() +
scale_colour_manual(values=colourcode) +
scale_y_log10("Numer of vehicles") +
scale_x_continuous(breaks=seq(2000,2012,2))
dev.off()
##############
lastyeardata <- col3[,c('x','BuildYear','Colour','RefYear')]
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='x'] <- 'LastYearAmount'
change <- merge(x=col3,y=lastyeardata)
change$Pchange <- with(change,100*(x-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('col2.png')
p <- ggplot(change[change$BuildYear<2010,], aes(x=Age, y=Pchange, colour=Colour))
p + geom_line() +
scale_colour_manual(values=colourcode) +
scale_y_continuous("Numer of vehicles") +
facet_wrap(~BuildYear,nrow=2)
dev.off()
############
fuel1 <- read.csv2('Motorvoertuigen__per_010613135050.csv',na.strings='-')
fuel2 <- fuel1[!is.na(fuel1$Waarde),]
fuel2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(fuel2$Bouwjaren)))
fuel2$RefYear <- as.numeric(sub(', 1 januari','',as.character(fuel2$Peildatum)))
fuel4 <- fuel2[fuel2$RefYear==fuel2$BuildYear+1,]
png('fuel1.png')
p <- ggplot(fuel4, aes(x=BuildYear, y=Waarde, colour=Onderwerpen_2))
p + geom_line() +
scale_y_continuous("Numer of vehicles") +
scale_x_continuous(breaks=seq(2000,2012,2)) +
labs(colour="Fuel")
dev.off()
##
lastyeardata <- fuel2[,c('Waarde','BuildYear','Onderwerpen_2','RefYear')]
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='Waarde'] <- 'LastYearAmount'
change <- merge(x=fuel2,y=lastyeardata)
change$Pchange <- with(change,100*(Waarde-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('fuel2.png')
p <- ggplot(change[change$BuildYear<2010,],
aes(x=Age, y=Pchange, colour=Onderwerpen_2))
p + geom_line() +
scale_y_continuous("Numer of vehicles") +
facet_wrap(~BuildYear,nrow=2) +
labs(colour="Fuel")
dev.off()
##############
weight1 <- read.csv2('Motorvoertuigen__per_010613140907.csv',na.strings='-')
weight2 <- weight1[!is.na(weight1$Waarde),]
weight2$BuildYear <- as.numeric(sub('Bouwjaar ','',as.character(weight2$Bouwjaren)))
weight2$RefYear <- as.numeric(sub(', 1 januari','',as.character(weight2$Peildatum)))
weightcats <- levels(weight2$Onderwerpen_2)
weightcats <- gsub('en meer','and more',weightcats)
levels(weight2$Onderwerpen_2) <- weightcats
lweightcats <- as.numeric(gsub('( |-).*$','',weightcats))
weight2$lweight <- lweightcats[weight2$Onderwerpen_2]
weightcats <- weightcats[order(lweightcats)]
weight2$WeightCat <- factor(weight2$Onderwerpen_2,levels=weightcats)
weight4 <- weight2[weight2$RefYear==weight2$BuildYear+1,]
png('weight1.png')
p <- ggplot(weight4, aes(x=lweight, y=Waarde, colour=factor(BuildYear)))
p + geom_line() +
scale_y_continuous("Numer of vehicles") +
labs(colour='Build Year')
dev.off()
png('weight2.png')
p <- ggplot(weight4[weight4$lweight>600& weight4$lweight<1800,], aes(x=BuildYear,y=Waarde))
p + geom_line() +
scale_y_continuous("Numer of vehicles") +
facet_wrap(~WeightCat)
dev.off()
##
lastyeardata <- weight2[,c('Waarde','BuildYear','Onderwerpen_2','RefYear')]
lastyeardata$RefYear <- lastyeardata$RefYear+1
colnames(lastyeardata)[colnames(lastyeardata)=='Waarde'] <- 'LastYearAmount'
change <- merge(x=weight2,y=lastyeardata)
change$Pchange <- with(change,100*(Waarde-LastYearAmount)/LastYearAmount)
change$Age <- change$RefYear-change$BuildYear
png('weight3.png')
p <- ggplot(change[change$lweight>600& change$lweight<2200 & change$BuildYear<2010,]
, aes(x=Age, y=Pchange, colour=WeightCat))
p + geom_line() +
scale_y_continuous("% Chane in Numer of vehicles") +
facet_wrap(~BuildYear,nrow=2)
dev.off()
#
png('weight4.png')
p <- ggplot(change[change$lweight>600 & change$lweight<1200,],
aes(x=RefYear, y=Pchange, colour=factor(Age)))
p + geom_line() +
scale_y_continuous("% Change in Number of vehicles") +
facet_wrap(~WeightCat)
dev.off()
Nice set of plots! Great to get your hands on an interesting data set :)
ReplyDeleteAlso, you really need to kill the ads on your blog. Firefox on mac does a total redirect to very spammy looking ads and I have to click 'back' to get back to your blog, had to do that twice. And it is trying to open pop ups.
DeleteWhy would you use png() and dev.off() to save a png of your plots isntead of using ggsave?
ReplyDeleteggsave is a convenient function for saving a plot. It defaults to saving the last plot that you displayed, and for a default size uses the size of the current graphics device. It also guesses the type of graphics device from the extension. This means the only argument you need to supply is the filename.
DeleteFor me, actual on screen size varies quite a lot. Sometimes full screen, sometimes small next to a big editor window, sometimes bigger, next to a small sized editor and console. Moving to the blog, I want in a general a constant size. Png() gives a constant size, which I can have displayed 'actual size' (without deformations) in blogger.
The basic difference between leasing a car and buying it is that the former pays for the period the car is used, whereas the latter pays for the entire cost of the car.
ReplyDeletevehicle Leasing & Audi contract hire