Data
The most important things to reiterate from previous post is that the data is from KNMI and they come with a comment: "These time series are inhomogeneous because of station relocations and changes in observation techniques. As a result, these series are not suitable for trend analysis. For climate change studies we refer to the homogenized series of monthly temperatures of De Bilt or the Central Netherlands Temperature".Data reading has slighlty changed, mostly because I needed different variables. In addition, for testing I wanted some categorical variables, these are Month and year. For year I have chosen five chunks of 22 years, 22 was chosen since it seemed large enough and resulted in approximately equal size chunks. Finally, for display purposes, wind direction was categorized in 8 directions according to the compass rose (North, North-East, East etc.).
library(circular)
library(dplyr)
library(ggplot2)
library(WRS2)
r1 <- readLines('etmgeg_235.txt')
r2 <- r1[grep('^#',r1):length(r1)]
explain <- r1[1:(grep('^#',r1)-1)]
explain
r2 <- gsub('#','',r2)
r3 <- read.csv(text=r2)
r4 <- mutate(r3,
Date=as.Date(format(YYYYMMDD),format='%Y%m%d'),
year=floor(YYYYMMDD/1e4),
rDDVEC=as.circular(DDVEC,units='degrees',template='geographics'),
# Vector mean wind direction in degrees
# (360=north, 90=east, 180=south, 270=west, 0=calm/variable)
DDVECf=as.character(cut(DDVEC,breaks=c(0,seq(15,330,45),361),left=TRUE,
labels=c('N','NE','E','SE','S','SW','W','NW','N2'))),
DDVECf=ifelse(DDVECf=='N2','N',DDVECf),
DDVECf=factor(DDVECf,levels=c('N','NE','E','SE','S','SW','W','NW')),
rFHVEC=FHVEC/10, # Vector mean windspeed (in 0.1 m/s)
yearf=cut(year,seq(1905,2015,22),labels=c('05','27','49','71','93')),
month=factor(format(Date,'%B'),levels=month.name),
tcat=interaction(month,yearf)
) %>%
select(.,YYYYMMDD,Date,year,month,DDVEC,rDDVEC,DDVECf,rFHVEC,yearf,tcat)
Analysis
The circular package comes with an aov.circular() function, which can do one way analysis. Since I am a firm believer that direction varies according to the seasons, the presence of a time effect (the five categories) has been examined by Month. To make result compact, only p-values are displayed, they are all significant.sapply(month.name,function(x) {
aa <- filter(r4,month==x)
bb <- aov.circular(aa$rDDVEC,aa$yearf,method='F.test')
format.pval(bb$p.value,digits=4,eps=1e-5)
}) %>% as.data.frame
January 4.633e-05
February < 1e-05
March < 1e-05
April < 1e-05
May < 1e-05
June 0.00121
July 0.000726
August 0.0001453
September 0.02316
October < 1e-05
November 0.0001511
December 0.003236
The associated plot with this data shows frequency of directions by year and Month. The advantage here being that the time axis is the x-axis, so changes are more easily visible.
ggplot(r4[complete.cases(r4),], aes(x=yearf))+
geom_histogram()+
facet_grid(DDVECf ~ month)+
ggtitle('Frequency of Wind Direction')
t2way(rFHVEC ~ yearf + month + yearf:month,
data = r4)
value p.value
yearf 1063.0473 0.001
month 767.5687 0.001
yearf:month 169.4807 0.001
Conclusion
The data seems to show a change in wind measurements over these 110 years. This can be due to changes in wind or measurement instrument or instrument location. The statistical testing was chosen such as to counter some effects of these changes, hence it can be thought that the change is due to changes in wind itself.
Is there no information available to when instruments were changed? Might be worthwile to plug that info into the ANOVA...
ReplyDeleteCheers,
Andrej