Thursday, October 22, 2009

ISO week

I am working with a model that produces estimates of snow water equivalent through time. Because I deal with large spatial extents, I decided to have the model produce weekly averages. The problem with this is knowing which file to access for a given date. The snow files are saved using an ISO-8601 week number (I had no idea how many date "standards" there were, but this is a common one). I thought that getting the week number out of a date format in R would be easy, but I was wrong. The week-number output turns out to be platform specific and, as far as I could tell, there is no portable built-in way to calculate the ISO week. Thus the following function.

ISOweek<-function(date,format="%Y-%m-%d",return.val="weekofyear"){
##converts dates into "dayofyear" or "weekofyear", the latter providing the ISO-8601 week
##date should be a vector of class Date or a vector of formatted character strings
##format refers to the date form used if a vector of
## character strings is supplied

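##convert date to POSIXlt format
##Date objects go through a character string first so that they end up in
## the local time zone, the same one strptime() uses for 4 January below
if(inherits(date,"Date")){
date=format(date,"%Y-%m-%d")
format="%Y-%m-%d"
}
if(is.character(date)){
date=as.POSIXlt(date,format=format)
}
##inherits() works regardless of the order of the class vector
if(!inherits(date,"POSIXt")){
stop("Date is of wrong format.")
}else if(inherits(date,"POSIXct")){
date=as.POSIXlt(date)
}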


if(return.val=="dayofyear"){
##add 1 because POSIXt is base zero
return(date$yday+1)
}else if(return.val=="weekofyear"){
##Based on the ISO8601 weekdate system,
## Monday is the first day of the week
## W01 is the week with 4 Jan in it.
year=1900+date$year
jan4=strptime(paste(year,1,4,sep="-"),format="%Y-%m-%d")
wday=jan4$wday

wday[wday==0]=7 ##convert to base 1, where Monday == 1, Sunday==7

##calculate the date of the Monday that starts week 1 of the year
weekstart=jan4-(wday-1)*86400
weeknum=ceiling(as.numeric((difftime(date,weekstart,units="days")+0.1)/7))

#########################################################################
##calculate week for days of the year occurring in the next year's week 1.
#########################################################################
mday=date$mday
wday=date$wday
wday[wday==0]=7
year=ifelse(weeknum==53 & mday-wday>=28,year+1,year)
weeknum=ifelse(weeknum==53 & mday-wday>=28,1,weeknum)

################################################################
##calculate week for days of the year occurring prior to week 1.
################################################################

##first calculate the number of weeks in the previous year
year.shift=year-1
jan4.shift=strptime(paste(year.shift,1,4,sep="-"),format="%Y-%m-%d")
wday=jan4.shift$wday
wday[wday==0]=7 ##convert to base 1, where Monday == 1, Sunday==7
weekstart=jan4.shift-(wday-1)*86400
weeknum.shift=ceiling(as.numeric((difftime(date,weekstart,units="days")+0.1)/7))

##update year and week
year=ifelse(weeknum==0,year.shift,year)
weeknum=ifelse(weeknum==0,weeknum.shift,weeknum)

return(list("year"=year,"weeknum"=weeknum))
}else{
stop("Unknown return.val")
}
}




Of course, after I wrote this function, I found this thread in R-help. Gustaf Rydevik provided a function that is similar to mine (his function is also about twice as fast); however, on my computer it was giving incorrect years and week numbers for days in January that fall in the previous year's final week (e.g., 1 January 2010 should be week 53 of 2009). Reading through the rest of the thread indicates that the confusion involving week numbers is widespread. I guess the moral of the story is to pick a standard and stick to it.
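
The 1 January 2010 example can be cross-checked with a minimal sketch that leans on format() and the %G (ISO year) and %V (ISO week) conversion codes. It assumes the platform's strftime supports both codes on output, which is exactly the platform-specific part, so treat it as a sanity check rather than a portable solution.

```r
## Assumption: this R build supports %G and %V on output (see ?strptime);
## where it does not, the strings below may come back empty or wrong.
dates <- as.Date(c("2009-10-22", "2009-12-31", "2010-01-01", "2010-01-04"))

## ISO year and zero-padded ISO week, e.g. "2009-W53"
iso <- format(dates, "%G-W%V")

print(data.frame(date = dates, iso = iso))
```

Where the codes are supported, 1 January 2010 should come back as week 53 of 2009, and any disagreement with a hand-rolled function points to a bug in one or the other.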

Please let me know in the comments if you encounter any problems running this function (or if you can suggest ways to make it more efficient).
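
For anyone looking for a faster route, one option is the rule that a date's ISO year and week are those of the Thursday in the same Monday-to-Sunday week. The sketch below is not the function above, and iso_week_thursday is a made-up name. It assumes inputs that as.Date() can parse, and it avoids strptime() entirely.

```r
## Sketch: ISO year and week taken from the Thursday of the same ISO week.
## iso_week_thursday is a hypothetical helper name, not part of base R.
iso_week_thursday <- function(date) {
  d <- as.Date(date)
  wday <- as.POSIXlt(d)$wday              # 0 = Sunday ... 6 = Saturday
  wday[wday == 0] <- 7                    # Monday == 1, Sunday == 7
  thursday <- as.POSIXlt(d + (4 - wday))  # Thursday of the same week
  list(year = 1900 + thursday$year,       # ISO year is the Thursday's year
       weeknum = thursday$yday %/% 7 + 1) # yday is zero-based
}

print(iso_week_thursday(c("2009-10-22", "2010-01-01", "2010-01-04")))
```

Because everything is vector arithmetic on Date values, there is no per-year call to build 4 January, and the year-boundary cases need no special handling.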

7 comments:

bbolker said...

2 brief comments:

(1) it really wants POSIXlt (rather than POSIXct -- just any old "POSIXt" object won't do)

(2) from profiling, it looks like the strptime() call is taking most of the time. I don't know if you can combine the two strptime() calls, or whether that will help much, or what the alternative code does to avoid strptime() calls. (It took me about 20 seconds to convert 10^5 date values. In the big picture, is that worth worrying about?)

Forester said...

Hi Ben. Thanks for your comments.

(1) You are right -- I also ran into the POSIXct problem but forgot to update my posted code. Now I have edited the script so that it works (by converting "ct" to "lt") -- it is a crude fix, but it will do until I figure out a better way to handle all the different time formats.

(2) In this case the processing time is not a big deal, but I am always interested in learning more efficient coding methods that I can translate to other, more time sensitive, processes. Converting dates is not the bottleneck of my current project, so it is not worth spending a lot of time on; however, I will play around with the strptime() calls when I get a chance.

Ajay said...

Thanks for this - it works well for me. There is a base function called ISOdate so I wonder if you would call it something like ISOweek?

Forester said...

Good idea Ajay.

Fernando said...

A useful function is match.arg, so that you can replace

return.val=c("week","day") in the arguments

and inside the function

return.val <- match.arg(return.val)

Also, you can replace

print("some error")
break

with

stop("Some error")

I enjoyed your posts.

Thanks!

Fernando
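
The two suggestions in this comment can be sketched together in a small stand-alone function. day_or_week is a hypothetical name used only for illustration, and its "week" branch is a plain 7-day count from 1 January, not the ISO week.

```r
## Hypothetical example of match.arg() and stop(); not the ISOweek function.
day_or_week <- function(date, return.val = c("week", "day")) {
  ## match.arg() picks the first choice by default and raises an error
  ## for anything that is not one of the listed choices
  return.val <- match.arg(return.val)

  ## stop() signals an error and halts the function in one step
  if (!inherits(date, c("Date", "POSIXt"))) {
    stop("date must be a Date or POSIXt object")
  }

  lt <- as.POSIXlt(date)
  if (return.val == "day") {
    lt$yday + 1          # day of year, one-based
  } else {
    lt$yday %/% 7 + 1    # simple 7-day blocks from 1 January (NOT ISO)
  }
}

print(day_or_week(as.Date("2009-10-22"), "day"))
```

Calling it with an unknown value such as return.val="month" fails inside match.arg() before any date handling runs, so the explicit "Unknown return.val" branch is no longer needed.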

G. said...

Hi,
Interesting to see that someone has found my code! And interesting to find that coding errors come back to haunt you - you're right of course - the code I posted in Dec 2008 was a bit buggy - I discovered this myself when 2010 came around. I'll include my corrected version in the next comment.

Best Regards,
Gustaf

G. said...

## Inputs a date object, POSIX object, or 3 numbers
## and gives back the ISO week.
## By Gustaf Rydevik, revised 2010


getweek<-function(Y,M=NULL,D=NULL){

## Build a POSIXlt date from three numbers, or split a date object into Y, M, D.
## inherits() works regardless of the order of the class vector.
if(!inherits(Y,c("Date","POSIXt"))){
date.posix<-strptime(paste(Y,M,D,sep="-"),"%Y-%m-%d")
}else{
date.posix<-as.POSIXlt(Y)
Y<-as.numeric(format(date.posix,"%Y"))
M<-as.numeric(format(date.posix,"%m"))
D<-as.numeric(format(date.posix,"%d"))
}


LY<- (Y%%4==0 & !(Y%%100==0))|(Y%%400==0)
LY.prev<- ((Y-1)%%4==0 & !((Y-1)%%100==0))|((Y-1)%%400==0)
date.yday<-date.posix$yday+1
jan1.wday<-strptime(paste(Y,"01-01",sep="-"),"%Y-%m-%d")$wday
jan1.wday<-ifelse(jan1.wday==0,7,jan1.wday)
date.wday<-date.posix$wday
date.wday<-ifelse(date.wday==0,7,date.wday)


####If the date is in the
####beginning, or end of the year,
### does it fall into a week of
###the previous or next year?

Yn<-ifelse(date.yday<=(8-jan1.wday)&jan1.wday>4,Y-1,
ifelse(((365+LY-date.yday)<(4-date.wday)),Y+1,Y))

##Set the week differently if
##the date is in the
##beginning,middle or end of the
##year

Wn<-ifelse(
Yn==Y-1,
ifelse((jan1.wday==5|(jan1.wday==6 &LY.prev)),53,52),
ifelse(Yn==Y+1,1,(date.yday+(7-date.wday)+(jan1.wday-1))/7-(jan1.wday>4))
)

return(list(Year=Yn,ISOWeek=Wn))
}