Unicode date formats, YYYY?!
Saturday, October 24th, 2015
Last year, I tweeted:
Good morning!
Have you audited your code to fix all uses of YYYY where you meant yyyy yet?
If not: Happy 2015!
— Peter Hosey (@boredzo) December 29, 2014
This year, I noticed that the problem comes earlier, so I sent out an early reminder:
You probably want to do this year's YYYY audit soon, because 2016 starts on Dec 26 this year: https://twitter.com/boredzo/status/549582777956323331
— Pumpkin Hollow (@boredzo) October 24, 2015
But what problem am I referring to? Dan Wood wondered exactly what the YYYY issue is and which languages it affects.
So here’s a short explainer, not bound by the constraints of Twitter.
Year for week-of-year
You may have heard of “ISO 8601 format” (2015-01-01), but in fact, that’s only one of three formats that ISO 8601 defines:
- Calendar format: 2015-01-01
- Week format: 2015-W01-4
- Ordinal format: 2015-001
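As a rough illustration, here is a minimal Python sketch that renders one date in all three forms. The helper name iso_8601_forms is made up for this example. Note that ISO 8601 writes the weekday in the week form as a single digit (1 for Monday through 7 for Sunday).

```python
from datetime import date

def iso_8601_forms(d: date) -> dict:
    """Return the calendar, week, and ordinal ISO 8601 forms of a date.

    Hypothetical helper for illustration; not part of any library.
    """
    # isocalendar() gives (ISO week-year, ISO week number, ISO weekday 1..7)
    iso_year, iso_week, iso_weekday = d.isocalendar()
    return {
        "calendar": d.isoformat(),
        "week": f"{iso_year:04d}-W{iso_week:02d}-{iso_weekday}",
        # tm_yday is the 1-based day of the calendar year
        "ordinal": f"{d.year:04d}-{d.timetuple().tm_yday:03d}",
    }

print(iso_8601_forms(date(2015, 1, 1)))
# {'calendar': '2015-01-01', 'week': '2015-W01-4', 'ordinal': '2015-001'}
```

The week form takes its year from isocalendar(), not from d.year; that distinction is the whole subject of this post.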
The ordinal format is straightforward: It’s the NNNth day of the calendar year.
But the week format does not work that way. The first week of a year is not even guaranteed to contain January 1st! Rather, -W01-1 (the first day of the first week) is the Monday of the week that contains the year’s first Thursday. (Yes, that is the actual definition from the ISO 8601 standard.)
As such, ISO defines a parallel track of years that have the same year numbers as calendar years, but start and end on different dates (and always start on a Monday). ISO-week-year 2015 starts on the Monday of the week containing 2015’s first Thursday, which is 2014-12-29 (2015-W01-1). 2015-01-01 itself is a Thursday, 2015-W01-4.
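The “Monday of the week containing the first Thursday” rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above; the function name iso_week_year_start is invented for the example.

```python
from datetime import date, timedelta

def iso_week_year_start(year: int) -> date:
    """Monday of the week that contains the first Thursday of `year`.

    Hypothetical helper for illustration.
    """
    jan1 = date(year, 1, 1)
    # date.weekday(): Monday == 0, ..., Thursday == 3, ..., Sunday == 6
    days_until_thursday = (3 - jan1.weekday()) % 7
    first_thursday = jan1 + timedelta(days=days_until_thursday)
    # Step back from Thursday to the Monday of the same week
    return first_thursday - timedelta(days=3)

print(iso_week_year_start(2015))  # 2014-12-29
print(iso_week_year_start(2016))  # 2016-01-04
```

On Python 3.8 and later, date.fromisocalendar(year, 1, 1) should give the same answer, which makes a handy cross-check.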
Unicode date formats
YYYY and yyyy are Unicode date format patterns. These offer quite a bit more flexibility than the old str[fp]time(3) formats, particularly in choosing different representations of the same value (e.g., “September” vs “Sep” vs “09” vs “9”).
- YYYY is defined as the “year for week-of-year”: that is, the year for ISO week dates.
- yyyy is defined as the calendar year.
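To see which days get hit, here is a small Python sketch that lists the dates around a new year whose ISO week-year differs from their calendar year. It uses Python’s isocalendar(), which follows strict ISO 8601 rules. That is an assumption for illustration: a Unicode-pattern formatter may apply the locale’s own week rules, so the exact affected dates can differ there. The function name mismatched_days is made up.

```python
from datetime import date, timedelta

def mismatched_days(start: date, end: date) -> list:
    """Dates in [start, end] whose ISO week-year differs from the calendar year.

    Hypothetical helper; uses ISO 8601 week rules, not locale-specific ones.
    """
    out = []
    d = start
    while d <= end:
        # isocalendar()[0] is the ISO week-year (what a week-based year prints);
        # d.year is the calendar year (what you almost always want)
        if d.isocalendar()[0] != d.year:
            out.append(d)
        d += timedelta(days=1)
    return out

for d in mismatched_days(date(2014, 12, 25), date(2015, 1, 5)):
    print(d.isoformat(), "calendar year", d.year, "week-year", d.isocalendar()[0])
# 2014-12-29 calendar year 2014 week-year 2015
# 2014-12-30 calendar year 2014 week-year 2015
# 2014-12-31 calendar year 2014 week-year 2015
```

For most of the year the two values agree, which is why a stray YYYY can sit unnoticed in code until the last days of December or the first days of January.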
Whom this affects
The second part of Dan Wood’s question is what languages this affects.
NSDateFormatter* and CFDateFormatter both accept the Unicode date format syntax.
Contrary to what I’d previously assumed, PHP does not use Unicode date formats. As befits PHP, it uses something that looks the same but works subtly differently: “Y” is always the full calendar year, whereas “y” is a two-digit calendar year. ISO week years are “o”.
I actually don’t know of any others. Feel free to chime in in the comments if you know any other languages or frameworks that include built-in support for Unicode date formats.
* On the Mac, an NSDateFormatter configured for “10.0 behavior” accepts str[fp]time(3) formats; “10.4 behavior” is Unicode date formats. All other current Apple OSs, including iOS, have NSDateFormatters with 10.4 behavior only. CFDateFormatter has never supported str[fp]time(3) formats.