Parse human-readable date/time text


The code for parsedatetime can now be found on GitHub

Below is the change history from before everything shifted to GitHub -- kept around for historical reasons and pleasant memories :)

9 January 2009


Applied patch submitted by Michael Lim to fix the problem parsedatetime was having handling dates when the day of the month preceded the month
Issue 26

Fixed TestErrors when in a locale where the bad date actually returns a date ;)

Checked in the TestGermanLocale unit test file missed from previous commit

21 March 2008


Updating Copyright year info
Fixing defects from Google Project page

The comparison routine for the "failing" test was not accurate. The test was being flagged as failing incorrectly
Issue 18

Added patch from Bernd Zeimetz for the German localized constants (Debian patch for de_DE)! He identifies some issues with how unicode is handled and also some other glitches - will have to work on them.
Issue 20

Tweaked to default to all tests if not given on the command line. Removed 'this' from the list of "specials" - it was causing some grief and from the looks of the unit tests, not all that necessary.

Bernd identified that for the German locale the dayofweek check was being triggered for the dayoffset word "morgen" (the "mo" in "morgen" matched as a day). To solve this I added a small check: if the whole word being checked is in the dayOffsets list, the dayofweek check is not triggered.
Issue 19
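
A minimal sketch of the kind of guard described above, not the actual parsedatetime code. The names WEEKDAY_ABBREVS, DAY_OFFSETS and looks_like_weekday are hypothetical, and the word lists are small illustrative samples rather than the real locale tables.

```python
# Hypothetical sample data for a German locale; not parsedatetime's real tables.
WEEKDAY_ABBREVS = ["mo", "di", "mi", "do", "fr", "sa", "so"]
DAY_OFFSETS = {"heute": 0, "morgen": 1, "gestern": -1}


def looks_like_weekday(word):
    """Return True if word starts with a weekday abbreviation and is not a day-offset word."""
    word = word.lower()
    # Guard: a whole word found in the day-offset list is never treated as a weekday.
    if word in DAY_OFFSETS:
        return False
    return any(word.startswith(abbrev) for abbrev in WEEKDAY_ABBREVS)
```

Without the guard, "morgen" would be accepted because it starts with "mo"; with it, only words such as "montag" pass.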

28 November 2007


Fixing two bugs found by Chandler QA

Time range of "today 3:30-5pm" was actually causing a traceback. Added a new regex to cover this range type and a new test.
Chandler #11299
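
For illustration only, here is one way a regex could cover a range like "3:30-5pm", where only the end time carries the am/pm marker. The pattern and the helper name find_time_range are assumptions made for this sketch; the regex actually added to parsedatetime is not shown in this history.

```python
import re

# Hypothetical pattern: start time, a dash, end time, then a single am/pm marker.
TIME_RANGE = re.compile(
    r"(?P<h1>\d{1,2})(?::(?P<m1>\d{2}))?\s*-\s*"
    r"(?P<h2>\d{1,2})(?::(?P<m2>\d{2}))?\s*(?P<meridian>am|pm)",
    re.IGNORECASE,
)


def find_time_range(text):
    """Return ((hour, minute), (hour, minute), meridian), or None if no range is found."""
    match = TIME_RANGE.search(text)
    if match is None:
        return None
    # Missing minutes default to 0.
    start = (int(match.group("h1")), int(match.group("m1") or 0))
    end = (int(match.group("h2")), int(match.group("m2") or 0))
    return start, end, match.group("meridian").lower()
```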

A really embarrassing one for a date/time library - it was actually *not* considering leap years when returning days in a month! Added tests for Feb 29th of various known leap years and also added a check for the daysInMonth() routine which was created to replace the naively simple DaysInMonthList.
Chandler #11203
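
The difference between a fixed list and a leap-year-aware routine can be sketched as follows. This is an illustration under assumptions, not the library's code: days_in_month and DAYS_IN_MONTH_LIST are hypothetical stand-ins, and the real daysInMonth() signature is not given here.

```python
import calendar

# Fixed month lengths for a non-leap year: the kind of table that misses Feb 29th.
DAYS_IN_MONTH_LIST = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def days_in_month(month, year):
    """Return the number of days in month (1-12) of year, honoring leap years."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    days = DAYS_IN_MONTH_LIST[month - 1]
    # February gains a day in leap years.
    if month == 2 and calendar.isleap(year):
        days += 1
    return days
```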

12 June 2007


Fixed a bug reported via the project page by Alexis where parsedatetime was not parsing day suffixes properly. For example, the text "Aug 25th, 2008" would return the year as 2007 - the parser was not 'seeing' 2008 as a part of the expression.

The fix was to enhance one of the "long date" regexes to handle that situation but yet not break the current tests - always fun for sure!
Issue 16
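
As a rough illustration of what handling the day suffix involves, the sketch below uses a made-up pattern that allows an optional "st", "nd", "rd" or "th" after the day so the year is still seen. LONG_DATE and find_long_date are hypothetical names, the pattern does not check that the month word is a real month, and it is not the regex parsedatetime uses.

```python
import re

# Hypothetical "long date" pattern: month word, day with optional suffix, 4-digit year.
LONG_DATE = re.compile(
    r"\b(?P<month>[A-Za-z]{3,9})\.?\s+"
    r"(?P<day>\d{1,2})(?:st|nd|rd|th)?,?\s+"
    r"(?P<year>\d{4})\b",
    re.IGNORECASE,
)


def find_long_date(text):
    """Return (month_text, day, year), or None if the pattern does not match."""
    match = LONG_DATE.search(text)
    if match is None:
        return None
    return match.group("month"), int(match.group("day")), int(match.group("year"))
```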

Fixed a bug Brian K. (one of the Chandler devs) found when parsing with the locale set to fr_FR. The phrase "same 3 folders" was causing a key error inside the parser. It turns out that short weekday names in French have a trailing '.', so "sam." was being used in the regular expression and the '.' was being treated as a regex symbol and not as a period.

It turned out to be a simple fix - just needed to add some code to run re.escape over the lists before adding them to the re_values dictionary.

Also added a TestFrenchLocale set of unit tests but made them only run if PyICU is found until I can build an internal locale for fr_FR.
Issue 17
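
The effect of running re.escape over the name lists can be shown with a small standalone example. The weekday list and the pattern shape here are illustrative assumptions, not the contents of the re_values dictionary.

```python
import re

# Short French weekday names carry a trailing period (sample values).
short_weekdays = ["lun.", "mar.", "mer.", "jeu.", "ven.", "sam.", "dim."]

# Unescaped: "." is a regex wildcard, so "sam." also matches "same".
unescaped = re.compile(r"\b(%s)" % "|".join(short_weekdays), re.IGNORECASE)

# Escaped: "." only matches a literal period.
escaped = re.compile(
    r"\b(%s)" % "|".join(re.escape(name) for name in short_weekdays),
    re.IGNORECASE,
)

phrase = "same 3 folders"
false_hit = unescaped.search(phrase)  # matches the word "same"
no_hit = escaped.search(phrase)       # None, as intended
```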

14 February 2007


Minor doc string changes and other typo fixes

Updated Copyright years

Added a fallbackLocales=[] parameter to parsedatetime_consts init routine to control what locales are scanned if the default or given locale is not found in PyICU.
Issue #9
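
The idea of a fallback list can be sketched like this. It is a simplified illustration: pick_locale and its arguments are hypothetical, and it uses a plain set of locale names instead of querying PyICU.

```python
def pick_locale(requested, available, fallback_locales=None):
    """Return the first of requested + fallbacks found in available, else None."""
    candidates = [requested] + list(fallback_locales or [])
    for candidate in candidates:
        if candidate in available:
            return candidate
    return None
```

With an empty fallback list, a locale that is not found simply yields None, so the caller decides what to do next.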

While working on the regex compile-on-demand issue below, I realized that parsedatetime was storing the compiled regexes locally and that this would prevent parsedatetime from switching locales easily. I've always wanted to make it so parsedatetime can be set to parse within a locale just by changing a single reference - this is one step closer to that.

Made the regex compiles on-demand to help with performance. Requested by the Chandler folks.
Issue #15

To test the change I ran the following code, which builds the objects 100 times:

    for i in range(0, 100):
        c = pdc.Constants()
        p = pdt.Calendar(c)
        p = None
        c = None

Before the change, hotshot measured:

    24356 function calls (22630 primitive calls) in 0.188 CPU seconds

after the change:

    5000 function calls in 0.140 CPU seconds

That doesn't test the true time, though, as it doesn't reference any regexes, so any time saved is deferred. To test this I then ran before and after tests where I parsed the major unit test bits:

before the change:

    80290 function calls (75929 primitive calls) in 1.055 CPU seconds

after the change:

    55803 function calls (52445 primitive calls) in 0.997 CPU seconds

This tells me while doing the lazy compile does save time, it's not a lot over the normal usage. I'll leave it in as it is saving time for the simple use-cases.
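
For readers unfamiliar with the technique, compile-on-demand can be sketched roughly as below. LazyPatterns and the two sample patterns are hypothetical and only illustrate the idea: nothing is compiled at construction time, and each regex is compiled once on first use.

```python
import re


class LazyPatterns:
    """Compile each regex the first time it is requested, then cache it."""

    def __init__(self, sources):
        self._sources = dict(sources)  # name -> pattern source string
        self._compiled = {}            # name -> compiled pattern, filled on demand

    def get(self, name):
        if name not in self._compiled:
            self._compiled[name] = re.compile(self._sources[name], re.IGNORECASE)
        return self._compiled[name]

    def compiled_count(self):
        return len(self._compiled)


patterns = LazyPatterns({
    "time": r"\b(\d{1,2}):(\d{2})\b",
    "year": r"\b(\d{4})\b",
})
```

Creating the object is cheap because no regex work happens until get() is called, which matches the observation above that the saving shows up mostly in the simple use-cases.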

26 December 2006


0.8.1 release

Fixed the 'eom' part of testEndOfPhrases. It was not adjusting the year when checking for month rollover to the new year.

Changed API docs to reflect that it's a struct_time type (or a time tuple) that we accept and return instead of a datetime value. I believe this led to Issue #14 being reported. Also added some error handling to change a datetime value into a struct_time value if passed to parse().
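
The conversion itself is available in the standard library via datetime.timetuple(). The helper below is a hypothetical sketch of that kind of normalization, not the error handling that was added to parse().

```python
import datetime
import time


def to_struct_time(value):
    """Return value as a time.struct_time, converting a datetime if needed."""
    if isinstance(value, datetime.datetime):
        return value.timetuple()
    if isinstance(value, time.struct_time):
        return value
    # Assume anything else is a plain 9-item time tuple.
    return time.struct_time(tuple(value))
```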

24 October 2006


Merged in changes from Darshana's change_parse_to_return_enum branch. Because this changes the primary method used to call parsedatetime I bumped the version to 0.8 to signal the API change.

This is a big change in that instead of a simple True/False that is returned to show if the date is valid or not, Parse() now returns a "masked" value that represents what is valid:

                    date = 1
                    time = 2

so a value of zero means nothing was parseable/valid and a value of 3 means both were parsed/valid.
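
A caller can test the masked value with bitwise operations. The sketch below only uses the two flag values given above; the constant names DATE and TIME and the describe helper are hypothetical.

```python
DATE = 1  # a date was parsed
TIME = 2  # a time was parsed


def describe(flag):
    """Translate the masked result value into a readable description."""
    has_date = bool(flag & DATE)
    has_time = bool(flag & TIME)
    if has_date and has_time:
        return "date and time"
    if has_date:
        return "date only"
    if has_time:
        return "time only"
    return "nothing parsed"
```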

Implemented the CalculateDOWDelta() method and added a new flag, CurrentDOWParseStyle, for the current DOW -- Issue #10

Changed birthday epoch to be a constant defined in parsedatetime_const along with lots of little cosmetic code changes. Removed the individual files in the docs/ folder and added dist, build and to svn:ignore.

Added birthday epoch constraint, fixed date parsing. 3-digit years are no longer allowed, and the unit tests were fixed to use either yy or yyyy.

9 October 2006


0.7.4 released

Fixed "ago" bug -- Issue #7

Fixed bug where default year for dates that are in the future get next year, not current year -- Issue #8

Fixed strings like "1 week ago", "lunch tomorrow"

25 September 2006


0.7.3 released

Added Darshana as an author and updated the copyright text

Fixed a subtle dictionary reference bug in buildSources() that was causing any source related modifier to not honor the day, month or year. It only started being seen as I was working on adding "eod" support as a 'true' modifier instead.

Found another subtle bug in evalModifier() if the modifier was followed by the day of the week - the offset math was not consistent with the other day-of-week offset calculations.

The following is now supported:

        eod tomorrow
        tomorrow eod
        monday eod
        eod monday
        meeting eod
        eod meeting

Added a sub-range test. Not that it works, just wanted to start the process -- Issue #6

Alan Green filed Issue #5

In it he asked for support for Australian date formats "dd-mm-yyyy"

This is the first attempt at supporting the parsing of dates where the order of the day, month and year can vary. I adjusted the parseDate() code to be data driven and added a dp_order list to the Constants() class that is either initialized to the proper order by the pdtLocale classes or the order is determined by parsing the ICU short date format to figure out what the date separator is and then to find out what order it's in.

I also added a as a starting point for tests.
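
To make "data driven" concrete, here is a minimal sketch of parsing a short date using an order list. It assumes the order is expressed as the letters 'd', 'm' and 'y', which may not match how dp_order is actually encoded, and parse_short_date is a hypothetical helper rather than the real parseDate().

```python
def parse_short_date(text, dp_order, separator):
    """Split text on separator and assign the parts according to dp_order."""
    parts = text.split(separator)
    if len(parts) != len(dp_order):
        raise ValueError("expected %d date parts" % len(dp_order))
    # Pair each field letter with the number found at the same position.
    fields = {key: int(part) for key, part in zip(dp_order, parts)}
    return fields["y"], fields["m"], fields["d"]
```

The same function handles "dd-mm-yyyy" and "mm/dd/yyyy" input just by changing the order list and the separator.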

1 September 2006


Fixed two bugs found by Darshana during her Chandler testing. Details are documented in Issue 3 and Issue 4.

24 August 2006


Turns out that ICU works with weekdays in Sun..Sat order and that Python uses Mon..Sun order. Fixed PyICU locale code to build the internal weekday list to be Python Style -- Issue #2
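
The reordering amounts to rotating the list by one position. The snippet below is a sketch with English sample names, not the PyICU locale code itself.

```python
import datetime

# Weekday names in the Sunday-first order described above (English sample data).
icu_weekdays = ["Sunday", "Monday", "Tuesday", "Wednesday",
                "Thursday", "Friday", "Saturday"]

# Rotate by one so Monday comes first, matching Python's 0 = Monday numbering.
python_weekdays = icu_weekdays[1:] + icu_weekdays[:1]

# date.weekday() can now index the list directly.
example = python_weekdays[datetime.date(2006, 8, 24).weekday()]
```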

22 August 2006


Major localization refactoring. Added support for PyICU and also a simple locale class for people who do not have PyICU. All of the constants and strings should now be localizable - but I'm sure I missed some hardcoded constants :)

5 August 2006

I have made the code for parsedatetime public. It's been kind-of public for a while now (well ever since Darshana started using it for Chandler) but now it's actually available from the Python Cheeseshop and also from (see the links in the left sidebar.)

So now the fun begins as people hopefully start to use it and I'm given feedback. I know the one bit of feedback that everyone will say is about the lack of docs and I am working on it. I'll be epydoc'ing the source tonight and posting that soon as a link in the sidebar.

Contact Info

Mike Taylor

Copyright © 2004-2016
Mike Taylor

This work is licensed under a Creative Commons License.