Written by Jaan Oras
Last time we wrote about how we scaled our services 10x over the course of 30 days. In this post, we’ll dive deeper into one of the more interesting technical improvements we achieved — optimising the way dates/timestamps are handled between the database and application layers.
While profiling one of our services, we noticed that Postgres date parsing was consuming a substantial amount of CPU, well over 20%. Even though this service loads a lot of time series data, it was still quite surprising to see how many CPU cycles it spent parsing dates. This hinted at a bug, possibly related to loading too much data, so we decided to look into it.
Working from the initial assumption that we were loading too much data, we needed to verify that the data loading code was correct. Since there is no good way to do this automatically, we went through the code and queries related to data loading.
This search did not uncover anything meaningful, and our conclusion was that we simply process a lot of data. We did discuss some future improvements, such as adding caching or perhaps using integers for timestamps.
However, considering that at the time we needed to profile and improve many services, the 20% CPU usage was not significant enough to warrant making big changes. We were content with the fact that the CPU load was not related to a bug, and it was just the service doing its job.
That said, as we were addressing some other performance issues with TypeORM, I got curious and decided to have a look at how the date parsing code works. TypeORM uses node-postgres for PostgreSQL access, which internally uses pg-types for parsing the primitive types, which in turn uses postgres-date for date parsing. This is where I decided to concentrate my (timeboxed) effort.
Just looking at the postgres-date code, everything seems reasonable: it uses several regular expressions for tokenisation and then composes a Date object. But just in case, I decided to profile it in isolation.
It turned out that the timeZoneOffset function was taking around 30% of the CPU time. Having just recently looked into the data we have in our database, I remembered that most of the offsets were +00. I checked the data once more, and it turned out all of the timestamps had an offset of 0.
Since this seemed like a good opportunity to optimise for the most frequent case, I added an early exit with a simple string endsWith check for +00. Related to that, I also noticed that the code adjusts the time even when the timezone offset is 0, which I fixed as well.
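The actual patch lives inside postgres-date, but the idea can be sketched as a small standalone function. This is a simplified illustration, not the library's real internals, and it assumes the string actually ends with an explicit offset or a Z:

```javascript
// Sketch of the fast path: return the timezone offset of a Postgres
// timestamp string in milliseconds, bailing out early for the by-far
// most common case in our data, an all-zero "+00" offset.
function timeZoneOffset(isoDate) {
  // Fast path: no regex work needed at all.
  if (isoDate.endsWith('+00') || isoDate.endsWith('Z')) {
    return 0;
  }
  // Slow path: regex-based parsing, similar in spirit to postgres-date.
  const match = /([+-])(\d{2}):?(\d{2})?:?(\d{2})?$/.exec(isoDate);
  if (!match) {
    return null;
  }
  const sign = match[1] === '-' ? -1 : 1;
  const hours = parseInt(match[2], 10);
  const minutes = parseInt(match[3] || '0', 10);
  const seconds = parseInt(match[4] || '0', 10);
  return sign * ((hours * 60 + minutes) * 60 + seconds) * 1000;
}
```

The early exit means that for UTC-only data the regular expression never runs, which is exactly where the win came from.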
Thus, with a total change of essentially 3 lines, we got an improvement of 50%.
I was quite happy at this point, almost ready to go on to tackle other issues, but this investigation seemed interesting. If I could improve the speed further, we could get a free speed improvement across all of our services.
Writing our own date parser
From this thought, a silly idea was born. Why don't we create a faster parser than the regular expression based one? Usually this is a bad idea, as regular expressions are really fast. However, my hope was that instead of tokenising with a regex and then parsing the string tokens into integers, I could do everything in one pass with a hand-rolled parser.
Also, just like with the +00 trick, I could optimise for the most frequent cases. Even though I know that writing parsers by hand is a thankless job in most cases, I still decided to give it a shot — I still had time left in my timeboxed investigation.
Substituting the entire logic with just a call to the Date constructor and running the postgres-date test suite against it made it obvious that there are some differences. For example, the Date constructor does not support BC notation or seconds in timezone offsets.

As my goal was to have a drop-in replacement for postgres-date, I was not willing to take the risk that parsing fails, or parses incorrectly. This meant the Date constructor based approach alone was not enough.
For comparison, I ran an updated benchmark with date values that do work with the Date constructor: it is about 3.2x faster than postgres-date with the timezone optimisation discussed earlier.
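Our benchmark numbers come from an internal harness, but this kind of comparison is easy to reproduce. A minimal sketch, with made-up sample values and iteration counts, could look like this:

```javascript
// Micro-benchmark sketch: native Date constructor vs a
// postgres-date-style "regex tokenise, then build a Date" approach.
// Sample data and loop counts are illustrative only.
const SAMPLES = [
  '2021-03-04 05:06:07.123+00',
  '2019-12-31 23:59:59+00',
  '2020-06-15 12:00:00+00',
];

function viaDateConstructor(s) {
  return new Date(s); // engine-specific parsing for non-ISO strings
}

const TS_RE = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?\+00$/;
function viaRegex(s) {
  const m = TS_RE.exec(s);
  if (!m) return null;
  // Normalise the fractional part to milliseconds.
  const ms = m[7] ? +m[7].padEnd(3, '0').slice(0, 3) : 0;
  return new Date(Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6], ms));
}

function bench(name, fn) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < 100000; i++) {
    for (const s of SAMPLES) fn(s);
  }
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`${name}: ${elapsedMs.toFixed(1)} ms`);
}

bench('Date constructor', viaDateConstructor);
bench('regex parse', viaRegex);
```

Absolute timings depend heavily on the runtime and the data mix, so a harness like this is only useful for relative comparisons on your own workload.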
Coming back to the idea of a hand-rolled parser: the ISO 8601 date format is not complicated — essentially the parser needs to check for the presence of a few fixed strings and parse integers; there is no recursion and there are no lookbacks or lookaheads. I suppose that is also why postgres-date uses regular expressions to parse it.
I initially decided to write the code as a class with private methods parsing the different parts of the date, to keep things readable, and to optimise it for speed later. We'll come back to this decision below.
The parser is built from private methods, each handling a different token and either returning null or a value, or acting as a boolean check. I used charCodeAt for string access, as it is slightly faster, and it returns a character code that I can use directly to calculate integer values. It is also quite convenient that it returns NaN for out-of-bounds indices, which saves us from having to do bounds checking.
From these primitive functions we can compose more complex functions, for example, the parsing of the date part:
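The open-sourced parser is the reference implementation; the following is only a simplified sketch of the style described above, with illustrative method names rather than the library's actual API:

```javascript
// Simplified sketch of a hand-rolled date parser built from small
// primitives. Method names are illustrative, not the real library API.
const ZERO = 48; // '0'.charCodeAt(0)
const NINE = 57; // '9'.charCodeAt(0)
const DASH = 45; // '-'.charCodeAt(0)

class DateParser {
  constructor(input) {
    this.input = input;
    this.pos = 0;
  }

  // Primitive: parse exactly `n` digits into an integer, or return null.
  // charCodeAt returns NaN for out-of-bounds indices, so the digit
  // range check fails naturally and no explicit bounds check is needed.
  readInt(n) {
    let value = 0;
    for (let i = 0; i < n; i++) {
      const code = this.input.charCodeAt(this.pos + i);
      if (!(code >= ZERO && code <= NINE)) return null;
      value = value * 10 + (code - ZERO);
    }
    this.pos += n;
    return value;
  }

  // Primitive: consume one expected character, reporting success.
  readChar(code) {
    if (this.input.charCodeAt(this.pos) !== code) return false;
    this.pos++;
    return true;
  }

  // Composed from the primitives above: the YYYY-MM-DD date part.
  readDate() {
    const year = this.readInt(4);
    if (year === null || !this.readChar(DASH)) return null;
    const month = this.readInt(2);
    if (month === null || !this.readChar(DASH)) return null;
    const day = this.readInt(2);
    if (day === null) return null;
    return { year, month, day };
  }
}
```

Note how tokenisation and integer conversion happen in one pass: readInt accumulates the value directly from character codes instead of slicing out a substring and calling parseInt on it.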
The resulting code looks really readable, which is something you can’t always say about regular expressions.
Performance gains of our new parser
Compared to the regular expression tokeniser with integer parsing, our new parser is faster for the main case! It is not as fast as the native Date constructor, but it is still 4x faster than postgres-date, and 2.7x faster than the version with the timezone optimisation. Being only about 20% slower than native code is quite a good result.
Coming back to the topic of code structure: I had always assumed that there is some penalty for using object-oriented code in JS, and that replacing it with plain functions would make it faster. First, I tried the more convenient route, with closures — and it was much slower! Then I tried passing a state parameter around to keep the state — and it was exactly as fast as the object-oriented version!
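On a toy example, the two equally fast variants looked roughly like this (the closure variant, not shown, recreated its inner functions on every parse, which is where its overhead came from):

```javascript
// Toy illustration of the two styles that benchmarked identically
// for us. Names are made up for the example.

// Object-oriented: state lives on the instance.
class Cursor {
  constructor(input) {
    this.input = input;
    this.pos = 0;
  }
  next() {
    return this.input.charCodeAt(this.pos++);
  }
}

// Functional with explicit state: state lives in a plain object
// that every function takes as a parameter.
function makeCursor(input) {
  return { input, pos: 0 };
}
function next(state) {
  return state.input.charCodeAt(state.pos++);
}
```

Both shapes give the engine a stable object layout to optimise, which presumably is why they performed the same for us; closures allocating fresh function objects per call did not.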
This challenged my prejudice towards performance of object-oriented JS, which was a positive finding for the future.
Conclusion to the investigation
At this point the result was good enough and we started using the improved code in our services. This was quite easy because pg-types allows setting custom parsers for any of the types.
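The wiring is a one-liner per type OID via pg-types' setTypeParser. A sketch, where parseCustomDate stands in for the replacement parser (here just delegating to the Date constructor for illustration):

```javascript
// Wiring sketch: pg-types lets you override the parser for any
// Postgres type OID. 1082, 1114 and 1184 are the built-in OIDs for
// date, timestamp and timestamptz respectively.
let types = null;
try {
  types = require('pg-types');
} catch (err) {
  // pg-types not installed in this environment; skip the registration.
}

function parseCustomDate(value) {
  // Stand-in for the hand-rolled parser described in this post.
  return value === null ? null : new Date(value);
}

if (types) {
  types.setTypeParser(1082, parseCustomDate); // date
  types.setTypeParser(1114, parseCustomDate); // timestamp without time zone
  types.setTypeParser(1184, parseCustomDate); // timestamp with time zone
}
```

Because node-postgres reads its parsers from pg-types, registering them once at startup makes every query in the process use the faster parser, with no changes to query code or to TypeORM.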
As a result, we gained visible performance boosts in all of our core services that deal with a lot of time-based data.
So what did we learn from this little adventure? I think the main lesson is not to get scared when profiling shows time being spent in an underlying library. Even well-established libraries may contain suboptimal code, and it never hurts to take a look. We also learned a little about how object-oriented JS can be just as fast as functional code, and we got to prove to ourselves that regular expressions are not always the fastest way to parse something.
We have open-sourced the improved parser, which you can find on GitHub and NPM. It is a direct drop-in replacement for postgres-date, so making use of it is quite simple. Make sure to let us know if you have any further ideas on how to improve the performance.
We have also opened an issue and will follow up on getting this incorporated upstream, so that all users of node-postgres could benefit from this faster parser.