Ever wonder which is the fastest way to concatenate strings in Python?
This is a response to “Ever wonder which is the fastest way to concatenate strings in Ruby?”
I tested using Python 2.5 on Mac OS X 10.4.9 on my four-core Mac Pro. Here are the results:
Method | Time (seconds per million iterations) |
---|---|
+ | 0.308359861374 |
str.join(list) | 0.53214097023 |
str.join(tuple) | 0.48233294487 |
% | 0.515310049057 |
That’s quite a surprise: the usual advice is to avoid the + operator because it is inefficient, yet here it wins quite handily. A Google search turned up Chris Siebenmann’s explanation: Python 2.4 made the + operator much more efficient by allocating only one string instead of n-1 intermediate strings.
So, clearly, the old advice no longer applies. Go forth and use +.
Oh, and in case you’re curious, here’s the code. (The times shown above are from the version of the code that re-creates the string every time, for comparison’s sake. Not doing that saves only about 1/100 of a second on each test case.)
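For readers who can't reach the linked code, here is a minimal sketch of how a comparison like this could be set up with the standard `timeit` module. It is not the original benchmark script: the operand strings `A`, `B`, `C`, the function names, and `run_benchmark` are all made up for illustration, and it passes callables to `timeit.timeit`, which needs Python 2.6 or later (the timings above came from Python 2.5, so expect different numbers).

```python
import timeit

# Hypothetical operands; the inputs used for the table above are not shown here.
A, B, C = "foo", "bar", "baz"


def concat_plus():
    # Concatenate with the + operator.
    return A + B + C


def concat_join_list():
    # Concatenate by joining a freshly built list.
    return "".join([A, B, C])


def concat_join_tuple():
    # Concatenate by joining a freshly built tuple.
    return "".join((A, B, C))


def concat_percent():
    # Concatenate with %-formatting.
    return "%s%s%s" % (A, B, C)


METHODS = [
    ("+", concat_plus),
    ("str.join(list)", concat_join_list),
    ("str.join(tuple)", concat_join_tuple),
    ("%", concat_percent),
]


def run_benchmark(iterations=1000000):
    """Return a dict mapping each method label to the total seconds
    taken by `iterations` calls of that method."""
    results = {}
    for label, func in METHODS:
        results[label] = timeit.timeit(func, number=iterations)
    return results


if __name__ == "__main__":
    for label, seconds in run_benchmark().items():
        print("%s | %s |" % (label, seconds))
```

Each function builds its result from scratch on every call, which roughly corresponds to the re-create-the-string-every-time variant. The absolute numbers depend heavily on the interpreter version, the operand sizes, and the machine, so treat any output as relative only.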
May 6th, 2007 at 23:36:42
That’s cool. Here are the results from my 2.0GHz Core Duo MacBook Pro:
+: 0.512433052063
str.join(list): 0.9431848526
str.join(tuple): 0.859102964401
%: 0.936084985733
–Alternate versions that don’t re-create the string every time–
+: 0.50304889679
str.join(list): 0.887317895889
str.join(tuple): 0.794829130173
%: 0.89204788208
May 7th, 2007 at 17:44:53
Looking at Chris’s explanation, I’m not so sure that this optimization is quite the panacea your benchmark would imply it is…
“Here’s the summary: counting on this optimization is unwise, as it turns out to depend on low-level details of system memory allocation that can vary from system to system and workload to workload. Also, this is only for plain (byte) strings, not for Unicode strings; as of Python 2.4.2, Unicode string concatenation remains un-optimized.”
Perhaps we could get a new set of benchmark timings with unicode strings?
May 7th, 2007 at 18:17:39
David: Here you go. The % operator still wins with Unicode strings.
As for assuming the presence of the + optimization: I think it makes less sense to assume that % is faster. + is the concatenation operator, so it’s reasonable to assume that it’s the best way to concatenate; until 2.4, that wasn’t true, but now it is (for plain strings).