Changes between Initial Version and Version 1 of Ticket #4175, comment 2

Timestamp: 09/12/18 11:29:25
Author: robe

Ticket #4175, comment 2

- initial
+ v1
-seems to have been caused by this.  So has nothing to do with raster, but use of split_to_array in the functions
+seems to have been caused by this.  So has nothing to do with raster, but use of regex functions.
+https://github.com/postgres/postgres/commit/f6f61d937bfddbe2a5f6a37bc26a0587117d7837
+https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f6f61d937bfddbe2a5f6a37bc26a0587117d7837
+{{{
+author  Andrew Gierth <rhodiumtoad@postgresql.org>
+        Tue, 28 Aug 2018 04:52:25 -0400 (09:52 +0100)
+committer       Andrew Gierth <rhodiumtoad@postgresql.org>
+        Tue, 28 Aug 2018 06:55:18 -0400 (11:55 +0100)
+commit  f6f61d937bfddbe2a5f6a37bc26a0587117d7837
+tree    cec65e55e1c5f85a32c2421b8c17dc72327e9a0d        tree | snapshot
+parent  0f3dd76f527deb81ee5ba60048df04c598c93960        commit | diff
+Avoid quadratic slowdown in regexp match/split functions.
+
+regexp_matches, regexp_split_to_table and regexp_split_to_array all
+work by compiling a list of match positions as character offsets (NOT
+byte positions) in the source string.
+
+Formerly, they then used text_substr to extract the matched text; but
+in a multi-byte encoding, that counts the characters in the string,
+and the characters needed to reach the starting byte position, on
+every call. Accordingly, the performance degraded as the product of
+the input string length and the number of match positions, such that
+splitting a string of a few hundred kbytes could take many minutes.
+
+Repair by keeping the wide-character copy of the input string
+available (only in the case where encoding_max_length is not 1) after
+performing the match operation, and extracting substrings from that
+instead. This reduces the complexity to being linear in the number of
+result bytes, discounting the actual regexp match itself (which is not
+affected by this patch).
+
+In passing, remove cleanup using retail pfree() which was obsoleted by
+commit ff428cded (Feb 2008) which made cleanup of SRF multi-call
+contexts automatic. Also increase (to ~134 million) the maximum number
+of matches and provide an error message when it is reached.
+
+Backpatch all the way because this has been wrong forever.
+
+Analysis and patch by me; review by Kaiting Chen.
+
+Discussion: https://postgr.es/m/87pnyn55qh.fsf@news-spur.riddles.org.uk
+see also https://postgr.es/m/87lg996g4r.fsf@news-spur.riddles.org.uk
+}}}
+This was committed to 9.3-12 stable branches.
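
For readers who want to see why the old approach was quadratic, here is a rough Python sketch of the two extraction strategies the commit message describes. It is only an illustration under assumptions, not the PostgreSQL C code: the helper names (`char_to_byte_offset`, `substr_by_scanning`, `split_slow`, `split_fast`, `piece_spans`) are hypothetical, UTF-8 stands in for "a multi-byte encoding", and a Python `str` stands in for the wide-character copy of the input.

```python
import re


def char_to_byte_offset(data: bytes, char_index: int) -> int:
    """Walk UTF-8 bytes from the start until char_index characters are passed."""
    pos = 0
    seen = 0
    while seen < char_index and pos < len(data):
        pos += 1
        # Skip UTF-8 continuation bytes (bit pattern 10xxxxxx).
        while pos < len(data) and (data[pos] & 0xC0) == 0x80:
            pos += 1
        seen += 1
    return pos


def substr_by_scanning(data: bytes, start: int, length: int) -> str:
    """Hypothetical stand-in for a substring call that takes character offsets.

    Every call rescans from the beginning of the string, so the cost grows
    with the position of the piece, not just with its size.
    """
    begin = char_to_byte_offset(data, start)
    end = char_to_byte_offset(data, start + length)
    return data[begin:end].decode("utf-8")


def piece_spans(text: str, pattern: str):
    """Character-offset (start, end) spans of the pieces between matches."""
    spans = []
    prev = 0
    for m in re.finditer(pattern, text):
        spans.append((prev, m.start()))
        prev = m.end()
    spans.append((prev, len(text)))
    return spans


def split_slow(data: bytes, spans):
    """Old behaviour (sketch): one full rescan per piece, so roughly
    input length times number of pieces."""
    return [substr_by_scanning(data, s, e - s) for s, e in spans]


def split_fast(data: bytes, spans):
    """New behaviour (sketch): keep one wide-character copy and slice it,
    so the extraction work is linear in the size of the result."""
    wide = data.decode("utf-8")  # one-time wide-character copy
    return [wide[s:e] for s, e in spans]


text = "añb,çd,é"
spans = piece_spans(text, ",")
encoded = text.encode("utf-8")
slow_result = split_slow(encoded, spans)
fast_result = split_fast(encoded, spans)
```

Both functions return the same pieces; the difference is only in how much scanning each piece costs, which is what made splitting long multi-byte strings so slow before the fix.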