Changes between Initial Version and Version 1 of Ticket #4175, comment 2


Timestamp:
09/12/18 11:29:25 (6 years ago)
Author:
robe

  • Ticket #4175, comment 2

    initial v1  
    1 seems to have been caused by this.  So has nothing to do with raster, but use of split_to_array in the functions
     1seems to have been caused by this.  So has nothing to do with raster, but use of regex functions.
    22
    33
    4 https://github.com/postgres/postgres/commit/f6f61d937bfddbe2a5f6a37bc26a0587117d7837
     4https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f6f61d937bfddbe2a5f6a37bc26a0587117d7837
     5
     6
     7{{{
     8author  Andrew Gierth <rhodiumtoad@postgresql.org>     
     9        Tue, 28 Aug 2018 04:52:25 -0400 (09:52 +0100)
     10committer       Andrew Gierth <rhodiumtoad@postgresql.org>     
     11        Tue, 28 Aug 2018 06:55:18 -0400 (11:55 +0100)
     12commit  f6f61d937bfddbe2a5f6a37bc26a0587117d7837
     13tree    cec65e55e1c5f85a32c2421b8c17dc72327e9a0d        tree | snapshot
     14parent  0f3dd76f527deb81ee5ba60048df04c598c93960        commit | diff
     15Avoid quadratic slowdown in regexp match/split functions.
     16
     17regexp_matches, regexp_split_to_table and regexp_split_to_array all
     18work by compiling a list of match positions as character offsets (NOT
     19byte positions) in the source string.
     20
     21Formerly, they then used text_substr to extract the matched text; but
     22in a multi-byte encoding, that counts the characters in the string,
     23and the characters needed to reach the starting byte position, on
     24every call. Accordingly, the performance degraded as the product of
     25the input string length and the number of match positions, such that
     26splitting a string of a few hundred kbytes could take many minutes.
     27
     28Repair by keeping the wide-character copy of the input string
     29available (only in the case where encoding_max_length is not 1) after
     30performing the match operation, and extracting substrings from that
     31instead. This reduces the complexity to being linear in the number of
     32result bytes, discounting the actual regexp match itself (which is not
     33affected by this patch).
     34
     35In passing, remove cleanup using retail pfree() which was obsoleted by
     36commit ff428cded (Feb 2008) which made cleanup of SRF multi-call
     37contexts automatic. Also increase (to ~134 million) the maximum number
     38of matches and provide an error message when it is reached.
     39
     40Backpatch all the way because this has been wrong forever.
     41
     42Analysis and patch by me; review by Kaiting Chen.
     43
     44Discussion: https://postgr.es/m/87pnyn55qh.fsf@news-spur.riddles.org.uk
     45
     46see also https://postgr.es/m/87lg996g4r.fsf@news-spur.riddles.org.uk
     47
     48}}}
     49
     50This was committed to the 9.3 through 12 stable branches.
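
To illustrate the slowdown the commit message describes, here is a minimal Python sketch, not the PostgreSQL C code. It assumes UTF-8 input and uses hypothetical helper names (`char_to_byte_offset`, `split_rescanning`, `split_with_wide_copy`, `comma_cuts`). The first splitter converts each character offset to a byte position by walking the string from the start on every call, which is roughly what repeated substring extraction by character offset costs in a multi-byte encoding. The second keeps a decoded copy, standing in for the wide-character copy mentioned in the commit, and slices it directly.

```python
from typing import List, Tuple


def char_to_byte_offset(data: bytes, char_index: int) -> int:
    """Find the byte position where character number char_index starts.

    Assumes data is valid UTF-8. The walk always begins at byte 0, so the
    cost grows with char_index.
    """
    pos = 0
    for _ in range(char_index):
        lead = data[pos]
        if lead < 0x80:
            pos += 1      # 1-byte (ASCII) character
        elif lead < 0xE0:
            pos += 2      # 2-byte sequence
        elif lead < 0xF0:
            pos += 3      # 3-byte sequence
        else:
            pos += 4      # 4-byte sequence
    return pos


def split_rescanning(data: bytes, cuts: List[Tuple[int, int]]) -> List[str]:
    """Extract each piece by re-walking the bytes from the start.

    Total work is about (string length) x (number of pieces).
    """
    pieces = []
    for start, end in cuts:
        b0 = char_to_byte_offset(data, start)
        b1 = char_to_byte_offset(data, end)
        pieces.append(data[b0:b1].decode("utf-8"))
    return pieces


def split_with_wide_copy(data: bytes, cuts: List[Tuple[int, int]]) -> List[str]:
    """Decode once, then slice by character offset.

    A Python str plays the role of the wide-character copy here, so each
    slice costs only the size of the piece it returns.
    """
    wide = data.decode("utf-8")
    return [wide[start:end] for start, end in cuts]


def comma_cuts(text: str) -> List[Tuple[int, int]]:
    """Return (start, end) character offsets of comma-separated fields."""
    cuts = []
    start = 0
    for i, ch in enumerate(text):
        if ch == ",":
            cuts.append((start, i))
            start = i + 1
    cuts.append((start, len(text)))
    return cuts


if __name__ == "__main__":
    text = "é,ü,日本,a"
    data = text.encode("utf-8")
    cuts = comma_cuts(text)
    print(split_rescanning(data, cuts))
    print(split_with_wide_copy(data, cuts))
```

Both functions return the same pieces; only the amount of rescanning differs. Timing them on a long multi-byte string with many separators should make the gap visible.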