Posted by: mlord
A Regex curiosity - 15/06/2004 10:39
Yesterday, I managed to more than double the responsiveness of my GeoCaching WAP portal, by making a very simple change to how it was using regular expressions to parse the cache pages.
Previously, several AWK statements resembled this:
gsub(".*images/WptTypes/","",text)
On my 600MHz server, for a 10KByte text value, this could sometimes take 15-25 seconds to execute under gawk. The simple change was to anchor the regex pattern by putting a caret symbol at the beginning:
gsub("^.*images/WptTypes/","",text)
This change makes execution nearly instantaneous, the way it should be, without any change whatsoever to the meaning of the original regular expression: a leading .* can already match from the very start of the text, so the leftmost match always begins there anyway. I wonder if the gawk folks might be willing to add this to their regex parser:
if (substr(pattern,1,2) == ".*") { pattern = "^"pattern };
Or perhaps even this:
sub("^[.][*]","^&",pattern)
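For anyone who wants to check the equivalence claim, here is a minimal sketch that applies the rewrite above and compares both patterns on the same input. The sample text and the helper name anchor_dotstar are made up for illustration; they are not taken from the real portal scripts, and the snippet only checks that the results match, it does not measure speed.

```sh
# Sketch only: the sample text and the helper name anchor_dotstar are
# hypothetical; they are not from the original scripts.
result=$(awk '
function anchor_dotstar(pattern) {
    # Prepend a caret when the pattern starts with a literal ".*"
    sub("^[.][*]", "^&", pattern)
    return pattern
}
BEGIN {
    text = "foo images/WptTypes/2.gif bar images/WptTypes/3.gif baz"
    slow = ".*images/WptTypes/"
    fast = anchor_dotstar(slow)

    a = text; gsub(slow, "", a)   # original, unanchored pattern
    b = text; gsub(fast, "", b)   # rewritten, anchored pattern

    verdict = (a == b) ? "same" : "different"
    print fast
    print verdict ": " a
}')
echo "$result"
```

On this input both calls should strip everything up to and including the last images/WptTypes/, so the second output line reports the two results as the same.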
Cheers