Shai Berger shai at platonix.com
Tue May 25 13:41:58 IDT 2010

Hi again Yitzhak,

On Tuesday 25 May 2010 13:20:38 Yitzhak Wiener wrote:
> The file is indeed large, ~6MB.
> I added print lines before/after each line and found that the only line
> that consumes more than 1 second was: " match = re.search(pattern, txt,
> re.S) ", it consumed ~5 minutes!

There are a few things to try.

First is the easiest: In the pattern, change the two occurrences of ".*" to 
".*?". In this case, this should not change the result, but -- especially if 
most of the file is after the text you're looking for -- it should improve 
timings (".*" means "the longest possible string of characters", and ".*?" 
means "the shortest possible"; I'm assuming there is only one possible string, 
being both longest and shortest; why this should change the timing is left as 
an exercise).
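
To check the "should not change the result" part before trusting the change, 
here is a minimal sketch. I don't have the real pattern or file, so the 
pattern, the START/END markers and the text are all made-up stand-ins; only 
the switch from ".*" to ".*?" is the point. It compares the two matches and 
prints rough timings, which will of course depend on the real data.

```python
import re
import timeit

# Hypothetical stand-in for the real file: the wanted text sits near the
# start and is followed by a lot of unrelated lines.
txt = "junk\nSTART\nfoo\nkey=value\nbar\nEND\n" + "filler line\n" * 1000

# Hypothetical patterns with two wildcards each, greedy and non-greedy.
greedy = re.compile(r"START.*key=(\w+).*END", re.S)
lazy = re.compile(r"START.*?key=(\w+).*?END", re.S)

greedy_match = greedy.search(txt)
lazy_match = lazy.search(txt)

# Each marker occurs only once here, so both patterns capture the same text.
same_result = greedy_match.group(1) == lazy_match.group(1)
print("same result:", same_result)

# Rough timings over a few repetitions; the numbers vary by machine and data.
greedy_time = timeit.timeit(lambda: greedy.search(txt), number=20)
lazy_time = timeit.timeit(lambda: lazy.search(txt), number=20)
print("greedy: %.6f s, lazy: %.6f s" % (greedy_time, lazy_time))
```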

If this doesn't help, perhaps you can cut the file into smaller pieces 
before running the search.
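
One possible way to do the cutting, as a sketch only: if the text you want is 
known to sit between two literal strings, find them with plain str.find() 
(which is fast) and run the regular expression on that slice alone. The helper 
name search_in_window and the START/END markers are hypothetical; whether such 
markers exist in your file is an assumption.

```python
import re


def search_in_window(pattern, txt, start_marker, end_marker, flags=0):
    """Run pattern only on the slice between two literal markers.

    Returns None if either marker is missing. Match offsets are relative
    to the slice, not to the whole text.
    """
    start = txt.find(start_marker)
    if start == -1:
        return None
    end = txt.find(end_marker, start)
    if end == -1:
        return None
    # Keep the end marker inside the slice so the pattern can match it.
    window = txt[start:end + len(end_marker)]
    return re.search(pattern, window, flags)


# Hypothetical data: a small region of interest inside a large text.
big_text = "junk\nSTART\nfoo\nkey=value\nbar\nEND\n" + "filler line\n" * 1000
window_match = search_in_window(
    r"START.*?key=(\w+).*?END", big_text, "START", "END", re.S
)
print(window_match.group(1))
```

The regular expression now only ever sees the few lines between the markers, 
so its running time no longer depends on the size of the rest of the file.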

Have fun,

