[Python-il] [python-il]location in file

Rani Hod rani.hod at gmail.com
Wed May 26 19:55:53 IDT 2010


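It's not the usual question mark you find in regular expressions (e.g.
"ab?c" matches "abc" and "ac"). When '?' directly follows a quantifier such
as '*', it makes that quantifier non-greedy: it matches as few characters as
possible instead of as many as possible.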
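
A minimal sketch of the two meanings of '?', using made-up text and
patterns (these are assumptions for illustration, not the ones from
Yitzhak's script):

```python
import re

# '?' after a literal: the preceding item is optional.
optional_ok = bool(re.fullmatch(r"ab?c", "abc")) and bool(re.fullmatch(r"ab?c", "ac"))

# '?' after a quantifier: the quantifier becomes non-greedy.
# Hypothetical sample text with two candidate end markers.
text = "<a>one</a> filler <a>two</a>"
greedy = re.search(r"<a>.*</a>", text, re.S).group(0)
lazy = re.search(r"<a>.*?</a>", text, re.S).group(0)

print(greedy)  # <a>one</a> filler <a>two</a>
print(lazy)    # <a>one</a>

# When only one match is possible, both forms return the same text.
# Hypothetical input: a single START ... END pair followed by a long tail.
unique = "START payload END" + " tail" * 1000
g = re.search(r"START(.*)END", unique, re.S)
l = re.search(r"START(.*?)END", unique, re.S)
print(g.group(1) == l.group(1))  # True
```

In the last case the greedy ".*" first runs to the end of the text and
then backs up until "END" fits, while ".*?" grows one character at a time
from the start. That is probably why the non-greedy form is faster when
most of the file comes after the match.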

On Wed, May 26, 2010 at 19:47, Yitzhak Wiener <Yitzhak.Wiener at dspg.com> wrote:

> Hi Shai,
>
> It worked. Thanks.
> Adding the '?' after the '*' solved the time problem. I found it in the
> Python documentation but didn't really understand the logic behind it. Why
> does it have an effect? The '*' comes first, so logically it should still
> be greedy, shouldn't it?
>
>
> Best Regards,
> Yitzhak
>
> -----Original Message-----
> From: Shai Berger [mailto:shai at platonix.com]
> Sent: Tuesday, May 25, 2010 1:42 PM
> To: Yitzhak Wiener
> Cc: python-il at hamakor.org.il
> Subject: Re: [Python-il] [python-il]location in file
>
> Hi again Yitzhak,
>
> On Tuesday 25 May 2010 13:20:38 Yitzhak Wiener wrote:
> > The file is indeed large, ~6MB.
> > I added print lines before/after each line and found that the only line
> > that consumes more than 1 second was: " match = re.search(pattern, txt,
> > re.S) ", it consumed ~5 minutes!
> >
>
> There are a few things to try.
>
> First is the easiest: In the pattern, change the two occurrences of ".*" to
> ".*?". In this case, this should not change the result, but -- especially if
> most of the file is after the text you're looking for -- it should improve
> timings (".*" means "the longest possible string of characters", and ".*?"
> means "the shortest possible"; I'm assuming there is only one possible
> string, being both longest and shortest; why this should change the timing
> is left as an exercise).
>
> If this doesn't help, perhaps you can do some smart cutting of the file to
> pieces before the search.
>
> Have fun,
>        Shai.
>
> ______________________________________________________________________
> DSP Group, Inc. automatically scans all emails and attachments using
> MessageLabs Email Security System.
> _____________________________________________________________________
> _______________________________________________
> Python-il mailing list
> Python-il at hamakor.org.il
> http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
>