[Python-il] [python-il]location in file

Shai Berger shai at platonix.com
Mon May 24 19:13:40 IDT 2010


I would ignore the number of words, and focus on headers. With the headers, we 
specify the part of the text we want; we use a capturing group to pick out 
only the interesting part.

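import re

section_header = "MultiProgPage_Data at c2 - SECTION HEADER"
next_section_header = "beeper_FW_bg_Task at c2 - SECTION HEADER"
part_header = "RAW DATA:"

# re.escape guards against regex metacharacters in the headers;
# the non-greedy .*? stops at the first occurrence of each header.
pattern = "%s.*?%s(.*?)%s" % (re.escape(section_header),
                              re.escape(part_header),
                              re.escape(next_section_header))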

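Then, just extract your section:

# re.S (DOTALL) lets '.' match newlines, so the match can span lines;
# re.M only changes the meaning of ^ and $.
match = re.search(pattern, txt, re.S)
if match:
	section = match.group(1)

	words = section.split()
	del words[-1] # This is the '49.'  of the next header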
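
To see the whole thing run end to end, here is a minimal sketch on a made-up excerpt. The layout of the sample text (a numbered header line, a "SIZE:" line, then the data) is only my assumption about what the dump looks like; the header strings are the ones from above.

```python
import re

# Hypothetical dump excerpt; the real file layout is an assumption.
txt = """48. MultiProgPage_Data at c2 - SECTION HEADER
SIZE: 4
RAW DATA:
0x0001 0x00ff
0x1234 0xabcd
49. beeper_FW_bg_Task at c2 - SECTION HEADER
RAW DATA:
0xdead
"""

section_header = "MultiProgPage_Data at c2 - SECTION HEADER"
next_section_header = "beeper_FW_bg_Task at c2 - SECTION HEADER"
part_header = "RAW DATA:"

# Escape the literal headers and match as little as possible between them.
pattern = "%s.*?%s(.*?)%s" % (re.escape(section_header),
                              re.escape(part_header),
                              re.escape(next_section_header))

words = []
match = re.search(pattern, txt, re.S)  # re.S: '.' also matches newlines
if match:
    words = match.group(1).split()
    del words[-1]  # drop the "49." that precedes the next header
```

On this sample, words ends up holding the four hexadecimal words of the first section. With re.M in place of re.S the same search finds nothing here, because '.' then stops at the end of each line.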


You might find a reading of http://docs.python.org/library/re.html, top to 
bottom, worthwhile.
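
The other route from my earlier message (quoted below), collecting the relevant lines after splitting the file into lines, might look roughly like this. The function name extract_words is made up for illustration, and the sketch assumes that every section header line contains the text "SECTION HEADER"; adjust that if your dump differs.

```python
def extract_words(lines, section_header, part_header):
    """Collect whitespace-separated words after part_header in one section."""
    in_section = False
    in_part = False
    words = []
    for line in lines:
        if "SECTION HEADER" in line:
            # Assumption: any header line starts a new section,
            # which also ends the part we were collecting.
            in_section = section_header in line
            in_part = False
        elif in_section and part_header in line:
            in_part = True
            # Keep anything that follows the part header on the same line.
            words.extend(line.split(part_header, 1)[1].split())
        elif in_part:
            words.extend(line.split())
    return words


# Hypothetical dump excerpt; the real file layout is an assumption.
sample = """48. MultiProgPage_Data at c2 - SECTION HEADER
SIZE: 4
RAW DATA:
0x0001 0x00ff
0x1234 0xabcd
49. beeper_FW_bg_Task at c2 - SECTION HEADER
RAW DATA:
0xdead
"""

words = extract_words(sample.splitlines(),
                      "MultiProgPage_Data at c2 - SECTION HEADER",
                      "RAW DATA:")
```

Since the loop stops collecting at the next header line, there is no trailing '49.' to delete in this variant.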

Have fun,
	Shai.


On Monday 24 May 2010 18:58:54 Yitzhak Wiener wrote:
> Hi Shai,
> 
> Thanks for the prompt reply.
> As a real beginner, I think I partly understand your idea, but I don't
> know how to do it. Can you help with this? Assuming I prefer the first
> option, searching the entire file, I would start as follows:
> txt = open(r"project_release.dump").read()
> # Now I should find the next x hexadecimal words (x is known) that
> # start after the string "RAW DATA:" in the section that starts with
> # "MultiProgPage_Data at c2 - SECTION HEADER".
> How do I do that?
> 
> 
> 
> Thanks,
> Yitzhak
> 
> 
> -----Original Message-----
> From: python-il-bounces at hamakor.org.il
> [mailto:python-il-bounces at hamakor.org.il] On Behalf Of Shai Berger
> Sent: Monday, May 24, 2010 6:34 PM
> To: python-il at hamakor.org.il
> Subject: Re: [Python-il] [python-il]location in file
> 
> Hi Yitzhak,
> 
> You said,
> 
> > I am searching for data in a file. The file is a text file. I was
> > using RE to find the location in the file that I was interested in.
> 
> but in the code, you wrote,
> 
> > for line in s:
> 
> [...]
> 
> >    if re.match(r".*RAW DATA.*", line):
> 
> That is, instead of finding the location in the FILE, you found the
> location in the LINE.
> 
> What you should do instead is get a string that contains your whole
> section; you can do this with regular expressions (applied to the whole
> file, s, with re.S so that '.' matches newlines), or you can do this by
> collecting the relevant lines after having split the file into lines.
> Then, just use section.split() to get a list of the "words" (as
> separated by whitespace) in the section.
> 
> Have fun,
> 	Shai.
> _______________________________________________
> Python-il mailing list
> Python-il at hamakor.org.il
> http://hamakor.org.il/cgi-bin/mailman/listinfo/python-il
> 
> ______________________________________________________________________
> DSP Group, Inc. automatically scans all emails and attachments using
> MessageLabs Email Security System.
> _____________________________________________________________________
> 
> 

