Can AFR replace text with text from another part of the doc?

Posted: Mon Jan 31, 2005 5:32 pm
by Mark
Hi

I was wondering if AFR can do the following:

1. Search for <H1>Text A</H1>.
2. Extract the string "Text A".
3. Search for <TITLE>Text B</TITLE>.
4. Replace the string "Text B" with "Text A".

In a nutshell, I want to process many HTML files, and replace a standard TITLE text with the text that is found in the H1 headers on each page.

Possible???

Thanks

Mark

Posted: Mon Jan 31, 2005 5:32 pm
by Abacre
Hi,

It's an interesting case, and it's possible. I assume that in your case there is only one
<H1></H1> tag per web page, or at least that the data should be taken from the
first <H1></H1> tag on the page. Is that right?

So the final r.e. is:
Search for:
<TITLE>.*</TITLE>(.*)<H1>(.*)</H1>
Replace with:
<TITLE>$2</TITLE>$1<H1>$2</H1>

I verified it on a test file, and it works fine.

Be sure to set up the modifiers correctly (main menu Action - Options -
Batch replace):
Modifier S: ON
Modifier M: ON
Modifier I: ON
Modifier G: OFF
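
For readers who want to try the same idea outside AFR, here is a minimal Python sketch of the search and replace above. It is not AFR's implementation. It assumes that Modifier S corresponds to Python's `re.DOTALL`, Modifier I to `re.IGNORECASE`, and that Modifier G: OFF means non-greedy matching, written here as `.*?`. The function name `copy_h1_to_title` and the sample HTML are made up for illustration, and Python writes the replacement groups as `\g<1>` and `\g<2>` rather than `$1` and `$2`.

```python
import re

# Assumed equivalents of the AFR modifiers: S -> DOTALL, I -> IGNORECASE.
# G: OFF (non-greedy) is assumed to correspond to the lazy quantifier .*?
PATTERN = re.compile(
    r"<TITLE>.*?</TITLE>(.*?)<H1>(.*?)</H1>",
    re.DOTALL | re.IGNORECASE,
)


def copy_h1_to_title(html: str) -> str:
    """Replace the TITLE text with the text of the first H1 that follows it."""
    # Group 1 is everything between </TITLE> and <H1>; group 2 is the H1 text.
    # count=1 limits the change to the first TITLE/H1 pair in the page.
    return PATTERN.sub(
        r"<TITLE>\g<2></TITLE>\g<1><H1>\g<2></H1>", html, count=1
    )


# Hypothetical sample page, used only to illustrate the replacement.
sample = (
    "<html><head><TITLE>Standard title</TITLE></head>\n"
    "<body><H1>Text A</H1><p>Body</p></body></html>"
)
result = copy_h1_to_title(sample)
print(result)
```

Like the r.e. above, this sketch only works when the TITLE tag appears before the H1 tag in the file, and it rewrites both tags in upper case.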

The variable dollar sign

Posted: Mon Jan 31, 2005 5:32 pm
by Claus
What's this with the $2?
I have a string pattern from which I want to remove only a part.
Ex: 10.12.1234
I would like to remove only the first two digits and the first dot, transforming it into:
12.1234
I can find the strings with:
10\.[1234567890]{2}\.[1234567890]{4}
- but how do I replace them if I want to keep the original two digit groups (the digits vary)?

Posted: Mon Jan 31, 2005 5:32 pm
by Abacre
$2 is one of the r.e. tricks that most people don't seem to know.
Read the whole page http://www.abacre.com/afr/manual/regexpsyntax.htm
from the beginning.
If you use () in an r.e., you can refer to the text it matched in the "replace with" part
by using $ followed by the position number of the (): $1 for the first pair, $2 for the second, and so on.

So final r.e. for your case will be:
Search for:
10\.([1234567890]{2})\.([1234567890]{4})
Replace with:
$1.$2

Of course, you can write it more simply:

Search for:
10\.(\d{2})\.(\d{4})
Replace with:
$1.$2

Good luck!
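
The same capture-group idea can be checked quickly in Python. This is only a sketch for comparison, not AFR itself: Python's `re` module writes the references as `\1` and `\2` in the replacement string instead of `$1` and `$2`. The helper name `drop_prefix` and the sample text are invented for this example.

```python
import re

# Same search expression as above: two capture groups after the literal "10."
DATE_LIKE = re.compile(r"10\.(\d{2})\.(\d{4})")


def drop_prefix(text: str) -> str:
    """Remove the leading '10.' and keep both captured digit groups."""
    # \1 and \2 refer to the first and second () in the search expression.
    return DATE_LIKE.sub(r"\1.\2", text)


# Hypothetical sample text with two matching strings.
print(drop_prefix("Ref 10.12.1234 and 10.99.0001"))
```

Because the digit groups are captured rather than typed into the replacement, the same expression works whatever the digits are.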

the dollar variable

Posted: Mon Jan 31, 2005 5:32 pm
by Claus
Thanks for the quick reply!

Funny thing happened:
When using your search code, nothing is found.
When I use my search code (which should work similarly?), it finds (most of) the strings.
The replacement writes a literal $1.$2, though.

Can it be caused by the documents being Word .doc files?

Re: the dollar variable

Posted: Mon Jan 31, 2005 5:32 pm
by Abacre
Claus wrote: Can it be caused by the documents being word .doc type?

Yes. The r.e. I described above is Perl-style and is meant for plain text files: html, asp and so on.
But MS Word has its own r.e. syntax. Read the Help file section about wildcards in MS Word:

You can use the \n wildcard to search for an expression and then replace it with the rearranged expression. For example, type (Newton) (Christie) in the Find what box and \2 \1 in the Replace with box. Word will find "Newton Christie" and replace it with "Christie Newton". 

So the final r.e. for MS Word will be:

Search for:
10\.([0123456789]{2})\.([0123456789]{4})

Replace with:
\1.\2

replace variables in word

Posted: Mon Jan 31, 2005 5:32 pm
by Claus
Sorry, I have been searching frantically for this in the help files, and I cannot find the text you quote.

I have tried the code you quote (\1.\2), but the program displays an error message stating that I have incorrect syntax in my replace r.e. at the "\" (backslash) character.