Can AFR replace text with text from another part of the doc

General discussions about Advanced Find and Replace
Mark

Post by Mark » Mon Jan 31, 2005 5:32 pm

Hi

I was wondering if AFR can do the following:

1. Search for <H1>Text A</H1>.
2. Extract the string "Text A".
3. Search for <TITLE>Text B</TITLE>.
4. Replace the string "Text B" with "Text A".

In a nutshell, I want to process many HTML files, and replace a standard TITLE text with the text that is found in the H1 headers on each page.

Is this possible?

Thanks

Mark

Abacre

Post by Abacre » Mon Jan 31, 2005 5:32 pm

Hi,

It's an interesting case, and it's possible. I assume each web page has only one
<H1></H1> tag, or at least that the text should be taken from the
first <H1></H1> tag on the page. Is that right?

So the final regular expression (r.e.) is:
Search for:
<TITLE>.*</TITLE>(.*)<H1>(.*)</H1>
Replace with:
<TITLE>$2</TITLE>$1<H1>$2</H1>

I verified it on a test file and it works fine.

Be sure to set the modifiers correctly (main menu Action - Options -
Batch replace):
Modifier S: ON
Modifier M: ON
Modifier I: ON
Modifier G: OFF
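
For readers who want to try the idea outside AFR, the same search and replace can be sketched with Python's `re` module. This is only an illustration under assumptions: the `retitle` name and the sample page are made up, `re.DOTALL` and `re.IGNORECASE` roughly stand in for modifiers S and I, the lazy `.*?` stands in for modifier G being OFF, and `count=1` limits the change to the first match. Python writes the group references as `\g<1>` and `\g<2>` rather than `$1` and `$2`.

```python
import re

# Same pattern as above; .*? is lazy, mirroring "Modifier G: OFF".
PATTERN = re.compile(
    r"<TITLE>.*?</TITLE>(.*?)<H1>(.*?)</H1>",
    re.DOTALL | re.IGNORECASE,  # roughly modifiers S and I
)

def retitle(html: str) -> str:
    """Copy the text of the first <H1> that follows <TITLE> into the <TITLE>."""
    # \g<1> is everything between </TITLE> and <H1>; \g<2> is the H1 text.
    return PATTERN.sub(r"<TITLE>\g<2></TITLE>\g<1><H1>\g<2></H1>", html, count=1)

page = "<HTML><HEAD><TITLE>Text B</TITLE></HEAD>\n<BODY><H1>Text A</H1></BODY></HTML>"
print(retitle(page))
```

Like the AFR expression, this sketch assumes that the TITLE tag comes before the first H1 tag in the file.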

Claus

The variable dollar sign

Post by Claus » Mon Jan 31, 2005 5:32 pm

What does the $2 mean?
I have a string pattern that I want to remove only part of.
Example: 10.12.1234
I would like to remove only the first two digits and the first dot, transforming it into:
12.1234
I can find the strings with:
10\.[1234567890]{2}\.[1234567890]{4}
but how do I replace them if I want to keep the original two digit groups (the digits vary)?

Abacre

Post by Abacre » Mon Jan 31, 2005 5:32 pm

$2 is one of the r.e. tricks that most people don't seem to know.
Read the whole page http://www.abacre.com/afr/manual/regexpsyntax.htm
from the beginning.
If you put part of the search r.e. in parentheses (), you can refer to the text
it matched in the "Replace with" part by writing $ followed by the position
number of that pair of parentheses: $1 for the first pair, $2 for the second.

So the final r.e. for your case will be:
Search for:
10\.([1234567890]{2})\.([1234567890]{4})
Replace with:
$1.$2

Of course, you can write it more simply:

Search for:
10\.(\d{2})\.(\d{4})
Replace with:
$1.$2
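
To see what the two groups capture, here is a small sketch in Python's `re` module, used only as a stand-in for a Perl-style engine. The `strip_prefix` name and the sample text are hypothetical. Note that Python spells the group references `\1` and `\2` where AFR uses `$1` and `$2`.

```python
import re

# Two capture groups: (\d{2}) becomes group 1, (\d{4}) becomes group 2.
PATTERN = re.compile(r"10\.(\d{2})\.(\d{4})")

def strip_prefix(text: str) -> str:
    """Drop the leading "10." and keep the two captured digit groups."""
    return PATTERN.sub(r"\1.\2", text)

sample = "Ref 10.12.1234 and 10.56.7890"
print(PATTERN.search(sample).groups())  # the captured text of the first match
print(strip_prefix(sample))
```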

Good luck!

Claus

the dollar variable

Post by Claus » Mon Jan 31, 2005 5:32 pm

Thanks for the quick reply!

A funny thing happened:
When I use your search code, nothing is found.
When I use my search code (which should work similarly?), it finds most of the strings.
The replacement writes a literal $1.$2, though.

Can it be caused by the documents being Word .doc files?

Abacre

Re: the dollar variable

Post by Abacre » Mon Jan 31, 2005 5:32 pm

Claus wrote: Can it be caused by the documents being word .doc type ?

Yes. The r.e. I described above is Perl-style and is used for plain text files: html, asp and so on.
But MS Word has its own r.e. syntax. Read the Help file section about wildcards in MS Word:

You can use the \n wildcard to search for an expression and then replace it with the rearranged expression. For example, type (Newton) (Christie) in the Find what box and \2 \1 in the Replace with box. Word will find "Newton Christie" and replace it with "Christie Newton". 

So the final r.e. for MS Word will be:

Search for:
10\.([0123456789]{2})\.([0123456789]{4})

Replace with:
\1.\2
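
The general point, that each engine has its own way of writing group references, can be shown with a short Python sketch. Python is neither AFR nor MS Word, so this says nothing about what either program accepts. It only illustrates that a reference written in a notation the engine does not recognize ends up in the output as literal text.

```python
import re

PATTERN = re.compile(r"10\.(\d{2})\.(\d{4})")
text = "10.12.1234"

# Python's own group syntax: the captured digits are substituted.
with_backslash = PATTERN.sub(r"\1.\2", text)

# "$" has no special meaning in a Python replacement string,
# so "$1.$2" is inserted as literal text.
with_dollar = PATTERN.sub("$1.$2", text)

print(with_backslash)  # 12.1234
print(with_dollar)     # $1.$2
```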

Claus

replace variables in word

Post by Claus » Mon Jan 31, 2005 5:32 pm

Sorry, I have been looking frantically for this in the help files and cannot find the text you quote.

I have tried the code you quote (\1.\2), but the program displays an error message stating that the syntax of my replace r.e. is incorrect at the "\" character (backslash).
