Remove everything but...

Posted: Thu Jun 08, 2006 9:14 am
by curb
I want to remove everything in the HTML except the form. Each page has a different form action, so the search needs to match on the tags themselves:

The match has to begin at <form and end at </form>, so that everything in between is kept as well.

Example:
<html>
<head>
<title>Abacre</title>
</head>
<body>
Hi<br>
<form action="blah">
<input blah blah>
</form>

</body>
</html>

Posted: Wed Jun 28, 2006 6:46 pm
by Abacre
Yes, this can be done with Advanced Find and Replace.

Go to the main menu: Action - Options - Batch Replace
check "Modifier S"
check "Modifier M"
uncheck "Modifier M"
check "Modifier I"

Go to the Batch Replace tab and check "Use regular expressions".
Put two search/replace pairs (one per line) into the grid:

Search for:
\A.*(<form)

Replace with:
$1

Search for:
(</form>).*\Z

Replace with:
$1

That's all. I verified that it works perfectly.
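
To try the two patterns outside the program, here is a minimal Python sketch. It is not how Advanced Find and Replace works internally. It assumes that Python's re.DOTALL and re.IGNORECASE flags roughly correspond to "Modifier S" and "Modifier I", and it writes the replacement as \1 where the grid uses $1. The function name keep_form_only is made up for this illustration.

```python
import re

# Assumption: DOTALL stands in for "Modifier S" (the dot also matches
# newlines) and IGNORECASE for "Modifier I" (case-insensitive matching).
FLAGS = re.DOTALL | re.IGNORECASE


def keep_form_only(html: str) -> str:
    """Apply the two search/replace pairs from the post, in order."""
    # Pair 1: \A.*(<form)  ->  $1
    # Deletes everything from the start of the text up to "<form".
    # In Python the greedy .* runs up to the LAST "<form" in the text.
    html = re.sub(r"\A.*(<form)", r"\1", html, flags=FLAGS)
    # Pair 2: (</form>).*\Z  ->  $1
    # Deletes everything after the first remaining "</form>".
    html = re.sub(r"(</form>).*\Z", r"\1", html, flags=FLAGS)
    return html


if __name__ == "__main__":
    page = (
        "<html>\n<head>\n<title>Abacre</title>\n</head>\n<body>\nHi<br>\n"
        '<form action="blah">\n<input blah blah>\n</form>\n\n</body>\n</html>\n'
    )
    print(keep_form_only(page))
```

If a page has no <form at all, neither pattern matches and the text is returned unchanged. If a page has several forms, this sketch keeps only the last one, because the first pattern is greedy in Python.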

Posted: Sun Aug 27, 2006 6:39 am
by fr0gman
This almost seems to be the solution to my question. However, I need to remove everything except what is between http:// and .php in about 5000 lines of code...


<br><a href=http://mydomain.com/south-carolina-furniture/furniture-stores-in-islandton-sc-south-carolina.php>Furniture Stores In Islandton Sc South Carolina</a><br>

<br><a href=http://mydomain.com/south-carolina-furniture/furniture-cross-hill-south-carolina.php>Furniture Cross Hill South Carolina</a><br>

<br><a href=http://mydomain.com/south-carolina-furniture/furniture-stores-in-sumter-sc-south-carolina.php>Furniture Stores In Sumter Sc South Carolina</a><br>