Harold's Home

News

List all abbreviation tags in an entire site
Last week I received an e-mail from someone who wanted to extract all <abbr> and <acronym> tags from a website and collect them into a kind of glossary. Ideally, the tool would work from within Dreamweaver.

I thought this kind of problem was better suited to Perl, so I went ahead and wrote a script. The script works fine on my OS X machine and I see no reason why it wouldn't work on any other *nix system.

There are a few things to know before using this script:
1. You will have to point the script to your website directory (line 29).
2. There should be no files called "list.txt", "abbr-list.txt" or "abbreviations.txt" in the site root. The script will overwrite them!
3. Final output goes to the file "abbreviations.txt". It lists all <abbr> and <acronym> tags, which you can then paste into a regular HTML file.
4. By default the script extracts tags from html files. You can change the extension to search for (line 36).
5. A secondary Perl script removes duplicates and sorts the output. This file came with BBEdit and is called kill_dups_and_sort.pl. It is included in the download.
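
For readers who want to see the idea without downloading anything, the steps above can be sketched roughly in Python. This is not the Perl script from the download: the function names, the regular expression, and the in-memory de-duplication (standing in for kill_dups_and_sort.pl) are all assumptions made for illustration. A regex like this will also miss nested or malformed markup.

```python
import os
import re

# Assumed pattern: an opening <abbr> or <acronym> tag, its content, and the
# matching closing tag. Case-insensitive; content may span several lines.
TAG_RE = re.compile(r"<(abbr|acronym)\b[^>]*>.*?</\1\s*>",
                    re.IGNORECASE | re.DOTALL)


def extract_tags(html):
    """Return every <abbr>/<acronym> element found, with whitespace collapsed."""
    return [" ".join(m.group(0).split()) for m in TAG_RE.finditer(html)]


def collect_tags(site_root, extension=".html"):
    """Walk site_root and return sorted, de-duplicated tags from matching files."""
    found = set()  # a set drops duplicates as they are added
    for dirpath, _dirnames, filenames in os.walk(site_root):
        for name in filenames:
            if name.lower().endswith(extension):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as fh:
                    found.update(extract_tags(fh.read()))
    # Sort case-insensitively, falling back to the raw text for a stable order.
    return sorted(found, key=lambda t: (t.lower(), t))


def write_glossary(site_root, out_name="abbreviations.txt", extension=".html"):
    """Write one tag per line to out_name in the site root (overwriting it)."""
    tags = collect_tags(site_root, extension)
    out_path = os.path.join(site_root, out_name)
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(tags) + ("\n" if tags else ""))
    return tags
```

Calling write_glossary with the path to your site directory would write the list into that directory. Like the original script, it overwrites an existing "abbreviations.txt" without asking.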

Download the script.
See the output.
Enjoy!
