Archives, Web Spiders, and Spam (Oh, my!)

From: elvin.l.hoel@exxonmobil.com
Date: Thu Oct 17 2002 - 16:33:39 BST


At the PIUG NE Workshop on October 8, I took the opportunity to raise the
issue of whether or not we should attempt to prevent web indexing spiders
from indexing our Archives. As you will recall, there was significant
discussion of this issue on the List in August. The issues included:

- Does the benefit of having messages locatable using Google or AltaVista
or another ethical indexer outweigh the annoyance of having our email
addresses harvested by spammers?

- Is it too late already, since our messages have long been
exposed?

- Unethical spiders are unlikely to respect the "don't index" flag anyway.

- The Archives are freely reachable from the PIUG home page, and the only
prevention would be to establish logon/password protection of the Archives.
This is a maintenance and expense issue that Derwent and Questel-Orbit
would not be able to support for free - as they do the Archives.

No clear consensus was developed in the discussion - although there were
strong feelings on both sides.

I asked for a show of hands on the question and, of the approximately 100
PIUG members present, the vote appeared to be about 60:40 in favor of
trying to prevent indexing of the Archives.

With this justification, I have asked the Archives webmasters at Derwent
and Questel-Orbit to ensure the "don't index" flag is set for their
Archives. Derwent has confirmed that the flag is set, and has been for some
time. Questel-Orbit will set the flag in the near future.

However, the "don't index" flag is no guarantee. A Google search for
postings in my own name located recent messages from the Derwent Archives,
despite their "don't index" flag being set.
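
For illustration only: this message does not say how the "don't index"
flag is implemented. Assuming it is a robots.txt exclusion rule, the
sketch below shows the check a cooperating spider makes before fetching a
page. The file content, the /archives/ path, and the example.org address
are all hypothetical. The check runs entirely inside the spider, so a
harvester that skips it is not hindered at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real Archives' file is not shown
# in this message, so the path below is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved spider asks before fetching; nothing forces it to.
    for page in ("http://example.org/archives/0123.html",
                 "http://example.org/index.html"):
        print(page, is_allowed(ROBOTS_TXT, "ExampleBot", page))
```

With this hypothetical file, a cooperating spider would skip the archive
page and still fetch the home page.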

So, if you are really concerned about spammers harvesting your email
address, I suggest you send messages to PIUG from an account whose replies
you will never read (e.g. a Hotmail account). Please remember to include
in the body of your message (as below) the email address people should use
to contact you. It is unlikely that automated spammer spiders will harvest
email addresses from the body of a message - the sending address is much
easier to locate for harvest.

Regards, Elvin
Elvin L. Hoel
  PIUG Archives Committee Chairman
  Information Research & Analysis
  ExxonMobil Research & Engineering Co., 600 Billingsport Road, Paulsboro,
NJ 08066-0480
  Phone: 856-224-3577; Fax: 856-224-2468; Email:
elvin.l.hoel@exxonmobil.com




This archive was generated by hypermail 2b30 : Fri Feb 14 2003 - 11:57:11 GMT