Sitescooper automatically retrieves the stories from several news websites, trims off extraneous HTML, and converts them into formats you can read later, on the move, on your Palm computing device. It maintains a cache, and will avoid stories you've already read. It can handle 1-page sites, 1-page sites with diffing, 2-level sites and 3-level sites, and it's very easy to add a new site to its list.
Even if you don't have a Palm handheld, it's still handy for simple website-to-text conversion, and offline HTML reading.
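As a rough illustration of the "1-page with diffing" idea, the sketch below compares a freshly fetched page against a cached copy and keeps only the lines that are new. It is not sitescooper's own code; the function name `new_lines` and the line-based comparison are assumptions made for this example.

```python
import difflib


def new_lines(cached_text: str, fresh_text: str) -> list:
    """Return the lines of fresh_text that do not appear in cached_text."""
    old = cached_text.splitlines()
    new = fresh_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        # "insert" and "replace" opcodes mark spans of `new` absent from `old`.
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added


if __name__ == "__main__":
    cached = "Story A\nStory B\n"          # what was read last time
    fresh = "Story C\nStory A\nStory B\n"  # today's copy of the page
    print(new_lines(cached, fresh))
```

Saving the fresh copy back to the cache after each run means the next run only reports stories added since then.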
The output formats supported by sitescooper are as follows:
plain text
HTML
, a free, HTML-based format for Palm handhelds
iSilo, an HTML-based format for the Palm Computing organizers from DC and Co. Free and shareware versions of the viewer are available.
DOC format, as used by AportisDoc, TealDoc, CSpotRun, etc. Again, free and shareware viewers are available.
RichReader, an RTF-based format with formatting.
Any other format that converts from text or HTML, using the -pipe functionality.
Included in the bundle are site files for Slashdot, NTKnow, BluesNews, Linux Weekly News, Wired News, BBC News, TBTF, Hacker News Network, Robot Wisdom weblog, Memepool, Jakob Nielsen's Alertbox, Ars Technica, I, Cringely, Linux Today, comp.risks, and over 300 more.
The latest released version is 3.1.2.
HTTP and local files, using the file:/// protocol, are both supported, and it works fine on most UNIX platforms, Windows 95, 98 and NT, and Macs.
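To show how one fetch routine can serve both kinds of source, here is a minimal Python sketch. It relies on the standard library's `urllib.request.urlopen`, which accepts `http://` and `file://` URLs alike. The `fetch` and `demo` names and the temporary file are assumptions for illustration, not part of sitescooper.

```python
import tempfile
import urllib.request
from pathlib import Path


def fetch(url: str) -> bytes:
    """Fetch a URL; the default opener handles http://, https:// and file://."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def demo() -> bytes:
    """Write a small page to disk and read it back through a file:/// URL."""
    with tempfile.TemporaryDirectory() as tmp:
        page = Path(tmp) / "story.html"
        page.write_text("<p>Hello from disk</p>", encoding="utf-8")
        # as_uri() turns the absolute path into a file:/// URL.
        return fetch(page.as_uri())


if __name__ == "__main__":
    print(demo())
```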
The web-retrieval logic can handle a wide variety of formats (1-page sites, 1-page sites with diffing, 2-level sites, and 3-level sites). It trims out sidebar tables and search forms automatically, and can deliver the output as one big page with all the articles and a table of contents, multiple pages and a TOC, or just all the pages in one long list. Effectively, sitescooper acts as a transcoder for handheld PCs.
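The trimming step can be pictured with a short sketch. The version below is deliberately cruder than what is described above: it discards every table and form rather than picking out sidebars and search boxes, and it returns plain text only. The `Trimmer` class and `trim` function are hypothetical names, built on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class Trimmer(HTMLParser):
    """Collect text, skipping anything nested inside <table> or <form>."""

    SKIPPED = {"table", "form"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0    # how many skipped elements we are currently inside
        self.chunks = []  # text kept so far

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth == 0 and data.strip():
            self.chunks.append(data.strip())


def trim(html: str) -> str:
    """Return the page text with table and form contents removed."""
    parser = Trimmer()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


if __name__ == "__main__":
    page = (
        "<table><tr><td>Nav</td></tr></table>"
        "<h1>Title</h1>"
        "<form><input name='q'>Search</form>"
        "<p>Body text.</p>"
    )
    print(trim(page))
```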
It's easy to extend with your own sites, and it can use My-Netscape-style RSS files to find the articles on a given site.
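Finding articles through an RSS file might look roughly like the following. This is a sketch under stated assumptions, not sitescooper's implementation: the `article_links` helper, the sample feed and its example.com URLs are all made up for illustration, and the parser simply collects the title and link of every item element, ignoring any XML namespace.

```python
import xml.etree.ElementTree as ET

# A made-up feed in the simple RSS 0.91 layout (no namespaces).
SAMPLE = """<rss version="0.91"><channel>
<title>Example</title><link>http://example.com/</link>
<item><title>First story</title><link>http://example.com/1.html</link></item>
<item><title>Second story</title><link>http://example.com/2.html</link></item>
</channel></rss>"""


def local(tag: str) -> str:
    """Strip an XML namespace prefix of the form {uri} from a tag name."""
    return tag.rsplit("}", 1)[-1]


def article_links(rss_xml: str) -> list:
    """Return (title, link) pairs for every item element in the feed."""
    root = ET.fromstring(rss_xml)
    links = []
    for element in root.iter():
        if local(element.tag) != "item":
            continue
        fields = {local(child.tag): (child.text or "").strip() for child in element}
        if fields.get("link"):
            links.append((fields.get("title", ""), fields["link"]))
    return links


if __name__ == "__main__":
    for title, link in article_links(SAMPLE):
        print(title, link)
```

Each link returned this way could then be fetched and trimmed like any other story page.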
In short, it's neat.
(Note: if you tried to access this site as http://sitescooper.tsx.org/ and got a "URL not found" error, my apologies; it's because I've deleted that forwarding URL. When I started work on sitescooper, tsx.org was a reputable forwarding service; when I checked http://sitescooper.tsx.org/ today, it gave me two uncloseable ad windows advertising a variety of porn sites, and another three ad windows on top of that. This is not the kind of thing I want sitescooper to be associated with, so I'd rather delete the forwarding URL than lend it my implied support.)