OT- Automated data extraction from websites
Hanasaki JiJi
hanasaki at hanaden.com
Fri May 9 01:55:41 CDT 2003
Java has a nice set of APIs that can open an HTTP session and retrieve
the HTML file. Then you just run it through a parser generated by JavaCC
(a parser generator, not a compiler for the page itself) from an HTML
grammar and pull out what you need.
The grammar is here:
http://www.cobase.cs.ucla.edu/pub/javacc/html-3.2.jjt
The parser is here:
http://www.cobase.cs.ucla.edu/pub/javacc/
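
For the retrieval half, here is a rough sketch using only the standard
library. The class name PageFetcher and its two methods are made up for
illustration, and it assumes the page is UTF-8 text. It does not touch the
JavaCC side; the string it returns is what you would hand to the parser
generated from the grammar above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library mentioned in this thread.
class PageFetcher {

    // Read an entire stream into a String, assuming the bytes are UTF-8.
    static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Open an HTTP connection to the given address and return the body as text.
    static String fetch(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java PageFetcher <url>");
            return;
        }
        String html = fetch(args[0]);
        // Next step (not shown): feed this text to the JavaCC-generated parser.
        System.out.println(html.length() + " characters retrieved");
    }
}
```

Keeping readAll separate from fetch means the same text-reading step can be
tried against a saved copy of the page before pointing it at a live site.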
Carl Sappenfield wrote:
> www.screenscraper.com has a nice utility. You'll need to know some Java to
> automate what you're talking about, but not much. Last I looked it was free
> (beta 0.8something).
>
>
> ----- Original Message -----
> From: <KRFinch at dstsystems.com>
> To: <kclug at kclug.org>
> Sent: Thursday, May 08, 2003 12:13 PM
> Subject: OT- Automated data extraction from websites
>
>
>
>>Hello all!
>>
>>A little off-topic, but I figured that there might be someone in this
>>crowd