OT- Automated data extraction from websites

Hanasaki JiJi hanasaki at hanaden.com
Fri May 9 01:55:41 CDT 2003


Java has a nice set of APIs that can open an HTTP connection and retrieve
the HTML file. Then you just run it through a parser generated with JavaCC
(the Java Compiler Compiler, a parser generator) and pull out what you need.

The grammar is here:
	http://www.cobase.cs.ucla.edu/pub/javacc/html-3.2.jjt
The parser is here:
	http://www.cobase.cs.ucla.edu/pub/javacc/
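
For what it's worth, here is a rough sketch of the fetch-then-extract idea
in plain Java. It is only an illustration, not the JavaCC approach itself:
the class name PageScraper and its methods are made up for this example,
the regular expression is a crude stand-in for a parser generated from the
grammar above, and the page is assumed to be UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and methods are illustrative only.
class PageScraper {
    // Matches a quoted href value inside an <a ...> tag, ignoring case.
    // A regex is a stand-in here; a real HTML parser is more robust.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE);

    // Retrieve the page body over HTTP (assumes an http/https URL and UTF-8 text).
    static String fetch(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        StringBuilder html = new StringBuilder();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(
                conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                html.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return html.toString();
    }

    // Pull out the piece we care about: here, every link target on the page.
    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2)); // group 2 is the text between the quotes
        }
        return links;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PageScraper <url>");
            return;
        }
        for (String link : extractLinks(fetch(args[0]))) {
            System.out.println(link);
        }
    }
}
```

Swapping extractLinks for a call into the JavaCC-generated parser would be
the next step if the data you want is more structured than link targets.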

Carl Sappenfield wrote:
> www.screenscraper.com has a nice utility.  You'll need to know some Java to
> automate what you're talking about, but not much.  Last I looked it was free
> (beta 0.8something).
> 
> 
> ----- Original Message -----
> From: <KRFinch at dstsystems.com>
> To: <kclug at kclug.org>
> Sent: Thursday, May 08, 2003 12:13 PM
> Subject: OT- Automated data extraction from websites
> 
> 
> 
>>Hello all!
>>
>>A little off-topic, but I figured that there might be someone in this
>>crowd



