I should have said to look at WhizBang! Labs (www.whizbang.com) rather than
FlipDog.com (www.flipdog.com). FlipDog is a client/demonstration site for
WhizBang! Labs' technology.
WhizBang! Labs has developed software that builds application-specific
databases by automatically finding and extracting user-defined content from
an unlimited number of Web pages located anywhere on the internet. The
company's proprietary software:
1. Crawls the Web, searching for and identifying new domains
2. Classifies pages in each domain, identifying those that contain the
   user-defined target data
3. Captures the target data, extracting it from the pages it has found and
   classified, whether that target data is embedded in the text or stored
   behind forms
4. Compiles the extracted data, storing it in a relational database where it
   can then be searched, sorted, filtered, and otherwise manipulated with
   traditional Relational Database Management System (RDBMS) tools, either
   directly or through a public or private portal
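
The four steps listed above (crawl, classify, capture, compile) can be
illustrated with a toy sketch. This is not WhizBang!'s software, whose
internals are proprietary and not described here. Everything in it is an
assumption made for illustration: the in-memory PAGES dictionary standing in
for the Web, the "Job title:" / "Location:" field labels, the keyword-based
classifier, and the jobs table layout.

```python
import re
import sqlite3

# Hypothetical stand-in for the Web: URL -> page text.
# A real crawler would discover and fetch these pages over HTTP.
PAGES = {
    "http://example.com/jobs/1": "Job title: Linguist\nLocation: Burlington, MA",
    "http://example.com/about": "About our company.",
    "http://example.org/careers/7": "Job title: Software Engineer\nLocation: Provo, UT",
}

# Assumed page layout: one "Label: value" pair per line.
FIELD_RE = re.compile(r"^(Job title|Location):\s*(.+)$", re.MULTILINE)


def crawl(pages):
    """Step 1: yield (url, text) pairs; here it only walks the in-memory dict."""
    yield from pages.items()


def is_target(text):
    """Step 2: toy classifier; a page qualifies if it has a 'Job title:' line."""
    return re.search(r"^Job title:", text, re.MULTILINE) is not None


def capture(text):
    """Step 3: extract the labelled fields from a page into a dict."""
    return {label: value.strip() for label, value in FIELD_RE.findall(text)}


def compile_db(pages):
    """Step 4: store extracted records in a relational (SQLite) database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (url TEXT, title TEXT, location TEXT)")
    for url, text in crawl(pages):
        if is_target(text):
            record = capture(text)
            conn.execute(
                "INSERT INTO jobs VALUES (?, ?, ?)",
                (url, record.get("Job title"), record.get("Location")),
            )
    conn.commit()
    return conn


conn = compile_db(PAGES)
# Once compiled, the data can be queried with ordinary SQL.
rows = conn.execute("SELECT title, location FROM jobs ORDER BY title").fetchall()
print(rows)
```

Classifying before capturing means the extraction patterns only run on pages
already judged relevant; a real system would presumably use far more robust
classification and extraction than these regular expressions.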
Brian Ulicny, PhD
Senior Software Linguist
Lernout & Hauspie Speech Products
52 Third Ave
Burlington MA 01803
USA
bulicny@lhsl.com
bulicny@lhsl.com on 06/06/2000 02:39:12 PM
To: Ken Litkowski <ken@clres.com>
cc: corpora@hd.uib.no (bcc: Brian Ulicny/USER/US/LHS)
Subject: Re: Corpora: IE into MUC-style templates for resumes
Ken, Have a look at FlipDog.com.
Ken Litkowski <ken@clres.com> on 06/06/2000 01:51:45 PM
To: corpora@hd.uib.no
cc: (bcc: Brian Ulicny/USER/US/LHS)
Subject: Corpora: IE into MUC-style templates for resumes
It's been a while since I participated in MUC and have only passively
followed its progress, but I seem to recall that someone (perhaps
several) has a commercial program for doing information extraction for
resumes, extracting data into template fields. Is anyone aware of such
a system? I don't want to reinvent the wheel.
Thanks,
Ken
--
Ken Litkowski
CL Research
9208 Gue Road
Damascus, MD 20872-1025 USA
TEL.: 301-482-0237
EMAIL: ken@clres.com
Home Page: http://www.clres.com
This archive was generated by hypermail 2b29 : Wed Jun 07 2000 - 16:04:52 MET DST