Harvest Press Release


Nov. 9, 1994

POWERFUL NEW SEARCH TOOL NOW AVAILABLE ON INTERNET

A powerful new software package designed to locate specific information on the Internet has been placed on the international web of computer networks by a team of U.S. researchers.

Named Harvest, the software enables users connected to the Internet to locate and summarize information stored in many different formats on machines around the world, said Michael Schwartz, a computer science professor at the University of Colorado at Boulder and team leader for the project. Users can create new collections of information in the process and then put them back out on the Internet for other users.

In addition to making information access easier on the Internet, the Harvest software is designed to ease the strain on servers, or host computers, as well as overall network traffic, said Schwartz.

``For some people, looking for information on the Internet is like being in a foreign country where the language and culture are constantly changing,'' he said. ``Harvest can help people find the information they need in a confusing and ever-changing environment.''

Harvest, which is being made available on the Internet at no charge, has been tested at 60 sites around the world for the past four months. Other team members include the University of Southern California's Peter Danzig; the University of Arizona's Udi Manber; Mic Bowman of Transarc Corp., a Pittsburgh software company; and CU-Boulder researcher Darren Hardy.

One of the Harvest test projects involved building a server for the AT&T 800 numbers directory, said Schwartz. Harvest automatically collected and organized the information from data posted by AT&T on the Internet. The researchers added a powerful search tool to the server, allowing users to find names even when they are misspelled and to locate complete 800 numbers from only partial numbers.
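
The kind of lookup described above can be sketched in a few lines. This is only an illustration and not Harvest's actual search code: the directory entries, the function names find_by_name and find_by_partial_number, and the 0.6 similarity cutoff are all assumptions made for the example, and Python's standard difflib module stands in for whatever matching method the server uses.

```python
from difflib import get_close_matches

# Hypothetical sample directory; names and numbers are invented for illustration.
DIRECTORY = {
    "Acme Travel": "800-555-0101",
    "Boulder Books": "800-555-0142",
    "Harvest Supply": "800-555-0199",
}


def find_by_name(query, cutoff=0.6):
    """Return (name, number) pairs whose names approximately match the query."""
    # Compare in lowercase so capitalization does not affect the match.
    lowered = {name.lower(): name for name in DIRECTORY}
    hits = get_close_matches(query.lower(), list(lowered), n=3, cutoff=cutoff)
    return [(lowered[hit], DIRECTORY[lowered[hit]]) for hit in hits]


def find_by_partial_number(fragment):
    """Return (name, number) pairs whose number contains the given digits."""
    digits = "".join(ch for ch in fragment if ch.isdigit())
    if not digits:
        return []
    return [
        (name, number)
        for name, number in DIRECTORY.items()
        if digits in number.replace("-", "")
    ]


if __name__ == "__main__":
    print(find_by_name("Bolder Boks"))     # misspelled name
    print(find_by_partial_number("0142"))  # partial number
```

In this sketch a misspelled query is matched by string similarity, and a partial number is matched as a run of digits anywhere in the full number.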

Harvest can now, for example, locate thousands of technical reports from around the world on a particular topic and then summarize the contents of each report, he said. Such summaries can be customized for employees of large corporations or for companies collaborating with each other, helping users access only the most pertinent information.

Harvest is the first phase of a $2 million, multi-year project funded by the Advanced Research Projects Agency, the Air Force Office of Scientific Research, the National Science Foundation, Hughes Aircraft Co. and Sun Microsystems of Mountain View, Calif.

Clifford Lynch, director of the University of California Library Automation Project, called Harvest ``one of the most interesting R&D projects'' under way today.

``Up until now, the tools to gather data from the Internet have been very limited and relatively costly in terms of network resources,'' he said. ``What Mike and his group have created is a clever and elegant marriage of research with practical technology important to all of us.''

Before Harvest, online Navy users had to visit each of the Navy's 20 server sites manually to conduct searches, said Jim Glenn, the U.S. Navy's Internet manager, who tested the software. ``Now, we can do it all with one query from one site. It's one-stop shopping for information.''

But Harvest is not a magic bullet, said the University of Arizona's Manber. ``It's impossible to organize all the world's information in one way to fit everyone's needs,'' he said. ``But it is possible to build collections of information on specific topics for specific uses and tie them loosely together so people can search and locate what they want.''

Harvest was placed on the Internet in the United States, England, France, Australia and Japan on Nov. 7. It can be accessed on the Internet through the World Wide Web at http://harvest.cs.colorado.edu.