What is Webstore?

Webstore is a data repository that implements a distributed mechanism to detect and eliminate duplicate contents. It supports unlimited growth of storage capacity and distinct semantics of operation. Webstore is written in 100% pure Java.

The software is being used in the tumba! Portuguese Web search engine to support storage of the crawled contents.

People

Webstore was developed by the XLDB group at the Department of Informatics, Faculty of Sciences, University of Lisbon, Portugal.

Webstore was written by André Leal Santos.






Research

Webstore addresses the requirements of warehousing applications that need to incrementally store and maintain contents gathered from the web.
Duplicated contents are prevalent in web warehouses. Webstore provides an efficient duplicate-elimination mechanism that works by analyzing the contents themselves, without requiring any additional metadata. Our experiments showed that Webstore outperforms NFS by 68% in read operations and by 50% in write operations.
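
To make the idea of content-based duplicate elimination concrete, here is a minimal single-process sketch. It is not Webstore's implementation or API: the class names `DedupStore` and `DedupStoreDemo`, the in-memory map, and the choice of a SHA-256 digest as the content key are all assumptions made for illustration. The sketch only shows how identical contents can be recognized from their bytes alone, with no extra metadata.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of content-based deduplication; not the Webstore API.
class DedupStore {
    // Maps a content digest (hex string) to the single stored copy of those bytes.
    private final Map<String, byte[]> blocks = new HashMap<>();

    // Stores the content and returns its key.
    // Identical contents produce the same key and are kept only once.
    String store(byte[] content) {
        String key = digest(content);
        blocks.putIfAbsent(key, content.clone());
        return key;
    }

    // Returns a copy of the stored content, or null if the key is unknown.
    byte[] retrieve(String key) {
        byte[] data = blocks.get(key);
        return data == null ? null : data.clone();
    }

    // Number of distinct contents currently stored.
    int size() {
        return blocks.size();
    }

    // Assumption: SHA-256 is used as the content signature in this sketch.
    private static String digest(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}

class DedupStoreDemo {
    public static void main(String[] args) {
        DedupStore store = new DedupStore();
        byte[] page = "<html>same page</html>".getBytes(StandardCharsets.UTF_8);
        String first = store.store(page);
        String second = store.store(page.clone()); // same bytes, crawled again
        System.out.println(first.equals(second));  // prints "true"
        System.out.println(store.size());          // prints "1"
    }
}
```

Because the key is derived from the bytes themselves, storing the same page twice costs one lookup and no additional space. A distributed repository would additionally have to decide which node holds each key, which this sketch does not attempt.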

You can learn more about Webstore by consulting our publications.

 

Availability

Webstore is released under the BSD License, which basically states that you can do anything you like with it as long as you mention us and make it clear that this library is covered by the BSD License.

Source code, samples and detailed documentation are provided in the download.

The package is relatively simple to install and run. We encourage you to try it out and to let us know of any problems you find. We would also be very happy to hear from people who are using this software package.