The Wumpus Information Retrieval System – File system search

Author: Stefan Buettcher (stefan@buettcher.org)
Last change: 2005-05-14

Wumpus can be used as a file system search engine for Linux. Before you can use it in this mode, you need to do two things:
  1. Build a Linux kernel with file system change notification enabled. At the moment, only fschange is supported by Wumpus. Support for inotify is under development.
  2. Start a web server (e.g. Apache) that is configured to support PHP.
After you have installed the new Linux kernel with file change notification support and restarted your system, make sure that the notification service is running. For fschange, you can do this by executing "cat /proc/fschange" as root. For inotify, use the inotify-utils package provided by the inotify developer.
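
The fschange check above can also be scripted. The following is a minimal sketch, assuming that the presence of a readable /proc/fschange entry is enough to tell that the running kernel was built with fschange support. The function name fschange_available is hypothetical and not part of Wumpus.

```python
import os


def fschange_available(proc_path="/proc/fschange"):
    """Return True if the fschange proc entry exists and is readable.

    Assumption: a readable entry means the kernel provides fschange.
    This only tests for the entry; it does not read any events from it.
    """
    return os.path.exists(proc_path) and os.access(proc_path, os.R_OK)


if __name__ == "__main__":
    # Run as root, as with the "cat /proc/fschange" check.
    print("fschange available:", fschange_available())
```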

Before you can actually use the system, you need to do two more things:
  1. Edit the wumpus.cfg configuration file in the Wumpus main directory. Change the TCP_PORT configuration variable to the port that you want Wumpus' TCP server to listen on. Change the INDEXABLE_FILESYSTEMS variable so that it reflects your local file system. Wumpus will only index files that are below one of the mount points given here.
  2. Edit the config.php file in wumpus/php so that it is consistent with the TCP port specified in wumpus.cfg (this means changing the value of "$port"). Then copy all files found in wumpus/php into a directory that can be accessed through the web server, e.g. /var/www/html/wumpus or ~/public_html/wumpus.
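
A mismatch between the two port settings is easy to overlook, so a small consistency check may help. The sketch below rests on assumptions about both file formats: that wumpus.cfg contains a line of the form "TCP_PORT = 1234" with "#" starting a comment, and that config.php assigns the port as "$port = 1234;". Adjust the patterns if your files look different. All function names are hypothetical.

```python
import re


def read_cfg_port(cfg_text):
    """Extract TCP_PORT from wumpus.cfg text.

    Assumption: settings are 'KEY = value' lines and '#' starts a comment.
    """
    for line in cfg_text.splitlines():
        line = line.split("#", 1)[0].strip()
        match = re.match(r"^TCP_PORT\s*=\s*(\d+)$", line)
        if match:
            return int(match.group(1))
    return None


def read_php_port(php_text):
    """Extract the value assigned to $port in config.php text.

    Assumption: the assignment looks like '$port = 1234;' (quotes optional).
    """
    match = re.search(r"\$port\s*=\s*[\"']?(\d+)[\"']?\s*;", php_text)
    return int(match.group(1)) if match else None


def ports_match(cfg_text, php_text):
    """Return True only if both files define the same port."""
    cfg_port = read_cfg_port(cfg_text)
    return cfg_port is not None and cfg_port == read_php_port(php_text)
```

To use it, read both files into strings (for example with open("wumpus.cfg").read()) and pass them to ports_match.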
After you have started Wumpus in file system search mode by executing bin/fssearch, you should be able to access the index through the PHP scripts. Wumpus will automatically start an exhaustive file system scan. The time between two such scans can be specified using the TIME_BETWEEN_FS_SCANS configuration variable.

Wumpus automatically reacts to file system mounts and unmounts by creating or releasing indices under the respective mount points. Per-file-system indices will be created under each mount point defined in the configuration file. The index for the "/" file system, for example, will be found in "/.indexdir/".
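
To illustrate the index location rule, here is a minimal sketch that derives the index directory from a mount point. Only the "/" case is stated above. That other mount points follow the same ".indexdir" pattern is an assumption, and the function name index_dir_for is hypothetical.

```python
import posixpath


def index_dir_for(mount_point, index_name=".indexdir"):
    """Return the assumed index directory for a mount point.

    Assumption: every mount point uses the same '.indexdir' name
    that this document gives for the "/" file system.
    """
    return posixpath.join(mount_point, index_name) + "/"


if __name__ == "__main__":
    for mount_point in ("/", "/home"):
        print(mount_point, "->", index_dir_for(mount_point))
```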

Please note that you might have to run umount twice in order to unmount a given file system. The first time you run it, Linux cannot unmount the file system because Wumpus still has open files on it. However, Wumpus notices that an unmount was requested. Thus, when you run umount a second time, it should succeed.
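
The two-step unmount can be wrapped in a small retry helper. This is a sketch only: the one-second pause between attempts is an assumption about how long Wumpus needs to release its files, and the function name unmount_with_retry is hypothetical. The "run" parameter exists so that the command runner can be replaced, e.g. for testing.

```python
import subprocess
import time


def unmount_with_retry(mount_point, attempts=2, delay=1.0, run=subprocess.call):
    """Run umount up to 'attempts' times.

    Returns the number of the attempt that succeeded, or 0 if all failed.
    Assumption: 'delay' seconds are enough for Wumpus to close its files.
    """
    for attempt in range(1, attempts + 1):
        if run(["umount", mount_point]) == 0:
            return attempt
        if attempt < attempts:
            time.sleep(delay)
    return 0
```

For example, unmount_with_retry("/mnt/usb") would typically be run as root and is expected to return 2 when the first umount fails because Wumpus still has files open.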