To: wais-discussion@think.com
From: Brewster Kahle
Subject: WAIS-discussion digest #53

Forum On Wide Area Information Servers and Electronic Publishing
Brewster Kahle

Contents:
  sources in foreign languages (fwd) (Elham Chounet)
  WAIS clients and the directory of servers (rhys@cs.uq.oz.au)
  Re protecting publishers (Ken Kahn)

Please send contributions to wais-discussion@think.com

----------------------------------------------------------------------

From: chounet%FRULM63.BITNET@pucc.Princeton.EDU (Elham Chounet)
Subject: sources in foreign languages (fwd)
Date: Mon, 11 May 92 12:31:06 MET DST

Hello,

While using WAIS to create a source for our department library, which
contains mostly French and English documents, I was very favourably
impressed by WAIS, but I ran into a problem with our French accented
characters and could not process them in a way that satisfied me.

Here are two small extensions I have made, both related to processing
sources in languages other than English.

First, I have added an option to "waisindex" that is useful when
indexing sources in languages containing accented (8-bit) characters:
applying it allows subsequent searches in sources containing accented
characters to run correctly.

Second, I have parameterized "waisindex" by the stoplist, which
determines the words that are ignored during indexing. This list is
obviously specific to each language (maybe even specific to some
sources?).

If interested, get the patch by anonymous ftp from snekkar.ens.fr
(or 129.199.104.3) at file /ftp/pub/unix/appl/wais-8-b4.patch.tar.Z
(which includes a patch-README file).

All suggestions and bug reports are welcome. (chounet@ens.fr)

Hoping this may help someone!
Elham Chounet

------------------------------

Date: Fri, 15 May 92 11:15:39 +1000
From: rhys@cs.uq.oz.au
Subject: WAIS clients and the directory of servers

Hello WAISers,

A friend of mine and I were having a chat about WAIS this morning and
he brought up a particular problem with WAIS clients, and xwais in
particular: searching the list of sources for keywords.

Hey, what about the directory of servers, I hear you cry? Well, what
about it? It's quickly becoming a solution to the wrong problem, at
least locally here at the University of Queensland. I automatically
fetch all sources from quake.think.com each week and store them locally
for our users, so connecting to the directory of servers is supposedly
not necessary.

Currently, if a user wishes to perform a keyword search on the list of
sources, they must connect to the directory of servers to find which of
the local sources are relevant to their query, and then choose the
local ones, increasing network traffic in the process. Alternatively,
they can manually grep the source files, which is hardly friendly
enough.

So, what's the solution? I could create a local directory of servers
that indexes the sources I fetch, so my users can search that, but this
still requires a two-stage search, which is clumsy.

What would be better is a keyword search built into the "Add Source"
function of xwais (and similar functions of other clients) that takes
care of the search in the client and automatically adds all the sources
that match. That is, combine the two steps of searching and adding the
found sources into one. The keyword search doesn't need to be as
complex as the full WAIS search engine, since the sources have a fairly
simple text format and can be indexed "on the fly".

As the number of sources increases, the directory of servers will not
be enough - more smarts need to be built into the clients to permit
searching the sources that are locally available.
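
As a rough illustration of the client-side search proposed above, here
is a minimal sketch in Python. It is not part of xwais or of any WAIS
client. The function name search_sources, the ".src" file suffix, the
Latin-1 decoding, and the rule that every query word must appear in a
source file are all assumptions made for the example.

```python
import re
from pathlib import Path

# Words are runs of ASCII letters and digits; everything else separates them.
WORD = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Return the set of lowercase words found in text."""
    return set(WORD.findall(text.lower()))


def search_sources(source_dir, query, suffix=".src"):
    """Return the names of source files that contain every query word.

    Each file is tokenized "on the fly"; no index is kept on disk.
    The suffix and the all-words-must-match rule are assumptions.
    """
    wanted = tokenize(query)
    if not wanted:
        return []
    matches = []
    for path in sorted(Path(source_dir).glob("*" + suffix)):
        # Latin-1 decodes any byte sequence, so 8-bit text cannot fail here.
        words = tokenize(path.read_text(encoding="latin-1"))
        if wanted <= words:
            matches.append(path.name)
    return matches
```

A client could call search_sources on its local source directory with
the user's keywords and add every returned file in one step, instead of
first querying the directory of servers.
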
WAIS already assumes that clients are smarter than dumb terminals, and
rightly so - we should take that further and build some of the initial
source-finding operations into the client to alleviate the burden on
the user.

Cheers,

Rhys.

--
Rhys Weatherley, University of Queensland, Australia. rhys@cs.uq.oz.au
"I'm a FAQ nut - what's your problem?"

------------------------------

From: Ken Kahn
Subject: Re protecting publishers
Date: Sun, 10 May 1992 21:43:07 PDT

Apropos the recent discussion about how publishers can protect their
goods if they are available in electronic form, Dave Patterson from
Berkeley was asked about that in a recent PARC Forum. He mentioned a
solution he had heard that seems to make sense to me.

The idea is that your terminal has a key and decryption abilities.
When one purchases an electronic document, one gets it encrypted for
the terminal being used. The bits can be saved and read multiple times
on that terminal, but to read the document on another terminal one must
purchase a new copy.

While he didn't mention it, I assume this could easily be generalized
to printers for those who prefer paper. The document one buys would
only be printable on that particular printer. It is true that one could
print lots of copies on that printer, but then one can use a copier
today.

Patterson indicated that the goal would not be perfect protection but
protection from illicit copying by most customers. Sophisticated
high-tech hackery might still be able to work around the scheme. The
analogy would be the way cable TV works today.

I wonder whether, if publishers announced that they would make lots of
documents available were such terminals (and printers) to exist,
terminal manufacturers would soon start making them. I assume that the
extra cost isn't very high (or if it is, that it won't be in a couple
of years).

-ken kahn

------------------------------

End of WAIS-discussion Digest 53
************************
-------