Archive for the 'SEO' Category


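WordPress Stats Plug-in bsuite

http://www.maisonbisson.com/blog/post/10900/#section-3

bsuite is a WordPress statistics plug-in. Its predecessor was bstat, a simple, easy-to-use visitor statistics plug-in.
Its features are as follows:

  1. Tracks page hits.
  2. Tracks search engine keywords.
  3. Outputs the most-viewed posts.
  4. Outputs the most recent comments.
  5. Outputs the most frequent search keywords.
  6. Outputs a traffic pulse graph for the whole site or for a single post.
  7. Highlights search words.
  8. Lists related posts at the bottom of each post.
  9. Integrates bsuite_speedcache.
  10. Lists post tags.

Download it here; the installation instructions are here.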

Posted by micas on Aug 8th 2007 | Filed in SEO | Comments (0)

What Are People Searching For?

http://searchenginewatch.com/showPage.html?page=2156041

Posted by micas on Aug 8th 2007 | Filed in SEO | Comments (0)

Did You Mean: Lucene?

All modern search engines attempt to detect and correct spelling errors in users’ search queries. Google, for example, was one of the first to offer such a facility, and today we barely notice when we are asked “Did you mean x?” after a slip on the keyboard. This article shows you one way of adding a “did you mean” suggestion facility to your own search applications using the Lucene Spell Checker, an extension written by Nicolas Maisonneuve and David Spencer.

 

Techniques of Spell Checking

Automatic spell checking has a long history. One important early paper was F. Damerau’s A Technique for Computer Detection and Correction of Spelling Errors, published in 1964, which introduced the idea of minimum edit distance. Briefly, the concept of edit distance quantifies the idea of one string being “close” to another, by counting the number of character edit operations (such as insertions, deletions and substitutions) that are needed to transform one string into the other. Using this metric, the best suggestions for a misspelling are those with the minimum edit distance.
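To make the idea concrete, here is a minimal sketch of minimum edit distance in Java. It counts only the three operations named above (insertion, deletion, substitution), which is the Levenshtein variant; Damerau's formulation also treats a transposition of adjacent letters as a single edit. The class and method names are my own, chosen for illustration.

```java
import java.util.List;

class EditDistance {

    /** Minimum number of single-character insertions, deletions and substitutions. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j characters of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i characters of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the candidate with the smallest edit distance to word (first one wins on ties). */
    static String closest(String word, List<String> candidates) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : candidates) {
            int d = levenshtein(word, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }
}
```

For the misspelling "lettice", a call such as EditDistance.closest("lettice", Arrays.asList("letting", "lettuce")) picks "lettuce", which is one substitution away, over "letting", which is two.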

Another approach is the similarity key technique, in which words are transformed into some sort of key so that similarly spelled and, hopefully, misspelled words have the same key. To correct a misspelling simply involves creating the key for the misspelling and looking up dictionary words with the same key for a list of suggestions. Soundex is the best-known similarity key, and is often used for phonetic applications.
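As an illustration of a similarity key, the following is a small sketch of the classic American Soundex code: keep the first letter, map the remaining consonants to digits, collapse adjacent letters with the same digit, and pad to four characters. It is written from the commonly published rules rather than taken from any particular library, and the class name is hypothetical.

```java
class SoundexKey {

    // Digit for each letter a..z; '0' marks letters that are not coded (vowels, h, w, y).
    private static final String CODES = "01230120022455012623010202";

    static String soundex(String word) {
        String s = word.toLowerCase();
        StringBuilder key = new StringBuilder();
        char last = '0';
        for (int i = 0; i < s.length() && key.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'a' || c > 'z') {
                continue; // ignore anything that is not a plain letter
            }
            char code = CODES.charAt(c - 'a');
            if (key.length() == 0) {
                key.append(Character.toUpperCase(c)); // the first letter is kept as-is
                last = code;
            } else if (c == 'h' || c == 'w') {
                // h and w do not separate two letters that share a code
            } else {
                if (code != '0' && code != last) {
                    key.append(code);
                }
                last = code; // a vowel resets "last", so a repeated code after it counts again
            }
        }
        while (key.length() < 4) {
            key.append('0');
        }
        return key.toString();
    }
}
```

With this sketch, "lettice" and "lettuce" produce the same key, so a lookup by key would offer one as a suggestion for the other.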

A combination of minimum edit distance and similarity keys (metaphone) is at the heart of the successful strategy used by Aspell, the leading open source spell checker. However, it is a third approach that underlies the implementation of the “did you mean” technique described in this article: letter n-grams.

A letter n-gram is a sequence of n letters of a word. For instance, the word “lucene” can be divided into four 3-grams, also known as trigrams: “luc”, “uce”, “cen”, and “ene”. Why is it useful to break words up like this? The intuition is that misspellings typically only affect a few of the constituent n-grams, so we can recognize the intended word just by looking through correctly spelled words for those that share a high proportion of n-grams with the misspelled word. There are various ways of computing this similarity measure, but one powerful way is to treat it as a classic search engine problem with an inverted index of n-grams into words. This is precisely the approach taken by Lucene Spell Checker. Let’s see how to use it.
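Before turning to Lucene, here is a toy version of the idea in plain Java: split words into n-grams, keep an inverted index from each n-gram to the words containing it, and count how many n-grams of a query each indexed word shares. This is only a sketch to show the principle; the class NGramIndex and its methods are hypothetical and are not part of Lucene Spell Checker.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class NGramIndex {

    private final int n;
    // Inverted index: n-gram -> words that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    NGramIndex(int n) {
        this.n = n;
    }

    /** All letter n-grams of a word, in order of appearance. */
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    void add(String word) {
        for (String gram : ngrams(word, n)) {
            index.computeIfAbsent(gram, k -> new TreeSet<>()).add(word);
        }
    }

    /** For each indexed word, the number of distinct query n-grams it shares. */
    Map<String, Integer> sharedCounts(String query) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String gram : new LinkedHashSet<>(ngrams(query, n))) {
            for (String word : index.getOrDefault(gram, Collections.<String>emptySet())) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Indexing "lettuce", "letting" and "rabbit" with n = 3 and querying for "lettice" returns counts only for "letting" and "lettuce"; "rabbit" shares no trigram and never appears as a candidate.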

A Simple Search Application

We’ll first build a very simple search interface that does not include the “did you mean” facility. It defines a single method that takes a search query string and returns a search result.


package org.tiling.didyoumean;

import java.io.IOException;

import org.apache.lucene.queryParser.ParseException;

public interface SearchEngine {
    public SearchResult search(String queryString) throws IOException, ParseException;
}
  

The search result is a SearchResult object, which is a JavaBean that exposes a list of hits (actually just the top hits, for simplicity) and a few other properties. I have omitted the constructor and getters in the listing here as they are boilerplate code. (The full source code is available in the accompanying download–see the “References” section at the end of the article.)


package org.tiling.didyoumean;

import java.util.List;

public class SearchResult {

    private List topHits;
    private int totalHitCount;
    private long searchDuration;
    private String originalQuery;
    private String suggestedQuery;

}
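For readers who do not have the download to hand, the omitted boilerplate might look roughly like the sketch below. The two constructor shapes are inferred from the calls made by the search engine classes later in this article (four arguments without a suggestion, five with one); the getter names, the use of generics, and the package-private visibility are my own assumptions.

```java
import java.util.List;

class SearchResult {

    private final List<String> topHits;
    private final int totalHitCount;
    private final long searchDuration;
    private final String originalQuery;
    private final String suggestedQuery;

    /** Result without a "did you mean" suggestion. */
    SearchResult(List<String> topHits, int totalHitCount,
            long searchDuration, String originalQuery) {
        this(topHits, totalHitCount, searchDuration, originalQuery, null);
    }

    /** Result with an optional suggestion; suggestedQuery may be null. */
    SearchResult(List<String> topHits, int totalHitCount,
            long searchDuration, String originalQuery, String suggestedQuery) {
        this.topHits = topHits;
        this.totalHitCount = totalHitCount;
        this.searchDuration = searchDuration;
        this.originalQuery = originalQuery;
        this.suggestedQuery = suggestedQuery;
    }

    // Hypothetical getter names, following the usual JavaBean convention.
    List<String> getTopHits() { return topHits; }
    int getTotalHitCount() { return totalHitCount; }
    long getSearchDuration() { return searchDuration; }
    String getOriginalQuery() { return originalQuery; }
    String getSuggestedQuery() { return suggestedQuery; }
}
```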
  

Here’s a very simple implementation of SearchEngine built with Lucene. It uses Lucene’s QueryParser to parse the search query string into a Query that is then used to perform the search. The Lucene Hits object is then mapped to an instance of our SearchResult class.


package org.tiling.didyoumean;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.queryParser.QueryParser;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;

public class SimpleSearchEngine implements SearchEngine {

    private String defaultField;
    private String nameField;
    private Directory originalIndexDirectory;
    private int maxHits;

    public SimpleSearchEngine(String defaultField, String nameField,
            Directory originalIndexDirectory, int maxHits) {
        this.defaultField = defaultField;
        this.nameField = nameField;
        this.originalIndexDirectory = originalIndexDirectory;
        this.maxHits = maxHits;
    }

    public SearchResult search(String queryString) throws IOException, ParseException {
        long startTime = System.currentTimeMillis();
        IndexSearcher is = null;
        try {
            is = new IndexSearcher(originalIndexDirectory);
            QueryParser queryParser = new QueryParser(defaultField, new StandardAnalyzer());
            queryParser.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
            Query query = queryParser.parse(queryString);
            Hits hits = is.search(query);
            long endTime = System.currentTimeMillis();
            return new SearchResult(extractHits(hits), hits.length(), endTime - startTime, queryString);
        } finally {
            if (is != null) {
                is.close();
            }
        }
    }

    private List extractHits(Hits hits) throws IOException {
        List hitList = new ArrayList();
        for (int i = 0, count = 0; i < hits.length() && count++ < maxHits; i++) {
            hitList.add(hits.doc(i).getField(nameField).stringValue());
        }
        return hitList;
    }
}
  

Note that an IOException may be thrown by Lucene if there is a problem reading the index (typically from disk). The finally clause closes the IndexSearcher, but propagates the exception to indicate the problem to the client, which is the MVC layer, in this case.

With these ingredients it is straightforward to write a user interface that accepts user queries and presents the search results back to the user. I chose Spring’s MVC framework for this. Since this is an article about search and not about Spring, I won’t present any of the code for the user interface here–instead, please refer to the accompanying download.

Figure 1 is a screenshot of the search interface, running against an index of texts by Beatrix Potter from Project Gutenberg.

Figure 1. A simple search application

Adding “Did You Mean” to the Simple Search

Next we’ll extend the search to prompt with “did you mean” suggestions for misspelled search terms in the query. Let’s go through this step by step in the following subsections.

Generating a Spell Index

The first step is to generate an index from the original index that includes the letter n-grams for each word in the original index. I shall refer to this index as the spell index. With the help of the Lucene Spell Checker, this is very easy:


package org.tiling.didyoumean;

import java.io.IOException;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.search.spell.Dictionary;
import org.apache.lucene.search.spell.LuceneDictionary;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class DidYouMeanIndexer {
    private static final String DEFAULT_FIELD = "contents";

    private static final String FIELD_OPTION = "f";
    private static final String ORIGINAL_INDEX_OPTION = "i";
    private static final String SPELL_INDEX_OPTION = "o";

    public void createSpellIndex(String field,
            Directory originalIndexDirectory,
            Directory spellIndexDirectory) throws IOException {

        IndexReader indexReader = null;
        try {
            indexReader = IndexReader.open(originalIndexDirectory);
            Dictionary dictionary = new LuceneDictionary(indexReader, field);
            SpellChecker spellChecker = new SpellChecker(spellIndexDirectory);
            spellChecker.indexDictionnary(dictionary);
        } finally {
            if (indexReader != null) {
                indexReader.close();
            }
        }
    }

}

  

The Dictionary interface specifies a single method:

public Iterator getWordsIterator();

that returns an iterator over the words in the dictionary. Here we use a LuceneDictionary object to read each word in the given field from the original index. We then create a SpellChecker, giving it a new index location to which to write the n-grams as it indexes the dictionary.

To create the spell index, you can instantiate a new DidYouMeanIndexer and invoke the createSpellIndex() method from your code. Alternatively, you can run DidYouMeanIndexer from the command line (the main() method is not shown in the above listing).

The “Did You Mean” Search Engine

Next, let’s turn back to our SearchEngine interface and look at the implementation of DidYouMeanSearchEngine. This implementation looks for query suggestions when the search results have low relevance.


package org.tiling.didyoumean;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.Hits;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;

public class DidYouMeanSearchEngine implements SearchEngine {

    private String defaultField;
    private String nameField;
    private Directory originalIndexDirectory;
    private int maxHits;
    private int minimumHits;
    private float minimumScore;
    private DidYouMeanParser didYouMeanParser;

    public DidYouMeanSearchEngine(String defaultField, String nameField,
            Directory originalIndexDirectory,
            int maxHits, int minimumHits, float minimumScore,
            DidYouMeanParser didYouMeanParser) {

        this.defaultField = defaultField;
        this.nameField = nameField;
        this.originalIndexDirectory = originalIndexDirectory;
        this.maxHits = maxHits;
        this.minimumHits = minimumHits;
        this.minimumScore = minimumScore;
        this.didYouMeanParser = didYouMeanParser;
    }

    public SearchResult search(String queryString) throws IOException, ParseException {
        long startTime = System.currentTimeMillis();
        IndexSearcher is = null;
        try {
            is = new IndexSearcher(originalIndexDirectory);
            Query query = didYouMeanParser.parse(queryString);
            Hits hits = is.search(query);

            String suggestedQueryString = null;
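            // Check for an empty result first: hits.score(0) is only valid when there is at least one hit.
            if (hits.length() == 0 || hits.length() < minimumHits || hits.score(0) < minimumScore) {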
                Query didYouMean = didYouMeanParser.suggest(queryString);
                if (didYouMean != null) {
                    suggestedQueryString = didYouMean.toString(defaultField);
                }
            }

            long endTime = System.currentTimeMillis();
            return new SearchResult(extractHits(hits), hits.length(),
                    endTime - startTime, queryString, suggestedQueryString);
        } finally {
            if (is != null) {
                is.close();
            }
        }
    }

    private List extractHits(Hits hits) throws IOException {
        List hitList = new ArrayList();
        for (int i = 0, count = 0; i < hits.length() && count++ < maxHits; i++) {
            hitList.add(hits.doc(i).getField(nameField).stringValue());
        }
        return hitList;
    }

}

  
The “Did You Mean” Parser

The key difference between the DidYouMeanSearchEngine class and the SimpleSearchEngine class is the introduction of the DidYouMeanParser interface. The DidYouMeanParser interface encapsulates a strategy both for parsing query strings and for suggesting spelling corrections for query strings:


package org.tiling.didyoumean;

import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.Query;

public interface DidYouMeanParser {
    public Query parse(String queryString) throws ParseException;
    public Query suggest(String queryString) throws ParseException;
}

  

The DidYouMeanSearchEngine only asks the DidYouMeanParser for a suggested query if the number of hits returned falls below a minimum threshold (the minimumHits property), or if the relevance of the top hit falls below a minimum threshold (the minimumScore property). Of course, you may choose to implement your own criteria for when to make a “did you mean” suggestion, but this rule is simple and effective.

The first implementation of DidYouMeanParser is straightforward:


package org.tiling.didyoumean;

import java.io.IOException;

import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.spell.SpellChecker;
import org.apache.lucene.store.Directory;

public class SimpleDidYouMeanParser implements DidYouMeanParser {

    private String defaultField;
    private Directory spellIndexDirectory;

    public SimpleDidYouMeanParser(String defaultField, Directory spellIndexDirectory) {
        this.defaultField = defaultField;
        this.spellIndexDirectory = spellIndexDirectory;
    }

    public Query parse(String queryString) {
        return new TermQuery(new Term(defaultField, queryString));
    }

    public Query suggest(String queryString) throws ParseException {
        try {
            SpellChecker spellChecker = new SpellChecker(spellIndexDirectory);
            if (spellChecker.exist(queryString)) {
                return null;
            }
            String[] similarWords = spellChecker.suggestSimilar(queryString, 1);
            if (similarWords.length == 0) {
                return null;
            }
            return new TermQuery(new Term(defaultField, similarWords[0]));
        } catch (IOException e) {
            throw new ParseException(e.getMessage());
        }
    }

}

  

The parse() method simply constructs a new TermQuery from the query. (This means that SimpleDidYouMeanParser only works with single-word queries, a deficiency we shall remedy later.) The suggest() implementation is more interesting. Just as when we created the spell index earlier, we construct a new SpellChecker with the index location for the spell index. This time, however, we just read from the index. First we check if the query word is in the index–if it is, we assume that it is correctly spelled, and make no suggestion by returning null. If instead the query word is not in the index, then we ask the spell checker for a single suggestion, by invoking the suggestSimilar() method. Of course, it may happen that no words are similar enough to the input, so we return null again. But if a suggestion is found, then it is returned as a new TermQuery.

Whew! Let’s see it in action after everything has been wired up using Spring. Figure 2 is a screenshot for the misspelled query “lettice.”

Figure 2. Suggesting a sensible alternative query

How It Works

There’s a lot going on in the suggestSimilar() method of SpellChecker, so let’s follow it through with an example. Take the correctly spelled word “lettuce,” which appears in the Beatrix Potter texts I’ve used for this article. In the original index, where each Lucene document corresponds to a text, “lettuce” appears in two Lucene documents in the contents field. On the other hand, the spell index contains a whole Lucene document for every distinct word in the original index. Each document has a number of fields, as shown here with the values for the document representing the word “lettuce.”

Field name    Field values
word          lettuce
start3        let
gram3         let ett ttu tuc uce
end3          uce
start4        lett
gram4         lett ettu ttuc tuce
end4          tuce

Notice how both trigrams and 4-grams are indexed. In fact, precisely which n-grams are indexed depends on the size of the word. For very short words, unigrams and bigrams are indexed, whereas for longer words, trigrams and 4-grams are indexed.
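The sketch below shows how the fields of one spell index document could be derived from a word. It is not the Lucene Spell Checker source: the class name is hypothetical, and the exact word-length thresholds in minGram() and maxGram() are assumptions chosen only to agree with the description above (unigrams and bigrams for very short words, trigrams and 4-grams for longer ones).

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SpellFields {

    // Assumed thresholds, for illustration only.
    static int minGram(int length) {
        return length > 5 ? 3 : (length == 5 ? 2 : 1);
    }

    static int maxGram(int length) {
        return length > 5 ? 4 : (length == 5 ? 3 : 2);
    }

    /** Field name -> field value for the spell index document of one word. */
    static Map<String, String> fields(String word) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("word", word);
        int length = word.length();
        for (int n = minGram(length); n <= maxGram(length) && n <= length; n++) {
            StringBuilder grams = new StringBuilder();
            for (int i = 0; i + n <= length; i++) {
                if (i > 0) {
                    grams.append(' ');
                }
                grams.append(word, i, i + n);
            }
            fields.put("start" + n, word.substring(0, n));   // first n-gram, positional
            fields.put("gram" + n, grams.toString());         // every n-gram
            fields.put("end" + n, word.substring(length - n)); // last n-gram, positional
        }
        return fields;
    }
}
```

Calling SpellFields.fields("lettuce") yields the same field values as the table above.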

The suggestSimilar() method forms a Lucene query to search the spell index for candidate suggestions. For the misspelling “lettice” the query is as follows (split over two lines to make it easier to read):


start3:let^2.0 end3:ice gram3:let gram3:ett gram3:tti gram3:tic gram3:ice
start4:lett^2.0 end4:tice gram4:lett gram4:etti gram4:ttic gram4:tice
  

The start n-grams are given more weight than the other n-grams in the word; here, they are boosted by a factor of two, signified by the ^2.0 notation. Another reason to index the start and end n-grams separately is that they are positional, unlike the other n-grams. For example, the words “eat” and “ate” have the same set of unigrams (gram1:e gram1:a gram1:t) and share the bigram gram2:at, so the start and end fields help to distinguish them (start1:e end1:t start2:ea end2:at for “eat,” and start1:a end1:e start2:at end2:te for “ate”).
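To show how such a query string is put together, here is a small sketch that rebuilds the query text for a misspelled word. It only assembles the string in Lucene's query syntax; the real SpellChecker builds Query objects internally, and the class name, method name and parameters here are illustrative assumptions.

```java
class SpellQueryText {

    /**
     * Builds the query text for n-gram sizes minN..maxN. The start n-gram of each
     * size is boosted by startBoost; the end n-gram and the plain n-grams are not.
     */
    static String build(String word, int minN, int maxN, float startBoost) {
        StringBuilder query = new StringBuilder();
        for (int n = minN; n <= maxN && n <= word.length(); n++) {
            if (query.length() > 0) {
                query.append(' ');
            }
            query.append("start").append(n).append(':')
                 .append(word, 0, n).append('^').append(startBoost);
            query.append(" end").append(n).append(':')
                 .append(word.substring(word.length() - n));
            for (int i = 0; i + n <= word.length(); i++) {
                query.append(" gram").append(n).append(':').append(word, i, i + n);
            }
        }
        return query.toString();
    }
}
```

SpellQueryText.build("lettice", 3, 4, 2.0f) reproduces the query shown above as a single line.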

Using a Lucene index browser, such as the excellent Luke–the Lucene Index Toolbox, we can manually run this query against the spell index. Figure 3 shows what we get.

Figure 3. Browsing the spell index

But the top hit is “letting,” not “lettuce,” which the web app presented us with. What’s going on? The answer is that the Lucene Spell Checker ranks suggestions by edit distance, not by search relevance. The string “lettice” differs from “lettuce” by a single substitution, whereas “letting” is two substitutions away.

Supporting Composite Queries

SimpleSearchEngine supports composite queries–that is, queries that are composed of a set of clauses; for example, lettuce parsley, which means “find documents in which both of the words ‘lettuce’ and ‘parsley’ appear.” As noted above, DidYouMeanSearchEngine with SimpleDidYouMeanParser only supports single-word queries, so let’s see how we can fix it to support composite queries.

CompositeDidYouMeanParser is an implementation of DidYouMeanParser for use by DidYouMeanSearchEngine that supports composite queries. Recall that the DidYouMeanParser interface has a parse() method and a suggest() method, both of which take query strings and return Lucene Query objects. The implementation of parse() is simple: it uses Lucene’s QueryParser, which has built-in support for composite queries. The implementation of suggest() is a little more tricky. It relies on the getFieldQuery() extensibility hook provided by QueryParser, so if a term (or a word in a phrase) is misspelled, then it is replaced with the best suggestion. If no terms (or words in a phrase) in the whole query are misspelled, then suggest() returns null.
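The contract of suggest() for composite queries can be sketched without Lucene at all. The version below splits the query on whitespace instead of using QueryParser and its getFieldQuery() hook, so it is a simplified stand-in rather than the CompositeDidYouMeanParser from the download; the class name, the knownWords set and the bestSuggestion function are hypothetical.

```java
import java.util.Set;
import java.util.function.Function;

class CompositeSuggestSketch {

    /**
     * Replaces every term that is not a known word with its best suggestion.
     * Returns null when no term was replaced, mirroring the behaviour described above.
     */
    static String suggest(String query, Set<String> knownWords,
            Function<String, String> bestSuggestion) {
        boolean changed = false;
        StringBuilder corrected = new StringBuilder();
        for (String term : query.trim().split("\\s+")) {
            String replacement = term;
            if (!knownWords.contains(term)) {
                String suggestion = bestSuggestion.apply(term);
                if (suggestion != null) {
                    replacement = suggestion;
                    changed = true;
                }
            }
            if (corrected.length() > 0) {
                corrected.append(' ');
            }
            corrected.append(replacement);
        }
        return changed ? corrected.toString() : null;
    }
}
```

Given a lookup that maps "lettice" to "lettuce" and "parslee" to "parsley", the query "lettice parslee" becomes "lettuce parsley", while "lettuce parsley" yields null because nothing needed correcting.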

Figure 4 is a screenshot for the misspelled composite query “lettice parslee.”

Figure 4. Correcting the spelling of multiple query terms

Ensuring High-Quality Suggestions

Having a clever algorithm for detecting and correcting spelling errors is a good start, but you need a good source of correctly spelled words to ensure the suggestions are of a high quality. So far, we have used the terms in the original index as the source of words (by constructing a LuceneDictionary). There is a downside to this approach: the content that was indexed will almost certainly contain spelling errors, so there is a good chance that certain query suggestions will be misspelled.

You might think that using a compiled word list might help. However, even the largest dictionaries fall short in word coverage for proper nouns and newly coined words (e.g., technical phrases), so a correctly spelled query term that is not in the dictionary will be incorrectly marked as a misspelling. The user would then be prompted with a distracting alternative query suggestion. (As a side note, Lucene Spell Checker provides an implementation of Dictionary, PlainTextDictionary, which can read words from a word list such as /usr/dict/words commonly found on Unix systems. Use this to do regular spell checking against a dictionary.)

Lucene Spell Checker provides a mechanism to solve this problem, while still using the original index as the source of words. The suggestSimilar() method of SpellChecker is overloaded to support secondary sorting of the suggested words by document frequency in an index; for example:


spellChecker.suggestSimilar(queryText, 1, originalIndexReader, defaultField, true);
  

This call restricts suggestions to those words that are more popular (true) in the original index than the query term. On the plausible assumption that across the whole set of documents, misspellings are less common than the correctly spelled instances of the word, this modification will improve the quality of suggestions, even in document collections containing misspellings.
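The effect of that flag can be illustrated with a few lines of plain Java. The sketch below is not the SpellChecker implementation: it assumes the edit distance and document frequency of each candidate are already known, drops candidates that are no more frequent than the query term, and orders the rest by distance and then by frequency. All names and the sample numbers in the usage note are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PopularSuggestions {

    static final class Candidate {
        final String word;
        final int editDistance;
        final int docFreq;

        Candidate(String word, int editDistance, int docFreq) {
            this.word = word;
            this.editDistance = editDistance;
            this.docFreq = docFreq;
        }
    }

    /** Keeps only candidates more frequent than the query term, closest and most popular first. */
    static List<String> rank(int queryDocFreq, List<Candidate> candidates) {
        List<Candidate> kept = new ArrayList<>();
        for (Candidate c : candidates) {
            if (c.docFreq > queryDocFreq) {
                kept.add(c);
            }
        }
        // Primary sort: smaller edit distance. Secondary sort: higher document frequency.
        kept.sort((x, y) -> x.editDistance != y.editDistance
                ? Integer.compare(x.editDistance, y.editDistance)
                : Integer.compare(y.docFreq, x.docFreq));
        List<String> words = new ArrayList<>();
        for (Candidate c : kept) {
            words.add(c.word);
        }
        return words;
    }
}
```

Suppose a misspelled query term occurs in one document, and the candidates are "lettuce" (distance 1, two documents), another misspelling (distance 1, one document) and "letting" (distance 2, five documents). The second candidate is dropped because it is no more popular than the query term, and the result is "lettuce" followed by "letting".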

Zeitgeist

Large search engines use user queries for the source of suggestions. The logic is: if you don’t understand what a user is asking for, compare it to what other users ask for, as someone else is likely to have searched for something similar.

To implement this strategy, each user query submitted to the system should be indexed in the spell index in order to provide a proper record of query frequencies. (All of the main search engines publish their most popular search terms, which are ultimately derived from such an index.) Then, by using the overloaded suggestSimilar() method introduced in the previous section, suggestions will be ranked firstly by edit distance and secondly by user popularity.

Conclusion

Spell checking users’ search queries is a nice feature, and relatively easy to add to a Lucene-powered search application, as this article has shown. Most of the time, the corrections suggested are good ones, but there is plenty of ongoing research in the information retrieval community on improving spell check algorithms (see “References,” below). I think we will continue to see the fruits of such research in open source libraries like Lucene Spell Checker.

References

Tom White is lead Java developer at Kizoom, a leading U.K. software company in the delivery of personalized travel information.

Posted by micas on Aug 8th 2007 | Filed in SEO | Comments (0)

SpellChecker Java Search API

 

July 9, 2007 at 9:20 pm | In Java

When writing a program, you basically always have to check or correct erroneous input. This is about the simplest way to tell whether a program's author has been lazy or cut corners.

Any introductory computing course is bound to mention GIGO (garbage in, garbage out). That is, computers are dumb: when the input data is garbage, the output can only be garbage. Computers make mistakes, but humans make them more easily, and make more of them.

The most basic requirement is to react immediately at input time and point out the error. The simplest checks cover data type and blank values (class 1), while a reasonably decent program also has pattern checks for numbers, times, dates, sizes, and specific formats such as email or UUID (class 2). More carefully built programs go a step further and check data validity: for example, whether a year/month/day combination makes sense, whether words are misspelled, whether a value is a duplicate, or whether a corresponding ID can be found in the database (class 3). The best programs focus on the interaction with the person: clear error messages, a response within the user's attention span (apparently about three seconds), and automatic corrections and suggestions (class 4).

Examples

Class 1: Yahoo Dictionary
Although it says “Enter a single word to look up; Chinese or English are both fine”, Japanese, French, and German get through as well. And if you enter Simplified Chinese characters or Japanese kanji, it does not understand them either.

Class 2: the “E-mail this page” form commonly seen on web pages (I have always wondered who uses it), for example in this page's footer.
It requires you to enter an email address in a particular format, but it has no way of knowing whether that address is actually correct.

Class 3: the signup form of present-day forums.
It checks the email address you entered by sending a validation code to it.

Class 4: Google Suggest / Gmail
These can offer suggestions, corrections, and even auto-complete before you press submit.

The examples above are all web applications only because anyone can see them. In fact there are even more examples among client applications,
such as Notepad vs UltraEdit vs Microsoft Word vs Eclipse IDE.

I seem to be drifting further and further off. Back to the topic: the most troublesome and hardest part to implement is automated correction suggestions. Since its goal is to avoid GIGO, doesn't that contradict the earlier claim that computers are dumb? There is no contradiction: it builds on the part of the data that is correct.

Back to the main subject again: English spelling correction is based on the part of the word that is not misspelled. (Approaching it from a linguistic angle is also possible, but that is work humans are good at and is outside the scope of this post.)
English words are made up of letters, and the order and manner in which those letters are combined can point to the correct suggestion.

A concrete example may make this easier to understand: take Monstor (a misspelling of Monster). Split into n-grams, it yields mon, ons, nst, sto, tor. Compared with Monster (mon, ons, nst, ste, ter), 3/5 * 3/5 of them match (because the comparison goes both ways, there are two ratios). Compared with “monsters inc”, the score is only 3/5 * 3/8. Compared with apple, nothing matches at all. As long as an index of the words has been built in advance, this method can quickly return several similar words.
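The arithmetic in this example can be written down as a short Java sketch. It assumes trigrams only, compares the sets of distinct trigrams, and multiplies the two match ratios as described; the class and method names are made up for illustration.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

class TrigramSimilarity {

    /** Distinct trigrams of a word, lower-cased. */
    static Set<String> trigrams(String word) {
        String w = word.toLowerCase();
        Set<String> grams = new LinkedHashSet<>();
        for (int i = 0; i + 3 <= w.length(); i++) {
            grams.add(w.substring(i, i + 3));
        }
        return grams;
    }

    /** (shared / trigrams of a) * (shared / trigrams of b); 0 when either word is too short. */
    static double score(String a, String b) {
        Set<String> gramsA = trigrams(a);
        Set<String> gramsB = trigrams(b);
        if (gramsA.isEmpty() || gramsB.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(gramsA);
        shared.retainAll(gramsB);
        return ((double) shared.size() / gramsA.size())
                * ((double) shared.size() / gramsB.size());
    }
}
```

Here score("Monstor", "Monster") is 3/5 * 3/5, and score("Monstor", "apple") is zero because the two words share no trigram.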

In general, 3-grams and 4-grams are enough to give good results for English. Of course, components like this were written long ago. If you are using Lucene 1.4, you can write it yourself with a little effort; if you are using Lucene 2.0, there is already a ready-made library.

Reference: SpellChecker - Lucene-java Wiki
Reference: Lucene 2.0 org.apache.lucene.search.spell

Even without Lucene, you can write similar functionality yourself with PHP + MySQL: just build an index table in the way described above and design an inner join SQL statement.

9 July 2007
Dennis

Posted by micas on Aug 8th 2007 | Filed in SEO | Comments (0)

An example of mine: driving a WordPress blog from Java with Apache XML-RPC

package com.enablingtech.rpc;

import java.net.MalformedURLException;
import java.util.Hashtable;
import java.util.Vector;

import org.apache.xmlrpc.XmlRpc;
import org.apache.xmlrpc.XmlRpcClient;
import org.apache.xmlrpc.XmlRpcException;

public class Test2 {
    public static void main(String[] args) {
        try {
            // Use Xerces as the SAX parser for XML-RPC messages.
            XmlRpc.setDriver("org.apache.xerces.parsers.SAXParser");
            // Replace "your-site" with the address of your own blog.
            XmlRpcClient xmlrpc = new XmlRpcClient(
                    "http://your-site/xmlrpc.php");
            Vector params = new Vector();
            //params.addElement("1");
            params.addElement("776");
            params.add("username"); // your user name
            params.add("password"); // your password

            Vector mp = (Vector) xmlrpc.execute("mt.getRecentPostTitles",
                    params);
            // Print the first two entries of the result.
            System.out.println(mp.get(0));
            System.out.println(mp.get(1));
        } catch (MalformedURLException e) {
            System.out.println(e.toString());
        } catch (XmlRpcException e) {
            System.out.println(e.toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Posted by micas on Aug 1st 2007 | Filed in SEO | Comments (0)

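Trying Out XML-RPC (Communication Between Browser JavaScript and a Java Server) [Repost]

From http://www.blogjava.net/mstar/

A few days ago I stumbled upon XML-RPC (don't laugh at me for only discovering it now), and I kept wanting a chance to play with it. My graduation thesis is basically finished, so I decided to figure it out today.
The biggest use of XML-RPC that first came to mind is letting the browser communicate with the server and request data without refreshing the page. Below I describe how I did this with XML-RPC.

Step 1: Choose an XML-RPC implementation.
A major advantage of XML-RPC is that it is a standard, with implementations available in all kinds of development environments (cool), which is why it crosses platforms so easily.
There are three JavaScript implementations. From what I saw, the best is probably jsolait (JavaScript o Lait), because it is more than just an XML-RPC implementation: it also includes many other JavaScript libraries. For details see http://jsolait.net/.
There are even more Java implementations; I chose Apache's without hesitation. For details see http://ws.apache.org/xmlrpc/.

Step 2: Build the service.
There are two ways to set up XML-RPC in Java: open a dedicated port, or use a servlet. Since our client is JavaScript, a servlet on the server side is the best fit.
For how to use Apache's XML-RPC, please read the Apache documentation in detail. (You do know how to create a servlet, right? If not, you had better stop reading here.)
The code is as follows.
Here is a sayHello service class: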

public class HelloService {

    public String sayHello(String name) {
        return "Hello: " + name + " !";
    }

}

Below is a Math service class:

import java.util.Vector;

public class MathService {
    // Each method receives its two operands as strings packed into one Vector.
    public double add(Vector v) {
        double a = Double.parseDouble((String) v.get(0));
        double b = Double.parseDouble((String) v.get(1));
        return a + b;
    }

    public double mult(Vector v) {
        double a = Double.parseDouble((String) v.get(0));
        double b = Double.parseDouble((String) v.get(1));
        return a * b;
    }
}

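Next comes the servlet, which acts as the RPC server. This code is fairly classic and appears in many references.

import java.io.IOException;
import java.io.OutputStream;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.xmlrpc.XmlRpcServer;

public class RpcServer extends HttpServlet {
    protected void doPost(HttpServletRequest request,
            HttpServletResponse response) throws ServletException, IOException {
        // Register each service object under the handler name clients will use.
        XmlRpcServer xmlrpc = new XmlRpcServer();
        xmlrpc.addHandler("HelloService", new HelloService());
        xmlrpc.addHandler("MathService", new MathService());
        // Execute the call carried in the request body and send back the XML response.
        byte[] result = xmlrpc.execute(request.getInputStream());
        response.setContentType("text/xml");
        response.setContentLength(result.length);
        OutputStream out = response.getOutputStream();
        out.write(result);
        out.flush();
    }
}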

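The key lines are these three:
XmlRpcServer xmlrpc = new XmlRpcServer();
xmlrpc.addHandler("HelloService", new HelloService());
xmlrpc.addHandler("MathService", new MathService());
Be sure to remember the handler name, which is the first argument, because the client relies on it to identify the method it wants to call.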
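To see why the handler name matters, here is a toy registry in plain Java that resolves a name of the form Handler.method to an object and a method by reflection. It is only an illustration of the naming rule and makes no claim about how Apache XML-RPC is implemented internally; HandlerRegistry and its nested HelloService are hypothetical names used just for this sketch.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

class HandlerRegistry {

    /** Demo handler, mirroring the sayHello service above. */
    public static class HelloService {
        public String sayHello(String name) {
            return "Hello: " + name + " !";
        }
    }

    private final Map<String, Object> handlers = new HashMap<>();

    void addHandler(String name, Object handler) {
        handlers.put(name, handler);
    }

    /** Invokes "HandlerName.methodName" with the given arguments. */
    Object invoke(String qualifiedName, Object... args) {
        int dot = qualifiedName.lastIndexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("Expected Handler.method: " + qualifiedName);
        }
        Object handler = handlers.get(qualifiedName.substring(0, dot));
        if (handler == null) {
            throw new IllegalArgumentException("Unknown handler in: " + qualifiedName);
        }
        String methodName = qualifiedName.substring(dot + 1);
        // Pick the first public method with a matching name and argument count.
        for (Method m : handler.getClass().getMethods()) {
            if (m.getName().equals(methodName) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(handler, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("Call failed: " + qualifiedName, e);
                }
            }
        }
        throw new IllegalArgumentException("Unknown method in: " + qualifiedName);
    }
}
```

After registry.addHandler("HelloService", new HandlerRegistry.HelloService()), a call to registry.invoke("HelloService.sayHello", "MTY") reaches the sayHello method; a different handler name, or a misspelled one, is rejected.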

Now we can add the configuration to web.xml:

<servlet>
<servlet-name>RpcServer</servlet-name>
<servlet-class>org.mstar.rpc.RpcServer</servlet-class>
</servlet>
<servlet-mapping>
<servlet-name>RpcServer</servlet-name>
<url-pattern>/RpcServer</url-pattern>
</servlet-mapping>

At this point the server side is done; just start the application server.

Next is the JavaScript implementation, which is the hard part (it is not actually hard to understand; there is just no Chinese-language material).
After downloading and unpacking the jsolait library, you get a number of js files; I won't go into the details.

Create an HTML file:

<html>
<head>
<title>XML-RPC</title>
<script type="text/javascript" src="./js/init.js"></script>
<script type="text/javascript" src="./js/lib/urllib.js"></script>
<script type="text/javascript" src="./js/lib/xml.js"></script>
<script type="text/javascript" src="./js/lib/xmlrpc.js"></script>
<script type="text/javascript" src="./js/hello.js"></script>
</head>
a:<input type="text" id="a" /><br>
b:<input type="text" id="b" /><br>
<input type="button" id="do1" value="a+b" onclick="add()"/>
<input type="button" id="do2" value="say" onclick="hello()"/>
<input type="text" id="result" />
</html>

Notice that pile of JavaScript references at the top? Just write them that way. But don't mistake hello.js for part of jsolait (the name gives it away); you won't find it there. We wrote it ourselves:
hello.js

hello = function(){
    var xmlrpc=null;
    try{
        var xmlrpc = importModule("xmlrpc");
    }catch(e){
        reportException(e);
        throw "importing of xmlrpc module failed.";
    }
    var addr = "http://localhost:8080/Rpc/RpcServer";
    var methods = ["HelloService.sayHello"];
    var rslt;

    try{
        var service = new xmlrpc.ServiceProxy(addr, methods);
        rslt = service.HelloService.sayHello("MTY");
    }catch(e){
        var em;
        if(e.toTraceString){
            em = e.toTraceString();
        }else{
            em = e.message;
        }
        rslt = "Error trace: \n\n" + em;
    }
    document.getElementById("result").value=rslt;
}
add = function(){
    var xmlrpc=null;
    var a = document.getElementById("a").value;
    var b = document.getElementById("b").value;
    var params = new Array();
    params[0] = a;
    params[1] = b;
    try{
        var xmlrpc = importModule("xmlrpc");
    }catch(e){
        reportException(e);
        throw "importing of xmlrpc module failed.";
    }
    var addr = "http://localhost:8080/Rpc/RpcServer";
    var methods = ["HelloService.sayHello","MathService.add"];
    var rslt;

    try{
        var service = new xmlrpc.ServiceProxy(addr, methods);
        rslt = service.MathService.add(params);
    }catch(e){
        var em;
        if(e.toTraceString){
            em = e.toTraceString();
        }else{
            em = e.message;
        }
        rslt = "Error trace: \n\n" + em;
    }
    document.getElementById("result").value=rslt;
}

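This JS file contains two functions: one calls sayHello and the other performs addition.
A few points need explaining:
1.
 var xmlrpc=null;
 try{
     var xmlrpc = importModule("xmlrpc");
 }catch(e){
     reportException(e);
     throw "importing of xmlrpc module failed.";
 }
This imports the xmlrpc module. Just write it this way; I don't know why it has to look like this either.
2.
 var addr = "http://localhost:8080/Rpc/RpcServer";
 var methods = ["HelloService.sayHello"];
This defines the service address and the names of the methods to use. The rule is probably obvious: HandlerName.methodName. The handler name is the name you registered in the XmlRpcServer, the one I asked you to remember above. The method name is the name of the method on that class. Note that methods is an array, so you can list several methods, as in the second example: var methods = ["HelloService.sayHello", "MathService.add"];
3.
    try{
        var service = new xmlrpc.ServiceProxy(addr, methods);
        rslt = service.HelloService.sayHello("MTY");
    }catch(e){
        var em;
        if(e.toTraceString){
            em = e.toTraceString();
        }else{
            em = e.message;
        }
        rslt = "Error trace: \n\n" + em;
    }
new xmlrpc.ServiceProxy(addr, methods) returns the service proxy.
Then just call the service method in the form proxy.HandlerName.methodName(argument). It seems that only one argument is allowed: in the second example I first passed two arguments, a and b, and got an error. What to do? There is no way around it: pass the arguments as an Array in JavaScript and receive them as a Vector in Java. (Why Vector? Because Apache implements the XML-RPC array type with Vector. Why doesn't JavaScript use Vector? Because JS has no Vector, and a JS Array is variable-length anyway.) This does mean a fair amount of type conversion on the Java side; JS is weakly typed, so no conversion is needed there.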
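To see what such a proxy call puts on the wire, here is a minimal sketch using the Python 3 standard library module xmlrpc.client rather than the JavaScript library (an assumption made only so the example runs without a server). It suggests how the "HandlerName.methodName" string travels as the methodName element, and how passing one list yields a single array parameter. The argument values are sample data.

```python
import xmlrpc.client

# Encode the call service.HelloService.sayHello("MTY") as an XML-RPC request.
hello_request = xmlrpc.client.dumps(("MTY",), methodname="HelloService.sayHello")

# Encode MathService.add with ONE parameter that is itself an array,
# mirroring the params Array used by add(). The values are sample data.
add_request = xmlrpc.client.dumps((["1", "2"],), methodname="MathService.add")

# Decode the request again, roughly as a server would.
params, method = xmlrpc.client.loads(add_request)

print(hello_request)
print(add_request)
print(method, params)
```

The decoded params tuple holds a single list, which corresponds to the single Vector argument the Java handler is described as receiving.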

Posted by micas on Aug 1st 2007 | Filed in SEO | Comments (0)

Apache xml-rpc (repost)

 

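A recent project required Apache xml-rpc, so I took the opportunity to write up how to use it.

    xml-rpc is a specification, together with a set of implementations, that lets programs running on different operating systems and in different environments make procedure calls over the Internet. These remote procedure calls use HTTP as the transport protocol and XML as the encoding for the messages. The definition of xml-rpc is kept as simple as possible, yet it can transmit, process and return complex data structures.
    For more details on xml-rpc, see http://www.xmlrpc.com

1. Client program
    Apache xml-rpc provides two client classes:
    org.apache.xmlrpc.XmlRpcClient: uses java.net.URLConnection.
    org.apache.xmlrpc.XmlRpcClientLite: provides its own lightweight HTTP client implementation.
    If you need full HTTP support (for example proxies, redirects and so on), you should use XmlRpcClient. If you do not need full HTTP support and care more about performance, you should benchmark both client classes carefully: on some platforms XmlRpcClient may be faster, on others XmlRpcClientLite.
    Both client classes provide the same interface.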
    Using Apache xml-rpc on the client side is very simple; only the following steps are needed:
    // create the xml-rpc client
    XmlRpcClient client = new XmlRpcClient("http://" + server + ":" + port);

    // set the call parameters
    Vector params = new Vector();
    params.addElement(name);

    // make the call and get the result
    String result = (String) client.execute("hello.sayHello", params);

    If you need to make asynchronous calls, use the executeAsync() method.

2. Registering handler objects
    Both org.apache.xmlrpc.XmlRpcServer and org.apache.xmlrpc.WebServer provide methods to register and unregister handler objects:
    addHandler (String name, Object handler);
    removeHandler (String name);

3. Using xml-rpc in a servlet environment
    Typical code looks like this:
    XmlRpcServer xmlrpc = new XmlRpcServer ();
    xmlrpc.addHandler ("examples", new ExampleHandler ());
    …
    byte[] result = xmlrpc.execute (request.getInputStream ());
    response.setContentType ("text/xml");
    response.setContentLength (result.length);
    OutputStream out = response.getOutputStream();
    out.write (result);
    out.flush ();
    Note: the execute method does not throw any exceptions, because all errors are encoded as XML and returned to the client.
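The idea that errors travel as XML rather than as exceptions can be sketched with the Python 3 standard library module xmlrpc.client. This is an illustration of the XML-RPC fault format, not of Apache's execute method; the fault code 1 and both messages are made-up sample values.

```python
import xmlrpc.client

# A server-side error is serialized as a <fault> response instead of being
# raised across the wire. Code 1 and the message are made-up sample values.
fault_xml = xmlrpc.client.dumps(
    xmlrpc.client.Fault(1, "no such handler: examples.foo"))

# A normal response carrying a single string return value.
ok_xml = xmlrpc.client.dumps(("Hello MTY",), methodresponse=True)

def decode(response_xml):
    """Decode a response the way a client library does and describe it."""
    try:
        values, _ = xmlrpc.client.loads(response_xml)
        return "ok: %r" % (values[0],)
    except xmlrpc.client.Fault as fault:
        return "fault %d: %s" % (fault.faultCode, fault.faultString)

print(decode(ok_xml))
print(decode(fault_xml))
```

Only the client turns the fault back into an exception; the server side just writes bytes to the response, which matches the servlet code above having no error branch.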

4. Using the built-in HTTP server
    The code is as follows:
    XmlRpc.setDriver("org.apache.xerces.parsers.SAXParser");

    // start the server
    System.out.println("Starting XML-RPC Server...");

    WebServer server = new WebServer(8585);
    // register our handler class
    server.addHandler("hello", new HelloHandler());
    server.start();

5. Types supported by Apache xml-rpc
    These types apply to xml-rpc parameters and return values; if a parameter or return value is a collection type, they also apply to the collection's elements.

XML-RPC data type     Data types generated by the parser    Types expected by the invoker as input parameters of RPC handlers
<i4> or <int>         java.lang.Integer                     int
<boolean>             java.lang.Boolean                     boolean
<string>              java.lang.String                      java.lang.String
<double>              java.lang.Double                      double
<dateTime.iso8601>    java.util.Date                        java.util.Date
<struct>              java.util.Hashtable                   java.util.Hashtable
<array>               java.util.Vector                      java.util.Vector
<base64>              byte[ ]                               byte[ ]
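As a rough cross-check of the XML-RPC side of this table, the following sketch encodes one sample value per type with the Python 3 standard library and looks for the corresponding tag in the generated XML. The Python types stand in for the Java ones (an assumption for illustration), and all sample values are invented.

```python
import datetime
import xmlrpc.client

# One sample value per XML-RPC type; the values themselves are arbitrary.
samples = {
    "int": 42,
    "boolean": True,
    "string": "hello",
    "double": 3.5,
    "dateTime.iso8601": datetime.datetime(2007, 8, 1, 12, 0, 0),
    "struct": {"name": "Tom"},
    "array": [1, 2, 3],
    "base64": b"\x00\x01",
}

def round_trip(value):
    """Encode a value as a response, then decode it again."""
    xml = xmlrpc.client.dumps((value,), methodresponse=True)
    (decoded,), _ = xmlrpc.client.loads(xml, use_builtin_types=True)
    return xml, decoded

for tag, value in samples.items():
    xml, decoded = round_trip(value)
    # Report whether the expected tag appears and the value survived intact.
    print(tag, "<%s>" % tag in xml, decoded == value)
```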

6. A simple example using the built-in HTTP server
    a. Create the handler object

        /*
         * Created on 2004-5-12
         */
        package helloxmlrpc;

        /**
         * Handler object whose public methods are exposed over xml-rpc.
         *
         * @author fyun
         */
        public class HelloHandler {
          public String sayHello(String name) {
            return "Hello " + name;
          }
        }

    b. Register the handler and start the server
        /*
         * Created on 2004-5-12
         */
        package helloxmlrpc;

        import org.apache.xmlrpc.*;

        /**
         * Registers HelloHandler and starts the built-in HTTP server.
         *
         * @author fyun
         */
        public class HelloServer {
          public static void initServer() {
            try {
              XmlRpc.setDriver("org.apache.xerces.parsers.SAXParser");
              // start the server
              System.out.println("Starting XML-RPC Server...");
              WebServer server = new WebServer(8585);
              // register our handler class
              server.addHandler("hello", new HelloHandler());
              server.start();
              System.out.println("Now accepting requests...");
            } catch (ClassNotFoundException e) {
              System.out.println("Could not locate SAX Driver");
            }
          }
          public static void main(String[] args){
            initServer();
          }
        }
    c. Client program
        /*
         * Created on 2004-5-12
         */
        package helloxmlrpc;

        import java.io.IOException;
        import java.net.MalformedURLException;
        import java.util.Vector;
        import org.apache.xmlrpc.XmlRpc;
        import org.apache.xmlrpc.XmlRpcClient;
        import org.apache.xmlrpc.XmlRpcException;

        /**
         * Calls hello.sayHello on the server and prints the result.
         *
         * @author fyun
         */
        public class HelloClient {
          public static void invoke(String server, String port, String name) {
            try {
              // Use the Apache Xerces SAX driver
              XmlRpc.setDriver("org.apache.xerces.parsers.SAXParser");
              // Specify the server
              XmlRpcClient client = new XmlRpcClient("http://" + server + ":" + port);
              // create request
              Vector params = new Vector();
              params.addElement(name);
              // make a request and print the result
              String result = (String) client.execute("hello.sayHello", params);
              System.out.println("hello.sayHello: " + result);
            } catch (ClassNotFoundException e) {
              System.out.println("Could not locate SAX Driver");
            } catch (MalformedURLException e) {
              System.out.println(
                "Incorrect URL format for xml-rpc server: " + e.getMessage());
            } catch (XmlRpcException e) {
              e.printStackTrace();
              System.out.println("XmlRpcException: " + e.getMessage());
            } catch (IOException e) {
              System.out.println("IOException: " + e.getMessage());
            } catch (Exception e) {
              e.printStackTrace();
            }
          }
          public static void main(String[] args){
            // all three arguments are required, because args[2] is read below
            if (args == null || args.length < 3) {
              System.out.println("Usage: java HelloClient [server] [port] [yourname]");
              System.exit(1);
            }
            invoke(args[0], args[1], args[2]);
          }
        }
7. An example using a servlet
    1. The handler object stays the same
    2. Create XmlRpcFacade
        package helloxmlrpc;
        import javax.servlet.http.HttpServletRequest;
        import javax.servlet.http.HttpServletResponse;
        import java.io.IOException;
        import java.io.OutputStream;
        import org.apache.xmlrpc.XmlRpcServer;
        public class XmlRpcFacade {
          private static XmlRpcServer xmlrpc;
          static{
            xmlrpc = new XmlRpcServer();
            // register your handler object
            xmlrpc.addHandler("hello", new HelloHandler());
          }
          public void execute(HttpServletRequest request, HttpServletResponse response) throws
              IOException {
            byte[] result = xmlrpc.execute(request.getInputStream());
            response.setContentType("text/xml; charset=GB2312");
            response.setContentLength(result.length);
            OutputStream out = response.getOutputStream();
            out.write(result);
            out.flush();
            out.close();
          }
        }
    3. Create the servlet
        package helloxmlrpc;
        import javax.servlet.*;
        import javax.servlet.http.*;
        import java.io.*;
        import java.util.*;
        public class XmlRpcServlet extends HttpServlet {
          private static final String CONTENT_TYPE = "text/html; charset=GBK";
          private XmlRpcFacade facade;
          public void init() throws ServletException {
            facade = new XmlRpcFacade();
          }
          //Process the HTTP Get request
          public void doGet(HttpServletRequest request, HttpServletResponse response) throws
              ServletException, IOException {
            this.doService(request, response);
          }
          //Process the HTTP Post request
          public void doPost(HttpServletRequest request, HttpServletResponse response) throws
              ServletException, IOException {
            this.doService(request, response);
          }
          //Hand both kinds of request to the facade
          public void doService(HttpServletRequest request,
                                HttpServletResponse response) throws ServletException,
              IOException {
            facade.execute(request, response);
          }
          //Clean up resources
          public void destroy() {
          }
        }
    4. The client program is similar to the one for the built-in HTTP server; just change this line
    XmlRpcClient client = new XmlRpcClient("http://" + server + ":" + port);
    to
    XmlRpcClient client = new XmlRpcClient();
    and that is all.
    I hope this document is of some small help to you. More details are available at http://ws.apache.org/xmlrpc.

Posted by micas on Aug 1st 2007 | Filed in SEO | Comments (0)

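XML-RPC: An Apache XML-RPC Example

iTbulo.COM 2005-4-13 Anonymous (63)

Author: 王恩建  Source: http://www.sentom.net
XML-RPC is a remote procedure call protocol that works over the Internet. Put simply, it communicates over HTTP, and the payload exchanged is an XML document. For the details, refer to the XML-RPC specification.
[Figure: image from the official XML-RPC website]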

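The XML-RPC specification defines six scalar data types plus the compound types struct and array. The table below maps these types to Java data types.

XML-RPC               Java
<i4> or <int>         int
<boolean>             boolean
<string>              java.lang.String
<double>              double
<dateTime.iso8601>    java.util.Date
<struct>              java.util.Hashtable
<array>               java.util.Vector
<base64>              byte[ ]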

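Implementations of the XML-RPC specification exist for all kinds of platforms, and there are several for Java alone; here we choose Apache XML-RPC.

XML-RPC server implementation
First define a simple business object, MyHandler, whose methods remote clients will call. The code is as follows: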

package net.sentom.xmlrpc;

public class MyHandler {
    public String sayHello(String str) {
        return "Hello," + str;
    }
}

Then define a servlet named MyXmlRpcServer, which remote clients access via HTTP POST.

package net.sentom.xmlrpc;

import java.io.IOException;
import java.io.OutputStream;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.xmlrpc.XmlRpcServer;

public class MyXmlRpcServer extends HttpServlet {
    public void doPost(HttpServletRequest request, HttpServletResponse response)
            throws ServletException, IOException {
        XmlRpcServer xmlrpc = new XmlRpcServer();
        xmlrpc.addHandler("myHandler", new MyHandler());
        byte[] result = xmlrpc.execute(request.getInputStream());
        response.setContentType("text/xml");
        response.setContentLength(result.length);
        OutputStream out = response.getOutputStream();
        out.write(result);
        out.flush();
    }
}

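One line deserves special mention:

xmlrpc.addHandler("myHandler", new MyHandler());

The first argument is the name under which remote clients address the handler. To make this easier to understand, you can think of it as an ordinary: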

MyHandler myHandler = new MyHandler();

Finally, add the following lines to web.xml:

<servlet>
    <servlet-name>MyXmlRpcServer</servlet-name>
    <servlet-class>net.sentom.xmlrpc.MyXmlRpcServer</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>MyXmlRpcServer</servlet-name>
    <url-pattern>/MyXmlRpcServer</url-pattern>
</servlet-mapping>

XML-RPC client implementation
The client is somewhat simpler. First, a Java client implementation, MyXmlRpcClient:

package net.sentom.xmlrpc;

import java.io.IOException;
import java.net.MalformedURLException;
import java.util.Vector;
import org.apache.xmlrpc.XmlRpcClient;
import org.apache.xmlrpc.XmlRpcException;

public class MyXmlRpcClient {
    public static void main(String[] args) {
        try {
            XmlRpcClient xmlrpc = new XmlRpcClient(
                    "http://localhost:8080/XMLRPC/MyXmlRpcServer");
            Vector params = new Vector();
            params.addElement("Tom");
            String result = (String) xmlrpc.execute("myHandler.sayHello",
                    params);
            System.out.println(result);
        } catch (MalformedURLException e) {
            System.out.println(e.toString());
        } catch (XmlRpcException e) {
            System.out.println(e.toString());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

http://localhost:8080/XMLRPC/MyXmlRpcServer is the URL at which MyXmlRpcServer is reachable.

String result = (String) xmlrpc.execute("myHandler.sayHello", params);

Next, a Python client implementation:

# Python 2; in Python 3 the module is xmlrpc.client and print is a function
import xmlrpclib

url = 'http://localhost:8080/XMLRPC/MyXmlRpcServer'
server = xmlrpclib.Server(url)
print server.myHandler.sayHello('Tom')

Posted by micas on Aug 1st 2007 | Filed in SEO | Comments (0)

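An Introduction to XML RPC and an Example (Java)

1. What is xml rpc
1.1. Introduction to xml rpc
xml rpc is an RPC mechanism that uses HTTP as its transport protocol and transmits commands and data as XML text.
An RPC system necessarily has two parts: 1. the rpc client, which invokes methods on the rpc server and receives the data they return; 2. the rpc server, which responds to rpc client requests, executes the methods and sends back the results.
1.2. Available xml rpc implementations
There are many implementations of both the xml rpc client and the xml rpc server. Generally, one implementation provides both client and server. Because they all conform to the xml rpc specification, in theory any rpc client implementation can be paired with any rpc server implementation.
Furthermore, because xml rpc transmits XML text over HTTP, it is independent of programming language. For example, rpc client implementations exist for perl, php, python, c/c++, java and others; rpc server implementations exist for perl, java and others.
A single programming language may also have more than one implementation. For Java, for example, there are Marque's xmlrpc implementation (http://xmlrpc.sourceforge.net/) and Apache's xmlrpc implementation (http://ws.apache.org/xmlrpc/).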

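1.3. How xmlrpc works
For the complete picture, see the xmlrpc specification (http://www.xmlrpc.com/spec).
In brief:
How the rpc client works: the rpc client locates the rpc server by URL -> builds a request packet that calls a method of a service on the rpc server -> receives the rpc server's reply, parses the response packet and extracts the call's return value.
How the rpc server works: start a web server (when the built-in web server is used) -> register each service it can provide, with each service corresponding to one Handler class -> enter the listening state.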
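The client and server workflow above can be sketched end to end with the Python 3 standard library, which also illustrates the language independence mentioned in 1.2. This is a minimal sketch under stated assumptions, not the Apache implementation: the class MyHandler, the registered name myHandler.sayHello and the helper call_say_hello are invented for illustration, and the server listens on whatever free local port the OS assigns.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

class MyHandler:
    """Hypothetical handler class providing one service method."""
    def sayHello(self, name):
        return "Hello," + name

def call_say_hello(name):
    # Server side: start a web server; port 0 lets the OS pick a free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # Register the service method under the name "handlerName.methodName".
    server.register_function(MyHandler().sayHello, "myHandler.sayHello")
    # Enter the listening state in a background thread.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        # Client side: locate the server by URL, call the method, read the result.
        with xmlrpc.client.ServerProxy("http://%s:%d/" % (host, port)) as proxy:
            return proxy.myHandler.sayHello(name)
    finally:
        server.shutdown()
        server.server_close()

print(call_say_hello("Tom"))
```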
1.4. The xmlrpc specification
It is a mere 6 pages and very clearly written; a careful read is recommended. http://www.xmlrpc.com/spec

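2. Some examples of using xml rpc in Java
2.0. Environment setup: download the following packages and add them to the CLASSPATH
apache xmlrpc package (http://ws.apache.org/xmlrpc/)
commons-httpclient-3.0-rc4.jar
commons-codec-1.3.jar
2.1. Using Apache's Java xmlrpc implementation to build a simple add/subtract service. See the test.XmlRPCClient and test.JavaServer classes in the appendix.
2.2. Using Apache's Java xmlrpc implementation to test how Java data types and xmlrpc data types map to each other. See the test2.XmlRPCClient and test2.JavaServer classes in the appendix.
A brief description:
>The xmlrpc Array type maps to the Java Vector type.
For example, if a service method on the RPC server declares a return type of String[], the client receives a Vector object;
conversely, if the client passes a String[] as a call argument, the RPC server receives a Vector object.
Of course, declaring Vector directly instead of String[] also works.
>The xmlrpc struct type maps to the Java Hashtable type.
>The other types, such as String, int, double, boolean and Date, map straightforwardly. One thing to note: when the rpc client uses primitive types such as int/double/boolean, it must wrap them in the corresponding object, for example Integer/Double/Boolean.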
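The same effect, where the declared sequence type is lost and the receiver always gets the library's generic array type, can be observed with the Python 3 standard library. In this sketch a tuple plays the role of String[] and a list the role of Vector; that analogy, the method name test.echo and the sample values are assumptions for illustration.

```python
import xmlrpc.client

def received_as(value):
    """Return what the receiving side sees after one encode/decode cycle."""
    xml = xmlrpc.client.dumps((value,), methodname="test.echo")
    (decoded,), _ = xmlrpc.client.loads(xml)
    return decoded

# A tuple is sent as an <array>, but it arrives as a list.
sent = ("a", "b", "c")
got = received_as(sent)
print(type(sent).__name__, "->", type(got).__name__, got)

# A dict is sent as a <struct>; nested sequences follow the same rule.
record = received_as({"id": 7, "tags": ("x", "y")})
print(record)
```

The wire format only knows array and struct, so whatever container type the sender used has to be re-established by the receiving code.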

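2.3. Using Apache's Java xmlrpc implementation to transmit data of custom types
This sample assumes that every transmitted object implements the XmlRPCSerializable interface. The purpose of the example is to simulate the command object in unionmon. When the data to transmit is an Object[], it can be sent this way. Combined with unionmon's code generation mechanism, each VO's serialize/deserialize methods can be produced during code generation. Likewise, the serialize/deserialize methods of a vox have to be written by hand.
Reference code: test3.XmlRPCSerializable, test3.AccountVO, test3.XmlRPCClient and test3.JavaServer in the appendix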
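Since the appendix classes are not reproduced here, the following is only a guess at the general shape of such a scheme, written with the Python 3 standard library: each value object converts itself to a struct and back, and an array of such structs is what actually travels. The AccountVO fields, the method names to_struct and from_struct, and the RPC method name account.save are all hypothetical.

```python
import xmlrpc.client
from dataclasses import dataclass

@dataclass
class AccountVO:
    """Hypothetical value object; the field names are invented for this sketch."""
    account_id: int
    owner: str
    balance: float

    def to_struct(self):
        # Serialize to a plain dict, which XML-RPC carries as a <struct>.
        return {"account_id": self.account_id,
                "owner": self.owner,
                "balance": self.balance}

    @classmethod
    def from_struct(cls, struct):
        # Rebuild the object from the decoded <struct>.
        return cls(struct["account_id"], struct["owner"], struct["balance"])

def send_and_receive(accounts):
    """Encode value objects as an array of structs, then decode them again."""
    payload = [account.to_struct() for account in accounts]
    xml = xmlrpc.client.dumps((payload,), methodname="account.save")
    (structs,), _ = xmlrpc.client.loads(xml)
    return [AccountVO.from_struct(s) for s in structs]

original = [AccountVO(1, "Tom", 12.5), AccountVO(2, "Ann", 0.0)]
print(send_and_receive(original) == original)
```

Hand-written conversions like these are the part that a code generator, as mentioned above, could produce for each VO.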

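2.4. Letting tomcat host the rpc server instead of starting the built-in WebServer
Approach: 1. implement a servlet and configure it in tomcat; 2. point the rpc client directly at this servlet to obtain the service.
Note that the rpc client uses the HTTP POST method, so the servlet only needs to implement doPost.

Procedure: add the following configuration to web.xml in tomcat:

<servlet>
    <servlet-name>SampleServiceServlet</servlet-name>
    <servlet-class>test4.ServletXmlRPCServer</servlet-class>
</servlet>
<servlet-mapping>
    <servlet-name>SampleServiceServlet</servlet-name>
    <url-pattern>/SampleServiceServlet</url-pattern>
</servlet-mapping>

Reference classes: test4.SampleService, test4.ServletXmlRPCServer and test4.XmlRPCClient in the appendix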
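The note that only the POST handler is needed can be tried out with a small stand-in for the servlet, written with the Python 3 standard library. This is a sketch under several assumptions: RpcEndpoint, demo and the method name sample.add are invented, the URL path merely echoes the mapping above, and _marshaled_dispatch is an underscore-prefixed helper of the standard library dispatcher used here in place of Apache's execute.

```python
import http.client
import threading
import xmlrpc.client
from http.server import BaseHTTPRequestHandler, HTTPServer
from xmlrpc.server import SimpleXMLRPCDispatcher

# Dispatcher holding the registered service methods (hypothetical name).
dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(lambda a, b: a + b, "sample.add")

class RpcEndpoint(BaseHTTPRequestHandler):
    """Implements POST only, like a servlet with just doPost."""
    def do_POST(self):
        body = self.rfile.read(int(self.headers["Content-Length"]))
        # Decode the request, run the method, encode the response (or a fault).
        response = dispatcher._marshaled_dispatch(body)
        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet

def demo():
    server = HTTPServer(("127.0.0.1", 0), RpcEndpoint)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    try:
        url = "http://127.0.0.1:%d/SampleServiceServlet" % port
        with xmlrpc.client.ServerProxy(url) as proxy:
            total = proxy.sample.add(2, 3)      # sent as an HTTP POST
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/SampleServiceServlet")
        get_status = conn.getresponse().status  # GET is not implemented
        conn.close()
        return total, get_status
    finally:
        server.shutdown()
        server.server_close()

print(demo())
```

The RPC call succeeds even though the endpoint rejects GET, which is consistent with the client never issuing anything but POST.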

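3. todolist
3.1. Could introspection be used to make serialization/deserialization of custom VO types automatic? Castor can do XML binding of objects and seems worth a look.
3.2. SOAP is more complex and more powerful than xmlrpc, but I am not familiar with it. Also, must SOAP always be used together with web services? Could SOAP serve as the RMI protocol without web services?

4. Appendix
4.1. Appendix 1: data types supported by xmlrpc (omitted)
4.2. Appendix 2: mapping between xmlrpc data types and Java data types (omitted)
4.3. References
http://xmlrpc.sourceforge.net/ Marque's xmlrpc implementation
http://ws.apache.org/xmlrpc/ Apache's xmlrpc server implementation
http://xmlrpc-c.sourceforge.net/xmlrpc-howto/xmlrpc-howto.html xmlrpc howto
http://www.sentom.net/list.asp?id=80 an xml-rpc example
http://www.xmlrpc.com/spec xmlrpc specification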

Posted by micas on Aug 1st 2007 | Filed in SEO | Comments (0)

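Starting to Look into the XML RPC Java Client

  1. http://ws.apache.org/xmlrpc/client.html
  2. I don't know whether the Java client can call a PHP server; if it can, publishing to my blog system will become very convenient.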

Posted by micas on Jul 31st 2007 | Filed in SEO | Comments (0)
