java.lang.Object
  org.apache.lucene.analysis.Analyzer
    org.apache.lucene.analysis.cz.CzechAnalyzer
public final class CzechAnalyzer
extends Analyzer
Analyzer for the Czech language. Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified. The exclusion list is empty by default.
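As a rough illustration of what stopword removal means in practice, the sketch below uses only JDK classes. It is not Lucene code and does not reproduce the tokenization performed by CzechAnalyzer. The class StopWordSketch, its filter method, and the short SAMPLE_STOP_WORDS array are hypothetical names invented for this example; the real CZECH_STOP_WORDS list is considerably longer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only: NOT Lucene code. It mimics the idea of dropping
// stopwords before indexing, using plain JDK classes.
final class StopWordSketch {

    // Hypothetical stand-in for a default list such as CZECH_STOP_WORDS.
    static final String[] SAMPLE_STOP_WORDS = {"a", "i", "v", "na", "se", "je"};

    // Lowercases the text, splits on runs of non-letters, and keeps
    // only the tokens that are not in the stopword set.
    static List<String> filter(String text, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Default behaviour: use the built-in sample list.
        Set<String> defaults = new HashSet<>(Arrays.asList(SAMPLE_STOP_WORDS));
        System.out.println(filter("Pes je v lese a pije vodu", defaults));
        // prints [pes, lese, pije, vodu]

        // Alternative list supplied by the caller replaces the defaults.
        Set<String> custom = new HashSet<>(Arrays.asList("pes"));
        System.out.println(filter("Pes je v lese", custom));
        // prints [je, v, lese]
    }
}
```

The two calls in main loosely mirror the choice between the no-argument constructor and the constructors that accept a stopword list: supplying a list replaces the defaults rather than extending them.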
| Field Summary | |
|---|---|
| static java.lang.String[] | CZECH_STOP_WORDS: List of typical stopwords. |
| Constructor Summary | |
|---|---|
| CzechAnalyzer() | Builds an analyzer with the default stop words (CZECH_STOP_WORDS). |
| CzechAnalyzer(java.io.File stopwords) | Builds an analyzer with the given stop words. |
| CzechAnalyzer(java.util.HashSet stopwords) | |
| CzechAnalyzer(java.lang.String[] stopwords) | Builds an analyzer with the given stop words. |
| Method Summary | |
|---|---|
| void | loadStopWords(java.io.InputStream wordfile, java.lang.String encoding): Loads stopwords hash from resource stream (file, database...). |
| TokenStream | tokenStream(java.lang.String fieldName, java.io.Reader reader): Creates a TokenStream which tokenizes all the text in the provided Reader. |
| Methods inherited from class org.apache.lucene.analysis.Analyzer |
|---|
| getPositionIncrementGap, getPreviousTokenStream, reusableTokenStream, setPreviousTokenStream |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final java.lang.String[] CZECH_STOP_WORDS
| Constructor Detail |
|---|
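public CzechAnalyzer()
Builds an analyzer with the default stop words (CZECH_STOP_WORDS).
public CzechAnalyzer(java.lang.String[] stopwords)
Builds an analyzer with the given stop words.
public CzechAnalyzer(java.util.HashSet stopwords)
public CzechAnalyzer(java.io.File stopwords)
                     throws java.io.IOException
Builds an analyzer with the given stop words.
Throws: java.io.IOException
| Method Detail |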
|---|
public void loadStopWords(java.io.InputStream wordfile,
                          java.lang.String encoding)
Loads stopwords hash from resource stream (file, database...).
Parameters:
wordfile - File containing the wordlist
encoding - Encoding used (win-1250, iso-8859-2, ...), null for default system encoding
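To make the role of the two parameters concrete, here is a minimal JDK-only sketch of reading a word list from a stream with an optional encoding. It is not the Lucene implementation: the class StopWordLoaderSketch and its load method are hypothetical, and the one-word-per-line file layout is an assumption, since this page does not specify the format.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.HashSet;
import java.util.Set;

// Illustration only: a JDK-only helper, NOT the Lucene implementation.
final class StopWordLoaderSketch {

    // Reads one stopword per line (an assumed file layout).
    // A null encoding falls back to the platform default charset.
    static Set<String> load(InputStream wordfile, String encoding) {
        Charset charset = (encoding == null)
                ? Charset.defaultCharset()
                : Charset.forName(encoding);
        Set<String> words = new HashSet<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(wordfile, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {   // skip blank lines
                    words.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // 0xBE is the letter z with caron (U+017E) in ISO-8859-2.
        byte[] data = {'k', 'd', 'y', (byte) 0xBE, '\n', 'n', 'e', 'b', 'o', '\n'};
        Set<String> words = load(new ByteArrayInputStream(data), "ISO-8859-2");
        System.out.println(words.contains("kdy\u017E")); // prints true
    }
}
```

Decoding with the wrong charset would store a different string for any word containing diacritics, so such a stopword would silently fail to match. That is the practical reason the encoding is passed explicitly rather than always relying on the system default.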
public final TokenStream tokenStream(java.lang.String fieldName,
                                     java.io.Reader reader)
Creates a TokenStream which tokenizes all the text in the provided Reader.
Specified by: tokenStream in class Analyzer