Class HHMMSegmenter
java.lang.Object
org.apache.lucene.analysis.cn.smart.hhmm.HHMMSegmenter
Finds the optimal segmentation of a sentence into Chinese words.
-
Field Summary
Fields: wordDict
Constructor Summary
Constructors: HHMMSegmenter()
Method Summary
Modifier and Type | Method | Description
private SegGraph | createSegGraph(String sentence) | Create the SegGraph for a sentence.
private static int[] | getCharTypes(String sentence) | Get the character types for every character in a sentence.
List<SegToken> | process(String sentence) | Return a list of SegToken representing the best segmentation of a sentence.
Field Details
-
wordDict
-
-
Constructor Details
-
HHMMSegmenter
public HHMMSegmenter()
-
-
Method Details
-
createSegGraph
private SegGraph createSegGraph(String sentence)
Create the SegGraph for a sentence.
Parameters:
sentence - input sentence, without start and end markers
Returns:
SegGraph corresponding to the input sentence.
-
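To give a rough idea of what a segmentation graph for a sentence can look like, the sketch below collects every candidate word at every start offset of 北京大学, using a toy dictionary. The class SegGraphSketch, its Edge type, the dictionary contents, and the maxWordLen parameter are all hypothetical names invented for this illustration; this is not the SegGraph implementation that createSegGraph actually builds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; this is not Lucene's SegGraph. */
final class SegGraphSketch {

    /** A candidate word covering sentence[start, end). */
    static final class Edge {
        final int start;
        final int end;
        final String word;

        Edge(int start, int end, String word) {
            this.start = start;
            this.end = end;
            this.word = word;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + ") " + word;
        }
    }

    /** Collects single characters plus every dictionary word found at each start offset. */
    static List<Edge> build(String sentence, Set<String> dict, int maxWordLen) {
        List<Edge> edges = new ArrayList<>();
        for (int start = 0; start < sentence.length(); start++) {
            // A single character is always a candidate, so every position stays reachable.
            edges.add(new Edge(start, start + 1, sentence.substring(start, start + 1)));
            int limit = Math.min(sentence.length(), start + maxWordLen);
            for (int end = start + 2; end <= limit; end++) {
                String candidate = sentence.substring(start, end);
                if (dict.contains(candidate)) {
                    edges.add(new Edge(start, end, candidate));
                }
            }
        }
        return edges;
    }

    public static void main(String[] args) {
        // Toy dictionary (assumed): "Beijing", "university", "Peking University".
        Set<String> dict = new HashSet<>(Arrays.asList(
                "\u5317\u4eac", "\u5927\u5b66", "\u5317\u4eac\u5927\u5b66"));
        for (Edge e : build("\u5317\u4eac\u5927\u5b66", dict, 4)) {
            System.out.println(e);
        }
    }
}
```

Each edge is one possible token; a segmentation of the sentence corresponds to a chain of edges that runs from offset 0 to the end of the sentence without gaps or overlaps.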
getCharTypes
private static int[] getCharTypes(String sentence)
Get the character types for every character in a sentence.
Parameters:
sentence - input sentence
Returns:
array of character types corresponding to character positions in the sentence
-
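As a rough illustration of per-character typing, the sketch below maps each char of a sentence to an integer type code and returns one code per character position. The class name CharTypeSketch, the four constants, and the classification rules are assumptions made for this example; the actual type codes and rules used by getCharTypes are not documented in this section.

```java
import java.util.Arrays;

/** Hypothetical illustration only; the type codes below are invented for this sketch. */
final class CharTypeSketch {
    static final int DELIMITER = 0;
    static final int LETTER = 1;
    static final int DIGIT = 2;
    static final int HANZI = 3;

    /** Classifies one character using standard JDK character properties. */
    static int charType(char ch) {
        if (Character.UnicodeScript.of(ch) == Character.UnicodeScript.HAN) {
            return HANZI;
        }
        if (Character.isDigit(ch)) {
            return DIGIT;
        }
        if (Character.isLetter(ch)) {
            return LETTER;
        }
        // Punctuation, whitespace, and anything else is treated as a delimiter here.
        return DELIMITER;
    }

    /** Returns one type code per character position in the sentence. */
    static int[] charTypes(String sentence) {
        int[] types = new int[sentence.length()];
        for (int i = 0; i < sentence.length(); i++) {
            types[i] = charType(sentence.charAt(i));
        }
        return types;
    }

    public static void main(String[] args) {
        // Input: a Latin letter, a digit, one Han character, and a comma.
        System.out.println(Arrays.toString(charTypes("A1\u4e2d,")));
    }
}
```

The returned array has the same length as the input, so the type at index i always describes the character at index i.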
process
public List<SegToken> process(String sentence)
Return a list of SegToken representing the best segmentation of a sentence.
Parameters:
sentence - input sentence
Returns:
best segmentation as a List
-
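One common way to pick a "best" segmentation is to assign each candidate word a cost and choose the chain of words with the lowest total cost. The sketch below does this with dynamic programming for 研究生命起源, where a greedy longest match (研究生 first) gives a worse result than 研究 / 生命 / 起源. The class BestPathSketch, the cost values, and UNKNOWN_COST are assumptions for illustration; this section does not describe how process scores candidates, so this should not be read as its algorithm.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the scoring used by HHMMSegmenter. */
final class BestPathSketch {
    /** Assumed penalty for a single character that is not in the cost table. */
    static final double UNKNOWN_COST = 10.0;

    /** Returns the lowest-cost split of the sentence into words from the cost table. */
    static List<String> segment(String sentence, Map<String, Double> costs, int maxWordLen) {
        int n = sentence.length();
        double[] best = new double[n + 1]; // best[i] = lowest cost to cover sentence[0, i)
        int[] prev = new int[n + 1];       // prev[i] = start offset of the last word in that split
        Arrays.fill(best, Double.POSITIVE_INFINITY);
        best[0] = 0.0;
        for (int end = 1; end <= n; end++) {
            for (int start = Math.max(0, end - maxWordLen); start < end; start++) {
                Double cost = costs.get(sentence.substring(start, end));
                if (cost == null) {
                    if (end - start > 1) {
                        continue; // unknown multi-character strings are not candidates
                    }
                    cost = UNKNOWN_COST; // unknown single characters are allowed at a penalty
                }
                if (best[start] + cost < best[end]) {
                    best[end] = best[start] + cost;
                    prev[end] = start;
                }
            }
        }
        // Walk the back-pointers from the end of the sentence to recover the words.
        List<String> words = new ArrayList<>();
        for (int end = n; end > 0; end = prev[end]) {
            words.add(sentence.substring(prev[end], end));
        }
        Collections.reverse(words);
        return words;
    }

    public static void main(String[] args) {
        // Toy costs (assumed): lower means more likely.
        Map<String, Double> costs = new HashMap<>();
        costs.put("\u7814\u7a76", 2.0);       // "research"
        costs.put("\u7814\u7a76\u751f", 3.0); // "graduate student"
        costs.put("\u751f\u547d", 2.5);       // "life"
        costs.put("\u547d", 6.0);             // "fate"
        costs.put("\u8d77\u6e90", 3.0);       // "origin"
        System.out.println(segment("\u7814\u7a76\u751f\u547d\u8d77\u6e90", costs, 3));
    }
}
```

With these toy costs the three-word split totals 7.5, while the split that starts with the three-character word totals 12.0, so the dynamic program prefers the former.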