Crate jieba_rs

The Jieba Chinese Word Segmentation Implemented in Rust

§Installation

Add it to your Cargo.toml:

[dependencies]
jieba-rs = "0.7"

That is all you need. If you are using the Rust 2015 edition, you also have to add extern crate jieba_rs; to your crate root.

§Example

use jieba_rs::Jieba;

let jieba = Jieba::new();
let words = jieba.cut("我们中出了一个叛徒", false);
assert_eq!(words, vec!["我们", "中", "出", "了", "一个", "叛徒"]);

use jieba_rs::Jieba;
use jieba_rs::{TfIdf, KeywordExtract};

fn main() {
    let jieba = Jieba::new();
    let keyword_extractor = TfIdf::default();
    let top_k = keyword_extractor.extract_keywords(
        &jieba,
        "今天纽约的天气真好啊，京华大酒店的张尧经理吃了一只北京烤鸭。后天纽约的天气不好，昨天纽约的天气也不好，北京烤鸭真好吃",
        3,
        vec![],
    );
    println!("{:?}", top_k);
}
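For intuition about what a TF-IDF extractor ranks, here is a minimal standard-library sketch of the general technique. It is not this crate's implementation: the function name `top_k_tfidf`, the `idf` table, and the `default_idf` fallback are all hypothetical, and the input is assumed to be already segmented (for example by `Jieba::cut`).

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Rank the words of one pre-segmented document by TF-IDF and keep the best `top_k`.
/// `idf` (word -> inverse document frequency) and `default_idf` (used for words
/// missing from the table) are hypothetical inputs for this sketch.
fn top_k_tfidf(
    words: &[&str],
    idf: &HashMap<&str, f64>,
    default_idf: f64,
    top_k: usize,
) -> Vec<(String, f64)> {
    // Term frequency: raw count of each word in the document.
    let mut counts: HashMap<&str, usize> = HashMap::new();
    for &w in words {
        *counts.entry(w).or_insert(0) += 1;
    }
    let total = words.len().max(1) as f64;

    // Score = (count / total words) * idf.
    let mut scored: Vec<(String, f64)> = counts
        .into_iter()
        .map(|(w, c)| {
            let weight = idf.get(w).copied().unwrap_or(default_idf);
            (w.to_string(), c as f64 / total * weight)
        })
        .collect();

    // Highest score first; ties are broken alphabetically so the order is deterministic.
    scored.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap_or(Ordering::Equal)
            .then_with(|| a.0.cmp(&b.0))
    });
    scored.truncate(top_k);
    scored
}
```

A frequent word with a low IDF can therefore rank below a rare word that appears only once, which is why common function words tend not to show up as keywords.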

use jieba_rs::Jieba;
use jieba_rs::{TextRank, KeywordExtract};

fn main() {
    let jieba = Jieba::new();
    let keyword_extractor = TextRank::default();
    let top_k = keyword_extractor.extract_keywords(
        &jieba,
        "此外，公司拟对全资子公司吉林欧亚置业有限公司增资4.3亿元，增资后，吉林欧亚置业注册资本由7000万元增加到5亿元。吉林欧亚置业主要经营范围为房地产开发及百货零售等业务。目前在建吉林欧亚城市商业综合体项目。2013年，实现营业收入0万元，实现净利润-139.13万元。",
        6,
        vec![String::from("ns"), String::from("n"), String::from("vn"), String::from("v")],
    );
    println!("{:?}", top_k);
}
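The idea behind TextRank can be sketched in the same way. The code below is a generic illustration using only the standard library, not this crate's implementation: `top_k_textrank` and its `window`, `damping`, and `iterations` parameters are hypothetical names, and the input is again assumed to be already segmented (and, if desired, already filtered by part of speech).

```rust
use std::cmp::Ordering;
use std::collections::HashMap;

/// Rank words by a weighted PageRank over their co-occurrence graph.
/// All parameter names here are assumptions made for this sketch.
fn top_k_textrank(
    words: &[&str],
    window: usize,
    damping: f64,
    iterations: usize,
    top_k: usize,
) -> Vec<(String, f64)> {
    // Undirected graph: the edge weight counts how often two distinct words
    // appear within `window` consecutive positions of each other.
    let mut graph: HashMap<&str, HashMap<&str, f64>> = HashMap::new();
    for (i, &a) in words.iter().enumerate() {
        for &b in words.iter().skip(i + 1).take(window.saturating_sub(1)) {
            if a == b {
                continue;
            }
            *graph.entry(a).or_default().entry(b).or_insert(0.0) += 1.0;
            *graph.entry(b).or_default().entry(a).or_insert(0.0) += 1.0;
        }
    }

    // Start every node at the same score, then repeat the PageRank update.
    let mut scores: HashMap<&str, f64> = graph.keys().map(|&w| (w, 1.0)).collect();
    for _ in 0..iterations {
        let mut next = HashMap::with_capacity(scores.len());
        for (&node, neighbours) in &graph {
            let mut sum = 0.0;
            for (&other, &weight) in neighbours {
                // Each neighbour shares its score in proportion to edge weight.
                let out: f64 = graph[other].values().sum();
                sum += weight / out * scores[other];
            }
            next.insert(node, (1.0 - damping) + damping * sum);
        }
        scores = next;
    }

    // Highest score first; ties are broken alphabetically.
    let mut ranked: Vec<(String, f64)> =
        scores.into_iter().map(|(w, s)| (w.to_string(), s)).collect();
    ranked.sort_by(|a, b| {
        b.1.partial_cmp(&a.1)
            .unwrap_or(Ordering::Equal)
            .then_with(|| a.0.cmp(&b.0))
    });
    ranked.truncate(top_k);
    ranked
}
```

Words that co-occur with many different neighbours accumulate score from all of them, so they rise to the top even without any corpus-wide statistics such as an IDF table.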

§Enabling Additional Features

The default-dict feature enables the embedded dictionary; this feature is enabled by default
The tfidf feature enables the TF-IDF keyword extractor
The textrank feature enables the TextRank keyword extractor

[dependencies]
jieba-rs = { version = "0.7", features = ["tfidf", "textrank"] }

Re-exports§

pub use crate::keywords::DEFAULT_STOP_WORDS;

Structs§

DEFAULT_STOP_WORDS
Jieba
Jieba segmentation
Keyword
Keyword with weight
KeywordExtractConfig
Creates a KeywordExtractConfig state that contains filter criteria as well as segmentation configuration for use by keyword extraction implementations.
Tag
A tagged word
TextRank
Text rank keywords extraction.
TfIdf
TF-IDF keywords extraction
Token
A Token

Enums§

Error
The Error type
TokenizeMode

Traits§

KeywordExtract
Extracts keywords from a given sentence with the Jieba instance.
