KyTea API

Return to KyTea

This page describes how to access KyTea's functionality from your own C++ program using the KyTea API. Third-party wrappers for KyTea have also been written for some other languages. If your favorite language is missing, please feel free to write a wrapper and I will post it here!

Preliminaries

Before using KyTea from your own programs, you must compile and install it, and make sure that your compiler can find its header and library files. If you installed KyTea in the default location, you can link against the KyTea library by passing the -lkytea option when compiling your program. Put the option after the source file, since many linkers resolve libraries in command-line order.

$ g++ my-program.cpp -lkytea

Example Program

This is an example program that uses KyTea to analyze a sentence (it is also available in the src/api directory of the KyTea source).

#include <iostream>

// The header for the main Kytea class
#include "kytea/kytea.h"
// The header for the sentence, word, and tag objects
#include "kytea/kytea-struct.h"

using namespace std;
using namespace kytea;

int main(int argc, char** argv) {

    // Create an instance of the Kytea program
    Kytea kytea;
    
    // Load a KyTea model from a model file. This can be a binary or
    //  text model in any character encoding; the format and encoding
    //  are detected automatically
    kytea.readModel("model.bin");

    // Get the string utility class. This allows you to convert from
    //  the appropriate string encoding to Kytea's internal format
    StringUtil* util = kytea.getStringUtil(); 

    // Get the configuration class. This allows you to read or set the
    //  configuration for the analysis
    KyteaConfig* config = kytea.getConfig();

    // Map a plain text string to a KyteaString, and create a sentence object
    KyteaSentence sentence(util->mapString("これはテストです。"));

    // Find the word boundaries
    kytea.calculateWS(sentence);
    // Calculate the tags (such as pronunciations) for each tag level
    for(int i = 0; i < config->getNumTags(); i++)
        kytea.calculateTags(sentence,i);

    // Get the words in the sentence
    const KyteaSentence::Words & words = sentence.words;

    // For each word in the sentence
    for(int i = 0; i < (int)words.size(); i++) {
        // Print the word
        cout << util->showString(words[i].surf);

        // For each tag level
        for(int j = 0; j < (int)words[i].tags.size(); j++) {
            cout << "\t";
            // Print each candidate tag at this level, followed by its score
            for(int k = 0; k < (int)words[i].tags[j].size(); k++) {
                cout << " " << util->showString(words[i].tags[j][k].first) << 
                        "/" << words[i].tags[j][k].second;
            }
        }

        cout << endl;
    }
    cout << endl;

}
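
The example above prints every candidate tag together with its score. If you only want the single best tag at each level, you can scan the candidates and keep the one with the highest score. The following is a minimal sketch of that idea, not part of the KyTea API: it uses hypothetical stand-in types (Word, Tag, TagCandidates) with std::string in place of KyteaString so that it compiles without KyTea installed, and the helper names bestTag, formatWord, and formatSentence, as well as the "UNK" placeholder, are made up for this illustration. It does not assume that the candidates are already sorted by score.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins that mirror the shape of the data used in the
// example program (words[i].surf and words[i].tags[j][k]). In KyTea itself
// the strings are KyteaString objects, not std::string.
typedef std::pair<std::string, double> Tag;   // (tag, score)
typedef std::vector<Tag> TagCandidates;       // all candidates at one level

struct Word {
    std::string surf;                 // surface form of the word
    std::vector<TagCandidates> tags;  // tags[level][candidate]
};

// Return the highest-scoring candidate at one tag level, or "UNK" (a
// placeholder chosen for this sketch) when the level has no candidates.
std::string bestTag(const TagCandidates& candidates) {
    if (candidates.empty()) return "UNK";
    std::size_t best = 0;
    for (std::size_t k = 1; k < candidates.size(); k++)
        if (candidates[k].second > candidates[best].second) best = k;
    return candidates[best].first;
}

// Join a word with its best tag at every level, as "surface/tag0/tag1".
std::string formatWord(const Word& word) {
    std::string out = word.surf;
    for (std::size_t j = 0; j < word.tags.size(); j++)
        out += "/" + bestTag(word.tags[j]);
    return out;
}

// Join all formatted words of a sentence with single spaces.
std::string formatSentence(const std::vector<Word>& words) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); i++) {
        if (i > 0) out += " ";
        out += formatWord(words[i]);
    }
    return out;
}
```

To apply the same logic to real KyTea output, you would loop over sentence.words as in the example program and convert each surface and tag with util->showString before joining them.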
Last Modified: 2011-06-17 by neubig