python-ucto

This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is a regular-expression-based, extensible, and advanced tokeniser written in C++ (http://ilk.uvt.nl/ucto).
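To make the point about non-trivial tokenisation concrete, here is a minimal sketch of a regular-expression-based tokeniser using only Python's standard `re` module. It does not use Ucto or this binding, and the rules and the names `TOKEN_PATTERN` and `tokenize` are hypothetical illustrations; Ucto's actual rule sets are language-specific and far more extensive.

```python
import re

# Hypothetical, heavily simplified rules for illustration only.
# Alternatives are tried left to right, so more specific rules come first.
TOKEN_PATTERN = re.compile(
    r"""
      https?://[^\s()]+[\w/]      # URLs kept whole, minus trailing punctuation
    | (?:[A-Za-z]\.){2,}          # dotted abbreviations such as "e.g."
    | \d+(?:[.,]\d+)*             # numbers, including decimals like 3.50
    | \w+(?:[-']\w+)*             # words, with internal hyphens or apostrophes
    | [^\w\s]                     # any other single non-space symbol
    """,
    re.VERBOSE,
)


def tokenize(text):
    """Return the list of tokens found in text, in order."""
    return TOKEN_PATTERN.findall(text)


if __name__ == "__main__":
    sentence = "It costs 3.50 euros, e.g. at http://ilk.uvt.nl/ucto."

    # Naive whitespace splitting leaves punctuation glued to words.
    print(sentence.split())

    # The rule-based version separates the comma and the final full stop
    # while keeping "3.50", "e.g." and the URL intact.
    print(tokenize(sentence))
```

Splitting on whitespace alone would yield `euros,` and `http://ilk.uvt.nl/ucto.` as tokens; the ordered rules above detach the punctuation without breaking up the decimal number, the abbreviation, or the URL. Handling such cases across many languages is the kind of work a dedicated tokeniser like Ucto is built for.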

Creator

proycon

Related apps

python-frog

Python bindings to the Dutch NLP tool Frog (pos tagger, lemmatiser, NER tagger,

Cython · 47 · gpl-3.0

5 months ago

colibri-core

Colibri core is an NLP tool as well as a C++ and Python library for working with

C++ · 122 · gpl-3.0

5 months ago

c-plus-plus, computational-linguistics, corpus

pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Proc

Python · 469 · gpl-3.0

7 months ago

computational-linguistics, evaluation-metrics, folia

python-timbl

python-timbl, originally developed by Sander Canisius, is a Python extension mod

Python · 17 · gpl-3.0

4 years ago

k-nearest-neighbours, knn, machine-learning