MeoMaya documentation, built to be used.
MeoMaya is a MeoX project focused on lightweight, high-performance Natural Language Processing in Python. This documentation covers installation, quick start, core pipeline components, machine learning utilities, REST API, CLI, hardware selection, advanced usage, reference, and troubleshooting.
Start with Introduction, Installation, and Quick Start to understand what MeoMaya is and get it running fast.
Core Components, Machine Learning, Hardware, CLI, and Advanced Usage cover how the framework is structured and used.
REST API, API Reference, Advanced Sentiment Demo, and Troubleshooting help once you move beyond setup.
Every section listed in the index is included below with its full content, examples, and reference material.
1. Introduction
MeoMaya is a lightweight, high-performance Natural Language Processing (NLP) framework built entirely in Python. It is designed to be simple, modular, and efficient, making it an ideal choice for developers, researchers, and students who need a powerful NLP toolkit without the overhead of larger, more complex libraries.
The framework provides a complete text processing pipeline, including normalization, tokenization, part-of-speech (POS) tagging, and parsing. Additionally, it features a pure-Python machine learning stack with a TF-IDF vectorizer and a centroid-based classifier, allowing for straightforward implementation of text classification and analysis tasks.
2. Installation
Prerequisites
- Python 3.11 or higher
- pip package manager
Steps
git clone https://github.com/KashyapSinh-Gohil/meomaya.git
cd meomaya
python -m venv .venv
source .venv/bin/activate
pip install -r meomaya/requirements.txt
Optional: pip install indic-nlp-library for Indian language support.
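Before installing, it can help to confirm that the interpreter meets the Python 3.11 prerequisite listed above. The following is a minimal sketch using only the standard library; the helper name meets_minimum is hypothetical and is not part of MeoMaya.

```python
import sys

# Minimum version taken from the prerequisites in this section.
MINIMUM_PYTHON = (3, 11)


def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version is at least `minimum`.

    `meets_minimum` is an illustrative helper, not a MeoMaya function.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum():
        print(f"Python {current} is supported.")
    else:
        print(f"Python {current} is too old; 3.11 or higher is required.")
```

Running this inside the activated virtual environment checks the interpreter that pip will actually install into.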
3. Quick Start
Modelify
import json
from meomaya.core.modelify import Modelify
# Run the full text pipeline on a single string.
m = Modelify(mode="text")
result = m.run("MeoMaya makes NLP easy and fun!")
# Pretty-print the result.
print(json.dumps(result, indent=2))
CLI
python -m meomaya "MeoMaya is great for command-line use." --mode text
4. Core Components
Normalizer
from meomaya.core.normalizer import Normalizer
normalizer = Normalizer(lang="en")
normalized = normalizer.normalize("This is an EXAMPLE")  # returns a str
Tokenizer
from meomaya.core.tokenizer import Tokenizer
tokenizer = Tokenizer(lang="en")
tokens = tokenizer.tokenize("Hello, world!")
Tagger
from meomaya.core.tagger import Tagger
tagger = Tagger(lang="en")
tagged = tagger.tag(["MeoMaya", "is", "cool"])  # returns a list of (token, tag) tuples
Parser
from meomaya.core.parser import Parser
parser = Parser(lang="en")
tree = parser.parse([("MeoMaya", "NNP"), ("is", "VBZ")])  # returns a dict
5. Machine Learning Utilities
Vectorizer
from meomaya.ml.vectorizer import Vectorizer
vectorizer = Vectorizer()
X = vectorizer.fit_transform([...])
Classifier
from meomaya.ml.classifier import Classifier
# train, save, load, predict
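To make the idea of a TF-IDF vectorizer paired with a centroid-based classifier concrete, here is a self-contained, pure-Python sketch of the general technique. It is not MeoMaya's implementation: the class names TinyTfidf and CentroidClassifier, the whitespace tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for a real tokenizer.
    return text.lower().split()


class TinyTfidf:
    """Illustrative TF-IDF vectorizer producing sparse dict vectors."""

    def fit(self, docs):
        tokenized = [tokenize(doc) for doc in docs]
        n_docs = len(tokenized)
        # Document frequency: number of documents containing each term.
        df = Counter(term for tokens in tokenized for term in set(tokens))
        # Smoothed inverse document frequency (one common variant).
        self.idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
        return self

    def transform(self, docs):
        vectors = []
        for doc in docs:
            # Terms not seen during fit are ignored.
            counts = Counter(t for t in tokenize(doc) if t in self.idf)
            vec = {t: c * self.idf[t] for t, c in counts.items()}
            norm = math.sqrt(sum(w * w for w in vec.values()))
            if norm:
                vec = {t: w / norm for t, w in vec.items()}
            vectors.append(vec)
        return vectors

    def fit_transform(self, docs):
        return self.fit(docs).transform(docs)


def cosine(a, b):
    """Cosine similarity between two sparse dict vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class CentroidClassifier:
    """Illustrative classifier: one mean vector (centroid) per label."""

    def fit(self, vectors, labels):
        sums = defaultdict(lambda: defaultdict(float))
        counts = Counter(labels)
        for vec, label in zip(vectors, labels):
            for term, weight in vec.items():
                sums[label][term] += weight
        self.centroids = {
            label: {t: w / counts[label] for t, w in terms.items()}
            for label, terms in sums.items()
        }
        return self

    def predict(self, vectors):
        # Assign each vector to the label whose centroid is most similar.
        return [
            max(self.centroids, key=lambda label: cosine(vec, self.centroids[label]))
            for vec in vectors
        ]


train_docs = ["great fun easy", "love this great tool", "slow and buggy", "terrible buggy mess"]
train_labels = ["pos", "pos", "neg", "neg"]

vectorizer = TinyTfidf()
classifier = CentroidClassifier().fit(vectorizer.fit_transform(train_docs), train_labels)
predictions = classifier.predict(vectorizer.transform(["great fun", "buggy mess"]))
print(predictions)  # ['pos', 'neg']
```

A centroid classifier needs only one pass over the training vectors and stores one vector per label, which is why it suits a dependency-free stack.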
6. REST API
uvicorn meomaya.api.server:app --host 0.0.0.0 --port 8000
curl -X POST http://localhost:8000/run -H 'Content-Type: application/json' \
-d '{"input":"Hello","mode":"text"}'
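The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above (POST to /run with "input" and "mode" fields). The helper names build_run_request and call_run are hypothetical, and it assumes the server replies with a JSON body, which the source does not specify.

```python
import json
import urllib.request


def build_run_request(text, mode="text", base_url="http://localhost:8000"):
    """Build the POST request for the /run endpoint (illustrative helper)."""
    payload = json.dumps({"input": text, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/run",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_run(text, mode="text", base_url="http://localhost:8000", timeout=10.0):
    """Send the request and decode the reply.

    Assumption: the response body is JSON; adjust if the server differs.
    """
    request = build_run_request(text, mode, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the uvicorn server from this section already running:
#   print(json.dumps(call_run("Hello"), indent=2))
```

Separating request construction from sending keeps the payload easy to inspect without a running server.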
7. Hardware Selection
from meomaya.core.hardware import select_device
print(select_device()) # 'cpu', 'cuda', or 'mps'
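The three return values match the device names used by PyTorch. As a rough illustration of how this kind of selection is commonly done, the sketch below prefers CUDA, then Apple's MPS backend, then the CPU. It assumes PyTorch as the optional backend and is not taken from MeoMaya's source; pick_device is a hypothetical name.

```python
def pick_device():
    """Return 'cuda', 'mps', or 'cpu' based on what is available.

    Illustrative only: assumes PyTorch is the accelerator backend and
    falls back to 'cpu' when it is not installed.
    """
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device())
```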
Advanced Sentiment Demo
See meomaya/examples/full_nlp_workflow_demo.py to train and save a vectorizer and a classifier.
python meomaya/examples/full_nlp_workflow_demo.py
CLI
python -m meomaya "Your text" --mode text
Advanced Usage
from meomaya.core.normalizer import Normalizer
from meomaya.core.tokenizer import Tokenizer
from meomaya.core.tagger import Tagger
from meomaya.core.parser import Parser
# ...build pipelines...
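One way to chain the four components is a small wrapper that feeds each stage's output into the next, following the method signatures listed in the API Reference. This is a sketch, not a MeoMaya class: TextPipeline and the keys of the returned dict are hypothetical.

```python
class TextPipeline:
    """Chain normalize -> tokenize -> tag -> parse (illustrative wrapper).

    Each argument only needs the method named in the API Reference:
    normalize(text), tokenize(text), tag(tokens), parse(tagged_tokens).
    """

    def __init__(self, normalizer, tokenizer, tagger, parser):
        self.normalizer = normalizer
        self.tokenizer = tokenizer
        self.tagger = tagger
        self.parser = parser

    def run(self, text):
        normalized = self.normalizer.normalize(text)
        tokens = self.tokenizer.tokenize(normalized)
        tagged = self.tagger.tag(tokens)
        parsed = self.parser.parse(tagged)
        # Keep every intermediate result so each stage can be inspected.
        return {
            "normalized": normalized,
            "tokens": tokens,
            "tagged": tagged,
            "parse": parsed,
        }


# Usage with the components imported above:
#   pipeline = TextPipeline(
#       Normalizer(lang="en"), Tokenizer(lang="en"),
#       Tagger(lang="en"), Parser(lang="en"),
#   )
#   result = pipeline.run("MeoMaya is cool")
```

Because the wrapper relies only on method names, any stage can be swapped for a custom object that exposes the same method.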
API Reference
- Normalizer(lang: str = "en"): normalize(text: str) -> str
- Tokenizer(lang: str = "en"): tokenize(text: str) -> list[str]
- Tagger(lang: str = "en"): tag(tokens: list[str]) -> list[tuple[str,str]]
- Parser(lang: str = "en"): parse(tagged_tokens: list[tuple[str,str]]) -> dict
Troubleshooting
- ImportError for indic_nlp_library: install the optional dependency.
- Incorrect path for corpus: verify the file path.
- Performance: process in batches for large datasets.
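For the batching advice above, a small generator is usually enough. The sketch below splits any iterable into fixed-size lists using only the standard library; iter_batches is a hypothetical helper, and the batch size is something to tune for your data.

```python
from itertools import islice


def iter_batches(items, batch_size):
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Usage with the Modelify object from the Quick Start:
#   for batch in iter_batches(texts, 500):
#       results.extend(m.run(text) for text in batch)
```

Because it consumes the input lazily, this also works with file handles and other iterables that do not fit in memory.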
See tests/ or open an issue on the project repo for help.