KazakhBERTmulti-onnx
This repository provides a ready-to-use ONNX model directory for @sarmay/kaz-converter-lm.
It is intended to improve ambiguous Arabic Tote Zhazu -> Cyrillic conversion in Node.js by scoring candidate Kazakh sentences with a masked language model.
Model repo:
Compatible npm packages:
@sarmay/kaz-converter@sarmay/kaz-converter-lm
Use with npm
Install:
npm install @sarmay/kaz-converter @sarmay/kaz-converter-lm
Download this model repository to a local directory such as:
models/KazakhBERTmulti-onnx
If you only want the quickest path, use the bundled CLI:
npx sarmay-kaz-download ./models/KazakhBERTmulti-onnx
Then use:
import { createOnnxArabicToCyrillicConverter } from "@sarmay/kaz-converter-lm";
const converter = await createOnnxArabicToCyrillicConverter({
modelDirectory: "./models/KazakhBERTmulti-onnx"
});
console.log(await converter.convertAsync("الما بار."));
console.log(await converter.convertAsync("بىر كۇنى"));
console.log(await converter.convertAsync("اكەم كەلدى."));
Expected local layout
models/
KazakhBERTmulti-onnx/
model.onnx
tokenizer.json
tokenizer_config.json
special_tokens_map.json
config.json
vocab.txt
Expected files
This repository should contain:
model.onnxtokenizer.jsontokenizer_config.jsonspecial_tokens_map.jsonconfig.jsonvocab.txt
Minimum required for loading:
model.onnxtokenizer.jsontokenizer_config.json
Source model
This ONNX export is derived from:
amandyk/KazakhBERTmulti
and exported for local Node.js inference with:
onnxruntime-node@huggingface/tokenizers
Intended use
Use this model when you want:
- a local Node.js scorer
- better disambiguation than pure rule conversion
- a drop-in model directory that works with
createOnnxArabicToCyrillicConverter()
This repository does not package the model into npm. Download the model files separately and pass the directory path to the LM package.
Intended package pair
- Core package:
@sarmay/kaz-converter - LM package:
@sarmay/kaz-converter-lm
- Downloads last month
- 22
Model tree for sarmay/KazakhBERTmulti-onnx
Base model
amandyk/KazakhBERTmulti