Good Translation Setups for PCs with Weak GPUs / CPUs #1029
Wired-cell started this conversation in General
Some background (skip if you want to):
I have a really old GPU, a GTX 960 OC 2 GB (not even the Ti version). As you can guess, the VRAM is too low, so whenever I try to translate any Japanese manga I get a memory allocation error. If I load the models into RAM and leave the processing to the CPU (Ryzen 7 2700), inference is painfully slow. So I downloaded and tested a bunch of models and setups: some worked, some refused to hear my pleas, and some were just horrible. Here I have listed my findings and suggestions.
NOTE: I have neither trained nor fine-tuned any of these models. All credit goes to their respective creators.
Text Detector:
ctd (default): currently the only one that works. I'm still searching for more models (found one that's a 25 MB .pt file) but I don't really know how to load them properly.
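(For anyone else poking at mystery checkpoints, here is a minimal inspection sketch; the filename is a placeholder. Keep in mind a bare state_dict only stores weights, so you still need the matching model class before the detector can actually run.)

```python
import torch

# "detector.pt" is a placeholder for the 25 MB checkpoint mentioned above.
ckpt = torch.load("detector.pt", map_location="cpu")

if isinstance(ckpt, dict):
    # Most .pt files are a bare state_dict, or a dict wrapping one under
    # "state_dict"/"model"; the layer names hint at the architecture.
    state = ckpt.get("state_dict") or ckpt.get("model") or ckpt
    for name, value in list(state.items())[:15]:
        shape = tuple(value.shape) if hasattr(value, "shape") else type(value).__name__
        print(name, shape)
else:
    # A fully pickled nn.Module can be used directly, if its class is importable.
    print(type(ckpt))
```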
OCR:
mit48px_ctc (default): a good size. I found another manga-ocr model (~120 MB), but transformers just fails to recognize its model type (bert). I tried loading it as vit and it failed again.
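(If the ~120 MB model is kha-white's manga-ocr, which is my assumption, the plain bert/vit loading fails because it's a VisionEncoderDecoder checkpoint: a ViT encoder glued to a BERT-style decoder. A minimal loading sketch through transformers:)

```python
from PIL import Image
from transformers import AutoImageProcessor, AutoTokenizer, VisionEncoderDecoderModel

# Assumed model ID; neither a ViTModel nor a BertModel alone will load it.
model = VisionEncoderDecoderModel.from_pretrained("kha-white/manga-ocr-base")
tokenizer = AutoTokenizer.from_pretrained("kha-white/manga-ocr-base")
processor = AutoImageProcessor.from_pretrained("kha-white/manga-ocr-base")

# "bubble.png" is a hypothetical crop of a single text bubble.
image = Image.open("bubble.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values
ids = model.generate(pixel_values, max_new_tokens=64)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```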
Inpainter:
aot (already included): the smallest model (25 MB). Works really well as far as I've tested; results may vary for you.
Translator:
The real VRAM devouring happens here (1 GB for Sugoi). So we have a few ways to cut usage by up to 700 MB (the translation model itself taking only about 300 MB). They are:
Use a different translation model. I have ranked them according to my tests (a minimal loading sketch follows the list):
1. FuguMT ja-en: actually provides translation quality similar to Sugoi v4 but has minor difficulties converting sound effects, e.g. kata-kata to yata-yata. About 300 MB and really fast.
2. Helsinki-NLP/opus-mt ja-en: sometimes really broken translations, but the silver lining is that it's super fast :). No, like, seriously, it's subpar, and its fine-tunes are even worse.
3. Gemma 3 1B: worse than Fugu and OPUS, and not suited for conversations.
∞. Helsinki-NLP/opus-mt jap-en: mind the p. Infinity for a reason. Translates into Old Testament English, and incorrectly at that. If thou wouldst feel thyself questioning life, try it.
Honorable mentions: Aya 8B (the best, but too big), Gemma 3 4B (cannot fit on my GPU), and a bunch of other translators whose names I forgot.
(Note: will update on Mitsua and its fine-tunes after testing.)
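(Here is the promised loading sketch, assuming the post refers to the public Hugging Face repos staka/fugumt-ja-en and the two Helsinki-NLP ones:)

```python
from transformers import pipeline

# device=-1 keeps it on CPU; use device=0 if the model fits in VRAM.
translator = pipeline("translation", model="staka/fugumt-ja-en", device=-1)
print(translator("猫が好きです。")[0]["translation_text"])

# The opus-mt variants load the same way; note ja-en vs the infamous jap-en:
# pipeline("translation", model="Helsinki-NLP/opus-mt-ja-en")
# pipeline("translation", model="Helsinki-NLP/opus-mt-jap-en")
```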
Quantize Sugoi:
The default Sugoi model uses the float32 (unquantized) compute type; the loader and optimizer is CTranslate2. CTranslate2 actually lets you quantize the models yourself if you get the source weights, but it also requires a quantization-capable GPU. For all my fellow pewpocomp (people with potato computers): the CUDA compute capability (NVIDIA) required for 8-bit quantization is wonky, so check here and read further only if you have a compatible GPU. Otherwise, the model will automatically be converted back to 32-bit on load, take the same amount of VRAM, and be slower to load on top of that.
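(You can check what your card reports without leaving Python; a quick sketch using torch, which the project already depends on:)

```python
import torch

# Prints the compute capability of the first CUDA device, e.g. "5.2" for a GTX 960.
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"Compute capability: {major}.{minor}")
else:
    print("No CUDA device found")
```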
Moving on, you can download the fairseq Sugoi v4 model here. After that, use the ct2-fairseq-converter tool (installed automatically with ctranslate2, which is already included):

    ct2-fairseq-converter --model_path big.pretrain.pt --data_dir . --source_lang ja --target_lang en --quantization int8 --output_dir ../converted/sugoi-v3.3-ja-en-ct2-int8

to quantize it to your preferred type, or just download pre-quantized models from this user. More info about quantization can be found here.
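(Once converted, loading the int8 model looks roughly like this. The output directory matches the command above; the SentencePiece filename is an assumption on my part, so check what your download actually contains. Fairseq models expect the same SentencePiece tokenization they were trained with.)

```python
import ctranslate2
import sentencepiece as spm

# Path from the converter command above; the .model filename is an assumption.
translator = ctranslate2.Translator(
    "../converted/sugoi-v3.3-ja-en-ct2-int8",
    device="cuda",        # CTranslate2 falls back to a supported compute type if int8 isn't
    compute_type="int8",
)
sp = spm.SentencePieceProcessor(model_file="spm.ja.nopretok.model")

tokens = sp.encode("猫が好きです。", out_type=str)
result = translator.translate_batch([tokens])
print(sp.decode(result[0].hypotheses[0]))
```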
Note: I did not include the models from the model discussions because they only list models that can be loaded through llama.cpp, not natively through PyTorch (and with models behind a listening API, the prompt sometimes cannot force the translator to output the same number of lines as it received).
Another note: I'll share my modified scripts for the translators if requested.