Writing your modules #1017
API Documentation for Module Creation
Introduction
This guide describes how to create and integrate custom modules into the application. The system features a modular architecture that allows developers to extend its functionality by adding new algorithms for the following tasks:
- TextDetector: Finding text regions within an image.
- OCR: Extracting text from the found regions.
- Inpainter: Removing the original text from the image.
- Translator: Translating the extracted text.
Key Concepts
1. Module Registry
The system uses a "Registry" pattern to discover and manage modules. For your extension to become available in the program, it must be registered using a special decorator.
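To illustrate the idea, here is a minimal sketch of a decorator-based registry. The dictionary `TEXTDETECTORS` and the body of `register_textdetectors` are assumptions written for this example; the application's real registry may store and validate classes differently.

```python
from typing import Callable, Dict, Type

# Hypothetical stand-in for the application's detector registry. The real
# decorators (register_textdetectors, register_OCR, ...) ship with the app.
TEXTDETECTORS: Dict[str, Type] = {}


def register_textdetectors(name: str) -> Callable[[Type], Type]:
    """Return a class decorator that stores the class under `name`."""
    def decorator(cls: Type) -> Type:
        if name in TEXTDETECTORS:
            raise KeyError(f"detector {name!r} is already registered")
        TEXTDETECTORS[name] = cls
        return cls  # the class itself is returned unchanged
    return decorator


@register_textdetectors("my-detector")
class MyDetector:
    pass


# The application can now look the class up by its registered name.
detector_cls = TEXTDETECTORS["my-detector"]
```

The decorator runs when the module file is imported, which is why placing the file in the right directory matters: the class only becomes visible once something imports it.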
- Create a .py file in the corresponding directory (modules/textdetector, modules/ocr, etc.).
- Register the class with the decorator that matches its directory:
  - modules/textdetector: @register_textdetectors('my-detector')
  - modules/ocr: @register_OCR('my-ocr')
  - modules/inpaint: @register_inpainter('my-inpainter')
  - modules/translators: @register_translator('my-translator')
2. The BaseModule Base Class
All modules must inherit from a base class corresponding to their type (TextDetectorBase, OCRBase, etc.), which in turn inherit from modules.base.BaseModule. This class provides a common interface and functionality.
Key Class Attributes:
- params (dict): A dictionary defining the module's user-configurable parameters.
- download_file_list (list): A list of dictionaries describing files (e.g., models) that need to be downloaded for the module to work. The system automatically handles the download and verification process.
- _load_model_keys (set): A set of attribute names that store large objects (models). This helps the system manage memory by loading and unloading models as needed.
Core Methods to Implement:
- _load_model(): This is where you should load your heavy models and assign them to the attributes listed in _load_model_keys. It is called automatically the first time the module is used.
- unload_model(): The system manages unloading automatically via _load_model_keys, but you can override this method for more complex cleanup logic.
3. Inheriting from Specialized Base Classes
Each module type has its own base class that inherits from BaseModule and defines the primary method you must implement.
How to Create Your Module: Step-by-Step Guides
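As a bridge from the key concepts to the individual guides, the BaseModule contract can be pictured roughly as follows. This is a simplified stand-in written for illustration, not the application's `modules.base.BaseModule`: the flat `params` layout, the `load_model()` wrapper, and the `all_models_loaded()` helper are assumptions.

```python
from typing import Any, Dict, List, Optional, Set


class BaseModule:
    """Simplified, hypothetical stand-in for modules.base.BaseModule."""

    params: Optional[Dict[str, Any]] = None          # user-configurable settings
    download_file_list: Optional[List[dict]] = None  # files to fetch (unused here)
    _load_model_keys: Optional[Set[str]] = None      # attributes holding heavy objects

    def __init__(self, **params: Any) -> None:
        # Copy the class-level defaults so instances do not share state.
        self.params = dict(self.params or {})
        self.params.update(params)
        for key in self._load_model_keys or ():
            setattr(self, key, None)

    def all_models_loaded(self) -> bool:
        return all(getattr(self, k) is not None for k in self._load_model_keys or ())

    def load_model(self) -> None:
        # Assumed wrapper: load lazily, only if something is still missing.
        if not self.all_models_loaded():
            self._load_model()

    def _load_model(self) -> None:
        raise NotImplementedError

    def unload_model(self) -> None:
        # Dropping the references lets the memory be reclaimed.
        for key in self._load_model_keys or ():
            setattr(self, key, None)


class MyModule(BaseModule):
    params = {"threshold": 128}   # assumed flat layout
    _load_model_keys = {"model"}

    def _load_model(self) -> None:
        self.model = object()     # placeholder for an expensive model load


module = MyModule(threshold=200)
module.load_model()    # first use triggers _load_model()
module.unload_model()  # the model attribute is reset to None
```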
1. Creating a Custom Text Detector
A text detector scans an image for text blocks and returns their mask and coordinates.
- Create the file modules/textdetector/my_detector.py.
- Inherit from TextDetectorBase.
- Apply the @register_textdetectors('my-detector') decorator.
- Implement the _detect(self, img, proj) method.
  Input:
  - img (np.ndarray): The image in BGR format (3 channels).
  - proj (ProjImgTrans): A context object (for advanced use).
  Returns Tuple[np.ndarray, List[TextBlock]]:
  - mask (np.ndarray): A single-channel mask (0-255), where white areas denote detected text.
  - blk_list (List[TextBlock]): A list of TextBlock objects, one for each detected text region.
Example (A simple detector based on thresholding):
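A minimal sketch of such a detector, using NumPy only. The `TextBlock`, `TextDetectorBase`, and `register_textdetectors` definitions below are hypothetical stand-ins so the snippet runs on its own; in a real module you would import them from the application. The `xyxy` field and the flat `params` layout are assumptions as well.

```python
from dataclasses import dataclass
from typing import List, Tuple

import numpy as np


# --- Hypothetical stand-ins for the application's own classes -------------
@dataclass
class TextBlock:
    xyxy: List[int]  # assumed field: [x1, y1, x2, y2] bounding box
    text: str = ""


class TextDetectorBase:
    params: dict = {}


def register_textdetectors(name):
    def decorator(cls):
        return cls
    return decorator
# ---------------------------------------------------------------------------


@register_textdetectors("my-detector")
class ThresholdDetector(TextDetectorBase):
    params = {"threshold": 128}  # assumed flat layout

    def _detect(self, img: np.ndarray, proj=None) -> Tuple[np.ndarray, List[TextBlock]]:
        # Grayscale via the channel mean; dark pixels are treated as text.
        gray = img.mean(axis=2)
        mask = np.where(gray < self.params["threshold"], 255, 0).astype(np.uint8)

        ys, xs = np.nonzero(mask)
        if xs.size == 0:
            return mask, []
        # One block spanning every dark pixel. A real detector would split
        # the mask into connected components instead.
        box = [int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1]
        return mask, [TextBlock(xyxy=box)]


page = np.full((40, 60, 3), 255, dtype=np.uint8)  # white BGR image
page[10:20, 15:45] = 0                            # a dark "text" rectangle
mask, blk_list = ThresholdDetector()._detect(page)
```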
2. Creating a Custom OCR Module
An OCR module takes an image and a list of text blocks and fills them with recognized text.
- Create the file modules/ocr/my_ocr.py.
- Inherit from OCRBase.
- Apply the @register_OCR('my-ocr') decorator.
- Implement the _ocr_blk_list(self, img, blk_list) method.
  Input:
  - img (np.ndarray): The original image in RGB format.
  - blk_list (List[TextBlock]): A list of text blocks to recognize.
  For each TextBlock, crop the corresponding region from img, recognize the text, and write it to blk.text.
Example (A dummy OCR that returns coordinates):
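A minimal sketch of a dummy OCR along these lines. `TextBlock`, `OCRBase`, and `register_OCR` are hypothetical stand-ins defined here so the snippet is self-contained; the `xyxy` bounding-box field is an assumption about how a block stores its coordinates.

```python
from dataclasses import dataclass
from typing import List

import numpy as np


# --- Hypothetical stand-ins for the application's own classes -------------
@dataclass
class TextBlock:
    xyxy: List[int]  # assumed field: [x1, y1, x2, y2] bounding box
    text: str = ""


class OCRBase:
    pass


def register_OCR(name):
    def decorator(cls):
        return cls
    return decorator
# ---------------------------------------------------------------------------


@register_OCR("my-ocr")
class DummyOCR(OCRBase):
    def _ocr_blk_list(self, img: np.ndarray, blk_list: List[TextBlock]) -> None:
        for blk in blk_list:
            x1, y1, x2, y2 = blk.xyxy
            region = img[y1:y2, x1:x2]  # the crop a real recognizer would read
            height, width = region.shape[:2]
            # Instead of recognized text, store the position and crop size.
            blk.text = f"({x1}, {y1}) {width}x{height}"


image = np.zeros((50, 50, 3), dtype=np.uint8)
blocks = [TextBlock(xyxy=[5, 10, 25, 30])]
DummyOCR()._ocr_blk_list(image, blocks)
```

Note that the method writes its result into each block in place rather than returning a new list.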
3. Creating a Custom Inpainter Module
An Inpainter takes an image and a text mask, then fills in the text regions so they can be replaced with translated text.
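To make the input and output contract concrete, here is a dependency-light sketch that fills masked pixels with the mean color of the unmasked ones. This mean fill is a deliberate simplification chosen so the snippet needs only NumPy; a real module would more likely call something like `cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)`. `InpainterBase` and `register_inpainter` are hypothetical stand-ins.

```python
import numpy as np


# --- Hypothetical stand-ins for the application's own classes -------------
class InpainterBase:
    pass


def register_inpainter(name):
    def decorator(cls):
        return cls
    return decorator
# ---------------------------------------------------------------------------


@register_inpainter("my-inpainter")
class MeanFillInpainter(InpainterBase):
    def _inpaint(self, img: np.ndarray, mask: np.ndarray, textblock_list=None) -> np.ndarray:
        result = img.copy()           # never modify the caller's image
        text_pixels = mask > 127      # white areas of the mask mark text
        if text_pixels.any() and not text_pixels.all():
            # Mean color of every pixel that is NOT text, one value per channel.
            fill = img[~text_pixels].mean(axis=0)
            result[text_pixels] = fill.astype(img.dtype)
        return result


image = np.full((20, 20, 3), 200, dtype=np.uint8)  # light gray page
image[5:10, 5:15] = 0                              # dark "text"
text_mask = np.zeros((20, 20), dtype=np.uint8)
text_mask[5:10, 5:15] = 255
cleaned = MeanFillInpainter()._inpaint(image, text_mask)
```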
- Create the file modules/inpaint/my_inpainter.py.
- Inherit from InpainterBase.
- Apply the @register_inpainter('my-inpainter') decorator.
- Implement the _inpaint(self, img, mask, textblock_list) method.
  Input:
  - img (np.ndarray): The RGB image to be processed.
  - mask (np.ndarray): A single-channel mask where white areas correspond to text that needs to be inpainted.
  - textblock_list (List[TextBlock]): A list of text blocks (can be None).
  Returns: np.ndarray - the image with text inpainted.
Example (Using cv2.inpaint):
4. Creating a Custom Translator Module
A Translator takes a list of strings and returns a list of translated strings.
- Create the file modules/translators/my_translator.py.
- Inherit from BaseTranslator.
- Apply the @register_translator('my-translator') decorator.
- Implement _setup_translator(). Here, you must populate self.lang_map with the languages your API supports and their corresponding codes.
- Implement _translate(self, src_list).
  Input: src_list (List[str]) - a list of strings to translate.
  Returns: List[str] - a list of translated strings. The length must match src_list.
Example (A "translator" that reverses strings):
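A minimal sketch of such a translator. `BaseTranslator` and `register_translator` below are hypothetical stand-ins so the snippet runs on its own; in particular, the constructor arguments and the public `translate()` wrapper with its length check are assumptions, not the application's actual base class.

```python
from typing import Dict, List


# --- Hypothetical stand-ins for the application's own classes -------------
class BaseTranslator:
    def __init__(self, lang_source: str, lang_target: str) -> None:
        self.lang_map: Dict[str, str] = {}
        self._setup_translator()
        for lang in (lang_source, lang_target):
            if lang not in self.lang_map:
                raise ValueError(f"unsupported language: {lang}")
        self.lang_source = lang_source
        self.lang_target = lang_target

    def translate(self, src_list: List[str]) -> List[str]:
        result = self._translate(src_list)
        if len(result) != len(src_list):
            raise ValueError("translation count must match the input count")
        return result


def register_translator(name):
    def decorator(cls):
        return cls
    return decorator
# ---------------------------------------------------------------------------


@register_translator("my-translator")
class ReverseTranslator(BaseTranslator):
    def _setup_translator(self) -> None:
        # Language name shown to the user -> code the (imaginary) API expects.
        self.lang_map["English"] = "en"
        self.lang_map["Japanese"] = "ja"

    def _translate(self, src_list: List[str]) -> List[str]:
        # One output string per input string, in the same order.
        return [text[::-1] for text in src_list]


translator = ReverseTranslator("Japanese", "English")
translated = translator.translate(["abc", "hello"])
```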
Best Practices: On-Demand Client Initialization
When creating modules that interact with external APIs or load heavy models (e.g., OpenAI, httpx.Client, GPU models), it is strongly recommended not to create the client instance within the __init__ or _setup_translator methods.
The Problem: Premature Initialization
Initializing a client in the constructor (__init__) can lead to several issues, such as a client configured with parameters the user has since changed and resources spent on a client that may never be used.
The Solution: Lazy Initialization
The correct approach is to create the client only at the moment it is actually needed. This ensures that the most current parameters are used and that resources are not wasted.
How to Implement It:
- In the __init__ method, set the client attribute to None.
- Create a separate method (e.g., _initialize_client) that handles the creation and configuration of the client instance, using the current values from self.params.
- In the main method (_translate, ocr, etc.), add a check: if the client has not been created yet (self.client is None), call your initialization method.
- In the updateParam method, reset the client to None whenever a key parameter (API key, endpoint, proxy) is changed. This forces the system to re-create the client with the new settings on the next call.
Example: Refactoring llm_api_translator
Here is how this pattern is applied in the LLM_OCR module (file ocr\ocr_llm_api.py):
By following this pattern, you will make your modules more efficient, robust, and responsive to user configuration changes.
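The lazy-initialization steps described above can be sketched as follows. This is not the code of the LLM_OCR or llm_api_translator modules; `FakeClient`, `LazyTranslator`, the placeholder endpoint, and the `updateParam(param_key, param_content)` signature are assumptions made so the pattern can be shown in a self-contained way.

```python
from typing import Any, List, Optional


class FakeClient:
    """Stands in for a real API client such as OpenAI or httpx.Client."""

    def __init__(self, api_key: str, endpoint: str) -> None:
        self.api_key = api_key
        self.endpoint = endpoint

    def complete(self, text: str) -> str:
        return f"[{self.endpoint}] {text}"


class LazyTranslator:
    # Parameters whose change makes an existing client stale.
    CLIENT_KEYS = {"api_key", "endpoint", "proxy"}

    def __init__(self, **params: Any) -> None:
        self.params = {"api_key": "", "endpoint": "https://example.invalid", "proxy": ""}
        self.params.update(params)
        self.client: Optional[FakeClient] = None  # nothing is created yet

    def _initialize_client(self) -> None:
        # Built from the current parameter values, not from a startup snapshot.
        self.client = FakeClient(self.params["api_key"], self.params["endpoint"])

    def _translate(self, src_list: List[str]) -> List[str]:
        if self.client is None:  # create the client on first real use
            self._initialize_client()
        return [self.client.complete(text) for text in src_list]

    def updateParam(self, param_key: str, param_content: Any) -> None:
        self.params[param_key] = param_content
        if param_key in self.CLIENT_KEYS:
            self.client = None  # force re-creation on the next call


translator = LazyTranslator(api_key="old-key")
translator._translate(["hello"])             # client created here
translator.updateParam("api_key", "new-key") # client discarded
translator._translate(["hello"])             # client re-created with new-key
```

Resetting to `None` instead of rebuilding the client inside `updateParam` keeps parameter edits cheap: several settings can change in a row and only one new client is created, on the next actual request.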
Gemini did almost all the work on the preparation of the documentation. Thanks to him)
Version for vibe coding. Warning: half-baked modules that were never even tested, and were submitted only so that we can check them later, will be rejected immediately. Spend time on debugging and testing.