# Label Studio Annotation Guide

This guide describes how to use Label Studio to create annotated datasets for training Object Detection models used by LogicalDOC.

## Installing Label Studio

Refer to the official Label Studio installation documentation.

## Starting Label Studio

To enable the use of local files, the following environment variables must be configured:

* LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED
* LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT

Example:

```bash
# Allow Label Studio to serve files from the local file system
export LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
# Root directory under which local files may be referenced
export LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/path/to/documents

label-studio start
```

## Creating a Project

1. Create a new project.
2. Configure the labeling interface.
3. Define the labels that will be used during annotation.

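Steps 2 and 3 come down to writing a labeling configuration in Label Studio's XML format. The following is a minimal sketch, assuming bounding-box annotation with the `Image` and `RectangleLabels` tags; the helper name `build_labeling_config`, the control names `image` and `label`, and the sample label list are illustrative choices, not settings prescribed by this guide.

```python
import xml.etree.ElementTree as ET


def build_labeling_config(labels):
    """Return a Label Studio labeling config (XML string) for bounding boxes.

    Hypothetical helper: the control names "image" and "label" are assumptions.
    """
    view = ET.Element("View")
    # "$image" refers to the "image" key in the data of each imported task.
    ET.SubElement(view, "Image", name="image", value="$image")
    boxes = ET.SubElement(view, "RectangleLabels", name="label", toName="image")
    for label in labels:
        ET.SubElement(boxes, "Label", value=label)
    # Pretty-print when available (ET.indent exists from Python 3.9 onward).
    if hasattr(ET, "indent"):
        ET.indent(view)
    return ET.tostring(view, encoding="unicode")


config = build_labeling_config(["Invoice Number", "Date", "Total Amount"])
print(config)
```

The printed XML can be pasted into the code view of the labeling interface settings. Generating it from a list keeps the label names identical across projects.
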
## Importing Data

Label Studio supports multiple import methods.

For large projects, importing media files directly through the web interface is not recommended. Instead, use local storage references.

When importing document images, select **Files** as the import method.

Unlike CVAT, Label Studio creates one task for each imported document.

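A local storage reference is a task whose data field points at a file served from the document root instead of an uploaded copy. The sketch below builds such a task list, assuming the `/data/local-files/?d=` URL prefix described in the Label Studio local storage documentation and a data key named `image`; both should be checked against the installed version and the project's labeling configuration. The file names and the helper `build_tasks` are hypothetical.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import quote


def build_tasks(relative_paths):
    """Build import tasks that reference locally served files.

    Each path is assumed to be relative to
    LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT.
    """
    tasks = []
    for rel in relative_paths:
        # quote() keeps "/" intact and escapes spaces and other special characters.
        url = "/data/local-files/?d=" + quote(str(PurePosixPath(rel)))
        tasks.append({"data": {"image": url}})
    return tasks


tasks = build_tasks(["invoices/0001.png", "invoices/0002.png"])
print(json.dumps(tasks, indent=2))
```

Each entry in the resulting JSON list becomes one task, which matches the one-task-per-document behavior described above.
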
## Annotating Documents

1. Open a task.
2. Select the desired label.
3. Draw a bounding box around the target document element.
4. Save the annotation.

Typical labels may include:

* Invoice Number
* Date
* Seller Name
* Buyer Name
* Total Amount

## Annotation Example

[Insert screenshots here]

## Exporting the Dataset

After the annotation process is completed, export the project dataset.

Label Studio supports multiple export formats, including:

* YOLO
* COCO
* Pascal VOC
* CSV

For YOLO training, the YOLO export format is recommended.

## Dataset Formats

### COCO Format

COCO is a JSON-based dataset format commonly used for object detection datasets.

Reference: https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/md-coco-overview.html

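To make the JSON structure concrete, here is a minimal sketch of a COCO-style dataset held as a Python dictionary, with a helper that groups boxes by image. The file name, image size, and box values are made-up sample data, and `boxes_by_image` is a hypothetical helper; only the `images`, `categories`, and `annotations` sections and the `[x_min, y_min, width, height]` pixel layout of `bbox` come from the COCO format itself.

```python
# Made-up sample data in COCO layout.
coco = {
    "images": [
        {"id": 1, "file_name": "invoice_0001.png", "width": 1000, "height": 1400}
    ],
    "categories": [
        {"id": 1, "name": "Invoice Number"},
        {"id": 2, "name": "Total Amount"},
    ],
    "annotations": [
        # bbox is [x_min, y_min, width, height] in pixels.
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [700, 100, 200, 40]},
        {"id": 2, "image_id": 1, "category_id": 2, "bbox": [650, 1200, 250, 50]},
    ],
}


def boxes_by_image(dataset):
    """Group (label name, bbox) pairs under the file name of their image."""
    names = {c["id"]: c["name"] for c in dataset["categories"]}
    files = {i["id"]: i["file_name"] for i in dataset["images"]}
    grouped = {}
    for ann in dataset["annotations"]:
        key = files[ann["image_id"]]
        grouped.setdefault(key, []).append((names[ann["category_id"]], ann["bbox"]))
    return grouped


grouped = boxes_by_image(coco)
for file_name, boxes in grouped.items():
    print(file_name, boxes)
```
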
### YOLO Format

YOLO datasets consist of images and text annotation files organized according to a predefined directory structure. Each image has a matching text file in which every line describes one box: a class index followed by the normalized center x, center y, width, and height.

Reference: https://docs.cvat.ai/docs/dataset_management/formats/format-yolo/

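The sketch below shows how one YOLO annotation line relates to a pixel box, assuming the common `class x_center y_center width height` layout with values normalized to the image size. The function names and the sample box are illustrative only.

```python
def pixel_box_to_yolo(class_id, x_min, y_min, width, height, img_w, img_h):
    """Format a pixel box (top-left corner plus size) as a YOLO label line."""
    x_center = (x_min + width / 2) / img_w
    y_center = (y_min + height / 2) / img_h
    return (
        f"{class_id} {x_center:.6f} {y_center:.6f} "
        f"{width / img_w:.6f} {height / img_h:.6f}"
    )


def yolo_to_pixel_box(line, img_w, img_h):
    """Parse a YOLO label line back into (class_id, x_min, y_min, width, height)."""
    class_id, x_center, y_center, w, h = line.split()
    w_px = float(w) * img_w
    h_px = float(h) * img_h
    x_min = float(x_center) * img_w - w_px / 2
    y_min = float(y_center) * img_h - h_px / 2
    return int(class_id), x_min, y_min, w_px, h_px


# Sample: a 200x40 px box at (700, 100) on a 1000x1400 px page.
line = pixel_box_to_yolo(0, 700, 100, 200, 40, img_w=1000, img_h=1400)
print(line)
print(yolo_to_pixel_box(line, img_w=1000, img_h=1400))
```

Because the values are stored with six decimals, the round trip recovers the pixel box only approximately, which is normally well below one pixel.
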
### YOLOv8 OBB

YOLOv8 OBB (Oriented Bounding Boxes) extends the standard YOLO format by supporting rotated bounding boxes through eight normalized coordinates, that is, the x and y values of the four box corners.

This format is useful when document elements are not aligned horizontally, for example on rotated or skewed pages.

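As an illustration of the eight-coordinate layout, the following sketch parses one OBB label line into pixel corner points. It assumes the line layout `class x1 y1 x2 y2 x3 y3 x4 y4` with values normalized to the image size; the sample line and the helper name `parse_obb_line` are hypothetical.

```python
def parse_obb_line(line, img_w, img_h):
    """Parse an OBB label line into (class_id, [four (x, y) corners in pixels])."""
    parts = line.split()
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields (class + 8 coordinates), got {len(parts)}")
    class_id = int(parts[0])
    coords = [float(value) for value in parts[1:]]
    # Pair up consecutive x, y values and scale them to pixel units.
    corners = [
        (coords[i] * img_w, coords[i + 1] * img_h) for i in range(0, 8, 2)
    ]
    return class_id, corners


# Made-up sample line for a tilted box on a 1000x1400 px page.
class_id, corners = parse_obb_line(
    "1 0.25 0.10 0.75 0.20 0.70 0.30 0.20 0.20", img_w=1000, img_h=1400
)
print(class_id, corners)
```

Rejecting lines that do not have exactly nine fields helps catch the case where a standard five-field YOLO export is mixed into an OBB dataset.
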