Label Studio Guide
Latest revision as of 07:46, 26 June 2026
This guide explains how to create an annotated dataset for YOLO training using Label Studio.
Advice: This guide describes an example workflow for training a custom YOLO model and preparing it for use with LogicalDOC. Please be aware that this procedure is not covered by the standard support contract. LogicalDOC cannot provide assistance with issues related to dataset preparation, training failures, model quality, GPU configuration, or third-party tools such as Label Studio, Ultralytics YOLO, or ONNX Runtime.
If you require professional assistance, please contact sales@logicaldoc.com to request a quotation for consulting services.
Install Label Studio
Label Studio requires Python 3.10 or later.
Install Label Studio using pip:
pip install label-studio
Verify the installation:
python -m label_studio.server --help
For additional installation options, refer to the official documentation:
https://labelstud.io/guide/install
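The Python version requirement stated in this section can be checked before running pip. The following is a minimal sketch; the helper name meets_minimum is hypothetical and not part of Label Studio.

```python
import sys

# Minimum Python version stated in this guide.
MINIMUM = (3, 10)


def meets_minimum(version=None, minimum=MINIMUM):
    """Return True if a (major, minor, ...) version tuple satisfies the minimum."""
    if version is None:
        version = sys.version_info
    # Compare only major and minor, e.g. (3, 11) >= (3, 10).
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    status = "OK" if meets_minimum() else "too old for Label Studio"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```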
Start Label Studio
Label Studio can be started in one of the following ways.
Default Startup
For small datasets or proof-of-concept projects, Label Studio can be started with the default configuration:
label-studio start
or
python -m label_studio.server start
By default, the application is available at:
http://localhost:8080
Start Label Studio with Local File Storage Enabled
For larger datasets, it is recommended to use Local Storage so that images are accessed directly from the filesystem instead of being uploaded through the Label Studio interface.
To enable support for Local Storage, configure the following environment variables before starting Label Studio:
LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/path/to/images
Example (Windows):
set LABEL_STUDIO_PORT=8081
set LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
set LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=C:\Users\username\Documents\label-studio
python -m label_studio.server start
After startup, Label Studio will be available at:
http://localhost:8081
The value of LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT defines the root directory from which Label Studio is allowed to access local files. The Local Storage itself will be configured later when creating the project.
Port 8081 is used in this example to avoid conflicts with the default LogicalDOC installation, which typically runs on port 8080.
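The same startup can also be scripted in a platform-independent way. The sketch below only sets the three environment variables shown in this section and then runs the start command from this guide; the function names build_env and start_label_studio are hypothetical helpers, not part of Label Studio.

```python
import os
import subprocess
import sys


def build_env(document_root, port=8081, base_env=None):
    """Return a copy of the environment extended with the Label Studio settings."""
    env = dict(os.environ if base_env is None else base_env)
    env["LABEL_STUDIO_PORT"] = str(port)
    env["LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED"] = "true"
    env["LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT"] = str(document_root)
    return env


def start_label_studio(document_root, port=8081):
    """Run the start command from this guide with the extended environment."""
    return subprocess.run(
        [sys.executable, "-m", "label_studio.server", "start"],
        env=build_env(document_root, port),
        check=False,  # leave error handling to the caller
    )
```

For example, start_label_studio(r"C:\Users\username\Documents\label-studio") would correspond to the Windows example above. The call blocks until Label Studio is stopped.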
Create a Project
- Log in to Label Studio
- Click Create Project
- Enter a project name
- Configure the labeling interface
- Save the project
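For bounding-box annotation, the labeling interface is typically an XML configuration built from the View, Image, RectangleLabels, and Label tags. The sketch below generates such a configuration so it can be pasted into the labeling interface editor. The label names "Signature" and "Stamp" are placeholders, and the helper build_labeling_config is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_labeling_config(labels):
    """Build a bounding-box labeling configuration for the given label names."""
    view = ET.Element("View")
    # "$image" refers to the task field that holds the image location.
    ET.SubElement(view, "Image", {"name": "image", "value": "$image"})
    rect = ET.SubElement(
        view, "RectangleLabels", {"name": "label", "toName": "image"}
    )
    for label in labels:
        ET.SubElement(rect, "Label", {"value": label})
    ET.indent(view)  # pretty-print; available in Python 3.9 and later
    return ET.tostring(view, encoding="unicode")


if __name__ == "__main__":
    # Placeholder labels; replace them with the classes of your own dataset.
    print(build_labeling_config(["Signature", "Stamp"]))
```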
Configure and Synchronize Local Storage
- Open the project
- Navigate to Settings > Cloud Storage
- Click Add Source Storage
- Select Local Files
- Configure the path specified by LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT
- Click Sync Storage
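Because Label Studio only serves files from the configured document root, it can help to confirm that the storage path lies inside that root before synchronizing. This is a minimal sketch with a hypothetical helper; it compares the paths purely as text and does not resolve symbolic links or ".." segments.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_within_root(path, root, windows=False):
    """Return True if path equals root or is located somewhere below it."""
    cls = PureWindowsPath if windows else PurePosixPath
    candidate, base = cls(path), cls(root)
    # parents lists every ancestor directory of the candidate path.
    return candidate == base or base in candidate.parents


if __name__ == "__main__":
    root = r"C:\Users\username\Documents\label-studio"
    print(is_within_root(root + r"\batch1", root, windows=True))
```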


When importing images, choose Files as the import method.

After synchronization, Label Studio automatically creates one task for each imported document image.
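As a sanity check, the number of tasks shown in the project can be compared with the number of image files in the storage directory. The sketch below counts them; the list of extensions is an assumption, and the recursive flag should be set only if subfolders were included in the storage configuration.

```python
from pathlib import Path

# Assumed set of image extensions; adjust it to the files in your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def count_images(folder, recursive=False):
    """Count image files in folder, optionally including subfolders."""
    folder = Path(folder)
    entries = folder.rglob("*") if recursive else folder.iterdir()
    return sum(
        1
        for entry in entries
        if entry.is_file() and entry.suffix.lower() in IMAGE_SUFFIXES
    )
```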

Annotate Documents
- Open a task
- Select a label
- Draw a bounding box around the target area
- Save the annotation
Example:

Export the Dataset
- Open the project
- Click Export
- Select the desired format


Supported formats include:
- YOLO
- COCO
- Pascal VOC
- CSV
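For YOLO training, export the dataset in either the YOLO or the YOLO with Images format.
Note: Known Issue: Even when selecting the YOLO with Images export format, Label Studio exports only the annotation (.txt) files. The corresponding images are not included in the exported archive. Before starting the training process, copy the original images manually into the appropriate dataset directories.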
For information about the expected dataset structure, see YOLO Training Pipeline.
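If an exported archive contains only the .txt annotation files, the original images have to be placed next to them before training. The sketch below copies, for every label file, the source image with the same base name. It assumes that label and image files share their base name, which may not hold for every export; the directory arguments and the extension list are placeholders.

```python
import shutil
from pathlib import Path

# Assumed image extensions; adjust as needed.
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".png")


def copy_matching_images(labels_dir, source_dir, images_dir):
    """Copy the image for each label file; return label files without an image."""
    labels_dir, source_dir = Path(labels_dir), Path(source_dir)
    images_dir = Path(images_dir)
    images_dir.mkdir(parents=True, exist_ok=True)
    # Index the source images by base name (file name without extension).
    by_stem = {
        p.stem: p
        for p in source_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }
    missing = []
    for label_file in sorted(labels_dir.glob("*.txt")):
        source = by_stem.get(label_file.stem)
        if source is None:
            missing.append(label_file.name)
        else:
            shutil.copy2(source, images_dir / source.name)
    return missing
```

The returned list names the label files for which no image was found, so they can be checked by hand.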
Dataset Formats
COCO
COCO is a JSON-based dataset format commonly used for object detection datasets. It stores images, categories, annotations, and bounding boxes in a single JSON file.
More information: https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/md-coco-overview.html
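To illustrate the structure, the following sketch builds a minimal COCO dataset as a Python dictionary and looks up the boxes of one image. File name, sizes, and category are made-up sample values. In COCO, bbox is given in pixels as [x, y, width, height], where x and y are the top-left corner.

```python
import json

# Minimal COCO structure with made-up sample values.
coco = {
    "images": [
        {"id": 1, "file_name": "invoice_001.png", "width": 1240, "height": 1754}
    ],
    "categories": [{"id": 1, "name": "signature"}],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [100.0, 200.0, 300.0, 80.0],  # x, y, width, height
            "area": 24000.0,
            "iscrowd": 0,
        }
    ],
}


def boxes_for_image(dataset, image_id):
    """Return (category name, bbox) pairs for one image."""
    names = {c["id"]: c["name"] for c in dataset["categories"]}
    return [
        (names[a["category_id"]], a["bbox"])
        for a in dataset["annotations"]
        if a["image_id"] == image_id
    ]


if __name__ == "__main__":
    print(json.dumps(coco, indent=2))
    print(boxes_for_image(coco, 1))
```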
Pascal VOC XML
Pascal VOC is an XML-based dataset format widely used in object detection tasks. Each image has a corresponding XML file containing metadata such as image dimensions, object classes, and bounding box coordinates.
More information: https://roboflow.com/formats/pascal-voc-xml
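The sketch below parses a small Pascal VOC annotation and extracts the image size and the boxes. The XML content is a made-up sample; VOC stores each box as pixel coordinates xmin, ymin, xmax, ymax.

```python
import xml.etree.ElementTree as ET

# Made-up sample annotation in Pascal VOC layout.
VOC_XML = """<annotation>
  <filename>invoice_001.png</filename>
  <size><width>1240</width><height>1754</height><depth>3</depth></size>
  <object>
    <name>signature</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>400</xmax><ymax>280</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return ((width, height), [(class name, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((obj.findtext("name"), coords))
    return (width, height), objects


if __name__ == "__main__":
    print(parse_voc(VOC_XML))
```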
YOLO
YOLO datasets consist of images and text annotation files organized according to a predefined directory structure. Each image has a corresponding text file containing the object class and normalized bounding box coordinates.
More information: https://docs.cvat.ai/docs/dataset_management/formats/format-yolo/
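Each line of a YOLO annotation file has the form "class x_center y_center width height", with all four values divided by the image width or height. The sketch below converts a pixel box into such a line; the function name and the six-decimal formatting are illustrative choices.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel box (corner coordinates) into a YOLO annotation line."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200 x 200 pixel box in a 1000 x 1000 pixel image.
    print(to_yolo_line(0, 100, 200, 300, 400, 1000, 1000))
```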
YOLOv8 OBB
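YOLOv8 OBB (Oriented Bounding Boxes) extends the standard YOLO format by supporting rotated bounding boxes. Each box is described by eight normalized coordinates (the x and y values of its four corner points) instead of the four values used by the standard format. This format is useful when objects are rotated rather than aligned with the image axes.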
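As an illustration, the sketch below turns a rotated rectangle (center, size, and angle in pixels and degrees) into an OBB annotation line of the form "class x1 y1 x2 y2 x3 y3 x4 y4". The input parameterization and the function name are assumptions made for this example.

```python
import math


def to_obb_line(class_id, cx, cy, w, h, angle_deg, img_w, img_h):
    """Convert a rotated rectangle into an OBB line with four normalized corners."""
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    values = []
    # Corner offsets relative to the center, before rotation.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = cx + dx * cos_a - dy * sin_a  # rotate, then shift to the center
        y = cy + dx * sin_a + dy * cos_a
        values += [x / img_w, y / img_h]  # normalize by the image size
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


if __name__ == "__main__":
    print(to_obb_line(0, 500, 500, 200, 100, 30, 1000, 1000))
```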