Label Studio Guide

From LogicalDOC Community Wiki
Revision as of 06:59, 24 June 2026 by Giuseppe (talk | contribs)

Label Studio

This guide explains how to create an annotated dataset for YOLO training using Label Studio.

Install Label Studio using pip:

pip install label-studio

Verify the installation:

python -m label_studio.server --help

Or refer to the official installation guide: https://labelstud.io/guide/install
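
As an additional check, the installed version can be read from Python's package metadata. This is a minimal sketch that assumes the package was installed under the distribution name label-studio, as in the pip command above; the helper name label_studio_version is illustrative only.

```python
from importlib.metadata import PackageNotFoundError, version


def label_studio_version():
    """Return the installed label-studio version string, or None if it is not installed."""
    try:
        return version("label-studio")
    except PackageNotFoundError:
        return None


installed = label_studio_version()
print(installed if installed else "label-studio is not installed in this environment")
```

Run this with the same Python interpreter that was used for pip install; a different interpreter or virtual environment may report that the package is missing.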


Enable Local File Storage

For large projects, uploading images directly through the Label Studio interface is not recommended. Instead, configure a local directory that contains the images to annotate and let Label Studio read the files from there.

To enable local file access, configure the following environment variables before starting Label Studio:

LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/path/to/images



Starting Label Studio

Label Studio can be started using one of the following methods.


Default Startup

If local file storage is not required, Label Studio can be started with the default configuration:

label-studio start

or

python -m label_studio.server start

By default, the application is available at:

http://localhost:8080

Startup with Local File Storage

When working with large datasets, it is recommended to configure Local Storage so that images are accessed directly from the filesystem.

Windows example:

set LABEL_STUDIO_PORT=8081
set LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
set LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=C:\Users\username\Documents\label-studio

python -m label_studio.server start

After startup, Label Studio will be available at:

http://localhost:8081

The directory specified by LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT can then be configured as Local Storage within a Label Studio project.

Port 8081 is used to avoid conflicts with the default LogicalDOC installation, which typically runs on port 8080.
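
The same startup can be scripted so that it works on both Windows and Linux. The following is a minimal sketch, not an official launcher: the function names build_env and start_label_studio are illustrative, and it assumes the label-studio command is on the PATH. It sets the three variables from the Windows example and then runs label-studio start.

```python
import os
import subprocess


def build_env(document_root, port=8081, base_env=None):
    """Return a copy of the environment with the Label Studio variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["LABEL_STUDIO_PORT"] = str(port)
    env["LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED"] = "true"
    # Label Studio expects an absolute path for the document root.
    env["LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT"] = os.path.abspath(document_root)
    return env


def start_label_studio(document_root, port=8081):
    """Start Label Studio in the foreground; blocks until the server exits."""
    command = ["label-studio", "start"]
    return subprocess.run(command, env=build_env(document_root, port))
```

For example, start_label_studio(r"C:\Users\username\Documents\label-studio") would start the server on port 8081 with that directory as the document root. Because the variables are passed only to the child process, the surrounding shell environment is left unchanged.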

Create a Project

  1. Log in to Label Studio
  2. Click Create Project
  3. Enter a project name
  4. Configure the labeling interface
  5. Save the project
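
For step 4, bounding-box annotation uses a labeling configuration built from the Image and RectangleLabels tags. The sketch below generates such a configuration so it can be pasted into the Code view of the labeling interface settings. The label names (invoice_number, date, total) and the function name build_bbox_config are placeholders chosen for illustration, not values from this guide.

```python
import xml.etree.ElementTree as ET


def build_bbox_config(labels):
    """Build a Label Studio labeling config for rectangle (bounding box) labels."""
    view = ET.Element("View")
    # "$image" refers to the task field that holds the image URL.
    ET.SubElement(view, "Image", name="image", value="$image")
    rect = ET.SubElement(view, "RectangleLabels", name="label", toName="image")
    for label in labels:
        ET.SubElement(rect, "Label", value=label)
    ET.indent(view)  # pretty-print; requires Python 3.9 or later
    return ET.tostring(view, encoding="unicode")


# Hypothetical label names; replace them with the classes of your own dataset.
print(build_bbox_config(["invoice_number", "date", "total"]))
```

The order of the Label elements matters for later training, because the exported YOLO class indices are derived from the set of labels defined here.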


Import Images through Local Storage

  1. Open the project
  2. Navigate to Settings > Cloud Storage
  3. Click Add Source Storage
  4. Select Local Files
  5. In Absolute local path, enter the directory specified by LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT, or a subdirectory of it
  6. Click Sync Storage


[Screenshots: Add Source Storage selection; Local Storage selection; Local Storage configuration]


When importing images, choose Files as the import method.


[Screenshot: Label Studio file type selection (Label-studio-importing-file.png)]

After synchronization, Label Studio automatically creates one task for each imported document image.


Annotate Documents

  1. Open a task
  2. Select a label
  3. Draw a bounding box around the target area
  4. Save the annotation

[Screenshot: Label Studio annotation example]

Export the Dataset

  1. Open the project
  2. Click Export
  3. Select the desired format
[Screenshots: Export button selection; Export data selection]

Supported formats include:

  • YOLO
  • COCO
  • Pascal VOC
  • CSV

For YOLO training, export the dataset in YOLO format, or YOLO with Images.


KNOWN ISSUE

Even when selecting the YOLO with Images export format, Label Studio exports only the annotation (.txt) files. The corresponding images are not included in the exported archive. This means that the images must be copied manually from the source image directory into the appropriate dataset folders before starting the training process.
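
The manual copy can be automated. The following is a minimal sketch under two assumptions that should be checked against your own export: the annotation files are in a labels folder, and each .txt file has the same base name as its image. The function name and the folder names in the usage note are illustrative.

```python
import shutil
from pathlib import Path

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff")


def copy_images_for_labels(labels_dir, source_dir, images_dir):
    """Copy the image matching each .txt label file; return (copied, missing) name lists."""
    labels_dir, source_dir, images_dir = Path(labels_dir), Path(source_dir), Path(images_dir)
    images_dir.mkdir(parents=True, exist_ok=True)
    # Index the source images by base name (file name without extension).
    by_stem = {
        p.stem: p
        for p in source_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    }
    copied, missing = [], []
    for label_file in sorted(labels_dir.glob("*.txt")):
        image = by_stem.get(label_file.stem)
        if image is None:
            missing.append(label_file.stem)  # no image found for this annotation
            continue
        shutil.copy2(image, images_dir / image.name)
        copied.append(label_file.stem)
    return copied, missing
```

A call such as copy_images_for_labels("export/labels", "/path/to/images", "export/images") returns the names that were copied and the names for which no image was found, so missing files can be investigated before training starts.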


Dataset Formats

COCO

COCO is a JSON-based dataset format commonly used for object detection datasets. It stores image metadata, categories, and annotations (including bounding boxes) in a single JSON file.

More information: https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/md-coco-overview.html
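
To illustrate the structure, here is a minimal sketch of a COCO-style document and a conversion of its bounding box to YOLO values. COCO stores a box as [x_min, y_min, width, height] in pixels. The file name, category, and numbers are invented for the example.

```python
def coco_bbox_to_yolo(bbox, image_width, image_height):
    """Convert a COCO [x_min, y_min, width, height] pixel box to normalized YOLO values."""
    x_min, y_min, width, height = bbox
    return (
        (x_min + width / 2) / image_width,    # normalized center x
        (y_min + height / 2) / image_height,  # normalized center y
        width / image_width,                  # normalized width
        height / image_height,                # normalized height
    )


# Invented sample data showing the three main sections of a COCO file.
coco = {
    "images": [{"id": 1, "file_name": "page_001.png", "width": 1000, "height": 500}],
    "categories": [{"id": 0, "name": "signature"}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 0, "bbox": [100, 50, 200, 100]}],
}

image = coco["images"][0]
for ann in coco["annotations"]:
    print(ann["category_id"], *coco_bbox_to_yolo(ann["bbox"], image["width"], image["height"]))
```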

Pascal VOC XML

Pascal VOC is an XML-based dataset format widely used in object detection tasks. Each image has a corresponding XML file containing metadata such as image dimensions, object classes, and bounding box coordinates.

More information: https://roboflow.com/formats/pascal-voc-xml
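
As an illustration, the sketch below reads object names and boxes from a Pascal VOC annotation using the Python standard library. The XML content is an invented sample and the function name read_voc_boxes is illustrative. VOC boxes are given as xmin, ymin, xmax, ymax in pixels.

```python
import xml.etree.ElementTree as ET

# Invented sample annotation for a single image.
VOC_XML = """<annotation>
  <filename>page_001.png</filename>
  <size><width>1000</width><height>500</height><depth>3</depth></size>
  <object>
    <name>signature</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""


def read_voc_boxes(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples in pixels."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"),) + coords)
    return boxes


print(read_voc_boxes(VOC_XML))
```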

YOLO

YOLO datasets consist of images and text annotation files organized according to a predefined directory structure. Each image has a corresponding text file with one line per object: the class index followed by the bounding box center x, center y, width, and height, all normalized to the range 0 to 1 relative to the image size.

More information: https://docs.cvat.ai/docs/dataset_management/formats/format-yolo/
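
To make the annotation line format concrete, this minimal sketch converts one YOLO line back to pixel corner coordinates. The sample line and the image size are invented, and the function name is illustrative.

```python
def yolo_line_to_pixels(line, image_width, image_height):
    """Convert 'class cx cy w h' (normalized) to (class_id, (x_min, y_min, x_max, y_max))."""
    class_id, cx, cy, w, h = line.split()
    cx, w = float(cx) * image_width, float(w) * image_width
    cy, h = float(cy) * image_height, float(h) * image_height
    return int(class_id), (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Class 0, centered in a 640x480 image, a quarter of the width and half of the height.
print(yolo_line_to_pixels("0 0.5 0.5 0.25 0.5", 640, 480))
```

This kind of conversion is useful for spot-checking a few exported files against the original images before training.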

YOLOv8 OBB

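YOLOv8 OBB (Oriented Bounding Boxes) extends the standard YOLO format by supporting rotated bounding boxes. Each box is described by eight normalized coordinates, the x and y values of its four corner points, instead of the four values used for an axis-aligned box. This format is useful when objects are rotated relative to the image axes.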
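
A minimal sketch of reading one OBB annotation line follows, assuming the layout class x1 y1 x2 y2 x3 y3 x4 y4 with normalized values. The sample line and the image size are invented, and the function name is illustrative.

```python
def obb_line_to_pixels(line, image_width, image_height):
    """Convert 'class x1 y1 x2 y2 x3 y3 x4 y4' (normalized) to (class_id, [corner, ...])."""
    parts = line.split()
    class_id = int(parts[0])
    coords = [float(value) for value in parts[1:]]
    if len(coords) != 8:
        raise ValueError("expected 8 coordinates, got %d" % len(coords))
    # Pair up the values as (x, y) corner points and scale them to pixels.
    corners = [
        (coords[i] * image_width, coords[i + 1] * image_height)
        for i in range(0, 8, 2)
    ]
    return class_id, corners


# A square rotated by 45 degrees (a diamond) in a 400x400 image.
print(obb_line_to_pixels("1 0.5 0.25 0.75 0.5 0.5 0.75 0.25 0.5", 400, 400))
```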