Label Studio Guide
This guide explains how to create an annotated dataset for YOLO training using Label Studio.
Install Label Studio
Label Studio requires Python 3.10 or later.
Install Label Studio using `pip`:
pip install label-studio
Verify the installation:
python -m label_studio.server --help
For additional installation options, refer to the official documentation:
https://labelstud.io/guide/install
Start Label Studio
Label Studio can be started in one of the following ways.
Default Startup
For small datasets or proof-of-concept projects, Label Studio can be started with the default configuration:
label-studio start
or
python -m label_studio.server start
By default, the application is available at:
http://localhost:8080
Start Label Studio with Local File Storage Enabled
For larger datasets, use Local Storage so that Label Studio reads images directly from the filesystem instead of requiring them to be uploaded through the Label Studio interface.
To enable support for Local Storage, configure the following environment variables before starting Label Studio:
LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/path/to/images
Example (Windows):
set LABEL_STUDIO_PORT=8081
set LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
set LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=C:\Users\username\Documents\label-studio
python -m label_studio.server start
After startup, Label Studio will be available at:
http://localhost:8081
The value of LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT defines the root directory from which Label Studio is allowed to access local files. The Local Storage itself is configured later, in the project settings.
Port 8081 is used in this example to avoid conflicts with the default LogicalDOC installation, which typically runs on port 8080.
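The same startup can also be scripted. The following is a minimal sketch, assuming Label Studio is installed in the current Python environment; the function names `local_storage_env` and `start_label_studio` are hypothetical helpers introduced here, while the environment variables and the start command are the ones shown above.

```python
import os
import subprocess
import sys
from pathlib import Path


def local_storage_env(document_root, port=8081):
    """Return a copy of the current environment with the Local Storage variables set."""
    root = Path(document_root)
    # Fail early on relative or missing directories instead of at sync time.
    if not root.is_absolute() or not root.is_dir():
        raise ValueError(f"document root must be an existing absolute directory: {root}")
    env = dict(os.environ)
    env["LABEL_STUDIO_PORT"] = str(port)
    env["LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED"] = "true"
    env["LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT"] = str(root)
    return env


def start_label_studio(document_root, port=8081):
    """Run `python -m label_studio.server start` with Local Storage enabled (blocks until exit)."""
    env = local_storage_env(document_root, port)
    subprocess.run([sys.executable, "-m", "label_studio.server", "start"], env=env, check=True)


# Example call (not executed here):
# start_label_studio(r"C:\Users\username\Documents\label-studio", port=8081)
```

Validating the directory before launching avoids starting the server with a document root that cannot be synchronized later.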
Create a Project
- Log in to Label Studio
- Click Create Project
- Enter a project name
- Configure the labeling interface
- Save the project
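For bounding-box annotation, the labeling interface is described by a small XML configuration that can be pasted into the project's code view. The sketch below generates such a configuration from a list of label names; the label names `signature` and `stamp` and the helper `rectangle_labeling_config` are hypothetical, and the `Image` / `RectangleLabels` layout follows the usual object-detection template rather than anything prescribed by this guide.

```python
import xml.etree.ElementTree as ET


def rectangle_labeling_config(labels):
    """Build a labeling configuration with one rectangle label per entry in `labels`."""
    view = ET.Element("View")
    # "$image" refers to the task field that holds the image location.
    ET.SubElement(view, "Image", {"name": "image", "value": "$image"})
    rect = ET.SubElement(view, "RectangleLabels", {"name": "label", "toName": "image"})
    for value in labels:
        ET.SubElement(rect, "Label", {"value": value})
    ET.indent(view)  # pretty-print; available from Python 3.9
    return ET.tostring(view, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical label names, for illustration only.
    print(rectangle_labeling_config(["signature", "stamp"]))
```

The order of the labels matters later, because the class indices in a YOLO export are derived from the label list.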
Configure and Synchronize Local Storage
- Open the project
- Navigate to Settings > Cloud Storage
- Click Add Source Storage
- Select Local Files
- Configure the path specified by LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT
- Click Sync Storage


When importing images, choose Files as the import method.

After synchronization, Label Studio automatically creates one task for each imported document image.

Annotate Documents
- Open a task
- Select a label
- Draw a bounding box around the target area
- Save the annotation
Export the Dataset
- Open the project
- Click Export
- Select the desired format


Supported formats include:
- YOLO
- COCO
- Pascal VOC
- CSV
For YOLO training, export the dataset in either the YOLO or the YOLO with Images format.
KNOWN ISSUE
Even when selecting the YOLO with Images export format, Label Studio exports only the annotation (.txt) files. The corresponding images are not included in the exported archive.
Before starting the training process, copy the original images manually into the appropriate dataset directories.
For information about the expected dataset structure, see YOLO Training Pipeline.
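The manual copy step can be automated. The following is a minimal sketch, assuming that each exported label file has the same base name as its image (for example `page_001.txt` and `page_001.png`); the function name, the directory arguments, and the list of image suffixes are assumptions made for illustration, and the destination directory should follow the structure described in YOLO Training Pipeline.

```python
import shutil
from pathlib import Path

# Image extensions considered when matching labels to images (assumed list).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def copy_images_for_labels(labels_dir, source_images_dir, dest_images_dir):
    """Copy the image that matches each label file; return label files without an image."""
    labels_dir = Path(labels_dir)
    source_images_dir = Path(source_images_dir)
    dest_images_dir = Path(dest_images_dir)
    dest_images_dir.mkdir(parents=True, exist_ok=True)

    # Index the original images by base name (file name without extension).
    by_stem = {
        p.stem: p
        for p in sorted(source_images_dir.iterdir())
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }

    missing = []
    for label_file in sorted(labels_dir.glob("*.txt")):
        image = by_stem.get(label_file.stem)
        if image is None:
            missing.append(label_file.name)
            continue
        shutil.copy2(image, dest_images_dir / image.name)
    return missing
```

The returned list makes it easy to spot annotations whose image could not be found before training starts.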
Dataset Formats
COCO
COCO is a JSON-based dataset format commonly used for object detection datasets. It stores images, categories, annotations, and bounding boxes in a single JSON file.
More information: https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/md-coco-overview.html
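As an illustration of the single-file layout, the sketch below builds a minimal COCO-style dictionary and groups its bounding boxes by image. The file name, category name, and numbers are made up; in COCO, `bbox` is given in pixels as `[x, y, width, height]` with `x, y` being the top-left corner.

```python
import json

# Made-up example data: one image, one category, one annotation.
coco = {
    "images": [{"id": 1, "file_name": "page_001.png", "width": 1000, "height": 800}],
    "categories": [{"id": 1, "name": "signature"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 200, 300, 150], "area": 45000, "iscrowd": 0}
    ],
}


def boxes_by_file(dataset):
    """Map each image file name to a list of (category name, bbox) pairs."""
    names = {c["id"]: c["name"] for c in dataset["categories"]}
    files = {i["id"]: i["file_name"] for i in dataset["images"]}
    result = {name: [] for name in files.values()}
    for ann in dataset["annotations"]:
        result[files[ann["image_id"]]].append((names[ann["category_id"]], ann["bbox"]))
    return result


if __name__ == "__main__":
    print(json.dumps(boxes_by_file(coco), indent=2))
```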
Pascal VOC XML
Pascal VOC is an XML-based dataset format widely used in object detection tasks. Each image has a corresponding XML file containing metadata such as image dimensions, object classes, and bounding box coordinates.
More information: https://roboflow.com/formats/pascal-voc-xml
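A minimal sketch of reading such a file with the Python standard library is shown below. The XML content is a made-up example that uses the common Pascal VOC element names (`size`, `object`, `name`, `bndbox`); `read_voc` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up annotation for a single image with one object.
VOC_EXAMPLE = """<annotation>
  <filename>page_001.png</filename>
  <size><width>1000</width><height>800</height><depth>3</depth></size>
  <object>
    <name>signature</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>400</xmax><ymax>350</ymax></bndbox>
  </object>
</annotation>"""


def read_voc(xml_text):
    """Return ((width, height), [(class name, (xmin, ymin, xmax, ymax)), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    dims = (int(size.findtext("width")), int(size.findtext("height")))
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Coordinates are absolute pixel values of the box corners.
        coords = tuple(int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), coords))
    return dims, objects
```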
YOLO
YOLO datasets consist of images and text annotation files organized according to a predefined directory structure. Each image has a corresponding text file with one line per object: the class index followed by the normalized center coordinates, width, and height of the bounding box.
More information: https://docs.cvat.ai/docs/dataset_management/formats/format-yolo/
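To make the normalization concrete, the sketch below converts a pixel bounding box into a YOLO annotation line of the form `class x_center y_center width height`, with all four values divided by the image dimensions. The function name and the sample numbers are illustrative only.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel box given by its corners into one YOLO annotation line."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 300 x 150 pixel box in a 1000 x 800 pixel image.
    print(to_yolo_line(0, 100, 200, 400, 350, 1000, 800))
    # prints "0 0.250000 0.343750 0.300000 0.187500"
```

Because the values are relative to the image size, the same annotation stays valid if the image is resized without changing its aspect ratio.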
YOLOv8 OBB
YOLOv8 OBB (Oriented Bounding Boxes) extends the standard YOLO format by supporting rotated bounding boxes. Each box is described by the normalized x and y coordinates of its four corners (eight values) instead of a center, width, and height (four values). This format is useful when objects are rotated relative to the image axes.
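The eight values can be derived from a rotated rectangle as sketched below, producing a line of the form `class x1 y1 x2 y2 x3 y3 x4 y4`. The function name, the corner order (starting at the top-left corner of the unrotated box), and the angle convention (degrees, rotating in the direction from the x axis toward the y axis of the image) are assumptions chosen for this example.

```python
import math


def to_obb_line(class_id, cx, cy, w, h, angle_deg, img_w, img_h):
    """Convert a rotated pixel box (center, size, angle) into one OBB annotation line."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    values = []
    # Corner offsets from the center before rotation.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        x = cx + dx * cos_t - dy * sin_t  # rotate, then translate to the center
        y = cy + dx * sin_t + dy * cos_t
        values.extend((x / img_w, y / img_h))  # normalize by the image size
    return " ".join([str(class_id)] + [f"{v:.6f}" for v in values])


if __name__ == "__main__":
    # With an angle of 0 the corners match the axis-aligned box (100, 200)-(400, 350).
    print(to_obb_line(0, 250, 275, 300, 150, 0, 1000, 800))
```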