SSL Configuration using Certificate and Label Studio Guide: Difference between pages

From LogicalDOC Community Wiki
(Difference between pages)
 
Giuseppe (talk | contribs)
 
Line 1: Line 1:
{{TOCright}} __TOC__
= Preparing a Dataset with Label Studio =


LogicalDOC embeds the Tomcat application server and it can be configured to support the encrypted protocol HTTPS. This is useful when you want to expose the program on the Internet.
This guide explains how to create an annotated dataset for YOLO training using Label Studio.


{{Advice|<b><u>Please be aware that this procedure is not covered by the standard support contract</u>.</b> <br/>If you want this matter to be handled professionally, please write to sales@logicaldoc.com for a quote.}}
Install Label Studio using pip:


Basically you only have to follow the steps described in the Apache how-to at [https://tomcat.apache.org/tomcat-9.0-doc/ssl-howto.html SSL Configuration HOW-TO]
<pre>
What follows is a revised extract from that how-to
pip install label-studio
</pre>


== Preparing the certificates ==
Verify the installation:
Note: skip this step if you already have your own SSL certificate


To install and configure SSL support on Tomcat, you need the following things:
<pre>
* The file of your server certificate (the format must be PEM-encoded)
python -m label_studio.server --help
* The file containing the certificate chain associated with the server certificate (the format must be PEM-encoded)
</pre>
* The file that contains the server private key (the format must be PEM-encoded)


You get those 3 files as a result of the certificate issuing procedure.
Or refer to the official installation guide:
https://labelstud.io/guide/install


Most of the time, your server's certificate and the chain file are in .crt, .cer or .der format.
In this case, convert them to PEM using openssl:<br />
<code>openssl x509 -in cert.crt -out cert.pem</code><br />
<code>openssl x509 -in cert.cer -out cert.pem</code><br />
<code>openssl x509 -inform DER -in cert.der -out cert.pem</code><br /><br />


Likewise, your private key is probably in .txt format; convert it to .pem using openssl:<br />
=== Enable Local File Storage ===
<code>openssl rsa -in privkey.txt -out privkey.pem</code><br /><br />


We suggest putting your .pem files in <LDOC_HOME>/conf, and in any case outside the tomcat folder.
For large projects it is not recommended to upload images directly through the Label Studio interface. Instead, configure a local directory that contains the images to annotate.


== Edit the Tomcat configuration file ==
To enable local file access, configure the following environment variables before starting Label Studio:
The final step is to configure your secure socket in the <LDOC_HOME>/tomcat/conf/server.xml file, where <LDOC_HOME> represents the base directory for the LogicalDOC installation. An example <Connector> element for an SSL connector looks something like this:


<source lang="xml">
<pre>
    <Connector protocol="org.apache.coyote.http11.Http11NioProtocol"
LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
              port="8443" maxThreads="200"
LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/path/to/images
              URIEncoding="UTF-8" server="Undisclosed/8.41"
</pre>
              scheme="https" secure="true" SSLEnabled="true">
              <SSLHostConfig>
                <Certificate certificateFile="${catalina.home}/../conf/<b>cert.pem</b>"
                              certificateKeyFile="${catalina.home}/../conf/<b>privkey.pem</b>"
                              certificateChainFile="${catalina.home}/../conf/<b>chain.pem</b>" />
              </SSLHostConfig>
    </Connector>
</source>
   
<small>replace <LDOC_HOME> with the installation path of LogicalDOC</small>


{{Advice|The LogicalDOC application is updated from time to time, so it is not safe to keep the keystore inside the tomcat/ folder. Put your .pem files inside the conf/ folder of the LogicalDOC installation path.}}


The port attribute (default value is 8443) is the TCP/IP port number on which Tomcat will listen for secure connections. You can change this to any port number you wish.


If you change the port number here, you should also change the value specified for the redirectPort attribute on the non-SSL connector. This allows Tomcat to automatically redirect users who attempt to access a page with a security constraint specifying that SSL is required.


After completing these configuration changes, restart LogicalDOC as you normally do. You should then be able to access it via SSL. For example, try:
=== Starting Label Studio ===


    https://localhost:8443
Label Studio can be started using one of the following methods.


and you should see the usual login page.


== Install PFX certificates ==
==== Default Startup ====


Modify the Connector element in the server.xml file as follows:
If local file storage is not required, Label Studio can be started with the default configuration:


<source lang="xml">
<pre>
    <Connector protocol="org.apache.coyote.http11.Http11NioProtocol"
label-studio start
              port="8443" maxThreads="200"
</pre>
              URIEncoding="UTF-8" server="Undisclosed"
 
              maxHttpHeaderSize="16384"  
or
              scheme="https" secure="true" SSLEnabled="true"
 
      clientAuth="false"
<pre>
      sslProtocol="TLSv1.1+TLSv1.2+TLSv1.3"
python -m label_studio.server start
              keystoreFile="${catalina.home}/../conf/mycert.pfx"  # Path of the certificate file
</pre>
      keystoreType="PKCS12"
 
              keystorePass="certpasswd" # Replace the value with the password of your certificate
By default, the application is available at:
              />
 
</source>
<pre>
http://localhost:8080
</pre>
 
==== Startup with Local File Storage ====
 
When working with large datasets, it is recommended to configure Local Storage so that images are accessed directly from the filesystem.
 
Windows example:
 
<pre>
set LABEL_STUDIO_PORT=8081
set LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
set LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=C:\Users\username\Documents\label-studio
 
python -m label_studio.server start
</pre>
 
After startup, Label Studio will be available at:
 
<pre>
http://localhost:8081
</pre>
 
The directory specified by '''LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT''' can then be configured as Local Storage within a Label Studio project.
 
Port '''8081''' is used to avoid conflicts with the default LogicalDOC installation, which typically runs on port '''8080'''.
 
 
=== Create a Project ===
 
# Log in to Label Studio
# Click '''Create Project'''
# Enter a project name
# Configure the labeling interface
# Save the project
 
=== Import Images ===
 
# Open the project
# Click '''Import'''
# Select '''Local Storage'''
 
=== Configure Local Storage ===
 
# Open the project
# Navigate to '''Settings > Cloud Storage'''
# Click '''Add Source Storage'''
# Select '''Local Files'''
# Configure the path specified by LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT
# Click '''Sync Storage'''
 
Unlike CVAT, Label Studio automatically creates one task for each imported document image after synchronization.
 
When importing images, choose '''Files''' as the import method.
 
[[File:LabelStudio-local-storage.png.png|thumb|800px|center|Local Storage configuration showing a synchronized directory of document images]]
 
=== Annotate Documents ===
 
# Open a task
# Select a label
# Draw a bounding box around the target area
# Save the annotation
 
Example labels:
 
* Invoice Number
* Date
* Seller Name
* Buyer Name
* Total Amount
 
[[File:LabelStudio-annotation-example.png|thumb|600px|center|Example annotation]]
 
=== Export the Dataset ===
 
# Open the project
# Click '''Export'''
# Select the desired format
 
Supported formats include:
 
* YOLO
* COCO
* Pascal VOC
* CSV
 
For YOLO training, export the dataset in YOLO format.
 
=== Dataset Formats ===
 
==== COCO ====
 
COCO is a JSON-based dataset format commonly used for object detection datasets.
 
More information:
https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/md-coco-overview.html
 
==== YOLO ====
 
YOLO datasets contain images and annotation files organized according to a predefined directory structure.
 
More information:
https://docs.cvat.ai/docs/dataset_management/formats/format-yolo/
 
==== YOLOv8 OBB ====
 
YOLOv8 OBB (Oriented Bounding Boxes) extends the standard YOLO format by supporting rotated bounding boxes using eight normalized coordinates.

Revision as of 13:03, 23 June 2026

Preparing a Dataset with Label Studio

This guide explains how to create an annotated dataset for YOLO training using Label Studio.

Install Label Studio using pip:

pip install label-studio

Verify the installation:

python -m label_studio.server --help

Or refer to the official installation guide: https://labelstud.io/guide/install


Enable Local File Storage

For large projects it is not recommended to upload images directly through the Label Studio interface. Instead, configure a local directory that contains the images to annotate.

To enable local file access, configure the following environment variables before starting Label Studio:

LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/path/to/images



Starting Label Studio

Label Studio can be started using one of the following methods.


Default Startup

If local file storage is not required, Label Studio can be started with the default configuration:

label-studio start

or

python -m label_studio.server start

By default, the application is available at:

http://localhost:8080

Startup with Local File Storage

When working with large datasets, it is recommended to configure Local Storage so that images are accessed directly from the filesystem.

Windows example:

set LABEL_STUDIO_PORT=8081
set LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
set LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=C:\Users\username\Documents\label-studio

python -m label_studio.server start

After startup, Label Studio will be available at:

http://localhost:8081

The directory specified by LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT can then be configured as Local Storage within a Label Studio project.

Port 8081 is used to avoid conflicts with the default LogicalDOC installation, which typically runs on port 8080.
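
The Windows commands above can also be wrapped in a small, platform-independent launcher. The following is only a sketch: the function names <code>build_label_studio_env</code> and <code>start_label_studio</code> are hypothetical helpers introduced here, and it assumes Label Studio is installed in the same Python interpreter that runs the script.

```python
import os
import subprocess
import sys


def build_label_studio_env(document_root, port=8081, base_env=None):
    """Return a copy of the environment with the variables from this guide set."""
    env = dict(os.environ if base_env is None else base_env)
    env["LABEL_STUDIO_PORT"] = str(port)
    env["LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED"] = "true"
    env["LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT"] = str(document_root)
    return env


def start_label_studio(document_root, port=8081):
    """Run the start command shown above with the prepared environment."""
    env = build_label_studio_env(document_root, port)
    # sys.executable keeps the call on the interpreter where label-studio is installed.
    command = [sys.executable, "-m", "label_studio.server", "start"]
    return subprocess.call(command, env=env)


# Example call (the path is a placeholder, replace it with your image directory):
# start_label_studio(r"C:\Users\username\Documents\label-studio", port=8081)
```

Keeping the environment construction in its own function makes it possible to inspect the values before anything is started.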


Create a Project

  1. Log in to Label Studio
  2. Click Create Project
  3. Enter a project name
  4. Configure the labeling interface
  5. Save the project

Import Images

  1. Open the project
  2. Click Import
  3. Select Local Storage

Configure Local Storage

  1. Open the project
  2. Navigate to Settings > Cloud Storage
  3. Click Add Source Storage
  4. Select Local Files
  5. Configure the path specified by LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT
  6. Click Sync Storage

Unlike CVAT, Label Studio automatically creates one task for each imported document image after synchronization.

When importing images, choose Files as the import method.

[Figure: Local Storage configuration showing a synchronized directory of document images]

Annotate Documents

  1. Open a task
  2. Select a label
  3. Draw a bounding box around the target area
  4. Save the annotation

Example labels:

  • Invoice Number
  • Date
  • Seller Name
  • Buyer Name
  • Total Amount

[Figure: Example annotation (LabelStudio-annotation-example.png)]
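
The example labels above have to be declared in the labeling interface of the project. As a rough sketch, the script below generates a labeling configuration for bounding boxes from a list of label names. The function name <code>build_labeling_config</code> and the control names <code>image</code> and <code>label</code> are assumptions chosen for illustration; the resulting XML is meant to be pasted into the code view of the labeling interface settings.

```python
import xml.etree.ElementTree as ET


def build_labeling_config(labels):
    """Build a bounding-box labeling configuration for the given label names."""
    view = ET.Element("View")
    # "$image" refers to the task field that holds the image URL.
    ET.SubElement(view, "Image", {"name": "image", "value": "$image"})
    rectangles = ET.SubElement(
        view, "RectangleLabels", {"name": "label", "toName": "image"}
    )
    for label in labels:
        ET.SubElement(rectangles, "Label", {"value": label})
    return ET.tostring(view, encoding="unicode")


config = build_labeling_config(
    ["Invoice Number", "Date", "Seller Name", "Buyer Name", "Total Amount"]
)
print(config)
```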

Export the Dataset

  1. Open the project
  2. Click Export
  3. Select the desired format

Supported formats include:

  • YOLO
  • COCO
  • Pascal VOC
  • CSV

For YOLO training, export the dataset in YOLO format.
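
Before training, it can be useful to check that every exported image has a matching annotation file. The sketch below assumes that the unpacked export contains an <code>images</code> folder and a <code>labels</code> folder with one <code>.txt</code> file per image; verify this layout against your own export. The helper names are hypothetical.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def find_unlabeled(image_names, label_names):
    """Return image file names that have no .txt label file with the same stem."""
    labeled_stems = {
        Path(name).stem for name in label_names if name.lower().endswith(".txt")
    }
    return sorted(
        name
        for name in image_names
        if Path(name).suffix.lower() in IMAGE_EXTENSIONS
        and Path(name).stem not in labeled_stems
    )


def check_export(export_dir):
    """Apply find_unlabeled to an unpacked export (assumed layout: images/ and labels/)."""
    export_dir = Path(export_dir)
    images = [p.name for p in (export_dir / "images").iterdir()]
    labels = [p.name for p in (export_dir / "labels").iterdir()]
    return find_unlabeled(images, labels)
```

An image reported by this check is not necessarily an error: it may simply contain none of the labeled fields.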

Dataset Formats

COCO

COCO is a JSON-based dataset format commonly used for object detection datasets.

More information: https://docs.aws.amazon.com/rekognition/latest/customlabels-dg/md-coco-overview.html
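
To make the structure concrete, the following sketch builds a minimal COCO-style dictionary for a single image. The function name <code>build_coco</code> and the sample values are hypothetical; the example covers only the commonly used <code>images</code>, <code>annotations</code> and <code>categories</code> sections, with each bounding box stored as <code>[x, y, width, height]</code> in pixels.

```python
import json


def build_coco(image_name, image_width, image_height, boxes, category_names):
    """Build a minimal COCO-style dictionary for one image.

    boxes is a list of (category_id, x, y, width, height) tuples in pixels,
    where (x, y) is the top-left corner of the box.
    """
    categories = [{"id": i, "name": name} for i, name in enumerate(category_names)]
    annotations = []
    for annotation_id, (category_id, x, y, width, height) in enumerate(boxes):
        annotations.append(
            {
                "id": annotation_id,
                "image_id": 0,
                "category_id": category_id,
                "bbox": [x, y, width, height],
                "area": width * height,
                "iscrowd": 0,
            }
        )
    images = [
        {"id": 0, "file_name": image_name, "width": image_width, "height": image_height}
    ]
    return {"images": images, "annotations": annotations, "categories": categories}


coco = build_coco("invoice-001.png", 1000, 800, [(0, 100, 200, 200, 200)], ["Invoice Number"])
print(json.dumps(coco, indent=2))
```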

YOLO

YOLO datasets contain images and annotation files organized according to a predefined directory structure.

More information: https://docs.cvat.ai/docs/dataset_management/formats/format-yolo/
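
Each YOLO annotation file typically holds one line per bounding box, in the form <code>class x_center y_center width height</code>, with all four values normalized by the image size. The sketch below converts a pixel bounding box into such a line; the function name <code>to_yolo_line</code> and the sample numbers are chosen only for illustration.

```python
def to_yolo_line(class_index, x_min, y_min, x_max, y_max, image_width, image_height):
    """Convert a pixel bounding box into a normalized YOLO annotation line."""
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_index} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box with top-left corner (100, 200) in a 1000x800 image.
print(to_yolo_line(0, 100, 200, 300, 400, 1000, 800))
# prints: 0 0.200000 0.375000 0.200000 0.250000
```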

YOLOv8 OBB

YOLOv8 OBB (Oriented Bounding Boxes) extends the standard YOLO format by supporting rotated bounding boxes using eight normalized coordinates.
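
The eight coordinates are the x and y values of the four corners of the rotated box. As an illustration, the sketch below computes them from a center point, a size and a rotation angle. It assumes the line layout <code>class x1 y1 x2 y2 x3 y3 x4 y4</code>; the function name <code>rotated_box_to_obb_line</code> and the corner order are assumptions, so compare the result with the files produced by your own export.

```python
import math


def rotated_box_to_obb_line(class_index, cx, cy, width, height, angle_degrees,
                            image_width, image_height):
    """Convert a rotated pixel box into a line with eight normalized coordinates."""
    angle = math.radians(angle_degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    half_w, half_h = width / 2, height / 2
    # Corner offsets from the center before rotation:
    # top-left, top-right, bottom-right, bottom-left.
    offsets = [(-half_w, -half_h), (half_w, -half_h), (half_w, half_h), (-half_w, half_h)]
    values = []
    for dx, dy in offsets:
        # Rotate the offset around the center, then normalize by the image size.
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        values.append(x / image_width)
        values.append(y / image_height)
    return f"{class_index} " + " ".join(f"{value:.6f}" for value in values)


# A 200x100 pixel box centered at (500, 400) in a 1000x800 image, not rotated.
print(rotated_box_to_obb_line(0, 500, 400, 200, 100, 0, 1000, 800))
```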