diff --git a/docs/source/guide/ml.md b/docs/source/guide/ml.md index 73ead50df9d..5db75267d13 100644 --- a/docs/source/guide/ml.md +++ b/docs/source/guide/ml.md @@ -76,7 +76,17 @@ The model should begin running at `http://localhost:9090` (if you are using a Do {"model_class":"SamMLBackend","status":"UP"} ``` -If you see any errors, see [Troubleshooting ML Backends & Predictions](https://support.humansignal.com/hc/en-us/sections/23627938255117-ML-Backend-Predictions) in the HumanSignal support center and see the [Troubleshooting section in the README](https://github.com/HumanSignal/label-studio-ml-backend/tree/master?tab=readme-ov-file#troubleshooting). +
+ +If you see any errors, see [Troubleshooting ML backends](troubleshooting#ML-backends) and the [Troubleshooting section in the README](https://github.com/HumanSignal/label-studio-ml-backend/tree/master/README.md#troubleshooting). + +
+ +
+ +If you see any errors, see [Troubleshooting ML Backends & Predictions](https://support.humansignal.com/hc/en-us/sections/23627938255117-ML-Backend-Predictions) in the HumanSignal support center and see the [Troubleshooting section in the README](https://github.com/HumanSignal/label-studio-ml-backend/tree/master/README.md#troubleshooting). + +
#### localhost and Docker containers diff --git a/docs/source/guide/predictions.md b/docs/source/guide/predictions.md index 637396d5758..7afe25e6672 100644 --- a/docs/source/guide/predictions.md +++ b/docs/source/guide/predictions.md @@ -822,4 +822,14 @@ Import pre-annotated tasks into Label Studio [using the UI](tasks.html#Import-da ### Troubleshooting pre-annotations -See [Troubleshooting ML Backends & Predictions](https://support.humansignal.com/hc/en-us/sections/23627938255117-ML-Backend-Predictions) in the HumanSignal support center. +
+ +See [Troubleshooting pre-annotations](troubleshooting#Pre-annotations). + +
+ +
+ +See [Troubleshooting ML Backends & Predictions](https://support.humansignal.com/hc/en-us/sections/23627938255117-ML-Backend-Predictions) in the HumanSignal support center. + +
\ No newline at end of file diff --git a/docs/source/guide/storage.md b/docs/source/guide/storage.md index 7af1595a311..2bf98d618c5 100644 --- a/docs/source/guide/storage.md +++ b/docs/source/guide/storage.md @@ -26,9 +26,19 @@ When working with an external cloud storage connection, keep the following in mi * Label Studio doesn’t import the data stored in the bucket, but instead creates *references* to the objects. Therefore, you must have full access control on the data to be synced and shown on the labeling screen. * Sync operations with external buckets only goes one way. It either creates tasks from objects on the bucket (Source storage) or pushes annotations to the output bucket (Target storage). Changing something on the bucket side doesn’t guarantee consistency in results. -* We recommend using a separate bucket folder for each Label Studio project. +* We recommend using a separate bucket folder for each Label Studio project. -For more troubleshooting information, see [Troubleshooting Import, Export, & Storage](https://support.humansignal.com/hc/en-us/sections/16982163062029-Import-Export-Storage) in the HumanSignal support center. +
+ +For more troubleshooting information, see [Troubleshooting Label Studio](troubleshooting). + +
+ +
+ +For more troubleshooting information, see [Troubleshooting Import, Export, & Storage](https://support.humansignal.com/hc/en-us/sections/16982163062029-Import-Export-Storage) in the HumanSignal support center. + +
## How external storage connections and sync work @@ -662,4 +672,14 @@ If you're using Label Studio in Docker, you need to mount the local directory th ### Troubleshooting cloud storage -See [Troubleshooting Import, Export, and Storage](https://support.humansignal.com/hc/en-us/sections/16982163062029-Import-Export-Storage) in the HumanSignal support center. +
+ +For more troubleshooting information, see [Troubleshooting Label Studio](troubleshooting). + +
+ +
+ +For more troubleshooting information, see [Troubleshooting Import, Export, & Storage](https://support.humansignal.com/hc/en-us/sections/16982163062029-Import-Export-Storage) in the HumanSignal support center. + +
\ No newline at end of file diff --git a/docs/source/guide/troubleshooting.md b/docs/source/guide/troubleshooting.md new file mode 100644 index 00000000000..ddfb220ad37 --- /dev/null +++ b/docs/source/guide/troubleshooting.md @@ -0,0 +1,506 @@ +--- +title: Troubleshooting Label Studio +short: Troubleshooting Label Studio +tier: opensource +type: guide +order: 460 +order_enterprise: 0 +hide_menu: true +meta_title: Troubleshooting Label Studio +meta_description: Troubleshooting issues in Label Studio Community Edition +date: 2024-09-03 09:57:28 +--- + +!!! error Enterprise + This page covers common user troubleshooting scenarios for Label Studio Community version. For information specific to Label Studio Enterprise, see our [support center articles](https://support.humansignal.com/hc/en-us). + + +## Installation + +See [Troubleshoot installation issues](install_troubleshoot). + + +## Projects + +### Blank page when loading a project + +After starting Label Studio and opening a project, you see a blank page. Several possible issues could be the cause. + +If you specify a host without a protocol such as `http://` or `https://` when starting Label Studio, Label Studio can fail to locate the correct files to load the project page. + +To resolve this issue, update the host specified as an environment variable or when starting Label Studio. See [Start Label Studio](start). + +## Labeling + +### Slowness while labeling + +* If you're using the SQLite database and another user imports a large volume of data, labeling might slow down for other users on the server due to the database load. + +* If you want to upload a large volume of data (thousands of items), consider doing that at a time when people are not labeling or use a different database backend such as PostgreSQL or Redis. You can run Docker Compose from the root directory of Label Studio to use PostgreSQL: `docker-compose up -d`, or see [Sync data from cloud or database storage](storage). 
+ +* If you are using a labeling schema that has many thousands of labels, consider using an [external taxonomy](/tags/taxonomy) instead. + +### Image/audio/resource loading error while labeling + +The most common cause of resource loading errors is a CORS (Cross-Origin Resource Sharing) problem. When you fetch a resource such as an image from an external host, the browser can block the request for security reasons. + +Open your browser console and check for errors there. Typically, you solve this problem by changing the configuration of the external host. + +- If you have admin access to the hosting server, allow CORS for the web server. For example, on nginx, you can try adding the following lines to `/etc/nginx/nginx.conf` under your `location` section: + +{% details Click for details %} + ```conf + location { + if ($request_method = 'OPTIONS') { + add_header 'Access-Control-Allow-Origin' '*'; + add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS'; + # + # Custom headers and headers various browsers *should* be OK with but aren't + # + add_header 'Access-Control-Allow-Headers' 'DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range'; + # + # Tell client that this pre-flight info is valid for 20 days + # + add_header 'Access-Control-Max-Age' 1728000; + add_header 'Content-Type' 'text/plain; charset=utf-8'; + add_header 'Content-Length' 0; + return 204; + } + if ($request_method = 'POST') { + add_header 'Access-Control-Allow-Origin' '*'; + add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS'; + add_header 'Access-Control-Allow-Headers' 'DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range'; + add_header 'Access-Control-Expose-Headers' 'Content-Length,Content-Range'; + } + if ($request_method = 'GET') { + add_header 'Access-Control-Allow-Origin' '*'; + add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS'; + add_header 'Access-Control-Allow-Headers' 
'DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range'; + add_header 'Access-Control-Expose-Headers' 'Content-Length,Content-Range'; + } + } + ``` +{% enddetails %} +* For Amazon S3, see [Configuring and using cross-origin resource sharing (CORS)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/cors.html) in the Amazon S3 User Guide. +* For GCS, see [Configuring cross-origin resource sharing (CORS)](https://cloud.google.com/storage/docs/configuring-cors) in the Google Cloud Storage documentation. +* For Microsoft Azure, see [Cross-Origin Resource Sharing (CORS) support for Azure Storage](https://docs.microsoft.com/en-us/rest/api/storageservices/cross-origin-resource-sharing--cors--support-for-the-azure-storage-services) in the Microsoft Azure documentation. +* If you serve your data from an HTTP server started with `python -m http.server 8081 -d`, serve it with a CORS-enabled server instead by running the following from the command line: +```bash +npm install http-server -g +http-server -p 3000 --cors +``` + +Not every host supports CORS setup, but you can try to locate CORS settings in the admin area of your host configuration. + +### Audio wave doesn't match annotations + +If you find that after annotating audio data, the visible audio wave doesn't match the timestamps and the sound, try converting the audio to a different format. For example, if you are annotating mp3 files, try converting them to wav files. + +```bash +ffmpeg -y -i audio.mp3 -ar 8k -ac 1 audio.wav +``` + +### Predictions aren't visible to annotators + +See [Pre-annotations](#Pre-annotations) below. + +### Can't label PDF data + +Label Studio does not support labeling PDF files directly. However, you can convert files to HTML using your PDF viewer or another tool and label the PDF as part of the HTML. 
See an example labeling configuration in the [Label Studio playground](/playground/?config=%3CView%3E%3Cbr%3E%20%20%3CHyperText%20name%3D%22pdf%22%20value%3D%22%24pdf%22%2F%3E%3Cbr%3E%3Cbr%3E%20%20%3CHeader%20value%3D%22Rate%20this%20article%22%2F%3E%3Cbr%3E%20%20%3CRating%20name%3D%22rating%22%20toName%3D%22pdf%22%20maxRating%3D%2210%22%20icon%3D%22star%22%20size%3D%22medium%22%20%2F%3E%3Cbr%3E%3Cbr%3E%20%20%3CChoices%20name%3D%22choices%22%20choice%3D%22single-radio%22%20toName%3D%22pdf%22%20showInline%3D%22true%22%3E%3Cbr%3E%20%20%20%20%3CChoice%20value%3D%22Important%20article%22%2F%3E%3Cbr%3E%20%20%20%20%3CChoice%20value%3D%22Yellow%20press%22%2F%3E%3Cbr%3E%20%20%3C%2FChoices%3E%3Cbr%3E%3C%2FView%3E%3Cbr%3E). + +## Cloud and local storage + +When working with an external Cloud Storage connection (S3, GCS, Azure), keep the following in mind: + +* Label Studio doesn’t import the data stored in the bucket, but instead creates *references* to the objects. Therefore, you have full access control on the data to be synced and shown on the labeling screen. +* Sync operations with external buckets only goes one way. It either creates tasks from objects on the bucket (Source storage) or pushes annotations to the output bucket (Target storage). Changing something on the bucket side doesn’t guarantee consistency in results. +* We recommend using a separate bucket folder for each Label Studio project. + +### CORS errors + +If you have not set up CORS, you cannot view cloud storage data from Label Studio. You might see a link to the data rather than a preview of the data, or you might see a CORS error in your web browser console: + +* For Amazon S3, see [Configuring and using cross-origin resource sharing (CORS)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/cors.html) in the Amazon S3 User Guide. +* For GCS, see [Configuring cross-origin resource sharing (CORS)](https://cloud.google.com/storage/docs/configuring-cors) in the Google Cloud Storage documentation. 
+* For Microsoft Azure, see [Cross-Origin Resource Sharing (CORS) support for Azure Storage](https://docs.microsoft.com/en-us/rest/api/storageservices/cross-origin-resource-sharing--cors--support-for-the-azure-storage-services) in the Microsoft Azure documentation. + +!!! note + 1. Make sure to apply the correct role and permissions for your Service Account. For example, grant the Service Account the `roles/iam.serviceAccountTokenCreator` role. + + 2. To find the name of the Service Account that Label Studio is using, check the error displayed in the DEBUG logs. You can enable DEBUG logs with the `--log-level DEBUG` flag in the `label-studio start` command. + +### 403 errors + +If you see 403 errors in your web browser console, make sure you configured the correct credentials. + +{% details Google Cloud Storage credentials %} + +See [Setting up authentication](https://cloud.google.com/storage/docs/reference/libraries#setting_up_authentication) and [IAM permissions for Cloud Storage](https://cloud.google.com/storage/docs/access-control/iam-permissions) in the Google Cloud Storage documentation. + +Your account must have the **Service Account Token Creator** role, **Storage Object Viewer** role, and **storage.buckets.get** access permission. + +Also, if you're using a service account to authorize access to the Google Cloud Platform, make sure to activate it. See [gcloud auth activate-service-account](https://cloud.google.com/sdk/gcloud/reference/auth/activate-service-account) in the Google Cloud SDK: Command Line Interface documentation. + +{% enddetails %} + +{% details Amazon S3 credentials %} + +For Amazon S3, see [Configuration and credential file settings](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html) in the Amazon AWS Command Line Interface User Guide. Also check that your credentials work from the [aws client](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html). 
+ + * Ensure that you specified the correct region when creating a bucket. If needed, change the region in your source or target storage settings or the `.aws/config` file, otherwise you might have problems accessing your bucket objects. + For example, update the following: `~/.aws/config` + + ``` + [default] + region=us-east-2 # change to the region of your bucket + ``` +- Ensure that the credentials you used to set up the source or target storage connection are still valid. If you see 403 errors in the browser console, and you set up the correct permissions for the bucket, you might need to update the Access Key ID, Secret Access Key, and Session Token. See the AWS Identity and Access Management documentation on [Requesting temporary security credentials](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_request.html). + +{% enddetails %} + +### Clicking Sync does not update my data + +Sometimes the sync process doesn’t start immediately. That is because the sync process is run by an internal job scheduler. If after a period of time nothing happens, follow the steps below. + +First, check that you have specified the correct credentials (see the sections above). + +Then go to the cloud storage settings page and click **Edit** next to the cloud connection. From here, you can check the following: + +* The **File Filter Regex** is set and correct. When no filters are specified, all found items are skipped. The filter should be a valid regular expression, not a wildcard (for example, `.*` is valid, `*.` is not). +* **Treat every bucket object as a source file** should be toggled `ON` if you work with images, audio, text files or any other binary content stored in the bucket. + + This instructs Label Studio to create URI endpoints, store them as the labeling task payload, and resolve them into presigned `https` URLs when opening the labeling screen. + + If you store JSON tasks in the Label Studio format in your bucket, turn this toggle `OFF`. 
+ +* Check for rq worker failures. An easy way to check rq workers is to complete an export operation. + + From the Data Manager, click **Export**, create a new snapshot, and download the JSON file. If you see an error, most likely your rq workers are having problems. Another way to check rq workers is to log in as a superuser and go to the `/django-rq` page. You should see a `workers` column. If the values are `0` or the column is empty, this can indicate a failure. + +### JSON files from a cloud storage are not synced and the Data Manager is empty + +1. Edit the storage settings to enable **Treat every bucket object as a source file**. If you see tasks in the Data Manager, proceed to step 2. +2. Disable **Treat every bucket object as a source file**. + + If you don’t see tasks in the Data Manager, your bucket doesn’t have GET permissions, only LIST permissions. + +If there is only LIST permission, Label Studio can scan the bucket for the existence of objects without actually reading them. With GET permissions, Label Studio can read the data and extract your JSON files appropriately. + + +### Tasks don't load the way I expect + +If the tasks sync to Label Studio but don't appear the way that you expect, maybe with URLs instead of images or with one task where you expect to see many, check the following: +- If you're placing JSON files in [cloud storage](storage.html), place 1 task in each JSON file in the storage bucket. If you want to upload a JSON file from local storage into Label Studio, you can place multiple tasks in one JSON file. +- If you're syncing image or audio files, make sure **Treat every bucket object as a source file** is enabled. + +### Unable to access local storage when using Windows + +If you are using Windows: + +1. Ensure you use double backslashes (`\\`) when setting the environment variable `LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT`. This is necessary because you have to escape the backslash (\). +2. 
Ensure you use single backslashes (`\`) when entering the **Absolute local path** when configuring local storage for a project. +3. Do not use spaces or non-Latin symbols in `LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT` or in the **Absolute local path**. + +Example: + +```bash +LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=c:\\data\\media +Absolute local path from Local Storage settings = c:\data\media\subpath +``` + +## Pre-annotations + +Check that you are using the correct annotation units. + +{% details Image annotation units %} + + + +{% enddetails %} + +### Annotators cannot see predictions + +If annotators can't see predictions or if you encounter unexpected behavior after you [import pre-annotations into Label Studio](predictions), review this guidance to resolve the issues. + +First, in the **Settings > Annotation** section for your project, ensure that **Use predictions to pre-label tasks** is enabled. + +#### Check the configuration values of the labeling configuration and tasks + +The `from_name` of the pre-annotation task JSON must match the value of `name` in the corresponding control tag of the labeling configuration. The `to_name` must match the `toName` value. + +For example, the following XML: + ```xml + ... + <Choices name="choice" toName="image"> + ... + <RectangleLabels name="label" toName="image"> + ... + ``` + +should correspond to the following portions of the example JSON: +```json +... +"type": "rectanglelabels", +"from_name": "label", "to_name": "image", +... +"type": "choices", +"from_name": "choice", "to_name": "image", +... +``` + +#### Check the labels in your configuration and your tasks +Make sure that you have a labeling configuration set up for the labeling interface, and that the labels in your JSON file exactly match the labels in your configuration. If you're using a [tool to transform your model output](https://github.com/HumanSignal/label-studio-transformers), make sure that the labels aren't altered by the tool. 
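As a rough aid for the two checks above, the following Python sketch compares the `from_name`, `to_name`, and label values in a list of prediction results against a labeling configuration. It is not part of Label Studio: `check_prediction`, `CONFIG`, and `RESULTS` are hypothetical names, and the configuration and results are illustrative. The sketch assumes that each control tag has a single `toName` and that label values are stored in `value` under a key equal to the result `type`.

```python
import xml.etree.ElementTree as ET


def check_prediction(config_xml, results):
    """Return a list of mismatches between a labeling config and prediction results."""
    root = ET.fromstring(config_xml)

    # Map each control tag name to (toName, set of allowed label values).
    # Assumption: a tag with both `name` and `toName` is a control tag.
    controls = {}
    for el in root.iter():
        name, to_name = el.get("name"), el.get("toName")
        if name is not None and to_name is not None:
            labels = {c.get("value") for c in el if c.get("value") is not None}
            controls[name] = (to_name, labels)

    problems = []
    for i, result in enumerate(results):
        from_name = result.get("from_name")
        if from_name not in controls:
            problems.append(f"result {i}: from_name {from_name!r} matches no control tag")
            continue
        expected_to, labels = controls[from_name]
        if result.get("to_name") != expected_to:
            problems.append(f"result {i}: to_name should be {expected_to!r}")
        # Assumption: labels live under value[<type>], e.g. value["choices"].
        for label in result.get("value", {}).get(result.get("type"), []):
            if labels and label not in labels:
                problems.append(f"result {i}: label {label!r} is not in the config")
    return problems


# Illustrative configuration and results (hypothetical data).
CONFIG = """
<View>
  <Image name="image" value="$image"/>
  <Choices name="choice" toName="image">
    <Choice value="Boeing"/>
    <Choice value="Airbus"/>
  </Choices>
  <RectangleLabels name="label" toName="image">
    <Label value="Airplane"/>
    <Label value="Car"/>
  </RectangleLabels>
</View>
"""

RESULTS = [
    {"type": "rectanglelabels", "from_name": "label", "to_name": "image",
     "value": {"rectanglelabels": ["Airplane"]}},
    # from_name has a typo: the control tag is named "choice".
    {"type": "choices", "from_name": "choices", "to_name": "image",
     "value": {"choices": ["Boeing"]}},
    # Label differs in case from the configured "Airplane".
    {"type": "rectanglelabels", "from_name": "label", "to_name": "image",
     "value": {"rectanglelabels": ["airplane"]}},
]

for problem in check_prediction(CONFIG, RESULTS):
    print(problem)
```

An empty list means the sketch found no mismatches under its assumptions. It does not validate coordinates, units, or nested results.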
+ +#### Check the IDs and toName values +If you're performing nested labeling, such as displaying a TextArea tag for specific Label or Choice values, the IDs for those results must match. + +For example, if you want to transcribe text alongside a named entity recognition task, you might have the following labeling configuration: +```xml + + + + +