1
Fork 0
mirror of https://github.com/RGBCube/uutils-coreutils synced 2026-01-14 17:21:04 +00:00
uutils-coreutils/docs/src/l10n.md
Sylvestre Ledru 61d69a18d7
l10n: document a bit how it works (#8102)
* l10n: document a bit how it works

* add a link to fluent

Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>

* fix typo

Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>

---------

Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>
2025-06-11 09:30:48 +02:00

3.8 KiB

🌍 Localization (L10n) in uutils coreutils

This guide explains how localization (L10n) is implemented in the Rust-based coreutils project, detailing the use of Fluent files, runtime behavior, and developer integration.


📁 Fluent File Layout

Each utility has its own set of translation files under:

    src/uu/<utility>/locales/<locale>.ftl

Examples:

    src/uu/ls/locales/en-US.ftl
    src/uu/ls/locales/fr-FR.ftl

These files follow Fluent syntax and contain localized message patterns.


⚙️ Initialization

Localization must be explicitly initialized at runtime using:

    setup_localization(path)

This is typically done:

  • In src/bin/coreutils.rs for multi-call binaries
  • In src/uucore/src/lib.rs for single-call utilities

The string parameter determines the lookup path for Fluent files.


🌐 Locale Detection

Locale selection is automatic and performed via:

    fn detect_system_locale() -> Result<LanguageIdentifier, LocalizationError>

It reads the LANG environment variable (e.g., fr-FR.UTF-8), strips encoding, and parses the identifier.

If parsing fails or LANG is not set, it falls back to:

    const DEFAULT_LOCALE: &str = "en-US";

You can override the locale at runtime by running:

    LANG=ja-JP ./target/debug/ls

📥 Retrieving Messages

Two APIs are available:

get_message(id: &str) -> String

Returns the message from the current locale bundle.

    let msg = get_message("id-greeting");

If not found, falls back to en-US. If still missing, returns the ID itself.


get_message_with_args(id: &str, args: HashMap<String, String>) -> String

Supports variable interpolation and pluralization.

    let msg = get_message_with_args(
        "error-io",
        HashMap::from([
            ("error".to_string(), std::io::Error::last_os_error().to_string())
        ])
    );

Fluent message example:

    error-io = I/O error occurred: { $error }

Variables must match the Fluent placeholder keys ($error, $name, $count, etc.).


📦 Fluent Syntax Example

    id-greeting = Hello, world!
    welcome = Welcome, { $name }!
    count-files = You have { $count ->
        [one] { $count } file
       *[other] { $count } files
    }

Use plural rules and inline variables to adapt messages dynamically.


🧪 Testing Localization

Run all localization-related unit tests with:

    cargo test --lib -p uucore

Tests include:

  • Loading bundles
  • Plural logic
  • Locale fallback
  • Fluent parse errors
  • Thread-local behavior
  • ...

🧵 Thread-local Storage

Localization is stored per thread using a OnceLock. Each thread must call setup_localization() individually. Initialization is one-time-only per thread — re-initialization results in an error.


🧪 Development vs Release Mode

During development (cfg(debug_assertions)), paths are resolved relative to the crate source:

    $CARGO_MANIFEST_DIR/../uu/<utility>/locales/

In release mode, paths are resolved relative to the executable:

    <executable_dir>/locales/<utility>/

If both fallback paths fail, an error is returned during setup_localization().


🔤 Unicode Isolation Handling

By default, the Fluent system wraps variables with Unicode directional isolate characters (U+2068, U+2069) to protect against visual reordering issues in bidirectional text (e.g., mixing Arabic and English).

In this implementation, isolation is disabled via:

    bundle.set_use_isolating(false);

This improves readability in CLI environments by preventing extraneous characters around interpolated values:

Correct (as rendered):

    "Welcome, Alice!"

Fluent default (disabled here):

    "\u{2068}Alice\u{2069}"