* l10n: document a bit how it works * add a link to fluent Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com> * fix typo Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com> --------- Co-authored-by: Daniel Hofstetter <daniel.hofstetter@42dh.com>
3.8 KiB
🌍 Localization (L10n) in uutils coreutils
This guide explains how localization (L10n) is implemented in the Rust-based coreutils project, detailing the use of Fluent files, runtime behavior, and developer integration.
📁 Fluent File Layout
Each utility has its own set of translation files under:
src/uu/<utility>/locales/<locale>.ftl
Examples:
src/uu/ls/locales/en-US.ftl
src/uu/ls/locales/fr-FR.ftl
These files follow Fluent syntax and contain localized message patterns.
⚙️ Initialization
Localization must be explicitly initialized at runtime using:
setup_localization(path)
This is typically done:
- In
src/bin/coreutils.rsfor multi-call binaries - In
src/uucore/src/lib.rsfor single-call utilities
The string parameter determines the lookup path for Fluent files.
🌐 Locale Detection
Locale selection is automatic and performed via:
fn detect_system_locale() -> Result<LanguageIdentifier, LocalizationError>
It reads the LANG environment variable (e.g., fr-FR.UTF-8), strips encoding, and parses the identifier.
If parsing fails or LANG is not set, it falls back to:
const DEFAULT_LOCALE: &str = "en-US";
You can override the locale at runtime by running:
LANG=ja-JP ./target/debug/ls
📥 Retrieving Messages
Two APIs are available:
get_message(id: &str) -> String
Returns the message from the current locale bundle.
let msg = get_message("id-greeting");
If not found, falls back to en-US. If still missing, returns the ID itself.
get_message_with_args(id: &str, args: HashMap<String, String>) -> String
Supports variable interpolation and pluralization.
let msg = get_message_with_args(
"error-io",
HashMap::from([
("error".to_string(), std::io::Error::last_os_error().to_string())
])
);
Fluent message example:
error-io = I/O error occurred: { $error }
Variables must match the Fluent placeholder keys ($error, $name, $count, etc.).
📦 Fluent Syntax Example
id-greeting = Hello, world!
welcome = Welcome, { $name }!
count-files = You have { $count ->
[one] { $count } file
*[other] { $count } files
}
Use plural rules and inline variables to adapt messages dynamically.
🧪 Testing Localization
Run all localization-related unit tests with:
cargo test --lib -p uucore
Tests include:
- Loading bundles
- Plural logic
- Locale fallback
- Fluent parse errors
- Thread-local behavior
- ...
🧵 Thread-local Storage
Localization is stored per thread using a OnceLock.
Each thread must call setup_localization() individually.
Initialization is one-time-only per thread — re-initialization results in an error.
🧪 Development vs Release Mode
During development (cfg(debug_assertions)), paths are resolved relative to the crate source:
$CARGO_MANIFEST_DIR/../uu/<utility>/locales/
In release mode, paths are resolved relative to the executable:
<executable_dir>/locales/<utility>/
If both fallback paths fail, an error is returned during setup_localization().
🔤 Unicode Isolation Handling
By default, the Fluent system wraps variables with Unicode directional isolate characters (U+2068, U+2069) to protect against visual reordering issues in bidirectional text (e.g., mixing Arabic and English).
In this implementation, isolation is disabled via:
bundle.set_use_isolating(false);
This improves readability in CLI environments by preventing extraneous characters around interpolated values:
Correct (as rendered):
"Welcome, Alice!"
Fluent default (disabled here):
"\u{2068}Alice\u{2069}"