* update lemma meaning and phonetic transcription repos
* chore: simplify progress bar widget
* Remove instructions from chat view, and add profile explanation to course participant page
* Translate courseParticipantTooltip
* fix: in course chats list, sort activities by activity ID
* use different text in chat/course participant tooltips
* depress disabled toolbar buttons
* fix: load course images on course load
* fix: on add course plan to space, set m.space.child power level to 0
* chore: add label to emoji selector in vocab analytics
* chore: increase text sizes in activity summary
* fix: don't show open sessions if user has selected a role
* feat: add button to regenerate latest bot message
* chore: update morph meaning repo
* chore: increase text size and spacing in language selection page, consume language locale emojis
* feat: on first lemma emoji selection, show snackbar with explanation
* chore: use builder to style pressable buttons based on height
* chore: add tooltips to each practice mode
* initial work to add shimmer to match activity options
* show word card in image toolbar mode
* use the same widget for word card and vocab details emoji pickers
* add shimmer background to match choices
* fix: close previous snackbar before opening new mode-disabled snackbar
* fix: refresh course details when course ID changes in course details
* chore: keep message button depressed
* only show emoji selection shimmer if no emoji is selected
* don't show reaction picker in emoji mode
* lemma emoji picker style updates
* update loading indicators in word zoom card
* feat: show word card in vocab details page
* practice buttons shimmer
* fixed-height audio player
* more practice mode updates
* more practice tweaks
* add space between rows of tokens in practice mode
* conditional top padding for practice tooltips
* feat: send message info in lemma info request
* chore: Focus on word meanings in reaction choices
* fix: restrict width of morph icon in practice token button
* chore: Expand word card for meanings
* chore: When first grammar question active, shimmer choices
* chore: Swap seed for hyphen for not-yet-chosen emojis in analytics
* chore: Level attention to emoji and audio icons
* fix: fix non-token vertical spacing in practice mode
* fix: close message overlay when screen size changes
* feat: While audio is playing, allow clicking a word to move audio to that spot
* feat: play audio on token click and on construct click in vocab analytics
* chore: snackbar close button
* feat: Stay in audio mode after end of audio
* chore: more word card spacing adjustments
* fix: use construct ID JSON in route for analytics details page
* feat: custom SSO login/signup dialog
* chore: add content to distinguish system edit from manual edit
* Make input bar suggestion text vertically centered when shrinking
* Add Pangea comments
* Add background to make dark mode icon stand out in own-message grammar practice
* chore: re-style SSO popup
* fix: progress bar min width
* fix: change how screen width metric changes are tracked
* simplify
* fix: fix carousel scroll issue
* fix: set emoji placeholder text color
* fix: when not in column mode, don't add padding to top of practice tooltip
* chore: prevent running IGC on empty message
* fix: allow translation of bot audio transcripts
* feat: analytics database
* fix: update analytics profile room IDs on change, set via parameter in analytics room knock request (#4949)
* chore: center title in add-a-course page (#4950)
* fix: update spacing of activity participant indicators to make them narrower, make user activity summary highlight row scrollable (#4951)
* fix: remove clicked new token from new tokens cache immediately instead of waiting for new token animation to finish (#4952)
* What now button takes user to top of course plan page (#4946)
* Add scrollController to course details pages
* Make what now button refresh details tab if needed, remove scrollController
* 4907 construct details changes (#4961)
* chore: remove delegation analytics page
* feat: vocab construct analytics level bar
* chore: analytics mobile navigation
* feat: cap construct XP
* Add background to regeneration request background (#4960)
* chore: reduce padding between lines of message in practice mode (#4962)
* chore: don't show message regeneration button if message has already been regenerated (#4963)
* fix: prevent request regeneration button from altering message height (#4964)
* fix: only animate top portion of activity status bar (#4965)
* fix: fix white box error and add opacity variation to construct levels in progress bar (#4966)
* fix: don't close word card on click (#4967)
* feat: after user exits IT three times, show them a popup with the option to disable automatic language assistance (#4968)
* feat: allow token feedback for word card in vocab analytics (#4900)
* feat: allow token feedback for word card in vocab analytics
* fix: remove duplicate global keys
* 4726 word card in Arabic goes way to the side (#4970)
* fix: initial work for word card positioning on RTL system
* fix: fix practice mode animation for RTL languages
* chore: close lemma emoji snackbar on parent widget disposed (#4972)
* fix: remove user summary testing code (#4974)
* feat: On hover of the Nav Bar, expand to show current icon tooltip text (#4976)
* feat: On hover of the Nav Bar, expand to show current icon tooltip text
* animate menu transition
* chore: delete construct navigation (#4984)
* chore: Use hyphen instead of seed/sprout/flower in list view (#4985)
* chore: update analytics page on construct update (#4987)
* fix: fix word card overlay in mobile vocab details page (#4988)
* fix: Latest sent message sinks when clicked on Mobile (#4989)
* fix: don't highlight new tokens until analytics initialize (#4990)
* chore: calculate times closed out of IT based on all messages in session (#4991)
* chore: add feedback response dialog (#4992)
* chore: move request generation button into message bubble (#4993)
* fix: show request regen button in overlay message (#4996)
* fix: separate block construct and update construct updates in vocab list view (#4998)
* feat: Do gold shimmer every 5 seconds on unselected emojis (#4999)
* simplify message token renderer (#4994)
* simplify message token renderer
* token rendering and new word collection for tokens in activity summary / menu
* make tokens hoverable
* Model key cleanup (#4983)
* refactor: Group redundant ModelKey entries
* Add Python script to find and replace hardcoded ModelKey values
* Edited Python script to not automatically use ModelKey for files not already using it
* refactor: Ran script and accepted obvious changes
* rename 'duration' model key
* Co-authored-by: ggurdin <ggurdin@gmail.com>
* fix: return bot local STT, ensure STT rep exists in request STT translation function (#5003)
* chore: set max lines for word card phonetic transcription (#5005)
* chore: Don't show shimmer for unavailable modes (#5006)
* chore: Delay until screen darkening (#5009)
* chore: add focus node to vocab list view search bar (#5011)
* chore: collapse navigation rail on navigate (#5013)
* When user saves course edits, return to details page (#5012)
* fix: don't lowercase construct keys in morph analytics list view (#5014)
* 4860 dms all chats (#5015)
* feat: initial work for DMs => all chats
* more navigation updates
* change all chats tooltip
* fix: set exact reactions length in overlay (#5016)
* fix: fix message list rendering (#5017)
* chore: disable lemma emoji selection for word card in token feedback dialog (#5026)
* fix: don't add XP update if no new construct uses were added (#5027)
* chore: hide request regeneration button in practice mode (#5028)
* chore: use root navigator for chat details dialogs (#5029)
* fix: rebuild word card on new word overlay dismissed (#5030)
* Ensure consistency of pressable button height after animation (#5025)
* Ensure consistency of pressable button height after animation
* Use variable instead of hardcoded value
* fix: fix overlay reactions bouncing around (#5031)
* fix: add horizontal padding to prevent choice animation cutoff (#5032)
* 4919 further optimizing message info (#5033)
* remove original sent from message content
* don't add null fields to message content JSON
* fix: only show disable language assistance popup if user manually closes IT (#5034)
* fix: only exclude XP-gained analytics events if blocked constructs has entry (#5035)
* fix: on analytics DB init, don't clear DB unless stored userID doesn't match client userID (#5036)
* don't log missing POS error for POS 'other' (#5039)
* don't log missing POS error for POS 'other'
* don't log error for missing grammar copy if lemma is 'other'
* chore: rebuild input bar hint text on language update (#5042)
* fix: clear database on reinitialize (#5045)
* chore: default to reactions maxWidth null if not available (#5047)
* fix: remove duplicate navigator pop in member actions popup (#5048)
* Reduce gap between lines in practice modes (#5041)
* fix: prevent word card overflow in vocab details (#5049)
* chore: style tokens in transcription like other clickable tokens (#5055)
* fix: always align space nav rail children to the left (#5059)
* chore: update message analytics feedback popup background color (#5061)
* chore: increase padding in span card scroll view to prevent choice animation overflow (#5062)
* chore: Don't use dropdown if only one item (#5063)
* chore: Disable ability to send video/files (or anything else that the bot doesn't know what to do with) in bot chats (#5065)
* chore: show more specific error in audio recording dialog (#5068)
* chore: stack expanded space navigation menu over screen in one-column mode (#5069)
* feat: when screen size gets too short, show warning dialog (#5070)
* 5053 can get points from lemma with max score (#5078)
* make uses a private field for ConstructUses
* expose capped list of uses in ConstructUses
* filter capped construct uses in getUses
* fix: don't show send button if error in recording dialog (#5079)
* chore: allow users to highlight main word in word card
* fix: in emoji picker, don't set selected emoji based on old stream data
* remove duplicate subscription cancel
* fix: fix recording dialog import error
* fix: disable new token collection for token not in L2
* chore: use activity plan CEFR level in saved activity display
* chore: apply border to dialog directly in delete space dialog (#5093)
* chore: hide nav rail item tooltips when expanded (#5094)
* chore: reduce min height of span card feedback section (#5095)
* chore: force span card to always go above input bar (#5096)
* fix: always enable small screen warning dialog on web (#5097)
* fix: add new blocks to merge table before fetching previous constructs when calculating points added by construct update (#5098)
* fix: remove reaction subscription to prevent overlay jumping (#5100)
* 4825 vocabulary practice (#4826)
* chore: move logic for lastUsedByActivityType into ConstructIdentifier
* feat: vocab practice
* add vocab activity progress bar
* fix: shuffle audio practice choices
* update UI of vocab practice: added buttons, increased text size and changed position; cards flip over and turn red/green on click and respond to hover input
* add XP sparkle, shimmering choice card placeholder
* spacing changes: fix padding, make choice card spacing/sizing responsive to screen size, replace shimmer cards with stationary circle indicator
* don't include duplicate lemma choices
* use constructID and show lemma/emoji on choice cards; add method to clear cache in case the result was an error, and add a retry button on error
* gain XP immediately and take out continue session; also refactor the choice cards to have separate widgets for each type and a parent widget to give each an ID for the XP sparkle
* add practice finished page with analytics
* Color tweaks on completed page and time card placeholder
* add timer
* give XP for bonuses and change timer to use stopwatch
* simplify card logic, lock practice when few vocab words
* merge analytics changes and fix bugs
* reload on language change; derive XP data from new analytics; don't allow any clicks after correct answer selected
* small fixes, added tooltip, added copy to l10n
* small tweaks and comments
* formatting and import sorting
* Co-authored-by: avashilling <165050625+avashilling@users.noreply.github.com>
* feat: Directing to click messages with shimmer (#5106)
* fix: use standard loading dialog on submit delete space dialog (#5107)
* chore: don't show practice tooltip if mode is complete (#5108)
* chore: don't restrict token length (#5112)
* fix: in recording dialog, throw exception on permission denied (#5114)
* chore: remove margin from last entry in user activity summary list (#5115)
* chore: make emoji choice shimmer background match word card background (#5116)
* feat: allow users to update bot's voice settings (#5119)
* fix: hide ability to change bot chat settings from non-admins (#5120)
* fix: remove extra text from end of download file name (#5121)
* fix: remove invalid expanded widget (#5124)
* fix: add guard to prevent showing screen size popup when expanding screen after showing popup (#5127)
* chore: normalize accents in vocab search (#5128)
* chore: base level icon spacing on XP needed to reach level in vocab details (#5131)
* chore: add padding to bottom of vocab list view so practice button doesn't block last vocab entries (#5132)
* fix: fix practice record construct ID assignment for morph activities (#5133)
* fix: coerce existing aggregate analytics database entries into correct format before merging to avoid data loss (#5136)
* feat: make construct aggregation case-insensitive (#5137)
* chore: prevent user from spamming disabled vocab practice button (#5138)
* fix: reset voice on language update (#5140)
* chore: make emoji base shimmer transparent (#5142)
* chore: update sort order in space participants list (#5144)
* chore: remove padding from last entry in activity list (#5146)
* fix: disable emoji setting for non-L2 constructs (#5148)
* fix: add reaction notifier to rebuild reaction picker and reaction display on reaction change (#5151)
* chore: decrease text sizes in vocab practice complete page in one-column mode (#5152)
* chore: hide download button in download dialogs if download is complete (#5157)
* fix: show morph as unlocked in analytics if ever used (#5158)
* chore: reduce span card spacing to reduce unneeded scroll (#5160)
* chore: reduce span card spacing to reduce unneeded scroll
* remove debugging code
* fix: don't double space ID on navigation (#5163)
* chore: reduce negative points to 1 (#5162). To eliminate the chance of having a negative total, the minimum upon completion is now 30 XP
* fix: remove duplicates from answer choices (#5161)
* fix: use canonical activity time in display for completed activity (#5164)
* chore: refresh language cache to add voices (#5165)
* chore: don't show loading dialog on reaction redaction (#5166)
* build: bump version
* Co-authored-by: Kelrap <kel.raphael3@outlook.com>
* Co-authored-by: Kelrap <99418823+Kelrap@users.noreply.github.com>
* Co-authored-by: avashilling <165050625+avashilling@users.noreply.github.com>
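Among the items above, the accent normalization for vocab search (#5128) is the kind of change that is easy to sketch. The snippet below is a minimal, hypothetical Python version of that idea (not the app's actual Dart implementation): decompose characters, strip combining marks, and recompose, so "café" matches a search for "cafe".

```python
import unicodedata


def normalize_accents(text: str) -> str:
    """Return text with combining accent marks removed, for accent-insensitive search."""
    # NFD decomposition separates base characters from combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, then recompose what remains.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


print(normalize_accents("café"))  # -> cafe
```

Searching then compares `normalize_accents(query)` against `normalize_accents(entry)` on both sides.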
496 lines
17 KiB
Python
Executable file
#!/usr/bin/env python3
"""
Translate a specific set of keys in the ARB files.

This script accepts a list of keys and:
1. Finds the value in the source ARB file (intl_en.arb)
2. Finds each instance of the value in the target ARB files (intl_*.arb)
3. Translates the value using OpenAI (reusing translate.py model)
4. Replaces the value in the target ARB files with the translated value

Usage:
    python translate_keys.py --keys key1 key2 key3
    python translate_keys.py --keys-file keys.txt

WARNING: Has not been tested extensively. Has not been tested with pluralization or
complex placeholders. Verify results before committing.
"""

import argparse
import sys
from pathlib import Path
from typing import Any, List

try:
    from openai import OpenAI
except ImportError:
    print("Error: openai package not found. Please install it with: pip install openai")
    sys.exit(1)

# All necessary functions are implemented locally to avoid path issues.

# Get the client root directory (where this script is run from when called as a module).
client_root = (
    Path.cwd() if Path.cwd().name == "client" else Path(__file__).parent.parent.parent
)
l10n_dir = client_root / "lib" / "l10n"
def load_translations(lang_code: str) -> dict[str, str]:
    """Load translations for a language code using the correct path."""
    import json

    path_to_translations = l10n_dir / f"intl_{lang_code}.arb"
    if not path_to_translations.exists():
        translations = {}
    else:
        with open(path_to_translations, encoding="utf-8") as f:
            translations = json.loads(f.read())

    return translations
def load_supported_languages() -> List[tuple[str, str]]:
    """Load the supported languages from the languages.json file with the correct path."""
    import json

    languages_path = client_root / "scripts" / "languages.json"
    with open(languages_path, "r", encoding="utf-8") as f:
        raw_languages = json.load(f)

    languages: List[tuple[str, str]] = []
    for lang in raw_languages:
        assert isinstance(lang, dict), "Each language entry must be a dictionary."
        language_code = lang.get("language_code", None)
        language_name = lang.get("language_name", None)
        assert (
            language_code and language_name
        ), f"Each language must have a 'language_code' and 'language_name'. Found: {lang}"
        languages.append((language_code, language_name))
    return languages
def save_translations(lang_code: str, translations: dict[str, str]) -> None:
    """Save translations for a language code using the correct path."""
    import json
    from collections import OrderedDict
    from datetime import datetime

    path_to_translations = l10n_dir / f"intl_{lang_code}.arb"

    translations["@@locale"] = lang_code
    translations["@@last_modified"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")

    # Load existing data to preserve key order, if the file exists.
    if path_to_translations.exists():
        with open(path_to_translations, "r", encoding="utf-8") as f:
            try:
                existing_data = json.load(f, object_pairs_hook=OrderedDict)
            except json.JSONDecodeError:
                existing_data = OrderedDict()
    else:
        existing_data = OrderedDict()

    # Update existing keys and append new keys (preserving existing order).
    for key, value in translations.items():
        if key in existing_data:
            existing_data[key] = value  # update value; order remains unchanged
        else:
            existing_data[key] = value  # new key appended at the end

    with open(path_to_translations, "w", encoding="utf-8") as f:
        f.write(json.dumps(existing_data, indent=2, ensure_ascii=False))
def reconcile_metadata(
    lang_code: str,
    translation_keys: List[str],
    english_translations_dict: dict[str, Any],
) -> None:
    """
    For each translation key, update its metadata (the key prefixed with '@') by merging
    any existing metadata with computed metadata. For basic translations, if no metadata
    exists, add it; otherwise, leave it as is.
    """
    translations = load_translations(lang_code)

    for key in translation_keys:
        # Skip keys that weren't successfully translated.
        if key not in translations:
            continue

        translation = translations[key]
        meta_key = f"@{key}"
        existing_meta = translations.get(meta_key, {})
        assert isinstance(translation, str)

        # Case 1: Basic translations, no placeholders.
        if "{" not in translation:
            if not existing_meta:
                translations[meta_key] = {"type": "String", "placeholders": {}}
            # If metadata exists, leave it as is.

        # Case 2: Translations with placeholders (no pluralization).
        elif (
            "{" in translation
            and "plural," not in translation
            and "other{" not in translation
        ):
            # Compute placeholders.
            computed_placeholders = {}
            for placeholder in translation.split("{")[1:]:
                placeholder_name = placeholder.split("}")[0]
                computed_placeholders[placeholder_name] = {}
                # Obtain the placeholder type from the English translation, or default to {}.
                placeholder_type = (
                    english_translations_dict.get(meta_key, {})
                    .get("placeholders", {})
                    .get(placeholder_name, {})
                    .get("type")
                )
                if placeholder_type:
                    computed_placeholders[placeholder_name]["type"] = placeholder_type
            if existing_meta:
                # Merge computed placeholders into existing metadata.
                existing_meta.setdefault("type", "String")
                existing_meta["placeholders"] = computed_placeholders
                translations[meta_key] = existing_meta
            else:
                # Obtain the type from the English translation, or default to "String".
                translation_type = english_translations_dict.get(meta_key, {}).get(
                    "type", "String"
                )
                translations[meta_key] = {
                    "type": translation_type,
                    "placeholders": computed_placeholders,
                }

        # Case 3: Translations with pluralization.
        elif (
            "{" in translation and "plural," in translation and "other{" in translation
        ):
            # Extract placeholders appearing before the plural part.
            prefix = translation.split("plural,")[0].split("{")[1]
            placeholders_list = [
                p.strip() for p in prefix.split(",") if p.strip() != ""
            ]
            computed_placeholders = {ph: {} for ph in placeholders_list}
            for ph in placeholders_list:
                # Obtain the placeholder type from the English translation, or default to {}.
                placeholder_type = (
                    english_translations_dict.get(meta_key, {})
                    .get("placeholders", {})
                    .get(ph, {})
                    .get("type")
                )
                if placeholder_type:
                    computed_placeholders[ph]["type"] = placeholder_type
            if existing_meta:
                existing_meta.setdefault("type", "String")
                existing_meta["placeholders"] = computed_placeholders
                translations[meta_key] = existing_meta
            else:
                # Obtain the type from the English translation, or default to "String".
                translation_type = english_translations_dict.get(meta_key, {}).get(
                    "type", "String"
                )
                translations[meta_key] = {
                    "type": translation_type,
                    "placeholders": computed_placeholders,
                }

    save_translations(lang_code, translations)
def get_all_target_language_files() -> List[Path]:
    """Get all target ARB files (excluding intl_en.arb)."""
    arb_files = list(l10n_dir.glob("intl_*.arb"))
    # Filter out the English source file.
    return [f for f in arb_files if f.name != "intl_en.arb"]


def extract_language_code_from_filename(arb_path: Path) -> str:
    """Extract the language code from an ARB filename (e.g., intl_es.arb -> es)."""
    filename = arb_path.stem  # filename without extension
    return filename.replace("intl_", "")
def translate_batch_with_openai(
    translation_requests: dict[str, str],
    target_language: str,
    source_language: str = "English",
) -> dict[str, str]:
    """
    Translate a batch of texts using the OpenAI API.

    Args:
        translation_requests: Dictionary of {key: text} to translate
        target_language: Target language name (e.g., "Spanish", "French")
        source_language: Source language name (default: "English")

    Returns:
        Dictionary of {key: translated_text}
    """
    import json

    client = OpenAI()

    # Create a batch translation prompt.
    prompt = f"""
Translate the following JSON object from {source_language} to {target_language}.
Preserve any placeholders (text within curly braces like {{username}}) exactly as they are.
Preserve any special formatting or ICU message format syntax.
Return only a JSON object with the same keys but translated values.

JSON to translate: {json.dumps(translation_requests, indent=2)}
"""

    try:
        chat_completion = client.chat.completions.create(
            messages=[
                {
                    "role": "system",
                    "content": "You are a professional translator. You translate text accurately while preserving any placeholders and special formatting. Always respond with valid JSON only.",
                },
                {
                    "role": "user",
                    "content": prompt,
                },
            ],
            model="gpt-4o-mini",
            temperature=0.0,
        )

        response = chat_completion.choices[0].message.content.strip()

        # Clean up common JSON formatting issues (e.g., a Markdown code fence around the object).
        if response.startswith("```json"):
            response = response[7:]
        if response.endswith("```"):
            response = response[:-3]
        response = response.strip()

        # Parse the JSON response.
        translated_batch = json.loads(response)
        return translated_batch

    except json.JSONDecodeError as e:
        print(f"JSON parsing error when translating batch to {target_language}: {e}")
        print(f"Response was: {response[:200]}...")
        # Fall back to the original texts if parsing fails.
        return translation_requests
    except (ConnectionError, TimeoutError) as e:
        print(f"Network error translating batch to {target_language}: {e}")
        return translation_requests  # Return original texts if translation fails
    except ValueError as e:
        print(f"API error translating batch to {target_language}: {e}")
        return translation_requests  # Return original texts if translation fails
def get_language_display_name(lang_code: str) -> str:
    """Get the display name for a language code."""
    supported_languages = load_supported_languages()
    for code, display_name in supported_languages:
        if code == lang_code:
            return display_name

    # Fallback mapping for common language codes.
    fallback_names = {
        "es": "Spanish",
        "fr": "French",
        "de": "German",
        "it": "Italian",
        "pt": "Portuguese",
        "pt_PT": "Portuguese",
        "ja": "Japanese",
        "ko": "Korean",
        "zh": "Chinese",
        "zh_Hant": "Traditional Chinese",
        "ru": "Russian",
        "ar": "Arabic",
        "hi": "Hindi",
        "nl": "Dutch",
        "sv": "Swedish",
        "da": "Danish",
        "no": "Norwegian",
        "nb": "Norwegian",
        "fi": "Finnish",
        "pl": "Polish",
        "cs": "Czech",
        "sk": "Slovak",
        "hu": "Hungarian",
        "ro": "Romanian",
        "bg": "Bulgarian",
        "hr": "Croatian",
        "sl": "Slovenian",
        "et": "Estonian",
        "lv": "Latvian",
        "lt": "Lithuanian",
        "el": "Greek",
        "tr": "Turkish",
        "th": "Thai",
        "vi": "Vietnamese",
        "id": "Indonesian",
        "ms": "Malay",
        "fil": "Filipino",
        "he": "Hebrew",
        "uk": "Ukrainian",
        "be": "Belarusian",
        "ca": "Catalan",
        "gl": "Galician",
        "eu": "Basque",
    }

    return fallback_names.get(lang_code, lang_code.title())
def translate_keys(keys_to_translate: List[str]) -> None:
    """
    Translate specific keys in all target ARB files using batch translation.

    Args:
        keys_to_translate: List of keys to translate
    """
    # Load English translations (source).
    english_translations = load_translations("en")

    # Validate that all keys exist in English.
    missing_keys = [key for key in keys_to_translate if key not in english_translations]
    if missing_keys:
        print(
            f"Error: The following keys were not found in intl_en.arb: {missing_keys}"
        )
        return

    # Filter out metadata keys.
    keys_to_translate = [key for key in keys_to_translate if not key.startswith("@")]

    # Get all target ARB files.
    target_files = get_all_target_language_files()

    print(f"Found {len(target_files)} target language files")
    print(f"Translating {len(keys_to_translate)} keys: {keys_to_translate}")

    # Process keys in batches of 10.
    batch_size = 10

    for arb_file in target_files:
        lang_code = extract_language_code_from_filename(arb_file)
        lang_display_name = get_language_display_name(lang_code)

        print(f"\nProcessing {lang_display_name} ({lang_code})...")

        # Load current translations for this language.
        current_translations = load_translations(lang_code)

        # Track which keys were actually updated.
        updated_keys = []

        # Process keys in batches.
        for i in range(0, len(keys_to_translate), batch_size):
            batch = keys_to_translate[i : i + batch_size]

            # Prepare translation requests for this batch.
            translation_requests = {}
            for key in batch:
                english_text = english_translations[key]
                translation_requests[key] = english_text

            print(
                f"  Translating batch {i // batch_size + 1} ({len(batch)} keys): {batch}"
            )

            # Translate the batch.
            translated_batch = translate_batch_with_openai(
                translation_requests, lang_display_name
            )

            # Update translations and track updated keys.
            for key in batch:
                if key in translated_batch:
                    translated_text = translated_batch[key]
                    current_translations[key] = translated_text
                    updated_keys.append(key)
                    print(
                        f"    {key}: '{english_translations[key]}' -> '{translated_text}'"
                    )
                else:
                    print(f"    Warning: Key '{key}' not found in translation response")

        # Save the updated translations.
        if updated_keys:
            save_translations(lang_code, current_translations)

            # Reconcile metadata for updated keys.
            reconcile_metadata(lang_code, updated_keys, english_translations)

            print(f"  ✓ Updated {len(updated_keys)} keys in {lang_code}")
        else:
            print(f"  - No keys updated in {lang_code}")
def read_keys_from_file(file_path: Path) -> List[str]:
    """Read keys from a text file (one key per line)."""
    if not file_path.exists():
        raise FileNotFoundError(f"Keys file not found: {file_path}")

    with open(file_path, "r", encoding="utf-8") as f:
        keys = [
            line.strip()
            for line in f
            if line.strip() and not line.strip().startswith("#")
        ]

    return keys
def main():
    parser = argparse.ArgumentParser(
        description="Translate specific keys in ARB files",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
    pip install openai
    python translate_keys.py --keys about accept account
    python -m scripts.translate.translate_keys --keys-file scripts/translate/keys_to_translate.txt
""",
    )

    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--keys", nargs="+", help="List of keys to translate")
    group.add_argument(
        "--keys-file",
        type=Path,
        help="File containing keys to translate (one per line)",
    )

    args = parser.parse_args()

    # Get the keys to translate.
    if args.keys:
        keys_to_translate = args.keys
    else:
        keys_to_translate = read_keys_from_file(args.keys_file)

    if not keys_to_translate:
        print("Error: No keys to translate")
        sys.exit(1)

    # Change to the client directory so relative paths work.
    import os

    original_cwd = Path.cwd()
    client_dir = Path(__file__).parent.parent.parent

    try:
        os.chdir(client_dir)
        translate_keys(keys_to_translate)
    finally:
        os.chdir(original_cwd)

    print("\n✓ Translation complete!")


if __name__ == "__main__":
    main()
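`reconcile_metadata` recovers placeholder names with simple brace splitting, which works for plain ICU placeholders but not for escaped braces or nested plural bodies (consistent with the script's WARNING). A minimal sketch of that parsing; the `greeting` message below is a hypothetical example, not an entry from the repo's ARB files:

```python
# Hypothetical ARB message value (illustrative only).
translation = "Hello, {username}! You have {count} new messages."

# Placeholder names are recovered the same way reconcile_metadata does:
# take everything after each "{" up to the next "}".
placeholder_names = [part.split("}")[0] for part in translation.split("{")[1:]]
print(placeholder_names)  # -> ['username', 'count']
```

For a pluralized message like `{count, plural, other{...}}`, the script instead takes only the text between the first `{` and `plural,`, so nested placeholders inside the plural branches are not collected.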