feat: phonetic transcription v2 migration (#5640)

* docs: add PT v2 and token-info-feedback design docs

- Add phonetic-transcription-v2-design.instructions.md (client PT v2 migration)
- Add token-info-feedback-v2.instructions.md (client token feedback v2 migration)

* fix: update applyTo path for token info feedback v2 migration

* feat: Refactor phonetic transcription to v2 models and repository (in progress)

* feat: PT v2 migration - tts_phoneme rename, v1 cleanup, disambiguation, TTS integration

* feat: Update phonetic transcription v2 design document for endpoint changes and response structure

* docs: fix stale _storageKeys claim in pt-v2 design doc

* style: reformat PT v2 files with Dart 3.10 formatter (Flutter 3.38)

* feat: add speakingRate to TTS request model (default 0.85)

Passes speaking_rate to the choreo TTS endpoint. Default preserves
current behavior; can be overridden for single-word playback later.

* feat: use normal speed (1.0) for single-word TTS playback

The 0.85x slowdown is helpful for full sentences but makes single
words sound unnaturally slow. tts_controller._speakFromChoreo now
sends speakingRate=1.0. Full-sentence TTS via pangea_message_event
still defaults to 0.85.

* style: clean up formatting and reduce line breaks in TtsController

* fix: env goofiness

* formatting, fix linter issues

* don't return widgets from functions

---------

Co-authored-by: ggurdin <ggurdin@gmail.com>
Co-authored-by: ggurdin <46800240+ggurdin@users.noreply.github.com>
wcjord 2026-02-10 16:29:26 -05:00 committed by GitHub
parent 59ff104a30
commit 0e681c4d68
27 changed files with 1022 additions and 498 deletions


@@ -0,0 +1,188 @@
---
applyTo: "lib/pangea/phonetic_transcription/**, lib/pangea/text_to_speech/**, client/controllers/tts_controller.dart"
---
# Phonetic Transcription v2 Design
## 1. Overview
Phonetic transcription provides pronunciations for L2 tokens, tailored to the user's L1. It applies to **all L1/L2 combinations** — not just non-Latin scripts (e.g., Spanish "lluvia" → "YOO-vee-ah" for an English L1 speaker).
## 2. Endpoint & Models
### Endpoint
`POST /choreo/phonetic_transcription_v2`
### Request
`surface` (string) + `lang_code` + `user_l1` + `user_l2`
- `lang_code`: language of the token (may differ from `user_l2` for loanwords/code-switching).
- `user_l2`: included in the base schema but does not affect pronunciation — only `lang_code` and `user_l1` matter.
### Response
Flat `pronunciations` array, each with `transcription`, `tts_phoneme`, `ud_conditions`. Server-cached via CMS (subsequent calls are instant).
**Response example** (Chinese — `tts_phoneme` uses pinyin):
```json
{
  "pronunciations": [
    {
      "transcription": "hái",
      "tts_phoneme": "hai2",
      "ud_conditions": "Pos=ADV"
    },
    {
      "transcription": "huán",
      "tts_phoneme": "huan2",
      "ud_conditions": "Pos=VERB"
    }
  ]
}
```
**Response example** (Spanish — `tts_phoneme` uses IPA):
```json
{
  "pronunciations": [
    {
      "transcription": "YOO-vee-ah",
      "tts_phoneme": "ˈʎubja",
      "ud_conditions": null
    }
  ]
}
```
### `tts_phoneme` Format by Language
The PT v2 handler selects the correct phoneme format based on `lang_code`. The client treats `tts_phoneme` as an opaque string — it never needs to know the alphabet.
| `lang_code` | Phoneme format | `alphabet` (resolved by TTS server) | Example |
| ------------------------ | ----------------------- | ----------------------------------- | ------------ |
| `cmn-CN`, `cmn-TW`, `zh` | Pinyin + tone numbers | `pinyin` | `hai2` |
| `yue` (Cantonese) | Jyutping + tone numbers | `jyutping` | `sik6 faan6` |
| `ja` | Yomigana (hiragana) | `yomigana` | `なか` |
| All others | IPA | `ipa` | `ˈʎubja` |
---
## 3. Disambiguation Logic
When the server returns multiple pronunciations (heteronyms), the client chooses which to display based on UD context.
### 3.1 Chat Page (WordZoomWidget)
Available context: `token.pos`, `token._morph` (full morph features), `token.text.content`.
**`lang_code` source**: `PangeaMessageEvent.messageDisplayLangCode` — the detected language of the text, not always `user_l2`.
**Strategy**: Match `ud_conditions` against all available UD info (POS + morph features).
### 3.2 Analytics Page (VocabDetailsView)
Available context: `constructId.lemma`, `constructId.category` (lowercased POS).
**`lang_code`**: Always `userL2Code`.
**Surface**: Use the **lemma** as `surface` in the PT request (dictionary pronunciation). Remove audio buttons beside individual forms — users access form pronunciation in chat.
**Strategy**: Match `ud_conditions` against lemma + POS only. Compare case-insensitively (`category` is lowercased, `ud_conditions` uses uppercase).
### 3.3 Fallback
If disambiguation doesn't produce a single match, **display all pronunciations** (e.g. `"hái / huán"`), each with its own play button for TTS using its `tts_phoneme` (see §5).
### 3.4 Parsing `ud_conditions`
Keys use **PascalCase** (`Pos`, `Tense`, `VerbForm`). Parse:
1. Split on `;` → individual conditions.
2. Split each on `=` → feature-value pairs.
3. `Pos=X` → compare against `token.pos` (or `constructId.category`, case-insensitively).
4. Other features → compare against `token.morph`.
5. A pronunciation matches if **all** conditions are satisfied.
6. `null` `ud_conditions` = unconditional (unambiguous word).
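The parsing and matching rules above can be sketched as follows (Python for illustration only; the client implementation is in Dart, and the function names here are hypothetical):

```python
def matches(ud_conditions, pos, morph):
    """True if all conditions in ud_conditions hold for this token.

    ud_conditions: e.g. "Pos=VERB;Tense=Past", or None (unconditional).
    pos: the token's POS tag (token.pos or constructId.category).
    morph: the token's morph feature map, e.g. {"Tense": "Past"}.
    """
    if ud_conditions is None:
        return True  # null conditions: unambiguous word, always matches
    for condition in ud_conditions.split(";"):
        feature, _, value = condition.partition("=")
        if feature == "Pos":
            # POS comparison is case-insensitive (category may be lowercased)
            if pos.lower() != value.lower():
                return False
        elif str(morph.get(feature)) != value:
            return False  # other features compare against token.morph
    return True


def disambiguate(pronunciations, pos, morph):
    """Return the single matching pronunciation, or None (fall back to all)."""
    hits = [p for p in pronunciations if matches(p["ud_conditions"], pos, morph)]
    return hits[0] if len(hits) == 1 else None
```

When `disambiguate` returns `None`, the client displays all pronunciations per §3.3.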
---
## 4. Local Caching
- **Key**: `surface + lang_code + user_l1`. Exclude `user_l2` (doesn't affect pronunciation). Exclude UD context (full pronunciation list is cached; disambiguation at display time).
- **Memory cache**: In-flight deduplication + fast reads; short TTL (~10 min).
- **Disk cache**: `GetStorage`, **24-hour TTL** (down from 7 days — server CMS cache means re-fetching is cheap, daily refresh ensures corrections propagate).
- **Invalidation**: Lazy eviction on read.
- **Logout**: PT storage keys registered in `_storageKeys` in `pangea_controller.dart` (both v1 `phonetic_transcription_storage` and v2 `phonetic_transcription_v2_storage`).
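A sketch of the key construction and read-side lazy eviction (Python for illustration only; the actual client uses `GetStorage` in Dart, and the names here are hypothetical):

```python
from datetime import datetime, timedelta

DISK_TTL = timedelta(hours=24)  # down from 7 days in v1


def pt_cache_key(surface, lang_code, user_l1):
    # user_l2 and UD context are deliberately excluded from the key
    return f"{surface}|{lang_code}|{user_l1}"


def read_cached(storage, key, now=None):
    """Read a cached PT response, lazily evicting expired entries."""
    now = now or datetime.now()
    entry = storage.get(key)
    if entry is None:
        return None
    if now - entry["timestamp"] >= DISK_TTL:
        del storage[key]  # lazy eviction happens on read, not on a timer
        return None
    return entry["response"]
```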
---
## 5. TTS with Phoneme Pronunciation
PT covers **isolated words** only. Whole-message audio uses the existing TTS flow (unaffected).
### Problem
Ambiguous surface forms (e.g., 还 → hái vs huán) get arbitrary pronunciation from device TTS because it has no context.
### Decision Flow
The branch point is **how many entries are in the PT v2 `pronunciations` array** for this word.
```
PT response has 1 pronunciation? (unambiguous word)
  → YES: Use surface text for TTS as today (device or server fallback).
         Device TTS will pronounce it correctly — no phoneme override needed.
  → NO (2+ pronunciations — heteronym):
    Can disambiguate to exactly one using UD context? (§3)
      → YES: Send that pronunciation's tts_phoneme to _speakFromChoreo.
      → NO: Send first pronunciation's tts_phoneme to _speakFromChoreo as default,
            or let user tap a specific pronunciation to play its tts_phoneme.
```
**The TTS request always contains at most one `tts_phoneme` string.** Disambiguation happens _before_ calling TTS.
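The decision flow reduces to a small selection function, sketched here in Python (the client code is Dart; these names are hypothetical):

```python
def choose_tts_phoneme(pronunciations, disambiguated=None):
    """Pick at most one tts_phoneme for the TTS request, or None.

    pronunciations: the PT v2 `pronunciations` list for this word.
    disambiguated: the single pronunciation resolved via UD context (§3),
                   or None when disambiguation failed.
    """
    if len(pronunciations) <= 1:
        return None  # unambiguous: plain surface-text TTS is correct
    if disambiguated is not None:
        return disambiguated["tts_phoneme"]
    # unresolved heteronym: default to the first pronunciation
    return pronunciations[0]["tts_phoneme"]
```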
### Implementation
**PT v2 handler** (choreo):
1. `tts_phoneme` on every `Pronunciation` — format determined by `lang_code`:
- Chinese (`zh`, `cmn-CN`, `cmn-TW`): pinyin with tone numbers (e.g. `hai2`)
- Cantonese (`yue`): jyutping with tone numbers (e.g. `sik6`)
- Japanese (`ja`): yomigana in hiragana (e.g. `なか`)
- All others: IPA (e.g. `ˈʎubja`)
2. Eval function validates format matches expected type for the language.
**TTS server** (choreo):
1. `tts_phoneme: Optional[str] = None` on `TextToSpeechRequest`.
2. Resolves SSML `alphabet` from `lang_code` (see table in §2). Client never sends the alphabet.
3. When `tts_phoneme` is set, wraps text in `<phoneme alphabet="{resolved}" ph="{tts_phoneme}">{text}</phoneme>` inside existing SSML `<speak>` tags.
4. `tts_phoneme` included in cache key.
5. Google Cloud TTS suppresses SSML mark timepoints inside `<phoneme>` tags → duration estimated via `estimate_duration_ms()`.
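Steps 2 and 3 can be sketched as follows (a simplified Python illustration of the server-side wrapping, not the actual choreo code; the real handler builds the full SSML document):

```python
# lang_code → SSML phoneme alphabet (see the table in §2); IPA is the default
ALPHABET_BY_LANG = {
    "cmn-CN": "pinyin",
    "cmn-TW": "pinyin",
    "zh": "pinyin",
    "yue": "jyutping",
    "ja": "yomigana",
}


def build_ssml(text, lang_code, tts_phoneme=None):
    """Wrap text in a <phoneme> tag when a phoneme override is supplied."""
    if tts_phoneme is None:
        return f"<speak>{text}</speak>"  # behavior unchanged without override
    alphabet = ALPHABET_BY_LANG.get(lang_code, "ipa")  # resolved server-side
    return (
        f'<speak><phoneme alphabet="{alphabet}" ph="{tts_phoneme}">'
        f"{text}</phoneme></speak>"
    )
```

The client never sends the alphabet; only `tts_phoneme` crosses the wire.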
**Client**:
1. `ttsPhoneme` field on `TextToSpeechRequestModel` and `DisambiguationResult`.
2. `ttsPhoneme` param on `TtsController.tryToSpeak` and `_speakFromChoreo`.
3. When `ttsPhoneme` is provided, skips device TTS and calls `_speakFromChoreo`.
4. When `ttsPhoneme` is not provided, behavior unchanged.
5. Client treats `ttsPhoneme` as an opaque string — no language-specific logic needed.
### Cache-Only Phoneme Resolution
`TtsController.tryToSpeak` resolves `ttsPhoneme` from the **local PT v2 cache** (`_resolveTtsPhonemeFromCache`) rather than making a server call. This is a deliberate tradeoff:
- **Why cache-only**: TTS is latency-sensitive — adding a blocking PT v2 network call before every word playback would degrade the experience. By the time a user taps to play a word, the PT v2 response has almost certainly already been fetched and cached (it was needed to render the transcription overlay).
- **What if the cache misses**: The word plays without phoneme override, using device TTS or plain server TTS. This is the same behavior as before PT v2 existed — acceptable because heteronyms are ~5% of words. The user still gets audio, just without guaranteed disambiguation.
- **No silent failures**: A cache miss doesn't block or error — it falls through gracefully.
---
## 6. Future Improvements
- **Finetuning**: Once CMS accumulates enough examples, benchmark and train a smaller finetuned model on the server to replace `GPT_5_2`.
- **Legacy v1 endpoint removal**: The v1 `/choreo/phonetic_transcription` endpoint can be removed server-side once all clients are on v2.


@@ -0,0 +1,162 @@
---
applyTo: "lib/pangea/token_info_feedback/**, lib/**"
---
# Token Info Feedback — v2 Migration (Client)
Migrate the token info feedback flow to use the v2 endpoint and v2 phonetic transcription types. This is a client-side companion to the choreo instructions at `2-step-choreographer/.github/instructions/token-info-feedback-pt-v2.instructions.md`.
## Context
Token info feedback lets users flag incorrect token data (POS, language, phonetics, lemma). The server evaluates the feedback via LLM, conditionally calls sub-handlers, and returns updated fields. The client applies the updates to local caches and optionally edits the Matrix message.
**Why migrate**: The phonetics field currently sends a plain `String` and receives a `PhoneticTranscriptionResponse` (v1 nested types). The v2 endpoint expects `PTRequest` + `PTResponse` and returns `PTResponse` (flat v2 types). This aligns token feedback with the broader PT v2 migration.
**Staleness detection**: The server compares the client-sent phonetics against its CMS cache. If they differ and the LLM didn't request changes, the server returns the CMS version as `updatedPhonetics` so the client refreshes its local cache. This means `updatedPhonetics` may be non-null even when the user's feedback didn't trigger phonetic changes.
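That server-side decision can be summarized in a short sketch (Python; hypothetical names, not the actual handler):

```python
def resolve_updated_phonetics(client_pt_response, cms_pt_response, llm_update=None):
    """Decide what the feedback endpoint returns as updated_phonetics.

    Returns a PT response the client should write into its cache, or None.
    """
    if llm_update is not None:
        return llm_update  # the LLM regenerated phonetics from feedback
    if client_pt_response != cms_pt_response:
        return cms_pt_response  # client cache is stale: push the CMS version
    return None  # caches agree and no changes were requested
```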
---
## 1. Endpoint URL
**v1**: `{choreoEndpoint}/token/feedback`
**v2**: `{choreoEndpoint}/token/feedback_v2`
Update `PApiUrls.tokenFeedback` (or add a new constant) in [urls.dart](lib/pangea/common/network/urls.dart).
---
## 2. Request Changes
### Replace `phonetics` with `ptRequest` + `ptResponse`
**v1**: `phonetics: String` — a rendered transcription like `"hái"`, extracted by `PhoneticTranscriptionBuilder.transcription` (the first token's `phoneticL1Transcription.content`).
**v2**: Two new fields replace `phonetics`:
- `ptRequest: PTRequest?` — the PT request used to fetch phonetics (surface, langCode, userL1, userL2). The server passes this directly to `pt_v2_handler.get()` when feedback triggers a phonetics re-evaluation.
- `ptResponse: PTResponse?` — the cached PT response containing `List<Pronunciation>`. The server uses `ptResponse.pronunciations` for the evaluation prompt and staleness detection.
This means `TokenInfoFeedbackRequestData` drops `phonetics: String` and adds `ptRequest: PTRequest?` + `ptResponse: PTResponse?`.
### What feeds into `ptRequest` / `ptResponse`
The data flows through this chain:
1. **`PhoneticTranscriptionBuilder`** resolves a `PhoneticTranscriptionResponse` (v1) → extracts a `String`.
2. **`TokenFeedbackButton`** receives the string via `onFlagTokenInfo(lemmaInfo, transcript)`.
3. **Call sites** (`reading_assistance_content.dart`, `analytics_details_popup.dart`) put the string into `TokenInfoFeedbackRequestData(phonetics: transcript)`.
After v2 migration, this chain must change:
1. **`PhoneticTranscriptionBuilder`** resolves a v2 response → exposes both the `PTRequest` it used and the `PTResponse` it received.
2. **`TokenFeedbackButton`** callback signature changes: `Function(LemmaInfoResponse, PTRequest, PTResponse)`.
3. **Call sites** pass both objects into the updated request data: `TokenInfoFeedbackRequestData(ptRequest: ptReq, ptResponse: ptRes)`.
### `toJson()` serialization
v1:
```json
{ "phonetics": "hái" }
```
v2:
```json
{
  "pt_request": {
    "surface": "还",
    "lang_code": "zh",
    "user_l1": "en",
    "user_l2": "zh"
  },
  "pt_response": {
    "pronunciations": [
      { "transcription": "hái", "tts_phoneme": "hai2", "ud_conditions": "Pos=ADV" },
      { "transcription": "huán", "tts_phoneme": "huan2", "ud_conditions": "Pos=VERB" }
    ]
  }
}
```
All other request fields (`userId`, `roomId`, `fullText`, `detectedLanguage`, `tokens`, `selectedToken`, `lemmaInfo`, `wordCardL1`) are unchanged.
---
## 3. Response Changes
### `updatedPhonetics` field
**v1**: `PhoneticTranscriptionResponse?` — deeply nested v1 types with `phoneticTranscriptionResult.phoneticTranscription[0].phoneticL1Transcription.content`.
**v2**: The v2 response type (e.g., `PhoneticTranscriptionV2Response` with `pronunciations: List<Pronunciation>`). Deserialized via the v2 model's `fromJson()`.
**New behavior**: `updatedPhonetics` may be non-null in two cases:
1. The LLM evaluated user feedback and generated new phonetics (same as v1).
2. The server detected that the client's cached phonetics are stale compared to CMS. In this case, the server returns the current CMS version so the client can refresh.
Either way, the client should apply the update to its local cache (see §4).
All other response fields (`userFriendlyMessage`, `updatedToken`, `updatedLemmaInfo`, `updatedLanguage`) are unchanged.
---
## 4. Cache Side-Effects in `_submitFeedback`
The dialog applies server updates to local caches. The phonetic cache write must change:
### v1 (current)
```dart
Future<void> _updatePhoneticTranscription(
  PhoneticTranscriptionResponse response,
) async {
  final req = PhoneticTranscriptionRequest(
    arc: LanguageArc(l1: ..., l2: ...),
    content: response.content,
  );
  await PhoneticTranscriptionRepo.set(req, response);
}
```
This constructs a v1 `PhoneticTranscriptionRequest` to use as the cache key, then writes the v1 response.
### v2 (target)
Construct the v2 cache key (`surface + lang_code + user_l1`) and write the v2 response to the v2 PT cache. The exact implementation depends on how the PT v2 repo's `set()` method is designed during the broader PT migration. The key pieces are:
- **Cache key inputs**: `surface` = the token's surface text, `langCode` = `this.langCode` (from the dialog), `userL1` = `requestData.wordCardL1`.
- **Response type**: The v2 response containing `List<Pronunciation>`.
- **Cache target**: The v2 PT cache (not the v1 `phonetic_transcription_storage`).
---
## 5. Files to Modify
| File | Change |
|------|--------|
| `token_info_feedback_request.dart` | Drop `phonetics: String`. Add `ptRequest: PTRequest?` + `ptResponse: PTResponse?`. Update `toJson()`, `==`, `hashCode`. |
| `token_info_feedback_response.dart` | `updatedPhonetics: PhoneticTranscriptionResponse?` → v2 response type. Update `fromJson()`, `toJson()`, `==`, `hashCode`. Remove v1 `PhoneticTranscriptionResponse` import. |
| `token_info_feedback_dialog.dart` | Update `_updatePhoneticTranscription` to use v2 cache key/types. Remove v1 `PhoneticTranscriptionRequest`, `PhoneticTranscriptionResponse`, `LanguageArc`, `PLanguageStore` imports. |
| `token_info_feedback_repo.dart` | Update URL to `PApiUrls.tokenFeedbackV2` (or equivalent). |
| `token_feedback_button.dart` *(outside this folder)* | Change callback from `(LemmaInfoResponse, String)` to `(LemmaInfoResponse, PTRequest, PTResponse)`. Update how the PT objects are extracted from the builder. |
| Call sites *(outside this folder)* | `reading_assistance_content.dart`, `analytics_details_popup.dart` — update `onFlagTokenInfo` to pass `PTRequest` + `PTResponse` into `TokenInfoFeedbackRequestData`. |
| `urls.dart` *(outside this folder)* | Add `tokenFeedbackV2` URL constant. |
---
## 6. Dependency on PT v2 Migration
This migration **depends on** the core PT v2 models existing on the client:
- `Pronunciation` model (with `transcription`, `tts_phoneme`, `ud_conditions`)
- V2 response model (with `pronunciations: List<Pronunciation>`)
- V2 repo with a `set()` method that accepts the v2 cache key
These are created as part of the main PT v2 migration (see `phonetic-transcription-v2-design.instructions.md` §3). Implement the core PT v2 models first, then update token info feedback.
---
## 7. Checklist
- [ ] Replace `phonetics` field with `ptRequest` + `ptResponse` in request model
- [ ] Update `updatedPhonetics` field type in response model
- [ ] Update `_updatePhoneticTranscription` cache write in dialog
- [ ] Update `TokenFeedbackButton` callback signature to `(LemmaInfoResponse, PTRequest, PTResponse)`
- [ ] Update call sites to pass `PTRequest` + `PTResponse`
- [ ] Update URL to v2 endpoint
- [ ] Remove all v1 PT type imports from token_info_feedback files


@@ -23,6 +23,7 @@ import 'package:fluffychat/pangea/lemmas/lemma_info_response.dart';
import 'package:fluffychat/pangea/morphs/default_morph_mapping.dart';
import 'package:fluffychat/pangea/morphs/morph_models.dart';
import 'package:fluffychat/pangea/morphs/morph_repo.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
import 'package:fluffychat/pangea/token_info_feedback/show_token_feedback_dialog.dart';
import 'package:fluffychat/pangea/token_info_feedback/token_info_feedback_request.dart';
import 'package:fluffychat/widgets/matrix.dart';
@@ -163,7 +164,8 @@ class ConstructAnalyticsViewState extends State<ConstructAnalyticsView> {
Future<void> onFlagTokenInfo(
PangeaToken token,
LemmaInfoResponse lemmaInfo,
String phonetics,
PTRequest ptRequest,
PTResponse ptResponse,
) async {
final requestData = TokenInfoFeedbackRequestData(
userId: Matrix.of(context).client.userID!,
@@ -172,7 +174,8 @@ class ConstructAnalyticsViewState extends State<ConstructAnalyticsView> {
selectedToken: 0,
wordCardL1: MatrixState.pangeaController.userController.userL1Code!,
lemmaInfo: lemmaInfo,
phonetics: phonetics,
ptRequest: ptRequest,
ptResponse: ptResponse,
);
await TokenFeedbackUtil.showTokenFeedbackDialog(


@@ -13,6 +13,7 @@ import 'package:fluffychat/pangea/events/models/pangea_token_model.dart';
import 'package:fluffychat/pangea/events/models/pangea_token_text_model.dart';
import 'package:fluffychat/pangea/lemmas/lemma.dart';
import 'package:fluffychat/pangea/lemmas/lemma_info_response.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
import 'package:fluffychat/pangea/toolbar/word_card/word_zoom_widget.dart';
import 'package:fluffychat/widgets/adaptive_dialogs/show_ok_cancel_alert_dialog.dart';
import 'package:fluffychat/widgets/future_loading_dialog.dart';
@@ -92,14 +93,19 @@ class VocabDetailsView extends StatelessWidget {
langCode:
MatrixState.pangeaController.userController.userL2Code!,
construct: constructId,
pos: constructId.category,
onClose: Navigator.of(context).pop,
onFlagTokenInfo:
(LemmaInfoResponse lemmaInfo, String phonetics) =>
controller.onFlagTokenInfo(
token,
lemmaInfo,
phonetics,
),
(
LemmaInfoResponse lemmaInfo,
PTRequest ptRequest,
PTResponse ptResponse,
) => controller.onFlagTokenInfo(
token,
lemmaInfo,
ptRequest,
ptResponse,
),
reloadNotifier: controller.reloadNotifier,
maxWidth: double.infinity,
),


@@ -214,6 +214,7 @@ class VocabAnalyticsListView extends StatelessWidget {
.pangeaController
.userController
.userL2Code!,
pos: vocabItem.id.category,
);
AnalyticsNavigationUtil.navigateToAnalytics(
context: context,


@@ -197,5 +197,7 @@ class PangeaController {
'course_activity_storage',
'course_location_media_storage',
'language_mismatch',
'phonetic_transcription_storage',
'phonetic_transcription_v2_storage',
];
}


@@ -44,6 +44,8 @@ class PApiUrls {
static String speechToText = "${PApiUrls._choreoEndpoint}/speech_to_text";
static String phoneticTranscription =
"${PApiUrls._choreoEndpoint}/phonetic_transcription";
static String phoneticTranscriptionV2 =
"${PApiUrls._choreoEndpoint}/phonetic_transcription_v2";
static String messageActivityGeneration =
"${PApiUrls._choreoEndpoint}/practice";
@@ -68,6 +70,8 @@ class PApiUrls {
"${PApiUrls._choreoEndpoint}/activity_plan/feedback";
static String tokenFeedback = "${PApiUrls._choreoEndpoint}/token/feedback";
static String tokenFeedbackV2 =
"${PApiUrls._choreoEndpoint}/token/feedback_v2";
static String morphFeaturesAndTags = "${PApiUrls._choreoEndpoint}/morphs";
static String constructSummary =


@@ -1,13 +1,15 @@
import 'package:flutter/material.dart';
import 'package:fluffychat/pangea/common/utils/async_state.dart';
import 'package:fluffychat/pangea/events/models/pangea_token_text_model.dart';
import 'package:fluffychat/pangea/languages/language_arc_model.dart';
import 'package:fluffychat/pangea/languages/language_model.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_request.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_repo.dart';
import 'package:fluffychat/widgets/matrix.dart';
import 'phonetic_transcription_repo.dart';
/// Fetches and exposes the v2 [PTResponse] for a given surface text.
///
/// Exposes both the [PTRequest] used and the full [PTResponse] received,
/// which callers need for token feedback and disambiguation.
class PhoneticTranscriptionBuilder extends StatefulWidget {
final LanguageModel textLanguage;
final String text;
@@ -34,7 +36,7 @@ class PhoneticTranscriptionBuilder extends StatefulWidget {
class PhoneticTranscriptionBuilderState
extends State<PhoneticTranscriptionBuilder> {
final ValueNotifier<AsyncState<String>> _loader = ValueNotifier(
final ValueNotifier<AsyncState<PTResponse>> _loader = ValueNotifier(
const AsyncState.idle(),
);
@@ -61,23 +63,31 @@
super.dispose();
}
AsyncState<String> get state => _loader.value;
AsyncState<PTResponse> get state => _loader.value;
bool get isError => _loader.value is AsyncError;
bool get isLoaded => _loader.value is AsyncLoaded;
String? get transcription =>
isLoaded ? (_loader.value as AsyncLoaded<String>).value : null;
PhoneticTranscriptionRequest get _request => PhoneticTranscriptionRequest(
arc: LanguageArc(
l1: MatrixState.pangeaController.userController.userL1!,
l2: widget.textLanguage,
),
content: PangeaTokenText.fromString(widget.text),
/// The full v2 response (for feedback and disambiguation).
PTResponse? get ptResponse =>
isLoaded ? (_loader.value as AsyncLoaded<PTResponse>).value : null;
/// The request that was used to fetch this response.
PTRequest get ptRequest => _request;
/// Convenience: the first transcription string (for simple display).
String? get transcription =>
ptResponse?.pronunciations.firstOrNull?.transcription;
PTRequest get _request => PTRequest(
surface: widget.text,
langCode: widget.textLanguage.langCode,
userL1: MatrixState.pangeaController.userController.userL1Code ?? 'en',
userL2: MatrixState.pangeaController.userController.userL2Code ?? 'en',
);
Future<void> _load() async {
_loader.value = const AsyncState.loading();
final resp = await PhoneticTranscriptionRepo.get(
final resp = await PTV2Repo.get(
MatrixState.pangeaController.userController.accessToken,
_request,
);
@@ -85,16 +95,7 @@
if (!mounted) return;
resp.isError
? _loader.value = AsyncState.error(resp.asError!.error)
: _loader.value = AsyncState.loaded(
resp
.asValue!
.value
.phoneticTranscriptionResult
.phoneticTranscription
.first
.phoneticL1Transcription
.content,
);
: _loader.value = AsyncState.loaded(resp.asValue!.value);
}
@override


@@ -1,197 +0,0 @@
import 'dart:convert';
import 'dart:io';
import 'package:async/async.dart';
import 'package:get_storage/get_storage.dart';
import 'package:http/http.dart';
import 'package:fluffychat/pangea/common/config/environment.dart';
import 'package:fluffychat/pangea/common/network/requests.dart';
import 'package:fluffychat/pangea/common/network/urls.dart';
import 'package:fluffychat/pangea/common/utils/error_handler.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_request.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_response.dart';
class _PhoneticTranscriptionMemoryCacheItem {
final Future<Result<PhoneticTranscriptionResponse>> resultFuture;
final DateTime timestamp;
const _PhoneticTranscriptionMemoryCacheItem({
required this.resultFuture,
required this.timestamp,
});
}
class _PhoneticTranscriptionStorageCacheItem {
final PhoneticTranscriptionResponse response;
final DateTime timestamp;
const _PhoneticTranscriptionStorageCacheItem({
required this.response,
required this.timestamp,
});
Map<String, dynamic> toJson() {
return {
'response': response.toJson(),
'timestamp': timestamp.toIso8601String(),
};
}
static _PhoneticTranscriptionStorageCacheItem fromJson(
Map<String, dynamic> json,
) {
return _PhoneticTranscriptionStorageCacheItem(
response: PhoneticTranscriptionResponse.fromJson(json['response']),
timestamp: DateTime.parse(json['timestamp']),
);
}
}
class PhoneticTranscriptionRepo {
// In-memory cache
static final Map<String, _PhoneticTranscriptionMemoryCacheItem> _cache = {};
static const Duration _cacheDuration = Duration(minutes: 10);
static const Duration _storageDuration = Duration(days: 7);
// Persistent storage
static final GetStorage _storage = GetStorage(
'phonetic_transcription_storage',
);
static Future<Result<PhoneticTranscriptionResponse>> get(
String accessToken,
PhoneticTranscriptionRequest request,
) async {
await GetStorage.init('phonetic_transcription_storage');
// 1. Try memory cache
final cached = _getCached(request);
if (cached != null) {
return cached;
}
// 2. Try disk cache
final stored = _getStored(request);
if (stored != null) {
return Future.value(Result.value(stored));
}
// 3. Fetch from network (safe future)
final future = _safeFetch(accessToken, request);
// 4. Save to in-memory cache
_cache[request.hashCode.toString()] = _PhoneticTranscriptionMemoryCacheItem(
resultFuture: future,
timestamp: DateTime.now(),
);
// 5. Write to disk *after* the fetch finishes, without rethrowing
writeToDisk(request, future);
return future;
}
static Future<void> set(
PhoneticTranscriptionRequest request,
PhoneticTranscriptionResponse resultFuture,
) async {
await GetStorage.init('phonetic_transcription_storage');
final key = request.hashCode.toString();
try {
final item = _PhoneticTranscriptionStorageCacheItem(
response: resultFuture,
timestamp: DateTime.now(),
);
await _storage.write(key, item.toJson());
_cache.remove(key); // Invalidate in-memory cache
} catch (e, s) {
ErrorHandler.logError(e: e, s: s, data: {'request': request.toJson()});
}
}
static Future<Result<PhoneticTranscriptionResponse>> _safeFetch(
String token,
PhoneticTranscriptionRequest request,
) async {
try {
final resp = await _fetch(token, request);
return Result.value(resp);
} catch (e, s) {
// Ensure error is logged and converted to a Result
ErrorHandler.logError(e: e, s: s, data: request.toJson());
return Result.error(e);
}
}
static Future<PhoneticTranscriptionResponse> _fetch(
String accessToken,
PhoneticTranscriptionRequest request,
) async {
final req = Requests(
choreoApiKey: Environment.choreoApiKey,
accessToken: accessToken,
);
final Response res = await req.post(
url: PApiUrls.phoneticTranscription,
body: request.toJson(),
);
if (res.statusCode != 200) {
throw HttpException(
'Failed to fetch phonetic transcription: ${res.statusCode} ${res.reasonPhrase}',
);
}
return PhoneticTranscriptionResponse.fromJson(
jsonDecode(utf8.decode(res.bodyBytes)),
);
}
static Future<Result<PhoneticTranscriptionResponse>>? _getCached(
PhoneticTranscriptionRequest request,
) {
final now = DateTime.now();
final key = request.hashCode.toString();
// Remove stale entries first
_cache.removeWhere(
(_, item) => now.difference(item.timestamp) >= _cacheDuration,
);
final item = _cache[key];
return item?.resultFuture;
}
static Future<void> writeToDisk(
PhoneticTranscriptionRequest request,
Future<Result<PhoneticTranscriptionResponse>> resultFuture,
) async {
final result = await resultFuture; // SAFE: never throws
if (!result.isValue) return; // only cache successful responses
await set(request, result.asValue!.value);
}
static PhoneticTranscriptionResponse? _getStored(
PhoneticTranscriptionRequest request,
) {
final key = request.hashCode.toString();
try {
final entry = _storage.read(key);
if (entry == null) return null;
final item = _PhoneticTranscriptionStorageCacheItem.fromJson(entry);
if (DateTime.now().difference(item.timestamp) >= _storageDuration) {
_storage.remove(key);
return null;
}
return item.response;
} catch (e, s) {
ErrorHandler.logError(e: e, s: s, data: {'request': request.toJson()});
_storage.remove(key);
return null;
}
}
}


@@ -1,46 +0,0 @@
import 'package:fluffychat/pangea/events/models/pangea_token_text_model.dart';
import 'package:fluffychat/pangea/languages/language_arc_model.dart';
class PhoneticTranscriptionRequest {
final LanguageArc arc;
final PangeaTokenText content;
final bool requiresTokenization;
PhoneticTranscriptionRequest({
required this.arc,
required this.content,
this.requiresTokenization = false,
});
factory PhoneticTranscriptionRequest.fromJson(Map<String, dynamic> json) {
return PhoneticTranscriptionRequest(
arc: LanguageArc.fromJson(json['arc'] as Map<String, dynamic>),
content: PangeaTokenText.fromJson(
json['content'] as Map<String, dynamic>,
),
requiresTokenization: json['requires_tokenization'] ?? true,
);
}
Map<String, dynamic> toJson() {
return {
'arc': arc.toJson(),
'content': content.toJson(),
'requires_tokenization': requiresTokenization,
};
}
String get storageKey => '${arc.l1}-${arc.l2}-${content.hashCode}';
@override
int get hashCode =>
content.hashCode ^ arc.hashCode ^ requiresTokenization.hashCode;
@override
bool operator ==(Object other) {
return other is PhoneticTranscriptionRequest &&
other.content == content &&
other.arc == arc &&
other.requiresTokenization == requiresTokenization;
}
}


@@ -1,153 +0,0 @@
import 'package:fluffychat/pangea/events/models/pangea_token_text_model.dart';
import 'package:fluffychat/pangea/languages/language_arc_model.dart';
enum PhoneticTranscriptionDelimEnum { sp, noSp }
extension PhoneticTranscriptionDelimEnumExt on PhoneticTranscriptionDelimEnum {
String get value {
switch (this) {
case PhoneticTranscriptionDelimEnum.sp:
return " ";
case PhoneticTranscriptionDelimEnum.noSp:
return "";
}
}
static PhoneticTranscriptionDelimEnum fromString(String s) {
switch (s) {
case " ":
return PhoneticTranscriptionDelimEnum.sp;
case "":
return PhoneticTranscriptionDelimEnum.noSp;
default:
return PhoneticTranscriptionDelimEnum.sp;
}
}
}
class PhoneticTranscriptionToken {
final LanguageArc arc;
final PangeaTokenText tokenL2;
final PangeaTokenText phoneticL1Transcription;
PhoneticTranscriptionToken({
required this.arc,
required this.tokenL2,
required this.phoneticL1Transcription,
});
factory PhoneticTranscriptionToken.fromJson(Map<String, dynamic> json) {
return PhoneticTranscriptionToken(
arc: LanguageArc.fromJson(json['arc'] as Map<String, dynamic>),
tokenL2: PangeaTokenText.fromJson(
json['token_l2'] as Map<String, dynamic>,
),
phoneticL1Transcription: PangeaTokenText.fromJson(
json['phonetic_l1_transcription'] as Map<String, dynamic>,
),
);
}
Map<String, dynamic> toJson() => {
'arc': arc.toJson(),
'token_l2': tokenL2.toJson(),
'phonetic_l1_transcription': phoneticL1Transcription.toJson(),
};
}
class PhoneticTranscription {
final LanguageArc arc;
final PangeaTokenText transcriptionL2;
final List<PhoneticTranscriptionToken> phoneticTranscription;
final PhoneticTranscriptionDelimEnum delim;
PhoneticTranscription({
required this.arc,
required this.transcriptionL2,
required this.phoneticTranscription,
this.delim = PhoneticTranscriptionDelimEnum.sp,
});
factory PhoneticTranscription.fromJson(Map<String, dynamic> json) {
return PhoneticTranscription(
arc: LanguageArc.fromJson(json['arc'] as Map<String, dynamic>),
transcriptionL2: PangeaTokenText.fromJson(
json['transcription_l2'] as Map<String, dynamic>,
),
phoneticTranscription: (json['phonetic_transcription'] as List)
.map(
(e) =>
PhoneticTranscriptionToken.fromJson(e as Map<String, dynamic>),
)
.toList(),
delim: json['delim'] != null
? PhoneticTranscriptionDelimEnumExt.fromString(
json['delim'] as String,
)
: PhoneticTranscriptionDelimEnum.sp,
);
}
Map<String, dynamic> toJson() => {
'arc': arc.toJson(),
'transcription_l2': transcriptionL2.toJson(),
'phonetic_transcription': phoneticTranscription
.map((e) => e.toJson())
.toList(),
'delim': delim.value,
};
}
class PhoneticTranscriptionResponse {
final LanguageArc arc;
final PangeaTokenText content;
final Map<String, dynamic>
tokenization; // left untyped; a typesafe model could be defined if needed
final PhoneticTranscription phoneticTranscriptionResult;
PhoneticTranscriptionResponse({
required this.arc,
required this.content,
required this.tokenization,
required this.phoneticTranscriptionResult,
});
factory PhoneticTranscriptionResponse.fromJson(Map<String, dynamic> json) {
return PhoneticTranscriptionResponse(
arc: LanguageArc.fromJson(json['arc'] as Map<String, dynamic>),
content: PangeaTokenText.fromJson(
json['content'] as Map<String, dynamic>,
),
tokenization: Map<String, dynamic>.from(json['tokenization'] as Map),
phoneticTranscriptionResult: PhoneticTranscription.fromJson(
json['phonetic_transcription_result'] as Map<String, dynamic>,
),
);
}
Map<String, dynamic> toJson() {
return {
'arc': arc.toJson(),
'content': content.toJson(),
'tokenization': tokenization,
'phonetic_transcription_result': phoneticTranscriptionResult.toJson(),
};
}
@override
bool operator ==(Object other) =>
identical(this, other) ||
other is PhoneticTranscriptionResponse &&
runtimeType == other.runtimeType &&
arc == other.arc &&
content == other.content &&
tokenization == other.tokenization &&
phoneticTranscriptionResult == other.phoneticTranscriptionResult;
@override
int get hashCode =>
arc.hashCode ^
content.hashCode ^
tokenization.hashCode ^
phoneticTranscriptionResult.hashCode;
}

View file

@ -7,6 +7,8 @@ import 'package:fluffychat/pangea/common/utils/async_state.dart';
import 'package:fluffychat/pangea/common/widgets/error_indicator.dart';
import 'package:fluffychat/pangea/languages/language_model.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_builder.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_disambiguation.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
import 'package:fluffychat/pangea/text_to_speech/tts_controller.dart';
import 'package:fluffychat/widgets/hover_builder.dart';
import 'package:fluffychat/widgets/matrix.dart';
@ -15,6 +17,12 @@ class PhoneticTranscriptionWidget extends StatefulWidget {
final String text;
final LanguageModel textLanguage;
/// POS tag for disambiguation (from PangeaToken, e.g. "VERB").
final String? pos;
/// Morph features for disambiguation (from PangeaToken).
final Map<String, String>? morph;
final TextStyle? style;
final double? iconSize;
final Color? iconColor;
@ -27,6 +35,8 @@ class PhoneticTranscriptionWidget extends StatefulWidget {
super.key,
required this.text,
required this.textLanguage,
this.pos,
this.morph,
this.style,
this.iconSize,
this.iconColor,
@ -54,6 +64,8 @@ class _PhoneticTranscriptionWidgetState
context: context,
targetID: targetId,
langCode: widget.textLanguage.langCode,
pos: widget.pos,
morph: widget.morph,
onStart: () {
if (mounted) setState(() => _isPlaying = true);
},
@ -74,6 +86,7 @@ class _PhoneticTranscriptionWidgetState
? L10n.of(context).stop
: L10n.of(context).playAudio,
child: GestureDetector(
behavior: HitTestBehavior.opaque,
onTap: () => _handleAudioTap(targetId),
child: AnimatedContainer(
duration: const Duration(milliseconds: 150),
@ -111,13 +124,17 @@ class _PhoneticTranscriptionWidgetState
context,
).failedToFetchTranscription,
),
AsyncLoaded<String>(value: final transcription) => Row(
AsyncLoaded<PTResponse>(value: final ptResponse) => Row(
spacing: 8.0,
mainAxisSize: MainAxisSize.min,
children: [
Flexible(
child: Text(
transcription,
disambiguate(
ptResponse.pronunciations,
pos: widget.pos,
morph: widget.morph,
).displayTranscription,
textScaler: TextScaler.noScaling,
style:
widget.style ??

View file

@ -0,0 +1,101 @@
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
/// Disambiguation result for choosing which pronunciation(s) to display.
class DisambiguationResult {
/// The matched pronunciation, or null if zero or multiple matches.
final Pronunciation? matched;
/// All pronunciations (for fallback display).
final List<Pronunciation> all;
const DisambiguationResult({this.matched, required this.all});
bool get isAmbiguous => matched == null && all.length > 1;
bool get isUnambiguous => all.length == 1 || matched != null;
/// The transcription to display (single match or slash-separated fallback).
String get displayTranscription {
if (matched != null) return matched!.transcription;
if (all.length == 1) return all.first.transcription;
return all.map((p) => p.transcription).join(' / ');
}
/// The tts_phoneme for TTS. Returns the matched value, or null if ambiguous
/// (caller should let user choose or use the first).
String? get ttsPhoneme {
if (matched != null) return matched!.ttsPhoneme;
if (all.length == 1) return all.first.ttsPhoneme;
return null;
}
}
/// Disambiguate pronunciations against available UD context.
///
/// [pos] POS tag from PangeaToken (uppercase, e.g. "VERB").
/// [morph] morphological features from PangeaToken (e.g. {"Tense": "Past"}).
///
/// Both may be null (analytics page has limited context).
DisambiguationResult disambiguate(
List<Pronunciation> pronunciations, {
String? pos,
Map<String, String>? morph,
}) {
if (pronunciations.isEmpty) {
return const DisambiguationResult(all: []);
}
if (pronunciations.length == 1) {
return DisambiguationResult(
matched: pronunciations.first,
all: pronunciations,
);
}
// Try to find a pronunciation whose ud_conditions all match.
final matches = pronunciations.where((p) {
if (p.udConditions == null) return true; // unconditional = always matches
return _matchesConditions(p.udConditions!, pos: pos, morph: morph);
}).toList();
if (matches.length == 1) {
return DisambiguationResult(matched: matches.first, all: pronunciations);
}
// Ambiguous: return all pronunciations for fallback display.
return DisambiguationResult(all: pronunciations);
}
/// Parse ud_conditions string and check if all conditions are met.
///
/// Format: "Pos=ADV;Tense=Past" semicolon-separated feature=value pairs.
/// "Pos" is matched against [pos] (case-insensitive).
/// Other features are matched against [morph].
bool _matchesConditions(
String udConditions, {
String? pos,
Map<String, String>? morph,
}) {
final conditions = udConditions.split(';');
for (final cond in conditions) {
final parts = cond.split('=');
if (parts.length != 2) continue;
final feature = parts[0].trim();
final value = parts[1].trim();
if (feature.toLowerCase() == 'pos') {
if (pos == null) return false;
if (pos.toLowerCase() != value.toLowerCase()) return false;
} else {
if (morph == null) return false;
// UD features use PascalCase keys. Match case-insensitively
// in case the morph map uses different casing.
final morphValue = morph.entries
.where((e) => e.key.toLowerCase() == feature.toLowerCase())
.map((e) => e.value)
.firstOrNull;
if (morphValue == null) return false;
if (morphValue.toLowerCase() != value.toLowerCase()) return false;
}
}
return true;
}
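
A minimal sketch of how the disambiguation helpers above fit together. The word, transcriptions, and phoneme strings are hypothetical; only the `ud_conditions` format ("Feature=Value;Feature=Value") comes from the code above.

```dart
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_disambiguation.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';

void main() {
  // Hypothetical data: the English heteronym "read".
  const pronunciations = [
    Pronunciation(
      transcription: 'reed',
      ttsPhoneme: 'riːd',
      udConditions: 'Pos=VERB;Tense=Pres',
    ),
    Pronunciation(
      transcription: 'red',
      ttsPhoneme: 'rɛd',
      udConditions: 'Pos=VERB;Tense=Past',
    ),
  ];

  // With matching UD context, exactly one candidate survives.
  final past = disambiguate(
    pronunciations,
    pos: 'VERB',
    morph: {'Tense': 'Past'},
  );
  print(past.displayTranscription); // red
  print(past.ttsPhoneme); // rɛd

  // Without context, both candidates fail their Pos check, so the
  // result is ambiguous: slash-joined display, null ttsPhoneme.
  final ambiguous = disambiguate(pronunciations);
  print(ambiguous.displayTranscription); // reed / red
  print(ambiguous.ttsPhoneme); // null
}
```

This is why TtsController falls back to device TTS when the cache yields more than one unresolved pronunciation: a null ttsPhoneme means no single candidate could be picked.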

View file

@ -0,0 +1,142 @@
/// Phonetic Transcription v2 models.
///
/// Maps to choreo endpoint `POST /choreo/phonetic_transcription_v2`.
/// Request: [PTRequest] with surface, langCode, userL1, userL2.
/// Response: [PTResponse] with a list of [Pronunciation]s.
library;
class Pronunciation {
final String transcription;
final String ttsPhoneme;
final String? udConditions;
const Pronunciation({
required this.transcription,
required this.ttsPhoneme,
this.udConditions,
});
factory Pronunciation.fromJson(Map<String, dynamic> json) {
return Pronunciation(
transcription: json['transcription'] as String,
ttsPhoneme: json['tts_phoneme'] as String,
udConditions: json['ud_conditions'] as String?,
);
}
Map<String, dynamic> toJson() => {
'transcription': transcription,
'tts_phoneme': ttsPhoneme,
'ud_conditions': udConditions,
};
@override
bool operator ==(Object other) =>
identical(this, other) ||
other is Pronunciation &&
transcription == other.transcription &&
ttsPhoneme == other.ttsPhoneme &&
udConditions == other.udConditions;
@override
int get hashCode =>
transcription.hashCode ^ ttsPhoneme.hashCode ^ udConditions.hashCode;
}
class PTRequest {
final String surface;
final String langCode;
final String userL1;
final String userL2;
const PTRequest({
required this.surface,
required this.langCode,
required this.userL1,
required this.userL2,
});
factory PTRequest.fromJson(Map<String, dynamic> json) {
return PTRequest(
surface: json['surface'] as String,
langCode: json['lang_code'] as String,
userL1: json['user_l1'] as String,
userL2: json['user_l2'] as String,
);
}
Map<String, dynamic> toJson() => {
'surface': surface,
'lang_code': langCode,
'user_l1': userL1,
'user_l2': userL2,
};
/// Cache key excludes userL2 (doesn't affect pronunciation).
String get cacheKey => '$surface|$langCode|$userL1';
@override
bool operator ==(Object other) =>
identical(this, other) ||
other is PTRequest &&
surface == other.surface &&
langCode == other.langCode &&
userL1 == other.userL1 &&
userL2 == other.userL2;
@override
int get hashCode =>
surface.hashCode ^ langCode.hashCode ^ userL1.hashCode ^ userL2.hashCode;
}
class PTResponse {
final List<Pronunciation> pronunciations;
const PTResponse({required this.pronunciations});
factory PTResponse.fromJson(Map<String, dynamic> json) {
return PTResponse(
pronunciations: (json['pronunciations'] as List)
.map((e) => Pronunciation.fromJson(e as Map<String, dynamic>))
.toList(),
);
}
Map<String, dynamic> toJson() => {
'pronunciations': pronunciations.map((p) => p.toJson()).toList(),
};
@override
bool operator ==(Object other) =>
identical(this, other) ||
other is PTResponse &&
const _PronunciationListEquality().equals(
pronunciations,
other.pronunciations,
);
@override
int get hashCode => const _PronunciationListEquality().hash(pronunciations);
}
/// Deep equality for `List<Pronunciation>`.
class _PronunciationListEquality {
const _PronunciationListEquality();
bool equals(List<Pronunciation> a, List<Pronunciation> b) {
if (a.length != b.length) return false;
for (int i = 0; i < a.length; i++) {
if (a[i] != b[i]) return false;
}
return true;
}
int hash(List<Pronunciation> list) {
int result = 0;
for (final p in list) {
result ^= p.hashCode;
}
return result;
}
}
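
For reference, a sketch of the wire format these models map to. The field values are illustrative; the snake_case keys and the cache-key shape come directly from the models above.

```dart
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';

void main() {
  const request = PTRequest(
    surface: 'lluvia',
    langCode: 'es',
    userL1: 'en',
    userL2: 'es',
  );
  // Body sent to POST /choreo/phonetic_transcription_v2:
  // {"surface": "lluvia", "lang_code": "es", "user_l1": "en", "user_l2": "es"}
  print(request.toJson());

  // userL2 is sent to the server but excluded from the cache key,
  // since it does not affect pronunciation.
  print(request.cacheKey); // lluvia|es|en

  final response = PTResponse.fromJson({
    'pronunciations': [
      {
        'transcription': 'YOO-vee-ah',
        'tts_phoneme': 'ˈʝuβja',
        'ud_conditions': null,
      },
    ],
  });
  print(response.pronunciations.first.transcription); // YOO-vee-ah
}
```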

View file

@ -0,0 +1,198 @@
import 'dart:convert';
import 'dart:io';
import 'package:async/async.dart';
import 'package:get_storage/get_storage.dart';
import 'package:http/http.dart';
import 'package:fluffychat/pangea/common/config/environment.dart';
import 'package:fluffychat/pangea/common/network/requests.dart';
import 'package:fluffychat/pangea/common/network/urls.dart';
import 'package:fluffychat/pangea/common/utils/error_handler.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
class _MemoryCacheItem {
final Future<Result<PTResponse>> resultFuture;
final DateTime timestamp;
const _MemoryCacheItem({required this.resultFuture, required this.timestamp});
}
class _DiskCacheItem {
final PTResponse response;
final DateTime timestamp;
const _DiskCacheItem({required this.response, required this.timestamp});
Map<String, dynamic> toJson() => {
'response': response.toJson(),
'timestamp': timestamp.toIso8601String(),
};
static _DiskCacheItem fromJson(Map<String, dynamic> json) {
return _DiskCacheItem(
response: PTResponse.fromJson(json['response'] as Map<String, dynamic>),
timestamp: DateTime.parse(json['timestamp']),
);
}
}
const String ptV2StorageKey = 'phonetic_transcription_v2_storage';
class PTV2Repo {
static final Map<String, _MemoryCacheItem> _cache = {};
static const Duration _memoryCacheDuration = Duration(minutes: 10);
static const Duration _diskCacheDuration = Duration(hours: 24);
static final GetStorage _storage = GetStorage(ptV2StorageKey);
static Future<Result<PTResponse>> get(
String accessToken,
PTRequest request,
) async {
await GetStorage.init(ptV2StorageKey);
// 1. Try memory cache
final cached = _getCached(request);
if (cached != null) return cached;
// 2. Try disk cache
final stored = _getStored(request);
if (stored != null) return Future.value(Result.value(stored));
// 3. Fetch from network
final future = _safeFetch(accessToken, request);
// 4. Save to in-memory cache
_cache[request.cacheKey] = _MemoryCacheItem(
resultFuture: future,
timestamp: DateTime.now(),
);
// 5. Write to disk after fetch completes
_writeToDisk(request, future);
return future;
}
/// Overwrite a cached response (used by token feedback to refresh stale PT).
static Future<void> set(PTRequest request, PTResponse response) async {
await GetStorage.init(ptV2StorageKey);
final key = request.cacheKey;
try {
final item = _DiskCacheItem(
response: response,
timestamp: DateTime.now(),
);
await _storage.write(key, item.toJson());
_cache.remove(key);
} catch (e, s) {
ErrorHandler.logError(e: e, s: s, data: {'cacheKey': key});
}
}
/// Look up a cached PT response without triggering a network fetch.
/// Returns null if not in memory or disk cache.
static PTResponse? getCachedResponse(
String surface,
String langCode,
String userL1,
) {
final key = '$surface|$langCode|$userL1'; // must match PTRequest.cacheKey
// The in-memory cache stores Futures, which can't be resolved
// synchronously, so skip it and read straight from the disk cache.
final now = DateTime.now();
// Check disk cache.
try {
final entry = _storage.read(key);
if (entry == null) return null;
final item = _DiskCacheItem.fromJson(entry);
if (now.difference(item.timestamp) >= _diskCacheDuration) {
_storage.remove(key);
return null;
}
return item.response;
} catch (_) {
return null;
}
}
static Future<Result<PTResponse>>? _getCached(PTRequest request) {
final now = DateTime.now();
_cache.removeWhere(
(_, item) => now.difference(item.timestamp) >= _memoryCacheDuration,
);
return _cache[request.cacheKey]?.resultFuture;
}
static PTResponse? _getStored(PTRequest request) {
final key = request.cacheKey;
try {
final entry = _storage.read(key);
if (entry == null) return null;
final item = _DiskCacheItem.fromJson(entry);
if (DateTime.now().difference(item.timestamp) >= _diskCacheDuration) {
_storage.remove(key);
return null;
}
return item.response;
} catch (e, s) {
ErrorHandler.logError(e: e, s: s, data: {'cacheKey': key});
_storage.remove(key);
return null;
}
}
static Future<Result<PTResponse>> _safeFetch(
String token,
PTRequest request,
) async {
try {
final resp = await _fetch(token, request);
return Result.value(resp);
} catch (e, s) {
ErrorHandler.logError(e: e, s: s, data: request.toJson());
return Result.error(e);
}
}
static Future<PTResponse> _fetch(
String accessToken,
PTRequest request,
) async {
final req = Requests(
choreoApiKey: Environment.choreoApiKey,
accessToken: accessToken,
);
final Response res = await req.post(
url: PApiUrls.phoneticTranscriptionV2,
body: request.toJson(),
);
if (res.statusCode != 200) {
throw HttpException(
'Failed to fetch phonetic transcription v2: ${res.statusCode} ${res.reasonPhrase}',
);
}
return PTResponse.fromJson(jsonDecode(utf8.decode(res.bodyBytes)));
}
static Future<void> _writeToDisk(
PTRequest request,
Future<Result<PTResponse>> resultFuture,
) async {
final result = await resultFuture;
if (!result.isValue) return;
await set(request, result.asValue!.value);
}
}
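
A call-site sketch for the repository above. The access token, surface, and language codes are illustrative assumptions; `Result` is the `package:async` type the repo already returns.

```dart
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_repo.dart';

// Assumed: `accessToken` comes from the app's Matrix session.
Future<void> loadPronunciations(String accessToken) async {
  final result = await PTV2Repo.get(
    accessToken,
    const PTRequest(
      surface: 'record', // hypothetical heteronym
      langCode: 'en',
      userL1: 'es',
      userL2: 'en',
    ),
  );
  if (result.isValue) {
    final pt = result.asValue!.value;
    // Repeat calls within 10 minutes resolve from the in-memory
    // Future cache; within 24 hours, from the GetStorage disk cache.
    print('pronunciations: ${pt.pronunciations.length}');
  }
}
```

Note that concurrent callers with the same cacheKey share one in-flight Future, so a burst of taps on the same word triggers at most one network fetch.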

View file

@ -8,6 +8,8 @@ class TextToSpeechRequestModel {
String userL2;
List<PangeaTokenText> tokens;
String? voice;
String? ttsPhoneme;
double speakingRate;
TextToSpeechRequestModel({
required this.text,
@ -16,6 +18,8 @@ class TextToSpeechRequestModel {
required this.userL2,
required this.tokens,
this.voice,
this.ttsPhoneme,
this.speakingRate = 0.85,
});
Map<String, dynamic> toJson() => {
@ -25,6 +29,8 @@ class TextToSpeechRequestModel {
ModelKey.userL2: userL2,
ModelKey.tokens: tokens.map((token) => token.toJson()).toList(),
'voice': voice,
if (ttsPhoneme != null) 'tts_phoneme': ttsPhoneme,
'speaking_rate': speakingRate,
};
@override
@ -34,9 +40,11 @@ class TextToSpeechRequestModel {
return other is TextToSpeechRequestModel &&
other.text == text &&
other.langCode == langCode &&
other.voice == voice;
other.voice == voice &&
other.ttsPhoneme == ttsPhoneme;
}
@override
int get hashCode => text.hashCode ^ langCode.hashCode ^ voice.hashCode;
int get hashCode =>
text.hashCode ^ langCode.hashCode ^ voice.hashCode ^ ttsPhoneme.hashCode;
}

View file

@ -20,6 +20,8 @@ import 'package:fluffychat/pangea/common/widgets/card_header.dart';
import 'package:fluffychat/pangea/events/models/pangea_token_text_model.dart';
import 'package:fluffychat/pangea/instructions/instructions_enum.dart';
import 'package:fluffychat/pangea/languages/language_constants.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_disambiguation.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_repo.dart';
import 'package:fluffychat/pangea/text_to_speech/text_to_speech_repo.dart';
import 'package:fluffychat/pangea/text_to_speech/text_to_speech_request_model.dart';
import 'package:fluffychat/pangea/text_to_speech/text_to_speech_response_model.dart';
@ -115,6 +117,34 @@ class TtsController {
static VoidCallback? _onStop;
/// Look up the PT v2 cache for [text] and return tts_phoneme if the word is a
/// heteronym that can be disambiguated. Returns null for single-pronunciation
/// words or when no PT data is cached.
static String? _resolveTtsPhonemeFromCache(
String text,
String langCode, {
String? pos,
Map<String, String>? morph,
}) {
final userL1 = MatrixState.pangeaController.userController.userL1Code;
if (userL1 == null) return null;
final ptResponse = PTV2Repo.getCachedResponse(text, langCode, userL1);
debugPrint(
'[TTS-DEBUG] _resolveTtsPhonemeFromCache: text="$text" lang=$langCode cached=${ptResponse != null} count=${ptResponse?.pronunciations.length ?? 0} pos=$pos morph=$morph',
);
if (ptResponse == null || ptResponse.pronunciations.length <= 1) {
return null;
}
final result = disambiguate(
ptResponse.pronunciations,
pos: pos,
morph: morph,
);
return result.ttsPhoneme;
}
static Future<void> tryToSpeak(
String text, {
required String langCode,
@ -124,7 +154,29 @@ class TtsController {
ChatController? chatController,
VoidCallback? onStart,
VoidCallback? onStop,
/// When provided, skip device TTS and use choreo with phoneme tags.
/// If omitted, the PT v2 cache is checked automatically.
String? ttsPhoneme,
/// POS tag for disambiguation when resolving tts_phoneme from cache.
String? pos,
/// Morph features for disambiguation when resolving tts_phoneme from cache.
Map<String, String>? morph,
}) async {
// Auto-resolve tts_phoneme from PT cache if not explicitly provided.
final explicitPhoneme = ttsPhoneme;
ttsPhoneme ??= _resolveTtsPhonemeFromCache(
text,
langCode,
pos: pos,
morph: morph,
);
debugPrint(
'[TTS-DEBUG] tryToSpeak: text="$text" explicitPhoneme=$explicitPhoneme resolvedPhoneme=$ttsPhoneme pos=$pos morph=$morph',
);
final prevOnStop = _onStop;
_onStop = onStop;
@ -147,6 +199,7 @@ class TtsController {
chatController: chatController,
onStart: onStart,
onStop: onStop,
ttsPhoneme: ttsPhoneme,
);
}
@ -161,6 +214,7 @@ class TtsController {
ChatController? chatController,
VoidCallback? onStart,
VoidCallback? onStop,
String? ttsPhoneme,
}) async {
chatController?.stopMediaStream.add(null);
MatrixState.pangeaController.matrixState.audioPlayer?.stop();
@ -182,9 +236,15 @@ class TtsController {
);
onStart?.call();
await (_isLangFullySupported(langCode)
? _speak(text, langCode, [token])
: _speakFromChoreo(text, langCode, [token]));
// When tts_phoneme is provided, skip device TTS and use choreo with phoneme tags.
if (ttsPhoneme != null) {
await _speakFromChoreo(text, langCode, [token], ttsPhoneme: ttsPhoneme);
} else {
await (_isLangFullySupported(langCode)
? _speak(text, langCode, [token])
: _speakFromChoreo(text, langCode, [token]));
}
} else if (targetID != null && context != null) {
await _showTTSDisabledPopup(context, targetID);
}
@ -240,8 +300,12 @@ class TtsController {
static Future<void> _speakFromChoreo(
String text,
String langCode,
List<PangeaTokenText> tokens,
) async {
List<PangeaTokenText> tokens, {
String? ttsPhoneme,
}) async {
debugPrint(
'[TTS-DEBUG] _speakFromChoreo: text="$text" ttsPhoneme=$ttsPhoneme',
);
TextToSpeechResponseModel? ttsRes;
loadingChoreoStream.add(true);
@ -257,6 +321,8 @@ class TtsController {
userL2:
MatrixState.pangeaController.userController.userL2Code ??
LanguageKeys.unknownLanguage,
ttsPhoneme: ttsPhoneme,
speakingRate: 1.0,
),
);
loadingChoreoStream.add(false);

View file

@ -8,19 +8,15 @@ import 'package:fluffychat/pangea/events/models/language_detection_model.dart';
import 'package:fluffychat/pangea/events/models/pangea_token_model.dart';
import 'package:fluffychat/pangea/events/models/tokens_event_content_model.dart';
import 'package:fluffychat/pangea/extensions/pangea_room_extension.dart';
import 'package:fluffychat/pangea/languages/language_arc_model.dart';
import 'package:fluffychat/pangea/languages/p_language_store.dart';
import 'package:fluffychat/pangea/lemmas/lemma_info_repo.dart';
import 'package:fluffychat/pangea/lemmas/lemma_info_response.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_repo.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_request.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_response.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_repo.dart';
import 'package:fluffychat/pangea/token_info_feedback/token_info_feedback_repo.dart';
import 'package:fluffychat/pangea/token_info_feedback/token_info_feedback_request.dart';
import 'package:fluffychat/pangea/token_info_feedback/token_info_feedback_response.dart';
import 'package:fluffychat/pangea/toolbar/word_card/word_zoom_widget.dart';
import 'package:fluffychat/widgets/future_loading_dialog.dart';
import 'package:fluffychat/widgets/matrix.dart';
class TokenInfoFeedbackDialog extends StatelessWidget {
final TokenInfoFeedbackRequestData requestData;
@ -125,21 +121,11 @@ class TokenInfoFeedbackDialog extends StatelessWidget {
response,
);
Future<void> _updatePhoneticTranscription(
PhoneticTranscriptionResponse response,
) async {
final req = PhoneticTranscriptionRequest(
arc: LanguageArc(
l1:
PLanguageStore.byLangCode(requestData.wordCardL1) ??
MatrixState.pangeaController.userController.userL1!,
l2:
PLanguageStore.byLangCode(langCode) ??
MatrixState.pangeaController.userController.userL2!,
),
content: response.content,
);
await PhoneticTranscriptionRepo.set(req, response);
Future<void> _updatePhoneticTranscription(PTResponse response) async {
// Use the original request from the feedback data to write to v2 cache
final ptRequest = requestData.ptRequest;
if (ptRequest == null) return;
await PTV2Repo.set(ptRequest, response);
}
@override
@ -151,6 +137,8 @@ class TokenInfoFeedbackDialog extends StatelessWidget {
extraContent: WordZoomWidget(
token: selectedToken.text,
construct: selectedToken.vocabConstructID,
pos: selectedToken.pos,
morph: selectedToken.morph.map((k, v) => MapEntry(k.name, v)),
langCode: langCode,
enableEmojiSelection: false,
),

View file

@ -24,7 +24,7 @@ class TokenInfoFeedbackRepo {
);
final Response res = await req.post(
url: PApiUrls.tokenFeedback,
url: PApiUrls.tokenFeedbackV2,
body: request.toJson(),
);

View file

@ -1,5 +1,6 @@
import 'package:fluffychat/pangea/events/models/pangea_token_model.dart';
import 'package:fluffychat/pangea/lemmas/lemma_info_response.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
class TokenInfoFeedbackRequestData {
final String userId;
@ -9,7 +10,8 @@ class TokenInfoFeedbackRequestData {
final List<PangeaToken> tokens;
final int selectedToken;
final LemmaInfoResponse lemmaInfo;
final String phonetics;
final PTRequest? ptRequest;
final PTResponse? ptResponse;
final String wordCardL1;
TokenInfoFeedbackRequestData({
@ -18,8 +20,9 @@ class TokenInfoFeedbackRequestData {
required this.tokens,
required this.selectedToken,
required this.lemmaInfo,
required this.phonetics,
required this.wordCardL1,
this.ptRequest,
this.ptResponse,
this.roomId,
this.fullText,
});
@ -35,7 +38,8 @@ class TokenInfoFeedbackRequestData {
detectedLanguage == other.detectedLanguage &&
selectedToken == other.selectedToken &&
lemmaInfo == other.lemmaInfo &&
phonetics == other.phonetics &&
ptRequest == other.ptRequest &&
ptResponse == other.ptResponse &&
wordCardL1 == other.wordCardL1;
@override
@ -46,7 +50,8 @@ class TokenInfoFeedbackRequestData {
detectedLanguage.hashCode ^
selectedToken.hashCode ^
lemmaInfo.hashCode ^
phonetics.hashCode ^
ptRequest.hashCode ^
ptResponse.hashCode ^
wordCardL1.hashCode;
}
@ -65,7 +70,8 @@ class TokenInfoFeedbackRequest {
'tokens': data.tokens.map((token) => token.toJson()).toList(),
'selected_token': data.selectedToken,
'lemma_info': data.lemmaInfo.toJson(),
'phonetics': data.phonetics,
'pt_request': data.ptRequest?.toJson(),
'pt_response': data.ptResponse?.toJson(),
'user_feedback': userFeedback,
'word_card_l1': data.wordCardL1,
};

View file

@ -1,13 +1,13 @@
import 'package:fluffychat/pangea/events/models/content_feedback.dart';
import 'package:fluffychat/pangea/events/models/pangea_token_model.dart';
import 'package:fluffychat/pangea/lemmas/lemma_info_response.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_response.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
class TokenInfoFeedbackResponse implements JsonSerializable {
final String userFriendlyMessage;
final PangeaToken? updatedToken;
final LemmaInfoResponse? updatedLemmaInfo;
final PhoneticTranscriptionResponse? updatedPhonetics;
final PTResponse? updatedPhonetics;
final String? updatedLanguage;
TokenInfoFeedbackResponse({
@ -30,7 +30,7 @@ class TokenInfoFeedbackResponse implements JsonSerializable {
)
: null,
updatedPhonetics: json['updated_phonetics'] != null
? PhoneticTranscriptionResponse.fromJson(
? PTResponse.fromJson(
json['updated_phonetics'] as Map<String, dynamic>,
)
: null,

View file

@ -68,6 +68,8 @@ class PracticeMatchItemState extends State<PracticeMatchItem> {
context: context,
targetID: 'word-audio-button',
langCode: l2,
pos: widget.token?.pos,
morph: widget.token?.morph.map((k, v) => MapEntry(k.name, v)),
);
}
} catch (e, s) {

View file

@ -209,6 +209,8 @@ class MessageOverlayController extends State<MessageSelectionOverlay>
TtsController.tryToSpeak(
selectedToken!.text.content,
langCode: pangeaMessageEvent.messageDisplayLangCode,
pos: selectedToken!.pos,
morph: selectedToken!.morph.map((k, v) => MapEntry(k.name, v)),
);
}

View file

@ -4,6 +4,7 @@ import 'package:matrix/matrix_api_lite/model/message_types.dart';
import 'package:fluffychat/config/themes.dart';
import 'package:fluffychat/pangea/lemmas/lemma_info_response.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
import 'package:fluffychat/pangea/token_info_feedback/token_info_feedback_request.dart';
import 'package:fluffychat/pangea/toolbar/message_selection_overlay.dart';
import 'package:fluffychat/pangea/toolbar/word_card/word_zoom_widget.dart';
@ -44,30 +45,41 @@ class ReadingAssistanceContent extends StatelessWidget {
.key,
token: overlayController.selectedToken!.text,
construct: overlayController.selectedToken!.vocabConstructID,
pos: overlayController.selectedToken!.pos,
morph: overlayController.selectedToken!.morph.map(
(k, v) => MapEntry(k.name, v),
),
event: overlayController.event,
onClose: () => overlayController.updateSelectedSpan(null),
langCode: overlayController.pangeaMessageEvent.messageDisplayLangCode,
onDismissNewWordOverlay: () => overlayController.setState(() {}),
onFlagTokenInfo: (LemmaInfoResponse lemmaInfo, String phonetics) {
if (selectedTokenIndex < 0) return;
final requestData = TokenInfoFeedbackRequestData(
userId: Matrix.of(context).client.userID!,
roomId: overlayController.event.room.id,
fullText: overlayController.pangeaMessageEvent.messageDisplayText,
detectedLanguage:
onFlagTokenInfo:
(
LemmaInfoResponse lemmaInfo,
PTRequest ptRequest,
PTResponse ptResponse,
) {
if (selectedTokenIndex < 0) return;
final requestData = TokenInfoFeedbackRequestData(
userId: Matrix.of(context).client.userID!,
roomId: overlayController.event.room.id,
fullText: overlayController.pangeaMessageEvent.messageDisplayText,
detectedLanguage:
overlayController.pangeaMessageEvent.messageDisplayLangCode,
tokens: tokens ?? [],
selectedToken: selectedTokenIndex,
wordCardL1:
MatrixState.pangeaController.userController.userL1Code!,
lemmaInfo: lemmaInfo,
ptRequest: ptRequest,
ptResponse: ptResponse,
);
overlayController.widget.chatController.showTokenFeedbackDialog(
requestData,
overlayController.pangeaMessageEvent.messageDisplayLangCode,
tokens: tokens ?? [],
selectedToken: selectedTokenIndex,
wordCardL1: MatrixState.pangeaController.userController.userL1Code!,
lemmaInfo: lemmaInfo,
phonetics: phonetics,
);
overlayController.widget.chatController.showTokenFeedbackDialog(
requestData,
overlayController.pangeaMessageEvent.messageDisplayLangCode,
overlayController.pangeaMessageEvent,
);
},
overlayController.pangeaMessageEvent,
);
},
);
}
}

View file

@ -6,13 +6,14 @@ import 'package:fluffychat/pangea/languages/language_model.dart';
import 'package:fluffychat/pangea/lemmas/lemma_info_response.dart';
import 'package:fluffychat/pangea/lemmas/lemma_meaning_builder.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_builder.dart';
import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
class TokenFeedbackButton extends StatelessWidget {
final LanguageModel textLanguage;
final ConstructIdentifier constructId;
final String text;
final Function(LemmaInfoResponse, String) onFlagTokenInfo;
final Function(LemmaInfoResponse, PTRequest, PTResponse) onFlagTokenInfo;
final Map<String, dynamic> messageInfo;
const TokenFeedbackButton({
@@ -38,20 +39,22 @@ class TokenFeedbackButton extends StatelessWidget {
final enabled =
(lemmaController.lemmaInfo != null ||
lemmaController.isError) &&
-(transcriptController.transcription != null ||
+(transcriptController.ptResponse != null ||
transcriptController.isError);
final lemmaInfo =
lemmaController.lemmaInfo ?? LemmaInfoResponse.error;
-final transcript = transcriptController.transcription ?? 'ERROR';
return IconButton(
color: Theme.of(context).iconTheme.color,
icon: const Icon(Icons.flag_outlined),
-onPressed: enabled
+onPressed: enabled && transcriptController.ptResponse != null
? () {
-onFlagTokenInfo(lemmaInfo, transcript);
+onFlagTokenInfo(
+lemmaInfo,
+transcriptController.ptRequest,
+transcriptController.ptResponse!,
+);
}
: null,
tooltip: enabled ? L10n.of(context).reportWordIssueTooltip : null,
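
The hunk above replaces the flattened transcript string with the full request/response pair in the flag callback. A minimal sketch of why that pairing is useful for feedback reports, using hypothetical stand-in types (the real `PTRequest`/`PTResponse` live in `pt_v2_models.dart` and will differ):

```dart
// Hypothetical stand-ins for the v2 models; field names are assumed,
// not taken from pt_v2_models.dart.
class PTRequestSketch {
  final String text;
  const PTRequestSketch(this.text);
}

class PTResponseSketch {
  final String transcription;
  const PTResponseSketch(this.transcription);
}

// With both sides of the exchange available, a feedback report can say
// exactly what was asked and what came back, instead of only a string.
String buildFeedbackPayload(PTRequestSketch req, PTResponseSketch res) =>
    'text=${req.text};transcription=${res.transcription}';
```

For example, flagging the Spanish word "lluvia" would produce `text=lluvia;transcription=YOO-vee-ah` rather than the bare transcript.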

View file

@@ -10,6 +10,7 @@ import 'package:fluffychat/pangea/languages/language_model.dart';
import 'package:fluffychat/pangea/languages/p_language_store.dart';
import 'package:fluffychat/pangea/lemmas/lemma_info_response.dart';
import 'package:fluffychat/pangea/phonetic_transcription/phonetic_transcription_widget.dart';
+import 'package:fluffychat/pangea/phonetic_transcription/pt_v2_models.dart';
import 'package:fluffychat/pangea/toolbar/reading_assistance/new_word_overlay.dart';
import 'package:fluffychat/pangea/toolbar/reading_assistance/tokens_util.dart';
import 'package:fluffychat/pangea/toolbar/word_card/lemma_meaning_display.dart';
@@ -27,9 +28,15 @@ class WordZoomWidget extends StatelessWidget {
final Event? event;
+/// POS tag for PT v2 disambiguation (e.g. "VERB").
+final String? pos;
+/// Morph features for PT v2 disambiguation (e.g. {"Tense": "Past"}).
+final Map<String, String>? morph;
final bool enableEmojiSelection;
final VoidCallback? onDismissNewWordOverlay;
-final Function(LemmaInfoResponse, String)? onFlagTokenInfo;
+final Function(LemmaInfoResponse, PTRequest, PTResponse)? onFlagTokenInfo;
final ValueNotifier<int>? reloadNotifier;
final double? maxWidth;
@@ -40,6 +47,8 @@ class WordZoomWidget extends StatelessWidget {
required this.langCode,
this.onClose,
this.event,
+this.pos,
+this.morph,
this.enableEmojiSelection = true,
this.onDismissNewWordOverlay,
this.onFlagTokenInfo,
@@ -135,6 +144,8 @@ class WordZoomWidget extends StatelessWidget {
textLanguage:
PLanguageStore.byLangCode(langCode) ??
LanguageModel.unknown,
+pos: pos,
+morph: morph,
style: const TextStyle(fontSize: 14.0),
iconSize: 24.0,
maxLines: 2,
View file

@@ -469,8 +469,5 @@ class UserController {
userL1Code != LanguageKeys.unknownLanguage &&
userL2Code != LanguageKeys.unknownLanguage;
-bool get showTranscription =>
-(userL1 != null && userL2 != null && userL1?.script != userL2?.script) ||
-(userL1?.script != LanguageKeys.unknownLanguage ||
-userL2?.script == LanguageKeys.unknownLanguage);
+bool get showTranscription => true;
}
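
The deleted getter only surfaced transcription when the L1/L2 scripts differed; v2's design (see the overview) covers all L1/L2 combinations, so the gate collapses to `true`. A standalone sketch of the before/after behavior, with illustrative function names not taken from the codebase:

```dart
// v1 gate (simplified): transcription only when scripts differ, so
// Latin-to-Latin pairs like English L1 / Spanish L2 got nothing.
bool showTranscriptionV1(String? l1Script, String? l2Script) =>
    l1Script != null && l2Script != null && l1Script != l2Script;

// v2 gate: always on. "lluvia" -> "YOO-vee-ah" helps an English L1
// reader even though both languages use Latin script.
bool showTranscriptionV2() => true;
```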