
---
applyTo: "lib/pangea/practice_activities/**,lib/pangea/analytics_practice/**,lib/pangea/toolbar/message_practice/**"
---

# Practice Exercises

Practice exercises are multiple-choice exercises that reinforce vocabulary and grammar from real conversations. There are no disconnected flashcard decks — every practice item traces back to a message the user sent or received.

For conversation activities, see `conversation-activities.instructions.md`.

## Three Entry Points

| Entry Point | What It Is | Where It Lives | Activity Types Used |
| --- | --- | --- | --- |
| Vocab Practice | Standalone session of ~10 vocab exercises drawn from the user's weakest words | Analytics page → "Practice Vocab" button → `AnalyticsPractice(type: vocab)` | `lemmaMeaning`, `lemmaAudio` |
| Grammar Practice | Standalone session of ~10 grammar exercises drawn from recent errors + weak morphology | Analytics page → "Practice Grammar" button → `AnalyticsPractice(type: morph)` | `grammarError`, `grammarCategory` |
| Message Practice | Per-message practice accessed from the toolbar; exercises target words in that specific message | Toolbar → 💪 button → `PracticeController` | `wordMeaning`, `wordFocusListening`, `emoji`, `morphId` |

All three entry points produce the same `ConstructUseModel` records, so practice from any source contributes equally to the user's vocabulary garden and XP.


## Design Goals

1. **Spaced repetition without the UI.** Users never configure schedules. The system silently prioritizes words they haven't seen recently, and content words over function words.
2. **Multi-modal learning.** Each word can be practiced through listening, meaning, emoji association, and grammar analysis — hitting visual, auditory, and semantic learning channels.
3. **Completion, not perfection.** Wrong answers still contribute (at reduced XP), keeping the experience encouraging.
4. **Always from real messages.** Every practice target traces back to a token from a real conversation message.

## Target Prioritization: Which Words First?

All practice paths aim to surface words/constructs the user hasn't practiced recently, but they currently use two separate implementations rather than a shared service.

### Message Practice — scoring formula

`PracticeSelectionRepo._fetchPriorityScores` computes a numeric score per token:

```
score = daysSinceLastUsed × (isContentWord ? 10 : 7)
```

- `daysSinceLastUsed`: looked up via `getConstructUses`/`lastUseByTypes`, filtered to the specific activity type's associated construct-use types. If the word has never been practiced, it defaults to 20 days.
- Content-word bonus: nouns, verbs, and adjectives get a 10× multiplier; function words (articles, prepositions) get 7×. This meaningfully favors content words when recency is equal.
- Tokens are sorted by score descending, the top 8 are taken, then shuffled.
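As an illustrative sketch (Python; the real implementation is Dart, and the token shape and function names here are assumptions, not the actual `PracticeSelectionRepo` API), the selection above amounts to:

```python
import random

# Illustrative sketch of the scoring described above.
NEVER_PRACTICED_DEFAULT_DAYS = 20  # default when a word has no practice history
TOP_N = 8

def priority_score(days_since_last_used, is_content_word):
    days = (NEVER_PRACTICED_DEFAULT_DAYS
            if days_since_last_used is None else days_since_last_used)
    return days * (10 if is_content_word else 7)

def select_targets(tokens, rng=random):
    # tokens: list of (lemma, days_since_last_used, is_content_word)
    ranked = sorted(tokens, key=lambda t: priority_score(t[1], t[2]), reverse=True)
    top = ranked[:TOP_N]   # highest-priority tokens
    rng.shuffle(top)       # shuffle so presentation order isn't predictable
    return top
```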

### Standalone Practice — simple recency sort

`AnalyticsPracticeSessionRepo._fetchVocab` and `_fetchAudio` sort by `lastUsed` ascending with nulls first (never-practiced words come first). There is no scoring formula and no content-word bonus.

Grammar targets use a different strategy:

- `_fetchErrors`: selects recent grammar mistakes, skipping any construct practiced in the last 24 hours.
- `_fetchMorphs`: sorts morph constructs by `lastUsed` ascending (same as vocab).
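A minimal sketch of the standalone ordering (Python; the real code is Dart and the construct shape here is illustrative):

```python
# Nulls-first, lastUsed-ascending ordering, as described above.
def order_by_recency(constructs):
    # constructs: list of (lemma, last_used_or_none)
    never_practiced = [c for c in constructs if c[1] is None]
    practiced = sorted((c for c in constructs if c[1] is not None),
                       key=lambda c: c[1])   # oldest lastUsed first
    return never_practiced + practiced
```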

> ⚠️ **Divergence note:** These two systems evolved independently. The message-practice scorer is more nuanced (explicit formula, content-word weighting, per-activity-type recency). The standalone path is simpler but misses the content-word boost and uses aggregate recency rather than per-type recency. Unifying them into a shared prioritization service is a natural improvement — see Future Work.

### 🧠 Design Direction: Use-Type-Aware Spaced Repetition

The current scoring only considers recency and content-word status. It ignores the rich signal from `ConstructUseTypeEnum` — specifically, how the user encountered or practiced each word. The next evolution should classify words into priority tiers based on their use-type history.

**Three-tier model:**

| Tier | Who goes here | Practice priority |
| --- | --- | --- |
| Suppressed | Lemmas whose most recent chat use is `wa` (without assistance) AND no subsequent incorrect practice | 0 — skip entirely |
| Active | Lemmas encountered through `ta` (IT) or `ga` (IGC), OR lemmas with a recent incorrect practice answer (`incXX`) | High — prioritize these |
| Maintenance | Everything else — correctly practiced but aging | Normal — standard recency-based |

**Tier transitions:**

- A `wa` use → moves to Suppressed (user knows this word)
- A `ta` or `ga` use → moves to Active (user needed help)
- An incorrect practice answer → moves to Active (user struggled)
- N consecutive correct practice answers → Active → Maintenance (learning is sticking)
- Time passes without interaction → Maintenance words naturally bubble up via recency

Within each tier, the existing scoring formula applies: `daysSinceLastUsed × (isContentWord ? 10 : 7)`. Active-tier words get an additional multiplier (e.g., ×2) so they always appear before Maintenance words of similar age.

**Key principle:** Words used through IT and IGC should be practiced much more than `wa` words. A `wa` word should only re-enter practice if the user later gets it wrong.

**Example scenario:**

1. User types "gato" correctly without assistance → `wa` → Suppressed. Won't appear in practice.
2. User uses IT to translate "mariposa" → `ta` → Active. High priority for practice.
3. User practices "mariposa" and gets it wrong → `incLM` → stays Active, priority boosted.
4. User practices "mariposa" correctly 3 times → Active → Maintenance.
5. Two weeks pass with no interaction → still Maintenance, but its high recency score makes it likely to appear.
6. User later misspells "gato" and IGC corrects it → `ga` → moves from Suppressed back to Active.
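The tier transitions and scoring above could compose like this (a Python sketch; the use-type codes follow this document, but the graduation threshold, the ×2 multiplier, and all names are assumptions for illustration):

```python
ACTIVE_MULTIPLIER = 2            # assumed boost for Active-tier words
CORRECT_STREAK_TO_GRADUATE = 3   # assumed value of N

def classify_tier(uses):
    """uses: chronological construct-use codes for one lemma,
    e.g. 'wa', 'ta', 'ga', 'corLM', 'incLM'."""
    tier, streak = "maintenance", 0
    for use in uses:
        if use == "wa":                                      # knew it unassisted
            tier, streak = "suppressed", 0
        elif use in ("ta", "ga") or use.startswith("inc"):   # needed help / wrong
            tier, streak = "active", 0
        elif use.startswith("cor"):
            streak += 1
            if tier == "active" and streak >= CORRECT_STREAK_TO_GRADUATE:
                tier, streak = "maintenance", 0              # learning is sticking
    return tier

def tier_score(tier, days_since_last_used, is_content_word):
    if tier == "suppressed":
        return 0                                             # skip entirely
    base = days_since_last_used * (10 if is_content_word else 7)
    return base * (ACTIVE_MULTIPLIER if tier == "active" else 1)
```

Note how the sketch reproduces the scenario: a lone `wa` suppresses, a later `ga` reactivates, and three correct answers graduate an Active word back to Maintenance.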

### ⚠️ Grammar Error Practice: Missing Message Data

`_fetchErrors` finds grammar practice material by querying construct uses of type `ga` (grammar accepted), then resolving the original message via `room?.getEventById(eventID)`. The Matrix SDK's `Room.getEventById` already falls back to a server fetch if the event isn't in the local database, but the chain can still fail one step earlier: `client.getRoomById(roomID)` returns null if the room isn't loaded in memory (e.g., the user left it). Even when the event is found, more data is needed: the `PangeaMessageEvent` wrapper requires a timeline, tokens, choreo data, and translations, all of which are only available if the full message representation events are also accessible. A similar pattern applies to `_fetchAudio` and `_fetchMorphs`, which need example messages with token data and audio. See Future Work for improvements.


## Activity Types

The `ActivityTypeEnum` defines all exercise types. They split into two groups:

### Message Practice Types (toolbar)

| Type | Enum Value | What the User Does | Learning Channel |
| --- | --- | --- | --- |
| Listening | `wordFocusListening` | Hears the word spoken, taps the correct token in the message | Auditory recognition |
| Meaning | `wordMeaning` | Sees L1 translations, picks the right one for a highlighted word | Semantic retrieval |
| Emoji | `emoji` | Chooses the best emoji for a word (associative learning) | Visual association |
| Grammar | `morphId` | Matches morphological features (tense, number, case) to values | Analytical/structural |

### Standalone Practice Types (analytics page)

| Type | Enum Value | What the User Does | Generator |
| --- | --- | --- | --- |
| Vocab Meaning | `lemmaMeaning` | Picks the correct lemma definition from distractors | `VocabMeaningActivityGenerator` |
| Vocab Audio | `lemmaAudio` | Hears a word in the context of an example sentence, identifies it | `VocabAudioActivityGenerator` |
| Grammar Category | `grammarCategory` | Identifies the correct morph tag (e.g., "Past Tense") for a word | `MorphCategoryActivityGenerator` |
| Grammar Error | `grammarError` | Picks the correct replacement for a grammar error they made | `GrammarErrorPracticeGenerator` |

Each activity type maps to specific `ConstructUseTypeEnum` values for correct/incorrect/ignored answers (e.g., `corLM`/`incLM` for `lemmaMeaning`).


## Standalone Practice Sessions (Vocab & Grammar)

### Session Lifecycle

1. `AnalyticsPracticeSessionRepo.get(type, language)` builds a session:
   - Vocab: fetches the user's weakest lemmas (by spaced-repetition score), splits ~50/50 between `lemmaAudio` (needs example messages with audio) and `lemmaMeaning` targets
   - Grammar: fetches recent grammar errors first (`grammarError` targets), then fills remaining slots with weak morph features (`grammarCategory` targets)
   - Session size: 10 exercises + a 5-item error buffer (constants in `AnalyticsPracticeConstants`)
2. `AnalyticsPracticeState` manages the session UI — progress bar, timer, activity queue, hints
3. For each target, a `MessageActivityRequest` is sent to the appropriate generator
4. The generator returns a `PracticeActivityModel` subclass with choices and answers
5. On answer, a construct use is recorded and the session advances
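The vocab branch of step 1 can be sketched like this (Python; the quota logic and names are assumptions, only the 10 + 5 constants and the ~50/50 split come from this document):

```python
SESSION_SIZE = 10   # exercises per session (AnalyticsPracticeConstants)
ERROR_BUFFER = 5    # extra targets kept in reserve for errors

def build_vocab_session(weak_lemmas, has_audio_example):
    """weak_lemmas: lemmas ordered weakest-first.
    has_audio_example: lemma -> bool (audio targets need an example message)."""
    total = SESSION_SIZE + ERROR_BUFFER
    audio_quota = total // 2   # ~50/50 split between audio and meaning
    targets, audio_count = [], 0
    for lemma in weak_lemmas:
        if len(targets) == total:
            break
        if audio_count < audio_quota and has_audio_example(lemma):
            targets.append(("lemmaAudio", lemma))
            audio_count += 1
        else:
            targets.append(("lemmaMeaning", lemma))
    return targets
```

Lemmas without a usable audio example simply fall back to `lemmaMeaning`, so a session always fills even when audio data is sparse.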

### Session Completion

When all targets are answered, `CompletedActivitySessionView` shows:

- Total correct / incorrect / skipped
- Time elapsed (with bonus XP if under 60 seconds)
- Per-item review

### Subscription Gate

Standalone practice requires an active subscription. `UnsubscribedPracticePage` is shown if the user isn't subscribed.


## Message Practice (Toolbar)

### Target Selection

`PracticeSelectionRepo` determines which tokens in a message become practice targets:

- Only tokens with `saveVocab = true` on their lemma (filters out punctuation, numbers, etc.)
- Only messages in the user's target language (L2)
- Deduplicated by lemma — if "running" and "runs" appear in the same message, only one is selected
- Capped at 5 targets per activity type per message (avoids overwhelming long messages)

Selections are cached per message with a 1-day TTL in `PracticeSelection`.

Token priority within each message uses the scoring formula described in Target Prioritization above.
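The filter pipeline above, as a sketch (Python; the token shape and function name are hypothetical, not the real `PracticeSelectionRepo` API):

```python
MAX_TARGETS_PER_TYPE = 5   # cap per activity type per message

def select_message_targets(tokens):
    """tokens: dicts with 'lemma' and 'save_vocab' keys (illustrative shape)."""
    seen, targets = set(), []
    for tok in tokens:
        if not tok["save_vocab"]:   # skip punctuation, numbers, etc.
            continue
        if tok["lemma"] in seen:    # dedupe by lemma ("running" vs "runs")
            continue
        seen.add(tok["lemma"])
        targets.append(tok)
        if len(targets) == MAX_TARGETS_PER_TYPE:
            break                   # avoid overwhelming long messages
    return targets
```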

### Practice Modes

`MessagePracticeMode` defines the four toolbar modes: `listening`, `wordMeaning`, `wordEmoji`, `wordMorph`. Each mode maps to an `ActivityTypeEnum` and shows per-word buttons on the message. When all words in a mode are complete, the mode's icon turns gold.

### Controller

`PracticeController` manages per-message practice state:

- Fetches `PracticeSelection` on construction
- Generates activities on demand via `PracticeRepo`
- Records answers via `PracticeRecordController`
- Plays TTS on correct answers for audio reinforcement

### Activity Generation

`PracticeRepo` is the central dispatch for generating exercises. It:

1. Receives a `MessageActivityRequest` with a `PracticeTarget` (tokens + activity type + optional morph feature)
2. Routes to the correct generator based on activity type
3. Caches results per target with a 1-day TTL to avoid re-generating on re-render

Message-practice types (`wordMeaning`, `emoji`, `morphId`, `wordFocusListening`) call the choreographer API; standalone types (`lemmaMeaning`, `lemmaAudio`, `grammarCategory`, `grammarError`) generate locally using lemma data and morph mappings.
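The per-target caching pattern can be sketched generically (Python; the real `PracticeRepo` caches `PracticeActivityModel` objects keyed by target, and this class name is made up for illustration):

```python
import time

TTL_SECONDS = 24 * 60 * 60   # 1-day TTL, as described above

class TtlCache:
    """Generic get-or-generate cache; the clock is injectable for testing."""
    def __init__(self, clock=time.time):
        self._clock = clock
        self._entries = {}   # key -> (stored_at, value)

    def get_or_generate(self, key, generate):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < TTL_SECONDS:
            return hit[1]                  # fresh: skip re-generation
        value = generate()                 # stale or missing: regenerate
        self._entries[key] = (now, value)
        return value
```

Keying by target (rather than by message) is what lets a re-render reuse the same generated exercise instead of calling the generator again.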

### Model Hierarchy

`PracticeActivityModel` is a sealed class with subclasses for each activity type:

- `VocabMeaningPracticeActivityModel`, `VocabAudioPracticeActivityModel`
- `MorphCategoryPracticeActivityModel`, `GrammarErrorPracticeActivityModel`
- `LemmaPracticeActivityModel`, `LemmaMeaningPracticeActivityModel`
- `EmojiPracticeActivityModel`, `MorphMatchPracticeActivityModel`
- `WordListeningPracticeActivityModel`

All expose a `multipleChoiceContent` (choices + answers) and produce a `PracticeTarget` for recording.


## Key Contracts

- **Practice targets are deterministic per message.** For a given eventId + language + token set, the same targets are generated and cached. Don't introduce randomness that would change targets on re-render.
- **Practice never blocks on network.** Selection happens locally from cached token data. Activity content fetches from choreo, but the UI shows shimmer placeholders, never a blocking spinner.
- **Emoji and meaning choices persist beyond the practice session.** They become the user's personal annotation on that lemma, visible in word cards and analytics.
- **All practice produces construct uses.** Whether from the toolbar or the standalone page, every answer is recorded as a `ConstructUseModel` that feeds into the analytics system.
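A minimal sketch of the last contract (Python; the record shape and `source` field are illustrative, and only the `corLM`/`incLM` pair for `lemmaMeaning` is named in this document):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ConstructUse:
    lemma: str
    use_type: str   # e.g. "corLM" / "incLM" for lemmaMeaning answers
    source: str     # "toolbar" or "standalone" (illustrative field)

def record_lemma_meaning_answer(lemma, source, correct):
    # Every answer, right or wrong, produces a construct-use record that
    # feeds analytics; wrong answers still count (at reduced XP).
    return ConstructUse(lemma, "corLM" if correct else "incLM", source)
```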

## Future Work

_Last updated: 2026-02-15_

### Practice Types & Modalities

### Practice Generation & Targeting

### Practice UX & Feedback

- pangeachat/client#5436 — If message practice is complete, put a special gold barbell reaction on it
- Persist a completion record to the Matrix room when a user completes all 4 practice modes on a message, making practice targets deterministic across sessions and devices
- pangeachat/client#3569 — Practice exercises in the analytics page

### Bugs & Quality

### Server-Side & Cross-Device

- Server-side practice history to enable cross-device spaced repetition
- pangeachat/client#5700 Part 3 — Server-side message fetch fallback for practice (room resolution, related sub-event data)
- More activity types (fill-in-the-blank, sentence reordering, pronunciation scoring)