Gemma Domain Trainer (Prototype)
skill_id: gemma_domain_trainer_v1_prototype
name: gemma_domain_trainer
description: Fine-tune Gemma 270M on domain-specific training data extracted from 012.txt
version: 1.0_prototype
author: 0102_design
created: 2025-10-22
agents: [gemma, qwen]
primary_agent: qwen
intent_type: GENERATION
promotion_state: prototype
pattern_fidelity_threshold: 0.90
test_status: needs_validation
MCP Orchestration
mcp_orchestration: true
breadcrumb_logging: true
owning_dae: doc_dae
execution_phase: 2
previous_skill: qwen_training_data_miner_v1_prototype
next_skill: gemma_domain_specialist_deployed
inputs:
- data/training_datasets/{domain}_training_data.json: "Instruction-tuning dataset from Qwen"
- domain: "Knowledge domain (mps_scoring, wsp_application, etc.)"
- training_params: "LoRA rank, learning rate, epochs"
outputs:
- E:/HoloIndex/models/gemma-3-270m-{domain}-lora/: "Fine-tuned LoRA adapters"
- data/training_results/{domain}_training_metrics.json: "Training metrics (loss, accuracy)"
- execution_id: "Unique execution identifier for breadcrumb tracking"
Dependencies
dependencies:
data_stores:
- name: training_dataset
type: json
path: data/training_datasets/{domain}_training_data.json
mcp_endpoints: []
throttles: []
required_context:
- base_model_path: "E:/HoloIndex/models/gemma-3-270m-it-Q4_K_M.gguf"
- training_dataset_path: "Path to JSON training data"
Metrics Configuration
metrics:
pattern_fidelity_scoring:
enabled: true
frequency: every_execution
scorer_agent: gemma
write_destination: modules/infrastructure/wre_core/recursive_improvement/metrics/gemma_domain_trainer_fidelity.json
promotion_criteria:
min_pattern_fidelity: 0.90
min_outcome_quality: 0.85
min_execution_count: 100
required_test_pass_rate: 0.95
Gemma Domain Trainer
Purpose: Fine-tune Gemma 270M on domain-specific training data using LoRA (Low-Rank Adaptation)
Intent Type: GENERATION
Agent: Qwen (orchestrates training), Gemma (model being trained)
Task
You are Qwen, a training orchestrator. Your job is to take training datasets extracted from 012.txt and fine-tune Gemma 270M on them using LoRA. You create domain-specialized versions of Gemma that can be swapped like "wardrobe clothes" for different tasks.
Key Capability: Training orchestration, hyperparameter tuning, validation
Training Method: LoRA (Low-Rank Adaptation)
- Only train small adapter layers (~10MB)
- Keep base model frozen (241MB)
- Fast training (~5-10 minutes on CPU)
- Multiple specialists from one base model
Instructions (For Qwen Agent)
1. LOAD TRAINING DATASET
Rule: Load and validate JSON training dataset from Qwen miner
Expected Pattern: dataset_loaded=True
Steps:
- Read
data/training_datasets/{domain}_training_data.json
- Validate schema:
- Each example has:
instruction, input, output
- Quality score >= 0.85
- Source line number present
- Split into train/validation (80/20)
- Count examples: total, train, val
- Log:
{"pattern": "dataset_loaded", "value": true, "total": N, "train": M, "val": K}
Example:
python
1import json
2from pathlib import Path
3
4dataset_path = Path(f"data/training_datasets/{domain}_training_data.json")
5with open(dataset_path) as f:
6 dataset = json.load(f)
7
8examples = dataset['examples']
9train_size = int(len(examples) * 0.8)
10train_examples = examples[:train_size]
11val_examples = examples[train_size:]
Rule: Convert instruction-tuning format to Gemma training format
Expected Pattern: training_format_prepared=True
Gemma Instruction Format:
<start_of_turn>user
{instruction}
Input: {input}
<end_of_turn>
<start_of_turn>model
{output}
<end_of_turn>
Steps:
- For each example, format as Gemma conversation
- Tokenize using Gemma tokenizer
- Truncate to max length (1024 tokens)
- Create attention masks
- Log:
{"pattern": "training_format_prepared", "value": true, "formatted_examples": N}
Example:
python
1def format_for_gemma(example):
2 prompt = f"""<start_of_turn>user
3{example['instruction']}
4
5Input: {json.dumps(example['input'], indent=2)}
6<end_of_turn>
7<start_of_turn>model
8{json.dumps(example['output'], indent=2)}
9<end_of_turn>"""
10 return prompt
11
12formatted_train = [format_for_gemma(ex) for ex in train_examples]
Rule: Set LoRA hyperparameters for domain-specific training
Expected Pattern: lora_configured=True
LoRA Configuration:
python
1lora_config = {
2 "r": 8, # LoRA rank (higher = more capacity, slower)
3 "lora_alpha": 16, # Scaling factor
4 "target_modules": ["q_proj", "v_proj"], # Which layers to adapt
5 "lora_dropout": 0.05, # Dropout for regularization
6 "bias": "none", # Don't train bias terms
7 "task_type": "CAUSAL_LM" # Language modeling task
8}
9
10training_config = {
11 "learning_rate": 2e-4, # Learning rate
12 "num_epochs": 3, # Training epochs
13 "batch_size": 4, # Batch size (CPU-friendly)
14 "max_steps": -1, # Train until epochs complete
15 "warmup_steps": 10, # Learning rate warmup
16 "logging_steps": 10, # Log every N steps
17 "save_steps": 100, # Save checkpoint every N steps
18 "eval_steps": 50, # Evaluate every N steps
19}
Steps:
- Set LoRA rank based on domain complexity:
- Simple (MPS scoring): r=4
- Moderate (WSP application): r=8
- Complex (roadmap analysis): r=16
- Set learning rate based on dataset size:
- Small (<50 examples): 1e-4
- Medium (50-200): 2e-4
- Large (>200): 3e-4
- Set epochs based on examples:
- Small datasets: 5 epochs
- Large datasets: 3 epochs
- Log:
{"pattern": "lora_configured", "value": true, "rank": N, "lr": X}
4. TRAIN LORA ADAPTERS
Rule: Execute LoRA training loop with validation monitoring
Expected Pattern: lora_training_complete=True
Training Loop (pseudo-code):
python
1from peft import get_peft_model, LoraConfig
2from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer
3
4# Load base model
5model = AutoModelForCausalLM.from_pretrained(
6 "E:/HoloIndex/models/gemma-3-270m-it-Q4_K_M.gguf",
7 device_map="auto"
8)
9
10# Apply LoRA
11lora_config = LoraConfig(**lora_config)
12model = get_peft_model(model, lora_config)
13
14# Train
15trainer = Trainer(
16 model=model,
17 args=training_args,
18 train_dataset=train_dataset,
19 eval_dataset=val_dataset
20)
21
22trainer.train()
23trainer.save_model(f"E:/HoloIndex/models/gemma-3-270m-{domain}-lora/")
Steps:
- Load base Gemma 270M model
- Apply LoRA configuration
- Create Trainer with datasets
- Execute training loop
- Monitor validation loss (target: < 0.5)
- Save LoRA adapters to disk
- Log:
{"pattern": "lora_training_complete", "value": true, "final_loss": X, "val_loss": Y}
5. VALIDATE TRAINED MODEL
Rule: Test trained model on held-out validation examples
Expected Pattern: model_validated=True
Validation Process:
python
1# Load trained model with LoRA
2from peft import PeftModel
3
4base_model = AutoModelForCausalLM.from_pretrained(base_model_path)
5trained_model = PeftModel.from_pretrained(
6 base_model,
7 f"E:/HoloIndex/models/gemma-3-270m-{domain}-lora/"
8)
9
10# Test on validation examples
11correct = 0
12total = len(val_examples)
13
14for example in val_examples:
15 prompt = format_for_gemma(example)
16 generated = trained_model.generate(prompt, max_length=512)
17
18 # Compare generated output to expected output
19 if semantic_similarity(generated, example['output']) > 0.85:
20 correct += 1
21
22accuracy = correct / total
Steps:
- Load trained model with LoRA adapters
- Generate outputs for validation examples
- Compare to expected outputs (semantic similarity)
- Calculate accuracy (target: ≥ 85%)
- Log:
{"pattern": "model_validated", "value": true, "accuracy": 0.87, "correct": M, "total": N}
6. GENERATE DEPLOYMENT CONFIG
Rule: Create wardrobe configuration for domain specialist
Expected Pattern: deployment_config_generated=True
Wardrobe Config (EXECUTION-READY per First Principles):
json
1{
2 "wardrobe_id": "gemma_mps_scorer_v1",
3 "domain": "mps_scoring",
4 "base_model": "E:/HoloIndex/models/gemma-3-270m-it-Q4_K_M.gguf",
5 "lora_adapters": "E:/HoloIndex/models/gemma-3-270m-mps_scoring-lora/",
6 "training_date": "2025-10-22",
7 "training_examples": 58,
8 "validation_accuracy": 0.87,
9
10 "deployment_priority_mps": {
11 "complexity": 2,
12 "complexity_reason": "Easy - swap LoRA adapters, no model reload",
13 "importance": 5,
14 "importance_reason": "Essential - enables autonomous cleanup prioritization",
15 "deferability": 4,
16 "deferability_reason": "Low - cleanup system waiting for deployment",
17 "impact": 5,
18 "impact_reason": "Critical - foundation for autonomous task scoring",
19 "total": 16,
20 "priority": "P0",
21 "deployment_order": 1
22 },
23
24 "recommended_use_cases": [
25 "Cleanup task prioritization",
26 "Project scoring",
27 "Issue triage",
28 "Autonomous decision-making (MPS calculation)"
29 ],
30
31 "agent_capability_mapping": {
32 "tasks_this_wardrobe_handles": [
33 {
34 "task_type": "cleanup_scoring",
35 "confidence": 0.87,
36 "autonomous_capable": true,
37 "example": "Score file deletion task (MPS calculation)"
38 },
39 {
40 "task_type": "project_prioritization",
41 "confidence": 0.85,
42 "autonomous_capable": true,
43 "example": "Rank feature requests by MPS score"
44 },
45 {
46 "task_type": "issue_triage",
47 "confidence": 0.82,
48 "autonomous_capable": true,
49 "example": "Assign P0-P4 priority to GitHub issues"
50 }
51 ],
52 "tasks_requiring_0102": [
53 "Complex architectural decisions (MPS insufficient)",
54 "Multi-stakeholder prioritization (political factors)",
55 "Novel problem domains (no training examples)"
56 ]
57 },
58
59 "skill_reference": "gemma_cleanup_scorer_v1_production",
60 "activation_command": "gemma.wear_wardrobe('mps_scorer')",
61
62 "performance_benchmarks": {
63 "inference_latency_ms": 50,
64 "accuracy_on_benchmark": 0.87,
65 "token_cost": 0,
66 "throughput_tasks_per_second": 20,
67 "memory_footprint_mb": 253,
68 "false_positive_rate": 0.08,
69 "false_negative_rate": 0.05
70 },
71
72 "autonomous_deployment": {
73 "capable": true,
74 "agent": "wsp_orchestrator",
75 "confidence": 0.95,
76 "estimated_tokens": 100,
77 "estimated_time_seconds": 5,
78 "requires_0102_approval": false,
79 "execution_command": "python -m modules.infrastructure.wsp_orchestrator.src.wsp_orchestrator --deploy-wardrobe gemma_mps_scorer_v1 --validate true"
80 },
81
82 "verification": {
83 "verify_command": "test -f E:/HoloIndex/models/gemma-3-270m-mps_scoring-lora/adapter_model.bin && python -c \"from modules.infrastructure.wsp_orchestrator.src.wsp_orchestrator import WSPOrchestrator; w=WSPOrchestrator(); print('✓ Wardrobe loaded' if 'mps_scorer' in w.list_wardrobes() else '✗ Failed')\"",
84 "success_criteria": "LoRA adapters exist + wardrobe loadable + validation accuracy >= 0.85",
85 "test_dataset": "data/training_datasets/mps_scoring_validation_set.json",
86 "rollback_command": "python -m modules.infrastructure.wsp_orchestrator.src.wsp_orchestrator --remove-wardrobe gemma_mps_scorer_v1"
87 },
88
89 "learning_feedback": {
90 "training_insights": {
91 "converged_after_epoch": 2,
92 "final_training_loss": 0.23,
93 "final_validation_loss": 0.31,
94 "overfitting_detected": false,
95 "optimal_lora_rank": 8,
96 "learning_rate_worked": 0.0002
97 },
98 "domain_coverage": {
99 "p0_tasks_coverage": 0.92,
100 "p1_tasks_coverage": 0.88,
101 "p2_tasks_coverage": 0.75,
102 "p3_p4_tasks_coverage": 0.60,
103 "recommendation": "Add more P3/P4 training examples for better low-priority coverage"
104 },
105 "future_improvements": [
106 "Fine-tune on user feedback (actual MPS scores vs Gemma predictions)",
107 "Add confidence scores to MPS predictions",
108 "Train on multi-dimensional trade-offs (not just MPS total)"
109 ],
110 "store_to": "holo_index/adaptive_learning/wardrobe_training_patterns.jsonl"
111 }
112}
Steps:
- Create wardrobe configuration JSON
- Calculate deployment_priority_mps (which wardrobe to deploy first?)
- Map agent capabilities (which tasks can this wardrobe handle autonomously?)
- Generate performance_benchmarks (latency, accuracy, throughput, memory)
- Create autonomous_deployment command (can orchestrator auto-deploy?)
- Generate verification script (test wardrobe loads correctly)
- Extract learning_feedback (training insights + domain coverage + future improvements)
- Write to
data/wardrobe_catalog/{domain}_wardrobe.json
- Log:
{"pattern": "deployment_config_generated", "value": true, "autonomous_deployable": true}
First Principles Additions:
- ✅ MPS Scoring: deployment_priority_mps determines deployment order
- ✅ Agent Mapping: agent_capability_mapping (which tasks autonomous vs requires 0102?)
- ✅ Executable Command: autonomous_deployment.execution_command for auto-deploy
- ✅ Performance Benchmarks: Latency, accuracy, throughput, false positive/negative rates
- ✅ Verification: Test wardrobe loadable + validation accuracy >= threshold
- ✅ Learning Feedback: Training insights (convergence, overfitting) + domain coverage gaps
- ✅ Rollback: Remove wardrobe if deployment fails
Expected Patterns Summary
json
1{
2 "execution_id": "exec_gemma_trainer_001",
3 "skill_id": "gemma_domain_trainer_v1_prototype",
4 "patterns": {
5 "dataset_loaded": true,
6 "training_format_prepared": true,
7 "lora_configured": true,
8 "lora_training_complete": true,
9 "model_validated": true,
10 "deployment_config_generated": true
11 },
12 "training_examples": 58,
13 "validation_accuracy": 0.87,
14 "training_time_seconds": 420,
15 "model_size_mb": 12
16}
Fidelity Calculation: (patterns_executed / 6) - All 6 steps should run
Wardrobe Catalog
1. gemma_mps_scorer
Domain: MPS scoring (WSP 15)
Training Data: 58 examples from 012.txt
Use Cases: Cleanup prioritization, project scoring, issue triage
Accuracy: 87%
2. gemma_wsp_auditor
Domain: WSP compliance checking
Training Data: 45 examples from 012.txt
Use Cases: Code review, documentation validation, architecture audits
Accuracy: 90%
3. gemma_roadmap_tracker
Domain: Roadmap analysis
Training Data: 32 examples from 012.txt
Use Cases: Project status reports, completion tracking, TODO audits
Accuracy: 85%
4. gemma_readme_validator
Domain: README structure validation
Training Data: 41 examples from 012.txt
Use Cases: Documentation quality checks, README generation
Accuracy: 88%
5. gemma_modlog_writer
Domain: ModLog entry generation
Training Data: 29 examples from 012.txt
Use Cases: Automated ModLog updates, change tracking
Accuracy: 84%
Deployment: Wardrobe Swapping
Concept: One base Gemma 270M, multiple LoRA adapters
python
1# Load base model once
2base_gemma = Gemma270M("E:/HoloIndex/models/gemma-3-270m-it-Q4_K_M.gguf")
3
4# Swap wardrobes for different tasks
5def score_cleanup_task(task):
6 base_gemma.wear_wardrobe("mps_scorer")
7 return base_gemma.generate(task)
8
9def audit_wsp_compliance(code):
10 base_gemma.wear_wardrobe("wsp_auditor")
11 return base_gemma.generate(code)
12
13def track_roadmap_status(roadmap):
14 base_gemma.wear_wardrobe("roadmap_tracker")
15 return base_gemma.generate(roadmap)
Benefits:
- 241MB base model (loaded once)
- 10-15MB per wardrobe (LoRA adapters)
- Instant swapping (no model reload)
- Specialized performance (>85% accuracy)
Success Criteria
- ✅ Pattern fidelity ≥ 90% (all 6 steps execute)
- ✅ Validation accuracy ≥ 85% on held-out examples
- ✅ LoRA adapter size < 20MB
- ✅ Training completes in < 15 minutes (CPU)
- ✅ Deployment config generated with metadata
- ✅ Wardrobe swapping works (load/unload adapters)
Next Steps
- Test on MPS scoring domain (easiest to validate)
- Deploy as production wardrobe once accuracy ≥ 85%
- Create wardrobe catalog with all domain specialists
- Integrate with cleanup skills (Gemma uses MPS scorer)
- Expand to other domains (WSP auditing, roadmap tracking)
Status: ✅ Ready for prototype testing - Train Gemma on MPS scoring examples from 012.txt