Q-Learning Recommendation System Dashboard

Q-Learning-based recommendation system with a composite state: VARK + MSLQ + AMS + Engagement
Actions: 101 (Reward), 102 (Product), 103 (Punishment), 105 (Mission), 106 (Coaching)

Total Students: -
Trained Students: -
Total Actions: 5
Recommendations: -

Quick Train Q-Learning Model
Recommended: 200-500 episodes for optimal learning
Episode Configuration Disclaimer
📊 What is an Episode?
One episode = one complete pass over the entire student interaction dataset to update the Q-values.

⚡ Impact on Learning:
  • 50-150 episodes: Quick learning, basic patterns
  • 200-500 episodes: Optimal convergence (recommended)
  • 500+ episodes: Diminishing returns, may overfit
🔄 Implementation Details:
• Alpha (α) = 0.1 (learning rate)
• Gamma (γ) = 0.9 (discount factor)
• Formula: Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') - Q(s,a)], where the max is taken over all actions a' available in the next state s'
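
A minimal sketch of a single update with the α and γ values listed above. The dictionary-based Q-table, the zero initial values, the reward of 1.0, and the function name q_update are assumptions made for illustration, not the system's actual implementation.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate, as listed above
GAMMA = 0.9  # discount factor, as listed above
ACTIONS = [101, 102, 103, 105, 106]  # action IDs shown on this dashboard

# Assumption: Q-table maps (state, action) to a value; unseen pairs start at 0.0
q_table = defaultdict(float)

def q_update(q, state, action, reward, next_state):
    """Apply one Q-learning update in place and return the new Q(s, a)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)  # max over a' of Q(s', a')
    td_target = reward + GAMMA * best_next
    q[(state, action)] += ALPHA * (td_target - q[(state, action)])
    return q[(state, action)]

# Hypothetical transition: action 105 (Mission) earns a reward of 1.0
s = "V_high_mslq_high_ams_intrinsic_eng_high"
s_next = "V_high_mslq_high_ams_intrinsic_eng_medium"
new_value = q_update(q_table, s, 105, reward=1.0, next_state=s_next)
print(new_value)  # 0.1 on an all-zero table: 0 + 0.1 * (1.0 + 0.9 * 0 - 0)
```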

⚠️ Performance Note:
Higher episodes = longer training time. For production use, consider incremental learning with new data.
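One way incremental learning could look, as a sketch only: keep the Q-table from an earlier run and apply updates for newly recorded interactions instead of retraining on everything. The row layout (state, action, reward, next_state), the stored value 0.8, and the helper name update_from_rows are hypothetical.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9
ACTIONS = [101, 102, 103, 105, 106]

def update_from_rows(q, rows, passes=1):
    """Update an existing Q-table in place using only the given rows."""
    for _ in range(passes):
        for state, action, reward, next_state in rows:
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
    return q

# Stand-in for a Q-table kept from an earlier full training run (made-up value)
q_table = defaultdict(float, {("K_high_mslq_low_ams_extrinsic_eng_low", 106): 0.8})

# Only interactions recorded since the last run are processed (made-up row)
new_rows = [
    ("K_high_mslq_low_ams_extrinsic_eng_low", 106, 1.0,
     "K_high_mslq_low_ams_extrinsic_eng_medium"),
]
update_from_rows(q_table, new_rows)
```

Because the update touches only the new rows, its cost grows with the amount of new data rather than with the full interaction history.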
Get Student Recommendations
How Q-Learning Training Works
Data Processing Strategy
📊 Training uses ALL student data:
  • ALL Students: The system processes every student in the database
  • ALL Interactions: Every interaction history record is used for training
  • ALL States: Every state combination (VARK+MSLQ+AMS+Engagement) is trained
No random sampling is performed; every record contributes to learning.
Episode Training Flow
🔄 Per Episode Process:
  1. Load ALL interaction data from the database
  2. Generate a state for each interaction
  3. Iterate through EVERY row sequentially
  4. Update the Q-values using the Q-learning formula
  5. Repeat for the next episode
Sequential processing ensures consistent learning.
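The per-episode flow above can be sketched roughly as follows. An in-memory list stands in for the database load and state generation (steps 1 and 2). The row layout, the sample rows, and the function name train are assumptions for illustration.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9
ACTIONS = [101, 102, 103, 105, 106]

def train(interactions, episodes=200):
    """Run `episodes` full sequential passes over `interactions`.

    Each interaction is a (state, action, reward, next_state) tuple;
    this layout is an assumption, not the system's actual schema.
    """
    q = defaultdict(float)
    for _ in range(episodes):                                    # step 5: repeat per episode
        for state, action, reward, next_state in interactions:   # step 3: every row, in order
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # step 4: Q-learning update
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
    return q

# Tiny made-up dataset, for illustration only
rows = [
    ("V_high_mslq_high_ams_intrinsic_eng_low", 105, 1.0,
     "V_high_mslq_high_ams_intrinsic_eng_medium"),
    ("V_high_mslq_high_ams_intrinsic_eng_medium", 101, 0.5,
     "V_high_mslq_high_ams_intrinsic_eng_high"),
]
q_table = train(rows, episodes=200)
```

On this toy data the value of the first row's state-action pair settles near 1.0 + 0.9 × 0.5 = 1.45, because the reward reachable from the next state is propagated backward through the discount factor.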
Q-Learning Mechanism
🧬 State Generation for EVERY student:
state = f"{vark_letter}_high_mslq_{cat}_ams_{motivation_type}_eng_{cat}"
Format: {V|A|R|K}_high_mslq_{high|medium|low}_ams_{intrinsic|extrinsic|achievement|amotivation}_eng_{high|medium|low}
Database States: 144 state combinations from VARK letters (V/A/R/K) × MSLQ levels × AMS motivation types × Engagement levels (4 × 3 × 4 × 3 = 144)
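As a quick check of the count, the state space can be enumerated from the format shown above. This is a sketch: the helper name make_state and the list constants are illustrative, while the category values are the ones listed in the format line.

```python
from itertools import product

VARK = ["V", "A", "R", "K"]
LEVELS = ["high", "medium", "low"]  # used for both MSLQ and Engagement
AMS = ["intrinsic", "extrinsic", "achievement", "amotivation"]

def make_state(vark_letter, mslq, motivation_type, engagement):
    """Build a state key in the format shown above."""
    return f"{vark_letter}_high_mslq_{mslq}_ams_{motivation_type}_eng_{engagement}"

all_states = [
    make_state(v, m, a, e)
    for v, m, a, e in product(VARK, LEVELS, AMS, LEVELS)
]
print(len(all_states))  # 144
print(all_states[0])    # V_high_mslq_high_ams_intrinsic_eng_high
```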
Learning Progress
📈 Progressive Learning:
  • Episode 1-50: Initial learning
  • Episode 100-200: Pattern recognition
  • Episode 300+: Convergence
Key Understanding
🎯 Comprehensive Training:
The system uses ALL student data and interactions to build a model with 144 unique states from the database
🔄 Database-Driven Process:
States use the database format: {V|A|R|K}_high_mslq_{level}_ams_{motivation}_eng_{level}
📊 Rich State Space:
VARK letters (V/A/R/K) × MSLQ levels × AMS types (intrinsic/extrinsic/achievement/amotivation) × Engagement levels
System Status
System ready. Train the model to start generating recommendations.