Apr 2025 – Jun 2025
GemmaTR
Overview
GemmaTR addresses the lack of Turkish chatbot resources by fine-tuning Google's Gemma model with Unsloth and LoRA on Google Colab. I built a dataset of 400,000 Turkish Wikipedia entries and 50,000 question-answer pairs focused on law, education, and agriculture, trained four model variants, and published them on Hugging Face for public access.
Problem
Turkish users have fewer open and specialized LLM resources than English users, especially for domain-focused question answering in areas like law, education, and agriculture.
Technical Approach
I prepared a large Turkish dataset, fine-tuned Google Gemma with Unsloth and LoRA, and iterated through four model variants during a 40-hour training process on Google Colab.
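To illustrate why LoRA makes this kind of fine-tuning feasible on limited hardware such as Colab, here is a minimal sketch of the parameter arithmetic. For a frozen weight matrix of shape d_out x d_in, LoRA trains two small matrices, A (rank x d_in) and B (d_out x rank), so it adds rank * (d_in + d_out) trainable parameters. The layer size and rank below are hypothetical values chosen for illustration, not the settings used for GemmaTR.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter on a d_out x d_in weight."""
    # LoRA learns A (rank x d_in) and B (d_out x rank); the base weight stays frozen.
    return rank * (d_in + d_out)


def frozen_params(d_in: int, d_out: int) -> int:
    """Parameters in the original (frozen) weight matrix."""
    return d_in * d_out


# Hypothetical projection layer and rank, for illustration only.
d_in, d_out, rank = 2048, 2048, 16

added = lora_trainable_params(d_in, d_out, rank)
frozen = frozen_params(d_in, d_out)
ratio = added / frozen

print(f"trainable: {added}, frozen: {frozen}, ratio: {ratio:.4%}")
```

Under these assumed values, the adapter trains only a small fraction of the parameters in the layer it wraps, which is the main reason the approach suits a single Colab GPU.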
Result
GemmaTR's four model variants are publicly available on Hugging Face, which makes this Turkish LLM fine-tuning work discoverable and adds openly shared models to the Turkish NLP ecosystem.