Apr 2025 – Jun 2025
GemmaTR
To address the scarcity of Turkish chatbots, I fine-tuned Google's Gemma model on Google Colab using the Unsloth library and the LoRA technique, with roughly 40 hours of training split across sessions. I built a dataset of 400,000 Turkish Wikipedia articles and 50,000 question-answer pairs focused on law, education, and agriculture, developed four model variants, and published them on the Hugging Face platform for public access.
Problem
Turkish users have fewer open, specialized chatbot resources than English users, especially for domain-focused question answering.
Technical Approach
I fine-tuned Google Gemma with Unsloth and LoRA on a large Turkish corpus combining Wikipedia articles with question-answer pairs covering law, education, and agriculture.
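The core idea behind LoRA is to freeze the pretrained weights and train only a low-rank pair of adapter matrices, which makes fine-tuning a model like Gemma feasible on Colab hardware. A minimal NumPy sketch of that mechanism (dimensions here are illustrative, not the actual Gemma layer sizes):

```python
import numpy as np

# Frozen pretrained weight matrix (illustrative size; real Gemma layers are far larger)
d_out, d_in, r = 64, 64, 8
rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))

# LoRA adapters: only A and B are trained, W stays frozen
A = rng.standard_normal((r, d_in)) * 0.01  # down-projection, small random init
B = np.zeros((d_out, r))                   # up-projection, zero init
alpha = 16                                  # scaling hyperparameter

def lora_forward(x):
    # Effective weight is W + (alpha / r) * B @ A, applied without materializing it
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapted model starts out identical to the base model
assert np.allclose(lora_forward(x), W @ x)

# Trainable parameters: two thin matrices instead of the full weight
lora_params = A.size + B.size  # 1024, versus W.size == 4096 for full fine-tuning
```

In practice the same pattern is applied inside the transformer's attention and MLP projections; Unsloth wraps this setup (adapter injection, 4-bit base weights, training loop) so only the adapters are updated.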
Result
GemmaTR produced four Turkish model variants, all published on Hugging Face for community access and reuse.