Apr 2025 – Jun 2025

GemmaTR

To address the shortage of Turkish chatbots, I fine-tuned Google's Gemma model on Google Colab in checkpointed fragments totalling roughly 40 hours, using the Unsloth library and the LoRA technique. I built a dataset of 400,000 Turkish Wikipedia articles and 50,000 question-answer pairs focused on law, education, and agriculture, trained four model variants, and published them on Hugging Face for public access.
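Supervised fine-tuning on question-answer pairs typically requires converting each pair into the model's chat template before training. The sketch below is a hypothetical illustration of that step using Gemma's turn-based prompt format; the function name, dataset fields, and the example pair are assumptions, not the project's actual preprocessing code.

```python
# Hypothetical sketch: format a Turkish QA pair with Gemma's chat-style
# turn markers before fine-tuning. The helper name and example text are
# illustrative assumptions, not the project's actual pipeline.

def format_example(question: str, answer: str) -> str:
    """Wrap a QA pair in Gemma-style user/model turns."""
    return (
        "<start_of_turn>user\n" + question + "<end_of_turn>\n"
        "<start_of_turn>model\n" + answer + "<end_of_turn>\n"
    )

# Illustrative law-domain pair ("What is a reserved share in
# inheritance law?" / shortened placeholder answer).
sample = format_example(
    "Miras hukukunda saklı pay nedir?",
    "Saklı pay, belirli yasal mirasçıların korunan miras payıdır.",
)
print(sample)
```

Each of the 50,000 domain QA pairs would pass through a step like this, while the Wikipedia articles could be used as plain continued-pretraining text.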

Problem

Turkish users have fewer open, specialized chatbot resources than English users, especially for domain-focused question answering.

Technical Approach

I fine-tuned Google Gemma with Unsloth and LoRA on a large Turkish corpus combining Wikipedia articles with question-answer data from the law, education, and agriculture domains.
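The key idea behind LoRA, which makes fine-tuning a model of Gemma's size feasible on Colab hardware, is to freeze the pretrained weights and train only a low-rank update. The dependency-free sketch below shows that idea in miniature (it is not the Unsloth API, and the dimensions are toy-sized assumptions): a frozen weight matrix W gets an additive update A @ B, where A and B are small.

```python
# Minimal sketch of the LoRA idea with toy dimensions (hypothetical;
# real models use thousands of dimensions and apply this per layer).
import random

random.seed(0)
d, r = 8, 2  # hidden size d, adapter rank r, with r << d

# Frozen pretrained weight (d x d) stays untouched during training.
W = [[random.gauss(0, 0.02) for _ in range(d)] for _ in range(d)]

# Trainable adapters: A (d x r) starts random, B (r x d) starts at zero,
# so W + A @ B equals W exactly before any training step.
A = [[random.gauss(0, 0.02) for _ in range(r)] for _ in range(d)]
B = [[0.0] * d for _ in range(r)]

def matmul(X, Y):
    """Plain nested-loop matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adapted_weight(W, A, B):
    """Effective weight used in the forward pass: W + A @ B."""
    AB = matmul(A, B)
    return [[W[i][j] + AB[i][j] for j in range(d)] for i in range(d)]

W_eff = adapted_weight(W, A, B)

# Only A and B are trained: 2*d*r parameters instead of d*d.
full_params = d * d
lora_params = d * r + r * d
print(lora_params, full_params)  # 32 vs 64 here; the gap grows with d
```

Because B starts at zero, the adapted model is identical to the base model at step 0, and training only touches the small A and B matrices, which is what keeps memory low enough for fragmented Colab sessions.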

Result

GemmaTR resulted in four Turkish model variants, all published on Hugging Face for community access and reuse.

Technologies

Google Gemma, Google Colab, Unsloth, LoRA, Hugging Face
