Apr 2025 – Jun 2025

GemmaTR

To address the shortage of Turkish chatbots, I fine-tuned Google's Gemma model on Google Colab using the Unsloth library and the LoRA technique. Training took 40 hours in total, split across multiple sessions. I built a dataset of 400,000 Turkish Wikipedia articles and 50,000 question-answer pairs focused on law, education, and agriculture, trained four model variants, and published them on Hugging Face for public access.

Overview

Problem

Turkish users have fewer open, specialized chatbot resources than English users, especially for domain-focused question answering.

Technical Approach

I fine-tuned Google Gemma with Unsloth and LoRA on a Turkish corpus of 400,000 Wikipedia articles and 50,000 question-answer pairs covering law, education, and agriculture.
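
To illustrate why LoRA makes this kind of fine-tuning feasible on Colab, here is a minimal sketch of the parameter arithmetic. LoRA freezes a weight matrix of shape d_out x d_in and trains only two small factors, B (d_out x r) and A (r x d_in). The layer size and rank below are hypothetical values chosen for illustration, not the settings used for GemmaTR.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Number of weights in a dense d_out x d_in matrix."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable weights in the LoRA factors A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank


def trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Share of the original matrix size that LoRA actually trains."""
    return lora_params(d_in, d_out, rank) / full_params(d_in, d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, for illustration only.
    d_in, d_out, rank = 2048, 2048, 16
    print("full matrix weights:", full_params(d_in, d_out))
    print("LoRA trainable weights:", lora_params(d_in, d_out, rank))
    print("trainable fraction:", trainable_fraction(d_in, d_out, rank))
```

With these example numbers, the adapter trains 65,536 weights in place of 4,194,304, which is roughly 1.6% of the layer. The saving grows as the layer gets larger relative to the rank.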

Result

GemmaTR produced four Turkish model variants, all published on Hugging Face for community access and reuse.

Technologies

Programming

External Links

Hugging Face GitHub Profile