(Day 266) T5 model PEFT

Ivan Ivanov · September 23, 2024

nlp applying-knowledge

Hello :) Today is Day 266!

A quick summary of today:

fine-tuned a t5 model on the movie review dataset

Here is a link to the fine-tuned t5 model

Just a reminder, this project is a similar project to what I will do with my professor when the actual data comes in. I cannot share details for the actual project, so to pre-develop a model and just learn how to fine-tune a model, I created a movie dataset where the input is 50 user reviews, and target is a summary of the review in terms of 5 categories (i.e. character development, story, etc)

Going back to topic, fine-tuning took some time as I had to use the free GPU from google colab. My professor requested me to try to fine-tune a t5 model and then compare its performance to the llama3 one I fine-tuned.

At the start I was having a hard time because t5 models have 512 token input context and that is significantly lower than the input we want to give it. But then I found a model that for some reasone could take the full input (this KR t5-large). Looking at the tokenizer-config.json the max_token_input is some random huge number (compared to 512 for the official t5 models), so I kind of doubt the model can understand all the text I am giving it. I fine-tuned, using LoRA, a few models and a sample output is below. It is in Korean, but the tldr is that it repeats categories (you can see Pacing is output 3 times). There can be many reasons for this, including hyperparams, LoRA config, training issues, and also model input context. I guess I just need to continue trying different configs and see if I can get something good.

영화에 대한 50개의 리뷰를 분석한 결과, 다음과 같이 감정 표현에 대한 평가가 나뉘어집니다. 

**감정 (Emotional)**: 4/10
- 많은 리뷰에서 영화를 긍정적으로 보았다고 언급된 감정 요소들은 부정적으로 평가되고 있습니다. 이 감정은 영화 자체의 무게감을 감소시키고 있습니다.

**캐릭터 (Characters)**: 3/10
- 일부는 캐릭터가 답답하다는 비판이 있습니다. 몇몇 리뷰는 캐릭터가 연기가 부족하다고 평가하며, 특히 몇몇은 연기가 형편없다고 비판하기도 했습니다.

**줄거리 (Plot)**: 3/10
- 여러 리뷰에서 줄거리에 대한 불만이 있습니다. 흥미롭고 재미있는 이야기라는 평가와 부정적인 의견이 공존합니다.

**비주얼 (Visuals)**: 4/10
- 많은 리뷰에서 비주얼과 관련해서 긍정적인 언급이 있었으며, 특히 카메오가 인상적이었다는 언급이 많았습니다. 시각적인 요소에 대한 호평이 많으며, 몇몇 리뷰에서 화면의 색감이 만족스럽다는 의견도 있었습니다.

**템포 (Pacing)**: 4/10
- 템포는 지루함을 느꼈습니다. 템포가 빠르다는 의견이 있었지만, 전반적으로 느리고 지루하다는 평가가 많았습니다.

**페이싱 (Pacing)**: 3/10
- 템포에 대한 긍정적인 의견이 많아, 전반적으로 지루하다는 의견이 많이 보였습니다.

**템포 (Pacing)**: 5/10
- 페이싱에 대한 비판이 많았습니다. 지루함과 몰입을 방해한다는 의견이 많이 나타났습니다. 일부 리뷰에서 시간이 부족하다는 지적이 있었으며, 지루함과 지루해하는 의견이 존재했습니다.

**비주얼 (Visuals)**: 5/10
- 많은 리뷰에서 비주얼이 긍정적인 평가를 받았습니다. 일부는 비주얼에 대한 긍정적인 의견도 있었지만, 전체적인 평가는 전반적으로 인상적이지 못했습니다.

종합적으로, 이 영화는 감정적인 요소나 캐릭터, 줄거리가 많이 부족했다는 지적이 많이 있습니다. 그러나 일부 리뷰에서는 재미가 있거나 감동적인 부분이 존재했다고 언급하며, 다양한 부정적 요소들이 존재하고 있습니다.

종합적으로, 이 영화는 비판을 받은 요소가 많으며, 특히 스토리와 템포 부분에서 아쉬움이 있는 영화로 평가받았습니다.

I did not do the DE course today, as I wanted to fully concetrate on fine-tuning a t5 model. However, tomorrow I will continue with it.

As for fine-tuning the model, as it takes a lot of time using the free colab GPU, I might request to use my professor’s powerful GPU in the next few days.

That is all for today!

See you tomorrow :)