---
library_name: peft
license: apache-2.0
language:
- en
tags:
- text-generation
---

This is a LoRA finetune of GPT-J-6B - https://huggingface.co/EleutherAI/gpt-j-6B

The dataset is the cleaned version of the Alpaca dataset - https://github.com/gururise/AlpacaDataCleaned

Similar models have been discussed before.

The performance is good, but not as good as the original Alpaca trained from a LLaMA base model.

This is mostly due to the LLaMA 7B model being pretrained on 1T tokens, while GPT-J-6B was trained on roughly 400B tokens.

You will need a 3090 or an A100 to run it; unfortunately, this current version won't work on a T4.
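Below is a minimal inference sketch, assuming a recent `transformers`, `peft`, and `bitsandbytes` stack. The adapter path and the Alpaca-style prompt template are illustrative placeholders, not details confirmed by this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in 8-bit so it fits on a single 24 GB GPU (e.g. a 3090)
base_model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    load_in_8bit=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

# Attach the LoRA adapter on top of the frozen base weights;
# "path/to/this-lora-adapter" is a placeholder for this repo's id or a local path
model = PeftModel.from_pretrained(base_model, "path/to/this-lora-adapter")

# Alpaca-style instruction prompt (assumed format, matching the training data)
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nGive three tips for staying healthy.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(base_model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```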
## Training procedure


The following `bitsandbytes` quantization config was used during training (a code sketch reproducing it follows the list):
- load_in_8bit: True
- load_in_4bit: False
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
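As a sketch of how this config plugs into training, assuming `transformers` with `BitsAndBytesConfig` and the PEFT version listed below: the LoRA hyperparameters (`r`, `lora_alpha`, `target_modules`, dropout) are illustrative guesses, since the card does not record them.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

# Mirror of the quantization settings listed above
bnb_config = BitsAndBytesConfig(
    load_in_8bit=True,
    load_in_4bit=False,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=None,
    llm_int8_enable_fp32_cpu_offload=False,
    llm_int8_has_fp16_weight=False,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    quantization_config=bnb_config,
    device_map="auto",
)
# Casts norms/head to fp32 and enables gradient checkpointing for stable int8 training
base_model = prepare_model_for_int8_training(base_model)

# Hypothetical LoRA settings; the actual values used for this adapter are not listed
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],  # GPT-J attention projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the LoRA matrices are trainable
```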
### Framework versions

- PEFT 0.4.0.dev0