Question for bird-and-animal training

Hello, I'm interested in the CogView implementation.
I have a problem with the results of training via scripts/pretrain_single_node.sh.
After 20,000 training iterations, I ran ./scripts/text2image.sh, but the resulting image shows no recognizable bird or animal shape. The input text was 飞鹰 (Flying eagle).
![image](https://github.com/THUDM/CogView/assets/34644194/0d36497d-53d7-44e3-a632-f8a6098fd571)

For training, I ran the shell script, changing only the variable NUM_GPUS_PER_WORKER from 8 to 1, since I have a single GPU.
I'm using Google Colab (an A100 is available).
1. NUM_GPUS_PER_WORKER=1
2. Image tokenizer number of tokens: 8192
3. Number of layers: 12
4. Hidden layer size: 1024
5. Number of attention heads: 16

The training output file is mp_rank_00_model_state.pt, and this PyTorch checkpoint is only 2.76 GB.
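As a rough sanity check on that file size, the parameter count implied by the settings listed above can be estimated with the standard formula for a GPT-style transformer. This is only a sketch: the text vocabulary size (`ASSUMED_TEXT_VOCAB`) and sequence length (`ASSUMED_MAX_POSITIONS`) are placeholder assumptions that are not stated in this post, and the formula ignores any CogView-specific layers.

```python
def transformer_param_count(num_layers, hidden_size, vocab_size, max_positions):
    """Approximate parameter count of a GPT-style decoder-only transformer.

    Per layer: QKV projection (3h^2 + 3h), attention output (h^2 + h),
    MLP up (4h^2 + 4h), MLP down (4h^2 + h), two LayerNorms (4h),
    which sums to 12h^2 + 13h. The number of attention heads does not
    change the count, so it is not a parameter here.
    """
    h = hidden_size
    per_layer = 12 * h * h + 13 * h
    embeddings = (vocab_size + max_positions) * h  # token + position tables
    final_layernorm = 2 * h
    return num_layers * per_layer + embeddings + final_layernorm


def bytes_per_param(file_size_bytes, num_params):
    """Average number of checkpoint bytes stored per model parameter."""
    return file_size_bytes / num_params


# Values taken from the post.
NUM_LAYERS = 12
HIDDEN_SIZE = 1024
IMAGE_TOKENS = 8192
CHECKPOINT_BYTES = 2.76e9  # "2.76 GB" read as decimal gigabytes (assumption)

# Assumptions for illustration only; replace with the values from your run.
ASSUMED_TEXT_VOCAB = 50_000
ASSUMED_MAX_POSITIONS = 1_089

if __name__ == "__main__":
    n_params = transformer_param_count(
        NUM_LAYERS,
        HIDDEN_SIZE,
        IMAGE_TOKENS + ASSUMED_TEXT_VOCAB,
        ASSUMED_MAX_POSITIONS,
    )
    print(f"estimated parameters: {n_params:,}")
    print(f"bytes per parameter:  {bytes_per_param(CHECKPOINT_BYTES, n_params):.2f}")
```

Bare fp16 weights take 2 bytes per parameter and fp32 weights take 4, so a much larger ratio would suggest the file also holds other state (for example optimizer state). That reading is a guess and would need to be confirmed by inspecting the checkpoint itself.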

Of course, if I use the pretrained model, cogview-base.tar, the result is fine.
![image](https://github.com/THUDM/CogView/assets/34644194/17ca1db6-646f-4fbc-8fc0-3d36931c2a6d)

Please take a look at my question; I would appreciate any advice or comments.

Question for bird-and-animal training #68
