Summary of Advantages and Disadvantages of Each Version of Qwen-Image-Edit-Rapid-AIO for Your Reference

#264
by mingzhenkai - opened

Hello, everyone on the forum!

The versions of Qwen-Image-Edit-Rapid-AIO differ in performance, and the discussion about them is rather scattered. To get a quick picture of your opinions on the differences between versions, I collected some posts that sparked intense discussion and used a large language model to summarize them, resulting in the following table. It should also help newcomers get a general idea of each version.

However, it should be noted that the information I collected may not be comprehensive, and the summary is for reference only. If you find that some important information is missing, please feel free to add it below this post.

| Version | Advantages | Disadvantages | Other Notes |
|---|---|---|---|
| v14.1 | It is already quite perfect. | The upper and lower surfaces of the penis are often reversed. | - |
| v15 | It starts to use the 2511 version, and the consistency is significantly improved. | - | Began to use the 2511 version. |
| v18 | It performs well in terms of consistency and realism. | It doesn't understand NSFW concepts as well as v20/v21. | - |
| v18.1 | It has the best facial consistency, with the least change in the face from v18 to v21; the image generation effect is impressive. | It adds a blur and "beauty filter" effect to the edited photos, removing fine details on the skin and hair, and the details of background objects are also blurred; it has a plastic-like texture. | - |
| v19 | It has a good anime style; it follows pose prompts better; it has impressive realism and the checkerboard pattern problem is fixed; its speed is similar to v18. | It gets confused when dealing with multi-layered clothing descriptions; when using the er_sde/beta sampler with 6-8 steps, there will be a lot of moles on the skin; the skin may become oily. | - |
| v20 | Its semantic understanding ability has been greatly improved compared to v19 and can handle complex scenes (such as multi-layered clothing descriptions) more accurately; it has the best balance and the colors are close to the original image. | The characters' skin becomes "waxy" and artifacts like moles appear on the body; the consistency of facial features deteriorates; when the output image resolution is higher than the source image, the facial features will be completely overridden; the BestFaceSwap LORA interferes with the facial features of single images. | - |
| v21 | It includes more realistic LORAs to improve skin details; it can achieve a decent anime style in some cases. | It tries to turn anime art into a realistic style, ruining the anime style; there may be rashes on the skin during local redrawing; it may have problems with representing male genitalia; there are significant changes in hairstyle and face; it has a waxy texture characteristic of AI, with details being smoothed out. | - |
| v22 | It generates perfect anime effects; it can retain details such as the color of red food. | It still needs the F2P Lora to preserve facial features when changing poses or scenes. | - |

Finally, a special thanks to Phr00t, the author of Qwen-Image-Edit-Rapid-AIO, for the contributions to the open-source community! Thanks to your hard work and continuous optimization, we can use such an excellent model. We are looking forward to your future updates and improvements!

Nice summary! Where to find the F2P lora btw?

search Qwen-Image-Edit F2P

Nice summary. Very similar to my experience with the models. For my use, I usually go with v18 for great consistency or v21 for good consistency and NSFW understanding; I use v21 90% of the time. For me at least, v22 deforms hands in a large percentage of generations, so I don't use it much... I use the basic workflow with the Phr00t TextEncode Node.

It seems that the prompt "Keep the face unchanged" is effective

I'm currently using V20, and after exporting the images, I import them into Photoshop for further refinement because no model is truly perfect yet.

May I ask about the facial retention of real human faces in the V21 version of the model? I haven't had time to test this version yet.

I tested V23 today, and it's practically perfect. Everyone can update with confidence.

V23 works well (when the character is simply standing). Is there a single-file FP16 package that can be loaded directly via GGUF?

I believe V16 has the best consistency, second only to V5.

Version 23 still needs to be combined with F2P LORA to stabilize the consistency of character facial features.

Could v23 be added to the summary table?

I agree. V16 gave the best consistency natively without any lora.

Can we mention that in the table as well?
