Blog post doesn't mention base model

#81
by DioxideSiO2 - opened

https://tongyi-mai.github.io/Z-Image-blog/

"We are publicly releasing two specialized models on Z-Image: Z-Image-Turbo for generation and Z-Image-Edit for editing."

Does this mean the base model is no longer scheduled for release? I hope it comes out 😢

According to the forum, they will most likely break their promises.

Alibaba is a for-profit company.

Both the Hugging Face and GitHub READMEs list Base as to be released. The blog post was probably just focused on the two "production" models they intend to prioritize (pre-trained/distilled for ready-to-use, drop-in purposes): Turbo for fast generation, Edit for, well, editing. I promise they know Base is still absolutely necessary for fine-tuning and LoRA training. They'd be truly dumb not to release it, and so they will.

According to the forum, they will most likely break their promises.

I think that breaking their promise by refusing to release the base or edit version as open source would be a huge mistake on their part. I don't think their goal is to disappoint the open source community and lose its trust. They know very well that many open source users also pay for certain pro API models. Treating people in the community like idiots would be shooting themselves in the foot.

I'm not sure it's necessarily strategically useful to adopt the same stance as Black Forest Labs for the Tongyi-MAI team.

Then honestly promising a model dedicated to fine-tuning, purely dedicated to the open-source community, only to ultimately say, “No, fuck off, pay for our API after all,” would be the worst move.

@clawcraft3r

Your idle threats and entitlement are very off-putting. That's not what you do to get what you want, no matter how badly you want it.

I think you're misreading my comment. I'm not making threats or acting entitled. I'm pointing out that not keeping their promises would be a strategic mistake for Tongyi-MAI.
When a company publicly announces a model "dedicated to fine-tuning, purely dedicated to the open-source community," it's completely reasonable for users to expect that promise to be kept. That's not entitlement, that's basic trust.

My point about Black Forest Labs wasn't a threat either. It's an observation about what happens when companies change their open source strategy after building hype. Some users feel burned, and that impacts the ecosystem. I don't think Tongyi-MAI wants to follow that path, especially given how well the community has responded to Turbo.

Look at the numbers: over 1 million downloads, 200+ community resources created, 1200+ positive reviews. The community clearly wants to contribute more with proper fine-tuning capabilities. That's why the base model matters.

I'm not demanding anything. I'm saying that if there are delays or changed plans, just communicate that. Silence creates uncertainty, and uncertainty breeds the kind of speculation we're seeing in this thread.
I want Z-Image to succeed. That's why I care enough to write these comments.

If you think constructive feedback is "idle threats," then I don't know what to tell you.

Agreed on all counts, @clawcraft3r. I don't think it's likely that the model becomes vaporware, but I figured I'd ask in a place where someone from Tongyi-MAI may be able to answer. It wasn't my intention to stir up any trouble.

Alibaba's model lineup is much larger than Black Forest Labs'.

Alibaba's models include:
Qwen, Qwen-VL, Qwen-Image series
Wan 2.1, 2.2, 2.5 (2.5 is released but not open source)
FunASR, Paraformer, CosyVoice
More.
This company's scale is not comparable to Black Forest Labs'. Z-Image is not the Tongyi team's main focus; Qwen and Wan each have their own unique characteristics. There are many similar projects within Alibaba, but not all of them receive adequate funding; leadership approval is paramount.

If the developers could give exact release dates, they would have done it already - I’m sure of that. Since they haven’t, it means that for one reason or another they simply can’t do it yet. Just like all of you, I’m impatiently waiting for the release of the base and edit models. God, I even have a bot that scans the info field every 8 hours to check whether the weights are out or not. But realistically, that’s the maximum we can do, and we all just need to be patient.

Huge respect to the developers for this incredible model. The level of professionalism is off the charts. I’ve been involved with generative neural networks practically since their early days, and if someone had asked me before Z-Image whether it was possible to achieve this level of quality with only 6 billion parameters, I would have said it was almost impossible.

These guys are doing the impossible — please, show patience and respect.

According to a post on a Chinese social media platform, the base model will be released, but it appears to still be in training.
