diff --git a/README.md b/README.md
index 72b6e8d..7727f77 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
-
+
+
acceleration,
inference, and more. Our model can produce 2s 512x512 videos with only 3 days of training. [[checkpoints]](#open-sora-10-model-weights)
[[blog]](https://hpc-ai.com/blog/open-sora-v1.0) [[report]](/docs/report_01.md)
@@ -63,19 +63,19 @@ Demos are presented in compressed GIF format for convenience. For original quali
| **5s 1024×576** | **5s 576×1024** | **5s 576×1024** |
| -------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [
](https://streamable.com/e/8g9y9h?autoplay=1) | [
](https://streamable.com/e/k50mnv?autoplay=1) | [
](https://streamable.com/e/bzrn9n?autoplay=1) |
-| [
](https://streamable.com/e/dsv8da?autoplay=1) | [
](https://streamable.com/e/3wif07?autoplay=1) | [
](https://streamable.com/e/us2w7h?autoplay=1) |
-| [
](https://streamable.com/e/yfwk8i?autoplay=1) | [
](https://streamable.com/e/jgjil0?autoplay=1) | [
](https://streamable.com/e/lsoai1?autoplay=1) |
+| [
](https://streamable.com/e/8g9y9h?autoplay=1) | [
](https://streamable.com/e/k50mnv?autoplay=1) | [
](https://streamable.com/e/bzrn9n?autoplay=1) |
+| [
](https://streamable.com/e/dsv8da?autoplay=1) | [
](https://streamable.com/e/3wif07?autoplay=1) | [
](https://streamable.com/e/us2w7h?autoplay=1) |
+| [
](https://streamable.com/e/yfwk8i?autoplay=1) | [
](https://streamable.com/e/jgjil0?autoplay=1) | [
](https://streamable.com/e/lsoai1?autoplay=1) |
](https://streamable.com/e/r0imrp?quality=highest&autoplay=1) | [
](https://streamable.com/e/hfvjkh?quality=highest&autoplay=1) | [
](https://streamable.com/e/kutmma?quality=highest&autoplay=1) |
-| [
](https://streamable.com/e/osn1la?quality=highest&autoplay=1) | [
](https://streamable.com/e/l1pzws?quality=highest&autoplay=1) | [
](https://streamable.com/e/2vqari?quality=highest&autoplay=1) |
-| [
](https://streamable.com/e/1in7d6?quality=highest&autoplay=1) | [
](https://streamable.com/e/e9bi4o?quality=highest&autoplay=1) | [
](https://streamable.com/e/09z7xi?quality=highest&autoplay=1) |
-| [
](https://streamable.com/e/16c3hk?quality=highest&autoplay=1) | [
](https://streamable.com/e/wi250w?quality=highest&autoplay=1) | [
](https://streamable.com/e/vw5b64?quality=highest&autoplay=1) |
+| [
](https://streamable.com/e/r0imrp?quality=highest&autoplay=1) | [
](https://streamable.com/e/hfvjkh?quality=highest&autoplay=1) | [
](https://streamable.com/e/kutmma?quality=highest&autoplay=1) |
+| [
](https://streamable.com/e/osn1la?quality=highest&autoplay=1) | [
](https://streamable.com/e/l1pzws?quality=highest&autoplay=1) | [
](https://streamable.com/e/2vqari?quality=highest&autoplay=1) |
+| [
](https://streamable.com/e/1in7d6?quality=highest&autoplay=1) | [
](https://streamable.com/e/e9bi4o?quality=highest&autoplay=1) | [
](https://streamable.com/e/09z7xi?quality=highest&autoplay=1) |
+| [
](https://streamable.com/e/16c3hk?quality=highest&autoplay=1) | [
](https://streamable.com/e/wi250w?quality=highest&autoplay=1) | [
](https://streamable.com/e/vw5b64?quality=highest&autoplay=1) |
](https://github.com/hpcaitech/Open-Sora/assets/99191637/7895aab6-ed23-488c-8486-091480c26327) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/20f07c7b-182b-4562-bbee-f1df74c86c9a) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/3d897e0d-dc21-453a-b911-b3bda838acc2) |
-| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/644bf938-96ce-44aa-b797-b3c0b513d64c) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/272d88ac-4b4a-484d-a665-8d07431671d0) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/ebbac621-c34e-4bb4-9543-1c34f8989764) |
-| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/a1e3a1a3-4abd-45f5-8df2-6cced69da4ca) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/d6ce9c13-28e1-4dff-9644-cc01f5f11926) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/561978f8-f1b0-4f4d-ae7b-45bec9001b4a) |
+| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/7895aab6-ed23-488c-8486-091480c26327) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/20f07c7b-182b-4562-bbee-f1df74c86c9a) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/3d897e0d-dc21-453a-b911-b3bda838acc2) |
+| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/644bf938-96ce-44aa-b797-b3c0b513d64c) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/272d88ac-4b4a-484d-a665-8d07431671d0) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/ebbac621-c34e-4bb4-9543-1c34f8989764) |
+| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/a1e3a1a3-4abd-45f5-8df2-6cced69da4ca) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/d6ce9c13-28e1-4dff-9644-cc01f5f11926) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/561978f8-f1b0-4f4d-ae7b-45bec9001b4a) |
@@ -95,16 +95,16 @@ Demos are presented in compressed GIF format for convenience. For original quali
| **2s 240×426** | **2s 240×426** |
| --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c31ebc52-de39-4a4e-9b1e-9211d45e05b2) | [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c31ebc52-de39-4a4e-9b1e-9211d45e05b2) |
-| [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/f7ce4aaa-528f-40a8-be7a-72e61eaacbbd) | [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/5d58d71e-1fda-4d90-9ad3-5f2f7b75c6a9) |
+| [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c31ebc52-de39-4a4e-9b1e-9211d45e05b2) | [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c31ebc52-de39-4a4e-9b1e-9211d45e05b2) |
+| [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/f7ce4aaa-528f-40a8-be7a-72e61eaacbbd) | [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/5d58d71e-1fda-4d90-9ad3-5f2f7b75c6a9) |
| **2s 426×240** | **4s 480×854** |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/34ecb4a0-4eef-4286-ad4c-8e3a87e5a9fd) | [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c1619333-25d7-42ba-a91c-18dbc1870b18) |
+| [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/34ecb4a0-4eef-4286-ad4c-8e3a87e5a9fd) | [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/c1619333-25d7-42ba-a91c-18dbc1870b18) |
| **16s 320×320** | **16s 224×448** | **2s 426×240** |
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/3cab536e-9b43-4b33-8da8-a0f9cf842ff2) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/9fb0b9e0-c6f4-4935-b29e-4cac10b373c4) | [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/3e892ad2-9543-4049-b005-643a4c1bf3bf) |
+| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/3cab536e-9b43-4b33-8da8-a0f9cf842ff2) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/9fb0b9e0-c6f4-4935-b29e-4cac10b373c4) | [
](https://github.com/hpcaitech/Open-Sora-dev/assets/99191637/3e892ad2-9543-4049-b005-643a4c1bf3bf) |
@@ -113,9 +113,9 @@ Demos are presented in compressed GIF format for convenience. For original quali
| **2s 512×512** | **2s 512×512** | **2s 512×512** |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
-| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/de1963d3-b43b-4e68-a670-bb821ebb6f80) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/13f8338f-3d42-4b71-8142-d234fbd746cc) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/fa6a65a6-e32a-4d64-9a9e-eabb0ebb8c16) |
+| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/de1963d3-b43b-4e68-a670-bb821ebb6f80) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/13f8338f-3d42-4b71-8142-d234fbd746cc) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/fa6a65a6-e32a-4d64-9a9e-eabb0ebb8c16) |
| A serene night scene in a forested area. [...] The video is a time-lapse, capturing the transition from day to night, with the lake and forest serving as a constant backdrop. | A soaring drone footage captures the majestic beauty of a coastal cliff, [...] The water gently laps at the rock base and the greenery that clings to the top of the cliff. | The majestic beauty of a waterfall cascading down a cliff into a serene lake. [...] The camera angle provides a bird's eye view of the waterfall. |
-| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/64232f84-1b36-4750-a6c0-3e610fa9aa94) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/983a1965-a374-41a7-a76b-c07941a6c1e9) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/ec10c879-9767-4c31-865f-2e8d6cf11e65) |
+| [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/64232f84-1b36-4750-a6c0-3e610fa9aa94) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/983a1965-a374-41a7-a76b-c07941a6c1e9) | [
](https://github.com/hpcaitech/Open-Sora/assets/99191637/ec10c879-9767-4c31-865f-2e8d6cf11e65) |
| A bustling city street at night, filled with the glow of car headlights and the ambient light of streetlights. [...] | The vibrant beauty of a sunflower field. The sunflowers are arranged in neat rows, creating a sense of order and symmetry. [...] | A serene underwater scene featuring a sea turtle swimming through a coral reef. The turtle, with its greenish-brown shell [...] |
Videos are downsampled to `.gif` for display. Click for original videos. Prompts are trimmed for display,
@@ -251,7 +251,7 @@ torchrun --nproc_per_node 1 --standalone scripts/diffusion/inference.py configs/
| Score | 1 | 4 | 7 |
| ----- | ------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
-| |
|
|
|
+| |
|
|
|
### Prompt Refine
@@ -285,15 +285,15 @@ We test the computational efficiency of text-to-video on H100/H800 GPU. For 256x
On [VBench](https://huggingface.co/spaces/Vchitect/VBench_Leaderboard), Open-Sora 2.0 significantly narrows the gap with OpenAI’s Sora, from 4.52% for Open-Sora 1.2 to 0.69%.
-
+
Human preference results show our model is on par with HunyuanVideo 11B and Step-Video 30B.
-
+
With strong performance, Open-Sora 2.0 is cost-effective.
-
+
## Contribution