dav1d: performance and completion of the first release

21 November 2018

tl;dr: dav1d is in very good shape

If you want a quick summary of this post:

  • dav1d now covers the whole AV1 spec and all of its features, for 8-bit and 10-bit depth,
  • dav1d is very fast, up to 400% faster (more fps) than the libaom decoder, and very often 100% faster.

Now is the right time to integrate it in your products!

Read the following for more details…

A few reminders about dav1d

AV1 is a new video codec by the Alliance for Open Media, composed of most of the important Web companies (Google, Facebook, Netflix, Amazon, Microsoft, Mozilla…).

AV1 has the potential to be up to 20% better than the HEVC codec, and its patent license is totally free, while HEVC patent licenses are insanely expensive and very confusing.

The reference decoder for AV1 is great, but it’s a research codebase, so it has a lot to improve.

Therefore, the VideoLAN, VLC and FFmpeg communities have started to work on a new decoder, sponsored by the Alliance for Open Media, in order to create the reference optimized decoder for AV1.

Features

We launched dav1d exactly 2 months ago, during VDD.

We have done a lot of work since. And by “we”, I mean mostly the others. :)

There are now more than 500 commits from 29 contributors from different open source communities. This is a good result for a new open source project.

First, we’ve completed all the features, including Film Grain, Super-Res, Scaled References, and other more obscure features of the bitstream. This covers both 8-bit and 10-bit content, of course.

We also improved the public API.

Then, we’ve fuzzed the decoder a lot: we are now above 99% of functions covered and 97% of lines covered on OSS-Fuzz, and we usually fix all the issues within a couple of days. This should give you confidence that AV1 decoding with dav1d is secure.

Finally, we’ve written a lot of assembly, mostly for modern desktop CPUs, but the work has been started for mobile and older desktop CPUs.

We even reduced the size of the C code!

Performance

Today, dav1d is very fast on AVX2 processors, which should cover a bit more than 50% of the CPUs used on the desktop. We wrote 95% of the code needed for AVX2, but there is still a bit more achievable.

We’re readying the SSE and the ARM optimizations to do the same. They will be very fast too, in the coming weeks.

The following graphs compare dav1d and aomdec, both built from the tip of their master branches (and yes, aomdec has CONFIG_LOWBITDEPTH=1).

This was done on 64-bit Windows 10, using precompiled binaries.

The clips are taken from Netflix, Elecard, and YouTube, because they don’t use the same encoder parameters and don’t have the same bitstream features.

Film Grain is not run on the CPU, so it is not visible here.

Haswell

Here are the results on Haswell (i7-4710, a 4-year-old CPU with 4 cores):

And expressed as a percentage relative to libaom:

On average we get 2.49x, and we even get 3.48x on the YouTube Summer clip!

Zen

With a more modern Zen machine (Ryzen 5 1600, 6 cores HT), here are the results:

And expressed as a percentage relative to libaom:

The average is even higher at 3.49x, and we even get 5.27x on the YouTube Summer clip!
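
To relate these ratios to the "percent faster" wording used in the summary, here is a minimal Python sketch. It assumes that "N% faster" means (fps ratio − 1) × 100, which is my reading and is not spelled out in the measurements themselves. The function name `percent_faster` is just an illustrative helper; the ratios are the ones quoted above.

```python
def percent_faster(speedup):
    """Convert an fps ratio (dav1d fps / libaom fps) into "percent faster".

    Assumption: "N% faster" is read as (ratio - 1) * 100.
    """
    return (speedup - 1.0) * 100.0


# Ratios quoted in this post (dav1d relative to libaom).
ratios = {
    "Haswell, average": 2.49,
    "Haswell, YouTube Summer": 3.48,
    "Zen, average": 3.49,
    "Zen, YouTube Summer": 5.27,
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x is about {percent_faster(ratio):.0f}% faster")
```

Under that reading, the 5.27x result on Zen is a bit more than 400% faster, and both averages are above 100% faster.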

Global comparison

If we put both on the same graphs, here is what we have:

Threading

If you listened to our talks at VDD or at Demuxed, you heard us explain that dav1d’s threading is quite innovative and should scale much better than libaom’s.

On an even less powerful machine, an i5-4590 with 4 cores/4 threads, here are our results for the YouTube Summer clip:

You see that dav1d can scale better, in terms of threading, than libaom.
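
If you want to quantify "scales better" on your own machine, one common way is to compute the speedup and the parallel efficiency per thread count from measured fps. The sketch below is only an illustration: the helper name `scaling` is hypothetical, and the fps values in the usage part are placeholders, not measurements from this post.

```python
def scaling(fps_by_threads):
    """Return {threads: (speedup, efficiency)} relative to the 1-thread run.

    speedup    = fps(n) / fps(1)
    efficiency = speedup / n   (1.0 would be perfect linear scaling)
    """
    base = fps_by_threads[1]
    result = {}
    for threads, fps in sorted(fps_by_threads.items()):
        speedup = fps / base
        result[threads] = (speedup, speedup / threads)
    return result


# Placeholder fps values for illustration only (NOT results from this post).
example = {1: 10.0, 2: 20.0, 4: 30.0}

for threads, (speedup, efficiency) in scaling(example).items():
    print(f"{threads} thread(s): {speedup:.2f}x, efficiency {efficiency:.0%}")
```

A decoder that scales better keeps its efficiency closer to 100% as the thread count grows.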

Conclusion

dav1d is very fast, dav1d is almost complete, dav1d is cool.

We’re smoothing out the rough edges for a release soon, and we hope that Firefox 65 will ship with dav1d for AV1 decoding.

On the other platforms, SSE and ARM assembly will follow very quickly, and we’re already as fast on ARMv8. Stay tuned for more!

I would like to thank Ewout ter Hoeven (EwoutH) from the community, who did all the testing, numbers and computations.

Jean-Baptiste Kempf

Comments

  1. On 19 May, 5:25 by 1 – How to use AV1 with open source tools | Traffic.Ventures

    (…) If you follow this blog, you should know everything about AV1. (…)

  2. On 11 May, 4:23 by 3 – First release of dav1d, the AV1 decoder |

    (…) If you follow this blog, you should know everything about dav1d. (…)

  3. On 23 May, 10:46 by Daniel

    Thank you for the great results!
    Question: the article mentions “8bits and 10bits depth”. I heard something about 12-bit depth in AV1. Will 12-bit depth be supported?

  4. On 23 May, 10:35 by VLC group launches dav1d, an AV1 video encoding library

    (…) Source – dav1d (1), dav1d (2), Phoronix (…)

  5. On 23 May, 4:28 by NM64

    Great to hear that SSE optimizations will be coming next as even the newest full-fat desktop Sky/Kaby/Coffee Lake Pentiums (like the ever-popular 2core/4thread G4560 and G5400) completely lack AVX support.

  6. On 22 May, 10:26 by jrdls

    Is Dav1d merged in Firefox?

  7. On 22 May, 10:14 by Bob H

    Would be nice to see some numbers for ARM, although obviously they are probably more relevant once the optimisation is complete.

  8. On 22 May, 8:44 by LinAGKar

    The 4710 doesn’t seem to exist.

  9. On 22 May, 8:16 by Elk

    Dav1d is only a decoder, not an encoder.

  10. On 22 May, 3:52 by Jeremiah

    Dav1d is not an encoder, Ali.

  11. On 21 May, 8:00 by ali

    Thanks for your great post. Are there any charts about encoding performance and frame rate?