ffmpeg / libavcodec / vp8.c @ 10bf2eeb

# Date Author Comment
10bf2eeb 08/02/2010 05:20 AM Jason Garrett-Glaser

VP8: simplify token_prob handling
~1.5% faster decode_block_coeffs

Originally committed as revision 24659 to svn://

c22b4468 08/01/2010 11:20 PM Pascal Massimino

prevent access to vp8_coeff_band16

Originally committed as revision 24656 to svn://

a8ab0ccc 07/27/2010 11:09 PM Pascal Massimino

b0rk3d FATE + black helicopters hissing -> rolling back to r24556 and sleeping

Originally committed as revision 24559 to svn://

62d1f786 07/27/2010 10:23 PM Pascal Massimino

perform the clipping on luma_dc_qmul1 and chroma_qmul0 earlier

Originally committed as revision 24558 to svn://

e7e81959 07/27/2010 10:21 PM Pascal Massimino

save some copies by moving some fields out of proba2

Originally committed as revision 24557 to svn://

fca05ea8 07/26/2010 07:10 AM Jason Garrett-Glaser

VP8: add missing free
Fixes a tiny memory leak.

Originally committed as revision 24504 to svn://

28e241de 07/25/2010 02:49 PM Carl Eugen Hoyos

Fix r24445: Instead of needlessly initialising a variable, silence the warning.

Originally committed as revision 24498 to svn://

ca18a478 07/23/2010 09:46 PM David Conrad

VP8: Inline traversing vp8_small_mvtree

Much faster read_mv_component, slightly faster overall

Originally committed as revision 24470 to svn://

7697cdcf 07/23/2010 09:46 PM David Conrad

VP8: Use vp56_rac_get_prob_branchy when the bit is only used by an if()

Originally committed as revision 24469 to svn://

fe1b5d97 07/23/2010 09:46 PM David Conrad

Decode DCT tokens by branching to a different code path for each branch
on the huffman tree, instead of traversing the tree in a while loop.

Based on the similar optimization in libvpx's detokenize.c

10% faster at normal bitrates, and 30% faster for high-bitrate intra-only...

13a1304b 07/23/2010 09:42 PM Jason Garrett-Glaser

Add myself to VP8 copyright and maintainers.
Also add Ronald to maintainers.

Originally committed as revision 24464 to svn://

414ac27d 07/23/2010 09:36 PM Jason Garrett-Glaser

VP8: always_inline some things to force gcc to do the right thing
Mostly seems to help in the MC code, which gets a hundred cycles faster.

Originally committed as revision 24463 to svn://

06d50ca8 07/23/2010 09:17 PM Jason Garrett-Glaser

VP8: use AV_RL24 instead of defining a new RL24.

Originally committed as revision 24462 to svn://
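For readers unfamiliar with the macro: AV_RL24 reads a 24-bit little-endian value from a byte buffer. A minimal portable sketch of that operation follows; the function name read_le24 is hypothetical, and this is an illustration rather than the libavutil implementation.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 24-bit little-endian read.
 * p[0] is the least significant byte, p[2] the most significant.
 * Each byte is widened to uint32_t before shifting. */
static uint32_t read_le24(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16);
}
```

Assembling the value byte by byte keeps the result independent of host endianness and alignment.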

9fddd14a 07/23/2010 07:06 PM Jason Garrett-Glaser

VP8: Slightly faster MV selection
Don't clamp best mv unless it's actually used.

Originally committed as revision 24461 to svn://

14767f35 07/23/2010 10:42 AM Jason Garrett-Glaser

VP8: use AV_ZERO32 instead of AV_WN32A where relevant

Originally committed as revision 24460 to svn://

09959ec4 07/23/2010 10:34 AM Jason Garrett-Glaser

VP8: eliminate redundant code in r24458

Originally committed as revision 24459 to svn://

a71abb71 07/23/2010 10:24 AM Jason Garrett-Glaser

VP8: shave a few clocks off check_intra_pred_mode

Originally committed as revision 24458 to svn://

0087aa47 07/23/2010 06:41 AM Jason Garrett-Glaser

VP8: fix broken sign bias code in MV pred
Apparently the official conformance test vectors don't test this feature,
even though libvpx uses it.

Originally committed as revision 24456 to svn://

3ae079a3 07/23/2010 06:02 AM Jason Garrett-Glaser

VP8: optimize DC-only chroma case in the same way as luma.
Add MMX idct_dc_add4uv function for this case.
~40% faster chroma idct.

Originally committed as revision 24455 to svn://

3df56f41 07/23/2010 03:44 AM Jason Garrett-Glaser

VP8: Clean up some variable shadowing.

Originally committed as revision 24454 to svn://

8a467b2d 07/23/2010 02:58 AM Jason Garrett-Glaser

VP8: 30% faster idct_mb
Take shortcuts based on statistically common situations.
Add 4-at-a-time idct_dc function (mmx and sse2) since rows of 4 DC-only DCT
blocks are common.
TODO: tie this more directly into the MB mode, since the DC-level transform is
only used for non-splitmv blocks?...

ef38842f 07/23/2010 01:59 AM Jason Garrett-Glaser

VP8: smarter prefetching
Don't prefetch reference frames that were used less than 1/32 of the time so
far in the frame.
This helps speed up to ~2% on videos that, in many frames, make near-zero
(but not entirely zero) use of golden and/or alt-refs.
This is a very common property of videos encoded by libvpx....
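The 1/32 threshold described in this commit can be sketched as a single comparison. The helper name, its parameters, and the exact form of the comparison are assumptions made for illustration; the commit message only states the rule, not the code.

```c
/* Hypothetical helper: decide whether a reference frame is worth
 * prefetching.
 *   ref_uses    - macroblocks so far in this frame that used the reference
 *   mbs_decoded - macroblocks decoded so far in this frame
 * The shift by 5 divides by 32, so a reference is prefetched only once
 * its use count exceeds 1/32 of the macroblocks seen so far. A reference
 * that has never been used always fails the test. */
static int should_prefetch(int ref_uses, int mbs_decoded)
{
    return ref_uses > (mbs_decoded >> 5);
}
```

A shift keeps the check cheap enough to run per macroblock.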

c25c7767 07/23/2010 12:07 AM Jason Garrett-Glaser

VP8: clear DCT blocks in iDCT instead of using clear_blocks.
~0.3% faster overall.

Originally committed as revision 24448 to svn://

b74f70d6 07/23/2010 12:05 AM Jason Garrett-Glaser

VP8: avoid a memset for non-i4x4 blocks with no coefficients

Originally committed as revision 24447 to svn://

145d3186 07/22/2010 11:11 PM Jason Garrett-Glaser

Get rid of more unnecessary dereferences in VP8 deblocking

Originally committed as revision 24446 to svn://

86721533 07/22/2010 11:04 PM Jason Garrett-Glaser

Shut up an uninitialized variable GCC warning in VP8.

Originally committed as revision 24445 to svn://

c4211046 07/22/2010 11:03 PM Jason Garrett-Glaser

Smarter VP8 prefetching
Prefetch all refs (including altref), but only if they've been used so far this frame.
~2.5% faster overall.

TODO: Do something even smarter, like using how often each ref has been used
so far, so that a couple blocks of a rarely-used ref don't force us to prefetch...

8cfae560 07/22/2010 10:15 PM Jason Garrett-Glaser

Fix stupid bug in VP8 prefetching code

Originally committed as revision 24443 to svn://

2a38c2e9 07/22/2010 10:08 PM Jason Garrett-Glaser

Eliminate a LUT in escape decoding in VP8 decode_block_coeffs

Originally committed as revision 24441 to svn://

d292c345 07/22/2010 09:05 PM Jason Garrett-Glaser

Eliminate some repeated dereferences in VP8 inter_predict

Originally committed as revision 24438 to svn://

b946111f 07/22/2010 12:15 PM Jason Garrett-Glaser

Eliminate a pointless memset for intra blocks in P-frames in VP8

Originally committed as revision 24429 to svn://

b9a7186b 07/22/2010 11:55 AM Jason Garrett-Glaser

VP8: Don't store segment in macroblock struct anymore.
Not necessary with the previous patch.

Originally committed as revision 24427 to svn://

c55e0d34 07/22/2010 11:45 AM Jason Garrett-Glaser

Convert VP8 macroblock structures to a ring buffer.
Uses a slightly nonintuitive ring buffer size of (width+height*2) to simplify
addressing logic.
Also split out the segmentation map to a separate structure, necessary to
implement the ring buffer.

Originally committed as revision 24426 to svn://

968570d6 07/22/2010 07:24 AM Jason Garrett-Glaser

Calculate deblock strength per-MB instead of per-row
Gives better cache locality, since the VP8Macroblock structs are still in cache.
Inspired by the way x264 does it.

Originally committed as revision 24417 to svn://

d1c58fce 07/22/2010 07:04 AM Jason Garrett-Glaser

Avoid tracking i4x4 modes in P-frames in VP8
As in the previous commit, they aren't used for context selection, so it saves
memory this way.

Originally committed as revision 24416 to svn://

158e062c 07/22/2010 06:39 AM Jason Garrett-Glaser

Avoid useless fill_rectangle in P-frames in VP8
In VP8, i4x4 only uses contexts based on neighbors in I-frames.

Originally committed as revision 24415 to svn://

7bf254c4 07/22/2010 06:29 AM Jason Garrett-Glaser

Optimize partition mv decoding in VP8

Originally committed as revision 24414 to svn://

c0498b30 07/22/2010 05:49 AM Jason Garrett-Glaser

Take shortcuts for mv0 case in VP8 MC
Avoid edge emulation -- it isn't needed if there isn't any subpel.

Originally committed as revision 24413 to svn://

702e8d33 07/22/2010 04:26 AM Jason Garrett-Glaser

Much faster VP8 mv and mode prediction

Originally committed as revision 24412 to svn://

d864dee8 07/22/2010 03:09 AM Jason Garrett-Glaser

Add prefetching to VP8 decoder
~5% faster overall, probably depends on CPU and resolution.

Originally committed as revision 24410 to svn://

096971e8 07/20/2010 05:54 PM Måns Rullgård

vp8: indent

Originally committed as revision 24368 to svn://

070ce7ef 07/20/2010 05:54 PM Måns Rullgård

vp8: add do { } while(0) around XCHG macro to avoid confusing if/else

This is the correct solution to the warning "fixed" in the previous commit.

Originally committed as revision 24367 to svn://
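The do { } while(0) idiom mentioned in this commit can be illustrated with a small sketch. The macro body and the function swap_if below are hypothetical and are not the code from vp8.c; only the name XCHG comes from the commit message.

```c
/* Hypothetical multi-statement swap macro. Wrapping the body in
 * do { ... } while (0) makes the expansion a single statement that
 * still requires the trailing semicolon at the call site. */
#define XCHG(a, b, tmp) do { (tmp) = (a); (a) = (b); (b) = (tmp); } while (0)

/* Swaps *a and *b when cond is nonzero. Returns 1 if a swap happened,
 * 0 otherwise. The unbraced if/else is deliberate: it compiles and
 * pairs correctly only because the macro expands to one statement. */
static int swap_if(int cond, int *a, int *b)
{
    int tmp;
    if (cond)
        XCHG(*a, *b, tmp);
    else
        return 0;
    return 1;
}
```

With a bare { ... } body, the semicolon after the macro call would end the if statement and leave the else without a matching if. With no wrapper at all, only the first assignment would be conditional.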

153da88d 07/20/2010 05:45 PM Diego Biurrun

Add some braces to silence the warning:
libavcodec/vp8.c:892: warning: suggest explicit braces to avoid ambiguous `else'

Originally committed as revision 24366 to svn://

3facfc99 07/19/2010 09:18 PM Ronald S. Bultje

Change function prototypes for width=8 inner and mbedge loopfilter functions
so that it does both U and V planes at the same time. This will have speed
advantages when using SSE2 (or higher) optimizations, since we can do both
the U and V rows together in a single xmm register....

9ac831c2 07/16/2010 07:20 AM David Conrad

vp8: Save mb border needed for intra prediction so that loop filter can run
immediately after a mb row is decoded

Originally committed as revision 24252 to svn://

b6c420ce 07/16/2010 07:20 AM David Conrad

vp8: Check for malloc failure

Originally committed as revision 24251 to svn://

e394953e 07/08/2010 03:01 PM Ronald S. Bultje

Add missing doxy for function arguments.

Originally committed as revision 24110 to svn://

5245c04d 07/02/2010 09:04 PM David Conrad

VP8: Move calculation of outer filter limit out of dsp functions for normal
filter to match the simple loop filter

Originally committed as revision 24010 to svn://

3fa76268 07/02/2010 11:44 AM Diego Biurrun

Avoid square brackets in Doxygen comments; Doxygen chokes on them.

Originally committed as revision 23979 to svn://

7ed06b2b 06/28/2010 04:04 PM Ronald S. Bultje

Simplify MV parsing, removes laying out 2 or 4 (16x8/8x8/8x16) MVs over all
16 subblocks (since we no longer need that), which should also lead to a
minor speedup.

Originally committed as revision 23854 to svn://

7c4dcf81 06/28/2010 01:50 PM Ronald S. Bultje

Optimize split MC, so we don't always do 4x4 blocks of 4x4pixels each, but
we apply them as 16x8/8x16/8x8 subblocks where possible. Since this allows
us to use width=8/16 instead of width=4 MC functions, we can now take more
advantage of SSE2/SSSE3 optimizations, leading to a total speedup for splitMV...

0ef1dbed 06/27/2010 01:46 AM David Conrad

VP8 bilinear filter

Originally committed as revision 23813 to svn://

92a54426 06/27/2010 12:37 AM Måns Rullgård

vp8: warn and request sample if upscaling specified in header

Originally committed as revision 23809 to svn://

d6f8476b 06/25/2010 06:14 PM Jason Garrett-Glaser

Make VP8 DSP functions take two strides
This isn't useful for the C functions, but will allow re-using H and V functions
for HV functions without adding separate H and V wrappers.

Originally committed as revision 23782 to svn://

03ac56e7 06/25/2010 04:23 AM Jason Garrett-Glaser

fix typo in vp8 decoder error message

Originally committed as revision 23765 to svn://

8f910a56 06/23/2010 09:45 PM Stefan Gehrer

avoid conditional and division in chroma MV calculation

Originally committed as revision 23745 to svn://

3b636f21 06/22/2010 07:24 PM David Conrad

Native VP8 decoder.

Patch by David Conrad <lessen42 gmail com> and myself.

Originally committed as revision 23719 to svn://