ffmpeg / libavcodec / x86 @ 8b9b5e08

# Date Author Comment
8b9b5e08 08/03/2010 11:21 AM Jason Garrett-Glaser

VP5/6/8: add one inline missed in r24677

Originally committed as revision 24682 to svn://svn.ffmpeg.org/ffmpeg/trunk

827d43bb 08/02/2010 08:18 PM Jason Garrett-Glaser

VP8: move zeroing of luma DC block into the WHT
Lets us do the zeroing in asm instead of C.
Also makes it consistent with the way the regular iDCT code does it.

Originally committed as revision 24668 to svn://svn.ffmpeg.org/ffmpeg/trunk

6341838f 07/31/2010 11:13 PM Ronald S. Bultje

Use word-writing instead of dword-writing (with two cached but otherwise
unchanged bytes) in the horizontal simple loopfilter. This makes the filter
quite a bit faster in itself (~30 cycles less on Core1), probably mostly
because we don't need a complex 4x4 transpose, but only a simple byte...

fa738b3a 07/31/2010 04:20 PM Vitor Sessak

Remove x86/mmx.h. It is not used anymore and has been deprecated for years.

Originally committed as revision 24618 to svn://svn.ffmpeg.org/ffmpeg/trunk

de4bc44a 07/31/2010 02:50 PM Vitor Sessak

Convert deinterlacing MMX code to YASM

Originally committed as revision 24615 to svn://svn.ffmpeg.org/ffmpeg/trunk

740dfe70 07/29/2010 10:45 PM Vitor Sessak

Fix compilation in x86_64. I broke it with r24580.

Originally committed as revision 24582 to svn://svn.ffmpeg.org/ffmpeg/trunk

2c3dda68 07/29/2010 10:19 PM Vitor Sessak

Translate libmpeg2 MMX IDCT to plain asm

Originally committed as revision 24580 to svn://svn.ffmpeg.org/ffmpeg/trunk

ab4d0318 07/26/2010 09:18 PM Ronald S. Bultje

Use pmaddubsw for the mbedge_filter (>=ssse3), 6-10 cycles faster.

Originally committed as revision 24514 to svn://svn.ffmpeg.org/ffmpeg/trunk

e25dee60 07/26/2010 07:34 PM Jason Garrett-Glaser

VP8: Much faster SSE2 MC
5-10% faster or more on Phenom, Athlon 64, and some others.
Helps some on pre-SSSE3 Intel chips as well, but not as much.

Originally committed as revision 24513 to svn://svn.ffmpeg.org/ffmpeg/trunk

48adb7e7 07/26/2010 02:07 PM Ronald S. Bultje

Enable no-loop memory/register saving for ssse3/sse4 also.

Originally committed as revision 24511 to svn://svn.ffmpeg.org/ffmpeg/trunk

2a180c69 07/26/2010 02:00 PM Ronald S. Bultje

Save a register (or regsize of stackspace for x86-32) for the no-loop
mbedge loopfilter functions, by re-using space that holds a variable
that we no longer need.

Originally committed as revision 24510 to svn://svn.ffmpeg.org/ffmpeg/trunk

bcd4aa64 07/26/2010 01:56 PM Ronald S. Bultje

Use nested ifs instead of &&, which appears to not work with %ifidn (i.e. this
construct was always enabled, even for <ssse3 versions).

Originally committed as revision 24509 to svn://svn.ffmpeg.org/ffmpeg/trunk

2208053b 07/26/2010 01:50 PM Ronald S. Bultje

Split pextrw macro-spaghetti into several opt-specific macros, this will make
future new optimizations (imagine a sse5) much easier. Also fix a bug where
we used the direction (%2) rather than optimization (%1) to enable this, which
means it wasn't ever actually used......

6de5b7c6 07/25/2010 02:42 AM Ronald S. Bultje

Fix obvious bug in assignment. Somehow, the test vectors don't test this...

Originally committed as revision 24489 to svn://svn.ffmpeg.org/ffmpeg/trunk

e3f7bf77 07/24/2010 07:33 PM Ronald S. Bultje

Fix SPLATB_REG mess. Used to be a if/elseif/elseif/elseif spaghetti, so this
splits it into small optimization-specific macros which are selected for each
DSP function. The advantage of this approach is that the sse4 functions now
use the ssse3 codepath also without needing an explicit sse4 codepath....

3611e7a3 07/23/2010 09:46 PM Eli Friedman

Inline asm for VP56 arith coder

This is a lot more reliable to get cmov rather than trying to trick gcc into
generating it, useful since it's 2% faster overall.

Patch by Eli Friedman <eli.friedman at gmail>

Originally committed as revision 24471 to svn://svn.ffmpeg.org/ffmpeg/trunk

3ae079a3 07/23/2010 06:02 AM Jason Garrett-Glaser

VP8: optimize DC-only chroma case in the same way as luma.
Add MMX idct_dc_add4uv function for this case.
~40% faster chroma idct.

Originally committed as revision 24455 to svn://svn.ffmpeg.org/ffmpeg/trunk

51c91564 07/23/2010 03:02 AM Jason Garrett-Glaser

VP8 asm: cosmetics (spacing)

Originally committed as revision 24453 to svn://svn.ffmpeg.org/ffmpeg/trunk

8a467b2d 07/23/2010 02:58 AM Jason Garrett-Glaser

VP8: 30% faster idct_mb
Take shortcuts based on statistically common situations.
Add 4-at-a-time idct_dc function (mmx and sse2) since rows of 4 DC-only DCT
blocks are common.
TODO: tie this more directly into the MB mode, since the DC-level transform is
only used for non-splitmv blocks?...

c25c7767 07/23/2010 12:07 AM Jason Garrett-Glaser

VP8: clear DCT blocks in iDCT instead of using clear_blocks.
~0.3% faster overall.

Originally committed as revision 24448 to svn://svn.ffmpeg.org/ffmpeg/trunk

dc5eec80 07/22/2010 07:59 PM Ronald S. Bultje

Use pextrw for SSE4 mbedge filter result writing, speedup 5-10 cycles on
CPUs supporting it.

Originally committed as revision 24437 to svn://svn.ffmpeg.org/ffmpeg/trunk

003243c3 07/22/2010 01:35 AM Ronald S. Bultje

Fix and enable horizontal >=SSE2 mbedge loopfilter.

Originally committed as revision 24409 to svn://svn.ffmpeg.org/ffmpeg/trunk

c7b1d976 07/22/2010 12:39 AM Loren Merritt

relicense h264 deblock sse2 to lgpl

Originally committed as revision 24408 to svn://svn.ffmpeg.org/ffmpeg/trunk

532e7697 07/21/2010 10:45 PM Loren Merritt

sync yasm macros from x264

Originally committed as revision 24406 to svn://svn.ffmpeg.org/ffmpeg/trunk

8731dbd8 07/21/2010 10:41 PM Jason Garrett-Glaser

Eliminate one instruction in VP8 dc_add_sse4

Originally committed as revision 24405 to svn://svn.ffmpeg.org/ffmpeg/trunk

7dd224a4 07/21/2010 10:11 PM Jason Garrett-Glaser

Various VP8 x86 deblocking speedups
SSSE3 versions, improve SSE2 versions a bit.
SSE2/SSSE3 mbedge h functions are currently broken, so explicitly disable them.

Originally committed as revision 24403 to svn://svn.ffmpeg.org/ffmpeg/trunk

b8b231b5 07/21/2010 08:51 PM Jason Garrett-Glaser

Make mmx VP8 WHT faster
Avoid pextrw, since it's slow on many older CPUs.
Now it doesn't require mmxext either.

Originally committed as revision 24397 to svn://svn.ffmpeg.org/ffmpeg/trunk

af521abc 07/21/2010 10:02 AM David Conrad

Add header declarations for mmx/sse constants missing them

Originally committed as revision 24381 to svn://svn.ffmpeg.org/ffmpeg/trunk

c7eec581 07/21/2010 10:02 AM David Conrad

Move ff_pw_* from vc1dsp_mmx.c to dsputil_mmx.c

Should fix compilation with icc and should help prevent any future duplicates

Originally committed as revision 24380 to svn://svn.ffmpeg.org/ffmpeg/trunk

e9e456d8 07/20/2010 10:58 PM Ronald S. Bultje

VP8 MBedge loopfilter MMX/MMX2/SSE2 functions for both luma (width=16)
and chroma (width=8).

Originally committed as revision 24378 to svn://svn.ffmpeg.org/ffmpeg/trunk

268821e7 07/20/2010 10:04 PM Ronald S. Bultje

Chroma (width=8) inner loopfilter MMX/MMX2/SSE2 for VP8 decoder.

Originally committed as revision 24377 to svn://svn.ffmpeg.org/ffmpeg/trunk

c60ed66d 07/19/2010 11:57 PM Ronald S. Bultje

Revert r24339 (it causes fate failures on x86-64) - I'll figure out what's
wrong with it tomorrow or so, then re-submit.

Originally committed as revision 24341 to svn://svn.ffmpeg.org/ffmpeg/trunk

6526976f 07/19/2010 10:38 PM Ronald S. Bultje

Remove FF_MM_SSE2/3 flags for CPUs where this is generally not faster than
regular MMX code. Examples of this are the Core1 CPU. Instead, set a new flag,
FF_MM_SSE2/3SLOW, which can be checked for particular SSE2/3 functions that
have been checked specifically on such CPUs and are actually faster than...

1878f685 07/19/2010 09:53 PM Ronald S. Bultje

Implement chroma (width=8) inner loopfilter MMX/MMX2/SSE2 functions.

Originally committed as revision 24339 to svn://svn.ffmpeg.org/ffmpeg/trunk

fb9bdf04 07/19/2010 09:45 PM Ronald S. Bultje

Be more efficient with registers or stack memory. Saves 8/16 bytes stack
for x86-32, or 2 MM registers on x86-64.

Originally committed as revision 24338 to svn://svn.ffmpeg.org/ffmpeg/trunk

3facfc99 07/19/2010 09:18 PM Ronald S. Bultje

Change function prototypes for width=8 inner and mbedge loopfilter functions
so that it does both U and V planes at the same time. This will have speed
advantages when using SSE2 (or higher) optimizations, since we can do both
the U and V rows together in a single xmm register....

1ee076b1 07/18/2010 08:06 PM Loren Merritt

more credits to D. J. Bernstein for fft

Originally committed as revision 24308 to svn://svn.ffmpeg.org/ffmpeg/trunk

819b2dd2 07/16/2010 09:35 PM Ronald S. Bultje

Attempt to fix x86-64 testsuite on fate.

Originally committed as revision 24275 to svn://svn.ffmpeg.org/ffmpeg/trunk

6f323f12 07/16/2010 07:54 PM Ronald S. Bultje

Remove duplicate define.

Originally committed as revision 24272 to svn://svn.ffmpeg.org/ffmpeg/trunk

889b2c26 07/16/2010 07:54 PM Ronald S. Bultje

Revert 24270, it contained some stuff that shouldn't have been in there.

Originally committed as revision 24271 to svn://svn.ffmpeg.org/ffmpeg/trunk

2356a783 07/16/2010 07:42 PM Ronald S. Bultje

Remove duplicate define.

Originally committed as revision 24270 to svn://svn.ffmpeg.org/ffmpeg/trunk

ede1b966 07/16/2010 07:38 PM Ronald S. Bultje

Give x86 r%d registers names, this will simplify implementation of the chroma
inner loopfilter, and it also allows us to save one register on x86-64/sse2.

Originally committed as revision 24269 to svn://svn.ffmpeg.org/ffmpeg/trunk

526e831a 07/16/2010 06:29 PM Ronald S. Bultje

Change return statement, the REP_RET is a mistake since the else case (x86-64,
sse2) doesn't actually loop, so REP_RET isn't necessary.

Originally committed as revision 24268 to svn://svn.ffmpeg.org/ffmpeg/trunk

a711eb48 07/15/2010 11:02 PM Ronald S. Bultje

VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations.

Originally committed as revision 24250 to svn://svn.ffmpeg.org/ffmpeg/trunk

faa26db2 07/11/2010 10:53 PM David Conrad

MMX/SSE VC1 loop filter

Originally committed as revision 24208 to svn://svn.ffmpeg.org/ffmpeg/trunk

7af8fbd3 07/11/2010 10:52 PM David Conrad

Make ff_pw_4 128 bits

Originally committed as revision 24207 to svn://svn.ffmpeg.org/ffmpeg/trunk

881fd7a6 07/06/2010 05:48 PM Vitor Sessak

Move SSE optimized 32-point DCT to its own file. Should fix breakage with YASM
disabled.

Originally committed as revision 24078 to svn://svn.ffmpeg.org/ffmpeg/trunk

4dcc4f8e 07/06/2010 04:58 PM Vitor Sessak

SSE optimized 32-point DCT

Originally committed as revision 24077 to svn://svn.ffmpeg.org/ffmpeg/trunk

f2a30bd8 07/03/2010 07:26 PM Ronald S. Bultje

Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros).

Originally committed as revision 24029 to svn://svn.ffmpeg.org/ffmpeg/trunk

b06855f1 07/03/2010 12:48 AM Jason Garrett-Glaser

SSSE3 versions of vp8 width4 bilinear MC functions

Originally committed as revision 24013 to svn://svn.ffmpeg.org/ffmpeg/trunk

dcc602d8 07/02/2010 05:27 AM Jason Garrett-Glaser

SSSE3 versions of width4 VP8 6-tap MC functions
Also make some small changes to saturation order of 4-tap SSSE3 MC to fix a
non-bitexactness bug.

Patch mostly by Eli Friedman <eli.friedman AT gmail DOT com>.

Originally committed as revision 23965 to svn://svn.ffmpeg.org/ffmpeg/trunk

8434fc26 07/01/2010 10:09 PM Jason Garrett-Glaser

Fix 100L in vp8dsp asm init

Originally committed as revision 23946 to svn://svn.ffmpeg.org/ffmpeg/trunk

17dc7c7a 07/01/2010 10:29 AM Jason Garrett-Glaser

Fix h264/vp8 intra pred on Athlon XP
Whose idea was it to have a CPU that didn't SIGILL on an invalid instruction?

Originally committed as revision 23927 to svn://svn.ffmpeg.org/ffmpeg/trunk

49bd8e4b 06/30/2010 03:38 PM Måns Rullgård

Fix grammar errors in documentation

Originally committed as revision 23904 to svn://svn.ffmpeg.org/ffmpeg/trunk

82a8d0f1 06/29/2010 05:23 PM Jason Garrett-Glaser

Use add instead of lshift in mmxext vp8 idct

Originally committed as revision 23891 to svn://svn.ffmpeg.org/ffmpeg/trunk

565344e7 06/29/2010 05:04 PM Ronald S. Bultje

Remove unused macros (duplicates from the now-LGPL x86util.asm).

Originally committed as revision 23890 to svn://svn.ffmpeg.org/ffmpeg/trunk

2dd2f716 06/29/2010 02:43 PM Ronald S. Bultje

MMX idct_add for VP8.

Originally committed as revision 23886 to svn://svn.ffmpeg.org/ffmpeg/trunk

29e71937 06/29/2010 12:28 PM Jason Garrett-Glaser

Add missing mm_support call to ff_h264_pred_init_x86.
I'm not sure if this is supposed to be here, but it can't hurt.

Originally committed as revision 23885 to svn://svn.ffmpeg.org/ffmpeg/trunk

004cda8e 06/29/2010 01:41 AM Jason Garrett-Glaser

Add mmxext version of VP8 DC Hadamard transform

Originally committed as revision 23878 to svn://svn.ffmpeg.org/ffmpeg/trunk

37355fe8 06/29/2010 12:40 AM Jason Garrett-Glaser

Make x86util.asm LGPL so we can use it in LGPL asm
Strip out most x264-specific stuff (not used anywhere in ffmpeg).

Originally committed as revision 23877 to svn://svn.ffmpeg.org/ffmpeg/trunk

bc14f04b 06/29/2010 12:23 AM Jason Garrett-Glaser

MMXEXT version of vp8 4x4 vertical pred

Originally committed as revision 23876 to svn://svn.ffmpeg.org/ffmpeg/trunk

fb9927ad 06/28/2010 11:53 PM Jason Garrett-Glaser

Add mmx/mmxext/ssse3 4x4 TM intra pred functions for vp8

Originally committed as revision 23875 to svn://svn.ffmpeg.org/ffmpeg/trunk

8b746bb4 06/28/2010 11:37 PM Jason Garrett-Glaser

Add missing comment header for predict_4x4_dc_mmxext

Originally committed as revision 23874 to svn://svn.ffmpeg.org/ffmpeg/trunk

270a85d2 06/28/2010 11:35 PM Jason Garrett-Glaser

Fix some intra pred MMX functions that used MMXEXT instructions
Also add predict_4x4_dc MMXEXT function for vp8/h264.

Originally committed as revision 23873 to svn://svn.ffmpeg.org/ffmpeg/trunk

a912da76 06/28/2010 10:13 PM Jason Garrett-Glaser

Fix VP8 bilinear mc on x86_64

Originally committed as revision 23872 to svn://svn.ffmpeg.org/ffmpeg/trunk

50f70541 06/28/2010 09:12 PM Baptiste Coudurier

Change MMXEXT to MMX2, MMXEXT is deprecated

Originally committed as revision 23865 to svn://svn.ffmpeg.org/ffmpeg/trunk

0fecad09 06/28/2010 07:14 PM Jason Garrett-Glaser

Add x86 asm functions for VP8 put_pixels

Originally committed as revision 23858 to svn://svn.ffmpeg.org/ffmpeg/trunk

a173aa89 06/28/2010 06:56 PM Jason Garrett-Glaser

Add MMX, SSE2, SSSE3 asm for VP8 bilinear MC

Originally committed as revision 23857 to svn://svn.ffmpeg.org/ffmpeg/trunk

1f65b67c 06/28/2010 10:02 AM Måns Rullgård

Fix x86 build with h264dsp disabled

Originally committed as revision 23844 to svn://svn.ffmpeg.org/ffmpeg/trunk

b3858964 06/27/2010 03:11 PM Eli Friedman

Add const to some pointer parameters.

Patch by Eli Friedman, eli D friedman A gmail

Originally committed as revision 23826 to svn://svn.ffmpeg.org/ffmpeg/trunk

30bdefd1 06/27/2010 02:52 AM David Conrad

Fix build without yasm

Originally committed as revision 23816 to svn://svn.ffmpeg.org/ffmpeg/trunk

0178d14f 06/27/2010 02:01 AM Jason Garrett-Glaser

First shot at VP8 optimizations:
- MMXEXT, SSE2 and SSSE3 MC functions
- MMX and SSE4 IDCT dc_add functions

Patch by Jason Garrett-Glaser <darkshikari gmail com> and myself.

Originally committed as revision 23815 to svn://svn.ffmpeg.org/ffmpeg/trunk

0912db02 06/25/2010 07:10 PM Måns Rullgård

Make vp8 select h264dsp and use this to pull in mmx intrapred

Originally committed as revision 23790 to svn://svn.ffmpeg.org/ffmpeg/trunk

0c590748 06/25/2010 07:06 PM Carl Eugen Hoyos

Fix compilation without --enable-gpl.

Originally committed as revision 23789 to svn://svn.ffmpeg.org/ffmpeg/trunk

96da2a69 06/25/2010 06:34 PM Carl Eugen Hoyos

Cosmetics: Fix indentation.

Originally committed as revision 23785 to svn://svn.ffmpeg.org/ffmpeg/trunk

4af8cdfc 06/25/2010 06:25 PM Jason Garrett-Glaser

16x16 and 8x8c x86 SIMD intra pred functions for VP8 and H.264

Originally committed as revision 23783 to svn://svn.ffmpeg.org/ffmpeg/trunk

89c7d805 06/24/2010 08:53 AM Vitor Sessak

Fix compilation on x64.

Originally committed as revision 23753 to svn://svn.ffmpeg.org/ffmpeg/trunk

57dbd12b 06/24/2010 08:46 AM Vitor Sessak

Fix asm constraints in apply_window()

Originally committed as revision 23752 to svn://svn.ffmpeg.org/ffmpeg/trunk

bc2b3682 06/24/2010 07:44 AM Vitor Sessak

SSE-optimized MP3 floating point windowing functions

Originally committed as revision 23750 to svn://svn.ffmpeg.org/ffmpeg/trunk

2966cc18 06/23/2010 07:20 PM Jason Garrett-Glaser

Update x264asm header files to latest versions.
Modify the asm accordingly.
GLOBAL is now no longer necessary for PIC-compliant loads.

Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk

413abbe1 06/04/2010 04:46 AM David Conrad

Add bitexact versions of put_no_rnd_pixels8 _x2 and _y2 for vp3/theora

Originally committed as revision 23463 to svn://svn.ffmpeg.org/ffmpeg/trunk

179655b6 05/28/2010 07:01 AM David Conrad

vp3: The DC-only IDCT is surprisingly not supposed to be bitexact to the
full IDCT. Fix this.

Originally committed as revision 23358 to svn://svn.ffmpeg.org/ffmpeg/trunk

22cb6fb6 05/11/2010 12:22 AM Michael Niedermayer

Adding missing () to mathops.h.

Originally committed as revision 23083 to svn://svn.ffmpeg.org/ffmpeg/trunk

1c71b5c8 05/10/2010 09:16 PM Reimar Döffinger

Replace more "m" constraints with MANGLE to fix compilation issues
with x86_32 gcc 4.4.4 and -fPIC.

Originally committed as revision 23082 to svn://svn.ffmpeg.org/ffmpeg/trunk

ba87f080 04/20/2010 02:45 PM Diego Biurrun

Remove explicit filename from Doxygen @file commands.

Passing an explicit filename to this command is only necessary if the
documentation in the @file block refers to a file different from the
one the block resides in.

Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk

eb6a6cd7 04/17/2010 02:04 AM David Conrad

vp3: DC-only IDCT

2-4% faster overall decode

Originally committed as revision 22896 to svn://svn.ffmpeg.org/ffmpeg/trunk

27eecec3 04/01/2010 04:52 PM Reimar Döffinger

Convert two "m" constraints to MANGLE to fix compilation with some compilers.

Originally committed as revision 22760 to svn://svn.ffmpeg.org/ffmpeg/trunk

d343d598 03/18/2010 03:00 PM Måns Rullgård

Replace remaining uses of ATTR_ALIGNED with DECLARE_ALIGNED

Originally committed as revision 22593 to svn://svn.ffmpeg.org/ffmpeg/trunk

3bd74e92 03/16/2010 09:23 PM Måns Rullgård

Simplify arch-specific object file lists

Originally committed as revision 22570 to svn://svn.ffmpeg.org/ffmpeg/trunk

43f60eba 03/16/2010 09:22 PM Måns Rullgård

Move arch-specific makefile parts into $arch/Makefile

Originally committed as revision 22569 to svn://svn.ffmpeg.org/ffmpeg/trunk

4693b031 03/16/2010 01:17 AM Måns Rullgård

Move H264 dsputil functions into their own struct

This moves the H264-specific functions from DSPContext to the new
H264DSPContext. The code is made conditional on CONFIG_H264DSP
which is set by the codecs requiring it.

The qpel and chroma MC functions are not moved as these are used by...

05aec7bb 03/14/2010 05:50 PM Måns Rullgård

Separate DWT from snow and dsputil

This moves the DWT functions from snow.c and dsputil.c to a file of
their own. A new struct, DWTContext, holds the function pointers
previously part of DSPContext.

Originally committed as revision 22522 to svn://svn.ffmpeg.org/ffmpeg/trunk

f49747e9 03/06/2010 10:37 PM Måns Rullgård

x86: move function prototypes to header files

Originally committed as revision 22266 to svn://svn.ffmpeg.org/ffmpeg/trunk

c26e58e3 03/06/2010 10:36 PM Måns Rullgård

Add some missing #includes

Originally committed as revision 22258 to svn://svn.ffmpeg.org/ffmpeg/trunk

1429224b 03/06/2010 02:34 PM Måns Rullgård

Move FFT parts from dsputil.h to fft.h

Originally committed as revision 22235 to svn://svn.ffmpeg.org/ffmpeg/trunk

84dc2d8a 03/06/2010 02:24 PM Måns Rullgård

Remove DECLARE_ALIGNED_{8,16} macros

These macros are redundant. All uses are replaced with the generic
DECLARE_ALIGNED macro instead.

Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk

5e46be96 02/17/2010 11:58 PM Måns Rullgård

Move NEG_[US]SR32 macros to mathops.h

Originally committed as revision 21873 to svn://svn.ffmpeg.org/ffmpeg/trunk

19530266 02/10/2010 02:02 AM David Conrad

Enable SSE2 (put|avg)_pixels_16_sse2

SVQ1 chroma has been special-cased aligned to 16-bytes since at least r15466
Other architectures also assume 16-byte alignment here too but set STRIDE_ALIGN
to 16.

Originally committed as revision 21736 to svn://svn.ffmpeg.org/ffmpeg/trunk

3d05c1fb 01/30/2010 07:26 PM Reimar Döffinger

Make the jump-table section-relative for x86_64 with PIC enabled.
This allows getting rid of the macho64 specific hack that moves them
to rodata (with worse cache behaviour) and avoids textrels which
e.g. Gentoo does not allow for x86_64 libraries.

Originally committed as revision 21551 to svn://svn.ffmpeg.org/ffmpeg/trunk

900479bb 01/26/2010 05:17 PM Loren Merritt

optimize h264_loop_filter_strength_mmx2
244->160 cycles on core2

Originally committed as revision 21462 to svn://svn.ffmpeg.org/ffmpeg/trunk