ffmpeg / libavcodec / x86 / dsputil_yasm.asm @ 12802ec0

# Date Author Comment
17cf7c68 02/08/2011 11:25 PM Ronald S. Bultje

Fix ff_emu_edge_core_sse() on Win64.

Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict
on the size of registers and which registers are being used for operations
where multiple are available. This fixes segfaults in emulated_edge()...

c73d99e6 02/02/2011 02:44 AM Justin Ruggles

Separate format conversion DSP functions from DSPContext.

This will be beneficial for use with the audio conversion API without
requiring it to depend on all of dsputil.

Signed-off-by: Mans Rullgard <>

81f2a3f4 02/01/2011 01:55 AM Ronald S. Bultje

Implement a SIMD version of emulated_edge_mc() for x86.

From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32)
and 196 (SSE2/x86-32) cycles.

2966cc18 06/23/2010 07:20 PM Jason Garrett-Glaser

Update x264asm header files to latest versions.
Modify the asm accordingly.
GLOBAL is now no longer necessary for PIC-compliant loads.

Originally committed as revision 23739 to svn://

3deb5384 01/22/2010 11:07 PM Alex Converse

Implement an sse version of scalarproduct_float().

Originally committed as revision 21386 to svn://

758c7455 12/08/2009 09:24 PM Loren Merritt

fix a crash in ape decoding on x86_32 sse2

Originally committed as revision 20777 to svn://

a4605efd 12/05/2009 05:53 PM Loren Merritt

slightly faster scalarproduct_and_madd_int16_ssse3 on penryn, no change on conroe

Originally committed as revision 20743 to svn://

b1159ad9 12/05/2009 03:09 PM Loren Merritt

refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)...

b10fa1bb 12/03/2009 06:53 PM Loren Merritt

port ape dsp functions from sse2 to mmx
now requires yasm

Originally committed as revision 20722 to svn://

b07781b6 10/18/2009 09:44 PM Loren Merritt

fix linking on systems with a function name prefix (10l in r20287)

Originally committed as revision 20294 to svn://

e17ccf60 10/18/2009 08:47 PM Loren Merritt

huffyuv: add some const qualifiers

Originally committed as revision 20290 to svn://

2f77923d 10/18/2009 08:10 PM Loren Merritt

simd add_hfyu_left_prediction
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv

Originally committed as revision 20287 to svn://

3daa434a 02/08/2009 05:45 PM Loren Merritt

overall ffvhuff decoding speedup: 28% on core2, 25% on k8.

Originally committed as revision 17059 to svn://

a6493a8f 12/22/2008 09:12 AM Diego Biurrun

Rename libavcodec/i386/ --> libavcodec/x86/.
It contains optimizations that are not specific to i386 and
libavutil uses this naming scheme already.

Originally committed as revision 16270 to svn://