ffmpeg / libavcodec / x86 @ 329d689f

| Name | Size | Revision | Age | Author | Comment |
|---|---|---|---|---|---|
| Makefile | 3.03 KB | d0acc2d2 | over 10 years | Ronald S. Bultje | Move sse16_sse2() from inline asm to yasm. It i... |
| cavsdsp_mmx.c | 19.2 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| dct32_sse.c | 11.4 KB | 881fd7a6 | almost 11 years | Vitor Sessak | Move SSE optimized 32-point DCT to its own file... |
| deinterlace.asm | 2.47 KB | de4bc44a | over 10 years | Vitor Sessak | Convert deinterlacing MMX code to YASM Origina... |
| dnxhd_mmx.c | 2.15 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| dsputil_mmx.c | 123 KB | 329d689f | over 10 years | Eli Friedman | Use sse2 variant of put_pixels16() for no_rnd a... |
| dsputil_mmx.h | 8.2 KB | 2c166c3a | over 10 years | Ronald S. Bultje | Port latest x264 deblock asm (before they moved... |
| dsputil_mmx_avg_template.c | 40.7 KB | 413abbe1 | almost 11 years | David Conrad | Add bitexact versions of put_no_rnd_pixels8 _x2... |
| dsputil_mmx_qns_template.c | 3.71 KB | a6493a8f | over 12 years | Diego Biurrun | Rename libavcodec/i386/ --> libavcodec/x86/. It... |
| dsputil_mmx_rnd_template.c | 22.7 KB | 00312109 | over 11 years | Reimar Döffinger | Replace several #ifdef PIC with the more obviou... |
| dsputil_yasm.asm | 9.77 KB | 2966cc18 | almost 11 years | Jason Garrett-Glaser | Update x264asm header files to latest versions.... |
| dsputilenc_mmx.c | 35.1 KB | c0bc8b9a | over 10 years | Måns Rullgård | x86: disable SSE functions using stack when sta... |
| dsputilenc_yasm.asm | 9.52 KB | ada65af9 | over 10 years | Ronald S. Bultje | Don't access upper 32 bits of a 32-bit int on 6... |
| fdct_mmx.c | 17.7 KB | d343d598 | about 11 years | Måns Rullgård | Replace remaining uses of ATTR_ALIGNED with DEC... |
| fft.c | 1.83 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| fft.h | 1.58 KB | 4dcc4f8e | almost 11 years | Vitor Sessak | SSE optimized 32-point DCT Originally committe... |
| fft_3dn.c | 898 Bytes | a6493a8f | over 12 years | Diego Biurrun | Rename libavcodec/i386/ --> libavcodec/x86/. It... |
| fft_3dn2.c | 5.1 KB | cb4f1246 | over 10 years | Alex Converse | imdct/x86: Use "s->mdct_size" instead of "1 << ... |
| fft_mmx.asm | 14.9 KB | dc77e985 | over 10 years | Reimar Döffinger | Split and then simplify address generation macr... |
| fft_sse.c | 2.89 KB | cb4f1246 | over 10 years | Alex Converse | imdct/x86: Use "s->mdct_size" instead of "1 << ... |
| h264_chromamc.asm | 17.4 KB | d0eb5a11 | over 10 years | Ronald S. Bultje | Move H264 chroma MC from inline asm to yasm. Th... |
| h264_deblock.asm | 22.4 KB | 2c166c3a | over 10 years | Ronald S. Bultje | Port latest x264 deblock asm (before they moved... |
| h264_i386.h | 6.08 KB | ba87f080 | about 11 years | Diego Biurrun | Remove explicit filename from Doxygen @file com... |
| h264_idct.asm | 21.4 KB | 02b424d9 | over 10 years | Reimar Döffinger | Add d suffix to movd target register to make it... |
| h264_intrapred.asm | 14.3 KB | 17dc7c7a | almost 11 years | Jason Garrett-Glaser | Fix h264/vp8 intra pred on Athlon XP Whose idea... |
| h264_intrapred_init.c | 4.66 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| h264_qpel_mmx.c | 52.6 KB | 14bc1f24 | over 10 years | Ronald S. Bultje | Split h264dsp_mmx.c (which was #included in dsp... |
| h264_weight.asm | 8.33 KB | b1c32fb5 | over 10 years | Reimar Döffinger | Use "d" suffix for general-purpose registers us... |
| h264dsp_mmx.c | 17.1 KB | cd17285e | over 10 years | Ronald S. Bultje | Merge b_idx and edge variables, and optimize th... |
| idct_mmx.c | 23.7 KB | 740dfe70 | over 10 years | Vitor Sessak | Fix compilation in x86_64. I broke it with r245... |
| idct_mmx_xvid.c | 23.4 KB | c26e58e3 | about 11 years | Måns Rullgård | Add some missing #includes Originally committe... |
| idct_sse2_xvid.c | 15.1 KB | 7e7c4b60 | over 10 years | Ronald S. Bultje | Put ff_ prefix on non-static {put_signed,put,ad... |
| idct_xvid.h | 1.2 KB | ba87f080 | about 11 years | Diego Biurrun | Remove explicit filename from Doxygen @file com... |
| lpc_mmx.c | 5.61 KB | 4a128945 | over 11 years | Reimar Döffinger | Reduce number of ASM constraints for ff_lpc_com... |
| mathops.h | 2.68 KB | 22cb6fb6 | almost 11 years | Michael Niedermayer | Adding missing () to mathops.h. Originally com... |
| mlpdsp.c | 6.21 KB | 989b7181 | almost 12 years | Ramiro Polla | Use fewer macros in x86-optimized mlpdsp. Fixes... |
| motion_est_mmx.c | 15.8 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| mpegaudiodec_mmx.c | 5 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| mpegvideo_mmx.c | 27.7 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| mpegvideo_mmx_template.c | 17.1 KB | 84dc2d8a | about 11 years | Måns Rullgård | Remove DECLARE_ALIGNED_{8,16} macros These mac... |
| simple_idct_mmx.c | 71 KB | 7e7c4b60 | over 10 years | Ronald S. Bultje | Put ff_ prefix on non-static {put_signed,put,ad... |
| snowdsp_mmx.c | 39.4 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| vc1dsp_mmx.c | 34.5 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| vc1dsp_yasm.asm | 7.8 KB | b1c32fb5 | over 10 years | Reimar Döffinger | Use "d" suffix for general-purpose registers us... |
| vp3dsp.asm | 20.7 KB | b1c32fb5 | over 10 years | Reimar Döffinger | Use "d" suffix for general-purpose registers us... |
| vp56_arith.h | 1.71 KB | 05c04cdf | over 10 years | Jason Garrett-Glaser | VP5/6/8: ~7% faster arithmetic decoding Grab fr... |
| vp56dsp.asm | 4.86 KB | 4eca52ed | over 10 years | Ronald S. Bultje | Fix typos when converting inline asm to yasm, f... |
| vp56dsp_init.c | 1.73 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| vp8dsp-init.c | 19 KB | c6c98d08 | over 10 years | Stefano Sabatini | Move mm_support() from libavcodec to libavutil,... |
| vp8dsp.asm | 78.4 KB | b1c32fb5 | over 10 years | Reimar Döffinger | Use "d" suffix for general-purpose registers us... |
| x86inc.asm | 16.4 KB | 532e7697 | almost 11 years | Loren Merritt | sync yasm macros from x264 Originally committe... |
| x86util.asm | 9.12 KB | e2e34104 | over 10 years | Ronald S. Bultje | Move hadamard_diff{,16}_{mmx,mmx2,sse2,ssse3}()... |

Latest revisions

# Date Author Comment
329d689f 09/29/2010 03:34 PM Eli Friedman

Use sse2 variant of put_pixels16() for no_rnd also. Provides a minor speed
increase to e.g. vc1, snow and mpeg decoding.

Patch by Eli Friedman <eli dot friedman gmail com>.

Originally committed as revision 25259 to svn://svn.ffmpeg.org/ffmpeg/trunk
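
The commit message for 329d689f does not say why one routine can fill both the rounding and the no_rnd slot. A likely reason is that a full-pel put is a plain block copy with no averaging, so the rounding mode has nothing to act on. The sketch below illustrates that idea in portable C. The helper names `copy_block16`, `avg_rnd` and `avg_no_rnd` are hypothetical and are not taken from dsputil_mmx.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical full-pel "put": copies a 16-pixel-wide block of h rows.
 * No pixels are averaged, so there is nothing to round. */
static void copy_block16(uint8_t *dst, const uint8_t *src, int stride, int h)
{
    assert(h > 0);
    for (int y = 0; y < h; y++) {
        memcpy(dst, src, 16);
        dst += stride;
        src += stride;
    }
}

/* Rounding only matters once two pixels are averaged, as in half-pel
 * interpolation: the two modes differ when a + b is odd. */
static uint8_t avg_rnd(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);   /* rounds halves up */
}

static uint8_t avg_no_rnd(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) >> 1);       /* rounds halves down */
}
```

Under this assumption, only the interpolating variants need separate rounding and no_rnd implementations, while the copy can be shared.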

cd17285e 09/29/2010 02:04 PM Ronald S. Bultje

Merge b_idx and edge variables, and optimize the ASM to directly load variables
from memory locations/offsets depending on b_idx plus constants, rather than
having gcc do this. This saves several lea calls and together saves about
10 cycles in h264_loop_filter_strength_mmx2()....

0cc8a5d0 09/29/2010 02:03 PM Ronald S. Bultje

Remove mv_mask variable. Replace the related pand -1/0 instructions by either
a pxor, or remove the instruction altogether. Altogether, this saves 1
instruction.

Originally committed as revision 25255 to svn://svn.ffmpeg.org/ffmpeg/trunk

c0673f2c 09/29/2010 02:02 PM Ronald S. Bultje

Remove d_idx as a variable, and instead load it as a constant in the asm.
This has no measurable speed effect because the surrounding code doesn't
take advantage of this yet.

Originally committed as revision 25254 to svn://svn.ffmpeg.org/ffmpeg/trunk

2c3135f6 09/29/2010 01:35 PM Ronald S. Bultje

Unroll inner bidir loop in h264_loop_filter_strength_mmx2(), which gets rid
of the d_idx variable and therefore allows for future optimizations. No speed
difference by this commit itself.

Originally committed as revision 25253 to svn://svn.ffmpeg.org/ffmpeg/trunk

4b81511c 09/29/2010 01:34 PM Ronald S. Bultje

Unloop the outer loop in h264_loop_filter_strength_mmx2(), which allows
inlining various constants within the loop code. 20 cycles faster on
cathedral sample.

Originally committed as revision 25252 to svn://svn.ffmpeg.org/ffmpeg/trunk

02b424d9 09/26/2010 09:15 AM Reimar Döffinger

Add d suffix to movd target register to make it work with nasm.

Originally committed as revision 25206 to svn://svn.ffmpeg.org/ffmpeg/trunk

dc77e985 09/26/2010 09:08 AM Reimar Döffinger

Split and then simplify address generation macro.
Allows nasm to work for this code.

Originally committed as revision 25205 to svn://svn.ffmpeg.org/ffmpeg/trunk

7e117771 09/24/2010 03:31 PM Ronald S. Bultje

Remove unused variable.

Originally committed as revision 25173 to svn://svn.ffmpeg.org/ffmpeg/trunk

ae112918 09/24/2010 02:07 PM Ronald S. Bultje

Unroll loop in h264_idct_add16intra_sse2(). Basically identical to r25171, this
inlines scan8[] and removes loop setup. 15% faster, 0.4% overall.

See "[PATCH] unroll loop in h264_idct_add8_sse2()" thread on ML.

Originally committed as revision 25172 to svn://svn.ffmpeg.org/ffmpeg/trunk
