ffmpeg / libavcodec / i386 / dsputil_mmx.c @ 0dba1995

# Date Author Comment
0dba1995 10/19/2008 04:44 AM David Conrad

Cosmetics: reindent

Originally committed as revision 15644 to svn://svn.ffmpeg.org/ffmpeg/trunk

ca4a4ac1 10/19/2008 04:43 AM David Conrad

Combine non-bitexact sections

Originally committed as revision 15643 to svn://svn.ffmpeg.org/ffmpeg/trunk

daa1ea04 10/19/2008 04:40 AM David Conrad

VP3 loop filter is mmx2 not mmx

Originally committed as revision 15642 to svn://svn.ffmpeg.org/ffmpeg/trunk

357f45d9 10/17/2008 03:18 AM David Conrad

MMX VP3 Loop Filter

Originally committed as revision 15630 to svn://svn.ffmpeg.org/ffmpeg/trunk

be449fca 10/16/2008 01:34 PM Diego Pettenò

Convert asm keyword into __asm__.

Neither the asm() nor the __asm__() keyword is part of the C99
standard, but while GCC accepts the former in C89 syntax, it is not
accepted in C99 unless GNU extensions are turned on (with -fasm). The
latter form is accepted in any syntax as an extension (without...

8cfd78ce 09/17/2008 07:49 PM David Conrad

Ensure MMX/SSE2 VP3 IDCT selection isn't disabled when only Theora is enabled

Originally committed as revision 15350 to svn://svn.ffmpeg.org/ffmpeg/trunk

ccd3ec82 09/17/2008 07:30 PM David Conrad

MMX/SSE2 VP3 IDCT are bitexact now that the dequantization matrices are permutated correctly

Originally committed as revision 15345 to svn://svn.ffmpeg.org/ffmpeg/trunk

b4c3d835 08/31/2008 07:05 AM David Conrad

Use ff_vp3_idct_data in vp3dsp_mmx.c rather than duplicating it

Originally committed as revision 15118 to svn://svn.ffmpeg.org/ffmpeg/trunk

21383da8 08/30/2008 07:40 PM David Conrad

Let ff_pw_8 be used as an SSE constant

Originally committed as revision 15052 to svn://svn.ffmpeg.org/ffmpeg/trunk

ebceaa1c 08/14/2008 04:40 AM Loren Merritt

gcc chokes on the 7 registers needed for float_to_int16_interleave6 (even inside HAVE_7REGS), so write it in yasm

Originally committed as revision 14749 to svn://svn.ffmpeg.org/ffmpeg/trunk

ee467537 08/14/2008 04:39 AM Loren Merritt

gcc chokes on xmm constraints, so pessimize int32_to_float_fmul_scalar_sse a little

Originally committed as revision 14748 to svn://svn.ffmpeg.org/ffmpeg/trunk

67587238 08/13/2008 11:36 PM Loren Merritt

special case 6 channel version of float_to_int16_interleave
5% faster ac3

Originally committed as revision 14744 to svn://svn.ffmpeg.org/ffmpeg/trunk

911e21a3 08/13/2008 11:35 PM Loren Merritt

simd int->float
20% faster ac3 if downmixing, 15% if not

Originally committed as revision 14743 to svn://svn.ffmpeg.org/ffmpeg/trunk

ac2e5564 08/13/2008 11:33 PM Loren Merritt

simd downmix
13% faster ac3 if downmixing

Originally committed as revision 14742 to svn://svn.ffmpeg.org/ffmpeg/trunk

862b98d4 08/12/2008 12:51 AM Loren Merritt

cosmetics in dsp init

Originally committed as revision 14704 to svn://svn.ffmpeg.org/ffmpeg/trunk

f769b746 08/02/2008 05:32 PM Uoti Urpala

Mark add_png_paeth_prediction_* functions which are only used within this file
as static. patch by Uoti Urpala, uoti.urpala pp1.inet fi

Originally committed as revision 14509 to svn://svn.ffmpeg.org/ffmpeg/trunk

5eb0f2a4 07/16/2008 12:50 AM Loren Merritt

float_to_int16_interleave: change src to an array of pointers instead of assuming it's contiguous.
this has no immediate effect, but will allow it to be used in more codecs.

Originally committed as revision 14252 to svn://svn.ffmpeg.org/ffmpeg/trunk

4342a7f3 07/15/2008 04:11 AM Loren Merritt

10l, float_to_int16_interleave_sse/3dnow wrote the wrong samples

Originally committed as revision 14236 to svn://svn.ffmpeg.org/ffmpeg/trunk

b9fa3208 07/13/2008 03:03 PM Loren Merritt

exploit mdct symmetry
2% faster vorbis on conroe, k8. 7% on celeron.

Originally committed as revision 14207 to svn://svn.ffmpeg.org/ffmpeg/trunk

f27e1d64 07/13/2008 02:56 PM Loren Merritt

simplify vorbis windowing

Originally committed as revision 14205 to svn://svn.ffmpeg.org/ffmpeg/trunk

d7e1fc42 07/11/2008 04:48 AM Kostya Shishkov

SSE2 optimizations for Monkey's Audio decoder vector functions

Originally committed as revision 14161 to svn://svn.ffmpeg.org/ffmpeg/trunk

e98750c3 07/09/2008 07:21 AM Michael Niedermayer

float_to_int16_sse2()
20% faster than sse

Originally committed as revision 14138 to svn://svn.ffmpeg.org/ffmpeg/trunk

35ee72b1 07/07/2008 09:25 PM Michael Niedermayer

1 c-asm loop less and 1x unroll of float_to_int16_sse()
25% faster

Originally committed as revision 14104 to svn://svn.ffmpeg.org/ffmpeg/trunk

560fa9bf 07/07/2008 09:04 PM Michael Niedermayer

Fix x86-64

Originally committed as revision 14103 to svn://svn.ffmpeg.org/ffmpeg/trunk

63b737d4 07/07/2008 08:46 PM Michael Niedermayer

don't use C-asm loops and unroll once float_to_int16_3dnow()
30% faster

Originally committed as revision 14102 to svn://svn.ffmpeg.org/ffmpeg/trunk

00eebe3d 06/22/2008 07:05 AM Reimar Döffinger

Fix add_bytes_mmx and add_bytes_l2_mmx for w < 16

Originally committed as revision 13877 to svn://svn.ffmpeg.org/ffmpeg/trunk

245976da 05/09/2008 11:56 AM Diego Biurrun

Use full path for #includes from another directory.

Originally committed as revision 13098 to svn://svn.ffmpeg.org/ffmpeg/trunk

40d0e665 05/08/2008 09:11 PM Ramiro Polla

Do not misuse long as the size of a register in x86.
typedef x86_reg as the appropriate size and use it instead.

Originally committed as revision 13081 to svn://svn.ffmpeg.org/ffmpeg/trunk

f73a6393 04/16/2008 01:36 AM Alexander Strange

Add a new xvid-style IDCT using SSE2.

Originally committed as revision 12843 to svn://svn.ffmpeg.org/ffmpeg/trunk

54a0b6e5 04/12/2008 04:54 PM Alexander Strange

Add a header file to declare Xvid IDCT functions.
patch by Alexander Strange, astrange ithinksw com

Originally committed as revision 12794 to svn://svn.ffmpeg.org/ffmpeg/trunk

ce53144b 04/01/2008 04:51 AM Loren Merritt

h264 chroma mc ssse3
width8: 180->92, width4: 78->63 cycles (core2)

Originally committed as revision 12661 to svn://svn.ffmpeg.org/ffmpeg/trunk

9e8e6d31 03/21/2008 12:36 PM Zuxy Meng

Add missed call to ff_cavsdsp_init_3dnow() in dsputil_init_mmx()

Originally committed as revision 12540 to svn://svn.ffmpeg.org/ffmpeg/trunk

943032b1 03/20/2008 02:24 PM Michael Niedermayer

Hardcode register to prevent apparent miscompilation.
Fixes regression tests with gcc 2.95.

Originally committed as revision 12512 to svn://svn.ffmpeg.org/ffmpeg/trunk

dea00a46 03/20/2008 02:09 PM Michael Niedermayer

remove unused temp

Originally committed as revision 12511 to svn://svn.ffmpeg.org/ffmpeg/trunk

5a6a9e78 03/04/2008 12:07 AM Aurelien Jacobs

move draw_edges() into dsputil

Originally committed as revision 12309 to svn://svn.ffmpeg.org/ffmpeg/trunk

97d1d009 02/25/2008 11:14 PM Aurelien Jacobs

split encoding part of dsputil_mmx into its own file

Originally committed as revision 12223 to svn://svn.ffmpeg.org/ffmpeg/trunk

78d3d94f 02/24/2008 02:46 PM Reimar Döffinger

__asm __volatile -> asm volatile, improves code consistency and works
(as far as that is possible) with the Sun C compiler.

Originally committed as revision 12188 to svn://svn.ffmpeg.org/ffmpeg/trunk

4a9ca0a2 02/21/2008 07:10 AM Loren Merritt

simd and unroll png_filter_row
cycles per 1000 pixels on core2:
left: 9211->5170
top: 9283->2138
avg: 12215->7611
paeth: 64024->17360
overall rgb png decoding speed: +45%
overall greyscale png decoding speed: +6%

Originally committed as revision 12164 to svn://svn.ffmpeg.org/ffmpeg/trunk

1d67b037 02/06/2008 12:32 PM Loren Merritt

sse2 h264 motion compensation. not new code, just separate out the cases that didn't need ssse3.

Originally committed as revision 11877 to svn://svn.ffmpeg.org/ffmpeg/trunk

20d565be 02/06/2008 04:44 AM Loren Merritt

put loop counter in a register if possible. makes some of the qpel functions 3% faster.

Originally committed as revision 11876 to svn://svn.ffmpeg.org/ffmpeg/trunk

a2b7bc8e 02/06/2008 03:51 AM Loren Merritt

constant was excessively aligned

Originally committed as revision 11874 to svn://svn.ffmpeg.org/ffmpeg/trunk

ddf96970 02/05/2008 11:22 AM Loren Merritt

ssse3 h264 motion compensation.
25% faster than mmx on core2, 35% if you discount fullpel, 4% overall decoding.

Originally committed as revision 11871 to svn://svn.ffmpeg.org/ffmpeg/trunk

fa9b873e 02/05/2008 01:16 AM Loren Merritt

clean up an ugliness introduced in r11826. this syntax will require fewer changes when adding future sse2 code.

Originally committed as revision 11868 to svn://svn.ffmpeg.org/ffmpeg/trunk

b2f77586 02/04/2008 04:20 PM Loren Merritt

reduce code duplication

Originally committed as revision 11863 to svn://svn.ffmpeg.org/ffmpeg/trunk

b313e815 02/03/2008 05:04 PM Loren Merritt

avg_pixels4_mmx2

Originally committed as revision 11829 to svn://svn.ffmpeg.org/ffmpeg/trunk

6c01d006 02/03/2008 04:19 PM Loren Merritt

use mmx2/3dnow avg functions in avg_qpel*_mc00

Originally committed as revision 11828 to svn://svn.ffmpeg.org/ffmpeg/trunk

ed5d7a53 02/03/2008 07:05 AM Loren Merritt

ff_h264_idct8_add_sse2.
compared to mmx, 217->126 cycles on core2, 262->220 on k8.

Originally committed as revision 11826 to svn://svn.ffmpeg.org/ffmpeg/trunk

066e0cc5 01/30/2008 11:54 PM Baptiste Coudurier

add parentheses, fix warning: i386/dsputil_mmx.c:2618: warning: suggest parentheses around arithmetic in operand of |

Originally committed as revision 11673 to svn://svn.ffmpeg.org/ffmpeg/trunk

afa47789 01/30/2008 11:52 PM Baptiste Coudurier

fix prototypes, remove warning: i386/dsputil_mmx.c:3594: warning: assignment from incompatible pointer type

Originally committed as revision 11672 to svn://svn.ffmpeg.org/ffmpeg/trunk

27215c6b 01/27/2008 02:46 PM Reimar Döffinger

Use DECLARE_ALIGNED

Originally committed as revision 11630 to svn://svn.ffmpeg.org/ffmpeg/trunk

28748a91 01/11/2008 08:29 AM Christophe Gisquet

Factorize some duplicated code from CAVS and H.264 into a common file.
patch by Christophe Gisquet, christophe.gisquet free fr

Originally committed as revision 11504 to svn://svn.ffmpeg.org/ffmpeg/trunk

9fa35729 12/21/2007 11:11 PM Christophe Gisquet

add MMX version for put_no_rnd_h264_chroma_mc8_c, used in VC-1 decoding.
patch by Christophe GISQUET christophe P gisquet A free P fr
original thread:
date: Nov 25, 2007 12:35 AM
subject: Re: [FFmpeg-devel] MMX version for put_no_rnd_h264_chroma_mc8_c

Originally committed as revision 11298 to svn://svn.ffmpeg.org/ffmpeg/trunk

9fbd14ac 12/21/2007 12:38 PM Diego Biurrun

Fix typo in macro name: WARPER8_16_SQ --> WRAPPER8_16_SQ.

Originally committed as revision 11296 to svn://svn.ffmpeg.org/ffmpeg/trunk

407c50a0 12/16/2007 10:20 PM Aurelien Jacobs

move FLAC mmx dsp to its own file

Originally committed as revision 11244 to svn://svn.ffmpeg.org/ffmpeg/trunk

571bf37f 12/11/2007 06:47 PM Diego Biurrun

typo/clarification

Originally committed as revision 11201 to svn://svn.ffmpeg.org/ffmpeg/trunk

52b541ad 12/01/2007 10:21 PM Vitor Sessak

spelling

Originally committed as revision 11122 to svn://svn.ffmpeg.org/ffmpeg/trunk

bb6cc730 11/27/2007 10:57 PM Aurelien Jacobs

remove some unused ff_p* vars from dsputil

Originally committed as revision 11106 to svn://svn.ffmpeg.org/ffmpeg/trunk

dbb5fdbd 11/27/2007 10:56 PM Aurelien Jacobs

remove useless #ifdef around extern declaration

Originally committed as revision 11105 to svn://svn.ffmpeg.org/ffmpeg/trunk

7c35b551 11/27/2007 10:54 PM Aurelien Jacobs

cosmetics: indentation

Originally committed as revision 11104 to svn://svn.ffmpeg.org/ffmpeg/trunk

51ac8822 11/27/2007 10:54 PM Aurelien Jacobs

convert some #ifdef CONFIG_ to if(ENABLE_

Originally committed as revision 11103 to svn://svn.ffmpeg.org/ffmpeg/trunk

5b67ce2a 11/27/2007 10:42 PM Aurelien Jacobs

build vc1dsp_mmx.c in its own compilation unit

Originally committed as revision 11102 to svn://svn.ffmpeg.org/ffmpeg/trunk

43de5065 11/27/2007 10:36 PM Aurelien Jacobs

use ff_ prefix for extern vars

Originally committed as revision 11101 to svn://svn.ffmpeg.org/ffmpeg/trunk

182f56cb 11/27/2007 10:23 PM Aurelien Jacobs

make ff_p* vars extern so that they can be used in various *_mmx.c files

Originally committed as revision 11100 to svn://svn.ffmpeg.org/ffmpeg/trunk

82821c91 11/21/2007 10:41 PM Christophe Gisquet

add VC-1 MMX DSP functions, under MIT license.
patch by Christophe GISQUET christophe P gisquet A free P fr
original thread:
date: Jul 7, 2007 12:52 PM
subject: [FFmpeg-devel] [PATCH] VC-1 MMX DSP functions

Originally committed as revision 11074 to svn://svn.ffmpeg.org/ffmpeg/trunk

02d36191 11/12/2007 02:04 AM Michael Niedermayer

trying to work around gcc 2.95 bug which causes random failures

Originally committed as revision 11003 to svn://svn.ffmpeg.org/ffmpeg/trunk

6810b93a 09/29/2007 10:31 PM Loren Merritt

sse2 version of compute_autocorr().
4x faster than c (somehow, even though doubles only allow 2x simd).
overall flac encoding: 15-50% faster on core2, 4-11% on k8, 3-13% on p4.

Originally committed as revision 10621 to svn://svn.ffmpeg.org/ffmpeg/trunk

7bcc1d5b 08/26/2007 04:10 PM Ramiro Polla

CONFIG_7REGS has been renamed to HAVE_7REGS

Originally committed as revision 10237 to svn://svn.ffmpeg.org/ffmpeg/trunk

90e9e94d 08/26/2007 12:34 PM Michael Niedermayer

workaround gcc bug, untested as my gcc is not complaining

Originally committed as revision 10236 to svn://svn.ffmpeg.org/ffmpeg/trunk

62975029 08/26/2007 01:11 AM Michael Niedermayer

avoid overflow in the 3rd lifting step, this now needs mmx2 at minimum
(patch for plain mmx support is welcome ...)

Originally committed as revision 10226 to svn://svn.ffmpeg.org/ffmpeg/trunk

3e0f7126 08/25/2007 03:20 PM Michael Niedermayer

update mmx code to latest snow changes
note, the code likely can overflow and thus needs some more changes
sse2 updated too but disabled as it is untested

Originally committed as revision 10223 to svn://svn.ffmpeg.org/ffmpeg/trunk

d593e329 08/25/2007 03:00 AM Michael Niedermayer

use 16bit IDWT (a SIMD implementation of it should be >2x faster than with
the old 32bit code)
disable mmx/sse2 optimizations as they need a rewrite now

Originally committed as revision 10218 to svn://svn.ffmpeg.org/ffmpeg/trunk

73f51a4d 07/24/2007 08:54 AM Aurelien Jacobs

help some gcc version to optimize out those functions

Originally committed as revision 9785 to svn://svn.ffmpeg.org/ffmpeg/trunk

674eeb5f 07/10/2007 08:27 PM Aurelien Jacobs

cosmetics: indentation

Originally committed as revision 9582 to svn://svn.ffmpeg.org/ffmpeg/trunk

eb75a698 07/10/2007 08:23 PM Aurelien Jacobs

Avoid linking with h263.c functions when the relevant codecs
are not compiled in.

Originally committed as revision 9581 to svn://svn.ffmpeg.org/ffmpeg/trunk

a00177a9 07/08/2007 11:15 PM Måns Rullgård

make arguments to ssd_int8_vs_int16() const

Originally committed as revision 9548 to svn://svn.ffmpeg.org/ffmpeg/trunk

663deb54 05/20/2007 05:07 AM Zuxy Meng

Remove incorrect comment; MMX2 is preferred over 3DNow! on Athlon

Originally committed as revision 9079 to svn://svn.ffmpeg.org/ffmpeg/trunk

038bfcf9 05/18/2007 08:18 AM Zuxy Meng

3DNow! and SSSE3 optimization to QNS DSP functions; use pmulhrw/pmulhrsw instead of pmulhw

Originally committed as revision 9053 to svn://svn.ffmpeg.org/ffmpeg/trunk

5b0b7054 05/16/2007 11:23 PM Aurelien Jacobs

better separation of vp3dsp functions from dsputil_mmx.c

Originally committed as revision 9039 to svn://svn.ffmpeg.org/ffmpeg/trunk

b550bfaa 05/16/2007 09:51 AM Ronald S. Bultje

Add libavcodec to compiler include flags in order to simplify header
include paths in the source files.
mostly from a patch by Ronald S. Bultje, rbultje ronald.bitfreak net

Originally committed as revision 9034 to svn://svn.ffmpeg.org/ffmpeg/trunk

9b5dc867 05/14/2007 02:28 PM Panagiotis Issaris

Make vp3dsp*.c compilation optional.

Originally committed as revision 9025 to svn://svn.ffmpeg.org/ffmpeg/trunk

1edbfe19 05/12/2007 02:41 AM Loren Merritt

factor sum_abs_dctelem out of dct_sad, and simd it.
sum_abs_dctelem_* alone:
core2: c=186 mmx2=39 sse2=21 ssse3=13 (cycles)
k8: c=163 mmx2=33 sse2=31
p4: c=370 mmx2=60 sse2=60
dct_sad including sum_abs_dctelem_*:
core2: c=405 mmx2=258 sse2=240 ssse3=232...

561f940c 05/12/2007 01:16 AM Loren Merritt

sse2 & ssse3 versions of hadamard. unroll and inline diff_pixels.
core2: before mmx2=193 cycles. after mmx2=174 sse2=122 ssse3=115 (cycles).
k8: before mmx2=205. after mmx2=184 sse2=180.
p4: before mmx2=342. after mmx2=314 sse2=309.

Originally committed as revision 9000 to svn://svn.ffmpeg.org/ffmpeg/trunk

5adf43e4 05/09/2007 01:46 AM Loren Merritt

cosmetics: remove code duplication in hadamard8_diff_mmx

Originally committed as revision 8946 to svn://svn.ffmpeg.org/ffmpeg/trunk

bba5293b 05/08/2007 05:55 PM Loren Merritt

cosmetics: remove duplicate transpose macro

Originally committed as revision 8939 to svn://svn.ffmpeg.org/ffmpeg/trunk

fe037229 04/07/2007 02:10 PM Diego Biurrun

typos

Originally committed as revision 8642 to svn://svn.ffmpeg.org/ffmpeg/trunk

59006372 03/30/2007 07:15 PM Loren Merritt

mmx 16-bit ssd. 2.3x faster svq1 encoding.

Originally committed as revision 8559 to svn://svn.ffmpeg.org/ffmpeg/trunk

d42f8802 02/24/2007 11:58 AM Diego Biurrun

Fix wrong conditional, Snow decoding, not encoding, was SIMD-accelerated.

Originally committed as revision 8116 to svn://svn.ffmpeg.org/ffmpeg/trunk

9dd6c804 01/30/2007 10:31 AM Panagiotis Issaris

Add the const specifier as needed to reduce the number of warnings.

Originally committed as revision 7764 to svn://svn.ffmpeg.org/ffmpeg/trunk

486497e0 11/14/2006 03:18 AM Måns Rullgård

revert bad checkin

Originally committed as revision 7044 to svn://svn.ffmpeg.org/ffmpeg/trunk

be6ed6ff 11/14/2006 03:12 AM Måns Rullgård

move some CFLAGS settings away from config.* writing section

Originally committed as revision 7043 to svn://svn.ffmpeg.org/ffmpeg/trunk

bb54f6ab 11/12/2006 03:34 AM Måns Rullgård

adding more static keywords

Originally committed as revision 6976 to svn://svn.ffmpeg.org/ffmpeg/trunk

e9f1885c 11/03/2006 02:03 AM Michael Niedermayer

optimize H264_DEBLOCK_P0_Q0
2.5% faster filter_mb_fast() on P3

Originally committed as revision 6877 to svn://svn.ffmpeg.org/ffmpeg/trunk

7c428ea6 10/14/2006 05:04 PM Diego Biurrun

Put libmpeg2 IDCT functions under CONFIG_GPL, fixes link failure
with --disable-opts.

Originally committed as revision 6691 to svn://svn.ffmpeg.org/ffmpeg/trunk

c26abfa5 10/11/2006 11:17 PM Diego Biurrun

Rename ABS macro to FFABS.

Originally committed as revision 6666 to svn://svn.ffmpeg.org/ffmpeg/trunk

b78e7197 10/07/2006 03:30 PM Diego Biurrun

Change license headers to say 'FFmpeg' instead of 'this program/this library'
and fix GPL/LGPL version mismatches.

Originally committed as revision 6577 to svn://svn.ffmpeg.org/ffmpeg/trunk

0eb59ddb 10/05/2006 12:23 AM Diego Biurrun

Switch idct_mmx_xvid.c from GPL to LGPL as permitted by the
author, Peter Ross (pross xvid org).

Originally committed as revision 6557 to svn://svn.ffmpeg.org/ffmpeg/trunk

2833fc46 10/01/2006 09:25 PM Loren Merritt

approximate qpel functions: sacrifice some quality for some decoding speed. enabled on B-frames with -lavdopts fast.

Originally committed as revision 6412 to svn://svn.ffmpeg.org/ffmpeg/trunk

62bb489b 09/27/2006 07:54 PM Måns Rullgård

add some #ifdef CONFIG_ENCODERS/DECODERS

Originally committed as revision 6356 to svn://svn.ffmpeg.org/ffmpeg/trunk

2a2311be 09/14/2006 10:13 PM Aurelien Jacobs

disable vp3 mmx idct for theora files to avoid artifacts
(see theora-a4_v6-k250-s0_2.ogg)

Originally committed as revision 6253 to svn://svn.ffmpeg.org/ffmpeg/trunk

7f889a76 09/14/2006 12:38 AM Diego Biurrun

Remove the LGPL exception clause as discussed on ffmpeg-devel
and move the dependent code under CONFIG_GPL.

Originally committed as revision 6248 to svn://svn.ffmpeg.org/ffmpeg/trunk