more credits to D. J. Bernstein for fft
Originally committed as revision 24308 to svn://svn.ffmpeg.org/ffmpeg/trunk
Attempt to fix x86-64 testsuite on fate.
Originally committed as revision 24275 to svn://svn.ffmpeg.org/ffmpeg/trunk
Remove duplicate define.
Originally committed as revision 24272 to svn://svn.ffmpeg.org/ffmpeg/trunk
Revert 24270, it contained some stuff that shouldn't have been in there.
Originally committed as revision 24271 to svn://svn.ffmpeg.org/ffmpeg/trunk
Originally committed as revision 24270 to svn://svn.ffmpeg.org/ffmpeg/trunk
Give x86 r%d registers names, this will simplify implementation of the chroma inner loopfilter, and it also allows us to save one register on x86-64/sse2.
Originally committed as revision 24269 to svn://svn.ffmpeg.org/ffmpeg/trunk
Change return statement; the REP_RET is a mistake since the else case (x86-64, sse2) doesn't actually loop, so REP_RET isn't necessary.
Originally committed as revision 24268 to svn://svn.ffmpeg.org/ffmpeg/trunk
VP8 H/V inner loopfilter MMX/MMXEXT/SSE2 optimizations.
Originally committed as revision 24250 to svn://svn.ffmpeg.org/ffmpeg/trunk
MMX/SSE VC1 loop filter
Originally committed as revision 24208 to svn://svn.ffmpeg.org/ffmpeg/trunk
Make ff_pw_4 128 bits
Originally committed as revision 24207 to svn://svn.ffmpeg.org/ffmpeg/trunk
Move SSE optimized 32-point DCT to its own file. Should fix breakage with YASM disabled.
Originally committed as revision 24078 to svn://svn.ffmpeg.org/ffmpeg/trunk
SSE optimized 32-point DCT
Originally committed as revision 24077 to svn://svn.ffmpeg.org/ffmpeg/trunk
Simple H/V loopfilter for VP8 in MMX, MMX2 and SSE2 (yay for yasm macros).
Originally committed as revision 24029 to svn://svn.ffmpeg.org/ffmpeg/trunk
SSSE3 versions of vp8 width4 bilinear MC functions
Originally committed as revision 24013 to svn://svn.ffmpeg.org/ffmpeg/trunk
SSSE3 versions of width4 VP8 6-tap MC functions
Also make some small changes to saturation order of 4-tap SSSE3 MC to fix a non-bitexactness bug.
Patch mostly by Eli Friedman <eli.friedman AT gmail DOT com>.
Originally committed as revision 23965 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix 100L in vp8dsp asm init
Originally committed as revision 23946 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix h264/vp8 intra pred on Athlon XP
Whose idea was it to have a CPU that didn't SIGILL on an invalid instruction?
Originally committed as revision 23927 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix grammar errors in documentation
Originally committed as revision 23904 to svn://svn.ffmpeg.org/ffmpeg/trunk
Use add instead of lshift in mmxext vp8 idct
Originally committed as revision 23891 to svn://svn.ffmpeg.org/ffmpeg/trunk
Remove unused macros (duplicates from the now-LGPL x86util.asm).
Originally committed as revision 23890 to svn://svn.ffmpeg.org/ffmpeg/trunk
MMX idct_add for VP8.
Originally committed as revision 23886 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add missing mm_support call to ff_h264_pred_init_x86.
I'm not sure if this is supposed to be here, but it can't hurt.
Originally committed as revision 23885 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add mmxext version of VP8 DC Hadamard transform
Originally committed as revision 23878 to svn://svn.ffmpeg.org/ffmpeg/trunk
Make x86util.asm LGPL so we can use it in LGPL asm
Strip out most x264-specific stuff (not used anywhere in ffmpeg).
Originally committed as revision 23877 to svn://svn.ffmpeg.org/ffmpeg/trunk
MMXEXT version of vp8 4x4 vertical pred
Originally committed as revision 23876 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add mmx/mmxext/ssse3 4x4 TM intra pred functions for vp8
Originally committed as revision 23875 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add missing comment header for predict_4x4_dc_mmxext
Originally committed as revision 23874 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix some intra pred MMX functions that used MMXEXT instructions
Also add predict_4x4_dc MMXEXT function for vp8/h264.
Originally committed as revision 23873 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix VP8 bilinear mc on x86_64
Originally committed as revision 23872 to svn://svn.ffmpeg.org/ffmpeg/trunk
Change MMXEXT to MMX2, MMXEXT is deprecated
Originally committed as revision 23865 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add x86 asm functions for VP8 put_pixels
Originally committed as revision 23858 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add MMX, SSE2, SSSE3 asm for VP8 bilinear MC
Originally committed as revision 23857 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix x86 build with h264dsp disabled
Originally committed as revision 23844 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add const to some pointer parameters.
Patch by Eli Friedman, eli D friedman A gmail
Originally committed as revision 23826 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix build without yasm
Originally committed as revision 23816 to svn://svn.ffmpeg.org/ffmpeg/trunk
First shot at VP8 optimizations:
- MMXEXT, SSE2 and SSSE3 MC functions
- MMX and SSE4 IDCT dc_add functions
Patch by Jason Garrett-Glaser <darkshikari gmail com> and myself.
Originally committed as revision 23815 to svn://svn.ffmpeg.org/ffmpeg/trunk
Make vp8 select h264dsp and use this to pull in mmx intrapred
Originally committed as revision 23790 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix compilation without --enable-gpl.
Originally committed as revision 23789 to svn://svn.ffmpeg.org/ffmpeg/trunk
Cosmetics: Fix indentation.
Originally committed as revision 23785 to svn://svn.ffmpeg.org/ffmpeg/trunk
16x16 and 8x8c x86 SIMD intra pred functions for VP8 and H.264
Originally committed as revision 23783 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix compilation on x64.
Originally committed as revision 23753 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix asm constraints in apply_window()
Originally committed as revision 23752 to svn://svn.ffmpeg.org/ffmpeg/trunk
SSE-optimized MP3 floating point windowing functions
Originally committed as revision 23750 to svn://svn.ffmpeg.org/ffmpeg/trunk
Update x264asm header files to latest versions.
Modify the asm accordingly.
GLOBAL is now no longer necessary for PIC-compliant loads.
Originally committed as revision 23739 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add bitexact versions of put_no_rnd_pixels8 _x2 and _y2 for vp3/theora
Originally committed as revision 23463 to svn://svn.ffmpeg.org/ffmpeg/trunk
vp3: The DC-only IDCT is surprisingly not supposed to be bitexact to the full IDCT. Fix this.
Originally committed as revision 23358 to svn://svn.ffmpeg.org/ffmpeg/trunk
Adding missing () to mathops.h.
Originally committed as revision 23083 to svn://svn.ffmpeg.org/ffmpeg/trunk
Replace more "m" constraints with MANGLE to fix compilation issues with x86_32 gcc 4.4.4 and -fPIC.
Originally committed as revision 23082 to svn://svn.ffmpeg.org/ffmpeg/trunk
Remove explicit filename from Doxygen @file commands.
Passing an explicit filename to this command is only necessary if the documentation in the @file block refers to a file different from the one the block resides in.
Originally committed as revision 22921 to svn://svn.ffmpeg.org/ffmpeg/trunk
vp3: DC-only IDCT
2-4% faster overall decode
Originally committed as revision 22896 to svn://svn.ffmpeg.org/ffmpeg/trunk
Convert two "m" constraints to MANGLE to fix compilation with some compilers.
Originally committed as revision 22760 to svn://svn.ffmpeg.org/ffmpeg/trunk
Replace remaining uses of ATTR_ALIGNED with DECLARE_ALIGNED
Originally committed as revision 22593 to svn://svn.ffmpeg.org/ffmpeg/trunk
Simplify arch-specific object file lists
Originally committed as revision 22570 to svn://svn.ffmpeg.org/ffmpeg/trunk
Move arch-specific makefile parts into $arch/Makefile
Originally committed as revision 22569 to svn://svn.ffmpeg.org/ffmpeg/trunk
Move H264 dsputil functions into their own struct
This moves the H264-specific functions from DSPContext to the new H264DSPContext. The code is made conditional on CONFIG_H264DSP which is set by the codecs requiring it.
The qpel and chroma MC functions are not moved as these are used by...
Separate DWT from snow and dsputil
This moves the DWT functions from snow.c and dsputil.c to a file of their own. A new struct, DWTContext, holds the function pointers previously part of DSPContext.
Originally committed as revision 22522 to svn://svn.ffmpeg.org/ffmpeg/trunk
x86: move function prototypes to header files
Originally committed as revision 22266 to svn://svn.ffmpeg.org/ffmpeg/trunk
Add some missing #includes
Originally committed as revision 22258 to svn://svn.ffmpeg.org/ffmpeg/trunk
Move FFT parts from dsputil.h to fft.h
Originally committed as revision 22235 to svn://svn.ffmpeg.org/ffmpeg/trunk
Remove DECLARE_ALIGNED_{8,16} macros
These macros are redundant. All uses are replaced with the generic DECLARE_ALIGNED macro instead.
Originally committed as revision 22233 to svn://svn.ffmpeg.org/ffmpeg/trunk
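For readers unfamiliar with the macro, the DECLARE_ALIGNED change above can be sketched roughly as follows. This is a minimal sketch assuming a GCC-compatible compiler: the macro body is an assumed, illustrative definition rather than a copy of the one in the tree, and the buffer names and the is_aligned() helper are hypothetical. The alignment is passed as the first argument, and the array specifier sits outside the macro invocation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed GCC/Clang-style definition, for illustration only;
 * the real macro may be defined differently per compiler. */
#define DECLARE_ALIGNED(n, t, v) t __attribute__((aligned(n))) v

/* Hypothetical buffers: alignment is an argument of the generic macro,
 * and the array specifier is written after the invocation. */
static DECLARE_ALIGNED(16, float, window_buf)[64];
static DECLARE_ALIGNED(8, int16_t, coeffs)[16];

/* Returns nonzero if p is aligned to n bytes (n must be a power of two). */
static int is_aligned(const void *p, uintptr_t n)
{
    assert(n && !(n & (n - 1)));
    return ((uintptr_t)p & (n - 1)) == 0;
}
```

With a single generic macro, a new alignment (for example 32 bytes) needs no new macro, only a different first argument.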
Move NEG_[US]SR32 macros to mathops.h
Originally committed as revision 21873 to svn://svn.ffmpeg.org/ffmpeg/trunk
Enable SSE2 (put|avg)_pixels_16_sse2
SVQ1 chroma has been special-cased aligned to 16-bytes since at least r15466
Other architectures also assume 16-byte alignment here but set STRIDE_ALIGN to 16.
Originally committed as revision 21736 to svn://svn.ffmpeg.org/ffmpeg/trunk
Make the jump-table section-relative for x86_64 with PIC enabled.
This makes it possible to get rid of the macho64 specific hack that moves them to rodata (with worse cache behaviour) and avoids textrels which e.g. Gentoo does not allow for x86_64 libraries.
Originally committed as revision 21551 to svn://svn.ffmpeg.org/ffmpeg/trunk
optimize h264_loop_filter_strength_mmx2
244->160 cycles on core2
Originally committed as revision 21462 to svn://svn.ffmpeg.org/ffmpeg/trunk
Implement an sse version of scalarproduct_float().
Originally committed as revision 21386 to svn://svn.ffmpeg.org/ffmpeg/trunk
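For context, the operation that scalarproduct_float() computes can be sketched as a plain C loop. This is a hedged scalar reference only, not the SSE code from this commit; the name scalarproduct_float_ref and its signature are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical scalar reference: returns the dot product of v1 and v2
 * over len elements. A SIMD version typically keeps several partial sums
 * and adds them at the end, so its result may differ from this loop in
 * the last bits because float addition is not associative. */
static float scalarproduct_float_ref(const float *v1, const float *v2, int len)
{
    float sum = 0.0f;
    int i;

    assert(len >= 0);
    for (i = 0; i < len; i++)
        sum += v1[i] * v2[i];
    return sum;
}
```

A reference like this is mainly useful for checking an optimized version against a tolerance rather than for exact equality.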
Move array specifiers outside DECLARE_ALIGNED() invocations
Originally committed as revision 21377 to svn://svn.ffmpeg.org/ffmpeg/trunk
Use two separate memory arguments since 8+() is invalid gas syntax
Originally committed as revision 21360 to svn://svn.ffmpeg.org/ffmpeg/trunk
Attempt to fix asm compilation failure.
Only tested on gcc 4 & x86_64.
Originally committed as revision 21355 to svn://svn.ffmpeg.org/ffmpeg/trunk
Move COPY3_IF_LT to lavc/mathops.h
This obscure macro is only used in motion_est.c so having it in lavc makes more sense. See discussion here:
http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2008-November/056561.html
Originally committed as revision 21346 to svn://svn.ffmpeg.org/ffmpeg/trunk
Use constant offsets for memory operands since gcc is unable to
This fixes gcc failing to fit 6 memory locations into 7 registers on x86-32
Originally committed as revision 21337 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix h264_loop_filter_strength_mmx2() so it works with b frames.
Originally committed as revision 21327 to svn://svn.ffmpeg.org/ffmpeg/trunk
Remove -2 -> -1 remapping, it's not needed anymore as we must remap all references per LUT anyway.
Originally committed as revision 21323 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix XvMC. XvMCCreateBlocks() may not allocate 16-byte aligned blocks, so we can't use SSE-optimized routines.
Originally committed as revision 21011 to svn://svn.ffmpeg.org/ffmpeg/trunk
Reduce number of ASM constraints for ff_lpc_compute_autocorr_sse2 since it causes no significant speed difference and can avoid compilation issues with --enable-pic.
Originally committed as revision 21003 to svn://svn.ffmpeg.org/ffmpeg/trunk
Get rid of pointless CONFIG_ANY_H263 preprocessor definition.
Originally committed as revision 20975 to svn://svn.ffmpeg.org/ffmpeg/trunk
fix a crash in ape decoding on x86_32 sse2
Originally committed as revision 20777 to svn://svn.ffmpeg.org/ffmpeg/trunk
slightly faster scalarproduct_and_madd_int16_ssse3 on penryn, no change on conroe
Originally committed as revision 20743 to svn://svn.ffmpeg.org/ffmpeg/trunk
r20739 broke compilation on systems without yasm
Originally committed as revision 20742 to svn://svn.ffmpeg.org/ffmpeg/trunk
refactor and optimize scalarproduct
29-105% faster apply_filter, 6-90% faster ape decoding on core2
(Any x86 other than core2 probably gets much less, since this is mostly due to ssse3 cachesplit avoidance and I haven't written the full gamut of other cachesplit modes.)...
port ape dsp functions from sse2 to mmx
now requires yasm
Originally committed as revision 20722 to svn://svn.ffmpeg.org/ffmpeg/trunk
s/movdqa/movaps/ in sse1 fft. (regression in r20293)
Originally committed as revision 20371 to svn://svn.ffmpeg.org/ffmpeg/trunk
fix linking on systems with a function name prefix (10l in r20287)
Originally committed as revision 20294 to svn://svn.ffmpeg.org/ffmpeg/trunk
sync yasm macros to x264
Originally committed as revision 20293 to svn://svn.ffmpeg.org/ffmpeg/trunk
huffyuv: add some const qualifiers
Originally committed as revision 20290 to svn://svn.ffmpeg.org/ffmpeg/trunk
simd add_hfyu_left_prediction
2.2x faster than C on conroe, 3.6x on penryn.
4-6% faster huffyuv decoding if using left or plane mode and yuv
Originally committed as revision 20287 to svn://svn.ffmpeg.org/ffmpeg/trunk
add CONFIG_LPC to the build system for lpc dsputil functions. fixes build problems when lpc.c is not compiled.
Originally committed as revision 20285 to svn://svn.ffmpeg.org/ffmpeg/trunk
Move autocorrelation function from flacenc.c to lpc.c. Also rename the corresponding dsputil functions and remove their dependency on the FLAC encoder.
Fixes Issue1486.
Originally committed as revision 20266 to svn://svn.ffmpeg.org/ffmpeg/trunk
Use MANGLE in cavsdsp, the current version using "m" constraints will not compile on e.g. OpenBSD due to running out of registers.
Originally committed as revision 20123 to svn://svn.ffmpeg.org/ffmpeg/trunk
Replace several #ifdef PIC with the more obvious and correct #if !HAVE_EBX_AVAILABLE, since all it does is avoid using ebx.
Originally committed as revision 20094 to svn://svn.ffmpeg.org/ffmpeg/trunk
cosmetics: fix indentation after previous commit
Originally committed as revision 20062 to svn://svn.ffmpeg.org/ffmpeg/trunk
Drop unused args from vector_fmul_add_add, simplify code, and rename
The src3 and step arguments to vector_fmul_add_add() are always zero and one, respectively. This removes these arguments from the function, simplifies the code accordingly, and renames the function to better...
Merge FFTContext and MDCTContext
Originally committed as revision 19931 to svn://svn.ffmpeg.org/ffmpeg/trunk
Move per-arch fft init bits into the corresponding subdirs
Originally committed as revision 19864 to svn://svn.ffmpeg.org/ffmpeg/trunk
Move declarations of some mmx functions to dsputil_mmx.h
Originally committed as revision 19739 to svn://svn.ffmpeg.org/ffmpeg/trunk
Mark "i" parameter of vector_clipf_sse() as early-clobber
Originally committed as revision 19731 to svn://svn.ffmpeg.org/ffmpeg/trunk
Mark parameter src of vector_clipf() as const
Originally committed as revision 19729 to svn://svn.ffmpeg.org/ffmpeg/trunk
SSE optimized vector_clipf(). 10% faster TwinVQ decoding.
Originally committed as revision 19728 to svn://svn.ffmpeg.org/ffmpeg/trunk
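As a point of reference for the vector_clipf() commits above, the operation being optimized can be sketched in plain C. This is a minimal scalar sketch, not the SSE routine itself; the name vector_clipf_ref and the exact argument order are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical scalar reference: clamps each of the len values in src
 * to the range [min, max] and stores the result in dst. src is const,
 * matching the "Mark parameter src of vector_clipf() as const" change. */
static void vector_clipf_ref(float *dst, const float *src,
                             float min, float max, int len)
{
    int i;

    assert(min <= max);
    for (i = 0; i < len; i++) {
        float v = src[i];
        if (v < min)
            v = min;
        else if (v > max)
            v = max;
        dst[i] = v;
    }
}
```

An SSE version can do the same work four floats at a time with packed min/max instructions, which is presumably where the reported decoding speedup comes from.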
Update x264 asm code to latest to add support for 64-bit Windows.
Use the new x86inc features to support 64-bit Windows on all non-x264 nasm assembly code as well.
Patch by John Adcock, dscaler.johnad AT googlemail DOT com.
Win64 changes originally by Anton Mitrofanov....
Do not check for both CONFIG_VC1_DECODER and CONFIG_WMV3_DECODER, the former depends upon the latter.
Originally committed as revision 19533 to svn://svn.ffmpeg.org/ffmpeg/trunk
Do not redundantly check for both CONFIG_THEORA_DECODER and CONFIG_VP3_DECODER.
The Theora decoder depends on the VP3 decoder.
Originally committed as revision 19492 to svn://svn.ffmpeg.org/ffmpeg/trunk