Fix FSF address copy paste error in some license headers.
10-bit H.264 x86 chroma v loopfilter asm
Also delete some unused deblock asm macros.
Port x86 10-bit H.264 deblock asm from x264
Update x86 H.264 deblock asm
Includes AVX versions from x264.
h264dsp_mmx: place bracket outside #if/#endif block.
Should fix compile on systems missing yasm/nasm.
Adds 8-, 9- and 10-bit versions of some of the functions used by the h264 decoder.
This patch lets e.g. dsputil_init choose dsp functions with respect to the bit depth to decode. The naming scheme of bit depth dependent functions is <base name>_<bit depth>[_
] (i.e. the old...
Remove disabled non-optimized code variants.
Add AVX FFT implementation.
Signed-off-by: Reinhard Tartler <siretart@tauware.de>
Update x86inc.asm from x264 to allow AVX emulation using SSE and MMX.
dsputil: allow to skip drawing of top/bottom edges.
Add apply_window_int16() to DSPContext with x86-optimized versions and use it in the ac3_fixed encoder.
Move dct and rdft definitions to separate files
This leaves fft.h with only the core FFT and MDCT definitions, thus making it more manageable.
Signed-off-by: Mans Rullgard <mans@mansr.com>
Replace FFmpeg with Libav in licence headers
ac3enc: add float_to_fixed24() with x86-optimized versions to AC3DSPContext and use in scale_coefficients() for the floating-point AC-3 encoder.
mathops: fix MULL when the compiler does not inline the function.
If the function is not inlined, an immediate cannot be used for the shift parameter, so the %cl register must be used instead in that case.
This fixes compilation for x86-32 using gcc with --disable-optimizations.
mathops: change "g" constraint to "rm" in x86-32 version of MUL64.
The 1-arg imul instruction cannot take an immediate argument, only a register or memory argument.
mathops: convert MULL/MULH/MUL64 to inline functions rather than macros.
This fixes unexpected name collisions that were occurring with variables declared within the macros. It also fixes the fate-acodec-ac3_fixed regression test on x86-32.
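The macro-to-function conversion described above can be illustrated with a portable sketch. The bodies below are an assumption: they follow the usual generic C meaning of these helpers (64-bit product, then shift), not the x86 inline asm the commit actually touches. Because each helper is a function, any temporaries it needs live in its own scope and cannot collide with names at the call site.

```c
#include <stdint.h>

/* Portable sketch only: the x86 versions in the tree use inline asm.
 * The semantics below (64-bit product, then shift) are assumed. */

/* Multiply two 32-bit values and shift the 64-bit product right. */
static inline int MULL(int a, int b, unsigned shift)
{
    return (int)(((int64_t)a * b) >> shift);
}

/* Return the high 32 bits of the 64-bit product. */
static inline int MULH(int a, int b)
{
    return (int)(((int64_t)a * b) >> 32);
}

/* Return the full 64-bit product. */
static inline int64_t MUL64(int a, int b)
{
    return (int64_t)a * (int64_t)b;
}
```

For example, MULL(1 << 16, 1 << 16, 16) evaluates to 65536 and MULH(1 << 16, 1 << 16) to 1.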
ac3enc: add SIMD-optimized shifting functions for use with the fixed-point AC3 encoder.
Add CONFIG_AC3DSP symbol to simplify makefiles
dsputil_mmx.c: remove ff_vector128.
Remove ff_vector128, it is identical to ff_pb_80.
dsputil: move VC1-specific stuff into VC1DSPContext.
ac3dsp: Change punpckhqdq to movhlps in ac3_max_msb_abs_int16().
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
ac3enc: Add x86-optimized function to speed up log2_tab().
AC3DSPContext.ac3_max_msb_abs_int16() finds the maximum MSB of the absolute value of each element in an array of int16_t.
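A minimal scalar sketch of that operation, assuming the caller only needs the position of the highest set bit: OR-ing the absolute values together yields a word whose top set bit is the largest MSB found in any element. The function name max_msb_abs_int16_sketch is hypothetical and is not the name used in the tree.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference version: OR together |src[i]| for all i.
 * The highest set bit of the result is the maximum MSB over the array. */
static int max_msb_abs_int16_sketch(const int16_t *src, int len)
{
    int v = 0;
    for (int i = 0; i < len; i++)
        v |= abs(src[i]);   /* int16_t promotes to int, so -32768 is safe */
    return v;
}
```

A SIMD version can apply the same idea lane by lane and combine the lanes with OR at the end, which is why the result is a bit mask rather than an exact maximum.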
FFT: factor a shuffle out of the inner loop and merge it into fft_permute.
6% faster SSE FFT on Conroe, 2.5% on Penryn.
Signed-off-by: Janne Grunau <janne-ffmpeg@jannau.net>
Add x86-optimized versions of exponent_min().
Fix ff_emu_edge_core_sse() on Win64.
Fix emu_edge_v_extend_15 to be <128 bytes on Win64, by being more strict on the size of registers and which registers are being used for operations where multiple are available. This fixes segfaults in emulated_edge()...
Separate format conversion DSP functions from DSPContext.
This will be beneficial for use with the audio conversion API without requiring it to depend on all of dsputil.
Fix ff_imdct_calc_sse() on gcc-4.6
Gcc 4.6 only preserves the first value when using an array with an "m" constraint.
Implement a SIMD version of emulated_edge_mc() for x86.
From ~550 cycles (C version) to 170 (SSE/x86-64), 206 (MMX/x86-32) and 196 (SSE2/x86-32) cycles.
cosmetics: indentation
Remove unneeded add bias from 3 functions.
DSPContext.vector_fmul_window()
DCADSPContext.lfe_fir()
SynthFilterContext.synth_filter_float()
x86: fix overflow in h264 8x8 planar prediction
Change DSPContext.vector_fmul() from dst=dst*src to dest=src0*src1.
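The signature change above can be sketched in plain C as follows. The function name vector_fmul_sketch and the parameter order are assumptions for illustration; only the dst = src0 * src1 semantics come from the commit message.

```c
/* Hypothetical scalar version of the new semantics:
 * dst[i] = src0[i] * src1[i], instead of dst[i] *= src[i]. */
static void vector_fmul_sketch(float *dst, const float *src0,
                               const float *src1, int len)
{
    for (int i = 0; i < len; i++)
        dst[i] = src0[i] * src1[i];
}
```

Passing the same pointer as dst and src0 reproduces the old in-place behaviour, so callers that relied on it need no extra buffer.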
cosmetics related to LPC changes.
Separate window function from autocorrelation.
Move lpc_compute_autocorr() from DSPContext to a new struct LPCContext.
Fix horizontal/horizontal_up 8x8l intra prediction x86/simd functions.
The original functions did not work correctly for edge pixels, e.g. when CODEC_FLAG_EMU_EDGE is set, leading to corrupt output in e.g. VLC. Based on a patch by Daniel Kang <daniel d kang gmail com>....
Replace ASMALIGN with .p2align
This macro has unconditionally used .p2align for a long time and serves no useful purpose.
x86: remove VLA in ac3_downmix_sse
consolidate .gitignore patterns into a single file
convert svn:ignore properties to .gitignore files
Fix overflow in pred16x16_plane x86 simd code. Fixes issue 2547.
Originally committed as revision 26381 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix ff_pw_3 alignment.
Originally committed as revision 26344 to svn://svn.ffmpeg.org/ffmpeg/trunk
H.264: split luma dc idct out and implement MMX/SSE2 versions
About 2.5x the speed.
NOTE: the way that the asm code handles large qmuls is a bit suboptimal. If x264-style dequant was used (separate shift and qmul values), it might be possible to get some extra speed....
Fix compilation on x86-32 with --disable-optimizations, fixes issue 2127.
Patch by Daniel Kang, daniel.d.kang at gmail
Originally committed as revision 26204 to svn://svn.ffmpeg.org/ffmpeg/trunk
Fix invalid reads in valgrind fate, patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26177 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_down_left_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26162 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred4x4_down_right_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26159 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred4x4_vertical_right_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26158 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred4x4_horizontal_down_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26157 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred4x4_horizontal_up_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26156 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred4x4_vertical_left_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26155 to svn://svn.ffmpeg.org/ffmpeg/trunk
Merge a few superfluous CONFIG_GPL checks.
Originally committed as revision 26154 to svn://svn.ffmpeg.org/ffmpeg/trunk
Whitespace cosmetics.
Originally committed as revision 26152 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_horizontal_down_sse2/ssse3 (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26151 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_horizontal_down_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26150 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_horizontal_up_mmxext/ssse3 (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26149 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_vertical_left_sse2/ssse3 (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26148 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_vertical_right_sse2/ssse3 (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26147 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_vertical_right_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26146 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_down_right_sse2/ssse3 (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26145 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_down_right_mmxext (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26143 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_down_left_sse2/ssse3 (H.264 intra prediction) from x264 (authors: Jason, Loren, Holger) to FFmpeg. Patch by Daniel Kang <daniel dot d dot kang at gmail com>, as part of Google's GCI 2010.
Originally committed as revision 26142 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8l_vertical_mmxext/ssse3 (H.264 intra prediction) from x264 to FFmpeg. Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser <darkshikari gmail com> (approves LGPL relicensing for this code) and Loren Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing...
Port pred8x8l_horizontal_mmxext/ssse3 (H.264 intra prediction) from x264 to FFmpeg. Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser <darkshikari gmail com> (approves LGPL relicensing for this code) and Loren Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing...
Port pred8x8l_dc_mmx/ssse3 (H.264 intra prediction) from x264 to FFmpeg. Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser <darkshikari gmail com> (approves LGPL relicensing for this code) and Loren Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing for...
Port pred8x8l_top_dc_mmxext/ssse3 (H.264 intra prediction) from x264 to FFmpeg. Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser <darkshikari gmail com> (approves LGPL relicensing for this code) and Loren Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing for...
Move PRED4x4_LOWPASS up so it can be used in 8x8l predict functions while keeping the functions ordered in the source file (i.e. cosmetics).
Originally committed as revision 26136 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8_dc_mmxext (H.264 intra prediction) from x264 to FFmpeg. Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser <darkshikari gmail com> (approves LGPL relicensing for this code) and Loren Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing for this code). Patch...
Add missing authors to copyright headers.
Originally committed as revision 26133 to svn://svn.ffmpeg.org/ffmpeg/trunk
Port pred8x8_top_dc_mmxext (H.264 intra prediction) from x264 to FFmpeg. Original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser <darkshikari gmail com> (approves LGPL relicensing for this code) and Loren Merritt <lorenm at u dot washington dot edu> (approves LGPL relicensing for...
Mark recently added pred4x4_down_left_mmxext as CONFIG_GPL. Although Holger initially said he'd be OK with relicensing, he also said he wanted to have another look at the patch, and then he went on vacation, so let's play it safe for now. We can consider removing this again later....
Port pred4x4_down_left_mmxext (H.264 intra prediction) from x264 to FFmpeg. LGPL relicensing approved by original authors: Holger Lubitz <holger lubitz org>, Jason Garrett-Glaser <darkshikari gmail com> and Loren Merritt <lorenm at u dot washington dot edu>. Patch by Daniel Kang <daniel dot d dot kang at...
For rounding in chroma MC SSSE3, use 16-byte pw_3/4 instead of reading 8 bytes and then using movlhps to dup it into the higher half of the register.
Originally committed as revision 26086 to svn://svn.ffmpeg.org/ffmpeg/trunk
In yadif filter, declare asm constants directly to avoid dependency on libavcodec
Originally committed as revision 25895 to svn://svn.ffmpeg.org/ffmpeg/trunk
10l, add ff_pw_1 to dsputil_mmx for yadif sse2
Originally committed as revision 25881 to svn://svn.ffmpeg.org/ffmpeg/trunk
Use SECTION .text for yasm code.
Patch by avcoder, ffmpeg gmail
Originally committed as revision 25859 to svn://svn.ffmpeg.org/ffmpeg/trunk
dnxhd_mmx: prefer xmm registers below xmm6 when they are available
Originally committed as revision 25634 to svn://svn.ffmpeg.org/ffmpeg/trunk
dsputil: Use explicit movzbl instead of movzx
This fixes compilation with the latest clang trunk version.
Patch by İsmail Dönmez, ismail at namtrac dot org
Originally committed as revision 25628 to svn://svn.ffmpeg.org/ffmpeg/trunk
lpc_mmx: add xmm registers to clobber list
Originally committed as revision 25620 to svn://svn.ffmpeg.org/ffmpeg/trunk
lpc_mmx: merge some asm blocks
These blocks depended on the compiler keeping xmm registers untouched between them.
Originally committed as revision 25619 to svn://svn.ffmpeg.org/ffmpeg/trunk
sad16_sse2: merge 2 asm blocks
Originally committed as revision 25617 to svn://svn.ffmpeg.org/ffmpeg/trunk
xmm_clobbers: list xmm registers first in clobber list
suncc does not like the leading commas inside the macro, but it has no problem with trailing commas.
Originally committed as revision 25615 to svn://svn.ffmpeg.org/ffmpeg/trunk
idct_sse2_xvid: only mark xmm>=8 as clobbered on x86_64
Originally committed as revision 25614 to svn://svn.ffmpeg.org/ffmpeg/trunk
motion_est_mmx: prefer xmm registers below xmm6 when they are available
Originally committed as revision 25612 to svn://svn.ffmpeg.org/ffmpeg/trunk
dsputil_mmx: add xmm registers to clobber list
Originally committed as revision 25611 to svn://svn.ffmpeg.org/ffmpeg/trunk
cosmetics: split long line
Originally committed as revision 25610 to svn://svn.ffmpeg.org/ffmpeg/trunk
fdct_mmx: add xmm registers to clobber list
Originally committed as revision 25609 to svn://svn.ffmpeg.org/ffmpeg/trunk
idct_sse2_xvid: add xmm registers to clobber list
Originally committed as revision 25608 to svn://svn.ffmpeg.org/ffmpeg/trunk
mpegvideo_mmx: add xmm registers to clobber list
Originally committed as revision 25607 to svn://svn.ffmpeg.org/ffmpeg/trunk
dsputil_mmx: prefer xmm registers below xmm6 when they are available
Originally committed as revision 25606 to svn://svn.ffmpeg.org/ffmpeg/trunk
h264dsp: add xmm registers to clobber list
Originally committed as revision 25604 to svn://svn.ffmpeg.org/ffmpeg/trunk
indent
Originally committed as revision 25598 to svn://svn.ffmpeg.org/ffmpeg/trunk
h264dsp: merge some more asm blocks
Originally committed as revision 25597 to svn://svn.ffmpeg.org/ffmpeg/trunk
dct32: mark xmm registers in clobber list in ff_dct32_float_sse()
Originally committed as revision 25569 to svn://svn.ffmpeg.org/ffmpeg/trunk
h264dsp: merge some asm blocks
Some code was initializing some xmm registers in one asm block and using them in the following block, assuming they wouldn't be changed in between blocks.
Originally committed as revision 25568 to svn://svn.ffmpeg.org/ffmpeg/trunk
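The hazard fixed by the h264dsp commit above, and the related clobber-list commits, can be illustrated with a small GCC-style inline asm sketch. It is not code from the tree: the function add_via_xmm is hypothetical, and the non-x86-64 fallback is there only so the example builds elsewhere. The load, the arithmetic and the store all sit in one asm statement, and the xmm registers it uses are named in the clobber list, so the compiler is free to use them before and after the statement but never in the middle of it.

```c
#include <stdint.h>

/* Hypothetical example: everything that touches xmm0/xmm1 lives in a
 * single asm statement, and both registers are listed as clobbered. */
static int32_t add_via_xmm(int32_t a, int32_t b)
{
#if defined(__GNUC__) && defined(__x86_64__)
    int32_t out;
    __asm__ volatile (
        "movd  %1, %%xmm0      \n\t"  /* load a into xmm0        */
        "movd  %2, %%xmm1      \n\t"  /* load b into xmm1        */
        "paddd %%xmm1, %%xmm0  \n\t"  /* xmm0 += xmm1 (32-bit)   */
        "movd  %%xmm0, %0      \n\t"  /* store the low 32 bits   */
        : "=r"(out)
        : "r"(a), "r"(b)
        : "xmm0", "xmm1"
    );
    return out;
#else
    return a + b;                     /* plain C fallback        */
#endif
}
```

Splitting the first two instructions into their own asm statement would leave nothing to stop the compiler from reusing xmm0 or xmm1 between the two statements, which is the assumption the merged blocks remove.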
Add d modifier to asm argument to fix nasm compilation.
Originally committed as revision 25397 to svn://svn.ffmpeg.org/ffmpeg/trunk
fft: mark xmm registers as clobbered in ff_imdct_calc_sse
Originally committed as revision 25363 to svn://svn.ffmpeg.org/ffmpeg/trunk
MMX, MMX2, SSE2 and SSSE3 optimizations for pred16x16/8x8_plane H264 intra prediction (plus some with different rounding for svq3/rv40). Speedup (for SSSE3) about 6-fold, 3.6% faster overall with cathedral sample.
Originally committed as revision 25361 to svn://svn.ffmpeg.org/ffmpeg/trunk
snowdsp: Explicitly state the operand sizes
Fixes compilation with clang's builtin assembler
Originally committed as revision 25331 to svn://svn.ffmpeg.org/ffmpeg/trunk