korg/bluez

mirror of https://git.kernel.org/pub/scm/bluetooth/bluez.git synced 2024-11-30 07:34:27 +08:00

Author	SHA1	Message	Date
Siarhei Siamashka	a563c8ed5a	SBC encoder scale factors calculation optimized with __builtin_clz Count leading zeros operation is often implemented using a special instruction for it on various architectures (at least this is true for ARM and x86). Using __builtin_clz gcc intrinsic allows to eliminate innermost loop in scale factors calculation and improve performance. Also scale factors calculation can be optimized even more using SIMD instructions.	2009-01-29 08:25:50 +01:00
Siarhei Siamashka	19af3c49e6	Performance optimizations for input data processing in SBC encoder Channels deinterleaving, endian conversion and samples reordering is done in one pass, avoiding the use of intermediate buffer. Also this code is implemented as a new "performance primitive", which allows further platform specific optimizations (ARMv6 and ARM NEON should gain quite a lot from assembly optimizations here).	2009-01-28 06:42:10 +01:00
Siarhei Siamashka	c90eb5ba7e	Use of -funroll-loops option to improve SBC encoder performance Added the use of -funroll-loops gcc option for SBC. Also in order to gain better effect, 'sbc_pack_frame' function body moved to an inline function, which gets instantiated for 4 different subbands/channels combinations. So that 'frame_subbands' and 'frame_channels' arguments become compile time constants and can be better optimized by the compiler.	2009-01-23 21:15:38 +02:00
Siarhei Siamashka	d48c175f70	Coding style fixes	2009-01-18 16:09:00 +01:00
Siarhei Siamashka	82d00972c9	SBC arrays and constant tables aligned at 16 byte boundary for SIMD Most SIMD instruction sets benefit from data being naturally aligned. And even if it is not strictly required, performance is usually better with the aligned data. ARM NEON and SSE2 have different instruction variants for aligned/unaligned memory accesses.	2009-01-16 08:23:19 +01:00
Siarhei Siamashka	9e31e7dde6	SIMD-friendly variant of SBC encoder analysis filter Added SIMD-friendly C implementation of SBC analysis filter (the structure of code had to be changed a bit and constants in the tables reordered). This code can be used as a reference for developing platform specific SIMD optimizations. These functions are put into a new file 'sbc_primitives.c', which is going to contain all the basic stuff for SBC codec.	2009-01-16 00:28:32 +01:00
Siarhei Siamashka	8bbfdf782d	Fix for big endian problems in SBC codec	2009-01-07 14:41:46 +01:00
Christian Hoene	2cada66773	Fixed correct handling of frame sizes in the encoder	2009-01-06 03:41:57 +01:00
Siarhei Siamashka	365f92ed45	Use of constant shift in SBC quantization code to make it faster The result of 32x32->64 unsigned long multiplication is returned in two registers (high and low 32-bit parts) for many 32-bit architectures. For these architectures constant right shift by 32 bits is optimized out by the compiler to just taking the high 32-bit part. Also some data needed at the quantization stage is precalculated beforehand to improve performance.	2009-01-06 03:39:27 +01:00
Marcel Holtmann	fb333f1c88	Update copyright information	2009-01-01 19:33:20 +01:00
Siarhei Siamashka	8a206b8115	Added possibility to analyze 4 blocks at once in SBC encoder This change is needed for SIMD optimizations which will follow shortly. And even for non-SIMD capable platforms it still may be useful to have possibility to merge several analyzing functions together into one for better code scheduling or reusing loaded constants. Also analysis filter functions are now called using function pointers, which allows the default implementation to be overridden at runtime (with high precision variant or MMX/SSE2/NEON optimized code).	2009-01-01 09:52:37 +01:00
Siarhei Siamashka	a6cb57cd01	New SBC analysis filter function to replace current broken code This code is heavily based on the patch submitted by Jaska Uimonen. Additional changes include preserving extra bits in the output of filter function for better precision, support for both 16-bit and 32-bit fixed point implementation. Sign of some table values was changed in order to preserve a regular code structure and have multiply-accumulate operations only. No additional optimizations were applied as this code is intended to be some kind of "reference" implementation. Platform specific optimizations may require different tricks and can be branched off from this implementation. Some extra information about this code can be found in linux-bluetooth mailing list archive for December 2008.	2008-12-29 12:52:20 +01:00
Siarhei Siamashka	635e9348a9	Fixed subbands selection for joint-stereo in SBC encoder	2008-12-29 11:31:59 +01:00
Marcel Holtmann	ce633965e1	Don't decode a frame if it is too small	2008-12-23 23:41:38 +01:00
Luiz Augusto von Dentz	50374ec694	Remove unnecessary code and fix a coding style.	2008-12-18 23:46:43 +01:00
Siarhei Siamashka	91a3fc0c35	Fix for overflow bug in SBC quantization code The result of multiplication does not always fit into 32-bits. Using 64-bit calculations helps to avoid overflows and sound quality problems in encoded audio. Overflows are more likely to show up when using high values for bitpool setting.	2008-12-18 23:46:05 +01:00
Siarhei Siamashka	037a47214c	Bitstream writing optimization for SBC encoder SBC encoder performance improvement up to 1.5x for ARM11 and almost twice faster for Intel Core2 in some cases.	2008-12-18 23:45:36 +01:00
Marcel Holtmann	40e63b5f54	Fix SBC gain mismatch	2008-10-31 23:55:13 +01:00
Marcel Holtmann	45c36dbd27	Avoid direct inclusion of malloc.h	2008-06-11 13:20:50 +00:00
Brad Midgley	9e446dba51	Cidorvan found another place where the spec had us saving a bunch of values that were used immediately. Just compute and use instead of saving. In the decoder.	2008-03-08 05:21:26 +00:00
Brad Midgley	51a5483169	decoder optimization, now using nested multiply calls	2008-03-06 14:04:43 +00:00
Brad Midgley	7a68b05bea	Cidorvan's 4-subband overflow fixes	2008-02-29 03:56:22 +00:00
Johan Hedberg	4170955ad1	Replace 64bits multiplies by 32bits to further optimize the code	2008-02-22 13:41:02 +00:00
Luiz Augusto von Dentz	ce342bf252	Introduce sbc new API.	2008-02-19 19:47:25 +00:00
Brad Midgley	ff51f4b0b2	fix for decoder noise at high bitpools	2008-02-15 18:06:32 +00:00
Marcel Holtmann	e823c15e43	Update copyright information	2008-02-02 03:37:05 +00:00
Brad Midgley	82540ead6b	change MUL/MULA semantics	2008-01-30 20:37:49 +00:00
Brad Midgley	4b8bfb24c7	one more .X 32-bitism	2008-01-29 19:47:49 +00:00
Brad Midgley	c6d9a4373d	revert 16-bit state.X change (bad on arm)	2008-01-29 18:56:13 +00:00
Brad Midgley	6d205fda03	revert arm conditional code	2008-01-28 18:00:51 +00:00
Brad Midgley	358888c523	change function signature so the arm optimization will work	2008-01-28 17:48:21 +00:00
Brad Midgley	38158dc5dd	remove 16x16 mult optimization--gcc actually generates more costly code	2008-01-28 17:26:22 +00:00
Johan Hedberg	ba255beb79	Whitespace cleanup	2008-01-28 10:38:40 +00:00
Brad Midgley	d352bd0438	avoid an (unlikely) overflow	2008-01-27 04:29:08 +00:00
Brad Midgley	c2ce7c2d41	get 32-bit products whenever we're sure the multiplicands are both 16 bits	2008-01-27 03:35:53 +00:00
Brad Midgley	c9b5101059	shorten the encoder tables to 16 bits, take out mula32/mul32 for now for simplicity	2008-01-26 19:45:54 +00:00
Brad Midgley	dae52eacba	pcm input array should be 16 not 32 bits use 32-bit product when multiplying two values limited to 16 bits each	2008-01-26 05:24:50 +00:00
Brad Midgley	66fe637352	update copyrights	2008-01-19 15:56:52 +00:00
Brad Midgley	910d620357	coding style	2008-01-14 22:15:17 +00:00
Brad Midgley	8c72b28a0c	comment typo	2008-01-14 20:03:42 +00:00
Brad Midgley	23f3e84ba4	fix initialization	2008-01-14 15:03:23 +00:00
Brad Midgley	d4d085c9a0	take out memmove in sbc analyze	2008-01-14 14:40:57 +00:00
Brad Midgley	0dda8d7230	tweak to the memmove for 4 subbands	2008-01-11 20:28:18 +00:00
Brad Midgley	40d383bb1e	optimizations: use memmove instead of a loop, unroll short loop	2008-01-08 20:56:17 +00:00
Brad Midgley	1da4920f71	smooth out last shift-in-place wrinkle	2007-12-14 06:54:19 +00:00
Brad Midgley	9a14d423e7	push in-place-shift optimization up into scalefactors section	2007-12-14 06:43:06 +00:00
Brad Midgley	c1ce3b25c4	shift-in-place opt is back in, with a bugfix for the 4-subband case	2007-12-14 06:07:52 +00:00
Brad Midgley	5acc7043cf	coding style on ?:	2007-12-14 00:46:09 +00:00
Brad Midgley	356a8b48cd	be more strict about calculating from joint since client may set it to a funky value other than 0/1	2007-12-14 00:15:33 +00:00
Brad Midgley	c5e43eefcc	roll back the shift-in-place bitpack optimization while we figure out if it tickles a bug or creates a bug for 4 subbands	2007-12-14 00:13:07 +00:00
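
Commit a563c8ed5a above describes replacing the innermost loop of the scale factor calculation with the `__builtin_clz` gcc intrinsic. The following is a minimal sketch of that idea, not the code from the commit: the function names are hypothetical, the scale factor is simplified to "index of the highest set bit of the largest absolute sample", and `unsigned int` is assumed to be 32 bits wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference version: keeps halving until nothing is left above bit sf. */
static int scale_factor_loop(uint32_t max_abs)
{
    int sf = 0;
    while (sf < 31 && (max_abs >> (sf + 1)) != 0)
        sf++;
    return sf;
}

/*
 * Same result from a single count-leading-zeros operation.
 * OR-ing in 1 keeps the argument nonzero, because __builtin_clz(0)
 * is undefined. Assumes a 32-bit unsigned int (hypothetical helper).
 */
static int scale_factor_clz(uint32_t max_abs)
{
    return 31 - __builtin_clz(max_abs | 1u);
}

/*
 * Scale factor for a block of samples. The absolute values are OR-ed
 * together instead of taking the maximum: the OR has the same highest
 * set bit as the largest value, which is all the clz step looks at.
 */
static int block_scale_factor(const int32_t *samples, size_t n)
{
    uint32_t bits = 0;
    for (size_t i = 0; i < n; i++) {
        int32_t s = samples[i];
        /* Negate in unsigned arithmetic so INT32_MIN is handled safely. */
        bits |= (s < 0) ? 0u - (uint32_t)s : (uint32_t)s;
    }
    return scale_factor_clz(bits);
}
```

For the samples `{3, -100, 17}` both variants give 6, since the largest magnitude (100) has its highest set bit at position 6. On ARM and x86 the clz call typically compiles to a single instruction, which is the gain the commit message refers to.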

84 Commits
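
Commits 91a3fc0c35 and 365f92ed45 in the list above concern the quantization multiply: the first moves it to 64-bit arithmetic because the product does not always fit into 32 bits, and the second uses a constant right shift by 32 so the compiler can simply take the high half of the 32x32->64 result. A rough sketch of both points follows. The helper names and the fixed-point layout (a sample treated as a fraction with 32 fractional bits) are assumptions for illustration, not the encoder's actual data format.

```c
#include <stdint.h>

/*
 * High 32 bits of an unsigned 32x32->64 multiplication. Because the
 * shift count is the constant 32, many 32-bit targets need no shift
 * instruction at all: the result is the register that already holds
 * the upper half of the product.
 */
static inline uint32_t mul_hi_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* The same product kept in 32 bits: the upper half is silently lost. */
static inline uint32_t mul_wrapped_u32(uint32_t a, uint32_t b)
{
    return a * b; /* wraps modulo 2^32 */
}

/*
 * Hypothetical quantizer: sample_q32 is a fraction in [0, 1) with
 * 32 fractional bits, levels is the number of quantization steps.
 * Returns floor(fraction * levels).
 */
static inline uint32_t quantize_sketch(uint32_t sample_q32, uint32_t levels)
{
    return mul_hi_u32(sample_q32, levels);
}
```

With `sample_q32 = 0x80000000` (one half) and `levels = 6`, `quantize_sketch` returns 3, while `mul_wrapped_u32` on the same inputs returns 0, which illustrates the kind of overflow the earlier commit describes.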