// SPDX-License-Identifier: GPL-2.0-or-later
/*
 *  Patch transfer callback for Emu10k1
 *
 *  Copyright (C) 2000 Takashi iwai <tiwai@suse.de>
 */
/*
 * All the code for loading in a patch.  There is very little that is
 * chip specific here.  Just the actual writing to the board.
 */

#include "emu10k1_synth_local.h"

/*
 * Blank padding added to each sample, counted in samples (converted to
 * bytes with `shift` below): a silent head in front of the data, and a
 * blank loop region placed relative to the sample end.
 */
#define BLANK_LOOP_START	4
#define BLANK_LOOP_END		8
#define BLANK_LOOP_SIZE		12
#define BLANK_HEAD_SIZE		3

/*
 * allocate a sample block and copy data from userspace
 */
int
snd_emu10k1_sample_new(struct snd_emux *rec, struct snd_sf_sample *sp,
		       struct snd_util_memhdr *hdr,
		       const void __user *data, long count)
{
	u8 fill;
	u32 xor;
	int shift;
	int offset;
	int truesize, size, blocksize;
	int loop_start, loop_end, loop_size, data_end, unroll;
	struct snd_emu10k1 *emu;

	emu = rec->hw;
	if (snd_BUG_ON(!sp || !hdr))
		return -EINVAL;

	if (sp->v.mode_flags & (SNDRV_SFNT_SAMPLE_BIDIR_LOOP | SNDRV_SFNT_SAMPLE_REVERSE_LOOP)) {
		/* should instead return -ENOTSUPP; but compatibility */
		printk(KERN_WARNING "Emu10k1 wavetable patch %d with unsupported loop feature\n",
		       sp->v.sample);
	}

	if (sp->v.mode_flags & SNDRV_SFNT_SAMPLE_8BITS) {
		shift = 0;
		fill = 0x80;
		xor = (sp->v.mode_flags & SNDRV_SFNT_SAMPLE_UNSIGNED) ? 0 : 0x80808080;
	} else {
		shift = 1;
		fill = 0;
		xor = (sp->v.mode_flags & SNDRV_SFNT_SAMPLE_UNSIGNED) ? 0x80008000 : 0;
	}
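	/*
	 * Worked example (illustrative, inferred from the constants above):
	 * fill is the silence value of the stored format, and xor flips the
	 * sign bit of every sample so that user data ends up in that format.
	 *   8-bit signed silence 0x00, xor 0x80, gives 0x80 (equals fill)
	 *   16-bit unsigned silence 0x8000, xor 0x8000, gives 0x0000 (equals fill)
	 * The mask is replicated across a 32-bit word, covering four 8-bit
	 * or two 16-bit samples.  shift converts a sample count to a byte
	 * count (count << shift).
	 */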

	/* compute true data size to be loaded */
	truesize = sp->v.size + BLANK_HEAD_SIZE;
	if (sp->v.mode_flags & SNDRV_SFNT_SAMPLE_NO_BLANK) {
		truesize += BLANK_LOOP_SIZE;
		/* if no blank loop is attached in the sample, add it */
		if (sp->v.mode_flags & SNDRV_SFNT_SAMPLE_SINGLESHOT) {
			sp->v.loopstart = sp->v.end + BLANK_LOOP_START;
			sp->v.loopend = sp->v.end + BLANK_LOOP_END;
		}
	}

	loop_start = sp->v.loopstart;
	loop_end = sp->v.loopend;
	loop_size = loop_end - loop_start;
	if (!loop_size)
		return -EINVAL;
	data_end = sp->v.end;

	/* recalculate offset */
	sp->v.start += BLANK_HEAD_SIZE;
	sp->v.end += BLANK_HEAD_SIZE;
	sp->v.loopstart += BLANK_HEAD_SIZE;
	sp->v.loopend += BLANK_HEAD_SIZE;

	// Automatic pre-filling of the cache does not work in the presence
	// of loops (*), and we don't want to fill it manually, as that is
	// fiddly and slow. So we unroll the loop until the loop end is
	// beyond the cache size.
	// (*) Strictly speaking, a single iteration is supported (that's
	// how it works when the playback engine runs), but handling this
	// special case is not worth it.
	unroll = 0;
	while (sp->v.loopend < 64) {
		truesize += loop_size;
		sp->v.loopstart += loop_size;
		sp->v.loopend += loop_size;
		sp->v.end += loop_size;
		unroll++;
	}
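	/*
	 * Worked example (hypothetical values): a sample passed in with
	 * loopstart = 10 and loopend = 20 has loop_size = 10.  Adding
	 * BLANK_HEAD_SIZE moves loopend to 23, and the loop above then
	 * advances it through 33, 43, 53 and 63 to 73, the first value that
	 * is not below 64.  That leaves unroll = 5, with truesize and
	 * sp->v.end each grown by 5 * loop_size = 50 samples.
	 */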

	/* try to allocate a memory block */
	blocksize = truesize << shift;
	sp->block = snd_emu10k1_synth_alloc(emu, blocksize);
	if (sp->block == NULL) {
		dev_dbg(emu->card->dev,
			"synth malloc failed (size=%d)\n", blocksize);
		/* not ENOMEM (for compatibility with OSS) */
		return -ENOSPC;
	}
	/* set the total size */
	sp->v.truesize = blocksize;

	/* write blank samples at head */
	offset = 0;
	size = BLANK_HEAD_SIZE << shift;
	snd_emu10k1_synth_memset(emu, sp->block, offset, size, fill);
	offset += size;

	/* copy provided samples */
	if (unroll && loop_end <= data_end) {
		size = loop_end << shift;
		if (snd_emu10k1_synth_copy_from_user(emu, sp->block, offset, data, size, xor))
			goto faulty;
		offset += size;

		data += loop_start << shift;
		while (--unroll > 0) {
			size = loop_size << shift;
			if (snd_emu10k1_synth_copy_from_user(emu, sp->block, offset, data, size, xor))
				goto faulty;
			offset += size;
		}

		size = (data_end - loop_start) << shift;
	} else {
		size = data_end << shift;
	}
	if (snd_emu10k1_synth_copy_from_user(emu, sp->block, offset, data, size, xor))
		goto faulty;
	offset += size;

	/* clear rest of samples (if any) */
	if (offset < blocksize)
		snd_emu10k1_synth_memset(emu, sp->block, offset, blocksize - offset, fill);

	return 0;

faulty:
	snd_emu10k1_synth_free(emu, sp->block);
	sp->block = NULL;
	return -EFAULT;
}

/*
 * free a sample block
 */
int
snd_emu10k1_sample_free(struct snd_emux *rec, struct snd_sf_sample *sp,
			struct snd_util_memhdr *hdr)
{
	struct snd_emu10k1 *emu;

	emu = rec->hw;
	if (snd_BUG_ON(!sp || !hdr))
		return -EINVAL;

	if (sp->block) {
		snd_emu10k1_synth_free(emu, sp->block);
		sp->block = NULL;
	}
	return 0;
}