Voice acting, sound editing, etc.

Discuss the Wing Commander Series and find the latest information on the Wing Commander Universe privateer mod as well as the standalone mod Wasteland Incident project.
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Bad news: I was just looking into OpenAL and there's abso-f&*@ing-lutely nothing in it with regard to reflections or reverberations. I'm sure it would be different if OpenAL hadn't been blessed by the Kiss of Death of Creative's mediocrity crusade; but let's not get into politics... The next version of OpenAL, 1.1, will have an extension to specify an "offset" in milliseconds, which is not explained anywhere that I can find, but if I were to speculate I'd say it's probably going to allow one to synchronize renditions of a sound, such as to implement reflections. (God, please bring back Aureal3D!)
So, for now it looks like we'll have to stick early reflections into the preprocessed wave files, and wave away hopes of dynamically handling head rotations.
Well, actually, OpenAL will handle filtering and channel balance for head rotation, though not the differential timing of reflections, so there will be some perceivable audio changes when turning the head, but not fully realistic-sounding.

OR.... or... or... or... there's always the possibility of handling early reflections later :), I mean in real-time. Say we were to content ourselves with just one early reflection --the most prominent one, say off the windshield: We could fairly easily do that without FFTs, just by reading values from the sound in the sample buffer, starting from the back, multiplying by a constant, and adding the result to the sample N samples back, where N is the reflection's delay time multiplied by the sampling rate.
To do this, though, we have to have our input buffer in memory. We can't just use a call to load the file and immediately start this process, as we would incur a file-loading hiccup.
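That single-reflection scheme might look like this; a minimal Python sketch over a list of float samples (names illustrative, not engine code):

```python
def add_early_reflection(buf, delay_samples, gain):
    """Mix one attenuated, delayed copy of the signal into the buffer,
    in place. Walking from the end backwards means every source sample
    we read (i - delay_samples) has not been modified yet, so no second
    buffer is needed."""
    for i in range(len(buf) - 1, delay_samples - 1, -1):
        buf[i] += gain * buf[i - delay_samples]
    return buf

echo = add_early_reflection([1.0, 0.0, 0.0, 0.0, 0.0], 2, 0.5)
# echo == [1.0, 0.0, 0.5, 0.0, 0.0]: a single reflection, no feedback tail
```

Because the read index always trails the write index by the delay, this stays a one-tap FIR echo rather than a recursive comb.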
Penta
Trader
Trader
Posts: 28
Joined: Tue Mar 15, 2005 2:43 pm

Post by Penta »

chuck_starchaser wrote:I suspect you CAN hear, at least partially, with the bad ear, if you do get any sense of direction of sound in real life. No, actually, even with hearing in one ear only, some directionality is left due to different frequency responses depending on how waves hit the folds of the one ear. But I suspect that if your loss of hearing in one ear is at lower frequencies, it should be possible to improve directional perception by cutting off lower frequencies. Maybe we should add an option to the rumble filter setting in the setup program to set the cutoff frequency pretty high.
In real life? It's been a few years since I actually had my hearing tested, matter of fact. I don't recall the details too well.

But generally, my sense of direction is faulty. That's not a real problem with games, though; the low-freq@high pitch is. That can hurt.
Penta
Trader
Trader
Posts: 28
Joined: Tue Mar 15, 2005 2:43 pm

Post by Penta »

klauss wrote:Penta: How would you like for us to handle that?
Really, the best one to decide is yourself since you know your own problem. We can't remove LF for everyone. But perhaps there could be an option. Similar to the highpass option for music.
But please tell us exactly how sound should be to not bother you.

Chuck: we posted at the same time. I was thinking that could be the solution to Penta's problem, but I thought he should confirm - he knows his own problem better than us.
It smells like that would work. I have to actually sit down with a decent (non-broken) pair of headphones and find something that'll let me mess with things to see where the problem comes from.

Actually, is there something like that? A program I could load up to do tones like they do audiological tests with?

Or am I on crack?
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

There probably is, but I don't know of one off hand. But you could simply ask a friend to lend you some CD that has good positional material when listened to with headphones, and the pair of headphones :), and see if you can find a comfortable setting in terms of balance, bass and treble controls.

If you have a preamp with a graphic equalizer (if not, Winamp does) you could try starting from an EQ setting where all the sliders are all the way up. Now bring the first EQ slider on the left all the way down and see if that helps your aural directional sense. If not, bring the second slider all the way down, and so on.

When you say "low frequency, high pitch" you throw me off. 'Pitch' is a synonym of 'frequency'. Did you mean "low frequency, high VOLUME"? Or do you mean you have trouble at both ends of the spectrum? You could try the same EQ process but starting by bringing down the slider all the way to the right, if that's the case. If you can find a good EQ patch for you, it would really help us get a glimpse what might be involved, if we decide to include a set of accessibility options to the sound settings.

Talking about accessibility, none of my business perhaps, but I think the VS engine might gain some popularity by offering such options on the graphics side. If my recollection is correct, about 15% of people have trouble distinguishing between red and green, and another 5% have trouble distinguishing blue from green or yellow, though I think I also read that only one in two out of this 20% of the population actually realize they have an impairment.
Not suggesting any color table kind of tweak; this would be madness to implement; but where it would be most important to provide some alternative color schemes is with radar and text displays. Back in the early days of computing, when accessibility was unheard of, many DOS programs used red text on green background, or purple on yellow. Nowadays such things aren't done anymore.
On the HUD front, I saw a thread somewhere with a new design proposal that used sort of alpha-blended green backgrounds and transparent text, IIRC, which could be a killer for many people. Also, a font-size option could be something to think about...
Penta
Trader
Trader
Posts: 28
Joined: Tue Mar 15, 2005 2:43 pm

Post by Penta »

My headphones are busted anyhow, so it's currently something for the to do list.

That said, re colors and visual:

Among Americans, the incidence of colorblindness is about 10% for males, counting at least some form of color perception deficiency (thanks to Wikipedia for the info).

More info can be found there...

And detailed info is here.
smbarbour
Fearless Venturer
Fearless Venturer
Posts: 610
Joined: Wed Mar 23, 2005 6:42 pm
Location: Northern Illinois

Post by smbarbour »

In regards to positional audio, I would recommend listening to the album Fragile by Yes. "Heart of the Sunrise" fully pans the audio back and forth between left and right. Listening to it on headphones is a real trip. :mrgreen:
I've stopped playing. I'm waiting for a new release.

I've kicked the MMO habit for now, but if I maintain enough money for an EVE-Online subscription, I'll be gone again.
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

smbarbour wrote:In regards to positional audio, I would recommend listening to the album Fragile by Yes. "Heart of the Sunrise" fully pans the audio back and forth between left and right. Listening to it on headphones is a real trip. :mrgreen:
(People with good musical tastes around here!) I got the album but I've actually never heard it in stereo, if you can believe my story... I built my own 3-way speaker and the 3 amplifiers that go with it, and the Linkwitz-Riley crossover preamp, but I spent a couple of G's on just that, altogether (uh.. more like 3), so I'm still debating about building a second channel... :D
Now, before I built it, I had a nice set of Altec Lansing PC speakers, but I loaned them to a coworker once I built my amplifier. Then I got Fragile. But even though my coworker's office is only 10 steps away from where I'm sitting and I could reclaim them in an instant, the problem is that Fragile is now sitting somewhere in my Linux drive, which is presently non-functional...

Anyhow, I suppose the effect you describe must sound like Robert Plant near the end of Whole Lotta Love, before the solo. A real trip with headphones, I remember.

Thing is, though, we're trying to achieve realism, rather than trippy effects; --at least for now. But some trippy effect will be coming in due course, for during jump point animations, as well as SPEC flight in Vega Strike... Stay tuned ;-)

EDIT: Just realized when I said I built my 3-way speaker, it may be confusing: It's not 3-way as in left, right, subwoof; rather as in bass-mid-tweeter: Doing the crossover in the preamp (Linkwitz-Riley), and using separate power amplifiers ("tri-amp-ing") of my own design, 120W + 120W + 30W respectively (i.e.: 270W "musical"), very hi-fi to my ears, and in the expressions of all the people who've heard it (who also share my opinion that it sounds as powerful as a 1000W system (ROCKS the house (and far beyond) ) ); BUT... mono, not stereo. I keep debating whether to build a second channel, but in truth I don't believe in listening to stereo through speakers, unless using a holographic preamplifier, which I talked about earlier in this thread. And holographic preamps are no longer available, that I know, and designing one would take months (needs a DSP and stuff). (DOH! I meant to buy a pair of headphones today, but forgot..)
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

Good lord! That's a nice speaker.
Tri-amping must have made the difference.
I remember when I bi-amped and bi-wired my system, it got 10 times better.
After using good cables it got another 10 times better.
It's unbelievable how you can always go up, up, up and never stop in terms of both quality and cost.

Back to the ESD: I said daemon since it would be monitoring stuff all the time. I was thinking about an element of the game that would Execute() every physics frame, testing acceleration and such each frame and triggering events. So, it's basically the tick() idea.
And a third reason: for acceleration and turning triggers, we're probably best off hooking to the joystick or mouse events, rather than monitoring acceleration, since acceleration might result from hitting another plane, for example, which would need a different sound effect from that of a thruster...
Good thing you were thinking, for sure as hell I was not! I didn't realize that. Anyway, that's for engine activity. But for stress and cargo sounds the acceleration approach is still correct (it will always be, since those things depend solely on acceleration).
Okay, I still don't get why you said "can't be under /sound". Were you thinking about the /sound folder under vegastrike? I was referring to the /sound folder under .vegastrike, or under .privateer100, which looks empty to me, so I thought we might as well squat it..
Perhaps that would be the place for the cache, but not for the .etx files and room descriptions. Those would go in /sound (not .something/sound), since they would come with the package.
Tell me you're NOT going to invent a new language for this...
No, I'm not. It already exists. I clarified since it's not normally used for complex values, but for vectors. Just look at the complex values as if they were 2D vectors.
We could fairly easily do that without FFTs, just by reading values from the sound in the sample buffer, starting from the back, multiplying by a constant, and adding the result to the sample N samples back, where N is the reflection's delay time multiplied by the sampling rate.
If anyone knows OpenAL's limitations, fill me in. But I think, for one level of reflection, perhaps playing the sound, with ms delay, and at a position mirrored by the reflecting surface, would be the best way. We would have to update the position every frame, however, since the reflection will change with head movement. I really don't like all that. I was thinking:

Consider the distance from the windshield to the head: depending on the ship, it could range from 50 cm to 1 m. Perhaps more on capships. Now consider the position variation of the ear when you rotate your head: 10-20 cm, not more. Now compute the delay that would come with a 20 cm shift: around 0.6 ms, which is around 27 samples, in a heavily filtered reflection (since now the ear would be pointing the other way). What I'm saying is, what would be most noticeable would be the fadeout and muffle of the reflection as the ear turns away, and the other ear picks up the reflection instead. Thus, I think, a Fwd/Back/Left/Right mixing matrix would approximate the results quite well, since as the ear turns, Left-back and Right-fwd interpolation would gradually fade between the proper early reflections already precomputed with FFT. The effect wouldn't be the same, but I think it would be quite believable. I didn't run any tests yet. Perhaps I should...
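The delay arithmetic above can be checked in a few lines of Python; the speed of sound and the 44.1 kHz sample rate are assumptions:

```python
SPEED_OF_SOUND = 343.0  # m/s in air at ~20 °C (assumed)

def reflection_delay(extra_path_m, sample_rate=44100):
    """Extra acoustic path length -> (delay in seconds, delay in samples)."""
    delay_s = extra_path_m / SPEED_OF_SOUND
    return delay_s, round(delay_s * sample_rate)

delay_s, n = reflection_delay(0.20)   # a 20 cm shift of the ear
# delay_s is about 0.58 ms, n about 26 samples at 44.1 kHz,
# consistent with the ~0.6 ms / ~27 samples quoted above
```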
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

Oh... I just remembered.
With all the fuss about the sounds, I think people forgot about the voice acting thing.
Perhaps opening a "Voice Acting - Really" thread would be nice, don't you think?
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Tri-amping must have made the difference.
Indeed, and as you say, there's always 'better': In my first implementation of the crossover, I used analog subtraction to get phase coherency. I noticed however that distortion began earlier in the mid range than in the tweeter and woofer amps. Finally I simulated my crossover (after having built it, of course :) ) and discovered that the subtraction method gives the channel resulting from the subtraction almost double the gain at the corner frequency, in order to *subtract* air pressure from the other speaker!!! Later I designed a new preamp using the Linkwitz-Riley crossover (this time I simulated it first :D ), and what a difference! That's when people started to say "this can't be 270 Watts, this is at least 1000W, and you're kidding me."
Besides that, I implemented an invention of mine, in the amplifiers, that electronically linearizes and ultra-dampens speakers. I'm using high efficiency speakers, from Eminence, the type used for rock band stage amps, --NOT hi-fi speakers--, which is part of the reason the system sounds so loud; but my linearization invention makes them sound like flat response, "monitor" speakers, in terms of quality. No resonant booming sounds: My bass speaker vibrates you and pounds you evenly, at all bass frequencies.. :D
But in fact, if the amp is on but no music is playing, if you tap on the speaker cones, they sound like tapping the belly of a dead cat. Almost no sound at all. And if you push the cone for one second and let go, it oscillates a couple of times very slowly, like 0.5 or 0.3 Hz. That's my invention making the speaker's coil act as if it were superconductive... People who tried this thought my speakers had the Devil inside.. :D

R.E. Stress, cargo, acceleration: Good, you're thinking too. Yeah, one should always strive to trigger things on their immediate causes, rather than on causes of those causes, or on sister consequences.

R.E. tick() idea: Excellent. It should work. I'm not sure whether the VS engine is constant rate or variable rate. If the latter, we might need to use time as an input param.

R.E.: /sound rather than .whatever/sound for .etx's: Got you. We could reflect the same folder hierarchy except without the leafs, having each etx file named as the leaf folder is named under .whatever/sound.
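The mirrored-hierarchy idea could be sketched like so; the path names here are purely illustrative guesses at the layout, not the project's actual folder structure:

```python
import os

def etx_path_for(cached_sound, cache_root, data_root="sound"):
    """Map a cached sound under e.g. ~/.privateer100/sound/<room>/<file>
    to the packaged room description <data_root>/<room>.etx, mirroring
    the hierarchy minus the leaf files. All names here are hypothetical."""
    rel = os.path.relpath(cached_sound, cache_root)
    room = os.path.dirname(rel)               # the leaf folder names the room
    return os.path.join(data_root, room + ".etx")

p = etx_path_for("/home/u/.privateer100/sound/cockpit/engine.wav",
                 "/home/u/.privateer100/sound")
# p == "sound/cockpit.etx"
```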

R.E.: New language, I was joking. Sort of a bad omen tradition, when someone starts coming up with a new language to solve a problem, they forget about the problem and spend the rest of their lives working on their language... :D
That sounds very powerful, BTW.

R.E.: OpenAL's limitations: Been digging around, and found someone who developed the sound system for a Quake 2 mod, based on OpenAL, but implementing... get this: A3D!!!! I've no idea how, but I sent the guy an email, around 4 am this morning. No reply yet.

R.E.: 0.6 ms: That's right. That's why you were talking about millisecondS and I was saying we need sub-millisecond reflections. Of course we don't perceive the change as a change in delay time; but we do perceive it as a change in spectral content. Your idea of just trying to do it with filters and balance would not sound very realistic, because our ears are a lot more sensitive to, for example, the reinforced bands in a comb filter stretching apart, than they are to a change in frequency response involving a single moving pole. I'm sure there are a lot of psychoacoustic hacks and shortcuts we can take, but I really don't think this is one of them. A 0.6 ms delay makes a comb filter response with peaks and troughs roughly every 1.7 kHz. Not easy at all to do that with an LPF...
So, if the hardware won't do it for us, I'd explore the possibility of real time software post-processing, to add at least one dynamic reflection.
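The comb-filter claim is easy to check numerically: adding one copy delayed by tau gives a magnitude response |1 + g·e^(-j2πf·tau)| with peaks spaced 1/tau apart (about 1.67 kHz for tau = 0.6 ms). A small Python check:

```python
import cmath

def comb_magnitude(f_hz, delay_s, gain):
    """|1 + g*exp(-j*2*pi*f*tau)|: the response of 'signal plus one
    copy delayed by tau'. Peaks land at multiples of 1/tau, notches
    halfway between."""
    return abs(1 + gain * cmath.exp(-2j * cmath.pi * f_hz * delay_s))

tau = 0.0006                   # the 0.6 ms reflection discussed above
spacing = 1 / tau              # ~1667 Hz between successive peaks
peak = comb_magnitude(spacing, tau, 0.5)        # |1 + 0.5| = 1.5
notch = comb_magnitude(spacing / 2, tau, 0.5)   # |1 - 0.5| = 0.5
```

A 9.5 dB swing every ~1.7 kHz is indeed nothing like a single-pole filter's gentle slope.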
Oh... I just remembered.
With all the fuss about the sounds, I think people forgot about the voice acting thing. Perhaps opening a "Voice Acting - Really" thread would be nice, don't you think?
There was such a thread open, somewhere. Maybe it was at the WCU forum at the crius.net website. No responses at all, that I remember, and no responses to this thread, in that regard, either.
I think people will have to feel motivated. Right now the EQ on the current voice-overs is simply wrong, or non-existent. With my preliminary experiments processing those few confed tracks, I got to what, to my ears, is much more understandable speech. Once people see that their voice acting efforts will be put to good use, they'll begin volunteering. "Build it, and they'll come."
Doh! I keep forgetting this is about the VS engine, not WCU alone.
Well, maybe we should leave the voice acting alone for now. The WCU voice overs are not really that bad, just lacking EQ and compression. With some processing they'll sound a lot better. About Vegastrike the game, I haven't played it in a while, I forget what its voiceovers sounded like.

I was going to ask you, BTW, whether it would be possible to use ORFEO to do the same processing I did to those voice-overs, namely:

Code: Select all

*Bit depth to 16, if needed.
*Upsample to 88,888 Hz (I'll call this 88 kHz)
*Normalize to about 50% below clipping.
*"Differentiate" (EQ such that gain = 1 at 1 kHz and falls 6 dB/oct to the left, and rises 6 dB/oct to the right.)
*"Telephone band-pass filter" 250 Hz to 4 kHz. Applied 3 times for sharpness.
*Bit depth down to 8.
*Sampling down to 8 kHz
*Bit depth up to 16
*Sampling back up to 88 kHz
*Apply EQ curve of cheap 2" speaker. Got the EQ file but can't upload to geocities.
*"Integrate" (EQ such that gain = 1 at 1 kHz and falls 6 dB/oct to the right, and rises 6 dB/oct to the left.)
*Distort with smooth symmetric function ("Synchronize" is the word Soundforge uses for symmetric, as they use "symmetric" for some other meaning...)
*"Differentiate" back
*Add reverbs: 3 taps: 1 ms 50%, 3 ms 25%, 7 ms 17%.
*Normalize to about 75% of clipping volume.
*Downsample back to 11 kHz.
Without the last 3 steps. And, do you have a command line version of ORFEO? What I'm thinking about is writing a little program that iterates through the cvs tree, processing all the voice-over files, batch-mode style.
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

By the way, most sounds shipped with Privateer/WCU and probably VS too, are sampled at a paltry 11 KHz. Are you planning to upsample with some snazzy polynomial interpolation before applying FFT?
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

I had to comment:
And if you push the cone for one second and let go, it oscillates a couple of times very slowly, like 0.5 or 0.3 Hz. That's my invention making the speaker's coil act as if it were superconductive... People who tried this thought my speakers had the Devil inside..
Mine does exactly the same. I try not to, since the almost-DC current needed to do that could overheat the coils. But in my case (and it's not my design, but I love it anyway), it's because of the very precise servo loop. When you push it, even slightly, it tries to go back where it should be, and starts oscillating real slow.
R.E.: 0.6 ms: That's right. That's why you were talking about millisecondS and I was saying we need sub-millisecond reflections...
Of course, I would have to experiment with it... The idea is that the filtering due to scattering that the bounce would suffer as you turn your head would smear the pulse in time, making the change of the pulse not a shift in time, but mainly a change in frequency response. Thus, the interpolation of the two states would mimic that behavior (if it were just a shift in time, as you say, it would not). But it depends on things, mainly the specifics of the early reflections, so I'm just imagining. I don't like the software idea, at least for now. It would be utterly simple, but would require extensive modification of the vegastrike engine. I'd rather leave it for later, and concentrate now on achieving the f/b/l/r mixing matrix (though I'm not sure how much modification of the engine that will require).
I was going to ask you, BTW, whether it would be possible to use ORFEO to do the same processing I did to those voice-overs
Up until I read that, I was about to tell you that with Orfeo it should be quite simple to achieve almost everything you said, with the exception of the quantization to 14 bits. I think perhaps I have an old filter for that. But even if I don't, creating a quantization filter will be a piece of cake; I estimate no more than 12 seconds, so... I could process all of the voice overs and upload the result. I'll even play with the numbers to see if anything gets better results (Orfeo makes it really easy to play with numbers at runtime).

Reading through the processing, perhaps you will have a bit of aliasing in your filters: try taking the telephone filter down to 3500 Hz, and making it sharper in the high end, so that when downsampling to 8 kHz there's no aliasing.
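The reason for pulling the filter below 4 kHz: anything left above half the target rate folds back down into the audible band. A quick Python check that a 4.5 kHz tone sampled at 8 kHz produces exactly the same samples as a phase-inverted 3.5 kHz tone:

```python
import math

fs = 8000              # target rate of the downsample step
f_in = 4500            # energy left above the new Nyquist (4 kHz)...
f_alias = fs - f_in    # ...folds down to 3500 Hz

for n in range(32):
    s = math.sin(2 * math.pi * f_in * n / fs)
    alias = -math.sin(2 * math.pi * f_alias * n / fs)
    assert abs(s - alias) < 1e-9   # indistinguishable after sampling
```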

And I could try to have fun modelling a speaker, since speaker distortion is not linear. So the 2" speaker not only produces a different EQ; it also has a non-linear transfer function (that is, the driving signal does not map to the cone's acceleration linearly), and has soft saturation. (Oh, that's the "Synchronize".)

Just one question: Why the reverb? If it's to make it sound as in a real cockpit, I can have fun there. If it's to model something else (like console resonance), perhaps not.
And, do you have a command line version of ORFEO? What I'm thinking about is writing a little program that iterates through the cvs tree, processing all the voice-over files, batch-mode style.
No, I've always used it as an interactive tool. But, when I have to actually do things, I have a batch file format hidden somewhere. When you select the input file, select a "*.batch" file (you have to select the appropriate filter). The batch file is a text file of the form:

<input path>|<output path>|<filter stack>

I'm not sure if the version I sent you has the <filter stack> part implemented already. I think so. If you only set <input file>, it's a simple playlist. If you don't set a <filter stack> (by omitting it and the | ), then it will use the current one.
Notice that Orfeo cannot process the files in place. You have to create a second copy, and then move the second copy over the first. (I'm always almost adding in-place filtering and a forward/rewind button, but I never get to it - guess I'm just too lazy ;) ).
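A batch line in that format could be parsed like this; a hedged Python sketch of the format as described, Orfeo's real parser may differ in details:

```python
def parse_batch_line(line):
    """Split one '<input path>|<output path>|<filter stack>' batch line.
    Missing fields fall back to None: input alone is a playlist entry,
    and a missing filter stack means 'use the current one'."""
    parts = line.rstrip("\n").split("|")
    inp = parts[0]
    out = parts[1] if len(parts) > 1 else None
    stack = parts[2] if len(parts) > 2 else None
    return inp, out, stack

parse_batch_line("in.wav|out.wav|telephone")  # ('in.wav', 'out.wav', 'telephone')
parse_batch_line("in.wav")                    # ('in.wav', None, None)
```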
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Indeed, OpenAL 1.0 has no EAX, but OpenAL 1.1, which is ultra-new, still hot out of the oven, has EAX support...
SINGAPORE - March 9, 2005 - Creative (NASDAQ: CREAF), the creator of Sound Blaster®, the standard for PC audio and a worldwide leader in digital entertainment solutions, today announced a significant advancement in gaming audio with OpenAL 1.1, an upgrade for the free API, OpenAL (Open Audio Library). With this important upgrade, several top game development houses are implementing OpenAL with EAX® ADVANCED HD™ for the highest-quality and most realistic audio possible in-game.
http://www.creative.com/corporate/press ... 9-2thmar05
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

Ok, a few more guidelines, in case you want to do it yourself:

Orfeo works in the FFT domain all the time, in float. There's almost no precision loss, depending on the FFT length (from 8K and above, long filter stacks can introduce some precision loss, which propagates in time in a way which is quite inaudible, not the usual quantization loss), since most filters are pointwise multiplies of the FFT coefficients. Also, since the FFT representation closely resembles the psychoacoustic perception of sound, even when there's precision loss it is inaudible.
It only performs a single FFT/iFFT, to further reduce precision loss.

There's the issue of the S/D (sigma-delta) antialias, which specifically adds noise. Shaped noise, but noise it is. But it's always good to use it, for otherwise the noise added is the normal quantization noise of the final conversion from float to int, which is highly dependent on the sound source, and it sucks. The S/D noise, besides being much less audible due to psychoacoustic masking of the high frequencies, is also highly independent of the input (with the exception of near-silence inputs, which I'm still working on). S/D dithering (it's not antialias, but for historic reasons it's named so) is much better.

It uses an S/D carrier in the LSB of the int to convey the infra-quantum variations. There's a way of dynamically varying the order of the S/D whenever the psychoacoustic masking allows it, but it's perhaps too much and it's experimental.
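As an illustration of the general idea (a textbook first-order noise shaper, not Orfeo's actual S/D code): feeding each sample's quantization error into the next sample pushes the error spectrum toward high frequencies, where masking hides it:

```python
def quantize_noise_shaped(samples, step):
    """First-order noise-shaped quantization ('error feedback'): each
    sample's quantization error is subtracted from the next sample, so
    the quantization noise is shifted toward high frequencies instead
    of being correlated with the signal."""
    out, err = [], 0.0
    for x in samples:
        q = round((x - err) / step) * step
        err = q - (x - err)          # error carried into the next sample
        out.append(q)
    return out

# a constant 0.3 quantized to integer steps dithers between 0 and 1
# such that the long-run average tracks 0.3 closely
levels = quantize_noise_shaped([0.3] * 1000, 1.0)
```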

As for the filters you mention:

Normalize: there's no normalize. I have a "normalize" command line tool for that. Sorry, I didn't send it. Anyway, orfeo doesn't handle that kind of stuff. It's a special purpose tool (whose purpose is not normalization).
Differentiate: Two Bandpass/stops:
1) lo: 0, hi: 1000 Hz, decay: -6 dB/oct
2) lo: 1000 Hz, hi: 0, decay: 6 dB/oct
(Integrate is the opposite)
Telephone filter is, again, bandpass/stop:
1) lo: 250 Hz, hi: 3500 Hz, decay 12 dB (or more, depends on your taste)
2) lo: 50 Hz, hi: 3990 Hz, decay 1000 dB (DC cut + steep ~4 kHz lowpass, to emulate the 8 kHz downsample)
Bitdepth: The easiest way is to output an 8-bit file at this stage. But I think I can use the "phase separator" filter to achieve the same. (NOTE: It has nothing to do with actual phase.)
EQ curve: To apply EQ curves you must use the bandpass/stops to produce slopes, or the Parametric EQ for notches, or the Band EQ for constant gains throughout a sharply limited band.
Integrate: already told you.
Synchronize: Nothing like that, although I can add it in a snap.
Reverbs: obvious. Only tap-less: technically speaking, the FFT reverb is an N-tap reverb with N being the FFT length.
Normalize: again, nothing specific.

Notes on normalization: Since Orfeo works its way in floats, it never clips until the last stage. And even then, if you leave "Anti-clipping" selected, it will automatically scale down the buffer so that there's no clipping, basically achieving some sort of dynamic range compression. One could say Orfeo is clip-free. I've never worried about clipping since I wrote Orfeo. I now hear the word and think: "Clipping, what's that? Ah.... clipping....". (Although you do have to watch out for the anti-clipping compression, which is sometimes undesirable - only acceptable in transients.)

About downsampling: Orfeo only downsamples from high sampling rates of >48 kHz down to a standard sampling rate of either 48 or 44.1 kHz. It would be easy from the algorithmic point of view to make it downsample to other rates, but the interface got messy so I never did it.

Basically, there's always a way to do things, but it's sometimes cumbersome if you don't know the internals.
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

By the way, most sounds shipped with Privateer/WCU and probably VS too, are sampled at a paltry 11 KHz. Are you planning to upsample with some snazzy polynomial interpolation before applying FFT?
Even better: upsample with NO interpolator, then apply a steep lowpass with FFT. It's the best way, even in textbooks. You only have to take care of one single FFT coefficient to minimize ripple - that's all.
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Excellent!
One more question:
I was thinking, for the band-pass, I don't mind the 4 kHz cut being sharp and ringy. It adds realism that way. But for the hi-pass at the low end, could we implement a smooth distribution, like a chain of single-pole hi-pass filters at slightly different frequencies, say, starting from the right, a 6 dB cut at 300 Hz, another 6 dB cut at 250, another at 200, and another at 150?

By the way, for speaker distortion I was thinking of using a function x^(m/n), where m < n, and both are odd numbers. Like 3/5 or 5/7. It has the advantage that % distortion is invariant with amplitude. I once built a fuzz-box/sustain for electric guitars using a cubic-root function (analog hardware, if you can believe it) and it sounded very sweet, as distortions go.
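That amplitude-invariance is easy to verify; a Python sketch of the sign-preserving power-law shaper:

```python
def soft_distort(x, m=3, n=5):
    """Sign-preserving power-law waveshaper y = sign(x) * |x|^(m/n),
    with m < n, both odd, as in the post (e.g. 3/5). Because
    (a*x)^(m/n) = a^(m/n) * x^(m/n), scaling the input only scales
    the output: the waveform shape, and hence the harmonic mix, is
    the same at every level."""
    return (1.0 if x >= 0 else -1.0) * abs(x) ** (m / n)

# doubling the input multiplies every output sample by the same 2^(3/5),
# so the percentage distortion does not change with amplitude
k = 2 ** (3 / 5)
for x in (0.1, -0.4, 0.9):
    assert abs(soft_distort(2 * x) - k * soft_distort(x)) < 1e-12
```

Odd symmetry means only odd harmonics are generated, which is part of why this family of curves sounds "sweet" compared to hard clipping.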

PS.: My soundcard supposedly supports EAX, but it blue-screened me at the first demo I threw at it...
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

Actually, FFT gives you the possibility of not thinking of things in terms of poles. When I have to think in terms of poles, I go to another (very much in development) tool I use, the Filter Designer, which helps me design IIR filters mostly (since I always use FFT for FIR filters). With the filter designer I position the poles in the Z plane directly, and watch the frequency/phase response as I move them. Right now it's very limited in that it only outputs Form I biquads, but for EQs, I'd rather use the FFT's ability to "draw" the frequency response. That's why Orfeo's filters have nothing to do with poles and stuff. Just think about what you want to hear, and draw the frequency response. I had the "Response" checkbox functional a while ago. Now it is there, but somehow it ceased to be functional :(
Anyway, you can do whatever you want EQ-wise. Just find the right combination of Bandpass/ParEQ/BandEQ. I was thinking of adding a filter with a button to let you literally draw the response with the mouse. That would save a lot of work with the Bandpass/ParEQ/BandEQ, but I never got to it because of UI limitations in Orfeo's architecture.
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Got it.

Here's a quote from the EAX 2.0 specification:
Distance Effects
Both OpenAL and DirectSound have the notions of Minimum Distance,
Maximum Distance, and Rolloff Factor. It applies an attenuation to the source
signal according to source-listener distance as follows:
- If distance ≤ Minimum_Distance, the attenuation is 0 dB (no attenuation).
- If Minimum_Distance ≤ distance ≤ Maximum_Distance, the attenuation
expressed in dB is -6 dB for each doubling of distance if the Rolloff
Factor is set to 1.0. For values different from 1.0, the Rolloff Factor acts
as a multiplier applied to the source-listener distance (diminished by the
minimum distance).
- If distance ≥ Maximum_Distance, the attenuation no longer varies with
distance.
Shouldn't that be -12 dB for each doubling of distance? Begins to sound to me like OpenGL and its creative lighting arithmetic...
And shouldn't the roll-off factor be a power applied to distance ratios, rather than a distance multiplier? How does one implement the natural roll-off of sound --i.e., by the square of the distance?
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Here's another quote:
The only reverberation response parameters that can vary automatically with
distance in EAX are the intensities of the three temporal sections: Direct,
Reflections and Reverb.
In other words, forget about using hardware for early reflections, since it can't vary the delay in real time. What a bunch of useless idiots those Creative people are. What's the point of having sound hardware? And would it have been so difficult to design asynchronous i/o FIFOs for early reflection delays, to make them variable?
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Alright, one can vary the early-reflections time manually; though I wouldn't be surprised if the attempt got denied or deferred while a sound is playing. No news is bad news with Creative, and they say nothing about it. Early reflections have a single delay parameter: a float ranging from 0.0f to 0.3f, representing up to 300 ms. In other words, the pattern of early reflections is fixed for all sounds; you can only move it in time, all together, as a block. No time-tweaking of individual reflections.
You can increase the amount of initial reflections to simulate a more narrow space
or closer walls, especially effective if you associate the initial reflections increase
with a reduction in reflections delays by lowering the value of the Reflection Delay
property. To simulate open or semi-open environments, you can maintain the
amount of early reflections while reducing the value of the Reverb property, which
controls later reflections.
Specify using this ID: DSPROPERTY_EAXLISTENER_REFLECTIONSDELAY
Value type: FLOAT
Value range: 0.0 to 0.3
Default value: varies depending on the environment
Value units: seconds
The Reflections Delay property is the amount of delay between the arrival time of
the direct path from the source to the first reflection from the source. It ranges
from 0 to 300 milliseconds. You can reduce or increase Reflections Delay to
simulate closer or more distant reflective surfaces—and therefore control the
perceived size of the room.
And it doesn't specify a resolution, or a minimum resolution, that the hardware should support to comply with the spec.

Now, this is EAX 2.0, whose specification they make available for download. If you want the specs for EAX 3.0 or 4.0, you have to fill out a developer registration and NDA form, answer a zillion questions about your "business", and wait up to 14 days for them to decide whether to accept your submission. Wouldn't it be nice if we could boycott them instead? I.e., if a Creative product is present, just refuse to play any sounds... ;-)
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Oh, klauss, by the way, you were right about using acceleration, after all. The reason is that in VS, the thrusters are controlled by the ship's computer, which re-interprets your intentions from your joystick input AND the current vector. If you're afterburning and you turn, you'll be sliding for a second or two, during which time thrusters are full on to slow you down. But if you're in slide mode "~" you can turn any which way without thruster activations. It gets even more removed from input if you've matched speed with a target...
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

OK, lots to read, but let's start answering now that I have answers.
Shouldn't that be -12 dB for each doubling of distance? Begins to sound to me like OpenGL and its creative lighting arithmetic...
And shouldn't the roll-off factor be a power applied to distance ratios, rather than a distance multiplier? How does one implement the natural rolloff of sound --i.e.: by square of the distance?
I did my math, and the rolloff factor is the exponent; it's just that when you express the attenuation in dB, it becomes a multiplier. Things are modelled such that 1.0 gives an intensity of m_d/d, 2.0 gives m_d/d^2, and so on...
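In other words (a sketch of my own, reading m_d/d as the linear amplitude gain, which is what makes the spec's -6 dB per doubling come out):

```cpp
#include <cmath>

// Rolloff factor r as an exponent on the linear gain, (m_d/d)^r.
// In dB, the exponent falls out of the log as a plain multiplier:
//   dB = 20 * log10((m_d/d)^r) = r * 20 * log10(m_d/d)
// so with r = 1.0 every doubling of distance costs about 6 dB.
double attenuationDB(double d, double m_d, double r) {
    return r * 20.0 * std::log10(m_d / d);
}
```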

Note to administrators: LaTeX formulas are not working.

I'll keep reading...
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

chuck_starchaser wrote:Oh, klauss, by the way, you were right about using acceleration, after all. The reason is that in VS, the thrusters are controlled by the ship's computer, which re-interprets your intentions from your joystick input AND the current vector. ...
That was the reason I wanted to use acceleration. But it gets more complicated: what you said about collisions is true as well. We should instead try to distinguish forced acceleration from controlled acceleration. No idea how to do that yet, but sooner or later...
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
klauss
Elite
Elite
Posts: 7243
Joined: Mon Apr 18, 2005 2:40 pm
Location: LS87, Buenos Aires, República Argentina

Post by klauss »

About the boycott: I would love to do it, and would include this in the FAQ:

Q: I own an EAX(c)-compliant sound card, and VS doesn't seem to use it. Is VS EAX(c)-compliant?
A: No. In fact, VS is much better. The EAX(c) specification is unable to reproduce the sophistication embedded in VS's sound engine, so EAX(c) support was intentionally left out. Otherwise, all the richness and natural feel of the sounds would be lost.
Oíd mortales, el grito sagrado...
Call me "Menes, lord of Cats"
Wing Commander Universe
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Klauss, sorry this post took so long... I want to write a little program that will iterate through the wave files in the CVS tree and process all the comm speech files. I'd like to call your convolve from inside it, three times for each file. Pseudocode:

Code: Select all

float bigbuff[77777777];
int main()
{
    for all folders recursively
    {
        for all .wav files in each folder
        {
            system( "filename" ); //play it through Winamp
            cout << "Process " << filename << "? [y/n] ";
            cin >> answer;
            if( answer == 'y' )
            {
                read_wavefile( filename, bigbuff );
                normalize_level( bigbuff, 0.5f ); //half-clipping level
                write_temp_raw( bigbuff );
                Sleep( 77 );
                system( "convolve temp.raw patch1.etx" ); //upsamp, bandpass & diff
                Sleep( 77 );
                read_temp_raw( bigbuff );
                downsample_to_8khz( bigbuff ); //I'll average chunks of samples
                normalize_level( bigbuff, 0.9f ); //90% of clipping level
                convert_to_8_bits( bigbuff ); //actually, &= floats with FC00...
                //...which is better: --closer to actual telephone codec ;-)
                write_temp_raw( bigbuff );
                Sleep( 77 );
                system( "convolve temp.raw patch2.etx" ); //upsamp+speaker
                Sleep( 77 );
                read_temp_raw( bigbuff );
                compute_double_integral( bigbuff ); //I'll do this myself.
                normalize_level( bigbuff, 0.9f ); //90% of clipping level
                remove_dc_offset( bigbuff ); //more like a high-pass ~100 Hz
                apply_pow_5_over_7_distortion( bigbuff );
                compute_time_differential( bigbuff ); //I'll do this myself.
                write_temp_raw( bigbuff );
                Sleep( 77 );
                system( "convolve temp.raw patch3.etx" ); //lo-pass & downsample
                Sleep( 77 );
                //the rest below I'll do myself:
                read_temp_raw( bigbuff );
                system( ("ren " + filename + " _" + filename).c_str() ); // ;-)
                Sleep( 77 );
                normalize_level( bigbuff, 0.9f );
                write_wavefile( bigbuff, filename );
                cout << "Done!" << endl;
                system( "filename" ); //play it through Winamp again
                Sleep( 777 );
   }   }   }
    return 0;
}
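For the record, the downsample_to_8khz step ("I'll average chunks of samples") could be as simple as this sketch (names mine; assumes the input rate is an integer multiple of 8 kHz):

```cpp
#include <vector>
#include <cstddef>

// Naive downsampling by averaging fixed-size chunks (a box filter).
// E.g. 48 kHz -> 8 kHz with factor 6.  The averaging doubles as a crude
// anti-alias low-pass; a proper filter before decimation would do better.
std::vector<float> downsampleByAveraging(const std::vector<float>& in, int factor) {
    std::vector<float> out;
    out.reserve(in.size() / factor);
    for (std::size_t i = 0; i + factor <= in.size(); i += factor) {
        float acc = 0.0f;
        for (int j = 0; j < factor; ++j)
            acc += in[i + j];
        out.push_back(acc / factor);
    }
    return out;
}
```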

Wonder if you could flesh out my calls to convolve and help me out with the patches... Here's what patch 1 needs to do (in one step, of course; values are approximate) to band-pass and differentiate:

<150 Hz: -infinity dB :)
170 Hz: -30 dB
205 Hz: -21 dB
250 Hz: -15 dB
300 Hz: -9 dB
500 Hz: -6 dB
1.00 kHz: 0.0 dB
2.00 kHz: +6.0 dB
4.00 kHz: +12.0 dB
4.01 kHz: -infinity dB :)

Next patch has to do the speaker curve. I've decided to leave in the 2-inch speaker model, rather than upgrade to 3 or 4 inch type, just to make sure there's no doubt in anybody's ears that it is a small speaker. The response is like this (assume straight lines between points):

Hz -----> dB
100 -----> - 14
320 -----> = 0.000dB
400 -----> -2
666 -----> -2
800 -----> -4
1100 -----> -4
1200 -----> -6
1270 -----> -3
1350 -----> -4
2000 -----> +1
2250 -----> = 0.000dB
2350 -----> +3
2700 -----> +2
2900 -----> +4
3100 -----> +3
3250 -----> +5
4050 -----> -3
4600 -----> -1
4800 -----> -1
5100 -----> = 0.000dB
5150 -----> -4
5300 -----> -5
5600 -----> -16
6000 -----> -11
6200 -----> -5
7200 -----> +2
7500 -----> +2
7700 -----> = 0.000dB
8000 -----> -6
9000 -----> -2
9800 -----> = 0.000dB
11000 -----> -1
12000 -----> +3
13500 -----> -10
20000 -----> -16

Patch 3 is just a sharp low-pass at 5 kHz and a downsample to 11 kHz.

(Pheww... Done!)
Last edited by chuck_starchaser on Tue Apr 26, 2005 7:38 pm, edited 1 time in total.
Post Reply