16-bit to 32-bit automatic upsampling in libsox

soumith
This might be a bit of a noob question.
I am reading a 16-bit wav file using libsox, and I see that the sample values are automatically scaled to 32-bit limits. Is there a way to avoid this, or is there a reason it is done?

For example, the samples that MATLAB reads from this file as
0.0663, 0.1061, 0.1451 are read by libsox as 142344192, 227868672, 311492608.

Is there something fundamentally wrong in my thinking?

--soumith

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Sox-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/sox-users

Re: 16-bit to 32-bit automatic upsampling in libsox

Jan Stary
On May 25 09:38:25, soumith wrote:

> This might be a bit of a noob question,
> I am reading a 16-bit wav file using libsox, and I see that the values of
> the samples are automatically scaled to 32-bit limits. Is there a way to
> avoid this, or is there a reason that this is being done.
>
> For example, reading the same file into matlab
> 0.0663, 0.1061, 0.1451 in matlab are read as 142344192, 227868672,
> 311492608 in libsox.
>
> Is there something fundamentally wrong in my thinking?

SoX converts everything to 32-bit internally before processing.

What is it that you actually need to do?


Re: 16-bit to 32-bit automatic upsampling in libsox

soumith
Hey Jan,

I wasn't sure why SoX was scaling the numbers up to 32-bit limits. Type-casting 16-bit numbers to a wider type is one thing, but here the values themselves are being scaled.
I don't need that much precision, so I was trying to see if I could pass options to libsox to avoid this scaling.

--soumith

Re: 16-bit to 32-bit automatic upsampling in libsox

Pascal Giard
In reply to this post by soumith
On Fri, May 25, 2012 at 9:38 AM, soumith <[hidden email]> wrote:

> This might be a bit of a noob question,
> I am reading a 16-bit wav file using libsox, and I see that the values of
> the samples are automatically scaled to 32-bit limits. Is there a way to
> avoid this, or is there a reason that this is being done.
>
> For example, reading the same file into matlab
> 0.0663, 0.1061, 0.1451 in matlab are read as 142344192, 227868672, 311492608
> in libsox.
>
> Is there something fundamentally wrong in my thinking?


I wouldn't call that upscaling but rather conversion from floating
point (probably double-precision floating point, as it's the default in
MATLAB) to 32-bit fixed point. SoX does all its signal processing in
32-bit fixed point.

Taking your samples to illustrate:
0.0663 * 2^31 = 142378165.8624
0.1061 * 2^31 = 227848015.0528
0.1451 * 2^31 = 311599877.3248
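
As a rough check of the arithmetic above, here is a minimal Python sketch. It is not libsox code, and the function names are made up for illustration. It scales the MATLAB values up by 2^31 and the libsox values down by 2^31. The products do not match the libsox values exactly because the MATLAB values quoted in this thread are only shown to four decimals; going the other way and rounding to four decimals does reproduce them.

```python
# Values quoted in the thread.
matlab_values = [0.0663, 0.1061, 0.1451]            # as displayed, 4 decimals
libsox_values = [142344192, 227868672, 311492608]   # 32-bit samples from libsox

FULL_SCALE = 2 ** 31  # magnitude of the most negative signed 32-bit value


def float_to_int32_scale(x):
    """Scale a normalized sample (roughly -1..1) to the signed 32-bit range."""
    return x * FULL_SCALE


def int32_to_float(sample):
    """Scale a signed 32-bit sample back down to the normalized range."""
    return sample / FULL_SCALE


for m, s in zip(matlab_values, libsox_values):
    # Columns: MATLAB value, MATLAB value scaled up, libsox value,
    # libsox value scaled down and rounded to 4 decimals.
    print(m, round(float_to_int32_scale(m)), s, round(int32_to_float(s), 4))
```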

Cheers,

-Pascal
--
Homepage (http://organact.mine.nu)
Debian GNU/Linux (http://www.debian.org)
COMunité/LACIME: École de technologie supérieure (http://www.comunite.ca)
Integrated Microsystems Laboratory: McGill (http://www.iml.ece.mcgill.ca)

Re: 16-bit to 32-bit automatic upsampling in libsox

Doug Cook-2
>> This might be a bit of a noob question,
>> I am reading a 16-bit wav file using libsox, and I see that the values of
>> the samples are automatically scaled to 32-bit limits. Is there a way to
>> avoid this, or is there a reason that this is being done.
>>
>> For example, reading the same file into matlab
>> 0.0663, 0.1061, 0.1451 in matlab are read as 142344192, 227868672, 311492608
>> in libsox.
>>
>> Is there something fundamentally wrong in my thinking?

The 16-bit wav file consists of 16-bit signed integers. Those can
store numbers from -32768 to +32767. That is very good for
final-product sound files, as it is very close to the accuracy that is
possible from most sound hardware and is very close to the dynamic
range that the human ear can hear.

For audio processing, 16-bit signed integers are not as convenient.
For one thing, most steps in audio processing introduce small rounding
errors in the lowest-order bits of the numbers they process. For another thing,
nearly every CPU that is capable of running sox is more efficient at
doing math with 32-bit integers than with any other kind of data.
Finally, code can only process one kind of data. If you want code that
processes two kinds of data, you have to make two slightly different
copies of the code. Instead of having a separate copy of each bit of
audio processing code, sox converts all input into 32-bit integers,
then all of sox's audio processing steps take 32-bit integers as input
and produce 32-bit integers as output.

As a side note, you'll notice that MATLAB isn't showing you 16-bit
signed integers either. It also converted the data into an internal
format (probably a 64-bit floating-point number) different from the
format in the audio file (16-bit signed integer). The audio file does
not contain numbers like 0.0663. Instead, the audio file has the
number +2172. When reading the wav file, MATLAB converts the number to
its internal format with a formula something like this: (2172 + 0.5) /
32768 = 0.0663, or perhaps just 2172 / 32768. SoX does a similar
process, but it converts to a 32-bit signed integer instead of to a
64-bit floating-point number: 2172 * 65536 = 142344192.
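
To make the last step concrete, here is a small Python sketch of the 16-bit to 32-bit scaling described above and its inverse. It is an illustration under the assumption that the conversion is a plain multiplication by 65536 (a 16-bit left shift), as the numbers in this thread suggest; the helper names are hypothetical and are not part of libsox. For C code, sox.h also ships conversion macros such as SOX_SAMPLE_TO_SIGNED_16BIT, so check the header of your libsox version before rolling your own.

```python
def int16_to_int32(sample16):
    """Scale a signed 16-bit sample to the signed 32-bit range."""
    return sample16 * 65536  # equivalent to a left shift by 16 bits


def int32_to_int16(sample32):
    """Recover the 16-bit value; exact only if the low 16 bits are zero."""
    return sample32 >> 16  # Python's >> on ints is an arithmetic shift


libsox_values = [142344192, 227868672, 311492608]

# The original 16-bit values stored in the wav file.
original_16bit = [int32_to_int16(s) for s in libsox_values]
print(original_16bit)  # [2172, 3477, 4753]
```

Samples that have passed through SoX effects may have nonzero low bits, in which case a plain shift truncates; rounding and clipping would then be needed.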

Re: 16-bit to 32-bit automatic upsampling in libsox

soumith
Thanks for the great reply, Doug. Yes, MATLAB normalized it to the range (-1,1).

Re: 16-bit to 32-bit automatic upsampling in libsox

Doug Cook-2
In reply to this post by Doug Cook-2

And the formula in my previous message is probably wrong. The formula for
16-bit-to-float is either (2172 / 32768) or ((2172 + 0.5) / 32767.5),
depending on who you ask and what theory of audio processing they
follow. ((2172 + 0.5) / 32768) is just me being stupid. :)
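
For comparison, the two 16-bit-to-float conventions mentioned here can be sketched in Python as follows. The function names are made up for illustration, and which convention a given tool actually uses is not established in this thread. The first maps the 16-bit range to [-1, 1) and keeps 0 at exactly 0.0; the second maps it symmetrically to [-1, 1] but never produces exactly 0.0.

```python
def int16_to_float_simple(sample16):
    """Divide by 2^15: -32768 maps to -1.0, +32767 to just under 1.0."""
    return sample16 / 32768


def int16_to_float_symmetric(sample16):
    """Offset by half a step: -32768 maps to -1.0 and +32767 to +1.0."""
    return (sample16 + 0.5) / 32767.5


for s in (-32768, 0, 2172, 32767):
    print(s, int16_to_float_simple(s), int16_to_float_symmetric(s))
```

Both conventions give 0.0663 for the sample 2172 when rounded to four decimals, so the values quoted in this thread cannot tell them apart.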
