AudioWatermarking.info
robust audio watermarking technology homepage

 

If you prefer, you can skip this overview and go directly to the Examples section.
You can also download the AWT2 User Guide (PDF), which contains all the detailed information about the product and its usage.

 

AWT2 at a glance

The watermark is highly robust. Read details below or proceed to the examples section.

The AWT2 algorithm implements the so-called "blind watermarking" approach: the original (non-watermarked) audio stream is not needed to extract the watermark. The watermark is extracted directly from the watermarked audio stream or a fragment of it.

The AWT2 encoder and decoder operate on WAV PCM (.wav) audio files of almost any format: mono or stereo, with sampling rates from 8 to 192 kHz and amplitude resolutions of 8/16/24/32 bits. Additional audio formats (such as MP3, OGG, AMR, etc.) are supported by the decoder via an external tool, FFmpeg (http://ffmpeg.org).
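Before passing a file to the encoder, it can be handy to check that it falls inside the ranges listed above. The sketch below uses Python's standard `wave` module. The function name `check_wav` and the exact limits it tests (1 or 2 channels, 8000 to 192000 Hz, 8/16/24/32 bits) are my reading of the paragraph above, not part of AWT2 itself.

```python
import wave

def check_wav(path):
    """Return a list of problems; an empty list means the file fits the stated ranges."""
    problems = []
    # wave.open raises wave.Error for files that are not plain PCM WAV
    with wave.open(path, "rb") as w:
        channels = w.getnchannels()
        rate = w.getframerate()
        bits = w.getsampwidth() * 8
    if channels not in (1, 2):
        problems.append(f"{channels} channels (expected mono or stereo)")
    if not 8000 <= rate <= 192000:
        problems.append(f"{rate} Hz sampling rate (expected 8-192 kHz)")
    if bits not in (8, 16, 24, 32):
        problems.append(f"{bits}-bit samples (expected 8/16/24/32)")
    return problems
```

Calling `check_wav("source.wav")` on a hypothetical input file returns an empty list when nothing looks out of range.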

AWT2 can operate in 'normal' and 'high capacity' modes. Normal mode provides the highest robustness with a moderate watermarking data rate (see below). In 'high capacity' mode AWT2 uses an extended frequency range to embed watermarks, which gives a 3 times higher data rate but somewhat reduced robustness (e.g. against conversion to very low sample rates such as 8 kHz and lower, which is typical for phone networks).

The supported watermarking payload size is from 1 to 120 bytes (subject to limitations in different AWT2 packages). The recommended payload size that does not compromise robustness is up to 20 bytes in 'normal' mode and up to 50 bytes in 'high capacity' mode.

With default parameters in 'normal' mode, the watermarking data rate is approximately 8 bps for a 1-byte payload, 11 bps for a 2-byte payload, 15 bps for a 4-byte payload, 17 bps for an 8-byte payload, and 20 bps for a 20-byte payload. In 'high capacity' mode the data rate is exactly 3 times higher (i.e. 60 bps for a 20-byte payload, etc.). The rate grows with the payload size because each watermark copy carries a constant overhead. A special encoder parameter allows adjusting the data rate above or below the default.
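To get a feel for what these rates mean in practice, they can be turned into "seconds per watermark copy" and "copies per minute". The sketch below assumes that the quoted rate counts payload bits only; under that reading the 4-byte figure works out to 28 copies per minute, which matches the number given in the Examples section. The function names are illustrative.

```python
# Default 'normal' mode rates quoted above: payload size in bytes -> approx. bits per second
NORMAL_RATE_BPS = {1: 8, 2: 11, 4: 15, 8: 17, 20: 20}

def rate_bps(payload_bytes, high_capacity=False):
    """Quoted data rate; 'high capacity' mode is stated to be exactly 3 times higher."""
    return NORMAL_RATE_BPS[payload_bytes] * (3 if high_capacity else 1)

def seconds_per_copy(payload_bytes, high_capacity=False):
    """Audio duration needed for one payload copy (assumes the rate counts payload bits only)."""
    return payload_bytes * 8 / rate_bps(payload_bytes, high_capacity)

def copies_per_minute(payload_bytes, high_capacity=False):
    """Whole payload copies that fit into 60 seconds of audio."""
    return rate_bps(payload_bytes, high_capacity) * 60 // (payload_bytes * 8)

for size in NORMAL_RATE_BPS:
    print(f"{size:2d} bytes: {seconds_per_copy(size):.2f} s per copy, "
          f"{copies_per_minute(size)} copies per minute")
```

Since the rates themselves are approximate, the results should be read as rough estimates rather than guarantees.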

The watermarking algorithm works in the time domain in several frequency sub-bands. The overall idea is to embed a binary watermarking payload into the carrier audio signal by time-shifting blocks of the carrier signal.

The algorithm can be applied to all kinds of audio data. Typical examples: music (pop, jazz, classical, rock), speech recordings, instrument samples, etc.

Each copy of the AWT2 binaries has its own Serial Number (SN) and contains a unique numeric identifier that is used during encoding to scramble the watermark payload. This security feature prevents one AWT2 user (with one SN) from extracting watermarks from files watermarked by another AWT2 user (with another SN).

AWT2 is fast: in 'normal' mode it runs at least 20 times faster than real time on a modest Intel Core 2 Duo E6750 @ 2.6 GHz using 1 core.


This watermarking technology is patented, U.S. Patent No. 8,116,514


 

Watermark robustness and aural (im)perceptibility

On robustness…

The proposed watermarking scheme demonstrates very high robustness to almost all kinds of audio conversions. Here are some typical examples:

  • lossy transcoding using MP3, Ogg Vorbis and other audio codecs (including multiple transcoding at very low bitrates)
  • acoustic coupling (i.e. traveling of sound from D/A to loudspeaker, then to the microphone via air and then to A/D)
  • time-stretching (time scaling) up to 50%
  • mixing with other signals, noise addition
  • signal cropping, cutting
  • sample rate conversion (even down to low sample rates such as 8 kHz, typical for phone networks); amplitude re-quantization
  • effect processing, from a simple EQ to an extreme dynamic range compression, reverberation, echo, spectral effects, etc.
  • waveform distortions such as limiting, clipping, slope manipulation, gain control
  • A/D - D/A conversion
  • transmission over radio waves

A quick example: the watermark survives even transcoding with a low-bitrate MP3 codec followed by time-stretching and transmission of the signal through the air (i.e. playing it through a loudspeaker and recording it with a microphone).

In 'high capacity' operating mode the algorithm uses an extended frequency range to embed watermarks. This results in a 3 times higher data rate, but reduces robustness against conversion to sample rates lower than 16 kHz and against extreme lossy transcoding.

It is worth underlining the decoder's ability to detect and successfully extract watermarks even from aggressively time-stretched audio streams. It is no secret that radio stations often apply time-stretching of 1-2% to free up more air time for advertisements. The AWT2 decoder can cope with time-stretching of up to 50%.

 

On imperceptibility…

With default parameters, the watermark is practically indistinguishable: on most audio content it is transparent to an average listener on audio equipment of any quality. In fairness, it should be noted that (as with any other real-world technology) there are very specific audio samples that may reveal some watermarking artifacts compared to the original non-watermarked audio; in these cases the effects are rather minor and may be noticeable only to an experienced listener. Depending on their needs, users may adjust the encoding parameters (namely, watermarking “density” and “aggressiveness”) to balance aural transparency against robustness.

 

How the algorithm works

...no, it is not another spread spectrum watermarking technique...

Below is a brief high-level description of the patented watermarking algorithm of AWT2:

  • the watermarking payload is converted into a watermarking data packet containing the encrypted payload and an error-correction code
  • the source audio stream is decomposed into several frequency sub-band signals (carrier signals); in 'normal' operating mode a few bands in a narrow frequency range are used, while in 'high capacity' operating mode more bands are used and therefore a wider frequency range is involved
  • each sub-band signal is then divided into blocks of a certain length
  • each block of the carrier sub-band signal is associated with the corresponding symbol (bit) of the watermarking data packet
  • each block of the carrier signal is then time-shifted (forwards or backwards in time) by an amount determined from inter-band signal analysis and from the value of the associated data packet symbol (bit)
  • the watermark is repeated throughout the audio stream as many times as the stream duration permits
  • the output (watermarked) audio stream is then synthesized by combining the modified (time-shifted) carrier sub-band signals back into a full-band signal
  • at the last stage the output (watermarked) stream is stored on disk as an audio file

For increased reliability, different statistical, security, error-correction and signal processing mechanisms are applied.
Important outcomes:

  • the number of copies of the watermarking payload embedded into the audio stream is proportional to its duration
  • the maximal degree of the time-shifts (a parameter called “aggressiveness”) affects both the robustness and the (im)perceptibility of the watermark: lower degrees result in a less robust and less perceptible watermark, and vice versa
  • the block size (a parameter called “density”) affects robustness and imperceptibility: a greater block size results in fewer payload copies embedded into the audio stream

Due to a constant overhead for each copy of the watermarking payload, watermarking rate increases with increase of the payload size.
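As a rough illustration of the time-shifting idea described in this section, here is a toy model in Python. It is not the patented AWT2 algorithm and does not claim to reproduce it. It assumes a single carrier "band" made of random samples, a second unmodified "band" that looks identical and serves as the timing reference (a crude stand-in for inter-band analysis), circular shifts of a fixed size, and no encryption or error correction. All names and constants are hypothetical.

```python
import random

BLOCK = 64   # block length in samples (hypothetical value)
SHIFT = 3    # shift size in samples (hypothetical value)

def shift(block, d):
    """Circularly shift a block by d samples (d > 0 delays, d < 0 advances)."""
    d %= len(block)
    return block[-d:] + block[:-d] if d else list(block)

def embed(carrier, bits):
    """Shift block i of the carrier forwards for bit 1 and backwards for bit 0."""
    out = []
    for i, bit in enumerate(bits):
        block = carrier[i * BLOCK:(i + 1) * BLOCK]
        out += shift(block, SHIFT if bit else -SHIFT)
    return out

def extract(marked, reference, n_bits):
    """Decide each bit by correlating the marked block with both shifted references."""
    bits = []
    for i in range(n_bits):
        m = marked[i * BLOCK:(i + 1) * BLOCK]
        r = reference[i * BLOCK:(i + 1) * BLOCK]
        fwd = sum(a * b for a, b in zip(m, shift(r, SHIFT)))
        bwd = sum(a * b for a, b in zip(m, shift(r, -SHIFT)))
        bits.append(1 if fwd > bwd else 0)
    return bits

rng = random.Random(0)
bits = [1, 0, 1, 1, 0, 0, 1, 0]
reference = [rng.gauss(0, 1) for _ in range(BLOCK * len(bits))]
carrier = list(reference)   # toy assumption: both "bands" look alike
marked = embed(carrier, bits)
print(extract(marked, reference, len(bits)) == bits)
```

The point of the toy is only that the sign of a small per-block time shift can carry one bit and can be read back without the original carrier, as long as some reference for the block's timing is available inside the watermarked signal itself.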

 

Examples

To demonstrate the robustness and imperceptibility of AWT2 watermarks, I have placed audio samples below that you can play with during your evaluation of AWT2. Please do not forget to download the AWT2 demo package, which includes the AWT2 encoder, decoder, a convenient GUI tool and documentation.

Here are several source audio signals (WAV PCM, 44.1 kHz, 16 bit) that are used in this demonstration:
    brahms-in.wav (50 sec)
    gazebo-in.wav (60 sec)
    jarre-in.wav (60 sec)
    speech-in.wav (30 sec)
    yello-in.wav (60 sec)

These are audio recordings of different types: music (pop, electronic, classical) and speech.

All of the above source files have been encoded with the AWT2 encoder, embedding the watermark 0xABCDEF12 (4 bytes) into each of them (in 'normal' operating mode). The encoding was done by running the encoder from the command line (alternatively, you can use the AWT2 GUI):

    awt2_enc source.wav output.wav 0xABCDEF12

With these parameters the watermarking data rate is approximately 15 bits per second, which results in 28 copies of the watermarking payload being embedded per minute of audio.

Below is a table of input and output (watermarked) files together with their distorted copies (transcoded, air-transduced, etc.). You can download and listen to them in order to:
* check the aural transparency of the watermark (by comparing the quality of the source and watermarked files)
* get an impression of the distortions introduced into the original recordings and verify that the watermarks are still detectable by the AWT2 decoder even in such heavily distorted files.
You can also edit/distort these files even further to test the robustness of the watermark.

To decode the watermark from any of the watermarked files, run:

    awt2_dec <file.wav> 4

Below is a table containing all input and output files. You can download the wave files or just listen to them in place using the embedded audio player.

| Source (input) file: original, without watermark | Output file watermarked with 0xABCDEF12 (to ensure that the watermark is indeed inaudible) | Watermarked output after 32 Kbps MP3 transcoding*, cropped (to test watermark robustness using AWT2 decoder) | Watermarked output after 128 Kbps MP3 encoding and transducing** via air, cropped (to test watermark robustness using AWT2 decoder) |
|---|---|---|---|
| brahms-in.wav (full, 50 sec) | brahms-out.wav (full, 50 sec) | brahms-out-transcoded-cropped.wav (20 sec) | brahms-out-transcoded-transduced-cropped.wav (25 sec) |
| gazebo-in.wav (full, 60 sec) | gazebo-out.wav (full, 60 sec) | gazebo-out-transcoded-cropped.wav (20 sec) | gazebo-out-transcoded-transduced-cropped.wav (25 sec) |
| jarre-in.wav (full, 60 sec) | jarre-out.wav (full, 60 sec) | jarre-out-transcoded-cropped.wav (25 sec) | jarre-out-transcoded-transduced-cropped.wav (22 sec) |
| speech-in.wav (full, 30 sec) | speech-out.wav (full, 30 sec) | speech-out-transcoded-cropped.wav (20 sec) | speech-out-transcoded-transduced-cropped.wav (25 sec) |
| yello-in.wav (full, 60 sec) | yello-out.wav (full, 60 sec) | yello-out-transcoded-cropped.wav (20 sec) | yello-out-transcoded-transduced-cropped.wav (25 sec) |

(*) Transcoded files were created by encoding to MP3 with the LAME encoder at 32 Kbps and then decoding back to wave.
(**) Air transducing (acoustic coupling) was performed by playing the lossy transcoded output files* through multimedia loudspeakers and recording the played signal with a microphone placed 30 cm from one of the loudspeakers.
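To check all of the watermarked files from the table in one go, a small wrapper script can build the decoder command for each of them. The sketch below assumes that the downloaded files sit in the current directory under the names shown in the table, that `awt2_dec` is on the PATH, and that the second decoder argument is the payload size in bytes (this is inferred from the examples on this page, where 4 is used for the 4-byte watermark). The decoder's output format is not assumed; it is simply printed.

```python
import subprocess

WATERMARK = "0xABCDEF12"
STEMS = ["brahms", "gazebo", "jarre", "speech", "yello"]
SUFFIXES = ["-out.wav",
            "-out-transcoded-cropped.wav",
            "-out-transcoded-transduced-cropped.wav"]
FILES = [stem + suffix for stem in STEMS for suffix in SUFFIXES]

def payload_size_bytes(hex_payload):
    """Number of bytes in a hex payload such as '0xABCDEF12'."""
    digits = hex_payload[2:] if hex_payload.lower().startswith("0x") else hex_payload
    return (len(digits) + 1) // 2

def decoder_command(path, hex_payload):
    """Command line in the form used on this page: awt2_dec <file.wav> <size>."""
    return ["awt2_dec", path, str(payload_size_bytes(hex_payload))]

def run_decoder(path, hex_payload):
    """Run the decoder and return whatever it prints (requires awt2_dec to be installed)."""
    result = subprocess.run(decoder_command(path, hex_payload),
                            capture_output=True, text=True)
    return result.stdout

# Dry run: only print the commands that would be executed
for name in FILES:
    print(" ".join(decoder_command(name, WATERMARK)))
```

Replacing the dry-run loop with calls to `run_decoder(name, WATERMARK)` would execute the decoder for real.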

---------
To show the ability of the decoder to find watermarks even in time-stretched audio streams, I have added one more audio example, in which the already transcoded, transduced and cropped file 'gazebo-out-transcoded-transduced-cropped.wav' has additionally been time-stretched with a speed decrease of 3%:
    gazebo-out-transcoded-transduced-cropped-stretched103prcnt.wav (25 sec)
    

You can use the AWT2 decoder to make sure that the watermark is still detectable in this stretched audio. Simply run the decoder like this:

    awt2_dec output.wav 4 -time_stretch=3.5

---------
To briefly show the performance of AWT2 in 'high capacity' mode, I have added one more example: a speech recording and its output with an embedded watermark containing a text string of 52 symbols. The following speech recording was used:
    speech50sec-in.wav (source audio, 50 sec)
    
This file has been watermarked in 'high capacity' mode with a 40-byte hexadecimal string 0xD453D08394E313B8059C58E0EF5363760ED383B4C76758033E094DF456D0824654B53D638394C500 which is a compact representation of the following text (52 symbols):

hey! don't even think to touch my files! (c)john doe

To convert this text into hex form, the 'text2hex' utility was used (it is part of the AWT2 package). You can find a detailed description of this command line tool in the AWT2 User Guide.

The encoding was done by running AWT2 encoder:

    awt2_enc.exe speech50sec-in.wav speech50sec-out.wav 0xD453D0...94C500 -high_capacity -density=0.8

Here is the watermarked output of the encoder (containing one full copy of the 52-symbol payload every 3.8 seconds of audio):
    speech50sec-out.wav
    
And here is its MP3 64 Kbps transcoded (encoded/decoded) version:
    speech50sec-out-transcoded.wav
    

To try decoding the watermark from the above output, you can run:

    awt2_dec.exe speech50sec-out-transcoded.wav 40 -high_capacity -density=0.8
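The figures quoted for this high-capacity example can be cross-checked with a few lines of Python. The script only does arithmetic on the hex string, the text and the 3.8-second interval given above. It does not reimplement 'text2hex', whose packing scheme is described in the User Guide and is not assumed here.

```python
HEX = ("D453D08394E313B8059C58E0EF5363760ED383B4"
       "C76758033E094DF456D0824654B53D638394C500")
TEXT = "hey! don't even think to touch my files! (c)john doe"
SECONDS_PER_COPY = 3.8   # as stated for -high_capacity -density=0.8

payload_bytes = len(bytes.fromhex(HEX))
payload_bits = payload_bytes * 8
bits_per_symbol = round(payload_bits / len(TEXT), 2)
payload_bps = round(payload_bits / SECONDS_PER_COPY, 1)

print(payload_bytes)     # 40, the size passed to awt2_dec
print(len(TEXT))         # 52 text symbols
print(bits_per_symbol)   # 6.15 bits available per symbol
print(payload_bps)       # 84.2 payload bits per second
```

In other words, the 40-byte payload leaves a little over 6 bits for each of the 52 symbols, and one copy every 3.8 seconds corresponds to roughly 84 payload bits per second.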

---------
As you can verify yourself, the watermarks are still detectable in all of the above transcoded, transduced, cropped and time-stretched recordings. The information and examples presented above are not intended to claim absolute robustness of AWT2 watermarks. However, the author believes that AWT2 watermarking is strong enough to withstand most audio transformations typical of today's uses (such as multiple MP3/MP4/other lossy transcoding, time stretching and so forth).

 

FAQ

Q: Why is AWT2 so pricey / so cheap? What advantages or disadvantages does AWT2 have compared to competing solutions?
A: AWT2 watermarks are very robust (read above), probably more robust than competing solutions available around the web. The watermarks even survive time-stretching and transducing via air. Plus, the data rate is extremely high (up to 125 bps). That is why AWT2 is possibly a little more expensive than some competing solutions. On the other hand, AWT2 is the product of a single developer and is provided on an "as is" basis. That is why AWT2 is not as expensive as it could be.

Q: I'm evaluating AWT2, but the demo encoder allows embedding only a few watermarks from a short list. How can I be sure that you're not cheating and that the full AWT2 encoder performs no worse than the demo?
A: First of all, I give you my word that the demo encoder is exactly the same as the fully functional version in terms of DSP performance. The limitation to a fixed list of watermarks is purely artificial and is introduced only for protection purposes. With the demo encoder you can perform a variety of performance tests to decide whether AWT2 suits your needs before buying the fully functional version. If you still wish to check the performance with other watermarks, you can contact me by e-mail and send me your test audio files together with the list of watermarks you want to embed into them. I'll process your files and send you the watermarked outputs so that you can test the performance on your side using the decoder from the demo package.

Q: Can special “watermarking attacks” compromise watermark robustness of AWT2? In other words, are there methods able to destroy AWT2 watermarks?
A: Yes, of course. Damaging watermarks is always possible, at least by developing a special attack targeted at a specific watermarking technique. The technique used in AWT2 is no exception. However, the author believes that the proposed technique is strong enough to survive most standard watermarking attacks.

Q: Does the use of this watermarking solution compromise the performance of automatic audio recognition services (such as TrackID, MusicBrainz, etc) being applied to the watermarked files?
A: The answer depends on the particular audio recognition technology. Since the AWT2 watermark is practically transparent to the human listener, it should be transparent to any “good” audio recognition engine as well. Therefore, at least in theory, AWT2 watermarking should not harm the recognition results of the most popular audio recognition technologies.

Q: Can one AWT2 user detect and/or extract watermarks from audio streams of another AWT2 user?
A: No. Each copy of the AWT2 binaries has its own Serial Number (SN) and contains a unique numeric identifier that is used during encoding to scramble the watermark payload. This security feature prevents one AWT2 user (with one SN) from extracting watermarks from files watermarked by another AWT2 user (with another SN).

Q: Can I extract watermark from a small fragment of the watermarked stream?
A: Yes, of course. Since the watermarking rate provided by AWT2 is quite high, it is generally enough to use only a portion of the watermarked stream (15-50 seconds, depending on the watermarking payload size) to extract the watermark, especially when the audio is only marginally distorted compared to the source file. On the other hand, if the audio stream is significantly distorted, you might need to use a longer portion or the entire available watermarked recording, because statistical analysis of longer signals improves extraction reliability.

Q: Can I watermark MP3/OGG/… files?
A: This procedure requires transcoding. The AWT2 encoder operates on wave files only and is unable to embed watermarks directly into an MP3 bit stream. Therefore, the only way to watermark an MP3 file is to decode it to a wave file, watermark the obtained wave file, and then encode the watermarked wave file back to MP3.
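The three steps from this answer can be chained in a small script. The sketch below is one possible way to do it with FFmpeg as the transcoder, which is my choice for illustration (any MP3 decoder/encoder would do). The file names, the 192k output bitrate and the function names are hypothetical; the `awt2_enc` call follows the form shown in the Examples section.

```python
import subprocess

def watermark_mp3_commands(src_mp3, dst_mp3, hex_payload, bitrate="192k"):
    """Build the decode / watermark / re-encode command lines for one MP3 file."""
    wav_in = src_mp3 + ".wav"     # temporary decoded copy
    wav_out = dst_mp3 + ".wav"    # temporary watermarked copy
    return [
        ["ffmpeg", "-y", "-i", src_mp3, wav_in],                    # MP3 -> WAV
        ["awt2_enc", wav_in, wav_out, hex_payload],                 # embed watermark
        ["ffmpeg", "-y", "-i", wav_out, "-b:a", bitrate, dst_mp3],  # WAV -> MP3
    ]

def run_all(commands):
    """Execute the commands in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)

# Dry run: print the pipeline for a hypothetical file
for cmd in watermark_mp3_commands("song.mp3", "song-marked.mp3", "0xABCDEF12"):
    print(" ".join(cmd))
```

Calling `run_all(...)` on the returned list executes the pipeline for real, provided that both `ffmpeg` and `awt2_enc` are installed and on the PATH.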

Q: Can I use AWT2 to detect watermarked content in broadcast recordings?
A: Yes. The AWT2 decoder can analyze long audio recordings in slices. This feature is especially useful when searching for watermarks in broadcast recordings in which only parts of the recording are watermarked (e.g. particular songs in a radio broadcast).

 
