HomeArtificial IntelligencePosit AI Weblog: Wavelet Rework

Posit AI Weblog: Wavelet Rework


Word: Like a number of prior ones, this publish is an excerpt from the forthcoming e-book, Deep Studying and Scientific Computing with R torch. And like many excerpts, it’s a product of laborious trade-offs. For added depth and extra examples, I’ve to ask you to please seek the advice of the e-book.

Wavelets and the Wavelet Rework

What are wavelets? Just like the Fourier foundation, they’re features; however they don’t lengthen infinitely. As a substitute, they’re localized in time: Away from the middle, they shortly decay to zero. Along with a location parameter, in addition they have a scale: At totally different scales, they seem squished or stretched. Squished, they’ll do higher at detecting excessive frequencies; the converse applies after they’re stretched out in time.

The fundamental operation concerned within the Wavelet Rework is convolution – have the (flipped) wavelet slide over the information, computing a sequence of dot merchandise. This manner, the wavelet is principally on the lookout for similarity.

As to the wavelet features themselves, there are numerous of them. In a sensible utility, we’d need to experiment and decide the one which works greatest for the given information. In comparison with the DFT and spectrograms, extra experimentation tends to be concerned in wavelet evaluation.

The subject of wavelets could be very totally different from that of Fourier transforms in different respects, as effectively. Notably, there’s a lot much less standardization in terminology, use of symbols, and precise practices. On this introduction, I’m leaning closely on one particular exposition, the one in Arnt Vistnes’ very good e-book on waves (Vistnes 2018). In different phrases, each terminology and examples mirror the alternatives made in that e-book.

Introducing the Morlet wavelet

The Morlet, often known as Gabor, wavelet is outlined like so:

[
Psi_{omega_{a},K,t_{k}}(t_n) = (e^{-i omega_{a} (t_n – t_k)} – e^{-K^2}) e^{- omega_a^2 (t_n – t_k )^2 /(2K )^2}
]

This formulation pertains to discretized information, the sorts of knowledge we work with in observe. Thus, (t_k) and (t_n) designate time limits, or equivalently, particular person time-series samples.

This equation appears daunting at first, however we are able to “tame” it a bit by analyzing its construction, and pointing to the principle actors. For concreteness, although, we first have a look at an instance wavelet.

We begin by implementing the above equation:

Evaluating code and mathematical formulation, we discover a distinction. The perform itself takes one argument, (t_n); its realization, 4 (omega, Ok, t_k, and t). It is because the torch code is vectorized: On the one hand, omega, Ok, and t_k, which, within the method, correspond to (omega_{a}), (Ok), and (t_k) , are scalars. (Within the equation, they’re assumed to be mounted.) t, however, is a vector; it should maintain the measurement occasions of the sequence to be analyzed.

We decide instance values for omega, Ok, and t_k, in addition to a variety of occasions to judge the wavelet on, and plot its values:

omega  6 * pi
Ok  6
t_k  5
 
sample_time  torch_arange(3, 7, 0.0001)

create_wavelet_plot  perform(omega, Ok, t_k, sample_time) {
  morlet  morlet(omega, Ok, t_k, sample_time)
  df  information.body(
    x = as.numeric(sample_time),
    actual = as.numeric(morlet$actual),
    imag = as.numeric(morlet$imag)
  ) %>%
    pivot_longer(-x, names_to = "half", values_to = "worth")
  ggplot(df, aes(x = x, y = worth, colour = half)) +
    geom_line() +
    scale_colour_grey(begin = 0.8, finish = 0.4) +
    xlab("time") +
    ylab("wavelet worth") +
    ggtitle("Morlet wavelet",
      subtitle = paste0("ω_a = ", omega / pi, "π , Ok = ", Ok)
    ) +
    theme_minimal()
}

create_wavelet_plot(omega, Ok, t_k, sample_time)
A Morlet wavelet.

What we see here’s a advanced sine curve – word the true and imaginary components, separated by a section shift of (pi/2) – that decays on either side of the middle. Wanting again on the equation, we are able to determine the elements answerable for each options. The primary time period within the equation, (e^{-i omega_{a} (t_n – t_k)}), generates the oscillation; the third, (e^{- omega_a^2 (t_n – t_k )^2 /(2K )^2}), causes the exponential decay away from the middle. (In case you’re questioning in regards to the second time period, (e^{-Ok^2}): For given (Ok), it’s only a fixed.)

The third time period really is a Gaussian, with location parameter (t_k) and scale (Ok). We’ll discuss (Ok) in nice element quickly, however what’s with (t_k)? (t_k) is the middle of the wavelet; for the Morlet wavelet, that is additionally the situation of most amplitude. As distance from the middle will increase, values shortly method zero. That is what is supposed by wavelets being localized: They’re “energetic” solely on a brief vary of time.

The roles of (Ok) and (omega_a)

Now, we already mentioned that (Ok) is the size of the Gaussian; it thus determines how far the curve spreads out in time. However there may be additionally (omega_a). Wanting again on the Gaussian time period, it, too, will impression the unfold.

First although, what’s (omega_a)? The subscript (a) stands for “evaluation”; thus, (omega_a) denotes a single frequency being probed.

Now, let’s first examine visually the respective impacts of (omega_a) and (Ok).

p1  create_wavelet_plot(6 * pi, 4, 5, sample_time)
p2  create_wavelet_plot(6 * pi, 6, 5, sample_time)
p3  create_wavelet_plot(6 * pi, 8, 5, sample_time)
p4  create_wavelet_plot(4 * pi, 6, 5, sample_time)
p5  create_wavelet_plot(6 * pi, 6, 5, sample_time)
p6  create_wavelet_plot(8 * pi, 6, 5, sample_time)

(p1 | p4) /
  (p2 | p5) /
  (p3 | p6)
Morlet wavelet: Effects of varying scale and analysis frequency.

Within the left column, we preserve (omega_a) fixed, and range (Ok). On the correct, (omega_a) modifications, and (Ok) stays the identical.

Firstly, we observe that the upper (Ok), the extra the curve will get unfold out. In a wavelet evaluation, which means extra time limits will contribute to the remodel’s output, leading to excessive precision as to frequency content material, however lack of decision in time. (We’ll return to this – central – trade-off quickly.)

As to (omega_a), its impression is twofold. On the one hand, within the Gaussian time period, it counteracts – precisely, even – the size parameter, (Ok). On the opposite, it determines the frequency, or equivalently, the interval, of the wave. To see this, check out the correct column. Akin to the totally different frequencies, we now have, within the interval between 4 and 6, 4, six, or eight peaks, respectively.

This double position of (omega_a) is the explanation why, all-in-all, it does make a distinction whether or not we shrink (Ok), retaining (omega_a) fixed, or enhance (omega_a), holding (Ok) mounted.

This state of issues sounds difficult, however is much less problematic than it may appear. In observe, understanding the position of (Ok) is vital, since we have to decide smart (Ok) values to strive. As to the (omega_a), however, there will likely be a large number of them, akin to the vary of frequencies we analyze.

So we are able to perceive the impression of (Ok) in additional element, we have to take a primary have a look at the Wavelet Rework.

Wavelet Rework: An easy implementation

Whereas total, the subject of wavelets is extra multifaceted, and thus, could appear extra enigmatic than Fourier evaluation, the remodel itself is less complicated to know. It’s a sequence of native convolutions between wavelet and sign. Right here is the method for particular scale parameter (Ok), evaluation frequency (omega_a), and wavelet location (t_k):

[
W_{K, omega_a, t_k} = sum_n x_n Psi_{omega_{a},K,t_{k}}^*(t_n)
]

That is only a dot product, computed between sign and complex-conjugated wavelet. (Right here advanced conjugation flips the wavelet in time, making this convolution, not correlation – a indisputable fact that issues rather a lot, as you’ll see quickly.)

Correspondingly, easy implementation ends in a sequence of dot merchandise, every akin to a distinct alignment of wavelet and sign. Beneath, in wavelet_transform(), arguments omega and Ok are scalars, whereas x, the sign, is a vector. The result’s the wavelet-transformed sign, for some particular Ok and omega of curiosity.

wavelet_transform  perform(x, omega, Ok) {
  n_samples  dim(x)[1]
  W  torch_complex(
    torch_zeros(n_samples), torch_zeros(n_samples)
  )
  for (i in 1:n_samples) {
    # transfer middle of wavelet
    t_k  x[i, 1]
    m  morlet(omega, Ok, t_k, x[, 1])
    # compute native dot product
    # word wavelet is conjugated
    dot  torch_matmul(
      m$conj()$unsqueeze(1),
      x[, 2]$to(dtype = torch_cfloat())
    )
    W[i]  dot
  }
  W
}

To check this, we generate a easy sine wave that has a frequency of 100 Hertz in its first half, and double that within the second.

gencos  perform(amp, freq, section, fs, length) {
  x  torch_arange(0, length, 1 / fs)[1:-2]$unsqueeze(2)
  y  amp * torch_cos(2 * pi * freq * x + section)
  torch_cat(listing(x, y), dim = 2)
}

# sampling frequency
fs  8000

f1  100
f2  200
section  0
length  0.25

s1  gencos(1, f1, section, fs, length)
s2  gencos(1, f2, section, fs, length)

s3  torch_cat(listing(s1, s2), dim = 1)
s3[(dim(s1)[1] + 1):(dim(s1)[1] * 2), 1] 
  s3[(dim(s1)[1] + 1):(dim(s1)[1] * 2), 1] + length

df  information.body(
  x = as.numeric(s3[, 1]),
  y = as.numeric(s3[, 2])
)
ggplot(df, aes(x = x, y = y)) +
  geom_line() +
  xlab("time") +
  ylab("amplitude") +
  theme_minimal()
An example signal, consisting of a low-frequency and a high-frequency half.

Now, we run the Wavelet Rework on this sign, for an evaluation frequency of 100 Hertz, and with a Ok parameter of two, discovered by fast experimentation:

Ok  2
omega  2 * pi * f1

res  wavelet_transform(x = s3, omega, Ok)
df  information.body(
  x = as.numeric(s3[, 1]),
  y = as.numeric(res$abs())
)

ggplot(df, aes(x = x, y = y)) +
  geom_line() +
  xlab("time") +
  ylab("Wavelet Rework") +
  theme_minimal()
Wavelet Transform of the above two-part signal. Analysis frequency is 100 Hertz.

The remodel accurately picks out the a part of the sign that matches the evaluation frequency. If you happen to really feel like, you would possibly need to double-check what occurs for an evaluation frequency of 200 Hertz.

Now, in actuality we’ll need to run this evaluation not for a single frequency, however a variety of frequencies we’re taken with. And we’ll need to strive totally different scales Ok. Now, in case you executed the code above, you could be nervous that this might take a lot of time.

Effectively, it by necessity takes longer to compute than its Fourier analogue, the spectrogram. For one, that’s as a result of with spectrograms, the evaluation is “simply” two-dimensional, the axes being time and frequency. With wavelets there are, as well as, totally different scales to be explored. And secondly, spectrograms function on entire home windows (with configurable overlap); a wavelet, however, slides over the sign in unit steps.

Nonetheless, the state of affairs isn’t as grave because it sounds. The Wavelet Rework being a convolution, we are able to implement it within the Fourier area as an alternative. We’ll do this very quickly, however first, as promised, let’s revisit the subject of various Ok.

Decision in time versus in frequency

We already noticed that the upper Ok, the extra spread-out the wavelet. We are able to use our first, maximally easy, instance, to research one speedy consequence. What, for instance, occurs for Ok set to twenty?

Ok  20

res  wavelet_transform(x = s3, omega, Ok)
df  information.body(
  x = as.numeric(s3[, 1]),
  y = as.numeric(res$abs())
)

ggplot(df, aes(x = x, y = y)) +
  geom_line() +
  xlab("time") +
  ylab("Wavelet Rework") +
  theme_minimal()
Wavelet Transform of the above two-part signal, with K set to twenty instead of two.

The Wavelet Rework nonetheless picks out the right area of the sign – however now, as an alternative of a rectangle-like end result, we get a considerably smoothed model that doesn’t sharply separate the 2 areas.

Notably, the primary 0.05 seconds, too, present appreciable smoothing. The bigger a wavelet, the extra element-wise merchandise will likely be misplaced on the finish and the start. It is because transforms are computed aligning the wavelet in any respect sign positions, from the very first to the final. Concretely, after we compute the dot product at location t_k = 1, only a single pattern of the sign is taken into account.

Other than presumably introducing unreliability on the boundaries, how does wavelet scale have an effect on the evaluation? Effectively, since we’re correlating (convolving, technically; however on this case, the impact, ultimately, is similar) the wavelet with the sign, point-wise similarity is what issues. Concretely, assume the sign is a pure sine wave, the wavelet we’re utilizing is a windowed sinusoid just like the Morlet, and that we’ve discovered an optimum Ok that properly captures the sign’s frequency. Then every other Ok, be it bigger or smaller, will end in much less point-wise overlap.

Performing the Wavelet Rework within the Fourier area

Quickly, we’ll run the Wavelet Rework on an extended sign. Thus, it’s time to velocity up computation. We already mentioned that right here, we profit from time-domain convolution being equal to multiplication within the Fourier area. The general course of then is that this: First, compute the DFT of each sign and wavelet; second, multiply the outcomes; third, inverse-transform again to the time area.

The DFT of the sign is shortly computed:

F  torch_fft_fft(s3[ , 2])

With the Morlet wavelet, we don’t even should run the FFT: Its Fourier-domain illustration might be acknowledged in closed type. We’ll simply make use of that formulation from the outset. Right here it’s:

morlet_fourier  perform(Ok, omega_a, omega) {
  2 * (torch_exp(-torch_square(
    Ok * (omega - omega_a) / omega_a
  )) -
    torch_exp(-torch_square(Ok)) *
      torch_exp(-torch_square(Ok * omega / omega_a)))
}

Evaluating this assertion of the wavelet to the time-domain one, we see that – as anticipated – as an alternative of parameters t and t_k it now takes omega and omega_a. The latter, omega_a, is the evaluation frequency, the one we’re probing for, a scalar; the previous, omega, the vary of frequencies that seem within the DFT of the sign.

In instantiating the wavelet, there may be one factor we have to pay particular consideration to. In FFT-think, the frequencies are bins; their quantity is set by the size of the sign (a size that, for its half, immediately is determined by sampling frequency). Our wavelet, however, works with frequencies in Hertz (properly, from a consumer’s perspective; since this unit is significant to us). What this implies is that to morlet_fourier, as omega_a we have to move not the worth in Hertz, however the corresponding FFT bin. Conversion is completed relating the variety of bins, dim(x)[1], to the sampling frequency of the sign, fs:

# once more search for 100Hz components
omega  2 * pi * f1

# want the bin akin to some frequency in Hz
omega_bin  f1/fs * dim(s3)[1]

We instantiate the wavelet, carry out the Fourier-domain multiplication, and inverse-transform the end result:

Ok  3

m  morlet_fourier(Ok, omega_bin, 1:dim(s3)[1])
prod  F * m
reworked  torch_fft_ifft(prod)

Placing collectively wavelet instantiation and the steps concerned within the evaluation, we now have the next. (Word easy methods to wavelet_transform_fourier, we now, conveniently, move within the frequency worth in Hertz.)

wavelet_transform_fourier  perform(x, omega_a, Ok, fs) {
  N  dim(x)[1]
  omega_bin  omega_a / fs * N
  m  morlet_fourier(Ok, omega_bin, 1:N)
  x_fft  torch_fft_fft(x)
  prod  x_fft * m
  w  torch_fft_ifft(prod)
  w
}

We’ve already made important progress. We’re prepared for the ultimate step: automating evaluation over a variety of frequencies of curiosity. This can end in a three-dimensional illustration, the wavelet diagram.

Creating the wavelet diagram

Within the Fourier Rework, the variety of coefficients we acquire is determined by sign size, and successfully reduces to half the sampling frequency. With its wavelet analogue, since anyway we’re doing a loop over frequencies, we would as effectively determine which frequencies to investigate.

Firstly, the vary of frequencies of curiosity might be decided operating the DFT. The subsequent query, then, is about granularity. Right here, I’ll be following the advice given in Vistnes’ e-book, which is predicated on the relation between present frequency worth and wavelet scale, Ok.

Iteration over frequencies is then carried out as a loop:

wavelet_grid  perform(x, Ok, f_start, f_end, fs) {
  # downsample evaluation frequency vary
  # as per Vistnes, eq. 14.17
  num_freqs  1 + log(f_end / f_start)/ log(1 + 1/(8 * Ok))
  freqs  seq(f_start, f_end, size.out = ground(num_freqs))
  
  reworked  torch_zeros(
    num_freqs, dim(x)[1],
    dtype = torch_cfloat()
    )
  for(i in 1:num_freqs) {
    w  wavelet_transform_fourier(x, freqs[i], Ok, fs)
    reworked[i, ]  w
  }
  listing(reworked, freqs)
}

Calling wavelet_grid() will give us the evaluation frequencies used, along with the respective outputs from the Wavelet Rework.

Subsequent, we create a utility perform that visualizes the end result. By default, plot_wavelet_diagram() shows the magnitude of the wavelet-transformed sequence; it could actually, nonetheless, plot the squared magnitudes, too, in addition to their sq. root, a way a lot advisable by Vistnes whose effectiveness we’ll quickly have alternative to witness.

The perform deserves a couple of additional feedback.

Firstly, identical as we did with the evaluation frequencies, we down-sample the sign itself, avoiding to recommend a decision that’s not really current. The method, once more, is taken from Vistnes’ e-book.

Then, we use interpolation to acquire a brand new time-frequency grid. This step might even be vital if we preserve the unique grid, since when distances between grid factors are very small, R’s picture() might refuse to just accept axes as evenly spaced.

Lastly, word how frequencies are organized on a log scale. This results in rather more helpful visualizations.

plot_wavelet_diagram  perform(x,
                                 freqs,
                                 grid,
                                 Ok,
                                 fs,
                                 f_end,
                                 kind = "magnitude") {
  grid  change(kind,
    magnitude = grid$abs(),
    magnitude_squared = torch_square(grid$abs()),
    magnitude_sqrt = torch_sqrt(grid$abs())
  )

  # downsample time sequence
  # as per Vistnes, eq. 14.9
  new_x_take_every  max(Ok / 24 * fs / f_end, 1)
  new_x_length  ground(dim(grid)[2] / new_x_take_every)
  new_x  torch_arange(
    x[1],
    x[dim(x)[1]],
    step = x[dim(x)[1]] / new_x_length
  )
  
  # interpolate grid
  new_grid  nnf_interpolate(
    grid$view(c(1, 1, dim(grid)[1], dim(grid)[2])),
    c(dim(grid)[1], new_x_length)
  )$squeeze()
  out  as.matrix(new_grid)

  # plot log frequencies
  freqs  log10(freqs)
  
  picture(
    x = as.numeric(new_x),
    y = freqs,
    z = t(out),
    ylab = "log frequency [Hz]",
    xlab = "time [s]",
    col = hcl.colours(12, palette = "Mild grays")
  )
  principal  paste0("Wavelet Rework, Ok = ", Ok)
  sub  change(kind,
    magnitude = "Magnitude",
    magnitude_squared = "Magnitude squared",
    magnitude_sqrt = "Magnitude (sq. root)"
  )

  mtext(facet = 3, line = 2, at = 0, adj = 0, cex = 1.3, principal)
  mtext(facet = 3, line = 1, at = 0, adj = 0, cex = 1, sub)
}

Let’s use this on a real-world instance.

An actual-world instance: Chaffinch’s tune

For the case examine, I’ve chosen what, to me, was essentially the most spectacular wavelet evaluation proven in Vistnes’ e-book. It’s a pattern of a chaffinch’s singing, and it’s obtainable on Vistnes’ web site.

url  "http://www.physics.uio.no/pow/wavbirds/chaffinch.wav"

obtain.file(
 file.path(url),
 destfile = "/tmp/chaffinch.wav"
)

We use torchaudio to load the file, and convert from stereo to mono utilizing tuneR’s appropriately named mono(). (For the sort of evaluation we’re doing, there isn’t any level in retaining two channels round.)

library(torchaudio)
library(tuneR)

wav  tuneR_loader("/tmp/chaffinch.wav")
wav  mono(wav, "each")
wav
Wave Object
    Variety of Samples:      1864548
    Period (seconds):     42.28
    Samplingrate (Hertz):   44100
    Channels (Mono/Stereo): Mono
    PCM (integer format):   TRUE
    Bit (8/16/24/32/64):    16 

For evaluation, we don’t want the whole sequence. Helpfully, Vistnes additionally revealed a suggestion as to which vary of samples to investigate.

waveform_and_sample_rate  transform_to_tensor(wav)
x  waveform_and_sample_rate[[1]]$squeeze()
fs  waveform_and_sample_rate[[2]]

# http://www.physics.uio.no/pow/wavbirds/chaffinchInfo.txt
begin  34000
N  1024 * 128
finish  begin + N - 1
x  x[start:end]

dim(x)
[1] 131072

How does this look within the time area? (Don’t miss out on the event to truly hear to it, in your laptop computer.)

df  information.body(x = 1:dim(x)[1], y = as.numeric(x))
ggplot(df, aes(x = x, y = y)) +
  geom_line() +
  xlab("pattern") +
  ylab("amplitude") +
  theme_minimal()
Chaffinch’s song.

Now, we have to decide an inexpensive vary of study frequencies. To that finish, we run the FFT:

On the x-axis, we plot frequencies, not pattern numbers, and for higher visibility, we zoom in a bit.

bins  1:dim(F)[1]
freqs  bins / N * fs

# the bin, not the frequency
cutoff  N/4

df  information.body(
  x = freqs[1:cutoff],
  y = as.numeric(F$abs())[1:cutoff]
)
ggplot(df, aes(x = x, y = y)) +
  geom_col() +
  xlab("frequency (Hz)") +
  ylab("magnitude") +
  theme_minimal()
Chaffinch’s song, Fourier spectrum (excerpt).

Primarily based on this distribution, we are able to safely prohibit the vary of study frequencies to between, roughly, 1800 and 8500 Hertz. (That is additionally the vary advisable by Vistnes.)

First, although, let’s anchor expectations by making a spectrogram for this sign. Appropriate values for FFT dimension and window dimension had been discovered experimentally. And although, in spectrograms, you don’t see this completed usually, I discovered that displaying sq. roots of coefficient magnitudes yielded essentially the most informative output.

fft_size  1024
window_size  1024
energy  0.5

spectrogram  transform_spectrogram(
  n_fft = fft_size,
  win_length = window_size,
  normalized = TRUE,
  energy = energy
)

spec  spectrogram(x)
dim(spec)
[1] 513 257

Like we do with wavelet diagrams, we plot frequencies on a log scale.

bins  1:dim(spec)[1]
freqs  bins * fs / fft_size
log_freqs  log10(freqs)

frames  1:(dim(spec)[2])
seconds  (frames / dim(spec)[2])  * (dim(x)[1] / fs)

picture(x = seconds,
      y = log_freqs,
      z = t(as.matrix(spec)),
      ylab = 'log frequency [Hz]',
      xlab = 'time [s]',
      col = hcl.colours(12, palette = "Mild grays")
)
principal  paste0("Spectrogram, window dimension = ", window_size)
sub  "Magnitude (sq. root)"
mtext(facet = 3, line = 2, at = 0, adj = 0, cex = 1.3, principal)
mtext(facet = 3, line = 1, at = 0, adj = 0, cex = 1, sub)
Chaffinch’s song, spectrogram.

The spectrogram already exhibits a particular sample. Let’s see what might be completed with wavelet evaluation. Having experimented with a couple of totally different Ok, I agree with Vistnes that Ok = 48 makes for a wonderful alternative:

f_start  1800
f_end  8500

Ok  48
c(grid, freqs) % wavelet_grid(x, Ok, f_start, f_end, fs)
plot_wavelet_diagram(
  torch_tensor(1:dim(grid)[2]),
  freqs, grid, Ok, fs, f_end,
  kind = "magnitude_sqrt"
)
Chaffinch’s song, wavelet diagram.

The achieve in decision, on each the time and the frequency axis, is totally spectacular.

Thanks for studying!

Picture by Vlad Panov on Unsplash

Vistnes, Arnt Inge. 2018. Physics of Oscillations and Waves. With Use of Matlab and Python. Springer.

RELATED ARTICLES

LEAVE A REPLY

Please enter your comment!
Please enter your name here

- Advertisment -
Google search engine

Most Popular

Recent Comments