*Entropy measures quantify the uncertainty in the EEG, which roughly corresponds to the number of possible configurations of the signal, or its predictability. However, there are many method and parameter choices that can fundamentally change the result and its meaning.*

### The basic idea of entropy

Entropy as a concept originated in thermodynamics, where a typical physical interpretation of entropy is the disorder of a system, described by the probability distribution of the molecules of a gas or fluid. Shannon brought this concept into information theory and defined what is commonly known as statistical entropy,

H = -Σ p(x)log(p(x))

To make the concept of statistical entropy more intuitive, consider an experiment of choosing a number from a set S = {1, 2, 3} and the probabilities of choosing each number. Let's say that in one case we have equal probabilities of choosing each number as follows:

P(1) = 1/3, P(2) = 1/3, and P(3) = 1/3.

Substituting these values into the above equation (using the natural logarithm), the entropy H turns out to be about 1.1.

If instead you could only choose the number 1, there would be only one possibility. This corresponds to the following probabilities, for which H = 0:

P(1) = 1, P(2) = 0, and P(3) = 0.

Let's consider one more case, where you can choose 1 or 2, but not 3, so the probabilities are

P(1) = 1/2, P(2) = 1/2, and P(3) = 0.

Now it turns out that H = 0.69, which lies in between. Thus, entropy is maximal when the number of possibilities is greatest and minimal when there is only one possibility. Another way of looking at this is that entropy is maximal when all outcomes are equally likely, and therefore the degree of uncertainty or ignorance about the outcome is highest. When we are more sure of the result, the entropy is lower. When the entropy is zero, we have the case of maximum information: there is no need to perform the experiment, because we know that the result is always 1!
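The three cases above can be reproduced in a few lines of Python (the quoted values use the natural logarithm):

```python
import math

def shannon_entropy(probs):
    """Shannon entropy H = -sum(p * log(p)); terms with p = 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

print(shannon_entropy([1/3, 1/3, 1/3]))  # all outcomes equally likely: ln(3) ~ 1.099
print(shannon_entropy([1, 0, 0]))        # only one possibility: 0.0
print(shannon_entropy([1/2, 1/2, 0]))    # two possibilities: ln(2) ~ 0.693
```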

Therefore, entropy simply quantifies uncertainty or ignorance in a statistical sense, which corresponds to the number of possible configurations of the system. This is a fairly rough interpretation, but it helps build some intuition about the concept.

**EEG entropy**

Applying the concept of entropy to a time series such as the electroencephalogram (EEG) is a way of quantifying, in a statistical sense, the amount of uncertainty or randomness in the pattern, which is also roughly equivalent to the amount of information contained in the signal. Time-domain entropy measures typically divide the signal into segments that are then compared for similarity, either directly (in the time domain) or after some kind of signal transformation (such as the power spectral density). This usually depends on a few fundamental parameters: the length of the chosen segment, the signal transformation (if any), and the distance metric, i.e., the way the segments are compared. Other types of entropy first transform the EEG signal into the frequency domain, using methods such as the Fourier transform or more complex ones such as wavelets, and operate on these transformed features.

By making these various choices, each entropy measure makes implicit assumptions about which aspect of the signal is meaningful or important to quantify. Some entropy measures analyze the vector distance between segments in the time domain, while others analyze transformed elements of the signal, such as the spectral content or an oscillatory or wavelet component. This, of course, has implications: if you choose an irrelevant aspect of the signal, the measure can lose meaning. Consider, for example, if you wanted to compare words, but you didn't know they were words, and instead noticed how the symbols for each letter differ from one another: b might be more similar in shape to d than to x, but the meaning would be lost. The point of the language analogy is that, *a priori*, we do not fully know which aspect of the EEG signal is relevant.

So where has this been useful for the EEG? So far, its most significant impact has been in anesthesia, where most entropy measures decrease with anesthetic depth, suggesting that the signal becomes more predictable or repetitive the deeper one goes. It has also been used to classify disease states such as schizophrenia. Of course, there may be other aspects of cognition where entropy is relevant as well.

**Assumptions and problems**

Assuming that the chosen aspect of the signal is relevant, there are several issues to take into account. One is noise in the signal, which will affect the entropy measure, particularly time-domain measures with shorter segment lengths. Another common assumption is that the signal in question is stationary, that is, that the statistical distributions of values across windows are identical if the data is split into multiple windows. This is generally not the case, particularly for EEG signals and power spectrum calculations. A common way to mitigate this problem is to split the data into multiple "stationary" segments, but there is no real consensus on what the length of such a segment should be for EEG data. Furthermore, these methods typically require fairly large amounts of data (which may violate the stationarity requirement), making their application to experimental EEG data challenging. Although sample entropy (SampEn) has been shown to be more consistent and to work well with shorter data lengths, low signal-to-noise ratio remains a problem.

In summary, entropy measurements are used to quantify, in a statistical sense, the irregularity or uncertainty in a biological signal such as the EEG. This can be done in the time or frequency domain. The choice to use a transform and all the associated parameters make implicit assumptions about what is important in the signal and can lead to very different results.

We provide a tutorial on common entropy measurements below.

**Entropy in the time domain**

Two of the popular options in the time domain include approximate entropy and sample entropy. They are used to quantify the amount of repeatability or predictability in the waveform patterns of an EEG signal.

The approximate or sample entropy calculation depends on three parameters: 1) the window length m (the length of the segment used for comparison), 2) the similarity threshold r, and 3) the data length N. It basically works as follows:

Given an EEG time series of N points, we create a series of smaller segments of length m. We call each of these segments **X**(i), which is nothing but a data block of length m starting at time point i in the EEG. We do this for every point i, from the first to the last point at which a segment of length m still fits. For an EEG signal of length N, there will be N-m+1 of these segments (i.e., **X**(1), **X**(2), … **X**(N-m+1)). We are then looking for an answer to this question: **how similar is the segment X(i) to the rest of the segments?** Here similarity is defined using the threshold r applied to a measure of the distance between two data segments (essentially the vector distance, or the distance between two m-dimensional points). If the distance is less than the threshold r, we assign a value of 1 (i.e., the segments are similar); otherwise we assign 0. For each segment **X**(i) we then calculate a quantity C(i,r,m), which is the fraction of segments similar to **X**(i):

*C(i,r,m) = (number of segments similar to segment X(i)) / (total number of segments).*

We now average the logarithm of these similarity fractions for all segments (the logarithm ensures that very small fractions do not drastically distort the distribution).

*A(m,r) = 1/(N-m+1) * Σ log (C(i,r,m))*(remember that N-m+1 is the number of segments)

Now we repeat the above procedure with segment length m+1 and analogously define B(m+1,r). The approximate entropy is then calculated as

*AppEn(m,r) = A(m,r) – B(m+1,r)*

The smaller the approximate entropy value, the more regular or repetitive the signal. Random or irregular signals tend to have higher approximate entropy values. This makes sense: if the signal is repetitive, there won't be much difference between block sizes m and m+1 when the above statistic is calculated. Also note that, to avoid the situation where there are no segments similar to **X**(i), which would result in log(0) (undefined!), approximate entropy counts self-matches, and this can introduce bias.
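As a sketch, the procedure above can be written in a few lines of NumPy. Two choices here are assumptions not fixed by the text: the Chebyshev (maximum-coordinate) distance between segments, and r given in the signal's own units (in practice r is often set to about 0.2 times the signal's standard deviation):

```python
import numpy as np

def approximate_entropy(x, m=2, r=0.2):
    """Approximate entropy of a 1-D signal; self-matches are included."""
    x = np.asarray(x, dtype=float)
    N = len(x)

    def phi(mm):
        # All overlapping segments of length mm: X(1) ... X(N-mm+1)
        segs = np.array([x[i:i + mm] for i in range(N - mm + 1)])
        # Chebyshev distance between every pair of segments
        dist = np.max(np.abs(segs[:, None, :] - segs[None, :, :]), axis=2)
        # C(i,r,m): fraction of segments within r of segment i (self included)
        C = np.mean(dist <= r, axis=1)
        return np.mean(np.log(C))  # A(m,r) in the text

    return phi(m) - phi(m + 1)

regular = np.sin(np.linspace(0, 20 * np.pi, 500))          # repetitive signal
irregular = np.random.default_rng(0).standard_normal(500)  # irregular signal
print(approximate_entropy(regular))    # small
print(approximate_entropy(irregular))  # larger
```

Because self-matches are included, C(i,r,m) is never zero and the logarithm is always defined, at the cost of the bias mentioned above.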

Sample entropy differs from approximate entropy in two ways:

1) It does not count self-matches, and

2) It does not take the logarithm of each segment's similarity fraction; instead, a single logarithm is applied to the ratio of the two statistics.

Instead, we try to calculate the statistic A(m,r) as

*A(m,r) = (number of X(j) vectors within the threshold r of the X(i) vector) / (N-m)*

where j = 1…N-m and j ≠ i (excluding the self-match). Similarly, one can define B(m+1,r) and calculate the sample entropy as

*SampEn(m,r) = -log(B(m+1,r) / A(m,r))*

Again, smaller sample entropy values indicate repeatability in a signal, and larger values indicate irregularity.
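A matching sketch for sample entropy, under the same assumed conventions (Chebyshev distance, r in absolute units): self-matches are excluded, and a single logarithm is taken of the ratio of the total match counts at segment lengths m+1 and m:

```python
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of a 1-D signal; self-matches are excluded."""
    x = np.asarray(x, dtype=float)
    N = len(x)

    def count_matches(mm):
        # Overlapping segments of length mm
        segs = np.array([x[i:i + mm] for i in range(N - mm)])
        dist = np.max(np.abs(segs[:, None, :] - segs[None, :, :]), axis=2)
        # Pairs within threshold r, minus the diagonal (self-matches)
        return np.sum(dist <= r) - len(segs)

    # Matches become scarcer as segments lengthen, so this ratio is <= 1
    return -np.log(count_matches(m + 1) / count_matches(m))

regular = np.sin(np.linspace(0, 20 * np.pi, 500))
irregular = np.random.default_rng(1).standard_normal(500)
print(sample_entropy(regular))    # small: repetitive signal
print(sample_entropy(irregular))  # larger: irregular signal
```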

**Entropy in the frequency domain**

The basic idea of calculating entropy in the frequency domain is to transform the signal from the time domain to the frequency domain using standard tools like the Fourier transform or more advanced methods like Wavelets. This gives rise to two different entropy measures: 1) spectral entropy and 2) total wavelet entropy.

*Spectral entropy*

Spectral entropy requires the power spectral density (PSD) of an EEG signal, which is obtained via the discrete Fourier transform (DFT). Given two frequency points of interest, say f1 and f2, the power spectrum between these frequencies is normalized and the spectral entropy is calculated using the Shannon entropy definition:

SE = -Σ P_norm log(P_norm),

where the sum is taken over all frequencies between f1 and f2. For a periodic single-frequency signal, SE will be close to zero (think of the experiment described above where P(1) = 1!), while for a white-noise (random) signal, SE values will be much larger, since random noise contains energy at all frequencies (just like the experiment above where all outcomes were equally likely).
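A sketch of spectral entropy, assuming a simple periodogram-style PSD taken directly from the FFT (a Welch-type estimate would be a common alternative); f1 and f2 bound the band of interest:

```python
import numpy as np

def spectral_entropy(x, fs, f1=0.0, f2=None):
    """Shannon entropy of the normalized power spectrum between f1 and f2 (Hz)."""
    x = np.asarray(x, dtype=float)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2        # periodogram-style PSD estimate
    if f2 is None:
        f2 = fs / 2.0                        # default: up to the Nyquist frequency
    band = (freqs >= f1) & (freqs <= f2)
    p_norm = psd[band] / psd[band].sum()     # normalize to a probability distribution
    p_norm = p_norm[p_norm > 0]              # avoid log(0)
    return -np.sum(p_norm * np.log(p_norm))

fs = 250                                      # assumed sampling rate, Hz
t = np.arange(0, 4, 1 / fs)
tone = np.sin(2 * np.pi * 10 * t)             # single 10 Hz oscillation
noise = np.random.default_rng(0).standard_normal(len(t))
print(spectral_entropy(tone, fs))             # close to zero
print(spectral_entropy(noise, fs))            # much larger
```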


*Total wavelet entropy*

One can use wavelets to decompose an EEG signal into various levels of resolution and calculate the relative energy for each 'j' level as

*p(j) = (Energy at level j) / (Total energy of all levels)*

Now the total wavelet entropy given by Shannon is defined as

TWE = -Σ p(j)log(p(j))

where the sum is taken over all decomposed levels. TWE measures the amount of order/disorder in a signal. As with spectral entropy, a sinusoidal signal will have a very low TWE value, close to zero, while a random signal whose energy is spread across all bands will have a high TWE value.
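A sketch of total wavelet entropy using a hand-rolled Haar decomposition so the example stays dependency-free (in practice a wavelet library such as PyWavelets would typically be used); the relative energies p(j) come from the detail coefficients at each level plus the final approximation:

```python
import numpy as np

def haar_level_energies(x, levels):
    """Energy of the Haar detail coefficients at each level, plus the final approximation."""
    a = np.asarray(x, dtype=float)
    energies = []
    for _ in range(levels):
        a = a[: (len(a) // 2) * 2]                  # truncate to even length for pairing
        detail = (a[0::2] - a[1::2]) / np.sqrt(2)   # high-pass branch (level j)
        a = (a[0::2] + a[1::2]) / np.sqrt(2)        # low-pass branch (approximation)
        energies.append(np.sum(detail ** 2))
    energies.append(np.sum(a ** 2))                 # final approximation level
    return np.array(energies)

def total_wavelet_entropy(x, levels=5):
    p = haar_level_energies(x, levels)
    p = p / p.sum()                                 # relative energy per level
    p = p[p > 0]                                    # avoid log(0)
    return -np.sum(p * np.log(p))

fs = 256                                             # assumed sampling rate, Hz
t = np.arange(0, 2, 1 / fs)
slow = np.sin(2 * np.pi * 2 * t)                     # energy concentrated in few levels
noise = np.random.default_rng(0).standard_normal(len(t))
print(total_wavelet_entropy(slow))    # low
print(total_wavelet_entropy(noise))   # higher: energy spread across levels
```

The Haar transform is orthonormal, so the level energies sum to the total signal energy, which makes p(j) a proper probability distribution.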