Sample entropy

Sample entropy (SampEn) is a modification of approximate entropy (ApEn) that is used extensively to assess the complexity of physiological time-series signals and thereby to help diagnose diseased states.[1] Unlike ApEn, SampEn is largely independent of data length and is straightforward to implement. There is also a small computational difference: in ApEn, the comparison between the template vector (see below) and the rest of the vectors also includes a comparison of the template with itself. This guarantees that the probabilities C_{i}'^{m}(r) are never zero, so it is always possible to take their logarithm. Because these self-matches lower ApEn values, signals are interpreted as more regular than they actually are. SampEn does not count self-matches.
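The effect of the self-match can be illustrated with a minimal Python sketch. The helper names `chebyshev` and `match_counts` are hypothetical, the Chebyshev distance with a strict `< r` comparison follows the definition given later in this article, and r is assumed to be positive.

```python
def chebyshev(u, v):
    """Largest absolute componentwise difference between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))

def match_counts(x, m, r, i):
    """Count templates within distance r of template i.

    Returns (count including the self-match, count excluding it).
    Assumes r > 0, so a template always matches itself (distance 0).
    """
    templates = [x[k:k + m] for k in range(len(x) - m + 1)]
    with_self = sum(1 for t in templates if chebyshev(templates[i], t) < r)
    return with_self, with_self - 1

# No other template is close to the first one, yet the ApEn-style
# count is still 1 because of the self-match.
print(match_counts([0, 5, 10, 15], m=2, r=1, i=0))  # (1, 0)
```

With the self-match the count can never reach zero, which is what keeps the logarithm defined in ApEn. Without it, the count can be zero, as in this example.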

There is a multiscale version of SampEn as well, suggested by Costa and others.[2]

Definition

Like approximate entropy (ApEn), sample entropy (SampEn) is a measure of complexity, but unlike ApEn it does not count self-matches. For a given embedding dimension m, tolerance r and number of data points N, SampEn is the negative logarithm of the probability that if two sets of simultaneous data points of length m have distance < r then two sets of simultaneous data points of length m+1 also have distance < r. It is denoted SampEn(m,r,N) (or SampEn(m,r,\tau,N) when the sampling time \tau is included).

Now assume we have a time-series data set of length N, { \{ x_1 , x_2 , x_3 , . . . , x_N \} }, with a constant time interval \tau. We define a template vector of length m, such that X_m (i)={ \{ x_i , x_{i+1} , x_{i+2} , . . . , x_{i+m-1} \} }, and take the distance function d[X_m(i),X_m(j)] (i≠j) to be the Chebyshev distance (but it could be any distance function, including Euclidean distance). We count the number of template vector pairs of length m having d[X_m(i),X_m(j)] < r and denote it by B; likewise, we count the number of template vector pairs of length m+1 having d[X_{m+1}(i),X_{m+1}(j)] < r and denote it by A. We define the sample entropy to be



SampEn=-\log {A \over B}

where

 A = number of template vector pairs of length m+1 having d[X_{m+1}(i),X_{m+1}(j)] < r

 B = number of template vector pairs of length m having d[X_m(i),X_m(j)] < r

It is clear from the definition that A will always be smaller than or equal to B. Therefore, SampEn(m,r,\tau) will always be either zero or positive. A smaller value of SampEn indicates more self-similarity in the data set or less noise.
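The definition can be sketched in a few lines of Python. This is a minimal, unoptimised illustration rather than a reference implementation. The function names are hypothetical. Two choices are assumptions not fixed by the text above: only the first N-m templates are used for both lengths, so that A and B are counted over the same index pairs, and the cases B = 0 and A = 0 return NaN and infinity respectively.

```python
import math

def chebyshev(u, v):
    """Chebyshev distance: largest absolute componentwise difference."""
    return max(abs(a - b) for a, b in zip(u, v))

def count_pairs(x, length, n_templates, r):
    """Count template pairs (i < j, so no self-matches) with distance < r."""
    templates = [x[i:i + length] for i in range(n_templates)]
    count = 0
    for i in range(n_templates):
        for j in range(i + 1, n_templates):
            if chebyshev(templates[i], templates[j]) < r:
                count += 1
    return count

def sample_entropy(x, m=2, r=0.2):
    """SampEn = -log(A / B) for embedding dimension m and tolerance r."""
    n_templates = len(x) - m              # assumption: same count for m and m+1
    b = count_pairs(x, m, n_templates, r)      # matches of length m
    a = count_pairs(x, m + 1, n_templates, r)  # matches of length m+1
    if b == 0:
        return math.nan   # assumption: undefined when nothing matches at length m
    if a == 0:
        return math.inf   # assumption: -log(0) reported as infinity
    return -math.log(a / b)

# A perfectly periodic signal: every length-m match extends to length m+1,
# so A equals B and the sample entropy is zero.
periodic = [0, 1] * 10
print(sample_entropy(periodic, m=2, r=0.5) == 0)  # True
```

Because a pair that matches at length m+1 also matches at length m, the count A cannot exceed B in this sketch, which is why the result is never negative.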

Generally we take the value of m to be 2 and the value of r to be 0.2 \times std, where std stands for the standard deviation, which should be taken over a very large dataset. For instance, an r value of 6 ms is appropriate for sample entropy calculations of heart rate intervals, since this corresponds to 0.2 \times std for a very large population.
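As a small illustration of this rule of thumb, the tolerance can be derived from the standard deviation as sketched below. The function name `tolerance` and the sample values are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption. In practice the standard deviation should come from a very large dataset, not from the short series being analysed.

```python
import statistics

def tolerance(values, factor=0.2):
    """Return r = factor * std, using the population standard deviation."""
    return factor * statistics.pstdev(values)

# Hypothetical heart rate intervals in milliseconds, for illustration only.
rr_intervals_ms = [800, 810, 790, 830, 770]
r = tolerance(rr_intervals_ms)
print(r)
```

By the same relation, the r value of 6 ms quoted for heart rate intervals corresponds to a standard deviation of 6 / 0.2 = 30 ms.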

Multiscale SampEn

The definition mentioned above is a special case of multiscale SampEn with \delta=1, where \delta is called the skipping parameter. In multiscale SampEn, template vectors are defined with a fixed interval between successive elements, specified by the value of \delta. The modified template vector is defined as
X_{m,\delta}(i)=\{x_i,x_{i+\delta},x_{i+2\times\delta},...,x_{i+(m-1)\times\delta}\}
and SampEn can be written as
SampEn \left ( m,r,\delta \right )=-\log { A_\delta \over B_\delta }
where A_\delta and B_\delta are calculated as before.
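The skipping-parameter variant described in this section can be sketched as follows. The names `delayed_template` and `sample_entropy_delta` are hypothetical. As an assumption, only the first N - m\delta start indices are used so that templates of length m and m+1 both fit inside the series, and the cases B_\delta = 0 and A_\delta = 0 return NaN and infinity respectively.

```python
import math

def delayed_template(x, i, length, delta):
    """Template starting at index i whose elements are delta samples apart."""
    return [x[i + k * delta] for k in range(length)]

def sample_entropy_delta(x, m=2, r=0.2, delta=1):
    """SampEn(m, r, delta) = -log(A_delta / B_delta), Chebyshev distance."""
    n_templates = len(x) - m * delta  # assumption: same count for both lengths

    def count_pairs(length):
        templates = [delayed_template(x, i, length, delta)
                     for i in range(n_templates)]
        return sum(
            1
            for i in range(n_templates)
            for j in range(i + 1, n_templates)  # i < j excludes self-matches
            if max(abs(a - b) for a, b in zip(templates[i], templates[j])) < r
        )

    b = count_pairs(m)
    a = count_pairs(m + 1)
    if b == 0:
        return math.nan   # assumption: undefined when no length-m matches exist
    if a == 0:
        return math.inf   # assumption: -log(0) reported as infinity
    return -math.log(a / b)

# With delta = 2 the template skips every other sample.
print(delayed_template([10, 11, 12, 13, 14, 15], 0, 3, 2))  # [10, 12, 14]
```

Setting `delta=1` reduces the templates to consecutive samples, which recovers the ordinary definition.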

Implementation

Sample entropy can be implemented easily in many different programming languages; example implementations are available in Matlab and R.

References

  1. Richman, JS; Moorman, JR (2000). "Physiological time-series analysis using approximate entropy and sample entropy". American Journal of Physiology. Heart and Circulatory Physiology 278 (6): H2039–49. PMID 10843903.
  2. Costa, Madalena; Goldberger, Ary; Peng, C.-K. (2005). "Multiscale entropy analysis of biological signals". Physical Review E 71 (2). doi:10.1103/PhysRevE.71.021906.
This article is issued from Wikipedia - version of the Monday, February 15, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.