Physica A 569 (2021) 125770
Quantifying the randomness of the forex market
Alfonso Delgado-Bonal a,1, Álvaro García López b,1
a Gentleman Scientist, 53C Crescent Road, Greenbelt, MD, USA
b Universidad Rey Juan Carlos, Nonlinear Dynamics, Chaos and Complex System Group, Madrid, Spain
Article info
Article history:
Received 18 October 2020
Received in revised form 8 January 2021
Available online 18 January 2021
Keywords:
Complexity
Econophysics
Information Theory
Randomness
Abstract
Currency markets are international networks of participants, open around the clock on weekdays and without a
supervisory entity. The precise value of an exchange pair is determined by the decisions of central banks and the
behavior of speculators, whose actions can be taken on the spot or be related to previous decisions. All those
decisions affect the complexity and predictability of the system, which are quantitatively analyzed in this paper.
For this purpose, we compare the randomness of the most traded currencies in the forex market using the Pincus
Index. We extend the development of this methodology to include multidimensionality in the embedding dimension,
to capture the influence of the past on current decisions, and to analyze different frequencies within the data
with a multiscale approach. We show that, in general, the forex market is more predictable using one-hour ticks
than using daily data for the six major pairs, and present evidence suggesting that the variance is easier to
predict for longer time frames.
© 2021 Elsevier B.V. All rights reserved.
1. Introduction
The foreign exchange market (forex) is the largest market in the world and determines the exchange rates for every
currency. The participants in the market set the relative value of each currency pair by buying or selling positions, and
these actions are influenced by personal beliefs, trends or public announcements of central banks.
Unlike stock markets, forex is not subject to a specific supervisory entity and is globally decentralized, open
to banks, commercial companies and private agents. The price of a currency pair at a given time is supposed to be a
reflection of economic factors, political conditions and the psychology of the participants.
The six most traded forex pairs analyzed in this paper are: EUR/USD (euro/US dollar); USD/JPY (US dollar/Japanese
yen); GBP/USD (British pound sterling/US dollar); AUD/USD (Australian dollar/US dollar); USD/CAD (US dollar/Canadian
dollar); USD/CNY (US dollar/Chinese renminbi).
The forex market has been the subject of intensive research for a long time, frequently focusing on the predictability
of its values but also on its variance. In this paper, we analyze the number of patterns within the data for the returns and
their variance in different timeframes: 1-hour (H1), 4-hour (H4) and daily.
Research on complexity includes a variety of algorithms and analysis techniques that usually come from the physical
or mathematical realm [1]. Complex systems are composed of nonlinearly interacting elements and are found in fields
as different as the brain or the financial markets. A review of the meaning of complexity and detailed analyses of those
and other processes can be found in [2].
Corresponding author. E-mail address: contact@adelgadobonal.com (A. Delgado-Bonal).
1 Both authors contributed equally to this work.
https://doi.org/10.1016/j.physa.2021.125770
When dealing with the currency market, its movements can be separated into high-frequency variations [3] and slower
movements responsible for the trends [2]. This paper deals with the latter, and explores its complexity relying on the
concept of entropy as defined in Information Theory and Kolmogorov complexity.
In the field of Information Theory, entropy is a magnitude that quantifies the uncertainty of a measure. On the
other hand, this paper follows the approach of Chaitin [4] and Kolmogorov [5] by defining complexity as in algorithmic
information theory, which takes into account the order of the points in a sequence. In this view, a chain is random if its
Kolmogorov complexity is at least equal to the length of the chain.
The benefit of the connection between information content and randomness is that it provides a way to quantify the
complexity of a dataset without relying on models or hypotheses about the process generating the data. By comparing the
entropy of our system with the maximum possible entropy rate, we can determine the degree of randomness of a series;
a complex (totally random) process is defined as one lacking pattern repetition.
The use of the entropy rate to study the complexity of a time series is not limited to stochastic processes. Sinai [6]
introduced the concept of entropy to describe the structural similarity between different measure-preserving dynamical
systems, giving a generalization of Shannon entropy for dynamical systems known as Kolmogorov–Sinai entropy
(KS). Unfortunately, KS entropy is sometimes undefined for limited and noisy measurements of a signal represented in a
data series.
To overcome that limitation, Grassberger and Procaccia [7] used the Rényi entropy to define the correlation integral,
which in turn was used by Eckmann and Ruelle [8] to define the φ functions as a conditional probability. This ER entropy
is an exact estimation of the entropy of the system. Building upon those φ functions, Pincus [9] described the methodology
of ApEn, useful for limited and noisy data, providing a hierarchy of randomness based on the different patterns and their
repetitions.
ApEn measures the logarithmic probability that nearby pattern runs remain close in the next incremental comparison:
low ApEn values reflect that the system is very persistent, repetitive and predictable, with apparent patterns that repeat
themselves throughout the series, while high values mean complexity in the sense of independence between the data
and a low number of repeated patterns. The readers are encouraged to read a recent comprehensive tutorial on these
algorithms [10].
To use Approximate Entropy, it is necessary to specify two parameters, the embedding dimension (m) and the tolerance
of the measure (r), determined as a percentage of the standard deviation. Once the calculations have been performed, the
result of the algorithm is a positive real number, with higher values indicating more randomness. However, those values
are dependent on the characteristics of the dataset such as the influence of the past in the future prices or the volatility
of the prices.
In order to obtain a measure of randomness suitable for comparisons between evolving datasets, the Pincus Index (PI)
was introduced as a measure of the distance between a dataset and the maximum possible randomness of that system [11].
A value of PI equal to zero implies a totally ordered and completely predictable system, whereas a value equal to or greater
than one implies total randomness and unpredictability. The added benefit of the Pincus Index is that, unlike ApEn, it is
suitable for comparisons between different markets. This paper completes the development of that index by introducing
different kinds of multidimensionality in the measure. Thus, knowledge of the PI would be useful to fully understand the
concepts here presented and how the several levels of complexity of this measure are captured.
The Pincus Index was designed [11] to be independent of the parameter r by choosing the maximum value of
Approximate Entropy (MaxApEn), but the index is still dependent on the selection of the embedding dimension (m). This
parameter is related to the memory of the system and accounts for the length of the patterns compared in the sequence.
Techniques to determine the optimum value of the embedding dimension include the use of mutual information and the
false nearest neighbor method [12–14], but since different markets may have different embedding dimensions, comparisons
with a fixed m could be biased. To account for that possibility, we follow Bolea and coauthors [15] in the definition of
a Multidimensional index. Since such an index was based on MaxApEn, its extrapolation to a Multidimensional Pincus
Index is straightforward and provides a parameter-free index which allows for comparisons between evolving systems.
Besides multidimensionality in embedding dimension, dynamic systems may be composed of processes at different
frequencies with correlations at multiple time scales. Therefore, in the characterization of complexity, the comparison
of different frequencies may lead to incorrect conclusions. Costa and coauthors [16] proposed a multiscale procedure
to capture those correlations, showing its efficiency in distinguishing complexities in different dynamical regimes. To
describe the complexity of a time series at different levels, Costa and coauthors [17] generalized the multiscale procedure
to consider the complexity of higher statistical moments of time series. Here, we extend that methodology to create a new
Multiscale Pincus Index, showing how it is useful to correctly quantify the complexity of trading in different timeframes
and different statistical moments.
2. Methods and results
2.1. On the calculation of the Pincus Index
The Pincus Index (PI) captures the distance from a situation of total randomness for a given dataset, measured against
shuffled versions of the same data. To better quantify complexity and provide an index that is independent of the tolerance
Fig. 1. ApEn, SampEn, and asymptotic lines depending on the alphabet for 50 pseudo-random binary (left) and decimal (right) sequences.
r, it is constructed based on the maximum value of Approximate Entropy (MaxApEn). The steps to compute the PI include
the determination of the MaxApEn of the original sequence and the MaxApEn of bootstrapped versions. Then, we use the
median (50th percentile) of the empirical distribution of the bootstrapped versions to calculate the value of the
Pincus Index, and the 5th and 95th percentiles of the empirical cumulative distribution function to calculate the extremes
of the index. The rationale is simple: if the degree of randomness of the original sequence is similar to that of the shuffled
versions, the PI will be close to one, indicating randomness. If, on the other hand, the original sequence is ordered, the
PI will capture the distance from randomness as a fraction. For a detailed explanation of the methodology and several
examples of application, the reader is encouraged to see [10,11].
The Pincus Index is based on Approximate Entropy. When the number of data points (N) is large, ApEn can be approximated
by Eq. (1). The error committed in this approximation is estimated to be smaller than 0.05 for N − m + 1 > 90 and smaller
than 0.02 for N − m + 1 > 283 [18].

$$\mathrm{ApEn}(m, r, N) \approx \frac{-1}{N-m} \sum_{i=1}^{N-m} \log \frac{\sum_{j=1}^{N-m} \big[\text{times that } d[|x_{m+1}(j) - x_{m+1}(i)|] < r\big]}{\sum_{j=1}^{N-m} \big[\text{times that } d[|x_{m}(j) - x_{m}(i)|] < r\big]} \tag{1}$$

where m is the length of the vectors being compared, and d measures the scalar distance between the vectors in a
component-wise way.
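As an illustration, a direct transcription of Eq. (1) might look as follows; this is a minimal sketch in Python/NumPy, with function and variable names of our choosing, and it assumes the customary Chebyshev (maximum component-wise) distance for d:

```python
import numpy as np

def apen(x, m, r):
    """Approximate Entropy via the large-N approximation of Eq. (1).

    x: 1-D data array; m: embedding dimension; r: tolerance in
    absolute units (e.g. 0.2 * np.std(x)).
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    # Build the N - m overlapping templates of lengths m and m + 1,
    # so numerator and denominator counts run over the same indices.
    xm = np.array([x[i:i + m] for i in range(N - m)])
    xm1 = np.array([x[i:i + m + 1] for i in range(N - m)])
    # Component-wise (Chebyshev) distance between every template pair.
    dm = np.max(np.abs(xm[:, None, :] - xm[None, :, :]), axis=2)
    dm1 = np.max(np.abs(xm1[:, None, :] - xm1[None, :, :]), axis=2)
    # Counts of matching templates within tolerance r; self-matches
    # (j == i) are included, which is the bias discussed below.
    A = np.sum(dm < r, axis=1)    # matches of length m
    B = np.sum(dm1 < r, axis=1)   # matches of length m + 1
    return -np.mean(np.log(B / A))
```

Because every template matches itself for r > 0, the counts never vanish and the logarithm is always defined.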
The Sample Entropy (SampEn) algorithm has been designed to avoid the self-bias included in ApEn [18], which is
mathematically formulated as [10]:
$$\mathrm{SampEn}(m, r, N) = -\log \frac{\sum_{i=1}^{N-m} \sum_{j=1, j \neq i}^{N-m} \big[\text{times that } d[|x_{m+1}(j) - x_{m+1}(i)|] < r\big]}{\sum_{i=1}^{N-m} \sum_{j=1, j \neq i}^{N-m} \big[\text{times that } d[|x_{m}(j) - x_{m}(i)|] < r\big]} \tag{2}$$
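In the same spirit, a sketch of Eq. (2) differs from the previous one only in excluding self-matches and in taking a single logarithm of the pooled counts; again, a minimal illustrative implementation rather than a reference one:

```python
import numpy as np

def sampen(x, m, r):
    """Sample Entropy per Eq. (2): self-matches are excluded and the
    logarithm is taken of the ratio of pooled counts."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    xm = np.array([x[i:i + m] for i in range(N - m)])
    xm1 = np.array([x[i:i + m + 1] for i in range(N - m)])
    dm = np.max(np.abs(xm[:, None, :] - xm[None, :, :]), axis=2)
    dm1 = np.max(np.abs(xm1[:, None, :] - xm1[None, :, :]), axis=2)
    np.fill_diagonal(dm, np.inf)   # enforce the j != i of Eq. (2)
    np.fill_diagonal(dm1, np.inf)
    A = np.sum(dm < r)             # pooled matches of length m
    B = np.sum(dm1 < r)            # pooled matches of length m + 1
    return -np.log(B / A)
```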
It is often said that SampEn is largely independent of the number of points because, unlike ApEn, it does not include
a prefactor $\frac{1}{N-m}$. However, it must be noticed that such independence is only true for homogeneous series, and it does
not hold in general situations [19]. In general, randomness depends on m and N for both algorithms [20].
In Fig. 1 we show the different behavior of ApEn and SampEn for 50 pseudo-random binary (left) and decimal (right)
chains using m = 2; we use r < 1 to make the analysis independent of this parameter, given their well-defined alphabet.
As can be seen, the mean value of SampEn reaches the asymptotic limit of log k faster but with a larger standard
deviation than ApEn.
In the construction of the Pincus Index, we calculate the ratio $\mathrm{MaxApEn}_{\mathrm{original}}/\mathrm{MaxApEn}_{\mathrm{shuffled}}$. Since those quantities
are calculated using the same values of m and N, the ratio between them does not include the prefactor $\frac{1}{N-m}$ appearing
in ApEn in Eq. (1). This fact makes the PI independent of the number of points in the same way as SampEn (i.e., for
white noise, or a homogeneously generated sequence, it captures the randomness independently of N).
Another reason for the construction of SampEn was the self-counting introduced in the calculation of ApEn: note that
the definition of SampEn explicitly avoids that situation by requiring j ≠ i in Eq. (2). That bias can be as high as 20% or 30%
if the number of points is low [18]. In this regard, since the PI is constructed as a ratio and the bias in ApEn is present in
both the numerator and the denominator, the overall bias is modulated and severely reduced, providing a better measure
of complexity. It should be emphasized that the PI does not measure randomness specifically but how far away a series
is from total randomness.
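To make the preceding steps concrete, the following minimal sketch assembles the full PI computation. The r grid, the number of shuffles and the seed are illustrative choices on our part, not values prescribed by the method; apen_fn is an ApEn routine such as the Eq. (1) sketch above:

```python
import numpy as np

def max_apen(x, m, r_grid, apen_fn):
    """MaxApEn: the maximum of ApEn over a grid of tolerances r."""
    return max(apen_fn(x, m, r) for r in r_grid)

def pincus_index(x, m, apen_fn, n_shuffles=100, seed=0):
    """PI sketch: MaxApEn of the original series divided by the
    median MaxApEn of shuffled surrogates; the 5th and 95th
    percentiles of the surrogate distribution give the extremes."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Illustrative tolerance grid in units of the standard deviation.
    r_grid = np.std(x) * np.linspace(0.01, 1.0, 50)
    original = max_apen(x, m, r_grid, apen_fn)
    shuffled = np.array([max_apen(rng.permutation(x), m, r_grid, apen_fn)
                         for _ in range(n_shuffles)])
    pi = original / np.median(shuffled)
    # Larger surrogate MaxApEn lowers the ratio, hence the ordering.
    low, high = original / np.percentile(shuffled, [95, 5])
    return pi, low, high
```

A typical call would be pincus_index(returns, 2, apen), which returns the index together with its extremes.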
Fig. 2. Value of r for which MaxApEn is reached for the original series (red line) and the average of one hundred shuffled versions (black line). The
average of the standard deviations (bars) for the shuffled versions is 0.038 and increases with the embedding dimension.
2.2. The threshold: $r_{max}$ and MaxApEn
An incorrect parameter selection when using ApEn or SampEn can lead to inaccurate estimations of the
complexity of datasets. By means of MaxApEn we can prevent the arbitrary selection of the threshold r, which changes
depending on the complexity of the sequence.
Restrepo et al. [21] showed that the combined use of MaxApEn and $r_{max}$ can help to correctly characterize the
complexity. Using a dataset containing daily values of EURUSD from 2006 to 2010, we show in Fig. 2 that $r_{max}$ changes with
the embedding dimension selected. The distance between the maximum value of the threshold for the original (red line)
and the shuffled series (black line) shows that using a fixed common value for the threshold would lead to misleading
results. Thus, even though the recommended range for r is commonly [0.1σ, 0.25σ], that region does not guarantee
capturing the complexity correctly for all values of the embedding dimension. It is advised to use a value equal to or
greater than $r_{max}$ [10] to ensure relative consistency; the comparison with different values of r beyond
the maximum would lead to the same qualitative characterization of the order of the system.
As explained in [21], the differences in $r_{max}$ for the original and shuffled versions can be used as a means to discern
between systems in noisy datasets with a low number of samples N. Although in this work we focus on the development of
the Pincus Index, we take the opportunity to recall that, for some dynamical regimes, these combined techniques could
provide a better characterization of the systems, and show that the recommended range may not be adequate depending
on the embedding dimension m.
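As a sketch of this diagnostic, the curves of Fig. 2 can be traced as follows; the routine is parameterized by an ApEn implementation such as the Eq. (1) sketch, and the grid and number of shuffles are illustrative:

```python
import numpy as np

def r_max(x, m, r_grid, apen_fn):
    """Return the tolerance at which ApEn peaks, and the peak value
    (MaxApEn), for a given embedding dimension m."""
    vals = np.array([apen_fn(x, m, r) for r in r_grid])
    return r_grid[np.argmax(vals)], vals.max()

def r_max_profile(x, m_range, r_grid, apen_fn, rng, n_shuffles=100):
    """r_max versus m for the original series and the average over
    shuffled surrogates: the two curves compared in Fig. 2."""
    original = [r_max(x, m, r_grid, apen_fn)[0] for m in m_range]
    shuffled = [np.mean([r_max(rng.permutation(x), m, r_grid, apen_fn)[0]
                         for _ in range(n_shuffles)])
                for m in m_range]
    return np.array(original), np.array(shuffled)
```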
2.3. The embedding dimension: multidimensional analysis
In the methodology to calculate the Pincus Index, the tolerance r is automatically selected as the value which
maximizes Approximate Entropy [11]. However, the selection of the embedding dimension is a requirement for the
calculations. The embedding dimension determines the length of the patterns being compared, and it is related to how
much information from the past is used to determine the future values. In the search for a parameter-free application
of Approximate Entropy, Bolea et al. [15] proposed the use of MaxApEn combined with a multidimensional analysis by
adding the contribution of MaxApEn over a wide range of embedding dimensions to capture the influence of previous
values.
Since the memory of the system is a priori unknown and may change in evolving datasets like the forex markets,
adopting the same methodology as Bolea and coauthors, it is straightforward to build a Multidimensional Pincus Index
(MPI) independent of both r and m, by defining:
$$\mathrm{MPI} = \frac{\sum_{m_i=1}^{m_{max}} \mathrm{MaxApEn}_{\mathrm{original}}(m_i)}{\sum_{m_i=1}^{m_{max}} \mathrm{MaxApEn}_{\mathrm{shuffled}}(m_i)} \tag{3}$$
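Given arrays of MaxApEn values per embedding dimension (as produced, for instance, by the max_apen sketch of Section 2.1), Eq. (3) reduces to a ratio of sums; a minimal sketch:

```python
import numpy as np

def mpi(maxapen_original, maxapen_shuffled):
    """Eq. (3): Multidimensional Pincus Index. Both arguments hold
    MaxApEn values for embedding dimensions m_i = 1..m_max; the
    shuffled array would typically contain, per dimension, the
    median over many shuffled surrogates."""
    return np.sum(maxapen_original) / np.sum(maxapen_shuffled)
```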
We illustrate the behavior of this new multidimensional index in Fig. 3 using the EURUSD exchange rate as an example.
Fig. 3 (left) shows the MaxApEn for different embedding dimensions for the original series (black line) and pseudo-
randomized versions of the same data (red) for a dataset containing daily values of EURUSD from 2006 to 2010. Based
on those values, we show how the MPI changes when we consider only the previous m values using Eq. (3). We observe
that, by adding the contribution of larger embedding dimensions, the MPI varies to capture the increased information
in the complexity of the series. The right side of Fig. 3 shows the MPI accounting for the contribution of the embedding
dimension up to 15 (MPI($m_{max}$ = 15)) for rolling windows of four years of EURUSD daily exchange rate, i.e., approximately
N ≈ 1000 points.
Fig. 3. Multidimensional Pincus Index.
The rationale for the inclusion of multidimensionality is its ability to capture complexity to a greater extent, as shown
by Bolea et al. [15]. Specifically for the forex or stock markets, or when drawing comparisons between different systems, it
is not guaranteed that the optimal value of the embedding dimension would be the same. In general, randomness depends
on m and N, as shown by Pincus and coauthors when they defined the maximum {m, N}-randomness [20,22]. The value
of the Pincus Index is its aptness to make comparisons between systems by measuring the distance of each series against
the maximum randomness of each alphabet. We should remember that both ApEn and SampEn provide relative values, and
may be unsuitable for comparisons. By including multidimensionality in the definition of the MPI, we obtain an index
independent of preselected parameter values for both r and m which can be used with evolving datasets.
2.4. Sampling frequency: multiscale entropy
Another variable must be taken into account in order to capture complexity in all of its forms: the different
frequencies within the data. It is not uncommon for dynamical systems to be composed of subprocesses emerging at
different time scales. That situation is often observed in the markets when the trend at a certain frequency domain, let
us say 15 min, is not the same as (or is even the complete opposite of) the trend in daily data.
To account for that possibility, Costa et al. [16] designed the multiscale entropy (MSE) procedure based on the approach
proposed by Zhang [23,24]. This measure is based on a weighted sum of scale-dependent entropies, and it has been used
extensively since its appearance in the literature [25]. The main idea is the construction of coarse-grained time series
determined by a certain scale factor τ, averaging different time scales from the original time series. The coarse-graining
procedure reduces the length of the sequence by a scale factor τ, obtaining a coarse-grained time series of length N/τ, with
N the original length. Thus, the larger the scale factor used, the shorter the resulting length of the coarse-grained time
series.
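The coarse-graining step itself is compact; a minimal sketch of the procedure (names are ours):

```python
import numpy as np

def coarse_grain(x, tau):
    """Coarse-graining of Costa et al. [16]: average consecutive,
    non-overlapping windows of length tau. The result has length
    floor(N / tau), so larger scale factors give shorter series."""
    x = np.asarray(x, dtype=float)
    n = len(x) // tau
    return x[:n * tau].reshape(n, tau).mean(axis=1)
```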
This procedure has become a prevailing method to quantify the complexity of data series and it has been widely applied
in many different research fields, including finance [26]. After the creation of the coarse-grained sequences, the entropy
of each sequence is calculated and added up to obtain a multiscale entropy value. More detailed instructions on the
methodology can be found in the works of Costa and coauthors [16,27].
The MSE methodology has generally been applied in conjunction with Sample Entropy, given the above-mentioned fact
that it is less dependent on the time series length since it does not include a prefactor in Eq. (2). However, similarly to
the comparison of different time series, different time scales may have different alphabets and comparisons using
the same parameters may be biased. Some traded time frames will show higher variability, while the variations at
different frequencies may show smaller changes and averaged values. Furthermore, Sample Entropy uses a fixed value
of the tolerance filter r which may not be adequate for all frequencies. This hinders the applicability of Sample Entropy
to characterize the randomness level appropriately.
As seen in the previous sections, the value of r which captures the maximum complexity is different for each sequence;
since the Pincus Index is based on MaxApEn, which automatically adapts to the maximum complexity of each frequency,
this index is able to capture the distance from total randomness of the different frequencies. As an example, Fig. 4 shows
the results of the Pincus Index for values traded daily, at four-hour (H4) and at one-hour (H1) frequencies for m = 2.
We present the results for the six major traded pairs to display the evolution of the different frequencies.
In a previous communication we showed the effect of including a larger or smaller number of points (see Supplementary
Information in [11]). In this paper, our interest lies in characterizing the different frequencies knowing that approximately
the same number of points is used to draw comparisons about their complexity. To that end, we use rolling windows of
approximately N ≈ 1000 for all frequencies shown in Fig. 4, corresponding to four years in daily values, eight months in
Fig. 4. Pincus Index with m = 2 for Daily, H4 and H1 data for the six major forex pairs.
H4 data, and two months in H1 data. Importantly, it can be seen in the figure that, in general, one hour frequencies are
less random than daily values, but the specific values of the Pincus Index change with the epoch and the frequency for
each of the six considered markets. It is also noticeable that the shape of the PI for the EURUSD daily values presented in
Fig. 4(a) is the same as the shape shown for the multidimensional PI analysis in Fig. 3, but with higher values. As far as
we have computed, adding the information contained in higher embedding dimensions makes the sequence less random,
but does not change the overall shape of the analysis between different rolling windows.
As in previous observations when dealing with the stock markets [11], we find that the degree of predictability
increases in times of crisis or sharp changes in the values; in those moments, the market chains together falling sessions,
creating more patterns repeated by the agents. Previous studies with high-frequency data have examined the changes
during these kinds of events in the short-term reaction of the markets [3], and in our analysis, we find examples of that
behavior in Fig. 4(c) and (d). At those moments, the GBP and AUD pairs concatenated several months of depreciation with
respect to the USD, causing ApEn (the numerator of the Pincus Index) to decrease accordingly.
Fig. 5. Left: Multiscale PI for the first moment (m = 2). Right: Corresponding $r_{max}$, MaxApEn and SampEn(r = 0.15σ) for m = 2 for different τ using
the EURUSD logratio series for H1 from 2008 to 2010.
2.5. Higher statistical moments: Generalized Pincus Index
Finally, in this section we study a different type of dimensionality, this time by considering the statistical moments.
In the MSE procedure, the coarse-graining of the original series consists of averaging the time series, using the
first moment (mean). However, it is well known that, for time series in general and for financial time series in particular,
higher moments such as the variance, skewness or kurtosis contain valuable information different from the mean, which
can help to select τ.
Aware of this fact, Costa and coauthors extended their methodology to a Generalized Multiscale Entropy [17], MSEn,
where n is the order of the moment used in the calculations. In this generalized methodology, different moments are used
in the coarse-graining procedure, helping to characterize the time series based on information not contained in the first
moment. Hence, when the MSE is applied to higher moments such as skewness and kurtosis in financial time series,
the MSEn is more effective at capturing changes in the dynamics, providing valuable information [28]. Furthermore, some
authors have proposed the use of a Refined Generalized Multiscale Entropy [29].
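The generalization only changes the statistic computed in each window; a sketch in the spirit of MSEn, with the second central moment (variance) being the one used in the analyses below:

```python
import numpy as np

def coarse_grain_moment(x, tau, moment=2):
    """Generalized coarse-graining in the spirit of MSE_n [17]: each
    non-overlapping window of length tau is summarized by a central
    moment instead of the mean (moment=2 yields the variance series
    used for the second-moment analysis of Fig. 6)."""
    x = np.asarray(x, dtype=float)
    windows = x[:len(x) // tau * tau].reshape(-1, tau)
    if moment == 1:
        return windows.mean(axis=1)
    centered = windows - windows.mean(axis=1, keepdims=True)
    return np.mean(centered ** moment, axis=1)
```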
In Fig. 5, the PI is calculated for m = 2 for different coarse-graining values τ, with τ = 1 corresponding to hourly
data and τ = 24 to daily values. We can obtain useful information from this figure which helps us to understand the
dynamics within the data. First, by looking at Fig. 5 (left) we observe that the PI is low for hourly (τ = 1) and daily data
(τ = 24), while other less traded frequencies in between, such as τ = 7, 11, 18 or 21, have higher values of PI, indicating
more randomness. It is important to notice that MaxApEn or SampEn alone are unable to capture those features because,
as stated before, both are dependent on the number of points considered for non-homogeneous systems. In this example,
the sequence with τ = 1 consists of 12309 points while the last one with τ = 24 has only 512. On the contrary, as the
Pincus Index is computed for each sequence, the behavior of this index reflects the dynamics of the different frequencies.
Furthermore, the choice of r for SampEn is rather arbitrary as evidenced in Fig. 5 (right); we can observe that the value
for which ApEn reaches its maximum changes drastically with the coarse-graining and thus keeping a fixed value of r
for SampEn does not guarantee that the complexity is accurately captured. We have used the same values as Costa et al.
for r in the SampEn calculations. The difference in the choice of r in the first and second moment analyses is because
the amplitudes of the variance coarse-grained time series are much smaller than those of the mean coarse-grained time
series [17].
In the comparison of financial time series, measures such as the volatility of the returns have great relevance [30,31].
The Pincus Index can also be applied to characterize the complexity of those higher moments by using the same
generalization mechanism. Fig. 6 shows the Pincus Index of the second moment, the variance, for the log-ratio series
of EURUSD hourly data from 2008 to 2010 using m = 2. In Fig. 6 we see that the variance is more predictable for daily
data (τ = 24) than for hourly data and, in general, is more predictable than the logarithmic return of the price itself
shown in Fig. 5, in agreement with recent research [32].
Usually, the MSE devised by Costa and coauthors is interpreted as follows. If the MSE curve shows a monotonically
decreasing trend, it is considered that only the smallest scales, i.e. the first values of τ, contain information [29]. Similarly,
if the trend is monotonically increasing, only the larger scales, i.e. large values of τ, contain information. In our analysis,
we can see the MSE curves of SampEn in the right side of Figs. 5 and 6. We see the increasing trend in both cases, which
would be interpreted as saying that the complexity at larger values of τ is higher. However, when compared with the left
side of those figures, we see that the Multiscale Pincus Index allows us not only to characterize the complexity of each
frequency, but also to make comparisons between them.
The original formulation of MSE adds the entropy values of each coarse-grained series for a selected range of scales
τ = 1, . . . , $τ_{max}$, making the result dependent on the selected range. Wu et al. [33] proposed that the result would be
Fig. 6. Left: Multiscale PI for the second moment (m = 2). Right: Corresponding $r_{max}$, MaxApEn and SampEn(r = 0.05σ) for m = 2 for different τ
using the EURUSD logratio series for H1 from 2008 to 2010.
better defined as the mean of the τ entropy values, suggesting the name Composite Multiscale Entropy (CMSE) for it. If
one were to use the Pincus Index for a multiscale analysis, it would be desirable to continue normalizing the system to
obtain a comparable metric by dividing the sum by the number of partitions. This leads to the definition of a Composite
Multiscale Pincus Index:

$$\mathrm{CMSPI} = \frac{\sum_{\tau_i=1}^{\tau_{max}} \mathrm{PI}(\tau_i)}{\tau_{max}} \tag{4}$$
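In code, the composite index is a one-liner over precomputed scale-wise indices; a minimal sketch:

```python
import numpy as np

def cmspi(pi_per_scale):
    """Eq. (4): mean of the Pincus Index over scales tau = 1..tau_max.
    pi_per_scale holds PI(tau_i) for each coarse-grained series, e.g.
    from the pincus_index sketch of Section 2.1 applied to
    coarse_grain(x, tau_i)."""
    return np.mean(pi_per_scale)
```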
3. Discussion and conclusions
The objective of this paper was to continue with the development of the Pincus Index to capture the different
dimensions of the complexity of a dataset, and to use the methodology to analyze the six major traded pairs of the
forex market.
We have shown that the methodology of the Pincus Index can be easily extended to capture the information contained
in multiple embedding dimensions. Since different markets may have different optimal values for this parameter, the
comparison of the MPI is useful to discern the complexity of the systems, as long as the maximum embedding dimension
used in the analysis is the same for the markets considered.
Another source of complexity is the behavior of the system at different modes, and we have shown that the Pincus
Index captures the complexity depending on the frequency. The main results of this work are the suggestions that the
six analyzed markets are more predictable when using hourly data for a fixed number of points, and that less traded
frequencies are more random than usually traded frequencies such as one hour or one day. Finally, as an example of
generalization, we have analyzed higher moments to study the predictability of the variance, showing that the variance
of daily data is more predictable than that of lower frequencies.
The analysis of the complexity of a dataset consists of multiple dimensions. In a complete characterization of
the complexity of a time series, a multiscale approach using the Pincus Index can be developed in conjunction with a
multidimensional embedding dimension analysis to fully account for the dynamics at different frequencies. However,
given the computing power required to perform those calculations, instead of focusing on one market in particular we
have preferred to show the behavior of the six major traded pairs.
The methodology presented in this paper is not restricted to the forex market or the field of economics in any way,
and can be used to study any dataset. In particular, ApEn and SampEn have their roots in physiological studies and we
believe that the presented methodology will be useful in that field. Source codes in R programming language for the
determination of ApEn and SampEn are available in [10].
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have
appeared to influence the work reported in this paper. The views and opinions expressed by ADB in this article are those
of the author and do not reflect the policy or position of his employer. ADB contributed to this work as an independent
researcher. The data used in this paper have been downloaded from the MT5 platform historic data. The source codes
used for the determination of the Pincus Index are freely available upon request.
Funding
This research received no external funding.
References
[1] M. Mitchell, Complexity: A Guided Tour, first ed., Oxford University Press, 2011.
[2] J. Kwapień, S. Drożdż, Physical approach to complex systems, Phys. Rep. 515 (3) (2012) 115–226, http://dx.doi.org/10.1016/j.physrep.2012.01.007.
[3] R. Gębarowski, P. Oświęcimka, M. Wątorek, S. Drożdż, Detecting correlations and triangular arbitrage opportunities in the forex by means of multifractal detrended cross-correlations analysis, Nonlinear Dynam. 98 (3) (2019) 2349–2364, http://dx.doi.org/10.1007/s11071-019-05335-5.
[4] G.J. Chaitin, Randomness and mathematical proof, Sci. Am. 232 (1975) 47–52, http://dx.doi.org/10.1038/scientificamerican0575-47.
[5] A. Kolmogorov, A new metric invariant of transient dynamical systems and automorphisms in Lebesgue spaces, Dokl. Akad. Nauk SSSR 119 (1958) 861–864.
[6] Y. Sinai, On the notion of entropy of a dynamical system, Dokl. Russ. Acad. Sci. 124 (1959) 768–771.
[7] P. Grassberger, I. Procaccia, Estimation of the Kolmogorov entropy from a chaotic signal, Phys. Rev. A 28 (1983) 2591–2593, http://dx.doi.org/10.1103/PhysRevA.28.2591.
[8] J.-P. Eckmann, D. Ruelle, Ergodic theory of chaos and strange attractors, Rev. Modern Phys. 57 (1985) 617–656.
[9] S.M. Pincus, Approximate entropy as a measure of system complexity, Proc. Natl. Acad. Sci. 88 (6) (1991) 2297–2301, http://dx.doi.org/10.1073/pnas.88.6.2297.
[10] A. Delgado-Bonal, A. Marshak, Approximate entropy and sample entropy: A comprehensive tutorial, Entropy 21 (6) (2019) 541, http://dx.doi.org/10.3390/e21060541.
[11] A. Delgado-Bonal, Quantifying the randomness of the stock markets, Sci. Rep. 9 (1) (2019) 12761, http://dx.doi.org/10.1038/s41598-019-49320-9.
[12] A.M. Fraser, H.L. Swinney, Independent coordinates for strange attractors from mutual information, Phys. Rev. A 33 (1986) 1134–1140, http://dx.doi.org/10.1103/PhysRevA.33.1134.
[13] M.B. Kennel, R. Brown, H.D.I. Abarbanel, Determining embedding dimension for phase-space reconstruction using a geometrical construction, Phys. Rev. A 45 (1992) 3403–3411, http://dx.doi.org/10.1103/PhysRevA.45.3403.
[14] M. Perc, The dynamics of human gait, Eur. J. Phys. 26 (3) (2005) 525–534, http://dx.doi.org/10.1088/0143-0807/26/3/017.
[15] J. Bolea, R. Bailón, E. Pueyo, On the standardization of approximate entropy: Multidimensional approximate entropy index evaluated on short-term HRV time series, Complexity 2018 (2018) 4953273, http://dx.doi.org/10.1155/2018/4953273.
[16] M. Costa, A.L. Goldberger, C.-K. Peng, Multiscale entropy analysis of complex physiologic time series, Phys. Rev. Lett. 89 (2002) 068102, http://dx.doi.org/10.1103/PhysRevLett.89.068102.
[17] M.D. Costa, A.L. Goldberger, Generalized multiscale entropy analysis: Application to quantifying the complex volatility of human heartbeat time series, Entropy 17 (3) (2015) 1197–1203, http://dx.doi.org/10.3390/e17031197.
[18] J.S. Richman, J.R. Moorman, Physiological time-series analysis using approximate entropy and sample entropy, Am. J. Physiol.-Heart Circ. Physiol. 278 (6) (2000) H2039–H2049, http://dx.doi.org/10.1152/ajpheart.2000.278.6.H2039.
[19] A. Delgado-Bonal, On the use of complexity algorithms: A cautionary lesson from climate research, Sci. Rep. 10 (1) (2020) 5092, http://dx.doi.org/10.1038/s41598-020-61731-7.
[20] S. Pincus, B.H. Singer, Randomness and degrees of irregularity, Proc. Natl. Acad. Sci. 93 (5) (1996) 2083–2088, http://dx.doi.org/10.1073/pnas.93.5.2083.
[21] J.F. Restrepo, G. Schlotthauer, M.E. Torres, Maximum approximate entropy and r threshold: A new approach for regularity changes detection, Physica A 409 (2014) 97–109, http://dx.doi.org/10.1016/j.physa.2014.04.041.
[22] S. Pincus, R.E. Kalman, Not all (possibly) "random" sequences are created equal, Proc. Natl. Acad. Sci. 94 (8) (1997) 3513–3518, http://dx.doi.org/10.1073/pnas.94.8.3513.
[23] Y.-C. Zhang, Complexity and 1/f noise. A phase space approach, J. Physique I 1 (1991) 971–977, http://dx.doi.org/10.1051/jp1:1991180.
[24] H.C. Fogedby, On the phase space approach to complexity, J. Stat. Phys. 69 (1) (1992) 411–425, http://dx.doi.org/10.1007/BF01053799.
[25] A. Humeau-Heurtier, The multiscale entropy algorithm and its variants: A review, Entropy 17 (5) (2015) 3110–3123, http://dx.doi.org/10.3390/e17053110.
[26] J. Xia, P. Shang, Multiscale entropy analysis of financial time series, Fluct. Noise Lett. 11 (04) (2012) 1250033, http://dx.doi.org/10.1142/S0219477512500332.
[27] M. Costa, A. Goldberger, C.-K. Peng, Multiscale entropy of biological signals, Phys. Rev. E 71 (2005) 021906, http://dx.doi.org/10.1103/PhysRevE.71.021906.
[28] M. Xu, P. Shang, Analysis of financial time series using multiscale entropy based on skewness and kurtosis, Physica A 490 (2018) 1543–1550, http://dx.doi.org/10.1016/j.physa.2017.08.136.
[29] Y. Liu, Y. Lin, J. Wang, P. Shang, Refined generalized multiscale entropy analysis for physiological signals, Physica A 490 (2018) 975–985, http://dx.doi.org/10.1016/j.physa.2017.08.047.
[30] P. Jorion, Predicting volatility in the foreign exchange market, J. Finance 50 (2) (1995) 507–528.
[31] S.R. Bentes, R. Menezes, D.A. Mendes, Long memory and volatility clustering: Is the empirical evidence consistent across stock markets? Physica A 387 (2008) 3826–3830, http://dx.doi.org/10.1016/j.physa.2008.01.046.
[32] X. Zhao, C. Liang, N. Zhang, P. Shang, Quantifying the multiscale predictability of financial time series by an information-theoretic approach, Entropy 21 (7) (2019) 684, http://dx.doi.org/10.3390/e21070684.
[33] S.-D. Wu, C.-W. Wu, S.-G. Lin, C.-C. Wang, K.-Y. Lee, Time series analysis using composite multiscale entropy, Entropy 15 (3) (2013) 1069–1084, http://dx.doi.org/10.3390/e15031069.