Python Trading Toolbox: Weighted and Exponential Moving Averages
In the first article of the Financial Trading Toolbox series (Building a Financial Trading Toolbox in Python: Simple Moving Average), we discussed how to calculate a simple moving average, add it to a price series chart, and use it for investment and trading decisions. The Simple Moving Average is only one of several moving averages available that can be applied to price series to build trading systems or investment decision frameworks. Among those, two other moving averages are commonly used among financial market analysts:
 Weighted Moving Average (WMA)
 Exponential Moving Average (EMA)
In this article, we will explore how to calculate those two averages and how to ensure that the results match the definitions that we need to implement.
(A version of this post was published on Towards Data Science on October 10, 2019)
Weighted Moving Average
In some applications, one of the limitations of the simple moving average is that it gives equal weight to each of the daily prices included in the window. E.g., in a 10-day moving average, the most recent day receives the same weight as the first day in the window: each price receives a 10% weighting.
Compared to the Simple Moving Average, the Linearly Weighted Moving Average (or simply Weighted Moving Average, WMA) gives more weight to the most recent price and gradually less as we look back in time. On a 10-day weighted average, the price of the 10th (most recent) day is multiplied by 10, that of the 9th day by 9, the 8th day by 8, and so on. The total is then divided by the sum of the weights (in this case: 55). In this specific example, the most recent price receives about 18.2% of the total weight, the second most recent 16.4%, and so on down to the oldest price in the window, which receives about 1.8% of the weight.
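The weighting arithmetic can be checked with a few lines of NumPy. This is only a small sketch of the 10-day case described above; the variable names are illustrative and not part of the original notebook.

```python
import numpy as np

n = 10
weights = np.arange(1, n + 1)       # 1 for the oldest day ... 10 for the most recent
shares = weights / weights.sum()    # fraction of the total weight given to each day

print("Sum of weights:", weights.sum())
# Percentage of the total weight, listed from the most recent day to the oldest
print(np.round(shares[::-1] * 100, 1))
```

The shares add up to 1, so the WMA is a proper weighted average of the prices in the window.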
Let’s put that in practice with an example in Python. In addition to pandas and Matplotlib, we are going to make use of NumPy:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
import sys
print('Python version: ' + sys.version)
print('pandas version: ' + pd.__version__)
print('NumPy version:' + np.__version__)
print('matplotlib version: ' + mpl.__version__)
Python version: 3.7.4 (default, Aug 13 2019, 15:17:50)
[Clang 4.0.1 (tags/RELEASE_401/final)]
pandas version: 0.25.1
NumPy version:1.16.4
matplotlib version: 3.1.1
%matplotlib inline
plt.style.use('fivethirtyeight')
For the next examples, we are going to use price data from a StockCharts.com article. It’s an excellent educational article on moving averages and I recommend reading it. The price series used in that article could belong to any stock or financial instrument and will serve our purposes for illustration.
I modified the original Excel sheet by including calculations for the 10-day WMA since the calculation for the EMA is already included. You can access my Google Sheets file and download the data in CSV format here.
It is always a good practice, when modeling data, to start with a simple implementation of our model that we can use to make sure that the results from our final implementation are correct.
We start by loading the data into a data frame:
datafile = 'data/csmovavg.csv'
data = pd.read_csv(datafile, index_col = 'Date')
data.index = pd.to_datetime(data.index)
# We can drop the old index column:
data = data.drop(columns='Unnamed: 0')
data
Price  10day SMA  10day WMA  Smoothing Constant 2/(10 + 1)  10day EMA  

Date  
20100324  22.273  NaN  NaN  NaN  NaN 
20100325  22.194  NaN  NaN  NaN  NaN 
20100326  22.085  NaN  NaN  NaN  NaN 
20100329  22.174  NaN  NaN  NaN  NaN 
20100330  22.184  NaN  NaN  NaN  NaN 
20100331  22.134  NaN  NaN  NaN  NaN 
20100401  22.234  NaN  NaN  NaN  NaN 
20100405  22.432  NaN  NaN  NaN  NaN 
20100406  22.244  NaN  NaN  NaN  NaN 
20100407  22.293  22.225  22.247  NaN  22.225 
20100408  22.154  22.213  22.234  0.1818  22.212 
20100409  22.393  22.233  22.266  0.1818  22.245 
20100412  22.382  22.262  22.293  0.1818  22.270 
20100413  22.611  22.306  22.357  0.1818  22.332 
20100414  23.356  22.423  22.548  0.1818  22.518 
20100415  24.052  22.615  22.844  0.1818  22.797 
20100416  23.753  22.767  23.051  0.1818  22.971 
20100419  23.832  22.907  23.244  0.1818  23.127 
20100420  23.952  23.078  23.434  0.1818  23.277 
20100421  23.634  23.212  23.535  0.1818  23.342 
20100422  23.823  23.379  23.647  0.1818  23.429 
20100423  23.872  23.527  23.736  0.1818  23.510 
20100426  23.654  23.654  23.759  0.1818  23.536 
20100427  23.187  23.711  23.674  0.1818  23.473 
20100428  23.098  23.686  23.563  0.1818  23.404 
20100429  23.326  23.613  23.498  0.1818  23.390 
20100430  22.681  23.506  23.328  0.1818  23.261 
20100503  23.098  23.432  23.254  0.1818  23.231 
20100504  22.403  23.277  23.067  0.1818  23.081 
20100505  22.173  23.131  22.866  0.1818  22.916 
We are going to consider only the Price and 10-day WMA columns for now and move to the EMA later on.
When it comes to linearly weighted moving averages, the pandas library does not have a ready off-the-shelf method to calculate them. It offers, however, a very powerful and flexible method: .apply()
This method allows us to create and pass any custom function to a rolling window: that is how we are going to calculate our Weighted Moving Average.
In order to calculate a 10-day WMA, we start by creating an array of weights: whole numbers from 1 to 10.
weights = np.arange(1,11) #this creates an array with integers 1 to 10 included
weights
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
Next, using the .apply() method, we pass our own function (a lambda function) to compute the dot product of weights and prices in our rolling window (each price in the window is multiplied by the corresponding weight, then the products are summed), and then divide it by the sum of the weights:
wma10 = data['Price'].rolling(10).apply(lambda prices: np.dot(prices, weights)/weights.sum(), raw=True)
wma10.head(20)
Date
20100324 NaN
20100325 NaN
20100326 NaN
20100329 NaN
20100330 NaN
20100331 NaN
20100401 NaN
20100405 NaN
20100406 NaN
20100407 22.246473
20100408 22.233618
20100409 22.266382
20100412 22.293527
20100413 22.356909
20100414 22.547800
20100415 22.843927
20100416 23.050818
20100419 23.244455
20100420 23.434455
20100421 23.535582
Name: Price, dtype: float64
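In the spirit of starting with a simple implementation to validate the final one, the same 10-day WMA can be cross-checked with an explicit loop in plain Python. This is a minimal sketch: `wma_loop` is a hypothetical helper name, and the prices are the first twelve values of the Price column shown above, typed in by hand.

```python
# First twelve prices from the table above (oldest first)
prices = [22.273, 22.194, 22.085, 22.174, 22.184, 22.134,
          22.234, 22.432, 22.244, 22.293, 22.154, 22.393]

def wma_loop(values, n):
    """Linearly weighted moving average; None until n prices are available."""
    weights = range(1, n + 1)            # 1 for the oldest price in the window
    total_weight = sum(weights)
    out = []
    for end in range(len(values)):
        if end + 1 < n:
            out.append(None)             # not enough history yet
            continue
        window = values[end - n + 1:end + 1]
        weighted_sum = sum(w * p for w, p in zip(weights, window))
        out.append(weighted_sum / total_weight)
    return out

wma_check = wma_loop(prices, 10)
print(wma_check[9:])   # values for day 10 onwards
```

The values from day 10 onwards should agree with the first non-NaN entries of `wma10` printed above.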
Now, we want to compare our WMA to the one obtained with the spreadsheet. To do so, we can add an ‘Our 10day WMA’ column to the dataframe. To make the visual comparison easier, we can round the WMA series to three decimals using the round() function from NumPy. Then, we select the price and WMA columns to be displayed:
data['Our 10day WMA'] = np.round(wma10, decimals=3)
data[['Price', '10day WMA', 'Our 10day WMA']].head(20)
Price  10day WMA  Our 10day WMA  

Date  
20100324  22.273  NaN  NaN 
20100325  22.194  NaN  NaN 
20100326  22.085  NaN  NaN 
20100329  22.174  NaN  NaN 
20100330  22.184  NaN  NaN 
20100331  22.134  NaN  NaN 
20100401  22.234  NaN  NaN 
20100405  22.432  NaN  NaN 
20100406  22.244  NaN  NaN 
20100407  22.293  22.247  22.246 
20100408  22.154  22.234  22.234 
20100409  22.393  22.266  22.266 
20100412  22.382  22.293  22.294 
20100413  22.611  22.357  22.357 
20100414  23.356  22.548  22.548 
20100415  24.052  22.844  22.844 
20100416  23.753  23.051  23.051 
20100419  23.832  23.244  23.244 
20100420  23.952  23.434  23.434 
20100421  23.634  23.535  23.536 
The two WMA columns look the same. There are a few differences in the third decimal place, but we can put that down to rounding error and conclude that our implementation of the WMA is correct. In a real-life application, if we want to be more rigorous we should compute the differences between the two columns and check that they are not too large. For now, we keep things simple and we can be satisfied with the visual inspection.
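A more rigorous check could look like the sketch below, which measures the largest absolute difference between the two series and compares it with a tolerance. The data are the first 20 rows of the table above, typed in by hand so that the example runs on its own. The tolerance of 0.001 is an assumption chosen because the spreadsheet values are shown with three decimals.

```python
import numpy as np
import pandas as pd

# First 20 prices and the spreadsheet's 10-day WMA, copied from the table above
prices = pd.Series([22.273, 22.194, 22.085, 22.174, 22.184, 22.134, 22.234,
                    22.432, 22.244, 22.293, 22.154, 22.393, 22.382, 22.611,
                    23.356, 24.052, 23.753, 23.832, 23.952, 23.634])
sheet_wma = pd.Series([np.nan] * 9 +
                      [22.247, 22.234, 22.266, 22.293, 22.357, 22.548,
                       22.844, 23.051, 23.244, 23.434, 23.535])

weights = np.arange(1, 11)
our_wma = prices.rolling(10).apply(
    lambda window: np.dot(window, weights) / weights.sum(), raw=True)

# Largest absolute gap between the two series (NaN rows are skipped by .max())
max_diff = (our_wma - sheet_wma).abs().max()
tolerance = 1e-3   # assumed: spreadsheet values are displayed with 3 decimals

print("Largest absolute difference:", max_diff)
print("Within tolerance:", max_diff < tolerance)
```

Comparing the unrounded series against a tolerance avoids the ambiguity of two values that round to different third decimals while being almost identical.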
It would be interesting to compare in a plot our newly created WMA with the familiar SMA:
sma10 = data['Price'].rolling(10).mean()
plt.figure(figsize = (12,6))
plt.plot(data['Price'], label="Price")
plt.plot(wma10, label="10Day WMA")
plt.plot(sma10, label="10Day SMA")
plt.xlabel("Date")
plt.ylabel("Price")
plt.legend()
<matplotlib.legend.Legend at 0x1148b19d0>
As we can see, both averages smooth out the price movement. The WMA is more reactive and follows the price more closely than the SMA: we expect that, since the WMA gives more weight to the most recent price observations. Also, both moving average series start on day 10: the first day with enough available data to compute the averages.
The Weighted Moving Average may be lesser known than its Exponential sibling. However, it can be an additional item in our toolbox when we try to build original solutions. Implementing the WMA in Python forced us to search for a way to create customized moving averages using .apply(): this technique can be used to implement new and original moving averages as well.
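As an illustration of that idea, the rolling-window pattern can be wrapped in a small helper that accepts any weight vector. This is a sketch under assumptions: `custom_weighted_ma` is a hypothetical name, and the squared weights are an arbitrary example rather than an established indicator.

```python
import numpy as np
import pandas as pd

def custom_weighted_ma(series, weights):
    """Rolling weighted average; weights[0] applies to the oldest price in the window."""
    weights = np.asarray(weights, dtype=float)
    return series.rolling(len(weights)).apply(
        lambda window: np.dot(window, weights) / weights.sum(), raw=True)

# First twelve prices from the table above
prices = pd.Series([22.273, 22.194, 22.085, 22.174, 22.184, 22.134,
                    22.234, 22.432, 22.244, 22.293, 22.154, 22.393])

linear = custom_weighted_ma(prices, np.arange(1, 11))        # the 10-day WMA
flat = custom_weighted_ma(prices, np.ones(10))               # equal weights: the SMA
squared = custom_weighted_ma(prices, np.arange(1, 11) ** 2)  # steeper, illustrative weighting

print(pd.DataFrame({'linear': linear, 'flat': flat, 'squared': squared}).tail(3))
```

With equal weights the helper reduces to the simple moving average, which gives a quick sanity check before trying less familiar weightings.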
Exponential Moving Average
Similarly to the Weighted Moving Average, the Exponential Moving Average (EMA) assigns a greater weight to the most recent price observations. While it assigns lesser weight to past data, it is based on a recursive formula that includes in its calculation all the past data in our price series.
The EMA at time $t$ is calculated as the current price multiplied by a smoothing factor alpha (a positive number less than 1) plus the EMA at time $t-1$ multiplied by 1 minus the alpha. It is basically a value between the previous EMA and the current price:
\[EMA_t = \alpha \, Price_t + (1 - \alpha) \, EMA_{t-1}\]
The smoothing factor $\alpha$ (alpha) is defined as:
\[\alpha = \frac{2}{n + 1}\]
where $n$ is the number of days in our span. Therefore, a 10-day EMA will have a smoothing factor:
\[\alpha = \frac{2}{10 + 1} \approx 0.1818\]
Pandas includes a method to compute the EMA moving average of any time series: .ewm()
Will this method respond to our needs and compute an average that matches our definition? Let’s test it:
ema10 = data['Price'].ewm(span=10).mean()
ema10.head(10)
Date
20100324 22.273000
20100325 22.229550
20100326 22.171442
20100329 22.172285
20100330 22.175648
20100331 22.164830
20100401 22.181498
20100405 22.238488
20100406 22.239687
20100407 22.250886
Name: Price, dtype: float64
We want to compare this EMA series with the one obtained in the spreadsheet:
data['Our 10day EMA'] = np.round(ema10, decimals=3)
data[['Price', '10day EMA', 'Our 10day EMA']].head(20)
Price  10day EMA  Our 10day EMA  

Date  
20100324  22.273  NaN  22.273 
20100325  22.194  NaN  22.230 
20100326  22.085  NaN  22.171 
20100329  22.174  NaN  22.172 
20100330  22.184  NaN  22.176 
20100331  22.134  NaN  22.165 
20100401  22.234  NaN  22.181 
20100405  22.432  NaN  22.238 
20100406  22.244  NaN  22.240 
20100407  22.293  22.225  22.251 
20100408  22.154  22.212  22.231 
20100409  22.393  22.245  22.263 
20100412  22.382  22.270  22.287 
20100413  22.611  22.332  22.349 
20100414  23.356  22.518  22.542 
20100415  24.052  22.797  22.828 
20100416  23.753  22.971  23.002 
20100419  23.832  23.127  23.157 
20100420  23.952  23.277  23.305 
20100421  23.634  23.342  23.366 
As you may have noticed, we have a problem here: the 10-day EMA that we just calculated does not correspond to the one calculated in the downloaded spreadsheet. One starts on day 10, while the other starts on day 1. Also, the values do not match exactly.
Is our calculation wrong? Or is the calculation in the provided spreadsheet wrong? Neither: those two series correspond to two different definitions of EMA. To be more specific, the formula used to compute the EMA is the same. What changes is the use of the initial values.
If we look carefully at the definition of Exponential Moving Average on the StockCharts.com web page we can notice one important detail: they start calculating a 10-day moving average on day 10, disregarding the previous days and replacing the price on day 10 with its 10-day SMA. It’s a different definition than the one applied when we calculated the EMA using the .ewm() method directly.
The following lines of code create a new modified price series where the first 9 prices (when the SMA is not available) are replaced by NaN and the price on the 10th date becomes its 10-day SMA:
modPrice = data['Price'].copy()
modPrice.iloc[0:10] = sma10[0:10]
modPrice.head(20)
Date
20100324 NaN
20100325 NaN
20100326 NaN
20100329 NaN
20100330 NaN
20100331 NaN
20100401 NaN
20100405 NaN
20100406 NaN
20100407 22.2247
20100408 22.1540
20100409 22.3930
20100412 22.3820
20100413 22.6110
20100414 23.3560
20100415 24.0520
20100416 23.7530
20100419 23.8320
20100420 23.9520
20100421 23.6340
Name: Price, dtype: float64
We can use this modified price series to calculate a second version of the EMA. By looking at the documentation, we can note that the .ewm() method has an adjust parameter that defaults to True. This parameter adjusts the weights to account for the imbalance in the beginning periods (if you need more detail, see the Exponentially weighted windows section in the pandas documentation).
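To see what the adjust parameter changes in practice, the sketch below compares both settings on a tiny made-up series and reproduces the last value of each by hand. The series and alpha = 0.5 are arbitrary choices for readability, not data from the spreadsheet.

```python
import numpy as np
import pandas as pd

x = pd.Series([1.0, 2.0, 3.0])   # toy data, chosen only for easy arithmetic
alpha = 0.5

adjusted = x.ewm(alpha=alpha, adjust=True).mean()     # pandas default
recursive = x.ewm(alpha=alpha, adjust=False).mean()   # plain recursive formula

# adjust=True: weighted average with weights (1 - alpha)**i, newest observation first
w = (1 - alpha) ** np.arange(len(x))
manual_adjusted = np.dot(w, x.values[::-1]) / w.sum()

# adjust=False: start from the first value, then apply the recursion
manual_recursive = x.iloc[0]
for value in x.iloc[1:]:
    manual_recursive = alpha * value + (1 - alpha) * manual_recursive

print(adjusted.iloc[-1], manual_adjusted)
print(recursive.iloc[-1], manual_recursive)
```

With adjust=False, pandas applies the recursive EMA formula directly, taking the first available observation as the starting value.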
If we want to emulate the EMA as in our spreadsheet using our modified price series, we don’t need this adjustment. We then set adjust=False:
ema10alt = modPrice.ewm(span=10, adjust=False).mean()
Will this newly calculated EMA match the one calculated in the spreadsheet? Let’s have a look:
data['Our 2nd 10Day EMA'] = np.round(ema10alt, decimals=3)
data[['Price', '10day EMA', 'Our 10day EMA', 'Our 2nd 10Day EMA']].head(20)
Price  10day EMA  Our 10day EMA  Our 2nd 10Day EMA  

Date  
20100324  22.273  NaN  22.273  NaN 
20100325  22.194  NaN  22.230  NaN 
20100326  22.085  NaN  22.171  NaN 
20100329  22.174  NaN  22.172  NaN 
20100330  22.184  NaN  22.176  NaN 
20100331  22.134  NaN  22.165  NaN 
20100401  22.234  NaN  22.181  NaN 
20100405  22.432  NaN  22.238  NaN 
20100406  22.244  NaN  22.240  NaN 
20100407  22.293  22.225  22.251  22.225 
20100408  22.154  22.212  22.231  22.212 
20100409  22.393  22.245  22.263  22.245 
20100412  22.382  22.270  22.287  22.270 
20100413  22.611  22.332  22.349  22.332 
20100414  23.356  22.518  22.542  22.518 
20100415  24.052  22.797  22.828  22.797 
20100416  23.753  22.971  23.002  22.971 
20100419  23.832  23.127  23.157  23.127 
20100420  23.952  23.277  23.305  23.277 
20100421  23.634  23.342  23.366  23.342 
Now, we are doing much better. We have obtained an EMA series that matches the one calculated in the spreadsheet.
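The spreadsheet's definition can also be written out as an explicit loop, which makes the role of the initial value visible: the first EMA is the 10-day SMA, and every later value follows the recursive formula. This is a minimal sketch in plain Python; `ema_sma_seeded` is a hypothetical helper name, and the prices are the first twelve values of the Price column, typed in by hand.

```python
# First twelve prices from the table above (oldest first)
prices = [22.273, 22.194, 22.085, 22.174, 22.184, 22.134,
          22.234, 22.432, 22.244, 22.293, 22.154, 22.393]

def ema_sma_seeded(values, n):
    """EMA seeded with the n-day SMA on day n; None for the days before."""
    if len(values) < n:
        return [None] * len(values)
    alpha = 2 / (n + 1)              # smoothing factor
    ema = sum(values[:n]) / n        # initial value: the simple average of the first n prices
    out = [None] * (n - 1) + [ema]
    for price in values[n:]:
        ema = alpha * price + (1 - alpha) * ema   # recursive EMA formula
        out.append(ema)
    return out

ema_check = ema_sma_seeded(prices, 10)
print([round(v, 3) for v in ema_check[9:]])
```

Rounded to three decimals, the values from day 10 onwards should agree with the 10day EMA column of the spreadsheet.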
We ended up with two different versions of EMA in our hands:

 ema10: This version uses the plain .ewm() method and starts at the beginning of our price history, but does not match the definition used in the spreadsheet.
 ema10alt: This version starts on day 10 (with an initial value equal to the 10-day SMA) and matches the definition in our spreadsheet.
Which one is the best to use? The answer is: it depends on what we need for our application and to build our system. If we need an EMA series that starts from day 1, then we should choose the first one. On the other hand, if we need to use our average in combination with other averages that have no values for the initial days (such as the SMA), then the second is probably the best one.
The second EMA is widely used among financial market analysts: if we need to implement an already existing system, we need to be careful to use the correct definition. Otherwise, the results may not be what is expected from us and may put the accuracy of all of our work into question. In any case, the numeric difference between those two averages is small, with an impact on our trading or investment decision system limited to the initial days.
Let’s look at all the moving averages we have used so far in a chart:
plt.figure(figsize = (12,6))
plt.plot(data['Price'], label="Price")
plt.plot(wma10, label="10Day WMA")
plt.plot(sma10, label="10Day SMA")
plt.plot(ema10, label="10Day EMA1")
plt.plot(ema10alt, label="10Day EMA2")
plt.xlabel("Date")
plt.ylabel("Price")
plt.legend()
<matplotlib.legend.Legend at 0x114cb7050>
Of all the moving averages, the WMA appears to be the most responsive and tracks the price most closely, while the SMA responds with the most lag. The two versions of the EMA tend to overlap each other, mainly in the last days.
I hope you found this post useful. Introducing the Weighted Moving Average helped us to learn and implement a custom average based on a specific definition. Working with the Exponential Moving Average gave us the chance to highlight how important it is to ensure that any function we are using to work on price series matches the definition that we have for any given task.