Picking winners with Network Analysis

HennyGe Wichers

February 2, 2023

Active investment aims to beat average returns. A portfolio manager selects a set of assets that have a higher probability of better returns than some tracking index. In this article, we do just that. We use Network Analysis to select two baskets of stock exchanges and create portfolios to rival a global tracker. One portfolio consists of central stock exchanges, while the other contains outsiders. We then evaluate performance by simulating an investment kept for one year. A portfolio of outsiders could be the best choice when the economic outlook is uncertain.

Note: This article illustrates Network Analysis for stock markets, and it is not, in any way, financial advice. A copy of this article with code is available on GitHub, and a preview of the Jupyter notebook is available on nbviewer.

Fig 1: Selected stock exchanges after network analysis, with the central portfolio in orange and the outsiders in green

Data

Our analysis uses the top 20 stock exchanges by market capitalisation, excluding the Tehran Stock Exchange because we could not find its data, which leaves 19 exchanges. We collected values for the period January 1, 2015, to December 31, 2022, from Yahoo Finance, with one exception: Saudi Stock Exchange (Tadawul) values were unavailable there, so we downloaded them from Investing.com instead.

We reserve the 2022 data to evaluate performance later and use the remaining years to create our portfolios. Let’s have a look at the data we will be working with. Fig 2 shows values for each stock exchange in our analysis from the start of 2015 to the end of 2021.

Line graph of stock exchange values for the top 20 in market capitalisation, from January 1, 2015, to December 31, 2021. — Fig 2: Stock Exchange values from January 1, 2015, to December 31, 2021

March 2020 immediately draws the eye: all major markets fell sharply at the beginning of the COVID-19 pandemic. But there are some easy-to-spot differences in the period that follows too. For example, the London Stock Exchange (light blue) and the SIX Swiss Exchange (orange) do not display the same recovery as other big markets. We also see some contrasting trends in 2017 and 2018. The Korea Exchange (brown) and the German Xetra (green) mainly remained flat, while many other stock markets saw growth. That is good news for our analysis; we may find patterns in the variation.

Finding the network

We determine the network for our 19 stock exchanges in three steps. First, we calculate daily returns. Second, we find the correlations between the returns for each stock exchange and every other stock exchange. And finally, we convert the correlation coefficients into distances and draw a network that maintains only the significant links. Let’s look at each step in more detail.

Calculate daily returns

We use daily log returns. There are two good reasons for this choice. One, gross simple returns (one plus the simple return) are commonly modelled as log-normally distributed. So if we use log returns, we can work with an approximately normal distribution, and the normal distribution makes life easy. And two, we are interested in multiplicative returns: we want to know whether a stock doubled in value rather than that it increased by, say, 5 dollars.
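The calculation can be sketched as follows. This is a minimal illustration with made-up prices; the column names, the values, and the use of pandas are assumptions, and the article's notebook may do this differently.

```python
import numpy as np
import pandas as pd

# Hypothetical closing values for two exchanges on four trading days.
prices = pd.DataFrame(
    {"A": [100.0, 110.0, 99.0, 99.0], "B": [50.0, 50.0, 55.0, 60.5]},
    index=pd.date_range("2021-01-04", periods=4, freq="B"),
)

# Log return for day t: ln(P_t / P_{t-1}). The first row has no
# previous day to compare with, so it is dropped.
log_returns = np.log(prices / prices.shift(1)).dropna()

# Log returns add up over time: the exponential of their sum is the
# total growth factor over the whole period.
total_growth = np.exp(log_returns.sum())
```

The additive property is what makes log returns convenient: summing daily values gives the return over any longer window.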

Find the correlations

We calculate the correlations between log returns for each stock exchange with every other stock exchange. Fig 3 shows a correlations heatmap for the period in our analysis. Light colours indicate strong positive correlations, and dark colours indicate zero or negative correlations. So Japan and Shenzhen, for example, move together but independently of all other exchanges.

Correlations heatmap for log returns over the period 2015 to 2021. — Fig 3: Correlations heatmap for log returns over the period 2015 to 2021
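As a rough sketch of the correlation step, the snippet below builds a correlation matrix from synthetic log returns. The data and column names are invented, and the Pearson coefficient (the pandas default) is an assumption, since the article does not name the method.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)

# Synthetic log returns: X and Y share a common driver, Z is independent.
common = rng.normal(size=500)
log_returns = pd.DataFrame({
    "X": common + 0.1 * rng.normal(size=500),
    "Y": common + 0.1 * rng.normal(size=500),
    "Z": rng.normal(size=500),
})

# Pairwise correlation of every column with every other column.
# The result is a symmetric matrix with ones on the diagonal.
corr = log_returns.corr()
```

In this toy data, X and Y would show up as a light cell in a heatmap like Fig 3, while their cells against Z would be dark.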

We can also use the correlation matrix to draw a fully connected network (Fig 4). Each orange circle (node) represents one of the 19 stock exchanges, and the nodes are connected by lines (edges). A thicker edge indicates a stronger correlation between two nodes, and a loop is the correlation of a node with itself. Neat!

Fully connected graph of log returns correlations (2015-2021) between stock exchanges. — Fig 4: Fully connected graph of log returns correlations (2015 to 2021) between stock exchanges

Draw the network

However, the network would be more informative if we eliminated redundancies and noise. A Minimum Spanning Tree (MST) can help identify the relationships that matter. First, we need distances of some kind. We convert the correlation coefficients into distances by taking the square root of 2 * (1 – correlation coefficient). That gives us short distances between nodes with high correlations and longer distances otherwise. Now we can generate the MST and visualise the network of meaningful relationships (Fig 5).

Minimum Spanning Tree (MST) for stock exchanges. — Fig 5: The Minimum Spanning Tree (MST) for stock exchanges
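The distance conversion and the MST can be sketched with networkx. The four-node correlation matrix below is made up for illustration, and using networkx is an assumption about tooling rather than a statement of how the original notebook works.

```python
import numpy as np
import networkx as nx

# Hypothetical correlation matrix for four exchanges.
names = ["A", "B", "C", "D"]
corr = np.array([
    [1.0, 0.8, 0.3, 0.1],
    [0.8, 1.0, 0.6, 0.2],
    [0.3, 0.6, 1.0, 0.5],
    [0.1, 0.2, 0.5, 1.0],
])

# Distance from the text: sqrt(2 * (1 - correlation)).
# High correlation gives a short distance; a perfect correlation gives 0.
dist = np.sqrt(2.0 * (1.0 - corr))

# Fully connected graph with the distances as edge weights.
graph = nx.Graph()
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        graph.add_edge(names[i], names[j], weight=float(dist[i, j]))

# The MST keeps the n - 1 edges with the smallest total distance
# that still connect every node.
mst = nx.minimum_spanning_tree(graph, weight="weight")
```

With these numbers the tree keeps only the three strongest links (A-B, B-C and C-D) and drops the three weaker ones.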

Calculating metrics

Now we are ready to calculate three Network Analysis metrics to guide the portfolio selection: degree centrality, betweenness centrality, and closeness centrality. They measure a node’s importance, but each has a different emphasis. Let’s see what that is.

Degree centrality

Degree centrality is the simplest measure: it is the number of connections expressed as a fraction of the total possible connections. A node with a high degree centrality influences the behaviour of many others.

In our network, New York has the most connections (Fig 5) and, therefore, the highest degree centrality. It has links with five other exchanges out of a possible 18, so its degree centrality is 5 / 18 = 0.28. Note that the edge between Korea and India runs behind New York, making it look like seven links, but the line does not connect. Fig 6 has the values for all exchanges.

Vertical bar chart of degree centrality by stock exchange. — Fig 6: Degree centrality by stock exchange
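The 5 / 18 calculation can be checked on a stand-in tree. The tree below is hypothetical: it only copies the two facts from the text (19 nodes, and one hub with five links) and does not reproduce the real network in Fig 5.

```python
import networkx as nx

tree = nx.Graph()

# A hub with five neighbours, n1 to n5.
tree.add_edges_from(("hub", f"n{i}") for i in range(1, 6))

# Chain 13 more nodes (n6 to n18) onto n5 so the tree has 19 nodes.
nx.add_path(tree, [f"n{i}" for i in range(5, 19)])

# Degree centrality: number of links divided by (number of nodes - 1).
centrality = nx.degree_centrality(tree)
hub_score = centrality["hub"]
```

Here `hub_score` is 5 / 18, which rounds to 0.28, the same value the text gives for New York.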

Betweenness centrality

Betweenness centrality measures a node’s influence over the flow of information in the network. It’s the number of shortest paths between other nodes that pass through it. A shortest path is a route from one node to another with the minimum number of hops. For example, to get from Nasdaq Nordic (NASDAQNOR) to Toronto takes two jumps on the shortest path: first to New York and then to Toronto. We can go via India first, but with three hops, that’s not the shortest path.

It’s tempting to think that the node with the most connections also has the most shortest paths. It can be, but it often isn’t. The picture in Fig 7 helps to illustrate why.

Pictures of networks to illustrate degree centrality and betweenness centrality. — Fig 7: Degree centrality and betweenness centrality

A and B each have seven nodes and six edges, but the networks have different structures. Network A has a star shape. The purple node in the middle has the highest degree centrality with six connections and the highest betweenness centrality because it is on the shortest path between any other two nodes.

In network B, however, the roles fall on separate nodes. The blue nodes have the highest degree centrality, with three connections each. But the orange node has the highest betweenness centrality: every shortest path from the left to the right side, and vice versa, has to pass through it. It’s a bridge between the two larger components of the network. Shapes like network B are more common in the real world, so the highest degree and betweenness centrality are often different nodes.
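The difference between the two measures can be sketched in code. The star matches the description of network A. The second graph is a made-up bridge network, not network B from Fig 7, chosen only because it makes the split between degree and betweenness easy to see.

```python
import networkx as nx

# Network A from the text: one centre (node 0) and six leaves (1..6).
star = nx.star_graph(6)
star_degree = nx.degree_centrality(star)
# With normalized=False and unique shortest paths, the score is the
# number of node pairs whose shortest path runs through the node.
star_between = nx.betweenness_centrality(star, normalized=False)

# Hypothetical bridge network: two triangles joined through node "d".
bridge = nx.Graph([
    ("a", "b"), ("a", "c"), ("b", "c"),   # left triangle
    ("c", "d"), ("d", "e"),               # bridge through d
    ("e", "f"), ("e", "g"), ("f", "g"),   # right triangle
])
bridge_degree = dict(bridge.degree())
bridge_between = nx.betweenness_centrality(bridge, normalized=False)

# Node with the most links versus node on the most shortest paths.
top_degree = max(bridge_degree, key=bridge_degree.get)
top_between = max(bridge_between, key=bridge_between.get)
```

In the star, the centre wins on both measures. In the bridge network, "c" and "e" have the most links (three each), but "d" sits on the most shortest paths even though it has only two links.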

Fig 8 displays the betweenness centralities for every node in our network. Nasdaq Nordic has the highest score. Any exchange with one connection gets zero because no path can pass through it.

Vertical bar chart of betweenness centrality by stock exchange. — Fig 8: Betweenness centrality by stock exchange

Closeness centrality

Closeness centrality expresses how close a node is to all other nodes. It is based on the lengths of the shortest paths between a node and all other nodes that can be reached from it. We calculate it as the reciprocal of the average shortest-path length, that is, the number of reachable nodes divided by the sum of the shortest-path lengths. Fig 9 helps us see how that works.

Let’s compute closeness centrality for the purple node. There are four other nodes in the network, and they can all be reached. The numbers indicate the shortest path length from the purple node to that node. Then, the closeness centrality for the purple node is 4 / (1 + 1 + 2 + 2) = 4 / 6 = 2 / 3.
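The same arithmetic can be reproduced with networkx. The graph below is an assumed layout that yields the path lengths quoted above (1, 1, 2, 2); the actual shape of the network in Fig 9 may differ, and the node names are invented.

```python
import networkx as nx

# Assumed layout: purple has two direct neighbours, and each of them
# has one further neighbour, giving distances 1, 1, 2 and 2.
graph = nx.Graph([
    ("purple", "a"), ("purple", "b"),
    ("a", "c"), ("b", "d"),
])

# Shortest path length (in hops) from purple to every other node.
lengths = nx.shortest_path_length(graph, source="purple")
total = sum(lengths.values())  # the source itself contributes 0

# networkx divides the number of reachable nodes by the sum of the
# shortest path lengths, as in the 4 / 6 calculation above.
closeness = nx.closeness_centrality(graph, u="purple")
```

Here `total` is 6 and `closeness` is 4 / 6, or 2 / 3.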

Nasdaq Nordic has the highest closeness centrality in our network of stock exchanges. It also has the highest betweenness centrality, but, once again, the same node does not always take both roles. Fig 10 displays the complete set of closeness centralities.

Vertical bar chart of closeness centrality by stock exchange. — Fig 10: Closeness centrality by stock exchange

Creating the portfolios

Our two portfolios sit at opposite ends of the network. Stock exchanges in the central portfolio play a key role and influence other exchanges worldwide. By contrast, exchanges in the outsiders portfolio have weak correlations and contain more noise.

The three metrics we calculated focus on different aspects of a node’s importance, and we take the average to reduce the error caused by any single method. Nasdaq Nordic, New York and Korea have the highest average scores and go in the central portfolio. Shanghai, Japan and London have the lowest scores and make up the outsiders portfolio (Fig 11).

Fig 11: The central and outsiders portfolios (table of degree, betweenness, closeness and average centrality)
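The averaging and ranking step can be sketched as below. The helper name `average_centrality` and the seven-node tree are hypothetical, and the example picks two nodes per portfolio instead of three to keep it small. The real selection uses the MST from Fig 5.

```python
import networkx as nx
import pandas as pd

def average_centrality(graph):
    """Return the three centralities and their mean, best first."""
    scores = pd.DataFrame({
        "degree": nx.degree_centrality(graph),
        "betweenness": nx.betweenness_centrality(graph),
        "closeness": nx.closeness_centrality(graph),
    })
    scores["average"] = scores.mean(axis=1)
    return scores.sort_values("average", ascending=False)

# Hypothetical tree: a hub "H" with two leaves and a four-node tail.
tree = nx.Graph([
    ("H", "a"), ("H", "b"), ("H", "c"),
    ("c", "d"), ("d", "e"), ("e", "f"),
])

scores = average_centrality(tree)
central = list(scores.index[:2])     # highest average scores
outsiders = list(scores.index[-2:])  # lowest average scores
```

Note that leaves "a" and "b" tie exactly in this toy tree, so which of them lands in the outsiders list depends on the sort order. A real analysis would need an explicit rule for ties.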

Let’s visualise that in the network. In Fig 12, we colour the central portfolio orange and the outsiders green. Japan is difficult to see – it is located behind Shenzhen, all the way at the top. But our results make intuitive sense: the central portfolio is near the centre of the network, and the outsiders are, well, on the outside.

Fig 12: The network with the central portfolio in orange and the outsiders in green

Evaluating performance

Say we did our analysis at the end of 2021, and on the first day of trading in 2022, we spent 100,000* on either a central portfolio, an outsiders portfolio, or a global tracker. What would have happened? Fig 13 shows us.

Fig 13: Performance evaluation over 2022 (line chart for the central portfolio, outsiders portfolio, and global tracker)
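One way to simulate such an investment is a buy-and-hold calculation. The sketch below assumes the money is split equally across the exchanges in a portfolio, which the article does not state, and it ignores currencies as the article does. The function name `portfolio_value` and the prices are hypothetical.

```python
import pandas as pd

def portfolio_value(prices, initial=100_000.0):
    """Buy-and-hold value of an equally weighted basket over time."""
    # Growth factor of each exchange relative to the first day.
    growth = prices / prices.iloc[0]
    # Equal weights: the portfolio grows by the average growth factor.
    return initial * growth.mean(axis=1)

# Made-up index values for two exchanges over three days.
prices = pd.DataFrame({
    "P": [100.0, 90.0, 80.0],
    "Q": [200.0, 210.0, 220.0],
})

value = portfolio_value(prices)
final_value = value.iloc[-1]
```

In this toy case P falls by 20% and Q rises by 10%, so the equally weighted portfolio ends 5% down at 95,000.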

Clearly, 2022 was not great for stock markets. At the end of the year, we would be down on our investment regardless of the choice we made at the beginning. But losses would be lowest with the portfolio of outsiders. Their low correlation with highly influential markets can help them escape a downturn.

By the end of 2021, we already suspected that 2022 would not be rosy. If we had combined that intuition with the knowledge about weakly correlated markets and which ones they are, we would have made the right choice. Probably.

* Note that we conveniently ignore currencies and exchange rates. This article aims to illustrate Network Analysis rather than to provide accurate investment advice.