Semantic Bayesian Networks

Overview

The stock market is a dynamic, non-linear and chaotic system. Share evaluation involves identifying securities with suitable characteristics for inclusion in a portfolio. This requires investment professionals to assess multiple factors, including contradictory information, when deciding whether a share is suitable for inclusion in an investment portfolio. Intelligent systems incorporate Artificial Intelligence (AI) techniques to automate tasks and support human decision-making.

The INVEST system, proposed by Drake, is one such intelligent system. It incorporates ontologies and Bayesian networks to support decision-making for share evaluation on the Johannesburg Stock Exchange (JSE). Ontologies are used to formally encode unstructured expert information into a representative, machine-understandable form, whilst Bayesian networks (BNs) are useful in the financial domain because they can represent uncertain, ambiguous or incomplete knowledge.

Research Objectives

In this research, we evaluate the INVEST system under varying conditions and test several system extensions. The objectives are to:
  1. Evaluate the base INVEST system as designed by Drake.
  2. Conduct an ablation study to assess the relative importance of each component in the overall framework.
  3. Extend the system to include systematic risk within the BN topology.
  4. Test the stability and robustness of the INVEST system through a noise simulation component.
  5. Integrate a deep learning predictive component into the INVEST system.

System Design


The system is composed of six submodules within the INVEST module. The INVEST module contains a Store class, which acts as the central point of the system. Its responsibilities include invoking the functionality of the six submodules and running different experiments depending on user input. The accompanying package diagram depicts the dependencies between modules and submodules.


Submodules

1. Preprocessing

This submodule contains data loading and preprocessing utilities. Furthermore, this component has been extended to include a noise simulation component to facilitate system stress tests.

2. Calculator

This submodule performs the financial calculations of the system for each company, using financial ratio computations and thresholding logic. Each financial ratio produces a numerical output, which the threshold component subsequently converts into the discrete states required as input for the BNs.
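The thresholding step can be sketched as follows. This is a minimal illustration and not the INVEST implementation: the function name discretise, the comparison against a reference value and the state labels Below, EqualTo and Above are assumptions made for the example. Only the default threshold of 0.1 is taken from the experimental setup.

```python
def discretise(ratio: float, reference: float, threshold: float = 0.1) -> str:
    """Map a numerical ratio to a discrete state relative to a reference value.

    Hypothetical rule: the ratio is "Below" or "Above" when it differs from
    the reference by more than `threshold` (as a fraction of the reference),
    and "EqualTo" otherwise.
    """
    if reference == 0:
        raise ValueError("reference must be non-zero")
    relative_diff = (ratio - reference) / abs(reference)
    if relative_diff < -threshold:
        return "Below"
    if relative_diff > threshold:
        return "Above"
    return "EqualTo"


# Example: price-to-earnings ratios compared to a hypothetical sector value of 15.
states = [discretise(pe, reference=15.0) for pe in (12.0, 15.5, 18.0)]
print(states)
```

The resulting labels play the role of the evidence that is passed to the BNs; the real system may use different states and a different comparison rule.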

3. Networks

This submodule contains the implementation of three decision networks used to reach an investment decision: Value Evaluation, Quality Evaluation, and Investment Recommendation. These BNs conduct inference using the evidence returned by the thresholding component.

4. Evaluation

The INVEST system is evaluated through back-testing using historical data for shares listed on the JSE. This submodule contains the validation procedures and depends on a Metrics submodule that houses risk and risk-adjusted return metrics.

5. Metrics

This submodule contains the logic to determine portfolio performance, which is calculated using risk and risk-adjusted return metrics.

6. Prediction

This component includes logic to facilitate the integration of a Graph Neural Network (GNN) with the BNs. Prediction is performed based on the close price data of each share.


Bayesian Networks

The Value Evaluation network is used to evaluate a share relative to price and determines whether a share is Cheap, FairValue or Expensive. The Quality Evaluation network is used to evaluate the quality of a share and determines whether a share is of a Low, Medium or High quality.

Finally, the Investment Recommendation network incorporates the output of the Quality and Value BNs to reflect the decision process performed by investors, who consider both whether the price of a share is reasonable and the quality of the share. The network thus provides an investment recommendation for a specific share in the form of a final Yes or No decision, reflecting the system's assessment of the share's suitability for investment.
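The way the recommendation combines the two upstream networks can be illustrated with a small, library-free calculation. This is a sketch under stated assumptions, not the network used by INVEST: the conditional probabilities in P_YES, the 0.5 decision cut-off and the treatment of the Value and Quality outputs as independent are all hypothetical, and the actual system performs inference with a BN library rather than by hand.

```python
from itertools import product

VALUE_STATES = ("Cheap", "FairValue", "Expensive")
QUALITY_STATES = ("High", "Medium", "Low")

# Hypothetical P(Invest = Yes | value, quality); illustrative numbers only.
P_YES = {
    ("Cheap", "High"): 0.95, ("Cheap", "Medium"): 0.75, ("Cheap", "Low"): 0.30,
    ("FairValue", "High"): 0.80, ("FairValue", "Medium"): 0.50, ("FairValue", "Low"): 0.15,
    ("Expensive", "High"): 0.40, ("Expensive", "Medium"): 0.20, ("Expensive", "Low"): 0.05,
}


def recommend(p_value: dict, p_quality: dict, cutoff: float = 0.5):
    """Marginalise over the Value and Quality beliefs to obtain P(Invest = Yes)."""
    p_yes = sum(
        p_value[v] * p_quality[q] * P_YES[(v, q)]
        for v, q in product(VALUE_STATES, QUALITY_STATES)
    )
    return ("Yes" if p_yes >= cutoff else "No"), p_yes


# Example beliefs, as might be produced by the Value and Quality networks.
decision, p_yes = recommend(
    {"Cheap": 0.7, "FairValue": 0.2, "Expensive": 0.1},
    {"High": 0.6, "Medium": 0.3, "Low": 0.1},
)
print(decision, round(p_yes, 3))
```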


Experimental Design

Data

The dataset consists of price and fundamental data for 36 shares listed on the JSE between 2012 and 2018: 17 shares from the General Industrials sector and 19 shares from the Consumer Services sector.

Benchmark

The JSE All-Share Index, General Industrials (JGIND) and Consumer Services (JCSEV) are the selected benchmark indices to validate the share evaluation recommendations for the respective sectors. The benchmark dataset includes a composite of FTSE/JSE General Industrials, FTSE/JSE Consumer Services and FTSE/JSE All-Share indices data.

Setup

Each experiment was conducted on an Apple MacBook Pro with an Intel® Core™ i5 CPU @ 2.4 GHz. Unless otherwise specified, each experiment was run with the default threshold of 0.1 and β = 0.2.

Evaluation

Return and risk-adjusted return metrics are used to evaluate the performance of the semantic BNs for share evaluation. These metrics include Annual Return, Compound Return (CR), Average Annual Return (AAR), Treynor Ratio (TR) and Sharpe Ratio (SR). This selection is informed by our requirement of reproducibility and by the need to support robust evaluation of other BN topologies. The system is evaluated over one-year and one-month holding periods.
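For reference, the metrics can be illustrated with their textbook definitions. This is a sketch under stated assumptions and not the Metrics submodule itself: the function names, the sample returns, the risk-free rate and the portfolio β below are hypothetical, and the system may annualise or aggregate returns differently.

```python
from math import prod
from statistics import mean, stdev


def compound_return(returns):
    """Total growth over the whole period: product of (1 + r) minus 1."""
    return prod(1.0 + r for r in returns) - 1.0


def average_annual_return(returns):
    """Arithmetic mean of the yearly returns."""
    return mean(returns)


def sharpe_ratio(returns, risk_free):
    """Excess return per unit of total risk (sample standard deviation)."""
    return (mean(returns) - risk_free) / stdev(returns)


def treynor_ratio(returns, risk_free, beta):
    """Excess return per unit of systematic risk (portfolio beta)."""
    return (mean(returns) - risk_free) / beta


# Hypothetical yearly portfolio returns, risk-free rate and portfolio beta.
yearly = [0.10, -0.05, 0.20, 0.08]
cr = compound_return(yearly)
aar = average_annual_return(yearly)
sr = sharpe_ratio(yearly, risk_free=0.06)
tr = treynor_ratio(yearly, risk_free=0.06, beta=0.8)
print(f"CR={cr:.4f} AAR={aar:.4f} SR={sr:.4f} TR={tr:.4f}")
```

The Sharpe Ratio penalises all volatility, whereas the Treynor Ratio penalises only market-related risk, which is why the two can rank the same portfolio differently against a benchmark.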


Implementation

Python, a general-purpose programming language, is the selected development platform for the system. The pyAgrum library is used to create and manage Bayesian networks, given its ability to perform efficient computations.

A visualization web console was developed using a Flask server and a ReactJS interface. The console allows the user to select different experiments and obtain a graphical view of the results in comparison to the relevant benchmark.



Results

Base INVEST

This experiment validates the results of the INVEST system. It was run with β = {0.2, 0.6, 1.0, 1.4, 1.6} for the years 2015 to 2018, and the results were compared to the benchmarks for the respective sectors.

Performance comparison of INVEST investment portfolio and benchmark for 2015 - 2018 (β = 0.2)
Measure  IP.JGIND  JGIND  IP.JCSEV  JCSEV
CR       13.33%    6.96%  37.55%    0.51%
AAR      12.01%    7.32%  11.13%    0.83%
TR       0.37      0.09   0.82      0.01
SR       0.25      0.35   2.58      0.03

Fig.4 - Annual Returns of INVEST vs Benchmark for 2015 - 2018 (β = 0.2)

For β = 0.2, the experimental results demonstrate that the annual returns of the INVEST portfolio significantly outperformed the General Industrials benchmark in 2016 and the Consumer Services benchmark in 2015 and 2016.

This experiment was also run for β = 0.6 and β = 1, where the system performed slightly worse. This provides tentative evidence in favour of a more conservative investment approach, in which the selected shares carry less systematic risk than the market. However, we note that the system still outperformed the benchmark in the majority of the conducted experiments.



Based on our empirical evaluation of INVEST, we conclude that the base INVEST system delivers excess returns over the benchmark in the majority of experiments for the General Industrials and Consumer Services sectors, particularly at lower β values. This provides evidence for a more conservative investment approach.

Extended INVEST

Fig.5 - Extended Quality Evaluation Bayesian Network

The extended Quality Bayesian Network depicts the systematic risk extension (modelled in pink), which was included by modifying the topology of the Quality Evaluation BN to model risk within the system. This experiment was conducted for β = {0.2, 1.0} for years 2015 to 2018, and was compared to the base INVEST system.




Fig.6 - Annual Returns of INVEST Extended network vs INVEST for 2015 - 2018 (β = 0.2)

For β = 0.2, the extended system outperformed the baseline results for General Industrials, but not for Consumer Services. For β = 1, the improvement for General Industrials diminished, with the extended system achieving near parity with the base system. The results again did not improve for Consumer Services.






Fig.7 - Annual Returns of INVEST Extended network vs INVEST for 2015 - 2018 (β = 1)

With the inclusion of systematic risk in the Quality network, the system produced improved results at a lower β value. However, when β was increased to 1, the extension had only a marginal impact on performance. We conclude that there is tentative evidence in favour of including systematic risk in the network.





Ablation Study

Each network is tested in isolation to determine its relative importance within the overall framework.

Performance comparison of Value Evaluation network investment portfolio and INVEST for 2015 - 2018 (β = 0.2)
Measure  IP.JGIND.V  IP.JGIND  IP.JCSEV.V  IP.JCSEV
CR       16.05%      13.33%    43.19%      37.55%
AAR      5.13%       12.01%    11.61%      11.13%
TR       0.5         0.37      0.97        0.82
SR       0.33        0.25      2.95        2.58

Performance comparison of Quality Evaluation network investment portfolio and INVEST for 2015 - 2018 (β = 0.2)
Measure  IP.JGIND.Q  IP.JGIND  IP.JCSEV.Q  IP.JCSEV
CR       -53.24%     13.33%    19.80%      37.55%
AAR      -8.48%      12.01%    2.95%       11.13%
TR       -1.45       0.37      0.49        0.82
SR       -3.29       0.25      1.7         2.58

This experiment yielded unexpected results for β = 0.2. Considering only the Value BN, the resultant investment portfolio outperformed the INVEST baseline across all metrics for Consumer Services, and across all metrics except AAR for General Industrials. We observe a significant decrease in performance when β = 1.

Excluding the Value BN and testing only the Quality BN for β = {0.2, 1.0} resulted in a stark decrease in performance. In both configurations, the base INVEST system outperformed the recommended investment portfolio. We conclude that the Value network is relatively more important to the INVEST system than the Quality network.

Hybrid Integration

A hybrid experiment was performed by integrating a spatial-temporal graph neural network (GNN) model into the system. To perform the hybrid experiment, the FutureSharePerformance node was classified as Positive, Negative or Stagnant using inference on price movement from the GNN. The experiment was conducted for β = {0.2, 1} for one-year and one-month holding periods and compared to the base INVEST system results.
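One way the GNN output could be mapped onto the three states of the FutureSharePerformance node is sketched below. The rule is an assumption made for illustration: the function name classify_movement, the use of the final predicted close and the 1% tolerance band are hypothetical and are not taken from the INVEST implementation.

```python
def classify_movement(last_close: float, predicted_closes: list, tolerance: float = 0.01) -> str:
    """Label the predicted price movement as Positive, Negative or Stagnant.

    Hypothetical rule: compare the last predicted close with the most recent
    observed close, and treat relative changes within +/- `tolerance` as Stagnant.
    """
    if last_close <= 0:
        raise ValueError("last_close must be positive")
    if not predicted_closes:
        raise ValueError("predicted_closes must not be empty")
    change = (predicted_closes[-1] - last_close) / last_close
    if change > tolerance:
        return "Positive"
    if change < -tolerance:
        return "Negative"
    return "Stagnant"


# Example: three hypothetical forecasts for a share that last closed at 100.0.
labels = [
    classify_movement(100.0, [101.0, 103.0, 105.0]),
    classify_movement(100.0, [99.5, 100.2, 100.5]),
    classify_movement(100.0, [98.0, 96.0, 95.0]),
]
print(labels)
```

The label would then be supplied to the BN as evidence for the FutureSharePerformance node before inference.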

Performance comparison of INVEST + GNN and INVEST investment portfolio over a one-year holding period (β = 1)
Measure  IP.JGIND.G  IP.JGIND  IP.JCSEV.G  IP.JCSEV
CR       3.75%       2.53%     6.82%       9.49%
AAR      21.49%      20.80%    4.94%       6.33%
TR       0.13        0.10      0.18        0.25
SR       0.09        0.06      0.30        0.5

Performance comparison of INVEST + GNN and INVEST investment portfolio over a one-month holding period (β = 1)
Measure  IP.JGIND.G  IP.JGIND  IP.JCSEV.G  IP.JCSEV
CR       0.79%       -0.48%    1.29%       1.48%
AAR      0.44%       -1.56%    1.07%       1.08%
TR       0.02        -0.01     0.02        0.03
SR       0.09        -0.18     0.34        0.40

For β = 0.2, the hybrid system produced the same results as the base system for both holding periods, given that an identical set of shares was selected. However, when β = 1 for a one-year holding period, the results demonstrate no performance improvement for Consumer Services in comparison to the base system. For the General Industrials sector, the hybrid system outperforms the baseline across all metrics.

We posit that the results are a consequence of the specified prediction window. The ST-GNN model outputs predicted values for the next 10 days, while the investment portfolio holding period is one year. When β = 1 for the one-month holding period, we see an improvement in performance. The results demonstrate a drastic improvement relative to the one-year holding period for General Industrials.

We conclude that integrating a GNN or, more broadly, a machine learning model could be valuable for short-term holding periods, provided that the holding period is temporally consistent with the prediction horizon of the model.

Noise Simulation

Fig.8 - Noise Simulation for 2015 - 2018 (β = 0.2)

To add noise, a fractional sample of the dataset is selected at random, and each selected value is adjusted positively or negatively by the standard deviation of the variable of interest. The simulation was run 10 times and the mean metric values are reported.
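The noise procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the function names add_noise and mean_metric_under_noise, the sampled fraction and the use of a seeded random generator are hypothetical, and only the shift by one standard deviation and the averaging over 10 runs come from the description above.

```python
import random
from statistics import mean, stdev


def add_noise(values, fraction, rng):
    """Shift a random fraction of the values by +/- one standard deviation."""
    sigma = stdev(values)
    k = round(fraction * len(values))
    noisy = list(values)
    for i in rng.sample(range(len(values)), k):
        noisy[i] += rng.choice((-1.0, 1.0)) * sigma
    return noisy


def mean_metric_under_noise(values, metric, fraction=0.25, runs=10, seed=0):
    """Average a metric over several independently perturbed copies of the data."""
    rng = random.Random(seed)
    return mean(metric(add_noise(values, fraction, rng)) for _ in range(runs))


# Example: perturb a hypothetical series of 20 ratio values.
series = [float(x) for x in range(1, 21)]
noisy_series = add_noise(series, 0.25, random.Random(1))
changed = [i for i, (a, b) in enumerate(zip(series, noisy_series)) if a != b]
print(len(changed), mean_metric_under_noise(series, mean))
```

Comparing the averaged metric with the value obtained on the unperturbed data gives a simple indication of how sensitive the system is to noisy inputs.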

The results demonstrate that the INVEST system cannot entirely withstand noise, given that the results differ from the baseline. Therefore, we conclude that the system is not entirely stable and robust.


Future Work

This research provides a solid foundation for further investigation and extension of the INVEST system and of other semantic BN-based intelligent decision support systems (IDSSs) for share evaluation.