QWR Articles

Applied Machine Learning – Bayesian Modeling in Ninja Trader Strategies

by Marcio Costa


Over time, the way traders approach markets has become more complex, more fragmented, and more abstract.
Today, there are multiple trading venues, hundreds of order types with modifiers, and countless algorithmic smart order routing systems,
all with the express purpose of competing to get the best fill at the lowest total cost. But there is still, conceptually,
one major advance that market participants recognize as the tool that has brought the most value to the quantitative approach: machine learning.
Machine learning is a fast way to test the evidence for, or the effectiveness of, alphas during the strategy-building process.
When analyzing data, we can use several methods to look for alphas: visual inspection, backtesting, Monte Carlo, and other simulations,
but machine learning can show us the strongest statistical evidence over time. This article sheds some light on the Bayesian model
and a real application for the Ninja Trader platform. We will not explore machine learning definitions, and we assume traders
are comfortable with basic ML, math, and statistical concepts.

Bayesian Modeling

The naive Bayes model can be used for binary classification, for instance predicting whether the weather will be warm or cold
from several features such as rain, cloud coverage, and season, or for multiclass classification. The classification step is the key
factor for the ML statistics to be applied. The Gaussian form of the model used here also assumes that each variable in the dataset is approximately normally distributed within each class,
and that its statistical properties do not drift over time. This second property is called stationarity: it implies that a time series does not have a trend or seasonal effects [1].
In this article, I use the term alphas (https://www.investopedia.com/terms/a/alpha.asp) for
the values extracted from indicators or even from raw market data. We will test these indicator-derived alphas,
which this article calls priors (in standard naive Bayes terminology they are the predictors, or features), to measure whether some values, individually or combined, can help predict price movement.
I explain how to build a naive Bayes classification system when the indicator values are numeric, using the C# language without any special code libraries.

The Code

The code has been kept simple so that it can be adapted to many applications and situations; leaving it as an open concept
lets you explore more options. All code runs in real time inside the OnBarUpdate() method of the Ninja Trader script.
It is not recommended for high-frequency or tick-by-tick use because of the nature of the operation and the amount of resources it consumes.
The first step is to create and load a dataset. In this case we use arrays, which give fast in-memory access.
The reference data used to determine the class of an unknown data item is stored in an array-of-arrays style matrix.
This structure accommodates any number of priors (explained later under "Alphas and priors"), so we can combine more probabilities.
To make this more visual and practical, the table below shows the layout with some random numbers:


[Table: columns "Bar 1", "Bar X", and "Bar Y – prediction"; final row "action (return in ticks)". The cell values were not preserved.]




The matrix (arrays) will be loaded with values from indicators for several bars. From these arrays we will:

1 compute class counts
2 compute / display means
3 compute / display variances
4 set up item to predict
5 compute / display conditional probabilities
6 compute / display unconditional probabilities
7 compute / display evidence terms
8 compute / display prediction probabilities

The prediction probabilities indicate the most likely price direction for the next bar.
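
Steps 5 to 8 can be summarized in a single expression. The following is a sketch in standard Gaussian naive Bayes notation; the symbols $x_j$ (value of predictor $j$ for the item to predict), $\mu_{c,j}$ and $\sigma^2_{c,j}$ (mean and variance of predictor $j$ within class $c$) and $f$ (the Gaussian density) are introduced here for illustration and are not named in the article's code:

```latex
P(C = c \mid x_1, x_2, x_3)
  = \frac{P(C = c)\,\prod_{j=1}^{3} f\!\left(x_j \mid \mu_{c,j}, \sigma^2_{c,j}\right)}
         {\sum_{k \in \{0,1\}} P(C = k)\,\prod_{j=1}^{3} f\!\left(x_j \mid \mu_{k,j}, \sigma^2_{k,j}\right)}
```

The numerator corresponds to the evidence term of one class, and the denominator is the sum of the evidence terms over both classes.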

Let’s move to the real code. The first thing is to declare the variables:

private double[][] data;

protected override void OnStateChange()
{
	if (State == State.SetDefaults)
	{
		data = new double[1000][];	// room for up to 1000 training rows
	}
	else if (State == State.DataLoaded)
	{
		Lista = new List<TestList>();
	}
}

Besides the main data array, we also create a list. This redundancy lets us use functions on both data structures.
Although it looks like double storage, some operations are more convenient on lists and others on arrays.

  private List<TestList> Lista;

  // One training row: three prior (indicator) values plus the bar direction.
  private class TestList
  {
    public double lprior1;
    public double lprior2;
    public double lprior3;
    public double lside;   // 1 = up bar, 0 = down bar
    public TestList(double myprior1, double myprior2, double myprior3, double myside)
    {
      lprior1 = myprior1;
      lprior2 = myprior2;
      lprior3 = myprior3;
      lside = myside;
    }
  }

For all machine learning applications we follow the same workflow: train on in-sample data and test on out-of-sample data.
The dataset is split into training and test data. The training data contains the historical data, so you must allow NT to work with State.Historical.
In the OnBarUpdate() method, we add the input values:

  prior1 = EMA(8)[0] - EMA(8)[5];
  prior2 = SMA(8)[0] - SMA(8)[13];
  prior3 = RSI(13)[0] - RSI(13)[5];

We then complete the array input with the code below. Note that “trnum” is our own reference bar number; we do not use the
CurrentBar property from NT. Because our data[] array is limited to 1000 records, for this study we had to limit the number of historical bars.

    // trnum must stay below data.Length (1000)
    if (Close[0] > Open[0])
    {
        data[trnum] = new double[] { prior1, prior2, prior3, 1 }; // 1 = Up Bar
        Lista.Add(new TestList(prior1, prior2, prior3, 1));
    }
    if (Close[0] < Open[0])
    {
        data[trnum] = new double[] { prior1, prior2, prior3, 0 }; // 0 = Down Bar
        Lista.Add(new TestList(prior1, prior2, prior3, 0));
    }
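
The 1000-row limit and the up/down labeling can be made explicit in a small helper. This is only a sketch under stated assumptions: the class TrainingBuffer and its members TryAdd, Count, and Row are hypothetical names, not part of Ninja Trader or of the article's script, and Count plays the role that trnum plays in the listing above. It also shows one way to treat a bar whose close equals its open, which the listing leaves unlabeled.

```csharp
using System;

// Hypothetical helper (not a Ninja Trader type): a fixed-capacity training store.
public class TrainingBuffer
{
    private readonly double[][] data;

    // Number of rows stored so far; plays the role of "trnum".
    public int Count { get; private set; }

    public TrainingBuffer(int capacity)
    {
        data = new double[capacity][];
    }

    // Returns true if a row was stored; false for a full buffer or an unchanged bar.
    public bool TryAdd(double prior1, double prior2, double prior3, double open, double close)
    {
        if (Count >= data.Length) return false;   // never write past the last slot
        if (close == open) return false;          // neither an up bar nor a down bar
        double side = close > open ? 1.0 : 0.0;   // 1 = up bar, 0 = down bar
        data[Count] = new double[] { prior1, prior2, prior3, side };
        Count++;
        return true;
    }

    public double[] Row(int i)
    {
        return data[i];
    }
}
```

Because rows are only counted when they are actually stored, every index below Count holds a valid row, which is what the later loops over the data assume.
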

Preparing the Prediction

There are four steps to prepare a naive Bayes prediction for numeric data. You must compute the counts of each class to predict,
the means of each predictor, the variances of each predictor, and set up the item to predict. The class counts are computed like so:

      int N = Lista.Count;
      int[] classCts = new int[2]; // [0] = down bars, [1] = up bars
      for (int i = 0; i < N; ++i)
      {
        int c = (int)data[i][3]; // class is at [3]
        ++classCts[c];
      }

The instructions above fill the two class counts, for up bars and down bars. In the next step we compute the means of each predictor variable, for each class.

double[][] means = new double[2][];
for (int c = 0; c < 2; ++c)
	means[c] = new double[3];
for (int i = 0; i < N; ++i)
{
	int c = (int)data[i][3];
	for (int j = 0; j < 3; ++j)  // EMA, SMA, RSI
		means[c][j] += data[i][j];  // accumulate sums; divided by counts below
}

The values are stored in an array-of-arrays matrix named means where the first index indicates which class (0 = down, 1 = up) and the second index indicates which predictor
(0 = EMA, 1 = SMA, 2 = RSI). After the sums of the predictors are computed, they’re converted to means by dividing by the number of items in each class:

for (int c = 0; c < 2; ++c)
    for (int j = 0; j < 3; ++j)
      means[c][j] /= classCts[c];

The demo sets up storage for the variances of each predictor using the same indexing scheme that’s used for the means:

 double[][] variances = new double[2][];
 for (int c = 0; c < 2; ++c)
   variances[c] = new double[3];
 for (int i = 0; i < N; ++i)
 {
   int c = (int)data[i][3];
   for (int j = 0; j < 3; ++j)
   {
     double x = data[i][j];
     double u = means[c][j];
     variances[c][j] += (x - u) * (x - u);  // sum of squared deviations
   }
 }

Then the sums are divided by one less than each class count to get the sample variances:

    for (int c = 0; c < 2; ++c)
      for (int j = 0; j < 3; ++j)
        variances[c][j] /= classCts[c] - 1;
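
The counting, mean, and variance steps above can be collected into one routine and checked on a tiny dataset. This is a minimal sketch, not the article's script: the class ClassStats and its Compute method are hypothetical names, and the routine assumes the class label sits in the last column of every row, as it does in the data[] array.

```csharp
using System;

public static class ClassStats
{
    // data rows: predictor values followed by the class label in the last column.
    public static void Compute(double[][] data, int numClasses,
        out int[] counts, out double[][] means, out double[][] variances)
    {
        int p = data[0].Length - 1;             // number of predictors
        counts = new int[numClasses];
        means = new double[numClasses][];
        variances = new double[numClasses][];
        for (int c = 0; c < numClasses; ++c)
        {
            means[c] = new double[p];
            variances[c] = new double[p];
        }
        foreach (double[] row in data)           // pass 1: class counts and sums
        {
            int c = (int)row[p];
            ++counts[c];
            for (int j = 0; j < p; ++j) means[c][j] += row[j];
        }
        for (int c = 0; c < numClasses; ++c)
            for (int j = 0; j < p; ++j) means[c][j] /= counts[c];
        foreach (double[] row in data)           // pass 2: squared deviations
        {
            int c = (int)row[p];
            for (int j = 0; j < p; ++j)
            {
                double d = row[j] - means[c][j];
                variances[c][j] += d * d;
            }
        }
        for (int c = 0; c < numClasses; ++c)
            for (int j = 0; j < p; ++j)
                variances[c][j] /= counts[c] - 1;   // sample variance
    }
}
```

For the one-predictor rows {1, 0}, {3, 0}, {4, 1}, {6, 1}, {8, 1}, the routine gives class counts 2 and 3, means 2 and 6, and sample variances 2 and 4. Each class needs at least two rows; otherwise the division by the count minus one is a division by zero.
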

Making the Prediction

To make a prediction, you use the item to predict, the means, and the variances to compute conditional probabilities.
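Each conditional probability comes from the Gaussian (normal) probability density function, evaluated with the class mean and variance of the predictor; the code that computes it follows.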
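
For reference, the Gaussian density can be written as follows. The symbols $\mu$ (mean) and $\sigma^2$ (variance) are standard notation introduced here; they correspond to the arguments u and v of ProbDensFunc, and $x$ is the value of the predictor for the item to predict:

```latex
f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\,
  \exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)
```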

Storage for the conditional probabilities uses the same indexing scheme as the means and variances,
where the first index indicates the class and the second index indicates the predictor. Each entry is computed with the following helper functions:

        // Gaussian density: u = mean, v = variance, x = value to evaluate
        static double ProbDensFunc(double u, double v, double x)
        {
            double left = 1.0 / Math.Sqrt(2 * Math.PI * v);
            double right = Math.Exp( -(x - u) * (x - u) / (2 * v) );
            return left * right;
        }

        // Absolute distance of x from the mean u, in units of v
        static double ProbDensFuncStdDev(double u, double v, double x)
        {
            double left = Math.Abs(x - u);
            return left / v;
        }

The probability density function defines the shape of the Gaussian bell-shaped curve.
Note that these values are not true probabilities, since they can be greater than 1.0,
but they are usually called probabilities anyway.
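
A quick check shows when a density exceeds 1.0. The sketch below repeats the same Gaussian formula in a standalone class (DensityDemo and Pdf are hypothetical names used only for this illustration) so that it can be run outside Ninja Trader.

```csharp
using System;

public static class DensityDemo
{
    // Gaussian density: u = mean, v = variance, x = value to evaluate.
    public static double Pdf(double u, double v, double x)
    {
        return Math.Exp(-(x - u) * (x - u) / (2 * v)) / Math.Sqrt(2 * Math.PI * v);
    }
}
```

DensityDemo.Pdf(0.0, 1.0, 0.0) is about 0.399, while DensityDemo.Pdf(0.0, 0.01, 0.0) is about 3.989. The smaller the variance, the taller the peak, so predictors with tightly clustered values can produce densities well above 1.0.
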
Unconditional probabilities of the classes to predict are ordinary probabilities.
For example, if four of eight data items are class 0, then P(C = 0) = 0.5000.
Each class's unconditional probability (classProbs) is then multiplied by its conditional probabilities (condProbs) to give the evidence terms:

      double[] evidenceTerms = new double[2];
      for (int c = 0; c < 2; ++c)
      {
        evidenceTerms[c] = classProbs[c];
        for (int j = 0; j < 3; ++j)
          evidenceTerms[c] *= condProbs[c][j];
      }

The last step in making a prediction is to convert the evidence terms to probabilities. You compute the sum of all evidence terms then divide each term by the sum:

    double sumEvidence = 0.0;
    for (int c = 0; c < 2; ++c)
      sumEvidence += evidenceTerms[c];
    double[] predictProbs = new double[2];
    for (int c = 0; c < 2; ++c)
      predictProbs[c] = evidenceTerms[c] / sumEvidence;
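
The listings use classProbs and condProbs without showing how they are filled. The sketch below shows one plausible way to complete the prediction from the class counts, means, and variances. It is an assumption about the missing steps, not the author's exact code; the class NaiveBayesPredictor and the method Predict are hypothetical names, and the conditional densities are multiplied in directly instead of being stored in a condProbs matrix.

```csharp
using System;

public static class NaiveBayesPredictor
{
    // Gaussian density: u = mean, v = variance, x = value to evaluate.
    static double Pdf(double u, double v, double x)
    {
        return Math.Exp(-(x - u) * (x - u) / (2 * v)) / Math.Sqrt(2 * Math.PI * v);
    }

    // counts[c], means[c][j], variances[c][j] as computed earlier;
    // item holds the predictor values of the bar to classify.
    public static double[] Predict(int[] counts, double[][] means,
                                   double[][] variances, double[] item)
    {
        int numClasses = counts.Length;
        int n = 0;
        foreach (int k in counts) n += k;      // total number of training rows

        double[] evidenceTerms = new double[numClasses];
        double sumEvidence = 0.0;
        for (int c = 0; c < numClasses; ++c)
        {
            double classProb = (double)counts[c] / n;   // unconditional P(C = c)
            evidenceTerms[c] = classProb;
            for (int j = 0; j < item.Length; ++j)       // conditional densities
                evidenceTerms[c] *= Pdf(means[c][j], variances[c][j], item[j]);
            sumEvidence += evidenceTerms[c];
        }

        double[] predictProbs = new double[numClasses];
        for (int c = 0; c < numClasses; ++c)
            predictProbs[c] = evidenceTerms[c] / sumEvidence;   // normalize to sum to 1
        return predictProbs;
    }
}
```

With two equally sized classes, one predictor with class means 0 and 2 and variance 1 in both classes, an item with value 0 gets a probability of about 0.881 for class 0 and about 0.119 for class 1. With many predictors the product of small densities can underflow to zero; summing logarithms of the terms is a common alternative in that case.
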

Prediction probabilities

The routine returns, for each class, a probability between 0 and 1: class 0 (the probability for a short) and class 1 (the probability for a long).
These classes can be integrated into a strategy, and can also feed a second data layer that processes the results of this virtual trading layer.
The virtual trading layer results can be analyzed and added to the original dataset.

In an ML environment, Bayesian modeling is gaining popularity because each prior contributes its own individual term,
which suits financial analysis well, and because it can easily be combined with other priors.

Alphas and priors

There is a lot to say about alphas and priors, and they are the main reason we can apply ML. In order to generate alpha (or edge)
we have to analyze several setups using market indicators, datasets, strategies, and other tools. The predictor model is used to seek evidence
of relationships between these factors. Most strategies rely on one or more indicators, and during the backtest process we can measure trading efficiency and behavior.
The improvement shown by ML applications is relevant because they work with dynamic datasets, which reduces the marginal error from poor-quality data.

Conclusion and future research

It is a constant struggle to find a way to bring new concepts like machine learning and its variants to current platforms.
Ninja Trader has a strategy builder interface with NT Script, but using external libraries can sometimes be quite tough.
The approach explored in this article shows a definite and easy path to prove results from an ML standpoint. Future work will look at new ways to
improve operation speed, resource efficiency, and language optimization.

Feel free to keep in touch at marcio@quantwisetrading.com


[1] Jansen, Stefan. Machine Learning for Algorithmic Trading: Predictive models to extract signals from market and alternative data for systematic trading strategies with Python, 2nd Edition (p. 433). Packt Publishing. Kindle Edition.