
Documentation for Administrators

This documentation is intended for administrators who want to run the application locally.

Trading Strategy Tester package

To use the Trading Strategy Tester Python package itself, you can simply install it via pip:

pip install trading-strategy-tester

This will install the latest stable version of the package. If you want to install the latest development version, you can clone the repository and install it locally:

git clone https://github.com/DrDanicka/trading_strategy_tester.git

or, if you are using an Apple device with MLX support, you can clone this version with submodules:

git clone --recurse-submodules https://github.com/DrDanicka/trading_strategy_tester.git

which also lets you train your models locally.
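To install the package from the cloned repository, you can point pip at the checkout. An editable install is shown here as one common option (this assumes the repository ships standard Python packaging metadata):

cd trading_strategy_tester
pip install -e .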

Web Application

The web application is a separate project that uses the Trading Strategy Tester package as a backend. If you want to use the web application, you need to do the following:

  • Install and start Ollama on your machine from the official Ollama website (https://ollama.com). After installing Ollama, restart your computer to ensure that the installation is complete and the environment is set up correctly.

  • Clone the repository:

git clone https://github.com/DrDanicka/trading_strategy_tester_web_app.git
  • Navigate to the project directory:
cd trading_strategy_tester_web_app
  • Initialize the Ollama models:
python init_ollama.py

This step downloads the fine-tuned weights for the Llama 3.2 models that are needed to create the Ollama models. You need at least 50 GB of free disk space. Once the weights are downloaded, the Ollama models are created and the downloaded weights are deleted.
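Once init_ollama.py finishes, you can confirm that the models were created with the standard Ollama CLI; the fine-tuned models should appear in the listing:

ollama list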

  • Start the application with Docker:
docker-compose up --build

This command builds the Docker image and starts the application. The app will be available at http://localhost:5001.
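To quickly verify that the app is reachable, you can issue a request from another terminal (any HTTP client will do):

curl http://localhost:5001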

Note: If you want to use the LLM integration in your code without the web application, you still need to complete steps 1-4 above (everything except starting the Docker container) to initialize the models. You can then use the process_prompt function to generate trading strategies from natural language prompts, as sketched below.
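For example, a minimal sketch of calling process_prompt (the import path and exact signature here are assumptions; check the package source for the actual API):

# Hypothetical import path; adjust to wherever process_prompt is defined.
from trading_strategy_tester import process_prompt

# Turn a natural-language prompt into a trading strategy.
strategy = process_prompt("Buy AAPL when the 14-day RSI drops below 30")
print(strategy)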

Training your own models

The previous section described how to create the Ollama models from the fine-tuned weights that were created for this project. That is the recommended way to use the models, but if you want to train your own, follow these steps:

IMPORTANT NOTE: Training your own models is only supported on Apple devices with MLX support. On any other device you can still use the pre-trained models from the previous section, but you won't be able to train your own.

  • Install and start Ollama on your machine from the official Ollama website (https://ollama.com).

  • Clone the repository:

git clone --recurse-submodules https://github.com/DrDanicka/trading_strategy_tester.git
  • Navigate to the trading_strategy_tester/evaluation directory:
cd trading_strategy_tester/evaluation
  • Create your own training data with the training data generator:
python3 ../trading_strategy_tester/training_data/training_data.py

You can also change the counts and seeds of the training data generator in the trading_strategy_tester/training_data/training_data.py file.

  • Train the models with the generated training data:
python train.py --model all

This command trains all the models with the generated training data. You can also train only specific models by using the --model option, and you can change the learning rate, number of iterations, and other parameters; you can find them in the train.py file.
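For example, to train only a single model (the model name below is hypothetical; see train.py for the accepted values and the remaining hyperparameters):

python train.py --model ticker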

After training finishes, all the models are created in Ollama and you can use them in your code or in the web application.

Evaluation

After cloning the repository, navigate to the evaluation/ directory. This directory contains two core scripts used to evaluate the performance of the LLMs:

  • run_model.py – generates responses of models for test prompts.
  • test_model.py – validates the responses and computes evaluation metrics.

Step 1: Generate model outputs

To generate model outputs, make sure you're inside the evaluation/ directory and run:

python run_model.py --model all

This will run all available models on their respective test sets and save the outputs to the testing_outputs/ directory. You can also generate outputs for a specific model:

python run_model.py --model llama3-2-1B_tst_ft-ticker --data _data/fields/ticker/test.jsonl

Each output file will contain:

  • the input prompt,
  • the expected completion,
  • and the actual response generated by the model.

The result will be saved to testing_outputs/{model}.jsonl.
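Conceptually, each line of that JSONL file is one record pairing the three items above. The field names below are illustrative, not necessarily the exact keys the scripts use:

{"prompt": "...", "expected": "...", "response": "..."}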

Step 2: Run the evaluation

Once the model outputs are generated, run the evaluation script:

python test_model.py --model all

To enable detailed logging or to evaluate a specific model:

python test_model.py --model llama3-2-1B_tst_ft-ticker --log

This script will:

  • validate the model outputs syntactically and semantically,
  • compare them with the expected completions,
  • and count how many parameters and outputs match.

Evaluation metrics will be saved in the results/ directory in JSON format, e.g., results/llama3-2-1B_tst_ft-ticker_results.json.

Requirements

Before running the evaluation, ensure that:

  • You have Ollama installed and running.
  • You have initialized the models using the init_ollama.py script.
  • You have generated or downloaded test data in _data/fields/{param}/test.jsonl or _data/full/test.jsonl.
  • You have installed the required Python packages from the provided requirements.txt file:
pip install -r requirements.txt

Notes

The test set contains 10,000 prompts. Since each model is tested individually using two methods (fine-tuning and few-shot prompting), evaluation is computationally intensive.

On hardware without GPU acceleration, one prompt may take up to 1 minute per model, making full evaluation take several days. In our case, generating all outputs took approximately 6 days on a Mac with an Apple M3 Pro chip.
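As a rough sense of scale: at one minute per prompt, a single pass over the 10,000-prompt test set alone amounts to about 10,000 minutes, i.e. roughly 167 hours of compute, before accounting for multiple models and methods.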