5

I'm implementing a backtesting strategy with the following approach:

  • We buy a stock at today's opening price.

  • We set stop-loss and target boundaries.

  • If the stop-loss boundary is hit first, we sell the stock at the stop-loss price.

  • If the target boundary is hit first, we hold the stock until the end of the day and sell at the closing price (i.e., remove the stop order after the target is hit).

  • If neither boundary is triggered, we hold the stock until the end of the day and sell at the closing price.

Since I don't have intraday data, I assume a uniform distribution for intraday price movements. Although this is my current implementation, I am uncertain about the realism of these assumptions for backtesting purposes.

import random

target_profit = 1.05
stop_loss = 0.95
target_price = target_profit * today_open
stop_price = stop_loss * today_open

if today_high >= target_price and today_low <= stop_price:
    # Both boundaries were touched. A guess, based on a uniform distribution,
    # at whether the stop or the target was hit first.
    if random.random() > (target_profit - 1) / (target_profit - stop_loss):
        sale_proceeds = shares * today_close  # target first: hold to the close
    else:
        sale_proceeds = shares * stop_price  # stop first: sell at the stop
elif today_high >= target_price:
    sale_proceeds = shares * today_close
elif today_low <= stop_price:
    sale_proceeds = shares * stop_price
else:
    sale_proceeds = shares * today_close

user3070752

4 Answers

13

You have a proposed stock investing plan:

  • If A happens, I do this;
  • if B happens, I do that; and
  • if C happens, I do something else.

While you know the opening price and the closing price, the actual prices during the trading day are a mystery.

You then propose to model what happens during the day by predicting that:

  • the A branch will happen X% of the time,
  • the B branch will happen Y% of the time, and
  • the C branch will happen (100 - X - Y)% of the time.

If your percentages are correct then you can predict how well your plan will work.

Without intraday data you can't test your theory, because the result depends on how well you guessed the values of X and Y.
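To make that dependence concrete, here is a minimal sketch of the expected sale proceeds on a day when both boundaries were touched, as a function of the assumed probability that the target came first. The function name `expected_proceeds_both_hit` and the example prices are hypothetical, chosen only for illustration; the selling rules are the ones described in the question.

```python
def expected_proceeds_both_hit(shares, today_open, today_close,
                               p_target_first, stop_loss=0.95):
    """Expected sale proceeds on a day where both boundaries were touched.

    p_target_first is the *assumed* probability that the target was hit
    before the stop; daily OHLC data cannot tell us its true value.
    """
    stop_price = stop_loss * today_open
    # Target first -> stop removed, sell at close; stop first -> sell at stop.
    return shares * (p_target_first * today_close
                     + (1.0 - p_target_first) * stop_price)


# Hypothetical day: open 100, close 103, high >= 105 and low <= 95.
for p in (0.25, 0.50, 0.75):
    value = expected_proceeds_both_hit(10, 100.0, 103.0, p)
    print(f"assumed P(target first) = {p:.2f} -> expected proceeds {value:.2f}")
```

The backtest result on such days moves linearly with the guessed probability, so any error in the guess flows straight into the measured performance.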

If you believe your method will work, you need the data. You need enough of it that the software that would be reacting in real time can be tested against the old data.

To take a sports example: suppose you predict that in an NFL game the first QB to throw for 200 yards will win the game, and that if neither does, the home team will win.

If you backtest knowing only the total passing yards, you have no way of knowing the order in which the teams reached 200 yards. So you guess. But now you are using data that can't test your theory.

mhoran_psprep
5

Your model of intraday movements is arguably consistent with the efficient market hypothesis (EMH), i.e., with the stock price executing a martingale random walk. Indeed, if you were to use a different strategy -- selling at the target or stop price, whichever is hit first -- then your probability formula automatically results in zero expected return in the case where both are hit. So this is actually more sophisticated and realistic than a "uniform distribution".
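As a rough check of that claim, the sketch below compares the probability implied by the question's formula with a simulated driftless random walk, and computes the expected return of the "sell at whichever boundary is hit first" variant. The helper names, the 1% tick size, and the asymmetric +10% / -5% boundaries are assumptions made for illustration, not part of the original code.

```python
import random


def p_target_first_formula(target_profit, stop_loss):
    """Probability implied by the question's code that the target is hit first."""
    return 1.0 - (target_profit - 1.0) / (target_profit - stop_loss)


def simulate_p_target_first(up_ticks, down_ticks, trials, rng):
    """Fraction of symmetric +-1 tick random walks that reach +up_ticks
    before -down_ticks (a driftless, martingale price path)."""
    wins = 0
    for _ in range(trials):
        pos = 0
        while -down_ticks < pos < up_ticks:
            pos += 1 if rng.random() < 0.5 else -1
        wins += pos == up_ticks
    return wins / trials


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical asymmetric boundaries: +10% target, -5% stop, 1% ticks.
p_formula = p_target_first_formula(1.10, 0.95)
p_sim = simulate_p_target_first(up_ticks=10, down_ticks=5, trials=20000, rng=rng)
# Expected return if we sold at whichever boundary is hit first.
expected_return = p_formula * 0.10 + (1.0 - p_formula) * (-0.05)
print(p_formula, p_sim, expected_return)
```

The simulated frequency should land close to the formula's value, and the expected return of the sell-at-boundary variant comes out at zero up to floating-point error, which is the martingale property referred to above.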

It is somewhat paradoxical to take a strategy that is presumably intended to exploit some market inefficiency, and backtest it while using EMH as an assumption. However, this can be interpreted as a "conservative" evaluation where you are adopting EMH as a pessimistic model to cover a gap in your data, while still using the data you do have (OHLC) in a way that does not completely assume EMH.

Ultimately, there is no substitute for obtaining the actual data needed to fully backtest a strategy. That said, you could make a somewhat more realistic evaluation with OHLC data by taking into account the closing price. If the open is lower than the close (upward intraday trend), then it's more likely that lower prices are hit before higher ones -- and vice versa. The details would be based on a "Brownian bridge" -- the statistics of a random walk conditioned on both an initial and a final value.
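A minimal sketch of the Brownian bridge idea, under several simplifying assumptions: the bridge is built on prices rather than log prices, `sigma` (the standard deviation of the unconditioned open-to-close move) is a made-up value, and paths are conditioned only on touching both boundaries, not on matching the exact recorded high and low. All function names and numbers are hypothetical.

```python
import random


def brownian_bridge(open_price, close_price, sigma, n_steps, rng):
    """One discretized Brownian bridge from open_price to close_price.

    sigma is the assumed standard deviation of the unconditioned
    open-to-close move, in price units.
    """
    step_sd = sigma / n_steps ** 0.5
    walk = [0.0]
    for _ in range(n_steps):
        walk.append(walk[-1] + rng.gauss(0.0, step_sd))
    end = walk[-1]
    # Remove the walk's own endpoint and pin the path to the observed close.
    return [open_price + w + (k / n_steps) * (close_price - open_price - end)
            for k, w in enumerate(walk)]


def p_target_first_given_both(open_price, close_price, target, stop,
                              sigma, n_steps, trials, rng):
    """Among simulated paths that touch both boundaries, the fraction
    that touch the target before the stop (None if no path qualifies)."""
    both = target_first = 0
    for _ in range(trials):
        path = brownian_bridge(open_price, close_price, sigma, n_steps, rng)
        t_hit = next((k for k, p in enumerate(path) if p >= target), None)
        s_hit = next((k for k, p in enumerate(path) if p <= stop), None)
        if t_hit is not None and s_hit is not None:
            both += 1
            target_first += t_hit < s_hit
    return target_first / both if both else None


rng = random.Random(1)  # fixed seed so the sketch is repeatable
for close in (103.0, 97.0):  # hypothetical up day and down day
    p = p_target_first_given_both(100.0, close, 105.0, 95.0,
                                  sigma=10.0, n_steps=100, trials=4000, rng=rng)
    print(f"close = {close}: estimated P(target first | both hit) = {p:.3f}")
```

On the up day the estimate should fall below one half (the stop tends to be hit first), and on the down day above it, so the close can replace the fixed probability used in the question's code.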

nanoman
5

This is perhaps a slightly unorthodox answer, but a uniform distribution of price changes is not how the stock market actually behaves. If you want to build models that resemble the actual market, you might want to look into the work of Benoit Mandelbrot. He is most famous for his work on fractal geometry, but he also taught economics for a while. One of his doctoral students was Eugene Fama. Mandelbrot proposed that price change distributions were not normal and that they were actually 'fat-tailed'.

Somewhat confusingly, Fama confirmed Mandelbrot's Pareto-Lévy hypothesis empirically prior to finishing his work on the efficient market hypothesis. Mandelbrot's hypothesis is a bit at odds with the EMH. As far as I can tell, the reason that Fama produced the EMH after showing that it was flawed is simply that the statistical tools and methods to deal with Mandelbrot's ideas did not exist; he stated in 1963 that there was a need "to develop more adequate statistical tools for dealing with the stable Paretian distributions".

OK, enough with the history lesson. Why should you care? In his book "The (Mis)behavior of Markets", Mandelbrot discusses creating fake distributions that are like market movements. Specifically, on page 17 he shows 4 charts of market prices: 2 are real and 2 are fake. A few pages later, he shows a chart of the corresponding price movements. Of the 4, one is obviously not like the others. That is, the 2 real charts and one fake look very similar, but one chart is obviously fake. The one that's obviously fake is ... the one generated using a uniform distribution of price movements. It's pretty clear that if you want to simulate real market movements, that's not the right way to do it.

He goes on to explain this in detail, along with specific methodologies. For example, he discusses the concept of an H-factor (frustratingly difficult to find information about on the web), which IIRC is roughly a measure of momentum or, maybe more correctly, of how uniformly distributed changes in price are. That is, it describes how likely a series is to revert to the mean (or not). You might want to research his work in economics if you want to generate realistic simulations of price movements. I think, along with using historical data, this could be very useful, as you can experiment with how your strategy works under varying levels of uniformity and non-uniformity.
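As a small illustration of the difference between uniform and fat-tailed price changes, the sketch below compares the sample excess kurtosis of the two. The Student-t distribution with 5 degrees of freedom is only a convenient stand-in for "fat-tailed"; it is not the stable Paretian model Mandelbrot proposed, and the helper names are hypothetical.

```python
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis: about 0 for normal data, positive for fat tails."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


def student_t(df, rng):
    """One Student-t draw: standard normal over sqrt(chi-square / df)."""
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return rng.gauss(0.0, 1.0) / (chi2 / df) ** 0.5


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 20000
uniform_moves = [rng.uniform(-1.0, 1.0) for _ in range(n)]
fat_tailed_moves = [student_t(5, rng) for _ in range(n)]
print("uniform:", excess_kurtosis(uniform_moves))
print("Student-t:", excess_kurtosis(fat_tailed_moves))
```

Uniform moves have negative excess kurtosis (no extreme moves at all), while the fat-tailed draws have clearly positive excess kurtosis, so swapping one generator for the other is a simple way to stress a strategy against occasional large moves.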

3

The stock market is generally taken as having log normal returns. According to the Central Limit Theorem, if you have a process that yields any probability distribution for percentage change with finite standard deviation over period t, then if you take the limit as t goes to zero (with "limit" suitably defined), the distribution for a fixed time period goes to log normal.

So, for instance, if you model price changes as having uniformly distributed percentage change each second, then the percentage change over each day will be close to log normal.
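A quick way to see this numerically is to compound many small, uniformly distributed percentage changes and look at the shape of the resulting daily log returns. This is a minimal sketch with hypothetical settings (200 steps per day, each step uniform within ±0.1%); skewness and excess kurtosis near zero indicate that the log return is close to normal, i.e. the return itself is close to log normal.

```python
import math
import random


def daily_log_return(n_steps, max_step, rng):
    """Log of the compounded return over n_steps, where each step's
    percentage change is uniform on [-max_step, +max_step]."""
    return sum(math.log1p(rng.uniform(-max_step, max_step))
               for _ in range(n_steps))


def skew_and_excess_kurtosis(xs):
    """Sample skewness and excess kurtosis (both about 0 for normal data)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


rng = random.Random(7)  # fixed seed so the sketch is repeatable
# Hypothetical settings: 200 intraday steps of up to +-0.1% each, 3000 days.
log_returns = [daily_log_return(200, 0.001, rng) for _ in range(3000)]
skew, kurt = skew_and_excess_kurtosis(log_returns)
print(f"skewness = {skew:.3f}, excess kurtosis = {kurt:.3f}")
```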

Acccumulation