Why Your Forex EA Backtest Period Is Probably Wrong — And How to Fix It

March 16, 2026 · By The Nomad Trader

I’ve seen it a hundred times. Someone posts a backtest showing 300% returns over two years, perfect equity curve, profit factor of 2.8. Three months into live trading, the account is down 40%.

The backtest wasn’t fraudulent. The developer wasn’t lying. They just picked an arbitrary period — and it was the wrong one for that strategy’s timeframe.

The question “how long should I backtest?” doesn’t have a single answer. It has a framework. And once you understand the framework, the right number falls out automatically for any EA you’re building or evaluating.

The Real Goal: Trade Count + Regime Coverage

Before talking about years, you need to understand what a backtest is actually trying to prove. It’s doing two things simultaneously:

  1. Statistical validity — enough completed trades to draw conclusions about the system’s edge
  2. Market regime coverage — enough calendar time to have encountered trending markets, ranging markets, high-volatility crises, and low-volatility consolidations

These two requirements pull in opposite directions depending on timeframe. A D1 system trades infrequently, so you need many years just to accumulate enough trades. An M15 system fires far more often — in 3 years it will have generated more trades than a D1 system generates in 15.

That’s the core insight: the right backtest period is the one that satisfies both requirements at the same time. Time alone doesn’t validate a backtest. Trades alone don’t validate it either. You need both.

The Timeframe-to-Period Framework I Actually Use

Here’s exactly how I think about this for each timeframe:

| Timeframe | Approx. trades/year (portfolio) | My backtest period | Why |
|---|---|---|---|
| M15 | 200–400 | 3–4 years | High trade frequency means statistical significance arrives faster. 3 years still covers post-COVID volatility, the 2022 rate cycle, and a normalisation period. |
| H1 | 80–150 | 5–7 years | Fewer trades per year, so you need more time to hit a meaningful sample. 5–7 years also starts capturing a genuine multi-regime window. |
| H4 | 30–60 | 7–10 years | Low trade frequency means 5 years might only produce 150–300 trades, which is borderline. 10 years gives confidence. |
| D1 | 10–25 | 10–20 years | A D1 system on a single pair might fire 15 times a year. You need a decade just to approach statistical relevance, and two decades is much better. |

The key point: M15 systems don’t need a 10-year backtest to be credible. They produce enough trades in 3–4 years that the statistics are already robust. Extending to 10 years adds more noise than signal, because the further back you go, the less the market structure resembles what you’ll actually be trading.

Conversely, a D1 system with a 3-year backtest is almost worthless. At 15 trades per year, 3 years gives you 45 trades. That’s not a statistical sample — that’s a handful of data points that could all have occurred during a single market phase.

The right backtest length is the minimum period needed to accumulate 300+ trades across your portfolio AND cover at least two distinct market regimes. For high-frequency strategies that’s often 2–3 years. For low-frequency strategies it can be 15–20.
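As a rough sketch of that rule, the snippet below takes whichever is longer: the time needed to reach the trade target, or a floor for regime coverage. The function name and the `min_regime_years` floor of 2 years are my own illustrative assumptions (the rule above is stated in regimes, not years), and the trades-per-year figures are picked from within the table's ranges.

```python
def years_needed(trades_per_year: float, target_trades: int = 300,
                 min_regime_years: float = 2.0) -> float:
    """Shortest period (in years) that meets both the trade-count and regime floors.

    min_regime_years is a hypothetical stand-in for "covers at least two regimes".
    """
    if trades_per_year <= 0:
        raise ValueError("trades_per_year must be positive")
    years_for_trades = target_trades / trades_per_year
    return max(years_for_trades, min_regime_years)


# Illustrative portfolio trade rates, chosen from within the table's ranges.
for label, rate in [("M15", 300), ("H1", 100), ("H4", 45), ("D1", 15)]:
    print(f"{label}: {years_needed(rate):.1f} years")
```

At 15 trades per year the trade-count term dominates and the answer is 20 years. At 300 trades per year the regime floor is what binds.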

The Minimum Trade Count — The Real Constraint

The number I actually optimise for, before thinking about calendar time, is trade count across the portfolio. Here’s the framework:

  • Below 100 trades: No conclusions possible. Statistical noise only.
  • 100–300 trades: Directional signal, but wide confidence intervals. Use with caution.
  • 300–800 trades: Meaningful. You can draw conclusions about edge with reasonable confidence.
  • 800+ trades: Strong signal. Results are statistically stable across reasonable variations in trade sequence.

This is why I always look at portfolio trade count rather than single-pair results. Running across multiple pairs isn’t just about diversification — it’s what makes the statistics meaningful in a realistic testing timeframe. A system that generates 80–150 trades per year across the portfolio can reach statistical relevance in 5–7 years. The same system tested on a single pair might need twice as long to hit the same trade count.
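To make "wide confidence intervals" concrete, here is a minimal sketch that computes a Wilson score interval for the win rate at different trade counts. It assumes independent trades and looks only at win rate, not payoff size, so treat it as an illustration of how uncertainty shrinks with sample size rather than a full test of edge. The 55% observed win rate is a made-up input.

```python
import math


def wilson_interval(wins: int, trades: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% Wilson score interval for the true win rate."""
    if trades <= 0:
        raise ValueError("trades must be positive")
    p = wins / trades
    denom = 1 + z**2 / trades
    centre = (p + z**2 / (2 * trades)) / denom
    half = z * math.sqrt(p * (1 - p) / trades + z**2 / (4 * trades**2)) / denom
    return centre - half, centre + half


# Same observed win rate (about 55%) at four different sample sizes.
for n in (45, 100, 300, 800):
    low, high = wilson_interval(round(0.55 * n), n)
    print(f"{n:>4} trades: true win rate plausibly between {low:.1%} and {high:.1%}")
```

With these inputs the interval at 45 trades reaches well below 50%, while at 800 trades it sits entirely above 50%.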

More Years Isn’t Always Better — The Regime Relevance Problem

Here’s where most “always test 20 years” advice goes wrong. The further back you extend a backtest, the less the historical data resembles the market you’ll actually trade.

Pre-2008 forex was a structurally different market. Spreads were wider. Liquidity was thinner. High-frequency trading hadn’t reshaped intraday dynamics yet. Central bank policy operated under entirely different frameworks. An M15 scalper optimised on 2003–2006 data is being fitted to market microstructure that no longer exists.

For shorter timeframes especially, recent data quality beats historical data quantity. A 3-year M15 backtest from 2022–2025 is more predictive than a 10-year M15 backtest from 2010–2020, because the system you deploy today will trade in a market more similar to 2022–2025 than to 2010.

For higher timeframes — H4 and D1 — this matters less, because daily and multi-day patterns are more structurally stable across decades. Macro cycles, institutional positioning behaviour, and fundamental reaction patterns don’t evolve as fast as intraday microstructure.

99.9% Tick Data: Non-Negotiable Regardless of Timeframe

Whatever period you choose, the data quality has to be right. MetaTrader’s standard “Every tick” mode does not replay real ticks: it generates synthetic ticks from M1 bars by interpolation, which caps MT4 at 90% modelling quality, and “Control points” is cruder still. For any strategy with a stop loss, this produces results that won’t match live trading.

  • MT5: Use “Every Tick Based on Real Ticks.” This is the gold standard — actual broker tick data, not a reconstruction.
  • MT4: Use Tick Data Suite (TDS) with Dukascopy data, 99.9% modelling quality, variable spread.

Always use variable spread. Your live spread on EURUSD can widen from around 0.6 pips to 4–5 pips around London open and news releases. A fixed-spread backtest never encounters this. Every EA looks better on fixed spread than it does in live trading. Don’t test on conditions that don’t exist.
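A toy example of the effect, using entirely made-up trades: each tuple is a gross result in pips plus the spread that was actually quoted at entry. It charges the spread once per trade, which is a simplification.

```python
# Hypothetical trades: (gross result in pips before spread, spread in pips at entry).
trades = [
    (8.0, 0.6),
    (-10.0, 0.7),
    (12.0, 4.5),   # entered during a news spike
    (6.0, 0.6),
    (-10.0, 3.8),  # stop hit while the spread was wide
    (9.0, 0.8),
]

FIXED_SPREAD = 0.6  # pips; the flattering assumption


def net_pips(trades, fixed_spread=None):
    """Total result after spread cost; uses each trade's own spread unless a fixed one is given."""
    return sum(
        gross - (fixed_spread if fixed_spread is not None else spread)
        for gross, spread in trades
    )


fixed = net_pips(trades, FIXED_SPREAD)
variable = net_pips(trades)
print(f"fixed spread:    {fixed:+.1f} pips")
print(f"variable spread: {variable:+.1f} pips")
```

The trades are identical in both runs. Only the cost model changes, and two wide-spread entries are enough to erase most of the net result.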

Walk-Forward: The Test After the Test

Curve-fitting is the silent killer of most EA backtests. Any sufficiently flexible system can be optimised to produce a beautiful equity curve on past data — change parameters, add a filter, adjust a threshold. The result looks like edge. It isn’t.

Walk-forward testing exposes this. In its simplest form, a single in-sample/out-of-sample split, the structure is:

  1. Split your full dataset: use the older portion (70%) to find and optimise your parameters.
  2. Lock those parameters and test them, unchanged, on the newer portion (30%).
  3. If the out-of-sample period holds up with a similar drawdown profile and positive expectancy — you have genuine edge.
  4. If performance collapses on unseen data — the system was fitted to the in-sample period. Start over.

The out-of-sample window should always be the most recent data, because that’s closest to what you’ll actually be trading. For an M15 system tested over 4 years, I’d use the most recent 1–1.5 years as out-of-sample. For a D1 system tested over 15 years, I’d use the most recent 4–5 years.
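A minimal sketch of that split, assuming you only need the two date ranges to plug into the strategy tester. The function name is hypothetical, and the example dates are just one possible 4-year window.

```python
from datetime import date, timedelta


def in_out_of_sample_split(start: date, end: date, oos_fraction: float = 0.30):
    """Return (in_sample, out_of_sample) date ranges; the out-of-sample slice is the most recent one.

    Ranges are half-open: in-sample is [start, split), out-of-sample is [split, end).
    """
    if not 0 < oos_fraction < 1:
        raise ValueError("oos_fraction must be between 0 and 1")
    if end <= start:
        raise ValueError("end must be after start")
    total_days = (end - start).days
    split = end - timedelta(days=round(total_days * oos_fraction))
    return (start, split), (split, end)


# Example: a 4-year M15 test window with the default 30% held back.
in_sample, out_of_sample = in_out_of_sample_split(date(2022, 1, 1), date(2026, 1, 1))
print("optimise on:", in_sample[0], "to", in_sample[1])
print("validate on:", out_of_sample[0], "to", out_of_sample[1])
```

The important part is procedural rather than arithmetic: once parameters are chosen on the first range, nothing gets re-tuned after looking at the second.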

Monte Carlo: The Test Most Developers Skip

Your backtest shows one specific sequence of trades in one specific order. But live trading doesn’t respect sequence. A string of 12 consecutive losses that appeared in month 18 of your backtest could happen in month 1 of your live deployment. Would the account survive? Would you switch it off before it recovered?

Monte Carlo simulation reshuffles or resamples (with replacement) the trade list thousands of times and shows you the full distribution of outcomes: not one equity curve, but thousands. The metrics that matter:

  • 95th percentile maximum drawdown: The drawdown that 95% of simulated runs stay below. This is your real risk number, not the backtest drawdown.
  • Profit factor across runs: Does PF stay above 1.2 in 95% of Monte Carlo iterations? Below this, the edge is too thin to survive real-world variance. (This needs resampling with replacement or random trade skipping; simply reshuffling the order leaves PF unchanged.)
  • Ruin probability: What percentage of runs hit a 50%+ drawdown? This should be near zero.
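The three metrics above can be sketched with nothing but the standard library. This is a simplified bootstrap (trades resampled with replacement, treated as independent, returns compounded as fractions of equity), not a description of how any particular tool does it. The trade list is hypothetical.

```python
import random


def max_drawdown(returns):
    """Largest peak-to-trough equity drop, as a fraction, for a sequence of per-trade returns."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


def profit_factor(returns):
    """Gross profit divided by gross loss (inf if the sample has no losing trades)."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return gains / losses if losses > 0 else float("inf")


def monte_carlo(returns, runs=5000, ruin_level=0.50, seed=42):
    """Bootstrap the trade list and summarise the distribution of outcomes."""
    rng = random.Random(seed)  # seeded so the report is reproducible
    drawdowns, pfs = [], []
    for _ in range(runs):
        sample = rng.choices(returns, k=len(returns))  # resample with replacement
        drawdowns.append(max_drawdown(sample))
        pfs.append(profit_factor(sample))
    drawdowns.sort()
    pfs.sort()
    return {
        "dd_95th": drawdowns[int(0.95 * runs) - 1],  # 95% of runs are at or below this
        "pf_5th": pfs[int(0.05 * runs)],             # about 95% of runs are at or above this
        "ruin_probability": sum(d >= ruin_level for d in drawdowns) / runs,
    }


# Hypothetical trade list: 300 trades, 45% winners at +2%, 55% losers at -1%.
trades = [0.02] * 135 + [-0.01] * 165
print(monte_carlo(trades))
```

Read the output against the thresholds above: `pf_5th` should clear 1.2, `ruin_probability` should be near zero, and `dd_95th` is the drawdown figure to size positions against.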

I built the Edge Matrix at ErgodicLabs specifically because there was no free tool doing this properly for retail EA results. You can run your own reports at ergodiclabs.co.

The Full Checklist

Pulling this together into something you can apply immediately:

| Criterion | M15 | H1 | H4 | D1 |
|---|---|---|---|---|
| Backtest period | 3–4 years | 5–7 years | 7–10 years | 10–20 years |
| Minimum portfolio trades | 500+ | 400+ | 300+ | 200+ |
| Modelling quality | 99.9% tick data, variable spread (all timeframes) | | | |
| Out-of-sample window | 1–1.5 yr | 1.5–2 yr | 2–3 yr | 4–5 yr |
| Profit factor target | 1.5+ minimum, 1.8+ preferred (all timeframes) | | | |
| Max drawdown target | <20% backtest, <30% at MC 95th percentile (all timeframes) | | | |
| Pairs tested (portfolio) | 3+ minimum, 5+ preferred (all timeframes) | | | |
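If you review a lot of reports, the table can be encoded as data and checked mechanically. The sketch below is one way to do that. `BacktestReport` is a hypothetical container of my own, not an MT4/MT5 export format, and the out-of-sample and Monte Carlo rows are left out for brevity.

```python
from dataclasses import dataclass

# Lower bounds transcribed from the checklist table.
CHECKLIST = {
    "M15": {"min_years": 3, "min_trades": 500},
    "H1": {"min_years": 5, "min_trades": 400},
    "H4": {"min_years": 7, "min_trades": 300},
    "D1": {"min_years": 10, "min_trades": 200},
}


@dataclass
class BacktestReport:  # hypothetical structure for illustration
    timeframe: str
    years: float
    trades: int
    modelling_quality: float  # percent
    variable_spread: bool
    profit_factor: float
    max_drawdown: float       # percent, from the backtest itself
    pairs: int


def failed_criteria(report: BacktestReport) -> list[str]:
    """Return the names of checklist rows the report does not satisfy."""
    limits = CHECKLIST[report.timeframe]
    checks = {
        "backtest period": report.years >= limits["min_years"],
        "portfolio trades": report.trades >= limits["min_trades"],
        "modelling quality": report.modelling_quality >= 99.9,
        "variable spread": report.variable_spread,
        "profit factor": report.profit_factor >= 1.5,
        "max drawdown": report.max_drawdown < 20.0,
        "pairs tested": report.pairs >= 3,
    }
    return [name for name, passed in checks.items() if not passed]


# A great-looking D1 report that is far too short and single-pair.
report = BacktestReport("D1", years=3, trades=45, modelling_quality=99.9,
                        variable_spread=True, profit_factor=2.8,
                        max_drawdown=12.0, pairs=1)
print(failed_criteria(report))
# → ['backtest period', 'portfolio trades', 'pairs tested']
```

A profit factor of 2.8 passes its own row here, which is the point: headline numbers can look fine while the sample underneath them is too thin to mean anything.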

What to Look for When Evaluating Someone Else’s EA

You’re looking at buying an EA. The vendor shows you a backtest. Here’s how to read it quickly:

Red flags — walk away:

  • Period too short for the timeframe — M15 with 1 year, H1 with 2 years, D1 with 3 years
  • Modelling quality below 99%, fixed spread used
  • No Monte Carlo results provided
  • Equity curve perfectly smooth with zero visible losing streaks (over-optimised)
  • Single pair only
  • No live Myfxbook account

Good signs — worth evaluating further:

  • Period appropriate to the timeframe, 99.9% tick quality, variable spread
  • Results across 3+ pairs showing consistent behaviour across all of them
  • Visible losing periods in the equity curve — real markets have losing streaks
  • Monte Carlo provided, 95th percentile drawdown disclosed
  • Live Myfxbook account running 6+ months with behaviour consistent with the backtest

That last point is the most important. A backtest is a hypothesis. A live account with consistent behaviour is evidence. If a developer won’t run their own EA with real money, they don’t believe in it, and neither should you.

Summary

There’s no single “correct” backtest length. The right answer scales with your timeframe, because what you’re actually trying to achieve is a sufficient trade sample and meaningful market regime coverage simultaneously. For M15 systems, 3–4 years achieves both. For D1 systems, you might need 15–20 years to achieve either.

Beyond period length: 99.9% tick data and variable spread are non-negotiable at every timeframe. Walk-forward out-of-sample testing is what separates genuine edge from curve-fitting. And Monte Carlo is the only way to understand your real risk — not the clean number printed on the backtest report.

Apply these standards to any EA before you trust it with real capital. If the developer can’t satisfy them, keep looking.

The Nomad Trader
Algorithmic forex trader and EA developer. 6+ years live verified results. 334+ customers in 68 countries. Building trading systems from the Amazon, Ecuador — every insight comes from live money on the line.