Though linear factor models are ubiquitous, factor-based investing doesn't jibe with the way fundamentally driven investors actually invest. More often than not, simple mathematical models become popular because they are easy to compute, easy to explain and easy to sell.
Think of the simplest “Ben Graham” value-investing process: given a universe, buy all stocks trading at Price:Book < 1 and ignore everything else. On a graph with Price:Book as the independent variable, the response (buy or ignore) looks like a step function. Now imagine trying to approximate that step function with a straight line, i.e. a linear model with Price:Book as the single factor. The approximation would be far from satisfactory. Now imagine an investor using a stock screener with multiple attributes (which could describe a large part of the investment process for many pro investors, including Warren Buffett). With many step functions in many variables, what are the chances of a linear model approximating the process accurately?
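The step-function argument can be checked with a few lines of code. The sketch below is only an illustration under made-up assumptions: the universe is a hypothetical, evenly spaced grid of Price:Book ratios from 0.2 to 5.0, and the names `screen_signal` and `fit_line` are ours, not part of any library. It fits the best least-squares straight line to the buy/ignore signal and measures how badly the line misses.

```python
# Hypothetical illustration: a "buy if Price:Book < 1" screen
# versus its best single-factor straight-line approximation.

def screen_signal(price_to_book):
    """Step-function response: 1.0 (buy) below Price:Book of 1, else 0.0 (ignore)."""
    return 1.0 if price_to_book < 1.0 else 0.0

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up universe: Price:Book ratios 0.2, 0.3, ..., 5.0 (assumption, not real data).
ratios = [(2 + i) / 10 for i in range(49)]
signals = [screen_signal(r) for r in ratios]

intercept, slope = fit_line(ratios, signals)
fitted = [intercept + slope * r for r in ratios]

# Goodness of fit: share of the signal's variance the line explains.
mean_signal = sum(signals) / len(signals)
ss_res = sum((y - f) ** 2 for y, f in zip(signals, fitted))
ss_tot = sum((y - mean_signal) ** 2 for y in signals)
r_squared = 1.0 - ss_res / ss_tot

# Largest gap between the true buy/ignore signal and the fitted line.
worst_miss = max(abs(y - f) for y, f in zip(signals, fitted))

print(f"slope={slope:.3f}  R^2={r_squared:.3f}  worst miss={worst_miss:.3f}")
```

On this toy grid the fitted line slopes downward, as expected, but it leaves most of the signal's variance unexplained and is off by a large fraction of the full buy/ignore range near the threshold. Adding more screening attributes multiplies the number of such thresholds the line has to smooth over.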
In many countries, bank loans carry covenants, i.e. the company is required to keep its leverage (Debt:Equity or Net Debt:EBITDA) below a certain level or be in breach of the loan terms. Suppose one such covenant for a company is Net Debt:EBITDA < 3. Stock traders effectively ignore the covenant when the ratio is below 1.5, so the response is zero. But the effect on the stock price starts to become non-trivial when the ratio crosses 2.0, rises more sharply past 2.5, and goes exponential as the ratio nears the limit, taking the stock to distressed levels around 2.9 to 3.
This is one of many non-linear effects that cannot be modeled with a linear factor model.
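To make the covenant example concrete, here is a minimal sketch of what such a response might look like. The functional form is purely an assumption chosen to match the description (zero below 1.5, then exponential growth toward the limit of 3); `covenant_penalty`, `scale` and `steepness` are hypothetical names and values, not an estimated model. The point is that the sensitivity of the response to leverage changes with the level of leverage, whereas a linear factor model assigns one constant coefficient everywhere.

```python
import math

def covenant_penalty(net_debt_to_ebitda, ignore_below=1.5, scale=0.01, steepness=3.0):
    """Hypothetical price penalty as a function of Net Debt:EBITDA.

    Zero while traders ignore the covenant, then growing exponentially.
    The shape and all parameter values are illustrative assumptions.
    """
    if net_debt_to_ebitda <= ignore_below:
        return 0.0
    return scale * (math.exp(steepness * (net_debt_to_ebitda - ignore_below)) - 1.0)

def local_sensitivity(ratio, step=0.01):
    """Central finite difference: change in penalty per unit change in the ratio."""
    return (covenant_penalty(ratio + step) - covenant_penalty(ratio - step)) / (2 * step)

# A linear factor model would need the same number in every row below.
for ratio in (1.0, 2.0, 2.5, 2.9):
    print(f"ratio={ratio:.1f}  penalty={covenant_penalty(ratio):.3f}  "
          f"sensitivity={local_sensitivity(ratio):.3f}")
```

Under these assumptions the sensitivity is exactly zero at a ratio of 1.0 and increases at each subsequent level, so no single linear coefficient can describe the response across the whole range.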
The forward-thinking AI/ML community has correctly identified that the fundamental nonlinear forms embedded in neural networks can capture nonlinear effects in the data far better than linear factor models. But they come with their own mathematical pitfalls that are usually in the blind spot of Ph.D.s in Finance.
Neural networks are fundamentally non-convex: training requires optimizing an objective function that may have many local minima, typically with a gradient-descent algorithm that uses only first-order information and never checks curvature. The training process can easily get trapped in a sub-optimal solution, and you wouldn't even know about it. Worse, the problem can worsen dramatically as you add more variables, and hence more parameters, to the optimization. But practitioners rarely stop to check.
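The trap is easy to reproduce on a toy problem. The sketch below is not a neural network; it runs plain gradient descent on a one-dimensional non-convex function of our own choosing, f(x) = x^4 - 3x^2 + x, which has one global and one local minimum. The function, learning rate, step count and starting points are all illustrative assumptions.

```python
def objective(x):
    """Toy non-convex objective with two minima (illustrative choice)."""
    return x ** 4 - 3 * x ** 2 + x

def gradient(x):
    """Analytic first derivative of the objective."""
    return 4 * x ** 3 - 6 * x + 1

def gradient_descent(start, learning_rate=0.01, steps=2000):
    """Plain first-order gradient descent; no curvature information is used."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

# Same algorithm, same settings, two different starting points.
x_from_left = gradient_descent(start=-2.0)
x_from_right = gradient_descent(start=2.0)

for label, x in (("left start", x_from_left), ("right start", x_from_right)):
    print(f"{label}: x={x:.4f}  f(x)={objective(x):.4f}  gradient={gradient(x):.2e}")
```

Both runs stop at a point where the gradient is essentially zero, so each looks "converged" from the inside. Only by comparing the two objective values does it become clear that the run started from the right settled in the worse minimum. In one dimension that check is trivial; with many parameters it is not.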
This is not a job for a group of interns and coders who earned their stripes in social media or e-commerce. It requires a rebellious combination of mathematics, computer science, and hands-on market experience to even ask the right questions. We are that fiery concoction.
Did we mention that financial markets, unlike languages and self-driving patterns, are not stationary? All your takeaways from those well-researched areas return to square one when it comes to predicting financial markets.