A note that LLMs follow a “Kepler-esque” approach: they can successfully predict the next position in a planet’s orbit, but fail to find the underlying explanation of Newton’s Law of Gravity (see ). Instead, they resort to incorrect fitting rules that allow them to predict the planet’s next orbital position but not to find the force vector or generalize to other physics. Explained in .
A neat idea, but the authors oversimplified the concept of planetary motion. An excerpt from the paper:
For centuries, astronomers and physicists have worked on predicting the orbits of planets around the sun. A groundbreaking model was offered by the astronomer Johannes Kepler in the 17th century. His model was based on geometric patterns: for example, that the orbit of each planet followed an ellipse with the sun at one of its foci. While the model could predict orbits with a near-perfect level of precision, it couldn’t explain why the planets obeyed these geometric orbits or be applied to new problems beyond predicting trajectories.
Later, Isaac Newton expanded on this model using new laws of motion, now known as Newtonian mechanics. These laws involved computing properties of the set of planets in motion, such as their relative velocities and masses. Using these properties, he could derive Kepler’s earlier laws for orbital trajectories, but also go beyond, understanding and formalizing other concepts like force and gravity. From Kepler to Newton, scientists were able to move beyond good predictive models of sequences to a deeper understanding of them. In this section, we test whether a transformer that can predict sequences of orbital trajectories is merely a good sequence model, or whether it has also made the transition to providing a world model.
Maybe the authors didn't have enough space to devote to history, but that very first sentence should have given everyone a clue: if the concept were so simple, humanity wouldn't have spent centuries working on it! Those centuries should be more like millennia, because observation started way back in ancient times. Here is a video about that history:
The authors trained and used a 109M-parameter model that did nothing except predict the next planetary positions from a sequence of observations. Their argument would have been a little more convincing had the model output the velocities of the planets as well. With only positions and no mass data given, the natural way to predict is with N-th order polynomials, and that's exactly what the model did.
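To make the "positions only" point concrete, here is a minimal sketch of next-position prediction by polynomial extrapolation. It is not the paper's model or code. The toy circular orbit, the step size, and the names `extrapolate_next` and `circular_orbit` are all assumptions made up for illustration. The predictor never sees mass, velocity, or force.

```python
import math
from math import comb


def extrapolate_next(points, order):
    """Predict the next equally spaced 2D sample.

    Uses the unique degree-`order` polynomial through the last order+1
    points, evaluated one step ahead. For equal spacing this reduces to
    setting the (order+1)-th finite difference to zero.
    """
    n = order + 1
    if len(points) < n:
        raise ValueError("need at least order + 1 points")
    next_x, next_y = 0.0, 0.0
    for j in range(1, n + 1):
        weight = (-1) ** (j + 1) * comb(n, j)
        px, py = points[-j]  # j-th most recent observation
        next_x += weight * px
        next_y += weight * py
    return next_x, next_y


def circular_orbit(num_steps, dt=0.01, radius=1.0, omega=1.0):
    """Hypothetical toy data: positions on a circle, sampled every dt."""
    return [
        (radius * math.cos(omega * k * dt), radius * math.sin(omega * k * dt))
        for k in range(num_steps)
    ]


observed = circular_orbit(50)    # what the predictor is allowed to see
truth = circular_orbit(51)[-1]   # the position one step later

error_order1 = math.dist(extrapolate_next(observed, order=1), truth)
error_order4 = math.dist(extrapolate_next(observed, order=4), truth)
print(f"linear extrapolation error:  {error_order1:.3e}")
print(f"quartic extrapolation error: {error_order4:.3e}")
```

On this toy orbit the higher-order fit lands much closer to the true next position than the linear one, yet nothing in it encodes a force law, which is the behaviour described above.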
Even with knowledge of Newton's Law of Gravity, N-th order polynomials are still used to describe the motion of planets in real-world applications, such as NASA's DE series. Newton's Law only has a closed-form solution for two-body systems, which the solar system is not. So maybe it wasn't the model that is stupid here.
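As a rough illustration of the polynomial-ephemeris idea, the sketch below fits Chebyshev polynomials to a short arc of a toy orbit and then evaluates the position from the stored coefficients alone. This is not the actual DE file format or NASA's code. The arc, the degree, and the names `chebyshev_coefficients`, `chebyshev_eval`, and `position` are assumptions for illustration only.

```python
import math


def chebyshev_coefficients(func, degree):
    """Coefficients c_k so that func(s) ~ sum c_k * T_k(s) on [-1, 1].

    Interpolates func at the degree+1 Chebyshev nodes.
    """
    n = degree + 1
    nodes = [math.cos(math.pi * (j + 0.5) / n) for j in range(n)]
    values = [func(s) for s in nodes]
    coeffs = []
    for k in range(n):
        total = sum(
            values[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)
        )
        coeffs.append(2.0 * total / n)
    coeffs[0] /= 2.0  # the constant term carries half the weight
    return coeffs


def chebyshev_eval(coeffs, s):
    """Evaluate sum c_k * T_k(s) with the Clenshaw recurrence."""
    b1, b2 = 0.0, 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = 2.0 * s * b1 - b2 + c, b1
    return s * b1 - b2 + coeffs[0]


def position(t):
    """Hypothetical toy orbit: a unit circle, t in radians."""
    return math.cos(t), math.sin(t)


T_START, T_END = 0.0, 1.0  # the arc covered by this coefficient block


def to_time(s):
    """Map s in [-1, 1] to a time in [T_START, T_END]."""
    return T_START + 0.5 * (s + 1.0) * (T_END - T_START)


coeffs_x = chebyshev_coefficients(lambda s: position(to_time(s))[0], degree=8)
coeffs_y = chebyshev_coefficients(lambda s: position(to_time(s))[1], degree=8)

# Compare the stored-polynomial position with the true one across the arc.
max_err = 0.0
for i in range(201):
    s = -1.0 + 2.0 * i / 200
    approx = (chebyshev_eval(coeffs_x, s), chebyshev_eval(coeffs_y, s))
    max_err = max(max_err, math.dist(approx, position(to_time(s))))
print(f"max position error over the arc: {max_err:.3e}")
```

Nine coefficients per coordinate are enough to reproduce this arc to high accuracy, and looking up a position needs no physics at all, only the coefficients.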