From Ezra: http://ezrakilty.net/research/2006/02/3body_problem_in_erlang.html
The first conceptual problem I hit was the question of synchronization. In order for the sim to be a decent finite approximation to the continuous world of physics, we need to break it into discrete time steps, each of which depends on the previous time step (at least, that's the only way I know of to get a fair approximation). This means that each particle can't just crunch away at its own speed, working as fast as it can to calculate its position at various times. Easily the particles could grow out of sync with one another, making an inaccurate physical model.
I hadn't thought about this, but I think Ezra is right. In a simulation of the three-body problem, if each future state depends on the current state, and the current calculations for each body depend on the other bodies, you need to make sure that the calculations stay 'in step'.
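To make 'in step' concrete, here is a minimal Python sketch of lockstep stepping. It is not Ezra's Erlang code. Everything in it is my own assumption for illustration: the function name `simulate_lockstep`, two dimensions, a gravitational constant of 1 in arbitrary units, a simple semi-implicit Euler update, and `threading.Barrier` standing in for whatever synchronization mechanism a real implementation would use.

```python
import threading


def simulate_lockstep(masses, positions, velocities, steps, dt=0.01, g=1.0):
    """Run one thread per body; a barrier keeps all threads on the same step.

    Hypothetical helper for illustration: 2D, semi-implicit Euler, g = 1.
    """
    n = len(masses)
    pos = [list(p) for p in positions]
    vel = [list(v) for v in velocities]
    new_pos = [None] * n
    barrier = threading.Barrier(n)

    def body(i):
        for _ in range(steps):
            # Read phase: every thread reads positions from the same time step.
            ax = ay = 0.0
            for j in range(n):
                if j == i:
                    continue
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                r3 = (dx * dx + dy * dy) ** 1.5
                ax += g * masses[j] * dx / r3
                ay += g * masses[j] * dy / r3
            vel[i][0] += ax * dt
            vel[i][1] += ay * dt
            new_pos[i] = [pos[i][0] + vel[i][0] * dt,
                          pos[i][1] + vel[i][1] * dt]
            # Wait until every thread has finished reading the old positions.
            barrier.wait()
            # Write phase: publish this body's position for the next step.
            pos[i] = new_pos[i]
            # Wait until every thread has published before anyone reads again.
            barrier.wait()

    threads = [threading.Thread(target=body, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pos, vel


if __name__ == "__main__":
    final_pos, final_vel = simulate_lockstep(
        [1.0, 1.0, 1.0],
        [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
        [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
        steps=50,
    )
    print(final_pos)
```

The two barrier waits are the whole point: no thread can start step k + 1 until every thread has finished step k, so the result does not depend on how the threads happen to be scheduled. In CPython the threads won't actually do the arithmetic in parallel, so this shows the synchronization pattern rather than any speedup.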
This calls into question my earlier thought that asynchronous simulations would work, on the theory that each process can simply handle messages whenever they happen to arrive. In a decentralized simulation of termites gathering wood chips, I imagine an asynchronous simulation would suffice. The exact paths the termites take don't really matter; what matters is the end result of that chaos. But in a gravity simulation, asynchronous simulation doesn't seem to work, because the actual paths are what you're interested in.
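One rough way to see why the paths suffer is to fake asynchrony in a single thread: let each body compute its acceleration from positions of the other bodies that are a few steps out of date, and compare the result with the lockstep run. The sketch below does that. The names `simulate` and `max_gap`, the `lag` parameter, the unit gravitational constant, and the Euler update are all assumptions of mine for illustration, not anything from Ezra's post.

```python
from collections import deque


def simulate(masses, positions, velocities, steps, dt=0.01, lag=0, g=1.0):
    """Step n bodies in 2D; each body sees the others `lag` steps late.

    lag=0 is the lockstep case. Hypothetical helper for illustration.
    """
    n = len(masses)
    pos = [tuple(p) for p in positions]
    vel = [list(v) for v in velocities]
    # history[0] is the oldest snapshot kept, at most `lag` steps old.
    history = deque([pos], maxlen=lag + 1)
    for _ in range(steps):
        stale = history[0]
        new_pos = []
        for i in range(n):
            ax = ay = 0.0
            for j in range(n):
                if j == i:
                    continue
                # Own position is current; the other body's may be stale.
                dx = stale[j][0] - pos[i][0]
                dy = stale[j][1] - pos[i][1]
                r3 = (dx * dx + dy * dy) ** 1.5
                ax += g * masses[j] * dx / r3
                ay += g * masses[j] * dy / r3
            vel[i][0] += ax * dt
            vel[i][1] += ay * dt
            new_pos.append((pos[i][0] + vel[i][0] * dt,
                            pos[i][1] + vel[i][1] * dt))
        pos = new_pos
        history.append(pos)
    return pos


def max_gap(a, b):
    """Largest distance between corresponding bodies in two results."""
    return max(((xa - xb) ** 2 + (ya - yb) ** 2) ** 0.5
               for (xa, ya), (xb, yb) in zip(a, b))


if __name__ == "__main__":
    masses = [1.0, 1.0, 1.0]
    start = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    rest = [(0.0, 0.0)] * 3
    in_step = simulate(masses, start, rest, steps=50)
    out_of_step = simulate(masses, start, rest, steps=50, lag=5)
    print(max_gap(in_step, out_of_step))
```

A fixed lag is a much tamer form of disorder than real message timing, where the delay would vary from step to step and from body to body, so if anything this understates the problem.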
If the calculations of all the threads must be synchronous, or in lockstep, that would seem to put an upper bound on how fast the simulation can go, even in a multi-threaded environment, since every step has to wait for the slowest thread. And since calculations made with slightly incorrect values get more wrong the further into the future you carry them, what kind of useful computation can you do if you don't have all the initial conditions for your formula?
The only thing I can think of is to run several sets of three threads (one thread per mass), each set processing the simulation at a different simulation time, so that the processing load on the trailing set is reduced. Say you had a leading set of threads that always operated at simulation time t + n. That leading set can narrow the scope of possible answers. Since it knows it's operating on a chaotic system, it knows what the error is for a given lead time n. Therefore, it should be able to put an upper and lower bound on the possible right answers. Then the trailing set of threads, operating at simulation time t, only has to correct the error, which hopefully is less computationally intensive.