New collision code

Development directions, tasks, and features being actively implemented or pursued by the development team.
Post Reply
hellcatv
Developer
Developer
Posts: 3980
Joined: Fri Jan 03, 2003 4:53 am
Location: Stanford, CA
Contact:

New collision code

Post by hellcatv »

I've committed the new collision system.

How it works is as follows...
a red-black tree of units sorted on radius ...
then each unit checks 2*radius away from its center looking for collisions, that way big units will always find smaller units.

This can also be used to find the closest unit from any given unit (or even closest bolt , or all bolts/units within a certain radius)

Anyhow hopefully people find the new system useful...
and after I remove some of the old dead code, faster.

Feel free to bugcheck it and discuss it here!
I'm excited that it will allow more units than ever before
Vega Strike Lead Developer
http://vegastrike.sourceforge.net/
chuck_starchaser
Elite
Elite
Posts: 8014
Joined: Fri Sep 05, 2003 4:03 am
Location: Montreal
Contact:

Post by chuck_starchaser »

Sounds simple and clever. I wonder, though, how you keep the sorting up to date. If you're doing a full recursive traversal of each red/black tree at each physics frame, I think I have an idea you'll like: Chances are, once all the units in a given red-black tree have been sorted once, there wouldn't be more than a few single-spot changes to do to the ordering per physics frame. If this assumption is correct, you'd probably get better performance by doing a single pass of bubble-sort per physics frame. Although bubble sort is maligned as the slowest sorting algorithm, it is actually by far the fastest when the data-set is highly ordered already. If the single pass bubble-sort fails, then a faster algorithm can be used, just in case further passes of bubble sort might fail again.

Of course, for this to be an advantage you'd need pointers between tree leafs, like a linked list. But as long as bubble sort keeps on succeeding, you can neglect the black red tree, and just update the inter-leaf list pointers, and use them for traversal, instead of the tree. Once in a while, when bubble sort fails after 1 pass, you can update the whole tree.

There's a further advantage to using bubble-sort: Cache coherence. Tree traversals often result in cache misses. Actually, so does traversing a linked list, but there's a simple way to precache elements of a linked list:
A single pass of...

Code: Select all

//leaf pre-cache-ing loop:
static int dummy_var;
leaf *pcurrent = this;
while(pcurrent->pnext)
{
  pcurrent = pcurrent->pnext;
}
dummy_var += (int)(*pcurrent);
//bubble-sort code follows here...
The reason for reading the last element and writing the value to a static dummy variable is to prevent the compiler from optimizing the code away.




The bubble-sort code that follows won't have to wait for the completion of the loop. This may not be very intuitive, but as long as the bubble sort code doesn't depend on the value of dummy_var, it will execute at the same time. I know the older athlon XP has 2 load units and 3 store units, so the little loop will keep one load unit busy, but the bubble sort can use the other, in the meantime.
So, the loop will execute in parallel with the bubble sort, speeding ahead, since the real code has no data dependencies with it; and it will cause all the soon to be needed cache lines to be raised in its wake. Each cache line is 64 bytes, so it probably contains all the data needed. Actually, if each leaf only has a pointer to a unit, you'd probably want to change the code to...

Code: Select all

//leaf and unit precacheing loop:
int dummy_acc;
static int dummy_var;
leaf *pcurrent = this;
while(pcurrent->pnext)
{
  pcurrent = pcurrent->pnext;
  dummy_acc += *((int*)(pcurrent->punit));
}
dummy_var += dummy_acc;
//bubble-sort code follows here...
As long as the bubble-sort code doesn't use the integer ALU, there should be no interference, and the above loop should execute in parallel.

Actually, the bubble sort code should include the distance updates. (Distance squared, I suppose it is --i.e.: x*x + y*y + z*z, as the square root makes no difference to sorting, right?) Plus a few other things as may be found convenient. Reason is, the precacheing loop will often be stalled by cache misses, and it wouldn't pay for the code that follows to be stepping on its heels all the way; so the real code following should do some real work, to rip the most advantage. So, at minimum, the sorting code that follows would go sort of...

Code: Select all

//bubble sort code that follows
//we're in a leaf of the tree, traversing to bubble-sort them
//x,y,z are the position of the unit owning the tree, passed in
//as arguments to the sorting routine we're in.
//distsqr, pnext and punit are leaf class members
leaf *pprevious;
bool success = true; // 8-)
distsqr =
 ((punit->x-x)*(punit->x-x)) +
 ((punit->y-y)*(punit->y-y)) +
 ((punit->z-z)*(punit->z-z));
if(pnext)
{
  pnext->distsqr =
   ((pnext->punit->x-x)*(pnext->punit->x-x)) +
   ((pnext->punit->y-y)*(pnext->punit->y-y)) +
   ((pnext->punit->z-z)*(pnext->punit->z-z));
  if(pnext->distsqr < distsqr)
  {
    //swap first 2 elements
    pfirst=pnext;
    pnext=pnext->pnext;
    pfirst->pnext=this;
    //and step back
    this = pfirst;
  }
  while(1)
  {
    pprevious=this;
    this=pnext;
    if(!pnext) break;
    pnext->distsqr =
     ((pnext->punit->x-x)*(pnext->punit->x-x)) +
     ((pnext->punit->y-y)*(pnext->punit->y-y)) +
     ((pnext->punit->z-z)*(pnext->punit->z-z));
    if(pnext->distsqr < distsqr)
    {
      //swap
      pprevious->pnext=pnext;
      pnext=pnext->pnext;
      pprvious->pnext->pnext=this;
      //and step back
      this=pprevious->pnext;
      //if the swap was not enough, we failed... meaning
      //that another sorting pass is needed, make it red/black
      if(distsqr < pprev->distsqr) success=false; // :-(
    }
  }
}
return success;
Precacheing is a good thing to do to speedup any traversals other than vectors. (Vector traversals trigger the automatic prefetching mechanism present in any modern cpu.) And no small advantage: we're talking x10 speedups, easily. Cache misses are *very* expensive.

Another idea that might speed things up further would be to do partial orderings: If there are a bunch of ships dogfighting each other at the opposite end of the system, it doesn't really pay to keep up to date on them right to the instant. So you could update/sort the full tree once every 32 physics frames, say; only the nearer half of the tree every 16 frames; only the nearest quarter of the tree every 8 frames; ....etceteras...; and only the closest 32 ndth of the tree at every physics frame. (This last one could pay attention to bolts, while the others ignore them.)

Sort of like, assuming 8 instead of 32, for simplicity:

Code: Select all

at_each_frame()
{
  static int frame_no = 0;
  ++frame_no;
  frame_no &= 0x00000007;
  switch(frame_no)
  {
  case 0:
    sort_whole_tree();  break;
  case 1:
    sort_nearest_8th_of_tree();  break;
  case 2:
    sort_nearest_4th_of_tree();  break;
  case 3:
    sort_nearest_8th_of_tree();  break;
  case 4:
    sort_nearer_half_of_tree();  break;
  case 5:
    sort_nearest_8th_of_tree();  break;
  case 6:
    sort_nearest_4th_of_tree();  break;
  default:
    sort_nearest_8th_of_tree();  break;
  }
}
EDIT:
The bubble sort pass should actually be 2-way: sweep from beginning to end, and then back to beginning. Reason being, bubble sort can move an element many spots in the direction of sweep, but only 1 spot in the opposite direction. Assuming some moving objects might be incoming and some outgoing, a one-directional pass of bubble-sort wouldn't stand much of a chance of succeeding, but a 2-way bubble sort is almost assured of success.
The second, reverse sweep would only be done if the first does not succeed, of course, and would be much faster since,
a) The first (forward) pass already updated distances, and
b) Most likely everything is now in cache, so no precacheing loop is needed on the way back.
hellcatv
Developer
Developer
Posts: 3980
Joined: Fri Jan 03, 2003 4:53 am
Location: Stanford, CA
Contact:

Post by hellcatv »

I have a red/black tree version (if order changes remove item and reinsert) and a bubble sort version...
haven't done performance analysis
Vega Strike Lead Developer
http://vegastrike.sourceforge.net/
Post Reply