Shlomi Fish (shlomif) wrote in shlomif_tech,
Shlomi Fish
shlomif
shlomif_tech

  • Location:
  • Music:

C Pointer Comparison Gotcha

Some days ago I integrated a new BSD-licensed balanced binary tree, based on the red-black tree of kazlib into Freecell Solver (fc-solve) and ported two of its collections to it. I've written a comparator function for it and asked my IRC pal Amadiro to test the new fc-solve with some memory optimisations on one of his University's HPC machines which he had access to. However, it crashed pretty quickly.

I was able to reproduce a similar crash on my x86-64 laptop, and tried to investigate it. Then it hit me, because the comparator callback was this:

extern int fc_solve_compare_lru_cache_keys(
    const void * void_a, const void * void_b, void * context
)
{
#define GET_PARAM(p) ((unsigned long) \
    ((const fcs_cache_key_info_t *)(p))->val_ptr)
    return GET_PARAM(void_a) - GET_PARAM(void_b);
}

Do you see the problem? Subtracting two unsigned longs will always yield a non-negative value, which will yield the comparison void. I considered several alternatives to this including using longs instead of unsigned longs (which will still be a problems on platforms where long is less than the pointer size), or subtracting const char * (which may overflow the int return) and eventually opted for the safest version of:

typedef const char * fcs_lru_side_t;

extern int fc_solve_compare_lru_cache_keys(
    const void * void_a, const void * void_b, void * context
)
{
#define GET_PARAM(p) ((fcs_lru_side_t) \
    (((const fcs_cache_key_info_t *)(p))->val_ptr))
    fcs_lru_side_t a, b;

    a = GET_PARAM(void_a);
    b = GET_PARAM(void_b);

    return ((a > b) ? 1 : (a < b) ? (-1) : 0);
#undef GET_PARAM
}

This way I'm sure that the comparison works correctly, and indeed after that Freecell Solver ran fine on the HPC machine, and would have appeared to scale to up to 600 million positions within 64 GigaBytes instead of the previous 400 million positions.

If you're a sucker for pretty graphs, you may wish to inspect the OpenDocument spreadsheet (.ods) of the memory consumption of the solver, though the data is quite estimative.

Update: corrected the second comparison in the corrected example after a copy-paste-and-HTMLisation-error (thanks to Lev).

Tags: c, programming, tech, tech tip, tip
Subscribe
  • Post a new comment

    Error

    Anonymous comments are disabled in this journal

    default userpic

    Your reply will be screened

  • 6 comments