Thread Synchronization

I came across a good tutorial post of thread synchronization tonight, Reading this covered the standard problems and measures to prevent the problems of race conditions, simultaneous access, and hard-to-diagnose timing errors. It was good for what it is. My issue is that one of the best and perhaps least used, most universally applicable, simplest to understand, and most rarely taught in college.

How do I know it isn’t taught? By interviewing graduates for positions at my company. It doesn’t matter if they were from a prestigious university or community college, Computer Science majors or Electrical Engineers, male or female, I didn’t find interviewees who understood the problem, or knew the simplest solution.

If you aren’t familiar with the problem, read the link above. It describes enough aspects of the problem to be useful.

To prevent data-induced timing problems without an additional locking mechanism, it is critical that only one thread can write to each variable. If only one thread will ever write, then there can be no conflict over invalid state or lost transitions. Many may read, but of writers there can be only one.

The best way to do this is with a circular buffer.

A circular buffer is a region of memory, usually contiguous, with two associated pointers. One pointer, the “out” pointer, points to the oldest entry containing data. The other, the “in” pointer, indicates the first empty space ready to receive data. If the pointers are equal, there is no data. To add data to the buffer, simply write the data to the location indicated by the “in” pointer, and then increment it. To remove data from the buffer, copy the data from the “out” pointer, then increment it. In both cases. the increment wraps to the from of the buffer it would go past the end.

This works on any processor architecture that provides coherent write — where the whole pointer is written in one cycle. This is true on most 32 bit processors1. This scheme does not require an atomic read-modify-write cycle. The pointers must be accessible to both the writer and the reader. The writer and reader may be threads, interrupt handlers, or even two separate processors — even asymmetric processors sharing a memory system.

What if one processor is an 8-bit processor, and can not atomically write pointers? Change the pointers to 8-bit indices, which can be atomically written.

What if the data is larger than a byte? Increment by the data size.

What if the data is variable size? Write what is needed, and increment to just after the data.

An interesting variation is a case where there is no data at all, other than the “pointers”. An example would be an event sender and an event receiver. Sending an event turns into an increment of the “in” counter. Receiving the event is an increment of the “out” counter. To know how many events are outstanding, subtract the “out” from the “in”. Notice that a counter violates the one-reader/one-writer criteria, but two counters does not.

Circular buffers are worth considering as a first solution. If a circular buffer isn’t powerful enough, then consider locks, mutexes, spinlocks, or other more exotic solutions. Or, re-frame the problem so that a circular buffer is exactly powerful enough.

This could be a fourth rule of programming.


1) It may be worth checking the machine code generated by the compiler. A 32-bit Pentium-Pro compiler (Metaware HI-C), optimized some 32-bit writes to be performed as two 16-bit writes. This gave better concurrent use of the two integer units and memory path, but may have been applicable to a limited set of x86 processor implementation. Bottom line: it is good to assure that writes are indeed atomic.

Sunday, Sunday…

Sunday I should be back in the workshop. I have some new wood to try — Spanish Cedar. It doesn’t have the aroma of the cedar for a closet, but it does have a rich, golden color — like roasted almonds. If even a few things go right, I’ll be working with it tomorrow.

I’ve went over my machine. I checked the servo-motor brushes (no visible wear) and the spindle brushes (maybe 25% wear). The first wave of intense use loosened some parts and changed the alignment. After strengthening the parts that needed it, the leg length dimensions changed by a tenth of an inch or so. I was delighted — it was time to calibrate!

I had already written new calibration code to better estimate the construction measurement errors, and wrote last weekend new code to sample the data the calibration routine needs. The calibration code is exciting because it take advantage of modern computers to do a better job than my old calibration code. The code now runs multi-threaded and multi-core, pushing my Intel I-7 processor to the limits. It simultaneously estimates all seventy-two (72) simple sources of error, and does so based on a much simpler and more automated data capture system.

More importantly, though, I took a hard look at the results from my first calibration (something like three to five years ago), and saw that the results were terrible! Points that I know are (nearly) co-planar were deduced to be over an inch out of the common plane. I would have been far more accurate using my measurements than the results of my previous calibration.

So tomorrow, I may not be running with a completely calibrated machine, but it will be far more accurate than I was using before.

Now, will the accuracy actually make much difference? Probably not, since my process compensated for many distortions — including internal position errors. Tomorrow should reveal the difference.

Back to the Spanish Cedar… it has a great deal of harden pitch in the wood. I may choose to finish with oil diluted by mineral spirits, or perhaps I’ll go with shellac. I’ll check my references. I still want a stain resistant surface — it won’t do to have beet juice from the red horseradish stain the plate.