comp.programming.threads FAQ

This is an archive of the comp.programming.threads FAQ, which used to be hosted by Bil Lewis at a now-defunct site. I believe this is up to date as of approximately 2001.


  This is a list of the questions which have come up on the newsgroup with
  any answers that were given. (Somewhat edited by yours truly.)  In a few
  cases I have left in the names of the participants.  If you'd like me to
  remove your name, let me know.  If you have other comments/corrections, just
  drop me a line (Bil).  (Of course I'll expect *you* to supply
  any corrections! :-)

  This list is a bit of a hodge-podge, containing everything that I thought
  *might* be useful. Hence it is HUGE and not very well edited. It even has
  duplicates (or worse, near-duplicates). The MFAQ is much smaller and better
  maintained. You may wish to check there first.


          F R E Q U E N T L Y    A S K E D    Q U E S T I O N S 
                Also see:

     Brian's FAQ:
     (Sun's Threads page and FAQ is no more.)

  Many of the most general questions can be answered by reading (a) the
  welcome message, (b) the general information on the other threads pages,
  and (c) any of the books on threads.  References to all of these can be
  found in the welcome message.

Q1:   How fast can context switching be?
Q2:   What about special purpose processors?
Q3:   What kinds of issues am I faced with in async cancellation?
Q4:   When should I use these new thread-safe "_r" functions?
Q5:   What benchmarks are there on POSIX threads?
Q6:   Has anyone used the Sparc atomic swap instruction?
Q7:   Are there MT-safe interfaces to DBMS libraries?
Q8:   Why do we need re-entrant system calls?
Q9:   Any "code-coverage" tools for MT applications?
Q10:  How can POSIX join on any thread?
Q11:  What is the UI equivalent for PTHREAD_MUTEX_INITIALIZER?
Q12:  How many threads are too many in one heavyweight process? 
Q13:  Is there an atomic mutex_unlock_and_wait_for_event()?
Q14:  Is there an archive of this newsgroup somewhere?
Q15:  Can I copy pthread_mutex_t structures, etc.?
Q16:  After 1800 calls to thr_create() the system freezes. ??
Q17:  Compiling libraries which might be used in threaded or unthreaded apps?
Q18:  What's the difference of signal handling for process and thread? 
Q19:  What about creating large numbers of threads?
Q20:  What about using sigwaitinfo()?
Q21:  How can I have an MT process communicate with many UP processes?
Q22:  Writing Multithreaded code with Sybase CTlib ver 10.x?
Q23:  Can we avoid preemption during spin locks?
Q24:  What about using spin locks instead of adaptive spin locks?
Q25:  Will thr_create(...,THR_NEW_LWP) fail if the new LWP cannot be added?
Q26:  Is the LWP released upon bound thread termination?
Q27:  What's the difference between pthread FIFO and the Solaris threads scheduling?
Q28:  I really think I need time-sliced RR.
Q29:  How important is it to call mutex_destroy() and cond_destroy()?
Q30:  EAGAIN/ENOMEM etc. apparently aren't in ?!
Q31:  What can I do about TSD being so slow?
Q32:  What happened to the pragma 'unshared' in Sun C?
Q33:  Can I profile an MT-program with the debugger?
Q34:  Sometimes the specified sleep time is SMALLER than what I want.
Q35:  Any debugger that can single-step a thread while the others are running?
Q36:  Any DOS threads libraries?
Q37:  Any Pthreads for Linux?
Q38:  Any really basic C code example(s) to get us newbies started?
Q39:  Please put some Ada references in the FAQ.
Q40:  Which signals are synchronous, and which are asynchronous?
Q41:  If we compile -D_REENTRANT, but without -lthread, will we have problems?
Q42:  Can Borland C++ for OS/2 give up a TimeSlice?
Q43:  Are there any VALID uses of suspension?
Q44:  What's the status of pthreads on SGI machines?
Q45:  Does the Gnu debugger support threads?
Q46:  What is gang scheduling?
Q47:  LinuxThreads linked with X11, calls to X11 seg fault.
Q48:  Are there Pthreads on NT?
Q49:  What about garbage collection?
Q50:  Does anyone have any information on thread programming for VMS?
Q51:  Any information on the DCE threads library?
Q52:  Can I implement pthread_cleanup_push without a macro?
Q53:  What switches should be passed to particular compilers?
Q54:  How do I find Sun's bug database?
Q55:  How do the various vendors' threads libraries compare?
Q56:  Why don't I need to declare shared variables VOLATILE?
Q57:  Do pthread_cleanup_push/pop HAVE to be macros (thus lexically scoped)?
Q58:  Analyzer Fatal Error[0]:  Slave communication failure ??
Q59:  What is the status of Linux threads?
Q60:  The Sunsoft debugger won't recognize my PThreads program!
Q61:  How are blocking syscall handled in a two-level system?
Q62:  Can one thread read from a socket while another thread writes to it?
Q63:  What's a good way of writing threaded C++ classes?
Q64:  Can thread stacks be built in privately mapped memory?
Q66:  I think I need a FIFO mutex for my program...
Q67:  Why my multi-threaded X11 app with LinuxThreads crashes?
Q68:  How would we put a C++ object into a thread?
Q69:  How different are DEC threads and Pthreads?
Q70:  How can I manipulate POSIX thread IDs?
Q71:  I'd like a "write" that allowed a timeout value...
Q72:  I couldn't get threads to work with glibc-2.0.
Q73:  Can I do dead-owner-process recovery with POSIX mutexes?
Q74:  Will IRIX distribute threads immediately to CPUs?
Q75:  IRIX pthreads won't use both CPUs?
Q76:  Are there thread mutexes, LWP mutexes *and* kernel mutexes?
Q77:  Does anyone know of a MT-safe alternative to setjmp and longjmp?
Q78:  How do I get more information inside a signal handler?
Q79:  Is there a test suite for Pthreads? 
Q80:  Flushing the Store Buffer vs. Compare and Swap
Q81:  How many threads CAN a POSIX process have? 
Q82:  Can Pthreads wait for combinations of conditions?
Q83:  Shouldn't pthread_mutex_trylock() work even if it's NOT PTHREAD_PROCESS_SHARED?
Q84:  What about having a NULL thread ID?
Q85:  Explain Traps under Solaris
Q86:  Is there anything similar to posix conditions variables in Win32 API ?
Q87:  What if a cond_timedwait() times out AND the condition is TRUE?
Q88:  How can I recover from a dying thread?
Q89:  How to implement POSIX Condition variables in Win32?
Q90:  Linux pthreads and X11
Q91:  One thread runs too much, then the next thread runs too much!
Q92:  How do priority levels work?
Q93:  C++ member function as the startup routine for pthread_create(). 
Q94:  Spurious wakeups, absolute time, and pthread_cond_timedwait()
Q95:  Conformance with POSIX 1003.1c vs. POSIX 1003.4a?
Q96:  Cleaning up when kill signal is sent to the thread.?
Q97:  C++ new/delete replacement that is thread safe and fast?
Q98:  beginthread() vs. endthread() vs. CreateThread? (Win32)
Q99:  Using pthread_yield()?
Q100: Why does pthread_cond_wait() reacquire the mutex prior to being cancelled?
Q101: HP-UX 10.30 and threads?
Q102: Signals and threads are not suited to work together?
Q103: Patches in IRIX 6.2 for pthreads support?
Q104: Windows NT Fibers?
Q105: LWP migrating from one CPU to another in Solaris 2.5.1?
Q106: What conditions would cause that thread to disappear?
Q107: What parts, if any, of the STL are thread-safe?
Q108: Do pthreads libraries support cooperative threads?
Q109: Can I avoid mutexes by using globals?
Q110: Aborting an MT Sybase SQL?
Q111: Other MT tools?
Q112: That's not a book. That's a pamphlet!
Q114: How to cleanup TSD in Win32?
Q115: Onyx1 architecture has one problem
Q116: LinuxThreads linked with X11 seg faults.
Q117: Comments about Linux and Threads and X11
Q118: Memory barriers for synchronization
Q119: Recursive mutex debate
Q120: Calling fork() from a thread
Q121: Behavior of [pthread_yield()] sched_yield()
Q122: Behavior of pthread_setspecific()
Q123: Linking under OSF1 3.2: flags and library order
Q124: What is the TID during initialization? 
Q125: TSD destructors run at exit time... and if it crashes?
Q126: Cancellation and condition variables
Q127: RedHat 4.2 and LinuxThreads?
Q128: How do I measure thread timings? 
Q129: Contrasting Win32 and POSIX thread designs
Q130: What does POSIX say about putting stubs in libc?
Q131: MT GC Issues
Q132: Some details on using CMA threads on Digital UNIX 
Q133: When do you need to know which CPU a thread is on?
Q134: Is any difference between default and static mutex initialization? 
Q135: Is there a timer for Multithreaded Programs? 
Q136: Roll-your-own Semaphores 
Q137: Solaris sockets don't like POSIX_C_SOURCE!
Q138: The Thread ID changes for my thread! 
Q139: Does X11 support multithreading ? 
Q140: Solaris 2 bizarre behavior with usleep() and poll()
Q141: Why is POSIX.1c different w.r.t. errno usage? 
Q142: printf() anywhere AFTER pthread_create() crashes on HPUX 10.x 
Q143: Pthreads and Linux 
Q144: DEC release/patch numbering 
Q145: Pthreads (almost) on AS/400 
Q146: Can pthreads & UI threads interoperate in one application?
Q147: Thread create timings 
Q148: Timing Multithreaded Programs (Solaris) 
Q149: A program which monitors CPU usage? 
Q150: standard library functions: what's safe and what's not?
Q151: Where are semaphores in POSIX threads? 
Q152: Thread & sproc (on IRIX) 
Q153: C++ Exceptions in Multi-threaded Solaris Process 
Q154: SCHED_FIFO threads without root privileges ? 
Q155: "lock-free synchronization" 
Q156: Changing single bytes without a mutex 
Q157: Mixing threaded/non-threadsafe shared libraries on Digital Unix 
Q158: VOLATILE instead of mutexes? 
Q159: After pthread_cancel() destructors for local object do not get called?!
Q160: No pthread_exit() in Java.
Q161: Is there anyway I can make my stacks red zone protected?
Q162: Cache Architectures, Word Tearing, and VOLATILE
Q163: Can ps display thread names?
Q164: (Not!) Blocking on select() in user-space pthreads.
Q165: Getting functional tests for UNIX98
Q166: To make gdb work with linuxthreads?
Q167: Using cancellation is *very* difficult to do right...
Q168: Why do pthreads implementations differ in error conditions?
Q169: Mixing threaded/non-threadsafe shared libraries on DU
Q170: sem_wait() and EINTR
Q171: pthreads and sprocs
Q172: Why are Win32 threads so odd?
Q173: What's the point of all the fancy 2-level scheduling??
Q174: Using the 2-level model, efficiency considerations, thread-per-X
Q175: Multi-platform threading api
Q176: Condition variables on Win32 
Q177: When stack gets destroyed relative to TSD destructors?
Q178: Thousands of mutexes?
Q179: Threads and C++
Q180: Cheating on mutexes
Q181: Is it possible to share a pthread mutex between two distinct processes?
Q182: How should one implement reader/writer locks on files?
Q183: Are there standard reentrant versions of standard nonreentrant functions?
Q184: Detecting the number of cpus
Q185: Drawing to the Screen in more than one Thread (Win32)
Q186: Digital UNIX 4.0 POSIX contention scope
Q187: Dec pthreads under Windows 95/NT?
Q188: DEC current patch requirements
Q189: Is there a full online version of 1003.1c on the web somewhere?
Q190: Why is there no InterlockedGet?
Q191: Memory barrier for Solaris
Q192: pthread_cond_t vs pthread_mutex_t
Q193: Using DCE threads and java threads together on hpux(10.20)
Q194: My program returns enomem on about the 2nd create.
Q195: Does pthread_create set the thread ID before the new thread executes?
Q196: thr_suspend and thr_continue in pthread
Q197: Are there any opinions on the Netscape Portable Runtime?
Q198: Multithreaded Perl
Q199: What if a process terminates before mutex_destroy()?
Q200: If a thread performs an illegal instruction and gets killed by the system...
Q201: How to propagate an exception to the parent thread?
Q202: Discussion: "Synchronously stopping things" / Cheating on Mutexes
Q203: Discussion: Thread creation/switch times on Linux and NT.
Q204: Are there any problems with multiple threads writing to stdout?
Q205: How can I handle out-of-band communication to a remote client?
Q206: I need a timed mutex for POSIX
Q207: Does pthreads have an API for configuring the number of LWPs?
Q208: Why does Pthreads use void** rather than void*?
Q209: Should I use poll() or select()?
Q210: Where is the threads standard of POSIX ????
Q211: Is Solaris' unbound thread model braindamaged?
Q212: Releasing a mutex locked (owned) by another thread.
Q213: Any advice on using gethostbyname_r() in a portable manner?
Q214: Passing file descriptors when exec'ing a program.
Q215: Thread ID of thread getting stack overflow? 
Q216: Why aren't my (p)threads preempted?
Q217: Can I compile some modules with and others without _POSIX_C_SOURCE?
Q218: timed wait on Solaris 2.6?
Q219: Signal delivery to Java via native interface
Q220: Concerning timedwait() and realtime behavior.
Q221: pthread_attr_getstacksize on Solaris 2.6
Q222: LinuxThreads: Problem running out of TIDs on pthread_create
Q223: Mutexes and the memory model
Q224: Poor performance of AIO in Solaris 2.5?
Q225: Strategies for testing multithreaded code?
Q226: Threads in multiplatform NT 
Q227: Guarantee on condition variable predicate/pthreads?
Q228: Pthread API on NT? 
Q229: Sockets & Java2 Threads
Q230: Emulating process shared threads 
Q231: TLS in Win32 using MT run-time in dynamically loaded DLLs?
Q232: Multithreaded quicksort
Q233: When to unlock for using pthread_cond_signal()?
Q234: Multi-Read One-Write Locking problem on NT
Q235: Thread-safe version of flex scanner 
Q236: POSIX standards, names, etc
Q237: Passing ownership of a mutex?
Q238: NT fibers
Q239: Linux (v.2.0.29 ? Caldera Base)/Threads/KDE 
Q240: How to implement user space cooperative multithreading?
Q241: Tools for Java Programming 
Q242: Solaris 2.6, pthread_cond_timedwait() wakes up early
Q243: AIX4.3 and PTHREAD problem
Q244: Readers-Writers Lock source for pthreads
Q245: Signal handlers in threads 
Q246: Can a non-volatile C++ object be safely shared amongst POSIX threads?
Q247: Single UNIX Specification V2
Q248: Semantics of cancelled I/O (cf: Java)
Q249: Advice on using multithreading in C++?
Q250: Semaphores on Solaris 7 with GCC 2.8.1 
Q251: Draft-4 condition variables (HELP) 
Q252: gdb + linuxthreads + kernel 2.2.x = fixed :) 
Q253: Real-time input thread question
Q254: How does Solaris implement nice()?  
Q255: Re: destructors and pthread cancelation...  
Q256: A slight inaccuracy WRT OS/2 in Threads Primer 
Q257: Searching for an idea 
Q258: Benchmark timings from "Multithreaded Programming with Pthreads" 
Q259: Standard designs for a multithreaded applications? 
Q260: Threads and sockets: Stopping asynchronously
Q261: Casting integers to pointers, etc. 
Q262: Thread models, scalability and performance  
Q263: Write threaded programs while studying Japanese!  
Q264: Catching SIGTERM - Linux v Solaris 
Q265: pthread_kill() used to direct async signals to thread? 
Q266: Don't create a thread per client 
Q267: More thoughts on RWlocks 
Q268: Is there a way to 'store' a reference to a Java thread? 
Q269: Java's pthread_exit() equivalent?  
Q270: What is a "Thread Pool"?
Q271: Where did "Thread" come from?
Q272: How do I create threads in a Solaris driver?
Q273: Synchronous signal behavior inconsistent?
Q274: Making FORTRAN libraries thread-safe?
Q275: What is the wakeup order for sleeping threads?
Q276: Upcalls in VMS?
Q277: How to design synchronization variables?
Q278: Thread local storage in DLL?
Q279:  How can I tell what version of linux threads I've got?
Q280: C++ exceptions in a POSIX multithreaded application?
Q281: Problems with Solaris pthread_cond_timedwait()?
Q282: Benefits of threading on uni-process
Q283: What if two threads attempt to join the same thread?
Q284: Questions with regards to Linux OS?
Q285: I need to create about 5000 threads?
Q286:  Can I catch an exception thrown by a sla
Q287: _beginthread() versus CreateThread()?
Q288: Is there a select() call in Java??
Q289: Comment on use of VOLATILE in the JLS.?
Q290: Should I try to avoid GC by pooling objects myself??
Q291: Does thr_X return errno values? What's errno set to???
Q292: How can I wait on more than one condition variable in one place?
Q293: Details on MT_hot malloc()?
Q294: Bug in Bil's condWait()?
Q295: Is STL considered thread safe??
Q296: To mutex or not to mutex an int global variable ??
Q297: Stack overflow problem ?
Q298: How would you allow the other threads to continue using a "forgotten" lock?
Q299: How unfair are mutexes allowed to be?
Q300: Additionally, what is the difference between -lpthread and -pthread? ?
Q301: Handling C++ exceptions in a multithreaded environment?
Q302: Pthreads on IRIX 6.4 question?
Q303: Threading library design question ?
Q304: Lock Free Queues?
Q305: Threading library design question ?
Q306: Stack size/overflow using threads ?
Q307: correct pthread termination?
Q308: volatile guarantees??
Q309: passing messages, newbie?
Q310: solaris mutexes?
Q311: Spin locks?
Q312: AIX pthread pool problems?
Q313: iostream library and multithreaded programs ?
Q314: Design document for MT appli?
Q315: SCHED_OTHER, and priorities?
Q316: problem with iostream on Solaris 2.6, Sparcworks 5.0?
Q317: pthread_mutex_lock() bug ???
Q318: mix using thread library?
Q319: Re: My agony continues (thread safe gethostbyaddr() on FreeBSD4.0) ?
Q320: OOP and Pthreads?
Q321: query on threading standards?
Q322: multiprocesses vs multithreaded..??
Q323: CGI & Threads?
Q324: Cancelling detached threads (posix threads)?
Q325: Solaris 8 recursive mutexes broken?
Q326: sem_wait bug in Linuxthreads (version included with glibc 2.1.3)?
Q327: pthread_atfork??
Q328: Does anybody know if the GNU Pth library supports process shared mutexes?
Q329: I am trying to make a thread in Solaris to get timer signals.
Q330: How do I time individual threads?
Q331: I'm running out of IPC semaphores under Linux!
Q332: Do I have to abandon the class structure when using threads in C++?
Q333: Questions about pthread_cond_timedwait in linux.
Q334: Questions about using pthread_cond_timedwait.
Q335: What is the relationship between C++ and the POSIX cleanup handlers?
Q336: Does select() work on calls recvfrom() and sendto()?
Q337: libc internal error: _rmutex_unlock: rmutex not held.
Q338: So how can I check whether the mutex is already owned by the calling thread?
Q339: I expected SIGPIPE to be a synchronous signal.
Q340: I have a problem between select() and pthread...
Q341: Mac has Posix threading support.
Q342: Just a few questions on Read/Write for linux.
Q343: The man pages for ioctl(), read(), etc. do not mention MT-safety.
Q344: Status of TSD after fork()?
Q345: Static member function vs. extern "C" global functions?
Q346: Can I kill a thread from the main thread that created it?
Q347: What does /proc expose vis-a-vis LWPs?
Q348: What mechanism can be used to take a record lock on a file?
Q349: Implementation of a Timed Mutex in C++
Q350: Effects that gradual underflow traps have on scaling.
Q351: LinuxThreads woes on SIGSEGV and no core dump.
Q352: On timer resolution in UNIX.
Q353: Starting a thread before main through dynamic initialization.
Q354: Using POSIX threads on mac X and solaris?
Q355: Comments on ccNUMA on SGI, etc.
Q356: Thread functions are NOT C++ functions! Use extern "C"
Q357: How many CPUs do I have?
Q358: Can malloc/free allocate from a specified memory range?
Q359: Can GNU libpth utilize multiple CPUs on an SMP box?
Q360: How does Linux pthreads identify the thread control structure?
Q361: Using gcc -kthread doesn't work?!
Q362: FAQ or tutorial for multithreading in 'C++'?
Q363: WRLocks & starvation.
Q364: Reference for threading on OS/390.
Q365: Timeouts for POSIX queues (mq_timedreceive())
Q366: A subroutine that gives cpu time used for the calling thread?
Q367: Documentation for threads on Linux
Q368: Destroying a mutex that was statically initialized.
Q369: Tools for debugging overwritten data.
Q370: POSIX synchronization is limited compared to win32.
Q371: Anyone recommend us a profiler for threaded programs?
Q372: Coordinating thread timeouts with drifting clocks.
Q373: Which OS has the most conforming POSIX threads implementation?
Q374: MT random number generator function.
Q375: Can the main thread sleep without causing all threads to sleep?
Q376: Is dynamic loading of the libpthread supported in Redhat?
Q377: Are reads and writes atomic?
Q378: More discussion on fork().
Q379: Performance differences: POSIX threads vs. ADA threads?
Q380: Maximum number of threads with RedHat 255?
Q381: Best MT debugger for Windows...
Q382: Thread library with source code ? 
Q383: Async cancellation and cleanup handlers.
Q384: How easy is it to use pthreads on win32?
Q385: Does POSIX require two levels of contention scope?
Q386: Creating threadsafe containers under C++
Q387: Cancelling pthread_join() DOESN'T detach target thread?
Q388: Scheduling policies can have different ranges of priorities?
Q389: The entity life modeling approach to multi-threading.
Q390: Is there any (free) documentation?
Q391: Grafting POSIX APIs on Linux is tough!
Q392: Any companies  using pthread-win32?
Q393: Async-cancel safe function: guidelines?
Q394: Some detailed discussion of implementations.
Q395: Cancelling a single thread in a signal handler?
Q396: Trouble debugging under gdb on Linux.
Q397: Global signal handler dispatching to threads.
Q398: Difference between the Posix and the Solaris Threads?
Q399: Recursive mutexes are broken in Solaris?
Q400: pthreads and floating point attributes?
Q401: Must SIGSEGV be sent to the thread which generated the signal?
Q402: Windows and C++: How?
Q403: I have blocked all signals and don't get SEGV!
Q404: AsynchronousInterruptedException (AIE) and POSIX cancellation

 Q1: How fast can context switching be?  

In general purpose processors (SPARC, MIPS, ALPHA, HP-PA, POWER, x86) a
LOCAL thread context switch takes on the order of 50us.  A GLOBAL thread
context switch takes on the order of 100us.  However, Abdelsalam Heddaya writes:

>- Certain multi-threaded processor architectures, with special support
>  for on-chip caching of thread contexts can switch contexts in,
>  typically, less than 10 cycles, down to as little as one cycle.

The Tera machine switches with 0 cycles of overhead.

>  Such processors still have to incur a high cost when they run out of
>  hardware contexts and need to perform a full "context swap" with
>  memory.

Hmmm.  With 128 contexts/processor and 16 processors on the smallest
machine, we may be talking about a rare situation.  Many people doubt
we'll be able to keep the machine busy, but you propose an
embarrassment of riches/parallelism.

In any case, I disagree with the implication that a full context swap
is a problem to worry about.  We keep up to 2048 threads active at a
time, with others confined to memory.  The processors issue
instructions for the active threads and completely ignore the inactive
threads -- there's no swapping of threads between processor and memory
in the normal course of execution.  Instead, contexts are "swapped"
when one thread finishes, or blocks too long, or is swapped to disk,
etc.  In other words, at fairly significant intervals.

Preston Briggs
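
For anyone who wants a rough number for their own machine, here is a minimal
sketch of a ping-pong measurement using POSIX threads. It is not from the
original posts. The names handoff_ns() and partner() are made up for this
example. Two threads pass control back and forth through a mutex and a
condition variable, and the elapsed time is divided by the number of handoffs.
On a multiprocessor the two threads may sit on different CPUs, in which case
the figure is closer to wakeup latency than to a pure context switch, and in
either case it includes the cost of the mutex and condition variable calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int turn;   /* 0: measuring thread may run, 1: partner may run */
static int done;   /* set once the last round has finished */

/* Partner thread: waits for its turn, then hands control straight back. */
static void *partner(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    for (;;) {
        while (turn != 1 && !done)
            pthread_cond_wait(&cond, &lock);
        if (done)
            break;
        turn = 0;
        pthread_cond_signal(&cond);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Returns the average cost, in nanoseconds, of one handoff between threads. */
double handoff_ns(long rounds)
{
    pthread_t t;
    struct timespec start, end;
    int rc;

    assert(rounds > 0);
    turn = 0;
    done = 0;
    rc = pthread_create(&t, NULL, partner, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_mutex_lock(&lock);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < rounds; i++) {
        turn = 1;                          /* give the partner its turn */
        pthread_cond_signal(&cond);
        while (turn != 0)                  /* sleep until it hands back */
            pthread_cond_wait(&cond, &lock);
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    done = 1;                              /* tell the partner to exit */
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
    pthread_join(t, NULL);

    double elapsed = (end.tv_sec - start.tv_sec) * 1e9
                   + (end.tv_nsec - start.tv_nsec);
    return elapsed / (2.0 * rounds);       /* two handoffs per round */
}
```

Calling handoff_ns(100000) from main() and printing the result gives a figure
to compare against the estimates above. Binding the process to one CPU first
(by whatever means the platform offers) should bring it closer to a true
switch time.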


 Q2: What about special purpose processors?  

What are the distinctions between these special purpose processors and
the general purpose processors we're using?


 Q3: What kinds of issues am I faced with in async cancellation?  

Michael C. Cambria wrote:
> In article <4eoe2a$>, (Spike White) wrote:
> [deleted]
> > thread2()
> > {
> >    ...
> >    while(1) {
> >       pthread_setasynccancel(CANCEL_ON);
> >       pthread_testcancel();  /* if there's a pending cancel */
> >       read(...);
> >       pthread_setasynccancel(CANCEL_OFF);
> >       ...process data...
> >    }
> > }
> >
> > Obviously, you shouldn't use any results from the read() call that was
> > cancelled -- God knows what state it was when it left.
> >
> > That's the only main use I've ever found for async cancel.
> I used something quite similar to your example (quoted above) in my
> original question.
> Since the read() call itself is not async cancel safe according to Posix,
> is it even safe to do the above?  In general for any posix call which is
> not async cancel safe, my guess (and many e-mails to me agree) is to
> just not use it.
> Using read() as an example, I'll bet everyone will agree with you not
> to use the results of the read() call.  However, the motivation for
> my original question was, being as a call() is not async cancel safe,
> by canceling a thread when it is in one of these calls _may_ screw up
> other threads in general and other threads using the same fd in
> particular.  This is why I asked why one would use it.
> In your example, if read() did anything with static data, the next read on
> that fd could have problems if a thread was cancelled while in the read().
> (Note:  if you don't like the "static data" example, substitute whatever
> you like for the implementation reason for read(), or any call, not being
> async cancel safe.  I used static data as an example only.)
> Mike

Specifically, NO, it is NOT safe to call read() with async cancel. On some
implementations it may work, sometimes. In general, it *MAY* work if, on
the particular release of your particular operating system, read() happens
to be implemented with no user-mode code (aside from a syscall trap). In
most cases, a user mode cancel will NOT be allowed to corrupt kernel data.

However, no implementations make any guarantees about their implementation
of read(). It may be a syscall in one version and be moved partly into
libc in the next version.

Unfortunately, the OSF DCE porting guide made reference to the possibility
of using async cancel in place of synchronous system cancel capability on
platforms that don't support the latter. That was really too bad, and it
set a very dangerous precedent.

POSIX 1003.1c-1996 encourages all routines to document whether they are
async cancel safe. (Luckily the advice is in rationale -- which is to say
it's really just commentary and not part of the standard -- because it'd
be horrendously difficult to change the documentation of every single
routine in a UNIX system.) In practice, you should always assume that a
function is NOT async cancel safe unless it says that it IS. And you won't
see that very often.

Because, as has already been commented, async cancel really isn't very
useful. There is a certain small class of application that can benefit
dramatically from async cancel, for good response to shutdown requests in
long-running compute-bound threads. In a long and tight loop it's not
practical to call pthread_testcancel(). So in cma we provided async cancel
for those cases. In retrospect I believe that's probably one of the bad
parts of cma, which POSIX should have omitted. There may well have been
"hard realtime" people in the room who wanted to use it, though (the POSIX
threads standard was developed by roughly 10 "threads people" and 40 to 50
"realtime people").

Dave Butenhof                              Digital Equipment Corporation
                                           110 Spit Brook Rd, ZKO2-3/Q18
Phone: 603.881.2218, FAX: 603.881.0120     Nashua, NH 03062-2711
                 "Better Living Through Concurrency"

> In article <>,
> Jose Luis Ramos Morán wrote:
> %   pthread_setcancelstate(PTHREAD_CANCEL_ENABLE,NULL);
> %   pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS,NULL);
> I would guess that your problem comes from this. Asynchronous cancellation
> is almost never a good idea, but if you do use it, you should be really
> careful about whether there's anything with possible side-effects in your
> code. For instance, the C++ exception handler could be screwed up for your
> whole process if you cancel at a bad moment.
> Anyway, try taking out the asynchronous cancellation and see if the problem
> goes with it.

I'll put it a little more strongly than Patrick. The program is illegal. You
CANNOT call any function with asynchronous cancel enabled unless that function
is explicitly specified as "async-cancel safe". There are very few such
functions, and sleep() is not one of them. In fact, within the scope of the
POSIX and UNIX98 standards, with async cancel enabled you are allowed only to

  1. Disable asynchronous cancellation (set cancel type to DEFERRED)
  2. Disable cancellation entirely (set cancel state to DISABLE)
  3. Call pthread_cancel() [This is bizarre and pointless, but it is specified
     in the standard.]

If you call any other function defined by ANSI C, POSIX, or UNIX98 with async
cancel enabled, then your program is nonportable and "non conforming". It MAY
still be "correct", but only IF you are targeting your code to one specific
implementation of the standard that makes the NON-portable and NON-standard
guarantee, in writing, that the function you're calling actually is
async-cancel safe on that implementation. Otherwise, the program is simply

You can, of course, write your own async-cancel safe functions. It's not that
hard to do. In general, like most correct implementations of pthread_cancel(),
you simply DISABLE async cancellation on entry and restore the previous
setting on exit. But it's silly to do that very often. And, of course, that's
not the same as actually allowing async cancel. THAT is a much, much harder
job, except for regions of code that own no resources of any kind.

Asynchronous cancelation was designed for tight CPU-bound loops that make no
calls, and therefore would suffer from the need to call pthread_testcancel()
on some regular basis in order to allow responsiveness to cancellation
requests. That's the ONLY time or place you should EVER even consider using
asynchronous cancellation.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
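
As an illustration of the one use case described above, here is a minimal
sketch, with made-up names crunch(), record_progress() and
cancel_compute_thread(). The compute loop runs with asynchronous cancel
enabled and makes no calls, except to a helper that is made async-cancel safe
in the way the post describes: it switches to DEFERRED on entry and restores
the previous type on exit. This assumes an implementation where
pthread_setcanceltype() acts on a pending cancel when the type becomes
asynchronous again, which is what the standard intends.

```c
#include <assert.h>
#include <pthread.h>

static volatile unsigned long progress;   /* demo-only progress counter */

/* Async-cancel safe by construction: cancellation is deferred for the
 * duration of the body, then the caller's cancel type is restored. */
static void record_progress(unsigned long n)
{
    int old_type;

    pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &old_type);
    progress = n;             /* stand-in for work that may own resources */
    pthread_setcanceltype(old_type, NULL);
}

/* Tight CPU-bound loop: owns no resources, so async cancel is tolerable.
 * It never returns normally; it ends only by being cancelled. */
static void *crunch(void *arg)
{
    unsigned long n = 0;

    (void)arg;
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, NULL);
    for (;;) {
        n++;
        if ((n & 0xFFFFFUL) == 0)
            record_progress(n);
    }
}

/* Returns 1 if the compute thread ended by cancellation. */
int cancel_compute_thread(void)
{
    pthread_t t;
    void *result = NULL;
    int rc;

    if (pthread_create(&t, NULL, crunch, NULL) != 0)
        return 0;
    pthread_cancel(t);
    rc = pthread_join(t, &result);
    assert(rc == 0);
    (void)rc;
    return result == PTHREAD_CANCELED;
}
```

Anything beyond arithmetic in the loop body (malloc, printf, a mutex) would
have to go through a wrapper like record_progress(), at which point plain
deferred cancellation with pthread_testcancel() is usually the simpler choice.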

 Q4: When should I use these new thread-safe "_r" functions?  

David Brownell wrote:
> If the "_r" versions are available at all times, use them but
> beware of portability issues.  POSIX specifies a pretty minimal
> set and many implementations add more (e.g. gethostbyname_r).
> Some implementations only expose the "_r" versions if you
> compile in a threaded environment, too.
> - Dave

POSIX 1003.1c-1995 deliberately separates _POSIX_THREAD_SAFE_FUNCTIONS
from _POSIX_THREADS so that they can be easily implemented by
non-threaded systems. The "_r" versions aren't just thread-safe, they
are also much "cleaner" and more modular than the traditional forms.
(for example, you can have several independent readdir_r or strtok_r
streams active simultaneously).

The grand vision is that all UNIX systems, even those without threads,
would of course want to pick up this wonderful new set of interfaces. I
doubt you'll see them in any system without threads support, of course,
but it would be nice.
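
A small sketch of the "several independent streams" point, using strtok_r().
The function count_fields() and its input format are invented for this
example. The outer stream splits on ';' while the inner stream splits each
piece on '='. With plain strtok() the inner calls would overwrite the single
hidden state and the outer loop would stop after the first piece.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>

/* Counts the tokens in a string such as "a=1;b=2;c=3".
 * The buffer is modified in place, as with strtok(). */
int count_fields(char *text)
{
    int fields = 0;
    char *outer_state;   /* saved position of the ';' stream */
    char *inner_state;   /* saved position of the '=' stream */

    assert(text != NULL);
    for (char *pair = strtok_r(text, ";", &outer_state);
         pair != NULL;
         pair = strtok_r(NULL, ";", &outer_state)) {
        /* The inner stream runs while the outer one is still active. */
        for (char *tok = strtok_r(pair, "=", &inner_state);
             tok != NULL;
             tok = strtok_r(NULL, "=", &inner_state))
            fields++;
    }
    return fields;
}
```

Because each stream keeps its state in a caller-supplied pointer, the same
code is also safe when two threads tokenize different strings at once.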

 Q5: What benchmarks are there on POSIX threads?  

In the book on POSIX.4 by B. Gallmeister there are some very useful POSIX
benchmark programs which allow one to measure the real-time performance of an
operating system. However, there is nothing on the threads of POSIX.4a!  Does
anybody know of a useful set of benchmark programs for these POSIX threads?

Any help is greatly appreciated.

Markus Joos

 Q6: Has anyone used the Sparc atomic swap instruction?  

Has anyone used the Sparc atomic swap instruction to safely build lists 
in a multithreaded application?  Any examples?  Any references?

Yes, but it would not help you if you use sun4c machines (no atomic
instructions).  Thus you would be forced to use atomic instructions on sun4m
or later, and spl stuff on sun4c.  That does not make a pretty picture.  Why
not use mutex_lock/unlock and let the libraries worry about that?  mutex_lock
uses atomic/spl stuff.
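
A minimal sketch of that advice, written with the POSIX calls
(pthread_mutex_lock/unlock) rather than the Solaris mutex_lock/unlock named
above. The list layout and the names list_push(), pusher() and
push_concurrently() are assumptions made for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

static struct node *head;                        /* shared list head */
static pthread_mutex_t head_lock = PTHREAD_MUTEX_INITIALIZER;

/* Pushes one value; the mutex makes the two-step link update atomic. */
static void list_push(int value)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        abort();
    n->value = value;
    pthread_mutex_lock(&head_lock);
    n->next = head;
    head = n;
    pthread_mutex_unlock(&head_lock);
}

static void *pusher(void *arg)
{
    int count = *(int *)arg;

    for (int i = 0; i < count; i++)
        list_push(i);
    return NULL;
}

/* Runs up to 16 pusher threads, then frees the list and returns its length. */
int push_concurrently(int nthreads, int per_thread)
{
    pthread_t tids[16];
    int length = 0;

    assert(nthreads > 0 && per_thread >= 0);
    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, pusher, &per_thread);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    while (head != NULL) {                       /* single-threaded by now */
        struct node *n = head;
        head = n->next;
        free(n);
        length++;
    }
    return length;
}
```

The library picks the right primitive for the hardware underneath, so the
same source runs unchanged on machines with and without an atomic swap.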


[sun4c are SPARC v7 machines such as 4/110, SS1, SS1+, SS2, IPC,
IPX, EPC, EPX. sun4m are v8 machines including SS10, SS20, SS4, SS5, 4/690,
SS1000, SC2000. The UltraSPARC machines are SPARC v8+ (soon to be v9), but
have the same instructions as the sun4ms.]
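
A minimal sketch of the "just use a mutex" advice, written with POSIX
calls rather than the UI mutex_lock() named above. The list layout and
the function names (list_push, list_length, push_from_threads) are
assumptions made for the example.

```c
#include <pthread.h>
#include <stdlib.h>

struct node {
    int          value;
    struct node *next;
};

static struct node     *list_head = NULL;
static pthread_mutex_t  list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Pushes a value onto the shared list; returns 0, or -1 if out of memory. */
int list_push(int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;

    pthread_mutex_lock(&list_lock);   /* the library picks the primitive */
    n->next = list_head;
    list_head = n;
    pthread_mutex_unlock(&list_lock);
    return 0;
}

/* Counts the nodes currently on the list. */
int list_length(void)
{
    int count = 0;
    pthread_mutex_lock(&list_lock);
    for (struct node *n = list_head; n != NULL; n = n->next)
        count++;
    pthread_mutex_unlock(&list_lock);
    return count;
}

static void *push_many(void *arg)
{
    int count = *(int *)arg;
    for (int i = 0; i < count; i++)
        list_push(i);
    return NULL;
}

/* Hypothetical driver: up to 16 threads each push `per_thread` values. */
int push_from_threads(int nthreads, int per_thread)
{
    pthread_t tids[16];
    int started = 0;

    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tids[started], NULL, push_many, &per_thread) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return list_length();
}
```

The same source then runs unchanged on machines with and without an
atomic swap instruction, which is the portability argument made above.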
 Q7: Are there MT-safe interfaces to DBMS libraries?  

A: In general, no.  My current understanding is that NO major DBMS
   vendor has an MT-safe client-side library. (1996)

Peter Sylvester wrote:
> In article <>, Andreas Reichenberger
>  wrote:
> > Richard Moulding wrote:
> > >
> > > I need to interface to an Oracle 7 DB from DCE (non-Oracle)
> > > clients. We are planning to build our own  DCE RPC-stored
> > > procedure interface but someone must be selling something to do
> > > this, no?
> I have the same problem with an Informix application which uses a DCE
> interface, and currently limit it to 1 thread coming in.  This works, but
> could be a bottleneck in busy environments, as other incoming RPCs are put
> in a queue (blocked) until the current one finishes.

... stuff deleted

> A potential way around this would be to fork off separate processes which
> then start their own connection to the database.  The parent then acts as a
> dispatcher for requests coming in.  I know the forking part works without
> DCE, but I suspect that you have to do all the forks before going into the
> DCE server listening mode.
> I also thought I heard something about Oracle providing a thread safe
> library, maybe in 7.3.  Anyone know?
> --
> Peter Sylvester
> MITRE Corp.
> Bedford, MA

This is exactly the way we handled the problem. We wrote a tool that
generates the complete dispatcher from an IDL file. The dispatcher (which is
virtually invisible to the clients and to the developers) distributes the
requests from the clients to its 'backends', which are connected to the
DB. The backends are implemented as single-threaded DCE Servers with the
Interface specified in the IDL File.

We added some features that are not in DCE, like 
  - asynchronous RPCs (the RPC returns immediately and the client can ask the
    dispatcher to return the state of the RPC (if it is done or still running)
    or request the RPC to be canceled) 
  - dividing the backends into classes. i.e. it's possible to have one class of
    backends for querying the database and another class for updates, etc. By
    assigning 2 backends to the query class and the rest of the backends to
    other classes you can limit the number of concurrent queries to 2 (because
    they are time consuming). The client has to specify which class is to be used
    for an RPC (we currently support up to 10 classes)

Context handles are used to tie a client to one backend for transactions which
require more than one RPC to be handled by the same backend (= DB Session).

The whole reason we had to do this in the first place was to limit the number
of backend processes necessary to support a few hundred PC clients. We
currently run it on AIX and Digital UNIX with Oracle and Ingres. However,
there's no reason why it shouldn't work on any UNIX platform which supports
OSF DCE (V1.1) and with any DB.

Feel free to contact me for more details...

See 'ya

 Q8: Why do we need re-entrant system calls?  

A: (Jeffrey P Bradford) wrote:
>Why do we need re-entrant system calls?  I know that it's so that
>system calls can be used in a multithreaded environment, but how often
>does one really have multiple threads executing the same system call?
>Do we really need system calls that can be executed by multiple
>threads, or would mutual exclusion be good enough?

Well, there have been some implementations that felt (feel?) that mutual
exclusion is good enough. And, in fact, that will make the functions
"thread safe". But it wreaks havoc with performance, and with things like
cancelability. It turns out that real applications have multiple threads
executing the same system call all the time. read() and write() are popular,
as are send() and recv() (on UNIX).

>I'm assuming that system calls can be designed intelligently enough so
>that, for example, if a process wants to perform a disk read, the
>process performs a system call, exits the system call (so another
>thread can perform a disk read), and then is woken up when the disk
>read is done.

[I assume that by "exits the system call" you mean "returns to user
space".]

That all depends on the OS. On UNIX, that is not the default
system call behavior. On VMS it is (Just two examples).

Brian Silver.

 Q9: Any "code-coverage" tools for MT applications?  

Is there an application that can help me with "code-coverage" for
MT applications?


Upon which platform are you working?  I did performance profiling last week
on a MT app using prof & gprof on a Solaris 2.4 machine.  For code coverage,
I use tcov.  I suspect that most OS's w/ kernel threads have thread-aware
gprof and tcov commands.

Spike White          |               | Biker Nerds
HaL Software Systems | '87 BMW K75S, DoD #1347     |  From  HaL
Austin, TX           | 
Disclaimer:  HaL, want me to speak for you?  No, Dave... 
 Q10: How can POSIX join on any thread?  

The pthread_join() function will not allow you to wait for "any" 
thread, like the UI function thr_join() will.  How can I get this?

> >: I want to create a number of threads and then wait for the first
> >: one to finish, not knowing which thread will finish first.  But
> >: it appears pthread_join() forces me to specify exactly which of
> >: my threads I want to wait for.  Am I missing something basic, or
> >: is this a fundamental flaw in pthread_join()?
> >
> >:      Rich Stevens
> >
> >Good call.  I notice Solaris native threads have this support and the
> >pthreads implementations I've seen don't.  I wondered about this myself.
> >
> Same here.  The situation I ran into was a case where once the main
> created the necessary threads and completed any work it was responsible
> for, it just needed to "hang-around" until all the threads completed
> their work before exiting.  pthread_join() for "any" thread in loop using
> a counter for the number of threads seemed the logical choice.  Then I
> realized Solaris threads supported this but POSIX didn't (along with
> reader/writer locks).  Oh well.
> How about the Solaris SPLIT package.  Does it support the "wait for any"
> thread join?

This "wait for any" stuff is highly misleading, and dangerous in most real
threaded applications. It is easy to compare with the traditional UNIX "wait
for any process", but there's no similarity. Processes have a PARENT -- and
when a process "waits for any" it is truly waiting only for its own
children. When your shell waits for your "make" it CANNOT accidentally chomp
down on the termination of the "cc" that make forked off!

This is NOT true with threads, in most of the common industry threading
models (including POSIX 1003.1c-1995 and the "UNIX International" threads
model supported by Solaris). Your thr_join(NULL,...) call may grab the
termination status of a thread used to parallelize an array calculation
within the math library, and thus BREAK the entire application.

Without parent/child relationships, "wait for any" is not only totally
useless, it's outright dangerous. It's like the Win32 "terminate thread"
interface. It may seem "neat" on the surface, but it arbitrarily breaks all
shared data & synchronization invariants in ways that cannot be detected or
repaired, and thus CANNOT be used in anything but a very carefully
constructed "embedded system" type environment where every aspect of the
code is tightly controlled (no third-party libraries, and so forth). The
very limited environments where they are safe & useful are dramatically
outweighed by the danger that having them there (and usually very poorly
explained) encourages their use in inappropriate ways.

It really wouldn't have been hard to devise POSIX 1003.1c-1995 with
parent/child relationships. A relatively small overhead. It wasn't even
seriously considered, because it wasn't done in any of the reference
systems, and certainly wasn't common industry practice. Nevertheless,
there are clearly advantages to "family values" in some situations...
among them being the ability to usefully support "wait for any". But
wishful thinking and a dime gets you one dime...

Dave Butenhof                              Digital Equipment Corporation                       110 Spit Brook Rd, ZKO2-3/Q18
Phone: 603.881.2218, FAX: 603.881.0120     Nashua, NH 03062-2711
                 "Better Living Through Concurrency"

I find Dave's comments to be most insightful.  He hits on a big point
that I have heard a number of people express confusion about.  My 2-bits
to add:

  As programmers we should be thinking about the availability of resources
-- when is something ready for use?  "Is the Matrix multiply complete?" "Has
the data request been satisfied?" etc.  thr_join() is often used as a cheap
substitute for those questions, because we ASSUME that when all N threads
have exited, that the computation is complete.  (Generally accurate, as long
as we control the entire program.  Should some lout get hired to maintain
our code, this assumption could become false in a hurry.)

  The only instance where we REALLY care if a thread has exited is when
the resource in question IS that thread (e.g., we want to un-mmap pages
we reserved for the stack or other rare stuff).

  So... the correct answer is "Don't do that."  Don't use thr_join()
to count threads as they exit.  Set up a barrier or a CV and have the threads
count down as they complete their work.  IE:

worker threads:

...     lock(M);
    running_threads--;
    if (running_threads == 0) cond_signal(CV);
    unlock(M);

"Master" thread:

... running_threads = N;
    (create the N worker threads)
    lock(M);
    while (running_threads != 0) cond_wait(CV, M);
    unlock(M);
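
The countdown idea above can be sketched with POSIX calls as follows.
This is an illustration under assumptions, not code from the thread: the
names worker, run_workers and results_ready are invented, and the
workers are created detached so that nobody needs to join them.

```c
#include <pthread.h>

static pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static int running_threads;   /* workers that have not finished yet */
static int results_ready;     /* stands in for "the resource is available" */

static void *worker(void *arg)
{
    (void)arg;
    /* ... the real work would go here ... */
    pthread_mutex_lock(&m);
    results_ready++;
    if (--running_threads == 0)
        pthread_cond_signal(&cv);     /* last one out wakes the master */
    pthread_mutex_unlock(&m);
    return NULL;
}

/* Starts n detached workers, waits for all of them, returns results_ready. */
int run_workers(int n)
{
    pthread_attr_t attr;
    pthread_t tid;
    int ready;

    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    pthread_mutex_lock(&m);
    running_threads = n;              /* set before any worker can finish */
    results_ready = 0;
    pthread_mutex_unlock(&m);

    for (int i = 0; i < n; i++) {
        if (pthread_create(&tid, &attr, worker, NULL) != 0) {
            pthread_mutex_lock(&m);
            running_threads--;        /* this worker will never report in */
            pthread_mutex_unlock(&m);
        }
    }
    pthread_attr_destroy(&attr);

    pthread_mutex_lock(&m);
    while (running_threads != 0)      /* loop guards against spurious wakeups */
        pthread_cond_wait(&cv, &m);
    ready = results_ready;
    pthread_mutex_unlock(&m);
    return ready;
}
```

The master waits on the state of the work, not on thread exit, so a
library thread terminating elsewhere in the process cannot confuse it.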


 Q11: What is the UI equivalent for PTHREAD_MUTEX_INITALIZER?  


From the man page (man mutex_init):

Solaris Initialize
     The equivalent Solaris API used to  initialize  a  mutex  so
     that  it has several different types of behavior is the type
     argument passed to mutex_init().  No current type  uses  arg
     although  a  future  type  may  specify  additional behavior
     parameters via arg.  type may be one of the following:

     USYNC_THREAD        The mutex can synchronize  threads  only
                         in  this  process.  arg is ignored.  The
                         USYNC_THREAD Solaris mutex type for pro-
                         cess  scope  is  equivalent to the POSIX
                         mutex         attribute          setting

     USYNC_PROCESS       The mutex  can  synchronize  threads  in
                         this  process and other processes.  Only
                         one process should initialize the mutex.
                         arg   is   ignored.   The  USYNC_PROCESS
                         Solaris mutex type for process scope  is
                         equivalent  to the POSIX mutex attribute
                         setting   PTHREAD_PROCESS_SHARED.    The
                         object  initialized  with this attribute
                         must  be  allocated  in  memory   shared
                         between  processes, i.e. either in Sys V
                         shared memory  (see  shmop(2)).   or  in
                         memory  mapped  to a file (see mmap(2)).
                         It is illegal to initialize  the  object
                         this  way and to not allocate it in such
                         shared memory.

     Initializing mutexes can also be accomplished by  allocating
     in  zeroed  memory  (default),  in  which  case,  a  type of
     USYNC_THREAD is assumed.  The same mutex must not be  simul-
     taneously  initialized  by  multiple  threads.  A mutex lock
     must not be re-initialized while in use by other threads.

     If default mutex attributes are used, the macro DEFAULTMUTEX
     can  be used to initialize mutexes that are statically allo-

 Q12: How many threads are too many in one heavyweight process?    

How many are too many for a single machine?


The answer, of course, is "it depends".

Presumably, the number of threads you're considering far outstrips the
number of processors you have available, so it's not really important
whether you're running on uni- or a multiprocessor, and it's not really
important (in this general case) whether the threads implementation has
any kernel support (presumably it doesn't on HP-UX, judging by your post
from 14 Feb 1996 14:31:42 -0500).  So, it comes down to what these
bazillion threads of yours are actually doing.  

If, for the most part, they just sit there waiting for someone to tickle
the other end of a socket connection, then you can probably create LOTS
before you hit "too many".  In this case it would depend on how much
memory is available to your process, in which to keep all of these
sleeping threads (and how much kernel resources are available to create
sockets for them ;-).

If, on the other hand, every one of these bazillion threads is hammering
away on the processor (trying to compute some fractal or something :-),
then creating any more threads than you have processors is too many.
That is, you waste time (performance, throughput, etc.) in switching
back and forth between the threads which you could be spending on
something useful.  That is, life would be better if you just created a
couple of threads and had them make their way through all the work at

Presumably, your application falls somewhere between the two extremes.
The idea is to design so that your "typical operating conditions"
involve a relatively small number of threads active at any one time.
Having extra ones running isn't a catastrophe, it just means that things
aren't quite as efficient as they otherwise might be.
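
For the compute-bound extreme, one way to follow the advice is to size
the worker pool from the number of processors. The sketch below assumes
sysconf(_SC_NPROCESSORS_ONLN) is available (it is common but not
required by POSIX); parallel_sum, struct slice and MAX_WORKERS are
names invented for the example.

```c
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_WORKERS 64

struct slice {
    const int *data;
    size_t     begin, end;   /* half-open range [begin, end) */
    long       sum;
};

static void *sum_slice(void *arg)
{
    struct slice *s = arg;
    for (size_t i = s->begin; i < s->end; i++)
        s->sum += s->data[i];
    return NULL;
}

/* Sums data[0..n) using about one thread per online processor. */
long parallel_sum(const int *data, size_t n)
{
    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
    size_t workers = (ncpu < 1) ? 1 : (size_t)ncpu;
    pthread_t tids[MAX_WORKERS];
    struct slice slices[MAX_WORKERS];
    int started[MAX_WORKERS];
    long total = 0;

    if (workers > MAX_WORKERS)
        workers = MAX_WORKERS;

    for (size_t i = 0; i < workers; i++) {
        slices[i].data  = data;
        slices[i].begin = n * i / workers;
        slices[i].end   = n * (i + 1) / workers;
        slices[i].sum   = 0;
        started[i] = (pthread_create(&tids[i], NULL, sum_slice, &slices[i]) == 0);
        if (!started[i])
            sum_slice(&slices[i]);    /* no thread: do this slice here */
    }
    for (size_t i = 0; i < workers; i++) {
        if (started[i])
            pthread_join(tids[i], NULL);
        total += slices[i].sum;
    }
    return total;
}
```

For the I/O-bound extreme the same sizing rule does not apply; there
the limit is memory and kernel resources, as described above.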


Webb Scales                                Digital Equipment Corporation                   110 Spit Brook Rd, ZKO2-3/Q18
Voice: 603.881.2196, FAX: 603.881.0120     Nashua, NH 03062-2711
         Rule #12:  Be joyful -- seek the joy of being alive.

 Q13: Is there an atomic mutex_unlock_and_wait_for_event()?  

Is it possible for a thread to release a mutex and begin
waiting on an "event" in one atomic operation?  I can think of a few
convoluted ways to achieve or simulate this, but am wondering if
there's an easy solution that I'm missing.


This isn't how you'd really want to look at things (POSIX). Figure out what
condition you're interested in and use a CV.
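
In POSIX the atomic "release the mutex and wait" step is
pthread_cond_wait() itself. A minimal sketch, assuming a one-slot
mailbox as the condition of interest; struct mailbox, mailbox_put,
mailbox_take and demo_handoff are invented names, and mailbox_put does
not wait for the slot to be empty, to keep the example short.

```c
#include <pthread.h>

struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int             full;     /* the condition we care about */
    int             value;
};

void mailbox_init(struct mailbox *mb)
{
    pthread_mutex_init(&mb->lock, NULL);
    pthread_cond_init(&mb->changed, NULL);
    mb->full = 0;
    mb->value = 0;
}

void mailbox_put(struct mailbox *mb, int value)
{
    pthread_mutex_lock(&mb->lock);
    mb->value = value;
    mb->full = 1;
    pthread_cond_signal(&mb->changed);
    pthread_mutex_unlock(&mb->lock);
}

int mailbox_take(struct mailbox *mb)
{
    int value;
    pthread_mutex_lock(&mb->lock);
    while (!mb->full)
        /* Unlocks mb->lock and blocks as one atomic step, then
         * re-locks the mutex before returning. */
        pthread_cond_wait(&mb->changed, &mb->lock);
    value = mb->value;
    mb->full = 0;
    pthread_mutex_unlock(&mb->lock);
    return value;
}

static void *producer(void *arg)
{
    mailbox_put(arg, 42);
    return NULL;
}

/* Hypothetical driver: hands one value from a new thread to the caller. */
int demo_handoff(void)
{
    struct mailbox mb;
    pthread_t tid;
    int got;

    mailbox_init(&mb);
    if (pthread_create(&tid, NULL, producer, &mb) != 0)
        return -1;
    got = mailbox_take(&mb);
    pthread_join(tid, NULL);
    return got;
}
```

Because the predicate is re-checked in a loop, it does not matter
whether the producer runs before or after the consumer starts waiting.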


The NT4.0 beta has a new Win32 API, SignalObjectAndWait that will do what you
want. Sorry, it is not available in 3.51 or earlier.

Robert V. Head
 Q14: Is there an archive of this newsgroup somewhere?  

I believe keeps a 1 year record of every
newsgroup on the Usenet.  You can search it by author to get your
articles, then pick out individual threads...

 Q15: Can I copy pthread_mutex_t structures, etc.?  

"Ian" == Ian Emmons  writes:
In article <> Ian Emmons  writes:

Ian> Variables of the data type pthread_t are, semantically speaking, a sort of 
Ian> reference, in the following sense:

Ian>     pthread_t tid1;
Ian>     pthread_t tid2;
Ian>     void* ret_val;

Ian>     pthread_create(&tid1, NULL, some_function, NULL);
Ian>     // Now tid1 references a new thread.
Ian>     tid2 = tid1;
Ian>     // Now tid2 references the same thread.
Ian>     pthread_join(tid2, &ret_val);

Ian> In other words, after creating the thread, I can assign from one pthread_t 
Ian> to another, and they all reference the same thread.  Pthread_key_t's (I 
Ian> believe) behave the same way.

    You should not copy one pthread_t structure to another pthread_t
...  it may not be portable.  In some implementations the pthread_t is
not simply a structure containing only a pointer and some keys .... it
is in fact the REAL structure, so a copy would create two independent
structures which can each be manipulated individually, wreaking havoc.

Ian> An attributes object, like pthread_attr_t (or an unnamed semaphore sem_t), 
Ian> on the other hand does not behave this way.  It has value semantics, because 
Ian> you can't copy one into another and expect to have a second valid attribute 
Ian> object.

Ian> My question is, do pthread_mutex_t's and pthread_cond_t's behave as 
Ian> references or values?

    Same statement .... I have seen enough problems where someone copied
an initialized lock then continued to lock the two mutexes independently
creating very unwanted behavior.

William E. Hannon Jr.               
AIX/DCE Technical Lead                                         whannon@austin
Austin, Texas 78758     Department ATKS/9132     Phone:(512)838-3238 T/L(678)
'Confidence is what you had, before you understood the situation.' Dr. Dobson

FOLLOWUP: For most programs, you should be passing pointers around, not

pthread_mutex_t     my_lock;

{  ...

foo(pthread_mutex_t *m)
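
A minimal sketch of the pass-a-pointer style, assuming POSIX threads.
The struct counter type and its functions are invented for the example;
the point is only that every caller works on the one original mutex and
never on a copy of it.

```c
#include <pthread.h>

struct counter {
    pthread_mutex_t lock;    /* initialized once, never copied */
    long            value;
};

/* Returns 0 on success or the pthread_mutex_init() error number. */
int counter_init(struct counter *c)
{
    c->value = 0;
    return pthread_mutex_init(&c->lock, NULL);
}

/* Takes a pointer, so all threads lock the same mutex object. */
void counter_add(struct counter *c, long n)
{
    pthread_mutex_lock(&c->lock);
    c->value += n;
    pthread_mutex_unlock(&c->lock);
}

long counter_read(struct counter *c)
{
    long v;
    pthread_mutex_lock(&c->lock);
    v = c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}
```

Passing a struct counter by value would copy the mutex along with the
data, which is exactly the mistake described above.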
 Q16: After 1800 calls to thr_create() the system freezes. ??  

My problem is that the thread does not get freed or released back to the
system for reuse.  After 1800 calls to thr_create() the system freezes. ??
A: The default for threads in both UI and POSIX is for threads to be
   "undetached" -- meaning that they MUST be joined (thr_join()).  Otherwise
   they will not be garbage collected.  (This default is the wrong choice.  Oh
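
A minimal sketch of one way around this with POSIX threads: create the
threads detached, so their resources are reclaimed when they exit and
no join is needed. The names task and spawn_detached are invented for
the example. With UI threads the corresponding choice is the
THR_DETACHED flag to thr_create().

```c
#include <pthread.h>

static void *task(void *arg)
{
    (void)arg;
    /* ... short-lived work ... */
    return NULL;
}

/* Starts `count` detached threads; returns how many were created. */
int spawn_detached(int count)
{
    pthread_attr_t attr;
    pthread_t tid;
    int created = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    for (int i = 0; i < count; i++)
        if (pthread_create(&tid, &attr, task, NULL) == 0)
            created++;                /* nothing to join later */

    pthread_attr_destroy(&attr);
    return created;
}
```

If the exit status of a thread is needed, keep it joinable and call
pthread_join() (or thr_join()) on it exactly once.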
 Q17: Compiling libraries which might be used in threaded or unthreaded apps?  

   What *is* the straight scoop on how to compile libraries which 
   might be used in threaded or unthreaded apps?  Hopefully the 
   "errno" and "putc()" macros will continue to work even if
   libthread isn't pulled in, so that vendors can make a single
   version of any particular library.

A: Always compile *all* libraries with the reentrancy flag (_REENTRANT for
   UI threads, _POSIX_C_SOURCE=199506L for POSIX threads). Otherwise some 
   poor soul will try to use your library and get hammered.  putc() and
   getc() WILL be slower, but you may use putc_unlocked() & getc_unlocked()
   if you know the I/O stream will be used safely.

   All Solaris libraries are compiled like this.
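
One safe way to use putc_unlocked() is to take the stream lock once
around a whole run of characters. A minimal sketch, assuming the POSIX
flockfile()/funlockfile() calls are available; put_line is a name
invented for the example.

```c
#include <stdio.h>

/* Writes `s` plus a newline to `out` as one unit.
 * Returns the number of characters written, or EOF on error. */
int put_line(const char *s, FILE *out)
{
    int n = 0;

    flockfile(out);                   /* one lock for the whole line */
    for (; *s != '\0'; s++, n++) {
        if (putc_unlocked(*s, out) == EOF) {
            funlockfile(out);
            return EOF;
        }
    }
    if (putc_unlocked('\n', out) == EOF) {
        funlockfile(out);
        return EOF;
    }
    funlockfile(out);
    return n + 1;
}
```

This pays for the lock once per line instead of once per character, and
lines written by different threads cannot interleave.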
 Q18: What's the difference of signal handling for process and thread?   

   What's the difference between signal handling for a process and for a
   thread? Are signals divided into process-based and thread-based types that
   are treated differently in HP-RT? Are there any examples? I'd like to know
   how to initiate, mask, block, wait, catch, ...... the signals. How can I set
   the notification list (process or thread?) of SIGIO for both socket and tty
   using fcntl or ioctl?

A: You probably want to buy one of the books that discuss this in detail.
   Here's the short answer:

    Signal masks are per-thread, but signal handlers are per-process.
    Synchronous signals such as SIGSEGV and SIGILL are handled by the
    thread that caused the signal.

    Other signals are handled by any ready thread that does not have
    the signal blocked in its mask.
    There is no special thread library for signal handling.
 Q19: What about creating large numbers of threads?  

I've asked a question about creating 2500 unbound threads. During these
days, I have written some more testing programs. Hope you would help me to
solve some more problems.

1. I have written a program that creates 10 threads. Then the 10 threads
each create 10 more threads. The 100 newly created threads each creates 10
more threads. In a SPARC 2000, if the concurrency level is 100, the program
takes 7 seconds to terminate. From a paper, unbound thread creation is
claimed to take only 56 usec. How come my testing program is so slow on a
SPARC 2000 that has 20 CPUs? If I use a SPARC 10, the program only takes 1
second to terminate. Is SPARC 2000 slower than a SPARC 10?

2. Instead of creating 2500 threads, I have written a program that creates
200 threads and then kills them all and creates 200 threads and kills them
all and ..... After some while of creating and killing, the program hangs. I
use sigaction to set a global signal handler for the whole process. As the
program is so simple, I don't know where the problem is.

3. In addition, I have written a program that creates 1000 bound
threads. Each thread has a simple loop:

        while (1)
            randomly read an entry of an array

   This time, not only my program hangs, the whole SPARC 2000 hangs. I can't
reset the machine from console. Finally, I have to power down the machine.

Thanks in advance.

 Q20: What about using sigwaitinfo()?  

>Here is what I am doing.  I am using the early access POSIX threads.
>My main program blocks SIGUSR1 and creates a number of threads.
>One of these threads is dedicated to this signal.  All it does is a
>sigwaitinfo on this signal, sets a flag when it returns, and exits.
>If I send the SIGUSR1 signal to the process using the kill command
>from another window, it does not seem to get it and the other threads
>(which are doing a calculation in a loop) report that SIGUSR1 is not
>An earlier version of the program which used a signal handler to set
>the flag worked perfectly.
>Do you have any ideas on this?


I assume you are using sigwaitinfo(3r) from libposix4.
Unfortunately, sigwaitinfo() is not MT-safe, i.e. does not work correctly
in an MT program, on 2.3/2.4. Use sigwait(2) - it should work on 2.3/2.4.
On 2.5 beta, sigwaitinfo() works.

If you really need the siginfo on 2.3/2.4, it is going to be hard, and the
solution depends on whether you are running 2.3 or 2.4, but here is an
alternative suggestion:

Programmers have used signals between processes as an IPC mechanism. Sounds
like you are trying to do the same. If this is the case, I would strongly
suggest that you use shared memory (see mmap(2)) between processes and
shared memory synchronization (using the SysV shared semaphores - see
semop(2)), or POSIX synchronization objects with the PTHREAD_PROCESS_SHARED
attribute. For example, you can set-up a region of shared memory protected
by a mutex and condition variable. The mutex and condition variable would
also be allocated from the shared memory and would be initialized with the
PTHREAD_PROCESS_SHARED attribute. Now, processes which share this memory
can use the mutex and condition variable as IPC mechanisms - any information
that needs to be passed between them can be passed through the shared
memory (alternative to siginfo :-)). To make this asynchronous, you can
have a thread dedicated to monitoring the shared memory area by waiting
on the condition variable. Now, whenever the signalling process wants to
send a signal, it instead issues a cond_signal on the condition variable.
The thread sleeping on this in the other (receiving) process wakes up
now and processes the information.

In general, signal handlers and threads, even though the system might support
this correctly, should not be used together. Signal handlers could be
looked upon as "substitute threads" when threads were not around in UNIX, 
and now that they are, the interactions between them can be complicated. 
You should mix them together only if absolutely necessary.

 Q21: How can I have an MT process communicate with many UP processes?  

>I have a multithreaded process, each thread in the multithreaded
>process wants to communicate with another single-threaded process,
>what is the good way to do that?
>Assume each thread in the multithreaded process is identical, i.e.
>they are generated using the same function call and each thread
>creates a shared memory to do the communication, will the generated
>shared memories operate independently if no synchronization provided?  


  It sounds like you have the right idea.  For each thread/process pair,
build a shared memory segment and use that for communications.  You'll need
some sort of synchronization variable in that shared segment for

  There is no interaction between segments what-so-ever.
 Q22: Writing Multithreaded code with Sybase CTlib ver 10.x?  

>A customer is trying to write a multi-threaded application that also
>uses Sybase CTlib ver 10.x, and he is facing some limitations due to
>the Sybase library. 
>BOTTOM LINE: CTlib is reentrant, but according to Sybase is not usable
>in a multi-threaded context. That means it does NOT seem to be usable
>in an MT application.
>The purpose of this mail is NOT to get a fix for CTlib, but to try to
>find a workaround, if one exists...


The workaround for the moment is to use the XA library routines from
Sybase, which are, in turn, based upon the TransArc package pthread*

We should be getting an alpha version of MT safe/hot CTlib towards the first
part of June 1995.  Also of potential interest is there will also be an early
version of native-threaded OpenServer soon as well, which really opens
up a lot of possibilities.

Chris Nicholas
SunSoft Developer Engineering
 Q23: Can we avoid preemption during spin locks?  

>    A while ago I asked you for information on preemption control
> interfaces (in-kernel) which might be available in Solaris2.x. I am
> looking for ways of lowering number of context switches taken as the
> result of adaptive mutex contention. We have a number of places a
> lock is taken and held for a few scant lines of C. It would be great
> to prevent preemption during these sections of code.


  You're obviously writing a driver of some sort. (Video driver I'd guess?)
And you're VERY concerned with performance on *MP* machines (UPs be damned).
You have tested your code on a standardized, repeatable benchmark, and you
are running into a problem.  You have solid numbers which you are absolutely
certain of.  Right?

  You'll have to excuse my playing the heavy here, but you're talking deep
do-do here, and I don't want to touch it unless I'm totally convinced I (and
you) have to.

  You could set the SPL up to turn off all interrupts.  It would slow your
code down quite a bit though.  The probability of preemption occurring over "a
few scant lines of C" (i.e., a few dozen instructions) approaches zero.
Regularly suffering from preemption during just these few instructions would
be a VERY odd thing.  I am hard pressed to INVENT a situation like this.
Are you absolutely, totally, completely, 100% certain you're seeing this?
Are you willing to put $10 on it?

 Q24: What about using spin locks instead of adaptive spin locks?  
>    I also would like to know more about something I saw in
> /usr/include/sys/mutex.h. It would appear that it is possible to
> create pure spinning locks (MUTEX_SPIN) as opposed to the default
> adaptive mutexes (MUTEX_ADAPTIVE_STAT). These might provide the kind
> of control I am looking for assuming that these are really supported 
> and not some bastard orphan left over.


  If I understand the question, the answer is "no".  That's what an adaptive
mutex is for.  It optimizes a spin lock to sleep if there's no value in
spinning.  If you use a dumb spin lock instead, you are GUARANTEED to run
 Q25: Will thr_create(...,THR_NEW_LWP) fail if the new LWP cannot be added?  

>    Does Sun's implementation of thr_create(...,THR_NEW_LWP) fail
>to create the multiplexed thread if the new LWP cannot be added to the
>multiplexing pool?  The unixware docs indicate Novell's implementation
>of thr_create() uses THR_NEW_LWP as a hint to the implementation to
>increase the pool size.  They also do not state the behavior if the
>new lwp cannot be created.  What is the official statement?


  It should not create a new thread if it returns EAGAIN.  Mind you, you're
fairly unlikely EVER to see this happen in a real program.  (You'll see it
in bugs & in testing/design.)
 Q26: Is the LWP released upon bound thread termination?  

>  In the sun implementation, if you create a bound
>thread, and the thread eventually terminates, is the LWP released
>upon termination, or upon thr_join with the terminated thread?


  Yes, a bound thread's LWP is released.  This should not affect your
programming at all.  Use thr_setconcurrency() & leave it at that.
 Q27: What's the difference between pthread FIFO the solaris threads scheduling?  

A:  Very little.

 Q28: I really think I need time-sliced RR.  

>Well, I really think I need time-sliced RR, since I'm making a
>multithreaded implementation of a functional concurrent process-
>oriented  language. MT support is needed to get good usage
>of multi-CPU machines and better realtime. Today processes are custom
>user-level and the runtime system delivers the scheduling. And the
>language semantics are that processes are timesliced RR.
>Changing the semantics is not realistic. I really hope the pthreads
>will spec RR timeslicing, it would make things easier.


  Think VERY carefully.  When will you ever *REQUIRE* RR scheduling?  And
why?  Remember, you've never had it ever before, so why now?  (There may be
a reason, but it had better be good.)  Scheduling should normally be
invisible, and forcing it up to user awareness is generally a bad thing.

>For the moment, since this will only be a prototype, bound threads
>will do, but not in a real system with a couple of hundreds of
>Convince me I don't need RR timeslicing, that would make things easier.
>Or how do I make my own scheduler in solaris, or should I stay with
>bound threads?

  OK.  (let me turn it around) Give one example of your program which will
fail should thr 3 run before thr 2 where there is absolutely NO
synchronization involved.  With arbitrary time-slicing of course.  I can't
think of an example myself.  (It's just such a weird dependency that I
can't come up with it.  But I don't know everything...)
 Q29: How important is it to call mutex_destroy() and cond_destroy()?  

Here is how I init several of my threading variables:

    mutex_init( &lock, USYNC_PROCESS, 0 );
    cond_init( &notBusy, USYNC_PROCESS, 0 );

The storage for the variables is in a memory-mapped file. Once I have
opened the file, I call unlink to make sure it will be automatically
cleaned up. How important is it to call mutex_destroy() and
cond_destroy()? Will I wind up leaking some space in the kernel if I
do not call these functions?

 Q30: EAGAIN/ENOMEM etc. apparently aren't in ?!  

  'Course not.  :-)

  They're in errno.h.  pthread_create() will return them if something goes
wrong.  Be careful: errno is NOT set by the threads calls; the error code
is the return value.
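
A minimal sketch of checking a pthreads return value directly.
pthread_create() is hard to make fail on demand, so the example uses
pthread_mutex_trylock() on a mutex that is already held, which reports
EBUSY the same way; try_held_mutex is a name invented for the example,
and a default (non-recursive) mutex is assumed.

```c
#include <errno.h>      /* EBUSY, EAGAIN, ENOMEM, ... live here */
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t held = PTHREAD_MUTEX_INITIALIZER;

/* Returns the error number that trylock reports for a held mutex. */
int try_held_mutex(void)
{
    int rc;

    pthread_mutex_lock(&held);
    rc = pthread_mutex_trylock(&held);   /* already locked, so this fails */
    if (rc != 0)
        /* Report rc itself; errno is not where the answer is. */
        fprintf(stderr, "trylock: %s\n", strerror(rc));
    pthread_mutex_unlock(&held);
    return rc;
}
```

The same pattern applies to pthread_create(): test its return value
against EAGAIN rather than calling perror().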
 Q31: What can I do about TSD being so slow?  
 Q32: What happened to the pragma 'unshared' in Sun C?  

   I read about a pragma 'unshared' for the C-compiler in some Solaris-thread
   papers. The new C-3.01 doesn't support the feature anymore I think. There is
   no hint in the Solaris 2.4 Multithread Programming Guide. But the new
   TSD is very slow. I tested a program with direct register allocation under
   gcc (asm "%g3") instead of calling the thr_getspecific procedure and it was 
   over three times faster. Can I do the same thing or something else with the 
   Sun C-compiler to make the C-3.01 Code also faster?


The "thread local storage" feature that was mentioned in early papers
about MT on Solaris, and the pragma "unshared", were never
implemented.  I know what you mean about the performance of TSD.  It
isn't very fast.  I think the key here is to try to structure your
program so that you don't rely too much on thread specific data, if
that's possible.
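
One way to rely less on TSD lookups is to fetch the pointer once per
call and work through a local variable. A minimal sketch using the
POSIX calls (pthread_getspecific() in place of thr_getspecific());
struct per_thread, get_state and count_to are invented names.

```c
#include <pthread.h>
#include <stdlib.h>

struct per_thread {
    long counter;
};

static pthread_key_t  key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    pthread_key_create(&key, free);   /* free() runs at thread exit */
}

/* Returns this thread's state, allocating it on first use. */
static struct per_thread *get_state(void)
{
    struct per_thread *st;

    pthread_once(&key_once, make_key);
    st = pthread_getspecific(key);
    if (st == NULL) {
        st = calloc(1, sizeof *st);
        if (st != NULL && pthread_setspecific(key, st) != 0) {
            free(st);
            st = NULL;
        }
    }
    return st;
}

/* Counts to n in thread-specific storage; returns the count or -1. */
long count_to(long n)
{
    struct per_thread *st = get_state();   /* one TSD lookup ... */

    if (st == NULL)
        return -1;
    st->counter = 0;
    for (long i = 0; i < n; i++)
        st->counter++;                     /* ... instead of n lookups */
    return st->counter;
}
```

Passing the pointer down to helper functions as an argument avoids
further lookups in the same way.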

The SPARC specification reserves those %g registers for internal use.
In general, it's dangerous to rely on using them in your code.
However, SC3.0.1 does not use the %g registers in any user code. It
does use them internally, but never across function calls, and never
in user code.  (If you do use the %g registers across function calls,
be sure to save and restore the registers.)

You can accomplish what gcc does with the "asm" statement by writing
what we call an "inline function template."  Take a look at the math
library inline templates for an idea on how to do that, and see the
inline() man page.  You might also want to take a look at the
AnswerBook for SPARC Assembly Language Programming, which is found in
the "Solaris 2.x Software Developer Answerbook".  The latest part
number for that is 


The libm templates are found in /opt/SUNWspro/SC3.0.1/lib/
Inline templates are somewhat more work to write, as compared to using
gcc's "asm" feature, but, it's safer.  I don't know about the
robustness of code that uses "asm" - I like gcc, and I use it, but
that particular feature can lead to interesting bugs.

Our next compiler, SC4.0 (coming out in late 1995) will use the %g
registers more aggressively, for performance reasons.  (Having more
registers available to the optimizer lets them do more optimizations.)
There will be a documented switch, -noregs=global (or something like
that) that you will use to tell the SC4.0 NOT to use the global
registers.    When you switch to SC4.0, be sure to read the cc(1) man
page and look for that switch.  
 Q33: Can I profile an MT-program with the debugger?  

   Can I profile an MT-program with the debugger and a special MT-license
   or do I need the thread-analyser?


The only profiling you can do right now for an MT program is what you
get with the ThreadAnalyzer.  If you have the MT Debugger and SC3.0.1,
then, you should also have a copy of the ThreadAnalyzer (it was first
shipped on the same CD that had SC3.0.1).  Look for the binary "tha"
under /opt/SUNWspro/bin.  

The "Collector" feature that you can use from inside the Debugger
doesn't work with MT programs.  Future MT-aware-profiling tools will
be integrated with the Debugger - is that where you'd like to use
 Q34: Sometimes the specified sleep time is SMALLER than what I want.  

>I have a program that generates UDP datagrams at regular intervals.
>It uses real time scheduling for improved accuracy.
>(The code I used is from the Solaris realtime manual.)
>This helps, but once in a while I do not get the delay I wanted.
>The specified sleep time is SMALLER (i.e. faster) than what I want.
>I use the following procedure for microsecond delays
>void delay(int us) /* delay in microseconds */
>{
>    struct timeval tv;
>    tv.tv_sec = us / 1000000;
>    tv.tv_usec = us % 1000000;
>    (void)select( 0, (fd_set *)NULL, (fd_set *)NULL, (fd_set *)NULL, &tv );
>}
>As I said, when I select a delay, occasionally I get a much smaller delay.
>    Wanted: 19,776 microseconds, got: 10,379 microseconds
>    Wanted:    910 microseconds, got:    183 microseconds
>As you can see, the error is significant when it happens.
>It does not happen often. (0.5% of the time)
>I could use the usleep() function, but that's in the UCB library.
>Anyone have any advice?


First of all, you cannot sleep in any increments other than
10 milliseconds (or, more precisely, 1/HZ seconds).

Second, there is a bug in the scheduler (fixed in 2.5) that may
mess up your scheduling about once in every 300,000 schedules
or so.

Third, a much better timing interface will be available in
Solaris 2.6 (or maybe earlier) through POSIX interfaces. That
should give you microsecond resolution with less than
50 microseconds latency.

 Q35: Any debugger that single step a thread while the others are running?  

|>  Has anyone looked into the possibility of doing a MT debugger
|> that will allow you to single step a thread while the others
|> are running? This will probably require a debugger that attaches
|> a debugger thread to each thread...


This was the topic of my master's thesis. You might check:

and follow the link to the abstract or the full version.


We have used breakpoint debugging to debug threads programs. We have
implemented a debugger that enables the user to write scripts to debug
programs (not limited to threads programs). This is made possible by a Tcl
interface atop gdb and hooks in gdb that export some basic debugger
internals to the user domain, allowing the user to essentially write
his own application-specific debugger.

Please see the following web page for more info on the debugger

Sudhir Halbhavi
 Q36: Any DOS threads libraries?  

> Is there any way or does anyone have a library that will allow to program
> multithreads.. I need it for SVGA mouse functions.. I use both C++ and
> Watcom C++, 


I use DesqView for my DOS based multi-thread programs.  (Only they don't call
them threads, they call them tasks....)  I like the DesqView interface to 
threads better than the POSIX/Solaris interface, but putting up with DOS was
almost more than I could stand.
 Q37: Any Pthreads for Linux?  


Linux has kernel-level threads now and has had a thread-safe libc for a
while.  With LinuxThreads, you don't have to worry about things like your
errno, or blocking system calls. The few standard libc functions that are
inherently not thread safe (due to using static data areas) have been
augmented with thread-safe alternatives.

LinuxThreads are not (fully) POSIX, however. 

I'm quite familiar with Xavier's package. He's done an awesome job given
what he had to work with. Unfortunately, the holes are large, and his
valiant attempts to plug them result in a larger and more complicated
user-mode library than should be necessary, without being able to
completely resolve the problems.

Linux uses clone() which is not "kernel-level threads", though, with
some proposed (and possibly pending) extensions in a future version of
the kernel, it could become that. Right now, it's just a way to create
independent processes that share some resources. The most critical
missing component is the ability to create multiple executable entities
(threads) that share a single PID, thereby making those entities threads
rather than processes.

Linuxthreads, despite using the "pthread_" prefix, is NOT "POSIX
threads" (aka "pthreads") because of the aforementioned substantial and
severe shortcoming of the current implementation based on clone().
Without kernel extensions, a clone()-based thread package on Linux
cannot come close to conforming to the POSIX specification. The common
characterization of Linuxthreads as "POSIX threads" is incorrect and
misleading. This most definitely is not "a true pthreads
implementation", merely a nonstandard thread package that uses the
"pthread" prefix.

Note, I'm not saying that's necessarily bad. It supports much of the
interface, and unlike user-mode implementations (which also tend to be
far more buggy than Linuxthreads), allows the use of multiple
processors.  Linuxthreads is quite useful despite its substantial
deficiencies, and many reasonable programs can be created and ported
using it. But it's still not POSIX.

 Q38: Any really basic C code example(s) and get us newbies started?  

>Could one of you threads gods please post some really, really basic C code
>example(s) and get us newbies started?  There just doesn't seem to be any other
>way for us to learn how to program using threads.


The following is a compilation of all the generous help that was posted or mailed to me 
concerning the use of threads in introductory programs.  I apologize for it not being
edited very well...  (Now I just need time to go through all of these)

Here's all of the URL's:

 Q39: Please put some Ada references in the FAQ.  


Most Ada books will introduce threading concepts.  Also, check out Windows
Tech Journal, Nov. 95 for more info on this subject.
 Q40: Which signals are synchronous, and which are asynchronous?  

>I have another question. Since we must clearly distinguish the
>synchronous signals from the asynchronous ones for MT, is there any
>documentation on which is which? I could not find any.


In general, independent of MT, this is an often mis-understood area of
signals.  The adjective: "synchronous"/"asynchronous" cannot be applied to a
signal.  This is because any signal (including normally synchronously
generated signals such as SIGSEGV) could be asynchronously generated using
kill(2), _lwp_kill(2) or thr_kill(3t).

e.g. SIGSEGV, which is normally synchronously generated, can also be sent
via kill(pid, SIGSEGV), in which case it is asynchronously generated. So
labelling SIGSEGV as synchronous, and writing a program that assumes this,
would be incorrect.
For MT, a question is: would a thread that caused the generation of a signal
get this signal?

If this is posed for a trap (SIGSEGV, SIGBUS, SIGILL, etc.), the answer is:
yes - the thread that caused the trap would get the signal.  But the handler
for the trap signal, i.e. a SIGSEGV handler, for example, cannot assume that
the handler was invoked for a synchronously generated SIGSEGV (unless the
application knows that it could not have received a SIGSEGV via a kill(),
or thr_kill()).

If this question is posed for any other signal (such as SIGPIPE, or the
real-time signals) the answer should not really matter since the program
should not depend on whether or not the thread that caused the signal to be
generated, receives it. For traps, it does matter, but for any other signal,
it should not matter.

FYI: On 2.4 and earlier releases, SIGPIPE, and some other signals were sent
to the thread that resulted in the generation of the signal, but on 2.5, any
thread may get the signal. The only signals that are guaranteed to be sent
to the thread that resulted in its generation, are the traps (SIGILL,
SIGTRAP, SIGSEGV, SIGBUS, SIGFPE, etc.). This change should not matter since
a correctly written MT application would not depend on the synchronicity of
the signal generation for non-traps, given the above description of signal
synchronicity that has always been true.

 Q41: If we compile -D_REENTRANT, but without -lthread, will we have problems?  

>Hi -
>I had posed a question here a few weeks ago and received a response. Since
>then the customer had some follow-on questions. Can anyone address this
>customer's questions:
>(note: '>' refers to previous answer we provided customer)
>> If only mutexes are needed to make the library mt-safe, the library writer 
>> can do the following to enable a single mt-safe library to be used by both 
>> MT and single-threaded programs:
>Actually, we are only using the *_r(3c) functions, such as strtok_r(3c),
>getlogin_r(3c), and ctime_r(3c).  We are not actually calling thr_*,
>mutex_*, cond_*, etc. in the libraries.
>We want to use these *_r(3c) library functions instead of the normal
>non-MT safe versions (such as strtok(), ctime(), etc.), but if we compile
>the object files with -D_REENTRANT, but do not link with -lthread, will
>we have problems?


No - you will not have any problems, if you do not link with -lthread.

But if your library is linked into a program which uses -lthread, then:

You might have problems in a threaded program because of how you allocate 
and use the buffers that are passed in to the *_r routines.

The usage of the *_r routines has to be thread-safe, or re-entrant in
the library. The *_r routines take a buffer as an argument. If the library
uses a global buffer to be passed to these routines, and does not protect
this buffer appropriately, the library would be unsafe in a threaded program.

Note that here, the customer's library has to do one of the following to ensure
that their usage of these buffers is re-entrant:

- if possible, allocate the buffers off the stack - this would be per-thread
  storage and would not require the library to do different things depending
  on whether the library is linked into a threaded program or not.

- if the above is not possible:

On any Solaris release, the following may be done: (recommended solution):

    - use mutexes, assuming that threads are present, to protect the 
      buffers. If the threads library is not linked in, there are dummy 
      entry points in libc for mutexes which do nothing - and so this 
      will compile correctly and still work. If the threads library is 
      linked in, the mutexes will be real and the buffers will be 
      appropriately protected.

On Solaris 2.5 only:

    - if you do not want to use mutexes for some reason and want to use
      thread-specific data (TSD) if threads are present (say), then on 2.4
      you cannot do anything. On 2.5, though, one of the following may be
      done:

    (a) on 2.5, you can use thr_main() to detect if threads are linked in 
      or not. If they are, carry out appropriate TSD allocation of buffers.

    (b) If you are sure only POSIX threads will be used (if at all), and you
      do not like the non-portability of thr_main() which is not a POSIX
      interface, then, on 2.5, you can use the following (hack) to detect if
      pthreads are linked in or not: you need the #pragma weak declaration 
      so that you can check if a pthreads symbol is present or not. If 
      it is, then pthreads are linked in, otherwise they are not. Following
      is a code snippet which demonstrates this. You can compile it with
      both -lpthread and without. If compiled without -lpthread it prints
      out the first print statement. If compiled with -lpthread, it prints
      out the second print statement. I am not sure if this usage of
      #pragma weak is any more portable than using thr_main().


        #include <stdio.h>
        #include <pthread.h>

        #pragma weak pthread_create

        int main(void)
        {
            if (pthread_create == 0) {
                printf("libpthread not linked\n");
            } else {
                printf("libpthread is present\n");
                /* In this case, use Thread Specific Data
                 * or mutexes to protect access to the global
                 * buffers passed to the *_r routines. */
            }
            return 0;
        }


 Q42: Can Borland C++ for OS/2 give up a TimeSlice?  

Johan>    Does anyone know if Borland C++ for OS/2 has a function that could be 
Johan>    used within a THREAD to give up a TimeSlice.


    If all you want to do is give up your timeslice
however if you are the highest priority thread, you will be immediately dispatched
again, before other threads.  Even when all the threads are the same priority,
my understanding is that the OS/2 operating system has a degradation algorithm
for the threads in a process ... so even if you DosSleep with the "same" priority
your thread still could be dispatched immediately --- depending on the
degradation algorithm.

    If you want to sleep to next clock tick
works, because the system rounds the 1 up to the next clock tick value.
This should allow other threads in your process to be dispatched.

    Both are valid semantics, depending on what you would prefer.
William E. Hannon Jr.               
DCE Threads Development                                        whannon@austin
 Q43: Are there any VALID uses of suspension?  

    UI threads, OS/2 and NT all allow you to suspend a thread.  I have yet to
  see a program which does not go below the API (ie debuggers, GCs, etc.), but
  still uses suspension.  I don't BELIEVE there is a valid use.  I could be


I'll bite.  Whether we "go below the API" or not is for you to decide.
Our product, ObjectStore, is a client-server object-oriented database
system.  For the purpose of this discussion, it functions like a
user-mode virtual memory system: We take a chunk of address space
and use it as a window onto a database; if the user touches an address
within our special range, we catch the page fault, figure out which
database page "belongs" there, and read that page from the server.  After
putting the page into place, we continue the faulting instruction, which
now succeeds, and the user's code need never know that it wasn't there
all the time.

This is all fine for a single-threaded application.  There's a potential
problem for MT applications, however; consider reading a page from a
read-only database.  Thread A comes along and reads a non-existent page.
It faults, the fault enters our handler, and we do the following:
    get data from server
    make page read-write    ;open window
    copy data to page
    make page read-only ;close window
During the window between the two page operations, another thread can
come along and read invalid data from the page, or in fact write the
page, with potentially disastrous effect.

On Windows and OS/2, we do the following:
    get data from server
    suspend all other threads
    make page read-write
    copy data to page
    make page read-only
    resume all other threads
to prevent the "window" from opening.  On OS/2, we use DosEnterCritSec,
which suspends all other threads.  On NT, we use the DllMain routine
to keep track of all the threads in the app, and we call SuspendThread
on each.  We're very careful to keep the interval during which threads
are suspended as brief as possible, and on OS/2 we're careful not to call
the C runtime while holding the critical section.

On most Unix systems, we don't have to do this, because mmap() has the
flexibility to map a single physical page into two or more separate
places in the address space.  This enables us to do this:
    get data from server
    make read-write alias of page, hidden from user
    copy data to alias page
    make read-only page visible to user
The last operation here is atomic, so there's no opportunity for other
threads to see bogus data.  There's no equivalent atomic operation on
NT or OS/2, at least not one that will operate at page granularity.

Since you do not like Suspend/Resume being available to user-level APIs,
I thought the following set of functions (available to programs)
in WinNT (Win32) might catch your interest :) :

CreateRemoteThread -- allows you to start a thread in another process's
address space.. The other process may not even know you've done it
(depending on circumstances).  Supposedly, with full security turned
on (off by default!) this won't violate C2 security.

SetThreadContext/GetThreadContext - Just like it sounds.  You can
manipulate a thread's context (CPU registers, selectors, etc!).

Also, you can forcibly map a library (2-3 different ways: CreateRemoteThread
can allow this as well) to another process's address space (that is, you
can map a DLL of yours to a running process).  Then, you can do
things like spawn off threads, after you have invisibly mapped your DLL
into the space.  Yes, there is potential for abuse (and for interesting

But, Microsoft has a use for these things.  They can help you subclass
a window on the desktop, for instance.  If you wanted to make, say,
Netscape's main window beep twice every time it repaints, you could
map a DLL into Netscape's address space, subclass the main window
(subclass == "Send ME the window's messages, instead of sending them to
the window -- I'll take care of everything!"), and watch for PAINTs
to come through.

Anyway, don't mean to waste your time.  Just thought you might find
it interesting that a user can start additional threads in someone else's
process, change thread context forcibly (to a decent degree), and
even latch onto a running process in order to change its behavior, or
just latch on period to run a thread you wrote in another process's
address space.

 Q44: What's the status of pthreads on SGI machines?  
>> We are considering porting of large application from Concurrent Computer
>> symmetrical multiprocessor running RTU-6 to one of the Silicon Graphics
>> multiprocessors running IRIX (5.3?).
>> Our application uses threads heavily. Both so-called user threads and 
>> kernel threads are required with a fair level of synchronization 
>> primitives support and such.
>> My question is: what kind of multi-threaded application programming 
>> support is available in IRIX? 
>> Reading some of the SGI technical papers available on their WWW page 
>> just confuses me. I know that Posix threads or Solaris-type 
>> LWP/threads supports would be OK. 


POSIX thread support in IRIX is more than a rumor - pthreads are currently 
scheduled to be available shortly after release of IRIX 6.2 (IRIX 6.2 is 
currently scheduled for release in Feb 96).  If you are interested in 
obtaining pthreads under IRIX as soon as possible, I would recommend 
contacting your local SGI office.
Bruce Johnson, SGI ASD                 
Real-time Applications Engineering          
 Q45: Does the Gnu debugger support threads?  


An engineer at Cygnus is implementing thread support in gdb for Solaris.
No date for completion is given.
 Q46: What is gang scheduling?  


Gang Scheduling is described a variety of ways. Generally the
consistent thread is that a GS gives a process all the processors at
the same time (or none for a time slice). This is most helpful for
"scientific apps" because the most common set up is something like

    do i=1, bignum
       more stuff
       lots more stuff
    end do

the obvious decomposition is bignum/nproc statically allocated. Stuff
and friends take very close to the same time per chunk, so if you get
lucky it all happens in one chime (viz. one big clock). Else it takes
precisely N chimes with no leftovers. When unlucky, it's N chimes +
cleanup for stragglers.

Virtually all supercomputers do this, they may not even bother to give
it a special name. SGI makes this explicit (and supported).

On SPARC/Solaris there is no way for the compiler to know if we'll get
the processors requested or when. So you can suffer multiple chime
losses quite easily.

One can reallocate processor/code on the fly, but with increased overhead.
 Q47: LinuxThreads linked with X11, calls to X11 seg fault. 

You can't rely on libraries that are not at the very least compiled
with -D_REENTRANT to do anything reasonable with threads.  A vanilla
X11 build (without -D_REENTRANT and without XTHREADS enabled) 
will likely behave badly with threads.  

It's not terribly hard to build X with thread support these days,
especially if you're using libc-6 with builtin LinuxThreads.  Contact
your Linux distribution maintainer and insist on it.  Debian has just 
switched to a thread-enabled X11 for their libc6 release; has any other

Bill Gribble
 Q48: Are there Pthreads on Win32?  

Several answers here.  #1 is probably the best (most recent!).

A: Yes, there is a GNU pthreads library for Win32.  It is still under
   active development, but you can find out more by looking at

   (This is a combination of Ben Elliston & John Bossom's work. & others?)


Well, Dave Butenhof will probably kill me for saying this, but Digital has a
pthreads implementation for WIN32. I bug them occasionally about packaging
up the header and dll and selling it separately (for a reasonable price, of
course). I think it's a great idea. My company has products on NT and UNIX,
so it would solve some painful portability issues for us.  This
implementation uses the same "threads engine" that Digital uses, rather
than just some wrappers on NT system services.

So, maybe if a few potential customers join me in asking Digital for this,
we'll get somewhere.  What say, Dave?


I have such a beast...sort of.

I have a pthreads draft 4 wrapper that is (nearly) complete and has been
in use for a while (so it seems to work!).

About 6 weeks back I changed this code to provide a draft 10 interface. This
code has however not yet been fully tested nor folded into my projects.
Casting my mind back (a lot has happened in 6 weeks!) I seem to remember
one or two small issues where I wasn't sure of the semantics; I was working
from a document I picked up at my last job which showed how to migrate
from pthreads 4 to pthreads 10, rather than a copy of the standard.

If anyone wants this code, I can make it available.


> > As far as I know, there is no pthreads implementation for NT.  However,
> > ACE provides a C++ threads wrapper which works on pthreads, and on NT
> > (and some others).
> Well, Dave Butenhof will probably kill me for saying this, but Digital has a
> pthreads implementation for WIN32. I bug them occasionally about packaging up
> the header and dll and selling it separately (for a reasonable price, of
> course). I think it's a great idea. My company has products on NT and UNIX,
> so it would solve some painful portability issues for us. This implementation
> uses the same "threads engine" that Digital uses, rather than just some
> wrappers on NT system services.

Yes, DECthreads has been ported to Win32 for quite a while. It runs on Windows
NT 3.51 and 4.0, on Alpha and Intel; and also on Windows 95 (though this was
not quite as trivial as Microsoft might wish us to believe.)

The main questions are:

  1. What's the market?
  2. How do we distribute the code, and at what cost? (Not so much "cost to the
     customer", as "cost to Digital".)

The big issue is probably that, especially with existing free code such as ACE,
it seems unlikely that there'd be much interest unless it was free or "dirt
cheap". Yet, even if we disclaim support, there will still be costs associated,
which means it'd be really tricky to avoid losing money.

> So, maybe if a few potential customers join me in asking Digital for this,
> we'll get somewhere.  What say, Dave?

We'd love to hear who wants this and why. Although I haven't felt comfortable
actually advertising the possibility here, I have forwarded the requests I've
seen here, and received via mail (including Jeff's) to my manager, who is the
first (probably of several) who needs to make any decisions.

I'd be glad to forward additional requests. Indications of what sort of product
(e.g., in particular, things like "sold for profit" or "internal utility"
distinctions), and, of course, whether (and how much) you'd be willing to pay,
would be valuable information.

/---------------------------[ Dave Butenhof ]--------------------------\

From: Ben Elliston 

Matthias Block  writes:

> is there someone who knows anything about a Pthread like library for
> Windows NT. It would simplify the work here for me.

I am involved with a free software project to implement POSIX threads
on top of Win32.   For the most part, it is complete, but it's still
well and truly in alpha testing right now.

I expect to be posting an announcement in a few weeks (say, 4?) to
comp.programming.threads.  The source code will be made available via
anonymous CVS for those who want to keep up to date or submit
patches.  I'm looking forward to getting some net testing!

Over the last several months I have seen some requests for
a Win32 implementation of PThreads.  I, too, had been looking
for such an implementation but to no avail.

Back in March, I decided to write my own. It is based upon the
PThreads 1003.1c standard, however, I didn't implement everything.
Missing is signal handling and real-time priority functions.

I based the implementation on the description provided by

    Programming with POSIX Threads, by
    Dave R. Butenhof

I've created a zipped file consisting of some header files, an implib, 
a DLL and a simple test program.

I'm providing this implementation for free and as-is. You may download it



John E. Bossom                                     Cognos Inc.
Voice: (613) 738-1338 x3386        O_o             P.O. Box 9707, Stn. T
  FAX: (613) 738-0002             =(  )= Ack!      OTTAWA, ON  K1G 4K9
 INET:  jebossom@cognos.COM          U             CANADA
 Q49: What about garbage collection?  

Please, please, please mention garbage collection when you come around
to talking about making code multithreaded.  A whole lot of
heap-allocated data needs to be explicitly reference counted *even
more* in a multithreaded program than in a single threaded program
(since it is so much harder to determine whether data is live or not),
and this leads to lots of bugs and worries and nonsense.

With garbage collection, on the other hand, you get to throw away
*all* of your worries over memory management.  This is a tremendous
win when your brain is already approaching meltdown due to the strain
of debugging subtle race conditions.

In addition, garbage collection can help to make the prepackaged
libraries you link against safer to play with (although it obviously
won't help to make them thread safe).  Xt, for example, is very badly
written and leaks like a sieve, but a conservative garbage collector
will safely kill off those memory leaks.  If you're linking against
legacy libraries and you need to write a long-running multithreaded
server, GC can make the difference between buying more RAM and killing
your server every few days so that it doesn't thrash, and simply
plugging in the threads-aware GC and sailing fairly happily along.

Bryan O'Sullivan 

[Please see: Geodesic Systems     -Bil]

 Q50: Does anyone have any information on thread programming for VMS?  

No ftp or web stuff, although we do have an HTML version of the Guide to
DECthreads and we'll probably try to get it outside the firewall where
it'll do y'all some good, one of these days. I've been very impressed
with Sun's "thread web site", and I'd like to get Digital moving in that
direction to help with the global work of evangelizing threads... but
not until I've finished coding, and writing, and consulting, and all
sorts of other things that seem to take 500% of my time. For general
info, and some information (though not enough) on using POSIX threads,
check Sun's library. (They need to start tapering off the UI threads.)

If you've got VMS (anything since 5.5-2), you'll have a hardcopy of the
Guide in your docset, and on the doc cdrom in Bookreader format. OpenVMS
version 7.0 has POSIX 1003.1c-1995 threads -- anything earlier has only
the old CMA and 1003.4a/D4 "DCE threads". Furthermore, OpenVMS Alpha 7.0
supports SMP threads (kernel support for dynamic "many to few"
scheduling), although "binary compatibility paranoia" has set in and it
may end up being nearly impossible to use. OpenVMS VAX 7.0 does not have
SMP or kernel integration -- integration will probably happen "soon",
but VAX will probably never have SMP threads.

Dave Butenhof                              Digital Equipment Corporation                       110 Spit Brook Rd, ZKO2-3/Q18
Phone: 603.881.2218, FAX: 603.881.0120     Nashua, NH 03062-2711
                 "Better Living Through Concurrency"
 Q51: Any information on the DCE threads library?
 Q52: Can I implement pthread_cleanup_push without a macro?  

I was about to use pthread_cleanup_push, when I noticed that it is
implemented as a macro (on Solaris 2.5) which forces you to have the
pthread_cleanup_pop in the same function by having an open brace { at the
end of the first macro and closing it in the second...  Since I want to
hide most of this stuff in something like a monitor (or a guard in ACE) in
C++ by using the push in a constructor and the pop in the destructor, I'm
wondering if there is something fundamental that would prevent me from doing
so, or could I just re-implement the stuff done by the macros inside some class
POSIX 1003.1c-1995 specifies that pthread_cleanup_push and pthread_cleanup_pop
must be used at the same lexical scope, "as if" the former were a macro that
expands to include an opening brace ("{") and the latter were a macro that
expands to include the matching closing brace ("}").

The Solaris 2.5 definition therefore conforms quite accurately to the intent
of the standard. And so does the Digital UNIX definition, for that matter. If
you can get away with "reverse engineering" the contents of the macros, swell;
but beware that this would NOT be a service to those using your C++ package,
as the results will be extremely non-portable. In fact, no guarantees that it
would work on later versions of Solaris, even assuming strict binary
compatibility in their implementation -- because they could reasonably make
"compatible" changes that would take advantage of various assumptions
regarding how those macros are used that you would be violating.

What you want to do has merit, but you have to remember that you're writing in
C++, not C. The pthread_cleanup_push and pthread_cleanup_pop macros are the C
language binding to the POSIX 1003.1c cancellation cleanup capability. In C++,
the correct implementation of this capability is already built into the
language... destructors. That is, C++ and threads should be working together
to ensure that C++ destructors are run when a thread is cancelled. If that is
done, you've got no problem. If it's not done, you've got far worse problems
anyway since you won't be "destructing" most of your objects anyway.

/---[ Dave Butenhof ]-----------------------[ ]---\
| Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
| 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
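
For the plain C case, the lexical-scope rule described above looks roughly
like the sketch below. The names release_buffer, worker, buffer_released and
cancel_worker_and_check are made up for illustration; the only point is that
the push and the pop sit in the same function at the same brace level.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical flag so the caller can observe that the handler ran. */
static int buffer_released = 0;

/* Cleanup handler: called if the thread is cancelled while the handler is
 * pushed, or when it is popped with a nonzero argument. */
static void release_buffer(void *arg)
{
    free(arg);
    buffer_released = 1;
}

static void *worker(void *unused)
{
    char *buf = malloc(128);
    (void)unused;
    if (buf == NULL)
        return NULL;

    /* Push and pop sit in the same function at the same brace level. */
    pthread_cleanup_push(release_buffer, buf);
    for (;;)
        sleep(1);               /* sleep() is a cancellation point */
    pthread_cleanup_pop(1);     /* never reached, but must pair with the push */
    return NULL;
}

/* Starts the worker, cancels it, and returns 1 if the handler ran. */
static int cancel_worker_and_check(void)
{
    pthread_t tid;
    void *result = NULL;
    int rc = pthread_create(&tid, NULL, worker, NULL);

    assert(rc == 0);
    pthread_cancel(tid);
    rc = pthread_join(tid, &result);
    assert(rc == 0);
    return result == PTHREAD_CANCELED && buffer_released == 1;
}
```

Moving the pop into a different function than the push will not compile on
implementations that use the brace-pair macros, which is exactly the
restriction the question ran into.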

 Q53: What switches should be passed to particular compilers?  

> Does anyone have a list of what switches should be passed to particular
> compilers to have them generate thread-safe code?  For example,
> Solaris-2 & SunPro cc       : -D_REENTRANT
> Solaris-2 & gcc             : ??
> DEC Alpha OSF 3.2 & /bin/cc : -threads
> IRIX 5.x & /bin/cc          : ??
> Similarly, what libraries are passed to the linker to link in threads
> support?
> Solaris-2 & Solaris threads : -lthread
> DEC Alpha OSF 3.2 threads   : -lpthreads
> IRIX 5.x & IRIX threads     : (none)
> And so forth.
> I'm trying to get GNU autoconf to handle threads gracefully.
> Bill

That would be useful information in general, I suppose. I can supply the
information for Digital UNIX (the operating system previously known as
"DEC OSF/1"), at least.

For 3.x and earlier, the proper compiler switch is -threads, which (for
cc) is effectively just -D_REENTRANT. For linking, the cc driver expands
-threads to "-lpthreads -lmach -lc_r" -- you need all three, immediately
preceding -lc (which must be at the end). -lpthreads alone isn't enough:
it will pull in libmach and libc_r implicitly and in the wrong order (after
libc, where they will fail to preempt symbols).

For 4.0, you can still use -threads if you're using the DCE threads (D4)
or cma interfaces. If you don't use -threads, the link libraries should
be changed to "-lpthreads -lpthread -lmach -lexc" (before -lc). If you
use 1003.1c-1995 threads, you use "-pthread" instead of "-threads". cc
still gets -D_REENTRANT, but ld gets -lpthread -lmach -lexc.

/---[ Dave Butenhof ]-----------------------[ ]---\
| Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
| 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q54: How do I find Sun's bug database?  

>I am trying to use Thread Analyzer in Solaris 2.4 for performance
>tuning. But after loading the trace directory, tha exit with following
>error message: 
>Thread Analyzer Fatal Error[0]: Slave communication failure

It always helps if you state which version of the application you are
using, in this case the Thread Analyzer.

There have been a number of bugs which result in this error message
that have been fixed.  Please obtain the latest ThreadAnalyzer patch
from your Authorized Service Provider (ASP) or from our Web page:

 Q55:  How do the various vendors' threads libraries compare?  

    Fundamentally, they are all based on the same paradigm, and everything
    you can do in one library you can (pretty much) do in any other.  Ease
    of programming and efficiency will be the major distinctions.

OS                Preferred Threads POSIX Version   Kernel Support Sched model
---------------   ----------------- -------------   -------------- -------------
Solaris 2.5       UI-threads        1003.1c-1995    yes            2 level(1)
SVR4.2MP/UW 2.0   UI-threads        No
IRIX 6.1          sproc             No
IRIX 6.2          sproc             1003.1c-1995(2)
Digital UNIX 3.2  cma               Draft 4         yes            1 to 1
Digital UNIX 4.0  1003.1c-1995      1003.1c-1995    yes            2 level
DGUX 5.4          ?                 Draft 6         yes
NEXTSTEP          (cthreads?)       No
AIX 4.1           AIX Threads(3)    Draft 7         yes            1 to 1
Plan 9            rfork()           No
OpenVMS 6.2       cma               Draft 4         no
OpenVMS Alpha 7.0 1003.1c-1995      1003.1c-1995    yes            2 level
OpenVMS VAX 7.0   1003.1c-1995      1003.1c-1995    no
WinNT             Win32 threads     No
OS/2              DosCreateThread() Draft 4
Win32             Win32 threads     No              yes            1 to 1


1) Solaris 2.5 blocks threads in kernel with LWP, but provides a signal to
allow user level scheduler to create a new LWP if desired (and
thr_setconcurrency() can create additional LWPs to minimize the chances of
losing concurrency due to blocking.)

2) According to IRIX 6.2 info on SGI's web, 1003.1c-1995 threads will be
provided only as part of the REACT/pro 3.0 Realtime Extensions kit, not in
the base O/S.

3) Can anyone clarify this? My impression is that AIX 4.1 favors 1003.4a/D7
threads; but then I've never heard the term "AIX Threads".

 Q56: Why don't I need to declare shared variables VOLATILE?  

> I'm concerned, however, about cases where both the compiler and the
> threads library fulfill their respective specifications.  A conforming
> C compiler can globally allocate some shared (nonvolatile) variable to
> a register that gets saved and restored as the CPU gets passed from
> thread to thread.  Each thread will have its own private value for
> this shared variable, which is not what we want from a shared
> variable.

In some sense this is true, if the compiler knows enough about the
respective scopes of the variable and the pthread_cond_wait (or
pthread_mutex_lock) functions. In practice, most compilers will not try
to keep register copies of global data across a call to an external
function, because it's too hard to know whether the routine might
somehow have access to the address of the data.

So yes, it's true that a compiler that conforms strictly (but very
aggressively) to ANSI C might not work with multiple threads without
volatile. But someone had better fix it. Because any SYSTEM (that is,
pragmatically, a combination of kernel, libraries, and C compiler) that
does not provide the POSIX memory coherency guarantees does not CONFORM
to the POSIX standard. Period. The system CANNOT require you to use
volatile on shared variables for correct behavior, because POSIX
requires that the POSIX synchronization functions alone be sufficient.

So if your program breaks because you didn't use volatile, that's a BUG.
It may not be a bug in C, or a bug in the threads library, or a bug in
the kernel. But it's a SYSTEM bug, and one or more of those components
will have to work to fix it.

You don't want to use volatile, because, on any system where it makes
any difference, it will be vastly more expensive than a proper
nonvolatile variable. (ANSI C requires "sequence points" for volatile
variables at each expression, whereas POSIX requires them only at
synchronization operations -- a compute-intensive threaded application
will see substantially more memory activity using volatile, and, after
all, it's the memory activity that really slows you down.)

/---[ Dave Butenhof ]-----------------------[ ]---\
| Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
| 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
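
A minimal sketch of the pattern being defended here: a shared variable that
is not declared volatile and is touched only between pthread_mutex_lock and
pthread_mutex_unlock. The names shared_count, count_lock, add_to_count and
run_two_counters are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

#define INCREMENTS_PER_THREAD 100000

/* Shared between threads and deliberately NOT volatile: the mutex calls
 * are what make the updates visible from one thread to the other. */
static long shared_count = 0;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;

static void *add_to_count(void *unused)
{
    int i;
    (void)unused;
    for (i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&count_lock);
        shared_count++;
        pthread_mutex_unlock(&count_lock);
    }
    return NULL;
}

/* Runs two incrementing threads and returns the final count. */
static long run_two_counters(void)
{
    pthread_t a, b;
    int rc;

    shared_count = 0;
    rc = pthread_create(&a, NULL, add_to_count, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, add_to_count, NULL);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_count;    /* safe to read: both writers have been joined */
}
```

On a conforming system the result is always twice INCREMENTS_PER_THREAD;
adding volatile would change nothing except the amount of memory traffic.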

 Q57: Do pthread_cleanup_push/pop HAVE to be macros (thus lexically scoped)?  

Paul Pelletier wrote:
I was about to use pthread_cleanup_push, when I noticed that it is
implemented as a macro (on Solaris 2.5) which forces you to have the
pthread_cleanup_pop in the same function by having an open brace { at the
end of the first macro and closing it in the second...  Since I want to
hide most of this stuff in something like a monitor (or a guard in ACE) in
C++ by using the push in a constructor and the pop in the destructor I'm
wondering if there is something fundamental that would prevent me from doing so
or could I just re-implement the stuff done by the macros inside some class?

POSIX 1003.1c-1995 specifies that pthread_cleanup_push and
pthread_cleanup_pop must be used at the same lexical scope, "as if" the
former were a macro that expands to include an opening brace ("{") and the
latter were a macro that expands to include the matching closing brace
("}").

The Solaris 2.5 definition therefore conforms quite accurately to the intent
of the standard. And so does the Digital UNIX definition, for that
matter. If you can get away with "reverse engineering" the contents of the
macros, swell; but beware that this would NOT be a service to those using
your C++ package, as the results will be extremely non-portable. In fact, no
guarantees that it would work on later versions of Solaris, even assuming
strict binary compatibility in their implementation -- because they could
reasonably make "compatible" changes that would take advantage of various
assumptions regarding how those macros are used that you would be violating.

What you want to do has merit, but you have to remember that you're writing
in C++, not C. The pthread_cleanup_push and pthread_cleanup_pop macros are
the C language binding to the POSIX 1003.1c cancellation cleanup
capability. In C++, the correct implementation of this capability is already
built into the language... destructors. That is, C++ and threads should be
working together to ensure that C++ destructors are run when a thread is
cancelled. If that is done, you've got no problem. If it's not done, you've
got far worse problems, since you won't be "destructing" most of your
objects anyway.

/---[ Dave Butenhof ]-----------------------[ ]---\

 Q58: Thread Analyzer Fatal Error[0]: Slave communication failure ??  

>I am trying to use Thread Analyzer in Solaris 2.4 for performance
>tuning. But after loading the trace directory, tha exit with following
>error message: 
>Thread Analyzer Fatal Error[0]: Slave communication failure
>I do not know what happened. 

It always helps if you state which version of the application you are
using, in this case the Thread Analyzer.

There have been a number of bugs which result in this error message
that have been fixed.  Please obtain the latest ThreadAnalyzer patch
from your Authorized Service Provider (ASP) or from our Web page:

Chuck Fisher 

 Q59: What is the status of Linux threads?  

 Q60: The Sunsoft debugger won't recognize my PThreads program!  

Nope.  The 3.0.2 version was written before the release of Sun's pthread
library.  However, if you simply include -lthread on the compile line, it
will come up and work.  It's a little bit redundant, but works fine.  Hence:

%cc -o one one.c -lpthread -lthread -lposix4 -g

 Q61: How are blocking syscall handled in a two-level system?  

> Martin Cracauer wrote:
> >
> > In a thread system that has both user threads and LWPs like Solaris,
> > how are blocking syscall handled?
> Well, do you mean "like Solaris", or do you mean "Solaris"? There's no
> one answer for all systems. LWP, by the way, isn't a very general term.
> Lately I've been using the more cumbersome, but generic and relatively
> meaningful "kernel execution contexts". A process is a KEC, an LWP is a
> KEC, a "virtual processor" is a KEC, a Mach thread is a KEC, an IRIX
> sproc is a KEC, etc.
> > By exchanging blocking syscalls to nonblocking like in a
> > pure-userlevel thread implementation?
> Generally, only "pure user-mode" implementations, without any kernel
> support at all, resort to turning I/O into "nonblocking". It's just not
> an effective mechanism -- there are too many limitations to the UNIX
> nonblocking I/O model.
> > Or by making sure a thread that calls a blocking syscall is on its own
> > LWP (the kernel is entered anyway, so what would be the cost to do
> > so)?
> Solaris 2.5 "latches" a user thread onto an LWP until it blocks in user
> mode -- on a mutex, a condition variable, or until it yields. User
> threads aren't timesliced, and they stick to the LWP across kernel
> blocks. If all LWPs in a process block in the kernel, a special signal
> allows the thread library to create a new one, but other than that you
> need to rely a lot on thr_setconcurrency.
> Digital UNIX 4.0 works very differently. The kernel delivers "upcalls"
> to the user mode scheduler to communicate various state changes. User
> threads, for example, are timesliced on our KECs (which are a special
> form of Mach thread). When a thread blocks in the kernel, the user mode
> scheduler is informed so that a new user thread can be scheduled on the
> virtual processor immediately. The nice thing about this model is that
> we don't need anything like thr_setconcurrency to keep things running.
> Compute-bound user threads can't lock each other out unless one is
> SCHED_FIFO policy. And instead of "fixing things up" by adding a new
> kernel execution context when the last one blocks (giving you a
> concurrency level of 1), we keep you running at the maximum level of
> concurrency supportable -- the number of runnable user threads, or the
> number of physical processors, whichever is less.
> Neither model (nor implementation) is perfect, and it would be safe to
> assume that both Digital and Sun are working on improving every aspect.
> The models may easily become very different in the future.
> /---[ Dave Butenhof ]-----------------------[ ]---\
> | Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
> | 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
> \-----------------[ Better Living Through Concurrency ]----------------/

> Georges Brun-Cottan wrote:
> > So recursive mutex is far more than just a hack for lazy programmer or
> > just a way to incorporate non MT safe third party code. It is a tool
> > that you need in an environment such as OOP, where you can not or you do
> > not want to depend on an execution context.
> Sorry, but I refuse to believe that good threaded design must end where
> OOP begins. There's no reason for two independently developed packages
> to share the same mutex. There's no reason for a package to be designed
> without awareness of where and when mutexes are locked. Therefore, in
> either case, recursive mutexes remain, at best, a convenience, and, at
> worst (and more commonly), a crutch.
> I created the recursive mutex for DCE threads because we were dealing
> with a brand-new world of threading. We had no support from operating
> systems or other libraries. Hardly anything was "thread safe". The DCE
> thread "global mutex" allowed any thread-safe code to lock everything
> around a call to any unsafe code. As an intellectual exercise, I chose
> to implement the global mutex by demonstrating why we'd created the
> concept of "mutex attributes" -- previously, there had been none. As a
> result of this intellectual exercise, it became possible for anyone to
> conveniently create their own recursive mutex, which is locked and
> unlocked using the standard POSIX functions. There really wasn't any
> point to removing the attribute, since it's not that hard to create your
> own recursive mutex.
> Remember that whenever you use recursive mutexes, you are losing
> performance -- recursive mutexes are more expensive to lock and unlock,
> even without mutex contention (and a recursive mutex created on top of
> POSIX thread synchronization is a lot more expensive than one using the
> mutex type attribute). You are also losing concurrency by keeping
> mutexes locked so long and across so much context that you become
> tempted to use recursive mutexes to deal with lock range conflicts.
> Yes, it may be harder to avoid recursive mutexes. Although I've never
> yet seen a valid case proving that recursive mutexes are NECESSARY, I
> won't deny that there may be one or two. None of that changes the fact
> that an implementation avoiding recursive mutexes will perform, and
> scale, far better than one relying on recursive mutexes. If you're
> trying to take advantage of multithreading, all the extra effort in
> analysis and design will pay off in increased concurrency.
> But, like any other aspect of performance analysis, you put the effort
> where the pay is big enough. There are non-critical areas of many
> libraries where avoiding recursive mutexes would be complicated and
> messy, and where the overhead of using them doesn't hurt performance
> significantly. Then, sure, use them. Just know what you're doing, and
> why.
> /---[ Dave Butenhof ]-----------------------[ ]---\
> | Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
> | 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
> \-----------------[ Better Living Through Concurrency ]----------------/
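
For reference, a sketch of the "mutex type attribute" route mentioned above,
assuming a system that provides PTHREAD_MUTEX_RECURSIVE (standardized in
later revisions of POSIX than the one under discussion). The names tree_lock,
node_count, add_node and add_two_nodes are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t tree_lock;
static int node_count = 0;

/* Initialises tree_lock as a recursive mutex; returns 0 or an error code. */
static int init_recursive_lock(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(&tree_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

static void add_node(void)
{
    int rc = pthread_mutex_lock(&tree_lock);
    assert(rc == 0);
    node_count++;
    pthread_mutex_unlock(&tree_lock);
}

static void add_two_nodes(void)
{
    pthread_mutex_lock(&tree_lock);     /* outer lock */
    add_node();                         /* relocks: legal only because the */
    add_node();                         /* mutex is recursive              */
    pthread_mutex_unlock(&tree_lock);
}
```

With a default mutex the inner lock in add_node would deadlock or fail. The
design alternative argued for above is to split add_node into a locked public
entry point and an unlocked internal helper, so no relock is needed.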

 Q62: Can one thread read from a socket while another thread writes to it?  

It's supposed to work!  That's certainly how sockets are defined.  It's
an easy enough test on your own system.
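
One way to run that test, sketched with a UNIX-domain socketpair so it needs
no network setup. The calling thread reads from a descriptor while a second
thread writes to the same descriptor; a third thread plays the far end and
echoes. All function names are hypothetical and error handling is minimal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static const char MESSAGE[] = "ping";   /* 5 bytes including the NUL */

/* Far end of the connection: echoes everything it receives until EOF. */
static void *echo_peer(void *arg)
{
    int fd = *(int *)arg;
    char buf[64];
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(fd, buf + off, (size_t)(n - off));
            if (w <= 0)
                return NULL;
            off += w;
        }
    }
    return NULL;
}

/* Writer thread: sends on the same descriptor the caller is reading. */
static void *send_message(void *arg)
{
    int fd = *(int *)arg;
    ssize_t w = write(fd, MESSAGE, sizeof MESSAGE);
    (void)w;
    return NULL;
}

/* Reads the echoed message on sv[0] while another thread writes to sv[0].
 * Returns 0 on success, -1 on failure; out needs room for sizeof MESSAGE. */
static int read_while_writing(char *out, size_t outlen)
{
    int sv[2];
    pthread_t peer, writer;
    size_t got = 0;
    int rc;

    if (outlen < sizeof MESSAGE || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    rc = pthread_create(&peer, NULL, echo_peer, &sv[1]);
    assert(rc == 0);
    rc = pthread_create(&writer, NULL, send_message, &sv[0]);
    assert(rc == 0);

    while (got < sizeof MESSAGE) {
        ssize_t n = read(sv[0], out + got, sizeof MESSAGE - got);
        if (n <= 0)
            break;
        got += (size_t)n;
    }

    pthread_join(writer, NULL);
    close(sv[0]);               /* peer sees EOF and exits */
    pthread_join(peer, NULL);
    close(sv[1]);
    return got == sizeof MESSAGE ? 0 : -1;
}
```
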
 Q63: What's a good way of writing threaded C++ classes?  

> Ian Emmons wrote:
> >
> > Baard Bugge wrote:
> > >
> > > >How would we put the whole object into a thread?
> > >
> > > Been there. Done that. Let the constructor create a thread before
> > > returning to the caller (another object). But beware, your OS will
> > > > probably start the thread by calling a function (specified by you)
> > > C-style. You want this function to be a member function in your class,
> > > which is ok as long as you make it static. The thread function will
> > > also need the this-pointer to your newly created object. What you want
> > > will look something like this (in NT):
> > >
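> > > > // Thread callback function.
> > > > // NOTE: Needs to be written in C or be a static member function
> > > > // because of C style calling convention (no hidden this pointer).
> > > > // LPTHREAD_START_ROUTINE is the pointer type CreateThread takes; the
> > > > // function itself is declared DWORD WINAPI.
> > > > DWORD WINAPI CThread::ThreadFunc(LPVOID inputparam)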
> > > {
> > >    CThread *pseudo_this = (CThread *) inputparam;
> > >    ...
> > > }
> > >
> > > > This function has access to all the members in the object through the
> > > pseudo this pointer. And all member functions called by this function
> > > will run in the same thread. You'll have to figure out how to
> > > communicate with the other objects in your system though. Be careful.
> > >
> > > --
> > > BaBu
> >
> > You can take this even a step further.  Add a pure virtual to your generic
> > CThread class like so:
> >
> > class CThread
> > {
> >       ...
> > protected:
> >     // I don't remember what Win32 expects as the return value, here,
> >     // but you can fix this up as you wish:
> >     virtual unsigned entryPoint() = 0;
> >       ...
> > };
> >
> > Then have the static ThreadFunc call it like so:
> >
> > // Thread callback function.
> > // NOTE: Needs to be written in C or be a static member function
> > // because of C style calling convention (no hidden this pointer).
> > // LPTHREAD_START_ROUTINE is the pointer type CreateThread takes; the
> > // function itself is declared DWORD WINAPI.
> > DWORD WINAPI CThread::ThreadFunc(LPVOID inputparam)
> > {
> >    return ((CThread*) inputparam)->entryPoint();
> > }
> >
> > Now, to create a specific thread, derive from CThread, override entryPoint,
> > and you no longer have to mess around with a pseudo-this pointer, because
> > the real this pointer is available.
> >
> > One tricky issue:  make sure you differentiate between methods that the
> > thread itself will call, and methods that other threads (such as the one
> > that created the thread object) will call -- you will need to do thread
> > synchronization on class members that are shared data.
> >
> > Ian
> >
> > ___________________________________________________________________________
> > Ian Emmons                                       Work phone: (415) 372-3623
> >                              Work fax:   (415) 341-8432
> > Persistence Software, 1720 S. Amphlett Blvd. Suite 300, San Mateo, CA 94402

OK, let me warn everyone this is a very long response, but I just came off
of a large radar project on which I had to design multithreaded objects so
this question jumped out at me.

Yousuf Khan  wrote in article
> I got some hypothetical questions here, I'm not actually now trying to
> do any of this, but I can see myself attempting something in the near
> future.
> Okay, I'm thinking multithreading and OO design methodologies are
> tailor-made for each other, in theory at least. OO design mandates that
> all object instances are considered concurrent with each other. That
> seems like a perfect application of threading principles. However,
> current threading protocols (such as the POSIX Pthreads, Microsoft/IBM
> threads, Sun UI threads, etc.) seem to be based around getting
> subprocedures threaded, rather than getting objects threaded.

First, let me state my own programming background so you can apply the
appropriate grain of salt to what I say and understand my assumptions.  I
have programmed first for a few years in a DEC, VMS environment and then for
several more in a Windows/Windows NT environment.

> Okay, I suppose we can get individual methods within an object to be
> threaded, because they are just like subprocedures anyways. But what if we
> wanted to be extremely pedantic, and we want the entire object to be in
> its own thread, in order to be true to OO design paradigms? How would we
> put the whole object into a thread?  My feeling is that we should just
> call the object's constructor inside a thread wrapper, that way the entire
> object will go into a thread, including any other methods that are part of
> that object. What I guess I'm saying is that will calling the constructor
> inside a thread wrapper, only run the constructor inside that thread and
> then the thread will end, or will the entire object now run inside that
> thread from now on?  Am I being oversimplistic in my speculation?

If you want to force an object to, as you say, "run in one thread", you
would have to make every public member function perform a context
switch to the desired thread upon entering the function and switch back upon
exiting.  You would have to protect all member variables and use Get/Set
functions for them that performed context switches as well.

Under Windows NT, if you send a message to a window created by a different
thread, that context switch is performed for you by the operating system.
Your process waits until the ::SendMessage() call completes.  Other than
using SendMessage(), I do not know how you would accomplish such an
operation.  And SendMessage requires a window to which the message will be
sent.  Thus, under NT, you would have to make your object create some kind
of hidden window in the context of the desired thread and then have every
member function do a ::SendMessage() to that window.

(There are variations -- e.g. SendMessageCallback(), PostMessage(), etc for
asynchronous function calls)

Such a design is possible, and maybe workable, but seems to defeat the
purpose of threads, doesn't it?  If one thread is just going to have to wait
for the special thread every function call, why have the special thread at
all?

And I haven't even considered OLE and accessing objects across process
boundaries, or thread-local storage.

(Again, I'm speaking pretty exclusively about the NT threading model here.
I've had enough VMS to last me a lifetime and know very little about Posix
threads.)
It seems your reason for wanting the entire object to run in its own thread
is to be true to the OO paradigm, but I think that's perhaps too much of a
good thing.
Why not make your objects completely thread-safe instead?  Create some sort
of a Single-Writer / Multiple-Reader resource locking object for all objects
of the class.  Make each member function use this resource guard, acquiring
a read-lock if it's a const member function or write-lock if it is not.

There's nothing to prevent you from assigning specific threads to the
objects to do background work on them, but as long as all access to the
objects is through those safe member functions, they are completely thread
safe.
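
The locking scheme just described can be sketched in C with a POSIX
reader/writer lock, assuming a system that provides pthread_rwlock_t (it is
not part of 1003.1c-1995). This is not the radar project's code; struct track
and its functions are hypothetical stand-ins. The read-only accessor plays
the role of a const member function and takes the read lock; the mutator
takes the write lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared object guarded by a reader/writer lock. */
struct track {
    pthread_rwlock_t lock;
    double range_km;
};

static int track_init(struct track *t, double range_km)
{
    t->range_km = range_km;
    return pthread_rwlock_init(&t->lock, NULL);
}

/* Read-only accessor: many threads may hold the read lock at once. */
static double track_range(struct track *t)
{
    double r;
    int rc = pthread_rwlock_rdlock(&t->lock);
    assert(rc == 0);
    r = t->range_km;
    pthread_rwlock_unlock(&t->lock);
    return r;
}

/* Mutator: the write lock excludes all readers and other writers. */
static void track_set_range(struct track *t, double range_km)
{
    int rc = pthread_rwlock_wrlock(&t->lock);
    assert(rc == 0);
    t->range_km = range_km;
    pthread_rwlock_unlock(&t->lock);
}

static void track_destroy(struct track *t)
{
    pthread_rwlock_destroy(&t->lock);
}
```

Callers never touch range_km directly, so any thread can use the object
without knowing which other threads are working on it.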

I mention this because this is how I designed a large radar project I just
finished working on.  I used completely thread-safe, reference counted
objects, read/write locks, and smart pointers in my design and the results
were far better than my most optimistic hopes.  A very fast workstation
program with many dynamic displays showing an enormous amount of continuously
changing data stored in a SQL server database.

I've gone on way too long here so I'll end this without saying half of what
I want to say.  Hope this gives you a few ideas.

 Q64: Can thread stacks be built in privately mapped memory?  

I've avoided any response to this long thread for a while because I'm not
sure I want to confuse the issue with facts. And, despite the facts, I like
the idea of people learning to treat thread stacks "as if they might be"
private.
Nevertheless, at some time I thought it might be helpful to point out what
POSIX says about the matter... and I guess this is a good time.

POSIX very specifically disallows "non-shared" memory between threads.  That
is, it requires that the address space is associated with the PROCESS, not
with the individual THREADS. All threads share a single virtual address
space, and no memory address is private. Stacks, in particular, CANNOT be
set up with private memory. Although, for safe programming, you should
almost always pretend that it's private.

/---[ Dave Butenhof ]-----------------------[ ]---\
|  Digital Equipment Corporation         110 Spit Brook Rd ZKO2-3/Q18  |
|  603.881.2218, FAX 603.881.0120                Nashua NH 03062-2698  |
\-----------------[ Better Living Through Concurrency ]----------------/
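
A small sketch of what "no memory address is private" means in practice: one
thread writes through a pointer into another thread's stack. The function
names are hypothetical. It works only because the caller keeps the variable
alive until the join, which is why treating stacks as private remains the
safer habit.

```c
#include <assert.h>
#include <pthread.h>

/* Runs in the new thread; arg points into the creating thread's stack. */
static void *fill_callers_slot(void *arg)
{
    int *slot = arg;
    *slot = 42;
    return NULL;
}

/* Returns the value the other thread stored in our automatic variable. */
static int read_value_written_by_other_thread(void)
{
    int on_my_stack = 0;
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, fill_callers_slot, &on_my_stack);

    assert(rc == 0);
    pthread_join(tid, NULL);    /* on_my_stack must outlive the writer */
    return on_my_stack;
}
```
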

 Q65: Has anyone implemented a mutex with a timeout?  

Has anyone implemented a mutex locking function on top of Solaris or POSIX
threads with a timeout?  The problem I'm trying to solve is if a thread is
unable to obtain a mutex after a certain timeframe (say 30 seconds), then I
want the thread to terminate and return an error.  The Solaris and POSIX
APIs only allow the user to check if a mutex can be obtained.

Of course! Check out the code for pthread_np_timed_mutex_t at
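
Independently of that code, here is a minimal sketch of one way to get a lock
with a timeout out of an ordinary mutex and a condition variable, using
pthread_cond_timedwait. It is not the pthread_np_timed_mutex_t
implementation; struct timed_lock and its functions are hypothetical names.
(Later revisions of POSIX added pthread_mutex_timedlock, which covers the
simple case directly where it is available.)

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

struct timed_lock {
    pthread_mutex_t guard;      /* protects 'held' */
    pthread_cond_t  released;   /* signalled when 'held' drops to 0 */
    int             held;
};

#define TIMED_LOCK_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Returns 0 once the lock is acquired, or ETIMEDOUT if it could not be
 * acquired within timeout_sec seconds. */
static int timed_lock_acquire(struct timed_lock *l, time_t timeout_sec)
{
    struct timespec deadline;
    int rc = clock_gettime(CLOCK_REALTIME, &deadline);

    assert(rc == 0);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&l->guard);
    while (l->held && rc == 0)
        rc = pthread_cond_timedwait(&l->released, &l->guard, &deadline);
    if (!l->held) {             /* free, possibly right at the deadline */
        l->held = 1;
        rc = 0;
    }
    pthread_mutex_unlock(&l->guard);
    return rc;
}

static void timed_lock_release(struct timed_lock *l)
{
    pthread_mutex_lock(&l->guard);
    l->held = 0;
    pthread_cond_signal(&l->released);
    pthread_mutex_unlock(&l->guard);
}
```

For the 30-second case in the question, a thread would call
timed_lock_acquire(&lock, 30) and return an error when it gets ETIMEDOUT.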

 Q66: I think I need a FIFO mutex for my program...  

>There are VERY few cases where "lock ordering" is truly necessary. In
>general, when it may seem to be necessary, using a work queue to distribute
>the work across a pool of threads will be easier and more efficient. If
>you're convinced that you need lock ordering, rather than POSIX wakeup
>ordering, you have to code it yourself -- using, essentially, a work queue
>model where threads wishing to lock your "queued mutex" queue themselves in
>order. Use a condition variable and "next waiter" predicate to ensure proper
>locking order. It's not that hard.
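
A minimal sketch of the condition variable plus "next waiter" predicate
approach, written as a ticket lock. This is neither the ACE nor the
pthread_np code; struct fifo_mutex and its functions are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

struct fifo_mutex {
    pthread_mutex_t guard;          /* protects the two counters */
    pthread_cond_t  turn_changed;   /* broadcast whenever now_serving moves */
    unsigned long   next_ticket;    /* next ticket to hand out */
    unsigned long   now_serving;    /* ticket currently allowed to proceed */
};

#define FIFO_MUTEX_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, 0 }

/* Takes a ticket and waits until that ticket comes up, so threads acquire
 * the lock in the order in which they asked for it. */
static void fifo_mutex_lock(struct fifo_mutex *m)
{
    unsigned long my_ticket;

    pthread_mutex_lock(&m->guard);
    my_ticket = m->next_ticket++;
    while (my_ticket != m->now_serving)     /* the "next waiter" predicate */
        pthread_cond_wait(&m->turn_changed, &m->guard);
    pthread_mutex_unlock(&m->guard);
}

/* Passes the lock to the holder of the next ticket, if any. */
static void fifo_mutex_unlock(struct fifo_mutex *m)
{
    pthread_mutex_lock(&m->guard);
    assert(m->now_serving != m->next_ticket);   /* must be locked */
    m->now_serving++;
    pthread_cond_broadcast(&m->turn_changed);   /* only one waiter matches */
    pthread_mutex_unlock(&m->guard);
}
```

The broadcast wakes every waiter just so that one of them can proceed, which
is part of why a work queue is usually the cheaper design.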

Right, and you can find a freely available implementation of essentially a
"FIFO Mutex" in ACE.  Take a look at

Dr. Douglas C. Schmidt
Department of Computer Science, Washington University
St. Louis, MO 63130. Work #: (314) 935-4215; FAX #: (314) 935-7302

You can also find an implementation of FIFO mutexes in the file
pthread_np.{h,c} at:

 Q67: Why my multi-threaded X11 app with LinuxThreads crashes?  

> Wolfram Gloger wrote:
> >
> > (Jeff Noll) wrote:
> >
> > >       I'm making an X client that connects to a tcp socket. I'm using a
> > > thread to continually read from that socket connection and a text
> > > widget to send to the socket. (an X telnet program that looks kind of
> > > like ncftp, separate input/output windows). When i run this at school
> > > under solaris it seems to be fine, but when i take it home and try it
> > > under linux using linuxthreads 0.5 it crashes when i start entering
> > > into the text window.
> >
> > Crash as in `fatal' X errors ?  A while ago I had a similar experience
> > when trying to create a multi-threaded X11 app with LinuxThreads.  It
> > was quite easy to debug though: the LinuxThreads libpthread library
> > lets all threads get individual errno values (like they should), as
> > long as all sources are compiled with _REENTRANT defined.
> >
> > The X11 libs (at least in XFree86-3.2) are by default not compiled in
> > this way, unfortunately (note I'm not talking about multiple thread
> > support in X11), and they break when using LinuxThreads, e.g. because
> > Xlib relies on read() returning with errno==EAGAIN at times.  This is
> > a problem even when one restricts oneself to using X from a single
> > thread only.
> >
> > Once I recompiled all X11 libs with -D_REENTRANT (totally independent
> > of libpthread), everything works fine.  I could put those libs up for
> > ftp if you're interested to check it out.
> >
> > Regards,
> > Wolfram.
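
The per-thread errno that Wolfram describes can be checked with a sketch like
the one below, assuming a threads library in which errno expands to a
per-thread lvalue (with LinuxThreads of that era, only when compiled with
-D_REENTRANT). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

struct errno_probe {
    int *callers_errno;     /* address of errno in the creating thread */
    int  differs;           /* set to 1 if the new thread's errno is distinct */
};

/* Runs in the new thread: &errno here names this thread's own errno. */
static void *compare_errno_address(void *arg)
{
    struct errno_probe *p = arg;
    p->differs = (&errno != p->callers_errno);
    return NULL;
}

/* Returns 1 if each thread gets its own errno, 0 if errno is one global. */
static int errno_is_per_thread(void)
{
    struct errno_probe probe = { &errno, 0 };
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, compare_errno_address, &probe);

    assert(rc == 0);
    pthread_join(tid, NULL);
    return probe.differs;
}
```

Code built without the per-thread definition reads the single global errno
instead, which is why a library compiled that way can miss the EAGAIN that
read() reported to the calling thread.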
 Q68: How would we put a C++ object into a thread?  

> > > Been there. Done that. Let the constructor create a thread before
> > > returning to the caller (another object). But beware, your OS will
> > > probably start the thread by calling a function (specified by you)
> > > C-style. You want this function to be a member function in your class,
> > > which is ok as long as you make it static. The thread function will
> > > also need the this-pointer to your newly created object. What you want
> > > will look something like this (in NT):
> > >
> > > // Thread callback function.
> > > // NOTE: Needs to be written in C or be a static member function
> > > // because of C style calling convention (no hidden this pointer).
> > > // LPTHREAD_START_ROUTINE is the pointer type CreateThread takes; the
> > > // function itself is declared DWORD WINAPI.
> > > DWORD WINAPI CThread::ThreadFunc(LPVOID inputparam)
> > > {
> > >    CThread *pseudo_this = (CThread *) inputparam;
> > >    ...
> > > }
> > >
> > > This function has access to all the members in the object through the
> > > pseudo this pointer. And all member functions called by this function
> > > will run in the same thread. You'll have to figure out how to
> > > communicate with the other objects in your system though. Be careful.
> > >
> > > --
> > > BaBu
> >
> > You can take this even a step further.  Add a pure virtual to your generic
> > CThread class like so:
> >
> > class CThread
> > {
> >       ...
> > protected:
> >     // I don't remember what Win32 expects as the return value, here,
> >     // but you can fix this up as you wish:
> >     virtual unsigned entryPoint() = 0;
> >       ...
> > };
> >
> > Then have the static ThreadFunc call it like so:
> >
> > // Thread callback function.
> > // NOTE: Needs to be written in C or be a static member function
> > // because of C style calling convention (no hidden this pointer).
> > // LPTHREAD_START_ROUTINE is the pointer type CreateThread takes; the
> > // function itself is declared DWORD WINAPI.
> > DWORD WINAPI CThread::ThreadFunc(LPVOID inputparam)
> > {
> >    return ((CThread*) inputparam)->entryPoint();
> > }
> >
> > Now, to create a specific thread, derive from CThread, override entryPoint,
> > and you no longer have to mess around with a pseudo-this pointer, because
> > the real this pointer is available.
> >
> > One tricky issue:  make sure you differentiate between methods that the
> > thread itself will call, and methods that other threads (such as the one
> > that created the thread object) will call -- you will need to do thread
> > synchronization on class members that are shared data.
> >
> > Ian
> >
> > ___________________________________________________________________________
> > Ian Emmons                                       Work phone: (415) 372-3623
> >                              Work fax:   (415) 341-8432
> > Persistence Software, 1720 S. Amphlett Blvd. Suite 300, San Mateo, CA 94402
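
The same trampoline idea carries over to POSIX threads. A rough C analogue,
with a function pointer standing in for the pure virtual entryPoint() and the
object pointer travelling through pthread_create's void * argument, might
look like this. struct worker and the function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

struct worker {
    pthread_t tid;
    int (*entry_point)(struct worker *self);    /* the "virtual" method */
    int input;
    int output;                                 /* written by the thread */
};

/* Has the plain C signature pthread_create expects; recovers the object
 * from the void * and dispatches to its entry point. */
static void *worker_trampoline(void *arg)
{
    struct worker *self = arg;
    self->output = self->entry_point(self);
    return NULL;
}

static int worker_start(struct worker *w)
{
    assert(w->entry_point != NULL);
    return pthread_create(&w->tid, NULL, worker_trampoline, w);
}

/* After a successful join the creator may read w->output safely. */
static int worker_join(struct worker *w)
{
    return pthread_join(w->tid, NULL);
}

/* One concrete "derived class": doubles its input. */
static int doubling_entry(struct worker *self)
{
    return self->input * 2;
}
```

As Ian notes, fields such as output are shared data: here the join is what
makes it safe for the creating thread to read them.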
 Q69: How different are DEC threads and Pthreads?  
Mike.Lanni wrote:
> Baard Bugge wrote:
> >
> > According to the thread-faq, DCE threads (as in HPUX 10.10) is an
> > older version of Posix 1003.1c threads (as in Solaris 2.5).
> >
> > What are the differences? Are the two of them, fully or partly, source
> > code compatible?
> >
> > I want my multithreaded code to be cross-compilable on at least the
> > two platforms mentioned above, without too many ifdefs. Can I?
> >
> > --
> > BaBu
> Unfortunately, this is not black and white. If HPUX 10.10 is based on
> Draft 7 or higher, the Solaris and HP codes should be similar. However,
> if HP 10.10 is based on Draft 4, then there is quite a bit of work to be
> done. D4 became popular due to its usage with DCE. Assuming the worst,
> D4, here are some notes that I've put together based on some programming
> I've done. It is not complete by any means, but it should give you an
> idea of what you are up against.
>  - signal handling is different
>  - return codes from pthreads APIs are now the real error, vs. -1
>    and errno
>  - possibly no support for the "non-portable" APIs and symbolic
>    constants
>  - no support for DCE exception handling
>  - Some of the pthread_attr_ apis have different types and arguments.
>  - Some of the scheduling apis have changed.
>  - Some thread specific api's have changed parameters.
> Below are some mappings that at one time were valid...
> #if defined(_D4_)
> #define PTHREAD_ONCE pthread_once_init
> #define PTHREAD_ATTR_DEFAULT pthread_attr_default
> #define PTHREAD_MUTEXATTR_DEFAULT pthread_mutexattr_default
> #define PTHREAD_CONDATTR_DEFAULT pthread_condattr_default
> #define INITROUTINE pthread_initroutine_t
> #define PTHREAD_ADDR_T pthread_addr_t
> #define START_RTN pthread_startroutine_t
> #define PTHREAD_YIELD pthread_yield
> #define PTHREAD_ATTR_DELETE pthread_attr_delete
> #define PTHREAD_ATTR_CREATE pthread_attr_create
> #define PTHREAD_MUTEXATTR_DELETE pthread_mutexattr_delete
> #define PTHREAD_MUTEXATTR_CREATE pthread_mutexattr_create
> #define PTHREAD_CONDATTR_DELETE pthread_condattr_delete
> #define PTHREAD_CONDATTR_CREATE pthread_condattr_create
> #define PTHREAD_KEYCREATE pthread_keycreate
> #define ATFORK atfork
> #define SIGPROCMASK sigprocmask
> #else
> #define INITROUTINE void *
> #define PTHREAD_ADDR_T void *
> #define START_RTN void *
> #define PTHREAD_YIELD sched_yield
> #define PTHREAD_ATTR_DELETE pthread_attr_destroy
> #define PTHREAD_ATTR_CREATE pthread_attr_init
> #define PTHREAD_MUTEXATTR_DELETE pthread_mutexattr_destroy
> #define PTHREAD_MUTEXATTR_CREATE pthread_mutexattr_init
> #define PTHREAD_CONDATTR_DELETE pthread_condattr_destroy
> #define PTHREAD_CONDATTR_CREATE pthread_condattr_init
> #define PTHREAD_KEYCREATE pthread_key_create
> #define ATFORK pthread_atfork
> #define SIGPROCMASK pthread_sigmask
> #endif
> #if defined(_D4_)
>       rc = pthread_detach(&tid);
>       rc = pthread_exit(status);
>       rc = pthread_join(tid, &status);
>          pthread_setcancel(CANCEL_OFF);
>          pthread_setcancel(CANCEL_ON);
>     (void) pthread_setscheduler(pthread_self(),SCHED_FIFO,PRI_FIFO_MAX);
> #else
>       rc = pthread_detach(tid);
>       rc = pthread_exit(&status);
>       rc = pthread_join(tid, &status_p);
>          pthread_setcancelstate(PTHREAD_CANCEL_DISABLE,NULL);
>          pthread_setcancelstate(PTHREAD_CANCEL_ENABLE,NULL);
>     struct sched_param param;
>     param.sched_priority = 65535;
>     (void) pthread_setschedparam(pthread_self(),SCHED_FIFO,&param);
> #endif /* _D4_ */
> Hope this helps.
> Mike L.
> --------------------------------------------------------------------
> Michael J. Lanni
> NCR                            email:
> 3325 Platt Springs Road        phone:  803-939-2512
> West Columbia, SC 29170          fax:  803-939-7317
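
The change in error reporting is the one most likely to bite silently
when porting, so here is a minimal sketch of the final-standard
convention.  The helper name try_lock_and_report() is made up for
illustration; the only pthreads call used is pthread_mutex_trylock().

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Final-standard (1003.1c) convention: the call returns 0 on success
   or the error number itself.  It does not return -1, and errno is
   not the place to look. */
int try_lock_and_report(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);

    if (rc == EBUSY)
        printf("mutex is busy\n");
    else if (rc != 0)
        printf("trylock failed: %s\n", strerror(rc));
    return rc;
}
```

Code that still tests for -1 and then reads errno will compile under
the final standard but will never notice a failure, which is why a
wrapper of this kind is worth having on both sides of the #if.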
 Q70: How can I manipulate POSIX thread IDs?  
Steven G. Townsend wrote:
> Jim Robinson wrote:
> >
> > In article <>, Ian Emmons wrote:
> > >Robert Patrick wrote:
> > >>
> > >> Yes, you can copy one pthread_t to another.  The part you have to be
> > >> careful about is that in some implementations pthread_t is a struct
> > >> and in others it is not.  Therefore, setting two pthread_t's to be
> > >> equal by assignment will not be portable.  However, memcpy(a, b,
> > >> sizeof(pthread_t)) should always work.
> As to the assignment issue, see Jim's comment below.
> As to the first point, assume for the moment that a part of
> the structure is an array (status, pointers to currently allocated
> keys, whatever) if anything in the array can change, the "copy"
> will not be updated. Assume a status flag/bit which indicates
> whether the thread is runnable, looking at the copy could easily
> produce different results than the actual value of the "true"
> pthread_t.  This is just a bad thing to do.  Other problems
> can occur as well...
> What happens if both the original and the copy are passed to
> pthread_destroy?
> What happens if as we are doing the '=' or memcpy operation the
> thread is currently executing on a different processor (i.e.
> The contents of the pthread_t object would need to be protected
> by a mutex)?
> When it comes to copying pthread_t s...
>   Just say 'no'.
> > >>
> > >> Just my two cents,
> > >> Robert
> > >
> > >Since I work in C++ exclusively, this isn't an issue for me, and so I never thought
> > >about that.  For C coders, you're right, of course.
> >
> > Structure assignment is defined in *ANSI* C. See page 127 of K&R, 2nd
> > edition. Since ANSI C has been standardized for quite some time now, it
> > should be a non-issue for C coders as well, no?
> > --
> > Jim Robinson
> >
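
For what it's worth, POSIX treats pthread_t as an opaque value:
pthread_self() returns one by value and pthread_create() stores one
through a pointer, so copying an ID by plain assignment is fine.  What
is not portable is comparing two IDs with ==; pthread_equal() exists
for that.  A small sketch, with made-up helper names:

```c
#include <pthread.h>

/* Hypothetical bookkeeping: remember which thread owns a resource. */
static pthread_t owner;

void remember_owner(void)
{
    owner = pthread_self();          /* plain assignment copies the ID */
}

int called_by_owner(void)
{
    /* pthread_t may be a struct, so never compare with ==. */
    return pthread_equal(owner, pthread_self()) != 0;
}
```

The copy is only a name for the thread.  It carries no thread state,
so a stale copy can at worst name a thread that no longer exists.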
 Q71: I'd like a "write" that allowed a timeout value...  

Marc Peters wrote:
> What would be nice to have is a "write" that allowed a timeout value to be
> specified.  A la:
>         write(fdSocket, bufferPtr, bufferLength, timeSpec);
> If the write doesn't succeed within the specified timeSpec, then errno
> should be set to ETIMEOUT or something.  Obviously, this would be quite handy
> in network code.
> Due to other circumstances beyond my control, the fdSocket cannot be placed
> in non-blocking mode.  Thus, the solution I'm left with is to start a POSIX
> timer, wait on the blocking write, and check for an errno of EINTR when it
> returns (if it timed out).
> I'm aware of the alternate technique of dedicating a thread to dispatching
> signal events.  This dedicated thread merely does a sigwaitinfo() and
> dispatches accordingly.  This technique, too, is offensive for such a simple
> requirement -- the "timed" write.

Why not just do these possibly long writes in separate threads? And, if
some "manager" decides they've gone too long, cancel the threads.

> I've ordered the POSIX 1003.1 standard to pursue this; however, it will be
> several days before it arrives.  Can anyone fill me in with some details of
> SIGEV_THREAD in the meantime?

SIGEV_THREAD creates a thread instead of raising a signal in the more
conventional manner. You get to specify the start routine (instead of
a signal catching function), and the attributes. The thread runs
anonymously (by default it's detached, and if you use an attributes
object with detachstate set to PTHREAD_CREATE_JOINABLE, the behavior
is unspecified).

The main advantage is that your "signal catching function" can lock
mutexes, signal condition variables, etc., instead of being restricted
to only the short list of async-signal safe functions.
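
A minimal sketch of a SIGEV_THREAD timer, assuming a system that
implements it.  The names on_expiry() and wait_for_timer() are
invented for the example; the calls themselves (timer_create,
timer_settime, timer_delete) are the standard POSIX timer interface.
The notification function signals a condition variable, which an
ordinary signal handler could not safely do.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static pthread_mutex_t fired_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  fired_cond = PTHREAD_COND_INITIALIZER;
static int fired = 0;

/* Runs in a new thread when the timer expires, so unlike a signal
   handler it may lock mutexes and signal condition variables. */
static void on_expiry(union sigval value)
{
    (void)value;
    pthread_mutex_lock(&fired_lock);
    fired = 1;
    pthread_cond_signal(&fired_cond);
    pthread_mutex_unlock(&fired_lock);
}

/* Arm a one-shot timer for `millis` (> 0) milliseconds and block
   until on_expiry() has run.  Returns 0 on success, -1 on error. */
int wait_for_timer(long millis)
{
    struct sigevent   sev;
    struct itimerspec its;
    timer_t           timer;

    memset(&sev, 0, sizeof sev);
    sev.sigev_notify            = SIGEV_THREAD;
    sev.sigev_notify_function   = on_expiry;
    sev.sigev_notify_attributes = NULL;      /* default attributes */
    if (timer_create(CLOCK_REALTIME, &sev, &timer) != 0)
        return -1;

    memset(&its, 0, sizeof its);             /* no interval: one shot */
    its.it_value.tv_sec  = millis / 1000;
    its.it_value.tv_nsec = (millis % 1000) * 1000000L;
    if (timer_settime(timer, 0, &its, NULL) != 0) {
        timer_delete(timer);
        return -1;
    }

    pthread_mutex_lock(&fired_lock);
    while (!fired)
        pthread_cond_wait(&fired_cond, &fired_lock);
    pthread_mutex_unlock(&fired_lock);

    timer_delete(timer);
    return 0;
}
```

Older systems may need -lrt and -lpthread on the link line.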

In your case, a SIGEV_THREAD action would still constitute a "double
dispatch" for your signal. The code wouldn't look much different from
your current version.

Oh yeah, there's a major disadvantage, for you, in SIGEV_THREAD.
Solaris 2.5 doesn't implement SIGEV_THREAD. So you'd have to wait for
Solaris 2.6. (Just to be fair, I'll also point out that Digital UNIX 4.0
didn't do SIGEV_THREAD, either -- it is a minor and relatively obscure
function, and we all had more important things to worry about. We also
caught up to it later, and we'll be supporting SIGEV_THREAD in Digital
UNIX 4.0D [or at least, that's what we're calling it now, though these
things are always subject to change for various reasons].)

/---[ Dave Butenhof ]-----------------------[ ]---\
| Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
| 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q72: I couldn't get threads to work with glibc-2.0.  

>I finally got this to compile cleanly, but it stalls (sigsuspend) somewhere
>in pthread_create()...

    If you are using glibc-2.0, you should upgrade to glibc-2.0.1. 
There is a bug in 2.0 that makes all thread creation fail. 

         Yann Doussot  
 Q73: Can I do dead-owner-process recovery with POSIX mutexes?  

el Gringo wrote:
> Hi.
> I am trying to create a mutex on a program that has to work on NT 4.0 and AIX.
> For NT, I use CreateMutexes...etc, and in this case, if the process owning the
> mutex crashes, the system releases the mutex and returns WAIT_ABANDONED to the
> thread that is waiting for the mutex to be released. And if the mutex is
> opened several times by the same thread, the call succeeds and the mutex count
> is incremented.
> What I don't know is if the pthread mutexes do the same thing when a thread or
> process owning the mutex crashes...or when pthread_mutex_lock() is called
> several times by the same one. Could someone provide me with a doc or web site
> so I can find those answers ? Thanks. Riad



  You got a *big* set of problems to deal with.  The simple answer is that
POSIX mutexes don't do that, but that you can create that kind of behavior
if you want to.

  The problems I refer to are those surrounding what you do when the owner process
crashes.  There is a massive amount of work to be done to ensure that you don't
have corrupted data after you get WAIT_ABANDONED.  Unless you've already taken
care of this, you've got a h*** of a lot of work to do.

  So...  the best answer is go find an expert in this area (or spend a month
becoming one) and hash out the issues.  Building the mutex will be the easiest
part of it.

  Good luck.
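
  A later revision of POSIX (1003.1-2008) added "robust" mutexes, which
give roughly the WAIT_ABANDONED behaviour.  The sketch below assumes a
system that provides pthread_mutexattr_setrobust().  It simulates the
crash with a thread that exits while holding the lock; the standard
permits EOWNERDEAD to be reported in that case and glibc does so.  The
function names are made up for the example, and the hard part (repairing
the data) is still only a comment.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t lock;

/* Simulates a crashing owner: takes the lock and exits holding it. */
static void *die_holding_lock(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    return NULL;
}

/* Returns 1 if the dead owner was detected and the mutex recovered,
   0 if the lock was simply acquired, -1 on any other error. */
int lock_after_owner_died(void)
{
    pthread_mutexattr_t attr;
    pthread_t tid;
    int rc, recovered = 0;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    pthread_mutex_init(&lock, &attr);
    pthread_mutexattr_destroy(&attr);

    if (pthread_create(&tid, NULL, die_holding_lock, NULL) != 0)
        return -1;
    pthread_join(tid, NULL);

    rc = pthread_mutex_lock(&lock);
    if (rc == EOWNERDEAD) {
        /* We own the mutex, but the protected data may be
           inconsistent: repair it here, then mark the mutex
           usable again. */
        pthread_mutex_consistent(&lock);
        recovered = 1;
    } else if (rc != 0) {
        return -1;
    }
    pthread_mutex_unlock(&lock);
    pthread_mutex_destroy(&lock);
    return recovered;
}
```

  For the other half of the question (the same thread locking several
times), the counterpart is a mutex whose type has been set to
PTHREAD_MUTEX_RECURSIVE with pthread_mutexattr_settype().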

 Q74: Will IRIX distribute threads immediately to CPUs?  

Michel Lesoinne   wrote:
>The first question concerns new and delete under C++ as well as malloc.
>Are these functions thread-safe? 
Yes, these calls are thread-safe.

>The second question has to do with multi-CPU machines. I have noticed
>that POSIX thread do not get shipped to other CPU's immediately after
>being started. For example if you bring up gr_osview and watch the CPU
>usage as you start 4 parallel threads, it takes approximately 1 second
>for the 4 threads to run in parallel rather than on the same CPU. Worse,
The current pthread implementation only creates additional sprocs as it
deems necessary given the application activity.  Interaction between
pthreads which may require context switching can lower the requirement
while CPU-bound threads will raise it.  There can be a short delay
"ramping-up" before the ideal number of sprocs are active.  The kernel
is responsible for scheduling the sprocs on CPUs.  Thus you may be
seeing 2 effects (though 1 second seems a little long to me).

>Is there a way to force IRIX to distribute the threads immediately?
Currently there is no way to force this behaviour though we expect to
add tuning interfaces in the future.  You may try experimenting by setting
the environment variable PT_ITC as a hint to the library that your app is
CPU bound.

 Q75: IRIX pthreads won't use both CPUs?  

Dirk Bartz   wrote:
>I've written a parallel program using pthreads on two processors
>which consists of two parallel stages.
>The first stage reads jobs from a queue (protected by a mutex) and
>should process them in parallel (which it doesn't); the second
>stage works fine.
>Now, some debugger sessions of the first stage show that
>both pthreads are started, but the first one is being blocked
>most of the time. The cvd says:
>   0    _procblk() ["procblk.s":15, 0x0fab4d74]
>   1    _blockproc() ["blockproc.c":23, 0x0fab56e0]
>   2    vp_idle() ["vp.c":1702, 0x07fe61e8]
>It seems that the first pthread is only sharing one processor
>with the second thread. It is *not* blocked at the mutex!
>Does anyone have a clue what happened?
    First of all that curious backtrace is from one of the underlying
sprocs on which the pthreads execute.  As you can see it is currently idle
and blocked in the kernel waiting for the library to activate it when more
work (a pthread) is ready to run.  If you use the cvd "showthread all" command
it will show you the pthread state which should in your case be MUTEX-WAIT
for the pthread of interest.  If you then backtrace that pthread you should
see it blocked in the mutex locking code.

A second point to note is that the pthreads library attempts to use an
appropriate number of sprocs for its scheduling.  If your application creates
2 CPU-bound threads then on an MP machine 2 sprocs will be created to run
the threads.  On a UP only one sproc will be created and will switch between
the two pthreads.  On an MP where the threads are not CPU-bound the problem
is more complex; when 2 pthreads are tightly synchronised then a single
sproc may be a better choice - this may be what you are seeing.

I hope the above explains what you are seeing.

 Q76: Are there thread mutexes, LWP mutexes *and* kernel mutexes?  

> In a typical "two level" scheduling scheme, say solaris,
> synchronization primitives used at the thread level (POSIX or solaris)
> are provided by the user level scheduler library.  At the LWP level,
> are there any synchronization primitives, and if so, where would one
> use those as opposed to using the user level library primitives?
> Of course, there would be some synchronization primitives for the
> kernel use.  Does it mean that there are 3 distinct set of primitives
> (user level, LWP level and kernel level)?  Can anyone throw some light
> on the LWP level primitives (if any) and point out where these would be
> useful?

  You may remember that scene in the Wizard of Oz, where Toto runs away
in panic at the sight of the Powerful Oz.  He discovers a little man 
running the machinery behind a curtain.  The catch-line was "Pay no attention
to that man behind the curtain."

  Same thing here.  You call pthread_mutex_lock() and it does whatever it
needs to so that things work.  End of story.

  But if you *really* want to peek...  If the mutex is locked, then the
thread knows that it needs to go to sleep, and it calls one set of routines
if it's an unbound thread, another if it's bound.  (If you hack around inside
the library code, you'll be able to see the guts of the thing, and you'll
find calls to things like _lwp_mutex_lock().  You will NEVER call those!)

  Now, as for kernel hacking, that's a different picture.  If you are going
to go into the kernel and write a new device driver or fix the virtual 
memory system, you'll be working with a different interface.  It's similar
to pthreads, but unique to each kernel.  The older OSs didn't even HAVE a
threads package!
 Q77: Does anyone know of a MT-safe alternative to setjmp and longjmp?  
>      I am taking an operating systems class; therefore, my
>    question will sound pretty trivial.  Basically, I am
>    trying to create thread_create, thread_yield, and thread_exit
>    functions.  Basically, I have two files.  They compile fine and
>    everything but whenever I try to run the program I get the error:
>    "longjmp or siglongjmp function used outside of saved context
>     abort process"
>    All I know is that we are running this on an alpha machine at
>    school [...]
>    Anyway, I just want to know if anyone has ever tried to do a longjmp from a
>    jmp_buf that was not the same as that used in setjmp.
> The runtime environment provided with some operating systems (e.g.,
> Ultrix or whatever DEC `Unix' is called these days) performs an
> explicit check that the destination stack frame is an ancestor of
> the current one.  On these systems you cannot use setjmp/longjmp
> (as supplied) to implement threads.
> On systems whose longjmp is trusting, setjmp/longjmp is a very common
> way of building user-space threading libraries.  This particular wheel
> has been reinvented many times.
> If you know the layout of a jmp_buf, you *can* use setjmp but you will
> have to implement a compatible longjmp yourself in order to change the
> processor context to that of the next task.  If you have a
> disassembler you might be able to reverse engineer a copy of longjmp
> with the check disabled.
> *I* would consider this outside the scope of such an exercise but your
> professor may disagree.
> Steve
> --
> Stephen Crane, Dept of Computing, Imperial College of Science, Technology and
> Medicine, 180 Queen's Gate, London sw7 2bz, UK:jsc@{, icdoc.uucp}
> Unix(tm): A great place to live, but a terrible place to visit.
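
Where it is available, the ucontext family (getcontext, makecontext,
swapcontext) does the context switch without tripping longjmp's
ancestor-frame check.  This is a different technique from the
setjmp/longjmp approach discussed above.  These functions were removed
from POSIX in the 2008 revision, though glibc and others still ship
them.  A minimal sketch of one task that yields once; task() and
run_task() are names invented for the example.

```c
#include <ucontext.h>

static ucontext_t main_ctx, task_ctx;
static char task_stack[64 * 1024];
static int steps_run = 0;

/* Hypothetical task body: one step, a yield, then another step. */
static void task(void)
{
    steps_run++;
    swapcontext(&task_ctx, &main_ctx);   /* yield back to run_task() */
    steps_run++;
    /* returning here resumes the context named by uc_link */
}

/* Runs the task to completion and returns the number of steps run. */
int run_task(void)
{
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp   = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link          = &main_ctx;
    makecontext(&task_ctx, task, 0);

    swapcontext(&main_ctx, &task_ctx);   /* run until the first yield */
    swapcontext(&main_ctx, &task_ctx);   /* resume until task returns */
    return steps_run;
}
```

A thread_create() built this way allocates one stack and one
ucontext_t per thread, and thread_yield() is a swapcontext() to
whichever context the scheduler picks next.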
 Q78: How do I get more information inside a signal handler?   
Mark Lindner wrote:
> I'm writing a multithreaded daemon that supports dynamic runtime loading
> of modules (.so files). I want it to be able to recover from signals such
> as SIGSEGV and SIGFPE that are generated by faulty module code. If a given
> module causes a fault, I want the daemon to unload that module so that
> it's not called again.
> My problem is that once a signal is delivered, I don't know which worker
> thread it came from, and hence I have no idea which module is faulty. The
> O'Reilly pthreads book conveniently skirts this issue. I poked around on
> the system and found the getcontext() call; I tried saving the context for
> each worker thread, and then using the ucontext_t structure passed as the
> 3rd argument to the signal handler registered by sigaction(), but
> unfortunately I can't find anything that matches...the contexts don't even
> appear to be the same.
> Since the behavior of pthreads calls is undefined within a signal handler,
> I can't use pthread_self() to figure out which thread it is either.
> All examples I've seen to date assume that either:
> a) only one thread can generate a given signal
> or
> b) two or more threads can generate a given signal, but the signal handler
> does the same thing regardless of which thread generated it.
> My situation doesn't fall into either of these categories.
> Any help would be appreciated.
> --
> Cheers!
> Mark
> ------------------------------------------------------------------------------
>       |
> ------------------------------------------------------------------------------
>                    I looked up
>                    As if somehow I would grasp the heavens
>                    The universe
>                    Worlds beyond number...
> ------------------------------------------------------------------------------
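
For reference, a minimal sketch of the three-argument handler the
poster mentions, installed with SA_SIGINFO.  It only records what it
is told; the names on_signal() and install_info_handler() are invented
for the example.  POSIX does say that a signal caused by a hardware
fault (SIGSEGV, SIGFPE) is generated for the thread that caused it, so
the handler runs in the faulting thread, but working out which module
that thread was executing is left open here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t last_signo = 0;
static volatile sig_atomic_t last_code  = 0;

/* Three-argument handler: info describes the event (si_signo,
   si_code and, for SIGSEGV, the faulting address in si_addr);
   context points to a ucontext_t. */
static void on_signal(int signo, siginfo_t *info, void *context)
{
    (void)context;
    last_signo = signo;
    last_code  = info->si_code;
}

/* Install on_signal() for `signo`.  Returns 0 on success. */
int install_info_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags     = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}
```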
 Q79: Is there a test suite for Pthreads?  

Re: COMMERCIAL: Pthreads Test Suite Available
The Open Group VSTH test suite for Threads implementations of
POSIX 1003.1c-1995 and  the X/Open System Interfaces (XSH5) Aspen
threads extensions is now generally available.
For further information on the test suite see
For information on the Aspen threads extensions see

> Andrew Josey, Email:
 Q80:  Flushing the Store Buffer vs. Compare and Swap  

Just looking at the CAS and InterLockedXXX instructions... 

  "Hey!" says I to myself, "Nobody's minding the store buffer!"
A couple of people have shown some examples of using InterLockedXXX
in Win32, but they never address memory coherency!

  So, if they implement a mutex with InterLockedExchange:

lock(int *lock)
{while (InterLockedExchange(lock, 1) == 1) sleep();} 

unlock(int *lock)
{*lock = 0;}

  at unlock time, some changed data might not be written out to
main memory.  Hence we need this:

unlock(int *lock)
{/* flush the store buffer here, before the lock is released */
 *lock = 0;}

  Or is there something about x86 that I don't know about here?
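
  In portable code the same concern is expressed today with C11
atomics, which postdate this discussion.  A minimal sketch of the lock
above written that way; demo_lock and its two functions are names made
up for the example.  The release store in the unlock is what guarantees
that data written inside the critical section is visible to the next
thread that acquires the lock.

```c
#include <stdatomic.h>

typedef struct { atomic_int held; } demo_lock;

void demo_lock_acquire(demo_lock *l)
{
    /* Acquire: later loads and stores cannot move above the exchange. */
    while (atomic_exchange_explicit(&l->held, 1,
                                    memory_order_acquire) == 1)
        ;   /* spin; a real lock would yield or sleep here */
}

void demo_lock_release(demo_lock *l)
{
    /* Release: every earlier store becomes visible to the next
       thread that acquires the lock before it can see held == 0. */
    atomic_store_explicit(&l->held, 0, memory_order_release);
}
```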
 Q81: How many threads CAN a POSIX process have?  

Dave Butenhof wrote:
> Bryan O'Sullivan wrote:
> >
> > r> _POSIX_THREAD_THREADS_MAX that claims to be the maximum threads per
> > r> process.
> >
> > As I recall, this is a minimum requirement.  Solaris certainly
> > supports far more than 64 threads in a single process, and I'm sure
> > that Irix does, too.
> POSIX specifies two compile-time constants, in <limits.h>, for each
> runtime limit. One is the MINIMUM value of that MAXIMUM which is
> required to conform to the standard. _POSIX_THREAD_THREADS_MAX must be
> defined to 64 on all conforming implementations, and all conforming
> implementations must not arbitrarily prevent you from creating at least
> that many threads.
> The symbol PTHREAD_THREADS_MAX may ALSO be defined, to the true limit
> allowed by the system, IF (and only if) that limit is fixed and can be
> predicted at compile time. (The value of PTHREAD_THREADS_MAX must be at
> least 64, of course.) I don't know of any systems that define this
> symbol, however, because we don't implement any fixed limit on the
> number of threads. The limit is dynamic, and dictated purely by a wide
> range of resource constraints within the system. In practice, the only
> way to predict how many threads you can create in any particular
> situation is to bring a program into precisely that situation and count
> how many threads it can create. Remember that the "situation" includes
> the total size of your program text and data, any additional dynamic
> memory used by the process (including all shared libraries), the virtual
> memory and swapfile limits of the current system, and, in some cases,
> the state of all other processes on the system.
> In short, the concept of "a limit" is a fiction. There's no such thing,
> without knowing the complete state of the system -- rarely practical in
> real life.
> Oh, and by the way, there's no guarantee (in POSIX or anywhere else)
> that you can create even 64 threads. That just means that the system
> cannot arbitrarily prevent you from creating that many. If you use up
> enough virtual memory, you may be unable to create even one thread.
> That's life.
> As Bryan said, you can normally rely on being able to create hundreds,
> and usually thousands, of threads on any of the current 2-level
> scheduling POSIX threads implementations. Kernel-level implementations
> are typically more limited due to kernel quotas on the number of kernel
> thread slots available for the system and often for each user.
 Q82: Can Pthreads wait for combinations of conditions?  

> Is there any way in Pthreads to wait for boolean combinations of conditions
> (i.e. wait for any one of a set of conditions or wait until all of a set of
> conditions have occurred). I'm looking for a feature similar to the VMS
> Wait for logical OR of event flags or the OS/2 multiplexed semaphores.

  You mean something like this:

void *consumer(void *arg)
{request_t *request;

    while ((length == 0) && (!stop))    /* While both are true, sleep */
      pthread_cond_wait(&requests_consumer, &requests_lock);
    if (stop) break;
    request = remove_request();

Or perhaps:

    while ( ((length == 0) && (!stop))  ||
        (age_of(granny) > 100) ||
        (no_data_on_socket(fd) && still_alive(client)) ||
        (frozen_over(hell))  )
      pthread_cond_wait(&requests_consumer, &requests_lock);

  Nope.  Can't be done  :-)

  Now if you're thinking about something that involves blocking, it may be
a bit trickier.  In OS/2 or Win32 you might think in terms of:

  WaitForMultipleObjects(Mutex1 and Mutex2)

you'll have to do a bit extra.  Perhaps you'll have two different threads 
blocking on the two mutexes:

    Thread 1

    Thread 2

    Thread 3
 while (!M1 || !M2)
   pthread_cond_wait(&requests_consumer, &requests_lock);

  I think this looks sort of ugly.  More likely you'll find a better way 
to structure your code.
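
  For the event-flag style the question asks about, the same predicate
loop can be wrapped up once.  This is a minimal sketch, not a standard
interface: event_flags_t and its functions are names invented here.
One mutex and one condition variable guard a word of flag bits, and
the waiter's predicate decides whether it wants AND or OR.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    unsigned        bits;
} event_flags_t;

#define EVENT_FLAGS_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Set the given flag bits and wake every waiter to re-check. */
void event_flags_set(event_flags_t *ef, unsigned mask)
{
    pthread_mutex_lock(&ef->lock);
    ef->bits |= mask;
    pthread_cond_broadcast(&ef->changed);
    pthread_mutex_unlock(&ef->lock);
}

/* Block until every bit in mask is set (logical AND).
   Returns all bits that were set at that moment. */
unsigned event_flags_wait_all(event_flags_t *ef, unsigned mask)
{
    unsigned seen;

    pthread_mutex_lock(&ef->lock);
    while ((ef->bits & mask) != mask)
        pthread_cond_wait(&ef->changed, &ef->lock);
    seen = ef->bits;
    pthread_mutex_unlock(&ef->lock);
    return seen;
}

/* Block until at least one bit in mask is set (logical OR).
   Returns the bits from mask that were set. */
unsigned event_flags_wait_any(event_flags_t *ef, unsigned mask)
{
    unsigned seen;

    pthread_mutex_lock(&ef->lock);
    while ((ef->bits & mask) == 0)
        pthread_cond_wait(&ef->changed, &ef->lock);
    seen = ef->bits & mask;
    pthread_mutex_unlock(&ef->lock);
    return seen;
}
```

  The broadcast (rather than signal) matters: different waiters may be
waiting on different masks, so each must get the chance to re-test its
own predicate.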

 Q83: Shouldn't pthread_mutex_trylock() work even if it's NOT PTHREAD_PROCESS_SHARED?  


  I infer you're trying to get around the lack of shared memory SVs
in some of the libraries by only using trylock?  I can't say that I
approve, but it ought to work...

  In the code example below I hacked up an example which does seem to
do the job.  I can't tell you what you were seeing in your tests.
Hmmm...  Just because this particular hack works on one OS does not
mean that it will necessarily work on another.  (Let's say I wouldn't
stake MY job on it!) 

  What about using shared semaphores?  Maybe SysV semaphores?


> HI,
> I'm having a problem with pthread_mutex_unlock () on Solaris 2.5 for a
> pthread_mutex_t inited in a shared memory structure between 2 tasks.
> I get pthread_mutex_trylock (lockp) to return zero, and both tasks
> agree the mutex is locked.
> When the owning task calls pthread_mutex_unlock (lockp), it returns
> zero, but the other task's pthread_mutex_trylock (lockp) still believes
> the mutex is locked??
> FAQ location or help?  Thanks.
> Here's how I initialized the pthread_mutex_t struct:
> In a shared memory struct:
> pthread_mutex_t         mutex_lock =  PTHREAD_MUTEX_INITIALIZER;
> Then either task may call:
> pthread_mutex_trylock (&mutex_lock)
> pthread_mutex_unlock (&mutex_lock)
> I've had little luck with  pthread_mutexattr_setpshared () to init for
> the "shared" scenario (core dumped) and especially that this is a Sun'ism
> that doesn't exist in DEC Unix 4.0b, which is another requirement, that
> the solution be portable to all/most Unix'es with threads.
> Thanks.
> Curt Smith

       sleep(4-i/2);    /* wait a second to make it more interesting*/
      if (!err) {pthread_mutex_unlock(&buf->lock2);
          printf("Unlocked by parent\n");

  printf("Parent PID(%d): exiting...\n", getpid());

main(int argc, char *argv[])
{int fd;
 pthread_mutexattr_t mutex_attr;

 /* open a file to use in a memory mapping */
 fd = open("/dev/zero", O_RDWR);

 /* Create a shared memory map with the open file for the data 
    structure which will be shared between processes. */

 buf=(buf_t *)mmap(NULL, sizeof(buf_t), PROT_READ|PROT_WRITE,
           MAP_SHARED, fd, 0);

 /* Initialize the counter and SVs.  PTHREAD_PROCESS_SHARED makes
    them visible to both processes. */

/* pthread_mutexattr_setpshared(&mutex_attr, PTHREAD_PROCESS_SHARED); */

 pthread_mutex_init(&buf->lock2, &mutex_attr);

 if (fork() == 0)
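
  For comparison, a self-contained sketch of the supported route: a
mutex in shared memory, initialized with PTHREAD_PROCESS_SHARED.  It
assumes a system that offers that option and MAP_ANONYMOUS (the
/dev/zero mapping used above would serve as well).  The function name
is made up for the example.  Note that the attributes object must be
initialized with pthread_mutexattr_init() before
pthread_mutexattr_setpshared() is called on it.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Parent locks a process-shared mutex, then forks a child that calls
   pthread_mutex_trylock() on it.  Returns 1 if the child saw EBUSY,
   0 if it did not, -1 on setup failure. */
int child_sees_parent_lock(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t *lock;
    pid_t pid;
    int status, result = -1;

    /* Shared mapping, inherited by the child across fork(). */
    lock = mmap(NULL, sizeof *lock, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (lock == MAP_FAILED)
        return -1;

    pthread_mutexattr_init(&attr);       /* must precede setpshared */
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(lock, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(lock);            /* parent owns the mutex */
    pid = fork();
    if (pid == 0)                        /* child reports via exit code */
        _exit(pthread_mutex_trylock(lock) == EBUSY ? 0 : 1);
    if (pid > 0 && waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = (WEXITSTATUS(status) == 0);

    pthread_mutex_unlock(lock);
    pthread_mutex_destroy(lock);
    munmap(lock, sizeof *lock);
    return result;
}
```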
 Q84: What about having a NULL thread ID?  

This is part of an on-going discussion.  The POSIX committee decided not to
do this.  It is, of course, possible to implement non-portable versions
yourself.  You would have to have a DIFFERENT version for different OSs.
-1L works fine for Solaris 2.5, and IRIX 6.2, but not HP-UX 10.30, which
requires (I think!) {-1, -1, -1}.  BE CAREFUL!!!

Now for the discussion...

Ben Self wrote:
> Ian Emmons wrote:
> >
> > So why not support a portable "null" value?  This
> > could be done via a macro that can be used to initialize a pthread_t (just
> > like the macro that initializes a mutex), or it could be done via a couple
> > of functions to set and test for the null value.  Or, the POSIX folks could
> > do as they are used to doing, and make us code around their omissions with
> > YAC (Yet Another Class).
> >
> > Ian
> >
> As I stated before I think that it is a very natural thing to want to
> do.  In my experience POSIX's omissions are usually more interesting
> than simple oversights.  Often an existing industry code base stands in
> the way or it was deemed too 'trivial' a matter of user code to merit
> imposing any restriction on implementation.  in any case, for all
> intents and purposes, it is a done deal.
> As Dave Butenhof candidly exposes a few posts down:
> "due to
> overwhelming agreement that it was a bad (and unnecessary) complication.
> The Aspen committee added suspend/resume to UNIX98 -- but the functions
> were later removed with no significant objections.
> There simply is no significant industry consensus supporting these
> functions. (And that for many very good technical reasons as well as
> some silly political reasons.)"
> that is exactly how POSIX works.  gotta love 'em
> --Ben
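
A minimal sketch of the portable do-it-yourself route (the "Yet
Another Class" Ian mentions, in C form): carry an explicit flag next
to the pthread_t instead of reserving a magic ID value.  thread_ref_t
and its functions are names invented for this example.

```c
#include <pthread.h>

/* Hypothetical wrapper: a pthread_t plus an explicit "is set" flag,
   so no particular pthread_t value has to mean "null". */
typedef struct {
    pthread_t id;
    int       valid;
} thread_ref_t;

void thread_ref_clear(thread_ref_t *r)
{
    r->valid = 0;
}

void thread_ref_set(thread_ref_t *r, pthread_t id)
{
    r->id    = id;
    r->valid = 1;
}

int thread_ref_is_null(const thread_ref_t *r)
{
    return !r->valid;
}

/* True only if the reference is set and names the given thread. */
int thread_ref_is(const thread_ref_t *r, pthread_t id)
{
    return r->valid && pthread_equal(r->id, id);
}
```

This costs one int per reference and works the same on every
platform, which the -1L trick does not.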
 Q85: Explain Traps under Solaris  

Jim Moore - Senior Engineer, SunSoft wrote:
Email : Jim.Moore@UK.Sun.COM               |   DNRC: The Day Cometh
SunSoft Technical Support (Europe)         |   "adb is your friend"


                        SPARC traps under SunOS (Solaris)
                By: Jim Moore, SunSoft, Sun Microsystems Inc
                      Email: Jim.Moore@UK.Sun.COM


        1       Introduction
        1.1     Who should read this document?

        2       What is a trap?
        2.1     How traps are caused
        2.1.1   Precise Traps
        2.1.2   Deferred Traps
        2.1.3   Disrupt/Interrupt Traps
        2.2     How traps are dispatched to the kernel
        2.2.1   SPARC v7/v8
        2.2.2   SPARC v9
                Processor states, normal and special traps
                Normal Trap (Processor in Execute State)
                Special Trap (Processor in RED State)

        3       Traps - How SunOS Handles Them
        3.1     Generic Trap Handling
        3.2     Register Windows
        3.3     Interrupts



1       INTRODUCTION

        This document describes what traps are, how they work and how they
        are handled by the SunOS kernel.   We will look at traps for SPARC
        versions v7, v8 and v9 (v7 and v8 traps are essentially identical).

        In places, we will have to differentiate between the v8 and v9
        quite extensively as there are significant differences between the
        two.

        I assume that the readers are familiar with SPARC registers but I
        will give some expansion on the more obscure ones as I go :-)

        Finally, I have made every effort to make this accurate as well as
        informative but at the same time without too lengthy descriptions.
        Even so, in parts it may be heavy going and plain ascii doesn't
        leave much scope for clear diagrams.  Feel free to post or email
        questions and comments!

1.1     Who should read this document?

        Anyone who wants to know more about traps in detail  on  the SPARC
        architecture.  I strongly recommend that you refer to one of these
        two books for more information:

        The SPARC Architecture Manual, Version 8        ISBN 0-13-099227-5
        The SPARC Architecture Manual, Version 9        ISBN 0-13-825001-4

        as they contain more detail on some of these topics.

2       WHAT IS A TRAP?

        The design of SPARC as a RISC processor means that a lot of the
        functionality that is normally controlled by complex instructions
        has to be done by supervisor (kernel) software.  Examples of these
        could be memory exception handling or interrupt handling.  When
        a situation arises that requires special handling in this way,
        a trap occurs to cause the situation to be handled.  We'll look
        at this mechanism in more detail in this section.

2.1     How Traps are Caused

        Traps can be generated for a number of reasons and you can see
        a list of traps under /usr/include/sys/v7/machtrap.h or under
        /usr/include/sys/v9/machtrap.h for SPARC v7/v8 and v9 respectively.

        A trap can be caused either by an exception brought about by the
        execution of an instruction or by the occurrence of some external
        interrupt request not directly related to the instruction.  When
        the IU (Integer Unit, the part of the CPU that contains the
        general purpose registers, does integer math and executes the
        instructions) is about to execute an instruction it first checks
        to see if there are any exception or interrupt conditions pending
        and, if so, it selects the highest priority one and causes a trap.

        Traps are also used to signal hardware faults and malfunctions,
        for example a level15 asynchronous memory fault.  In some fatal
        conditions execution cannot continue and the machine will halt
        or the supervisor software will handle the trap by panicking.

        Next, we'll take a generic look at the different trap categories
        but we'll go into the version specific details later on.

2.1.1   Precise Traps

        A precise trap is brought about by an exception directly caused by
        the executing instruction.  This trap occurs before there is any
        tangible change in the program state of the program that contained
        the trapped instruction.

2.1.2   Deferred Traps

        A deferred trap is similar to a precise trap but in this case the
        program-visible state may have changed by the time the trap occurs.
        Such a trap may in theory occur one or more instructions after the
        trap inducing instruction has executed but it must occur before
        any subsequent instruction attempts to use any modified register
        or resource that the trap inducing instruction used.

        Did you get that?  Hmm.  Okay, here's an example.  Imagine that
        a floating point operation is being executed.  This does not
        happen synchronously with IU instructions and so it is possible
        that a floating point exception could occur as a deferred trap.

2.1.3   Disrupt/Interrupt Traps

        An interrupt trap, as you have probably guessed, is basically the
        assertion of an interrupt, either generated externally (from
        hardware) or internally (via software).  The delivery of interrupts
        is controlled by the PIL (Processor Interrupt Level) field of the
        PSR (Processor State Register), which specifies the minimum
        interrupt level to allow, and also by the mask of asserted bits in
        the IE (Interrupt Enable register...architecture specific).

        Under SPARC v9, we have a concept of a disrupt trap.  This is very
        similar to a deferred trap in that it could be related to an earlier
        instruction but in this case the trap is an unrecoverable error.

2.2     How Traps are Dispatched to the Kernel

        In this section we will look at the flow of execution into the
        kernel when a trap occurs.  This is different for SPARC v7/v8
        and v9 so we will split this section into two.

2.2.1   SPARC v7/v8

        When a trap occurs, the flow of execution jumps to an address which
        is calculated from the TBR (Trap Base Register) and the Trap Type,
        hereafter referred to as TT.  The sequence is as follows:

        1.  An exception/interrupt has been detected as pending by the IU

        2.  The IU multiplies the TT by 16 (TT << 0x4) as there are 4
            instructions per trap table entry.

        3.  The IU loads the address of the trap table (from the TBR) and
            adds the offset calculated in (2).

        4.  The CWP (Current Window Pointer) is decremented, so that we are
            in a new register window.

        5.  The trapped instruction (%pc) and the next instruction to be
            executed (%npc) are written into local registers %l1 and %l2

        6.  Traps are disabled and the current processor mode is set to
            "supervisor".  This is done by setting the ET bit to zero and
            the supervisor mode bit to one in the PSR (refer to the PSR
            description in /usr/include/v7/sys/psr.h).

        7.  Execution resumes at [TBR + (TT<<4)], as calculated in (3)
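
        The address arithmetic in steps 2, 3 and 7 can be sketched in C.
        This is only an illustration: the function name is made up, and
        masking off the low 12 bits of the TBR assumes the usual v8 layout
        in which those bits hold the trap type field and zeros.

```c
#include <stdint.h>

/* Address of the trap table entry for trap type tt (SPARC v7/v8).
 * Assumption: the table base lives in the upper 20 bits of the TBR. */
static uint32_t v8_trap_vector(uint32_t tbr, uint8_t tt)
{
    uint32_t base = tbr & 0xfffff000u;   /* trap table base address   */
    return base + ((uint32_t)tt << 4);   /* 16 bytes = 4 instructions */
}
```

        For example, with the table at 0xf0004000, a system call trap
        (TT = 0x88) lands on the entry at 0xf0004880.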

        Part of the SunOS kernel code is a trap table, which contains 256
        4-instruction entries, each entry corresponding to a trap type from
        0 to 0xff.  This structure is defined by the SPARC implementation.

        Each trap table entry basically contains a branch to a trap handling
        routine and may also load the PSR into a local register for use
        later.  Here's an example of a trap table entry:

                sethi   %hi(trap_handler), %l3          ! Load trap handler
                jmp     [%l3 + %lo(trap_handler)]       ! address and jump
                mov     %psr, %l0                       ! Delay: load %psr

2.2.2   SPARC v9

        The SPARC v9 case is quite different from the v7/v8 case mainly
        due to the concept of processor states and trap nesting.

        We still use a trap table concept under v9 but the destination
        address for the transfer of execution is calculated differently.
        Also, trap table entries for v9 are 8 instructions in size,
        except for spill/fill traps, in which case the entries are 32
        instructions in size.  Also, in a special state called the RED
        state (more on that later) we actually use a different trap table!

        Pretty different, huh?

        The trap table is divided into three parts.  The first half of the
        table is used for machine generated traps.  The next quarter is
        reserved for software initiated traps and the final quarter is
        reserved for future use.  The displacement into the trap table is
        defined by Trap Level (TL) and the Trap Type (TT) together.

        Let's take a look at this in some more detail.  I strongly advise
        that you obtain a copy of the version 9 SPARC architecture manual
        if you want to follow this in detail.

        When a trap occurs, the action taken depends on the TT, the current
        level of trap nesting (contained in the TL) and the processor state
        at that time.  Let's look at processor states and what we mean by
        normal and special traps so that the rest of this section has more
        chance of making sense!

Processor States, Normal and Special Traps

        The SPARC v9 processor is always in one of three states and these
        are:

        1.  Execute state.  This is the normal execution state.

        2.  RED state.  RED = Reset, Error and Debug.  This is a state that
            is reserved for handling traps when we are at the penultimate
            level of trap nesting or for handling hardware or software
            resets.

        3.  Error state.  This is a state that is entered when we have a
            trap occur at a point in time when we are at our maximum level
            of trap nesting (MAXTL) or an unrecoverable fatal error has
            occurred.

        Normal traps are traps that are processed when we are in the nice
        cosy execute state.  If we trap in RED state, then this is a special
        trap.  There is an implementation dependent address called RSTVaddr
        which contains the vector to the RED state trap table.  This vector
        could be set to overlay the same one in the TBR.  For a given trap
        in RED state we vector as follows:

        TT      Vector          Reason

        0       RSTVaddr|0x0    SPARC v8 style reset
        1       RSTVaddr|0x20   Power On Reset (POR)
        2       RSTVaddr|0x40   Watchdog Reset (WDR)
        3       RSTVaddr|0x60   Externally Initiated Reset (XIR)
        4       RSTVaddr|0x80   Software Initiated Reset (SIR)
        Others  RSTVaddr|0xa0   All other traps in RED state
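
        The table boils down to a tiny lookup.  A sketch in C follows; the
        function name is hypothetical and the offsets are simply the ones
        listed above.

```c
#include <stdint.h>

/* RED state trap vector for trap type tt, per the table above.
 * TT 0..4 map to offsets 0x00, 0x20, 0x40, 0x60, 0x80; all other
 * trap types share the entry at offset 0xa0. */
static uint64_t red_state_vector(uint64_t rstvaddr, unsigned tt)
{
    uint64_t offset = (tt <= 4) ? (uint64_t)tt * 0x20 : 0xa0;
    return rstvaddr | offset;
}
```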

        A fatal exception that causes us to drop into error state
        will cause the processor to note the exception and either halt,
        reset or watchdog reset.  After the reset, the processor enters
        RED state with a TL appropriate to the type of reset (usually
        maximum).  Also, the TT is set to the value of the original trap
        that caused the reset and NOT the TT value for the reset itself
        (ie. WDR - Watchdog reset or XIR - Externally Initiated Reset).

        Now that we have a concept of the different traps and processor
        states, let's look at the sequence of execution when a trap occurs
        to deliver the trap to the supervisor (kernel).

Normal Trap (Processor in Execute State)

        1.  If TL = MAXTL-1, the processor enters RED state (go to the
            Special Trap sequence below).

        2.  TL = TL + 1

        3.  Processor state, %pc, %npc, CWP, CCR (Condition Code Register),
            TT and ASI (Address Space Identifier register) are saved.

        4.  The PSTATE (Processor State) register is updated as follows:

                a) RED field set to zero
                b) AM (Address Masking) disabled
                c) PRIV (Privileged Mode) enabled
                d) IE cleared, disabling interrupts
                e) AG set (Alternate Global Registers enabled)
                f) Endian mode set for traps (TLE)

            Refer to the architecture manual for a description of PSTATE

        5.  If TT is a register window trap, CWP is set to point to the
            register window to be accessed by the trap handler code.
            Possibilities are:

                a) TT = 0x24 (Clean Window), CWP = CWP + 1
                b) TT >= 0x80 AND TT <= 0xbf (Window Spill),
                   CWP = CWP + CANSAVE + 2.  CANSAVE is a register that
                   contains the number of register windows following the
                   CWP that are NOT in use.
                c) TT >= 0xc0 AND TT <= 0xff (Window fill),
                   CWP = CWP - 1

            For non-register window traps, CWP remains unchanged.

        6.  Control is transferred to the trap table at an address calculated
            as follows:

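                New %pc  = TBA<63:15> | (TL>0 ? 1 : 0) | TT | 0x00
                New %npc = TBA<63:15> | (TL>0 ? 1 : 0) | TT | 0x04

            Here "|" joins bit fields: TBA supplies bits 63:15, the
            (TL>0) flag is bit 14, TT fills bits 13:5 and the low five
            bits select the instruction within the 32-byte entry.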

            Remember, TBA = Traptable Base Address, similar to the TBR in v8
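
            As a rough sketch in C, the vector can be computed like this.
            The bit positions (TBA in bits 63:15, the TL>0 flag in bit 14,
            TT in bits 13:5) are my reading of the v9 architecture manual,
            and I am assuming the flag reflects TL as it was when the trap
            was taken; the function names are made up.

```c
#include <stdint.h>

/* New %pc for a normal SPARC v9 trap.
 * tba:       Trap Base Address register (only bits 63:15 are used)
 * tl_before: trap level at the time of the trap (assumed pre-increment)
 * tt:        9-bit trap type */
static uint64_t v9_trap_pc(uint64_t tba, unsigned tl_before, unsigned tt)
{
    uint64_t base = tba & ~UINT64_C(0x7fff);                 /* bits 63:15 */
    uint64_t half = tl_before > 0 ? UINT64_C(1) << 14 : 0;   /* bit 14     */
    uint64_t slot = (uint64_t)(tt & 0x1ff) << 5;             /* bits 13:5  */
    return base | half | slot;
}

/* New %npc is simply the next instruction in the same entry. */
static uint64_t v9_trap_npc(uint64_t tba, unsigned tl_before, unsigned tt)
{
    return v9_trap_pc(tba, tl_before, tt) | 0x4;
}
```

            The shift by 5 matches the 8-instruction (32-byte) entries
            mentioned earlier.
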
            Execution then resumes at the new %pc and %npc.

Special Trap (Processor in RED State)

        1.  TL = MAXTL

        2.  The existing state is preserved as in the normal trap sequence,
            step 3.

        3.  The PSTATE is modified as per the normal trap sequence, step 4,
            except that the RED field is asserted.

        4.  If TT is a register window trap, CWP processing occurs as in
            the normal trap sequence, step 5.

        5.  Implementation specific state changes may occur.  For example,
            the MMU may be disabled.

        6.  Control is transferred to the RED state trap table subject to
            the trap type.  Look back to the RSTVaddr table in the
            discussion of processor states to see how this vector is made.

        This may seem rather complicated but once you have the picture
        built clearly it will all fall into place.  Post or email if you
        need clarification.


        In this section we will look at how SunOS handles traps and look
        at some of the alternatives which were available.  Despite all the
        differences between SPARC v8 and v9 traps I'll do a fairly generic
        description here as it really isn't necessary to describe in detail
        what SunOS does for v9 traps as you can see from the previous
        section what the differences in trap processing are.  Suffice to
        say that the SunOS kernel adheres to those rules.  Instead, we'll
        concentrate on the principles used by the kernel when handling
        various traps.

3.1     Generic Trap Handling

        We'll look at some specifics in a moment but first we'll cover the
        generic trap handling algorithm.

        When traps are handled, the typical procedure is as follows:

        1.  Check CWP.  If we need to handle the trap by jumping to
            'C' (which would use save and restore instructions between
            function calls) then we must make sure we won't cause
            an overflow when we dive into 'C'.  If we do detect that
            this would be a problem we do the overflow processing now.

        2.  Is this an interrupt?  If so, jump to the interrupt handler.
            Refer to section 3.3 on interrupts.

        3.  Enable traps and dive into the standard trap handler.  We
            enable traps so that we can catch any exceptions brought
            about by handling *this* trap without causing a watchdog reset.

        4.  On return from the trap handler, we check the CWP with the
            CWP we came in with at the start to see if we have to undo
            the overflow processing we might have done before, so that
            we don't get an underflow when we return to the trapped
            instruction (or worse, execution continues in the WRONG window).

        5.  Before we actually return to the trapped instruction, we check
            to see if kprunrun is asserted (ie. a higher priority lightweight
            process is waiting to run).  If so, we allow preemption to
            occur.

        Traps are used by SunOS for system calls as well as for machine
        generated exceptions.  The parameters to the system call are
        placed in the output registers, the number of the system call
        required (see /usr/include/sys/syscall.h) is placed in %g1 and
        then it executes a "ta 0x8" instruction.  This appears in the kernel
        as a trap with TT = 0x88 and the system trap handler determines
        this to be a system call and calls the relevant function as per
        the system call number in %g1.

        Occasionally, a process will attempt to execute from a page of
        VM that is not mapped in (ie. it is marked invalid in the MMU)
        and this will cause a text fault trap.  The kernel will then
        attempt to map in the required text page and resume execution.
        However, if the process does not have the correct permissions
        or the mapping cannot be satisfied then the kernel will mark
        a pending SIGSEGV segmentation violation against that process
        and then resume execution of the process.  A similar scenario
        applies to data faults; a process attempts to read or write to
        an address in a page marked invalid in the MMU and the kernel
        will attempt to map in the corresponding page for this address
        if possible (ie. maybe the page has been swapped out or this
        is the first attempt to read from that page and so we demand-page
        it in).  I'll explain all this in detail in another text on
        process address spaces, paging and swapping which I plan to do
        as soon as I get time.

        A "bad trap" is simply a trap that cannot be handled (or isn't
        supported).  Usually under SunOS a bad trap has a type of 9 or
        2, for data or text fault respectively (maybe 7 for alignment in
        some cases).

3.2     Register Windows

 Q86: Is there anything similar to posix conditions variables in Win32 API ?  

Ian Emmons wrote:
> Dave Butenhof wrote:
> >
> > kumari wrote:
> > >
> > > Is there anything similar to posix conditions variables in Win32 API ?
> > > Thanks in advance for any help.
> > > -Kumari
> >
> > The answer depends very much upon what you mean by the question. Win32
> > has "events", which can be used to accomplish similar things, so the
> > answer is clearly "yes". Win32 events, however, behave, in detail, very
> > differently, and are used differently, so the answer is clearly "no".
> > Which answer do you prefer? ;-)
> Good answer, Dave.  This is one of the most frustrating things about the
> Win32 threading API.  CV's are incredibly powerful and fairly easy to use,
> but Win32 unfortunately omitted them.
> In WinNT 4.0, there is a new API called SignalObjectAndWait which can be used
> to implement a CV pretty easily.  There are two problems:
> (1) This API is not available on WinNT 3.51 or Win95.  Hopefully it will show
> up in Win97, but I don't know for sure.
> (2) Using this API with a mutex and an auto-reset event, you can create a
> CV-lookalike where PulseEvent will behave like pthread_cond_signal, but there
> is no way to imitate pthread_cond_broadcast.  If you use a mutex and a
> manual event, PulseEvent will behave like pthread_cond_broadcast, but there
> is no way to imitate pthread_cond_signal.  (Sigh ...)
> I know ACE has a Win32 CV that works in general, but I seem to recall Doug
> Schmidt saying that it's very complex and not very efficient.
> Ian

 Q87: What if a cond_timedwait() times out AND the condition is TRUE?  

  [This comment is phrased in terms of the JAVA API, but the issues are the
same. -Bil]

> After thinking about this further even your simple example can fail.
> Consider this situation. A number of threads are waiting on a condition,
> some indefinitely, some with timeouts. Another thread changes the
> condition, sets your state variable and does a notify(). One of the waiting
> threads is removed from the wait-set and now vies for the object lock. At
> about the same time a timeout expires on one of the other waiting threads
> and it too is removed from the wait-set and vies for the lock - it gets the
> lock! This timed-out thread now checks the state variable and wrongly
> concludes that it received a notification.
> In more complex situations where there are multiple conditions and usage of
> notifyAll() even more things can go wrong.

  You are correct with everything you say, right up until the very last word.
The behavior is NOT wrong.  It may not be what you *expected*, but it's not
wrong.  This is a point that's a bit difficult to get sometimes, and it drives
the real-time crowd to distraction (as well it should), but for us time-shared
folks, it's cool.

  When you get a time-out, you've got a choice to make.  Depending upon what
you want from your program, you may choose to say "Timed-out! Signal error!"
or you may choose to check the condition and ignore the time out should it
be true.  You're the programmer.

  An important detail here...  Everything works CORRECTLY.  In particular, if
a thread receives a wakeup, it is removed from the wait queue at that point
and CANNOT subsequently receive a timeout.  (Hence it may take another hour
before it obtains the mutex, but that's OK.)  A thread which times out will
also be removed from the sleep queue and a subsequent wakeup
(pthread_cond_signal()) will be delivered to the next sleeping thread (if
any).
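
  Here is one way the "check the condition and ignore the time out" choice
might look with POSIX threads.  It's a minimal sketch, not anybody's official
idiom: the struct and function names are invented for the example, and it
assumes the predicate is a simple int protected by the mutex.

```c
#include <errno.h>
#include <pthread.h>
#include <time.h>

struct flag {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             ready;   /* the predicate, protected by lock */
};

/* Wait until f->ready is set or the absolute deadline passes.
 * Returns 1 if the predicate is true on exit, 0 otherwise.
 * A timeout is not reported as an error when the predicate holds. */
static int wait_ready(struct flag *f, const struct timespec *deadline)
{
    int rc = 0;
    int ok;

    pthread_mutex_lock(&f->lock);
    while (!f->ready && rc == 0)      /* stop on ETIMEDOUT or any error */
        rc = pthread_cond_timedwait(&f->cond, &f->lock, deadline);
    ok = f->ready;                    /* policy choice: the condition wins */
    pthread_mutex_unlock(&f->lock);
    return ok;
}
```

  A program that would rather say "Timed-out! Signal error!" would return rc
instead and let the caller test it against ETIMEDOUT.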

 Q88: How can I recover from a dying thread?  

[OK.  So that's not *exactly* the question being addressed here, but that's
the most important issue.  -Bil]

David Preisler wrote:
> I wish to create an *efficient* and *reliable* multi process/multi threaded
> algorithm that will allow many simultaneous readers (for efficiency) to access
> a block of shared memory,  but allows one and only one writer.
> How could a read counter be guaranteed to be always correct even if your read
> thread or process dies???

Sorry to disillusion you, but this is impossible. Remember that you're
talking about SHARED MEMORY, and assuming that SOMETHING HAS GONE WRONG
with some party that has access to this shared memory. Therefore, there
is no possibility of any guarantees about the shared memory -- including
the state of the read-write lock.

You can approximate the guarantees you want by having some third party
record the identity of each party accessing the lock, and periodically
validate their continued existence. You could then, (assuming you'd
coded your own read-write lock that allowed this sort of manipulation by
a third party), "unlock" on behalf of the deceased read lock holder.
Just remember that the fact that the party had DECLARED read-only
intent, by locking for read access, doesn't guarantee that, in the
throes of death, it couldn't have somehow unintentially modified the
shared data. A read-lock really is nothing more than a statement of
intent, after all. And how far do you wish to trust that statement from
a thread or process that's (presumably equally unintentially) blowing
its cookies?

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q89: How to implement POSIX Condition variables in Win32?  

Subject: Re: How to implement POSIX Condition variables in Win32 (LONG)
Douglas C. Schmidt wrote:

The following function creates and initializes a condition variable:

int
pthread_cond_init (pthread_cond_t *cv, const pthread_condattr_t *)
{
  cv->waiters_ = 0;
  cv->generation_count_ = 0;
  cv->release_count_ = 0;

  // Create a manual-reset Event.
  cv->event_ =
    ::CreateEvent (NULL,  /* no security */
                   TRUE,  /* manual reset */
                   FALSE, /* non-signalled */
                   NULL); /* unnamed */

  return 0;
}

The following pthread_cond_wait function waits for a condition
and atomically releases the associated external_mutex:

int
pthread_cond_wait (pthread_cond_t *cv,
                   pthread_mutex_t *external_mutex)
{
  // Increment count of waiters.
  cv->waiters_++;

  // Remember the generation in which we started waiting.
  int c = cv->generation_count_;

  for (;;)
  {
    ::LeaveCriticalSection (external_mutex);

    // Wait until the event is signaled.
    ::WaitForSingleObject (cv->event_, INFINITE);

    ::EnterCriticalSection (external_mutex);

    // Exit the loop when the event_
    // is signaled and there are still waiting
    // threads from this generation that haven't
    // been released from this wait yet.
    if (cv->release_count_ > 0
        && cv->generation_count_ != c)
      break;
  }

  cv->waiters_--;

  // If we're the last waiter to be notified
  // then reset the manual event.
  cv->release_count_--;

  if (cv->release_count_ == 0)
    ::ResetEvent (cv->event_);

  return 0;
}

This function loops until the event_ HANDLE is signaled and at least
one thread from this ``generation'' hasn't been released from the wait
yet.  The generation_count_ field is incremented every time the
event_ is signaled via pthread_cond_broadcast or pthread_cond_signal.
It tries to eliminate the fairness problems with Solution 1, so that
we don't respond to notifications that have occurred in a previous
``generation,'' i.e., before the current group of threads started
waiting.

The following function notifies a single thread waiting on a Condition
Variable:

int
pthread_cond_signal (pthread_cond_t *cv)
{
  if (cv->waiters_ > cv->release_count_)
    {
      ::SetEvent (cv->event_);
      cv->release_count_++;
      cv->generation_count_++;
    }
  return 0;
}

Note that we only signal the Event if there are more waiters than
threads currently being released.

Finally, the following function notifies all threads waiting on a
Condition Variable:

int
pthread_cond_broadcast (pthread_cond_t *cv)
{
  if (cv->waiters_ > 0)
    {
      ::SetEvent (cv->event_);
      cv->release_count_ = cv->waiters_;
      cv->generation_count_++;
    }
  return 0;
}

Unfortunately, this implementation has the following drawbacks:
1. Busy-waiting -- This solution can result in busy-waiting if the
waiting thread has highest priority.  The problem is that once
pthread_cond_broadcast signals the manual reset event_ it remains
signaled.  Therefore, the highest priority thread may cycle endlessly
through the for loop in pthread_cond_wait.

2. Unfairness -- The for loop in pthread_cond_wait leaves the critical
section before calling WaitForSingleObject.  Thus, it's possible that
another thread can acquire the external_mutex and call
pthread_cond_signal or pthread_cond_broadcast again during this
unprotected region.  Thus, the generation_count_ will increase, which
may fool the waiting thread into breaking out of the loop prematurely
and stealing a release that was intended for another thread.

3. Potential for race conditions -- This code is only correct provided
that pthread_cond_signal and pthread_cond_broadcast are only ever called
by a thread that holds the external_mutex.  That is, code that uses
the classic "condition variable signal" idiom shown above will work.

Dr. Douglas C. Schmidt
Department of Computer Science, Washington University
St. Louis, MO 63130. Work #: (314) 935-4215; FAX #: (314) 935-7302

 Q90: Linux pthreads and X11  

Date: Mon, 16 Feb 1998 17:47:03 +0000
Organization: Visix Software

Steve Cusack wrote:

> I've just started using Linux pthreads and have immediately run into
> the fact that the package appears to be incompatible with X.  I've
> read (via DejaNews) that X is "threads unsafe" and that others have
> had similar problems (X refusing to start).  Does anyone have X11 and
> pthreads working together on a Linux system?  If so, what did you have
> to do?

I ported Vibe (a multi-threaded Java IDE) to Linux, so I have made X11 and
linuxThreads work together.  It's not easy, unless you can target a glibc2
system.

You need to do one of two things: either get/create a recompiled
-D_REENTRANT version of the X libraries, or patch linuxThreads to use the
extern int errno as the errno for the initial thread.  I chose to create
thread aware Xlibs.  (That was not fun.  For some reason the build process
would get into an infinite loop.  I don't remember how I got it to work)

You should be able to search Deja-news for pointers to patching
linuxThreads.

Replace "xyzzy" with Victor India Sierra India X-Ray to email me.


Get yourself a copy of Redhat 5.0. It comes with pthreads and thread
safe X libs.



For such-compiled binaries for ix86-libc5, see

Strange, for me it was very easy.  Just adding -D_REENTRANT next to
-D_POSIX_SOURCE in did the trick.

But remember:  These libs are still not thread-safe (you can make only
one X11 call at a time -- I don't think this is too bad).

The better option at this stage really is glibc2 with X11 libs
compiled with XTHREADS.

 Q91: One thread runs too much, then the next thread runs too much!  

[I've seen variations on this concern often.  Johann describes the problem
very well (even if he can't find the "shift" key).  -Bil]

===================  The Problem  ================================
Johann Leichtl wrote:
> hi,
> i have a problem using pthreads in c++.
> basically what i want to do is have class that manages a ring buffer and
> say 2 threads, where one adds entries to the buffer and one that removes
> them.
> i have a global object that represents the buffer and the member
> functions make sure that adding and removing entries from the buffer
> work ok.
> the 2 functions:
> inline void ringbuf::put(int req)
> {
>   pthread_mutex_lock(&mBufLock);
>   while(elemNum == size)
>     pthread_cond_wait(&cNotFull, &mBufLock);
>   buf[next] = req;
>   next = (next + 1) % size;
>   elemNum++;
>   pthread_cond_signal(&cNotEmpty);
>   pthread_mutex_unlock(&mBufLock);
> }
> inline void ringbuf::get(int& req)
> {
>   pthread_mutex_lock(&mBufLock);
>   while(elemNum == 0)
>     pthread_cond_wait(&cNotEmpty, &mBufLock);
>   req = buf[last];
>   last = (last + 1) % size;
>   elemNum--;
>   pthread_cond_signal(&cNotFull);
>   pthread_mutex_unlock(&mBufLock);
> }
> now my problem is that my consumer thread only wakes up after the buffer
> is full. i tried different buffer sizes and simulated work in both
> producer and consumer.
> when i use a sleep() function in the producer (and or consumer) i can
> get the thing to look at the buffer earlier.
> i was wondering if anybody would have some input on what the problem
> could be here. i've done something similar with UI threads and not C++
> and it works fine.
> thanks a lot.
> hans

===================  The Solutions  ================================

Use either threads w/ system scope or call thr_setconcurrency to
increase the concurrency level...

Here's a handy thing to stick in a *Solaris* pthreads program:



This will give you as much actual concurrency as there are processors
on-line plus one.  It's a starting point rather than a fix-all, but
will cure some of the more obvious problems...
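
The snippet itself is missing from this copy.  Going only by the description
("as many as there are processors on-line plus one") and the earlier mention
of thr_setconcurrency, something along these lines would fit.  Treat it as a
guess at the idea, not as Bart's code; the two helper names are made up, and
the call is compiled only on Solaris.

```c
#include <unistd.h>
#ifdef __sun
#include <thread.h>   /* Solaris UI threads: thr_setconcurrency() */
#endif

/* Processors currently on-line, plus one (2 if the count is unknown). */
static int desired_concurrency(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return (n > 0 ? (int)n : 1) + 1;
}

/* Ask for that many LWPs.  Returns 0 on success; on systems other
 * than Solaris this does nothing and reports success. */
static int raise_concurrency(void)
{
#ifdef __sun
    return thr_setconcurrency(desired_concurrency());
#else
    (void)desired_concurrency();
    return 0;
#endif
}
```

Call raise_concurrency() once, early in main(), before creating threads.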

- Bart

Bart Smaalders                  Solaris Clustering      SunSoft         (415) 786-5335          MS UMPK17-201                        901 San Antonio Road
                                                        Palo Alto, CA


  No, actually you *don't* have a problem.

  Your program works correctly, it just happens to work in a slightly
unexpected fashion.  The functions that call put & get are probably
unrealistically simple and you are using local scheduling.  First the
buffer fills up 100%, then it completely empties, then it fills up
again, etc.

  Make your threads system scoped and you'll get what you expect.
[You'll notice Bart suggests a different method for obtaining the
same results (ie. more LWPs).  I like this method because I think
it's a clearer statement of intention AND PROCTOOL will give me
LWP statistics, but not thread statistics.]

  (You can look on the web page below for exactly this program, written
in C, one_queue_solution.c.)
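
  For reference, asking for system scope looks roughly like this.  It is a
sketch with made-up helper names; whether PTHREAD_SCOPE_SYSTEM is accepted
depends on the platform, so check the return value.

```c
#include <pthread.h>

/* Trivial thread body used for the example: hands its argument back. */
static void *echo_arg(void *arg)
{
    return arg;
}

/* Create a thread with system contention scope.
 * Returns 0 on success or an errno-style value on failure. */
static int create_system_scope_thread(pthread_t *tid,
                                      void *(*start)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(tid, &attr, start, arg);
    pthread_attr_destroy(&attr);
    return rc;
}
```

  The producer and consumer threads would each be created this way instead of
with a NULL attribute.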

 Q92: How do priority levels work?  

Kamal Kapila wrote:
> Hi there,
> I'm working on an internal package to provide platform independent
> thread services (the initial platforms are DECUNIX 4.0 and Windows NT).
> The problem I'm having is understanding the thread scheduling on
> DECUNIX.
> It would seem to me logical that the threads of a process would have the
> same priority and policy of their associated process by default.
> However, when I check the process priority/policy I get completely
> different values from when I check the individual thread priorities and
> policies.  In fact, the priority values do not seem to even follow the
> same scale (I can set process priorities from 0-63, while thread
> priorities go only from 0-31).  In addition, setting the process
> priority does not seem to affect the thread priorities at all (!).

Basically, there are "interaction issues" in implementing a 2-level
scheduling model (as in Digital UNIX 4.0), that POSIX didn't attempt to
nail down. We deferred dealing with these issues until some form of
industry consensus emerged. That industry consensus has, since, not
merely "emerged", but has become a mandatory standard in the new Single
UNIX Specification, Version 2 (UNIX98).

With 2-level scheduling, it really doesn't make much sense to "inherit"
scheduling attributes from the process -- because those attributes MEAN
entirely different things. Digital UNIX, by the way, doesn't really have
a "process" -- it has (modified) Mach tasks and threads. (There are a
set of structures layered over tasks to let the traditional UNIX kernel
code deal with "processes" in a more or less familiar way, but a process
is really sheer illusion.) Since tasks aren't scheduled, they really
have no scheduling attributes -- threads do. Since a non-threaded
process has a task and a single thread, the 1003.1b (realtime)
scheduling functions operate, in general, on the "initial thread" of the
specified "process".

The kernel thread scheduling attributes control scheduling between
various kernel threads. But a POSIX thread is really a user object, that
we map onto one or more kernel threads (which we call "virtual
processors"). Pretending to set the scheduling attributes of this thread
to the "process" attributes makes no sense, because the scheduling
domain is different. POSIX threads are scheduled only against other
threads within the process -- not against kernel threads in other
processes.

POSIX provides a way to create threads that you really want to be
scheduled against other kernel threads -- essentially, forcing the POSIX
thread to be "bound" to a kernel thread itself, at the expense of (often
substantially) higher scheduling costs. This is called "system
contention scope". Digital UNIX 4.0 didn't support system contention
scope (which is an optional feature of POSIX), but we've added it for
the next version (4.0D).

Each system contention scope (SCS) thread has its own scheduling
attributes, independent of the process. While it might make some
intuitive sense to inherit the process priority, POSIX doesn't provide
any such semantics. A newly created thread either has explicit
scheduling attributes, or inherits the attributes of the thread that
created it. Of course, since setting the "process" attributes affects
the initial thread, threads that IT creates will inherit the "process"
attributes by default. But changing the "process" attributes won't (and
shouldn't) affect any SCS threads in the process.
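
The "explicit attributes versus inherited attributes" choice is made through
the thread attributes object.  A minimal sketch follows; the helper name is
invented, and which policies and priorities an unprivileged process may
actually use is system dependent.

```c
#include <pthread.h>
#include <sched.h>

/* Prepare attr so that a thread created with it uses the given policy
 * and priority instead of inheriting those of its creator.
 * Returns 0 on success or an errno-style value; on failure attr is
 * left destroyed. */
static int init_explicit_sched_attr(pthread_attr_t *attr,
                                    int policy, int priority)
{
    struct sched_param sp;
    int rc = pthread_attr_init(attr);

    if (rc != 0)
        return rc;
    sp.sched_priority = priority;
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, policy);
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &sp);
    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```

Passing PTHREAD_INHERIT_SCHED instead gives the other behaviour described
above: the new thread takes the attributes of the thread that created it.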

The ambiguity (and the only relevant question for the implementation
you're using, which doesn't support SCS threads), is, what happens to
the virtual processors that are used to execute POSIX threads, when the
"process" scheduling attributes are changed? And with what attributes
should they run initially? UNIX98 removes the (intentional) POSIX
ambiguity by saying that setting the 1003.1b scheduling attributes of
the "process" WILL affect all "kernel entities" (our virtual processors,
Sun's LWPs) used to execute process contention scope (PCS, the opposite
of SCS) threads. By extension, the virtual processors should initially
run with the existing process scheduling attributes.

This will be true of any UNIX98 branded system -- but until then,
there are no portable rules.

The fact that the POSIX thread interfaces don't use the same priority
range as the system is a stupid oversight -- I just didn't think about
it when we converted from DCE threads to POSIX threads for 4.0. This has
been fixed for 4.0D, though it's a bit too substantial a change (and
with some potential risk of binary incompatibilities) for a patch.

> (BTW, I am using sched_getparam() and sched_getscheduler() to get the
> process related values and  pthread_getparam() to get the thread related
> values).


> Specifically, I have the following questions :
> - What is the relationship between the process priority/policy and the
> thread priority and policy  ?

There's very little relationship. Each POSIX thread (SCS or PCS) has its
own scheduling attributes (priority and policy) that are completely
independent of "process" attributes. UNIX98, however, says that the
"kernel entities" used to execute PCS POSIX threads WILL be affected by
changes to the "process" scheduling attributes -- but SCS threads will
not (and should not) be affected by such changes. (Nor will the
scheduling attributes of PCS threads, even though their "system
scheduling attributes" effectively come from the virtual processor,
which is affected.)

> - Does the scheduler schedule individual threads independently, or are
> processes scheduled, with a process's threads then sharing the process
> CPU time?

As I said, there's no such thing as a process, and the closest analog,
the Mach task, isn't a schedulable entity. All threads are scheduled
independently -- each has its own scheduling attributes, its own time
slice quantum, etc. On Digital UNIX 4.0, with only PCS threads, the
kernel schedules the virtual processor threads of all the processes
(plus the single kernel threads associated with all non-threaded
processes). Threaded processes also contain a user-mode scheduler, which
assigns PCS threads to the various virtual processors, based on the PCS
thread scheduling attributes. (A process has one virtual processor for
each available physical processor on the system.)

On Digital UNIX 4.0D, with SCS thread support added, each process may
also have any number of SCS threads, which map directly to individual
and independent kernel threads. SCS threads are scheduled the same as
virtual processors -- each has its own scheduling attributes, time slice
quantum, etc.

(It might seem that managing CPU time by kernel threads rather than by
processes allows users to monopolize the system by creating lots of
kernel threads. But they could do that by creating lots of processes,
too... and a kernel thread is cheaper for the system than a process,
which is really a thread plus a task. The ability to create new kernel
threads, as well as processes, is limited both by user and system
quotas. And of course, in 4.0, users can't actually create new kernel
threads -- only POSIX threads, which are mapped to the process' existing
virtual processors.)

So each process presents a set of runnable kernel threads to the kernel:
A mix of SCS threads and the various PCS threads currently mapped on to
one or more virtual processors. The kernel then determines which kernel
threads to schedule on each processor. (That's why it's called "2-level

> - Is the thread's overall priority a combination of the process priority
> and the individual thread priority ? If so, how is this determined ?

Currently, "process" priority is irrelevant for a threaded process.
Virtual processors don't inherit the process priority. (Actually, they
sort of do, and the first virtual processor is the initial process
thread, which can be changed using the 1003.1b functions -- but the
kernel generates "replacement" virtual processors at various times, and
these currently are always set to the default scheduling attributes
[timeshare policy and priority 19].)

POSIX thread priority determines which threads the user-mode scheduler
assigns to the various virtual processors. Because the virtual processor
priority doesn't change (the whole point of 2-level scheduling is to
avoid expensive kernel calls), the POSIX thread priority has no effect
on the kernel scheduling. That's OK, except in rare cases where
applications comprising multiple PROCESSES have threads (in different
processes) that really need to directly preempt each other based on

> I have read through all of the Digital documentation that I have but I
> have not been able to find any clear answers to my questions.

A description of the behavior (though in less technical/internal detail
than the one in this posting) can be found in Appendix A (section A.3)
of the Digital UNIX "Guide to DECthreads" manual.

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q93: C++ member function as the startup routine for pthread_create().  

You need a static member function
    static void *function_name(void *);
in the class declaration.

Usually you then pass the object's address (its "this" pointer) as the
parameter, so the static function can access the nonstatic members of
the class.

- Robert
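
A minimal sketch of that approach follows. The class name Worker and the members run, entry and value are invented for the example. Strictly speaking, pthread_create() expects a function with C linkage, so passing a static member function is an assumption that holds on most compilers but is not guaranteed by the language.

```cpp
#include <pthread.h>

// Illustrative class; all names here are made up for this sketch.
class Worker {
public:
    explicit Worker(int start) : value_(start) {}

    // The real work lives in an ordinary (non-static) member function.
    void *run() {
        value_ += 1;
        return &value_;
    }

    // Static trampoline: it has no hidden "this" parameter, so its address
    // has the shape pthread_create() wants. The object's address arrives
    // through the void* argument and is cast back to the class type.
    static void *entry(void *self) {
        return static_cast<Worker *>(self)->run();
    }

    int value() const { return value_; }

private:
    int value_;
};

// Start w.run() in a new thread; returns pthread_create()'s result.
int start_worker(Worker &w, pthread_t *tid) {
    return pthread_create(tid, NULL, &Worker::entry, &w);
}
```

The object must outlive the thread, so the caller should join the thread (or otherwise know it has finished) before destroying the Worker.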

On 19 Sep 1997 03:15:22 GMT, (Phil Romig) wrote:
>I know I should be able to figure this out, but I'm missing something.
>I want to pass a member function as the startup routine for pthread_create().
>That is, I want to create an instance of a class and then pass one of
>the public member functions as the start routine to pthread_create().
>I believe the question comes down to how to describe the address of the
>member.  Simple qualification (class::member) will not work because I
>need the address of the function that goes with the particular
>instance of the class.
>For the record I'm working on an HPUX 10.01 system so I'm using pthreads
>draft 4, rather than the current standard.  
>Any advice, pointers and suggestions are welcome.
>A quick example of what I want to try:
>  class foo {
>  public:
>   foo(int i);
>   void *go(void *arg);
>  };
> int main() {
>  foo *bar = new foo(1);
>  pthread_create(...,&(bar->go),....);
> }

 Q94: Spurious wakeups, absolute time, and pthread_cond_timedwait() 

[Bil: The summary is "Retest conditions with CVs." and "The time-out is an
absolute time because it is."  (NB: Deltas are a proposed extension to POSIX.)
This is a nice exposition.]

Brian Silver wrote:
> Ben Self wrote:
> >
> [Snip]
> > The standard specifies that pthread_cond_wait() and
> > pthread_cond_timedwait() may have spurious wakeups.  The reason for this
> > is that a completely reliable once and only once wake up protocol can be
> > excessively expensive for some asymmetric multiprocessor systems.
> Well, maybe I'm being a bit anal about this, but this
> really isn't the case. If it was, then you'd have the
> same issue for mutexes as well, and the standard does
> not allow for spurious wakes on mutexes.

[Bil: Actually, you DO get spurious (define "spurious"!) wakeups with mutexes,
YOU just never see them.]

> The "while(predicate)/wait" construct is very common
> in concurrent environments (regardless of their symmetry).
> The reason is that since the environment is highly
> unpredictable, while you were coming back out of the
> wait, the state of the thing that you were waiting for
> may have changed.
> This construct is used to implement mutexes as well;
> it's just that you don't see it since the predicate
> is known; it is the state of the mutex lock. CVs force
> the construct into the user code because the predicate
> is not known to the implementation of the CV itself. The
> warnings about spurious wakes are taken seriously when
> mutexes are implemented, and are accounted for in the
> exact same "while(predicate)/wait" construct.
> Wake-only-once doesn't really help. It will remove the
> addition of spurious wakes, but it won't account for
> "valid wake, but the predicate changed". Implementing
> wake-only-once is expensive when you consider that the
> predicate loop already solves both problems.
> Also note that the mutex lock around the predicate
> doesn't solve this problem either. There is a race
> that starts once you see the wake and ends once you
> reacquire the mutex. In that time, another thread can
> get the mutex and change the data (believe me, it
> happens - more often than you'd expect). When you
> reacquire the mutex, and exit the wait, the predicate
> has changed and you'll need to go back to waiting.
> Now, a wake-only-once,-and-gimme-that-mutex atomic
> operation might be nice.
> Brian.

I am not reposting to be defensive or argumentative.  Upon reflection,
however, I have come to the conclusion that neither I nor subsequent
posters have really dealt with the original poster's question, let alone
the new topics that we have thrown about. 

Since this is a response largely to Brian Silver's post, a person I have
a good deal of respect for, I have chosen to include some quotes from
Dave Butenhof's book, Programming with POSIX Threads, because we both
know it and have a mutual admiration for his work.

First off, and most importantly, the original question I attempted to
answer was:

Fred A. Kulack wrote:
> A side question for the rest of the group...
> All the applications I've seen use a delta time for the wait and must
> calculate the delta each time a cond_timedwait is done. What's the
> rationale for the POSIX functions using an ABSOLUTE time?

I was hoping for an uncomplicated answer and the spurious wakeup issue
seemed to fit the bill.  My writing and thinking however was too
simplistic to provide any meaningful insight.  So I will try again. 
Please realize that some of what you will read below is purely personal
supposition.  Chime in if I have misinformed.

1)  Spurious wakeup is a reason for passing an absolute value to
cond_timedwait.  It is not the reason or even a particularly important
reason.  The standard (POSIX 1003.1c-1995) specifically states that a
compliant implementation of pthread_cond_timedwait() may suffer from
spurious wakeups.  It is therefore reasonable to use an absolute
timeout value instead of a delta to simplify the act of retrying.

2)  More importantly it is also very likely a performance issue.  Most
systems when scheduling a software interrupt use an absolute value that
reflects an offset into the OS's epoch.  To constantly be re-evaluating
a delta in user code is excessively expensive especially if most systems
really want an absolute value anyway.

3)  Also there is the reality that the structure timespec is the high
resolution time value of choice in POSIX.  And timespec happens to
represent its time as absolute time.  Add to that the needs of the
powerful realtime group that had a great impact on the shape of POSIX
1003.1c.   What integral unit would we use for a delta anyway? And would
it be in nanoseconds?  Eeak!

4)  Most importantly one would hope that the interface were constructed
to promote good coding techniques.  As Brian Silver stated the
"while(predicate)/wait" idiom is an important technique for far more
reasons than just spurious wakeups.  By using an absolute timeout value
as opposed to a delta this idiom is directly supported by easing its

When I originally brought up the  "while(predicate)/wait" idiom it was
because spurious wakeups would necessitate retrying the predicate.  I
did not intend to state that this was the only or even a particularly
important reason for the pattern.  The "while(predicate)/wait"
idiom or an equivalent is essential to programming with condition

1)  Most important is the reason Brian Silver stated, "There is a race
that starts once you see the wake and ends once you reacquire the
mutex."  It would be difficult and detrimental to concurrency to
construct through synchronization a situation that did not require
re-testing of the predicate after a wakeup.  This is why Brian's magic
bullet "wake-only-once,-and-gimme-that-mutex atomic" does not exist. 
Although it would be nice.

2)  Spurious wakeups do exist. Be consoled by the fact that "The race
conditions that cause spurious wakeups should be considered rare."

3)  Also, it enables a powerful technique that I have been using for
several years with great success that Dave Butenhof refers to as "loose
predicates".  "For a lot of reasons it is often easy and convenient to
use approximations of actual state.  For example, 'there may be work'
instead of 'there is work'."  I will go one step beyond that: in my
experience of coding distributed web servers there are situations when
the notification mechanism cannot know with certainty that there is work
without actually having performed the entirety of the task itself.  Often
the best a distributed component has to work with is trends and

Lastly, (whew ;) I believe that I have overstated the significance of
the performance implications of only once wakeups.  "Excessively
expensive" is a bit strong without further qualification.  If it were
such a paramount issue Brian Silver is right, mutexes would suffer from
the same restrictions and they absolutely do not.  

There is a performance issue that I have run across many times and have
seen cited in many references, including: "Spurious wakeups may sound
strange but on some multiprocessor systems, making condition wakeup
completely predictable might substantially slow all condition variable
operations. [Butenhof]"  Nevertheless, the fact is that making
wakeup completely predictable does not get you that much.  You still
need to retest your predicate.  In the end, retesting is such an easy and
cheap thing when taken in the context of the overhead of the
synchronization and latency of the wait.


Ben R. Self
Open Text Corporation -- Home of Livelink Intranet
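
As a rough illustration of the argument for absolute timeouts, the sketch below computes the deadline once and then reuses it unchanged on every pass through the predicate loop. With a delta, the remaining time would have to be recomputed after each wakeup. The Gate structure and the gate_* function names are hypothetical, and CLOCK_REALTIME is assumed to be the clock the condition variable measures against (the POSIX default).

```cpp
#include <pthread.h>
#include <errno.h>
#include <time.h>

// Hypothetical shared state for this sketch.
struct Gate {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;   // the predicate, guarded by lock
};

void gate_init(Gate *g) {
    pthread_mutex_init(&g->lock, NULL);
    pthread_cond_init(&g->cond, NULL);
    g->ready = false;
}

// Wait until g->ready is true or timeout_sec seconds have passed.
// Returns 0 if the predicate is true, otherwise the wait's error code
// (normally ETIMEDOUT).
int gate_wait(Gate *g, int timeout_sec) {
    // Compute the absolute deadline ONCE, before the loop.
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    int rc = 0;
    pthread_mutex_lock(&g->lock);
    // Retest the predicate after every wakeup, spurious or not. The same
    // deadline is passed each time, so retries cannot stretch the timeout.
    while (!g->ready && rc == 0)
        rc = pthread_cond_timedwait(&g->cond, &g->lock, &deadline);

    // Decide by the predicate first: it may have become true just as the
    // timeout expired.
    int result = g->ready ? 0 : rc;
    pthread_mutex_unlock(&g->lock);
    return result;
}

void gate_open(Gate *g) {
    pthread_mutex_lock(&g->lock);
    g->ready = true;
    pthread_mutex_unlock(&g->lock);
    pthread_cond_signal(&g->cond);
}
```

A caller that only has a relative timeout converts it to a deadline once, as gate_wait() does, and never has to track elapsed time across wakeups.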

        More on spurious wakeups

It is so because implementations can sometimes not avoid inserting
these spurious wakeups; it might be costly to prevent them.

Perhaps more importantly, your own program's logic can introduce spurious
wakeups which cannot be eliminated. This can start happening as soon as there
are more than two threads.

You see, a condition waiting thread which has been signaled may have to compete
with another thread in order to re-acquire the mutex.  If that other thread
gets the mutex first, it can change the predicate, so that when finally the
original thread acquires it, the predicate is false.

This is also a spurious wakeup, for all purposes.  To make this form of
spurious wakeup go away, the semantics of condition variables would have to
change in troublesome ways, back to the original monitors and conditions
concept introduced by C. A. R. Hoare, the father of Quicksort. Under Hoare's
monitors and conditions, signaling a condition would atomically transfer the
monitor to the first task waiting on the condition, so that the woken task
could just assume that the predicate is true:

    if (!predicate()) wait(&condition); /* okay */

The very useful broadcast operation does not quite fit into Hoare's model, for
obvious reasons; the signaler can choose only one task to become the
next monitor owner.  

Also, such atomic transfers of lock ownership are wasteful, especially on a
multiprocessor; the ownership transfer spans an entire context switch from one
task to another, during which that lock is not available to other tasks.
The switch can take thousands of cycles, inflating the length of a small
critical region hundreds of times!

Lastly, a problem with Hoare's approach is that a ``clique'' of tasks can form
which bounce ownership of the monitor among themselves, not allowing any other
task entry into the monitor.  No reliable provision can be made for
priority-based entry into the monitor, because the signal operation implicitly
ignores such priority; at best it can choose the highest priority thread that
is waiting on the condition, which ignores tasks that are waiting to get into
the monitor.  In the POSIX model, a condition variable signal merely wakes up a
thread, making it runnable. The scheduling policy will effectively decide
fairness, by selecting who gets to run from among runnable threads. Waking up
of threads waiting on monitors and conditions is done in priority order also,
depending on the scheduling policy.
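
The scenario described above, where another thread gets the mutex first and changes the predicate, is the reason the wait sits in a "while" rather than an "if" under the POSIX model. A small sketch, with invented names (q_lock, nonempty, items, put_item, take_item, run_consumers) and a plain counter standing in for a real work queue:

```cpp
#include <pthread.h>

static pthread_mutex_t q_lock   = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;
static int items = 0;   // predicate data: number of queued work items

void put_item() {
    pthread_mutex_lock(&q_lock);
    ++items;
    pthread_mutex_unlock(&q_lock);
    pthread_cond_signal(&nonempty);
}

void take_item() {
    pthread_mutex_lock(&q_lock);
    // "while", not "if": between the signal and this thread reacquiring
    // q_lock, another consumer may already have taken the item.
    while (items == 0)
        pthread_cond_wait(&nonempty, &q_lock);
    --items;
    pthread_mutex_unlock(&q_lock);
}

// Thread body: take as many items as the int pointed to by arg says.
void *consumer(void *arg) {
    int n = *static_cast<int *>(arg);
    for (int i = 0; i < n; ++i)
        take_item();
    return NULL;
}

// Run up to 16 consumers that each take per_thread items while the calling
// thread produces exactly that many. Returns the number of items left over.
int run_consumers(int nthreads, int per_thread) {
    const int kMaxThreads = 16;
    pthread_t tids[kMaxThreads];
    if (nthreads > kMaxThreads)
        nthreads = kMaxThreads;

    int started = 0;
    for (int i = 0; i < nthreads; ++i) {
        if (pthread_create(&tids[i], NULL, consumer, &per_thread) != 0)
            break;
        ++started;
    }
    for (int i = 0; i < started * per_thread; ++i)
        put_item();
    for (int i = 0; i < started; ++i)
        pthread_join(tids[i], NULL);

    pthread_mutex_lock(&q_lock);
    int left = items;
    pthread_mutex_unlock(&q_lock);
    return left;
}
```

With the loop in place the counter can never go negative, whichever consumer wins the race for the mutex. Replacing the "while" with an "if" would let a late consumer decrement a count that is already zero.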

> You know, I wonder if the designers of pthreads used logic like this:
> users of condition variables have to check the condition on exit anyway,
> so we will not be placing any additional burden on them if we allow
> spurious wakeups; and since it is conceivable that allowing spurious
> wakeups could make an implementation faster, it can only help if we
> allow them.
> They may not have had any particular implementation in mind.

You're actually not far off at all, except you didn't push it far enough.

The intent was to force correct/robust code by requiring predicate loops. This was
driven by the provably correct academic contingent among the "core threadies" in
the working group, though I don't think anyone really disagreed with the intent
once they understood what it meant.

We followed that intent with several levels of justification. The first was that
"religiously" using a loop protects the application against its own imperfect
coding practices. The second was that it wasn't difficult to abstractly imagine
machines and implementation code that could exploit this requirement to improve
the performance of average condition wait operations through optimizing the
synchronization mechanisms.

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

 Q95: Conformance with POSIX 1003.1c vs. POSIX 1003.4a? 

christof ameye we41 xxxx wrote:

> Some pthread libraries talk about conformance with POSIX 1003.1c and
> others of conformance with POSIX 1003.4a. What are the
> differences/similarities ?
> I don't really need the answer, but it might be interesting to know ...

First off, "conformance" to 1003.4a is a completely meaningless
statement. There's no such thing, because 1003.4a was never a standard.
Furthermore, I strongly doubt that there ever *were* any implementations
that could conform even if there was a way to apply meaning to the

1003.4a was the original name of the thread standard -- named for the
fact that it was developed by the realtime group (POSIX working group
designation 1003.4). However, it, like the original realtime standard,
was really an amendment to the "base standard", 1003.1. Eventually,
POSIX decided to resolve the confusion by creating more -- and renaming
all O/S API standards into the 1003.1 space. Thus, 1003.4 became
1003.1b, 1003.4a became 1003.1c, 1003.4b became 1003.1d, and so forth.

There were various draft versions of the standard while it was still
named 1003.4a, but all are substantially different from the actual
standard, and to none of them, technically, can any implementation
"conform". The most common draft is 4, which was the (loose) basis for
the "DCE thread" api, part of The Open Group's DCE suite. There's at
least one freeware implementation that claimed to be roughly draft 6.
IBM's AIX operating system provides a draft 7 implementation. The
"pthread" interface on Solaris 2.4 was draft 8. There is also at least
one implementation which I've seen claiming to be "draft 10". Draft 10
was the final draft, which was accepted by the IEEE standards board
and by ISO/IEC with only "minor" editorial changes. Nevertheless, draft
10 is NOT the standard, and, technically, one cannot "conform" to it.
"Draft 10" and "1003.1c-1995" are NOT interchangeable terms.

Finally, because 1003.1c-1995 was never published as a separate
document, the official reference is the 1003.1-1996 standard, which
includes 1003.1b-1993 (realtime), 1003.1c-1995 (threads), and
1003.1i-1995 (corrections to the realtime amendment).

In terms of someone writing programs, a lot of that is irrelevant. But
you need to be aware that there's no real definition of "conformance"
for any drafts, so one vendor's "draft 4" is not necessarily the same as
another's "draft 4", and, while that might be inconvenient for you,
there's nothing "wrong" with it. (Although, from the POSIX standards
point of view, it was foolish and irresponsible of "them" [by which I
really mean "us", since I wrote the most common draft 4 implementation,
the original DCE threads reference library ;-) ] to use the "pthread"
prefix at all.)

There are MANY differences between "draft 4" and standard POSIX threads.
There are many (though slightly fewer) differences between draft 7 or 8
and the standard. There are even some differences between draft 10 and
the standard. Look at a move between any two drafts, or between any
draft and the standard, as a PORT to an entirely new threading library
that has some similarities. Be very careful of the details, especially
where things "appear to be the same". And, if you're stuck with a draft
implementation, lobby the vendor to provide a full conforming POSIX

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q96: Cleaning up when a kill signal is sent to the thread? 

> I'm writing a multi-threaded daemon, which requires some cleanup if a
> kill signal is sent to the thread.  I want just the thread that received
> the signal to exit.
> The platform is Linux 2.0, libc 5.4.23, linuxthreads 0.6 (99% POSIX
> threads).
> The docs indicate that threads share signal functions, but can
> individually block or accept certain signals.  This is workable -- but
> how do I get the thread id of the thread that received the signal?
> And my next question, how portable are thread cleanup routines?
> Thanks,
> Jeff Garzik                                        Quality news feeds
> News Administrator                     INN Technical info, Consulting
> Spinne, Inc.                  


  From the sounds of what you say, the answer is "No."  :-)

  Meaning, don't do that.  There's a better method, cancellation.
If you really want the thread to exit asynchronously, that's the
way to do it.

  Now, it is even more likely that a simple polling routine will
do the job, and that would be even easier to write.

(There's a nice little cancellation example on the web page below.)
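
The "simple polling routine" mentioned above might look roughly like this. It is only a sketch: the Task structure and the function names are made up, and the empty loop body stands in for one unit of the daemon's real work. The thread checks a mutex-protected flag between units of work and does its own cleanup at a point of its choosing, so no signal handler is involved.

```cpp
#include <pthread.h>
#include <sched.h>

// Hypothetical per-thread control block.
struct Task {
    pthread_mutex_t lock;
    bool stop_requested;   // guarded by lock
    bool cleaned_up;       // written by the worker, read after pthread_join()
};

void task_init(Task *t) {
    pthread_mutex_init(&t->lock, NULL);
    t->stop_requested = false;
    t->cleaned_up = false;
}

// Called by any thread that wants this particular worker to exit.
void request_stop(Task *t) {
    pthread_mutex_lock(&t->lock);
    t->stop_requested = true;
    pthread_mutex_unlock(&t->lock);
}

static bool should_stop(Task *t) {
    pthread_mutex_lock(&t->lock);
    bool stop = t->stop_requested;
    pthread_mutex_unlock(&t->lock);
    return stop;
}

void *task_main(void *arg) {
    Task *t = static_cast<Task *>(arg);
    while (!should_stop(t)) {
        // One unit of real work would go here.
        sched_yield();
    }
    // Cleanup runs in the worker itself, with its data in a known state.
    t->cleaned_up = true;
    return NULL;
}
```

Because each worker has its own Task, the requester names exactly which thread should exit, which sidesteps the question of which thread a process-directed signal was delivered to.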

 Q97: C++ new/delete replacement that is thread safe and fast? 

> (Bob Pearson) writes:
> > Our platform is Solaris 2.5.1 and I am looking for a commercial, freeware
> > or shareware C++ new/delete replacement that is thread safe and uses more
> > than a single mutex.  We have a multi-threaded application that is using a
> > tremendous amount of new operators and is huge (>200MB) and is constantly
> > running into very high mutex contention due to the single mutex for new in
> > libC.a from:
> >
> >       SUNWSpro "CC: SC4.0 18 Oct 1995 C++ 4.1".
> You might want to check out ptmalloc:
> I would hope that operator new somehow invokes malloc at a lower
> level.  If not, you would have to write a small wrapper -- there
> should be one coming with gcc that you could use.
> Hope this helps,
> Wolfram.
Wolfram Gloger 

 Q98: beginthread() vs. endthread() vs. CreateThread? (Win32) 

[Bil: Look at the description in "Multithreading Applications in Win32" (see

Mark A. Crampton wrote:
> Juanra wrote:
> >
> > I'm a Windows 95 programmer and I'm developing a multithreaded
> > server-side application. I use the CreateThread API to create a new
> > thread whenever a connection request comes. I've read that it's better
> > to use beginthread() and endthread() instead of CreateThread because
> > they initialize the run time libraries. What happens with the Win32
> > CreateThread function? Doesn't it work properly? If not, I can't use
> > beginthread because I can't create my thread in a suspended mode and
> > release it afterwards.
> >
> > Does the function beginthreadNT() work under win95?
> No
> >
> > Thanks in advance.
> > Juan Ra.
> Answer to beginthread - use _beginthreadex, which uses the same args as
> CreateThread (you can create suspended).  _beginthreadex works on 95 &
> NT but not Win32s.  The privilege flags are ignored under 95.
> CreateThread _works_ OK - it just doesn't free the memory allocated by
> the C run-time library for the thread when the thread exits.  So you can
> attempt to clean up the run-time library data yourself, use
> _beginthreadex, or not use any C run-time library calls.

 Q99: Using pthread_yield()? 

Johann Leichtl wrote:
> if i have some code like:
>         ..
>         pthread_yield()
>         something(e.g. lock mutex)
>         ..
> is it guaranteed that the thread will give up the cpu before getting the
> lock or not.

First off, to clarify, (you probably already know this, given the set of
names in your subject line), "pthread_yield" is an obsolete DCE thread
interface, not part of POSIX threads. As such, it is not covered by any
formal standard, and has no real portability guarantees. The way it
works on your particular DCE thread system is probably the way the
developers wanted it to work on that system, and if you disagree there's
no "higher authority" to which you might appeal.

POSIX specifies the behavior of sched_yield, (or, in fact, any
scheduling operation), only with respect to the defined realtime
scheduling policies, SCHED_FIFO and SCHED_RR. Threads running under one
of these policies that call sched_yield will release the CPU to any
thread (in SCHED_FIFO or SCHED_RR) running at the same priority. (There
cannot be any at a higher priority, since they would have preempted the
current thread immediately.)

Is that the same thing as "guaranteed [to] give up the cpu"? For one
thing, sched_yield won't do anything at all if there are no other
threads that are ready to run at the calling thread's priority; it'll
just return.

If you have threads with non-standard scheduling policies, such as
SCHED_OTHER, or a hypothetical SCHED_TIMESHARE, POSIX says nothing about
the behavior of sched_yield. Most likely, (and at least in Digital's
implementation), the function will do the same thing. It doesn't really
worry about scheduling POLICY, only PRIORITY. Note that, because
SCHED_OTHER doesn't necessarily imply preemptive scheduling, you might
actually have a thread "ready to run" at a higher priority than the
current thread's priority. Also, because non-realtime policies aren't
necessarily "strictly priority ordered", and the system generally wants
to simulate some sort of fairness in timeshare scheduling, it is
possible (at least, "not ruled out by the standard") that a call to
sched_yield from a non-realtime thread might yield to a thread with
lower priority -- especially if that other thread is realtime.

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q100: Why does pthread_cond_wait() reacquire the mutex prior to being cancelled? 

Firstly, thanks to all of you who responded to my post in
comp.programming.threads.  As I alluded to in my posting, I felt quite sure
the problem I was experiencing was one due to misunderstanding.  I
immediately suspected this when my program exhibited the same behaviour
under HP-UX *and* Solaris.

Everyone told me the same thing: use a cleanup handler, pushed onto the
cleanup handler stack for the active thread because pthread_cond_wait
*reacquires* the mutex when it is cancelled.  I can see how this causes
other threads waiting on the same condition variable to fail to be
cancelled, but for me, the $64,000 question is:

        Why does pthread_cond_wait reacquire the mutex prior to being cancelled?

This seems like madness to me.  We're _cancelling_ the thread, so we're no
longer interested in the value of the data we're testing.  Why acquire the
lock and immediately require us to use a cleanup handler?

There must be something more to this ;-)

Ben Elliston                    .====    E-mail:
Compucat Research Pty Limited  /  ====.     Web:
Canberra ACT Australia         .====  /

  You're not thinking hard enough!  It *has* to be like this.

>         Why does pthread_cond_wait reacquire the mutex prior to being cancelled?
> This seems like madness to me.  We're _cancelling_ the thread, so we're no
> longer interested in the value of the data we're testing.  Why acquire the
> lock and immediately require us to use a cleanup handler?

A:  pthread_cond_wait(m, c);
B:  do some work...
C:  pthread_mutex_unlock(m);

  If cancellation happened while sleeping (at A) or while running (at B),
the same cleanup handler would run.  If the state of the mutex was DIFFERENT
at those locations, you'd be up the creek.  Right?
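
The A/B/C argument can be sketched as follows: because the mutex is always held when the cleanup handler runs, one handler covers cancellation both while sleeping in the wait and while working afterwards. The names m, c, ready, unlock_mutex, waiter and cancel_waiter_demo are illustrative only, and the demo never sets the predicate, so the waiter is always cancelled at the wait.

```cpp
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
static bool ready = false;   // predicate; never set in this demo

// Cleanup handler: may assume the mutex is locked, whether the thread
// was cancelled inside the wait or later while doing its work.
static void unlock_mutex(void *arg) {
    pthread_mutex_unlock(static_cast<pthread_mutex_t *>(arg));
}

void *waiter(void *) {
    pthread_mutex_lock(&m);
    pthread_cleanup_push(unlock_mutex, &m);
    while (!ready)
        pthread_cond_wait(&c, &m);   // A: cancellation point, mutex
                                     //    is reacquired before cleanup
    /* B: do some work with the mutex held */
    pthread_cleanup_pop(1);          // C: pop and run handler (unlock)
    return NULL;
}

// Start a waiter, cancel it, and check that the mutex was left unlocked.
// Returns 0 if the mutex is free afterwards, nonzero otherwise.
int cancel_waiter_demo() {
    pthread_t tid;
    if (pthread_create(&tid, NULL, waiter, NULL) != 0)
        return -1;
    pthread_cancel(tid);

    void *result = NULL;
    pthread_join(tid, &result);
    if (result != PTHREAD_CANCELED)
        return -1;

    int rc = pthread_mutex_trylock(&m);
    if (rc == 0)
        pthread_mutex_unlock(&m);
    return rc;
}
```

If pthread_cond_wait() did not reacquire the mutex on cancellation, unlock_mutex() would have to know whether it was called from point A or point B, and it has no way to tell.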

 Q101: HP-UX 10.30 and threads? 

Bryan Althaus wrote:
> Jim Thomas ( wrote:
> : In article <5qj71l$> (Bryan Althaus) writes:
> :
> : Bryan> This is actually becoming a really bad joke.  No HP people seem to want
> : Bryan> to talk about 10.30 though they will post on what compiler flags to
> : Bryan> use to compile an app using pthreads under 10.30!
> :
> : Bryan> And apparently once it does come out you must ask for it.  10.20 will be
> : Bryan> the shipping OS until HP-UX 11.0 comes out.
> :
> : Bryan> If anyone knows when 10.30 is shipping, please email me.  I'll respect
> : Bryan> your privacy and not repost.  We have a current need for threading,
> : Bryan> but since they must be spread out over multiple CPU's, kernel thread
> : Bryan> support is needed - hence HP-UX 10.30.
> :
> : I received an e-mail from "SOFTWARE UPDATE MANAGER" Saturday that says the
> : following.  Note especially the part about not for workstations :-(
> :
> : Jim
> :
> Thanks for the info Jim.  I received email from a kind soul at HP who
> basically explained the deal on 10.30 (it is meant for people/ISVs
> transitioning to HP-UX 11.0) but was not sure when it would be out.
> That was all I needed to know. Based on this we will use 10.20 for
> our product roll-out and when HP-UX 11.0 comes out maybe I'll revisit
> replacing the forking() code with threads.  As it turned out, on
> a two CPU machine, the forking() code actually worked nicely and
> basically was written as if I were using pthreads and wasn't really
> the big hack I thought it was going to be.  Of course each fork()
> costs an additional 70MB's of memory! :)
> Now if someone could let us know roughly the due date for HP-UX 11.0
> and maybe what we can look for in HP-UX 11.0.  Obviously it will have
> kernel threads with pthreads API, NFS PV3, Streams TCP/IP, and support
> both 32 and 64 bit environments. Will HP-UX 11.0 ship more Net friendly
> with a Java Virtual Machine?  Will the JVM be threaded? Will JRE be
> on all HP-UX 11.0 systems?  WebNFS support? WebServer? Browser?  Current
> OS's now come with all these goodies standard: http:/

 Q102: Signals and threads are not suited to work together? 

Keith Smith wrote:
> This is a question I posed to comp.realtime, but noticed that you have a
> discussion going on here....  can you offer me any assistance?
> Here's the excerpt:
> Shashi:
> Based on your previous email (below), I have a couple of questions:
> 1. If signals and threads are not suited to work together, what
> mechanism can/should be used to implement timing within a thread.  If I
> have two threads that performed autonomous time-based functions, I want
> to be able to have a per-thread timing mechanism.
> 2. With the approach of "block all signals on all threads and send various
> signals to a process - putting the emphasis on the thread to unblock the
> appropriate signals", how do we deal with other threads which may be
> interrupted by a blocked signal (e.g. a read() call that returns EINTR
> even when its thread blocks the offending signal)?  Isn't this a flaw?
> It requires a signal handler (wasteful) with the RESTART
> option specified.
> It seems like a per-thread mechanism is needed... how does NT
> accomplish this?
> ** I know I shouldn't be relying on a per-LWP signal, but how else can I
> accomplish what I am trying to do?
> In message " timer's interrupting system calls... HELP",
> writes:
> >Hi,
> >Signals and threads are not suited to work together. Personally, I feel that
> >UNIX has a serious flaw in the sense that most blocking systems calls (e.g.
> >read, semop, msgrcv etc) do not take a timeout parameter. This forces
> >programmers to use alarms and signals to interrupt system calls. I have
> >worked with operating systems such as Mach, NT and other which do not suffer
> >from this problem. This makes porting single threaded applications from UNIX
> >(which rely on signals) to a multithreaded process architecture difficult.
> >
> >Even though I come from a UNIX background (Bell Labs in late 80's) I have
> >learnt the hard way that signals make programs much more error prone. I have
> >worked extensively on Mach and NT and never saw a reason to use signals. As
> >far as POSIX.1c is concerned I think they did a favor to the users of threads
> >on UNIX by mandating that signals be a per-process resource. You have to
> >understand that LWP is more of a System V R4 concept (same on Solaris) and not
> >a POSIX concept. Two level scheduling is not common on UNIX systems (those
> >who implement have yet to show a clear advantage of two level scheduling).
> >
> >I am sure that Dave Butenhof (frequent visitor to this newsgroup) would have
> >more insight as to why POSIX did not choose to implement signals on a
> >per-thread basis (or LWP as you say). I would advise that you should
> >rearchitect your application not to depend on per-thread (LWP) signals. I
> >feel you will be better off in the long run. Take care.
> >
> >Sincerely,
> >Shashi
> >
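
One way to get per-thread timing without signals is a timed wait on a
condition variable, which is one of the few POSIX blocking calls that does
take a timeout.  The following is only a minimal sketch under that
assumption; the names thread_timer, tt_init, tt_wait_ms and tt_cancel are
hypothetical and do not come from the posts above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical per-thread timer: each thread owns one of these. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  wake;
    int             cancelled;   /* set by tt_cancel() to end a wait early */
} thread_timer;

void tt_init(thread_timer *t)
{
    int rc = pthread_mutex_init(&t->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&t->wake, NULL);
    assert(rc == 0);
    (void)rc;
    t->cancelled = 0;
}

/* Block the calling thread for up to ms milliseconds.  Returns ETIMEDOUT
 * if the interval elapsed, 0 if tt_cancel() was called, or another error
 * code reported by pthread_cond_timedwait(). */
int tt_wait_ms(thread_timer *t, long ms)
{
    struct timespec deadline;
    int rc = 0;

    /* pthread_cond_timedwait() takes an absolute deadline, not an interval. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += ms / 1000;
    deadline.tv_nsec += (ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&t->lock);
    while (!t->cancelled && rc == 0)     /* loop guards against spurious wakeups */
        rc = pthread_cond_timedwait(&t->wake, &t->lock, &deadline);
    if (t->cancelled)
        rc = 0;
    pthread_mutex_unlock(&t->lock);
    return rc;
}

/* Called from any other thread to end the wait early. */
void tt_cancel(thread_timer *t)
{
    pthread_mutex_lock(&t->lock);
    t->cancelled = 1;
    pthread_cond_signal(&t->wake);
    pthread_mutex_unlock(&t->lock);
}
```

A thread that needs a periodic action would call tt_wait_ms() in its loop
and do its work each time ETIMEDOUT comes back.  No signal handler is
involved, so no other thread's read() can be interrupted by it.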

 Q103: Patches in IRIX 6.2 for pthreads support? 

Jeff A. Harrell wrote:
> radha subramanian wrote:
> >
> > I heard that a set of patches have to be applied in IRIX 6.2
> > for pthreads support.  Could someone tell me which are these
> > patches ?
>  1404 Irix 6.2 Posix 1003.1b man pages          List    123Kb   07/01/97
>  1645 IRIX 6.2 & 6.3 POSIX header file updates  List    41Kb    07/01/97
>  2000 Irix 6.2 Posix 1003.1b support modules    List    164Kb   07/01/97
>  2161 Pthread library fixes                     List    481Kb   07/01/97
> The whole set is downloadable from:
> A SurfZone password is required.

 Q104: Windows NT Fibers? 

Ramesh Shankar wrote:
> Found some info. on Windows NT Fibers in "Advanced Windows." Just
> wanted to verify whether my understanding is correct.
> - Is a (primitive) "many to one" thread scheduling model.
> - Fibre corresponds to Solaris "threads" (NT threads then correspond
> to Solaris LWP).
> - If a fibre blocks, the whole thread (LWP for us) blocks.
> - Not as sophisticated as Solaris threads.

  Kinda-sorta.  Certainly close enough.  My understanding is that fibers
were built especially for a couple of big clients and then snuck their way 
out.  As such, I would avoid using them like the plague.  I've read the
APIs and they scare me.


Jeffrey Richter, Advanced Windows, 3rd Ed., p.971 states that

"The fiber functions were added to the Win32 API to help companies
quickly port their existing UNIX server applications to Windows NT."


The following sentences say that fibers are targeted to the
proprietary user level thread-like quirks some companies did for
whatever reason (ease of programming, performance).

To answer your question: fibers are not an integral part of any MS
application, and I can't imagine that they use them internally anywhere,
and thus they won't achieve the stability. Does this argument weigh a bit
against their use in a new program?


PS: Have you noticed that I managed to keep from flaming :-)

>> Fibers are BAD because they comprise a SECOND method of doing threading.
>> If you want threads, use threads. (All that co-routine stuff was
>> great. We don't need them any more.)
>There are two reasons for "threads" and things similar to threads.
>First, they're smaller than full blown processes and with faster
>context switching than with processes.  Second, they allow more fine
>grained concurrency.

I don't think you hit it quite on the head.  Threads allow a computation
to be decomposed into separately scheduled tasks, which has these advantages:
- the tasks can be run on separate processors.
- the tasks can be prioritized, so that a less important computation
  can, in response to an external event, be suspended to process that event.
- computation can occur in one thread, while waiting for an event, such as
  the completion of I/O
So it's all about improving performance parameters like overall run time,
or average response time, or real time response, and maximizing the
utilization of real resources like processors and peripherals.

>Originally, coprocesses (and tasks and light-weight-processes and
>threads) solved both goals quite well.  Then in the last decade or
>more, thread-like things started getting bigger and slower; ie,
>letting the kernel handle the context switching, making them work well
>with OS calls and standard libraries, signal handling, asynchronous
>I/O, etc.
>Fibers seem like just a return to the efficient/small type of task.
>The drawback to them seems just that they're only on Windows NT, so
>that even if you have a valid need for them the code won't even be
>portable to other Windows boxes.

If you take a thread, and then hack it into smaller units that the
operating system doesn't know about, these smaller units do not
realize the advantages I listed above. They are not scheduled on
separate processors, they cannot be dispatched in response to
real-time inputs, they cannot wait for I/O while computation occurs.

I did not list, as one of the advantages, the ability to change the
logical structure of the program by decomposing it into threads, such as
eliminating the implementation of state machines by offloading some state
information into individual program counters and stacks.  To me, that is
a purely internal program design matter that doesn't make any externally
visible difference to parameters like the running time, throughput,
real-time response or average response.

It's a programming language matter as well; a language with
continuations (e.g. Scheme) would have no need for these types of
sub-threads. In Scheme, a function return is done using a function
call to a previously saved continuation.  It's permitted to jump
into the continuation of a function that has already terminated;
the environment captured by the continuation is still available.
(Such captured environments are garbage collected when they become
unreachable).  To me, things like fibers seem like low-level hacks to
provide platform-specific coroutines or continuations to the C language,
whereas threads are a language-independent operating system feature.

>If Fibers are unnecessary because Threads exist, then why not say that
>Threads are unnecessary because Processes exist?  (Threading comprises
>a SECOND method of splitting work up into separate units of control)

This argument assumes that threads are to processes what processes
are to the system. However according to one popular system model,
processes just become collections of resources that simply *have* one
or more independently scheduled control units. In this model, threads
are the only separate unit of control. A process that has one unit of
control is said to be single-threaded, rather than non-threaded.  Or,
under an alternative model exemplified by Linux, threads are just
collections of tasks that share resources in a certain way.  Two tasks
that don't share an address space, file table, etc are by convention
said to be in different processes. Again, there is just one method of
splitting work into units of control: the task.

 Q105: LWP migrating from one CPU to another in Solaris 2.5.1? 

Hej Magnus!

> Hi!
> I've got a question about threads in Solaris 2.5.1, that I hope You can
> answer for me!
> Short version:
> How does the algorithm work that causes an LWP to migrate from one CPU to
> another CPU in Solaris 2.5.1?

  The LWP gets context switched off CPU 0.  When a different CPU becomes 
available, the scheduler looks to see how many ticks have passed.  Solaris 2.5:
if less than 4, some other LWP (or none at all!) gets the CPU.  If > 3, then
just put the LWP on the new CPU.
> Longer version:
> I'm doing some research about a tool that I hope could be used by multi-thread
> programmers in order to find and possibly correct performance bottlenecks.
> Basically the tool works in three phases:
> 1) By running the multi-threaded program on a single processor we create a
>    trace which represents the behaviour of the program.
> 2) By simulating (or re-scheduling) the trace on a multi-processor we can tell
>    whether the program has the desired speed-up or not.
> 3) The simulated "execution" is displayed graphically in order to show where
>    the performance bottlenecks are.

  This sounds good...
> I've got a problem when simulating a program that hits a barrier.
> Assume that we, for instance, have 8 bound threads hitting the same barrier
> on a multiprocessor with 7 processors. Here the migration for an LWP from
> one CPU to another is very important. If we have no migration at all the speed
> up will be 4 compared to a single processor.
> On the other hand, if we have full migration, the speed up will be (almost) 7
> if we neglect the impact of cache-misses.
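
The two speed-up figures in the question (4 with no migration, almost 7
with full migration) follow from simple arithmetic.  The sketch below is an
illustrative model only, not the simulator described above: it assumes each
thread does one equal unit of CPU-bound work between barriers and ignores
cache effects.  Both function names are hypothetical.

```c
#include <assert.h>

/* No migration: every thread stays on the CPU it started on, so the
 * busiest CPU runs ceil(threads / cpus) threads back to back and
 * everyone else waits at the barrier for it. */
double speedup_no_migration(int threads, int cpus)
{
    assert(threads > 0 && cpus > 0);
    int rounds = (threads + cpus - 1) / cpus;   /* ceiling division */
    return (double)threads / rounds;
}

/* Full migration: the work spreads evenly over the CPUs, so the
 * speed-up is limited by the smaller of the two counts. */
double speedup_full_migration(int threads, int cpus)
{
    assert(threads > 0 && cpus > 0);
    return threads < cpus ? threads : cpus;
}
```

With 8 threads and 7 CPUs the first function gives 8 / 2 = 4 and the second
gives 7, matching the numbers quoted.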

  Of course said $ misses are a BIG deal.

  None-the-less...  I *think* this will happen on Solaris 2.5:

  The first 7 wake up and run for 1 tick (10ms).  The 7 drop 10 points of
priority.  T8 then gets CPU 7, while T1 - T6 run another tick.  They drop 10
points.  T7 wants CPU 7 and will get it from T8.  Now the time slice increases
because we're near the bottom of the priority scale.  Everybody runs for 10
ticks.  From here on out, one thread will migrate around while the others 
keep their CPUs.  I think.

  Of course you'd avoid writing a program that put 8 CPU-bound threads on 7

 Q106: What conditions would cause that thread to disappear? 


> I have a service thread which enters a never-exiting service loop via
> a while(1).  What conditions would cause that thread to disappear?

  You tell it to.  Either return(), pthread_exit(), or pthread_cancel().
That's the only way out.

> It can't be just returning off the end because of the while(1).  Past
> experience has indicated to me that if a single thread causes an
> exception such as a SEGV that the entire process is killed.  Are there
> known conditions which cause just the thread to exit without
> interfering with the rest of the process?

  You're right.  SEGV etc. kill the process (unless you replace the
signal handler).

> I suspect there's stack corruption in this thread, but I would have
> expected such corruption to take the form of a SEGV or something
> similar.  I'm very surprised that just the thread exited leaving
> everything else (seemingly) intact.

  So...  you have a problem.  I *expect* that you'll find the place
where the thread's exiting and it'll be something you wrote.  (The
other option is a library bug.  Always possible (if unlikely).)

  I'm disappointed to see that a breakpoint in pthread_exit() doesn't
get called in the Sun debugger.  Moreover, you don't even get to 
see the stack from the cleanup handlers!  (I'm making this a bug
report.)  I notice that from TSD destructors you at least get to
see a bit of the call stack.

  So...  I'd suggest this:  Declare some TSD, put a breakpoint in
the destructor, and see what happens when your thread exits.  Try
out the bit of code below.

  How does this work on other platforms?

cc -o tmp1 tmp1.c  -g -lpthread

#define _POSIX_C_SOURCE 199506L
#include <pthread.h>
#include <stdio.h>

pthread_attr_t  attr;
pthread_t       thread;
pthread_key_t   key;

/* Put the breakpoint here.  pthread_t is opaque; the cast below works
   on platforms where it is an integer type. */
void destroyer(void *arg)
{
  printf("T@%lu in TSD destructor.\n", (unsigned long) pthread_self());
}

void cleanup(void *arg)
{
  printf("T@%lu in cleanup handler.\n", (unsigned long) pthread_self());
}

void search_sub2(void)
{
  pthread_exit(NULL);        /* Surprise exit -- the one you forgot about */
}

void search_sub1(void)
{
  search_sub2();             /* do work */
}

void *search(void *arg)
{
  pthread_setspecific(key, (void *) 1234);   /* NEED A VALUE! */
  pthread_cleanup_push(cleanup, NULL);
  search_sub1();             /* do work */
  pthread_cleanup_pop(0);    /* push and pop must pair up lexically */
  return NULL;
}

int main(void)
{
  pthread_key_create(&key, destroyer);
  pthread_attr_init(&attr);
  pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE); /* Also default */

  pthread_create(&thread, &attr, search, NULL);
  pthread_join(thread, NULL);   /* wait so both messages get printed */
  return 0;
}


 Q107: What parts, if any, of the STL are thread-safe? 

Matt Austern wrote:
> Boris Goldberg  writes:
> > > >I'm finding a memory leak in the string deallocate() (on the call to
> > > >impl_->deallocate()) under heavy thread load, and it brings up a
> > > >frightening question:
> > >
> > > >What parts, if any, of the STL are thread-safe?
> > >
> > 
> > STL thread safety is implementation-dependent. Check with
> > your vendor. Many implementations are not thread-safe.
> One other important point: "thread safety" means different things to
> different people.  Programming with threads always involves some
> cooperation between the language/library and the programmer; the
> crucial question is exactly what the programmer has to do in order to
> get well-defined behavior.
> See for an
> example of an STL thread-safety policy.  It's not the only conceivable
> threading policy, but, as the document says, it is "what we believe to
> be the most useful form of thread-safety."

 Q108: Do pthreads libraries support cooperative threads? 

Paul Bandler wrote:
> Bryan O'Sullivan wrote:
> >
> > p> Thanks for those who have sent some interesting replies (although
> > p> no-one seems to think it's a good idea to not go all the way with
> > p> pre-emptive pthreads).
> >
> > This is because you can't go halfway.  Either you use pthreads in a
> > fully safe manner, or your code breaks horribly at some point on some
> > platform.
> OK, so you would disagree with the postings below from Frank Mueller and
> Dave Butenhof in July that indicate it is possible (if inadvisable)?
> Frank Mueller wrote:
> >
> >Raphael.Schumacher@SWISSTELECOM.COM (Schumacher Raphael, GD-FE64) >writes:
> >[deleted...]
> > > 1) Do pthreads libraries support cooperative threads?
> >
> > In a way, somewhat. Use FIFO_SCHED and create all threads at the same priority level,
> > and a thread will only give up control on a blocking operation, e.g. yield, cond_wait,
> > mutex_lock and (if thread-blocking is supported) maybe on blocking I/O (read, write, accept...)
> >
> > This may be close enough to what you want. Short of this, you probably need your coop_yield, yes.
> On HP-UX, at least until 10.30 (which introduces kernel thread support),
> the SCHED_FIFO [note, not "FIFO_SCHED"] scheduling policy workaround
> might work for you, because your threads won't face multiprocessor
> scheduling. I wouldn't recommend it, though -- and of course it won't
> come even close to working on any multiprocessor system that supports
> SMP threads (Solaris, Digital UNIX, IRIX, or even the AIX draft 7
> threads). If you're interested in thread-safety, go for thread-safety.
> While it might be nice to give yourself the early illusion that your
> known unsafe code is running, that illusion could be dangerous later! If
> you've got a real need to run the software sooner than you can convert
> it, you're likely to run into other problems (such as the order in which
> threads run?) If you don't have an immediate need, why look for
> shortcuts that you know are only temporary?
> If you really want to build a "cooperative scheduling" package for your
> threads, (and again, I don't recommend it!), build your own. It's not
> that hard. Inactive threads just block themselves on a condition
> variable until requested to run by some other thread (which signals the
> condition variable and then blocks itself on the same, or another,
> condition variable).
> The "1)" in the original mail implies the first item of a list, but my
> news server has chosen, in its infinitesimal wisdom, not to reveal the
> original post to me. So perhaps I'll have more to say should it repent!
> /---------------------------[ Dave Butenhof ]--------------------------\
> | Digital Equipment Corporation          |
> | 110 Spit Brook Rd ZKO2-3/Q18 |
> | Nashua NH 03062-2698 |
> \-----------------[ Better Living Through Concurrency ]----------------/
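
The "build your own" cooperative scheduling that Dave describes (inactive
threads block on a condition variable until some other thread asks them to
run) might look roughly like the sketch below.  It is only one possible
shape, assuming a fixed round-robin order; coop_wait, coop_yield and
run_coop_demo are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

#define NTHREADS 3
#define ROUNDS   4

static pthread_mutex_t baton_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  baton_cond = PTHREAD_COND_INITIALIZER;
static int turn = 0;                     /* id of the one thread allowed to run */
static int trace[NTHREADS * ROUNDS];     /* order in which threads actually ran */
static int trace_len = 0;

/* Block the caller until it holds the baton. */
static void coop_wait(int id)
{
    pthread_mutex_lock(&baton_lock);
    while (turn != id)
        pthread_cond_wait(&baton_cond, &baton_lock);
    pthread_mutex_unlock(&baton_lock);
}

/* Hand the baton to the next thread in round-robin order. */
static void coop_yield(int id)
{
    pthread_mutex_lock(&baton_lock);
    turn = (id + 1) % NTHREADS;
    pthread_cond_broadcast(&baton_cond);  /* wake everyone; only one proceeds */
    pthread_mutex_unlock(&baton_lock);
}

static void *worker(void *arg)
{
    int id = *(int *)arg;
    for (int i = 0; i < ROUNDS; i++) {
        coop_wait(id);
        trace[trace_len++] = id;          /* only the baton holder gets here */
        coop_yield(id);
    }
    return NULL;
}

/* Start the workers, wait for them, return how many steps were recorded. */
int run_coop_demo(void)
{
    pthread_t tids[NTHREADS];
    int ids[NTHREADS];

    turn = 0;
    trace_len = 0;
    for (int i = 0; i < NTHREADS; i++) {
        ids[i] = i;
        int rc = pthread_create(&tids[i], NULL, worker, &ids[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(tids[i], NULL);
    return trace_len;
}
```

Because a thread touches the shared trace only between coop_wait() and
coop_yield(), the mutex hand-off orders every access, and the recorded order
is 0, 1, 2, 0, 1, 2, ... on any number of processors.  The price is exactly
what the post warns about: the threads never run in parallel.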

 Q109: Can I avoid mutexes by using globals? 

> > j> Now, I have implemented this without using a synchronization
> > j> mechanism for the integer. Since I have only one writer, and
> > j> multiple readers, and the datum is a simple integer, I believe I
> > j> can get away with this.

> But on the other hand why not do it correctly with locks? Locks
> will make the code easier to maintain because it will be somewhat
> self documenting (the call to rwlock() should give most programmers
> a clue) and it will be more robust. In my experience threaded
> programs are more fragile and more difficult to debug than single
> threaded programs. It is a good idea to keep thread synchronization
> as controlled as you can; this will make debugging simpler.
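
For the one-writer, many-readers integer in the question, doing it "correctly
with locks" could be as small as the sketch below.  It assumes the UNIX98
read-write lock calls are available; set_value, get_value and
run_rwlock_demo are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

static pthread_rwlock_t value_lock = PTHREAD_RWLOCK_INITIALIZER;
static int shared_value = 0;

/* The single writer takes the lock exclusively. */
void set_value(int v)
{
    pthread_rwlock_wrlock(&value_lock);
    shared_value = v;
    pthread_rwlock_unlock(&value_lock);
}

/* Any number of readers may hold the lock at the same time. */
int get_value(void)
{
    int v;
    pthread_rwlock_rdlock(&value_lock);
    v = shared_value;
    pthread_rwlock_unlock(&value_lock);
    return v;
}

static void *writer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= 1000; i++)
        set_value(i);
    return NULL;
}

/* Read concurrently with the writer, then return the final value. */
int run_rwlock_demo(void)
{
    pthread_t tid;
    int last = 0;
    int rc = pthread_create(&tid, NULL, writer, NULL);
    assert(rc == 0);
    (void)rc;

    for (int i = 0; i < 1000; i++) {
        int v = get_value();
        assert(v >= last);     /* the writer only ever counts upward */
        last = v;
    }
    pthread_join(tid, NULL);
    return get_value();
}
```

The two accessor functions are the whole interface, so the next maintainer
can see at a glance how the integer is protected.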

Remember that sign in "The Wizard of Oz"?

  "I'd go back if I were you."

When you port your program to the next OS or platform and a new
bug appears...  Could it be caused by your hack?  Won't it be
fun trying to guess with each new bug?  How will you prove to yourself
that the bug is elsewhere?

And that guy who maintains your code, won't he have fun?

That's three opinions...

"Just the place for a snark! I have said it three times.
And what I say thrice is true!"


 Q110: Aborting an MT Sybase SQL? 

Bryan Althaus wrote:
> Jim Phillips  wrote:
> : We are using Sybase on Solaris for a database application.
> : We are trying to abort a query that turns out to be long running by
> : using POSIX pthread() in a manner which allows a calling X-Windows
> : program to interrupt a fetch loop and cancel the query.
> : At runtime, it works OK sometimes and sometimes it doesn't.  If we
> : attempt to start a new query too soon after canceling the long running
> : query, we get the remains of the old query's result set.  If we wait a
> : couple of minutes before starting the new query,  then it works fine and
> : the new query's expected result set is returned.
> : We are using ESQL and Solaris 2.5 C compiler to build the Sybase SQL
> : Server 11.0.2 program interface.
> : I have heard a rumor that you cannot use pthread with some X-11
> : versions?
> : Anybody out there have any ideas, thoughts, comments or critique(s).
> Having come from a Sybase OpenServer class, I can tell you that you can't
> "cancel" a query.  Once it is sent to Sybase, the query will run till
> completion.  There is no way to stop Sybase from executing the entire
> query.
> I'm not saying that's your problem, just that I notice you say if you
> wait a couple of minutes before starting the new query all is fine.
> Obviously by then the old query has finished.
> I haven't used ESQL in years (use C++/DBtools.h++) so I don't know how your
> connection to Sybase is done. When you "cancel" do you close the old connection,
> and then open up a new connection when you start a new query?
> Just sounds like you may be using the same connection and getting back the
> first query to finish, which would be the "cancelled" one.
> You might try comp.databases.sybase if this theory is at all likely.
> In any case I'd be interested in what the problem turns out to be.
> Good luck,
> Bryan Althaus

 Q111: Other MT tools? 

The Etch group at UW and Harvard is distributing some program
development and analysis tools based on the Etch Binary
Rewriting Engine for NT/x86 boxes.

The first of these tools is a call graph profiler that helps 
programmers understand where time is going in their program.
The Etch Call Graph Profiler works on Win32 apps built by most
compilers, does not require source, works with multithreaded
programs, understands program and system DLLs, and is designed
to run reasonably fast even if your machine only has an 'average'
configuration. We've used it on serious programs ranging from
Microsoft's SQL server to Lotus Wordpro, and even on Direct Draw
games like Monster Truck Madness.

If you'd like to give our profiler a try, you can download it from:

Follow the link to Download Etch Tools.

Part of our motivation for distributing these tools is to get
some feedback about what we've built and what we should be
building. So please give our tools a try, and send mail to with your comments and


The Etch Group

    Ted Romer
    Wayne Wong
    Alec Wolman
    Geoff Voelker
    Dennis Lee
    Ville Aikas
    Brad Chen
    Hank Levy
    Brian Bershad

 Q112: That's not a book. That's a pamphlet! 

Brian Silver wrote:
> Ben Self wrote:
> > 
> [Snip]
> > Since this is a response largely to Brian Silver's post, a person I have
> > a good deal of respect for, I have chosen to include some quotes from
> > Dave Butenhof's book, Programming with POSIX Threads, because we both
> > know it and have a mutual admiration for his work.
> Flattery will get you everywhere!
> (But what makes you think I admire Dave's work!? His code sure is
> formatted nice, though!  He also puts useful comments in
> his code. Except for the thd_suspend implementation in his book.)

Brian, after all, likes to tell me that I've published a "pamphlet",
because Addison-Wesley chose to use a soft cover, and "books" have hard
covers. For those who may not have figured all this out, by the way,
Brian's office is diagonally across from mine -- he's currently a
contractor working on Digital's threads library.

As for the suspend example in the book... 'tis true, it is not well
documented. Of course, Brian should have been a little more cautious,
since, as a footnote documents for posterity, the example is Brian's
code, essentially as he wrote it. (And, as Brian just admitted, he
doesn't tend to comment as well as I do. ;-) )

Dave Butenhof 

 Q113: Using recursive mutexes and condition variables? 

I have a question regarding recursive mutexes and condition variables.

Given a mutex created with one of the following attributes:

DCE threads
    pthread_mutexattr_setkind_np( &attr, MUTEX_RECURSIVE_NP );

X/Open XSH5 (UNIX98)
    pthread_mutexattr_settype( &attr, PTHREAD_MUTEX_RECURSIVE );

What exactly is the behavior of a pthread_cond_wait() and what effect do
"nested" locks have on this behavior?

Does mixing recursive locks and condition variables make any sense?

This is largely academic.  However, I (like everyone else in known
space/time) maintain an OO abstraction of a portable subset of Pthreads
and would like to know the appropriate semantics.  

Since the advent of the framework (about 3 years ago) I have managed to
avoid using recursive mutexes.  Unfortunately, my back may be against
the wall on a few new projects and I may be forced to use them.  They
seem to be a real pain.

And yes I promise this concludes my postings on cvs for the forseen



 Q114: How to cleanup TSD in Win32? 

>I am forced to use TSD in multithreading existing code. I just ran
>into the problem that, while the destructor function argument to
>pthread_key_create() and thr_keycreate() appears ideal, there is no
>such facility with the NT TlsAlloc() and that related stuff.

It's pretty easy to write the code to provide this facility. Basically
you have to wrap TlsAlloc() and TlsFree() in some additional logic.
This logic maintains a list of all the keys currently allocated for
your process. For each allocated key it stores the address of the
destructor routine for that key (supplied as an argument to the
wrapped version of TlsAlloc()). When a thread exits it iterates
through these records; for every key that references valid data (i.e.
TlsGetValue() returns non-NULL) call the relevant destructor routine,
supplying the address of the thread specific data as an argument.

The tricky part to all this is figuring out when to call the routine
that invokes the destructors. If you have complete control over all
your threads then you can make sure that it happens in the right
place. If, on the other hand, you are writing a DLL and you do not
have control over thread creation/termination this whole approach gets
rather messy. You can do most of the right stuff in the
DLL_THREAD_DETACH section of DllMain() but the thread that attached
your DLL will not take this route, and trying to clean up TLS from
DLL_PROCESS_DETACH is dangerous at best.
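
The bookkeeping described above can be sketched in a few dozen lines.  This
is an illustration under assumptions, not Gilbert's code: the names
tsd_key_create, tsd_run_destructors and the raw_* helpers are hypothetical,
the table has a fixed size and no locking, and a wrapped TlsFree() is left
out.  So that the logic can be tried away from NT, the non-Windows branch
stands in a pthread key created without a destructor for a bare TlsAlloc()
slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
typedef DWORD raw_key;
static int   raw_alloc(raw_key *k)       { *k = TlsAlloc(); return *k != TLS_OUT_OF_INDEXES; }
static void *raw_get(raw_key k)          { return TlsGetValue(k); }
static void  raw_set(raw_key k, void *v) { TlsSetValue(k, v); }
#else
/* Stand-in for trying the sketch elsewhere: no destructor is registered. */
#include <pthread.h>
typedef pthread_key_t raw_key;
static int   raw_alloc(raw_key *k)       { return pthread_key_create(k, NULL) == 0; }
static void *raw_get(raw_key k)          { return pthread_getspecific(k); }
static void  raw_set(raw_key k, void *v) { pthread_setspecific(k, v); }
#endif

#define MAX_KEYS 64

typedef void (*tsd_destructor)(void *);

/* One record per allocated key.  A real version would guard this table
 * with a lock, since keys can be created from any thread. */
static struct { raw_key key; tsd_destructor dtor; } registry[MAX_KEYS];
static int registry_len = 0;

/* Wrapped allocation: remember the destructor alongside the key. */
int tsd_key_create(raw_key *key, tsd_destructor dtor)
{
    assert(key != NULL);
    if (registry_len == MAX_KEYS || !raw_alloc(key))
        return -1;
    registry[registry_len].key  = *key;
    registry[registry_len].dtor = dtor;
    registry_len++;
    return 0;
}

/* Call as the last thing a thread does (or from DLL_THREAD_DETACH). */
void tsd_run_destructors(void)
{
    for (int i = 0; i < registry_len; i++) {
        void *value = raw_get(registry[i].key);
        if (value != NULL && registry[i].dtor != NULL) {
            raw_set(registry[i].key, NULL);   /* clear the slot before calling */
            registry[i].dtor(value);
        }
    }
}

/* Example destructor used to try the sketch out. */
static int freed_count = 0;
static void count_and_free(void *p) { free(p); freed_count++; }
```

A thread would store its data with raw_set() after creating the key with
tsd_key_create(&key, count_and_free), and call tsd_run_destructors() just
before it returns.  Deciding where that last call goes is the tricky part
already discussed above.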

Good luck.

Gilbert W. Pilz Jr.
Systems Software Consultant

 Q115: Onyx1 architecture has one problem 

Hi there,

I made some parallel measurements on SGI's.

It seemed that the Onyx1 architecture has one problem:
the system bus, which introduces a communication bottleneck.
Memory access and floating point calculation introduce traffic
on that bus.

Measurements on the new Onyx2 crossbar-based architecture
suggested that these problems would be solved. However, some early
measurements suggested two thoughts:
1. Floating point calculation scales better on the Onyx2 architecture,
which suggests that this problem was really communication related.
(-> crossbar). Going beyond 4 processors (more than one crossbar),
 the scaling goes down.

2. Memory allocation:
Memory allocation (basically a sequential operation) is *really*
slow. Most time is spent at the locking mechanism. This surprises
me, because the pthread mutexes I'm using in the code are called
at least as much as the memory allocation, but they are much faster.

Does anybody at SGI have some hints to explain this behaviour?



======== 8< ======= 8< ======= 8< ======= 8< ======= 
Dirk Bartz                  University of Tuebingen

Dirk Bartz  writes:

> 2. Memory allocation:
> Memory allocation (basically a sequential operation) is *really*
> slow. Most time is spent at the locking mechanism.

I have noticed this as well, albeit with the `old' sproc-threads on
Irix-5.3.  ptmalloc seems to be an order of magnitude faster in the
presence of multiple threads on that platform:

However, for Irix-6 with pthreads, you have to use a modified
ptmalloc/thread-m.h file, as I've recently discovered.  I will send
you that file by mail if you're interested; it will also be in the
next ptmalloc release, due out RSN.


 Q116: LinuxThreads linked with X11 seg faults. 

Unfortunately the X11 libraries are not compiled with -D_REENTRANT, hence
the problems. You can get the source for the X11 libraries and rebuild them
with the -D_REENTRANT flag and that should help.

If you are using Motif you are out of luck. I spoke to the folks who supply
motif for RedHat Linux. They refused to give me a version recompiled with
the -D_REENTRANT version. They gave me a load of crap about having to test
it and so forth.

I tried using LessTif, but it seemed to be missing too much.


 Q117: Comments about Linux and Threads and X11 

> LinuxThreads linked with X11 by g++ causes calls to the X11 library to seg
> fault.

You can either use Proven's pthread package or LinuxThreads.

Proven's is a giant replacement for the standard libraries that
does user level threads in a single process. 

LinuxThreads uses the operating system clone() call to implement
threads as separate processes that share the same memory space.
LinuxThreads seems to be tricky to install as it requires new
versions of the standard libraries in addition to a 2.x kernel and
the pthread library. However, if you get the latest version of RedHat,
you're all set.

I've found Proven's implementation to be much faster, though somewhat
messier to compile and a bit incomplete in its system call
implementation (remember, it has to provide a substitute for almost
every system call). Unfortunately I had to switch to LinuxThreads
because the signal handling under Proven's threads was not working

In particular, disk performance seems to suffer under LinuxThreads.
As far as I can tell, the OS level disk caching scheme gets confused
by all the thread/processes that are created.

It's also a bit unnerving typing "ps" and seeing forty copies
of your application running!



 Q118: Memory barriers for synchronization 

Joe Seigh wrote:
> So there are memory barriers in mutexes, contrary to what has been stated
> before in this newsgroup.  Furthermore, it appears from what you are saying
> that the mutex lock acts as a fetch memory barrier and the mutex unlock
> acts as a store memory barrier, much like Java's mutex definitions.
> Which is not surprising.  Java appears to have carried over quite a bit of the
> POSIX thread semantics.

This is not QUITE correct. First off, the semantic of locking or unlocking a
mutex makes no distinction regarding read or write. In an architecture that
allows reordering reads and writes, neither reads nor writes may be allowed
to migrate beyond the scope of the mutex lock, in either direction. That is,
if the architecture supports both "fetch" and "store" barriers, you must
apply the behavior of both to locking AND unlocking a mutex.

The Alpha, for example, uses MB to prevent reordering of both reads and
writes across the "barrier". There's also a WMB that allows read reordering,
but prevents write reordering. WMB, while tempting and faster, CANNOT be
used to implement (either lock or unlock of) a POSIX mutex, because it
doesn't provide the necessary level of protection against reordering.

Finally, let's be sure we're speaking the same language, ("Gibberishese").
People use "memory barrier" to mean various things. For some, it means a
full cache flush that ensures total main memory coherency with respect to
the invoking processor. That's fine, but it's stronger than required for a
mutex, and it's not what *I* mean. The actual required semantic (and that of
the Alpha) is that a "memory barrier" controls how memory accesses by the
processor may be reordered before reaching main memory. There's no "flush",
nor is such a thing necessary. Instead, you ensure that data (reads and
writes) cannot be reordered past the lock, in either of the processors
involved in some transaction.

An MB preceding the unlock of a mutex guarantees that all data visible to
the unlocking processor is consistent as of the unlock operation. An MB
following the lock of the mutex guarantees that the data visible to the
locking processor is consistent as of the lock operation. Thus, unlocking a
mutex in one thread does not guarantee consistent memory visibility to
another thread that doesn't lock a mutex. Coherent memory visibility, in the
POSIX model, for both readers and writers, is guaranteed only by calling
specific POSIX functions; the most common of which are locking and unlocking
a mutex. A "memory barrier", of any sort, is merely one possible hardware
mechanism to implement the POSIX rules.
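
In portable code the rule comes down to this: both the thread that writes
and the thread that reads must go through the mutex.  The sketch below is a
minimal illustration with hypothetical names (publisher, consume,
run_visibility_demo).  It polls for brevity; real code would normally wait
on a condition variable.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t pub_lock = PTHREAD_MUTEX_INITIALIZER;
static int payload = 0;    /* the data being handed over */
static int ready   = 0;    /* set once payload is valid */

static void *publisher(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&pub_lock);
    payload = 42;
    ready   = 1;
    pthread_mutex_unlock(&pub_lock);  /* the writes above may not move past this */
    return NULL;
}

/* Poll under the mutex until the publisher has finished. */
int consume(void)
{
    int seen = 0, value = 0;
    while (!seen) {
        pthread_mutex_lock(&pub_lock);   /* the reads below may not move before this */
        seen  = ready;
        value = payload;
        pthread_mutex_unlock(&pub_lock);
        if (!seen)
            sched_yield();
    }
    return value;
}

int run_visibility_demo(void)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, publisher, NULL);
    assert(rc == 0);
    (void)rc;
    int value = consume();
    pthread_join(tid, NULL);
    return value;
}
```

If consume() read ready and payload without taking pub_lock, POSIX would
make no promise about what it sees, whatever barrier instructions the
unlock in publisher() happens to use on a given machine.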

/---------------------------[ Dave Butenhof ]--------------------------\

 Q119: Recursive mutex debate 

Robert White wrote:
> I STRONGLY DISAGREE with the idea that recursive mutexes "are a bad idea".
> I have made and use a recursive mutex class in several key C++ endeavors.  As a
> low-level tool recursive mutexes are "bad" in that they tend to lead the sloppy
> down dangerous roads.  Conversely, in experienced hands a recursive mutex is a
> tool of simple elegance.  The core thing, as always, is "knowing what you are
> doing".

Hey, look, recursive mutexes aren't illegal, they're not "morally
perverse", and with XSH5 (UNIX98) they're even standard and portable.
So, fine -- if you like them, you use them. Use them as much as you
like, and in any way you like.

But remember that they're ALWAYS more expensive than "normal" mutexes
(unless your normal mutexes are more expensive than they need to be for
the platform!). And remember that WAITING on a condition variable using
a recursively locked mutex simply won't work. So, if you're using a
condition variable to manage your queue states, you need to at least
analyze your lock usage sufficiently to ensure that the wait will work.
And, once you've done that, it's a simple step to dropping back to a
normal mutex.
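
The reason a wait on a recursively locked mutex goes wrong can be seen
without any condition variable at all.  The sketch below assumes the XSH5
(UNIX98) PTHREAD_MUTEX_RECURSIVE type; rmutex_init and
probe_from_other_thread are hypothetical names.  It shows that after one
unlock of a doubly locked mutex, other threads still cannot get in, which is
the state a condition wait would leave behind if it released only one level.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t rmutex;

/* Create the mutex with the recursive type. */
void rmutex_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    assert(rc == 0);
    rc = pthread_mutex_init(&rmutex, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);
}

/* Runs in a second thread: report whether the mutex can be taken. */
static void *probe(void *result)
{
    int rc = pthread_mutex_trylock(&rmutex);
    if (rc == 0)
        pthread_mutex_unlock(&rmutex);
    *(int *)result = rc;
    return NULL;
}

/* Returns the trylock result as seen from another thread:
 * 0 if the mutex was free, EBUSY if it is still held. */
int probe_from_other_thread(void)
{
    pthread_t tid;
    int result = -1;
    int rc = pthread_create(&tid, NULL, probe, &result);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, NULL);
    return result;
}
```

Lock rmutex twice in one thread and unlock it once: probe_from_other_thread()
reports EBUSY.  Only after the second unlock does it report 0.  A thread
that is supposed to signal the condition needs the mutex first, so it would
be stuck behind the outer lock while the waiter sleeps.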

There are definitely cases where the expense is acceptable, especially
when modifying existing code -- for example, to create a thread-safe
stdio package. The performance isn't "extremely critical", and you don't
need to worry about condition wait deadlocks (there's no reason to use
them in stdio). Sorting out all of the interactions between the parts of
the package is difficult, and requires a lot of new coding and
reorganization -- and implementing some of the correct semantics gets
really tricky.

Don't waste time optimizing code that's not on the critical path. If
you've got code that's on your critical path, and uses recursive
mutexes, then it's NOT optimized. If you care, you should remove the
recursive mutexes. If you don't care, fine. If the use of recursive
mutexes in non-critical-path code doesn't put it on the critical path,
there's no reason to worry about them.

Still, I, personally, would use a recursive mutex in new code only with
extreme reluctance and substantial consideration of the alternatives.

/---------------------------[ Dave Butenhof ]--------------------------\

[I echo Dave's "extreme reluctance and substantial consideration of the 
 alternatives" -Bil]

 Q120: Calling fork() from a thread 

> Can I fork from within a thread ?


> If that is not explicitly forbidden, then what happens to the other threads in
> the child process ?

There ARE no other threads in the child process. Just the one that
forked. If your application/library has background threads that need to
exist in a forked child, then you should set up an "atfork" child
handler (by calling pthread_atfork) to recreate them. And if you use
mutexes, and want your application/library to be "fork safe" at all, you
also need to supply an atfork handler set to pre-lock all your mutexes
in the parent, then release them in the parent and child handlers.
Otherwise, ANOTHER thread might have a mutex locked when one thread
forks -- and because the owning thread doesn't exist in the child, the
mutex could never be released. (And, worse, whatever data is protected
by the mutex is in an unknown and inconsistent state.)
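
A sketch of the handler set described above, assuming a library with a single lock. The names lib_lock, prepare, in_parent, in_child and atfork_demo are hypothetical. The child's use of pthread_mutex_trylock is for demonstration only, since it is not an async-signal-safe call.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical library lock protecting some library state. */
static pthread_mutex_t lib_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t handlers_once = PTHREAD_ONCE_INIT;

static void prepare(void)   { pthread_mutex_lock(&lib_lock); }   /* parent, before fork */
static void in_parent(void) { pthread_mutex_unlock(&lib_lock); } /* parent, after fork  */
static void in_child(void)  { pthread_mutex_unlock(&lib_lock); } /* child, after fork   */

/* Register exactly once; registering twice would lock twice in prepare. */
static void register_handlers(void)
{
    pthread_atfork(prepare, in_parent, in_child);
}

/* Forks and checks that the child can take the lock. Returns the
   child's exit status (0 on success), or -1 on error. */
int atfork_demo(void)
{
    int status;
    pid_t pid;

    pthread_once(&handlers_once, register_handlers);

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: in_child() released the lock, so this succeeds. */
        int rc = pthread_mutex_trylock(&lib_lock);
        _exit(rc == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because prepare() holds the lock across the fork, no other thread can be in the middle of an update when the child's copy of memory is taken.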

One draft of the POSIX standard had included the UI thread notion of
"forkall", where all threads were replicated in the child process. Some
consider this model preferable. Unfortunately, there are a lot of
problems with that, too, and they're harder to manage, because there's
no reasonable way for the threads to know that they've been cloned. (UI
threads allows that blocking kernel functions MAY fail with EINTR in the
child... but that's not a very good basis for a recovery mechanism.)
After much discussion and gnashing of teeth and tearing of hair, the
following draft removed the option of forkall.

> Is there a restriction saying that it's OK provided the child immediately does
> an exec ?

Actually, this is the ONLY way it's really safe, unless every "facility"
in the process has proper and correct atfork handling to protect all of
the process state across the fork.

In fact, despite the addition of atfork handlers in POSIX 1003.1c, the
standard specifically says that the child process is allowed to call
only async signal safe functions prior to exec. So, while the only real
purpose of atfork handlers is to protect the user-mode state of the
process, you're really not guaranteed that you can make any use of that
state in the child.
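
The fork-then-exec pattern can be sketched as follows, assuming /bin/sh exists on the system; the function name spawn_and_wait is hypothetical. Between fork() and exec the child calls nothing but execl and _exit.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks and immediately execs /bin/sh -c command. Returns the
   command's exit status, or -1 on error. */
int spawn_and_wait(const char *command)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: no locks, no stdio, no malloc; just exec. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);   /* reached only if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Any stranded mutexes in the child's copy of the address space stop mattering at exec, since that address space is discarded.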

> What if I do this on a multiprocessor machine ?

No real difference. You're more likely to have "stranded" mutexes and
predicates, of course, in a non-fork-safe process that forks, because
other threads were doing things simultaneously. But given timeslicing
and preemption and other factors, you can have "other threads" with
locked mutexes and inconsistent predicates even on a uniprocessor.

Just remember, that, in a threaded process, it's not polite to say "fork
you" ;-)

/---------------------------[ Dave Butenhof ]--------------------------\
> David Butenhof wrote:
>> The "UI thread" version of fork() copies ALL threads in the child. The
>> more standard and reasonable POSIX version creates a child process with a
>> single thread -- a copy of the one that called fork().
> Sorry to ask...what do you mean by `the "UI thread" version of fork()'?
> I'm a little confused here.

Alright, if you're only "a little confused", then we haven't done our jobs. 
We'll try for "very confused", OK? Let me know when we're there. ;-)

First, the reference to "UI threads" may have seemed to come out of the 
blue if you're new to this newsgroup and threads; so let's get that out of 
the way. "UI" was a committee that for a time controlled the direction and 
architecture of the System V UNIX specification. (UNIX International.) The 
thread interfaces and behavior they defined (which was essentially what Sun 
had devised for Solaris, modified somewhat along POSIX lines in places) are 
commonly known as "UI threads". (Or sometimes "Solaris threads" since they 
originated on Solaris and aren't widely available otherwise.)

The UI thread definition of fork() is that all threads exist, and continue 
execution, in the child process. Threads that are blocked, at the time of 
the fork(), in a function capable of returning EINTR *may* do so (but need 
not). The problem with this is that fork() in a process where threads work 
with external resources may corrupt those resources (e.g., writing 
duplicate records to a file) because neither thread may know that the 
fork() has occurred. UI threads also has fork1(), which creates a child 
containing only a copy of the calling thread. This is equivalent to the 
POSIX fork() function, which provides a more controlled environment. (You 
can always use pthread_atfork() handlers to create daemon threads, or 
whatever else you want, in the child.)

 Q121: Behavior of [pthread_yield()] sched_yield() 

> > I have a question regarding POSIX threads on Linux and Solaris. The
> > program below compiles and links well on both systems, but instead of the
> > expected "100000, " it always prints out
> > "100000, 0", so the thread is not really ever started.
> Well, both sets of output are legal and correct for the code you supplied.

Yes, this is correct.

> First, you see [p]thread_yield does not say "give control to another thread"
> it says "if there is another thread that can be run, now might be a good time
> to do that".  The library is under no obligation to actually yield.  (there
> is a good explanation of this elsewhere in this group, but it has to do with
> the fact that you are running under SCHED_OTHER semantics which are
> completely unspecified semantics, go figure.)

Just for clarity...

    pthread_yield is an artifact of the obsolete and crufty old
    DCE thread implementation (loose interpretation of the 1990
    draft 4 of the POSIX thread standard). It doesn't exist in
    POSIX threads.

    thr_yield is an artifact of the UI threads interface, which
    is, (effectively though not truly), Solaris proprietary.

    sched_yield is the equivalent POSIX function.

As Robert said, POSIX assigns no particular semantics to the SCHED_OTHER
scheduling policy. It's just a convenient name. In the lexicon that we
developed during the course of developing the realtime and thread POSIX
standards, it is "a portable way to be nonportable". When you use
SCHED_OTHER, which is the default scheduling policy, all bets are off.
POSIX says nothing about the scheduling behavior of the thread.
(Although it does require a conforming implementation to DOCUMENT what
the behavior will be.)

Because there's no definition of the behavior of SCHED_OTHER, it would
be rather hard to provide any guarantees about the operation of the
sched_yield function, wouldn't it?

If you want portable and guaranteed POSIX scheduling, you must use the
SCHED_FIFO or SCHED_RR scheduling policies (exclusively). And, of
course, you need to run on a system that supports them.
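
A sketch of requesting SCHED_FIFO through thread attributes, assuming the system supports the POSIX realtime scheduling options. The function names make_fifo_attr and fifo_attr_policy are hypothetical.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Fills in *attr so that a thread created with it requests SCHED_FIFO
   at the lowest FIFO priority. Returns 0 or an error number. */
int make_fifo_attr(pthread_attr_t *attr)
{
    struct sched_param param;
    int rc = pthread_attr_init(attr);

    if (rc != 0)
        return rc;
    memset(&param, 0, sizeof param);
    param.sched_priority = sched_get_priority_min(SCHED_FIFO);

    /* Without EXPLICIT_SCHED the new thread inherits the creator's
       policy and the settings below are ignored. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &param);
    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}

/* Returns the policy stored in a freshly built attribute, or -1. */
int fifo_attr_policy(void)
{
    pthread_attr_t attr;
    int policy = -1;

    if (make_fifo_attr(&attr) != 0)
        return -1;
    pthread_attr_getschedpolicy(&attr, &policy);
    pthread_attr_destroy(&attr);
    return policy;
}
```

Building the attribute usually succeeds for any user. On many systems pthread_create with such an attribute fails with EPERM unless the process has realtime privileges, so check that return value.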

> Next, the number of threads in a (POSIX) program does not necessarily say
> anything about the number of actual lightweight processes that will be used to
> execute the program.  In your example there is nothing that "forcibly" causes
> the main thread to give up the processor (you are 100% CPU related) so your
> first thread runs through to completion.  An identically arranged ADA program
> (which wouldn't quite be possible 8-) would have equally unstable results.
> (I've seen students write essentially this exact program to "play with" tasks
> and threads in ADA and C, but the program is not valid in any predictable
> way.)

POSIX doesn't even say that there's any such thing as a "light weight
process". It refers only obliquely to the hypothetical concept of a
"kernel execution entity", which might be used as one possible
implementation mechanism for Process Contention Scope thread scheduling.

> Finally, POSIX only says that there will be "enough" LWPs at any moment to
> ensure that the program as a whole "continues to make progress".

That's not strictly true. All POSIX says is that a thread that blocks
must not indefinitely prevent other threads from making progress. It
says nothing about LWPs, nor places any requirements upon how many there
must be.

> When you do the SIGINT from the keyboard you are essentially causing the
> "current" thread to do a pthread_exit/abort. Now there is only one thread
> left, the "second" one, so to keep the program progressing that one gets the
> LWP from the main thread.  That is why you see the second start up when you
> do a "^C"...

SIGINT shouldn't "do" anything to a thread, on a POSIX thread system. IF
it is not handled by a sigaction or a sigwait somewhere in the process,
the default signal action will be to terminate the process (NOT the
thread).

It's not clear from the original posting exactly where the described
results were seen: Linux or Solaris? My guess is that this is Linux,
with the LinuxThreads package. Your threads are really cloned PROCESSES,
and I believe that LinuxThreads still does nothing to properly implement
the POSIX signal model among the threads that compose the "process".
That may mean that, under some circumstances, (and in contradiction to
the POSIX standard), a signal may affect only one thread in the process.
The LinuxThreads FAQ says that SIGSTOP/SIGCONT will affect only the
targeted thread, for example. Although it also says that threads "dying"
of a signal will replicate the signal to the other threads, that might
not apply to SIGINT, or there might be a timing window or an outright
hole where that's not happening in this case.

LinuxThreads is, after all, a freeware thread package that's from all
reports done an excellent job of attacking a fairly ambitious goal. A
few restrictions and nonconformancies are inevitable and apparently
acceptable to those who use it (although it's gotta be a portability
nightmare for those who use signals a lot, you're always best off
avoiding signals in threaded programs anyway -- a little extra
"incentive" isn't a bad thing). If you see this behavior on Solaris,
however, it's a serious BUG that you should report to Sun.

> The very same program with a single valid "operational yield" (say reading a
> character from the input device right after the pthread_create()) will run at
> 100% CPU forever because it will never switch *OUT* of the second thread.

At least, that's true on Solaris, where user threads aren't timesliced.
To get multiple threads to operate concurrently, you need to either
manually create additional LWPs (thr_setconcurrency), or create the
threads using system contention scope (pthread_attr_setscope) so that
each has its own dedicated LWP. Solaris will timeslice the LWPs so that
multiple compute-bound threads/processes can share a single processor.
LinuxThreads directly maps each "POSIX thread" to a "kernel thread"
(cloned process), and should NOT suffer from the same problem. The
kernel will timeslice the "POSIX threads" just as it timeslices all
other processes in the system. On Digital UNIX, the 2-level scheduler
timeslices the user ("process contention scope") threads, so, if a
compute-bound SCHED_OTHER thread runs for its full quantum, another
thread will be given a chance to run.
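
A sketch of the system contention scope option mentioned above, assuming the implementation accepts PTHREAD_SCOPE_SYSTEM (an implementation may support only one scope). The names worker and system_scope_demo are hypothetical.

```c
#include <pthread.h>

static void *worker(void *arg)
{
    *(int *)arg = 1;   /* record that the thread ran */
    return NULL;
}

/* Creates one thread with system contention scope. Returns 0 if it
   ran, an error number if the scope or the create was refused. */
int system_scope_demo(void)
{
    pthread_attr_t attr;
    pthread_t t;
    int ran = 0, rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(&t, &attr, worker, &ran);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;
    pthread_join(t, NULL);
    return ran ? 0 : -1;
}
```

On a two-level scheduler this asks for a kernel-scheduled thread; on a one-to-one implementation it is simply the default behavior.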

> In essence there is no good "Hello World" program for (POSIX) threads (Which
> is essentially what you must have been trying to write 8-).  If the threads
> don't interact with the real world, or at least each other, the overall
> program will not really run.  The spec is written to be very responsive to
> real-world demands.  That responsiveness in the spec has this example as a
> clear degenerate case.

That's not true. "Hello world" is easy. If the thread just printed
"Hello world" and exited, and main either joined with it, or called
pthread_exit to terminate without trashing the process, you'd see
exactly the output you ought to expect, on any conforming POSIX
implementation. The problem is that the program in question is trying to
execute two compute-bound threads concurrently in SCHED_OTHER policy:
and the behavior of that case is simply "out of scope" for the standard.
The translation of which is that there's no reasonable assumption of a
portable behavior.
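
A minimal sketch of the "Hello world" case described here, with main joining the thread; the function names hello and hello_demo are hypothetical.

```c
#include <pthread.h>
#include <stdio.h>

static void *hello(void *arg)
{
    (void)arg;
    printf("Hello world\n");
    return NULL;
}

/* Creates the thread and waits for it. Returns 0 on success or the
   error number from pthread_create/pthread_join. */
int hello_demo(void)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, hello, NULL);

    if (rc != 0)
        return rc;
    return pthread_join(t, NULL);
}
```

The join is what makes the output predictable: the creator blocks, so the new thread must eventually run, whatever the scheduling policy.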

/---------------------------[ Dave Butenhof ]--------------------------\

 Q122: Behavior of pthread_setspecific() 

> Can you explain the discrepancy between your suggestion and the
> following warning, which I found in the SunOS 5.5.1 man page for
> "pthread_setspecific".
> ******************************************************************
>      pthread_setspecific(),                pthread_getspecific(),
>      thr_setspecific(),  and  thr_getspecific(),  may  be  called
>      either explicitly, or implicitly from a thread-specific data
>      destructor function.  However, calling pthread_setspecific()
>      or thr_setspecific() from a destructor may  result  in  lost
>      storage or infinite loops.
> SunOS 5.5.1         Last change: 30 Jun 1995                    4
> ******************************************************************
> I'm not sure how an infinite loop might occur, while using
> "pthread_setspecific" in a destructor.  Do you know the answer?

We're talking about two different things.

1) What the standard says, which is that the destructor is called, and
   may be called repeatedly (until a fixed, implementation specified
   limit, or forever), until the thread-specific data values for the
   thread become NULL. Because the standard doesn't say that the
   implementation is required to clear the value for each key as the
   destructor is called, that requirement is, implicitly, placed on the
   application. (This oversight will be corrected in a future update
   to the standard.)

   In order to set the value to NULL, you clearly must call the function
   pthread_setspecific() within the destructor. Note that setting the
   value to NULL within the destructor will work both with the current
   standard (and the current LinuxThreads literal implementation) AND
   with the fixed standard (and most other implementations, which have
   already implemented the correct semantics, figuring that an infinite
   loop usually is not desirable behavior).

2) The correct POSIX semantics, which are implemented by Solaris and
   Digital UNIX. (Probably also by IRIX, HP-UX, and AIX, although I
   haven't been able to verify that.) The Solaris manpage warning is
   imprecise, however. There's no problem with a destructor explicitly
   setting the value to NULL. The warning SHOULD say that setting a
   thread-specific data value to any non-NULL value within a destructor
   could lead to an infinite loop. Or, alternately, to a memory leak, if
   the new value represents allocated heap storage, and the system has
   a limit to the number of times it will retry thread-specific data
   destructors.
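
A sketch of a destructor that follows the advice in both points: it frees the value and explicitly sets the slot to NULL, which is harmless on implementations that already clear it. The names key, destroy_value and tsd_clear_demo are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t key;
static int destructor_calls;   /* written by the exiting thread, read after join */

static void destroy_value(void *value)
{
    free(value);
    /* Clear the slot so a literal implementation does not call us again. */
    pthread_setspecific(key, NULL);
    destructor_calls++;
}

static void *thread_main(void *arg)
{
    (void)arg;
    pthread_setspecific(key, malloc(16));
    return NULL;
}

/* Returns how many times the destructor ran for one thread, or -1. */
int tsd_clear_demo(void)
{
    pthread_t t;

    destructor_calls = 0;
    if (pthread_key_create(&key, destroy_value) != 0)
        return -1;
    if (pthread_create(&t, NULL, thread_main, NULL) != 0)
        return -1;
    pthread_join(t, NULL);
    pthread_key_delete(key);
    return destructor_calls;
}
```

Storing a fresh non-NULL value in destroy_value instead would trigger the repeat calls (or the leak) that the manpage warning is really about.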

/---------------------------[ Dave Butenhof ]--------------------------\

 Q123: Linking under OSF1 3.2: flags and library order  

Joerg Faschingbauer wrote:

> Hi,
> recently I posted a question about the correct linking order under
> Solaris 2.4. Got some valuable hints, thanks.
> I have a similar problem now, this time under OSF1 3.2. Can anybody
> tell me if the following is correct? I could not find any hints on
> that topic in the man pages.
> gcc ... -ldnet_stub -lm -lpthreads -lc_r -lmach
> Does pthreads need stuff from c_r, or the other way around? Do I need
> mach at all? Do I need dnet_stub at all?

In a threaded program prior to Digital UNIX 4.0, EVERYTHING needs
libc_r, because libc is not thread-safe. Yes, the thread library
requires libmach, and, because of bizarre symbol preemption requirements
(which, for trivia junkies, were at one time required by OSF for "OSF/1"
branding), if you don't include libmach explicitly things might not work
out right. You must specify libmach BEFORE libc_r. You don't need
-ldnet_stub unless YOU need it (or some other library you're including).
We certainly don't use it.

The best way to build a threaded program on 3.2 is to use "cc -threads".
If you're going to use gcc, or an older cxx that doesn't support
"-threads", or if you need to use ld to link, then the proper expansion
of "-threads" is:

     for compilation:
     for linkage:
          -lpthreads -lmach -lc_r

The linkage switches must be the LAST libraries, exclusive of libc. That
is, if you were using ld to link, ...

     ld <.o files...> -lpthreads -lmach -lc_r -lc crt0.o

I don't believe the position of -lm with respect to the thread libraries
will matter much, since it's pretty much independent. If you use -lm
-threads, however, libm will precede the thread libraries, and that's a
good standard to follow.

A side effect of "-threads" is that ld will automatically look for a
reentrant variant of any library that you specify. That is, if you
specify "-lfoo", and there's a "libfoo_r", ld will automatically use
libfoo_r. If you don't use -threads, you'll need to check /usr/shlib (or
/usr/lib if you're building non-shared) for reentrant variants.

Note that, to compile a DCE thread (draft 4) threaded program once you
move to Digital UNIX 4.0 or higher, the compilation expansion of
-threads will need to be changed to "-D_REENTRANT -D_PTHREAD_USE_D4",
and the list of libraries should be "-lpthreads -lpthread -lmach -lexc".
There's no libc_r on 4.0 (libc is fully thread-safe), and you need
libexc since we've integrated with the standard O/S exception mechanism.
Note the distinction between libpthread (the "core" library implementing
POSIX threads), and libpthreads (the "legacy" library containing DCE
thread and CMA wrapper functions on top of POSIX thread functions).

Minor additional notes: as of Digital UNIX 4.0D we've dropped the final
dependencies on the mach interfaces, so libmach is no longer required
(you'll get smaller binaries and faster activation by omitting it once
you no longer need to support earlier versions). And, of course, once
you've moved to 4.0 or later, you should port to POSIX threads, in which
case you can drop -lpthreads and -D_PTHREAD_USE_D4.

/---------------------------[ Dave Butenhof ]--------------------------\
 Q124: What is the TID during initialization?  

Lee Sailer wrote:

> In a program I am "maintaining", there is a
>    foo = RWThreadID();
> call at global scope.  Conceptually, this gets called before main().
> Does this seem OK?  Can this code rely on the "thread" being used to do
> initial construction to be the same as the "main thread"?

[For example, the .init sections of libraries run before main() starts. -Bil]

While the assumption will likely be true, most of the time, it strikes me
as an extremely dangerous and pointless assumption. There are a lot of
reasons why it might NOT be true, sometimes, on some platforms, under some
circumstances. There's no standard or rule of etiquette forbidding a
difference. Even if the "thread" is the same, the "thread ID" might change
as things get initialized.

I recommend avoiding any such assumptions.

/---------------------------[ Dave Butenhof ]--------------------------\

 Q125: TSD destructors run at exit time... and if it crashes?  

Sebastien Marc wrote:

> On Solaris you can associate a function (called destructor) that will be
> called at the termination of the thread, even if it crashes.

Almost. Both POSIX and UI threads interfaces include thread-specific data.
When you create a thread-specific data (TSD) key, you can specify a
destructor function that will be run when any thread with a non-NULL value
for that key terminates due to cancellation [POSIX only] or voluntary thread
exit (return from the thread's start routine, or a thread exit call --
pthread_exit or thr_exit).

Yes, you can use that as a sort of "atexit" for threads, if you make sure
that each thread uses pthread_setspecific/thr_setspecific to SET a non-NULL
value for the TSD key. (The default value is NULL, and only the thread itself
can set a value.)
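
A sketch of the per-thread "atexit" idea, showing that the destructor runs only for a thread that SET a non-NULL value. The names exit_key, on_thread_exit and thread_atexit_demo are hypothetical.

```c
#include <pthread.h>

static pthread_key_t exit_key;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int exit_hooks_run;

/* Runs when a thread that set a non-NULL value for exit_key exits. */
static void on_thread_exit(void *value)
{
    (void)value;
    pthread_setspecific(exit_key, NULL);
    pthread_mutex_lock(&count_lock);
    exit_hooks_run++;
    pthread_mutex_unlock(&count_lock);
}

/* A non-NULL arg means "register this thread for the exit hook". */
static void *thread_body(void *arg)
{
    if (arg != NULL)
        pthread_setspecific(exit_key, arg);
    return NULL;
}

/* Starts one thread that registers and one that does not; returns the
   number of exit hooks that ran, or -1 on error. */
int thread_atexit_demo(void)
{
    static int token;   /* any non-NULL address will do */
    pthread_t with_hook, without_hook;

    exit_hooks_run = 0;
    if (pthread_key_create(&exit_key, on_thread_exit) != 0)
        return -1;
    if (pthread_create(&with_hook, NULL, thread_body, &token) != 0)
        return -1;
    if (pthread_create(&without_hook, NULL, thread_body, NULL) != 0)
        return -1;
    pthread_join(with_hook, NULL);
    pthread_join(without_hook, NULL);
    pthread_key_delete(exit_key);
    return exit_hooks_run;
}
```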

However, that doesn't help. There is simply no way that a thread can "crash"
without taking the process with it. An unhandled signal will never terminate a
thread -- either the signal is ignored, or it does something to the process
(stop, continue, terminate). TSD destructors are NOT run:

   * on the child side of a fork
   * in a call to exec
   * in process termination, regardless of whether that termination is
     voluntary (e.g., a call to exit) or involuntary (an unhandled signal).

In all those cases, threads quietly "evaporate", leaving no trace of their
existence. No TSD destructors, no cleanup handlers, nothing. Gone. Poof.

/---------------------------[ Dave Butenhof ]--------------------------\

 Q126: Cancellation and condition variables  

Marcel Bastiaans wrote:

> Anyone:
> I appear to be missing something in my understanding of how condition
> variables work.  I am trying to write a multithreaded program which is
> portable to various platforms.  I am unable to cancel a thread if it is
> waiting on a condition variable which another thread is waiting on also.
> The problem can easily be reproduced on both Solaris 2.5 and HP-UX 10.10.  A
> simple program which demonstrates my problem is shown below.  This program
> sample uses the HP-UX pthreads library but the problem also appears when
> using Solaris threads on Solaris 2.5.

In any case... yes, you are missing something. The program, as written, will
hang on any conforming (or even reasonably correct) implementation of either
DCE threads or POSIX threads. (To put it another way, any implementation on
which it succeeds is completely broken.)

> Is there a problem in this program which I don't understand?  I cannot use
> cleanup handlers because not all platforms support them.  Any help would be
> greatly appreciated.

If you can use cancellation, you can use cleanup handlers. Both are part of
both DCE threads (what you're using on HP-UX 10.10) and POSIX threads (what you
probably are, and, at least, should be, using on Solaris 2.5.) If you've got
cancellation, and you don't have cleanup handlers, you've got an awesomely
broken implementation and you should immediately chuck it.

When you wait on a condition variable, and the thread may be cancelled, you
MUST use a cleanup handler. The thread will wake from the condition wait with
the associated mutex locked -- even if it was cancelled. If the thread doesn't
then unlock the mutex before terminating, that mutex cannot be used again by
the program... it will remain locked by the cancelled thread.

> #include 
> #include 
> #include 
> pthread_cond_t cond;
> pthread_mutex_t mutex;
> void * func(void *)
> {
>    // Allow this thread to be cancelled at any time
>    pthread_setcancel(CANCEL_ON);
>    pthread_setasynccancel(CANCEL_ON);

Serious, SERIOUS bug alert!! DELETE the preceding line before proceeding with
this or any other program. Never, ever, enable async cancellation except on
small sections of straight-line code that do not make any external calls.
Better yet, never use async cancel at all.

In any case, you absolutely CANNOT call any POSIX (or DCE) thread function with
async cancellation enabled except the ones that DISABLE async cancel. (For
bizarre and absolutely unjustifiable reasons [because they're wrong], POSIX
threads also allows you to call pthread_cancel -- but don't do it!)

>    // Wait forever on the condition var
>    pthread_mutex_lock(&mutex);
>    for(;;) {
>       pthread_cond_wait(&cond, &mutex);
>    }
>    pthread_mutex_unlock(&mutex);
>    return 0;
> }

I suspect your problem is in cancelling the second thread. As I said,
cancellation terminates the condition wait with the associated mutex locked.
You're just letting the thread terminate with the mutex still locked. That
means, cancelled or not, the second thread can never awaken from the condition
wait. (At a lower level, you could say that it HAS awakened from the condition
wait, but is now waiting on the mutex... and a mutex wait isn't cancellable.)

The answer is... if you use cancellation, you must also use cleanup handlers.
(Or other, non-portable equivalent mechanisms, such as exception handlers or
C++ object destructors... on platforms where they're implemented to
interoperate with cancellation. [Both Solaris and Digital UNIX, for example,
run C++ destructors on cancellation.])
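
A sketch of the fix in POSIX thread terms (not the DCE draft 4 calls used in the quoted program): each waiter pushes a cleanup handler that unlocks the mutex, and async cancellation is never enabled. The names unlock_mutex, waiter and cancel_waiters_demo are hypothetical.

```c
#include <pthread.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* Cleanup handler: runs on cancellation, with the mutex locked. */
static void unlock_mutex(void *m)
{
    pthread_mutex_unlock((pthread_mutex_t *)m);
}

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    pthread_cleanup_push(unlock_mutex, &mutex);
    for (;;)
        pthread_cond_wait(&cond, &mutex);   /* deferred cancellation point */
    pthread_cleanup_pop(1);   /* not reached; must pair with the push */
    return NULL;
}

/* Cancels two waiters on the same condition variable. Returns 0 if
   both were cancelled and the mutex was left unlocked. */
int cancel_waiters_demo(void)
{
    pthread_t t1, t2;
    void *r1 = NULL, *r2 = NULL;

    if (pthread_create(&t1, NULL, waiter, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, waiter, NULL) != 0)
        return -1;
    pthread_cancel(t1);
    pthread_cancel(t2);
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);
    if (r1 != PTHREAD_CANCELED || r2 != PTHREAD_CANCELED)
        return -1;
    if (pthread_mutex_trylock(&mutex) != 0)
        return 1;   /* a cancelled thread left it locked */
    pthread_mutex_unlock(&mutex);
    return 0;
}
```

Without the handler, the first cancelled thread would die holding the mutex and the second join would never return, which is the hang described in the question.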

/---------------------------[ Dave Butenhof ]--------------------------\
 Q127: RedHat 4.2 and LinuxThreads?  

> > The Linux kernel has supported multithreading for a very long time.
> thank you for the info, Bill.  the man page for clone that ships with
> Red Hat 4.2 states that clone does not work.  here are my questions.
> they all relate to Red Hat 4.2:
> 1. does clone work for all defined parameter values?
> 2. where can i find a list of the c library api's that are not
> reentrant?
> 3. does RedHat 4.2 install LinuxThreads if "everything" is selected?
> > Until recently, the API definition for POSIX thread support was
> > contained in the LinuxThreads package, but that's just a wrapper
> > around the kernel's built-in functioning.  With the release of libc6
> > (GNU libc) the LinuxThreads functionality is more tightly integrated
> > into the basic C library,
> do you mean that the POSIX thread api's are now in libc so that
> LinuxThreads is obsolete?

With the glibc2 (2.0.5 c is current I think) LinuxThreads is
obsolete. However, you have to get yourself the additional
glibc-linuxthreads package, but that's detail.

AFAIK glibc2 is still in the beta stage, but it works quite
well. Moreover, it is recommended to use glibc2 for multithreading
rather than libc5. As H.J.Lu, libc5's maintainer, once stated: "I'm
surprised it works at all" (or so).

You can install a "beta" of the libc6 (aka glibc2) as a secondary C
library against which you link your program, and keep the good old
libc5 as the primary library which the system related programs use.

Take a look at

for HOWTOs etc.

Joerg Faschingbauer                           
Voice: ++43/316/820918-31                            Fax: ++43/316/820918-99

 Q128: How do I measure thread timings?  
Andy Sunny wrote:

> I'm conducting some research to measure the following things about
> pthreads using a Multikron II Hardware Instrumentation Board from NIST
> 1) thread creation time (time to put thread on queue)
> 2) thread waiting time (time that thread waits on queue)
> 3) thread execution time (time that thread actually executes)
> Are there any decent papers that explain the pthreads run time system
> and scheduling policy in DETAIL? I have read Frank Mueller's (FSU) paper
> and am trying to obtain the standard from IEEE. What is the latest
> version of the standard and will it help me find the proper libraries
> and functions need to measure the above items?

The standard is unlikely to be of any help to you. It says nothing at all
about implementation. POSIX specifies SOURCE-LEVEL interfaces, and
describes the required portable semantics of those interfaces.
Implementation details are (deliberately, properly, and necessarily) left
entirely to the creator of each implementation. For example, there's no
mention of libraries -- an embedded system, for example, might include all
interfaces in an integrated kernel; and that's fine.

What you need is a document describing the internal implementation details
of the particular system you're using. If the vendor can't supply that,
you'll need to create it yourself -- either by reading source, if you can
get it, or by flailing around blindly in the dark and charting the walls
you hit.

/---------------------------[ Dave Butenhof ]--------------------------\
 Q129: Contrasting Win32 and POSIX thread designs  
Arun Sharma wrote:

> On Mon, 24 Nov 1997 18:10:13 GMT, Christophe Beauregard wrote:
>         c>  while thread context is a Windows concept.
> How so ? pthreads don't have contexts ?

This looks like an interesting discussion, of which I've missed the
beginning. (Perhaps only the followup was cross-posted to
comp.programming.threads?) Anyway, some comments:

Anything has "context".
A thread is an abstraction of the executable state traditionally
attributed to a process. The "process" retains the non-executable state,
including files and address space. Why would anyone contend that "thread
context is a Windows concept"? I can't imagine. Maybe it's buried in the
unquoted portion of the original message. And then again, some people
think Microsoft invented the world.

>         c> Generally, you'll find that pthreads gives you less control
>         c> over how a thread runs.  There are very good reasons for
>         c> this (one being portability, another being safety).
> In other words, it has to be the least common denominator in the
> fragmented UNIX world. No wonder people love NT and Win32 threads.

POSIX threads gives you far more real control over threads than Win32
(for example, far superior realtime scheduling control). What it doesn't
give you is suspend/resume and uncontrolled termination. Those aren't
"control over how a thread runs". They are extraordinarily poor
programming mechanisms that can almost never be used correctly. Yes, to
some people the key is "almost never", and one may argue that they should
be provided anyway for that 0.001% of applications that "need" it. (But
those of us who actually support threaded interfaces might also point out
that these extremely dangerous functions are for some reason particularly
tempting to beginners who don't know what they're doing -- resulting in
very high maintenance costs, which mostly involves helping them debug
problems in their code.)

This isn't an example of "fragmented UNIX" -- it's UNIX unity, with a
wide variety of different "UNIX camps" reaching a consensus on what's
necessary and useful.

While the Win32 interface comprises whatever the heck a few designers
felt like tossing in, POSIX was carefully designed and reviewed by a
large number of people, many of whom knew what they were doing. Omitting
these functions was a carefully considered, extensively discussed, and
quite deliberate decision. The Aspen committee that designed the thread
extensions to POSIX for the Single UNIX Specification, Version 2,
proposed suspend/resume -- they were later retracted by the original
proposer (with no objections). A POSIX draft standard currently under
development, 1003.1j, had proposed a mechanism for uncontrolled
termination, with the explicit recognition that it could be used (and
then only with extreme care) only in carefully constructed embedded
systems. It, too, was later withdrawn as the complications became more
obvious. (The notion that you can regain control of a process when you've
lost control of any one thread in the process is faulty, because all
threads depend completely on shared resources. If you've lost control of
a thread, you don't know the state of the process -- how can you expect
it to continue?)

>         c> Basically, using signals for dealing with threads is a Bad
>         c> Thing and people who try generally get screwed.
> It doesn't have to be so. That's an implementation problem.

Yes, it does have to be so, because signals are a bad idea to begin with.
Although there were enormous complications even before threads, the
concept becomes all but unsupportable with the addition of full
asynchronous execution contexts to the traditional process.

The "synchronous" signals, including SIGSEGV, should be language
exceptions. The other "asynchronous" signals should be handled
synchronously in independent contexts (threads). If you think about it,
that's what signals were attempting to do; the condition exists as a
separate execution context (the signal handler). Unfortunately, a signal
preempts the hardware context of the main execution context,
asynchronously. That's a really, really bad idea. Although people have
always casually done things like calling printf in signal handlers, too
few people realize that's always been incorrect and dangerous -- only a
small list of UNIX functions are "async-signal safe". The addition of
threads, however, allowing the process to have multiple contexts at any
time, increases the chances that some thread will be doing something that
will conflict with improper use of non async-signal safe functions at
signal level.

> Portability doesn't necessarily have to cripple the API.

And, in fact, it doesn't. It results in a well-designed and robust
interface that can be efficiently implemented everywhere. I'm not arguing
that the POSIX interface is perfect. There is room for additions, and the
Single UNIX Specification, Version 2, makes a good start. Other areas to
consider for future standardization would include debugging and analysis
interfaces. There are POSIX standards in progress to improve support for
"hard realtime" environments (for example, putting timeouts on all
blocking functions to control latency and help diagnose failures).

/---------------------------[ Dave Butenhof ]--------------------------\
 Q130: What does POSIX say about putting stubs in libc?  

Patrick TJ McPhee wrote:

> I'd like to know what Posix has to say about putting stubs in libc.
> Is it permitted? Is it required? What return code can we expect to
> receive from such a stub, and how can we portably ignore it?

POSIX doesn't have ANYTHING to say. POSIX 1003.1 doesn't really recognize the
existence of the concept of a "library". It defines a set of SOURCE LEVEL
interfaces that shall be provided by implementations and that may be used by
applications to achieve certain portable semantics. Now, 1003.2 says a little
about libraries. That is, there's something with a ".a" file suffix, and
there's a utility called "ar" to create them, and a utility called "c89" with
a "-l" switch that may read from a "lib.a" (and may also read
additional file suffixes, e.g., .so). 1003.2 doesn't say anything about which
symbols may or should be resolved from which libraries, though, and hasn't
been updated since 1003.1c-1995 (threads), in any case.

So, if your system's <unistd.h> provides a definition for _POSIX_THREADS, then
1003.1c-1995 tells you that you can potentially call pthread_create. It does
not tell you which libraries you need to link against. UNIX98 does specify
that, for c89, the proper incantation is "-lpthread". But even that's not the
same as a requirement that the symbols must resolve from, and only from, a
libpthread library: only that you're not allowed to build a portable threaded
application without REQUESTING libpthread. (And if you use cc instead of c89,
UNIX98 doesn't help you any more than POSIX, aside from the gentle SUGGESTION
that an implementation provide the thread implementation in a libpthread --
which had, in any case, already become the de facto industry standard.)

So, yes, it's "permitted", and, no, it's not "required".

If you're building an APPLICATION using threads, there's no confusion or
problem. You build according to the rules of the platform, and you've got
threads, wheresoever they might reside. If you try to use threads without
building properly, all bets are off, because you blew it. If you're getting
the interfaces accidentally from somewhere else, that's nobody's fault but
your own.

If you're trying to build thread-safe code that doesn't use threads, you've
got a portability problem. No standard will help you accomplish this. That's
too bad. Requiring libc "stubs" would be one way out -- but as I've already
said, (and as I'll reiterate in the next paragraph!), the Solaris
implementation has some serious limitations of which I don't approve. I would
not consider that an acceptable standard. I'm not entirely happy with our own
solution (a separate set of "tis" interfaces), either, because extra
interfaces are nobody's friend. One might say that there is room here for
innovation. ;-)

If you're trying to build a library that uses threads, regardless of whether
the main program uses threads -- well, you're in trouble again. You SHOULD be
able to simply build it as if you were building a threaded application, and it
should work. Unfortunately it won't work (portably) unless the main program is
linked against the thread library(s), whether or not it needs them. Symbol
preemption will work against you if there are "stubs" for any functions in a
library that will be searched by direct dependencies of the main program.
(Even if your library is searched first, ITS direct dependencies will go at
the end of the list.) That's the problem with the Solaris libc stubs. (I'd
like to say that Digital UNIX avoids this, and that's certainly the intent;
but unfortunately it's not yet entirely true. Although there are no stubs
conflicting with libpthread, we depend on the libexc exception library, which
has a conflicting stub in libc. Luckily, this affects relatively few
operations -- but, technically, it still means it doesn't work.)

On the other hand, your final question is easy. There's no need to "portably
ignore" the errors that a stub might generate. Look, if you try to create a
thread, it either succeeds or it fails. You get back 0, and it worked.
Anything else, and it failed. If the failure is EAGAIN, you might choose to
try again later. Otherwise... hey, you're just not going to create that
thread, so deal with it. The only question is: can you live with that? If you
don't NEED to create a thread, go on with life, single threaded. If you NEED
to create the thread, then you're done. (Whether you return a failure to your
caller, or abort the process, probably depends on what you're trying to do,
and how your interface is designed.) It really doesn't matter whether you got
activated against libc stubs or a real thread library that for some reason
refuses to create the thread. You're not going to do what you wanted to do,
and that's that.
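
The rule above (0 means it worked, EAGAIN may be worth retrying, anything
else means no thread) might look like this in code. This is a sketch under
assumptions: create_thread_with_retry, worker and run_job are hypothetical
names, and the one-second pause between attempts is an arbitrary choice.

```c
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical helper: try to create a thread, retrying on EAGAIN.
 * Returns 0 on success, otherwise the error number from pthread_create. */
int create_thread_with_retry(pthread_t *tid, void *(*fn)(void *), void *arg,
                             int max_tries)
{
    int err = EAGAIN;
    int i;

    for (i = 0; i < max_tries && err == EAGAIN; i++) {
        if (i > 0)
            sleep(1);           /* resources may free up later */
        err = pthread_create(tid, NULL, fn, arg);
    }
    return err;     /* nonzero: no thread, whether from a stub or not */
}

/* Trivial thread body for the example. */
static void *worker(void *arg)
{
    *(int *) arg = 42;
    return NULL;
}

/* Runs the job in a new thread if one can be had, else in the caller. */
int run_job(void)
{
    pthread_t tid;
    int result = 0;

    if (create_thread_with_retry(&tid, worker, &result, 3) == 0)
        pthread_join(tid, NULL);
    else
        worker(&result);        /* go on with life, single threaded */
    return result;
}
```

The caller never asks why creation failed beyond EAGAIN, so it behaves the
same whether it was linked against stubs or a real thread library.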

/---------------------------[ Dave Butenhof ]--------------------------\
 Q131: MT GC Issues  

[See Geodesic Systems.  -Bil]

Sanjay Ghemawat wrote:

> All collectors I have known of and read about control both the allocation
> and the deallocation of objects.  So it is fairly easy for them to grab
> all of the locks required before suspending threads.  The only problem
> here might be locks held within the operating system on behalf of a thread
> that is about to be suspended.  Even here, if one is using a thread interface
> like the one provided by Mach,  a call to "thread_abort" will pop the
> thread out of the OS.

There is no general or portable mechanism equivalent to
thread_abort, and it is pretty limited even on Mach. (First,
you have to know it's in a blocking Mach call.)

> >Furthermore, suspend/resume may possibly be necessary for concurrent garbage
> >collection (I haven't yet been convinced of that -- but I haven't found a good
> >alternative, either), but it's definitely far from desirable. It's an ugly and
> >stupendously inefficient kludge. Remember, you have COMPLETELY STOPPED the
> >application while garbage collecting. That's GOOD? Parallel applications want
> First of all, most incremental/concurrent collectors only stop the
> application while they find all of the pointers sitting in thread
> stacks/registers/etc.  The collectors that provide better real-time
> guarantees tend to make other operations (such as storing a pointer
> onto a stack) expensive.  I think there are two classes of systems
> here: hard real-time and others.  A good performance tradeoff for
> systems that do not require hard real-time bounds is to use an
> incremental/concurrent collector that may introduce pauses, but does
> not slow down the mutator with lots of book-keeping work.  So I think
> the argument that suspend/resume are bad only applies to some systems,
> not all.  Probably not even the vast majority of systems that run on
> desktops and commercial servers.

Look, if you want concurrency, you DON'T want pauses. I
already acknowledged that there might not be an alternative,
and that we certainly don't know of any now.  Maybe it's the
best tradeoff we can get. Stopping all the threads is still
bad.  Necessity does not equate to desirability. If you
really think that it is BENEFICIAL to stop the execution of
a concurrent process, argue on. Otherwise, drop it.

> >application while garbage collecting. That's GOOD? Parallel applications want
> >concurrency... not to be stopped dead at various unpredictable intervals so
> >the maintenance staff can check the trash cans. There has gotta be a better
> >way.
> So we should all wait and not use garbage-collection in multi-threaded
> programs until that better way is found?  I don't see why you are so
> vehemently set against suspend/resume.  It solves real problems for
> people implementing garbage collection.  Yes there are tradeoffs
> here: suspend/resume have their downside, but that doesn't mean we
> should ignore them.

Because suspend and resume are poor interfaces into the
wrong part of a scheduler for external use. They are
currently used (more or less) effectively for a small class
of specialized applications (e.g., garbage collection). They
are absolutely inappropriate for general use. The fact that
a bad function "can be used effectively" doesn't mean it
should be standardized. Standards do not, and should not,
attempt to solve all possible problems.

> So do that.  I don't think there are an unbounded number of such
> required operations.  In fact I have implemented a collector for a
> multi-threaded environment that requires just three such operations
> suspend, resume, and reading registers from a suspended thread.
> And folding in the register-state extraction into the suspend call
> seems like a fine idea.

Now, are we talking about "garbage collectors", or are we
talking about suspend and resume? All this rationalization
about garbage collection spills over naturally into a
discussion of suspend and resume -- but that's happenstance.
Sure, our concurrent GC systems, and especially Java, use
suspend/resume. But that's "because it was there", and
solved the problem of pinning down threads long enough to
get their state. But the function required of concurrent
garbage collection is not "suspend all threads and collect
their registers, then resume them". The required function is
"acquire a consistent set of live data root pointers within
the process".

Yes, there are a bounded set of operations required for GC
-- and that has nothing at all to do with suspend or
resume. If the argument for standardizing suspend and resume
is to revolve entirely around the needs of today's
semi-concurrent GC, then we should be designing an interface
to support what GC really needs, not standardizing an ugly
and dangerous overly-generalized scheduling function that
can be (mis-)used to implement one small part of what GC needs.

> >> Since the implementation basically needs to be there for any platform
> >> that supports Java, why not standardize it?  Alternatively, if there is
> >> a really portable solution using signals, I would like to see it
> >> advertised.
> >
> >Any "why not" can be rephrased as a "why". Or a "so what".
> Oh come on.  By this argument, you could do away with all standards.
> The reason for standards is so that multiple implementations with
> the same interface can exist and be used without changing the
> clients of the interface.

And by the converse, perhaps we should standardize every
whim that comes into anyone's head? Baloney. We should
standardize interfaces that are "necessary and
sufficient". Determining exactly of what that consists is
not always easy -- but it's important because standards have
far- and long-reaching consequences.

> I apologize if this message has come across as a bit of a rant, but
> I am tired of people assuming that everyone who asks for suspend/resume
> must be an idiot who does not understand the available thread
> synchronization mechanisms.  There are legitimate uses for
> suspend/resume, of course with some performance tradeoffs.  By
> making the decision to not implement them in a thread library,
> you are taking away the ability of the clients of the library
> to decide on the tradeoff according to their need.  That makes
> the thread library less useful.

I guess we're even -- because I'm tired of hearing people
insist that because they want suspend/resume, it must be
universally accepted as "a cool thing" and forced down
everyone's throat. It's not a cool thing. And, by the way, I
have yet to hear of a single truly legitimizing use. The use
of suspend and resume by GC is an expedient hack. It isn't
really accomplishing what GC needs. It's far more
heavyweight than required, (as you pointed out, a GC system
suspends threads to get a consistent set of root pointers,
NOT because it wants to suspend the threads), and it doesn't
provide half the required capabilities (after all, the real
goal is to get the pointers -- the registers).

As for your final dig, I'm tempted to laugh. You know what
makes a thread library less useful? Providing unsupportable
functions that are nearly impossible to use safely and that
therefore result in significant support costs, preventing
the development team from doing work that would provide
useful features and fixing real problems.

/---------------------------[ Dave Butenhof ]--------------------------\

 Q132: Some details on using CMA threads on Digital UNIX  

> I'm trying to port code from an HP that used the cma threads package to
> a DEC Alpha with the posix package. I've found that some of the standard
> header files (DEC C++) have conflicting definitions (e.g., sys/types.h
> and pthreads_exc.h). Has anyone encountered this problem and is there
> some simple conversion utility or a better library to use in such a port?

A number of questions present themselves immediately, including:

  1. What version of Digital UNIX are you using?
  2. Are you trying to compile with CMA, DCE thread (draft 4 POSIX), or true
     POSIX threads?
  3. What compiler/link options are you specifying?
  4. What headers do you include?
  5. What is actually happening?

A few comments:

  1. Digital UNIX provides both DCE thread (draft 4 POSIX, both "standard"
     and exception-raising) and CMA interfaces, as in any DCE
     implementation. To use these, compile with "cc -threads" or "cxx
     -threads", and link using the same. If you can't, compile with
     "-D_REENTRANT". Link depends on what version you're using --
     specifically 3.2(x) or 4.0(x). (And for link, watch out for _r
     libraries, e.g., or libfoo_r.a if linking static --
     "-threads" or "-pthread" will cause the linker to find and use them
     automatically; but if you roll your own you'll need to look for them.)
  2. Digital UNIX 4.0 (and higher) also provides true POSIX threads, if
     you're converting. Compile and link using "cc -pthread" or "cxx
     -pthread". If you can't, compile with "-D_REENTRANT" and link with
     "-lpthread -lexc -lc" (at the end of your list of files). (And, again,
     watch out for _r suffix libraries.)
  3. You mentioned "pthreads_exc.h". Well, that is a header used
     to define the exception-raising variant of the DCE thread (draft 4
     POSIX) API. This conflicts with the implication of your statement
     "with the posix package", since DCE threads are NOT the same as POSIX
     threads. You cannot use that header with POSIX threads.

/---------------------------[ Dave Butenhof ]--------------------------\

 Q133: When do you need to know which CPU a thread is on?  

[This is part of an ongoing series of unsolved problems where there
is a lot of "We don't quite know WHY this is happening, but..."  -Bil]

On Sun, 28 Dec 1997, Bil Lewis wrote:

> Jason,
>   That sounds like a very interesting project.  I'm curious about your decision
> to bind threads to CPUs.  You SAY you need to do it, but you don't give any
> proof.  Did you test your system without binding to CPUs?  What kind of results
> did you get when you did?

The threaded version of the system has not been constructed yet; however,
a non-threaded (i.e., forked) version has, and we have found significant
performance differences between allowing the processes to arbitrarily
migrate between processors and locking the processes to dedicated
processors.

So from that experience it stands to reason that locking threads to
processors would be preferable if we were to attempt to implement
a fully threaded version of the system.

>   I infer from what you say that this is a computationally intensive task.
> Which implies that the threads (or processes) would never migrate to different
> CPUs anyway.  DID they migrate?  I'd very much like to know your experience and
> the performance behavior.

Yes the graphics processes are computationally intensive. It is a standard
technique on multiprocessor SGI's to lock rendering processes to processors.
If they are not locked they will migrate.

The ability to lock threads to processors hasn't been fully implemented
by SGI yet. Currently since threads are bound to their processes, when
the process migrates the thread gets carried along with it.
I'm guessing that pthreads on the SGIs are being implemented on top of sproc
which is a superset of the capabilities of pthreads. Since sprocs
can be locked to processors I'm hoping soon that the SGI implementation
of pthreads will also inherit that capability.

> Actually in the work we do (Virtual Reality) we crucially need to know not
> only which processor a thread is running on, but to be able to explicitly
> assign a thread to the processor.

Now I don't see any of that.

You have a set of threads that you want to execute in parallel on an SMP. That's
fine. Lots of people have the same need for all sorts of reasons. That, however,
does NOT mean that you need to know on which processor each thread is running,
much less be able to specify on which processor each thread runs. It just means
you need to be sure that the O/S supports parallel computation.

What you're saying is that you don't trust the O/S scheduling at all, and insist
on controlling it yourself. There are cases where that's valid -- but that's
quite different from saying that your application inherently requires processor
identification or control. It doesn't. In nearly every case requiring
concurrency/parallelism, you'll be best off trusting the O/S to schedule the
processor resources. And if you find that it's not always trustworthy, tell the
developers, and help them fix it! You, and everyone else, will end up with a
better system.

/---------------------------[ Dave Butenhof ]--------------------------\
 Q134: Is there any difference between default and static mutex initialization?  

Robert White wrote:

> Venkat Ganti wrote:
> > I want to know  whether there is any difference between the following
> > two mutex initializations using pthreads
> >
> > 1.
> >
> > pthread_mutex_t  mp = PTHREAD_MUTEX_INITIALIZER;
> >
> > 2.
> >
> > pthread_mutex_t mp;
> > pthread_mutex_init (&mp, NULL);
> >
> > In this the allocated memory is zero.
> >
> Another way that these two may be different (in addition to the ones
> mentioned by Dave B. in his reply) is that the latter form can have
> different meaning as the program progresses because the default mutex
> behavior of a program can be changed with the set-attribute calls (I
> forget the exact call) when the attribute specified in the call is the
> NULL pointer.

You can't change the attribute values of the NULL attributes object. When
you initialize a mutex using NULL, you're asking for default attributes --
those MUST ALWAYS be the same attributes that will be used by a statically
initialized mutex. It doesn't (and can't) matter when the statically
initialized mutex is first used.

> If you use variant 2, you know that the semantics are those in-force at
> the time the statement is executed.  If you use variant 1, it will likely
> have the default semantics in force at the time the mutex is first used.

The only way this could be true is if an implementation provides some
non-portable and non-standard mechanism for modifying the default
attributes. You'd have a hard time convincing me that such an extension
could conform, since pthread_mutex_init specifically requires that the mutex
gain "default" attributes, and the standard requires that the default value
of any attributes (for which the standard doesn't specify a default) must be
specified in the conformance document.

> The manual, if I recall correctly, "strongly suggests" that variant 1
> be used only to initialize statically allocated mutexes.  I suspect
> that the above ambiguity is the reason.

Initializing a mutex on the stack is almost always bogus, and will usually
lead to far more trouble than you ever might have imagined. Doesn't matter
whether the mutex is statically initialized or dynamically initialized,
though, except (as always), a static initialization has no choice but to use
the default attributes.

You can't statically initialize a heap mutex, because the language doesn't
allow you to specify an initial value in that case.
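
A small sketch of the two forms side by side: a mutex with static storage
takes the initializer macro, while one living in heap memory has to go
through pthread_mutex_init. Passing NULL attributes asks for the same
defaults the macro gives. The struct counter type and the counter_* and
static_lock_works functions are hypothetical names for the example.

```c
#include <pthread.h>
#include <stdlib.h>

/* Static storage: the initializer macro supplies default attributes. */
static pthread_mutex_t static_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical heap object that carries its own mutex. */
struct counter {
    pthread_mutex_t lock;
    int value;
};

struct counter *counter_create(void)
{
    struct counter *c = malloc(sizeof *c);

    if (c == NULL)
        return NULL;
    /* Heap memory cannot take an initializer, so call pthread_mutex_init.
     * NULL attributes mean the same defaults as the static form. */
    if (pthread_mutex_init(&c->lock, NULL) != 0) {
        free(c);
        return NULL;
    }
    c->value = 0;
    return c;
}

int counter_increment(struct counter *c)
{
    int v;

    pthread_mutex_lock(&c->lock);
    v = ++c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}

void counter_destroy(struct counter *c)
{
    pthread_mutex_destroy(&c->lock);
    free(c);
}

/* The static mutex is usable right away; returns 1 on success. */
int static_lock_works(void)
{
    if (pthread_mutex_lock(&static_lock) != 0)
        return 0;
    return pthread_mutex_unlock(&static_lock) == 0;
}
```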

/---------------------------[ Dave Butenhof ]--------------------------\

 Q135: Is there a timer for Multithreaded Programs?  

From: (Richard Sullivan)
Subject: Re: Timing Multithreaded Programs (Solaris)

(Bradley J. Marker) wrote:

>I'm trying to time my multithreaded programs on Solaris with multiple 
>processors.  I want the real world running time as opposed to the total 
>execution time of the program because I want to measure speedup versus 
>sequential algorithms and how much faster the parallel program is for the user.


  Here is what I wrote to solve this problem (for Solaris anyway).  To
use it just call iobench_start() after any setup that you don't want
to measure.  When you are done measuring call iobench_end().  When you
want to see the statistics call iobench_report().  The output to
stderr will look like this:

Process info:
  elapsed time  249.995
  CPU time      164.446
  user time     152.095
  system time   12.3507
  trap time     0.661235
  wait time     68.6506
  pfs    major/minor    3379/     0
  blocks input/output      0/     0
65.8% CPU usage

>>>>>>>>>>>>>>>>>>>>> iobench.h

/*
 * Library Name: UTIL
 * Module Name:  iobench
 * Designer:    R. C. Sullivan
 * Programmer:  R. C. Sullivan
 * Date:        Sep 22, 1995
 * History Of Changes:
 *      Name         Date            Description
 *      ----         ----            -----------
 *      RCS     Jan 17, 1996     Initial release
 * Purpose:
 *   To report resource usage statistics that will be correct for
 * programs using threads on a Solaris system.
 * Notes:
 */

#include <sys/types.h>
#include <sys/procfs.h>    /* struct prusage, PIOCUSAGE */

extern struct prusage prusagebuf_start, prusagebuf_end;
extern int procfd;
extern double real_time, user_time, system_time, trap_time, wait_time;
extern unsigned long minor_pfs, major_pfs, input_blocks, output_blocks, iochars;

void iobench_start();
void iobench_end();
void iobench_report();

>>>>>>>>>>>>>>>>>>>>> iobench.c

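/*
 * Library Name: UTIL
 * Module Name:  iobench
 * Designer:    R. C. Sullivan
 * Programmer:  R. C. Sullivan
 * Date:        Sep 22, 1995
 * History Of Changes:
 *      Name         Date            Description
 *      ----         ----            -----------
 *      RCS     Jan 17, 1996     Initial release
 * Purpose:
 *   To report resource usage statistics that will be correct for
 * programs using threads on a Solaris system.
 * Notes:
 */

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>      /* NANOSEC */
#include <sys/procfs.h>    /* struct prusage, PIOCUSAGE */
#include "iobench.h"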

struct stat statbuf;
struct prusage prusagebuf_start, prusagebuf_end;
int procfd;
double real_time, total_real_time, user_time, system_time, trap_time, wait_time;
unsigned long minor_pfs, major_pfs, input_blocks, output_blocks, iochars;

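void iobench_start() {
  char pfile[80];

  /* open this process's /proc entry; getpid() is cast to match %ld */
  sprintf(pfile, "/proc/%ld", (long) getpid());
  procfd = open(pfile, O_RDONLY);

  /* snapshot of resource usage at the start of the measured region */
  ioctl(procfd, PIOCUSAGE, &prusagebuf_start);
}

void iobench_end() {
  /* snapshot at the end; the differences are computed below */
  ioctl(procfd, PIOCUSAGE, &prusagebuf_end);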

  real_time = (double) prusagebuf_start.pr_tstamp.tv_sec +
        (double) prusagebuf_start.pr_tstamp.tv_nsec / NANOSEC;
  real_time = (double) prusagebuf_end.pr_tstamp.tv_sec +
         (double) prusagebuf_end.pr_tstamp.tv_nsec / NANOSEC - real_time;

  total_real_time = (double) prusagebuf_start.pr_rtime.tv_sec +
         (double) prusagebuf_start.pr_rtime.tv_nsec / NANOSEC;
  total_real_time = (double) prusagebuf_end.pr_rtime.tv_sec +
         (double) prusagebuf_end.pr_rtime.tv_nsec / NANOSEC - total_real_time;

  user_time = (double) prusagebuf_start.pr_utime.tv_sec +
         (double) prusagebuf_start.pr_utime.tv_nsec / NANOSEC;
  user_time = (double) prusagebuf_end.pr_utime.tv_sec +
         (double) prusagebuf_end.pr_utime.tv_nsec / NANOSEC - user_time;

  system_time = (double) prusagebuf_start.pr_stime.tv_sec +
         (double) prusagebuf_start.pr_stime.tv_nsec / NANOSEC;
  system_time = (double) prusagebuf_end.pr_stime.tv_sec +
         (double) prusagebuf_end.pr_stime.tv_nsec / NANOSEC - system_time;

  trap_time = (double) prusagebuf_start.pr_ttime.tv_sec +
         (double) prusagebuf_start.pr_ttime.tv_nsec / NANOSEC;
  trap_time = (double) prusagebuf_end.pr_ttime.tv_sec +
         (double) prusagebuf_end.pr_ttime.tv_nsec / NANOSEC - trap_time;

  wait_time = (double) prusagebuf_start.pr_wtime.tv_sec +
         (double) prusagebuf_start.pr_wtime.tv_nsec / NANOSEC;
  wait_time = (double) prusagebuf_end.pr_wtime.tv_sec +
         (double) prusagebuf_end.pr_wtime.tv_nsec / NANOSEC - wait_time;

  minor_pfs = prusagebuf_end.pr_minf - prusagebuf_start.pr_minf;
  major_pfs = prusagebuf_end.pr_majf - prusagebuf_start.pr_majf;
  input_blocks = prusagebuf_end.pr_inblk - prusagebuf_start.pr_inblk;
  output_blocks = prusagebuf_end.pr_oublk - prusagebuf_start.pr_oublk;
/*  iochars = prusagebuf_end.pr_ioch - prusagebuf_start.pr_ioch;*/
}

void iobench_report() {
  fprintf(stderr, "Process info:\n");
  fprintf(stderr, "  elapsed time  %g\n", real_time);
/*  fprintf(stderr, "  total time    %g\n", total_real_time);*/
  fprintf(stderr, "  CPU time      %g\n", user_time + system_time);
  fprintf(stderr, "  user time     %g\n", user_time);
  fprintf(stderr, "  system time   %g\n", system_time);
  fprintf(stderr, "  trap time     %g\n", trap_time);
  fprintf(stderr, "  wait time     %g\n",  wait_time);
  fprintf(stderr, "  pfs    major/minor  %6lu/%6lu\n", major_pfs, minor_pfs);
  fprintf(stderr, "  blocks input/output %6lu/%6lu\n", input_blocks, output_blocks);
/*  fprintf(stderr, "  char inp/out  %lu\n", iochars);*/
  fprintf(stderr, "\n");

/*  fprintf(stderr, "%2.5g Mbytes/sec (real time)\n", iochars /
         real_time / 1e6);
  fprintf(stderr, "%2.5g Mbytes/sec (CPU time) \n", iochars /
         (user_time + system_time) / 1e6);*/

  fprintf(stderr, "%2.1f%% CPU usage\n", 100 * (user_time + system_time) /
         real_time + .05);
}

 Q136: Roll-your-own Semaphores   

[For systems that don't support the realtime extensions (where POSIX
semaphores are defined -- they're NOT in Pthreads).]

In article , 
Casper.Dik@Holland.Sun.Com says...
> (Bob Withers) writes:
> >Thanks much for this info.  Unfortunately I need the semaphores for 
> >inter-process mutual exclusion which makes sem_open important.  I'll just 
> >have to stick with SysV semaphores until we can move to 2.6.
> Well, you can mmap a semaphore in a file if you wish.

Well you sure can and, believe it or not, I actually thought of it before 
I read your post.  My code has not been thoroughly tested but I'm posting 
it here in the hopes that it will be of help to someone else.  Either 
that or I'm just a glutton for criticism.  :-)

Casper, thanks much for your help.



#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <unistd.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>

sem_t *sem_open(const char *name, int oflag, ...)
{
    auto     int                need_init = 0;
    auto     int                val = 1;
    auto     int                fd;
    auto     sem_t *            sem = (sem_t *) -1;

    /* -----------------2/11/98 2:12PM-------------------
     * open the memory mapped file backing the shared
     * semaphore to see if it exists.
     * --------------------------------------------------*/
    fd = open(name, O_RDWR);
    if (fd >= 0)
    {
        /* -----------------2/11/98 2:13PM-------------------
         * the semaphore already exists, if the caller
         * specified O_CREAT and O_EXCL we need to return
         * an error to advise them of this fact.
         * --------------------------------------------------*/
        if ((oflag & O_CREAT) && (oflag & O_EXCL))
        {
            close(fd);
            errno = EEXIST;
            return(sem);
        }
    }
    else
    {
        auto     int                sem_mode;
        auto     va_list            ap;

        /* -----------------2/11/98 2:14PM-------------------
         * if we get here the semaphore doesn't exist.  if
         * the caller did not request that it be created then
         * we need to return an error.  note that errno has
         * already been set appropriately by open().
         * --------------------------------------------------*/
        if (0 == (oflag & O_CREAT))
            return(sem);

        /* -----------------2/11/98 2:15PM-------------------
         * ok, we're going to create a new semaphore.  the
         * caller should've passed mode and initial value
         * arguments so we need to acquire that data.
         * --------------------------------------------------*/
        va_start(ap, oflag);
        sem_mode = va_arg(ap, int);
        val = va_arg(ap, int);
        va_end(ap);

        /* -----------------2/11/98 2:16PM-------------------
         * create the semaphore memory mapped file.  if this
         * call returns an EEXIST error it means that another
         * process/thread snuck in and created the semaphore
         * since we discovered it doesn't exist above.  we
         * don't handle this condition but rather return an
         * error.
         * --------------------------------------------------*/
        fd = open(name, O_RDWR | O_CREAT | O_EXCL, sem_mode);
        if (fd < 0)
            return(sem);

        /* -----------------2/11/98 2:18PM-------------------
         * set flag to remember that we need to init the
         * semaphore and set the memory mapped file size.
         * --------------------------------------------------*/
        need_init = 1;
        if (ftruncate(fd, sizeof(sem_t)))
        {
            close(fd);
            return(sem);
        }
    }

    /* -----------------2/11/98 2:19PM-------------------
     * map the semaphore file into shared memory.  the
     * mapping stays valid after the descriptor is closed.
     * mmap() reports failure as MAP_FAILED, not as 0.
     * --------------------------------------------------*/
    sem = (sem_t *) mmap(0, sizeof(sem_t), PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    close(fd);
    if (sem != (sem_t *) MAP_FAILED)
    {
        /* -----------------2/11/98 2:19PM-------------------
         * if the mapping worked and we need to init the
         * semaphore, do it now.
         * --------------------------------------------------*/
        if (need_init && sem_init(sem, 1, val))
        {
            munmap((caddr_t) sem, sizeof(sem_t));
            sem = (sem_t *) -1;
        }
    }
    return(sem);
}

int sem_close(sem_t *sem)
{
    return(munmap((caddr_t) sem, sizeof(sem_t)));
}

int sem_unlink(const char *name)
{
    /* body missing from the archived post; removing the backing
     * file is assumed here. */
    return(unlink(name));
}

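The underlying trick (a sem_t placed in a MAP_SHARED file mapping and
initialized with pshared set to 1) can be tried out in isolation with a
parent and child process. This is a sketch for systems that do provide
sem_init; shared_sem_demo and the /tmp file name template are hypothetical.

```c
#include <fcntl.h>
#include <semaphore.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Returns 0 if the parent could wait on a semaphore posted by a child
 * process, -1 on any failure. */
int shared_sem_demo(void)
{
    char path[] = "/tmp/semdemoXXXXXX";    /* hypothetical backing file */
    int fd = mkstemp(path);
    sem_t *sem;
    pid_t pid;
    int status;
    int rc = -1;

    if (fd < 0)
        return -1;
    unlink(path);               /* the file lives on until it is unmapped */
    if (ftruncate(fd, sizeof(sem_t)) != 0) {
        close(fd);
        return -1;
    }
    sem = mmap(NULL, sizeof(sem_t), PROT_READ | PROT_WRITE,
               MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid without the fd */
    if (sem == MAP_FAILED)
        return -1;

    if (sem_init(sem, 1, 0) == 0) {     /* pshared = 1, initial value 0 */
        pid = fork();
        if (pid == 0) {
            sem_post(sem);      /* child: release the parent */
            _exit(0);
        }
        if (pid > 0) {
            if (sem_wait(sem) == 0)     /* parent: blocks until the post */
                rc = 0;
            waitpid(pid, &status, 0);
        }
        sem_destroy(sem);
    }
    munmap(sem, sizeof(sem_t));
    return rc;
}
```

Unrelated processes would open and map the same named file instead of
inheriting the mapping across fork, which is what the sem_open replacement
above arranges.
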
 Q137: Solaris sockets don't like POSIX_C_SOURCE!  

A little known requirement in Solaris is that when you define _POSIX_C_SOURCE,
you must also define __EXTENSIONS__ when including sys/socket.h.  Hence,
your file should look like this:

#define _POSIX_C_SOURCE 199506L
#define __EXTENSIONS__
#include <sys/socket.h>

That's because _POSIX_C_SOURCE of 1995 vintage doesn't include
socket calls.

The feature macros are *exclusion* macros, not *inclusion* macros.

By default, you will get everything.

When you define something, you get *only* that something.
(Unless you also define __EXTENSIONS__)

This is slightly different in cases where the behaviour is
modified by the macro as in some socket calls.
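
A minimal sketch of a complete source file laid out this way. The #ifndef
guards and the helper open_stream_socket() are illustrative assumptions, not
part of the original advice; the guards only avoid a redefinition warning when
the build already sets the macros on the command line.

```c
/* Feature-test macros have to appear before the first #include. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199506L
#endif
#ifndef __EXTENSIONS__
#define __EXTENSIONS__      /* Solaris: re-enable the non-POSIX declarations */
#endif

#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * open_stream_socket() is a hypothetical helper, not a library call.
 * It returns a socket descriptor, or -1 with errno set by socket().
 */
int open_stream_socket(void)
{
    return socket(AF_UNIX, SOCK_STREAM, 0);
}
```

On systems other than Solaris the __EXTENSIONS__ definition should simply be
ignored, so the same file can be shared across platforms.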

From: (David Robinson)

The gratuitous use of non-POSIX conforming typedefs in headers is the
root cause. (They should use ushort_t not u_short and uint_t not u_int.)
Defining _POSIX_C_SOURCE says to use only strictly POSIX conforming
features, so typedef names can only end in _t.

Good news is that 90+% of the offending headers are fixed in 2.7.

% A question... should I use -mt or -D_POSIX_C_SOURCE=199506L to compile
% a pthread program on Solaris 2.6?  If I use the latter even the most simple
% socket program won't compile.  For example,

Well, these do different things. -mt sets up the necessary macro definitions
for multi-threading, and links with the appropriate libraries. _POSIX_C_SOURCE
tells the compiler that your application is supposed to strictly conform
to the POSIX standard, and that the use of any non-POSIX functions or types
that might be available on the system should not be allowed.

The advantage of this is that, when you move to another system which provides
POSIX support, you are assured of your program compiling. However, this
requires some work up-front on your part.

So the answer to your question is that you should use -mt to tell the
compiler your application is multi-threaded, and use _POSIX_C_SOURCE only
if your application is intended to conform strictly to POSIX.

HP's compiler is quite frustrating in this regard, since it assumes by
default that your application is K&R C. If you use the -Aa option to
tell it your application is ANSI C, it doesn't allow any functions which
aren't defined by ANSI. I always end up using -Ae to tell the compiler to
get stuffed and just compile my program with whatever's on the system, and
I port to HP last after a big change.

 Q138: The Thread ID changes for my thread!  

    I'm using IRIX 6.4 threads and MPI-SGI, but I'm having
strange problems. To analyse and debug my program I began
writing some very simple programs with similar behavior, and I
noticed a strange thing. Can anybody tell me whether I'm making a
mistake or whether this is a problem with IRIX 6.4 systems?

THE THREAD ID IS ALSO CHANGED. As you can imagine, I have a
lot of problems when I try to join the threads.

    - If I use only thread calls (with the MPI calls commented out),
the program works fine even if I link with the MPI library.
The program changes the main thread's priority and then
creates 10 threads with other priorities. The thread IDs
are sequential.

    - If I use threads and MPI calls (only MPI_Init,
Comm_size, Comm_rank and Finalize), the SAME program changes
the main thread's ID after the priority change.

    - Another thing: in the first case, in my run, the
thread IDs began at id=10000 and the others were sequential
after 10000. In the second case, the thread ID began at 10000
and after the priority change it became id=30000.

    Can ANYBODY explain this to me? TIA.



pthread_attr_t attr;
pthread_mutex_t mutex;
pthread_cond_t cond;
int xcond=0;
int size,rank;

void *Slaves(void *arg)
    int i,j;

    printf("Size: %d Rank: %d    Thread Id %x\n", size, rank,

int main (int argc, char **argv)
    int i,k=10,r;
    pthread_t *ThreadId;  
    struct sched_param params;
    int sched;

    printf("THREAD MAIN BEFORE MPI INIT %x\n", pthread_self());
/*
 *  These lines were commented out to see whether the MPI calls
 *  influence system behavior.
 */
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    printf("THREAD MAIN AFTER MPI INIT %x\n", pthread_self());


/*
 * If I call MPI, the main thread ID changes after these lines.
 * Right after MPI initialisation the main thread still has the
 * same ID that it had before; the problem starts at this point.
 */

    pthread_setschedparam(pthread_self(), SCHED_RR, &params);

    printf("THREAD MAIN AFTER  PRIO CHG %x\n", pthread_self());

    if (argc==2)

    ThreadId= (pthread_t *) malloc(k*sizeof(pthread_t));

    pthread_mutex_init(&mutex, NULL);
    pthread_cond_init(&cond, NULL);

    printf("\n Creating %d threads - Main thread is %x  \n", k,

    for(i=0; i != k; i++) {
        pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED); 
        pthread_attr_setschedparam(&attr, &params);
        r=pthread_create(&ThreadId[i], &attr, Slaves, NULL);
        if (r!=0) {
           printf("Error on thread creation! \n");


/* This was to force thread execution, but it is not necessary: */
    for(;;) sched_yield(); 

    for(i=0; i != k; i++) {
        r=pthread_join(ThreadId[i], NULL);
        if (r!=0) {
           printf("Error on joining threads...\n");
    printf(" Thread MAIN with id %x terminating...\n", pthread_self());

 Q139:  Does X11 support multithreading ?  
> > I am developing a multithreaded app under Solaris 2.5.1 (UltraSPARC),
> > using mixed Motif and X11, and i wonder if  someone can help me
> > answering some question:
> >
> > Does X11 support multithreading ?
>   Well, kinda.  But...


The X Consortium releases of R6.x can be built MT-safe.

You can tell if you have R6 when you compile by checking the
XlibSpecificationRelease or XtSpecificationRelease feature test macros.
If they are >5 then your implementation may support threaded
programming. Call XInitThreads and/or XtToolkitInitializeThreads to find
out if your system's Xlib and Toolkit Intrinsics (libXt) really do
support threaded programming.

> > Does Motif do the same ?
>   No.  It's not thread-safe.

Motif 2.1 IS MT-safe.

> > Can different threads open their own window and listen to their own
> > XEvents ? How could they do that ? [XNextEvent() can't specify a window
> > handle !].
>   You don't.  Your main loop listens for any event, and then decides what
> to do with it.  Perhaps it hands off a task to another thread.

You can. Each thread could open a separate Display connection and do
precisely what Daniele asks. Even without separate Display connections,
the first thread to call XNextEvent will lock the Display, and the
second thread's call to XNextEvent will block until the first thread
releases its lock. But you can't guarantee which thread will get a
particular event, so in the trivial case you can't be assured that one
thread will process events solely for one window.

>   Take a look at the FAQ for the threads newsgroup (on the page below).  That
> will help a bit.  You may also want to get "Multithreaded Programming with Pthreads"
> which has a section on exactly this, along with some example code.  (No one else
> talks about this, but I thought it important.)

I recommend reading the Xlib and Xt specifications, which are contained
in each and every X Consortium release -- available at, or you can get ready-to-print PostScript of just
the Xlib and Xt specs from
 Q140: Solaris 2 bizzare behavior with usleep() and poll()  
>Jeff Denham wrote:
>> Adam Twiss wrote:
>> > You really really don't want to use usleep() in a threaded environment.
>> > On some platforms it is thread safe, but on Solaris it isn't.  The
>> > effects are best described as "unpredictable", but I've seen a usleep()
>> > call segv because it was in a threaded program on Solaris.
>> >
>> > You want to use nanosleep() instead.
>> >
>> > Adam
>> I've found that poll(0, 0, msec-timeout-value)
>> works pretty well. Is there significant overhead calling
>> poll in this manner?
>It's not uncommon to use poll() or select() for sleeping. Works

I've seen an occasional bug in Solaris 2.6 where poll() will fail to
restore a pre-existing SIGALRM handler when it returns.  The sequence

    poll(0, 0, timeout);
    (program exits with "Alarm clock" error)

Looking at the truss output, poll() appears to be the only place
after the initial sigaction where the handler for SIGALRM is changed.
The failure is difficult to reliably reproduce, but I've seen it
happen about 10% of the time.  It only happens on 2.6; this same code
works fine on Solaris 2.5.1, AIX 3.2.5 and 4.2.1, HP-UX 9.04 and
10.10, and SCO OpenServer 5.0.4.

The same thing happens with usleep.  I haven't tried nanosleep.

The program in question is single-threaded.

I haven't had the chance to pursue this problem yet; there may be a
fix for it, or it may be some really subtle application bug.

Michael Wojcik            
AAI Development, Micro Focus Inc.
Department of English, Miami University

Q: What is the derivation and meaning of the name Erwin?
A: It is English from the Anglo-Saxon and means Tariff Act of 1909.
-- Columbus (Ohio) Citizen
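
The nanosleep() advice from this thread can be sketched as a small wrapper.
This is only an illustration under stated assumptions: sleep_ms() is a
hypothetical helper name, and restarting after EINTR is one possible policy,
not something the posters prescribed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * sleep_ms() is a hypothetical helper: block the calling thread for at
 * least `ms` milliseconds.  Returns 0 on success, -1 with errno set on error.
 */
int sleep_ms(long ms)
{
    struct timespec req, rem;

    if (ms < 0) {
        errno = EINVAL;
        return -1;
    }
    req.tv_sec  = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    /* When a signal interrupts nanosleep(), rem holds the unslept time,
     * so the call is restarted with the remainder. */
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;
    }
    return 0;
}
```

Unlike usleep(), nanosleep() is specified by POSIX.1b not to interact with
SIGALRM, which is why it tends to be the safer choice in threaded code.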
 Q141: Why is POSIX.1c different w.r.t. errno usage?  

Bryan O'Sullivan wrote:

> d> It's an issue because that implementation is "klunky" and, more
> d> precisely, inefficient.
> I must admit that optimising for uncommon error cases does not make
> much sense to me.

Sure. In my sentence, I would have to say that "klunky" was a more
important consideration than "inefficient".

However, use of errno is NOT strictly in "uncommon error cases". For
example, pthread_mutex_trylock returns EBUSY when the mutex is locked.
That's a normal informational status, not an "uncommon error". Similarly,
pthread_cond_timedwait returns ETIMEDOUT as a normal informational
status, not really an "error". There are plenty of "traditional" UNIX
functions that are similar. It's certainly not universal, but "uncommon"
is an overstatement.

> d> Still, why propagate the arcane traditions, just because they
> d> exist?
> Because they are traditions.  I think there is some non-trivial value
> in preserving interface consistency - principle of least surprise, and
> all that - and 1003.1c violates this for no particularly good reason.

Let's just say that the working group (a widely diverse and contentious
bunch) and the balloting group (an even larger and more diverse group)
were convinced that the reasons were "good enough". Arguing about it at
this point serves no purpose.

> d> Overloading return values with "-1 means look somewhere else for an
> d> error" is silly.
> Sure, it's silly, but it's the standard way for library calls on Unix
> systems to barf, and what 1003.1c does is force programmers to plant
> yet another gratuitous red flag in their brains, with wording similar
> to "hey!  everything else works in such-and-such a way, but *this* is
> *different*!".  I have enough red flags planted in my brain at this
> point that it resembles a pincushion, and I would gladly sacrifice a
> few to ugliness if that ugliness were at least consistent.

UNIX isn't even very consistent about that. Some return -1. Some have
symbolic definitions that "are usually" -1 but needn't be (at least in
terms of guaranteed portability and conformance). Some return NULL. Some
set errno and some don't, requiring that YOU set errno before making the
call if you care WHY it failed (e.g., sysconf).
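
To make the two conventions concrete, here is a minimal sketch. The helper
names try_acquire() and query_sysconf() are hypothetical; only the calls they
wrap are standard. The first shows the 1003.1c style (the status is the return
value), the second the older style where errno has to be cleared by the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <unistd.h>

/*
 * 1003.1c style: the status comes back as the return value.
 * Returns 0 if the lock was taken, EBUSY if it is already held,
 * or another error number.  errno is not consulted at all.
 */
int try_acquire(pthread_mutex_t *m)
{
    return pthread_mutex_trylock(m);
}

/*
 * Traditional style: sysconf() returns -1 both on error and for
 * "no definite limit", so errno must be set to 0 before the call.
 * Returns 0 and stores the value in *out, or returns the errno value.
 */
int query_sysconf(int name, long *out)
{
    long v;

    errno = 0;
    v = sysconf(name);
    if (v == -1 && errno != 0)
        return errno;
    *out = v;       /* may legitimately be -1: limit is indeterminate */
    return 0;
}
```

With the return-value style a caller can treat EBUSY as ordinary information
("someone else has it") without touching any per-thread errno storage.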

Hey, even if there was a "C" in "UNIX", it would most definitely NOT
stand for "consistency". Adding threads to UNIX disturbed a lot of
cherished traditions... far more than most people are willing to
acknowledge until they stumble over the shards of the old landscape.
There was good reason for this, though, and the net result is of
substantial benefit to everyone. While the changes to errno may be one of
the first differences people notice, "in the scheme of things", it's
trivial. If it even raises your awareness that "something's different",
maybe it'll save a few people from some bad mistakes, and to that extent
it's valuable even merely as a psychological tool.

Hey... count your mixed blessings, Bryan. I would have reported errors by
raising exceptions, if there'd been any hope at all of getting that into
POSIX. ;-)

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q142: printf() anywhere AFTER pthread_create() crashes on HPUX 10.x  

>I've spent the last couple of days trying to track down an
>annoying problem when running a multithreaded program on
>HPUX built against DCE Threads...
>If I call printf anywhere AFTER pthread_create had been executed to start
>a thread, when my application ends I get thrown out of my rlogin shell.

We experienced a similar problem about a year ago.  It was a csh bug,
and a patch from HP fixed it.

 Q143: Pthreads and Linux  

Wolfram Gloger wrote:

> You should always put `-lpthread' _after_ your source files:
> % gcc the_file.cpp -lpthread
> Antoni Gonzalez Ciria  writes:
> > When I compile this code with gcc ( gcc -lpthread the_file.cpp) the 
> > program executes fine, but doing so with g++( g++ -lpthread the_file.cpp)
> > the program crashes, giving a Segmentation fault error.
> >
> > #include <stdio.h>
> >
> > void main(){
> >     FILE * the_file;
> >     char sBuffer[32];
> >
> >     the_file=fopen("/tmp/dummy","rb");
> >     fread( sBuffer, 12, 1, the_file);
> >     fclose( the_file);
> >
> > }
> Using `g++' as the compiler driver always includes the `libg++'
> library implicitly in the link.  libg++ has a lot of problems, and is
> no longer maintained (in particular, it has a global constructor
> interfering with glibc2, if I remember correctly).  If you really need
> it, you must get a specially adapted version for Linux/egcs/gcc-2.8.
> If you don't need libg++, please use `c++' or `gcc' as your compiler
> driver, and use the libstdc++ library shipped with egcs or separately
> with gcc-2.8 (`c++' will link libstdc++ in implicitly).
> When I just tested your program with egcs-1.0.1 and glibc-2.0.6, it
> crashed at first (it fails to check the fopen() result for being
> NULL), but after creating a /tmp/dummy file it ran perfectly, even
> after compiling it with `c++ the_file.cpp -lpthread'.
> Regards,
> Wolfram.

  The -pthread option takes care of everything:
     adds  -D_REENTRANT  during the cpp pass, and
     adds  -lpthread during the link-edit.
  This option has been around for a while. I'm not sure
   it works on all ports, but it does at least for x86 with glibc.
   You may want to take a look at the spec file (gcc -v)
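
As Wolfram notes, the quoted test program crashes whenever /tmp/dummy is
missing because the fopen() result is never checked. A minimal sketch of the
checked version follows; read_prefix() is a hypothetical helper name, and a
build line such as `gcc the_file.c -pthread' is assumed.

```c
#include <stdio.h>

/*
 * read_prefix() is a hypothetical helper: read up to n bytes from the
 * start of `path` into buf.  Returns the number of bytes read, or -1
 * if the file cannot be opened.
 */
long read_prefix(const char *path, char *buf, size_t n)
{
    FILE *fp = fopen(path, "rb");
    size_t got;

    if (fp == NULL)
        return -1;              /* the quoted program dereferenced NULL here */

    got = fread(buf, 1, n, fp); /* item size 1, so a short file still reports its bytes */
    fclose(fp);
    return (long)got;
}
```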

 Q144: DEC release/patch numbering  

It was after 4.0B, and 4.0C is just 4.0B with new hardware support. (If you
install a "4.0C" kit on any hardware that doesn't need the new support, it
will even announce itself as 4.0B.) Although this is not true of all
components, DECthreads policy has been to keep 4.0 through 4.0C identical --
we have always submitted any patches to the 4.0 patch stream, and propagated
the changes through the other patch streams, releasing "functionally
identical" patches that are simply built in the appropriate stream's
environment. (But note that all future patches for the 4.0 - 4.0C stream
will be available only on 4.0A and later... 4.0 is no longer supported.)

The changes are in source in 4.0D and later.

/---------------------------[ Dave Butenhof ]--------------------------\
 Q145: Pthreads (almost) on AS/400  
Fred A. Kulack wrote:

> Hi All.
> You may or may not know, that new in the v4r2m0 release of OS/400 is support
> for kernel threads. Most notably the support for threads is available via
> native Java, but we've also implemented a Pthreads library that is 'based on'
> Posix and Unix98
> Pthreads.

Thank you very much for this update. I'm not likely to ever have occasion or
need to use this, but I like to keep track of who's implemented threads (and
how, and which threads). If nothing else, it provides me with more information
to answer the many questions I get on threading.

> The implementation claims no compliance because there are some differences
> and we haven't implemented all of the APIs. We do however duplicate the
> specification for the APIs that are provided, and we have quite a full set
> of APIs.

Yeah, I understand that one. Same deal on OpenVMS. Most of the APIs are there,
and do more or less what you'd expect -- but VMS isn't POSIX, and some parts
just don't fit. Congratulations on "doing your best". (At least, since you say
"we", I'm making the liberal assumption that you bear some personal
responsibility for this. ;-) )

> Anyone who's interested can take a look at

 Q146: Can pthreads & UI threads interoperate in one application? 

>Can Solaris pthreads and UI threads (pthread_xxxx() versus thr_xxx())
>interoperate in one application?  Is Solaris pthreads implemented
>as user level threads?  I've read a JNI book which says the thread
>model used in the native code must be interoperable with the thread
>model used by the JVM.  An example the book gives is that if the JVM uses
>user level threads (Java green threads) and the native code uses Solaris
>native threads, then they will have problems interoperating.  Does this
>apply to pthread and UI thread interoperability on Solaris, if pthreads
>is a kind of user level thread?
>Also, when people say Solaris native threads, do they mean UI threads
>(thr_xxx() calls) only, or does that also include Solaris pthreads?

Yes. They are built on the same underlying library. Indeed, many
of the libraries you use everyday are built using UI threads and
they get linked into Pthreads programs all the time.

"Implemented at user level" isn't quite the right way of describing
it. "Does the library use LWPs?" is the real question. Green threads
don't, so you can't make JNI calls to pthreads or UI threads. Native
threads do, and you can.

When folks say "Solaris native threads" they mean either pthreads or
UI threads, NOT green threads.

For a more detailed discussion, see my *excellent* book on Java
Threads: "Multithreaded Programming with Java".


 Q147: Thread create timings  

Matthew Houseman  writes:

Thought I'd throw this into the pyre. :)  I ran the thread/process create
stuff on a 166MHz Pentium (no pro, no mmx) under NT4 and Solaris x86 2.6:

NT spawn                240s    24.0  ms/spawn
Solaris spawn (fork)    123s    12.3  ms/spawn  (incl. exec)
Solaris spawn (vfork)    95s     9.5  ms/spawn  (incl. exec)

Solaris fork             47s     4.7  ms/fork
Solaris vfork                    0.37 ms/vfork  (37s/100000)

NT thread create         12s     1.2  ms/create
Solaris thread create            0.11 ms/create (11s/100000)

As you can see, I tried both fork() and vfork(). When doing an immediate
exec(), you'd normally use vfork(); when just forking, fork() is usually
what you want to use (or have to use).

Note that I had to turn the number of creates up to 100000 for vfork
and thread create to get better precision in the timings.

To remind you, here are greg's figures (on a Pentium MMX 200MHz):

>NT Spawner (spawnl):            120 Seconds (12.0 millisecond/spawn)
>Linux Spawner (fork+exec):       57 Seconds ( 6.0 millisecond/spawn)
>Linux Process Create (fork):     10 Seconds ( 1.0 millisecond/proc)
>NT Thread Create                  9 Seconds ( 0.9 millisecond/thread)
>Linux Thread Create               3 Seconds ( 0.3 millisecond/thread)

Just for fun, I tried the same thing on a 2 CPU 170MHz Ultrasparc.
I leave it to someone else to figure out how much of this is due to
the two CPUs... :)

Solaris spawn (fork)            84s     8.4  ms/spawn  (incl. exec)
Solaris spawn (vfork)           69s     6.9  ms/spawn  (incl. exec)

Solaris fork                    21s     2.1  ms/fork
Solaris vfork                           0.17 ms/vfork  (17s/100000)

Solaris thread create                   0.06 ms/create (6s/100000)
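
The benchmark source behind these numbers is not shown, so the following is
only a sketch of how such a figure could be measured with POSIX threads.
ms_per_thread_create() is a hypothetical name, and note that it times
create plus join, which is an assumption about the method rather than a
description of what the posters actually ran.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <time.h>

/* Thread body that does nothing, so only creation overhead is measured. */
static void *noop(void *arg)
{
    return arg;
}

/*
 * ms_per_thread_create() is a hypothetical helper: create and join
 * `count` threads one after another and return the average cost in
 * milliseconds per create+join pair, or -1.0 on failure.
 */
double ms_per_thread_create(int count)
{
    struct timespec t0, t1;
    pthread_t tid;
    int i;

    if (count <= 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &t0) != 0)
        return -1.0;

    for (i = 0; i < count; i++) {
        if (pthread_create(&tid, NULL, noop, NULL) != 0)
            return -1.0;
        if (pthread_join(tid, NULL) != 0)
            return -1.0;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &t1) != 0)
        return -1.0;

    return ((t1.tv_sec - t0.tv_sec) * 1e3 +
            (t1.tv_nsec - t0.tv_nsec) / 1e6) / count;
}
```

As in the post above, a large count (100000 was used there) gives better
precision when the per-create cost is a small fraction of a millisecond.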

 Q148: Timing Multithreaded Programs (Solaris)  

From: (Richard Sullivan)

>I'm trying to time my multithreaded programs on Solaris with multiple 
>processors.  I want the real world running time as opposed to the total 
>execution time of the program because I want to measure speedup versus 
>sequential algorithms and how much faster the parallel program is for the user.


  Here is what I wrote to solve this problem (for Solaris anyway).  To
use it just call iobench_start() after any setup that you don't want
to measure.  When you are done measuring call iobench_end().  When you
want to see the statistics call iobench_report().  The output to
stderr will look like this:

Process info:
  elapsed time  249.995
  CPU time      164.446
  user time     152.095
  system time   12.3507
  trap time     0.661235
  wait time     68.6506
  pfs    major/minor    3379/     0
  blocks input/output      0/     0
65.8% CPU usage

The iobench code is included in the program sources on: index.html.
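
For readers without the iobench sources, here is a rough portable sketch of
the same idea: record wall-clock time and process CPU time separately. It is
not the iobench code; the run_timer names are hypothetical, and it assumes a
system that provides clock_gettime() with CLOCK_PROCESS_CPUTIME_ID.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical timer record: start points for wall-clock and CPU time. */
struct run_timer {
    struct timespec wall0;
    struct timespec cpu0;
};

/* Seconds elapsed from *a to *b. */
static double ts_diff(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) + (b->tv_nsec - a->tv_nsec) / 1e9;
}

/* Call after any setup that should not be measured.  Returns 0 or -1. */
int run_timer_start(struct run_timer *t)
{
    if (clock_gettime(CLOCK_MONOTONIC, &t->wall0) != 0)
        return -1;
    return clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &t->cpu0);
}

/*
 * Stores elapsed wall-clock seconds in *wall and CPU seconds consumed by
 * all threads of the process in *cpu.  Returns 0 or -1.
 */
int run_timer_stop(const struct run_timer *t, double *wall, double *cpu)
{
    struct timespec w1, c1;

    if (clock_gettime(CLOCK_MONOTONIC, &w1) != 0 ||
        clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &c1) != 0)
        return -1;
    *wall = ts_diff(&t->wall0, &w1);
    *cpu  = ts_diff(&t->cpu0, &c1);
    return 0;
}
```

The "CPU usage" line in the report above corresponds to CPU time divided by
elapsed time (164.446 / 249.995 is about 65.8%). For speedup comparisons
against a sequential run, the wall-clock figure is the one to compare.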
 Q149: A program which monitors CPU usage?  

> >Ok, I've tried some web searches and haven't found anything I like the
> >look of.  What I'm after is a program which runs in the background and
> >monitors (primarily) CPU usage for our web server (an Ultra-1 running
> >Solaris 2.6).  However, all the programs I've found are about 2 years
> >old and/or don't run on 2.6.
> >
> >I've seen top, but it doesn't really do what I want; I'd like to have
> >the output from the program as a %cpu usage for each hour (or some
> >other arbitrary time period) stored as a log file or, ideally, as a
> >graph (in some graphics format, eg, .gif).
> Sounds like what sar does, and it comes with 2.6 - to enable recording
> data for it, just uncomment the lines in /etc/init.d/perf and the
> crontab for the 'sys' account.

From what I've read on the product, sounds like 'spong' might be what
you need. I've downloaded it, but haven't had time to install and set up
yet. Try:
 Q150: standard library functions: whats safe and whats not?  

From: (W. Richard Stevens)
Subject: Re: standard library functions: whats safe and whats not?
Date: 17 Feb 1998 14:19:28 GMT

> 1.  Which of the standard C library functions are thread-safe and
> which aren't?  For example, I know that strtok() is un-safe, I can
> infer that from its functionality, but what about the thousands of
> other library calls? I don't want to examine each one individually
> and make guesses about thread safety.
> Is there a list somewhere of what's safe and whats not?

Page 32 of the 1996 Posix.1 standard says "All functions defined by
Posix.1 and the C standard shall be thread-safe, except that the following
functions need not be thread-safe:


Note that thread-safe XXX_r() versions of the above are available,
other than those with an asterisk.  Also note that ctermid() and
tmpnam() are only thread-safe if a nonnull pointer is used as an
argument.

    Rich Stevens
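
As an illustration of the XXX_r() pattern, here is a small sketch using
strtok_r() in place of strtok(). The helper count_words() is hypothetical;
the point is only that the parsing state lives in a caller-supplied variable
instead of hidden static storage, so concurrent callers do not interfere.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/*
 * count_words() is a hypothetical helper: count whitespace-separated
 * tokens in `text`.  Like strtok(), strtok_r() writes NUL bytes into
 * the buffer, so the caller must pass a modifiable string.
 */
int count_words(char *text)
{
    char *save = NULL;      /* per-call parsing state, owned by this caller */
    char *tok;
    int n = 0;

    for (tok = strtok_r(text, " \t\n", &save);
         tok != NULL;
         tok = strtok_r(NULL, " \t\n", &save))
        n++;

    return n;
}
```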


POSIX and ANSI C specify only a small part of the "traditional UNIX
programming environment", though it's a start. The real danger in reading the
POSIX list quoted by Rich is that most people don't really know what's
included. While an inclusive list would be better than an exclusive list,
that'd be awfully long and awkward.

The Open Group (OSF and X/Open) has extended the Single UNIX Specification
(also known as "SPEC1170" for its 1,170 UNIX interfaces, or UNIX95) to
include POSIX.1b-1993 realtime, POSIX.1c-1995 threads, and various
extensions. It's called the Single UNIX Specification, Version 2; or UNIX98.
Within this calendar year, it's safe to assume that most UNIX versions
currently branded by The Open Group (as XPG3, UNIX93, UNIX95) will extend
their brand validation to UNIX98.

The interfaces specification part of the Single UNIX Specification, Version 2
(known as XSH5), in section 2.8.2, "Thread-safety", specifies that all
interfaces defined by THIS specification will be thread-safe, except for "the
following". There are two explicit lists, and one implicit. One is the POSIX
list already quoted by Rich Stevens. The second is an additional list of
X/Open interfaces:

basename      dbm_open    fcvt        getutxline    pututxline
catgets       dbm_store   gamma       getw          setgrent
dbm_clearerr  dirname     gcvt        l64a          setkey
dbm_close     drand48     getdate     lgamma        setpwent
dbm_delete    ecvt        getenv      lrand48       setutxent
dbm_error     encrypt     getgrent    mrand48       strerror
dbm_fetch     endgrent    getpwent    nl_langinfo
dbm_firstkey  endpwent    getutxent   ptsname
dbm_nextkey   endutxent   getutxid    putenv

The implicit list is a statement that all interfaces in the "Legacy" feature
group need not be thread-safe. From another section, that list is:

advance       gamma          putw        sbrk          wait3
brk           getdtablesize  re_comp     sigstack
chroot        getpagesize    re_exec     step
compile       getpass        regcmp      ttyslot
cuserid       getw           regex       valloc
loc1          __loc1         loc2        locs

Obviously, this is still an exclusive list rather than inclusive. But then,
if UNIX95 had 1,170 interfaces, and UNIX98 is bigger, an inclusive list would
be rather awkward. (And don't expect ME to type it into the newsgroup!)

On the other hand... beware that if you've got a system that doesn't claim
conformance to POSIX 1003.1c-1995 (or POSIX 1003.1-1996, which includes it),
then you're not guaranteed to be able to rely even on the POSIX list, much
less the X/Open list. It's reasonable to assume that any implementation's
libpthread (or equivalent, though that name has become pretty much de facto
standard) is thread-safe. And it's probably reasonable to assume, unless
specified otherwise, that "the most common" bits of libc are thread-safe. But
without a formal statement of POSIX conformance, you're just dealing with
"good will". And, even at that, POSIX conformance isn't validated -- so
without validation by the UNIX98 branding test suite, you've got no real
guarantee of anything.

/---------------------------[ Dave Butenhof ]--------------------------\
 Q151: Where are semaphores in POSIX threads?  

David McCann wrote:

> Jan Pechanec wrote:
> >
> > Hello,
> >
> >         I have a summary of POSIX papers on threads, but there is no
> > information about semaphores (just condition variables, mutexes). *NO*
> > pthread_semaphoreinit() etc.
> >
> >         In some materials, there is information on sem_wait(), sem_send() (or
> > something like that), but is it for threads (or just processes)?
> I think this whole discussion has digressed from Jan's original question
> above. Yes, there are sem_* calls in Solaris 2.5 (and 2.4 IIRC); you
> just need to link with -lposix4 or whatever to get them. But these are the
> *POSIX.1b* semaphores, which are *process-based* semaphores. They have
> nothing to do with threads.
> Now what Jan wants here is semaphore calls for *POSIX.1c*, i.e. POSIX
> threads. Now, IIRC, the sem_* calls are NOT specified in POSIX.1c, but
> rather their behaviour in MT programs has been clarified/refined in XPG5
> (Unix98) which allows you to use semaphores to synchronize threads and/or
> processes, depending on how you use them.

Not quite true. Yes, XSH5 (the "system interfaces" part of XPG5) says this; but it
does so because POSIX 1003.1-1996 says so, not because it's added something to

In fact, POSIX 1003.1b semaphores were designed by the same working group that did
1003.1c, and while 1003.1b-1993 was approved and published first (and therefore
couldn't mention threads), several aspects of 1003.1b were designed to work with
threads. For example, there are "realtime" extended versions of the 1003.1c
sigwait functions (sigtimedwait and sigwaitinfo). (The interfaces are slightly
incompatible because they return -1 and set errno on errors, rather than returning
an error code: that's because 1003.1c removed the use of errno AFTER 1003.1b was

Additionally, the sem_init function was designed with a parameter corresponding to
the "pshared" attribute of mutexes and condition variables. For 1003.1b, the only
supported value was 1, meaning "process shared". 1003.1c amended the sem_init
description to specify in addition that the value 0 meant "process private", for
use only between threads within the process. (But also note that it's perfectly
reasonable to create a "process shared" semaphore and then use it only between
threads within the process -- it may be less efficient on some implementations,
but it does the same thing.)
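
A minimal sketch of the "process private" case described above: one thread
posts a semaphore created with pshared set to 0 and another waits on it.
The function name semaphore_handoff() is hypothetical, and the sketch assumes
an implementation that supports unnamed POSIX semaphores.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Worker thread: signal the waiter by posting the semaphore. */
static void *poster(void *arg)
{
    sem_post((sem_t *)arg);
    return NULL;
}

/*
 * semaphore_handoff() is a hypothetical helper: wait for one post from
 * a second thread.  Returns 0 if the handoff succeeded, -1 otherwise.
 */
int semaphore_handoff(void)
{
    sem_t sem;
    pthread_t tid;
    int rc = -1;

    /* Second argument 0 = process private; third = initial count. */
    if (sem_init(&sem, 0, 0) != 0)
        return -1;

    if (pthread_create(&tid, NULL, poster, &sem) == 0) {
        /* sem_wait() follows the 1003.1b convention: -1 plus errno. */
        while ((rc = sem_wait(&sem)) == -1 && errno == EINTR)
            continue;
        pthread_join(tid, NULL);
    }

    sem_destroy(&sem);
    return rc == 0 ? 0 : -1;
}
```

Note the mix of conventions in one function: pthread_create() returns an
error number directly, while the sem_* calls return -1 and set errno.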

> Solaris 2.5 is not Unix98-conformant; the confusion arises because it
> *does* appear to be compliant with POSIX.1b and POSIX.1c (somebody at Sun can
> surely verify this). From what's been said here, I assume 2.6 is either Unix98-
> compliant, or at least contains the MT extensions to POSIX of Unix98.

Solaris 2.5 supports (most of) 1003.1b and 1003.1c, although there were a few
omissions and a few interpretation errors. (Like any other implementation.) This,
however, is not one of them. Solaris 2.5 does NOT define _POSIX_SEMAPHORES in
<unistd.h>, which is the way an implementation should advertise support for POSIX
semaphores. Therefore, while it may not implement all capabilities described by
1003.1b and 1003.1c, it doesn't (in this case, anyway) violate the standard. If
you're using POSIX semaphores (even if they seem to work) on Solaris 2.5, then
your application is not "strictly conforming", and if you're subject to any
incompatibilities or porting problems, that's your fault, not the fault of
Solaris. IT says they're not there.

(And, yes, POSIX specifically allows an implementation to provide interfaces while
claiming they're not there; and if it does so, it's not obligated to provide
strict conformance to the standard's description of those interfaces. This is what
Solaris 2.5 should have done, also, with the _POSIX_THREAD_PRIORITY_SCHEDULING
option, since it doesn't completely implement those interfaces.)

Presumably, Solaris 2.6 (though I don't have a system handy to check) DOES define

> At any rate, you can't use the sem_* calls for thread synchronization in
> 2.5; you get ENOSYS in MT programs. I know, I've tried it (on 2.5.1).
> AFAIK, single-threaded programs linked with -lposix4 work fine, but as I
> said above, they're only for process-based semaphores. So if you want to use the
> sem_* calls for thread-synchronization on Solaris, you have to go to 2.6.

First off, other replies have indicated that it's actually libthread, not
libposix4, that provides "working" (though not complete) POSIX semaphores. Most
likely, these semaphores would work with the "pshared" parameter set to either 0
(process) or 1 (cross-process). However, in any case, if you've got something that
can synchronize between processes, you should expect that it can synchronize
between threads as well, though there may be alternatives that are more efficient
on some implementations. (E.g., a pshared mutex will usually be more expensive to
lock or unlock than a private mutex.) (Such a difference in efficiency is less
likely for semaphores, since POSIX already requires that sem_post be async-signal
safe, which means it's far better and easier to keep the implementation inside the
kernel regardless of the pshared argument.)

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q152: Thread & sproc (on IRIX)  

In article <>,
Yann Boniface   wrote:
>I'm having trouble using threads and processes on a massively
>parallel machine (SGI).
>Process creation is OK (sproc (myFunction, PR_SADDR, arg)) as long
>as I don't use the pthread library. If I compile the program with the flag
>-lpthread, process creation doesn't work any more, even if I don't
>explicitly use thread functions (errno is then ENOTSUP)

You shouldn't mix pthreads and sprocs. You should stick with one or
the other (IMHO pthreads are preferable).
Planet Bog -- pools of toxic chemicals bubble under a choking
atomsphere of poisonous gases... but aside from that, it's not
much like Earth.
 Q153:  C++ Exceptions in Multi-threaded Solaris Process  

Jeff Gomsi  writes:

> We are running a large multi-threaded C++ (C++ 4.2 patch 
> 104631-03) application under Solaris (SunOS 5.5.1 
> Generic_103640-14) on a 14 processor Ultra-Enterprise and 
> observe the following problem.
> The application runs fine single-threaded, but when run
> multi-threaded, throwing a C++ exception can (evidently) 
> cause memory corruption which leads to a SIGSEGV core
> dump. A diagnostic version of the new operator seems to
> reveal that C++ is unwinding things improperly and possibly
> calling destructors which should not be called.
> Does anyone have any ideas on this?

The last time I looked at the patch list for the C++ 4.2, I noticed a
mention of a bug stating that exceptions were not thread safe.  There was
no further description of this bug that I could find.  However, it
supposedly is addressed by one of the later patches. Try upgrading your
patch to -04 or -05....

- Chris

Make sure you have the libC patch 101242-13.
 Q154:  SCHED_FIFO threads without root privileges?  

Laurent Deniel wrote:

> Hi,
>  Is there a way to create threads that have the SCHED_FIFO scheduling
>  without root privileges (in system contention scope) ? by for instance
>  changing a kernel parameter (Digital UNIX 4.0B & D or AIX 4.2) ?
>  Thanks in advance,

In Digital UNIX 4.0, using process contention scope (PCS) threads,
any thread can set FIFO policy; further, it can set any priority.
Why? Because the policies and priorities for PCS threads affect only
the threads in the containing process. PCS FIFO/63 threads are really
important in relation to other PCS threads in the process, but have no
influence on the scheduling of other threads in other processes.
That aspect is controlled by the policies and priorities of the kernel
scheduling entities (VPs -- virtual processors) underlying the PCS
threads, and those characteristics are unaffected by the POSIX
scheduling interfaces.

On V4.0D, newly released, system contention scope (SCS) threads
are supported. Their policies and priorities are by definition seen
by the kernel scheduler and are therefore subject to privilege
restrictions. In short, you can set SCS threads to FIFO or RR policy
without privilege on V4.0D, but FIFO threads cannot exceed POSIX
prio 18 and RR threads cannot exceed 19. Regardless of this
"limitation," it gives you plenty of rope to hang yourself with!

Jeff Denham
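
For readers who want to try this on their own system, here is a hedged sketch of requesting a system-contention-scope SCHED_FIFO thread and falling back to default attributes when the request is refused (typically with EPERM for an unprivileged caller). The function name start_fifo_or_default is hypothetical, and whether the FIFO request succeeds depends entirely on the platform and its privilege rules.

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>

static void *task(void *arg)
{
    *(int *)arg = 1;             /* record that the thread actually ran */
    return NULL;
}

/*
 * Try to run task() as an SCS SCHED_FIFO thread at the lowest FIFO priority.
 * Returns 1 if the FIFO thread was created, 0 if we fell back to default
 * attributes, and -1 if no thread could be created at all.
 */
int start_fifo_or_default(int *ran)
{
    pthread_attr_t attr;
    struct sched_param param;
    pthread_t tid;
    int used_fifo = 1;

    pthread_attr_init(&attr);
    /* Without EXPLICIT_SCHED the policy set below would be ignored. */
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    memset(&param, 0, sizeof param);
    param.sched_priority = sched_get_priority_min(SCHED_FIFO);
    pthread_attr_setschedparam(&attr, &param);

    if (pthread_create(&tid, &attr, task, ran) != 0) {
        used_fifo = 0;           /* refused: retry with default attributes */
        if (pthread_create(&tid, NULL, task, ran) != 0) {
            pthread_attr_destroy(&attr);
            return -1;
        }
    }
    pthread_attr_destroy(&attr);
    pthread_join(tid, NULL);
    return used_fifo;
}
```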
 Q155: "lock-free synchronization"  

> I recently came across a reference to "lock-free synchronization" (in
> Taligent's Guide to Designing Programs.)  This document referred to
> research that was looking at using primitive atomic operations to build more
> complex structures in ways that did not require locking.
> I'm interested in exploring this topic further and would be grateful if
> anyone could supply references.
> Regards,
> Daniel Parker

Check out the following references --

  M. Herlihy, "Wait-Free Synchronization," ACM Transactions on Programming
Languages and Systems, Vol. 13, No. 1, 1991, pp. 124-149.

  M. Herlihy, "A Methodology for Implementing Highly Concurrent Data
Objects," same journal as above, Vol. 15, No. 5, 1993, pp. 745-770.

They should provide a starting point.

 Q156: Changing single bytes without a mutex  

Tim Beckmann wrote:

> David Holmes wrote:
> >
> > I thought about this after posting. An architecture such as Bil describes
> > which requires non-atomic read/mask/write sequences to update variables of
> > a smaller size than the natural word size, would be a multi-threading
> > nightmare. As you note above two adjacent byte values would need a common
> > mutex to protect access to them and this applies even if they were each
> > used by only a single thread! On such a system I'd only want to program
> > with a thread-aware language/compiler/run-time.
> >
> > David
> David,
> My thoughts exactly!
> Does anyone know of a mainstream architecture that does this sort of
> thing?

Oh, absolutely. SPARC, MIPS, and Alpha, for starters. I'll bet most other RISC
systems do it, too, because it substantially simplifies the memory subsystem
logic. And, after all, the whole point of RISC is that simplicity means speed.

If you stick to int or long, you'll probably be safe. If you use anything
smaller, be sure they're not allocated next to each other unless they're under
the same lock.

I wrote a long post on most of the issues brought up in this thread, which
appears somewhere down the list due to the whims of news feeds, but I got
interrupted and forgot to address this issue.

If you've got

     pthread_mutex_t mutexA = PTHREAD_MUTEX_INITIALIZER;
     pthread_mutex_t mutexB = PTHREAD_MUTEX_INITIALIZER;

     char dataA;
     char dataB;

And one thread locks mutexA and writes dataA while another locks mutexB and
writes dataB, you risk word tearing, and incorrect results. That's a "platform
issue", that, as someone else commented, POSIX doesn't (and can't) address.

What do you do? I always advise that you keep a mutex and the data it protects
closely associated. As well as making the code easier to understand, it also
addresses problems like this. If the declarations were:

     typedef struct dataTag {
         pthread_mutex_t mutex;
         char data;
     } data_t;

     data_t dataA = {PTHREAD_MUTEX_INITIALIZER, 0};
     data_t dataB = {PTHREAD_MUTEX_INITIALIZER, 1};

You can now pretty much count on having the two data elements allocated in
separate "memory access chunks". Not an absolute guarantee, since a
pthread_mutex_t might be a char as well, and some C compilers might not align
structures on natural memory boundaries. But most compilers on machines that
care WILL align/pad structures to fit the natural data size, unless you
override it with a switch or pragma (which is generally a bad idea even when
it's possible). And, additionally, a pthread_mutex_t is unlikely to be less
than an int, and is likely at least a couple of longs. (On Digital UNIX, for
example, a pthread_mutex_t is 48 bytes, and on Solaris it's 24 bytes.)
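
One way to check the layout claim on a particular compiler is to measure it. The sketch below reuses the data_t declaration from above; the helper names data_separation and data_is_separated are made up for illustration, and using sizeof(int) as the threshold is an assumption based on the advice to stick to int or long.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct dataTag {
    pthread_mutex_t mutex;
    char data;
} data_t;

/* Distance in bytes between the data members of two adjacent data_t objects. */
size_t data_separation(void)
{
    static data_t pair[2];       /* only the layout matters; never locked */
    return (size_t)((char *)&pair[1].data - (char *)&pair[0].data);
}

/* Nonzero if adjacent data_t objects keep their data at least one int apart. */
int data_is_separated(void)
{
    return data_separation() >= sizeof(int);
}
```

A port could call data_is_separated() from a startup self-test, or turn the same comparison into a compile-time check if the compiler supports one.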

There are, of course, no absolute guarantees. If you want to be safe and
portable, you might do well to have a config header that typedefs
"smallest_safe_data_unit_t" to whatever's appropriate for the platform. Then
it's just a quick trip to the hardware reference manual when you start a port.
On a CISC, you can probably use "char". On most RISC systems, you should use
"int" or "long".

Yes, this is one more complication to the process of threading old code. But
then, it's nothing compared to figuring out which data is shared and which is
private, and then getting the locking protocols right.

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

> If I'm not mistaken, isn't that spelled:
>     #include <signal.h>
>     typedef sig_atomic_t smallest_safe_data_unit_t;

You are not mistaken, and thank you very much for pointing that out. While I'd
been aware at some point of the existence of that type, it was far from the top
of my mind.

If you have data that you intend to share without explicit synchronization, you
should be safe in using sig_atomic_t. Additionally, using sig_atomic_t will
protect you against word tearing in adjacent data protected by separate mutexes.

There are additional performance considerations, such as "false sharing" effects
in cache systems, that might dictate larger separations between two shared pieces
of data: but those won't affect program CORRECTNESS, and are therefore more a
matter of tuning for optimal performance on some particular platform.
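
A small sketch of the typedef in use, under the assumption that each field has exactly one writer: two adjacent fields of the suggested type are updated by different threads with no lock, and both final values survive. The struct, the thread functions, and adjacent_writes_survive are hypothetical names; the stored values stay below 128 because ANSI C only guarantees that range for sig_atomic_t.

```c
#include <pthread.h>
#include <signal.h>

/* Config-header typedef; the name comes from the discussion above. */
typedef sig_atomic_t smallest_safe_data_unit_t;

/* Two adjacent fields, each written by exactly one thread. */
static struct {
    smallest_safe_data_unit_t a;
    smallest_safe_data_unit_t b;
} flags;

static void *write_a(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        flags.a = i % 100;       /* final value is 99 */
    return NULL;
}

static void *write_b(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        flags.b = i % 50;        /* final value is 49 */
    return NULL;
}

/* Returns 1 if both fields hold their owner's final value after the run. */
int adjacent_writes_survive(void)
{
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, write_a, NULL) != 0)
        return 0;
    if (pthread_create(&tb, NULL, write_b, NULL) != 0) {
        pthread_join(ta, NULL);
        return 0;
    }
    pthread_join(ta, NULL);      /* join makes the final writes visible here */
    pthread_join(tb, NULL);
    return flags.a == 99 && flags.b == 49;
}
```

A passing run proves nothing about a platform that does tear words, of course; the point is only to show what "each unit has its own writer" looks like in code.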

 Q157: Mixing threaded/non-threadsafe shared libraries on Digital Unix  

claude vallee wrote:

> Hi All.  I have a question on building a multi-threaded process (on
> Digital Unix 4.0) which is linked with non thread safe shared libraries.
> Let's say:
> mymain.c has no calls to thread functions and none of its functions runs
> in a secondary thread.  I will compile this file with the -pthread
> option. (I call secondary thread any but the main thread)
> liba contains non thread safe functions, but I know for a fact that
> none of its functions will run in a secondary thread.  This library was
> not built using the -pthread option.
> is my multi-thread library.  It creates threads and its
> functions are thread safe or thread reentrant.  All of its code was
> compiled with the -pthread option.  All the code executing in a
> secondary thread is in this library.
> The questions are:
> 1. Will this work?  If liba was not built with threads options, is it
> all right if it runs only in the main thread?  Which c runtime library
> will be used at run time? libc or libc_r?

On Digital UNIX 4.0D, this should be OK. On earlier versions, you need to
be careful around use of errno. For various historical reasons I won't try
to explain (much less justify), setting of errno through libc (using the
"seterrno()" function or equivalent) would previously result in setting the
process global errno ("extern int errno;"), not just the per-thread errno
of the calling thread.

For 4.0D, I was able to change the code so that seterrno() always sets the
calling thread's errno, and also sets the global errno ONLY if called from
the initial ("main") thread of the process. With this change, it's safe (at
least as far as errno use) to run non-thread-aware libraries as long as you
use them only in the initial thread.

To make this clear, prior to 4.0D, your liba code running in the main
thread may see errno change at random. As long as liba doesn't read errno,
this shouldn't be a problem.

You do have to be aware of whether liba SETS the global errno -- because
your thread-safe code won't see the global errno through any normal

> 2. I noticed that on my DU 4.0, the libc and libc_r are
> identical!!  I assume this means that I am always using the thread safe
> version of the libc library.  Is that correct?

Yes -- libc_r was another historical oddity. (Due to OSF/1 rules.) It no
longer exists, as of Digital UNIX 4.0. The (hard) link provides binary
compatibility for older programs that still reference libc_r.

> 3. What does -pthread do to my code?  I saw that my objects are
> different (in size anyway), and that my executable point to libmach and
> libpthread.  What is added to the object code?

There are two basic contributions of "-pthread":

   * At compile-time, the definition -D_REENTRANT is provided
   * At link-time, the proper libraries are added, to the end of the actual
     list of libraries, but immediately before the implicit reference to
     libc that's generated by the compile driver. Additionally, -pthread
     causes the linker to search for "reentrant" versions of any library
     you specify. (E.g., if you say "-lfoo" and there's a libfoo_r in your
     -L path, the linker will automatically use it.)

The primary effect of -D_REENTRANT is to change <errno.h> -- references to
errno make a call into the thread library to get the thread's private errno
address rather than the global errno. There are some other changes to
various headers, though.
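
The per-thread errno can be observed directly by comparing the address that errno resolves to in two threads. This is only a sketch: errno_is_per_thread and report_errno_address are made-up names, and it assumes the code is built with whatever options the platform needs for thread-aware headers (the -pthread option discussed above, on Digital UNIX).

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

/* Store the address this thread's errno resolves to. */
static void *report_errno_address(void *arg)
{
    *(uintptr_t *)arg = (uintptr_t)&errno;
    return NULL;
}

/* Returns 1 if a second thread sees a different errno location than we do. */
int errno_is_per_thread(void)
{
    pthread_t tid;
    uintptr_t other = 0;

    if (pthread_create(&tid, NULL, report_errno_address, &other) != 0)
        return 0;
    pthread_join(tid, NULL);
    return other != 0 && other != (uintptr_t)&errno;
}
```

If this returns 0, the translation unit is still using the process-global errno, which is the situation the answer warns about for libraries built without thread options.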

> 4. Does defining _THREAD_SAFE when compiling and linking with
> libpthread, libmach and libc_r equivalent to building with the -pthread
> option?

No, _THREAD_SAFE doesn't do anything. It's considered obsolete. You should
use _REENTRANT. (Though I actually prefer the former term, I've never felt
it was worth arguing, or making anyone change a ton of header files.)

> I did some tests, and everything works well... for the moment, but IMHO,
> it does not mean anything.  Everyone knows that non thread safe code
> will work perfectly fine until your demo ;-)

Depends. If the demo is a critical requirement for a multi-million dollar
sale, then, yeah, it can probably hurt you worst by failing then.
Otherwise, though, it'll have a lot more fun by SUCCEEDING at the demo, and
failing when the customer runs the code in their mission-critical
environment. This is a corollary to a corollary to Murphy's Law, which
stated something about the inherent perversity of inanimate objects...

Oh... and since liba is, presumably, a third-party library over which
you've got no direct control... you should tell them immediately that
you're running their code in a threaded application, and it would be to
their long-term benefit to build a proper thread-safe version before you
find another option. If liba is YOUR code, then please don't play this
game: build it at least with -D_REENTRANT.

/---------------------------[ Dave Butenhof ]--------------------------\

 Q158: VOLATILE instead of mutexes?  

> What about exception handlers ? I've always thought that when you had
> code like:
>         int i;
>         TRY
>         {
>                 . . .
>                 proc();
>         }
>         CATCH_ALL
>         {
>                 if (i > 0)
>                 {
>                         . . .
>                 }
>                 . . .
>         }
> that you needed to declare "i" to be volatile lest the code in the
> catch block assume that "i" was stored in some register the contents
> of which were overwritten by the call to "proc" (and not restored by
> whatever mechanism was used to throw the exception).

Since neither ANSI C nor POSIX has any concept remotely resembling "exceptions", this
is all rather moot in the context of our general discussion, isn't it? I mean, it's
got nothing to do with sharing data between threads -- and that's what I thought we
were talking about. But sure, OK, let's digress.

Since there's no standard covering the behavior of anything that uses exceptions, (at
least, not if you use them from C, or even if you use the DCE exception syntax you've
chosen from C++), there's no portable behavior. Your fate is in the hands of the
whimsical (and hypothetically malicious ;-) ) implementation. This situation might
lead a cautious programmer to be unusually careful when treading in these waters, and
to wear boots with thick soles. (One might also say that it could depend on exactly
HOW you "foodle with i", but I'll choose to disregard an entire spectrum of mostly
amusing digressions down that fork.)

Should you use volatile in this case? Sure, why not? It might not be necessary on
many platforms. It might destroy your performance on any platform. And, where it is
necessary, it might not do what you want. But yeah, what the heck -- use it anyway.
It's more likely (by some small margin) to save you than kill you.

Or, even better... don't code cases like this!
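
The closest thing ANSI C itself offers is setjmp/longjmp, and there the rule is written down: an automatic variable modified between setjmp and longjmp has an indeterminate value afterwards unless it is declared volatile. The sketch below uses setjmp/longjmp as a stand-in for the TRY/CATCH_ALL macros in the question; that those macros behave the same way is an assumption, not something the original posts establish. The function names are hypothetical.

```c
#include <setjmp.h>

static jmp_buf handler;

static void proc(void)
{
    longjmp(handler, 1);         /* "throw" back to the setjmp point */
}

/* Returns the value of i as seen in the "catch" branch. */
int value_seen_in_handler(void)
{
    volatile int i = 0;          /* volatile: modified between setjmp and longjmp */

    if (setjmp(handler) == 0) {  /* plays the role of TRY */
        i = 5;
        proc();
        return -1;               /* not reached */
    }
    return i;                    /* plays the role of CATCH_ALL; i is 5 here */
}
```

Without the volatile qualifier the value read in the second branch is not guaranteed by the standard, which is the register problem the question describes.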

/---------------------------[ Dave Butenhof ]--------------------------\
 Q159: After pthread_cancel() destructors for local object do not get called?!  
> Hello,
> I've run into a trouble when I found out that when I cancel a thread via
> pthread_cancel() then destructors for local object do not get called.
> Surprising :). But how to deal with this? With a simple  thread code
> it would not be a big problem, but in my case it's fairly complex code,
> quite a few STL classes etc. Has someone dealt with such problem and is
> willing to share his/her solution with me ? I thought I could 'cancel'
> thread via pthread_kill() and raise an exception within a signal handler
> but it's probably NOT very good idea, is it?;)
> Thank you,
>         Ales Pour
> Linux, egcs-1.0.3, glibc-2.0.7 with LinuxThreads

  Unfortunately, not surprising.  C++ has not formally decided what to do with
thread cancellation, so it becomes compiler-specific.  The Sun compiler (for 
example) will run local object destructors upon pthread_exit() (hence 
cancellation also).  Others may not.

  I suppose the best GENERAL C++ solution is:

    a) Don't use stack-allocated objects.
    b) Don't use cancellation.

  Otherwise you can simply insist on a C++ compiler that runs the destructors.
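
What POSIX does guarantee, independent of any C++ compiler, is that cleanup handlers registered with pthread_cleanup_push run when the thread is cancelled. A hedged sketch in plain C follows; release_buffer, worker, and cancel_and_check_cleanup are illustrative names, and the heap buffer stands in for whatever a destructor would otherwise have released.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

static int cleanup_ran;

/* Cleanup handler: runs when the thread is cancelled inside the push/pop pair. */
static void release_buffer(void *buf)
{
    free(buf);
    cleanup_ran = 1;
}

static void *worker(void *arg)
{
    char *buf = malloc(64);      /* resource that must not be stranded */

    (void)arg;
    pthread_cleanup_push(release_buffer, buf);
    for (;;)
        sleep(1);                /* sleep() is a cancellation point */
    pthread_cleanup_pop(1);      /* never reached; must pair with the push */
    return NULL;
}

/* Returns 1 if the handler ran and the thread reported cancellation. */
int cancel_and_check_cleanup(void)
{
    pthread_t tid;
    void *result = NULL;

    cleanup_ran = 0;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return 0;
    pthread_cancel(tid);         /* deferred: acted on at the next cancellation point */
    pthread_join(tid, &result);
    return cleanup_ran && result == PTHREAD_CANCELED;
}
```

In C++ code this means holding resources through pointers that a cleanup handler can release, which is roughly what option (a) above amounts to.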

 Q160: No pthread_exit() in Java.  

 >    In POSIX, we have pthread_exit() to exit a thread.  In Java we
 >  *had* Thread.stop(), but now that's gone.  Q: What's the best way
 >  to accomplish this?
 >    I can (a) arrange for all the functions on the call stack to
 >  return, all the way up to the top, finally returning from the
 >  top-level function.  I can (b) throw some special exception I
 >  build for the purpose, TimeForThreadToExitException, up to the
 >  top-level function.  I can throw ThreadDeath.
 >    But what I really want is thread.exit().
 >    Thoughts?
 >  -Bil
 > -- 
 > ================
 > Bil
 > Lambda Computer Science 
 > 555 Bryant St. #194 
 > Palo Alto, CA,
 > 94301 
 > Phone/FAX: (650) 328-8952

Here's a real quick reply (from a slow connection from
Sydney AU (yes, visiting David among other things)). I'll
send something more thorough later....

Throwing ThreadDeath yourself is a pretty good way to force current
thread to exit if you are sure it is in a state where it makes sense
to do this.

But if you mean, how to stop other threads: This is one reason why
they are extremely unlikely to actually remove Thread.stop(). The next
best thing to do is to take some action that is guaranteed to cause
the thread to hit a runtime exception. Possibilities range from the
well-reasoned -- write a special SecurityManager that denies all
resource-checked actions, to the sleazy -- like nulling out a pointer
or closing a stream that you know thread needs. See
for a discussion of some other alternatives.


ThreadDeath is an Error (not a checked Exception, since app's routinely
catch all checked Exceptions) which has just the semantics you are talking
about: it is a Throwable that means "this thread should die".  If
you catch it (because you have cleanup to do), you are SUPPOSED to
rethrow it.  1.2 only, though, I think.  Thread.stop() uses it, but
although stop() is deprecated, it appears that ThreadDeath is not.

I think.  :^)


There is *nothing* special about a ThreadDeath object. It does not mean
"this thread should die" but rather it indicates that "this thread has
been asked to die". The only reason it "should" be rethrown is that if
you don't then the thread doesn't actually terminate. This has always
been documented as such and is not specific to 1.2.

If a thread decides that for some reason it cannot continue with its work
then it can simply throw new ThreadDeath() rather than calling stop()
on itself. The only difference is that with stop() the Thread is
immediately marked as no longer alive - which is a bug in itself.

 Q161: Is there any way I can make my stacks red zone protected?  

Allocate your stack segments using mmap.  Use mprotect to make the
page after the bottom of your stack read-only (I'm assuming the stack
grows down on whatever system you're using), or leave a hole in your
address space.  If you get a segfault due to an attempted write at the
top of a red zone, map in some more stack and build a new red zone.
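
A minimal sketch of the first half of that recipe (allocating the stack and protecting the page below it), assuming a downward-growing stack and a system that provides MAP_ANONYMOUS and pthread_attr_setstack. The 256 KB size and the names run_with_red_zone and on_guarded_stack are arbitrary choices for illustration; growing the stack from a SIGSEGV handler is not shown.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <pthread.h>
#include <sys/mman.h>
#include <unistd.h>

#define STACK_BYTES (256 * 1024) /* usable stack size; an arbitrary choice */

static void *on_guarded_stack(void *arg)
{
    *(int *)arg = 1;             /* record that the thread ran on this stack */
    return NULL;
}

/* Runs a thread on an mmap'ed stack with a read-only page below it.
   Returns 1 if the thread ran, 0 on any failure. */
int run_with_red_zone(void)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t total = STACK_BYTES + (size_t)page;
    pthread_attr_t attr;
    pthread_t tid;
    int ran = 0, ok = 0;

    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return 0;

    /* The stack grows down, so the red zone is the lowest page. */
    if (mprotect(base, (size_t)page, PROT_READ) == 0 &&
        pthread_attr_init(&attr) == 0) {
        if (pthread_attr_setstack(&attr, base + page, STACK_BYTES) == 0 &&
            pthread_create(&tid, &attr, on_guarded_stack, &ran) == 0) {
            pthread_join(tid, NULL);
            ok = ran;
        }
        pthread_attr_destroy(&attr);
    }
    munmap(base, total);         /* safe: the thread has been joined */
    return ok;
}
```

Using PROT_NONE instead of PROT_READ would also trap stray reads into the red zone; the text only asks for read-only.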

 Q162: Cache Architectures, Word Tearing, and VOLATILE

Tim Beckmann wrote:

> Dave Butenhof wrote:
> > > David,
> > >
> > > My thoughts exactly!
> > >
> > > Does anyone know of a mainstream architecture that does this sort of
> > > thing?
> >
> > Oh, absolutely. SPARC, MIPS, and Alpha, for starters. I'll bet most other RISC
> > systems do it, too, because it substantially simplifies the memory subsystem
> > logic. And, after all, the whole point of RISC is that simplicity means speed.
> MIPS I know :)  The latest MIPS processors R10K and R5K are byte addressable.
> The whole point of RISC is simplicity of hardware, but if it makes the software
> more complex it isn't worth it :)

The whole idea of RISC is *exactly* to make software more complex. That is,
by simplifying the hardware, hardware designers can produce more stable
designs that can be produced more quickly and with more advanced technology
to result in faster hardware. The cost of this is more complicated
software. Most of the complexity is hidden by the compiler -- but you can't
necessarily hide everything. Remember that POSIX took advantage of some
loopholes in the ANSI C specification around external calls to proclaim that
you can do threaded programming in C without requiring expensive and awkward
hacks like "volatile". Still, the interpretation of ANSI C semantics is
stretched to the limit. The situation would be far better if a future
version of ANSI C (and C++) *did* explicitly recognize the requirements of
threaded programming.

> > If you stick to int or long, you'll probably be safe. If you use anything
> > smaller, be sure they're not allocated next to each other unless they're under
> > the same lock.
> Actually, you can be pretty sure that a compiler will split two declarations
> like:
>         char dataA;
>         char dataB;
> to be in two separate natural machine words.  It is much faster and easier for
> those RISC processors to digest.  However if you declare something as:

While that's certainly possible, that's just a compiler optimization
strategy. You shouldn't rely on it unless you know FOR SURE that YOUR
compiler does this.

>         char data[2]; /* or more than 2 */
> you have to be VERY concerned with the effects of word tearing since the
> compiler will certainly pack them into a single word.

Yes, this packing is required. You've declared an array of "char" sized
data, so each array element had better be allocated exactly 1 char.

> > I wrote a long post on most of the issues brought up in this thread, which
> > appears somewhere down the list due to the whims of news feeds, but I got
> > interrupted and forgot to address this issue.
> Yep, I saw it.  It was helpful.  So was the later post by someone else who
> included a link to a DEC alpha document that explained what a memory barrier
> was in this context.  I've seen three different definitions over the years.
> The definition you described in your previous post agreed with the DEC alpha
> description... That a memory barrier basically doesn't allow out of order
> memory accesses to cross the barrier.  A very important issue if you are
> implementing mutexes or semaphores :)[...]
> However, I really believe that dataA and dataB should both be declared as
> "volatile" to prevent the compiler from being too aggressive on its
> optimization.  The mutex still doesn't guarantee that the compiler hasn't
> cached the data in an internal register across a function call.  My memory
> isn't perfect, but I do think this bit me on IRIX.

The existence of the mutex doesn't require this, but the semantics of POSIX
and ANSI C do require it. Remember that you lock a mutex by calling a
function, passing an address. While an extraordinarily aggressive C compiler
with a global analyzer might be able to determine reliably that there's no
way that call could access the data you're trying to protect, such a
compiler is unlikely -- and, if it existed, it would simply violate POSIX
1003.1-1996, failing to support threads.

You do NOT need volatile for threaded programming. You do need it when you
share data between "main code" and signal handlers, or when sharing hardware
registers with a device. In certain restricted situations, it MIGHT help
when sharing unsynchronized data between threads (but don't count on it --
the semantics of "volatile" are too fuzzy). If you need volatile to share
data, protected by POSIX synchronization objects, between threads, then your
implementation is busted.
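
To make the point concrete, here is a small sketch in which two threads update a plain, non-volatile counter under a mutex. On a conforming POSIX implementation the total is exact. The names and the iteration count are illustrative only.

```c
#include <pthread.h>

#define ITERATIONS 100000

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;             /* plain long, deliberately not volatile */

static void *add_many(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&lock);
        counter++;               /* only touched while holding the lock */
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Returns the final count, or -1 if a thread could not be created. */
long count_with_two_threads(void)
{
    pthread_t a, b;

    counter = 0;
    if (pthread_create(&a, NULL, add_many, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, add_many, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```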

> > There are, of course, no absolute guarantees. If you want to be safe and
> > portable, you might do well to have a config header that typedefs
> > "smallest_safe_data_unit_t" to whatever's appropriate for the platform. Then
> > it's just a quick trip to the hardware reference manual when you start a port.
> > On a CISC, you can probably use "char". On most RISC systems, you should use
> > "int" or "long".
> There never are guarantees are there :)

To reiterate again one more time, ( ;-) ), the correct (ANSI C) portable
type for atomic access is sig_atomic_t.

> > Yes, this is one more complication to the process of threading old code. But
> > then, it's nothing compared to figuring out which data is shared and which is
> > private, and then getting the locking protocols right.
> But what fun would it be if it wasn't a challenge :)

Well, yeah. That's my definition of "fun". But not everyone's. Sometimes
"boring and predictable" can be quite comforting.

> However, I would like to revisit the original topic of whether it is "safe" to
> change a single byte without a mutex.  Although, instead of "byte" I'd like to
> say "natural machine word" to eliminate the word tearing and non-atomic memory
> access concerns.  I'm not sure it's safe to go back to the original topic, but
> what the heck ;)


> If you stick to a "natural machine word" that is declared as "volatile",
> you do not absolutely need a mutex (in fact I've done it).  Of course, there are
> only certain cases where this works and shouldn't be done unless you really know
> your hardware architecture and what you're doing!  If you have a machine with a
> lot of processors, unnecessarily locking mutexes can really kill parallelism.
> I'll give one example where this might be used:
> volatile int stop_flag = 0;  /* assuming an int is atomic */
> thread_1
> {
>         /* bunch of code */
>         if some condition exists such that we wish to stop thread_2
>                 stop_flag = 1;
>         /* more code - or not :) */
> }
> thread_2
> {
>         while(1)
>         {
>                 /* check if thread should stop */
>                 if (stop_flag)
>                         break;
>                 /* do whatever is going on in this loop */
>         }
> }
> Of course, this assumes the hardware has some sort of cache coherency
> mechanism.  But I don't believe POSIX mutex's or memory barriers (as
> defined for the DEC alpha) have any impact on cache coherency.

If a machine has a cache, and has no mechanism for cache coherency, then it
can't work as a multiprocessor.

> The example is simplistic, but it should work on a vast majority of
> systems.  In fact the stop_flag could just as easily be a counter
> of some sort as long as only one thread is modifying the counter...

In some cases, yes, you can do this. But, especially with your "stop_flag",
remember that, if you fail to use a mutex (or other POSIX-guaranteed memory
coherence operation), a thread seeing stop_flag set CANNOT assume anything
about other program state. Nor can you ensure that any thread will see the
changed value of stop_flag in any particular bounded time -- because you've
done nothing to ensure memory ordering, or coherency.
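
For comparison, this is roughly what the same stop_flag idea looks like with the flag read and written under a mutex, so that the POSIX memory rules apply. It is a sketch rather than a recommendation for any particular program; should_stop, stop_worker, and work_done are hypothetical names, and work_done merely stands in for the loop body.

```c
#include <pthread.h>

static pthread_mutex_t stop_lock = PTHREAD_MUTEX_INITIALIZER;
static int stop_flag;            /* protected by stop_lock */
static long work_done;           /* written only by thread_2 */

/* Read the flag under the mutex so the read is properly synchronized. */
static int should_stop(void)
{
    int stop;

    pthread_mutex_lock(&stop_lock);
    stop = stop_flag;
    pthread_mutex_unlock(&stop_lock);
    return stop;
}

static void *thread_2(void *arg)
{
    (void)arg;
    while (!should_stop())
        work_done++;             /* stand-in for the real loop body */
    return NULL;
}

/* Starts thread_2, asks it to stop, and returns 1 once it has exited. */
int stop_worker(void)
{
    pthread_t tid;

    pthread_mutex_lock(&stop_lock);
    stop_flag = 0;
    pthread_mutex_unlock(&stop_lock);

    if (pthread_create(&tid, NULL, thread_2, NULL) != 0)
        return 0;

    pthread_mutex_lock(&stop_lock);
    stop_flag = 1;               /* thread_2 sees this on a later check */
    pthread_mutex_unlock(&stop_lock);

    pthread_join(tid, NULL);
    return 1;
}
```

The cost is one uncontended lock and unlock per loop iteration, which is the overhead the quoted post was trying to avoid.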

And remember very carefully that bit about "as long as only one thread is
modifying". You cannot assume that "volatile" will ever help you if two
threads might modify the counter at the same time. On a RISC machine,
"modify" still means load, modify, and store, and that's not atomic. You
need special instructions to protect atomicity across that sequence (e.g.,
load-lock/store-conditional, or compare-and-swap).
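
As an illustration of the compare-and-swap style of update, the sketch below uses the C11 <stdatomic.h> interface. That interface did not exist when this was written, so treat it as a modern stand-in for the machine-specific instructions mentioned above; the names cas_increment, bump, and count_with_cas are made up.

```c
#include <pthread.h>
#include <stdatomic.h>

#define INCREMENTS 100000

static atomic_long hits;

/* Increment *p with an explicit compare-and-swap retry loop. */
static void cas_increment(atomic_long *p)
{
    long old = atomic_load(p);

    /* On failure the current value is copied into old, so just retry. */
    while (!atomic_compare_exchange_weak(p, &old, old + 1))
        ;
}

static void *bump(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        cas_increment(&hits);
    return NULL;
}

/* Returns the final count, or -1 if a thread could not be created. */
long count_with_cas(void)
{
    pthread_t a, b;

    atomic_store(&hits, 0);
    if (pthread_create(&a, NULL, bump, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, bump, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&hits);
}
```

Replacing cas_increment with a plain load, add, and store on a volatile long would lose updates whenever the two threads interleave, which is the failure described above.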

Am I trying to scare you? Yeah, sure, why not? If you really feel the need
to do something like this, do yourself (and your project) the courtesy of
being EXTREMELY frightened about it. Document it in extreme and deadly
detail, and write that documentation as if you were competing with Stephen
King for "best horror story of the year". I mean to the point that if
someone takes over the project from you, and doesn't COMPLETELY understand
the implications, they'll be so terrified of the risk that they'll rip out
your optimizations and use real synchronization. Because this is just too
dangerous to use without full understanding.

There are ways to ensure memory ordering and coherency without using any POSIX
synchronization mechanisms, on any machine that's capable of supporting POSIX
semantics. It's just that you need to be really, really careful, and you need to be
aware that you're writing extremely machine-specific (and therefore inherently
non-portable) code. Some of this is "more portable" than others, but even the
"fairly portable" variants (like your stop_flag) are subject to a wide range of
risks. You need to be aware of them, and willing to accept them. Those who aren't
willing to accept those risks, or don't feel inclined to study and fully understand
the implications of each new platform to which they might wish to port, should
stick with mutexes.

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q163: Can ps display thread names?

>  Is there a way to display the name of a thread (specified thanks
>  to the function pthread_setname_np) in commands such as ps?
>  (in order to quickly see the behavior of a well-known thread).
>  If it is not possible with Digital UNIX's ps, someone may
>  have hacked some interesting similar utilities that display
>  such thread informations ?

The ps command is a utility to show system information, and this would be
getting into an entirely different level of process information. It would,
arguably, be "inappropriate" to do this in ps. In any case, the decision
was made long ago to not do as you suggest.

The easiest way to get this information is to attach to the process with
ladebug (or another suitable user-level-thread-enabled debugger) and ask
for the information. (E.g., ladebug's "show thread" command.)

While one certainly could create a standalone utility, you'd need to find
the binary associated with the process, look up symbols, use /proc (or
equivalent) to access process memory, and so forth -- sounds a lot like a
debugger, doesn't it?

The mechanism used to access this information is in As
of 4.0D, the associated header file,  is available on the
standard OS kit (with the other development header files). Although it's
not externally documented, it's heavily commented, and reasonably

 Q164: (Not!) Blocking on select() in user-space pthreads.

Subject: Re: Blocking on select() in user-space pthreads under HP/UX 10.20 

David Woodhouse wrote:

> HP/UX 10.20 pthreads are implemented in user space as opposed to
> kernel. I've heard rumors that a user-space thread that blocks on
> select() actually blocks all other threads within that process (ie the
> entire process). True or false?

The answer is an absolute, definite, unambiguous... maybe.

Or, to put it another way... the answer is true AND false.

However, being in a generous (though slightly offbeat) mood today, I'll go a
little further and explain the answer. (Remember how the mice built Deep
Thought to find the answer to "Life, the Universe, and Everything", and it
came back with the answer "42", so they had to go off and build an entirely
different computer to find the question, which was "what is 9 times 6",
resulting in a third attempt, this time to find out what the question and
answer MEANT?)

Anyway, any blocking kernel call, including select, will indeed block the
process. However, if built correctly, a DCE thread (that's the origin of the
thread library on 10.20) application will never directly call select.
Instead, its calls will be redirected to the user-mode thread library's
"thread-synchronous I/O" package. This package will attempt a NON-blocking
select, and, if it would have needed to block (none of the files are
currently ready), the thread will be blocked on a condition variable "until
further notice". At various points, the library polls (with select) and
awakens any thread waiting for a file that's now ready. When all application
threads are blocked somewhere, the thread library blocks the process in
select, with a mask that OR-s all the masks for which any thread is waiting,
and with a timeout representing the next user-mode timer
(pthread_cond_timedwait, select, whatever).
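
The "NON-blocking select" step can be sketched with a zero timeout. This is not the DCE thread library's code, just an illustration of the polling primitive such a wrapper could be built on; readable_now is a hypothetical name.

```c
#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd is readable right now, 0 if a read would block, -1 on error. */
int readable_now(int fd)
{
    fd_set readfds;
    struct timeval no_wait = {0, 0};   /* zero timeout: poll, never block */
    int n;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    n = select(fd + 1, &readfds, NULL, NULL, &no_wait);
    if (n < 0)
        return -1;
    return n > 0 && FD_ISSET(fd, &readfds);
}
```

A user-mode library would call something like this on behalf of the thread and, on a 0 result, park the thread on a condition variable instead of letting the whole process block in the kernel.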

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q165: Getting functional tests for UNIX98 

> Dave Butenhof wrote somewhere that there were
> a set of functional tests for UNIX98, that could
> also work with POSIX. Any idea where I could find
> it?

The place to look is The Open Group. Start with
(Unfortunately I don't have a bookmark handy for the test suite, and I
can't get to right now; so you're on your own from here. ;-))

 Q166: To make gdb work with linuxthreads?  

> Is there any ongoing work or plans to make gdb work with linuxthreads?
>- Erik

Yes, there's a project at the University of Kansas called SmartGDB that is 
working on support for user-level and kernel-level threads.  There is 
already support for several user level thread packages and work is 
currently being done on the linuxthreads support.  The URL is:

We have most of the kernel modifications done required to support it and 
are working on the rest of the changes to gdb.  At this point, I can't 
even guess on a release date, but you can check the web page for more 
information on what's been done so far.  The email contact is

 Q167: Using cancellation is *very* difficult to do right...  

Bil Lewis wrote:

> Dave Butenhof wrote:
> > >   Using cancellation is *very* difficult to do right, and you
> > > probably don't want to use it if there is any other way you can
> > > accomplish your goal.  (Such as looking at a "finish" flag as you
> > > do below.)
> >
> > I don't agree that cancellation is "very" difficult, but it does
> > require understanding of the application, and some programming
> > discipline. You have to watch out for cancellation points, and be
> > sure that you've got a cleanup handler to restore shared data
> > invariants and release resources that would otherwise be stranded if
> > the thread "died" at that point. It's no worse than being sure you
> > free temporary heap storage, or unlock your mutexes, before
> > returning from a routine... but that's not to say that it's trivial
> > or automatic. (And I'm sure we've never gotten any of those things
> > wrong... ;-) )
>   Dave has written 10x as much UNIX code as I have, so our definitions
> of "very difficult" are distinct.  (I've probably been writing MP code
> longer tho...  I built my first parallel processor using two PDP/8s
> back in '72.  Now THERE was a RISC machine!  DEC could have owned the
> world if they'd tried.  I remember when...)

Yeah, PDP-8 was a pretty good RISC, for the time. Of course it needed
virtual memory, and 12 bits would now be considered a rather "quirky"
word size. But, yeah, those could have been fixed.

Oh yeah... and we DID own the world. We just let it slip out of our
hands because we just laughed when little upstarts said they owned it.
(Unfortunately, people listened, and believed them, and eventually it
came to be true.) ;-) ;-)

>   It's that bit "to restore shared data invariants". Sybase, Informix,
> Oracle, etc. spend years trying to get this right.  And they don't
> always succeed.

It's hard to do hard things. Yeah, when you've got an extremely
complicated and extremely large application, bookkeeping gets more
difficult. This applies to everything, not just handling cancellation.
Just as running a multinational corporation is harder than running a
one-person home office. The point is: the fact that the big job is hard
doesn't mean the small job is hard. Or, you get out what you put in. Or
"thermodynamics works". Or whatever.

>   And don't forget to wait for the dead threads.  You can't do
> anything with the shared data until those have all been joined,
> because you can't be sure when they actually die.

That's untrue, as long as you use proper synchronization (or maybe "as
long as you use synchronization properly"). That's exactly why the mutex
associated with a condition wait is re-locked even when the wait is
cancelled. Cleanup code needs (in general) to restore invariants before
the mutex can be safely unlocked. (Note that while the data must be
"consistent", at the time of the wait, because waiting has released the
mutex, it's quite common to modify shared data in some way associated
with the wait, for example, adding an item to a queue; and often that
work must be undone if the wait terminates.)

You only need to wait for the cancelled thread if you care about its
return value (not very interesting in this case, since it's always
PTHREAD_CANCELED, no matter how many times you look), or if you really
want to know that it's DONE cleaning up (not merely that the shared data
is "consistent", but that it conforms to some specific consistency -- an
attempt that I would find suspicious at best, at least if there might be
more than the two threads wandering about), or if you haven't detached
the thread and want to be sure it's "gone".
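
A minimal sketch of the pattern described above, assuming a POSIX system
with deferred cancellation (the default). The names waiters, work_ready,
undo_wait and cancel_demo are hypothetical; the point is only that the
cleanup handler runs with the mutex re-locked, so it can restore the
invariant before unlocking.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int waiters;      /* shared invariant: threads inside the wait */
static int work_ready;   /* predicate; never set in this sketch */

/* Cleanup handler: entered with the mutex held, even on cancellation. */
static void undo_wait(void *arg)
{
    (void)arg;
    waiters--;                      /* undo the work tied to the wait */
    pthread_mutex_unlock(&lock);    /* only now is unlocking safe */
}

static void *waiter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    waiters++;
    pthread_cleanup_push(undo_wait, NULL);
    while (!work_ready)
        pthread_cond_wait(&cond, &lock);   /* cancellation point */
    pthread_cleanup_pop(1);                /* normal path runs it too */
    return NULL;
}

/* Cancels a blocked waiter; returns 1 if the thread reported
 * PTHREAD_CANCELED, the invariant was restored, and the mutex is free. */
int cancel_demo(void)
{
    pthread_t t;
    void *result = NULL;
    int unlocked, n;
    int rc = pthread_create(&t, NULL, waiter, NULL);
    assert(rc == 0);
    (void)rc;

    /* Seeing waiters == 1 while holding the mutex means the thread has
     * released it, which only happens inside pthread_cond_wait(). */
    do {
        pthread_mutex_lock(&lock);
        n = waiters;
        pthread_mutex_unlock(&lock);
        if (n != 1)
            sched_yield();
    } while (n != 1);

    pthread_cancel(t);
    pthread_join(t, &result);

    unlocked = (pthread_mutex_trylock(&lock) == 0);
    if (unlocked)
        pthread_mutex_unlock(&lock);
    return result == PTHREAD_CANCELED && waiters == 0 && unlocked;
}
```

The join here is only used to inspect the result; other threads could
safely lock the mutex as soon as the handler has released it.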

>   Conclusion: Dave is right (of course).  The definition of "very" is
>   up for grabs.

The definition of the word "very" is always up for grabs. As Samuel
Clemens once wrote, when you're inclined to use the word "very", write
"damn" instead; your editor will remove it, and the result will be

Sure, correct handling of cancellation doesn't happen automatically.
Neither does correct use of threads, much less correct use of the arcane
C language (and if C is "arcane", what about English!?) Somehow, we
survive all this.

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q168: Why do pthreads implementations differ in error conditions?

> I'd like to understand why pthreads implementations from different
> vendors define error conditions differently.  For example, if
> pthread_mutex_unlock is called for a mutex that is not owned by the
> calling thread.
>    Under Solaris 2.5:  "If the calling thread is not the owner of the
>    lock, no error status is returned, and the behavior of the program
>    is undefined."
>    Under AIX 4.2:  It returns the EPERM error code.
> The problem may be that the AIX 4.2 implementation is based on draft 7
> of the pthreads spec, not the standard, but I certainly prefer the AIX
> approach.
> Another example:  pthread_mutex_lock is called for a mutex that is
> already owned by the calling thread.
>    Under Solaris 2.5: "If the current owner of a mutex tries to relock
>    the mutex, it will result in deadlock." (The process hangs.)
>    Under AIX 4.2: It returns the EDEADLK error code.
> Once again, the AIX approach certainly seems preferable.
> Aren't these issues clearly defined by the pthreads standard?  If not,
> why not?

Yes, these issues are clearly defined by the POSIX standard. And it's
clearly defined in such a manner that implementations are not required
to make the (possibly expensive) checks to report this sort of
programmer error -- but so that implementations that do choose to detect
and report the error must do so using a standard error code.

In this case, Solaris 2.5 chooses not to "waste" the time it would take
to detect and report your error, while AIX 4.2 does. Both are correct
and conform to the standard. (Although, as you pointed out, AIX 4.2
implements an obsolete draft of the standard, in this respect it doesn't
differ substantially from the standard.)

The POSIX philosophy is that errors that are not under programmer
control MUST (or, in POSIX terms, "SHALL") be reported. Examples include
ENOMEM, and other resource shortages. You can't reasonably know that
there's not enough memory to create a thread, because you can't really
know how much you're using, or how much additional is required. On the
other hand, you can be expected to know that you've already locked the
mutex, and shouldn't try to lock it again. POSIX does not require that
an implementation deal gracefully with such programmer errors.
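
For code that wants the checks regardless of vendor, later revisions of
the standard (UNIX98 and after) let you ask for them per mutex. The
following is a sketch assuming PTHREAD_MUTEX_ERRORCHECK is available on
your platform; errorcheck_mutex_init is a hypothetical helper name.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: initialize m as an error-checking mutex.
 * Returns 0 on success or a pthreads error number. */
int errorcheck_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask the library to detect relock and non-owner unlock. */
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);

    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

With such a mutex, a second pthread_mutex_lock() by the owner returns
EDEADLK and an unlock by a thread that does not hold it returns EPERM,
at the cost of the extra bookkeeping on every call.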

While it is nice to have a "debugging mode" where all programmer errors
are detected, in the long run it's more important to have a "production
mode" where such extremely critical functions as mutex lock and unlock
execute as quickly as possible. In general, the only way to do both is
to have two separate libraries. This complicates maintenance
substantially, of course -- but it also complicates application
development because the two libraries will have different timings, and
will expose different problems in the application design. Which means
you'll inevitably run into a case that only fails on the "production
library", and can't be debugged using the "debug library". That means
the development and maintenance costs of building and shipping two
thread libraries usually aren't worthwhile.

You're better off relying on other tools to detect this class of
programming error. For example, the Solaris "lock_lint" program, or
dynamic trace analysis tools that can check for incorrect usage at
runtime in your real program.

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q169: Mixing threaded/non-threadsafe shared libraries on DU  
Claude Vallee wrote:

> Thanks Dave Butenhof for your excellent answer.  I just have a few
> complementary questions.
> + To make this clear, prior to 4.0D, your liba code running in the
> + main thread may see errno change at random. As long as liba doesn't
> + read errno, this shouldn't be a problem.
> +
> I found out I was using 4.0B.  Is errno the only problem area of
> the c run time library?  What about other libraries like librt?

The base kit libraries should be thread-safe. There are so many, though,
that I'm afraid I can't claim personal knowledge that they all ARE
thread-safe. I also know of at least one case where a base kit library
was "almost" thread-safe, but had been compiled without -D_REENTRANT
(making it subject to errno confusion). Bugs, unfortunately, are always
possible. While it's not that hard to code a library for thread-safety,
it's nearly impossible to PROVE thread-safety -- because problems come
in the form of race conditions that are difficult to predict or provoke.

> + You do have to be aware of whether liba SETS the global errno --
> + because your thread-safe code won't see the global errno through any
> + normal mechanisms.
> What do you mean by that?  Yes, liba sets errno each time it calls a
> system service (the system service sets it actually).  If you're
> asking if it explicitly sets it, then no.  Are you asking if I am
> counting on setting errno from one thread and reading it from the
> other thread counting on the value to be the same?

Calling system services doesn't count. The libc syscall stubs that
actually make the kernel call DO handle errno correctly with threads.
(On the other hand, if your library runs in a threaded application and
isn't built correctly, you'll end up looking at the global errno while
the system/libc just set your thread errno. That's the point of my 4.0D
change -- at least if that non-thread-safe code runs in the initial
thread, it'll get the right errno.)
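
One way to check which errno a piece of code is actually looking at is to
compare addresses. This is a sketch assuming code that is built for
threads (on the Digital UNIX discussed here, that means -D_REENTRANT or
"cc -pthread"); errno_is_per_thread is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Runs in the second thread: records where ITS errno lives. */
static void *record_errno_address(void *out)
{
    *(int **)out = &errno;
    return NULL;
}

/* Returns 1 if a second thread sees a different errno object than the
 * calling thread, 0 if both share a single (global) errno. */
int errno_is_per_thread(void)
{
    pthread_t t;
    int *other = NULL;
    int rc;

    rc = pthread_create(&t, NULL, record_errno_address, &other);
    assert(rc == 0);
    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;

    return other != NULL && other != &errno;
}
```

A library compiled without the threaded errno definition would see the
same address from every thread, which is exactly the confusion described
above.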

> By the way, seterrno() does not seem to be a public service (it
> doesn't have a man page anyway; I found _Seterrno() in errno.h, but
> we're certainly not using it).

You're right -- int _Geterrno() and _Seterrno(int) are the external
interfaces. I'd recommend compiling for the threaded errno rather than
using those calls, though.

> + No, _THREAD_SAFE doesn't do anything. It's considered obsolete. You
> + should use _REENTRANT. (Though I actually prefer the former term,
> + I've never felt it was worth arguing, or making anyone change a ton
> + of header files.)
> Ok, _THREAD_SAFE is out.  Then, if I define _REENTRANT when compiling
> all my sources, and I explicitly link with libpthread, libmach,
> libc_r, and all the reentrant versions of my libraries, will this
> achieve the same thing as using the "-pthread" option?  (Or am I
> playing with fire again?).

We document the equivalents, and it's perfectly legal to use them.
However, the actual list of libraries may change from time to time, and
you'll get the appropriate set for the current version, automatically,
when you link with "cc -pthread". Over time, using the older set of
libraries may leave you carrying around extra baggage. For example, your
reference to libc_r is long-since obsolete; and on 4.0D, threaded
applications no longer need libmach. While -pthread links automatically
stop pulling in these useless references, you'll still be carrying them
around, costing you extra time at each program load, as well as extra
virtual memory. Is it a big deal? That's up to you. But if you're using
a compiler that supports "-pthread", I'd recommend using it.

> + If liba is YOUR code, then please don't play this
> + game: build it at least with -D_REENTRANT.
> Yes, liba is our code... Actually, liba is a set of hundreds of
> libraries which take a weekend to build.  And most of our processes
> are not multithread.  What I was trying to achieve is to save on
> processing time (use non thread-safe libraries in single threaded
> processes), and to save on compile time (not building both single
> threaded and multi threaded versions of all the libraries).

If you just compile with -D_REENTRANT, you'll get thread-safe errno, but
that's only a small part of "thread safety". As long as it's only called
in one thread, that's probably OK. For more general thread-safety, with
very little performance impact on non-threaded processes, you might look
into the TIS interfaces (tis_mutex_lock, tis_mutex_unlock, etc.). You
can use these just like the equivalent POSIX functions; in a threaded
process, they do the expected synchronization, but in a non-threaded
process they avoid the cost of interlocked data access and memory
barrier instructions, giving very low overhead. (TIS is better than
having pthread stubs in libc, because it works even when the main
program isn't linked against libpthread.)

> Thanks again for a thorough answer.  By the way, for some reason, I
> could never get the answer to my question from my news server (I got
> it from searching usenet through altavista), so please reply by email
> as well as through the newsgroup.

Yeah, news servers can be annoying creatures, with their own strange
ideas of what information you should be allowed to see. You really can't
trust them! ;-)

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q170: sem_wait() and EINTR  

W. Richard Stevens wrote:

> Posix.1 says that sem_wait() can return EINTR.  Tests on both Solaris 2.6
> and Digital Unix 4.0b show that both implementations do return EINTR when
> sem_wait() is interrupted by a caught signal.  But if you look at the
> (albeit simplistic) implementation of semaphores in the Posix Rationale
> that uses mutexes and condition variables (pp. 517-520) sem_wait() is
> basically:
>     int
>     sem_wait(sem_t *sem)
>     {
>             pthread_mutex_lock(&sem->mutex);
>             while (sem->count == 0)
>                     pthread_cond_wait(&sem->cond, &sem->mutex);
>             sem->count--;
>             pthread_mutex_unlock(&sem->mutex);
>             return(0);
>     }
> But pthread_cond_wait() does not return EINTR so this function will never
> return EINTR.  So I was wondering how existing implementations actually
> implement sem_wait() to detect EINTR.  Seems like it would be a mess ...
>         Rich Stevens

On Digital UNIX, sem_wait() turns into a system call, with the usual behavior
in regard to signals and EINTR. You can only implement POSIX semaphores via
pthreads interfaces if you have support for the PTHREAD_PROCESS_SHARED synch
attribute. Digital UNIX won't have that support until an upcoming major
release. At that point, it's entirely possible that the POSIX semaphores
will be re-implemented to use the model you cite. I certainly
encouraged the maintainer to do so before I left...
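
From the caller's side, the practical consequence is that sem_wait() may
need to be restarted after a caught signal. A minimal sketch, assuming
the POSIX behavior of returning -1 with errno set to EINTR;
sem_wait_restart is a hypothetical wrapper name.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stddef.h>

/* Hypothetical wrapper: wait on sem, restarting whenever a caught
 * signal interrupts the wait. Returns 0 on success, -1 (with errno
 * set by sem_wait) on any other failure. */
int sem_wait_restart(sem_t *sem)
{
    assert(sem != NULL);

    while (sem_wait(sem) == -1) {
        if (errno != EINTR)
            return -1;      /* a real error, not an interruption */
    }
    return 0;
}
```

Whether restarting is the right policy depends on the application; code
that uses signals to break out of a wait would check for EINTR instead
of looping.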

Jeff Denham (

Bright Tiger Technologies
SmartCluster Software for Distributed Web Servers

 Q171: pthreads and sprocs  

Peter Shenkin wrote:
> Can someone out there say something about which is better to use
> for new code -- pthreads or sprocs -- and about what the tradeoffs
> are?

A good question.  I've done little with sprocs, but have used pthreads
on a 32p SGI Origin a good deal.

Sprocs are heavyweight processes; pthreads are MUCH lighter weight.

Sprocs have a considerable amount of O/S and programmer tool support 
for high performance programming; as yet, pthreads has almost none.

Sprocs lend themselves to good fine-grain control of resources
(memory, CPU choice, etc); as yet these strengths are largely lacking 
in SGI pthreads.

The project on which I work has bet the farm on the present and future 
high performance of pthreads and the results so far have been good.
However, we would dearly love for SGI and the rest of the parallel
programming community to better support pthreads as well as they have 
their former proprietary parallel programming models so that we can
control our threads as specifically as we could our sprocs and the like.

Not really a complaint; more of a strong request.

The upshot is, IMHO, go ahead and program in pthreads on SGIs.  The
performance gain you would have gotten from better control of your
sprocs is made up for in the "portability" of pthreads, the rosier
future of pthreads, and their more modest system resource use.

Not my company's official position, I should add.

Randy Crawford

 Q172: Why are Win32 threads so odd?  

Bil Lewis  wrote in article
>   You must know the folks at MS who did Win32 threads (I assume).

Bad assumption. I know of them, but don't know them and they don't know me.

> Some of the design sounds so inefficient and awkward to use, while
> other bits look really nice. 

My own opinion is that win32 grew from uni-thread to pseudo-multi-thread in
a very haphazard manner; basically, features were added when they were found
to be needed.
I personally dislike the overall asymmetric properties of the API. Consider
the current problem of providing POSIX condition variables: if you could
release a single waiter on a manual reset event, or release all waiters on
an autoreset event then the problem would be much simpler to solve.
Consider also that you can SignalObjectAndWait() but there is no
corresponding SignalObjectAndWaitMultiple() - another change that would
make writing various forms of CVs easier.

>   Are my views widely shared in the MS world?  And why did they
> choose that design?  Have they thought of adopting the simpler
> POSIX design (if not POSIX itself)?

No idea, sorry. Try asking Dave Cutler; he was one of the main thread
architects AFAIK.


Bil Lewis wrote:

>   The real question is: "What the heck was Cutler (I assume?) thinking when he made
> Win32 mutexes kernel objects?"

Well, Dave has a history in this department. Consider his
Digital VAXELN realtime executive, a direct predecessor of
WNT. It was designed and written by Cutler (and some other
folks who later contributed to NT, like Darryl Havens)
to run on the MicroVAX I waaaaay back in the early '80s
at DECWest in Seattle. (Development soon moved to a
dedicated realtime group at the Mill in Maynard.)

VAXELN had processes (called jobs) and threads (called
processes) and kernel objects (PROCESS, DEVICE, SEMAPHORE,
AREA, PORT, MESSAGE). It ran memory-resident on embedded
VAXes (or VAX 9000s for that matter), and let you program
device drivers (or whatever) in Pascal, C, FORTRAN,
or Bliss even. (Pretty nifty little concurrent environment,
a little bit too ahead of its time for its own good, I guess.)

The only synchronization object provided originally was the
semaphore, which like the NT mutex, required a trip into
the kernel even for uncontested locking. This of course
proved to be too expensive for real-world concurrent
programming, so a library-based optimized synch. object
was developed.

It had a count and an embedded binary semaphore object
that could be locked quickly in user space through the
use of the VAX ADAWI interlocked-increment instruction.
A system call occurred only for blocking and wakeup on a
contested lock.

Sounds just like an NT critical section, huh?

Ironically, in VAXELN it was called a MUTEX! History
repeats itself, with only the names changed to protect
the guilty...
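
The "count plus embedded semaphore" design can be sketched in portable
terms. This is not the VAXELN or NT code: C11 atomics stand in for the
ADAWI instruction, a POSIX semaphore stands in for the kernel object,
and every name (fast_lock and friends) is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_int count;  /* threads holding or waiting for the lock */
    sem_t      sem;    /* kernel object, touched only on contention */
} fast_lock;

int fast_lock_init(fast_lock *l)
{
    atomic_init(&l->count, 0);
    return sem_init(&l->sem, 0, 0);
}

void fast_lock_acquire(fast_lock *l)
{
    /* Uncontended: the interlocked increment is the whole operation. */
    if (atomic_fetch_add(&l->count, 1) > 0) {
        /* Contended: block until the owner hands the lock over. */
        while (sem_wait(&l->sem) == -1 && errno == EINTR)
            ;  /* restart if a signal interrupted the wait */
    }
}

void fast_lock_release(fast_lock *l)
{
    /* Enter the kernel only if another thread has registered. */
    if (atomic_fetch_sub(&l->count, 1) > 1)
        sem_post(&l->sem);
}

static fast_lock demo_lock;
static long demo_counter;   /* protected by demo_lock */

static void *demo_worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 50000; i++) {
        fast_lock_acquire(&demo_lock);
        demo_counter++;
        fast_lock_release(&demo_lock);
    }
    return NULL;
}

/* Four threads increment a shared counter under the lock. */
long fast_lock_demo(void)
{
    pthread_t t[4];
    int rc = fast_lock_init(&demo_lock);
    assert(rc == 0);
    demo_counter = 0;
    for (int i = 0; i < 4; i++) {
        rc = pthread_create(&t[i], NULL, demo_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    (void)rc;
    return demo_counter;
}
```

Unlike an NT critical section, this sketch keeps no owner record and does
not support recursive acquisition; it only shows where the system calls
are avoided.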


 Q173: What's the point of all the fancy 2-level scheduling??

> In article <>,
>   Jeff Denham  wrote:
> >
> > Boris Goldberg wrote:
> >
> > Seriously, I've been around this Solaris threads package long enough to
> > be wondering how often anyone is using PROCESS scope threads.
> > With everyone just automatically setting SYSTEM scope threads
> > to get the expected behavior, what's the point of all the fancy 2-level
> > scheduling??
> I think, it is better to use thr_setconcurrency to create #of processors
> + some additional number (for I/O bound threads) of LWPs rather than
> creating LWP for each thread. Can Dave Butenhof or somebody from
> Sun thread designer team please comment on this?

That's kinda funny, since I've got no connection with Sun.

The real problem is that Sun's "2 level scheduling" really isn't at all like
2-level scheduling should be, or was intended to be. There's a famous paper
from the University of Washington on "Scheduler Activations" (one of Jeff
Denham's replies to this thread mentioned that term, so you may have noticed
it), which provides the theoretical basis for modern attempts at "2 level
scheduling". Both Sun and Digital, for example, claim this paper as the
basis for our respective 2-level scheduling models.

However, while we (that's Digital, not Sun) began with a model of 2-way
communication between kernel and user schedulers that closely approximates
the intended BEHAVIOR (though not the detailed implementation) of scheduler
activations, I have a hard time seeing anything usefully similar in Solaris.
They have a signal to save the process from total starvation when the final
thread blocks in the kernel (by giving the thread library a chance to create
another LWP). We automatically generate a new "replacement VP" so that the
process always retains the maximum level of concurrency to which its

The advantages of 2-level scheduling are in performance and scaling.

  1. Scaling. Kernel threads are kernel resources, and (as with processes),
     there are strict limits to how many a kernel can handle. The limits are
     almost always fixed by some configuration process, and additionally
     limited by user quotas. Why? Because they're expensive -- not just to
     the process that uses them, but to the system as a whole. User-mode
     threads, on the other hand, are "just" virtual memory, and, in
     comparison, "dirt cheap". So you can create a lot more user threads
     than kernel threads. Yeah, the user threads can't all run at the same
     time... but neither can the kernel threads, because the number of
     processors (being a relatively expensive HARDWARE resource) is even
     more limited. The point is to balance the number of kernel threads
     against the "potential parallelism" of the system (e.g., the number
     of processors), while balancing the number of user threads against the
     "potential concurrency" of the process (the maximum parallelism plus
     the maximum number of outstanding I/O operations the process might be
     able to undertake). [On Solaris, you do this manually by creating LWPs
     -- either by creating BOUND (SCS) threads, or by calling
     thr_setconcurrency. On Digital UNIX, this is done for you automatically
     through the integration between user and kernel level schedulers.]
  2. Performance. In many typical workloads, most of the communication is
     between threads within the process. Synchronization involves mutexes
     and condition variables. A 2-level scheduler can optimize these types
     of synchronization, and the resulting context switches, without
     involving the kernel at all. A kernel thread needs to call into the
     kernel to block -- and then another thread needs to call into the
     kernel again to unblock it. A user thread (or a 2-level thread blocking
     in user mode) only needs to call into the thread library. Because a
     call into the kernel is more expensive than a call within the process
     (and usually LOTS more expensive), this can save a lot of time over the
     life of a process.
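
In POSIX terms, the choice between the two kinds of thread is made through
the contention scope attribute. The following sketch asks for a process
scope (PCS) thread and falls back to system scope (SCS) where the
implementation refuses; create_with_preferred_scope is a hypothetical
helper, and which branch is taken depends entirely on the platform.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static void *noop(void *arg)
{
    return arg;
}

/* Hypothetical helper: create a thread with PTHREAD_SCOPE_PROCESS if the
 * implementation supports it, otherwise with PTHREAD_SCOPE_SYSTEM.
 * Stores the scope requested in *scope_used; returns 0 on success. */
int create_with_preferred_scope(pthread_t *t, int *scope_used)
{
    pthread_attr_t attr;
    int rc;

    assert(t != NULL && scope_used != NULL);

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    *scope_used = PTHREAD_SCOPE_PROCESS;
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_PROCESS);
    if (rc != 0) {
        /* No 2-level scheduling here: every thread is a kernel thread. */
        *scope_used = PTHREAD_SCOPE_SYSTEM;
        rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    }
    if (rc == 0)
        rc = pthread_create(t, &attr, noop, NULL);

    pthread_attr_destroy(&attr);
    return rc;
}
```

POSIX lists the contention scope among the attributes governed by
inheritsched, so some implementations may only honor the request when
PTHREAD_EXPLICIT_SCHED is also set on the attributes object.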

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q174: Using the 2-level model, efficiency considerations, thread-per-X
Bil Lewis wrote:

> My experience (which is fairly broad), and my opinion (which is
> extensive) is that THE way to do this is this:

Yes, Bil, your experience is extensive -- but unfortunately mostly limited
to Solaris, which has a poor 2-level scheduling design. (Sorry, guys, but
it's the truth.)

>   Use only System scoped threads everywhere.  On your selected
> release platforms, tune the number of threads for each configuration
> that's important.  (Take educated guesses at the rest.)  For servers,
> do everything as a producer/consumer model.  Forget Thread-per-request,
> thread-per-client, etc.  They are too hard to tune well, and too
> complex.

No, use SCS ("system scoped") threads only where absolutely necessary. But,
yeah, in recognition of your extensive experience, I would acknowledge that
on Solaris they are (currently) almost always necessary. (Solaris 2.6 is
supposed to be better than 2.5, though I haven't been able to try it, and I
know that Solaris developers have hopes to make substantial improvements in
the future.) This is, however, a Solaris oddity, and not inherent in the
actual differences between the PCS ("process scoped") and SCS scheduling

On the rest, though -- I agree with Bil that you should avoid "thread per
request" in almost all cases. Although it seems like a simple extension
into threads, you usually won't get what you want. This is especially true
if your application relies on any form of "fairness" in managing the
inevitable contention between clients, because "thread per request" will
not behave fairly. You'll be tempted to blame the implementation when you
discover this, but you'll be wrong -- the problem is in the application.
The best solution is to switch to a client/server (or "producer/consumer")
model, where you control the allocation and flow of resources directly.
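
A minimal sketch of such a producer/consumer arrangement: a fixed pool of
workers draining a bounded queue. The queue size, worker count, and all
names are assumptions chosen for illustration, and "serving" a request
here just adds it to a total.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP   8
#define NUM_WORKERS 3

static pthread_mutex_t q_lock      = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  q_not_empty = PTHREAD_COND_INITIALIZER;
static pthread_cond_t  q_not_full  = PTHREAD_COND_INITIALIZER;
static int  queue[QUEUE_CAP];
static int  q_head, q_count, q_closed;
static long total;              /* protected by q_lock */

/* Producer side: blocks while the queue is full, which is what
 * throttles clients instead of creating one thread per request. */
static void submit(int request)
{
    pthread_mutex_lock(&q_lock);
    while (q_count == QUEUE_CAP)
        pthread_cond_wait(&q_not_full, &q_lock);
    queue[(q_head + q_count) % QUEUE_CAP] = request;
    q_count++;
    pthread_cond_signal(&q_not_empty);
    pthread_mutex_unlock(&q_lock);
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&q_lock);
    for (;;) {
        while (q_count == 0 && !q_closed)
            pthread_cond_wait(&q_not_empty, &q_lock);
        if (q_count == 0)
            break;                      /* closed and fully drained */
        int request = queue[q_head];
        q_head = (q_head + 1) % QUEUE_CAP;
        q_count--;
        pthread_cond_signal(&q_not_full);
        pthread_mutex_unlock(&q_lock);  /* real work happens unlocked */
        long result = request;
        pthread_mutex_lock(&q_lock);
        total += result;
    }
    pthread_mutex_unlock(&q_lock);
    return NULL;
}

/* Submits requests 1..n to the pool and returns the sum of the results. */
long serve_requests(int n)
{
    pthread_t workers[NUM_WORKERS];
    int rc;

    q_head = q_count = q_closed = 0;
    total = 0;
    for (int i = 0; i < NUM_WORKERS; i++) {
        rc = pthread_create(&workers[i], NULL, worker, NULL);
        assert(rc == 0);
    }
    for (int r = 1; r <= n; r++)
        submit(r);

    pthread_mutex_lock(&q_lock);
    q_closed = 1;                       /* no more requests will arrive */
    pthread_cond_broadcast(&q_not_empty);
    pthread_mutex_unlock(&q_lock);

    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_join(workers[i], NULL);
    (void)rc;
    return total;
}
```

The number of workers is the tuning knob: it bounds concurrency directly,
and the order in which requests leave the queue is under the
application's control rather than the scheduler's.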

>   Process scoped threads are good for a very small number of unusual
> examples (and even there I'm not totally convinced.)

On the contrary, PCS threads are best except for the very few applications
where cross-process realtime scheduling is essential to correct operation
of the application. (E.g., for direct hardware access.)

>   Simplicity rules.

Right. (Fully recognizing the irony of agreeing with a simplistic statement
while disagreeing with most of the philosophy behind it.)

>   Logic:  Process scope gives some nice logical advantages in design,
> but most programs don't need that.  Most programs want to run fast.
> Also, by using System scoped threads, you can monitor your LWPs,
> knowing which is running which thread.

SCS gives predictable realtime scheduling response across the system, but
most programs don't need that. Most programs want to run fast, and you'll
usually get the most efficient execution, and the best management of system
resources, by using PCS threads. "Monitoring" your LWPs might be
comforting, but probably provides no useful understanding of the
application performance. You need an analysis tool that understands the
CONTENTION between your execution contexts, not merely the identity of the
execution contexts. Such an analysis tool can understand PCS threads as
well as SCS threads.

>   Anywhere you're creating threads dynamically, you need to know
> how many threads you're creating and ensure you don't create too
> many.  (Easy to mess up!)  By using a P/C model, you create exactly
> the right number of threads (tuned to machine, CPUs, etc.) and don't
> have to think about them.  If you run at lower than max capacity, having
> a few idle threads is of very little concern.

Remember that "concurrency" is much more useful to most applications than
"parallelism", and is harder to tune without detailed knowledge of the
actual workload. When you're doing I/O, your "concurrency width" is often
far greater than your "execution width". It's often useful, for example, to
dynamically increase the number of "servers" in a process to balance a
large workload, because each server might be blocked on one client's
request for a "long time". Dynamic creation isn't necessarily tied to a
simplistic "thread per request" model.

>   Opinions may be worth what you paid for 'em.

No, no. Opinions are hardly ever worth as much as you paid for them, and
usually a good deal less. Information, however, may be worth far more. One
might hope that in the process of airing our worthless opinions, we have
incidentally exposed some information that might help someone! ;-)

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q175: Multi-platform threading api

From: "Constantine Knizhnik" 

I also have developed threads class library, which provides three system
dependent implementations: based on Posix threads, Win32 and portable
cooperative multitasking using setjmp() and longjmp(). This library was used
in OODBMS GOODS for implementing server and client parts. If somebody is
interested in this library (and in OODBMS itself), it can be found at

Jason Rosenberg wrote in message <>...
>I am tasked with converting a set of large core C libraries
>to be thread-safe, and to use and implement a multi-platform
>api for synchronization.  We will need solutions for
>Solaris 2.5.1, Digital Unix 4.0B(D?), Irix 6.2-6.5,
>HP-UX 10.20, AIX 4.1.2+, Windows NT 4.0(5.0) and Windows 95(98).
>I have built a basic wrapper api which implements a subset
>of pthreads, and have it working under Digital Unix 4.0B,
>Irix 6.2 (using sprocs), and Windows NT/95.  I am in the
>process of getting it going on the other platforms.

 Q176: Condition variables on Win32   


> I don't see that this is justifiable.

Have you ever seen any of the tortuous attempts by bright fellows like
Jeffrey Richter to define relatively simple abstractions like
readers/writer locks using the Win32 synchronization primitives?  It's
not pretty...  In fact, the first several editions of his Advanced
Windows book were full of bugs (in contrast, we got that right in ACE
using CVs in about 20 minutes...).  If someone of his calibre can't
get this stuff right IN PRINT (i.e., after extensive reviews) I don't
have much faith that garden variety Win32 programmers are going to
have a clue...

> It might be harder if you think in terms of the POSIX facilities. I
> wouldn't say that the combination of a semaphore and mutex or
> critsec is hard though, and the inconvenience of having to acquire
> the mutex after waiting on the semaphore is balanced against
> checking for false wakeups and being unable to signal n resources.

Checking for false wakeups is completely trivial.  I'm not sure what
you mean by "signal n resources".  I assume you're referring to the
fact that condition variables can't be used directly in a
WaitForMultipleObjects()-like API.  Though clearly you can broadcast
to n threads.
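
To make the "trivial" check concrete: handling false wakeups amounts to re-testing the predicate in a loop around the wait. The following is only a sketch using POSIX condition variables from C++; the `Gate` structure and the function names are hypothetical, invented for illustration, and the limit of 16 threads is arbitrary.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical one-shot gate: a flag guarded by a mutex and a CV.
struct Gate {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            open;
};

void gate_init(Gate* g) {
    pthread_mutex_init(&g->lock, 0);
    pthread_cond_init(&g->cond, 0);
    g->open = false;
}

// Wait until the gate is open. The while loop is the whole false-wakeup
// check: after every return from pthread_cond_wait() the predicate is
// tested again.
void gate_wait(Gate* g) {
    pthread_mutex_lock(&g->lock);
    while (!g->open)
        pthread_cond_wait(&g->cond, &g->lock);
    pthread_mutex_unlock(&g->lock);
}

// Open the gate and wake every waiter (a broadcast to n threads).
void gate_open(Gate* g) {
    pthread_mutex_lock(&g->lock);
    g->open = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->lock);
}

extern "C" void* gate_waiter(void* arg) {
    gate_wait(static_cast<Gate*>(arg));
    return 0;
}

// Start up to 16 waiters, open the gate, join them all.
// Returns the number of threads that were joined.
int run_gate_demo(int n) {
    Gate g;
    gate_init(&g);
    pthread_t tid[16];
    if (n > 16) n = 16;
    int started = 0;
    for (int i = 0; i < n; ++i)
        if (pthread_create(&tid[started], 0, gate_waiter, &g) == 0)
            ++started;
    gate_open(&g);
    int joined = 0;
    for (int i = 0; i < started; ++i)
        if (pthread_join(tid[i], 0) == 0) ++joined;
    pthread_cond_destroy(&g.cond);
    pthread_mutex_destroy(&g.lock);
    return joined;
}
```

Because the predicate is checked under the mutex, it does not matter whether gate_open() runs before or after a given waiter reaches pthread_cond_wait().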

> I had in mind that it is allowed to notify more than one thread
> (which always seemed odd to me) but I don't have my POSIX spec
> handy.  Just a nit, but it does stress that false wakeups must be
> handled.

I had it this way originally, but others (hi David ;-)) pointed out
that this was confusing, so I'm grudgingly omitting it from this
discussion.  I suppose anyone who really wants to understand POSIX CVs
ought to read more comprehensive sources (e.g., Bil's book) than my

> I wish.  In fact they store details about the owning thread and
> support recursive acquisition.  I think this was a screwup by the NT
> designers - critical sections are needlessly expensive.  For very
> basic requirements (I believe) you can get a performance gain using
> the InterlockedIncrement et al for spin locks and an auto reset
> event to release waiters.  (Not sure I can prove it at the moment
> though.  If it makes a measurable difference, you have bigger
> problems than the efficiency of mutexes)

BTW, there's been an interesting discussion of this on the
comp.programming.threads newsgroup recently.  You might want to check
this out.

> This occurs in several examples from 3.1 on.  (Most of them? ...).
> Clear copy-paste bug I'd guess.  Better change to
> ReleaseMutex(external_mutex).

Hum, I'm not sure why you say this.  In all these cases the
"pthread_mutex_t" is typedef'd to be a CRITICAL_SECTION.  Am I missing
something here?

> I think this is rather optimistic.  There is no guarantee that any
> of them will release the mutex in a timely fashion. My original
> objection was to a solution that used a semaphore or other count of
> tokens, and that one thread could loop quickly and steal several
> tokens, leaving threads still blocked.

Right, that was the original discussion that triggered this paper.

> >  EnterCriticalSection (external_mutex);
> Another copy-paste bug.  WaitForSingleObject?

Can you please point out where you think these problems are occurring?
As far as I can tell, everything is typedef'd to be a CRITICAL_SECTION
(except for the SignalObjectAndWait() solution).

Take care,

 Q177: When stack gets destroyed relative to TSD destructors?  

Douglas C. Schmidt wrote:

>         Can someone please let me know if POSIX pthreads specifies
> when a thread stack gets destroyed relative to the time at which the
> thread-specific storage destructors get run?  In particular, if a
> thread-specific destructor accesses a pointer to a location on the
> run-time stack, will this memory still exist or will it be gone by the
> time the destructor runs?

Thread-specific data destructors must run in the context of the thread
that created the TSD value being destroyed. (This is clearly and
unambiguously implied by the standard. That is, while the standard
doesn't explicitly require this, an implementation that called
destructors in some other context would present a wide range of severe
restrictions in behavior that are not allowed by the standard.) Thus, the
stack must exist and remain valid at this point.
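
One way to observe the "context of the thread" part is to have the destructor compare pthread_self() with the identity of the thread that set the value. This is a minimal sketch, not taken from the original post; the `Record` type and function names are hypothetical. It deliberately does not read the exiting thread's local variables, since the start routine has already returned by the time the destructor runs.

```cpp
#include <pthread.h>
#include <cassert>

// Hypothetical record handed to the thread as its TSD value.
struct Record {
    pthread_t owner;        // thread that called pthread_setspecific()
    bool      same_thread;  // set by the destructor
};

static pthread_key_t record_key;

// TSD destructor: checks that it is running in the owning thread.
extern "C" void record_destructor(void* value) {
    Record* r = static_cast<Record*>(value);
    r->same_thread = pthread_equal(r->owner, pthread_self()) != 0;
}

extern "C" void* record_worker(void* arg) {
    Record* r = static_cast<Record*>(arg);
    r->owner = pthread_self();
    pthread_setspecific(record_key, r);
    return 0;               // the destructor runs as this thread exits
}

// Returns true if the destructor ran in the thread that set the value.
bool destructor_runs_in_owner_thread() {
    Record r;
    r.same_thread = false;
    if (pthread_key_create(&record_key, record_destructor) != 0)
        return false;
    pthread_t t;
    if (pthread_create(&t, 0, record_worker, &r) != 0) {
        pthread_key_delete(record_key);
        return false;
    }
    pthread_join(t, 0);     // the Record lives on the joiner's stack
    pthread_key_delete(record_key);
    return r.same_thread;
}
```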

After a thread has terminated (having completed calling all cleanup
handlers and destructors), "the result of access to local (auto)
variables of the thread is undefined". E.g., at this point, (possibly
before a pthread_join or pthread_detach), the stack may have been

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
 Q178: Thousands of mutexes?  

Peter Chapin wrote:

> I'm considering a program design that would involve, potentially, a large
> number of mutexes. In particular, there could be thousands of mutexes
> "active" at any one time. Will this cause a problem for my hosting
> operating system or are all the resources associated with a mutex in my
> application's address space? For example, in the case of pthreads, are
> there any resources associated with a mutex other than those in the
> pthread_mutex_t object? Is the answer any different for Win32 using
> CRITICAL_SECTION objects? (I know that there are system and process limits
> on the number of mutexes that can be created under OS/2... last I knew it
> was in the 64K range).

POSIX mutexes are usually user space objects, so the limit is purely based on
your virtual memory quotas. Usually, they're fairly small. Some obscure hardware
may require a mutex to live in special memory areas, in which case there'd be a
system quota -- but that's not relevant on any modern "mainstream" hardware.

On an implementation with 1-1 kernel threads (AIX 4.3, HP-UX 11.0, Win32), there
must be some "kernel component" of a mutex -- but this may be no more than a
generic blocking channel, with the actual synchronization occurring in
user-mode, so there may be no persistent kernel resources involved. Win32
mutexes are pure kernel objects -- critical sections, I believe, are user
objects with a kernel blocking channel (but I don't know whether the kernel
resource is persistent or dynamic).

Similarly, even on a 2-level scheduling implementation (Solaris, Digital UNIX,
and IRIX), a "process shared" mutex (a POSIX option that allows placing a mutex
in shared memory and synchronizing between processes) requires a kernel blocking
channel: but again, the persistent state may live completely in user-space. A
process private (default) mutex, on a 2-level scheduling implementation, is
almost certainly always a pure user-mode object.

Any more detailed answers will require knowing exactly what O/S (and version)
you intend to use.
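
A quick way to check a particular system is simply to try it. The sketch below creates a batch of process-private mutexes in ordinary memory, exercises each once, and reports how many could be initialized; the function name is hypothetical and the count is whatever the caller chooses.

```cpp
#include <pthread.h>
#include <cstddef>
#include <vector>
#include <cassert>

// Initialize `count` default (process-private) mutexes, lock and unlock
// each one once, then destroy them. Returns how many were initialized
// before the first failure, so a result below `count` indicates a limit.
std::size_t exercise_mutexes(std::size_t count) {
    std::vector<pthread_mutex_t> locks(count);   // plain user memory
    std::size_t made = 0;
    while (made < count && pthread_mutex_init(&locks[made], 0) == 0)
        ++made;
    for (std::size_t i = 0; i < made; ++i) {
        pthread_mutex_lock(&locks[i]);
        pthread_mutex_unlock(&locks[i]);
    }
    for (std::size_t i = 0; i < made; ++i)
        pthread_mutex_destroy(&locks[i]);
    return made;
}
```

The memory cost is roughly count * sizeof(pthread_mutex_t); whether anything beyond that is consumed depends on the implementation, as described above.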

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q179: Threads and C++  

In article <>,
  Dave Butenhof  wrote:
> Federico Fdez Cruz wrote:
> >         When I create a thread using pthread_create(...), can I ask to the
> > thread to execute a object method instead other "normal" function?
> >         I have tried this, but it seems like the thread doesn't know anything
> > about the object. If I write code in the method that doesn't use any
> > member of the object, all goes well; but If I try to access a member
> > function or a member in the object, I get a segmentation fault.
> >         I have seen that when the new thread is executing, "this" is NULL from
> > inside this thread.
> Yes, you can do this. But not portably. There's no portable C++ calling


> My advice would be to avoid the practice. POSIX 1003.1c-1995 is a C language
> standard, and there are, in general, few portability guarantees when crossing
> language boundaries. The fact that C and C++ appear "similar" is, in some ways,
> worse than crossing between "obviously dissimilar" languages.

In fact, there are significant advantages in doing MT programming in C++.
All the examples in Butenhof D.'s book follow a model of cleaning up (C++
destructor) at the end of block or on error (C++ exception) and can be
written more elegantly in C++ using object (constructor/destructor) and
exceptions. Of course, you will read Butenhof's book for threads (and not

> Use a static non-member function for pthread_create, and have it call your member
> function. There's no need to bend the rules for a minor convenience, even when
> you can depend on an implementation where the direct call happens to work.

You can use the following method in C++ :

#include <pthread.h>

class Thread
{
public:
  typedef void* (*StartFunction)(void* arg);
  enum DetachedState { joinable, detached };

  Thread(StartFunction sf, void* arg, DetachedState detached_state);
  void join();
private:
  pthread_t thread_;
};

// use this to create a thread with a global function entry.
inline Thread Thr_createThread(Thread::StartFunction sf,
                               void* arg,
                               Thread::DetachedState ds)
{
   return Thread(sf, arg, ds);
}

// Common base class, so the start function can call run() without
// knowing T. (Reconstructed: its declaration did not survive in this
// archive, but the code below refers to it.)
class Action
{
public:
   virtual ~Action() {}
   virtual void* run() = 0;
};

// This is for C++ class member function (non-static)
template <class T>
class ThrAction : public Action
{
public:
   typedef void* (T::*MemberFunction)();

   ThrAction(T* object, MemberFunction mf)
     : object_(object), mf_(mf) {}
   void* run() {
     return (object_->*mf_)();
   }
private:
   T* object_;
   MemberFunction mf_;
};

// an implementation class : notice the friend
class ThreadImpl
{
  static void* start_func(void* arg) {
    return ((Action*)arg)->run();
  }
  friend Thread Thr_createThread(Action* action,
                                 Thread::DetachedState ds);
};

// You use this to create a thread using a member function
inline Thread Thr_createThread(Action* action,
                               Thread::DetachedState ds)
{
  return Thread(ThreadImpl::start_func, action, ds);
}

Now, you can use a C++ member function:

class MyClass
{
  typedef ThrAction<MyClass> Action;
  Action monitor_;
  void* monitor() {  // this will be called in a new thread
    // ...
    return 0;
  }
public:
  MyClass()
    : monitor_(this, &MyClass::monitor)
  {
    // start a thread that calls monitor() member function.
    Thread thr = Thr_createThread(&monitor_, Thread::detached);
  }
};

struct Detached
{
  pthread_attr_t attr_;
  Detached() {
    pthread_attr_init(&attr_);
    pthread_attr_setdetachstate(&attr_, PTHREAD_CREATE_DETACHED);
  }
  ~Detached() { pthread_attr_destroy(&attr_); }
};

static Detached detached_attr;

// depends on the static variable detached_attr; so make it
// out-of-line (in file).
Thread::Thread(StartFunction sf, void* arg, DetachedState ds)
{
  pthread_create(&thread_, ds == detached ? &detached_attr.attr_ : 0,
                 sf, arg);
}

I wrote it in a hurry. I hope it helps.

- Saroj Mahapatra

In article <6riv2p$>, Kaz Kylheku  wrote:
++ I second that! I also use the wrapper solution. Indeed, I haven't
++ found anything in the C++ language definition which suggests that static member
++ functions are like ordinary C functions.

You are correct.  Most C++ compilers DO treat static member functions
like ordinary C functions, so it's usually possible to pass static C++
member functions as arguments to thread creation functions.  However,
some compilers treat them differently, e.g., the OpenEdition MVS C++
compiler doesn't allow static member functions to be used where
ordinary C functions are used, which is a PAIN.

BTW, if you program with ACE, it hides all of this
madness from you so you can write a single piece of source code
that'll work with most C++ compilers.

Take care,


I thought all those concerned with developing multi-threaded software
using the STL and C++ might be interested in the topic of STL and thread
safety.  I just bought the July/August 1998 issue of C++ Report and
within there is an article concerning the testing for thread safety of
various popular implementations of STL.  These are the published

 STL implementation     Thread-safe?
 ------------------     ------------
 Hewlett-Packard            NO
 Rogue Wave                 NO
 Microsoft                  NO
 Silicon Graphics           YES
 ObjectSpace                YES

========================  But:   ================

You've missed rather a lot of discussion on this topic in the intervening
months. A few supplemental facts:

1) The definition of ``thread safe'' while not unreasonable is also not
universal. It is the working definition promulgated by SGI and --surprise! --
they meet their own design requirement.

2) Hewlett-Packard provided the original STL years ago, at a time when
Topic A was finding a C++ compiler with adequate support for templates.
Thread safety was hardly a major design consideration.

3) Rogue Wave was quick to point out that the version of their code
actually tested was missing a bug fix that had been released earlier.
The corrected code passes the tests in the article.

4) Microsoft has been the unfortunate victim of some messy litigation.
(See web link in footer of this message.) If you apply the fixes from:

then VC++ also passes the tests in the article. Its performance also
improves dramatically.

The C++ Report ain't Consumer Reports. Before you buy on the basis
of an oversimplified table:

a) Make sure your definition of ``thread safety'' agrees with what the
vendor provides.

b) Make sure you get the latest version of the code.

P.J. Plauger
Dinkumware, Ltd.

> and I would like to implement the functionality of the Java interface
> Runnable. Actually I suppose both of the following questions are
> really C++ questions, but I'm more optimistic to find the competent
> audience for my problems here...
> My first question is: Why does it make a difference in passing the
> data Argument to pthread_create, whether the function is a member
> function or not?

Because a member function is not a C function. A member function pointer
is a combination of a pointer to the class, and the index of the function
within the class. To make a member function "look" like a C function,
you must make it a static member function.

> In the following code I had to decrement the data pointer address
> (&arg) by one. This is not necessary, if I define run() outside of any
> class.

This is dangerous!! On another compiler, a different platform, or
the next release of the same compiler, this may or may not work.
It is really a happy accident that it does work in this case. The
canonical form for C++ is to pass the address of a static
member function to pthread_create(), and pass the address
of the object as the argument parameter to pthread_create().
The static member function then calls the non-static member
by casting the void *arg to the class type.
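
The canonical form just described can be sketched as follows. The `Counter` class and its members are hypothetical, made up for this illustration. Note the caveat raised elsewhere in this thread: a few compilers do not treat static member functions like C functions, in which case the trampoline would have to be an extern "C" non-member function instead.

```cpp
#include <pthread.h>
#include <cassert>

class Counter {
public:
    Counter() : total_(0) {}

    // Ordinary (non-static) member function: the real thread body.
    void count_to(int n) {
        for (int i = 0; i < n; ++i) ++total_;
    }

    int total() const { return total_; }

    // Static trampoline with the signature pthread_create() expects.
    // It recovers the object from the void* argument and forwards the
    // call to the non-static member.
    static void* entry(void* arg) {
        static_cast<Counter*>(arg)->count_to(1000);
        return 0;
    }

private:
    int total_;
};

// Run Counter::count_to() in a new thread and return the result.
int run_counter_thread() {
    Counter c;
    pthread_t t;
    if (pthread_create(&t, 0, &Counter::entry, &c) != 0)
        return -1;
    pthread_join(t, 0);   // c must outlive the thread, so join first
    return c.total();
}
```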


> The second question is: I've tried to use the constructor of Thread to
> start run() of the derived class as the thread. For this I've
> implemented run() in the base class as pure virtual. But I didn't
> succeed because the thread always tried to run the pure virtual base
> function. Why is this?

Because during the constructor of the base class, the object *is* a base
object. It doesn't know that you have derived something
else from it. It is not possible to call a derived class's virtual
functions during construction or destruction of a base class.
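
The effect is easy to see without any threads. In this small sketch (hypothetical `Base` and `Derived` types, with a non-pure virtual so that the call is well defined) the base constructor records which override it reached:

```cpp
#include <string>
#include <cassert>

struct Base {
    std::string seen_in_ctor;

    // While this constructor runs, the object is still only a Base,
    // so the virtual call below resolves to Base::name().
    Base() { seen_in_ctor = name(); }
    virtual ~Base() {}
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    virtual std::string name() const { return "Derived"; }
};

// What the Base constructor saw while a Derived was being built.
std::string name_seen_during_construction() {
    Derived d;
    return d.seen_in_ctor;
}

// What an ordinary virtual call sees once construction has finished.
std::string name_seen_after_construction() {
    Derived d;
    return d.name();
}
```

A thread started from the base class constructor can run into the same situation: it may call run() before the derived part of the object exists. Starting the thread from a separate start() call made after construction avoids this.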


Here is a minimalist emulation of the Java Runnable and Thread
interface: no error checks, many routines left out, no thread groups
and so on.



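// ----------------------------------------------------------------------

#include <pthread.h>
#include <stdio.h>

class Runnable
{
public:
  virtual ~Runnable();
  virtual void run() = 0;
};

Runnable::~Runnable()
{
}

// ----------------------------------------------------------------------

class Thread
{
public:
  Thread();
  Thread(Runnable *r);
  virtual ~Thread();

  void start();
  void stop();
  void join();

protected:
  virtual void run();

private:
  static void *startThread(void *object);
  void runThread();

  Runnable *target;
  // 0=not started, 1=started, 2=finished
  int state;
  pthread_t thread;
};

// Braces, access specifiers and several statements were lost in this
// archive; lines marked "reconstructed" are a best guess.

Thread::Thread()
  : target(0),
    state(0)                    // reconstructed
{
}

Thread::Thread(Runnable *r)
  : target(r),
    state(0)                    // reconstructed
{
}

Thread::~Thread()
{
  if (state == 1)
    stop();                     // reconstructed
}

void Thread::start()
{
  pthread_create(&thread, 0, &Thread::startThread, this);
}

void Thread::stop()
{
  // body not preserved in this archive
}

void Thread::join()
{
  void *value = 0;
  pthread_join(thread, &value);
}

void Thread::run()
{
}

void *Thread::startThread(void *object)
{
  Thread *t = (Thread *) object;
  t->runThread();               // reconstructed
  return 0;
}

void Thread::runThread()
{
  state = 1;
  if (target)
    target->run();              // reconstructed
  else
    run();                      // reconstructed
  state = 2;
}

// ----------------------------------------------------------------------

class Test : public Runnable
{
public:
  void run();
};

void Test::run()
{
  printf("thread run called\n");
}

int main(int argc, char *argv[])
{
  Thread t(new Test);
  t.start();                    // reconstructed
  printf("thread started\n");
  t.join();                     // reconstructed
  return 0;
}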

> I've run into trouble: I found out that when I cancel a thread via
> pthread_cancel(), destructors for local objects do not get called.
> Surprising :). But how to deal with this? With simple thread code
> it would not be a big problem, but in my case it's fairly complex code,
> quite a few STL classes etc. Has someone dealt with such a problem and is
> willing to share his/her solution with me? I thought I could 'cancel'
> the thread via pthread_kill() and raise an exception within a signal handler
> but it's probably NOT a very good idea, is it? ;)
> Thank you,

  Unfortunately, not surprising.  C++ has not formally decided what to do with
thread cancellation, so it becomes compiler-specific.  The Sun compiler (for 
example) will run local object destructors upon pthread_exit() (hence 
cancellation also).  Others may not.

  I suppose the best GENERAL C++ solution is:

    a) Don't use stack-allocated objects.
    b) Don't use cancellation.

  Otherwise you can simply insist on a C++ compiler that runs the destructors.
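
One way to follow advice (b) while keeping stack-allocated objects is to replace cancellation with a flag that the thread polls, so that it leaves by an ordinary return and its destructors run. This is a different technique from pthread_cancel(), and the sketch below assumes a C++11 compiler for std::atomic (which postdates this discussion); all names are hypothetical.

```cpp
#include <pthread.h>
#include <sched.h>
#include <atomic>
#include <cassert>

static std::atomic<bool> stop_requested(false);
static std::atomic<int>  destructors_run(0);

// Stand-in for any local object that needs cleanup.
struct Resource {
    ~Resource() { ++destructors_run; }
};

extern "C" void* polite_worker(void*) {
    Resource r;                       // stack-allocated object
    while (!stop_requested.load())    // poll instead of being cancelled
        sched_yield();
    return 0;                         // normal return: ~Resource() runs
}

// Start the worker, ask it to stop, and report how many destructors ran.
int stop_worker_and_count_destructors() {
    stop_requested = false;
    destructors_run = 0;
    pthread_t t;
    if (pthread_create(&t, 0, polite_worker, 0) != 0)
        return -1;
    stop_requested = true;
    pthread_join(t, 0);
    return destructors_run.load();
}
```

The price is that the thread only stops at the points where it checks the flag, so a thread blocked in a system call needs some other way to be woken.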

Ian Johnston, FX Architecture, UBS, Zurich

 Q180: Cheating on mutexes  

Hi all!  Howz things goin?  Just got back from the COOTS conference
where I learned all sorts of valuable lessons ("Don't try to match
Steve Vinoski drink for drink", "Snoring during dull presentations
is not appreciated").  As to this question...


  Pretty much everybody's been largely correct, but a little excessive.

  If we define the objective to be "I have a variable which will under
go a single, atomic state change, can I test it without a mutex?"
then the answer is "yes, if you do things right."  In particular,
if you want to go from FALSE to TRUE, and you don't care if you see
the change synchronously, then you're OK.

  This is how spin locks work, this is how pthread_testcancel works
(at least on Solaris), and both Dave B & I talk about how to use 
this for pthread_once.

  With spin locks, we test the ownership bit until it becomes "free".
Then we do a trylock on it.  If somebody else gets it first, we go
back to spinning.

  With pthread_testcancel() we test the cancellation flag for our
thread w/o a lock.  If it ever becomes true, we exit.  (The setter
will set it to true under mutex protection, so that upon mutex unlock,
the value will be quickly flushed to main memory.)

  With pthread_once(), we'll insert a test BEFORE calling pthread_once,
testing a variable.  If it's true, then we know pthread_once has executed
to completion and we can skip the test.  If it's false, then we need to
run pthread_once(), which will grab the proper lock, and do the testing
under that lock, just in case someone else was changing it at that instant.

  So...  If you're very, very, careful and you don't mind missing the exact
point of initial change...  you can get away with it safely.
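
The pthread_once() case can be sketched like this. One substitution should be stated plainly: the text above tests an ordinary variable, whereas this sketch uses a C++11 std::atomic flag with acquire/release ordering, because an unsynchronized read of a plain variable is a data race under the C++11 memory model. The variable and function names are hypothetical.

```cpp
#include <pthread.h>
#include <atomic>
#include <cassert>

static pthread_once_t once_control = PTHREAD_ONCE_INIT;
static std::atomic<bool> initialized(false);  // the FALSE -> TRUE flag
static int init_calls  = 0;                   // counts runs of do_init()
static int table_value = 0;                   // the data being set up

extern "C" void do_init() {
    table_value = 123;
    ++init_calls;
    // Publish: the flag goes true only after the data is in place.
    initialized.store(true, std::memory_order_release);
}

int get_table_value() {
    // Fast path: once the flag reads true, pthread_once() has run to
    // completion and can be skipped. If it reads false we may simply
    // have missed the change; pthread_once() then does the properly
    // locked test.
    if (!initialized.load(std::memory_order_acquire))
        pthread_once(&once_control, do_init);
    return table_value;
}

int init_call_count() { return init_calls; }
```

Seeing a stale "false" is harmless, since it only costs one extra trip through pthread_once(); the flag can never go from true back to false.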


> > ...  The real
> > trouble is that if you don't use some kind of synchronisation
> > mechanism, the update may not be seen at other CPUs *at all*.
> ...
> Again donning my newbie hat with the point on top, why not?
> For example, might a pthreads implementation on a distributed-
> memory architecture not propagate global variables to the other
> CPUs at all, in the absence of something like a mutex?

 Q181: Is it possible to share a pthread mutex between two distinct processes?  

> ie: some way to attach to one like you can attach to shared memory.
> Same question for condition variables.

The answer is (as often happens) both YES and NO. Over time, the balance
will shift strongly towards YES.

The POSIX standard provides an option known commonly as "pshared", which,
if supported on your implementation, allows you to allocate a
pthread_mutex_t (or pthread_cond_t) in shared memory, and initialize it
using an attributes object with a specific attribute value, such that two
processes with access to the shared memory can use the mutex or condition
variable for synchronization.

Because this is an OPTION in the POSIX standard, not all implementations
will provide it, and you cannot safely count on it. However, the Single
UNIX Specification, Version 2 (UNIX 98) requires that this POSIX option be
supported on any validated UNIX 98 implementation.

Implementations that provide the pshared option will define the
preprocessor symbol _POSIX_THREAD_PROCESS_SHARED in the  header

For example,

     pthread_mutexattr_t mutattr;

     pthread_mutexattr_init (&mutattr);
     pthread_mutexattr_setpshared (&mutattr, PTHREAD_PROCESS_SHARED);
     pthread_mutex_init (&mutex, &mutattr);
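
For a fuller picture, here is a sketch of two processes sharing one such mutex. It assumes the implementation supports the pshared option, and it assumes anonymous shared mappings via the widely available but non-POSIX MAP_ANONYMOUS flag; the `Shared` structure and the function name are hypothetical.

```cpp
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cassert>

// Lives in shared memory: the mutex and the data it protects.
struct Shared {
    pthread_mutex_t mutex;
    int             counter;
};

// Parent and child each add `per_process` to a shared counter under a
// process-shared mutex. Returns the final count, or -1 on failure.
int shared_counter_demo(int per_process) {
    void* mem = mmap(0, sizeof(Shared), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return -1;
    Shared* s = static_cast<Shared*>(mem);
    s->counter = 0;

    pthread_mutexattr_t mutattr;
    pthread_mutexattr_init(&mutattr);
    int rc = pthread_mutexattr_setpshared(&mutattr, PTHREAD_PROCESS_SHARED);
    if (rc == 0) rc = pthread_mutex_init(&s->mutex, &mutattr);
    pthread_mutexattr_destroy(&mutattr);
    if (rc != 0) { munmap(mem, sizeof(Shared)); return -1; }

    pid_t child = fork();
    if (child < 0) { munmap(mem, sizeof(Shared)); return -1; }

    for (int i = 0; i < per_process; ++i) {   // runs in both processes
        pthread_mutex_lock(&s->mutex);
        ++s->counter;
        pthread_mutex_unlock(&s->mutex);
    }
    if (child == 0) _exit(0);                 // child is finished

    waitpid(child, 0, 0);
    int result = s->counter;
    pthread_mutex_destroy(&s->mutex);
    munmap(mem, sizeof(Shared));
    return result;
}
```

Unrelated processes would need a named shared memory object rather than a mapping inherited across fork(), plus some agreement on who initializes the mutex.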

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q182: How should one implement reader/writer locks on files?  

> How should one implement reader/writer locks on files?
> The locks should work across threads and processes.

The only way to lock "a file" is to use the fcntl() file locking
functions. Check your man page. HOWEVER, there's a big IF... these locks
are held by the PROCESS, not by the THREAD. You can't use them to
control access between multiple threads within a process.

If you are interested in a mechanism outside the file system, you could
use UNIX98 read/write locks, with the pshared option to make them useful
between processes (when placed in shared memory accessible to all the
processes). However, UNIX98 read/write locks are not currently available
on most UNIX implementations, so you'd have to wait a while. Of course
you'd have to work out a way to communicate your shared memory section
and the address of the read/write lock(s) to all of the processes
interested in synchronizing. Also, while there are ways to make
fcntl() locking mandatory instead of advisory (at least, on most
systems), there's no way to do this with external locking.
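
The fcntl() side of this looks roughly as follows. This is a sketch only: the helper names and the scratch file under /tmp are hypothetical, and the locks taken here are advisory and owned by the whole process, exactly as cautioned above.

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cstdlib>
#include <cstring>
#include <cassert>

// Apply a whole-file lock of the given type (F_RDLCK for readers,
// F_WRLCK for a writer, F_UNLCK to release), blocking until granted.
// Returns 0 on success, -1 on error.
int lock_whole_file(int fd, short type) {
    struct flock fl;
    std::memset(&fl, 0, sizeof fl);
    fl.l_type   = type;
    fl.l_whence = SEEK_SET;
    fl.l_start  = 0;
    fl.l_len    = 0;                  // 0 means "through end of file"
    return fcntl(fd, F_SETLKW, &fl);
}

// Take a read lock, convert it to a write lock, then release it.
int file_lock_demo() {
    char path[] = "/tmp/rwlock-demo-XXXXXX";   // hypothetical scratch file
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    int rc = lock_whole_file(fd, F_RDLCK);
    if (rc == 0) rc = lock_whole_file(fd, F_WRLCK);
    if (rc == 0) rc = lock_whole_file(fd, F_UNLCK);
    close(fd);
    unlink(path);
    return rc;
}
```

Within a multithreaded process these calls would have to be combined with an ordinary in-process lock, since a second thread asking for the same fcntl() lock is not blocked by the first.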

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
 Q183: Are there standard reentrant versions of standard nonreentrant functions?  

| Certain standard functions found in C programming environments,
| such as gethostbyname, are not reentrant and so are not safe
| for use by multithreaded programs.  There appear to be two
| basic approaches to providing thread-safe versions of these
| functions:
| (1) Reimplement the functions to use thread-local storage.
|     This is the approach that Microsoft has taken.  It's
|     nice because the interface is exactly the same so you
|     don't have to change existing code.

Can you cite documentation that Microsoft has done this consistently? 
(I'd love to be able to rely on it, but haven't been able to pin this
down anywhere.)
| (2) Provide alternate reentrant interfaces.  This is the
|     approach taken by (some/most/all?) Unix vendors.  The
|     reentrant version of the function has the same name
|     as the non-reentrant version plus the suffix _r.  For
|     example, the reentrant version of gethostbyname is
|     gethostbyname_r.
| The big problem I'm having with approach (2) is that the
| reentrant versions are not the same across different Unixes.
| For example, the AIX 4.2 and Solaris 2.5 gethostbyname_r
| interfaces are much different (the Solaris interface is
| horrendous, I must say).

FYI, having dealt with this on a couple of Unix systems:  There's the
way Solaris does it, and the way everyone else does it.  While
"everybody else" may not be 100% consistent, the differences are pretty
minor, and often can be worked around with an appropriate typedef.

To be fair to Sun, Solaris probably got there first, and others chose to
do things differently; but that's the end result.  BTW, if you read the
man pages for things like gethostbyname_r, you'll find a notation that
says that the function is "provisional" or something like that, and may
go away in future releases.  There's no change through Solaris 2.6, and
some indication in the release notes that later Solaris versions will
play some games to support both the traditional Solaris API, and the
newer standards - wherever they are drawn from.

|                          While this is par for the Unix
| course, I'm somewhat surprised that these interfaces are
| not specified by POSIX.  Or are they?  Is there some
| attempt underway to standardize?  Is there some set of
| _r functions that are specified by POSIX, and if so,
| where can I find this list?

*Some* of them were standardized.  Dave Butenhof's "Programming with
POSIX Threads" lists the following:  getlogin_r readdir_r strtok_r
asctime_r ctime_r gmtime_r localtime_r getgrgid_r getgrnam_r getpwuid_r
getpwnam_r.  Also, a few functions (ctermid is an example) were declared
thread-safe if certain restrictions were followed.
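
Two of the standardized functions from that list, shown in use. The pattern is the same in both: the state or result that the classic function kept in a hidden static buffer is moved into storage supplied by the caller. The wrapper names here are hypothetical.

```cpp
#include <string.h>
#include <time.h>
#include <string>
#include <vector>
#include <cassert>

// Split a string on spaces with strtok_r(). The scan position lives in
// the caller-owned `save` pointer rather than in hidden static state,
// so several threads can tokenize at the same time.
std::vector<std::string> split_words(const std::string& text) {
    std::vector<char> buf(text.begin(), text.end());
    buf.push_back('\0');              // strtok_r() modifies its input
    std::vector<std::string> words;
    char* save = 0;
    for (char* tok = strtok_r(&buf[0], " ", &save); tok != 0;
         tok = strtok_r(0, " ", &save))
        words.push_back(tok);
    return words;
}

// Year (UTC) of a timestamp via gmtime_r(), which fills a caller-owned
// struct tm instead of returning a pointer to a shared one.
int utc_year(time_t when) {
    struct tm result;
    if (gmtime_r(&when, &result) == 0) return -1;
    return result.tm_year + 1900;
}
```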

None of the socket-related calls are on this list.  The problem, I
suspect, is that they were not in any base standard:  They're part of
the original BSD socket definition, and that hasn't made it into any
official standard until very recently.  As I recall, the latest Unix
specifications, like the Single Unix Specification (there really aren't
all that *many* of them, but the names change so fast I, for one, can't
keep up), do standardize both the old BSD socket interface, and the "_r"
variants (pretty much as you see them in AIX).

BTW, standardization isn't always much help:  localtime_r may be in the
Posix standard, but Microsoft doesn't provide it.  (Then again,
Microsoft doesn't claim to provide support for the Posix threads API, so
why would you expect it to provide localtime_r....)  You still have to
come up with system-dependent code.
                            -- Jerry
 Q184: Detecting the number of cpus  

wrote:
> I hate responding to my own posts but I forgot that NT also defines an
> environment variable, NUMBER_OF_PROCESSORS.  Win95/98 may do so as well.
>                 Bradley J. Marker
> In article <6gaa37$>, writes:
> >Win95 only used a single processor the last I looked and there were no plans
> >for SMP for Win98 that I've heard.  I'd personally love SMP on 98 if it
> >didn't
> >cost performance too much.
> >
> >On NT you can get the processor affinity mask and count the number of bits
> >that are on.  Anybody have a better method?
> >
> >sysconf works on IRIX as well as Solaris.  sysconf(_SC_NPROC_CONF) or
> >sysconf(_SC_NPROC_ONLN) (on Solaris it is NPROCESSORS instead of NPROC).
> >You
> >probably want on-line.  By the way, on an IRIX 6.4 Origin 2000 I am getting
> >sysconf(_SC_THREAD_THREADS_MAX) equal to 64.  Just 64 threads max?  I have
> >multithreaded test programs running with more threads than that (or they
> >seem
> >to be working, anyway).
> >
> >Anybody know how to control the number of processors the threads run on for
> >IRIX?  I'd like both the non-specific run on N processors case and the
> >specifically binding to the Nth processor case.  With Solaris I've been
> >using
> >thr_setconcurrency and processor_bind.
> >
> >Sorry but I don't know for Digital Unix, IBM AIX, or HP-UX.
> >
> >               Bradley J. Marker

In Win32, GetSystemInfo fills in a struct that contains a count of
the number of processors, among other things.
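
On the Unix side, the sysconf() approach looks like this. The sketch uses the Solaris-style name _SC_NPROCESSORS_ONLN, which is not required by POSIX but is also provided by several other systems; where it is missing the hypothetical helper below simply reports one processor.

```cpp
#include <unistd.h>
#include <cassert>

// Number of processors currently on-line, or 1 if it cannot be
// determined on this system.
long online_cpus() {
#ifdef _SC_NPROCESSORS_ONLN
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
#else
    return 1;
#endif
}
```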

 Q185: Drawing to the Screen in more than one Thread (Win32)  

Note: Followup-to: set to

[long post removed, see the thread :-) ]

Maybe I'm wrong(*), but AFAIR:

    You can only draw in a window from the thread which owns the
window (the one which created the window). It is this thread which
receives all the messages targeted to the window in its thread
message queue (each thread receives messages for the window it
    From what I remember, it works with TextOut() because the second
thread sends a message to the first one (which owns the window), which
then does the job.
    So if you use a locking mechanism between the two threads for
accessing the window, you may run into a deadlock (thread 2 waiting for
thread 1 to paint, and thread 1 waiting for thread 2 to release the

    Maybe by defining a user message with adequate parameters, and
posting it (the second thread then becomes immediately ready to continue
number crunching) into the thread 1 message queue, you can achieve a
good update with minimal locking of thread 2.



 Q186: Digital UNIX 4.0 POSIX contention scope  

> I recently found myself at the following website, which describes the
> use of pthreads under Digital Unix 4.0.  It is dated March 1996, so
> I am wondering how up to date it is.
> ications/base_doc/DOCUMENTATION/HTML/AA-Q2DPC-TKT1_html/thrd.html
> It refers to several unimplemented optional functions from Posix
> 1003.1c 1995, including pthread_setscope.  So I am wondering, then,
> what sort of "scope" do dec pthreads have, are they all system level,
> or all process level, etc.

Digital UNIX 4.0 (through 4.0C) did not support POSIX contention scope.
It was just one of those things that "missed the cut". All POSIX threads
are process contention scope (PCS). Digital UNIX 4.0D supports the scope
attribute. (Since 4.0D has been shipping for some time, it appears that
the web link you've found is not up to date.)

On 4.0D, threads are PCS by default, (as they should be), but you can
create SCS (system contention scope) threads for the rare situations
where they're necessary. (For example, to share realtime resource
directly with hardware, or with OS threads, etc.)
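
In POSIX 1003.1c the scope is set on a thread attributes object with pthread_attr_setscope(). The sketch below (hypothetical helper name) asks for a scope and reports what the attributes object ends up holding, since an implementation is allowed to refuse a scope it does not support.

```cpp
#include <pthread.h>
#include <cassert>

// Ask for `scope` (PTHREAD_SCOPE_SYSTEM or PTHREAD_SCOPE_PROCESS) and
// return the scope actually recorded in the attributes object, or -1
// if the attributes object could not be used at all.
int request_scope(int scope) {
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0) return -1;
    // May fail (for example with ENOTSUP); in that case the attribute
    // keeps the implementation's default, which is read back below.
    pthread_attr_setscope(&attr, scope);
    int actual = -1;
    if (pthread_attr_getscope(&attr, &actual) != 0) actual = -1;
    pthread_attr_destroy(&attr);
    return actual;
}
```

A real program would keep the attributes object and pass it to pthread_create() instead of destroying it straight away.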

 Q187: Dec pthreads under Windows 95/NT?  
> Also, appendix C refers to dec pthreads under Windows 95/NT.  Is that
> a reality?

Depends on what you mean by "reality". Yes, we have DECthreads running
on Win32 "in the lab", and have for some time. In theory, given
sufficient demand and certain management decisions regarding pricing and
distribution mechanism, we could ship it. Those decisions haven't yet
been made. (If you have input on any of this, send me mail; I'd be glad
to forward it to the appropriate person. If you can say whether you're
"just curious" or "want to buy" [and particularly if you can say how
much you'd pay], that information would be useful.)

 Q188: DEC current patch requirements  

> It also doesn't describe the current patch requirements, etc., for
> 4.0B.

The Guide to DECthreads is a reference manual, not "release notes", and
is not updated routinely for patch releases. The version you're reading
is clearly for 4.0 through 4.0C, and there's a new version for 4.0D. We
still haven't managed to find time to push through the sticky fibres of
the bureaucracy to get a thread project web page, on which we could post
up-to-date information like current patches and problems.

In general, you should just keep up with the latest patch kit. You can
always keep an eye on the patch FTP directory for your release, under

 Q189: Is there a full online version of 1003.1c on the web somewhere?  
> Is there a full online version of 1003.1c on the web somewhere?

No. The IEEE derives revenue from sale of its standards, and does not
give them away. I understand this policy is "under review". It doesn't
really matter, though, unless you intend to IMPLEMENT the standard.
1003.1c is not a reference manual, and if you want to learn how to use
threads, check out a book that's actually written to be read; for
example, my "Programming with POSIX Threads" (Addison-Wesley) or Bil
Lewis' "Multithreaded Programming with Pthreads" (Prentice Hall) [which,
I see, is so popular that someone has apparently stolen my copy from my
office: well, after all, the spine IS somewhat more colorful than my
book ;-) ].

On the other hand, what IS freely available on the web is the Single
UNIX Specification, Version 2, including "CAE Specification: System
Interfaces and Headers, Issue 5", which is the new UNIX98 brand
specification that includes POSIX threads (plus some extensions). This
document includes much of the text of POSIX 1003.1c, though in slightly
altered form. Check it out at

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q190: Why is there no InterlockedGet?  

> ) My question is renewed, however.  Why is there no InterlockedGet and
> ) InterlockedSet.  It seems under the present analysis, these would be
> ) quite useful and necessary.  Their absence was leading me to speculate
> ) that Intel/Alpha/MS ensure that any cache incoherence/lag is not
> ) possible.
> InterlockedExchange and InterlockedCompareExchange give you
> combined Get and Set operations, which are more useful.

Unfortunately, InterlockedCompareExchange is not available
under Windows '95, only NT.  But yes, I agree with you....
 Q191: Memory barrier for Solaris  

>I was wondering if anyone knew how to use memory barriers 
>in the Solaris environment. I believe that Dave B.
>posted one for the DEC Alphas.

I assume you use Solaris on a Sparc.

First you should decide whether you are programming for Sparc V8 or Sparc 
V9. Buy the appropriate Architecture manual(s) from Sparc International 
(see: )

I have not seen the Sparc Architecture manuals on-line. If someone has, I 
would be grateful for a pointer...

The Sparc chips can be set in different modes regarding the memory model 
(RMO, PSO, TSO). You need to understand the concepts by reading the 
Architecture manual (chapters 6 and J in V8, chapters 8, D and J in V9). 
It is also helpful to know which ordering model Solaris uses for your 

In V8, the "barrier instruction" you are looking for is "stbar". You can 
use it by specifying
        asm(" stbar");
in your C code.

In V9, the architecture manual says:

"The STBAR instruction is deprecated; it is provided only for compatibility 
with previous versions of the architecture. It should not be used in new 
SPARC-V9 software. It is recommended that the MEMBAR instruction be used in 
its place."

The deprecated stbar instruction is equivalent to MEMBAR #StoreStore.

In V9, "memory barriers" are done with the membar instructions. As far as I 
can see, there are 12 different types of the instructions, depending on the 
type of memory barrier you want to have (check the architecture manual).
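
A minimal sketch of how such a barrier might be wrapped for use from C,
assuming a compiler that accepts GCC-style inline assembly. The macro name
STORE_STORE_BARRIER, the publish() helper, and the preprocessor tests
(__sparc_v9__, __sparcv9, __sparc__, __sparc) are assumptions made for
illustration and are not from the post above; check what your compiler
actually predefines. On non-Sparc machines the sketch falls back to a C11
release fence (stronger than a pure store-store barrier) so that it still
compiles.

```c
#include <assert.h>
#include <stdatomic.h>

/* Pick a store-store barrier for the target. The predefined macro names
 * tested here are assumptions; verify them for your compiler. */
#if defined(__sparc_v9__) || defined(__sparcv9)
#  define STORE_STORE_BARRIER() \
       __asm__ __volatile__("membar #StoreStore" ::: "memory")
#elif defined(__sparc__) || defined(__sparc)
#  define STORE_STORE_BARRIER() \
       __asm__ __volatile__("stbar" ::: "memory")
#else
   /* Not Sparc: a C11 release fence orders the earlier store as well. */
#  define STORE_STORE_BARRIER() atomic_thread_fence(memory_order_release)
#endif

int payload;        /* hypothetical data being published        */
atomic_int ready;   /* hypothetical flag that announces payload */

/* Writer side only: store the data, then the barrier, then the flag, so
 * the flag cannot become visible before the data it announces. */
void publish(int value)
{
    payload = value;
    STORE_STORE_BARRIER();
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}
```

This covers the writing side only. A reader that tests the flag and then
reads the payload needs its own ordering between those two loads; which
membar variant provides that is something to confirm in the architecture
manual.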

 Q192: pthread_cond_t vs pthread_mutex_t  

Jason Mancini wrote:

> I wrote a small program that loops many times,
> locking and unlocking 3 mutexes.  The results are
> 4.2 million mutex lock-unlocks per second.  Doing the same
> for two threads that wait and signal each other results
> in 26,000 wait-signals per second using condition
> variables.

Of course, this information is largely useless without knowing what
hardware and software you're using. But never mind that -- it probably
doesn't matter right now that the numbers are meaningless.

> Any explanations as to why conds are so much slower
> than mutexes?  There are no collisions in any of the
> mutex acquisitions.  Also it seems like the mutex rate
> should be higher than it is.

So, you're trying to compare the performance of UNCONTENDED (that is,
non-blocking) mutex lock/unlock versus condition variable waits. Note
that waiting on a condition variable requires a mutex lock and an
unlock, PLUS the wait on a condition variable. Waking a thread that's
waiting on a condition variable also requires locking and unlocking the
same mutex (in order to reliably set the predicate that must be tested
for a proper condition wait). (If you're not locking in the signalling
thread, then you're doing it wrong and your measurements have no
relevance to a real program.)
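
The protocol described above can be sketched as follows. This is only an
illustration: the gate_t type and the gate_wait()/gate_signal() names are
hypothetical, and error checking is omitted for brevity. Note how each
side does a full mutex lock and unlock in addition to the condition
variable operation itself.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state: one predicate guarded by one mutex. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             ready;   /* the predicate */
} gate_t;

#define GATE_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Waiter: lock, test the predicate in a loop, block if needed, unlock. */
void gate_wait(gate_t *g)
{
    pthread_mutex_lock(&g->lock);
    while (!g->ready)                        /* loop: wakeups may be spurious */
        pthread_cond_wait(&g->cond, &g->lock);
    g->ready = 0;                            /* consume the event */
    pthread_mutex_unlock(&g->lock);
}

/* Signaller: lock, set the predicate, signal, unlock. */
void gate_signal(gate_t *g)
{
    pthread_mutex_lock(&g->lock);
    g->ready = 1;
    pthread_cond_signal(&g->cond);
    pthread_mutex_unlock(&g->lock);
}
```

A benchmark that ping-pongs two threads through a pair of such gates pays
for two lock/unlock pairs plus a block and a wakeup per round trip, which
is why it cannot be compared with an uncontended lock/unlock loop.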

Why, exactly, would you expect the performance of the condition variable
protocol to be equivalent to the mutex protocol that consists of a small
part of the condition variable protocol -- and, most importantly, that
excludes the actual BLOCKING part of the condition variable protocol?

As for the mutex rate -- 4.2 million per second means that each
lock/unlock pair takes less than 1/4 of a microsecond. Given the
inherent memory system costs of instructions intended to allow
synchronization on a multiprocessor, you'd need to be running on a
REALLY fast machine for that number to be "bad".

> Is there anything faster available for putting many
> threads to sleep and waking them up many times a
> second?

If your 26,000 per second rate isn't good enough, then the answer is
"probably not". Still, by my count, that's way up in the range of "many
times a second". What exactly are you attempting to accomplish by all
this blocking and unblocking, anyway? If you're doing it as a
consequence of some real work, then what's important is the performance
of the WORK, not the cost of individual operations involved in the work.
(You should really be trying to AVOID blocking, not worrying about
blocking faster, because blocking will always be slower than not
blocking.)

Synchronization is not the goal of multithreaded programming. It's a
necessary evil, that's required to make concurrent programming work.
Synchronization is pure overhead, to be carefully minimized. Every
program will run faster without synchronization... unfortunately, most
concurrent programs won't run CORRECTLY without it.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q193: Using DCE threads and java threads together on hpux(10.20)  
>I was wondering if anyone here has had any experience with
>using dce threads and java threads together on hpux(10.20)

I'm presuming that you're using a JavaSoft jvm.


The DCE threads on hpux 10.20 are a user-space threads package.
The JVM uses a (different!) user-space threads package.

Ne'er the two shall meet.

If you have access to an hpux 11.x box, which has kernel threads,
there is a better chance of it working (not great, but better).

It's been quite a while since I looked inside the JavaSoft JVM,
but I seem to recall that the thread API isn't too ugly; you should
probably use those calls to do your C++ threads, but be warned
that you're in some rocky, unexplored territory.  I've even heard
that there be dragons there...

 Q194: My program returns enomem on about the 2nd create.  
>   We just upgraded our Alpha from a 250MHz something to a 500+MHz dual
> processor running Digital Unix V4. My program which previously had no
> problem creating hundreds of threads returns enomem on about the 2nd to
> 4th thread create. DEC support advised increasing maxusers from 128 to
> 512 but to no avail. We've got 2Gig of memory and some other sys

The real question is, how much memory does the application use before that
final thread is created? The VM subsystem has an "optimization" for
tracking protected pages that simply doesn't work well with threads. The
thread library always creates a protected page for each stack, to trap
overflows. (You can run without this, by setting the guardsize attribute
to 0... but you shouldn't do that unless you're willing to bet money that
your thread won't ever, under any circumstances, overflow the stack;
without the guard page, the results will be catastrophic, unpredictable,
and nearly impossible to debug.)
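
For reference, a minimal sketch of how the guardsize attribute is read and
set through the standard attribute calls. The helper names
(default_guard_size, create_with_guard, noop) are hypothetical. Passing 0
as guard_bytes is what disables the guard page, with the risks described
above; the usage shown below simply passes the default back in.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Trivial start routine used only to have something to run. */
void *noop(void *arg)
{
    return arg;
}

/* Report the guard size a default-initialized attribute object carries. */
size_t default_guard_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) == 0) {
        pthread_attr_getguardsize(&attr, &size);
        pthread_attr_destroy(&attr);
    }
    return size;
}

/* Create a thread with an explicit guard size. Returns 0 on success or
 * the error number from the failing call (ENOMEM, for example). */
int create_with_guard(pthread_t *tid, size_t guard_bytes,
                      void *(*start)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setguardsize(&attr, guard_bytes);
    if (rc == 0)
        rc = pthread_create(tid, &attr, start, arg);
    pthread_attr_destroy(&attr);
    return rc;
}
```

Checking the return value of create_with_guard() against ENOMEM is how the
failure discussed in this answer would show up in a program.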

The problem is that the VM subsystem has a table for dealing with adjacent
pages of differing protection, and it's based on the entire memory size of
the process. If the vm-vpagemax parameter is set to 2048, and you have 2048
pages allocated in the process, and you try to protect one of them, the
attempt will fail. If the protection was occurring as part of a stack
creation, pthread_create will return ENOMEM.

While most threaded programs will see this only when they create a lot of
threads (so that the aggregate stack allocation brings the process up over
the vm-vpagemax limit), any program that allocates lots of memory before
thread creation can hit the same limit -- whether the allocation is mmap,
malloc, or just a really big program text or data segment.

So check your vm-vpagemax and make sure it's big enough.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q195:  Does pthread_create set the thread ID before the new thread executes?  
Wan-Teh Chang wrote:

> The first argument for pthread_create() is the address of a pthread_t
> variable in which pthread_create() will write the new thread's ID before
> it returns.
> But it's not clear whether the new thread's ID is written into the
> pthread_t variable before the new thread begins to run.

The POSIX standard does not promise that the ID will be set before the
thread is scheduled. The actual text is "Upon successful completion,
pthread_create shall store the ID of the created thread [...]", which says
only that the ID is stored by the time the call returns. You always need to
remember, in any case, that threads operate asynchronously, and one great
way to hammer that message home is to prevent anyone from counting on
basics like having the creator's copy of the ID when the thread starts.

(Yeah, that sounds mean, and I guess it is. But way back in the early days
of threading, when nobody knew much about using threads, and people yelled
"I don't want to use synchronization, so you need to give me non-preemptive
thread scheduling!", we faced a really big problem in education. I
DELIBERATELY, and, if you like, maliciously, designed the CMA [and
therefore the DCE thread] create routine to store the new thread's id AFTER
the thread was scheduled for execution, specifically so that the thread
will, at least sometimes, find it unset, dragging the reluctant programmer,
kicking and screaming, into the world of asynchronous programming. This was
a purely user-mode scheduler, with a coarse granularity timeslicer, and it
was far too easy to write lazy code that wouldn't work on future systems
with multiple kernel threads and SMP. I couldn't prevent people from
getting away with bad habits that would kill their code later -- but I
could at least make it inconvenient! When I converted to a native POSIX
thread implementation for Digital UNIX 4.0, having battled the education
problem for over half a decade and feeling some reasonable degree of
success, I opted for convenience over forced education -- I set the ID
before scheduling the new thread, and made sure it was documented that

> I checked the pthread_create() man pages on all major commercial Unix
> implementations, and only the pthread_create(3) man page on Digital Unix
> (V4.0D) addresses this issue (and gives an affirmative answer):
>     DECthreads assigns each new thread a thread identifier, which DECthreads
>     writes into the address specified as the pthread_create(3) routine's thread
>     argument.  DECthreads writes the new thread's thread identifier before the
>     new thread executes.
> AIX 4.3, HP-UX 11.00, IRIX 6.3, and SunOS 5.6 do not specify the timing
> of the writing of new thread's ID relative to the new thread's execution.
> Is this something not specified in the POSIX thread standard?  I don't
> have a copy of the IEEE POSIX thread standard document, so all I can do
> is to read the man pages.  For my application, I need DECthreads'
> semantics that the new thread ID is written before the new thread
> executes.  I guess on other platforms, I will need to use a mutex to
> block the new thread until the pthread_create() call has returned.

If you really need to code a thread that uses its own ID immediately, you
have a few choices. One, yeah, it can lock a mutex. Just hold a mutex (or,
better, use a condition variable and some predicate) around the
pthread_create call, and treat the thread ID as a shared resource. (Which
it is, although, since it's write-once, and thread create already
guarantees a consistent view of the address space to the created thread,
there's no need for additional synchronization if it's just written before
the new thread is scheduled.) Two, forget about the shared state and just
have the thread call pthread_self(), which will return the exact same ID
that the creator has stored (or will store).
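
A minimal sketch of the second choice, with hypothetical names
(report_self, ids_match). The new thread never looks at the creator's
pthread_t variable; it asks pthread_self() instead, and the creator
confirms afterwards that both IDs refer to the same thread.

```c
#include <assert.h>
#include <pthread.h>

/* Start routine: obtain our own ID without touching the creator's copy. */
void *report_self(void *arg)
{
    pthread_t *out = arg;

    *out = pthread_self();
    return NULL;
}

/* Returns 1 if the ID the thread saw equals the ID pthread_create stored,
 * 0 otherwise (including when the thread could not be created). */
int ids_match(void)
{
    pthread_t created, seen_by_thread;

    if (pthread_create(&created, NULL, report_self, &seen_by_thread) != 0)
        return 0;
    pthread_join(created, NULL);   /* after join, the thread's write is visible */
    return pthread_equal(created, seen_by_thread) != 0;
}
```

Thread IDs are compared with pthread_equal() rather than ==, since
pthread_t is an opaque type.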

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
 Q196: thr_suspend and thr_continue in pthread  
Niko D. Barli wrote:

> Is there any way to implement, or to emulate
> Solaris thr_suspend() and thr_continue() in
> pthread ?

Yes, there is. But it's ugly and the result is an inefficient and
complicated wart that reproduces almost all of the severe problems
inherent in asynchronous suspend and resume. If you check the Deja News
archive for this newsgroup, you can probably dig up (much) earlier posts
where I actually told someone where to find suspend and resume code.

> This is the case why I need to use thr_suspend
> and thr_continue.

Think again! You don't need them, and you'll be better off if you don't
use them.

> I have 3 servers running on 3 hosts.
> Each server has 2 threads, each listening to 1 of the other 2 servers.
> Socket information is held in global area, so that every thread can
> access it.
> For example, in Server 1 :
>  - thread 1 -> listening to socket a (connection to Server 2)
>  - thread 2 -> listening to socket b (connection to Server 3)
> In each thread, I use select to multiplex between socket
> communication and standard input.
>   ..........
> For example, I ask server 1 to read data from server 2
> (by entering a command on stdin). If my input from stdin is
> handled by thread 1, there will be no problem.
> But if thread 2 handles it, thread 2 will send a request for
> data to server 2 and wait. Server 2 will send back data,
> but BOTH thread 1 and thread 2 are now listening for that data.
> So what I want to do is to suspend thread 1, and let thread 2
> get the data.

You do NOT want to use suspend and resume for this! What you're talking
about is SYNCHRONIZATION between two threads sharing the same resource.
Suspend and resume are NOT synchronization functions, and they won't do
what you want. For example, if you simply depend on asynchronously
suspending one thread "before" it reads from stdin, what if you're late?
(Threads are asynchronous, and, without explicit synchronization, you
cannot know what one is doing at any given time.) What if the thread
you've suspended has already tried to read, and currently has a stdio
mutex locked? Your other thread will simply block when it tries to read,
until the first thread is eventually resumed to complete its read and
unlock the mutex.

Suspend and resume are extremely dangerous and low-level scheduling
functions. You need to know a lot about everything a thread might possibly
be doing before you can safely suspend it -- otherwise you risk damaging
the overall application. (Very likely causing a hang.) If you don't know
every resource a thread might own when you suspend it, or you don't own
every resource YOU might need to do whatever it is you'll do while the
other thread is suspended, then you cannot use suspend and resume. Even if
you do know, and control, all threads, there is always a better and less
dangerous solution than suspend and resume. (Suspend and resume are often
used because they seem convenient, and expedient; and there are even rare
cases where they can be used successfully. But, far more often than not,
you'll simply let your customer find out how badly you've broken your
application instead of finding it yourself.)

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
 Q197: Are there any opinions on the Netscape Portable Runtime?  

I am working on the Netscape Portable Runtime (NSPR), so
my opinions are obviously biased.   I'd like to provide some info
to help you evaluate this product.

First, NSPR is more than a thread library.  It also includes
functions that are tied to the thread subsystem, most notably
I/O functions.  I/O functions can block the caller, so they must
know which thread library they are dealing with.

The thread API in NSPR is very similar to pthreads.  The
synchronization objects are locks and condition variables.
The NSPR thread API does not have suspend, resume,
and terminate thread functions.  It also does not have the
equivalent of pthread_exit().  NSPR has thread interrupt,
but not thread cancel.  The absence of these functions
in the API is by design.  Also, condition variables are associated
with locks when they are created, and condition notification
must be done while holding the lock, i.e., no "naked" notifies.

The implementation of  NSPR is often merely a layer
on top of the native thread library.  Where there are no native
threads available, we implement our own user-level threads.
NSPR does do a few "value-added" things:
1.  The condition variable notifies are moved outside of the
     critical section where possible.  You must write code
     like this:
    The actual pthread calls made by NSPR are:
    (We use reference count on the condition variables to deal with
    their destruction.)
2. In some two-level thread implementations, a blocking I/O call
    incurs the creation of a kernel schedulable entity (i.e., LWP).  To
    limit the number of LWPs created this way, the NSPR I/O functions
    block all the callers on condition variables, except for one thread.

    A lucky thread is chosen to block in a poll() call on all the
    file descriptors on behalf of the other threads.
3. On NT, NSPR implements a two-level thread scheduler using
    NT fibers and native threads and uses NT's asynchronous I/O,
    while still presenting a blocking I/O API.  This allows you to
    use lots of threads and program in the simpler blocking I/O model.
4. Where it is just too expensive to use the one-thread-per-client
    model but you don't want to give up the simplicity of the blocking
    I/O model, there is work in progress to implement a "multiwait
    receive" API.

These are just some random thoughts that came to mind.  Hope it helps.

 Q198: Multithreaded Perl  
Hello All,

I have just finished updating my Win32 IProc Perl module to version
0.15; now I can really say that it's a complete module.

Here are all the methods that I have implemented:

    new() (Constructor)
    GetAffinityMask() (WinNT only)
    GetWorkingSet() (WinNT only)
    GetStatus() (WinNT only)
    GetPriorityBoost() (WinNT only)
    GetThreadPriorityBoost() (WinNT only)
    SetAffinityMask() (WinNT only)
    SetIdealProcessor() (WinNT only)
    SetPriorityBoost() (WinNT only)
    SetWorkingSet() (WinNT only)
    SetThreadPriorityBoost() (WinNT only)
    SetThrAffinityMask() (WinNT only)
    SwitchToThread() (WinNT only)

With all those 35 methods you will be in a complete control of your 
Threads and Processes.

Add to this my Win32 MemMap that comes with:

 o SysV like functions (shmget,shmread,shmwrite ...)
 o Memory mapped file functions

Plus my Win32 ISync module that comes with a complete set of
synchronisation mechanisms like:

 o Mutex
 o Semaphores
 o Events
 o Timers 

and the sky will be the limit.

I have included a lot of examples with my modules, and I have also updated
my Perl documentation. You will find all the docs at:

and all my modules at:

Thank you for your time, and have a nice weekend.

Amine Moulay Ramdane.

"Long life to Perl and Larry Wall!"
 Q199: What if a process terminates before mutex_destroy()?  
> File locks are released if a process terminates (as the files are closed),


> while SYSV-IPC semaphores are persistent across processes,

Unless you specify the SEM_UNDO flag.

> What about (POSIX) mutex's?

There is no cleanup performed on them when a process terminates.
This could affect a mutex (or condition variable) with the process-
shared attribute that is shared between processes.

    Rich Stevens

One more point: the "persistence" of an IPC object is different from
what you are asking about, which is whether an IPC object is "cleaned
up" when a process terminates.  For example, using System V semaphores,
they always have kernel persistence (they remain in existence until
explicitly deleted, or until the kernel is rebooted) but they may or
may not be cleaned up automatically upon process termination, depending
on whether the process sets the SEM_UNDO flag.

Realize that automatic cleanup is normally performed by the kernel (as
in the System V semaphore case and for fcntl() record locks) but the
Posix mutual exclusion primitives (mutexes, condition variables, and
semaphores) can be (are normally?) implemented as user libraries, which
makes automatic cleanup much harder.

And, as others have pointed out here, automatic cleanup of a locked
synchronization primitive may not be desirable: if the primitive is
locked while a linked list is being updated, and the updating process
crashes, releasing the locked primitive does not help because the
linked list could be in some intermediate state.  But there are other
scenarios (such as an fcntl() record lock being used by a daemon to
make certain only one copy of the daemon is started) where the automatic
cleanup is desired.

> What about (POSIX) mutex's?  I don't see this documented anywhere.

It's hidden in the Posix specs--sometimes what is important is not
what the Posix spec says, but what it doesn't say.  "UNIX Network
Programming, 2nd Edition, Volume 2: Interprocess Communications"
(available in ~2 weeks) talks about all this.

    Rich Stevens
 Q200: If a thread performs an illegal instruction and gets killed by the system...  
> % threads should remain open for the life of the application.  However
> % they could perform an illegal instruction and get killed by the system.
> % I would like for the thread creator to post an error that a thread has
> % died, AND then restart the killed thread.
> You don't have to worry about this particular case, since the system will
> kill the entire process for you if this happens. Threads aren't processes.

I've answered many questions, here and in mail, from people who expect that
illegal instructions or segmentation faults will terminate the threads. And
even from people who realize that it will terminate the process, but think
they WANT it to terminate only the thread.

That would be really, really, bad. A quick message to anyone who thinks they
want the process to "recover" from a segv/etc. in some thread: DON'T TRY IT.
At best, you'll just blow up your program later on. At worst, you'll corrupt
permanent external data (such as a database file), and won't detect the error
until much, much later.

Remember that a thread is just an "execution engine". Its only private data
is in the hardware registers of the processor currently executing the thread.
Everything else is a property of the ADDRESS SPACE, not of the thread. A
SIGSEGV means the thread has read incorrect data from the address space. A
SIGILL means the thread has read an illegal instruction from the address
space. Either a pointer (the PC in the case of SIGILL) or DATA in the address
space has been corrupted somehow. This corruption may have occurred within
the execution context of ANY thread that has access to the address space, at
any time during the execution of the program. It does NOT mean that there's
anything wrong with the currently executing thread -- most often, it's an
"innocent victim". The fault lies in the program's address space, and
potentially affects all threads capable of executing in that address space.

There's only one solution: shut down the address space, and all threads
within it, as soon as possible. That's why the default action is to save the
address space and context to a core file and shut down. This is what you want
to happen, and you shouldn't be satisfied with anything less. You can then
analyze the core file to determine what went wrong, and try to fix it.
Meanwhile, you've minimized the damage to any external invariants (files,
etc.)... and, at the very least, you know something went wrong.

In theory, an embedded system might handle a SIGSEGV, determine exactly what
happened, kill any "wayward" thread responsible for the corruption, repair
all data, and continue. Don't even IMAGINE that you can do this on anything
but a truly embedded system. You may be able to detect corruption in your
threads, and in the data under control of your code -- but you link against
libpthread, libc, and probably other libraries. They may have their own
threads, and certainly have LOTS of their own data. You cannot analyze or
reconstruct their data. The process is gone. Forget it and move on with life.

If you need to write a "failsafe" application, fork it from a monitor
process. Do NOT share any memory between them! The parent simply forks a
child, which exec*s the real application. The parent then wait*s for the
child, and if it terminates abnormally, forks a replacement. Either the
parent (before creating the replacement) or the replacement (on startup)
should analyze and repair any files that might have been damaged. And then
you're off and running. Safely.
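
A minimal sketch of such a monitor, under stated assumptions: the function
name supervise() and its restart limit are hypothetical, "terminates
abnormally" is taken to mean "killed by a signal", and the file repair
step is left as a comment because it is application-specific.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] as a child process and restart it whenever it is killed by
 * a signal, at most max_restarts times. No memory is shared with the
 * child. Returns how many times the child was started, or -1 if fork or
 * waitpid failed. */
int supervise(char *const argv[], int max_restarts)
{
    int starts = 0;

    for (;;) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            execvp(argv[0], argv);   /* child becomes the real application */
            _exit(127);              /* reached only if exec failed */
        }
        starts++;

        int status;
        pid_t w;
        do {
            w = waitpid(pid, &status, 0);
        } while (w < 0 && errno == EINTR);
        if (w < 0)
            return -1;

        if (!WIFSIGNALED(status))
            return starts;           /* normal exit: nothing to restart */
        if (starts > max_restarts)
            return starts;           /* give up after too many crashes */

        /* Analyze and repair any damaged files here, before the
         * replacement is forked on the next iteration. */
    }
}
```

For example, supervise() called with { "true", NULL } starts the child
once and returns 1, because the child exits normally.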

> % I was going to use the posix call "pthread_join" to wait for thread
> % exits. However using  "pthread_join" does not give the thread id of the
> % thread that has died. Is there a way to do this
> % using another thread command.
> Well, you say
>   pthread_join(tid, &status);
> and if it returns with a 0 rc, the thread that died was the one with
> id _tid_. Your real problem here is that pthread_join won't return
> until the thread formerly known as tid has gone away, so you can't really
> use it to wait for whatever thread goes away first.

My guess is that John is writing code for Solaris (he wrote the article on
Solaris 2.6), and was planning to use the unfortunate Solaris join-any

John, don't do that! It's a really, really, bad idea. Unlike UNIX processes,
there's no parental "line of descent" in threads. It's fine to have a "wait
any" that waits for any CHILD of the calling process, and therefore it seems
obvious to extend this concept to threads. But a thread has no children.
There are just an amorphous set of threads within a process, all equals. You
create threads, say, and a database program you're using creates threads, and
a fast sort library it uses creates more threads. Maybe you're also using a
thread-aware math library that creates more... and perhaps the thread library
has its own internal threads that occasionally come and go. Guess what? Your
"join any" will intercept the termination of the NEXT THREAD IN THE PROCESS
to terminate. It may be yours, or the thread library's, or anyone else's. If
it's someone else's thread, and the creator CARED about the termination of
that thread, you've broken the application. (Yeah, YOU broke it, because
there's nothing the library developer could reasonably be expected to do
about it.)

Generally true statement: Anyone who uses "join any" has a broken process.
The only exception is when you're sure that there cannot possibly, ever, be any
threads in the process you didn't create. And I don't believe anyone can ever
reasonably be sure of that in a modular programming environment. That is, if
you link against a library you didn't write, you don't know it can't ever use
threads. And if it ever does, you lose.

The POSIX pthread_join() function was nearly eliminated from the standard at
several points during the development of the standard. It's a minimal "helper
function" that does nothing of any particular value. It's utterly trivial to
implement pthread_join() yourself. Combine one part mutex, one part condition
variable, a dash of data; stir, and serve. You want to join with ANY of your
worker threads? No problem. Just add another sprinkle of data to record (if
you care) which thread terminated. You don't even need to add more mutexes or
condition variables, because they can all share one set. (To code a "normal"
pthread_join, you'd usually want each thread to have its own set.) Before
terminating, each thread locks the mutex, stores its termination state
(whatever information you want from it) and (if you want) its ID, then
signals or broadcasts (depending on your desired semantics) the condition
variable. To "join", you just wait (in, of course, a correctly tested
predicated condition wait loop!) for someone to have terminated.
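
The recipe above can be sketched as follows. All names here
(exit_board_t, board_announce, board_join_any, start_worker, MAX_WORKERS)
are hypothetical, and the only "termination state" recorded is a small
integer ID chosen by the creator; a real program would store whatever it
wants back from each worker. The workers are detached, since nothing ever
calls pthread_join() on them.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 16   /* hypothetical capacity of the exit record */

/* One mutex, one condition variable, and a dash of data. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             finished[MAX_WORKERS];  /* IDs not yet collected */
    int             count;                  /* predicate: count > 0  */
} exit_board_t;

#define EXIT_BOARD_INITIALIZER \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, {0}, 0 }

typedef struct {
    exit_board_t *board;
    int           id;
} worker_arg_t;

/* Last action of every worker: record the ID, then signal. */
void board_announce(exit_board_t *b, int worker_id)
{
    pthread_mutex_lock(&b->lock);
    if (b->count < MAX_WORKERS)
        b->finished[b->count++] = worker_id;
    pthread_cond_signal(&b->cond);
    pthread_mutex_unlock(&b->lock);
}

/* "Join any": wait until some worker of OURS has announced its exit. */
int board_join_any(exit_board_t *b)
{
    int id;

    pthread_mutex_lock(&b->lock);
    while (b->count == 0)                    /* predicated wait loop */
        pthread_cond_wait(&b->cond, &b->lock);
    id = b->finished[--b->count];
    pthread_mutex_unlock(&b->lock);
    return id;
}

void *worker(void *p)
{
    worker_arg_t *a = p;

    /* ... the real work would go here ... */
    board_announce(a->board, a->id);
    return NULL;
}

/* Create a detached worker. Returns 0 or pthread_create's error number. */
int start_worker(worker_arg_t *a)
{
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, worker, a);

    if (rc == 0)
        pthread_detach(tid);
    return rc;
}
```

Because only threads started through start_worker() ever touch the board,
this "join any" cannot intercept the termination of a thread that belongs
to some library.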

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
 Q201: How to propagate an exception to the parent thread?  

> Does anyone have or know of  an approach to mixing threads with C++
> exception handling?  Specifically, how to propagate an exception to the
> parent thread.  I can catch any exceptions thrown within a thread by way
> of a try block in the entry function.  The entry being a static class
> member function.  (I know, a "state of sin" wrt to C++ and C function
> pointers, but it works.)  Copying the exception to global (static or
> free store)  memory comes to mind.

It was the original intention of the designers of the C++ language to 
allow one to throw/catch across thread boundaries.  As far as I know 
the recently ratified ISO C++ standard makes no mention of threads 
whatsoever.  (The ISO C++ committee considered threads an OS and/or 
implementation, not a language issue.  BTW please don't send me a 
bunch of replies agreeing or disagreeing, I am not on the ISO 
committee, I am just reporting the facts. ;-)  However, I am unaware 
of ANY compiler that implements the ability to throw/catch across 
thread boundaries.  I have also discussed this issue with some 
experienced C++ programmers, and they also are unaware of any compiler
that implements this.  I am told that CORBA allows this if you want to
take that approach.  You may want to repost this in
comp.lang.c++.moderated.  In general there are some problems with this
approach anyway: simply killing a thread does not cause the C++
destructors to be called.  (Again I say generally, because the ISO 
standard makes no mention of threads, there is no portable behaviour 
upon which you may count.)  It is usually better to catch an exception
within the thread that threw it anyway.



NB: There is no such thing as a "parent" thread.  All threads are created
equal.  But we know what you mean.  

RogueWave's threads.h++ does a rethrow of exceptions in the user-selected
thread (the assigned "parent").  You may wish to look at that.

 Q202: Discussion: "Synchronously stopping things" / Cheating on Mutexes  

William LeFebvre wrote:

> Bil Lewis wrote:
> >  Practically speaking, the operation of EVERYBODY (?) is that a store
> >buffer flush (or barrier) is imposed upon unlocking, and nothing at all
> >done on locking.
> Well, except the guarantee that the lock won't be obtained until
> the flush is finished.

Actually, that's incorrect. There may be no "flush" involved. That's the whole
problem with this line of reasoning. One side changes data and unlocks a mutex;
the other side locks a mutex and reads the data. That's not a discrete event,
it's a protocol; and only the full protocol guarantees visibility and ordering.

I dislike attempts to explain mutexes by talking about "flushes" because while a
flush will satisfy the requirements, it's not a minimal condition. A flush is
expensive and heavy-handed. All that's required for proper implementation of a
POSIX mutex is an Alpha-like (RISC) memory barrier that prevents migration of
reads and writes across a (conceptual) "barrier token". This affects only the
ordering of memory operations FROM THE INVOKING PROCESSOR. With a similar memory
barrier in the correct place in mutex lock, the protocol is complete. But with
only half the protocol you get ordering/visibility on one side, but not on the
other; which means you haven't gotten much.

As implied by the quoted statement above, once you've GOTTEN the mutex, you can
be sure that any data written by the previous holder, while the mutex was
locked, has also made its way to the memory system. The barrier in unlock
ensures that, since the unlocked mutex value can't go out before the previous
data goes out; and the barrier in lock ensures that your reads can't be issued
before your mutex lock completes. But this assurance is not necessarily because
of a "flush", and the fact that someone else unlocked a mutex after writing data
is not enough to ensure that you can see it; much less that you can see it in
the correct order.
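
A minimal sketch of the full protocol described above, assuming a
hypothetical shared record (the names shared_box, publish, try_consume
and run_protocol_demo are illustrative, not from the original post).
The writer changes the data and unlocks; the reader locks and then
reads. Neither half is useful without the other:

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical shared record; all names here are illustrative. */
struct shared_box {
    pthread_mutex_t lock;
    int ready;     /* becomes 1 once payload is valid */
    int payload;
};

static struct shared_box box = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };

/* Writer half of the protocol: change the data, then unlock. */
static void *publish(void *arg)
{
    pthread_mutex_lock(&box.lock);
    box.payload = *(int *)arg;
    box.ready = 1;
    pthread_mutex_unlock(&box.lock);
    return NULL;
}

/* Reader half: lock, then read.  Returns 1 and fills *out once the
   writer has published, 0 otherwise. */
static int try_consume(int *out)
{
    int seen;

    assert(out != NULL);
    pthread_mutex_lock(&box.lock);
    seen = box.ready;
    if (seen)
        *out = box.payload;
    pthread_mutex_unlock(&box.lock);
    return seen;
}

/* Starts one writer and polls under the mutex until its data shows up.
   Returns the value read, or -1 if the thread could not be created. */
static int run_protocol_demo(int value)
{
    pthread_t writer;
    int got = 0;

    if (pthread_create(&writer, NULL, publish, &value) != 0)
        return -1;
    while (!try_consume(&got))
        sched_yield();
    pthread_join(writer, NULL);
    return got;
}
```

Because the reader only ever looks at "ready" and "payload" while
holding the same mutex the writer used, it can never observe the flag
without the data that goes with it.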

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

Subject: Re: synchronously stopping things

David Holmes wrote:

> wrote in article <6injjb$u44$>...
> > One place where I try to avoid a mutex (at the risk of being called a
> fool)
> > is in singletons:
> >
> > MyClass* singleton(Mutex* mutex, MyClass** instance)
> > {
> >   if (*instance == 0) {
> >     mutex->lock();
> >     if (*instance == 0)  // must check again
> >       *instance = new MyClass();
> >     mutex->unlock();
> >   }
> >   return *instance;
> > }
> This coding idiom is known as the "Double Checked Locking pattern" as
> documented by Doug Schmidt (see his website for a pointer to a paper
> describing the pattern in detail). It is an optimisation which can work but
> which requires an atomicity guarantee about the value being read/written.
> The pattern works as follows. The variable being tested must be a latched
> value - it starts out with one value and at some point will take on a
> second value. Once that occurs the value never changes again.
> When we test the value the first time we are assuming that we can read the
> value atomically and that it was written atomically. This is the
> fundamental assumption about the pattern. We are not concerned about
> ordering as nothing significant happens if the value is found to be in the
> latched condition, and if it's not in the latched condition then acquiring
> the mutex enforces ordering. Also we do not care about visibility or
> staleness.

The last part is critical, and maybe rather subtle. You have to not care about
visibility or latency.

So... the code in question is broken. It's unreliable, and not MP-safe at all.
That is, it's perfectly "MT" [multithread] safe, as long as you're on a
uniprocessor... but move to an aggressive MULTIPROCESSOR, and it's "game
over".

Why? Yeah, I thought maybe you'd ask. ;-)

The problem is that you're generating a POINTER to an object of class MyClass.
You're creating the object, and setting the pointer, under a mutex. But when
you read a non-NULL value of the pointer, you're assuming that you also have
access to the OBJECT to which that pointer refers.

That is not necessarily the case, unless you are using some form of explicit
synchronization protocol between the two threads, because having set the value
under a mutex does not guarantee VISIBILITY or ORDERING for another thread
that doesn't adhere to the synchronization protocol.

Yes, "visibility" might seem not to be an issue here -- either the other
thread sees the non-NULL value of "instance", and uses it, or it sees the
original NULL value, and rechecks under the mutex. But ORDERING is critical,
and it's really a subset of VISIBILITY.

The problem is that the processor that sees a non-NULL "instance" may not yet
see the MyClass data at that address. The result is that, on many modern
SMP systems, you'll read garbage data. If you're lucky, you'll SEGV, but you
might just accept bad data and run with it... into a brick wall.

The more aggressive your memory system is, the more likely this is to occur.
You wouldn't, for example, have any problem running on an Alpha EV4 chip...
but on an EV5 or EV6 SMP system, you'll probably end up with intermittent
failures that will be nearly impossible to debug, because they'll often depend
on nanosecond timing factors that you can't reproduce reliably even in
production code, much less under a debugger. (And if you slip by that
"probably" and miss the races, you can be sure that one of your customers will
run into one eventually... and that's even less fun.)

You can fix this problem very simply without a mutex, but that solution is
machine dependent. For example, using DEC C on a Digital UNIX Alpha system, it
could be as simple as changing your test to:

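     if (*instance == 0) {
       /* lock the mutex, check again, and create the object, as before */
     } else
       asm("mb");   /* memory barrier before any dereference of *instance */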

The "mb" (memory barrier) between the test for the non-NULL pointer, and any
later dereferences of the pointer, ensures that your memory reads occur in the
correct and safe order. But now your code isn't portable. And get it wrong in
one place, and your program is toast. That's what I meant in a previous post
about the risk and cost of such optimizations. Is the cost of locking the
mutex really so high that it's worth sacrificing portability, and opening
yourself up to the whims of those ever-more-creative hardware designers?
Sometimes, yes. Most of the time... no way.
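
For comparison, here is a portable sketch that avoids the unlocked
first check entirely. It swaps in pthread_once(), a standard POSIX
call that the post above does not itself mention, and uses a
hypothetical C struct my_class as a stand-in for MyClass:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the MyClass of the quoted example. */
struct my_class {
    int value;
};

static pthread_once_t instance_once = PTHREAD_ONCE_INIT;
static struct my_class *instance = NULL;

/* Runs exactly once, no matter how many threads call singleton(). */
static void create_instance(void)
{
    instance = malloc(sizeof *instance);
    if (instance != NULL)
        instance->value = 42;   /* arbitrary illustrative contents */
}

/* Every caller goes through pthread_once(), so a caller that gets a
   non-NULL pointer back has also synchronized with the thread that
   built the object.  Returns NULL only if the allocation failed. */
static struct my_class *singleton(void)
{
    int rc = pthread_once(&instance_once, create_instance);

    assert(rc == 0);
    (void)rc;
    return instance;
}
```

The cost is one library call per access instead of one pointer test,
which is the trade being argued about above.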

/---------------------------[ Dave Butenhof ]--------------------------\
| Digital Equipment Corporation          |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

From - Sun May 10 01:31:31 1998
From: Achim Gratz 
Newsgroups: comp.programming.threads
Subject: Re: Mutexes and memory systems (was: synchronously stopping things)

OK, I'm trying to wrap things up a bit, although I guess it will still
become rather long.

Forget caching, memory barriers, store buffers and whatever.  These
are hardware implementation details that are out of your control.  You
don't want to know and most of the time you don't know everything
you'd need to anyway, which is more dangerous than knowing nothing.
Trying to infer the specification from the implementation is what gets
you into trouble.

When you lock mutex A, POSIX gives you a guarantee that all shared
data written by whatever thread under mutex A is visible to your
thread, whatever CPU it might run on, and has been completely written
before any reads to it can occur (this is the ordering part).  When
you unlock the mutex, it is best to assume that the shared data
vanishes in neverland.  It is not guaranteed to be up-to-date or
visible at all by POSIX, nor can you infer any order in which the
writes may be performed or become visible.  It is up to the
implementation to employ the hardware in the appropriate manner.
Efficient employment of the hardware is a quality of implementation
issue that's to be considered after the proof of correctness.

[ I don't have the standard, only Dave's book, but that is the
definition that would allow for the most aggressive memory
implementations.  It seems sensible, to me at least, to assume this
definition for utmost portability.  It would be interesting to know if
the exact wording does indeed support my interpretation and whether it
was intended to be that strong.  It would seem that for multiple
concurrent readers you'd need to lock multiple mutexes if you want to
stay within the bounds of the above definition. ]

The problem with this definition and the origin (I think) of the
brouhaha in various (usenet) threads here in this group is that
implementation of multiple reader situations seems overly expensive
since a single mutex allows only a single reader at a time, else you
need a mutex for every reader.  All of the schemes presented so far
that propose to avoid the mutex locking by the readers rely on further
properties of the hardware or pthread library implementation.
Fortunately or unfortunately these properties exist on most if not all
hardware implementations in use today or the library implementors have
taken care to slightly expand the guarantees made by POSIX because you
usually don't tell the customer that he is wrong.
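
One way to keep multiple readers inside a locking protocol, rather
than dropping the lock, is a read-write lock. The sketch below assumes
an implementation that provides pthread_rwlock_t (part of later POSIX
revisions, not the base pthreads discussed in this post); table_value,
table_read and table_write are hypothetical names:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
static int table_value = 0;   /* hypothetical shared datum */

/* Any number of readers may hold the read lock at the same time. */
static int table_read(void)
{
    int v;
    int rc = pthread_rwlock_rdlock(&table_lock);

    assert(rc == 0);
    (void)rc;
    v = table_value;
    pthread_rwlock_unlock(&table_lock);
    return v;
}

/* A writer excludes all readers and all other writers. */
static void table_write(int v)
{
    int rc = pthread_rwlock_wrlock(&table_lock);

    assert(rc == 0);
    (void)rc;
    table_value = v;
    pthread_rwlock_unlock(&table_lock);
}
```

Whether this is actually cheaper than a plain mutex depends on the
implementation and on how long the readers hold the lock.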

Why do these hacks work?

1) no hardware designer can design a system where it takes an
unbounded time for a write to memory to propagate through the system
as that requires infinite resources

2) no one consciously introduces longer delays than absolutely necessary
for reasons of efficiency and stability

3) there is no implementation, AFAIK, of shared memory that is visible
only under an associated mutex, although capability based
architectures might have them

4) in the absence of 3, keeping a directory of the data changed under
each mutex is likely to be more expensive than making all data visible
and introducing order with respect to any mutex except for those that
are still under lock for writing

5) on a shared memory system that lacks 3 and 4, once the data has
been forced to memory by one processor, it is visible to all
processors and no stray stale copies are in caches if the write
occurred under mutex lock; even if not ordered on other processors,
data in memory becomes visible after a bounded time because of 1 and 2

6) the above holds for ccNUMA systems as well, although the time for
propagation can be considerably longer

What does not work?

a) any writes of shared data, without locking, atomic or not, with the
exception of "one-shot-flags" (i.e. any value different from the
initial one that has to be set before the threads are started signals
that some event occurred and you never change the value back and you
don't care about the exact value itself)

b) multiple changes to a variable without unlocking/locking between
each change may not become visible at all or may be seen to have
different values in different threads

c) any read of shared data, without locking, where accessing data
after the writer has released the lock would be an error or getting
old data can't be tolerated

d) any read of shared data, without locking, that is not properly
aligned or is larger than the size of memory transactions (there may
be several), tearing may occur without notice

e) porting to a system where strict POSIX semantics are implemented
(e.g. NUMA systems with software coherency)


You can't do much safely without those pesky mutexes.  The things you
can do aren't, IMHO, in the critical path (performance-wise) most of
the time.  The potential payback is low and the original problem can
be solved by barriers just fine within POSIX (I think - it's been a
long day).  POSIX1003.1j will even give you barriers that may be more
efficient because the library writers took the time to evaluate the
hardware properties in depth.  That suggests that you could indeed
wring some cycles out of these hacks at the expense of portability and
correctness.  If you do, just make sure you don't give this piece of
code to anybody else.

[ If you think I'm paranoid, perhaps I am.  But I'm sick of commercial
software that isn't even linked properly so that it breaks on new
machines and OS releases and sometimes even OS patches when it would
be a simple matter of controlling the build environment to do it
right.  If you have to support more than one of these beauties you get
a very big headache when you find out that the intersection of system
configurations that work is the null set.  Specifications and
standards exist for a reason.  If you don't like them, get them
changed.  Don't break things gratuitously. ]

Achim Gratz.

 Q203: Discussion: Thread creation/switch times on Linux and NT   

I'm so excited about this that I had to restate what I now think
to be the key design differences between NT and Linux 2.0 wrt. task
switching:

1.  The design of the Linux scheduler appears to make the assumption that,
    at any time during "normal" operation, there will only be a small
    number of actually runnable processes.

2.  The Linux scheduler computes which of these runnable processes to run
    via a linear scan of the run queue - looking for the highest priority
    process.

3.  The Linux yield_cpu() function is EXTREMELY prejudicial towards the
    calling program.  If you call yield_cpu() you are not only yielding
    the CPU, but you are also setting your priority to zero (the lowest)
    meaning that you will not run again (because of #2 above) until ALL
    other runnable processes have had a bite at the CPU.

4.  A process, under Linux, steadily has its priority lowered as a function of
    how long it has been scheduled to the CPU.

5.  The Linux scheduler re-sets everyone's priority to a "base"
    priority once all of them have had their priority lowered to zero
    (either through #3 or #4).  This re-set entails another linear
    traversal of the run queue in schedule().
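
The five points above can be put into a toy model. This is NOT the
Linux 2.0 source; struct task, pick_next, toy_yield, toy_tick and the
BASE_PRIORITY value are all hypothetical, chosen only to show the
linear scan, the yield-to-zero policy and the reset pass:

```c
#include <assert.h>
#include <stddef.h>

#define BASE_PRIORITY 20   /* arbitrary value for this toy model */

struct task {
    int runnable;   /* nonzero if the task is on the run queue */
    int counter;    /* remaining priority; lowered as the task runs */
};

/* Point 2: linear scan for the runnable task with the highest counter.
   Returns its index, or -1 if no runnable task has any counter left. */
static int scan(const struct task *t, size_t n)
{
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (t[i].runnable && t[i].counter > 0 &&
            (best < 0 || t[i].counter > t[best].counter))
            best = (int)i;
    }
    return best;
}

/* Point 5: when every runnable task has reached zero, reset everyone
   to the base priority (a second linear pass) and scan again. */
static int pick_next(struct task *t, size_t n)
{
    int best;
    size_t i;

    assert(t != NULL);
    best = scan(t, n);
    if (best < 0) {
        for (i = 0; i < n; i++)
            t[i].counter = BASE_PRIORITY;
        best = scan(t, n);
    }
    return best;
}

/* Point 3: yielding drops the caller straight to the lowest priority. */
static void toy_yield(struct task *self)
{
    self->counter = 0;
}

/* Point 4: priority decays while the task holds the CPU. */
static void toy_tick(struct task *self)
{
    if (self->counter > 0)
        self->counter--;
}
```

In this model a task that calls toy_yield() is not picked again until
every other runnable task has also dropped to zero, and both pick_next()
passes cost time proportional to the number of tasks.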


a:  #1 is probably a very reasonable assumption.

b:  #2 causes task-switching time, on Linux, to degrade as more runnable
    processes are added.  It was obviously a design decision driven by
    the assumption in #1.

c:  #3 is, to me, a contentious issue.  Should you get penalized for
    voluntarily yielding the CPU - should it put you on the "back of the bus"
    or should it simply lower your priority by one?  After all, most other
    voluntary yields (such as for I/O or to sleep for a time) usually
    raise your priority under other UNIXs (don't know if that's the case
    with Linux - haven't checked).

    In either case, Mingo's code changes this policy.

d:  #4 is standard, textbook, OS stuff.

e:  #5 is another reasonable behavior.  The linear scan is, again, a function
    of the belief that #1 is true (or so I believe).

f:  Because of the combined effects of #1, #2, #3, and #5 my yield_cpu()
    benchmark was indeed extremely prejudicial to Linux since the assumptions
    that I was making were not the same as those of Linux's designers.  That
    doesn't mean my benchmark is, or was, a "bad" benchmark.  Quite the
    contrary - it illustrates in painful detail what happens when the
    designers of a system are using different criteria than those of the
    users of the system.

    It is up to the community to decide which criteria are more valid.

The net result is that Linux may well beat out NT for context switches
where the number of runnable processes is very small.  On the other hand,
NT appears to degrade more gracefully as the runnable process count
increases.  Which one is a "better" approach is open to debate.

For example, we could probably make Linux degrade gracefully (through hashing,
pre-sorting, etc.), as does NT, at the expense of more up-front work with
the resultant degradation in context-switch time where the # of processes
is very small.

On the other hand, the crossover point between Linux vs. NT appears to be
right around 20 runnable processes.  On a heavily loaded web server (say)
with 20-40 httpd daemons plus other code, does the "real world" prefer the
NT way or the Linux way?  How about as more and more programs become

The great thing about Linux is that we have the source - thus these
observations can be made with some assurance as to their accuracy.  As for
NT, I feel like the proverbial blind man trying to describe something
I've never seen.

The other great thing is that we can change it in the manner that best
suits our needs.  I love choice and I hate Microsoft.


Let us pray:
What a Great System.
Please Do Not Crash.

From: (Linus Torvalds)
Subject: Re: Thread creation/switch times on Linux and NT (was Re: Linux users working at Microsoft!)
Date: 8 Mar 1998 01:31:03 GMT
Organization: Transmeta Corporation, Santa Clara, CA

In article ,
Greg Alexander  wrote:
>In article <6ds74q$j9r$>, Gregory Travis wrote:
>>All process priorities were recomputed 99,834 times - or just
>>0.5% of the time.  Furthermore, only 31,059,779 processes (total)
>>were examined during those recalcs as opposed to the 61,678,377 that
>>were examined by the much more expensive "goodness" function.
>>From my perspective, this would tend to strongly favor the current
>>scheduling implementation (simple linear search as opposed to more
>>complex but robust hashed run queue) - at least for web serving (strong
>>emphasis on the latter).  If I were to look for improvements, under this
>>scenario, I would focus on the "goodness" function since 4% of the time
>>we had to throw ten or more processes through it.  Perhaps bringing it
>>inline with the sched() function.
>>But even that may be overkill since we only called sched() 24,221,164
>>times over a 17 hours period - or about 400 times per second.
>>Comments?  I would be happy to make my modifications available (they are
>>trivial) to anyone who wants to instrument their own application.
>My biggest suggestion is to try kernel profiling.  Check if any notable
>amount of time is actually spent in goodness before worrying about changing

Also, check out 2.1.x - there are some changes to various details of the
scheduler that were brought on by the finer locking granularity, but
that were sometimes also related to performance. 

I do obviously agree with the basic points above - I wrote most of the
scheduler.  Usually there aren't all that many runnable processes even
under heavy load, and having a very simple linear queue is a win for
almost all situations in my opinion.  For example, if there are lots of
processes doing IO, the process list tends to be fairly short and you
really want a very simple scheduler for latency reasons.  In contrast,
if there are lots of CPU-bound processes, there may be lots of runnable
processes, but it very seldom results in a re-schedule (because they
keep running until the timeslot ends), so again there is no real reason
to try to be complex. 

So yes, under certain circumstances the current scheduler uses more CPU
than strictly necessary - and the "40 processes doing a sched_yield()
all the time" example is one of the worst (because it implies a lot of
runnable processes but still implies continuous thread switching). 

Personally I don't think it's a very realistic benchmark (it tells you
_something_, but I don't think it tells you anything you need to know),
which is one reason why Linux isn't maybe the best system out there for
that particular benchmark.  But it would be easy enough to make Linux
perform better on it, so I'll think about it. 

[ Even when I don't find benchmarks very realistic I really hate arguing
  against hard numbers: hard numbers are still usually better than just
  plain "intuition".  And I may well be wrong, and maybe there _are_
  circumstances where the benchmark has some real-world implications,
  which is why I wouldn't just dismiss the thing out-of-hand.  It's just
  too easy to ignore numbers you don't like by saying that they aren't
  relevant, and I really try to avoid falling into that trap. ]

The particular problem with "sched_yield()" is that the Linux scheduler
_really_ isn't able to handle it at all, which is why the Linux
sched_yield() implementation sets the counter to zero - I well know that
it's not the best thing to do for performance reasons, and I think it
unduly penalizes people who want to yield some CPU time, but as it
stands the scheduler can't handle it any other way (the "decrement
counter by one" approach that Ingo suggested is similarly broken - it
just happens to not show it quite as easily as the more drastic "zero
the counter", and it has some other problems - mainly that it doesn't
guarantee that we select another process even if another one were to be

I should probably add a "yield-queue" to the thing - it should be rather
easy to do, and it would get rid of the current scheduler wart with
regard to sched_yield().  My reluctance is purely due to the fact that I
haven't heard of any real applications that it would matter for, but I
suspect we need it for stuff like "wine" etc that need to get reasonable
scheduling in threaded environments that look different from pthreads(). 

From - Sun Mar  8 15:03:12 1998
From: (Gregory Travis)
Subject: Re: Thread creation/switch times on Linux and NT (was Re: Linux users working at Microsoft!)

Here's some more data, using the latest version of my context switching
benchmark.  Test machine is a 64MB 200MHz Pentium "classic".

                   Number of processes/threads
Switch time          2      4      8     10     20     40
                  ----   ----   ----   ----   ----   ----
Std. Procs        19us   13us   13us   14us   16us   27us
Std. Threads      16us   11us   10us   10us   15us   23us

Mingo Procs        4us    6us   11us   12us   15us   28us
Mingo Threads      3us    3us    5us    7us   12us   22us

NT Procs          10us   15us   15us   17us   16us   17us
NT Threads         5us    8us    8us    9us   10us   11us


The "Std." entries show the results of my yield_cpu() benchmark against
the standard Linux scheduler using either threads or processes.

The "Mingo" entries show the results of the same benchmark but after the
Linux yield_cpu() entry has been modified per Mingo's suggestion so that
it doesn't take the counter to zero.

The "NT" entries show the results of the benchmark under NT.

Each benchmark was run twice for each number (to promote accuracy).  Thus
the above is the result of 72 individual runs.
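
The benchmark source itself is not reproduced here. The following is
only a rough sketch of the general shape such a test might take with
POSIX threads, NOT the code that produced the table: yield_bench,
yield_loop and MAX_THREADS are hypothetical names, sched_yield() stands
in for yield_cpu(), and the figure it returns includes thread creation
and join overhead:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <time.h>

#define MAX_THREADS 64   /* arbitrary cap for this sketch */

/* Each thread does nothing but give up the CPU, over and over. */
static void *yield_loop(void *arg)
{
    long n, i;

    assert(arg != NULL);
    n = *(const long *)arg;
    for (i = 0; i < n; i++)
        sched_yield();
    return NULL;
}

/* Runs nthreads threads that each yield `iterations` times and returns
   the average wall-clock microseconds per yield, or -1.0 on error.
   On a multiprocessor the threads run in parallel, so this is not a
   pure context-switch time. */
static double yield_bench(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    struct timespec t0, t1;
    double elapsed_us;
    int started = 0;
    int i;

    if (nthreads < 1 || nthreads > MAX_THREADS || iterations < 1)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, yield_loop, &iterations) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (started != nthreads)
        return -1.0;

    elapsed_us = (double)(t1.tv_sec - t0.tv_sec) * 1e6 +
                 (double)(t1.tv_nsec - t0.tv_nsec) / 1e3;
    return elapsed_us / ((double)nthreads * (double)iterations);
}
```

Calling yield_bench() with 2, 4, 8, 10, 20 and 40 threads would give
one row of a table like the one above, though the absolute numbers on
a modern machine will not be comparable.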


The dramatic drop in context switch time, between the "Std." and "Mingo"
runs shows how expensive the priority recalc can be - for short run
queues at least.  Note that it makes little or no difference as the
run queue length exceeds about 10 processes.  This is almost certainly
because the cost of the "goodness" function begins to dominate the picture.
For a given number of iterations, the goodness function is much more
expensive than the priority recalc function.  The goodness function must
be performed on each runnable process while the priority recalc must be
performed on all processes.  Thus with a small # of runnable processes,
the expensive goodness function is not called much while the "cheap"
priority recalc is called for each process, runnable or not.  As the run
queue grows, however, the goodness function is called more (while the
priority recalc function is essentially constant).  Around ~15 processes,
on my system, the cost of "goodness" washes out the noise from the priority
recalc.

Nevertheless, the context switch times shown in the "Mingo" series are
probably closest to the actual Linux context switch times.  Note how the
series dramatically illustrates how context switch overhead, on Linux,
grows as a function of the run queue length.

It appears that the context-switch overhead for Linux is better than NT for
shortish run queues and, especially, where process/process switch time is
compared.  With run queues longer than about 20 processes, though, NT's
scheduler starts to beat out the Linux scheduler.  Also note that NT's
scheduler appears more robust than the Linux scheduler - its degradation
as the run queue grows is nowhere as dramatic as Linux's.  NT's thread
switch times doubled between 2 and 40 threads while Linux's showed a
>sevenfold< slowdown.

Does it matter?  Quite probably not.  From my earlier posting, with data
from a heavily loaded webserver, I saw an average run queue length of
2.5 processes.  The run queue exceeded 10 processes only about 4% of the
time.

I've put my benchmarks, as well as the kernel changes to record
run queue length, on anonymous ftp at

From - Sun Mar  8 15:04:37 1998
From: (Gregory Travis)
Subject: Re: Thread creation/switch times on Linux and NT (was Re: Linux users working at Microsoft!)

In article ,
Greg Alexander  wrote:
>In article <6du96a$gqi$>, Gregory Travis wrote:
>>Here's some more data, using the latest version of my context switching
>>benchmark.  Test machine is a 64MB 200MHz Pentium "classic".
>>                   Number of processes/threads
>>Switch time          2      4      8     10     20     40
>>                  ----   ----   ----   ----   ----   ----
>>Std. Procs        19us   13us   13us   14us   16us   27us
>>Std. Threads      16us   11us   10us   10us   15us   23us
>>Mingo Procs        4us    6us   11us   12us   15us   28us
>>Mingo Threads      3us    3us    5us    7us   12us   22us
>>NT Procs          10us   15us   15us   17us   16us   17us
>>NT Threads         5us    8us    8us    9us   10us   11us
>Does this look to you like NT maybe never traverses the tree and never
>updates priorities (assuming it even switches every time)?  This indicates
>non-complexity, which is beautiful, but I bet that they didn't do it well.
>(NT being VMS's deranged nephew or something)

I don't know what NT's scheduling algorithm is.  I'm very surprised, given
your comments below, that you are venturing an opinion on how NT works.  It
may not even use a list (what you referred to as a "tree" which it is not
in Linux) at all.

>Please, please, /PLEASE/ use profiling when talking about "this is almost
>certainly because the cost of the goodness function begins to dominate the
>picture."  It will tell you exactly which function dominates which picture
>quite clearly and simply.  It's much easier to say "goodness takes so much
>time, the recalc takes this much time," than bothering to make appeals of
>logic "goodness should take more time because."  Not that the latter is a
>bad idea in any case, just to explain why, but you should never explain why
>something is happening that you aren't certain is happening if you have an

Greg, so far you've contributed nothing positive to this venture other
than making most of us painfully aware that you don't even understand
ulimit and that your favorite way of showing how smart you are is by
throwing out red herrings at every opportunity.

I'll tell you what - why don't you try and reverse that impression?  I
spent about five hours of my life last night running the above sequence (not
to mention all the rest of the time I've devoted to this).  For the past
twenty years I've been paid to design and write software [including
a UNIX kernel release that used a scheduler I wrote] during the day
so perhaps you'll forgive me if I want to take this evening off and
instead watch Bill Gates lie on CSPAN.

So, here's something positive you can do: profile the kernel.  All my
sources and kernel changes are at (anonymous ftp).  Why
don't you take them and report back to us with your findings?  That would be
very nice, thanks.  Don't forget to do it with and without Mingo's very
helpful changes.

>Note that there are variables here you are controlling unintentionally. 
>Your statement would be better made as "With my benchmark and runqueues
>longer than about 20 processes, though, NT's..." or, to be specific, "When
>all runnable processes are calling sched_yield() in a loop and there are a
>minimal number of non-runnable processes and runqueues are longer than about
>20 processes..." and I'm sure there are plenty of other variables I've left
>out.  Having only about 80 processes, with 40 of them in a loop calling
>sched_yield(), you will not get general purpose numbers.  I'd almost expect
>more dormant processes to slow down linux more than NT in this case, but I
>don't know what would happen if the dormant processes were more like your
>"real life" example, i.e. many IO-bound programs that are awakened
>frequently, with an average of some number of them in the runqueue at once. 

You have an awful lot of "I'd almost expect," "I don't know," and
"I'm sure" statements for a guy who earlier so soundly admonished me
for stating what was clearly my opinion.

>NO!  Robust is the WRONG word!  Robust implies it can handle many different
>situations.  It is better at /THIS/ situation with large numbers of idling
>runnable processes.  Your test does not show how NT runs in real life

I can accept that.  Where can I download your test?

>If NT's scheduler really were more "robust," it would matter a good deal. 
>All you've shown is that its times don't appear to grow linearly as the
>number of runnable idling processes grows.

Thank you.  That's all I claimed to show (along with the switch times).

 Q204: Are there any problems with multiple threads writing to stdout?  

> >  > However, even if there are no problems, you may be seeing interleaved
> >  >output:
> >  >
> >  > example:
> >  >
> >  >  printf("x=%d, y=%d\n", x, y);
> >  >
> >  >there is no guarantee that x and y will appear on the same line
> >
> > Surely, printf() will lock the stream object (if you use the MT safe glibc2),
> > no?
> Not on Linux, or any other UNIX variant I've dealt with.  UNIX is used
> to it, even before threads.  stdout on NT doesn't make sense unless it's
> a console application.

For POSIX conformance, printf() must lock the process' stdio file stream. That is,
the output is "atomic". Thus, if two threads both call a single printf()
simultaneously, each output must be correct. E.g., for

         printf ("%d, %d\n", 1, 2);      printf ("%s, %s\n", "abc", "def");

you might get

     1, 2
     abc, def

or you might get

     abc, def
     1, 2

but no more "bizarre" variations. If you do, then the implementation you're using
is broken.

There is another level of complication, though, if you're talking about the
sequence of multiple printf()s, for example. E.g., if you have

         printf ("%d", 1);               printf ("%s", "abc");
         printf (", %d\n", 2);           printf (", %s\n", "def");

Then you might indeed get something like

     abc1, def
     , 2

POSIX adds an explicit stdio stream lock to avoid this problem, which you can
acquire using flockfile() and release using funlockfile(). For example, you could
correct that second example by coding it as

         flockfile (stdout);             flockfile (stdout);
         printf ("%d", 1);               printf ("%s", "abc");
         printf (", %d\n", 2);           printf (", %s\n", "def");
         funlockfile (stdout);           funlockfile (stdout);

Of course, if you write to the same file using stdio from separate processes,
there's no synchronization between them unless there are some guarantees about how
stdio generates the actual file descriptor write() calls from its internal
buffering. (And I don't believe there is.)
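
A self-contained sketch of the flockfile() version, with two threads
writing to the same stream. The helper names (pair_job, print_pairs,
run_pair_demo, count_intact_lines) are hypothetical; only flockfile(),
funlockfile() and the stdio calls are standard:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

struct pair_job {
    FILE *out;
    const char *first;
    const char *second;
    int repeat;
};

/* Writes "first, second\n" as two separate fprintf() calls, holding
   the stream lock across both so no other thread can get in between. */
static void *print_pairs(void *arg)
{
    struct pair_job *job = arg;
    int i;

    assert(job != NULL && job->out != NULL);
    for (i = 0; i < job->repeat; i++) {
        flockfile(job->out);
        fprintf(job->out, "%s", job->first);
        fprintf(job->out, ", %s\n", job->second);
        funlockfile(job->out);
    }
    return NULL;
}

/* Runs the two writers from the example above against one stream.
   Returns 0 on success, -1 if a thread could not be created. */
static int run_pair_demo(FILE *out, int repeat)
{
    struct pair_job a = { out, "1", "2", repeat };
    struct pair_job b = { out, "abc", "def", repeat };
    pthread_t ta, tb;

    if (pthread_create(&ta, NULL, print_pairs, &a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, print_pairs, &b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return 0;
}

/* Rereads a seekable stream from the start and counts its lines.
   Returns -1 if any line is neither "1, 2" nor "abc, def". */
static int count_intact_lines(FILE *in)
{
    char line[64];
    int n = 0;

    rewind(in);
    while (fgets(line, sizeof line, in) != NULL) {
        if (strcmp(line, "1, 2\n") != 0 && strcmp(line, "abc, def\n") != 0)
            return -1;
        n++;
    }
    return n;
}
```

Calling run_pair_demo(stdout, 3) prints six lines in some order, each
of them either "1, 2" or "abc, def". Removing the flockfile() and
funlockfile() calls would allow the mixed lines shown earlier.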

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q205: How can I handle out-of-band communication to a remote client?  

  Stefan Rupp  wrote:
> Good afternoon,
> we encountered a problem in the design of our client-server architecture.
> The situation is as follows:
>  [1] the server runs as a daemon process on an arbitrary host
>  [2] the client may connect to any number of servers
>  [3] when connected, the client requests data from a server
>      through a TCP socket and waits for the server to deliver
>      the requested data
>  [4] the server itself can send messages to the client at any
>      time without being asked by the client
> In a first step, I designed the client multithreaded, consisting of the
> 'main' thread and an 'I/O' thread, which handles the communication between
> the client and the server through a SOCK_STREAM socket with a select(2)
> call. The connection between the main thread and the I/O thread is made
> through a pair of pipes, so that the select call, which waits for
> incoming messages from the server as well as from the main thread,
> returns and handles the request. To open a new I/O thread for each server
> the client wants to connect to, is probably not a good idea, because I
> need two pipes for each thread to communicate with. So, only one I/O
> thread must handle the connection to any server the client connects to.
> Does anybody have a better idea how to design the client, so that it
> can handle unexpected callbacks from the server at any time? In the
> book "UNIX Network Programming" it is stated that signal driven I/O
> is not advisable for a communication link through stream sockets, so
> that is not an option.
> Thanks!
> Doei,
>      struppi
> --
> Dipl.-Inform. Stefan H. Rupp
> Geodaetisches Institut der RWTH Aachen         Email:
> Templergraben 55, D-52062 Aachen, Germany      Tel.:  +49 241 80-5295

Change the client a little. Have one thread that waits on the responses from
the socket. This is a blocking call, so it is VERY efficient (though you will
want a timeout in there to do housekeeping and to check for shutdown every few
seconds). Have a second thread that sends messages to the server on the
socket. This is safe, because sockets are bidirectional async. devices. If
the receive thread knows how to deal with messages from the server, the
architecture is quite simple. You may need a queue of messages waiting to be
processed if processing time is long, or a queue of messages to send to the
server to prevent contention on SENDING to the server.

We have implemented a client/server system using such an architecture. It works
very well, with full async. bidirectional messaging between client and server.
The server can deal with 1500 messages a second (in total, not each) from 200
clients.
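
The two-thread layout described above can be sketched roughly as follows. This
is only an illustration, not the poster's actual code: the names struct conn,
receiver() and send_message() are made up here, the "message handling" merely
counts bytes, and a POSIX system with C11 atomics is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdatomic.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical per-connection state shared by the two threads. */
struct conn {
    int fd;                     /* connected SOCK_STREAM socket */
    atomic_int stop;            /* set nonzero to ask the receiver to exit */
    pthread_mutex_t send_lock;  /* keeps concurrent senders from interleaving */
    long bytes_received;        /* stand-in for real message handling */
};

/* Receiver thread: sleeps in poll() and wakes at least once a second
   to check for shutdown (and to do any housekeeping). */
static void *receiver(void *arg)
{
    struct conn *c = arg;
    struct pollfd pfd = { .fd = c->fd, .events = POLLIN };
    char buf[4096];

    while (!atomic_load(&c->stop)) {
        int n = poll(&pfd, 1, 1000);
        if (n < 0 && errno != EINTR)
            break;                    /* unexpected poll() failure */
        if (n <= 0)
            continue;                 /* timeout or interrupted: housekeeping */
        ssize_t got = recv(c->fd, buf, sizeof buf, 0);
        if (got <= 0)
            break;                    /* peer closed the connection, or error */
        c->bytes_received += got;     /* a real client would parse messages here */
    }
    return NULL;
}

/* Sender side, callable from any other thread.
   Returns 0 once the whole message is written, -1 on error. */
static int send_message(struct conn *c, const void *msg, size_t len)
{
    const char *p = msg;
    int rc = 0;

    pthread_mutex_lock(&c->send_lock);
    while (len > 0) {
        ssize_t put = send(c->fd, p, len, 0);
        if (put < 0) {
            if (errno == EINTR)
                continue;
            rc = -1;
            break;
        }
        p += put;
        len -= (size_t)put;
    }
    pthread_mutex_unlock(&c->send_lock);
    return rc;
}
```

With one such receiver per server (or one receiver polling several sockets), no
pipes are needed between the main thread and the I/O thread: the main thread
simply calls send_message() itself, and the mutex stands in for the "queue of
messages to send" mentioned above when send contention is low.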


 Q206: I need a timed mutex for POSIX  

> I am doing multi-platform development, and have got several very successful
> servers running on NT and on AIX. The problem is that NT is MUCH more
> efficient in its MUTEX calls than AIX, because the POSIX mutex call, int
> pthread_mutex_lock (mutex), does not have a timeout. For that reason I need
> to loop, doing a pthread_mutex_trylock (mutex) and a 20 millisecond sleep
> until timeout (usually 5 seconds).


Or, more specifically, exactly what do you intend to do when the loop times
out?

   * Which thread owns the mutex? (No way to tell, without additional
     information that cannot be used reliably except under control of a mutex;
     and you've already declared that, in your application, the mutex usage
     protocol is unreliable.)
   * What is that thread doing? Is it hung? Broken? Did it get preempted and
     miss a deadline, but is "still ticking"? Unless you know that (not
     impossible, but EXTREMELY difficult to implement, much less to get
     right), you CANNOT "steal" the mutex, or know what to do once you've got
     it.
   * You cannot force the owner of the mutex to unlock. You cannot unlock from
     your current thread. You can't assume you now own it. If you knew the
     owner, you could cancel it and join with it (as long as you know nobody
     else is already joining with it), hoping that "it's broken but not TOO
     broken". But then what happens if it doesn't terminate, or if it's
     sufficiently broken that it doesn't release the mutex on the way out?

This is the kind of thing that may sound "way cool" for reliable, fail-safe
servers. In practice, I doubt the value. That kind of fail-safety is almost
always complete illusion except in rigorously isolated embedded system
environments. And in such an environment, it's trivial to write your own
pthread_mutex_timedwait() or work out some alternate (and probably better)
method to recover your runaway state.

In a fully shared memory multithreaded server, when something's "gone wrong"
and you lose control (and that's what we're talking about), the ONLY safe
thing to do is to panic and crash the process, NOW. You can run the server
under a monitor parent that recognizes server exit and forks a new copy to
continue operation. You can keep operation logs to recover or roll back. But
you cannot make the process "fail safe".

> The problem is this is inefficient. NT has a Wait_for_MUTEX with timeout.
> this is good.
> (bummer, Bill got it right :-(   )

No. Just another misleading and overly complicated function that looks
neato-keen on paper. Any code that really, truly DEPENDS on such a capability
is already busted, and just doesn't know it yet.

(Oh, and, yes, I say this with the explicit knowledge that all generalizations
are false, including this one. There is certainly code that doesn't need to be
100% fail safe, and that may be able to productively use such primitives as a
timed mutex wait to slightly improve some failure modes. Maybe, in a very few
cases, maybe even yours, all of the time and effort that went into it provides
some real benefit. "The one absolute statement I might make is that none of my
statements are absolute." ;-) )

You can put together a "timed mutex" yourself, if you want, using a mutex and
a condition variable. Use the mutex to serialize access to control
information, such as your own ownership and waiter data, and use a condition
variable to wait for access. A waiter that times out can then determine which
thread (in your APPLICATION scheme) owns the "mutex". Of course, if the
application is really ill-behaved, then even the "control mutex" might not be
unlocked -- I doubt you could do much in that case, anyway.
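
A minimal sketch of that construction, assuming nothing beyond POSIX 1003.1c
plus clock_gettime(). The names struct timed_lock, timed_lock_init(),
timed_lock_acquire() and timed_lock_release() are invented for this example;
the "owner" field is the application-level ownership data mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical "timed mutex" built from a mutex and a condition variable. */
struct timed_lock {
    pthread_mutex_t ctl;    /* control mutex: protects the fields below */
    pthread_cond_t  freed;  /* signalled whenever the lock is released */
    int             held;   /* nonzero while some thread owns the lock */
    pthread_t       owner;  /* meaningful only while held is nonzero */
};

static int timed_lock_init(struct timed_lock *l)
{
    int rc = pthread_mutex_init(&l->ctl, NULL);
    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&l->freed, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&l->ctl);
        return rc;
    }
    l->held = 0;
    return 0;
}

/* Returns 0 on success, ETIMEDOUT if the lock stayed busy for timeout_ms. */
static int timed_lock_acquire(struct timed_lock *l, long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* pthread_cond_timedwait() wants an absolute time. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&l->ctl);
    while (l->held && rc == 0)            /* loop also absorbs spurious wakeups */
        rc = pthread_cond_timedwait(&l->freed, &l->ctl, &deadline);
    if (!l->held) {                       /* free now, even if we also timed out */
        l->held = 1;
        l->owner = pthread_self();
        rc = 0;
    }
    pthread_mutex_unlock(&l->ctl);
    return rc;
}

/* Returns 0 on success, EPERM if the caller is not the recorded owner. */
static int timed_lock_release(struct timed_lock *l)
{
    int rc = 0;

    pthread_mutex_lock(&l->ctl);
    if (!l->held || !pthread_equal(l->owner, pthread_self())) {
        rc = EPERM;
    } else {
        l->held = 0;
        pthread_cond_signal(&l->freed);
    }
    pthread_mutex_unlock(&l->ctl);
    return rc;
}
```

A waiter that gets ETIMEDOUT could lock ctl again and inspect owner to find out
who holds the lock. None of this answers the harder question raised above: what
the waiter can safely do once it knows.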

One final note. As I said, such "unusual" things as timed mutex waits CAN make
sense for carefully coded embedded application environments, and the folks in
the POSIX realtime working group worry about that sort of thing a lot. While
the concept of timed mutex waits was passed over for POSIX 1003.1c-1995 as too
specialized, the "additional realtime features" standard, 1003.1d, (still in
draft form), adds pthread_mutex_timedwait.
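
For readers on later systems: the timed mutex wait was, as far as I know,
eventually standardized under the name pthread_mutex_timedlock(). A small
wrapper might look like the sketch below. The helper name try_lock_for() and
its return convention are invented here, and a system that provides
pthread_mutex_timedlock() is assumed (not all do).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: returns 1 if the mutex was acquired,
   0 on timeout, -1 on any other error. */
static int try_lock_for(pthread_mutex_t *m, long timeout_ms)
{
    struct timespec deadline;

    /* The deadline is an absolute CLOCK_REALTIME time, not an interval. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return -1;
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec  += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    int rc = pthread_mutex_timedlock(m, &deadline);
    if (rc == 0)
        return 1;
    return rc == ETIMEDOUT ? 0 : -1;
}
```

This replaces the trylock-and-sleep loop from the question, but everything said
above about what to do after the timeout still applies.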

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

[If you *really* need a timed mutex, you can look at the sample code for
timed mutexes on this web page -- Bil]

 Q207: Does pthreads has an API for configuring the number of LWPs?  

"Hany Morcos (CS)" wrote:

>    Hi, does pthreads has an API for configuring the number of
> lwps for a specific sets of threads.  Or does most OS's assign
> an lwp per a group of thread.

The short answer: PThreads, no.  But UNIX98 includes a
pthread_setconcurrency() extension to the POSIX thread API.

The long answer:

First, "LWP" is a Solaris-specific (actually, "UI thread" specific, but
who cares?) term for a kernel thread used to allow a threaded process to
exploit O/S concurrency and hardware parallelism. So "most OS"s don't
have LWPs, though they do have some form of kernel threads.

(Note, this is all probably more than you want or need, but your
question is rather "fuzzy", I tend to prefer to give "too much"
information rather than "not enough", and for some reason I appear to be
in a "talkative" mood... ;-) )

POSIX 1003.1c-1995 ("pthreads") deliberately says very little about
implementation details, and provides few interfaces specifically to
control details of an implementation. It does allow for a two-level
scheduler, where multiple kernel threads and POSIX threads interact
within a process, but provides only broad definitions of the behavior.
There is no way to directly control the scheduling of PCS threads
("Process Contention Scope", or "user mode") onto "kernel execution
entities" (kernel threads). Although there is a mechanism to avoid
user-mode scheduling entirely, by creating SCS ("System
Contention Scope") threads, which must be directly scheduled by the
kernel. (Or at least must behave as if so scheduled, with respect to
threads in other processes.)

There's no form of "thread grouping" supported. Some systems have class
scheduling systems that allow you to specify relations between threads
and/or processes, but there's nothing of the sort in POSIX. (Nor, if
there were, would it necessarily group threads to an LWP as you suggest.)

POSIX threads requires that a thread blocking for I/O cannot
indefinitely prevent other user threads from making progress. In some
cases, this may require that the implementation provide a new kernel
execution entity. It can do so either as a "last ditch" effort to
prevent completely stalling the process (as Solaris generally does, by
creating one additional LWP as the last current LWP in the process
blocks), or as a normal scheduling operation (as Digital UNIX does) to
always maintain a consistent level of PCS thread concurrency in the
process. (While I prefer the latter, and experience has shown that this
is what most people expect and desire, POSIX doesn't say either is right
or wrong; and in addition, there are costs to our approach that aren't
always repaid by the increased concurrency.)

UI threads was designed to allow/require the programmer to control the
level of process concurrency, and Sun's POSIX thread implementation uses
the same thread scheduler as their UI thread implementation. While the
"last ditch" LWP creation prevents indefinite stalls of I/O-bound
applications, it doesn't help applications with multiple compute-bound
threads (the implementation doesn't time-slice PCS threads). And, at
best, the model allows the process concurrency to be reduced to 1 before
offering any help. (Digital UNIX does time-slice PCS threads, so
compute-bound threads can coexist even on a uniprocessor [though this
isn't the most efficient application model, it's common and worth
supporting].) UI threads provides a thr_setconcurrency() call to allow a
careful programmer to dynamically "suggest" that additional LWPs would
be useful.

Due to Sun influence (and various other vendors who had intended
similarly inflexible 2-level schedulers), the Single UNIX Specification,
Version 2 (UNIX98) includes a pthread_setconcurrency() extension to the
POSIX thread API. Due to increasing cooperation between The Open Group
and PASC (the IEEE group that does POSIX), you can expect to see the
UNIX98 extensions appear in a future version of the POSIX standard. Note
that while this function is essential on Solaris, it has no purpose (and
does nothing) on Digital UNIX, (or on Linux, which uses only kernel
threads). I expect other vendors to move away from hints like
pthread_setconcurrency() as they (and their users) get more experience
with threading. The need for such hackery is largely responsible for the
unsettlingly common advice of UI thread wizards to avoid the Solaris
default of PCS threads ("unbound", in UI terminology) and to use SCS
threads ("bound") instead.
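
As an illustration of the SCS ("bound") route, here is a small sketch that
creates a thread with system contention scope through the standard attribute
interface. The helper names create_scs_thread() and echo_worker() are invented
for this example. On a system that only has kernel threads the scope request
changes nothing, and an implementation is allowed to refuse a scope it does
not support.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>

/* Hypothetical trivial start routine, used only for illustration. */
static void *echo_worker(void *arg)
{
    return arg;
}

/* Create a thread that is scheduled directly by the kernel
   (system contention scope). Returns 0 or an error number. */
static int create_scs_thread(pthread_t *tid, void *(*start)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(tid, &attr, start, arg);
    pthread_attr_destroy(&attr);
    return rc;
}
```

The alternative discussed above, keeping PCS threads and calling
pthread_setconcurrency() (or thr_setconcurrency()) as a hint, is a single call
and is not shown here.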

In some ways this is much like the old Win32 vs. Mac OS debate on
preemptive vs. cooperative multitasking. While cooperative multitasking
(or the simplistic/efficient Solaris 2-level scheduling) can be much
better for some class of applications, it's a lot harder to write
programs that scale well and that work the way users expect with
(unpredictable) concurrent system load. While preemptive multitasking
(or tightly integrated 2-level scheduling) adds (system) implementation
complexity and some unavoidable application overhead, it's easier to
program for, and, ultimately, provides more predictable system scaling
and user environment.

>    Wouldn't make more sense if one lwp blocks for a disk I/O
> instead of the entire program, when using green threads.

"Green threads" is the user-level threading package for Java. It doesn't
use multiple kernel threads, and therefore cannot use hardware
parallelism or true I/O concurrency (although it has hooks to use
non-blocking UNIX I/O to, in many cases, schedule a new user thread
while waiting for I/O).

Modern implementations of Java should use native threads rather than
Green threads. In the case of a Solaris Java using UI threads or POSIX
threads rather than Green threads, disk I/O WOULD block only the LWP
assigned to the calling thread. There's no reason to be using Green
threads on any O/S that has real thread support!

>    I guess now it is very safe for multiple threads to directly
> write to a stream queue since write and read are thread safe.

Java I/O must be thread-safe. ANSI C I/O under a POSIX thread
implementation must be thread-safe. However, there's no standard
anywhere requiring that C++ I/O must be thread-safe -- nor is there for
most other languages. So you need to watch out HOW you write. If you're
writing Java or C, you're probably pretty safe. In any other language,
watch out unless you're using ANSI C/POSIX I/O functions directly.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q208: Why does Pthreads use void** rather than void*?  

Ben Elliston wrote:

> Wang Bin  writes:
> >     Today, when I was looking at thr_join(thread_t, thread_t*, void**),
> > I was suddenly confused by void* and void**. Why the third parameter
> > here is void** rather than void*?
> The third parameter is a void * so that the result can be anything you
> like--it's a matter of interpretation.  However, you need to pass the
> address of a void * so that the function can modify the pointer.

The POSIX thread working group wanted to specify a way to pass "any value"
to and from a thread, without making the interface really bulky and
complicated. The chosen way (it's often been debated whether the approach
was right, or good, but that's all irrelevant now) was to use "void*" as a
universal (untyped) value. It's NOT necessarily a pointer (though of course
it may be)... it's just an untyped value. The UI thread interface (defined
by many of the same people) has the same logic.

So when you create a thread, you pass in a "void*" argument, which is
anything you want. When a thread terminates, either by returning from its
start routine or by calling pthread_exit (or thr_exit), it can specify a
"void*" return value. When you join with the thread, you can pass the
function a POINTER to some storage that will receive this thread return
value. The storage to which you point must, of course, be a "void*".

Beware ("be very, very ware", as Pooh was once warned), because this
mechanism, while often convenient, is not at all type-safe. It's really easy
to get yourself into trouble. Do not, EVER pass the address of something
that's not "void*" into thr_join/pthread_join and simply cast the pointer to
(void**). For example, let's look at

     size_t TaskStatus;
     thr_join(..., ..., (void**)&TaskStatus);

(This is slightly different from Ben's example. He cast the pointer to
"void*"... that'll work, since ANSI C is willing to implicitly convert
between any pointer type and "void*", but the parameter type is actually
"void**".)

What is the SIZE of size_t? Well, on a conventional 32-bit system, size_t and
"void*" are probably both 32 bits. On a conventional 64-bit LLP system,
they're probably both 64 bits. But ANSI C doesn't require that conformity.
So what if size_t is a 32-bit "int", while "void*" is 64-bit? Well, now
you've got 32 bits of storage, and you're telling the thread library to
write 64 bits of data at that address. You've also told the compiler that
you really, really, for sure know what you're doing. But you really don't,
do you?

The construct is extremely common, but it's also extremely dangerous, wrong,
and completely non-portable! Do it like the following example, instead. It's
a little more complicated, but it's portable, and that may save you a lot of
trouble somewhere. Your compiler might warn you that size_t is smaller than
void*, in cases where you might have otherwise experienced data corruption
by overwriting adjacent storage. If the original value passed by the thread
really WAS a size_t, the extra bits of the void* would have to be 0, [or
redundant sign bits if "size_t" is signed and the value is negative], and
losing them won't hurt you.

     void *result;
     size_t TaskStatus;
     thr_join(..., ..., &result);
     TaskStatus = (size_t)result;
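
For completeness, here is the same pattern as a self-contained sketch using
pthread_create() and pthread_join() instead of the UI calls. The names
double_task() and run_task() are made up for the example, and the casts go
through uintptr_t, which assumes a C99 <stdint.h> that provides it.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical start routine: treats its void* argument as a plain
   number (not a pointer) and returns twice that number the same way. */
static void *double_task(void *arg)
{
    size_t n = (size_t)(uintptr_t)arg;
    return (void *)(uintptr_t)(n * 2);
}

/* Runs double_task() in a thread and stores its result in *status.
   Returns 0 on success, or the pthread error number. */
static int run_task(size_t input, size_t *status)
{
    pthread_t tid;
    void *result;                /* join always writes a full void* here */
    int rc = pthread_create(&tid, NULL, double_task, (void *)(uintptr_t)input);

    if (rc != 0)
        return rc;
    rc = pthread_join(tid, &result);
    if (rc != 0)
        return rc;
    *status = (size_t)(uintptr_t)result;   /* narrow only after the join */
    return 0;
}
```

The point is the same as above: the library writes into a real void*, and the
conversion to the narrower type happens in your own code, where the compiler
can see it.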

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q209: Should I use poll() or select()?  

W. Richard Stevens writes:

>Second, I used to advocate select() instead of poll(), mainly because
>of portability, but these days select() is becoming a problem for
>applications that need *lots* of descriptors.  Some systems let you
>#define FD_SETSIZE to use more than their compiled-in limit (often
>256 or 1024), some require a recompile of the kernel for more than
>the default, and some require a recompile of the library function
>named select().  For these applications poll() is better, as there
>is no inherent limit (other than the per-process descriptor limit).


>Another complaint I had against poll() was how hard it was to remove
>a descriptor from the array (typical for network servers when a client
>terminates), but now you just set the descriptor to -1 and it's ignored.

But that was fixed a long, long time ago.

Which brings me to another advantage of poll(): you just specify the
events you are interested in once; poll uses a different field for the
result events.  (So no resetting of bits in the select masks).

Also, on Solaris, select() is a library routine implemented on top
of poll(); that costs too.  (Though on other systems it might be the


 Q210: Where is the threads standard of POSIX ????  


 Q211: Is Solaris' unbound thread model braindamaged?  

"Doug Royer [N6AAW]" wrote:

> Did you have a specific braindamaged bug to report?
> Boris Goldberg writes:
> >
> > I briefly browsed Solaris 7 docs at and, regrettably,
> > it doesn't appear that they changed their braindamaged threading model.

Actually, I think Boris phrased that very well. In particular, he didn't use the
word "bug". He merely said "braindamaged". One might easily infer, (as I have),
that he's making the assumption that the "braindamaged" behavior is intentional,
and simply expressing regret that the intent hasn't changed.

Here are a few of the common problems with Solaris 2-level threading. I believe
one of them could accurately be described as a "bug" in Solaris (and that's not
confirmed). The others are merely poor design decisions. Or, in common terms,
"brain damage".

  1. thr_setconcurrency() is a gross hack to avoid implementing most of the
     2-level scheduler. It means the scheduler puts responsibility for
     maintaining concurrency on the programmer. Nice for the Solaris thread
     subsystem maintainers -- not so nice for users. (Yes, UNIX has a long and
     distinguished history of avoiding kernel/system problems by complicating
     the life of all programmers. Not all of those decisions are even wrong.
     Still, I think this one is unnecessary and unjustifiable.)
  2. Rumor has suggested that Solaris creates one LWP by default even on SMP
     systems -- if that rumor is true, this condition might shade over the line
     into "true bug". But then, having an SMP isn't necessarily the same as
     being able to use it, so maybe that's deliberate, too.
  3. Blocking an LWP reduces the process concurrency. Yeah, sure the library
     will create a new one when the last LWP blocks, but that's not good. First,
     it means the process has been operating on fewer cylinders than it might
     think for some period of time. And, in many cases even worse, after the
     LWPs unblock, it will be operating on more cylinders than it can sustain
     until the LWPs go idle and time out. Running with more LWPs than processors
     is rarely a good idea unless most of them will always be blocked in the
     kernel. (I've heard unsubstantiated rumors that 2.6 did some work to
     improve on this, and 7 may do more; but I'm not inclined to let anyone "off
     the hook" without details.)
  4. While timeslicing is not required by POSIX, it is the scheduling behavior
     all UNIX programmers (and most who are used to other systems, as well)
     EXPECT. The lack of timeslicing in Solaris 2-level scheduling is a constant
     source of complication and surprise to programmers. Again, this isn't a
     bug, because it's clearly intentional; it's still a bad idea, and goes
     against the best interests of application programmers.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/


 Q212:  Releasing a mutex locked (owned) by another thread.  
Zoom wrote:

> Hello, I have inherited the maintenance of a multi-threaded application.
> The application uses pthreads and runs on multiple platforms including
> solaris. On solaris it seems to be somewhat squirrely (the technical
> term of course :-) and I get random core dumps or thread panics.
> Absolutely not consistently reproducible. Sometimes it will go for
> hours or days cranking away and sometimes it will thread panic shortly
> after it starts up. In researching the book "Multi-threaded
> Programming with Pthreads" by Bil Lewis et. al. I found on page 50 the
> statement to the effect that under posix it is illegal for one thread to
> release a mutex locked (owned) by another thread. Well, this application
> does that. In fact it does it quite extensively.
> Is there anyone willing to commit to the idea that this may be the
> source of the application's problems?

The answer is an absolutely, definite, unqualified "maybe". It depends
entirely on what the application is doing with those mutexes.

First, I want to be completely clear about this. Make no mistake, locking a
mutex from one thread and unlocking it from another thread is absolutely
illegal and incorrect. The application is seriously broken, and must be
fixed.

However, reality is a little more complicated than that. POSIX explicitly
requires that application programmers write correct applications. More
specifically, should someone write an incorrect application, it explicitly
and deliberately does NOT require that a correct implementation of the
POSIX standard either DETECT that error, or FAIL due to that error. The
results of programmer errors are "undefined". (This is the basis of the
POSIX standard wording on error returns -- there are "if occurs" errors,
which represent conditions that the programmer cannot reasonably
anticipate, such as insufficient resources; and there are "if detected"
errors, which are programmer errors that are not the responsibility of the
implementation. A friendly/robust implementation may choose to detect and
report some or all of the "if detected" errors -- but even when it fails to
detect the error, it's still the application's fault.)

The principal difference between a binary semaphore and a mutex is that a
mutex carries with it the concept of "ownership". It is that characteristic
that makes it illegal to unlock the mutex from another thread. The locking
thread OWNS the mutex, exclusively, until it unlocks the mutex. IF an
implementation can (and chooses to) detect and report violations of the
ownership protocol, the erroneous attempt at unlock will result in an EPERM
return. However, this is a programmer error. It is often unreasonably
expensive to keep track of which thread owns a mutex: an instruction (or
kernel call) to determine the identity of the locking thread may take far
longer than the basic lock operation. And of course it would be equally
expensive to check for ownership during unlock.

Many implementations of POSIX threads, therefore, do not record, or check,
mutex ownership. However, because it's a mutex, it IS owned, even if the
ownership isn't recorded. The next patch to your operating system might add
checking, or it might be possible to run threaded applications in a
heavyweight debug environment where mutex ownership is recorded and
checked... and the erroneous code will break the application. It'll be the
application's (well, the application developer's) fault.

Anyway, IF the implementation you're using really doesn't record or check
ownership of mutexes. And IF that illegal unlock is done as part of a
carefully managed "handoff" protocol so that there's no chance that the
owner actually needs the mutex for anything. (And, of course, if this
bizarre and illegal protocol is actually "correct" and consistent.) THEN,
your application should work despite the inherent illegality.

You could switch to a binary semaphore, and do the same thing without the
illegality. The application still won't WORK if you're releasing a lock
that's actually in use.
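
A minimal sketch of the semaphore version, assuming POSIX unnamed semaphores
(sem_init) are available. The names struct handoff, releaser() and
handoff_demo() are invented for this example, and it shows only the legal
"acquire in one thread, release in another" part, not a complete handoff
protocol.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

/* Hypothetical shared state: a binary semaphore used as a token. */
struct handoff {
    sem_t token;   /* 1 = available, 0 = taken */
    int   value;   /* data passed along with the token */
};

/* sem_wait() can be interrupted by a signal; retry in that case. */
static void sem_wait_retry(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR)
        ;
}

/* Runs in the second thread: finishes the work and releases a token
   that it never acquired. Legal for a semaphore, illegal for a mutex. */
static void *releaser(void *arg)
{
    struct handoff *h = arg;

    h->value = 42;
    sem_post(&h->token);
    return NULL;
}

/* Takes the token here, lets another thread release it, takes it again.
   Returns the value written by the other thread, or -1 on setup failure. */
static int handoff_demo(void)
{
    struct handoff h = { .value = 0 };
    pthread_t tid;
    int v;

    if (sem_init(&h.token, 0, 1) != 0)
        return -1;
    sem_wait_retry(&h.token);              /* "lock" in this thread */
    if (pthread_create(&tid, NULL, releaser, &h) != 0) {
        sem_destroy(&h.token);
        return -1;
    }
    sem_wait_retry(&h.token);              /* blocks until releaser() posts */
    v = h.value;
    sem_post(&h.token);
    pthread_join(tid, NULL);
    sem_destroy(&h.token);
    return v;
}
```

Note that the semaphore has no notion of ownership, so nothing stops a buggy
thread from posting at the wrong time. That is exactly the caveat above.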

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q213:  Any advice on using gethostbyname_r() in a portable manner?  

>>>>> "Tony" == Tony Gale  writes:

    Tony> Anyone got any advice on using gethostbyname_r in a portable
    Tony> manner?  Its definition is completely different on the
    Tony> three systems I have looked at. Autoconf rules would be nice
    Tony> :-)

Sorry, no autoconf rules.  Here's what I did in a similar situation:

    struct hostent *hentp = NULL;
    int herrno;
    uint32 ipnum = (uint32)-1;
#if defined(__GLIBC__)
    /* Linux, others if they are using GNU libc.  We could also use Posix.1g
       getaddrinfo(), which should eventually be more portable and is easier to
       use in a mixed IPv4/IPv6 environment. */
    struct hostent hent;
    char hbuf[8192];
    /* glibc returns 0 on success and a nonzero error number on failure. */
    if (gethostbyname_r(hostname, &hent,
                        hbuf, sizeof hbuf, &hentp, &herrno) != 0) {
        hentp = NULL;
    }
#elif defined(sun)
    /* Solaris 2.[456]. */
    struct hostent hent;
    char hbuf[8192];
    hentp = gethostbyname_r(hostname, &hent, hbuf, sizeof hbuf, &herrno);
#elif defined(__osf__)
    /* On Digital Unix 4.0 plain gethostbyname is thread-safe because it uses
       thread specific data (and a h_errno macro).  HPUX is rumoured to use
       this method as well.  This will go wrong on Digital Unix 3.2, but this
       whole file is not going to compile there anyway because version 3.2 has
       DCE threads instead of Posix threads. */
    hentp = gethostbyname(hostname);
    herrno = h_errno;
#else
#error I do not know how to do reentrant hostname lookups on this system
#endif
    if (hentp == NULL) {
        /* Digital Unix doesn't seem to have hstrerror :-(. */
        hmddns_logerror("gethostbyname(%s): %d", hostname, herrno);
    } else {
        memcpy(&ipnum, hentp->h_addr, sizeof ipnum);
    }
    return ipnum;
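
The comment above mentions getaddrinfo() as the more portable route. A rough
sketch of the same lookup with that interface follows. The function name
lookup_ipv4() is invented here, the result convention (network byte order,
all-ones on failure) is borrowed from the fragment above, and a system that
provides getaddrinfo() is assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: returns the first IPv4 address of hostname in
   network byte order, or (uint32_t)-1 if the lookup fails. */
static uint32_t lookup_ipv4(const char *hostname)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;
    uint32_t ipnum = (uint32_t)-1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* IPv4 only, to match the code above */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one result per socket type */

    if (getaddrinfo(hostname, NULL, &hints, &res) != 0)
        return ipnum;
    if (res != NULL) {
        struct sockaddr_in sin;
        /* Copy out rather than cast, to stay clear of alignment issues. */
        memcpy(&sin, res->ai_addr, sizeof sin);
        ipnum = sin.sin_addr.s_addr;
    }
    freeaddrinfo(res);                /* results are caller-owned: no static data */
    return ipnum;
}
```

Because the result list is allocated per call and freed by the caller, there is
no shared static buffer to protect, which is what the various gethostbyname_r()
signatures were working around.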


From: David Arnold 

we're using this ...

AC_CHECK_FUNC(gethostbyname_r, [
  AC_MSG_CHECKING([gethostbyname_r with 6 args])
  AC_TRY_COMPILE([
#   include <netdb.h>
  ], [
    char *name;
    struct hostent *he, *res;
    char buffer[2048];
    int buflen = 2048;
    int h_errnop;

    (void) gethostbyname_r(name, he, buffer, buflen, &res, &h_errnop)
  ], [
    AC_MSG_RESULT(yes)
  ], [
    AC_MSG_RESULT(no)
    AC_MSG_CHECKING([gethostbyname_r with 5 args])
    AC_TRY_COMPILE([
#     include <netdb.h>
    ], [
      char *name;
      struct hostent *he;
      char buffer[2048];
      int buflen = 2048;
      int h_errnop;

      (void) gethostbyname_r(name, he, buffer, buflen, &h_errnop)
    ], [
      AC_MSG_RESULT(yes)
    ], [
      AC_MSG_RESULT(no)
      AC_MSG_CHECKING([gethostbyname_r with 3 args])
      AC_TRY_COMPILE([
#       include <netdb.h>
      ], [
        char *name;
        struct hostent *he;
        struct hostent_data data;

        (void) gethostbyname_r(name, he, &data);
      ], [
        AC_MSG_RESULT(yes)
      ], [
        AC_MSG_RESULT(no)
      ])
    ])
  ])
], [
])

> Whom do I shoot?

take your pick :-(

-- David Arnold
CRC for Distributed Systems Technology         +617 33654311   (fax)
University of Queensland
Australia

 Q214: Passing file descriptors when exec'ing a program.  

Jeff Garzik wrote:

> My MT program must send data to the stdin of multiple processes.
> It also needs to read from the stdout of those _same_ processes.
> How can this be done?

Use the dup() function to save your parent stdin and stdout (if needed).
For each child process do:
   create two pipe()'s
   close stdin
   dup() one end of the first pipe
   close stdout
   dup() the other end of the second pipe
   fork() and exec the child, which inherits the new stdin and stdout
   close unused ends of pipes
   save the pipe fd's for later use
Restore parent's stdin and stdout (if needed).
Add the pipe fd's to an fd_set.
Use a select() call to detect when output from a child is available on a pipe.

From quick Web search for examples:

A book? Hard to be a top notch Unix programmer without this one on your
shelf:

  Advanced Programming in the Unix Environment
  W. Richard Stevens, Addison-Wesley Publishing
  ISBN 0-201-56317-7

Good luck!

% use the dup() function to save your parent stdin and stdout (if needed).

Good suggestion, although I'd suggest using dup2() to replace stdin and
stdout with the pipe ends.  If you do this, you have to be careful about
any code that uses stdin and stdout in the rest of your program -- you've
got to be sure you never try to use these handles while they're being set
up for the child process.
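
One way to sidestep that problem is to do the dup2() calls in the child, after
fork(), so the parent's own stdin and stdout are never touched. A rough sketch,
with invented names (struct child, spawn_filter(), wait_child()) and with
argv[0] assumed to be a full path because execv() does not search PATH:

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for one child process. */
struct child {
    pid_t pid;
    int   to_child;    /* write here to feed the child's stdin */
    int   from_child;  /* read here to collect the child's stdout */
};

/* Start argv[0] with its stdin and stdout connected to two pipes.
   Returns 0 on success, -1 on failure. */
static int spawn_filter(struct child *c, char *const argv[])
{
    int in[2], out[2];
    pid_t pid;

    if (pipe(in) < 0)
        return -1;
    if (pipe(out) < 0) {
        close(in[0]); close(in[1]);
        return -1;
    }
    pid = fork();
    if (pid < 0) {
        close(in[0]); close(in[1]); close(out[0]); close(out[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: only async-signal-safe calls between fork() and exec,
           since the parent may be multithreaded. */
        dup2(in[0], STDIN_FILENO);
        dup2(out[1], STDOUT_FILENO);
        close(in[0]); close(in[1]); close(out[0]); close(out[1]);
        execv(argv[0], argv);
        _exit(127);                       /* exec failed */
    }
    /* Parent: keep only the ends it will use. */
    close(in[0]);
    close(out[1]);
    c->pid = pid;
    c->to_child = in[1];
    c->from_child = out[0];
    return 0;
}

/* Reap the child. Returns its exit status, or -1 if it did not exit normally. */
static int wait_child(const struct child *c)
{
    int status;

    if (waitpid(c->pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With several children, the from_child descriptors go into the fd_set (or pollfd
array) described earlier. One thing this sketch does not handle: a child
started later inherits the parent's pipe ends for earlier children unless those
descriptors are marked close-on-exec.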

Patrick TJ McPhee
East York  Canada

 Q215:  Thread ID of thread getting stack overflow?   
Kurt Berg wrote:

> We are seeking a PORTABLE way of getting the thread ID
> of a thread experiencing a stack overflow.  We have to do
> some post processing to try to determine, given the thread
> ID, what sort of thing to do.
> It is our understanding that pthread_self is NOT "async
> signal safe".
> Thanks in advance.

Umm, as I mentioned in my reply to your email, once you buy into the
concept of doing "portable" things in a signal handler (which represents
a serious ERROR within the process), you're climbing a steep slope with
no equipment. Your fortune cookie says that a disastrous fall is in your
future.

I also commented that, although pthread_self isn't required by the
standard to be async-signal safe, it probably IS, (or, "close enough"),
on most platforms. And in a program that's done something "naughty" and
unpredictable, asynchronously, to its address space, that's as good as
you're going to get regardless of any standard.

However, you neglected to mention in your email that the SIGSEGV you
wanted to handle was a stack overflow. Now this leads to all sorts of
interesting little dilemmas that bring to mind, (among other things),
Steven Wright's famous line "You can't have everything: where would you
put it?" (Actually, the answer is that if you had everything, you could
leave it right where it was, but that's beside the point.) Your system
is telling you that you've got no stack left. While some systems might
support a per-thread alternate signal stack, that's not required by the
standards (and, in any case, it's kinda expensive since you need to
allocate an alternate stack for each thread you create). So... you've
used all your stack, and you want to handle the error. On what? The
stack you've used up? Or the other stack that you can't even designate?

Sure, on SOME systems, you may be able to determine (at least sometimes)
that you're "near the end" of the stack, before you're actually there.
The Alpha calling standard, for example, requires the compiler to
"probe" the stack before changing the stack pointer to allocate a new
frame. Thus, if the probe generates a SIGSEGV, you've still got whatever
it was you were trying to allocate. MAYBE that's enough to run a signal
handler.

Unfortunately, "maybe", "sometimes", and "some systems" are not words
that contribute to a portable solution.

The answer is that you're as out of luck as your thread (even if you
still have stack). What you want to do is DEBUGGING... so leave it to
the debugger. Make sure that SIGSEGV is set to SIG_DFL. Let the ailing
process pass away peacefully, and analyze the core file afterward. (And
if you're faced with a system that doesn't support analysis of a
threaded process core file, then find a patch... or turn around and face
another system.)

And if you're just trying to leave a log entry to later trace the
failure of some reliable networking system, remember that a thread ID is
transient and local. It means absolutely nothing within another process,
or even at some other time within the same process. Why would you want
to log it? Without the core file, the information is useless; and with
the core file, it's redundant.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q216:   Why aren't my (p)threads preempted?  
Lee Jung Wooc wrote:

> I have my opinion and question.
> IMO, the cases that showed up "thread 2" is not by cpu preemption, but by
> normal scheduling. The printf() call in thread function induces a system call
> write( ). but the printf is a library function and will not lose cpu until
> some amount of bytes are stored in the buffer and fflush () is  called. The
> fflush() calls write() then switching occurs. Library buffer size may
> influences when the switching occurs and also the second pthread_create call
> may switch cpu to the first thread.

A thread MAY be timesliced at any time, on any system that supports
timeslicing. As I said, while SCHED_RR requires timeslicing, SCHED_OTHER does
not prohibit timeslicing. (Only SCHED_FIFO prohibits timeslicing.)

In addition to system calls, a thread might block on synchronization within the
process. For example, the buffer written to by printf() is shared among all
threads using stdout, and has to be synchronized in some fashion. Usually, that
means a mutex (or possibly a semaphore). If multiple threads use printf()
simultaneously (and, especially with one of the predefined file streams, like
stdout, it needn't be used by any of YOUR threads to have simultaneous access),
one of them may BLOCK attempting to acquire the synchronization object. That
would result in a context switch.

> I'm assuming the write() call , known to be none-blocking in normal cases,
> can lead to switching.  IMHO, the word none-blocking means that the calling
> context is not scheduled after a context(thread or process, whatever) which
> is assumed to be waitng for an event infinitively.

That's "non-blocking", not "none-blocking". (I mean no disrespect for your
English, which is far better than I could manage in your language, but while I
can easily ignore many "foreign speaker" errors, this one, maybe especially
because you chose to define it, stood out and made me uncomfortable.)

> Is my assumtion correct ?

I'm afraid I can't make much sense of your second sentence.

DIGRESSION ALERT (including slings and arrows not specifically targeted to, nor
especially deserved by, the person who wrote the quoted phrase): I find it
difficult to read anything that starts with "IMHO", an abbreviation that I
despise, and which is almost always hypocritical because an opinion one takes
such care to point out is almost never intended to be "humble". It's quite
sufficient to simply declare your opinion. I, and all other responsible
readers, will assume that EVERYTHING we read is in fact the author's opinion,
except (perhaps) when the author specifically claims a statement to be
something else. And even then we'll question whether the author has in fact the
authority and knowledge to make such a claim. In the rare cases where it might
actually be useful to explain that your opinion is your opinion, you might
simply say so without the cloying abbreviation.

With that out of the way, where were we? Oh yes, write().

It's true that most traditional UNIX systems have a bug by which I/O to a file
oriented device is not considered capable of "blocking". That's unfortunate.
It's particularly unfortunate in a threaded world, because some other thread
might be capable of doing a lot of work even while an I/O works its way through
to the unified buffer cache; much less if the data must actually be written to
a remote NFS file system. In any case, this does not usually apply to other
types of file system. If stdout is directed to a terminal, or to a network
socket, the printf()'s occasional write() SHOULD result in blocking the thread.
The write() syscall IS, technically, a "blocking function", despite the fact
that some calls to it might not block. Being a "blocking function" does not
necessarily require that every call to that function block the calling thread.

> As far as, I know threre's no implemention of SCHED_RR in major unix
> distributions. Neither I think the feature is on definite demand.

I believe that Linux implements SCHED_RR fully. I know that Digital UNIX does,
and always has. I have some reason to believe that Solaris 7 implements
SCHED_RR, and I suspect that AIX 4.3 does as well. I'd be surprised if IRIX
(known for its realtime support) didn't support SCHED_RR (not that I haven't
already been surprised by such things). I don't have any idea whether HP-UX
11.0 has SCHED_RR... perhaps that's the "major unix distribution" you're
talking about?
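
Rather than guessing which systems have SCHED_RR, a program can ask. The
sketch below is only an illustration (init_rr_attr is a hypothetical
helper name): it prepares a thread attributes object that requests
SCHED_RR at the lowest round-robin priority. Even where the policy is
supported, pthread_create() with these attributes may still fail, often
with EPERM, if the process lacks the privilege to use realtime policies.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Prepare *attr so that a thread created with it asks for SCHED_RR at the
 * lowest round-robin priority. Returns 0 on success or an error number;
 * ENOTSUP here means the system does not offer SCHED_RR at all. */
int init_rr_attr(pthread_attr_t *attr)
{
    struct sched_param param;
    int prio = sched_get_priority_min(SCHED_RR);
    int rc;

    if (prio == -1)
        return ENOTSUP;

    if ((rc = pthread_attr_init(attr)) != 0)
        return rc;

    /* Without EXPLICIT_SCHED the new thread inherits the creator's policy
     * and the settings below are ignored. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_RR);
    if (rc == 0) {
        param.sched_priority = prio;
        rc = pthread_attr_setschedparam(attr, &param);
    }
    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```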

As for demand. Oh yes, there's a very high (and growing) demand, especially for
the sort of "soft realtime" scheduling (and not always all that "soft") that
can be used to build reliable and "highly available" network server systems.
Anybody who doesn't support SCHED_RR either has no customers interested in
networking, or you can safely bet cash that they've already received numerous
requests, bribes, and threats from customers with names that just about anyone
would recognize.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q217: Can I compile some modules with and others without _POSIX_C_SOURCE?  

Keith Michaels wrote:

> The _malloc_unlocked looks REAL suspicious now.  Am I getting the wrong
> malloc linked into my program?  The program contains two modules compiled
> separately: all the posix thread stuff is compiled with
> -D_POSIX_C_SOURCE=199506L, and the non-posix module is compiled without it.
> This is necessary because many system interfaces that I need are not
> available in posix (resource.h, bitmap.h, sys/fs/ufs*.h do not compile
> under posix).
> Is the traceback above evidence I have built the program incorrectly?

I don't know whether the traceback is evidence, but, regardless, you
HAVE built the program incorrectly. I don't know whether that incorrectness is
relevant. It's hard to believe that source files compiled without "thread
support" on Solaris would be linked to a non-thread-safe malloc() -- but, if
so, that could be your problem.

You don't need to define _POSIX_C_SOURCE=199506L to get thread support, though
that is one way to do it. Unfortunately, as you've noted, defining that symbol
has many other implications. You're telling the system that you intend to
build a "strictly conforming POSIX 1003.1-1996 application", and therefore
that you do not intend to use any functions or types that aren't defined by
that standard -- and in addition that you reserve the right to define for your
own use any symbols that are not specifically reserved by that standard for
implementation use.

Solaris, like Digital UNIX, (and probably others, though I don't know), has a
development environment that, by default, supports a wide range of standard
and non-standard functions and types. That's all fine, as long as they don't
conflict and as long as the application hasn't required that the environment
NOT do this, as by defining _POSIX_C_SOURCE. To compile threaded code on
Solaris (or Digital UNIX) that is not intended to be "strictly conforming
POSIX 1003.1-1996" you should define only the symbol _REENTRANT. You'll get
the thread-safe versions of any functions or symbols (e.g., errno) where
that's relevant, without restricting your use of non-POSIX capabilities of the
system. DEC C on Digital UNIX provides the proper defines when you compile
with "cc -pthread". I believe that Solaris supports "cc -mt", (though I didn't
know about that the last time I tried to build threaded code on Solaris, so I
haven't checked it).

Don't use -D_POSIX_C_SOURCE=199506L unless you really MEAN it, or if the
system you're using doesn't give you any alternative for access to thread
functions. (As I said, you never need it for Digital UNIX or Solaris.) And
always build ALL of the code that you expect to live in a threaded process
with the correct compiler options for your system. Otherwise, at best, they
may disagree on the definition of common things like errno; and, at worst, the
application may not be thread-safe.
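
One way to see what "disagree on the definition of errno" means is to
check that errno really is per-thread in code built for threads. This is
a sketch only: the function names are invented for the example, and
_REENTRANT is defined in the source here purely for illustration (it
would normally come from the compiler option). Many current systems make
errno per-thread whether or not the symbol is defined.

```c
#ifndef _REENTRANT
#define _REENTRANT 1    /* normally set by the compiler flag, not in source */
#endif
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

static void *report_errno_address(void *arg)
{
    /* Record where THIS thread's errno lives. */
    *(uintptr_t *)arg = (uintptr_t)&errno;
    return NULL;
}

/* Returns 1 if a second thread sees a different errno object than the
 * caller, 0 if both share one, and -1 if the thread could not be run. */
int errno_is_per_thread(void)
{
    pthread_t t;
    uintptr_t other = 0;

    if (pthread_create(&t, NULL, report_errno_address, &other) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return other != (uintptr_t)&errno;
}
```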

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/


Patrick TJ McPhee
East York  Canada

 Q218: timed wait on Solaris 2.6?  


> I read from somewhere that pthread_cond_timedwait should only be
> used "in realtime situations".  Since Solaris doesn't support the
> realtime option of pthread, does it mean pthread_cond_timedwait
> should not be used on Solaris at all?

Condition variables are a "communication channel" between threads that are
sharing data. You can wait for some predicate condition to change, and you
can inform a thread (or all threads) waiting for a condition that it has
changed.

There's nothing intrinsically "realtime" about them, at that level.

You can also have a wait time out after some period of time, if the
condition variable hasn't been signalled. That's not really "realtime",
either, although the nanosecond precision of the datatype does originate
in the needs of the realtime folk who developed 1003.1b (the realtime
POSIX extensions).

On a system that supports the POSIX thread realtime scheduling option
(which, as you commented, Solaris 2.6 doesn't support -- though it
erroneously claims to), multiple threads that have a realtime scheduling
policy, and are waiting on a condition variable, must be awakened in
strict priority order. That, of course, is obviously a realtime constraint
-- but it doesn't apply unless you have (and are using) the realtime
scheduling extensions.

> I tried to use pthread_cond_timedwait in my application and got
> various weird results.
> 1. Setting tv_nsec doesn't seem to block the thread at all.  I
>    guess Solaris might just ignore this field (the value I gave
>    was 25,000,000).

Define "at all". How did you attempt to measure it? By looking at the
sweep second hand on your watch? Using time(1)? Calling gettimeofday()
before and after? Querying a hardware cycle counter before and after? Your
25000000 nanoseconds is just 25 milliseconds.

However, what may have happened is that you specified tv_sec=0, or
time(NULL), and then set tv_nsec to 25000000. With tv_sec=0, that's a
long, long way in the past, and the wait would indeed timeout immediately.
Even with tv_sec=time(NULL), remember that you may well have a nanosecond
"system time" of .026 seconds, and you're setting an absolute timeout
of .025. You really shouldn't use time(NULL) to create a struct
timespec timeout. You should use clock_gettime(). If you want to use small
waits, you may also need to check clock_getres(), which returns the clock
resolution. If your system supports a resolution of 0.1 second, for
example, there's not much point to setting a wait of 0.025 seconds.
(You'll get up to a 0.1 second wait anyway.)

> 2. The thread blocks and yields fine if I use "time(NULL) + 1" in
>    the tv_sec field.  However the thread eventually hangs in some
>    totally irrelevant code (in the system code `close' when I try
>    to close a socket descriptor).

There's no connection between condition waits and sockets, so most of this
item seems completely irrelevant.

> We are thinking of using another thread that sleeps (with nanosleep)
> for a period of time and then wakes up and signals other threads
> as a timer now.  Has anyone tried this approach before?

Depends on what you are really trying to accomplish. I don't see any
application of this technique that has anything to do with the rest of
your message.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q219:  Signal delivery to Java via native interface  

"Stuart D. Gathman" wrote:

> Dave Butenhof wrote:
> >
> I am trying to figure out a way to handle signals synchronously in a Java VM.
> I have a thread calling sigwait() which reports supported signals
> synchronously to Java.  But I have no control over other threads in the VM -
> so I can't get them to block the signals.  The sneaky solution of blocking the
> signal in a handler probably won't work in AIX - the man page says "Concurrent
> use of sigaction and sigwait for the same signal is forbidden".

It cannot legally say that, and it may not be saying what it seems to. There's no
restriction in POSIX, or in UNIX98, against using both. However, POSIX does say that
calling sigwait() for some signal MAY change the signal action for that signal. If
you have a silly implementation that actually does this (there's no point except
with a simple purely user-mode hack like the old DCE threads library), then trying
to combine them may be pointless -- but it's not illegal. (And, by the way, if
you're using any version of AIX prior to 4.3, then you ARE using that very
"user-mode hack" version of DCE threads, and you're not really allowed to set signal
actions for any "asynchronous" signal.)

Of course, in practice, such distinctions between "forbidden" and "legal but
meaningless" aren't useful, so one could argue that the incorrect statement "is
forbidden" may not be entirely unjustified. ;-)

> One idea is to have the handler notify the signal thread somehow - not
> necessarily with a signal.  Is there some kind of event queue that could be
> called from a signal handler?

You can call sem_post() from a signal handler. Therefore, you could have a thread
waiting on a semaphore (sem_wait()), and have the signal handler call sem_post()
to awaken the waiter.

> Another idea is to have the signal thread call sigsuspend.  Then, if the
> handler could determine whether the thread it interrupted is the signal
> thread, it could block the signal all threads except the signal thread.

I don't think I understand what you mean here. One thread cannot block a signal in
other threads. And that "if" hiding in the phrase "if the handler could determine"
is a much bigger word than it might seem. You cannot do that using any portable and
reliable mechanism.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q220: Concerning timedwait() and realtime behavior.  

Bil Lewis wrote:

>   First, the definition I'm using of "realtime" is real time i.e.,
> wall clock time.  In computer literature the term is not well-defined,
> though colloquially "soft realtime" means "within a few seconds" while
> "hard realtime" means "less than 100ms."  (I've been quite upset
> with conversations with realtime programmers who talk about 100%
> probabilities, time limits etc.  Ain't no such thing! This muddies
> the RT waters further, but we'll leave that for another time.)
>   As such, anything that refers to the wall clock (eg pthread_cond_timedwait())
> is realtime.

"Realtime" means "wall clock time"? Wow.

>   I think important that people using timed waits, etc. recognize this
> and write programs appropriately.  (I admit to some overkill here, but
> think some overkill good.)

Yeah, it's important to remember that you're dealing with an "absolute" time
(relative to the UNIX Epoch) rather than a "relative" time (relative to an
arbitrary point in time, especially the time at which the wait was initiated). The
sleep(), usleep(), and nanosleep() functions are relative. The
pthread_cond_timedwait() function is absolute. So if "realtime" means "absolute
time" (which has some arbitrary correlation, one might assume, to "wall clock
time"), then, yeah, it's realtime.

> > Condition variables are a "communication channel" between threads that are
> > sharing data. You can wait for some predicate condition to change, and you
> > can inform a thread (or all threads) waiting for a condition that it has
> > changed.
> >
> > There's nothing intrinsically "realtime" about them, at that level.
> >
> > You can also have a wait time out after some period of time, if the
> > condition variable hasn't been signalled. That's not really "realtime",
> > either, although the nanosecond precision of the datatype does originate
> > in the needs of the realtime folk who developed 1003.1b (the realtime
> > POSIX extensions).
>   And here's the sticky point: 'That's not really "realtime"'.  It sure
> isn't hard realtime.  (Many people don't refine their terms.)  But it is
> real time.

Reality is overrated. It certainly has little to do with programming. No, it's not
"realtime" by any common computer science/engineering usage. "Realtime" isn't a
matter of the datatype an interface uses, but rather of the real world constraints
placed on the interface!

An interface that waits for "10 seconds and 25 milliseconds plus or minus 5
nanoseconds, guaranteed, every time" is realtime. An interface (like
pthread_cond_timedwait()) that waits "until some time after 1998 Dec 07
13:08:59.025" is not realtime, because no (useful) real world (real TIME)
constraints are placed on the behavior.

>   So what's my point?  Maybe just that we need some well-defined terminology
> here?

We're talking about POSIX functions, so let's try out the POSIX definition:

     "Realtime in operating systems: the ability of the operating system to
     provide a required level of service in a bounded response time."

Does pthread_cond_timedwait() "provide a required level of service in a bounded
response time"? No, absolutely not, except in conjunction with the scheduling
guarantees provided by the realtime scheduling option.

Of course, in a sense, it is bounded -- pthread_cond_timedwait() isn't allowed to
return BEFORE the specified time. But that's not a useful bound. What realtime
people want is the other direction... "if I wait for 25 milliseconds, what, worst
case, is the LONGEST interval that might pass before control is returned to my

You're correct that "hard" and "soft" realtime aren't quite so firmly defined. In
normal use, soft realtime usually means that it shouldn't be too long, most of the
time, or someone'll get annoyed and write a firmly worded letter of protest. Hard
realtime means the plane may crash if it's too long by more than  nanoseconds.
Hard realtime does not necessarily require fine granularity, or even 100%
precision (though some applications do require this). The principal requirement is

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q221: pthread_attr_getstacksize on Solaris 2.6  


> I am trying to find out what default stack size each thread has by the
> following code (taken from Pthreads Programming) in non-main threads:
>   size_t default_stack_size = -1;
>   pthread_attr_t stack_size_custom_attr;
>   pthread_attr_init( &stack_size_custom_attr );
>   pthread_attr_getstacksize( &stack_size_custom_attr, &default_stack_size );
>   printf( "Default stack size = %d\n", default_stack_size );
> The output is 0. Can anyone explain this? Thanks.

Yes, I can explain that. "0" is the default value of the stacksize attribute on
Solaris. Any more questions? ;-)

POSIX says nothing about the stacksize attribute, except that you can set it to
the size you require. It doesn't specify a default value, and it doesn't
specify what that default means. It does say that any attempt to specify a
value less than PTHREAD_STACK_MIN is an error. Therefore, it's perfectly
reasonable (though odd and obscure) to have a default of 0, which is distinct
from any possible value the user might set.

When you actually create a thread, the Solaris thread library looks at the
stacksize attribute, and, if it's 0, substitutes the actual runtime default.
That's pretty simple.

I happen to prefer the way I implemented it. (I suppose that goes without
saying.) When you create a thread attributes object, the stacksize attribute
contains the actual default value, and the code you're using will work.

But the real point, and the lesson, is that what you're trying isn't portable.
While it's not quite "illegal", it's close. Another way of putting it is that
you've successfully acquired the information you requested. The fact that it
happens to be absolutely useless to you is completely irrelevant.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q222:  LinuxThreads: Problem running out of TIDs on pthread_create  
Kaz Kylheku wrote:

> ( The comp.programming.threads FAQ wrongfully suggests a technique of using a
> suitably locked counter that is incremented when a detached thread is created
> and decremented just before a detached thread terminates. The problem is that
> this does not eliminate the race condition, because a thread continues to
> exist after it has decremented the counter, so it's possible for the counter
> to grossly underestimate the actual number of detached threads in existence.
> I'm surprised at this *glaring* oversight. )

[This is true, but not very likely.  Like never.  Still, Kaz is right.  -Bil]

In most programs this is simple and reliable, because threads tend to execute the
short amount of code at the end without blocking. That, of course, is not always
true. And you're correct that in some cases, especially with a heavily loaded
system, there can be delays, and they can be significant.

> The only way to be sure that a detached thread has terminated is to poll it
> using pthread_kill until that function returns an error, which is ridiculous.
> That's what joinable threads and pthread_join are for.

That won't work unless you know that no new threads are being created during the
interval. (Anywhere in the process... and you can only know that if you're
writing a monolithic application that calls no external code.) That's because a
POSIX thread ID (pthread_t) may be reused as soon as a thread is both terminated
and detached. (Which, for a detached thread, means as soon as it terminates.)
This won't always happen, and, in some implementations, (almost certainly
including Linux, which probably uses the process pid), may "almost never" happen.
Still, code that uses this "trick" isn't portable, or even particularly reliable
on an implementation where it happens to work most of the time.

Your summary is absolutely correct: that's why join exists.
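
For comparison, here is a minimal sketch of the joinable approach
(NWORKERS, worker and run_and_join are names invented for the example).
Once pthread_join() has returned for a thread, that thread has
terminated and its resources can be reclaimed, so no counter is needed.

```c
#include <pthread.h>

#define NWORKERS 4

static void *worker(void *arg)
{
    int *slot = arg;

    *slot = 1;                  /* stand-in for real work */
    return NULL;
}

/* Create NWORKERS joinable threads and join every one that was started.
 * Returns 0 on success or the first error number encountered. */
int run_and_join(int done[NWORKERS])
{
    pthread_t tid[NWORKERS];
    int started = 0, rc = 0, i;

    for (i = 0; i < NWORKERS; i++)
        done[i] = 0;

    for (i = 0; i < NWORKERS; i++) {
        rc = pthread_create(&tid[i], NULL, worker, &done[i]);
        if (rc != 0)
            break;
        started++;
    }

    /* Join whatever was started, even if a later create failed. */
    for (i = 0; i < started; i++) {
        int jrc = pthread_join(tid[i], NULL);
        if (rc == 0)
            rc = jrc;
    }
    return rc;
}
```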

> Because of this race, you should never create detached threads in an unbounded
> way. Programs that use detached threads should be restricted to launching a
> *fixed* number of such threads.
> I don't believe that detached threads have any practical use at all in the
> vast majority of applications.  An application developed in a disciplined
> manner should be capable of an orderly shutdown during which it joins all of
> its threads.  I can't think of any circumstance in which one would absolutely
> need to create detached threads, or in which detached threads would provide
> some sort of persuasive advantage; it's likely that the POSIX interface for
> creating them exists only for historic reasons.

I believe that detached threads are far easier to use for the vast majority of
programs. Joining is convenient (but not necessary) for any thread that must
return a single scalar value to its creator. Joining is essential when you need
to be "reasonably sure" that the thread has given up its system resources before
going on to something else. In any other case, why bother? Let your threads run
independently, exchange information as needed in any reasonable way, and then
quietly "evaporate" in a puff of greasy black smoke.

> (The FAQ, however, for some unexplained reason, suggests that detached threads
> are preferred over joinable.)

[Personal preference only -Bil]

And I agree, though they're clearly not appropriate in situations where you're
flooding the system with threads (which isn't a design I'd recommend anyway), and
you really need to know when one is gone to avoid resource problems.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/


 Q223:  Mutexes and the memory model  

Kaz Kylheku wrote:

> In article ,
> Keith Michaels  wrote:
> >I know that mutexes serialize access to data structures and this
> >can be used to enforce a strongly ordered memory model.  But what
> >if the data structure being locked contains pointers to other
> >structures that were built outside of mutex control?
> The mutex object is not aware of the data that it is protecting; it is
> only careful programming discipline that establishes what is protected.
> If some pointers are protected by a mutex, it may be the case that the
> pointed-at objects are also protected. Or it might be the case that such
> objects are not protected by the mutex.
> Any object that is accessed only whenever a certain mutex is held is
> implicitly protected by that mutex.

This is a really good statement, but sometimes I like to go the opposite
direction to explain this.

The actual truth is that mutexes are selfish and greedy. They do NOT protect your
data, or your code, or anything of the sort. They don't know or care a bit about
your data. What they do, and very well, is protect themselves. Aside from mutual
exclusion (the "bottleneck" function), the law says that when you lock a mutex,
you have a "coherent view of (all) memory" with the thread that last unlocked the
mutex. If you carefully follow the rules, that is enough.

As Kaz says, you need to apply careful programming discipline in order to be
protected by a mutex. First, never touch shared data when you don't have a mutex
locked... and all the threads potentially touching any shared data must agree on
a single mutex (or the same set of mutexes) for this purpose. (If one thread
locks mutex A to touch a queue, while another locks mutex B to touch a queue,
you've got no protection.) And, if you are moving a piece of data between
"private" and "shared" scopes, you must agree on a single mutex for the
transition. (You can modify private data as you wish, but you must always lock a
mutex before making that private data shared, and before making shared data
private again -- as in queueing or dequeueing a structure.) If your structure
contains pointers to other structures, then they're in the same scope. If there
may also be other pointers to the data, you need to make sure all threads agree,
at every point in time, whether the "secondary" data is private or shared.
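
The private/shared transition can be sketched with a tiny linked list
guarded by one mutex. All names here (struct node, list_lock, list_push,
list_pop) are made up for the illustration, and the list is last-in,
first-out only to keep the code short.

```c
#include <pthread.h>
#include <stddef.h>

struct node {
    struct node *next;
    int          value;
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct node    *list_head = NULL;    /* shared: needs list_lock */

/* Before this call *n is private to the caller, who may have filled it in
 * without any lock. After it returns, *n is shared. */
void list_push(struct node *n)
{
    pthread_mutex_lock(&list_lock);
    n->next = list_head;
    list_head = n;
    pthread_mutex_unlock(&list_lock);
}

/* Returns a node that is private to the caller again, or NULL if the
 * list is empty. */
struct node *list_pop(void)
{
    struct node *n;

    pthread_mutex_lock(&list_lock);
    n = list_head;
    if (n != NULL)
        list_head = n->next;
    pthread_mutex_unlock(&list_lock);

    if (n != NULL)
        n->next = NULL;         /* private now, so no lock is needed */
    return n;
}
```

Anything a node points to travels with it: if value were a pointer, the
pointed-to data would become shared at list_push() and private again at
list_pop(), provided no other pointers to it exist.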

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q224: Poor performance of AIO in Solaris 2.5?  

Bil Lewis wrote:
> Douglas C. Schmidt wrote:
> >
> > Hi Mike,
> >
> > ++ I have an application that needs to write files synchronously (i.e: a
> > ++ database-like application). I figured I should try and use the "aio"
> > ++ family of system calls so that several such writes can be in progress
> > ++ simultaneously if the files are on different disks. (A synchronous write
> > ++ takes around 12-16 msecs typically on my machine.)
> > ++
> > ++ I would have expected that the lio_listio() would be no slower than 2
> > ++ write()'s in series, but it seems to be 4-5 times worse.
> >
> > Our ad hoc tests using quantify/purify seem to indicate that the
> > aio*() calls on Solaris are implemented by spawning a thread PER-CALL.
> > This is most likely to be responsible for the high overhead.  I'm not
> > sure how other operating systems implement the aio*() calls, but
> > clearly spawning a thread for each call will be expensive.
>   I never worked with the AIO stuff, but this does sound correct...  AIO
> was done with threads & creating a new thread per AIO call sounds likely.
> But it's not terribly expensive.  Depending upon the machine it should
> add no more than 20-100us.  You wouldn't even be able MEASURE that.
>   Something is rotten in the state of Denmark.

and I suspect it's the scheduler.  I've written a threaded external
call-out/call-back system for our VisualWorks Smalltalk environment.
It runs on Windows NT/95/98, OS/2, Intel Linux, Digital Unix, HPUX, AIX
and Solaris.  The scheme maintains a thread-farm, and threads in the farm
are used to make external call-outs, and a rendezvous mechanism is used to
respond to threaded call-ins.

On all but Solaris the performance of a simple threaded call-out to a
null routine is approximately 50 times slower than a non-threaded
call-out (e.g. a simple threaded callout on Intel Linux using a 180 MHz
Pentium Pro is about 85 usec).  But on Solaris it is an order of
magnitude worse (e.g. a simple threaded callout on an UltraSPARC 1 takes
at least 800usecs).  

Since the system uses a thread farm, thread creation times aren't
relevant in determining performance.  Instead, the performance is
determined by pthread_mutex_lock, pthread_cond_signal,
pthread_mutex_unlock, pthread_cond_wait, pthread_cond_timedwait and the
underlying scheduler.

Dormant threads in the farm are waiting in pthread_cond_wait.  When a
call-out happens the main/Smalltalk thread marshalls the call into some
memory, chooses a thread and does a
{pthread_mutex_lock;pthread_cond_signal;pthread_mutex_unlock} to wake
the thread and let it perform the call.  On return the thread signals
the main thread and enters a pthread_cond_timedwait (if it times-out
the main thread is resignalled and the wait reentered).  The
main/Smalltalk thread responds to the signal by collecting the result of
the call.
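
A much reduced sketch of that handoff is shown below. It is not the
VisualWorks code: the names (struct call_slot, farm_worker, call_out,
demo_double) are invented, the worker serves a single call and then
exits, and the main thread uses a plain pthread_cond_wait() where the
description above has a timed wait on the worker side.

```c
#include <pthread.h>

struct call_slot {
    pthread_mutex_t lock;
    pthread_cond_t  wake;       /* main -> worker: a call is pending */
    pthread_cond_t  done;       /* worker -> main: the result is ready */
    int             pending, finished;
    int           (*fn)(int);
    int             arg, result;
};

static int demo_double(int x) { return 2 * x; }    /* sample "external" routine */

static void *farm_worker(void *p)
{
    struct call_slot *s = p;
    int r;

    pthread_mutex_lock(&s->lock);
    while (!s->pending)                 /* dormant until signalled */
        pthread_cond_wait(&s->wake, &s->lock);
    s->pending = 0;
    pthread_mutex_unlock(&s->lock);

    r = s->fn(s->arg);                  /* the call-out, made without the lock */

    pthread_mutex_lock(&s->lock);
    s->result = r;
    s->finished = 1;
    pthread_cond_signal(&s->done);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Run fn(arg) on a worker thread and collect the result.
 * Returns 0 on success or an error number. */
int call_out(int (*fn)(int), int arg, int *result)
{
    struct call_slot s;
    pthread_t t;
    int rc;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.wake, NULL);
    pthread_cond_init(&s.done, NULL);
    s.pending = s.finished = 0;

    rc = pthread_create(&t, NULL, farm_worker, &s);
    if (rc == 0) {
        pthread_mutex_lock(&s.lock);
        s.fn = fn;                      /* marshal the call into the slot */
        s.arg = arg;
        s.pending = 1;
        pthread_cond_signal(&s.wake);
        while (!s.finished)
            pthread_cond_wait(&s.done, &s.lock);
        *result = s.result;
        pthread_mutex_unlock(&s.lock);
        rc = pthread_join(t, NULL);
    }
    pthread_cond_destroy(&s.done);
    pthread_cond_destroy(&s.wake);
    pthread_mutex_destroy(&s.lock);
    return rc;
}
```

Every round trip costs two wakeups, so the latency of a null call is
dominated by how promptly the scheduler runs a thread that has just been
signalled.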

To ensure calls make progress against the main thread all threads in the
farm have higher priority.  On many pthreads platforms, Solaris included,
the system has to use a non-realtime scheduling policy because of a lack
of permissions, so on Solaris 2.5/2.6 the scheme is using SCHED_RR.  My
guess is that the scheduler is not prompt in deciding to wake-up a
thread, hence when the calling thread is signalled it isn't woken up
immediately, even though the thread has a higher priority.  One
experiment I've yet to try is putting a thr_yield (as of 2.5
pthread_yield is unimplemented) after the

Although this is all conjecture it does fit with a scheduler that only
makes scheduling decisions occasionally, e.g. at the end of a process's
timeslice.  Anyone have any supporting or contradictory information?

 Q225:  Strategies for testing multithreaded code?  

Date: Tue, 12 Jan 1999 12:41:51 +0100
Organization: NETLAB plus - the best ISP in Slovakia
>Subject says it all: are there any well known or widely used
>methods for ensuring your multithreaded algorithms are threadsafe?
>Any pointers to useful research on the topic?

Let us suppose a program property is an attribute that is true of every
possible history of that program (a history of a program being a concrete
sequence of program states, transformations from one state to another are
carried out by atomic actions performed by one or multiple threads).

Now what about being able to provide a proof that your program has safety
(absence of deadlock, mutual exclusion, ...) and liveness
(partial/complete correctness, ...) properties?

To prove that your program is free of deadlock, you may define a
predicate DEADLOCK that is true when all (cooperating) threads in your
program are blocked. Then proving your program will not deadlock is very
simple - you need to show that for every critical assertion C in the program
proof: C => not DEADLOCK (C implies that DEADLOCK is false; in other words,
whenever the preconditions of program statements are true, they exclude any
state in which deadlock is possible).

There is an excellent book covering this topic (the above is an awkward
excerpt from it):
Andrews, Gregory R.
"Concurrent Programming, Principles and Practice"
Addison Wesley 1991
ISBN 0-8053-0086-4

Applying propositions and predicates to your program (or rather to its
sensitive multithreaded parts) to assert the preconditions and postconditions
required for executing atomic actions presents a complication, of course. You
have to spend more time annotating your algorithm, developing invariants that
must be preserved by every statement in the algorithm (and if not, you have to
guard against executing the statement until the invariant is true - and here
you have conditional synchronization :), and proving program properties.

But I think it is worth it. Once you prove your program does not deadlock
using programming logic, you may be sure it will not. So I would suggest you
read the above book (if only to be aware of the techniques described there).
It is more a theoretical discussion, but many very helpful parallel
algorithms are described (and proved) there, starting with the very
classical dining philosophers problem, up to distributed heartbeat
algorithm, probe-echo algorithm and multiple-processor operating system
kernel implementation.

Hope this helps,
Best regards,
        Milan Gardian

 Q226: Threads in multiplatform NT   

I have done this.


"Nilesh M." wrote:
> Can I write threaded programs for Win NT and just recompile for both Alpha
> and i386 without any changes or minor changes?

 Q227:  Guarantee on condition variable predicate/pthreads?  

Pete Sheridan wrote:

> Thread 2:
>         pthread_mutex_lock(&m);
>         if (n != 0)
>         {
>                 pthread_cond_wait(&c, &m);
>                 assert(n == 0);
>         }
>         pthread_mutex_unlock(&m);
> The idea here is that thread 2 wants to wake up when n is 0.  Is the
> assert() correct?  i.e., will n always be 0 at that point?  When the
> condition is signalled, thread 2 has to reacquire the mutex. Thread 1
> may get the mutex first, however, and increment n before this happens.
> Is this implementation dependent?  Or does thread 2 have to use "while
> (n != 0)" instead of "if (n != 0)"?

The assert() is incorrect. The POSIX standard carefully allows for a
condition wait to return "spuriously". I won't go into all the details,
but allowing spurious wakeups is good for both the implementation and the
application. (You can do a search on Deja News if you really want to know,
because I've explained this several times before; or you can read my book,
about which you may learn through the link below.)

To correct Thread 2, change the "if" into a "while" and move the assertion
out of the loop. But then, it becomes rather trivial. You hold the mutex
and loop until n == 0, so, of course, it will be 0 when the loop
terminates (with the mutex still locked).
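
The corrected wait can be sketched as follows. This is only an illustration,
not code from the original thread: the decrementer() thread standing in for
"Thread 1", the starting value of n, and the helper names wait_until_zero()
and run_demo() are all assumptions made up for the example. The decrementer
signals on every change, so the waiter may well wake while n is still nonzero,
which is exactly the case the "while" handles.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
static int n = 3;                 /* shared predicate data, protected by m */

/* Hypothetical stand-in for "Thread 1": counts n down to 0 and signals
   after every change, so some wakeups arrive while n is still nonzero. */
static void *decrementer(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&m);
        if (n == 0) {
            pthread_mutex_unlock(&m);
            return NULL;
        }
        n--;
        pthread_cond_signal(&c);
        pthread_mutex_unlock(&m);
    }
}

/* Corrected "Thread 2": re-test the predicate after every wakeup. */
static int wait_until_zero(void)
{
    int seen;

    pthread_mutex_lock(&m);
    while (n != 0)
        pthread_cond_wait(&c, &m);
    seen = n;                     /* n == 0 here, with the mutex still held */
    pthread_mutex_unlock(&m);
    return seen;
}

/* Runs both sides once and returns the value the waiter observed. */
static int run_demo(void)
{
    pthread_t t;
    int seen;

    n = 3;
    if (pthread_create(&t, NULL, decrementer, NULL) != 0)
        return -1;
    seen = wait_until_zero();
    pthread_join(t, NULL);
    return seen;
}
```

Calling run_demo() from main() should return 0 no matter how the two threads
interleave, because the value is read while the mutex is still held.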

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q228:  Pthread API on NT?   
> I need to port a lot of code to NT that depends on pthreads. Has anyone
> built a pthread library on NT using Win32 threads?
> Scott

"Darius S. Naqvi" wrote:

> Dave Butenhof  writes:
> >
> > Lee Jung Wooc wrote:
> >
> > > Any one please help to redirect signal to the mainthread or
> > > any idea on how to make the signal to process is handled in
> > > main thread context ?
> >
> > As Patrick McPhee has already suggested, I recommend that you stop relying on
> > a signal handler for this. To expand a little on his advice, begin by masking
> > SIGUSR2 in main(), before creating any threads. Then create a special thread
> > that loops on the sigwait() function, waiting for occurrences of SIGUSR2. (If
> > the signal is not masked in ALL threads, then it may "slip through" somewhere
> > while the signal thread is not waiting in sigwait() -- for example, while it's
> > starting, or while it's responding to the previous SIGUSR2.)
> >
> Does the signal become pending in the sigwaiter thread in that case?
> To be clear: suppose that a given signal is blocked in all threads,
> and one thread sigwait()'s on it.  Suppose that while the
> sigwait()ing thread is not actually in sigwait(), that signal is sent
> to the process.  Is the signal then pending in the sigwait() thread,
> so that the next call to sigwait() notices the signal?

If *all* threads have the signal blocked, then the signal remains
pending against the process. The next thread that makes
itself able to receive the signal, either by unblocking the
pending signal in its signal mask or by calling sigwait,
will receive the pending signal.

> I've been assuming that since a signal is received by only one of the
> threads in which it is not blocked, it is not made pending in the
> blocking threads *if* there exists a thread that is not blocking it.
> In order to not lose any signals, it must then be the case that if
> every thread is blocking a signal, then when a signal is sent to the
> process, it is made pending in *every* thread.  I.e., either one
> thread receives the signal and it is not made pending in any thread,
> or the signal is pending in every thread.  Is this true?  (I don't
> have a copy of the standard, but the book "Pthreads Programming" from
> O'Reilly and Associates is silent on this matter.)

Signals sent to a process never "pend" against a thread. They can
only be pending against the process, meaning, as I explained above,
that any qualified thread can eventually take the signal.

Only per-thread signals, sent via pthread_kill() can be pending
against a thread that has the signal blocked.

Externally, it's not that complicated. Internally, it can get interesting....
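
A small sketch of the behaviour described above, under the assumption of a
POSIX system. The names signal_thread() and run_sigwait_demo() are invented
for the example. SIGUSR2 is blocked before any thread is created, the signal
is sent to the process while nobody is in sigwait(), and the signal thread
still picks it up later because it stayed pending against the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Dedicated signal thread: accepts SIGUSR2 synchronously via sigwait(). */
static void *signal_thread(void *arg)
{
    int *received = arg;
    sigset_t set;
    int sig = 0;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR2);
    if (sigwait(&set, &sig) == 0)
        *received = sig;
    return NULL;
}

/* Returns the signal number accepted by the sigwait() thread, -1 on error. */
static int run_sigwait_demo(void)
{
    sigset_t set;
    pthread_t t;
    int received = 0;

    /* Block SIGUSR2 before creating any thread; new threads inherit the mask. */
    sigemptyset(&set);
    sigaddset(&set, SIGUSR2);
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;

    /* Sent to the process while no thread is in sigwait(): every thread has
       it blocked, so it remains pending against the process. */
    if (kill(getpid(), SIGUSR2) != 0)
        return -1;

    /* The signal thread starts afterwards and still receives the signal. */
    if (pthread_create(&t, NULL, signal_thread, &received) != 0)
        return -1;
    pthread_join(t, NULL);
    return received;
}
```

This is meant to be called from the initial thread before any other threads
exist; if some thread left SIGUSR2 unblocked, the default action would
terminate the process instead.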

Jeff Denham



 Q229:  Sockets & Java2 Threads  

Nader Afshar wrote:

> Part of a GateWay I am designing is composed of two threads. One thread
> delivers messages to a server through a socket connection, the other
> thread is always listening on the output-stream of the other server for
> incoming messages.
> The problem I'd like to solve is How to stop the second thread. Since
> that thread is blocked listening on the socket connection, I can not use
> the wait() and notify() method to stop it. Furthermore since Thread.stop
> is deprecated in Java2, I seem to be in a quandary!!
> Any suggestions, would be most appreciated.
> btw. I was thinking of using the socket time-out feature and then after
> checking for some state variable indicating a "disconnect" request,
> going back to listening on the socket again, but this approach just does
> not seem very "clean" to me.
> Regards
> Nader

[For Java 2, this works just fine.  See the Java code example 
ServerInterrupt on this web page. -Bil]

Yes, we had the same problem. interrupt() doesn't work reliably if the
thread is blocked reading from a socket. Setting a variable
was also not very "clean", since you also have to set a timeout then for

I did it this way: I opened the socket in an upper thread and passed it to
the receiving thread. When I want to stop the thread, I simply close the
socket. This causes the blocking read method to throw an exception that
can be caught, so the thread can end in a clean way.
This is also the method suggested by SUN. It seems that there is no
better solution.

>This is also the method suggested by SUN. It seems that there is no
>better solution.

Despite being recommended by Sun (where do they recommend this?) it is not
guaranteed to work on all platforms. On some systems closing the Java socket
does not kick the blocked thread off. Such behaviour is not currently
required by the API specs.


 Q230: Emulating process shared threads   

"D. Emilio Grimaldo Tunon" wrote:

>    I was wondering if there is a way to emulate process shared
> mutexes and condition variables when the OS supports Posix
> threads but *not* process shared items? I know I can test
> for _POSIX_THREAD_PROCESS_SHARED, but if it is not defined,
> meaning that THAT is not implemented, then what are my
> alternatives? of course assuming there WILL be two processes
> wanting to share a mutex/condition variable.

Depends on your requirements, and how much work you want to do.

First, you could just use some existing cross-process mechanism to
synchronize. You could use a POSIX (or SysV) semaphore. A message queue.
You could use a pipe -- threads try to acquire the lock by reading, and
"release" the lock by writing (unblocking one reader). You could even
create a file with O_EXCL, and retry periodically until the owner
releases the lock by deleting the file.
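
As an illustration of the pipe idea, here is a minimal sketch. The pipe_lock
type and its functions are hypothetical names, not an existing API. One byte
sitting in the pipe means "unlocked"; acquiring reads it, releasing writes it
back. The demo shares the pipe across fork(), so two processes contend for the
same lock.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical lock built from a pipe: one byte in the pipe = unlocked. */
typedef struct { int fd[2]; } pipe_lock;

static int pipe_lock_init(pipe_lock *l)
{
    char token = 'L';

    if (pipe(l->fd) != 0)
        return -1;
    /* Start unlocked by placing the single token in the pipe. */
    return write(l->fd[1], &token, 1) == 1 ? 0 : -1;
}

/* Acquire: read() blocks until the token is available. */
static int pipe_lock_acquire(pipe_lock *l)
{
    char token;
    ssize_t r;

    do {
        r = read(l->fd[0], &token, 1);
    } while (r < 0 && errno == EINTR);
    return r == 1 ? 0 : -1;
}

/* Release: write the token back, which unblocks exactly one reader. */
static int pipe_lock_release(pipe_lock *l)
{
    char token = 'L';

    return write(l->fd[1], &token, 1) == 1 ? 0 : -1;
}

/* Parent and child each take and release the lock once.
   Returns the child's exit status (0 on success), or -1 on error. */
static int run_pipe_lock_demo(void)
{
    pipe_lock l;
    pid_t pid;
    int status = 0;

    if (pipe_lock_init(&l) != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int ok = pipe_lock_acquire(&l) == 0 && pipe_lock_release(&l) == 0;
        _exit(ok ? 0 : 1);
    }
    if (pipe_lock_acquire(&l) != 0 || pipe_lock_release(&l) != 0)
        return -1;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    close(l.fd[0]);
    close(l.fd[1]);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Unrelated processes would need a named FIFO or an inherited descriptor rather
than fork(), and there is no recovery here if the owner dies while holding the
token.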

You COULD emulate a mutex and condition variable yourself using
(completely nonportable) synchronization instructions, in the shared
memory region, and some arbitrary "blocking primitive" (a semaphore,
reading from a pipe to block and write to unblock, etc.) It can be a lot
of work, but it can be done.

There are a million alternatives. You just need to decide how important
the performance is, and how much time you're willing to spend on it.

You might also keep in mind that a few systems already support UNIX98
(Single UNIX Specification, Version 2), and the others will as soon as
the usual morass of overloaded and conflicting product requirements
allows. UNIX98 REQUIRES implementation of the "pshared" option.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q231: TLS in Win32 using MT run-time in dynamically loaded DLLs?  

Mike Smith wrote:
>That's a mouthful!  :-)
>Let me try again.  I'm writing a Win32 DLL that will be loaded dynamically
>(i.e. via LoadLibrary()).  This DLL will spawn multiple concurrent instances
>of the same thread, each of which must have some local variables.  I'd
>prefer if possible to use the runtime library _beginthread() or
>_beginthreadex() rather than the Win32 functions (CreateThread() etc.)
>Meanwhile, the docs for LoadLibrary() that come with VC++6 seem to indicate
>that dynamically loaded DLLs cannot have thread-local storage, at least not
>provided by the run-time library.

If you are talking about the Microsoft language extension 

    __declspec(thread)

you should probably not be using it in the first place. It's best to get the
job done using the standard language syntax as much as possible and stay away
from compiler extensions. 

There is an alternative way to manage thread-local storage. Have a look
at TlsAlloc() and friends. This API is a pale imitation of the POSIX
thread-specific keys facility, but it gets the job done.

You CAN use thread-specific storage in DLLs if you use TlsAlloc(). Even though
TlsAlloc() lacks the cleanup facility that its POSIX counterpart offers, if
you are writing your code strictly as a DLL you can hack in your own cleanup
and destruction of thread-specific data, since your DllMain is called each
time a thread is created or destroyed in the process.

>Has anybody run across this situation before?  How did you handle it?  I was
>thinking about allocating the worker thread's local storage prior to
>starting the thread, then passing a pointer to the memory in the thread
>function's (void *) parameter.  Better ideas?

I usually do this dynamically. If an object requires a thread-specific pointer
to something, I will create the index (or key, in POSIX terminology) when
that object is constructed. Then as threads use the object, they each
initialize their corresponding slot when they detect that it's null.
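
The lazy-initialization pattern can be sketched like this. Note the swap: the
sketch uses the POSIX thread-specific key API mentioned above rather than
Win32 TlsAlloc(), so that it is self-contained on a POSIX system. The names
my_counter(), worker() and run_tls_demo() are invented for the example.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static pthread_key_t  counter_key;                  /* the "index" or key */
static pthread_once_t counter_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    /* free() runs on each thread's non-null slot when that thread exits. */
    pthread_key_create(&counter_key, free);
}

/* Returns this thread's private counter, creating it on first use. */
static int *my_counter(void)
{
    int *p;

    pthread_once(&counter_once, make_key);
    p = pthread_getspecific(counter_key);
    if (p == NULL) {                  /* null slot: first use in this thread */
        p = calloc(1, sizeof *p);
        if (p != NULL && pthread_setspecific(counter_key, p) != 0) {
            free(p);
            p = NULL;
        }
    }
    return p;
}

/* Each worker bumps its own counter five times and reports the total. */
static void *worker(void *arg)
{
    int i;
    int *c = my_counter();

    (void)arg;
    if (c == NULL)
        return (void *)(intptr_t)-1;
    for (i = 0; i < 5; i++)
        ++*my_counter();
    return (void *)(intptr_t)*c;
}

/* Returns the sum of both workers' counters, or -1 on error. */
static long run_tls_demo(void)
{
    pthread_t a, b;
    void *ra = NULL, *rb = NULL;

    if (pthread_create(&a, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_create(&b, NULL, worker, NULL) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, &ra);
    pthread_join(b, &rb);
    return (long)(intptr_t)ra + (long)(intptr_t)rb;
}
```

Each worker should report 5, since neither sees the other's slot. With
TlsAlloc() the shape is the same, except that the cleanup the key destructor
does here would have to be done by hand, for example from DllMain.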

 Q232:  Multithreaded quicksort  

Gordon Mueller  wrote in message
> I'm looking for a multi-threaded/parallel implementation
> (source or detailed description) of the famous quicksort
> algorithm. I'd like to specify a maximum number of k (moderate)
> processors/threads and I'm looking for linear speed-up, of course.

Have a look at Chap. 20 in my book, "C Interfaces and Implementations:
Techniques for Creating Reusable Software" (Addison-Wesley Professional
Computing Series, 1997, ISBN 0-201-49841-3); there's a multi-threaded
implementation of quicksort in Sec. 20.2.1. The source code is available on
line; see

dave hanson


     Parallel quicksort doesn't work all that well; I believe the
speedup is limited to something like 5 or 6 regardless of the number
of processors.  You should be able to find a variety of parallel
sorting algorithms using your favourite search engine.  One you may
want to look at is PSRS (Parallel Sorting by Regular Sampling), which
works well on a variety of parallel architectures and isn't really
difficult conceptually.  You can find some paper describing it at

Steve MacDonald, Ph.D. Candidate  | Department of Computing Science             | University of Alberta | Edmonton, Alberta, CANADA  T6G 2H1

 Q233: When to unlock for using pthread_cond_signal()?  

POSIX specifically allows that a condition variable may be signalled or
broadcast with the associated mutex either locked or unlocked. (Or even
locked by someone else.) It simply doesn't matter. At least, signalling
while not holding the mutex doesn't make the program in any way illegal.

A condition variable is just a communication mechanism to inform waiters of
changes in shared data "predicate" conditions. The predicate itself IS
shared data, and must be changed in a way that's thread-safe. In most cases,
this means that you must hold the mutex when you change the data. (But you
could also have a predicate like "read() returns data", so that you could
write data, signal the condition variable -- and the waiter(s) would simply
loop on the condition wait until read() returns some data.)

The signal doesn't need to be synchronized with the predicate value. What
you DO need to synchronize is SETTING the predicate and TESTING the
predicate. Given that basic and obvious requirement (it's shared data, after
all), the condition variable wait protocol (testing the predicate in a loop,
and holding the mutex until the thread is blocked on the condition variable)
removes any chance of a dangerous race.

However, your scheduling behavior may be "more predictable" if you signal a
condition variable while holding the mutex. That may reduce some of the
causes of "spurious wakeups", by ensuring that the waiter has a slightly
better chance to get onto the mutex waiter list before you release the
mutex. (That may reduce the chance that some other thread will get the
mutex, and access to the predicate, first... though there are no

(There's a lot more about this in my book, information on which can be found
through the link in my signature way down there at the bottom.)

> You see, pthread_cond_signal has no effect if nobody is actually waiting
> on the condition variable. There is no ``memory'' inside a condition variable
> that keeps track of whether the variable has been signalled. Signalling
> a condition variable is like shouting. If nobody is around to hear the
> scream, nothing happens.
> If you don't hold the lock, your signal could be delivered just after another
> thread has decided that it must wait, but just before it has actually
> performed the wait. In this case, the signal is lost and the thread will wait
> for a signal that never comes.

This would be true if you failed to hold the lock when SETTING the
predicate. But that has nothing to do with SIGNALLING the condition
variable. Either the predicate is changed before the waiter's predicate
test, or it cannot be changed until after the waiter is "on" the condition
variable, in position to be awakened by a future signal or broadcast.
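
To make the distinction concrete, here is a minimal sketch in which the
predicate is SET under the mutex but the signal is sent after the mutex has
been released. The variable and function names are assumptions made up for
the illustration.

```c
#include <pthread.h>

static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready_cv = PTHREAD_COND_INITIALIZER;
static int ready = 0;                 /* the predicate: shared data */

static void *setter(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    ready = 1;                        /* SET the predicate under the mutex */
    pthread_mutex_unlock(&lock);
    pthread_cond_signal(&ready_cv);   /* signal without holding the mutex */
    return NULL;
}

static int wait_for_ready(void)
{
    int value;

    pthread_mutex_lock(&lock);
    while (!ready)                    /* TEST the predicate under the mutex */
        pthread_cond_wait(&ready_cv, &lock);
    value = ready;
    pthread_mutex_unlock(&lock);
    return value;
}

/* Returns 1 once the waiter has seen the predicate, or -1 on error. */
static int run_signal_demo(void)
{
    pthread_t t;
    int value;

    ready = 0;
    if (pthread_create(&t, NULL, setter, NULL) != 0)
        return -1;
    value = wait_for_ready();
    pthread_join(t, NULL);
    return value;
}
```

Either the waiter tests the predicate after it was set and never blocks, or
it is already on the condition variable when the signal arrives. In neither
case is a wakeup lost.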

You are correct that signalling (or broadcasting) a condition variable with
no waiters "does nothing". That's good -- there's nothing FOR it to do.

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q234: Multi-Read One-Write Locking problem on NT  

Alok Tyagi wrote:

> We are encountering a problem with MRSW (Multi-Read Single Write) or
> SWMR/MROW Locks on Windows NT :-
> We have our own MRSW functionality implemented using multiple Semaphores.
> We are experiencing a problem when a
> process holding a shared lock dies ungracefully and consequently, no other
> processes requesting the exclusive access succeed until the MRSW resource is
> removed and re-created. On Unix platforms, the OS SEM_UNDO mechanism can be
> used. Are you aware of any solution to this problem on NT?
> TIA,
> --alok


It turns out that ntdll.dll provides undocumented MRSW support which you might
find of interest.  There is an article on it in the Jan. 1999 edition of the
Windows Developer's Journal.  I have not used it myself but it looks
interesting. If you understandably feel a bit shaky about using an
undocumented Microsoft feature, the article provides an insight into how the
MRSW lock is implemented.

Hope this is of help..


 Q235:  Thread-safe version of flex scanner   

Donald A. Thompson wrote:
% I am looking for a version of the flex program that produces a thread-safe
% lex.yy.c.

The version on this system (2.5) has a -+ option, which produces a C++
scanner class which is said to be re-entrant.

%  Alternatively, I'd like some tips on how to make the lex.yy.c
% thread-safe.

You need to re-implement the input routines, and change the interface to
yylex so that things like state variables, yyleng and yytext, and many
of those other globals are passed on the stack.  You don't have to worry
about the character class tables, since they're read-only, but pretty
much everything else needs to be put through the call stack. You then need
to create a skeleton file with your changes and tell flex to use it
instead of its default one.

This is a big job, so you might think about either using the scanner
class from the -+ option, or having only one thread run the scanner,
and perhaps generate a byte-code which can be run by other threads.

Patrick TJ McPhee
East York  Canada


 Q236: POSIX standards, names, etc  

Jason L Reisman   wrote:
>I am new to the world of POSIX and am interested in finding out all I
>can before starting to code.  
>I have a few questions regarding the standard.  Any help would be
>greatly appreciated.
>(1) When looking up information on POSIX, I found POSIX.1, POSIX.4, etc.
> What do the numbers mean?  Are they indexes to different libraries or
>different versions?

Lessee...  This is complex, due to the history of the thing.

POSIX.1 is really POSIX 1003.1, which is *the* POSIX standard (i.e. for
Portable Operating System Interfaces).  POSIX 1003.1 comes in several
flavors, which are dated.  The original is 1003.1-1990.  The realtime
interface, which was known during its development as 1003.4 and then as
1003.1b, was combined into 1003.1, and the resulting spec was
1003.1-1994.  Then the threads interface, which was known during development
as 1003.4a, was renamed to 1003.1c and then combined (along with a technical
corrigendum to .1b) with 1003.1-1994 to produce 1003.1-1996.

And yes, it's ugly.   Here's a lousy attempt at a picture.  Time increases
from left to right.  If you're viewing this in something that doesn't display
news articles in a fixed-pitch font, it won't make sense.

   1003.4 --+------ 1003.4a ---+
            |                  |
            +- 1003.1b- +      +- 1003.1c -+
                        |                  |
1003.1 -----------------+-- 1003.1 ----+---+-- 1003.1 --- . . . (+.1a? etc)
 1990                        1994      |        1996
                    1003.1i -----------+
                (technical corrections to .1b)

1003.1 is the base.
1003.4 was "realtime extensions", and originally included threads.  Threads
  were broken out to smooth the merges.
1003.1b is the realtime API amendment to 1003.1
1003.1c is the threads API amendment to 1003.1
1003.1a is the amendments for symbolic links, coming very soon.

And the lettering indicates only when the projects were started, nothing

>(2) Do POSIX sockets exist?  A better way to say this is there a
>standard interface (either created or supported by POSIX) to open and
>maintain a socket?

There is (yet another) set of amendments to 1003.1, known as 1003.1g, for
this.  I haven't looked at the drafts to see what the interface looks
like, though.

>(3) How compatible are pthreads between NT and Solaris (or any flavor of
>UNIX for that matter)?

If you have an NT pthreads implementation, I would hope that they're quite
similar.  Note that POSIX makes no requirements that threads be preemptive,
unless certain scheduling options are supported, and the application
requests them.  This is commonly known as "the Solaris problem."

>(4) Are there any recommended books for POSIX beginners (who already
>know how to program)?

Dave Butenhof's book, Programming with POSIX Threads, ISBN 0-201-63392-2,
is quite good.  In fact, I'd call it excellent, and that's not said lightly.
Steve Watt KD6GGD  PP-ASEL-IA              ICBM: 121W 56' 58.1" / 37N 20' 14.2"
 Internet: steve @ Watt.COM                             Whois: SW32
   Free time?  There's no such thing.  It just comes in varying prices... 

 Q237: Passing ownership of a mutex?  

[See the code example for FIFO Mutexes on this web page.
They *may* do what you want.  -Bil]

"Fred A. Kulack" wrote:

> You can't portably unlock a mutex in one thread that was locked by another
> thread.

Fred's absolutely correct, but, as this is a common problem, I'd like to
expand and stress this warning.

The principal attribute of a mutex is "exclusive ownership". A locked mutex
is OWNED by the thread that locked it. To attempt to unlock that mutex from
another thread is not merely "nonportable" -- it is a severe violation of
POSIX semantics. Even if it appears to work on some platforms, it does not
work. You may not be getting correct memory visibility, for example, on SMP
RISC systems.

While POSIX does not require that any implementation maintain the ID of the
owning thread, any implementation may do so, and may check for and report the
optional POSIX errors for illegal use of the mutex. The Single UNIX
Specification, Version 2 (SUSV2, or UNIX98), adds support for various "types"
of mutexes, and even POSIX provides for optional thread priority protection
through mutex use. Most of these enhanced mutex attributes require that the
implementation keep track of mutex ownership, and most implementations that
track ownership will report violations.

A program that attempts to lock a mutex in one thread and unlock it in
another is incorrect, not just "potentially nonportable". It probably doesn't
work even where the implementation fails to warn you of your error. Don't
even think about doing that to yourself!

If you really want to "hand off" a lock from one thread to another, you don't
want to use a mutex. (This is almost always a sign that there's something
wrong with your design, by the way, but to all rules there are exceptions.)
Instead, use a binary semaphore. A semaphore doesn't have any concept of
"ownership", and it's legal to "lock" a semaphore (e.g., with the POSIX
operation sem_wait()) in one thread and then "unlock" the semaphore
(sem_post()) in another thread. Legal... but is it correct? Well, remember
that the synchronization operations are also memory visibility points. If you
lock a semaphore in one thread, and then modify shared data,  you need to be
sure that something else (other than the unlock you're not doing) will ensure
that the data is visible to another thread. For example, if you lock a
semaphore, modify shared data, and then create another thread that will "own"
the semaphore, the visibility rules ensure that the created thread will have
coherent memory with its creator; and the semaphore unlock will pass on that
coherency to the next thread to lock the semaphore.
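
The hand-off described in that last example might look like the sketch below.
It assumes unnamed POSIX semaphores (sem_init()) are available, which is not
true on every platform, and the names handoff, new_owner() and
run_handoff_demo() are invented for the illustration.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t handoff;        /* binary semaphore used as a transferable lock */
static int shared_value;     /* data guarded by the semaphore */

/* sem_wait() may be interrupted by a signal; retry in that case. */
static void sem_wait_retry(sem_t *s)
{
    while (sem_wait(s) != 0 && errno == EINTR)
        ;
}

/* The created thread "owns" the semaphore it never locked itself. */
static void *new_owner(void *arg)
{
    (void)arg;
    /* pthread_create() made the creator's earlier writes visible here. */
    shared_value += 1;
    sem_post(&handoff);      /* "unlock" from a different thread */
    return NULL;
}

/* Returns the final shared value, or -1 on error. */
static int run_handoff_demo(void)
{
    pthread_t t;
    int result;

    if (sem_init(&handoff, 0, 1) != 0)
        return -1;
    sem_wait_retry(&handoff);            /* "lock" in this thread */
    shared_value = 41;                   /* modify shared data while locked */
    if (pthread_create(&t, NULL, new_owner, NULL) != 0) {
        sem_destroy(&handoff);
        return -1;
    }
    sem_wait_retry(&handoff);            /* blocks until new_owner() posts */
    result = shared_value;               /* sees the new owner's update */
    sem_post(&handoff);
    pthread_join(t, NULL);
    sem_destroy(&handoff);
    return result;
}
```

Doing the same thing with pthread_mutex_unlock() in new_owner() would be the
incorrect program described above.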

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

 Q238:   NT fibers  

 Q239:  Linux (v.2.0.29 ? Caldera Base)/Threads/KDE   

jimmy wrote:

> I have Caldera linux (version, if I remember it right 2.0.29, or close
> to that.) My questions: can I use threads with it? (It didn't come with
> the package, that's for sure.) If so, where would I get it, and what
> else do I need in order to start using them? Is there any problem with
> using KDE with threads? Finally, I read somewhere that g++ doesn't like
> threads--is that right? Am I limited to C if I use threads?

You are certainly not going to be able to use POSIX threads with Caldera.
The "linuxthreads" package comes with the glibc library.  Caldera is still
libc5 based.

That having been said, there is no reason why you can't use threads with
KDE.  Do you plan on *writing* KDE applications?  If so, you need to
determine whether or not Qt is thread-safe.

Lastly, there is no limitation on using only C when using POSIX threads.
In fact there is already a small C++ threads package called TThreads which
can provide some easier interface to the threading library.  (I think
they've implemented it wrong, but it's still quite usable.)

Paul Braman

 Q240: How to implement user space cooperative multithreading?  

>Thanks for the help!
>1. My goal is to find a way to implement user space cooperative
>multithreading ( the reason is it should work with hardware description
>language which is event driven and basically serial ). The attached file
>context.c shows my basic understanding. Main() calls ping() and pong() and
>these two functions have private stacks so they can run independently. My
>question is how to keep those stacks ( they are on heap not on the process's
>stack so it's not under control of OS kernel ) from overflowing.

Make the stacks sufficiently large, and watch your use of auto variables and
recursion. You can also probably roll some special assertions that can be put
at the start of a function that will go off when you are getting too close to
collision. Write a function which uses some tricks to return the remaining
stack space (for example, take the address of a local variable and then
subtract the stack top, or vice versa, depending on whether stacks grow up or
down). Assert if the stack space is down to a few K.
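
A rough sketch of such a check follows. It assumes a downward-growing stack
and uses a pthread with a caller-supplied stack (pthread_attr_setstack()) as
a stand-in for a hand-rolled private stack. The sizes and the names
stack_remaining(), probe() and run_stack_demo() are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define DEMO_STACK_SIZE (256 * 1024)   /* hypothetical private stack size */
#define STACK_RED_ZONE  (4 * 1024)     /* "a few K" safety margin */

static char *stack_low;    /* lowest address of the private stack */

/* Bytes between the current frame and the low end of the stack.
   Assumes the stack grows downward; swap the operands if it grows up. */
static size_t stack_remaining(void)
{
    char marker;           /* lives in the current stack frame */

    return (size_t)((uintptr_t)&marker - (uintptr_t)stack_low);
}

static void *probe(void *arg)
{
    size_t *out = arg;

    /* The kind of assertion one might put at the start of a function. */
    assert(stack_remaining() > STACK_RED_ZONE);
    *out = stack_remaining();
    return NULL;
}

/* Returns the space left as seen from the probe thread, or 0 on error. */
static size_t run_stack_demo(void)
{
    pthread_attr_t attr;
    pthread_t t;
    void *mem = NULL;
    size_t left = 0;
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || posix_memalign(&mem, (size_t)page, DEMO_STACK_SIZE) != 0)
        return 0;
    stack_low = mem;
    pthread_attr_init(&attr);
    if (pthread_attr_setstack(&attr, mem, DEMO_STACK_SIZE) == 0 &&
        pthread_create(&t, &attr, probe, &left) == 0)
        pthread_join(t, NULL);
    pthread_attr_destroy(&attr);
    free(mem);
    return left;
}
```

The figure is approximate: the compiler decides where the marker sits in the
frame, and the threads library may reserve part of the region for its own
bookkeeping.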

Or you could allocate stacks in their own memory mapped regions which
dynamically allocate pages as they are accessed, and therefore can be large
without occupying physical memory until it is actually used.

>2. I don't understand how pre-emptive multithreading is implemented,
>especially the implementation of pure user space multithreading. I understand
>preemptive multitasking in process level -- kernel scheduler does the job.

This requires some support from the operating system or from the bare hardware.
I don't know about Win32. In UNIX, you can do it using alarm signals as the
basis for pre-emption. Win32 doesn't implement signals properly; Microsoft
has just paid some token attention to them, because the <signal.h> header and
the signal() function are part of ANSI C.

In UNIX, signal delivery does most of the work already for implementing
pre-emptive multi-tasking within a single process. The signal mechanism already
takes care of saving the machine context. To do the remainder of a context
switch in a signal handler, all you really have to do is switch the procedure
context only---usually just the stack pointer. So you need some simple
context-switching function which takes a pointer to two stack context areas.
It saves the current stack context into one area, and restores a previously
stored context from the other area. (The other area is chosen by your scheduler
routine, of course, because it determines which thread will run).

The pre-emptive context switch essentially takes place when a signal handler
runs in one thread, saves its stack context information, restores the
stack context info of another thread and executes a return. Later on,
the thread's turn will come up again, and it will return from the same
signal handler which pre-empted it earlier.

The detailed machine context is actually stored in the signal stack; when you
resume a thread by having it execute the return from the signal handler is when
the precise state of the CPU registers is restored. 

Also, you have to still take care of voluntary context switches, which don't go
through the signal mechanism. The entry point to the voluntary reschedule
context switch has to ensure that it saves at least those registers that are
designated by the compiler's conventions as being ``caller saved''.  When that
thread is later restarted, it will execute a return from that voluntary switch
function, and any registers that the compiler expects not to be clobbered must
be restored. These could just be saved on the stack, or they could be in the
same context area where the aforementioned stack context is put.

(It's best to not cut corners and write an earnest routine that performs a
complete machine context switch, even though this information is already in the
signal stack, and even though some registers may be clobbered during a function

The context switch looks like this (in pseudocode) and must be written
in assembly language, since C doesn't give you access to registers:

    void context_switch(void *old_context, void *new_context)
    {
        /* save all registers of the current thread into old_context */
        /* restore registers of the next thread from new_context */
    }

The registers do not include the instruction pointer, because that is
stored on the stack of the context_switch() function. When the stack pointer
is switched, the function will return in a stack frame different from the
one it was called for, and the return address in this stack frame will
restore the correct IP.

On some RISC platforms, this stuff is tricky to do! Not all modern computing
platforms give you the neat layout where you have simple stack frames and CPU
registers. For example, above, I have assumed that the function return address
is in the stack frame. In a RISC system, the return address might actually be
in a register. And what about the SPARC architecture with register windows that
supply much of the functionality of a stack frame?

Therefore, take this advice: even if you write your own user-level threading
package, save yourself the headache and steal someone else's context switch
code. It's easy enough to write for things like the 68000 or 80x86, but forget
about SPARC or PA-RISC. :)
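
One way to borrow someone else's context switch, on systems whose C library
still ships it, is the <ucontext.h> family (getcontext(), makecontext(),
swapcontext()). This is a swapped-in technique, not what the original poster
used, and the interface has been removed from recent POSIX editions, so treat
the following as a sketch under that assumption. The trace buffer and
run_pingpong() are invented for the example.

```c
#include <string.h>
#include <ucontext.h>

#define CO_STACK_SIZE (64 * 1024)   /* hypothetical private stack size */
#define ROUNDS 3

static ucontext_t main_ctx, ping_ctx, pong_ctx;
static char ping_stack[CO_STACK_SIZE], pong_stack[CO_STACK_SIZE];
static char trace[2 * ROUNDS + 1];  /* records who ran, in order */
static int trace_len;

static void ping(void)
{
    int i;   /* lives on ping's private stack, so it survives each switch */

    for (i = 0; i < ROUNDS; i++) {
        trace[trace_len++] = 'i';
        swapcontext(&ping_ctx, &pong_ctx);   /* voluntary yield to pong */
    }
    /* Returning from ping() resumes uc_link, i.e. main_ctx. */
}

static void pong(void)
{
    int i;

    for (i = 0; i < ROUNDS; i++) {
        trace[trace_len++] = 'o';
        swapcontext(&pong_ctx, &ping_ctx);   /* voluntary yield to ping */
    }
}

/* Prepares a context that will start running fn on the given stack. */
static int prime(ucontext_t *ctx, char *stack, size_t size, void (*fn)(void))
{
    if (getcontext(ctx) != 0)
        return -1;
    ctx->uc_stack.ss_sp = stack;
    ctx->uc_stack.ss_size = size;
    ctx->uc_link = &main_ctx;        /* where to go when fn returns */
    makecontext(ctx, fn, 0);
    return 0;
}

/* Runs the two coroutines and returns the execution trace, or NULL. */
static const char *run_pingpong(void)
{
    trace_len = 0;
    memset(trace, 0, sizeof trace);
    if (prime(&ping_ctx, ping_stack, sizeof ping_stack, ping) != 0 ||
        prime(&pong_ctx, pong_stack, sizeof pong_stack, pong) != 0)
        return NULL;
    if (swapcontext(&main_ctx, &ping_ctx) != 0)
        return NULL;
    return trace;
}
```

The two functions alternate strictly, and control comes back to the caller
when ping() returns; pong() is simply left suspended at its last yield.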

>??#@!, I don't have a clue, help!
>#define  STACK_SIZE 4096
>int       max_iteration;
>int       iter;
>jmp_buf   jmp_main;
>jmp_buf   jmp_ping;
>jmp_buf   jmp_pong;

These three buffers clearly form an extremely primitive process table.  In the
general case, you need a whole array of these, one for each thread.

>void  ping(void) {
>  int i = 0;
>  int j = 1000;
>  if (setjmp(jmp_ping) == 0)
>    longjmp(jmp_main, 1);

Right, here you are setting up the ping ``thread'' by saving its initial
context. Later, you will resume the thread by doing a longjmp() to this
saved context.

In a real threading package, you would have some code for creating a thread
which would allocate a free context, and then ``prime'' it so that the thread
is started as soon as the scheduler chooses it. This priming is often a hack
which tries to set up the thread context so that it looks like the thread
existed before and voluntarily rescheduled. In other words, you might have to
write things into the thread's stack which will fool context_switch() into
executing a ``fake'' return to the entry point of the thread!

I remember with fondness writing this type of fun code. :)

One fine afternoon I wrote a tiny pre-emptive multi-tasking kernel on a PC
using the DEBUG.COM's built in interactive assembler as my only development
tool. The whole thing occupied only 129 bytes. That same day, I wrote some
sample ``applications'' which animated things on the screen, as well as a KILL
command to terminate threads. The scheduling policy was round-robin with no
priorities. There was only one system call interrupt, ``reschedule'', which was
really just a faked clock interrupt to invoke the scheduler. :) The KILL
command worked by locating the kernel in memory by scanning through the
interrupt chain, moving the last process in the table over top of the one being
deleted and decrementing the process count---all with interrupts disabled, of
course. The argument to KILL was the process ID, which was just its position in
the table. This feature made it fun to guess which number to use to kill a
particular process, since the relocated process inherited the ID of the killed
process. :)

It was the day after exams and I suddenly had nothing to do, you see, and was
eager to combine the knowledge gleaned from a third-year operating systems
class with obfuscated programming. :)

>  while (1) {
>    i += 2;
>    j += 2;

Oops! The variables i and j will no longer have their correct values
after you come back here from the longjmp. That's because setjmp
and longjmp don't (necessarily) save enough context information to
preserve the values of auto variables.

(Say, did you try running it after compiling with lots of optimizations turned
on?)

Declaring i and j volatile might help here, because that should force the
compiler to generate code which re-loads them from memory. Of course, the ANSI
C definition of volatile is a little wishy washy. :)

In un-optimized code, volatile is often redundant, because variables are not
aggressively assigned to registers, which explains why code like this often
works until a release version of the program is built with optimizations.
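
The rule in ISO C is that a non-volatile automatic variable which is modified
between the setjmp() call and the longjmp() has an indeterminate value after
the jump. A minimal sketch of the well-defined case follows; the function name
count_with_longjmp is made up for illustration, and unlike the ping/pong
program it only ever jumps back into a function that is still active.

```c
#include <setjmp.h>

static jmp_buf env;

/* Counts up to n by jumping back to the setjmp() point each time. */
int count_with_longjmp(int n)
{
    /* volatile because i is modified between setjmp() and longjmp();
     * without it the value after the jump would be indeterminate */
    volatile int i = 0;

    if (setjmp(env) != 0) {
        /* arrived here via longjmp(); i holds its latest value */
    }
    if (i < n) {
        i++;
        longjmp(env, 1);
    }
    return i;
}
```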

Or, by chance, i and j may have been assigned to those ``callee
saved'' registers that got implicitly saved and restored in the call to
setjmp(). On 80x86 trash, there is such a dearth of registers that many
compilers mandate that most of the registers must not be clobbered by a
function. With GCC, I think that only EAX, ECX and EDX may be clobbered,
though I don't recall exactly.  This allows the generated code to keep
as many temporary values in registers as is reasonably possible,
at the cost of forcing called code to save and restore.

What you really need is to forget setjmp() and longjmp() and get a real context
switch routine. It shouldn't be hard to roll your own on Intel.  (If you aren't
using floating point math, you can get away with saving just the integer
registers, so you don't have to mess with that awful floating point ``stack''
thing that some junkies at Intel dreamed up during a shared hallucination.)

Anyway, I've ranted long enough about things that are probably of no interest
to anyone, so good night.

 Q241: Tools for Java Programming   

In article <>,
    Bil Lewis  writes:
> I'm in the midst of finishing "Multithreaded Programming
> with Java" and am working on a short section covering which
> companies have interesting tools for debugging, testing,
> analyzing, etc. MT programs.
> Obviously, I am covering Sun's JWS, Numega's JCheck, and
> Symantec's Java Cafe.  Are there other products that I should
> know about?
> -Bil

Bil - Parasoft has a Java analyzer.  I haven't used it, but if it's as 
good as their C version (Insure++), it's probably worth writing about.  I 
think they have a "free" trial period too.   Look at for 
more information.

Sue Gleeson

>    >What I was wondering was if there was a tool (a lint sorta thing)
>    >available that would go through code and flag trouble spots, like global
>    >data usage, and static local data, etc.  I of course don't expect the
>    That tool is your brain! If we are talking about C, you can look for
>    static data by using grep to look for the string ``static''.  That
>    is easy enough.

Unfortunately, "brains" are notoriously poor at analyzing concurrency,
and this is exactly the kind of problem that automated analysis and
testing tools are likely to do better than people.  Not as a
substitute for reasoning of course, but as a significant aid (just like lint)

I'm aware of at least two commercial tools that test for race
conditions and/or deadlocks:  

  AssureJ for Java
  Lock Lint for C

My understanding is that there will soon be other tools in this space
as well.

>    >tool to fix any of the problems, nor really even know for sure when a
>    >problem is a problem, but just flag that there may be a problem.
>    It's hard enough to automate the correctness verification of ordinary
>    single-threaded logic. The halting problem tells us that this isn't even
>    possible in the general case.

Luckily, you don't have to verify the correctness of software to be
useful.  For example, you can just check that the observed locking pattern
is consistent with a "safe" locking discipline (e.g. a particular
piece of shared data is always protected by the same lock).  Myself and
some folks at DEC SRC  (now Compaq SRC), built a tool like this that
was extremely effective at finding race conditions.
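
The core of such a check can be sketched in a few lines. This is only an
illustration of the idea and not the code of the tool mentioned above: locks
are represented as bits in a mask, and the names lockset_t, shadow_t and
lockset_access are made up. Each shared variable keeps the set of locks that
has been held on every access so far; when that set becomes empty, no single
lock protects the variable.

```c
#include <stdint.h>

typedef uint32_t lockset_t;      /* bit k set means lock k is held */

typedef struct {
    lockset_t candidates;        /* locks held on every access so far */
    int       accessed;          /* 0 until the first access is seen */
} shadow_t;

/* Record one access to the variable shadowed by s, made while holding the
 * locks in held. Returns 1 if at least one common lock remains, 0 if the
 * locking discipline has been violated. */
int lockset_access(shadow_t *s, lockset_t held)
{
    if (!s->accessed) {
        s->candidates = held;
        s->accessed = 1;
    } else {
        s->candidates &= held;   /* keep only locks held every time */
    }
    return s->candidates != 0;
}
```

A variable accessed under locks {A,B}, then {A}, then {B} is flagged on the
third access, even if no actual race happened during that run.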

See for

- Stefan

 Q242:  Solaris 2.6, pthread_cond_timedwait() wakes up early  
This may not answer the question, but it could solve the problem !

You can change the timer resolution in Solaris 2.6 and 2.7 by putting this
in /etc/system and rebooting.

set hires_tick = 1

This sets the system hz value to 1000.


John Garate wrote in message <>...
>For PTHREAD_PROCESS_SHARED condition variables, pthread_cond_timedwait()
>timeouts occur up to about 20ms prior to the requested timeout time (sample
>code below).  I wasn't expecting this.  I realize clock ticks are at 10ms
>intervals, but I expected my timeout to occur at the soonest tick AFTER my
>requested timeout, not before.  Were my expectations out of line?
>cc -mt -lposix4 testwait.c
>/* testwait.c */
>#define _POSIX_C_SOURCE 199506L
>#include <errno.h>
>#include <pthread.h>
>#include <stdio.h>
>#include <stdlib.h>
>#include <time.h>
>pthread_cond_t cv;
>pthread_mutex_t mutex;
>int main(int argc, char *argv[]) {
>  pthread_condattr_t cattr;
>  pthread_mutexattr_t mattr;
>  timespec_t  ts_now, ts_then;
>  int   timed_out;
>  int   rc;
>  /* condition variable: wait awakes early if PROCESS_SHARED */
>  if(pthread_condattr_init(&cattr)) exit(-1);
>  if(pthread_condattr_setpshared(&cattr, PTHREAD_PROCESS_SHARED))
>    exit(-1);
>  if(pthread_cond_init(&cv, &cattr)) exit(-1);
>  if(pthread_condattr_destroy(&cattr)) exit(-1);
>  /* mutex: doesn't matter whether PROCESS_SHARED or not (only cv
>     matters) */
>  if(pthread_mutexattr_init(&mattr)) exit(-1);
>  if(pthread_mutexattr_setpshared(&mattr, PTHREAD_PROCESS_SHARED))
>    exit(-1);
>  if(pthread_mutex_init(&mutex, &mattr)) exit(-1);
>  if(pthread_mutexattr_destroy(&mattr)) exit(-1);
>  /* calculate future timestamp */
>  clock_gettime(CLOCK_REALTIME,&ts_then);
>  ts_then.tv_sec+=1;
>  /* wait for that time */
>  timed_out = 0;
>  if(pthread_mutex_lock(&mutex)) exit(-1);
>  while(!timed_out) {
>    rc = pthread_cond_timedwait( &cv, &mutex, &ts_then );
>    clock_gettime(CLOCK_REALTIME,&ts_now);
>    switch(rc) {
>    case 0:
>      printf("spurious, in my case\n");
>      break;
>    case ETIMEDOUT:
>      timed_out=1;
>      break;
>    default:
>      printf("pthread_cond_timedwait failed, rc=%d\n",rc);
>      exit(-1);
>    } /* switch */
>  } /* while (!timed_out) */
>  pthread_mutex_unlock(&mutex);
>  /* did we wake-up before we wanted to? */
>  if (ts_now.tv_sec < ts_then.tv_sec ||
>     (ts_now.tv_sec == ts_then.tv_sec &&
>      ts_now.tv_nsec < ts_then.tv_nsec)) {
>    printf("ts_now  %10ld.%09ld\n", ts_now.tv_sec, ts_now.tv_nsec);
>    printf("ts_then %10ld.%09ld\n", ts_then.tv_sec, ts_then.tv_nsec);
>  }
>  return(0);
>} /* main */
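
If changing the tick rate is not an option, a caller can also defend itself by
re-checking the clock whenever ETIMEDOUT comes back and waiting again if the
deadline has not really passed. The following is a hedged sketch of that idea;
the helper names ts_before, deadline_after_ms and timedwait_not_early are made
up for illustration. On a system that returns early it may spin briefly until
the deadline, which is the price of not sleeping past a wake-up.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Returns nonzero when a is strictly earlier than b. */
int ts_before(const struct timespec *a, const struct timespec *b)
{
    return a->tv_sec < b->tv_sec ||
           (a->tv_sec == b->tv_sec && a->tv_nsec < b->tv_nsec);
}

/* Fill *ts with the current CLOCK_REALTIME time plus ms milliseconds. */
void deadline_after_ms(struct timespec *ts, long ms)
{
    clock_gettime(CLOCK_REALTIME, ts);
    ts->tv_sec += ms / 1000;
    ts->tv_nsec += (ms % 1000) * 1000000L;
    if (ts->tv_nsec >= 1000000000L) {
        ts->tv_sec++;
        ts->tv_nsec -= 1000000000L;
    }
}

/* Like pthread_cond_timedwait(), but never reports ETIMEDOUT before the
 * absolute deadline. The caller must hold m. A return of 0 still has to be
 * treated as "re-check your predicate", as with any condition wait. */
int timedwait_not_early(pthread_cond_t *cv, pthread_mutex_t *m,
                        const struct timespec *deadline)
{
    for (;;) {
        struct timespec now;
        int rc = pthread_cond_timedwait(cv, m, deadline);

        if (rc != ETIMEDOUT)
            return rc;
        clock_gettime(CLOCK_REALTIME, &now);
        if (!ts_before(&now, deadline))
            return ETIMEDOUT;
        /* timed out early: wait again for the same absolute deadline */
    }
}
```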

 Q243:  AIX4.3 and PTHREAD problem  
In article <>,
Red Hat Linux User   wrote:

%     After I sent this, I was talking to an IBM'er who was trying to convince me
% to upgrade
% our machines to AIX 4.3.  He mentioned that 4.3 provides POSIX support at
% level(?) 7 and
% 4.2 at level 5.  He said that he had to change some of his code because there

Find him, and kick him.

AIX 4.1 and 4.2 provide two thread libraries: one roughly implements a draft (7
if you must know) of the posix standard, and one implements DCE threads.

AIX 4.3 implements the posix standard, and provides backwards compatibility
with the other two libraries. There are slight changes from the draft support
available in the earlier releases. If significant code changes were needed
to compile on 4.3, the original code was probably written for DCE threads.

It's not too difficult to keep track of this, and if you're selling the
stuff you really have an obligation to at least try. Kick him hard.

You _can_ run programs compiled on 4.1 without change on a 4.3 machine.


Patrick TJ McPhee
East York  Canada

 Q244: Readers-Writers Lock source for pthreads  

In the hope someone may find it useful, here's an implementation of a
readers-writers lock for PThreads. In this implementation writers are given
priority. Compile with RWLOCK_DEBUG defined for verbose debugging output to
stderr. This output can help track:

 1. Mismatches (eg rwlock_ReadLock(); ... rwlock_WriteUnlock();)
 2. Recursive locks (eg rwlock_ReadLock(); ... rwlock_ReadLock();)

Amongst other things. The debugging output also includes the line numbers of
where the lock was obtained (and released) for greater usefulness. The
debugging mode has been implemented using thread specific data.

Anyway, here's the source:

/* START: rwlock.h */
#ifndef __RWLOCK_H__
#define __RWLOCK_H__

/*
 * $Id: rwlock.h,v 1.8 1999/02/27 14:19:35 lk Exp $
 *
 * Copyright (C) 1998-99 Lee Kindness 
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */


struct rwlock;
typedef struct rwlock *rwlock_t;

#define RWLOCK_DEBUG 2

void     rwlock_Init(rwlock_t rwl);
rwlock_t rwlock_InitFull(void);
void     rwlock_Destroy(rwlock_t rwl, int full);
#ifdef RWLOCK_DEBUG
void     rwlock_ReadLockD(rwlock_t rwl, char *f, int l);
void     rwlock_ReadUnlockD(rwlock_t rwl, char *f, int l);
void     rwlock_WriteLockD(rwlock_t rwl, char *f, int l);
void     rwlock_WriteUnlockD(rwlock_t rwl, char *f, int l);
# define rwlock_ReadLock(R) rwlock_ReadLockD(R, __FILE__, __LINE__)
# define rwlock_ReadUnlock(R) rwlock_ReadUnlockD(R, __FILE__, __LINE__)
# define rwlock_WriteLock(R) rwlock_WriteLockD(R, __FILE__, __LINE__)
# define rwlock_WriteUnlock(R) rwlock_WriteUnlockD(R, __FILE__, __LINE__)
#else
void     rwlock_ReadLock(rwlock_t rwl);
void     rwlock_ReadUnlock(rwlock_t rwl);
void     rwlock_WriteLock(rwlock_t rwl);
void     rwlock_WriteUnlock(rwlock_t rwl);
#endif

#endif /* __RWLOCK_H__ */
/* END: rwlock.h */

/* START rwlock.c */
/*
 * $Id: rwlock.c,v 1.9 1999/02/27 14:19:35 lk Exp $
 *
 * Routines to implement a read-write lock. Multiple readers or one writer
 * can hold the lock at once. Writers are given priority over readers.
 * When compiled with RWLOCK_DEBUG defined verbose debugging output is
 * produced which can help track problems such as mismatches and
 * recursive locks.
 *
 * Copyright (C) 1998-99 Lee Kindness 
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
 */


#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include "rwlock.h"

struct rwlock
{
    pthread_key_t   key;
    pthread_mutex_t lock;
    pthread_cond_t  rcond;
    pthread_cond_t  wcond;
    int             lock_count;
    int             waiting_writers;
};

struct LockPos
{
    int           type;
    char         *file;
    int           line;
    pthread_key_t key;
};

static void rwlocki_WarnNoFree(void *arg);

static void rwlocki_WaitingReaderCleanup(void *arg);
static void rwlocki_WaitingWriterCleanup(void *arg);

 * rwlock_InitFull()
 * Allocate the memory for, and initialise, a read-write lock.
rwlock_t rwlock_InitFull(void)
    rwlock_t ret;

    if( (ret = calloc(sizeof(struct rwlock), 1)) )

    return( ret );

 * rwlock_Init()
 * Initialise a static, or otherwise allocated, read-write lock.
void rwlock_Init(rwlock_t rwl)
    pthread_key_create(&rwl->key, rwlocki_WarnNoFree);
    pthread_mutex_init(&rwl->lock, NULL);
    pthread_cond_init(&rwl->wcond, NULL);
    pthread_cond_init(&rwl->rcond, NULL);
    rwl->lock_count = 0;
    rwl->waiting_writers = 0;

 * rwlock_Destroy()
 * Free all memory associated with the read-write lock.
void rwlock_Destroy(rwlock_t rwl, int full)
    if( full )

 * rwlock_ReadLock()
 * Obtain a read lock.
void rwlock_ReadLockD(rwlock_t rwl, char *f, int l)
    struct LockPos *d;
    if( (d = (struct LockPos *)pthread_getspecific(rwl->key)) )
     fprintf(stderr, "RWL %p %s:%d already has %s lock from %s:%d\n",
      rwl, f, l, d->type ? "write" : "read", d->file, d->line);
     /* but we'll carry on anyway, and muck everything up... */
    if( (d = malloc(sizeof(struct LockPos))) )
     /* init the TSD */
     d->type = 0; /* read */
     d->file = f;
     d->line = l;
     d->key  = rwl->key;
     /* and set it */
     pthread_setspecific(rwl->key, d);
     fprintf(stderr, "RWL %p %s:%d read lock pre\n", rwl, f, l);
 fprintf(stderr, "RWL %p %s:%d cannot alloc memory!\n", rwl, f, l);
void rwlock_ReadLock(rwlock_t rwl)
    pthread_cleanup_push(rwlocki_WaitingReaderCleanup, rwl);
    while( (rwl->lock_count < 0) || (rwl->waiting_writers) )
        pthread_cond_wait(&rwl->rcond, &rwl->lock);
    /* Note that the pthread_cleanup_pop subroutine will
     * execute the rwlocki_WaitingReaderCleanup routine */
    fprintf(stderr, "RWL %p %s:%d read lock\n", rwl, f, l);

 * rwlock_ReadUnlock()
 * Release a read lock
void rwlock_ReadUnlockD(rwlock_t rwl, char *f, int l)
    struct LockPos *d;
void rwlock_ReadUnlock(rwlock_t rwl)
    if( !rwl->lock_count )
    if( (d = pthread_getspecific(rwl->key)) )
     if( d->type == 0 )
  fprintf(stderr, "RWL %p %s:%d read unlock at %s:%d\n", rwl,
   d->file, d->line, f, l);
  fprintf(stderr, "RWL %p %s:%d mismatch unlock %s:%d\n", rwl,
   d->file, d->line, f, l);
     pthread_setspecific(rwl->key, NULL);
 fprintf(stderr, "RWL %p %s:%d read unlock with no lock!\n", rwl, f, l);

 * rwlock_WriteLock()
 * Obtain a write lock
void rwlock_WriteLockD(rwlock_t rwl, char *f, int l)
    struct LockPos *d;
    if( (d = (struct LockPos *)pthread_getspecific(rwl->key)) )
     fprintf(stderr, "RWL %p %s:%d already has %s lock from %s:%d\n",
      rwl, f, l, d->type ? "write" : "read", d->file, d->line);
     /* but we'll carry on anyway, and muck everything up... */
    if( (d = malloc(sizeof(struct LockPos))) )
     /* init the TSD */
     d->type = 1; /* write */
     d->file = f;
     d->line = l;
     d->key  = rwl->key;
     /* and set it */
     pthread_setspecific(rwl->key, d);
     fprintf(stderr, "RWL %p %s:%d write lock pre\n", rwl, f, l);
 fprintf(stderr, "RWL %p %s:%d cannot alloc memory!\n", rwl, f, l);
void rwlock_WriteLock(rwlock_t rwl)
    pthread_cleanup_push(rwlocki_WaitingWriterCleanup, rwl);
    while( rwl->lock_count )
        pthread_cond_wait(&rwl->wcond, &rwl->lock);
    rwl->lock_count = -1;
    /* Note that the pthread_cleanup_pop subroutine will
     * execute the rwlocki_WaitingWriterCleanup routine */
    fprintf(stderr, "RWL %p %s:%d write lock\n", rwl, f, l);

 * rwlock_WriteUnlock()
 * Release a write lock
void rwlock_WriteUnlockD(rwlock_t rwl, char *f, int l)
    struct LockPos *d;
void rwlock_WriteUnlock(rwlock_t rwl)
    rwl->lock_count = 0;
    if( !rwl->waiting_writers )
    if( (d = pthread_getspecific(rwl->key)) )
     if( d->type == 1 )
  fprintf(stderr, "RWL %p %s:%d write unlock at %s:%d\n", rwl,
   d->file, d->line, f, l);
  fprintf(stderr, "RWL %p %s:%d mismatch unlock %s:%d\n", rwl,
   d->file, d->line, f, l);
     pthread_setspecific(rwl->key, NULL);
 fprintf(stderr, "RWL %p %s:%d write unlock with no lock!\n",rwl, f, l);

static void rwlocki_WaitingReaderCleanup(void *arg)
    rwlock_t rwl;

    rwl = (rwlock_t)arg;

static void rwlocki_WaitingWriterCleanup(void *arg)
    rwlock_t rwl;

    rwl = (rwlock_t)arg;
    if( (!rwl->waiting_writers) && (rwl->lock_count >= 0) )
 /*  This only happens if we have been cancelled */

static void rwlocki_WarnNoFree(void *arg)
    struct LockPos *d = (struct LockPos *)arg;

    fprintf(stderr, "RWL 0 %s:%d exit during lock-unlock pair\n",
     d->file, d->line);
    pthread_setspecific(d->key, NULL);
/* END rwlock.c */
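
For readers who just want the core idea, here is a compact, hedged sketch of a
writer-priority lock built from one mutex and two condition variables. It is
not the posted implementation: the sketch_ names are made up, and the
debugging output, thread-specific data and cancellation cleanup handlers are
all left out. A reader waits while a writer holds the lock or while any writer
is waiting, which is what gives writers priority.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  rcond;           /* readers wait here */
    pthread_cond_t  wcond;           /* writers wait here */
    int             lock_count;      /* >0 readers, -1 one writer, 0 free */
    int             waiting_writers;
} sketch_rwlock;

void sketch_init(sketch_rwlock *rw)
{
    pthread_mutex_init(&rw->lock, NULL);
    pthread_cond_init(&rw->rcond, NULL);
    pthread_cond_init(&rw->wcond, NULL);
    rw->lock_count = 0;
    rw->waiting_writers = 0;
}

void sketch_read_lock(sketch_rwlock *rw)
{
    pthread_mutex_lock(&rw->lock);
    while (rw->lock_count < 0 || rw->waiting_writers > 0)
        pthread_cond_wait(&rw->rcond, &rw->lock);
    rw->lock_count++;
    pthread_mutex_unlock(&rw->lock);
}

void sketch_read_unlock(sketch_rwlock *rw)
{
    pthread_mutex_lock(&rw->lock);
    if (--rw->lock_count == 0)
        pthread_cond_signal(&rw->wcond);     /* last reader lets a writer in */
    pthread_mutex_unlock(&rw->lock);
}

void sketch_write_lock(sketch_rwlock *rw)
{
    pthread_mutex_lock(&rw->lock);
    rw->waiting_writers++;
    while (rw->lock_count != 0)
        pthread_cond_wait(&rw->wcond, &rw->lock);
    rw->waiting_writers--;
    rw->lock_count = -1;
    pthread_mutex_unlock(&rw->lock);
}

void sketch_write_unlock(sketch_rwlock *rw)
{
    pthread_mutex_lock(&rw->lock);
    rw->lock_count = 0;
    if (rw->waiting_writers > 0)
        pthread_cond_signal(&rw->wcond);     /* hand over to the next writer */
    else
        pthread_cond_broadcast(&rw->rcond);  /* otherwise release all readers */
    pthread_mutex_unlock(&rw->lock);
}
```

With a steady stream of writers, readers can starve; that is inherent in
writer priority and not specific to this sketch.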

 Q245: Signal handlers in threads   

In article <>,

Thank you for posting an answer !

This is pretty tricky ... I'll give it a try ....

  Jeff Denham  wrote:
> Yes -- as I said recently in a post regarding a similar
> question about sigwait() -- only the faulting thread can
> catch its own synchronous signals/exceptions.
> You don't have to do the work strictly in a signal
> handler, though. If you have a stack-based exception
> handling package available to you, such as the
> try/catch model in C++,  you can handle the
> synchronous exceptions in the exception handler.
> This model essentially unwinds the
> stack at the point the signal is caught
> by a special handler and delivers it back
> (close to) the original context and outside
> of the signal-handler state. At this point,
> you're at "thread level" and can
> pretend you just returned from
> a call to sigwait() ;^)
> (If I'm being overoptimistic about
> actually being at thread level in
> the catch() clause, someone please
> correct me.)
> Here's a little example that catches
> a SIGILL instruction on Solaris,
> built using their V4.2 C++ compiler
> and runtime:
> #include <iostream.h>
> #include <signal.h>
> int junk = -1;
> class SigIll
> {
> public:
>         SigIll(void) {};
> };
> void ill(int sig)
> {
>         throw SigIll();
> }
> main()
> {
>         typedef void (*func)(int);
>         func f, savedisp;
>         savedisp = signal(SIGILL, ill);
>         try {
>                 cout << "Issue illegal instruction...\n" << endl;
>                 f = (func)&junk;
>                 (*f)(1);
>         }
>         catch (SigIll &si) {
>                 cout << "Exception!!!" << endl;
>         }
>         cout << "Survived!\n" << endl;
>         (void) signal(SIGILL, savedisp);
> }
> I'm hardly an expert with the exception stuff, so hopefully
> Kaz and the gang will correct/fill-in for me.
> -- Jeff
> __________________________________________________
> Jeff Denham (
> Bright Tiger Technologies:  Resource-management software
> for building and managing fast, reliable web sites
> See us at
> 125 Nagog Park
> Acton, MA 01720
> Phone: (978) 263-5455 x177
> Fax:   (978) 263-5547
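
For asynchronous signals, the sigwait() model mentioned above can be sketched
in plain C as follows. This is an illustration only, with made-up names
(waiter, wait_for_signal_in_thread): the signal is blocked before any thread
is created so that every thread inherits the mask, and one dedicated thread
collects it at "thread level" with sigwait(). It does not apply to synchronous
faults such as SIGILL or SIGSEGV, which still go to the faulting thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <unistd.h>

/* Dedicated thread: waits synchronously for any signal in the given set. */
static void *waiter(void *arg)
{
    sigset_t *set = arg;
    int sig = 0;

    if (sigwait(set, &sig) != 0)
        sig = -1;
    return (void *)(intptr_t)sig;
}

/* Blocks signo, starts a sigwait() thread, sends signo to the process and
 * returns the signal number the thread received (-1 on error). Note that
 * signo stays blocked in the calling thread afterwards. */
int wait_for_signal_in_thread(int signo)
{
    sigset_t set;
    pthread_t tid;
    void *res = NULL;

    sigemptyset(&set);
    sigaddset(&set, signo);
    /* block first, so the new thread inherits the mask */
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, waiter, &set) != 0)
        return -1;
    /* process-directed signal: it stays pending until sigwait() takes it */
    kill(getpid(), signo);
    pthread_join(tid, &res);
    return (int)(intptr_t)res;
}
```

No signal handler runs at all in this model, so the usual restrictions on
what may be called from a handler do not apply to the waiting thread.
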
 Q246: Can a non-volatile C++ object be safely shared amongst POSIX threads?  
In message <7bvbrr$sf7$>, "David Holmes"

>I tend to agree with Kaz - I'm unconvinced that there is some global law of
>compilation that takes care of this. Whilst simple compilers would not
>optimise across function calls because of the unknown effects of the
>function, smarter compilers employing data flow analysis techniques,
>whole-program optimisation etc may indeed make such optimisations - after
>all pthread_mutex_lock() does not access the shared data and the compiler
>(without thinking about threads) may assume that the shared data is thus
>unaffected by the call and can be cached.

Please see ISO/IEC 9899-1990, section, example 1.

>Now maybe all that means is that smart compilers have to be thread-aware and
>somehow identify the use of locks and thereby imply that data protected by
>locks is shared and mustn't be optimised.

There's no way a C/C++ compiler can know what data are "protected by
locks" - there's no such thing as "locks" in either language.

> But do the compiler writers know
>this? I think perhaps the use of simple compilers allows us to currently get
>away with this.

Would you care to name a couple of "simple" compilers?

Anyway, you can take my word for it - compiler writers are usually smart
enough to know they are writing compilers for potentially multi-threaded
environments. At least I know that the gcc/egcs and SunSoft folks are.
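
In practice this is the pattern everybody relies on: shared data without
volatile, touched only between pthread_mutex_lock() and
pthread_mutex_unlock(), which POSIX lists among the functions that
synchronize memory between threads. A small sketch with made-up names
(shared_total, add_many, run_counter):

```c
#include <pthread.h>

#define MAX_WORKERS 16

static long shared_total;        /* deliberately not volatile */
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

static void *add_many(void *arg)
{
    long n = *(long *)arg;

    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&total_lock);
        shared_total++;          /* only ever touched under the lock */
        pthread_mutex_unlock(&total_lock);
    }
    return NULL;
}

/* Starts up to MAX_WORKERS threads that each add iters to the shared total
 * and returns the total once all of them have been joined. */
long run_counter(int nthreads, long iters)
{
    pthread_t tid[MAX_WORKERS];
    int started = 0;

    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;
    shared_total = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[started], NULL, add_many, &iters) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return shared_total;         /* safe: every worker has been joined */
}
```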

 Q247:  Single UNIX Specification V2  

A web reference you may find useful is

This contains an overview of POSIX Threads (as contained in the
Single UNIX Specification V2) and links to all the pthreads functions.

You can even download a copy of the specification from
that site (see )

 Q248: Semantics of cancelled I/O (cf: Java)  
David Holmes wrote:

> In Java there is currently a problem with what is termed interruptible I/O.
> The idea is that all potentially blocking operations should be interruptible
> so that the thread does not block forever if something goes wrong. The idea
> is sound enough (though timeouts would allow an alternative solution).
> However Java VM's do not actually implement interruptible I/O except in a
> very few cases. Discussion on the Javasoft BugParade indicates that whilst
> unblocking the thread is doable on most systems, actually cancelling the I/O
> request is not - consequently the state of the I/O stream is left
> indeterminate as far as the application is concerned.
> This leads me to wonder how POSIX defines the semantics of cancellation when
> the cancellation point is an I/O operation. Does POSIX actually specify what
> the effects of cancellation are on the underlying I/O stream (device,
> handle, whatever) or does it simply dictate that such operations must at
> some stage check the cancellation status of the current thread?
> Thanks.

POSIX hasn't a lot to say about the details of cancelled
I/O.  It has required and optional cancellation points.
Most, if not all, the required points are traditional
blocking system calls. Most of the optional ones
are library I/O routines. From my kernel and
library experience, it's a lot easier to cancel the
system calls than the library calls, because the
library calls can hold internal I/O mutexes (yikes)
across system calls. If that system call is canceled,
the locks must be released. That means the library
has to have cleanup handlers in stdio and elsewhere
-- doable but potentially costly in implementation
and performance. At Digital, last I knew, we were
planning to support the optional points in the
future (past V5.0?). Don't know the current status.

In practice, the semantics of syscall cancellation are pretty
much those of signals (and in a number of implementations I
know of, pthread_cancel() involves some kind of
specialized signal).  In other words, if you're blocked in a
read() system call, and a SIGXXX arrives, you'll be broken
out of the sleep, and, if a signal handler is present, take
the signal. If SA_RESTART is not on for the signal, the
read() call returns with status -1/errno EINTR. The outcome
of the I/O operation is undefined. In the case of cancellation,
the error return path from the system call is redirected to
a special cancellation handler in the threads library, which
starts the process of propagating the cancel down the calling
thread's stack.

When I implemented system call cancellation on Digital
UNIX, I followed this signal model, which applies only
to *interruptible* sleeps in the kernel. If there's actual
physical I/O in progress, the blocking in the kernel will be
*uninterruptible*. This is the case when a physio()
operation is in progress, meaning that the I/O buffer
is temporarily wired into memory and that the thread
calling read() cannot leave the kernel until the I/O
completes and the memory is unwired. In these cases,
the cancellation pends, just like a blocked signal, until
the read() thread is about to exit the kernel, at which point
the pending cancel is noticed and raised in the usual way.

So, in the case of both an EINTR return and a cancel,
the calling thread never has a chance to examine the
outcome of the I/O operation. For a cancellation,
the I/O may be complete, but the canceled thread
will never see that fact directly, because its stack
will be unwound by the cancellation handler
past the point where the read() was called.
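
A small sketch of what this looks like from the application side, using a
pipe that nobody writes to so that read() blocks. The names on_cancel, reader
and cancel_blocked_reader are made up for illustration. The cleanup handler is
the only code of the cancelled thread that gets to run after the cancel; the
code following read() never does.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <unistd.h>

static int cleaned_up;

/* Runs while the cancelled thread's stack is being unwound. */
static void on_cancel(void *arg)
{
    (void)arg;
    cleaned_up = 1;
}

static void *reader(void *arg)
{
    int fd = *(int *)arg;
    char buf[16];
    ssize_t n;

    pthread_cleanup_push(on_cancel, NULL);
    n = read(fd, buf, sizeof buf);   /* read() is a cancellation point */
    (void)n;                         /* never reached when cancelled */
    pthread_cleanup_pop(0);
    return NULL;
}

/* Returns 1 if the reader was cancelled and its cleanup handler ran,
 * 0 if not, -1 if the setup failed. */
int cancel_blocked_reader(void)
{
    int fds[2];
    pthread_t t;
    void *res = NULL;

    cleaned_up = 0;
    if (pipe(fds) != 0)
        return -1;
    if (pthread_create(&t, NULL, reader, &fds[0]) != 0)
        return -1;
    pthread_cancel(t);       /* acted on at, or on entry to, read() */
    pthread_join(t, &res);
    close(fds[0]);
    close(fds[1]);
    return res == PTHREAD_CANCELED && cleaned_up;
}
```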

I'm not sure whether this ramble is at all on point
for you... There's probably nothing here you don't already
know, but maybe there's a few useful hints.
The bottom line is that most OSs offer very
little in the way of canceling I/O that has already
been launched. If you look at the AIO section
of POSIX.1b, specifically at aio_cancel(),
you'll notice that the implementation is
not required to do anything in response
to the cancellation request. The only real
requirement that I recall is to return
an AIO_CANCELED status on successful
cancellation. But you may never get that
back. (On Digital UNIX, you can cancel
AIO that's still queued to libaio, but
for kernel based AIO, you'll never
successfully cancel -- the request
is gone into the bowels of the I/O
subsystem.)

So, FWIW, sounds to me like you should map this
Java I/O cancel thing right onto pthread cancellation...

-- Jeff
Jeff Denham (

Bright Tiger Technologies:  Resource-management software
for building and managing fast, reliable web sites
See us at

125 Nagog Park
Acton, MA 01720
Phone: (978) 263-5455 x177
Fax:   (978) 263-5547

Jeff Denham wrote in message <>...
>So, FWIW, sounds to me like you should map this
>Java I/O cancel thing right onto pthread cancellation...

Thanks Jeff. You seemed to confirm basically what I thought.

With the java situation there are problems both with implementing
interruptions on different platforms and establishing what the semantics of
interruptions are and how they can be used. Perhaps part of the problem is
that in Java they have to both deal with the semantics at the lowest level
of the API's (similar to the level POSIX works at) and at a higher level
too. I was just curious how POSIX dealt with the issue - maybe the Java folk
are worrying too much. FYI here's a snip from the relevant bug parade entry

Besides the above implementation issues, we also need to consider the usage
of interruptable semantics. Considering when one user (Java) thread need to
wake up another thread, (let me name it "Foo") which is blocked on the
DataInputStream, which wraps SocketInputStream which wraps recv(). When the
interrupt exception is thrown, the exception will be propagated all the way
up to the user level. However the state of DataInputStream,
SocketInputStream, recv() are possibly in unknown state. If the user ever
want to resume the io operation later, he may get unknown data from stream,
and get totally lost. So Foo has to remember to close the stream if he get
interrupted. But in this way, the usability of interruptable is largely
lost. It is much like the close() semantics of windows. When I use grep to
search the entire build tree, the IOException appear at about 1600 places.
There are 67 places catch IOException, but only 9 places catch
InterruptedIOException in PrintStream
and PrintWriter class. Generally, the InterruptedIOException is considered
as IOException, treated as fatal error. Making InterruptedIOException to
have resumption semantics will be extremely difficult on any platform, and
will be
against the semantics of Java language exception. But if we choose
termination semantics, the interruptable io is very similar to the close()

Thanks again,

 Q249: Advice on using multithreading in C++?  

On Tue, 30 Mar 1999 09:55:30 +0100, Ian Collins  wrote:
>Paul Black wrote:
>> Does anyone have any advice on using multithreading in C++? Searching around,
>> I've noticed a book "OO multithreading in C++". The book seemed to get a
>> mixed reaction on the online bookstores, is it a recommended read? Are there
>> any other books or resources to be recommended?
>A few guides:
>Use a static member function as the thread run function, pass it 'this'
>in pthread_create and cast the thread void* back to the class to use it.
>Make sure you understand the relationship between key data and class
>Take care with class destruction and thread termination.  I tend to use 
>joinable threads, so the class destructor can join with the thread.

This is not good. By the time you are in the destructor, the object should no
longer be shared; the threads should be already joined. When the destructor is
executing, the object is no longer considered to be a complete object.

It's not that calling the join operations is bad, what's bad is that there are
still threads running. A particularly bad thing is to be in the destructor of
the base class sub-object, with threads still roaming over the derived object!

>Make sure your thread does not run before the containing class is
>constructed!  This can cause weird problems on MP systems.

Actually it can cause weird problems in non-MP systems too. It is simply
verboten to use an object before it is constructed.  Therefore, it's
a bad idea to launch the internal threads of active objects from within
the constructors of those objects. Such threads may be scheduled to run before
construction completes, which may happen in non-MP systems too.

The best practice with respect to destruction and construction is this:
an object may be shared by multiple threads only after construction
completes and before destruction commences.

One way to do this is to write your active objects such that they have a
Start() method that is not called from the constructor, and a Join() method
that is separate from the destructor. The caller who created the object and
called its constructor calls Start() immediately after construction, or perhaps
after synchronizing.  The Join() method simply joins on all of the threads
associated with the active object.  Usually, I also implement a method called
Cancel() which triggers shutdown of all the threads. 

Having a separate Start() method is useful not only from a safety point of
view, but it has practical uses as well. Here is an example.

In one project I'm working on, I have a protocol driver object which has two
sub-objects: a protocol object, and a device driver object.  Both of these
invoke callbacks in the driver, which sometimes passes control back---for
example, the device driver object may hit a callback that passes received data,
which is then shunted to the protocol object, which then may invoke a callback
again to pass up processed data.

The protocol object doesn't have any threads, but it does register a timer,
which is practically as good as having a thread. The driver has two threads,
for handling input and output.

If I registered the timer immediately after  constructing the protocol object,
and started the I/O threads immediately after constructing the driver, it would
be a very bad thing! Because either object might start hitting callbacks, which
would end up inside the other object that is not yet constructed.

Because I have separate start-up methods, I can implement a construction phase
that builds both objects, and then a start phase which starts their threads or
timers.

Similarly, when I'm shutting down the arrangement, it would be terrible to stop
the threads of the driver and destroy the driver, because the protocol timer is
still running! Having Cancel() and Join() separate from destruction lets me
kill all of the timer and thread activities for both objects, and then release
the memory.
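The lifecycle described above (construct, Start(), Cancel(), Join(), destroy) can be sketched in C with Pthreads. This is only a minimal illustration, not the poster's code: the names active_obj, ao_init, ao_start, ao_cancel, ao_join and ao_destroy are hypothetical, and a mutex-protected stop flag stands in for whatever shutdown mechanism the real threads would use.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical active object: one worker thread plus a stop flag. */
struct active_obj {
    pthread_mutex_t lock;
    int stop;        /* set by ao_cancel(), polled by the worker */
    int started;     /* nonzero between ao_start() and ao_join() */
    pthread_t worker;
};

static void *ao_worker(void *arg)
{
    struct active_obj *ao = arg;
    for (;;) {
        int stop;
        pthread_mutex_lock(&ao->lock);
        stop = ao->stop;
        pthread_mutex_unlock(&ao->lock);
        if (stop)
            break;
        sched_yield();   /* stand-in for real work or callbacks */
    }
    return NULL;
}

/* "Constructor": builds the object but starts no threads. */
int ao_init(struct active_obj *ao)
{
    ao->stop = 0;
    ao->started = 0;
    return pthread_mutex_init(&ao->lock, NULL);
}

/* Start(): called by the creator once all collaborators are constructed. */
int ao_start(struct active_obj *ao)
{
    int err = pthread_create(&ao->worker, NULL, ao_worker, ao);
    if (err == 0)
        ao->started = 1;
    return err;
}

/* Cancel(): request shutdown without waiting for it. */
void ao_cancel(struct active_obj *ao)
{
    pthread_mutex_lock(&ao->lock);
    ao->stop = 1;
    pthread_mutex_unlock(&ao->lock);
}

/* Join(): wait until the worker has really finished. */
int ao_join(struct active_obj *ao)
{
    if (!ao->started)
        return 0;
    ao->started = 0;
    return pthread_join(ao->worker, NULL);
}

/* "Destructor": only safe once ao_join() has returned. */
void ao_destroy(struct active_obj *ao)
{
    pthread_mutex_destroy(&ao->lock);
}
```

With two cooperating objects, the caller would run ao_init() on both, then ao_start() on both, and at shutdown ao_cancel() on both, ao_join() on both, and only then ao_destroy().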

 Q250:  Semaphores on Solaris 7 with GCC 2.8.1   

I am writing a multiprocess application that will utilize a circular
buffer in a shared memory segment.  I am using two semaphores to
represent the # of available slots, and the # of slots to be consumed by
the server (consumer).

The apps follow this simple model.

    decrement the available_slots semaphore
    do something...
    increment the to_consume semaphore.

    decrement the to_consume semaphore.
    do something...
    increment the available_slots semaphore.

The problem is that when I run my test programs and watch the semaphore
values, I see the available_slots semaphore continually increase.  The
program will run for a while if I remove the last increment in the
consumer program, but will eventually fail with errno 34, Result too
large.  Studying the output, it does not appear to me that the value of
the two semaphores ever reaches a critical point.

This simple example has been almost copied line for line from two
different books on this subject, both yielding the same results.  I have
included the source to both of my test apps.  If anyone can see, or
knows of something that I am just overlooking, I would very much like to
hear from you.

Nicholas Twerdochlib

Platform info:
    Sun Sparc 20 dual ROSS 125Mhz CPUs 64MB RAM
    Solaris 7/2.7
    GCC v2.8.1

Server/consumer source:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

union semun {
 int val;
 struct semid_ds *buf;
 ushort *array;
};

static ushort start_val[2] = {6,0};

union semun arg;

struct sembuf acquire = {0, -1, SEM_UNDO};
struct sembuf release = {0, 1, SEM_UNDO};

int main( void ) {
  int semid;
  key_t SemKey = ftok( "/tmp/loggerd.sem", 'S' );

  if( (semid = semget( SemKey, 2, IPC_CREAT|0666 )) != -1 ) {
    arg.array = start_val;
    if( semctl( semid, 0, SETALL, arg ) < 0 ) {
      printf( "Failed to set semaphore initial states.\n" );
      perror( "SEMCTL: " );

      return -1;
    }
  }

  while( 1 ) {
    printf( "A Ready to consume: SEM %d Value: %d\n", 0,
            semctl(semid, 0, GETVAL, 0) );
    printf( "A Ready to consume: SEM %d Value: %d\n", 1,
            semctl(semid, 1, GETVAL, 0) );

    /* wait for something to consume (semaphore 1) */
    acquire.sem_num = 1;
    if( semop( semid, &acquire, 1 ) == -1 ) {
      perror( "server:main: acquire: " );
      exit( 2 );
    }

    printf( "B Ready to consume: SEM %d Value: %d\n", 0,
            semctl(semid, 0, GETVAL, 0) );
    printf( "B Ready to consume: SEM %d Value: %d\n", 1,
            semctl(semid, 1, GETVAL, 0) );

    /* hand the slot back (semaphore 0) */
    release.sem_num = 0;
    if( semop( semid, &release, 1 ) == -1 ) {
      perror( "server:main: release: " );
      exit( 2 );
    }
  }
}

Client/producer source


#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

union semun {
 int val;
 struct semid_ds *buf;
 ushort *array;
};

static ushort start_val[2] = {6,0};

union semun arg;

struct sembuf acquire = {0, -1, SEM_UNDO};
struct sembuf release = {0, 1, SEM_UNDO};

int main( void ) {
  int semid;
  key_t SemKey = ftok( "/tmp/loggerd.sem", 'S' );

  if( (semid = semget( SemKey, 2, 0)) == -1 ) {
    perror( "client:main: semget: " );
    exit( 2 );
  }

  printf( "A Ready to consume: SEM %d Value: %d\n", 0,
          semctl(semid, 0, GETVAL, 0) );
  printf( "A Ready to consume: SEM %d Value: %d\n", 1,
          semctl(semid, 1, GETVAL, 0) );

  /* take a free slot (semaphore 0) */
  acquire.sem_num = 0;
  if( semop( semid, &acquire, 1 ) == -1 ) {
    perror( "client:main: acquire: " );
    exit( 2 );
  }

  printf( "B Ready to consume: SEM %d Value: %d\n", 0,
          semctl(semid, 0, GETVAL, 0) );
  printf( "B Ready to consume: SEM %d Value: %d\n", 1,
          semctl(semid, 1, GETVAL, 0) );

  /* announce one more slot to consume (semaphore 1) */
  release.sem_num = 1;
  if( semop( semid, &release, 1 ) == -1 ) {
    perror( "client:main: release: " );
    exit( 2 );
  }

  return 0;
}

>buffer in a shared memory segment.  I am using two semaphores to
>represent the # of available slots, and the # of slots to consumer by
>the server (consumer).

>The apps follow this simple model.

>    decrement the available_slots semaphore
>    do something...
>    increment the to_consume semaphore.

>    decrement the to_consume semaphore.
>    do something...
>    increment the available_slots semaphore.

>struct sembuf acquire = {0, -1, SEM_UNDO};
>struct sembuf release = {0, 1, SEM_UNDO};

The error is quite simple: you shouldn't specify SEM_UNDO for
semaphores that are not incremented and decremented by the same process.

SEM_UNDO should be used for a single process that both increments and
decrements the semaphore.  When the process is killed, the net
effect of the process on the semaphore will be nil because of the
adjust value.

With SEM_UNDO, each decrement in the producer will cause the "semadj"
value associated with the "available_slots" semaphore to be increased
by one.  When the producer exits after N decrements, the semaphore will
be incremented by N, which is not what you want in this case.

Solaris also puts a bound on the semadj value; there is no good reason for
this bound, except that it catches programming errors like yours.  Hitting
that bound is presumably what produces the errno 34 (ERANGE) failure.

Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems.
Statements on Sun products included here are not gospel and may
be fiction rather than truth.
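The fix described above amounts to passing 0 instead of SEM_UNDO in sem_flg for both semaphores. A minimal sketch follows, under a few assumptions that are not in the original post: the helper names (ring_sem_create, sem_adjust, producer_step, consumer_step, ring_sem_value, ring_sem_destroy) are hypothetical, the set is created with IPC_PRIVATE rather than an ftok() key so the example is self-contained, and the union is named sem_arg to avoid clashing with systems whose headers already define union semun.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

enum { AVAILABLE_SLOTS = 0, TO_CONSUME = 1 };

/* Caller-defined argument type for semctl(). */
union sem_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a two-semaphore set: `slots` free slots, nothing to consume. */
int ring_sem_create(unsigned short slots)
{
    unsigned short init[2];
    union sem_arg arg;
    int semid = semget(IPC_PRIVATE, 2, IPC_CREAT | 0600);

    if (semid == -1)
        return -1;
    init[AVAILABLE_SLOTS] = slots;
    init[TO_CONSUME] = 0;
    arg.array = init;
    if (semctl(semid, 0, SETALL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* Add delta to one semaphore. sem_flg is 0, not SEM_UNDO, because the
 * matching opposite operation is performed by a different process. */
static int sem_adjust(int semid, unsigned short num, short delta)
{
    struct sembuf op;
    op.sem_num = num;
    op.sem_op = delta;
    op.sem_flg = 0;
    return semop(semid, &op, 1);
}

int producer_step(int semid)
{
    if (sem_adjust(semid, AVAILABLE_SLOTS, -1) == -1)
        return -1;
    /* ... fill one slot of the shared buffer ... */
    return sem_adjust(semid, TO_CONSUME, 1);
}

int consumer_step(int semid)
{
    if (sem_adjust(semid, TO_CONSUME, -1) == -1)
        return -1;
    /* ... drain one slot of the shared buffer ... */
    return sem_adjust(semid, AVAILABLE_SLOTS, 1);
}

int ring_sem_value(int semid, int num)
{
    return semctl(semid, num, GETVAL);
}

int ring_sem_destroy(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

Without SEM_UNDO no semadj value accumulates, so a producer exiting leaves the semaphore counts exactly as its completed steps left them.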

 Q251:  Draft-4 condition variables (HELP)   

"D. Emilio Grimaldo Tunon" wrote:

>      Could anybody comment on the condition variable differences
> between the latest Posix standard (draft 10?) and the old
> draft 4 (DCE threads?) found in HP-UX 10.20?

There's no "draft 10". There was, once, a draft of the document that
would become the basis of the POSIX 1003.1-1996 standard, that was
labelled draft 10. That document is not the same as POSIX 1003.1-1996,
and the differences are more than a matter of "formalese". Some problems
were found during the editing to integrate the draft 10 text into
1003.1b-1993. In addition, the 1003.1i-1995 amendment (corrections to the
realtime specification, some of which overlapped 1003.1c text) was integrated
at the same time. The standard is POSIX 1003.1-1996. There is no draft 10.

Also, the implementation of "draft 4" that you'll find in HP-UX 10.20
isn't really draft 4. It was a very loose adaptation of most of the text
of the draft, plus a number of extensions and other changes. I prefer to
call it "DCE threads" to make it clear that it's a distinct entity.

Now. There are no differences in condition variables from DCE threads to
POSIX threads. However, many of the names were changed to "clean up" the
namespace and better reflect various persons' opinions regarding exactly
what the interfaces ought to be assumed to do.

One of the differences, stemming from the draft 5 addition of static
initialization of synchronization objects, is that they are now
"initialized" (i.e., assumed to be pre-existing storage of unspecified
value) rather than "created" (where the pthread_cond_t type, et al, were
assumed to be pointers or "handles" to dynamically created storage).

> In particular I have run into the 'problem' that neither
> pthread_condattr_init() nor pthread_mutexattr_init() seem
> to be present, I did find:

If you're moving between POSIX threads and DCE threads, you've got many
worse problems. While much of the interface appears similar, every
function (except pthread_self()) has changed in at least one
incompatible way. Be very, very careful about such a move! Do not
consider it "a weird variety of POSIX threads". It's not. It's "DCE
threads", as different a beast from POSIX threads as UI threads are. Many
of the names are similar, and what they do is often even more
similar -- but porting requires a lot more thought than you might infer
from those similarities. (For example, DCE threads report errors by
setting errno and returning -1, while POSIX threads takes the much more
reasonable and efficient approach of returning an error code directly.)

HP-UX 10.30 (or, more realistically, 11.0) has POSIX threads. Your best
option is to ignore HP-UX 10.20 entirely and require 11.0. But, if you
can't or won't do that, be really careful, and assume nothing.
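To make the error-reporting difference concrete, here is a small sketch of the POSIX 1003.1-1996 convention, where each call returns 0 or an error number directly instead of setting errno and returning -1. The helper names sync_init and try_twice are hypothetical, and the sketch assumes the default mutex type is non-recursive.

```c
#include <errno.h>
#include <pthread.h>

/* Static initialization: no init call is needed before first use. */
static pthread_mutex_t static_lock = PTHREAD_MUTEX_INITIALIZER;

/* Dynamic initialization, POSIX style: the return value IS the error
 * number, so errno is never consulted. */
int sync_init(pthread_mutex_t *m, pthread_cond_t *c)
{
    int err = pthread_mutex_init(m, NULL);
    if (err != 0)
        return err;
    err = pthread_cond_init(c, NULL);
    if (err != 0)
        pthread_mutex_destroy(m);
    return err;
}

/* Lock static_lock, then try to lock it again.  Returns the result of
 * the second attempt, which is EBUSY for a normal (non-recursive) mutex. */
int try_twice(void)
{
    int second;
    int first = pthread_mutex_trylock(&static_lock);

    if (first != 0)
        return first;
    second = pthread_mutex_trylock(&static_lock);
    if (second == 0)                 /* only if the mutex is recursive */
        pthread_mutex_unlock(&static_lock);
    pthread_mutex_unlock(&static_lock);
    return second;
}
```

Code ported from DCE threads that tests for -1 and then reads errno would silently miss every one of these error returns.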

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation            |
| 110 Spit Brook Rd ZKO2-3/Q18 |
| Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/

Try this document.

Porting DCE Threads Programs to HP-UX 11.0 POSIX Threads

You will also find the following book useful.

Threadtime by Scott Norton and Mark Dipasquale. HP Press, Prentice Hall.
ISBN 0-13-190067-6

It discusses programming with POSIX threads in general,
as well as HP-UX specific features.


 Q252:  gdb + linuxthreads + kernel 2.2.x = fixed :)   

After two solid days of differential testing, I found the problem that was
preventing me from debugging threads under gdb.  It isn't kernel version
related, but it is rather strange so I thought I would share it for the
common curiosity...

It appears that if you are trying to debug a program that links to the
threads library, and the symbols for that library are not loaded, the
debugging doesn't work.  In my case, I was doing a "set auto-solib-add 0",
to avoid wading through all the libc and other system library stuff, and/or
getting messages from ddd about no source files, ending up in "space" etc.
Apparently, because the symbols for libpthread weren't loaded, the debugging
was not working properly.

Doing a manual load on the library using "sharedlibrary libpthread" solves
the problem.  Threads are then detected and debuggable (?!).

Does anyone know if this behavior is "by design" or "by accident" ?

Thank you very much to the people who responded to my original post.


Paul Archard
parch      // get rid of the
@          // comments to
workfire // reveal the
.com      // email address!


On Thu, 08 Apr 1999 19:55:24 GMT, Paul Archard  wrote:
>Doing a manual load on the library using "sharedlibrary libpthread" solves
>the problem.  Threads are then detected and debuggable (?!).
>Does anyone know if this behavior is "by design" or "by accident" ?

It's probably by design. The GDB patch adds LinuxThreads debugging ability by
making GDB peek at internal LinuxThreads data structures. Indeed, LinuxThreads
itself had to be modified to allow the hack to work by providing some extra
debugging info.

Presumably, without the symbols, GDB can't find the addresses of LinuxThreads
objects that it needs to access. 

 Q253: Real-time input thread question  

On Mon, 12 Apr 1999 13:54:28 GMT, JFCyr  wrote:
>We want our input thread to read some device values at an exact frequency.
>What is the best way? 

Depending on the frequency, you may need a hard real-time kernel which can
schedule your thread to execute periodically with great accuracy. In such
operating systems, the kernel is typically preemptible and takes care not to
disable interrupts for long periods of time.

>- A loop (within the input thread) working on the exact frequency with an 
>RDTSC test.

In POSIX threads, you could use pthread_cond_timedwait to suspend the thread
until a specified time. However, the accuracy of this will be restricted by the
degree to which your OS supports real-time processing. 

>- A WM_TIMER message to the primary window thread

Under Windows? You can't get anything better than 10 ms resolution, if that,
and it's not real time by any measure. Too many ``guru meditations''.  If the
frequency is faster than, say, 20-30 Hz, forget it.  On Intel machines, the
Windows clock tick is 100 Hz; even though interfaces like WaitForSingleObject()
and Sleep() have parameters expressed in milliseconds, the granularity of any
timed wait is ten milliseconds.

The Win32 API sucks for programming timers, too.  The various sleep and wait
functions express their timeout parameter as a displacement from the current
time rather than as an absolute wake time.  Also, there is no signal handling;
you can't install a periodic interrupt timer.  What you can do is sleep for
shorter periods and poll the current tick count in between sleeps.

Another thing you could do, in principle, is write a device driver that
performs the data acquisition at the specified time intervals. Inside the
driver, chances are that you have access to more accurate timing, since you are
in the kernel; also, faster access to the device. Thus you can approximate
real-time processing. The driver can queue the results for collection by the
application, which can take its sweet time.

 Q254: How does Solaris implement nice()?  
> Kelvin,
>   Thanks!  That helps.  One related question: How does NICE work?
> I mean if it just raises/lowers the LWP's priority level once, then
> after a couple of quantum it would become meaningless.
> -Bil

Nice is there to maintain the compatibility of Solaris to the
older Unix and it works in a funny way. 

First, when a user sets a nice value, a priority value is calculated
based on this nice value using some formula. This priority value
is then passed on to the kernel using priocntl, and becomes the
user portion (ts_upri) of the LWP priority. The priority table that I talked
about in my message contributes the system portion (ts_cpupri) of the LWP
priority. Therefore, we have

pri = ts_cpupri  + ts_upri + ts_boost

ts_boost is the boosting value for the IA class. The CPU picks the LWP
with the highest pri to execute next.

When a user sets nice -19 on an LWP, ts_upri is -59. Since the largest
ts_cpupri in the table is 59, pri is always 0, unless the LWP is in the IA
class and has a boost value. If a user wants finer control of the priority,
instead of using nice, he/she can use priocntl to set ts_upri directly.

Hope this helps,


 Q255: Re: destructors and pthread cancelation...   
Hi Bil,

I noticed that you responded to a fellow indicating that the Sun C++
compiler will execute local object destructors upon pthread_exit() and also
if canceled.

Do you know what version of the compiler does this?

As you may recall, I sent you a very long winded email last year complaining
about how UNIX signal handling, C++ exception handling, and pthread
cancellation don't work together.

This bit of information about compiler support on pthread_exit and
cancellation would help solve most of my problems.

I.e., if a SIGSEGV occurs, or some fatal FPE, my signal handler could simply
call pthread_exit and I'd get stack based object destructors invoked for
free (yay!).

Do you know if these semantics of pthread_exit and cancellation will be
adopted by the POSIX committee at some point????

I've also heard rumblings that there is a new PThreads standard draft...  I
haven't seen anything though... word of mouth...



John Bossom

 Q256: A slight inaccuracy WRT OS/2 in Threads Primer  
From: Julien Pierre  

Thanks for a most excellent book.

I have been doing multithreaded programming under OS/2 for about 5 years;
yet I would never have thought I could learn so much from a threads book.
How wrong I was!

Now, that said, there is a slight inaccuracy WRT OS/2 on page 102 : there
is SMP support in OS/2 version 2.11 SMP ; and OS/2 Warp Server Advanced

These versions of OS/2 have special kernels modified for SMP, and not all
device drivers work properly with it ; but all 32-bit OS/2 apps that I
have ever tried on it worked. I have found problems with some older 16-bit
OS/2 apps that didn't behave, because they were relying on an old
Microsoft C runtime library which used "fast RAM" semaphores that aren't
SMP safe. The problem was fixed by marking the executable as uniprocessor
with a utility provided with OS/2 SMP - so that its threads would always
run on the same CPU.

These problems with device drivers and many 16-bit apps are probably part
of the reason why IBM hasn't been shipping the SMP kernel in the regular
version of OS/2 (Warp 4). Warp Server SMP does make a very nice operating
system though (I run it at home on one system).

Julien Pierre     
Theta Band Software LLC

 Q257: Searching for an idea  

  Sure... Let's see what we can think up here...

  How about one of these:

o  The majority of client/server applications are limited more
   by disk I/O than by CPU performance, thus Java's slower computing
   power is less of an issue than in other areas.  (A) is this really
   true?  (B) What configuration of threads do you need to match the
   performance of C programs for a simple, well-defined problem?  (C)
   What do you need to do with java to obtain this level of performance?

o  One problem with Java's wait/notify architecture is that, for problems
   like consumer/producer, many redundant wakeups may be required in
   order to ensure that the RIGHT threads get woken up (by use of notifyAll()).
   For an optimally configured program, what is the reality of this problem?
   (See my article in Aug. Java report for a lengthy description of this.)

o  Java native threads (on Solaris 2.6, JDK 1.2) use "unbound" threads.  These
   are *supposed* to provide adequate LWPs for any I/O bound problem so that
   the programmer doesn't need to call the native methods for thr_setconcurrency().
   How well does this work for "realistic" programs?  (Can you find any
   commercial applications that face this issue?)


>         I'm a Spanish computer science student, searching for an idea
> for my final project, before getting my degree; but as of today, I
> haven't found it.
>         Can you give me any ideas? I'm interested in JAVA, especially in
> multithread programming.
>         If you would like to help me, please send an e-mail to:
>                                                     Thank you very much.
>            Eloy Salamanca. Tlf: 954 360 392  (Spain). E-mail:


 Q258: Benchmark timings from "Multithreaded Programming with Pthreads"  

I ran some of the benchmark timings from Bil Lewis's book "Multithreaded
Programming with Pthreads" to get a rough idea how LinuxThreads compares
with PMPthreads. I only have a uniprocessor (Intel 200 MHz MMX w/Red Hat
Linux 5.1) to test with, but the results are interesting anyway.

In case you have an interest in running the benchmarks yourself, I have
attached the performance programs distribution that compiles on
LinuxThreads and PMPthreads. You need to recompile the tools for each
library. Use "make -f Makefile.linuxthreads" to build the LinuxThreads
version, and "make -f Makefile.pmpthreads" to build the PMPthreads
version. Use "make -f Makefile.pmpthreads clean" between recompiles. The
"" script runs the tests. I'd be interested in the
results others get.

Here are the results I got:

                                 (seconds to complete)
                              PMPthreads      LinuxThreads
                              ----------      ------------
lock                            8.09             10.15
try lock                        3.77              8.69
reader                          8.91              6.24
writer                          9.15              6.52
context switch (bound)         10.82             49.18
context switch (unbound)       10.82             49.17
sigmask                        19.19              6.15
cancel enable                   9.65              4.54
test cancel                     2.06              3.94
create (bound)                  1.61             44.84
create (unbound)                1.61             45.64
create process                 13.25             15.08
global                          4.23              4.20
getspecific                    10.31              2.53

Looks like LinuxThreads pays a big price for thread creation and context
switching. The raw data for these results is included in the attached

With some verifications of the results and some commentary, this might
be worth a page on the Programming Pthreads website.

 Q259: Standard designs for a multithreaded applications?  

> Hi  All,
>   I want to know whether there are any standard design techniques for
> developing a  multithreaded application.   Are there any
> books/documents/websites which discuss multithreaded design issues?  I
> am mainly interested in design issues which help different programmers
> working on the same project to coordinate in developing a multithreaded
> application.
>   Any suggestion or experience in this regard is welcome.
>    I am interested in designing or reverse engineering multithreaded
> server applications using C and not C++ or Java.

There are a great many books that cover parallel programming: 
algorithms, programming models, library APIs, etc.  However, 
few cover the design and construction of parallel software.  
The following texts may be more relevant than most:

Multithreading Programming Techniques 
By Prasad, Shashi

Online Price: $39.95
Softcover; 410 Pages
Published by McGraw-Hill Companies
Date Published: 01/1997
ISBN: 0079122507


Structured Development of Parallel Programs 
By Pelagatti, Susanna

Online Price: $44.95
Softcover; 600 Pages
Published by Taylor and Francis
Date Published: 11/1997
ISBN: 0748407596


Designing and Building Parallel Programs : 
    Concepts and Tools for Parallel Engineering 
By Foster, Ian T.

Online Price: $50.95
Hardcover; 600 Pages
Published by Addison-Wesley Publishing Company
Date Published: 12/1994
ISBN: 0201575949


Foundations of Parallel Programming 
By Skillicorn, David

Online Price: $39.95
Published by Cambridge University Press
Date Published: 12/1994
ISBN: 0521455111


Randy Crawford

 Q260: Threads and sockets: Stopping asynchronously  


I've found that the cleanest way to do this (with regards to portability)
is to set up a unique pipe for every thread that you might want to 
interrupt.  Instead of doing a read() in your reader thread you'd use

FD_ZERO(&fds);
FD_SET(reader_fd, &fds);
FD_SET(msgpipe_fd, &fds);

ready = select(highestfd+1, &fds, NULL, NULL, NULL);
if (ready > 0 && FD_ISSET(msgpipe_fd, &fds)) {  /* We've been interrupted */
    /* .. drain the pipe .. */
    /* .. handle the event gracefully .. */
}

if (ready > 0 && FD_ISSET(reader_fd, &fds)) {   /* We've received data */
    /* .. grok the data .. */
}


Now, from your controlling thread (the one which is interrupting the blocked
thread), you could write 1 byte to the write end of the 'msgpipe_fd' pipe,
which would wake that thread up from its select().

This seems like a lot of work, but it's probably the only portable way
of accomplishing this task.  Trying to do this with signals is ugly and
potentially unreliable.
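A fuller version of the same idea, as a self-contained sketch: the names wait_for_input and wake_reader are hypothetical, each wake-up is assumed to be a single byte, and a select() failure (including EINTR) is simply reported as an error rather than retried.

```c
#include <sys/select.h>
#include <unistd.h>

enum wait_result { WAIT_ERROR = -1, WAIT_DATA = 0, WAIT_INTERRUPTED = 1 };

/* Block until reader_fd has data or a wake-up byte arrives on wake_rd
 * (the read end of the per-thread pipe).  A wake-up takes precedence
 * if both are ready at once. */
enum wait_result wait_for_input(int reader_fd, int wake_rd)
{
    fd_set fds;
    char drain[16];
    int maxfd = reader_fd > wake_rd ? reader_fd : wake_rd;

    FD_ZERO(&fds);
    FD_SET(reader_fd, &fds);
    FD_SET(wake_rd, &fds);

    if (select(maxfd + 1, &fds, NULL, NULL, NULL) == -1)
        return WAIT_ERROR;

    if (FD_ISSET(wake_rd, &fds)) {
        /* Drain the pending wake-up byte(s) so the pipe is quiet again. */
        if (read(wake_rd, drain, sizeof drain) == -1)
            return WAIT_ERROR;
        return WAIT_INTERRUPTED;
    }
    return WAIT_DATA;                /* reader_fd is readable */
}

/* Called from the controlling thread with the write end of the pipe. */
int wake_reader(int wake_wr)
{
    char byte = 'x';
    return write(wake_wr, &byte, 1) == 1 ? 0 : -1;
}
```

The reader thread would loop on wait_for_input(), calling read() on its socket after WAIT_DATA and shutting down cleanly after WAIT_INTERRUPTED.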

Hope this helps,

 Q261: Casting integers to pointers, etc.  

> Oh Lord!  Is that true?  "casting integers to pointers..." ?  Who the
> !@$!@$ came up with this idea that casting is allowed to change bits?
> If *I* were King...

    'Tis true.  A cast doesn't mean "pretend these bits are of type X,"
it is an operator meaning "convert this type Y value to the type X
representation of (approximately) the same value."

    For example:

    double trouble = 3.14;
    double stubble = (int)trouble;

Surely the `(int)' operator is allowed to "change bits," is it not?

> I do not know of any machines where this does not work, however.  DEC,
> Sun, HP, SGI, IBM all cast back and forth as expected.  Are there any?

    There are certainly machines where `int' and `void*' are not even
the same size, which means convertibility between `int' and `void*' cannot
possibly work for all values.  I believe DEC's Alpha (depending on compiler
options) uses a 32-bit `int' and a 64-bit `void*'; there are also ugly
rumors about various "memory models" on 80x86 machines.

    In any case, it's not a crippling restriction.  You want to pass an
`int' (or a `double' or a `struct foobar' or ...) to a thread?  No problem,
just a slight clumsiness:

    struct foobar x;                    /* must outlive the thread's use of it */
    pthread_create (..., func, &x);     /* or `(void*)&x' if there are no
                     * prototypes in scope */

    void *func(void *arg) {
        struct foobar *xptr = arg;
        struct foobar xcpy = *(struct foobar*)arg;  /* alternative */
        ...
        return NULL;
    }


>     'Tis true.  A cast doesn't mean "pretend these bits are of type X,"
> it is an operator meaning "convert this type Y value to the type X
> representation of (approximately) the same value."

  Grumble.  Oh well.

> > I do not know of any machines where this does not work, however.  DEC,
> > Sun, HP, SGI, IBM all cast back and forth as expected.  Are there any?
>     There are certainly machines where `int' and `void*' are not even
> the same size, which means convertibility between `int' and `void*' cannot

  Are there?  I don't doubt that there WERE, but any today?

>     In any case, it's not a crippling restriction.  You want to pass an
> `int' (or a `double' or a `struct foobar' or ...) to a thread?  No problem,
> just a slight clumsiness:

  BIG clumsiness.

  It is also true that every one of us who has written on the subject has
completely ignored this little detail.

  Thanks for the insight.

 Q262: Thread models, scalability and performance   

In reference to your comment below on mutex context switching you are wrong.

When thread A releases the lock, time slicing may not switch in quickly enough
for thread B and C to grab the lock. And thread A's execution may be quick
enough to grab the lock even before the OS allows B or C to attempt to grab
the lock. This scenario usually occurs in highly threaded applications, where
some threads seem to be starved. You can actually see this if you use
a thread profiler (real-time) such as Rational Quantify-Purify or Failsafe
on AIX.

In reference to data sinks and streams: remember, large string objects
do not perform well, and byte arrays are the mechanism of choice. But streams
can also be buffered, so writing to the stream performs better when sending
large amounts of data from one thread to another.

In my case I have an MQ channel which does messaging to the mainframe on
AIX. I have a socket server which takes in requests and places them on the
queue to be processed on the other side of the receiving channel.

What I do is divide the socket itself into two streams on two separate
threads, one talking to the client while the other is sending data onto the
channel and getting data from the channel. The data (or message) is huge in
size and the queue manager usually breaks it up.  But after I get all the
segments back, I need to reformat it slightly and send it back out on the
socket.

Using a stream to talk between the threads provides the fastest way to send raw
string data. Remember this is an async operation. An async operation is
faster than waiting for the queue manager to reply and then sending out the
message to the user. Streams are a better design.

But if you want to send simple messages then objects are easier, just send
into each thread the object reference and have the threads talk to the
object to bridge the threads in communication.

I prefer the streams mechanism overall.

I think the statement you made is missing what application example I gave,
so I will recap it.

I have a socket based server, where I create the master worker object first
- then send the sock into it when I get the accept().

I then spawn two smaller worker threads which communicate with the socket (one
reading, the other writing). I have the two threads communicating via
streams. The one that is reading from the socket is getting the message
that needs to be relayed to the mq channel, while the other thread is
writing to the socket and getting the data from the mq channel.

I use this same mechanism for another server that does database work on DB2
on AIX as well.

Currently I am revamping the whole server and implementing the JGL
containers into the model.


Bil Lewis wrote:

> Sean,
> Thanks 10^6!  That helps a lot.  A couple  of thoughts:
> > Thread A hits the method, gets a lock on the
> > method and begins to write to the file system. Threads B and C are attempting to write
> > to the logs too, but can not get a lock since thread A has it and is not finished. So
> > threads B and C wait. Now the operating system has time slicing and places those threads
> > in a suspend state. Thread A completes writing to the logs, and releases the lock. Now
> > thread A does some processing. Thread B and C are still in suspend state. Meanwhile
> > thread A attempts to write to the logs again. It gets a lock and does so. Meanwhile
> > thread B and C come out of suspend (due to the operating system time slicing the
> > threads) and they try to write to the logs but can not again. They suspend, and the
> > cycle repeats over and over again.
> It better not repeat!  The very next time A unlocks the mutex, A will be
> context switched (it now has lower priority than B & C), and the lock
> will be available.  Certainly this is the behavior (I think!) I see on
> Solaris. ??
> > >
> > >   This surprises me.  Using a stream to communicate between threads?  This
> > > would certainly be 10x slower than using simple wait/notify.  (?!)
> > >  Streams are the basis for talking on the
> > socket which is interprocess communication, right? You have a client on a process who is
> > communicating to a remote process, the socket is the communication, but the streams off
> > the socket provide the fine grain communication.
> I can certainly see that you can do this, and in IPC it makes some sense.
> But I don't see it in threads.  It would be both limiting and slow.  (Let's
> see...  I have a Java program where I pump ~100 strings/sec across a socket.
> Versus 10,000+ communications via synchronized methods.) ?
> Best Regards,
> -Bil

 Q263: Write threaded programs while studying Japanese!   

Yes, indeed!  You too can now learn the subtle beauty of
the Japanese language while advancing your programming
skills in Pthreads!

         Hurry!  Hurry!  Hurry!

I just got a copy of the Japanese translation of both
Dave Butenhof's book and my own.  It's great to see all 
this Kanji and then "MUTEX" in English.  I used to live
in Kenya, where this happened all the time.  It was
pretty funny.  "Mimi, sijui engine block iko wapa."

 Q264: Catching SIGTERM - Linux v Solaris  

  I didn't notice you declaring the signal handler. You need to
have a signal handler (even tho it'll never get called!) set up
for many signals in order to use sigwait().  The handler turns
off the default signal action.


> I wonder if anyone could shed light on the following problem I
> am having. Early in my server's execution I create a thread
> to wait on SIGTERM (and SIGINT) and shut down the server cleanly.
> The shutdown thread works as expected when compiled on Linux
> (libc5 + LinuxThreads, SuSE 5.2) but it doesn't seem to catch
> the signals on Solaris (only tried 2.6). The shutdown thread
> is as follows:
 Q265: pthread_kill() used to direct async signals to thread?  


  Yes, you can.  But no, you don't want to.

  What were you using your signal handlers for?  To provide
some version of asynchronous processing.  But now you can do
that processing synchronously in another thread!  (This is
a *good* thing.)

  For backwards-compatibility you may wish to retain the basic
model of sending signals to the process, but you can do that with
sigwait() in one thread, blocking it out from all other threads.

  So look at your app carefully.  It is very likely that you can
make it simpler and more robust with less effort.


> I'm porting a multi-process based application into a thread
> environment.  So I've read about the traditional signal model using
> sigaction() and sigprocmask() etc,  and the "new" signal model using
> sigthreadmask() and sigwait() etc ....  But, can't I just redirect my
> old SIGABRT, SIGUSR signals (previously between processes) now to a
> specific thread with pthread_kill()?    Sure if someone issues a command
> line kill() with a SIGUSR then that delivery will be indeterminate since
> it is enabled for all the threads but with enough global state data the
> handler can probably manage to ignore that one.  Have I missed something
> here?

 Q266: Don't create a thread per client  
David Snearline wrote:
> Bil Lewis wrote:
> > Nilesh,
> >
> >   While it's certainly interesting to experiment with lots of threads, I
> > don't think you want to do anything serious like this.  I think you'll be much
> > happier doing a select() call and creating only a few dozen threads.
> >
> > -Bil
> >
> > >
> > > My application runs on a Sun Sparc station with Solaris 2.6 and I am
> > > using the POSIX library.
> > >
> > > The application is a server and creates a thread for each connection
> > > accepted from a client. Potentially the server is expected to handle
> > > up to 1000 connections. Therefore the server is expected to create
> > > up to 1000 threads.
> > >
> > > Nilesh
> Greetings,
> I was somewhat intrigued by your comment here, and was wondering what the rationale
> was behind it.  I've done many servers under Solaris using an accepting thread plus a
> thread per connection, and so far, I've been pretty happy with results.  Then again,
> this usually involves a hundred threads or so max -- not a thousand.
> Since most of the threads end up being blocked in the kernel on I/O, the only
> drawback I can see are the per-thread resources of the (mostly) idle threads.
> Provided that these resources are kept small, running a thousand threads shouldn't be
> a problem.  Is there more here that I'm missing?

Oh, it's just that you're using up all this memory for the threads and you don't 
need to.  Might as well have one thread block on 1000 fds as have 1000 threads
each blocking on one.  For large numbers of fds, I'd expect to see some real
performance gains.
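
A small sketch of "one thread blocks on many fds", assuming plain POSIX
select(). The function name wait_for_readable is made up for the example.

```c
#include <sys/select.h>
#include <unistd.h>

/* Blocks until at least one of the nfds descriptors is readable.
   Returns the index (into fds) of the first readable descriptor,
   or -1 on error, including interruption by a signal. */
int wait_for_readable(const int *fds, int nfds)
{
    fd_set readable;
    int i, maxfd = -1;

    FD_ZERO(&readable);
    for (i = 0; i < nfds; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;              /* select() cannot watch this descriptor */
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) <= 0)
        return -1;
    for (i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &readable))
            return i;
    return -1;
}
```

Note that select() only handles descriptors below FD_SETSIZE (commonly 1024),
so a server near the 1000-connection mark may prefer poll(). Either way, the
thread that finds a readable descriptor can hand the request to a small set
of worker threads.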


 Q267: More thoughts on RWlocks  

>     As many of you know the first and second readers-writers problems
> always starves either the readers (first) or the writers (second) .. I
> learned this in my operating systems textbook.....  I was reading along
> anticipating the solution which would not starve either the readers or
> the writers, but he then referred me to the bibliography page.....  And
> it wasn't much help..  Does anyone know of a solution which does not
> starve either one...

I'll stick my neck out on this one...

If starvation is a problem then RWlocks are not the answer.

RWlocks are only useful when writers are rare.  That's why
writer-preference makes sense.

If writers are so common that they can occupy the lock for 
significant amounts of time, then you shouldn't be using
RWlocks.  Actually, if they are so common, what you should be
doing is buying faster hardware! Or lowering the number of 
incoming requests.

Sure, you can always find an exceptional case where a special
version of RWlocks will give good results, but we're not trying
to solve imaginary problems here. For real problems the correct
answer is "Don't do that!"



 Q268: Is there a way to 'store' a reference to a Java thread?  

> Is there a way to 'store' a reference to a thread at a certain point and
> run a command in that thread at a later point in time?

  Of course there is!  (Just depends on what you really mean.)

RunnableTask task = new RunnableTask();
Thread t = new Thread(task);
       ^ reference

task.addCommandToTaskQueue(new Command());
     (this puts the command on the queue and wakes up the thread if
      it is waiting for work)

  This may not be what you were THINKING of, but it's probably
what you REALLY want.


 Q269: Java's pthread_exit() equivalent?   

[Simple question, I thought.  LOTS of answers!  For my own use and
for my Java Threads book I implemented InterruptibleThread.exit(),
but there is a lot of logic to insisting that the run() method
be the one to simply return. -Bil]

Bil Lewis writes:
 > Doug,
 >     A question for you.
 >    In POSIX, we have pthread_exit() to exit a thread.  In Java we
 >  *had* Thread.stop(), but now that's gone.  Q: What's the best way
 >  to accomplish this?
 >    I can (a) arrange for all the functions on the call stack to
 >  return, all the way up to the top, finally returning from the
 >  top-level function.  I can (b) throw some special exception I
 >  build for the purpose, TimeForThreadToExitException, up to the
 >  top-level function.  I can throw ThreadDeath.
 >    But what I really want is thread.exit().
 >    Thoughts?
 >  -Bil
 > -- 
 > ================
 > Bil
 > Lambda Computer Science 
 > 555 Bryant St. #194 
 > Palo Alto, CA,
 > 94301 
 > Phone/FAX: (650) 328-8952

Here's a real quick reply (from a slow connection from
Sydney AU (yes, visiting David among other things)). I'll
send something more thorough later....

Throwing ThreadDeath yourself is a pretty good way to force current
thread to exit if you are sure it is in a state where it makes sense
to do this.

But if you mean, how to stop other threads: This is one reason why
they are extremely unlikely to actually remove Thread.stop(). The next
best thing to do is to take some action that is guaranteed to cause
the thread to hit a runtime exception. Possibilities range from the
well-reasoned -- write a special SecurityManager that denies all
resource-checked actions, to the sleazy -- like nulling out a pointer
or closing a stream that you know thread needs. See
for a discussion of some other alternatives.


Hi Bil,
Here's the replies I got to your question.


Check out the following url's. They give a good description of the
problem and implementation details for better ways to stop a thread


From: Jeff Kutcher - Sun Houston 
Subject: Re: A threadbare question
To: Peter.Vanderlinden@Eng

Here's a suggestion:

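    // volatile so that a stop() from another thread is seen by run()
    private volatile Thread thread = null;

    public void start() {
        if (thread == null) {
            thread = new Thread(this);
            thread.start();
        }
    }

    public void stop() {
        thread = null;
    }

    public void run() {
        while (thread != null) {
            try {
                // stand-in for the real work, which was not shown
                Thread.sleep(100);
            } catch (InterruptedException e) {
                thread = null;
                // using stop() may cause side effects if the class is extended
            }
        }
    }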


From Lee.Worrall@UK  Tue Aug 18 09:03:12 1998

I believe the recommended way to exit a thread is have it drop out of the bottom 
of its run() method (probably in response to some externally triggered event).


> Date: Tue, 18 Aug 1998 08:53:45 -0700 (PDT)
> From: Peter van der Linden 
> Subject: A threadbare question
> To: java-internal@Sun.COM
> A thread knowledgeable colleague asks...
> ------------------
> In POSIX, we have pthread_exit() to exit a thread.  In Java we
> *had* Thread.stop(), but now that's gone.  Q: What's the best way
> to accomplish this?
>   I can (a) arrange for all the functions on the call stack to
> return, all the way up to the top, finally returning from the
> top-level function.  I can (b) throw some special exception I
> build for the purpose, TimeForThreadToExitException, up to the
> top-level function.
>   But what I really want is thread.exit().
> -----------------
> Anyone have any ideas?

Seek, and ye shall find.

------------- Begin Forwarded Message -------------

ThreadDeath is an Error (not an Exception, since app's routinely
catch all Exceptions) which has just the semantics you are talking
about: it is a Throwable that means "this thread should die".  If
you catch it (because you have cleanup to do), you are SUPPOSED to
rethrow it.  1.2 only, though, I think.  Thread.stop() uses it, but
although stop() is deprecated, it appears that ThreadDeath is not.

I think.  :^)

>   I was feeling much more sure of myself before you asked the question.
> Now I need to think.  There are certainly situations where you do want
> to exit a thread.  The options would seem to be a Thread.exit() method,
> or an explicit throw of an exception.  What else?  (You sure wouldn't
> want to have "special values" to return from functions to get to the
> run() method.)
>   If Java had named code blocks, you could do a direct goto.  But I
> don't see why that would be good.  Not in general.
>   I don't see any logic for insisting on having an explicit exception.
> (Is there?)

You mean as in throwing an exception in another thread to indicate
termination status of the dying thread? If so: no, there isn't,
although it is always possible to hand-craft this kind of effect.

>   No, I think that a Thread.exit() method defined to throw ThreadDeath
> is the way to go.

In which case, there is no real need for a method; just `throw new
ThreadDeath()' would do. 

When thread.stop() was in the process of being deprecated I argued
that there should be a Thread.cancel() method that is defined as

  setCancelledBit();
  interrupt()

along with a method isCancelled(), and an associated bit in the Thread
class. The idea is that interrupts can be cleared, but the cancel bit
is sticky, so reliably indicates that a thread is being asked to shut
down. But apparently some people (I think database folks) really want
the freedom to do retries -- in which case they must clear interrupts,
catch ThreadDeaths, and so on, and don't want anything standing in the
way of this.

> (PH is talking to me about modifying my PThreads book into Java.  I'm
> sort of mixed on the idea.)

I think it would be great to have something a lot better than Oaks and
Wong as the `lighter', gentler, more traditionally MT-flavored
alternative to my CPJ book. I think you could do a lot of good in
helping people write better code. (Tom Cargill has been threatening to
write such a book for years, but I don't think he will.)

(People would then complain that your book is insufficently OO, making
a perfect complement to complaints that my book is insufficiently MT :-)

BTW, Have you seen my util.concurrent package? (see
I'd be very interested in your reactions. I'm trying to standardize
some of the more common utility classes people use in concurrent
programming.


> >   If Java had named code blocks, you could do a direct goto.  But I
> > don't see why that would be good.  Not in general.
> >
> >   I don't see any logic for insisting on having an explicit exception.
> > (Is there?)
> You mean as in throwing an exception in another thread to indicate
> termination status of the dying thread? If so: no, there isn't,
> although it is always possible to hand-craft this kind of effect.

  No.  "Named blocks" isn't a familiar term?  It's just a clean way of
doing longjmp().  Java *should* have it.

> >   No, I think that a Thread.exit() method defined to throw ThreadDeath
> > is the way to go.
> In which case, there is no real need for a method; just `throw new
> ThreadDeath()' would do.
> When thread.stop() was in the process of being deprecated I argued
> that there should be a Thread.cancel() method that is defined as
>   setCancelledBit();
>   interrupt()
> along with a method isCancelled(), and an associated bit in the Thread
> class. The idea is that interrupts can be cleared, but the cancel bit
> is sticky, so reliably indicates that a thread is being asked to shut
> down. But apparently some people (I think database folks) really want
> the freedom to do retries -- in which case they must clear interrupts,
> catch ThreadDeaths, and so on, and don't want anything standing in the
> way of this.

  I was a bit leery on interrupts until I looked at them more closely.
I think now that they're pretty reasonable.

  So the last remaining question for me is: "Should I do an explicit
throw?  Or just call stop() anyway?"  (I don't want to write my own
subclass BilsThread that implements a java_exit() method.)
> > (PH is talking to me about modifying my PThreads book into Java.  I'm
> > sort of mixed on the idea.)
> I think it would be great to have something a lot better than Oaks and
> Wong as the `lighter', gentler, more traditionally MT-flavored
> alternative to my CPJ book. I think you could do a lot of good in
> helping people write better code. (Tom Cargill has been threatening to
> write such a book for years, but I don't think he will.)
> (People would then complain that your book is insufficently OO, making
> a perfect complement to complaints that my book is insufficiently MT :-)

> BTW, Have you seen my util.concurrent package? (see
> I'd be very interested in your reactions. I'm trying to standardize
> some of the more common utility classes people use in concurrent
> programming.

  As soon as I get back from Utah...

Doug Lea wrote:
> > > But I would only do this if for some reason using interrupt() had to
> > > be ruled out.
> >
> >   ?  interrupt() is unrelated.  I assume my thread has already gotten the
> > interrupt and has decided to exit.  I've got to check with one of the
> > Java guys, just to get their story on it.  (I'm surprised this has been
> > asked 6k times already.  Something's odd...)
> I think interrupt IS related.  It seems best to propagate the
> interrupt all the way back the call chain in case you have something
> in the middle of the call chain that also needs to do something
> important upon interruption. Don't you think?

  I was thinking in terms of situations where you KNOW your data is
consistent and you've determined that it's time to exit and there's
nothing else to do.  An event which does occur...  Often?  Sometimes?
Only in programs that I write??

  But I see your point.


Hi Bil,

Just a comment on the stop(), ThreadDeath issue in Java. The comment you 
include from Nicholas is inaccurate and misleading.

"ThreadDeath is an Error (not an Exception, since app's routinely
catch all Exceptions) which has just the semantics you are talking
about: it is a Throwable that means "this thread should die".  If
you catch it (because you have cleanup to do), you are SUPPOSED to
rethrow it.  1.2 only, though, I think.  Thread.stop() uses it, but
although stop() is deprecated, it appears that ThreadDeath is not."

Yes ThreadDeath is derived from Error. The term "exception" means 
anything that can be thrown. "exceptions" which are derived from 
Exception are checked "exceptions" and must be caught by the caller or 
declared in the throws clause of the caller. But this is not important.

There is *nothing* special about a ThreadDeath object. It does not mean 
"this thread should die" but rather it indicates that "this thread has 
been asked to die". The only reason it "should" be rethrown is that if 
you don't then the thread doesn't actually terminate. This has always 
been documented as such and is not specific to 1.2.

If a thread decides that for some reason it cannot continue with its work 
then it can simply throw new ThreadDeath() rather than calling stop() 
on itself. The only difference is that with stop() the Thread is 
immediately marked as no longer alive - which is a bug in itself.

Doug Lea wrote:
> >   I see your point.  Still seems ugly to me though.  (Now, if *I* were
> > king...)
> I'd be interested in your thoughts about this, or what you would like
> to see. I used to think I knew what would be better, but I am not so
> sure any more.
> -Doug

  I was feeling much more sure of myself before you asked the question.
Now I need to think.  There are certainly situations where you do want
to exit a thread.  The options would seem to be a Thread.exit() method,
or an explicit throw of an exception.  What else?  (You sure wouldn't
want to have "special values" to return from functions to get to the
run() method.)

  If Java had named code blocks, you could do a direct goto.  But I
don't see why that would be good.  Not in general.

  I don't see any logic for insisting on having an explicit exception.
(Is there?)

  There's plenty to be said about how to ensure consistent data in
such situations.  But I don't think that has to determine the exit

  No, I think that a Thread.exit() method defined to throw ThreadDeath
is the way to go.


(PH is talking to me about modifying my PThreads book into Java.  I'm
sort of mixed on the idea.)

 Q270: What is a "Thread Pool"?  

> So I want to allocate a pool of threads and while the program
> is executing I want to use the threads to service all the different
> modules in the program. This means that there are going to be
> times where I want to change the addresses of procedures that
> threads are using.
> So I create the thread pool with all threads having NULL function
> pointers and all threads created are put to sleep.
> Some time later different modules want to be serviced so I look and
> see if a thread is available and if one is then I assign the function
> to that thread and make the thread active which starts execution
> of the assigned function. After the thread finishes it would be put to
> sleep...and available for use by another module....
> is this possible or have I been smoking too much crack?

Rex asks a question which we've seen here a dozen times.
It's a reasonable question and kinda-sorta the right idea
for a solution, but the angle, the conceptual approach, the
metaphor is wrong.

The term "thread pool" conjures up a temp agency where you
wake up typists when you need them and give them a job to do.

This is a lousy way to think of programs. You shouldn't be
thinking about "giving the threads work to do". You should
be thinking about "announcing that there is work to do" and
letting threads pick up that work when they are ready.

The Producer/Consumer model is the way to go here. A request
comes in off the net, the producer puts it on a queue, and a
consumer takes it off that queue and processes it. Consumer
threads block when there's nothing to do, and they wake up and
work when jobs come along.
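
To make that concrete, here is a minimal sketch of the Producer/Consumer
model, assuming a small bounded queue of integer "requests". All of the names
(work_queue, queue_put, queue_get, queue_close, run_producer_consumer) are
made up for the example, and it is one possible layout rather than the code
from any of the books.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_CAPACITY 16
#define NUM_CONSUMERS 4

/* Bounded queue of integer "requests" shared by producers and consumers. */
typedef struct {
    int items[QUEUE_CAPACITY];
    int head, count;
    int closed;                  /* set once no more work will be announced */
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
} work_queue;

void queue_init(work_queue *q)
{
    q->head = q->count = q->closed = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Producer side: announce one request, waiting while the queue is full. */
void queue_put(work_queue *q, int item)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAPACITY)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = item;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Announce that no more requests are coming and wake every idle consumer. */
void queue_close(work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: returns 1 with a request in *item, or 0 once the queue
   is closed and drained.  Blocks while there is nothing to do. */
int queue_get(work_queue *q, int *item)
{
    int got = 0;
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count > 0) {
        *item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        got = 1;
        pthread_cond_signal(&q->not_full);
    }
    pthread_mutex_unlock(&q->lock);
    return got;
}

/* Consumer thread: "processes" requests by summing them into its exit value. */
static void *consumer(void *arg)
{
    intptr_t total = 0;
    int item;
    while (queue_get(arg, &item))
        total += item;
    return (void *)total;
}

/* Produces requests 1..nitems for NUM_CONSUMERS consumers and returns the
   sum of everything they processed. */
long run_producer_consumer(int nitems)
{
    work_queue q;
    pthread_t tids[NUM_CONSUMERS];
    void *part;
    long total = 0;
    int i, started;

    queue_init(&q);
    for (started = 0; started < NUM_CONSUMERS; started++)
        if (pthread_create(&tids[started], NULL, consumer, &q) != 0)
            break;
    for (i = 1; started > 0 && i <= nitems; i++)
        queue_put(&q, i);
    queue_close(&q);
    for (i = 0; i < started; i++) {
        pthread_join(tids[i], &part);
        total += (long)(intptr_t)part;
    }
    return total;
}
```

Notice that nobody ever "gives a job" to a particular thread or starts and
stops one. The producer only announces work; whichever consumer is free
picks it up.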

Some will argue that "Thread Pool" is the same thing. Yes, but.
We've seen SO many questions about "stopping and starting" 
threads in a pool, "giving a job" to a specific thread etc.
People try to implement something along these lines and totally
mess up.

Read my book. Read Dave's book. Read pretty much any of the 
books. We all say (more or less) the same thing.

So, don't think "Thread Pool", think "Producer/Consumer". You'll
be happier.

A good example of a Producer/Consumer problem can be found in
the code examples on


 Q271: Where did "Thread" come from?  

I just picked up on your post to comp.programming.threads, and I noticed
your (?) concerning the term "thread."  I first heard this term used in the
late '60 in a commercial IBM S/360 shop here in Dallas, Tx.  One of the
"heavy weights" (Jim Broyles, still at BC/BS of TX) was writing/developing a
general purpose TP monitor: the "host" was a 360-40 running DOS, (DOS
supported 3 application "partitions": BG, FG1, FG2): the
lines/controllers/terminals managed were IBM 2260 (or "look-alikes".)  I do
not know how many threads Jim's TP monitor used, but this system was used at
BC for almost 10 years.  The system was written in assembler.  All of this
was "pre" CICS, TSO, etc.

Jim Broyles went on to become manager of System Programming for BC/BS ..  I
worked for him for maybe 5-6 years in the mid '70's.  Support for
application threading in S/360 DOS was likely pretty "limited", but big "OZ"
... S/360 OS-MFT/MVT, SVS, MVS provided good facilities for
multi-programming, and, IBM was pushing MP and AP (smp) systems.  We had a
S/370 AP system installed when I left BC/BS (1979).

Net/net, the term has "been around a while."

 Q272: How do I create threads in a Solaris driver?  

Kernel space is a different beast from user space threading and I
don't deal with that myself.  BUT I know that Solaris kernel threads
are very similar to POSIX threads. You can run them from device drivers.
The DDI should have the interface for threads, but like I said, I've
never paid it much attention.

I would think that a call to your support line should tell you where to

> Hi, I found your threads FAQ web page and wondered if you'd mind answering
> a question for me.  I'm writing a miscellaneous driver for Solaris (that is
> it isn't tied to hardware) and would like to know how to create my own
> threads in kernel space.  At first glance, there appears to be no support
> for this through the usual DDI/DDK means.  Is this the truth ?  Is there
> a way around this ?  Or is the best way to fake it by doing something like
> using a soft interrupt or timeout to start a function that never returns ?
> Darren

 Q273: Synchronous signal behavior inconsistent?  


Yes, it *seems* weird, but it's not. (Well, maybe it is still weird,
but at least there's some logic to it.)

If a program accesses unmapped memory, it will trap into the kernel,
which will say to itself something like "What a stupid programmer!"
and then arrange for a SIGSEGV for that program. Basically it will
pick up the program counter right then and there and move it to the
signal handler (if any). That's how synchronous signals work.

If you send a signal, any signal, to the process yourself, that will
be an asynchronous signal. EVEN if it's SIGSEGV or SIGBUS. And the
sigwaiter will then be able to receive it.


> So, I guess things are not working quite right in that sometimes a
> blocked signal is not delivered to the - only -  thread which is waiting
> for it.
> I coded an example in which SIGBUS is blocked and a thread is on
> sigwait. I arranged the code so that SIGBUS is "internally" generated,
> i.e. I coded a thread that is causing it on purpose. The process goes
> into a spin.
> If I kill the process with -10 from another shell, the result is as
> expected (the thread on sigwait catches it).
> I find that a little weird.
> Thanks for your suggestions,
> Antonio


 Q274: Making FORTRAN libraries thread-safe?  

"James D. Clippard" wrote:

> I have a need to use several libraries originally written in FORTRAN as part
> of a numerically intensive multithreaded application.  The libraries are
> currently "wrapped" with a C/C++ interface.
> -----
> My question is: How might one safely accomplish such a task, given FORTRAN's
> non-reentrant static memory layout?
> -----

The answer is really "it depends".  Firstly, with multi-threading you are going
beyond the bounds of what is possible in standard C/C++, so any solution is
by definition system dependent.  I'm not sure off hand if any version of
FORTRAN (eg 90 or 95) has standardised support for threading, but
somehow doubt it.  F77 never had any standardised support for multi-threading.

Second, "FORTRAN's non-reentrant static memory layout" is not strictly true.
It is definitely not true with F90 or F95.  With F77 (and before) things
are a little ambiguous --- eg lots of vendor specific extensions --- so you
will need to look at documentation for your compiler, or try a couple
of test cases like

            CALL TEST
            CALL TEST
            END

            SUBROUTINE TEST
            INTEGER I
            WRITE (*,*) I
            I = I+1
            END

to see what happens.  I recall (from F77 days) some keywords like
AUTO and SAVE that control whether a variable is static or
auto.  I don't know how widespread they were (or whether or
not they were standard), as my coding practice rarely relied
on that sort of thing.

If your FORTRAN code uses things like common blocks, then you
essentially have a set of static variables that you need to control
access to.  Much the same as you would need for accessing
static variables in C/C++.

In general, you are probably safest using some of the following
schemes.  None of these are really specific to FORTRAN.

1)  Provide a set of wrapper functions in C/C++.  Have the wrapper
functions use mutexes or similar to prevent multiple threads invoking
particular sets of FORTRAN functions.  For example, if
FORTRAN SUBROUTINE A calls B calls C, and you have
a wrapper for each, ensure that a call to C prevents a call to A
on another thread UNLESS you know that all variables in
C are auto.

2)  Control access to common blocks, as hinted above.
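
Scheme 1 might look roughly like the sketch below. The routine
legacy_accumulate is a C stand-in for a Fortran subroutine that keeps static
state, so that the example is self-contained. In a real build it would be the
external Fortran symbol, whose name and calling convention depend on your
compiler. All names here are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define CALLS_PER_THREAD 1000

/* Stand-in for a Fortran routine that keeps SAVEd (static) state.  In a
   real build this would be the external Fortran symbol, not C code. */
static double legacy_accumulate(double x)
{
    static double running_total = 0.0;   /* plays the role of a SAVE variable */
    running_total += x;
    return running_total;
}

/* One lock covers every routine that touches the same static state. */
static pthread_mutex_t legacy_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread-safe wrapper: only one thread at a time is inside the library. */
double safe_accumulate(double x)
{
    double result;

    pthread_mutex_lock(&legacy_lock);
    result = legacy_accumulate(x);
    pthread_mutex_unlock(&legacy_lock);
    return result;
}

/* Each worker calls the wrapper many times. */
static void *worker(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < CALLS_PER_THREAD; i++)
        safe_accumulate(1.0);
    return NULL;
}

/* Hammers the wrapper from several threads and returns the final total. */
double run_wrapped_calls(void)
{
    pthread_t tids[NUM_THREADS];
    int i, started;

    for (started = 0; started < NUM_THREADS; started++)
        if (pthread_create(&tids[started], NULL, worker, NULL) != 0)
            break;
    for (i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return safe_accumulate(0.0);
}
```

If A calls B calls C and each has its own wrapper, all three wrappers should
share the same lock, and the lock must be taken only at the outermost call
(or be a recursive mutex), otherwise the wrappers will deadlock on
themselves.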

> BTW, one of the libraries I am using is SLATEC.  Given that all NRL's base much
> of their research code on SLATEC, I suspect that someone has elegantly
> surmounted this problem.

 Q275: What is the wakeup order for sleeping threads?  

Raghu Angadi wrote:

> A. Hirche wrote:
> >
> > Is it
> > (a) the first thread in the queue (assuming there is an ordered list of
> > waiting threads)
> > (b) any thread (nondeterministic choice)
> > (c) a thread chosen by some other scheme
> Threads are queued in priority order.
> So the thread with the maximum priority will get the mutex.
> If there is more than one thread with max priority, then it is
> implementation dependent.

Not quite!

Actually, POSIX places mutex (and condition variable) wakeup ordering
requirements only when:

  1. The implementation supports the _POSIX_THREAD_PRIORITY_SCHEDULING
     option.
  2. The threads waiting are scheduled using the SCHED_FIFO or SCHED_RR
     policies defined by POSIX.

If these conditions are true, then POSIX requires that threads be awakened
in priority order. Multiple threads of identical priority must be awakened
"first in first out".

For threads that don't use SCHED_FIFO or SCHED_RR (e.g., the default
SCHED_OTHER on many UNIX systems, which behaves more like traditional
UNIX timeshare scheduling), the wakeup order is implementation defined.
Most likely it's still priority ordered, but it need not be, and
there's no definition of how they may interact with POSIX policies.

And, in any case, except on a uniprocessor, saying that "the highest
priority thread gets awakened" is not the same thing as "the highest
priority thread gets the mutex". Some other thread on another processor
might lock the mutex first, and the high priority thread will just go back
to sleep. This can happen even on a uniprocessor if the thread that unlocked
the mutex has a priority equal to that of the highest priority waiter, (it
won't be preempted), and it locks the mutex again before the waiter can run.
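
As a rough illustration of the two conditions above, the sketch below checks
for the priority scheduling option at run time and prepares a thread
attributes object that requests SCHED_FIFO. The function names
has_priority_scheduling and init_fifo_attr are made up for the example.
Actually creating a thread with such attributes usually needs special
privileges, so the sketch stops short of calling pthread_create().

```c
#include <pthread.h>
#include <sched.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if the system reports the _POSIX_THREAD_PRIORITY_SCHEDULING
   option at run time, 0 otherwise. */
int has_priority_scheduling(void)
{
    return sysconf(_SC_THREAD_PRIORITY_SCHEDULING) > 0;
}

/* Prepares attr so that a thread created with it asks for SCHED_FIFO at
   the given priority.  Returns 0 on success or an error number. */
int init_fifo_attr(pthread_attr_t *attr, int priority)
{
    struct sched_param param;
    int err;

    if ((err = pthread_attr_init(attr)) != 0)
        return err;
    /* Without PTHREAD_EXPLICIT_SCHED the new thread inherits its creator's
       policy and the settings below are ignored. */
    err = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (err == 0)
        err = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (err == 0) {
        memset(&param, 0, sizeof param);
        param.sched_priority = priority;
        err = pthread_attr_setschedparam(attr, &param);
    }
    if (err != 0)
        pthread_attr_destroy(attr);
    return err;
}
```

A priority between sched_get_priority_min(SCHED_FIFO) and
sched_get_priority_max(SCHED_FIFO) is a valid argument. Even with all of this
in place, code should still recheck its predicate after waking up, for the
reasons given above.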

/---------------------------[ Dave Butenhof ]--------------------------\
 Q276: Upcalls in VMS?  

Eugene Zharkov wrote:

> I am somewhat confused by the OpenVMS Linker documentation, by the part
> which describes the /threads_enable qualifier. Here is an extract from
> there:
> [...]
> What confuses me is the following. A section about two-level scheduling
> and upcalls in the Guide to DECthreads explicitly states that "this
> section applies to OpenVMS ALPHA only". The above description of the
> MULTIPLE_KERNEL_THREADS option states that the option is applicable only
> to ALPHA systems. The above description of the UPCALLS options does not
> mention a system it applies to. Does that mean that the upcalls
> mechanism is actually implemented on OpenVMS VAX?

(Apologies to Dan Sugalski for apparently ignoring his answers, but since
he wasn't completely sure, I figured it was best to go back to the
beginning for a definitive resolution.)

Despite the cleverly leading hint in the documentation, you should assume
that neither upcalls nor kernel threads exist, nor will they ever exist,
on OpenVMS VAX. While most of the infrastructure for upcalls has been
implemented, there were some "issues" that were never resolved due to lack
of resources, and it has been officially deemed low priority.

Nevertheless, it is theoretically possible that, given enough signs of
interest, the implementation of upcalls on OpenVMS VAX could be completed.
There will never be kernel threads on OpenVMS VAX. (We all know that one
should never say "never".)

/---------------------------[ Dave Butenhof ]--------------------------\
 Q277: How to design synchronization variables?  

"Kostas Kostiadis"  writes:

> Are there any rules or techniques to build and test
> synchronisation protocols, or is it a "do what you think
> will work best" thing?

Look up the following paper:

Paper Title: ``Selecting Locking Primitives for Parallel Programming''

Paper Author: Paul E. McKenney

Where Published: Communications of the ACM, 
         Vol 39, No 10, 75--82, October 1996.

It is exactly what you need. In the paper, McKenney describes a pattern
language that helps in the selection of synchronization primitives for
parallel programs ...


 Q278: Thread local storage in DLL?  

> I've written some memory allocation routines that may work when my DLL is called
> from multiple threads. The routines use thread local storage to store a table of the
> memory objects that have been allocated. I'm concerned that the code will not work
> properly when the dll is loaded explicitly using LoadLibrary.
> Has anyone experienced that problem?
> Is there a simple solution for a DLL(Win32)?
> Can I allocate memory somehow in my main dllentry point routine?
> Do I need to put a mutex around the calls to malloc to ensure that the code is
> thread-safe?
> You can email me at
> Rob

    Microsoft has explicitly stated that what you are doing will not work
when your DLL is loaded with LoadLibrary. The correct solution is to
explicitly use the Tls* functions. Read up on TlsAlloc, TlsFree,
TlsGetValue, and TlsSetValue.

    Of course, you can always roll your own by using GetCurrentThreadId()
to get an index into a sparse array protected with a CRITICAL_SECTION.

 Q279:  How can I tell what version of linux threads I've got?  

>>>>> "Phil" == Phil McRevis  writes:

    Phil> How can I tell what version of linux threads I've got on my
    Phil> system?  I browsed through the debian bug database and
    Phil> didn't find anything with "threads" in the list of packages
    Phil> or even what version of pthreads is included in the debian
    Phil> distribution.

executing glibc gives you useful information, as in:

levanti@meta:~$ /lib/
GNU C Library stable release version 2.1.2, by Roland McGrath et al.
Copyright (C) 1992, 93, 94, 95, 96, 97, 98, 99 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
Compiled by GNU CC version 2.95.2 19991109 (Debian GNU/Linux).
Compiled on a Linux 2.2.12 system on 1999-12-25.
Available extensions:
    GNU libio by Per Bothner
    crypt add-on version 2.1 by Michael Glad and others
    linuxthreads-0.8 by Xavier Leroy
    NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
    NSS V1 modules 2.0.2
    libthread_db work sponsored by Alpha Processor Inc
Report bugs using the `glibcbug' script to .

    Phil> What version of linux pthreads is considered to be the most
    Phil> stable and bug-free?

No idea.  Hopefully the most recent version.

    Phil> If I need to upgrade the pthreads package on debian, what's
    Phil> involved in doing that?

apt-get update ; apt-get upgrade

 Q280: C++ exceptions in a POSIX multithreaded application?  

On Sun, 16 Jan 2000 21:38:29 -0800, Bil Lewis  wrote:
>Jasper Spit wrote:
>> Hi,
>> Is it possible to use c++ exceptions in a POSIX multithreaded application,
>> without problems ?
>No. Using C++ exceptions is always a problem. (ho, ho, ho).
>But seriously... the interaction between exceptions & Pthreads
>is not defined in the spec. Individual C++ compilers do (or don't)
>implement them correctly in MT code. EG, Sun's 1993 C++ compiler
>did it wrong, Sun's current C++ compiler does it right.

Could you expand on that? What does it mean for a C++ compiler to do it right?  If we
can put together a set of requirements to patch POSIX thread cancellation and
C++ together, I can hack something up for Linux.

The questions are:

- What exception is the cancellation request turned into in the target thread?
  What is the exception's type? What header should it be defined in?

- Upon catching the exception, what steps does the target thread take to
  terminate itself? Just re-enter the threads code by calling pthread_exit()?

- Are the handlers for unhandled and unexpected exceptions global or
  thread specific?

- Do unhandled cancellation exceptions terminate the entire process?

- By what interface does the process arrange for cancellations to turn into
  C++ exceptions?

- What is the interaction between POSIX cleanup handlers and exception
  handling? Do the handlers get executed first and then exception processing
  takes place? Or are they somehow nested together? 

- Does POSIX cleanup handling play any role in converting cancellation
  to a C++ exception?

In article , suggested:
>On Thu, 16 Dec 1999 22:00:37 -0500, John D. Hickin 
>>David Butenhof wrote:
>>> by. It would still be wrong. You need to use 'extern "C"' to ensure that the
>>> C++ compiler will generate a function with C calling linkage.
>>Also this:
>>extern "C" void* threadFunc( void* arg ) {
>>  try {
>>     ...
>>  }
>>  catch( ... ) {
>>    return reinterpret_cast<void*>(1); // say;
>>  }
>>  return 0;
>>}
>>It is manifestly unsafe to let a C++ exception unwind the stack of a
>>function compiled by the C compiler (in this case, the function that
>>invokes your thread function).
>To clarify; what you appear to be saying is that it's a bad idea to allow
>unhandled exceptions to percolate out of a thread function.

Actually, I think what he's saying is stronger than that, and I'd like to 
clarify it, since I'm finally updating a lot of my C++/DCE code to use C++ 
exceptions. He's saying not to *return* from inside a try or catch block, 
since it will force the C++-compiled code to unwind the stack past the C++ 
boundary and back into the C code.

Personally, one of the things that's kept me from using exception handling 
where I could avoid it was that I couldn't find a definitive answer as to 
whether it's safe and efficient to return like that. According to 
Stroustrup, it is, but this points out that it can be tricky in mixed 

  Scott Cantor              If houses were built the way software
  Univ Tech Services        is built, the first woodpecker would
  The Ohio State Univ       bring down civilization.
                                 - Anon.
 Q281: Problems with Solaris pthread_cond_timedwait()?  

In article <>,
John Garate   wrote:

> I can't explain why, but I can say that if you call
> pthread_cond_timedwait() with a timeout
> less than 10ms in the future that you'll get return code zero.  Since I
> call it in a loop, the
> loop spins until finally the timeout time is in the past and
> pthread_cond_timedwait() returns
> ETIMEDOUT.  This happens for me on Solaris 2.6.  If you call it with a
> timeout greater than 10ms in the future, it'll return ETIMEDOUT after
> waiting awhile, but it does so slightly BEFORE the requested time, which
> conflicts with the man-page.

This has nothing to do with spurious wakeups from pthread_cond_wait().
It is just a bug in Solaris 2.6 (and in Solaris 7 and everything earlier):

 BugID: 4188573
 Synopsis: the lwp_cond_wait system call is broken at small timeout values

This bug was fixed in Solaris 8 and is being patched back to Solaris 7.
There are no plans for patching it back to Solaris 2.6 or earlier.

True, the ETIMEDOUT timeout occurs up to a clock tick (10ms) before
the requested time.  This is also a bug, but has not been fixed.
Of course, expecting time granularity better than a clock tick
is not reasonable.

Roger Faulkner


 Q282: Benefits of threading on uni-processor PC?

>Can someone please tell me what the benefits
>of threading are when the potential environment
>is a main-stream uni-processor PC?

The benefits are that you can interrupt the execution of some low priority
task to quickly respond to something more important that needs immediate
attention.  That is the real time event processing benefit.

Another benefit is that your program can do something while it is waiting for
the completion of I/O.  For example, if one of your threads hits a page fault,
your program can nevertheless continue computing something else using another
thread.

Those are the two main benefits: decreasing the overall running time by
overlapping input/output operations with computation, and controlling the
response times to events through scheduling, prioritizing and preemption.

The secondary benefit is that some problems are easy to express using
concurrency which leads to elegant designs.

>Concurrent execution and parallel execution are
>2 different things.  Adding the overhead that you
>get by using multiple threads, looks like a decrease
>in performance...

That depends. Sometimes it is acceptable to eat the overhead. If you have to
respond to an event *now*, it may be acceptable to swallow a predictably long
context switch in order to begin that processing.

>What is all this business about "better utilisation of
>resources" even on uni-processor hardware?

Multitasking was invented for this reason. If you run jobs on a machine in a
serial fashion, it will squander computing resources, by keeping the processor
idle while waiting for input and output to complete. This is how computers
were initially used, until it was realized that by running mixes of concurrent
jobs, the computer could be better utilized.

>All the above is based on NON-network based
>applications.  How does this change when you application
>is related with I/O operations on sockets?

In a networked server application, you have requests arriving from multiple
users.  This is just a modern variant of programmers lining up at the data
centre window to submit punched cards. If the jobs are run one by one, you
waste the resources of the machine. Moreover, even if the resources of the
machine are not significantly wasted, when some programmer submits a very large
processing job, everyone behind has to wait for that big job to complete, even
if they just have little jobs. Moreover, they have to wait even if their
jobs are more important; there is no way to interrupt the big job to run
these more important jobs, and then resume the big job.

The same observations still hold true in a networked server. If you handle
all of the requests serially, you don't make good use of the resources.  You
don't juggle enough concurrent I/O requests to keep the available peripherals
busy, and the processor sits idle.  Moreover, if a big request comes in that
takes a long time to complete, the processing of additional requests grinds to
a halt.

 Q283: What if two threads attempt to join the same thread?  

On Fri, 18 Feb 2000 22:45:17 GMT, Jason Nye  wrote:
>Hello, all
>If a thread, say tX, is running (joinable) and both thread tY and tZ attempt
>to join it, what is the correct behaviour of pthread_join:
Any behavior is correct, because the behavior is undefined. A thread may be
joined by only one other thread.   Among acceptable behaviors would be
that of your program terminating with a diagnostic message, or behaving

 Q284: Questions with regards to Linux OS?  

>    I have some basic questions with regards to Linux OS
>    1) What types of threads (kernel/user space) and Bottom-Handler can
>exist inside a task-list??

Both kernel and user space threads can go into a *wait queue*.

A *task queue*, though unfortunately named, is something else. A task queue
basically has lists of callbacks that are called at various times. These are
not threads. 

>    2) Can I add a user space thread to a task-list?

You cannot add threads to task queues. You can use a wait queue to block a
thread. This is done by adding a thread to the wait queue, changing its state
to something like TASK_INTERRUPTIBLE (interruptible sleep) and calling
schedule() or schedule_timeout().

>    3) I would like to change a thread's priority within a task-list
>from a bottom handler. How can I do it?

With great difficulty, I suspect. You might be better off just having the
thread adjust its priority just before sleeping on that queue.

 Q285: I need to create about 5000 threads?  

Efremov Stanislav wrote:
> I need to create about 5000 threads simultaneously (it'll be a powerful
> server on NT machine, each client invokes it's own thread)
> Did anybody write programs like this? Is it a good idea to invoke a thread
> for each connection?

    No, it's really bad.

> I should write this program on Java. Can you say also, can it be implemented
> with so many threads?

    Not likely.

> I'm really appreciate your opinion. Thanks in advance.

    Use a more rational design approach. For NT, completion ports would be
a good idea.

 Q286:  Can I catch an exception thrown by a slave thread?

Jan Koehnlein wrote:
> Hi,
> does anyone know if it's possible to catch an exception thrown by a
> slave thread in the master thread using C++?

Yes. But you need some extra infrastructure.

You can do this with RogueWave Threads.h++ using what is called an IOU;
I believe also that the ACE Toolkit may implement something similar that
is called a future (but I haven't looked into that aspect of ACE).

Basically an IOU is a placeholder for a result that is computed
asynchronously. To get the result you redeem the IOU. Then you may:

1) get the result that was previously computed,
2) block, if the result isn't yet available.
3) see an exception raised, if the async thread threw one.

The implementation catches the exception in the async thread and copies
it into a location where the IOU can see it. On redemption it is thrown.

Regards, John.
 Q287: _beginthread() versus CreateThread()?  

In article <38be0349$0$18568@proctor>, lee  wrote:

% 1.   Why should I use _beginthread() instead of CreateThread() when using
% the c runtime libs ?

Because the people who wrote the library said so. The documented effect
of not using _beginthread() is that you can have per-thread memory leaks,
but I always like to think that the next release will have some catastrophic
problem if you don't do things their way.

% 2.    What can i use the saved thread_id for ? (as opposed to using the
% handle to the thread)

Some functions take a thread id (PostThreadMessage comes to mind),
so sometimes you need that. I like to close handles as soon as possible,
so I don't have to keep track of them.

As I recall, you were taking some steps to set up a variable to hold
the thread ID, but you weren't setting it up correctly. That would be
the only reason I mentioned it. If you want to pass NULL, then just
pass NULL.


Patrick TJ McPhee
East York  Canada

>thanks - still got a few questions though
>... (forgive me if these questions are stupid - still very much a newbie)
>1.   Why should I use _beginthread() instead of CreateThread() when using
>the c runtime libs ?

First of all, you should not use _beginthread() but _beginthreadex().
The _beginthread() function is completely brain-damaged and should
not be used.

The answer to your question is somewhat involved. 

Some standard C functions have interface semantics that are inherently
non-reentrant, and require thread local storage in order to work reasonably
in a threaded environment.

The _beginthread() function is a wrapper around CreateThread which diverts the
newly created thread to a startup function within the C library. This startup
function allocates thread local resources before calling your start function.
More importantly, the function cleans up these thread local resources when your
start function returns.

If you use CreateThread, control is not diverted through the special start
function, so that if your thread also uses the standard library, causing it to
acquire thread local storage, that storage will not be cleaned up when the
thread terminates, resulting in a storage leak.

At the heart of the problem is Microsoft's brain damaged interface for managing
thread local storage, which doesn't permit a destructor function to be
associated with a thread local object.  Thus if a library needs to be notified
of a terminating thread so it can clean up its thread local resources, it needs
to either provide its own thread creating function that the client must use for
all threads that enter that library; or it must provide explicit thread attach
and detach functions (like COM's CoInitialize and CoUninitialize); or it must
be dynamically linked and use the thread destruction notifications passed
through DllMain.

A related problem is that Microsoft does not view the standard C library as
being an integral component of the Win32 system interface, but it is rather an
add on for Visual C. Thus the Win32 core is not ``aware'' of the C library. 

>2.    What can i use the saved thread_id for ? (as opposed to using the
>handle to the thread)

The handle is much more important; for one thing, it lets you wait on the
thread termination. The _beginthread function calls CreateThread internally
and then immediately closes the handle. The _beginthreadex function
casts the handle to an unsigned long and returns it.


 Q288: Is there a select() call in Java??  

Not in 1.2. Maybe in the future? 

> Yes, I was just looking at that JSR. It isn't clear if that
> includes select/poll or not.

I submitted a (pretty idiotic) comment on the JSR, and got a very polite
reply from Mark Reinhold that, while it wasn't heavy on detail, was enough
to convince me, when I thought about it, that they've got a pretty neat
design that can efficiently subsume select/poll, asynch I/O, SIGIO et al,
and does so much more tidily than anything that I had dreamed up.

> Do you have any pointers to poll() & Java?

The Developers Guide (PostScript document) that comes with JDK 1.2.1_04 (for 
all I know, possibly other versions too, e.g. 1.2.1_03 or 1.2.2_05) talks 
about it, and one of the four packages in that release is SUNWj2dem, which
contains the source code (Java and C) for the demo poller code.  Be warned
that it is Solaris-specific and the antithesis of pure Java...

Hi Bil,

Hope you enjoyed the tip, I really enjoyed your book.  After spending a fair
amount of time playing with InterruptedException, I view interruption as
just another kind of signal.  I almost never use it for interruption per se,
but I have wondered about using it as a "notify with memory," so that even
if the thread isn't waiting right now it can still get the message.

Are you involved with agitating for select() and better-defined
InterruptedIOExceptions in a future version of Java?  I'll sign the
petition. :-)


 Q289: Comment on use of VOLATILE in the JLS.?  

>It is my opinion and my experience that inclusion
>of VOLATILE in Java has led to nothing but confusion
>and bad coding practices.

*Personally*, I completely agree.

>ANYWAY, I think it would be of some value to include a
>statement of where VOLATILE is and isn't useful and
>examples of same. In particular, that VOLATILE is almost
>always the WRONG thing to use, and that programmers
>should avoid it unless they have a VERY clear understanding
>of the consequences.

We might be able to squeeze something in. However, bear in mind that the 
primary purpose of the JLS is to specify the language semantics, not to 
teach people how to use it.

[Some time after this exchange, the issue of the description of the
memory model required by Java popped up, led by Lea & Pugh. The gist
of this is that a pile of details will be fixed AND VOLATILE will be
given more adequate semantics, making it POSSIBLE to use correctly.
It will still be *VERY* difficult and should still be avoided.]

Gilad Bracha
Computational Theologist
Sun Java Software

 Q290:  Should I try to avoid GC by pooling objects myself??  

[From a discussion in Java Report]

Dear Dwight, Dr. Kolawa,

In his article, Dr. Kolawa made a number of good points, but he also
said one thing that drives me crazy. I've seen different people give
different versions of this time and time again and it's just not good.
We'll call this "Fear of Garbage."

This is the preoccupation that so many C and C++ programmers
have with memory usage. In those languages, this concern is well-
founded, but with garbage collectors this concern becomes moot.
Dr. K suggests setting a variable to NULL to allow the GC to collect
garbage earlier. While it is certainly true that eliminating one
pointer to an object will make it more likely to be collected earlier,
it's the wrong thing to do.

The whole idea of GCs is that you DON'T spend your time worrying
about temporary usage of memory. Yes, you may indeed increase the
footprint of your program a bit by not worrying about every reference,
and you can always invent a worst case such as in his article, but
normal programming practices will avoid even these. If he had written
his example like this:


that monster splash image would have been GC'd as it went out of
scope naturally.

Now it is possible to write programs that stuff more and more data
onto a list, data that will never be used again. And that IS a memory
leak and you do have to avoid doing that. Infinite history lists for
example. But that's a different problem. Dr. K is referring to an
issue concerning singletons which just isn't a problem.

In brief, "If it ain't broke, don't fix it."


 Q291:  Does thr_X return errno values? What's errno set to???  

> Some of the thr_X() man pages seem to say the functions return the errno
> value.
> Is this really correct (instead of returning, say, -1 and then setting
> errno)?
> If it is correct, is errno also set correctly?

Yes, that is correct. They return the error value. And errno is "set
correctly" -- meaning that it has no defined value because it isn't
involved in this API. NB: As a side-effect of some functions, errno
WILL have a value on some systems. But it ain't defined, so don't use it.
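
[Editor's sketch, not from the original answer: the POSIX pthread_*()
functions follow the same convention as thr_X(), so they are used here to
show it. The error number comes back as the return value and errno is never
consulted. The helper name try_lock_twice() is hypothetical.]

```c
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Lock a mutex, then try to lock it again without blocking.
   Returns whatever pthread_mutex_trylock() returned. */
int try_lock_twice(void)
{
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    int rc;

    pthread_mutex_lock(&m);

    /* The mutex is already held, so the call fails. The failure is
       reported through the return value (EBUSY), not through errno. */
    rc = pthread_mutex_trylock(&m);
    if (rc != 0)
        fprintf(stderr, "trylock failed: %s\n", strerror(rc));

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

Note that the error text is produced with strerror(rc); calling perror()
here would report whatever happened to be in errno, which is exactly the
undefined value warned about above.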

After responding to your message I went to your website. I am worried that
if this is the message you are giving your students, you are doing them a
disservice. They will never be able to write real applications in Java.

For the classes I teach in threading (both POSIX and Java),
I found it useful to have some example programs for the students
to build upon. One of these programs is a server program which
accepts requests on sockets.

I have just finished polishing up the robust, select() version
of the server (both POSIX and Java) and would love to have folks
take a look at it.



There are four server programs, each accepts clients on a port 
and handles any number of requests (each request is a byte string
of 70 characters).

> There is the simple server, which is single-threaded.

> There is the master/slave server, which is multithreaded and spawns
  one thread to handle each request.

> There is the producer/consumer server, which is multithreaded, but
  spawns off a new thread to receive requests from each new client
  (replies are done by the pool of consumer threads).

> Finally, there is the select server, which is multithreaded and 
  which has only a single thread doing all of the receiving. The 
  producer thread does a select() on all current clients AND the
  port AND an "interruptor" pipe. When select() returns:

  o Requests from clients go onto a queue for the consumer threads to

  o New connections to the port are accept()'d and new clients are

  o Finally, if it is "shutdown" time, a message is sent on the interruptor
    pipe and everyone finishes up and stops.

  This program handles 1k clients, survives client failure, and reliably
  shuts down. (At least it works when *I* test it!)
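
[Editor's sketch, not the code from the server described above: the core of
the "interruptor pipe" idea is a select() call that watches the read end of
a pipe alongside the ordinary descriptors, so that another thread can wake
the selecting thread by writing one byte. The names wait_for_work() and
enum wake_reason are hypothetical, and a single client descriptor stands in
for the whole client set.]

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

enum wake_reason { WAKE_ERROR = -1, WAKE_CLIENT = 0, WAKE_SHUTDOWN = 1 };

/* Block until client_fd is readable or a byte arrives on interrupt_fd
   (the read end of the interruptor pipe). */
enum wake_reason wait_for_work(int client_fd, int interrupt_fd)
{
    fd_set readable;
    int maxfd = client_fd > interrupt_fd ? client_fd : interrupt_fd;

    FD_ZERO(&readable);
    FD_SET(client_fd, &readable);
    FD_SET(interrupt_fd, &readable);

    /* No timeout: sleep until one of the descriptors is ready. */
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
        return WAKE_ERROR;

    /* Check the pipe first so a busy client cannot delay shutdown. */
    if (FD_ISSET(interrupt_fd, &readable))
        return WAKE_SHUTDOWN;
    return WAKE_CLIENT;
}
```

A thread that wants to stop the server only has to write() a byte to the
write end of the pipe; the producer thread then falls out of select() even
if no client is sending anything.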

All these programs have been tested only on Solaris 2.6 on an SS4, but
should run on any UNIX system. The code (along with a pile of other
programs) is located at:

in the directory programs/PThreads, there is a Makefile for Solaris
which gets regular use, also Makefiles for DEC, HP, and SGI, which
get a lot less use. Hence:

bil@cloudbase[182]: make

bil@cloudbase[183]: setenv DEBUG    (If you want LOTS of output)

bil@cloudbase[184]:  server_select 6500 100 0 10 30 &
Server_9206(TCP_PORT 6500 SLEEP 100ms SPIN 0us N_CONSUMERS 10 STOPPER 30s KILLER
Starting up interation 0.
Server bound to port: 6500
Server up on port 6500. Processed 669 requests. 41 currently connected clients.
Time to stop!
Shutdown successful. 676 replies sent.


bil@cloudbase[185]: client 6500 1 1 50      (Better in different window!)
Client_9207(PORT 6500 SLEEP 1ms SPIN 1us N_SOCKETS 50 N_REQUESTS 10000)
Connected to server on port 6500
Connected to server on port 6500
Connected to server on port 6500
Connected to server on port 6500
Client_9207[T@9]    Receiving segments on fd#6...
Client_9207[T@8]    Sending 10000 requests on fd#6...
Client_9207[T@7]    Receiving segments on fd#5...
Client_9207[T@6]    Sending 10000 requests on fd#5...
Client_9207[T@5]    Receiving segments on fd#4...
Client_9207[T@4]    Sending 10000 requests on fd#4...


The Java program is quite similar and even uses much of the same
C code for select(). (Java does not have an API for select(), so
we have to use the native select() via JNI.) The Java server is
happy to run with the C client & vice-versa.

bil@cloudbase[192]: cd programs/Java/ServerSelect
bil@cloudbase[193]: setenv THREADS_FLAG native
bil@cloudbase[194]: setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:.
bil@cloudbase[195]: setenv CLASSPATH ../Extensions/classes:.

bil@cloudbase[198]: java Server 6500 100 0 10 30
Server(port: 6500 delay: 100ms spin: 0us nConsumers: 10 stopperTimeout 30s)
Server now listening on port 6501
Server up on port 6501. Processed 2113 requests. 10 clients.

Everything stopped.

bil@cloudbase[303]: java Client 6500 100 0 10
Client(port: 6500 sDelay: 100 (ms) rDelay: 0 (ms) nClients: 10)

Actual Port: 6501
Client[Thread-0]    Started new sender.
Client[Thread-1]    Started new receiver.
Client[Thread-2]    Started new sender.
Client[Thread-3]    Started new receiver.
Client[Thread-0]    Sent: 100 requests.
Client[Thread-6]    Sent: 100 requests.
Client[Thread-8]    Sent: 100 requests.
Client[Thread-14]   Sent: 100 requests.
Client[Thread-4]    Sent: 100 requests.

 Q292: How I can wait more then one condition variable in one place?  

Condition variables are merely devices to delay the execution of a thread.  You
don't need more than one of these in order to accomplish that task.

Threads that use condition variables actually wait for a predicate to become
true; the evaluation of that predicate is done explicitly in the code, while
holding a mutex, e.g.

    lock mutex

    while not predicate()
        wait (condition, mutex)

    unlock mutex

You can easily wait on an array of predicates, provided that they are
protected by the same mutex.  Simply loop over all of them, and if
they are all false, wait on the condition. If some of them are true,
then perform the associated servicing.
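
[Editor's sketch, not from the original post: the loop over an array of
predicates described above could look roughly like this with POSIX threads.
The names N_EVENTS, pending[], post_event() and wait_any_event() are
hypothetical.]

```c
#include <pthread.h>
#include <stdbool.h>

#define N_EVENTS 3

/* One mutex and one condition variable protect all the predicates. */
static pthread_mutex_t ev_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ev_cond = PTHREAD_COND_INITIALIZER;
static bool            pending[N_EVENTS];

/* Make predicate i true and wake a waiter. */
void post_event(int i)
{
    pthread_mutex_lock(&ev_lock);
    pending[i] = true;
    pthread_cond_signal(&ev_cond);
    pthread_mutex_unlock(&ev_lock);
}

/* Block until at least one predicate is true, consume the lowest-numbered
   one, and return its index so the caller can service it. */
int wait_any_event(void)
{
    int found = -1;

    pthread_mutex_lock(&ev_lock);
    while (found < 0) {
        for (int i = 0; i < N_EVENTS; i++) {
            if (pending[i]) {
                pending[i] = false;
                found = i;
                break;
            }
        }
        if (found < 0)                       /* all false: sleep */
            pthread_cond_wait(&ev_cond, &ev_lock);
    }
    pthread_mutex_unlock(&ev_lock);
    return found;
}
```

Because every predicate is re-examined after each wakeup, it does not matter
which event caused the signal or whether the wakeup was spurious.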

>I need pthread analog  for WaitForMultipleObjects (WIN32)  or
>DosWaitMuxWaitSem (OS2)

You should not need this, unless you (or someone else) made the design mistake
of allowing operating system synchronization objects to be used as an interface
mechanism between program components.

If two unrelated objects, say A and B, are both to generate independent events
which must give rise to some processing in a third object C, which has
its own thread then there is a need for C to be able to wait for a signal
from either A or B.   

The programmer who is oblivious to issues of portability might arrange for
object C to expose two operating system objects, such as Win32 events; have A
and B signal these; and have C do a wait on both objects.

A technique which is easier to port is to use only the programming language
alone to make an interface between A, B and C.  When A and B want to signal C,
they call some appropriate interface methods on C rather than invoking
functions in the operating system specific library. These methods can be
implemented in a number of ways. For example, on a POSIX thread platform, they
can lock a mutex, set one of two flags, unlock a mutex and hit a condition
variable.  The thread inside C can lock the mutex, check both flags and use a
single condition variable to suspend itself if both flags are false.  Even on
Windows, you could handle this case without two events: use a critical section
to protect the flags, and have the thread wait on a single auto-reset event.

There is really no need for WaitForMultipleObjects in a reasonable design.
I've never needed to use it in Win32 programming despite having written plenty
of code that must respond to multiple stimuli coming from different sources. 

On Tue, 11 Jul 2000 14:05:40 -0700, Emmanuel Mogenet 
>This seems to be a favorite topic (shouldn't be in the FAQ's or something),
>but could someone please elaborate on the following questions:
>    1. If I had to implement WaitForMultipleObjects on top of pthreads
>conditions, how would I go about it

One easy way would be to write a miniature kernel which manages the objects
to be waited on. This kernel would protect itself with a mutex. A condition
variable would be used to delay the execution of each incoming thread until
the wait condition is satisfied (either all of the desired objects are in
the ``signalled'' state, or just one or more of them is in that state,
depending on the arguments to the wait):


    while ( none of the objects are signalled )
        wait(mutex, condition)


To do a more efficient job, you need more condition variables, so you don't
wake up too many threads. You could assign one condition variable to each
thread, or you could take some intermediate approach: have a pool of condition
variables. The wait grabs a condition from the pool and registers it to wait.
When an object is signalled, the implementation then hunts down all of the
condition variables that are engaged in a wait on that object, and signals
them, thereby waking up just the threads that are on that object.
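
[Editor's sketch, not from the original post: the simple single-condition
version of the "miniature kernel" might be written as below. It is not the
Win32 API and does not try to reproduce all of its semantics; objects stay
signalled until reset, as with a manual-reset event. The names struct
kobject, kobject_signal() and kobject_wait() are hypothetical.]

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* The kernel protects every object with one mutex and delays waiters
   on one condition variable. */
static pthread_mutex_t k_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  k_cond = PTHREAD_COND_INITIALIZER;

struct kobject { bool signalled; };

/* Put an object into the signalled state and wake all waiters so that
   each can re-evaluate its own wait condition. */
void kobject_signal(struct kobject *o)
{
    pthread_mutex_lock(&k_lock);
    o->signalled = true;
    pthread_cond_broadcast(&k_cond);
    pthread_mutex_unlock(&k_lock);
}

/* Wait until any one (wait_all == false) or all (wait_all == true) of the
   n objects are signalled. Returns the index of the first signalled one. */
size_t kobject_wait(struct kobject **objs, size_t n, bool wait_all)
{
    size_t first;

    pthread_mutex_lock(&k_lock);
    for (;;) {
        size_t hits = 0;
        first = n;
        for (size_t i = 0; i < n; i++) {
            if (objs[i]->signalled) {
                if (first == n)
                    first = i;
                hits++;
            }
        }
        if (wait_all ? hits == n : hits > 0)
            break;
        pthread_cond_wait(&k_cond, &k_lock);
    }
    pthread_mutex_unlock(&k_lock);
    return first;
}
```

The broadcast wakes every waiter on every signal, which is the inefficiency
that the per-thread or pooled condition variables mentioned above are meant
to remove.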

>    2. People seem to consider that WaitForMultipleObjects to be an
>ill-designedAPI, however it looks to
>        me like the semantics is very close to that of select which is to my
>knowledge considered
>        pretty useful by most UNIX programmers.

Yes, however select is for multiplexing I/O, not for thread synchronization.
Also note that select and poll ensure fairness; they are capable of reporting
exactly which objects reported true. Whereas WaitForMultipleObjects returns the
identity of just one object, so the app has to interrogate the state of each
object with wasteful system calls. The WaitForMultipleObjects function is also
restricted to 64 handles, whereas select and poll implementations can take

%     1. If I had to implement WaitForMultipleObjects on top of pthreads
% conditions, how would I go about it

In general, you can wait for N items by kicking off N threads to do the
waiting and signalling from those waiters.

If you're waiting for events which will be signaled through CVs, and you
control the code for these things, have them all signal the same CV. You
can still test for a lot of things:
 while (!a && !b && !c && !d)
   pthread_cond_wait(&cv, &mux);

If you're waiting for things which can be treated as file handles, you
can use poll() or select().

%     2. People seem to consider that WaitForMultipleObjects to be an
% ill-designedAPI, however it looks to
%         me like the semantics is very close to that of select which is to my
% knowledge considered
%         pretty useful by most UNIX programmers.

Like select(), it places an arbitrary limit on the number of things you can
wait for, so it can be useful as long as your needs don't go beyond those
limits. I think some people don't see the point of this in a multi-threaded

 Q293: Details on MT_hot malloc()?  

There are a number of malloc() implementations which scale better
than the simple, globally locked version used in Solaris 2.5 and
earlier. A good reference is:

Some comments by the author:

If you quote them in the FAQ, make sure to make a note that these
opinions are my personal ones, not my employer's.

As I tried to describe in my DDJ article, there is no best malloc
for all cases.

To get the best version, I advise the application developers
to try different versions of mt-hot malloc with their specific
app and typical usage patterns and then select the version working
best for their case.

There are many mt-hot malloc implementations available now. Here
are my comments about some of them.

* My mt-hot malloc as described in the DDJ article and the patent.

It was developed first chronologically (as far as I know). It works
well when the malloc-hungry threads mostly use their own memory.
It also uses a few other assumptions described in my DDJ paper.

The main malloc algorithm in my mt-hot malloc is the same binary
search tree algorithm used in the default Solaris libc malloc(3C).

* mtmalloc(3t) introduced in Solaris 7.

I can't comment on this version, other than to say that it's
totally different from my mt-hot malloc implementation.

* Hoard malloc

It's famous, but my test (described in the DDJ article) did not
scale with Hoard malloc at all. It appears that their realloc()
implementation is broken; at least it was in the version available
at the time of my testing (spring 2001). I've heard reports from
some Performance Computing people (who use Fortran 90 and no
realloc()) that Hoard malloc has helped their scalability very well.

Also, IMHO the Hoard malloc is too complicated, at least for the
simple cases using the assumptions described in my DDJ article.

* ptmalloc (a part of GNU libc)

I have not tested ptmalloc, so I can't comment on it.

* Smart Heap/SMP from MicroQuill

My tests of Smart Heap/SMP were not successful.

-Greg Nakhimovsky
 Q294: Bug in Bil's condWait()?  

In my Java Threads book I talk about how you can create
explicit mutexes and condition variables that behave like
POSIX. I note that you'll probably never use these, but it's
useful to think about how to build them and how they work.

Later on, I talk about the complexities of InterruptedException
and how to handle it. It's a tricky little devil. One of the
possible approaches I mention is to refuse to handle it at
all, but rather catch it, then re-interrupt yourself when
leaving your method. Hence allowing another method to see it.

A fine idea. And most of my code for this is correct.  Richard Carver
(George Mason University in Fairfax, VA, (where I grew up!))
pointed out a cute little bug in one bit of my code.

Here it is:

I wrote condWait like this:

public void condWait(Mutex mutex) {
  boolean       interrupted = false;

  synchronized (this) {
    mutex.unlock();
    while (true) {
      try {
        wait();
        break;
      }
      catch (InterruptedException ie) {interrupted=true;}
    }
  }
  mutex.lock();
  if (interrupted) Thread.currentThread().interrupt();
}

which is BASICALLY correct. If you get interrupted, you set a
flag, then go back and wait again. When you get signaled, you
call interrupt() on yourself and all is well.

UNLESS... You get interrupted, you wait to get the synchronization
lock AND just at that moment, someone calls condSignal() on the
CV. Guess what! You're not on the CV's sleep queue anymore and
you miss the signal! Bummer!

Of course not only is this unlikely to happen, it WON'T happen at
all on JDK 1.1.7. But it could if JDK 1.1.7 had been written
differently and other JVMs are.

ANYWAY, if you followed all that (and if you find this interesting)
here's the solution. The first interrupt will be treated as a
spurious wakeup, but it won't repeat. (Unless I've missed
something else!)

public void condWait(Mutex mutex) {
  boolean       interrupted = false;

  if (Thread.interrupted()) interrupted=true;

  synchronized (this) {
    mutex.unlock();
    try {
      wait();
    }
    catch (InterruptedException ie) {interrupted=true;}
  }
  mutex.lock();
  if (interrupted) Thread.currentThread().interrupt();
}


 Q295:  Is STL considered thread safe??  

This should probably be a FAQ entry.  Here's the answer I gave 2 months ago
to a similar question:

In general, the desired behavior is that it's up to you to make your
explicit operations on containers, iterators, etc thread safe.  This is good
because you might have several containers all synchronized using the same
locking construct, so if the implementation used individual locks underneath
it would be both wrong and expensive.  On the other hand, implicit
operations on the containers should be thread safe since you can't control
them.  Typically these involve memory allocation.  Some versions of the STL
follow these guidelines.

Look at the design notes at



There is no such thing as *the* STL library. It is an abstract interface
defined in the C++ standard. 

There is no mention of threads in the C++ standard, so the STL is not required
to be thread safe.

To find out whether your local implementation of the STL is thread safe,
consult your compiler documentation.

For an STL implementation to be useful in a multithreaded programming
environment, it simply has to ensure that accesses to distinct containers
do not interfere. The application can ensure that if two or more threads
want to access the same container, they use a lock.

I believe that the SGI and recent versions of the Plauger STL (used
by VC++) are safe in this way.
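
A minimal sketch of that division of labour, assuming a C++11 compiler
(std::mutex and std::thread postdate this discussion). The names shared_log,
log_lock and append_range are made up for illustration; the container itself
does no locking:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// One container shared by several threads, guarded by one application-level lock.
static std::vector<int> shared_log;
static std::mutex log_lock;

// Append the values [first, last) to the shared vector, one locked push at a time.
void append_range(int first, int last) {
    assert(first <= last);
    for (int v = first; v < last; ++v) {
        std::lock_guard<std::mutex> guard(log_lock);  // released at end of each iteration
        shared_log.push_back(v);
    }
}

// Run two writers against the same container and return the final element count.
std::size_t run_two_writers() {
    shared_log.clear();
    std::thread a(append_range, 0, 1000);
    std::thread b(append_range, 1000, 2000);
    a.join();
    b.join();
    return shared_log.size();
}
```

Two threads that each filled their own private vector would need no lock at all,
provided the implementation keeps distinct containers independent.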

Hi Cheng,

I'm going to post yet another answer: the term 'Thread-safe' is usually a very
difficult term to understand completely. There is absolutely no way to guarantee
that a given library/software package is 100% thread safe because it all depends
on how you use it.

An example of what I mean is shown below:

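class Point2D {
public:
    void setX(double value) {
        lock.lock();            // Mutex::lock()/unlock() names assumed
        _x = value;
        lock.unlock();
    }

    void setY(double value) {
        lock.lock();
        _y = value;
        lock.unlock();
    }

    double x() const {
        double tmp;
        lock.lock();
        tmp = _x;
        lock.unlock();
        return tmp;
    }

    double y() const {
        double tmp;
        lock.lock();
        tmp = _y;
        lock.unlock();
        return tmp;
    }

private:
    mutable Mutex lock;
    double _x, _y;
};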

While the above code can be considered 'thread-safe' to a certain extent, it is
possible for it to be used incorrectly. An example is if one thread wants to move
the point (we'll call it 'pt' here):

    pt.setX(newX);
    pt.setY(newY);

The Point2D code guarantees that if another thread happens to look at pt's values,
it will receive well-defined values, and that if another thread modifies the
values, it will be blocked appropriately and the two threads will not clobber
the pt object. BUT.... The above two lines do NOT guarantee that the update of
the point is atomic, which in the case of the above example is more important
than the Point2D being thread-safe. We can change Point2D to have a set(double x,
double y) and a get(double & x, double & y), but these are awkward and they make
the Point2D aware of threads when it should not be aware of them at all.
Therefore, in my opinion, the best design to overcome all the above problems is
to use a Point2D class that contains no locks and to use an externally associated
lock to guard the Point2D object. This way, Point2D is useful in all types of
applications -- including non-threaded applications -- AND we have the ability to
lock a set of operations on the object to make them appear atomic
(transaction-like, if you will).

That being said, here is an example of how I would use the Point2D object (we'll
use the same class declaration as above, minus the lock):

class Point2D {
public:
    void setX(double value) { _x = value; }

    void setY(double value) { _y = value; }

    double x() const { return _x; }

    double y() const { return _y; }

private:
    double _x, _y;
};

Now, for the usage:

    // This thread (Thread A) updates the object:
    lock.lock();                // Mutex::lock()/unlock() names assumed
    pt.setX(newX);
    pt.setY(newY);
    lock.unlock();

    // This thread (Thread B) reads the information:
    lock.lock();
    if (pt.x() > 10.0) { /* Do something rather uninteresting... */ }
    if (pt.y() < 10.0) { /* Do something else rather uninteresting... */ }
    lock.unlock();

Now, both the lock and the Point2D object are shared between two threads and the
above modification of the pt instance is seen as atomic -- there is no chance
for a thread to view that x has been updated but y has not.
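
For readers who want something compilable, here is one way the external-lock
design might look with C++11's std::mutex standing in for the Mutex class. This
is a sketch only: pt_lock, move_point, read_point and demo_move are hypothetical
names, not part of the original post.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Point2D as in the text: no threading knowledge inside the class.
class Point2D {
public:
    Point2D() : _x(0.0), _y(0.0) {}
    void setX(double value) { _x = value; }
    void setY(double value) { _y = value; }
    double x() const { return _x; }
    double y() const { return _y; }
private:
    double _x, _y;
};

// Shared state: the point and the external lock that guards it.
static Point2D pt;
static std::mutex pt_lock;

// Thread A's side: both coordinates change inside one critical section.
void move_point(double new_x, double new_y) {
    std::lock_guard<std::mutex> guard(pt_lock);
    pt.setX(new_x);
    pt.setY(new_y);
}

// Thread B's side: read both coordinates as one consistent snapshot.
std::pair<double, double> read_point() {
    std::lock_guard<std::mutex> guard(pt_lock);
    return std::make_pair(pt.x(), pt.y());
}

// Small usage check: after a move, a snapshot shows both new values together.
void demo_move() {
    move_point(12.0, 3.0);
    std::pair<double, double> p = read_point();
    assert(p.first == 12.0 && p.second == 3.0);
}
```

The same Point2D can be used unchanged in a single-threaded program by simply
never creating the lock.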

*PHEW*. All that being said, it may be clear now that when writing an
implementation of the STL, it is a good idea to consider threading as little as
possible. Usually, the only considerations that should be made are to ensure that
all functions are reentrant and that threads working on different instances of
containers will not clobber each other since they are not sharing any data --
this is usually achieved by making sure there are no static members for STL
containers. Some poor implementations of the commonly used rb_tree implementation
use static members and a 'large' mutex that causes no end of problems (anywhere
from link errors to much unnecessary overhead). A good implementation of the STL
should use 0 locks. Remember that the STL is a set of containers and algorithms.
It was correctly left up to the user of the STL to implement locking so they can
do it in the way they see fit for the problem they are solving. BTW, SGI has an
excellent implementation of the STL and they explain their design decisions on
their STL page (you can find it on their site).

Hope this provides some insight,

 Q296:  To mutex or not to mutex an int global variable ??  


Nice *idea*, but no marbles. :-(

Lots of variations of this idea are proposed regularly, starting with
Dekker's algorithm and going on and on. The problem is that you
are assuming that writes from the CPU are arriving in main memory
in order. They may not.

On all high performance CPUs today (SPARC, Alpha, PA-RISC, MIPS,
x86, etc.) out-of-order writes are allowed. Hence in your example below,
it is POSSIBLE that only one of the values will be updated before the
pointer is swapped.

Bummer, eh?


"Use the mutex, Luke!"

> To avoid locks you might try the following trick:
> Replace the two ints by a struct that contains total and fraction. The global
> variable would then be a pointer to the struct. To modify the variables, the writer
> would use a temporary struct, update the value(s) and then swap the pointers of the
> global pointer and the temporary global pointer. This works assuming that a pointer
> write is an atomic operation (which is the case in all architectures I know).
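
For comparison, the mutex version is short. A minimal sketch, assuming C++11's
std::mutex and a made-up Counters struct holding the two ints; the lock gives
both mutual exclusion and the memory ordering that the pointer-swap trick lacks:

```cpp
#include <cassert>
#include <mutex>

// Hypothetical pair of values that must always be seen together.
struct Counters {
    int total;
    int fraction;
};

static Counters counters = {0, 0};
static std::mutex counters_lock;

// Writer: update both fields inside one critical section.
void add_sample(int hit) {
    assert(hit == 0 || hit == 1);
    std::lock_guard<std::mutex> guard(counters_lock);
    counters.total += 1;
    counters.fraction += hit;
}

// Reader: copy both fields under the same lock, so the copy is consistent.
Counters snapshot() {
    std::lock_guard<std::mutex> guard(counters_lock);
    return counters;
}
```

A reader that calls snapshot() never sees a total from one update paired with a
fraction from another.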

    I don't know if you are still the maintainer of
the comp.programming.threads FAQ but I was reading Q63
trying to find a good way of using threads in C++ and
the suggestions were really good, but seemed to be from
a windows perspective.  It took me a little while to
translate what was there to something that I could use
and it is pretty much the same thing but here it is
just in case you wanted to include it in the FAQ:

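class PThread {
public:
  PThread() { pthread_create(&thr, NULL, thread, this); }
  virtual ~PThread() {}
  void join() { pthread_join(thr, NULL); }
protected:
  virtual void entrypoint() = 0;
private:
  // Signature matches what pthread_create expects, so no cast is needed.
  static void *thread(void *arg) { static_cast<PThread *>(arg)->entrypoint(); return NULL; }
  pthread_t thr;
};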

David F. Newman
If you think C++ is not overly complicated, just what is a protected
abstract virtual base pure virtual private destructor, and when
was the last time you needed one?
                -- Tom Cargil, C++ Journal.
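
One caveat with starting the thread from the base-class constructor: the new
thread may call entrypoint() before the derived object has finished
constructing. A possible variant with an explicit start() is sketched below.
The start() method, the trampoline name and the Counter subclass are
assumptions made for this example, not part of the original post.

```cpp
#include <cassert>
#include <pthread.h>

// Wrapper with an explicit start(), so the thread never runs before the
// derived object exists.
class PThread {
public:
    PThread() : started(false) {}
    virtual ~PThread() {}

    // Returns 0 on success, or the pthread_create() error number.
    int start() {
        assert(!started);
        int rc = pthread_create(&thr, NULL, trampoline, this);
        started = (rc == 0);
        return rc;
    }

    // Returns 0 on success, or the pthread_join() error number.
    int join() {
        assert(started);
        return pthread_join(thr, NULL);
    }

protected:
    virtual void entrypoint() = 0;

private:
    // Matches the void *(*)(void *) signature pthread_create() expects.
    static void *trampoline(void *arg) {
        static_cast<PThread *>(arg)->entrypoint();
        return NULL;
    }

    pthread_t thr;
    bool started;
};

// Hypothetical subclass used only to exercise the wrapper.
class Counter : public PThread {
public:
    Counter() : value(0) {}
    int value;
protected:
    virtual void entrypoint() { value = 42; }
};
```

Usage is then: construct the derived object, call start(), and later join()
before reading any results the thread produced.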

 Q297: Stack overflow problem ?  

BL> Yes, as far as I know EVERY OS has a guard page. [...]
BL> Be aware that if you have a stack frame which is larger than a 
BL> page (typically 8k), it is POSSIBLE to jump right over the guard
BL> page and not see a SEGV right away.

KK> The solution to that is to initialize your locals before calling
KK> lower-level functions. [...]

This is an inadequate solution for the simple reason that one isn't
guaranteed that variables with automatic storage duration are in any
particular order on the stack.  The initialisations themselves could
cause out of order references to areas beyond the guard page, depending
on how the compiler chooses to order the storage for the variables.

The correct solution is to use a compiler that generates code to perform
stack probes in function prologues.

For example: All of the commercial C/C++ compilers for 32-bit OS/2
except for Borland's (i.e. Watcom's, IBM's, and MetaWare's) generate
code that does stack probes, since 32-bit OS/2 uses guard pages and a
commit-on-demand stack.  In the function prologue, the generated code
will probe the stack frame area at intervals of 4KiB starting from the
top, in order to ensure that the guard pages are faulted in in the
correct order.  (Such code is only generated for functions where the
automatic storage requirements exceed 4KiB, of course.)

Microsoft Visual C/C++ for Win32 generates stack probe code too.  This
is controlled with the /Ge and /Gs options.  

As far as GCC is concerned, I know that EMX C/C++ for OS/2, the original
GCC port for OS/2, had the -mprobe option, although I gather that in
later ports such as PGCC this has been superseded by the
-mstack-arg-probe option.  Whether there is the same or an equivalent
option in the GCC ports to other platforms I don't know.

 Q298: How would you allow the other threads to continue using a "forgotten" lock?  

Bhavin Shah wrote:
> Hi,
> Sorry if this is a faq, but I haven't found a solution yet:
> In a multi-threaded app, say you have a thread acquire a lock,
> update global variables, then release the lock.  What if for
> some reason (granted that updating a couple variables shouldn't
> take much time), the thread crashes before releasing the lock?
> I don't see a way to set a timeout on acquiring locks.  How would
> you allow the other threads to continue using that same lock?

    You wouldn't.  What's more, you shouldn't!  The dead thread
holding the lock may have left the lock-protected data structures
in an inconsistent or incorrect state, in effect planting a
poison pill for any other thread which might attempt to use
them later on.  "Overriding" a lock is the same as "ignoring" a
lock -- you know that the latter is dangerous, so you should
also understand that the former is equally dangerous.

    There's another peculiar notion in your question, too: that
of "a thread crashing."  The fact that Thread A takes a bus
error or something does *not* mean that Thread A was "at fault,"
and does *not* mean that Threads B-Z are "healthy."  If any
thread at all gets into trouble, you should start with the
supposition that the entire program is in trouble; you shouldn't
think in terms of getting rid of the "offending" thread and
trying to carry on as usual.  Poor but possibly helpful analogy:
if a single-threaded program crashes in Function F(), would it
be appropriate to replace F() with a no-op and keep on going?

 Q299: How unfair are mutexes allowed to be?  

> Hi
>   If several threads are trying to gain access through a mutex,
> how unfair are mutexes allowed to be?  Is there any requirement
> for fairness at all (ie that no threads will be left unluckily starving
> while others get access).
> The code needs to work on many posix platforms, are any of them
> unfair enough to merit building an explicit queueing construct
> like ACE's Token?
> thanks,
> Jeff

Assume the worst and you'll be safe (and probably correct). If your
program looks like this:

while (true) {                  // T1
  lock(m)
  do stuff for 10ms()
  unlock(m)
}
while (true) {                  // T2
  lock(m)
  i++
  unlock(m)
  if (i == N) do other stuff()
}

You can be fairly certain T2 ain't never gonna get that lock more
than once every 1,000+ iterations. But this is pretty fake code and
T1 shouldn't look like that. T1 should also "do stuff for Xms" outside
of the critical section, in which case you're safe.

Think about it: The only time there's a problem is when a thread
keeps the mutex for all but a tiny fraction of the time (like T1). And
that would be odd.
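
A sketch of the better-behaved T1, assuming C++11's std::mutex; do_stuff and
shared_total are stand-ins invented for this example. The slow part runs with
the mutex released, and only the short update is inside the critical section:

```cpp
#include <cassert>
#include <mutex>

static std::mutex m;
static long shared_total = 0;

// Stand-in for the slow part of T1's loop (hypothetical work).
long do_stuff(long n) {
    long sum = 0;
    for (long i = 0; i < n; ++i) sum += i % 7;
    return sum;
}

// One iteration of T1: compute outside the lock, publish under it.
void t1_iteration(long n) {
    assert(n >= 0);
    long result = do_stuff(n);             // slow part: mutex not held
    std::lock_guard<std::mutex> guard(m);  // short critical section
    shared_total += result;
}
```

With this shape the mutex is free for nearly all of each iteration, so even an
unfair mutex gives T2 plenty of chances.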



 Q300: Additionally, what is the difference between -lpthread and -pthread? ?  

On 01 Aug 2000 15:01:23 -0500, Aseem Asthana  wrote:
>>That is incorrect. With modern gcc, all you need is the -pthread option.
>>If your gcc doesn't recognize that, then you need:
>>    gcc -o hello -D_REENTRANT hello.c -lpthread
>>Without the _REENTRANT, certain library features may work incorrectly
>>under threads.
>the -D_REENTRANT option, is it to link your programs with the thread safe
>standard libraries, or something else?

No, the -D_REENTRANT option makes the preprocessor symbol _REENTRANT known
to your system headers. Some code in the system headers may behave differently
when _REENTRANT is defined. For example, quite typically, the errno macro
has a different definition in threaded programs in order to give each thread
its own private errno. Without _REENTRANT, accesses to errno might go to one
global one, resulting in race conditions.

>Additionally, what is the difference between -lpthread and -pthread?

-lpthread is the directive to link in the library libpthread.a or a shared
version thereof.

-pthread is a special command line option supported by modern gcc, and
some other compilers as well, which sets up all the correct options for
multithreaded building, like -D_REENTRANT and -lpthread.
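
The per-thread errno mentioned above can be observed directly. This is a small
sketch, not a portability test: it assumes a threaded build (for example one
compiled with -pthread), and the function names are made up. It compares the
address of errno as seen by two different threads:

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>

// Thread body: report where this thread's errno lives.
static void *report_errno_address(void *out) {
    assert(out != NULL);
    *static_cast<int **>(out) = &errno;
    return NULL;
}

// Returns true if another thread's errno is a different object from ours.
bool errno_is_per_thread() {
    int *other = NULL;
    pthread_t t;
    if (pthread_create(&t, NULL, report_errno_address, &other) != 0) return false;
    if (pthread_join(t, NULL) != 0) return false;
    return other != NULL && other != &errno;
}
```

On many current systems this returns true whether or not _REENTRANT is defined,
but the headers of older systems may well need the symbol.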

Subject: Re: problems with usleep()
Kauser Ali Karim wrote:
> Hi,
> I'm using the usleep() function to delay threads and I get a message:
> Alarm clock
> after which my program to exits prematurely since the p_thread_join that I
> call at the end is not executed.
usleep requests that the process be woken up after a time.  This wakeup is
SIGALRM, which is causing your program to exit.

Use nanosleep; it is preferred for threaded apps.
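
A minimal sketch of a millisecond sleep built on nanosleep(). The helper name
sleep_ms is made up; the retry on EINTR covers the case where a signal handler
runs during the sleep:

```cpp
#include <cassert>
#include <cerrno>
#include <time.h>

// Sleep for roughly `ms` milliseconds without involving SIGALRM.
// Returns 0 on success, -1 on an error other than EINTR.
int sleep_ms(long ms) {
    assert(ms >= 0);
    struct timespec req;
    req.tv_sec = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;
    struct timespec rem;
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR) return -1;
        req = rem;  // interrupted by a signal handler: sleep for the remainder
    }
    return 0;
}
```

A thread that wants a 250ms delay would simply call sleep_ms(250).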


 Q301: Handling C++ exceptions in a multithreaded environment?  

On Fri, 4 Aug 2000 09:52:50 -0400, Bruce T  wrote:
>I am writing code in C++ in a multithreaded system.  Can anyone point me to
>links or articles on any special strategies/concerns or examples of handling
>C++ exceptions in a multithreaded environment.

Assuming that your compiler supports thread-safe exception handling, you can
simply use the language feature as you normally would.

There aren't really any special concerns. (You wouldn't worry whether function
calls, returns or gotos are a problem under threading, right? So why fuss
over exceptions).

Avoid common misconceptions, like wanting to throw an exception from one thread
to another. This simply isn't a valid concept, since an exception is a
branching mechanism. Branching occurs within the thread of control, by
definition: it's a change in one thread's instruction pointer, so to speak.

(That's not strictly true: function calls *can* take place between threads,
processes or machines through remote procedure calling. An analogous mechanism
can be devised to pass exceptions; e.g. you catch the exception on the 
RPC server side, package it up, send a reply message to the client, which
unpacks it and rethrows.)
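
Within one process, the "catch, package, rethrow" idea can be sketched with
C++11's std::exception_ptr (a facility that postdates this discussion). The
names worker and run_and_rethrow are made up for the example:

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>

// Worker: catch whatever is thrown and package it for the other thread.
static void worker(std::exception_ptr *out) {
    assert(out != nullptr);
    try {
        throw std::runtime_error("worker failed");  // stand-in for real work
    } catch (...) {
        *out = std::current_exception();
    }
}

// Run the worker, then rethrow its exception (if any) in the calling thread.
// Returns the exception's message, or an empty string if nothing was thrown.
std::string run_and_rethrow() {
    std::exception_ptr packaged;
    std::thread t(worker, &packaged);
    t.join();  // join() orders the worker's write before our read
    try {
        if (packaged) std::rethrow_exception(packaged);
    } catch (const std::exception &e) {
        return e.what();
    }
    return std::string();
}
```

Note that nothing is thrown "across" threads here: the exception is caught in
one thread, handed over as data, and thrown afresh in the other.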

 Q302:  Pthreads on IRIX 6.4 question?   
> Hello,
>    I am having problems with Pthreads on IRIX 6.4. I have two
> threads: the initial thread plus one that has been pthread_created.
> The pthread_created pthread does an ioctl and sits in a driver waiting
> for an
> event. While this is happening the "initial" thread should be eligible
> to run, but it is put to sleep, i.e. doesn't run. Why? On IRIX, what
>  kind of LWP notion
> is there?

Default scheduling scope is process on IRIX 6.4, and the number of
execution vehicles is determined by the pthread library -- typically,
you'll start with one execution vehicle unless the library detects all
your threads can run in parallel and consume CPU resources. 

But the latest pthread patches for IRIX 6.4 would, in my experience,
create an extra execution vehicle on the fly in the case you describe,
so I'd certainly recommend you to get the *LATEST* set of POSIX
recommended patches.

You can, in 6.4, use pthread_setconcurrency to give hints as to how many
kernel execution vehicles you want. You can also run with system scope
threads using pthread_attr_setscope (giving you one kernel execution
vehicle per thread), but on IRIX this requires CAP_SCHED_MGT
capabilities, as process scope threads in IRIX can schedule themselves
at higher priorities than some kernel threads (see man capabilities).

In 6.5.8, you have PTHREAD_SCOPE_BOUND_NP (incorrectly referred to as
PTHREAD_SCOPE_BOUND in the headers) scope, which gives you what Solaris
and Linux system scope threads are -- one execution vehicle per thread,
but no extra scheduling capabilities (hence no need to do fancy stuff
with capabilities to run this as a non-root user); blocking in one thread
is guaranteed not to interfere with immediate availability of kernel
execution vehicles for other threads.

Frank Gerlach wrote:
> I also had the problem of pseudo-parallelity on Solaris 2.6. Only after
> calling  pthread_attr_setscope() the threads would *really* execute in
> parallel. Maybe that helps with your problem..
>              pthread_attr_init(&attr);
>              pthread_attr_setscope(&attr,PTHREAD_SCOPE_SYSTEM);
>              pthread_attr_setschedpolicy(&attr,SCHED_OTHER);
>              int retv=pthread_create(&tids[i],&attr,threadfunc,&ta[i]);

As I also said, on IRIX, the closest scope to this one is
PTHREAD_SCOPE_BOUND (actually, ...BOUND_NP, but the header is wrong in
6.5.8). PTHREAD_SCOPE_SYSTEM threads can do much more wrt scheduling in
IRIX, and as a result require CAP_SCHED_MGT capabilities.
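
Whichever scope is requested, it is worth checking the return codes, since the
request may be refused. A minimal sketch using only the standard POSIX scopes;
run_with_scope and threadfunc are made-up names, and the IRIX-specific bound
scope is deliberately left out:

```cpp
#include <cassert>
#include <pthread.h>

static void *threadfunc(void *arg) {
    return arg;
}

// Create one thread with the requested contention scope and join it.
// Returns 0 on success or the first pthread error number encountered.
int run_with_scope(int scope) {
    assert(scope == PTHREAD_SCOPE_SYSTEM || scope == PTHREAD_SCOPE_PROCESS);
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_attr_setscope(&attr, scope);  // may fail, e.g. with ENOTSUP
    if (rc == 0) {
        pthread_t tid;
        // The attribute object must actually be passed to pthread_create().
        rc = pthread_create(&tid, &attr, threadfunc, NULL);
        if (rc == 0) rc = pthread_join(tid, NULL);
    }
    pthread_attr_destroy(&attr);
    return rc;
}
```

A caller could try PTHREAD_SCOPE_SYSTEM first and fall back to the default
attributes if the return value is non-zero.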

*Latest* pthread patch sets, though, usually don't have too many
problems with making extra kernel execution vehicles to avoid deadlocks
(for normal process scope threads) -- in the words of the man page for

    Conversely the library will not permit changes to the concurrency level
to create starvation.  Should the application set the concurrency level to n
and then cause n threads to block in the kernel the library will activate
additional execution vehicles as needed to enable other threads to run.  In
this case the concurrency level is temporarily raised and will eventually
return to the requested level.

Earlier flavours of the pthread library may have more trouble
guessing whether extra execution vehicles are needed.

 Q303: Threading library design question ?  

Some people say semaphores are the concurrent-processing equivalent of
goto. Still, semaphores are very useful and sometimes even indispensable.
(IMO goto is sometimes also a good construct, e.g. in state machines.)

A useful construct might be ReaderBlock and WriterBlock classes, which take
a ReadWriteMutex as a constructor argument and can be used similarly to the
synchronized construct of Java. Those classes lock the mutex in the
constructor and unlock it in their destructor, avoiding explicit unlocking
AND making exception handling easy. The latter is especially important, as I
cannot think of another elegant way to unlock a mutex when an exception is
thrown that will be handled in a calling method.
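
A minimal sketch of such classes over a POSIX read-write lock. The class names
follow the text, but the member functions (lockRead, lockWrite, tryLockWrite,
unlock) and the update_or_throw helper are assumptions made for this example:

```cpp
#include <cassert>
#include <pthread.h>
#include <stdexcept>

// Thin wrapper over pthread_rwlock_t.
class ReadWriteMutex {
public:
    ReadWriteMutex() { int rc = pthread_rwlock_init(&rw, NULL); assert(rc == 0); (void)rc; }
    ~ReadWriteMutex() { pthread_rwlock_destroy(&rw); }
    void lockRead()  { int rc = pthread_rwlock_rdlock(&rw); assert(rc == 0); (void)rc; }
    void lockWrite() { int rc = pthread_rwlock_wrlock(&rw); assert(rc == 0); (void)rc; }
    bool tryLockWrite() { return pthread_rwlock_trywrlock(&rw) == 0; }
    void unlock()    { pthread_rwlock_unlock(&rw); }
private:
    pthread_rwlock_t rw;
    ReadWriteMutex(const ReadWriteMutex &);             // not copyable
    ReadWriteMutex &operator=(const ReadWriteMutex &);
};

// Holds a read lock for the lifetime of the object.
class ReaderBlock {
public:
    explicit ReaderBlock(ReadWriteMutex &m) : mutex(m) { mutex.lockRead(); }
    ~ReaderBlock() { mutex.unlock(); }
private:
    ReadWriteMutex &mutex;
    ReaderBlock(const ReaderBlock &);
    ReaderBlock &operator=(const ReaderBlock &);
};

// Holds a write lock for the lifetime of the object.
class WriterBlock {
public:
    explicit WriterBlock(ReadWriteMutex &m) : mutex(m) { mutex.lockWrite(); }
    ~WriterBlock() { mutex.unlock(); }
private:
    ReadWriteMutex &mutex;
    WriterBlock(const WriterBlock &);
    WriterBlock &operator=(const WriterBlock &);
};

// Throws while holding the write lock; the destructor still unlocks.
void update_or_throw(ReadWriteMutex &m, bool fail) {
    WriterBlock guard(m);
    if (fail) throw std::runtime_error("update failed");
}
```

After update_or_throw() exits by exception, the mutex is already free again, so
the calling method can handle the error without knowing a lock was involved.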

In general, one could provide a list of synchronization constructs in
ascending order of complexity/danger for novice users. The
Reader/Writerblock is quite harmless, even if you do not think about its
Still, you can easily deadlock your program by using two mutexes and
acquiring them in opposite order.

My feeling is that concurrent programming contains inherent complexity,
which cannot be eliminated.

As a final input, automatic deadlock detection in debug mode would be a
simple, but great feature for both C++ libs and Java VMs (unfortunately SUN
does not provide this in their VMs).

Beman Dawes wrote:

> There is discussion on the boost mailing list of design
> issues for a possible C++ threading library suitable for eventual
> inclusion in the C++ standard library.
> Some suggest starting the design with very low-level primitives, and
> then using these to build higher-level features.  But like a goto
> statement in programming languages, some low-level features can be
> error-prone and so should not always be exposed to users even if present
> in the underlying implementation.
> So here is a question where comp.programming.threads readers probably
> have valuable insights:
> What features should be excluded from a threading library because they
> are known to be error-prone or otherwise dangerous?  What are the
> threading equivalents of goto statements?
> --Beman Dawes 

 Q304:  Lock Free Queues?   

On Thu, 10 Aug 2000 03:55:25 GMT, J Wendel  wrote:
>I wonder if any of you smart quys would care to enlighten me
>about "lock free" algorithms. I've found several papers on
>the Web, but to be honest, I'm having a little trouble
>following the logic.

Lock free algorithms do actually rely on atomic instructions provided by the
hardware. So they are not exactly lock free.

For example, a lock-free queue can be implemented using an atomic compare-swap
instruction to do the pointer swizzling.

The idea is that the hardware provides you with a miniature critical region in
the form of a special instruction which allows you to examine a memory
location, compare it to a value that you supply, and then store a new value if
the comparison matches.  The instruction produces a result which tells you
whether or not the store took place.  The instruction cannot be interrupted,
and special hardware takes care that the memory can't be accessed by other
processors.

Here is an illustration. Suppose you want to push a new node onto the lock-free
list. How do you do that? Well, you set your new node's next pointer to point
to the current head node. Then you use the compare-swap to switch the head node
to point to your new node! If it succeeds, you are done. If it fails, it means
that someone else succeeded in pushing or popping before you were able to
execute the instruction. So you must simply loop around and try again.  The
subject of the comparison is simply to test whether the head node still has the
original value.

Pseudo code:

    node *head_copy;
    do {
        head_copy = head;
        newnode->next = head_copy;
    } while (!compare_and_swap(&head, head_copy, newnode));

The compare_and_swap simply behaves like this, except that it's
implicitly atomic:

    int compare_and_swap(node **location, node *compare, node *newval)
    {
        /* lock whole system */

        if (*location == compare) {
            *location = newval;

            /* unlock whole system */
            return 1;
        }

        /* unlock whole system */
        return 0;
    }
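
With a C++11 compiler the same push can be written against std::atomic, which
did not exist when this was posted. A minimal sketch, where the node layout and
the length() helper are made up for illustration:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

struct node {
    int value;
    node *next;
};

static std::atomic<node *> head(nullptr);

// Push newnode onto the front of the list without taking a lock.
void push(node *newnode) {
    assert(newnode != nullptr);
    node *head_copy = head.load();
    do {
        newnode->next = head_copy;
        // On failure, compare_exchange_weak stores the current head into
        // head_copy, so the loop retries with fresh data.
    } while (!head.compare_exchange_weak(head_copy, newnode));
}

// Count the nodes reachable from head (only safe while nobody is popping).
std::size_t length() {
    std::size_t n = 0;
    for (node *p = head.load(); p != nullptr; p = p->next) ++n;
    return n;
}
```

Pushing is the easy half. A correct lock-free pop has to deal with a node being
removed and reused between the read and the swap (the "ABA" problem), which is
part of why these algorithms are hard to follow.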

>Can someone explain why "lock free" algorithms don't seem to
>be in widespread use? I've got a work queue based server
>that would benefit from less locking overhead.

They are probably in more widespread use than you might suspect.  However,
there is no portable, standard interface for constructing these things. They
rely on support from the hardware which is not found on all architectures!

These kinds of techniques are more in the domain of the developers of operating
systems and system interface libraries who can use them to construct the
higher level synchronization primitives.

You might find these algorithms used in the implementation of mutexes and
other kinds of objects.

 Q305: Threading library design question ?  
[ OK, so I'm reading a little behind... ]

Beman Dawes wrote:
>What features should be excluded from a threading library because they
>are known to be error-prone or otherwise dangerous?  What are the
>threading equivalents of goto statements?

I think the most commonly asked-for feature that shouldn't be in a thread
library is suspend/resume.  It's amazing how many people believe they
want to arbitrarily suspend another thread at some random place in its
execution.

Yes, there are things you can do with suspend/resume that are Very
Difficult without, but it's one of those places where the bugs that
can be introduced are very subtle.

For a quick example, suppose I suspend a thread that's inside a library
(say, stdio) and has a mutex locked (say, stdout or stderr).  Now nobody
can say anything without blocking on the mutex, and the mutex won't
come back until the thread is resumed.

Permute the above with a few dozen thread-aware libraries, and you get
into *serious* trouble.

The other thing that I hear a bunch (this may be unique to the embedded
real-time market that I play in) is to disable context switching from
user space.  It implodes instantly in the face of multi-processor
systems, and effectively elevates the calling thread to ultimate
priority.

But those are just my favorites.
Steve Watt KD6GGD  PP-ASEL-IA          ICBM: 121W 56' 57.8" / 37N 20' 14.9"
 Internet: steve @ Watt.COM                         Whois: SW32
   Free time?  There's no such thing.  It just comes in varying prices... 

 Q306:  Stack size/overflow using threads ?  

Jason Jesso wrote:

% I just began writing a threaded program using pthreads on AIX 4.3 in C.
% In a particular thread I create two 60K arrays as local variables.
% My program crashes in irregular places within this thread
% and I do believe in the "Principal of Proximity".
% My hunch is stack corruption, since when I place "any" one of these two
% arrays as global my program runs fine.
% Could it be possible that I am overflowing the stack space for this
% thread?

Yes. Threads typically have fixed-size stacks. If thread A has a stack
that starts at 0x400000 and thread B has a stack that starts at 0x500000,
then A's stack can't be any bigger than 0x100000, or else it would over-
write B's. POSIX doesn't specify a default stack size, and it varies
from system to system. You can set the stack size by calling
pthread_attr_setstacksize. You can find out the default stack size by
calling pthread_attr_getstacksize on a freshly initialised attr structure.

From memory, AIX gives about 90k of stack by default, so you probably need
to knock it up a bit. Other systems have different limits. Solaris gives
1M, HP-UX 64k, TRU64 (sic) Unix gives ~20k, Linux gives 1M, and FreeBSD
gives 64k (again, this is working from memory, so don't rely on it).


Patrick TJ McPhee
East York  Canada 

Patrick TJ McPhee wrote:

> Jason Jesso wrote:
> % I just began writing a threaded program using pthreads on AIX 4.3 in C.
> %
> % In a particular thread I create two 60K arrays as local variables.
> % My program crashes in irregular places within this thread
> % and I do believe in the "Principal of Proximity".
> %
> % My hunch is stack corruption, since when I place "any" one of these two
> % arrays as global my program runs fine.
> %
> % Could it be possible that I am overflowing the stack space for this
> % thread?
> Yes. Threads typically have fixed-size stacks. If thread A has a stack
> that starts at 0x400000 and thread B has a stack that starts at 0x500000,
> then A's stack can't be any bigger than 0x100000, or else it would over-
> write B's. POSIX doesn't specify a default stack size, and it varies
> from system to system. You can set the stack size by calling
> pthread_attr_setstacksize. You can find out the default stack size by
> calling pthread_attr_getstacksize on a freshly initialised attr structure.

The pthread_attr_getstacksize() is a nice trick, but it's not portable.
Unfortunately, while you're correct that POSIX doesn't specify a default stack
size, you're underestimating the true extent of the lack of specification. Far
beyond not specifying a default size, it doesn't even specify what the default
size MEANS. That is, POSIX never says that the default value of the stacksize
attribute is the number of bytes of stack that will be allocated to a thread
created using the default attributes. It says that there IS a default, and that,
if you ask, you'll get back a size_t integer. Because you're not allowed to set
any value smaller than PTHREAD_STACK_MIN, one can play tricks. Solaris, for
example, has a default value for the stacksize attribute of "0". But that doesn't
mean 0 bytes, it means "default". Nobody can actually ask for 0 bytes, so there's
no ambiguity when pthread_create() is called. One can call this "creative" or
"devious", but it's perfectly legal.

> From memory, AIX gives about 90k of stack by default, so you probably need
> to knock it up a bit. Other systems have different limits. Solaris gives
> 1M, HP-UX 64k, TRU64 (sic) Unix gives ~20k, Linux gives 1M, and FreeBSD
> gives 64k (again, this is working from memory, so don't rely on it).

Tru64 UNIX V5.0 and later gives 5Mb by default. Earlier versions, stuck without
kernel support for uncommitted memory, were forced to compromise with far smaller
defaults to avoid throwing away bushels of swap space. [It was actually more like
24Kb, I think, (which was actually just fine for the vast majority of threads),
but that's hardly relevant.]

/------------------[ ]------------------\
| Compaq Computer Corporation |
| 110 Spit Brook Rd ZKO2-3/Q18, Nashua NH 03062-2698              |
\--------[ ]-------/ wrote:

> Try adjusting the stack size for the thread:
>         static pthread_attr_t thread_stack_size;
>         pthread_attr_init (&thread_stack_size);
>         pthread_attr_setstacksize (&thread_stack_size, (size_t)81920);

I recommend NEVER using an absolute size for the stack. It's not portable, it's
not upwards compatible. It's just a number that really means practically nothing
-- and even less except on the exact software configuration you used to measure.
(And even then only as good as the accuracy and thoroughness of your
measurements... and measuring runtime stack depth is not easy unless your program
has only one straightline code path.)

Of course, you may not have much choice...

>         pthread_create (&thread, &thread_stack_size, thread_func, 0);
> You can check the default size with:
>         static pthread_attr_t thread_stack_size;
>         pthread_attr_init (&thread_stack_size);
>         pthread_attr_getstacksize (&thread_stack_size, &ssize);
> I ran into this problem on Digital when I had a thread call a deeply nested
> function.  All auto-variables will be allocated from the thread's stack, and
> so it is a good idea to know how much memory your thread function will consume
> beforehand.  Hope this helps.

That works fine on Tru64 UNIX (or the older Digital UNIX and DEC OSF/1 releases),
but it's not portable, or "strictly conforming" POSIX. There's no definition of
what the default value of stacksize means. (An annoying loophole, but some
implementations have exploited it fully.)

In fact, on Tru64 UNIX, or on any implementation where you can get the default
stack size from pthread_attr_getstacksize, I recommend that you make any
adjustments (if you really need to make adjustments) based on that value. Not
quite big enough? Double it. Triple it. Square it. Whatever. If the thread
library suddenly starts using an extra page at the base of each stack, the
default stack size will probably be increased to keep pace -- your arbitrary
hardcoded constant won't change, and you'll be in trouble.
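
One way to follow that advice is sketched below. The helper name
set_scaled_stacksize is made up, and the fallback to PTHREAD_STACK_MIN is an
assumption added for implementations whose reported default is not a usable
byte count (such as the "0 means default" case described earlier):

```cpp
#include <cassert>
#include <cstddef>
#include <limits.h>
#include <pthread.h>

// Set the stack size to `factor` times the implementation's reported default.
// If the reported default is below PTHREAD_STACK_MIN (for example 0), scale
// PTHREAD_STACK_MIN instead. Returns 0 on success or a pthread error number.
int set_scaled_stacksize(pthread_attr_t *attr, std::size_t factor) {
    assert(attr != NULL && factor >= 1);
    std::size_t base = 0;
    int rc = pthread_attr_getstacksize(attr, &base);
    if (rc != 0) return rc;
    const std::size_t minimum = static_cast<std::size_t>(PTHREAD_STACK_MIN);
    if (base < minimum) base = minimum;
    return pthread_attr_setstacksize(attr, base * factor);
}
```

Call it on a freshly initialised attribute object, before any other stack
settings, and pass that object to pthread_create(). Where the default does not
mean a byte count, the fallback is only a guess and the result still has to be
checked against what the thread really needs.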

Furthermore, on Tru64 UNIX 5.0 and later, you'll be doing yourself a disservice
by setting a stack size. The default is 5Mb... and if THAT's not enough for you,
you need to have your algorithms examined. Solaris (and I believe Linux) use 1Mb,
which ought to be sufficient for most needs.

In general, be really, really careful about adjusting stack size. If you need to
increase any thread from the default, you should consider making it "as big as
you can stand". Recompilation of ANY code (yours or something in a system library
you use) could expand the size of your call stack any time you change the
configuration. (Installing a patch, for example.) Runtime timing variations could
also affect the depth of your call stack. Cutting it too close is a recipe for
disaster... now or, more likely, (because it's more "fun" for the computer that
way), sometime later.

/------------------[ ]------------------\
| Compaq Computer Corporation |
| 110 Spit Brook Rd ZKO2-3/Q18, Nashua NH 03062-2698              |
\--------[ ]-------/

 Q307:  correct pthread termination?   

sr wrote:

> I'm writing a multithreaded program under AIX 4.3.2, and noticed that
> all threads, whether or not created in detached state, once terminated
> correctly via a pthread_exit() call (after having freed any own
> resources), are still displayed by the ps -efml command until the main
> process terminates.

There's no requirement in POSIX or UNIX 98 that resources be freed at
any particular time. There's no way to force resources to be freed.

On the contrary, POSIX only places a requirement on you, the programmer,
to release your references to the resources (by detaching the thread) so
that the implementation is ABLE to free the resources (at some
unspecified and unbounded future time).

> Is this normal, or there are still resources allocated I'm not aware
> of?

Many implementations cache terminated threads so that they can create new
threads more quickly. A more useful test than the program you show would
be, after the "first round" of threads have terminated, to create a new
round of threads. Does AIX create yet more kernel threads, or does it
reuse the previously terminated threads?

> In the latter case, how do I make sure that a thread, when
> (gracefully) terminated, gets completely freed?

Why would you care? The answer is, there's no way to do this. More
importantly, it should make absolutely no difference to your application
(and very little difference to the system) unless AIX is failing to
reuse those terminated threads. (I would also expect that unused cached
threads would eventually time out, but there's no rule that they must.)

In general, my advice would be "don't worry about things you don't need
to worry about". If you're really sure you do need to worry, please
explain why. What you have described is just "a behavior"; not "a
problem". If you're sure that behavior represents a problem for you,
you'll need to explain why it's a problem. (And while we're all curious
out here, you might keep in mind that it'll do you more good to explain
the problem to IBM support channels.)

Oh, and just a few comments about your program:

While it's probably "OK" for your limited purposes, I can't look at a
threaded program where main() does a sleep() [or any kind of a timed
wait] and then calls exit() without cringing. If you don't need main()
to hang around for some real purpose, then it should terminate with
pthread_exit(). If you do need it to hang around for some reason, a
timed wait is nearly always the wrong way to make it hang around.
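
As a rough illustration of the alternative to sleep()-then-exit(), the
following sketch has the starting thread wait for its workers with
pthread_join(). The names run_workers(), worker() and NWORKERS are made up
for this example; the "work" is just a stand-in.

```c
#include <pthread.h>

#define NWORKERS 4

/* Stand-in for real work: each worker marks its own slot as done. */
static void *worker(void *arg)
{
    int *slot = arg;
    *slot = 1;
    return NULL;
}

/* Start NWORKERS threads and wait for each of them with pthread_join()
 * instead of sleeping for a guessed interval.
 * Returns the number of workers that completed, or -1 if not all started. */
int run_workers(void)
{
    pthread_t tid[NWORKERS];
    int done[NWORKERS] = {0};
    int i, started = 0, finished = 0;

    for (i = 0; i < NWORKERS; i++) {
        if (pthread_create(&tid[i], NULL, worker, &done[i]) != 0)
            break;
        started++;
    }

    /* pthread_join() returns only when the thread has really terminated. */
    for (i = 0; i < started; i++) {
        if (pthread_join(tid[i], NULL) == 0)
            finished += done[i];
    }
    return started == NWORKERS ? finished : -1;
}
```

If main() has nothing to wait for at all, it can end with pthread_exit(NULL)
instead of returning, which lets the other threads run to completion.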

Secondly, while I understand the desire to provide thread identification
in your printout, you should be aware that this "(unsigned
long)pthread_self()" construct is bad practice, and definitely
unportable. The pthread_t type is opaque. While many implementations
make this either a small integer or a pointer, it could as easily be a
structure. Unfortunately, POSIX lacks any mechanism to portably identify
individual threads to humans (that's "debugging", which is out of
scope). I'm not saying "don't do it"; I just want to make sure you know
it's a platform dependent hack, not a legal or portable POSIX construct.
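
A portable workaround is to hand each thread an identifier of your own
choosing and print that instead of casting pthread_t. The sketch below
assumes a hypothetical struct worker_info and helper log_with_own_ids();
neither comes from the original program.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical per-thread record: the id is chosen by the application, so it
 * is printable everywhere, unlike the opaque pthread_t. */
struct worker_info {
    int  id;
    char line[64];   /* what the worker would have printed */
};

static void *worker(void *arg)
{
    struct worker_info *info = arg;
    snprintf(info->line, sizeof info->line, "worker %d: started", info->id);
    return NULL;
}

/* Start up to 8 workers, numbering them 1..n, and wait for them.
 * Returns the number of workers actually started. */
int log_with_own_ids(struct worker_info *info, int n)
{
    pthread_t tid[8];
    int i, started = 0;

    if (n > 8)
        n = 8;
    for (i = 0; i < n; i++) {
        info[i].id = i + 1;
        info[i].line[0] = '\0';
        if (pthread_create(&tid[i], NULL, worker, &info[i]) != 0)
            break;
        started++;
    }
    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started;
}
```

When two pthread_t values do have to be compared, pthread_equal() is the
portable way to do it.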

[[ This is another re-post of a response that was lost on the bad news
server. ]]


 Q308:  volatile guarantees??   

On Wed, 30 Aug 2000 11:28:46 -0700, David Schwartz  wrote:
>Joerg Faschingbauer wrote:
>> (Kaz Kylheku) writes:
>> > Under POSIX threads, you don't need volatile so long as you use the
>> > locking mechanism supplied by the interface.
>> How does pthread_mutex_(un)lock manage to get the registers flushed?
>    Who cares, it just does. POSIX requires it.

Everyone is saying that, but I've never seen a chapter and verse quote.
I'm not saying that I don't believe it or that it's not existing practice;
but just that maybe it's not adequately codified in the document.

To answer the question: how can it manage to get the registers flushed?
Whether or not the requirement is codified in the standard, it can be met
in a number of ways. An easy way to meet the requirement is to spill
registers at each external function call.

Barring that, the pthread_mutex_lock functions could be specially recognized by
the compiler. They could be, for instance, implemented as inline functions
which contain special compiler directives which tell the compiler to avoid
caching values in registers across them.

The GNU compiler has such a directive, for instance:

    __asm__ __volatile__ ("" : : : "memory");

The "memory" part takes care of defeating caching, and the __volatile__
prevents code motion of the inlined code itself.

Of course, GCC doesn't need this in the context we are discussing, because
it will do ``the right thing'' with external function calls.

I've only used the above as a workaround to GCC optimization bugs.
It can also be used as the basis for inserting a memory barrier instruction,
by putting the platform's barrier instruction in the (here empty) template
string:

    #define mb() __asm__ __volatile__ \
    ("" : : : "memory")

It's a good idea to do it like this so that the compiler's optimizations do
not make the memory barrier useless, by squirreling away data in registers
or moving the instruction around in the generated code.

This would be typically used in the implementation of a mutex function,
not in its interface, to ensure that internal accesses to the mutex object
itself are conducted properly. 


I'm having trouble with the meaning of C/C++ keyword volatile. I know you
declare a variable volatile wherever it may be changed externally to the
flow of logic that the compiler is processing and optimising. This makes the
compiler read from the ultimate reserved storage when it is accessed (or so
I believed).

I have seen a discussion in one of the comp.lang.c* groups where it is
suggested that the compiler does not always have to avoid optimising away
memory accesses. This seems logical - since a thread which alters the value
of a variable might not get scheduled, the value of the variable may not
change for some time (many times round a busy loop), so the compiler can use
a cached value for many loops without changing the guarantees made by the
machine abstraction defined in the standards (*good* for performance). That
then renders volatile practically undefinable (since a thread may legally
*never* be scheduled) and when it is for hardware changing a flag, the
hardware doesn't necessarily change memory (a CPU register may be changed).
Volatile behaviour seems best implemented with a function call which uses
some guaranteed behaviour internally (in assembler or other language).

Has anyone hashed this out before and come to any conclusion whether to
trust volatile or spread to other languages? because it's doing my head in
(I ask this here because it concerns concurrent programming and the people
here probably have the experience of this problem).

Tristan Wibberley

Kaz Kylheku wrote:
>Under preemptive threading, the execution can be suspended *at any point* to
>invoke the scheduler; pthread_mutex_lock is not special in that regard. Yet
>compilers clearly do not treat each instruction with the same suspicion that
>pthread_mutex_lock deserves.

pthread_mutex_lock() is special. While threads may be pre-empted at
any point, they are not permitted to access shared data, so the order
in which the operations are performed is irrelevant. By calling
pthread_mutex_lock() a thread gains permission to access shared data,
so at that point the thread needs to update any local copies of that
data. Similarly, by calling pthread_mutex_unlock() a thread
relinquishes this permission, so it must have updated the shared data
from any local copies. Between these two calls, it is the only thread
which is permitted to access the shared data, so it can safely cache
as it likes.

>How about the following paragraph?
>    The values of all objects shall be made stable immediately
>    prior to the call to pthread_mutex_lock, pthread_mutex_unlock,
>    pthread_mutex_trylock and pthread_mutex_timedlock.  The
>    first abstract access to any object after a call to  one of the
>    locking functions shall be an actual access; any cached copy of
>    an object that is accessed shall be invalidated.

That is unnecessarily restrictive. Suppose we have a buffering scheme
which uses this code:
    for (;;) {
        while (!items_ready) wait(data)
        if (!--items_ready) signal(space)
        use buffer[tail]
        tail = (tail + 1) % BUFLEN
    }

Analysis of the other uses of 'items_ready' may indicate that it can
be optimised into:
    local_ready = 0
    lastbatch = 0
    for (;;) {
        if (!local_ready) {
            items_ready -= lastbatch
            if (!items_ready) signal(space)
            while (!items_ready) wait(data)
            local_ready = lastbatch = items_ready
        } else local_ready--
        use buffer[tail]
        tail = (tail + 1) % BUFLEN
    }

Of course the analysis required to determine that this is a valid
optimisation is not simple, and I would not expect to find it in
current compilers, but I don't think the standard should prohibit it.

Kaz Kylheku wrote:

> The standard is flawed because it doesn't mention that calls to
> pthread_mutex_lock and pthread_mutex_unlock must be treated specially.
> We all know how we want POSIX mutexes to work, and how they do work in
> practice, but it should also be codified in the standard, even though
> it may be painfully obvious.

The standard requires memory coherency between threads based on the POSIX
synchronization operations. It does NOT specifically dictate the compiler or system
behavior necessary to achieve that coherency, because it has no power over the C
language nor over the hardware. Besides, it really doesn't matter how the
requirements are achieved, nor by whom.

An implementation (thread library, compiler, linker, OS, hardware, etc.) that
doesn't make memory behave correctly with respect to POSIX synchronization
operations simply does not conform to POSIX. This means, in particular, (because
POSIX does not require use of volatile), that any system that doesn't work without
volatile is not POSIX. Can such a system be built? Certainly; but it's not POSIX.
(It's also not particularly usable, which may be even more important to some

OK, you want chapter and verse? Sure, here we go. POSIX 1003.1-1996, page 32:

2.3.8 memory synchronization: Applications shall ensure that access to any memory
location by more than one thread of control (threads or processes) is restricted
such that no thread of control can read or modify a memory location while another
thread of control may be modifying it. Such access is restricted using functions
that synchronize thread execution and also synchronize memory with respect to other
threads. The following functions synchronize memory with respect to other threads:

      fork()                  pthread_mutex_unlock()    sem_post()
      pthread_create()        pthread_cond_wait()       sem_trywait()
      pthread_join()          pthread_cond_timedwait()  sem_wait()
      pthread_mutex_lock()    pthread_cond_signal()     wait()
      pthread_mutex_trylock() pthread_cond_broadcast()  waitpid()

In other words, the application is responsible for relying only on explicit memory
synchronization based on the listed POSIX functions. The implementation is
responsible for ensuring that correct code will see synchronized memory. "Whatever
it takes."
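
To make the rule concrete, here is a small sketch in which two threads share
plain (non-volatile) variables and rely only on functions from the list
above. The names ready, payload, producer() and wait_for_payload() are
invented for this example.

```c
#include <pthread.h>

static pthread_mutex_t lock     = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready_cv = PTHREAD_COND_INITIALIZER;
static int ready   = 0;   /* plain int: no volatile needed */
static int payload = 0;

static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    payload = 42;                    /* written only while holding the mutex */
    ready = 1;
    pthread_cond_signal(&ready_cv);
    pthread_mutex_unlock(&lock);     /* unlock makes the writes visible */
    return NULL;
}

/* Start the producer and wait until it has published a value.
 * Returns the value, or -1 if the thread could not be created. */
int wait_for_payload(void)
{
    pthread_t t;
    int value;

    if (pthread_create(&t, NULL, producer, NULL) != 0)
        return -1;

    pthread_mutex_lock(&lock);
    while (!ready)                   /* re-check the predicate after waking */
        pthread_cond_wait(&ready_cv, &lock);
    value = payload;                 /* read only while holding the mutex */
    pthread_mutex_unlock(&lock);

    pthread_join(t, NULL);
    return value;
}
```

Every access to ready and payload happens between a lock and an unlock of
the same mutex, which is all the application has to guarantee.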

Normally, the compiler doesn't need to do anything it wouldn't normally do for a
routine call to achieve this. A particularly aggressive global optimizer, or an
implementation that "inlines" mutex operations, might need additional compiler
support to meet the requirements, but that's all beyond the scope of the standard.
The requirements must be met, and if they are, application and library developers
who use threads just don't need to worry. Unless of course you choose to try to
create your own memory synchronization without using the POSIX functions, in which
case no current standard will help you and you're entirely on your own on each


 Q309:  passing messages, newbie?   

> Hi,
> Thank you all for the  info. i think i am
> on my way, things are working now.
> One more thing, what is the  correct way to
> make a thread sleep/delay.
> The "sleep()" call i think causes the whole
> process to sleep.

No, it cannot. At least, not in any legal implementation of POSIX threads.
(Nor, in my opinion, in any rational implementation of any usable thread

In practice, this happens in some "cheap" (by which I do not mean
"inexpensive") pure user-mode threading libraries. These used to be common
and widely used. There are now real thread packages available "just about
everywhere", and you should run from any implementation with this sort of

In any implementation that uses "multiple kernel execution entities", the
buggy behavior would actually be difficult to achieve, and nearly
impossible to get by accident. Under Linux, for example, threads are really
independent Linux processes. Just try "accidentally" getting another
process to block when you sleep (or read from a file).

> I saw in a paper a call "pthread_delay_np()"
> call to delay a thread, but i couldnt find the
> call in man pages on my Linux 2.2.12 with glibc-2.1.12.

You won't find it on any "pure" implementation of POSIX threads, because
that function doesn't exist. It's from the ancient and obsolete DCE threads
package (which was a cheap user-mode implementation of a long since defunct
draft of the document that eventually became POSIX threads). Because we
couldn't count on having sleep() work, and instead of using somewhat less
portable means to supersede the real sleep() by something that would work,
we introduced pthread_delay_np(). (Which is modelled after nanosleep().)
The function is retained in our current implementation of POSIX threads (on
Tru64 UNIX and OpenVMS) as an extension, partly for "cultural
compatibility" to help people upgrading from DCE threads and partly
because, on OpenVMS, there are still compilation/link modes where we can't
count on sleep() working correctly.

> So we have to delay a thread using sleep() only.

This is correct. Or usleep(), nanosleep(), select(), or whatever is
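
For instance, a per-thread delay can be built on nanosleep(). This is a
minimal sketch; thread_sleep_ms() and sleep_in_worker() are hypothetical
helper names, and the loop simply restarts the sleep if a signal interrupts
it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for nanosleep() under strict ISO C modes */
#endif
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: block only the calling thread for `ms` milliseconds.
 * Returns 0 on success, -1 on error (for example a negative delay). */
int thread_sleep_ms(long ms)
{
    struct timespec req, rem;

    req.tv_sec  = ms / 1000;
    req.tv_nsec = (ms % 1000) * 1000000L;

    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1;
        req = rem;                /* interrupted: sleep for what is left */
    }
    return 0;
}

static void *sleeper(void *arg)
{
    long *slot = arg;
    *slot = thread_sleep_ms(*slot);   /* replace the delay with the status */
    return NULL;
}

/* Let a worker thread sleep while the caller stays free to do other work.
 * Returns the worker's sleep status, or -1 on thread errors. */
int sleep_in_worker(long ms)
{
    pthread_t t;
    long slot = ms;

    if (pthread_create(&t, NULL, sleeper, &slot) != 0)
        return -1;
    /* The calling thread is not blocked by the worker's sleep here. */
    if (pthread_join(t, NULL) != 0)
        return -1;
    return (int)slot;
}
```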


 Q310:  solaris mutexes?   

Roy Gordon wrote:

> Is the following true:  Solaris mutexes only exist in the user address
> space (including any shared memory space); they have no associated
> kernel data structure.

True. For non-shared-memory mutexes. Ditto CVs & unnamed semaphores.

> If true, this would be opposed to system V semaphores.


> Also, if true, then a given mutex could be moved to a different
> address (suitably aligned) and as long as all threads (or processes, as
> the case may be) reference it at that address, then it would continue
> functioning as if it hadn't been moved.
> Is this correct too (if the initial assumption is correct, that is)?

You mean like a compacting garbage collector? Yeah. 'Matter of
fact, I believe that's what Java does on some platforms.



 Q311: Spin locks?  

I think it worth noting that spin locks are an efficiency hack for
SMP machines which are useful under a small number of situations.
Moreover, there is nothing that prevents you from using spin locks
all the time (other than a slight loss of efficiency).

In particular, in some libraries ALL locks are actually spin locks. Solaris
2.6 (or is that 7?) and above for example. If you call pthread_mutex_lock()
on an MP machine & the lock is held by a thread currently running on
another CPU, you WILL spin for a little while.

It is very unlikely you would EVER want to build a spin lock yourself.
(I mean, it would be kinda FUN and interesting, but not practical.)
If you *really* want to, go ahead, just time your program carefully.
$10 says you'll find home-spun spin locks won't help.

> BTW, SMP is a bad design that scales poorly. I wish someone could come
> up with a better design with some local memory & some shared memory
> instead.

Like democracy. It sucks, but we haven't come up with anything better :-)



> > BTW, SMP is a bad design that scales poorly. I wish someone could come
> > up with a better design with some local memory & some shared memory
> > instead.
> Like democracy. It sucks, but we haven't come up with anything better :-)

You would like SGI's high-end monster-machines... Ours consists of eight
"node boards," each of which has two CPU's and a local memory pool (512MB or
so). All the memory in the machine is visible to all processors, but IRIX
intelligently migrates individual pages towards the processors that are
hitting them most. It's like another level of cache... As long as each
thread/process stays within a modest, unshared working set, the system
scales very well.


Eppur si muove 

 Q312:  AIX pthread pool problems?   

Kaz Kylheku wrote:

> On Thu, 31 Aug 2000 17:52:08 GMT, sr  wrote:
> >/* lck.c */
> >/* AIX: xlc_r7 lck.c -qalign=packed -o lck */
> >/* LINUX: cc lck.c -lpthread -fpack-struct -o lck */
> Doh, have you read the GNU info page for gcc? Here is what it
> says about -fpack-struct:
> `-fpack-struct'
>      Pack all structure members together without holes.  Usually you
>      would not want to use this option, since it makes the code
>      suboptimal, and the offsets of structure members won't agree with
>      system libraries.

Kaz is quite correct, but maybe not quite firm enough...

NEVER, ever, under any circumstances, "pack" any structure that you didn't define.
If a header wants its structures packed, it'll do it itself. If it doesn't ask,
don't presume to tell it what it should do.

You asked the compiler to break your mutexes, and it let you, because it didn't
know any better. Now you do know better, so stop asking it. ;-)
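
If a packed layout really is needed (say, for a wire format), one option is
to request it for that one structure only and leave everything from system
headers alone. The sketch below assumes a compiler that accepts
"#pragma pack(push, 1)" (GCC, Clang and several others do); struct
wire_header and struct counter are invented for the example.

```c
#include <pthread.h>
#include <stddef.h>

/* A structure we define ourselves: packing is requested for it alone. */
#pragma pack(push, 1)
struct wire_header {
    unsigned char type;
    unsigned int  length;
};
#pragma pack(pop)

/* A structure that embeds a library type keeps its natural alignment, so
 * the layout of pthread_mutex_t still matches what the thread library
 * expects. */
struct counter {
    pthread_mutex_t lock;
    long            value;
};

/* Increment the counter under its mutex and return the new value. */
long counter_increment(struct counter *c)
{
    long v;

    pthread_mutex_lock(&c->lock);
    v = ++c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}
```

Compiled without -fpack-struct or -qalign=packed, only wire_header loses its
padding.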


 Q313:  iostream library and multithreaded programs?   


I took a class that you taught at Xilinx. I have written some small
multithreaded programs. These programs work fine if I don't use the iostream
library. If I use iostream, the programs hang with the following error:
  libc internal error: _rmutex_unlock: rmutex not held.

I have attached two files with this message: problem.txt and good.txt.
These files show the compile options. The only difference is that the
broken version uses an extra option, "-library=iostream,no%Cstd", in both
the compile and link lines. Unfortunately, this is the standard build option
at Xilinx. No one at Xilinx understands why this option breaks my
program. Could you help me with this problem? Thank you very much.



1) make output:

  CC -O -c -I../ -DSOL -DDLLIMPORT="" -DTEMP -DDEBUG ../Port_ThrTest.c
  CC -O -c -I../ -DSOL -DDLLIMPORT="" -DTEMP -DDEBUG ../Port_ThrMutex.c
  CC -O -c -I../ -DSOL -DDLLIMPORT="" -DTEMP -DDEBUG ../Port_ThrCondition.c
  CC -O -c -I../ -DSOL -DDLLIMPORT="" -DTEMP -DDEBUG ../Port_ThrBarrier.c
  CC -O -c -I../ -DSOL -DDLLIMPORT="" -DTEMP -DDEBUG ../Port_ThrThread.c
  "../Port_ThrThread.c", line 49: Warning (Anachronism): Formal argument 
start_routine of type extern "C" void*(*)(void*) in call to pthread_create(unsigned*, 
const _pthread_attr*, extern "C" void*(*)(void*), void*) is being passed void*(*)(void*).
  1 Warning(s) detected.
  CC  -L. -o Port_ThrTest Port_ThrMutex.o Port_ThrCondition.o Port_ThrBarrier.o 
Port_ThrThread.o Port_ThrTest.o -lpthread -lposix4

2) ldd output: =>       /usr/lib/ =>        /usr/lib/ =>  /usr/lib/ =>     /tools/sparcworks5.0/SUNWspro/lib/ =>     /usr/lib/ =>     /usr/lib/ =>   /usr/lib/ =>    /usr/lib/ =>        /usr/lib/


 Q314:  Design document for MT appli?   

> > If you describe what it is you need to achieve, someone in this forum
> > can advise against using threads, or if they think threads will be good,
> > how to use them for best effect.

Here is something I recently posted to the Linux kernel list:


Let's go back to basics. Take a look inside your computer. What do you see?

1) one (or more) CPUs
2) some RAM
3) a PCI bus, containing:
4)   -- a SCSI/IDE controller
5)   -- a network card
6)   -- a graphics card

These are all the parts of your computer that are smart enough to accomplish
some amount of work on their own. The SCSI or IDE controller can read data
from disk without bothering any other components. The network card can send
and receive packets fairly autonomously. Each CPU in an SMP system operates
nearly independently. An ideal application could have all of these devices
doing useful work at the same time.

When people think of "multithreading," often they are just looking for a way
to extract more concurrency from their machine. You want all these
independent parts to be working on your task simultaneously. There are many
different mechanisms for achieving this. Here we go...

A naively-written "server" program (eg a web server) might be coded like so:

* Read configuration file - all other work stops while data is fetched from
disk
* Parse configuration file - all other work stops while CPU/RAM work on
parsing the file
* Wait for a network connection - all other work stops while waiting for
incoming packets
* Read request from client - all other work stops while waiting for incoming
packets
* Process request - all other work stops while CPU/RAM figure out what to do
                  - all other work stops while disk fetches requested file
* Write reply to client - all other work stops until final buffer

I've phrased the descriptions to emphasize that only one resource is being
used at once - the rest of the system sits twiddling its thumbs until the
one device in question finishes its task.

Can we do better? Yes, thanks to various programming techniques that allow
us to keep more of the system busy. The most important bottleneck is
probably the network - it makes no sense for our server to wait while a slow
client takes its time acknowledging our packets. By using standard UNIX
multiplexed I/O (select()/poll()), we can send buffers of data to the kernel
just when space becomes available in the outgoing queue; we can also accept
client requests piecemeal, as the individual packets flow in. And while
we're waiting for packets from one client, we can be processing another
client's request.

The improved program performs better since it keeps the CPU and network busy
at the same time. However, it will be more difficult to write, since we have
to maintain the connection state manually, rather than implicitly on the
call stack.
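
The core of such a multiplexed loop is a single call that waits on many
descriptors at once. Here is a small sketch using poll(); wait_readable() is
a hypothetical helper, and a real server would keep a per-connection state
record next to each descriptor and loop over all ready entries.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_FDS 16

/* Wait up to timeout_ms for any of fds[0..n-1] to become readable.
 * Returns the index of the first readable descriptor, or -1 on timeout
 * or error. At most MAX_FDS descriptors are examined. */
int wait_readable(const int *fds, int n, int timeout_ms)
{
    struct pollfd pfd[MAX_FDS];
    int i;

    if (n > MAX_FDS)
        n = MAX_FDS;
    for (i = 0; i < n; i++) {
        pfd[i].fd      = fds[i];
        pfd[i].events  = POLLIN;
        pfd[i].revents = 0;
    }

    /* One system call watches every connection; none of them can stall
     * the others while we wait. */
    if (poll(pfd, (nfds_t)n, timeout_ms) <= 0)
        return -1;

    for (i = 0; i < n; i++) {
        if (pfd[i].revents & (POLLIN | POLLHUP))
            return i;
    }
    return -1;
}
```

The caller reads from the descriptor that was reported, updates that
client's state, and goes back to waiting.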

So now the server handles many clients at once, and it gracefully handles
slow clients. Can we do even better? Yes, let's look at the next
bottleneck - disk I/O. If a client asks for a file that's not in memory, the
whole server will come to a halt while it read()s the data in. But the
SCSI/IDE controller is smart enough to handle this alone; why not let the
CPU and network take care of other clients while the disk does its work?

How do we go about doing this? Well, it's UNIX, right? We talk to disk files
the same way we talk to network sockets, so let's just select()/poll() on
the disk files too, and everything will be dandy... (Unfortunately we can't
do that - the designers of UNIX made a huge mistake and decided against
implementing non-blocking disk I/O as they had with network I/O. Big booboo.
For that reason, it was impossible to do concurrent disk I/O until the POSIX
Asynchronous I/O standard came along. So we go learn this whole bloated API,
in the process finding out that we can no longer use select()/poll(), and
must switch to POSIX RT signals - sigwaitinfo() - to control our server***).
After the dust has settled, we can now keep the CPU, network card, and the
disk busy all the time -- so our server is even faster.

Notice that our program has been made heavily concurrent, and I haven't even
used the word "thread" yet!

Let's take it one step further. Packets and buffers are now coming in and
out so quickly that the CPU is sweating just handling all the I/O. But say
we have one or three more CPU's sitting there idle - how can we get them
going, too? We need to run multiple request handlers at once.

Conventional multithreading is *one* possible way to accomplish this; it's
rather brute-force, since the threads share all their memory, sockets, etc.
(and full VM sharing doesn't scale optimally, since interrupts must be sent
to all the CPUs when the memory layout changes).

Lots of UNIX servers run multiple *processes* - the "sub-servers" might not
share anything, or they might share a file cache or request queue. If we were
brave, we'd think carefully about what resources really should be shared between
the sub-servers, and then implement it manually using Linux's awesome
clone() API. But we're not, so let's retreat to the brightly-lit
neighborhood that is pthreads.

We break out the POSIX pthread standard, and find it's quite a bit more
usable than AIO. We set up one server thread for each CPU; the threads now
share a common queue of requests****. We add locking primitives around the
shared data structures in our file cache. Now as soon as a new packet or
disk buffer arrives, any one of the CPUs can grab it and perform the
associated processing, while the other CPUs handle their own work. The
server gets even faster.
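
A bare-bones version of that arrangement might look like the following
sketch: a fixed set of server threads pulling integer "requests" from one
queue protected by a mutex and condition variable. struct work_queue,
server_thread() and serve_requests() are invented names, and "processing" a
request here just means adding it to a running total.

```c
#include <pthread.h>

#define QCAP        64
#define MAX_THREADS 8

struct work_queue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int  items[QCAP];
    int  head, count;
    int  closed;      /* set once no more requests will arrive */
    long total;       /* shared result, updated under the lock */
};

static void *server_thread(void *arg)
{
    struct work_queue *q = arg;

    for (;;) {
        int req;

        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && !q->closed)
            pthread_cond_wait(&q->nonempty, &q->lock);
        if (q->count == 0) {              /* closed and fully drained */
            pthread_mutex_unlock(&q->lock);
            return NULL;
        }
        req = q->items[q->head];
        q->head = (q->head + 1) % QCAP;
        q->count--;
        pthread_mutex_unlock(&q->lock);

        /* Real processing would happen here, outside the lock. */

        pthread_mutex_lock(&q->lock);
        q->total += req;
        pthread_mutex_unlock(&q->lock);
    }
}

/* Queue the requests 1..nrequests, let nthreads servers drain them, and
 * return the sum of everything that was processed. */
long serve_requests(int nthreads, int nrequests)
{
    struct work_queue q;
    pthread_t tid[MAX_THREADS];
    int i, started = 0;
    long total;

    if (nthreads > MAX_THREADS) nthreads = MAX_THREADS;
    if (nrequests > QCAP)       nrequests = QCAP;

    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.nonempty, NULL);
    q.head = q.count = q.closed = 0;
    q.total = 0;

    for (i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, server_thread, &q) != 0)
            break;
        started++;
    }

    pthread_mutex_lock(&q.lock);
    for (i = 1; i <= nrequests; i++) {
        q.items[(q.head + q.count) % QCAP] = i;
        q.count++;
    }
    q.closed = 1;
    pthread_cond_broadcast(&q.nonempty);  /* wake every waiting server */
    pthread_mutex_unlock(&q.lock);

    for (i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    total = q.total;
    pthread_cond_destroy(&q.nonempty);
    pthread_mutex_destroy(&q.lock);
    return total;
}
```

In a real server the thread count would be chosen to match the number of
CPUs, and requests would keep arriving from the network instead of being
queued in one batch.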

That's basically the state-of-the-art in concurrent servers as it stands
today. All of the independent devices in the computer are being used
simultaneously; the server plows through its workload, never waiting for
network packets or disk I/O. There are still bottlenecks - for instance, RAM
and PCI bandwidth are limited resources. We can't just keep adding more CPUs
to make it faster, since they all contend for access to the same pool of RAM
and the same bus. If the server still isn't fast enough, we need a better
machine architecture that separates RAM and I/O busses into
concurrently-accessible pools (e.g. a high-end SGI server).

There are various other tricks that can be done to speed up network servers,
like passing files directly from the buffer cache to the network card. This
one is currently frowned upon by the Linux community, since the time spent
copying data around the system is small compared to the overhead imposed by
fiddling with virtual memory. Lots of work does go into reducing system call
and context switch overhead; that's one of the reasons TUX was developed.

Let's drop the "web server" example and talk about another application that
benefits from concurrency - number crunching. This is a much simpler case,
since the only resources you're worried about are the CPUs and RAM. To get
all the CPU's going at once, you'll need to run multiple threads or
processes. To get truly optimal throughput, you might choose to go the
process route, so that shared memory is kept to an absolute minimum. (Not
that pthreads is a terrible choice; it can work very well for this purpose)

In summary, when "multithreading" floats into your mind, think
"concurrency." Think very carefully about how you might simultaneously
exploit all of the independent resources in your computer. Due to the long
and complex history of OS development, a different API is usually required
to communicate with each device. (e.g. old-school UNIX has always handled
non-blocking network I/O with select(), but non-blocking disk I/O is rather
new and must be done with AIO or threads; and don't even ask about
asynchronous access to the graphics card =).

Don't let these differences obscure your goal: just figure out how to use
the machine to its fullest potential. That's the Linux way of doing things:
think, then act.

-- Dan

The ideas here mostly come from informative pages like Dan Kegel's "C10K", and from reading various newsgroup postings
and UNIX books.

*** POSIX AIO is so ugly, in fact, that it's not unheard-of to simply spawn
a pool of threads that handle disk I/O. You can send requests and replies
via a pipe or socket, which fits right in with the old select()/poll() event

**** If we're servicing many, many clients at once, then running a huge
select()/poll() in each thread will have outrageous overhead. In that case,
we'd have to use a shared POSIX I/O signal queue, which can be done with
clone(), but not pthreads... See Zach Brown's phhttpd

 Q315:  SCHED_OTHER, and priorities?   

Dale Stanbrough wrote:

> Patrick TJ McPhee wrote:
> > % thinking about them. I simply wanted to know if, given two threads that
> > % are available to run with different priorities, will SCHED_OTHER
> > % -always- choose the higher priority thread. Also will SCHED_OTHER
> > % -never- preempt a higher priority thread simply to run one of lower
> > % priority?
> >
> > SCHED_OTHER does not specify any particular scheduling policy. The
> > behaviour will vary from system to system.
> No it doesn't have to. There could be a part of the POSIX reference that
> says something like
>    "Under no circumstances should any scheduling policy preempt a
>     higher priority thread to run a lower priority thread".

There IS, for the realtime priorities that are defined by POSIX. But the whole
point of SCHED_OTHER (and for many good reasons) is to provide a "standard
name" for a policy that doesn't necessarily follow any of the POSIX rules.

> However it seems from other people's posting that there is no such
> restriction made, or in some cases, possible to be made. I suppose
> the next logical question to ask is...
>    Are there any SCHED_OTHER or other named policies other than
>    SCHED_RR and SCHED_FIFO that -do- such preemption?

That depends on your definitions and point of view. In the simplest terms,
from an external user view, the answer is a resounding "yes".

That's because many SCHED_OTHER implementations are based on standard
UNIX timeshare scheduling, for very good reasons. (That is, it has a long
history, it behaves in reasonable ways, and, perhaps most importantly, it
behaves in generally and widely understood ways.)

Reduced to general and implementation-independent terms, it works roughly like
this: each entity (thread or process) has TWO separate priorities, a "base"
priority and a "current" priority. You set (and see) only the base priority,
but the scheduler operates entirely on the current priority. This priority may
be adjusted to ensure that, over time, all entities get a "fair" share of the
processor resources available. Either compute-bound entities' current priority
may be gradually reduced from the base (where higher priorities are "better",
which is the POSIX model but not the traditional UNIX model), and/or the current
priority of entities that block may be gradually increased from the base. The net
result is that entities that have used a lot of CPU won't be given as much in the
future, while entities that haven't gotten much will be given more. In the end, it
all more or less evens out.

From the scheduler's point of view, higher priority entities are always
preferred. From your point of view, though, your high priority threads may
behave as if they were low priority, or vice versa. Truth is sometimes not
absolute. ;-)

Of course, the POSIX standard doesn't even require that level of "truth". It's
perfectly reasonable for SCHED_OTHER to completely ignore the specified
priority. (There's no requirement that SCHED_OTHER even have a scheduling
parameter.) Would such an implementation be useful? Not for some people
certainly... but they probably ought to be using realtime policies, which are
fully defined by POSIX.
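
For those who do need defined priority behavior, the request for a realtime
policy is made through the thread attributes. The sketch below is one
possible way to set that up; attr_init_fifo() is a hypothetical helper, the
choice of the midpoint priority is arbitrary, and actually creating a
SCHED_FIFO thread usually requires suitable privileges.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: prepare *attr for a SCHED_FIFO thread at the middle
 * of the priority range this system reports for that policy.
 * Returns 0 on success, an errno-style code or -1 on failure. */
int attr_init_fifo(pthread_attr_t *attr)
{
    struct sched_param sp;
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);
    int rc;

    if (lo == -1 || hi == -1)
        return -1;

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    sp.sched_priority = lo + (hi - lo) / 2;

    /* Without EXPLICIT_SCHED the new thread would inherit the creator's
     * policy and the settings below would be ignored. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &sp);

    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}
```

The priority range is queried at run time because POSIX leaves the actual
numbers to the implementation.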


 Q316:  problem with iostream on Solaris 2.6, Sparcworks 5.0?   

I found the cause of my problem. In my company, we have build tools that generate
makefiles with a list of options for compiling and linking. The link option list
ends with -Bstatic. -mt implicitly appends -lthread and other libraries to the
link command. This causes ld to look for libthread.a instead of the shared
library. The following link error went away once I removed -Bstatic.


"Webster, Paul [CAR:5E24:EXCH]" wrote:

> Meiwei Wu wrote:
> >
> > My test program is linked with a shared library. This shared library was
> > compiled and linked with -mt option.
> > If I compiled and linked with -mt option, I would get the following link
> > error:
> >
> >   ld: fatal: library -lthread: not found
> It sounds like your compiler isn't installed properly.  The WS5.0
> documentation says that to be multithreaded, your files must be compiled with
> the -mt option (which defines _REENTRANT for you) and linked with the -mt
> option (which links the thread library and libC_mtstubs in the correct order
> for you).
> Also, 5.0 on the sun has broken MT capabilities, especially when it comes to
> iostreams.  There are 3 patches available which help to fix this (and a bunch
> of other things):
> 107357-09
> 107311-10
> 107390-10
> --
> Paul Webster 5E24 - My opinions are my own :-) -
> Fifth Law of Applied Terror: If you are given an open-book exam, you will
>     forget your book.  Corollary: If you are given a take-home exam, you
>     will forget where you live. 

 Q317:  pthread_mutex_lock() bug ???

>Thanks for pointing it out. I had made a mistake. I was working at a
>Solaris 2.6 machine while looking at an older Solaris 2.5 Answerbook.
>To my surprise 2.6 man pages do not specify an EPERM return value
>though they make comments like "only the thread that locked a mutex can
>unlock it" in man pthread_mutex_unlock.
>My guess is that 2.5 and 2.6 had a bug but rather than fixing it in
>2.6, they just deleted the EPERM part from the RETURN part of the

No, not a bug.  This is completely intentional.

Mutexes require some expensive bus operations; if you do error
checking on unlock you suddenly require more of those expensive operations, so
mutex_unlock becomes a *lot* slower.

>Not really. POSIX ensures that the implementation will let only the locking
>thread unlock it. This is acknowledged by Solaris as well in
>their "only the thread that locked a mutex can unlock it" phrase. I
>also read this in Kleiman et al.'s "Programming with Threads".

You should really read this as "you're not allowed to do so,
but if you do, all bets are off."

>Andrew>If you want mutex locking to be error-checked, you need to
>Andrew>create the mutex with the PTHREAD_MUTEX_ERRORCHECK type
>PTHREAD_MUTEX_ERRORCHECK is a type of the mutex. Though I'm not sure, I
>suspect this was not there in the initial standard (both my books on
>pthreads do not make any mention of it). Solaris did not have a
>pthread_mutexattr_settype() interface till 2.7 (or 2.8? It
>definitely wasn't there till 2.6). Instead this was the default
>behavior as per the man pages.

All comes clear when you read the unlock page in S7:

     If the mutex type is  PTHREAD_MUTEX_NORMAL, deadlock  detection
     is not provided. Attempting to relock the mutex causes
     deadlock. If a thread attempts to unlock a mutex that it has
     not  locked or a mutex which is unlocked, undefined behavior
     results.

     If the mutex type is  PTHREAD_MUTEX_ERRORCHECK,  then  error
     checking is provided. If a thread attempts to relock a mutex
     that it has already locked, an error will be returned. If  a
     thread  attempts to unlock a mutex that it has not locked or
     a mutex which is unlocked, an error will be returned.
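
The difference between the two mutex types quoted above is easy to try out.
This is a small sketch under the assumption that the system provides
pthread_mutexattr_settype() (it is a UNIX98 interface, so very old systems may
not); the ErrorCheckProbe type and the function name are invented for the
example. It deliberately makes both mistakes on an error-checking mutex and
records what comes back.

```cpp
#include <pthread.h>

#include <cassert>
#include <cerrno>  // EDEADLK and EPERM, for comparing the results

// Hypothetical result record for this sketch.
struct ErrorCheckProbe {
  int relock_result;        // locking a mutex the caller already holds
  int stray_unlock_result;  // unlocking a mutex that is not locked
};

ErrorCheckProbe probe_errorcheck_mutex() {
  pthread_mutexattr_t attr;
  pthread_mutex_t mutex;

  int rc = pthread_mutexattr_init(&attr);
  assert(rc == 0);
  rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
  assert(rc == 0);
  rc = pthread_mutex_init(&mutex, &attr);
  assert(rc == 0);
  pthread_mutexattr_destroy(&attr);

  ErrorCheckProbe probe;
  rc = pthread_mutex_lock(&mutex);
  assert(rc == 0);
  // Second lock by the owner: reported as an error instead of deadlocking.
  probe.relock_result = pthread_mutex_lock(&mutex);
  rc = pthread_mutex_unlock(&mutex);
  assert(rc == 0);
  // Unlock of a mutex nobody holds: reported as an error as well.
  probe.stray_unlock_result = pthread_mutex_unlock(&mutex);

  pthread_mutex_destroy(&mutex);
  (void)rc;
  return probe;
}
```

With an error-checking mutex the specification names the two errors: EDEADLK
for the relock and EPERM for the stray unlock. Do not try the same experiment
with PTHREAD_MUTEX_NORMAL; the relock would simply hang, and the stray unlock
is undefined.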


 Q318:  mix using thread library?   


Hello. Yes, you can use both pthreads & solaris threads at the same time.
As a matter of fact, you do almost all the time! Most of the Solaris libraries
are written using Solaris threads.

I doubt that your hang has anything to do with mixing the two libraries.
But... how do you build your program? What does your compile line look
like?
If it hangs at malloc_unlock, then I would be suspicious that somewhere
your code is corrupting the library. I would use purify (or the Sun debugger's
bounds checker) to be certain that I wasn't writing past the end of an
array or to an invalid pointer.


> Bil:
>     How are you!
>     This is Christina Li at Lucent Technologies. I have a specific question for
> MT programming, hopefully not too bother you.
>     In the man page of thr_create, it shows like we can use together both
> pthread and Solaris thread library in an application( on Unix Solaris 2.5 ), and
> most of the books don't talk anything about mixed using the two libraries. But I
> heard from some people , that it is not safe to use both pthread and Solaris
> thread at the same time in an application.
>     I have a large application, over 150K line code. Sometimes it just
> mysteriously hang at some very low level, like malloc_unlock or pthread_unlock
> or other places.
> It seems like if I build the application with POSIX_PTHREAD_SEMANTICS and linked
> with pthread helps to bypass some hang.
>     But I am really not sure whether it is a true fix or not, I am quite
> confused of this, would you please help to share some of your ideas?
>     Thanks very much!
> Christina Li.

 Q319:  Re: My agony continues (thread safe gethostbyaddr() on FreeBSD4.0) ?  

Stephen Waits  writes:

>Well, hoping to avoid serial FQDN resolution, I trashed gethostbyaddr()
>and attempted to write my own "thread-safe" equivalent.

>After much research and plodding through nameser.h, this is what I ended
>up with.  The problem is that it STILL doesn't seem to be thread-safe. 
>If I mutex wrap the call to res_query() my proggie works great, just
>that DNS lookups remain serial :(   I'm assuming res_query() uses some
>static data somewhere along the line (going to read the source right
>after this post).

>ANY suggestions on where to go next (besides "use ADNS") much
>appreciated.

Run Solaris (I think from 7 onwards we have a multithreaded resolver library).

I believe future bind versions will have a threaded library.  Perhaps
bind 9 has it?

Of course, in Solaris the issue of concurrent lookups was somewhat
more pressing with one daemon doing all the lookups.

Expressed in this posting are my opinions.  They are in no way related
to opinions held by my employer, Sun Microsystems. 

 Q320:  OOP and Pthreads?   

"Mark M. Young"  wrote in message
> [...]
> After having said this, are you still
> suggesting that I use a function or macro to compare the addresses
> and lock the objects accordingly?  Is this common industry practice
> or something?

I can only speak for myself, but I have used it.

> So, the burden of locking an extra mutex, having an extra mutex in
> existence, and hurting serialization outweighs the burden of the
> complicated comparisons of addresses that might arise (e.g. 4
> objects)?  Once you go beyond 3 objects, the code to perform the
> comparisons would be ridiculous and I would like to have a clean
> technique used universally.

It is actually not that ridiculous if you can use the C++ STL
library. Here's an example:

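  #include <algorithm>
  #include <functional>
  #include <iostream>
  #include <queue>

  using namespace std;

  class ADT {
  public:
    ADT& operator += (const ADT& b);

  private:
    friend class LockObjects;

    // Stand-ins for real mutex operations; they only trace the order.
    int lock() {
      cout << "Locked  : " << this << endl;
      return 0; // Required by my C++/STL implementation.
    }

    int unlock() {
      cout << "Unlocked: " << this << endl;
      return 0;
    }
  };

  // Locks two objects, highest address first, and unlocks them in reverse.
  class LockObjects : private priority_queue<ADT*> {
  public:
    LockObjects(ADT* pArg1, ADT* pArg2)  {
      push(pArg1);
      push(pArg2);

      // c is the queue's underlying container. With two elements it is
      // in decreasing order; a larger heap would not be fully sorted.
      for_each(c.begin(), c.end(), mem_fn(&ADT::lock));
    }

    ~LockObjects() {
      for_each(c.rbegin(), c.rend(), mem_fn(&ADT::unlock));
    }
  };

  ADT& ADT::operator += (const ADT& rhs) {
    LockObjects
      lock(this, const_cast<ADT*>(&rhs));

    // Add the two.

    return *this;
  }

  int main() {
    ADT a, b;

    a += b;

    cout << endl;

    b += a;

    return 0;
  }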

Pushing into the priority_queue sorts the objects in decreasing
address order. In practice you probably want to separate the
sorting of the objects and the actual locking of them.

I have used this scheme on several occasions and it has worked
quite nicely. I don't know how expensive the use of priority_queue
is, as it in my case has not been of particular importance.
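
For what it is worth, here is one way the "sort first, lock second" separation
could look with real pthread mutexes and any number of objects. It is only a
sketch: the OrderedLocks class is a made-up name, it sorts in ascending rather
than decreasing address order (any order works as long as every thread uses
the same one), and it uses a plain std::sort instead of the priority_queue.

```cpp
#include <pthread.h>

#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical RAII helper: sorts first, then locks, as two separate steps.
class OrderedLocks {
public:
  explicit OrderedLocks(std::vector<pthread_mutex_t*> mutexes)
      : mutexes_(std::move(mutexes)) {
    // Step 1: fix one global order. std::less is a total order on pointers.
    std::sort(mutexes_.begin(), mutexes_.end(),
              std::less<pthread_mutex_t*>());
    // The same mutex passed twice (think a += a) must be locked only once.
    mutexes_.erase(std::unique(mutexes_.begin(), mutexes_.end()),
                   mutexes_.end());
    // Step 2: lock in that order.
    for (pthread_mutex_t* m : mutexes_) {
      const int rc = pthread_mutex_lock(m);
      assert(rc == 0);
      (void)rc;
    }
  }

  ~OrderedLocks() {
    // Unlock in the reverse of the locking order.
    for (auto it = mutexes_.rbegin(); it != mutexes_.rend(); ++it) {
      pthread_mutex_unlock(*it);
    }
  }

  OrderedLocks(const OrderedLocks&) = delete;
  OrderedLocks& operator=(const OrderedLocks&) = delete;

  // The order in which the mutexes were locked (ascending addresses).
  const std::vector<pthread_mutex_t*>& order() const { return mutexes_; }

private:
  std::vector<pthread_mutex_t*> mutexes_;
};
```

An operator+= would then build the list from the mutexes of its operands and
keep an OrderedLocks object alive for the duration of the operation. Sorting a
handful of pointers costs very little next to the lock operations themselves.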


All the other responders (Kylhelku,Wikman,Butenhof) had excellent
suggestions.  I prefer the key to the address for sorting the locks, for the
reason KK mentioned, but if you have operator< defined for your ADTs, you
could potentially use that for ordering, taking care in the case of
equality.  Wikman's priority queue (you could also use an STL set) is a nice
way to get the sorting cleanly.

I'm not sure whether I was clear in warning you off the class lock.  If you
have thousands of ADTs that are supposed to participate in binary operations
in multiple threads, by introducing the class lock you force the thousands
of operations to be carried out sequentially -- no parallelism is possible.
This will matter a lot on an SMP box, or if any of the operations involve
i/o or something else that blocks the processor.  Comparing a few addresses
(usually a pair of addresses) is trivial in comparison.

The suggestion to provide locking primitives and let the higher-level code
decide how to use them is very valuable.  Typically you provide, e.g.,
operator+ which grabs the locks on its operands, and then calls plusInternal
which does the addition.  Then operator+= grabs the locks and calls
plusInternal and assignInternal, thus avoiding the need for recursive
mutexes and a lock on the temporary.  If you are multiplying two matrices of
these ADTs, matrixMultiply grabs locks on all the elements of all the
matrices, then calls all the usual multiplyInternal and plusInternal members
avoiding a horrendous number of recursive lock/unlock calls.  If you had a
parallelMatrixMultiply operation, it would lock all the elements of both
matrices (and possibly the result matrix), hand off groups of rows from one
matrix and groups of columns from the other to the participating threads,
which would use the multiplyInternal,plusInternal and assignInternal
operations and proceed without taking any additional locks.

This is much more in the spirit of how the STL is used in a threadsafe
fashion (see the discussion in

By the way, you can implement a recursive mutex on top of the normal
pthread_mutex by wrapping pthread_mutex in a class which holds the owner
thread id and a reference counter.  When you are trying to lock, you see if
you are already the owner.  If so, you increment the reference count.  If
you are not, you call the pthread_mutex_lock function.  When you unlock, you
decrement the reference counter and only if it's zero do you call
pthread_mutex_unlock.  A full implementation can be found in the ACE
(Adaptive Communications Environment) library,
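
A rough sketch of the wrapper just described might look like the following.
This is not the ACE code; the class and member names are invented. One detail
goes beyond the description: the owner fields are std::atomic so that a thread
which does not hold the lock can safely look at them. That assumes pthread_t
is a trivially copyable type (it is an integer or pointer on the common
systems).

```cpp
#include <pthread.h>

#include <atomic>
#include <cassert>

// Hypothetical recursive mutex built on a normal pthread mutex.
class RecursiveMutex {
public:
  RecursiveMutex() : owner_(pthread_self()), has_owner_(false), count_(0) {
    pthread_mutex_init(&mutex_, nullptr);
  }
  ~RecursiveMutex() { pthread_mutex_destroy(&mutex_); }

  RecursiveMutex(const RecursiveMutex&) = delete;
  RecursiveMutex& operator=(const RecursiveMutex&) = delete;

  void lock() {
    const pthread_t self = pthread_self();
    // Already the owner? Then just bump the reference count.
    if (has_owner_.load() && pthread_equal(owner_.load(), self)) {
      ++count_;
      return;
    }
    // Not the owner: take the real mutex, then record ownership.
    pthread_mutex_lock(&mutex_);
    owner_.store(self);
    count_ = 1;
    has_owner_.store(true);
  }

  void unlock() {
    assert(count_ > 0);  // must only be called by the current owner
    if (--count_ == 0) {
      // Clear ownership before releasing, so a stale owner is never seen.
      has_owner_.store(false);
      pthread_mutex_unlock(&mutex_);
    }
  }

  // Current nesting depth; only meaningful when called by the owner.
  unsigned depth() const { return count_; }

private:
  pthread_mutex_t mutex_;
  std::atomic<pthread_t> owner_;
  std::atomic<bool> has_owner_;
  unsigned count_;  // touched only while holding mutex_
};
```

The count needs no protection of its own because only the thread that holds
the underlying mutex ever reads or writes it. Where the system offers
PTHREAD_MUTEX_RECURSIVE, using that type directly is simpler than any wrapper.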


 Q321:  query on threading standards?   


I don't know if you recall or not, but we talked at Usenix in San Diego
about threading under Linux.

I was curious about your opinion on a couple of things.

(1)  Have you been following the threading discussion on linux kernel
     (summarized at those of us that have lives)?
     I wondered if you had any opinions on the Posix thread discussion
     and Linus's evaluation of pthreads.
(2)  Is there a current POSIX standard for Pthreads?  Where might one find
     or obtain this?
     I couldn't find a reference to where the standard is in
     "Multithreaded Programming with pthreads".
(3)  There also has been a lot of discussion on the mailing list about some
     changes that Linux has put into 2.4.0-test8 to support "thread groups".
     This is a way to provide a container for Linux threads (the process
     provides this container on most other operating systems).  Apparently
     this breaks the current Linux implementation of pthreads.  But other
     than that it is a good thing that should allow a better implementation
     of pthreads under Linux, but no details are forthcoming at the
     moment.....

Best Regards,

Ray Bryant
IBM Linux Technology Center

We are Linux. Resistance is an indication that you missed the point

"...the Right Thing is more important than the amount of flamage you need
to go through to get there"
 --Eric S. Raymond 


Yeah, I remember. It was right after the BOF, right?

> Bill,
> I don't know if you recall or not, but we talked at Usenix in San Diego
> about threading under Linux.
> I was curious about your opinion on a couple of things.
> (1)  Have you been following the threading discussion on linux kernel
>      (summarized at those of us that have lives)?
>      I wondered if you had any opinions on the Posix thread discussion
>      and Linus's evaluation of pthreads.

I just took a gander at it. Now I won't say I got it all figured out in 15
minutes of scanning, but Linus is, as is often the case, full of himself. He
talks about "Right" (Linux) vs. "Engineered" (POSIX) as a moral battle. I probably
agree with him on many points, but at the end of the day I want something that
works & is used. PThreads fits the bill.

(I notice that Linus is not about using other engineered solutions...)

> (2)  Is there a current POSIX standard for Pthreads?  Where might one find
>      or obtain this?
>      I couldn't find a reference to where the standard is in
>      "Multithreaded Programming with pthreads".

It's in there. For $130 (no free on-line access :-(  ) you can buy it from
IEEE.  NB: POSIX (1995) vs. UNIX98 (the follow on with a few slight

> (3)  There also has been a lot of discussion on the mailing list about some
>      changes that Linux has put into 2.4.0-test8 to support "thread groups".
>      This is a way to provide a container for Linux threads (the process
>      provides this container on most other operating systems).  Apparently
>      this breaks the current Linux implementation of pthreads.  But other
>      than that it is a good thing that should allow a better implementation
>      of pthreads under Linux, but no details are forthcoming at the
>      moment.....

No idea about that. It sounds very odd, considering Linus' railing about
LWPs being broken. I'd want to hear a *very* good reason for threads
groups, esp. as they are pretty much useless in Java.



Yes, we talked right after the BOF at Usenix.

I think the thing that upsets me about the Linus/Linux discussion on
threading is the utter contempt the Linux core team has for the rest of the
world.  I mean I am willing to accept that POSIX threads were not designed to
fit into the Linux threading model.  But it seems to me that the people who
worked on the POSIX thread standard were trying to pull together a consensus
among a wide variation of expectations and requirements, and that they likely
did this in a conscientious, diligent, and competent way.  To call that work
"shit or crap" is disrespectful to the people who tried (quite hard) to make
it a good standard.  Oh well.

Yes, I eventually found in your book where to go order the POSIX
specification.  But I figured not only would I not get my management to cough
up $143; I really didn't want to read a 784 page document.  I think I am going
to go with the Butenhof book instead.

The thread group changes that Linus has put in are an attempt to provide a
"container" for a program's threads.  Using this, one can send a signal to the
container and have one of the threads in the thread group that is enabled for
the signal be the one that gets the signal, as per POSIX semantics.  At the
moment, however, the changes break pthreads.  Oh well,

Best Regards,

Ray Bryant
IBM Linux Technology Center


 Q322:  multiprocesses vs multithreaded..??   

>Patrick TJ McPhee wrote:
>> In article <>,
>> David Schwartz   wrote:
>> %       Threads are inherently faster than multiple proceses.
>> Bullshit.
>    Refute it then.

For Linux users, Question/Answer 6 of this interview:

by kernel developer Ingo Molnar on the TUX webserver is
quite illuminating -- he notes the context switch time for
two threads and for two processes under Linux is identical,
around 2 microseconds on a 500 MHz PIII. Ingo makes
the case that using threads (instead of just fork()'ing off
processes) under Linux should be reserved for:

  "where there is massive and complex interaction between
   threads. 98% of the programming tasks are not such. 
   Additionally, on SMP systems threads are *fundamentally
   slower*, because there has to be (inevitable, hardware-
   mandated) synchronization between CPUs if shared VM is used."

Just passing along an interesting interview, not necessarily
my personal opinion -- actually, on the project I'm working on
these days, using non-blocking I/O of multiple streams in a single
process turns out to be the best way of doing things, so I 
vote for "none of the above" :-).

John Lazzaro -- Research Specialist -- CS Division -- EECS -- UC Berkeley
lazzaro [at] cs [dot] berkeley [dot] edu

=================================TOP=============================== ?


Jani Kajala

"Mark M. Young"  wrote in message
> I've read the FAQ, I've searched the net.  Could someone help me to a
> Win32 Pthreads implementation (I don't use Windows by choice)?

 Q323:  CGI & Threads?   

Do a web search for a standard called "Fast CGI".  It basically uses a LWP
with a thread pool to service CGI requests.  The nice thing about Fast CGI
is that any program written to be a Fast CGI executable is still compatible
as a standard CGI executable.


Shelby Cain

"Terrance Teoh"  wrote in message
> Hi,
> Has anyone seen any CGI done in either C / C++ using threads ?
> Basically I am thinking of reducing resources taken up when there are
> too many
> access being done at the same time ?
> Thoughts ? Pointers ? Comments ?
> Thanks !
> Terrance

 Q324:  Cancelling detached threads (posix threads)?   

Jason Nye wrote:

> I'm trying to find out whether the posix specification allows the
> cancellation of detached threads. I have a copy of Butenhof's book and in
> the posix mini-reference at the end of the book, he says that pthread_cancel
> should fail if the target thread is detached. This makes sense to me, but is
> this the correct behaviour? For example LinuxThreads does allow cancellation
> of a detached thread -- who is correct?

This is actually an ambiguity in the standard. An implementation that allows
cancellation of a detached thread doesn't violate the standard. HOWEVER, other
provisions of the standard make such an allowance of questionable value, at
best. For example, when a detached thread terminates the state of that thread
is immediately invalidated, and may be reused immediately for a new thread. At
that point (which you cannot determine) you would be cancelling some new thread
that you probably didn't create, with possibly disastrous consequences to your
application.
There's no excuse for ever cancelling a detached thread. If you do, you may be
breaking your application. If it works today, it might not work tomorrow, for
reasons you cannot easily determine.

In other words, regardless of what the standard says, this is an application
error that individual implementations may or may not detect. (And in the
general case, implementations that reuse pthread_t values cannot detect the
cases where it really matters, because when you cancel a reused pthread_t, the
value is valid at the time.)

So, if you believe my interpretation, you're less likely to get yourself into
trouble by taking advantage of dangerous (and in the final analysis, unusable)
loopholes provided by other implementors. ;-)
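
The safe pattern that follows from this is to leave any thread you intend to
cancel joinable, and to join it after cancelling. Here is a minimal sketch of
that pattern; the worker and the function name are invented for the example,
and the worker does nothing but sleep so that it regularly passes a
cancellation point.

```cpp
#include <pthread.h>
#include <time.h>

#include <cassert>

// Hypothetical worker: loops forever, passing a cancellation point each round.
static void* worker(void*) {
  for (;;) {
    struct timespec ts = {0, 1000000};  // 1 ms
    nanosleep(&ts, nullptr);            // nanosleep() is a cancellation point
  }
  return nullptr;  // not reached
}

// Create a joinable thread, cancel it, and join it.
// Returns true if the thread reported that it was cancelled.
bool cancel_joinable_thread() {
  pthread_t tid;
  int rc = pthread_create(&tid, nullptr, worker, nullptr);  // joinable by default
  assert(rc == 0);

  // A joinable thread's ID stays valid until it has been joined, so this
  // cancel cannot land on some unrelated, recycled thread.
  rc = pthread_cancel(tid);
  assert(rc == 0);

  void* result = nullptr;
  rc = pthread_join(tid, &result);
  assert(rc == 0);
  (void)rc;
  return result == PTHREAD_CANCELED;
}
```

Because the thread is never detached, the pthread_t cannot be reused behind
your back between the cancel and the join. If you no longer care about the
thread once it has been cancelled, you can detach it after pthread_cancel()
instead of joining it.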

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

 Q325: Solaris 8 recursive mutexes broken?   

"Bill Klein"  wrote in message
> Hey,
> I came on to this newsgroup prepared to ask exactly this question - I'm
> trying to get recursive mutexes working under Solaris 7 but
> am having no luck at all.
> "Spam Me Not"  wrote...
> > Confirmed from sun's web page:
> Have you tried the patch? Does it actually solve all problems?
> Thanks!

Yes, I tried the patch.  Look a few messages back on this
message thread, where I posted a program to test this, and the
4 resulting cases.   Case #2 was fixed by the patch, which was
simple pthread_mutex_lock and pthread_mutex_unlock of a recursive
mutex by multiple pthreads.  The patch did not fix case #1, where
I used a mutex that happens to be recursive (but where the
recursion is never used AFAIK) in a pthread_cond_wait.  In this
case, the cond_wait never returns, so there's still a problem
with recursive mutexes even with this patch.  Note that sun's
documentation suggests not using recursive mutexes in cond_waits,
because a recursive mutex with count > 1 won't be fully released
by cond_wait, but I don't think that note applies to my program,
which doesn't take recursive locks of the mutex.

Anyone know how to submit a bug report to sun? :)  Do you have to
register on sunsolve first?

>The problem first showed up in Solaris 7 and there was a
>patch (106980-13) that fixed the problem.
>Now I've upgraded my system to Solaris 8 and the problem is
>back.  Obviously the Solaris 7 patch was not propagated
>into the Solaris 8 code and there does not appear to be a
>patch for the Solaris 8 system.  I've searched the SunSolve
>web site with no luck so far.

The Solaris 8 sparc equivalent patch for 106980-13 is 108827-05.

>Does anyone know about this problem and if so, is there a

I don't know what the problem is, because you haven't said.

>workaround for it?

A quick mod to the program to dump out the error gives:

lock: 251, thread 4
lock: 252, thread 4
lock: 253, thread 4
lock: 254, thread 4
lock failed: Resource temporarily unavailable, 255, thread 4
lock failed: Resource temporarily unavailable, 256, thread 4
lock failed: Resource temporarily unavailable, 257, thread 4
lock failed: Resource temporarily unavailable, 258, thread 4

"Resource temporarily unavailable" = EAGAIN

man pthread_mutex_lock...
           The mutex could not be acquired  because  the  maximum
           number   of    recursive  locks  for  mutex  has  been
           exceeded.
So what's the fault?  It seems to be behaving exactly as described.

Andrew Gabriel
Consultant Software Engineer

 Q326:  sem_wait bug in Linuxthreads (version included with glibc 2.1.3)?   

Jason Andrew Nye wrote:

> The problem is that POSIX-style cancellation is very dangerous in C++ code
> because objects allocated on the stack will never have their destructors
> called when a thread is cancelled (leads to memory leaks and other nasty
> problems).

This statement is not strictly true. Only an implementation of POSIX thread
cancellation that completely ignores C++, combined with an implementation of
C++ that completely ignores POSIX thread cancellation, results in a dangerous
environment for applications that use both in combination. Because POSIX
cancellation was designed to work with exceptions (it was in fact designed to
be implemented as an exception), the combination is obvious and natural, and
there's simply no good excuse for it to not work.

Personally, I think it's very near criminal to release an implementation where
C++ and cancellation don't work together. Developers who do this may have the
convenient excuse that "nobody made them" do it right. The C++ standard doesn't
recognize threads, and POSIX has never dealt with creating a standard for the
behavior of POSIX interfaces under C++. (Technically, none of the POSIX
interfaces are required to work under C++, though you rarely see a UNIX where
C++ can't call write(), or even printf().) Excuses are convenient, but this is
still shallow and limited thinking. I don't understand why anyone would be
happy with releasing such a system.

I spent a lot of time and energy educating the committee that devised the ABI
specification for UNIX 98 on IA64 to ensure that the ABI didn't allow a broken
implementation. Part of this was simply in self defense because a broken ABI
would prohibit a correct implementation.  I'd also had some hope that the
reasonable requirements of the ABI would eventually percolate up to the source
standard. More realistically, though, I hoped that by forcing a couple of C++
and threads groups to get together and do the obviously right (and mandatory)
thing for IA64, they might do the same obviously right (though not mandatory)
thing on their other platforms. Maybe someday it'll even get to Linux.

Please don't settle for this being broken. And especially, don't believe that
it has to be that way. Anyone who can implement C++ with exceptions can create
a language-independent exception facility that can equally well be used by the
thread library -- and, with a few trivial source extensions, by C language code
(e.g., through the POSIX cleanup handler macros).

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

On Mon, 06 Nov 2000 09:16:42 -0500, Dave Butenhof wrote:
>environment for applications and use both in combination. Because POSIX
>cancellation was designed to work with exceptions (it was in fact designed to
>be implemented as an exception), the combination is obvious and natural, and
>there's simply no good excuse for it to not work.

What should the semantics be, in your opinion? POSIX cleanup handlers first,
then C++ unwinding? Or C++ unwinding first, then POSIX cleanup handlers? Or
should the proper nesting of cleanup handlers and C++ statement blocks be

I understand that in the Solaris implementation, the POSIX handlers are done
first and then the C++ cleanup.  How about Digital UNIX? My concern is what GNU
libc should do; where there isn't a standard, imitating what some other popular
implementations do would make sense.

My opinion is that they should be executed in the only possible correct or useful
order.

( ;-) -- but only for the phrasing, not the message.)

Each active "unwind scope" on the thread must be handled in order. (The opposite
order from that in which they were entered, of course.)

The obvious implementation of this is that both C++ destructors (and catch
clauses) and POSIX cleanup handlers, are implemented as stack frame scoped
exception handlers, and that each handler is executed, in order, as the frame is
unwound by a single common unwind handler.

Any other order will break one or the other, or both.

> I understand that in the Solaris implementation, the POSIX handlers are done
> first and then the C++ cleanup.  How about Digital UNIX? My concern is what GNU
> libc should do; where there isn't a standard, imitating what some other popular
> implementations do would make sense.

I don't know the details of the Solaris implementation, but what you describe is
clearly broken and useless except in trivial and contrived examples.

We, of course, do it "correctly", though it could be cleaner. For example, right
now C++ code can't catch a cancel or thread exit except with the overly general
"catch(...)", because C++ isn't allowed to use pthread_cleanup_push/pop, (and
shouldn't want to since C++ syntax is more powerful), and C++ doesn't have a name
for those "foreign" exceptions. (Of course destructors work fine.) We've worked
with the compiler group to add some builtin exception subclasses to deal with
that, but we never found the time to finish hooking up all the bits.

Our UNIX was architected from the beginning with a universal calling standard that
supports call-frame based exceptions. All conforming language processors must
provide unwind information (procedure descriptors) for all procedures, and a
common set of procedures (in libc and libexc) support finding and interpreting the
descriptors and in unwinding the stack. Our C compiler provides extensions to
allow handling these native/common exceptions from C language code. Our
 uses these extensions to implement POSIX cleanup handlers. (For other
C compilers, we use a setjmp/longjmp package built on native exceptions "under the
covers", though with some loss of integration when interleaved call frames switch
between the two models. Support for our extensions, or something sufficiently
similar, would allow me to make gcc work properly.) Both cancel delivery and
pthread_exit are implemented as native exceptions. The native unwind mechanism
will unwind all call frames (of whatever origin) and call each frame's handler (if
any) in the proper order. (Another minor glitch is that our exception system has a
single "last chance" handler, on which both we and C++ rely. We set it once at
initialization, but C++ sets it at each "throw" statement, which will break
cancellation or thread exit of the initial thread since we can't put a frame
handler at or below main(). This is also fixed by our not-quite-done integration
effort with C++.)

This is all covered by the IA64 ABI. Of course it specifies API names, and data
structure sizes and contents. It's also somewhat more biased than our
implementation towards C++, since it was a generalization and cleanup of the C++
ABI section on exceptions rather than something designed independently. (The ABI,
and any C++ implementation, had to do this anyway. Making it general was only a
little more work than making it exclusive to C++, and of fairly obvious value.)

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

 Q327: pthread_atfork??   
Steve Watt wrote:

> > [ ... ]  Because the thread that calls fork() owns all of your locks in the
> >parent, and because that thread (and only that thread) exists in the child, the
> >child thread owns all of the locks when the CHILD handler is called, and it can
> >be sure that all of your data (protected by the locks) is clean and consistent.
> I mostly agree with this, except for the statement that the child thread
> owns all the locks.  Specifically, what about error checking mutexes
> a'la UNIX98?  Can the child process really unlock them?  Does an
> implementation have to keep some manner of PID information around so that
> new threads in the child would correctly EPERM?

You're of course welcome to agree or disagree, but be warned that when it comes to
matters of POSIX threads interpretations, suggesting disagreement with me can lead
to long and complicated replies containing detailed analyses and interpretations of
the relevant sections of the standard. You've been warned. ;-)

Yes, I deliberately and carefully said that the child owns the locks. That's what I
meant, which is why I said it.

That is, in the child process, the recorded owner (of any mutex for which owner is
recorded) of a mutex locked by the thread that called fork() in the parent IS the
single thread in the child process. POSIX does not specify that the thread ID of the
single thread in the child is identical to the ID of the forking thread in the
parent, but it does require that any necessary "transfer of ownership" be made
transparent to the application.

If you locked it in a PREPARE handler, you can unlock it in the CHILD handler, no
matter what type of mutex it was.
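
As an illustration of that intent, here is a small sketch of the usual
three-handler arrangement. The lock and all function names are made up, and
the check in the child relies on the behavior described above rather than on
anything the text of the standard guarantees.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>

// Hypothetical lock protecting some library state.
static pthread_mutex_t g_state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t g_install_once = PTHREAD_ONCE_INIT;
static int g_install_result = 0;

// PREPARE: runs in the forking thread before fork(); take the lock so the
// protected data is consistent at the moment the child is created.
static void prepare_handler() { pthread_mutex_lock(&g_state_lock); }
// PARENT: runs in the parent after fork(); release the lock again.
static void parent_handler() { pthread_mutex_unlock(&g_state_lock); }
// CHILD: runs in the child's only thread, which now owns the lock.
static void child_handler() { pthread_mutex_unlock(&g_state_lock); }

static void register_handlers() {
  g_install_result =
      pthread_atfork(prepare_handler, parent_handler, child_handler);
}

// Register the handlers exactly once, however often this is called.
int install_fork_handlers() {
  pthread_once(&g_install_once, register_handlers);
  return g_install_result;
}

// Fork, and report whether the child found the lock free after its handler.
bool child_sees_unlocked_mutex() {
  const pid_t pid = fork();
  assert(pid != -1);
  if (pid == 0) {
    const int rc = pthread_mutex_trylock(&g_state_lock);
    _exit(rc == 0 ? 0 : 1);
  }
  int status = 0;
  const pid_t waited = waitpid(pid, &status, 0);
  assert(waited == pid);
  (void)waited;
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The registration is wrapped in pthread_once() because registering the same
handlers twice would make the PREPARE handler lock a normal mutex twice and
hang the next fork().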

At least... this is the INTENT of the POSIX working group. Unfortunately, the text
of the standard is somewhat less clear than one might like. Little is said about
what pthread_atfork() does, or how or why you might use it, except in the RATIONALE
section, which is explicitly NOT a binding part of the standard. (It's commentary
and explanations, but can place no requirements on either implementation or
application.) The description of fork() also is not particularly useful because,
despite the clear implication (by having pthread_atfork()), the standard says that
applications may call only "async-signal safe" functions between the return from
fork() (in the child) and a call to one of the async-signal safe exec*() functions,
and mutex operations are not async-signal safe. (But then, technically, the atfork
CHILD handlers, which are called implicitly by the user-mode wrapper of the _fork()
syscall, are not actually called "by the application" after return from fork();
thereby adding yet another level of fuzzy haze to the dilemma.)

What this all means is that we (the working group) didn't spend nearly enough time
reviewing the vast body of the POSIX standard to find words and implications that
should have been changed. Originally, the thread standard was a completely separate
document, though it modified certain sections of 1003.1. That made a thorough review
awkward. Eventually, the 1003.1c amendment text was integrated with the standard. We
found many of the resulting inconsistencies and holes -- but not all of them.
Unfortunately, some areas, like this one, are not mere editorial changes; fixing the
standard to say what we meant could break some implementations that currently
conform to the letter (while violating the spirit).

What this really means is that use of pthread_atfork() may be broken (and unusable)
on some implementations; and those implementations may not be technically
"nonconforming". We were always aware this would occur in some cases, because we
knew we couldn't make the standard perfect. Many such issues that came up were
dismissed as simple matters of "quality of implementation". Nobody, obviously, would
buy a broken implementation. (The flip side, to which we didn't pay sufficient heed,
is that people DO buy broken implementations all the time, or are forced to use such
systems bought by others, learn to accept the limitations, and even expect them of
other systems.)

"Life's not fair."

> What about thread-specific data?  Should the thread in the child get the
> same thread-specific data as the thread that called fork()?  What if
> the result of pthread_self() is different, such that pthread_equal won't
> say they're equal?

The standard doesn't require that the thread ID will be the same, though we assumed
it usually would be. This wasn't an omission. While we said that thread IDs are
private to the process, there was some interest in "not precluding" an
implementation where thread IDs are global. If thread IDs are global, the thread in
the child must have a unique ID. This silly and irrelevant intent, however, has
certain implications, adding to the general "fuzz" around fork(), because it implies
that the ownership information of mutexes would need to be fixed up; but that's not
actually required anywhere. (In fact, this could be considered a technical violation
of the requirement that the child has a copy of the full address space, "including
synchronization objects".)

Nevertheless, in any implementation crafted by "well intentioned, fully informed,
and competent" developers, it must be possible to use pthread_atfork() (with
sufficient care) such that normal threaded operation may continue in the child. On
any such implementation, the thread ID of the child will be the same as in the
parent, all mutexes properly locked in PREPARE handlers will be unlockable in
CHILD handlers, all thread-specific data attached to the forking thread in the
parent will be accessible to the single thread in the child, and so forth.

This may not apply to Linuxthreads, if, (as I have always assumed, but never
verified), the "thread ID" is really just the pid. At least, pthread_self() in the
child would not be the same as in the parent. This is just one of the reasons that
building "threads" on top of processes is wrong; though the nonconformance of the
consequences here is, as I've detailed, somewhat less clear and absolute than in
other places. (Nevertheless, this substantially and clearly violates the INTENT of
the working group, and may render pthread_atfork(), an important feature of the
standard, essentially useless.) The thread library could and should be smart enough
to fix up any recorded mutex (and read-write lock) ownership information, at least;
and TSD should be carried over because it's "just memory" and there's no reason to
do anything else.

> >The CHILD handler may also just unlock and continue, though more commonly it
> >will do some cleanup or reinitialization. For example, it might save the current
> >process pid somewhere, or reset counters to 0.
> I generally think that about the only good thing to do in the child
> handler is re-initialize the IPCs.

"The only good thing" to do in CHILD handlers is whatever is necessary to clean up
and get ready for business. If you don't do that, there's no point to even
bothering... in which case you just can't expect to fork() a threaded process at

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/
> Does the std address a forkall() concept vs. fork1()?

There's some rationale (commentary that's not part of the standard) explaining that
forkall was proposed and rejected. It doesn't bother to explain the problems with

> > the requirement that the child has a copy of the full address space
> Implying _all_ threads too - a "forkall()" concept. Doesn't POSIX
> replace fork() with fork1(), thus the above requirement is
> not violated since the "true" fork() is not called?

"Full address space" doesn't imply "all threads" at all, except perhaps to a "pure user
mode" thread library. Kernel threads don't live in the process address space.

POSIX doesn't "replace fork" with anything. POSIX **defines** fork. Rather, Solaris
"replaces fork" with their proprietary fork1 interface. (Though only, of course, in
nonstandard compilation environments.)

The concept of "forkall" is foolish. You can't arbitrarily replicate execution streams
unless you can tell them what happened, and there's simply no way to do that. (Solaris
allows that threads in blocking syscalls "might" return EINTR, but that's all, and it's
not nearly enough.) With a single execution context for each process, fork was just fine,
because the execution stream asked for the copy, and knows what happened on both sides of
the fork. When you have multiple independent execution contexts, you have to deal with
the fact that you don't know what any other context is doing, and it doesn't know what
you're doing.

A lot of new mechanism would have to be invented, and many complicated constraints added,
to make "forkall" a useful interface. Each cloned execution context would need to be
immediately notified, and it would need to be able to "clean up" in whatever way
necessary, including terminating itself. This might be done by delivering a signal, but
much of the cleanup likely to be necessary (and thread termination) cannot be done in a
signal handler.

Forkall was proposed. We discussed it a lot. We dismissed it as far too complicated, and
way beyond any rational interpretation of the working group's scope and charter. To some
people who don't look deeply, forkall seems "simpler" than pthread_atfork; but it is
actually vastly more complicated. Unless you don't care about correctness or usability.

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

 Q328: Does GNU Pth library support process shared mutexes? 

CoreLinux++ WILL support process shared mutexes through a combination of
shared memory and semxxx. This will take a few weeks to implement, and any
application needing it will have to use the libcorelinux++ libraries.

It is also C++.

Frank V. Castellucci
 Q329: I am trying to make a thread in Solaris to get timer signals. 

I am trying to make a thread in Solaris to get timer signals
every second. I am using setitimer() and sigwait() to set up
and catch the signals respectively.

I am sorry to tell you that setitimer()/sigwait() does not work
with the threads/pthreads library on Solaris.  I won't go into
the details, but it is a sorry tale.

To make a thread do a periodic action, use cond_timedwait()
on a dummy cond_t/mutex_t pair that has no other function
than to be used in the call to cond_timedwait().

The thread will wakeup at the time you specify.
It can then do the periodic thing and reissue the
cond_timedwait() to wait another interval.

Roger Faulkner

 Q330: How do I time individual threads? 

I am getting very puzzling behavior using 2 threads on a 2 processor Solaris
computer. I am new to Solaris, so I am probably just doing something stupid.

I am writing code to parallelize some numeric computations. It was
originally written on NT, and I am porting it to Solaris. I am using a dual
processor Dell NT and a dual processor Sun Solaris for development. The
threads are very simple, and can operate completely independently. The code
ports easily from the point of view of compiling, linking, and executing.
However, on NT, I get over 90% speedup in using two threads, but on Solaris
I get almost none (at most about 15%).

Simplified example code is shown below. 
double GetTime()
{
   return ((double)clock())/CLOCKS_PER_SEC;
}

Not stupid, just a misinterpretation of clock(3C).

This is from the clock(3C) manual page:

     The clock() function returns the  amount  of  CPU  time  (in
     microseconds)  used  since  the first call to clock() in the
     calling process.

What you get from clock() is the CPU time used by all threads
in the process since the first call to clock().
To do your timing, you want to get the elapsed time.

This is my modification to testth.cpp (times() returns the number
of ticks (HZ) since some time in the past):

double GetTime()
{
    /* needs <sys/times.h>; HZ comes from <sys/param.h> */
    struct tms dummy;
    return (times(&dummy))/(double)(HZ);
}
Roger Faulkner

 Q331: I'm running out of IPC semaphores under Linux! 


>>>>> "Doug" == Doug Hodson  writes:

    Doug> Now that I have IPC semaphores working under Linux, I am
    Doug> running into another problem. I'm running out of them!!! Do
    Doug> I have to recompile the kernel to make more available?

echo 500 > /proc/sys/kernel/sem

Replace 500 with whatever you want.  (untested)
 Q332: Do I have to abandon the class structure when using threads in C++? 

> in C++ much easier. Currently to use threads in C++ you have to
> virtually abandon the class structure and the type checking and
> resort to low level hacking when using threads.

No, that's incorrect.  This problem only occurs if you do not
understand the appropriate patterns and idioms for effective
multi-threaded programming in C++.  We've been developing and
deploying high-performance and real-time OO applications in C++ for
the past decade, and there's now plenty of "collective wisdom" on how
to do this properly and abstractly using lots of nice high-level C++
features.  I recommend that you check out the following resources for
more information:

All of these resources are based on the threading abstractions
provided with ACE, which is a freely-available, open-source framework
that defines a rich source of components for concurrent and network
programming.  You can learn download at from.

BTW, I'm teaching a short-course at UCLA in a couple weeks that'll 
cover all this material in depth.  You can download the course notes
and learn more about the course itself at

Take care,

 Q333: Questions about pthread_cond_timedwait in linux. 

>i've been programming threads for years and am just moving to linux
>(redhat) and have questions about pthread_cond_timedwait:
>pthread_cond_timedwait takes a struct timespec * and looking at the
>example in the doc, the fields are initialized from the fields
>in gettimeofday plus an offset.  that raises the following questions:
>  what happens if the offset puts the tv_sec over the maximum value
>  for that day?

The tv_sec field is the number of seconds since the epoch. If this overflows,
then the year must be 2037, and you aren't using the latest 256 bit hardware. 

:) :) :)

>  What happens if the clock is changed? (like a dst adjustment)

The Linux implementation of pthread_cond_timedwait converts the absolute time
to a relative wait, which is then the subject of a nanosleep() call (with the
delivery of a signal cutting that sleep short when the condition is signaled).
The nanosleep system call in Linux is based on the system clock tick and not on
calendar time. So the answer is that changing the system time will have no
effect on when the call wakes up. Namely, moving the date forward will not
cause an immediate wakeup.

However, if an unrelated signal (not due to the condition wakeup) interrupts
the pthread_cond_timedwait, it will call gettimeofday() again and recompute
the relative wait. At that time it may wake up due to the date change.

 Q334: Questions about using pthread_cond_timedwait. 

I need your help to clarify something...

Consider the following code:

void foo()
{
    struct timeval tv;
    struct timespec ts;

    if (gettimeofday(&tv, NULL) < 0)
    {
        // error handling stuff here
    }

    // Convert and store to structure that pthread_cond_timedwait wants
    ts.tv_sec = tv.tv_sec;
    ts.tv_nsec = tv.tv_usec * 1000;

    // Add 10 milli-sec (this is how long I want to wait)
    ts.tv_nsec += 10 * 1000 * 1000;

    while (MyPredicate == false)
    {
        status = pthread_cond_timedwait(&condvar, &mutex, &ts);
        // do stuff depending on status
    }

    // Other stuff goes here
}

The problem is that I get lots of ETIMEDOUTs...

Here come the questions:

1) On a normal PC (single processor) running linux,
what is the minimum time I can wait???
I assume 10 milli-sec is ok...

2) On the other end of the scale, what is the max time
I can wait ???  e.g. can I put 300 milli-sec
(i.e. ts.tv_nsec += 300 * 1000 * 1000)???

I am asking because gettimeofday will return the time
in a timeval.  If I just increase the usec and not the
seconds, are there overflow problems ???

if tv.tv_sec is X and tv.tv_usec is 999.999,
if I increase by 100.000 is that going to keep the
seconds the same and go to 099.999, or is
it clever enough to either

increase X to X+1 OR make usec equal to 1.099.999 ???

What I am thinking is that the ETIMEDOUTs might be
because the new time ends up being EARLIER than
the current time.

pthreads conditional waits use an absolute time to specify the timeout
not a relative time. In general you will get the _time now_ and add some
delta to determine the absolute time corresponding to the relative
timeout delta that you wish.

That's the theory. In practice system operators can totally screw you up
by adjusting the clock which changes the machine's notion of the current
absolute time. There isn't an awful lot that you can do about this
problem except...

Certain versions of Unix provide clock_gettime; among those versions of
Unix some will support CLOCK_MONOTONIC, a type of clock that always
advances at the same rate regardless of changes to the machine's
absolute clock. A monotonic clock will be very useful in conjunction
with relative timeouts.

The trouble with this is that while the monotonicity of the clock used
for conditional waits is the default, it seems to be associated with the
condition variable attribute. How then, are you supposed to compute the
timeout value? My guess is clock_gettime with CLOCK_MONOTONIC + delta
should be used but I can't be sure. Also, what happens if the condition
variable attribute is initialized to specify a non-monotonic clock and
we use a monotonic clock to compute the timeout?

If anybody has up to date information on this I'd like to hear about it.

>> >// Add 10 milli-sec (this is how long I want to wait)
>> >ts.tv_nsec += 10 * 1000 * 1000
>> The problem with this statement is that it may potentially increase
>> the value of tv_nsec beyond one billion less one, thus giving
>> rise to an invalid struct timespec.
>Just to clarify, the tv_usec field (although a long) will only go up to
>999.999 (max value gettimeofday will return for usec).

Or,   999,999   for those of us whose locale calls for , as a digit separator


>Since a long goes up to, if I go above 999.999
>this is considered an illegal value...


>And also when I convert to a timespec to use with pthread_cond_timedwait,
>although again the tv_nsec field is a long, I am only allowed to
>go up to 999.999.000 (or 999.999.999 ???)

Yes, up to 999999999.

>If yes, how come the conditional variable returns with an
>ETIMEDOUT and NOT with a EINVAL ???

Because the behavior is simply undefined when you pass a bogus timespec;
undefined means that any response is possible, including ETIMEDOUT.

The Single UNIX Specification does not require the pthread_cond_timedwait
function to detect bad timespec structures.  If the programmer has taken
care that the structures have valid contents, checking them is just
a waste of cycles; and the programmer who gets them wrong will likely
also ignore the return value of pthread_cond_timedwait().
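
The carry that the original code leaves out can be sketched in C as follows.
The function name deadline_after_ms is made up for the example; it takes the
timeval from gettimeofday() and an offset in milliseconds and returns a
timespec whose tv_nsec stays within 0..999999999.

```c
#include <assert.h>
#include <sys/time.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Convert now to a timespec ms milliseconds later, carrying any
 * nanosecond overflow into tv_sec. Assumes ms >= 0. */
static struct timespec deadline_after_ms(struct timeval now, long ms)
{
    struct timespec ts;

    ts.tv_sec  = now.tv_sec + ms / 1000;
    ts.tv_nsec = now.tv_usec * 1000L + (ms % 1000) * 1000000L;
    if (ts.tv_nsec >= NSEC_PER_SEC) {   /* at most one carry is possible */
        ts.tv_sec  += 1;
        ts.tv_nsec -= NSEC_PER_SEC;
    }
    return ts;
}
```

With tv_usec at 999999 and a 10 ms offset, the unnormalized tv_nsec would be
1009999000, which is out of range; after the carry it becomes 9999000 with
tv_sec advanced by one.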

 Q335: What is the relationship between C++ and the POSIX cleanup handlers? 

>Ian Collins wrote:
>> Stefan Seefeld wrote:
>> >
>> > Ian Collins wrote:
>> >
>> > > Cleanup handlers are not a good place to destroy objects; use some
>> > > form of container object that points to the object and deletes it
>> > > in its destructor (a smart pointer) to do this.
>> >
>> > that would indeed be the right thing, if....if thread cancelation
>> > would do proper stack unwinding.
>> >
>> > Stefan
>> I would hope it does - the Solaris one does.
>Only if you use Sun's own compiler and even then - not always.

From what I understand, in this compiler, when a thread is canceled, it acts as
if some special exception was thrown which is never caught and which does not
call unhandled()---in other words, just the unwinding is performed.

What is the relationship between this unwinding and the POSIX cleanup handlers?

Are these properly interleaved with the unwinding, or are they done first?

In other words, what I'm asking is: are the POSIX cleanup handlers somehow
hooked into the destructor mechanism so that if I have this

    Object foo;

    pthread_cleanup_push(A, ...)

        Object bar;

        pthread_cleanup_push(B, ...)





what happens? Ideally, handler B() would get called, then the destructor of
object bar, followed by the handler A, and then the destructor of object foo.

For that to work, the cleanup handlers would have to be hooked into the
unwinding mechanism of the C++ implementation, rather than on a separate stack.

E.g. pthread_cleanup_push(X, Y) would be a macro which expands to something like

    __cleanup_obj __co(X, Y);

where __cleanup_obj is a class object with a destructor which calls the
handler---except that some compiler extensions are used to make this work in C
as well as C++.

I know there are ways to add these kinds of hooks into GCC.  I'm thinking that
it would be nice to add this support to LinuxThreads, and it would also be
nice if it was compatible with some existing good or popular scheme.

LinuxThreads currently keeps its own chain of cleanup nodes, but there is no
reason not to use some GCC extensions to get it to use the destructor mechanism
instead (other than breaking binary compatibility; i.e this would have
to wait until glibc 2.2.x, and support for 2.1.x and 2.0.x cleanup
handling would have to be retained.)
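
Leaving C++ destructors aside, the ordering of the cleanup handlers
themselves is easy to observe in plain C. The sketch below pushes two
handlers and then exits the thread, which unwinds the cleanup stack the same
way a cancellation would. The names order, record, worker and
run_cleanup_demo are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

static char order[8];   /* tags of the handlers that ran, in order */

/* Cleanup handler: append this handler's tag to the record. */
static void record(void *tag)
{
    strncat(order, (const char *)tag, sizeof order - strlen(order) - 1);
}

static void *worker(void *unused)
{
    (void)unused;
    pthread_cleanup_push(record, "A");
    pthread_cleanup_push(record, "B");
    pthread_exit(NULL);        /* runs the pushed handlers, newest first */
    pthread_cleanup_pop(0);    /* never reached; push and pop must pair up */
    pthread_cleanup_pop(0);
    return NULL;
}

/* Run the worker and return the order in which the handlers ran. */
static const char *run_cleanup_demo(void)
{
    pthread_t t;

    order[0] = '\0';
    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return NULL;
    pthread_join(t, NULL);
    return order;
}
```

Handler B runs before handler A. Whether destructors of C++ objects declared
between the two pushes run in between is exactly the implementation-specific
question raised above, and this C sketch says nothing about it.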

 Q336: Does select() work on calls recvfrom() and sendto()? 

>>> I hope that some one can help me see the light.
>>> Assume:
>>> 1. A socket based server.
>>> 2. On a client connection server creates a child-server thread to
>>>    take care of this client.
>>> 3. Child-server implements a retransmission of packet on negative
>>>    ACK (uses alarm signal for time out)
>>Why not use select() with a timeout to block each child-server thread 
>>instead of alarm?
>Does select() work on calls recvfrom() and sendto()? I am under the
>impression it only works on connection oriented sockets accept(),
>read(), write(), recv(), send() etc. Please give a simple sketch of
>the usage?

No, select works on datagram sockets as well. On UNIX-like systems, select also
works on other kinds of objects: regular files (not really useful there),
terminal devices, printer ports, etc.

It works pretty much the same way on datagram sockets as it does on stream
sockets. Read availability means there are one or more datagrams waiting. Write
availability means there is buffer space for datagrams.
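
Since a simple sketch of the usage was asked for, here is one in C. It waits
on a datagram socket with select() and a timeout, then reads with recvfrom().
The function names recv_with_timeout and open_loopback_udp are made up, and
the loopback helper exists only so the example can be exercised without a
second machine.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait up to timeout_ms for a datagram on sock, then read it.
 * Returns the datagram length, 0 on timeout, or -1 on error.
 * (A zero-length datagram is indistinguishable from a timeout here.) */
static ssize_t recv_with_timeout(int sock, void *buf, size_t len,
                                 long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    struct sockaddr_in from;
    socklen_t fromlen = sizeof from;
    int ready;

    FD_ZERO(&readfds);
    FD_SET(sock, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ready = select(sock + 1, &readfds, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;   /* 0: timed out, so retransmit; -1: error */
    return recvfrom(sock, buf, len, 0, (struct sockaddr *)&from, &fromlen);
}

/* Helper for trying this out: a UDP socket bound to an ephemeral
 * loopback port. *addr receives the bound address. Returns -1 on error. */
static int open_loopback_udp(struct sockaddr_in *addr)
{
    socklen_t alen = sizeof *addr;
    int s = socket(AF_INET, SOCK_DGRAM, 0);

    if (s < 0)
        return -1;
    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(s, (struct sockaddr *)addr, alen) < 0 ||
        getsockname(s, (struct sockaddr *)addr, &alen) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

Each child-server thread can make this call on its own socket, so the
retransmission timer needs no alarm signal at all.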
 Q337: libc internal error: _rmutex_unlock: rmutex not held. 

>I'm writing a distributed application using Iona's Orbix ORB and RogueWave's
>ToolsPro class libraries. Since the application is multi-threaded, I'm using
>RogueWave's Threads.h++ also. The application is built over POSIX threads.
>When I run the application I get the following error:
>libc internal error: _rmutex_unlock: rmutex not held.
>...and the application just hangs there. I have tried moving to Solaris
>instead, but of no good use. I tried some sample thread programs but
>they all
>worked fine.
>Is there something I'm missing? A quick reply or advice will be greatly
>appreciated as the deadlines are short and customer not in a good mood

The message you are getting indicates an internal inconsistency
in libc.  The standard I/O implementation (in libc) uses mutex
locks to protect its internal data structures (the stuff behind
the FILE struct).  The message is saying that some thread that is
doing standard I/O attempted to unlock a lock that it does not own.
This, of course, "cannot happen".

It could be caused by the application (overwriting memory)
or it could be an inconsistency between libc and libthread
(caused by linking libc statically but libthread dynamically
[you can do static linking with libc but there is no static
libthread]) or the application could be defining its own
_thr_main() function that subverts the ones in libthread and libc.

To make any progress on the problem, I'd need a test case that
exhibits the problem.  I have to admit that I know nothing about
the other software you are using (Iona's Orbix ORB and RogueWave's
ToolsPro class libraries and RogueWave's Threads.h++) but they
might interfere with the proper working of libthread and libc
(I'm just speculating here, not accusing).  And, of course, there
could be a bug somewhere in libthread/libc.

One thing you could do to discover more about the problem would
be to apply a debugger (adb or dbx or gdb) to the hung process.
Also you can get stack traces of all the threads in the process
by applying the pstack command to the hung process:
    $ /usr/proc/bin/pstack <pid>

What release of Solaris are you running?
Would it be possible for you to send me your program?
Maybe we should just continue this privately via e-mail
rather than on the newsgroup.  Feel free to send me mail.

Roger Faulkner
From: Boris Goldberg  

It may happen due to incorrect order of linking with libc and libthread.

You must link with libthread before libc. That can be ensured by
the -mt flag on the link line.

do ldd on your program: if you see libthread after libc, that's your

 Q338: So how can I check whether the mutex is already owned by the calling thread? 

On Mon, 27 Mar 2000 12:08:54 GMT,  wrote:
>Thanks for all your qualified contributions.
>This is what I've learned:
>If I want to abide by the POSIX standard on UNIX platforms I'd better
>drop the habit of using recursively lockable mutexes. OK, so be it. But
>I'd really love to port a lot of existing C++ code and use it on Linux.

Obviously. Any sort of proscription against recursive mutexes must be weighed
against the pressing need to port a whole lot of code that needs them.

>So how can I implement my Mutex-Lock-class in a way that it checks
>whether the mutex is already owned by the calling thread?

Very easily. The class simply has to store a counter and the ID of the
owning thread. These can be protected by an additional internal mutex.

>It looks to me that if I just put in an additional boolean flag, no
>thread can safely check this flag because it may be changed
>simultaneously by another thread.
>Given a mutex class "NThreads::Mutex" (that used to be recursively
>lockable), the class NThreads::MutexLock has been implemented as you
>can see below (abbreviated). How can I change it to make it work with a
>non-recursive Mutex class?
>namespace NThreads
>class Mutex
>  friend class MutexLock;
>    Mutex();
>   ~Mutex();
>    bool lock( int timeout )
>    {
>      //return true if not timed out
>    }

These kinds of strategies aren't all that useful, except for debugging
assertions.  About all you can do in the case of such a timeout is to log an
error that a deadlock has probably occurred and then abort the application.
It is an internal programming error that is not much different from a
bad pointer dereference, or divide by zero, etc.

>    void unlock()
>    {
>      // unlock system mutex
>    }
>    void lock()
>    {
>      // lock with infinite timeout, no need for return value, but
>    }
>class MutexLock
>  public:
>MutexLock( Mutex& mtx ) : rMtx_( mtx )
>  rMtx_.lock();

I see, this is just one of those safe lock classes whose destructor
cleans up.  

It is the Mutex class that should be made recursive, not the safe lock
wrapper, as in:


    class Mutex {
        pthread_mutex_t actual_mutex_;
        pthread_mutex_t local_mutex_;
        pthread_t owner_;
        int recursion_count_;
    public:
        Mutex();
        ~Mutex();
        void Lock();
        void Unlock();
        void Wait(Condition &);
    };

    #ifdef USE_OS2_THREADS

    // definition of Mutex class for OS/2

    #endif


The methods definitions for the POSIX variant would look something like this:

    Mutex::Mutex()
    : recursion_count_(0)
    {
        pthread_mutex_init(&actual_mutex_, NULL);
        pthread_mutex_init(&local_mutex_, NULL);
        // leave owner_ uninitialized
    }

    Mutex::~Mutex()
    {
        assert (recursion_count_ == 0);
        int result = pthread_mutex_destroy(&actual_mutex_);
        assert (result == 0);
        result = pthread_mutex_destroy(&local_mutex_);
        assert (result == 0);
    }

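    void Mutex::Lock()
    {
        pthread_mutex_lock(&local_mutex_);

        if (recursion_count_ > 0 && pthread_equal(pthread_self(), owner_)) {
            // already ours: just count the extra lock
            assert (recursion_count_ < INT_MAX); // from <limits.h>
            recursion_count_++;
        } else {
            // drop the local mutex before blocking on the actual one
            pthread_mutex_unlock(&local_mutex_);
            pthread_mutex_lock(&actual_mutex_);
            pthread_mutex_lock(&local_mutex_);
            assert (recursion_count_ == 0);
            recursion_count_ = 1;
            owner_ = pthread_self();
        }

        pthread_mutex_unlock(&local_mutex_);
    }

    void Mutex::Unlock()
    {
        pthread_mutex_lock(&local_mutex_);
        assert (pthread_equal(pthread_self(), owner_));
        assert (recursion_count_ > 0);

        if (--recursion_count_ == 0)
            pthread_mutex_unlock(&actual_mutex_);

        pthread_mutex_unlock(&local_mutex_);
    }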

Or something along these lines. I haven't tested this code.   I did make sure
that wherever both locks are  held, they were acquired in the same order  to
prevent the possibility of deadlock. It's more or less obvious that you must
never try to acquire the actual mutex while holding the local one.

A condition wait requires special trickery:

    void Mutex::Wait(Condition &cond)
    {
        pthread_mutex_lock(&local_mutex_);
        assert (pthread_equal(pthread_self(), owner_));
        assert (recursion_count_ > 0);
        int saved_count = recursion_count_;
        recursion_count_ = 0;
        pthread_mutex_unlock(&local_mutex_);

        pthread_cond_wait(&cond.cond_, &actual_mutex_);

        pthread_mutex_lock(&local_mutex_);
        assert (recursion_count_ == 0);
        recursion_count_ = saved_count;
        owner_ = pthread_self();
        pthread_mutex_unlock(&local_mutex_);
    }


I hope you can massage this into something that works. If I messed up, flames
will ensue.


As a followup to my own posting, I want to make a remark about this:

>A condition wait requires special trickery:
>    void Mutex::Wait(Condition &cond)
>    {
>    pthread_mutex_lock(&local_mutex_);
>    assert (pthread_equal(pthread_self(), owner_));
>    assert (recursion_count_ > 0);
>    int saved_count = recursion_count_;
>    recursion_count_ = 0;
>    pthread_mutex_unlock(&local_mutex_);
>    pthread_cond_wait(&cond.cond_, &actual_mutex_);
>    pthread_mutex_lock(&local_mutex_);
>    assert (recursion_count_ == 0);
>    recursion_count_ = saved_count;
>    owner_ = pthread_self();
>    pthread_mutex_unlock(&local_mutex_);
>    }

Firstly, there is no condition checking while loop around the pthread_cond_wait
because it is assumed that the caller of Mutex::Wait() will implement the
re-test. The intent here is only to wrap the call. Thanks to John Hickin
for raising this in an e-mail.

Secondly, because pthread_cond_wait is a cancellation point, it is necessary
to deal with the possibility that the waiting thread may be canceled. If that
happens, the actual_mutex_ will be locked by the canceled thread, but the state
of the owner_ and recursion_count_ will not be properly recovered.   Thus
the user of the class has no recovery means.

This requires a messy change, involving an extern "C" redirection function
which calls a method that does mutex reacquire wrapup.  There is a need
to communicate the saved recursion count to the cleanup handler, as well
as the identity of the mutex object, using a single void * parameter, so 
a context structure is introduced:

    struct MutexContext {
        Mutex *mtx_;
        int saved_count_;
        MutexContext(Mutex *m, int c) : mtx_(m), saved_count_(c) { }
    };

The cleanup handler is then written, which takes the context and
calls the object, passing it the saved count:

    extern "C" void Mutex_Cancel_Handler(void *arg)
    {
        MutexContext *ctx = (MutexContext *) arg;
        ctx->mtx_->CancelHandler(ctx->saved_count_);
    }


The code that is executed at the end of the old version of Mutex::Wait
is moved into a separate method. This assumes that actual_mutex_ is
locked on entry, which is the case if the pthread_cond_wait is canceled.

    void Mutex::CancelHandler(int saved_count)
    {
        // actual_mutex_ is locked at this point
        pthread_mutex_lock(&local_mutex_);
        assert (recursion_count_ == 0);
        recursion_count_ = saved_count;
        owner_ = pthread_self();
        pthread_mutex_unlock(&local_mutex_);
    }


Finally, Wait() is revised to look like this:

    void Mutex::Wait(Condition &cond)
    {
        pthread_mutex_lock(&local_mutex_);
        assert (pthread_equal(pthread_self(), owner_));
        assert (recursion_count_ > 0);

        MutexContext context(this, recursion_count_);
        recursion_count_ = 0;
        pthread_mutex_unlock(&local_mutex_);

        // Ensure cleanup takes place if pthread_cond_wait is canceled
        // as well as if it returns normally.

        pthread_cleanup_push(Mutex_Cancel_Handler, &context);

        pthread_cond_wait(&cond.cond_, &actual_mutex_);

        // pop and run the handler on the normal return path too
        pthread_cleanup_pop(1);
    }

 Q339: I expected SIGPIPE to be a synchronous signal. 

> >Using Solaris threads under Solaris 5.7.
> >
> >I would have expected SIGPIPE to be a synchronous signal when it
> >occurs as a result of a failed write or send on a socket that has
> >been disconnected.  Looking through past articles in Deja seemed to
> >confirm this.
> >
> >However, I thought I would undertake the radical idea of actually
> >testing it.  In my tests it looks as if it's an asynchronous signal.
> Yes, it is an asynchronous signal in Solaris.
> This is not a bug in Solaris; it is intentional.
> The purpose of SIGPIPE is to kill a process that is writing
> to a pipe but that has made no provision for the pipe being
> closed at the other end.

On HP-UX, SIGPIPE is a synchronous signal and one shouldn't even try
'sigwait'-ing for it. Sounds logical too. Any reason why it's different
on Solaris7? The above paragraph didn't seem like a very convincing


-- Rajiv Shukla

> If you want to deal with a pipe or socket being closed, then either
> mask SIGPIPE or catch it with a do-nothing signal handler and test
> the errno that comes with a failed write() or a send() operation.
> If it is EPIPE, then that corresponds to SIGPIPE, and you have
> gotten the answer synchronously.

In Digital (Tru64) Unix, we made SIGPIPE a synchronous signal,
and I still believe that's the right disposition for it. Uncaught,
it will terminate the process. Caught, it allows corrective
action to occur in the thread that cares about the broken
connection. Useful? Barely. More accurate? Much.

That aside, the best thing to do with SIGPIPE is to
set it  to SIG_IGN and pick up the EPIPE error return
on the write() call. Masking/catching the signal
isn't the right thing to do if you don't care about
the signal, and you most likely don't.
It's cheapest to ignore it and move on.

 Q340: I have a problem between select() and pthread... 

>Hi! everyone..
>I have a problem that is the syncronization between select() and pthread...
>That is as follows...
>the main thread is blocking in select() func.
>and at the same time, the other thread is closed a socket descriptor in
>fd_set..  this work causes a EBADF error in select().
>so, I wrote in main thread:
>    if ((nready = select(nfds, readfds, writefds, exeptionfds)) == -1) {
>        if (errno == EBADF) goto SELECT_LABEL;
>        perror("select()");
>    }
>But that is not solved...
>after goto syntax, I got infinitely EBADF error in select().
>How do I for solving that???
>after select(), close a socket descriptor??
>or only *ONE* thread controls socket descriptors??
>I use the POSIX thread on Solaris 7..

You have to figure out in the main thread which file descriptor
was closed by the other thread and delete its bit from the fd_sets
before reissuing the select().  The select() interface itself
will not help you to determine this.

In Solaris, select() is implemented on top of poll(2).  If you
use the poll() interface directly, then a closed file descriptor
will show up in the array of pollfd's with revents containing the
POLLNVAL bit.  Thus the poll() interface will tell you which file
descriptor has been closed and you can stop polling on it.
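
A small sketch of that check in C. The function name drop_closed_fds() is
hypothetical; poll(), struct pollfd and POLLNVAL are the standard
interface. It relies on the POSIX rule that poll() ignores entries whose
fd is negative.

```c
#include <poll.h>

/* Poll once with a zero timeout and retire every entry that poll()
 * flags with POLLNVAL (descriptor not open). Setting fd to -1 makes
 * poll() skip that slot on later calls.
 * Returns the number of entries retired, or -1 if poll() failed. */
int drop_closed_fds(struct pollfd *fds, nfds_t nfds)
{
    int dropped = 0;

    if (poll(fds, nfds, 0) < 0)
        return -1;
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLNVAL) {
            fds[i].fd = -1;
            dropped++;
        }
    }
    return dropped;
}
```

Note that if another thread opens a new file between the close() and the
poll(), the descriptor number may already have been reused, in which case
poll() has nothing to report. Having one thread own the descriptors
avoids that race.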

Roger Faulkner
 Q341: Mac has Posix threading support. 

> I'm looking at a cross-platform strategy for our application.
> Threads was one issue which came up, and Posix threads seems like a good
> prospect.

> It is supported under Windows (
> and Unix, but I don't think Mac has Posix threading support.

I'm maintaining a free (nonpreemptive) pthreads library, available at*


Matthias Neeracher
   "I really don't want the SNMP agent controlling my toilet to tell
    someone when/where I'm using it." -- Sean Graham

 Q342: Just a few questions on Read/Write for linux. 

>Just a few questions on Read/Write
>lock stuff since man pages don't exist (yet)
>for linux.
>1) Where can I find documentation, sample code,
>or anything else that will help (eg. URLs etc.)

These locks are based on The Single Unix Specification.

>2) Can I treat the rwlock stuff same as a mutex
>in terms of init/destroy/lock/unlock/trylock ???
>I had a look at pthread.h and all the calls look
>the same... (Is it basically a mutex that allows
>multiple locks for readers?)

Something like that.

>3) What's the story with overhead if you start using
>r/w locks?

In Linux, there is somewhat more overhead compared to mutexes because the locks
are more complex.  The structures and the operations on them are larger.

Also, as of glibc-2.1.3, each thread maintains a linked list of nodes which
point to the read locks that it owns. These nodes are malloced the first time
they are needed and then kept in a thread-specific free list for faster
recycling. The nodes of these lists are destroyed when the thread terminates.

Each time a read lock is acquired, a linear search of this list is made to see
whether the thread already owns the read lock. In that case, the reference
count in the corresponding list node is bumped up and the thread can proceed.

(The lists are actually stacks, so that a recently acquired lock is at the
front of the list.)

This algorithm is in place in order to implement writer-preference for locks
having the default attribute, while meeting the subtleties of the spec with
respect to recursive read locks.

The prior versions of the library purported to implement writer preference,
but due to a bug it was actually reader preference.

>4) If you have many readers could that mean that the
>writer will never get a chance to lock, or are the
>locks first-come-first-serve ???  I'm thinking

Writer preference, subject to the requirements of The Single UNIX Specification
which says that a thread may recursively acquire a read lock unconditionally,
even if writers are waiting.

In glibc-2.1.3, LinuxThreads supports the non-portable attribute


which gives you more efficient writer preference locks, at the cost of
not supporting recursive read locks. These kinds of locks do not participate
in the aforementioned linked lists. If a writer is waiting on a lock,
and a thread which already has a read lock tries to acquire another one,
it simply deadlocks.

>(I know it's probably dim but...) if a reader can
>always lock, there might be a case where there is
>always at least one reader on the mutex.  What
>happens if a writer comes along and rwlocks ???

If you read the spec, you will note that this is implementation defined.  An
implementation may, but is not required to, support writer preference.
The Linux one does (now).
 Q343: The man pages for ioctl(), read(), etc. do not mention MT-safety. 

>But so far I do have an implementation in mind, and
>I have learned enough to check if any library
>functions I will call are MT-safe.  And so I 
>started checking man pages, and to my horror
>found that the man pages for such indispensable 
>familiars as ioctl(), read(), and write() do 
>not mention this issue.
>(messy complication: I'm looking at man pages on
>SunOS, but the project will be on Linux.  I don't
>have a Linux account yet.  Bother, said Pooh)

On Solaris, everything in section 2 of the manual pages
(that is, system calls, not general C library functions)
is thread-safe unless explicitly stated otherwise.
Sorry that the man pages are not more clear on this point.

I can't speak for Linux.

Roger Faulkner
 Q344: Status of TSD after fork()? 

>OK, here's an ugly scenario:
>Imagine that you're some thread running along, you've got some reasonable
>amount of stuff stashed away in pthread_{get,set}specific[1].
>Now you call fork().
>Those who have read the POSIX standard know that "If a multithreaded
>process calls fork(), the new process shall contain a replica of the
>calling thread and its address space...  Consequently ... the child
>process may only execute async-signal safe operations until ... one of
>the exec functions is called."
>So, the process is using pthread_*, but it hasn't called pthread_create(),
>so it doesn't really count as a multithreaded process, right?  (Well, I'm
>using that as an assumption at the instant.)

I can't speak for other implementations, but with Solaris pthreads,
the child of fork() is a fully-fledged multithreaded process that
contains only one thread, the one that performed the fork().
It can continue doing multithreaded stuff like create more threads.
Of course, there are the standard caveats that apply to fork(),
like the process must have dealt with its own locks by appropriate
use of pthread_atfork(3THR) or some other mechanism.

>Now for the hard part:  Does pthread_self() return the same value for the
>thread in the child process as it did in the parent for the thread that
>called fork()?  This has implications on thread-specific data, in that
>the definition of "the life of the calling thread" (POSIX 1003.1:1996
>section, lines 15-16) would be associated (in my mind) to the
>result of pthread_self().

On Solaris, in the child process, pthread_self() returns 1
(the thread-ID of the main thread) regardless of the value of
the thread-ID of the corresponding thread in the parent process.

>So what I'm looking for is opinions on:
>  A)  Should thread-specific data be replicated, or
>  B)  Should all pthread_getspecific keys now return NULL because it's a
>      new thread in a different process?
>Ugh.  Implementor opinions welcome, as well as users.

On Solaris, the thread-specific data of the forking thread
in the child process is replicated.
Should it?  I think so, but you must ask the standards bodies.
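
If you want to know what your own platform does, a small probe along
these lines will tell you. This is only a sketch: the function name
tsd_survives_fork() is invented, and strictly speaking the child calls
pthread_getspecific(), which is not on the list of async-signal-safe
functions, so treat the result as an observation, not a guarantee.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the forking thread's TSD value is visible in the child,
 * 0 if the child sees something else, -1 on error. */
int tsd_survives_fork(void)
{
    static int marker = 42;
    pthread_key_t key;
    int status;
    pid_t pid;

    if (pthread_key_create(&key, NULL) != 0)
        return -1;
    if (pthread_setspecific(key, &marker) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)       /* child: report through the exit status */
        _exit(pthread_getspecific(key) == &marker ? 0 : 1);

    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    pthread_key_delete(key);
    return WEXITSTATUS(status) == 0;
}
```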

>[1]I like to think of pthread_{get,set}specific as (conceptually) indexing
>   a two-dimensional array that is addressed on column by the result of
>   pthread_self(), and the row by pthread_key_create()'s return.

You should stop thinking this way.  The thread-ID is an opaque object;
it is not to be interpreted as an index into anything.  You should
think of pthread_{get,set}specific as being indexed by the thread
(its register set if you wish), not by its thread-ID.

Roger Faulkner

 Q345: Static member function vs. extern "C" global functions? 

Do I have to? Oh well here goes....

This still uses a nasty cast.  It is also not a good idea to
start a thread in a constructor for the simple reason that
the thread may run _before_ the object is constructed - this
is even more likely if this is a base class - I know, I've been
there and done that.

Use an extern "C" friend as in the following complete example:

#include <pthread.h>
#include <iostream>

extern "C" void* startIt( void* );

class Fred
{
  pthread_t tid;

  friend void* startIt( void* );

  void* runMe() throw() { std::cout << "Done" << std::endl; return NULL; }

public:

  int start() throw() { return pthread_create( &tid, NULL, startIt, this ); }

  pthread_t id() const throw() { return tid; }
};

void* startIt( void* p )
{
  Fred* pF = static_cast<Fred*>(p);

  return pF->runMe();
}

int main()
{
  Fred f;
  int  s;

  if( (s = f.start()) )
    return s;

  std::cout << "Started" << std::endl;

  void* status;

  pthread_join( f.id(), &status );

  pthread_exit( 0 );
}

Warwick Molloy wrote:

> Hi,
> What's the difference between a static member function and extern "C" global
> functions?
>     name mangling
> All C++ code is linked with a regular C linker.  That's why you need name
> mangling to allow such things as overloading etc.
> If you want to get an extern "C" pointer to a static member function, do this
> extern "C" {
>     typedef void* (*extern_c_thrd_ptr)( void *);
> }
> class floppybunny {
>     void worker_func( void );
>     static void* foo_func( void *p)
>     {
>         floppybunny* ptr =(floppybunny*)p;
>         ptr -> worker_func();  // convert explicit this pointer to implied this pointer.
>         return NULL;
>     }
>     floppybunny( void )
>     {
>         pthread_create( &tid, NULL, (extern_c_thrd_ptr)foo_func, (void*)this);
>     }
> };
> That makes the thread function nicely associated with your class and best of
> all...
>                         IT WORKS.
> Regards
> Warwick.  (remove the spam to reply)
> Ian Collins wrote:
> > Timmy Whelan wrote:
> >
> > > You can also make the member function static:
> > >
> >
> > For the Nth time, static members are _NOT_ the same as extern "C"
> > functions.
> > Their linkage may be different.  Use a friend defined as extern "C" or
> > make the real start member public.
> >
> >     Ian
> >
> > >
> > > class foo
> > > {
> > > public:
> > >         static void *startThread(void *param);
> > >
> > >         void *actualThreadFunc( );
> > > };
> > >
> > > void *
> > > foo::startThread( void *param )
> > > {
> > >         foo *f = (foo *)param;
> > >         return f->actualThreadFunc( );
> > > }
> > >
> > > If you need to pass in parameters, use member variables.
> > >
> > > "Mr. Oogie Boogie" wrote:
> > > >
> > > > Howdy,
> > > >
> > > > How does one make a C++ class member function as the starting function
> > > > for a thread?
> > > >
> > > > I keep getting the following warning and have been unable to find any
> > > > documentation/source to get rid of it.
> > > >
> > > > In method `slm_th::slm_th(char * = "/dev/tap0")':
> > > > warning: converting from `void * (slm_th::*)(void *)' to
> > > > `void * (
> > > > *)(void *)'
> > > >
> > > > This is the class:
> > > >
> > > > class slm_th {
> > > >   public:
> > > >     void *Read(void *arg);
> > > > }
> > > >
> > > > void *slm_th::Read(void *arg) {
> > > > ...
> > > > }
> > > >
> > > > Thanks,
> > > >
> > > > -Ralph
> > >

One minor point: Calling convention is, in the general case, a
compiler-specific thing and not an operating-system-specific thing.  Different
compilers for the same operating system can easily have calling conventions
for functions with "C" or "C++" linkages that are incompatible.  

(Some platforms/operating systems have an ABI standard that defines the C
language calling conventions for the platform and operating system.  This is
not universally the case, however.  It is especially not the case for x86
platforms running non-Unix operating systems.)

 Q346: Can i kill a thread from the main thread that created it? 

>can i kill a thread from the main thread that created it?
>under Windows, i only found the CWinThread::ExitInstance () method,

You can kill a thread with TerminateThread().

Using TerminateThread is really, really, really, really, not recommended.  If
the thread owns a critical section, the critical section is not released and
will be inaccessible forever.  If other threads then try to enter it, they
will hang forever.  Also, the stack allocated to the thread is not released
and various other bad things can happen.

If you think you need to use TerminateThread it's a good sign that your
threading design is broken.  You should be telling the thread to exit itself.

Figuring out how to call TerminateThread using MFC'isms such as CWinThread is
left as an exercise to the reader.


> Also, the stack allocated to the thread is not released
> and various other bad things can happen.

Yes, it's that bad... A while ago I started writing an app that used
TerminateThread() - it leaked about a megabyte per second under load =).

> If you think you need to use TerminateThread it's a
> good sign that your threading design is broken.
> You should be telling the thread to exit itself.

I don't agree 100%; I've encountered several situations where it would be
very handy to kill a thread (think about a long-running computation whose
results you aren't interested in anymore). Pthreads has a nice design - a
thread can explicitly say when it may be cancelled... (I often end up coding
a solution like yours - a message-passing mechanism to tell threads to die -
but that always seems to add more complexity than it's worth...)
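
For comparison, here is a minimal sketch of the pthreads approach
mentioned above: the worker marks the points where it is safe to die with
pthread_testcancel(), and the controlling thread uses pthread_cancel().
The function names compute() and run_and_cancel() are made up for the
example.

```c
#include <pthread.h>

/* Long-running computation that names its own cancellation points. */
static void *compute(void *arg)
{
    volatile unsigned long *progress = arg;

    for (;;) {
        ++*progress;            /* stand-in for one unit of real work */
        pthread_testcancel();   /* a safe place to be cancelled */
    }
    return NULL;                /* not reached */
}

/* Start the computation, then cancel it and wait for it to finish.
 * Returns 1 if the thread ended by cancellation, 0 if it ended some
 * other way, -1 on error. */
int run_and_cancel(void)
{
    pthread_t tid;
    unsigned long progress = 0;
    void *result = NULL;

    if (pthread_create(&tid, NULL, compute, &progress) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED;
}
```

A thread that holds locks or other resources at a cancellation point
would also register cleanup handlers with pthread_cleanup_push(), which
is exactly what TerminateThread() gives you no chance to do.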


 Q347: What does /proc expose vis-a-vis LWPs? 

>> Thanks for the answer! I would really like to know how to see which
>> thread is running on which processor to see if my multithreaded
>> app (which uses the pipeline model) is really using the 6 available CPUs on my
>> platform. Is there such a beast?
>/proc on Solaris doesn't expose this information, so I doubt that any
>non-Sun utility can show it. I don't know if Sun has something (bundled or
>unbundled). As for migrating LWPs from one processor to another - it's
>perfectly normal on Solaris.

You are wrong.  /proc does provide this information, in the lwpsinfo
struct contained in /proc/<pid>/lwp/<lwpid>/lwpsinfo for each lwp in
the process:

    processorid_t pr_onpro;         /* processor which last ran this lwp */

It is displayed with the prstat utility.  Use the command 'prstat -L'
to see each lwp in each process.

Roger Faulkner
 Q348: What mechanism can be used to take a record lock on a file? 

> What mechanism can be used to take a record lock on a file (using the
> fcntl() call) in a POSIX multithreaded application?  It seems to me that
> these locks are process-based, and therefore multiple threads within the same
> process are treated as the same thing.
> Any pointers would be appreciated.

This has been discussed several times before. Yes, fcntl() locks are
process-based, for a number of reasons historical and pragmatic. Some people
have successfully built a two-level file locking strategy that uses mutexes
between threads within a process and fcntl() between processes. Essentially,
you reference count the fcntl() lock(s) so that the process holds an fcntl()
lock whenever any thread within the process has an area locked; if more than
one thread within the process is interested in the same file area, they
synchronize among themselves using a mutex. I believe that sample code may
have been posted. Search the newsgroup archives, if you can find a good server.
(I don't know what the state of Deja is now; it was always a good one, and may
be again if the transfer of control has been straightened out.)
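
This is not the code that was posted, just a minimal sketch of the
reference-counting idea, reduced to a shared (read) lock over a whole
file. All names (struct shared_file_lock, shared_lock(), shared_unlock())
are hypothetical.

```c
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical two-level lock: a mutex between threads, fcntl()
 * between processes. */
struct shared_file_lock {
    int fd;                 /* file to lock, open for reading */
    pthread_mutex_t mutex;  /* protects 'readers' */
    int readers;            /* threads in this process holding the lock */
};

/* Set or clear a whole-file fcntl() lock, waiting if necessary. */
static int set_fcntl_lock(int fd, short type)
{
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "to end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

/* The first thread in takes the process-wide fcntl() lock. */
int shared_lock(struct shared_file_lock *l)
{
    int rc = 0;
    pthread_mutex_lock(&l->mutex);
    if (l->readers == 0)
        rc = set_fcntl_lock(l->fd, F_RDLCK);
    if (rc == 0)
        l->readers++;
    pthread_mutex_unlock(&l->mutex);
    return rc;
}

/* The last thread out drops the process-wide fcntl() lock. */
int shared_unlock(struct shared_file_lock *l)
{
    int rc = 0;
    pthread_mutex_lock(&l->mutex);
    if (l->readers > 0 && --l->readers == 0)
        rc = set_fcntl_lock(l->fd, F_UNLCK);
    pthread_mutex_unlock(&l->mutex);
    return rc;
}
```

Exclusive locks and per-region locks follow the same pattern, with one
such structure (or one entry in a table) per locked area.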

 Q349: Implementation of a Timed Mutex in C++ 

Thanks to everybody who put some thought into my program.
It works! (Stable as a rock.)
Here it is. If it is useful to somebody, use it:


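#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

typedef struct
{
  pthread_mutex_t   mutex;
  pthread_cond_t    cond;
  pthread_t     owner;
  int           value;      /* 1 = free, 0 = held */
} Mutex_t;

void*       main_mutex;     /* Pointer to my main_Mutex */
int         mutexTestCnt = 0;   /* Counter */
int         started = 0;    /* Predicate for startcond */
pthread_cond_t  startcond = PTHREAD_COND_INITIALIZER;   /* Cond to start threads */
pthread_mutex_t startmutex = PTHREAD_MUTEX_INITIALIZER;

int MutexCreate(void* *id)
{
  Mutex_t *Mutexvar  = (Mutex_t *)malloc(sizeof(Mutex_t));
  if(Mutexvar == NULL) {return -1;}
  pthread_mutex_init(&Mutexvar->mutex, NULL);
  pthread_cond_init(&Mutexvar->cond, NULL);
  Mutexvar->value = 1;
  *id = (void*)Mutexvar;
  return 0;
}

int MutexDelete(void* id)
{
  Mutex_t *mutex =(Mutex_t *)id;
  if (mutex->value!=1) {return -1; }    /* Still held by somebody */
  pthread_mutex_destroy(&mutex->mutex);
  pthread_cond_destroy(&mutex->cond);
  free(mutex);
  return 0;
}

int MutexObtain(void* id, int timeoutrel)   /* timeoutrel in seconds */
{
  Mutex_t *mutex =(Mutex_t *)id;
  int status=0;
  struct timeval now;
  struct timespec timeout;
  if(mutex == NULL)  return -1;
  pthread_mutex_lock(&mutex->mutex);
  if ((mutex->value<0)||(mutex->value>1))
  {
    pthread_mutex_unlock(&mutex->mutex);
    return -2;
  }
  if (mutex->value==0)
  {
    gettimeofday(&now, NULL);
    timeout.tv_sec = now.tv_sec + timeoutrel;
    timeout.tv_nsec = now.tv_usec * 1000;
    /* Wait until the mutex is free, the timeout expires or an error occurs */
    while ((mutex->value==0)&&(status==0))
      status = pthread_cond_timedwait(&mutex->cond, &mutex->mutex, &timeout);
    if (mutex->value==0)
    {
      pthread_mutex_unlock(&mutex->mutex);
      return -3;
    }
  }
  mutex->value = 0;
  mutex->owner = pthread_self();
  pthread_mutex_unlock(&mutex->mutex);
  return 0;
}

int MutexRelease(void* id)
{
  Mutex_t *mutex =(Mutex_t *)id;
  int status=0;
  pthread_mutex_lock(&mutex->mutex);
  if ((mutex->value<0)||(mutex->value>1))
    status = -1;
  else if (pthread_equal(mutex->owner,pthread_self())==0)
    status = -2;
  else
  {
    mutex->value = 1;
    pthread_cond_signal(&mutex->cond);
  }
  pthread_mutex_unlock(&mutex->mutex);
  return status;
}

void *testfunc(void * arg)
{
  int i, n;
  pthread_mutex_lock(&startmutex);   /* Start all threads at the same time */
  while (!started)
    pthread_cond_wait(&startcond, &startmutex);
  pthread_mutex_unlock(&startmutex);
  printf("Thread %s started.\n",(char *)arg);

  for (n = 0; n < 100; n++)
  {
      if(MutexObtain(main_mutex, 1000) != 0)
        printf("Thread %s: MutexObtain() FAILED\n", (char *)arg);
      /* Modify protected variables */
      i = ++mutexTestCnt;
      /* Release CPU */
      sched_yield();
      /* And check if somebody else could get into the critical section */
      if (i != mutexTestCnt)
        printf("Thread %s: Mutex violated by %i\n", (char *)arg, mutexTestCnt);
      /* Leave critical section */
      if(MutexRelease(main_mutex) != 0)
        printf("Thread %s: MutexRelease() FAILED\n", (char *)arg);
      /* Allow rescheduling (another thread can enter the critical section) */
      sched_yield();
  }
  printf("Thread %s ready\n",(char *)arg);
  return NULL;
}

int main(void)
{
    pthread_t t_a,t_b,t_c;
    int ret;
    if(MutexCreate(&main_mutex)!=0)  return -1;

    ret=pthread_create(&t_a,NULL,testfunc,(void *)"a");
    if(ret!=0) { fprintf(stderr,"Can't create thread a\n"); return 1; }
    ret=pthread_create(&t_b,NULL,testfunc,(void *)"b");
    if(ret!=0) { fprintf(stderr,"Can't create thread b\n"); return 1; }
    ret=pthread_create(&t_c,NULL,testfunc,(void *)"c");
    if(ret!=0) { fprintf(stderr,"Can't create thread c\n"); return 1; }
    printf("Press key to start\n"); getc(stdin);
    pthread_mutex_lock(&startmutex);
    started = 1;
    pthread_cond_broadcast(&startcond);
    pthread_mutex_unlock(&startmutex);
    pthread_join(t_a,NULL);
    pthread_join(t_b,NULL);
    pthread_join(t_c,NULL);
    MutexDelete(main_mutex);
    printf("All done\n");
    return 0;
}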

 Q350: Effects that gradual underflow traps have on scaling. 

Dave Butenhof  writes:
> Martin Shepherd wrote:
> > By the way, neither in your book, nor in the other POSIX threads books
> > that I have, is there any mention of the devastating effects that
> > gradual underflow traps can have on scaling. I'm not even sure why
> > this occurs, and would like to understand it better. My guess is
> > that if multiple threads are suffering underflows at the same time, as
> > was the case in my program, there is contention for a single underflow
> > handler in the kernel. Is this correct?
> Perhaps, in the HP-UX kernel. I don't know. It would depend on whether the
> underflow is handled by hardware or software; and, if in software, precisely
> how and where. If you're reporting underflow traps to the APPLICATION, that's
> certainly a performance sink if you're underflowing much; signal delivery is
> expensive, and certainly doesn't help your application's scaling.

My experience on a number of systems is that gradual underflow is
usually performed in software, not in hardware, and this includes
expensive workstations and supercomputers traditionally used for
number crunching. For example, Sun SPARCs, HPs, DEC Alphas, etc.
all do this. If this weren't bad enough, there is no standard way to
disable it. In Solaris one calls nonstandard_arithmetic(), on HP one
calls fpsetflushtozero(), and I don't know what one does on other
systems.
Whether gradual-underflow traps are delivered as signals all the way
to the application, or whether the kernel handles them I don't know,
but regardless, they can increase the run time of any program by large
factors, and seriously suppress scaling in parallel programs, so in
general it is really important to either avoid them or disable them.
In particular, the ability to reliably disable them process-wide, just
as a diagnostic aid, is indispensable, because vendors rarely provide
tools to monitor them.
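
Lacking such tools, one portable (if crude) diagnostic is to scan your
own data for denormalized values, since those are what gradual underflow
produces. A sketch in plain C follows; the function names are invented,
and it only inspects stored results, it does not count the traps
themselves.

```c
#include <float.h>
#include <stddef.h>

/* Returns 1 if x is a subnormal (denormalized) double, i.e. nonzero
 * but smaller in magnitude than the smallest normal number DBL_MIN. */
int is_subnormal(double x)
{
    double mag = x < 0.0 ? -x : x;
    return mag > 0.0 && mag < DBL_MIN;
}

/* Hypothetical diagnostic: count the subnormal entries in an array. */
size_t count_subnormals(const double *v, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (is_subnormal(v[i]))
            count++;
    return count;
}
```

If a check like this finds many such values in the inner loops of a
computation, that is a hint that underflow handling may be eating the
run time.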

> This is getting extremely machine-dependent, and therefore it's hard to say
> much about it in a general book. Furthermore, even on platforms where it's a
> problem, it's only going to affect the (relatively, and maybe absolutely)
> small number of FP-intensive applications that do a lot of
> underflowing.

While it is true that most FP-intensive applications shouldn't
underflow, and that good programmers will do their utmost to avoid
performing any calculations that might underflow, everybody makes
mistakes. In my case, once I worked out how to globally enable sudden
underflow across all of my threads, my program speeded up by a factor
of 4. This then led me to a bug in the test that was supposed to have
prevented the underflowing calculations in the first place, and the
end result was a factor of 15 speedup.  I agree that this is somewhat
specialized and very machine specific, but so are the discussions of
memory barriers, and memory caching models that one finds in good
books on parallel programming with threads...

 Q351: LinuxThreads woes on SIGSEGV and no core dump. 

> is there something inherently wrong with my system or is this all
> "normal" behaviour? i'm using the pthreads shipped with glibc 2.1.2 -
> they might be a bit old, but i don't want to get into a big fight with
> my sysadmin.

I have experienced all sorts of strange errors similar to yours. The
workaround is to include this in your program:

  void sig_panic(int signo) {
  struct sigaction act;
  act.sa_handler = sig_panic;
  sigaction(SIGSEGV, &act, NULL);
  sigaction(SIGBUS, &act, NULL);
  sigaction(SIGTRAP, &act, NULL);
  sigaction(SIGFPE, &act, NULL);

This produces reliable core dumps and you can do a post-mortem analysis.

 Q352: On timer resolution in UNIX. 

Under most Unix flavors, user processes (usually) get a timer
resolution of 10ms. This is the interval at which the kernel
dispatcher is interrupted by the timer to handle ``asynchronous''
events. When timer-related events (firing, handling, etc.) are
bound to the dispatcher `tick', it is not possible to get
finer resolution than that.

But, there are several exceptions to the above, especially
on machines equipped with ``cycle counters''.

IRIX 6.5.x allows privileged processes to call nanosleep()
with sub-millisecond resolution. The actual resolution is
only restricted by the overhead to dispatch a kernel thread
to handle the event. I have seen reaction times in the range
of 300-400 micro-seconds on 200 MHz 2 CPU systems. The same
is true for timers (see timer_create()) based on the
CLOCK_SGI_FAST timer, which is IRIX specific, and thus not
portable.
Solaris 8 finally managed to be able to disassociate the
handling of timer events from the scheduler tick. One can
utilize the high-resolution cycle counter by specifying
CLOCK_HIGHRES for clock-id in the timer_create(3RT) call. I
have seen sub-millisecond resolutions under Solaris
8. Unfortunately nanosleep(3RT) is still bound to the 10ms
dispatcher tick. For earlier Solaris releases one could change the
HZ (or something like that) variable to, say, 1000, in order
to obtain 1 millisecond dispatcher tick duration. Some
people claimed that this can be tuned to 10000, but then the
system could spend most of its time serving the timer
interrupts.
HP-UX 11.00 supports the 10ms resolution with nanosleep()
and timer_create(). One needs to get special real-time
version of the kernel in order to have access to higher
resolution timers.

From a casual perusal of BSD4.4 derivatives (and I think
also in Linux systems) the best one can get is the 10ms
resolution.
In POSIX systems the portable way to request
``high-resolution'' timers is via the CLOCK_REALTIME clockid
in timer_create() which is guaranteed to be as small as
10ms.  I have not seen any system giving finer resolution
than 10ms with timer_create() and CLOCK_REALTIME.

I don't have access to AIX or Tru64 UNIX.

poll(), select(), sigtimedwait() offer the usual 10ms
resolution.
Michael Thomadakis
Computer Science Department
Texas A&M University

Joe Seigh wrote:

> bill davidsen wrote:
> >   I believe that the resolution of select() is actually 100ms, even
> > though you can set it in us.
> >
> I think what you are seeing is probably an artifact of the scheduler.  It
> looks like what the system is doing is when the timer pops, the system just
> marks the thread ready and the thread has to wait until the next available
> time slice.  On solaris for programs in the time sharing class this appears
> to be about 10 ms or so.  Try timing nanosleep with various settings to
> see this effect.
> You might try running the program at real time priority to put it in the
> real time scheduling class and playing with the scheduler's real time
> parameters.  However setting up the kernel for real time stuff probably
> increases the kernel overhead significantly, so if you are looking for
> overall system throughput, this is not the way to do it.
> For non timed waits, I've seen a lot less latency.  This is probably because
> the pthread implementation chose to pre-empt a running thread.  The implication
> of this is that they are rewarding the cooperative processing model though
> possibly at the expense of extra context switching unless you do something
> to alleviate that.
> Joe Seigh
 Q353: Starting a thread before main through dynamic initialization. 

> c) As my program starts I might start a thread before main because of
> some other file static object's dynamic initialization. This thread
> might acquire my lock xyzzy before that lock is dynamically initialized
> setting xyzzy.locked_ to 1.

    My coding policies do not permit this. I recommend
that you don't allow it either. Threads should not be
started by the initialization or creation of static
objects. This just makes too many problems.

    For many of my classes, since we know that there is
only one thread running before all initialization is
complete, we don't bother to mess with any locks, we just
bypass them. Fortunately, any thread created after a change
to a memory location is guaranteed to see that change.
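
That last guarantee can be relied on directly. A minimal sketch in C (the
names config_value, reader() and read_in_new_thread() are invented for
the example):

```c
#include <pthread.h>

static int config_value;    /* written before any thread is created */

static void *reader(void *arg)
{
    /* pthread_create() synchronizes memory, so this read sees the
     * value stored before the thread was created. No lock is needed
     * as long as nobody writes config_value afterwards. */
    *(int *)arg = config_value;
    return NULL;
}

/* Store 'value' while still single-threaded, then read it back from
 * a newly created thread. Returns what the thread saw, or -1 if the
 * thread could not be created. */
int read_in_new_thread(int value)
{
    pthread_t tid;
    int seen = -1;

    config_value = value;               /* single-threaded phase */
    if (pthread_create(&tid, NULL, reader, &seen) != 0)
        return -1;
    pthread_join(tid, NULL);
    return seen;
}
```

The guarantee covers only writes made before the pthread_create() call;
anything modified after that point needs ordinary synchronization.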

 Q354: Using POSIX threads on mac X and solaris? 

Does anyone know of any advantages or disadvantages of using POSIX threads
(pthreads) on Mac OS X and Solaris compared to the native implementations?

Do pthreads call the native implementation in both these cases, and is the
mapping between pthreads and kernel objects 1:1?

I don't know anything about the thread implementation on the mac. On Solaris,
pthreads are roughly equivalent to the so-called solaris threads
implementation. I believe that both APIs sit on top of lower-level calls.
The main advantage of using POSIX threads is portability. The other is

% Do pthreads call the native implementation in both these cases, and is the
% mapping between pthreads and kernel objects 1:1?

The mapping between pthreads and the kernel scheduling entity in Solaris
depends on what you ask for. Note that you must be careful if you try to
use the m:n model, because the Solaris two-level thread scheduler is crap.
(this is not related to the API -- it's crap for both pthreads and UI threads).

On Mac OS X, POSIX threads is the lowest-level threading interface anyone
should be calling, at least outside the kernel. The POSIX interface uses Mach
threads, and there is an API to create Mach threads -- but it's not very
convenient. (You need to create the thread, load the registers with intimate
knowledge of the calling standard, including creating a stack and setting it to
"bootstrap" the thread.) Also, the Mach API has limited (and inefficient)
synchronization mechanisms -- IPC.

On Solaris, while libpthread depends on libthread, UI threads isn't really so
much a "native implementation"; they're more or less parallel, and happen to
share a common infrastructure, which happens (for mostly historical reasons) to
reside in libthread. You could consider the LWP layer to be "native threads",
but, somewhat like Mach threads, they're really not intended for general use.

The POSIX thread API is far more general, efficient, and portable than Mach,
UI, or LWP interfaces. Unfortunately, the POSIX thread implementation on Mac OS
X is incomplete (it can't even build half of my book's example programs), and
I wouldn't want to count on it for much. (Though I have no evidence that what's
there doesn't work.) Still, you wouldn't be any better off working directly
with Mach threads.

Solaris, by the way, supports both "N to M" and "1 to 1" thread mappings.
Solaris 8 has a special library that's always 1 to 1. The normal libpthread
provides both N to M (Process Contention Scope, or PCS) and 1 to 1 (System
Contention Scope, or SCS); though the default is PCS and you can't change the
scope of the initial thread. Mac OS X supports only 1 to 1 scheduling.
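
Asking for a particular contention scope is done through the thread
attributes. A minimal sketch follows; the function names are made up, and
on a system that supports only one scope, pthread_attr_setscope() may
return ENOTSUP for the other one.

```c
#include <pthread.h>

static void *noop(void *arg) { return arg; }

/* Create (and join) a thread with system contention scope, i.e. one
 * that is scheduled 1 to 1 against all threads in the system.
 * Returns 0 on success or an error number. */
int run_system_scope_thread(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, noop, NULL);
    if (rc == 0)
        rc = pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);
    return rc;
}
```

Passing PTHREAD_SCOPE_PROCESS instead requests the N to M model where the
implementation offers it.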

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/
 Q355: Comments on ccNUMA on SGI, etc. 

> I have a big problem with my simulation. I am trying to implement a
> parallel neural network simulator using the SNNS simulator, the C language
> and POSIX threads. I am using an SGI Origin 2000 machine with 8 processors
> running IRIX. For more than 2 weeks I have been trying to make my code
> run faster on 8 processors than on 2 - but I still can't make any
> progress ! [...]

The Origin 2000 systems are ccNUMA; that means, unlike traditional SMP
multiprocessors, not all processors have equal access to all memory. Any
memory you use will be on one "node" or another. Threads running on that
node (each Origin 2000 node has 2 processors, so you're potentially
running threads on 4 different nodes) have fast local access. Threads
running on other nodes have to go through the network interconnects
between nodes. Those interconnects are slower (typically much slower) and
also have limited bandwidth. That is, it's probably not possible for 3
nodes to simultaneously access memory in the 4th node without severe
performance degradation over the "normal" local access.

> I have now read in the IRIX documentation that the cache memory may be
> a very important issue - and that each thread should access the same
> subset of data all the time - for good performance. This is not the
> case in my program. Also, the network (which has around 600 units)
> and the connections are created by the main thread and - probably - are
> stored on one processor ?! Does this mean that all the other processors are
> communicating with this one to get the units' information ? Is this so
> bad ? Can this be the only reason for the low performance ?

Running with cache is always the best strategy for performance. Modern
processors are so much faster than memory, that memory access is the only
defining characteristic of program performance. We used to count
instructions, or CPU cycles; but all that's irrelevant now. You count
memory references, as a first indicator; for detailed information, you
need to analyze the cache footprint. Most processors have multiple levels
of cache, maybe up to 3, before you hit main memory. The first level
delays the instruction pipeline by a couple of cycles. The second may be
on the order of 10 cycles, the third 20 to 100 cycles. And, relative to
normal processor speeds, if you've got to hit main memory you might as
well break for lunch. And that's just LOCAL memory, not remote memory on
some other node.

> Also, the global list of spikes is updated by all threads - and now I am
> wondering where it is stored, and how I should store it, in order to
> use it efficiently. The same documentation says that you
> should keep the data you use on the same processor, but here the spikes are
> inserted by different threads and computed by any of the threads. This
> is because the entire simulation is driven by 'events' and time issues -
> so any available thread computes the next incoming event.

Writing closely packed shared data from multiple threads, even on an SMP,
is "bad mojo". When the data lives within the same hardware cache line,
all other processors that have written or read the data "recently" need to
see that their cached copy is now invalid, and refetch from main memory.
That, obviously, is expensive. When all of your threads are writing to the
same cache line continuously, the data in that line "ping pongs" between
processor caches. This is the number 1 program characteristic that leads
to the old "my program runs faster without threads". (The second, and less
subtle, is overuse of application synchronization.) And remember, in a
ccNUMA system like yours, anything dealing with memory is far worse unless
the memory is local to your node. Obviously, memory shared by all your
threads cannot possibly be local to all of them unless you're using only a
fraction (2) of the available processors. That is very likely why you ran
into the magic number "2 threads (processors)". When you're using only 2,
the system can keep both of them, and their memory, on the same node.
Beyond 2, that's impossible.
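
A minimal sketch of one way to avoid the cache-line "ping pong" described
above: give each thread its own counter, padded out to a full cache line.
The 64-byte line size is an assumption (the real value depends on the
processor), C11 `_Alignas` is assumed to be available, and the names
`padded_counter`, `count_worker` and `run_padded_counters` are made up for
illustration. Error checking is omitted for brevity.

```c
#include <pthread.h>
#include <stddef.h>

#define CACHE_LINE 64           /* assumed line size; check your CPU */
#define NTHREADS   4
#define ITERATIONS 1000000L

/* Aligning the member to CACHE_LINE also pads the struct to that size,
 * so adjacent array elements never share a cache line. */
struct padded_counter {
    _Alignas(CACHE_LINE) long value;
};

static struct padded_counter counters[NTHREADS];

static void *count_worker(void *arg)
{
    struct padded_counter *mine = arg;
    for (long i = 0; i < ITERATIONS; i++)
        mine->value++;          /* writes only this thread's own line */
    return NULL;
}

/* Runs NTHREADS workers and returns the sum of their private counters. */
long run_padded_counters(void)
{
    pthread_t tid[NTHREADS];
    long total = 0;

    for (int i = 0; i < NTHREADS; i++) {
        counters[i].value = 0;
        pthread_create(&tid[i], NULL, count_worker, &counters[i]);
    }
    for (int i = 0; i < NTHREADS; i++) {
        pthread_join(tid[i], NULL);
        total += counters[i].value;
    }
    return total;
}
```

Removing the `_Alignas` packs the counters into one or two lines, which is
the closely packed layout the paragraph above warns about.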

I'm not sure how IRIX manages your memory in this case. Some systems might
automatically "stripe" memory that's not otherwise assigned across all the
nodes. (If there's enough data to do that.) That may tend to even out the
non-local memory references, and can often perform better than simply
putting all the memory into one node. On the other hand, memory that's not
explicitly assigned is often allocated on the first node to reference the
memory; and if your startup initializes your entire data array (or
allocates it all from malloc), then it's likely that the entire data set
IS on the node where you started. Which means that the other 3 nodes are
beating on its interconnect port continuously, and you're operating in the
worst case performance mode.

The best strategy (if you can) would be to explicitly target memory to
specific nodes along with two specific threads that will be doing all
(ideally) or most of the access to that memory. (In your case, this
probably isn't possible; but the closer you come, the better your
performance will be.) Even making sure that your global arrays are striped
might help. In fact, even making sure that they're allocated from two
nodes instead of just one might double your performance. I'm not familiar
with the IRIX APIs for assigning memory (or threads) to specific
ccNUMA nodes, but such things must exist, and you might consider looking
them up and giving it a try.
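
Where no explicit placement API is at hand, one portable approximation is
to let each worker thread be the first to touch the part of the data it
will later use. The sketch below assumes the system places pages on the
node of the first thread to reference them (the text above says only that
this is "often" the case) and that the scheduler keeps the thread near its
memory afterwards. The names `slice`, `init_slice` and `alloc_first_touch`
are hypothetical, and error checking on thread creation is omitted.

```c
#include <pthread.h>
#include <stdlib.h>

#define NWORKERS 4

struct slice {
    double *base;   /* start of this worker's part of the array */
    size_t  count;  /* number of elements in the slice */
};

/* Each worker writes its own slice first, so on a first-touch system
 * those pages would tend to land near the CPU running the worker. */
static void *init_slice(void *arg)
{
    struct slice *s = arg;
    for (size_t i = 0; i < s->count; i++)
        s->base[i] = 0.0;
    /* ... a real worker would go on to compute on the same slice ... */
    return NULL;
}

/* Allocates n doubles without touching them in the main thread, then
 * lets the workers initialize their own parts. Returns NULL if the
 * allocation fails. */
double *alloc_first_touch(size_t n)
{
    pthread_t tid[NWORKERS];
    struct slice parts[NWORKERS];
    double *data = malloc(n * sizeof *data);
    if (data == NULL)
        return NULL;

    size_t chunk = n / NWORKERS;
    for (int i = 0; i < NWORKERS; i++) {
        size_t start = (size_t)i * chunk;
        parts[i].base = data + start;
        /* the last worker also takes any remainder */
        parts[i].count = (i == NWORKERS - 1) ? n - start : chunk;
        pthread_create(&tid[i], NULL, init_slice, &parts[i]);
    }
    for (int i = 0; i < NWORKERS; i++)
        pthread_join(tid[i], NULL);
    return data;
}
```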

Otherwise, you might consider limiting your application to a single node.
Given that your application sounds pretty heavily CPU bound with
relatively little I/O, you're unlikely to gain any advantage in that case
from more than 2 threads. (The more blocking I/O you do, the more likely
it is that additional threads will improve throughput.) If you can split
the dataset more or less in half, you might consider doing that across 2
nodes, with 4 threads, and see how that works.

Just as optimizing threaded performance has started to go from pure black
magic to something that's almost engineering, along comes ccNUMA and
breaks all the rules and brings back that element of magic. Welcome to the
bleeding edge, and... good luck. ;-)

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

Origin 2000 is rather old at this point.  Origin 3000 is
the current system, and its memory system is even less
NUMA than the Origin 2000.

> [...good stuff snipped...]

>                                                          I'm not familiar
> with the IRIX APIs for assigning memory (or threads) to specific
> ccNUMA nodes, but such things must exist, and you might consider looking
> them up and giving it a try.

Indeed, one has complete control over where memory is placed.

> [...more snipped...]

> Just as optimizing threaded performance has started to go from pure black
> magic to something that's almost engineering, along comes ccNUMA and
> breaks all the rules and brings back that element of magic. Welcome to the
> bleeding edge, and... good luck. ;-)

I'd hardly call NUMA bleeding edge after all these years.

One thing Dave didn't bring up is "execution vehicle"-pthread
affinity.  In an M-on-N pthread implementation the kernel
schedules the "execution vehicles" and the library schedules
the pthreads onto them.  The kernel is cache+memory affinity
aware and tries to schedule execution vehicles to maximize
affinity, while trying to be fair, schedule real time threads,
etc.  The library has to avoid deadlock, observe priorities,
and schedule what could be far more threads than execution
vehicles.  What can happen is that IRIX may nicely place
say 5 execution vehicles on the 4 CPUs in one C-brick, and
1 CPU in another "nearby" C-brick, and leave them there,
maximizing affinity, but the library, for a variety of reasons,
may end up moving pthreads around on these execution vehicles in a
way that is not affinity friendly.

For CPU intensive applications this may be a performance issue,
so the library provides a nonportable scope: PTHREAD_SCOPE_BOUND_NP
to bind a pthread to an execution vehicle.  For realtime and
applications which typically run alone on a system the library
provides a nonportable call: pthread_setrunon_np() to force a
bound (or system scope) thread to run on a particular CPU.
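
PTHREAD_SCOPE_BOUND_NP and pthread_setrunon_np() are IRIX extensions, so
they are not shown here. The nearest standard request is system contention
scope, sketched below with pthread_attr_setscope(). Whether it gives each
pthread its own kernel-scheduled entity is up to the implementation, and
the function names `scope_worker` and `run_system_scope_thread` are
invented for the example.

```c
#include <pthread.h>

static void *scope_worker(void *arg)
{
    return arg;     /* placeholder for CPU-bound work */
}

/* Creates and joins one thread with system contention scope.
 * Returns 0 on success or the failing pthread error code. */
int run_system_scope_thread(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* May fail with ENOTSUP where this scope is not supported. */
    rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, scope_worker, NULL);
    if (rc == 0)
        rc = pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return rc;
}
```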

I understand that Sun recently released an alternate version
of its pthread library which has a N-on-N implementation.  I'd
guess they did this because of the same affinity issue.  Does
anyone know different?
 Q356: Thread functions are NOT C++ functions! Use extern "C" 

Patrick TJ McPhee wrote:
> In article ,
> Doug Farrell  wrote:
> % And again you refer to 'the standard C++ thread function', what are you
> % talking about?
> There isn't any, but do use it if you don't want to pass a C function.
> [Cry of frustation followed by general elucidation omitted]
> If it causes you emotional distress to create a C function,
> then use the standard C++ thread class (keeping in mind that there
> isn't one).

Just so's this doesn't go on and on and on: Patrick, is it fair to
assume that you are ladling on the irony here?

Doug, the essence of what has been said so far is this:

pthread_create's prototype in C++ is:

  extern "C" int pthread_create(pthread_t *, const pthread_attr_t *,
                                void *(*start_routine)(void *),
                                void *);

See that `extern "C"'?  That covers _all_ function types in the
declaration; in particular the start_routine function pointer, whose
type is actually

  extern "C" void *(*)(void *);

that is, `pointer to C function taking void * and returning void *'.
By passing a function whose C++ prototype is:

  class SomeClass {
    // ...
    static void *threadfn(void *);
  };

or just:

  void *threadfn(void *);

  (therefore, &threadfn is `pointer to C++ function taking void * and
  returning void *'),

you are invoking undefined behaviour.  Your implementation is now
allowed to activate your modem and phone the speaking clock in Ulan
Batur, amongst other things.  You _know_ you _mustn't_ invoke
undefined behaviour, just as you _know_ that unless you feel obliged
to by current compilers' handling of implicit template
instantiation, you shouldn't put the implementation in the header.*

In short, using POSIX threads, you cannot put the argument to
pthread_create inside the class.  Period.  Put it in the
implementation file inside an anonymous namespace, or use a global
static, and pass it a pointer to the class as its argument, like this:

  class SomeClass { public: void *threadfn(); };

  extern "C" {
    // C linkage, internal to this file; forwards to the member function
    static void *threadfn(void *args) {
      SomeClass *pSomeClass = static_cast<SomeClass *>(args);
      return pSomeClass->threadfn();
    }
  }

Guy (not saying anything further just in case it starts another
pointless "Standard C++ is broken with respect to threading--oh no it
isn't--oh yes it is etc. ad nauseam" thread).

*Ask yourself: given a header file containing the implementation, or a
library/object file containing the implementation, what must my users
do if I change the implementation?
 Q357: How many CPUs do I have? 

NoOfCpus = sysconf(_SC_NPROCESSORS_CONF); /* Linux */

GetSystemInfo(&SystemInfo);                                        /* NT */
NoOfCpus = SystemInfo.dwNumberOfProcessors;

In my experience, for busy CPU-bound threads the number of threads should
not greatly exceed the number of available processors. That delivered the
best performance.
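
A small sketch of sizing a worker pool from the processor count on
UNIX-like systems. It uses _SC_NPROCESSORS_ONLN (processors currently
online) rather than _SC_NPROCESSORS_CONF; neither name is required by
POSIX, so the code falls back to 1 when the query is unavailable. The
helper names `online_cpu_count` and `pick_worker_count` are hypothetical.

```c
#include <unistd.h>

/* Returns the number of processors currently online, or 1 if the
 * query is unavailable or fails. */
long online_cpu_count(void)
{
#ifdef _SC_NPROCESSORS_ONLN
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n > 0)
        return n;
#endif
    return 1;
}

/* Picks a worker count for CPU-bound threads: one per online
 * processor, capped at max_workers. */
long pick_worker_count(long max_workers)
{
    long n = online_cpu_count();
    return (n < max_workers) ? n : max_workers;
}
```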

Victor Khomenko  wrote in message
> Hi,
> I want to make the number of working threads depend on the number of
> processors in the system. Is there a good way to find out this information
> (run time)? How many threads per processor is a good ratio (all threads are
> going to be pretty busy, but can sometimes wait on mutexes and conditions)?
> I need this information for Linux and Win32.
> Victor.
 Q358: Can malloc/free allocate from a specified memory range? 

> Using mmap to share data between processes leads to the requirement to
> dynamically allocate and free shared memory blocks.  I don't want to
> mmap each block separately, but prefer to allocate and free the memory
> from within the mapped region.  Is there a way to redirect malloc/free
> library functions to allocate from a specified memory range, instead of
> the heap?
> I don't want to mmap each block separately or to use shmget because of
> the cost of so many mappings.
> -K
The mmalloc package at:
might be a good starting point.

 Q359: Can GNU libpth utilize multiple CPUs on an SMP box? 

>> > Is there any existing patches that can make GNU libpth utilize
>> > multiple CPUs on an SMP box?
>> I recall that IBM is doing something much like it. I cannot remember
> Can you give me some clues to find it?  I've tried google, but it
> returned either too many or no results.

Here is the URL:

bye, Christof

 Q360: How does Linux pthreads identify the thread control structure? 

R Sharada wrote:

>     I have a query related to how Linux pthreads implementation
> identifies the thread control structure or descr for a current thread,
> in the case when the stack is  non-standard ( by way of having called a
> setstackaddr /setstacksize ).

First off, don't ever use pthread_attr_setstackaddr(), because it's a
completely brain-damaged interface that's inherently broken and totally
nonportable. I've explained why it's broken (both in terms of engineering
features and political history), and I won't repeat it here. (You can
always search the archives.) Just don't use it.

The next version of POSIX and UNIX (2001) contains my corrected version,
which ended up being named pthread_attr_setstack(). At some point, this
will begin to appear on Linux and other systems.
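
On systems that do provide pthread_attr_setstack(), usage looks roughly
like the sketch below. The 1 MB size and the helper names `stack_worker`
and `run_on_own_stack` are assumptions made for the example; the size must
be at least PTHREAD_STACK_MIN, and the page-aligned allocation is a
conservative choice rather than something every system demands.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#define STACK_BYTES (1024 * 1024)   /* assumed size for the example */

static void *stack_worker(void *arg)
{
    int local = 42;     /* lives on the caller-supplied stack */
    (void)arg;
    return (local == 42) ? (void *)1 : NULL;
}

/* Runs one thread on a caller-allocated stack. Returns 0 on success. */
int run_on_own_stack(void)
{
    pthread_attr_t attr;
    pthread_t tid;
    void *stack = NULL;
    void *result = NULL;
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || posix_memalign(&stack, (size_t)page, STACK_BYTES) != 0)
        return -1;

    int rc = pthread_attr_init(&attr);
    if (rc == 0) {
        /* Pass the lowest address and the size; the implementation
         * works out which end the stack grows from. */
        rc = pthread_attr_setstack(&attr, stack, STACK_BYTES);
        if (rc == 0)
            rc = pthread_create(&tid, &attr, stack_worker, NULL);
        if (rc == 0)
            rc = pthread_join(tid, &result);
        pthread_attr_destroy(&attr);
    }
    free(stack);        /* safe only after the thread has been joined */
    return (rc == 0 && result != NULL) ? 0 : -1;
}
```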

> Currently the method ( in thread_self
> routine ) just parses through the whole list of threads until one
> matches the current sp and then obtains the descr from there. This could
> get quite slow in conditions where there are a lot  of threads ( close
> to max ). Isn't there a better way to this?

No; not for Linux on X86.

>     Does anyone know how this is handled in other UNIXes - AIX, Solaris,
> etc.??

The best way to handle it is to define in the hardware processor context a
bit of information that's unique for each thread. SPARC and IA-64 define a
"thread register" that compilers don't use for anything else, but can be
read by assembly code or asm(). Alpha defines a "processor unique" value
that can be read by a special instruction. I believe that PowerPC has one
or the other of those techniques, as does MIPS.

LinuxThreads can and should use these mechanisms when built for the
appropriate hardware; but on X86 (which is still the most common Linux
platform), none of this is an option. Of course, "the system" could define
a universal calling standard that reserved from the compiler code
generators "a register" that could be used in this way. However, the X86
register set is pretty small already, and, in any case, trying to make that
change NOW would be a major mistake since you couldn't use any existing
binary code (or compilers).

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

 Q361: Using gcc -kthread doesn't work?! 

> i have a multithreaded program on a dual-pentium machine running freebsd
> 4.3. compiling everything with
>    gcc -pthread ...
> works fine, but doesn't realy make the second processor worth the money
> (i.e. everything runs on one thread). according to 'man gcc' compiling
> with
>    gcc -kthread
> should fix the problem. unfortunately, gcc tells me it doesn't recognise
> the option. in a message on mailing.freebsd.bugs i read that for freebsd
> 4.2 one had to recompile gcc with the appropriate arguments set. i did a
> make and make install in /usr/src/gnu/usr.bin/cc, but i couldn't add any
> options and the compiler turned out just the same as the last...
> anybody know what i should do here?

As you may have already seen, there's a FreeBSD bug report on this:

Here are the comments in the "Audit-Trail" section at the bottom of the report:

"The -kthread link flag was purposely removed, since linuxthreads is not 
part of the base system.  There are explicit instructions that come with 
the linuxthreads port that explain how to link with linuxthreads."

> p.s. i don't want to start a flame-war on linuxthreads vs. whatever -
> the purpose of compiling under freebsd is to be able to tell for myself
> which os is best for my needs ;)

You may find "Kernel-Scheduled Entities for FreeBSD" interesting

I once did a test with linuxthreads (available under /usr/ports/devel)
on a dual-CPU FreeBSD system and my test program successfully used
both processors.  However, my test program was trivial, so I'd want to
do a lot more testing before I'd put anything more complicated into
production.  As you may know from reading this newsgroup, there's some
criticism of the linuxthreads model.  But at least it lets threaded
programs use multiple CPUs on FreeBSD :-)

Michael Fuhr
 Q362: FAQ or tutorial for multithreading in 'C++'? 

For the Win32 API only:

MSDN Library (With samples and function documentation)

Thread function documentation :

If you speak French:

Tomasz Bech wrote in message:
> Hi,
>     Does anybody know about good faq or tutorial for multithreading in
> 'C++'?
>   Thanks,
>         Tomasz

 Q363: WRLocks & starvation. 

> "Dave Butenhof" wrote in message
> The UNIX 98 standard (and the forthcoming POSIX 1003.1-2001 standard)
> includes POSIX read-write lock interfaces. You'll find these interfaces implemented
> (at least) on AIX 4.3.x, Solaris 8, Tru64 UNIX 5.0, and any moderately recent
> version of Linux. Earlier versions of Solaris and Tru64 UNIX also provided
> different nonstandard interfaces for read-write locks.
> I guess these implementations will take care of classic problems like
> starvation of the writer, don't they?

Sure. If they want to. In whatever manner thought best by the designers. (Or in
whatever way the code happened to fall out if they didn't bother to think about it.)

Even the POSIX standard read-write lock doesn't require any particular
preference between readers and writers. Which (if any) is "right" depends
entirely on the application. Preference for readers often results in improved
throughput, and is frequently better when you have rarely updated data where
the USE of the data is substantially more important than the updates. (For
example, the TIS read-write locks on Tru64 UNIX were developed specifically to
replace a completely broken attempt to do it using a single mutex in the libc
exception support code. It used the construct to manage access to the code
range descriptor list for stack unwinding; or to update it with a newly loaded
or generated code range. Read preference was appropriate, and sufficient.)

Write preference can be better when you don't care principally about
"throughput", or where multiple readers are really relatively rare; and where
operating on stale data is worse than having the readers wait a bit. (Or where
you simply cannot tolerate the possibility of a starving writer wandering the

A generally good compromise is a modified FIFO where adjacently queued readers
are batched into a single wakeup; but that still constrains reader concurrency
over read preference and increases data update latency over writer preference.
Like all compromises, the intention is more to keep both sides from being angry
enough to launch retaliatory nukes, rather than to make anyone "happy". It does
avoid total starvation, but at a cost that may well be unacceptable (and
unnecessary) to many applications.

It wouldn't make sense for the standard to mandate any of those strategies.
Partly because none of them is "best" for everyone (or even for ANYone). Partly
because there are probably even better ideas out there that haven't been
developed yet, and it makes no sense to constrain experimentation until and
unless a clear winner is "obvious". (For example, had the standard specified a
strategy, it would have been either reader or writer, not "modified FIFO",
because the latter wasn't in wide use at the time.)

We considered a read-write lock attribute to specify strategy. We decided that
this would be premature. While we've advanced a bit in the intervening time, I
think it would still be premature. Though of course individual implementations
are welcome (and even encouraged) to experiment with such an attribute. If some
set of strategies become relatively common practice, the next update of POSIX
and UNIX (probably 2006) could consider standardizing it.

> I enjoyed the discussion about how to implement condition variables in
> Win32. What would the windows implementation of these read-write lock
> interfaces look like?

Probably already done, somewhere. Go look! I don't even want to THINK about it.
(But then, I feel that way about anything Windows-ish. Everyone, except
possibly Bill Gates, would be better off without Windows.)

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

 Q364: Reference for threading on OS/390. 

Gee M Wong wrote:

> I've got a new project starting up, and it has been over a decade since 
> I last wrote a C/C++ program on the mainframe.  Would someone please 
> suggest a current libraray and reference for threading on OS/390 
> (preferably Pthread).

"4.3 Chapter 23. Using Threads in an OS/390 UNIX Application..."

"OS/390 C/C++ Library Start here to access the OS/390 C/C++ 
publications available on the Web..."
 Q365: Timeouts for POSIX queues (mq_timedreceive()) 

> > While a thread is waiting for a message to arrive in a message queue,
> > using mq_receive(), I'd like to have a way to unblock the thread when
> > after a certain timeout no message has arrived. In pSOS the timeout is
> > a parameter of the q_receive() call.
> > Is this also possible using POSIX queues?
> Well, sort of. The POSIX 1003.1d-1999 amendment to POSIX 1003.1-1996
> includes an mq_timedreceive() function that allows you to specify a
> timeout. However, it's not widely implemented yet, and likely won't be
> available on your platform. (You haven't said what your platform is;
> "POSIX" doesn't help much since there's no such operating system!)
You're right. Actually, the software should work on multiple
platforms, Linux and pSOS being the most important. mq_timedreceive()
is not implemented in pSOS.
> > If not, is there a work around for this problem?
> You could always create a thread that waited for the specified interval
> and then sends a special message to the queue, awakening a waiter. 
Yes, I tried that one. It works, but I wondered if there is a more
elegant way to do this. As you pointed out, this is mq_timedreceive()
(maybe implement my own mq_timedreceive for pSOS?)
> You could also interrupt it with a signal, causing an EINTR return from
> mq_receive(); though that imposes a number of complications, including
> deciding what signal number to use, what happens (what you want to
> happen) when the thread isn't actually waiting in mq_receive(), and so
> forth.
> You can't use alarm(), because the signal it generates isn't directed at
> any particular thread but rather to the process as a whole. (Although you
> can get away with it if you control the main program, so that you can
> ensure SIGALRM is blocked in all threads except the one you want to
> interrupt.)
Thanx for that one. I have to check if alarm() is supported by pSOS.
> If you can wait for the platforms you care about to implement
> mq_timedreceive(), that'd be the best solution. Otherwise... choose your
> hack.
> /------------------[ ]------------------\
> | Compaq Computer Corporation              POSIX Thread Architect |
> |     My book:     |
> \-----[ ]-----/
 Q366: A subroutine that gives cpu time used for the calling thread? 

> I would like to write a subroutine that gives cpu time used for the calling
> thread. I used times (I'm under Tru64 V5.0 using pthread), and it returns a
> cumulative cpu time, not the cpu time for the given thread. Any suggestions ?

The 1003.1d-1999 amendment to POSIX added optional per-thread clock functions; but I
doubt they're implemented much of anywhere yet. (And definitely not on Tru64 UNIX.)
Where implemented, you'll find that <unistd.h> defines _POSIX_THREAD_CPUTIME, and
you could call clock_gettime() with the clock ID CLOCK_THREAD_CPUTIME_ID (for the
calling thread), or retrieve the clock ID for an arbitrary thread (for which you
have the pthread_t handle) by calling pthread_getcpuclockid().

(I'd like to support this, and a lot of other new stuff from 1003.1d-1999 and
1003.1j-2000, as well as the upcoming UNIX 2001. But then, there are a lot of other
things we'd like to do, too, and I couldn't even speculate on time frames.)

Whether there are any alternatives or "workarounds" depends a lot on what you're
trying to accomplish.

In any case, times() MUST return process time, not thread time. That's firmly
required by the standard. Otherwise, times() would be broken for any code that
wasn't written to know about threads; which is most of the body of UNIX software.

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/

 Q367: Documentation for threads on Linux 

> "Dan Nguyen"  wrote in message
> news:9g45ls$2lsi$
> > Robert Schweikert  wrote:
> > > I am looking for some documentation for threads on Linux. What I am
> > > after is some idea what is implemented, what works, what doesn't. Where
> > > the code is, and what is planned for the future.
> >
> > Linux uses a 1-1 type threading model.  LinuxThreads as it is known is
> > a kernel level thread using the clone(2) system call (only available
> > in Linux, and don't use it yourself).  It implements the pthread
> > library, so any pthread application should run correctly.
> >
> I am everything but an expert on this, but it seems pthread is not fully
> implemented on Linux.

This is correct. The essential problem is that clone() doesn't, currently,
support the creation of multiple THREADS within a single PROCESS. Instead, it
creates multiple PROCESSES that share a single ADDRESS SPACE (and other
resources). The basic distinction is that each clone()d process has its own
pid and signal actions, and that they lack a shared pending signal mask.

While these deficiencies can be critical for some code, the LinuxThreads
implementation does enough extra work "under the covers" that most threaded
applications won't notice. There are people working on solving the problems,
so you can expect them to be "short term".

> Have a look at: comp.os.linux.development.apps The thread from the 11th June
> 2001 called "sharing Pthread mutexes among processes".

POSIX provides an OPTION supporting "pshared" synchronization objects, that
can be used between processes. Implementations need not provide every option
to be "POSIX". If by "full POSIX" you choose to mean "an implementation
correctly and completely providing all mandatory and optional features and
behaviors", then I doubt any exist.

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/
 Q368: Destroying a mutex that was statically initialized. 

Ross Smith wrote:

> David Schwartz wrote:
> >
> > > Thanks. Apparently even Mr Butenhof makes the occasional mistake :-)

Like R2D2, I have been known to make mistakes, from time to time. Still, this
particular example isn't one of them.

What I actually said was "You do not need to destroy a mutex that was statically
initialized using the PTHREAD_MUTEX_INITIALIZER macro." And you don't. You CAN, if
you want to; but you don't need to. Why should you? It's static, so it never goes
out of scope. You can't have a memory leak, because the little buggers can't
reproduce. If you want to destroy one, and even dynamically initialize a new mutex
at the same address, have at it.

> >         Let me point out one more thing: It really doesn't make sense to
> > attempt to statically initialize something that's dynamically created.
> > So you shouldn't be statically initializing a mutex that isn't global
> > anyway. And if it's global, you should never be done with it.
> >
> >         Can you post an example of a case where you are done with a statically
> > initialized mutex where it isn't obvious that dynamic initialization is
> > better?
> Any case where the mutex isn't global, I would have thought.
>   void foo() {
>     pthread_mutex_t mutex(PTHREAD_MUTEX_INITIALIZER);
>     // ...
>     pthread_mutex_destroy(&mutex);
>   }

You can't do that. It's illegal. You can ONLY use the POSIX static initializers for
STATICALLY ALLOCATED data. Nevermind that compilers will happily compile the broken
code: that doesn't mean it's not broken any more than being able to compile
"x=0;z=y/x;" means you should expect it to work. You're violating POSIX rules. The
resulting code MAY work (at least sometimes, or "appear to work" in some
situations) on SOME implementations, but it is not legal POSIX code and isn't
guaranteed to work anywhere.

Of course, this is legal:

     void foo() {
         static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
         // ...
     }

But that's not the same thing. ALL invocations of foo() share the same global
mutex. Private mutexes aren't much good, anyway. Your example is pointless unless
foo() is creating threads and passing the address of "mutex" to them for
synchronization; in which case it had better also be sure all threads are DONE with
the mutex before returning. It must also use pthread_mutex_init() to initialize
"mutex", and pthread_mutex_destroy() to destroy it before returning.

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/
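
The last case described above (a function that owns a non-static mutex and
hands its address to threads it creates) might be sketched as follows. The
names `job`, `helper` and `run_job` are invented for the example, and
error checking on pthread_create() is omitted for brevity.

```c
#include <pthread.h>

#define NHELPERS 4

struct job {
    pthread_mutex_t *lock;  /* points at the mutex owned by run_job() */
    long *total;            /* shared result, protected by *lock */
};

static void *helper(void *arg)
{
    struct job *j = arg;
    pthread_mutex_lock(j->lock);
    *j->total += 1;
    pthread_mutex_unlock(j->lock);
    return NULL;
}

/* Uses an automatic (stack) mutex, so it is initialized with
 * pthread_mutex_init() and destroyed before the function returns. */
long run_job(void)
{
    pthread_mutex_t mutex;
    pthread_t tid[NHELPERS];
    long total = 0;
    struct job j = { &mutex, &total };

    if (pthread_mutex_init(&mutex, NULL) != 0)
        return -1;
    for (int i = 0; i < NHELPERS; i++)
        pthread_create(&tid[i], NULL, helper, &j);
    for (int i = 0; i < NHELPERS; i++)
        pthread_join(tid[i], NULL);     /* every user is now done with it */
    pthread_mutex_destroy(&mutex);
    return total;
}
```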

> I am looking for some documentation for threads on Linux. What I am
> after is some idea what is implemented, what works, what doesn't. Where
> the code is, and what is planned for the future.
The best resource for Linuxthreads documentation is in the info pages
for libc ('info libc') -- under the headings ADD-ONS -> `POSIX Threads'.

Artie Gold, Austin, TX   

I found cprof at very useful.


On Wed, 13 Jun 2001, stchang wrote:

> We are developing multi-thread program code. However, it does not have
> good performance. The performance is about 1.5X compared with
> non-multithread code. Sometimes, it is slower than non-multithread code. Can
> someone give me some idea about how to profile multi-thread code or
> analyze threads?
> Thanks!

The first, and often the best tool to apply is common sense.

You don't say on what hardware (or OS) you're running. Actually, for an
untuned application, if you're running on a 2-CPU system, 1.5X speedup
isn't at all bad.

However, a performance decrease isn't particularly surprising, either. It
means you're not letting the threads run in parallel. There are many
possible reasons, some of which are due to "quirks" of particular
implementations. (For example, on Solaris, you need to use special
function calls to convince the system you're on a multiprocessor.)

The most common reasons are that your application is designed to "wait in
parallel". Contention for a common resource is the most common problem.
For example, all threads do all (or nearly all) their work holding one
particular application mutex. No matter how many threads you have, they
can't do anything significant in parallel, and they waste time in
synchronization and context switching. Guaranteed to perform worse than
single-threaded code, trivial to write.
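
The opposite pattern is to do the bulk of the work on thread-private data
and take the shared mutex only briefly to publish a result. A minimal
sketch, with made-up names (`sum_worker`, `run_sum`) and a trivial loop
standing in for real work:

```c
#include <pthread.h>

#define NSUMMERS   4
#define PER_THREAD 100000L

static pthread_mutex_t sum_lock = PTHREAD_MUTEX_INITIALIZER;
static long grand_total;    /* shared, protected by sum_lock */

static void *sum_worker(void *arg)
{
    long local = 0;
    (void)arg;
    for (long i = 0; i < PER_THREAD; i++)
        local += 1;             /* the real work, done without the lock */

    pthread_mutex_lock(&sum_lock);
    grand_total += local;       /* one short critical section per thread */
    pthread_mutex_unlock(&sum_lock);
    return NULL;
}

/* Runs the workers and returns the combined total. */
long run_sum(void)
{
    pthread_t tid[NSUMMERS];

    grand_total = 0;
    for (int i = 0; i < NSUMMERS; i++)
        pthread_create(&tid[i], NULL, sum_worker, NULL);
    for (int i = 0; i < NSUMMERS; i++)
        pthread_join(tid[i], NULL);
    return grand_total;
}
```

Moving the lock and unlock inside the loop would turn this into the "wait
in parallel" design described above.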

The contention may not even be in your code. If they all do I/O to a
common file (stream or descriptor), they will spend time waiting on the
same kernel/C/C++ synchronization. If that I/O drives the performance of
the application, you lose.

The problem might even be in how you're using your hardware. When
processors in an SMP or CC-NUMA system repeatedly write the same "block"
of memory, they need to synchronize their caches. If all your threads are
busily doing nothing else on all available processors, you can reduce
those processors to doing little but talking to each other about what
cache data they've changed.

Adding threads doesn't make an application "go faster". Careful design
makes it go faster. Careful and appropriate use of threads is one TOOL a
developer can use when designing a faster application. But it affects
every aspect of the design (not just implementation) of the code.

Sometimes you do need to analyze the behavior of running code, and it's
nice to have good tools. (If you can run on Tru64 UNIX or OpenVMS, by the
way, Visual Threads does an awesome job of helping you to understand the
synchronization behavior of your program.) Regardless of the tools,
though, good performance comes from careful design and thorough
understanding of what your application does, and the demands it places on
the OS and hardware; the sooner in the design cycle you accomplish this,
and the more completely you apply the knowledge, the better the results
will be.

/------------------[ ]------------------\
| Compaq Computer Corporation              POSIX Thread Architect |
|     My book:     |
\-----[ ]-----/
 Q369: Tools for debugging overwritten data. 

> Oh btw, suppose I've got a local variable in a thread function, and I KNOW
> something overflows or overwrites it (when NOT supposed to happen), is there
> a way to find out who trashes it ?

There are several tools for it:
 - Purify by Rational Software
   Very good tool, but expensive
 - Insure by Parasoft
   Very good, has a few quirks but nothing serious. Can catch illegal
   parameters to system calls too.
If your budget can't handle the above tools, or if you can limit the
trashing to the heap, you can look into:
 - electric fence
   freeware, pretty good debug-heap. I have encountered a few problems
   fork()ing multithreaded programs under Solaris, though.
 - miscellaneous debug heaps
 - idh
   (disclaimer: I wrote it)
 Q370: POSIX synchronization is limited compared to win32. 

On Fri, 20 Apr 2001 23:28:55 -0400, Timur Aydin  wrote:
>Hello everybody,
>After quite some time doing multithreaded programming under win32, I have
>now started to do development under Linux using LinuxThreads. However, I am
>noticing that the synchronization objects are quite limited compared to the
>ones under win32.

However, the nature and variety of the objects provided by Win32 leaves much to
be desired. Events are simply not very suited for solving a wide variety of
synchronization problems.  It's a lot easier to solve synchronization problems
with condition variables because they have no programmer visible state. The
logic is based entirely on the state of your own data. Objects like events or
semaphores carry their own state; to solve a synchronization problem, the
programmer must bring about some meaningful association between the semaphore's
state and the state of the program. In my programming experience, such
associations are fragile and difficult to maintain.
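
As a rough illustration of the point about condition variables carrying no
state of their own, here is a minimal sketch in which all of the logic lives
in the program's own data (a counter of pending items). The struct and
function names (work_queue, queue_put, queue_take) are hypothetical and not
taken from the post.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical shared state. The predicate is "count > 0"; the condition
 * variable itself remembers nothing. */
struct work_queue {
    pthread_mutex_t lock;
    pthread_cond_t  nonempty;
    int             count;      /* number of pending items */
};

#define WORK_QUEUE_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0 }

/* Producer: change the data under the mutex, then signal. */
void queue_put(struct work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->count++;
    pthread_cond_signal(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer: sleep until the predicate holds, then take one item.
 * Returns the number of items still pending. */
int queue_take(struct work_queue *q)
{
    int remaining;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0)                   /* re-test after every wakeup */
        pthread_cond_wait(&q->nonempty, &q->lock);
    assert(q->count > 0);                   /* predicate holds here */
    remaining = --q->count;
    pthread_mutex_unlock(&q->lock);
    return remaining;
}
```

A signal sent while nobody is waiting is simply lost, and that is harmless
here: a later queue_take sees count > 0 and never goes to sleep. With an
event or semaphore, the object's own state would have to be kept in step
with count by hand.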

>As far as I have learned, it is not possible to do a timed
>wait on a mutex or a semaphore.

Timed waits on mutexes are braindamaged for most kinds of work. They
are useful to people working in the real-time domain, so the 200X draft
of POSIX has added support for timed mutex waits---it was due to pressure
from some real time groups, apparently. In real time applications, the
duration of a critical region of code may be determined precisely,
so that a timed out mutex wait can serve as a watchdog.
You can find the implementation of pthread_mutex_timedlock in glibc 2.2. 
For reasons of efficiency, not every mutex type supports this operation, just
the default one.  Glibc 2.2 also adds barriers, and the POSIX timer functions:
timer_create and friends.
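
A minimal sketch of the watchdog use described above, assuming a system that
provides pthread_mutex_timedlock. The wrapper name lock_with_timeout is
hypothetical. Note that the function takes an absolute deadline measured
against CLOCK_REALTIME, not a relative interval.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical wrapper: try to lock `m`, giving up after `ms` milliseconds.
 * Returns 0 on success, ETIMEDOUT if the deadline passed, or another
 * error number. */
int lock_with_timeout(pthread_mutex_t *m, long ms)
{
    struct timespec deadline;

    assert(ms >= 0);

    /* Build the absolute deadline: now + ms. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;
    deadline.tv_sec  += ms / 1000;
    deadline.tv_nsec += (ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {      /* normalise nanoseconds */
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }
    return pthread_mutex_timedlock(m, &deadline);
}
```

In a real-time design where a critical region is known to take at most, say,
5 ms, a return of ETIMEDOUT from lock_with_timeout(&m, 5) indicates that some
other thread has overrun its budget.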

Also realize that the Linux pthread_mutex_t is a lot closer to the Windows
CRITICAL_SECTION than to the Windows mutex. Note that there is no timed
lock function for critical sections!

>Also, while under win32 the synchronization objects can have both
>interprocess and intraprocess scope, under linux the only object that can do
>this is the semaphore. 

The traditional UNIX semaphore, that is.

>So you can't have a mutex or a condition object that
>can be accessed by separate processes.

There is a provision in the POSIX interface for process shared mutexes and
conditions, but it's not implemented in Linux.
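
For reference, the POSIX provision looks roughly like this on an
implementation that does support the process-shared option (LinuxThreads, as
noted above, did not at the time). This is only a sketch: the struct and
function names are hypothetical, and it assumes mmap with MAP_SHARED and
MAP_ANONYMOUS is available so that a forked child sees the same memory.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical data living in memory shared between processes. */
struct shared_counter {
    pthread_mutex_t lock;
    long            value;
};

/* Map a shared region and initialise a process-shared mutex inside it.
 * Returns NULL if the mapping or the attribute is not supported. */
struct shared_counter *shared_counter_create(void)
{
    pthread_mutexattr_t attr;
    struct shared_counter *sc;
    int rc;

    sc = mmap(NULL, sizeof *sc, PROT_READ | PROT_WRITE,
              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (sc == MAP_FAILED)
        return NULL;

    rc = pthread_mutexattr_init(&attr);
    if (rc == 0) {
        rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        if (rc == 0)
            rc = pthread_mutex_init(&sc->lock, &attr);
        pthread_mutexattr_destroy(&attr);
    }
    if (rc != 0) {
        munmap(sc, sizeof *sc);
        return NULL;
    }
    sc->value = 0;
    return sc;
}

/* Add `n` to the counter under the shared mutex. */
void shared_counter_add(struct shared_counter *sc, long n)
{
    assert(sc != NULL);
    pthread_mutex_lock(&sc->lock);
    sc->value += n;
    pthread_mutex_unlock(&sc->lock);
}

/* Fork a child that adds `n`, then wait for it. Returns 0 on success. */
int add_in_child(struct shared_counter *sc, long n)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {                 /* child: same mutex, other process */
        shared_counter_add(sc, n);
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The mutex must live in the shared mapping itself; a mutex in ordinary
process memory is copied by fork() and the two copies do not exclude each
other.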

>And, lastly, it is not possible to
>wait on multiple objects simultaneously.

Again, this is a braindamaged concept to begin with, and severely limited
in Windows (only 64 handles can be waited on). Not to mention that
the WaitForMultipleObjects function is broken on Windows CE, so it
cannot be considered portable across all Win32 platforms.
Lastly, it has fairness issues: under the ``wait for any'' semantics, the
interface can report the identity of at most one ready object, regardless of
how many are actually ready. This can lead to one event being serviced
with priority over another one, depending on its position in the array.

With condition variables, your program is waiting for a *predicate* to become
true. The condition variable is just a place to put the thread to sleep.
If you want to wait for more than one predicate, just form their logical
conjunction or disjunction as needed, and ensure that signaling of the
condition variable is done in all the right circumstances, e.g.

    /* wait for any of three predicates */

    while (!predicate1() && !predicate2() && !predicate3())
        pthread_cond_wait(&cond, &mutex);

This is equivalent to waiting on three events. The thread is parked in some
wait function, and can wake up for any of three distinct reasons.
A better structure might be this:

    int p1 = 0, p2 = 0, p3 = 0;

    /* mutex assumed locked */
    for (;;) {
    p1 = predicate1();
    p2 = predicate2();
    p3 = predicate3();

    if (p1 || p2 || p3)
        break;

    pthread_cond_wait(&cond;, &mutex