August 18, 2009
Presenter: Tim Halloran, SureLogic
Overview: A two-hour presentation on concurrency in Java. It
didn't go into implementation details using the latest frameworks
(e.g., Fork/Join, Phasers) but gave a solid review of what to do
and what not to do. Tim obviously knows his stuff and was a great
speaker.
Following are notes I took during the presentation. Any error
in them is almost certainly my fault whether due to transcription
or misunderstanding.
SureLogic: located in Squirrel Hill, small outfit
- Common problems found in real-world code
- SL has tools to be able to look at software and judge
- Self-judge: 8 or 9 in concurrency, not a 10
- Reviewed some books (Java Puzzlers, Effective Java 2nd ed.)
I. Problem
- "Concurrency errors are the cockroaches of the software bug
world..."
- Goetz, "Is your Java program working by accident?": the movement
toward multicore on the desktop is bringing errors closer to devs
- "One of the first steps toward remediation [of race conditions]
is to understand how to correctly write reentrant code"
(Howard, LeBlanc, and Viega)
- Past: driven by Moore's Law; future: driven by Amdahl's Law
(performance is a function of how much of your program can be
parallelized)
- concurrency failures: difficult to diagnose, maddeningly
difficult to reproduce; often fail-slow, difficult to pin down
when something actually went wrong
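For reference, Amdahl's Law is usually written as
speedup = 1 / ((1 - p) + p / n), where p is the parallelizable
fraction of the program and n is the number of processors. The
formula is not in my notes from the talk; the sketch below just
evaluates it, and the class and method names are my own.

```java
// Illustrative only: evaluates Amdahl's Law, speedup = 1 / ((1 - p) + p / n).
// AmdahlSketch and speedup() are hypothetical names, not from the talk.
class AmdahlSketch {
    /**
     * @param parallelFraction fraction of the work that can run in parallel (0.0 to 1.0)
     * @param processors       number of processors available (at least 1)
     */
    static double speedup(double parallelFraction, int processors) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || processors < 1) {
            throw new IllegalArgumentException("need 0 <= p <= 1 and n >= 1");
        }
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / processors);
    }

    public static void main(String[] args) {
        // With 90% of the program parallelizable, each extra core helps less;
        // the speedup can never exceed 1 / (1 - p) = 10x.
        for (int n : new int[] {1, 2, 8, 64, 1024}) {
            System.out.printf("p=0.90, n=%d -> %.2fx%n", n, speedup(0.90, n));
        }
    }
}
```

The serial fraction sets the ceiling: no number of cores pushes the
90% case past 10x.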
II. JMM in ~10 minutes
- Chap 16 of JCiP is much better than the JLS
- Specifies minimal guarantees about when writes become visible
to other threads
- Memory model isn't just what you think it might be
- Mental model of single piece of data being read/written by
multiple threads is wrong -- processors have their own
caches. Different CPUs have different issues with cache
coherency.
- Compiler writers are REALLY clever, and that works GREAT for a
single thread. But you can be very surprised by reordering on
multiple threads.
- "Happens-Before" - defined by the JMM as a partial ordering, not
a total one; there may be clumps of actions that are ordered
among themselves but cannot be compared with each other
- Determines what one thread is guaranteed to see of another's
actions; thread B is guaranteed to see the results of action A
iff A happens-before B.
- happens-before always concerns two threads, not a thread and
"all others"
Rules:
- Program order: Each action in a thread happens-before every
other action in a thread that comes later in the program.
- Lock: An unlock on an intrinsic lock (or Lock object)
happens-before every subsequent lock on that same monitor lock.
- Volatile field: A write to a volatile field happens-before
every subsequent read of that same field. Everything the
writing thread did before the volatile write is also visible to
a thread that then reads the field. Without 'volatile' you're
guaranteed NOTHING. (It may work okay now, but it's an accident.)
- Thread start: a call to start() on a thread happens before any
actions in the started thread. (Sidenote: threads can only
communicate via fields.)
- Thread termination: all actions in a thread happen before any
other thread detects that a thread has terminated (via join()
or isAlive()).
- Interruption: A thread calling interrupt on another thread
happens before the interrupted thread detects the
interrupt. (Sidenote: if you interrupt a thread while it's in
sleep() it will throw an InterruptedException. Duh.)
- Finalizer: The end of an object's constructor happens before
the start of its finalizer. (However, see Effective Java: "Avoid
finalizers")
- Transitivity: If A happens before B and B happens before C, A
happens before C.
- (another rule, about final fields, that he'll bring up later)
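A minimal sketch of how several of these rules combine (program
order, volatile field, transitivity, thread start, thread
termination). This is my own illustration rather than code from the
talk; PiggybackSketch and its members are made-up names.

```java
// Hypothetical example: a plain field made visible by "piggybacking"
// on a volatile write/read pair.
class PiggybackSketch {
    private int payload;            // plain field, deliberately NOT volatile
    private volatile boolean ready; // the volatile flag carries the ordering

    void publish(int value) {
        payload = value; // (1) program order: happens-before (2)
        ready = true;    // (2) volatile write: happens-before any read that sees true
    }

    int awaitPayload() {
        while (!ready) {  // (3) volatile read; once it sees true, (2) happens-before (3)
            Thread.yield();
        }
        return payload;   // (4) by transitivity, (1) happens-before (4)
    }

    // Starts a reader thread, publishes 42, and returns what the reader saw.
    static int runOnce() {
        PiggybackSketch box = new PiggybackSketch();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = box.awaitPayload());
        reader.start(); // thread-start rule: the reader sees 'box' and 'seen'
        box.publish(42);
        try {
            reader.join(); // thread-termination rule: seen[0] is visible after join()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("reader saw " + runOnce()); // reader saw 42
    }
}
```

If 'ready' were not volatile, the reader could legally spin forever
or return 0, which is the "working by accident" case.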
III. Concurrency Anti-Patterns
- Approach: bad code to good code
- Why do these exist?
- Concurrency is difficult: avoid!
- Concurrency in frameworks: try to encapsulate, introduce
callbacks, get ready for poor docs
- Misconceptions: policies about what you can do with threads
(or the context you're in) are poorly documented, performance
"uber alles" ("we're getting rid of all these locks because
they're slowing us down"), myths started by "Cliff"
- But... my unit tests pass!
- Unit testing sucks for concurrency
- Integration testing better, stress testing on variety of hardware
- Emerging analysis tools (static and dynamic), SureLogic does
this (good to know!)
AP #1: Invisible unprotected operations
- a write performed by one thread isn't visible to others
- unsynchronized getters -- the setter is synchronized, but the
getter isn't; this *might* work because of the volatile write
rule (thread context piggyback), but it might not
Fix:
- protect getter/setter with lock on method (synchronized --
happens-before requires a consistent lock)
- use java.util.concurrent.atomic variables (e.g., AtomicInteger)
Example code for a simple counter: even among correct
implementations the performance profiles differ widely, including
on the same VM across different OSs. The main takeaway is that
Amdahl's Law holds: the impls using ThreadLocal are more easily
parallelizable.
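I didn't capture the counter code from the slides, so here is a
sketch of the two fixes as I understood them. All names are mine,
and the ThreadLocal variants he benchmarked are not shown.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names throughout; not the presenter's code.
interface Counter {
    void increment();
    int get();
}

// Fix 1: reader and writer both synchronize on the same lock (this).
class SynchronizedCounter implements Counter {
    private int count; // guarded by "this"

    @Override public synchronized void increment() { count++; }
    @Override public synchronized int get() { return count; }
}

// Fix 2: let an atomic variable provide both atomicity and visibility.
class AtomicCounter implements Counter {
    private final AtomicInteger count = new AtomicInteger();

    @Override public void increment() { count.incrementAndGet(); }
    @Override public int get() { return count.get(); }
}

class CounterDemo {
    // Runs `threads` workers that each call increment() `perThread` times.
    static int hammer(Counter counter, int threads, int perThread) {
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(new SynchronizedCounter(), 4, 100_000)); // 400000
        System.out.println(hammer(new AtomicCounter(), 4, 100_000));       // 400000
    }
}
```

Dropping 'synchronized' from get() in the first class would bring
back the anti-pattern: the count would still be updated atomically,
but a reader would have no guarantee of seeing the latest value.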
AP #2: Placebo Locks
- when a locking policy is not followed consistently, the lock
only exists to make the developer feel better
- example from java.util.logging: readers and writers must follow
the same locking policy, or synchronization is worthless: log()
checks to see if filter is null, but setFilter() doesn't follow
the same policy
- they looked at java.util.logging and it is (or was, ~year ago)
actually much better than log4j concurrency-wise
- Locking on stack objects: a no-no; threads communicate with fields
(!)
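A sketch of what "following the same policy" could look like for
the filter case. This is NOT the real java.util.logging source;
MiniLogger is a made-up stand-in, and using Predicate<String> as the
filter type is my assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical mini-logger illustrating a consistent locking policy.
class MiniLogger {
    private final Object lock = new Object();
    private Predicate<String> filter;                        // guarded by lock
    private final List<String> written = new ArrayList<>();  // guarded by lock

    // Writer side: takes the same lock the reader uses.
    void setFilter(Predicate<String> newFilter) {
        synchronized (lock) {
            filter = newFilter;
        }
    }

    void log(String message) {
        Predicate<String> current;
        synchronized (lock) {
            current = filter; // read once, under the lock
        }
        // The null check and the call use the same local copy, so the
        // filter cannot change between them. The foreign call is made
        // without holding the lock.
        if (current != null && !current.test(message)) {
            return;
        }
        synchronized (lock) {
            written.add(message);
        }
    }

    List<String> written() {
        synchronized (lock) {
            return new ArrayList<>(written); // copy, so callers never see guarded state
        }
    }

    public static void main(String[] args) {
        MiniLogger logger = new MiniLogger();
        logger.log("a");
        logger.setFilter(m -> m.startsWith("keep"));
        logger.log("drop me");
        logger.log("keep me");
        System.out.println(logger.written().equals(Arrays.asList("a", "keep me"))); // true
    }
}
```

The placebo version would synchronize setFilter() but read 'filter'
twice in log() with no lock, so the field could become null between
the check and the call.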
AP #3: Choosing the wrong lock
- Despite looking reasonable, this is a problem
- Static code locking on instance
- Synchronizing on a java.util.concurrent.locks.Lock (instead of
calling lock()/unlock() on it)
- Protect static variables with static locks (e.g., Foo.class)
- Locking on this.getClass() is always wrong (you could have 2+
separate locks)
- Contention: juc Lock gave ~4x better contended performance than
intrinsic locks in Java 5, down to ~2x in Java 6; an example on
Java 7 from Doug Lea narrows the gap even further
- If you can use intrinsic locks, they're simpler
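A sketch that pairs each kind of state with a matching lock, with
the wrong choices from the list left as comments. The class is my
own invention for illustration.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical class; shows which lock goes with which state.
class LockChoiceSketch {
    private static int created; // static state, guarded by LockChoiceSketch.class

    private int hits;                                  // instance state, guarded by hitsLock
    private final Lock hitsLock = new ReentrantLock();

    LockChoiceSketch() {
        // Wrong: synchronized (this) -- every instance would use a different lock.
        // Wrong: synchronized (getClass()) -- a subclass would lock its own Class object.
        synchronized (LockChoiceSketch.class) {
            created++;
        }
    }

    static int created() {
        synchronized (LockChoiceSketch.class) {
            return created;
        }
    }

    void hit() {
        // Wrong: synchronized (hitsLock) { hits++; } -- that compiles, but it takes
        // the Lock object's monitor, which is unrelated to lock()/unlock().
        hitsLock.lock();
        try {
            hits++;
        } finally {
            hitsLock.unlock(); // always release in finally
        }
    }

    int hits() {
        hitsLock.lock();
        try {
            return hits;
        } finally {
            hitsLock.unlock();
        }
    }

    // Hits one shared instance from several threads and returns the total.
    static int hammerHits(int threads, int perThread) {
        LockChoiceSketch shared = new LockChoiceSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    shared.hit();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.hits();
    }

    public static void main(String[] args) {
        System.out.println(hammerHits(4, 50_000)); // 200000
    }
}
```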
AP #4: Too much mutable state
- no happens-before between writes made in a constructor and
reads of that state by other threads
- special case: if field is final, the JMM will guarantee
initialization safety; this includes REACHABLE fields within the
final field
- Generally (?) this is only a problem when you leak 'this' from
the constructor
- Chapter in JCiP about this (initialization safety)
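A small sketch of a class that leans on the final-field guarantee:
all state is final, the reachable list is never modified after
construction, and 'this' does not escape the constructor. Route is a
made-up example, not something shown in the talk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable class; safe to share between threads without locks.
final class Route {
    private final String name;
    private final List<String> stops; // final, and its contents never change

    Route(String name, List<String> stops) {
        this.name = name;
        // Defensive copy: later changes to the caller's list cannot reach us.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
        // Note what is absent: no listener registration, no thread started,
        // nothing else that would hand 'this' to other code before we return.
    }

    String name() {
        return name;
    }

    List<String> stops() {
        return stops; // already unmodifiable, so handing it out is safe
    }

    public static void main(String[] args) {
        List<String> input = new ArrayList<>(Arrays.asList("Forbes", "Murray"));
        Route route = new Route("61C", input);
        input.add("Shady"); // does not affect the route
        System.out.println(route.name() + " has " + route.stops().size() + " stops"); // 61C has 2 stops
    }
}
```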
AP #5: Swing'n in the main thread
- Swing has thread confinement policies, it runs in a single
thread; "The EDT is responsible for executing any method that
modifies a component's state. This includes any component's
constructor..."
- Demo of Flashlight on simple GUI, showing shared state, what
thread was invoking what.
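A minimal sketch of keeping component construction on the EDT.
This is not the demo app from the talk; EdtSketch is a made-up name,
and I use only a JLabel so the example also runs without a display.

```java
import java.lang.reflect.InvocationTargetException;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

// Hypothetical example of Swing thread confinement.
class EdtSketch {
    private static JLabel status; // confined to the EDT: only touched inside EDT tasks

    // Builds the component on the EDT and reports whether the task ran there.
    static boolean buildOnEdt() {
        boolean[] ranOnEdt = new boolean[1];
        try {
            SwingUtilities.invokeAndWait(() -> {
                ranOnEdt[0] = SwingUtilities.isEventDispatchThread();
                status = new JLabel("built on the EDT"); // constructor runs on the EDT
            });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (InvocationTargetException e) {
            throw new IllegalStateException(e.getCause());
        }
        return ranOnEdt[0];
    }

    public static void main(String[] args) {
        // Wrong: calling new JLabel(...) right here, on the main thread.
        System.out.println("main is the EDT? " + SwingUtilities.isEventDispatchThread()); // false
        System.out.println("UI built on the EDT? " + buildOnEdt());                       // true
    }
}
```

A real application would more commonly call
SwingUtilities.invokeLater() from main and build its whole frame
inside that task; invokeAndWait() is used here only so the result
can be returned.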
AP #6: Dangerous observers
- Observer/Listener tough to implement concurrently
- One issue: adding/removing observers is done in multiple
places; iterating over observers when you notify may throw
ConcurrentModificationException
- Use locking? What happens when you call notify() on the
observer while you're holding a lock? How long will that be
held? -- "Never call foreign methods while holding a
lock."
- Fix #1: make a copy before iterating
- Fix #2: use CopyOnWriteArrayList
- Fix #3 (from me, design vs coding): only allow
initialization with known objects, don't add/remove
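A sketch of Fix #2. EventSource and the Consumer<String> observer
type are my own stand-ins. The demo is single-threaded, but it
exercises the same hazard: an observer list that changes while it is
being iterated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical subject class; observers are plain Consumer<String> callbacks.
class EventSource {
    // Each add/remove copies the backing array; iteration works on a snapshot.
    private final List<Consumer<String>> observers = new CopyOnWriteArrayList<>();

    void addObserver(Consumer<String> observer) {
        observers.add(observer);
    }

    void removeObserver(Consumer<String> observer) {
        observers.remove(observer);
    }

    void fire(String event) {
        // No lock is held while the foreign accept() methods run.
        for (Consumer<String> observer : observers) {
            observer.accept(event);
        }
    }

    // Observer B removes itself while being notified.
    static List<String> demo() {
        EventSource source = new EventSource();
        List<String> seen = new ArrayList<>();
        source.addObserver(e -> seen.add("A:" + e));
        source.addObserver(new Consumer<String>() {
            @Override public void accept(String e) {
                seen.add("B:" + e);
                source.removeObserver(this); // modifies the list mid-notification
            }
        });
        source.addObserver(e -> seen.add("C:" + e));
        source.fire("e1");
        source.fire("e2");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [A:e1, B:e1, C:e1, A:e2, C:e2]
    }
}
```

With a plain ArrayList the self-removal would corrupt the iteration
(a skipped observer or a ConcurrentModificationException). The
trade-off is that every add/remove copies the array, so this suits
lists that are read far more often than they are changed.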
IV. Q+A
- "other silver bullets?" "maybe -- other programming languages
may be there, but most of us aren't going to be using them;
various web containers and frameworks help with this; STM
sounds good, but even then you need to understand what you're
doing"