August 21, 2009

Notes on PittJUG: Stop Racing Toward Disaster

August 18, 2009
Presenter: Tim Halloran, SureLogic

Overview: A two-hour presentation on concurrency in Java. It didn't go into implementation details using the latest frameworks (e.g., Fork/Join, Phasers) but gave a solid review of what to do and what not to do. Tim obviously knows his stuff and was a great speaker.

Following are notes I took during the presentation. Any error in them is almost certainly my fault whether due to transcription or misunderstanding.

SureLogic: located in Squirrel Hill, small outfit

Common problems found in real-world code
SL has tools to be able to look at software and judge
Self-judge: 8 or 9 in concurrency, not a 10
Reviewed some books (Java Puzzlers, Effective Java 2.0)

I. Problem

"Concurrency errors are the cockroaches of the software bug world..."
Goetz, "Is your Java program working by accident?": the movement toward multicore on the desktop is bringing these errors closer to devs
"One of the first steps toward remediation [of race conditions] is to understand how to correctly write reentrant code" (Howard, LeBlanc, and Viega)
Past: driven by Moore's Law; future: driven by Amdahl's Law (performance is a function of how much your program can be parallelized)
concurrency failures: difficult to diagnose, maddeningly difficult to reproduce; often fail-slow, difficult to pin down when something actually went wrong

II. JMM in ~10 minutes

Chap 16 of JCiP much better than JLS
Specifies minimal guarantees about when writes become visible to other threads
Memory model isn't just what you think it might be
Mental model of single piece of data being read/written by multiple threads is wrong -- processors have their own caches. Different CPUs have different issues with cache coherency.
Compiler writers are REALLY clever, and that works GREAT for a single thread. But you can be very surprised by reordering on multiple threads.
"Happens-Before" - defined by JMM as a partial ordering, but it's not complete; there may be clumps of things that are ordered, but they cannot be compared
- Determines what a thread is guaranteed to see; thread B is guaranteed to see the results of action A iff A happens-before B's read.
- happens-before always concerns two threads, not a thread and "all others"

Rules:

Program order: Each action in a thread happens-before every other action in that same thread that comes later in the program.
Lock: An unlock on an intrinsic lock (or Lock object) happens-before every subsequent lock on that same monitor lock.
Volatile field: A write to a volatile field happens-before every subsequent read of that same field. Everything the writing thread did before the volatile write is also visible to a thread that then reads the field. Without 'volatile' you're guaranteed NOTHING. (It may work okay now, but it's an accident.)
Thread start: a call to start() on a thread happens before any actions in the started thread. (Sidenote: threads can only communicate via fields.)
Thread termination: all actions in a thread happen before any other thread detects that a thread has terminated (via join() or isAlive()).
Interruption: A thread calling interrupt on another thread happens before the interrupted thread detects the interrupt. (Sidenote: if you interrupt a thread while it's in sleep() it will throw an InterruptedException. Duh.)
Finalizer: The end of a constructor will be reached before the start of a finalizer. (However, see Effective Java: "Avoid finalizers")
Transitivity: If A happens before B and B happens before C, A happens before C.
(another rule about finally he'll bring up later)
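
To make the rules concrete for myself, here is a minimal sketch (mine, not from the talk) that chains program order, the volatile rule, thread start, thread termination, and transitivity. The class and method names (VolatilePublication, publish, awaitPayload, demo) are made up for illustration.

```java
// Hypothetical example: a plain field made visible by piggybacking on a volatile flag.
class VolatilePublication {
    private int payload;              // plain field, NOT volatile
    private volatile boolean ready;   // volatile flag that carries payload along

    void publish(int value) {
        payload = value;  // (1) ordinary write
        ready = true;     // (2) volatile write; (1) happens-before (2) by program order
    }

    // Spins until the flag is set, then returns the payload.
    int awaitPayload() {
        while (!ready) {  // (3) volatile read; (2) happens-before the read that sees true
            Thread.yield();
        }
        return payload;   // (4) sees the value from (1) by transitivity
    }

    static int demo() {
        VolatilePublication box = new VolatilePublication();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = box.awaitPayload());
        reader.start();       // thread start rule: start() happens-before the reader's actions
        box.publish(42);
        try {
            reader.join();    // thread termination rule: the reader's write to seen[0] is visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If the volatile keyword is dropped from ready, none of the numbered steps is ordered across threads any more, and the reader is allowed to spin forever or see a stale payload.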

III. Concurrency Anti-Patterns

Approach: bad code to good code
Why do these exist?
- Concurrency is difficult: avoid!
- Concurrency in frameworks: try to encapsulate, introduce callbacks, get ready for poor docs
- Misconceptions: policies about what you can do with threads (or the context you're in) are poorly documented, performance "uber alles" ("we're getting rid of all these locks because they're slowing us down"), myths started by "Cliff"
But... my unit tests pass!
- Unit testing sucks for concurrency
- Integration testing better, stress testing on variety of hardware
- Emerging analysis tools (static and dynamic), SureLogic does this (good to know!)

AP #1: Invisible unprotected operations

write performed by one thread isn't visible to others
unsynchronized getters -- the setter is synchronized, but the getter isn't; this *might* work because of the volatile write rule (thread context piggyback), but it might not

Fix:

protect getter/setter with lock on method (synchronized -- happens-before requires a consistent lock)
use java.util.concurrent.atomic variables (e.g., AtomicInteger)
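
A rough sketch of both fixes as I understood them; this is my own code, not the speaker's, and the class names (SynchronizedCounter, AtomicCounter, CounterDemo) are invented for the example.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Fix 1: reader and writer both synchronize on the same lock (this).
class SynchronizedCounter {
    private int value;                        // guarded by this
    synchronized void increment() { value++; }
    synchronized int get() { return value; }  // the getter takes the lock too
}

// Fix 2: delegate to an atomic variable, which handles visibility and atomicity.
class AtomicCounter {
    private final AtomicInteger value = new AtomicInteger();
    void increment() { value.incrementAndGet(); }
    int get() { return value.get(); }
}

class CounterDemo {
    // Hammers both counters from several threads and returns {synchronized total, atomic total}.
    static int[] run(int threads, int perThread) {
        SynchronizedCounter sc = new SynchronizedCounter();
        AtomicCounter ac = new AtomicCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    sc.increment();
                    ac.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { sc.get(), ac.get() };
    }
}
```

With 4 threads doing 10,000 increments each, both totals should come out to exactly 40,000; a plain int with an unsynchronized value++ would usually come up short.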

Example code for a simple counter; even with different types of correct implementations the performance profiles are quite different, even on the same VM on different OSs -- but one of the main things is that Amdahl's law holds -- the impls with ThreadLocal are more easily parallelizable.
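
I don't have the counter code from the slides, so the following is only my guess at what a ThreadLocal-based version could look like. Each thread increments its own cell with no lock and no contention, which is why this style parallelizes well. ThreadLocalCounter, sumAfterJoin, and demo are names I made up.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch: one private cell per thread, summed only after the writers are done.
class ThreadLocalCounter {
    // Registry of every thread's cell so a reader can find them all.
    private final List<long[]> cells = new CopyOnWriteArrayList<>();

    // Each thread lazily creates and registers its own single-element cell.
    private final ThreadLocal<long[]> local = ThreadLocal.withInitial(() -> {
        long[] cell = new long[1];
        cells.add(cell);
        return cell;
    });

    void increment() {
        local.get()[0]++;   // touches only this thread's cell
    }

    // Assumption: callers have joined all writer threads first, so the
    // thread termination rule makes every cell's final value visible here.
    long sumAfterJoin() {
        long total = 0;
        for (long[] cell : cells) {
            total += cell[0];
        }
        return total;
    }

    static long demo(int threads, int perThread) {
        ThreadLocalCounter counter = new ThreadLocalCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.sumAfterJoin();
    }
}
```

The trade-off is that the total is only trustworthy once the writers have finished; reading it while they are still running would need extra synchronization.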

AP #2: Placebo Locks

when a locking policy is not followed consistently, the lock only exists to make the developer feel better
example from java.util.logging: readers and writers must follow the same locking policy, or synchronization is worthless: log() checks to see if filter is null, but setFilter() doesn't follow the same policy
they looked at java.util.logging and it is (or was, ~year ago) actually much better than log4j concurrency-wise
Locking stack objects: nono; threads communicate with fields (!)
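
To illustrate a consistent locking policy, here is a toy logger of my own (TinyLogger is hypothetical and is not the java.util.logging source). Both the reader (log) and the writer (setFilter) go through the same lock, and the lock is a field rather than a stack-local object.

```java
import java.util.function.Predicate;

// Hypothetical logger: every access to 'filter' and 'accepted' uses the same lock.
class TinyLogger {
    private final Object lock = new Object();  // a field, so all threads share it
    private Predicate<String> filter;          // guarded by lock
    private int accepted;                      // guarded by lock

    void setFilter(Predicate<String> newFilter) {
        synchronized (lock) {                  // the writer follows the policy...
            filter = newFilter;
        }
    }

    boolean log(String message) {
        Predicate<String> current;
        synchronized (lock) {                  // ...and so does the reader
            current = filter;
        }
        // The filter is foreign code, so it is called outside the lock.
        if (current != null && !current.test(message)) {
            return false;
        }
        synchronized (lock) {
            accepted++;
        }
        return true;
    }

    int acceptedCount() {
        synchronized (lock) {
            return accepted;
        }
    }
}
```

A placebo version would synchronize in log() but assign filter directly in setFilter(), or would create "new Object()" inside the method and lock on that, which no other thread can ever contend for.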

AP #3: Choosing the wrong lock

Despite looking reasonable, this is a problem
Static code locking on instance
Synchronizing on a java.util.concurrent.locks.Lock (i.e., using it as an intrinsic lock instead of calling lock()/unlock())
Protect static variables with static locks (e.g., Foo.class)
Locking on this.getClass() is always wrong (you could have 2+ separate locks)
Contention: using juc.Lock was 4x better contended performance in Java5, down to ~2x in Java6; example in Java7 from Doug Lea diminishes gap even further
If you can use intrinsic locks, they're simpler
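
My own sketch of the "right lock" choices from this list, with made-up names (LockChoices, instanceCount, hits): static state is guarded by the class literal, and a Lock object is used through lock()/unlock() rather than synchronized.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class LockChoices {
    private static int instanceCount;               // static state, guarded by LockChoices.class

    private final Lock lock = new ReentrantLock();
    private int hits;                               // guarded by lock

    LockChoices() {
        // Static data needs a static lock. 'this' would give each instance its own
        // lock, and this.getClass() returns a different Class object in a subclass.
        synchronized (LockChoices.class) {
            instanceCount++;
        }
    }

    static int instances() {
        synchronized (LockChoices.class) {
            return instanceCount;
        }
    }

    void hit() {
        // Writing 'synchronized (lock)' here would compile, but it would use the
        // Lock object's intrinsic monitor and ignore the Lock API entirely.
        lock.lock();
        try {
            hits++;
        } finally {
            lock.unlock();                          // always release in finally
        }
    }

    int hits() {
        lock.lock();
        try {
            return hits;
        } finally {
            lock.unlock();
        }
    }
}
```

For a class this simple, plain synchronized methods would do the same job with less ceremony, which matches the advice to prefer intrinsic locks when they suffice.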

AP #4: Too much mutable state

no happens-before relationship between a constructor and other threads: writes made in the constructor aren't guaranteed to be visible to them
special case: if field is final, the JMM will guarantee initialization safety; this includes REACHABLE fields within the final field
Generally (?) this is only a problem when you leak 'this' from the constructor
Chapter in JCiP about this (initialization safety)
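
A small sketch of the final-field special case, written by me with a hypothetical Route class. All state is final, the list reachable through the final field is copied and never modified afterwards, and 'this' never leaves the constructor.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable class relying on initialization safety for final fields.
final class Route {
    private final String name;
    private final List<String> stops;   // the list is reachable through a final field

    Route(String name, List<String> stops) {
        this.name = name;
        // Defensive copy, wrapped read-only, so nothing can mutate it after construction.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
        // Note: 'this' is not registered, published, or passed anywhere from here.
    }

    String name() {
        return name;
    }

    List<String> stops() {
        return stops;
    }
}
```

Because there is no mutable state left, a Route can be handed to other threads without a lock; the caller's original list can keep changing without affecting it.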

AP #5: Swing'n in the main thread

Swing has thread confinement policies, it runs in a single thread; "The EDT is responsible for executing any method that modifies a component's state. This includes any component's constructor..."
Demo of Flashlight on simple GUI, showing shared state, what thread was invoking what.
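
The usual way to follow that confinement policy is to hand the work to the EDT instead of touching components from main. A minimal sketch of my own (EdtConfinement and buildOnEdt are invented names; it only creates a JLabel so it needs no display):

```java
import java.lang.reflect.InvocationTargetException;
import javax.swing.JLabel;
import javax.swing.SwingUtilities;

class EdtConfinement {
    // Creates and updates a component on the EDT; returns whether the work really ran there.
    static boolean buildOnEdt() {
        boolean[] ranOnEdt = new boolean[1];
        try {
            // invokeAndWait blocks the caller until the EDT has run the task.
            SwingUtilities.invokeAndWait(() -> {
                JLabel label = new JLabel("hello");  // even the constructor runs on the EDT
                label.setText("updated");            // and so does every later state change
                ranOnEdt[0] = SwingUtilities.isEventDispatchThread();
            });
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (InvocationTargetException e) {
            throw new IllegalStateException(e.getCause());
        }
        return ranOnEdt[0];
    }
}
```

In a real application main() would typically use SwingUtilities.invokeLater to build the whole UI and then return; invokeAndWait is used here only so the result can be handed back to the caller.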

AP #6: Dangerous observers

Observer/Listener tough to implement concurrently
One issue: adding/removing observers is done in multiple places; iterating over observers when you notify may throw ConcurrentModificationException
Use locking? What happens when you call notify() on the observer while you're holding a lock? How long will that be held? -- "Never call foreign methods while holding a lock."
Fix #1: make a copy before iterating
Fix #2: use CopyOnWriteArrayList
Fix #3 (from me, design vs coding): only allow initialization with known objects, don't add/remove
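
Here is roughly what Fix #2 looks like, in a sketch of my own (EventSource and its methods are hypothetical). Iteration over a CopyOnWriteArrayList works on a snapshot, so an observer can remove itself mid-notification, and no lock is held while the foreign accept() method runs.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

class EventSource {
    private final List<Consumer<String>> observers = new CopyOnWriteArrayList<>();

    void addObserver(Consumer<String> observer) {
        observers.add(observer);
    }

    void removeObserver(Consumer<String> observer) {
        observers.remove(observer);
    }

    void publish(String event) {
        // The for-each loop iterates over the array as it was when the loop
        // started; adds and removes during notification go to a fresh copy.
        for (Consumer<String> observer : observers) {
            observer.accept(event);   // foreign code, called without holding any lock
        }
    }

    // Counts deliveries when one observer unregisters itself during notification.
    static int demo() {
        EventSource source = new EventSource();
        AtomicInteger deliveries = new AtomicInteger();
        Consumer<String> oneShot = new Consumer<String>() {
            @Override
            public void accept(String event) {
                deliveries.incrementAndGet();
                source.removeObserver(this);   // mutates the list mid-iteration
            }
        };
        source.addObserver(oneShot);
        source.addObserver(event -> deliveries.incrementAndGet());
        source.publish("first");    // both observers are notified
        source.publish("second");   // only the second observer is left
        return deliveries.get();
    }
}
```

With a plain ArrayList the self-removal could blow up with a ConcurrentModificationException. The cost is that every add or remove copies the array, so this fits lists that are read far more often than they change.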

IV. Q+A

"other silver bullets?" "maybe -- other programming languages may be there, but most of us aren't going to be using them; various web containers and frameworks help with this; STM sounds good, but even then you need to understand what you're doing"
