GIST NOTES 11 - Advanced Concurrency and Non-blocking Threads
[DISCLAIMER: This is solely for non-commercial use. I don't claim ownership of this content. This is the crux of all my readings, studies, and analysis. Some of it consists of excerpts from famous books on the subject. Some of it is my contemplation upon experiments with hand-coded samples written in an IDE or notepad.
I've created this mainly to reduce an entire book into a few pages of critical content that we should never forget. Even after years, you don't need to read the entire book again to get back its philosophy. I hope these notes will help you replay the entire book in your mind once again.]
[JDK 7]
======================
CONCURRENT PROGRAMMING
======================
>in concurrent programming there are two basic units of execution: 1. processes and 2. threads
>concurrency, or the effect of concurrency, is possible even on single-processor systems, but it gives enhanced performance on the multiprocessor systems that are common today
Processes
----------
>a process is a self-contained execution environment
>it has a complete, private set of basic run-time resources, particularly its own memory space
>a process can be seen as a program or application; however, an application can also be a set of cooperating processes
>to facilitate communication between processes, most operating systems support Inter Process Communication(IPC) resources
>such resources may be pipes or sockets
>IPC is not just for communication between processes on the same system, but also for processes running on different systems
>most JVM implementations run as a single process; a java application can create additional processes using ProcessBuilder object
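>(a minimal sketch of my own, not from the tutorial, showing how a Java application can launch an additional process with ProcessBuilder; the "ls -l" command assumes a Unix-like system)
import java.io.IOException;

public class LaunchProcess {
    public static void main(String[] args) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder("ls", "-l"); // command to run in a new OS process
        pb.inheritIO();                                     // reuse this JVM's stdin/stdout/stderr
        Process p = pb.start();                             // creates the additional process
        int exit = p.waitFor();                             // wait for the child process to finish
        System.out.println("exit code: " + exit);
    }
}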
Threads
-------
>threads are sometimes called lightweight processes (LWP)
>both processes and threads provide an execution environment
>but creating a new thread requires fewer resources than creating a new process
>threads exist within a process; every process has at least one thread
>threads share the resources (like memory and open files) of the process under which they run
>this makes for efficient but potentially problematic communication among threads
>every Java application has at least one thread (along with several system threads, of course); execution always starts with the 'main' thread when the Java app is run
>one thread can interrupt another thread; on such occasions it is general programming practice for the interrupted thread to return from whatever it was doing
>suppose, a thread didn't receive InterruptedException for long time; how can it check whether it has received any interrupts or not?
>it can call Thread.interrupted(), which returns true if there is a pending interrupt for this thread; if true, the thread can return from its work or waiting state; interrupted() clears the interrupt flag when called, so the thread can receive the next interrupt signal on that flag
>upon the detection of interrupt, the thread can throw InterruptedException in order to propagate the signal to a centralized exception handler
Interrupt status flag
---------------------
>this flag is maintained internally in every thread
>[non static]Thread.interrupt() can set this flag
>[static]Thread.interrupted() returns the flag and clears it (being static, it reads the current thread's flag, so only the thread itself can use it to check its own status)
>[non static]Thread.isInterrupted() returns the flag, leaving its status intact; this method is typically used by other threads to check a given thread
>By convention, any method that exits by throwing InterruptedException clears the interrupt flag on exit; however, another thread might immediately set the flag to true again
>join() and sleep() have overloaded methods to specify time out; both of them throw InterruptedException upon interrupt() call on this thread by other threads
>interrupt(), interrupted(), isInterrupted()
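>(a minimal sketch of my own showing a worker that polls the interrupt status flag and exits cleanly; PollingWorker and processNext() are hypothetical names)
public class PollingWorker implements Runnable {
    public void run() {
        while (!Thread.currentThread().isInterrupted()) { // leaves the flag intact, unlike interrupted()
            processNext();                                // one unit of work, then re-check the flag
        }
        // some other thread called interrupt() on this worker's thread; fall out and finish
    }
    private void processNext() { /* placeholder for real work */ }
    // usage: Thread t = new Thread(new PollingWorker()); t.start(); ... t.interrupt();
}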
Thread Interference
-------------------
>interleaved, unsynchronized operations from multiple threads corrupting the shared data
Memory Consistency Errors
-------------------------
>errors that occur due to inconsistent 'view' of the shared data by a thread
>to avoid this error, an understanding of happens-before relationship is required
>this relationship guarantees that memory writes by one thread are visible to other threads, because the write happens-before the subsequent read by the other thread; synchronization is one of the things that create a happens-before relationship
>two more such things are Thread.start() and Thread.join()
>Thread.start(): every statement before this call happens-before all statements of the new thread
>Thread.join(): all statements of the target thread (the one being joined) happen-before the statements following the join() call in the joining thread
>synchronized keyword before constructor is illegal
>no need to synchronize constructors because, only one thread has access to an object which is being constructed through constructors
>but before that, make sure the reference to the object being constructed (through the constructor call) is not leaked prematurely; for example, adding the 'this' reference to a shared list inside the constructor will create concurrency issues
>synchronized methods are simple solutions to avoid thread interference and memory consistency errors
>final fields can be safely read through non-synchronized methods; even so, the blanket synchronized-methods approach can still present 'liveness' problems
Intrinsic lock or monitor lock
------------------------------
>these locks provide both exclusive access and the happens-before relationship essential to visibility
>these are nothing but object locks in java used internally for synchronized methods and blocks
>synchronizing more code than necessary can cause 'liveness' problems
>synchronized blocks allow the use of multiple locks for different sections of code, and keep lines that don't need synchronization out of synchronized regions (see the sketch below)
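>(a sketch along the lines of the Oracle tutorial's two-lock idiom; the field and lock names are illustrative)
public class TwoLockCounter {
    private long c1 = 0, c2 = 0;
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();

    public void inc1() {
        synchronized (lock1) { c1++; }   // callers of inc1() contend only on lock1
    }
    public void inc2() {
        synchronized (lock2) { c2++; }   // independent state guarded by its own lock
    }
}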
Reentrant synchronization
-------------------------
>a thread can acquire a lock it already owns again and again
>this facility enables reentrant synchronization
>hence easy to call another synchronized method for which the already acquired lock is sufficient
>without reentrant synchronization, a thread could block itself forever; avoiding that would require additional precautions
Atomic Action
-------------
>all steps happen at once; nobody can come in between
What are/aren't already atomic in java?
---------------------------------------
1.read or write for reference variables ARE
2.read or write for primitives except double and long ARE
3.read or write for volatile variables of all kinds ARE(reference, primitives including long & double)
4.increment/decrement operations ARE NOT atomic
>atomic operations cannot be interleaved hence they can be used without the fear of thread interference
>use of volatile variables reduces the risk of memory consistency errors
>because writing a volatile variable establishes a happens-before relationship with subsequent reads of that variable (what is all this mumbo jumbo about happens-before? I didn't understand a thing at first. Well, remember that nobody is allowed to keep cached data for volatile variables; so when someone writes a volatile variable, everyone who reads it subsequently will, by the very design and rules of volatile, get the latest update; hence any volatile write has a happens-before relationship with any upcoming read)
>not only is the latest write of a volatile variable visible to all threads; the side effects of the code that performed the update are also visible to all threads
>using atomic variables is more efficient than synchronized code, but a lot of responsibility (for avoiding memory consistency errors) is left to the users of atomic variables, which is not the case for synchronized code; that is, in synchronized code threads can be dumb and not care about the way other threads might change the shared data, but with atomic variables threads have to be aware of everyone else, their actions on the shared data, and how they can screw the data up
>so the choice between atomic variables and synchronized code depends on the size and complexity of the application
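>(a minimal sketch of an atomic counter; for a single state variable, AtomicInteger removes the need for synchronized methods)
import java.util.concurrent.atomic.AtomicInteger;

public class AtomicCounter {
    private final AtomicInteger count = new AtomicInteger(0);

    public void increment() { count.incrementAndGet(); } // atomic read-modify-write, no lock needed
    public int get()        { return count.get(); }
}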
Liveness: ability of a concurrent application to execute in a timely manner
--------
Liveness Problems: deadlock, starvation and livelock
Common one: deadlock - blocked by each other(everybody is waiting for each other)
Less common ones: starvation and livelock
Starvation: a thread is unable to gain regular access to shared resources and cannot make progress; happens when a greedy thread hogs the shared resources; for example, when a synchronized method takes a long time to return and is called frequently by one thread, other threads that need frequent access to other synchronized methods on the same object will often be blocked
Livelock: here threads are not blocked as in deadlock; they are simply too busy responding to each other to resume work; they get tangled up in responses, never moving on to the actual work; similar to two people attempting to pass each other in a corridor; happens when both threads are being blindly nice to each other, not realizing that they are doing an awkward dance (eek!)
Guarded Blocks
--------------
>a certain block of code must be entered only when a specific condition is met
>instead of waiting for the condition in a CPU-wasteful loop, use the wait()/notifyAll() mechanism of threads
>Note: There is a second notification method, notify, which wakes up a single thread. Because notify doesn't allow you to specify the thread that is woken up, it is useful only in massively parallel applications — that is, programs with a large number of threads, all doing similar chores. In such an application, you don't care which thread gets woken up.
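>(a minimal guarded-block sketch of my own; the class and flag names are illustrative)
public class GuardedFlag {
    private boolean ready = false;

    public synchronized void awaitReady() throws InterruptedException {
        while (!ready) {   // always re-check the guard condition after waking up
            wait();        // releases the intrinsic lock and waits for notifyAll()
        }
    }
    public synchronized void setReady() {
        ready = true;
        notifyAll();       // wake all waiting threads so they can re-check the guard
    }
}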
A Strategy for Defining Immutable Objects(no need for synchronization for these)
-----------------------------------------
The following rules define a simple strategy for creating immutable objects. Not all classes documented as "immutable" follow these rules. This does not necessarily mean the creators of these classes were sloppy — they may have good reason for believing that instances of their classes never change after construction. However, such strategies require sophisticated analysis and are not for beginners.
Don't provide "setter" methods — methods that modify fields or objects referred to by fields.
Make all fields final and private.
Don't allow subclasses to override methods. The simplest way to do this is to declare the class as final. A more sophisticated approach is to make the constructor private and construct instances in factory methods.
If the instance fields include references to mutable objects, don't allow those objects to be changed:
Don't provide methods that modify the mutable objects.
Don't share references to the mutable objects. Never store references to external, mutable objects passed to the constructor; if necessary, create copies, and store references to the copies. Similarly, create copies of your internal mutable objects when necessary to avoid returning the originals in your methods.
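>(a minimal immutable class sketch following the rules above; the class itself is illustrative, not from the tutorial)
public final class ImmutablePoint {                      // final: no subclass can override behavior
    private final int x;                                 // all fields final and private
    private final int y;

    public ImmutablePoint(int x, int y) { this.x = x; this.y = y; }

    public int getX() { return x; }                      // no setters anywhere
    public int getY() { return y; }

    public ImmutablePoint translate(int dx, int dy) {
        return new ImmutablePoint(x + dx, y + dy);       // "modification" returns a new object
    }
}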
--x--
java.util.concurrent.locks
--------------------------
>a framework for locking and waiting for conditions that is distinct from built-in synchronization and monitors
>Lock replaces use of synchronized methods and statements
>a Condition replaces the use of the Object monitor methods
Lock
----
The use of synchronized methods or statements provides access to the implicit monitor lock associated with every object, but forces all lock acquisition and release to occur in a block-structured way: when multiple locks are acquired they must be released in the opposite order, and all locks must be released in the same lexical scope in which they were acquired.
While the scoping mechanism for synchronized methods and statements makes it much easier to program with monitor locks, and helps to avoid many common programming errors involving locks, there are occasions where you need to work with locks in a more flexible way. For example, some algorithms for traversing concurrently accessed data structures require the use of "hand-over-hand" or "chain locking": you acquire the lock of node A, then node B, then release A and acquire C, then release B and acquire D and so on. Implementations of the Lock interface enable the use of such techniques by allowing a lock to be acquired and released in different scopes, and allowing multiple locks to be acquired and released in any order.
With this increased flexibility comes additional responsibility. The absence of block-structured locking removes the automatic release of locks that occurs with synchronized methods and statements. In most cases, the following idiom should be used:
Lock l = ...;
l.lock();
try {
// access the resource protected by this lock
} finally {
l.unlock();
}
When locking and unlocking occur in different scopes, care must be taken to ensure that all code that is executed while the lock is held is protected by try-finally or try-catch to ensure that the lock is released when necessary.
Lock implementations provide additional functionality over the use of synchronized methods and blocks by providing a non-blocking attempt to acquire a lock (tryLock()), an attempt to acquire the lock that can be interrupted (lockInterruptibly()), and an attempt to acquire the lock that can time out (tryLock(long, TimeUnit)).
A Lock class can also provide behavior and semantics that is quite different from that of the implicit monitor lock, such as guaranteed ordering, non-reentrant usage, or deadlock detection. If an implementation provides such specialized semantics then the implementation must document those semantics.
Note that Lock instances are just normal objects and can themselves be used as the target in a synchronized statement. Acquiring the monitor lock of a Lock instance has no specified relationship with invoking any of the lock() methods of that instance. It is recommended that to avoid confusion you never use Lock instances in this way, except within their own implementation.
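>(a minimal sketch of my own using ReentrantLock with a timed tryLock; the method name is illustrative)
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class TimedLockDemo {
    private final Lock lock = new ReentrantLock();

    public boolean updateIfAvailable() throws InterruptedException {
        if (!lock.tryLock(1, TimeUnit.SECONDS)) {  // give up after one second instead of blocking forever
            return false;                          // could not acquire the lock in time
        }
        try {
            // access the resource protected by this lock
            return true;
        } finally {
            lock.unlock();                         // always release in finally
        }
    }
}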
The java.util.concurrent package defines three executor interfaces:
-------------------------------------------------------------------
1.Executor, a simple interface that supports launching new tasks.
2.ExecutorService, a subinterface of Executor, which adds features that help manage the lifecycle, both of the individual tasks and of the executor itself.
3.ScheduledExecutorService, a subinterface of ExecutorService, supports future and/or periodic execution of tasks.
Runnable - a task that returns nothing
Callable - a task that returns something
Future - which helps to retrieve the return value of a Callable; helps to manage the status of Runnable and Callable tasks
ExecutorService - supports creation and management of Runnable/Callable through Future objects; also supports managing the shutdown of the executor
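>(a minimal sketch of submitting a Callable to an ExecutorService and reading the result through a Future; the pool size and task are illustrative)
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Future<Integer> result = pool.submit(new Callable<Integer>() {
            public Integer call() {
                return 6 * 7;                 // the task returns a value
            }
        });
        System.out.println(result.get());     // blocks until the task completes
        pool.shutdown();                      // graceful shutdown of the executor
    }
}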
Creating Your Own Thread Pool
-----------------------------
>determining thread pool size (fixed or dynamic)
>job addition (single thread or multithreads add jobs)
>workers should be responsive to the manager all the time
>premature shutdown/graceful shutdown of all threads
>manager(or main thread) can wait for all threads to complete through join() method
>workers can wait on empty queue
>notification is necessary whenever a new job is added to the queue
>when a thread is interrupted out of wait(), and the wait() statement is wrapped in try-catch within a synchronized block, the thread must reacquire the lock before control enters the catch block to handle InterruptedException; if you put a sleep statement inside the catch block, the thread sleeps while holding the lock, and no other thread can use that lock
>the above situation bites you if the statements after the wait()-try-catch section are badly coded; when the manager interrupts all the workers to shut down, the first interrupted thread can end up holding the lock in its catch block and sleeping forever, so no other thread can proceed to a graceful shutdown; the thread pool will never shut down and never do any work either
>in the middle of an atomic operation do not give up your lock to others; if you do, when you (the thread) come back, start the atomic operation all over again (so be mindful of where you put wait() calls)
>PThread - POSIX standard threads usually available in Unix systems; POSIX-Portable Operating System Interface
Guarded Block - sensitive section of code block that accesses shared data
Mutex - mutual exclusion mechanism to protect guarded blocks
Monitor - mechanism to protect guarded blocks using mutex objects such as locks
Semaphore - an alternate mechanism to protect guarded blocks; differs from a monitor in that a semaphore has more than one identical permit (instead of the single lock in a monitor), keeps handing them out to threads while tracking the number remaining, and forces new threads to wait once all permits are taken; it also doesn't insist on ownership (in a monitor, the thread that acquired the lock has complete ownership over it and decides when to release it)
Dining philosopher problem - thread contention illustrated by shared forks
IPC Communication Through: sockets, signal handlers, shared memory, semaphores, and files
-----------------------------------------------------------------------------------------------
Java Concurrency In Practice By Brian, Tim, Joshua, Joseph, David and Doug Lea
-----------------------------------------------------------------------------------------------
[Reading this book gives you nightmares. It is as if everything you know for sure about Java goes wrong in a concurrent environment]
Threads are more basic units of scheduling than processes.
Since threads share same heap/memory and resources from their process, they can communicate more efficiently.
Threads are useful in GUI applications for improving responsiveness of the user interface, and in server applications for improving resource utilization and throughput.
They simplify the implementation of the JVM (GC usually runs in one or more dedicated threads).
Java's java.nio package is for non-blocking IO (developed for use by single-threaded programs that must not block).
[[[Frameworks (such as servlets, RMI, etc.) introduce concurrency into applications by calling application components from framework threads. Components invariably access application state, thus requiring that all code paths accessing that state be thread-safe.]]]
Use of the following could introduce concurrency issues into your SINGLE threaded program
-----------------------------------------------------------------------------------------
>Timer - TimerTasks might access shared data in future in parallel to your application
>Servlets and Java Server Pages (JSP) - servlets need to be thread-safe because the web server employs multiple threads to handle high-volume HTTP requests; even if a servlet is used by only one thread, it can still access application-wide objects like ServletContext (application-scoped) or shared session objects (HttpSession)
>RMI - remote object calls happen in their own threads created by RMI framework; so remote objects should ensure concurrent access for themselves as well as the data objects they expose
>Swing and AWT - GUI apps are inherently asynchronous; to ensure responsiveness of GUI at all times, Swing and AWT create a separate thread for handling user-initiated events, and to update the GUI visible to the user; Swing components(like JTable) are not thread-safe; instead thread-safety is achieved by confining all access to GUI to a single thread; any code which wants to control GUI should run under the single event-dispatcher thread
#often it is effective to place synchronization as close to shared data(or within shared data object) so that thread-safety is ensured everywhere
#it is also imperative to see what is atomic at every level of programming; though you use a thread-safe data structure, it may not provide atomicity at your application level, and hence synchronization is needed in your application level as well
>When designing thread-safe classes, good object-oriented techniques (encapsulation, immutability, and clear specification of invariants) are your best friends.
>A class is [Thread Safe] when it continues to behave correctly when accessed from multiple threads
>Thread safe classes encapsulate any needed synchronization so that clients need not provide their own
>stateless objects(which have no fields, relying only on method local variables) are always thread safe(because local variables on stack are specific to the executing thread and not shared)
>Stateless servlets are great because they don't require thread-safety code
>Race Condition: The possibility of incorrect results in the presence of unlucky timing; the correctness of a computation depends on the relative timing or interleaving of multiple threads by the runtime; in other words, getting the correct result depends on chance factor;
>The most common type of race condition is check-then-act, where a potentially stale observation is used to make a decision on what to do next
>race condition is often confused with the related term data race; a data race happens when two threads share a non-final field without coordination, i.e. whenever a thread writes a variable that might be read next by another thread, or reads a variable that might have been written last by another thread; not all race conditions are data races, and not all data races are race conditions, but both make an application behave unpredictably in a concurrent environment
>lazy initialization(initialize only when it is required, and only once) can cause race conditions if not synchronized [e.g. concurrency problem in singleton pattern]
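>(a sketch in the spirit of the book's LazyInitRace example; ExpensiveObject is a placeholder)
public class LazyInitRace {
    private static ExpensiveObject instance = null;

    // Unsafe: two threads can both observe null and both create an instance (check-then-act race).
    public static ExpensiveObject getInstanceUnsafe() {
        if (instance == null) {
            instance = new ExpensiveObject();
        }
        return instance;
    }

    // One simple fix: make the whole check-then-act compound action atomic.
    public static synchronized ExpensiveObject getInstanceSafe() {
        if (instance == null) {
            instance = new ExpensiveObject();
        }
        return instance;
    }

    static class ExpensiveObject { }
}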
>Compound Actions - sequences of operations that must be executed atomically in order to remain thread-safe
>when there is only one state variable(instance field or otherwise), use Atomic data type to ensure thread-safety; when you have more than one state variable in your object, using atomic data type is not just enough
>to preserve state consistency, update related variables in a single atomic operation
>intrinsic locks (or monitor locks) in java act as mutexes(mutual exclusion locks)
>Reentrancy means that locks are acquired on a per-thread rather than per-invocation basis. [7] Reentrancy is implemented by associating with each lock an acquisition count and an owning thread. When the count is zero, the lock is considered unheld. When a thread acquires a previously unheld lock, the JVM records the owner and sets the acquisition count to one. If that same thread acquires the lock again, the count is incremented, and when the owning thread exits the synchronized block, the count is decremented. When the count reaches zero, the lock is released.
[7] This differs from the default locking behavior for pthreads (POSIX threads) mutexes, which are granted on a per invocation basis.
>Reentrancy facilitates encapsulation of locking behavior, and thus simplifies the development of object-oriented concurrent code. Without reentrant locks, the very natural-looking code in Listing 2.7, in which a subclass overrides a synchronized method and then calls the superclass method, would deadlock.
>synchronization is also about visibility of data modifications across all threads (no surprises to any thread)
>statement reordering (executing in a different order from the code) can be done by the compiler, the JVM memory model, and the CPU to take advantage of caching data between operations and to improve performance; this reordering can give the weirdest results in a concurrent environment where thread-safety is missing
>3.1.2. Non-atomic 64-bit Operations: When a thread reads a variable without synchronization, it may see a stale value, but at least it sees a value that was actually placed there by some thread rather than some random value. This safety guarantee is called out-of-thin-air safety. Out-of-thin-air safety applies to all variables, with one exception: 64-bit numeric variables (double and long) that are not declared volatile. The Java Memory Model requires fetch and store operations to be atomic, but for non-volatile long and double variables, the JVM is permitted to treat a 64-bit read or write as two separate 32-bit operations. If the reads and writes occur in different threads, it is therefore possible to read a non-volatile long and get back the high 32 bits of one value and the low 32 bits of another.[3] Thus, even if you don't care about stale values, it is not safe to use shared mutable long and double variables in multithreaded programs unless they are declared volatile or guarded by a lock.
[3] When the Java Virtual Machine Specification was written, many widely used processor architectures could not efficiently provide atomic 64 bit arithmetic operations.
>Locking is not just about mutual exclusion; it is also about memory visibility. To ensure that all threads see the most up to date values of shared mutable variables, the reading and writing threads must synchronize on a common lock.
>The Java language also provides an alternative, weaker form of synchronization, volatile variables, to ensure that updates to a variable are propagated predictably to other threads. When a field is declared volatile, the compiler and runtime are put on notice that this variable is shared and that operations on it should not be reordered with other memory operations. Volatile variables are not cached in registers or in caches where they are hidden from other processors, so a read of a volatile variable always returns the most recent write by any thread.
>Use volatile variables only when they simplify implementing and verifying your synchronization policy; avoid using volatile variables when verifying correctness, would require subtle reasoning about visibility. Good uses of volatile variables include ensuring the visibility of their own state, that of the object they refer to, or indicating that an important lifecycle event (such as initialization or shutdown) has occurred.
>[6] Debugging tip: For server applications, be sure to always specify the -server JVM command line switch when invoking the JVM, even for development and testing. The server JVM performs more optimization than the client JVM, such as hoisting variables out of a loop that are not modified in the loop; code that might appear to work in the development environment (client JVM) can break in the deployment environment (server JVM). For example, had we "forgotten" to declare the variable asleep as volatile in Listing 3.4, the server JVM could hoist the test out
of the loop (turning it into an infinite loop), but the client JVM would not. An infinite loop that shows up in development is far less costly than one that shows up only in production.
Listing 3.4. Counting Sheep.
volatile boolean asleep;
...
while (!asleep)
countSomeSheep();
>Locking can guarantee both visibility and atomicity; volatile variables can only guarantee visibility.
>You can use volatile variables only when all the following criteria are met:
Writes to the variable do not depend on its current value, or you can ensure that only a single thread ever updates the value;
The variable does not participate in invariants with other state variables; and
Locking is not required for any other reason while the variable is being accessed.
>Listing 3.6. Allowing Internal Mutable State to Escape. Don't Do this.
class UnsafeStates {
private String[] states = new String[] {
"AK", "AL" ...
};
public String[] getStates() { return states; }
}
>Publishing states in this way is problematic because any caller can modify its contents. In this case, the states array has escaped its intended scope, because what was supposed to be private state has been effectively made public.
Publishing an object also publishes any objects referred to by its non private fields. More generally, any object that is reachable from a published object by following some chain of non private field references and method calls has also been published.
>From the perspective of a class C, an alien method is one whose behavior is not fully specified by C. This includes methods in other classes as well as overrideable methods (neither private nor final) in C itself. Passing an object to an alien method must also be considered publishing that object. Since you can't know what code will actually be invoked, you don't know that the alien method won't publish the object or retain a reference to it that might later be used from another thread. Whether another thread actually does something with a published reference doesn't really matter, because the risk of misuse is still present.[7] Once an object escapes, you have to assume that another class or thread may, maliciously or carelessly, misuse it. This is a compelling reason to use encapsulation: it makes it practical to analyze programs for
correctness and harder to violate design constraints accidentally.
[7] If someone steals your password and posts it on the alt.free passwords newsgroup, that information has escaped: whether or not someone has (yet) used those credentials to create mischief, your account has still been compromised. Publishing a reference poses the same sort of risk.
>A final mechanism by which an object or its internal state can be published is to publish an inner class instance, as shown in ThisEscape in Listing 3.7. When ThisEscape publishes the EventListener, it implicitly publishes the enclosing ThisEscape instance as well, because inner class instances contain a hidden reference to the enclosing instance.
28 Java Concurrency In Practice
Listing 3.7. Implicitly Allowing the this Reference to Escape. Don't Do this.
public class ThisEscape {
public ThisEscape(EventSource source) {
source.registerListener(
new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
});
}
}
>an object is in a predictable, consistent state only after its constructor returns, so publishing an object from within its constructor can publish an incompletely constructed object. This is true even if the publication is the last statement in the constructor. If the this reference escapes during construction, the object is considered not properly constructed.[8]
[8] More specifically, the this reference should not escape from the thread until after the constructor returns. The this reference can be stored somewhere by the constructor as long as it is not used by another thread until after construction. SafeListener in Listing 3.8 uses this technique. Do not allow the this reference to escape during construction.
>Calling an overrideable instance method (one that is neither private nor final) from the constructor can also allow the this reference to escape. If you are tempted to register an event listener or start a thread from a constructor, you can avoid the improper construction by using a private constructor and a public factory method, as shown in SafeListener in Listing 3.8.
Listing 3.8. Using a Factory Method to Prevent the this Reference from Escaping During Construction.
public class SafeListener {
private final EventListener listener;
private SafeListener() {
listener = new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
};
}
public static SafeListener newInstance(EventSource source) {
SafeListener safe = new SafeListener();
source.registerListener(safe.listener);
return safe;
}
}
>3.3. Thread Confinement
Accessing shared, mutable data requires using synchronization; one way to avoid this requirement is to not share. If data is only accessed from a single thread, no synchronization is needed. This technique, thread confinement, is one of the simplest ways to achieve thread safety. When an object is confined to a thread, such usage is automatically thread safe even if the confined object itself is not [CPJ 2.3.2].
Swing uses thread confinement extensively. The Swing visual components and data model objects are not thread safe;
instead, safety is achieved by confining them to the Swing event dispatch thread. To use Swing properly, code running in threads other than the event thread should not access these objects. (To make this easier, Swing provides the invokeLater mechanism to schedule a Runnable for execution in the event thread.) Many concurrency errors in Swing applications stem from improper use of these confined objects from another thread.
Another common application of thread confinement is the use of pooled JDBC (Java Database Connectivity) Connection
objects. The JDBC specification does not require that Connection objects be thread safe.[9] In typical server applications, a thread acquires a connection from the pool, uses it for processing a single request, and returns it. Since most requests, such as servlet requests or EJB (Enterprise JavaBeans) calls, are processed synchronously by a single thread, and the pool will not dispense the same connection to another thread until it has been returned, this pattern of connection management implicitly confines the Connection to that thread for the duration of the request.
[9] The connection pool implementations provided by application servers are thread safe; connection pools are necessarily accessed from multiple threads, so a non thread safe implementation would not make sense.
Just as the language has no mechanism for enforcing that a variable is guarded by a lock, it has no means of confining an object to a thread. Thread confinement is an element of your program's design that must be enforced by its implementation. The language and core libraries provide mechanisms that can help in maintaining thread confinement (local variables and the ThreadLocal class), but even with these, it is still the programmer's responsibility to ensure that thread-confined objects do not escape from their intended thread.
Stack Confinement
-----------------
>a special case of thread confinement
>an object can only be reached through local variables - this is called stack confinement
>also for primitives, the language semantics ensure that primitive local variables are always stack confined
>stack confinement, however, exists only in the heads of the programmers, so it has to be documented so that future maintainers do not let the stack-confined objects be published outside
ThreadLocal
-----------
>wrap global variables in ThreadLocal object(static private field) to confine it within a thread
>ThreadLocal maintains an internal map for each thread and serves the request according to the calling thread(that is current thread)
Listing 3.10. Using ThreadLocal to Ensure thread Confinement.
private static ThreadLocal<Connection> connectionHolder
    = new ThreadLocal<Connection>() {
        public Connection initialValue() {
            return DriverManager.getConnection(DB_URL);
        }
    };
public static Connection getConnection() {
return connectionHolder.get();
}
>when the thread exits, the thread-local copies specific to that thread become available for garbage collection
Immutability
------------
>Immutable objects are always thread safe
>[16] While it may seem that field values set in a constructor are the first values written to those fields and therefore that there are no "older" values to see as stale values, the Object constructor first writes the default values to all fields before subclass constructors run. It is therefore
possible to see the default value for a field as a stale value.
>Using a static initializer is often the easiest and safest way to publish objects that can be statically constructed:
public static Holder holder = new Holder(42);
Static initializers are executed by the JVM at class initialization time; because of internal synchronization in the JVM, this mechanism is guaranteed to safely publish any objects initialized in this way [JLS 12.4.2].
>4.1. Designing a Thread safe Class
The design process for a thread safe class should include these three basic elements:
Identify the variables that form the object's state;
Identify the invariants that constrain the state variables;
Establish a policy for managing concurrent access to the object's state.
>Encapsulating data within an object confines access to the data to the object's methods, making it easier to ensure that the data is always accessed with the appropriate lock held.
>Confinement makes it easier to build thread safe classes because a class that confines its state can be analyzed for thread safety without having to examine the whole program.
>Priority inheritance is when a thread gets a temporary bump in priority because of the threads that are waiting on locks it owns. With priority inheritance, if a low-priority thread owns a lock that a high-priority thread later wants, it will be bumped to the higher priority. The reason: the higher-priority thread is blocked waiting for this lock, so the low-priority thread needs to run at a higher priority in order to finish up and free the lock for the high-priority thread. The low-priority thread is "inheriting" the high priority of the high-priority thread.
>If a class is composed of multiple independent thread safe state variables and has no operations that have any invalid state transitions, then it can delegate thread safety to the underlying state variables.
>If a state variable is thread safe, does not participate in any invariants that constrain its value, and has no prohibited state transitions for any of its operations, then it can safely be published.
>If extending a class to add another atomic operation is fragile because it distributes the locking code for a class over multiple classes in an object hierarchy, client side locking is even more fragile because it entails putting locking code for class C into classes that are totally unrelated to C. Exercise care when using client side locking on classes that do not commit to their locking strategy. Client side locking has a lot in common with class extension: they both couple the behavior of the derived class to the implementation of the base class. Just as extension violates encapsulation of implementation [EJ Item 14], client side locking violates encapsulation of synchronization policy.
>Document a class's thread safety guarantees for its clients; document its synchronization policy for its maintainers.
>Because the synchronized collections commit to a synchronization policy that supports client side locking, [1] it is possible to create new operations that are atomic with respect to other collection operations as long as we know which lock to use. The synchronized collection classes guard each method with the lock on the synchronized collection object itself.
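>(a sketch of client-side locking in the spirit of the book's put-if-absent helper; it works only because Vector guards its methods with its own intrinsic lock)
import java.util.Vector;

public class ListHelper {
    public static <E> boolean putIfAbsent(Vector<E> list, E x) {
        synchronized (list) {                 // same lock the Vector uses internally
            boolean absent = !list.contains(x);
            if (absent) {
                list.add(x);                  // contains + add become one atomic compound action
            }
            return absent;
        }
    }
}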
>Listing 5.3. Iteration that may Throw ArrayIndexOutOfBoundsException.
for (int i = 0; i < vector.size(); i++)
doSomething(vector.get(i));
>using iterators does not obviate the need to lock the collection during iteration if other threads can concurrently modify it. The iterators returned by the synchronized collections are not designed to deal with concurrent modification, and they are fail-fast, meaning that if they detect that the collection has changed since iteration began, they throw the unchecked ConcurrentModificationException.
>These fail-fast iterators are not designed to be foolproof; they are designed to catch concurrency errors on a "good faith effort" basis and thus act only as early warning indicators for concurrency problems. They are implemented by associating a modification count with the collection: if the modification count changes during iteration, hasNext or next throws ConcurrentModificationException. However, this check is done without synchronization, so there is a risk of seeing a stale value of the modification count and therefore that the iterator does not realize a modification has been made. This was a deliberate design tradeoff to reduce the performance impact of the concurrent modification detection code.[2]
>[2] ConcurrentModificationException can arise in single threaded code as well; this happens when objects are removed from the collection directly rather than through Iterator.remove()
>for-each loop also internally uses iterators to iterate the collection and hence for-each loop also can throw ConcurrentModificationException if not guarded by a lock
>but locking the whole iteration loop also hurts performance(with starvation and deadlock risk as well)
>possibly, the collection can be cloned in each thread for iteration (cloning itself should be guarded) if that is less costly than concurrently iterating the same collection via multiple threads
>a hidden iterator comes into play when we print a collection; internally the toString() method iterates over the elements, so a ConcurrentModificationException is possible while printing a collection as well
>If HiddenIterator wrapped the HashSet with a synchronizedSet, encapsulating the synchronization, this sort of error would not occur. Just as encapsulating an object's state makes it easier to preserve its invariants, encapsulating its synchronization makes it easier to enforce its synchronization policy.
>Replacing synchronized collections with concurrent collections can offer dramatic scalability improvements with little risk.
>While you can simulate the behavior of a Queue with a List (in fact, LinkedList also implements Queue), the Queue classes were added because eliminating the random-access requirements of List admits more efficient concurrent implementations.
>Concurrent collections, e.g. ConcurrentHashMap: Instead of synchronizing every method on a common lock, restricting access to a single thread at a time, it uses a fine grained locking mechanism called lock striping (see Section 11.4.3) to allow a greater degree of shared access.
>Tradeoff with concurrent collections: the iterator is not absolute; it doesn't give a snapshot view of the collection; parallel modifications while iterating are allowed; size() and isEmpty() only report an approximate state at that moment, which can change every instant; basically, the absoluteness (determinability) we had with synchronized collections is lost in concurrent collections
>Only if your application needs to lock the map for exclusive access [3] is ConcurrentHashMap not an appropriate drop in replacement.
>Since a ConcurrentHashMap cannot be locked for exclusive access, we cannot use client-side locking to create new atomic operations such as put-if-absent; but these are already provided by the concurrent collections
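>(a minimal sketch using the built-in atomic put-if-absent of ConcurrentMap)
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class PutIfAbsentDemo {
    public static void main(String[] args) {
        ConcurrentMap<String, Integer> map = new ConcurrentHashMap<String, Integer>();
        Integer previous = map.putIfAbsent("answer", 42);  // atomic check-then-act done by the collection
        System.out.println(previous == null ? "inserted" : "already present: " + previous);
    }
}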
>CopyOnWriteArrayList/CopyOnWriteArraySet: create new collection whenever it is modified;
>Iterators for the copy on write collections retain a reference to the backing array that was current at the start of iteration, and since this will never change, they need to synchronize only briefly to ensure visibility of the array contents. As a result, multiple threads can iterate the collection without interference from one another or from threads wanting to modify the collection. The iterators returned by the copy on write collections do not throw ConcurrentModificationException and return the elements exactly as they were at the time the iterator was created, regardless of subsequent modifications.
>Obviously, there is some cost to copying the backing array every time the collection is modified, especially if the collection is large; the copy on write collections are reasonable to use only when iteration is far more common than modification. This criterion exactly describes many event notification systems: delivering a notification requires iterating the list of registered listeners and calling each one of them, and in most cases registering or unregistering an event listener is far less common than receiving an event notification.
>BlockingQueue simplifies the implementation of producer consumer designs with any
number of producers and consumers. One of the most common producer consumer designs is a thread pool coupled
with a work queue; this pattern is embodied in the Executor task execution framework
>If the producers consistently generate work faster than the consumers can process it, eventually the application will run out of memory because work items will queue up without bound. Again, the blocking nature of put greatly simplifies coding of producers; if we use a bounded queue, then when the queue fills up the producers block, giving the consumers time to catch up because a blocked producer cannot generate more work.
>Bounded queues are a powerful resource management tool for building reliable applications: they make your program
more robust to overload by throttling activities that threaten to produce more work than can be handled.
>if blocking queues don't fit easily into your design, you can create other blocking data structures using Semaphore
>The last BlockingQueue implementation, SynchronousQueue, is not really a queue at all, in that it maintains no storage
space for queued elements. Instead, it maintains a list of queued threads waiting to enqueue or dequeue an element. In
the dish washing analogy, this would be like having no dish rack, but instead handing the washed dishes directly to the next available dryer. While this may seem a strange way to implement a queue, it reduces the latency associated with moving data from producer to consumer because the work can be handed off directly. (In a traditional queue, the enqueue and dequeue operations must complete sequentially before a unit of work can be handed off.) The direct handoff also feeds back more information about the state of the task to the producer; when the handoff is accepted, it knows a consumer has taken responsibility for it, rather than simply letting it sit on a queue somewhere much like the difference between handing a document to a colleague and merely putting it in her mailbox and hoping she gets it soon.
Since a SynchronousQueue has no storage capacity, put and take will block unless another thread is already waiting to
participate in the handoff. Synchronous queues are generally suitable only when there are enough consumers that there
nearly always will be one ready to take the handoff.
>The producer consumer pattern also enables several performance benefits. Producers and consumers can execute concurrently; if one is I/O bound and the other is CPU bound, executing them concurrently yields better overall throughput than executing them sequentially.
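>(a minimal producer-consumer sketch of my own using a bounded ArrayBlockingQueue; the poison-pill string is an illustrative shutdown convention)
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumerDemo {
    public static void main(String[] args) {
        final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(10); // bounded: put() blocks when full

        Thread producer = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < 100; i++) {
                        queue.put("item-" + i);          // blocks if the queue is full, throttling the producer
                    }
                    queue.put("DONE");                   // poison pill tells the consumer to stop
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();  // restore the interrupt status
                }
            }
        });

        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    String item;
                    while (!(item = queue.take()).equals("DONE")) { // blocks if the queue is empty
                        System.out.println("consumed " + item);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });

        producer.start();
        consumer.start();
    }
}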
>5.3.2. Serial Thread Confinement
The blocking queue implementations in java.util.concurrent all contain sufficient internal synchronization to safely
publish objects from a producer thread to the consumer thread.
For mutable objects, producer consumer designs and blocking queues facilitate serial thread confinement for handing
off ownership of objects from producers to consumers. A thread confined object is owned exclusively by a single thread, but that ownership can be "transferred" by publishing it safely where only one other thread will gain access to it and ensuring that the publishing thread does not access it after the handoff. The safe publication ensures that the object's state is visible to the new owner, and since the original owner will not touch it again, it is now confined to the new thread. The new owner may modify it freely since it has exclusive access.
Object pools exploit serial thread confinement, "lending" an object to a requesting thread. As long as the pool contains sufficient internal synchronization to publish the pooled object safely, and as long as the clients do not themselves publish the pooled object or use it after returning it to the pool, ownership can be transferred safely from thread to thread.
One could also use other publication mechanisms for transferring ownership of a mutable object, but it is necessary to
ensure that only one thread receives the object being handed off. Blocking queues make this easy; with a little more work, it could also be done with the atomic remove method of ConcurrentMap or the compareAndSet method of AtomicReference.
>5.3.3. Deques and Work Stealing
Java 6 also adds another two collection types, Deque (pronounced "deck") and BlockingDeque, that extend Queue and
BlockingQueue. A Deque is a double ended queue that allows efficient insertion and removal from both the head and
the tail. Implementations include ArrayDeque and LinkedBlockingDeque.
Just as blocking queues lend themselves to the producer consumer pattern, deques lend themselves to a related pattern called work stealing. A producer consumer design has one shared work queue for all consumers; in a work stealing design, every consumer has its own deque. If a consumer exhausts the work in its own deque, it can steal work from the tail of someone else's deque. Work stealing can be more scalable than a traditional producer consumer design because workers don't contend for a shared work queue; most of the time they access only their own deque, reducing contention. When a worker has to access another's queue, it does so from the tail rather than the head, further reducing contention.
Work stealing is well suited to problems in which consumers are also producers when performing a unit of work is likely to result in the identification of more work. For example, processing a page in a web crawler usually results in the identification of new pages to be crawled. Similarly, many graph exploring algorithms, such as marking the heap during garbage collection, can be efficiently parallelized using work stealing. When a worker identifies a new unit of work, it places it at the end of its own deque (or alternatively, in a work sharing design, on that of another worker); when its deque is empty, it looks for work at the end of someone else's deque, ensuring that each worker stays busy.
>The put and take methods of BlockingQueue throw the checked InterruptedException, as do a number of other library methods such as Thread.sleep. When a method can throw InterruptedException, it is telling you that it is a blocking method, and further that if it is interrupted, it will make an effort to stop blocking early.
Interruption is a cooperative mechanism. One thread cannot force another to stop what it is doing and do something
else; when thread A interrupts thread B, A is merely requesting that B stop what it is doing when it gets to a convenient stopping point if it feels like it. While there is nothing in the API or language specification that demands any specific application level semantics for interruption, the most sensible use for interruption is to cancel an activity. Blocking methods that are responsive to interruption make it easier to cancel long running activities on a timely basis.
>When your code calls a method that throws InterruptedException, then your method is a blocking method too, and
must have a plan for responding to interruption. For library code, there are basically two choices:
Propagate the InterruptedException. This is often the most sensible policy if you can get away with it just propagate
the InterruptedException to your caller. This could involve not catching InterruptedException, or catching it and
throwing it again after performing some brief activity specific cleanup.
Restore the interrupt. Sometimes you cannot throw InterruptedException, for instance when your code is part of a
Runnable. In these situations, you must catch InterruptedException and restore the interrupted status by calling
interrupt on the current thread, so that code higher up the call stack can see that an interrupt was issued, as
demonstrated in Listing 5.10.
You can get much more sophisticated with interruption, but these two approaches should work in the vast majority of
situations. But there is one thing you should not do with InterruptedException; catch it and do nothing in response.
This deprives code higher up on the call stack of the opportunity to act on the interruption, because the evidence that
the thread was interrupted is lost. The only situation in which it is acceptable to swallow an interrupt is when you are
extending Thread and therefore control all the code higher up on the call stack.
Listing 5.10. Restoring the Interrupted Status so as Not to Swallow the Interrupt.
public class TaskRunnable implements Runnable {
BlockingQueue<Task> queue;
...
public void run() {
try {
processTask(queue.take());
} catch (InterruptedException e) {
// restore interrupted status
Thread.currentThread().interrupt();
}
}
}
>5.5. Synchronizers
Blocking queues are unique among the collections classes: not only do they act as containers for objects, but they can
also coordinate the control flow of producer and consumer threads because take and put block until the queue enters
the desired state (not empty or not full).
A synchronizer is any object that coordinates the control flow of threads based on its state. Blocking queues can act as
synchronizers; other types of synchronizers include semaphores, barriers, and latches. There are a number of
synchronizer classes in the platform library; if these do not meet your needs, you can also create your own
>Latches - one time use synchronizers; latch is closed(blocks all threads) until a given condition happens; after the condition occurs, the latch is opened permanently(all threads can pass the gate)
>CountDownLatch - blocks all threads until a given counter reaches zero(can be initialized to a positive value upon start)
>in a thread pool, the master thread can wait at once for all workers to finish using a countdown latch instead of waiting on each one sequentially; also the master can start all threads at once using a latch
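>(a minimal sketch of a master waiting for workers with CountDownLatch; the worker body is a placeholder)
import java.util.concurrent.CountDownLatch;

public class WaitForWorkers {
    public static void main(String[] args) throws InterruptedException {
        final int workers = 3;
        final CountDownLatch done = new CountDownLatch(workers);
        for (int i = 0; i < workers; i++) {
            new Thread(new Runnable() {
                public void run() {
                    // ... do one worker's share of the job ...
                    done.countDown();        // each worker decrements the counter when finished
                }
            }).start();
        }
        done.await();                        // master blocks here until the count reaches zero
        System.out.println("all workers finished");
    }
}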
Ensuring that a computation does not proceed until resources it needs have been initialized. A simple binary
(two state) latch could be used to indicate "Resource R has been initialized", and any activity that requires R
would wait first on this latch.
Ensuring that a service does not start until other services on which it depends have started. Each service would
have an associated binary latch; starting service S would involve first waiting on the latches for other services on
which S depends, and then releasing the S latch after startup completes so any services that depend on S can
then proceed.
Waiting until all the parties involved in an activity, for instance the players in a multi player game, are ready to
proceed. In this case, the latch reaches the terminal state after all the players are ready.
>FutureTask also acts like a latch
Tasks described by Callable can throw checked and unchecked exceptions, and any code can throw an Error.
Whatever the task code may throw, it is wrapped in an ExecutionException and rethrown from Future.get. This
complicates code that calls get, not only because it must deal with the possibility of ExecutionException (and the
unchecked CancellationException), but also because the cause of the ExecutionException is returned as a
Throwable, which is inconvenient to deal with.
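>(a minimal sketch of handling ExecutionException from Future.get; the failing task is illustrative)
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureGetDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> f = pool.submit(new Callable<String>() {
            public String call() {
                throw new IllegalStateException("task failed");  // whatever the task throws...
            }
        });
        try {
            System.out.println(f.get());
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();                      // ...comes back wrapped, as a Throwable
            System.out.println("task threw: " + cause);
        } finally {
            pool.shutdown();
        }
    }
}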
5.5.3. Semaphores
Counting semaphores are used to control the number of activities that can access a certain resource or perform a given
action at the same time [CPJ 3.4.1]. Counting semaphores can be used to implement resource pools or to impose a
bound on a collection.
A Semaphore manages a set of virtual permits; the initial number of permits is passed to the Semaphore constructor.
Activities can acquire permits (as long as some remain) and release permits when they are done with them. If no permit
is available, acquire blocks until one is (or until interrupted or the operation times out). The release method returns a
permit to the semaphore. [4] A degenerate case of a counting semaphore is a binary semaphore, a Semaphore with an
initial count of one. A binary semaphore can be used as a mutex with non reentrant locking semantics; whoever holds
the sole permit holds the mutex.
[4] The implementation has no actual permit objects, and Semaphore does not associate dispensed permits with threads, so a permit acquired in
one thread can be released from another thread. You can think of acquire as consuming a permit and release as creating one; a Semaphore is not
limited to the number of permits it was created with.
Semaphores are useful for implementing resource pools such as database connection pools. While it is easy to construct
a fixed sized pool that fails if you request a resource from an empty pool, what you really want is to block if the pool is
empty and unblock when it becomes nonempty again. If you initialize a Semaphore to the pool size, acquire a permit
before trying to fetch a resource from the pool, and release the permit after putting a resource back in the pool,
acquire blocks until the pool becomes nonempty. This technique is used in the bounded buffer class in Chapter 12. (An
easier way to construct a blocking object pool would be to use a BlockingQueue to hold the pooled resources.)
Similarly, you can use a Semaphore to turn any collection into a blocking bounded collection, as illustrated by
BoundedHashSet in Listing 5.14. The semaphore is initialized to the desired maximum size of the collection. The add
operation acquires a permit before adding the item into the underlying collection. If the underlying add operation does
not actually add anything, it releases the permit immediately. Similarly, a successful remove operation releases a permit,
enabling more elements to be added. The underlying Set implementation knows nothing about the bound; this is
handled by BoundedHashSet.
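#a rough sketch of the bounded-set idea described above (reconstructed from memory along the lines of the book's Listing 5.14)
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Semaphore;

public class BoundedHashSet<T> {
    private final Set<T> set = Collections.synchronizedSet(new HashSet<T>());
    private final Semaphore sem;

    public BoundedHashSet(int bound) {
        this.sem = new Semaphore(bound);
    }

    public boolean add(T o) throws InterruptedException {
        sem.acquire();                 // take a permit before adding
        boolean wasAdded = false;
        try {
            wasAdded = set.add(o);
            return wasAdded;
        } finally {
            if (!wasAdded)
                sem.release();         // element was already present; give the permit back
        }
    }

    public boolean remove(Object o) {
        boolean wasRemoved = set.remove(o);
        if (wasRemoved)
            sem.release();             // free a slot for future adds
        return wasRemoved;
    }
}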
5.5.4. Barriers
We have seen how latches can facilitate starting a group of related activities or waiting for a group of related activities
to complete. Latches are single use objects; once a latch enters the terminal state, it cannot be reset.
Barriers are similar to latches in that they block a group of threads until some event has occurred [CPJ 4.4.3]. The key
difference is that with a barrier, all the threads must come together at a barrier point at the same time in order to
proceed. Latches are for waiting for events; barriers are for waiting for other threads. A barrier implements the protocol
some families use to rendezvous during a day at the mall: "Everyone meet at McDonald's at 6:00; once you get there,
stay there until everyone shows up, and then we'll figure out what we're doing next."
CyclicBarrier allows a fixed number of parties to rendezvous repeatedly at a barrier point and is useful in parallel
iterative algorithms that break down a problem into a fixed number of independent subproblems. Threads call await
when they reach the barrier point, and await blocks until all the threads have reached the barrier point. If all threads
meet at the barrier point, the barrier has been successfully passed, in which case all threads are released and the barrier
is reset so it can be used again. If a call to await times out or a thread blocked in await is interrupted, then the barrier is
considered broken and all outstanding calls to await terminate with BrokenBarrierException. If the barrier is
successfully passed, await returns a unique arrival index for each thread, which can be used to "elect" a leader that
takes some special action in the next iteration. CyclicBarrier also lets you pass a barrier action to the constructor;
this is a Runnable that is executed (in one of the subtask threads) when the barrier is successfully passed but before the
blocked threads are released.
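#a small sketch of CyclicBarrier in an iterative computation; the barrier action and the worker loop are my own illustration
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class BarrierDemo {
    public static void main(String[] args) {
        final int parties = 3;
        final CyclicBarrier barrier = new CyclicBarrier(parties, new Runnable() {
            public void run() {
                System.out.println("all parties arrived; advancing to next step");
            }
        });
        for (int i = 0; i < parties; i++) {
            new Thread(new Runnable() {
                public void run() {
                    try {
                        for (int step = 0; step < 3; step++) {
                            // compute this thread's share of the step here
                            barrier.await();   // wait for the others before the next step
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } catch (BrokenBarrierException e) {
                        // another party was interrupted or timed out; give up
                    }
                }
            }).start();
        }
    }
}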
Another form of barrier is Exchanger, a two party barrier in which the parties exchange data at the barrier point [CPJ
3.4.3]. Exchangers are useful when the parties perform asymmetric activities, for example when one thread fills a buffer
with data and the other thread consumes the data from the buffer; these threads could use an Exchanger to meet and
exchange a full buffer for an empty one. When two threads exchange objects via an Exchanger, the exchange
constitutes a safe publication of both objects to the other party.
The timing of the exchange depends on the responsiveness requirements of the application. The simplest approach is
that the filling task exchanges when the buffer is full, and the emptying task exchanges when the buffer is empty; this
minimizes the number of exchanges but can delay processing of some data if the arrival rate of new data is
unpredictable. Another approach would be that the filler exchanges when the buffer is full, but also when the buffer is
partially filled and a certain amount of time has elapsed.
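#a sketch of the buffer-swapping idea: the filler hands a full buffer over and gets an empty one back (class and method names are mine; readLine/process are stand-ins)
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Exchanger;

public class FillAndDrain {
    private final Exchanger<List<String>> exchanger = new Exchanger<List<String>>();

    class Filler implements Runnable {
        public void run() {
            List<String> buffer = new ArrayList<String>();
            try {
                while (true) {
                    buffer.add(readLine());                     // stand-in producer call
                    if (buffer.size() >= 10)
                        buffer = exchanger.exchange(buffer);    // swap full buffer for an empty one
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    class Drainer implements Runnable {
        public void run() {
            List<String> buffer = new ArrayList<String>();
            try {
                while (true) {
                    buffer = exchanger.exchange(buffer);        // hand back the drained buffer, get a full one
                    for (String s : buffer)
                        process(s);                             // stand-in consumer call
                    buffer.clear();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    private String readLine() { return "data"; }
    private void process(String s) { }
}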
>JVM can't exit until all the (non daemon) threads have terminated, so failing to
shut down an Executor could prevent the JVM from exiting.
>6.2.5. Delayed and Periodic Tasks
The Timer facility manages the execution of deferred ("run this task in 100 ms") and periodic ("run this task every 10
ms") tasks. However, Timer has some drawbacks, and ScheduledThreadPoolExecutor should be thought of as its
replacement.[6] You can construct a ScheduledThreadPoolExecutor through its constructor or through the
newScheduledThreadPool factory.
[6] Timer does have support for scheduling based on absolute, not relative time, so that tasks can be sensitive to changes in the system clock;
ScheduledThreadPoolExecutor supports only relative time.
A Timer creates only a single thread for executing timer tasks. If a timer task takes too long to run, the timing accuracy
of other TimerTasks can suffer. If a recurring TimerTask is scheduled to run every 10 ms and another TimerTask takes
40 ms to run, the recurring task either (depending on whether it was scheduled at fixed rate or fixed delay) gets called
four times in rapid succession after the long running task completes, or "misses" four invocations completely. Scheduled
thread pools address this limitation by letting you provide multiple threads for executing deferred and periodic tasks.
Another problem with Timer is that it behaves poorly if a TimerTask throws an unchecked exception. The Timer thread
doesn't catch the exception, so an unchecked exception thrown from a TimerTask terminates the timer thread. Timer
also doesn't resurrect the thread in this situation; instead, it erroneously assumes the entire Timer was cancelled. In this
case, TimerTasks that are already scheduled but not yet executed are never run, and new tasks cannot be scheduled.
(This problem, called "thread leakage" is described in Section 7.3, along with techniques for avoiding it.)
>ScheduledThreadPoolExecutor deals
properly with ill behaved tasks; there is little reason to use Timer in Java 5.0 or later.
If you need to build your own scheduling service, you may still be able to take advantage of the library by using a
DelayQueue, a BlockingQueue implementation that provides the scheduling functionality of
ScheduledThreadPoolExecutor. A DelayQueue manages a collection of Delayed objects. A Delayed has a delay time
associated with it: DelayQueue lets you take an element only if its delay has expired. Objects are returned from a
DelayQueue ordered by the time associated with their delay.
>Future represents the lifecycle of a task and provides methods to test whether the task has completed or been
cancelled, retrieve its result, and cancel the task. Callable and Future are shown in Listing 6.11. Implicit in the
specification of Future is that task lifecycle can only move forwards, not backwards, just like the ExecutorService
lifecycle. Once a task is completed, it stays in that state forever.
The behavior of get varies depending on the task state (not yet started, running, completed). It returns immediately or
throws an Exception if the task has already completed, but if not it blocks until the task completes. If the task
completes by throwing an exception, get rethrows it wrapped in an ExecutionException; if it was cancelled, get
throws CancellationException. If get throws ExecutionException, the underlying exception can be retrieved with
getCause.
>There are several ways to create a Future to describe a task. The submit methods in ExecutorService all return a
Future, so that you can submit a Runnable or a Callable to an executor and get back a Future that can be used to
retrieve the result or cancel the task. You can also explicitly instantiate a FutureTask for a given Runnable or Callable.
(Because FutureTask implements Runnable, it can be submitted to an Executor for execution or executed directly by
calling its run method.)
As of Java 6, ExecutorService implementations can override newTaskFor in AbstractExecutorService to control
instantiation of the Future corresponding to a submitted Callable or Runnable. The default implementation just
creates a new FutureTask, as shown in Listing 6.12.
Listing 6.12. Default Implementation of newTaskFor in ThreadPoolExecutor.
protected <T> RunnableFuture<T> newTaskFor(Callable<T> task) {
    return new FutureTask<T>(task);
}
Submitting a Runnable or Callable to an Executor constitutes a safe publication (see Section 3.5) of the Runnable or
Callable from the submitting thread to the thread that will eventually execute the task. Similarly, setting the result
value for a Future constitutes a safe publication of the result from the thread in which it was computed to any thread
that retrieves it via get.
>The primary challenge in executing tasks within a time budget is making sure that you don't wait longer than the time
budget to get an answer or find out that one is not forthcoming. The timed version of Future.get supports this
requirement: it returns as soon as the result is ready, but throws TimeoutException if the result is not ready within the
timeout period.
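#sketch of running a task within a time budget (method and variable names are mine): if the result is not ready in time, cancel the task and fall back to a default
import java.util.concurrent.*;

public class TimedGetDemo {
    private static final ExecutorService exec = Executors.newCachedThreadPool();

    // returns the task result, or the supplied default if it is not ready within the budget
    static <T> T fetchWithBudget(Callable<T> task, T defaultValue, long budgetMillis)
            throws InterruptedException {
        Future<T> f = exec.submit(task);
        try {
            return f.get(budgetMillis, TimeUnit.MILLISECONDS);
        } catch (ExecutionException e) {
            // task failed; fall back to the default (or rethrow e.getCause() if preferred)
            return defaultValue;
        } catch (TimeoutException e) {
            f.cancel(true);          // no point letting it keep running
            return defaultValue;
        }
    }
}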
Cancelling a task
-----------------
>One such cooperative mechanism is setting a "cancellation requested" flag that the task checks periodically; if it finds
the flag set, the task terminates early. PrimeGenerator in Listing 7.1, which enumerates prime numbers until it is
cancelled, illustrates this technique. The cancel method sets the cancelled flag, and the main loop polls this flag
before searching for the next prime number. (For this to work reliably, cancelled must be volatile.)
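#a cut-down sketch of the volatile cancellation flag approach, along the lines of the book's PrimeGenerator
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class PrimeGenerator implements Runnable {
    private final List<BigInteger> primes = new ArrayList<BigInteger>();
    private volatile boolean cancelled;                 // must be volatile for visibility

    public void run() {
        BigInteger p = BigInteger.ONE;
        while (!cancelled) {                            // poll the flag before each unit of work
            p = p.nextProbablePrime();
            synchronized (this) {
                primes.add(p);
            }
        }
    }

    public void cancel() { cancelled = true; }

    public synchronized List<BigInteger> get() {
        return new ArrayList<BigInteger>(primes);
    }
}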
>Blocking library methods like Thread.sleep and Object.wait try to detect when a thread has been interrupted and
return early. They respond to interruption by clearing the interrupted status and throwing InterruptedException,
indicating that the blocking operation completed early due to interruption. The JVM makes no guarantees on how
quickly a blocking method will detect interruption, but in practice this happens reasonably quickly.
>If a thread is interrupted when it is not blocked, its interrupted status is set, and it is up to the activity being cancelled to
poll the interrupted status to detect interruption. In this way interruption is "sticky": if it doesn't trigger an
InterruptedException, evidence of interruption persists until someone deliberately clears the interrupted status.
Calling interrupt does not necessarily stop the target thread from doing what it is doing; it merely delivers the
message that interruption has been requested.
A good way to think about interruption is that it does not actually interrupt a running thread; it just requests that the
thread interrupt itself at the next convenient opportunity. (These opportunities are called cancellation points.) Some
methods, such as wait, sleep, and join, take such requests seriously, throwing an exception when they receive an
interrupt request or encounter an already set interrupt status upon entry. Well behaved methods may totally ignore
such requests so long as they leave the interruption request in place so that calling code can do something with it.
Poorly behaved methods swallow the interrupt request, thus denying code further up the call stack the opportunity to
act on it.
The static interrupted method should be used with caution, because it clears the current thread's interrupted status. If
you call interrupted and it returns true, unless you are planning to swallow the interruption, you should do something
with it: either throw InterruptedException or restore the interrupted status by calling interrupt again.
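#the standard idiom when you can't propagate InterruptedException: restore the status so code higher up the call stack can still see it (the sleeping call below is just an example of a blocking operation)
public void runTask(Runnable task) {
    try {
        Thread.sleep(1000);          // some blocking call, e.g. queue.take()
        task.run();
    } catch (InterruptedException e) {
        // we don't own this thread's interruption policy, so don't swallow the request:
        Thread.currentThread().interrupt();   // restore the interrupt status for callers
    }
}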
>Interruption is usually the most sensible way to implement cancellation.
>Because each thread has its own interruption policy, you should not interrupt a thread unless you know what
interruption means to that thread.
Critics have derided the Java interruption facility because it does not provide a preemptive interruption capability and
yet forces developers to handle InterruptedException. However, the ability to postpone an interruption request
enables developers to craft flexible interruption policies that balance responsiveness and robustness as appropriate for
the application.
>Only code that implements a thread's interruption policy may swallow an interruption request. General purpose task
and library code should never swallow interruption requests.
GUI APPLICATIONS
-----------------
>Single threaded GUI frameworks are not unique to Java; Qt, NextStep, MacOS Cocoa, X Windows, and many others are
also single threaded. This is not for lack of trying; there have been many attempts to write multithreaded GUI
frameworks, but because of persistent problems with race conditions and deadlock, they all eventually arrived at the
single threaded event queue model in which a dedicated thread fetches events off a queue and dispatches them to
application defined event handlers. (AWT originally tried to support a greater degree of multithreaded access, and the
decision to make Swing single threaded was based largely on experience with AWT.)
>Multithreaded GUI frameworks tend to be particularly susceptible to deadlock, partially because of the unfortunate
interaction between input event processing and any sensible object oriented modeling of GUI components. Actions
initiated by the user tend to "bubble up" from the OS to the application: a mouse click is detected by the OS, is turned
into a "mouse click" event by the toolkit, and is eventually delivered to an application listener as a higher level event
such as a "button pressed" event. On the other hand, application initiated actions "bubble down" from the application
to the OS: changing the background color of a component originates in the application and is dispatched to a specific
component class and eventually into the OS for rendering. Combining this tendency for activities to access the same GUI
objects in the opposite order with the requirement of making each object thread safe yields a recipe for inconsistent
lock ordering, which leads to deadlock (see Chapter 10). And this is exactly what nearly every GUI toolkit development
effort rediscovered through experience.
>Another factor leading to deadlock in multithreaded GUI frameworks is the prevalence of the model view control (MVC)
pattern. Factoring user interactions into cooperating model, view, and controller objects greatly simplifies implementing
GUI applications, but again raises the risk of inconsistent lock ordering. The controller calls into the model, which
notifies the view that something has changed. But the controller can also call into the view, which may in turn call back
into the model to query the model state. The result is again inconsistent lock ordering, with the attendant risk of
deadlock.
In his weblog,[1] Sun VP Graham Hamilton nicely sums up the challenges, describing why the multithreaded GUI toolkit is
one of the recurring "failed dreams" of computer science.
[1] http://weblogs.java.net/blog/kgh/archive/2004/10
>if you were to use multithreaded UI toolkits, things will mostly
work, but you will get occasional hangs (due to deadlocks) or glitches (due to races). This multithreaded approach works
best for people who have been intimately involved in the design of the toolkit.
>Single threaded GUI frameworks achieve thread safety via thread confinement; all GUI objects, including visual
components and data models, are accessed exclusively from the event thread. Of course, this just pushes some of the
thread safety burden back onto the application developer, who must make sure these objects are properly confined.
>9.1.2. Thread Confinement in Swing
All Swing components (such as JButton and JTable) and data model objects (such as TableModel and TreeModel) are
confined to the event thread, so any code that accesses these objects must run in the event thread. GUI objects are kept
consistent not by synchronization, but by thread confinement. The upside is that tasks that run in the event thread need
not worry about synchronization when accessing presentation objects; the downside is that you cannot access
presentation objects from outside the event thread at all.
>The Swing single thread rule: Swing components and models should be created, modified, and queried only from the
event dispatching thread.
>As with all rules, there are a few exceptions. A small number of Swing methods may be called safely from any thread;
these are clearly identified in the Javadoc as being thread safe. Other exceptions to the single thread rule include:
SwingUtilities.isEventDispatchThread, which determines whether the current thread is the event thread;
SwingUtilities.invokeLater, which schedules a Runnable for execution on the event thread (callable from
any thread);
SwingUtilities.invokeAndWait, which schedules a Runnable task for execution on the event thread and
blocks the current thread until it completes (callable only from a non GUI thread);
methods to enqueue a repaint or revalidation request on the event queue (callable from any thread); and
methods for adding and removing listeners (can be called from any thread, but listeners will always be invoked
in the event thread).
>In a GUI application, events originate in the event thread and bubble up to application provided listeners, which will
probably perform some computation that affects the presentation objects. For simple, short running tasks, the entire
action can stay in the event thread; for longer running tasks, some of the processing should be offloaded to another
thread.
>In the simple case, confining presentation objects to the event thread is completely natural. Listing 9.3 creates a button
whose color changes randomly when pressed. When the user clicks on the button, the toolkit delivers an ActionEvent
in the event thread to all registered action listeners. In response, the action listener picks a new color and changes the
button's background color. So the event originates in the GUI toolkit and is delivered to the application, and the
application modifies the GUI in response to the user's action. Control never has to leave the event thread
>When the view
receives an event indicating the model data may have changed, it queries the model for the new data and updates the
display. So in a button listener that modifies the contents of a table, the action listener would update the model and call
one of the fireXxx methods, which would in turn invoke the view's table model listeners, which would update the
view. Again, control never leaves the event thread. (The Swing data model fireXxx methods always call the model
listeners directly rather than submitting a new event to the event queue, so the fireXxx methods must be called only
from the event thread.)
>The task triggered when the button is pressed is composed of three sequential subtasks whose execution alternates
between the event thread and the background thread. The first subtask updates the user interface to show that a long
running operation has begun and starts the second subtask in a background thread. Upon completion, the second
subtask queues the third subtask to run again in the event thread, which updates the user interface to reflect that the
operation has completed. This sort of "thread hopping" is typical of handling long running tasks in GUI applications.
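#a sketch of the "thread hopping" pattern; button, label, backgroundExec (some Executor), and doBigComputation are assumed names for illustration
button.addActionListener(new ActionListener() {
    public void actionPerformed(ActionEvent e) {
        button.setEnabled(false);                       // subtask 1: runs in the event thread
        label.setText("busy...");
        backgroundExec.execute(new Runnable() {         // subtask 2: hand off to a background thread
            public void run() {
                final String result = doBigComputation();
                SwingUtilities.invokeLater(new Runnable() {   // subtask 3: hop back to the event thread
                    public void run() {
                        label.setText(result);
                        button.setEnabled(true);
                    }
                });
            }
        });
    }
});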
>9.3.1. Cancellation
Any task that takes long enough to run in another thread probably also takes long enough that the user might want to
cancel it. You could implement cancellation directly using thread interruption, but it is much easier to use Future, which
was designed to manage cancellable tasks.
When you call cancel on a Future with mayInterruptIfRunning set to true, the Future implementation interrupts
the thread that is executing the task if it is currently running. If your task is written to be responsive to interruption, it
can return early if it is cancelled. Listing 9.6 illustrates a task that polls the thread's interrupted status and returns early
on interruption.
>regardless of how thread pool is implemented it is the task that has to be coded to be responsive to interrupts if it has to be a cancellable task
>SwingUtilities is there to submit tasks to the EventDispatchThread (EDT)
>SwingWorker is there to create cancellable tasks that report progress and run on background threads (other than the EDT)
>when the GUI data model has to be updated in a thread-safe manner, use the same thread hopping technique with SwingWorker and SwingUtilities to post model updates to the EDT so that it can update the data model
>otherwise, the EDT and background threads can use concurrent collections (such as ConcurrentHashMap, CopyOnWriteArrayList) to safely share data, but only when it suits the context
9.4.2. Split Data Models
----------------------------
>From the perspective of the GUI, the Swing table model classes like TableModel and TreeModel are the official
repository for data to be displayed. However, these model objects are often themselves "views" of other objects
managed by the application. A program that has both a presentation domain and an application domain data model is
said to have a split model design (Fowler, 2005).
In a split model design, the presentation model is confined to the event thread and the other model, the shared model,
is thread safe and may be accessed by both the event thread and application threads. The presentation model registers
listeners with the shared model so it can be notified of updates. The presentation model can then be updated from the
shared model by embedding a snapshot of the relevant state in the update message or by having the presentation
model retrieve the data directly from the shared model when it receives an update event.
>this was the case with Brocade EFCM product which has server side model updating client side GUI via event notifications (the model in question was maintained at both the server and at the client GUI model)
>The snapshot approach is simple, but has limitations. It works well when the data model is small, updates are not too
frequent, and the structure of the two models are similar. If the data model is large or updates are very frequent, or if one
or both sides of the split contain information that is not visible to the other side, it can be more efficient to send
incremental updates instead of entire snapshots. This approach has the effect of serializing updates on the shared
model and recreating them in the event thread against the presentation model. Another advantage of incremental
updates is that finer grained information about what changed can improve the perceived quality of the display: if only
one vehicle moves, we don't have to repaint the entire display, just the affected regions.
>Consider a split model design when a data model must be shared by more than one thread and implementing a thread
safe data model would be inadvisable because of blocking, consistency, or complexity reasons.
>Borrowing from the approach taken by GUI frameworks, you can easily create a dedicated thread or single threaded
executor for accessing the native library, and provide a proxy object that intercepts calls to the thread confined object
and submits them as tasks to the dedicated thread. Future and newSingleThreadExecutor work together to make this
easy; the proxy method can submit the task and immediately call Future.get to wait for the result. (If the class to be
thread confined implements an interface, you can automate the process of having each method submit a Callable to a
background thread executor and waiting for the result using dynamic proxies.)
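#a sketch of confining a non-thread-safe object behind a single-threaded executor; callers get a synchronous facade backed by Future (the NativeLibrary class and its method are hypothetical)
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConfinedLibraryProxy {
    private final ExecutorService exec = Executors.newSingleThreadExecutor();
    private final NativeLibrary lib = new NativeLibrary();   // hypothetical non-thread-safe class

    public String query(final String request) throws Exception {
        // submit the call to the dedicated thread and wait for the result
        return exec.submit(new Callable<String>() {
            public String call() {
                return lib.query(request);    // only ever touched from the single pool thread
            }
        }).get();
    }

    static class NativeLibrary {              // stand-in for the thread-confined library
        String query(String request) { return "result for " + request; }
    }
}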
----x----
7.3.1. Uncaught Exception Handlers
----------------------------------
The previous section offered a proactive approach to the problem of unchecked exceptions. The Thread API also
provides the UncaughtExceptionHandler facility, which lets you detect when a thread dies due to an uncaught
exception. The two approaches are complementary: taken together, they provide defense in depth against thread
leakage.
When a thread exits due to an uncaught exception, the JVM reports this event to an application provided
UncaughtExceptionHandler (see Listing 7.24); if no handler exists, the default behavior is to print the stack trace to
System.err.[8]
[8] Before Java 5.0, the only way to control the UncaughtExceptionHandler was by subclassing ThreadGroup[ThreadGroup implements UncaughtExceptionHandler interface; we need access to uncaughtException(Thread, Throwable) method from there]. In Java 5.0 and later, you
can set an UncaughtExceptionHandler on a per thread basis with Thread.setUncaughtExceptionHandler, and can also set the
default UncaughtExceptionHandler with Thread.setDefaultUncaughtExceptionHandler. However, only one of these
handlers is called; first JVM looks for a per thread handler, then for a ThreadGroup handler. The default handler implementation in
ThreadGroup delegates to its parent thread group, and so on up the chain until one of the ThreadGroup handlers deals with the uncaught
exception or it bubbles up to the top-level thread group. The top-level thread group handler delegates to the default system handler (if one exists;
the default is none) and otherwise prints the stack trace to the console.
Listing 7.24. UncaughtExceptionHandler Interface.
public interface UncaughtExceptionHandler {
void uncaughtException(Thread t, Throwable e);
}
What the handler should do with an uncaught exception depends on your quality of service requirements. The most
common response is to write an error message and stack trace to the application log, as shown in Listing 7.25. Handlers
can also take more direct action, such as trying to restart the thread, shutting down the application, paging an operator,
or other corrective or diagnostic action.
Listing 7.25. UncaughtExceptionHandler that Logs the Exception.
public class UEHLogger implements Thread.UncaughtExceptionHandler {
public void uncaughtException(Thread t, Throwable e) {
Logger logger = Logger.getAnonymousLogger();
logger.log(Level.SEVERE,
"Thread terminated with exception: " + t.getName(),
e);
}
}
In long running applications, always use uncaught exception handlers for all threads that at least log the exception.
To set an UncaughtExceptionHandler for pool threads, provide a ThreadFactory to the ThreadPoolExecutor
constructor. (As with all thread manipulation, only the thread's owner should change its UncaughtExceptionHandler.)
The standard thread pools allow an uncaught task exception to terminate the pool thread, but use a try-finally block
to be notified when this happens so the thread can be replaced. Without an uncaught exception handler or other failure
notification mechanism, tasks can appear to fail silently, which can be very confusing. If you want to be notified when a
task fails due to an exception so that you can take some task specific recovery action, either wrap the task with a
Runnable or Callable that catches the exception or override the afterExecute hook in ThreadPoolExecutor.
Somewhat confusingly, exceptions thrown from tasks make it to the uncaught exception handler only for tasks
submitted with execute; for tasks submitted with submit, any thrown exception, checked or not, is considered to be
part of the task's return status. If a task submitted with submit terminates with an exception, it is rethrown by
Future.get, wrapped in an ExecutionException.
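#sketch of retrieving a task failure from a submitted task: the exception comes back wrapped in ExecutionException from Future.get rather than reaching the uncaught exception handler
import java.util.concurrent.*;

public class SubmitFailureDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        Future<?> f = exec.submit(new Runnable() {
            public void run() {
                throw new IllegalStateException("task blew up");   // part of the task's return status
            }
        });
        try {
            f.get();                                 // rethrows the failure wrapped in ExecutionException
        } catch (ExecutionException e) {
            Throwable cause = e.getCause();          // the original IllegalStateException
            System.err.println("task failed: " + cause);
        } finally {
            exec.shutdown();
        }
    }
}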
7.4. JVM Shutdown
---------------------
The JVM can shut down in either an orderly or abrupt manner. An orderly shutdown is initiated when the last "normal"
(non daemon) thread terminates, someone calls System.exit, or by other platform specific means (such as sending a
SIGINT or hitting Ctrl-C). While this is the standard and preferred way for the JVM to shut down, it can also be shut
down abruptly by calling Runtime.halt or by killing the JVM process through the operating system (such as sending a
SIGKILL).
7.4.1. Shutdown Hooks
In an orderly shutdown, the JVM first starts all registered shutdown hooks. Shutdown hooks are unstarted threads that
are registered with Runtime.addShutdownHook. The JVM makes no guarantees on the order in which shutdown hooks
are started. If any application threads (daemon or nondaemon) are still running at shutdown time, they continue to run
concurrently with the shutdown process. When all shutdown hooks have completed, the JVM may choose to run
finalizers if runFinalizersOnExit is true, and then halts. The JVM makes no attempt to stop or interrupt any
application threads that are still running at shutdown time; they are abruptly terminated when the JVM eventually halts.
If the shutdown hooks or finalizers don't complete, then the orderly shutdown process "hangs" and the JVM must be
shut down abruptly. In an abrupt shutdown, the JVM is not required to do anything other than halt the JVM; shutdown
hooks will not run.
Shutdown hooks should be thread safe: they must use synchronization when accessing shared data and should be
careful to avoid deadlock, just like any other concurrent code. Further, they should not make assumptions about the
state of the application (such as whether other services have shut down already or all normal threads have completed)
or about why the JVM is shutting down, and must therefore be coded extremely defensively. Finally, they should exit as
quickly as possible, since their existence delays JVM termination at a time when the user may be expecting the JVM to
terminate quickly.
Shutdown hooks can be used for service or application cleanup, such as deleting temporary files or cleaning up
resources that are not automatically cleaned up by the OS. Listing 7.26 shows how LogService in Listing 7.16 could
register a shutdown hook from its start method to ensure the log file is closed on exit.
Because shutdown hooks all run concurrently, closing the log file could cause trouble for other shutdown hooks who
want to use the logger. To avoid this problem, shutdown hooks should not rely on services that can be shut down by the
application or other shutdown hooks. One way to accomplish this is to use a single shutdown hook for all services,
rather than one for each service, and have it call a series of shutdown actions. This ensures that shutdown actions
execute sequentially in a single thread, thus avoiding the possibility of race conditions or deadlock between shutdown
actions. This technique can be used whether or not you use shutdown hooks; executing shutdown actions sequentially
rather than concurrently eliminates many potential sources of failure. In applications that maintain explicit dependency
information among services, this technique can also ensure that shutdown actions are performed in the right order.
Listing 7.26. Registering a Shutdown Hook to Stop the Logging Service.
public void start() {
Runtime.getRuntime().addShutdownHook(new Thread() {
public void run() {
try { LogService.this.stop(); }
catch (InterruptedException ignored) {}
}
});
}
>All JVM created threads are daemon threads except for the main thread
>Normal threads and daemon threads differ only in what happens when they exit. When a thread exits, the JVM
performs an inventory of running threads, and if the only threads that are left are daemon threads, it initiates an orderly
shutdown. When the JVM halts, any remaining daemon threads are abandoned: finally blocks are not executed,
stacks are not unwound; the JVM just exits.
>but System.exit() does not wait for all user threads to finish; it just waits for shutdown hooks to finish and terminates regardless of the presence of any user or daemon threads
Finalizers
-----------
Since finalizers can run in a thread managed by the JVM, any state accessed by a finalizer will be accessed by more than
one thread and therefore must be accessed with synchronization. Finalizers offer no guarantees on when or even if they
run, and they impose a significant performance cost on objects with nontrivial finalizers. They are also extremely
difficult to write correctly.[9] In most cases, the combination of finally blocks and explicit close methods does a better
job of resource management than finalizers; the sole exception is when you need to manage objects that hold resources
acquired by native methods. For these reasons and others, work hard to avoid writing or using classes with finalizers
(other than the platform library classes) [EJ Item 6].
Chapter 8: Applying Thread Pools
================================
>Tasks that use ThreadLocal. ThreadLocal allows each thread to have its own private "version" of a variable. However,
executors are free to reuse threads as they see fit. The standard Executor implementations may reap idle threads when
demand is low and add new ones when demand is high, and also replace a worker thread with a fresh one if an
unchecked exception is thrown from a task. ThreadLocal makes sense to use in pool threads only if the thread local
value has a lifetime that is bounded by that of a task; ThreadLocal should not be used in pool threads to communicate
values between tasks.
8.1.1. Thread Starvation Deadlock
----------------------------------------
If tasks that depend on other tasks execute in a thread pool, they can deadlock. In a single threaded executor, a task
that submits another task to the same executor and waits for its result will always deadlock. The second task sits on the
work queue until the first task completes, but the first will not complete because it is waiting for the result of the
second task. The same thing can happen in larger thread pools if all threads are executing tasks that are blocked waiting
for other tasks still on the work queue. This is called thread starvation deadlock, and can occur whenever a pool task
initiates an unbounded blocking wait for some resource or condition that can succeed only through the action of
another pool task, such as waiting for the return value or side effect of another task, unless you can guarantee that the
pool is large enough.
Sizing Thread Pools
-------------------
To size a thread pool properly, you need to understand your computing environment, your resource budget, and the
nature of your tasks. How many processors does the deployment system have? How much memory? Do tasks perform
mostly computation, I/O, or some combination? Do they require a scarce resource, such as a JDBC connection? If you
have different categories of tasks with very different behaviors, consider using multiple thread pools so each can be
tuned according to its workload.
For compute intensive tasks, an Ncpu processor system usually achieves optimum utilization with a thread pool of Ncpu
+1 threads. (Even compute intensive threads occasionally take a page fault or pause for some other reason, so an
"extra" runnable thread prevents CPU cycles from going unused when this happens.) For tasks that also include I/O or
other blocking operations, you want a larger pool, since not all of the threads will be schedulable at all times. In order to
size the pool properly, you must estimate the ratio of waiting time to compute time for your tasks; this estimate need
not be precise and can be obtained through profiling or instrumentation. Alternatively, the size of the thread pool can
be tuned by running the application using several different pool sizes under a benchmark load and observing the level of
CPU utilization.
Given these definitions:
N_cpu = number of CPUs
U_cpu = target CPU utilization, 0 <= U_cpu <= 1
W/C = ratio of wait time to compute time
The optimal pool size for keeping the processors at the desired utilization is:
N_threads = N_cpu * U_cpu * (1 + W/C)
You can determine the number of CPUs using Runtime:
int N_CPUS = Runtime.getRuntime().availableProcessors();
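#plugging the formula in; the target utilization and wait/compute ratio below are made-up illustration values
int nCpu = Runtime.getRuntime().availableProcessors();
double targetUtilization = 0.8;     // assumed: we want the CPUs ~80% busy
double waitToComputeRatio = 3.0;    // assumed: tasks wait 3x as long as they compute (I/O heavy)
int poolSize = (int) Math.ceil(nCpu * targetUtilization * (1 + waitToComputeRatio));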
>Of course, CPU cycles are not the only resource you might want to manage using thread pools. Other resources that can
contribute to sizing constraints are memory, file handles, socket handles, and database connections. Calculating pool
size constraints for these types of resources is easier: just add up how much of that resource each task requires and
divide that into the total quantity available. The result will be an upper bound on the pool size.
When tasks require a pooled resource such as database connections, thread pool size and resource pool size affect each
other. If each task requires a connection, the effective size of the thread pool is limited by the connection pool size.
Similarly, when the only consumers of connections are pool tasks, the effective size of the connection pool is limited by
the thread pool size.
>we can create different types of thread pools using Executors; but to vary the nature of thread pools(size, queueing, creation of threads, etc) you can use ThreadPoolExecutor; say configuration of a thread pool
core size - the pool size maintained under normal conditions
max pool size - the pool can grow up to this size when it is busy, but it shrinks back toward the core size when threads sit idle
keep alive time - the maximum time a thread may remain idle; after this, the thread is terminated if the current pool size exceeds the core size
>when ThreadPoolExecutor is created not all threads are created; as and when tasks are submitted, threads are created; unless you call prestartAllCoreThreads
> Developers are sometimes tempted to set the core size to zero so that the worker threads will eventually be torn down and therefore won't
prevent the JVM from exiting, but this can cause some strange seeming behavior in thread pools that don't use a SynchronousQueue for their
work queue (as newCachedThreadPool does). If the pool is already at the core size, ThreadPoolExecutor creates a new thread only if
the work queue is full. So tasks submitted to a thread pool with a work queue that has any capacity and a core size of zero will not execute until
the queue fills up, which is usually not what is desired. In Java 6, allowCoreThreadTimeOut allows you to request that all pool threads be
able to time out; enable this feature with a core size of zero if you want a bounded thread pool with a bounded work queue but still have all the
threads torn down when there is no work to do.
Listing 8.2. General Constructor for ThreadPoolExecutor.
public ThreadPoolExecutor(int corePoolSize,
int maximumPoolSize,
long keepAliveTime,
TimeUnit unit,
BlockingQueue<Runnable> workQueue,
ThreadFactory threadFactory,
RejectedExecutionHandler handler) { ... }
>The newFixedThreadPool factory sets both the core pool size and the maximum pool size to the requested pool size,
creating the effect of infinite timeout; the newCachedThreadPool factory sets the maximum pool size to
Integer.MAX_VALUE and the core pool size to zero with a timeout of one minute, creating the effect of an infinitely
expandable thread pool that will contract again when demand decreases. Other combinations are possible using the
explicit ThreadPoolExecutor constructor.
>For very large or unbounded pools, you can also bypass queuing entirely and instead hand off tasks directly from
producers to worker threads using a SynchronousQueue. A SynchronousQueue is not really a queue at all, but a
mechanism for managing handoffs between threads. In order to put an element on a SynchronousQueue, another
thread must already be waiting to accept the handoff. If no thread is waiting but the current pool size is less than the
maximum, ThreadPoolExecutor creates a new thread; otherwise the task is rejected according to the saturation
policy. Using a direct handoff is more efficient because the task can be handed right to the thread that will execute it,
rather than first placing it on a queue and then having the worker thread fetch it from the queue. SynchronousQueue is
a practical choice only if the pool is unbounded or if rejecting excess tasks is acceptable. The newCachedThreadPool
factory uses a SynchronousQueue.
>The newCachedThreadPool factory is a good default choice for an Executor, providing better queuing performance
than a fixed thread pool.[5] A fixed size thread pool is a good choice when you need to limit the number of concurrent
tasks for resource management purposes, as in a server application that accepts requests from network clients and
would otherwise be vulnerable to overload.
[5] This performance difference comes from the use of SynchronousQueue instead of LinkedBlockingQueue. SynchronousQueue
was replaced in Java 6 with a new non blocking algorithm that improved throughput in Executor benchmarks by a factor of three over the Java
5.0 SynchronousQueue implementation (Scherer et al., 2006).
Bounding either the thread pool or the work queue is suitable only when tasks are independent. With tasks that depend
on other tasks, bounded thread pools or queues can cause thread starvation deadlock; instead, use an unbounded pool
configuration like newCachedThreadPool.[6]
[6] An alternative configuration for tasks that submit other tasks and wait for their results is to use a bounded thread pool, a
SynchronousQueue as the work queue, and the caller runs saturation policy.
8.3.3. Saturation Policies
--------------------------
When a bounded work queue fills up, the saturation policy comes into play. The saturation policy for a
ThreadPoolExecutor can be modified by calling setRejectedExecutionHandler. (The saturation policy is also used
when a task is submitted to an Executor that has been shut down.) Several implementations of
RejectedExecutionHandler are provided, each implementing a different saturation policy: AbortPolicy,
CallerRunsPolicy, DiscardPolicy, and DiscardOldestPolicy.
The default policy, abort, causes execute to throw the unchecked RejectedExecutionException; the caller can catch
this exception and implement its own overflow handling as it sees fit. The discard policy silently discards the newly
submitted task if it cannot be queued for execution; the discard oldest policy discards the task that would otherwise be
executed next and tries to resubmit the new task. (If the work queue is a priority queue, this discards the highest
priority element, so the combination of a discard oldest saturation policy and a priority queue is not a good one.)
>The caller runs policy implements a form of throttling that neither discards tasks nor throws an exception, but instead
tries to slow down the flow of new tasks by pushing some of the work back to the caller. It executes the newly
submitted task not in a pool thread, but in the thread that calls execute. If we modified our WebServer example to use
a bounded queue and the caller runs policy, after all the pool threads were occupied and the work queue filled up the
next task would be executed in the main thread during the call to execute. Since this would probably take some time,
the main thread cannot submit any more tasks for at least a little while, giving the worker threads some time to catch
up on the backlog. The main thread would also not be calling accept during this time, so incoming requests will queue
up in the TCP layer instead of in the application. If the overload persisted, eventually the TCP layer would decide it has
queued enough connection requests and begin discarding connection requests as well. As the server becomes
overloaded, the overload is gradually pushed outward from the pool threads to the work queue to the application to
the TCP layer, and eventually to the client, enabling more graceful degradation under load.
Listing 8.3. Creating a Fixed sized Thread Pool with a Bounded Queue and the Caller runs Saturation Policy.
ThreadPoolExecutor executor
= new ThreadPoolExecutor(N_THREADS, N_THREADS,
0L, TimeUnit.MILLISECONDS,
new LinkedBlockingQueue<Runnable>(CAPACITY));
executor.setRejectedExecutionHandler(
new ThreadPoolExecutor.CallerRunsPolicy());
8.3.4. Thread Factories
----------------------------
Whenever a thread pool needs to create a thread, it does so through a thread factory (see Listing 8.5). The default
thread factory creates a new, nondaemon thread with no special configuration. Specifying a thread factory allows you to
customize the configuration of pool threads. ThreadFactory has a single method, newThread, that is called whenever a
thread pool needs to create a new thread.
There are a number of reasons to use a custom thread factory. You might want to specify an
UncaughtExceptionHandler for pool threads, or instantiate an instance of a custom Thread class, such as one that
performs debug logging. You might want to modify the priority (generally not a very good idea; see Section 10.3.1) or
set the daemon status (again, not all that good an idea; see Section 7.4.2) of pool threads. Or maybe you just want to
give pool threads more meaningful names to simplify interpreting thread dumps and error logs.
Listing 8.5. ThreadFactory Interface.
public interface ThreadFactory {
Thread newThread(Runnable r);
}
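#a sketch of a custom factory that names its threads and installs a logging UncaughtExceptionHandler (class and field names are mine)
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class NamedThreadFactory implements ThreadFactory {
    private final String poolName;
    private final AtomicInteger counter = new AtomicInteger();

    public NamedThreadFactory(String poolName) { this.poolName = poolName; }

    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, poolName + "-" + counter.incrementAndGet());
        t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
            public void uncaughtException(Thread thread, Throwable e) {
                Logger.getAnonymousLogger().log(Level.SEVERE,
                        "Thread " + thread.getName() + " died", e);
            }
        });
        return t;                     // leave priority and daemon status at their defaults
    }
}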
>The interesting customization takes place in MyAppThread, shown in Listing 8.7, which lets you provide a thread name,
sets a custom UncaughtExceptionHandler that writes a message to a Logger, maintains statistics on how many
threads have been created and destroyed, and optionally writes a debug message to the log when a thread is created or
terminates.
If your application takes advantage of security policies to grant permissions to particular codebases, you may want to
use the privilegedThreadFactory factory method in Executors to construct your thread factory. It creates pool
threads that have the same permissions, AccessControlContext, and contextClassLoader as the thread creating the
privilegedThreadFactory. Otherwise, threads created by the thread pool inherit permissions from whatever client
happens to be calling execute or submit at the time a new thread is needed, which could cause confusing security
related exceptions.
8.3.5. Customizing ThreadPoolExecutor After Construction
--------------------------------------------------------
Most of the options passed to the ThreadPoolExecutor constructors can also be modified after construction via setters
(such as the core thread pool size, maximum thread pool size, keep alive time, thread factory, and rejected execution
handler). If the Executor is created through one of the factory methods in Executors (except
newSingleThreadExecutor), you can cast the result to ThreadPoolExecutor to access the setters as in Listing 8.8.
Executors includes a factory method, unconfigurableExecutorService, which takes an existing ExecutorService
and wraps it with one exposing only the methods of ExecutorService so it cannot be further configured. Unlike the
pooled implementations, newSingleThreadExecutor returns an ExecutorService wrapped in this manner, rather than
a raw ThreadPoolExecutor. While a single threaded executor is actually implemented as a thread pool with one
thread, it also promises not to execute tasks concurrently. If some misguided code were to increase the pool size on a
single threaded executor, it would undermine the intended execution semantics.
8.4. Extending ThreadPoolExecutor
---------------------------------
ThreadPoolExecutor was designed for extension, providing several "hooks" for subclasses to override: beforeExecute,
afterExecute, and terminated. These can be used to extend the behavior of ThreadPoolExecutor.
The beforeExecute and afterExecute hooks are called in the thread that executes the task, and can be used for
adding logging, timing, monitoring, or statistics gathering. The afterExecute hook is called whether the task completes
by returning normally from run or by throwing an Exception. (If the task completes with an Error, afterExecute is not
called.) If beforeExecute throws a RuntimeException, the task is not executed and afterExecute is not called.
The terminated hook is called when the thread pool completes the shutdown process, after all tasks have finished and
all worker threads have shut down. It can be used to release resources allocated by the Executor during its lifecycle,
perform notification or logging, or finalize statistics gathering.
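#a sketch of the timing hooks (roughly the book's TimingThreadPool idea): beforeExecute records a start time in a ThreadLocal, afterExecute logs the elapsed time; both run in the pool thread that executes the task
import java.util.concurrent.*;
import java.util.logging.Logger;

public class TimingThreadPool extends ThreadPoolExecutor {
    private final ThreadLocal<Long> startTime = new ThreadLocal<Long>();
    private final Logger log = Logger.getAnonymousLogger();

    public TimingThreadPool(int poolSize) {
        super(poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
              new LinkedBlockingQueue<Runnable>());
    }

    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        startTime.set(System.nanoTime());          // runs just before the task
    }

    protected void afterExecute(Runnable r, Throwable t) {
        try {
            long elapsed = System.nanoTime() - startTime.get();
            log.info(String.format("task %s: %d ns", r, elapsed));
        } finally {
            super.afterExecute(r, t);
        }
    }
}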
>Sequential loop iterations are suitable for parallelization when each iteration is independent of the others and the work
done in each iteration of the loop body is significant enough to offset the cost of managing a new task.
Chapter 13: Explicit Locks
==========================
>Lock and ReentrantLock
Why create a new locking mechanism that is so similar to intrinsic locking? Intrinsic locking works fine in most situations
but has some functional limitations: it is not possible to interrupt a thread waiting to acquire a lock, or to attempt to
acquire a lock without being willing to wait for it forever. Intrinsic locks also must be released in the same block of code
in which they are acquired; this simplifies coding and interacts nicely with exception handling, but makes non block
structured locking disciplines impossible. None of these are reasons to abandon synchronized, but in some cases a
more flexible locking mechanism offers better liveness or performance.
>With intrinsic locks, a deadlock is fatal: the only way to recover is to restart the application,
and the only defense is to construct your program so that inconsistent lock ordering is impossible. Timed and polled locking offer another option: probabilistic deadlock avoidance.
>lockInterruptibly() is useful for implementing cancellable tasks
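#sketch of polled and interruptible acquisition with ReentrantLock (method names are mine); note the unlock in finally, since release is no longer automatic as it is with synchronized
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class ExplicitLockDemo {
    private final Lock lock = new ReentrantLock();

    // timed, polled acquisition: gives up (instead of deadlocking) if the lock isn't free in time
    public boolean doWithTimeout(long timeoutMillis) throws InterruptedException {
        if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS))
            return false;
        try {
            // ... guarded work ...
            return true;
        } finally {
            lock.unlock();
        }
    }

    // interruptible acquisition: a cancellable task can be interrupted even while waiting for the lock
    public void doCancellably() throws InterruptedException {
        lock.lockInterruptibly();
        try {
            // ... guarded work ...
        } finally {
            lock.unlock();
        }
    }
}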
> When we started this book, ReentrantLock seemed the last word in lock scalability. Less than a year later, intrinsic locking gives it a good run for its money. Performance is not just a moving target, it can be a fast moving target.
>the comparative advantage of explicit locks over intrinsic went down in Java 6 from what it was in Java 5 because intrinsic locks improved in Java6
>With a fair lock, a newly requesting thread is queued if the lock is held by another thread or if threads are queued waiting for the lock; with a nonfair lock, the thread is queued only if the lock is currently held.[4]
[4] The polled tryLock always barges, even for fair locks.
#otherwise there is no point in having tryLock() method if it can't barge in?!
>usually non-fair locks have higher throughput than fair lock because of reduced time spent in queueing threads
>concurrent collections have even higher throughput than non-fair locks
>auto lock release of intrinsic locking is a far greater advantage over the explicit locks; use explicit locks only when you can't achieve what you want with synchronized code
A warning cry
-------------
Under Java 5.0, intrinsic locking has another advantage over ReentrantLock: thread dumps show which call frames
acquired which locks and can detect and identify deadlocked threads. The JVM knows nothing about which threads hold
ReentrantLocks and therefore cannot help in debugging threading problems using ReentrantLock. This disparity is
addressed in Java 6 by providing a management and monitoring interface with which locks can register, enabling locking
information for ReentrantLocks to appear in thread dumps and through other management and debugging interfaces.
The availability of this information for debugging is a substantial, if mostly temporary, advantage for synchronized;
locking information in thread dumps has saved many programmers from utter consternation. The non block structured
nature of ReentrantLock still means that lock acquisitions cannot be tied to specific stack frames, as they can with
intrinsic locks.
Future performance improvements are likely to favor synchronized over ReentrantLock. Because synchronized is
built into the JVM, it can perform optimizations such as lock elision for thread confined lock objects and lock coarsening
to eliminate synchronization with intrinsic locks (see Section 11.3.2); doing this with library based locks seems far less
likely. Unless you are deploying on Java 5.0 for the foreseeable future and you have a demonstrated need for
ReentrantLock's scalability benefits on that platform, it is not a good idea to choose ReentrantLock over
synchronized for performance reasons.
ReadWrite Lock
--------------
>read write locks allow: a resource can be accessed by multiple readers or a single
writer at a time, but not both.
>The interaction between the read and write locks allows for a number of possible implementations. Some of the
implementation options for a ReadWriteLock are:
Release preference. When a writer releases the write lock and both readers and writers are queued up, who should be
given preference: readers, writers, or whoever asked first?
Reader barging. If the lock is held by readers but there are waiting writers, should newly arriving readers be granted
immediate access, or should they wait behind the writers? Allowing readers to barge ahead of writers enhances
concurrency but runs the risk of starving writers.
Reentrancy. Are the read and write locks reentrant?
Downgrading. If a thread holds the write lock, can it acquire the read lock without releasing the write lock? This would
let a writer "downgrade" to a read lock without letting other writers modify the guarded resource in the meantime.
Upgrading. Can a read lock be upgraded to a write lock in preference to other waiting readers or writers? Most read
write lock implementations do not support upgrading, because without an explicit upgrade operation it is deadlock
prone. (If two readers simultaneously attempt to upgrade to a write lock, neither will release the read lock.)
ReentrantReadWriteLock provides reentrant locking semantics for both locks. Like ReentrantLock, a
ReentrantReadWriteLock can be constructed as non fair (the default) or fair. With a fair lock, preference is given to the
thread that has been waiting the longest; if the lock is held by readers and a thread requests the write lock, no more
readers are allowed to acquire the read lock until the writer has been serviced and releases the write lock. With a non
fair lock, the order in which threads are granted access is unspecified. Downgrading from writer to reader is permitted;
upgrading from reader to writer is not (attempting to do so results in deadlock).
Like ReentrantLock, the write lock in ReentrantReadWriteLock has a unique owner and can be released only by the
thread that acquired it. In Java 5.0, the read lock behaves more like a Semaphore than a lock, maintaining only the count
of active readers, not their identities. This behavior was changed in Java 6 to keep track also of which threads have been
granted the read lock. [6]
[6] One reason for this change is that under Java 5.0, the lock implementation cannot distinguish between a thread requesting the read lock for the
first time and a reentrant lock request, which would make fair read write locks deadlock prone.
Read write locks can improve concurrency when locks are typically held for a moderately long time and most operations
do not modify the guarded resources. ReadWriteMap in Listing 13.7 uses a ReentrantReadWriteLock to wrap a Map so
that it can be shared safely by multiple readers and still prevent reader-writer or writer-writer conflicts.[7] In reality,
ConcurrentHashMap's performance is so good that you would probably use it rather than this approach if all you
needed was a concurrent hash based map, but this technique would be useful if you want to provide more concurrent
access to an alternate Map implementation such as LinkedHashMap.
[7] ReadWriteMap does not implement Map because implementing the view methods such as entrySet and values would be difficult and
the "easy" methods are usually sufficient.
Chapter 14:Building Custom Synchronizers
========================================
>Every call to wait is implicitly associated with a specific condition predicate. When calling wait regarding a particular condition predicate, the caller must already hold the lock associated with the condition queue, and that lock must also guard the state variables from which the condition predicate is composed.
>When control re enters the code calling wait, it has reacquired the lock associated with the condition queue. Is the condition predicate now true? Maybe. It might have been true at the time the notifying thread called notifyAll, but could have become false again by the time you reacquire the lock. Other threads may have acquired the lock and changed the object's state between when your thread was awakened and when wait reacquired the lock. Or maybe it hasn't been true at all since you called wait. You don't know why another thread called notify or notifyAll; maybe it was because another condition predicate associated with the same condition queue became true. Multiple condition predicates per condition queue are quite common; BoundedBuffer uses the same condition queue for both the "not full" and "not empty" predicates.[9]
>For all these reasons, when you wake up from wait you must test the condition predicate again, and go back to waiting
(or fail) if it is not yet true. Since you can wake up repeatedly without your condition predicate being true, you must
therefore always call wait from within a loop, testing the condition predicate in each iteration. The canonical form for a
condition wait is shown in Listing 14.7.
Listing 14.7. Canonical Form for State dependent Methods.
void stateDependentMethod() throws InterruptedException {
// condition predicate must be guarded by lock
synchronized(lock) {
while (!conditionPredicate())
lock.wait();
// object is now in desired state
}
}
>When using condition waits (Object.wait or Condition.await):
Always have a condition predicate, some test of object state that must hold before proceeding;
Always test the condition predicate before calling wait, and again after returning from wait;
Always call wait in a loop;
Ensure that the state variables making up the condition predicate are guarded by the lock associated with the condition queue;
Hold the lock associated with the condition queue when calling wait, notify, or notifyAll; and
Do not release the lock after checking the condition predicate but before acting on it.
>Missing signals: if the notification is delivered before you start waiting, you miss it and may end up waiting forever; so check the condition predicate before calling wait (while holding the lock), as shown in the previous code sample
>Whenever you wait on a condition, make sure that someone will perform a notification whenever the condition predicate becomes true.
>when two threads wait on different condition predicates on the same object, a single notify() might wake up the wrong thread, and the intended thread may then wait forever because its notification was hijacked; so use notifyAll()
>Single notify can be used instead of notifyAll only when both of the following conditions hold:
Uniform waiters. Only one condition predicate is associated with the condition queue, and each thread executes the
same logic upon returning from wait; and One in, one out. A notification on the condition variable enables at most one thread to proceed.
#latch that lets multiple threads in is not a candidate for single notify
#bounded buffer where threads might wait for two different conditions(empty or full) again is not a candidate for single notify
>but a single notify() is more efficient than notifyAll() when it can be used
> If ten threads are waiting on a condition
queue, calling notifyAll causes each of them to wake up and contend for the lock; then most or all of them will go
right back to sleep. This means a lot of context switches and a lot of contended lock acquisitions for each event that
enables (maybe) a single thread to make progress. (In the worst case, using notifyAll results in O(n^2) wakeups where
n would suffice.) This is another situation where performance concerns support one approach and safety concerns
support the other.
>a notification is performed every time an
object is put into or removed from the buffer. This could be optimized by observing that a thread can be released from a
wait only if the buffer goes from empty to not empty or from full to not full, and notifying only if a put or take effected
one of these state transitions. This is called conditional notification. While conditional notification can improve
performance, it is tricky to get right (and also complicates the implementation of subclasses) and so should be used
carefully.
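A sketch of conditional notification in a bounded buffer (illustrative, not the book's exact listing; only the empty-to-not-empty and full-to-not-full transitions notify):

// Sketch: notifyAll is called only on the state transitions that can actually unblock a waiter.
public class CondNotifyBuffer<V> {
    private final Object[] buf;
    private int head, tail, count;            // guarded by the buffer's intrinsic lock

    public CondNotifyBuffer(int capacity) { buf = new Object[capacity]; }

    public synchronized void put(V v) throws InterruptedException {
        while (count == buf.length)           // wait while full
            wait();
        boolean wasEmpty = (count == 0);
        buf[tail] = v;
        if (++tail == buf.length) tail = 0;
        ++count;
        if (wasEmpty)                          // only empty -> not-empty can unblock a take()
            notifyAll();
    }

    @SuppressWarnings("unchecked")
    public synchronized V take() throws InterruptedException {
        while (count == 0)                     // wait while empty
            wait();
        boolean wasFull = (count == buf.length);
        V v = (V) buf[head];
        buf[head] = null;
        if (++head == buf.length) head = 0;
        --count;
        if (wasFull)                           // only full -> not-full can unblock a put()
            notifyAll();
        return v;
    }
}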
Single notification and conditional notification are optimizations. As always, follow the principle "First make it right, and
then make it fast, if it is not already fast enough" when using these optimizations; it is easy to introduce strange liveness
failures by applying them incorrectly.
>AbstractQueuedSynchronizer, upon which most of the state dependent classes in java.util.concurrent are built
(see Section 14.4), exploits the concept of exit protocol. Rather than letting synchronizer classes perform their own
notification, it instead requires synchronizer methods to return a value indicating whether its action might have
unblocked one or more waiting threads. This explicit API requirement makes it harder to "forget" to notify on some
state transitions.
>Intrinsic condition queues have several drawbacks. Each intrinsic lock can have only one associated condition queue,
which means that in classes like BoundedBuffer multiple threads might wait on the same condition queue for different
condition predicates, and the most common pattern for locking involves exposing the condition queue object. Both of
these factors make it impossible to enforce the uniform waiter requirement for using single notify. If you want to write a
concurrent object with multiple condition predicates, or you want to exercise more control over the visibility of the
condition queue, the explicit Lock and Condition classes offer a more flexible alternative to intrinsic locks and
condition queues.
>A Condition is associated with a single Lock, just as a condition queue is associated with a single intrinsic lock; to create
a Condition, call Lock.newCondition on the associated lock. And just as Lock offers a richer feature set than intrinsic
locking, Condition offers a richer feature set than intrinsic condition queues: multiple wait sets per lock, interruptible
and uninterruptible condition waits, deadline based waiting, and a choice of fair or nonfair queueing.
Listing 14.10. Condition Interface.
public interface Condition {
void await() throws InterruptedException;
boolean await(long time, TimeUnit unit)
throws InterruptedException;
long awaitNanos(long nanosTimeout) throws InterruptedException;
void awaitUninterruptibly();
boolean awaitUntil(Date deadline) throws InterruptedException;
void signal();
void signalAll();
}
>Unlike intrinsic condition queues, you can have as many Condition objects per Lock as you want. Condition objects
inherit the fairness setting of their associated Lock; for fair locks, threads are released from Condition.await in FIFO
order.
Hazard warning: The equivalents of wait, notify, and notifyAll for Condition objects are await, signal, and
signalAll. However, Condition extends Object, which means that it also has wait and notify methods. Be sure to
use the proper versions, await and signal, instead!
> ReentrantLock requires that the Lock be held when calling signal or signalAll, but Lock implementations are permitted to
construct Conditions that do not have this requirement.
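Putting the pieces together, here is a sketch of a bounded buffer that uses two Conditions on one Lock so that single signal() is safe (illustrative names, in the spirit of the book's ConditionBoundedBuffer):

import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: two wait sets on one lock; each Condition has uniform waiters,
// so signal() (rather than signalAll()) can be used safely.
public class TwoConditionBuffer<T> {
    private final Lock lock = new ReentrantLock();
    private final Condition notFull  = lock.newCondition();
    private final Condition notEmpty = lock.newCondition();
    private final Object[] items;
    private int head, tail, count;

    public TwoConditionBuffer(int capacity) { items = new Object[capacity]; }

    public void put(T x) throws InterruptedException {
        lock.lock();
        try {
            while (count == items.length)
                notFull.await();              // blocks until a take() signals notFull
            items[tail] = x;
            if (++tail == items.length) tail = 0;
            ++count;
            notEmpty.signal();                // exactly one waiting take() can proceed
        } finally {
            lock.unlock();
        }
    }

    @SuppressWarnings("unchecked")
    public T take() throws InterruptedException {
        lock.lock();
        try {
            while (count == 0)
                notEmpty.await();             // blocks until a put() signals notEmpty
            T x = (T) items[head];
            items[head] = null;
            if (++head == items.length) head = 0;
            --count;
            notFull.signal();                 // exactly one waiting put() can proceed
            return x;
        } finally {
            lock.unlock();
        }
    }
}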
14.4. Anatomy of a Synchronizer
-------------------------------
>The interfaces of ReentrantLock and Semaphore have a lot in common. Both classes act as a "gate", allowing only a limited number of threads to pass at a time; threads arrive at the gate and are allowed through (lock or acquire returns successfully), are made to wait (lock or acquire blocks), or are turned away (tryLock or tryAcquire returns false, indicating that the lock or permit did not become available in the time allowed). Further, both allow interruptible, uninterruptible, and timed acquisition attempts, and both allow a choice of fair or nonfair queueing of waiting threads.
>it is a common exercise to prove that a counting semaphore can be implemented using a lock (as in SemaphoreOnLock in Listing 14.12) and that a lock can be implemented using a counting semaphore.
In actuality, they are both implemented using a common base class, AbstractQueuedSynchronizer (AQS), as are many
other synchronizers. AQS is a framework for building locks and synchronizers, and a surprisingly broad range of synchronizers can be built easily and efficiently using it. Not only are ReentrantLock and Semaphore built using AQS, but so are CountDownLatch, ReentrantReadWriteLock, SynchronousQueue,[12] and FutureTask.
[12] Java6 replaces the AQS based SynchronousQueue with a (more scalable) non blocking version.
>Using AQS to build synchronizers offers several benefits. Not only does it substantially reduce the implementation effort, but you also needn't pay for multiple points of contention, as you would when constructing one synchronizer on top of another. In SemaphoreOnLock, acquiring a permit has two places where it might block: once at the lock guarding the semaphore state, and then again if a permit is not available. Synchronizers built with AQS have only one point where they might block, reducing context switch overhead and improving throughput. AQS was designed for scalability, and all the synchronizers in java.util.concurrent that are built with AQS benefit from this.
AbstractQueuedSynchronizer
--------------------------
>For a class to be state dependent, it must have some state. AQS takes on the task of managing some of the state for the synchronizer class: it manages a single integer of state information that can be manipulated through the protected getState, setState, and compareAndSetState methods. This can be used to represent arbitrary state; for example, ReentrantLock uses it to represent the count of times the owning thread has acquired the lock, Semaphore uses it to represent the number of permits remaining, and FutureTask uses it to represent the state of the task (not yet started, running, completed, cancelled). Synchronizers can also manage additional state variables themselves; for example, ReentrantLock keeps track of the current lock owner so it can distinguish between reentrant and contended lock acquisition requests.
>Acquisition and release in AQS take the forms shown in Listing 14.13. Depending on the synchronizer, acquisition might be exclusive, as with ReentrantLock, or nonexclusive, as with Semaphore and CountDownLatch. An acquire operation has two parts. First, the synchronizer decides whether the current state permits acquisition; if so, the thread is allowed to proceed, and if not, the acquire blocks or fails. This decision is determined by the synchronizer semantics; for example, acquiring a lock can succeed if the lock is unheld, and acquiring a latch can succeed if the latch is in its terminal state.
The second part involves possibly updating the synchronizer state; one thread acquiring the synchronizer can affect whether other threads can acquire it. For example, acquiring a lock changes the lock state from "unheld" to "held", and acquiring a permit from a Semaphore reduces the number of permits left. On the other hand, the acquisition of a latch by one thread does not affect whether other threads can acquire it, so acquiring a latch does not change its state.
Listing 14.13. Canonical Forms for Acquisition and Release in AQS.
boolean acquire() throws InterruptedException {
while (state does not permit acquire) {
if (blocking acquisition requested) {
enqueue current thread if not already queued
block current thread
}
else
return failure
}
possibly update synchronization state
dequeue thread if it was queued
return success
}
void release() {
update synchronization state
if (new state may permit a blocked thread to acquire)
unblock one or more queued threads
}
>A synchronizer supporting exclusive acquisition should implement the protected methods tryAcquire, tryRelease,
and isHeldExclusively, and those supporting shared acquisition should implement tryAcquireShared and
tryReleaseShared. The acquire, acquireShared, release, and releaseShared methods in AQS call the try forms of
these methods in the synchronizer subclass to determine if the operation can proceed. The synchronizer subclass can
use getState, setState, and compareAndSetState to examine and update the state according to its acquire and
release semantics, and informs the base class through the return status whether the attempt to acquire or release the
synchronizer was successful. For example, returning a negative value from tryAcquireShared indicates acquisition
failure; returning zero indicates the synchronizer was acquired exclusively; and returning a positive value indicates the
synchronizer was acquired nonexclusively. The tryRelease and tryReleaseShared methods should return true if the
release may have unblocked threads attempting to acquire the synchronizer.
To simplify implementation of locks that support condition queues (like ReentrantLock), AQS also provides machinery
for constructing condition variables associated with synchronizers.
14.5.1. A Simple Latch
---------------------------
OneShotLatch in Listing 14.14 is a binary latch implemented using AQS. It has two public methods, await and signal,
that correspond to acquisition and release. Initially, the latch is closed; any thread calling await blocks until the latch is
opened. Once the latch is opened by a call to signal, waiting threads are released and threads that subsequently arrive
at the latch will be allowed to proceed.
Listing 14.14. Binary Latch Using AbstractQueuedSynchronizer.
import java.util.concurrent.locks.AbstractQueuedSynchronizer;
import net.jcip.annotations.ThreadSafe;   // @ThreadSafe annotation from the JCIP annotations library

@ThreadSafe
public class OneShotLatch {
private final Sync sync = new Sync();
public void signal() { sync.releaseShared(0); }
public void await() throws InterruptedException {
sync.acquireSharedInterruptibly(0);
}
private class Sync extends AbstractQueuedSynchronizer {
protected int tryAcquireShared(int ignored) {
// Succeed if latch is open (state == 1), else fail
return (getState() == 1) ? 1 : -1;
}
protected boolean tryReleaseShared(int ignored) {
setState(1); // Latch is now open
return true; // Other threads may now be able to acquire
}
}
}
In OneShotLatch, the AQS state holds the latch state: closed (zero) or open (one). The await method calls
acquireSharedInterruptibly in AQS, which in turn consults the tryAcquireShared method in OneShotLatch. The
tryAcquireShared implementation must return a value indicating whether or not acquisition can proceed. If the latch
has been previously opened, tryAcquireShared returns success, allowing the thread to pass; otherwise it returns a
value indicating that the acquisition attempt failed. The acquireSharedInterruptibly method interprets failure to
mean that the thread should be placed on the queue of waiting threads. Similarly, signal calls releaseShared, which
causes tryReleaseShared to be consulted. The tryReleaseShared implementation unconditionally sets the latch state
to open and indicates (through its return value) that the synchronizer is in a fully released state. This causes AQS to let
all waiting threads attempt to reacquire the synchronizer, and acquisition will now succeed because tryAcquireShared
returns success.
OneShotLatch is a fully functional, usable, performant synchronizer, implemented in only twenty or so lines of code. Of
course, it is missing some useful features such as timed acquisition or the ability to inspect the latch state, but these are
easy to implement as well, since AQS provides timed versions of the acquisition methods and utility methods for
common inspection operations.
OneShotLatch could have been implemented by extending AQS rather than delegating to it, but this is undesirable for
several reasons [EJ Item 14]. Doing so would undermine the simple (two method) interface of OneShotLatch, and while
the public methods of AQS won't allow callers to corrupt the latch state, callers could easily use them incorrectly. None
of the synchronizers in java.util.concurrent extends AQS directly; they all delegate to private inner subclasses of
AQS instead.
14.6.1. ReentrantLock
----------------------
ReentrantLock supports only exclusive acquisition, so it implements tryAcquire, tryRelease, and
isHeldExclusively; tryAcquire for the non fair version is shown in Listing 14.15. ReentrantLock uses the
synchronization state to hold the lock acquisition count, and maintains an owner variable holding the identity of the
owning thread that is modified only when the current thread has just acquired the lock or is just about to release it.[14] In
tryRelease, it checks the owner field to ensure that the current thread owns the lock before allowing an unlock to
proceed; in tryAcquire, it uses this field to differentiate between a reentrant acquisition and a contended acquisition
attempt.
[14] Because the protected state manipulation methods have the memory semantics of a volatile read or write and ReentrantLock is careful to
read the owner field only after calling getState and write it only before calling setState, ReentrantLock can piggyback on the memory
semantics of the synchronization state, and thus avoid further synchronization; see Section 16.1.4.
When a thread attempts to acquire a lock, tryAcquire first consults the lock state. If it is unheld, it tries to update the
lock state to indicate that it is held. Because the state could have changed since it was first inspected a few instructions
ago, tryAcquire uses compareAndSetState to attempt to atomically update the state to indicate that the lock is now
held and confirm that the state has not changed since last observed. (See the description of compareAndSet in Section
15.3.) If the lock state indicates that it is already held and the current thread is the owner of the lock, the acquisition
count is incremented; if the current thread is not the owner of the lock, the acquisition attempt fails.
Listing 14.15. tryAcquire Implementation From Non fair ReentrantLock.
protected boolean tryAcquire(int ignored) {
final Thread current = Thread.currentThread();
int c = getState();
if (c == 0) {
if (compareAndSetState(0, 1)) {
owner = current;
return true;
}
} else if (current == owner) {
setState(c+1);
return true;
}
return false;
}
ReentrantLock also takes advantage of AQS's built in support for multiple condition variables and wait sets.
Lock.newCondition returns a new instance of ConditionObject, an inner class of AQS.
>semaphore-style synchronizers (shared acquisition) override only the xxxShared() versions of the AQS methods, i.e. tryAcquireShared() and tryReleaseShared()
14.6.2. Semaphore and CountDownLatch
---------------------------------------
Semaphore uses the AQS synchronization state to hold the count of permits currently available. The tryAcquireShared
method (see Listing 14.16) first computes the number of permits remaining, and if there are not enough, returns a value
indicating that the acquire failed. If sufficient permits appear to be left, it attempts to atomically reduce the permit
count using compareAndSetState. If that succeeds (meaning that the permit count had not changed since it last
looked), it returns a value indicating that the acquire succeeded. The return value also encodes whether other shared
acquisition attempts might succeed, in which case other waiting threads will also be unblocked.
The while loop terminates either when there are not enough permits or when tryAcquireShared can atomically
update the permit count to reflect acquisition. While any given call to compareAndSetState may fail due to contention
with another thread (see Section 15.3), causing it to retry, one of these two termination criteria will become true within
a reasonable number of retries. Similarly, tryReleaseShared increases the permit count, potentially unblocking waiting
threads, and retries until the update succeeds. The return value of tryReleaseShared indicates whether other threads
might have been unblocked by the release.
CountDownLatch uses AQS in a similar manner to Semaphore: the synchronization state holds the current count. The
countDown method calls release, which causes the counter to be decremented and unblocks waiting threads if the
counter has reached zero; await calls acquire, which returns immediately if the counter has reached zero and
otherwise blocks.
Listing 14.16. tryAcquireShared and tryReleaseShared from Semaphore.
protected int tryAcquireShared(int acquires) {
while (true) {
int available = getState();
int remaining = available - acquires;
if (remaining < 0
|| compareAndSetState(available, remaining))
return remaining;
}
}
protected boolean tryReleaseShared(int releases) {
while (true) {
int p = getState();
if (compareAndSetState(p, p + releases))
return true;
}
}
14.6.3. FutureTask
-------------------
At first glance, FutureTask doesn't even look like a synchronizer. But Future.get has semantics that are very similar to those of a latch: if some event (the completion or cancellation of the task represented by the FutureTask) has occurred, then threads can proceed; otherwise they are queued until that event occurs.
FutureTask uses the AQS synchronization state to hold the task status (running, completed, or cancelled). It also maintains additional state variables to hold the result of the computation or the exception it threw. It further maintains a reference to the thread that is running the computation (if it is currently in the running state), so that it can be interrupted if the task is cancelled.
14.6.4. ReentrantReadWriteLock
-------------------------------
The interface for ReadWriteLock suggests there are two locks, a reader lock and a writer lock; but in the AQS based implementation of ReentrantReadWriteLock, a single AQS subclass manages both read and write locking. ReentrantReadWriteLock uses 16 bits of the state for the write lock count, and the other 16 bits for the read lock count. Operations on the read lock use the shared acquire and release methods; operations on the write lock use the exclusive acquire and release methods. Internally, AQS maintains a queue of waiting threads, keeping track of whether a thread has requested exclusive or shared access. In ReentrantReadWriteLock, when the lock becomes available, if the thread at the head of the queue was looking for write access it will get it, and if the thread at the head of the queue was looking for read access, all
queued threads up to the first writer will get it.[15]
[15] This mechanism does not permit the choice of a reader preference or writer preference policy, as some read write lock implementations do. For that, either the AQS wait queue would need to be something other than a FIFO queue, or two queues would be needed. However, such a strict ordering policy is rarely needed in practice; if the nonfair version of ReentrantReadWriteLock does not offer acceptable liveness, the fair version usually provides satisfactory ordering and guarantees nonstarvation of readers and writers.
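A rough sketch of how both counts can be packed into the single AQS state int; the 16/16 split is as described above, but treat the exact layout and the helper names here as assumptions, shown only for intuition:

// Sketch only: one plausible encoding of two counts in one int.
final class RwStateSketch {
    static final int SHARED_SHIFT = 16;
    static final int MAX_COUNT = (1 << SHARED_SHIFT) - 1;               // 65535 per side

    static int readCount(int state)  { return state >>> SHARED_SHIFT; } // high 16 bits
    static int writeCount(int state) { return state & MAX_COUNT; }      // low 16 bits

    static int addReader(int state)     { return state + (1 << SHARED_SHIFT); }
    static int addWriterHold(int state) { return state + 1; }
}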
====================
NON-BLOCKING THREADS
====================
Refer http://www.ibm.com/developerworks/java/library/j-jtp04186/
--
>no locking mechanism is used to execute a critical section; all threads are allowed to perform their task simultaneously; however, upon detection of outdated data in the middle of an atomic operation, the client that requested the atomic operation must be ready for an aborted attempt; that is, its entire operation is cancelled and it has to try again
>outdated data in the middle of an atomic sequence is detected by the 'Compare And Set' (CAS) methodology: in the final step of committing the data, the data is checked once more for consistency; setting the data is always preceded by comparing the current data with the data read at the start of the atomic operation (other threads may have butted in in between)
>usually CAS (Compare And Set) is provided directly by hardware instructions, and non-blocking algorithms build on these machine instructions
>from Java 5, atomic registers (data fields) are provided as atomic data types by the java.util.concurrent.atomic package; these atomic types can now be used to build non-blocking algorithms in Java itself
>non-blocking algorithms are fundamentally something similar to source control conflict-merge operations; for example, in CVS or ClearCase, when two different copies are committed at the same time, one succeeds and one fails; the failed copy should update its source one more time and make its changes on top of it and try to commit again and so on; this principle is used in non-blocking thread implementations
>non-blocking threads should request/entrust the entire atomic operation to the data; sometimes it gets accepted, sometimes it gets rejected; upon rejection, the thread has to reupdate the data and based on the new data, it has to compile its atomic operation and try again
>in a locking mechanism, the overhead of context switching is involved when a thread has to wait for a lock; the thread is effectively suspended, doing no work, until it acquires the lock; in a non-blocking design, the thread at worst reattempts to modify the data a couple of times instead of being suspended; that is all; so throughput and performance are superior in non-blocking systems
>but! when contention for a resource is extremely intense, non-blocking threads keep attempting to modify the data and the rejection rate (attempts failing because the data was updated underneath) becomes very high, and it is high for all threads; in this situation blocking threads perform better, because they simply wait for the lock and take turns instead of adding to the chaos and driving the rejection rate even higher
>non-blocking threads can be implemented only when atomic instructions are supported by the hardware
>java atomic classes are built on top of machine level atomic instructions which are very common in today's processors; but if the underlying platform does not support machine level atomic instructions, then an internal locking mechanism is used and hence it may not be an actual non-blocking implementation
>java atomic classes are kind of extension on volatile variables
>we know that volatile variables are so volatile that care should be taken when reading and modifying them, because they can change at any time
>atomic java classes provide methods which act like they are dealing with volatile variables
>some methods provide behavior 'read like you read a volatile variable' - get() method
>some provide the behavior 'write like you write a volatile variable' - set()
>'read and then write like you read and write (together, one after the other) a volatile variable' - compareAndSet()
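A tiny demo of these three behaviors (illustrative class name):

import java.util.concurrent.atomic.AtomicInteger;

public class AtomicVolatileDemo {
    public static void main(String[] args) {
        AtomicInteger hits = new AtomicInteger(0);

        int seen = hits.get();                    // read, like reading a volatile int
        hits.set(42);                             // write, like writing a volatile int
        boolean won = hits.compareAndSet(42, 43); // atomic "read then write": succeeds only if still 42
        System.out.println(seen + " " + won + " " + hits.get());   // prints: 0 true 43
    }
}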
>as long as your atomic operation consists of only one CAS (counters and stacks), your task ends in one do-while loop where you keep trying until the CAS succeeds; but what if your atomic operation is a little more complex (a linked list, or a queue with a tail pointer) and involves two or more CAS operations? threads are safe only with respect to a single CAS, so things can go wrong between the two CAS operations of one thread; the solution is to make the intermediate state detectable by every thread through the data itself (in a tailed queue, any thread can see that the queue is mid-update when the node the tail points to has a non-null next reference, meaning some thread is about to swing the tail to the real new tail); instead of waiting for that thread to finish, the thread that sees the inconsistency finishes the other thread's job (advances the tail) and then continues with its own task; the original thread simply attempts its second CAS and treats the operation as complete whether or not that CAS succeeds, because if it failed, some other thread has already done the job; see the queue sketch after the next point
>so the data structure is always in one of two states: the normal (quiescent) state, or an intermediate state in the middle of a multi-CAS update; in the intermediate state, any thread must be able to see the inconsistency and clean it up (finish the update) on its own
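A sketch of the put() side of such a tailed queue, in the spirit of the Michael Scott algorithm that ConcurrentLinkedQueue is based on (names are illustrative; take() is omitted):

import java.util.concurrent.atomic.AtomicReference;

// Sketch: a trailing tail pointer is the detectable "intermediate" state; any thread
// that sees it helps by advancing the tail before attempting its own insertion.
public class LinkedQueueSketch<E> {

    private static class Node<E> {
        final E item;
        final AtomicReference<Node<E>> next;
        Node(E item, Node<E> next) {
            this.item = item;
            this.next = new AtomicReference<Node<E>>(next);
        }
    }

    private final Node<E> dummy = new Node<E>(null, null);
    private final AtomicReference<Node<E>> head = new AtomicReference<Node<E>>(dummy); // take() omitted
    private final AtomicReference<Node<E>> tail = new AtomicReference<Node<E>>(dummy);

    public boolean put(E item) {
        Node<E> newNode = new Node<E>(item, null);
        while (true) {
            Node<E> curTail = tail.get();
            Node<E> tailNext = curTail.next.get();
            if (curTail == tail.get()) {
                if (tailNext != null) {
                    // Intermediate state: help the other thread by advancing the tail
                    tail.compareAndSet(curTail, tailNext);
                } else {
                    // Quiescent state: try to link the new node (first CAS)
                    if (curTail.next.compareAndSet(null, newNode)) {
                        // Linked; try to swing the tail (second CAS). Failure is fine:
                        // it means some other thread has already helped.
                        tail.compareAndSet(curTail, newNode);
                        return true;
                    }
                }
            }
        }
    }
}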
Usage of non-blocking algos:
----------------------------
>GC uses them for concurrent and parallel collection
>schedulers use them to schedule processes and threads
>SynchronousQueue was replaced with a new non-blocking version (in Java 6)
New IDEA (well, an old idea really; this is what happens in atomic-update while loops and CAS):
----------------------------------------------------------------------------
One way to achieve non-blocking concurrency is to appoint a gateway agent for the shared data and put all requests to read and modify it through that agent, using a queue to hold such requests; the agent performs the requests one by one from the queue (note that each request is an atomic operation, a set of operations that must be performed together or cancelled altogether); any request that cannot be committed to the shared data (because of an inconsistency caused by an earlier request from someone else) is rejected and a notification is sent to the requester; the requester should always wait for the success or failure notification for its request in order to conclude its update/read operation; upon receiving a failure status, the requester has to place a fresh request.
Another thing that would make everything easier is to let the said agent broadcast updates to the shared data to all threads, so that any thread that is preparing to place a new request is well informed about the latest ongoing events.
--x--
Chapter 15: Atomic variables and non-blocking algorithms
========================================================
>Much of the recent research on concurrent algorithms has focused on non blocking algorithms, which use low level atomic machine instructions such as compare and swap instead of locks to ensure data integrity under concurrent access. Non blocking algorithms are used extensively in operating systems and JVMs for thread and process scheduling, garbage collection, and to implement locks and other concurrent data structures.
>As of Java 5.0, it is possible to build efficient non blocking algorithms in Java using the atomic variable classes such as AtomicInteger and AtomicReference. Atomic variables can also be used as "better volatile variables" even if you are not developing non blocking algorithms. Atomic variables offer the same memory semantics as volatile variables, but with additional support for atomic updates making them ideal for counters, sequence generators, and statistics gathering while offering better scalability than lock based alternatives.
>Volatile variables are a lighter weight synchronization mechanism than locking because they do not involve context switches or thread scheduling. However, volatile variables have some limitations compared to locking: while they provide similar visibility guarantees, they cannot be used to construct atomic compound actions. This means that volatile variables cannot be used when one variable depends on another, or when the new value of a variable depends on its old value. This limits when volatile variables are appropriate, since they cannot be used to reliably implement common tools such as counters or mutexes.[2]
>Locking has a few other disadvantages. When a thread is waiting for a lock, it cannot do anything else. If a thread holding a lock is delayed (due to a page fault, scheduling delay, or the like), then no thread that needs that lock can make progress. This can be a serious problem if the blocked thread is a high priority thread but the thread holding the lock is a lower priority thread, a performance hazard known as priority inversion. Even though the higher priority thread should have precedence, it must wait until the lock is released, and this effectively downgrades its priority to that of the lower priority thread. If a thread holding a lock is permanently blocked (due to an infinite loop, deadlock, livelock, or other liveness failure), any threads waiting for that lock can never make progress.
>For lock based classes with fine grained operations (such as the synchronized collections classes, where most methods contain only a few operations), the ratio of scheduling overhead to useful work can be quite high when the lock is frequently contended.
15.2. Hardware Support for Concurrency
--------------------------------------
Exclusive locking is a pessimistic technique: it assumes the worst (if you don't lock your door, gremlins will come in and rearrange your stuff) and doesn't proceed until you can guarantee, by acquiring the appropriate locks, that other threads will not interfere. For fine grained operations, there is an alternate approach that is often more efficient: the optimistic approach, whereby you proceed with an update, hopeful that you can complete it without interference. This approach relies on collision detection to determine if there has been interference from other parties during the update, in which case the operation fails and can be retried (or not). The optimistic approach is like the old saying, "It is easier to obtain forgiveness than permission", where "easier" here means "more efficient".
Processors designed for multiprocessor operation provide special instructions for managing concurrent access to shared variables. Early processors had atomic test and set, fetch and increment, or swap instructions sufficient for implementing mutexes that could in turn be used to implement more sophisticated concurrent objects. Today, nearly every modern processor has some form of atomic read modify write instruction, such as compare and swap or load linked/store conditional. Operating systems and JVMs use these instructions to implement locks and concurrent data structures, but until Java 5.0 they had not been available directly to Java classes.
15.2.1. Compare and Swap
------------------------
The approach taken by most processor architectures, including IA32 and Sparc, is to implement a compare and swap
(CAS) instruction. (Other processors, such as PowerPC, implement the same functionality with a pair of instructions: load linked and store conditional.) CAS has three operands: a memory location V on which to operate, the expected old value A, and the new value B. CAS atomically updates V to the new value B, but only if the value in V matches the expected old value A; otherwise it does nothing. In either case, it returns the value currently in V. (The variant called compare and set instead returns whether the operation succeeded.) CAS means "I think V should have the value A; if it does, put B there, otherwise don't change it but tell me I was wrong." CAS is an optimistic technique: it proceeds with the update in the hope of success, and can detect failure if another thread has updated the variable since it was last examined. SimulatedCAS in Listing 15.1 illustrates the semantics (but not the implementation or performance) of CAS. When multiple threads attempt to update the same variable simultaneously using CAS, one wins and updates the variable's value, and the rest lose. But the losers are not punished by suspension, as they could be if they failed to acquire a lock; instead, they are told that they didn't win the race this time but can try again. Because a thread that
loses a CAS is not blocked, it can decide whether it wants to try again, take some other recovery action, or do nothing.[3] This flexibility eliminates many of the liveness hazards associated with locking (though in unusual cases it can introduce the risk of livelock; see Section 10.3.3).
>writing a CAS function in Java or any other high level programming language is not the actual CAS with its actual performance benefits; you need CAS machine instructions (hardware support is necessary) to implement the real thing
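For the semantics only, here is a lock based simulation along the lines of the SimulatedCAS mentioned above (a sketch; the real thing is a single hardware instruction, not a synchronized method):

// Sketch: shows what CAS means, not how it is implemented or how fast it is.
public class SimulatedCAS {
    private int value;                        // guarded by "this"

    public synchronized int get() { return value; }

    // Returns the value that was actually in place; updates only if it matched the expectation.
    public synchronized int compareAndSwap(int expectedValue, int newValue) {
        int oldValue = value;
        if (oldValue == expectedValue)
            value = newValue;
        return oldValue;
    }

    public synchronized boolean compareAndSet(int expectedValue, int newValue) {
        return (compareAndSwap(expectedValue, newValue) == expectedValue);
    }
}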
>At first glance, the CAS based counter looks as if it should perform worse than a lock based counter; it has more operations and a more complicated control flow, and depends on the seemingly complicated CAS operation. But in reality, CAS based counters significantly outperform lock based counters if there is even a small amount of contention, and often even if there is no contention. The fast path for uncontended lock acquisition typically requires at least one CAS plus other lock related housekeeping, so more work is going on in the best case for a lock based counter than in the normal case for the CAS based counter. Since the CAS succeeds most of the time (assuming low to moderate contention), the hardware will correctly predict the branch implicit in the while loop, minimizing the overhead of the more complicated control logic.
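A counter built on that sketch, retrying in a loop until the CAS succeeds (this mirrors the CasCounter referenced later in these notes):

// Sketch of a nonblocking counter: retry the CAS until no other thread slipped in between.
public class CasCounter {
    private final SimulatedCAS value = new SimulatedCAS();   // from the sketch above

    public int getValue() { return value.get(); }

    public int increment() {
        int v;
        do {
            v = value.get();                                  // read the current value
        } while (v != value.compareAndSwap(v, v + 1));        // commit only if unchanged
        return v + 1;
    }
}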
The language syntax for locking may be compact, but the work done by the JVM and OS to manage locks is not. Locking
entails traversing a relatively complicated code path in the JVM and may entail OS level locking, thread suspension, and context switches. In the best case, locking requires at least one CAS, so using locks moves the CAS out of sight but doesn't save any actual execution cost. On the other hand, executing a CAS from within the program involves no JVM code, system calls, or scheduling activity. What looks like a longer code path at the application level is in fact a much shorter code path when JVM and OS activity are taken into account. The primary disadvantage of CAS is that it forces the caller to deal with contention (by retrying, backing off, or giving up), whereas locks deal with contention automatically by blocking until the lock is available.[5]
[5] Actually, the biggest disadvantage of CAS is the difficulty of constructing the surrounding algorithms correctly.
CAS performance varies widely across processors. On a single CPU system, a CAS typically takes on the order of a handful of clock cycles, since no synchronization across processors is necessary. As of this writing, the cost of an uncontended CAS on multiple CPU systems ranges from about ten to about 150 cycles; CAS performance is a rapidly moving target and varies not only across architectures but even across versions of the same processor. Competitive forces will likely result in continued CAS performance improvement over the next several years. A good rule of thumb is that the cost of the "fast path" for uncontended lock acquisition and release on most processors is approximately twice the cost of a CAS.
15.2.3. CAS Support in the JVM
------------------------------
So, how does Java code convince the processor to execute a CAS on its behalf? Prior to Java 5.0, there was no way to do this short of writing native code. In Java 5.0, low level support was added to expose CAS operations on int, long, and object references, and the JVM compiles these into the most efficient means provided by the underlying hardware. On platforms supporting CAS, the runtime inlines them into the appropriate machine instruction(s); in the worst case, if a CAS like instruction is not available the JVM uses a spin lock. This low level JVM support is used by the atomic variable classes (AtomicXXX in java.util.concurrent.atomic) to provide an efficient CAS operation on numeric and reference types; these atomic variable classes are used, directly or indirectly, to implement most of the classes in java.util.concurrent.
Atomic Classes
--------------
AtomicInteger bears a superficial resemblance to an extended Counter class, but offers far
greater scalability under contention because it can directly exploit underlying hardware support for concurrency.
There are twelve atomic variable classes, divided into four groups: scalars, field updaters, arrays, and compound variables. The most commonly used atomic variables are the scalars: AtomicInteger, AtomicLong, AtomicBoolean, and AtomicReference. All support CAS; the Integer and Long versions support arithmetic as well. (To simulate atomic variables of other primitive types, you can cast short or byte values to and from int, and use floatToIntBits or doubleToLongBits for floating point numbers.)
The atomic array classes (available in Integer, Long, and Reference versions) are arrays whose elements can be updated atomically. The atomic array classes provide volatile access semantics to the elements of the array, a feature not available for ordinary arrays a volatile array has volatile semantics only for the array reference, not for its elements. (The other types of atomic variables are discussed in Sections 15.4.3 and 15.4.4.)
While the atomic scalar classes extend Number, they do not extend the primitive wrapper classes such as Integer or Long. In fact, they cannot: the primitive wrapper classes are immutable whereas the atomic variable classes are mutable. The atomic variable classes also do not redefine hashCode or equals; each instance is distinct. Like most mutable objects, they are not good candidates for keys in hash based collections.
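Beyond simple counters, a common pattern with AtomicReference is to keep related state in an immutable holder object and swap the whole holder with a single CAS, which preserves multivariable invariants without locking (a sketch; class and field names are illustrative):

import java.util.concurrent.atomic.AtomicReference;

// Sketch: lower <= upper is an invariant across two values, so both are updated
// together by swapping an immutable pair with one CAS.
public class CasNumberRange {
    private static class IntPair {
        final int lower, upper;               // invariant: lower <= upper
        IntPair(int lower, int upper) { this.lower = lower; this.upper = upper; }
    }

    private final AtomicReference<IntPair> values =
            new AtomicReference<IntPair>(new IntPair(0, 0));

    public int getLower() { return values.get().lower; }
    public int getUpper() { return values.get().upper; }

    public void setLower(int i) {
        while (true) {
            IntPair oldv = values.get();
            if (i > oldv.upper)
                throw new IllegalArgumentException("Can't set lower to " + i + " > upper");
            IntPair newv = new IntPair(i, oldv.upper);
            if (values.compareAndSet(oldv, newv))   // retry if another thread got in first
                return;
        }
    }
}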
15.3.2. Performance Comparison: Locks Versus Atomic Variables
-------------------------------------------------------------
To demonstrate the differences in scalability between locks and atomic variables, we constructed a benchmark comparing several implementations of a pseudorandom number generator (PRNG). In a PRNG, the next "random" number is a deterministic function of the previous number, so a PRNG must remember the previous number as part of its state.
Listings 15.4 and 15.5 show two implementations of a thread safe PRNG, one using ReentrantLock and the other using
AtomicInteger. The test driver invokes each repeatedly; each iteration generates a random number (which fetches and
modifies the shared seed state) and also performs a number of "busy work" iterations that operate strictly on thread
local data. This simulates typical operations that include some portion of operating on shared state and some portion of operating on thread local state.
Figures 15.1 and 15.2 show throughput with low and moderate levels of simulated work in each iteration. With a low
level of thread local computation, the lock or atomic variable experiences heavy contention; with more thread local computation, the lock or atomic variable experiences less contention since it is accessed less often by each thread.
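For reference, a self-contained sketch of the AtomicInteger-based variant (the xorShift step here is just a stand-in for the book's calculateNext; the class name is illustrative):

import java.util.concurrent.atomic.AtomicInteger;

// Sketch: the shared seed is advanced with a CAS loop, so contended updates retry instead of blocking.
public class AtomicPseudoRandomSketch {
    private final AtomicInteger seed;

    public AtomicPseudoRandomSketch(int initialSeed) { this.seed = new AtomicInteger(initialSeed); }

    private static int calculateNext(int y) {      // cheap xor-shift step (illustrative)
        y ^= (y << 6);
        y ^= (y >>> 21);
        y ^= (y << 7);
        return y;
    }

    public int nextInt(int n) {
        while (true) {
            int s = seed.get();
            int next = calculateNext(s);
            if (seed.compareAndSet(s, next)) {     // lost the race? loop and try again
                int remainder = s % n;
                return remainder >= 0 ? remainder : remainder + n;
            }
        }
    }
}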
15.4. Non blocking Algorithms
-----------------------------
Lock based algorithms are at risk for a number of liveness failures. If a thread holding a lock is delayed due to blocking I/O, page fault, or other delay, it is possible that no thread will make progress. An algorithm is called non blocking if failure or suspension of any thread cannot cause failure or suspension of another thread; an algorithm is called lock free if, at each step, some thread can make progress. Algorithms that use CAS exclusively for coordination between threads can, if constructed correctly, be both non blocking and lock free. An uncontended CAS always succeeds, and if multiple threads contend for a CAS, one always wins and therefore makes progress. Non blocking algorithms are also immune to deadlock or priority inversion (though they can exhibit starvation or livelock because they can involve repeated retries).
We've seen one non blocking algorithm so far: CasCounter. Good non blocking algorithms are known for many common data structures, including stacks, queues, priority queues, and hash tables, though designing new ones is a task best left to experts.
15.4.2. A Non blocking Linked List
----------------------------------
The two non blocking algorithms we've seen so far, the counter and the stack, illustrate the basic pattern of using CAS to update a value speculatively, retrying if the update fails. The trick to building non blocking algorithms is to limit the scope of atomic changes to a single variable. With counters this is trivial, and with a stack it is straightforward enough, but for more complicated data structures such as queues, hash tables, or trees, it can get a lot trickier.
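For reference, the single CAS stack pattern looks roughly like this (Treiber's algorithm; a sketch, not the book's exact listing):

import java.util.concurrent.atomic.AtomicReference;

// Sketch: the only shared mutable state is 'top', so push and pop each need
// just one CAS, retried in a loop on failure.
public class ConcurrentStackSketch<E> {

    private static class Node<E> {
        final E item;
        Node<E> next;
        Node(E item) { this.item = item; }
    }

    private final AtomicReference<Node<E>> top = new AtomicReference<Node<E>>();

    public void push(E item) {
        Node<E> newHead = new Node<E>(item);
        Node<E> oldHead;
        do {
            oldHead = top.get();
            newHead.next = oldHead;                      // prepare the new version privately
        } while (!top.compareAndSet(oldHead, newHead));  // publish it with one CAS
    }

    public E pop() {
        Node<E> oldHead;
        Node<E> newHead;
        do {
            oldHead = top.get();
            if (oldHead == null)
                return null;                             // empty stack
            newHead = oldHead.next;
        } while (!top.compareAndSet(oldHead, newHead));
        return oldHead.item;
    }
}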
15.4.3. Atomic Field Updaters
-----------------------------
Listing 15.7 illustrates the algorithm used by ConcurrentLinkedQueue, but the actual implementation is a bit different. Instead of representing each Node with an atomic reference, ConcurrentLinkedQueue uses an ordinary volatile reference and updates it through the reflection based AtomicReferenceFieldUpdater, as shown in Listing 15.8.
Listing 15.8. Using Atomic Field Updaters in ConcurrentLinkedQueue.
private static class Node<E> {
    private final E item;
    private volatile Node<E> next;

    public Node(E item) {
        this.item = item;
    }
}

// one shared, reflection-based updater performs CAS on the volatile next field of any Node
private static AtomicReferenceFieldUpdater<Node, Node> nextUpdater
        = AtomicReferenceFieldUpdater.newUpdater(Node.class, Node.class, "next");
The atomic field updater classes (available in Integer, Long, and Reference versions) represent a reflection based "view" of an existing volatile field so that CAS can be used on existing volatile fields. The updater classes have no constructors; to create one, you call the newUpdater factory method, specifying the class and field name. The field updater classes are not tied to a specific instance; one can be used to update the target field for any instance of the target class (the same nextUpdater works for Node a and Node b). The atomicity guarantees for the updater classes are weaker than for the regular atomic classes because you cannot guarantee that the underlying fields will not be modified directly; the compareAndSet and arithmetic
methods guarantee atomicity only with respect to other threads using the atomic field updater methods (some thread might still update the Node's field directly). In ConcurrentLinkedQueue, updates to the next field of a Node are applied using the compareAndSet method of nextUpdater. This somewhat circuitous approach is used entirely for performance reasons. For frequently allocated, short lived objects like queue link nodes, eliminating the creation of an AtomicReference for each Node is significant enough to reduce the cost of insertion operations. However, in nearly all situations, ordinary atomic variables perform just fine; in only a few cases will the atomic field updaters be needed. (The atomic field updaters are also useful when you want to perform atomic updates while preserving the serialized form of an existing class.)
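A minimal usage sketch of a field updater outside of ConcurrentLinkedQueue (the class name and field are illustrative): one shared updater performs atomic updates on a plain volatile int field, so frequently allocated instances don't each carry their own AtomicInteger.

import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Sketch: the updater is created once per class and reused for every instance.
public class HitCounter {
    private volatile int hits;                        // updated only via the updater below

    private static final AtomicIntegerFieldUpdater<HitCounter> HITS_UPDATER =
            AtomicIntegerFieldUpdater.newUpdater(HitCounter.class, "hits");

    public void hit()     { HITS_UPDATER.incrementAndGet(this); }
    public int  getHits() { return hits; }            // a volatile read is enough for visibility
}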
15.4.4. The ABA Problem
-----------------------
The ABA problem is an anomaly that can arise from the naive use of compare and swap in algorithms where nodes can
be recycled (primarily in environments without garbage collection). A CAS effectively asks "Is the value of V still A?", and proceeds with the update if so. In most situations, including the examples presented in this chapter, this is entirely sufficient. However, sometimes we really want to ask "Has the value of V changed since I last observed it to be A?" For some algorithms, changing V from A to B and then back to A still counts as a change that requires us to retry some algorithmic step.
This ABA problem can arise in algorithms that do their own memory management for link node objects. In this case, that the head of a list still refers to a previously observed node is not enough to imply that the contents of the list have not changed. If you cannot avoid the ABA problem by letting the garbage collector manage link nodes for you, there is still a relatively simple solution: instead of updating the value of a reference, update a pair of values, a reference and a version number. Even if the value changes from A to B and back to A, the version numbers will be different.
AtomicStampedReference (and its cousin AtomicMarkableReference) provide atomic conditional update on a pair of
variables. AtomicStampedReference updates an object reference integer pair, allowing "versioned" references that are
immune[8] to the ABA problem. Similarly, AtomicMarkableReference updates an object reference boolean pair that is
used by some algorithms to let a node remain in a list while being marked as deleted.[9]
[8] In practice, anyway; theoretically the counter could wrap.
[9] Many processors provide a double wide CAS (CAS2 or CASX) operation that can operate on a pointer integer pair, which would make this operation reasonably efficient. As of Java 6, Atomic-StampedReference does not use double wide CAS even on platforms that support it. (Double wide CAS differs from DCAS, which operates on two unrelated memory locations; as of this writing, no current processor implements DCAS.)
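A tiny demonstration of the stamped variant (the literals and stamps are illustrative; AtomicStampedReference.get takes a one-element array to return the stamp alongside the reference):

import java.util.concurrent.atomic.AtomicStampedReference;

// Sketch: pairing a reference with a version stamp makes A -> B -> A detectable as a change.
public class AbaDemo {
    public static void main(String[] args) {
        AtomicStampedReference<String> top =
                new AtomicStampedReference<String>("A", 0);

        int[] stampHolder = new int[1];
        String observed = top.get(stampHolder);          // read value and stamp together
        int observedStamp = stampHolder[0];               // 0

        // Meanwhile another thread does A -> B -> A, bumping the stamp each time:
        top.compareAndSet("A", "B", 0, 1);
        top.compareAndSet("B", "A", 1, 2);

        // A plain AtomicReference CAS against "A" would succeed here;
        // the stamped CAS correctly fails because the stamp has moved on:
        boolean ok = top.compareAndSet(observed, "C", observedStamp, observedStamp + 1);
        System.out.println(ok);                          // false
    }
}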
16.1.3. The Java Memory Model in 500 Words or Less
--------------------------------------------------
The Java Memory Model is specified in terms of actions, which include reads and writes to variables, locks and unlocks of monitors, and starting and joining with threads. The JMM defines a partial ordering [2] called happens before on all actions within the program. To guarantee that the thread executing action B can see the results of action A (whether or not A and B occur in different threads), there must be a happens before relationship between A and B. In the absence of a happens before ordering between two operations, the JVM is free to reorder them as it pleases.
The only ordering guarantees in the JVM
--------------------------------
A data race occurs when a variable is read by more than one thread, and written by at least one thread, but the reads
and writes are not ordered by happens before. A correctly synchronized program is one with no data races; correctly
synchronized programs exhibit sequential consistency, meaning that all actions within the program appear to happen in
a fixed, global order.
The rules for happens before are:
Program order rule. Each action in a thread happens before every action in that thread that comes later in the program order.
Monitor lock rule. An unlock on a monitor lock happens before every subsequent lock on that same monitor lock.[3]
Volatile variable rule. A write to a volatile field happens before every subsequent read of that same field.[4]
Thread start rule. A call to Thread.start on a thread happens before every action in the started thread.
Thread termination rule. Any action in a thread happens before any other thread detects that thread has terminated, either by successfully returning from Thread.join or by Thread.isAlive returning false.
Interruption rule. A thread calling interrupt on another thread happens before the interrupted thread detects the
interrupt (either by having InterruptedException thrown, or invoking isInterrupted or interrupted).
Finalizer rule. The end of a constructor for an object happens before the start of the finalizer for that object.
Transitivity. If A happens before B, and B happens before C, then A happens before C.
Other happens before orderings guaranteed by the class library include:
Placing an item in a thread safe collection happens before another thread retrieves that item from the collection;
Counting down on a CountDownLatch happens before a thread returns from await on that latch;
Releasing a permit to a Semaphore happens before acquiring a permit from that same Semaphore;
Actions taken by the task represented by a Future happens before another thread successfully returns from
Future.get;
Submitting a Runnable or Callable to an Executor happens before the task begins execution; and
A thread arriving at a CyclicBarrier or Exchanger happens before the other threads are released from that
same barrier or exchange point. If CyclicBarrier uses a barrier action, arriving at the barrier happens before
the barrier action, which in turn happens before threads are released from the barrier.
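A tiny example of the program order, volatile, and transitivity rules working together (an illustrative class, written in the JDK 7 style of these notes):

// Sketch: the write to 'ready' happens-before the read that sees true, and by transitivity
// the earlier write to 'answer' is visible to the reader. Without 'volatile' on ready this
// would be a data race and the reader could see a stale answer (or spin forever).
public class HappensBeforeDemo {
    static int answer;                    // plain field, published via the volatile flag below
    static volatile boolean ready;

    public static void main(String[] args) {
        new Thread(new Runnable() {       // reader thread
            public void run() {
                while (!ready) { /* spin */ }
                System.out.println(answer);   // guaranteed to print 42
            }
        }).start();

        answer = 42;                      // program order rule: before the volatile write
        ready = true;                     // volatile rule: happens-before the read that sees true
    }
}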