Sunday, May 27, 2012

GIST NOTES 9 - Java Threads



[DISCLAIMER: This is solely for non-commercial use. I don't claim ownership of this content. It is the gist of all my reading, study and analysis. Some of it is excerpted from well-known books on the subject. Some of it is my own reflection on experiments with hand-written code samples using an IDE or Notepad.


I've created this mainly to reduce an entire book into a few pages of critical content that we should never forget. Even years later, you don't need to read the entire book again to get back its philosophy. I hope these notes will help you replay the entire book in your mind once again.]


[Java 7.0]

Light Weight Process (LWP) -> an LWP runs on top of a kernel thread; LWPs inside the same process share its address space and resources; LWPs sit between user threads and kernel threads, that is, one or more user threads are implemented on top of a single LWP

Thread -> a thread of execution: a sequence of instructions that runs as if it were the only activity on the CPU, although in reality it receives only slices of CPU time. Threads share the resources and address space of their process with the other threads inside it

Process -> a running instance of an application (though a single application can be made up of more than one process); a process has its own memory and resources, which are not shared with any other process; a process has an objective, a task or activity to be completed; to finish its job it can employ many threads internally; e.g. text editor

>an instance of Thread is just an object like any other
>a thread of execution on the other hand is a separate process(light weight) that has its own call stack
>in java there is one thread per call stack or one call stack per thread
>everything runs in some threads even if you haven't explicitly created one
>the main() method where the java application starts, actually runs on a thread called 'main' thread
>an exception that propagates all the way up to the main() method and is not handled there either terminates the main thread; the stack trace looks like this:

Exception in thread "main" java.lang.Exception
        at MyClass.main(MyClass.java:5)

>as soon as a new thread is created from main thread, call execution starts happening on a new stack separate from main's stack
>call stacks of each thread run concurrently with each other
>some JVMs map java threads onto OS threads (native threads)
>some JVMs act as a mini OS and schedule java threads on their own within the CPU cycles they obtain from the underlying OS
>"when it comes to threads very little is guaranteed"
>both you and the JVM can create daemon threads (the garbage collector typically runs on JVM-created daemon threads)
>normal threads are called user threads
>JVM doesn't wait for daemon threads to finish when it shuts down; JVM waits till all user threads complete and exit
>daemon threads are background service threads; they do not keep the JVM alive on their own
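The points about the 'main' thread and daemon threads can be tried out with a small sketch. The class name DaemonDemo, the helper methods and the thread name "background-worker" are made up for illustration; only Thread.currentThread(), getName(), setDaemon() and isDaemon() come from the JDK.

```java
// Sketch: look at the current thread and set up a daemon thread.
class DaemonDemo {
    // Returns the name of whichever thread calls this method.
    static String currentThreadName() {
        return Thread.currentThread().getName();
    }

    // Builds (but does not start) a daemon thread around the given job.
    static Thread newDaemon(Runnable job) {
        Thread t = new Thread(job, "background-worker");
        t.setDaemon(true); // must be called before start()
        return t;
    }

    public static void main(String[] args) {
        System.out.println("running on: " + currentThreadName());
        Thread d = newDaemon(new Runnable() {
            public void run() {
                while (true) { // endless housekeeping loop
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });
        d.start();
        System.out.println("daemon? " + d.isDaemon());
        // main returns here; the JVM exits although the daemon is still looping
    }
}
```

If setDaemon(true) is removed, the same program never exits, because the JVM waits for the user thread to finish.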

>To create a thread class either extend java.lang.Thread class or implement java.lang.Runnable; but to run the created thread we always need a java.lang.Thread instance

>override public void run() method in your class (it is available both in Thread class as well as Runnable interface)
>overloading run() method will not impact thread behavior; you are free to have as many overloaded run() methods as possible for fun
>thread of execution only uses 'public void run()' method and no other
>calling run() method directly doesn't start the thread
>usually Thread instance is considered as 'worker' and Runnable implementer is considered as the 'job' to be done; so by implementing Runnable interface you keep the worker and the job separate from each other
>Thread class also implements Runnable
>when a thread object is created but not started yet, it is considered not alive yet
>after calling start() method on the thread object, it is considered alive, but run() method may not have started executing yet
>unless you are trying to improve java.lang.Thread don't extend it; instead use Runnable interface
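A minimal sketch of the difference between calling run() directly and calling start(), with the job (a Runnable) kept separate from the worker (a Thread). RunVersusStart, NameRecorder and the thread name "worker" are hypothetical names chosen for this example.

```java
// Sketch: run() is an ordinary method call; start() creates a new call stack.
class RunVersusStart {
    // The "job": records the name of whichever thread executes run().
    static class NameRecorder implements Runnable {
        volatile String executedBy;

        public void run() {
            executedBy = Thread.currentThread().getName();
        }
    }

    // Calls run() directly, so the job executes on the caller's own stack.
    static String viaRun() {
        NameRecorder job = new NameRecorder();
        new Thread(job, "worker").run();
        return job.executedBy;
    }

    // Calls start(), so the job executes on the new thread named "worker".
    static String viaStart() {
        NameRecorder job = new NameRecorder();
        Thread t = new Thread(job, "worker");
        t.start();
        try {
            t.join(); // wait for the worker to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return job.executedBy;
    }

    public static void main(String[] args) {
        System.out.println("run():   executed by " + viaRun());
        System.out.println("start(): executed by " + viaStart());
    }
}
```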

>a Runnable object can be reused in multiple threads to run
>but a Thread object cannot be used multiple times to start threads; an attempt to start() the thread more than once throws IllegalThreadStateException, and that's why the start() method itself is synchronized
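Both points can be checked with a short sketch: one Runnable handed to several Thread objects, and a second start() on the same Thread object. StartTwiceDemo and its helpers are illustrative names; AtomicInteger is used only as a safe counter and is not part of the notes above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: a Runnable can be reused, a Thread object cannot be restarted.
class StartTwiceDemo {
    // Runs one Runnable on several threads and returns how many times it ran.
    static int runOnThreads(int threadCount) {
        final AtomicInteger runs = new AtomicInteger();
        Runnable job = new Runnable() {
            public void run() {
                runs.incrementAndGet();
            }
        };
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(job); // same job, new worker each time
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return runs.get();
    }

    // Returns true if a second start() on the same Thread object is rejected.
    static boolean secondStartFails() {
        Thread t = new Thread(new Runnable() {
            public void run() { }
        });
        t.start();
        try {
            t.start(); // not allowed, whether or not the first run has finished
            return false;
        } catch (IllegalThreadStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("job ran " + runOnThreads(3) + " times");
        System.out.println("second start() rejected: " + secondStartFails());
    }
}
```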

>after run() method completes, the thread is considered dead
>isAlive() returns true if the thread has been started and has not yet died; it is false both before start() and after run() completes
>getState() returns enum constants(Thread.State) corresponding to the state of the thread

Thread.State is a static nested enum
NEW - not yet started
RUNNABLE - executing in the JVM (it may still be waiting for the OS to give it a processor)
BLOCKED - waiting for a monitor lock
WAITING - waiting indefinitely for another thread to perform a particular action
TIMED_WAITING - waiting for another thread to perform an action, up to a specified time
TERMINATED - thread has exited

>before calling start(), the thread is in State.NEW state, after the call it moves to State.RUNNABLE state
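The state transitions can be observed with getState(). This is a small sketch with a made-up class name (StateDemo); it only looks at the two states that can be observed reliably, NEW before start() and TERMINATED after join().

```java
// Sketch: observe Thread.State before start() and after the thread has finished.
class StateDemo {
    // Returns { state before start(), state after join() }.
    static Thread.State[] observe() {
        Thread t = new Thread(new Runnable() {
            public void run() { } // nothing to do, finishes at once
        });
        Thread.State before = t.getState(); // not started yet
        t.start();
        try {
            t.join(); // returns once t is no longer alive
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State after = t.getState();
        return new Thread.State[] { before, after };
    }

    public static void main(String[] args) {
        Thread.State[] states = observe();
        System.out.println("before start(): " + states[0]);
        System.out.println("after join():   " + states[1]);
    }
}
```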

>some methods are static in Thread class (e.g. sleep() and yield()), because they always act on the currently executing thread; you cannot use them to make some other thread sleep or yield
>start(), setPriority() and similar instance methods do not act on the currently running thread; they operate on the Thread object they are called on, typically one that is being set up to start as a new thread of execution
>join() method is also NOT static; t.join() causes the currently executing thread to pause until thread t finishes
>so a static method of Thread always affects the thread that calls it, while instance methods such as join() and interrupt() are called on the Thread object of another thread
>why does sleep() throw InterruptedException? sleep() lets a thread sleep for a specified amount of time; the thread stays at the Thread.sleep() statement until the time expires or another thread calls interrupt() on it; if it is interrupted, sleep() throws InterruptedException and in the catch block you can do whatever you want, knowing that the thread has woken up prematurely; if nobody interrupts, control simply moves to the next statement when the sleep time is over, without any exception
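A hedged sketch of a premature wake-up: one thread sleeps, another calls interrupt() on it. SleepInterruptDemo is an illustrative name, and the polling loop on getState() is just one simple way to make sure the sleeper is already inside sleep() before it is interrupted.

```java
// Sketch: interrupt() wakes a sleeping thread early with InterruptedException.
class SleepInterruptDemo {
    // Starts a sleeper, interrupts it, and reports whether it woke up early.
    static boolean interruptedEarly() {
        final boolean[] wokenEarly = { false };
        Thread sleeper = new Thread(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(60000); // planned nap: one minute
                } catch (InterruptedException e) {
                    wokenEarly[0] = true; // woken before the time expired
                }
            }
        });
        sleeper.start();
        // A sleeping thread reports TIMED_WAITING; wait until it gets there.
        while (sleeper.getState() != Thread.State.TIMED_WAITING) {
            Thread.yield();
        }
        sleeper.interrupt(); // instance method, called on the other thread's object
        try {
            sleeper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return wokenEarly[0];
    }

    public static void main(String[] args) {
        System.out.println("woken early: " + interruptedEarly());
    }
}
```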

java.lang.Object class methods related to threads
-------------------------------------------------
public final void wait() throws InterruptedException
public final void notify()
public final void notifyAll()

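#only wait() declares a checked exception (InterruptedException); all three throw the unchecked IllegalMonitorStateException when the calling thread does not hold the object's lock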
#all three methods are final methods; we can't override these blokes

>t.join() method doesn't pause the currently running thread if t is already dead(TERMINATED) or not yet started(NEW)
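A short sketch of join(): the caller pauses until the worker is done, and join() on a thread that was never started returns immediately. JoinDemo and its helper methods are names invented for this example.

```java
// Sketch: join() waits for a live thread and returns at once for a NEW one.
class JoinDemo {
    // Lets a worker compute a value, waits for it, then reads the result.
    static int computeOnWorker() {
        final int[] result = new int[1];
        Thread worker = new Thread(new Runnable() {
            public void run() {
                result[0] = 6 * 7;
            }
        });
        worker.start();
        joinQuietly(worker); // the caller pauses here until worker has finished
        return result[0];
    }

    // Calls join() on a thread that was never started and returns its state.
    static Thread.State joinBeforeStart() {
        Thread neverStarted = new Thread(new Runnable() {
            public void run() { }
        });
        joinQuietly(neverStarted); // does not block: the thread is not alive
        return neverStarted.getState();
    }

    static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("worker result: " + computeOnWorker());
        System.out.println("state after join() on unstarted thread: " + joinBeforeStart());
    }
}
```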

>to avoid data inconsistency while concurrent threads are modifying the same data, we can either synchronize the methods that modify the data or put a synchronized block around the critical lines of code
>synchronizing an instance method uses the object on which the method is called as the lock; a synchronized block allows any object to be used as the lock

public synchronized void modify() {
    // critical code; the lock is the object on which modify() is invoked
}

private final Object lock = new Object(); // created once, shared by all callers

public void modify() {
    // creating the lock here instead (Object lock = new Object();) would give every
    // call its own lock, so all threads could enter the block at the same time
    synchronized (lock) { // the lock is the arbitrary object 'lock'
        // critical code goes here
    }
}

#when an arbitrary 'lock' object is used in synchronized blocks, make sure that all competing threads see the same lock instance; so the lock should usually be a (preferably final) instance variable if all threads share the one object containing modify(), or a static variable if different threads hold different objects of the class containing modify()

#the idea: threads protecting the same data must never use different lock objects; the lock for a given piece of data should be unique
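As a rough illustration of the "one shared lock" rule, here is a counter guarded by a single final lock object and incremented from several threads. SharedLockCounter and countWith() are hypothetical names for this sketch.

```java
// Sketch: one final lock object per counter, shared by every thread that uses it.
class SharedLockCounter {
    private final Object lock = new Object();
    private int count;

    void increment() {
        synchronized (lock) {
            count++; // read-modify-write: the critical section
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Increments one counter from several threads and returns the final value.
    static int countWith(int threads, final int incrementsEach) {
        final SharedLockCounter counter = new SharedLockCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int n = 0; n < incrementsEach; n++) {
                        counter.increment();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println("final count: " + countWith(4, 10000));
    }
}
```

If increment() created its own lock object on every call, the block would no longer exclude anyone and the final count could come out lower than expected.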

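>of the thread-related methods, only wait() gives up the lock; sleep(), yield(), join() and notify() keep whatever locks the thread holds (after notify(), the lock is released only when the notifying thread leaves the synchronized code)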
>usually instance data modified by instance methods, static data is modified by static methods; by synchronizing those methods we can protect instance data(lock on that instance) and static data(lock on Class object of that class)
>but what if static data is modified by an instance method? even if the instance method is synchronized, it will not protect the data, because any number of threads can modify the same static data through different objects (each of which acts as its own lock); in this situation, wrapping the code in a synchronized block that uses the Class object of the class as the lock should fix the problem

>same problem arises when a static method modifies a non-static data through one of the objects;
>so to keep things simple and safe, always access static fields through static methods, non-static fields through non-static method and mark all those methods as synchronized
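A sketch of the fix for the "static data modified by an instance method" case: the instance method locks on the Class object, which is the same lock that a static synchronized method uses. InstanceCounter and its methods are illustrative names only.

```java
// Sketch: protect static data touched by an instance method with the Class lock.
class InstanceCounter {
    private static int registered; // static data shared by every instance

    // Instance method touching static data: lock on the Class object, not on
    // 'this', so that all instances compete for the same lock.
    void register() {
        synchronized (InstanceCounter.class) {
            registered++;
        }
    }

    // A static synchronized method also locks on InstanceCounter.class.
    static synchronized int registered() {
        return registered;
    }

    // Each thread uses its OWN instance; returns how many registrations were added.
    static int registerFromThreads(int threads, final int callsEach) {
        int before = registered();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    InstanceCounter mine = new InstanceCounter();
                    for (int n = 0; n < callsEach; n++) {
                        mine.register();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return registered() - before;
    }

    public static void main(String[] args) {
        System.out.println("registrations: " + registerFromThreads(4, 10000));
    }
}
```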

>thread-safe classes are nothing but classes which protect their internal data by synchronizing their sensitive methods; a client class that uses a thread-safe class (like Vector or the list returned by Collections.synchronizedList()) can still run into data inconsistency problems, because the thread-safe class has only synchronized its individual methods; the client might make a sequence of calls to those synchronized methods that together form one atomic operation, that is, they have to be done together or not at all; when the client method containing the sequence is not synchronized, multiple threads may execute the supposedly atomic operation simultaneously, and two such sequences running at the same time can mess each other up

>so identify the atomic operations in your code and synchronize each whole operation yourself, even if the underlying data is thread-safe by itself; in such cases the thread-safety is redundant, which might hurt the performance of your application; if possible remove the redundant thread-safety and always put the thread-safety at the higher level, yourself
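A minimal sketch of such a compound operation: "add only if absent" on a synchronized list. Each of contains() and add() is synchronized on its own, but the pair needs one lock around it. UniqueNames and its methods are invented for this example; locking on the list returned by Collections.synchronizedList() is the client-side locking that its documentation describes for iteration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch: make a check-then-act sequence on a thread-safe list atomic.
class UniqueNames {
    private final List<String> names =
            Collections.synchronizedList(new ArrayList<String>());

    // contains() and add() must happen together or not at all.
    boolean addIfAbsent(String name) {
        synchronized (names) { // the wrapper list uses itself as its lock
            if (names.contains(name)) {
                return false;
            }
            names.add(name);
            return true;
        }
    }

    int size() {
        return names.size();
    }

    // Several threads race to add the same name; returns the resulting list size.
    static int raceToAdd(int threads, final String name) {
        final UniqueNames registry = new UniqueNames();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    registry.addIfAbsent(name);
                }
            });
            workers[i].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return registry.size();
    }

    public static void main(String[] args) {
        System.out.println("entries after race: " + raceToAdd(8, "alice"));
    }
}
```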

>deadlock: two threads each waiting for the lock the other one holds; to avoid it, always acquire locks in a predetermined order
>press ctrl+break (Windows) in the command prompt where the java app is running to get a thread dump of a deadlocked application (on Unix-like systems ctrl+\ or kill -3 <pid> does the same, and the JDK's jstack <pid> tool can be used as well); the thread dump detects any existing deadlocks and displays them; one can also see the monitor id each thread is holding and the monitor id it is waiting for; there can be two or more threads involved in a circular wait causing a deadlock
>a symptom of a deadlock is an unexpectedly long delay in certain synchronized tasks without apparent cause or Exceptions in the system
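One common way to get a predetermined lock order is to sort the locks by some stable key. The sketch below assumes every account has a unique id and always locks the smaller id first; OrderedTransfer, Account and the method names are made up for illustration.

```java
// Sketch: avoid deadlock by always locking the account with the smaller id first.
class OrderedTransfer {
    static class Account {
        final int id; // assumed unique per account
        int balance;

        Account(int id, int balance) {
            this.id = id;
            this.balance = balance;
        }
    }

    static void transfer(Account from, Account to, int amount) {
        // The lock order depends only on the ids, never on the transfer direction.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Two threads transfer in opposite directions; returns the combined balance.
    static int totalAfterOpposingTransfers(final int rounds) {
        final Account a = new Account(1, 1000);
        final Account b = new Account(2, 1000);
        Thread t1 = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < rounds; i++) transfer(a, b, 1);
            }
        });
        Thread t2 = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < rounds; i++) transfer(b, a, 1);
            }
        });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return a.balance + b.balance;
    }

    public static void main(String[] args) {
        System.out.println("total: " + totalAfterOpposingTransfers(10000));
    }
}
```

If transfer() simply locked 'from' and then 'to', the two threads above could each grab one account and wait forever for the other.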

Thread Dump
-----------
>displays all user threads and daemon threads in JVM
>displays thread names, thread ids and their states
>displays locks they are holding
>displays stack information for deadlocked threads
>displays Heap details(spaces like new generation, eden space, from space, to space, tenured generation, compacting perm gen, etc)
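Part of this information can also be collected from inside the program. The sketch below is not the JVM's own thread dump; it is a rough imitation built on Thread.getAllStackTraces() and ThreadMXBean.findMonitorDeadlockedThreads(), with MiniThreadDump as a made-up class name. It leaves out heap details and held locks.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.Map;

// Sketch: a very small, programmatic imitation of a thread dump.
class MiniThreadDump {
    // Lists every live thread: name, id, daemon flag, state and stack frames.
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(" id=").append(t.getId())
               .append(t.isDaemon() ? " daemon " : " ")
               .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    // Ids of threads deadlocked on object monitors, or null when there are none.
    static long[] deadlockedThreadIds() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        return bean.findMonitorDeadlockedThreads();
    }

    public static void main(String[] args) {
        System.out.print(dump());
        long[] ids = deadlockedThreadIds();
        System.out.println("deadlocked threads: " + (ids == null ? 0 : ids.length));
    }
}
```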

Notes from Java Language Specification on Threads
====================================
>in java, synchronization is implemented using monitors
>every object in java is associated with a monitor
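>only one thread at a time can hold the lock on an object's monitor (the same thread may lock it more than once)
>using volatile variables is a lighter form of synchronization: it guarantees that a read sees the latest write, but it gives no mutual exclusion, so compound operations such as count++ are still not atomic
>every object, in addition to having a monitor, also has an associated 'wait set'; the wait set is the set of threads that have called wait() on that object and are waiting to be notified; when the object is first created, the wait set is empty;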
>adding/removing threads from the wait set are atomic operations
>wait sets are manipulated solely through wait(), notify() and notifyAll() methods
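>Thread.sleep and Thread.yield have no synchronization semantics: the compiler does not have to flush cached writes to shared memory before the call, nor reload cached values after it; that is, a thread may keep using the same copy of its variables when it comes back from sleep or from the runnable pool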

For example, in the following (broken) code fragment, assume that this.done is a nonvolatile boolean field:

while (!this.done)
    Thread.sleep(1000);

The compiler is free to read the field this.done just once, and reuse the cached value in each execution of the loop. This would mean that the loop would never terminate, even if another thread changed the value of this.done.
---X---
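The usual repair for the fragment above is to declare the flag volatile, so that every read of it sees the latest write. A minimal sketch, with StoppableWorker and its methods as hypothetical names:

```java
// Sketch: a volatile flag makes the stop request visible to the looping thread.
class StoppableWorker implements Runnable {
    private volatile boolean done; // without volatile the loop might never end

    public void run() {
        while (!done) {
            Thread.yield(); // stand-in for real work
        }
    }

    void stop() {
        done = true;
    }

    // Starts a worker, asks it to stop, and reports whether it really stopped.
    static boolean stopsWhenAsked() {
        StoppableWorker worker = new StoppableWorker();
        Thread t = new Thread(worker);
        t.start();
        worker.stop();
        try {
            t.join(5000); // give up after five seconds instead of hanging
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsWhenAsked());
    }
}
```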

>if a thread calls obj.wait() and nobody calls notify()/notifyAll() on obj afterwards (for example because the notify was sent before the call to wait()), the thread can wait forever, even if no other thread is using the lock of obj

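>a thread comes out of wait() when some other thread calls notify() or notifyAll() on the object, when it is interrupted, when the timeout of a timed wait expires, or (rarely) through a spurious wake-up; in every case it must reacquire the lock before it continues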

>a thread that calls wait() on an object whose wait set is empty (no other thread is waiting on it) still goes into the waiting state

>so before a thread gives up the lock by calling wait(), it should check that waiting is really necessary (and that a notification is sure to come in the future) or use a timed wait(); and when it wakes up from wait() it should recheck the condition before continuing, because the notify might have been meant for a different condition, or the thread may have woken up spontaneously (this occurs in some situations)

>in short, check the condition to wait before you start waiting, and check the condition to continue before you continue; in practice this means calling wait() inside a loop that tests the condition
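The "check before waiting, recheck after waking" advice is usually written as a while loop around wait(). A minimal sketch, assuming a one-slot mailbox where null means "empty"; OneSlotMailbox and roundTrip() are made-up names.

```java
// Sketch: wait() inside a loop that rechecks the condition after every wake-up.
class OneSlotMailbox {
    private String message; // null means "empty"

    synchronized void put(String m) {
        message = m;
        notifyAll(); // wake any thread parked in take()
    }

    synchronized String take() throws InterruptedException {
        while (message == null) { // condition to wait, rechecked after each wake-up
            wait();               // releases the lock on 'this' while waiting
        }
        String m = message;
        message = null;
        return m;
    }

    // Sends one message from the calling thread to a consumer thread and back.
    static String roundTrip(String text) {
        final OneSlotMailbox box = new OneSlotMailbox();
        final String[] received = new String[1];
        Thread consumer = new Thread(new Runnable() {
            public void run() {
                try {
                    received[0] = box.take();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        consumer.start();
        box.put(text);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println("received: " + roundTrip("hello"));
    }
}
```

Because take() tests the condition before it waits, it does not matter whether put() happens before or after the consumer reaches wait(); a notification sent too early is not lost.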

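>calling wait() outside synchronized code does not cause a compilation error, but at runtime it throws IllegalMonitorStateException because the calling thread does not own the object's lock; the same applies to notify() and notifyAll()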
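This is easy to confirm with a few lines. The sketch below calls wait() and notify() without holding the lock and catches the resulting exception; WaitWithoutLock is an illustrative name.

```java
// Sketch: wait() and notify() without the lock compile fine but fail at runtime.
class WaitWithoutLock {
    static boolean waitOutsideSyncFails() {
        Object obj = new Object();
        try {
            obj.wait(10); // the caller does not own obj's monitor
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    static boolean notifyOutsideSyncFails() {
        Object obj = new Object();
        try {
            obj.notify(); // same rule as for wait()
            return false;
        } catch (IllegalMonitorStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("wait() rejected:   " + waitOutsideSyncFails());
        System.out.println("notify() rejected: " + notifyOutsideSyncFails());
    }
}
```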

>synchronization is a tricky business; the synchronized modifier is not inherited, so a subclass that overrides a synchronized method may leave out synchronized in its own implementation and open the gates of hell; the parent version of the method, however, is still synchronized, so a thread that reaches it through super.method() on that subclass object must still acquire the lock
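A small sketch of this behaviour, using Thread.holdsLock() to record whether the lock is held inside each version of the method. GuardedBase, UnguardedChild and OverrideSyncDemo are names invented for the example.

```java
// Sketch: 'synchronized' is not inherited by an overriding method.
class GuardedBase {
    boolean lockHeldInBase;

    synchronized void update() {
        lockHeldInBase = Thread.holdsLock(this); // runs with the lock held
    }
}

class UnguardedChild extends GuardedBase {
    boolean lockHeldInOverride;

    @Override
    void update() { // no 'synchronized' here
        lockHeldInOverride = Thread.holdsLock(this); // runs without the lock
        super.update(); // the parent version still acquires the lock on 'this'
    }
}

class OverrideSyncDemo {
    public static void main(String[] args) {
        UnguardedChild child = new UnguardedChild();
        child.update();
        System.out.println("lock held in override: " + child.lockHeldInOverride);
        System.out.println("lock held in parent:   " + child.lockHeldInBase);
    }
}
```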

Reference: Kathy Sierra's SCJP Book
