Synchronization (computer science)

In computer science, synchronization refers to one of two distinct but related concepts: synchronization of processes, and synchronization of data. Process synchronization refers to the idea that multiple processes are to join up or handshake at a certain point, in order to reach an agreement or commit to a certain sequence of action. Data synchronization refers to the idea of keeping multiple copies of a dataset in coherence with one another, or to maintain data integrity. Process synchronization primitives are commonly used to implement data synchronization.

Thread or process synchronization

Figure 1: Three processes accessing a shared resource (critical section) simultaneously.

Thread synchronization is defined as a mechanism which ensures that two or more concurrent processes or threads do not simultaneously execute some particular program segment known as critical section. When one thread starts executing the critical section (serialized segment of the program) the other thread should wait until the first thread finishes. If proper synchronization techniques[1] are not applied, it may cause a race condition where, the values of variables may be unpredictable and vary depending on the timings of context switches of the processes or threads.

For example, suppose that there are three processes namely, 1, 2 and 3. All three of them are concurrently executing and then need to share a common resource (critical section) as shown in Figure 1. Synchronization should be used here to avoid any conflicts for accessing this shared resource. Hence, when Process 1 and 2 both try to access that resource it should be assigned to only one process at a time. If it is assigned to Process 1, the other process (Process 2) needs to wait until Process 1 frees that resource (as shown in Figure 2). [2]

Figure 2: A process accessing a shared resource if available, based on some synchronization technique.

Another synchronization requirement which needs to be considered is the order in which particular processes or threads should be executed. For example, we cannot board a plane until we buy the required ticket. Similarly, we cannot check emails before validating our credentials (i.e., user name and password). In the same way, an ATM will not provide any service until we provide it with a correct PIN.[3]

Other than mutual exclusion, synchronization also deals with the following:

Processes access to critical section is controlled by using synchronization techniques. This may apply to a number of domains.

Classic problems of synchronization

The following are some classic problems of synchronization:

These problems are used to test nearly every newly proposed synchronization scheme or primitive.

Hardware synchronization

Many systems provide hardware support for critical section code.

A single processor or uniprocessor system could disable interrupts by executing currently running code without preemption, which is very inefficient on multiprocessor systems.[4] "The key ability we require to implement synchronization in a multiprocessor is a set of hardware primitives with the ability to atomically read and modify a memory location. Without such a capability, the cost of building basic synchronization primitives will be too high and will increase as the processor count increases. There are a number of alternative formulations of the basic hardware primitives, all of which provide the ability to atomically read and modify a location, together with some way to tell if the read and write were performed atomically. These hardware primitives are the basic building blocks that are used to build a wide variety of user-level synchronization operations, including things such as locks and barriers. In general, architects do not expect users to employ the basic hardware primitives, but instead expect that the primitives will be used by system programmers to build a synchronization library, a process that is often complex and tricky."[5] Many modern hardware provides special atomic hardware instructions by either test-and-set the memory word or compare-and-swap contents of two memory words.

Synchronization strategies in programming languages

In Java, to prevent thread interference and memory consistency errors, blocks of code are wrapped into synchronized (lock_object) sections.[2] This forces any thread to acquire the said lock object before it can execute the block. The lock is automatically released when thread leaves the block or enter the waiting state within the block. Any variable updates, made by the thread in synchronized block, become visible to other threads whenever those other threads similarly acquires the lock.

In addition to mutual exclusion and memory consistency, Java synchronized blocks enable signaling, sending events from those threads, which have acquired the lock and execute the code block to those which are waiting for the lock within the block. This means that Java synchronized sections combine functionality of mutexes and events. Such primitive is known as synchronization monitor.

Any object is fine to be used as a lock/monitor in Java. The declaring object is implicitly implied as lock object when the whole method is marked with synchronized.

The .NET framework has synchronization primitives. "Synchronization is designed to be cooperative, demanding that every thread or process follow the synchronization mechanism before accessing protected resources (critical section) for consistent results." In .NET, locking, signaling, lightweight synchronization types, spinwait and interlocked operations are some of mechanisms related to synchronization.[6]

Synchronization examples

Following are some synchronization examples with respect to different platforms.[7]

Synchronization in Windows

Windows provides:

Synchronization in Linux

Linux provides:

Enabling and disabling of kernel preemption replaced spinlocks on uniprocessor systems. Prior to kernel version 2.6, Linux disabled interrupt to implement short critical sections. Since version 2.6 and later, Linux is fully preemptive.

Synchronization in Solaris

Solaris provides:

Pthreads synchronization

Pthreads is a platform-independent API that provides:

Data synchronization

Main article: Data synchronization
Figure 3: Changes from both server and client(s) are synchronized.

A distinctly different (but related) concept is that of data synchronization. This refers to the need to keep multiple copies of a set of data coherent with one another or to maintain data integrity, Figure 3. For example, database replication is used to keep multiple copies of data synchronized with database servers that store data in different locations.

Examples include:

Challenges in data synchronization

Some of the challenges which user may face in data synchronization:[8]

Data formats complexity

When we start doing something, the data we have usually is in a very simple format. It varies with time as the organization grows and evolves and results not only in building a simple interface between the two applications (source and target), but also in a need to transform the data while passing them to the target application. ETL (extraction transformation loading) tools can be very helpful at this stage for managing data format complexities.

Real-timeliness

This is an era of real-time systems. Customers want to see the current status of their order in e-shop, the current status of a parcel delivery—a real time parcel tracking—, the current balance on their account, etc. This shows the need of a real-time system, which is being updated as well to enable smooth manufacturing process in real-time, e.g., ordering material when enterprise is running out stock, synchronizing customer orders with manufacturing process, etc. From real life, there exist so many examples where real-time processing gives successful and competitive advantage.

Data security

There are no fixed rules and policies to enforce data security. It may vary depending on the system which you are using. Even though the security is maintained correctly in the source system which captures the data, the security and information access privileges must be enforced on the target systems as well to prevent any potential misuse of the information. This is a serious issue and particularly when it comes for handling secret, confidential and personal information. So because of the sensitivity and confidentiality, data transfer and all in-between information must be encrypted.

Data quality

Data quality is another serious constraint. For better management and to maintain good quality of data, the common practice is to store the data at one location and share with different people and different systems and/or applications from different locations. It helps in preventing inconsistencies in the data.

Performance

There are five different phases involved in the data synchronization process:

Each of these steps is very critical. In case of large amounts of data, the synchronization process needs to be carefully planned and executed to avoid any negative impact on performance.

Mathematical foundations

Synchronization was originally a process-based concept whereby a lock could be obtained on an object. Its primary usage was in databases. There are two types of (file) lock; read-only and read–write. Read-only locks may be obtained by many processes or threads. Readers–writer locks are exclusive, as they may only be used by a single process/thread at a time.

Although locks were derived for file databases, data is also shared in memory between processes and threads. Sometimes more than one object (or file) is locked at a time. If they are not locked simultaneously they can overlap, causing a deadlock exception.

Java and Ada only have exclusive locks because they are thread based and rely on the compare-and-swap processor instruction.

An abstract mathematical foundation for synchronization primitives is given by the history monoid. There are also many higher-level theoretical devices, such as process calculi and Petri nets, which can be built on top of the history monoid.

See also

References

  1. Gramoli, V. (2015). More than you ever wanted to know about synchronization: Synchrobench, measuring the impact of the synchronization on concurrent algorithms (PDF). Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. ACM. pp. 1–10.
  2. 1 2 Janssen, Cory. "Thread Synchronization". Techopedia. Retrieved 23 November 2014.
  3. Fatheisian, Halleh; Rosenberger, Eric. "Synchronization". Department of Computer Science, George Mason University. Retrieved 23 November 2014.
  4. Silberschatz, Abraham; Gagne, Greg; Galvin, Peter Baer (July 11, 2008). "Chapter 6: Process Synchronization". Operating System Concepts (Eighth ed.). John Wiley & Sons. ISBN 978-0-470-12872-5.
  5. Hennessy, John L.; Patterson, David A. (September 30, 2011). "Chapter 5: Thread-Level Parallelism". Computer Architecture: A Quantitative Approach (Fifth ed.). Morgan Kaufmann. ISBN 978-0-123-83872-8.
  6. "Synchronization Primitives in .NET framework". MSDN, The Microsoft Developer Network. Microsoft. Retrieved 23 November 2014.
  7. Silberschatz, Abraham; Gagne, Greg; Galvin, Peter Baer (December 7, 2012). "Chapter 5: Process Synchronization". Operating System Concepts (Ninth ed.). John Wiley & Sons. ISBN 978-1-118-06333-0.
  8. "Data Synchronization". Javlin Inc. Retrieved 23 November 2014.

External links

This article is issued from Wikipedia - version of the Saturday, April 30, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.