Introduction
A distributed database is one in which the storage devices are not all attached to a common processor. The data may be stored on several computers at one location or dispersed over a network of interconnected computers. Unlike a parallel system, in which the processors are tightly coupled and form a single database system, a distributed database system consists of loosely coupled sites that share no physical components. System administrators can therefore distribute collections of data across many physical locations (Rahimi, 2010). The diagram below (Figure 1) is an example of a distributed database system in which the sites communicate over a shared communication channel and each site has its own memory and its own database.
Figure 1
Two processes keep a distributed database up to date: replication and duplication. Replication uses specialized software that looks for changes across the distributed databases and, once the changes have been identified, applies them so that all the databases look the same. The process can be complex and time-consuming, and it can demand considerable computing resources, depending on the size and number of the distributed databases. Duplication identifies one database as the master and then copies it to the other locations. The copy is made at a set time so that every distributed location holds the same data. In this process, users may change only the master database, so that local data is not overwritten (Chhanda, 2009).
There are many distributed database design technologies, such as local autonomy and synchronous and asynchronous systems. Which of these technologies is applied depends mostly on the needs of the business and the sensitivity of the information in the database (Ozsu, 2011). The choice is also strongly influenced by how much the business is willing and able to spend on establishing and enhancing data security, consistency, and integrity.
Goals of a Distributed Database System
A distributed database is built with the goal of improving reliability, availability, and performance. Reliability improves because, if one system fails for some time, another system can complete the task. Because the system keeps working even when a server fails, it remains available to serve client requests. Performance improves because the database is distributed over different locations, so that data can be placed close to the users who need it. With the data available at every location, the system is also easier to maintain (Ozsu, 2011).
Types of Distributed Database Management Systems
Homogeneous Distributed Database Management System
A homogeneous distributed database system consists of a network of two or more databases that run the same type of database management system software and can be stored on one or more machines. In this kind of system, data can be accessed and modified simultaneously on the various databases in the network. All the software and hardware running the database instances can be presented through a single interface, creating the impression of a single database. All the sites run identical software and work together in processing user requests. Every site surrenders part of its autonomy with respect to changing the software (Kedar, 2010).
To the end user, a homogeneous database management system appears as a single system. A homogeneous system is easier to design and to manage. The operating systems used at each location must be the same or compatible, and so must the data structures. A homogeneous distributed database management system also requires the database applications used at the various locations to be the same or compatible.
Figure 2
Heterogeneous Distributed Database System
Heterogeneous distributed database systems consist of a network of two or more databases that run different types of database management software and can be stored on one or more machines. Generic connectivity such as ODBC and JDBC makes the data accessible across the several databases in the network. In a heterogeneous distributed database, different sites are free to use different software, and this difference is a primary problem for both query processing and transaction processing. The sites may not be aware of each other, and so they may provide only limited facilities for the cooperation that transaction processing needs. Heterogeneous systems allow different nodes to have different hardware, software, and data structures, and the locations may even be incompatible with one another. Different computers, operating systems, database applications, and data models can therefore be used at different locations.
One location could have the latest relational database management technology while another stores data in conventional files or in an old version of a database management system (Singh, 2011). Similarly, one location could run the Windows operating system while another runs UNIX. Heterogeneous systems are commonly used when individual sites already operate their own independent hardware and software. Translations are needed in a heterogeneous system to allow the different sites to communicate. Users need to be able to make requests in the database language of their local site, which is usually SQL. When only the hardware differs, the translation is straightforward and consists of changing computer codes and word lengths. A fully heterogeneous system is in many cases not economically or technically feasible, and a user at one location may be able to read the data at another location but not update it.
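The translation layer described above can be sketched in Python. This is only an illustration under stated assumptions: the `CustomerSource` interface, the two site classes, and the sample data are hypothetical, SQLite stands in for a relational DBMS, and a small delimited string stands in for a conventional file. It does not show ODBC or JDBC themselves, only the idea of one request format served by different storage technologies.

```python
import csv
import io
import sqlite3
from typing import Optional, Protocol, Sequence


class CustomerSource(Protocol):
    """Common interface that every site adapter exposes (hypothetical)."""

    def find_customer(self, customer_id: int) -> Optional[dict]: ...


class RelationalSite:
    """Site that keeps its data in a relational DBMS (SQLite stands in here)."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
        self.conn.execute("INSERT INTO customer VALUES (1, 'Ada')")

    def find_customer(self, customer_id: int) -> Optional[dict]:
        row = self.conn.execute(
            "SELECT id, name FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        return {"id": row[0], "name": row[1]} if row else None


class FlatFileSite:
    """Site that keeps its data in a conventional delimited file."""

    def __init__(self, text: str) -> None:
        self.text = text

    def find_customer(self, customer_id: int) -> Optional[dict]:
        # Translate the request into a scan over the file's records.
        for record in csv.DictReader(io.StringIO(self.text)):
            if int(record["id"]) == customer_id:
                return {"id": int(record["id"]), "name": record["name"]}
        return None


def lookup(sites: Sequence[CustomerSource], customer_id: int) -> Optional[dict]:
    """Ask each site in turn through the shared interface."""
    for site in sites:
        found = site.find_customer(customer_id)
        if found is not None:
            return found
    return None


sites = [RelationalSite(), FlatFileSite("id,name\n2,Grace\n")]
print(lookup(sites, 1))
print(lookup(sites, 2))
```

The caller never needs to know which storage technology a site uses; each adapter translates the common request into whatever its local store understands.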
Figure 3
Distributed Database Management System Architecture
A distributed database system makes it possible for applications to access data from both the local and remote databases (Prabhakaran, 2015) . There are various architectures in distributed database management systems.
Client/Server Architecture
The client-server architecture consists of several clients and a few servers connected by a network. A client sends its query to one of the servers, and the first available server processes the query and sends back a reply (Umar, 2016). The client-server architecture is simple to implement because the servers are centralized.
Figure 4
Collaborating Server Architecture
The collaborating server architecture is designed so that a single query can run on multiple servers. The servers break the query into smaller queries, execute them, and combine the partial results into the result that is sent to the client. The architecture consists of a collection of database servers, each of which is capable of executing the current transactions across the various databases (Umar, 2016).
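As a minimal sketch of this idea, the following Python example splits one logical query (the total balance across all accounts) into the same sub-query on each server and merges the partial results. The `account` table, the sample rows, and the helper names are assumptions made for illustration; in-memory SQLite databases stand in for the separate servers.

```python
import sqlite3


def make_site(rows):
    """Create an in-memory SQLite database standing in for one server."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (branch TEXT, balance REAL)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", rows)
    return conn


def total_balance(sites):
    """Run the same sub-query on every site and merge the partial sums."""
    partials = []
    for conn in sites:
        # COALESCE makes an empty fragment contribute 0 instead of NULL.
        (partial,) = conn.execute(
            "SELECT COALESCE(SUM(balance), 0) FROM account"
        ).fetchone()
        partials.append(partial)
    return sum(partials)


sites = [
    make_site([("north", 100.0), ("north", 50.0)]),
    make_site([("south", 25.0)]),
]
print(total_balance(sites))
```

A sum decomposes cleanly because partial sums can simply be added. Other operations, such as averages or joins across sites, need more careful merging than this sketch shows.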
Figure 5
Middleware Architecture
The middleware architecture is also designed so that a single query can be executed on multiple servers. It requires only one server that is capable of managing queries and transactions that span multiple servers. The local servers handle the local transactions and queries (Singh, 2011). A layer of software, known as middleware, coordinates the execution of queries and transactions across the independent database servers.
Data Replication in Distributed Systems
Data replication is the process of storing copies of data at multiple locations, on different computers or servers (Helal, 2012). It is done to increase the availability of the data and to speed up query evaluation.
Types of Data Replication
Synchronous Replication
In synchronous replication, the replica is modified as soon as a change is made to the relation. As a result, there is no difference between the original data and the replica (Ozsu, 2011).
Asynchronous Replication
In asynchronous replication, the replica is modified only after the change has been committed on the original database, so the replica may briefly lag behind the original.
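The difference between the two modes can be illustrated with a small in-memory sketch. The `Primary` and `Replica` classes and the explicit `flush` step are hypothetical simplifications: a real system would ship changes over a network, and an asynchronous one would normally do so in the background.

```python
from collections import deque


class Replica:
    """Holds a copy of the data at another site."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    """Original copy that propagates its writes to the replicas."""

    def __init__(self, replicas, synchronous):
        self.data = {}
        self.replicas = replicas
        self.synchronous = synchronous
        self.pending = deque()  # committed changes not yet shipped

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: every replica is updated before write() returns.
            for replica in self.replicas:
                replica.apply(key, value)
        else:
            # Asynchronous: the change is queued and shipped later.
            self.pending.append((key, value))

    def flush(self):
        """Ship all queued changes to the replicas, oldest first."""
        while self.pending:
            key, value = self.pending.popleft()
            for replica in self.replicas:
                replica.apply(key, value)


sync_replica = Replica()
Primary([sync_replica], synchronous=True).write("x", 1)

async_replica = Replica()
async_primary = Primary([async_replica], synchronous=False)
async_primary.write("x", 1)
stale = dict(async_replica.data)  # the replica has not seen the write yet
async_primary.flush()
```

In the asynchronous case the replica is out of date between `write` and `flush`. That window is the price paid for not making the writer wait for every site.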
Replication Schemes
There are three commonly used replication schemes: full replication, no replication, and partial replication.
Full Replication
In the full replication scheme, a complete copy of the database is stored at every location in the communication network (Helal, 2012). It offers high availability because the data remains accessible as long as at least one site is running. It also allows queries to execute faster, since they can be answered from a local copy.
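A rough calculation shows why more copies mean higher availability for reads. The sketch assumes that sites fail independently and that each site is up with the same probability. Both are simplifying assumptions, and the function name and sample figures are illustrative only.

```python
def availability(site_availability: float, copies: int) -> float:
    """Probability that at least one of `copies` independent sites is up."""
    if not 0.0 <= site_availability <= 1.0:
        raise ValueError("site_availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The data is unavailable only if every copy is down at the same time.
    return 1.0 - (1.0 - site_availability) ** copies


for copies in (1, 2, 3):
    print(copies, round(availability(0.9, copies), 6))
```

Under these assumptions, three copies on sites that are each up 90% of the time make the data readable about 99.9% of the time. Updates are a different matter, since every copy has to be kept consistent.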
Figure 6
No Replication
Having no replication means that each fragment is stored at exactly one location. Because there is only one copy of each fragment, the overhead of concurrency control is minimized and recovery is easier.
Figure 7
Partial Replication
Partial replication means that only some fragments of the database are replicated. The number of replicas created for each fragment depends on the importance of the data in that fragment (Helal, 2012).
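The three schemes can be told apart by looking at where each fragment is stored. The sketch below uses a hypothetical catalog that maps fragment names to the sites holding a copy. The fragment names, site names, and the `replication_scheme` helper are all assumptions made for illustration.

```python
# Hypothetical catalog: fragment name -> sites that hold a copy.
all_sites = ["london", "nairobi", "tokyo"]
catalog = {
    "accounts": ["london", "nairobi", "tokyo"],  # copied to every site
    "staff_london": ["london"],                  # stored at one site only
    "rates": ["london", "tokyo"],                # copied to some sites
}


def replication_scheme(catalog, all_sites):
    """Classify a placement as 'full', 'none', or 'partial' replication."""
    counts = [len(set(sites)) for sites in catalog.values()]
    if all(count == len(all_sites) for count in counts):
        return "full"  # every fragment is stored at every site
    if all(count == 1 for count in counts):
        return "none"  # every fragment is stored at exactly one site
    return "partial"   # anything in between


print(replication_scheme(catalog, all_sites))
```

In a partially replicated design, the catalog is where the importance of each fragment is expressed: more important fragments are simply listed at more sites.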
Figure 8
Advantages and Disadvantages of Distributed Database Management System
Advantages of Distributed Database Management Systems
Distributed database management is advantageous based on several factors.
Reflects Organizational Structure
In most organizations, operations are naturally distributed over several locations. A bank, for example, has many offices in different cities, so the databases used by its applications can be distributed over these locations (Group, 2013). A bank may keep a database at every branch office with details such as the names of the staff who work at that location and the account information of its customers. The staff in a branch office can then make local inquiries for the information they need.
Improved Shareability and Local Autonomy
The distribution of the data reflects the geographical distribution of the organization, and users at one site can access data stored at other sites. Data is placed close to the users who normally work with it, which gives them local control of the data and lets them establish and enforce local policies regarding its use. A global database administrator (DBA) is responsible for the entire system (Sumathi, 2014). Part of this responsibility is delegated to the local level, so that a local DBA can manage the local DBMS.
Improved availability, performance, and reliability
In a centralized database management system, a computer failure terminates the applications that depend on it, but in a distributed system a different server can pick up the task and complete it. The failure of a node or a communication link does not necessarily make the data inaccessible, because the information is distributed across different sites. Because the data is located close to the site with the most users, and given the inherent parallelism of distributed database management systems, access can be faster and more efficient than access to a remote centralized database. Because each site handles only a portion of the information, there is also less contention for the CPU than in a centralized DBMS (Rahimi, 2010).
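The failover behavior described above can be sketched as a read that tries the nearest copy first and moves on when a site is unreachable. The `Site` class, the `SiteDown` exception, and the sample data are hypothetical. A real system would detect failures through network timeouts rather than a flag.

```python
class SiteDown(Exception):
    """Raised when a site cannot be reached (hypothetical error type)."""


class Site:
    def __init__(self, name, data, up=True):
        self.name = name
        self.data = data
        self.up = up

    def read(self, key):
        if not self.up:
            raise SiteDown(self.name)
        return self.data[key]


def read_with_failover(sites, key):
    """Return (site name, value) from the first reachable site in the list."""
    for site in sites:
        try:
            return site.name, site.read(key)
        except SiteDown:
            continue  # this copy is unreachable, try the next one
    raise SiteDown("no site holding the data is reachable")


# Sites are listed nearest first; the local copy is down in this example.
sites = [
    Site("local", {"balance": 120}, up=False),
    Site("remote", {"balance": 120}),
]
print(read_with_failover(sites, "balance"))
```

The sketch only works because the data exists at more than one site. With a single copy, the failure of the local site would make the data inaccessible.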
Economics and Modular growth
It is cheaper to build a system of smaller computers than to buy a single large computer of equivalent power. This arrangement enables corporate divisions and departments to obtain their own computers. It also costs less to add workstations to a network than to upgrade a mainframe system. A distributed environment is easier to expand, since new sites can be added without interfering with the operations of the existing sites (Rahimi, 2010). This flexibility makes it possible for an organization to grow relatively easily.
Disadvantages of Distributed Database Management Systems
Complexity
A distributed DBMS that hides its distributed nature from the user and provides an acceptable level of performance, reliability, and availability is inherently more complicated than a centralized DBMS (Kasbe, 2016). The fact that data can be replicated adds a further level of complexity to the distributed DBMS.
Cost
Increased complexity means that the procurement and maintenance costs of a DDBMS can be expected to be higher than those of a centralized DBMS. A DDBMS also requires additional hardware to establish a network between sites. Using this network incurs ongoing communication costs, and there are additional labor costs for maintaining the local DBMSs and the underlying network.
Security
Access to data is easily controlled in a centralized system. In a distributed DBMS, access to replicated data has to be controlled at multiple locations, and the network itself has to be made secure. Networks have traditionally been regarded as an insecure communication medium, so significant measures are needed to make them more secure (Ozsu, 2011).
Lack of standards
Although distributed DBMSs depend on effective communication, standard communication and data access protocols are only now beginning to appear. This lack of standards has limited the potential of distributed DBMSs and is a significant deterrent to their adoption (Gupta, 2012).
Conclusion
Distributed data can reside on network servers or on decentralized computers on the internet, on private corporate networks or extranets, or on other organizational networks. Distributed databases store data on many devices; they also improve performance at end-user worksites by allowing transactions to be processed on several computers. Distributed database management offers an efficient way of synchronizing data periodically and ensuring that updates performed at one location are reflected in all the stored copies.
References
Chhanda, R. (2009). Distributed Database Systems. Delhi: Dorling Kindersley.
Group, I. (2013). Introduction to Database Management Systems. New Delhi: Tata McGraw-Hill.
Gupta, S. (2012). Introduction to Database Management System. New Delhi: University Science Press.
Helal, A. (2012). Replication Techniques in Distributed Systems. Boston: Kluwer Academic Publishers.
Kasbe, T. (2016). DBMS Concepts. New York: Sage.
Kedar, S. (2010). Database Management Systems. Chicago: Technical Publications.
Ozsu, T. (2011). Principles of Distributed Database Systems. New York: Springer.
Prabhakaran, B. (2015). Multimedia Database Management Systems. Boston: Springer.
Rahimi, S. (2010). Distributed Database Management Systems: A Practical Approach. New Jersey: Wiley.
Singh, K. (2011). Database Systems: Concepts, Design and Applications. Delhi: Dorling Kindersley.
Sumathi, S. (2014). Fundamentals of Relational Database Management Systems. New York: Springer.
Umar, A. (2016). Distributed Database Management Systems Issues and Approaches. Chicago: Sage.