Distributed Database System Architecture in DBMS

What is a distributed database system? What are the reasons for building distributed systems, and what are the implementation issues with them?

Today, I'm going to clear up all your queries related to distributed database systems, so without further ado, let's begin.


What is a Distributed Database System?

In a distributed database system, the database is stored on several computers. The computers in a distributed system communicate with one another through various communication media, such as high-speed private networks or the Internet.

They do not share main memory or disks. The computers in a distributed system may vary in size and performance, ranging from workstations up to mainframe systems.

The computers in a distributed system are referred to by a number of different names, such as sites or nodes, depending on the context in which they are mentioned.

We mainly use the term site, to emphasize the physical distribution of these systems. The general structure of a distributed system is shown in the figure below.

The main differences between shared-nothing parallel databases and distributed databases are that distributed databases are typically geographically separated, are separately administered, and have a slower interconnection.

Another major difference is that, in a distributed database system, we differentiate between local and global transactions. A local transaction is one that accesses data only from the site where the transaction was initiated.

A global transaction, on the other hand, is one that either accesses data in a site different from the one at which the transaction was initiated, or accesses data in several different sites.


Reasons for Building Distributed Database Systems

There are several reasons for building distributed database systems, including sharing of data, autonomy, and availability.
Figure: A distributed system


Sharing data

The major advantage in building a distributed database system is the provision of an environment where users at one site may be able to access the data residing at other sites.
For instance, in a distributed university system, where each campus stores data related to that campus, it is possible for a user in one campus to access data in another campus.

Without this capability, the transfer of student records from one campus to another would have to resort to some external mechanism that would couple existing systems.

Autonomy

The primary advantage of sharing data by means of data distribution is that each site is able to retain a degree of control over data that are stored locally.
In a centralized system, the database administrator of the central site controls the database. In a distributed system, there is a global database administrator responsible for the entire system.

A part of these responsibilities is delegated to the local database administrator for each site. Depending on the design of the distributed database system, each administrator may have a different degree of local autonomy. The possibility of local autonomy is often a major advantage of distributed databases.

Availability

If one site fails in a distributed system, the remaining sites may be able to continue operating. In particular, if data items are replicated in several sites, a transaction needing a particular data item may find that item in any of several sites.

Thus, the failure of a site does not necessarily imply the shutdown of the system. The failure of one site must be detected by the system, and appropriate action may be needed to recover from the failure.

The system must no longer use the services of the failed site. Finally, when the failed site recovers or is repaired, mechanisms must be available to integrate it smoothly back into the system.
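The idea of finding a replicated data item at any surviving site can be sketched as follows. This is only a minimal illustration under stated assumptions: the site names, the in-memory dictionaries standing in for sites, and the `read_item` helper are all hypothetical, and failure detection is reduced to a set of site names passed in by the caller.

```python
from typing import Dict, Optional, Set

# Hypothetical model: each site is a dictionary of data items it stores.
# Here the item "A-177" is replicated at all three sites.
SITES: Dict[str, Dict[str, int]] = {
    "site_1": {"A-177": 500},
    "site_2": {"A-177": 500},
    "site_3": {"A-177": 500},
}


def read_item(item: str, failed_sites: Set[str]) -> Optional[int]:
    """Return the item from the first live site that holds a replica."""
    for name, data in SITES.items():
        if name in failed_sites:
            continue  # the system must not use the services of a failed site
        if item in data:
            return data[item]
    return None  # no live replica could be found


# One failed site does not stop the read; losing every replica does.
print(read_item("A-177", failed_sites={"site_1"}))
print(read_item("A-177", failed_sites={"site_1", "site_2", "site_3"}))
```

A real system would also have to detect failures itself and bring a recovered site's replica up to date before using it again, which this sketch leaves out.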

Although recovery from failure is more complex in distributed systems than in centralized systems, the ability of most of the system to continue operating despite the failure of one site results in increased availability.
Availability is crucial for database systems used for real-time applications. Loss of access to data by, for example, an airline may result in the loss of potential ticket buyers to competitors.

 An Example of a Distributed Database 

Consider a banking system consisting of four branches in four different cities. Each branch has its own computer, with a database of all the accounts maintained at that branch.

Each such installation is thus a site. There also exists one single site that maintains information about all the branches of the bank.

To illustrate the difference between the two types of transactions, local and global, consider a transaction to add $50 to account number A-177 located at the Valleyview branch.

If the transaction was initiated at the Valleyview branch, then it is considered local; otherwise, it is considered global.

A transaction to transfer $50 from account A-177 to account A-305, which is located at the Hillside branch, is a global transaction, since accounts in two different sites are accessed as a result of its execution.
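The local versus global distinction in this example can be expressed as a small check. The sketch below is illustrative only: the `ACCOUNT_SITE` table and the `classify` function are hypothetical names, and the only data used is the account placement stated in the example.

```python
from typing import Dict, Iterable

# Account placement taken from the banking example; the table itself
# is a hypothetical stand-in for the system's knowledge of data location.
ACCOUNT_SITE: Dict[str, str] = {
    "A-177": "Valleyview",
    "A-305": "Hillside",
}


def classify(initiating_site: str, accounts: Iterable[str]) -> str:
    """Classify a transaction by the sites whose data it accesses."""
    accessed_sites = {ACCOUNT_SITE[account] for account in accounts}
    # Local: every accessed item lives at the site that started the transaction.
    if accessed_sites <= {initiating_site}:
        return "local"
    return "global"


print(classify("Valleyview", ["A-177"]))           # local
print(classify("Hillside", ["A-177"]))             # global
print(classify("Valleyview", ["A-177", "A-305"]))  # global
```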

In an ideal distributed database system, the sites would share a common global schema (although some relations may be stored only at some sites), all sites would run the same distributed database-management software, and the sites would be aware of each other's existence.

If a distributed database is built from scratch, it would indeed be possible to achieve the above goals. In reality, however, a distributed database often has to be constructed by linking together multiple already-existing database systems, each with its own schema and possibly running different database-management software.

Such systems are sometimes called multidatabase systems or heterogeneous distributed database systems.

Implementation Issues  

Atomicity of transactions is an important issue in building a distributed database system. If a transaction runs across two sites, unless the system designers are careful, it may commit at one site and abort at another, leading to an inconsistent state.

Transaction commit protocols ensure that such a situation cannot arise. The two-phase commit protocol (2PC) is the most widely used of these protocols.

The basic idea behind 2PC is for each site to execute the transaction until it enters the partially committed state, and then leave the commit decision to a single coordinator site; the transaction is said to be in the ready state at a site at this point.

The coordinator decides to commit the transaction only if the transaction reaches the ready state at every site where it executed; otherwise (for example, if the transaction aborts at any site), the coordinator decides to abort the transaction.

Every site where the transaction executed must follow the decision of the coordinator. If a site fails when a transaction is in the ready state, then when the site recovers from failure it should be in a position to either commit or abort the transaction, depending on the decision of the coordinator.
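The decision rule described above can be sketched in a few lines. This is a simplified, single-process illustration, not a full protocol implementation: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical, and logging, timeouts, message passing, and recovery after a site failure are all omitted.

```python
from enum import Enum
from typing import List, Optional


class State(Enum):
    READY = "ready"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Participant:
    """A site that executed part of the transaction (hypothetical model)."""

    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit  # assumed outcome of local execution
        self.state: Optional[State] = None

    def prepare(self) -> bool:
        """Phase 1: reach the ready state, or abort locally, and vote."""
        self.state = State.READY if self.can_commit else State.ABORTED
        return self.can_commit

    def finish(self, commit: bool) -> None:
        """Phase 2: follow the coordinator's decision."""
        self.state = State.COMMITTED if commit else State.ABORTED


def two_phase_commit(participants: List[Participant]) -> bool:
    """Coordinator logic: commit only if every site reached the ready state."""
    votes = [site.prepare() for site in participants]  # ask every site
    decision = all(votes)
    for site in participants:
        site.finish(decision)
    return decision


sites = [Participant("Valleyview", True), Participant("Hillside", False)]
print(two_phase_commit(sites))         # False: one site could not get ready
print([s.state.value for s in sites])  # every site ends up aborted
```

Because every site applies the same decision, the transaction never commits at one site while aborting at another, which is the inconsistency the protocol is meant to prevent.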

Concurrency control is another issue in a distributed database. Since a transaction may access data items at several sites, transaction managers at several sites may need to coordinate to implement concurrency control.

If locking is used, locking can be performed locally at the sites containing the accessed data items, but there is also a possibility of deadlock involving transactions originating at multiple sites.

Therefore, deadlock detection needs to be carried out across multiple sites. Failures are also more common in distributed systems, since not only may computers fail, but communication links may also fail.
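To see why deadlock detection has to span sites, consider one possible approach (an assumption here, not something this post prescribes): each site keeps a local wait-for graph, and some site merges them and looks for a cycle. The function names and the graph representation below are hypothetical.

```python
from typing import Dict, List, Set

# A wait-for graph maps a transaction to the transactions it is waiting on.
WaitFor = Dict[str, Set[str]]


def merge_graphs(local_graphs: List[WaitFor]) -> WaitFor:
    """Union the local wait-for graphs of all sites into a global graph."""
    merged: WaitFor = {}
    for graph in local_graphs:
        for txn, waits_on in graph.items():
            merged.setdefault(txn, set()).update(waits_on)
    return merged


def has_deadlock(graph: WaitFor) -> bool:
    """Return True if the wait-for graph contains a cycle (depth-first search)."""
    visiting: Set[str] = set()  # transactions on the current DFS path
    done: Set[str] = set()      # transactions fully explored, no cycle found

    def dfs(txn: str) -> bool:
        if txn in visiting:
            return True  # reached a transaction already on the path: cycle
        if txn in done:
            return False
        visiting.add(txn)
        for nxt in graph.get(txn, ()):
            if dfs(nxt):
                return True
        visiting.discard(txn)
        done.add(txn)
        return False

    return any(dfs(txn) for txn in list(graph))


# T1 waits for T2 at one site, T2 waits for T1 at another.
site_a: WaitFor = {"T1": {"T2"}}
site_b: WaitFor = {"T2": {"T1"}}
print(has_deadlock(site_a), has_deadlock(site_b))  # False False
print(has_deadlock(merge_graphs([site_a, site_b])))  # True
```

Neither site sees a cycle on its own; the deadlock only becomes visible once the local graphs are combined.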

Replication of data items, which is the key to the continued functioning of distributed databases when failures occur, further complicates concurrency control.

The standard transaction models, based on multiple actions carried out by a single program unit, are often inappropriate for carrying out tasks that cross the boundaries of databases that cannot or will not cooperate to implement protocols such as 2PC.

Alternative approaches, based on persistent messaging for communication, are generally used for such tasks; persistent messaging will be discussed in another post.

When the tasks to be carried out are complex, involving multiple databases and/or multiple interactions with humans, coordination of the tasks and ensuring transaction properties for the tasks become more complicated.

Workflow management systems are systems designed to help with carrying out such tasks, and will be discussed later.

When an organization has to choose between a distributed architecture and a centralized architecture for implementing an application, the system architect must balance the advantages against the disadvantages of distribution of data.

We have already seen the advantages of using distributed databases. The primary disadvantage of distributed database systems is the added complexity required to ensure proper coordination among the sites.

This increased complexity takes various forms:

Software-Development Cost. It is more difficult to implement a distributed database system; thus, it is more costly.

Greater Potential for Bugs. Since the sites that constitute the distributed system operate in parallel, it is harder to ensure the correctness of algorithms, especially operation during failures of part of the system, and recovery from failures. The potential exists for extremely subtle bugs.

Increased Processing Overhead. The exchange of messages and the additional computation required to achieve intersite coordination are a form of overhead that does not arise in centralized systems.

There are several approaches to distributed database design, ranging from fully distributed designs to ones that include a large degree of centralization.


















