Friday 21 October 2011

SQL Server Replication Concepts and Trouble shooting

What is Replication :
Replication is a set of technologies for copying and distributing data and database objects from one database to another and then synchronizing between databases for consistency.
 
Benefits Of Replication : 
 
v Common benefit of SQL Server replication is the availability of data when and where it is needed.
vAllowing multiple sites to keep copies of the same data.
vSeparating the OLTP (Production)  and OLAP (Reporting) systems.
vIncreasing aggregate read performance
Replication is one of the High availability choices. 
 
When to use Replication:
  
 
Distribute copies of data on a scheduled basis.
Distribute data changes to other servers.
Allow multiple users and sites to make changes then merge the data modifications together, potentially identifying and resolving conflicts. (Merge)
 
Copy and distribute data to one or more sites.
 
Replication Types :
vSnapshot Replication
vTransactional Replication
vUpdatable Subscriptions for Transactional Replication
vMerge Replication
vPeer-Peer Replication (Not in SQL Server 2000)

Terminologies before getting started:
Publisher is a server that makes the data available for subscription to other servers. (Source server)
Subscriber is a server that receives and maintains the published data. (Destination Server)
Distributor is the server that manages the flow of data through the replication system.
Agents are the processes that are responsible for copying and distributing data between publisher and subscriber.
An article can be any database object, like Tables (Column filtered or Row filtered), Views, Indexed views, Stored Procedures, and User defined functions.
Publication is a collection of articles.

snapshot Replication:
 
Snapshot replication distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data. Snapshot replication is best used as a method for replicating data that changes infrequently and when the size of replicated data is not very large.
     In the following situations we can use :
vData is mostly static and does not change often. When it does change, it makes more sense to publish an entirely new copy to Subscribers.
vIt is acceptable to have copies of data that are out of date for a period of time.
vReplicating small volumes of data in which an entire refresh of the data is reasonable.
  

How Snapshot Replication works :
 
 
 
Transactional Replication :
 
With transactional replication, an initial snapshot of data is applied at Subscribers, and then when data modifications are made at the Publisher, the individual transactions are captured and propagated to Subscribers.
   
   In the following situations we can use :
vIf you want incremental changes to be propagated to Subscribers as they occur.
vSubscribers are reliably and/or frequently connected to the Publisher. 
 
How Transactional Replication works :
  
 
 
How Transactional replication works :
 
 
Updatable Subscriptions for Transactional Replication:
 
When data is updated at a Subscriber, it is first propagated to the Publisher and then propagated to other Subscribers.
Immediate updating. The Publisher and Subscriber must be connected to update data at the Subscriber.
Queued updating The Publisher and Subscriber do not have to be connected to update data at the Subscriber. Updates can be made while the Subscriber or Publisher is offline.
  
Merge Replication:
 
 Merge replication is the process of distributing data from Publisher to Subscribers, allowing the Publisher and Subscribers to make updates while connected or disconnected, and then merging the updates between sites when they are connected.
 
In the following situations we can use :
vMultiple Subscribers need to update data at various times and propagate those changes to the Publisher and to other Subscribers.
vSubscribers need to receive data, make changes offline, and later synchronize changes with the Publisher and other Subscribers
Merge replication is commonly used by laptop users and others who can not be constantly connected to the publisher.
Merge replication is mainly for Server-to-Client environment.  This allows subscribers to make modification on replicated data and send them back to publisher.
 
How Merge Replication works :
 
Replication Agents :
ØSnapshot Agent prepares snapshot files containing schema and data of published tables and database objects, stores the files in the snapshot folder, and records synchronization jobs in the distribution database.
ØLog Reader Agent moves transactions marked for replication from the transaction log on the Publisher to the distribution database.
ØMerge Agent applies initial snapshots held in the database tables of the Publisher to the Subscribers. It also merges data changes made by the Publisher or Subscribers after the initial snapshot is created.
ØDistribution Agent moves the snapshot (for snapshot replication and transactional replication) and the transactions held in the distribution database tables (for transactional replication) to the destination tables at the Subscribers.
ØThe Queue Reader agent runs with the transactional replication. It runs at the Distributor database and transfers the changes from the Subscribers to the Publisher database. Only one instance of this agent exists to serve all the Publishers.
  
Transactional Replication System Tables:
 
System Table
Description
MSarticles
Contains a row for each replicated article
MScached_peer_lsns
Used in peer-to-peer replication; keeps track of the LSN in the transaction log for the commands that need to be sent to the Subscriber server
MSdistribution_agents
Stores a row for each Distribution Agent running on the local Distributor server
MSdistribution_history
Keeps a record of the history data for the Distribution Agent running on the local Distributor server
MSlogreader_agents
Keeps a record for each row of the Log Reader Agent running on the local Distributor server
MSlogreader_history
Keeps a record of the history of the Log Reader Agent running on the local Distributor server
MSpublication_access
Contains a row for each SQL Server login that has access to the publication of a Publisher server
MSpublications
Stores information for each publication that is replicated by any Publisher server
MSpublicationthresholds
Monitors the replication performance metrics with each row containing the threshold value
MSpublisher_databases
Contains a row for each database on the Publisher server that is
serviced by the local Distributor server
MSqreader_agents
Keeps a record for each row of the Queue Reader Agent running
on the local Distributor server
MSqreader_history
Keeps a record of the history of the Queue Reader Agent running
on the local Distributor server

 
Peer-to-peer replication :

ØWith the peer-to-peer replication topology in SQL Server 2005, each node acts as a publisher and as a subscriber. Replication recognizes when changes have occurred on a given node but only allows those changes to cycle through the nodes one at a time.
ØIf one database is down (A, for example), the other databases (B and C) can still replicate. Whenever that database (A) comes back up, it can synchronize with the others (B and C) and get the changes that took place after it went down. This is possible because the databases (A, B, and C) all act as both publisher and subscriber
ØPeer-to-peer transactional replication has added a new dimension to SQL Server database replication, giving DBAs more flexibility. As a new feature, it isn't yet widely used. But it will be.
Here are some guidelines for selecting it:
Requires high availability (with other types, if publisher goes down, replication topology is down too).
Less number of nodes (would be better to have 10 or less).
 
 
Structure of Peer-Peer :
 
 
Replication Monitor:
 
GOALS
ØEnsure that replication is running and transactions are being delivered to subscribers.
ØVerify that replication performance is acceptable. This entails delivering transactions within a reasonable time frame sothat data on the subscriber(s) is consistent with data on the publisher.
ØIdentify and troubleshoot any errors or replication performance issues.
OPTIONS
ØShow 25 worst-performing subscriptions
ØShow 50 worst-performing subscriptions
ØShow errors and warnings only
ØShow errors only
ØShow warnings only
ØShow subscriptions running (not stopped due to error)
ØShow subscriptions not running
 
 
 
Troubleshooting
 
Concepts
Data Not being sent to Subscriber
The table is filtered, and there are no changes to deliver to a given Subscriber.
One or more agents are not running or are failing with an error.
A transactional subscription was initialized without a snapshot, and changes have occurred on the Publisher since the publication was created.
Replication of stored procedure execution for a transactional publication produces different results at the Subscriber.
The INSERT stored procedure used by a transactional article includes a condition that is not met.
Data is deleted by a user, a replication script, or another application.
Data is deleted by a trigger, or a trigger includes a ROLLBACK statement.
Data Mismatch at the Publisher and Subscriber
Data is updated at a Subscriber that should be treated as read-only. The subscription database should be treated as read-only unless you are using merge replication, transactional replication with updatable subscriptions, or peer-to-peer transactional replication.
Triggers are used at the Subscriber. Triggers can modify data at the Subscriber and also prevent the data from being updated if the trigger issues a ROLLBACK.
Scripts are executed by replication at the Subscriber but not at the Publisher.
Replication of stored procedure execution for a transactional publication produces different results at the Subscriber.
Constraint violations or other issues prevent rows from being inserted, updated, or deleted at the Subscriber.
TOOLS USED : TABLEDIFF