I'm having a problem with distributed transactions.
I'm using SQL Server 2008 R2, Windows 7, .Net 4.0.
Here's what I actually want to do:
- I have a buffer database (named A)
- I have another database (named B)
- I want to send data from database A to database B through a WCF webservice (SOAP over HTTP, A and B not on the same machine)
- If the data is successfully sent to database B, data is removed from database A
Everything is part of a transaction, so the whole operation is atomic.
Here's what I'm currently trying to do sequentially (everything in a transaction):
- I read a chunk from database A (say 50 rows)
- I update rows read in step 1 (I actually set the boolean
Sent
column of rows with the value true
, so in the future I know that these rows are sent)
- I consume (client side) a WCF webservice (SOAP 1.2,
wsHttpBinding
) that is part of the transaction flow (blocking synchronous call). The webservice request sends the data read in step 1
- The webservice implementation (server side) inserts data in database B (data that is contained in the webservice request)
- if no exception is received from the server, data with the
Sent
value as true
are removed from database A
- if a specific
FaultException
is received from the server, data with the Sent
value as true are removed from database A
- for other
FaultExceptions
and other exceptions (endpoint not found, or anything else), data with the Sent
value as true
are not removed from database A
Note: there are actually multiple buffer databases (A1, A2, A3..., AN) and multiple target databases (B1,.... BN). There are N threads to deal with N databases.
If I run my server and my client, everything is working just fine. Data is "transferred" atomically per chunks from database A to database B.
When I brutally stop my client in the middle of the transaction (process killed), most of the time everything is fine (i.e. data is not removed from database A nor added to database B).
But sometimes (this is my actual problem), my database B becomes locked. The only option to unlock it is to terminate the server process.
When the problem occurs, I can see in MSDTC that there is an active transaction (note that in the screenshot there are 3 transactions as I'm currently dealing with 3 buffer databases and 3 target databases, but sometimes there is only 1 transaction even if there are 3 DB).
I see that database B is locked when trying to run a SELECT
againt database B. As you can see in the picture below, my SELECT
request is blocked by session ID 67 which corresponds to an INSERT
in database B by the server process.
The lock remains forever (no transaction timeout, even after 1+ hour) until the server process is terminated. I can't validate or cancel the transaction in MSDTC, I get a warning saying that "it cannot be aborted because it is not "In Doubt"
".
Why does the database B remained locked? If the client is terminated, shouldn't the transaction fail and the lock on database B be released after some timeout?
Here's my code for server side:
// Service interface
[ServiceContract]
public interface IService
{
[OperationContract]
[FaultContract(typeof(MyClass))]
[TransactionFlow(TransactionFlowOption.Allowed)]
void SendData(DataClass data);
}
// Service implementation
[ServiceBehavior()]
public partial class Service : IService
{
[OperationBehavior(TransactionScopeRequired = true)]
public void SendData(DataClass data)
{
if (data == null)
{
throw new FaultException<MyClass>(new MyClass());
}
try
{
// Inserts data in database B
using (DBContextManagement ctx = new DBContextManagement())
{
// Inserts data using Entity Framework
// This will add some entities to the context
// then call context.SaveChanges()
ctx.InsertData(data);
}
}
catch (Exception ex)
{
throw new FaultException<MyClass>(new MyClass());
}
}
}
Here's my server side configuration (self-hosted WCF service):
<system.serviceModel>
<services>
<service name="XXXX.MyService">
<endpoint binding="wsHttpBinding" bindingConfiguration="WsHttpBinding_IMyService" contract="XXXX.IMyService" />
</service>
</services>
<bindings>
<wsHttpBinding>
<binding name="WsHttpBinding_IMyService" transactionFlow="true" allowCookies="true" >
<readerQuotas maxDepth="32" maxArrayLength="2147483647" maxStringContentLength="2147483647" />
<security mode="None" />
</binding>
</wsHttpBinding>
</bindings>
<behaviors>
<serviceBehaviors>
<behavior name="">
<serviceTimeouts transactionTimeout="00:00:20" />
<serviceMetadata httpGetEnabled="true" />
<serviceDebug includeExceptionDetailInFaults="true" />
<dataContractSerializer maxItemsInObjectGraph="2147483647" />
</behavior>
</serviceBehaviors>
</behaviors>
<serviceHostingEnvironment multipleSiteBindingsEnabled="true" />
</system.serviceModel>
Here's my client code:
try
{
using (DBContextManagement ctx = new DBContextManagement())
{
using (TransactionScope ts = new TransactionScope(TransactionScopeOption.Required, new TransactionOptions { IsolationLevel = System.Transactions.IsolationLevel.ReadUncommitted, Timeout = new TimeSpan(0, 0, 30) }, EnterpriseServicesInteropOption.None))
{
// First of all, retrieves data from database A
// Internally queries the database A through Entity Framework
var data = ctx.GetData();
// Mark data as sent to the server
// Internally updates the database A through Entity Framework
// This actually set the Sent property as true, then call
// context.SaveChanges()
ctx.SetDataAsSent(data);
try
{
// Send data to the server
MyServiceClient proxy = new MyServiceClient();
MyServiceClient.SendData(data);
// If we're here, then data has successfully been sent
// This internally removes sent data (i.e. data with
// property Sent as true) from database A through entity framework
// (entities removed then context.SaveChanges)
ctx.RemoveSentData();
}
catch (FaultException<MyClass> soapError)
{
// SOAP exception received
// We internally remove sent data (i.e. data with
// property Sent as true) from database A through entity framework
// (entities removed then context.SaveChanges)
ctx.RemoveSentData();
}
catch (Exception ex)
{
// Some logging here
return;
}
ts.Complete();
}
}
}
catch (Exception ex)
{
// Some logging here (removed from code)
throw;
}
Here's my client configuration:
<system.serviceModel>
<bindings>
<wsHttpBinding>
<binding name="WsHttpBinding_IMyService"
closeTimeout="00:01:00"
openTimeout="00:01:00"
receiveTimeout="00:10:00"
sendTimeout="00:01:00"
allowCookies="false"
bypassProxyOnLocal="false"
hostNameComparisonMode="StrongWildcard"
transactionFlow="true"
maxBufferPoolSize="2147483647"
maxReceivedMessageSize="2147483647"
messageEncoding="Text"
textEncoding="utf-8"
useDefaultWebProxy="true">
<readerQuotas maxDepth="32"
maxStringContentLength="2147483647"
maxArrayLength="16384"
maxBytesPerRead="4096"
maxNameTableCharCount="16384" />
<security mode="None">
<transport clientCredentialType="None" proxyCredentialType="None" realm="" />
<message clientCredentialType="UserName" algorithmSuite="Default" />
</security>
</binding>
</wsHttpBinding>
</bindings>
<client>
<endpoint address="http://localhost:8080/MyService.svc"
binding="wsHttpBinding"
bindingConfiguration="WsHttpBinding_IMyService"
contract="XXX.IMyService"
name="WsHttpBinding_IMyService" />
</client>
</system.serviceModel>
So my questions are:
- Is my pattern/design with distributed transactions a valid and robust design?
- What could cause my database B to be locked for infinite time if my client's transaction is brutally terminated?
- What should I change (config and/or code and/or design) to make it work as expected (i.e. if my client process dies/crashes, I want the transaction to be aborted so the database B is not locked)?
Thanks.
EDIT
I do think I over-simplified my use case. It looks like I'm actually doing a simple data replication between multiple SQL Server instances. That's not well explained (my bad), that's not what I'm trying to achieve.
I'm not simply duplicating data. I read from A on machine M1 then write to B on machine M2, but what's written is not what I've read but some calculated value that comes from what was read.
I'm pretty sure SQL Server replication services could handle business computation for me, but I can't go this way for some reasons:
- I can't change the webservices thing because the interface is now defined and fixed.
- I can't use SQL Server replication because I'm actually not responsible of the server side (which writes to database B). I'm even not sure that there will be SQL Server on the other side (could be a Java backoffice with MySQL, PostgreSQL or anything)
- I can't use SQL Server service broker or any message-oriented-middleware (which would fit IMO) for same reason (potentially heterogeneous databases and environments)
I'm stuck by WCF, and I can't even change the binding configuration to use MSMQ or whatever because of interoperability requirements. MSMQ is great for sure (I'm already using it in another part of the project), but it's Windows only. SOAP 1.2 is a standard protocol, and SOAP 1.2 transactions are standard too (WS-Atomic
implementation).
Maybe the WCF transaction thing is not a good idea, actually.
If I have understood correctly how it actually works (please correct me if I'm wrong), it simply allows to "continue" the transaction scope on server side, which will require a transaction coordinator to be configured on server side, which probably breaks my interoperability needs (again, there could be a database on server side that does not integrate well with the transaction coordinator).