A bad day with the Vault server

If you are having a problem using Vault, post a message here.

Moderator: SourceGear

Post Reply
Hans Olav Nymand
Posts: 55
Joined: Wed Sep 29, 2004 8:09 am
Location: Denmark, Copenhagen
Contact:

A bad day with the Vault server

Post by Hans Olav Nymand » Fri Aug 24, 2018 6:18 am

Hi,

Today wasn’t a very good Vault day ?

The sequence of events was as follows

From the sgvault.log:
----24-08-2018 09:06:22 Read_Only--robot208.cdp.yy(17.110.69.146)--SSL Enabled System.Web.HttpException (0x800704CD): The remote host closed the connection. The error code is 0x800704CD.
----Loads of similar messages in between…
----24-08-2018 09:06:50 Read_Only--robot208.cdp.yy(17.110.69.146)--SSL Enabled Uncaught Exception: The remote host closed the connection. The error code is 0x800704CD.

From the sgvault.log:
----24-08-2018 09:16:51 Asbjørn_Hansen--asha-1524.cdp.yy(17.110.68.114)--SSL Enabled System.Web.HttpException (0x80070040): The remote host closed the connection. The error code is 0x80070040.
----Loads of similar messages in between…
----24-08-2018 09:33:53 Asbjørn_Hansen--asha-1524.cdp.yy(17.110.68.114)--SSL Enabled Uncaught Exception: The remote host closed the connection. The error code is 0x800704CD.

Approx. 12:00 large branch of $/Rel/09.05 to $/WebRel/2.10.46.109 from a label started.

From the Applicatin event log at the Vault server (the server hosting the IIS hosting the Vault service):
12:21-12:30, 10 occurences
Message:
Event code: 3001 Event message: The request has been aborted. Event time: 24-08-2018 12:20:38 Event time (UTC): 24-08-2018 10:20:38 Event ID: ab6c6db96e1d48f2a38563e3afc0cf56
Event sequence: 29499 Event occurrence: 1 Event detail code: 0 Application information: Application domain: /LM/W3SVC/1/ROOT/VaultNotifyService-1-131795336469850135
Trust level: Full Application Virtual Path: /VaultNotifyService Application Path: C:\inetpub\wwwroot\VaultService\VaultNotifyService\
Machine name: SVRVAULT2 Process information: Process ID: 8672 Process name: w3wp.exe Account name: PDC\sa sgvault Exception information:
Exception type: HttpException Exception message: Request timed out. Request information: Request URL: https://vault.cdp.yy:443/VaultNotifySer ... rvice.asmx
Request path: /VaultNotifyService/VaultNotifyService.asmx User host address: 17.110.68.107 User: Is authenticated: False Authentication Type: Thread account name: PDC\sa sgvault
Thread information: Thread ID: 213 Thread account name: PDC\sa sgvault Is impersonating: False Stack trace: Custom event details:
Log Name: Application
Source: ASP.NET 4.0.30319.0 Logged: 08/24/2018 12:20:38
Event Id: 1309 Task Category: Web Event
Level: Warning Keywords: Classic
OP Code: Computer: svrVault2.cdp.yy

Around 12:45 people start reporting that they have connection problems, and indeed the Vault server seems to be completely unresponsive

From the sgvault.log:
----24-08-2018 13:02:51 mikkel_sorensen--mpbs-1593.cdp.yy(17.110.68.93)--SSL Enabled Vault Failed to establish a connection to the database.
----A smaller number of similar messages in between
----24-08-2018 13:06:17 Peter_Viuf--pdc-1666.cdp.yy(17.110.69.86)--SSL Enabled Vault Failed to establish a connection to the database.
----More similar messages in between

----24-08-2018 13:06:38 Hans_Dybkjær--pdc-1656.cdp.yy(17.110.68.42)--SSL Enabled Rolling Back a transaction at VaultServiceSQL.VaultSqlConn.RollbackTransaction(Boolean bShowStacktrace)
----24-08-2018 13:06:58 Hans_Dybkjær--pdc-1656.cdp.yy(17.110.68.42)--SSL Enabled System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (0x80004005): The wait operation timed out
----24-08-2018 13:06:58 Hans_Dybkjær--pdc-1656.cdp.yy(17.110.68.42)--SSL Enabled Rolling Back a transaction at VaultServiceSQL.VaultSqlConn.RollbackTransaction(Boolean bShowStacktrace)
----24-08-2018 13:06:58 Hans_Dybkjær--pdc-1656.cdp.yy(17.110.68.42)--SSL Enabled System.InvalidOperationException: This SqlTransaction has completed; it is no longer usable.
----24-08-2018 13:06:58 Hans_Dybkjær--pdc-1656.cdp.yy(17.110.68.42)--SSL Enabled Critical Error! Object reference not set to an instance of an object.
----24-08-2018 13:06:58 Hans_Dybkjær--pdc-1656.cdp.yy(17.110.68.42)--SSL Enabled *** Flushing tree cache for RepID 25 ***
----24-08-2018 13:06:58 Hans_Dybkjær--pdc-1656.cdp.yy(17.110.68.42)--SSL Enabled (8c18c163-ea71-413f-8624-25419b31d610) EndTx (Revision - 0) returned: Failure
----24-08-2018 13:06:58 Hans_Dybkjær--pdc-1656.cdp.yy(17.110.68.42)--SSL Enabled (8c18c163-ea71-413f-8624-25419b31d610) CopyBranch: $/Rel/09.05 to $/WebRel/2.10.46.109 TV2 EWS returned: Failure

Around 13:20 the server started working OK again…

We have inspected the following
• Windows system log @ SQL Server machine: No problems seen
• Windows application log @ SQL Server machine: No problems seen
• Windows system log @ IIS/VaultService machine: No problems seen
• Windows application log @ IIS/VaultService machine: No problems seen APART from the 10 warnings listed above
• SQL Server log: No problems seen

We can of course provide the entire sgVault.log and other logs if requested.

We understand the problem related to a timeout and understand that this can be partly solved by increasing the timeout setting for the connection, however we do not understand why such problems does not occur in any other logs. For instance we sometimes see records in the SQL server logs that "X transactions took more than Y seconds to complete", but no such messages today.


Hans Olav

Tonya
Posts: 871
Joined: Thu Jan 20, 2005 1:47 pm
Location: SourceGear

Re: A bad day with the Vault server

Post by Tonya » Fri Aug 24, 2018 8:05 am

Hi Hans,

We'll need the full server log. Please send it over to support@sourcegear.com so we can look into this further.

Thanks,

Tonya

Hans Olav Nymand
Posts: 55
Joined: Wed Sep 29, 2004 8:09 am
Location: Denmark, Copenhagen
Contact:

Re: A bad day with the Vault server

Post by Hans Olav Nymand » Mon Aug 27, 2018 1:16 am

Hi Tonya,

I've sent the logfiles.

We've also determined that at least part of the problem was a long-lived database lock taken by the
stored procedure sgVault.dbo.ufngetusersecurityrights.

Best regards,
Hans Olav

Tonya
Posts: 871
Joined: Thu Jan 20, 2005 1:47 pm
Location: SourceGear

Re: A bad day with the Vault server

Post by Tonya » Mon Aug 27, 2018 9:49 am

Thank you for sending over the log file. I've responded in the open support ticket.

Tonya

Post Reply