HTTP 500 Errors During MongoDB Failover in FileCloud HA Environment

Original Question or Issue:

During Disaster Recovery (DR) testing, shutting down the MongoDB primary server in the RCC environment caused the FileCloud application to become unresponsive until the MongoDB server was brought back online.


Environment:

  • Product - FileCloud Server
  • Version - 23.253
  • Platform - Linux

Steps to Reproduce:

  • Configure FileCloud with a MongoDB replica set.

  • Perform DR testing by shutting down the active MongoDB primary node.

  • Observe FileCloud application behavior during the MongoDB election process.

  • Attempt user operations while the replica set is electing a new primary node.
  • Monitor FileCloud, MongoDB, and web server logs.
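When observing the election, it can help to record the moment a new primary appears. The following is a minimal sketch, assuming a status document shaped like the output of MongoDB's replSetGetStatus command (rs.status() in mongosh), which lists members with name and stateStr fields. The host names and the current_primary helper are hypothetical and used only for illustration.

```python
def current_primary(status):
    """Return the name of the PRIMARY member, or None if no primary is listed."""
    for member in status.get("members", []):
        if member.get("stateStr") == "PRIMARY":
            return member.get("name")
    return None


# Hypothetical snapshot taken while the old primary is down and the
# election is still in progress: no member reports PRIMARY.
during_election = {
    "members": [
        {"name": "mongo1.example.com:27017", "stateStr": "(not reachable/healthy)"},
        {"name": "mongo2.example.com:27017", "stateStr": "SECONDARY"},
        {"name": "mongo3.example.com:27017", "stateStr": "SECONDARY"},
    ]
}

# Hypothetical snapshot after the election has completed.
after_election = {
    "members": [
        {"name": "mongo1.example.com:27017", "stateStr": "(not reachable/healthy)"},
        {"name": "mongo2.example.com:27017", "stateStr": "PRIMARY"},
        {"name": "mongo3.example.com:27017", "stateStr": "SECONDARY"},
    ]
}

print(current_primary(during_election))
print(current_primary(after_election))
```

Comparing the timestamps of the first snapshot without a primary and the first snapshot with one gives an approximate election duration for the test run.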

Error or Log Message:

Possible symptoms include:

  • HTTP 500 errors returned to users.

  • Temporary database connectivity errors in FileCloud logs.
  • Application appears unresponsive during MongoDB failover.
  • User requests fail while no primary MongoDB node is available.

Defect or Enhancement Number:

 


Cause:

The MongoDB replica set failover completed successfully, with a new primary elected in approximately 11 seconds. During the election period, temporary database connectivity errors and HTTP 500 responses were expected because no writable primary was available.

The extended application unavailability was primarily caused by the RCC web services remaining offline after the MongoDB cluster had already recovered. Additionally, the MongoDB driver was configured with the default fail-fast setting (serverSelectionTryOnce=true), causing requests to fail immediately instead of waiting for the new primary to become available.
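The difference between fail-fast and waiting behavior can be illustrated with a small timing model. This is a simplified sketch, not the driver's actual implementation: it assumes the old primary fails at time 0, a new primary becomes available after the election completes, and a request either fails on its first attempt (try-once) or waits up to a timeout. The simulate_request function and Outcome type are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    ok: bool        # True if the request found a primary
    waited_ms: int  # time the request spent waiting for a primary


def simulate_request(request_at_ms, election_ms, try_once, timeout_ms):
    """Model one request that arrives request_at_ms after the primary failed.

    election_ms is how long the replica set takes to elect a new primary.
    """
    remaining_ms = election_ms - request_at_ms
    if remaining_ms <= 0:
        # The new primary is already available.
        return Outcome(ok=True, waited_ms=0)
    if try_once:
        # Fail-fast: a single attempt finds no primary and gives up.
        return Outcome(ok=False, waited_ms=0)
    if remaining_ms <= timeout_ms:
        # The election finishes before the timeout expires.
        return Outcome(ok=True, waited_ms=remaining_ms)
    # The timeout expires while the election is still in progress.
    return Outcome(ok=False, waited_ms=timeout_ms)


ELECTION_MS = 11_000  # approximate election time reported in this case

# A request arriving 2 seconds into the election:
print(simulate_request(2_000, ELECTION_MS, try_once=True, timeout_ms=15_000))
# -> Outcome(ok=False, waited_ms=0)
print(simulate_request(2_000, ELECTION_MS, try_once=False, timeout_ms=15_000))
# -> Outcome(ok=True, waited_ms=9000)
```

In this model, a request that arrives during the election fails immediately in try-once mode, while the same request succeeds after a delay when the timeout is longer than the remaining election time. A real driver rescans the replica set at intervals, so actual waits would be somewhat longer than the model suggests.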


Resolution or Workaround:

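To improve FileCloud behavior during MongoDB failovers, update the MongoDB connection settings as follows. The 15-second server selection timeout exceeds the roughly 11-second election observed in this case, so requests can wait for the new primary instead of failing immediately: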

'serverSelectionTryOnce' => false,
'serverSelectionTimeoutMS' => 15000,
'connectTimeoutMS' => 60000,
'socketTimeoutMS' => 60000
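The same four options can also be expressed as query parameters in a MongoDB connection string, which is sometimes useful when checking the settings outside FileCloud. The sketch below is an assumption-laden illustration: the host names, the replica set name rs0, and the build_mongo_uri helper are hypothetical, and how FileCloud itself reads these options depends on the deployment.

```python
from urllib.parse import urlencode


def build_mongo_uri(hosts, replica_set, options):
    """Build a mongodb:// connection string with the given options."""
    query = {"replicaSet": replica_set}
    for key, value in options.items():
        # Connection strings use lowercase true/false for booleans.
        if isinstance(value, bool):
            value = "true" if value else "false"
        query[key] = value
    return "mongodb://" + ",".join(hosts) + "/?" + urlencode(query)


# Hypothetical replica set members.
hosts = [
    "mongo1.example.com:27017",
    "mongo2.example.com:27017",
    "mongo3.example.com:27017",
]

# The settings recommended in this article.
failover_options = {
    "serverSelectionTryOnce": False,
    "serverSelectionTimeoutMS": 15000,
    "connectTimeoutMS": 60000,
    "socketTimeoutMS": 60000,
}

uri = build_mongo_uri(hosts, "rs0", failover_options)
print(uri)
```

The serverSelectionTryOnce option applies to single-threaded drivers such as the one used by PHP; multi-threaded drivers generally do not offer it, so the resulting string may be rejected or produce warnings in other tools.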

Benefits:

  • Allows FileCloud to wait for MongoDB primary election completion.

  • Reduces the likelihood of HTTP 500 errors during failover events.
  • Improves application resilience during planned DR testing and unplanned MongoDB failovers.
  • Sets explicit connection and socket timeouts (60 seconds each) so that stalled requests fail predictably.

Notes:

  • Ensure web services remain available during MongoDB failover testing whenever possible.

  • Verify load balancer health checks and routing behavior after failover events.
  • If load testing is required, simulate activity through:
      • FileCloud APIs
      • FileCloud Desktop/ServerSync clients
      • Multiple concurrent users performing uploads, downloads, and file operations
  • Microsoft Defender is not known to prevent FileCloud from reconnecting after MongoDB failover, although it may introduce minor latency during reconnection.