Mongo

Connection Issues

Below is an example of a connection issue you might find in the mongo container logs.
docker logs mongo
...
{
  "t": {
    "$date": "2022-06-14T14:42:03.662+00:00"
  },
  "s": "I",
  "c": "NETWORK",
  "id": 4712102,
  "ctx": "ReplicaSetMonitor-TaskExecutor",
  "msg": "Host failed in replica set",
  "attr": {
    "replicaSet": "skynet",
    "host": "eu-fin-8.siasky.net:27017",
    "error": {
      "code": 6,
      "codeName": "HostUnreachable",
      "errmsg": "Error connecting to eu-fin-8.siasky.net:27017 (46.183.219.217:27017) :: caused by :: No route to host"
    },
    "action": {
      "dropConnections": true,
      "requestImmediateCheck": false,
      "outcome": {
        "host": "eu-fin-8.siasky.net:27017",
        "success": false,
        "errorMessage": "HostUnreachable: Error connecting to eu-fin-8.siasky.net:27017 (46.183.219.217:27017) :: caused by :: No route to host"
      }
    }
  }
}
This indicates that the other members of the cluster cannot reach the eu-fin-8 server.
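When the logs are long, it can help to summarise which hosts are being reported as failed. The following is a minimal sketch, assuming that mongo writes each log entry as a single-line JSON object (the example above is spread over several lines for readability) and that the standard grep, sed, sort and uniq tools are available. failed_hosts is a hypothetical helper name, not part of mongo or docker.

```shell
# failed_hosts: hypothetical helper, not part of mongo or docker.
# Reads mongo log lines on stdin and prints each host named in a
# "Host failed in replica set" entry, prefixed by its number of occurrences.
# Assumption: one JSON log entry per line.
failed_hosts() {
  grep -F 'Host failed in replica set' \
    | sed -n 's/.*"host":[[:space:]]*"\([^"]*\)".*/\1/p' \
    | sort | uniq -c | sort -rn
}

# Example usage (2>&1 also captures anything the container wrote to stderr):
#   docker logs mongo 2>&1 | failed_hosts
```

A host that appears with a high count is probably the one to investigate first.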

Troubleshooting Steps

Case 1 - Server is Offline

First, check whether eu-fin-8 is online. If the server is offline, bring it back online and the problem should resolve itself.

Case 2 - Server is Online

If the server is online but other servers can't connect to it, try the following steps.
  1. If your mongo cluster is set up on DNS entries, make sure that the DNS records are accurate and that the server's IP address has not changed.
  2. Verify that the firewall settings on the server are up to date and are not blocking mongo traffic. If your mongo cluster is operating on port 27017, you should see the following in your ufw settings:
sudo ufw status
Status: active
To                         Action      From
--                         ------      ----
...
27017/tcp                  ALLOW       Anywhere
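The firewall check in step 2 can also be scripted. This is a rough sketch, assuming the rule is listed in the `PORT/tcp ALLOW ...` form shown above; a rule written differently (for example without the /tcp suffix) would not be detected. ufw_allows_port is a hypothetical helper name.

```shell
# ufw_allows_port: hypothetical helper, not part of ufw.
# Reads `ufw status` output on stdin and succeeds (exit 0) if it
# contains an ALLOW rule for PORT/tcp, otherwise fails (exit 1).
ufw_allows_port() {
  awk -v rule="$1/tcp" '
    $1 == rule && $2 == "ALLOW" { found = 1 }
    END { exit !found }
  '
}

# Example usage:
#   sudo ufw status | ufw_allows_port 27017 && echo "27017/tcp is allowed"
```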

Case 3 - Server is Online, reachable, included in replicaset config, but not part of replicaset

Verify that the mongo node is reachable from other servers.
For example, check that the mongo service at eu-fin-4.example.com is reachable from your local machine or from another mongo server:
# Check the connection to the mongo service from another machine
curl eu-fin-4.example.com:27017
Expected successful response (i.e. mongo service is reachable):
It looks like you are trying to access MongoDB over HTTP on the native driver port.
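If you only need a yes/no answer, for example in a script, a plain TCP check is an alternative to reading the curl response. This is a sketch under the assumption that a netcat (`nc`) supporting the `-z` (connect without sending data) and `-w` (timeout in seconds) options is installed; mongo_port_open is a hypothetical helper name.

```shell
# mongo_port_open: hypothetical helper.
# Succeeds (exit 0) if a TCP connection to HOST on PORT (default 27017)
# can be opened within 5 seconds, otherwise fails.
# Assumption: an nc that supports -z and -w is installed.
mongo_port_open() {
  nc -z -w 5 "$1" "${2:-27017}" >/dev/null 2>&1
}

# Example usage:
#   mongo_port_open eu-fin-4.example.com && echo "reachable" || echo "unreachable"
```

Note that this only confirms that the port accepts connections; it does not confirm that the process listening on it is mongo, which the curl response above does.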
Verify server is connected to mongo replicaset:
# Connect to the server
# Connect to mongo shell
. skynet-webportal/.env && docker exec -it mongo mongo -u admin -p "$SKYNET_DB_PASS"
You should get a mongo shell.
If the last line of mongo shell looks like:
skynet:<PRIMARY>
or
skynet:<SECONDARY>
... where skynet is the name of your replicaset, then the mongo node is correctly in the replicaset.
If the last line in mongo shell ends with just:
>
... then the server is not part of a replicaset.
Verify server is in replicaset config:
# Connect to any other server that runs replicaset correctly (e.g. eu-fin-3)
# Connect to mongo shell
. skynet-webportal/.env && docker exec -it mongo mongo -u admin -p "$SKYNET_DB_PASS"
// In the mongo shell: check the replicaset config of the server eu-fin-4
rs.config().members.forEach(m => {if (m.host == 'eu-fin-4.example.com:27017') {printjson(m)}})
... you should see config data of your server.
Verify server status in replicaset:
In mongo shell (on the working replicaset server) run:
// Check server status
rs.status().members.forEach(m => {if (m.name == 'eu-fin-4.example.com:27017') {printjson(m)}})
If you see in the output JSON:
...
"name" : "eu-fin-4.example.com:27017",
...
"stateStr" : "(not reachable/healthy)",
...
"lastHeartbeatMessage" : "Our replica set configuration is invalid or does not include us",
...
... then the replicaset is not configured correctly on your server (eu-fin-4).
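With many members, the same check can be run over the whole member list at once. The sketch below assumes the legacy mongo shell's pretty-printed `printjson` output (one `"key" : "value"` pair per line, with `name` printed before `lastHeartbeatMessage`, as in the output above); members_with_invalid_config is a hypothetical helper name.

```shell
# members_with_invalid_config: hypothetical helper.
# Reads the output of printjson(rs.status().members) on stdin and prints
# the name of every member whose lastHeartbeatMessage says that the
# replica set configuration is invalid or does not include it.
members_with_invalid_config() {
  awk -F'"' '
    $2 == "name" { name = $4 }
    $2 == "lastHeartbeatMessage" &&
      $4 ~ /configuration is invalid or does not include us/ { print name }
  '
}

# Example usage (on a server whose replicaset member works correctly):
#   . skynet-webportal/.env && docker exec mongo mongo -u admin -p "$SKYNET_DB_PASS" \
#     --quiet --eval 'printjson(rs.status().members)' | members_with_invalid_config
```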
The easiest fix is to delete the mongo db directory on eu-fin-4 and rerun the portals-setup-following playbook.
WARNING:
Deleting mongo db directory is a dangerous operation!!!
We can perform it now because:
  • Our mongo cluster (except eu-fin-4) is running correctly and contains all production data.
  • We are not touching mongo cluster production data.
  • We are touching only local, non-production mongo data stored on eu-fin-4.
Continue only if you have read and understood the warning above.
Steps on the eu-fin-4 server (assuming the server is not running correctly and has already been taken out of the load balancer):
# Stop sia and mongo
docker stop sia mongo
# Remove unused, non-production data
sudo rm -rf /home/user/skynet-webportal/docker/data/mongo/db
Then rerun the portals-setup-following playbook.
You will get a prompt from Ansible:
Do you want to reset MongoDB database on eu-fin-4 (y/n)?:
Confirm the prompt, and the Ansible playbook will configure the mongo replicaset correctly.
Because the eu-fin-4 server is already in the replicaset config of the mongo cluster (verified above), the cluster connects to the server automatically and makes it a replicaset member.