How to turn a standalone MongoDB server into a replica set with Docker-Compose

How to setup and deploy a MongoDB replica set with Docker?
How do I start a replica set of MongoDB in Docker?
How to automatically initiate MongoDB replica set in Docker-Compose container?

docker-compose.yml

Original standalone configuration

# docker-compose.yml
version: '3'

services:
    ...
    mongodb:
        build: .docker/mongodb
        container_name: mongodb
        volumes:
            - ./.docker/mongodb/mongod.conf:/etc/mongod.conf
            - ./.docker/mongodb/initdb.d/:/docker-entrypoint-initdb.d/
            - ./.docker/mongodb/data/db/:/data/db/
            - ./.docker/mongodb/data/log/:/var/log/mongodb/
        env_file:
            - .env
        environment:
            MONGO_INITDB_ROOT_USERNAME: ${MONGO_INITDB_ROOT_USERNAME}
            MONGO_INITDB_ROOT_PASSWORD: ${MONGO_INITDB_ROOT_PASSWORD}
            MONGO_INITDB_DATABASE: ${MONGO_INITDB_DATABASE}
        ports:
            - "27017:27017"
        command: ["-f", "/etc/mongod.conf"]

New replica set configuration

# docker-compose.yml
version: '3.4'

services:
    ...
    mongodb:
        build: .docker/mongodb
        container_name: mongodb
        hostname: mongodb
        volumes:
            - ./.docker/mongodb/mongod.conf:/etc/mongod.conf
            - ./.docker/mongodb/initdb.d/:/docker-entrypoint-initdb.d/
            - ./.docker/mongodb/data/db/:/data/db/
            - ./.docker/mongodb/data/log/:/var/log/mongodb/
        env_file:
            - .env
        environment:
            MONGO_INITDB_ROOT_USERNAME: ${MONGO_INITDB_ROOT_USERNAME}
            MONGO_INITDB_ROOT_PASSWORD: ${MONGO_INITDB_ROOT_PASSWORD}
            MONGO_INITDB_DATABASE: ${MONGO_INITDB_DATABASE}
            MONGO_REPLICA_SET_NAME: ${MONGO_REPLICA_SET_NAME}
        ports:
            - "27017:27017"
        healthcheck:
            test: test $$(echo "rs.initiate().ok || rs.status().ok" | mongo -u $${MONGO_INITDB_ROOT_USERNAME} -p $${MONGO_INITDB_ROOT_PASSWORD} --quiet) -eq 1
            interval: 10s
            start_period: 30s
        command: ["-f", "/etc/mongod.conf", "--replSet", "${MONGO_REPLICA_SET_NAME}", "--bind_ip_all"]

What has changed and why?

version: '3.4'

The file format version had to be raised to 3.4, the first version that supports the healthcheck start_period parameter (more below).

        hostname: mongodb

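The hostname needs to be specified so that our mongodb container does not lose its PRIMARY status and throw errors like "Our replica set config is invalid or we are not a member of it" after a restart. Without it, the hostname defaults to the container ID, which changes whenever the container is recreated, so the member host stored in the replica set configuration no longer matches the running server.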
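The member host can also be spelled out explicitly, because rs.initiate() accepts a configuration document. The following is only a sketch, assuming the replica set name rs0 and the hostname mongodb used in this article; the helper name build_initiate_command is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: prints an rs.initiate() call that names the member host explicitly.
build_initiate_command() {
    # $1 = replica set name, $2 = host name, $3 = port (defaults to 27017)
    printf 'rs.initiate({_id: "%s", members: [{_id: 0, host: "%s:%s"}]})\n' \
        "$1" "$2" "${3:-27017}"
}

build_initiate_command rs0 mongodb
```

The printed line can be pasted into the mongo shell or passed to it with --eval. The host still has to resolve to the container itself, so the hostname setting remains useful.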

        environment:
            ...
            MONGO_REPLICA_SET_NAME: ${MONGO_REPLICA_SET_NAME}

A new environment variable (defined in the .env file) that holds the replica set name, which MongoDB uses as the replica set _id.

        healthcheck:
            test: test $$(echo "rs.initiate().ok || rs.status().ok" | mongo -u $${MONGO_INITDB_ROOT_USERNAME} -p $${MONGO_INITDB_ROOT_PASSWORD} --quiet) -eq 1
            interval: 10s
            start_period: 30s

This one merits a longer background story.

In short, to run a MongoDB server as a replica set, two things are necessary: starting the mongod daemon with the --replSet parameter (more on it below) and initiating the replica set with the rs.initiate() command. The second step can be done manually once the container has properly started, but it would be even better if our docker-compose.yml could do it automatically when starting up the containers.
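For reference, the manual step could look roughly like this. It is a sketch that assumes the container name mongodb from the compose file and an image that ships the legacy mongo shell; the function name initiate_replica_set and the DOCKER variable are illustrative, not part of the original setup:

```shell
#!/bin/sh
# Sketch: initiate the replica set by hand in an already running container.
# DOCKER is overridable only so the function can be dry-run without Docker.
DOCKER="${DOCKER:-docker}"

initiate_replica_set() {
    # $1 = root username, $2 = root password
    "$DOCKER" exec mongodb mongo -u "$1" -p "$2" --quiet --eval 'rs.initiate()'
}
```

Calling initiate_replica_set with your root credentials once mongod is up performs the same initiation that the healthcheck automates.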

One possible way to do this is the healthcheck configuration option. This is not exactly what it is meant for, but we can make it work for us just as well.

Q&A: What is healthcheck supposed to be used for?

healthcheck configures a check that’s run to determine whether or not containers for this service are “healthy”.

The HEALTHCHECK instruction tells Docker how to test a container to check that it is still working. This can detect cases such as a web server that is stuck in an infinite loop and unable to handle new connections, even though the server process is still running.

When a container has a healthcheck specified, it has a health status in addition to its normal status. This status is initially starting. Whenever a health check passes, it becomes healthy (whatever state it was previously in). After a certain number of consecutive failures, it becomes unhealthy.

The health check will first run interval seconds after the container is started, and then again interval seconds after each previous check completes.

If a single run of the check takes longer than timeout seconds then the check is considered to have failed.

It takes retries consecutive failures of the health check for the container to be considered unhealthy.

start period provides initialization time for containers that need time to bootstrap. Probe failure during that period will not be counted towards the maximum number of retries. However, if a health check succeeds during the start period, the container is considered started and all consecutive failures will be counted towards the maximum number of retries.

There can only be one HEALTHCHECK instruction in a Dockerfile. If you list more than one then only the last HEALTHCHECK will take effect.

The command after the CMD keyword can be either a shell command (e.g. HEALTHCHECK CMD /bin/check-running) or an exec array (as with other Dockerfile commands).

The command’s exit status indicates the health status of the container. The possible values are:

  • 0: success - the container is healthy and ready for use
  • 1: unhealthy - the container is not working correctly
  • 2: reserved - do not use this exit code

Read more in the docker-compose documentation and the Docker documentation.
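The status rules quoted above can be illustrated with a toy simulation. This is only a sketch of the pass/fail counting: it ignores start_period and timeout, and the function name health_status is invented for the example:

```shell
#!/bin/sh
# Toy model of the health status rules: any pass makes the container healthy,
# and "retries" consecutive failures make it unhealthy.
health_status() {
    # $1 = retries, remaining arguments = exit codes of successive checks
    retries=$1
    shift
    status=starting
    failures=0
    for code in "$@"; do
        if [ "$code" -eq 0 ]; then
            status=healthy
            failures=0
        else
            failures=$((failures + 1))
            if [ "$failures" -ge "$retries" ]; then
                status=unhealthy
            fi
        fi
    done
    echo "$status"
}

health_status 3 1 1      # two failures, still "starting"
health_status 3 1 1 1    # three consecutive failures, "unhealthy"
health_status 3 1 1 0 1  # a pass resets the failure count, "healthy"
```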

What we ask it to do here is run rs.initiate() (initiates a replica set) and, if that does not report success, rs.status() (returns the current status of the replica set) against our MongoDB installation. On the first run rs.initiate() returns 1 in its ok property; on later runs the set is already initiated, so rs.initiate() returns 0 and rs.status() supplies the 1 instead. As long as one of them returns 1, the whole test command exits with 0, meaning that the container is healthy and ready to use (and has the replica set initiated, which is our ultimate goal).
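The same logic can be written as a standalone function, which makes it easier to try outside of Compose. This is a sketch under assumptions: the function name replica_set_ok and the overridable MONGO variable are illustrative, and it compares strings rather than using -eq so that empty or unexpected shell output simply counts as a failure:

```shell
#!/bin/sh
# Sketch of the healthcheck logic as a function.
# MONGO is overridable so the logic can be exercised without a database.
MONGO="${MONGO:-mongo}"

replica_set_ok() {
    # The JavaScript expression evaluates to 1 when either command reports ok: 1.
    result=$(echo "rs.initiate().ok || rs.status().ok" \
        | "$MONGO" -u "$MONGO_INITDB_ROOT_USERNAME" -p "$MONGO_INITDB_ROOT_PASSWORD" --quiet) \
        || return 1
    # Succeed (exit status 0) only if the shell printed exactly 1.
    [ "$result" = "1" ]
}
```

Inside the container, where both credential variables are set, replica_set_ok returns 0 once the replica set is initiated and 1 otherwise.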

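One thing to note is that in the healthcheck configuration option each $ symbol needs to be escaped with another $, so that Compose passes a literal $ through to the container's shell instead of trying to interpolate it. Without the escaping you get an "ERROR: Invalid interpolation format for "healthcheck" option in service "mongodb": <command>".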

        command: ["-f", "/etc/mongod.conf", "--replSet", "${MONGO_REPLICA_SET_NAME}", "--bind_ip_all"]

Finally, the updated command starts the mongod service as a replica set member and binds it to all IPv4 addresses.

.env.dist

The only change here is a new environment variable that specifies the name to be used as the replica set _id:

MONGO_REPLICA_SET_NAME=rs0

The same change needs to be copied over to your actual .env file and adapted to your needs.

Final result

After adding all of the above configuration and restarting your mongodb container, wait until it starts up and becomes healthy:

$ docker-compose ps
           Name                         Command                  State                Ports          
-----------------------------------------------------------------------------------------------------
mongodb                      docker-entrypoint.sh -f /e ...   Up (healthy)   0.0.0.0:27017->27017/tcp
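Instead of re-running docker-compose ps by hand, the wait can be scripted by polling the health status that docker inspect reports. A minimal sketch, assuming the container name mongodb; the function name wait_until_healthy and the DOCKER and POLL_INTERVAL variables are illustrative:

```shell
#!/bin/sh
# Sketch: block until a container reports the "healthy" status.
# DOCKER is overridable only so the loop can be dry-run without Docker.
DOCKER="${DOCKER:-docker}"

wait_until_healthy() {
    # $1 = container name, $2 = maximum number of polls (defaults to 30)
    container=$1
    attempts=${2:-30}
    while [ "$attempts" -gt 0 ]; do
        status=$("$DOCKER" inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) \
            || status=""
        if [ "$status" = "healthy" ]; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "${POLL_INTERVAL:-2}"
    done
    return 1
}
```

For example, wait_until_healthy mongodb 30 returns 0 as soon as the container is healthy, or 1 if it has not become healthy after 30 polls.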

To verify whether the replica set has been deployed properly, open a shell in your mongodb container:

$ docker exec -it mongodb bash

and log in to the database:

$ mongo -u <root> -p <root>

Replace <root> with your own root username and password.

You should see your replica set's name and PRIMARY in the prompt:

rs0:PRIMARY>

The rs.status() command should give you all the information on your replica set's current status:

rs0:PRIMARY> rs.status()
{
    "set" : "rs0",
    "date" : ISODate("2019-04-29T19:37:58.083Z"),
    "myState" : 1,
    "term" : NumberLong(6),
    "syncingTo" : "",
    "syncSourceHost" : "",
    "syncSourceId" : -1,
    "heartbeatIntervalMillis" : NumberLong(2000),
    "optimes" : {
        "lastCommittedOpTime" : {
            "ts" : Timestamp(1556566675, 1),
            "t" : NumberLong(6)
        },
        "readConcernMajorityOpTime" : {
            "ts" : Timestamp(1556566675, 1),
            "t" : NumberLong(6)
        },
        "appliedOpTime" : {
            "ts" : Timestamp(1556566675, 1),
            "t" : NumberLong(6)
        },
        "durableOpTime" : {
            "ts" : Timestamp(1556566675, 1),
            "t" : NumberLong(6)
        }
    },
    "lastStableCheckpointTimestamp" : Timestamp(1556566655, 1),
    "members" : [
        {
            "_id" : 0,
            "name" : "mongodb:27017",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 208,
            "optime" : {
                "ts" : Timestamp(1556566675, 1),
                "t" : NumberLong(6)
            },
            "optimeDate" : ISODate("2019-04-29T19:37:55Z"),
            "syncingTo" : "",
            "syncSourceHost" : "",
            "syncSourceId" : -1,
            "infoMessage" : "",
            "electionTime" : Timestamp(1556566483, 1),
            "electionDate" : ISODate("2019-04-29T19:34:43Z"),
            "configVersion" : 1,
            "self" : true,
            "lastHeartbeatMessage" : ""
        }
    ],
    "ok" : 1,
    "operationTime" : Timestamp(1556566675, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1556566675, 1),
        "signature" : {
            "hash" : BinData(0,"wYB/wEpnycj35tqDe0xx9M0MiL4="),
            "keyId" : NumberLong("6685216626012323842")
        }
    }
}

rs.conf() (or its alias rs.config()) returns a document that contains the current replica set configuration:

rs0:PRIMARY> rs.config()
{
    "_id" : "rs0",
    "version" : 1,
    "protocolVersion" : NumberLong(1),
    "writeConcernMajorityJournalDefault" : true,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongodb:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : NumberLong(0),
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatIntervalMillis" : 2000,
        "heartbeatTimeoutSecs" : 10,
        "electionTimeoutMillis" : 10000,
        "catchUpTimeoutMillis" : -1,
        "catchUpTakeoverDelayMillis" : 30000,
        "getLastErrorModes" : {

        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        },
        "replicaSetId" : ObjectId("5cc6a9189e2f8beadb9824e4")
    }
}

Finally, db.isMaster() gives the ultimate confirmation that your server is the primary node; look at its ismaster property in particular:

rs0:PRIMARY> db.isMaster()
{
    "hosts" : [
        "mongodb:27017"
    ],
    "setName" : "rs0",
    "setVersion" : 1,
    "ismaster" : true,
    "secondary" : false,
    "primary" : "mongodb:27017",
    "me" : "mongodb:27017",
    "electionId" : ObjectId("7fffffff0000000000000006"),
    "lastWrite" : {
        "opTime" : {
            "ts" : Timestamp(1556566775, 1),
            "t" : NumberLong(6)
        },
        "lastWriteDate" : ISODate("2019-04-29T19:39:35Z"),
        "majorityOpTime" : {
            "ts" : Timestamp(1556566775, 1),
            "t" : NumberLong(6)
        },
        "majorityWriteDate" : ISODate("2019-04-29T19:39:35Z")
    },
    "maxBsonObjectSize" : 16777216,
    "maxMessageSizeBytes" : 48000000,
    "maxWriteBatchSize" : 100000,
    "localTime" : ISODate("2019-04-29T19:39:37.158Z"),
    "logicalSessionTimeoutMinutes" : 30,
    "minWireVersion" : 0,
    "maxWireVersion" : 7,
    "readOnly" : false,
    "ok" : 1,
    "operationTime" : Timestamp(1556566775, 1),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1556566775, 1),
        "signature" : {
            "hash" : BinData(0,"gclCKMb4hBpgCzf3j3rMpaZpT6E="),
            "keyId" : NumberLong("6685216626012323842")
        }
    }
}
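The same check can be scripted from the host, for example in a deployment script. A minimal sketch, assuming the container name mongodb and the legacy mongo shell; the function name is_primary and the DOCKER variable are illustrative:

```shell
#!/bin/sh
# Sketch: succeed only if the server reports itself as primary.
# DOCKER is overridable only so the function can be dry-run without Docker.
DOCKER="${DOCKER:-docker}"

is_primary() {
    # $1 = root username, $2 = root password
    answer=$("$DOCKER" exec mongodb mongo -u "$1" -p "$2" --quiet \
        --eval 'db.isMaster().ismaster') || return 1
    [ "$answer" = "true" ]
}
```

Called with your root credentials, is_primary returns 0 when the ismaster property is true and 1 otherwise.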