A MongoDB replica set is a group of nodes that maintain the same dataset to provide high availability, redundancy, and automatic failover for reliable read and write operations.
- Primary Node: Handles all write operations and replicates data to secondaries.
- Secondary Nodes: Maintain copies of data and can serve read requests to distribute load.
- Arbiter (Optional): Participates in elections to choose a primary but does not store data.
- High Availability: Automatic failover minimizes downtime during failures.
- Fault Tolerance: Data redundancy protects against hardware or network issues.
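As a rough illustration of how these roles show up in practice, the sketch below builds a configuration document of the kind passed to `rs.initiate()` in mongosh. The set name `rs0` and the hostnames are placeholders chosen for this example, not values from this article. The configuration does not name a primary: the members elect one after the set is initiated.

```javascript
// Hypothetical three-member replica set: two data-bearing nodes and one arbiter.
// The set name and hostnames are placeholders chosen for illustration.
const replicaSetConfig = {
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo0.example.net:27017" },
    { _id: 1, host: "mongo1.example.net:27017" },
    { _id: 2, host: "mongo2.example.net:27017", arbiterOnly: true }
  ]
};

// In mongosh, connected to one of the members, the set would be started with:
// rs.initiate(replicaSetConfig)

// Count the members that actually store data (everything except the arbiter).
const dataBearing = replicaSetConfig.members.filter(m => !m.arbiterOnly);
const arbiters = replicaSetConfig.members.length - dataBearing.length;
console.log(`${dataBearing.length} data-bearing members, ${arbiters} arbiter`);
```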
Write Semantics
In a MongoDB replica set, all write operations go to the primary node, which records changes in its oplog and replicates them to secondary nodes asynchronously.
- Primary-only Writes: Inserts, updates, and deletes are handled by the primary node.
- Asynchronous Replication: Secondaries pull changes from the primary’s oplog.
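- Write Acknowledgment: The write concern determines when a write is acknowledged. With w: 1 the primary acknowledges once it has applied the write itself; with w: "majority" (the default in MongoDB 5.0 and later for most configurations) it waits until a majority of members have the write.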
- High Availability: If the primary fails, a secondary can be elected to continue handling writes.
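The replication flow described above can be pictured with a small in-memory model. This is a deliberately simplified sketch, not MongoDB's implementation: the `Primary` and `Secondary` classes and the integer `ts` counter are invented for illustration. It shows why a secondary can briefly lag behind the primary.

```javascript
// Simplified in-memory model of oplog-based replication (not MongoDB's implementation).
class Primary {
  constructor() {
    this.oplog = [];       // ordered log of applied writes
    this.data = new Map(); // current state of the collection
  }
  // Apply the write locally and record it in the oplog.
  write(id, doc) {
    this.data.set(id, doc);
    this.oplog.push({ ts: this.oplog.length + 1, id, doc });
    return { acknowledged: true };
  }
}

class Secondary {
  constructor() {
    this.lastApplied = 0;  // ts of the newest oplog entry applied so far
    this.data = new Map();
  }
  // Pull and apply every oplog entry newer than the last one applied.
  pullFrom(primary) {
    for (const entry of primary.oplog) {
      if (entry.ts > this.lastApplied) {
        this.data.set(entry.id, entry.doc);
        this.lastApplied = entry.ts;
      }
    }
  }
}

const primary = new Primary();
const secondary = new Secondary();
primary.write(1, { name: "Alex", age: 30 });

const beforePull = secondary.data.has(1); // false: the write has not replicated yet
secondary.pullFrom(primary);
const afterPull = secondary.data.has(1);  // true: the secondary has caught up
console.log(beforePull, afterPull);
```

The gap between the write and the pull is the window in which a read from the secondary would return stale data.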
Example: Writing to the Primary Node
// Insert a document into the collection
db.myCollection.insertOne({ "name": "Alex", "age": 30 })
Output:
{
"acknowledged": true,
"insertedId": ObjectId("60f9d7ac345b7c9df348a86e")
}
When the write succeeds, MongoDB returns this acknowledgment together with the _id of the inserted document.
Read Semantics
In a MongoDB replica set, reads go to the primary by default for strong consistency, but can be routed to secondary nodes using read preferences to improve scalability and availability.
- Default Reads: Routed to the primary to ensure the most up-to-date data.
- Read Preferences: Allow directing reads to secondaries for better read scalability.
- Consistency Trade-off: Reads from secondaries may return slightly stale data.
- Fault Tolerance: With any read preference other than primary, secondaries can keep serving reads while the primary is unavailable.
Example: Read from Secondary Node
// Route this query to a secondary node (readPref is the mongosh cursor method)
db.myCollection.find().readPref("secondary")
Output:
{ "_id": ObjectId("60f9d7ac345b7c9df348a86e"), "name": "Alex", "age": 30 }
MongoDB routes the query to one of the secondary nodes and returns the matching documents from that node's copy of the data. Because replication is asynchronous, the result can lag slightly behind the primary.
Read Preferences in MongoDB
MongoDB provides different read preferences to balance performance and consistency:
| Read Preference | Description |
|---|---|
| primary (default) | Reads from the primary only. Ensures strong consistency. |
| primaryPreferred | Reads from the primary if available, otherwise falls back to a secondary. |
| secondary | Reads from secondary nodes only, which spreads read load. May return stale data; the read fails if no secondary is available. |
| secondaryPreferred | Reads from secondaries but uses primary if no secondary is available. |
| nearest | Reads from the nearest node (primary or secondary) based on network latency. |
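The modes in the table can be summarized with a small selection function. This is a simplified model for illustration only: `selectMember` and the `{ host, role, latencyMs }` member shape are hypothetical, and the function always picks the lowest-latency candidate so the result is deterministic, whereas real drivers choose randomly among eligible members that fall within a latency window. A `null` result stands for "no eligible member", which a real driver would report as an error.

```javascript
// Simplified model of read preference modes (hypothetical helper, not driver code).
// Each member is { host, role: "primary" | "secondary" | "arbiter", latencyMs }.
function selectMember(members, mode) {
  const primary = members.find(m => m.role === "primary") || null;
  const secondaries = members.filter(m => m.role === "secondary");
  // Only data-bearing members can serve reads; arbiters are excluded.
  const readable = members.filter(m => m.role === "primary" || m.role === "secondary");
  // Lowest-latency member of a list, or null if the list is empty.
  const fastest = list =>
    list.length ? list.reduce((a, b) => (b.latencyMs < a.latencyMs ? b : a)) : null;

  switch (mode) {
    case "primary": return primary;
    case "primaryPreferred": return primary || fastest(secondaries);
    case "secondary": return fastest(secondaries);
    case "secondaryPreferred": return fastest(secondaries) || primary;
    case "nearest": return fastest(readable);
    default: throw new Error(`Unknown read preference: ${mode}`);
  }
}

const members = [
  { host: "a", role: "primary", latencyMs: 20 },
  { host: "b", role: "secondary", latencyMs: 5 },
  { host: "c", role: "secondary", latencyMs: 12 }
];
console.log(selectMember(members, "primary").host);   // a
console.log(selectMember(members, "secondary").host); // b
console.log(selectMember(members, "nearest").host);   // b

// With the primary gone, "primary" finds nothing but "primaryPreferred" falls back.
const withoutPrimary = members.filter(m => m.role !== "primary");
console.log(selectMember(withoutPrimary, "primary"));               // null
console.log(selectMember(withoutPrimary, "primaryPreferred").host); // b
```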
Read Concern and Write Concern
MongoDB provides additional options to control the behavior of read and write operations:
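- Read Concern: Controls the consistency level of read operations.
  - Consistency Levels: Options include local, majority, and linearizable.
  - Data Freshness: The level decides which data a read may return. For example, local returns the node's most recent data, which could later be rolled back, while majority returns only data acknowledged by a majority of members.
- Write Concern: Controls how write operations are acknowledged.
  - Acknowledgment Levels: Set with the w option, for example w: 1 (acknowledged by the primary only) or w: "majority".
  - Timeout Control: wtimeout sets how long, in milliseconds, a write waits for the requested acknowledgment before returning an error.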
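The sketch below shows one way these options might be combined in mongosh. The helper names `insertWithMajority` and `findMajorityCommitted` are hypothetical, the 5000 ms timeout is an arbitrary choice, and `collection` is assumed to be a mongosh collection such as `db.myCollection`. The `writeConcern` option of `insertOne` and the cursor methods `readPref` and `readConcern` are the mongosh calls being illustrated.

```javascript
// Hypothetical helpers; `collection` is assumed to be a mongosh collection
// such as db.myCollection.

// Insert a document and wait until a majority of members acknowledge it.
// If that takes longer than 5000 ms the call returns an error, but the
// write itself is not undone.
function insertWithMajority(collection, doc) {
  return collection.insertOne(doc, {
    writeConcern: { w: "majority", wtimeout: 5000 }
  });
}

// Read only majority-acknowledged data, preferring secondaries when available.
function findMajorityCommitted(collection, filter) {
  return collection
    .find(filter)
    .readPref("secondaryPreferred")
    .readConcern("majority");
}

// Usage in mongosh:
// insertWithMajority(db.myCollection, { name: "Alex", age: 30 })
// findMajorityCommitted(db.myCollection, { name: "Alex" })
```

Pairing a majority write concern with a majority read concern trades some latency for reads that will not be rolled back after a failover.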