Note about MongoDB 2019.02


Engine comparison:

MMAPv1

MMAPv1 is quite mature and has proven to be stable over the years. One of the storage allocation strategies used with this engine is power of two allocation: each document's record is sized up to the next power of two, leaving room for growth so that in-place updates become highly likely without having to move documents during updates. Another strategy is fixed sizing, in which documents are padded (for example, with zeros) so that each document receives its maximum data allocation up front; this strategy suits applications with few updates. Consistency in MMAPv1 is achieved by journaling: writes are applied to a private view in memory and written to the on-disk journal, after which the changes are written to the shared view, that is, the data files. There is no support for data compression with MMAPv1. Lastly, MMAPv1 relies heavily on page caches and therefore uses the available memory to keep the working set in cache, which provides good performance, although MongoDB does yield (free up) cache memory if another process demands it. Some production deployments avoid enabling swap space to ensure these caches are not written to disk, which would deteriorate performance.
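
To see these allocation strategies from the mongo shell, a hedged sketch follows; the collection name orders is hypothetical, and the server is assumed to be running MMAPv1:

// Confirm which storage engine this mongod is using
db.serverStatus().storageEngine

// Switch a rarely updated collection from power of two allocation to exact-fit (no padding) allocation
db.runCommand({ collMod: "orders", noPadding: true })

// Inspect the collection's storage statistics, including its record allocation
db.orders.stats()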

WiredTiger

WiredTiger allows multiple clients to perform write operations on the same collection concurrently. It achieves this through document-level concurrency: during a write operation, the database locks only the document being modified, whereas its predecessor would lock the entire collection. This drastically improves performance for write-heavy applications. Additionally, WiredTiger compresses data for both indexes and collections; the compression algorithms currently used are Google’s Snappy and zlib. Although compression can be disabled, you should not jump that gun unless the decision is truly load-tested while planning your storage strategy. WiredTiger uses Multi-Version Concurrency Control (MVCC), which provides point-in-time snapshots of transactions. These finalized snapshots are written to disk as checkpoints, which determine the last good state of the data files and help recover data after abnormal shutdowns. Journaling is also supported, with write-ahead transaction logs; the combination of journaling and checkpoints increases the chance of data recovery during failures. WiredTiger uses an internal cache as well as the filesystem cache to answer queries faster, and its architecture is designed with high concurrency in mind so that it makes better use of multi-core systems.
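
As an illustration of collection-level compression, the following mongo shell sketch creates a collection that overrides the default Snappy compressor with zlib; the collection name appLogs is hypothetical, and the server is assumed to be running WiredTiger:

// Create a collection compressed with zlib instead of the default snappy
db.createCollection("appLogs", {
  storageEngine: { wiredTiger: { configString: "block_compressor=zlib" } }
})

// Verify the creation string WiredTiger used for this collection
db.appLogs.stats().wiredTiger.creationString

The server-wide default can also be set in mongod.conf via storage.wiredTiger.collectionConfig.blockCompressor (none, snappy, or zlib).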

Binding MongoDB process to a specific network interface and port

Customizing the MongoDB configuration file

  1. Start your favorite text editor and add the following to a file called mongod.conf (a minimal sketch is shown after these steps):

  2. Start your mongod instance:
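
A minimal mongod.conf sketch for these two steps; the interface address 192.168.1.112, port 27000, and the paths are placeholders rather than values from the recipe:

# mongod.conf (YAML format)
net:
  bindIp: 127.0.0.1,192.168.1.112   # listen only on loopback and one internal interface
  port: 27000                        # a non-default port
storage:
  dbPath: /data/db
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log

# Start the instance against this file:
# mongod --config /etc/mongod.conf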

Link: https://docs.mongodb.com/manual/reference/configuration-options/

Running MongoDB as a Docker container

  1. Download the latest MongoDB Docker image:

 

(https://thisdavej.com/how-to-install-redis-on-a-raspberry-pi-using-docker/)

docker pull redis:2.8.23

docker run -d --name redis2823 -v /home/ubuntu/redis_docker/redis2823:/data -p 0.0.0.0:9999:6379 --restart unless-stopped redis:2.8.23 --appendonly yes --maxmemory 512mb --tcp-backlog 128

  2. Check that the image exists:

  3. Start a container:

  4. Check if the container is running successfully:

  5. Let’s connect to our mongo server using the mongo client from the container:

  6. Stop the mongo instance and start a new one with host mode networking (see the sketch after these steps):

  7. Connect to the new instance using mongo shell:
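
A hedged sketch of steps 6 and 7, assuming the container from the earlier steps was named mongo-server-1 and that sharing the host's network stack is acceptable:

# Stop and remove the bridged container
docker stop mongo-server-1 && docker rm mongo-server-1

# Start a new container that uses host mode networking
docker run -d --name mongo-server-1 --network host mongo

# mongod is now reachable on the host's own port 27017
mongo localhost:27017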

Understanding and Managing Indexes

In this chapter, we will be covering the following topics:

  • Creating an index
  • Managing existing indexes
  • How to use compound indexes
  • Creating background indexes

  • Creating TTL-based indexes

https://docs.mongodb.com/v3.0/tutorial/expire-data/

  • Creating a sparse index
  • Creating a partial index
  • Creating a unique index
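
A compact mongo shell sketch of the index types listed above; the collections (users, sessions) and field names are hypothetical:

// Single-field and compound indexes
db.users.createIndex({ email: 1 })
db.users.createIndex({ lastName: 1, firstName: 1 })

// Background index build (does not block reads and writes while building)
db.users.createIndex({ city: 1 }, { background: true })

// TTL index: documents expire 3600 seconds after their lastLogin timestamp
db.sessions.createIndex({ lastLogin: 1 }, { expireAfterSeconds: 3600 })

// Sparse, partial, and unique indexes
db.users.createIndex({ nickname: 1 }, { sparse: true })
db.users.createIndex({ age: 1 }, { partialFilterExpression: { age: { $gt: 21 } } })
db.users.createIndex({ username: 1 }, { unique: true })

// Managing existing indexes
db.users.getIndexes()
db.users.dropIndex("city_1")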

Performance Tuning

In this chapter, we will be covering the following topics:

  • Configuring disks for better I/O
  • Measuring disk I/O performance with mongoperf
  • Finding slow running queries and operations
  • Figuring out the size of a working set
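
For instance, slow operations can be surfaced with the database profiler, and raw disk throughput can be estimated with mongoperf; the 100 ms threshold below is illustrative:

// Profile operations slower than 100 ms on the current database
db.setProfilingLevel(1, 100)

// Review the captured slow operations, newest first
db.system.profile.find().sort({ ts: -1 }).limit(5).pretty()

// Shell: feed mongoperf a small JSON config to benchmark read/write I/O
// echo '{ nThreads: 4, fileSizeMB: 1000, r: true, w: true }' | mongoperf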

High Availability with Replication

In this chapter, we will cover the following topics:

  • Initializing a new replica set

This should create three parent directories: /data/server1, /data/server2, and /data/server3, each containing subdirectories named conf, logs, and db. We will be using this directory format throughout the chapter (see the sketch after this list).

  • Adding a node to the replica set
  • Removing a node from the replica set
  • Working with an arbiter
  • Switching between primary and secondary nodes
  • Changing replica set configuration
  • Changing priority to replica set nodes
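
A sketch of the directory layout and a minimal three-node replica set on a single host; the ports and the replica set name rs0 are assumptions, not values from the notes:

# Create the directory layout described above
mkdir -p /data/server{1,2,3}/{conf,logs,db}

# Start three mongod processes that belong to the same replica set
mongod --replSet rs0 --port 27017 --dbpath /data/server1/db --logpath /data/server1/logs/mongod.log --fork
mongod --replSet rs0 --port 27018 --dbpath /data/server2/db --logpath /data/server2/logs/mongod.log --fork
mongod --replSet rs0 --port 27019 --dbpath /data/server3/db --logpath /data/server3/logs/mongod.log --fork

# Initiate the set with all three members in one step
mongo --port 27017 --eval 'rs.initiate({ _id: "rs0", members: [
  { _id: 0, host: "localhost:27017" },
  { _id: 1, host: "localhost:27018" },
  { _id: 2, host: "localhost:27019" } ] })'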

High Scalability with Sharding

In this chapter, we will cover the following recipes:

  • Setting up and configuring a shard cluster
  • Managing chunks
  • Moving non-sharded collection data from one shard to another
  • Removing a shard from the cluster
  • Understanding tag aware sharding – zones
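
A hedged mongo shell sketch of the core sharding commands, run against a mongos router; the shard names, the shop database, and the shard key are illustrative:

// Register two shards (replica set name/host pairs are placeholders)
sh.addShard("shard01/localhost:27018")
sh.addShard("shard02/localhost:27028")

// Enable sharding for a database, then shard a collection on a key
sh.enableSharding("shop")
sh.shardCollection("shop.orders", { customerId: 1 })

// Tag-aware sharding (zones): tag a shard and pin a key range to it
sh.addShardTag("shard01", "EU")
sh.addTagRange("shop.orders", { customerId: MinKey }, { customerId: 0 }, "EU")

// Drain and remove a shard (repeat the command until its state is "completed")
db.adminCommand({ removeShard: "shard02" })

// Inspect chunk distribution across the cluster
sh.status()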

Monitoring MongoDB

The following recipes are included in this chapter:

  • Monitoring MongoDB performance with mongostat
  • Checking replication lag of nodes in a replica set
  • Monitoring and killing long running operations on MongoDB

db.currentOp()

  • Checking disk I/O usage
  • Collecting MongoDB metrics using Diamond and Graphite
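
A short sketch of the commands behind these recipes; the opid 12345 is a placeholder taken from currentOp output:

// Shell: mongostat --host localhost:27017 2   (samples server counters every 2 seconds)

// Replication lag of each secondary relative to the primary
rs.printSlaveReplicationInfo()

// List operations that have been running for more than 30 seconds
db.currentOp({ active: true, secs_running: { $gt: 30 } })

// Kill one of them by its opid
db.killOp(12345)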

Deploying MongoDB in Production

This chapter contains the following recipes:

  • Configuring MongoDB for a production deployment
  • Upgrading production MongoDB to a newer version
  • Setting up and configuring TLS (SSL)
  • Restricting network access using firewalls
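
As a rough illustration of the TLS (SSL) recipe, a mongod.conf fragment might look like the following; the certificate paths and address are placeholders, and MongoDB 4.2+ spells these options net.tls.* instead:

# mongod.conf fragment enabling TLS (3.x/4.0 option names)
net:
  port: 27017
  bindIp: 10.0.0.5                     # placeholder internal address
  ssl:
    mode: requireSSL
    PEMKeyFile: /etc/ssl/mongodb.pem   # server certificate plus private key
    CAFile: /etc/ssl/ca.pem            # CA certificate used to validate peers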

*************************************

  • Mastering MongoDB 3.x

  • By: Alex Giamas

  • Publisher: Packt Publishing

  • Pub. Date: November 17, 2017

  • Web ISBN-13: 978-1-78398-261-5

  • Print ISBN-13: 978-1-78398-260-8

Schema Design and Data Modeling

The second chapter of our book focuses on schema design for schema-less databases such as MongoDB. This may sound counterintuitive, but there are in fact considerations that we should take into account when developing for MongoDB. The main points of this chapter are:

  • Schema considerations for NoSQL
  • Data types supported by MongoDB
  • Comparison between different data types
  • How to model our data for atomic operations
  • Modeling relationships between collections:
    • One to one
    • One to many
    • Many to many
  • How to prepare data for text searches in MongoDB
  • Ruby:
    • How to connect using the Ruby mongo driver
    • How to connect using Ruby’s most widely used ODM, Mongoid
    • Mongoid model inheritance management
  • Python:
    • How to connect using the Python mongo driver
    • How to connect using Python’s ODM, PyMODM
    • PyMODM model inheritance management
  • PHP:
    • Sample code using annotations-driven code
    • How to connect using the MongoDB PHP driver
    • How to connect using PHP’s ODM, Doctrine
    • Model inheritance management using Doctrine
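
As a small illustration of the relationship-modeling topics above, the following mongo shell sketch contrasts embedding with referencing for a one-to-many relationship; the books/reviews example is hypothetical rather than taken from the chapter:

// Embedding: reviews live inside the book document (few, bounded children)
db.books.insertOne({
  _id: 1,
  title: "Sample Book",
  reviews: [
    { user: "alice", stars: 5 },
    { user: "bob", stars: 4 }
  ]
})

// Referencing: reviews live in their own collection and point back to the book
db.reviews.insertMany([
  { bookId: 1, user: "carol", stars: 3 },
  { bookId: 1, user: "dave", stars: 5 }
])
db.reviews.find({ bookId: 1 })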

Setting up MongoDB using Docker containers

The container movement, as I like to call it, has touched almost all aspects of information technology. Docker, being the tool of choice, is integral to creating and managing containers.

docker inspect mongo-server-1 | grep IPAddress

docker run -dit --restart unless-stopped --name mongo-server-2 -v /home/ubuntu/docker_mongodb/data/db2:/data/db -p 9999:27017 mongo

In this recipe, we will install Docker on the Ubuntu (14.04) server and run MongoDB in a container.

Getting ready

  1. First, we need to install Docker on our Ubuntu server, which can be done by running this command:
  2. Start the Docker service:
  3. Confirm that Docker is running as follows:
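
A hedged sketch of these Getting ready steps on Ubuntu 14.04; package names and the convenience-script URL may differ on newer releases:

# 1. Install Docker, either from the Ubuntu archive or via Docker's convenience script
sudo apt-get update && sudo apt-get install -y docker.io
# or: curl -fsSL https://get.docker.com | sudo sh

# 2. Start the Docker service
sudo service docker start

# 3. Confirm that Docker is running
sudo docker info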

How to do it…

  1. Fetch the default MongoDB image from Docker Hub as follows:
  2. Let’s confirm that the images are installed with the following command:
  3. Start the MongoDB server:

    Alternatively, you can also run the docker ps command to check the list of running containers.

  4. Fetch the IP of this container:

  5. Connect to our new container using the mongo client:

  6. Create a directory on the server:

  7. Start a new MongoDB container:

  8. Fetch the IP of this new container as mentioned in Step 4, and connect using the Mongo client:

  9. Let’s make another directory for our final container:

    Start a new MongoDB container:

  10. Let’s connect to this container via localhost:
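
Pulling the steps together, the full command sequence might look roughly like this; the directory and container names follow the ones used in this recipe, and <container-ip> stands in for the address printed by docker inspect:

# Steps 1-2: fetch the image and confirm it is present
sudo docker pull mongo
sudo docker images

# Step 3: start the first container in detached mode
sudo docker run -d --name mongo-server-1 mongo

# Steps 4-5: fetch its IP address and connect with the mongo client
sudo docker inspect mongo-server-1 | grep IPAddress
mongo <container-ip>

# Steps 6-7: start a second container with its data directory mounted from the host
mkdir -p /home/ubuntu/docker_mongodb/data/db2
sudo docker run -dit --restart unless-stopped --name mongo-server-2 \
  -v /home/ubuntu/docker_mongodb/data/db2:/data/db -p 9999:27017 mongo

# Step 10: thanks to the -p 9999:27017 binding, connect via localhost on the host
mongo localhost:9999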

How it works…

We start by downloading the default MongoDB image from DockerHub (https://hub.docker.com/_/mongo/). A Docker image is a self-sustaining OS image that is customized for the application that it is supposed to run. All Docker containers are isolated executions of these images. This is very similar to how an OS template is used to create virtual machines.

The image download operation defaults to fetching the latest stable MongoDB image, but you can specify your version of choice by mentioning the tag, for example, docker pull mongo:2.8.

We verify that the image was downloaded by running the docker images command, which lists all the images installed on the server. In step 3, we start a container in detached (-d) mode with the name mongo-server-1, using our mongo image. Describing the container internals may be out of the scope of this cookbook but, in short, we now have an isolated Docker pseudo-server running inside our Ubuntu machine.

By default, each Docker container gets an RFC 1918 (non-routable) IP address assigned by the Docker daemon. In order to connect to this container, we fetch its IP address in step 4 and connect to the MongoDB instance in step 5.

However, each Docker container is ephemeral and hence, destroying the container would mean losing the data. In step 6, we create a local directory that can be used to store our mongo database. We start a new container in step 7; it is similar to our earlier command with the addition of the Volumes (-v) switch. In our example, we are exposing the /data/db2 directory to the mongo container namespace as /data/db. This is similar to NFS-like file mounting but within the confines of the kernel namespace.

Finally, if we want external systems to connect to this container, we bind the container’s ports to that of the host machine. In step 9, we use the Port (-p) switch to bind the TCP 9999 port on the Ubuntu server to TCP 27017 of this container. This ensures that any external systems connecting to the server’s port 9999 will be routed to this particular container.
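
From any external system, the connection is then made against the Ubuntu host rather than the container; 203.0.113.10 below is a placeholder for the server's address:

mongo 203.0.113.10:9999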

See also

You can also try to link two containers using the --link command-line parameter of the docker run command.

For more information visit http://docs.docker.com/userguide/dockerlinks/.
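
A minimal sketch of linking, assuming the mongo-server-1 container from this recipe is still running; note that --link is a legacy Docker feature and user-defined networks are the current recommendation:

# Start a disposable client container that can reach mongo-server-1 under the alias "db"
sudo docker run -it --rm --link mongo-server-1:db mongo mongo --host db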