Master Mongoose Virtualization

Creating Virtual References within Database Schema

Lauren Cunningham
3 min read · Jun 1, 2021
photo credit: canvas.com

A Space Limitation Solution

MongoDB stores records as documents encoded in BSON, a binary serialization of JSON-like objects. BSON documents look like JSON but support additional data types, such as dates, binary data, and ObjectIds. A single persisted document can hold a maximum of 16 megabytes.

Due to this limitation, it’s best to be mindful of the data that is being stored in each document. This can get tricky if you need to refer to another collection in the object fields. For example, if you’re creating a database of stats for basketball players, you might want to include the name of the team that a specific player belongs to. Rather than copying the team’s information onto every player object, you can grab the team_name string from a separate team collection. If you’re familiar with the way that GraphQL returns only the fields a query asks for, this works in a similar way.

By assigning virtual object references to model fields, we can take only the information needed from other collections when we need it. The referenced documents themselves are not persisted inside the player document; only an ObjectId pointing to each one is stored. Mongoose returns the full objects when we ask for them in a query, so we get the necessary information without copying it into every document.
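To get a feel for the space argument, here’s a rough plain-JavaScript sketch (no Mongoose involved) comparing a player that embeds its team with one that stores only the team’s id. The team record, its field names, and the sample values are all made up for illustration, and JSON length is only a loose proxy for BSON size.

```javascript
// Hypothetical team record with one bulky field.
const team = {
  _id: 't1',
  team_name: 'Comets',
  history: 'x'.repeat(500), // stand-in for a long description
};

// Option 1: copy the whole team into the player document.
const embedded = { name: 'Sample Player', team };

// Option 2: store only the team's id and look the team up when needed.
const referenced = { name: 'Sample Player', team: team._id };

// Serialized length as a rough stand-in for document size.
const embeddedSize = JSON.stringify(embedded).length;
const referencedSize = JSON.stringify(referenced).length;

console.log(embeddedSize, referencedSize);
```

The embedded version grows with every field added to the team, while the referenced version stays the same size no matter how large the team record gets.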

Here’s an example of what a schema using virtuals might look like:

const mongoose = require('mongoose');
const { Schema } = mongoose;

const playerSchema = new Schema({
  name: { type: String, required: true },
  stats: {
    jerseyNumber: { type: Number, unique: true },
    highScore: { type: Number },
    gamesPlayed: { type: Number }
  },
  // Holds only the ObjectId of a document in the Team collection
  team: { type: Schema.Types.ObjectId, ref: 'Team' }
});

const Player = mongoose.model('Player', playerSchema);

The team’s data is not actually saved on the player document. Only the team’s ObjectId is stored there, and Mongoose uses it to look up and return information on the team that the player belongs to upon a query.

.id versus ._id

When a reference is populated, the field that previously held only an ObjectId holds the full Mongoose object that we referenced. Every Mongoose document has an _id field, which is the persisted ObjectId that references point to. Mongoose also adds id, a virtual getter that returns _id as a string and is never stored in the database. Remember that id exists only on Mongoose objects; raw MongoDB documents have just _id.
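If the idea of a computed, unsaved field feels abstract, here’s a small plain-JavaScript model of it. This is not Mongoose’s implementation, just a sketch of the behavior: makeDoc is a hypothetical helper, and the sample values are invented.

```javascript
// Plain-JavaScript model of a virtual getter (not Mongoose itself).
// `_id` is stored data; `id` is computed from it on every access.
function makeDoc(fields) {
  const doc = { ...fields };
  Object.defineProperty(doc, 'id', {
    get() {
      return String(this._id);
    },
    enumerable: false, // keeps `id` out of the serialized form
  });
  return doc;
}

const player = makeDoc({ _id: 42, name: 'Sample Player' });
const stored = JSON.stringify(player);

console.log(typeof player.id); // string
console.log(stored); // {"_id":42,"name":"Sample Player"}
```

The getter is available whenever we read it, but it never shows up in what would be written out, which is the same split between id and _id described above.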

.populate()

The populate method shows up frequently in queries. It takes in the name of the reference field as a string, reads that field’s ref property to find out which model to look in, then replaces the stored ObjectId with the matching object.

// Inside an async function
const matchingPlayer = await Player.findById(player.id).populate('team').exec();

Hopefully, you caught that we used .exec() on the end. This is used to explicitly say that we are done building this query and ready to run it, and it returns a promise. That promise needs to be awaited (or handled with .then()). Otherwise, matchingPlayer would hold a pending promise instead of the player.
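Conceptually, populating a field boils down to following a stored id into another collection. Here’s a minimal plain-JavaScript sketch of that idea using in-memory arrays in place of collections. The populate function below is my own simplified stand-in, not Mongoose’s code, and the sample data is hypothetical.

```javascript
// In-memory stand-ins for two collections (hypothetical sample data).
const teams = [{ _id: 't1', team_name: 'Comets', city: 'Springfield' }];
const players = [{ _id: 'p1', name: 'Sample Player', team: 't1' }];

// Conceptual model of populate(): follow the id stored in `field`
// and return a copy of the document with the full object swapped in
// (or null when nothing matches).
function populate(doc, field, collection) {
  const match = collection.find((item) => item._id === doc[field]);
  return { ...doc, [field]: match ?? null };
}

const populated = populate(players[0], 'team', teams);

console.log(populated.team.team_name); // Comets
console.log(players[0].team); // t1
```

Notice that the original player still holds only the id. The full team object exists just on the copy returned for this one lookup.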

Queries

Luckily, common query methods such as find and sort can be used. You can find some great information in the Mongo docs on how to query collections.

As a side note, I wanted to mention that there are some special query operators that come from MongoDB itself. These are always prefixed with a dollar sign.

If you want to find players with a highScore that is greater than 28, you can use $gt followed by 28 like this…

const matchingPlayers = await Player.find({
  'stats.highScore': { $gt: 28 }
}).limit(2);

You can probably guess that $lt instead of $gt would return player objects with a high score that is less than 28. Adding .limit(2) at the end defines a limit for the number of players we want to be returned.
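To make the comparison operators concrete, here’s the same logic written as plain array filters over a hypothetical in-memory list of players. It only models what $gt, $lt, and a result limit mean; a real query runs inside MongoDB, not in your application code.

```javascript
// Hypothetical in-memory players.
const players = [
  { name: 'A', stats: { highScore: 31 } },
  { name: 'B', stats: { highScore: 28 } },
  { name: 'C', stats: { highScore: 40 } },
  { name: 'D', stats: { highScore: 35 } },
];

// Models { 'stats.highScore': { $gt: 28 } }: strictly greater than 28.
const above28 = players.filter((p) => p.stats.highScore > 28);

// Models { $gt: 28, $lt: 40 }: both conditions must hold at once.
const between = players.filter(
  (p) => p.stats.highScore > 28 && p.stats.highScore < 40
);

// Models a result limit of 2: keep at most the first two matches.
const firstTwo = above28.slice(0, 2);

console.log(above28.map((p) => p.name)); // [ 'A', 'C', 'D' ]
console.log(between.map((p) => p.name)); // [ 'A', 'D' ]
console.log(firstTwo.map((p) => p.name)); // [ 'A', 'C' ]
```

Player B is left out of every result because $gt is a strict comparison, so a score of exactly 28 does not count.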

Another fun operator is $in, which matches documents whose field equals any value in a list you provide. For instance, you could use it to write a query for all the players whose highScore is one of a few specific values, such as 28 or 30. You can find more of these here.
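As a quick illustration of $in, here’s a plain-JavaScript equivalent over a hypothetical in-memory array. The list of allowed scores is just an example I picked.

```javascript
// Hypothetical in-memory players.
const players = [
  { name: 'A', stats: { highScore: 28 } },
  { name: 'B', stats: { highScore: 30 } },
  { name: 'C', stats: { highScore: 35 } },
];

// Models the filter { 'stats.highScore': { $in: [28, 30] } }:
// keep a player when their score equals any value in the list.
const allowed = [28, 30];
const matches = players.filter((p) => allowed.includes(p.stats.highScore));

console.log(matches.map((p) => p.name)); // [ 'A', 'B' ]
```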

Conclusion

Although MongoDB has some limitations, there are ways to get around them. Virtualization through Mongoose not only solves this problem but also opens the door to more flexibility within our program. We can choose to receive only the fields we need, which can make queries faster while we are saving space! Now that’s efficiency.
