JS monorepos in prod 1: project initialization
Every project begins with an initialization step. When your overall project is composed of multiple projects, it is tempting to create one Git repository per project. In Node.js, a project translates to a package. However, managing too many closely related repositories is confusing and time-consuming.
Placing multiple projects inside a single Git repository and using a tool like Lerna to facilitate their management is worth the effort. This architecture is called a monorepo. It simplifies the versioning and publishing of the components as well as their manipulation and development.
At Adaltas, we have been developing and maintaining several monorepos for a couple of years. This article is the first in a series of seven in which we share our best practices. It covers the project initialization using Yarn and Lerna.
Starting a new project
The idea for an example project comes from our past work. Over the years, we have accumulated several Gatsby plugins that have never been published and shared with the open-source community. Those plugins are copied and pasted from one Gatsby website to another, sometimes with bug fixes and enhancements. Since the multiple copies are not kept in sync with each other, older websites don’t benefit from those changes. The idea is to centralize the development of those plugins inside a single repository and share them by publishing them on NPM.
A new project is started from scratch. It is called remark-gatsby-plugins and is hosted on GitHub. This repository is a container for multiple packages that are plugins for Gatsby and for the gatsby-transformer-remark plugin.
mkdir remark-gatsby-plugins
cd remark-gatsby-plugins
git init
echo "# remark and Gatsby plugins by Adaltas" > README.md
git add README.md
git commit -m "docs: project creating"
git remote add origin https://github.com/adaltas/remark-gatsby-plugins.git
git push -u origin master
The commit message is prefixed by docs, and this is no accident. This aspect is covered later by the Conventional Commits chapter in the following article, commit enforcement and changelog generation.
Ignoring files from Git
There are two strategies to choose from:
- To selectively define the path to be ignored.
- To define global ignore rules and selectively exclude path from those rules.
I usually choose the latter strategy to ignore all hidden files by default. I start with:
cat <<CONTENT > .gitignore
.*
node_modules
!.gitignore
CONTENT
git add .gitignore
git commit -m 'build: ignore hidden files and node modules'
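The effect of these rules can be checked before relying on them. The following is a minimal sketch, assuming Git is installed; it works in a throwaway directory, and the files .env, index.js and node_modules/dep.js as well as the is_ignored helper are hypothetical names used only for the check.

```shell
#!/bin/sh
# Sketch: verify the ignore rules in a throwaway repository.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q

# Same rules as in the article: ignore hidden files and node_modules,
# but keep .gitignore itself under version control.
cat <<'CONTENT' > .gitignore
.*
node_modules
!.gitignore
CONTENT

# Hypothetical files, created only for this check.
mkdir node_modules
touch .env index.js node_modules/dep.js

# git check-ignore exits with 0 when the path is ignored, 1 when it is not.
is_ignored() {
  if git check-ignore -q "$1"; then
    echo "ignored"
  else
    echo "not ignored"
  fi
}

env_status=$(is_ignored .env)
deps_status=$(is_ignored node_modules/dep.js)
gitignore_status=$(is_ignored .gitignore)
index_status=$(is_ignored index.js)

echo ".env: $env_status"
echo "node_modules/dep.js: $deps_status"
echo ".gitignore: $gitignore_status"
echo "index.js: $index_status"
```

If the rules behave as intended, .env and node_modules/dep.js are reported as ignored while .gitignore and index.js are not.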
Project initialization
I personally use Yarn instead of NPM. Both package managers are perfectly fine, but I had issues in the past using NPM with monorepos and links. In this setup, Yarn also seems to be the tool of choice across the community. Its native support for monorepos, called workspaces, works well with Lerna.
To initialize a package with yarn:
yarn init
yarn init v1.22.5
question name (remark-gatsby-plugins):
question version (1.0.0): 0.0.0
question description: A selection of remark and Gatsby plugins developed and used by Adaltas
question entry point (index.js):
question repository url (https://github.com/adaltas/remark-gatsby-plugins.git):
question author (David Worms <[email protected]>):
question license (MIT):
question private:
git add package.json
git commit -m "build: package initialization"
These commands create a package.json file and commit it.
Monorepo with Lerna
The project contains a package.json file. Following the Node.js terminology, the project is now a Node.js package. However, it will not be published on NPM, the official Node.js package registry. Only the packages inside this package will be published.
Instead of creating a Git repository for each package, it is easier to maintain a single repository storing multiple Node.js packages. Since multiple packages are managed inside the same repository, we call this a monorepo.
Multiple tools exist to manage monorepos. Lerna is a popular choice but not the only one. At Adaltas, we have been using it for some time and we continue to use it in this article.
Besides having just one Git repository to manage, additional advantages justify the usage of monorepos:
- When multiple packages are developed, many duplicated dependencies are declared inside their package.json files. Declaring the dependencies inside the top-most project managed with Lerna saves disk space and installation time. This is called “hoisting” dependencies.
- When packages depend on each other, changes in one package often need to be instantly reflected in the other packages. A single feature may span multiple packages. Publishing every change of a dependency is not practical: it takes too much time, and many changes do not justify a release. The solution is to link the dependencies by creating symbolic links. For large projects, this is a tedious task. A tool like Lerna automates the creation of those links.
- Having one central location federates the execution of your commands. For example, you install all the dependencies of all your packages with a single command, yarn install. For testing, the command lerna run test runs the test script of every package that defines one.
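To make the linking point concrete, here is a minimal sketch of what linking a local dependency amounts to when done by hand. The package names pkg-a and pkg-b are hypothetical, and the layout is a simplification: Yarn workspaces and Lerna may place their links elsewhere (for example in the root node_modules folder), so this only illustrates the principle.

```shell
#!/bin/sh
# Sketch: link a local dependency by hand with a symbolic link.
# pkg-a and pkg-b are hypothetical package names.
set -e
demo=$(mktemp -d)
mkdir -p "$demo/packages/pkg-a" "$demo/packages/pkg-b/node_modules"
echo 'module.exports = "a";' > "$demo/packages/pkg-a/index.js"

# pkg-b depends on pkg-a. Instead of a published copy, its node_modules
# entry points to the source folder of pkg-a.
ln -s "$demo/packages/pkg-a" "$demo/packages/pkg-b/node_modules/pkg-a"

# A change made in pkg-a is visible through the link right away,
# without publishing or reinstalling anything.
echo 'module.exports = "a, edited";' > "$demo/packages/pkg-a/index.js"
linked_content=$(cat "$demo/packages/pkg-b/node_modules/pkg-a/index.js")
echo "$linked_content"
```

With a handful of packages this is manageable. With dozens of cross-dependencies, keeping such links correct by hand is the tedious part that the tooling takes over.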
Additionally, Lerna helps us to manage our versions with respect to the Semantic Versioning (SemVer) specification.
The command to initialize Lerna is:
yarn add lerna
yarn lerna init --independent
The --independent flag tells Lerna to manage the version of each package independently. Without it, Lerna aligns the versions of the packages it manages.
These commands add the lerna dependency to the package.json file and create a new lerna.json file:
{
"packages": [
"packages/*"
],
"version": "independent"
}
Then, we commit our pending changes:
git add lerna.json package.json
git commit -m 'build: lerna initialization'
Publishing or ignoring lock files
The yarn add command has generated a yarn.lock file. With NPM, the file would have been package-lock.json.
My approach is to publish lock files for my final applications. I don’t publish the lock files for the packages which are meant to be used as dependencies. Some people agree with my opinion. However, the Yarn documentation states the contrary:
All yarn.lock files should be checked into source control (e.g. git or mercurial). This allows Yarn to install the same exact dependency tree across all machines, whether it be your coworker’s laptop or a CI server.
Framework and library authors should also check yarn.lock into source control. Don’t worry about publishing the yarn.lock file as it won’t have any effect on users of the library.
I am perplexed. If it is not used, why commit such a large file? Anyway, let’s ignore the lock files for now so that they are left out of Git:
echo 'package-lock.json' >> .gitignore
echo 'yarn.lock' >> .gitignore
git add .gitignore
git commit -m "build: ignore lock files"
Yarn integration
Since we are using Yarn instead of NPM, add these properties to lerna.json:
{
"npmClient": "yarn",
"useWorkspaces": true
}
The useWorkspaces property tells Lerna not to use lerna.json#packages but instead to look for package.json#workspaces. According to the Lerna Bootstrap documentation, both are similar except that Yarn doesn’t support recursive globs (**).
Update lerna.json to remove the packages property. It now contains only:
{
"npmClient": "yarn",
"useWorkspaces": true,
"version": "independent"
}
Update the package.json file to contain:
{
"private": true,
"workspaces": [
"packages/*"
]
}
The private property is required. Any attempt to register a new dependency without it raises an error from Yarn in the form of “Workspaces can only be enabled in private projects”. Note that it was possible to define the project as private when initializing it with yarn init. Now that our project is a monorepo, it is a good time to mark the root package as private since it will not be published on NPM. Only the packages inside it are for publishing.
Note that executing lerna init now syncs package.json#workspaces back into lerna.json#packages with the new values.
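Before committing, the two files can be given a rough sanity check. The sketch below assumes the layout described above. The check_monorepo_root helper is hypothetical and not part of Yarn or Lerna, and matching JSON with grep is only an approximation that is good enough for a quick look.

```shell
#!/bin/sh
# check_monorepo_root is a hypothetical helper, not a Yarn or Lerna command.
# It looks for the three settings discussed above in the root manifest files.
check_monorepo_root() {
  dir="${1:-.}"
  # Yarn refuses to enable workspaces unless the root package is private.
  grep -Eq '"private"[[:space:]]*:[[:space:]]*true' "$dir/package.json" || {
    echo 'package.json: "private": true not found' >&2
    return 1
  }
  # The package locations are now declared in package.json.
  grep -q '"workspaces"' "$dir/package.json" || {
    echo 'package.json: "workspaces" not found' >&2
    return 1
  }
  # Lerna must be told to read the workspaces instead of its own list.
  grep -Eq '"useWorkspaces"[[:space:]]*:[[:space:]]*true' "$dir/lerna.json" || {
    echo 'lerna.json: "useWorkspaces": true not found' >&2
    return 1
  }
  echo "ok"
}
```

Calling check_monorepo_root from the root of the repository prints ok when the three settings are found, and otherwise reports the first one that is missing.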
Now, save the changes:
git commit -a -m 'build: activate yarn usage'
If you are not familiar with Git, the -a flag adds all the modified files to the commit. New files are disregarded.
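The behaviour of the -a flag can be observed in a throwaway repository. This is a minimal sketch, assuming Git is installed; the file names and the demo identity are hypothetical and used only for the demonstration.

```shell
#!/bin/sh
# Sketch: "git commit -a" picks up modified tracked files but not new files.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
# Identity set locally so the sketch also runs without a global Git config.
git config user.name "Demo"
git config user.email "demo@example.com"

# One file is tracked from the start.
echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "docs: initial file"

echo "two" >> tracked.txt    # modification of a tracked file
echo "new" > untracked.txt   # brand new file, never added

git commit -q -a -m "docs: update tracked file"

# What is left uncommitted after the commit.
remaining=$(git status --porcelain)
echo "$remaining"
```

The modification to tracked.txt is committed, while untracked.txt still shows up in the status output because it was never added.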
Package location
By default, Lerna manages packages inside the “packages” folder. The majority of projects using Lerna use this convention. It is a good idea to respect it. But in our case, we have two types of plugins:
- The Gatsby plugins
- The Gatsby Remark plugins, which extend the gatsby-transformer-remark plugin
Thus, I modify the workspaces array in the package.json file to be:
{
"workspaces": [
"gatsby/*",
"gatsby-remark/*"
]
}
The packages’ location is saved:
git commit -a -m 'build: workspaces declaration'
Packages creation
Let’s import two packages for the sake of testing. They are currently located inside my /tmp folder:
ls -l /tmp/gatsby-caddy-redirects-conf
total 16
-rw-r--r--@ 1 david staff 981B Nov 26 21:20 gatsby-node.js
-rw-r--r--@ 1 david staff 239B Nov 26 21:19 package.json
ls -l /tmp/gatsby-remark-title-to-frontmatter
total 16
-rw-r--r-- 1 david staff 1.2K Nov 26 11:35 index.js
-rw-r--r--@ 1 david staff 309B Nov 26 21:14 package.json
To import the packages and commit:
mkdir gatsby gatsby-remark
mv /tmp/gatsby-caddy-redirects-conf gatsby/caddy-redirects-conf
git add gatsby/caddy-redirects-conf
mv /tmp/gatsby-remark-title-to-frontmatter gatsby-remark/title-to-frontmatter
git add gatsby-remark/title-to-frontmatter
git commit -m 'build: import project'
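After the import, it is worth confirming that each package sits where the workspaces globs expect it. The sketch below approximates what "gatsby/*" and "gatsby-remark/*" are meant to match by listing the direct subfolders that contain a package.json file. The list_workspace_packages helper is hypothetical and not a Yarn or Lerna command.

```shell
#!/bin/sh
# list_workspace_packages is a hypothetical helper, not a Yarn or Lerna command.
# For each parent folder given as argument, it prints the direct subfolders
# that contain a package.json file.
list_workspace_packages() {
  for parent in "$@"; do
    # The trailing slash restricts the glob to directories.
    for dir in "$parent"/*/; do
      [ -f "${dir}package.json" ] && echo "${dir%/}"
    done
  done
  return 0
}
```

Run from the root of the repository, list_workspace_packages gatsby gatsby-remark should list the two imported folders. For an authoritative view, Yarn 1 provides yarn workspaces info and Lerna provides lerna list, which report the packages the tools actually detect.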
Cheat sheet
Package initialization:
yarn init
git add package.json
git commit -m "build: package initialization"
Monorepo initialization:
yarn add lerna
yarn lerna init --independent
git add lerna.json package.json
git commit -m 'build: lerna initialization'
Ignore lock file (optional):
echo 'package-lock.json' >> .gitignore
echo 'yarn.lock' >> .gitignore
git add .gitignore
git commit -m "build: ignore lock files"
Yarn integration (unless using NPM): remove the packages property from lerna.json and add:
{
"npmClient": "yarn",
"useWorkspaces": true
}
Update the package.json file to contain:
{
"private": true,
"workspaces": [
"packages/*"
]
}
Next
The next article covers the versioning and publishing strategies of packages with Lerna.