Sample Content for Local Dev and Testing With Drupal 8 Core Migrate

Recently, the subject of sample content for use with local onboarding and testing in a continuous integration environment came up on a project. I went on a bit of a foray into Drupal.org to see what was out there.

The Current State of Demo Content

(Update October 30, 2020) It has been brought to my attention that I did indeed miss a contributed module that can be used for this purpose: Migrate default content. This project allows you to use CSV or YAML files as a data source and looks like it sits between making your own migrations and Yaml Content in terms of complexity: still uses Migrate, but the source definition is more like making source files for Yaml Content.

----

It seems there are two projects you might reach for to accomplish a sample content build right now (if I missed one, leave us a comment): Yaml Content, and Default Content (more on them at the end).

I think both of these projects miss the mark a bit. Both aspire to be a full content export/import system. For what I’m looking to accomplish, we don’t need to be able to export content, just make demo content for testing or local development.

Also, looking through the issue queues for both, I noticed some common goals:

Automatic delete of sample content
Not re-importing the same content on re-run
Or, re-importing the same content, but with updates
etc.

Soon, I found myself thinking:

These projects are both going to re-implement things Migrate already does.

So, I decided to see what I could do with core Migrate.

The Goals

I wanted to make a set of sample content that was simple, easy to maintain and extend, and that can be run locally or in a remote environment like Bitbucket Pipelines or your CI system du jour.

Testing and iterative development become much simpler if a baseline setup can be repeated quickly. Think running “ddev restore-snapshot” vs. importing a full mysql dump.

So this should be repeatable, should run fast, and should ideally add few, if any, dependencies to the project, so that local and CI site installs are not slowed down. Again, a goal is for this to be very repeatable so it is useful for testing.

Outside of the core requirement of Migrate, there will be at least one other dependency to get this working: some way to run the migration imports. If your project is already using Migrate Tools and Migrate Plus, you’re ready to go; if not, I recommend installing Migrate Run as a dev dependency.

Setting up an Import for a Basic Page

If we want few dependencies and a fast, repeatable runtime, what are our options? I wondered, “what is the simplest thing I could do to get a node on the page?” With just one core dependency and one contributed module we can import a node!

The one core dependency is, again, Migrate.

The first thing we need in order to do anything custom in Drupal is a module. The minimum requirement for making this module has two parts.

First, the info file:

name: Sample Content
type: module
description: Imports sample content using Migrate
core_version_requirement: ^8 || ^9
package: Example Project
dependencies:
  - file
  - link
  - migrate
  - path
  - paragraphs

File: sample/sample.info.yml

This is the initial module boilerplate: almost none. We will not need to create a sample.module file. This does include dependencies for all the field types used in upcoming examples, however.

Now, the second part: add a sample/migrations/ directory; the system will read Migrate plugins from here.

This is all that is required to import a sample basic page node using the core “embedded_data” source plugin:

id: sample_nodes_page
label: Sample Nodes - Page
migration_tags:
  - sample
source:
  plugin: embedded_data
  data_rows:
    -
      title: Sample Page 1
      path: /sample_page_1
  ids:
    title:
      type: string
process:
  type:
    plugin: default_value
    default_value: page
  status:
    plugin: default_value
    default_value: 1
  title: title
  'path/pathauto':
    plugin: default_value
    default_value: 0
  'path/alias': path
destination:
  plugin: entity:node

File: migrations/sample.nodes.page.yml

That’s all. If you just wanted to make a module that would import a single node with no field values, this will do that.

How Does This Work?

Do I Have to config-import This?

No. If you’ve used Migrate Plus, you may be familiar with configuration import for those migrations. However, this is actually a Migrate plugin definition, not a config entity, so it will not be imported/exported with configuration management.

To update Migrate plugins after editing the file, a cache rebuild is all that is necessary.

Ok, so how do I run this?

Well, now you’ve got me. Remember earlier I mentioned that one contributed module we would need? Drupal core, at this time, provides no Drush commands for working with migrations. So, we need to make some or include a project to make them for us.

Migrate Run provides the Drush commands we need. These commands were forked from Migrate Tools. Note: most projects, I find, will already require migrate_tools and migrate_plus (usually for ongoing data imports). If you have those installed, you can use the commands from migrate_tools instead; they are nearly identical.

Once you have migrate_run installed, enable your sample module (you would never forget to enable a custom module before trying to use it, right? I’ve never done this on every project ever…). Then, to check that your migration is being loaded, use the migrate:status command:

user@site-web:/var/www/html/web $ drush ms sample_nodes_page
 ------------------- -------- ------- ---------- ------------- ---------------------
  Migration ID        Status   Total   Imported   Unprocessed   Last Imported
 ------------------- -------- ------- ---------- ------------- ---------------------
  sample_nodes_page   Idle     1       0          1
 ------------------- -------- ------- ---------- ------------- ---------------------

Then you can give the migrate:import command a spin:

user@site-web:/var/www/html/web$ drush mim sample_nodes_page

 [notice] Processed 1 item (1 created, 0 updated, 0 failed, 0 ignored) in 0.4 seconds (468.1/min) - done with 'sample_nodes_page'

Wait, What just Happened?

If you look at your /admin/content page, you should see a single node with the title “Sample Page 1” and path /sample_page_1! Demo content! 🎉

Sure, this is just a node with a title and path alias for now, but we can experiment with changing things in the data_rows or process plugins to add field values and evolve this into something useful without much trouble.

Run `drush mr sample_nodes_page` (migrate:rollback) to roll back the import (this will delete the node).

Then make your changes to the sample.nodes.page.yml file and run `drush cr` to reload the plugin definition. Tip: run migrate:rollback before changing the plugin YAML, because if you accidentally make a syntax error, the rollback command will fail when it reads the plugin file.
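As a sketch of that kind of change, the fragment below adds a body value to the same row. It assumes the page content type has the core `body` field and that a `basic_html` text format exists (as on a Standard profile install); `body_text` is just a source key name picked for this example.

```yaml
source:
  plugin: embedded_data
  data_rows:
    -
      title: Sample Page 1
      path: /sample_page_1
      # New source value; the key name body_text is arbitrary.
      body_text: '<p>Hello from the sample content module.</p>'
  ids:
    title:
      type: string
process:
  # Keep the existing type, status, title, and path items, then add:
  # (assumes the content type has the core body field)
  'body/value': body_text
  'body/format':
    plugin: default_value
    default_value: basic_html
```

After a `drush cr`, running `drush mim sample_nodes_page` again should recreate the page, this time with body text.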

Ok, fine, but What About Paragraphs?

Yea, we use Paragraphs too. Who doesn’t?

I find this is right around the time people tend to get tripped up with Drupal migrations. When we edit a Paragraph in a node edit form, it looks like it is just part of the node. It’s all done right inline, after all. But there’s a lot of work going on behind the scenes to make it look seamless.

What’s really happening is that a paragraph entity separate from the node entity is created and managed for every paragraph item added to the node during editing.

If you’ve ever used Migrate before, you’re probably noticing the potential issue right about now: you cannot define entities inline in a migration. If this is a must-have for your demo content, you will really like Yaml Content.

It’s ok, though, because Migrate is built to handle this, and the system for relating entities to other entities is pretty easy to work with.

Defining Some Paragraphs

Before our Page nodes can render paragraphs, we have to actually have some Paragraph entities to reference.

My test project has a very simple paragraph type called “text” which has a single formatted text field. The sample GitHub repository linked below also contains a set of sample field configurations describing everything used by these migrations.

id: sample_paragraphs_text
label: Sample Paragraphs - Text
migration_tags:
  - sample
source:
  plugin: embedded_data
  data_rows:
    -
      sample_id: text1
      body_text: |
        <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
        eiusmod tempor incididunt ut labore et dolore magna aliqua.</p>
    -
      sample_id: text2
      body_text: |
        <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
        eiusmod tempor incididunt ut labore et dolore magna aliqua.</p>
  ids:
    sample_id:
      type: string
process:
  type:
    plugin: default_value
    default_value: text
  'field_body/value': body_text
  'field_body/format':
    plugin: default_value
    default_value: basic_html
destination:
  plugin: entity:paragraph

File: migrations/sample.paragraphs.text.yml
Creating two Paragraph entities of type ‘text’. Note the destination plugin is now ‘entity:paragraph’.

That’s all well and good, but paragraphs have to be referenced by a field in a node to be rendered. This is where things get really cool. All we have to do is tell our node migration to look up the paragraphs created by this migration and place the entity ID that was created during migrate:import (mim) into the Entity Reference Revisions (ERR) field in our nodes when they are created.

We’ll need to add three new things to the node migration to get this to happen:

A migration dependency -- this will tell Migrate that the paragraphs have to be created before they can be referenced by the node migration.
A new value in the source data_rows to tell the node migration which paragraph we want to reference in each row.
A new process item for the destination ERR content field to tell Migrate which migration(s) to look up the target IDs from.

Those look like this, respectively:

The migration dependency that tells the system we are going to reference something from another migration.
```
migration_dependencies:
  required:
    - sample_paragraphs_text
```
The source data for this node that the field_content process item will reference below. It has to be an array so sub_process can operate on it correctly.
```
  content_items:
    -
      sample_id: text1
```

The process information for how the source data relates to the migration we want to perform the lookup on. In my page content type, there is an ERR field called field_content that can reference Text and Image paragraphs.

  field_content:
    plugin: sub_process
    source: content_items
    process:
      target_id:
        plugin: migration_lookup
        migration: sample_paragraphs_text
        source: sample_id
      target_revision_id: '@target_id'

Putting That all Together

See this example repo on GitHub for the new full version of the “Sample Nodes - Page” migration that will reference these new text paragraphs. I cut that out here in the interest of page length since it’s the same as above, plus the three new parts.
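For readers who would rather not leave the page, here is a sketch assembled only from the snippets in this post; the version in the repo may differ in its details. The second `content_items` entry (`text2`) is added here for illustration, to show why that value is a list.

```yaml
id: sample_nodes_page
label: Sample Nodes - Page
migration_tags:
  - sample
# 1. The paragraphs must be imported before the nodes that reference them.
migration_dependencies:
  required:
    - sample_paragraphs_text
source:
  plugin: embedded_data
  data_rows:
    -
      title: Sample Page 1
      path: /sample_page_1
      # 2. Which paragraphs this node references, by their source IDs.
      content_items:
        -
          sample_id: text1
        -
          sample_id: text2
  ids:
    title:
      type: string
process:
  type:
    plugin: default_value
    default_value: page
  status:
    plugin: default_value
    default_value: 1
  title: title
  'path/pathauto':
    plugin: default_value
    default_value: 0
  'path/alias': path
  # 3. Look up the paragraph created for each sample_id.
  field_content:
    plugin: sub_process
    source: content_items
    process:
      target_id:
        plugin: migration_lookup
        migration: sample_paragraphs_text
        source: sample_id
      # Reuses the entity ID as the revision ID, as in the snippet above.
      # This assumes the two match, which should hold for paragraphs
      # that have only ever had a single revision.
      target_revision_id: '@target_id'
destination:
  plugin: entity:node
```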

With both migrations in place, to run them at the same time, use the tag option: drush mim --tag=sample. Even though the machine name “sample_nodes_page” would sort before “sample_paragraphs_text” alphabetically, the migrations will be run in the correct order due to our migration_dependencies.

You should see something like this:

user@site-web:/var/www/html/web$ drush mim --tag=sample
[notice] Processed 3 items (3 created, 0 updated, 0 failed, 0 ignored) in 0.8 seconds
 (227.5/min) - done with 'sample_paragraphs_text'
 [notice] Processed 3 items (3 created, 0 updated, 0 failed, 0 ignored) in 0.4 seconds
 (468.1/min) - done with 'sample_nodes_page'

Hmm, How About File Uploads?

These are similar to the Paragraphs example in that the files will need to be migrated and then referenced from the parent entity. For this we need to use the file_copy plugin to actually get the file in place to be referenced by our Image field (which is in another paragraph type in my example).

id: sample_files
label: Sample Files
migration_tags:
  - sample
source:
  plugin: embedded_data
  data_rows:
    -
      fid: 1
      filename: sample1.jpg
    -
      fid: 2
      filename: sample2.jpg
  ids:
    fid:
      type: integer
  constants:
    # You may need to change this path if you cannot place your module
    # in a directory called simply “sample”.
    files_path_rel: modules/custom/sample/files/
    dest_path: public://
process:
  docroot:
    plugin: callback
    callable: realpath
    source: '.'
  files_path:
    plugin: concat
    source:
      - '@docroot'
      - constants/files_path_rel
    delimiter: /
  file_source:
    plugin: concat
    source:
      - '@files_path'
      - filename
  file_dest:
    plugin: concat
    source:
      - constants/dest_path
      - filename
  uid:
    plugin: default_value
    default_value: 1
  fid: fid
  filename: filename
  uri:
    plugin: file_copy
    source:
      - '@file_source'
      - '@file_dest'
destination:
  plugin: entity:file

File: migrations/sample.files.yml

This is a lot longer than some of the others, but let’s break it down using what we’ve already seen and introduce some Migrate conventions. It’s pretty simple to take one step at a time.

Start at the bottom; the “meat” here is the file_copy plugin. This is where the actual work occurs: each file is copied into place, and the entity:file destination then creates a File entity for each source row. We can set that aside for now, though. Let’s look at how the file_source and file_dest paths are built.

In Migrate, the source values used in process items can refer to other values that were produced from earlier process items. That’s what the '@file_dest' item is doing.

So, backing up to file_dest, that’s just a concat of a constant + the filename from the source row. Nothing too fancy, so our value would come out like “public://sample1.jpg”. This is then stored in the uri value after the copy.

The '@file_source' path is similar, but needs a bit of runtime info, so it’s a concat of the filename and '@files_path', which is itself built from a value calculated in another process item: docroot, produced by the “callback” plugin. If you know the absolute path of your build, you could just put the entire thing in a constant and simplify this, but I assumed the path would differ between a local environment and a CI environment, so it is calculated at runtime.
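If the absolute path is known and stable, the constant-only variant might look roughly like the fragment below. The path is an assumption based on the docroot shown in the shell prompts earlier (/var/www/html/web); adjust it for your own build. The docroot and files_path process items are dropped entirely.

```yaml
source:
  plugin: embedded_data
  # data_rows and ids stay the same as in sample.files.yml.
  constants:
    # Assumed absolute path to the module's files directory.
    files_path: /var/www/html/web/modules/custom/sample/files/
    dest_path: public://
process:
  file_source:
    plugin: concat
    source:
      - constants/files_path
      - filename
  file_dest:
    plugin: concat
    source:
      - constants/dest_path
      - filename
  uid:
    plugin: default_value
    default_value: 1
  fid: fid
  filename: filename
  uri:
    plugin: file_copy
    source:
      - '@file_source'
      - '@file_dest'
```

The trade-off is portability: this version is shorter, but it only works where the module lives at that exact path.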

Putting it all Together Again, Again

Here is a sample repo I created with all the example files in one spot: https://github.com/pmallett-mc/sample
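The image paragraph type that references these files is not shown in this post. As a rough sketch of what such a migration might look like: the paragraph type `image`, the field name `field_image`, and the source keys below are all hypothetical, so check the repo for the real definitions.

```yaml
id: sample_paragraphs_image
label: Sample Paragraphs - Image
migration_tags:
  - sample
# The files must exist before a paragraph can reference them.
migration_dependencies:
  required:
    - sample_files
source:
  plugin: embedded_data
  data_rows:
    -
      sample_id: image1
      # Matches the fid source ID used in the sample_files migration.
      image_fid: 1
      image_alt: Sample image 1
  ids:
    sample_id:
      type: string
process:
  type:
    plugin: default_value
    # Hypothetical paragraph type machine name.
    default_value: image
  # Hypothetical image field; look up the File entity created earlier.
  'field_image/target_id':
    plugin: migration_lookup
    migration: sample_files
    source: image_fid
  'field_image/alt': image_alt
destination:
  plugin: entity:paragraph
```

To reference these from the page nodes, the `migration` key of the field_content lookup can take a list of migrations, so this one could be added there and to the node migration’s required dependencies.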

Happy migrating!

Notes on Other Options

Yaml Content

Pros: Very well documented, which makes it easy to pick up and use; runs pretty fast, too.

Cons: Adds an extra dependency on your project; slightly esoteric yaml syntax for embedded entities; code is a little too abstracted/confusing.

Default Content

Pros: Seems very similar to Yaml Content, but for those who like hal+json.

Cons: Uses d8 rest (hal+json); not as well documented with examples - seems like the workflow almost requires building content to export to then turn around and use for an import.

YAML Content gives you a low barrier to entry and is very well documented. However, over time you may want more features and the example content can become a bit unwieldy (order of imports is a thing that has to be juggled sometimes, for example). Overall, it’s a solid choice for demo content.
