As a Ruby on Rails developer, I bet you've seen a lot of .yml files in your projects, like database.yml (database configuration), en.yml (translations), etc. But let's face it: do you really know what a .yml file is?
When I start a new Rails project, all I know is that .yml files are used for configuration, but not why this format is used or what its benefits are.
What is YAML?
According to Wikipedia, the definition is:
“YAML (a recursive acronym for “YAML Ain’t Markup Language”) is a human-readable data-serialization language.” YAML (from version 1.2) is a superset of JSON and is commonly used for configuration files and in applications where data is being stored or transmitted.
That says a lot, right? We use it for configuration, and YAML offers much more than JSON, especially in terms of human readability.
YAML vs JSON
Before diving into the differences, we need to know what the term superset means:
“A superset is a programming language that contains all the features of a given language and has been expanded or enhanced to include other features as well.” - Source
If you are a front-end developer, the relationship between YAML and JSON is similar to the one between TypeScript and JavaScript in the JS world.
Let’s look at this example:
{
  "json": [
    "rigid",
    "better for data interchange"
  ],
  "yaml": [
    "slim and flexible",
    "better for configuration"
  ],
  "object": {
    "array": [
      {
        "null_value": null
      },
      {
        "boolean": true
      },
      {
        "integer": 1
      }
    ]
  },
  "paragraph": "Blank lines denote\nparagraph breaks\n",
  "content": "Or we\ncan auto\nconvert line breaks\nto save space"
}
This is an example JSON file. It seems easy to read, but JSON has some limitations:
- You can’t create variables.
- You can’t use external variables.
- You can’t reuse a block of values and override part of it.
And now, let’s convert it to YAML syntax:
json:
  - rigid
  - better for data interchange
yaml:
  - slim and flexible
  - better for configuration
object:
  array:
    - null_value:
    - boolean: true
    - integer: 1
paragraph: >
  Blank lines denote

  paragraph breaks
content: |-
  Or we
  can auto
  convert line breaks
  to save space
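To check that the YAML and JSON versions really describe the same data, here is a minimal Ruby sketch using only the standard library. It is a shortened copy of the example (the variable names are mine), and it relies on the blank line inside the folded block (>) to produce the line break in "paragraph".

```ruby
require "yaml"
require "json"

# Shortened YAML version of the example above.
yaml_text = <<~YAML
  object:
    array:
      - null_value:
      - boolean: true
      - integer: 1
  paragraph: >
    Blank lines denote

    paragraph breaks
  content: |-
    Or we
    can auto
    convert line breaks
    to save space
YAML

# The same data as JSON. The quoted heredoc keeps "\n" as two literal
# characters so that the JSON parser, not Ruby, interprets the escape.
json_text = <<~'JSON'
  {
    "object": { "array": [ { "null_value": null }, { "boolean": true }, { "integer": 1 } ] },
    "paragraph": "Blank lines denote\nparagraph breaks\n",
    "content": "Or we\ncan auto\nconvert line breaks\nto save space"
  }
JSON

from_yaml = YAML.safe_load(yaml_text)
from_json = JSON.parse(json_text)

puts from_yaml == from_json
```

Both parsers return plain Ruby hashes, arrays, strings, numbers, booleans and nil, so a simple == comparison is enough here.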
Hmm, is the light bulb starting to flicker on?
Concepts, Types, Syntax
Let’s take a look at some YAML concepts.
INDENTATION
In YAML, indentation matters. It uses whitespace indentation to nest information. Keep in mind that this whitespace must be spaces: tab characters are not allowed for indentation.
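As a small sketch of this rule, assuming Ruby's bundled YAML parser (Psych), the snippet below loads one document indented with spaces and one indented with a tab. The person/age data is only an illustration.

```ruby
require "yaml"

# Two spaces of indentation nest "age" under "person".
nested = YAML.safe_load("person:\n  age: 20\n")
puts nested.inspect

# The same document indented with a tab is rejected by the parser.
begin
  YAML.safe_load("person:\n\tage: 20\n")
  tab_rejected = false
rescue Psych::SyntaxError => e
  tab_rejected = true
  puts "tab rejected: #{e.message}"
end
```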
KEY/VALUE
As in JSON/JS, YAML uses key/value syntax, and you can write keys in several ways:
key: value
key_one: value one
key one: value # This works but it's weird
'my key': somekey
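Here is a quick Ruby sketch of how those four keys come out after loading. It assumes Ruby's standard YAML library; note that all keys arrive as plain strings, including the ones containing a space.

```ruby
require "yaml"

doc = <<~YAML
  key: value
  key_one: value one
  key one: value
  'my key': somekey
YAML

data = YAML.safe_load(doc)

# Every key is loaded as a String, with or without quotes.
puts data.keys.inspect
puts data["key one"]
```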
COMMENTS
To write a comment in YAML, you just have to use # followed by your message content.
# I'm a comment
person: # I'm also a comment
  age: 20
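A short sketch, again assuming Ruby's standard YAML library, showing that comments disappear when the file is loaded. The extra note key is my own addition to illustrate that a # inside a quoted string is kept as text.

```ruby
require "yaml"

doc = <<~YAML
  # I'm a comment
  person: # I'm also a comment
    age: 20
  note: "a # inside quotes is not a comment"
YAML

data = YAML.safe_load(doc)

# Only the data survives; the comments are gone.
puts data.inspect
```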
LIST
There are two ways to write lists:
- The JSON way (flow style): an array written inline.
people: ['Anne', 'John', 'Max']
- The YAML way (block style): one item per line, each starting with a hyphen.
people:
  - Anne
  - John
  - Max
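The two list styles are only different spellings of the same data. A minimal Ruby check, assuming the standard YAML library:

```ruby
require "yaml"

# Inline (JSON-like) list.
flow = YAML.safe_load("people: ['Anne', 'John', 'Max']")

# Hyphen list, one item per line.
block = YAML.safe_load("people:\n  - Anne\n  - John\n  - Max\n")

puts flow == block
puts block["people"].inspect
```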
STRINGS
There are several ways to write strings in YAML:
company: Google # Single words, no quotes
full_name: John Foo Bar Doe # Full sentence, no quotes
name: 'John' # Using single quotes
surname: "Christian Meyer" # Using double quotes
In JSON, on the other hand, there is only one way: double quotes.
{
  "company": "Google",
  "full_name": "John Foo Bar Doe",
  "name": "John",
  "surname": "Christian Meyer"
}
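To see that the quoting style does not change the result, this Ruby sketch (standard library only) loads the YAML strings and the JSON version and compares them.

```ruby
require "yaml"
require "json"

yaml_text = <<~YAML
  company: Google
  full_name: John Foo Bar Doe
  name: 'John'
  surname: "Christian Meyer"
YAML

json_text = <<~'JSON'
  {
    "company": "Google",
    "full_name": "John Foo Bar Doe",
    "name": "John",
    "surname": "Christian Meyer"
  }
JSON

from_yaml = YAML.safe_load(yaml_text)
from_json = JSON.parse(json_text)

# Unquoted, single-quoted and double-quoted values all load as String.
from_yaml.each { |key, value| puts "#{key}: #{value.inspect} (#{value.class})" }
puts from_yaml == from_json
```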
NUMBERS
YAML has two number types: integer and float.
year: 2019 # Integer
nodeVersion: 10.8 # Float
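A small Ruby sketch of how these values are typed after loading, assuming the standard YAML library. The year_as_text key is a hypothetical addition to show that quoting a number turns it into a string.

```ruby
require "yaml"

data = YAML.safe_load(<<~YAML)
  year: 2019
  nodeVersion: 10.8
  year_as_text: "2019"
YAML

# Unquoted numbers become Integer or Float; quoted ones stay String.
data.each { |key, value| puts "#{key}: #{value.inspect} (#{value.class})" }
```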
NODE ANCHORS
I wouldn’t be surprised if you were yawning at the information above, but please wake up, because node anchors are an interesting feature.
An anchor is a mechanism to create a group of data (an object) that can be injected into or extended by other objects.
If you’re a Ruby on Rails developer, I’m sure you have seen this feature in database.yml:
default: &default
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000
development:
  <<: *default
  database: db/development.sqlite3
test:
  <<: *default
  database: db/test.sqlite3
production:
  <<: *default
  database: db/production.sqlite3
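But wait, look closely. You may ask what is going on with:
default: &default
  # ...
development:
  <<: *default
Yes, that is an anchor. &default marks the anchor, *default is an alias that refers back to it, and <<: is the merge key that copies the anchored keys into the current mapping. If you don’t use an anchor, you have to repeat the same group of configuration in every environment: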
development:
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000
  database: db/development.sqlite3
test:
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000
  database: db/test.sqlite3
production:
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000
  database: db/production.sqlite3
That is a lot of copy/paste. Instead, we create an anchor named “default” and inject it wherever we need it in the YAML file:
default: &default
  adapter: sqlite3
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  timeout: 5000
development:
  <<: *default
  database: db/development.sqlite3
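To see the merge happen outside of Rails, here is a minimal sketch in plain Ruby. It assumes a two-step process similar to what Rails does with database.yml: render the ERB tags first, then parse the result as YAML. The variable names are mine, and aliases: true is needed because safe_load refuses anchors and aliases by default.

```ruby
require "erb"
require "yaml"

# Quoted heredoc: Ruby leaves the text untouched, ERB handles <%= %> later.
template = <<~'YAML'
  default: &default
    adapter: sqlite3
    pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
    timeout: 5000
  development:
    <<: *default
    database: db/development.sqlite3
YAML

# Step 1: evaluate the ERB tags so the text becomes plain YAML.
rendered = ERB.new(template).result

# Step 2: parse the YAML, explicitly allowing anchors and aliases.
config = YAML.safe_load(rendered, aliases: true)

# "development" now holds its own key plus everything merged from "default".
puts config["development"].inspect
```

The development hash ends up with adapter, pool, timeout and database, exactly as if the default block had been copied in by hand.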
JSON SYNTAX
Because YAML is a superset of JSON, we can also write YAML the JSON way:
{
  "details": {
    "company": {
      "name": "Google",
      "year": 2019,
      "active": true
    },
    "employees": [
      "Anne",
      "John",
      "Max"
    ]
  }
}
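As a quick sanity check of that claim, this Ruby sketch feeds the same JSON text to both the YAML parser and the JSON parser from the standard library and compares the results.

```ruby
require "yaml"
require "json"

json_text = <<~'JSON'
  {
    "details": {
      "company": { "name": "Google", "year": 2019, "active": true },
      "employees": ["Anne", "John", "Max"]
    }
  }
JSON

# The YAML parser accepts the JSON document as it is.
via_yaml = YAML.safe_load(json_text)
via_json = JSON.parse(json_text)

puts via_yaml == via_json
puts via_yaml["details"]["employees"].inspect
```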
SHELL/BASH ENVIRONMENT
It’s very common for .yml files to be used as config files for many things, especially in CI/CD environments.
In a CI/CD environment, we usually use Docker to set up the environment, so let’s check out a docker-compose.yml file:
version: "3"
variables:
  REDIS_IMAGE: redis
services:
  node-app:
    build: .
    ports:
      - '4001:8081'
  redis-server:
    image: $REDIS_IMAGE
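Note that the $ syntax for using variables doesn’t come from YAML but from shell/bash. To the YAML parser, $REDIS_IMAGE is just a plain string, and it is the tool reading the file that performs the substitution.
GitLab CI, for example, takes everything you define under variables and creates shell variables from it. Docker Compose, by contrast, takes the value of $REDIS_IMAGE from the shell environment or an .env file rather than from a variables key in the file.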
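The sketch below illustrates that split in plain Ruby. The YAML parser returns $REDIS_IMAGE untouched, and a separate, purely hypothetical substitution step (my own simplified stand-in, not how GitLab CI or Docker Compose is actually implemented) fills in the value.

```ruby
require "yaml"

# Quoted heredoc so Ruby does not touch the $ sign.
doc = <<~'YAML'
  variables:
    REDIS_IMAGE: redis
  services:
    redis-server:
      image: $REDIS_IMAGE
YAML

config = YAML.safe_load(doc)

# YAML itself does nothing with the dollar sign.
raw = config["services"]["redis-server"]["image"]
puts raw

# Hypothetical substitution step performed by the tool that reads the file:
# replace each $NAME with the matching entry from "variables".
variables = config["variables"]
expanded = raw.gsub(/\$(\w+)/) { variables.fetch(Regexp.last_match(1), "") }
puts expanded
```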
Conclusion
Every day we see a lot of “yml” files without being sure about their benefits or how they work. I hope you found some useful information in this article.
References
YAML official website
Convert JSON to YAML
YAML Wikipedia
YAML blog