What is JSON linting?
To understand JSON linting, let’s quickly break down the two concepts of JSON and linting.
JSON is an acronym for JavaScript Object Notation, a lightweight, text-based, open standard format for representing structured data, based on JavaScript object syntax. It is most commonly used for transmitting data in web applications. It is generally faster to parse than XML and is easy for humans to read and write.
Linting is a process that automatically checks and analyzes static source code for programming and stylistic errors, bugs and suspicious constructs.
JSON has become popular because it is human-readable and doesn’t require a complete markup structure like XML. It is easy to parse into its logical syntactic components, especially in JavaScript, and libraries for working with JSON exist for most programming languages.
Benefits of JSON linting
Finding an error in JSON code can be challenging and time-consuming. The best way to find and correct errors while saving time is to use a linting tool. When JSON code is copied and pasted into a linting editor, the tool validates and reformats it. Such tools are easy to use and support a wide range of browsers, so applications developed with JSON don’t require much effort to make them browser-compatible.
JSON linting is an efficient way to reduce errors and improve the overall quality of JSON code. This can help accelerate development and reduce costs because errors are discovered earlier.
Some common JSON linting errors
In instances where a JSON transaction fails, the error information is conveyed to the user by the API gateway. By default, the API gateway returns a very basic fault to the client when a message filter has failed.
One common JSON linting error is a parsing error. A “parse: unexpected character” error occurs when a value that is not a valid JSON string, for example a native JavaScript object, is passed to the JSON.parse method. To solve the error, make sure to pass only valid JSON strings to JSON.parse.
Other common errors are NULL or inaccurate data, using the wrong data type per column or the wrong extension for JSON files, and rows in a JSON table that are not in the JSON format.
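The same mistake can be sketched in Python, used here only as an analogy to JavaScript's JSON.parse: json.loads expects JSON text, so an already-parsed object has to be serialized first. The helper name parse_json_text is hypothetical.

```python
import json


def parse_json_text(value):
    """Parse JSON text; reject values that are not text at all."""
    if not isinstance(value, (str, bytes, bytearray)):
        raise TypeError(f"expected JSON text, got {type(value).__name__}")
    return json.loads(value)


native_object = {"name": "project1"}   # already parsed, not JSON text
json_text = json.dumps(native_object)  # serialize it to a JSON string first

parsed = parse_json_text(json_text)    # works: the input is valid JSON text

try:
    parse_json_text(native_object)     # fails: the input is a dict, not a string
    rejected = False
except TypeError:
    rejected = True
```

Serializing with json.dumps before parsing mirrors the fix described above: only JSON strings go into the parser.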
How to fix JSON linting errors
If you encounter a NULL or inaccurate data error in parsing, the first step is to make sure you use the right data type per column. For example, in the case of “age,” use 12 instead of twelve.
Also make sure you are using the right extension for JSON files. When using a compressed JSON file, the file name must end with “.json” followed by the extension of the compression format, such as “.gz.”
Next, make sure the JSON format is used for every row in the JSON table. Create a table with a delimiter that is not in the input files. Then, run a query that returns the file name, the row details, and the file path for the null JSON rows.
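As a rough illustration of checking every row, the sketch below assumes the table is stored as one JSON document per line (an assumption, since the storage format is not stated above). The function name find_invalid_rows and the sample rows are hypothetical.

```python
import json


def find_invalid_rows(lines):
    """Return (row_number, reason) for every row that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            problems.append((number, "empty row"))
            continue
        try:
            row = json.loads(text)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(row, dict):
            problems.append((number, "row is not a JSON object"))
    return problems


rows = [
    '{"name": "Ada", "age": 12}',
    '{"name": "Bob", "age": twelve}',  # an unquoted word is not valid JSON
    'null',                            # parses, but is not an object
]
problems = find_invalid_rows(rows)
for number, reason in problems:
    print(f"row {number}: {reason}")
```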
Sometimes you may find files that are not your source code files, but ones generated by the system when compiling your project. When such a file has a .js extension, ESLint needs to exclude it when searching for errors. One way to do this is to add ‘ignorePatterns’ to the .eslintrc.json file, either before or after the “rules” key.
"ignorePatterns": ["temp.js", "**/vendor/*.js"],
"rules": {
Alternatively, you can create a separate file named ‘.eslintignore’ and list the files to be excluded, as shown below:
**/*.js
If you opt to correct instead of ignore, look for the error code in the last column. Correct all the errors in one file, rerun ‘npx eslint . >errfile’, and ensure all the errors of that type are cleared. Then look for the next error code and repeat the procedure until all errors are cleared.
Of course, there will be instances when you won’t understand an error, so in that case, open https://eslint.org/docs/user-guide/getting-started and type the error code in the ‘Search’ field on the top of the document. There you will find very detailed instructions as to why that error is raised and how to fix it.
Finally, you can forcibly fix errors automatically while generating the error list using:
npx eslint . --fix
This is not recommended until you are better versed in lint errors and how to fix them. Also, you should keep a backup of the files you are linting because, while fixing errors, certain code may get overwritten, which could cause your program to fail.
JSON linting best practices
Here are some tips for helping your consumers use your output:
First, always enclose keys and string values in double quotes. It may seem convenient to generate JSON with single quotes, but JSON parsers reject objects that use single quotes.
Numeric values do not need quotes. Enclosing them in double quotes turns them into strings, so do that only when your consumers expect strings.
Next, avoid hyphens in your key names, because hyphenated names are not valid identifiers in languages such as Python and Scala and can trip up consumers that map keys to fields. Use underscores (_) instead.
It’s a good idea to always create a root element, especially when you’re creating a complicated JSON.
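A small sketch of how a strict parser treats these cases, using Python's standard json module. The helper is_valid_json and the sample documents are hypothetical.

```python
import json


def is_valid_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


single_quoted = "{'name': 'project1'}"  # rejected: JSON requires double quotes
double_quoted = '{"name": "project1"}'  # accepted

# Quoting a number changes its type on the consumer side.
quoted_number = json.loads('{"age": "12"}')["age"]  # parsed as a string
bare_number = json.loads('{"age": 12}')["age"]      # parsed as an integer

# A root element gives consumers one named entry point into the document.
with_root = json.loads('{"projects": [{"project_id": 1}]}')
first_project_id = with_root["projects"][0]["project_id"]
```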
Modern web applications come with a REST API which returns JSON. The format needs to be parsed, and often feeds into scripts and service daemons polling the API for automation.
Starting with a new REST API and its endpoints can often be overwhelming. Documentation may suggest looking into a set of SDKs and libraries for various languages, or instruct you to use curl or wget on the CLI to send a request. Both CLI tools come with a variety of parameters which help to download and print the response string, for example in JSON format.
The response string retrieved from curl may get long and confusing. It can require parsing the JSON format and filtering for a smaller subset of results. This helps with viewing the results on the CLI, and minimizes the data to process in scripts. The following example retrieves all projects from GitLab and returns a paginated result set with the first 20 projects:
$ curl "https://gitlab.com/api/v4/projects"
The GitLab REST API documentation guides you through the first steps with error handling and authentication. In this blog post, we will be using the Personal Access Token as the authentication method. Alternatively, you can use project access tokens for automated authentication that avoids the use of personal credentials.
REST API authentication
Not all endpoints allow anonymous access, so some of them require authentication. Try fetching user profile data with this request:
$ curl "https://gitlab.com/api/v4/user"
{"message":"401 Unauthorized"}
The API request against the /user endpoint requires the personal access token to be passed in the request, for example, as a request header. To avoid exposing credentials on the terminal, you can export the token and its value into the user's environment. You can automate the variable export with ZSH and the .env plugin in your shell environment. You can also source the .env file once in the existing shell environment.
$ vim ~/.env
export GITLAB_TOKEN="..."
$ source ~/.env
Scripts and commands being run in your shell environment can reference the $GITLAB_TOKEN variable. Try querying the user API endpoint again, this time adding the authorization header to the request:
$ curl -H "Authorization: Bearer $GITLAB_TOKEN" "https://gitlab.com/api/v4/user"
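For scripts, the same header can be attached with Python's standard library. This is a minimal sketch that only builds the request object; the function name build_user_request is hypothetical, and the token is read from the GITLAB_TOKEN environment variable exported above.

```python
import os
import urllib.request


def build_user_request(base_url="https://gitlab.com/api/v4"):
    """Build a GET request for /user with the token as a Bearer header."""
    token = os.environ.get("GITLAB_TOKEN", "")
    return urllib.request.Request(
        f"{base_url}/user",
        headers={"Authorization": f"Bearer {token}"},
    )


request = build_user_request()

# Sending the request needs network access and a valid token:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Reading the token from the environment keeps it out of the script source and the shell history.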
A reminder that only administrators can see the attributes of all users, and individuals can only see their own user profile. For example, email is hidden from the public domain.
How to request responses in JSON
The GitLab API provides many resources and URL endpoints. You can manage almost anything with the API that you’d otherwise configure using the graphical user interface.
After sending the API request, the response message contains the body as a string, for example with a JSON content type. curl can provide more information about the response headers, which is helpful for debugging. Multiple verbose levels enable the full debug output with -vvv:
$ curl -vvv "https://gitlab.com/api/v4/projects"
[...]
* SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=gitlab.com
* start date: Jan 21 00:00:00 2021 GMT
* expire date: May 11 23:59:59 2021 GMT
* subjectAltName: host "gitlab.com" matched cert's "gitlab.com"
* issuer: C=GB; ST=Greater Manchester; L=Salford; O=Sectigo Limited; CN=Sectigo RSA Domain Validation Secure Server CA
* SSL certificate verify ok.
[...]
> GET /api/v4/projects HTTP/2
> Host: gitlab.com
> User-Agent: curl/7.64.1
> Accept: */*
[...]
< HTTP/2 200
< date: Mon, 19 Apr 2021 11:25:31 GMT
< content-type: application/json
[...]
[{"id":25993690,"description":"project for adding issues","name":"project-for-issues-1e1b6d5f938fb240","name_with_namespace":"gitlab-qa-sandbox-group / qa-test-2021-04-19-11-13-01-d7d873fd43cd34b6 / project-for-issues-1e1b6d5f938fb240","path":"project-for-issues-1e1b6d5f938fb240","path_with_namespace":"gitlab-qa-sandbox-group/qa-test-2021-04-19-11-13-01-d7d873fd43cd34b6/project-for-issues-1e1b6d5f938fb240"
[... JSON content ...]
"avatar_url":null,"web_url":"https://gitlab.com/groups/gitlab-qa-sandbox-group/qa-test-2021-04-19-11-12-56-7f3128bd0e41b92f"}}]
* Closing connection 0
The curl command output provides helpful insights into TLS ciphers and versions, the request lines starting with >, and the response lines starting with <. The response body string is encoded as JSON.
How to see the structure of the returned JSON
To get a quick look at the structure of the returned JSON file, try these tips:
- Enclosing square brackets [ … ] identify an array.
- Enclosing curly brackets { … } identify a dictionary. Dictionaries are also called associative arrays, maps, etc.
- "key": value indicates a key-value pair in a dictionary, which is identified by curly brackets enclosing the key-value pairs.
The values in JSON have specific types: a string value is put in double quotes, and Booleans (true/false), integers, and floating-point numbers are also available as types. If a key exists but its value is not set, REST APIs often return null.
Verify the data structure by running "linters". Python's JSON module can parse and lint JSON strings. The example below misses a closing square bracket to showcase the error:
$ echo '[{"key": "broken"}' | python -m json.tool
Expecting object: line 1 column 19 (char 18)
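The same check can be scripted with the json module directly. This is a minimal sketch with a hypothetical lint_json helper. The exact wording and position of the error message can differ between Python versions, so the example does not rely on it.

```python
import json


def lint_json(text):
    """Return None if the text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"{err.msg}: line {err.lineno} column {err.colno}"
    return None


broken = lint_json('[{"key": "broken"}')   # closing square bracket is missing
fixed = lint_json('[{"key": "broken"}]')   # same document with the bracket

print(broken)
print(fixed)
```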
jq – a lightweight and flexible CLI processor – can be used as a standalone tool to parse and validate JSON data.
$ echo '[{"key": "broken"}' | jq
parse error: Unfinished JSON term at EOF at line 2, column 0
jq is available in the package managers of most operating systems.
$ brew install jq
$ apt install jq
$ dnf install jq
$ zypper in jq
$ pacman -S jq
$ apk add jq
Dive deep into JSON data structures
The true power of jq lies in how it can be used to parse JSON data:
jq is like sed for JSON data. It can be used to slice, filter, map, and transform structured data with the same ease that sed, awk, grep, etc., let you manipulate text.
The output below shows how it looks to run the request against the project API again, but this time, the output is piped to jq.
$ curl "https://gitlab.com/api/v4/projects" | jq
[
{
"id": 25994891,
"description": "...",
"name": "...",
[...]
"forks_count": 0,
"star_count": 0,
"last_activity_at": "2021-04-19T11:50:24.292Z",
"namespace": {
"id": 11528141,
"name": "...",
[...]
}
}
]
The first difference is the format of the JSON data structure, which is now pretty-printed. New lines and indents in data structure scopes help your eyes and allow you to identify the inner and outer data structures involved. This format is needed to determine which jq filters and methods you want to apply next.
About arrays and dictionaries
The set of results from an API often is returned as a list (or "array") of items. An item itself can be a single value or a JSON object. The following example mimics the response from the GitLab API and creates an array of dictionaries as a nested result set.
$ vim result.json
[
{
"id": 1,
"name": "project1"
},
{
"id": 2,
"name": "project2"
},
{
"id": 3,
"name": "project-internal-dev",
"namespace": {
"name": "🦊"
}
}
]
Use cat to print the file content on stdout and pipe it into jq. The outer data structure is an array, so use -c .[] to access and print all items.
$ cat result.json | jq -c '.[]'
{"id":1,"name":"project1"}
{"id":2,"name":"project2"}
{"id":3,"name":"project-internal-dev","namespace":{"name":"🦊"}}
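For comparison, here is a rough Python equivalent of iterating over the outer array and printing one compact item per line. The data is inlined as a string instead of being read from result.json so that the sketch is self-contained.

```python
import json

result_json = """
[
  {"id": 1, "name": "project1"},
  {"id": 2, "name": "project2"},
  {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}}
]
"""

items = json.loads(result_json)

# Compact separators and ensure_ascii=False mimic the one-line output of jq -c.
compact_lines = [
    json.dumps(item, separators=(",", ":"), ensure_ascii=False)
    for item in items
]

for line in compact_lines:
    print(line)
```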
How to filter data structures with jq
Filter items by passing | select (...) to jq. The filter takes a lambda callback function as a comparator condition. When the item matches the condition, it is returned to the caller.
Use the dot indexer . to access dictionary keys and their values. Try to filter for all items where the name is project2:
$ cat result.json | jq -c '.[] | select (.name == "project2")'
{"id":2,"name":"project2"}
Practice this example by selecting the id with the value 2 instead of the name.
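The idea behind select, a condition applied to every item, can be sketched in Python with a predicate function. The helper name select_items is hypothetical, and the id variant corresponds to the practice exercise.

```python
import json

projects = json.loads(
    '[{"id": 1, "name": "project1"},'
    ' {"id": 2, "name": "project2"},'
    ' {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}}]'
)


def select_items(items, predicate):
    """Keep only the items for which the predicate returns True."""
    return [item for item in items if predicate(item)]


by_name = select_items(projects, lambda p: p.get("name") == "project2")
by_id = select_items(projects, lambda p: p.get("id") == 2)

print(by_name)
print(by_id)
```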
Filter with matching a string
During tests, you may need to match different patterns instead of knowing the full name. Think of projects that match a specific path or are located in a group where you only know the prefix. Simple string matches can be achieved with the | contains (...) function. It allows you to check whether the given string is inside the target string, which requires the selected attribute to be of the string type.
For a filter with the select chain, the comparison condition needs to be changed from the equal operator == to checking the attribute .name with | contains ("dev").
$ cat result.json | jq -c '.[] | select (.name | contains ("dev") )'
{"id":3,"name":"project-internal-dev","namespace":{"name":"🦊"}}
Simple matches can be achieved with the contains function.
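A comparable substring check in Python uses the in operator. The type guard in this sketch reflects the requirement that the attribute must be a string.

```python
projects = [
    {"id": 1, "name": "project1"},
    {"id": 2, "name": "project2"},
    {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}},
]

# Keep items whose name is a string that contains "dev".
dev_projects = [
    project
    for project in projects
    if isinstance(project.get("name"), str) and "dev" in project["name"]
]

print(dev_projects)
```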
Filter with matching regular expressions
For advanced string pattern matching, it is recommended to use regular expressions. jq provides the test function for this use case. Try to filter for all projects which end with a number, represented by \d+. Note that the backslash \ needs to be escaped as \\ because the pattern is written as a jq string literal. ^ tests for the beginning of the string, $ is the ending check.
$ cat result.json | jq -c '.[] | select (.name | test ("^project\\d+$") )'
{"id":1,"name":"project1"}
{"id":2,"name":"project2"}
Tip: You can test and build the regular expression with regex101 before test-driving it with jq.
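The same pattern can be tried with Python's re module, as a sketch. Python's regular expression engine is not the one jq uses, so only simple patterns like this one should be assumed to behave identically. A raw string avoids the double backslash.

```python
import re

projects = [
    {"id": 1, "name": "project1"},
    {"id": 2, "name": "project2"},
    {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}},
]

# Raw string literal: the backslash in \d does not need to be doubled here.
pattern = re.compile(r"^project\d+$")

numbered_projects = [
    project for project in projects if pattern.search(project["name"])
]

print(numbered_projects)
```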
Access nested values
Key-value pairs in a dictionary may have a dictionary or array as a value. jq filters need to take this factor into account when filtering or transforming the result. The example data structure provides project-internal-dev, which has the key namespace and a value of a dictionary type.
{
"id": 3,
"name": "project-internal-dev",
"namespace": {
"name": "🦊"
}
}
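jq defines a sort order across JSON types (null, false, true, numbers, strings, arrays, objects), so the array and dictionary literals [] and {} can be used in select chains with greater-than and less-than comparisons. Comparing the namespace attribute with >= {} selects the items whose namespace is a dictionary, while <= {} selects the items whose namespace sorts at or below an empty dictionary, which in this example means the null values.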
$ cat result.json | jq -c '.[] | select (.namespace >={} )'
{"id":3,"name":"project-internal-dev","namespace":{"name":"🦊"}}
$ cat result.json | jq -c '.[] | select (.namespace <={} )'
{"id":1,"name":"project1"}
{"id":2,"name":"project2"}
These methods can be used to access the name attribute of the namespace, but only if the namespace contains values. Tip: You can chain multiple jq calls by piping the result into another jq call. .name is a subkey of the primary .namespace key.
$ cat result.json | jq -c '.[] | select (.namespace >={} )' | jq -c '.namespace.name'
"🦊"
The additional select command with non-empty namespaces ensures that only initialized values for .namespace.name are returned. This is a safety check, and avoids receiving null values in the result that you would need to filter again.
$ cat result.json| jq -c '.[]' | jq -c '.namespace.name'
null
null
"🦊"
By using the additional check with | select (.namespace >={} ), you only get the expected results and do not have to filter empty null values.
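The difference between guarded and unguarded access to the nested key can also be sketched in Python. Both list names are hypothetical. An isinstance check stands in for the comparison against {}, and None corresponds to JSON null.

```python
projects = [
    {"id": 1, "name": "project1"},
    {"id": 2, "name": "project2"},
    {"id": 3, "name": "project-internal-dev", "namespace": {"name": "🦊"}},
]

# Guarded: only items whose namespace is a dictionary with a name key.
namespace_names = [
    project["namespace"]["name"]
    for project in projects
    if isinstance(project.get("namespace"), dict)
    and "name" in project["namespace"]
]

# Unguarded: missing namespaces show up as None and need filtering later.
unguarded_names = [
    (project.get("namespace") or {}).get("name") for project in projects
]

print(namespace_names)
print(unguarded_names)
```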
How to expand the GitLab endpoint response
Save the result from the API projects call and retry the examples above with jq.
$ curl "https://gitlab.com/api/v4/projects" -o result.json >/dev/null 2>&1
Validate CI/CD YAML with jq for Git hooks
While writing this blog post, I learned that you can escape and encode YAML into JSON with jq. This trick comes in handy when automating YAML linting on the CLI, for example as a Git pre-commit hook.
Let’s take a look at the simplest way to test GitLab CI/CD from our community meetup workshops. A common mistake with the first steps of the process can be missing the two spaces indent or missing whitespace between the dash and following command. The following examples use .gitlab-ci.error.yml as a filename to showcase errors and .gitlab-ci.main.yml for working examples.
$ vim .gitlab-ci.error.yml
image: alpine:latest
test:
script:
-exit 1
Committing the change and waiting for the CI/CD pipeline to validate at runtime can be time-consuming. The GitLab API provides a resource endpoint /ci/lint. A POST request with JSON-encoded YAML content will return a linting result faster.
Parse CI/CD YAML into JSON with jq
You can use jq to parse the raw YAML string into JSON:
$ jq --raw-input --slurp < .gitlab-ci.error.yml
"image: alpine:latest\n\ntest:\nscript:\n -exit 1\n"
The /ci/lint API endpoint requires a JSON dictionary with content as key, and the raw YAML string as a value. You can use jq to format the input by using the arg parser:
$ jq --null-input --arg yaml "$(<.gitlab-ci.error.yml)" '.content=$yaml'
{
"content": "image: alpine:latest\n\ntest:\nscript:\n -exit 1"
}
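The same payload can be produced with Python's json module, as a sketch. The YAML text is inlined here instead of being read from .gitlab-ci.error.yml so that the example is self-contained.

```python
import json

# Raw YAML, inlined for the example; normally read from the CI/CD file.
yaml_text = "image: alpine:latest\n\ntest:\nscript:\n -exit 1"

# json.dumps escapes the newlines, so the YAML becomes a single JSON string.
payload = json.dumps({"content": yaml_text})

print(payload)
```

Decoding the payload again returns the original YAML text unchanged, which is a quick way to confirm that nothing was lost in the encoding step.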
Send POST request to /ci/lint
The next building block is to send a POST request to /ci/lint. The request needs to specify the Content-Type header for the body. Using the pipe | character, the JSON-encoded YAML configuration is fed into the curl command call.
$ jq --null-input --arg yaml "$(<.gitlab-ci.error.yml)" '.content=$yaml' \
| curl "https://gitlab.com/api/v4/ci/lint?include_merged_yaml=true" \
--header 'Content-Type: application/json' --data @-
{"status":"invalid","errors":["jobs test config should implement a script: or a trigger: keyword","jobs script config should implement a script: or a trigger: keyword","jobs config should contain at least one visible job"],"warnings":[],"merged_yaml":"
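For a Git hook written in Python, the request and the result handling could look roughly like this sketch. The function names are hypothetical. The status and errors fields are taken from the response shown above, while treating "valid" as the success value is an assumption. The request is only built, not sent.

```python
import json
import urllib.request

LINT_URL = "https://gitlab.com/api/v4/ci/lint?include_merged_yaml=true"


def build_lint_request(yaml_text, url=LINT_URL):
    """Build a POST request carrying the YAML as JSON-encoded content."""
    body = json.dumps({"content": yaml_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def summarize_lint_result(response_body):
    """Return (is_valid, errors); assumes status is "valid" on success."""
    result = json.loads(response_body)
    return result.get("status") == "valid", result.get("errors", [])


request = build_lint_request("image: alpine:latest\n\ntest:\nscript:\n -exit 1")

# Shortened sample body modeled on the response above.
sample_body = (
    '{"status":"invalid",'
    '"errors":["jobs config should contain at least one visible job"],'
    '"warnings":[]}'
)
is_valid, errors = summarize_lint_result(sample_body)
print(is_valid, errors)
```

A pre-commit hook could exit with a non-zero status whenever is_valid is False, which blocks the commit until the YAML is fixed.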