SQL/CoginitiScript Linter

Overview

Coginiti includes a powerful built-in SQL linter that performs real-time interactive checks on catalog assets and SQL/CoginitiScript code. The linter helps enforce best practices and maintain code consistency by detecting potential issues and offering suggestions for improvement.

 

Configuration

The linter can be configured using either configuration files or in-file comments.

Configuration Files

The default configuration file for the linter is named linter.toml. When checking for configuration, the linter first searches the directory of the file being analyzed. If not found, it will search parent directories until it either finds a linter.toml file or reaches the root directory. If no configuration file is found, the linter is disabled.

Each rule follows the format linter.sql.<group>.<rule_name>, where <group> represents a category (e.g., aliasing, structure). At a minimum, each rule includes an enabled property that controls whether it is active. More complex rules may define additional properties.

Here is an example of linter.toml which contains configuration for all rules we currently support:

[linter.sql.aliasing]
type.explicit_as = true
type.enabled = true
prevent_self_alias.enabled = true
prevent_duplicates.enabled = true
prevent_unused.enabled = true
require_alias_for_expression.enabled = true

[linter.sql.structure]
order_select_targets_by_complexity.enabled = true
distinct_no_parens.enabled = true
join_table_order.enabled = true
cte_use_instead_of_subquery.enabled = true
cte_prevent_unused.enabled = true
join_no_using.enabled = true

 

In-File Configuration

Rules can also be specified directly within SQL files using special comment syntax. These rules apply from the line where they are defined until the end of the file or until a subsequent comment updates the configuration.

Example:

-- Enable rule to prevent self aliasing
-- linter.sql.aliasing.prevent_self_alias.enabled = true
--
-- Prevent usage of unused aliases
-- linter.sql.aliasing.prevent_unused.enabled = true
SELECT
transaction_date AS transaction_date
FROM
sales_header
;

File-based configuration is preferred in most cases. In-file configuration should be used sparingly, only when specific files require custom rule adjustments.

Rules

Aliasing group

Given rules checks various cases regarding the way aliases for SQL identifiers should be specified.

linter.sql.aliasing.type

Name Checks the style of the aliases used for the SQL identifiers.
Description Aliasing of SQL identifiers to follow preference (requiring an explicit AS is the default).
Code action Fix code action is available.
Additional properties
explicit_as: specified whether `AS` keyword should be used when defining SQL alias. Default value is true.

 

Anti-pattern

In case we want to force explicit `AS` keyword.

SELECT
a foo
FROM
table_a
;

Best practice

SELECT
a AS foo
FROM
table_a
;

 

linter.sql.aliasing.prevent_self_alias

Name Identifier aliases should not alias to itself, i.e. self-alias.
Description Assigning an alias to SQL identifier with the same name as identifier is a redundant in SQL and doesn’t affect its functionality.
Code action Fix code action is available.

 

Anti-pattern

SELECT
transaction_datetime AS transaction_datetime,
product_id AS product_id,
quantity AS quantity,
price AS price
FROM
sales_header
;

Best practice

SELECT
transaction_datetime,
product_id,
quantity,
price
FROM
sales_header
;

 

linter.sql.aliasing.prevent_duplicates

Name Aliases should be unique within each clause.
Description Reusing SQL alias in the same context (column, table, subquery, etc) is very likely a coding error.
Code action No.

 

Anti-pattern

SELECT
a AS foo,
b AS foo
FROM
table_a
;

Best practice

SELECT
a AS foo,
b AS bar
FROM
table_a
;

 

linter.sql.aliasing.prevent_unused

Name SQL identifiers should not be aliased if that alias is not used.
Description  
Code action Yes.

 

Anti-pattern

SELECT
a,
b
FROM
table_a AS foo
;

Best practice

SELECT
foo.a,
foo.b
FROM
table_a AS foo
;

 

linter.sql.aliasing.require_alias_for_expression

Name Column expressions should be aliased.
Description When enabled this rule forces to specify alias for column expressions.
Code action No.

 

Anti-pattern

SELECT
SUM(a),
SUM(b)
FROM
table_a
;

Best practice

SELECT
SUM(a) AS sum_a,
SUM(b) AS sum_b
FROM
table_a
;

 

Structure group

Given rules checks various cases regarding the way SQL queries should be structured.

 

linter.sql.structure.order_select_targets_by_complexity

Name Select wildcards first, followed by simple columns, and place calculations and aggregates last.
Description  
Code action Yes.

 

Anti-pattern

SELECT
a,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS row_num,
b
FROM
table_a
;

Best practice

SELECT
a,
b,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS row_num
FROM
table_a
;

 

linter.sql.structure.distinct_no_parens

Name Prevent using DISTINCT with parentheses.
Description  
Code action Yes.

 

Anti-pattern

SELECT
DISTINCT(a),
b
FROM
table_a
;

Best practice

SELECT
DISTINCT
a,
b
FROM
table_a
;

 

linter.sql.structure.join_table_order

Name Joins should list the table referenced earlier/later first.
Code action Yes.

 

Anti-pattern

In this example, the table referenced later is listed first in the join clause.

SELECT
sh.transaction_datetime,
sd.product_id,
sd.quantity,
sd.price
FROM
sales_header
INNER JOIN sales_detail
ON sales_detail.sales_header_key = sales_header.sales_header_key
;

Best practice

List the tables that were referenced earlier first.

SELECT
sh.transaction_datetime,
sd.product_id,
sd.quantity,
sd.price
FROM
sales_header
INNER JOIN sales_detail
ON sales_header.sales_header_key = sales_detail.sales_header_key
;

 

linter.sql.structure.cte_use_instead_of_subquery

Name Join/From clauses should not contain subqueries. Use CTEs instead.
Code action Yes.

 

Anti-pattern

SELECT
table_a.a,
table_a.b,
tmp.c,
tmp.d
FROM
table_a
INNER JOIN (
SELECT
id
c,
b
FROM
table_b
) _tmp
ON _tmp.id = table_a.id
;

Best practice

WITH _tmp AS (
SELECT
id
c,
b
FROM
table_b
)
SELECT
table_a.a,
table_a.b,
tmp.c,
tmp.d
FROM
table_a
INNER JOIN _tmp
ON _tmp.id = table_a.id
;

 

linter.sql.structure.cte_prevent_unused

Name Prevent unused CTE (common-table expressions)
Description Defining an unused CTE does no harm, but it adds unnecessary code that could be removed.
Code action Yes.

 

Anti-pattern

WITH s1 AS (
SELECT
a,
b
FROM
table_a
),
s2 AS (
SELECT
c,
d
FROM
table_b
)
SELECT
*
FROM
s1

Best practice

WITH s1 AS (
SELECT
a,
b
FROM
table_a
)
SELECT
*
FROM
s1

 

linter.sql.structure.join_no_using

Name Prevent USING keyword in the join clauses.
Code action Yes.

 

Anti-pattern

SELECT
table_a.a,
table_a.b,
table_b.c,
table_b.d
FROM
table_a
INNER JOIN table_b USING(id)
;

Best practice

SELECT
table_a.a,
table_a.b,
table_b.c,
table_b.d
FROM
table_a
INNER JOIN table_b
ON table_b.id = table_a.id
;

 

Ambiguous group

linter.sql.ambiguous.avoid_distinct_with_group_by

Name DISTINCT clause is not needed when GROUP BY is used.
Description When using GROUP BY a DISTINCT` clause should not be necessary as every non-distinct SELECT clause must be included in the GROUP BY clause.
Code action Fix code action is not available.

 

Anti-pattern

WITH _src AS (
SELECT 1 AS col1, 2 AS col2
)
SELECT
DISTINCT
col1,
col2
FROM _src
GROUP BY
col1,
col2
;

Best practice

WITH _src AS (
SELECT 1 AS col1, 2 AS col2
)
SELECT
col1,
col2
FROM _src
GROUP BY
col1,
col2
;

 

linter.sql.ambiguous.avoid_select_star_in_final_projection

Name Using SELECT * could produce an unknown number of result columns.
Description Querying all columns using * produces a query result where the number or ordering of columns changes if the upstream table’s schema changes. This should generally be avoided because it can cause slow performance, cause important schema changes to go undetected, or break production code.
Code action Fix code action is not available.

 

Anti-pattern

WITH cte AS (
SELECT * FROM foo
)
SELECT * FROM cte
UNION
SELECT a, b FROM t

Best practice

WITH cte AS (
SELECT * FROM foo
)
SELECT a, b FROM cte
UNION
SELECT a, b FROM t

 

linter.sql.ambiguous.column_references

Name Reference all GROUP BY/ORDER BY columns either by name or by position.
Code action Fix code action is not available.

 

Anti-pattern

SELECT
col_1,
col_2,
SUM(col_3)
FROM
table_a
GROUP BY
col_1,
2
;

Best practice

SELECT
col_1,
col_2,
SUM(col_3)
FROM
table_a
GROUP BY
col_1,
col_2
;

linter.sql.ambiguous.use_fully_qualified_inner_join

Name INNER JOIN clause should use fully qualified name.
Description Enforce fully qualified INNER JOIN clause and [LEFT/RIGHT/FULL] OUTER JOIN.
Code action Fix code action is available.

 

Anti-pattern

SELECT
table_a.*,
table_b.*
FROM
table_a
JOIN table_b
ON table_a.customer_id = table_b.id
;

Best practice

SELECT
table_a.*,
table_b.*
FROM
table_a
INNER JOIN table_b
ON table_a.customer_id = table_b.id
;

linter.sql.ambiguous.use_fully_qualified_outer_join

Name OUTER JOIN clause should use fully qualified name.
Description Enforce fully qualified [LEFT/RIGHT/FULL] OUTER JOIN.
Code action Fix code action is available.

 

Anti-pattern

SELECT
table_a.*,
table_b.*
FROM
table_a
LEFT JOIN table_b
ON table_a.customer_id = table_b.id
;

Best practice

SELECT
table_a.*,
table_b.*
FROM
table_a
LEFT OUTER JOIN table_b
ON table_a.customer_id = table_b.id
;

linter.sql.ambiguous.avoid_implicit_cross_join

Name Prevent usage of implicit CROSS JOIN.
Description Identifies cases when cross join is implicitly specified where it could have a high risk of a mistaken cross join.
Code action Fix code action is available.

 

Anti-pattern

SELECT
*
FROM
table_a
JOIN table_b
;

Best practice

SELECT
*
FROM
table_a
CROSS JOIN table_b
;

linter.sql.ambiguous.consistent_order_by_direction

Name If any columns in the ORDER BY clause specify ASC or DESC, they should all do so.
Code action Fix code action is available.

 

Anti-pattern

SELECT
a,
b
FROM
foo
ORDER BY
a,
b DESC
;

Best practice

SELECT
a,
b
FROM
foo
ORDER BY
a ASC,
b DESC
;

linter.sql.ambiguous.set_expression_match_columns_count

Name When writing set expressions, all queries must return the same number of columns.
Code action Fix code action is not available.

 

Anti-pattern

WITH cte AS (
SELECT
a,
b
FROM foo
)
SELECT * FROM cte
UNION
SELECT
c,
d,
e
FROM
t
;

Best practice

WITH cte AS (
SELECT a, b FROM foo
)
SELECT
a,
b
FROM cte
UNION
SELECT
c,
d
FROM
t
;

linter.sql.ambiguous.use_fully_qualified_union

Name Forces to use UNION ALL / UNION DISTINCT fully qualified names instead of simple UNION clause.
Code action Fix code action is available.

 

Anti-pattern

SELECT
a,
b
FROM
table_a

UNION

SELECT
a,
b
FROM
table_b
;

Best practice

SELECT
a,
b
FROM
table_a

UNION ALL

SELECT
a,
b
FROM
table_b
;

Supported Platforms

At this point linter is supported on the following platforms:

  • Postgres
  • Redshift
  • Snowflake
  • Bigquery
Was this article helpful?
0 out of 0 found this helpful
Have more questions? Submit a request