Getting Started with CoginitiScript

Introduction

The SQL language does not provide an easy way to reuse code. Stored procedures / functions are the closest things which could be used for this task but the development experience is not great. You have to manually deploy stored procedures into database before you can use them and you need to make sure the version deployed there is consistent to what you expect based on your source code. In addition, a lot of database users do not have permissions to deploy stored procedures / functions due to internal policy rules.

CoginitiScript allows users when writing SQL following the same best practices of code organization and reuse as software developers. In addition, being stored inside the Coginiti catalog it allows you to collaborate efficiently and version your assets. By organizing SQL logic into reusable modular pieces (code blocks + packages) as a user you create a single source of truth for your analytics.

This tutorial will show you how to define a SQL code block and reuse it in downstream calculations.

For a full overview of the language features, please refer to the Language Reference documentation.

For this tutorial we’re going to pretend we are a e-commerce shop and we need to calculate RFM segmentation for our customers. Our orders table looks like following:

DROP TABLE IF EXISTS orders;

CREATE TABLE orders(
  transaction_id INT,
  transaction_date TIMESTAMP,
  customer_email VARCHAR(255),
  quantity INT,
  unit_price DOUBLE PRECISION
);
 
This tutorial features SQL queries written in Redshift SQL syntax.
 

What is RFM analysis?

An RFM analysis is a way of segmenting customers using three metrics: recency (when their last purchase was), frequency (how often they purchase), and monetary value (how much they spent).

As a first step, for each customer we have to calculate those three metrics.

  1. Open Catalog tab and add new folder with name tutorial by invoking a context menu with right mouse click and selecting New Folder item.
  2. Right click on the created tutorial folder and select New Entry menu item
  3. Name it getting_started and click Create button
  4. Expand tutorial folder and double click on getting_started to open it in the editor tab.
  5. Paste the following code into the editor and save it by clicking on File -> Save Tab menu item in the application menu.
#+src sql rfmValues()
#+begin
  SELECT
    customer_email AS email,
    MAX(transaction_date) AS recent_order,
    COUNT(*) AS total_orders,
    SUM(quantity * unit_price) AS total_sales
  FROM
    orders
  WHERE
    customer_email IS NOT NULL
    AND unit_price != 0
  GROUP BY
    customer_email
  ORDER BY
    customer_email
#+end

To execute a given block in Coginiti application just place a cursor anywhere inside the block and hit Run at Cursor button (assuming correspondent connection to Redshift server has been selected in the dropdown).

The query above provides the values we’ll need for our RFM analysis. The data set is now ready for RFM analysis with 5 tier groups for each dimensions (R, F and M). To calculate such quantiles we will create a separate block with a query referencing our first block.

Past following code into the editor just below previous block.

#+src sql rfmQuantiles()
#+begin
  SELECT
    email,
    NTILE(5) OVER ( ORDER BY recent_order ) AS rfm_recency,
    NTILE(5) OVER ( ORDER BY total_orders ) AS rfm_frequency,
    NTILE(5) OVER ( ORDER BY total_sales ) AS rfm_monetary
  FROM
    {{ rfmValues() }}
  ORDER BY 2 DESC
#+end

Please note that syntax of block invocation has changed from ${block()} to {{ block() }} in 22.5.

This block returns a dataset with customer emails alongside the quintile that each customer falls into for recency, frequency, and monetary value.

Now that we know where our customers fall generally, we can put them into correspondent segment:

#+src sql rfmSegments()
#+begin
  SELECT
    email,
    rfm_recency,
    rfm_frequency,
    rfm_monetary,
    rfm_recency * 100 + rfm_frequency * 10 + rfm_monetary AS rfm_cell,
    CASE
      WHEN rfm_cell IN (355,255) THEN 'Cannot lose'
      WHEN rfm_cell IN (543,542,453,452) THEN 'Active fans'
      WHEN rfm_cell IN (525,524,515,514) THEN 'Promising newbies'
      WHEN rfm_cell IN (335,334,325,324) THEN 'Potential churners'
      ELSE 'Other'
    END AS rfm_segment
  FROM
    {{ rfmQuantiles() }}
  ORDER BY 6 DESC
#+end

If we want to precalculate this data and store it as a table we could just execute the following query:

CREATE TABLE rfm_segments AS
SELECT * FROM {{ rfmSegments() }};

As you can see when you write SQL queries you reference CoginitiScript blocks just like you would reference tables or views (in the same contexts). Code blocks provides a nice way of abstracting certain datasets and reusing them in the downstream calculations.

 
Was this article helpful?
1 out of 1 found this helpful
Have more questions? Submit a request