Custom SQL Queries
Learn how to create custom tables and write SQL queries for advanced analytics in Hardal
SQL Queries Guide for Hardal Analytics
Overview
Hardal Analytics allows you to create custom metrics using SQL queries to analyze your data in ways specific to your business needs. This guide will help you understand how to write effective SQL queries in Hardal’s Analytics Dashboard to extract insights from your collected data.
Hardal uses ClickHouse as its underlying database, which provides high-performance SQL querying capabilities for analytics workloads.
Prerequisites
Before writing SQL queries in Hardal Analytics, you should have:
- Access to the Hardal Analytics Dashboard
- Basic understanding of SQL syntax
- Familiarity with JSON data structures (as much of Hardal’s data is stored in JSON format)
- Understanding of your data collection setup
Data Structure
Learn details about Hardal Database Structure
Database Overview
The analytics data is stored in a database called default
which contains multiple tables. For this documentation, we’ll focus on the analytics
table, which can be accessed using the format default.analytics
.
Table Structure
The analytics
table contains the following columns:
Column Name | Type | Description |
---|---|---|
id | UUID | Unique identifier for each record |
event_name | LowCardinality(String) | Name of the tracked event |
distinct_id | LowCardinality(String) | Unique identifier for tracking |
properties | String | JSON object containing detailed event data |
created_at | DateTime | Timestamp of event creation |
Properties Structure
The properties
column contains a JSON object with rich event data. Here’s the standard structure:
Top-Level Properties
distinct
: Contains user identification informationpage
: Page-related informationscreen
: Device screen specificationsbrowser
: Browser detailsdevice_type
: Type of device (e.g., desktop, mobile)timezone
: User’s timezonetimestamp
: Event timestampquery_params
: URL query parametersevents
: Array of detailed event informationbatch_size
: Number of events in batchbatch_timestamp
: Timestamp of the batch
Properties Detail
Page Information
Screen Information
Browser Information
Writing SQL Queries
To create a custom SQL query:
- Navigate to the Analytics Dashboard
- Click “Create Metric” button
- In the query editor, write your SQL query
- Click “Execute Query” to run and visualize the results
- Select your preferred visualization type (Table, Line Chart, Bar Chart, or Score Card)
Basic Query Structure
A basic query to retrieve data from Hardal follows this structure:
Working with JSON Data
Since much of Hardal’s data is stored in JSON format in the properties
column, you’ll need to use special functions to extract values:
String Extraction
Numeric Extraction
Array Extraction
When working with arrays in JSON (like events
):
Date and Time Filtering
Important: Hardal requires all queries to include a date range filter using the BETWEEN toDateTime(?) AND toDateTime(?)
syntax. This is mandatory to prevent queries from scanning too much data and affecting performance.
If you want to filter for the last 7 days, you’ll need to calculate the date range explicitly:
Example Queries
Here are some practical examples of SQL queries you can use in Hardal Analytics:
Basic Property Extraction
Basic Page View Analysis
Purchase Analysis by City
Purchases by Platform
User Journey Analysis
Analyzing GA4 Events
Visualization Options
After executing your query, you can visualize the results in several formats:
Table View
Displays the raw data in a tabular format. Best for detailed data exploration or when you need to see individual records.
Bar Chart
Useful for comparing values across categories. Good for queries that group data by dimensions like page, country, or device type.
Line Chart
Ideal for showing trends over time. Best for queries that include date or timestamp columns.
Score Card
Shows a single value, usually a key metric like total purchases, average revenue, or user count. Good for dashboard summary metrics.
Troubleshooting and Optimization
Common Issues
-
Missing Required Date Range: All queries must include a date range filter:
- Ensure you include
WHERE created_at BETWEEN toDateTime(?) AND toDateTime(?)
- Replace the ? placeholders with actual date values
- Ensure you include
-
Query Timeout: If your query takes too long to execute:
- Add more specific filters
- Narrow the date range
- Add a LIMIT clause
- Use more efficient JSON extraction
-
No Results: If your query returns no data:
- Check your date range - ensure data exists for the period
- Verify event names and property paths
- Simplify filters to see if any data exists
-
JSON Extraction Errors: If you’re having trouble accessing nested data:
- Use the right extraction function for the data type
- Check the path exactly
- Remember array indices start at 1 in ClickHouse, not 0
Performance Tips
-
Use Specific Date Ranges: Always include a date filter to limit the amount of data scanned.
-
Filter Early: Apply WHERE clauses as early as possible in your query logic.
-
Limit Result Sets: Use LIMIT or TOP to return only the data you need.
-
Use Appropriate JSON Extractors: Match the extractor to the data type (String, Int, Float).
-
Avoid Complex Subqueries: Use CTEs (WITH clauses) for better readability and performance.
Best Practices
-
Always Include Date Range: Every query must include the required date range filter using
BETWEEN toDateTime(?) AND toDateTime(?)
. -
Document Your Queries: Add comments to explain complex logic.
-
Use Meaningful Aliases: Give columns clear names that reflect what the data represents.
-
Format Your Queries: Use consistent indentation and line breaks for readability.
-
Test Incrementally: Build complex queries step by step, verifying results at each stage.
-
Reuse Tested Patterns: Save working query patterns for future use.
-
Handle NULL Values: Always handle potential NULL values in JSON extraction or calculations.
-
Use WITH Clauses: For complex queries, use CTEs (WITH clauses) to make your logic clearer.
-
Use Appropriate JSON Extraction Functions:
JSONExtractString
for textJSONExtractFloat
for decimal numbersJSONExtractInt
for whole numbersJSONExtractBool
for boolean values
-
Remember Array Indexing: When working with arrays, remember that indexing starts at 1, not 0
-
Chain Extractions Correctly: For nested properties, chain the extraction functions from outer to inner elements
-
Use LIMIT During Development: Consider adding LIMIT clauses during development to avoid processing large amounts of data