LearnData Model

Data Model

Hardal's ClickHouse database schema — core raw tables and pre-aggregated analytics tables for fast querying.

Hardal stores your data in ClickHouse across two types of tables: core tables that hold raw event data, and analytics tables that are pre-aggregated for fast querying.

Use the analytics tables whenever possible. They are pre-aggregated and optimized for performance — queries run significantly faster than scanning raw core tables.


Analytics Tables

These tables are pre-aggregated and purpose-built for reporting. Start here.

sessions

Every user session with device, location, and attribution data. One row per event within a session.

ColumnTypeDescription
website_idStringYour signal ID
user_idStringAnonymous user identifier
session_idStringSession identifier
event_idUUIDUnique event identifier
event_nameStringEvent name (e.g. page_view, purchase)
countryStringVisitor country
cityStringVisitor city
deviceStringDevice type (desktop, mobile, tablet)
browserStringBrowser name
osStringOperating system
hostnameStringDomain where the event occurred
referrer_domainStringReferring domain
urlStringFull page URL
event_timeDateTimeWhen the event occurred
session_partUInt32Part number within a multi-part session
-- Page views by country for the last 7 days
SELECT country, count() AS views
FROM sessions
WHERE website_id = 'your_signal_id'
  AND event_name = 'page_view'
  AND event_time >= now() - INTERVAL 7 DAY
GROUP BY country
ORDER BY views DESC

items

Product-level data from e-commerce events (purchase, add_to_cart, view_item, etc.).

ColumnTypeDescription
website_idStringYour signal ID
session_idStringSession identifier
event_idUUIDParent event identifier
event_nameStringEvent that contained the item
event_timeDateTimeEvent timestamp
transaction_idStringOrder or transaction ID
transaction_valueFloat64Total transaction value
currencyStringCurrency code
item_idStringProduct identifier
item_nameStringProduct name
item_brandStringBrand name
item_categoryStringPrimary category
item_category2StringSecondary category
item_category3StringTertiary category
item_skuStringSKU code
item_variantStringVariant name (e.g. color, size)
item_variant_idStringVariant identifier
item_priceFloat64Unit price at purchase
item_first_priceFloat64Original price before discount
item_taxFloat64Tax amount
item_discountFloat64Discount amount
item_quantityUInt32Units purchased
-- Top products by revenue this month
SELECT
  item_name,
  item_brand,
  sum(item_price * item_quantity) AS revenue,
  sum(item_quantity) AS units_sold
FROM items
WHERE website_id = 'your_signal_id'
  AND event_name = 'purchase'
  AND toYYYYMM(event_time) = toYYYYMM(today())
GROUP BY item_name, item_brand
ORDER BY revenue DESC
LIMIT 20

session_attr

Session-level marketing attribution — which source, medium, and campaign drove each session.

ColumnTypeDescription
website_idStringYour signal ID
session_idStringSession identifier
sourceStringTraffic source (e.g. google, facebook)
mediumStringTraffic medium (e.g. cpc, organic)
campaignStringUTM campaign value
channelStringResolved channel (e.g. PaidSearch, OrganicSocial)
channel_scoreFloat64Attribution weight score
touchpoint_countUInt32Number of touchpoints in session
-- Conversions by channel this week
SELECT
  sa.channel,
  count() AS conversions,
  sum(pr.revenue) AS revenue
FROM sessions s
JOIN session_attr sa ON s.session_id = sa.session_id AND s.website_id = sa.website_id
JOIN purchase_revenue pr ON s.event_id = pr.event_id AND s.website_id = pr.website_id
WHERE s.website_id = 'your_signal_id'
  AND s.event_name = 'purchase'
  AND s.event_time >= now() - INTERVAL 7 DAY
GROUP BY sa.channel
ORDER BY revenue DESC

user_attr

User-level attribution — the winning channel across all of a user's sessions.

ColumnTypeDescription
website_idStringYour signal ID
user_idStringAnonymous user identifier
sourceStringAttributed source
mediumStringAttributed medium
campaignStringAttributed campaign
channelStringAttributed channel
channel_scoreFloat64Attribution score
touchpoint_countUInt32Total touchpoints across sessions

purchase_revenue

Deduplicated revenue per event. Use this to sum revenue without double-counting.

ColumnTypeDescription
website_idStringYour signal ID
event_idUUIDPurchase event identifier
revenueFloat64Revenue value (deduped)

vw_daily_campaign_attribution

Pre-aggregated daily view of campaign performance. Best for campaign reporting queries.

ColumnTypeDescription
dateDateDay
campaignStringCampaign name
sourceStringTraffic source
mediumStringTraffic medium
channelStringResolved channel
sessionsUInt64Unique sessions
conversionsUInt64Conversion events
conversionRateFloat64Conversions ÷ sessions
revenueFloat64Total revenue
channelScoreFloat64Average channel attribution score
-- Campaign performance last 30 days
SELECT
  source,
  medium,
  campaign,
  sum(sessions) AS sessions,
  sum(conversions) AS conversions,
  sum(revenue) AS revenue
FROM vw_daily_campaign_attribution
WHERE date >= today() - 30
GROUP BY source, medium, campaign
ORDER BY revenue DESC

Core Tables

These tables hold the raw event data that powers everything above.

Avoid querying core tables directly for reporting — they are high-volume and not optimized for aggregation queries. Use the analytics tables above instead.

TableEnginePurpose
website_eventMergeTree (partitioned monthly)Every raw event with full device, browser, URL, and location data
event_dataMergeTreeCustom event properties stored as key-value pairs
session_dataReplacingMergeTreeSession-level properties stored as key-value pairs
touchpointsMergeTree (partitioned monthly)Raw marketing touchpoints with every supported click ID and UTM parameter

website_event schema

The primary raw event table. Every hardal.track() call writes a row here.

ColumnTypeDescription
idUUIDEvent identifier
website_idLowCardinality(String)Signal ID
session_idLowCardinality(String)Session identifier
hostnameLowCardinality(String)Domain
browser / browser_versionStringBrowser and version
osStringOperating system
deviceStringDevice type
screen / viewport_sizeStringScreen and viewport dimensions
languageStringBrowser language
country / city / regionStringGeo data
timezoneStringUser timezone
url / url_path / url_queryStringFull URL and components
referrer / referrer_domainStringReferrer data
page_titleStringHTML page title
event_typeStringpageview or event
event_nameStringEvent name
created_atDateTimeTimestamp (UTC)

event_data schema

Custom parameters passed with events (e.g. value, currency, item_id). Each property is a separate row.

ColumnTypeDescription
website_idStringSignal ID
session_idStringSession identifier
event_idUUIDLinks to website_event.id
event_nameStringEvent name
data_keyStringProperty name
string_valueNullable(String)String value
number_valueNullable(Float64)Numeric value
date_valueNullable(DateTime)Date value
data_typeUInt321 = string, 2 = number, 3 = date