Interest Analysis — Event Table Requirements
This page describes the requirements for event tables used in the Interest Analysis model. BPP supports three types of interest prediction:
- Product Interest — based on an explicit product classification already in the event row.
- Custom Interest — inferred from content/page analysis using a custom taxonomy.
- IAB Interest — inferred using the industry-standard IAB taxonomy.
General requirements for event tables
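Event tables for any type of Interest Analysis use the columns below. event_timestamp and hashed_email are always mandatory; event_label and page_url are required depending on the interest type: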
| Column name | Type | Description |
|---|---|---|
| event_timestamp | DATETIME | When the event occurred (mandatory, UTC) |
| hashed_email | STRING | A user identifier (mandatory; any consistent identifier works) |
| event_label | STRING | Interest classification label (optional, see below) |
| page_url | STRING | Full URL of the visited page (optional, used for content classification) |
All timestamps must be in UTC. The user identifier must be consistent across tables.
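The UTC requirement can be handled before loading. The following is a minimal sketch, not part of BPP: the helper name to_utc is hypothetical, and it assumes that timestamps without timezone information are already in UTC.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts: datetime) -> datetime:
    """Return ts as a timezone-aware UTC datetime.

    Assumption of this sketch: naive values are already expressed in UTC.
    """
    if ts.tzinfo is None:
        return ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


# An event recorded at 14:01 in a UTC+2 timezone becomes 12:01 UTC.
local = datetime(2024, 5, 1, 14, 1, tzinfo=timezone(timedelta(hours=2)))
print(to_utc(local).strftime("%Y-%m-%d %H:%M:%S"))  # 2024-05-01 12:01:00
```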
Product Interest
Scenario: the interest label is already available in the event row (e.g. from product/category tracking).
Required columns: event_timestamp, hashed_email, event_label.
Label format: hierarchical, using the double-pipe separator (||):
Electronics||Smartphones
Home & Garden||Furniture||Chairs
Example row
event_timestamp | hashed_email | event_label
---------------------|-----------------|---------------------------
2024-05-01 12:01:00 | a1b2c3@hash.com | Fashion||Shoes||Sneakers
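A label in this format can be split into its hierarchy levels as sketched below. The function name parse_label is hypothetical, and the three-level limit follows the best practice on this page rather than a documented hard constraint.

```python
SEPARATOR = "||"
MAX_LEVELS = 3  # taken from the best-practice recommendation, not a documented hard limit


def parse_label(label: str) -> list[str]:
    """Split a hierarchical event_label into its levels, rejecting malformed input."""
    levels = [part.strip() for part in label.split(SEPARATOR)]
    if any(not level for level in levels):
        raise ValueError(f"empty level in label: {label!r}")
    if len(levels) > MAX_LEVELS:
        raise ValueError(f"more than {MAX_LEVELS} levels in label: {label!r}")
    return levels


print(parse_label("Fashion||Shoes||Sneakers"))  # ['Fashion', 'Shoes', 'Sneakers']
```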
IAB & Custom Interest
Scenario: the event lacks explicit categorization. BPP infers it through content classification.
Required columns: event_timestamp, hashed_email, page_url (must include the protocol, e.g. https://).
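The protocol requirement can be checked with the Python standard library before loading. This is a sketch under the assumption that only http and https URLs are accepted; the helper name has_protocol is hypothetical.

```python
from urllib.parse import urlparse


def has_protocol(page_url: str) -> bool:
    """Return True if page_url has an http(s) scheme and a host part."""
    parsed = urlparse(page_url)
    # Assumption of this sketch: only http and https count as valid protocols.
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


print(has_protocol("https://www.example.com/blog/laptops"))  # True
print(has_protocol("www.example.com/blog/laptops"))          # False
```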
How it works
- A topic classification phase analyses the content of page_url.
- An interest label is generated and written into event_label.
- Interest modelling then uses event_label exactly as in Product Interest.
- IAB Interest uses the predefined IAB taxonomy.
- Custom Interest uses a client-specific taxonomy defined during onboarding.
Before classification
event_timestamp | hashed_email | page_url
---------------------|-----------------|------------------------------------------
2024-06-10 09:45:00 | xyz789@hash.com | https://www.example.com/blog/laptops
After classification
event_timestamp | hashed_email | page_url | event_label
---------------------|-----------------|---------------------------------------|------------------------
2024-06-10 09:45:00 | xyz789@hash.com | https://www.example.com/blog/laptops | Technology||Computers
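The before/after transformation can be pictured with the sketch below. It is illustrative only and does not show how BPP performs classification: fill_labels and toy_classifier are hypothetical names, the classifier is a keyword stub standing in for real content analysis, and leaving already-labelled rows untouched is an assumption.

```python
from typing import Callable, Optional

Row = dict[str, Optional[str]]


def fill_labels(rows: list[Row], classify: Callable[[str], Optional[str]]) -> list[Row]:
    """Return copies of rows with event_label filled in from page_url."""
    labelled = []
    for row in rows:
        new_row = dict(row)
        # Assumption: rows that already carry a label are left unchanged.
        if not new_row.get("event_label") and new_row.get("page_url"):
            new_row["event_label"] = classify(new_row["page_url"])
        labelled.append(new_row)
    return labelled


def toy_classifier(page_url: str) -> Optional[str]:
    """Keyword stub standing in for a real topic classifier."""
    return "Technology||Computers" if "laptops" in page_url else None


before = [{
    "event_timestamp": "2024-06-10 09:45:00",
    "hashed_email": "xyz789@hash.com",
    "page_url": "https://www.example.com/blog/laptops",
}]
after = fill_labels(before, toy_classifier)
print(after[0]["event_label"])  # Technology||Computers
```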
Notes
- event_label is optional at ingestion, but required before interest models can run.
- For Product Interest, event_label must be present at load time.
- For IAB/Custom Interest, event_label is generated automatically.
Best practices
- Use consistent, meaningful hierarchies in event_label (max 3 levels).
- Avoid duplicate or conflicting labels for the same event.
- Ensure all page_url values are reachable and crawlable for classification.
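Conflicting labels can be spotted before loading with a check like the one below. It is a sketch under a stated assumption: an event is identified by the pair (event_timestamp, hashed_email), which this page does not define. The function name conflicting_labels and the sample rows are illustrative. Exact duplicates are not reported, only events that carry more than one distinct label.

```python
from collections import defaultdict


def conflicting_labels(rows: list[dict]) -> dict:
    """Map each event key to its labels when more than one distinct label exists."""
    labels_by_event = defaultdict(set)
    for row in rows:
        # Assumption: timestamp plus user identifier identifies one event.
        key = (row["event_timestamp"], row["hashed_email"])
        labels_by_event[key].add(row["event_label"])
    return {key: labels for key, labels in labels_by_event.items() if len(labels) > 1}


rows = [
    {"event_timestamp": "2024-05-01 12:01:00", "hashed_email": "a1b2c3@hash.com",
     "event_label": "Fashion||Shoes||Sneakers"},
    {"event_timestamp": "2024-05-01 12:01:00", "hashed_email": "a1b2c3@hash.com",
     "event_label": "Fashion||Shoes"},
    {"event_timestamp": "2024-06-10 09:45:00", "hashed_email": "xyz789@hash.com",
     "event_label": "Technology||Computers"},
]
for key, labels in conflicting_labels(rows).items():
    print(key, sorted(labels))
```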
Summary
| Interest type | Required columns | Label source | Notes |
|---|---|---|---|
| Product | event_timestamp, hashed_email, event_label | From event tracking | Label must be present at load time |
| IAB | event_timestamp, hashed_email, page_url | Generated via scraping | Applied only to pageview events |
| Custom | event_timestamp, hashed_email, page_url | Generated via scraping | Taxonomy defined per client |
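The required columns in the summary table can be encoded as a lookup and checked against a table's column list. This is a minimal sketch: the dictionary keys and the function name missing_columns are hypothetical, and only the column sets come from the table above.

```python
REQUIRED_COLUMNS = {
    "product": {"event_timestamp", "hashed_email", "event_label"},
    "iab": {"event_timestamp", "hashed_email", "page_url"},
    "custom": {"event_timestamp", "hashed_email", "page_url"},
}


def missing_columns(interest_type: str, columns: set[str]) -> set[str]:
    """Return the required columns that are absent for the given interest type."""
    return REQUIRED_COLUMNS[interest_type] - set(columns)


table_columns = {"event_timestamp", "hashed_email", "page_url"}
print(missing_columns("iab", table_columns))      # set()
print(missing_columns("product", table_columns))  # {'event_label'}
```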