Uploading Existing Events Via CSV
Alternative way of capturing events
Capturing events through the API is important for the continuous training and improvement of the models, but it can take time for your users to generate enough events for Recommend. After implementing event ingestion, you can benefit from Recommend earlier by importing past events: upload a CSV file in the Algolia dashboard.
This feature is in beta. By joining the beta program, you understand that Algolia Recommend’s CSV upload might not work for your use case.
You can upload your events when configuring your model in the Collect Events section in the Algolia dashboard.
Your events must conform to the following format:
- The CSV file must be 100 MB or less in size.
- Each row should represent an event tied to a single objectID.
- The timestamps should cover a period of at least 30 days, and the data should be as recent as possible. When the model trains, it ignores any data that is more than 90 days old.
- The first row must contain the column headers: userToken, timestamp, objectID, eventType, and eventName. Any extra columns are ignored.
- The values must match the following criteria:
  - userToken: a unique identifier for the user session.
  - timestamp: the date of the event in a standard format (ISO 8601 or RFC 3339), with or without the time.
  - objectID: a unique identifier for the item the event is tied to.
  - eventType: the type of event (either click or conversion).
  - eventName: a name for the event, which can be the same as eventType.

After you upload a well-formatted file containing enough events, the model can start training. If you re-upload a file, the training only takes the newer file into account and discards the old one.
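These format rules can be checked locally before uploading. The following Python sketch mirrors them (validate_events_csv is an illustrative helper, not part of any Algolia tooling):

```python
import csv
import io
from datetime import datetime

REQUIRED_COLUMNS = {"userToken", "timestamp", "objectID", "eventType", "eventName"}
VALID_EVENT_TYPES = {"click", "conversion"}

def validate_events_csv(text):
    """Check an events CSV against the format rules; return a list of problems."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return ["missing columns: " + ", ".join(sorted(missing))]
    problems, timestamps = [], []
    for line, row in enumerate(reader, start=2):  # line 1 is the header
        if row["eventType"] not in VALID_EVENT_TYPES:
            problems.append(f"line {line}: eventType must be 'click' or 'conversion'")
        try:
            # ISO 8601 dates are accepted with or without a time component.
            timestamps.append(datetime.fromisoformat(row["timestamp"]))
        except ValueError:
            problems.append(f"line {line}: unparsable timestamp {row['timestamp']!r}")
    if timestamps and (max(timestamps) - min(timestamps)).days < 30:
        problems.append("timestamps should cover a period of at least 30 days")
    return problems

sample = """userToken,timestamp,objectID,eventType,eventName
user-1,2024-01-05,SKU-1,click,product_view
user-2,2024-03-02,SKU-2,conversion,purchase
"""
print(validate_events_csv(sample))  # → []
```

A check like this catches missing headers or an eventType typo before the dashboard rejects the file.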
Recommend models rely only on the timestamp values to determine the most recent window in which there is enough data to train.
For example, if all the events you upload have a timestamp older than 90 days, the models won’t have any valid events for training.
Once you send enough events to train the model, Algolia Recommend only uses those events for training and discards the events from the CSV file.
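As an illustration of that timestamp window (a sketch: the 90-day cutoff comes from the text above, while trainable_events is a hypothetical helper, not an Algolia API):

```python
from datetime import datetime, timedelta, timezone

def trainable_events(events, now=None, max_age_days=90):
    """Keep only events recent enough to be considered for training."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [e for e in events if e["timestamp"] >= cutoff]

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
events = [
    {"objectID": "SKU-1", "timestamp": datetime(2024, 5, 1, tzinfo=timezone.utc)},  # 31 days old: kept
    {"objectID": "SKU-2", "timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc)},  # ~152 days old: dropped
]
print([e["objectID"] for e in trainable_events(events, now=now)])  # → ['SKU-1']
```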
Exporting Google Analytics events through BigQuery
To export Google Analytics (GA360) data from BigQuery, you must have:
- A GA360 account with a website tracking ID.
- Enhanced Ecommerce activated and set up for the website.
- BigQuery Export enabled in GA360 to set up daily imports into BigQuery.
The productSKU from GA360 has to match the objectID in your index.
You can use the query below to export the data required to train both models. In the following code, you must replace:
- GCP_PROJECT_ID with the name of the project that holds the GA360 data in BigQuery.
- BQ_DATASET with the name of the dataset the exports are stored in.
- DATE_FROM and DATE_TO with the corresponding dates (in YYYY-MM-DD format) for a time window of at least 30 days.
```sql
WITH ecommerce_data AS (
  SELECT
    fullVisitorId AS user_token,
    TIMESTAMP_SECONDS(visitStartTime + CAST(hits.time / 1000 AS INT64)) AS timestamp,
    products.productSKU AS object_id,
    CASE
      WHEN hits.eCommerceAction.action_type = "2" THEN "click"
      WHEN hits.eCommerceAction.action_type = "3" THEN "click"
      WHEN hits.eCommerceAction.action_type = "5" THEN "click"
      WHEN hits.eCommerceAction.action_type = "6" THEN "conversion"
    END AS event_type,
    CASE
      WHEN hits.eCommerceAction.action_type = "2" THEN "product_view"
      WHEN hits.eCommerceAction.action_type = "3" THEN "add_to_cart"
      WHEN hits.eCommerceAction.action_type = "5" THEN "checkout"
      WHEN hits.eCommerceAction.action_type = "6" THEN "purchase"
    END AS event_name
  FROM
    `GCP_PROJECT_ID.BQ_DATASET.ga_sessions_*`,
    UNNEST(hits) AS hits,
    UNNEST(hits.product) AS products
  WHERE
    _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE('DATE_FROM'))
      AND FORMAT_DATE('%Y%m%d', DATE('DATE_TO'))
    AND fullVisitorId IS NOT NULL
    AND hits.eCommerceAction.action_type IN UNNEST(["2", "3", "5", "6"])
),
dedup_ecommerce_data AS (
  SELECT
    user_token AS userToken,
    timestamp,
    event_name AS eventName,
    event_type AS eventType,
    object_id AS objectID
  FROM ecommerce_data
  GROUP BY userToken, timestamp, eventName, eventType, objectID
)
SELECT * FROM dedup_ecommerce_data
```
You can run this query in the SQL workspace for BigQuery, then export the results as a CSV file to Google Drive, where it's available for download.
You can also automate this task with the BigQuery API client libraries.
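As a sketch of that automation with the official google-cloud-bigquery Python client (the export_query_to_csv helper and the output file name are illustrative; the client assumes credentials are configured, for example via Application Default Credentials):

```python
import csv

# Column order matching the query's dedup_ecommerce_data output.
CSV_HEADER = ["userToken", "timestamp", "eventName", "eventType", "objectID"]

def export_query_to_csv(sql, output_path="recommend_events.csv"):
    """Run the export query in BigQuery and write the rows to a local CSV file."""
    # Deferred import so the rest of the module loads without the dependency.
    from google.cloud import bigquery  # pip install google-cloud-bigquery

    client = bigquery.Client()         # picks up Application Default Credentials
    rows = client.query(sql).result()  # blocks until the query finishes
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(CSV_HEADER)
        for row in rows:
            writer.writerow([row[column] for column in CSV_HEADER])
```

The resulting file can then be uploaded in the Collect Events section of the dashboard, subject to the 100 MB limit.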