In this sprint, you will design your data pipeline, including what data to sync to Algolia and how often to update it. By the end of this sprint, all of your data will be live in Algolia. Once you have completed the tasks below, you can move on to the next sprint.
Depending on the size of your company, some of these roles may be filled by the same person. In this sprint, it is important to identify these roles and get in contact with the people who hold them.
Planning and project oversight
Analyze and design IT components
Build application business logic, server scripts, and APIs
First, review the mockups created in the previous sprint and identify all data types they include. For instance, do the mockups include articles, products, or FAQs? Any data type you want to be searchable must have its own Algolia index.
Now that you know all of the data types you want to sync into Algolia, you need to think about how you want to structure this data and how often it needs to be updated.
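When structuring your data, each searchable item becomes one record: a flat JSON object with a unique objectID and only the attributes you search, display, filter, or rank on. A sketch of what a product record might look like (the attribute names here are illustrative, not required fields):

```javascript
// Illustrative product record -- attribute names are examples, not a required schema.
// Every Algolia record needs a unique, stable objectID; the rest is your data model.
const productRecord = {
  objectID: 'prod_42',                  // unique, stable identifier
  name: 'Trail Running Shoe',           // searchable attribute
  description: 'Lightweight shoe built for trail running.',
  brand: 'Acme',                        // candidate for faceting and filtering
  price: 89.99,                         // numeric filtering and sorting
  categories: ['Shoes', 'Running'],
  popularity: 1234,                     // custom ranking signal
};
```

Keeping records small and flat like this makes both indexing and relevance tuning easier later.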
Best practices are covered in this webinar, including how to handle different ranking strategies such as 'sort bys'. It's likely the data you have requires some transformation. Some use cases, such as handling multiple languages, also require specific indexing strategies.
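In Algolia, each 'sort by' is typically implemented as a replica index with its own ranking. A minimal settings sketch, assuming a primary products index with one "price ascending" replica (index and replica names are placeholders):

```javascript
// Hypothetical settings objects; in a real pipeline you would apply these
// with the API client's setSettings method. Names are placeholders.
const primarySettings = {
  replicas: ['products_price_asc'],   // one replica per sort order
};

const replicaSettings = {
  // Put the sort attribute first, then Algolia's default ranking criteria.
  ranking: [
    'asc(price)',
    'typo', 'geo', 'words', 'filters', 'proximity', 'attribute', 'exact', 'custom',
  ],
};
```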
At this point, it makes sense to create a system diagram of what will pass between systems and how often.
The tools you use to build your pipeline depend on the systems you are pulling data out of. For each data type, identify the system you need access to and check the relevant section below.
Out of the box connectors
The first step of implementing a Shopify integration is setting up a full reindex. Once you have validated your Algolia account, you can trigger this straight away. This creates three indices: products, pages, and collections. If you want to enrich these indices further with data from an API or a third-party system, use metafields. If you want to enrich them with data managed directly in Shopify, use named tags. If you have the option, we recommend named tags, as metafields can slow down the indexing process.
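As an illustration of the named tags convention: a Shopify product tag written as `key:value` surfaces on the Algolia record under a `named_tags` attribute. The sketch below shows that mapping; the parsing function is illustrative only, since the Shopify integration performs this for you:

```javascript
// Illustrative only: shows how `key:value` Shopify tags end up under the
// named_tags attribute on the indexed record. The real Shopify integration
// does this mapping automatically.
function toNamedTags(tags) {
  const namedTags = {};
  for (const tag of tags) {
    const i = tag.indexOf(':');
    if (i > 0) namedTags[tag.slice(0, i)] = tag.slice(i + 1);
  }
  return namedTags;
}
```

For example, a product tagged `material:cotton` would carry `named_tags: { material: 'cotton' }`, which you can then use for faceting and filtering.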
If you are unable to use one of our out-of-the-box integrations, you can check out connector options built by third parties and the Algolia community.
Do it yourself
If the system with the required data has an integration point where you can use one of the API clients listed above, that is the optimal place to index to Algolia.
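For example, with the JavaScript API client you can map source records into Algolia records at that integration point and push them in batches. The source field names below (`id`, `title`, `body`) are assumptions about your system; `saveObjects` is the client method that indexes an array of records:

```javascript
// Map records from your source system into Algolia records.
// The source field names (id, title, body) are assumptions about your CMS.
function toAlgoliaRecords(articles) {
  return articles.map((article) => ({
    objectID: String(article.id),   // reuse the source system's stable ID
    title: article.title,
    body: article.body,
  }));
}

// With the JavaScript API client (v4), indexing then looks like:
//
//   const algoliasearch = require('algoliasearch');
//   const client = algoliasearch('YourAppID', 'YourAdminAPIKey');
//   const index = client.initIndex('articles');
//   await index.saveObjects(toAlgoliaRecords(articles));
//
// saveObjects batches the records for you, so large arrays are safe.
```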
If you can only access the entire database, you can use replaceAllObjects.
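replaceAllObjects rebuilds the index from scratch: it indexes into a temporary index and then swaps it in, so searches never see a half-updated state. A hedged sketch of a full rebuild from a database dump (the validation helper and field mapping are illustrative, not part of the Algolia client):

```javascript
// Every record needs an objectID before a full reindex; this guard is
// illustrative, not part of the Algolia client.
function assertObjectIDs(records) {
  for (const record of records) {
    if (!record.objectID) {
      throw new Error('record is missing an objectID');
    }
  }
  return records;
}

// With the JavaScript API client (v4), a full rebuild from database rows
// looks like this (the field mapping depends on your schema):
//
//   const records = rows.map((row) => ({ objectID: String(row.id), ...row }));
//   await index.replaceAllObjects(assertObjectIDs(records), { safe: true });
//
// The { safe: true } option waits for the swap to complete before resolving.
```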
The Algolia Crawler is a good fit for your implementation if you have static HTML content you want to index. For example, the Algolia Crawler is a great way to index data for a site search implementation. Optimally, you can enrich crawled static content with ranking data from Google Analytics or Adobe Analytics.
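One way to enrich crawled records is to merge a pageview count into each record and use it as a custom ranking attribute. A minimal sketch, assuming you have already exported pageviews per URL from your analytics tool (the function and attribute names are illustrative):

```javascript
// Merge analytics pageviews into crawled records so popular pages can rank
// higher, e.g. via customRanking: ['desc(pageviews)'].
// `pageviewsByUrl` is assumed to come from a Google/Adobe Analytics export.
function enrichWithPageviews(records, pageviewsByUrl) {
  return records.map((record) => ({
    ...record,
    pageviews: pageviewsByUrl[record.url] || 0,
  }));
}
```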
You can manage all configuration details in the Crawler Editor as a JSON file. Once you've set up your startUrls and sitemaps, you can run the Crawler and use the path explorer and data analysis to see which URLs have been crawled and which haven't. Then you can update the configuration to ensure all required URLs are crawled.
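A skeleton of such a configuration might look like the sketch below. The URLs and index name are placeholders, and this shows only the core fields (startUrls, sitemaps, and an action with a recordExtractor); the Crawler Editor wraps a configuration like this for you:

```javascript
// Skeleton Crawler configuration; URLs and the index name are placeholders.
// Each action tells the Crawler which pages to match and how to turn a
// fetched page into Algolia records.
const crawlerConfig = {
  startUrls: ['https://www.example.com/'],
  sitemaps: ['https://www.example.com/sitemap.xml'],
  actions: [
    {
      indexName: 'pages',
      pathsToMatch: ['https://www.example.com/**'],
      // recordExtractor receives the fetched page and returns records;
      // $ is a jQuery-like handle on the page's HTML.
      recordExtractor: ({ url, $ }) => [
        {
          objectID: url.href,
          title: $('title').text(),
        },
      ],
    },
  ],
};
```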