Import & Synchronize Your Data - PHP

Last updated 10 April 2017

Before being able to search using Algolia, you’ll need to push your data to our servers using one of our API clients.

Importing data into Algolia and keeping it up to date is simple. Nonetheless, there are some steps to take to make sure the process is efficient and effective. So let’s go through them!

Importing Data

There are two ways to import data: through the dashboard and through the API. While the API is recommended, as it gives you more power and makes it easier to keep your data up to date, importing via the dashboard can be useful to get up and running quickly.

Importing Through the Dashboard

Once you have created an Algolia index, you can add records to it through the dashboard.

On the right-hand side of your index's browse page, you have the option to add records manually (in JSON format) or by uploading a file, which can be JSON, CSV, or TSV and must be under 50MB.

This is mostly useful for quick experimentation, but it’s not recommended for an implementation in production.

Importing Through the API

Importing your data through the API, via one of our many API clients, gives you the most flexibility and control.

Set up the API Client

Use Composer to manage your algoliasearch dependency (if you don't use Composer, you can copy the algoliasearch.php file and the src and resources directories into your project). Run the following command to add algolia/algoliasearch-client-php to your composer.json and install it:

composer require algolia/algoliasearch-client-php

Import your data

To index your existing data for the first time, you need to iterate through your records and send each to Algolia’s servers.

The following lines load all your database records and send them to Algolia’s servers:

<?php
// initialize the API client and index
$client = new \AlgoliaSearch\Client('YourApplicationID', 'YourAPIKey');
$index = $client->initIndex('test');

// connect to your MySQL database
$pdo = new PDO(
    'mysql:host=localhost;dbname=YourDatabaseName',
    'mysql_user',
    'mysql_password',
    array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES 'utf8'")
);

// fetch rows as associative arrays so the records only
// contain named attributes (no duplicate numeric keys)
$results = $pdo->query('SELECT * FROM YourTableName', PDO::FETCH_ASSOC);

if ($results) {
    $batch = array();

    // iterate over the rows and send them in batches of 10,000 records
    foreach ($results as $row) {
        // use the row's identifier as the Algolia objectID
        $row['objectID'] = $row['YourIDColumn'];
        $batch[] = $row;

        if (count($batch) === 10000) {
            $index->saveObjects($batch);
            $batch = array();
        }
    }

    // send the remaining records, if any
    if (count($batch) > 0) {
        $index->saveObjects($batch);
    }
}

Custom Object IDs

If your objects have unique IDs and you would like to use them to make future updates easier, you can specify the objectID in the records you push to Algolia. The value you provide for objectID can be an integer or a string. If you don't provide this attribute, Algolia will generate one for each record, like "228506501".
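
For example, a record saved with an explicit objectID (here the hypothetical ID "myID1") looks like this:

<?php
// save a record with an explicit objectID (integer or string)
$index->saveObject(array(
    "firstname" => "Jimmie",
    "lastname"  => "Barninger",
    "objectID"  => "myID1"
));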

Batching

We recommend sending your records in batches. We suggest a batch size of 1,000 or 10,000 records, depending on the average record size. 

This has multiple benefits: it reduces network calls and it increases indexing performance. Customers with the largest numbers of records, such as those on the Enterprise or Pro plans, will see the biggest impact on performance, but we recommend that all customers send indexing operations in batches where possible. For example:

<?php
// send a batch of records in a single call, then wait
// for the indexing task to complete
$res = $index->saveObjects(array(
    array("firstname" => "Jimmie", "lastname" => "Barninger", "objectID" => "myID1"),
    array("firstname" => "Warren", "lastname" => "Speach", "objectID" => "myID2")
));
$index->waitTask($res['taskID']);

For our Enterprise customers, note these two further considerations:

  • We accept a maximum of 1GB per API call (but we recommend much smaller batches)
  • If you send more than 1M objects, we recommend waiting for the indexing task to complete before sending more (see the sketch after this list)
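
For very large imports, a minimal sketch could look like the following; $allRecords is a hypothetical array holding all your records:

<?php
// pause roughly every 1M objects so the indexing queue
// can drain before sending more
$sent = 0;
foreach (array_chunk($allRecords, 10000) as $batch) {
    $res = $index->saveObjects($batch);
    $sent += count($batch);
    if ($sent % 1000000 === 0) {
        $index->waitTask($res['taskID']);
    }
}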

Update your data

After the initial import of the data, you’ll need to keep your index up-to-date with the latest additions and changes on your application or website.

Good news: the API is very flexible! You can update your records:

  • one by one in real time, or in batches
  • in full, or only a subset of their attributes
  • across a whole index, without generating inconsistencies in search results during the process

Incremental Updates

To save your new or updated objects, use the following code:

<?php
// update the record with objectID="myID1"
// the record is created if it doesn't exist
$res = $index->saveObject(array("firstname" => "Jimmie",
                                "lastname" => "Barninger",
                                "objectID" => "myID1"));

To remove records that have been deleted from your data source, use the following code:

<?php
// delete the record with objectID="myID1"
$index->deleteObject("myID1");
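
Deletions can be batched too; here is a short example using deleteObjects ("myID2" being a second hypothetical record):

<?php
// delete several records in a single call
$index->deleteObjects(array("myID1", "myID2"));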

To update only a subset of an object's attributes, you can either overwrite the whole record with the save/add methods above, or use the following code:

<?php
// update (or add if it doesn't exist) the attribute "firstname" of object "myID1"
$res = $index->partialUpdateObject(array("firstname" => "Jimmie", "objectID" => "myID1"));
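
Partial updates can likewise be batched; for example ("myID2" again being hypothetical):

<?php
// batch several partial updates in a single call
$res = $index->partialUpdateObjects(array(
    array("firstname" => "Jimmie", "objectID" => "myID1"),
    array("firstname" => "Warren", "objectID" => "myID2")
));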

Reindexing

Clear an Index

Clearing an index removes all its records but keeps its settings.

To clear an index, use the following code:

<?php
$index->clearIndex();

Atomic Reindexing

In some cases, you may want to completely change the way your index is structured and need to reindex all your data. To keep your existing service running while reimporting your data, we recommend using a temporary index plus an atomic move.

<?php
// import all your data to a temporary `YourIndexName_temp` index
// [...]
// rename the temporary index to its final name
$client->moveIndex("YourIndexName_temp", "YourIndexName");

You should set all the settings of the main index on the temporary one, except the replicas setting. The moveIndex operation overwrites all the settings of the destination index except this one.
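
Putting this together, here is a minimal sketch of an atomic reindex; getAllRecordsFromDatabase() is an assumed helper returning all your records as associative arrays:

<?php
// create the temporary index
$tmpIndex = $client->initIndex("YourIndexName_temp");

// copy the main index settings onto the temporary index,
// dropping the replicas setting (named "slaves" in older versions)
// so the move won't overwrite it on the destination index
$settings = $client->initIndex("YourIndexName")->getSettings();
unset($settings['replicas'], $settings['slaves']);
$tmpIndex->setSettings($settings);

// import everything into the temporary index in batches
// (getAllRecordsFromDatabase() is an assumed helper)
foreach (array_chunk(getAllRecordsFromDatabase(), 10000) as $batch) {
    $tmpIndex->saveObjects($batch);
}

// atomically replace the live index with the temporary one
$client->moveIndex("YourIndexName_temp", "YourIndexName");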

How Often to Send Updates

Finally, we address how often to send updates. This is another question with no single right answer: it depends on how often new data is added to your site and how quickly that data needs to be searchable.

For example, for an e-commerce shop:

  • You’ll want to push price changes or product availability in real time
  • You don’t necessarily need to update the number of sales used for ranking as often, so you can send those in periodic batches every hour/day/week… (see the sketch after this list)
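
For instance, a periodic job could push the updated sales counts as partial updates; getUpdatedSalesCounts() is an assumed helper returning only the records that changed:

<?php
// hypothetical hourly job: refresh the sales counts used for ranking
// getUpdatedSalesCounts() returns e.g.
// array(array("objectID" => "myID1", "sales_count" => 42), ...)
$index->partialUpdateObjects(getUpdatedSalesCounts());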

You’ll need to find a balance between making information searchable as fast as possible and minimizing the number of operations, because the volume of operations has an impact on both pricing and performance.
