@xieyunzi
Created March 19, 2017 13:52
Elasticsearch Scan/Scroll
// https://www.elastic.co/guide/en/elasticsearch/client/php-api/5.0/_search_operations.html#_scan_scroll
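use Elasticsearch\ClientBuilder; // assumes the Composer autoloader is already loaded

$client = ClientBuilder::create()->build();
$params = [
    "scroll" => "30s", // how long to keep the search context alive between scroll requests. should be small!
    "size" => 50, // how many results you want back per batch (only per shard with the old "scan" search type)
    "index" => "my_index",
    "body" => [
        "query" => [
            "match_all" => new \stdClass()
        ]
    ]
];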
// Execute the search
// Without the old "scan" search type, the response already contains
// the first batch of hits along with a _scroll_id
$response = $client->search($params);

// Now we loop until the scroll "cursors" are exhausted
while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
    // Do Work Here: process $response['hits']['hits']

    // Get new scroll_id
    // Must always refresh your _scroll_id! It can change sometimes
    $scroll_id = $response['_scroll_id'];

    // Execute a Scroll request
    $response = $client->scroll([
        "scroll_id" => $scroll_id, //...using our previously obtained _scroll_id
        "scroll" => "30s" // and the same timeout window
    ]);
}
// No results left, scroll cursor is empty. You've exported all the data
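
The loop above can also be wrapped in a generator so that calling code just iterates over hits. This is only a minimal sketch: `scrollHits` is a hypothetical helper name, not part of the gist or of the elasticsearch-php client, and it assumes `$client` exposes `search()` and `scroll()` methods that return arrays shaped like the responses above.

```php
// Hypothetical helper: yields every hit from a scrolled search, one at a time.
// Assumes $client->search() and $client->scroll() return arrays containing
// '_scroll_id' and 'hits' => ['hits' => [...]].
function scrollHits($client, array $params, string $keepAlive = "30s"): \Generator
{
    $params["scroll"] = $keepAlive;
    $response = $client->search($params); // first batch of hits plus a _scroll_id

    while (!empty($response['hits']['hits'])) {
        foreach ($response['hits']['hits'] as $hit) {
            yield $hit;
        }
        // Always pass the most recent _scroll_id; it can change between calls
        $response = $client->scroll([
            "scroll_id" => $response['_scroll_id'],
            "scroll" => $keepAlive,
        ]);
    }
}
```

Usage would look like `foreach (scrollHits($client, ["index" => "my_index", "size" => 50]) as $hit) { ... }`. If the caller breaks out of the loop early, the search context stays open on the cluster until the scroll timeout expires.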