elasticsearch-definitive-guide-en

[[pagination]]
=== Pagination

Our preceding search told us that 14 documents in the((("pagination"))) cluster match our (empty) query. But there were only 10 documents in the `hits` array. How can we see the other documents?

In the same way as SQL uses the `LIMIT` keyword to return a single ``page'' of results, Elasticsearch accepts ((("from parameter")))((("size parameter")))the `from` and `size` parameters:

`size`:: Indicates the number of results that should be returned. Defaults to `10`.

`from`:: Indicates the number of initial results that should be skipped. Defaults to `0`.

If you wanted to show five results per page, then pages 1 to 3 could be requested as follows:

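[source,js]
--------------------------------------------------
GET /_search?size=5
GET /_search?size=5&from=5
GET /_search?size=5&from=10
--------------------------------------------------
// SENSE: 050_Search/15_Pagination.json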
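Mapping a page number onto these two parameters is simple arithmetic. As a minimal sketch in Python, assuming 1-based page numbers; the helpers `page_params` and `search_path` are hypothetical names for illustration and are not part of any Elasticsearch client:

```python
def page_params(page, size=10):
    """Return the `from` and `size` values for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive integers")
    # Skip all the results that belong to the earlier pages.
    return {"from": (page - 1) * size, "size": size}


def search_path(page, size=10):
    """Build the request path for one page (hypothetical helper)."""
    params = page_params(page, size)
    return f"/_search?size={params['size']}&from={params['from']}"


for page in (1, 2, 3):
    print("GET", search_path(page, size=5))
```

For page 1 the sketch spells out `from=0`, which is the same as omitting the parameter because `0` is the default.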

Beware of paging too deep or requesting too many results at once. Results are sorted before being returned. But remember that a search request usually spans multiple shards. Each shard generates its own sorted results, which then need to be sorted centrally to ensure that the overall order is correct.
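The two-stage sort can be sketched in Python. This is an illustrative model only, assuming each shard is a plain list of `(score, doc_id)` pairs ranked by descending score; the names `shard_top` and `coordinate` are hypothetical and do not reflect Elasticsearch internals:

```python
import heapq
from itertools import islice


def shard_top(docs, n):
    """Each shard sorts locally and returns its n best (score, doc_id) pairs."""
    return heapq.nlargest(n, docs)


def coordinate(shards, from_=0, size=10):
    """Merge the per-shard results and return the requested global slice."""
    needed = from_ + size
    # Every shard must contribute its top from+size results, because any
    # of them could end up inside the requested page.
    per_shard = [shard_top(docs, needed) for docs in shards]
    # Each per-shard list is already in descending order, so a k-way
    # merge yields the overall order.
    merged = heapq.merge(*per_shard, reverse=True)
    return list(islice(merged, from_, needed))
```

Calling `coordinate(shards, from_=0, size=10)` on five such lists mirrors a request for the first page: each shard contributes up to 10 entries, and only the overall top 10 survive the merge.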

.Deep Paging in Distributed Systems

To understand why ((("deep paging, problems with")))deep paging is problematic, let's imagine that we are searching within a single index with five primary shards. When we request the first page of results (results 1 to 10), each shard produces its own top 10 results and returns them to the requesting node, which then sorts all 50 results in order to select the overall top 10.

Now imagine that we ask for page 1,001 (results 10,001 to 10,010). Everything works in the same way except that each shard has to produce its top 10,010 results. The requesting node then sorts through all 50,050 results and discards 50,040 of them!

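You can see that, in a distributed system, the cost of sorting results keeps growing the deeper we page: each shard must produce `from + size` results, and the requesting node must sort that many results from every shard. There is a good reason that web search engines don't return more than 1,000 results for any query.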
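The numbers in this example follow from a short calculation. A small sketch in Python, assuming every shard holds at least `from + size` matching documents; `deep_paging_cost` is a hypothetical helper name:

```python
def deep_paging_cost(shards, from_, size):
    """Count the entries the requesting node must sort and then discard."""
    per_shard = from_ + size        # each shard returns its top from+size
    gathered = shards * per_shard   # entries arriving at the requesting node
    discarded = gathered - size     # everything except the requested page
    return per_shard, gathered, discarded


print(deep_paging_cost(5, 0, 10))        # (10, 50, 40)
print(deep_paging_cost(5, 10_000, 10))   # (10010, 50050, 50040)
```

The gathered count is `shards * (from + size)`, so it rises in step with both the paging depth and the number of shards involved.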

TIP: Elsewhere in this guide we explain how you can retrieve large numbers of documents efficiently.