Peggy: Elasticsearch applications

Field datatypes
 a simple type like string, date, long, double, boolean or ip.
 a type which supports the hierarchical nature of JSON such as object or nested.
 or a specialised type like geo_point, geo_shape, or completion.

get - retrieve a document
 http://localhost:9200/_index/_type/_id
 http://localhost:9200/_index/_type/_id?pretty

get search
 http://localhost:9200/_index/_type/_search
 http://localhost:9200/_index/_type/_search?q=xxx&pretty

post search & query_string
{
"query": {
"query_string": {
"query": "*"
}
}
}

query_string
{
  "query_string": {
    "fields": ["content", "name"],
    "query": "this AND that"
  }
}
is equivalent to
{
  "query_string": {
    "query": "(content:this OR name:this) AND (content:that OR name:that)"
  }
}
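The equivalence between the multi-field form and the expanded query can be sketched in Python. This is only an illustration for simple single-word terms; the helper name expand_fields is hypothetical and is not part of Elasticsearch.

```python
def expand_fields(fields, terms, operator="AND"):
    """Spell out a multi-field query_string as one explicit query.

    Hypothetical helper: each term becomes an OR group over all fields,
    and the groups are joined with the given operator.
    """
    groups = []
    for term in terms:
        alternatives = " OR ".join(f"{field}:{term}" for field in fields)
        groups.append(f"({alternatives})")
    return f" {operator} ".join(groups)


expanded = expand_fields(["content", "name"], ["this", "that"])
print(expanded)
# (content:this OR name:this) AND (content:that OR name:that)
```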

query_string - query
 string (example terms: 手機 "mobile phone", 套 "case")
 "手機" 套 = "手機" OR 套
 apple phone = apple OR phone
 title:"手機" OR title:套 = title:("手機" 套) != (title:"手機" 套)
 boolean
 isPCT: true
 date & range
 dateName: [2012-01-01 TO 2012-12-31]
 dateName: [2012-01-01 TO *]
 dateName: {2011-12-31 TO *]
 range: [ 1 TO 5 ]

query_string - query
 object
 inventorsRaw.name: Nicky
 _missing_ & _exists_
 _missing_: title
 _exists_: title

query_string - nested
{
"query": {
"nested": {
"path": "relatedDocumentsRaw",
"query": {
"query_string": {
"query": "relatedDocumentsRaw.type:*"
}
}
}
}
}

query - size & from
 size (default: 10)
 The size parameter sets the maximum number of hits to return.
 from (default: 0)
 The from parameter sets the offset of the first result to fetch.
 If from + size exceeds the result window, the request fails with [query_phase_execution_exception]:
 Result window is too large, from + size must be less than or equal to: [10000]
 See the scroll api for a more efficient way to request large data sets.
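A small sketch of how a client might compute from and size for a page and catch the window limit before sending the request. The function name page_params and the zero-based page convention are assumptions made for illustration; the limit of 10000 is taken from the error message above.

```python
MAX_RESULT_WINDOW = 10000  # limit quoted in the error message above


def page_params(page, size=10):
    """Return the from/size pair for a zero-based page number.

    Hypothetical helper: raises ValueError when from + size would exceed
    the result window, where the scroll API is the better choice.
    """
    if page < 0 or size < 0:
        raise ValueError("page and size must be non-negative")
    offset = page * size
    if offset + size > MAX_RESULT_WINDOW:
        raise ValueError(
            f"from + size = {offset + size} exceeds {MAX_RESULT_WINDOW}; use scroll instead"
        )
    return {"from": offset, "size": size}


print(page_params(2, size=5))  # {'from': 10, 'size': 5}
```

The check is done client-side only to fail early; the server enforces the same limit regardless.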

query - sort & _source
 sort
 Adds one or more sort criteria on specific fields.
 _source
 Controls how the _source field is returned with every hit.
{
  "query": "…",
  "size": 5,
  "from": 10,
  "sort": [{ "pubDate": "desc" }],
  "_source": ["pubDate"]
}

query - filter
{
  "query": {
    "query_string": { "query": "*" }
  },
  "filter": {
    "script": {
      "script": {
        "lang": "groovy",
        "file": "fileName",
        "params": {
          "params1": "date1",
          "params2": "date2"
        }
      }
    }
  }
}

query - aggregations (aggs)
 The aggregations framework helps provide aggregated data based on a search query.
 size: number of buckets to return
 default: 10
 size: 0 returns all results
 min_doc_count: minimum document count a bucket needs in order to be returned
 order: sort order of the buckets
 date_histogram: ordered by date by default
 terms: ordered by doc_count by default
{
  "query": "…",
  "aggs": {
    "date_agg": {
      "date_histogram": {
        "field": "pubDate",
        "interval": "day",
        "format": "yyyy-MM-dd",
        "order": { "_count": "desc" },
        "min_doc_count": 1
      }
    },
    "kindCode_agg": {
      "terms": {
        "field": "kindCode",
        "size": 20,
        "shard_size": 20
      }
    }
  }
}
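The same request body can be assembled programmatically. The sketch below only builds and serializes the JSON; the helper names date_histogram_agg and terms_agg are hypothetical, and the query_string "*" query is borrowed from the earlier slides because the query in the example above is elided.

```python
import json


def date_histogram_agg(field, interval="day", fmt="yyyy-MM-dd", min_doc_count=1):
    """Hypothetical helper: a date_histogram ordered by bucket count, descending."""
    return {
        "date_histogram": {
            "field": field,
            "interval": interval,
            "format": fmt,
            "order": {"_count": "desc"},
            "min_doc_count": min_doc_count,
        }
    }


def terms_agg(field, size=10):
    """Hypothetical helper: a terms aggregation with shard_size equal to size."""
    return {"terms": {"field": field, "size": size, "shard_size": size}}


body = {
    "query": {"query_string": {"query": "*"}},  # assumed query, see lead-in
    "aggs": {
        "date_agg": date_histogram_agg("pubDate"),
        "kindCode_agg": terms_agg("kindCode", size=20),
    },
}
print(json.dumps(body, indent=2))
```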

query - aggregations (aggs)
{
"aggregations": {
"kindCode_agg": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{ "key": "U", "doc_count": 75879 },
{ "key": "A", "doc_count": 73732 },
{ "key": "B", "doc_count": 44115 },
{ "key": "S", "doc_count": 38981 } ] },
"appDocs": {
"buckets": [
{ "key_as_string": "2016-01-06", "key": 1452038400000, "doc_count": 56079 },
{ "key_as_string": "2016-01-27", "key": 1453852800000, "doc_count": 42351 } ] }
}
}
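Reading the buckets out of such a response is mostly dictionary access. A minimal sketch, using the response above as sample data; the function name bucket_counts is hypothetical.

```python
response = {
    "aggregations": {
        "kindCode_agg": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "U", "doc_count": 75879},
                {"key": "A", "doc_count": 73732},
                {"key": "B", "doc_count": 44115},
                {"key": "S", "doc_count": 38981},
            ],
        },
        "appDocs": {
            "buckets": [
                {"key_as_string": "2016-01-06", "key": 1452038400000, "doc_count": 56079},
                {"key_as_string": "2016-01-27", "key": 1453852800000, "doc_count": 42351},
            ]
        },
    }
}


def bucket_counts(response, agg_name):
    """Map each bucket label to its doc_count.

    Date buckets carry a formatted key_as_string, which is preferred
    over the raw epoch-millisecond key when present.
    """
    buckets = response["aggregations"][agg_name]["buckets"]
    return {b.get("key_as_string", b["key"]): b["doc_count"] for b in buckets}


kind_counts = bucket_counts(response, "kindCode_agg")
date_counts = bucket_counts(response, "appDocs")
print(kind_counts)
print(date_counts)
```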

_timestamp field
 Mapping
{
  "mappings": {
    "my_type": {
      "_timestamp": {
        "enabled": true
      }
    }
  }
}
 Query result
{
  "_index": "test2",
  "_type": "type",
  "_id": "2",
  "_score": 1,
  "_timestamp": 1454051014319,
  "_source": {
    "name": "Tony",
    "day": "1990-03-21"
  }
}

scan & scroll
 POST
 http://localhost:9200/{{_index}}/({{type}}/)_search?search_type=scan&scroll=1m
{
  "query": {
    "query_string": {
      "query": "*"
    }
  }
}

Keeping the search context alive
 The scroll parameter (passed to the search request and to every scroll request)
tells Elasticsearch how long it should keep the search context alive.
 Its value (e.g. 1m, see the "Time units" section) does not need to be
long enough to process all data; it just needs to be long enough to process the
previous batch of results.
 Each scroll request (with the scroll parameter) sets a new expiry time.

Response to the POST request
{
"_scroll_id": "c2Nhbjs1OzE5NjMzOkxXdWt2d2V2UVFHTVvdGFsX2hpdHM6MT……..",
"took": 487,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1041712,
"max_score": 0,
"hits": []
}
}

scroll
 GET
 http://localhost:9200/_search/scroll/{{_scroll_id}}?scroll=1m

Response to the GET request
{
  "_scroll_id": "c2Nhbjs1OzE5NjMzOkxXdWt2d2V2UVFHTVvdGFsX2hpdHM6MT……..",
  "took": 487,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1041712,
    "max_score": 0,
    "hits": [ {…….}, {…….}, {…….}, {…….}, {…….}, {…….}, {…….}, {…….}]
  }
}
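The scan and scroll exchange above amounts to a loop: take the _scroll_id from the previous response, request the next batch, and stop when a batch comes back empty. The sketch below shows only that control flow. The function scroll_all and the in-memory pages dictionary are hypothetical stand-ins; in a real client, fetch_next would issue the GET request shown above.

```python
def scroll_all(first_response, fetch_next):
    """Yield every hit, following _scroll_id until an empty batch arrives.

    fetch_next(scroll_id) is assumed to return one parsed scroll response.
    """
    scroll_id = first_response["_scroll_id"]
    while True:
        page = fetch_next(scroll_id)
        hits = page["hits"]["hits"]
        if not hits:
            break
        yield from hits
        scroll_id = page["_scroll_id"]  # always continue with the most recent id


# In-memory stand-in for the server, keyed by the scroll id that was sent.
pages = {
    "id-0": {"_scroll_id": "id-1", "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}},
    "id-1": {"_scroll_id": "id-2", "hits": {"hits": [{"_id": "3"}]}},
    "id-2": {"_scroll_id": "id-3", "hits": {"hits": []}},
}
# As in the POST response above, the initial scan response carries no hits.
first = {"_scroll_id": "id-0", "hits": {"total": 3, "hits": []}}

collected = [hit["_id"] for hit in scroll_all(first, pages.__getitem__)]
print(collected)  # ['1', '2', '3']
```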

Ref.
 Bulk
 https://www.elastic.co/guide/en/elasticsearch/guide/current/bulk.html
 Scan & Scroll
 https://www.elastic.co/guide/en/elasticsearch/guide/current/scan-scroll.html
 http://stackoverflow.com/questions/25453872/why-does-this-elasticsearch-scan-and-scroll-keep-returning-the-same-scroll-id

Cheaper in Bulk
{ action: { metadata }}\n
{ request body }\n
{ action: { metadata }}\n
{ request body }\n
…..

action
 delete
 { "delete": { "_index": "website", "_type": "blog", "_id": "123" }}\n
 create
 { "create": { "_index": "website", "_type": "blog", "_id": "123" }}\n
 { "title": "My first blog post" }\n
 index
 { "index": { "_index": "website", "_type": "blog" }}\n
 { "title": "My second blog post" }\n
 update
 { "update": { "_index": "website", "_type": "blog", "_id": "123", "_retry_on_conflict" : 3} }\n
 { "doc" : {"title" : "My updated blog post"} }\n

 status
 '200': 'OK',
 '201': 'Created',
{ "took": 4,
"errors": false,
"items": [
{ "delete": {
"_index": "website", "_type": "blog", "_id": "123", "_version": 2, "status": 200, "found": true }},
{ "create": {
"_index": "website", "_type": "blog", "_id": "123", "_version": 3, "status": 201 }},
{ "create": {
"_index": "website", "_type": "blog", "_id": "EiwfApScQiiy7TIKFxRCTw", "_version": 1, "status": 201 }},
{ "update": {
"_index": "website", "_type": "blog", "_id": "123", "_version": 4, "status": 200 }}
]
}
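The bulk request body is newline-delimited JSON: one action/metadata line, optionally followed by one request-body line, each terminated by a newline. A minimal sketch of building that payload from the four actions listed earlier; the function name bulk_body and the tuple layout are assumptions made for illustration.

```python
import json


def bulk_body(operations):
    """Serialize (action, metadata, document) tuples into a bulk payload.

    Hypothetical helper: document is None for actions such as delete
    that take no request-body line.
    """
    lines = []
    for action, metadata, document in operations:
        lines.append(json.dumps({action: metadata}))
        if document is not None:
            lines.append(json.dumps(document))
    return "\n".join(lines) + "\n"  # every line, including the last, ends with \n


ops = [
    ("delete", {"_index": "website", "_type": "blog", "_id": "123"}, None),
    ("create", {"_index": "website", "_type": "blog", "_id": "123"},
     {"title": "My first blog post"}),
    ("index", {"_index": "website", "_type": "blog"},
     {"title": "My second blog post"}),
    ("update", {"_index": "website", "_type": "blog", "_id": "123", "_retry_on_conflict": 3},
     {"doc": {"title": "My updated blog post"}}),
]
payload = bulk_body(ops)
print(payload, end="")
```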

Error Example
{ "create": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title": "Cannot create - it already exists" }
{ "index": { "_index": "website", "_type": "blog", "_id": "123" }}
{ "title": "But we can update it" }

Error Example
{ "took": 3,
"errors": true,
"items": [
{ "create": {
"_index": "website",
"_type": "blog",
"_id": "123",
"status": 409,
"error": "DocumentAlreadyExistsException [[website][4] [blog][123]: document already exists]" }},
{ "index": {
"_index": "website",
"_type": "blog",
"_id": "123",
"_version": 5,
"status": 200 }}
]
}
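Because a bulk request can partly succeed, the client has to inspect each item rather than rely on the HTTP status alone. A short sketch that collects the failed items from the response above; the function name failed_items is hypothetical.

```python
def failed_items(bulk_response):
    """Return (action, _id, status) for every item that reports an error."""
    failures = []
    for item in bulk_response["items"]:
        for action, result in item.items():  # each item holds a single action key
            if "error" in result:
                failures.append((action, result["_id"], result["status"]))
    return failures


response = {
    "took": 3,
    "errors": True,
    "items": [
        {"create": {
            "_index": "website", "_type": "blog", "_id": "123", "status": 409,
            "error": "DocumentAlreadyExistsException [[website][4] [blog][123]: document already exists]"}},
        {"index": {
            "_index": "website", "_type": "blog", "_id": "123", "_version": 5, "status": 200}},
    ],
}

# The top-level "errors" flag allows skipping the scan when everything succeeded.
failures = failed_items(response) if response["errors"] else []
print(failures)  # [('create', '123', 409)]
```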
