Elasticsearch Mapping

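What Is a Mapping

A mapping defines how a document and the fields it contains are stored and indexed.

A mapping is somewhat like a table schema definition in MySQL.

In Elasticsearch, a mapping can be used to define the following:

Whether a string field should be treated as full text or as an exact string value
The datatype of a field (number, date, or geo location)
Whether the values of all fields in a document should be indexed into the _all field
The format used for the values of date fields
Custom rules that control the mapping of dynamically added fields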
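
To make the list above concrete, here is a minimal sketch of an index body that touches each item, written as a Python dictionary and printed as JSON. It assumes Elasticsearch 5.x syntax; the type name article, every field name, and the template name strings_as_keywords are invented for illustration.

```python
import json

# Hypothetical index body for Elasticsearch 5.x. The type name "article",
# the field names and the template name are assumptions for illustration.
index_body = {
    "mappings": {
        "article": {
            # Do not copy every field value into the catch-all _all field.
            "_all": {"enabled": False},
            # Custom rule for dynamically added fields: map new strings as keyword.
            "dynamic_templates": [
                {
                    "strings_as_keywords": {
                        "match_mapping_type": "string",
                        "mapping": {"type": "keyword"},
                    }
                }
            ],
            "properties": {
                "title": {"type": "text"},          # full text, analyzed
                "status": {"type": "keyword"},      # exact string value
                "views": {"type": "long"},          # number
                "location": {"type": "geo_point"},  # geo location
                "published": {"type": "date", "format": "yyyy-MM-dd"},
            },
        }
    }
}

# The printed JSON could be sent as the body of PUT /<index>.
print(json.dumps(index_body, indent=2))
```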

Mapping Type

As mentioned earlier, a type is a virtual, logical category that divides the documents in an index into separate logical groups.

A type is loosely comparable to a table in MySQL.

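Indices created in Elasticsearch 6.x may contain only one type. 7.x deprecates types in the APIs, and 8.x removes them entirely.

Note: I am using Elasticsearch 5.2. To ease a later move to 6.x or 7.x, I recommend keeping a single type per index rather than putting several types in one index.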

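A type contains the following:

Meta-fields: the fields that carry a document's metadata, mainly _index, _type, _id, and _source.
Properties or fields: the properties (fields) of the source document, for example a title, a name, an age, or a creation time.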

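Datatypes

Every field has a datatype, which falls into one of these groups:

Simple types, such as text, keyword, date, long, double, boolean, and ip
Complex types, such as object and nested
Specialised types, such as geo_point, geo_shape, and completion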

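In Elasticsearch, the same field can be indexed as more than one datatype, each serving a different purpose.

For example, a string field is often indexed in two ways:

As a text field, which is analyzed and indexed for full-text search.
As a keyword field, which is used for sorting or aggregations.

A string field can be analyzed and indexed with the standard analyzer, the english analyzer, the french analyzer, or a custom analyzer.

Most datatypes support such multi-fields, which are declared through the field's fields parameter.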
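
As a rough sketch of the multi-field idea, the helper below builds a field definition that is analyzed as text and also carries a keyword sub-field. The helper name text_with_keyword and the field names title and city are hypothetical; the sub-field name keyword and the ignore_above value of 256 follow the convention visible in the mapping output later in this article.

```python
import json

def text_with_keyword(analyzer="standard", ignore_above=256):
    """Return a field definition: analyzed text plus an exact-value sub-field."""
    return {
        "type": "text",
        "analyzer": analyzer,
        "fields": {
            # Sub-field addressed as <field>.keyword in sorts and aggregations.
            "keyword": {"type": "keyword", "ignore_above": ignore_above}
        },
    }

# Hypothetical fields: "title" uses the english analyzer, "city" the default.
properties = {
    "title": text_with_keyword(analyzer="english"),
    "city": text_with_keyword(),
}

print(json.dumps(properties, indent=2))
```

With a mapping like this, a full-text query would target title, while a sort or aggregation would target title.keyword.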

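Dynamic Mapping

When creating an index, we can supply a mapping at the same time to state explicitly the datatype of each field in its documents.

However, even if no mapping has been specified by hand, Elasticsearch maps the fields automatically when a document is indexed directly. Fields that first appear in later documents are mapped automatically as well. This is dynamic mapping.

For example, let us create (index) a document in Elasticsearch directly.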

PUT /test/counters/1 
{ "count": 5 }

Here both the test index and the counters type are created automatically, along with the corresponding mapping.

Let us look at its mapping.

GET /test/_mapping/counters

Response:

{
  "test": {
    "mappings": {
      "counters": {
        "properties": {
          "count": {
            "type": "long"
          }
        }
      }
    }
  }
}

Next, we index another document that adds a created_date field.

PUT /test/counters/2
{ "count": 6, "created_date":"2019-06-09" }

Looking at the mapping again gives:

{
  "test": {
    "mappings": {
      "counters": {
        "properties": {
          "count": {
            "type": "long"
          },
          "created_date": {
            "type": "date"
          }
        }
      }
    }
  }
}
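
The behaviour shown above can be imitated with a small Python sketch. It is only a simplified model, not Elasticsearch's actual implementation: date detection is reduced to plain yyyy-MM-dd strings, arrays are not handled, and mapping floating-point numbers to float reflects my understanding of the 5.x defaults. The function names infer_field and update_mapping are made up for this example.

```python
import re

# Simplified stand-in for date detection: only plain yyyy-MM-dd strings match.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def infer_field(value):
    """Guess a field mapping for one JSON value (simplified sketch)."""
    if isinstance(value, bool):   # check bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, dict):   # JSON object: map its properties recursively
        return {"properties": update_mapping({}, value)}
    if isinstance(value, str):
        if DATE_PATTERN.fullmatch(value):
            return {"type": "date"}
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    raise TypeError(f"value type not covered by this sketch: {type(value).__name__}")

def update_mapping(properties, document):
    """Add mappings for fields not seen before; existing fields stay unchanged."""
    for name, value in document.items():
        if value is None or name in properties:
            continue
        properties[name] = infer_field(value)
    return properties

mapping = {}
update_mapping(mapping, {"count": 5})
update_mapping(mapping, {"count": 6, "created_date": "2019-06-09"})
print(mapping)
```

Run against the two documents from this section, the sketch ends with count mapped as long and created_date mapped as date, mirroring the mapping returned above.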

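Explicit Mapping

Explicit mapping means specifying the mapping yourself.

We know better than Elasticsearch which datatype suits each field of our documents, which is why explicit mapping matters.

Defining a Mapping When Creating an Index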

PUT /twitter 
{
  "mappings": {
    "tweet": {
      "properties": {
        "message": {
          "type": "text"
        }
      }
    }
  }
}

Adding Fields to a Mapping

PUT /twitter/_mapping/tweet 
{
  "properties": {
    "user_name": {
      "type": "text"
    }
  }
}

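Updating Existing Field Mappings

In general, the mapping of an existing field cannot be changed, because documents may already have been indexed under it and changing it could invalidate that indexed data.

There are some exceptions, however:

New properties can be added to an object field.
New multi-fields can be added to an existing field.
The ignore_above parameter of a field can be updated.

Example: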

PUT /my_index 
{
  "mappings": {
    "user": {
      "properties": {
        "name": {
          "properties": {
            "first": {
              "type": "text"
            }
          }
        },
        "user_id": {
          "type": "keyword"
        }
      }
    }
  }
}

PUT /my_index/_mapping/user
{
  "properties": {
    "name": {
      "properties": {
        "last": { 
          "type": "text"
        }
      }
    },
    "user_id": {
      "type": "keyword",
      "ignore_above": 100 
    }
  }
}
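
The three exceptions can be expressed as a small check that compares an existing properties block with a proposed update. This is a sketch of the rules as stated in this section, not a reproduction of Elasticsearch's own merge logic, which accepts a few other changes as well. The function name conflicts is hypothetical.

```python
def conflicts(old, new, path=""):
    """List the settings in `new` that would change an existing field in `old`."""
    problems = []
    for name, new_def in new.items():
        full = path + name
        if name not in old:
            continue  # a brand-new field or property is always allowed
        old_def = old[name]
        for key, value in new_def.items():
            # Containers are checked recursively; ignore_above may be updated.
            if key in ("properties", "fields", "ignore_above"):
                continue
            if old_def.get(key) != value:
                problems.append(f"{full}: cannot change {key!r}")
        # Inside object properties and multi-fields, only additions pass.
        for container in ("properties", "fields"):
            problems += conflicts(
                old_def.get(container, {}), new_def.get(container, {}), full + "."
            )
    return problems

# The existing mapping and the update from the example above.
existing = {
    "name": {"properties": {"first": {"type": "text"}}},
    "user_id": {"type": "keyword"},
}
update = {
    "name": {"properties": {"last": {"type": "text"}}},
    "user_id": {"type": "keyword", "ignore_above": 100},
}
print(conflicts(existing, update))

# Changing the datatype of an existing field is reported as a conflict.
print(conflicts(existing, {"user_id": {"type": "text"}}))
```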

Apart from these exceptions, the best approach is to create a new index with the correct mapping and then reindex your documents into it.
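
The reindexing route comes down to two requests, sketched below as data so that their order and bodies are easy to see. The helper name reindex_plan and the index name twitter_v2 are hypothetical, and the example assumes the _reindex API is available on your cluster. Nothing is sent over the network here.

```python
import json

def reindex_plan(source, dest, mappings):
    """Return the two requests, in order, as (method, path, body) tuples."""
    return [
        # 1. Create the destination index with the corrected mapping.
        ("PUT", f"/{dest}", {"mappings": mappings}),
        # 2. Copy every document from the old index into the new one.
        ("POST", "/_reindex", {"source": {"index": source}, "dest": {"index": dest}}),
    ]

# Hypothetical scenario: "message" was mapped as text in "twitter"
# and should be keyword in the new index "twitter_v2".
plan = reindex_plan(
    "twitter",
    "twitter_v2",
    {"tweet": {"properties": {"message": {"type": "keyword"}}}},
)

for method, path, body in plan:
    print(method, path)
    print(json.dumps(body, indent=2))
```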

Viewing a Mapping

The basic form of the request for viewing a mapping is:

GET /<index>/_mapping/<type>

For example:

GET /alibaba/_mapping/user

Response:

{
  "alibaba": {
    "mappings": {
      "user": {
        "properties": {
          "created_at": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "email": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "first_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "full_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "info": {
            "properties": {
              "address": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "age": {
                "type": "long"
              },
              "interests": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "last_name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
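
A response like this is easier to scan when flattened into dotted field paths. The sketch below walks the properties of a mapping and collects each field's type, including multi-fields and the properties of object fields. The function name flatten is made up, and the sample is an abbreviated copy of the response above, assuming the 5.x layout with a type level under mappings.

```python
def flatten(properties, prefix=""):
    """Yield (dotted field path, type) pairs from a mapping's properties."""
    for name, definition in properties.items():
        path = prefix + name
        if "properties" in definition:  # object field: descend into it
            yield from flatten(definition["properties"], path + ".")
        if "type" in definition:
            yield path, definition["type"]
        for sub_name, sub_def in definition.get("fields", {}).items():
            yield f"{path}.{sub_name}", sub_def["type"]  # multi-field

# Abbreviated copy of the response shown above.
response = {
    "alibaba": {
        "mappings": {
            "user": {
                "properties": {
                    "email": {
                        "type": "text",
                        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
                    },
                    "info": {"properties": {"age": {"type": "long"}}},
                }
            }
        }
    }
}

for index_name, index_body in response.items():
    for type_name, type_mapping in index_body["mappings"].items():
        fields = dict(flatten(type_mapping["properties"]))
        for path, datatype in fields.items():
            print(f"{index_name}/{type_name}: {path} -> {datatype}")
```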