數據轉換

數據轉換 自 Apache ECharts^TM 5 開始支援。在 ECharts 中，術語 數據轉換 意味著從使用者提供的來源數據和轉換函式生成新的數據。此功能使使用者能夠以宣告式的方式處理數據，並提供使用者一些常見的「轉換函式」，使這類任務「開箱即用」。(為了上下文的一致性，我們保留使用「轉換」這個詞的名詞形式，而不是「轉換結果」)。

數據轉換的抽象公式為：outData = f(inputData)，其中轉換函式 f 可以是 filter、sort、regression、boxplot、cluster、aggregate(待辦)...。藉由這些轉換方法，使用者可以實現以下功能：

將數據分割成多個系列。
進行一些統計並將結果視覺化。
將一些視覺化演算法調整到數據並顯示結果。
排序數據。
移除或選擇某些類型的空值或特殊數據點。
...

開始使用數據轉換

在 ECharts 中，數據轉換是基於 dataset 的概念實現的。可以在數據集實例中設定 dataset.transform，以指示此數據集將從此 transform 生成。例如：

var option = {
  dataset: [
    {
      // This dataset is on `datasetIndex: 0`.
      source: [
        ['Product', 'Sales', 'Price', 'Year'],
        ['Cake', 123, 32, 2011],
        ['Cereal', 231, 14, 2011],
        ['Tofu', 235, 5, 2011],
        ['Dumpling', 341, 25, 2011],
        ['Biscuit', 122, 29, 2011],
        ['Cake', 143, 30, 2012],
        ['Cereal', 201, 19, 2012],
        ['Tofu', 255, 7, 2012],
        ['Dumpling', 241, 27, 2012],
        ['Biscuit', 102, 34, 2012],
        ['Cake', 153, 28, 2013],
        ['Cereal', 181, 21, 2013],
        ['Tofu', 395, 4, 2013],
        ['Dumpling', 281, 31, 2013],
        ['Biscuit', 92, 39, 2013],
        ['Cake', 223, 29, 2014],
        ['Cereal', 211, 17, 2014],
        ['Tofu', 345, 3, 2014],
        ['Dumpling', 211, 35, 2014],
        ['Biscuit', 72, 24, 2014]
      ]
      // id: 'a'
    },
    {
      // This dataset is on `datasetIndex: 1`.
      // A `transform` is configured to indicate that the
      // final data of this dataset is transformed via this
      // transform function.
      transform: {
        type: 'filter',
        config: { dimension: 'Year', value: 2011 }
      }
      // There can be optional properties `fromDatasetIndex` or `fromDatasetId`
      // to indicate that where is the input data of the transform from.
      // For example, `fromDatasetIndex: 0` specify the input data is from
      // the dataset on `datasetIndex: 0`, or `fromDatasetId: 'a'` specify the
      // input data is from the dataset having `id: 'a'`.
      // [DEFAULT_RULE]
      // If both `fromDatasetIndex` and `fromDatasetId` are omitted,
      // `fromDatasetIndex: 0` are used by default.
    },
    {
      // This dataset is on `datasetIndex: 2`.
      // Similarly, if neither `fromDatasetIndex` nor `fromDatasetId` is
      // specified, `fromDatasetIndex: 0` is used by default
      transform: {
        // The "filter" transform filters and gets data items only match
        // the given condition in property `config`.
        type: 'filter',
        // Transforms has a property `config`. In this "filter" transform,
        // the `config` specify the condition that each result data item
        // should be satisfied. In this case, this transform get all of
        // the data items that the value on dimension "Year" equals to 2012.
        config: { dimension: 'Year', value: 2012 }
      }
    },
    {
      // This dataset is on `datasetIndex: 3`
      transform: {
        type: 'filter',
        config: { dimension: 'Year', value: 2013 }
      }
    }
  ],
  series: [
    {
      type: 'pie',
      radius: 50,
      center: ['25%', '50%'],
      // In this case, each "pie" series reference to a dataset that has
      // the result of its "filter" transform.
      datasetIndex: 1
    },
    {
      type: 'pie',
      radius: 50,
      center: ['50%', '50%'],
      datasetIndex: 2
    },
    {
      type: 'pie',
      radius: 50,
      center: ['75%', '50%'],
      datasetIndex: 3
    }
  ]
};var option = {
  dataset: [
    {
      // This dataset is on `datasetIndex: 0`.
      source: [
        ['Product', 'Sales', 'Price', 'Year'],
        ['Cake', 123, 32, 2011],
        ['Cereal', 231, 14, 2011],
        ['Tofu', 235, 5, 2011],
        ['Dumpling', 341, 25, 2011],
        ['Biscuit', 122, 29, 2011],
        ['Cake', 143, 30, 2012],
        ['Cereal', 201, 19, 2012],
        ['Tofu', 255, 7, 2012],
        ['Dumpling', 241, 27, 2012],
        ['Biscuit', 102, 34, 2012],
        ['Cake', 153, 28, 2013],
        ['Cereal', 181, 21, 2013],
        ['Tofu', 395, 4, 2013],
        ['Dumpling', 281, 31, 2013],
        ['Biscuit', 92, 39, 2013],
        ['Cake', 223, 29, 2014],
        ['Cereal', 211, 17, 2014],
        ['Tofu', 345, 3, 2014],
        ['Dumpling', 211, 35, 2014],
        ['Biscuit', 72, 24, 2014]
      ]
      // id: 'a'
    },
    {
      // This dataset is on `datasetIndex: 1`.
      // A `transform` is configured to indicate that the
      // final data of this dataset is transformed via this
      // transform function.
      transform: {
        type: 'filter',
        config: { dimension: 'Year', value: 2011 }
      }
      // There can be optional properties `fromDatasetIndex` or `fromDatasetId`
      // to indicate that where is the input data of the transform from.
      // For example, `fromDatasetIndex: 0` specify the input data is from
      // the dataset on `datasetIndex: 0`, or `fromDatasetId: 'a'` specify the
      // input data is from the dataset having `id: 'a'`.
      // [DEFAULT_RULE]
      // If both `fromDatasetIndex` and `fromDatasetId` are omitted,
      // `fromDatasetIndex: 0` are used by default.
    },
    {
      // This dataset is on `datasetIndex: 2`.
      // Similarly, if neither `fromDatasetIndex` nor `fromDatasetId` is
      // specified, `fromDatasetIndex: 0` is used by default
      transform: {
        // The "filter" transform filters and gets data items only match
        // the given condition in property `config`.
        type: 'filter',
        // Transforms has a property `config`. In this "filter" transform,
        // the `config` specify the condition that each result data item
        // should be satisfied. In this case, this transform get all of
        // the data items that the value on dimension "Year" equals to 2012.
        config: { dimension: 'Year', value: 2012 }
      }
    },
    {
      // This dataset is on `datasetIndex: 3`
      transform: {
        type: 'filter',
        config: { dimension: 'Year', value: 2013 }
      }
    }
  ],
  series: [
    {
      type: 'pie',
      radius: 50,
      center: ['25%', '50%'],
      // In this case, each "pie" series reference to a dataset that has
      // the result of its "filter" transform.
      datasetIndex: 1
    },
    {
      type: 'pie',
      radius: 50,
      center: ['50%', '50%'],
      datasetIndex: 2
    },
    {
      type: 'pie',
      radius: 50,
      center: ['75%', '50%'],
      datasetIndex: 3
    }
  ]
};

即時

讓我們總結一下使用數據轉換的重點：

通過在一些空白數據集中宣告 transform、fromDatasetIndex/fromDatasetId，從現有的已宣告數據生成新的數據。
系列引用這些數據集以顯示結果。

進階用法

管道轉換

有一種語法糖可以像這樣管道轉換：

option = {
  dataset: [
    {
      source: [] // The original data
    },
    {
      // Declare transforms in an array to pipe multiple transforms,
      // which makes them execute one by one and take the output of
      // the previous transform as the input of the next transform.
      transform: [
        {
          type: 'filter',
          config: { dimension: 'Product', value: 'Tofu' }
        },
        {
          type: 'sort',
          config: { dimension: 'Year', order: 'desc' }
        }
      ]
    }
  ],
  series: {
    type: 'pie',
    // Display the result of the piped transform.
    datasetIndex: 1
  }
};

注意：理論上，任何類型的轉換都可以有多個輸入數據和多個輸出數據。但是，當轉換被管道化時，它只能接受一個輸入（除非它是管道的第一個轉換）並產生一個輸出（除非它是管道的最後一個轉換）。

輸出多個數據

在大多數情況下，轉換函式只需要產生一個數據。但實際上存在轉換函式需要產生多個數據的情況，其中每個數據可能被不同的系列使用。

例如，在內建的盒狀圖轉換中，除了產生的盒狀圖數據外，還產生了異常值數據，這些數據可以在散佈圖系列中使用。請參閱範例。

我們使用屬性 dataset.fromTransformResult 來滿足此要求。例如：

option = {
  dataset: [
    {
      // Original source data.
      source: []
    },
    {
      transform: {
        type: 'boxplot'
      }
      // After this "boxplot transform" two result data generated:
      // result[0]: The boxplot data
      // result[1]: The outlier data
      // By default, when series or other dataset reference this dataset,
      // only result[0] can be visited.
      // If we need to visit result[1], we have to use another dataset
      // as follows:
    },
    {
      // This extra dataset references the dataset above, and retrieves
      // the result[1] as its own data. Thus series or other dataset can
      // reference this dataset to get the data from result[1].
      fromDatasetIndex: 1,
      fromTransformResult: 1
    }
  ],
  xAxis: {
    type: 'category'
  },
  yAxis: {},
  series: [
    {
      name: 'boxplot',
      type: 'boxplot',
      // Reference the data from result[0].
      datasetIndex: 1
    },
    {
      name: 'outlier',
      type: 'scatter',
      // Reference the data from result[1].
      datasetIndex: 2
    }
  ]
};

此外，dataset.fromTransformResult 和 dataset.transform 可以同時出現在一個數據集中，這表示轉換的輸入是從由 fromTransformResult 指定的上游結果檢索的。例如：

{
  fromDatasetIndex: 1,
  fromTransformResult: 1,
  transform: {
    type: 'sort',
    config: { dimension: 2, order: 'desc' }
  }
}

在開發環境中除錯

當使用數據轉換時，我們可能會遇到最終圖表顯示不正確，但我們不知道哪個設定錯誤的麻煩。在這種情況下，可以使用屬性 transform.print 來幫助。（transform.print 僅在開發環境中可用）。

option = {
  dataset: [
    {
      source: []
    },
    {
      transform: {
        type: 'filter',
        config: {},
        // The result of this transform will be printed
        // in dev tool via `console.log`.
        print: true
      }
    }
  ]
};

篩選轉換

轉換類型 "filter" 是一種內建轉換，可根據指定的條件提供數據篩選。基本選項如下：

option = {
  dataset: [
    {
      source: [
        ['Product', 'Sales', 'Price', 'Year'],
        ['Cake', 123, 32, 2011],
        ['Latte', 231, 14, 2011],
        ['Tofu', 235, 5, 2011],
        ['Milk Tee', 341, 25, 2011],
        ['Porridge', 122, 29, 2011],
        ['Cake', 143, 30, 2012],
        ['Latte', 201, 19, 2012],
        ['Tofu', 255, 7, 2012],
        ['Milk Tee', 241, 27, 2012],
        ['Porridge', 102, 34, 2012],
        ['Cake', 153, 28, 2013],
        ['Latte', 181, 21, 2013],
        ['Tofu', 395, 4, 2013],
        ['Milk Tee', 281, 31, 2013],
        ['Porridge', 92, 39, 2013],
        ['Cake', 223, 29, 2014],
        ['Latte', 211, 17, 2014],
        ['Tofu', 345, 3, 2014],
        ['Milk Tee', 211, 35, 2014],
        ['Porridge', 72, 24, 2014]
      ]
    },
    {
      transform: {
        type: 'filter',
        config: { dimension: 'Year', '=': 2011 }
        // The config is the "condition" of this filter.
        // This transform traverse the source data and
        // and retrieve all the items that the "Year"
        // is `2011`.
      }
    }
  ],
  series: {
    type: 'pie',
    datasetIndex: 1
  }
};option = {
  dataset: [
    {
      source: [
        ['Product', 'Sales', 'Price', 'Year'],
        ['Cake', 123, 32, 2011],
        ['Latte', 231, 14, 2011],
        ['Tofu', 235, 5, 2011],
        ['Milk Tee', 341, 25, 2011],
        ['Porridge', 122, 29, 2011],
        ['Cake', 143, 30, 2012],
        ['Latte', 201, 19, 2012],
        ['Tofu', 255, 7, 2012],
        ['Milk Tee', 241, 27, 2012],
        ['Porridge', 102, 34, 2012],
        ['Cake', 153, 28, 2013],
        ['Latte', 181, 21, 2013],
        ['Tofu', 395, 4, 2013],
        ['Milk Tee', 281, 31, 2013],
        ['Porridge', 92, 39, 2013],
        ['Cake', 223, 29, 2014],
        ['Latte', 211, 17, 2014],
        ['Tofu', 345, 3, 2014],
        ['Milk Tee', 211, 35, 2014],
        ['Porridge', 72, 24, 2014]
      ]
    },
    {
      transform: {
        type: 'filter',
        config: { dimension: 'Year', '=': 2011 }
        // The config is the "condition" of this filter.
        // This transform traverse the source data and
        // and retrieve all the items that the "Year"
        // is `2011`.
      }
    }
  ],
  series: {
    type: 'pie',
    datasetIndex: 1
  }
};

即時

這是篩選轉換的另一個範例：

關於維度

config.dimension 可以是：

在數據集中宣告的維度名稱，例如 config: { dimension: 'Year', '=': 2011 }。維度名稱宣告不是強制性的。
維度索引（從 0 開始），例如 config: { dimension: 3, '=': 2011 }。

關於關係運算符

關係運算符可以是：>(gt)、>=(gte)、<(lt)、<=(lte)、=(eq)、!=(ne, <>)、reg。（括號中的名稱是別名）。它們遵循常見的語義。除了常見的數字比較之外，還有一些額外功能：

多個運算符可以出現在一個 {} 項目中，例如 { dimension: 'Price', '>=': 20, '<': 30 }，表示邏輯「與」(Price >= 20 且 Price < 30)。
數據值可以是「數字字串」。數字字串是可以轉換為數字的字串。例如 ' 123 '。空白和換行符在轉換時將會自動修剪。
如果我們需要比較「JS Date 實例」或日期字串（例如 '2012-05-12'），我們需要手動指定 parser: 'time'，例如 config: { dimension: 3, lt: '2012-05-12', parser: 'time' }。
支援純字串比較，但只能在 =、!= 中使用。>、>=、<、<= 不支援純字串比較（這四個運算符的「右值」不能是「字串」）。
運算符 reg 可用於進行正規表示式測試。例如，使用 { dimension: 'Name', reg: /\s+Müller\s*$/ } 選擇「Name」維度包含姓氏 Müller 的所有數據項目。

關於邏輯關係

有時我們也需要表達邏輯關係（and / or / not）：

option = {
  dataset: [
    {
      source: [
        // ...
      ]
    },
    {
      transform: {
        type: 'filter',
        config: {
          // Use operator "and".
          // Similarly, we can also use "or", "not" in the same place.
          // But "not" should be followed with a {...} rather than `[...]`.
          and: [
            { dimension: 'Year', '=': 2011 },
            { dimension: 'Price', '>=': 20, '<': 30 }
          ]
        }
        // The condition is "Year" is 2011 and "Price" is greater
        // or equal to 20 but less than 30.
      }
    }
  ],
  series: {
    type: 'pie',
    datasetIndex: 1
  }
};

and/or/not 可以巢狀使用，例如：

transform: {
  type: 'filter',
  config: {
    or: [{
      and: [{
        dimension: 'Price', '>=': 10, '<': 20
      }, {
        dimension: 'Sales', '<': 100
      }, {
        not: { dimension: 'Product', '=': 'Tofu' }
      }]
    }, {
      and: [{
        dimension: 'Price', '>=': 10, '<': 20
      }, {
        dimension: 'Sales', '<': 100
      }, {
        not: { dimension: 'Product', '=': 'Cake' }
      }]
    }]
  }
}

關於解析器

進行值比較時可以指定一些「解析器」。目前僅支援：

parser: 'time'：在比較之前將值解析為日期時間。解析器規則與 echarts.time.parse 相同，其中支援將 JS Date 實例、時間戳記數字（以毫秒為單位）和時間字串（例如 '2012-05-12 03:11:22'）解析為時間戳記數字，而其他值將會被解析為 NaN。
parser: 'trim'：在進行比較之前修剪字串。對於非字串，返回原始值。
parser: 'number'：強制在比較之前將值轉換為數字。如果無法轉換為有意義的數字，則轉換為 NaN。在大多數情況下，這不是必要的，因為預設情況下，如果可能，值將會在比較之前自動轉換為數字。但是預設轉換是嚴格的，而此解析器提供了寬鬆的策略。如果我們遇到帶有單位後綴的數字字串的情況（例如 '33%'、12px），我們應該使用 parser: 'number' 在比較之前將它們轉換為數字。

這是一個顯示 parser: 'time' 的範例：

option = {
  dataset: [
    {
      source: [
        ['Product', 'Sales', 'Price', 'Date'],
        ['Milk Tee', 311, 21, '2012-05-12'],
        ['Cake', 135, 28, '2012-05-22'],
        ['Latte', 262, 36, '2012-06-02'],
        ['Milk Tee', 359, 21, '2012-06-22'],
        ['Cake', 121, 28, '2012-07-02'],
        ['Latte', 271, 36, '2012-06-22']
        // ...
      ]
    },
    {
      transform: {
        type: 'filter',
        config: {
          dimension: 'Date',
          '>=': '2012-05',
          '<': '2012-06',
          parser: 'time'
        }
      }
    }
  ]
};

正式定義

最後，我們在這裡給出篩選轉換設定的正式定義：

type FilterTransform = {
  type: 'filter';
  config: ConditionalExpressionOption;
};
type ConditionalExpressionOption =
  | true
  | false
  | RelationalExpressionOption
  | LogicalExpressionOption;
type RelationalExpressionOption = {
  dimension: DimensionName | DimensionIndex;
  parser?: 'time' | 'trim' | 'number';
  lt?: DataValue; // less than
  lte?: DataValue; // less than or equal
  gt?: DataValue; // greater than
  gte?: DataValue; // greater than or equal
  eq?: DataValue; // equal
  ne?: DataValue; // not equal
  '<'?: DataValue; // lt
  '<='?: DataValue; // lte
  '>'?: DataValue; // gt
  '>='?: DataValue; // gte
  '='?: DataValue; // eq
  '!='?: DataValue; // ne
  '<>'?: DataValue; // ne (SQL style)
  reg?: RegExp | string; // RegExp
};
type LogicalExpressionOption = {
  and?: ConditionalExpressionOption[];
  or?: ConditionalExpressionOption[];
  not?: ConditionalExpressionOption;
};
type DataValue = string | number | Date;
type DimensionName = string;
type DimensionIndex = number;

請注意，當使用最小包時，如果您需要使用此內建轉換，除了 Dataset 元件外，還需要導入 Transform 元件。

import {
  DatasetComponent,
  TransformComponent
} from 'echarts/components';

echarts.use([
  DatasetComponent,
  TransformComponent
]);

排序轉換

另一個內建轉換是「sort」。

option = {
  dataset: [
    {
      dimensions: ['name', 'age', 'profession', 'score', 'date'],
      source: [
        [' Hannah Krause ', 41, 'Engineer', 314, '2011-02-12'],
        ['Zhao Qian ', 20, 'Teacher', 351, '2011-03-01'],
        [' Jasmin Krause ', 52, 'Musician', 287, '2011-02-14'],
        ['Li Lei', 37, 'Teacher', 219, '2011-02-18'],
        [' Karle Neumann ', 25, 'Engineer', 253, '2011-04-02'],
        [' Adrian Groß', 19, 'Teacher', null, '2011-01-16'],
        ['Mia Neumann', 71, 'Engineer', 165, '2011-03-19'],
        [' Böhm Fuchs', 36, 'Musician', 318, '2011-02-24'],
        ['Han Meimei ', 67, 'Engineer', 366, '2011-03-12']
      ]
    },
    {
      transform: {
        type: 'sort',
        // Sort by score.
        config: { dimension: 'score', order: 'asc' }
      }
    }
  ],
  series: {
    type: 'bar',
    datasetIndex: 1
  }
  // ...
};

關於「排序轉換」的一些額外功能：

支援按多個維度排序。請參閱以下範例。
排序規則
- 預設情況下，「數值」（即數字和數字字串，例如 ' 123 '）能夠按數值順序排序。
- 否則，「非數字字串」也能夠在它們之間排序。這可能對具有相同標籤的數據項目進行分組的情況有所幫助，尤其是在多個維度參與排序時（請參閱以下範例）。
- 當「數值」與「非數字字串」比較，或者其中任何一個與其他類型的值比較時，它們是不可比較的。因此，我們將後者稱為「不可比較的」，並根據屬性 incomparable: 'min' | 'max' 將其視為「最小值」或「最大值」。此功能通常有助於決定是否將空值（例如 null、undefined、NaN、''、'-'）或其他非法值放在頭部或尾部。
可以使用 parser: 'time' | 'trim' | 'number'，與「篩選轉換」相同。
- 如果打算排序時間值（JS Date 實例或時間字串，例如 '2012-03-12 11:13:54'），則應指定 parser: 'time'。例如 config: { dimension: 'date', order: 'desc', parser: 'time' }。
- 如果打算排序帶有單位後綴的值（例如 '33%'、'16px'），則需要使用 parser: 'number'。

請參閱多重排序的範例：

option = {
  dataset: [
    {
      dimensions: ['name', 'age', 'profession', 'score', 'date'],
      source: [
        [' Hannah Krause ', 41, 'Engineer', 314, '2011-02-12'],
        ['Zhao Qian ', 20, 'Teacher', 351, '2011-03-01'],
        [' Jasmin Krause ', 52, 'Musician', 287, '2011-02-14'],
        ['Li Lei', 37, 'Teacher', 219, '2011-02-18'],
        [' Karle Neumann ', 25, 'Engineer', 253, '2011-04-02'],
        [' Adrian Groß', 19, 'Teacher', null, '2011-01-16'],
        ['Mia Neumann', 71, 'Engineer', 165, '2011-03-19'],
        [' Böhm Fuchs', 36, 'Musician', 318, '2011-02-24'],
        ['Han Meimei ', 67, 'Engineer', 366, '2011-03-12']
      ]
    },
    {
      transform: {
        type: 'sort',
        config: [
          // Sort by the two dimensions.
          { dimension: 'profession', order: 'desc' },
          { dimension: 'score', order: 'desc' }
        ]
      }
    }
  ],
  series: {
    type: 'bar',
    datasetIndex: 1
  }
  // ...
};

最後，我們在這裡給出排序轉換設定的正式定義：

type SortTransform = {
  type: 'sort';
  config: OrderExpression | OrderExpression[];
};
type OrderExpression = {
  dimension: DimensionName | DimensionIndex;
  order: 'asc' | 'desc';
  incomparable?: 'min' | 'max';
  parser?: 'time' | 'trim' | 'number';
};
type DimensionName = string;
type DimensionIndex = number;

請注意，當使用最小包時，如果您需要使用此內建轉換，除了 Dataset 元件外，還需要導入 Transform 元件。

import {
  DatasetComponent,
  TransformComponent
} from 'echarts/components';

echarts.use([
  DatasetComponent,
  TransformComponent
]);

使用外部轉換

除了內建的轉換（例如 'filter'、'sort'），我們也可以使用外部轉換來提供更強大的功能。這裡我們以第三方函式庫 ecStat 作為範例。

這個範例展示如何透過 ecStat 繪製迴歸線。

// Register the external transform at first.
echarts.registerTransform(ecStatTransform(ecStat).regression);

option = {
  dataset: [
    {
      source: rawData
    },
    {
      transform: {
        // Reference the registered external transform.
        // Note that external transform has a namespace (like 'ecStat:xxx'
        // has namespace 'ecStat').
        // built-in transform (like 'filter', 'sort') does not have a namespace.
        type: 'ecStat:regression',
        config: {
          // Parameters needed by the external transform.
          method: 'exponential'
        }
      }
    }
  ],
  xAxis: { type: 'category' },
  yAxis: {},
  series: [
    {
      name: 'scatter',
      type: 'scatter',
      datasetIndex: 0
    },
    {
      name: 'regression',
      type: 'line',
      symbol: 'none',
      datasetIndex: 1
    }
  ]
};