Merge branch 'cnch-mysql-sync-1' into 'cnch-dev'

feat(clickhousech@m-3000223500): MySQL BI compatibility improvements

See merge request dp/ClickHouse!18954
 # Conflicts:
 #	docs/en/operations/settings/settings.md
 #	docs/en/operations/system-tables/part_log.md
 #	docs/en/sql-reference/statements/show.md
 #	src/Core/Settings.h
 #	tests/queries/0_ce_problematic_stateless/01114_database_atomic.reference
 #	tests/queries/0_ce_problematic_stateless/01114_database_atomic.sh
 #	tests/queries/0_ce_problematic_stateless/01834_alias_columns_laziness_filimonov.reference
 #	tests/queries/0_ce_problematic_stateless/01834_alias_columns_laziness_filimonov.sh
 #	tests/queries/4_cnch_stateless/01161_information_schema.reference
Dao 2024-02-21 16:17:37 +08:00
parent 9d6a8ee2c6
commit 577fe977b9
312 changed files with 10319 additions and 569 deletions


@ -15,25 +15,42 @@ SHOW TABLES FROM information_schema;
│ KEY_COLUMN_USAGE │
│ REFERENTIAL_CONSTRAINTS │
│ SCHEMATA │
│ STATISTICS │
│ TABLES │
│ VIEWS │
│ EVENTS │
│ ROUTINES │
│ TRIGGERS │
│ PARTITIONS │
│ columns │
│ key_column_usage │
│ referential_constraints │
│ schemata │
│ statistics │
│ tables │
│ views │
│ events │
│ routines │
│ triggers │
│ partitions │
└─────────────────────────┘
```
`INFORMATION_SCHEMA` contains the following views:
- [COLUMNS](#columns)
- [SCHEMATA](#schemata)
- [TABLES](#tables)
- [VIEWS](#views)
- [KEY_COLUMN_USAGE](#key_column_usage)
- [REFERENTIAL_CONSTRAINTS](#referential_constraints)
- [SCHEMATA](#schemata)
- [STATISTICS](#statistics)
- [TABLES](#tables)
- [VIEWS](#views)
- [EVENTS](#events)
- [ROUTINES](#routines)
- [TRIGGERS](#triggers)
- [PARTITIONS](#partitions)
Case-insensitive equivalent views, e.g. `INFORMATION_SCHEMA.columns`, are provided for compatibility with other databases. The same applies to all the columns in these views: both lowercase (for example, `table_name`) and uppercase (`TABLE_NAME`) variants are provided.
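As a small illustration of the case-insensitive aliases, the two queries below should return the same rows. The filter on the `system` schema and the `LIMIT` are arbitrary choices made for this sketch.

``` sql
-- Uppercase view and column names.
SELECT TABLE_NAME
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'system'
ORDER BY TABLE_NAME
LIMIT 3;

-- Lowercase equivalents of the same view and columns.
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'system'
ORDER BY table_name
LIMIT 3;
```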
@ -364,3 +381,157 @@ Columns:
- `delete_rule` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `table_name` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `referenced_table_name` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
## STATISTICS {#statistics}
Provides information about table indexes. Currently returns an empty result (no rows), which is just enough to provide compatibility with third-party tools like Tableau Online.
Columns:
- `table_catalog` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `table_schema` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `table_name` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `non_unique` ([Int32](../../sql-reference/data-types/int-uint.md)) — Currently unused.
- `index_schema` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `index_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Currently unused.
- `seq_in_index` ([UInt32](../../sql-reference/data-types/int-uint.md)) — Currently unused.
- `column_name` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Currently unused.
- `collation` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Currently unused.
- `cardinality` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — Currently unused.
- `sub_part` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — Currently unused.
- `packed` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Currently unused.
- `nullable` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `index_type` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `comment` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `index_comment` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `is_visible` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `expression` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Currently unused.
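A quick way to see what a BI tool sees when it probes this view is sketched below. It assumes a server that includes this change; per the description above, the count is expected to be zero, while `DESCRIBE` still lists the column layout.

``` sql
-- The view exists, so the probe succeeds, but it has no rows.
SELECT count() AS index_rows
FROM information_schema.statistics;

-- The column layout is still available to the client.
DESCRIBE TABLE information_schema.statistics;
```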
## EVENTS {#events}
Provides information about scheduled events in the database. Currently returns an empty result (no rows) for compatibility purposes.
Columns:
- `EVENT_CATALOG` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `EVENT_SCHEMA` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `EVENT_NAME` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `DEFINER` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `TIME_ZONE` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `EVENT_BODY` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `EVENT_DEFINITION` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Currently unused.
- `EVENT_TYPE` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `EXECUTE_AT` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Currently unused.
- `INTERVAL_VALUE` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Currently unused.
- `INTERVAL_FIELD` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Currently unused.
- `SQL_MODE` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `STARTS` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Currently unused.
- `ENDS` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Currently unused.
- `STATUS` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `ON_COMPLETION` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `CREATED` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Currently unused.
- `LAST_ALTERED` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Currently unused.
- `LAST_EXECUTED` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — Currently unused.
- `EVENT_COMMENT` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `ORIGINATOR` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Currently unused.
- `CHARACTER_SET_CLIENT` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
- `COLLATION_CONNECTION` ([String](../../sql-reference/data-types/string.md)) — Currently unused.
## ROUTINES {#routines}
Provides information about stored routines (procedures and functions) in the database. Currently returns an empty result (no rows) for compatibility purposes.
Columns:
- `SPECIFIC_NAME` ([String](../../sql-reference/data-types/string.md)) — The unique name that distinguishes the routine from other routines with the same `ROUTINE_NAME`. Currently unused.
- `ROUTINE_CATALOG` ([String](../../sql-reference/data-types/string.md)) — The name of the catalog to which the routine belongs. Currently unused.
- `ROUTINE_SCHEMA` ([String](../../sql-reference/data-types/string.md)) — The name of the schema (database) to which the routine belongs. Currently unused.
- `ROUTINE_NAME` ([String](../../sql-reference/data-types/string.md)) — The name of the routine. Currently unused.
- `ROUTINE_TYPE` ([String](../../sql-reference/data-types/string.md)) — The type of the routine, either 'PROCEDURE' or 'FUNCTION'. Currently unused.
- `DATA_TYPE` ([String](../../sql-reference/data-types/string.md)) — The data type of the routine's return value (for functions). Currently unused.
- `CHARACTER_MAXIMUM_LENGTH` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The maximum length of a character data type returned by a function. Currently unused.
- `CHARACTER_OCTET_LENGTH` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The length in bytes for a character string returned by a function. Currently unused.
- `NUMERIC_PRECISION` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The numeric precision of a numeric data type returned by a function. Currently unused.
- `NUMERIC_SCALE` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The numeric scale of a numeric data type returned by a function. Currently unused.
- `DATETIME_PRECISION` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The fractional seconds precision of a time data type returned by a function. Currently unused.
- `CHARACTER_SET_NAME` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The name of the character set used by a character or text string routine parameter. Currently unused.
- `COLLATION_NAME` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The name of the collation used by a character or text string routine parameter. Currently unused.
- `DTD_IDENTIFIER` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The data type descriptor for the routine's return value. Currently unused.
- `ROUTINE_BODY` ([String](../../sql-reference/data-types/string.md)) — Specifies whether the routine is implemented in SQL or uses an external implementation. Currently unused.
- `ROUTINE_DEFINITION` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The source text of the routine's body. Currently unused.
- `EXTERNAL_NAME` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The name of the external function or procedure when `ROUTINE_BODY` is 'EXTERNAL'. Currently unused.
- `EXTERNAL_LANGUAGE` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The language used for writing the external routine when `ROUTINE_BODY` is 'EXTERNAL'. Currently unused.
- `PARAMETER_STYLE` ([String](../../sql-reference/data-types/string.md)) — The SQL parameter style of the routine. Currently unused.
- `IS_DETERMINISTIC` ([String](../../sql-reference/data-types/string.md)) — Specifies whether the routine always returns the same results given the same inputs. Currently unused.
- `SQL_DATA_ACCESS` ([String](../../sql-reference/data-types/string.md)) — Specifies the type of data access the routine requires (e.g., 'CONTAINS SQL', 'NO SQL'). Currently unused.
- `SQL_PATH` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The path to be used for resolving names in the routine body. Currently unused.
- `SECURITY_TYPE` ([String](../../sql-reference/data-types/string.md)) — The security context in which the routine is executed ('DEFINER' or 'INVOKER'). Currently unused.
- `CREATED` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — The date and time when the routine was created. Currently unused.
- `LAST_ALTERED` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — The date and time when the routine was last altered. Currently unused.
- `SQL_MODE` ([String](../../sql-reference/data-types/string.md)) — The SQL mode in effect when the routine was created. Currently unused.
- `ROUTINE_COMMENT` ([String](../../sql-reference/data-types/string.md)) — Any comment supplied about the routine. Currently unused.
- `DEFINER` ([String](../../sql-reference/data-types/string.md)) — The account of the user who defined the routine. Currently unused.
- `CHARACTER_SET_CLIENT` ([String](../../sql-reference/data-types/string.md)) — The client character set for the routine. Currently unused.
- `COLLATION_CONNECTION` ([String](../../sql-reference/data-types/string.md)) — The collation of the connection that created the routine. Currently unused.
- `DATABASE_COLLATION` ([String](../../sql-reference/data-types/string.md)) — The collation of the database in which the routine was created. Currently unused.
## TRIGGERS {#triggers}
Provides information about triggers defined in the database. Currently returns an empty result (no rows) for compatibility purposes.
Columns:
- `TRIGGER_CATALOG` ([String](../../sql-reference/data-types/string.md)) — The name of the catalog to which the trigger belongs. Currently unused.
- `TRIGGER_SCHEMA` ([String](../../sql-reference/data-types/string.md)) — The name of the schema (database) to which the trigger belongs. Currently unused.
- `TRIGGER_NAME` ([String](../../sql-reference/data-types/string.md)) — The name of the trigger. Currently unused.
- `EVENT_MANIPULATION` ([String](../../sql-reference/data-types/string.md)) — The type of event that activates the trigger (e.g., INSERT, UPDATE, DELETE). Currently unused.
- `EVENT_OBJECT_CATALOG` ([String](../../sql-reference/data-types/string.md)) — The name of the catalog containing the table on which the trigger acts. Currently unused.
- `EVENT_OBJECT_SCHEMA` ([String](../../sql-reference/data-types/string.md)) — The name of the schema containing the table on which the trigger acts. Currently unused.
- `EVENT_OBJECT_TABLE` ([String](../../sql-reference/data-types/string.md)) — The name of the table on which the trigger acts. Currently unused.
- `ACTION_ORDER` ([Int64](../../sql-reference/data-types/int-uint.md)) — The position of the trigger's action within the sequence of triggers on the same table, for the same event. Currently unused.
- `ACTION_CONDITION` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The condition that must be true for the trigger to fire. Currently unused.
- `ACTION_STATEMENT` ([String](../../sql-reference/data-types/string.md)) — The SQL statement executed when the trigger fires. Currently unused.
- `ACTION_ORIENTATION` ([String](../../sql-reference/data-types/string.md)) — Whether the trigger is row-level or statement-level. Currently unused.
- `ACTION_TIMING` ([String](../../sql-reference/data-types/string.md)) — Specifies whether the trigger fires before or after the event. Currently unused.
- `ACTION_REFERENCE_OLD_TABLE` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The old table reference name, for row-level triggers. Currently unused.
- `ACTION_REFERENCE_NEW_TABLE` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The new table reference name, for row-level triggers. Currently unused.
- `ACTION_REFERENCE_OLD_ROW` ([String](../../sql-reference/data-types/string.md)) — The old row reference name. Currently unused.
- `ACTION_REFERENCE_NEW_ROW` ([String](../../sql-reference/data-types/string.md)) — The new row reference name. Currently unused.
- `CREATED` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — The date and time when the trigger was created. Currently unused.
- `SQL_MODE` ([String](../../sql-reference/data-types/string.md)) — The SQL mode in effect when the trigger was created. Currently unused.
- `DEFINER` ([String](../../sql-reference/data-types/string.md)) — The account of the user who defined the trigger. Currently unused.
- `CHARACTER_SET_CLIENT` ([String](../../sql-reference/data-types/string.md)) — The client character set for the trigger. Currently unused.
- `COLLATION_CONNECTION` ([String](../../sql-reference/data-types/string.md)) — The collation of the connection that created the trigger. Currently unused.
- `DATABASE_COLLATION` ([String](../../sql-reference/data-types/string.md)) — The collation of the database in which the trigger was created. Currently unused.
## PARTITIONS {#partitions}
Provides information about table partitions in the database. Currently returns an empty result (no rows) for compatibility purposes.
Columns:
- `TABLE_CATALOG` ([String](../../sql-reference/data-types/string.md)) — The name of the catalog to which the table with the partition belongs. Currently unused.
- `TABLE_SCHEMA` ([String](../../sql-reference/data-types/string.md)) — The name of the schema (database) to which the table with the partition belongs. Currently unused.
- `TABLE_NAME` ([String](../../sql-reference/data-types/string.md)) — The name of the table with the partition. Currently unused.
- `PARTITION_NAME` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The name of the partition. Currently unused.
- `SUBPARTITION_NAME` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The name of the subpartition. Currently unused.
- `PARTITION_ORDINAL_POSITION` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The ordinal position of the partition within the table. Currently unused.
- `SUBPARTITION_ORDINAL_POSITION` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The ordinal position of the subpartition within the partition. Currently unused.
- `PARTITION_METHOD` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The method or function used to partition the table. Currently unused.
- `SUBPARTITION_METHOD` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The method or function used to subpartition the table. Currently unused.
- `PARTITION_EXPRESSION` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The expression used for partitioning the table. Currently unused.
- `SUBPARTITION_EXPRESSION` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The expression used for subpartitioning the table. Currently unused.
- `PARTITION_DESCRIPTION` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — A description or value used in the range or list partitioning. Currently unused.
- `TABLE_ROWS` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — An estimate of the number of rows in the partition. Currently unused.
- `AVG_ROW_LENGTH` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The average length of a row in the partition. Currently unused.
- `DATA_LENGTH` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The total length (in bytes) of the data in the partition. Currently unused.
- `MAX_DATA_LENGTH` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The maximum data length (in bytes) allowed in the partition. Currently unused.
- `INDEX_LENGTH` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The length (in bytes) of the index file for the partition. Currently unused.
- `DATA_FREE` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The number of allocated but unused bytes in the partition. Currently unused.
- `CREATE_TIME` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — The date and time when the partition was created. Currently unused.
- `UPDATE_TIME` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — The date and time when the partition was last updated. Currently unused.
- `CHECK_TIME` ([Nullable](../../sql-reference/data-types/nullable.md)([DateTime](../../sql-reference/data-types/datetime.md))) — The date and time when the partition was last checked. Currently unused.
- `CHECKSUM` ([Nullable](../../sql-reference/data-types/nullable.md)([UInt64](../../sql-reference/data-types/int-uint.md))) — The live checksum value for the rows in the partition (if any). Currently unused.
- `PARTITION_COMMENT` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — Any comment provided for the partition. Currently unused.
- `NODEGROUP` ([Nullable](../../sql-reference/data-types/nullable.md)([Int64](../../sql-reference/data-types/int-uint.md))) — The node group for the partition in a clustered database. Currently unused.
- `TABLESPACE_NAME` ([Nullable](../../sql-reference/data-types/nullable.md)([String](../../sql-reference/data-types/string.md))) — The tablespace in which the partition resides. Currently unused.
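The four compatibility views described above (`EVENTS`, `ROUTINES`, `TRIGGERS`, `PARTITIONS`) can be checked in one pass. This is only a sketch, assuming a server that includes this change; the `view_name` and `row_count` aliases are chosen for the example.

``` sql
-- Each branch counts the rows of one placeholder view.
SELECT 'EVENTS' AS view_name, count() AS row_count FROM information_schema.events
UNION ALL
SELECT 'ROUTINES' AS view_name, count() AS row_count FROM information_schema.routines
UNION ALL
SELECT 'TRIGGERS' AS view_name, count() AS row_count FROM information_schema.triggers
UNION ALL
SELECT 'PARTITIONS' AS view_name, count() AS row_count FROM information_schema.partitions;
```

Since the views are documented as returning no rows, every `row_count` should be zero.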


@ -0,0 +1,69 @@
# system.part_log {#system_tables-part-log}
The `system.part_log` table is created only if the [part_log](../../operations/server-configuration-parameters/settings.md#server_configuration_parameters-part-log) server setting is specified.
This table contains information about events that occurred with [data parts](../../engines/table-engines/mergetree-family/custom-partitioning-key.md) in the [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md) family tables, such as adding or merging data.
The `system.part_log` table contains the following columns:
- `query_id` ([String](../../sql-reference/data-types/string.md)) — Identifier of the `INSERT` query that created this data part.
- `event_type` ([Enum8](../../sql-reference/data-types/enum.md)) — Type of the event that occurred with the data part. Can have one of the following values:
    - `NewPart` — Insertion of a new data part.
    - `MergeParts` — Merging of data parts.
    - `DownloadParts` — Downloading a data part.
    - `RemovePart` — Removing or detaching a data part using [DETACH PARTITION](../../sql-reference/statements/alter/partition.md#alter_detach-partition).
    - `MutatePart` — Mutating of a data part.
    - `MovePart` — Moving a data part from one disk to another.
- `event_date` ([Date](../../sql-reference/data-types/date.md)) — Event date.
- `event_time` ([DateTime](../../sql-reference/data-types/datetime.md)) — Event time.
- `event_time_microseconds` ([DateTime64](../../sql-reference/data-types/datetime64.md)) — Event time with microseconds precision.
- `duration_ms` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Duration of the event, in milliseconds.
- `database` ([String](../../sql-reference/data-types/string.md)) — Name of the database the data part is in.
- `table` ([String](../../sql-reference/data-types/string.md)) — Name of the table the data part is in.
- `part_name` ([String](../../sql-reference/data-types/string.md)) — Name of the data part.
- `partition_id` ([String](../../sql-reference/data-types/string.md)) — ID of the partition that the data part was inserted into. The column takes the `all` value if the partitioning is by `tuple()`.
- `path_on_disk` ([String](../../sql-reference/data-types/string.md)) — Absolute path to the folder with data part files.
- `rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of rows in the data part.
- `size_in_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Size of the data part in bytes.
- `merged_from` ([Array(String)](../../sql-reference/data-types/array.md)) — An array of names of the parts which the current part was made up from (after the merge).
- `bytes_uncompressed` ([UInt64](../../sql-reference/data-types/int-uint.md)) — Size of uncompressed bytes.
- `read_rows` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of rows read during the merge.
- `read_bytes` ([UInt64](../../sql-reference/data-types/int-uint.md)) — The number of bytes read during the merge.
- `peak_memory_usage` ([Int64](../../sql-reference/data-types/int-uint.md)) — The maximum difference between the amount of allocated and freed memory in the context of this thread.
- `error` ([UInt16](../../sql-reference/data-types/int-uint.md)) — The code of the error that occurred.
- `exception` ([String](../../sql-reference/data-types/string.md)) — Text message of the error that occurred.
The `system.part_log` table is created after data is first inserted into a `MergeTree` table.
**Example**
``` sql
SELECT * FROM system.part_log LIMIT 1 FORMAT Vertical;
```
``` text
Row 1:
──────
query_id: 983ad9c7-28d5-4ae1-844e-603116b7de31
event_type: NewPart
event_date: 2021-02-02
event_time: 2021-02-02 11:14:28
event_time_microseconds: 2021-02-02 11:14:28.861919
duration_ms: 35
database: default
table: log_mt_2
part_name: all_1_1_0
partition_id: all
path_on_disk: db/data/default/log_mt_2/all_1_1_0/
rows: 115418
size_in_bytes: 1074311
merged_from: []
bytes_uncompressed: 0
read_rows: 0
read_bytes: 0
peak_memory_usage: 0
error: 0
exception:
```
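Beyond inspecting single rows, the columns listed above lend themselves to simple aggregation. The sketch below summarizes recent activity per event type and then lists failed events; the seven-day window and the result aliases are arbitrary choices for illustration.

``` sql
-- Summary of part events over the last 7 days, grouped by event type.
SELECT
    event_type,
    count() AS events,
    sum(rows) AS total_rows,
    max(duration_ms) AS max_duration_ms
FROM system.part_log
WHERE event_date >= today() - 7
GROUP BY event_type
ORDER BY events DESC;

-- Events that ended with an error, most recent first.
SELECT event_time, database, table, part_name, error, exception
FROM system.part_log
WHERE error != 0
ORDER BY event_time DESC
LIMIT 10;
```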
[Original article](https://clickhouse.tech/docs/en/operations/system-tables/part_log) <!--hide-->


@ -0,0 +1,545 @@
---
toc_priority: 37
toc_title: SHOW
---
# SHOW Statements {#show-queries}
## SHOW CREATE TABLE {#show-create-table}
``` sql
SHOW CREATE [TEMPORARY] [TABLE|DICTIONARY] [db.]table [INTO OUTFILE filename] [FORMAT format]
```
Returns a single column named `statement` of type `String`, containing a single value: the `CREATE` query used to create the specified object.
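A minimal usage sketch follows. The table `default.orders` is hypothetical and stands in for any existing table; `FORMAT Vertical` is used only to keep a long statement readable.

``` sql
-- `default.orders` is a placeholder name; substitute an existing table.
SHOW CREATE TABLE default.orders FORMAT Vertical;
```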
## SHOW DATABASES {#show-databases}
Prints a list of all databases.
```sql
SHOW DATABASES [LIKE | ILIKE | NOT LIKE '<pattern>'] [LIMIT <N>] [INTO OUTFILE filename] [FORMAT format]
```
This statement is identical to the query:
```sql
SELECT name FROM system.databases [WHERE name LIKE | ILIKE | NOT LIKE '<pattern>'] [LIMIT <N>] [INTO OUTFILE filename] [FORMAT format]
```
### Examples {#examples}
Getting database names that contain the character sequence 'de':
``` sql
SHOW DATABASES LIKE '%de%'
```
Result:
``` text
┌─name────┐
│ default │
└─────────┘
```
Getting database names that contain the character sequence 'de', matched case-insensitively:
``` sql
SHOW DATABASES ILIKE '%DE%'
```
Result:
``` text
┌─name────┐
│ default │
└─────────┘
```
Getting database names that do not contain the character sequence 'de':
``` sql
SHOW DATABASES NOT LIKE '%de%'
```
Result:
``` text
┌─name───────────────────────────┐
│ _temporary_and_external_tables │
│ system │
│ test │
│ tutorial │
└────────────────────────────────┘
```
Getting the first two database names:
``` sql
SHOW DATABASES LIMIT 2
```
Result:
``` text
┌─name───────────────────────────┐
│ _temporary_and_external_tables │
│ default │
└────────────────────────────────┘
```
### See Also {#see-also}
- [CREATE DATABASE](https://clickhouse.tech/docs/en/sql-reference/statements/create/database/#query-language-create-database)
## SHOW PROCESSLIST {#show-processlist}
``` sql
SHOW PROCESSLIST [INTO OUTFILE filename] [FORMAT format]
```
Outputs the content of the [system.processes](../../operations/system-tables/processes.md#system_tables-processes) table, which contains a list of queries that are being processed at the moment, except for `SHOW PROCESSLIST` queries.
The `SELECT * FROM system.processes` query returns data about all the current queries.
Tip (execute in the console):
``` bash
$ watch -n1 "clickhouse-client --query='SHOW PROCESSLIST'"
```
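When only a few fields matter, `system.processes` can be queried directly instead. The sketch below assumes the commonly available columns `query_id`, `user`, `elapsed`, `memory_usage` and `query`; check them against your server version before relying on them.

``` sql
-- Longest-running queries first.
SELECT query_id, user, elapsed, memory_usage, query
FROM system.processes
ORDER BY elapsed DESC
LIMIT 10;
```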
## SHOW TABLES {#show-tables}
Displays a list of tables.
```sql
SHOW [TEMPORARY] TABLES [{FROM | IN} <db>] [LIKE | ILIKE | NOT LIKE '<pattern>'] [LIMIT <N>] [INTO OUTFILE <filename>] [FORMAT <format>]
```
If the `FROM` clause is not specified, the query returns the list of tables from the current database.
This statement is identical to the query:
```sql
SELECT name FROM system.tables [WHERE name LIKE | ILIKE | NOT LIKE '<pattern>'] [LIMIT <N>] [INTO OUTFILE <filename>] [FORMAT <format>]
```
### Examples {#examples}
Getting table names that contain the character sequence 'user':
``` sql
SHOW TABLES FROM system LIKE '%user%'
```
Result:
``` text
┌─name─────────────┐
│ user_directories │
│ users │
└──────────────────┘
```
Getting table names that contain the character sequence 'user', matched case-insensitively:
``` sql
SHOW TABLES FROM system ILIKE '%USER%'
```
Result:
``` text
┌─name─────────────┐
│ user_directories │
│ users │
└──────────────────┘
```
Getting table names that do not contain the character 's':
``` sql
SHOW TABLES FROM system NOT LIKE '%s%'
```
Result:
``` text
┌─name─────────┐
│ metric_log │
│ metric_log_0 │
│ metric_log_1 │
└──────────────┘
```
Getting the first two table names:
``` sql
SHOW TABLES FROM system LIMIT 2
```
Result:
``` text
┌─name───────────────────────────┐
│ aggregate_function_combinators │
│ asynchronous_metric_log │
└────────────────────────────────┘
```
### See Also {#see-also}
- [Create Tables](https://clickhouse.tech/docs/en/getting-started/tutorial/#create-tables)
- [SHOW CREATE TABLE](https://clickhouse.tech/docs/en/sql-reference/statements/show/#show-create-table)
## SHOW COLUMNS {#show_columns}
Displays a list of columns.
```sql
SHOW [EXTENDED] [FULL] COLUMNS {FROM | IN} <table> [{FROM | IN} <db>] [{[NOT] {LIKE | ILIKE} '<pattern>' | WHERE <expr>}] [LIMIT <N>] [INTO OUTFILE <filename>] [FORMAT <format>]
```
The database and table name can be specified in abbreviated form as `<db>.<table>`, i.e. `FROM tab FROM db` and `FROM db.tab` are
equivalent. If no database is specified, the query returns the list of columns from the current database.
The optional keyword `EXTENDED` currently has no effect; it exists only for MySQL compatibility.
The optional keyword `FULL` causes the output to include the collation, comment and privilege columns.
The statement produces a result table with the following structure:
- `field` - The name of the column (String)
- `type` - The column data type. If the query was made through the MySQL wire protocol, then the equivalent type name in MySQL is shown. (String)
- `null` - `YES` if the column data type is Nullable, `NO` otherwise (String)
- `key` - `PRI` if the column is part of the primary key, `SOR` if the column is part of the sorting key, empty otherwise (String)
- `default` - Default expression of the column if it is of type `ALIAS`, `DEFAULT`, or `MATERIALIZED`, otherwise `NULL`. (Nullable(String))
- `extra` - Additional information, currently unused (String)
- `collation` - (only if `FULL` keyword was specified) Collation of the column, always `NULL` because ClickHouse has no per-column collations (Nullable(String))
- `comment` - (only if `FULL` keyword was specified) Comment on the column (String)
- `privilege` - (only if `FULL` keyword was specified) The privilege you have on this column, currently not available (String)
**Examples**
Getting information about all columns in table 'orders' whose names start with 'delivery_':
```sql
SHOW COLUMNS FROM 'orders' LIKE 'delivery_%'
```
Result:
``` text
┌─field───────────┬─type─────┬─null─┬─key─────┬─default─┬─extra─┐
│ delivery_date │ DateTime │ 0 │ PRI SOR │ ᴺᵁᴸᴸ │ │
│ delivery_status │ Bool │ 0 │ │ ᴺᵁᴸᴸ │ │
└─────────────────┴──────────┴──────┴─────────┴─────────┴───────┘
```
**See also**
- [system.columns](https://clickhouse.com/docs/en/operations/system-tables/columns)
## SHOW DICTIONARIES {#show-dictionaries}
Displays a list of [external dictionaries](../../sql-reference/dictionaries/external-dictionaries/external-dicts.md).
``` sql
SHOW DICTIONARIES [FROM <db>] [LIKE '<pattern>'] [LIMIT <N>] [INTO OUTFILE <filename>] [FORMAT <format>]
```
If the `FROM` clause is not specified, the query returns the list of dictionaries from the current database.
You can get the same results as the `SHOW DICTIONARIES` query in the following way:
``` sql
SELECT name FROM system.dictionaries WHERE database = <db> [AND name LIKE <pattern>] [LIMIT <N>] [INTO OUTFILE <filename>] [FORMAT <format>]
```
**Examples**
The following query selects the first two rows from the list of dictionaries in the `db` database whose names contain `reg`.
``` sql
SHOW DICTIONARIES FROM db LIKE '%reg%' LIMIT 2
```
``` text
┌─name─────────┐
│ regions │
│ region_names │
└──────────────┘
```
## SHOW GRANTS {#show-grants-statement}
Shows privileges for a user.
### Syntax {#show-grants-syntax}
``` sql
SHOW GRANTS [FOR user1 [, user2 ...]]
```
If no user is specified, the query returns privileges for the current user.
## SHOW CREATE USER {#show-create-user-statement}
Shows the parameters that were used at [user creation](../../sql-reference/statements/create/user.md).
`SHOW CREATE USER` does not output user passwords.
### Syntax {#show-create-user-syntax}
``` sql
SHOW CREATE USER [name1 [, name2 ...] | CURRENT_USER]
```
## SHOW CREATE ROLE {#show-create-role-statement}
Shows the parameters that were used at [role creation](../../sql-reference/statements/create/role.md).
### Syntax {#show-create-role-syntax}
``` sql
SHOW CREATE ROLE name1 [, name2 ...]
```
## SHOW CREATE ROW POLICY {#show-create-row-policy-statement}
Shows the parameters that were used at [row policy creation](../../sql-reference/statements/create/row-policy.md).
### Syntax {#show-create-row-policy-syntax}
``` sql
SHOW CREATE [ROW] POLICY name ON [database1.]table1 [, [database2.]table2 ...]
```
## SHOW CREATE QUOTA {#show-create-quota-statement}
Shows the parameters that were used at [quota creation](../../sql-reference/statements/create/quota.md).
### Syntax {#show-create-quota-syntax}
``` sql
SHOW CREATE QUOTA [name1 [, name2 ...] | CURRENT]
```
## SHOW CREATE SETTINGS PROFILE {#show-create-settings-profile-statement}
Shows the parameters that were used at [settings profile creation](../../sql-reference/statements/create/settings-profile.md).
### Syntax {#show-create-settings-profile-syntax}
``` sql
SHOW CREATE [SETTINGS] PROFILE name1 [, name2 ...]
```
## SHOW USERS {#show-users-statement}
Returns a list of [user account](../../operations/access-rights.md#user-account-management) names. To view user account parameters, see the system table [system.users](../../operations/system-tables/users.md#system_tables-users).
### Syntax {#show-users-syntax}
`SHOW USERS`
## SHOW ROLES {#show-roles-statement}
Returns a list of [roles](../../operations/access-rights.md#role-management). To view other parameters, see the system tables [system.roles](../../operations/system-tables/roles.md#system_tables-roles) and [system.role-grants](../../operations/system-tables/role-grants.md#system_tables-role_grants).
### Syntax {#show-roles-syntax}
``` sql
SHOW [CURRENT|ENABLED] ROLES
```
## SHOW PROFILES {#show-profiles-statement}
Returns a list of [setting profiles](../../operations/access-rights.md#settings-profiles-management). To view settings profile parameters, see the system table [settings_profiles](../../operations/system-tables/settings_profiles.md#system_tables-settings_profiles).
### Syntax {#show-profiles-syntax}
``` sql
SHOW [SETTINGS] PROFILES
```
## SHOW POLICIES {#show-policies-statement}
Returns a list of [row policies](../../operations/access-rights.md#row-policy-management) for the specified table. To view row policy parameters, see the system table [system.row_policies](../../operations/system-tables/row_policies.md#system_tables-row_policies).
### Syntax {#show-policies-syntax}
``` sql
SHOW [ROW] POLICIES [ON [db.]table]
```
## SHOW QUOTAS {#show-quotas-statement}
Returns a list of [quotas](../../operations/access-rights.md#quotas-management). To view quota parameters, see the system table [system.quotas](../../operations/system-tables/quotas.md#system_tables-quotas).
### Syntax {#show-quotas-syntax}
``` sql
SHOW QUOTAS
```
## SHOW QUOTA {#show-quota-statement}
Returns [quota](../../operations/quotas.md) consumption for all users or for the current user. To view other parameters, see the system tables [system.quotas_usage](../../operations/system-tables/quotas_usage.md#system_tables-quotas_usage) and [system.quota_usage](../../operations/system-tables/quota_usage.md#system_tables-quota_usage).
### Syntax {#show-quota-syntax}
``` sql
SHOW [CURRENT] QUOTA
```
## SHOW ACCESS {#show-access-statement}
Shows all [users](../../operations/access-rights.md#user-account-management), [roles](../../operations/access-rights.md#role-management), [profiles](../../operations/access-rights.md#settings-profiles-management), etc. and all their [grants](../../sql-reference/statements/grant.md#grant-privileges).
### Syntax {#show-access-syntax}
``` sql
SHOW ACCESS
```
## SHOW CLUSTER(s) {#show-cluster-statement}
Returns a list of clusters. All available clusters are listed in the [system.clusters](../../operations/system-tables/clusters.md) table.
!!! info "Note"
`SHOW CLUSTER name` query displays the contents of system.clusters table for this cluster.
### Syntax {#show-cluster-syntax}
``` sql
SHOW CLUSTER '<name>'
SHOW CLUSTERS [LIKE|NOT LIKE '<pattern>'] [LIMIT <N>]
```
### Examples {#show-cluster-examples}
Query:
``` sql
SHOW CLUSTERS;
```
Result:
```text
┌─cluster──────────────────────────────────────┐
│ test_cluster_two_shards │
│ test_cluster_two_shards_internal_replication │
│ test_cluster_two_shards_localhost │
│ test_shard_localhost │
│ test_shard_localhost_secure │
│ test_unavailable_shard │
└──────────────────────────────────────────────┘
```
Query:
``` sql
SHOW CLUSTERS LIKE 'test%' LIMIT 1;
```
Result:
```text
┌─cluster─────────────────┐
│ test_cluster_two_shards │
└─────────────────────────┘
```
Query:
``` sql
SHOW CLUSTER 'test_shard_localhost' FORMAT Vertical;
```
Result:
```text
Row 1:
──────
cluster: test_shard_localhost
shard_num: 1
shard_weight: 1
replica_num: 1
host_name: localhost
host_address: 127.0.0.1
port: 9000
is_local: 1
user: default
default_database:
errors_count: 0
estimated_recovery_time: 0
```
## SHOW SETTINGS {#show-settings}
Returns a list of system settings and their values. Selects data from the [system.settings](../../operations/system-tables/settings.md) table.
**Syntax**
```sql
SHOW [CHANGED] SETTINGS LIKE|ILIKE <name>
```
**Clauses**
`LIKE|ILIKE` specifies a matching pattern for the setting name. The pattern can contain globs such as `%` or `_`. The `LIKE` clause is case-sensitive; the `ILIKE` clause is case-insensitive.
When the `CHANGED` clause is used, the query returns only settings changed from their default values.
**Examples**
Query with the `LIKE` clause:
```sql
SHOW SETTINGS LIKE 'send_timeout';
```
Result:
```text
┌─name─────────┬─type────┬─value─┐
│ send_timeout │ Seconds │ 300 │
└──────────────┴─────────┴───────┘
```
Query with the `ILIKE` clause:
```sql
SHOW SETTINGS ILIKE '%CONNECT_timeout%'
```
Result:
```text
┌─name────────────────────────────────────┬─type─────────┬─value─┐
│ connect_timeout │ Seconds │ 10 │
│ connect_timeout_with_failover_ms │ Milliseconds │ 50 │
│ connect_timeout_with_failover_secure_ms │ Milliseconds │ 100 │
└─────────────────────────────────────────┴──────────────┴───────┘
```
Query with the `CHANGED` clause:
```sql
SHOW CHANGED SETTINGS ILIKE '%MEMORY%'
```
Result:
```text
┌─name─────────────┬─type───┬─value───────┐
│ max_memory_usage │ UInt64 │ 10000000000 │
└──────────────────┴────────┴─────────────┘
```
## SHOW SETTING
``` sql
SHOW SETTING <name>
```
Outputs the value of the specified setting.
**See Also**
- [system.settings](../../operations/system-tables/settings.md) table
[Original article](https://clickhouse.tech/docs/en/sql-reference/statements/show/) <!--hide-->

View File

@ -21,6 +21,7 @@
#include <Columns/ColumnAggregateFunction.h>
#include <Columns/ColumnsCommon.h>
#include <Columns/MaskOperations.h>
#include <Common/assert_cast.h>
#include <DataStreams/ColumnGathererStream.h>
#include <IO/WriteBufferFromArena.h>
@ -371,6 +372,11 @@ ColumnPtr ColumnAggregateFunction::filter(const Filter & filter, ssize_t result_
return res;
}
void ColumnAggregateFunction::expand(const Filter & mask, bool inverted)
{
expandDataByMask<char *>(data, mask, inverted);
}
ColumnPtr ColumnAggregateFunction::permute(const Permutation & perm, size_t limit) const
{
size_t size = data.size();

View File

@ -205,6 +205,8 @@ public:
ColumnPtr filter(const Filter & filter, ssize_t result_size_hint) const override;
void expand(const Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;

View File

@ -29,6 +29,7 @@
#include <Columns/ColumnConst.h>
#include <Columns/ColumnsCommon.h>
#include <Columns/ColumnCompressed.h>
#include <Columns/MaskOperations.h>
#include <common/unaligned.h>
#include <common/sort.h>
@ -616,6 +617,34 @@ ColumnPtr ColumnArray::filter(const Filter & filt, ssize_t result_size_hint) con
return filterGeneric(filt, result_size_hint);
}
void ColumnArray::expand(const IColumn::Filter & mask, bool inverted)
{
auto & offsets_data = getOffsets();
if (mask.size() < offsets_data.size())
throw Exception("Mask size should be no less than data size.", ErrorCodes::LOGICAL_ERROR);
int index = mask.size() - 1;
int from = offsets_data.size() - 1;
offsets_data.resize(mask.size());
UInt64 last_offset = offsets_data[from];
while (index >= 0)
{
offsets_data[index] = last_offset;
if (mask[index] ^ inverted)
{
if (from < 0)
throw Exception("Too many bytes in mask", ErrorCodes::LOGICAL_ERROR);
--from;
last_offset = offsets_data[from];
}
--index;
}
if (from != -1)
throw Exception("Not enough bytes in mask", ErrorCodes::LOGICAL_ERROR);
}
template <typename T>
ColumnPtr ColumnArray::filterNumber(const Filter & filt, ssize_t result_size_hint) const
{
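Note on `ColumnArray::expand` above: only the offsets are expanded, so rows that the mask does not select become empty arrays that repeat the previous cumulative offset. The sketch below replays that backward walk on a plain `std::vector`. It is an illustration under stated assumptions, not the ClickHouse code: `expandOffsetsByMask` is a hypothetical name, and the "offset before the first row" is spelled out as 0 rather than read from index -1 of the container.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the offsets of an array column:
// offsets[i] is the cumulative element count through row i.
void expandOffsetsByMask(std::vector<std::uint64_t> & offsets, const std::vector<std::uint8_t> & mask, bool inverted)
{
    if (mask.size() < offsets.size())
        throw std::logic_error("Mask size should be no less than data size");

    std::ptrdiff_t from = static_cast<std::ptrdiff_t>(offsets.size()) - 1;
    offsets.resize(mask.size());

    // Assumption of this sketch: the offset before row 0 is 0, handled explicitly.
    auto offsetAt = [&offsets](std::ptrdiff_t i) -> std::uint64_t
    {
        return i < 0 ? 0 : offsets[static_cast<std::size_t>(i)];
    };

    std::uint64_t last_offset = offsetAt(from);
    for (std::ptrdiff_t index = static_cast<std::ptrdiff_t>(mask.size()) - 1; index >= 0; --index)
    {
        const auto i = static_cast<std::size_t>(index);
        offsets[i] = last_offset;
        if ((mask[i] != 0) != inverted)
        {
            if (from < 0)
                throw std::logic_error("Too many bytes in mask");
            // This row consumed one original array; step to the previous one.
            --from;
            last_offset = offsetAt(from);
        }
    }

    if (from != -1)
        throw std::logic_error("Not enough bytes in mask");
}
```

With offsets `{2, 5}` (arrays of 2 and 3 elements) and mask `{1, 0, 1}`, the offsets become `{2, 2, 5}`: the middle row is an empty array and the nested data is left untouched.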

View File

@ -109,6 +109,7 @@ public:
void insertDefault() override;
void popBack(size_t n) override;
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;
template <typename Type> ColumnPtr indexImpl(const PaddedPODArray<Type> & indexes, size_t limit) const;

View File

@ -20,6 +20,7 @@
#include <Columns/ColumnString.h>
#include <Columns/ColumnBitMap64.h>
#include <Columns/ColumnsCommon.h>
#include <Columns/MaskOperations.h>
#include <DataStreams/ColumnGathererStream.h>
#include <Common/HashTable/Hash.h>
#include <Common/WeakHash.h>
@ -199,6 +200,13 @@ ColumnPtr ColumnBitMap64::filter(const Filter & filt, ssize_t result_size_hint)
return res;
}
void ColumnBitMap64::expand(const Filter & mask, bool inverted)
{
auto & chars_data = getChars();
auto & offsets_data = getOffsets();
expandStringDataByMask(chars_data, offsets_data, mask, inverted);
}
ColumnPtr ColumnBitMap64::permute(const Permutation & perm, size_t limit) const
{
size_t size = offsets.size();

View File

@ -262,6 +262,8 @@ public:
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const Filter & /*mask*/, bool /*inverted*/) override;
double getRatioOfDefaultRows(double) const override
{
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Method getRatioOfDefaultRows is not supported for {}", getName());
@ -342,6 +344,7 @@ public:
// Throws an exception if offsets/chars are messed up
void validate() const;
};
}

View File

@ -113,6 +113,7 @@ public:
void updateWeakHash32(WeakHash32 &) const override { throwMustBeDecompressed(); }
void updateHashFast(SipHash &) const override { throwMustBeDecompressed(); }
ColumnPtr filter(const Filter &, ssize_t) const override { throwMustBeDecompressed(); }
void expand(const Filter &, bool) override { throwMustBeDecompressed(); }
ColumnPtr permute(const Permutation &, size_t) const override { throwMustBeDecompressed(); }
ColumnPtr index(const IColumn &, size_t) const override { throwMustBeDecompressed(); }
int compareAt(size_t, size_t, const IColumn &, int) const override { throwMustBeDecompressed(); }

View File

@ -59,9 +59,28 @@ ColumnPtr ColumnConst::filter(const Filter & filt, ssize_t /*result_size_hint*/)
throw Exception("Size of filter (" + toString(filt.size()) + ") doesn't match size of column (" + toString(s) + ")",
ErrorCodes::SIZES_OF_COLUMNS_DOESNT_MATCH);
return ColumnConst::create(data, countBytesInFilter(filt));
size_t new_size = countBytesInFilter(filt);
return ColumnConst::create(data, new_size);
}
void ColumnConst::expand(const Filter & mask, bool inverted)
{
if (mask.size() < s)
throw Exception("Mask size should be no less than data size.", ErrorCodes::LOGICAL_ERROR);
size_t bytes_count = countBytesInFilter(mask);
if (inverted)
bytes_count = mask.size() - bytes_count;
if (bytes_count < s)
throw Exception("Not enough bytes in mask", ErrorCodes::LOGICAL_ERROR);
else if (bytes_count > s)
throw Exception("Too many bytes in mask", ErrorCodes::LOGICAL_ERROR);
s = mask.size();
}
ColumnPtr ColumnConst::replicate(const Offsets & offsets) const
{
if (s != offsets.size())
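Note on `ColumnConst::expand` above: a constant column stores one value plus a row count, so expanding it only has to validate the mask and adopt the mask length as the new size. A minimal sketch of that check, with a hypothetical free function `expandedConstSize` standing in for the member function:

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: returns the row count a constant column of `current_size`
// rows would have after expansion by `mask`.
std::size_t expandedConstSize(std::size_t current_size, const std::vector<std::uint8_t> & mask, bool inverted)
{
    if (mask.size() < current_size)
        throw std::logic_error("Mask size should be no less than data size");

    // Count the rows the mask selects (non-zero bytes, or zero bytes when inverted).
    std::size_t selected = 0;
    for (std::uint8_t byte : mask)
        if (byte != 0)
            ++selected;
    if (inverted)
        selected = mask.size() - selected;

    // The selected rows must correspond one-to-one to the existing rows.
    if (selected < current_size)
        throw std::logic_error("Not enough bytes in mask");
    if (selected > current_size)
        throw std::logic_error("Too many bytes in mask");

    return mask.size();
}
```

No data moves here; the only work is the consistency check between the mask and the current size.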

View File

@ -213,6 +213,7 @@ public:
}
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const Filter & mask, bool inverted) override;
ColumnPtr replicate(const Offsets & offsets) const override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;

View File

@ -36,6 +36,7 @@
#include <Columns/ColumnsCommon.h>
#include <Columns/ColumnDecimal.h>
#include <Columns/ColumnCompressed.h>
#include <Columns/MaskOperations.h>
#include <DataStreams/ColumnGathererStream.h>
@ -371,6 +372,12 @@ ColumnPtr ColumnDecimal<T>::filter(const IColumn::Filter & filt, ssize_t result_
return res;
}
template <typename T>
void ColumnDecimal<T>::expand(const IColumn::Filter & mask, bool inverted)
{
expandDataByMask<T>(data, mask, inverted);
}
template <typename T>
ColumnPtr ColumnDecimal<T>::index(const IColumn & indexes, size_t limit) const
{

View File

@ -175,6 +175,7 @@ public:
bool isDefaultAt(size_t n) const override { return data[n].value == 0; }
ColumnPtr filter(const IColumn::Filter & filt, ssize_t result_size_hint) const override;
void expand(const IColumn::Filter & mask, bool inverted) override;
ColumnPtr permute(const IColumn::Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;

View File

@ -309,6 +309,32 @@ ColumnPtr ColumnFixedString::filter(const IColumn::Filter & filt, ssize_t result
return res;
}
void ColumnFixedString::expand(const IColumn::Filter & mask, bool inverted)
{
if (mask.size() < size())
throw Exception("Mask size should be no less than data size.", ErrorCodes::LOGICAL_ERROR);
int index = mask.size() - 1;
int from = size() - 1;
chars.resize_fill(mask.size() * n, 0);
while (index >= 0)
{
if (mask[index] ^ inverted)
{
if (from < 0)
throw Exception("Too many bytes in mask", ErrorCodes::LOGICAL_ERROR);
memcpy(&chars[index * n], &chars[from * n], n);
--from;
}
--index;
}
if (from != -1)
throw Exception("Not enough bytes in mask", ErrorCodes::LOGICAL_ERROR);
}
ColumnPtr ColumnFixedString::permute(const Permutation & perm, size_t limit) const
{
size_t col_size = size();
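Note on `ColumnFixedString::expand` above: the rows are moved from the back of the buffer so that no source row is overwritten before it has been copied. The sketch below shows the same idea on a `std::vector<char>`. The function name `expandFixedRowsByMask` is hypothetical, and, as a choice of this sketch only, the copy is skipped when source and destination are the same row, because `std::memcpy` requires non-overlapping ranges.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a fixed-width string column: `chars` holds rows of n bytes each.
// After the call, row i holds the next original row wherever the (optionally inverted)
// mask selects i. Unselected rows keep whatever bytes happen to be there.
void expandFixedRowsByMask(std::vector<char> & chars, std::size_t n, const std::vector<std::uint8_t> & mask, bool inverted)
{
    if (n == 0 || chars.size() % n != 0)
        throw std::logic_error("Buffer size must be a multiple of a non-zero row width");

    const std::size_t rows = chars.size() / n;
    if (mask.size() < rows)
        throw std::logic_error("Mask size should be no less than data size");

    std::ptrdiff_t from = static_cast<std::ptrdiff_t>(rows) - 1;
    chars.resize(mask.size() * n, 0);  // newly added bytes start zero-filled

    for (std::ptrdiff_t index = static_cast<std::ptrdiff_t>(mask.size()) - 1; index >= 0; --index)
    {
        const auto i = static_cast<std::size_t>(index);
        if ((mask[i] != 0) != inverted)
        {
            if (from < 0)
                throw std::logic_error("Too many bytes in mask");
            // index >= from always holds here, so distinct rows never overlap.
            if (index != from)
                std::memcpy(&chars[i * n], &chars[static_cast<std::size_t>(from) * n], n);
            --from;
        }
    }

    if (from != -1)
        throw std::logic_error("Not enough bytes in mask");
}
```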

View File

@ -179,6 +179,7 @@ public:
ColumnPtr filter(const IColumn::Filter & filt, ssize_t result_size_hint) const override;
void expand(const IColumn::Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;

View File

@ -2,9 +2,15 @@
#include <Columns/ColumnFunction.h>
#include <Columns/ColumnsCommon.h>
#include <Common/PODArray.h>
#include <Common/ProfileEvents.h>
#include <IO/WriteHelpers.h>
#include <Functions/IFunction.h>
namespace ProfileEvents
{
extern const Event FunctionExecute;
extern const Event CompiledFunctionExecute;
}
namespace DB
{
@ -15,8 +21,8 @@ namespace ErrorCodes
extern const int LOGICAL_ERROR;
}
ColumnFunction::ColumnFunction(size_t size, FunctionBasePtr function_, const ColumnsWithTypeAndName & columns_to_capture)
: size_(size), function(function_)
ColumnFunction::ColumnFunction(size_t size, FunctionBasePtr function_, const ColumnsWithTypeAndName & columns_to_capture, bool is_short_circuit_argument_, bool is_function_compiled_)
: size_(size), function(function_), is_short_circuit_argument(is_short_circuit_argument_), is_function_compiled(is_function_compiled_)
{
appendArguments(columns_to_capture);
}
@ -27,7 +33,7 @@ MutableColumnPtr ColumnFunction::cloneResized(size_t size) const
for (auto & column : capture)
column.column = column.column->cloneResized(size);
return ColumnFunction::create(size, function, capture);
return ColumnFunction::create(size, function, capture, is_short_circuit_argument, is_function_compiled);
}
ColumnPtr ColumnFunction::replicate(const Offsets & offsets) const
@ -41,7 +47,7 @@ ColumnPtr ColumnFunction::replicate(const Offsets & offsets) const
column.column = column.column->replicate(offsets);
size_t replicated_size = 0 == size_ ? 0 : offsets.back();
return ColumnFunction::create(replicated_size, function, capture);
return ColumnFunction::create(replicated_size, function, capture, is_short_circuit_argument, is_function_compiled);
}
ColumnPtr ColumnFunction::cut(size_t start, size_t length) const
@ -50,7 +56,7 @@ ColumnPtr ColumnFunction::cut(size_t start, size_t length) const
for (auto & column : capture)
column.column = column.column->cut(start, length);
return ColumnFunction::create(length, function, capture);
return ColumnFunction::create(length, function, capture, is_short_circuit_argument, is_function_compiled);
}
ColumnPtr ColumnFunction::filter(const Filter & filt, ssize_t result_size_hint) const
@ -65,11 +71,24 @@ ColumnPtr ColumnFunction::filter(const Filter & filt, ssize_t result_size_hint)
size_t filtered_size = 0;
if (capture.empty())
{
filtered_size = countBytesInFilter(filt);
}
else
filtered_size = capture.front().column->size();
return ColumnFunction::create(filtered_size, function, capture);
return ColumnFunction::create(filtered_size, function, capture, is_short_circuit_argument, is_function_compiled);
}
void ColumnFunction::expand(const Filter & mask, bool inverted)
{
for (auto & column : captured_columns)
{
column.column = column.column->cloneResized(column.column->size());
column.column->assumeMutable()->expand(mask, inverted);
}
size_ = mask.size();
}
ColumnPtr ColumnFunction::permute(const Permutation & perm, size_t limit) const
@ -87,7 +106,7 @@ ColumnPtr ColumnFunction::permute(const Permutation & perm, size_t limit) const
for (auto & column : capture)
column.column = column.column->permute(perm, limit);
return ColumnFunction::create(limit, function, capture);
return ColumnFunction::create(limit, function, capture, is_short_circuit_argument, is_function_compiled);
}
ColumnPtr ColumnFunction::index(const IColumn & indexes, size_t limit) const
@ -96,7 +115,7 @@ ColumnPtr ColumnFunction::index(const IColumn & indexes, size_t limit) const
for (auto & column : capture)
column.column = column.column->index(indexes, limit);
return ColumnFunction::create(limit, function, capture);
return ColumnFunction::create(limit, function, capture, is_short_circuit_argument, is_function_compiled);
}
std::vector<MutableColumnPtr> ColumnFunction::scatter(IColumn::ColumnIndex num_columns,
@ -125,7 +144,7 @@ std::vector<MutableColumnPtr> ColumnFunction::scatter(IColumn::ColumnIndex num_c
{
auto & capture = captures[part];
size_t capture_size = capture.empty() ? counts[part] : capture.front().column->size();
columns.emplace_back(ColumnFunction::create(capture_size, function, std::move(capture)));
columns.emplace_back(ColumnFunction::create(capture_size, function, std::move(capture), is_short_circuit_argument));
}
return columns;
@ -179,7 +198,7 @@ void ColumnFunction::appendArgument(const ColumnWithTypeAndName & column)
const auto & argumnet_types = function->getArgumentTypes();
auto index = captured_columns.size();
if (!column.type->equals(*argumnet_types[index]))
if (!is_short_circuit_argument && !column.type->equals(*argumnet_types[index]))
throw Exception("Cannot capture column " + std::to_string(argumnet_types.size()) +
" because it has incompatible type: got " + column.type->getName() +
", but " + argumnet_types[index]->getName() + " is expected.", ErrorCodes::LOGICAL_ERROR);
@ -204,9 +223,26 @@ ColumnWithTypeAndName ColumnFunction::reduce() const
throw Exception("Cannot call function " + function->getName() + " because is has " + toString(args) +
"arguments but " + toString(captured) + " columns were captured.", ErrorCodes::LOGICAL_ERROR);
auto columns = captured_columns;
ColumnsWithTypeAndName columns = captured_columns;
IFunction::ShortCircuitSettings settings;
/// Arguments of lazy executed function can also be lazy executed.
/// But we shouldn't execute arguments if this function is short circuit,
/// because it will handle lazy executed arguments by itself.
if (is_short_circuit_argument && !function->isShortCircuit(settings, args))
{
for (auto & col : columns)
{
if (const ColumnFunction * arg = checkAndGetShortCircuitArgument(col.column))
col = arg->reduce();
}
}
ColumnWithTypeAndName res{nullptr, function->getResultType(), ""};
ProfileEvents::increment(ProfileEvents::FunctionExecute);
if (is_function_compiled)
ProfileEvents::increment(ProfileEvents::CompiledFunctionExecute);
res.column = function->execute(columns, res.type, size_);
return res;
}
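Note on the lazy-argument changes above: a `ColumnFunction` flagged as a short-circuit argument is a deferred computation that `reduce()` evaluates on demand. One plausible way such an argument is consumed, sketched here as an assumption rather than taken from this diff, is to filter the captured columns by a mask, evaluate only the surviving rows, and expand the result back to full length. `LazyColumn` and `executeOnMaskedRows` are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Column = std::vector<std::int64_t>;
using Mask = std::vector<std::uint8_t>;

// Hypothetical lazy argument: a function plus the columns it captured.
struct LazyColumn
{
    std::function<Column(const std::vector<Column> &)> function;
    std::vector<Column> captured;

    // Keep only the rows selected by the mask in every captured column.
    LazyColumn filter(const Mask & mask) const
    {
        LazyColumn res{function, {}};
        for (const auto & col : captured)
        {
            Column filtered;
            for (std::size_t i = 0; i < mask.size(); ++i)
                if (mask[i] != 0)
                    filtered.push_back(col[i]);
            res.captured.push_back(std::move(filtered));
        }
        return res;
    }

    // Run the deferred function on the captured columns.
    Column reduce() const { return function(captured); }
};

// Evaluate `lazy` only where the mask is set, then spread the results back
// over mask.size() rows; unselected rows receive `placeholder`.
Column executeOnMaskedRows(const LazyColumn & lazy, const Mask & mask, std::int64_t placeholder)
{
    const Column computed = lazy.filter(mask).reduce();
    Column res(mask.size(), placeholder);
    std::size_t from = 0;
    for (std::size_t i = 0; i < mask.size(); ++i)
        if (mask[i] != 0)
            res[i] = computed[from++];
    return res;
}
```

For a lazy division over `{10, 20, 30}` and `{2, 0, 5}` with mask `{1, 0, 1}`, the row with the zero divisor is never evaluated, which is the point of deferring the argument.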

View File

@ -5,9 +5,6 @@
#include <Core/ColumnsWithTypeAndName.h>
#include <Columns/IColumn.h>
class IFunctionBase;
using FunctionBasePtr = std::shared_ptr<IFunctionBase>;
namespace DB
{
@ -16,6 +13,8 @@ namespace ErrorCodes
extern const int NOT_IMPLEMENTED;
}
class IFunctionBase;
using FunctionBasePtr = std::shared_ptr<IFunctionBase>;
/** A column containing a lambda expression.
* Behaves like a constant-column. Contains an expression, but not input or output data.
@ -25,7 +24,12 @@ class ColumnFunction final : public COWHelper<IColumn, ColumnFunction>
private:
friend class COWHelper<IColumn, ColumnFunction>;
ColumnFunction(size_t size, FunctionBasePtr function_, const ColumnsWithTypeAndName & columns_to_capture);
ColumnFunction(
size_t size,
FunctionBasePtr function_,
const ColumnsWithTypeAndName & columns_to_capture,
bool is_short_circuit_argument_ = false,
bool is_function_compiled_ = false);
public:
const char * getFamilyName() const override { return "Function"; }
@ -38,6 +42,7 @@ public:
ColumnPtr cut(size_t start, size_t length) const override;
ColumnPtr replicate(const Offsets & offsets) const override;
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;
@ -177,14 +182,25 @@ public:
throw Exception(ErrorCodes::NOT_IMPLEMENTED, "Method getIndicesOfNonDefaultRows is not supported for {}", getName());
}
bool isShortCircuitArgument() const { return false; }
bool isShortCircuitArgument() const { return is_short_circuit_argument; }
private:
size_t size_;
FunctionBasePtr function;
ColumnsWithTypeAndName captured_columns;
/// Determine if it's used as a lazy executed argument for short-circuit function.
/// It's needed to distinguish between lazy executed argument and
/// argument with ColumnFunction column (some functions can return it)
/// See ExpressionActions.cpp for details.
bool is_short_circuit_argument;
/// Determine if passed function is compiled. Used for profiling.
bool is_function_compiled;
void appendArgument(const ColumnWithTypeAndName & column);
void addOffsetsForReplication(const IColumn::Offsets & offsets);
};
const ColumnFunction * checkAndGetShortCircuitArgument(const ColumnPtr & column);

View File

@ -251,6 +251,11 @@ public:
return ColumnLowCardinality::create(dictionary.getColumnUniquePtr(), getIndexes().filter(filt, result_size_hint));
}
void expand(const Filter & mask, bool inverted) override
{
idx.getPositionsPtr()->expand(mask, inverted);
}
ColumnPtr permute(const Permutation & perm, size_t limit) const override
{
if (full_state)

View File

@ -263,6 +263,11 @@ ColumnPtr ColumnMap::filter(const Filter & filt, ssize_t result_size_hint) const
return ColumnMap::create(filtered);
}
void ColumnMap::expand(const IColumn::Filter & mask, bool inverted)
{
nested->expand(mask, inverted);
}
ColumnPtr ColumnMap::permute(const Permutation & perm, size_t limit) const
{
auto permuted = nested->permute(perm, limit);

View File

@ -98,6 +98,7 @@ public:
void insertRangeFrom(const IColumn & src, size_t start, size_t length) override;
void insertRangeSelective(const IColumn & src, const Selector & selector, size_t selector_start, size_t length) override;
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;
ColumnPtr replicate(const Offsets & offsets) const override;

View File

@ -260,6 +260,12 @@ ColumnPtr ColumnNullable::filter(const Filter & filt, ssize_t result_size_hint)
return ColumnNullable::create(filtered_data, filtered_null_map);
}
void ColumnNullable::expand(const IColumn::Filter & mask, bool inverted)
{
nested_column->expand(mask, inverted);
null_map->expand(mask, inverted);
}
ColumnPtr ColumnNullable::permute(const Permutation & perm, size_t limit) const
{
ColumnPtr permuted_data = getNestedColumn().permute(perm, limit);

View File

@ -112,6 +112,7 @@ public:
void popBack(size_t n) override;
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;
int compareAt(size_t n, size_t m, const IColumn & rhs_, int null_direction_hint) const override;

View File

@ -13,6 +13,7 @@
#include <DataStreams/ColumnGathererStream.h>
#include <Interpreters/castColumn.h>
#include <Interpreters/convertFieldToType.h>
#include "Common/Exception.h"
#include <Common/HashTable/HashSet.h>
namespace DB
@ -26,6 +27,7 @@ namespace ErrorCodes
extern const int NUMBER_OF_DIMENSIONS_MISMATCHED;
extern const int SIZES_OF_COLUMNS_DOESNT_MATCH;
extern const int ARGUMENT_OUT_OF_BOUND;
extern const int UNSUPPORTED_METHOD;
}
namespace

View File

@ -250,6 +250,7 @@ public:
void updateHashWithValue(size_t, SipHash &) const override { throwMustBeConcrete(); }
void updateWeakHash32(WeakHash32 &) const override { throwMustBeConcrete(); }
void updateHashFast(SipHash &) const override { throwMustBeConcrete(); }
void expand(const Filter &, bool) override { throwMustBeConcrete(); }
bool hasEqualValues() const override { throwMustBeConcrete(); }
size_t byteSizeAt(size_t) const override { throwMustBeConcrete(); }
double getRatioOfDefaultRows(double) const override { throwMustBeConcrete(); }

View File

@ -4,6 +4,7 @@
#include <Columns/ColumnsCommon.h>
#include <Columns/ColumnCompressed.h>
#include <Columns/ColumnVector.h>
#include <Columns/MaskOperations.h>
#include <DataStreams/ColumnGathererStream.h>
#include <Common/Arena.h>
#include <Common/HashTable/Hash.h>
@ -192,6 +193,12 @@ ColumnPtr ColumnSketchBinary::filter(const Filter & filt, ssize_t result_size_hi
return res;
}
void ColumnSketchBinary::expand(const Filter & mask, bool inverted)
{
auto & chars_data = getChars();
auto & offsets_data = getOffsets();
expandStringDataByMask(chars_data, offsets_data, mask, inverted);
}
ColumnPtr ColumnSketchBinary::permute(const Permutation & perm, size_t limit) const
{

View File

@ -224,6 +224,8 @@ public:
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const IColumn::Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;

View File

@ -24,6 +24,7 @@
#include <Columns/Collator.h>
#include <Columns/ColumnsCommon.h>
#include <Columns/ColumnCompressed.h>
#include <Columns/MaskOperations.h>
#include <DataStreams/ColumnGathererStream.h>
#include <Common/Arena.h>
#include <Common/HashTable/Hash.h>
@ -214,6 +215,14 @@ ColumnPtr ColumnString::filter(const Filter & filt, ssize_t result_size_hint) co
return res;
}
void ColumnString::expand(const IColumn::Filter & mask, bool inverted)
{
auto & chars_data = getChars();
auto & offsets_data = getOffsets();
expandStringDataByMask(chars_data, offsets_data, mask, inverted);
}
ColumnPtr ColumnString::permute(const Permutation & perm, size_t limit) const
{
size_t size = offsets.size();

View File

@ -245,6 +245,8 @@ public:
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;

View File

@ -274,6 +274,12 @@ ColumnPtr ColumnTuple::filter(const Filter & filt, ssize_t result_size_hint) con
return ColumnTuple::create(new_columns);
}
void ColumnTuple::expand(const Filter & mask, bool inverted)
{
for (auto & column : columns)
column->expand(mask, inverted);
}
ColumnPtr ColumnTuple::permute(const Permutation & perm, size_t limit) const
{
const size_t tuple_size = columns.size();

View File

@ -91,6 +91,7 @@ public:
void insertRangeSelective(const IColumn & src, const IColumn::Selector & selector, size_t selector_start, size_t length) override;
ColumnPtr filter(const Filter & filt, ssize_t result_size_hint) const override;
void expand(const Filter & mask, bool inverted) override;
ColumnPtr permute(const Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;
ColumnPtr replicate(const Offsets & offsets) const override;

View File

@ -24,6 +24,7 @@
#include <pdqsort.h>
#include <Columns/ColumnsCommon.h>
#include <Columns/ColumnCompressed.h>
#include <Columns/MaskOperations.h>
#include <DataStreams/ColumnGathererStream.h>
#include <IO/WriteHelpers.h>
#include <Common/Arena.h>
@ -470,6 +471,12 @@ ColumnPtr ColumnVector<T>::filter(const IColumn::Filter & filt, ssize_t result_s
}
template <typename T>
void ColumnVector<T>::expand(const IColumn::Filter & mask, bool inverted)
{
expandDataByMask<T>(data, mask, inverted);
}
template <typename T>
void ColumnVector<T>::applyZeroMap(const IColumn::Filter & filt, bool inverted)
{

View File

@ -272,6 +272,7 @@ public:
return data[n];
}
void get(size_t n, Field & res) const override
{
res = (*this)[n];
@ -319,6 +320,8 @@ public:
ColumnPtr filter(const IColumn::Filter & filt, ssize_t result_size_hint) const override;
void expand(const IColumn::Filter & mask, bool inverted) override;
ColumnPtr permute(const IColumn::Permutation & perm, size_t limit) const override;
ColumnPtr index(const IColumn & indexes, size_t limit) const override;

View File

@ -23,6 +23,9 @@
#include <Columns/ColumnVector.h>
#include <Common/typeid_cast.h>
#include <Common/HashTable/HashSet.h>
#include <Columns/ColumnSketchBinary.h>
#include <Columns/ColumnString.h>
#include <Columns/ColumnBitMap64.h>
#include "ColumnsCommon.h"
@ -322,6 +325,7 @@ void filterArraysImplOnlyData(
}
/// Explicit instantiations - not to place the implementation of the function above in the header file.
#define INSTANTIATE(TYPE) \
template void filterArraysImpl<TYPE>( \


@ -3,7 +3,6 @@
#include <Columns/IColumn.h>
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnArray.h>
#include <Core/Field.h>


@ -282,12 +282,20 @@ public:
/** Removes elements that don't match the filter.
* Is used in WHERE and HAVING operations.
* If result_size_hint > 0, reserves result_size_hint elements in advance for the result column;
* if 0, does not call reserve();
* otherwise (i.e. < 0), calls reserve() using the size of the source column.
*/
using Filter = PaddedPODArray<UInt8>;
virtual Ptr filter(const Filter & filt, ssize_t result_size_hint) const = 0;
/** Expand column by mask in place. After expanding, the column
* satisfies the following: if we filter it by the given mask, we
* get the initial column back. Values at indexes i where mask[i] = 0
* shouldn't be used after expanding.
* If inverted is true, the inverted mask is used.
*/
virtual void expand(const Filter & /*mask*/, bool /*inverted*/) = 0;
/// Permutes elements using specified permutation. Is used in sorting.
/// limit - if it isn't 0, puts only first limit elements in the result.
using Permutation = PaddedPODArray<size_t>;
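The contract stated in the `expand` comment (filtering the expanded column by the same mask gives back the original) can be illustrated on plain `std::vector`s. This is a minimal sketch, not the `IColumn` implementation: `filterByMask` and `expandByMask` are hypothetical helper names, and positions where the mask is 0 are filled with a default-constructed value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: keep data[i] where the (optionally inverted) mask selects row i.
template <typename T>
std::vector<T> filterByMask(const std::vector<T> & data, const std::vector<uint8_t> & mask, bool inverted)
{
    assert(data.size() == mask.size());
    std::vector<T> result;
    for (size_t i = 0; i < mask.size(); ++i)
        if ((mask[i] != 0) != inverted)
            result.push_back(data[i]);
    return result;
}

// Hypothetical helper: spread the values of data over the rows selected by the mask.
// Walking backwards lets the expansion happen in place without overwriting unread values.
template <typename T>
void expandByMask(std::vector<T> & data, const std::vector<uint8_t> & mask, bool inverted)
{
    assert(mask.size() >= data.size());
    size_t remaining = data.size(); // values not yet moved to their final position
    data.resize(mask.size());
    for (size_t index = mask.size(); index-- > 0;)
    {
        if ((mask[index] != 0) != inverted)
        {
            assert(remaining > 0); // otherwise the mask selects more rows than data holds
            data[index] = data[--remaining];
        }
        else
            data[index] = T(); // placeholder; must not be used by callers
    }
    assert(remaining == 0); // otherwise the mask selects fewer rows than data holds
}
```

Expanding `{10, 20, 30}` by the mask `{1, 0, 1, 0, 1}` places the three values at rows 0, 2 and 4, and filtering the result by the same mask recovers the input.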


@ -127,7 +127,16 @@ public:
ColumnPtr filter(const Filter & filt, ssize_t /*result_size_hint*/) const override
{
size_t bytes = countBytesInFilter(filt);
return cloneDummy(bytes);
}
void expand(const IColumn::Filter & mask, bool inverted) override
{
size_t bytes = countBytesInFilter(mask);
if (inverted)
bytes = mask.size() - bytes;
s = bytes;
}
ColumnPtr permute(const Permutation & perm, size_t limit) const override


@ -164,6 +164,11 @@ public:
throw Exception("Method filter is not supported for ColumnUnique.", ErrorCodes::NOT_IMPLEMENTED);
}
void expand(const IColumn::Filter &, bool) override
{
throw Exception("Method expand is not supported for ColumnUnique.", ErrorCodes::NOT_IMPLEMENTED);
}
ColumnPtr permute(const IColumn::Permutation &, size_t) const override
{
throw Exception("Method permute is not supported for ColumnUnique.", ErrorCodes::NOT_IMPLEMENTED);


@ -0,0 +1,363 @@
#include <Columns/MaskOperations.h>
#include <Columns/ColumnFunction.h>
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnNothing.h>
#include <Columns/ColumnsCommon.h>
#include <Columns/ColumnConst.h>
#include <algorithm>
namespace DB
{
namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
extern const int ILLEGAL_COLUMN;
}
template <typename T>
void expandDataByMask(PaddedPODArray<T> & data, const PaddedPODArray<UInt8> & mask, bool inverted)
{
if (mask.size() < data.size())
throw Exception("Mask size should be no less than data size.", ErrorCodes::LOGICAL_ERROR);
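/// Signed indexes are required because the loop counts down past zero;
/// ssize_t avoids the truncation that int would cause on very large columns.
ssize_t from = data.size() - 1;
ssize_t index = mask.size() - 1;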
data.resize(mask.size());
while (index >= 0)
{
if (mask[index] ^ inverted)
{
if (from < 0)
throw Exception("Too many bytes in mask", ErrorCodes::LOGICAL_ERROR);
/// Copy only if it makes sense.
if (index != from)
data[index] = data[from];
--from;
}
else
data[index] = T();
--index;
}
if (from != -1)
throw Exception("Not enough bytes in mask", ErrorCodes::LOGICAL_ERROR);
}
/// Explicit instantiations - not to place the implementation of the function above in the header file.
#define INSTANTIATE(TYPE) \
template void expandDataByMask<TYPE>(PaddedPODArray<TYPE> &, const PaddedPODArray<UInt8> &, bool);
INSTANTIATE(UInt8)
INSTANTIATE(UInt16)
INSTANTIATE(UInt32)
INSTANTIATE(UInt64)
INSTANTIATE(UInt128)
INSTANTIATE(UInt256)
INSTANTIATE(Int8)
INSTANTIATE(Int16)
INSTANTIATE(Int32)
INSTANTIATE(Int64)
INSTANTIATE(Int128)
INSTANTIATE(Int256)
INSTANTIATE(Float32)
INSTANTIATE(Float64)
INSTANTIATE(Decimal32)
INSTANTIATE(Decimal64)
INSTANTIATE(Decimal128)
INSTANTIATE(Decimal256)
INSTANTIATE(DateTime64)
INSTANTIATE(char *)
INSTANTIATE(UUID)
INSTANTIATE(IPv4)
INSTANTIATE(IPv6)
#undef INSTANTIATE
void expandStringDataByMask(PaddedPODArray<UInt8> & chars_data, IColumn::Offsets & offsets_data, const IColumn::Filter & mask, bool inverted)
{
if (mask.size() < offsets_data.size())
throw Exception("Mask size should be no less than data size.", ErrorCodes::LOGICAL_ERROR);
/// We cannot change only offsets, because each string should end with terminating zero byte.
/// So, we will insert one zero byte when mask value is zero.
ssize_t index = mask.size() - 1;
ssize_t from = offsets_data.size() - 1;
/// mask.size() - offsets_data.size() should be equal to the number of zeros in mask
/// (if not, one of the exceptions below will be thrown), so we can calculate the resulting chars size.
UInt64 last_offset = offsets_data[from] + (mask.size() - offsets_data.size());
offsets_data.resize(mask.size());
chars_data.resize_fill(last_offset, 0);
while (index >= 0)
{
offsets_data[index] = last_offset;
if (mask[index] ^ inverted)
{
if (from < 0)
throw Exception("Too many bytes in mask", ErrorCodes::LOGICAL_ERROR);
size_t len = offsets_data[from] - offsets_data[from - 1];
/// Copy only if it makes sense. It's important to copy backward, because the
/// ranges can overlap, and the destination is always to the right of the source.
if (last_offset - len != offsets_data[from - 1])
std::copy_backward(&chars_data[offsets_data[from - 1]], &chars_data[offsets_data[from]], &chars_data[last_offset]);
last_offset -= len;
--from;
}
else
{
chars_data[last_offset - 1] = 0;
--last_offset;
}
--index;
}
if (from != -1)
throw Exception("Not enough bytes in mask", ErrorCodes::LOGICAL_ERROR);
}
template <bool inverted, bool column_is_short, typename Container>
size_t extractMaskNumericImpl(
PaddedPODArray<UInt8> & mask,
const Container & data,
UInt8 null_value,
const PaddedPODArray<UInt8> * null_bytemap,
PaddedPODArray<UInt8> * nulls)
{
size_t ones_count = 0;
size_t data_index = 0;
for (size_t i = 0; i != mask.size(); ++i)
{
// Change mask only where value is 1.
if (!mask[i])
continue;
UInt8 value;
size_t index;
if constexpr (column_is_short)
{
index = data_index;
++data_index;
}
else
index = i;
if (null_bytemap && (*null_bytemap)[index])
{
value = null_value;
if (nulls)
(*nulls)[i] = 1;
}
else
value = !!data[index];
if constexpr (inverted)
value = !value;
if (value)
++ones_count;
mask[i] = value;
}
return ones_count;
}
template <bool inverted, typename NumericType>
bool extractMaskNumeric(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
UInt8 null_value,
const PaddedPODArray<UInt8> * null_bytemap,
PaddedPODArray<UInt8> * nulls,
MaskInfo & mask_info)
{
const auto * numeric_column = checkAndGetColumn<ColumnVector<NumericType>>(column.get());
if (!numeric_column)
return false;
const auto & data = numeric_column->getData();
size_t ones_count;
if (column->size() < mask.size())
ones_count = extractMaskNumericImpl<inverted, true>(mask, data, null_value, null_bytemap, nulls);
else
ones_count = extractMaskNumericImpl<inverted, false>(mask, data, null_value, null_bytemap, nulls);
mask_info.has_ones = ones_count > 0;
mask_info.has_zeros = ones_count != mask.size();
return true;
}
template <bool inverted>
MaskInfo extractMaskFromConstOrNull(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
UInt8 null_value,
PaddedPODArray<UInt8> * nulls = nullptr)
{
UInt8 value;
if (column->onlyNull())
{
value = null_value;
if (nulls)
std::fill(nulls->begin(), nulls->end(), 1);
}
else
value = column->getBool(0);
if constexpr (inverted)
value = !value;
size_t ones_count = 0;
if (value)
ones_count = countBytesInFilter(mask);
else
std::fill(mask.begin(), mask.end(), 0);
return {.has_ones = ones_count > 0, .has_zeros = ones_count != mask.size()};
}
template <bool inverted>
MaskInfo extractMaskImpl(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
UInt8 null_value,
const PaddedPODArray<UInt8> * null_bytemap,
PaddedPODArray<UInt8> * nulls = nullptr)
{
/// Special implementation for Null and Const columns.
if (column->onlyNull() || checkAndGetColumn<ColumnConst>(*column))
return extractMaskFromConstOrNull<inverted>(mask, column, null_value, nulls);
if (const auto * col = checkAndGetColumn<ColumnNullable>(*column))
{
const PaddedPODArray<UInt8> & null_map = col->getNullMapData();
return extractMaskImpl<inverted>(mask, col->getNestedColumnPtr(), null_value, &null_map, nulls);
}
MaskInfo mask_info;
if (!(extractMaskNumeric<inverted, UInt8>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, UInt16>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, UInt32>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, UInt64>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, Int8>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, Int16>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, Int32>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, Int64>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, Float32>(mask, column, null_value, null_bytemap, nulls, mask_info)
|| extractMaskNumeric<inverted, Float64>(mask, column, null_value, null_bytemap, nulls, mask_info)))
throw Exception(ErrorCodes::ILLEGAL_COLUMN, "Cannot convert column {} to mask.", column->getName());
return mask_info;
}
MaskInfo extractMask(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
UInt8 null_value)
{
return extractMaskImpl<false>(mask, column, null_value, nullptr);
}
MaskInfo extractInvertedMask(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
UInt8 null_value)
{
return extractMaskImpl<true>(mask, column, null_value, nullptr);
}
MaskInfo extractMask(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
PaddedPODArray<UInt8> * nulls,
UInt8 null_value)
{
return extractMaskImpl<false>(mask, column, null_value, nullptr, nulls);
}
MaskInfo extractInvertedMask(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
PaddedPODArray<UInt8> * nulls,
UInt8 null_value)
{
return extractMaskImpl<true>(mask, column, null_value, nullptr, nulls);
}
void inverseMask(PaddedPODArray<UInt8> & mask, MaskInfo & mask_info)
{
for (size_t i = 0; i != mask.size(); ++i)
mask[i] = !mask[i];
std::swap(mask_info.has_ones, mask_info.has_zeros);
}
void maskedExecute(ColumnWithTypeAndName & column, const PaddedPODArray<UInt8> & mask, const MaskInfo & mask_info)
{
const auto * column_function = checkAndGetShortCircuitArgument(column.column);
if (!column_function)
return;
ColumnWithTypeAndName result;
/// If mask contains only zeros, we can just create
/// an empty column with the execution result type.
if (!mask_info.has_ones)
{
auto result_type = column_function->getResultType();
auto empty_column = result_type->createColumn();
result = {std::move(empty_column), result_type, ""};
}
/// Filter column only if mask contains zeros.
else if (mask_info.has_zeros)
{
auto filtered = column_function->filter(mask, -1);
result = typeid_cast<const ColumnFunction *>(filtered.get())->reduce();
}
else
result = column_function->reduce();
column = std::move(result);
}
void executeColumnIfNeeded(ColumnWithTypeAndName & column, bool empty)
{
const auto * column_function = checkAndGetShortCircuitArgument(column.column);
if (!column_function)
return;
if (!empty)
column = column_function->reduce();
else
column.column = column_function->getResultType()->createColumn();
}
int checkShortCircuitArguments(const ColumnsWithTypeAndName & arguments)
{
int last_short_circuit_argument_index = -1;
for (size_t i = 0; i != arguments.size(); ++i)
{
if (checkAndGetShortCircuitArgument(arguments[i].column))
last_short_circuit_argument_index = i;
}
return last_short_circuit_argument_index;
}
void copyMask(const PaddedPODArray<UInt8> & from, PaddedPODArray<UInt8> & to)
{
if (from.size() != to.size())
throw Exception("Cannot copy mask, because source and destination have different size", ErrorCodes::LOGICAL_ERROR);
if (from.empty())
return;
memcpy(to.data(), from.data(), from.size() * sizeof(*from.data()));
}
}
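The mask-extraction logic in this file comes down to two rules: only rows where the mask is already 1 are updated, and the source column may be either full-length or "short" (one value per 1 in the mask). A minimal sketch on `std::vector`, assuming a hypothetical `extractMaskSketch` helper and a local `MaskInfoSketch` struct that mirrors `MaskInfo`; NULL handling and the per-type dispatch are left out:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for MaskInfo (hypothetical name).
struct MaskInfoSketch
{
    bool has_ones;
    bool has_zeros;
};

// Hypothetical helper: refine an existing mask with the truthiness of column values.
MaskInfoSketch extractMaskSketch(std::vector<uint8_t> & mask, const std::vector<int64_t> & column, bool inverted)
{
    // A short column holds one value per row that is still selected by the mask.
    const bool column_is_short = column.size() < mask.size();
    size_t ones_count = 0;
    size_t data_index = 0;
    for (size_t i = 0; i < mask.size(); ++i)
    {
        if (!mask[i])
            continue; // rows already excluded stay excluded

        const size_t index = column_is_short ? data_index++ : i;
        assert(index < column.size());

        bool value = column[index] != 0;
        if (inverted)
            value = !value;

        mask[i] = value;
        ones_count += value;
    }
    return {ones_count > 0, ones_count != mask.size()};
}
```

With the mask `{1, 0, 1, 1}` and the short column `{5, 0, 7}`, the three values are consumed by rows 0, 2 and 3 in order, so the mask becomes `{1, 0, 0, 1}`.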


@ -0,0 +1,75 @@
#pragma once
#include <Core/ColumnWithTypeAndName.h>
#include <Core/ColumnsWithTypeAndName.h>
#include <Core/Field.h>
#include <Common/PODArray.h>
namespace DB
{
/// Expand data by mask. After expanding data will satisfy the following: if we filter data
/// by given mask, we get initial data. In places where mask[i] = 0 we insert default value.
/// If inverted is true, we will work with inverted mask. This function is used in implementations of
/// expand() method in IColumn interface.
template <typename T>
void expandDataByMask(PaddedPODArray<T> & data, const PaddedPODArray<UInt8> & mask, bool inverted);
void expandStringDataByMask(PaddedPODArray<UInt8> & chars_data, IColumn::Offsets & offsets_data, const IColumn::Filter & mask, bool inverted);
struct MaskInfo
{
bool has_ones;
bool has_zeros;
};
/// The next functions are used to extract UInt8 mask from a column,
/// filtered by some condition (mask). We will use value from a column
/// only when value in condition is 1. Column should satisfy the
/// condition: sum(mask) = column.size() or mask.size() = column.size().
/// You can set the flag 'inverted' to use inverted values
/// from a column. You can also specify the value that will be used when
/// a column value is Null (argument null_value).
MaskInfo extractMask(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
UInt8 null_value = 0);
MaskInfo extractInvertedMask(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
UInt8 null_value = 0);
/// The same as extractMask, but fills
/// nulls so that nulls[i] = 1 when column[i] = Null.
MaskInfo extractMask(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
PaddedPODArray<UInt8> * nulls,
UInt8 null_value = 0);
MaskInfo extractInvertedMask(
PaddedPODArray<UInt8> & mask,
const ColumnPtr & column,
PaddedPODArray<UInt8> * nulls,
UInt8 null_value = 0);
/// Inplace inversion.
void inverseMask(PaddedPODArray<UInt8> & mask, MaskInfo & mask_info);
/// If given column is lazy executed argument (ColumnFunction with isShortCircuitArgument() = true),
/// filter it by mask and then reduce. mask_info tells whether the mask contains ones and zeros,
/// which allows skipping the filtering (no zeros) or the execution (no ones) entirely.
void maskedExecute(ColumnWithTypeAndName & column, const PaddedPODArray<UInt8> & mask, const MaskInfo & mask_info);
/// If given column is lazy executed argument, reduce it. If empty is true,
/// create an empty column with the execution result type.
void executeColumnIfNeeded(ColumnWithTypeAndName & column, bool empty = false);
/// Check if arguments contain a lazy executed argument. If they do, return the index of the last one,
/// otherwise return -1.
int checkShortCircuitArguments(const ColumnsWithTypeAndName & arguments);
void copyMask(const PaddedPODArray<UInt8> & from, PaddedPODArray<UInt8> & to);
}
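For string columns the same expansion has to keep the invariant that every string ends with a zero byte, so each row excluded by the mask becomes an empty string consisting of a single `\0`. The sketch below shows that layout on `std::vector`s. It is deliberately simpler than `expandStringDataByMask`: it builds new buffers in a forward pass instead of expanding in place from the back, and `expandStringsSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// chars holds zero-terminated strings back to back; offsets[i] is the end of string i
// (one past its terminating zero), as in the chars/offsets pair used above.
void expandStringsSketch(
    std::vector<char> & chars, std::vector<uint64_t> & offsets, const std::vector<uint8_t> & mask, bool inverted)
{
    assert(mask.size() >= offsets.size());
    std::vector<char> new_chars;
    std::vector<uint64_t> new_offsets;
    size_t from = 0; // next source string to place

    for (size_t i = 0; i < mask.size(); ++i)
    {
        if ((mask[i] != 0) != inverted)
        {
            assert(from < offsets.size()); // otherwise the mask selects too many rows
            const uint64_t begin = from == 0 ? 0 : offsets[from - 1];
            new_chars.insert(
                new_chars.end(),
                chars.begin() + static_cast<std::ptrdiff_t>(begin),
                chars.begin() + static_cast<std::ptrdiff_t>(offsets[from]));
            ++from;
        }
        else
            new_chars.push_back('\0'); // empty string: only the terminating zero

        new_offsets.push_back(new_chars.size());
    }

    assert(from == offsets.size()); // otherwise the mask selects too few rows
    chars = std::move(new_chars);
    offsets = std::move(new_offsets);
}
```

The resulting chars size equals the old size plus the number of excluded rows, which matches the `last_offset` computation in the in-place version.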


@ -35,6 +35,7 @@ SRCS(
ColumnsCommon.cpp
FilterDescription.cpp
IColumn.cpp
MaskOperations.cpp
getLeastSuperColumn.cpp
)


@ -0,0 +1,16 @@
#include <Common/escapeString.h>
#include <IO/WriteBufferFromString.h>
#include <IO/WriteHelpers.h>
namespace DB
{
String escapeString(std::string_view value)
{
WriteBufferFromOwnString buf;
writeEscapedString(value, buf);
return buf.str();
}
}
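`escapeString` is a thin wrapper that routes a string through `writeEscapedString` and returns the buffer contents. The exact set of characters that `writeEscapedString` handles is not visible in this diff, so the sketch below only illustrates the general idea of backslash escaping, assuming tab, newline, backslash and single quote are escaped; `escapeSketch` is a hypothetical name.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-in: backslash-escape a small, assumed set of characters.
std::string escapeSketch(std::string_view value)
{
    std::string out;
    out.reserve(value.size());
    for (const char c : value)
    {
        switch (c)
        {
            case '\t': out += "\\t"; break;
            case '\n': out += "\\n"; break;
            case '\\': out += "\\\\"; break;
            case '\'': out += "\\'"; break;
            default: out += c;
        }
    }
    return out;
}
```

Returning a `String` by value keeps call sites short when the escaped text is needed inline, for example when building a result row.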

src/Common/escapeString.h Normal file

@ -0,0 +1,10 @@
#pragma once
#include <common/types.h>
namespace DB
{
String escapeString(std::string_view value);
}


@ -461,31 +461,15 @@ enum PreloadLevelSettings : UInt64
M(UInt64, prefetch_buffer_size, DBMS_DEFAULT_BUFFER_SIZE, "The maximum size of the prefetch buffer to read from the filesystem.", 0) \
\
M(UInt64, mysql_max_rows_to_insert, 65536, "The maximum number of rows in MySQL batch insertion of the MySQL storage engine", 0) \
\
M(Bool, mysql_map_string_to_text_in_show_columns, false, "If enabled, String type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise.", 0) \
M(Bool, mysql_map_fixed_string_to_text_in_show_columns, false, "If enabled, FixedString type will be mapped to TEXT in SHOW [FULL] COLUMNS, BLOB otherwise.", 0) \
\
M(UInt64, optimize_min_equality_disjunction_chain_length, 3, "The minimum length of the expression `expr = x1 OR ... expr = xN` for optimization ", 0) \
\
M(UInt64, min_bytes_to_use_direct_io, 0, "The minimum number of bytes for reading the data with O_DIRECT option during SELECT queries execution. 0 - disabled.", 0) \
M(UInt64, min_bytes_to_use_mmap_io, 0, "The minimum number of bytes for reading the data with mmap option during SELECT queries execution. 0 - disabled.", 0) \
M(Bool, checksum_on_read, true, "Validate checksums on reading. It is enabled by default and should be always enabled in production. Please do not expect any benefits in disabling this setting. It may only be used for experiments and benchmarks. The setting only applicable for tables of MergeTree family. Checksums are always validated for other table engines and when receiving data over network.", 0) \
\
M(Bool, force_index_by_date, 0, "Throw an exception if there is a partition key in a table, and it is not used.", 0) \
M(Bool, force_primary_key, 0, "Throw an exception if there is primary key in a table, and it is not used.", 0) \
M(Bool, enable_skip_index, 1, "Whether enable to use skip index", 0) \
@ -1294,6 +1278,7 @@ enum PreloadLevelSettings : UInt64
M(UInt64, offset, 0, "Offset on read rows from the most 'end' result for select query", 0) \
\
M(UInt64, function_range_max_elements_in_block, 500000000, "Maximum number of values generated by function 'range' per block of data (sum of array sizes for every row in a block, see also 'max_block_size' and 'min_insert_block_size_rows'). It is a safety threshold.", 0) \
M(ShortCircuitFunctionEvaluation, short_circuit_function_evaluation, ShortCircuitFunctionEvaluation::ENABLE, "Setting for short-circuit function evaluation configuration. Possible values: 'enable', 'disable', 'force_enable'", 0) \
\
/** Bytedance */ \
M(UInt64, \


@ -185,4 +185,9 @@ IMPLEMENT_SETTING_ENUM(TextCaseOption, ErrorCodes::BAD_ARGUMENTS,
{{"MIXED", TextCaseOption::MIXED},
{"LOWERCASE", TextCaseOption::LOWERCASE},
{"UPPERCASE", TextCaseOption::UPPERCASE}})
IMPLEMENT_SETTING_ENUM(ShortCircuitFunctionEvaluation, ErrorCodes::BAD_ARGUMENTS,
{{"enable", ShortCircuitFunctionEvaluation::ENABLE},
{"force_enable", ShortCircuitFunctionEvaluation::FORCE_ENABLE},
{"disable", ShortCircuitFunctionEvaluation::DISABLE}})
} // namespace DB


@ -318,4 +318,13 @@ enum class TextCaseOption
DECLARE_SETTING_ENUM(TextCaseOption)
enum class ShortCircuitFunctionEvaluation
{
ENABLE, // Use short-circuit function evaluation for functions that are suitable for it.
FORCE_ENABLE, // Use short-circuit function evaluation for all functions.
DISABLE, // Disable short-circuit function evaluation.
};
DECLARE_SETTING_ENUM(ShortCircuitFunctionEvaluation)
}
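The three values of `short_circuit_function_evaluation` differ only in how they combine with a function's own `isSuitableForShortCircuitArgumentsExecution` answer. A minimal sketch of that decision, following the enum comments above; `ShortCircuitMode`, `parseShortCircuitMode` and `useShortCircuit` are hypothetical names and not part of the codebase:

```cpp
#include <optional>
#include <string_view>

// Local stand-in for ShortCircuitFunctionEvaluation (hypothetical name).
enum class ShortCircuitMode
{
    Enable,
    ForceEnable,
    Disable,
};

// Map the setting's textual value to the enum; unknown names yield no value.
std::optional<ShortCircuitMode> parseShortCircuitMode(std::string_view name)
{
    if (name == "enable")
        return ShortCircuitMode::Enable;
    if (name == "force_enable")
        return ShortCircuitMode::ForceEnable;
    if (name == "disable")
        return ShortCircuitMode::Disable;
    return std::nullopt;
}

// Decide whether a function's arguments should be evaluated lazily.
bool useShortCircuit(ShortCircuitMode mode, bool function_is_suitable)
{
    switch (mode)
    {
        case ShortCircuitMode::Enable: return function_is_suitable; // only functions that opt in
        case ShortCircuitMode::ForceEnable: return true;            // every function
        case ShortCircuitMode::Disable: return false;               // never
    }
    return false;
}
```

Under `enable`, the per-function overrides added later in this commit are what decide which arguments are deferred.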


@ -1,5 +1,6 @@
#include "convertMySQLDataType.h"
#include <algorithm>
#include <charconv>
#include <Core/Field.h>
#include <common/types.h>
#include <Core/MultiEnum.h>
@ -7,10 +8,10 @@
#include <Parsers/ASTFunction.h>
#include <Parsers/ASTIdentifier.h>
#include <Parsers/IAST.h>
#include <boost/container/container_fwd.hpp>
#include "DataTypeDate.h"
#include "DataTypeDateTime.h"
#include "DataTypeDateTime64.h"
#include "DataTypeEnum.h"
#include "DataTypesDecimal.h"
#include "DataTypeFixedString.h"
#include "DataTypeNullable.h"
@ -123,4 +124,102 @@ DataTypePtr convertMySQLDataType(MultiEnum<MySQLDataTypesSupport> type_support,
return res;
}
template <bool full_type>
std::conditional_t<full_type, std::string, std::string_view> convertClickHouseDataTypeToMysqlColumnProperties(
std::string_view clickhouse_data_type, ClickHouseToMySQLDataTypeConversionSettings settings
)
{
static const std::unordered_map<std::string_view, std::string_view> mapping{{
{"Int8", "TINYINT"},
{"Int16", "SMALLINT"},
{"Int32", "INTEGER"},
{"Int64", "BIGINT"},
{"UInt8", "TINYINT"},
{"UInt16", "SMALLINT"},
{"UInt32", "INTEGER"},
{"UInt64", "BIGINT"},
{"Float32", "FLOAT"},
{"Float64", "DOUBLE"},
{"UUID", "CHAR"},
{"Bool", "TINYINT"},
{"Date", "DATE"},
{"Date32", "DATE"},
{"DateTime", "DATETIME"},
{"DateTime64", "DATETIME"},
{"Map", "JSON"},
{"Tuple", "JSON"},
{"Object", "JSON"},
}};
std::conditional_t<full_type, std::string, std::string_view> result;
std::string_view sv = clickhouse_data_type;
if (sv.starts_with("LowCardinality"))
{
sv.remove_prefix(strlen("LowCardinality("));
sv.remove_suffix(strlen(")"));
}
if (sv.starts_with("Nullable"))
{
sv.remove_prefix(strlen("Nullable("));
sv.remove_suffix(strlen(")"));
}
const std::string_view inner_type = sv.substr(0, sv.find('('));
do
{
if (inner_type == "Decimal")
{
sv.remove_prefix(strlen("Decimal("));
sv.remove_suffix(strlen(")"));
size_t comma_pos = sv.find(',');
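/// ClickHouse writes Decimal(P, S): precision comes first, scale second.
const std::string_view precision_s = sv.substr(0, comma_pos);
const std::string_view scale_s = sv.substr(comma_pos + strlen(", "));
int precision = 0;
int scale = 0;
(void)std::from_chars(precision_s.data(), precision_s.data() + precision_s.size(), precision);
(void)std::from_chars(scale_s.data(), scale_s.data() + scale_s.size(), scale);
/// MySQL DECIMAL supports a precision of at most 65 and a scale of at most 30.
if (precision <= 65 && scale <= 30)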
{
if constexpr (full_type)
result = fmt::format("DECIMAL({})", sv);
else
result = "DECIMAL";
break;
}
}
if (const auto it = mapping.find(inner_type); it != mapping.end())
{
result = it->second;
if constexpr (full_type)
{
if (inner_type.starts_with("UInt"))
result += " UNSIGNED";
}
break;
}
const std::array<std::pair<std::string_view, std::string_view>, 2> mapping2{{
{"String", settings.remap_string_as_text ? "TEXT" : "BLOB"},
{"FixedString", settings.remap_fixed_string_as_text ? "TEXT" : "BLOB"},
}};
if (const auto it = std::find_if(mapping2.begin(), mapping2.end(), [inner_type](const auto & pair){ return pair.first == inner_type; }); it != mapping2.end())
{
result = it->second;
break;
}
result = "TEXT";
} while (false);
return result;
}
template
std::string convertClickHouseDataTypeToMysqlColumnProperties<true>(
std::string_view clickhouse_data_type, ClickHouseToMySQLDataTypeConversionSettings settings
);
template
std::string_view convertClickHouseDataTypeToMysqlColumnProperties<false>(
std::string_view clickhouse_data_type, ClickHouseToMySQLDataTypeConversionSettings settings
);
}
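The conversion above is purely string-based: strip the `LowCardinality(...)` and `Nullable(...)` wrappers, take the type name before the first `(`, and look it up. The following is a reduced sketch of that flow with only a handful of types and without the `Decimal` branch; `toMySQLTypeSketch` and its `string_as_text` flag are hypothetical stand-ins for the real function and settings struct.

```cpp
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical, reduced version of the ClickHouse -> MySQL type-name mapping.
std::string toMySQLTypeSketch(std::string_view type, bool string_as_text)
{
    // Unwrap in this order, so LowCardinality(Nullable(T)) ends up as T.
    static constexpr std::string_view wrappers[] = {"LowCardinality(", "Nullable("};
    for (const std::string_view wrapper : wrappers)
    {
        if (type.substr(0, wrapper.size()) == wrapper && type.back() == ')')
        {
            type.remove_prefix(wrapper.size());
            type.remove_suffix(1);
        }
    }

    // Drop type parameters, e.g. DateTime64(3) -> DateTime64.
    const std::string_view base = type.substr(0, type.find('('));

    static const std::unordered_map<std::string_view, std::string_view> mapping{
        {"Int32", "INTEGER"},
        {"UInt32", "INTEGER"},
        {"Int64", "BIGINT"},
        {"UInt64", "BIGINT"},
        {"Float64", "DOUBLE"},
        {"Date", "DATE"},
        {"DateTime", "DATETIME"},
        {"DateTime64", "DATETIME"},
    };

    if (const auto it = mapping.find(base); it != mapping.end())
    {
        std::string result(it->second);
        if (base.substr(0, 4) == "UInt")
            result += " UNSIGNED"; // corresponds to the full_type = true variant
        return result;
    }

    if (base == "String")
        return string_as_text ? "TEXT" : "BLOB";

    return "TEXT"; // fallback for everything else, as in the function above
}
```

For instance, `LowCardinality(Nullable(UInt32))` is unwrapped to `UInt32` and reported as `INTEGER UNSIGNED`, while an unmapped type such as `Array(String)` falls through to `TEXT`.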


@ -1,6 +1,7 @@
#pragma once
#include <string>
#include <type_traits>
#include <Core/MultiEnum.h>
#include <Parsers/IAST.h>
#include "IDataType.h"
@ -17,4 +18,19 @@ ASTPtr dataTypeConvertToQuery(const DataTypePtr & data_type);
/// Convert MySQL type to ClickHouse data type.
DataTypePtr convertMySQLDataType(MultiEnum<MySQLDataTypesSupport> type_support, const std::string & mysql_data_type, bool is_nullable, bool is_unsigned, size_t length, size_t precision, size_t scale);
struct ClickHouseToMySQLDataTypeConversionSettings
{
bool remap_string_as_text;
bool remap_fixed_string_as_text;
};
/// Convert ClickHouse datatype string to MySQL type string and other properties.
/// This is purely a string based mapping.
/// The implementation is a port of the SQL logic in
/// https://github.com/ClickHouse/ClickHouse/blob/1c0fa345ac341030d76a687b0900f8e66739d384/src/Interpreters/InterpreterShowColumnsQuery.cpp#L4
/// E.g., whether we consider a column as nullable is also based on the logic there.
template <bool full_type>
std::conditional_t<full_type, std::string, std::string_view> convertClickHouseDataTypeToMysqlColumnProperties(
std::string_view clickhouse_data_type, ClickHouseToMySQLDataTypeConversionSettings settings
);
}


@ -88,6 +88,8 @@ public:
return 1;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
bool useDefaultImplementationForConstants() const override
{
return true;


@ -1313,6 +1313,12 @@ public:
return !division_by_nullable && !handle_division_by_zero;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & arguments) const override
{
return ((IsOperation<Op>::div_int || IsOperation<Op>::modulo) && !arguments[1].is_const)
|| (IsOperation<Op>::div_floating && (isDecimal(arguments[0].type) || isDecimal(arguments[1].type)));
}
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
return getReturnTypeImplStatic(arguments, context, handle_division_by_zero);


@ -29,6 +29,7 @@ public:
String getName() const override { return name; }
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
size_t getNumberOfArguments() const override { return 0; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override


@ -58,6 +58,7 @@ public:
String getName() const override { return name; }
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override


@ -771,6 +771,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{


@ -41,6 +41,8 @@ public:
FunctionDateOrDateTimeToSomething(ContextPtr context_) : context(context_) { }
static FunctionPtr create(ContextPtr context_) { return std::make_shared<FunctionDateOrDateTimeToSomething>(context_); }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
this->checkArguments(arguments, (std::is_same_v<ToDataType, DataTypeDate> || std::is_same_v<ToDataType, DataTypeDate32>), context);


@ -24,6 +24,8 @@ public:
bool isDeterministic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
size_t getNumberOfArguments() const override
{
return 0;


@ -41,6 +41,8 @@ public:
bool isDeterministic() const override { return false; }
bool isDeterministicInScopeOfQuery() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
if (arguments.empty() || arguments.size() > 2)


@ -90,11 +90,17 @@ FunctionBasePtr JoinGetOverloadResolver<or_null>::buildImpl(const ColumnsWithTyp
ErrorCodes::NUMBER_OF_ARGUMENTS_DOESNT_MATCH);
auto [storage_join, attr_name] = getJoin(arguments, getContext());
/// data_types holds only the key argument types (everything after the join name and attribute name),
/// while argument_types holds the types of all arguments passed to the function.
DataTypes data_types(arguments.size() - 2);
DataTypes argument_types(arguments.size());
for (size_t i = 0; i < arguments.size(); ++i)
{
if (i >= 2)
data_types[i - 2] = arguments[i].type;
argument_types[i] = arguments[i].type;
}
auto return_type = storage_join->joinGetCheckAndGetReturnType(data_types, attr_name, or_null);
auto table_lock = storage_join->lockForShare(getContext()->getInitialQueryId(), getContext()->getSettingsRef().lock_acquire_timeout);
return std::make_unique<FunctionJoinGet<or_null>>(table_lock, storage_join, attr_name, argument_types, return_type);
}
REGISTER_FUNCTION(JoinGet)


@ -60,6 +60,8 @@ public:
String getName() const override { return name; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
const DataTypes & getArgumentTypes() const override { return argument_types; }
const DataTypePtr & getResultType() const override { return return_type; }


@ -43,6 +43,7 @@ public:
static_assert(Impl::rows_per_iteration > 0, "Impl must process at least one row per iteration");
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
private:
ContextPtr context;


@ -20,6 +20,8 @@ private:
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & /*arguments*/) const override
{
return std::make_shared<DataTypeFloat64>();


@ -49,6 +49,8 @@ private:
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 1; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
const auto & arg = arguments.front();


@ -38,6 +38,11 @@ public:
return 1;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override
{
return false;
}
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isNativeNumber(arguments.front()))


@ -153,6 +153,7 @@ public:
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {0}; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{


@ -42,6 +42,11 @@ public:
return name;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override
{
return true;
}
size_t getNumberOfArguments() const override
{
return 2;


@ -47,6 +47,11 @@ public:
return is_injective;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override
{
return true;
}
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isStringOrFixedString(arguments[0]))


@ -135,6 +135,7 @@ public:
size_t getNumberOfArguments() const override { return 1; }
bool isInjective(const ColumnsWithTypeAndName &) const override { return is_injective; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool useDefaultImplementationForConstants() const override { return true; }

View File

@ -57,6 +57,7 @@ public:
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 1; }
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool useDefaultImplementationForConstants() const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
@ -121,6 +122,7 @@ public:
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 0; }
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool useDefaultImplementationForConstants() const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override

View File

@ -159,6 +159,7 @@ private:
String getName() const override { return name; }
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {0}; }
bool useDefaultImplementationForConstants() const override { return true; }
@ -439,6 +440,7 @@ private:
String getName() const override { return name; }
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {0}; }
bool useDefaultImplementationForConstants() const override { return true; }

View File

@ -122,6 +122,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 1; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -355,6 +357,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 1; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -475,6 +479,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 3; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -714,6 +720,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 3; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -962,6 +970,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 1; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -1212,6 +1222,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 2; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -1344,6 +1356,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 2; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -1535,6 +1549,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 2; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override

View File

@ -57,6 +57,7 @@ public:
size_t getNumberOfArguments() const override { return 1; }
bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
@ -132,6 +133,8 @@ public:
size_t getNumberOfArguments() const override { return 3; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!checkAndGetDataType<DataTypeIPv6>(arguments[0].get()))
@ -275,6 +278,8 @@ public:
bool useDefaultImplementationForNulls() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isStringOrFixedString(removeNullable(arguments[0])))
@ -381,6 +386,8 @@ public:
size_t getNumberOfArguments() const override { return 1; }
bool isInjective(const ColumnsWithTypeAndName &) const override { return mask_tail_octets == 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
WhichDataType arg_type(arguments[0]);
@ -502,6 +509,7 @@ public:
size_t getNumberOfArguments() const override { return 1; }
bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
/// For backward compatibility, IPv4ToIPv6 is overloaded, and the result type depends on the argument type:
/// if it is UInt32 (representing IPv4), the result is FixedString(16); if it is IPv4, the result is IPv6.
@ -595,6 +603,7 @@ public:
size_t getNumberOfArguments() const override { return 1; }
bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
@ -724,6 +733,8 @@ public:
size_t getNumberOfArguments() const override { return 1; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isString(arguments[0]))
@ -812,6 +823,7 @@ public:
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 2; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
@ -970,6 +982,8 @@ public:
String getName() const override { return name; }
size_t getNumberOfArguments() const override { return 2; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
@ -1035,6 +1049,8 @@ public:
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isString(arguments[0]))
@ -1086,6 +1102,8 @@ public:
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isString(arguments[0]))

View File

@ -1105,6 +1105,8 @@ public:
size_t getNumberOfArguments() const override { return 2; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
/// Get result types by argument types. If the function does not apply to these arguments, throw an exception.
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{

View File

@ -39,6 +39,8 @@ public:
return 2;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isInteger(arguments[0]) && !isIPv4(arguments[0]))

View File

@ -2090,6 +2090,10 @@ public:
static constexpr bool to_string_or_fixed_string = std::is_same_v<ToDataType, DataTypeFixedString> ||
std::is_same_v<ToDataType, DataTypeString>;
static constexpr bool to_date_or_datetime = std::is_same_v<ToDataType, DataTypeDate> ||
std::is_same_v<ToDataType, DataTypeDate32> ||
std::is_same_v<ToDataType, DataTypeDateTime>;
static FunctionPtr create(ContextPtr context)
{
@ -2132,6 +2136,11 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isInjective(const ColumnsWithTypeAndName &) const override { return std::is_same_v<Name, NameToString>; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & arguments) const override
{
/// TODO: We can make more optimizations here.
return !(to_date_or_datetime && isNumber(*arguments[0].type));
}
using DefaultReturnTypeGetter = std::function<DataTypePtr(const ColumnsWithTypeAndName &)>;
static DataTypePtr getReturnTypeDefaultImplementationForNulls(const ColumnsWithTypeAndName & arguments, const DefaultReturnTypeGetter & getter)
@ -2551,6 +2560,7 @@ public:
}
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return true; }
@ -3223,6 +3233,7 @@ public:
bool isDeterministic() const override { return true; }
bool isDeterministicInScopeOfQuery() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
bool hasInformationAboutMonotonicity() const override
{

View File

@ -157,6 +157,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
@ -251,6 +252,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
@ -389,6 +391,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
@ -595,6 +598,8 @@ public:
/// even in the face of the fact that there are many different cities named Moscow.
bool isInjective(const ColumnsWithTypeAndName &) const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (arguments.size() != 1 && arguments.size() != 2)

View File

@ -147,6 +147,8 @@ public:
bool isDeterministic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
bool useDefaultImplementationForConstants() const final { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const final { return {0}; }
@ -289,6 +291,7 @@ public:
String getName() const override { return name; }
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const final { return true; }
@ -655,6 +658,8 @@ private:
bool isDeterministic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const final { return {0, 1}; }
bool isInjective(const ColumnsWithTypeAndName & sample_columns) const override
@ -802,6 +807,8 @@ private:
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
bool useDefaultImplementationForConstants() const override { return true; }
bool useDefaultImplementationForNulls() const override { return false; }
@ -959,6 +966,7 @@ public:
private:
size_t getNumberOfArguments() const override { return 2; }
bool isInjective(const ColumnsWithTypeAndName & /*sample_columns*/) const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
bool useDefaultImplementationForConstants() const final { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const final { return {0}; }
@ -1021,6 +1029,7 @@ private:
size_t getNumberOfArguments() const override { return 3; }
bool useDefaultImplementationForConstants() const final { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const final { return {0}; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -1089,6 +1098,7 @@ private:
bool useDefaultImplementationForConstants() const final { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const final { return {0}; }
bool isDeterministic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
@ -1149,6 +1159,7 @@ private:
bool useDefaultImplementationForConstants() const final { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const final { return {0}; }
bool isDeterministic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override

View File

@ -1017,6 +1017,8 @@ public:
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
{
[[maybe_unused]] uint32_t seed = 0;
@ -1166,6 +1168,8 @@ public:
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override
{
const IDataType * from_type = arguments[0].type.get();
@ -1676,6 +1680,8 @@ public:
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return !with_seed; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if constexpr (with_seed)
@ -2157,6 +2163,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{

View File

@ -302,6 +302,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{

View File

@ -1,17 +1,19 @@
#include <Functions/FunctionFactory.h>
#include <Functions/FunctionsLogical.h>
#include <Columns/IColumn.h>
#include <Columns/ColumnVector.h>
#include <Columns/ColumnsNumber.h>
#include <Columns/ColumnConst.h>
#include <Columns/ColumnNullable.h>
#include <Columns/ColumnVector.h>
#include <Columns/ColumnsNumber.h>
#include <Columns/IColumn.h>
#include <Common/FieldVisitorConvertToNumber.h>
#include <Columns/MaskOperations.h>
#include <Common/typeid_cast.h>
#include <DataTypes/DataTypeFactory.h>
#include <DataTypes/DataTypeNullable.h>
#include <DataTypes/DataTypesNumber.h>
#include <Functions/FunctionHelpers.h>
#include <Common/FieldVisitors.h>
#include <algorithm>
@ -509,10 +511,111 @@ DataTypePtr FunctionAnyArityLogical<Impl, Name>::getReturnTypeImpl(const DataTyp
: result_type;
}
template <bool inverted>
static void applyTernaryLogicImpl(const IColumn::Filter & mask, IColumn::Filter & null_bytemap)
{
for (size_t i = 0; i != mask.size(); ++i)
{
UInt8 value = mask[i];
if constexpr (inverted)
value = !value;
if (null_bytemap[i] && value)
null_bytemap[i] = 0;
}
}
template <typename Name>
static void applyTernaryLogic(const IColumn::Filter & mask, IColumn::Filter & null_bytemap)
{
if (Name::name == NameAnd::name)
applyTernaryLogicImpl<true>(mask, null_bytemap);
else if (Name::name == NameOr::name)
applyTernaryLogicImpl<false>(mask, null_bytemap);
}
template <typename Impl, typename Name>
ColumnPtr FunctionAnyArityLogical<Impl, Name>::executeShortCircuit(ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type) const
{
if (Name::name != NameAnd::name && Name::name != NameOr::name)
throw Exception("Function " + getName() + " doesn't support short circuit execution", ErrorCodes::ILLEGAL_TYPE_OF_ARGUMENT);
executeColumnIfNeeded(arguments[0]);
/// Let's denote x_i' = maskedExecute(x_i, mask_i).
/// 1) AND(x_0, x_1, x_2, ..., x_n)
/// We will maintain mask_i = x_0 & x_1 & ... & x_{i - 1}.
/// Base:
/// mask_0 is 1 everywhere, x_0' = x_0.
/// Iteration:
/// mask_i = extractMask(mask_{i - 1}, x_{i - 1}')
/// x_i' = maskedExecute(x_i, mask_i)
/// Also we will treat NULL as 1 if x_i' is Nullable
/// to support ternary logic.
/// The result is mask_{n + 1}.
///
/// 2) OR(x_0, x_1, x_2, ..., x_n)
/// We will maintain mask_i = !x_0 & !x_1 & ... & !x_{i - 1}.
/// Base:
/// mask_0 is 1 everywhere, x_0' = x_0.
/// Iteration:
/// mask_i = extractMask(mask_{i - 1}, !x_{i - 1}')
/// x_i' = maskedExecute(x_i, mask_i)
/// Also we will treat NULL as 0 if x_i' is Nullable
/// to support ternary logic.
/// The result is !mask_{n + 1}.
bool inverted = Name::name != NameAnd::name;
UInt8 null_value = UInt8(Name::name == NameAnd::name);
IColumn::Filter mask(arguments[0].column->size(), 1);
/// If result is nullable, we need to create null bytemap of the resulting column.
/// We will fill it while extracting mask from arguments.
std::unique_ptr<IColumn::Filter> nulls;
if (result_type->isNullable())
nulls = std::make_unique<IColumn::Filter>(arguments[0].column->size(), 0);
MaskInfo mask_info;
for (size_t i = 1; i <= arguments.size(); ++i)
{
if (inverted)
mask_info = extractInvertedMask(mask, arguments[i - 1].column, nulls.get(), null_value);
else
mask_info = extractMask(mask, arguments[i - 1].column, nulls.get(), null_value);
/// If the mask has no ones left, we don't need to execute the remaining arguments,
/// because the result won't change.
if (!mask_info.has_ones || i == arguments.size())
break;
maskedExecute(arguments[i], mask, mask_info);
}
/// For the OR function we need to invert the mask to get the resulting column.
if (inverted)
inverseMask(mask, mask_info);
if (nulls)
applyTernaryLogic<Name>(mask, *nulls);
MutableColumnPtr res = ColumnUInt8::create();
typeid_cast<ColumnUInt8 *>(res.get())->getData() = std::move(mask);
if (!nulls)
return res;
MutableColumnPtr bytemap = ColumnUInt8::create();
typeid_cast<ColumnUInt8 *>(bytemap.get())->getData() = std::move(*nulls);
return ColumnNullable::create(std::move(res), std::move(bytemap));
}
template <typename Impl, typename Name>
ColumnPtr FunctionAnyArityLogical<Impl, Name>::executeImpl(
const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const
const ColumnsWithTypeAndName & args, const DataTypePtr & result_type, size_t input_rows_count) const
{
ColumnsWithTypeAndName arguments = std::move(args);
/// Special implementation for short-circuit arguments.
if (checkShortCircuitArguments(arguments) != -1)
return executeShortCircuit(arguments, result_type);
ColumnRawPtrs args_in;
for (const auto & arg_index : arguments)
args_in.push_back(arg_index.column.get());
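
Review note: the mask bookkeeping described in the `executeShortCircuit` comment can be illustrated with a standalone sketch. It is not the code from this commit. `Mask`, `LazyColumn`, and `shortCircuitLogical` are hypothetical names, a "lazy column" is modeled as a per-row callback, and NULL handling (the `nulls` bytemap and `applyTernaryLogic`) is left out entirely.

```cpp
#include <cassert>   // only needed by the usage example in the note below
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-ins: Mask plays the role of IColumn::Filter,
// LazyColumn models an argument that is computed only on demand.
using Mask = std::vector<uint8_t>;
using LazyColumn = std::function<uint8_t(size_t row)>;

// Evaluates arguments left to right, touching each argument only on rows
// that are still set in the mask. inverted == false gives AND,
// inverted == true gives OR (the mask then tracks !x_0 & !x_1 & ...).
Mask shortCircuitLogical(const std::vector<LazyColumn> & args, size_t rows, bool inverted)
{
    Mask mask(rows, 1);
    for (const auto & arg : args)
    {
        bool has_ones = false;
        for (size_t row = 0; row < rows; ++row)
        {
            if (!mask[row])
                continue; // row is already decided, the argument is not evaluated here

            bool value = arg(row) != 0;
            if (inverted)
                value = !value;

            mask[row] = value ? 1 : 0;
            has_ones = has_ones || value;
        }

        // No undecided rows left: the remaining arguments cannot change the result.
        if (!has_ones)
            break;
    }

    // For OR the accumulated mask is the negation of the result.
    if (inverted)
        for (auto & m : mask)
            m = m ? 0 : 1;

    return mask;
}
```

Under these assumptions, calling `shortCircuitLogical(args, rows, false)` evaluates the second argument only on rows where the first one is non-zero, which is the property the real implementation gets from `extractMask` followed by `maskedExecute`.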

View File

@ -52,6 +52,12 @@
namespace DB
{
struct NameAnd { static constexpr auto name = "and"; };
struct NameOr { static constexpr auto name = "or"; };
struct NameXor { static constexpr auto name = "xor"; };
struct NameNot { static constexpr auto name = "not"; };
namespace FunctionsLogicalDetail
{
namespace Ternary
@ -172,6 +178,15 @@ public:
}
bool isVariadic() const override { return true; }
bool isShortCircuit(ShortCircuitSettings & settings, size_t /*number_of_arguments*/) const override
{
settings.enable_lazy_execution_for_first_argument = false;
settings.enable_lazy_execution_for_common_descendants_of_arguments = true;
settings.force_enable_lazy_execution = false;
return name == NameAnd::name || name == NameOr::name;
}
ColumnPtr executeShortCircuit(ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type) const;
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForNulls() const override { return !Impl::specialImplementationForNulls(); }
@ -179,7 +194,7 @@ public:
/// Get result types by argument types. If the function does not apply to these arguments, throw an exception.
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override;
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type, size_t input_rows_count) const override;
ColumnPtr executeImpl(const ColumnsWithTypeAndName & args, const DataTypePtr & result_type, size_t input_rows_count) const override;
ColumnPtr getConstantResultForNonConstArguments(const ColumnsWithTypeAndName & arguments, const DataTypePtr & result_type) const override;
@ -244,6 +259,8 @@ public:
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
ColumnPtr executeImpl(const ColumnsWithTypeAndName & arguments, const DataTypePtr &, size_t /*input_rows_count*/) const override;
#if USE_EMBEDDED_COMPILER
@ -259,11 +276,6 @@ public:
}
struct NameAnd { static constexpr auto name = "and"; };
struct NameOr { static constexpr auto name = "or"; };
struct NameXor { static constexpr auto name = "xor"; };
struct NameNot { static constexpr auto name = "not"; };
using FunctionAnd = FunctionsLogicalDetail::FunctionAnyArityLogical<FunctionsLogicalDetail::AndImpl, NameAnd>;
using FunctionOr = FunctionsLogicalDetail::FunctionAnyArityLogical<FunctionsLogicalDetail::OrImpl, NameOr>;
using FunctionXor = FunctionsLogicalDetail::FunctionAnyArityLogical<FunctionsLogicalDetail::XorImpl, NameXor>;

View File

@ -84,6 +84,7 @@ public:
bool isDeterministic() const override { return true; }
bool isDeterministicInScopeOfQuery() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
const DataTypes & getArgumentTypes() const override { return argument_types; }
const DataTypePtr & getResultType() const override { return return_type; }
@ -180,6 +181,7 @@ public:
bool isDeterministic() const override { return true; }
bool isDeterministicInScopeOfQuery() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
const DataTypes & getArgumentTypes() const override { return capture->captured_types; }
const DataTypePtr & getResultType() const override { return return_type; }

View File

@ -60,6 +60,7 @@ public:
bool isDeterministic() const override { return false; }
bool isDeterministicInScopeOfQuery() const override { return false; }
bool useDefaultImplementationForNulls() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }

View File

@ -573,6 +573,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
/// Get result types by argument types. If the function does not apply to these arguments, throw an exception.
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
@ -709,6 +710,7 @@ public:
size_t getNumberOfArguments() const override { return 2; }
bool useDefaultImplementationForConstants() const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{

View File

@ -41,6 +41,7 @@ public:
size_t getNumberOfArguments() const override { return 0; }
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override
{

View File

@ -52,6 +52,8 @@ public:
bool useDefaultImplementationForConstants() const override { return true; }
ColumnNumbers getArgumentsThatAreAlwaysConstant() const override { return {1}; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{
if (!isString(arguments[0]))

View File

@ -37,6 +37,8 @@ public:
String getName() const override { return name; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 2; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override

View File

@ -481,6 +481,7 @@ static std::optional<DataTypes> removeNullables(const DataTypes & types)
bool IFunction::isCompilable(const DataTypes & arguments) const
{
if (useDefaultImplementationForNulls())
if (auto denulled = removeNullables(arguments))
return isCompilableImpl(*denulled);

View File

@ -243,6 +243,41 @@ public:
*/
virtual bool hasInformationAboutMonotonicity() const { return false; }
struct ShortCircuitSettings
{
/// Should we enable lazy execution for the first argument of short-circuit function?
/// Example: if(cond, then, else), we don't need to execute cond lazily.
bool enable_lazy_execution_for_first_argument;
/// Should we enable lazy execution for functions, that are common descendants of
/// different short-circuit function arguments?
/// Example 1: if (cond, expr1(..., expr, ...), expr2(..., expr, ...)), we don't need
/// to execute expr lazily, because it's used in both branches.
/// Example 2: and(expr1, expr2(..., expr, ...), expr3(..., expr, ...)), here we
/// should enable lazy execution for expr, because it must be filtered by expr1.
bool enable_lazy_execution_for_common_descendants_of_arguments;
/// Should we enable lazy execution without checking isSuitableForShortCircuitArgumentsExecution?
/// Example: toTypeName(expr), even if expr contains functions that are not suitable for
/// lazy execution (because of their simplicity), we shouldn't execute them at all.
bool force_enable_lazy_execution;
};
/** Function is called "short-circuit" if its arguments can be evaluated lazily
* (examples: and, or, if, multiIf). If a function is short-circuit, it should be
* able to work with lazily executed arguments;
* this method will be called before function execution.
* If a function is short-circuit, it must define all fields in settings for
* appropriate preparations. Number of arguments is provided because some settings might depend on it.
* Example: multiIf(cond, then, else) and multiIf(cond1, then1, cond2, then2, ...), the first
* version can enable enable_lazy_execution_for_common_descendants_of_arguments setting, the second - not.
*/
virtual bool isShortCircuit(ShortCircuitSettings & /*settings*/, size_t /*number_of_arguments*/) const { return false; }
/** Should we evaluate this function lazily in short-circuit function arguments?
* If function can throw an exception or it's computationally heavy, then
* it's suitable, otherwise it's not (due to the overhead of lazy execution).
* Suitability may depend on function arguments.
*/
virtual bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const { return false; }
/// The property of monotonicity for a certain range.
@ -432,8 +467,6 @@ public:
*/
virtual bool canBeExecutedOnDefaultArguments() const { return true; }
virtual bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const { return false; }
/// Properties from IFunctionBase (see IFunction.h)
virtual bool isSuitableForConstantFolding() const { return true; }
virtual ColumnPtr getConstantResultForNonConstArguments(const ColumnsWithTypeAndName & /*arguments*/, const DataTypePtr & /*result_type*/) const { return nullptr; }
@ -441,6 +474,11 @@ public:
virtual bool isDeterministic() const { return true; }
virtual bool isDeterministicInScopeOfQuery() const { return true; }
virtual bool isStateful() const { return false; }
using ShortCircuitSettings = IFunctionBase::ShortCircuitSettings;
virtual bool isShortCircuit(ShortCircuitSettings & /*settings*/, size_t /*number_of_arguments*/) const { return false; }
virtual bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const { return false; }
virtual bool hasInformationAboutMonotonicity() const { return false; }
using Monotonicity = IFunctionBase::Monotonicity;
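
The two hooks above work as a pair: the outer function reports whether it is short-circuit and fills in the settings, and each inner function reports whether deferring it is worthwhile. The following is a minimal sketch of that interplay, not the actual ClickHouse implementation. `FunctionSketch`, `ArgumentInfo`, `IfLike`, `TypeNameLike`, `DivideLike`, `PlusLike`, `shouldExecuteLazily` and `checkExamples` are hypothetical names introduced for illustration, and the decision rule in `shouldExecuteLazily` is an assumption derived from the comments in the header, not from the real planner code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for IFunctionBase::ShortCircuitSettings (only two fields kept).
struct ShortCircuitSettings
{
    bool enable_lazy_execution_for_common_descendants_of_arguments = false;
    bool force_enable_lazy_execution = false;
};

// Hypothetical replacement for DataTypeWithConstInfo: only constness is modelled.
struct ArgumentInfo
{
    bool is_const;
};
using ArgumentsInfo = std::vector<ArgumentInfo>;

// Hypothetical reduced interface with the same defaults as in the diff above.
struct FunctionSketch
{
    virtual ~FunctionSketch() = default;
    virtual bool isShortCircuit(ShortCircuitSettings & /*settings*/, std::size_t /*number_of_arguments*/) const { return false; }
    virtual bool isSuitableForShortCircuitArgumentsExecution(const ArgumentsInfo & /*arguments*/) const { return false; }
};

// An if-like function: short-circuit, and it defines every field of settings.
struct IfLike : FunctionSketch
{
    bool isShortCircuit(ShortCircuitSettings & settings, std::size_t /*number_of_arguments*/) const override
    {
        settings.enable_lazy_execution_for_common_descendants_of_arguments = false;
        settings.force_enable_lazy_execution = false;
        return true;
    }
};

// A function that never needs its argument values (assumption: toTypeName-like),
// so its arguments are deferred even when they are cheap.
struct TypeNameLike : FunctionSketch
{
    bool isShortCircuit(ShortCircuitSettings & settings, std::size_t /*number_of_arguments*/) const override
    {
        settings.enable_lazy_execution_for_common_descendants_of_arguments = false;
        settings.force_enable_lazy_execution = true;
        return true;
    }
};

// Assumed policy: division may throw only when the divisor is not a constant,
// so suitability depends on the arguments.
struct DivideLike : FunctionSketch
{
    bool isSuitableForShortCircuitArgumentsExecution(const ArgumentsInfo & arguments) const override
    {
        return arguments.size() == 2 && !arguments[1].is_const;
    }
};

// A cheap function that cannot throw: keeps the defaults (never deferred on its own).
struct PlusLike : FunctionSketch
{
};

// Assumed decision rule: defer the inner function only under a short-circuit
// outer function, and only if laziness is forced or the inner function asks for it.
inline bool shouldExecuteLazily(
    const FunctionSketch & outer, std::size_t outer_arguments,
    const FunctionSketch & inner, const ArgumentsInfo & inner_arguments)
{
    ShortCircuitSettings settings;
    if (!outer.isShortCircuit(settings, outer_arguments))
        return false;
    return settings.force_enable_lazy_execution
        || inner.isSuitableForShortCircuitArgumentsExecution(inner_arguments);
}

// Usage example covering the four interesting combinations.
inline void checkExamples()
{
    const ArgumentsInfo by_column{ArgumentInfo{false}, ArgumentInfo{false}};
    const ArgumentsInfo by_constant{ArgumentInfo{false}, ArgumentInfo{true}};

    assert(shouldExecuteLazily(IfLike{}, 3, DivideLike{}, by_column));     // may throw: defer
    assert(!shouldExecuteLazily(IfLike{}, 3, DivideLike{}, by_constant));  // constant divisor: eager
    assert(!shouldExecuteLazily(IfLike{}, 3, PlusLike{}, by_column));      // cheap: eager
    assert(shouldExecuteLazily(TypeNameLike{}, 1, PlusLike{}, by_column)); // forced: defer
}
```

The sketch keeps the `false` defaults on the base class, as the diff does, so that a function opts in to lazy execution explicitly by overriding the hook.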

@ -84,6 +84,10 @@ public:
bool isDeterministicInScopeOfQuery() const override { return function->isDeterministicInScopeOfQuery(); }
bool isShortCircuit(ShortCircuitSettings & settings, size_t number_of_arguments) const override { return function->isShortCircuit(settings, number_of_arguments); }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & args) const override { return function->isSuitableForShortCircuitArgumentsExecution(args); }
bool hasInformationAboutMonotonicity() const override { return function->hasInformationAboutMonotonicity(); }
Monotonicity getMonotonicityForRange(const IDataType & type, const Field & left, const Field & right) const override

@ -37,6 +37,7 @@ private:
size_t getNumberOfArguments() const override { return 0; }
bool isVariadic() const override { return true; }
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
DataTypePtr getReturnTypeImpl(const DataTypes & types) const override
{

@ -180,6 +180,7 @@ public:
bool isDeterministic() const override { return false; }
bool isDeterministicInScopeOfQuery() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 5; }

@ -52,6 +52,8 @@ public:
return 1;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
if (arguments.size() != 1)

@ -44,6 +44,8 @@ public:
return 1;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const ColumnsWithTypeAndName & arguments) const override
{
if (arguments.size() != 1)

@ -34,6 +34,8 @@ public:
return name;
}
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
private:
size_t getNumberOfArguments() const override

@ -24,6 +24,7 @@ public:
bool useDefaultImplementationForConstants() const override { return true; }
bool isVariadic() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
size_t getNumberOfArguments() const override { return 0; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override

@ -32,6 +32,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{

@ -38,6 +38,8 @@ public:
bool isVariadic() const override { return false; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
size_t getNumberOfArguments() const override { return 1; }
bool useDefaultImplementationForConstants() const override { return true; }

@ -40,6 +40,7 @@ public:
String getName() const override;
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return false; }
size_t getNumberOfArguments() const override { return 2; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override;

@ -34,6 +34,7 @@ public:
size_t getNumberOfArguments() const override { return 1; }
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{

@ -37,6 +37,7 @@ public:
bool isVariadic() const override { return true; }
size_t getNumberOfArguments() const override { return 0; }
bool useDefaultImplementationForConstants() const override { return true; }
bool isSuitableForShortCircuitArgumentsExecution(const DataTypesWithConstInfo & /*arguments*/) const override { return true; }
DataTypePtr getReturnTypeImpl(const DataTypes & arguments) const override
{

Some files were not shown because too many files have changed in this diff.