RSS News
Planet MySQL - https://planet.mysql.com

  • Missed our Continuent Clustering 6.0 webinar and training? Don’t sweat it. Watch them on-demand.
    Missed our Continuent Clustering 6.0 webinar and technical deep dive? Don’t sweat it. Watch these recordings of our presentations and find out what’s new for multisite, multimaster clustering, what’s changed, the improvements we’ve made under the hood, and how this will improve the quality and support of your multimaster MySQL/Percona/MariaDB deployments. In our technical deep dive, we take a detailed look at the new deployment model, how the different components work together, and how you can now manage and support your new multimaster environment using Continuent Clustering 6.0. We cover the new integration of the replication, how that affects your deployment, and how that alters the management and monitoring of your database cluster.

  • CPU overheads for RocksDB queries
    An LSM like RocksDB has much better write and space efficiency than a B-Tree: with RocksDB you will use less SSD than with a B-Tree, and the SSD will either last longer or you can use lower-endurance devices. But this efficiency comes at a cost. In the worst case RocksDB might use 2X more CPU per query than a B-Tree, which means that worst-case QPS with RocksDB might be half of what it is with a B-Tree. In the examples below I show that RocksDB can use ~2X more comparisons per query than a B-Tree, and it is nice when results in practice can be explained by theory.

    First I want to explain the context where this matters and where it doesn't. It matters when the CPU overhead from RocksDB is a significant fraction of the query response time -- so the workload needs to be CPU-bound (cached working set). It isn't a problem for IO-bound workloads. Many performance results have been published on my blog and more are coming later this year. With fast storage devices available, I recommend that database workloads strive to be IO-bound to avoid using too much memory and power.

    Basic operations

    Where does the CPU overhead come from? I started to explain this in my review of the original LSM paper. My focus in this post is on SELECT statements in SQL, but note that writes in SQL also do reads from the storage engine (updates have a WHERE clause, searches are done to find matching rows and appropriate leaf pages, queries might be done to determine whether constraints will be violated, etc.).

    With RocksDB, data can be found in the memtable and in sorted runs from the L0 to Lmax (Lmax is the max/largest level in the LSM tree). There are 3 basic operations used to search for data: get, range seek and range next. Get searches for an exact match. Range seek positions iterators at the start of a range scan.
    Range next gets the next row from a range scan.

    A few comments before I explain the CPU overhead. I will try to be clear when I refer to the key for a SQL table (SQL key) versus the key as used by RocksDB (RocksDB key). Many SQL indexes can be stored in the same LSM tree (column family); they are distinguished by prepending the index ID as the first bytes of the RocksDB key. The prepended bytes aren't visible to the user. Get is used for an exact match on a PK index but not for an exact match on a secondary index, because with MyRocks the RocksDB key for a secondary index entry is made unique by appending the PK to it. A SQL SELECT that does an exact match on the secondary key searches for a prefix of the RocksDB key, and that requires a range scan.

    With RocksDB a bloom filter can be on a prefix of the key or on the entire key (prefix vs whole key). A prefix bloom filter can be used for range and point queries. A whole-key bloom filter will only be used for point queries on a PK; it won't be used for range queries, nor for point queries on a secondary index (because a secondary-index exact match uses a range scan internally). Today bloom filters are configured per column family. A good use for column families is to put secondary indexes into separate ones that are configured with prefix bloom filters. Eventually we expect to make this easier in RocksDB.

    Get

    A point query is evaluated by RocksDB by searching sorted runs in order: first the memtable, then runs in the L0, then the L1 and so on until the Lmax is reached. The search stops as soon as the key is found, whether from a tombstone or a key-value pair. The work done to search each sorted run varies. I use comparisons and bloom filter probes as the proxy for work. I ignore memory system behavior (cache misses) but assume that the skiplist isn't great in that regard:

    memtable - this is a skiplist and I assume the search cost is log2(N) comparisons when it has N entries.
    L0 - for each sorted run, first check the bloom filter; if the key might exist, do a binary search on the block index and then a binary search within the data block. The cost is a bloom filter probe and possibly log2(M) comparisons when an L0 sorted run has M entries.

    L1 to Lmax - each of these levels is range partitioned into many SST files, so the first step is a binary search to find the SST that might contain the key, then a check of that SST's bloom filter; if the key might exist, do a binary search on the block index for that SST and then a binary search within the data block. The binary search cost to find the SST gets larger for the larger levels, and that cost doesn't go away when bloom filters are used. For RocksDB with a 1T database, per-level fanout=8 and SST size=32M, the number of SSTs per level is 8 in L1, 64 in L2, 512 in L3, 4096 in L4 and 32768 in L5. The number of comparisons before the bloom filter check is 3 for L1, 6 for L2, 9 for L3 and 12 for L4. I assume there is no bloom filter on L5 because it is the max level.

    A bloom filter check isn't free. Fortunately, RocksDB makes the cost less than I expected by limiting the bits that are set for a key, and that must be checked on a probe, to one cache line. See the code in AddHash. For now I assume that the cost of a bloom filter probe is equivalent to a few (< 5) comparisons, given the cache line optimization.

    A Get operation on a B-Tree with 8B rows needs ~33 comparisons. With RocksDB it might need 20 comparisons for the memtable, bloom filter probes for 4 SSTs in L0, 3+6+9+12 SST-search comparisons for L1 to L4, bloom filter probes for L1 to L4, and then ~33 comparisons to search the max level (L5). So it is easy to see that the search cost might be double the cost for a B-Tree. The search cost for Get with RocksDB can be reduced by configuring the LSM tree to use fewer sorted runs, although we are still figuring out how much it can be reduced.
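The comparison arithmetic for Get can be checked with a short script. The counts (a ~20-comparison memtable search, the 3+6+9+12 SST searches, ~33 comparisons for the max level) are taken straight from the example in the post; treating bloom filter probes as a separate, cheap cost is the post's simplification.

```python
import math

ROWS = 8_000_000_000  # 8B rows, as in the example

# B-Tree: one binary search over all rows.
btree = math.ceil(math.log2(ROWS))  # ~33 comparisons

# RocksDB: memtable + SST-search comparisons for L1..L4 + full search of L5.
# Bloom filter probes (L0 and L1..L4) are counted separately as "a few"
# comparisons each and omitted here.
memtable = 20                 # log2 of a ~1M-entry skiplist
sst_search = [3, 6, 9, 12]    # binary search to find the SST in L1..L4
max_level = btree             # ~33 comparisons to search L5
rocksdb = memtable + sum(sst_search) + max_level

print(btree, rocksdb)
```

With the bloom filter probes added on top, the RocksDB total lands near double the B-Tree's ~33 comparisons, matching the post's worst-case estimate.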
    With this example, about 1/3 of the comparisons are from the memtable, another 1/3 from the max level and the remaining 1/3 from L0 to L4.

    Range seek and next

    The cost for a range scan has two components: range seek to initialize the scan and range next to produce each row. For range seek, an iterator is positioned within each sorted run in the LSM tree. For range next, the merging iterator combines the per-run iterators. For RocksDB the cost of range seek depends on the number of sorted runs, while the cost of range next does not (for leveled compaction, assuming a uniform distribution).

    Range seek does a binary search per sorted run. This is similar to a point query without a bloom filter, so the cost across all sorted runs depends on the number of sorted runs. In the example above, where RocksDB has a memtable, 4 SSTs in the L0, levels L1 to L5 and 8B rows, that requires 20 comparisons for the memtable, 4 x 20 comparisons for the L0 SSTs, and 21 + 24 + 27 + 30 + 33 comparisons for L1 to L5. The total is 235 comparisons. There isn't a way to avoid this, and for short range queries the cost of range seek dominates the cost of range next. While this overhead is significant for short range queries with embedded RocksDB and an in-memory workload, it is harder to notice with MyRocks because there is a lot of CPU overhead above MyRocks from optimize, parse and client RPC for a short range query. It is easier to notice the difference in CPU overhead between MyRocks and a B-Tree with longer range scans.

    The cost for range next is interesting. LevelDB has a comment that suggests using a heap, but the merging iterator code uses N-1 comparisons per row produced, which means the overhead depends on the number of sorted runs. The cost of range next in RocksDB is much less dependent on the number of sorted runs because it uses an optimized binary heap, and the number of comparisons to produce a row depends on the node that produces the winner.
    Only one comparison is done if the root produces the winner and the root produced the previous winner. Two comparisons are done if the root produces the winner but did not produce the previous winner. More comparisons are needed in other cases. The code is optimized for long runs of winners from one iterator, and that is likely with leveled compaction because Lmax is ~10X larger than the next-largest level and usually produces most winners. Using a simulation with a uniform distribution, the expected number of comparisons per row is <= 1.55 regardless of the number of sorted runs for leveled compaction. The optimization in the binary heap used by the merging iterator is limited to remembering the comparison result between the two children of the root node. Were that extended to remembering comparison results between any two sibling nodes in the heap, the expected number of comparisons would be reduced from ~1.55 to ~1.38.

    For a B-Tree, the overhead for range seek is similar to a point query -- ~33 comparisons when there are 8B rows -- and there are no merging iterators. The overhead for range next is low: usually just a move to the next row in the current leaf page.
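The range-seek total for this example can be reproduced the same way; the per-run counts (20 per skiplist/L0 SST search, 21..33 for L1 to L5) are the ones given in the post.

```python
# Comparisons to position iterators for a range seek, per the example above:
memtable = 20                    # skiplist search
l0 = 4 * 20                      # binary search in each of the 4 L0 SSTs
l1_to_l5 = [21, 24, 27, 30, 33]  # SST search + block search per level
seek_total = memtable + l0 + sum(l1_to_l5)
print(seek_total)
```

That is 235 comparisons before the first row is produced, which is why range seek dominates short range scans even though range next costs at most ~1.55 comparisons per row with the optimized heap.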

  • AMD EPYC Performance Testing… or Don’t get on the wrong side of SystemD
    Ever since AMD released their EPYC CPU for servers I have wanted to test it, but I did not have the opportunity until recently, when Packet.net started offering bare metal servers for a reasonable price. So I started a couple of instances to test Percona Server for MySQL on this CPU. In this benchmark, I discovered some interesting discrepancies in performance between AMD and Intel CPUs when running under systemd.

    The setup

    To test CPU performance, I used a read-only in-memory sysbench OLTP benchmark, as it burns CPU cycles and no IO is performed by Percona Server. For this benchmark I used Packet.net c2.medium.x86 instances powered by AMD EPYC 7401P processors; the OS sees 48 CPU threads. For the OS I tried:

    - Ubuntu 16.04 with the default kernel 4.4, and upgraded to 4.15
    - Ubuntu 18.04 with kernel 4.15
    - Percona Server started from systemd and without systemd (for reasons which will become apparent later)

    To have some points for comparison, I also ran a similar workload on my 2-socket Intel server with an Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz. I recognize this is not the most recent Intel CPU, but it was the best I had at the time, and it also gives 48 CPU threads.
    Ubuntu 16

    First, let's review the results for Ubuntu 16, in tabular format:

    Threads   kernel 4.4, systemd   kernel 4.4, no systemd   kernel 4.15
    1         943.44                948.7                    899.82
    2         1858.58               1792.36                  1774.45
    4         3533.2                3424.05                  3555.94
    8         6762.35               6731.57                  7010.51
    12        10012.18              9950.3                   10062.82
    16        13063.39              13043.55                 12893.17
    20        15347.68              15347.56                 14756.27
    24        16886.24              16864.81                 16176.76
    30        18150.2               18160.07                 17860.5
    36        18923.06              18811.64                 19528.27
    42        19374.86              19463.08                 21537.79
    48        20110.81              19983.05                 23388.18
    56        20548.51              20362.31                 23768.49
    64        20860.51              20729.52                 23797.14
    72        21123.71              21001.06                 23645.95
    80        21370                 21191.24                 23546.03
    90        21622.54              21441.73                 23486.29
    100       21806.67              21670.38                 23392.72
    128       22161.42              22031.53                 23315.33
    192       22388.51              22207.26                 22906.42
    256       22091.17              21943.37                 22305.06
    512       19524.41              19381.69                 19181.71

    There are a few conclusions we can draw from this data: the AMD EPYC CPU scales quite well with the number of CPU threads, and the more recent kernel helps to boost throughput.

    Ubuntu 18.04

    Now, let's review the results for Ubuntu 18.04:

    Threads   systemd    no systemd
    1         833.14     843.68
    2         1684.21    1693.93
    4         3346.42    3359.82
    8         6592.88    6597.48
    12        9477.92    9487.93
    16        12117.12   12149.17
    20        13934.27   13933
    24        15265.1    15152.74
    30        16846.02   16061.16
    36        18488.88   16726.14
    42        20493.57   17360.56
    48        22217.47   17906.4
    56        22564.4    17931.83
    64        22590.29   17902.95
    72        22472.75   17857.73
    80        22421.99   17766.76
    90        22300.09   17773.57
    100       22275.24   17646.7
    128       22131.86   17411.55
    192       21750.8    17134.63
    256       21177.25   16826.53
    512       18296.61   17418.72

    This is where the result surprised me: on Ubuntu 18.04, running Percona Server for MySQL as a systemd service gave up to 24% better throughput than starting it from a bash shell. I do not know exactly what causes this dramatic difference: systemd uses different slices for services and user commands, and somehow this affects the performance.
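The size of the systemd gap on Ubuntu 18.04 can be computed directly from the table; the 48-thread row is roughly where it peaks.

```python
# Throughput at 48 threads on Ubuntu 18.04 (values from the table above).
with_systemd = 22217.47
without_systemd = 17906.4

gain = with_systemd / without_systemd - 1
print(f"{gain:.1%}")  # roughly a 24% advantage for the systemd service
```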
    Baseline benchmark

    To establish a baseline, I ran the same benchmark on my Intel box running Ubuntu 16, with two kernels, 4.13 and 4.15:

    Threads   kernel 4.13, systemd   kernel 4.15, systemd   kernel 4.15, no systemd
    1         820.07                 798.42                 864.21
    2         1563.31                1609.96                1681.91
    4         2929.63                3186.01                3338.47
    8         6075.73                6279.49                6624.49
    12        8743.38                9256.18                9622.6
    16        10580.14               11351.31               11984.64
    20        12790.96               12599.78               14147.1
    24        14213.68               14659.49               15716.61
    30        15983.78               16096.03               17530.06
    36        17574.46               18098.36               20085.9
    42        18671.14               19808.92               21875.84
    48        19431.05               22036.06               23986.08
    56        19737.92               22115.34               24275.72
    64        19946.57               21457.32               24054.09
    72        20129.7                21729.78               24167.03
    80        20214.93               21594.51               24092.86
    90        20194.78               21195.61               23945.93
    100       20753.44               21597.26               23802.16
    128       20235.24               20684.34               23476.82
    192       20280.52               20431.18               23108.36
    256       20410.55               20952.64               22775.63
    512       20953.73               22079.18               23489.3

    Here we see the opposite result with systemd: Percona Server running from a bash shell shows better throughput than the systemd service. So for some reason, systemd works differently for AMD and Intel CPUs. Please let me know if you have any ideas on how to deal with the impact that systemd has on performance.

    Conclusions

    There are some conclusions from these results: AMD EPYC shows decent performance scalability, and the new kernel helps to improve it; systemd has different effects on throughput for AMD and Intel CPUs; and with AMD the throughput declines under a highly concurrent workload with 512 threads, while Intel does not show such a decline.

    The post AMD EPYC Performance Testing… or Don’t get on the wrong side of SystemD appeared first on Percona Database Performance Blog.

  • The best way to use SQL functions in JPQL or Criteria API queries with JPA and Hibernate
    Introduction: when executing an entity query (e.g. JPQL, HQL or Criteria API), you can use any SQL function without having to register it, as long as the function is passed directly to the WHERE clause of the underlying SQL statement. However, if the SQL function is used in the SELECT clause, and Hibernate has not … Continue reading. The post The best way to use SQL functions in JPQL or Criteria API queries with JPA and Hibernate appeared first on Vlad Mihalcea.

  • What is MySQL partitioning ?
    MySQL partitioning distributes the data of an individual table (typically we recommend partitioning large, I/O-heavy tables, for performance, scalability and manageability) across multiple files according to a partitioning strategy. In very simple terms, different portions of the table are stored as separate tables in different locations so that I/O is distributed optimally. The user-defined rule by which the data is divided is known as the partitioning function; in MySQL you can partition data by a RANGE of values, a LIST of values, an internal hashing function or a linear hashing function.

    By restricting query examination to the partitions that can contain matching rows, query performance can improve many times over compared with the same query on a non-partitioned table. This is called partition pruning (trimming of unwanted partitions). An example:

    CREATE TABLE tab1 (
        col1 VARCHAR(30) NOT NULL,
        col2 VARCHAR(30) NOT NULL,
        col3 TINYINT UNSIGNED NOT NULL,
        col4 DATE NOT NULL
    )
    PARTITION BY RANGE( col3 ) (
        PARTITION p0 VALUES LESS THAN (100),
        PARTITION p1 VALUES LESS THAN (200),
        PARTITION p2 VALUES LESS THAN (300),
        PARTITION p3 VALUES LESS THAN MAXVALUE
    );

    A SELECT query benefitting from partition pruning:

    SELECT col1, col2, col3, col4 FROM tab1 WHERE col3 > 200 AND col3 < 250;

    What is explicit partitioning in MySQL, and how is it different from partition pruning? In MySQL we can explicitly select partitions and subpartitions when executing a statement. This sounds very similar to partition pruning, but there is a difference: the partitions to be checked are explicitly named in the statement, whereas partition pruning is automatic. Also, explicit selection of partitions is supported for both queries and DML statements, while partition pruning applies only to queries.
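The pruning decision for the RANGE example above can be sketched in plain Python; the bounds list and helper function are illustrative, not MySQL internals.

```python
import bisect

# Upper bounds of p0..p3 from the CREATE TABLE above (MAXVALUE -> infinity).
BOUNDS = [100, 200, 300, float("inf")]

def partitions_to_read(lo, hi):
    """Return indexes of partitions that may hold rows with lo < col3 < hi."""
    first = bisect.bisect_right(BOUNDS, lo)  # partition holding values just above lo
    last = bisect.bisect_left(BOUNDS, hi)    # partition holding values just below hi
    return list(range(first, last + 1))

# WHERE col3 > 200 AND col3 < 250 needs only p2 -- the other three are pruned.
print(partitions_to_read(200, 250))
```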
    The SQL statements supported with explicit partition selection are SELECT, INSERT, UPDATE, DELETE, LOAD DATA, LOAD XML and REPLACE. An explicit-partition example:

    CREATE TABLE customer (
        cust_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
        cust_fname VARCHAR(25) NOT NULL,
        cust_lname VARCHAR(25) NOT NULL,
        cust_phone INT NOT NULL,
        cust_fax INT NOT NULL
    )
    PARTITION BY RANGE(cust_id) (
        PARTITION p0 VALUES LESS THAN (100),
        PARTITION p1 VALUES LESS THAN (200),
        PARTITION p2 VALUES LESS THAN (300),
        PARTITION p3 VALUES LESS THAN MAXVALUE
    );

    A query explicitly naming a partition:

    mysql> SELECT * FROM customer PARTITION (p1);

    RANGE partitioning

    With RANGE partitioning you partition on values within given ranges. Ranges must be contiguous but not overlapping, and are usually defined with the VALUES LESS THAN operator. The following examples show how to create and use RANGE partitioning:

    CREATE TABLE customer_contract (
        cust_id INT NOT NULL,
        cust_fname VARCHAR(30),
        cust_lname VARCHAR(30),
        st_dt DATE NOT NULL DEFAULT '1970-01-01',
        end_dt DATE NOT NULL DEFAULT '9999-12-31',
        contract_code INT NOT NULL,
        contract_id INT NOT NULL
    )
    PARTITION BY RANGE (contract_id) (
        PARTITION p0 VALUES LESS THAN (50),
        PARTITION p1 VALUES LESS THAN (100),
        PARTITION p2 VALUES LESS THAN (150),
        PARTITION p3 VALUES LESS THAN (200)
    );

    Suppose instead that you wish to partition based on the year the contract ended:

    CREATE TABLE customer_contract (
        cust_id INT NOT NULL,
        cust_fname VARCHAR(30),
        cust_lname VARCHAR(30),
        st_dt DATE NOT NULL DEFAULT '1970-01-01',
        end_dt DATE NOT NULL DEFAULT '9999-12-31',
        contract_code INT NOT NULL,
        contract_id INT NOT NULL
    )
    PARTITION BY RANGE (YEAR(end_dt)) (
        PARTITION p0 VALUES LESS THAN (2001),
        PARTITION p1 VALUES LESS THAN (2002),
        PARTITION p2 VALUES LESS THAN (2003),
        PARTITION p3 VALUES LESS THAN (2004)
    );

    It is also possible to partition a table by RANGE based on the value of a TIMESTAMP column, using the UNIX_TIMESTAMP() function, as shown in this example:

    CREATE TABLE sales_forecast (
        sales_forecast_id INT NOT NULL,
        sales_forecast_status VARCHAR(20) NOT NULL,
        sales_forecast_updated TIMESTAMP NOT NULL
            DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
    )
    PARTITION BY RANGE ( UNIX_TIMESTAMP(sales_forecast_updated) ) (
        PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
        PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
        PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
        PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
        PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
        PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
        PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
        PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
        PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
        PARTITION p9 VALUES LESS THAN (MAXVALUE)
    );

    LIST partitioning

    The difference between RANGE and LIST partitioning is that in LIST partitioning each partition holds a chosen list of values of a specific column, declared with PARTITION BY LIST(expr), where expr is the partitioning column or expression. An example:

    CREATE TABLE students (
        student_id INT NOT NULL,
        student_fname VARCHAR(30),
        student_lname VARCHAR(30),
        student_joined DATE NOT NULL DEFAULT '1970-01-01',
        student_separated DATE NOT NULL DEFAULT '9999-12-31',
        student_house INT,
        student_grade_id INT
    )
    PARTITION BY LIST(student_grade_id) (
        PARTITION P1 VALUES IN (1,2,3,4),
        PARTITION P2 VALUES IN (5,6,7),
        PARTITION P3 VALUES IN (8,9,10),
        PARTITION P4 VALUES IN (11,12)
    );

    HASH partitioning

    HASH partitioning distributes data evenly among a predetermined number of partitions. With RANGE and LIST partitioning you must explicitly define the partitioning logic and which partition a given column value or set of column values is stored in.
    In HASH partitioning, MySQL takes care of this. An example:

    CREATE TABLE store (
        store_id INT NOT NULL,
        store_name VARCHAR(30),
        store_location VARCHAR(30),
        store_started DATE NOT NULL DEFAULT '1997-01-01',
        store_code INT
    )
    PARTITION BY HASH(store_id)
    PARTITIONS 4;

    P.S.: if you do not include a PARTITIONS clause, the number of partitions defaults to 1.

    LINEAR HASH partitioning

    LINEAR HASH partitioning uses a linear powers-of-two algorithm, whereas HASH partitioning uses the modulus of the hashing function's value. A LINEAR HASH example:

    CREATE TABLE store (
        store_id INT NOT NULL,
        store_name VARCHAR(30),
        store_location VARCHAR(30),
        store_started DATE NOT NULL DEFAULT '1997-01-01',
        store_code INT
    )
    PARTITION BY LINEAR HASH( YEAR(store_started) )
    PARTITIONS 4;

    KEY partitioning

    KEY partitioning is very similar to HASH; the only difference is that the hashing function is supplied by MySQL. With MySQL NDB Cluster, MD5() is used; for tables using other storage engines, the server uses its own storage-engine-specific hashing function, based on the same algorithm as PASSWORD().

    CREATE TABLE contact (
        id INT NOT NULL,
        name VARCHAR(20),
        contact_number INT,
        email VARCHAR(50),
        UNIQUE KEY (id)
    )
    PARTITION BY KEY()
    PARTITIONS 5;

    P.S.: if the unique key column were not defined as NOT NULL, the previous statement would fail.
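The difference between plain HASH (a modulus) and LINEAR HASH (a powers-of-two mask) can be sketched in Python. The loop below follows the linear-hash description in the MySQL reference manual; treat it as an illustration of the idea rather than the server's exact code.

```python
def hash_partition(value, num_parts):
    # Plain HASH: MOD(expr, num_parts).
    return value % num_parts

def linear_hash_partition(value, num_parts):
    # LINEAR HASH: mask with the next power of two >= num_parts,
    # then shrink the mask until the result names a valid partition.
    v = 1
    while v < num_parts:
        v *= 2
    n = value & (v - 1)
    while n >= num_parts:
        v //= 2
        n = n & (v - 1)
    return n

# With a non-power-of-two partition count the two schemes can disagree:
years = [1997, 1998, 1999, 2000, 2001, 2002]
print([hash_partition(y, 6) for y in years])
print([linear_hash_partition(y, 6) for y in years])
```

When the partition count is itself a power of two (as in the PARTITIONS 4 examples above), the mask never shrinks and both schemes pick the same partition.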
    Subpartitioning

    Subpartitioning is also known as composite partitioning: you partition a table by combining, for example, RANGE and HASH. An example:

    CREATE TABLE purchase (id INT, item VARCHAR(30), purchase_date DATE)
    PARTITION BY RANGE( YEAR(purchase_date) )
    SUBPARTITION BY HASH( TO_DAYS(purchase_date) )
    SUBPARTITIONS 2 (
        PARTITION p0 VALUES LESS THAN (2000),
        PARTITION p1 VALUES LESS THAN (2010),
        PARTITION p2 VALUES LESS THAN MAXVALUE
    );

    It is also possible to define subpartitions explicitly, using SUBPARTITION clauses to specify options for individual subpartitions:

    CREATE TABLE purchase (id INT, item VARCHAR(30), purchase_date DATE)
    PARTITION BY RANGE( YEAR(purchase_date) )
    SUBPARTITION BY HASH( TO_DAYS(purchase_date) ) (
        PARTITION p0 VALUES LESS THAN (2000) ( SUBPARTITION s0, SUBPARTITION s1 ),
        PARTITION p1 VALUES LESS THAN (2010) ( SUBPARTITION s2, SUBPARTITION s3 ),
        PARTITION p2 VALUES LESS THAN MAXVALUE ( SUBPARTITION s4, SUBPARTITION s5 )
    );

    Things to remember: each partition must have the same number of subpartitions; each SUBPARTITION clause must include (at a minimum) a name for the subpartition, while any other option may be set or left at its default; and subpartition names must be unique across the entire table.
    For example, the following CREATE TABLE statement is valid in MySQL 5.7:

    CREATE TABLE purchase (id INT, item VARCHAR(30), purchase_date DATE)
    PARTITION BY RANGE( YEAR(purchase_date) )
    SUBPARTITION BY HASH( TO_DAYS(purchase_date) ) (
        PARTITION p0 VALUES LESS THAN (1990) ( SUBPARTITION s0, SUBPARTITION s1 ),
        PARTITION p1 VALUES LESS THAN (2000) ( SUBPARTITION s2, SUBPARTITION s3 ),
        PARTITION p2 VALUES LESS THAN MAXVALUE ( SUBPARTITION s4, SUBPARTITION s5 )
    );

    MySQL partitioning limitations

    MySQL partitioning also has limitations, which we list below. Every unique key on the table must include all columns in the table's partitioning function. For example, this fails because the unique keys do not cover all partitioning columns:

    CREATE TABLE tab3 (
        column1 INT NOT NULL,
        column2 DATE NOT NULL,
        column3 INT NOT NULL,
        column4 INT NOT NULL,
        UNIQUE KEY (column1, column2),
        UNIQUE KEY (column3)
    )
    PARTITION BY HASH(column1 + column3)
    PARTITIONS 4;

    Expect this error after running the above script:

    ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

    The right way of doing it:

    CREATE TABLE table12 (
        column1 INT NOT NULL,
        column2 DATE NOT NULL,
        column3 INT NOT NULL,
        column4 INT NOT NULL,
        UNIQUE KEY (column1, column2, column3)
    )
    PARTITION BY HASH(column3)
    PARTITIONS 5;

    CREATE TABLE table25 (
        column11 INT NOT NULL,
        column12 DATE NOT NULL,
        column13 INT NOT NULL,
        column14 INT NOT NULL,
        UNIQUE KEY (column11, column13)
    )
    PARTITION BY HASH(column11 + column13)
    PARTITIONS 5;

    Since a primary key is by definition a unique key, this restriction also includes the table's primary key, if it has one.
    The examples below show this limitation. Both of the following statements fail, because in each case the primary key does not include all columns referenced in the partitioning expression:

    CREATE TABLE table55 (
        column11 INT NOT NULL,
        column12 DATE NOT NULL,
        column13 INT NOT NULL,
        column14 INT NOT NULL,
        PRIMARY KEY(column11, column12)
    )
    PARTITION BY HASH(column13)
    PARTITIONS 4;

    CREATE TABLE table65 (
        column20 INT NOT NULL,
        column25 DATE NOT NULL,
        column30 INT NOT NULL,
        column35 INT NOT NULL,
        PRIMARY KEY(column20, column30),
        UNIQUE KEY(column25)
    )
    PARTITION BY HASH( YEAR(column25) )
    PARTITIONS 5;

    Both of the above scripts return this error:

    ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

    The right way of doing it:

    CREATE TABLE t45 (
        column50 INT NOT NULL,
        column55 DATE NOT NULL,
        column60 INT NOT NULL,
        column65 INT NOT NULL,
        PRIMARY KEY(column50, column55)
    )
    PARTITION BY HASH(column50 + YEAR(column55))
    PARTITIONS 5;

    CREATE TABLE table88 (
        column80 INT NOT NULL,
        column81 DATE NOT NULL,
        column82 INT NOT NULL,
        column83 INT NOT NULL,
        PRIMARY KEY(column80, column81, column82),
        UNIQUE KEY(column81, column82)
    );

    Both of these statements are valid: t45's primary key includes every column in its partitioning expression, and table88 is not partitioned at all, so the restriction does not apply even though its unique key does not match its primary key.
    You can still successfully partition a MySQL table that has no unique keys at all (including no primary key), and you may use any column or columns in the partitioning expression as long as the column type is compatible with the partitioning type. The example below partitions a table with no unique or primary keys:

    CREATE TABLE table_has_no_pk (column10 INT, column11 INT, column12 VARCHAR(20))
    PARTITION BY RANGE(column10) (
        PARTITION p0 VALUES LESS THAN (500),
        PARTITION p1 VALUES LESS THAN (600),
        PARTITION p2 VALUES LESS THAN (700),
        PARTITION p3 VALUES LESS THAN (800)
    );

    You cannot later add a unique key to a partitioned table unless the key includes all columns used by the table's partitioning expression. These statements succeed:

    ALTER TABLE table_has_no_pk ADD PRIMARY KEY(column10);
    ALTER TABLE table_has_no_pk DROP PRIMARY KEY;
    ALTER TABLE table_has_no_pk ADD PRIMARY KEY(column10, column11);
    ALTER TABLE table_has_no_pk DROP PRIMARY KEY;

    However, the next statement fails, because column10 is part of the partitioning key but is not part of the proposed primary key:

    mysql> ALTER TABLE table_has_no_pk ADD PRIMARY KEY(column11);
    ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function

    MySQL partitioning limitations (at the storage engine level)

    InnoDB: InnoDB foreign keys and MySQL partitioning are not compatible. Partitioned InnoDB tables cannot have foreign key references, nor can they have columns referenced by foreign keys, so you cannot partition InnoDB tables that have, or are referenced by, foreign keys. InnoDB does not support the use of multiple disks for subpartitions (MyISAM does support this). Also, use ALTER TABLE ... REBUILD PARTITION and ALTER TABLE ... ANALYZE PARTITION rather than ALTER TABLE ... OPTIMIZE PARTITION.

    NDB storage engine: we can only partition by KEY (including LINEAR KEY) in the NDB storage engine.

    FEDERATED storage engine: partitioning is not supported.
    CSV storage engine: partitioning is not supported.

    MERGE storage engine: tables using the MERGE storage engine cannot be partitioned, and partitioned tables cannot be merged.

    The MySQL functions shown in the following list are allowed in partitioning expressions: ABS(), CEILING(), DATEDIFF(), DAY(), DAYOFMONTH(), DAYOFWEEK(), DAYOFYEAR(), EXTRACT(), FLOOR(), HOUR(), MICROSECOND(), MINUTE(), MOD(), MONTH(), QUARTER(), SECOND(), TIME_TO_SEC(), TO_DAYS(), TO_SECONDS(), UNIX_TIMESTAMP(), WEEKDAY(), YEAR() and YEARWEEK().

    MySQL partitioning and locks

    Effect on DML statements: in MySQL 5.7, updating a partitioned MyISAM table causes only the affected partitions to be locked. SELECT statements (including those containing unions or joins) lock only those partitions that actually need to be read; this also applies to SELECT ... PARTITION. An UPDATE prunes locks only for tables on which no partitioning columns are updated. REPLACE and INSERT lock only those partitions having rows to be inserted or replaced; however, if an AUTO_INCREMENT value is generated for any partitioning column, all partitions are locked. INSERT ... ON DUPLICATE KEY UPDATE is pruned as long as no partitioning column is updated. INSERT ... SELECT locks only those partitions in the source table that need to be read, although all partitions in the target table are locked. Locks imposed by LOAD DATA statements on partitioned tables cannot be pruned.

    Effect on DDL statements: CREATE VIEW does not cause any locks. ALTER TABLE ... EXCHANGE PARTITION prunes locks; only the exchanged table and the exchanged partition are locked. ALTER TABLE ... TRUNCATE PARTITION prunes locks; only the partitions to be emptied are locked. In addition, ALTER TABLE statements take table-level metadata locks.

    Effect on other statements: LOCK TABLES cannot prune partition locks. CALL stored_procedure(expr) supports lock pruning, but evaluating expr does not. DO and SET statements do not support partitioning lock pruning.
The post What is MySQL partitioning ? appeared first on MySQL Consulting, Support and Remote DBA Services.

Copyright © 2021 ddns. All rights reserved.
Joomla! is free software released under the GNU GPL.