May
09
2019
--

Improving OLAP Workload Performance for PostgreSQL with ClickHouse Database

Improving OLAP Workload Performance for PostgreSQL

Every database management system is not optimized for every workload. Database systems are designed for specific loads, and thereby give better performance for that workload. Similarly, some kinds of queries work better on some database systems and worse on others. This is the era of specialization, where a product is designed for a specific requirement or a specific set of requirements. We cannot achieve everything performance-wise from a single database system. PostgreSQL is one of the finest object databases and performs really well in OLTP types of workloads, but I have observed that its performance is not as good as some other database systems for OLAP workloads. ClickHouse is one of the examples that outperforms PostgreSQL for OLAP.

ClickHouse is an open source, column-based database management system which claims to be 100–1,000x faster than traditional approaches, capable of processing of more than a billion rows in less than a second.

Foreign Data Wrapper (Clickhousedb_fdw)

If we have a different kind of workload that PostgreSQL is not optimized for, what is the solution? Fortunately, PostgreSQL provides a mechanism where you can handle a different kind of workload while using it, by creating a data proxy within PostgreSQL that internally calls and fetches data from a different database system. This is called Foreign Data Wrapper, which is based on SQL-MED. Percona provides a foreign data wrapper for the Clickhousedb database system, which is available at Percona’s GitHub project repository.

Benchmark Machine

  • Supermicro server:
    • Intel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
    • 2 sockets / 28 cores / 56 threads
    • Memory: 256GB of RAM
    • Storage: Samsung  SM863 1.9TB Enterprise SSD
    • Filesystem: ext4/xfs
  • OS: Linux smblade01 4.15.0-42-generic #45~16.04.1-Ubuntu
  • PostgreSQL: version 11

Benchmark Workload

Ontime (On Time Reporting Carrier On-Time Performance) is an openly-available dataset I have used to benchmark. It has a table size of 85GB with 109 number of different types of columns, and its designed queries more closely emulate the real world. 

https://github.com/Percona-Lab/ontime-airline-performance/blob/master/download.sh

Queries

Q# Query Contains Aggregates and Group By
Q1 SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
Q2 SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC;
Q3 SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10;
Q4 SELECT Carrier, count(*) FROM ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count(*) DESC;
Q5 SELECT a.Carrier, c, c2, c*1000/c2 as c3 FROM ( SELECT Carrier, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year=2007 GROUP BY Carrier ) a
INNER JOIN (  SELECT   Carrier,count(*) AS c2 FROM ontime WHERE Year=2007 GROUP BY Carrier)b on a.Carrier=b.Carrier ORDER BY c3 DESC;
Q6 SELECT a.Carrier, c, c2, c*1000/c2 as c3 FROM ( SELECT Carrier,  count(*) AS c FROM ontime  WHERE DepDelay>10 AND Year >= 2000 AND
Year <= 2008 GROUP BY Carrier) a INNER JOIN ( SELECT  Carrier, count(*) AS c2 FROM ontime  WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier ) b on a.Carrier=b.Carrier ORDER BY c3 DESC;
Q7 SELECT Carrier, avg(DepDelay) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier;
Q8 SELECT Year, avg(DepDelay) FROM ontime GROUP BY Year;
Q9 select Year, count(*) as c1 from ontime group by Year;
Q10 SELECT avg(cnt) FROM (SELECT Year,Month,count(*) AS cnt FROM ontime WHERE DepDel15=1 GROUP BY Year,Month) a;
Q11 select avg(c1) from (select Year,Month,count(*) as c1 from ontime group by Year,Month) a;
Q12 SELECT OriginCityName, DestCityName, count(*) AS c FROM ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10;
Q13 SELECT OriginCityName, count(*) AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10;
Query Contains Joins
Q14 SELECT a.Year, c1/c2 FROM ( select Year, count(*)*1000 as c1 from ontime WHERE DepDelay>10 GROUP BY Year) a
INNER JOIN (select Year, count(*) as c2 from ontime GROUP BY Year ) b on a.Year=b.Year ORDER BY a.Year;
Q15 SELECT a.”Year”, c1/c2 FROM ( select “Year”, count(*)*1000 as c1 FROM fontime WHERE “DepDelay”>10 GROUP BY “Year”) a INNER JOIN (select “Year”, count(*) as c2
FROM fontime GROUP BY “Year” ) b on a.”Year”=b.”Year”;

Table-1: Queries used in the benchmark

 

Benchmark Result

Improving OLAP Workload Performance for PostgreSQL

 

This graph shows the time taken by each query in PostgreSQL, ClickHouse, and Clickhusedb_fdw. You can clearly see that ClickHouse performs really well on its own while PostgreSQL doesn’t. The benchmark has been done on a single node with multiple CPU cores and can scale with multiple nodes. But I think we can see enough performance as ClickHouse is almost 1000x faster in most of the queries. In a previous blog post, I went over some of the reasons why some queries do not perform well in clickhouse_fdw, if you would like to learn more.

Conclusion

The graph above proves my point that PostgreSQL is not fit for every type of workload and query. However, the beauty of PostgreSQL is its ability to use its foreign data wrappers to create a heterogeneous environment with other database systems, which can then optimally handle specific workloads.

The clickhousedb_fdw clearly gives users the power to use ClickHouse for much-improved performance for OLAP workloads whilst still interacting with PostgreSQL only. This has simplified client applications that don’t need to interact with multiple database systems.

May
07
2019
--

MariaDB Track at Percona Live

MariaDB track at Percona Live 2019

mariadb logoLess than one month left until Percona Live. This time the Committee work was a bit unusual. Instead of having one big committee for the whole conference we had a few mini-committees, each responsible for a track. Each independent mini-committee, in turn, had a leader who was responsible for the whole process. I led the MariaDB track. In this post, I want to explain how we worked, which topics we have chosen, and why.

For MariaDB, we had seven slots: five for 50-minutes talks, two for 25-minutes talks and 19 submissions. We had to reject two out of three proposals. We also had to decide how many topics the program should cover. My aim here was to use the MariaDB track to demonstrate as many MariaDB unique features as possible. I also wanted to have as many speakers as possible, considering the number of slots we had available.

The committee agreed, and we tried our best for the program to cover the various topics. If someone sent us two or more proposals, we choose only one to allow more speakers to attend.

We also looked to identify gaps in submitted sessions. For example, if we wanted for a topic to be covered and no one sent a proposal with such a subject, we invited potential speakers and asked them to submit with that topic in mind. Or we asked those who already submitted similar talks to improve them.

In the end, we have five 50-minutes sessions, one MariaDB session in the MySQL track, two 25-minutes sessions, one tutorial, and one keynote. All of them are by different speakers.

The Program

The first MariaDB event will be a tutorial: “Expert MariaDB: Harness the Strengths of MariaDB Server” by Colin Charles on Tuesday, May 28

Colin started his MySQL career as a Community Engineer back in the MySQL AB times. He worked on numerous MySQL events, both big and small, including Percona Live’s predecessor, O’Reilly’s MySQL Conference and Expo. Colin joined Monty Program Ab, and MariaDB Corporation as a Chief Evangelist, then spent two years as Chief Evangelist at Percona. Now he is an independent consultant at his own company GrokOpen.

Colin will not only talk about unique MariaDB features up to version 10.4, but will also help you try all of them out. This tutorial is a must-attend for everyone interested in MariaDB.

Next day: Wednesday, May 29 – the first conference day – will be the MariaDB Track day.

MariaDB Foundation Bronze SponsorshipMariaDB talks will start from the keynote by Vicentiu Ciorbaru about new MariaDB features in version 10.4. He will highlight all the significant additions in this version.

Vicentiu started his career at MariaDB Foundation as a very talented Google Summer of Code student. His first project was Roles. Then he worked a lot on MariaDB Optimizer, bug fixes, and code maintenance. At the same time, he discovered a talent for public speaking, and now he is the face of MariaDB Foundation.

We at the committee had a hard choice: either to accept his 50-minutes session proposal or ask him to make a keynote. This decision was not easy, because a keynote is shorter than 50 minutes. At the same time, though, everyone at the conference will be able to see it. Brand new features of version 10.4 are a very important topic. Therefore, we decided that it would be best to have Vicentiu as a keynote speaker.

Morning sessions

virtualhealthSessions will start with a talk by Alexander Rubin “Opensource Column Store Databases: MariaDB ColumnStore vs. ClickHouse” Alex began his MySQL career as a web developer, then joined MySQL AB as a consultant. He then moved to Percona as Principal Architect. It was our loss when he left Percona to start applying his recommendations himself on behalf of a medical startup VirtualHealth! During his career as a MySQL consultant, he tried all the sexiest database products, loaded terabytes of data into them, ran the deadly intensive loads. He is the one who knows best about database strengths and weaknesses. I would recommend his session to everyone who is considering a column store solution.

codership_720_156Next talk is “Galera Cluster New Features” by Seppo Jaakola. This session is about the long-awaited Galera 4 library. Seppo is one of three founders of Codership Oy: the company which brought us Galera library. Before the year 2007, when the Galera library was first released, MySQL users had to choose between asynchronous replication and asynchronous replication (that’s not a typo). Seppo brought us a solution which allowed us to continue using InnoDB in the style we were used to using while writing to all nodes. The Galera library looks after the data consistency. After more than ten years the product is mature and leaving its competitors far behind. The new version brings us streaming replication technology and other improvements which relax usage limitations and make Galera Cluster more stable. I recommend this session for everyone who looks forward to a synchronous replication future.

Afternoon sessions

Walmart LogoAfter the lunch break, we will meet MariaDB users Sandeep Jangra and Andre Van Looveren who will show how they use MariaDB at Walmart in their talk “Lessons Learned Building a Fully Automated Database Platform as a Service Using Open Source Technologies in the Cloud”. Sandeep and Andre manage more than 6000 MariaDB installations. In addition to setting up automation, they have experience with migration and upgrade. This talk will be an excellent case study, which I recommend to attend everyone who is considering implementing automation for a farm of MariaDB or MySQL servers.MariaDB Foundation

Next topic is “MariaDB Security Features and Best Practices” by Robert Bindar.  Robert is a server Developer at MariaDB Foundation. He will cover best security practices for MariaDB deployment, including the latest security features, added to version 10.4

At 4:15 pm we will have two MariaDB topics in parallel

MariaDB Foundation Bronze Sponsorship“MariaDB and MySQL – What Statistics Optimizer Needs Or When and How Not to Use Indexes” by Sergei Golubchik – a Member of the MariaDB Foundation Board – discovers optimization techniques which are often ignored in favor of indexes. Sergei worked on MySQL, and then on MariaDB, from their very first days. I’ve known him since 2006 when I joined the MySQL team. Each time when I am in trouble to find out how a particular piece of code works, just a couple of words from Sergei help to solve the issue! He has an encyclopedic knowledge on both MariaDB and MySQL databases. In this session, Sergei will explain which statistics optimizer we can use in addition to indexes. While he will focus on specific MariaDB features he will cover MySQL too. Spoiler: these are not only histograms!

Backups in the MySQL track…

In the parallel MySQL track, Iwo Panowicz and Juan Pablo Arruti will speak about backups in their “Percona XtraBackup vs. Mariabackup vs. MySQL Enterprise Backup” Iwo and Juan Pablo are Support Engineers at Percona. Iwo joined Percona two years ago, and now he is one of the most senior engineers in the EMEA team. Linux, PMM, analyzing core files, engineering best practices: Iwo is well equipped to answer all these and many more questions. Juan Pablo works in the American Support team for everything around MariaDB and MySQL: replication, backup, performance issues, data corruption… Through their support work, Iwo and Juan Pablo have had plenty of chances to find out strengths and weaknesses of different backup solutions.

Three tools, which they will cover in the talk, can be used to make a physical backup of MySQL and MariaDB databases, and this is the fastest and best recommended way to work with an actively used server. But what is the difference? When and why should you prefer one instrument over another? Iwo and Juan Pablo will answer these questions.

At the end of the day we will have two 25-minute sessions

Alibaba CloudJim Tommaney will present “Tips and Tricks with MariaDB ColumnStore”. Unlike Alex Rubin, who is an end user of ColumnStore databases, Jim is from another side: development. Thus his insights into MariaDB ColumnStore could be fascinating. If you are considering ColumnStore: this topic is a must-go!

Daniel Black will close the day with his talk “Squash That Old Bug”. This topic is the one I personally am looking forward to the most! Not only because I stick with bugs. But, well… the lists of accepted patches which Daniel’s posts to MariaDB and to MySQL servers are impressive. Especially when you know how strict is the quality control for external patches in MariaDB and MySQL! IBMIn his talk, Daniel is going to help you to start contributing yourself. And to do it successfully, so your patches are accepted. This session is very important for anyone who has asked themselves why one or another MariaDB or MySQL bug has not been fixed for a long time. I do not know a single user who has not asked that question!

MariaDB track at Percona Live 2019Conclusion

This blog about MariaDB track at Percona Live covers eight sessions, one keynote, one tutorial, 12 speakers, seven mini-committee members – two of whom are also speakers. We worked hard, and continue to work hard, to bring you great MariaDB program.

I cannot wait for the show to begin!


Photo by shannon VanDenHeuvel on Unsplash

Feb
15
2019
--

ClickHouse Performance Uint32 vs Uint64 vs Float32 vs Float64

Q1 least compression

While implementing ClickHouse for query executions statistics storage in Percona Monitoring and Management (PMM),  we were faced with a question of choosing the data type for metrics we store. It came down to this question: what is the difference in performance and space usage between Uint32, Uint64, Float32, and Float64 column types?

To test this, I created a test table with an abbreviated and simplified version of the main table in our ClickHouse Schema.

The “number of queries” is stored four times in four different columns to be able to benchmark queries referencing different columns.  We can do this with ClickHouse because it is a column store and it works only with columns referenced by the query. This method would not be appropriate for testing on MySQL, for example.

CREATE TABLE test
(
    digest String,
    db_server String,
    db_schema String,
    db_username String,
    client_host String,
    period_start DateTime,
    nq_UInt32 UInt32,
    nq_UInt64 UInt64,
    nq_Float32 Float32,
    nq_Float64 Float64
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(period_start)
ORDER BY (digest, db_server, db_username, db_schema, client_host, period_start)
SETTINGS index_granularity = 8192

When testing ClickHouse performance you need to consider compression. Highly compressible data (for example just a bunch of zeroes) will compress very well and may be processed a lot faster than incompressible data. To take this into account we will do a test with three different data sets:

  • Very Compressible when “number of queries” is mostly 1
  • Somewhat Compressible when we use a range from 1 to 1000 and
  • Poorly Compressible when we use range from 1 to 1000000.

Since it’s unlikely that an application will use the full 32 bit range, we haven’t used it for this test.

Another factor which can impact ClickHouse performance is the number of “parts” the table has. After loading the data we ran OPTIMIZE TABLE FINAL to ensure only one part is there on the disk. Note: ClickHouse will gradually delete old files after the optimize command has completed. To avoid these operations interfering with benchmarks, I waited for about 15 minutes to ensure all unused data was removed from the disk.

The amount of memory on the system was enough to cache whole columns in all tests, so this is an in-memory test.

Here is how the table with only one part looks on disk:

root@d01e692c291f:/var/lib/clickhouse/data/pmm/test_lc# ls -la
total 28
drwxr-xr-x 4 clickhouse clickhouse 12288 Feb 10 20:39 .
drwxr-xr-x 8 clickhouse clickhouse 4096 Feb 10 22:38 ..
drwxr-xr-x 2 clickhouse clickhouse 4096 Feb 10 20:30 201902_1_372_4
drwxr-xr-x 2 clickhouse clickhouse 4096 Feb 10 19:38 detached
-rw-r--r-- 1 clickhouse clickhouse 1 Feb 10 19:38 format_version.txt

When you have only one part it makes it very easy to see the space different columns take:

root@d01e692c291f:/var/lib/clickhouse/data/pmm/test_lc/201902_1_372_4# ls -la
total 7950468
drwxr-xr-x 2 clickhouse clickhouse 4096 Feb 10 20:30 .
drwxr-xr-x 4 clickhouse clickhouse 12288 Feb 10 20:39 ..
-rw-r--r-- 1 clickhouse clickhouse 971 Feb 10 20:30 checksums.txt
-rw-r--r-- 1 clickhouse clickhouse 663703499 Feb 10 20:30 client_host.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 client_host.mrk
-rw-r--r-- 1 clickhouse clickhouse 238 Feb 10 20:30 columns.txt
-rw-r--r-- 1 clickhouse clickhouse 9 Feb 10 20:30 count.txt
-rw-r--r-- 1 clickhouse clickhouse 228415690 Feb 10 20:30 db_schema.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 db_schema.mrk
-rw-r--r-- 1 clickhouse clickhouse 6985801 Feb 10 20:30 db_server.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 db_server.mrk
-rw-r--r-- 1 clickhouse clickhouse 19020651 Feb 10 20:30 db_username.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 db_username.mrk
-rw-r--r-- 1 clickhouse clickhouse 28227119 Feb 10 20:30 digest.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 digest.mrk
-rw-r--r-- 1 clickhouse clickhouse 8 Feb 10 20:30 minmax_period_start.idx
-rw-r--r-- 1 clickhouse clickhouse 1552547644 Feb 10 20:30 nq_Float32.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 nq_Float32.mrk
-rw-r--r-- 1 clickhouse clickhouse 1893758221 Feb 10 20:30 nq_Float64.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 nq_Float64.mrk
-rw-r--r-- 1 clickhouse clickhouse 1552524811 Feb 10 20:30 nq_UInt32.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 nq_UInt32.mrk
-rw-r--r-- 1 clickhouse clickhouse 1784991726 Feb 10 20:30 nq_UInt64.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 nq_UInt64.mrk
-rw-r--r-- 1 clickhouse clickhouse 4 Feb 10 20:30 partition.dat
-rw-r--r-- 1 clickhouse clickhouse 400961033 Feb 10 20:30 period_start.bin
-rw-r--r-- 1 clickhouse clickhouse 754848 Feb 10 20:30 period_start.mrk
-rw-r--r-- 1 clickhouse clickhouse 2486243 Feb 10 20:30 primary.idx

We can see there are two files for every column (plus some extras), and so, for example, the Float32 based “number of queries” metric store takes around 1.5GB.

You can also use the SQL queries to get this data from the ClickHouse system tables instead:

SELECT *
FROM system.columns
WHERE (database = 'pmm') AND (table = 'test') AND (name = 'nq_UInt32')
Row 1:
??????
database: pmm
table: test
name: nq_UInt32
type: UInt32
default_kind:
default_expression:
data_compressed_bytes: 7250570
data_uncompressed_bytes: 1545913232
marks_bytes: 754848
comment:
is_in_partition_key: 0
is_in_sorting_key: 0
is_in_primary_key: 0
is_in_sampling_key: 0
compression_codec:
1 rows in set. Elapsed: 0.002 sec.
SELECT *
FROM system.parts
WHERE (database = 'pmm') AND (table = 'test')
Row 1:
??????
partition: 201902
name: 201902_1_372_4
active: 1
marks: 47178
rows: 386478308
bytes_on_disk: 1401028031
data_compressed_bytes: 1390993287
data_uncompressed_bytes: 29642900064
marks_bytes: 7548480
modification_time: 2019-02-10 23:26:20
remove_time: 0000-00-00 00:00:00
refcount: 1
min_date: 0000-00-00
max_date: 0000-00-00
min_time: 2019-02-08 14:50:32
max_time: 2019-02-08 15:58:30
partition_id: 201902
min_block_number: 1
max_block_number: 372
level: 4
data_version: 1
primary_key_bytes_in_memory: 4373363
primary_key_bytes_in_memory_allocated: 6291456
database: pmm
table: test
engine: MergeTree
path: /var/lib/clickhouse/data/pmm/test/201902_1_372_4/
1 rows in set. Elapsed: 0.003 sec.

Now let’s look at the queries

We tested with two queries.  One of them – we’ll call it Q1 – is a very trivial query, simply taking the sum across all column values. This query needs only to access one column to return results so it is likely to be the most impacted by a change of data type:

SELECT sum(nq_UInt32)
FROM test

The second query – which we’ll call Q2 – is a typical ranking query which computes the number of queries per period and then shows periods with the highest amount of queries in them:

SELECT
    sum(nq_UInt32) AS cnt,
    period_start
FROM test
GROUP BY period_start
ORDER BY cnt DESC
LIMIT 10

This query needs to access two columns and do more complicated processing so we expect it to be less impacted by the change of data type.

Before we get to results I think it is worth drawing attention to the raw performance we’re getting.  I did these tests on DigitalOcean Droplet with just six virtual CPU cores, yet still I see numbers like these:

SELECT sum(nq_UInt32)
FROM test
??sum(nq_UInt32) ???
?     386638984    ?
????????????????????
1 rows in set. Elapsed: 0.205 sec. Processed 386.48 million rows, 1.55 GB (1.88 billion rows/s., 7.52 GB/s.)

Processing more than 300M rows/sec per core and more than 1GB/sec per core is very cool!

Query Performance

Results between different compression levels show similar differences between column types, so let’s focus on those with the least compression:

Q1 least compression

Q2 least compression

As you can see, the width of the data type (32 bit vs 64 bit) matters a lot more than the type (float vs integer). In some cases float may even perform faster than integer. This was the most unexpected result for me.

Another metric ClickHouse reports is the processing speed in GB/sec. We see a different picture here:

Q1 GB per second

64 bit data types have a higher processing speed than their 32 bit counter parts, but queries run slower as there is more raw data to process.

Compression

Let’s now take a closer look at compression.  For this test we use default LZ4 compression. ClickHouse has powerful support for Per Column Compression Codecs but testing them is outside of scope for this post.

So let’s look at size on disk for UInt32 Column:

On disk data size for UINT32

What you can see from these results is that when data is very compressible ClickHouse can compress it to almost nothing.  The compression ratio for our very compressible data set is about 200x (or 99.5% size reduction if you prefer this metric).

Somewhat compressible data compression rate is 1.4x.  That’s not bad but considering we are only storing 1-1000 range in this column – which requires 10 bits out of 32 – I would hope for better compression. I guess LZ4 is not compressing such data very well.

Now let’s look at compression for a 64 bit integer column:

On disk data size for UINT64

We can see that while the size almost doubled for very compressible data, increases for our somewhat compressible data and poorly compressible data are not that large.  Somewhat compressible data now compresses 2.5x.

Now let’s take a look at Performance depending on data compressibility:

Q1 time for UINT32

Poorly compressible data which takes a larger space on disk is processed faster than somewhat compressible data? This did not make sense. I repeated the run a few times to make sure that the results were correct. When I looked at the compression ratio, though, it suddenly made sense to me.

Poorly compressible data for the UInt32 data type was not compressible by LZ4 so it seems the original data was stored, significantly speeding up “decompression” process.   With somewhat compressible data, compression worked and so real decompression needed to take place too. This makes things slower.

This is why we can only observe these results with UInt32 and Float32 data types.  UInt64 and Float64 show the more expected results:

Q1 time for UINT64

Summary

Here are my conclusions:

  • Even with “slower” data types, ClickHouse is very fast
  • Data type choice matters – but less than I expected
  • Width (32bit vs 64bit) impacts performance more than integer vs float data types
  • Storing a small range of values in a wider column type is likely to yield better compression, though with default compression it is not as good as theoretically possible
  • Compression is interesting. We get the best performance when data can be well compressed. Second best is when we do not have to spend a lot of time decompressing it, as long as it is fits in memory.
Jan
14
2019
--

Should You Use ClickHouse as a Main Operational Database?

ClickHouse as a main operational database

ClickHouse as a main operational databaseFirst of all, this post is not a recommendation but more like a “what if” story. What if we use ClickHouse (which is a columnar analytical database) as our main datastore? Well, typically, an analytical database is not a replacement for a transactional or key/value datastore. However, ClickHouse is super efficient for timeseries and provides “sharding” out of the box (scalability beyond one node).  So can we use it as our main datastore?

Let’s imagine we are running a webservice and provide a public API. Public API as -a-service has become a good business model: examples include social networks like Facebook/Twitter, messaging as a service like Twilio, and even credit card authorization platforms like Marqeta. Let’s also imagine we need to store all messages (SMS messages, email messages, etc) we are sending and allow our customers to get various information about the message. This information can be a mix of analytical (OLAP) queries (i.e. how many messages was send for some time period and how much it cost) and a typical key/value queries like: “return 1 message by the message id”.

Using a columnar analytical database can be a big challenge here. Although such databases can be very efficient with counts and averages, some queries will be slow or simply non existent. Analytical databases are optimized for a low number of slow queries. The most important limitations of the analytical databases are:

  1. Deletes and updates are non-existent or slow
  2. Inserts are efficient for bulk inserts only
  3. No secondary indexes means that point selects (select by ID) tend to be very slow

This is all true for ClickHouse, however, we may be able to live with it for our task.

To simulate text messages I have used ~3 billion of reddit comments (10 years from 2007 to 2017), downloaded from pushshift.io . Vadim published a blog post about analyzing reddit comments with ClickHouse. In my case, I’m using this data as a simulation of text messages, and will show how we can use ClickHouse as a backend for an API.

Loading the JSON data to Clickhouse

I used the following table in Clickhouse to load all data:

CREATE TABLE reddit.rc(
body String,
score_hidden Nullable(UInt8),
archived Nullable(UInt8),
name String,
author String,
author_flair_text Nullable(String),
downs Nullable(Int32),
created_utc UInt32,
subreddit_id String,
link_id Nullable(String),
parent_id Nullable(String),
score Nullable(Int16),
retrieved_on Nullable(UInt32),
controversiality Nullable(Int8),
gilded Nullable(Int8),
id String,
subreddit String,
ups Nullable(Int16),
distinguished Nullable(String),
author_flair_css_class Nullable(String),
stickied Nullable(UInt8),
edited Nullable(UInt8)
) ENGINE = MergeTree() PARTITION BY toYYYYMM(toDate(created_utc)) ORDER BY created_utc ;

Then I used the following command to load the JSON data (downloaded from pushshift.io) to ClickHouse:

$ bzip2 -d -c RC_20*.bz2 | clickhouse-client --input_format_skip_unknown_fields 1 --input_format_allow_errors_num 1000000 -d reddit -n --query="INSERT INTO rc FORMAT JSONEachRow"

The data on disk in ClickHouse is not significantly larger than compressed files, which is great:

#  du -sh /data/clickhouse/data/reddit/rc/
638G    /data/clickhouse/data/reddit/rc/
# du -sh /data/reddit/
404G    /data/reddit/

We have ~4 billion rows:

SELECT
    toDate(min(created_utc)),
    toDate(max(created_utc)),
    count(*)
FROM rc
??toDate(min(created_utc))???toDate(max(created_utc))??????count()??
?               2006-01-01 ?               2018-05-31 ? 4148248585 ?
????????????????????????????????????????????????????????????????????
1 rows in set. Elapsed: 11.554 sec. Processed 4.15 billion rows, 16.59 GB (359.02 million rows/s., 1.44 GB/s.)

The data is partitioned and sorted by created_utc so queries which include created_utc will be able to using partition pruning: therefore skip the not-needed partitions. However, let’s say our API needs to support the following features, which are not common for analytical databases:

  1. Selecting a single comment/message by ID
  2. Retrieving the last 10 or 100 of the messages/comments
  3. Updating a single message in the past (e.g. in the case of messages, we may need to update the final price; in the case of comments, we may need to upvote or downvote a comment)
  4. Deleting messages
  5. Text search

With the latest ClickHouse version, all of these features are available, but some of them may not perform fast enough.

Retrieving a single row in ClickHouse

Again, this is not a typical operation in any analytical database, those databases are simply not optimized for it. ClickHouse does not have secondary indexes, and we are using created_utc as a primary key (sort by). So, selecting a message by just ID will require a full table scan:

SELECT
    id,
    created_utc
FROM rc
WHERE id = 'dbumnpz'
??id????????created_utc??
? dbumnpz ?  1483228800 ?
?????????????????????????
1 rows in set. Elapsed: 18.070 sec. Processed 4.15 billion rows, 66.37 GB (229.57 million rows/s., 3.67 GB/s.)

Only if we know the timestamp (created_utc)… Then it will be lighting fast: ClickHouse will use the primary key:

SELECT *
FROM rc
WHERE (id = 'dbumnpz') AND (created_utc = 1483228800)
...
1 rows in set. Elapsed: 0.010 sec. Processed 8.19 thousand rows, 131.32 KB (840.27 thousand rows/s., 13.47 MB/s.)

Actually, we can simulate an additional index set by creating a materialized view in ClickHouse:

create materialized view rc_id_v
ENGINE MergeTree() PARTITION BY toYYYYMM(toDate(created_utc)) ORDER BY (id)
POPULATE AS SELECT id, created_utc from rc;

Here I’m creating a materialized view and populating it initially from the main (rc) table. The view will be updated automatically when there are any inserts into table reddit.rc. The view is actually another MergeTree table sorted by id. Now we can use this query:

SELECT *
FROM rc
WHERE (id = 'dbumnpz') AND (created_utc =
(
    SELECT created_utc
    FROM rc_id_v
    WHERE id = 'dbumnpz'
))
...
1 rows in set. Elapsed: 0.053 sec. Processed 8.19 thousand rows, 131.32 KB (153.41 thousand rows/s., 2.46 MB/s.)

This is a single query which will join our materialized view to pass the created_utc (timestamp) to the original table. It is a little bit slower but still less than 100ms response time.

Using this trick (materialized views) we can potentially simulate other indexes.

Retrieving the last 10 messages

This is where ClickHouse is not very efficient. Let’s say we want to retrieve the last 10 comments:

SELECT
    id,
    created_utc
FROM rc
ORDER BY created_utc DESC
LIMIT 10
??id????????created_utc??
? dzwso7l ?  1527811199 ?
? dzwso7j ?  1527811199 ?
? dzwso7k ?  1527811199 ?
? dzwso7m ?  1527811199 ?
? dzwso7h ?  1527811199 ?
? dzwso7n ?  1527811199 ?
? dzwso7o ?  1527811199 ?
? dzwso7p ?  1527811199 ?
? dzwso7i ?  1527811199 ?
? dzwso7g ?  1527811199 ?
?????????????????????????
10 rows in set. Elapsed: 24.281 sec. Processed 4.15 billion rows, 82.96 GB (170.84 million rows/s., 3.42 GB/s.)

In a conventional relational database (like MySQL) this can be done by reading a btree index sequentially from the end, as the index is sorted (like “tail” command on linux). In a partitioned massively parallel database system, the storage format and sorting algorithm may not be optimized for that operation as we are reading multiple partitions in parallel. Currently, an issue has been opened to make the “tailing” based on the primary key much faster: slow order by primary key with small limit on big data. As a temporary workaround we can do something like this:

SELECT count()
FROM rc
WHERE (created_utc > (
(
    SELECT max(created_utc)
    FROM rc
) - ((60 * 60) * 24))) AND (subreddit = 'programming')
??count()??
?    1248 ?
???????????
1 rows in set. Elapsed: 4.510 sec. Processed 3.05 million rows, 56.83 MB (675.38 thousand rows/s., 12.60 MB/s.) ```

It is still a five seconds query. Hopefully, this type of query will become faster in ClickHouse.

Updating / deleting data in ClickHouse

The latest ClickHouse version allows running update/delete in the form of “ALTER TABLE .. UPDATE / DELETE” (it is called mutations in ClickHouse terms). For example, we may want to upvote a specific comment.

SELECT score
FROM rc_2017
WHERE (id = 'dbumnpz') AND (created_utc =
(
    SELECT created_utc
    FROM rc_id_v
    WHERE id = 'dbumnpz'
))
??score??
?     2 ?
?????????
1 rows in set. Elapsed: 0.048 sec. Processed 8.19 thousand rows, 131.08 KB (168.93 thousand rows/s., 2.70 MB/s.)
:) alter table rc_2017 update score = score +1 where id =  'dbumnpz' and created_utc = (select created_utc from rc_id_v where id =  'dbumnpz');
ALTER TABLE rc_2017
    UPDATE score = score + 1 WHERE (id = 'dbumnpz') AND (created_utc =
    (
        SELECT created_utc
        FROM rc_id_v
        WHERE id = 'dbumnpz'
    ))
Ok.
0 rows in set. Elapsed: 0.052 sec.

“Mutation” queries will return immediately and will be executed asynchronously. We can see the progress by reading from the system.mutations table:

select * from system.mutations\G
SELECT *
FROM system.mutations
Row 1:
??????
database:                   reddit
table:                      rc_2017
mutation_id:                mutation_857.txt
command:                    UPDATE score = score + 1 WHERE (id = 'dbumnpz') AND (created_utc = (SELECT created_utc FROM reddit.rc_id_v  WHERE id = 'dbumnpz'))
create_time:                2018-12-27 22:22:05
block_numbers.partition_id: ['']
block_numbers.number:       [857]
parts_to_do:                0
is_done:                    1
1 rows in set. Elapsed: 0.002 sec.

Now we can try deleting comments that have been marked for deletion (body showing “[deleted]”):

ALTER TABLE rc_2017
    DELETE WHERE body = '[deleted]'
Ok.
0 rows in set. Elapsed: 0.002 sec.
:) select * from system.mutations\G
SELECT *
FROM system.mutations
...
Row 2:
??????
database:                   reddit
table:                      rc_2017
mutation_id:                mutation_858.txt
command:                    DELETE WHERE body = '[deleted]'
create_time:                2018-12-27 22:41:01
block_numbers.partition_id: ['']
block_numbers.number:       [858]
parts_to_do:                64
is_done:                    0
2 rows in set. Elapsed: 0.017 sec.

After a while, we can do the count again:

:) select * from system.mutations\G
SELECT *
FROM system.mutations
...
Row 2:
??????
database:                   reddit
table:                      rc_2017
mutation_id:                mutation_858.txt
command:                    DELETE WHERE body = '[deleted]'
create_time:                2018-12-27 22:41:01
block_numbers.partition_id: ['']
block_numbers.number:       [858]
parts_to_do:                0
is_done:                    1

As we can see our “mutation” is done.

Text analysis

ClickHouse does not offer full text search, however we can use some text functions. In my previous blog post about ClickHouse I used it to find the most popular wikipedia page of the month. This time I’m trying to find the news keywords of the year using all reddit comments: basically I’m calculating the most frequently used new words for the specific year (algorithm based on an article about finding trending topics using Google Books n-grams data). To do that I’m using the ClickHouse function alphaTokens(body) which will split the “body” field into words. From there, I can count the words or use arrayJoin to create a list (similar to MySQL’s group_concat function). Here is the example:

First I created a table word_by_year_news:

create table word_by_year_news ENGINE MergeTree() PARTITION BY y ORDER BY (y) as
select a.w as w, b.y as y, sum(a.occurrences)/b.total as ratio from
(
select
 lower(arrayJoin(alphaTokens(body))) as w,
 toYear(toDate(created_utc)) as y,
 count() as occurrences
from rc
where body <> '[deleted]'
and created_utc < toUnixTimestamp('2018-01-01 00:00:00')
and created_utc >= toUnixTimestamp('2007-01-01 00:00:00')
and subreddit in ('news', 'politics', 'worldnews')
group by w, y
having length(w) > 4
) as a
ANY INNER JOIN
(
select
 toYear(toDate(created_utc)) as y,
 sum(length(alphaTokens(body))) as total
from rc
where body <> '[deleted]'
and subreddit in ('news', 'politics', 'worldnews')
and created_utc < toUnixTimestamp('2018-01-01 00:00:00')
and created_utc >= toUnixTimestamp('2007-01-01 00:00:00')
group by y
) AS b
ON a.y = b.y
group by
  a.w,
  b.y,
  b.total;
0 rows in set. Elapsed: 787.032 sec. Processed 7.35 billion rows, 194.32 GB (9.34 million rows/s., 246.90 MB/s.)

This will store all frequent words (I’m filtering by subreddits; the examples are: “news, politics and worldnews” or “programming”) as well as its occurrence this year; actually I want to store “relative” occurrence which is called “ratio” above: for each word I divide its occurrence by the number of total words this year (this is needed as the number of comments grows significantly year by year).

Now we can actually calculate the words of the year:

SELECT
    groupArray(w) as words,
    y + 1 as year
FROM
(
    SELECT
        w,
        CAST((y - 1) AS UInt16) AS y,
        ratio AS a_ratio
    FROM word_by_year_news
    WHERE ratio > 0.00001
) AS a
ALL INNER JOIN
(
    SELECT
        w,
        y,
        ratio AS b_ratio
    FROM word_by_year_news
    WHERE ratio > 0.00001
) AS b USING (w, y)
WHERE (y > 0) AND (a_ratio / b_ratio > 3)
GROUP BY y
ORDER BY
    y
LIMIT 100;
10 rows in set. Elapsed: 0.232 sec. Processed 14.61 million rows, 118.82 MB (63.01 million rows/s., 512.29 MB/s.)

And the results are (here I’m grouping words for each year):

For “programming” subreddit:

??year???words??????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
? 2007 ? ['audio','patents','swing','phones','gmail','opera','devices','phone','adobe','vista','backup','mercurial','mobile','passwords','scala','license','copyright','licenses','photoshop'] ?
? 2008 ? ['webkit','twitter','teacher','android','itunes']                                                                                                                                     ?
? 2009 ? ['downvotes','upvote','drupal','android','upvoted']                                                                                                                                   ?
? 2010 ? ['codecs','imgur','floppy','codec','adobe','android']                                                                                                                                 ?
? 2011 ? ['scala','currency','println']                                                                                                                                                        ?
? 2013 ? ['voting','maven']                                                                                                                                                                    ?
? 2014 ? ['compose','xamarin','markdown','scrum','comic']                                                                                                                                      ?
? 2015 ? ['china','sourceforge','subscription','chinese','kotlin']                                                                                                                             ?
? 2016 ? ['systemd','gitlab','autotldr']                                                                                                                                                       ?
? 2017 ? ['offices','electron','vscode','blockchain','flash','collision']                                                                                                                      ?
????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????

For news subreddit:

??year???words???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
? 2008 ? ['michigan','delegates','obama','alaska','georgia','russians','hamas','biden','hussein','barack','elitist','mccain']                                                                                                                                       ?
? 2009 ? ['stimulus','reform','medicare','franken','healthcare','payer','insurance','downvotes','hospitals','patients','option','health']                                                                                                                           ?
? 2010 ? ['blockade','arizona']                                                                                                                                                                                                                                     ?
? 2011 ? ['protests','occupy','romney','weiner','protesters']                                                                                                                                                                                                       ?
? 2012 ? ['santorum','returns','martin','obamacare','romney']                                                                                                                                                                                                       ?
? 2013 ? ['boston','chemical','surveillance']                                                                                                                                                                                                                       ?
? 2014 ? ['plane','poland','radar','subreddits','palestinians','putin','submission','russia','automoderator','compose','rockets','palestinian','hamas','virus','removal','russians','russian']                                                                      ?
? 2015 ? ['refugees','refugee','sanders','debates','hillary','removal','participating','removed','greece','clinton']                                                                                                                                                ?
? 2016 ? ['morons','emails','opponent','establishment','trump','reply','speeches','presidency','clintons','electoral','donald','trumps','downvote','november','subreddit','shill','domain','johnson','classified','bernie','nominee','users','returns','primaries','foundation','voters','autotldr','clinton','email','supporter','election','feedback','clever','leaks','accuse','candidate','upvote','rulesandregs','convention','conduct','uncommon','server','trolls','supporters','hillary'] ?
? 2017 ? ['impeached','downvotes','monitored','accusations','alabama','violation','treason','nazis','index','submit','impeachment','troll','collusion','bannon','neutrality','permanent','insults','violations']                                                    ?
?????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????

Conclusion

ClickHouse is a great massively parallel analytical system. It is extremely efficient and can potentially (with some hacks) be used as a main backend database powering a public API gateway serving both realtime and analytical queries. At the same time, it was not originally designed that way. Let me know in the comments if you are using ClickHouse for this or similar projects.


Photo by John Baker on Unsplash

 

 

Oct
01
2018
--

ClickHouse: Two Years!

historical trend of ClickHouse popularity

Following my post from a year ago https://www.percona.com/blog/2017/07/06/clickhouse-one-year/, I wanted to review what happened in ClickHouse during this year.
There is indeed some interesting news to share.

1. ClickHouse in DB-Engines Ranking. It did not quite get into the top 100, but the gain from position 174 to 106 is still impressive. Its DB-Engines Ranking score tripled from 0.54 last September to 1.57 this September

And indeed, in my conversation with customers and partners, the narrative has changed from: “ClickHouse, what is it?” to “We are using or considering ClickHouse for our analytics needs”.

2. ClickHouse changed their versioning schema. Unfortunately it changed from the unconventional …; 1.1.54390; 1.1.54394 naming structure to the still unconventional 18.6.0; 18.10.3; 18.12.13 naming structure, where “18.” is a year of the release.

Now to the more interesting technical improvements.

3. Support of the more traditional JOIN syntax. Now if you join two tables you can use SELECT ... FROM tab1 ANY LEFT JOIN tab2 ON tab1_col=tab2_col .

So now, if we take a query from the workload described in https://www.percona.com/blog/2017/06/22/clickhouse-general-analytical-workload-based-star-schema-benchmark/

We can write this:

SELECT     C_REGION,     sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder ANY INNER JOIN customer
ON LO_CUSTKEY=C_CUSTKEY
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;

instead of the monstrous:

SELECT
    C_REGION,
    sum(LO_EXTENDEDPRICE * LO_DISCOUNT)
FROM lineorder
ANY INNER JOIN
(
    SELECT
        C_REGION,
        C_CUSTKEY AS LO_CUSTKEY
    FROM customer
) USING (LO_CUSTKEY)
WHERE (toYear(LO_ORDERDATE) = 1993) AND ((LO_DISCOUNT >= 1) AND (LO_DISCOUNT <= 3)) AND (LO_QUANTITY < 25)
GROUP BY C_REGION;

4. Support for DELETE and UPDATE operations. This has probably been the most requested feature since the first ClickHouse release.
ClickHouse uses an LSM-tree like structure—MergeTree—and it is not friendly to single row operations. To highlight this specific limitation, ClickHouse uses ALTER TABLE UPDATE / ALTER TABLE DELETE syntax to highlight this will be executed as a bulk operation, so please consider it as such. Updating or deleting rows in ClickHouse should be an exceptional operation, rather than a part of your day-to-day workload.

We can update a column like this: ALTER TABLE lineorder UPDATE LO_DISCOUNT = 5 WHERE LO_CUSTKEY = 199568

5. ClickHouse added a feature which I call Dictionary Compression, but ClickHouse uses the name “LowCardinality”. It is still experimental, but I hope soon it will be production ready. Basically it allows internally to replace long strings with a short list of enumerated values.

For example, consider the table from our example lineorder which contains 600037902 rows, but has only five different values for the column LO_ORDERPRIORITY:

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder
??LO_ORDERPRIORITY??
? 1-URGENT         ?
? 5-LOW            ?
? 4-NOT SPECIFIED  ?
? 2-HIGH           ?
? 3-MEDIUM         ?
????????????????????

So we can define our table as:

CREATE TABLE lineorder_dict (
        LO_ORDERKEY             UInt32,
        LO_LINENUMBER           UInt8,
        LO_CUSTKEY              UInt32,
        LO_PARTKEY              UInt32,
        LO_SUPPKEY              UInt32,
        LO_ORDERDATE            Date,
        LO_ORDERPRIORITY        LowCardinality(String),
        LO_SHIPPRIORITY         UInt8,
        LO_QUANTITY             UInt8,
        LO_EXTENDEDPRICE        UInt32,
        LO_ORDTOTALPRICE        UInt32,
        LO_DISCOUNT             UInt8,
        LO_REVENUE              UInt32,
        LO_SUPPLYCOST           UInt32,
        LO_TAX                  UInt8,
        LO_COMMITDATE           Date,
        LO_SHIPMODE             LowCardinality(String)
)Engine=MergeTree(LO_ORDERDATE,(LO_ORDERKEY,LO_LINENUMBER),8192);

How does this help? Firstly, it offers space savings. The table will take less space in storage, as it will use integer values instead of strings. And secondly, performance. The filtering operation will be executed faster.

For example: here’s a query against the table with LO_ORDERPRIORITY stored as String:

SELECT count(*)
FROM lineorder
WHERE LO_ORDERPRIORITY = '2-HIGH'
????count()??
? 119995822 ?
?????????????
1 rows in set. Elapsed: 0.859 sec. Processed 600.04 million rows, 10.44 GB (698.62 million rows/s., 12.16 GB/s.)

And now the same query against table with LO_ORDERPRIORITY as LowCardinality(String):

SELECT count(*)
FROM lineorder_dict
WHERE LO_ORDERPRIORITY = '2-HIGH'
????count()??
? 119995822 ?
?????????????
1 rows in set. Elapsed: 0.350 sec. Processed 600.04 million rows, 600.95 MB (1.71 billion rows/s., 1.72 GB/s.)

This is 0.859 sec vs 0.350 sec (for the LowCardinality case).

Unfortunately this feature is not optimized for all use cases, and actually in aggregation it performs slower.

An aggregation query against table with LO_ORDERPRIORITY as String:

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder
??LO_ORDERPRIORITY??
? 4-NOT SPECIFIED  ?
? 1-URGENT         ?
? 2-HIGH           ?
? 3-MEDIUM         ?
? 5-LOW            ?
????????????????????
5 rows in set. Elapsed: 1.200 sec. Processed 600.04 million rows, 10.44 GB (500.22 million rows/s., 8.70 GB/s.)

Versus an aggregation query against table with LO_ORDERPRIORITY as LowCardinality(String):

SELECT DISTINCT LO_ORDERPRIORITY
FROM lineorder_dict
??LO_ORDERPRIORITY??
? 4-NOT SPECIFIED  ?
? 1-URGENT         ?
? 2-HIGH           ?
? 3-MEDIUM         ?
? 5-LOW            ?
????????????????????
5 rows in set. Elapsed: 2.334 sec. Processed 600.04 million rows, 600.95 MB (257.05 million rows/s., 257.45 MB/s.)

This is 1.200 sec vs 2.334 sec (for the LowCardinality case)

6. And the last feature I want to mention is the better support of Tableau Software: this required ODBC drivers. It may not seem significant, but Tableau is the number one software for data analysts, and by supporting this, ClickHouse will reach a much wider audience.

Summing up: ClickHouse definitely became much more user friendly since a year ago!

The post ClickHouse: Two Years! appeared first on Percona Database Performance Blog.

Feb
19
2018
--

Archiving MySQL Tables in ClickHouse

Archiving MySQL Tables in ClickHouse

Archiving MySQL Tables in ClickHouseIn this blog post, I will talk about archiving MySQL tables in ClickHouse for storage and analytics.

Why Archive?

Hard drives are cheap nowadays, but storing lots of data in MySQL is not practical and can cause all sorts of performance bottlenecks. To name just a few issues:

  1. The larger the table and index, the slower the performance of all operations (both writes and reads)
  2. Backup and restore for terabytes of data is more challenging, and if we need to have redundancy (replication slave, clustering, etc.) we will have to store all the data N times

The answer is archiving old data. Archiving does not necessarily mean that the data will be permanently removed. Instead, the archived data can be placed into long-term storage (i.e., AWS S3) or loaded into a special purpose database that is optimized for storage (with compression) and reporting. The data is then available.

Actually, there are multiple use cases:

  • Sometimes the data just needs to be stored (i.e., for regulatory purposes) but does not have to be readily available (it’s not “customer facing” data)
  • The data might be useful for debugging or investigation (i.e., application or access logs)
  • In some cases, the data needs to be available for the customer (i.e., historical reports or bank transactions for the last six years)

In all of those cases, we can move the older data away from MySQL and load it into a “big data” solution. Even if the data needs to be available, we can still move it from the main MySQL server to another system. In this blog post, I will look at archiving MySQL tables in ClickHouse for long-term storage and real-time queries.

How To Archive?

Let’s say we have a 650G table that stores the history of all transactions, and we want to start archiving it. How can we approach this?

First, we will need to split this table into “old” and “new”. I assume that the table is not partitioned (partitioned tables are much easier to deal with). For example, if we have data from 2008 (ten years worth) but only need to store data from the last two months in the main MySQL environment, then deleting the old data would be challenging. So instead of deleting 99% of the data from a huge table, we can create a new table and load the newer data into that. Then rename (swap) the tables. The process might look like this:

  1. CREATE TABLE transactions_new LIKE transactions
  2. INSERT INTO transactions_new SELECT * FROM transactions WHERE trx_date > now() – interval 2 month
  3. RENAME TABLE transactions TO transactions_old, transactions_new TO transactions

Second, we need to move the transactions_old into ClickHouse. This is straightforward — we can pipe data from MySQL to ClickHouse directly. To demonstrate I will use the Wikipedia:Statistics project (a real log of all requests to Wikipedia pages).

Create a table in ClickHouse:

CREATE TABLE wikistat
(
    id bigint,
    dt DateTime,
    project String,
    subproject String,
    path String,
    hits UInt64,
    size UInt64
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(dt)
ORDER BY dt
Ok.
0 rows in set. Elapsed: 0.010 sec.

Please note that I’m using the new ClickHouse custom partitioning. It does not require that you create a separate date column to map the table in MySQL to the same table structure in ClickHouse

Now I can “pipe” data directly from MySQL to ClickHouse:

mysql --quick -h localhost wikistats -NBe
"SELECT concat(id,',"',dt,'","',project,'","',subproject,'","', path,'",',hits,',',size) FROM wikistats" |
clickhouse-client -d wikistats --query="INSERT INTO wikistats FORMAT CSV"

Thirdwe need to set up a constant archiving process so that the data is removed from MySQL and transferred to ClickHouse. To do that we can use the “pt-archiver” tool (part of Percona Toolkit). In this case, we can first archive to a file and then load that file to ClickHouse. Here is the example:

Remove data from MySQL and load to a file (tsv):

pt-archiver --source h=localhost,D=wikistats,t=wikistats,i=dt --where "dt <= '2018-01-01 0:00:00'"  --file load_to_clickhouse.txt --bulk-delete --limit 100000 --progress=100000
TIME                ELAPSED   COUNT
2018-01-25T18:19:59       0       0
2018-01-25T18:20:08       8  100000
2018-01-25T18:20:17      18  200000
2018-01-25T18:20:26      27  300000
2018-01-25T18:20:36      36  400000
2018-01-25T18:20:45      45  500000
2018-01-25T18:20:54      54  600000
2018-01-25T18:21:03      64  700000
2018-01-25T18:21:13      73  800000
2018-01-25T18:21:23      83  900000
2018-01-25T18:21:32      93 1000000
2018-01-25T18:21:42     102 1100000
...

Load the file to ClickHouse:

cat load_to_clickhouse.txt | clickhouse-client -d wikistats --query="INSERT INTO wikistats FORMAT TSV"

The newer version of pt-archiver can use a CSV format as well:

pt-archiver --source h=localhost,D=wikitest,t=wikistats,i=dt --where "dt <= '2018-01-01 0:00:00'"  --file load_to_clickhouse.csv --output-format csv --bulk-delete --limit 10000 --progress=10000

How Much Faster Is It?

Actually, it is much faster in ClickHouse. Even the queries that are based on index scans can be much slower in MySQL compared to ClickHouse.

For example, in MySQL just counting the number of rows for one year can take 34 seconds (index scan):

mysql> select count(*) from wikistats where dt between '2017-01-01 00:00:00' and '2017-12-31 00:00:00';
+-----------+
| count(*)  |
+-----------+
| 103161991 |
+-----------+
1 row in set (34.82 sec)
mysql> explain select count(*) from wikistats where dt between '2017-01-01 00:00:00' and '2017-12-31 00:00:00'G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: wikistats
   partitions: NULL
         type: range
possible_keys: dt
          key: dt
      key_len: 6
          ref: NULL
         rows: 227206802
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

In ClickHouse, it only takes 0.062 sec:

:) select count(*) from wikistats where dt between  toDateTime('2017-01-01 00:00:00') and  toDateTime('2017-12-31 00:00:00');
SELECT count(*)
FROM wikistats
WHERE (dt >= toDateTime('2017-01-01 00:00:00')) AND (dt <= toDateTime('2017-12-31 00:00:00'))
????count()??
? 103161991 ?
?????????????
1 rows in set. Elapsed: 0.062 sec. Processed 103.16 million rows, 412.65 MB (1.67 billion rows/s., 6.68 GB/s.)

Size on Disk

In my previous blog on comparing ClickHouse to Apache Spark to MariaDB, I also compared disk size. Usually, we can expect a 10x to 5x decrease in disk size in ClickHouse due to compression. Wikipedia:Statistics, for example, contains actual URIs, which can be quite large due to the article name/search phrase. This can be compressed very well. If we use only integers or use MD5 / SHA1 hashes instead of storing actual URIs, we can expect much smaller compression (i.e., 3x). Even with a 3x compression ratio, it is still pretty good as long-term storage.

Conclusion

As the data in MySQL keeps growing, the performance for all the queries will keep decreasing. Typically, queries that originally took milliseconds can now take seconds (or more). That requires a lot of changes (code, MySQL, etc.) to make faster.

The main goal of archiving the data is to increase performance (“make MySQL fast again”), decrease costs and improve ease of maintenance (backup/restore, cloning the replication slave, etc.). Archiving to ClickHouse allows you to preserve old data and make it available for reports.

Jan
16
2018
--

Updating/Deleting Rows From Clickhouse (Part 2)

ClickHouse

ClickHouseIn this post, we’ll look at updating and deleting rows with ClickHouse. It’s the second of two parts.

In the first part of this post, we described the high-level overview of implementing incremental refresh on a ClickHouse table as an alternative support for UPDATE/DELETE. In this part, we will show you the actual steps and sample code.

Prepare Changelog Table

First, we create the changelog table below. This can be stored on any other MySQL instance separate from the source of our analytics table. When we run the change capture script, it will record the data on this table that we can consume later with the incremental refresh script:

CREATE TABLE `clickhouse_changelog` (
  `db` varchar(255) NOT NULL DEFAULT '',
  `tbl` varchar(255) NOT NULL DEFAULT '',
  `created_at` date NOT NULL,
  `updated_at` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `log_file` varchar(255) NOT NULL,
  `log_pos` int(10) unsigned NOT NULL,
  PRIMARY KEY (`db`,`tbl`,`created_at`),
  KEY `log_file` (`log_file`,`log_pos`)
) ENGINE=InnoDB;

Create ClickHouse Table

Next, let’s create the target ClickhHouse table. Remember, that the corresponding MySQL table is below:

CREATE TABLE `hits` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `type` varchar(100) DEFAULT NULL,
  `user_id` int(11) NOT NULL,
  `location_id` int(11) NOT NULL,
  `created_at` datetime DEFAULT NULL
  PRIMARY KEY (`id`),
  KEY `created_at` (`created_at`)
) ENGINE=InnoDB;

Converting this table to ClickHouse looks like below, with the addition of a “created_day” column that serves as the partitioning key:

CREATE TABLE hits (
  id Int32,
  created_day Date,
  type String,
  user_id Int32,
  location_id Int32,
  created_at Int32
) ENGINE = MergeTree PARTITION BY toMonday(created_day)
ORDER BY (created_at, id) SETTINGS index_granularity = 8192;

Run Changelog Capture

Once the tables are ready, running the change capture script. An example script can be found in this gist, which is written in Python and uses the python-mysql-replication library. This library acts as replication client, continuously downloads the binary logs from the source and sifts through it to find any UPDATE/DELETE executed against our source table.

There are a few configuration options that need to be customized in the script.

  • LOG_DB_HOST: The MySQL host where we created the
    clickhouse_changelog

     table.

  • LOG_DB_NAME: The database name where the
    clickhouse_changelog

     table is created.

  • SRC_DB_HOST: The MySQL host where we will be downloading binary logs from. This can either be a primary or secondary/replica as long as its the same server where our raw table is also located.
  • MYSQL_USER: MySQL username.
  • MYSQL_PASS: MySQL password.
  • TABLE: The table we want to watch for changes.

When the script is successfully configured and running, the

clickhouse_changelog

 table should start populating with data like below.

mysql> select * from mydb.clickhouse_changelog;
+------+------+------------+---------------------+------------------+-----------+
| db   | tbl  | created_at | updated_at          | log_file         | log_pos   |
+------+------+------------+---------------------+------------------+-----------+
| mydb | hits | 2014-06-02 | 2017-12-23 17:19:33 | mysql-bin.016353 |  18876747 |
| mydb | hits | 2014-06-09 | 2017-12-23 22:10:29 | mysql-bin.016414 |   1595314 |
| mydb | hits | 2014-06-16 | 2017-12-23 02:59:37 | mysql-bin.016166 |  33999981 |
| mydb | hits | 2014-06-23 | 2017-12-23 18:09:33 | mysql-bin.016363 |  84498477 |
| mydb | hits | 2014-06-30 | 2017-12-23 06:08:59 | mysql-bin.016204 |  23792406 |
| mydb | hits | 2014-08-04 | 2017-12-23 18:09:33 | mysql-bin.016363 |  84499495 |
| mydb | hits | 2014-08-18 | 2017-12-23 18:09:33 | mysql-bin.016363 |  84500523 |
| mydb | hits | 2014-09-01 | 2017-12-23 06:09:19 | mysql-bin.016204 |  27120145 |
+------+------+------------+---------------------+------------------+-----------+

Full Table Import

So we have our changelog capture in place, the next step is to initially populate the ClickHouse table from MySQL. Normally, we can easily do this with a 

mysqldump

 into a tab-separated format, but remember we have to transform the

created_at

 column from MySQL into ClickHouse’s

Date

 format to be used as partitioning key.

A simple way to do this is by using a simple set of shell commands like below:

SQL=$(cat <<EOF
SELECT
	id, DATE_FORMAT(created_at, "%Y-%m-%d"),
	type, user_id, location_id, UNIX_TIMESTAMP(created_at)
FROM hits
EOF
)
mysql -h source_db_host mydb -BNe "$sql" > hist.txt
cat hist.txt | clickhouse-client -d mydb --query="INSERT INTO hits FORMAT TabSeparated"

One thing to note about this process is that the

MySQL

 client buffers the results for the whole query, and it could eat up all the memory on the server you run this from if the table is really large. To avoid this, chunk the table into several million rows at a time. Since we already have the changelog capture running and in place from the previous step, you do not need to worry about any changes between chunks. We will consolidate those changes during the incremental refreshes.

Incremental Refresh

After initially populating the ClickHouse table, we then set up our continuous incremental refresh using a separate script. A template script we use for the table on our example can be found in this gist.

What this script does is twofold:

  • Determines the list of weeks recently modified based on
    clickhouse_changelog

    , dump rows for those weeks and re-imports to ClickHouse.

  • If the current week is not on the list of those with modifications, it also checks for new rows based on the auto-incrementing primary key and appends them to the ClickHouse table.

An example output of this script would be:

ubuntu@mysql~/clickhouse$ bash clickhouse-incr-refresh.sh hits
2017-12-24_00_20_19 incr-refresh Starting changelog processing for hits
2017-12-24_00_20_19 incr-refresh Current week is: 2017-12-18
2017-12-24_00_20_19 incr-refresh Processing week: 2017-12-18
2017-12-24_00_20_20 incr-refresh Changelog import for hits complete
ubuntu@mysql~/clickhouse$ bash clickhouse-incr-refresh.sh hits
2017-12-24_00_20_33 incr-refresh Starting changelog processing for hits
2017-12-24_00_20_33 incr-refresh Current week is: 2017-12-18
2017-12-24_00_20_33 incr-refresh Weeks is empty, nothing to do
2017-12-24_00_20_33 incr-refresh Changelog import for hits complete
2017-12-24_00_20_33 incr-refresh Inserting new records for hits > id: 5213438
2017-12-24_00_20_33 incr-refresh No new rows found
2017-12-24_00_20_33 incr-refresh Incremental import for hits complete
ubuntu@mysql~/clickhouse$

Note that, on step 4, if you imported a really large table and the changelog had accumulated a large number of changes to refresh, the initial incremental execution might take some time. After that though, it should be faster. This script can be run every minute, longer or shorter, depending on how often you want the ClickHouse table to be refreshed.

To wrap up, here is a query from MySQL on the same table, versus ClickHouse.

mysql> SELECT COUNT(DISTINCT user_id) FROM hits WHERE created_at
    -> BETWEEN '2016-01-01 00:00:00' AND '2017-01-01 00:00:00';
+-------------------------+
| COUNT(DISTINCT user_id) |
+-------------------------+
|                 3023028 |
+-------------------------+
1 row in set (25.89 sec)

:) SELECT COUNT(DISTINCT user_id) FROM hits WHERE created_at BETWEEN 1451606400 AND 1483228800;
SELECT COUNTDistinct(user_id)
FROM hits
WHERE (created_at >= 1451606400) AND (created_at <= 1483228800)
??uniqExact(user_id)??
?            3023028 ?
??????????????????????
1 rows in set. Elapsed: 0.491 sec. Processed 35.82 million rows, 286.59 MB (73.01 million rows/s., 584.06 MB/s.)

Enjoy!

Oct
04
2017
--

ClickHouse MySQL Silicon Valley Meetup Wednesday, October 25 at Uber Engineering with Percona’s CTO Vadim Tkachenko

ClickHouse MySQL

ClickHouse MySQLI will be presenting at the ClickHouse MySQL Silicon Valley Meetup on Wednesday, October 25, 2017, at 6:30 PM.

ClickHouse is a real-time analytical database system. Even though they’re only celebrating one year as open source software, it has already proved itself ready for the serious workloads. We will talk about ClickHouse in general, some internals and why it is so fast. ClickHouse works in conjunction with MySQL – traditionally weak for analytical workloads – and this presentation demonstrates how to make the two systems work together.

My talk will cover how we can improve the experience with real-time analytics using ClickHouse, and how we can integrate ClickHouse with MySQL.

I want to thank our friends at Uber Engineering who agreed to host this event.

Please join us here: https://www.meetup.com/San-Francisco-Bay-Area-ClickHouse-Meetup/events/243887397/.

Vadim TkachenkoVadim Tkachenko, Percona CTO

Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks.

Vadim’s expertise in LAMP performance and multi-threaded programming help optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products.

He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication 3rd Edition. Previously, he founded a web development company in his native Ukraine and spent two years in the High Performance Group within the official MySQL support team.

Oct
02
2017
--

Big Dataset: All Reddit Comments – Analyzing with ClickHouse

ClickHouse

In this blog, I’ll use ClickHouse and Tabix to look at a new very large dataset for research.

It is hard to come across interesting datasets, especially a big one (and by big I mean one billion rows or more). Before, I’ve used on-time airline performance available from BUREAU OF TRANSPORTATION STATISTICS. Another recent example is NYC Taxi and Uber Trips data, with over one billion records.

However, today I wanted to mention an interesting dataset I found recently that has been available since 2015. This is Reddit’s comments and submissions dataset, made possible thanks to Reddit’s generous API. The dataset was first mentioned at “I have every publicly available Reddit comment for research,” and currently you can find it at pushshift.io. However, there is no guarantee that pushshift.io will provide this dataset in the future. I think it would be valuable for Amazon or another cloud provider made this dataset available for researchers, just as Amazon provides https://aws.amazon.com/public-datasets/.

The dataset contains 2.86 billion records to the end of 2016 and is 709GB in size, uncompressed. This dataset is valuable for a variety of research scenarios, from simple stats to natural language processing and machine learning.

Now let’s see what simple info we can collect from this dataset using ClickHouse and https://tabix.io/, a GUI tool for ClickHouse. In this first round, we’ll figure some basic stats, like number of comments per month, number of authors per month and number of subreddits. I also added how many comments in average are left for a post.

Queries to achieve this:

SELECT toYYYYMM(created_date) dt,count(*) comments FROM commententry1 GROUP BY dt ORDER BY dt
;;
SELECT toYYYYMM(created_date) dt,count(DISTINCT author) authors FROM commententry1 GROUP BY dt ORDER BY dt
;;
SELECT toYYYYMM(created_date) dt,count(DISTINCT subreddit) subreddits FROM commententry1 GROUP BY dt ORDER BY dt
;;
SELECT toYYYYMM(created_date) dt,count(*)/count(distinct link_id) comments_per_post FROM commententry1 GROUP BY dt ORDER BY dt

And the graphical result:
ClickHouse
ClickHouse
It impressive to see the constant growth in comments (to 70mln per month by the end of 2016) and authors (to 3.5mln for the same time period). There is something interesting happening with subreddits, which jump up and down. It’s interesting to see that the average count of comments per post stays stable, with a slight decline to 13 comments/post by the end of 2016.

Now let’s check most popular subreddits:

SELECT subreddit,count(*) cnt FROM commententry1 GROUP BY subreddit ORDER BY cnt DESC limit 100
DRAW_TREEMAP
{
    path:'subreddit.cnt'
}

and using a treemap (available in Tabix.io):
ClickHouse

We can measure subreddits that get the biggest increase in comments in 2016 compared to 2015:

SELECT subreddit,cntnew-cntold diff FROM (SELECT subreddit,count(*) cntnew FROM commententry1 WHERE toYear(created_date)=2016 GROUP BY subreddit) ALL INNER JOIN (SELECT subreddit,count(*) cntold FROM commententry1 WHERE toYear(created_date)=2015 GROUP BY subreddit) USING (subreddit) ORDER BY diff DESC LIMIT 50
 DRAW_TREEMAP
{
    path:'subreddit.diff'
}

ClickHouse

Obviously, Reddit was affected by the United States Presidential Election 2016, but not just that. The gaming community saw an increase in Overwatch, PokemonGO and Dark Souls 3.

Now we can try to run our own DB-Ranking, but only based on Reddit comments. This is how I can do this for MySQL, PostgreSQL and MongoDB:

SELECT toStartOfQuarter(created_date) Quarter,
sum(if(positionCaseInsensitive(body,'mysql')>0,1,0)) mysql,
sum(if(positionCaseInsensitive(body,'postgres')>0,1,0)) postgres,
sum(if(positionCaseInsensitive(body,'mongodb')>0,1,0)) mongodb
FROM commententry1
GROUP BY Quarter ORDER BY Quarter;

I would say the result is aligned with https://db-engines.com/en/ranking, where MySQL is the most popular among the three, followed by PostgreSQL and then MongoDB. There is an interesting spike for PostgreSQL in the second quarter in 2015, caused by a bot in “leagueoflegend” tournaments. The bot was actively announcing that it is powered by PostgreSQL in the comments, like this: http://reddit.com/r/leagueoflegends/comments/37cvc3/c/crln2ef.

To highlight more ClickHouse features: along with standard SQL functions, it provides a variety of statistical functions (for example, Quantile calculations). We can try to see the distribution of the number of comments left by authors:

SELECT
    quantileExact(0.1)(cnt),
    quantileExact(0.2)(cnt),
    quantileExact(0.3)(cnt),
    quantileExact(0.4)(cnt),
    quantileExact(0.5)(cnt),
    quantileExact(0.6)(cnt),
    quantileExact(0.7)(cnt),
    quantileExact(0.8)(cnt),
    quantileExact(0.9)(cnt),
    quantileExact(0.99)(cnt)
FROM
(
    SELECT
        author,
        count(*) AS cnt
    FROM commententry1
    WHERE author != '[deleted]'
    GROUP BY author
)

The result is:

quantileExact(0.1)(cnt) - 1
quantileExact(0.2)(cnt) - 1
quantileExact(0.3)(cnt) - 1
quantileExact(0.4)(cnt) - 2
quantileExact(0.5)(cnt) - 4
quantileExact(0.6)(cnt) - 7
quantileExact(0.7)(cnt) - 16
quantileExact(0.8)(cnt) - 42
quantileExact(0.9)(cnt) - 160
quantileExact(0.99)(cnt) - 2271

Which means that 30% of authors left only one comment, and 50% of authors left four comments or less.

In general, ClickHouse was a pleasure to use when running analytical queries. However, I should note the missing support of WINDOW functions is a huge limitation. Even MySQL 8.0, which recently was released as RC, provides support for WINDOW functions. I hope ClickHouse will implement this as well.

Sep
12
2017
--

Upcoming Webinar September 14, 2017: Supercharge Your Analytics with ClickHouse

ClickHouse

ClickHouseJoin Percona’s CTO Vadim Tkachenko @VadimTk and Altinity’s Co-Founder, Alexander Zaitsev as they present Supercharge Your Analytics with ClickHouse on Thursday, September 14, 2017, at 10:00 am PDT / 1:00 pm EDT (UTC-7).

 

ClickHouse is a real-time analytical database system. Even though they’re only celebrating one year as open source software, it has already proved itself ready for serious workloads.

We will talk about ClickHouse in general, some of its internals and why it is so fast. ClickHouse works in conjunction with MySQL – traditionally weak for analytical workloads – and this presentation demonstrates how to make the two systems work together.

There will also be an in-person presentation on How to Build Analytics for 100bn Logs a Month with ClickHouse at the meetup Wednesday, September 13, 2017. RSVP here.

Alexander Zaitsev will also be speaking at Percona Live Europe 2017 on Building Multi-Petabyte Data Warehouses with ClickHouse on Wednesday, September 27 at 11:30 am. Use the promo code “SeeMeSpeakPLE17” for 15% off.

Alexander ZaitsevAlexander Zaitsev
Altinity’s Co-Founder
Alexander is a co-founder of Altinity. He has 20 years of engineering and engineering management experience in several international companies. Alexander is expert in high scale analytics systems design and implementation. He designed and deployed petabyte scale data warehouses, including one of earliest ClickHouse deployments outside of Yandex.

Vadim Tkachenko
CTO
Vadim Tkachenko co-founded Percona in 2006 and serves as its Chief Technology Officer. Vadim leads Percona Labs, which focuses on technology research and performance evaluations of Percona’s and third-party products. Percona Labs designs no-gimmick tests of hardware, filesystems, storage engines, and databases that surpass the standard performance and functionality scenario benchmarks. Vadim’s expertise in LAMP performance and multi-threaded programming help optimize MySQL and InnoDB internals to take full advantage of modern hardware. Oracle Corporation and its predecessors have incorporated Vadim’s source code patches into the mainstream MySQL and InnoDB products.

He also co-authored the book High Performance MySQL: Optimization, Backups, and Replication 3rd Edition. Previously, he founded a web development company in his native Ukraine and spent two years in the High Performance Group within the official MySQL support team. Vadim received a BS in Economics and an MS in computer science from the National Technical University of Ukraine. He now lives in California with his wife and two children.

Powered by WordPress | Theme: Aeros 2.0 by TheBuckmaker.com