Jul
28
2014
--

Sunstone Capital Backs Beacon Platform Kontakt.io To The Tune Of $2 Million

“Bluetooth Low Energy and iBeacon are the building blocks of the next wave of computing,” says Max Niederhofer of the micro-location technology that lets your smartphone trigger events based on how close you are to a Beacon transmitter. “It’s a cliché, but the possibilities are endless.”

Jul
25
2014
--

Monitoring MySQL flow control in Percona XtraDB Cluster 5.6

Monitoring flow control in a Galera cluster is very important. If you do not, you will not understand why writes may sometimes be stalled. Percona XtraDB Cluster 5.6 (http://www.percona.com/software/percona-xtradb-cluster) provides 2 status variables for such monitoring: wsrep_flow_control_paused and wsrep_flow_control_paused_ns. Which one should you use?

What is flow control?

Flow control exists only with Galera replication, not with regular MySQL replication. It is the mechanism nodes use when they are not able to keep up with the write load: to keep replication synchronous, the node that is starting to lag instructs the other nodes that writes should be paused for some time so it does not get too far behind.

If you are not familiar with this notion, you should read this blog post: http://www.mysqlperformanceblog.com/2013/05/02/galera-flow-control-in-percona-xtradb-cluster-for-mysql/

Triggering flow control and graphing it

For this test, we’ll use a 3-node Percona XtraDB Cluster 5.6 cluster. On node 3, we will adjust gcs.fc_limit so that flow control is triggered very quickly and then we will lock the node:

pxc3> set global wsrep_provider_options="gcs.fc_limit=1";
pxc3> flush tables with read lock;

Now we will use sysbench to insert rows on node 1:

$ sysbench --test=oltp --oltp-table-size=50000 --mysql-user=root --mysql-socket=/tmp/pxc1.sock prepare

Because of flow control, writes will be stalled and sysbench will hang. So after some time, we will release the lock on node 3:

pxc3> unlock tables;

During the whole process, wsrep_flow_control_paused and wsrep_flow_control_paused_ns are recorded every second with mysqladmin ext -i1. We can then build a graph of the evolution of both variables:

[Figure: wsrep_flow_control_pxc3 (http://www.mysqlperformanceblog.com/wp-content/uploads/2014/07/wsrep_flow_control_pxc3.png)]

While we can clearly see when flow control was triggered on both graphs, it is much easier to see when flow control stopped with wsrep_flow_control_paused_ns. It would be even more obvious if there had been several periods during which flow control was in effect.
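To make the "what is happening right now" reading concrete, here is a minimal sketch that turns successive samples of wsrep_flow_control_paused_ns into a per-interval pause fraction. It assumes the variable behaves as a cumulative nanosecond counter and that samples are taken once per second, as with mysqladmin ext -i1. The function name and the sample values are made up for illustration.

```python
def paused_fraction_per_interval(samples_ns, interval_s=1.0):
    """Turn cumulative wsrep_flow_control_paused_ns samples into the
    fraction of each sampling interval spent paused by flow control."""
    interval_ns = interval_s * 1e9
    fractions = []
    for prev, cur in zip(samples_ns, samples_ns[1:]):
        # A counter that goes backwards (e.g. after a restart) is treated as 0.
        delta = max(cur - prev, 0)
        fractions.append(min(delta / interval_ns, 1.0))
    return fractions


# Hypothetical samples: idle, two fully paused seconds, half a second, idle.
samples = [0, 0, 1_000_000_000, 2_000_000_000, 2_500_000_000, 2_500_000_000]
print(paused_fraction_per_interval(samples))  # [0.0, 1.0, 1.0, 0.5, 0.0]
```

A value that drops back to 0.0 marks the point where flow control stopped, which is the information that is hard to read from the averaged wsrep_flow_control_paused value.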

Conclusion

Monitoring a server is obviously necessary if you want to be able to catch issues, but you need to look at the right metrics. So don’t be scared if you see that wsrep_flow_control_paused is not 0: it simply means that flow control has been triggered at some point since the server started up. If you want to know what is happening right now, prefer wsrep_flow_control_paused_ns, which stops increasing as soon as flow control is no longer in effect.

The post Monitoring MySQL flow control in Percona XtraDB Cluster 5.6 (http://www.mysqlperformanceblog.com/2014/07/25/monitoring-flow-control-percona-xtradb-cluster-5-6/) appeared first on MySQL Performance Blog (http://www.mysqlperformanceblog.com/).

Jul
24
2014
--

Putting MySQL Fabric to Use: July 30 webinar

Martin and I have recently been blogging together about MySQL Fabric (http://www.mysql.com/products/enterprise/fabric.html). In case you’ve missed this, you can find the first post of the series here: http://www.mysqlperformanceblog.com/2014/04/25/managing-farms-of-mysql-servers-with-mysql-fabric/. On July 30th, we’re going to be presenting a webinar on this topic titled “Putting MySQL Fabric to Use” (http://www.percona.com/resources/mysql-webinars/putting-mysql-fabric-use).

The focus of the webinar is to help you get started quickly on this technology, so we’ll include very few slides (mostly just a diagram or two) and then jump straight into shared screen mode, with lots of live console and source code examples.

In order to make the best use of time, we won’t show you how to install and configure MySQL Fabric. However, we can point you to a few resources to help you get ready and even follow our examples as we go:

  • The official manual is an obvious starting point: http://dev.mysql.com/doc/mysql-utilities/1.4/en/fabric-setup.html
  • Our second post in the series includes configuration instructions: http://www.mysqlperformanceblog.com/2014/05/15/high-availability-mysql-fabric-part/
  • This git repo contains the test environment we’ll use to run our demos: https://github.com/martinarrieta/vagrant-fabric. Specifically, we’ll use the sharding branch (https://github.com/martinarrieta/vagrant-fabric/tree/sharding), so if you intend to follow our examples as we go, we recommend checking that one out.

If you’re interested, you can register for this webinar here: http://www.percona.com/resources/mysql-webinars/putting-mysql-fabric-use. If there’s something specific you’d like to see (we had a request for PHP examples in the comments to our last post, http://www.mysqlperformanceblog.com/2014/07/11/managing-shards-mysql-databases-mysql-fabric-2/), feel free to post that as a comment. We can’t promise we’ll be able to cover all requests during the webinar, but we’ll incorporate examples into the repo as time allows.

Hope to see you then!

The post Putting MySQL Fabric to Use: July 30 webinar (http://www.mysqlperformanceblog.com/2014/07/24/putting-mysql-fabric-to-use-july-30-webinar/) appeared first on MySQL Performance Blog (http://www.mysqlperformanceblog.com/).

Jul
24
2014
--

Intigua Raises $10M Series B Round For Its IT Management Automation Service

Intigua, a company that specializes in giving enterprises an easier way to manage their IT operations in private and public clouds, today announced that it has raised a $10 million Series B round led by Intel Capital. Existing investors Bessemer Venture Partners and Cedar Fund also participated in this round. In total, the company has now raised $21 million. The company says it will use this…

Jul
23
2014
--

DBaaS, OpenStack and Trove 101: Introduction to the basics

We’ll be publishing a series of posts on OpenStack and Trove over the next few weeks, diving into their usage and purpose. For readers who are already familiar with these technologies, there should be no doubt as to why we are incredibly excited about them, but for those who aren’t, consider this a small introduction to the basics and concepts.

What is Database as a Service (DBaaS)?

In a nutshell, DBaaS, as it is frequently referred to, is a loose moniker for the concept of providing a managed cloud-based database environment accessible by users, applications or developers. Its aim is to provide a full-fledged database environment, while minimizing the administrative turmoil and pains of managing the surrounding infrastructure.

Real life example: Imagine you are working on a new application that has to be accessible from multiple regions. Building and maintaining a large multiregion setup can be very expensive. Furthermore, it introduces additional complexity and strain on your system engineers once timezones start to come into play. The challenge of having to manage machines in multiple datacenters won’t simplify your release cycle, nor increase your engineers’ happiness.

Let’s take a look at some of the questions DBaaS could answer in a situation like this:

- How do I need to size my machines, and where should I locate them?

Small environments require less computing power and can be a good starting point, although this also means they may not be as well-prepared for future growth. Buying larger-scale hardware and hosting can be very expensive and can be a big stumbling block for a brand new development project. Hosting machines in multiple DCs could also introduce administrative difficulties, like having different SLAs and potential issues setting up WAN or VPN communications. DBaaS introduces an abstraction layer, so these considerations aren’t yours, but those of the company offering it, while you get to reap all the rewards.

- Who will manage my environment from an operational standpoint?

Staffing considerations and taking on the required knowledge to properly maintain a production database are often either temporarily swept under the rug or, when the situation turns out badly, a cause for the untimely demise of quite a few young projects. Rather than think about how long ago you should have applied that security patch, wouldn’t it be nice to just focus on managing the data itself, and be otherwise confident that the layers beyond it are managed responsibly?

- Have a sudden need to scale out?

Once you’re up and running, enjoying the success of a growing user base, your environment will need to scale accordingly. Rather than think long and hard on the many options available, as well as the logistics attached to those changes, your DBaaS provider could handle this transparently.

Popular public options: Here are a few names of public services you may have come across already that fall under the DBaaS moniker:

- Amazon RDS
- Rackspace cloud databases
- Microsoft SQLAzure
- Heroku
- Clustrix DBaaS

What differentiates these services from a standard remote database is the abstraction layer that fully automates their backend, while still offering an environment that is familiar to your development team (be it MySQL, MongoDB, Microsoft SQLServer, or otherwise). A big tradeoff to using these services is that you are effectively trusting an external company with all of your data, which might make your legal team a bit nervous.

Private cloud options?

What if you could offer your team the best of both worlds? Or even provide a similar type of service to your own customers? Over the years, a lot of platforms have been popping up to allow effective management and automation of virtual environments such as these, allowing you to effectively “roll your own” DBaaS. To get there, there are two important layers to consider:

  • Infrastructure Management, also referred to as Infrastructure-as-a-Service (IaaS), focusing on the logistics of spinning up virtual machines and keeping their required software packages running.
  • Database Management, previously referred to as DBaaS, transparently coordinating multiple database instances to work together and present themselves as a single, coherent data repository.

Examples of IaaS products:

- OpenStack
- OpenQRM

Example of DBaaS:

- Trove

Main Advantages of DBaaS

For reference, the main reasons why you might want to consider using an existing DBaaS are as follows:

- Reduced Database management costs

DBaaS reduces the amount of maintenance you need to perform on isolated DB instances. You offload the system administration of hardware, OS and database either to a dedicated service provider or, in the case where you are rolling your own, to a database team that can manage and scale the platform more efficiently (public vs private DBaaS).

- Simplifies certain security aspects

If you are opting to use a DBaaS platform, the responsibility of worrying about this or that patch being applied falls to your service provider, and you can generally assume that they’ll keep your platform secure from the software perspective.

- Centralized management

One system to rule them all. A guarantee of no nasty surprises concerning that one ancient server that should have been replaced years ago, but you never got around to it. As a user of DBaaS, all you need to worry about is how you interface with the database itself.

- Easy provisioning

Scaling of the environment happens transparently, with minimal additional management.

- Choice of backends

Typically, DBaaS providers offer you the choice of a multitude of database flavors, so you can mix and match according to your needs.

Main Disadvantages

- Reduced visibility of the backend

Releasing control of the backend requires a good amount of trust in your DBaaS provider. There is limited or no visibility into how backups are run and maintained, which configuration modifications are applied, or even when and which updates will be implemented. Just as you offload your responsibilities, you in turn need to rely on an SLA contract.

- Potentially harder to recover from catastrophic failures

Similarly to the above, unless your service providers have maintained thorough backups on your behalf, the lack of direct access to the host machines means that it could be much harder to recover from database failure.

- Reduced performance for specific applications

There’s a good chance that you are working on a shared environment. This means the amount of workload-specific performance tuning options is limited.

- Privacy and Security concerns

Although it is much easier to maintain and patch your environment, having a centralized system also means you’re more prone to potential attacks targeting your dataset. Whichever provider you go with, make sure you are intimately aware of the measures they take to protect you from that, and what is expected from your side to help keep it safe.

Conclusion: While DBaaS is an interesting concept that introduces a completely new way of approaching an application’s database infrastructure, and can bring enterprises easily scalable and financially flexible platforms, it should not be considered a silver bullet. Some big tradeoffs need to be considered carefully from the business perspective, and any move there should be accompanied by careful planning and investigation of options.

Embracing the immense flexibility these platforms offer, though, opens up a lot of interesting perspectives too. More and more companies are looking at ways to roll their own “as-a-Service”, provisioning completely automated hosted platforms for customers on-demand, and abstracting their management layers to allow them to be serviced by smaller, highly focused technical teams.

Stay tuned: Over the next few weeks we’ll be publishing a series of posts focusing on the combination of two technologies that allow for this type of flexibility: OpenStack and Trove.

The post DBaaS, OpenStack and Trove 101: Introduction to the basics (http://www.mysqlperformanceblog.com/2014/07/24/dbaas-openstack-and-trove-101-introduction-to-the-basics/) appeared first on MySQL Performance Blog (http://www.mysqlperformanceblog.com/).

Jul
23
2014
--

New Infosys CEO Vishal Sikka Prepares To Take The Helm

It’s been a whirlwind few months for Vishal Sikka. It began when he suddenly left his job as executive and head of products at SAP at the beginning of May and it has continued as he prepares to take the helm at Infosys on August 1st. When I spoke to him at the end of May, just ahead of the SAP Sapphire Conference, he seemed at peace with his decision to leave SAP in spite of the…

Jul
23
2014
--

Dropbox Gets Down To Business, Adds More Sharing Features And Search For Enterprises

Dropbox, now at 300 million users globally, says that there are 4 million businesses today using its cloud-based platform to store, distribute and share documents. But of those, only around 80,000 have opted so far to use Dropbox for Business, the company’s premium enterprise tier launched earlier this year. Today, the startup is making a move to sweeten the deal, unveiling a host of…

Jul
23
2014
--

Why TokuDB hates Transparent HugePages

If you try to install the TokuDB storage engine on a modern Linux distribution, it might fail with the following error message:

2014-07-17 19:02:55 13865 [ERROR] TokuDB will not run with transparent huge pages enabled.
2014-07-17 19:02:55 13865 [ERROR] Please disable them to continue.
2014-07-17 19:02:55 13865 [ERROR] (echo never > /sys/kernel/mm/transparent_hugepage/enabled)

You might be curious why TokuDB refuses to start with Transparent HugePages. Are they not a good thing, allowing smaller kernel page tables and fewer TLB misses when accessing data in the buffer pool? I was curious, so I asked Tim Callaghan this very question.

This problem originates with TokuDB using the jemalloc memory allocator, which uses a particular trick to deal with memory fragmentation. The classical problem with memory allocators is fragmentation: if you allocate, say, a 2MB chunk from the operating system (typically using mmap), then as the process runs it is likely that some of that 2MB block will become free, but not all of it, so it can’t be given back to the operating system completely. jemalloc’s trick is to give back portions of memory allocated this way through a madvise(…, MADV_DONTNEED) call.

Now what happens when you use transparent huge pages? In this case the operating system (and CPU, really) works with pages of a much larger size, which can only be unmapped from the address space in their entirety. That does not work when smaller objects are freed, because they leave smaller free “holes.”

As a result, without being able to free memory efficiently, the amount of allocated memory may grow unbounded until the process starts to swap and, at least under some workloads, is eventually killed by the “out of memory” killer. This is not behavior you want to see from a database server, so requiring huge pages to be disabled is the better choice.

Having said that, this is a pretty crude requirement/solution: disabling huge pages for the complete operating system image to make one application work, while others might be negatively impacted. I hope that future jemalloc and kernel releases will offer a solution where jemalloc simply prevents huge page usage for its own allocations.
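Before installing TokuDB it can be handy to check which Transparent HugePages mode is currently active. The sketch below parses the contents of the sysfs file named in the error message, where the kernel marks the active mode with square brackets. The function names are made up for illustration, and the file path is only meaningful on Linux.

```python
def active_thp_mode(sysfs_text):
    """Return the bracketed (active) mode from the contents of
    /sys/kernel/mm/transparent_hugepage/enabled,
    for example 'always madvise [never]' -> 'never'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no active mode found in {sysfs_text!r}")


def read_thp_mode(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Read the setting from sysfs (Linux only); the path is the one
    shown in the TokuDB error message."""
    with open(path) as fh:
        return active_thp_mode(fh.read())


print(active_thp_mode("[always] madvise never"))  # always
print(active_thp_mode("always madvise [never]"))  # never
```

Anything other than never from read_thp_mode() would, going by the error message above, stop TokuDB from starting.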

Using jemalloc and its approach of removing pages from the resident set also makes TokuDB look very different from typical MySQL instances running InnoDB when viewed from the process space. With InnoDB, VSZ and RSS are often close. In fact we often monitor VSZ to ensure it is not excessively large, to avoid the danger of the process starting to swap actively or being killed by the OOM killer. TokuDB, however, can often look like this:

[root@smt1 mysql]# ps aux | grep mysqld
mysql 14604 21.8 50.6 12922416 4083016 pts/0 Sl Jul17 1453:27 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --log-error=/var/lib/mysql/smt1.pz.percona.com.err --pid-file=/var/lib/mysql/smt1.pz.percona.com.pid
root 28937 0.0 0.0 103244 852 pts/2 S+ 10:38 0:00 grep mysqld

In this case TokuDB is running with defaults on an 8GB system. It takes approximately 50% of memory in terms of RSS; however, the VSZ of the process is over 12GB, which is a lot more than the memory available.

This is completely fine for TokuDB. If I had not disabled Transparent HugePages, though, my RSS would be a lot closer to VSZ, causing intense swapping or even the process being killed by the OOM killer.
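The VSZ and RSS figures quoted above can be recomputed from the ps output. This is a small sketch, assuming the usual ps aux column order with VSZ and RSS reported in KiB. The function name is made up, and the sample line is the mysqld line from the listing with its arguments shortened.

```python
def ps_memory_gib(ps_line):
    """Extract VSZ and RSS (reported by `ps aux` in KiB) and convert to GiB."""
    fields = ps_line.split()
    # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    vsz_kib, rss_kib = int(fields[4]), int(fields[5])
    kib_per_gib = 1024 ** 2
    return vsz_kib / kib_per_gib, rss_kib / kib_per_gib


line = "mysql 14604 21.8 50.6 12922416 4083016 pts/0 Sl Jul17 1453:27 /usr/sbin/mysqld"
vsz, rss = ps_memory_gib(line)
print(f"VSZ={vsz:.1f} GiB RSS={rss:.1f} GiB ratio={vsz / rss:.1f}")
# VSZ=12.3 GiB RSS=3.9 GiB ratio=3.2
```

A VSZ roughly three times the RSS is the pattern described here: address space that jemalloc has handed back to the kernel still counts towards VSZ but no longer towards RSS.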

In addition to explaining this to me, Tim Callaghan was also kind enough to share some links on this issue from other companies, which provide more background information on this topic:

  • Oracle: https://blogs.oracle.com/linux/entry/performance_issues_with_transparent_huge
  • NuoDB: http://dev.nuodb.com/techblog/linux-transparent-huge-pages-jemalloc-and-nuodb
  • Splunk: http://answers.splunk.com/answers/112305/on-rh-6-and-splunk-6-my-searches-are-consuming-lots-of-cpu
  • SAP: http://www.stechno.net/sap-notes.html?view=sapnote&id=1871318
  • SAP (2): http://scn.sap.com/people/markmumy/blog/2014/05/22/sap-iq-and-linux-hugepagestransparent-hugepages

The post Why TokuDB hates Transparent HugePages (http://www.mysqlperformanceblog.com/2014/07/23/why-tokudb-hates-transparent-hugepages/) appeared first on MySQL Performance Blog (http://www.mysqlperformanceblog.com/).

Jul
22
2014
--

PayPal Expands Its Working Capital Service To UK, Switches From Loans To Cash Advances

As payments platforms look for more ways to grow their margins and usage among businesses, they continue to push into a wider and deeper range of financial services. In one of the latest moves, eBay’s PayPal is expanding its Working Capital service to the UK. This is its first market for PayPal’s lending platform outside of the U.S., where it first launched the service in…

Jul
22
2014
--

Reference architecture for a write-intensive MySQL deployment

We designed Percona Cloud Tools (https://cloud.percona.com/), both the hardware and the software setup, to handle a very write-intensive MySQL workload. For example, we already observe inserts of more than 1 billion datapoints per day. So I wanted to share what kind of hardware we use to achieve this result.
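To put the daily figure in perspective, it can be converted into an average rate per second. This is only back-of-the-envelope arithmetic on the number quoted above. It says nothing about peaks, which the post does not give, and the function name is made up.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_per_second(events_per_day):
    """Average events per second, assuming a perfectly even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


print(round(average_rate_per_second(1_000_000_000)))  # 11574
```

So 1 billion datapoints per day corresponds to a sustained average of roughly 11,500 inserted datapoints per second.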

Let me describe what we use, and later I will explain why.

Server:

  • Chassis: Supermicro SC825TQ-R740LPB 2U Rackmount Chassis
  • Motherboard: Supermicro X9DRI-F dual socket
  • CPU: Dual Intel Xeon Ivy Bridge E5-2643v2 (6x 3.5GHz cores, 12x HT cores, 25M L3)
  • Memory: 256GB (16x 16GB 256-bit quad-channel) ECC registered DDR3-1600
  • RAID: LSI MegaRAID 9260-4i 4-port 6G/s hardware RAID controller, 512M buffer
  • Main Storage: PCIe SSD HGST FlashMAX II 4.8TB
  • Secondary Storage (OS, logs): RAID 1 over 2x 3TB hard drives

Software:

  • OS: Ubuntu 12.04 LTS
  • MySQL: Percona Server 5.6.19 (http://www.percona.com/software/percona-server) + TokuDB
  • Backup: LVM partitions + mylvmbackup

When selecting hardware for your application, you need to look at many aspects. Typically you’re looking for a solution that you already have experience working with and that has also proved to be the most efficient option. For us it has been as follows:

Cloud vs Bare Metal

We have experience with hosting hardware at a data center, as well as cash for upfront investments in hardware, so we decided to go for physical self-hosted hardware instead of the cloud. Going this route also gave us maximum flexibility in choosing the hardware setup best suited to our application rather than selecting one of the stock options.

Scale Up vs Scale Out

We designed the system from scratch to be able to utilize multiple servers through sharding, so our main concern is choosing the optimal configuration for a server and provisioning servers as needed. In addition to raw performance we also need to consider power usage and the overhead of managing many servers, which typically makes slightly more high-end hardware worth it.

Resource Usage

Every application uses resources in different ways, so the optimal configuration will differ depending on your application. Yet all applications use the same kinds of resources, and you need to consider each of them. Typically you want to plan for all of your resources to be substantially used, while leaving some margin for spikes and maintenance.

CPU

  • Our application processes a lot of data and uses the TokuDB storage engine which uses a lot of CPU for compression, so we needed powerful CPUs.
  • Many MySQL operations are not parallel (think of executing a single query or an ALTER TABLE), so we went for CPUs with faster cores rather than a larger number of cores. The resulting configuration, with 2 sockets giving 12 cores and 24 threads, is good enough for our workloads.
  • Lower end CPUs such as Xeon E3 have very attractive price/performance but only support 32GB of memory which was not enough for our application.

Memory

  • For database boxes memory is mainly used as a cache, so depending on your application you may be better off investing in memory or in storage for optimal performance. Check out this blog post for more details: http://www.mysqlperformanceblog.com/2010/04/08/fast-ssd-or-more-memory/
  • Accessing data in memory is much faster than even on the fastest flash storage, so memory is still important. For our workload, having recent data in memory is very important, so we get as much “cheap” memory as we can, populating all 16 slots with 16GB DIMMs, which have an attractive cost per GB at this point.

Storage

There are multiple uses for the storage, so there are many variables to consider:

  • Bandwidth
    • We need to be able to access data on the storage device quickly and with a stable response time. HGST FlashMAX II has been able to meet these very demanding needs.
  • Endurance
    • When using flash storage you need to worry about endurance: how many writes the flash storage can take before it wears out. Some low-cost MLC SSDs would wear out within weeks if written to at maximum speed. HGST FlashMAX II has an endurance rating of 10 petabytes written for a random workload and 30 petabytes written for a sequential workload.
    • We also use the TokuDB storage engine, which significantly reduces the amount of writes compared to InnoDB.
  • Durability
    • Does the storage provide true durability, with data guaranteed to be persisted once a write is acknowledged at the operating system level, or is data loss possible when power goes down?
    • We do not want to risk database corruption in case of power failure, so we were looking for a storage solution that guarantees durability.
    • HGST FlashMAX II guarantees durability, which has been confirmed by our stress tests.
  • Size
    • To scale application storage demands you need to scale both the number of IO operations the storage can handle and the storage size. For flash storage it is often the size that becomes the limiting factor.
    • The HGST FlashMAX II 4.8TB capacity is the best available on the market, which allows us to go “All Flash” and achieve very quick access to our entire data set.
  • Secondary Storage
    • Not every application need requires the properties of flash storage.
    • We have secondary storage on conventional drives for the operating system and logs.
    • A sequential read/write pattern works well with low-cost conventional drives and also allows us to extend flash lifetime by having it handle fewer writes.
    • We’re using RAID with BBU for secondary storage to be able to have fully durable binary logs without paying a high performance penalty.
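The endurance ratings in the list above translate into an expected lifetime once you assume a sustained write rate. The sketch below does that arithmetic. The 10 PB and 30 PB ratings come from the text, while the 100 MB/s write rate is a made-up figure for illustration. The calculation also ignores write amplification and uses decimal units throughout.

```python
PB = 10 ** 15  # decimal petabyte, in bytes


def years_until_worn_out(rated_bytes_written, write_mb_per_s):
    """Years of continuous writing at write_mb_per_s (decimal MB/s)
    before the rated write endurance is used up."""
    bytes_per_year = write_mb_per_s * 10 ** 6 * 365 * 24 * 60 * 60
    return rated_bytes_written / bytes_per_year


# Ratings quoted above; the 100 MB/s sustained rate is hypothetical.
for label, rating in (("random", 10 * PB), ("sequential", 30 * PB)):
    print(label, round(years_until_worn_out(rating, 100), 1))
# random 3.2
# sequential 9.5
```

Plugging in your own measured write rate shows how much headroom a given rating provides, and why an engine that writes less per insert extends the life of the device.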

Why PCIe SSD over SATA SSD?

There are arguments that SATA SSDs provide good enough performance for MySQL and that there is no need for PCIe. While these arguments are valid in one dimension, there are several more to consider.

First, as I said, PCIe SSD still provides the best absolute response time, and that is an important factor for the end-user experience in SaaS systems like Percona Cloud Tools (https://cloud.percona.com/).

Second, consider maintenance operations like backups, ALTER TABLE or slave setups. While these operations are boring and do not get as much attention as response time or throughput in benchmarks, they are still operations that DBAs perform basically daily, and it is very important to finish a backup or ALTER TABLE in a predictable time, especially in the 3-4TB data size range. This is where PCIe SSDs perform much better than SATA SSDs. For SATA SSDs, especially the bigger sizes, write endurance is another point of concern.

Why TokuDB engine?

The TokuDB engine is the best when it comes to insert operations into a huge dataset, and a few more factors make it a no-brainer:

  • TokuDB compression is a huge win. I estimate that we will fit about 20-30TB of raw data into this storage (FlashMAX II 4.8TB).
  • TokuDB is SSD friendly, as it performs far fewer data writes per INSERT operation than InnoDB, which greatly extends the lifetime of the SSD (which is, well, expensive to say the least).
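The 20-30TB estimate implies a compression ratio that is easy to work out. The sketch below simply divides the raw data size by the device capacity. It ignores filesystem overhead and free-space reserves, so treat the result as a rough upper-level figure rather than a measured ratio. The function name is made up.

```python
def implied_compression_ratio(raw_tb, device_tb):
    """Ratio of raw (uncompressed) data to the device capacity it must fit in."""
    return raw_tb / device_tb


for raw_tb in (20, 30):
    print(raw_tb, round(implied_compression_ratio(raw_tb, 4.8), 2))
# 20 4.17
# 30 6.25
```

In other words, the estimate assumes TokuDB compresses this data set by a factor of roughly 4 to 6.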

The post Reference architecture for a write-intensive MySQL deployment (http://www.mysqlperformanceblog.com/2014/07/22/reference-architecture-for-a-write-intensive-mysql-deployment/) appeared first on MySQL Performance Blog (http://www.mysqlperformanceblog.com/).
