Red Hat looks beyond Linux

The Red Hat Linux distribution is turning 25 years old this week. What started as one of the earliest Linux distributions has grown into the most successful open-source company, and its success was a catalyst for others to follow its model. Today’s open-source world is very different from those heady days in the mid-1990s when Linux looked to be challenging Microsoft’s dominance on the desktop, but Red Hat is still going strong.

To put all of this into perspective, I sat down with the company’s current CEO (and former Delta Air Lines COO) Jim Whitehurst to talk about the past, present and future of the company, and open-source software in general. Whitehurst took the Red Hat CEO position 10 years ago, so while he wasn’t there in the earliest days, he definitely witnessed the evolution of open source in the enterprise, which is now more widespread than ever.

“Ten years ago, open source at the time was really focused on offering viable alternatives to traditional software,” he told me. “We were selling layers of technology to replace existing technology. […] At the time, it was open source showing that we can build open-source tech at lower cost. The value proposition was that it was cheaper.”

At the time, he argues, the market was about replacing Windows with Linux or IBM’s WebSphere with JBoss. And that defined Red Hat’s role in the ecosystem, too, which was less about technological innovation than about packaging. “For Red Hat, we started off taking these open-source projects and making them usable for traditional enterprises,” said Whitehurst.

Jim Whitehurst, Red Hat president and CEO (photo by Joan Cros/NurPhoto via Getty Images)

About five or six years ago, something changed, though. Large corporations, including Google and Facebook, started open sourcing their own projects because they didn’t look at some of the infrastructure technologies they opened up as competitive advantages. Instead, having them out in the open allowed them to profit from the ecosystems that formed around them. “The biggest part is it’s not just Google and Facebook finding religion,” said Whitehurst. “The social tech around open source made it easy to make projects happen. Companies got credit for that.”

He also noted that developers now look at their open-source contributions as part of their resumé. With an increasingly mobile workforce that regularly moves between jobs, companies that want to compete for talent are almost forced to open source at least some of the technologies that don’t give them a competitive advantage.

As the open-source ecosystem evolved, so did Red Hat. As enterprises started to understand the value of open source (and stopped being afraid of it), Red Hat shifted from simply talking to potential customers about savings to how open source can help them drive innovation. “We’ve gone from being commoditizers to being innovators. The tech we are driving is now driving net new innovation,” explained Whitehurst. “We are now not going in to talk about saving money but to help drive innovation inside a company.”

Over the last few years, that included making acquisitions to help drive this innovation. In 2015, Red Hat bought IT automation service Ansible, for example, and last month, the company closed its acquisition of CoreOS, one of the larger independent players in the Kubernetes container ecosystem — all while staying true to its open-source roots.

There is only so much innovation you can do around a Linux distribution, though, and as a public company, Red Hat also had to look beyond that core business and build on it to better serve its customers. In part, that’s what drove the company to launch services like OpenShift, for example, a container platform that sits on top of Red Hat Enterprise Linux and — not unlike the original Linux distribution — integrates technologies like Docker and Kubernetes and makes them more easily usable inside an enterprise.

The reason for that? “I believe that containers will be the primary way that applications will be built, deployed and managed,” he told me, and argued that his company, especially after the CoreOS acquisition, is now a leader in both containers and Kubernetes. “When you think about the importance of containers to the future of IT, it’s a clear value for us and for our customers.”

The other major open-source project Red Hat is betting on is OpenStack. That may come as a bit of a surprise, given that popular opinion in the last year or so has shifted against the massive project that wants to give enterprises an open-source on-premises alternative to AWS and other cloud providers. “There was a sense among big enterprise tech companies that OpenStack was going to be their savior from Amazon,” Whitehurst said. “But even OpenStack, flawlessly executed, put you where Amazon was five years ago. If you’re Cisco or HP or any of those big OEMs, you’ll say that OpenStack was a disappointment. But from our view as a software company, we are seeing good traction.”

Because OpenStack is especially popular among telcos, Whitehurst believes it will play a major role in the shift to 5G. “When we are talking to telcos, […] we are very confident that OpenStack will be the platform for 5G rollouts.”

With OpenShift and OpenStack, Red Hat believes that it has covered both the future of application development and the infrastructure on which those applications will run. Looking a bit further ahead, though, Whitehurst also noted that the company is starting to look at how it can use artificial intelligence and machine learning to make its own products smarter and more secure, but also at how it can use its technologies to enable edge computing. “Now that large enterprises are also contributing to open source, we have a virtually unlimited amount of material to bring our knowledge to,” he said.



Multi-Source Replication Performance with GTID


In this blog post, we’ll look at the performance of multi-source replication with GTID.

Multi-Source Replication is a topology I’ve seen discussed recently, so I decided to look into how it performs with the different replication concepts. Multi-source replication uses replication channels, which allow a slave to replicate from multiple masters. This is a great way to consolidate data that has been sharded for production, or to simplify the analytics process by using the same server. Since multiple masters are taking writes, care is needed not to overload the slave. The traditional replication concept uses the binary log file name and the position inside that file.

This was the standard until the release of global transaction identifiers (GTIDs). I set up a test environment to validate which concept would perform better, and be the better choice for use in this topology.


My test suite is rather simple, consisting of only three virtual machines: two masters and one slave. The slave’s replication channels were set up using the same concept for each run, and no run had any replication filters. To prevent replication errors, each master took writes against a different schema, and user grants were identical on all three servers. The setup below ran with both replication channels using binary log file and position. Then the tables were dropped and the servers were changed to use GTID for the next run.
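The channels themselves are configured with CHANGE MASTER TO ... FOR CHANNEL. As a rough sketch of how the two runs differed (hostnames and credentials here are placeholders, not the ones from my environment):

```sql
-- Run 1: traditional binary log file and position, one channel per master
CHANGE MASTER TO
  MASTER_HOST='master1', MASTER_USER='repl', MASTER_PASSWORD='***',
  MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4
FOR CHANNEL 'db1';

-- Run 2: GTID auto-positioning for the same channel
CHANGE MASTER TO
  MASTER_HOST='master1', MASTER_USER='repl', MASTER_PASSWORD='***',
  MASTER_AUTO_POSITION=1
FOR CHANNEL 'db1';
```

The 'db3' channel is set up the same way against the second master.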

Prepare the sysbench tables:

sysbench --db-driver=mysql --mysql-user= --mysql-password='' --mysql-db=db1 --range_size=100 --table_size=1000000 --tables=5 --threads=5 --events=0 --rand-type=uniform /usr/share/sysbench/oltp_read_only.lua prepare
sysbench --db-driver=mysql --mysql-user= --mysql-password='' --mysql-db=db3 --range_size=100 --table_size=1000000 --tables=5 --threads=5 --events=0 --rand-type=uniform /usr/share/sysbench/oltp_read_only.lua prepare

I used a read-only sysbench to warm up the InnoDB buffer pool. Both commands ran on the slave to ensure both schemas were loaded into the buffer pool:

sysbench --db-driver=mysql --mysql-user= --mysql-password='' --mysql-db=db1 --range_size=100 --table_size=1000000 --tables=5 --threads=5 --events=0 --time=3600 --rand-type=uniform /usr/share/sysbench/oltp_read_only.lua run
sysbench --db-driver=mysql --mysql-user= --mysql-password='' --mysql-db=db3 --range_size=100 --table_size=1000000 --tables=5 --threads=5 --events=0 --time=3600 --rand-type=uniform /usr/share/sysbench/oltp_read_only.lua run

After warming up the buffer pool, the slave should be fully caught up with both masters. To remove IO contention as a possible influencer, I stopped the SQL thread while I generated load on the masters. Leaving the IO thread running allowed the slave to write the relay logs during this process, and helped ensure that the test only measures the difference in the slave SQL thread.

stop slave sql thread for channel 'db1'; stop slave sql thread for channel 'db3';

Each master had a sysbench run against it for the schema that was designated to it in order to generate the writes:

sysbench --db-driver=mysql --mysql-user= --mysql-password='' --mysql-db=db1 --range_size=100 --table_size=1000000 --tables=5 --threads=1 --events=0 --time=3600 --rand-type=uniform /usr/share/sysbench/oltp_write_only.lua run
sysbench --db-driver=mysql --mysql-user= --mysql-password='' --mysql-db=db3 --range_size=100 --table_size=1000000 --tables=5 --threads=1 --events=0 --time=3600 --rand-type=uniform /usr/share/sysbench/oltp_write_only.lua run

Once the writes completed, I monitored the IO activity on the slave to ensure it was 100% idle and that all relay logs were fully captured. Once everything was fully written, I enabled a capture of the replication lag once per minute for each replication channel, and started the slave’s SQL threads:

/usr/bin/pt-heartbeat -D db1 -h localhost --master-server-id=101 --check
/usr/bin/pt-heartbeat -D db3 -h localhost --master-server-id=103 --check
start slave sql thread for channel 'db1'; start slave sql thread for channel 'db3';

The above chart depicts the cumulative lag seen on the slave by pt-heartbeat since starting the sql_thread. The first item to notice is that the replication delay was higher overall with the binary log. This could be because the SQL thread was stopped for a different amount of time. This may appear to give GTID an advantage in this test, but remember that in this test the amount of delay is less important than the processing rate. Focusing on when replication began to show a significant change towards catching up, you can see two distinct drops in delay. This is because the slave has two replication threads that individually monitor their delay: one of them caught up fully while the other was delayed for a bit longer.

In every test run, GTID took slightly longer to fully catch up than the traditional method. There are a couple of reasons to expect GTID to be slightly slower. One possibility is that there are additional writes on the slave, needed to keep track of all the GTIDs that the slave ran. I removed the initial write to the relay log, but we must retain the committed GTIDs, and this causes additional writes. I used the default settings for MySQL, and as such log_slave_updates was disabled. This causes the replicated GTIDs to be stored in a table, which is periodically compressed. You can find more details on how log_slave_updates impacts GTID replication here.
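As a sketch of what that bookkeeping looks like, the persisted GTIDs and the compression interval can be inspected on the slave using the standard MySQL 5.7 table and variable names:

```sql
-- GTIDs of replicated transactions, persisted in a table because
-- log_slave_updates is disabled
SELECT * FROM mysql.gtid_executed;

-- Number of transactions between compressions of that table
SHOW VARIABLES LIKE 'gtid_executed_compression_period';
```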

So the question still exists: why should we use GTID, especially with multi-source replication? I’ve found that the answer lies in the composition of a GTID. From MySQL’s GTID Concepts, a GTID is composed of two parts: the source_id and the transaction_id. The source_id is a unique identifier for the server that originally wrote the transaction. This allows you to identify in the binary log which master took the initial write, so you can pinpoint problems much more easily.

The below excerpt from the binary log of DB1 (a master in this test) shows that, before the transaction is written, “SET @@SESSION.GTID_NEXT” runs. This is the GTID that you can follow through the rest of the topology to identify the same transaction.

“d1ab72e9-0220-11e8-aee7-00155dab6104” is the server_uuid for DB1, and 270035 is the transaction id.

SET @@SESSION.GTID_NEXT= 'd1ab72e9-0220-11e8-aee7-00155dab6104:270035'/*!*/;
# at 212345
#180221 15:37:56 server id 101 end_log_pos 212416 CRC32 0x758a2d77 Query thread_id=15 exec_time=0 error_code=0
SET TIMESTAMP=1519245476/*!*/;
# at 212416
#180221 15:37:56 server id 101 end_log_pos 212472 CRC32 0x4363b430 Table_map: `db1`.`sbtest1` mapped to number 109
# at 212472
#180221 15:37:56 server id 101 end_log_pos 212886 CRC32 0xebc7dd07 Update_rows: table id 109 flags: STMT_END_F
### UPDATE `db1`.`sbtest1`
### @1=654656 /* INT meta=0 nullable=0 is_null=0 */
### @2=575055 /* INT meta=0 nullable=0 is_null=0 */
### @3='20363719684-91714942007-16275727909-59392501704-12548243890-89454336635-33888955251-58527675655-80724884750-84323571901' /* STRING(120) meta=65144 nullable=0 is_null=0 */
### @4='97609582672-87128964037-28290786562-40461379888-28354441688' /* STRING(60) meta=65084 nullable=0 is_null=0 */
### SET
### @1=654656 /* INT meta=0 nullable=0 is_null=0 */
### @2=575055 /* INT meta=0 nullable=0 is_null=0 */
### @3='17385221703-35116499567-51878229032-71273693554-15554057523-51236572310-30075972872-00319230964-15844913650-16027840700' /* STRING(120) meta=65144 nullable=0 is_null=0 */
### @4='97609582672-87128964037-28290786562-40461379888-28354441688' /* STRING(60) meta=65084 nullable=0 is_null=0 */
# at 212886
#180221 15:37:56 server id 101 end_log_pos 212942 CRC32 0xa6261395 Table_map: `db1`.`sbtest3` mapped to number 111
# at 212942
#180221 15:37:56 server id 101 end_log_pos 213166 CRC32 0x2782f0ba Write_rows: table id 111 flags: STMT_END_F
### INSERT INTO `db1`.`sbtest3`
### SET
### @1=817058 /* INT meta=0 nullable=0 is_null=0 */
### @2=390619 /* INT meta=0 nullable=0 is_null=0 */
### @3='01297933619-49903746173-24451604496-63437351643-68022151381-53341425828-64598253099-03878171884-20272994102-36742295812' /* STRING(120) meta=65144 nullable=0 is_null=0 */
### @4='29893726257-50434258879-09435473253-27022021485-07601619471' /* STRING(60) meta=65084 nullable=0 is_null=0 */
# at 213166
#180221 15:37:56 server id 101 end_log_pos 213197 CRC32 0x5814a60c Xid = 2313
# at 213197


Based on the sysbench tests I ran, GTID replication has slightly lower throughput. It took about two to three minutes longer to process an hour’s worth of writes on two masters, compared to binary log replication. GTID’s strengths lie more in how it eases the management and troubleshooting of complex replication topologies.

The GTID concept allows a slave to know exactly which server initially wrote the transaction, even in a tiered environment. This means that if you need to promote a slave from the bottom tier to the middle tier, simply changing the master is all that is needed. The slave can pick up from the last transaction it ran on that server and continue replicating without a problem. Stephane Combaudon explains this in detail in a pair of blogs. You can find part 1 here and part 2 here. Facebook also has a great post about their experience deploying GTID-based replication and the troubles they faced.
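As a minimal sketch of such a promotion (the hostname is a placeholder), GTID auto-positioning removes any need to work out the correct binary log file and position on the new master:

```sql
STOP SLAVE;
CHANGE MASTER TO
  MASTER_HOST='new-middle-tier-master',
  MASTER_AUTO_POSITION=1;
START SLAVE;
```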

The post Multi-Source Replication Performance with GTID appeared first on Percona Database Performance Blog.


MongoDB 3.6 Retryable Writes . . . Retryable Writes


In this blog post, we will discuss MongoDB 3.6 Retryable Writes, a new application-level feature.


From the beginning, MongoDB replica sets were designed to recover gracefully from many internal problems or events such as node crashes, network partitions/errors/interruptions, replica set member fail-overs, etc.

While the replica set eventually recovers from these events transparently, in many instances they return errors to the application. The most common example is a failover of the Primary during a write: this returns network errors to most MongoDB drivers. Another possible situation is a Primary receiving a write whose acknowledgment response never makes it back to the driver. Here it is unclear to the application whether the write really succeeded or not.

If an application is designed for writes to be idempotent, generally all the application needs to do in a problem scenario is send the same write operation again and again until it succeeds. This approach is extremely dangerous to data integrity, however, if the application was not designed for idempotent writes! Retrying writes relying on state can lead to incorrect results.

MongoDB 3.6 Retryable Writes

MongoDB 3.6 introduces the concept of Retryable Writes to address situations where simple retrying of idempotent operations is not possible or desired (often more code is required to perform retries). This feature is implemented transparently via the use of unique IDs for each write operation that both the MongoDB driver and server can consider when handling failures.

This feature allows the application driver to automatically retry a failed write behind-the-scenes, without throwing an exception/error to the application. Retryable Writes mitigates problems caused by short interruptions, not long-term problems. Therefore, the mechanism only retries a write operation exactly once. If the retry is unsuccessful, then the application receives an error/exception as normal.

If a healthy Primary cannot be found to retry the write, the MongoDB driver waits for a time period equal to the serverSelectionTimeoutMS server parameter before retrying the write, so that it can allow for a failover to occur.

MongoDB implemented this feature in both the MongoDB driver and server, and it has some requirements:

  • MongoDB Version – every node in the cluster or replica set must run version 3.6 or greater. All nodes must also have featureCompatibilityVersion set to ‘3.6’.
  • MongoDB Driver – this feature requires that your application use a MongoDB driver that supports it.
  • Replication – The Retryable Writes feature requires that MongoDB Replication is enabled. You can use a single-node Replica Set to achieve this if you do not wish to deploy many nodes.
  • Write Concern – A Write Concern of ‘1’ or greater is required for this feature to operate.
  • Storage Engine – The use of MMAPv1 is not possible with this feature. WiredTiger or inMemory storage engines only!

With the exception of insert operations, this feature is limited to operations that change only a single document, meaning the following operations cannot use Retryable Writes:

  1. Multi-document Update (multi: true)
  2. Multi-document Delete
  3. Bulk Operations with Multi-document changes

The full list of operations available for use with this feature is here:

Using Retryable Writes

Enabling Retryable Writes doesn’t require major code changes!

Generally, you enable the use of Retryable Writes by adding the ‘retryWrites=’ flag to your MongoDB connection string that is passed to your MongoDB driver:
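For example, a connection string for a hypothetical replica set named ‘rs0’ would look something like this (host, port and database are placeholders):

```
mongodb://mongodb0.example.com:27017/test?replicaSet=rs0&retryWrites=true
```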


You enable the feature on the ‘mongo’ shell with the command-line flag ‘--retryWrites’:

mongo --retryWrites

That’s it! The rest is transparent to you!


The MongoDB 3.6 Retryable Writes feature continues a theme I’ve noticed in the last few major releases: improved data integrity and improved development experience.

The use of this great new feature should lead to simplified code and improved data integrity in applications using non-idempotent changes!

The post MongoDB 3.6 Retryable Writes . . . Retryable Writes appeared first on Percona Database Performance Blog.


As marketing data proliferates, consumers should have more control

At the Adobe Summit in Las Vegas this week, privacy was on the minds of many people. It was no wonder with social media data abuse dominating the headlines, GDPR just around the corner, and Adobe announcing the concept of a centralized customer experience record.

With so many high profile breaches in recent years, putting your customer data in a central record-keeping system would seem to be a dangerous proposition, yet Adobe sees so many positives for marketers, it likely believes this to be a worthy trade-off.

Which is not to say that the company doesn’t see the risks. Executives speaking at the conference continually insisted that privacy is always part of the conversation at Adobe as they build tools, and they have built security and privacy safeguards into the customer experience record.

Ben Kepes, an independent analyst, says this kind of data collection raises ethical questions about how to use it. “This new central repository of data about individuals is going to be incredibly attractive to Adobe’s customers. The company is doing what big brands and corporations ask for. But in these post-Cambridge Analytica days, I wonder how much of a moral obligation Adobe and the other vendors have to ensure their tools are used for good purposes,” Kepes said.

Offering better experiences

It’s worth pointing out that the goal of this exercise isn’t simply to collect data for data’s sake. It’s to offer consumers a more customized and streamlined experience. How does that work? There was a demo in the keynote illustrating a woman’s experience with a hotel brand.

Brad Rencher, EVP and GM at Adobe Experience Cloud explains Adobe’s Cloud offerings. Photo: Jeff Bottari/Invision for Adobe/AP Images

The mythical woman started a reservation for a trip to New York City, got distracted in the middle and was later “reminded” to return to it via Facebook ad. She completed the reservation and was later issued a digital key to her room, allowing her to bypass the front desk check-in process. Further, there was a personal greeting on the television in her room with a custom message and suggestions for entertainment based on her known preferences.

As one journalist pointed out in the press event, this level of detail from the hotel is not something that would thrill him (beyond the electronic check-in). Yet there doesn’t seem to be a way to opt out of that data (unless you live in the EU and will be subject to GDPR rules).

Consumers may want more control

As it turns out, that reporter wasn’t alone. According to a survey conducted last year by The Economist Intelligence Unit in conjunction with ForgeRock, an identity management company, consumers are not just the willing sheep that tech companies may think they are.

The survey was conducted last October with 1,629 consumers participating from eight countries including Australia, China, France, Germany, Japan, South Korea, the UK and the US. It’s worth noting that survey questions were asked in the context of Internet of Things data, but it seems that the results could be more broadly applied to any types of data collection activities by brands.

There are a couple of interesting data points that brands should perhaps heed as they collect customer data in the fashion outlined by Adobe. In particular, as it relates to what Adobe and other marketing software companies are trying to do in building a central customer profile: when asked to rate the statement, “I am uncomfortable with companies building a ‘profile’ of me to predict my consumer behaviour,” 39 percent strongly agreed. Another 35 percent somewhat agreed. That would suggest that consumers aren’t necessarily thrilled with this idea.

When presented with the statement, “Providing my personal information may have more drawbacks than benefits,” 32 percent strongly agreed and 41 percent somewhat agreed.

That would suggest that it is on the brand to make it clearer to consumers that they are collecting that data to provide a better overall experience, because it appears that consumers who answered this survey are not necessarily making that connection.

Perhaps it wasn’t a coincidence that at a press conference after the Day One keynote announcing the unified customer experience record, many questions from analysts and journalists focused on notions of privacy. If Adobe is helping companies gather and organize customer data, what role does it have in how its customers use that data, what role does the brand have, and how much control should consumers have over their own data?

These are questions we seem to be answering on the fly. The technology is here now or very soon will be, and wherever the data comes from, whether the web, mobile devices or the Internet of Things, we need to get a grip on the privacy implications — and we need to do it quickly. If consumers want more control as this survey suggests, maybe it’s time for companies to give it to them.


Percona XtraBackup 2.4.10 Is Now Available


Percona announces the GA release of Percona XtraBackup 2.4.10 on March 30, 2018. This release is based on MySQL 5.7.19. You can download it from our download site and apt and yum repositories.

Percona XtraBackup enables MySQL backups without blocking user queries, making it ideal for companies with large data sets and mission-critical applications that cannot tolerate long periods of downtime. Offered free as an open source solution, it drives down backup costs while providing unique features for MySQL backups.

The Percona XtraBackup issue tracking system has moved from Launchpad to JIRA.

Bugs Fixed:

  • xbcrypt with the --encrypt-key-file option was failing due to a regression in Percona XtraBackup 2.4.9. Bug fixed PXB-518.
  • Simultaneous use of the --lock-ddl and --lock-ddl-per-table options caused Percona XtraBackup to deadlock, and the backup process never completed. Bug fixed PXB-792.
  • Compilation under Mac OS X was broken. Bug fixed PXB-796.
  • A regression in the maximum number of pending reads, together with a previously unnoticed possibility of a pending-reads-related deadlock, caused Percona XtraBackup to get stuck in the prepare stage. Bug fixed PXB-1467.
  • Percona XtraBackup skipped tablespaces with a corrupted first page instead of aborting the backup. Bug fixed PXB-1497.

Other bugs fixed: PXB-513.

Release notes with all the bugfixes for version 2.4.10 are available in our online documentation. Please report any bugs to the issue tracker.

The post Percona XtraBackup 2.4.10 Is Now Available appeared first on Percona Database Performance Blog.


Azure’s availability zones are now generally available

No matter what cloud you build on, if you want to build something that’s highly available, you’re always going to opt to put your applications and data in at least two physically separated regions. Otherwise, if a region goes down, your app goes down, too. All of the big clouds also offer a concept called ‘availability zones’ in their regions to offer developers the option to host their applications in two separate data centers in the same region for a bit of extra resilience. All big clouds, that is, except for Azure, which is only launching its availability zones feature into general availability today after first announcing a beta last September.

Ahead of today’s launch, Julia White, Microsoft’s corporate VP for Azure, told me that the company’s design philosophy behind its data center network was always about servicing commercial customers with the widest possible range of regions to allow them to be close to their customers and to comply with local data sovereignty and privacy laws. That’s one of the reasons why Azure today offers more regions than any of its competitors, with 38 generally available regions and 12 announced ones.

“Microsoft started its infrastructure approach focused on enterprise organizations and built lots of regions because of that,” White said. “We didn’t pick this regional approach because it’s easy or because it’s simple, but because we believe this is what our customers really want.”

Every availability zone has its own network connection and power backup, so if one zone in a region goes down, the others should remain unaffected. A regional disaster could shut down all of the zones in a single region, though, so most businesses will surely want to keep their data in at least one additional region.


Asana introduces Timeline, lays groundwork for AI-based monitoring as the “team brain” for productivity

When workflow management platform Asana announced a $75 million round of funding in January led by former Vice President Al Gore’s Generation Investment Management, the startup didn’t give much of an indication of what it planned to do with the money, or what it was that won over investors to a new $900 million valuation (a figure we’ve now confirmed with the company).

Now, Asana is taking off the wraps on the next phase of its strategy. This week, the company announced a new feature it’s calling Timeline — composite, visual, and interactive maps of the various projects assigned to different people within a team, giving the group a wider view of all the work that needs to be completed, and how the projects fit together, mapped out in a timeline format.

Timeline is a new premium feature: Asana’s 35,000 paying customers will be able to access it at no extra charge. Those among Asana’s millions of free users will have to upgrade to the premium tier to access it.

Timeline is intended for scenarios like product launches, marketing campaigns and event planning. It isn’t a new piece of software that requires duplicating work: each project automatically becomes a new segment on a team’s Timeline. Viewing projects through the Timeline allows users to identify whether different segments overlap and adjust them accordingly.

Perhaps one of the most interesting aspects of the Timeline, however, is that it’s the first instalment of a bigger strategy that Asana plans to tackle over the next year to supercharge and evolve its service, making it the go-to platform for helping keep you focused on work, when you’re at work.

While Asana started out as a place where people go to manage the progress of projects, its ambition going forward is to become a platform that, with a machine-learning engine at the back end, will aim to manage a team’s and a company’s wider productivity and workload, regardless of whether they are actively in the Asana app or not.

“The long term vision is to marry computer intelligence with human intelligence to run entire companies,” Asana co-founder Justin Rosenstein said in an interview. “This is the vision that got investors excited.”

The bigger product — the name has not been revealed — will include a number of different features. Some that Rosenstein has let me see in preview include the ability for people to have conversations about specific projects — think messaging channels but less dynamic and more contained. And it seems that Asana also has designs to move into the area of employee monitoring: it has also been working on a widget of sorts that installs on your computer and watches you work, with the aim of making you more efficient.

“Asana becomes a team brain to keep everyone focused,” said Rosenstein.

Given that Asana’s two co-founders, Dustin Moskovitz and Rosenstein, previously had close ties to Facebook — Moskovitz as a co-founder and Rosenstein as its early engineering lead — you might wonder if Timeline and the rest of its new company productivity engine might be bringing more social elements to the table (or desk, as the case may be).

In fact, it’s quite the opposite.

Rosenstein may have to his credit the creation of the “like” button and other iconic parts of the world’s biggest social network, but he has in more recent times become a very outspoken critic of the distracting effects of services like Facebook’s. It’s part of a bigger trend hitting Silicon Valley, where a number of leading players have, in a wave of mea culpa, turned against some of the bigger innovations particularly in social media.

Some have even clubbed together to form a new organization called the Center for Humane Technology, whose motto is “Reversing the digital attention crisis and realigning technology with humanity’s best interests.” Rosenstein is an advisor, although when I tried to raise the issue of the backlash that has hit Facebook on multiple fronts, he responded pretty flatly, “It’s not something I want to talk about right now.” (That’s what keeping focussed is all about, I guess.)

Asana, essentially, is taking the belief that social can become counterproductive when you have to get something done, and applying it to the enterprise environment.

This is an interesting twist, given that one of the bigger themes in enterprise IT over the last several years has been how to turn business apps and software more “social” — tapping into some of the mechanics and popularity of social networking to encourage employees to collaborate and communicate more with each other even when (as is often the case) they are not in the same physical space.

But social working might not be for everyone, all the time. Slack, the wildly popular workplace chat platform that interconnects users with each other and just about every enterprise and business app, is notable for producing “a gazillion notifications”, in Rosenstein’s words, leading to distraction from actually getting things done. “I’m not saying services like Slack can’t be useful,” he explained. (Slack is also an integration partner of Asana’s.) “But companies are realising that, to collaborate effectively, they need more than communication. They need content and work management. I think that Slack has a lot of useful purposes but I don’t know if all of it is good all the time.”

The “team brain” role that Asana envisions may be all about boosting productivity by learning about you and reducing distraction — you will get alerts, but you (and presumably the brain) prioritise which ones you get, if any at all — but interestingly it has kept another feature characteristic of a lot of social networking services: amassing data about your activities and using that to optimise engagement. As Rosenstein described it, Asana will soon be able to track what you are working on, and how you work on it, to figure out your working patterns.

The idea is that, by using machine learning algorithms, you can learn what a person does quickly, and what might take longer, to help plan that person’s tasks better, and ultimately make that person more productive. Eventually, the system will be able to suggest to you what you should be working on and when.

All of that might sound like music to managers’ ears, but for some, employee monitoring programs are alarming for how closely they track your every move. Given the recent wave of attention that social media services have drawn for all the data they collect, it will be interesting to see how enterprise services like this are adopted and viewed. It’s also not at all clear how these sorts of programs will square with new directives like GDPR in Europe, which puts in place new rules for how any provider of an internet service must inform users of how their data is used, and requires that any data collection have a clear business purpose.

Still, with clearly a different aim in mind — helping you work better — the end could justify the means for some, not just for bosses, but for people who might feel overwhelmed with what is on their work plate every day. “When you come in in the morning, you might have a list of [many things] to do today,” Rosenstein said. “We take over your desktop to show the one thing you need to do.”


IoT devices could be next customer data frontier

At the Adobe Summit this week in Las Vegas, the company introduced what could be the ultimate customer experience construct, a customer experience system of record that pulls in information, not just from Adobe tools, but wherever it lives. In many ways it marked a new period in the notion of customer experience management, putting it front and center of the marketing strategy.

Adobe was not alone, of course. Salesforce, with its three-headed monster of sales, marketing and service clouds, has been circling a similar idea. In fact, it spent $6.5 billion last week to buy MuleSoft to act as a data integration layer for accessing customer information from across the enterprise software stack, whether on prem or in the cloud, inside or outside of Salesforce. And it announced the Salesforce Integration Cloud this week to make use of its newest company.

As data collection takes center stage, we actually could be on the edge of yet another data revolution, one that could be more profound than even the web and mobile were before it. That is…the Internet of Things.

Here comes IoT

There are three main pieces to that IoT revolution at the moment from a consumer perspective. First of all, there are smart speakers like the Amazon Echo or Google Home. These provide a way for humans to interact verbally with machines, a notion that is only now possible through the marriage of all this data, sheer (and cheap) compute power and the AI algorithms that fuel all of it.

Next, we have the idea of the connected car, as distinct from the self-driving car. Much like the smart speaker, humans can interact with the car to find directions and recommendations, and that interaction leaves a data trail in its wake. Finally, we have sensors like iBeacons sitting in stores, providing retailers with a world of information about a customer’s journey through the store — what they like or don’t like, what they pick up, what they try on and so forth.

There are very likely a host of other categories too. All of this information is data that needs to be processed and understood just like any other signal coming from customers, but it also has unique characteristics around volume and velocity — it is truly big data, with all of the issues inherent in processing data at that scale.

This means it needs to be ingested, digested and incorporated into that central customer record-keeping system to drive the content and experiences you need to create to keep your customers happy — or so the marketing software companies tell us, at least. (We also need to consider the privacy implications of such a record, but that is the subject for another article.)

Building a better relationship

Regardless of the vendor, all of this is about understanding the customer better, through a central data-gathering system, in the hope of giving people exactly what they want. We are no longer a generic mass of consumers. We are instead individuals with different needs, desires and requirements, and the best way to please us, they say, is to understand us so well that the brand can deliver the perfect experience at exactly the right moment.

Photo: Ron Miller

That involves listening to the digital signals we give off without even thinking about it. We carry mobile, connected computers in our pockets and they send out a variety of information about our whereabouts and what we are doing. Social media acts as a broadcast system that brands can tap into to better understand us (or so the story goes).

Part of what Adobe, Salesforce and others can deliver is a way to gather that information, pull it together into this uber record-keeping system and apply a level of machine learning and intelligence to help further the brand’s ultimate goal of serving a customer of one and delivering an efficient (and perhaps even pleasurable) experience.

Getting on board

At an Adobe Summit session this week on IoT (which I moderated), the audience was polled a couple of times. In one show of hands, they were asked how many owned a smart speaker and about three quarters indicated they owned at least one, but when asked how many were developing applications for these same devices only a handful of hands went up. This was in a room full of marketers, mind you.

Photo: Ron Miller

That suggests that there is a disconnect between usage and tools to take advantage of them. The same could be said for the other IoT data sources, the car and sensor tech, or any other connected consumer device. Just as we created a set of tools to capture and understand the data coming from mobile apps and the web, we need to create the same thing for all of these IoT sources.

That means coming up with creative ways to take advantage of another interaction (and data collection) point. This is an entirely new frontier with all of the opportunity involved in that, and that suggests startups and established companies alike need to be thinking about solutions to help companies do just that.


Analyze MySQL Audit Logs with ClickHouse and ClickTail

In this blog post, I’ll look at how you can analyze MySQL audit logs (Percona Server for MySQL) with ClickHouse and ClickTail.

Audit logs are available as a free plugin for Percona Server for MySQL. Besides providing insights about activity on your server, you might also need the logs for compliance purposes.

However, on an active server, the logs can get very large. Under a sysbench-tpcc workload, for example, I was able to generate 24GB worth of logs just within one hour.

So we are going to use the ClickTail tool, which Peter Zaitsev mentioned in Analyze Your Raw MySQL Query Logs with ClickHouse and the Altinity team describes in the ClickTail Introduction.

ClickTail extracts all fields available in Percona Server for MySQL’s audit log in JSON format, as you can see in its schema. I used the command:

clicktail --dataset='clicktail.mysql_audit_log' --parser=mysqlaudit --file=/mnt/nvmi/mysql/audit.log --backfill

In my setup, ClickTail imported records at the rate of 1.5 to 2 million records/minute. Once we have ClickTail set up, we can do some work on the audit logs. Below are some examples of queries.

Check if some queries were run with errors:

SELECT
    status AS c1,
    count(*) AS c2
FROM mysql_audit_log
GROUP BY c1

┌───c1─┬───────c2─┐
│    0 │ 46197504 │
│ 1160 │        1 │
│ 1193 │     1274 │
│ 1064 │     5096 │
└──────┴──────────┘

4 rows in set. Elapsed: 0.018 sec. Processed 46.20 million rows, 184.82 MB (2.51 billion rows/s., 10.03 GB/s.)

First, it is very impressive to see a scan rate of 2.5 billion rows/s. And second, there really are some queries with non-zero (error) statuses.

We can dig in and check what exactly caused a 1193 error (MySQL Error Code 1193: Unknown system variable):

SELECT *
FROM mysql_audit_log
WHERE status = 1193

│ 2018-03-12 20:34:49 │ 2018-03-12 │   0 │ select │ 1097 │ │ localhost │ │ Query │ │ │ │ │ │ │ 39782055_2018-03-12T20:21:21 │ SELECT @@query_response_time_stats │ 1193 │ root[root] @ localhost [] │ │

So this was

SELECT @@query_response_time_stats

which I believe comes from the Percona Monitoring and Management (PMM) MySQL Metrics exporter.

Similarly, we can check what query types were run on MySQL:

SELECT
    command_class,
    count(*)
FROM mysql_audit_log
GROUP BY command_class

┌─command_class────────┬─count()──┐
│                      │    15882 │
│ show_storage_engines │     1274 │
│ select               │ 26944474 │
│ error                │     5096 │
│ show_slave_status    │     1274 │
│ begin                │  1242555 │
│ update               │  9163866 │
│ show_tables          │      204 │
│ show_status          │     6366 │
│ insert_select        │      170 │
│ delete               │   539058 │
│ commit               │  1237074 │
│ create_db            │        2 │
│ show_engine_status   │     1274 │
│ show_variables       │      450 │
│ set_option           │     8102 │
│ create_table         │      180 │
│ rollback             │     5394 │
│ create_index         │      120 │
│ insert               │  7031060 │
└──────────────────────┴──────────┘

20 rows in set. Elapsed: 0.120 sec. Processed 46.20 million rows, 691.84 MB (385.17 million rows/s., 5.77 GB/s.)

There are more fields available, like:

db String,
host String,
ip String,

to understand who accessed a MySQL instance, and from where.
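For example, a hypothetical aggregation over those fields (a sketch only: the column names come from the ClickTail schema above, while the ORDER BY and LIMIT are my own additions):

SELECT db, host, ip, count(*) AS queries
FROM mysql_audit_log
GROUP BY db, host, ip
ORDER BY queries DESC
LIMIT 10

This quickly surfaces the most active database/host/IP combinations during the audited period.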

If you ever need to do some advanced work with MySQL audit logs, consider doing it with ClickHouse and ClickTail!

The post Analyze MySQL Audit Logs with ClickHouse and ClickTail appeared first on Percona Database Performance Blog.


Using ProxySQL and VIRTUAL Columns to Solve ORM Issues

In this blog post, we’ll look at using ProxySQL and VIRTUAL columns to solve ORM issues.

There are a lot of web frameworks all around. Programmers and web designers are using them to develop and deploy any website and web application. Just to cite some of the most famous names: Drupal, Ruby on Rails, Symfony, etc.

Web frameworks are very useful tools. But sometimes, as with many human artifacts, they have issues. Every framework ships its own queries to manage its internal tables. There is nothing wrong with that, but it often means these queries are not optimized.

Here is my case with Symfony 2 on MySQL 5.7, and how I solved it.

The sessions table issue

Symfony has a table to manage session data for users on the application. The table is defined as follows:

CREATE TABLE `sessions` (
 `sess_id` varchar(126) COLLATE utf8_bin NOT NULL,
 `sess_data` blob NOT NULL,
 `sess_time` int(10) unsigned NOT NULL,
 `sess_lifetime` mediumint(9) NOT NULL,
 PRIMARY KEY (`sess_id`)
);

The expiration time of the user session is configurable. The developers decided to configure it to be one month.

Symfony was serving a high traffic website, and very soon that table became very big. After one month, I saw it had more than 14 million rows and was more than 3GB in size.

mysql> SELECT table_schema, table_name, engine, table_rows, data_length
    -> FROM information_schema.tables WHERE table_schema='symfony' AND table_name='sessions'\G
*************************** 1. row ***************************
  TABLE_SCHEMA: symfony
    TABLE_NAME: sessions
        ENGINE: InnoDB
    TABLE_ROWS: 14272158
   DATA_LENGTH: 3306140672

Developers noticed the web application sometimes stalling for a few seconds. First, I analyzed the slow queries on MySQL and I discovered that sometimes Symfony deletes inactive sessions. It issued the following query, which took several seconds to complete. This query was the cause of the stalls in the application:

DELETE FROM sessions WHERE sess_lifetime + sess_time < 1521025847

The query is not optimized. Let’s have a look at the EXPLAIN:

mysql> EXPLAIN DELETE FROM sessions WHERE sess_lifetime + sess_time < 1521025847\G
*************************** 1. row ***************************
           id: 1
  select_type: DELETE
        table: sessions
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 14272312
     filtered: 100.00
        Extra: Using where

Every DELETE query was a full table scan of more than 14 million rows. So, let’s try to improve it.

First workaround

Looking around on the web and discussing it with colleagues, we’ve found some workarounds. But none of them was the definitive solution:

  1. Reduce expiration time in Symfony configuration. Good idea; one month is probably too long for a high-traffic website. However, we had to keep the expiration time at one month because of an internal business policy, and even one week wouldn’t have avoided the full table scan.
  2. Using a different database solution. Redis was proposed as an alternative to MySQL to manage session data. This might be a good solution, but it could involve a long deployment time. We planned a test, but the sysadmins suggested it was not a good solution to have another database system for such a simple task.
  3. Patching Symfony code. It was proposed to rewrite the query directly into the Symfony code. Discarded.
  4. Create indexes. It was proposed to create indexes on the sess_time and sess_lifetime columns. The indexes wouldn’t get used because of the arithmetic addition in the WHERE clause, which is the only condition on the query.

So, what do we do if everything must remain the same? Same configuration, same environment, same query issued and no indexes added?

Query optimization using a virtual column

I focused on how to optimize the query. Since I was using 5.7, I thought about a generated virtual column. I decided to add a virtual column in the sessions table, defined as sess_time+sess_lifetime (the same as the condition of the query):

mysql> ALTER TABLE sessions
ADD COLUMN `sess_delete` INT UNSIGNED GENERATED ALWAYS AS ((`sess_time` + `sess_lifetime`)) VIRTUAL;

Any virtual column can have an index on it. So, I created the index:

mysql> ALTER TABLE sessions ADD INDEX(sess_delete);

Note: I first checked that the INSERT queries issued by Symfony were written with an explicit list of the fields to insert, to make sure this modification wouldn’t cause more issues. Making a schema change on a table that is in use by a framework, where the queries against the table are generally outside of your control, can be a daunting task.

So, let’s EXPLAIN the query rewritten as follows, with the condition directly on the generated indexed column:

mysql> EXPLAIN DELETE FROM sessions WHERE sess_delete < 1521025847\G
*************************** 1. row ***************************
           id: 1
  select_type: DELETE
        table: sessions
         type: range
possible_keys: sess_delete
          key: sess_delete
      key_len: 5
          ref: const
         rows: 6435
     filtered: 100.00
        Extra: Using where

The query can now use the index, and the number of rows examined is exactly the number of sessions we have to delete.

So far, so good. But will Symfony execute that query if we don’t want to modify the source code?

Using ProxySQL to rewrite the query

Fortunately, we already had ProxySQL up and running in our environment. We were using it just to manage the master MySQL failover.

One of the very useful features of ProxySQL is its ability to rewrite any query it receives into another one, based on rules you can define. These range from very simple rules, like changing the name of a field, to very complex rewrites that use a chain of rules; it depends on how complex a translation you have to do. In our case, we just needed to translate sess_time + sess_lifetime into sess_delete. The rest of the query was the same, so we only needed to define a very simple rule.

Let’s see how to create the rewrite rules.

Connect to the proxy:

mysql -u admin -psecretpwd -h -P6032 --prompt='Admin> '

Define the rewrite rule by inserting a record into the mysql_query_rules table:

Admin> INSERT INTO mysql_query_rules(rule_id,active,flagIN,match_pattern,negate_match_pattern,re_modifiers,replace_pattern,destination_hostgroup,apply)
 -> VALUES (
 -> 1,
 -> 1,
 -> 0,
 -> '^DELETE FROM sessions WHERE sess_lifetime \+ sess_time < (.*)',
 -> 0,
 -> 'CASELESS',
 -> 'DELETE FROM sessions WHERE sess_delete < \1',
 -> 0,
 -> 1);

The two fields I want to focus on are:

  • match_pattern: it defines the query to be matched, using regular expression notation. The + symbol must be escaped with \ because it’s a special character in regular expressions
  • replace_pattern: it defines how to rewrite the matched query. \1 is the value of the parameter matched by match_pattern in (.*)
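Before loading a rule like this, it can help to sanity-check the pattern and the backreference outside ProxySQL. Here is a quick sketch with sed using an equivalent extended regex (only an illustration of the regex mechanics, not how ProxySQL itself evaluates rules):

```shell
# Simulate the rewrite: match the ORM's DELETE and substitute the indexed
# virtual column, carrying the captured timestamp through via \1.
echo "DELETE FROM sessions WHERE sess_lifetime + sess_time < 1521025847" \
  | sed -E 's/^DELETE FROM sessions WHERE sess_lifetime \+ sess_time < (.*)/DELETE FROM sessions WHERE sess_delete < \1/'
# prints: DELETE FROM sessions WHERE sess_delete < 1521025847
```

If the substituted query comes out wrong here, the ProxySQL rule almost certainly needs the same fix.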

For the meaning of the other fields, have a look at the ProxySQL documentation.

Once created, we have to save the rule to disk and put it on runtime to let it run effectively.


After that, the proxy began to filter the query and rewrite it to have a better execution plan using the index on the virtual column.

Note: pay attention when you need to upgrade the framework. If it needs to rebuild the database tables, you will lose the virtual column you’ve created. Just remember to recreate it and check it after the upgrade.


Developers love web frameworks because they are very powerful at simplifying the development and deployment of complex web applications. But for DBAs, their internal queries can sometimes cause a bit of a headache, because they are not well optimized or were never meant to run against your “huge” database. I solved my case using ProxySQL and VIRTUAL columns, with minimal impact on the architecture of the system and without any source code patching.

Take this post as a tip in case you face similar issues with your application framework.

The post Using ProxySQL and VIRTUAL Columns to Solve ORM Issues appeared first on Percona Database Performance Blog.
