MongoDB 4.4 Performance Regression: Overwhelmed by Memory

There is a special class of database bugs where a system starts to perform worse when given more resources. Two examples of such bugs in MySQL:

Bug #15815 – InnoDB on an 8-CPU system performed worse than on a 4-CPU system as concurrency increased.

Bug #29847 – A bug similar to what I will describe today: when given more memory (innodb_buffer_pool_size), the MySQL crash recovery process took longer than with less memory. We described it in our blog post Innodb Recovery – Is a large buffer pool always better?

It seems InnoDB flushing was not optimal with a large amount of memory at the time, and I think something similar is happening with MongoDB 4.4 in the scenario I will describe.

MongoDB 4.4 Load Data Procedures

While preparing data for my benchmark (Percona Server for MongoDB 4.2 vs 4.4 in Python TPCC Benchmark), I also measured how long it takes to load 1000 warehouses (about 165GB of data in MongoDB). To get repeatable numbers, I usually repeat the procedure multiple times.

What I noticed is that MongoDB 4.4 shows interesting behavior when it starts with an unlimited cache (that is, on a server with 250GB of RAM, it will allocate 125GB for the WiredTiger cache, and the rest can be used for the OS cache).
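The ~125GB figure matches WiredTiger's documented default cache size: 50% of (RAM minus 1 GB), with a 256 MB floor. A minimal sketch of that sizing rule (the helper name is mine, not MongoDB's):

```python
def default_wiredtiger_cache_gb(ram_gb: float) -> float:
    """Default WiredTiger cache size: max(50% of (RAM - 1 GB), 256 MB)."""
    return max(0.5 * (ram_gb - 1.0), 0.256)

# A 250GB server gets roughly half its RAM for the WiredTiger cache.
print(default_wiredtiger_cache_gb(250))  # -> 124.5
```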

Let me describe my load procedure, which is quite simple:

  1. Load data into database TPCC1
  2. Sleep 10 mins
  3. Load data into database TPCC3

That is, the second time we load into a different database; in the background, pages for database TPCC1 should be flushed and evicted.

Before jumping to the problem I see, let’s check the numbers for MongoDB 4.2, meaning how long it takes to accomplish Step 1 and Step 3.

MongoDB 4.2 with limited memory (WiredTiger cache 25GB):

Step 1: 20 min

Step 3: 21 min

MongoDB 4.4 with limited memory (WiredTiger cache 25GB):

Step 1: 24 min

Step 3: 26 min

MongoDB 4.2 with 125GB WiredTiger cache:

Step 1: 18 min

Step 3: 19 min


And now to the problem I see:

MongoDB 4.4 with 125GB WiredTiger cache:

Step 1: 19 min

Step 3: 497 min

Notice that Step 3 takes 497 minutes (over eight hours) instead of the usual ~20 min in all previous cases, and this is when MongoDB has 125GB of WiredTiger cache.

What’s interesting is that I do not see this issue when I limit the WiredTiger cache to 25GB, and also this problem does not exist in MongoDB 4.2.
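Summarizing the four runs above, a quick sketch (times in minutes, copied from the text) of how much longer Step 3 takes than Step 1 in each configuration:

```python
# Load times in minutes: (step1, step3), copied from the results above.
results = {
    "4.2 / 25GB cache":  (20, 21),
    "4.4 / 25GB cache":  (24, 26),
    "4.2 / 125GB cache": (18, 19),
    "4.4 / 125GB cache": (19, 497),
}

for config, (step1, step3) in results.items():
    print(f"{config}: step3/step1 = {step3 / step1:.1f}x")
# Only 4.4 with the large cache degrades: 497/19 is roughly a 26x slowdown.
```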

That’s why I think MongoDB 4.4 starts to behave differently when it is given a lot of memory for the WiredTiger cache. Exactly why this happens I do not know yet, and I will continue to profile this case. A quick look suggests it may be related to the WiredTiger eviction process (similar to the InnoDB flushing problem in the crash recovery process) and to replication flow control, which was introduced in MongoDB 4.2 to keep replicas in sync (though I do not use replicas in this test).

Learn more about the history of Oracle, the growth of MongoDB, and what really qualifies software as open source. If you are a DBA, or an executive looking to adopt or renew with MongoDB, this is a must-read!

Download “Is MongoDB the New Oracle?”


Pure Storage acquires data service platform Portworx for $370M

Pure Storage, the public enterprise data storage company, today announced that it has acquired Portworx, a well-funded startup that provides a cloud-native storage and data-management platform based on Kubernetes, for $370 million in cash. This marks Pure Storage’s largest acquisition to date and shows how important this market for multicloud data services has become.

Current Portworx enterprise customers include the likes of Carrefour, Comcast, GE Digital, Kroger, Lufthansa, and T-Mobile. At the core of the service is its ability to help users migrate their data and create backups. It creates a storage layer that allows developers to then access that data, no matter where it resides.

Pure Storage will use Portworx’s technology to expand its hybrid and multicloud services and provide Kubernetes-based data services across clouds.


“I’m tremendously proud of what we’ve built at Portworx: An unparalleled data services platform for customers running mission-critical applications in hybrid and multicloud environments,” said Portworx CEO Murli Thirumale. “The traction and growth we see in our business daily shows that containers and Kubernetes are fundamental to the next-generation application architecture and thus competitiveness. We are excited for the accelerated growth and customer impact we will be able to achieve as a part of Pure.”

When the company raised its Series C round last year, Thirumale told me that Portworx had expanded its customer base by over 100% and that its bookings increased by 376% from 2018 to 2019.

“As forward-thinking enterprises adopt cloud-native strategies to advance their business, we are thrilled to have the Portworx team and their groundbreaking technology joining us at Pure to expand our success in delivering multicloud data services for Kubernetes,” said Charles Giancarlo, chairman and CEO of Pure Storage. “This acquisition marks a significant milestone in expanding our Modern Data Experience to cover traditional and cloud native applications alike.”


User-generated e-learning site Kahoot acquires Actimo for up to $33M to double down on corporate sector

Norwegian company Kahoot originally made its name with a platform that lets educators and students create and share game-based online learning lessons, in the process building up a huge public catalogue of gamified lessons created by its community. Today the startup — now valued at more than $2 billion — is announcing an acquisition to give a boost to another segment of its business: corporate customers.

Kahoot has acquired Danish startup Actimo, which provides a platform for businesses to train and engage with employees. Kahoot said that the purchase is being made with a combination of cash and shares, and works out to a total enterprise value of between $26 million and $33 million for the smaller company, with the sale expected to be completed in October 2020.

It may sound like a modest sum in a tech market where companies currently and regularly see paper valuations in the hundreds of millions at the Series A stage, but it also represents a different kind of trajectory for founders and their investors alike.

This is actually a strong exit for Actimo, which had raised less than $500,000, according to data from PitchBook. And it puts Actimo under the wing of a company that has been scaling globally fast, finding — like others in the areas of online education and remote working — that the current state of social distancing due to COVID-19 is resulting in a boost to its business.

To give you an idea of the scale and growth of Kahoot, the company says that currently it has over 1 billion “participating players,” on top of some 4.4 billion users in aggregate since first launching the platform in 2013. In the last 12 months, some 200 million games have been played on its platform. In June, when Kahoot announced that it had raised $28 million in funding, it told us that 100 million games had been played.

In light of its growth and the future opportunity — even putting aside the progression of the coronavirus, it looks like remote work and remote learning will at least become a lot more common as a longer-term option — the company has also seen a rise in its valuation. With some of its shares traded on the Merkur Market in Norway, the company currently has a market cap of 18.716 billion Norwegian Krone, which at today’s rates is about $2.08 billion. That figure was $1.4 billion in June.

Kahoot’s targeting of the corporate sector is not new. The company has been building a business in this space for years. It says that in the last 12 months, it logged 2 million sessions across 20 million participating “players” of its corporate training “games,” with some 97% of the Fortune 500 among those users. Customers include the likes of Facebook (for sales training), Oyo (hospitality training and onboarding) and Qualys (for taking polls during a conference), among others.

Critically, while a lot of Kahoot’s audience is in education, most of its revenue comes from the corporate segment, which is one reason why it’s keen to grow that side of the business with more services and users.

The aim with Actimo, Kahoot says, is to build out a product set aimed at helping organisations with company culture — which, with many organisations now going on eight months and counting of entire teams working regularly outside of their physical offices, has grown as a priority.

Keeping a team feeling like a team, and an individual feeling more than a transactional regard for an employer, is not a simple thing in the best of times. Now, as we continue to work physically away from each other, it will take even more tools and efforts to get the balance right.

In that context, Actimo’s solution is just one aspect, but potentially an interesting one: it has built a platform where employees can track the training that they have done or need to do, engage with other co-workers, and provide feedback, and employers can use it to generally track and encourage how employees are engaging across the company and its various efforts. It counts some 200 enterprises, including Circle K, Hi3G and Compass Group, among its customers, and has current ARR of $5 million.

For comparison, Kahoot, in its Q2 financials published in August, reported ARR of $25 million, with invoiced revenue for the quarter at $9.6 million, growing some 317% on the same quarter a year before. The company has also raised some $110 million in private funding from the likes of Microsoft and Disney.

As Kahoot looks to find more than just a transient place in a company’s IT and software fabric — transience of attention always being a risk with anything gaming-based — it makes a lot of sense to pick up Actimo and work on ways of coupling the platform with its other corporate work. You can also imagine a time when it might create a similar kind of dashboard for the educational sector.

“We are excited to welcome the Actimo team to be part of the fast-growing Kahoot! family,” said Kahoot CEO, Eilert Hanoa, in a statement. “This acquisition will further extend Kahoot!’s corporate learning offerings, by providing solutions tailored for the frontline segment, as well as to solidify company culture and engagement among remote and distributed teams in companies of all types and sizes. This continues our expressed ambition to also grow through M&A by adding strategic capabilities that we can leverage across our global platform.”

“We are thrilled to join forces with Kahoot! in our mission to develop next-level solutions that connect remote employees and boost employee engagement and productivity,” said Eske Gunge, CEO at Actimo, in a statement. “Being part of Kahoot! and with our experience from working with innovative and ambitious enterprises across industries, we can together set a new standard for corporate learning and engagement.”


Dropbox CEO Drew Houston says the pandemic forced the company to reevaluate what work means

Dropbox CEO and co-founder Drew Houston, appearing at TechCrunch Disrupt today, said that COVID has accelerated a shift to distributed work that we have been talking about for some time, and these new ways of working will not simply go away when the pandemic is over.

“When you think more broadly about the effects of the shift to distributed work, it will be felt well beyond when we go back to the office. So we’ve gone through a one-way door. This is maybe one of the biggest changes to knowledge work since that term was invented in 1959,” Houston told TechCrunch Editor-In-Chief Matthew Panzarino.

That change has prompted Dropbox to completely rethink the product set over the last six months, as the company has watched the way people work change in such a dramatic way. He said even though Dropbox is a cloud service, no SaaS tool in his view was purpose-built for this new way of working and we have to reevaluate what work means in this new context.

“Back in March we started thinking about this, and how [the rapid shift to distributed work] just kind of happened. It wasn’t really designed. What if you did design it? How would you design this experience to be really great? And so starting in March we reoriented our whole product road map around distributed work,” he said.

He also broadly hinted that the fruits of that redesign are coming down the pike. “We’ll have a lot more to share about our upcoming launches in the future,” he said.

Houston said that his company has adjusted well to working from home, but when they had to shut down the office, he was in the same boat as every other CEO when it came to running his company during a pandemic. Nobody had a blueprint on what to do.

“When it first happened, I mean there’s no playbook for running a company during a global pandemic so you have to start with making sure you’re taking care of your customers, taking care of your employees, I mean there’s so many people whose lives have been turned upside down in so many ways,” he said.

But as he checked in on the customers, he saw them asking for new workflows and ways of working, and he recognized there could be an opportunity to design tools to meet these needs.

“I mean this transition was about as abrupt and dramatic and unplanned as you can possibly imagine, and being able to kind of shape it and be intentional is a huge opportunity,” Houston said.

Houston debuted Dropbox in 2008 at the precursor to TechCrunch Disrupt, then called the TechCrunch 50. He mentioned that the Wi-Fi went out during his demo, proving the hazards of live demos, but offered words of encouragement to this week’s TechCrunch Disrupt Battlefield participants.

Although Dropbox is now a public company on a $1.8 billion run rate, Houston went through all the stages of a startup, from raising funding to eventually going public, and even today, as a mature public company, Dropbox is still evolving and changing as it adapts to shifting requirements in the marketplace.


Fiverr Business helps teams manage freelance projects

Freelance marketplace Fiverr launched a new service today designed to help teams at larger companies manage their work with freelancers.

CEO Micha Kaufman told me via email that Fiverr had already begun working with larger clients, but that Fiverr Business is better-designed to meet their needs.

“Organizations require tools to manage their team accounts, defining projects, assigning budgets, tracking progress and collaborating internally,” Kaufman wrote. “Fiverr Business provides all of that and much more, including exclusive access to Fiverr’s personal executive assistants which are always available to Fiverr Business customers to help with administrative account tasks, general project management, talent matching, and more.”

He also suggested that with the pandemic forcing companies to adopt remote work and placing pressure on their bottom lines, many of them are increasingly turning to freelancers, and he claimed, “2020 marks the beginning of a decade where businesses will invest and learn how to truly integrate freelancers into their workflows.”


Fiverr Group Product Manager Meidad Hinkis walked me through the new service, showing me how users can create projects, assign team members and set freelance budgets, then actually hire freelancers, as well as offer internal and external feedback on the work that comes in.

He also noted there’s a special pool of curated freelancers available through Fiverr Business, and like Kaufman, emphasized that customers will also have access to assistants to help them find freelancers and manage projects. (On the freelancer side, payments and the rest of the experience should be pretty similar.)

On top of the freelancer fees, Fiverr Business will cost $149 per year for teams of up to 50 users, and Hinkis said the company is offering the first year for free.

“We so believe in product and the direction that we want people to get real value before they decide,” he said.


In 2020, Warsaw’s startup ecosystem is ‘a place to observe carefully’

If you listed the trends that have captured the attention of 20 Warsaw-focused investors who replied to our recent surveys, automation/AI, enterprise SaaS, cleantech, health, remote work and the sharing economy would top the list. These VCs said they are seeking opportunities in the “digital twin” space, proptech and expanded blockchain tokenization inside industries.

Investors in Central and Eastern Europe are generally looking for the same things as VCs based elsewhere: startups that have a unique value proposition, capital efficiency, motivated teams, post-revenue and a well-defined market niche.

Out of the cohort we interviewed, several told us that COVID-19 had not yet substantially transformed how they do business. As Michał Papuga, a partner at Flashpoint VC put it, “the situation since March hasn’t changed a lot, but we went from extreme panic to extreme bullishness. Neither of these is good and I would recommend to stick to the long-term goals and not to be pressured.”

Said Pawel Lipkowski of RBL_VC, “Warsaw is at its pivotal point — think Berlin in the ‘90s. It’s a place to observe carefully.”


For the conclusion, we spoke to the following investors:

Karol Szubstarski, partner, OTB Ventures

What trends are you most excited about investing in, generally?
Gradual shift of enterprises toward increased use of automation and AI, that enables dramatic improvement of efficiency, cost reduction and transfer of enterprise resources from tedious, repeatable and mundane tasks to more exciting, value added opportunities.

What’s your latest, most exciting investment?
One of the most exciting opportunities is ICEYE. The company is a leader and first mover in synthetic-aperture radar (SAR) technology for microsatellites. It is building and operating its own commercial constellation of SAR microsatellites capable of providing satellite imagery regardless of the cloud cover, weather conditions and time of the day and night (comparable resolution to traditional SAR satellites with 100x lower cost factor), which is disrupting the multibillion dollar satellite imagery market.

Are there startups that you wish you would see in the industry but don’t? What are some overlooked opportunities right now?
I would love to see more startups in the digital twin space; technology that enables creation of an exact digital replica/copy of something in physical space — a product, process or even the whole ecosystem. This kind of solution enables experiments and [the implementation of] changes that otherwise could be extremely costly or risky – it can provide immense value added for customers.

What are you looking for in your next investment, in general?
A company with unique value proposition to its customers, deep tech component that provides competitive edge over other players in the market and a founder with global vision and focus on execution of that vision.

Which areas are either oversaturated or would be too hard to compete in at this point for a new startup? What other types of products/services are you wary or concerned about?
No market/sector is too saturated and has no room for innovation. Some markets seem to be more challenging than others due to immense competitive landscape (e.g., food delivery, language-learning apps) but still can be the subject of disruption due to a unique value proposition of a new entrant.

How much are you focused on investing in your local ecosystem versus other startup hubs (or everywhere) in general? More than 50%? Less?
OTB is focused on opportunities with links to Central Eastern European talent (with no bias toward any hub in the region), meaning companies that leverage local engineering/entrepreneurial talent in order to build world-class products to compete globally (usually HQ outside CEE).

Which industries in your city and region seem well-positioned to thrive, or not, long term? What are companies you are excited about (your portfolio or not), which founders?
CEE region is recognized for its sizable and highly skilled talent pool in the fields of engineering and software development. The region is well-positioned to build up solutions that leverage deep, unique tech regardless of vertical (especially B2B). Historically, the region was especially strong in AI/ML, voice/speech/NLP technologies, cybersecurity, data analytics, etc.

How should investors in other cities think about the overall investment climate and opportunities in your city?
CEE (including Poland and Warsaw) has always been recognized as an exceptionally strong region in terms of engineering/IT talent. The inherent risk aversion of entrepreneurs drove, for a number of years, a more “copycat”/local-market approach, while holding back more ambitious, deep tech opportunities. In recent years we are witnessing a paradigm shift, with a new generation of entrepreneurs tackling problems with unique, deep tech solutions, putting emphasis on global expansion and bypassing shallow local markets. As such, the quality of deals has been steadily growing and currently reflects top quality on a global scale, especially on the tech level. The CEE market also shows a growing total number of startups, mostly driven by an abundance of early-stage capital and success stories in the region (e.g., DataRobot, Bolt, UiPath) that are successfully evangelizing entrepreneurship among corporates/engineers.

Do you expect to see a surge in more founders coming from geographies outside major cities in the years to come, with startup hubs losing people due to the pandemic and lingering concerns, plus the attraction of remote work?
I believe that local hubs will hold their dominant position in the ecosystem. The remote/digital workforce will grow in numbers but proximity to capital, human resources and markets still will remain the prevalent force in shaping local startup communities.

Which industry segments that you invest in look weaker or more exposed to potential shifts in consumer and business behavior because of COVID-19? What are the opportunities startups may be able to tap into during these unprecedented times?
OTB invests in general in companies with clearly defined technological advantage, making quantifiable and near-term difference to their customers (usually in the B2B sector), which is a value-add regardless of the market cycle. The economic downturn works generally in favor of technological solutions enabling enterprise clients to increase efficiency, cut costs, bring optimization and replace manual labour with automation — and the vast majority of OTB portfolio fits that description. As such, the majority of the OTB portfolio has not been heavily impacted by the COVID pandemic.

How has COVID-19 impacted your investment strategy? What are the biggest worries of the founders in your portfolio? What is your advice to startups in your portfolio right now?
The COVID pandemic has not impacted our investment strategy in any way. OTB still pursues unique tech opportunities that can provide its customers with immediate value added. This kind of approach provides a relatively high level of resilience against economic downturns (obviously, sales cycles are extending but in general sales pipeline/prospects/retention remains intact). Liquidity in portfolio is always the number one concern in uncertain, challenging times. Lean approach needs to be reintroduced, companies need to preserve cash and keep optimizing — that’s the only way to get through the crisis.

Are you seeing “green shoots” regarding revenue growth, retention or other momentum in your portfolio as they adapt to the pandemic?
A good example in our portfolio is Segron, a provider of an automated testing platform for applications, databases and enterprise network infrastructure. Software development, deployment and maintenance in enterprise IT ecosystem requires continuous and rigorous testing protocols and as such a lot of manual heavy lifting with highly skilled engineering talent being involved (which can be used in a more productive way elsewhere). The COVID pandemic has kept engineers home (with no ability for remote testing) while driving demand for digital services (and as such demand for a reliable IT ecosystem). The Segron automated framework enables full automation of enterprise testing leading to increased efficiency, cutting operating costs and giving enterprise customers peace of mind and a good night’s sleep regarding their IT infrastructure in the challenging economic environment.

What is a moment that has given you hope in the last month or so? This can be professional, personal or a mix of the two.
I remain impressed by the unshakeable determination of multiple founders and their teams to overcome all the challenges of the unfavorable economic ecosystem.


Percona Server for MongoDB 4.2 vs 4.4 in Python TPCC Benchmark

Following my previous blogs on the py-tpcc benchmark for MongoDB, Evaluating the Python TPCC MongoDB Benchmark and Evaluating MongoDB Under Python TPCC 1000W Workload, and the recent release of Percona Server for MongoDB 4.4, I wanted to evaluate 4.2 vs. 4.4 in similar scenarios.

Hardware Specs

For the client and server, I will use identical bare metal servers, connected via a 10Gb network.

The node specification:

# Percona Toolkit System Summary Report ######################
        Date | 2020-09-14 16:52:46 UTC (local TZ: EDT -0400)
    Hostname | node3
      System | Supermicro; SYS-2028TP-HC0TR; v0123456789 (Other)
    Platform | Linux
     Release | Ubuntu 20.04.1 LTS (focal)
      Kernel | 5.4.0-42-generic
Architecture | CPU = 64-bit, OS = 64-bit
# Processor ##################################################
  Processors | physical = 2, cores = 28, virtual = 56, hyperthreading = yes
      Models | 56xIntel(R) Xeon(R) CPU E5-2683 v3 @ 2.00GHz
      Caches | 56x35840 KB
# Memory #####################################################
       Total | 251.8G
  Swappiness | 0
 DirtyPolicy | 80, 5
 DirtyStatus | 0, 0

The drive I used for the storage in this benchmark is a Samsung SM863 SATA SSD.

MongoDB Topology

For MongoDB I used:

  • Single node instance without limiting cache size. As the bare metal server has 250GB of RAM, MongoDB should allocate 125GB of memory for the WiredTiger cache, and the rest will be used for the OS cache. This should produce a more CPU-bound workload.
  • Single node instance with limited cache size. For the WiredTiger cache I set a limit of 25GB, and to limit the OS cache I limit the memory available to the MongoDB instance to 50GB, as described in Using Cgroups to Limit MySQL and MongoDB memory usage. However, in this case I did not actually use cgroups; instead, I used Docker to run the different versions and set the memory limits.

The script to start Percona Server for MongoDB in docker with memory limits:

> bash 4.4 1
=== script ===
# $1 = Percona Server for MongoDB image tag, $2 = instance number
docker run -d --name db$2 -m 50g \
          -v /mnt/data/psmdb$2-$1:/data/db \
          --net=host \
          percona/percona-server-mongodb:$1 --replSet "rs$2" --port $(( 27016 + $2 )) \
          --logpath /data/db/server1.log --slowms=10000 --wiredTigerCacheSizeGB=25

sleep 10

mongo mongodb://localhost:$(( 27016 + $2 )) --eval "rs.initiate( { _id : 'rs$2',  members: [      { _id: 0, host: 'localhost:$(( 27016 + $2 ))' }   ] })"


Benchmark Results

Unlimited Memory

The results are in New Order Transactions per Minute (NOTPM), and more is better:

Clients 4.2 4.4
10 541.31 691.89
30 999.89 1105.88
50 1048.50 1171.35
70 1095.72 1335.90
90 1184.38 1433.09
110 1210.18 1521.56
130 1231.38 1575.23
150 1245.31 1680.81
170 1224.13 1668.33
190 1300.11 1641.45
210 1240.86 1619.58
230 1220.89 1575.57
250 1237.86 1545.01

MongoDB 4.4 Unlimited Memory
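The unlimited-memory table can be condensed to the relative gain of 4.4 over 4.2; a quick sketch over a few representative rows (numbers copied from the table above):

```python
# NOTPM results copied from the unlimited-memory table: clients -> (4.2, 4.4)
notpm = {
    10: (541.31, 691.89), 50: (1048.50, 1171.35),
    150: (1245.31, 1680.81), 250: (1237.86, 1545.01),
}

for clients, (v42, v44) in notpm.items():
    gain = (v44 / v42 - 1) * 100
    print(f"{clients} clients: 4.4 is {gain:+.1f}% vs 4.2")
# 4.4 leads across the board here, peaking at about +35% at 150 clients.
```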

Limited Memory, 50GB in Total and 25GB for Cache

The results are in New Order Transactions per Minute (NOTPM), and more is better:

Clients 4.2 4.4
10 351.45 377.29
30 483.88 447.22
50 535.34 522.59
70 576.30 574.14
90 604.49 582.10
110 618.59 542.11
130 593.31 386.33
150 386.67 301.75
170 265.91 298.80
190 259.56 301.38
210 254.57 301.88
230 249.47 299.15
250 251.03 300.00

MongoDB 4.2 Limited Memory


Actually, I wanted to perform more benchmarks of 4.4 vs. 4.2, but some interesting behavior in 4.4 made me reconsider my plans, and I got distracted trying to understand the issue, which I share in the post MongoDB 4.4 Performance Regression: Overwhelmed by Memory.

Besides that, in my tests 4.4 outperformed 4.2 in the case of unlimited memory, but I want to look at the variation of throughput during the benchmark, so we are working on a py-tpcc version that reports data with 1-sec resolution. I also want to re-evaluate how 4.4 performs in a long-running benchmark, as the current benchmark length is 900 sec.

In the case of limited memory, 4.4 performed identically to or worse than 4.2 at concurrency over 100 clients.

Both versions did not handle the increased number of clients well, showing worse results with 150 clients compared to 10 clients.



Latent AI makes edge AI workloads more efficient

Latent AI, a startup that was spun out of SRI International, makes it easier to run AI workloads at the edge by dynamically managing workloads as necessary.

Using its proprietary compression and compilation process, Latent AI promises to compress library files by 10x and run them with 5x lower latency than other systems, all while using less power thanks to its new adaptive AI technology, which the company is launching as part of its appearance in the TechCrunch Disrupt Battlefield competition today.

Founded by CEO Jags Kandasamy and CTO Sek Chai, the company has already raised a $6.5 million seed round led by Steve Jurvetson of Future Ventures and followed by Autotech Ventures.

Before starting Latent AI, Kandasamy sold his previous startup OtoSense to Analog Devices (in addition to managing HPE Mid-Market Security business before that). OtoSense used data from sound and vibration sensors for predictive maintenance use cases. Before its sale, the company worked with the likes of Delta Airlines and Airbus.


In some ways, Latent AI picks up some of this work and marries it with IP from SRI International.

“With OtoSense, I had already done some edge work,” Kandasamy said. “We had moved the audio recognition part out of the cloud. We did the learning in the cloud, but the recognition was done in the edge device and we had to convert quickly and get it down. Our bill in the first few months made us move that way. You couldn’t be streaming data over LTE or 3G for too long.”

At SRI, Chai worked on a project that looked at how to best manage power for flying objects where, if you have a single source of power, the system could intelligently allocate resources for either powering the flight or running the onboard compute workloads, mostly for surveillance, and then switch between them as needed. Most of the time, in a surveillance use case, nothing happens. And while that’s the case, you don’t need to compute every frame you see.

“We took that and we made it into a tool and a platform so that you can apply it to all sorts of use cases, from voice to vision to segmentation to time series stuff,” Kandasamy explained.

What’s important to note here is that the company offers the various components of what it calls the Latent AI Efficient Inference Platform (LEIP) as standalone modules or as a fully integrated system. The compressor and compiler are the first two of these and what the company is launching today is LEIP Adapt, the part of the system that manages the dynamic AI workloads Kandasamy described above.

Image Credits: Latent AI

In practical terms, the use case for LEIP Adapt is that your battery-powered smart doorbell, for example, can run in a low-powered mode for a long time, waiting for something to happen. Then, when somebody arrives at your door, the camera wakes up to run a larger model — maybe even on the doorbell’s base station that is plugged into power — to do image recognition. And if a whole group of people arrives at once (which isn’t likely right now, but maybe next year, after the pandemic is under control), the system can offload the workload to the cloud as needed.

Kandasamy tells me that the interest in the technology has been “tremendous.” Given his previous experience and the network of SRI International, it’s maybe no surprise that Latent AI is getting a lot of interest from the automotive industry, but Kandasamy also noted that the company is working with consumer companies, including a camera and a hearing aid maker.

The company is also working with a major telco company that is looking at Latent AI as part of its AI orchestration platform and a large CDN provider to help them run AI workloads on a JavaScript backend.


How To Inject an Empty XA Transaction in MySQL

If you are using XA transactions, then you’ve likely run into a few replication issues with the 2PCs (2 Phase Commits). Here is a common error we see in Percona’s Managed Services, along with a few ways to handle it, including injecting an empty XA transaction.

Last_Error: Error 'XAER_NOTA: Unknown XID' on query. Default database: 'punisher'. Query: 'XA COMMIT X'1a',X'a1',1'

What Does it Mean?

It means that replication has tried to commit an XID (XA transaction ID) that does not exist on the server. We can verify that it does not exist by checking:

replica1 [localhost:20002] {msandbox} ((none)) > XA RECOVER CONVERT XID;
| formatID | gtrid_length | bqual_length | data   |
|        1 |            1 |            1 | 0x2BB2 |
1 row in set (0.00 sec)

In this case, there is a prepared XA transaction on the server, but its XID is X'2B',X'B2',1, not X'1a',X'a1',1. So indeed, the XID does not exist.
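As a side note, the data column shown by XA RECOVER CONVERT XID is simply the gtrid and bqual bytes concatenated, and gtrid_length/bqual_length tell you where to split. A quick sketch (using the values above) of mapping it back to the X'…',X'…' form:

```sql
-- The data column 0x2BB2 is one gtrid byte (0x2B) followed by one bqual
-- byte (0xB2), per gtrid_length=1 and bqual_length=1. Splitting it back out:
SELECT CONCAT("X'", HEX(LEFT(UNHEX('2BB2'), 1)),
              "',X'", HEX(SUBSTRING(UNHEX('2BB2'), 2)),
              "',1") AS xid;
-- xid: X'2B',X'B2',1  (the trailing 1 comes from the formatID column)
```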

How Do We Fix It?

There are a few ways. When using anonymous (non-GTID) replication, the error can be skipped like any other:


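A minimal sketch of that skip, assuming non-GTID replication (sql_slave_skip_counter only works when GTIDs are off):

```sql
STOP SLAVE;
SET GLOBAL sql_slave_skip_counter = 1;  -- skip the failing XA COMMIT event
START SLAVE;
```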
When using GTIDs, it can be skipped the typical way, by injecting an empty GTID:


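A sketch of the usual empty-GTID injection; the UUID and transaction number below are placeholders, so substitute the failing GTID reported by SHOW SLAVE STATUS:

```sql
STOP SLAVE;
-- Placeholder GTID: use the source's server UUID and the failing
-- transaction number from the replica's error/Retrieved_Gtid_Set.
SET GTID_NEXT = 'aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa:123';
BEGIN;
COMMIT;
SET GTID_NEXT = 'AUTOMATIC';
START SLAVE;
```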
Another option is that we can inject an empty XA transaction, much like we do with GTID. Then we can resume replication so it can naturally commit that XID.

To prepare an empty XA, first copy the SQL + XID from the error. In this case, “XA COMMIT X'1a',X'a1',1”.

Now transform it into three statements, and run them on the replica that hit the error.

XA START X'1a',X'a1',1;
XA END X'1a',X'a1',1;
XA PREPARE X'1a',X'a1',1;

This will have created a prepared XA transaction on the server. We can verify by running:

replica1 [localhost:20002] {msandbox} ((none)) > XA RECOVER CONVERT XID;
| formatID | gtrid_length | bqual_length | data   |
|        1 |            1 |            1 | 0x2BB2 |
|        1 |            1 |            1 | 0x1AA1 | <--- this is the transaction we just created
2 rows in set (0.00 sec)

So, let’s start replication:

replica1 [localhost:20002] {msandbox} ((none)) > START SLAVE;
ERROR 1399 (XAE07): XAER_RMFAIL: The command cannot be executed when global transaction is in the  PREPARED state

Uh oh, now what? When you prepare an XA transaction on a server, your session cannot execute any other SQL. You must disconnect from MySQL, reconnect, then start replication.

replica1 [localhost:20002] {msandbox} ((none)) > exit
replica1 [localhost:20002] {msandbox} ((none)) > START SLAVE;
Query OK, 0 rows affected (0.02 sec)

Regardless of how you handled the error, it is recommended to run a checksum to validate data consistency.

How Does This Happen?

2PCs write to the binlog in two… phases. The first phase contains the {XA START / transaction SQL / XA END / XA PREPARE}. Think of all of those statements as a single GTID. Once the XA PREPARE command has run, the whole transaction is written to the binary log, so these statements will always be written together. Example:

# at 903
#200908 20:53:35 server id 100  end_log_pos 1004 CRC32 0xd2f9e5c0       Query   thread_id=4     exec_time=0     error_code=0
SET TIMESTAMP=1599598415/*!*/;
XA START X'1a',X'a1',1
# at 1004
#200908 20:53:35 server id 100  end_log_pos 1055 CRC32 0xad24c30d       Table_map: `punisher`.`t1` mapped to number 108
# at 1055
#200908 20:53:35 server id 100  end_log_pos 1100 CRC32 0xf7100e24       Write_rows: table id 108 flags: STMT_END_F
### INSERT INTO `punisher`.`t1`
### SET
###   @1=2
###   @2='2020-09-08 20:53:35'
# at 1100
#200908 20:53:44 server id 100  end_log_pos 1191 CRC32 0x314c857d       Query   thread_id=4     exec_time=0     error_code=0
SET TIMESTAMP=1599598424/*!*/;
XA END X'1a',X'a1',1
# at 1191
#200908 20:53:44 server id 100  end_log_pos 1229 CRC32 0x829495e8       XA PREPARE X'1a',X'a1',1
XA PREPARE X'1a',X'a1',1

Now we have a prepared XA on the source and replicas (which can hold row locks and block other transactions). The transaction can now be committed or rolled back; that depends on the second phase, and this is where the problems come in.
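If you suspect a lingering prepared XA is holding row locks, you can peek at performance_schema; a sketch, assuming MySQL 8.0 (data_locks does not exist in 5.7):

```sql
-- Prepared XA transactions survive client disconnects and can hold row locks.
-- XA RECOVER lists them; data_locks shows what locks are currently held.
XA RECOVER CONVERT XID;

SELECT engine_transaction_id, object_schema, object_name,
       lock_type, lock_mode, lock_status
FROM performance_schema.data_locks;
```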

The second phase commit/rollback can come seconds later, minutes later, days later, or even never. It all depends on when/if the Transaction Manager issues the command. In this case, it was 4 minutes later:

# at 1294
#200908 20:57:37 server id 100  end_log_pos 1388 CRC32 0xe38c4e46       Query   thread_id=4     exec_time=0     error_code=0
SET TIMESTAMP=1599598477/*!*/;
XA COMMIT X'1a',X'a1',1

There could be hundreds or thousands of other transactions written to the binary log in between the first and second phases. They could even be written to different binlogs.

This explanation is just to show how 2PCs work, so you can understand the separate parts of an XA transaction.


Now, to try to answer “how does this happen?”: it could come from restoring a backup where MySQL was never told to prepare some XID on the server. Replication then starts, reads events from the source’s binlog, comes across an XA COMMIT for an XID that was never prepared, and errors. Ultimately, these issues usually come down to some bug (here is one, for example).

Do you have XAs blocking other transactions? Check out Planes, Trains, and Automobiles: MySQL XA Transactions.

Here is another post on how to troubleshoot XA recovery.

Percona Server for MySQL is also working on making XA RECOVER CONVERT XID more helpful!


Announcing the Agenda for Percona Live ONLINE, 20-21 October 2020!

Today, we’re excited to announce the agenda for Percona Live ONLINE, taking place on 20-21 October 2020.

As the conference could not go ahead in person in Amsterdam as planned, we have once again moved to the online format that saw more than 6,500 people join us over 24 hours in May.

We’ll be covering topics on Open Source Databases and Applications using MySQL, PostgreSQL, MongoDB, and MariaDB, as well as topics on Cloud, Application Development, High Availability, and Kubernetes.

The conference will run for 28 hours this time and feature opening and closing keynotes, with that lineup to be announced soon!

Here’s a sneak peek at some of the talks before you view the full agenda, which is now available:

View Agenda

You can view our line up of speakers, as well as view the schedule in your local timezone and register for specific sessions that are of interest.

Of course, Percona Live ONLINE is not complete without our hosts, who are based across the globe! Before the conference kicks off, get to know our hosts!

Join the Chat Room ahead of time and network with hundreds of your peers in our dedicated Slack Channel:

Join Now

If you can’t wait until October to view these awesome sessions, why not take a look at our on-demand sessions, which are available from Percona Live ONLINE in May.
