Aug 31, 2021

My Favorite Percona Monitoring and Management Additional Dashboards

Percona Monitoring and Management Dashboards

Percona Monitoring and Management (PMM) has dashboards that cover a lot of ground, yet PMM superpowers come from the fact that you do not need to stick to the dashboards included with the product! You can also easily install additional dashboards provided by the community, as well as implement your own.

In this blog post, we will cover some of the additional dashboards which I find particularly helpful.

Node Processes Dashboard


Get insights into the processes on the system to better understand resource usage by your database server versus everything else running on the host. Unexpected resource-hogging processes are quite a common cause of downtime and performance issues. More information is available in the Understanding Processes on your Linux Host blog post.

MySQL Memory Usage Details


Ever wondered where MySQL memory usage comes from? This dashboard can shed light on this dark place, showing the top global memory consumers as well as which users and client hosts contribute to memory usage. More details in the Understanding MySQL Memory Usage with Performance Schema blog post.

MySQL Query Performance Troubleshooting


Want to understand which queries are responsible for CPU, Disk, Memory, or Network Usage and get some other advanced MySQL Query Troubleshooting tools? Check out this dashboard.  Read more about it in the  MySQL Query Performance Troubleshooting blog post.

RED Method for MySQL Dashboard


Want to apply the RED (Rate-Errors-Duration)  method to MySQL?  Check out this dashboard, and check out RED Method for MySQL Performance Analyses for more details.

OK, so let’s say you’re convinced and want to get those dashboards into your PMM install, but manual installation does not excite you. Here is how you can use custom dashboard provisioning to install all of them:

curl -LJOs https://github.com/Percona-Lab/pmm-dashboards/raw/main/misc/import-dashboard-grafana-cloud.sh --output import-dashboard-grafana-cloud.sh
curl -LJOs https://github.com/Percona-Lab/pmm-dashboards/raw/main/misc/cleanup-dash.py --output cleanup-dash.py

chmod a+x import-dashboard-grafana-cloud.sh
chmod a+x cleanup-dash.py

./import-dashboard-grafana-cloud.sh -s <PMM_SERVER_IP> -u admin:<ADMIN_PASSWORD> -f Custom -d 13266 -d 12630 -d 12470 -d 14239

Note:  Node Processes and MySQL Memory Usage Details dashboards also require additional configuration on the client-side. Check out the blog posts mentioned for specifics.

Enjoy!

Percona Monitoring and Management is a best-of-breed open source database monitoring solution. It helps you reduce complexity, optimize performance, and improve the security of your business-critical database environments, no matter where they are located or deployed.

Download Percona Monitoring and Management Today

Aug 24, 2021

Resharding in MongoDB 5.0


MongoDB 5.0 has been released with a bunch of new features. An important one is the capability to completely redefine the shard key of a collection in a sharded cluster. Resharding was a feature frequently requested by customers. In MongoDB 4.4 only refining the shard key was possible; from now on you can change your shard key to use entirely different fields.

When you use sharding in MongoDB and choose a poor shard key for a collection, you can face issues like:

  • Unbalanced distribution of the data
  • Having jumbo chunks that cannot be split and cannot be migrated

These issues can make your cluster very slow, with some of the shards extremely overloaded. The cluster can also become close to unmanageable, and crashes could happen at some point.

The Problem of Changing a Shard Key

I had a customer running a 5-shard cluster with more than 50TB of data. They had a few collections with sub-optimal shard keys: one of the shards held more than 75% of the data and workload, and there were thousands of jumbo chunks around. Some of the jumbo chunks reached 200GB in size; really a lot. The cluster was terribly slow, chunk migrations were attempted constantly and failed very frequently, and the application was unresponsive most of the time due to timeouts. Manual splits and migrations of jumbo chunks were impossible. The only solution was to create a different shard key for those collections, but this wasn’t possible with that MongoDB version. The only suggested way to change a shard key was to drop and recreate the collection. Yes, and remember also to dump and reload the data. This requires you to stop your application for an impressive amount of time if the dataset is large. That was the case.

In order to avoid stopping the application, we deployed a new, empty cluster with a few more shards as well. We created all the empty collections in the new cluster with the optimal shard key. Then we moved the data out of the old cluster with custom scripts, one piece at a time, using boundaries on indexed timestamp fields. The solution required the customer to deploy some application changes. The application was instructed to read/write from a different cluster depending on whether the affected data had already been migrated into the new cluster or not. The entire migration process took weeks to complete. In the end, it worked, and the customer now has a balanced cluster with no jumbo chunks.

Anyway, the solution came at some cost in terms of new machines, development effort, and time.

Fortunately, MongoDB 5.0 introduced the reshardCollection command. From now on changing a shard key should be less expensive than in the past. That’s awesome.

In this article, we’ll take a look at how resharding in MongoDB 5.0 works.

Internals

The way resharding works is simple. When you issue the reshardCollection command, MongoDB implicitly creates a new, empty collection with the new shard key defined, and the data is moved from the existing collection chunk by chunk. There is a two-second lock required by the balancer to determine the new data distribution, but during the copy phase the application can continue to read and write the collection with no issues.

Due to this copy phase, the larger the collection is, the more time the resharding takes.

Prerequisites for Resharding

There are some requirements you need to validate before starting the resharding, and some could be expensive, in certain cases.

  • Disk space: be sure to have at least 1.2x the size of the collection you’re going to reshard. If the collection is 1TB, you need at least 1.2TB of free disk space.
  • I/O capacity should be below 50%
  • CPU load should be below 80%

The resources available must be evaluated on each shard.

If you fail to provide these resources, the database could run out of space or the resharding could take longer than expected.

Another important requirement is that you will need to deploy changes to your application queries. To let the database work best during the resharding process, the queries must use filters on both the current shard key and the new shard key. Only at the end of the resharding process can you drop the filters on the old shard key from your queries.

This requirement is very important and in some cases it could be a little expensive. Temporarily changing the application code is sometimes a hard task.
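
For example, reusing the test collection from later in this post, with an old shard key on age and a new one on name (the values here are purely illustrative), queries issued while the resharding is in progress would look like this:

db.testcoll.find({ age: 35, name: "7gbk73hf42g" })
db.testcoll.updateOne({ age: 35, name: "7gbk73hf42g" }, { $set: { city: "rome" } })

Once the resharding completes, the filter on age can be removed again.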

There are also other limitations you need to consider:

  • Resharding is not permitted while an index build is running
  • Some queries will return an error if you don’t include both the current and new shard keys: deleteOne(), findAndModify(), updateOne() … check the manual for the full list
  • You can reshard only one collection at a time
  • You cannot use addShard(), removeShard(), dropDatabase(), db.createCollection() while resharding is running
  • The new shard key cannot have a uniqueness constraint
  • Resharding a collection with a uniqueness constraint is not supported

Let’s Test Resharding in MongoDB

To test the new feature, I created a sharded cluster using AWS instances: t2.large, with 2 vCPUs, 8 GB of RAM, and an EBS volume each. I deployed a two-shard cluster using just a single member for each replica set.

Let’s create a sharded collection and insert some sample data.

At the beginning we create the shard key on the age field. To generate random data, we use a few helper functions and a JavaScript loop to insert one million documents.

[direct: mongos] testdb> sh.shardCollection("testdb.testcoll", { age: 1} )
{
  collectionsharded: 'testdb.testcoll',
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1628588848, i: 25 }),
    signature: {
      hash: Binary(Buffer.from("0000000000000000000000000000000000000000", "hex"), 0),
      keyId: Long("0")
    }
  },
  operationTime: Timestamp({ t: 1628588848, i: 21 })
}

[direct: mongos] testdb> function getRandomInt(min, max) {
...     return Math.floor(Math.random() * (max - min + 1)) + min;
... }
[Function: getRandomInt]

[direct: mongos] testdb> var day = 1000 * 60 * 60 * 24;

[direct: mongos] testdb> function getRandomDate() {
... return new Date(Date.now() - (Math.floor(Math.random() * day)))
... }
[Function: getRandomDate]

[direct: mongos] testdb> function getRandomString() {
... return (Math.random()+1).toString(36).substring(2)
... }
[Function: getRandomString]

[direct: mongos] testdb> for (var i=1; i<=1000000; i++) {
... db.testcoll.insert(
..... {
....... uid: i,
....... name: getRandomString(),
....... date_created: getRandomDate(),
....... age: getRandomInt(1,100),
....... address: getRandomString(),
....... city: getRandomString(),
....... country: getRandomString()
....... }
..... )
... }

Now we have our initial collection. Let’s take a look at the distribution of the chunks and the sharding status of the cluster.

{
  database: {
    _id: 'testdb',
    primary: 'sh1-rs',
    partitioned: true,
    version: {
    uuid: UUID("efda5b69-1ffc-42c3-abed-ad20e08e321d"),
    timestamp: Timestamp({ t: 1628527701, i: 21 }),
    lastMod: 1
  }
},
collections: {
  'testdb.testcoll': {
    shardKey: { age: 1 },
    unique: false,
    balancing: true,
    chunkMetadata: [
      { shard: 'sh0-rs', nChunks: 2 },
      { shard: 'sh1-rs', nChunks: 3 }
    ],
    chunks: [
      { min: { age: MinKey() }, max: { age: 1 }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 5, i: 0 }) },
      { min: { age: 1 }, max: { age: 42 }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 4, i: 0 }) },
      { min: { age: 42 }, max: { age: 72 }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 5, i: 2 }) },
      { min: { age: 72 }, max: { age: 100 }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 5, i: 3 }) },
      { min: { age: 100 }, max: { age: MaxKey() }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 5, i: 1 }) }
    ],
    tags: []
    }
  }
}

With the current shard key on the age field we’ve got 5 chunks in total, 2 on sh0-rs and 3 on sh1-rs. The total size of the collection is around 200MB (uncompressed).

It’s time to test the resharding. Let’s try to change the shard key to { name: 1 }. The syntax is the following, where we provide the namespace to reshard and the new shard key:

[direct: mongos] testdb> db.adminCommand({
... reshardCollection: "testdb.testcoll",
... key: { name: 1 }
... })

From another connection we can monitor the status of the resharding:

[direct: mongos] testdb> db.getSiblingDB("admin").aggregate([ 
{ $currentOp: { allUsers: true, localOps: false } }, 
{ $match: { type: "op", "originatingCommand.reshardCollection": "testdb.testcoll" } }])
[
  {
    shard: 'sh0-rs',
    type: 'op',
    desc: 'ReshardingRecipientService f86967fe-36a9-4836-8972-1061a61230df',
    op: 'command',
    ns: 'testdb.testcoll',
    originatingCommand: {
      reshardCollection: 'testdb.testcoll',
      key: { name: 1 },
      unique: false,
      collation: { locale: 'simple' }
    },
    totalOperationTimeElapsedSecs: Long("145"),
    remainingOperationTimeEstimatedSecs: Long("173"),
    approxDocumentsToCopy: Long("624135"),
    documentsCopied: Long("569868"),
    approxBytesToCopy: Long("95279851"),
    bytesCopied: Long("86996201"),
    totalCopyTimeElapsedSecs: Long("145"),
    oplogEntriesFetched: Long("0"),
    oplogEntriesApplied: Long("0"),
    totalApplyTimeElapsedSecs: Long("0"),
    recipientState: 'cloning',
    opStatus: 'running'
  },
  {
    shard: 'sh0-rs',
    type: 'op',
    desc: 'ReshardingDonorService f86967fe-36a9-4836-8972-1061a61230df',
    op: 'command',
    ns: 'testdb.testcoll',
    originatingCommand: {
      reshardCollection: 'testdb.testcoll',
      key: { name: 1 },
      unique: false,
      collation: { locale: 'simple' }
    },
    totalOperationTimeElapsedSecs: Long("145"),
    remainingOperationTimeEstimatedSecs: Long("173"),
    countWritesDuringCriticalSection: Long("0"),
    totalCriticalSectionTimeElapsedSecs: Long("0"),
    donorState: 'donating-initial-data',
    opStatus: 'running'
  }, 
  {
    shard: 'sh1-rs',
    type: 'op',
    desc: 'ReshardingDonorService f86967fe-36a9-4836-8972-1061a61230df',
    op: 'command',
    ns: 'testdb.testcoll',
    originatingCommand: {
      reshardCollection: 'testdb.testcoll',
      key: { name: 1 },
      unique: false,
      collation: { locale: 'simple' }
    },
    totalOperationTimeElapsedSecs: Long("145"),
    remainingOperationTimeEstimatedSecs: Long("276"),
    countWritesDuringCriticalSection: Long("0"),
    totalCriticalSectionTimeElapsedSecs: Long("0"),
    donorState: 'donating-initial-data', 
    opStatus: 'running'
  },
  {
    shard: 'sh1-rs',
    type: 'op',
    desc: 'ReshardingRecipientService f86967fe-36a9-4836-8972-1061a61230df',
    op: 'command',
    ns: 'testdb.testcoll',
    originatingCommand: {
      reshardCollection: 'testdb.testcoll',
      key: { name: 1 },
      unique: false,
      collation: { locale: 'simple' }
    },
    totalOperationTimeElapsedSecs: Long("145"),
    remainingOperationTimeEstimatedSecs: Long("276"),
    approxDocumentsToCopy: Long("624135"),
    documentsCopied: Long("430132"),
    approxBytesToCopy: Long("95279851"),
    bytesCopied: Long("65663031"),
    totalCopyTimeElapsedSecs: Long("145"),
    oplogEntriesFetched: Long("0"),
    oplogEntriesApplied: Long("0"),
    totalApplyTimeElapsedSecs: Long("0"),
    recipientState: 'cloning',
    opStatus: 'running'
  }
]

The totalOperationTimeElapsedSecs field shows the time elapsed so far, and remainingOperationTimeEstimatedSecs shows the estimated time remaining to complete the operation.

While resharding is running we are allowed to write into the collection. For example, I tried the following and it worked:

[direct: mongos] testdb> db.testcoll.insert({ name: "new doc" })
{
  acknowledged: true,
  insertedIds: { '0': ObjectId("61125fa53916a70b7fdd8f6c") }
}

The full resharding of the collection took around 5 minutes. Let’s now see the new status of the sharding.

{
  database: {
    _id: 'testdb',
    primary: 'sh1-rs',
    partitioned: true,
    version: {
      uuid: UUID("efda5b69-1ffc-42c3-abed-ad20e08e321d"),
      timestamp: Timestamp({ t: 1628527701, i: 21 }),
      lastMod: 1
    }
  },
  collections: {
    'testdb.testcoll': {
    shardKey: { name: 1 },
    unique: false,
    balancing: true,
    chunkMetadata: [
      { shard: 'sh0-rs', nChunks: 3 },
      { shard: 'sh1-rs', nChunks: 3 }
    ],
    chunks: [
      { min: { name: MinKey() }, max: { name: '7gbk73hf42g' }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 6, i: 0 }) },
      { min: { name: '7gbk73hf42g' }, max: { name: 'cv1oqoetetf' }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 5, i: 1 }) },
      { min: { name: 'cv1oqoetetf' }, max: { name: 'kuk1p3z8kc' }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 2, i: 2 }) },
      { min: { name: 'kuk1p3z8kc' }, max: { name: 'ppq4ubqwywh' }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 2, i: 3 }) },
      { min: { name: 'ppq4ubqwywh' }, max: { name: 'zsods6qhqp' }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 1, i: 3 }) },
      { min: { name: 'zsods6qhqp' }, max: { name: MaxKey() }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 6, i: 1 }) }
    ],
    tags: []
    }
  }
}

We can see the shard key has been changed, and the chunks are now evenly distributed.

[direct: mongos] testdb> db.testcoll.getShardDistribution()
Shard sh1-rs at sh1-rs/172.30.0.108:27017
{
  data: '92.72MiB',
  docs: 636938,
  chunks: 3,
  'estimated data per chunk': '30.9MiB',
  'estimated docs per chunk': 212312
}
---
Shard sh0-rs at sh0-rs/172.30.0.221:27017
{
  data: '82.96MiB',
  docs: 569869,
  chunks: 3,
  'estimated data per chunk': '27.65MiB',
  'estimated docs per chunk': 189956
}
---
Totals
{
  data: '175.69MiB',
  docs: 1206807,
  chunks: 6,
  'Shard sh1-rs': [
    '52.77 % data',
    '52.77 % docs in cluster',
    '152B avg obj size on shard'
  ],
  'Shard sh0-rs': [
    '47.22 % data',
    '47.22 % docs in cluster',  
    '152B avg obj size on shard'
  ]
}

Good. It worked.

Out of curiosity, during the resharding, the temporary collection being copied is visible. Just run sh.status():

    collections: {
      'testdb.system.resharding.defda80d-ea5f-4e52-9a7f-9b3e2ed8ddf1': {
        shardKey: { name: 1 },
        unique: false,
        balancing: true,
        chunkMetadata: [
          { shard: 'sh0-rs', nChunks: 3 },
          { shard: 'sh1-rs', nChunks: 3 }
        ],
        chunks: [
          { min: { name: MinKey() }, max: { name: '9bnfw8n23l' }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 2, i: 0 }) },
          { min: { name: '9bnfw8n23l' }, max: { name: 'etd3ho9vfwg' }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 2, i: 1 }) },
          { min: { name: 'etd3ho9vfwg' }, max: { name: 'kmht0br01l' }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 1, i: 2 }) },
          { min: { name: 'kmht0br01l' }, max: { name: 'sljcb1s70g' }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 1, i: 3 }) },
          { min: { name: 'sljcb1s70g' }, max: { name: 'zzi3ijuw2f' }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 1, i: 4 }) },
          { min: { name: 'zzi3ijuw2f' }, max: { name: MaxKey() }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 1, i: 5 }) }
        ],
        tags: []
      },
      'testdb.testcoll': {
        shardKey: { age: 1 },
        unique: false,
        balancing: true,
        chunkMetadata: [
          { shard: 'sh0-rs', nChunks: 2 },
          { shard: 'sh1-rs', nChunks: 3 }
        ],
        chunks: [
          { min: { age: MinKey() }, max: { age: 1 }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 5, i: 0 }) },
          { min: { age: 1 }, max: { age: 42 }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 4, i: 0 }) },
          { min: { age: 42 }, max: { age: 72 }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 5, i: 2 }) },
          { min: { age: 72 }, max: { age: 100 }, 'on shard': 'sh1-rs', 'last modified': Timestamp({ t: 5, i: 3 }) },
          { min: { age: 100 }, max: { age: MaxKey() }, 'on shard': 'sh0-rs', 'last modified': Timestamp({ t: 5, i: 1 }) }
        ],
        tags: []
      }
    }
  }

testdb.system.resharding.defda80d-ea5f-4e52-9a7f-9b3e2ed8ddf1 is the newly created collection.

Some Graphs from Percona Monitoring and Management

I deployed Percona Monitoring and Management (PMM) to monitor the cluster and evaluate the impact of the resharding operation. As mentioned, I used a small test environment with no workload at all, so what we see in the following graphs is just the impact of the resharding.

The cluster has one config server and two shards. Each replica set is just a single member running on a dedicated virtual host in AWS EC2.

Let’s see the CPU usage

percona monitoring and management mongodb

The node in the middle is the config server. The data-bearing nodes saw a significant increase in CPU usage, while the config server was not impacted at all.

Let’s see the Disk utilization in terms of IOPS and latency:

Disk utilization

As with the CPU, the config server was not impacted. The shards, instead, increased their IOPS as expected, since a new collection was being created. Disk latency spiked to more than 20 ms at the beginning and at the end of the process.

The latency of MongoDB read operations also increased.

The WiredTiger storage engine increased the number of transactions processed and the cache activity as expected.

Other metrics increased but they are not included here. This behavior was expected and it’s normal for a resharding operation.

Resharding is quite expensive, even for a small collection, particularly in terms of CPU utilization. That said, more tests are needed with larger collections, in the GB/TB range.

Remember that the resharding time depends on the size of the collection. Be aware of that, and expect a performance decrease on your servers for some time when you plan a resharding.

Conclusions

Resharding in MongoDB 5.0 is an amazing new feature, but it comes with some additional costs and limitations. Indeed, resharding requires resources in terms of disk space, I/O, and CPU. Be aware of that and plan your resharding operations carefully.

As a best practice, I suggest spending as much time as needed on choosing the shard key the first time you decide to shard a collection. If you choose the optimal shard key from the beginning, you won’t need resharding. A good schema design is always the best approach when working with MongoDB.

I also suggest continuously monitoring the status of the chunk distribution in your collections. As soon as you notice that the shard key is causing an uneven distribution that is getting worse, use reshardCollection as soon as you can, while the collection is not too large.

Another suggestion is to do resharding only during a maintenance window because the performance impact for your cluster could be significant in the case of very large collections.

Further readings:

https://docs.mongodb.com/manual/core/sharding-reshard-a-collection/

https://docs.mongodb.com/v5.0/core/sharding-choose-a-shard-key/

https://docs.mongodb.com/manual/reference/command/refineCollectionShardKey/

https://www.percona.com/blog/mongodb-5-0-is-coming-in-hot-what-do-database-experts-across-the-community-think/

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

Download Percona Distribution for MongoDB Today!

Aug 23, 2021

August Is Hot, but Not as Hot as Our Next Community Engineering Meeting!

Percona Community Engineering Meeting

The engineering marvel that is Percona Monitoring and Management (PMM) has to be, for me, one of the best examples out there of how a large number of vastly different community projects can successfully combine into something whose value and features far exceed the sum of its component open-source parts.

That’s a lot of words for a simple thing: it’s a metrics mover—it’s PMM!

PMM has exporters that suck system stats from a server or database.

The numbers are safely shuffled through agents across a crowded network toward the sanctuary of the PMM Server.

That’s where your valuable system values are collected and collated, charted and tabled, and where they emerge as fabled knowledge, logically organized and neatly arranged into dashboards.

Dashboards (What you see)

PMM’s dashboards are based on Grafana, and PMM relies on many community-based projects.

Exporters (What you don’t see)

Exporters are specialized programs that run and extract metrics data from a node or database. Every database is different, and so every exporter is too.

Community Meetings (What you may not know)

I tell you what you may already know simply as a prelude to an announcement of what you may not yet know.

We’ve been running monthly community meetings where our Engineering Gurus “talk technical” to share their knowledge of PMM and all its parts.

The format is fairly informal. We use our Discord server’s voice channel. We grab a refreshing region-sensitive hot or cold drink, a very loose agenda, and we talk.

All we need now is someone to listen.

The topics in this month’s meeting:

  • Exporters: Focusing this month on node_exporter, and how and why we’re reverting our fork of the Prometheus one.

  • Dashboards: How Percona’s differ from the community’s, and how to migrate them using the scripts in here. Here’s a brief preview.

    • Convert a dashboard from PMM

      PMM dashboard

      This script converts a PMM dashboard so it can be used in an external Prometheus + Grafana installation. It doesn’t need any input from you. It replaces PMM2 labels (node_name, service_name) that are used in variables with default labels (instance).

    • Convert a dashboard to PMM
      Convert a dashboard to PMM

    This one renames labels and variables interactively because it can’t know in advance the names of the labels you want to use. You must select names for renaming and provide a PMM2 label (node_name, service_name). The script collects a list of used variables and asks which have to be renamed in variables and expressions.

    PMM2 labels can be checked in the VM configuration files:

    VM configuration files

As we’ve already tweeted, our Gurus are not ashamed of showing their consoles. Some are bare and some are full, but they’re always fascinating to watch when they use them with such style.

Why not join us and see for yourself?

Aug 11, 2021

Pros and Cons: When You Should and Should Not Use MongoDB

pros and cons of using MongoDB

Quite often we see that the main operational storage is used in conjunction with some additional services, for example, for caching or full-text search.

Another architectural approach that uses multiple databases is microservices, where every microservice has its own database, better optimized for the tasks of that particular service. For example, you can use MySQL for primary storage, Redis and Memcached for caching, and Elasticsearch or native Sphinx for searching. You can apply something like Kafka to transfer data to an analytics system, which was often built on something like Hadoop.

If we are talking about the main operational storage, there are two options. We can choose a relational database with the SQL language, or we may opt for a non-relational database and then choose one of the several types available.

If we talk about NoSQL data models, then there are a lot of choices there too. The most typical ones are key-value, document, or wide column databases.

Examples are Memcached, MongoDB, and Cassandra, respectively.

Looking at DB-Engines Ranking, we will see that the popularity of open-source databases has been growing over the years, while commercial databases have been gradually declining.

What’s even more interesting, the same trend has been observed for different types of databases: open source databases are the most popular for many types, such as columnar, time series, and document stores. Commercial licenses prevail only for classical technologies such as relational databases, or even older ones like multivalue databases.

At Percona, we work closely with the most popular relational and non-relational open-source databases (MySQL, PostgreSQL, and MongoDB), dealing with many clients; we help them make choices and provide them with the best advice for each case.

With that in mind, the purpose of this article is to show scenarios that are worth considering before deploying MongoDB, leading you toward a decision on when you should and should not use it. Additionally, if you already have your setup, this article might still be interesting, as some of the following topics may have passed unnoticed during the evaluation of the product.

Table of Contents

Here is a list of topics that I will discuss further in this article:

  1. Team experience and preferences
  2. Development approach and app lifecycle 
  3. Data Model
  4. Transactions and Consistency (ACID)
  5. Scalability
  6. Administration

1. Team Experience and Preferences

Before diving into MongoDB, the most important thing is to take into account the team’s experience and preferences.

From the MongoDB point of view, the advantage is that we have flexible JSON-format documents, and for some tasks and some developers this is convenient. For other teams it is difficult, especially if they have worked with SQL databases for a long time and understand relational algebra and the SQL language very well.

In MongoDB, you can easily familiarize yourself with the basic CRUD operations.
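
A minimal sketch in the mongo shell, with made-up collection and field names, looks like this:

db.users.insertOne({ name: "Alice", age: 30 })
db.users.find({ age: { $gt: 25 } })
db.users.updateOne({ name: "Alice" }, { $set: { age: 31 } })
db.users.deleteOne({ name: "Alice" })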

Simple queries are less likely to cause problems. Still, as soon as a daily task arises that requires deeper data processing, you will certainly need a more powerful tool to handle it, like the MongoDB aggregation pipeline and map-reduce, which we discuss further in this article.

There are great and free courses available at MongoDB University that can undoubtedly help grow the team’s knowledge. Still, it’s important to keep in mind that the top of the learning curve might take some time to reach if the team is not already familiar with MongoDB.

2. Development Approach and Application Lifecycle

Applications where MongoDB is used mainly focus on fast development, because you can change things at any time; you don’t have to worry about a strict document format.

The second point is the data schema. Here you need to understand that data always has a schema; the only question is where it is implemented. You can implement the data schema in your application, since that is where the data is used, or you can implement it at the database level.

Quite often there is a single application that deals with the data in the database. In that case, an application-level schema works well. But if the same data is used by many applications, an application-level schema becomes very inconvenient and difficult to control.

From the point of view of the application development lifecycle, the appeal can be summarized as follows:

  • Speed of development
  • No need to synchronize the schema in the database and the application
  • It is clear how to scale further
  • Simple predetermined solutions

3. Data Model

As mentioned in the first topic, the data model is highly dependent on the application and the team’s experience.

The data of many web applications is usually easy to map, because if we store a structure like an associative array from the application, it is straightforward and clear for the developer to serialize it into a JSON document.

Let’s take an example. We want to save a contact list from the phone. There is data that fits well into one relational table: first name, last name, etc. But if you look at phone numbers or email addresses, one person may have several of them. If we want to store this in a good relational form, it would be nice to have it in separate tables, then collect it using JOIN, which is less convenient than storing it in one collection with hierarchical documents.

Data Model – Contact List Example

Relational Database
  • First name, last name, date of birth
  • One person can have several phone numbers and emails
  • You should create separate tables for them 
  • JSON arrays are non-traditional extensions
Document-Oriented Database
  • Everything is stored in one “collection.”
  • Arrays and embedded documents
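
A minimal sketch of such a contact document, with illustrative field names, shows how the phone numbers and emails fit naturally into arrays inside a single document:

db.contacts.insertOne({
  first_name: "Mary",
  last_name: "Smith",
  date_of_birth: ISODate("1990-05-01"),
  phones: [
    { type: "mobile", number: "+1-555-0100" },
    { type: "work", number: "+1-555-0101" }
  ],
  emails: [ "mary@example.com", "mary.smith@work.example" ]
})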

However, it’s crucial to consider that a more flexible solution results in a list of documents that may have completely different structures. As someone said before, “With great power comes great responsibility.”

Unfortunately, it’s pretty common to see operations failing because documents grew past the 16MB limit, a single collection holding terabytes of data, or, in a worse scenario, wrongly designed shard keys.

These anomalies could be a good indication that you are turning your database into a Data Swamp, a term commonly used in Big Data deployments for data that is badly designed, inadequately documented, or poorly maintained.

You don’t need to normalize your data strictly, but it’s essential to take time and analyze how you will structure your data to get the best of both worlds when using MongoDB and avoid those pitfalls.

You can check the blog post “Schema Design in MongoDB vs Schema Design in MySQL” to get better clarification on data modeling and how it differs. It is worth mentioning the schema validation feature that you can use during updates and insertions. You can set validation rules on a per-collection basis, restricting the content type that is being stored.
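
As an illustrative sketch (the collection name, fields, and rules below are made up), such a validator can be attached when creating a collection:

db.createCollection("contacts", {
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: [ "first_name", "last_name" ],
      properties: {
        first_name: { bsonType: "string" },
        last_name: { bsonType: "string" },
        emails: { bsonType: "array", items: { bsonType: "string" } }
      }
    }
  },
  validationAction: "error"
})

Documents that do not match the rules are rejected on insert or update.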

Terms

Interestingly, when it comes to modeling and querying, there is much in common between relational and non-relational DBMSs. We are talking about databases in both cases, but what we call a table in a relational database is often called a collection in a non-relational one. What is a column in SQL is a field in MongoDB, and the list goes on.

In terms of the JOIN we have been mentioning, MongoDB does not have such a concept. However, you can use $lookup in your aggregation pipeline. It performs only a left outer join; extensive use of $lookup might indicate an error in your data modeling.
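
A minimal sketch of $lookup, assuming hypothetical orders and customers collections, looks like this:

db.orders.aggregate([
  {
    $lookup: {
      from: "customers",
      localField: "customer_id",
      foreignField: "_id",
      as: "customer"
    }
  }
])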

As for access: we use SQL for relational data. For MongoDB and many other NoSQL databases, we use a standard such as CRUD. This standard says that there are operations to create, read, update, and delete documents.

Below are some examples of the most typical tasks to deal with documents and their equivalent in the SQL world:

  • CREATE:

db.customers.insert({
  "CustomerName":"Cardinal",
  "ContactName":"Tom B. Erichsen",
  "Address":"Skagen 21",
  "City":"Stavanger",
  "PostalCode":4006,
  "Country":"Norway"
})

INSERT INTO customers
            (customername,
            contactname,
            address,
            city,
            postalcode,
            country)
VALUES      ('Cardinal',
            'Tom B. Erichsen',
            'Skagen 21',
            'Stavanger',
            '4006',
            'Norway');

  • READ:

db.customers.find({
  "PostalCode":{
      "$eq":4006}},
{
  "_id":0,
  "customername":1,
  "City":1
})

SELECT customername,
      city
FROM   customers
WHERE  postalcode = 4006;

  • UPDATE:

db.customers.update({
  "CustomerName":"Cardinal"},
{"$set":{"City":"Oslo"}})

UPDATE customers
SET    city = 'Oslo'
WHERE  customername = 'Cardinal';

  • DELETE:

db.customers.remove({
  "customername":"John"
})

DELETE FROM customers
WHERE  customername = 'John';

If you are a developer familiar with the JavaScript language, this CRUD syntax provided by MongoDB will feel more natural to you than the SQL syntax.

In my opinion, for the simplest operations, such as searches or inserts, both work well enough. When it comes to trickier queries, the SQL language is much more readable.

  • COUNT:

db.customers.find({
  "PostalCode":{
      "$eq":4006
  }
}).count()

SELECT Count(1)
FROM   customers
WHERE  postalcode = 4006;

With the interface, it’s easy enough to do things like counting the number of rows in a table or a collection.

  • Aggregation

db.customers.aggregate([
  {
      "$group":{
        "_id":"$City",
        "total":{
            "$sum":1
        }
      }
  }
])

SELECT city,
      Count(1)
FROM   customers
GROUP  BY city;

But if we do more complex things like GROUP BY in MongoDB, the Aggregation Framework is required. This is a more complex interface in which we describe how we want to filter, how we want to group, and so on.
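
For instance, combining filtering, grouping, and sorting in one pipeline (reusing the customers example above; the equivalent SQL would use WHERE, GROUP BY, and ORDER BY):

db.customers.aggregate([
  { $match: { Country: "Norway" } },
  { $group: { _id: "$City", total: { $sum: 1 } } },
  { $sort: { total: -1 } }
])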

4. Transactions and Consistency (ACID)

The reason for bringing this topic to the table is that, depending on business requirements, the database solution might need to be ACID-compliant. In this game, relational databases are far ahead. An excellent example of ACID requirements is operations that involve money.

Imagine you are building a function to transfer money from one account to another. You might successfully take money from the source account but never credit it to the destination, or credit the destination but never take money out of the source to cover it. These two writes have to either both happen or both not happen to keep our system sane, also known as “all or nothing.”

Prior to the release of 4.0, MongoDB did not support transactions, but it supported atomic operations within a single document. 

That means, from the point of view of one document, the operation will be atomic. If the process changes several documents, and some kind of failure occurs during the change, some of these documents will be changed, and some will not.

This restriction for MongoDB has been lifted with the 4.0 release and onwards. For situations that require atomicity of reads and writes to multiple documents (in single or multiple collections), MongoDB supports multi-document transactions. It can be used across multiple operations, collections, databases, documents, and shards with distributed transactions.

  • In version 4.0, MongoDB supports multi-document transactions on replica sets.
  • In version 4.2, MongoDB introduces distributed transactions, which add support for multi-document transactions on sharded clusters and incorporate the existing support for multi-document transactions on a Replica Set.
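
Going back to the money-transfer example above, a minimal mongo shell sketch of a multi-document transaction could look like this (the database, collection, and account IDs are made up):

var session = db.getMongo().startSession();
session.startTransaction();
try {
  var accounts = session.getDatabase("bank").getCollection("accounts");
  accounts.updateOne({ _id: "alice" }, { $inc: { balance: -100 } });
  accounts.updateOne({ _id: "bob" }, { $inc: { balance: 100 } });
  session.commitTransaction();   // both writes become visible together
} catch (e) {
  session.abortTransaction();    // neither write is applied
} finally {
  session.endSession();
}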

5.  Scalability

What is scalability in this context? It is how easily you can take a small application and scale it to millions or even billions of users.

If we talk about the scalability of a cluster where our applications are already large enough, it is clear that one machine won’t cope, even if it is the most powerful one.

It also makes sense to talk about whether we scale reads, writes, or data volume. Priorities may differ in different applications, but in general, if the application is very large, they usually have to deal with all of these things.

In MongoDB, the focus was initially on scalability across multiple nodes, even for small applications. We can see this in the sharding feature, released in the early days of the product, which has been developed and has matured since then.

If you are looking for vertical scalability, it can be achieved in MongoDB via a Replica Set configuration. You can scale your database up and down in very few steps, but the point here is that only your availability and reads are scaled; your writes are still tied to a single point, the primary.
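
As a small illustration of scaling reads (not writes), a query can be directed to secondary members with a read preference; the collection and filter here are made up:

db.orders.find({ status: "pending" }).readPref("secondaryPreferred")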

However, we know that at some point the application will demand more write capacity, or the dataset will become too big for the Replica Set; so, it is recommended to scale horizontally by using sharding, splitting the dataset and the writes across multiple shards.
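
A minimal sketch of enabling sharding on a hypothetical database and collection (the names and shard key are illustrative):

use shop
sh.enableSharding("shop")
db.orders.createIndex({ customer_id: 1, order_date: 1 })
sh.shardCollection("shop.orders", { customer_id: 1, order_date: 1 })

The shard key has to be chosen based on the query patterns; as described next, a bad choice leads to most of the sharding problems you will encounter.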

MongoDB sharding has some limitations: not all operations work with it, and a badly designed shard key can decrease query performance, create unevenly distributed data, and impact internal cluster operations such as automatic data splitting, in the worst case demanding a manual re-shard, which is an extensive and error-prone operation.

With the release of MongoDB 5.0, a resharding feature has been introduced. As with any new feature, my recommendation is to test extensively before any production usage. If at some point you are looking for approaches to refine your shard key before resharding with the new feature, the article Refining Shard Keys in MongoDB 4.4 and Above may guide you to a better choice.

6.  Administration

Administration is all those things that developers don’t think about, or at least it is not their first priority. Administration is all about the need to back up, update, monitor, and restore an application in case of failure.

MongoDB is more focused on the standard way: administration is minimized. But it is clear that this happens at the expense of flexibility. The community of open source solutions for MongoDB is significantly smaller. You can notice it in the DB-Engines Ranking highlighted at the beginning of this article and in the annual StackOverflow survey; MongoDB is undoubtedly the most popular NoSQL database, but unfortunately, it lacks a strong ecosystem of open source tooling around it.

Additionally, many recommended practices in MongoDB are quite rigidly tied to Ops Manager and Atlas, which are MongoDB’s commercial platforms.

Until recently, running backup/restore routines was not a trivial operation for a Sharded Cluster or Replica Set. DBAs had to rely on methods built around the mongodump/mongorestore tools or on file system snapshots.

This scenario started to get better with features like Percona Hot-Backup and the Percona Backup for MongoDB tool. 

If we look at the most popular relational databases, like MySQL, they are flexible enough and offer many different approaches, with good open-source implementations for everything; this is an area where MongoDB is still weak.

Conclusion

I have discussed a few topics that should help you in your daily routine, providing a broad view of where MongoDB would be beneficial. It’s important to consider that this article was written based on the latest available release, MongoDB 5.0; if you already have a deployment that uses older or deprecated releases, some of the observations and features might not apply.

If you are facing a problem or have a question at a more granular level, please have a look at our blog; we may have written an article about it. We also invite you to check our white paper here, in which we detail more scenarios and cases where MongoDB is a good fit, and where it’s not.

I hope this helps you!

If you have any questions, feel free to reach out over the comment section below.

Useful Resources

Finally, you can reach us through the social networks, our forum, or access our material using the links presented below:

 

Percona Distribution for MongoDB is a freely available MongoDB database alternative, giving you a single solution that combines the best and most important enterprise components from the open source community, designed and tested to work together.

Download Percona Distribution for MongoDB Today!

Aug 10, 2021

PostgreSQL PL/Java – A How-To, Part 2: How to Manipulate and Return Tuples

PostgreSQL PL/Java Manipulate Return Tuples

In the first part of this series, we discussed how to install PL/Java and create a simple class, where we ran a SELECT and returned one row with one column of formatted text. Now it’s time to expand on that and see how to return multiple tuples.
A little disclaimer here; I’m not going to comment much on the Java code because this is not intended to be a Java tutorial. The examples here are just for educational purposes, not intended to be of high performance or used in production!

Returning a Table Structure

This first example will show how we can select and return a table from a PL/Java function. We’ll keep using the table “customer” here and the probing SQL will be:
SELECT * FROM customer LIMIT 10;

I will create a new Java class here called “CustomerResultSet” and the initial code is:
package com.percona.blog.pljava;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.postgresql.pljava.ResultSetHandle;

public class CustomerResultSet implements ResultSetHandle {
	private Connection conn;
	private PreparedStatement stmt;
	private final String m_url = "jdbc:default:connection";
	private final String sql = "select * FROM customer LIMIT 10";

	public CustomerResultSet() throws SQLException {
		conn = DriverManager.getConnection(m_url);
		stmt = conn.prepareStatement(sql);
	}

	@Override
	public void close() throws SQLException {
		stmt.close();
		conn.close();
	}

	@Override
	public ResultSet getResultSet() throws SQLException {
		return stmt.executeQuery();
	}

	public static ResultSetHandle getCustomerLimit10() throws SQLException {
		return new CustomerResultSet();
	}
}

Note that we are implementing the org.postgresql.pljava.ResultSetHandle interface provided by PL/Java. We need it because we are returning a complex object, and the ResultSetHandle interface is appropriate when we don’t need to manipulate the returned tuples.

Now that we are using PL/Java objects, we need to tell the compiler where to find those references. For this first example we need the pljava-api jar, which in my case happens to be pljava-api-1.6.2.jar. If you remember from the first post, I compiled the PL/Java I’m using here, so my JAR file is located at “~/pljava-1_6_2/pljava-api/target/pljava-api-1.6.2.jar” and the compilation command will be:

javac -cp "~/pljava-1_6_2/pljava-api/target/pljava-api-1.6.2.jar" com/percona/blog/pljava/CustomerResultSet.java
jar -c -f /app/pg12/lib/pljavaPart2.jar com/percona/blog/pljava/CustomerResultSet.class

With my new JAR file created, I can then install it into Postgres and create the function “getCustomerLimit10()“:

SELECT sqlj.install_jar( 'file:///app/pg12/lib/pljavaPart2.jar', 'pljavaPart2', true );
SELECT sqlj.set_classpath( 'public', 'pljavaPart2' );
CREATE OR REPLACE FUNCTION getCustomerLimit10() RETURNS SETOF customer AS 'com.percona.blog.pljava.CustomerResultSet.getCustomerLimit10' LANGUAGE java;

The result of the function call is:
test=# SELECT * FROM getCustomerLimit10();
 customer_id | store_id | first_name | last_name |                email                | address_id | activebool | create_date |     last_update     | active 
-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+---------------------+--------
           1 |        1 | MARY       | SMITH     | MARY.SMITH@sakilacustomer.org       |          5 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           2 |        1 | PATRICIA   | JOHNSON   | PATRICIA.JOHNSON@sakilacustomer.org |          6 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           3 |        1 | LINDA      | WILLIAMS  | LINDA.WILLIAMS@sakilacustomer.org   |          7 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           4 |        2 | BARBARA    | JONES     | BARBARA.JONES@sakilacustomer.org    |          8 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           5 |        1 | ELIZABETH  | BROWN     | ELIZABETH.BROWN@sakilacustomer.org  |          9 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           6 |        2 | JENNIFER   | DAVIS     | JENNIFER.DAVIS@sakilacustomer.org   |         10 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           7 |        1 | MARIA      | MILLER    | MARIA.MILLER@sakilacustomer.org     |         11 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           8 |        2 | SUSAN      | WILSON    | SUSAN.WILSON@sakilacustomer.org     |         12 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           9 |        2 | MARGARET   | MOORE     | MARGARET.MOORE@sakilacustomer.org   |         13 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
          10 |        1 | DOROTHY    | TAYLOR    | DOROTHY.TAYLOR@sakilacustomer.org   |         14 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
(10 rows)

test=#

Manipulating the Result Before Returning

Returning the result of a plain SQL query has its uses, such as visibility/permission control, but we usually need to manipulate the results of a query before returning them, and to do this we can implement the “org.postgresql.pljava.ResultSetProvider” interface.
I will implement a simple method to anonymize sensitive data with a hash function in the following example. I’ll also create a helper class to deal with the hash and cryptographic functions to keep the CustomerResultSet class clean:
/**
 * Crypto helper class that will contain all hashing and cryptographic functions
 */
package com.percona.blog.pljava;

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Crypto {
	MessageDigest digest;
	
	public Crypto() throws NoSuchAlgorithmException {
		digest = MessageDigest.getInstance("SHA-256");
	}
	
	public String bytesToHex(byte[] hash) {
		StringBuilder hexString = new StringBuilder(2 * hash.length);
		for (int i = 0; i < hash.length; i++) {
			String hex = Integer.toHexString(0xff & hash[i]);
			if (hex.length() == 1) {
				hexString.append('0');
			}
			hexString.append(hex);
		}
		return hexString.toString();
	}
	
	public String encode(String data, int min, int max) {
		double salt = Math.random();
		int sbstring = (int) ((Math.random() * ((max - min) + 1)) + min);

		return bytesToHex(digest.digest((data + salt).getBytes(StandardCharsets.UTF_8))).substring(0, sbstring);
	}
}

/**
 * CustomerHash class
 */
package com.percona.blog.pljava;

import java.security.NoSuchAlgorithmException;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

import org.postgresql.pljava.ResultSetProvider;

public class CustomerHash implements ResultSetProvider {
	private final Connection conn;
	private final PreparedStatement stmt;
	private final ResultSet rs;
	private final Crypto crypto;
	
	private final String m_url = "jdbc:default:connection";

	public CustomerHash(int id) throws SQLException, NoSuchAlgorithmException {
		String query;
		
		crypto = new Crypto();
		query = "SELECT * FROM customer WHERE customer_id = ?";
		conn = DriverManager.getConnection(m_url);
		stmt = conn.prepareStatement(query);
		stmt.setInt(1, id);
		rs = stmt.executeQuery();
	}
	
	@Override
	public void close() throws SQLException {
		stmt.close();
		conn.close();
	}
	
	@Override
	public boolean assignRowValues(ResultSet receiver, int currentRow) throws SQLException {		
		if (!rs.next())
			return false;
		
		try {
			receiver.updateInt(1, rs.getInt("customer_id"));
			receiver.updateInt(2, rs.getInt("store_id"));
			receiver.updateString(3, crypto.encode(rs.getString("first_name"), 5, 45));
			receiver.updateString(4, crypto.encode(rs.getString("last_name"), 5, 45));
			receiver.updateString(5, crypto.encode(rs.getString("email"), 5, 41) + "@mail.com");
			receiver.updateInt(6, rs.getInt("address_id"));
			receiver.updateBoolean(7, rs.getBoolean("activebool"));
			receiver.updateDate(8, rs.getDate("create_date"));
			receiver.updateTimestamp(9, rs.getTimestamp("last_update"));
			receiver.updateInt(10, rs.getInt("active"));
			
		} catch (Exception e) {
			Logger.getAnonymousLogger().log(Level.parse("SEVERE"), e.getMessage());
		}
		return true;
	}
	
	public static ResultSetProvider getCustomerAnonymized(int id) throws SQLException, NoSuchAlgorithmException {
		return new CustomerHash(id);
	}

}

The number of classes is increasing, so instead of mentioning them one by one, let’s just use the “*.java” wildcard to compile the classes and “*.class” to create the jar:
javac -cp "~/pljava-1_6_2/build/pljava-api-1.6.2.jar" com/percona/blog/pljava/*.java
jar -c -f /app/pg12/lib/pljavaPart2.jar com/percona/blog/pljava/*.class

Remember that every time we change our JAR file we need to also reload it into Postgres. Check the next example and you’ll see that I’m reloading the JAR file, creating and testing our new function/method:
test=# SELECT sqlj.replace_jar( 'file:///app/pg12/lib/pljavaPart2.jar', 'pljavaPart2', true );
 replace_jar 
-------------
 
(1 row)

test=# CREATE OR REPLACE FUNCTION getCustomerAnonymized(int) RETURNS SETOF customer AS 'com.percona.blog.pljava.CustomerHash.getCustomerAnonymized' LANGUAGE java;
CREATE FUNCTION

test=# SELECT * FROM getCustomerAnonymized(9);
 customer_id | store_id |     first_name      |              last_name              |                  email                  | address_id | activebool | create_date |     last_update     | ac
tive 
-------------+----------+---------------------+-------------------------------------+-----------------------------------------+------------+------------+-------------+---------------------+---
-----
           9 |        2 | 72e2616ef0075e81929 | 3559c00ee546ae0062460c8faa4f24960f1 | 24854ed40ed42b57f077cb1cfaf916@mail.com |         13 | t          | 2006-02-14  | 2006-02-15 09:57:20 |   
   1
(1 row)

test=#

Great! We now have a method to anonymize data!

Triggers

The last topic of this second part will be about “triggers”, and to make it a bit more interesting we will create a trigger to encrypt the sensitive data of our table. The anonymization using the hash function in the previous example is great, but what happens if we have unauthorized access to the database? The data is saved in plain text!
To keep this example as small as possible I won’t bother with securing the keys, as we will do that in part three of this series, when we will use Java to access external resources and use Vault to secure our keys, so stay tuned!
Ok, the first thing we need to do is to create the pair of keys we need to encrypt/decrypt our data. I’ll use “OpenSSL” to create them and store them in a table named “keys”!
openssl genrsa -out keypair.pem 2048
openssl pkcs8 -topk8 -nocrypt -in keypair.pem -outform PEM -out private.pem
openssl rsa -in keypair.pem -outform PEM -pubout -out public.pem

Now that we have the keys, we need to sanitize the key files to remove the header and footer data from both the private and public keys, and also remove all line breaks, or else our Java code will complain:
echo -n "CREATE TABLE keys(id int primary key, priv varchar not null, pub varchar not null); INSERT INTO keys VALUES(1, '" > keys.sql
cat private.pem | sed '1d;$d' | sed ':a;N;$!ba;s/\n//g' | tr -d '\n' >> keys.sql
echo -n "', '" >> keys.sql
cat public.pem | sed '1d;$d' | sed ':a;N;$!ba;s/\n//g' | tr -d '\n' >> keys.sql
echo -n "');" >> keys.sql

psql test < keys.sql

It will look like this when sanitized:
CREATE TABLE keys(id int primary key, priv varchar not null, pub varchar not null); INSERT INTO keys VALUES(1, 'MIIEvAIBADANBgkqhkiG9w0BAQEFAASCBKYwggSiAgEAAoIBAQCiAA4BE64JZpXwIGfsUanyL//drIcFZ1cmiCW6zWOxc6nL8AQ33MPyQup8g/ociJFGn/eEEYOvRMV2pVNo3qB3VQU4WHRWkq22x7mRfuhHmAnAJA3dic5fiJ1aCQgo7tEqlGPc0WqL+jMUXh6Wmktq1kDZagUGJRorw0f5Iaj60PycbGtgKaKDc4VHepkN1jl0rhpJBzjBehuvB88LLXJ/cHsMOp3q569jLsHtqymCA2wP68ldtfKtOowPW9togIUmgWY0Z2lWlefrlzmT2g3L/oYbPUxCmptOAMFD8NajdA518ohZAC8SPfUsD4CwL89oPrMZlX4RkTuc5UvBHiKrAgMBAAECggEAcJl5ImZ7YS1cqjrcAPYCGcQjJAD3GFpryOx4zQ5VbNHoA0ggpnNb/tdkBIf3ID4MO/qUH8fMr9YtKfpfr1SOVGNT7YYN1t68v36zDN4YtSqIHHTy7jkKqHxcYmhEs67K072wa5tjY0fUmSOSPzufj/K7wGJge5TuS9y/+fnbafkdfW/yz3X2YXL6T/jfjqI4h+P7Nhh5hlpD1KZfEWTAY5B5tBoLc4xaTIB8FTLclVWw3CAW8h60EwUAkyxCSbrP2I1FCrWsV6hJGy8U+hUQJUpyDdum9ZC1oAVewRrCkSH0ZP1XaQifDZoRv/1N7cCbQqaLJaVk4rzVOEv0DoCEAQKBgQDOMPMm2ioEredCx0hfmHWWayGA5as7VPDSzv1QH3g4AdjZFf2YxctXGNJNMpfqVvFiQCWxp9NpjUPpODRbmR2J+7tYgx3B445zDeXdBH2JTKhUgNTHjL6DrM6FTI3yaSsSJ77L0mDcFQ42nfWtfqkZd5lYfwiVC0eL86bp408+YQKBgQDJIks6RqVlDbHerIIqT1ToN+CQ+BCjx/Z6sk4BFIKBB8sU8VyVkZlvQpFHvT06oE/1NbyiQ3nVufGrm0kwMqx7MXGiA670E1Q+Q/mQ12uRByUlgd+LW4mp1Y6tln1lpP5pVqUOC/jtnXYQmEReU4Ye24E4AZhFU23J+oYoh3XEiwKBgEJFaWFrbWXjnxjPhGt1TRXziOks6ERBoMWg0boW40TdEx1y+/dGW3y69ZzqTfl7yEmT5ImdL04VoWYsMmfeZqgayLRCMCZJRVeld+P5tX+Tq+a9Iaahjfo0aIxfdqAbPUSwkZphG9Cg09iqHHSO6TrOPfM7oT6GSZCp11QFQ0sBAoGAeABi+8D8mx8hmWY5Pv8X/HiiHjwyyVTbpPbO/Wv8NPmuW69per9k2PHRdgjdCCZvrjBCfFlfznljS+yZLQ1+xP2J+4zRDESgBYpO0vED94JY0lj7Q8z4hICq4Lyh0kwvki+kyI2yFirVLy950wFoSu7R2NVywSH2pgQ3mOTBCeMCgYBL5KIRf1qwsCYaCggPls4pWKMjfxxO915h26/aaniEYaTNnhXRSRwkVOWoGHoUKfrqQdrvj/y5lgezn7mZM0CvnB6ZkGwDXxpcIYUnhR1Lnp3HNSqfigg+WjQASVCKuq3YUri3p+KQkrpED/O3B4FJW2Q4IReEuREEsKNkeH96ew==', 'MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAogAOAROuCWaV8CBn7FGp8i//3ayHBWdXJoglus1jsXOpy/AEN9zD8kLqfIP6HIiRRp/3hBGDr0TFdqVTaN6gd1UFOFh0VpKttse5kX7oR5gJwCQN3YnOX4idWgkIKO7RKpRj3NFqi/ozFF4elppLatZA2WoFBiUaK8NH+SGo+tD8nGxrYCmig3OFR3qZDdY5dK4aSQc4wXobrwfPCy1yf3B7DDqd6uevYy7B7aspggNsD+vJXbXyrTqMD1vbaICFJoFmNGdpVpXn65c5k9oNy/6GGz1MQpqbTgDBQ/DWo3QOdfKIWQAvEj31LA+AsC/PaD6zGZV+EZE7nOVLwR4iqwIDAQAB');

Once done populating the table, we should have a nice table with both the private and public keys. Now it is time to create our Java classes. I’ll reuse the “Crypto” class for the cryptographic functions and create a new class to hold our trigger functions. I’ll only add the relevant part of the Crypto class here, but you can find the full code on my GitHub page[1], including Part One (and Part Three when released). Let’s get to the code:
/**
 * This is the relevant part of the Crypto class that will encrypt and decrypt our data using the certificates we generated above.
 */
	public static PublicKey getPublicKey(String base64PublicKey) {
		PublicKey publicKey = null;
		try {
			X509EncodedKeySpec keySpec = new X509EncodedKeySpec(Base64.getDecoder().decode(base64PublicKey.getBytes()));
			KeyFactory keyFactory = KeyFactory.getInstance("RSA");
			publicKey = keyFactory.generatePublic(keySpec);
			return publicKey;
		} catch (NoSuchAlgorithmException e) {
			e.printStackTrace();
		} catch (InvalidKeySpecException e) {
			e.printStackTrace();
		}
		return publicKey;
	}

	public static PrivateKey getPrivateKey(String base64PrivateKey) {
		PrivateKey privateKey = null;
		PKCS8EncodedKeySpec keySpec = new PKCS8EncodedKeySpec(Base64.getDecoder().decode(base64PrivateKey.getBytes()));
		KeyFactory keyFactory = null;
		try {
			keyFactory = KeyFactory.getInstance("RSA");
		} catch (NoSuchAlgorithmException e) {
			e.printStackTrace();
		}
		try {
			privateKey = keyFactory.generatePrivate(keySpec);
		} catch (InvalidKeySpecException e) {
			e.printStackTrace();
		}
		return privateKey;
	}

	public static String encrypt(String data, PublicKey publicKey) throws BadPaddingException, IllegalBlockSizeException,
			InvalidKeyException, NoSuchPaddingException, NoSuchAlgorithmException {
		Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
		cipher.init(Cipher.ENCRYPT_MODE, publicKey);
		return Base64.getEncoder().encodeToString(cipher.doFinal(data.getBytes()));
	}
	
	public static String decrypt(String data, PrivateKey privateKey) throws NoSuchPaddingException,
			NoSuchAlgorithmException, InvalidKeyException, BadPaddingException, IllegalBlockSizeException {
		Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
		cipher.init(Cipher.DECRYPT_MODE, privateKey);
		return new String(cipher.doFinal(Base64.getDecoder().decode(data)));
	}

Now we can implement the class with both functions – the trigger function to encrypt and a function to decrypt when we need to SELECT the data:
package com.percona.blog.pljava;

import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.crypto.BadPaddingException;
import javax.crypto.IllegalBlockSizeException;
import javax.crypto.NoSuchPaddingException;

import org.postgresql.pljava.ResultSetProvider;
import org.postgresql.pljava.TriggerData;

public class CustomerCrypto implements ResultSetProvider {
	private final String m_url = "jdbc:default:connection";
	private final Connection conn;
	private PreparedStatement stmt;
	private ResultSet rs;

	//
	private PrivateKey privateKey;
	private PublicKey publicKey;

	public CustomerCrypto() throws SQLException, NoSuchAlgorithmException {
		String query;

		query = "SELECT * FROM keys WHERE id = 1";
		conn = DriverManager.getConnection(m_url);
		stmt = conn.prepareStatement(query);
		rs = stmt.executeQuery();
		if (!rs.next())
			throw new SQLException("Keys not found!");

		privateKey = Crypto.getPrivateKey(rs.getString("priv"));
		publicKey = Crypto.getPublicKey(rs.getString("pub"));
	}
	
	public void processQuery(int id) throws SQLException, NoSuchAlgorithmException {
		String query;
		query = "SELECT * FROM customer WHERE customer_id = ?";
		stmt = conn.prepareStatement(query);
		stmt.setInt(1, id);
		rs = stmt.executeQuery();
	}

	@Override
	public void close() throws SQLException {
		stmt.close();
		conn.close();
	}

	public static int getLineNumber() {
		return Thread.currentThread().getStackTrace()[2].getLineNumber();
	}

	@Override
	public boolean assignRowValues(ResultSet receiver, int currentRow) throws SQLException {
		if (!rs.next())
			return false;

		try {
			receiver.updateInt(1, rs.getInt("customer_id"));
			receiver.updateInt(2, rs.getInt("store_id"));
			receiver.updateString(3, Crypto.decrypt(rs.getString("first_name"), this.privateKey));
			receiver.updateString(4, Crypto.decrypt(rs.getString("last_name"), this.privateKey));
			receiver.updateString(5, Crypto.decrypt(rs.getString("email"), this.privateKey));
			receiver.updateInt(6, rs.getInt("address_id"));
			receiver.updateBoolean(7, rs.getBoolean("activebool"));
			receiver.updateDate(8, rs.getDate("create_date"));
			receiver.updateTimestamp(9, rs.getTimestamp("last_update"));
			receiver.updateInt(10, rs.getInt("active"));

		} catch (Exception e) {
			Logger.getAnonymousLogger().log(Level.parse("SEVERE"), e.getMessage());
		}
		return true;
	}
	
	private void encryptData(TriggerData td) throws InvalidKeyException, BadPaddingException, IllegalBlockSizeException, NoSuchPaddingException, NoSuchAlgorithmException, SQLException {
		ResultSet _new = td.getNew();
		
		_new.updateString("first_name", Crypto.encrypt(_new.getString("first_name"), this.publicKey));
		_new.updateString("last_name", Crypto.encrypt(_new.getString("last_name"), this.publicKey));
		_new.updateString("email", Crypto.encrypt(_new.getString("email"), this.publicKey));
	}
	
	public static void customerBeforeInsertUpdate(TriggerData td) throws SQLException, InvalidKeyException, BadPaddingException, IllegalBlockSizeException, NoSuchPaddingException, NoSuchAlgorithmException {
		CustomerCrypto ret = new CustomerCrypto();
		ret.encryptData(td);
	}

	public static ResultSetProvider getCustomerCrypto(int id) throws SQLException, NoSuchAlgorithmException {
		CustomerCrypto ret = new CustomerCrypto();
		ret.processQuery(id);
		
		return ret;
	}

}

The relevant parts of the code above are the "customerBeforeInsertUpdate" and "encryptData" methods, the former being the static method the database will access. PL/Java expects to find a static method with a "void (TriggerData)" signature. It will call the "encryptData" method of the "CustomerCrypto" object to do the job. The "encryptData" method recovers the resultset from the "NEW" pointer that is passed through the "TriggerData" object and then changes the values to encrypt the data. We need to fire the trigger on the "BEFORE" event because we need to encrypt the data before it is persisted.
Another important method is "getCustomerCrypto". We need to be able to get the data back decrypted, and this method helps us do that. Here, we use the same technique we used in the previous example, where we implemented the "ResultSetProvider" interface and manipulated the data before returning the resultset. Take a closer look at the "assignRowValues" method and you'll see that we are decrypting the data there with the "Crypto.decrypt" method!
Ok, time to compile the code and check if it really works:
javac -cp "/v01/proj/percona/blog/pljava/pljava-1_6_2/build/pljava-api-1.6.2.jar" com/percona/blog/pljava/*.java
jar -c -f /app/pg12/lib/pljavaPart2.jar com/percona/blog/pljava/*.class

And create the database objects:
SELECT sqlj.replace_jar( 'file:///app/pg12/lib/pljavaPart2.jar', 'pljavaPart2', true );

CREATE FUNCTION customerBeforeInsertUpdate()
			RETURNS trigger
			AS 'com.percona.blog.pljava.CustomerCrypto.customerBeforeInsertUpdate'
			LANGUAGE java;

CREATE TRIGGER tg_customerBeforeInsertUpdate
			BEFORE INSERT OR UPDATE ON customer
			FOR EACH ROW
			EXECUTE PROCEDURE customerBeforeInsertUpdate();

At this point, our data isn’t encrypted yet but we can do it with a noop update and the trigger will do its magic:
test=# SELECT * FROM customer LIMIT 3;
 customer_id | store_id | first_name | last_name |                email                | address_id | activebool | create_date |     last_update     | active 
-------------+----------+------------+-----------+-------------------------------------+------------+------------+-------------+---------------------+--------
           1 |        1 | MARY       | SMITH     | MARY.SMITH@sakilacustomer.org       |          5 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           2 |        1 | PATRICIA   | JOHNSON   | PATRICIA.JOHNSON@sakilacustomer.org |          6 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
           3 |        1 | LINDA      | WILLIAMS  | LINDA.WILLIAMS@sakilacustomer.org   |          7 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
(3 rows)

test=# UPDATE customer SET first_name = first_name, last_name = last_name, email = email;
UPDATE 599

test=# SELECT * FROM customer LIMIT 3;
 customer_id | store_id |                                                                                                                                                                       
 first_name                                                                                                                                                                        |            
                                                                                                                                                            last_name                           
                                                                                                                                              |                                                 
                                                                                                                         email                                                                  
                                                                                                         | address_id | activebool | create_date |        last_update         | active 
-------------+----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------+------------+------------+-------------+----------------------------+--------
           3 |        1 | DT8oXb0VQvtFSIOv61zZfmUrpjWWGoeyl8D8tQl7naCLT31WlJn3U+uILYqUedSOdDSO17QdQKwChWG+DrcvYEih9RYyEPR2ja9deN4cn+vwHt/v09HDfmwrsJUt0UgP/fp78hCxkJDAV50KkMUsA23aeH5HRn9nCHOH0P
AcuId+7acCgwvU9YP8Sx2KVeVnLaBrzpeRLrsmAczQLUAXilXfdFC8uT2APBfwx1So2eCe+kSOsjcu1yTwlsa95Dnfu/N++Zm1D4knKUrAuNm5svTHjIz+B4HKXFMPz/Yk7KYF6ThB6OshIomyRvSEtKu5torfdwAvT3tsgP2DLWiKgQ== | H0YRoi10z36
tnNSXpBs/oYfMQRbAhfUYLIcE885Dhxmy2mbuhecCCqPcye5/++MhUwmEQG2pBgfsqWHLOnAgbqjaG3O0reipVysYK7cMysX1w5RVINsyD5H3vCqgnHESfdRuhW3b00InkR2qCtBYX1QJ1tKJZz89D2AOjoTq5jTum00vcLT06h6ZxVh1RKLNAuGpY9qN57m
/9a4JZuff9poYjw2PPQ6kTxhtbFl3bw+B3sJUcLFuFMYUoAAHsVETQRAerH1ncG9Uxi+xQjUtTVBqZdjvED+eydetH7vsnjBuYDtXD9XAn14qmALx5NfvwpU5jfpMOPOM4xP1BRVA2Q== | DpWBfhhii4LRPxZ9XJy8xoNne+qm051wD5Gd9AMHc+oIhx/B
ln6H+lcAM3625rKN1Vw/lG6VkQo2EnoZz/bhFtULvAOAUiBxerBDbYe0lYWqI50NxnFJbkexMkjDSiuoglh3ilRBn6Z+WGLc7FfEprOd1+tULW2gcwLI68uoEhfSY7INQZuGXfOUMAM4roB2fWvEfylL1ShbiGTRjX7KGXQbXLJtm7xel8J2VhdCecXzxzY2
Mtnu3EXGNpFy9atTXxE/fI0C5AX/u2FDZiOHz9xV7sB3atcqAeXwJB0smpBnPbwI3BN+ptzsWnhyFNNS+ol4QayMDgFhi/tp2+lCAQ== |          7 | t          | 2006-02-14  | 2021-08-08 19:10:29.337653 |      1
           4 |        2 | jo3zKr6lJ5zMN5f3/BPgENhs9CdzDu7F/uFxAf6uO9MAKc7X+++ipk/OBbvvA8vpaJ7DTgph9LshRvpYIqPMwS6IubkScDOSRDoLBNW9z2oMF3dB46R86LK0pTEVVrGaddjnPzaAEh7Uwzy3LncC1y7pJqGKW1b3RGUE8n
4SgstMo/4tRNUe/AUcPn9FXkCoc5jFvn8gPwVoTRKwhYu0oTco2gNKZs1rmFdmkmobGaMjZuhWyWG2PO1oXIVPkpgILmD42yAMDxWkS4DVvgJChXJRukjBzfOitsQwHapjqPqf/q3mfBaQzNUwROcSfGBe6KlI5yfjWU309KRKaCYWNQ== | MMhPovG/N3k
Xjou6kS9V7QtEvlA5NS8/n62KVRVVGEnsh5bhwEhBZxlK72AQv8e4niATWPEiJJU6i7Z08NkU5FWNIvuWwlGTdEEW+kK7XQXib6cNAdnmo4RH72SWhDxEp3tMwwoZif2932H8WDEbNdP6bCP69ekBA7Z+nGtXaeh+H9BAaLf1e6XunBj2YN7zs4sFWB2Kxs2
IugLWd9g9677BWzUeJIzpJfVLro4HYlzASh9AMKb8wPRU0LlEpxtqUdejj7IY5M1hVhDTCCLSQjSqJscqzG1pYQ04W7KNdGwWxJPMjvyPC2K4H+HQuW0IWVjvFpyYd/5q1eIQX+vjdw== | oF4nyIjtVtWuydg6QiSg3BDadWe48nhbBEZSLrR5GVigA768
E3n1npN6sdstzG7bRVnphtfwIZwQ3MUFURWCbUCe0VqioNgVXFhiRvr3DAw2AH64ss/h65B2U5whAygnq4kiy5JvPD0z0omtfs9913QeoO+ilvPVLEc0q3n0jD9ZQlkNVfHSytx1NY86gWnESquTVhkVQ55QDV8GY70YLX9V6nU7ldu+zpNLmf2+rfpxqbRC
i16jnHGDcTT7CKeq+AxbiJDeaaAmSPpxTZsrX4sXFW4rpNtSmOyuyHZziy8rkN8xSpyhvrmxjC7EYe4bn6L/+hay108Wn0BSFYe2ow== |          8 | t          | 2006-02-14  | 2021-08-08 19:10:29.337653 |      1
<...>
(3 rows)

test=#

Awesome, we get our data encrypted! What about the “decrypt” part of the class? Let’s check it out:
test=# CREATE OR REPLACE FUNCTION getCustomerCrypto(int) RETURNS SETOF customer AS 'com.percona.blog.pljava.CustomerCrypto.getCustomerCrypto' LANGUAGE java;
CREATE FUNCTION

test=# SELECT * FROM getCustomerCrypto(10);
 customer_id | store_id | first_name | last_name |               email               | address_id | activebool | create_date |     last_update     | active 
-------------+----------+------------+-----------+-----------------------------------+------------+------------+-------------+---------------------+--------
          10 |        1 | DOROTHY    | TAYLOR    | DOROTHY.TAYLOR@sakilacustomer.org |         14 | t          | 2006-02-14  | 2006-02-15 09:57:20 |      1
(1 row)

test=#

Worked like a charm! Here we finish part two and at this point, we are able to query and manipulate objects inside of our database. The next and last article of this series will cover external calls, and we’ll see how to use external resources from PL/Java. Don’t miss it!

[1] https://github.com/elchinoo/blogs/tree/main/pljava

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!

Aug
10
2021
--

Impact of Network and Cursor on Query Performance of PostgreSQL

Query Performance of PostgreSQL

Many times, we see PostgreSQL users getting confused about the query/statement duration reported in PostgreSQL logs. Other PostgreSQL tools like pgBadger present the same data based on the log file, which further increases the confusion. Knowing the full impact of network-related overhead and cursors is important not only to alleviate the confusion but also to get the best performance.

One might ask, "Why discuss network overhead and cursors specifically?". Well, they are the hidden costs during and after the query execution. Once the query execution starts and there is some data to be sent to the client, these are the factors that mainly affect the performance. The important point to keep in mind is that since all of this happens outside the query execution itself, the corresponding information will not be available through EXPLAIN (ANALYZE).

Impact of Network

For the demonstration, I am using a query based on pgBench tables.

select a.bid, b.cnt from pgbench_accounts a,
   (select bid,count(*) cnt from pgbench_accounts group by bid) b
where b.bid > a.bid;

There is no particular significance to this query; it is simply a random query that takes some time to execute in the database.

In order to capture the Explain Analyze output, auto_explain is used. Settings are made to capture all statements which take more than 250ms.

ALTER SYSTEM SET auto_explain.log_min_duration = 250;

A few other settings to capture additional details with EXPLAIN ANALYZE are:

ALTER SYSTEM SET auto_explain.log_analyze = on;
ALTER SYSTEM SET auto_explain.log_buffers=on;
ALTER SYSTEM SET auto_explain.log_timing=on;
ALTER SYSTEM SET auto_explain.log_verbose=on;
ALTER SYSTEM SET auto_explain.log_triggers=on;
ALTER SYSTEM SET auto_explain.log_wal=on;
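
Keep in mind that auto_explain is a loadable module, so it has to be preloaded before any of the settings above take effect, and ALTER SYSTEM changes only apply after a configuration reload. This is a minimal sketch, assuming preloading it for new sessions (rather than restarting the server for shared_preload_libraries) is acceptable in your environment:

-- Load auto_explain for every new session and apply the ALTER SYSTEM changes above.
ALTER SYSTEM SET session_preload_libraries = 'auto_explain';
SELECT pg_reload_conf();

-- In a new session, the module's parameters should now be visible:
SELECT name, setting FROM pg_settings WHERE name LIKE 'auto_explain.%';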

In order to illustrate the difference, the same query will be executed from:

1. The database Host Server

2. The application Host Server that is connected over a network

Queries Returning a Large Number of Rows

Case 1. Executing on the Database Host Itself

Following are the few lines from PostgreSQL logs generated by auto_explain:

2021-08-02 03:27:56.347 UTC [25537] LOG:  duration: 1591.784 ms  plan:
        Query Text: select a.bid, b.cnt from pgbench_accounts a,
           (select bid,count(*) cnt from pgbench_accounts group by bid) b
        where b.bid > a.bid;
        Nested Loop  (cost=12322.13..63020.46 rows=833333 width=12) (actual time=119.739..1069.924 rows=1000000 loops=1)
          Output: a.bid, b.cnt
          Join Filter: (b.bid > a.bid)
          Rows Removed by Join Filter: 1500000
          ...

As we can see, the outer nested loop of the query was completed in 1069.924 ms returning 1 million records, but the total duration of the query is reported as 1591.784 ms. What could be the difference?

A straight EXPLAIN ANALYZE shows that the planning time is sub-millisecond for this simple query, where the data is coming from a single table without any index. So the planning time shouldn't be the reason.
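
If you want to verify this yourself, a plain EXPLAIN (ANALYZE) of the same query prints both numbers explicitly; a minimal sketch (output omitted here):

EXPLAIN (ANALYZE, BUFFERS)
select a.bid, b.cnt from pgbench_accounts a,
   (select bid,count(*) cnt from pgbench_accounts group by bid) b
where b.bid > a.bid;
-- The output ends with separate "Planning Time:" and "Execution Time:" lines,
-- and for this query the planning time is a tiny fraction of the execution time.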

Case 2. Executing from a Remote Application Host

Again the information from the PostgreSQL log looks different:

2021-08-02 04:08:58.955 UTC [25617] LOG:  duration: 6568.659 ms  plan:
        Query Text: select a.bid, b.cnt from pgbench_accounts a,
           (select bid,count(*) cnt from pgbench_accounts group by bid) b
        where b.bid > a.bid;
        Nested Loop  (cost=12322.13..63020.46 rows=833333 width=12) (actual time=140.644..1069.153 rows=1000000 loops=1)
          Output: a.bid, b.cnt
          Join Filter: (b.bid > a.bid)
          Rows Removed by Join Filter: 1500000
          ...

As we can see, the statement duration jumped to 6568.659 ms, even though the actual execution of the query remained pretty much the same at 1069.153 ms. That's a huge difference. What could be the reason?

Queries Returning a Smaller Number of Rows

The above-mentioned query can be slightly modified to return only the max values. The modified test query may look like this:

select max(a.bid), max(b.cnt) from pgbench_accounts a,
   (select bid,count(*) cnt from pgbench_accounts group by bid) b
where b.bid > a.bid ;

The query plan and time don't change much other than the additional aggregate. Even if there were a change in the plan, it would not be relevant for the topic we are discussing, because we are considering only the time difference between the outer node of the query execution and the duration reported by PostgreSQL.

Case 1: Executing on the Database Host Itself

2021-08-03 06:58:14.364 UTC [28129] LOG:  duration: 1011.143 ms  plan:
        Query Text: select max(a.bid), max(b.cnt) from pgbench_accounts a,
           (select bid,count(*) cnt from pgbench_accounts group by bid) b
        where b.bid > a.bid ;
        Aggregate  (cost=67187.12..67187.13 rows=1 width=12) (actual time=1010.914..1011.109 rows=1 loops=1)
          Output: max(a.bid), max(b.cnt)
          Buffers: shared hit=12622 read=3786
          ->  Nested Loop  (cost=12322.13..63020.46 rows=833333 width=12) (actual time=135.635..908.315 rows=1000000 loops=1)
                Output: a.bid, b.cnt
                Join Filter: (b.bid > a.bid)
                Rows Removed by Join Filter: 1500000
                Buffers: shared hit=12622 read=3786

As we can see, there is not much difference between the completion of the query (1011.109 ms) and the duration reported (1011.143 ms). The observations so far indicate that the extra time is consumed when a large number of rows is returned.

Case 2: Executing the Statement from the Remote Host

2021-08-03 06:55:37.221 UTC [28111] LOG:  duration: 1193.387 ms  plan:
        Query Text: select max(a.bid), max(b.cnt) from pgbench_accounts a,
           (select bid,count(*) cnt from pgbench_accounts group by bid) b
        where b.bid > a.bid ;
        Aggregate  (cost=67187.12..67187.13 rows=1 width=12) (actual time=1193.139..1193.340 rows=1 loops=1)
          Output: max(a.bid), max(b.cnt)
          Buffers: shared hit=11598 read=4810
          ->  Nested Loop  (cost=12322.13..63020.46 rows=833333 width=12) (actual time=124.508..1067.409 rows=1000000 loops=1)
                Output: a.bid, b.cnt
                Join Filter: (b.bid > a.bid)
                Rows Removed by Join Filter: 1500000
                Buffers: shared hit=11598 read=4810

Again, there is not much difference: 1193.340 vs. 1193.387 ms. Overall, I feel safe concluding from these results that if the data transfer is minimal, having the application server on a different host machine doesn't make much difference; meanwhile, the impact is huge if a lot of result data has to be transferred.

Analyzing the Wait Events

Luckily, newer versions of PostgreSQL provide us with an excellent way to monitor the "wait event" information for each session/connection through the pg_stat_activity view.
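
For example, a single live sample can be taken directly from pg_stat_activity; this is a minimal sketch, and in practice you need many repeated samples, which is what the pg_gather script mentioned next automates:

SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE state = 'active' AND pid <> pg_backend_pid();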

At Percona Support we use a script from the pg_gather snippet to gather performance information, including the wait events, by collecting multiple samples. The script collects wait event samples at 10ms intervals, so there will be 2000 samples in 20 seconds. Apart from the UI, the gathered information can also be analyzed using backend queries.

The following is what I could see about PID: 25617 (the case of returning a large number of rows to remote host).

postgres=# select pid,wait_event,count(*) from pg_pid_wait where pid=25617 group by 1,2 order by 3 desc;
  pid  |  wait_event  | count 
-------+--------------+-------
 25617 | ClientWrite  |   286
 25617 |              |    75
 25617 | DataFileRead |     3
(3 rows)

The session is spending most of its time on the "ClientWrite" wait event. As per the PostgreSQL documentation:

ClientWrite: Waiting to write data to the client.

It is the time spent writing the data to the client. A NULL wait_event indicates CPU utilization.

Impact of Cursors

Typically, after a query execution, the result data needs to be processed by the application. Cursors are used for holding the result of queries and processing them. The impact on query performance is mainly determined by where the cursor resides: on the PostgreSQL server side or on the client side. The location of the cursor is expected to matter when the query is issued from a separate application host, so I am testing only that case.

Client-Side Cursors

Generally, this is the case with most of the PostgreSQL clients and applications. The data is retrieved fully to the database client end and then processed one by one.

Here is a simple Python snippet (to mimic the application connection) for testing this. (Only the relevant lines were copied from the original script; the imports and a placeholder connection string are added here so the snippet is self-contained.)

import time

import psycopg2

# Placeholder connection string - adjust for your environment.
connectionString = "host=db_host dbname=postgres user=postgres"

conn = psycopg2.connect(connectionString)
cur = conn.cursor()
cur.itersize = 2000
cur.execute("select a.bid, b.cnt from pgbench_accounts a, (select bid,count(*) cnt from pgbench_accounts group by bid) b where b.bid > a.bid")
row = cur.fetchone()
while row is not None:
    print(row)
    time.sleep(0.001)
    row = cur.fetchone()

conn.close()

As we can see, itersize is specified so that only that many records are fetched at a time for processing, and there is a 1-millisecond delay in each iteration of the loop using fetchone().

But none of these affects the server-side query performance because the cursor is already cached on the client-side. The query time and duration reported are similar to executing from a remote application host for a large number of rows. As expected, the impact of the network is clearly visible:

2021-08-03 17:39:17.119 UTC [29180] LOG:  duration: 5447.793 ms  plan:
        Query Text: select a.bid, b.cnt from pgbench_accounts a, (select bid,count(*) cnt from pgbench_accounts group by bid) b where b.bid > a.bid
        Nested Loop  (cost=12322.13..63020.46 rows=833333 width=12) (actual time=130.919..1240.727 rows=1000000 loops=1)
          Output: a.bid, b.cnt
          Join Filter: (b.bid > a.bid)
          Rows Removed by Join Filter: 1500000
          Buffers: shared hit=1647 read=14761
          ->  Seq Scan on public.pgbench_accounts a  (cost=0.00..13197.00 rows=500000 width=4) (actual time=0.086..183.503 rows=500000 loops=1)
                Output: a.aid, a.bid, a.abalance, a.filler
                Buffers: shared hit=864 read=7333

Server-Side Cursors

The way the cursor is created and used changes completely if we have a named cursor that stays on the server side: the statement cur = conn.cursor() is modified to include a name, like cur = conn.cursor('curname').

As the psycopg2 (the Python connector for PostgreSQL) documentation says:

“Psycopg wraps the database server-side cursor in named cursors. A named cursor is created using the cursor() method specifying the name parameter”

Surprisingly, the PostgreSQL log does not give any more information about the query even though auto_explain is configured. No duration information either. There is only a single line of info:

2021-08-03 18:02:45.184 UTC [29249] LOG:  duration: 224.501 ms  statement: FETCH FORWARD 1 FROM "curname"

PostgreSQL cursor supports various FETCH options with custom size specifications. Please refer to the documentation for FETCH for more details. It is up to the language driver/connector to wrap this functionality into corresponding functions.
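
For reference, the same server-side cursor behavior can be reproduced in plain SQL, which is essentially what the driver is doing under the hood; a minimal sketch using the query from the examples above (the cursor name and batch sizes are arbitrary):

BEGIN;
DECLARE curname CURSOR FOR
    SELECT a.bid, b.cnt
    FROM pgbench_accounts a,
         (SELECT bid, count(*) cnt FROM pgbench_accounts GROUP BY bid) b
    WHERE b.bid > a.bid;
FETCH FORWARD 1 FROM curname;      -- execution starts here; rows are produced incrementally
FETCH FORWARD 20000 FROM curname;  -- subsequent fetches with a custom batch size
CLOSE curname;
COMMIT;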

The Python driver for PostgreSQL – psycopg2 – wraps the functionality to run FETCH with a custom batch size, specified as follows:

cur.fetchmany(size=20000)

Which produces a PostgreSQL log entry like:

2021-08-05 05:13:30.728 UTC [32374] LOG:  duration: 262.931 ms  statement: FETCH FORWARD 20000 FROM "curname"

As we expected, the fetch size is increased.

But the important point to note here is: even though the application iterating over a server-side cursor took a hell of a lot of time (running into several minutes), there is almost zero information about the query or the session in the PostgreSQL logs.

Ahh! This could be the weirdest thing someone would expect.

Wait Event Analysis

Again, wait event analysis comes in handy to understand what's happening. Out of the 2000 wait event samples collected by the pg_gather script, the wait events look like this:

pid  |  wait_event  | count 
-------+--------------+-------
 30115 | ClientRead   |  1754
 30115 |   (CPU)      |   245
 30115 | DataFileRead |     1

The time is spent on the "ClientRead" wait event. This means that the server side is waiting for the client to send the next request, so a slow network between the application server and the database server can adversely affect server-side cursors. But no information about the statement will be available in the PostgreSQL logs.

Conclusion

In this blog post, I tried to assess the impact of transferring a large amount of data from a Database Host to an Application Host.

PLEASE NOTE: The numbers discussed in this blog post, in terms of row counts and recorded times, don't have any absolute relevance; they may change from system to system depending on many environmental factors.

The discussion is more about the areas of impact, what we should expect, and how to analyze them as an end-user, rather than about any absolute numbers.

  1. Always try to specify the minimum number of columns in the query.  Avoid “SELECT *” or columns that are not used on the application side to avoid unnecessary data transfer.
  2. Avoid fetching a large number of rows to the application at a time. If required, use proper paging with LIMIT and OFFSET (see the sketch after this list).
  3. Avoid server-side cursors wherever possible. PostgreSQL reports only the duration of the first fetch, so the actual query performance may go unnoticed and badly performing queries could be hiding behind it.
  4. Client-side cursors and data processing time won’t affect the query performance at the PostgreSQL server-side.
  5. Wait event analysis is very handy in understanding where the server is spending time.
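
As a minimal sketch for point 2, using the pgbench_accounts table from the earlier examples: OFFSET-based paging is the simplest option, while keyset pagination on an indexed column is often cheaper for deep pages because large offsets still have to be scanned over.

-- OFFSET/LIMIT paging: fetch the third page of 100 rows.
SELECT aid, bid, abalance FROM pgbench_accounts ORDER BY aid LIMIT 100 OFFSET 200;

-- Keyset pagination: remember the last aid seen (300 here) and continue from it.
SELECT aid, bid, abalance FROM pgbench_accounts WHERE aid > 300 ORDER BY aid LIMIT 100;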

Percona Distribution for PostgreSQL provides the best and most critical enterprise components from the open-source community, in a single distribution, designed and tested to work together.

Download Percona Distribution for PostgreSQL Today!

Jul
30
2021
--

Percona Monthly Bug Report: July 2021

July 2021 Percona Bug Report

Here at Percona, we operate on the premise that full transparency makes a product better. We strive to build the best open-source database products, but also to help you manage any issues that arise in any of the databases that we support. And, in true open-source form, report back on any issues or bugs you might encounter along the way.

We constantly update our bug reports and monitor other boards to ensure we have the latest information, but we wanted to make it a little easier for you to keep track of the most critical ones. These posts are a central place to get information on the most noteworthy open and recently resolved bugs. 

In this July 2021 edition of our monthly bug report, we have the following list of bugs:

Percona Server for MySQL/MySQL Bugs

 MySQL#77239: When using multi-threaded replication, the SQL thread on a replica sometimes stops with the following error; furthermore, the SQL thread does not preserve this error information in its status, and the replica starts without any error.

Last_SQL_Errno: 1756

Last_SQL_Error: … The slave coordinator and worker threads are stopped, possibly leaving data in an inconsistent state. A restart should restore consistency automatically, although using non-transactional storage for data or info tables or DDL queries could lead to problems. In such cases, you have to examine your data (see the documentation for details).

There are a few cases mentioned in the bug report that will trigger this error; one of them is that, when regularly stopping replication, it occasionally throws the 1756 error for no apparent reason, even though resuming replication works without problems later.

In my tests, I was not able to reproduce this issue with Percona Server for MySQL.

Affects Version/s: 8.0 [Tested/Reported version 8.0.25]

 

PS-7778 (MySQL#104168): A prepared statement can fail when triggers with a DEFINER are used. This is a regression bug introduced after the WL#9384 implementation. The issue is not reproducible in lower versions (tested with 8.0.20).

Affects Version/s: 8.0 [Tested/Reported version 8.0.22, 8.0.25]

 

MySQL#102586: When doing a multiple-table DELETE that is anticipating a foreign key ON DELETE CASCADE, the statement works on the primary, but it breaks row-based replication.

Affects Version/s: 8.0, 5.7  [Tested/Reported version 8.0.23, 5.7.33]

 

Percona XtraDB Cluster

PXC-3449: When ALTER TABLE (TOI) is executed in a user session, it sometimes conflicts (MDL) with a high-priority transaction, which causes a BF-BF abort and server termination.

Affects Version/s: 8.0  [Tested/Reported version 8.0.21]

Fixed Version/s: 8.0.25

 

Percona XtraBackup

PXB-2375:  In some cases, XtraBackup will write the wrong binlog filename, pos, and GTID combination info in xtrabackup_binlog_info. Due to this, XtraBackup might not work as expected with GTID.

If we are using this backup with GTID position details in xtrabackup_binlog_info to create a new replica, then most likely replication will break due to incorrect GTID position.

It looks like the GTID position is not consistent with the binlog file position; they are captured differently and later printed together in the xtrabackup_binlog_info file.

Workarounds to avoid this bug:

  • Use binary log coordinates
  • Take the backup during non-peak hours, since this issue mostly occurs when MySQL is under heavy write operations.

Affects Version/s:  8.0 [Tested/Reported version 8.0.14]

 

Percona Toolkit

PT-1889: Incorrect output when using pt-show-grants for users based on MySQL roles; as a result, the grants cannot be applied back properly on the MySQL server. Due to this, we cannot use pt-show-grants for MySQL roles until this issue is fixed.

Affects Version/s:  3.2.1

 

PT-1747: pt-online-schema-change was bringing the database into a broken state when applying the “rebuild_constraints” foreign keys modification method if any of the child tables were blocked by the metadata lock.

Affects Version/s:  3.0.13

Fixed Version: 3.3.2

 

PMM  [Percona Monitoring and Management]

PMM-8004: When the mongos connection terminates or is interrupted (for example by kill -9, a timeout, a network issue, etc.), and mongodb_exporter is trying to get the DB status or the database list from mongos at that moment, it crashes at runtime with a panic error.

Affects Version/s: 2.x  [Tested/Reported version 2.18,2.19]

 

PMM-8307: Users are unable to use PMM with a large number of DB servers (500+) added for monitoring, due to default configuration limits on the allowed number of connections.

As a workaround, we can increase worker_connections on Nginx.

Affects Version/s:  2.x  [Tested/Reported version 2.18.0]

Fixed version: 2.21.0

 

PMM-7846: Adding a MongoDB instance via pmm-admin with the tls option is not working and fails with the error Connection check failed: timeout (context deadline exceeded).

Affects Version/s: 2.x  [Tested/Reported version 2.13, 2.16]

 

PMM-4665: Frequent error messages in pmm-agent.log for components like the TokuDB storage engine, which is not supported by upstream MySQL. As a result, the overall log file size increases due to these additional messages.

Affects Version/s:  2.x  [Tested/Reported version 2.13.0]

Fixed version: 2.19.0

 

Summary

We welcome community input and feedback on all our products. If you find a bug or would like to suggest an improvement or a feature, learn how in our post, How to Report Bugs, Improvements, New Feature Requests for Percona Products.

For the most up-to-date information, be sure to follow us on Twitter, LinkedIn, and Facebook.

Quick References:

Percona JIRA  

MySQL Bug Report

Report a Bug in a Percona Product

MySQL 8.0.24 Release notes

___

About Percona:

As the only provider of distributions for all three of the most popular open source databases—PostgreSQL, MySQL, and MongoDB—Percona provides expertise, software, support, and services no matter the technology.

Whether it’s enabling developers or DBAs to realize value faster with tools, advice, and guidance, or making sure applications can scale and handle peak loads, Percona is here to help.

Percona is committed to being open source and preventing vendor lock-in. Percona contributes all changes to the upstream community for possible inclusion in future product releases.

Jul
29
2021
--

PostgreSQL PL/Java – A How-To, Part 1

PostgreSQL PL/Java

We've recently received some questions regarding PL/Java, and I found it hard to get clear instructions searching on the internet. It's not that there is no good information out there, but most of it is either incomplete, outdated, or confusing, so I decided to create this short "how-to" to show how to install it and get it running with a few examples.

Installation

I will show here how to install it from source, first because my platform doesn't have the compiled binaries, and second because if your platform does have binaries in its package manager, you can just install it from there, for example using YUM or APT. Also, note that I'm using PL/Java without the "TRUSTED" designation and with a Postgres database superuser, for simplicity. I would recommend reading the documentation about users and privileges here[1].

The versions of the software I’m using here are:

  • PostgreSQL 12.7
  • PL/Java 1.6.2
  • OpenJDK 11
  • Apache Maven 3.6.3

I downloaded the sources from “https://github.com/tada/pljava/releases“, unpackaged and compiled with maven:

wget https://github.com/tada/pljava/archive/refs/tags/V1_6_2.tar.gz
tar -xf V1_6_2.tar.gz
cd pljava-1_6_2
mvn clean install
java -jar pljava-packaging/target/pljava-pg12.jar

I'll assume here that you know Maven well enough, so I won't go through the "mvn" command. The "java -jar pljava-packaging/target/pljava-pg12.jar" command will copy/install the needed files and packages into the Postgres folders. Note that Maven used my Postgres version and created the jar file with that version in its name, "pljava-pg12.jar", so pay attention to the version you have there, as the jar file name will change if you have a different Postgres version!

I can now create the extension in the database where I will use it. I'm using the database "demo" in this blog:

$ psql demo
psql (12.7)
Type "help" for help.

demo=# CREATE EXTENSION pljava;
WARNING: Java virtual machine not yet loaded
DETAIL: libjvm: cannot open shared object file: No such file or directory
HINT: SET pljava.libjvm_location TO the correct path to the jvm library (libjvm.so or jvm.dll, etc.)
ERROR: cannot use PL/Java before successfully completing its setup
HINT: Check the log for messages closely preceding this one, detailing what step of setup failed and what will be needed, probably setting one of the "pljava." configuration variables, to complete the setup. If there is not enough help in the log, try again with different settings for "log_min_messages" or "log_error_verbosity".

Not exactly what I was expecting, but I got a good hint: "HINT: SET pljava.libjvm_location TO the correct path to the jvm library (libjvm.so or jvm.dll, etc.)". Ok, I had to find the libjvm my system is using and point Postgres to it. I used the SET command to do it online:

demo=# SET pljava.libjvm_location TO '/usr/lib/jvm/java-11-openjdk-11.0.11.0.9-5.fc34.x86_64/lib/server/libjvm.so';
NOTICE: PL/Java loaded
DETAIL: versions:
PL/Java native code (1.6.2)
PL/Java common code (1.6.2)
Built for (PostgreSQL 12.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3), 64-bit)
Loaded in (PostgreSQL 12.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3), 64-bit)
OpenJDK Runtime Environment (11.0.11+9)
OpenJDK 64-Bit Server VM (11.0.11+9, mixed mode, sharing)
NOTICE: PL/Java successfully started after adjusting settings
HINT: The settings that worked should be saved (using ALTER DATABASE demo SET ... FROM CURRENT or in the "/v01/data/db/pg12/postgresql.conf" file). For a reminder of what has been set, try: SELECT name, setting FROM pg_settings WHERE name LIKE 'pljava.%' AND source = 'session'
NOTICE: PL/Java load successful after failed CREATE EXTENSION
DETAIL: PL/Java is now installed, but not as an extension.
HINT: To correct that, either COMMIT or ROLLBACK, make sure the working settings are saved, exit this session, and in a new session, either: 1. if committed, run "CREATE EXTENSION pljava FROM unpackaged", or 2. if rolled back, simply "CREATE EXTENSION pljava" again.
SET

I also used "ALTER SYSTEM" to make it persistent across all my databases, as it writes the given parameter setting to the "postgresql.auto.conf" file, which is read in addition to "postgresql.conf":

demo=# ALTER SYSTEM SET pljava.libjvm_location TO '/usr/lib/jvm/java-11-openjdk-11.0.11.0.9-5.fc34.x86_64/lib/server/libjvm.so';
ALTER SYSTEM
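
Since ALTER SYSTEM only writes the value to postgresql.auto.conf, a configuration reload is needed for it to become the default for new sessions; a quick way to do that and to confirm where the value now comes from (in a new session, sourcefile should point at postgresql.auto.conf):

SELECT pg_reload_conf();
SELECT name, setting, sourcefile FROM pg_settings WHERE name = 'pljava.libjvm_location';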

Now that we have it installed, we can check the system catalog to see if it is indeed there:

demo=# SELECT * FROM pg_language WHERE lanname LIKE 'java%';
oid    | lanname | lanowner | lanispl | lanpltrusted | lanplcallfoid | laninline | lanvalidator | lanacl
-------+---------+----------+---------+--------------+---------------+-----------+--------------+-------------------
16428  | java    | 10       | t       | t            | 16424         | 0         | 16427        | {charly=U/charly}
16429  | javau   | 10       | t       | f            | 16425         | 0         | 16426        |
(2 rows)

And test if it is working:

demo=# CREATE FUNCTION getProperty(VARCHAR)
RETURNS VARCHAR
AS 'java.lang.System.getProperty'
LANGUAGE java;
CREATE FUNCTION
demo=# SELECT getProperty('java.version');
getproperty
-------------
11.0.11
(1 row)

It’s working! Time to try something useful.

Accessing Database Objects with PL/Java

The majority of the examples I found showed how to do a "hello world" from a Java class or how to calculate a Fibonacci sequence, but nothing about how to access database objects. There is nothing wrong with those examples, but I suppose that someone who installs PL/Java in their database would like to access database objects from inside a Java function, and this is what we are going to do here.

I will use the sample database “pagila” that can be found here[2] for our tests in this post.

For this first example, I will create a simple class with a static method that will be accessed outside like any Postgres function. The function will receive an integer argument and use it to search the table “customer”, column “customer_id” and will print the customer’s id, full name, email,  and address:

package com.percona.blog.pljava;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class Customers {
	private static String m_url = "jdbc:default:connection";

	public static String getCustomerInfo(Integer id) throws SQLException {
		Connection conn = DriverManager.getConnection(m_url);
		String query = "SELECT c.customer_id, c.last_name ||', '|| c.first_name as full_name, "
				+ " c.email, a.address, ci.city, a.district "
				+ " FROM customer c"
				+ "	 JOIN address a on a.address_id = c.address_id "
				+ "	JOIN city ci on ci.city_id = a.city_id "
				+ " WHERE customer_id = ?";

		PreparedStatement stmt = conn.prepareStatement(query);
		stmt.setInt(1, id);
		ResultSet rs = stmt.executeQuery();
		rs.next();
		String ret; 
		ret = "- ID: " + rs.getString("customer_id") ;
		ret += "\n- Name: " + rs.getString("full_name");
		ret += "\n- Email: " + rs.getString("email");
		ret += "\n- Address: " + rs.getString("address");
		ret += "\n- City: " + rs.getString("city");
		ret += "\n- District: " + rs.getString("district");
		ret += "\n--------------------------------------------------------------------------------";

		stmt.close();
		conn.close();

		return (ret);
	}
}

I’ve compiled and created the “jar” file manually with the below commands:

javac com/percona/blog/pljava/Customers.java
jar -c -f /app/pg12/lib/demo.jar com/percona/blog/pljava/Customers.class

Note that I've created the jar file inside the folder "/app/pg12/lib"; take note of this path because we'll use it in the next step, loading the jar file inside Postgres:

demo=# SELECT sqlj.install_jar( 'file:///app/pg12/lib/demo.jar', 'demo', true );
 install_jar 
-------------
(1 row)

demo=# SELECT sqlj.set_classpath( 'public', 'demo' );
 set_classpath 
---------------
(1 row)

The install_jar function has the signature "install_jar(<jar_url>, <jar_name>, <deploy>)" and loads a jar file from a location given by a URL into the SQLJ jar repository. It is an error if a jar with the given name already exists in the repository, if the jar doesn't exist at the URL, or if the database isn't able to read it:

demo=# SELECT sqlj.install_jar( 'file:///app/pg12/lib/demo2.jar', 'demo', true );
ERROR:  java.sql.SQLException: I/O exception reading jar file: /app/pg12/lib/demo2.jar (No such file or directory)
demo=# SELECT sqlj.install_jar( 'file:///app/pg12/lib/demo.jar', 'demo', true );
 install_jar 
------------- 
(1 row)

demo=# SELECT sqlj.install_jar( 'file:///app/pg12/lib/demo.jar', 'demo', true );
ERROR:  java.sql.SQLNonTransientException: A jar named 'demo' already exists

The function set_classpath defines a classpath for the given schema, in this example the schema "public". A classpath consists of a colon-separated list of jar names or class names. It's an error if the given schema does not exist or if one or more jar names reference non-existent jars.
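
For example, if a second jar were installed (the name "utils" below is hypothetical, used purely for illustration), both jars could be put on the schema classpath at once, and the current classpath can be inspected with the companion get_classpath function:

demo=# SELECT sqlj.set_classpath( 'public', 'demo:utils' );
demo=# SELECT sqlj.get_classpath( 'public' );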

The next step is to create the Postgres functions:

demo=# CREATE FUNCTION getCustomerInfo( INT ) RETURNS CHAR AS 
    'com.percona.blog.pljava.Customers.getCustomerInfo( java.lang.Integer )'
LANGUAGE java;
CREATE FUNCTION

We can now use it:

demo=# SELECT getCustomerInfo(100);
                                 getcustomerinfo                                  
----------------------------------------------------------------------------------
 - ID: 100                                                                       +
 - Name: HAYES, ROBIN                                                            +
 - Email: ROBIN.HAYES@sakilacustomer.org                                         +
 - Address: 1913 Kamakura Place                                                  +
 - City: Jelets                                                                  +
 - District: Lipetsk                                                             +
 --------------------------------------------------------------------------------
(1 row)

Sweet, we have our first Java function inside our Postgres demo database.

Now, for our last example here, I will add another method to this class to list all the payments from a given customer and calculate their total:

package com.percona.blog.pljava;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class Customers {
	private static String m_url = "jdbc:default:connection";

	public static String getCustomerInfo(Integer id) throws SQLException {
		Connection conn = DriverManager.getConnection(m_url);
		String query = "SELECT c.customer_id, c.last_name ||', '|| c.first_name as full_name, "
				+ " c.email, a.address, ci.city, a.district "
				+ " FROM customer c"
				+ "	 JOIN address a on a.address_id = c.address_id "
				+ "	JOIN city ci on ci.city_id = a.city_id "
				+ " WHERE customer_id = ?";

		PreparedStatement stmt = conn.prepareStatement(query);
		stmt.setInt(1, id);
		ResultSet rs = stmt.executeQuery();
		rs.next();
		String ret; 
		ret = "- ID: " + rs.getString("customer_id") ;
		ret += "\n- Name: " + rs.getString("full_name");
		ret += "\n- Email: " + rs.getString("email");
		ret += "\n- Address: " + rs.getString("address");
		ret += "\n- City: " + rs.getString("city");
		ret += "\n- District: " + rs.getString("district");
		ret += "\n--------------------------------------------------------------------------------";

		stmt.close();
		conn.close();

		return (ret);
	}

	public static String getCustomerTotal(Integer id) throws SQLException {
		Connection conn;
		PreparedStatement stmt;
		ResultSet rs;
		String result;
		double total;

		conn = DriverManager.getConnection(m_url);
		stmt = conn.prepareStatement(
				"SELECT c.customer_id, c.first_name, c.last_name FROM customer c WHERE c.customer_id = ?");
		stmt.setInt(1, id);
		rs = stmt.executeQuery();
		if (rs.next()) {
			result = "Customer ID  : " + rs.getInt("customer_id");
			result += "\nCustomer Name: " + rs.getString("last_name") + ", " + rs.getString("first_name");
			result += "\n--------------------------------------------------------------------------------------------------------";
		} else {
			return null;
		}

		stmt = conn.prepareStatement("SELECT p.payment_date, p.amount FROM payment p WHERE p.customer_id = ? ORDER BY 1");
		stmt.setInt(1, id);
		rs = stmt.executeQuery();
		total = 0;

		while (rs.next()) {
			result += "\nPayment date: " + rs.getString("payment_date") + ",    Value: " + rs.getString("amount");
			total += rs.getFloat("amount");
		}
		result += "\n--------------------------------------------------------------------------------------------------------";
		result += "\nTotal: " +String.format("%1$,.2f",  total);
		
		stmt.close();
		conn.close();
		return (result);
	}
}

Same instructions to compile:

javac com/percona/blog/pljava/Customers.java 
jar -c -f /app/pg12/lib/demo.jar com/percona/blog/pljava/Customers.class

Then we need to replace the loaded jar file with the newly created one and create the function inside Postgres:

demo=# SELECT sqlj.replace_jar( 'file:///app/pg12/lib/demo.jar', 'demo', true );
 replace_jar 
-------------
(1 row)

demo=# CREATE FUNCTION getCustomerTotal( INT ) RETURNS CHAR AS 
    'com.percona.blog.pljava.Customers.getCustomerTotal( java.lang.Integer )'
LANGUAGE java;
CREATE FUNCTION

And the result is:

demo=# SELECT getCustomerInfo(100);
                                 getcustomerinfo                                  
----------------------------------------------------------------------------------
 - ID: 100                                                                       +
 - Name: HAYES, ROBIN                                                            +
 - Email: ROBIN.HAYES@sakilacustomer.org                                         +
 - Address: 1913 Kamakura Place                                                  +
 - City: Jelets                                                                  +
 - District: Lipetsk                                                             +
 --------------------------------------------------------------------------------
(1 row)

We finish this part here with this last example. At this point, we are able to access objects, loop through a resultset, and return the result as a single object like a TEXT. I will discuss how to return an array/resultset, how to use PL/Java functions within triggers, and how to use external resources in parts two and three of this article, so stay tuned!

[1] https://tada.github.io/pljava/use/policy.html
[2] https://www.postgresql.org/ftp/projects/pgFoundry/dbsamples/pagila/

Jul
28
2021
--

MongoDB: Modifying Documents Using Aggregation Pipelines and Update Expressions

MongoDB Modifying Documents

Updating documents on MongoDB prior to version 4.2 was quite limited. It was not possible to set values from a conditional expression, combine fields, or update a field based on the value of another field on the server side. Drawing a parallel to SQL update statements, for example, it wasn't possible to do something like the following:

Update t1 set t1.f1 = t1.f2 where…

It wasn’t possible to use a conditional expression either, something easily achieved with SQL standards:

UPDATE t1 SET t1.f1 = CASE WHEN f2 = 1 THEN 1 WHEN f2 = 2 THEN 5 END WHERE…

If something similar to both examples above was required and the deployment was 3.4+, probably the usage of $addFields would be an alternative way to accomplish it. However, it would not touch the current document because the $out output destination could only be a different collection.

With older versions, the only way around this was creating a cursor with an aggregation pipeline and iterating over it on the client side. Inside the loop, it was possible to update using the proper $set values. It was a difficult and tedious task, which would result in a lot of JavaScript code.

With MongoDB 4.2 and onwards, it is possible to use an aggregation pipeline to update MongoDB documents conditionally, allowing the update/creation of a field based on another field. This article presents some very common/basic operations which are easily achieved with SQL databases.

Field Expressions in Update Commands (v4.2+)

Updating a field with the value of some other field:

This is similar to the classic example of an SQL command: update t1 set t1.f1 = t1.f2 + t1.f3

replset:PRIMARY> db.getSiblingDB("dbtest").colltest2.update({_id:3},[{$set:{result:{$add: [ "$f2", "$f3" ] } }} ]);
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

replset:PRIMARY> db.getSiblingDB("dbtest").colltest2.find({_id:3});
{ "_id" : 3, "f1" : 30, "f2" : 300, "f3" : 3000, "result" : 3300 }

The key point is the “$” on the front of the field names being referenced (“f2” and “f3” in this example). These are the simplest type of field path expression, as they’re called in the MongoDB documentation. You’ve probably seen them in the aggregation pipeline before, but it was only in v4.2 that you could also use them in a normal update command.

Applying “CASE” conditions:

It is quite suitable now to determine conditions for a field value while updating a collection:

replset:PRIMARY> db.getSiblingDB("dbtest").colltest3.find({_id:3});
{ "_id" : 3, "grade" : 8 }


replset:PRIMARY> db.getSiblingDB("dbtest").colltest3.update(
  { _id: 3}, 
  [
    { $set: {result : { 
      $switch: {branches: [
        { case: { $gte: [ "$grade", 7 ] }, then: "PASSED" }, 
        { case: { $lte: [ "$grade", 5 ] }, then: "NOPE" }, 
        { case: { $eq: [ "$grade", 6 ] }, then: "UNDER ANALYSIS" } 
      ] } 
    } } } 
  ] 
)
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })


replset:PRIMARY> db.getSiblingDB("dbtest").colltest3.find({ _id: 3});
{ "_id" : 3, "grade" : 8, "result" : "PASSED" }

Adding new fields for a specific filtered doc:

Let’s say that you want to stamp a document with the updated date = NOW and add a simple comment field:

replset:PRIMARY> db.getSiblingDB("dbtest").colltest.find({_id:3})
{ "_id" : 3, "description" : "Field 3", "rating" : 2, "updt_date" : ISODate("2021-05-06T22:00:00Z") }

replset:PRIMARY> db.getSiblingDB("dbtest").colltest.update( 
  { _id: 3 }, 
  [ 
    { $set: { "comment": "Comment3", mod_date: "$$NOW"} } 
  ] 
)
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

replset:PRIMARY> db.getSiblingDB("dbtest").colltest.find({_id:3})
{ "_id" : 3, "description" : "Field 3", "rating" : 2, "updt_date" : ISODate("2021-05-06T22:00:00Z"), "comment" : "Comment3", "mod_date" : ISODate("2021-07-05T18:48:44.710Z") }

Reaching several Docs with the same expression:

It is now possible to either use the updateMany() command to reach multiple docs with the same pipeline:

replset:PRIMARY> db.getSiblingDB("dbtest").colltest3.find({});
{ "_id" : 1, "grade" : 8}
{ "_id" : 2, "grade" : 5}
{ "_id" : 3, "grade" : 8}

replset:PRIMARY> db.getSiblingDB("dbtest").colltest3.updateMany({}, 
  [
    { $set: {result : { $switch: {branches: [{ case: { $gte: [ "$grade", 7 ] }, then: "PASSED" }, { case: { $lte: [ "$grade", 5 ] }, then: "NOPE" }, { case: { $eq: [ "$grade", 6 ] }, then: "UNDER ANALYSIS" } ] } } } } 
  ] 
)
{ "acknowledged" : true, "matchedCount" : 3, "modifiedCount" : 2 }

replset:PRIMARY> db.getSiblingDB("dbtest").colltest3.find({});
{ "_id" : 1, "grade" : 8, "result" : "PASSED" }
{ "_id" : 2, "grade" : 5, "result" : "NOPE" }
{ "_id" : 3, "grade" : 8, "result" : "PASSED" }

Or use the command option { multi: true } if you want to stick to using the original db.collection.update() command. Note that the default is false, which means that only the first occurrence will be updated.

replset:PRIMARY> db.getSiblingDB("dbtest").colltest4.update({}, [{ $set: {result : { $switch: {branches: [{ case: { $gte: [ "$grade", 7 ] }, then: "PASSED" }, { case: { $lte: [ "$grade", 5 ] }, then: "NOPE" }, { case: { $eq: [ "$grade", 6 ] }, then: "UNDER ANALYSIS" } ] } } } } ] )
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

replset:PRIMARY> db.getSiblingDB("dbtest").colltest4.find({});
{ "_id" : 1, "grade" : 8, "result" : "PASSED" }
{ "_id" : 2, "grade" : 5 }
{ "_id" : 3, "grade" : 8 }

When specifying {multi:true} the expected outcome is finally achieved:

replset:PRIMARY> db.getSiblingDB("dbtest").colltest4.update({}, [{ $set: {result : { $switch: {branches: [{ case: { $gte: [ "$grade", 7 ] }, then: "PASSED" }, { case: { $lte: [ "$grade", 5 ] }, then: "NOPE" }, { case: { $eq: [ "$grade", 6 ] }, then: "UNDER ANALYSIS" } ] } } } } ],{multi:true} )
WriteResult({ "nMatched" : 3, "nUpserted" : 0, "nModified" : 2 })

replset:PRIMARY> db.getSiblingDB("dbtest").colltest4.find({});
{ "_id" : 1, "grade" : 8, "result" : "PASSED" }
{ "_id" : 2, "grade" : 5, "result" : "NOPE" }
{ "_id" : 3, "grade" : 8, "result" : "PASSED" }

Update by $merge Stage in the Aggregation Pipeline

Prior to version 4.2, sending the result of an aggregation pipeline to a new collection was achieved by using $out. Starting with version 4.2, it is possible to use $merge, which is far more flexible: with $out, the entire collection is replaced, while with $merge it is possible to replace a single document, among other things. You may want to refer to the comparison table described here:

https://docs.mongodb.com/manual/reference/operator/aggregation/merge/#std-label-merge-out-comparison

With MongoDB 4.4 and onwards, it is allowed to update a collection directly in the aggregation pipeline through the $merge stage. The magic happens when the output collection is given the same name as the one being aggregated. The example below illustrates how to flag the max grade of the student Rafa in math class:

  • Original document
replset:PRIMARY> db.getSiblingDB("dbtest").students2.find({"name": "Rafa","class":"math"})
{ "_id" : ObjectId("6100081e21f08fe0d19bda41"), "name" : "Rafa", "grades" : [ 4, 5, 6, 9 ], "class" : "math" }

  • The aggregation pipeline
replset:PRIMARY> db.getSiblingDB("dbtest").students2.aggregate( [{ $match : { "name": "Rafa","class":"math" } }, {$project:{maxGrade:{$max:"$grades"}}}, {$merge : { into: { db: "dbtest", coll: "students2" }, on: "_id",  whenMatched: "merge", whenNotMatched:"discard"} } ]);

  • Checking the result
replset:PRIMARY> db.getSiblingDB("dbtest").students2.find({"name": "Rafa","class":"math"})
{ "_id" : ObjectId("6100081e21f08fe0d19bda41"), "name" : "Rafa", "grades" : [ 4, 5, 6, 9 ], "class" : "math", "maxGrade" : 9 }

Note that the maxGrade field was merged into the doc, flagging that the max grade achieved by that student in math was 9.

Watch out: behind the scenes, the merge will trigger an update against the same collection. If that update changes the physical location of the document, the update might revisit the same document multiple times or even get into an infinite loop (the Halloween Problem).

The other cool thing is using the $merge stage to do exactly what a SQL INSERT ... SELECT statement does (and this is possible with MongoDB 4.2 and onwards). The example below demonstrates how to fill the collection colltest_reporting with the result of an aggregation run against colltest5.

replset:PRIMARY> db.getSiblingDB("dbtest").colltest5.aggregate( [{ $match : { class: "A" } }, { $group: { _id: "$class",maxGrade: { $max: "$grade" } }},  {$merge : { into: { db: "dbtest", coll: "colltest_reporting" }, on: "_id",  whenMatched: "replace", whenNotMatched: "insert" } } ] );
replset:PRIMARY> db.getSiblingDB("dbtest").colltest_reporting.find()
{ "_id" : "A", "maxGrade" : 8 }

Conclusion

There are plenty of new possibilities which will make a developer’s life easier (especially the life of those developers who are coming from SQL databases), considering that the aggregation framework provides several operators and many different stages to play with. However, it is important to highlight that a complex pipeline may incur performance degradation (that may be a topic for another blog post). For more information on updates with aggregation pipelines, please refer to the official documentation.

Jul
19
2021
--

Performing ETL Using Inheritance in PostgreSQL

ETL Using Inheritance in PostgreSQL

Good database maintenance includes not only the standard adding, updating, and deleting of records but also periodic edits to the table schema. Operations such as adding, editing, and removing table columns are part of today’s life-cycle reality, as new functionality is constantly being added.

In quieter, and less demanding times, one could get away with schema updates with minimal impact by performing the operation during low load times. However, as all database systems have become more critical to the business bottom line, maintenance windows have become, by necessity, much smaller and more time-sensitive.

So the question is asked: How can one update a multi-terabyte table with near-zero downtime?

Looking deep into our Postgres bag of tricks, we reach not for the most recent and advanced features but instead leverage a very old capability that’s been part of Postgres since it was first released as an open source project: its object-relational implementation of inheritance.

Use Case

Before going into the details of the solution let’s define just what we are trying to solve.

Strictly speaking, there are two use cases that come to mind when using inheritance as the prime ETL data migration mechanism:

  • Removing table OIDs, such as when moving to Postgres version 12 and greater.
  • Performing DML/DDL operations which include:
    • updating the data
    • adding or removing table columns

Life starts getting complicated when dealing with issues such as updating data types. But we’ll talk about this at the end of the blog.

About Inheritance

Unlike object-oriented programming languages, where the child inherits attributes from the parent, in Postgres it’s the parent that has the ability to inherit from the child (if only this was true in real life ;-)). Thus rows inserted into a child table become visible when querying the parent relation.

Consider the following snippet of code: Two parents and three children are created and populated with records:

BEGIN;
    drop table if exists father, mother cascade;

    create table father(c1 int, c2 int, c3 int);
    create table mother(c1 int, c2 int, c4 int);

-- notice that although not declared here, column "c4" from mother is added to these tables
    create table son(c1 int, c2 int, c3 int) inherits (father,mother);
    create table daughter(c2 int, c4 int, c5 int) inherits (father,mother);

-- this table inherits only those columns from "father"
    create table cousin (c1 int, c2 int, c3 int, c6 int) inherits (father);
COMMIT;
BEGIN;
    insert into son values(1,1,1,1);
    insert into daughter values(2,2,2,2,2);
    insert into cousin values(3,3,3,3);
    insert into father values(10,10,10);
    insert into mother values(11,11,11);
COMMIT;

Columns declared in the parents must therefore exist in the child, i.e. they are merged. But notice that columns unique to a child are not propagated to the parents:

Table "public.father"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c1     | integer |           |          |
 c2     | integer |           |          |
 c3     | integer |           |          |
Number of child tables: 3 (Use \d+ to list them.)

 

Table "public.mother"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c1     | integer |           |          |
 c2     | integer |           |          |
 c4     | integer |           |          |
Number of child tables: 2 (Use \d+ to list them.)

 

 

Table "public.son"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c1     | integer |           |          |
 c2     | integer |           |          |
 c3     | integer |           |          |
 c4     | integer |           |          |
Inherits: father,
          mother
Table "public.daughter"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c1     | integer |           |          |
 c2     | integer |           |          |
 c3     | integer |           |          |
 c4     | integer |           |          |
 c5     | integer |           |          |
Inherits: father,
          mother
Table "public.cousin"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 c1     | integer |           |          |
 c2     | integer |           |          |
 c3     | integer |           |          |
 c6     | integer |           |          |
Inherits: father
 

And even though records populated in the child can be seen by the parent, the reverse is NOT true: records populated into the parent are not seen by the children:

db02=# select * from father;
 c1 | c2 | c3
----+----+----
 10 | 10 | 10
  1 |  1 |  1
  2 |  2 |  2
  3 |  3 |  3
db02=# select * from mother;
 c1 | c2 | c4
----+----+----
 11 | 11 | 11
  1 |  1 |  1
  2 |  2 |  2
 
db02=# select * from son;
 c1 | c2 | c3 | c4
----+----+----+----
  1 |  1 |  1 |  1
db02=# select * from daughter;
 c1 | c2 | c3 | c4 | c5
----+----+----+----+----
  2 |  2 |  2 |  2 |  2
db02=# select * from cousin;
 c1 | c2 | c3 | c6
----+----+----+----
  3 |  3 |  3 |  3
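As a side note not shown in the original example, the ONLY keyword restricts a query to the parent relation itself and excludes the children:

select * from only father;
-- returns just the (10, 10, 10) row that was inserted directly into "father"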

Developing The ETL/Migration Model

Performing data migration under production conditions should take into consideration these four (4) distinct query operations:

  • SELECT from both the target and source tables at the same time.
  • UPDATE and/or DELETE records from both the target and source tables.
  • INSERT new records into the target table.
  • Moving data from the source to the target tables.

For the sake of discussion, we’ll demonstrate using one source and one target table, each inheriting from a single parent:

create table parent(c1 int primary key, c2 int, c3 int);
create table source(like parent including all) inherits (parent);
create table target(like parent including all) inherits (parent);

Querying Both Target And Source Tables

Inheritance makes the SELECT query a straightforward operation. This query checks all the tables for the queried record(s):

explain select * from parent;
QUERY PLAN
--------------------------------------------------------------------------
Append (cost=0.00..81.21 rows=4081 width=12)
-> Seq Scan on parent parent_1 (cost=0.00..0.00 rows=1 width=12)
-> Seq Scan on source parent_2 (cost=0.00..30.40 rows=2040 width=12)
-> Seq Scan on target parent_3 (cost=0.00..30.40 rows=2040 width=12)

UPDATE and/or DELETE Records

Similarly to SELECT queries, one doesn’t have to worry about editing the existing application’s DML operation(s) when performing UPDATE and DELETE. Again, notice how both source and target tables are queried as well as the parent:

explain update parent set c2=0 where c1=1;
QUERY PLAN
---------------------------------------------------------------------------------
 Update on parent  (cost=0.00..16.34 rows=3 width=18)
   Update on parent
   Update on source
   Update on target
   ->  Seq Scan on parent  (cost=0.00..0.00 rows=1 width=18)
         Filter: (c1 = 1)
   ->  Index Scan using source_pkey on source  (cost=0.15..8.17 rows=1 width=18)
         Index Cond: (c1 = 1)
   ->  Index Scan using target_pkey on target  (cost=0.15..8.17 rows=1 width=18)
         Index Cond: (c1 = 1)

INSERT New Records Into The Target Table

The thing to keep in mind about INSERT is that without a redirect mechanism all records are inserted into the parent.

Since everybody already knows about triggers, I thought it would be fun to use a REWRITE RULE instead:

create rule child_insert as
on
    insert to parent
do instead
    insert into target values (NEW.*);

 

Table "public.parent"
 Column |  Type   | Collation | Nullable | Default | Storage | Stats target | Description
--------+---------+-----------+----------+---------+---------+--------------+-------------
 c1     | integer |           | not null |         | plain   |              |
 c2     | integer |           |          |         | plain   |              |
 c3     | integer |           |          |         | plain   |              |
Indexes:
    "parent_pkey" PRIMARY KEY, btree (c1)
Rules:
    child_insert AS
    ON INSERT TO parent DO INSTEAD  INSERT INTO target (c1, c2, c3)
  VALUES (new.c1, new.c2, new.c3)
Child tables: source,
              target

And here’s our validation; notice how the INSERT is redirected from parent to target:

EXPLAIN insert into parent (c1,c2,c3) values (1,1,1);
QUERY PLAN
-----------------------------------------------------
 Insert on target  (cost=0.00..0.01 rows=1 width=12)
   ->  Result  (cost=0.00..0.01 rows=1 width=12)
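For completeness, a trigger-based redirect would achieve the same effect as the rule (a sketch only, to be used instead of the rule rather than alongside it):

-- redirect inserts on "parent" into "target", equivalent in spirit to the rewrite rule
create or replace function parent_redirect_insert()
returns trigger
language plpgsql
as $$
begin
    insert into target values (new.*);
    return null;  -- returning NULL suppresses the insert into "parent" itself
end;
$$;

create trigger parent_redirect_insert
    before insert on parent
    for each row
    execute procedure parent_redirect_insert();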

Moving Records Between Source And Target Tables

It’s time to introduce the last mechanism needed to perform the actual data migration. Essentially, the data is moved in batches; if you can afford the downtime of moving your records in one very large transaction, this dog and pony show is a bit redundant.

In the real world, we need to anticipate multiple processes attempting simultaneous EXCLUSIVE LOCKS. In the case that one or more records are locked by another operation, the following example demonstrates how one can simply skip over them:

--
-- EX: insert a single record in table "source"
--
insert into source (c1,c2,c3) values (2,2,2);

-- move 1,000 records at a time
-- from table "source" to "target"
--
with a as (select * from source for update skip locked limit 1000),
     b as (delete from source using a where source.c1=a.c1
           returning    source.c1,
                        source.c2,
                        source.c3)
insert into target select * from b;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Insert on target  (cost=27.92..41.52 rows=680 width=12) (actual time=0.082..0.082 rows=0 loops=1)
   CTE b
     ->  Delete on source  (cost=9.42..27.92 rows=680 width=6) (actual time=0.050..0.053 rows=1 loops=1)
           ->  Bitmap Heap Scan on source  (cost=9.42..27.92 rows=680 width=6) (actual time=0.021..0.023 rows=1 loops=1)
                 Recheck Cond: (c1 >= 0)
                 Heap Blocks: exact=1
                 ->  Bitmap Index Scan on source_pkey  (cost=0.00..9.25 rows=680 width=0) (actual time=0.010..0.011 rows=1 loops=1)
                       Index Cond: (c1 >= 0)
   ->  CTE Scan on b  (cost=0.00..13.60 rows=680 width=12) (actual time=0.054..0.057 rows=1 loops=1)
 Planning Time: 0.237 ms
 Execution Time: 0.173 ms

Putting It All Together

It’s time to demonstrate a Proof Of Concept using pgbench.

SETUP

Initialize database db02:

dropdb --if-exists db02
createdb db02
pgbench -s 10 -i db02
List of relations
 Schema |       Name       | Type  |  Owner   |  Size   | Description
--------+------------------+-------+----------+---------+-------------
 public | pgbench_accounts | table | postgres | 128 MB  |
 public | pgbench_branches | table | postgres | 40 kB   |
 public | pgbench_history  | table | postgres | 0 bytes |
 public | pgbench_tellers  | table | postgres | 40 kB   |

Create the tables parent and child.

NOTE: In order to demonstrate data migration from a deprecated table, table pgbench_accounts is altered by adding OIDs.

create table parent (like pgbench_accounts including all) without oids;
create table child (like pgbench_accounts including all) inherits (parent) without oids;

alter table pgbench_accounts set with oids, inherit parent;

alter table pgbench_accounts rename to pgbench_accounts_deprecated;
alter table parent rename to pgbench_accounts;

TEST

This query is at the heart of the solution. Any exclusively locked record that it tries to move is automatically skipped and a new attempt can be made the next time this script is invoked.

with a as (select * from pgbench_accounts_deprecated order by 1 for update skip locked limit 10),
     b as (delete
                from pgbench_accounts_deprecated
                using a
                where pgbench_accounts_deprecated.aid=a.aid
                returning pgbench_accounts_deprecated.aid,
                          pgbench_accounts_deprecated.bid,
                          pgbench_accounts_deprecated.abalance,
                          pgbench_accounts_deprecated.filler)
insert into child select * from b;

VALIDATE

explain analyze select * from pgbench_accounts order by 1 limit 13;
QUERY PLAN
--------------------------------------------------------------------------------------------------
 Limit  (cost=0.72..1.45 rows=13 width=97) (actual time=0.012..0.016 rows=13 loops=1)
   ->  Merge Append  (cost=0.72..56212.42 rows=1000211 width=97) (actual time=0.011..0.013 rows=13 loops=1)
         Sort Key: pgbench_accounts.aid
         ->  Index Scan using parent_pkey on pgbench_accounts  (cost=0.12..8.14 rows=1 width=352) (actual time=0.002..0.002 rows=0 loops=1)
         ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts_deprecated  (cost=0.42..43225.43 rows=1000000 width=97) (actual time=0.006..0.007 rows=3 loops=1)
         ->  Index Scan using child_pkey on child  (cost=0.14..51.30 rows=210 width=352) (actual time=0.002..0.003 rows=10 loops=1)
 Planning Time: 0.084 ms
 Execution Time: 0.030 ms

MIGRATION SCRIPT

The query has been incorporated into the following script, which moves 1,000 records every 5 seconds.

#!/bin/bash

set -e

export PGHOST=/tmp PGPORT=10011 PGDATABASE=db02 PGUSER=postgres

SLEEP=5
REC=1000

SQL="
with a as (select * from pgbench_accounts_deprecated order by 1 for update skip locked limit $REC),
     b as (delete
                from pgbench_accounts_deprecated
                using a
                where pgbench_accounts_deprecated.aid=a.aid
                returning pgbench_accounts_deprecated.aid,
                          pgbench_accounts_deprecated.bid,
                          pgbench_accounts_deprecated.abalance,
                          pgbench_accounts_deprecated.filler)
insert into child select * from b;

with a(count_child)    as (select count(*) from child),
     b(count_accounts) as (select count(*) from pgbench_accounts_deprecated)
select a.count_child, b.count_accounts from a,b;
"

while true
do
    echo "--- $(date): Executing Query, moving $REC records now ... ---"
    psql <<<"$SQL"
    echo "sleeping: $SLEEP seconds ..." && sleep $SLEEP
done

BENCHMARKING

While the aforementioned script is active, pgbench runs the benchmark:

#
# doesn't block
#
pgbench -c 2 -j 4 --protocol=simple -T 120 db02

CAVEAT

This is a simple and powerful method, but there are limitations: common columns between tables must be of the same datatype.

For example, if column c1 in the table source is of datatype int and you want to migrate the data into table target with the same column c1 but with datatype bigint, then this method won’t work. An alternate solution, however, could take advantage of Updatable Views, which you can read more about here, combined with the appropriate triggers and rewrite rules.
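To make that alternative a little more concrete, here is a minimal sketch assuming made-up tables and a c1 int-to-bigint change (none of this is from the original post): the application reads and writes through a view while a batch job drains the old table into the new one.

-- hypothetical sketch: migrating c1 from int to bigint behind an updatable view
create table accounts_old (c1 int primary key, c2 int);
create table accounts_new (c1 bigint primary key, c2 int);

-- the application queries this view instead of the tables
create view accounts as
    select c1::bigint as c1, c2 from accounts_old
    union all
    select c1, c2 from accounts_new;

-- a UNION ALL view is not automatically updatable, so an INSTEAD OF trigger
-- redirects new writes into the new table
create function accounts_redirect_insert()
returns trigger
language plpgsql
as $$
begin
    insert into accounts_new values (new.c1, new.c2);
    return new;
end;
$$;

create trigger accounts_redirect_insert
    instead of insert on accounts
    for each row
    execute procedure accounts_redirect_insert();

A batch job like the one shown earlier would then move rows from accounts_old into accounts_new, casting c1 along the way, until the old table can be dropped.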
