4 Ways To Optimise PostgreSQL Database With Millions of Data

This series consists of two articles:

Database optimisation is a set of techniques with which we usually aim for some of the following:

If you are interested in reading it, here’s a link:

For those who have read it, here is a reminder, and for those who have not, here is an introduction to what happened:

We simulated the behavior of miners for an imaginary cryptocurrency. We had several miners, which differed in the number of graphics cards:

We created three tables:

The hours table contains the intensity of the computer cooler at certain hours of the day.
The miners table contains basic data, such as the name and the number of graphics cards:

The data was generated at 5-minute intervals.

As you can see above, simple SELECT queries that sort the data by time and limit the result to 1 or 10 records take an awful 10–12 seconds.

PostgreSQL offers two interesting commands: EXPLAIN and EXPLAIN ANALYZE.

The difference is that EXPLAIN shows the planner’s estimated query cost, based on collected statistics about your database, without running the query, while EXPLAIN ANALYZE actually runs the query and reports the time spent in each stage.

EXPLAIN ANALYZE is strongly recommended, because there are many cases where EXPLAIN shows a higher query cost while the actual execution time is lower, and vice versa. Most importantly, the EXPLAIN command helps you understand whether a specific index is used, and how.
Being able to see which indexes a query uses is the first step to learning PostgreSQL query optimization.

Here are the results for the query above:

Now, let’s see the four simple steps which can improve your database performance.

A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure.

How can we see which indexes PostgreSQL creates automatically when a table is created?

Let’s try to create a simple index on the time and miner name columns:

Pre-optimization: 12004.737 ms

Post-optimization: 0.469 ms

As we can see, simply adding the index made the query roughly 25,596 times faster!

So, indexing is something you need to pay attention to if you are dealing with larger amounts of data.
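The effect of an index on the query plan can be reproduced in a small, self-contained sketch. To keep it runnable without a database server, it uses Python’s built-in sqlite3 module as a stand-in for PostgreSQL, and SQLite’s EXPLAIN QUERY PLAN in place of EXPLAIN. The table name miner_data, the column names, and the index name are assumptions made for illustration, not the schema used in the measurements above. The CREATE INDEX statement itself is written the same way in PostgreSQL.

```python
import sqlite3

# SQLite stands in for PostgreSQL so the sketch runs without a server.
# Table, column, and index names are hypothetical, not the article's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE miner_data (time TEXT, miner_name TEXT, fan_percentage REAL)"
)
rows = [
    (f"2021-01-01 {h:02d}:{m:02d}:00", f"miner_{i}", float((h * 7 + m + i) % 100))
    for h in range(24)
    for m in range(0, 60, 5)  # 5-minute intervals
    for i in range(3)
]
conn.executemany("INSERT INTO miner_data VALUES (?, ?, ?)", rows)

query = "SELECT * FROM miner_data ORDER BY time DESC LIMIT 10"


def plan(sql):
    """Return the plan steps SQLite reports for a statement as one string."""
    return " | ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Without an index, the whole table is scanned and then sorted.
plan_before = plan(query)

# The same CREATE INDEX statement is valid in PostgreSQL.
conn.execute(
    "CREATE INDEX idx_miner_data_time_name ON miner_data (time, miner_name)"
)

# With the index, rows can be read already ordered by time.
plan_after = plan(query)

print("before:", plan_before)
print("after: ", plan_after)
```

The point of the sketch is the change in the plan, not the timings: on a table this small the difference in speed is negligible, and the exact plan wording differs between SQLite and PostgreSQL.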

Let’s say we want to retrieve the maximum cooler value for each individual miner over time.

The query could look something like this:

Even with the index in place, this operation is expensive.

It took us more than 9 seconds to do that. Imagine that one of the features on your website is that you allow the user to view such a set of data and that the page loads for a minimum of 9 seconds (not taking into account other data and queries, additional data processing, latency, etc.).

Who would want to wait for 10 seconds to get the data?

What we can do here, for example, is rewrite the query to reduce the number of required operations and the number of rows that are read and compared, and thus speed up the query:

By writing the query in a smarter way, we saved ourselves time.

Pre-optimization: 9173.750 ms

Post-optimization: 2794.690 ms

In this case, writing the query in a better way sped up the process by 3.28x.
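As an illustration of what such a rewrite can look like, here is a minimal sketch with two formulations of “maximum cooler value per miner”. It is not the query pair measured above: the table names, column names, miner names, and index are hypothetical, and Python’s built-in sqlite3 module stands in for PostgreSQL so the example runs without a server. Both SELECT statements are also valid PostgreSQL. The first aggregates over every row of the large table; the second starts from the small miners table and looks up one maximum per miner, which may let the database answer each lookup from an index on (miner_name, fan_percentage).

```python
import sqlite3

# SQLite stands in for PostgreSQL; all names and values below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE miners (name TEXT PRIMARY KEY, graphic_cards INTEGER);
    CREATE TABLE miner_data (time TEXT, miner_name TEXT, fan_percentage REAL);
    CREATE INDEX idx_miner_data_name_fan
        ON miner_data (miner_name, fan_percentage);
""")
conn.executemany(
    "INSERT INTO miners VALUES (?, ?)",
    [("Diamond", 10), ("Gold", 6), ("Bronze", 2)],
)
conn.executemany(
    "INSERT INTO miner_data VALUES (?, ?, ?)",
    [
        (f"2021-01-01 00:{m:02d}:00", name, float((m * k) % 97))
        for m in range(0, 60, 5)
        for k, name in enumerate(["Diamond", "Gold", "Bronze"], start=3)
    ],
)

# Formulation 1: aggregate over every row of the large table.
grouped = conn.execute("""
    SELECT miner_name, MAX(fan_percentage)
    FROM miner_data
    GROUP BY miner_name
    ORDER BY miner_name
""").fetchall()

# Formulation 2: start from the small miners table and look up
# one maximum per miner with a correlated subquery.
per_miner = conn.execute("""
    SELECT m.name,
           (SELECT MAX(d.fan_percentage)
            FROM miner_data AS d
            WHERE d.miner_name = m.name)
    FROM miners AS m
    ORDER BY m.name
""").fetchall()

# Both formulations return the same rows.
print(grouped == per_miner)
```

Which formulation is faster depends on the data and the available indexes, so it is worth comparing both with EXPLAIN ANALYZE on the real table before committing to one.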

A materialized view is a pre-computed data set derived from a query specification (the SELECT in the view definition) and stored for later use. Because the data is pre-computed, querying a materialized view is faster than executing a query against the base table of the view.

If we use the query from above:

Let’s look at the time needed to retrieve the maximum cooler value per miner using the materialized view:

Pre-optimization: 2794.690 ms

Post-optimization: 0.247 ms

Data retrieval became 11,314.53 times faster.
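The trade-off behind this speed-up is that the stored result is a snapshot: it does not follow changes in the base table until it is refreshed. In PostgreSQL the snapshot is created with CREATE MATERIALIZED VIEW ... AS SELECT ... and recomputed with REFRESH MATERIALIZED VIEW. The sketch below imitates that behavior with Python’s built-in sqlite3 module, which has no materialized views, by storing the query result in an ordinary table. The table and column names are hypothetical.

```python
import sqlite3

# SQLite has no materialized views, so a plain table filled from the
# query plays that role here. All names and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE miner_data (time TEXT, miner_name TEXT, fan_percentage REAL)"
)
conn.executemany(
    "INSERT INTO miner_data VALUES (?, ?, ?)",
    [
        ("2021-01-01 00:00:00", "Gold", 40.0),
        ("2021-01-01 00:05:00", "Gold", 55.0),
        ("2021-01-01 00:00:00", "Bronze", 30.0),
    ],
)

AGGREGATE = (
    "SELECT miner_name, MAX(fan_percentage) AS max_fan "
    "FROM miner_data GROUP BY miner_name"
)


def refresh():
    """Recompute the stored result (the role of REFRESH MATERIALIZED VIEW)."""
    conn.execute("DROP TABLE IF EXISTS max_fan_per_miner")
    conn.execute("CREATE TABLE max_fan_per_miner AS " + AGGREGATE)


def read_snapshot():
    """Read the pre-computed rows without touching the base table."""
    return dict(conn.execute("SELECT miner_name, max_fan FROM max_fan_per_miner"))


refresh()
first = read_snapshot()

# New data arrives in the base table; the snapshot does not see it yet.
conn.execute("INSERT INTO miner_data VALUES ('2021-01-01 00:10:00', 'Gold', 90.0)")
stale = read_snapshot()

# After a refresh the snapshot reflects the new maximum.
refresh()
fresh = read_snapshot()

print(first, stale, fresh)
```

For data that arrives every few minutes, this means choosing a refresh schedule that balances how fresh the results need to be against the cost of recomputing them.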

We can normalise the table and use a foreign key that references the miners table (column “id”).

INT (integer) comparisons are generally faster than VARCHAR comparisons, largely because INT values usually take up less space than VARCHAR values.

This holds true for both unindexed and indexed access. The fastest way to go is an indexed INT column.
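A normalised layout along these lines could look like the following sketch. The column names other than “id”, the index name, and the sample rows are assumptions for illustration, and Python’s built-in sqlite3 module again stands in for PostgreSQL. The large table stores an integer miner_id instead of repeating the miner’s name, and the name is recovered with a join when it is needed.

```python
import sqlite3

# SQLite stands in for PostgreSQL; names other than "id" are hypothetical.
conn = sqlite3.connect(":memory:")
# SQLite-specific switch; PostgreSQL enforces foreign keys without it.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE miners (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE,
        graphic_cards INTEGER NOT NULL
    );
    CREATE TABLE miner_data (
        time TEXT NOT NULL,
        miner_id INTEGER NOT NULL REFERENCES miners (id),
        fan_percentage REAL NOT NULL
    );
    CREATE INDEX idx_miner_data_miner_id ON miner_data (miner_id);
""")
conn.executemany(
    "INSERT INTO miners (id, name, graphic_cards) VALUES (?, ?, ?)",
    [(1, "Gold", 6), (2, "Bronze", 2)],
)
conn.executemany(
    "INSERT INTO miner_data VALUES (?, ?, ?)",
    [
        ("2021-01-01 00:00:00", 1, 40.0),
        ("2021-01-01 00:05:00", 1, 55.0),
        ("2021-01-01 00:00:00", 2, 30.0),
    ],
)

# Filtering and joining compare integers; the name is only read from miners.
max_by_name = dict(conn.execute("""
    SELECT m.name, MAX(d.fan_percentage)
    FROM miner_data AS d
    JOIN miners AS m ON m.id = d.miner_id
    GROUP BY m.name
"""))

# The foreign key rejects rows that point at a miner that does not exist.
try:
    conn.execute("INSERT INTO miner_data VALUES ('2021-01-01 00:10:00', 99, 10.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(max_by_name, rejected)
```

Besides the cheaper comparisons, the foreign key also protects the data: a measurement can no longer be stored under a misspelled or unknown miner.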

While these are basic optimization techniques, they can bear a lot of fruit. Also, although these techniques are simple, it is not always easy to:

You will need to experiment with the data until you find an approach that suits your model.
