Why RDBMS and SQL are difficult...

...while sometimes NoSQL solution may be easier and way faster?

Suppose we have banner advertisement website and we want to increase number of clicks on some banner. Let say MySQL table looks like this:

CREATE TABLE banners(
	id	int unsigned not null primary key,
	clicks	int unsigned not null default 0
)

Then on each click most programmers will do:

UPDATE banners
SET
	clicks = clicks + 1
WHERE
	id = @number

What are the problems here:

  1. We can not sure if MySQL is 100% atomic, e.g. we can not be sure that we are the only thread that do "clicks = clicks + 1". It is quite possible some other thread to updated the value in meanwhile. This means we will understate the clicks.
  2. Statement "clicks = clicks + 1" will blocks either the whole table (MyISAM/Aria) or row/page (InnoDB/XtraDB). This means if traffic increace, performance will be very poor.
  3. Because table is updating all the time, disk I/O will be high. This will increase the load average on the server.
  4. Because table is updating all the time, table will became fragmented and you will need often to run "optimize table banners". During the optimize, website will be down.

To fix most of the issues, you need to create additional table:

CREATE TABLE banners_clicks(
	banner	int unsigned not null,
	id	int unsigned not null auto_increment,
	primary key(
		banner,
		id
	)
)

Then instead do update, you do insert:

INSERT into banners_clicks(
	banner
)VALUES(
	@number
)

Then instead of select clicks from banners, you counting:

SELECT
	count(*)
FROM
	banners_clicks
WHERE
	id = @number

Hopefully this query is quite fast, because it runs on the index and not on the table itself. However if there are too many millions of rows, query will slow down anyway. To fix it, we need to add one more row in this table:

CREATE TABLE banners_clicks(
	banner	int unsigned not null,
	id	int unsigned not null auto_increment,
	checked	int unsigned not null default 0,
	primary key(
		banner,
		id
	)
);

-- and appropriate index

create index checked on banners_clicks(checked);

Then on each hour or so we do following thing into crontab:

-- begin

-- ****************************** --

-- create "lock"...
set @time := unix_timestamp();

-- ****************************** --

-- "lock" the records...
UPDATE banners_clicks
SET
	checked = @time
WHERE
	checked = 0;

-- ****************************** --

-- update the clicks...
UPDATE banners
SET
	clicks = clicks + ( 
		select 
			coalesce(count(*), 0)
		from 
			banners_clicks 
		WHERE 
			id = banners.id		AND
			checked = @time
	);

-- ****************************** --

-- end

Finally the query will look like:

SELECT 
	clicks + (SELECT count(*) FROM banners_clicks WHERE id = @number and checked = 0)
FROM 
	banners
WHERE 
	id = @number

Now this is fast!

...But for one really simple task, we did additional table, made a crontab process and final SQL is quite complicated. We also will need to write garbage collector process that deletes old records from banners_clicks table.


However...

NoSQL solution with Redis will be just this:

// PHP example code...
// To increase value and optionally get it, use...
$clicks = $r->incr("banners:$number");

// To get value without increase it, use...
$clicks = $r->get("banners:$number");

No comments...


blog comments powered by Disqus





Articles library

Redis as session handler in PHP

How to use Redis for session handler in PHP in a way similar to msession or PostgreSQL sessions
Date: 2011-07

Tags: system administration php


Lazy starter script on Linux such Redhat CentOS or Fedora

How to quickly made Redis to start after Linux boot.
Date: 2011-07

Tags: system administration


Redis vs Memcached comparison

Here is basic comparison of Redis Memcached and Memcachedb.
Date: 2011-07

Tags: memcached


Why RDBMS and SQL are difficult...

...while sometimes NoSQL solution may be easier and way faster? This arcle deals with incremental counters and how they needs to be implemented properly in RDBMS-es such MySQL
Date: 2011-07
Updated: 2011-09-30

Tags: anti-pattern programming mysql


Seamless migration from one Redis server to another

How to use replication to migrate Redis server without service interruption
Date: 2011-07

Tags: system administration


Redis swap issue while save

Explanation why your server load average may spike and server to swap memory when running redis
Date: 2011-08-21

Tags: system administration


Amazon EC2 and Amazon ElastiCache service

Memcached at Amazon first impressions
Date: 2011-08-23

Tags: system administration memcached cloud


Understanding hash-max-zipmap-entries and 'hash of hashes' optimization

Explanation of config parameters such hash-max-zipmap-entries and how we can make huge memory savings in new Redis 2.2
Date: 2011-08-28

Tags: optimization system administration programming php


NoSQL database design example

This article explains how we made our article section
Date: 2011-09-02

Tags: database design programming mysql


Redis save and backup script

Simple backup script for Redis
Date: 2011-09-28

Tags: system administration optimization


Postgres 9.1 foreign data wrapper interface

Postgres 9.1 gives you ability to access data from different data sources including Redis
Date: 2011-10-20

Tags: programming mysql pgsql memcached optimization


Redis high traffic connection issue

Explanation on 'Cannot assign requested address' connection error
Date: 2011-11-12

Tags: optimization system administration scaling


Redis too many open files error on high traffic sites

Explanation on 'Accepting client connection: accept: Too many open files'
Date: 2011-12-21

Tags: system administration memcached security


PHP Redis bug with PHP 5.4

Description of bug PHP Redis bug
Date: 2012-07-02

Tags: system administration php


Case Study: Using Redis intersect on very large sets

Intersection of 2 x 120M MySQL records
Date: 2012-08-29

Tags: programming mysql optimization cloud php


Getting started with Python and Redis on CentOS 6.X

How to install and test Redis client for Python on CentOS
Date: 2013-03-21

Tags: system administration python


GeoIP in Redis

Put Maxmind's GeoIP in Redis for best berformance for cient apps written in PHP or Python
Date: 2013-03-24

Tags: programming mysql optimization php scaling python


Redis database file examination

Examination of Redis rdb file in order to find huge memory consumption
Date: 2013-08-23

Tags: optimization system administration


© Jul.2011 - 2017, E-Nick.org