StormByte++

Monday, March 04, 2024

New look, new return

I had changed the look to this blog, edited all entries and made sure its appearance looks better with the hope that despite I was away a lot of time, I can return to post useful things about computer stuff.

In all these years I was finishing my learning about programming and other things that made me more knowledgeable and now I am more active in my Github's page where I share a lot of things.

Hopefully I can use again this blog to post things that I hope they are useful.

Monday, December 15, 2014

Gentoo Stage Manager

   As I already talked in previous post about creating and maintaining a Gentoo Backup, now I've created a script to make it really easy to unpack, mount, umount, chroot and compress any gentoo stage for backup purposes (specially useful to test packages when you don't want to make a live system dirty if something fails)

   It works in command line interface with interactive menus to be easy to use with no parameters added to it as them are usually forgotten by users.

   Also, it will be configured to use a folder you specify, so if you don't try a vital folder, it is 100% safe to use it, as it will not write to anything outside specified folder.

   ~~Currently, it is beta, so it could have some bugs, depite I've tested it and seem to work, for feedback, write to stormbyte at gmail dot com email address and I wil try to help.~~

   It is available on GitHub with even an ebuild in my Gentoo overlay to install it.Hope this is useful for gentoo advanced users to install and maintain their systems.

Monday, January 28, 2013

Various compression algorithm comparison

Although there are already several pages comparing compression algorithms, they are based on multimedia data (already compressed), random data, or data which may not represent a real backup process, so most of them are not useful if compared with real scenarios like, for example, compressing an operating system image efectivelly. Also, they often measures the CPU user time rather than complete elapsed time, so data is not really representative.

In my case, I am creating a new stage5 Gentoo backup from scratch, so I am writing here the results (compressing time, resulting file size and so) so other could find useful to compare a real situation like this one.

This is important as it may be useful for other purposes, like creating a LiveCD/DVD or other purposes where size may matter.

The starting point will be a tar file created with a full Gentoo installation (with KDE, Libreoffice, firefox, Netbeans, and other programs I use daily) as a basis with a Core2 Duo T9550 @2.66GHz and 4GiB RAM on ext4 partition running Gentoo ~amd64 arch.

This is result's table (Marked in red the worst and in green the best values of each measurement):

Algorithm (Version)	Compress Time (hh:mm:ss)	Resulting Size (GiB)	CPU Usage %	Peak MEM Usage (KiB)	Compress Ratio
TAR (1.26)	00:17:51.93¹	6.4623	2	1780	0.0000
GZIP (1.5)	00:17:37.07	2.3228	93	896	0.3594
BZIP2 (1.06)	00:17:51.82	2.1137	89	7996	0.3270
XZ (5.04)	01:42:16.00	1.5962	99	690332	0.2470
7ZIP (9.20)	00:52:22.65	1.6015	142	697004	0.2478
RAR (4.20)	00:17:50.12	1.8819	133	102864	0.2912

All compresion algorithms were tested with maximum compression level options

With this table, depending on the real needs and facts (slow machine, slow network when doing backup via network, etc), you can chose the best algorithm which meet the needs.

I hope it is useful to someone.

1: Despite TAR uses no compression, its performance impact is due to its IO is more intensive than the compressors as it has to look recursivelly lots of files and folders (a tipical Linux installation have many) in contrast with compressors which only read from TAR file.

Friday, July 06, 2012

Prevent DNS Cache DDoS

Note to lawyers, lobbies, music, film and entertainment companies, every ACTA, CISPA, SOPA and Intellectual Property defenders before freedom:

Every mention, accusation or potential offensive words direct or indirect have the word allegedly implied.

DDoS Attacks Against BIND DNS received

Introduction

First of all, a little explanation of why I think I started to receive the attacks.

For the very first time, I used The Pirate Bay to download a film contained in one of their torrents.

Also, I have a small server inside my home network, which is accesible to outside but shares my home internet bandwidth.

Right after finishing the torrent download, I started to receive massive DDoS DNS Cache attacks which were flooding my upload/download bandwidth as well as the BIND log (repace X with attacker IP I will make public later):

Jul 5 14:34:12 SkyNet named[2055]: client X#53: query (cache) 'ripe.net/ANY/IN' denied

Facts

I am running 5 years this personal server, and never received a massive DNS DDoS. Only received some SSH brute force login attempts which are successfull banned via fail2ban.
The intents to block The Pirate Bay: Successfull (UK, USA, Microsoft Messenger, etc), and some unsucessfull ones (Holland) and the effort to block the whole torrent service
ACTA rejected by European Parliament (where I live), making them fail.

Charges

I don't believe in coincidences, but specially in these cases, where it would be too much coincidence, so I make my guess (accusation): Hollywood, and their friends are trying to stop/reduce torrent usage via other methods: Disturbing, attacking, and flooding torrent users, but of course from behind, by using botnets, infected small servers, etc without showing their faces.

Evidences

Since I got fed up of this issue, I will post publicly IPs participating in this attack to let the world know who was causing this. In case of dynamic IPs, their user may not be responsible of the attacks, but I found some static IPs pertaining to servers which are direct responsibility to their owners:

IPs and hostnames participating in this DDoS event
IP/Hostname	IP/Hostname	IP/Hostname
202.1.207.129	173-212-220-30.static.hostnoc.net	50.87.103.112
31.222.72.2	aa.84.344a.static.theplanet.com	31.222.72.4
212.102.11.13	stats.msfayed.arvixevps.com	217.173.81.25
31.222.74.3	d4.67.354a.static.theplanet.com	31.222.66.3
174.37.201.154	mail.silverstructure.com	66.110.113.38
164.138.25.100	host.colocrossing.com	31.222.68.2
159.253.176.4	srv-s144.antiddos.eu	31.222.72.3
212.76.68.14	31.222.68.4	202.1.207.129
159.253.176.3	159.253.176.2	31.222.66.4
213.5.170.10	31.222.74.4	31.222.68.3
31.222.66.2		193.5.111.13

Protection

This kind of attacks can be blocked by using and configuring fail2ban properly:

Currently, fail2ban have a issue flooding sendmail and preventing further actions when sendmail does not respond due to upload bandwidth consumption or some other cause.

This way, I made a simple sendmail-buffered.conf for fail2ban which will only send mails when X IPs are already banned.

Also, even with this enabled, fail2ban was unable to completelly stopping the attack, since it only blocked named default port (53), and somehow, some packets were still reaching me. So I modifyed default iptables-all.conf too.
To configure fail2ban properly, I created new rules and actions, based on current existing ones, but tuned to prevent this attack more efectivelly.

Place those files inside fail2ban's action.d folder:

Once done that, just configure your jail.conf properly to use new rules.
For example:

[named-refused-tcp]

enabled = true
filter = named-refused
action = iptables-allports[name=Named, protocol=tcp]
sendmail-buffered[name=Named, dest=YOU@YOURMAIL.com, sender=YOU@YOURMAIL.com]
logpath = /var/log/named/security.log
ignoreip = YOUR_LOCAL_NETWORK_ADDRESS_MASK

Please note that, with these protections you will not stop the attack, since IP packets are still reaching your server, however, your upload bandwidth and your nameserver will be secured.

Saturday, June 30, 2012

Programming with Database: Using Prepared Statements in whole program

I have blogged several times about the benefits of programming with prepared statements when using database connections (About prepared statements, MySQL's prepared statements made easy, About SQL Injection).

But this time, I will focus on development time, and another advantages that using prepared statements give to programmers rather than its security or performance (for that you can read the other blog posts stated above).

As I stated in other posts, I recommend everyone who uses a database, to make use of Prepared Statements everywhere, not only in certain parts where you are interested.

That way, combined with Model View Controller design (MVC), can save you a lot of time and problems when database scheme changes over time (and make sure it will despite you may not think).

I am talking of common cleaning that a database may suffer some time after it has been used for some time: For example, field removal on a table, table rename, table removal, etc...

To reflect the time you can save when that happens with Prepared Statements, let's start with a very simple example and watch how it could be solved from 2 different points of view: Using prepared statements, and not using them.

Suppose we have a table called Customers with this schema (using PostgreSQL):

Column Name	Column Type
customerID	INTEGER
id	INTEGER
name	VARCHAR(50)
email	TEXT
credit_card_no	TEXT

Having this table amongst others, and in a running company with plenty of data inside, now, some time after the initial database schema, we decide to switch to paypal payments processing (just an example).

In this case, having credit card info of our clients is not longer relevant, and should be deleted due to security issues.
Then now we need something like:

ALTER TABLE Customers DEL COLUMN credit_card_no;

Unfortunatelly, we are not done yet, as we need to change some parts.

As part of this example, we will see what would happen in 2 cases:

Using MVC and Prepared Statements

Now that we have changed database scheme, the changes that are need to be made in source code are very easy to find. If we used something like the database class I blogged here, we just need to open the browser and wait for the errors it will come out, since all SQL sentences are prepared and loaded with database initialization.

It will tell us exactly which prepared statement failed, and just a seach will show in which function member we are using it, allowing us to change it.

Once changed, you just have to be sure to change the model class which receives that old data to remove it.

Not using MVC and not using Prepared Statements

In this scenario, the changes that needs to be done are not easy to find as we don't have any mechanism to look at every SQL sentence we have in the whole program at once.

So we should to do a seach on every project's file, and do as much as replacements we need in order to achieve the changes, and thus, more time programming which may mean more money.

So to sum up, I will render a table containing most important things about each others to reflect why is so important to use Prepared Statements rather than SQL statements as strings defined at runtime.

	MVC and Prepared Statements	SQL statements on the fly
Changes find time	Immediate	Search every project's file
Number of files/classes to modify	2 (Database and Model classes)	Undefined (as it may be in use from several files)
Possibility of runtime errors	No (as SQL sentence is not loaded with errors)	Yes (If, for example, you forgot to change a SQL sentence which is based on strings concatenation)

So now you know that programming with prepared statements does not only protects us from SQL Injection, give us more performance, prevents from having dangerous runtime errors that may harm our public image to our customers and makes us save tons of time when things changes, why are you still not using them in your whole project?

Friday, May 25, 2012

Count regexp matches on a table: Sorting result sets by relevance in PostgreSQL

The idea of counting the number of matches in a field that a regexp has, is a very intesting method to sort sets, for example, a web search result set, when you need the most relevant sources first (for example, containing the searched word more times).

I researched a bit and unfortunatelly, MySQL is far behind PostgreSQL, despite the fact that Oracle bought it (yes, at first I though it could improve a lot with Oracle's experience, but reality is that Oracle is not going to put another free competitor against its own), so I will write this howto for PostgreSQL with an example.

PostgreSQL has a very clever function regexp_matches that will show all substrings which matches a pattern and we will use that to count matches and sort result.

For example, suppose we have this table in database:

DATABASE TABLE AND DATA

TABLE NAME: CONTENT
id	INTEGER
htmltext	TEXT

TABLE DATA
id	html
1	key
2	key key key
3	key key
4	not matching pattern

If we try to search by keyword key, we could want to appear in relevance order:

	Expected result	Actual result
id	2	1
id	3	2
id	1	3

To achieve this sorting, we need to use regexp_matches correctly along with our result set, like this way:

SQL Example Sorting Sentence

SELECT id, COUNT(*) AS patternmatch
   FROM (
      SELECT id,regexp_matches(htmltext, 'key', 'g')
         FROM CONTENT
   )
   AS alias
   GROUP BY id ORDER BY patternmatch DESC;

SQL Example Sorting Sentence
SELECT id, COUNT() AS patternmatch FROM ( SELECT id,regexp_matches(htmltext, 'key', 'g') FROM CONTENT* ) AS alias GROUP BY id ORDER BY patternmatch DESC;

   That will result in a set like Expected result in previous table.

   But, that was only an example!! How about a real usage?

   With a little modification of previous sentence, it could be applied to sort a real search query, just substitute bold text with your pattern and your subquery/data set.

   The only requeriment to do that is: Your subquery must have id and htmltext between its results! (and of course, replace 'key' with

   So assuming we have a function called our_expensive_lookup (with the lookup parameter) that do a very expensive lookup by some keyword, we could apply this sorting to our returned results by:

SQL Real Usage Sorting Sentence

SELECT id, COUNT(*) AS patternmatch
   FROM (
      SELECT id,regexp_matches(htmltext, 'key', 'g')
         FROM (
            SELECT id, htmltext
               FROM our_expensive_lookup('key')
         ) AS lookup
   ) AS alias
   GROUP BY id ORDER BY patternmatch DESC;

SQL Real Usage Sorting Sentence
SELECT id, COUNT() AS patternmatch FROM ( SELECT id,regexp_matches(htmltext, 'key', 'g') FROM ( SELECT id, htmltext FROM our_expensive_lookup('key'*) ) AS lookup ) AS alias GROUP BY id ORDER BY patternmatch DESC;

Of course, it can be combined with prepared statements, stored procedures, and so on! The hardest part is done, so just use your imagination!

Wednesday, May 23, 2012

What can you do against SGAE, MPAA, RIAA, GEMA and those ones?

It is known for everybody what actions are those organizations doing world wide: declaring you are a criminal because you watch/download material, applying censorship to internet, adding fees to buy media in some countries just because you could use it for criminal purposes and those things.

Specially, I am referring to megaupload case: They closed megaupload with no proofs, and they are suing as much people as they can, just by giving their IPs, with the only argument that they are causing lots of revenue loss to multimedia market.

Well, the first thing I will say publicly is: A people wich download some material is not a buy loss as they argue. Like me, for example, I don't, and will not buy everything I hear, see, watch online. So just the fact I download does not mean that I would buy it. It is just false: I will not buy it anyway!

So what's the real loss for them? Their fail is that time had change, now we are in 2012 year, not in 1950, and they simply refuse to change their business model. It is like if dinosaurs try to sue meteors that ended them..just a nonsense.

But, the real matter is: What can YOU do against it? It is pretty simple, and this is what I do. I declare it publicly and if you are Sony, Emi, SGAE, GEMA, RIAA, MPAA, Hollywood have some problem with my declarations, then go and come with me, I will tell you this again at the face!

So, what can be done to force those agencies/companies to realize we are just fed up to be treated like criminals?

Very simple: Do not go to theaters, do not buy original music CDs, do not buy films. Not just a month, not do it never until they change their paths.

I usually have this conversation with people, and they often say things like: a person can't do anything against them, and so on, this is simply a false argument.

If you cause a real loss of revenue to them, they will realize they are alive thanks to us, which are their customers, and they will learn how to treat their potential customers, rather than mark everybody as a criminal, and the stuff they do.

So, in my case, I am not going to theaters, not buying anything with copyrights since the closure of megaupload until it comes back again and they face reality (which the later I doubt it would happen so...)...

And publicly I say: I encourage everybody to pirate copyrighted materials always. Let's show them what a real loss of revenue can mean!