Friday, July 06, 2012

Prevent DNS Cache DDoS

The Pirate Bay

Note to lawyers, lobbies, music, film and entertainment companies, and every defender of ACTA, CISPA, SOPA and intellectual property ahead of freedom:
Every mention, accusation or potentially offensive word, direct or indirect, has the word "allegedly" implied.

DDoS attacks received against BIND DNS

Introduction

 

    First of all, a little explanation of why I think I started to receive the attacks.

    For the very first time, I used The Pirate Bay to download a film contained in one of their torrents.

    Also, I have a small server inside my home network, which is accessible from outside but shares my home internet bandwidth.

    Right after finishing the torrent download, I started to receive massive DNS cache DDoS attacks that flooded my upload/download bandwidth as well as the BIND log (replace X with the attacker's IP, which I make public below):
Jul 5 14:34:12 SkyNet named[2055]: client X#53: query (cache) 'ripe.net/ANY/IN' denied
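
    To see which clients are responsible for most of these denied queries, the log can be summarised per source address. The following is only a minimal Python sketch, assuming the log lines have exactly the shape shown above; the function name count_denied and the 192.0.2.x sample addresses (reserved for documentation) are made up for illustration.

```python
import re
from collections import Counter

# Matches BIND "query (cache) ... denied" lines shaped like the one above.
DENIED = re.compile(
    r"named\[\d+\]: client (?P<ip>[0-9a-fA-F.:]+)#(?P<port>\d+): "
    r"query \(cache\) '(?P<query>[^']+)' denied"
)

def count_denied(lines):
    """Return a Counter mapping client address -> number of denied cache queries."""
    hits = Counter()
    for line in lines:
        match = DENIED.search(line)
        if match:
            hits[match.group("ip")] += 1
    return hits

# Illustrative sample only: 192.0.2.x are documentation addresses, not attackers.
sample = [
    "Jul 5 14:34:12 SkyNet named[2055]: client 192.0.2.10#53: query (cache) 'ripe.net/ANY/IN' denied",
    "Jul 5 14:34:12 SkyNet named[2055]: client 192.0.2.10#53: query (cache) 'ripe.net/ANY/IN' denied",
    "Jul 5 14:34:13 SkyNet named[2055]: client 192.0.2.77#53: query (cache) 'ripe.net/ANY/IN' denied",
    "Jul 5 14:34:14 SkyNet sshd[991]: unrelated line",
]
top = count_denied(sample).most_common()
```

    Here top lists the addresses ordered from most to fewest denied queries, which is a quick way to spot the noisiest sources.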

Facts

 

  1. I have been running this personal server for 5 years and had never received a massive DNS DDoS. I only received some SSH brute-force login attempts, which are successfully banned via fail2ban.
  2. The attempts to block The Pirate Bay: some successful (UK, USA, Microsoft Messenger, etc.), some unsuccessful (Holland), plus the effort to block the whole torrent service.
  3. ACTA was rejected by the European Parliament (where I live), making them fail.

Charges

 

    I don't believe in coincidences, especially in cases like this one, where it would be too much of a coincidence, so I make my guess (accusation): Hollywood and their friends are trying to stop or reduce torrent usage via other methods, namely disturbing, attacking and flooding torrent users, but of course from behind, by using botnets, infected small servers, etc., without showing their faces.

Evidence

 

    Since I got fed up with this issue, I will publicly post the IPs participating in this attack to let the world know who was causing it. In the case of dynamic IPs, their users may not be responsible for the attacks, but I found some static IPs belonging to servers, which are the direct responsibility of their owners:

IPs and hostnames participating in this DDoS event
IP/Hostname      IP/Hostname                          IP/Hostname
202.1.207.129    173-212-220-30.static.hostnoc.net    50.87.103.112
31.222.72.2      aa.84.344a.static.theplanet.com      31.222.72.4
212.102.11.13    stats.msfayed.arvixevps.com          217.173.81.25
31.222.74.3      d4.67.354a.static.theplanet.com      31.222.66.3
174.37.201.154   mail.silverstructure.com             66.110.113.38
164.138.25.100   host.colocrossing.com                31.222.68.2
159.253.176.4    srv-s144.antiddos.eu                 31.222.72.3
212.76.68.14     31.222.68.4                          202.1.207.129
159.253.176.3    159.253.176.2                        31.222.66.4
213.5.170.10     31.222.74.4                          31.222.68.3
31.222.66.2      193.5.111.13

 

Protection

 

    This kind of attack can be blocked by configuring fail2ban properly:

Currently, fail2ban has an issue: it floods sendmail and stops taking further actions when sendmail does not respond, whether due to upload bandwidth consumption or some other cause.

For this reason, I made a simple sendmail-buffered.conf for fail2ban which only sends a mail once X IPs have been banned.
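
The buffering idea itself can be sketched independently of fail2ban. The snippet below is not the sendmail-buffered.conf action, just a hypothetical Python illustration of the same principle: bans are collected, and one notification is sent only when a threshold is reached. The class name BufferedNotifier and its parameters are assumptions made up for this example.

```python
class BufferedNotifier:
    """Collect banned IPs and send a single notification per batch."""

    def __init__(self, threshold, send):
        self.threshold = threshold  # how many bans to collect before notifying
        self.send = send            # callable that delivers one message
        self.pending = []

    def on_ban(self, ip):
        """Record a ban; notify only when the buffer reaches the threshold."""
        self.pending.append(ip)
        if len(self.pending) >= self.threshold:
            self.flush()

    def flush(self):
        """Send everything collected so far as one message and empty the buffer."""
        if self.pending:
            self.send("Banned: " + ", ".join(self.pending))
            self.pending = []

# Illustrative run: a list stands in for the mail system.
sent = []
notifier = BufferedNotifier(threshold=3, send=sent.append)
for ip in ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]:
    notifier.on_ban(ip)
```

With a threshold of 3, the four bans above produce a single message, and the fourth address stays buffered until more bans arrive or flush() is called.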

Also, even with this enabled, fail2ban was unable to completely stop the attack, since it only blocked the default named port (53) and, somehow, some packets were still reaching me. So I modified the default iptables-all.conf too.
To configure fail2ban properly, I created new rules and actions, based on the existing ones but tuned to prevent this attack more effectively.

    Place those files inside fail2ban's action.d folder:

    Once that is done, just configure your jail.conf to use the new rules.
For example:
[named-refused-tcp]


enabled = true
filter = named-refused
action = iptables-allports[name=Named, protocol=tcp]
  sendmail-buffered[name=Named, dest=YOU@YOURMAIL.com, sender=YOU@YOURMAIL.com]
logpath = /var/log/named/security.log
ignoreip = YOUR_LOCAL_NETWORK_ADDRESS_MASK
    Please note that these protections will not stop the attack itself, since IP packets will still reach your server; however, your upload bandwidth and your nameserver will be protected.

Saturday, June 30, 2012

Programming with Database: Using Prepared Statements in whole program

    I have blogged several times about the benefits of programming with prepared statements when using database connections (About prepared statements, MySQL's prepared statements made easy, About SQL Injection).

    But this time I will focus on development time and the other advantages that prepared statements give to programmers, rather than on security or performance (for those you can read the blog posts mentioned above).

    As I stated in other posts, I recommend that everyone who uses a database make use of prepared statements everywhere, not only in the parts they are most interested in.

    That approach, combined with a Model View Controller (MVC) design, can save you a lot of time and problems when the database schema changes over time (and rest assured it will, even if you do not expect it).

    I am talking about the common cleanup that a database goes through after it has been in use for some time: for example, removing a field from a table, renaming a table, removing a table, etc.

    To show the time you can save with prepared statements when that happens, let's start with a very simple example and watch how it could be solved from 2 different points of view: using prepared statements, and not using them.

    Suppose we have a table called Customers with this schema (using PostgreSQL):

Column Name       Column Type
customerID        INTEGER
id                INTEGER
name              VARCHAR(50)
email             TEXT
credit_card_no    TEXT

    Suppose we have this table, amongst others, in a running company with plenty of data inside, and some time after the initial database schema was created we decide to switch to PayPal for payment processing (just an example).
In this case, the credit card info of our clients is no longer relevant, and should be deleted for security reasons.
So now we need something like:
ALTER TABLE Customers DROP COLUMN credit_card_no;
    Unfortunately, we are not done yet, as we still need to change the parts of the program that use that column.
As part of this example, we will see what would happen in 2 cases:

Using MVC and Prepared Statements
    Now that we have changed the database schema, the changes that need to be made in the source code are very easy to find. If we used something like the database class I blogged about here, we just need to open the browser and wait for the errors to come out, since all SQL statements are prepared and loaded at database initialization.
It will tell us exactly which prepared statement failed, and a simple search will show in which member function we are using it, allowing us to change it.
Once that is changed, you just have to make sure to update the model class that received the old data so that it no longer expects it.

Not using MVC and not using Prepared Statements
    In this scenario, the changes that need to be made are not easy to find, as we don't have any mechanism to look at every SQL statement in the whole program at once.
So we have to search every file in the project and make as many replacements as needed to achieve the change, which means more programming time, and that may mean more money.

    So, to sum up, here is a table with the most important differences between the two approaches, to show why it is so important to use prepared statements rather than SQL statements built as strings at runtime.

Time to find the changes
    MVC and Prepared Statements: Immediate
    SQL statements on the fly: Search every file in the project
Number of files/classes to modify
    MVC and Prepared Statements: 2 (Database and Model classes)
    SQL statements on the fly: Undefined (as it may be in use from several files)
Possibility of runtime errors
    MVC and Prepared Statements: No (a SQL statement with errors fails to load)
    SQL statements on the fly: Yes (if, for example, you forgot to change a SQL statement which is built by string concatenation)
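
    The "immediate" detection described above can be illustrated with a small sketch. This is not the database class mentioned earlier: it is a hypothetical Python example that uses SQLite (from the standard library) instead of PostgreSQL, and the registry STATEMENTS and the function find_broken_statements are names made up for illustration. All statements live in one place and are compiled at start-up, so a statement that still references the removed column is reported at once.

```python
import sqlite3

# Hypothetical registry: every SQL statement the program uses, in one place,
# together with the number of parameters each one takes.
STATEMENTS = {
    "customer_by_id": ("SELECT id, name, email FROM Customers WHERE id = ?", 1),
    "card_by_id": ("SELECT credit_card_no FROM Customers WHERE id = ?", 1),
}

def find_broken_statements(conn, statements):
    """Compile every statement once; return the names that no longer fit the schema."""
    broken = []
    for name, (sql, param_count) in statements.items():
        try:
            # EXPLAIN makes SQLite compile the statement without running the query.
            conn.execute("EXPLAIN " + sql, (None,) * param_count)
        except sqlite3.OperationalError:
            broken.append(name)
    return broken

conn = sqlite3.connect(":memory:")
# Schema as it looks after the migration: credit_card_no is already gone.
conn.execute("CREATE TABLE Customers (id INTEGER, name VARCHAR(50), email TEXT)")
broken = find_broken_statements(conn, STATEMENTS)
```

    In this sketch broken contains only card_by_id, which points directly at the one statement (and therefore the one function) that needs to change.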

    So now you know that programming with prepared statements not only protects us from SQL injection, but also gives us more performance, prevents dangerous runtime errors that may harm our public image with our customers, and saves us tons of time when things change. Why are you still not using them in your whole project?

Friday, May 25, 2012

Count regexp matches on a table: Sorting result sets by relevance in PostgreSQL

    Counting the number of matches that a regexp has in a field is a very interesting way to sort result sets, for example a web search result set, when you need the most relevant sources first (for example, those containing the searched word more times).

    I researched a bit and, unfortunately, MySQL is far behind PostgreSQL, despite the fact that Oracle bought it (yes, at first I thought it could improve a lot with Oracle's experience, but the reality is that Oracle is not going to put another free competitor against its own product), so I will write this howto for PostgreSQL with an example.

    PostgreSQL has a very clever function, regexp_matches, which returns the substrings that match a pattern (one row per match when the 'g' flag is given), and we will use that to count matches and sort the result.

    For example, suppose we have this table in database:
 
DATABASE TABLE AND DATA
TABLE NAME: CONTENT
id          INTEGER
htmltext    TEXT
TABLE DATA
id    htmltext
1     key
2     key key key
3     key key
4     not matching pattern
    If we search by the keyword key, we would want the results to appear in order of relevance:


      Expected result    Actual result
id    2                  1
id    3                  2
id    1                  3
    To achieve this sorting, we need to use regexp_matches correctly along with our result set, like this:
SQL Example Sorting Sentence
SELECT id, COUNT(*) AS patternmatch
   FROM (
      SELECT id,regexp_matches(htmltext, 'key', 'g')
         FROM CONTENT
   )
   AS alias
   GROUP BY id ORDER BY patternmatch DESC;

    That will result in a set like Expected result in previous table.
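
    To make the counting logic explicit outside the database, here is a minimal Python sketch of what the query computes, using the sample data from the table above. It only illustrates the idea (count matches per row, drop rows without matches, sort by count descending) and is not a replacement for the SQL; the function name rank_by_matches is made up for this example.

```python
import re

# Same rows as the CONTENT table above: (id, htmltext).
rows = [
    (1, "key"),
    (2, "key key key"),
    (3, "key key"),
    (4, "not matching pattern"),
]

def rank_by_matches(rows, pattern):
    """Return (id, match_count) pairs sorted by match_count, highest first."""
    regex = re.compile(pattern)
    counted = [(row_id, len(regex.findall(text))) for row_id, text in rows]
    # Rows without any match produce no row in the SQL version, so drop them here too.
    matched = [item for item in counted if item[1] > 0]
    return sorted(matched, key=lambda item: item[1], reverse=True)

ranking = rank_by_matches(rows, "key")
```

    The ids in ranking come out in the order 2, 3, 1, matching the expected result, and id 4 is absent because it has no match.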

    But that was only an example! How about real usage?

    With a little modification of the previous statement, it can be applied to sort a real search query: just substitute the pattern ('key') and the inner query on CONTENT with your own pattern and your own subquery or data set.

    The only requirement is that your subquery must have id and htmltext among its results (and, of course, you must replace 'key' with your own pattern).

    So, assuming we have a function called our_expensive_lookup (taking the lookup parameter) that does a very expensive lookup by some keyword, we could apply this sorting to the returned results with:
SQL Real Usage Sorting Sentence
SELECT id, COUNT(*) AS patternmatch
   FROM (
      SELECT id,regexp_matches(htmltext, 'key', 'g')
         FROM (
            SELECT id, htmltext
               FROM our_expensive_lookup('key')
         ) AS lookup
   ) AS alias
   GROUP BY id ORDER BY patternmatch DESC;
    Of course, it can be combined with prepared statements, stored procedures, and so on! The hardest part is done, so just use your imagination!

Wednesday, May 23, 2012

What can you do against SGAE, MPAA, RIAA, GEMA and those ones?

    Everybody knows what those organizations are doing worldwide: declaring you a criminal because you watch or download material, applying censorship to the internet, adding fees to blank media in some countries just because you could use it for criminal purposes, and things like that.

    In particular, I am referring to the Megaupload case: they closed Megaupload with no proof, and they are suing as many people as they can, just based on their IPs, with the only argument that those people are causing a big loss of revenue to the multimedia market.

    Well, the first thing I will say publicly is: a person who downloads some material is not a lost sale, as they argue. I, for example, do not and will not buy everything I hear, see or watch online. So the mere fact that I download something does not mean that I would have bought it. It is just false: I would not have bought it anyway!

    So what's the real loss for them? Their failure is that times have changed, we are now in 2012, not in 1950, and they simply refuse to change their business model. It is as if the dinosaurs tried to sue the meteor that ended them... just nonsense.

    But the real question is: what can YOU do against it? It is pretty simple, and this is what I do. I declare it publicly, and if you (Sony, EMI, SGAE, GEMA, RIAA, MPAA, Hollywood) have some problem with my statements, then come and see me, and I will tell you this again to your face!

    So, what can be done to force those agencies and companies to realize we are fed up with being treated like criminals?

    Very simple: do not go to theaters, do not buy original music CDs, do not buy films. Not just for a month: do not do it at all until they change their ways.

    I often have this conversation with people, and they usually say things like "one person can't do anything against them" and so on. This is simply a false argument.

    If you cause them a real loss of revenue, they will realize that they are alive thanks to us, their customers, and they will learn how to treat their potential customers, rather than marking everybody as a criminal and doing the rest of the things they do.

    So, in my case, I have not gone to theaters or bought anything copyrighted since the closure of Megaupload, and I will not until it comes back and they face reality (the latter I doubt will ever happen, so...).

    And publicly I say: I encourage everybody to pirate copyrighted materials always. Let's show them what a real loss of revenue can mean!

Wednesday, April 04, 2012

Implementing Namespaces with Memcached and PHP

    Memcached is a tool (with PHP extensions) for storing key-value pairs of data in RAM. It is used to speed up applications that need persistent storage and where fetching data from the data source (file, database, internet...) is more expensive than fetching it from RAM (from a performance point of view).

    Despite its advantages, and the fact that your application loads data much faster when it is in cache, it lacks any way to remove data by keyword, or any other capability to remove groups of related entries.

    It may not sound like a real problem, but it can be one, making your application return invalid data in some scenarios.

    Here I will describe some scenarios where memcached can lead to problems and present some approaches to work around this limitation.

Scenario 1

We have pagination in our system, and we keep that pagination in cache for better performance. When data is deleted, inserted or altered, the pagination becomes obsolete, but it may already be cached without the possibility of deleting it because, for example, the function that adds, deletes or alters the data does not know the cache key of the pagination.

First approach

 
    The idea behind the first approach (on which my work is based) is the official pseudo-namespace proposal: emulate namespaces by appending a version to each key, which works around the return of invalid data. This way, if you have foo stored in cache and you update the namespace version, the next lookup automatically tries to get foo+latest_version, which is not in cache yet and so forces cache regeneration.
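
    A minimal sketch of this versioned-key idea follows. It is written in Python with a plain dictionary standing in for memcached, so it only illustrates the key scheme; the class name VersionedNamespaceCache and the key layout are assumptions made up for this example, not the format used by any particular library.

```python
class VersionedNamespaceCache:
    """Emulate namespaces by embedding a per-namespace version in every key."""

    def __init__(self):
        self.store = {}  # stand-in for memcached: key -> value

    def _version(self, namespace):
        # The current version of a namespace is itself kept in the cache.
        return self.store.setdefault("ns_version:" + namespace, 1)

    def _full_key(self, namespace, key):
        return "{}:v{}:{}".format(namespace, self._version(namespace), key)

    def set(self, namespace, key, value):
        self.store[self._full_key(namespace, key)] = value

    def get(self, namespace, key):
        return self.store.get(self._full_key(namespace, key))

    def invalidate(self, namespace):
        # Bumping the version makes every old key unreachable at once.
        self.store["ns_version:" + namespace] = self._version(namespace) + 1

cache = VersionedNamespaceCache()
cache.set("pagination", "page_1", ["row a", "row b"])
before = cache.get("pagination", "page_1")
cache.invalidate("pagination")
after = cache.get("pagination", "page_1")
```

    After invalidate, the lookup misses (after is None), so the caller regenerates the page. The old entries are not deleted in this scheme; in a real memcached they would simply age out.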

    This approach has a drawback though: you can't bind data to several namespaces at the same time! That was the original reason I extended this idea into a new approach, which is explained below:

Scenario 2

    Imagine we have a database with two tables, one for accounts, and the other for storing the friend relationships between them.

Table Account
id INTEGER,
username TEXT,
...
PRIMARY KEY (id)

Table Friendship
owner INTEGER,
friend INTEGER,
PRIMARY KEY (owner,friend),
FOREIGN KEY (owner) REFERENCES Account(id),
FOREIGN KEY (friend) REFERENCES Account(id)

    In this scenario, suppose account id 3 has these friend ids: 1,2,5,6,8. When we ask the database for its friends, it will return the array (1,2,5,6,8), which we are going to put into cache. But what if account id 2 is deleted, or has its id changed?

    Then our cache is immediately invalid, but the real data isn't (supposing we provided mechanisms for automatically updating foreign keys in the DB!).


    I started to work on a class for memcache to address this problem (and possibly other issues) at the cost of keeping a bit of namespace data inside the cache.

    My approach is to store all the keys a namespace contains as an array, allowing the system to delete all those keys when the namespace expires.

    For example, in Scenario 2, when we add a new friend, we can bind the cached data to the owner account's namespace as well as to the friend account's namespace.

    This way, when something changes, we can expire either of those two namespaces and the related data will be deleted from cache.
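
    The following Python sketch illustrates this key-list approach with Scenario 2. It is not the PHP class from the repository: plain dictionaries stand in for memcached, and the names NamespacedCache, expire_namespace and the account_N namespace labels are assumptions made up for this example.

```python
class NamespacedCache:
    """Track which keys belong to each namespace so they can be expired together."""

    def __init__(self):
        self.store = {}       # stand-in for memcached: key -> value
        self.namespaces = {}  # namespace -> set of keys bound to it

    def set(self, key, value, namespaces):
        """Store a value and bind its key to every namespace given."""
        self.store[key] = value
        for namespace in namespaces:
            self.namespaces.setdefault(namespace, set()).add(key)

    def get(self, key):
        return self.store.get(key)

    def expire_namespace(self, namespace):
        """Delete every key that was bound to the namespace."""
        for key in self.namespaces.pop(namespace, set()):
            self.store.pop(key, None)

cache = NamespacedCache()
# The friend list of account 3 depends on account 3 and on each of its friends.
friends = [1, 2, 5, 6, 8]
bound_to = ["account_3"] + ["account_{}".format(f) for f in friends]
cache.set("friends_of_3", friends, namespaces=bound_to)
cached_before = cache.get("friends_of_3")
cache.expire_namespace("account_2")  # account 2 was deleted or changed
cached_after = cache.get("friends_of_3")
```

    Because the cached list is bound to several namespaces at once, expiring account_2 alone is enough to drop it, which is exactly what the versioned-key approach cannot express.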

Code

    You can grab the source code (PHP) from github

Requirements

    In order to use this code you will need:
  • Memcached (of course!)
  • pecl-memcache PHP extension
  • A config file where $cache_enabled is set to TRUE
Usage example

    Check github for a usage example

Benchmarks

    I tested it on my personal server, with PostgreSQL, session handling and storage in the DB, and PHP 5.4.1_RC1, with the following results:
    Sessions are cached when user logs in via my DBCache class.
Without cache hit:
~35 SQL queries
~0.37 seconds page generation
With cache hits:
~0 SQL queries
~0.04 seconds page generation

Decreasing production server downtime with kexec

    When managing a production server, one of the most important things is the tradeoff between server downtime and keeping the server's software updated.

    While most updates can be applied with little to no downtime, a kernel update is always problematic, since it typically requires a full reboot and significant downtime. To avoid that, many servers do not apply kernel updates as often as they should, especially cheap rented servers.

    On the other hand, there are servers whose owners like to boast about a high uptime. While that might look good, it is in fact quite the opposite: a high uptime means the server's software might not have been updated!

    So I will introduce kexec, with a benchmark to show how it can reduce downtime by reducing reboot time. But first, let's look at how a unix-like system boots and shuts down.

    In a typical boot/shutdown cycle, these are (approximately) the steps performed by the machine:

  • Boot
    1. BIOS stage
    2. Bootloader load
    3. Kernel load
    4. INIT
      1. Kernel init
      2. Hardware initialisation
      3. Checking and mounting partitions
    5. Start services
  • Shutdown
    1. Stop services
    2. Sync discs
    3. Unmount partitions
    4. Hardware stop
    5. Hardware power off
    By using kexec, some of those steps are skipped, since it switches kernels from a running system. These are (approximately) the steps for a kexec reboot:

  1. INIT
    1. Kernel init
    2. Checking and (re)mounting partitions
  2. Start services
    To prove that reboot time decreases, I created a little bash script to measure downtime (testTime.sh) and tested it on my personal server running a Gentoo system.

    To use the provided script, you must run it after apache has been stopped:
time ./testTime.sh SERVER_WWW_URI > /dev/null 2>&1
    The commands I used for this benchmark are (via SSH):
Normal Reboot: /etc/init.d/apache2 stop && echo "Now you can exec time measurement script" && reboot
kexec reboot: kexec -l KERNELIMAGE --reuse-cmdline && /etc/init.d/apache2 stop && echo "Now you can exec time measurement script" && kexec -e
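
    The measurement idea (poll the web server until it answers again and time how long that takes) can be sketched as follows. This is not testTime.sh: it is a hypothetical Python equivalent, and the function names http_probe and measure_downtime, as well as the polling interval, are assumptions made up for this example.

```python
import http.client
import time
import urllib.request

def http_probe(url, timeout=2):
    """Return True if the URL answers with a successful HTTP response."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, HTTP errors: the server is not back yet.
        return False

def measure_downtime(probe, interval=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll probe() until it succeeds and return the seconds that elapsed."""
    start = clock()
    while not probe():
        sleep(interval)
    return clock() - start
```

    It would be started right after the web server is stopped, for example with measure_downtime(lambda: http_probe(SERVER_WWW_URI)), where SERVER_WWW_URI is the address of the server being rebooted. The sleep and clock parameters exist only so the loop can be exercised without a real server.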

    These are the results I got:
Full reboot:
real    1m21.996s
user    0m3.241s
sys     0m2.833s
kexec reboot:
real    0m31.415s
user    0m1.872s
sys     0m1.684s
    So, to sum up, although a kernel update still causes some downtime, kexec reduced it in this test from about 82 seconds to about 31 seconds (roughly 62% less), so for most servers out there, reboot time is no longer an excuse for not keeping the system updated!

Friday, March 16, 2012

Adobe against FOSS and telling people which software to use

    This company has always done very bad things, leaving its users alone whenever it wanted to. I'll try to explain it a little better for anyone who has never heard of some of them:


  1. A very long, long, long history of security vulnerabilities in its products (Flash), which even made Chrome fall to an issue as serious as remote code execution. (Source: zdnet, Adobe).
  2. They dropped 64-bit support whenever they wanted, leaving users either with an old version of the Flash plugin or with nothing at all. A simple google search is enough to demonstrate this fact. It may not seem important, but it really is, because if my whole system is 64-bit (Linux was the first to have a complete working 64-bit system, by the way), who is Adobe to tell me to install a 32-bit browser or to change my OS? Especially nowadays, when having more than 4 GiB of RAM is not that strange.
  3. Dropping Linux support for their Air framework, and stating that it could still be done if some of their partners code it. (Source: phoronix)
  4. Dropping the Linux Flash Player browser plugin except for Chrome (Source: Adobe). Well, Adobe, it is good that you work with a company (Chrome is open source, but don't forget it is run by a company anyway) to improve things, but it is unacceptable that you simply drop all the other browsers just because they don't want to work with you or because they won't accept your guidelines.

    I am sure there are many, many other reasons I can't recall now, but there are several things I can say for sure and want to share with the world (hoping Adobe reads this some day):


  1. You, Adobe, are the perfect reason not to tie my future as a developer to a closed source platform, because you have proven that you do what you want without caring even about your customers!
  2. You, Adobe, are NO ONE to say what software I should run. You may offer all you have, but you are not important enough to force me to use 32-bit, to use Chrome, to use any other OS, or anything similar.
  3. As a company, you fail, because I will not change just because you offer a product (which I don't like) whose main use is embedded video players. As a company, you should consider that Linux, despite being a small % of your market share, is still important, because your valuable programmers will get complaints from their people too, and because fortunately you have competition now: HTML5. So guess what: the only thing I lose without Flash is video players, and that can be done with HTML5 too, so who will lose?



Wednesday, March 07, 2012

About Facebook Antiprivacy Policy

    Facebook has always been very irresponsible about respecting people's privacy and has done very bad things in the past in this matter.

    Some time ago, following the declarations of Mark Zuckerberg (Facebook's CEO) [1], I deleted my facebook account forever, and I won't be back until it changes (which I see as very unlikely). Let me explain this and why facebook is evil.

  1. It has been proven that facebook creates ghost profiles [2] and retains deleted data without the owner's consent. This is especially true when deleting an account, or when using the "find your friends in facebook" feature, in which you type your email address and your password, and it searches for your possible friends in that social network.
  2. The friends of friends feature is totally wrong: if in real life the friends of my friends are not necessarily my friends, then why does facebook act as if they were?
  3. Related to the previous one: if I set my account to be seen only by my friends, then when I post a comment on, or like, a photo of one of my friends, that content automatically becomes visible to friends we don't have in common, and the worst thing is that my friend has to set this privacy option for me!
  4. You explicitly give facebook permission in their TOS [3] to use, without royalties, any content you upload to facebook, with no option to deny it. So if they want to use one of your personal pics for a Coca-Cola advertisement, they use it, make money with it, and you won't be able to complain (nor see any of the money).
  5. Many people blame google for indexing their first and last name because of their facebook account. This is not true. As a webmaster, I know that google respects the mechanisms you have to tell it not to crawl some of your pages. So I blame facebook, because they simply don't want that to happen, for purely economic interests (I remind all of you that facebook has a high page rank, and one of the reasons is having that much data indexed in google).
    There are even more reasons to hate facebook's idea of privacy, simply because their CEO does not value their people's privacy, so I ask all my readers: do you still want to be in a place where you are not valued?
Unfortunately, in the USA privacy laws are not as strong as the ones we have in the EU. For example, in Spain we have the LOPD (Ley Orgánica de Protección de Datos, or Data Protection Act translated to English).

    I am very pleased we have this law in Spain to ensure companies treat my data correctly and to guarantee my rights. Let me explain it better, so you can understand why facebook can be considered illegal in Spain or in the EU.

  1. The LOPD says the data owner is not the company running the DB; instead, the data belongs to the person the data refers to.
  2. You have 4 undeniable rights:
    1. Access to ALL the data a company has about you.
    2. Alter any data to correct mistakes it may have.
    3. Permanent deletion of partial or full data, where permanent means that you can ask for it to be effectively deleted from their database, and not just marked as invisible. Facebook does not do this: they don't delete anything, they just mark that data as invisible or something similar [4].
    4. You are entitled to set the visibility of your data even inside the target company. In other words, you can tell any company that it is forbidden to publish part (or all) of your data, even on their own webpage.
    The conclusion would be: I am against facebook for those reasons, and I encourage people to stop using facebook or even to delete their accounts until facebook realizes that we have a right to privacy and that my data is mine.
It would not surprise me if some judge in the EU decided to give Facebook an ultimatum to change things, something that could be even worse than the Ireland request to facebook [5].