Archive for the 'Linux' Category

Do you crash your database on machine shutdown?

Thursday, November 5th, 2009

Most DBA’s would say ‘NO’ without hesitation. I dare you to take another look.

What is this blog about?
This blog is about DBA’s who run databases on linux. Linux sysadmins are target audience too; this blog is about the stop/start system on linux, and to be more precise: about the specific implementation of the stop/start system on redhat and it’s derivatives (Oracle’s enterprise linux for example, but also centos)

What about it?
The redhat stop/start system looks and feels like the traditional unix system V stop/start system found on most unixes. Correct usage of the redhat stop/start system is in fact quite different from that.

The traditional unix system V stop/start uses an executable script in the init.d directory (often /etc/init.d or /etc/rc.d/init.d) and uses a symbolic link in the runlevel directories (rc1.d, rc2.d, rc3.d, etc.) from the script in init.d, but prefixing the name it with ‘S’ (for start) or ‘K’ (for kill, thus stop) and a two digit number to indicate the order for starting or stopping.

The redhat stop/start system works with an executable script in the init.d directory (/etc/rc.d/init.d). The correct way to enable stop/start is using the chkconfig command:
chkconfig --add yourscript
This command check if the script yourscript is present in /etc/rc.d/init.d, then reads a line looking like this in the stop/start script:
#chkconfig: 2345 80 05
This tells chkconfig that the script should run in runlevels 2,3,4 and 5 with order number 80 and stopped (killed) with order number 05.
To check if chkconfig did understand your script, use:
chkconfig --list
Further reading about chkconfig:
man chkconfig

Alright, but what about crashing the database?
A lesser known fact is that the redhat stop/start system uses a file with the same name as the stop/start script in /var/lock/subsys to determine if the script is already started (so it doesn’t get started if it’s already started in a previous runlevel) and needs to be created in the start part, but more importantly it determines if the script needs to execute the stop procedure by checking the existence of this file in /var/lock/subsys. As part of the stop procedure the file needs to be removed, so the stop procedure isn’t run again in another runlevel.

In many custom made scripts this file doesn’t get created in /var/lock/subsys in the start procedure, so the stop procedure actually is never executed, because the script is already stopped from the perspective of the stop/start system. This typically means that the oracle stop/start script does a perfect job starting the database, but actually never stops the database, which means that during a shutdown or a reboot, the database actually crashes.

Sadly, I see very much scripts forgetting this (even ones from respectable sources, like redhat; see: Redhat summit beginner’s guide to running oracle on RHEL, for example, there are much more). Check your stop/start script today!

How to view your website from the outside while being on the inside

Monday, May 4th, 2009

Sound cryptic, doesn’t it?

In most situations where you do configuration, tuning or other settings of web environments (apache, oracle http server, OC4J, weblogic, websphere, etc.) you have access from ‘the inside’, most of the time access via a separate LAN (often called ‘administration LAN’) which let’s you get access in another way than the regular web traffic, which comes in from ‘the outside’ which is firewalled different (or is the only connection being firewalled), thus is handled different.

There are situations where you want to see what traffic from ‘the outside’ does instead of your ‘inside connection’, for example:
- ssl/certificates/wallets
- connecting webservers with the webcache
- altering hostnames
- review firewalling and webtraffic routing
- review NIDS (network intrusion detection)

This is where tor comes in.

What does Tor do? The best explanation is on the tor website. What it does is route your traffic encrypted through an anonymous, distributed network. Tor accomplishes much security related things, for this subject it’s important it setups an connection randomly on the internet.

Tor is not available on the RHEL/OEL/CentOS CD/DVD’s and network repositories. It’s also not present in EPEL (extra packages for Enterprise Linux), so it must be downloaded from the tor download page (choose ’stable’).

In order to fully use it, you need privoxy, which is present in the repository (thus can be installed using yum).

After the installation of both, there’s a little configuration necessary, but then you’ve got a proxy setup which let’s you browse the internet (anonymously) from a random place on the internet!

For people who have security considerations, and/or doubts about anonymity of their traffic, anything which can use a proxy can use it.

This means nikto can be used entirely anonymous, whilst things like nmap, nessus can’t with this setup.

The hidden gem ’sosreport’ in RHEL and OEL

Friday, May 1st, 2009

During a stroll through the filesystem I encountered an executable (a python script actually) called ’sosreport’. You’ve got to admit: the name sounds intriguing.

The man-page says ‘Generate debugging information for this system’. This is getting even more interesting…

Lets run it!

$ /usr/sbin/sosreport

sosreport (version 1.7)

sosreport requires root permissions to run.

The utility requires root permissions. Not very surprising, root got all permissions to get internal system information.

Again, now as root:

# sosreport

sosreport (version 1.7)

This utility will collect some detailed information about the
hardware and setup of your Red Hat Enterprise Linux system.
The information is collected and an archive is packaged under
/tmp, which you can send to a support representative.
Red Hat will use this information for diagnostic purposes ONLY
and it will be considered confidential information.

This process may take a while to complete.
No changes will be made to your system.

Press ENTER to continue, or CTRL-C to quit.

Please enter your first initial and last name [localhost]: frits hoogland
Please enter the case number that you are generating this report for: 9999

Progress [###################100%##################][05:15/05:15]

Creating compressed archive...

Your sosreport has been generated and saved in:
/tmp/sosreport-fritshoogland.9999-725705-3400ad.tar.bz2

The md5sum is: 8bac4b5ccbb8c1a79dd6571f293400ad

Please send this file to your support representative.

Let’s see what it provides, and see if it got any benefits above Oracle’s RDA…
It produces two files in /tmp:

-rw-r--r-- 1 root root 33 May 1 14:34 sosreport-fritshoogland.9999-725705-3400ad.tar.bz2.md5
-rw------- 1 root root 631098 May 1 14:34 sosreport-fritshoogland.9999-725705-3400ad.tar.bz2

The list of files when extracting sosreport-fritshoogland.9999-725705-3400ad.tar.bz2 simply is too big to put in the blogpost.
It contains all linux configuration files (yes, that’s a bold statement), all startup/shutdown scripts, all kind of current system information (dmidecode, lsmod, lspci, netstat, iptables, rpm, network routes, LVM, device mapper, EMC multipathing, etc.), could go on and on!

This command is able to let anyone who can get access to the system make a file with all relevant settings for troubleshooting. That’s handy! It’s very hard to get information about network configuration, firewall (iptables), etc. out of inexperienced administrators. This command lets you get all information so you can do the troubleshooting out of the archive instead of a question and answer game with the administrator!

Introducing pgstatspack, a statistics package for postgresql

Saturday, May 10th, 2008

When performance or behavior during the development or test phase is not what you’ve expected, the normal way to go is explain or explain analyze the statements involved.

But how about production environments? And how can you tell what has happened? The most common scenario is probably when the behavior is unacceptable, so instant action is required. That is not the way you, as a DBA, want to be notified how about the dissatisfaction of the performance.

Some time ago, glenn fawcett of sun microsystems encountered probably a situation in which he wanted to know what was going on on a database level. He also had experience with the oracle database, and with a statistics package called “statspack”. What glenn did, was emulate some of the functionality of statspack, and create a version for postgresql, heavily focussed on logical IO and physical IO.

Later, this package was lost, and corresponding page also. In fact, I managed to get the package using the internet time machine (http://www.archive.org/web/web.php). Glenn recently got notified of this fact, and created a post about the package on his blog.

Well, I got into the same situation as Glenn. I have a postgresql database, and want to know what’s going on. In fact, I want to know more than just logical and physical IO, I want to know everything that is available. I also want to know postgresql version, database size, parameters, etc. Also, I want to make use of it to do analysis on daily basis of the snapshots, so I can do capacity management and do predictions about growth and performance.

That is why I’ve enhanced pgstatspack, and made it available open source: http://pgfoundry.org/projects/pgstatspack

Enjoy!