Resolving network connection issues on Linux Mint (Ubuntu)

So several years after purchasing my smallest form-factor netbook, an Aspire One, I stumbled upon an issue that I couldn’t explain. Hopefully this will help somebody and save the time it took me to figure it out.

Background

The Aspire One was the only netbook with a decent amount of memory (2GB RAM at that time) and a Linux installation out of the box (= minimal support guaranteed). I used it for some time, but then gave up for a while since the keyboard proved to be a bit too cramped for my palms and fingers, so extensive use caused some pain. Still, it is a nice little piece of hardware that is easy to transport, not expensive (meaning if it gets stolen or broken I won’t cry over it :)) and, finally, it is a full-fledged Linux-powered machine with a real (although small) keyboard and a hard disk large enough to host my music collection. All in all, after some time I decided to give it another try.

Getting AWS instances on the network

Today I was bringing up some new instances on Amazon AWS for a commercial service I provide some basic support for. Nothing special, but somehow I could not connect to the instances I was starting. Searching around I found many things, including the fact that I could only search the forums but not contact AWS support unless I bought premium support. WTF… Anyway, at some point I right-clicked, in the list of EC2 instances, on the instance giving me a hard time and selected the “Get System Log” item (see screenshot below).

[Screenshot: AWS EC2 context menu]

Ah-ha, here it goes… I saw some log output (the instance was running CentOS 5) that made me rather unhappy and, so far, remains unexplainable…

Bringing up loopback interface: [ OK ]

Bringing up interface eth0:
Determining IP information for eth0... failed.
[FAILED]

Starting getsshkey:
curl: (7) Failed to connect to 169.254.169.254: Network is unreachable
getting ssh-key...

It seems that the instance had some problem with network connectivity. Nice. I terminated the given instance and tried again about 15 minutes later. The problem mysteriously disappeared. Neither my own network configuration nor the image had changed. AWS as a reliable provider?.. I do welcome alternatives!
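For what it’s worth, the failing curl in that log is hitting the (real) EC2 metadata service at 169.254.169.254, which is only reachable once networking is up, so a boot script can simply retry for a while. The retry helper below is my own sketch, not anything AWS ships:

```shell
#!/bin/sh
# A small retry helper: run a command up to N times with a pause between
# attempts, succeeding as soon as the command does.
retry() {
    attempts=$1; shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Against the EC2 metadata service that failed in the log above:
# retry 5 curl -sf http://169.254.169.254/latest/meta-data/instance-id
```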

Monitoring a web service with Zabbix

For a lesson to be remembered, it has to be painful. This is how humans seem to work. This time the rather disastrous death of one of our vital services, without me noticing it, seems to have cost my company some money and clients. This is really not a good thing for a startup that desperately needs to scale. So the lesson is: have a proper monitoring service for all core activities and do not rely on humans ‘checking it once in a while’. There are numerous off-the-shelf packages that provide everything you need and more. After some research about a year or so ago I chose Zabbix as the tool. It may have been overkill, and Zabbix is definitely NOT the most intuitive tool to work with, hence this post.

What is needed is to check the availability of certain public services (on fixed URLs) to make sure they still do what is advertised. If not, I want to be notified within a reasonable period by e.g. e-mail (immediately is the ideal case, but I don’t want to poll my services every second). Zabbix can do a million and one things, including such a trivial task as this one, but there are quite some steps to be taken. Hereby a short checklist that I had to discover through the manuals and trial and error.

Pre-requisites
I assume that you already have the following set up (since I had them already):

  • a host to be checked
  • a user in a proper user group with a defined e-mail address where you want to receive notifications

E-mail configuration
First of all, make sure Zabbix can send e-mails. Apparently this is not part of the initial configuration, as I had the default settings for quite some time. Go to Administration | Media Types and select the type you would like to use. I wanted e-mail (having a BlackBerry, the other ones are rather superfluous), therefore I clicked on Email. By default it contains some sample information; fill in the following:

  • SMTP server – enter the SMTP server you can use to send outbound e-mail.
  • SMTP helo – I just put my Zabbix server name there, but I don’t see it coming back anywhere.
  • SMTP email – enter the e-mail address of ‘sender’, a ‘fake’ one is OK unless the SMTP server you entered above is picky on sender addresses.

Don’t forget to save the changes.

Note: If you wonder why I’ve spent so many lines on this and haven’t simply referred to the Zabbix documentation, the answer is rather simple. The Zabbix 1.8 documentation on this topic is very concise and states exactly two words: “Email notification.” Check it yourself if you wish.

Indicate what has to be monitored
So now let’s define what has to be monitored. I wanted to be able to open a certain URL and see whether I get a proper response (yes, I could write a few Python lines for this, but I wouldn’t get it all in one consistent interface with as many details as I get in Zabbix). This is called a “Web Scenario” in Zabbix. Go to Configuration | Web to define one.

Note: Here is a part I don’t really like about Zabbix. It took me some time to find the “Create Scenario” button, simply because it was at the TOP right of the page, ABOVE the list of scenarios. OK, it is consistent with the other pages, but it has very little intuitiveness in it.

Anyway, when you have located the “Create Scenario” button you will get a form that needs to be filled in. Here the Zabbix documentation is a bit more verbose; check the “WEB Scenario” manual for some tips. One thing to mention is that a web scenario is just a ‘placeholder’ for actual actions (called ‘Steps’); it does not do much itself. To actually indicate which page has to be opened you will need to define at least one step (click the ‘Add’ button in the form). OK, here you go: define a good name (you will have to locate it later on, so come up with a distinct one!), paste the URL of the page/service you would like to check, and put in which HTTP response code (e.g. 200 for OK, 404 for Not Found, etc.) you expect when it is called.
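As a sanity check outside Zabbix, the same response-code comparison the step performs can be reproduced with a couple of lines of curl; check_url below is a hypothetical helper of mine, and the URL in the comment is a placeholder:

```shell
#!/bin/sh
# Fetch a URL, discard the body and compare the HTTP status code with the
# expected one (default 200), just like the web-scenario step does.
check_url() {
    url=$1
    expected=${2:-200}
    code=$(curl -s -o /dev/null -w '%{http_code}' "$url")
    [ "$code" = "$expected" ]
}

# check_url "http://your.server.com/service" 200 && echo OK || echo FAILED
```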

We’re almost there. What is missing is the coupling between the mail and the check.

Configuring triggers
Now we need to couple the new scenario to the trigger list of a host. Go to Configuration | Host and click on the host you would like to couple the scenario to (well, I suppose you were checking a service provided by some server and you have that server in the list already…). Then click on the “Triggers” link (another ‘intuitive’ one). When you get the list of triggers, locate the “Create Trigger” button (yes, at the right near the top of the page, grrrr), and now we have to do some ‘magic’. Come up with a distinct name (you will need this one later again!), then press the ‘Select’ button. In the next pop-up press the ‘Select’ button again.

Here you have to select the name of the web scenario you defined above; make sure you select the one measuring the RESPONSE. The first one in the list for me was measuring ‘Download speed’, which is not exactly what I am interested in NOW (maybe some time later). OK, having the ‘Response code for…’ item selected (just click on it) brings you back to the form. Select “Last value not N”, leave N at 0 and press the ‘Insert’ button. You should end up with something like

{your.server.com:web.test.rspcode[Scenario Name,Step Name].last(0)}#0

Now press ‘Save’. But don’t relax yet: this makes the trigger go off when your service is not behaving properly, but you won’t get a notification yet. Let’s go to the final step.

Creating alerts
Go to Configuration | Actions and press the ‘Create Action’ button. Here we have another set of odd forms that need to be filled in. BTW, at this point I was almost giving up and dreaming of a small script sending me happy e-mails once in a while :). Anyway, give it a name, leave “Event source” on “Triggers”, change the message header (e.g. put some tags there so that your e-mail program can sort/filter these messages) and body (e.g. signature, link to the Zabbix server, etc). DON’T press ‘Save’ just yet! First press the ‘New’ button in the form below under ‘Action conditions’. Here you indicate upon which condition the e-mail you have just defined above gets sent (I wonder what twisted mind came up with this ‘workflow’). I simply put in the name of the trigger (defined above). You can also filter on value (OK | ERROR), but I find it useful to receive notifications on both errors and recoveries. OK, here is the final step. At the top right there is another form called ‘Action operations’. Press ‘New’ there, select at least the user group you want to send your e-mail to, and select ‘Email’ from the ‘Send only to’ list. At least I don’t have anything else (for sure no Siemens MC35 anymore) to send SMSes with.

OK, now press ‘Save’ under the ‘Action’ form.

You’re done! Next time your server doesn’t give the proper response you will get an e-mail about the event, and off you go fixing the problem, hopefully before hordes of clients start cursing you.

Well, I still wonder if writing a script would have been easier (for sure fewer words than I have spent describing the steps above), but I guess that is always the case with heavy tooling used for rather simple purposes.

Good luck!

CentOS: custom logwatch script

So, let’s say you want to parse your own log files and extract some very useful information, but (as every normal person should) you don’t want to reinvent the wheel. Well, logwatch does everything you need except, of course, parsing log files in your format or interpreting your own messages. This is easy to fix, although the documentation left me puzzled for about an hour or so. Hence this post.

Apparently this is not as straightforward as it should be for ‘just a bunch of (highly usable) perl scripts’, therefore my experience below.

Install

If you don’t have logwatch installed (how come?) then do something like:
# wget http://downloads.sourceforge.net/project/logwatch/logwatch-7.3.6.tar.gz?use_mirror=ignum
# tar xzf logwatch-7.3.6.tar.gz
# cd logwatch-7.3.6
# ./install_logwatch.sh

Normally you should already have one pre-installed.

Custom script

I will explain it in the order in which I approached it. This means that I don’t create all kinds of configuration files FIRST, like all the regular HOWTOs suggest. Not at all. First you create… a script to parse your log file! And test it, of course… And THEN you think about how to hook it up. Not the other way around.

Script file

You can use anything that can be executed and accepts input from STDIN (so potentially I could have used Python), but I somehow decided to be consistent with the rest of logwatch and carry on with perl. Well, there were days I thought it was a fine language (approximately until I’d read “The Pragmatic Programmer“), but now it took quite some shivers to get back into the perl mood…

Basically I copied an existing script and started messing with it, resulting in something like this (no fancy stuff):

#!/usr/bin/env perl
use strict;
use warnings;

# Will hold pairs (error_type_string => count).
my %ErrorTypeCount;
my $TotalErrorsCount = 0;
my $ThisLine;

# Go through the input (logwatch feeds the log files via STDIN).
while (defined($ThisLine = <STDIN>))
{
    # The exception type is the text between the parentheses
    # following "[ERROR]"; count the errors per type.
    if ($ThisLine =~ /\[ERROR\].*?\(([^)]*)\)/oi)
    {
        my $key = $1;
        $ErrorTypeCount{$key}++;
        $TotalErrorsCount++;
    }
}

# Print the total stats.
print "\nERRORS\n";
while (my ($type, $count) = each(%ErrorTypeCount))
{
    print " * FOUND $count of $type\n";
}

print "TOTAL $TotalErrorsCount ERROR(s)\n";

The code above goes through the log, picking up any match of the error pattern, which in my case looks like:

2010-01-01 [ERROR] Raising Exception: () my error description

It counts the errors per ‘class’ and then prints a summary.
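The extraction the script performs can be tried out on the command line with the same kind of sample line; the exception class (“MyError” here is purely hypothetical) is what becomes the hash key:

```shell
#!/bin/sh
# Pull the exception type out of a sample log line, mirroring the Perl regex.
echo '2010-01-01 [ERROR] Raising Exception: (MyError) something broke' \
    | sed -n 's/.*\[ERROR\].*(\([^)]*\)).*/\1/p'
# prints: MyError
```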

Inputs and outputs

Important to understand:

  • All log files you specify in the configuration files are fed as STDIN input to your script. So you just read from STDIN and do not bother with which files these actually are.
  • Any strings you print will become part of your logwatch mail sent to the corresponding account (root by default).

So in your script you focus only on what you really need to do: parsing your stuff, nothing else. No reinvention, only what is needed! Now save the script as e.g. my-service.pl. Think of a proper name: normally a script is named after the service whose log file it parses (e.g. ssh, or exim), but it would be strange to have an ssh file in your source folder, wouldn’t it? So let’s give it some name with an extension for the moment.

What’s next?

Of course, you test your script by feeding it some sample log data, like

# cat my.log | perl my-service.pl

OK, now it all works perfectly (perhaps hours later, but that’s where your time should go, not into reading endless discussions on why logwatch itself does not do this or that…). Now you’re ready to deploy.

Three actions are needed to deploy your script on CentOS, integrated with the logwatch installation:

1. Deploy the script itself

The script has to be copied next to the other scripts, where logwatch expects them. I kept the naming convention (no extensions) here as well, so:

# cp my-service.pl /usr/share/logwatch/scripts/services/my-service
# chmod a+x /usr/share/logwatch/scripts/services/my-service

So you will end up with a new executable file under the /usr/share/logwatch/scripts/services/ folder. Alternatively you could of course create a symlink to your file.

2. Create the configuration file for the log group

Well, this was not exactly obvious, but you need to create a configuration file under /etc/logwatch/conf/logfiles/. Let’s say we create something like my.conf. Note that you will need the NAME of this file in the next step. The file can contain something like

# Place this file to the /etc/logwatch/conf/logfiles/
LogFile = /var/log/my/my.log
Archive = /var/log/my/my.log.*.gz
*RemoveHeaders

Here you put the ACTUAL log file and archive names. So you let logwatch take care of feeding you the right input. This is already better than doing it all yourself, isn’t it?

For one log file, creating a “group” may sound like overkill, but this is how it works, and it pays off if you start reusing stuff; for now this is a “necessary evil”.

3. Create the configuration file for the service

OK, now we create the file which logwatch picks up to determine which script to run (by matching the name!). So we create my-service.conf under /etc/logwatch/conf/services/ and put in something like

# Place this file to /etc/logwatch/conf/services/
Title = "MY Service"
LogFile = my

Note that here you use the LogFile directive to refer to a LOG FILE GROUP, which was very confusing to start with, but OK: been there, done that.

Finally, you can do something like

# /etc/cron.daily/0logwatch

and check the corresponding account mailbox for the result.

I hope this spares somebody time…

CentOS: Id “7” respawning too fast: disabled for 5 minutes

Ever since I got my brand-new server running CentOS 5.4 and installed logwatch, I was scrolling like mad (and that DOES take time on a BB) to get through the numerous Id "7" respawning too fast: disabled for 5 minutes messages. I knew where they came from, but apparently I had missed one tiny step.

What does Id “7” mean?

Examining /etc/inittab reveals the following line:

7:2345:respawn:/sbin/mingetty --autologin=root tty7

I don’t know why someone would do that, because the first thing I had done was disable root access by simply changing the default shell for the root account to /sbin/nologin in /etc/passwd. I don’t want any root logins; I live in sudo when I need to.

What’s the trick?

Well, simply commenting out the offending line in /etc/inittab didn’t do anything by itself, and I was too cautious to restart the server since “the users may be coming right now” and downtime on a brand-new service is not a good thing. So after yet another frustrating morning (my mornings start with grabbing my BB and reading the logwatch messages for the last weeks…) I finally “solved” the problem, which was just my lack of knowledge (as most problems are):

sudo kill -HUP 1

and the configuration is re-read and the message is GONE, without the need for an (expensive) restart! Wow! How stupid can you be…

Crond and timezones

I was going crazy for the last few days. I had set up an important job on the brand-new CentOS 5 server and it didn’t seem to run. Every morning I was rushing to start ssh and check the result, but alas, nothing to be seen. I was starting the script manually and cursing my Linux knowledge. What I should have cursed was my attentiveness to my own earlier actions, one of which was setting a different timezone.

As any admin knows, a local timezone is evil for things like cron jobs, which may otherwise be affected by summer and winter time shifts. So what do you do? You set the time to UTC. On CentOS you may do something like this (assuming you’re well-behaved, have disabled the root account and do everything under sudo):

sudo rm /etc/localtime
sudo ln -s /usr/share/zoneinfo/UTC /etc/localtime
sudo tzdata-update

What one should realize (and I didn’t) is that not all services will pick up the new settings. And guess what, cron is one of them. So what you need to do is the following:

sudo /etc/init.d/crond restart

This makes sure that cron is not left running with your originally set timezone (which in my case was about 6 hours away from me) and will behave according to the newly set timezone.
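The underlying point is that a process resolves its timezone from the environment and /etc/localtime when it starts, which is easy to demonstrate in the shell (the zone names below are just examples):

```shell
#!/bin/sh
# The same instant rendered in two zones: the wall-clock time differs...
TZ=UTC date
TZ=America/New_York date
# ...but the epoch does not; a daemon started before the zone change simply
# keeps scheduling with whatever zone it read at startup.
TZ=UTC date +%s
TZ=America/New_York date +%s
```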

MySQL and SSL: ERROR 1045 (28000): Access denied for user…

OK, this one was just so frustrating I cannot keep it to myself. It cost so much time and nerves to realize that I had been pointed in all but the right direction after all.

So here it goes. I added a user to the MySQL database, I require an SSL connection (actually X509), and it all works on my development system (Mac OS X 10.4). So I naively didn’t expect any problems on the test server running Ubuntu. I was wrong. And my mistake, apparently, has a name… But let’s start from the beginning.

So what I saw when trying to connect to the DB, either via Python/MySQLdb or via the command-line client, was the same:

_mysql_exceptions.OperationalError: (1045, "Access denied for user
'xxx'@'localhost' (using password: YES)")

or

ERROR 1045 (28000): Access denied for user 'xxx'@'localhost' (using password: YES)

Great. Now what?

I tried every single thing that made sense to me:

  • Re-created the user, updated the privileges and even added
    FLUSH PRIVILEGES
    at the end (never needed it before).
  • Re-generated the certificate files
  • Different DB

OK, in the end it all boiled down to the following message in the dmesg output:

[877463.513737] audit(1263600950.291:21): type=1503 operation="inode_permission"
requested_mask="r::" denied_mask="r::" name="/xxx/xxx/certificates/server-cert.pem"
pid=11840 profile="/usr/sbin/mysqld" namespace="default"

Bingo! MySQL cannot read the f…g certificate? Why-y-y-y-y? Apparently because apparmor (now I know the enemy’s name!) does not allow it. And this is all because I had installed the certificates into a ‘non-standard’ folder. Well, adding the following line to /etc/apparmor.d/usr.sbin.mysqld:

/xxx/xxx/certificates/*.pem r,

and then issuing the following commands:

# apparmor_parser -r /etc/apparmor.d/usr.sbin.mysqld
# /etc/init.d/mysql restart

finally put everything in its place.
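Before blaming permissions (or after fixing them), it does not hurt to verify that the certificate file itself is sane; check_cert below is my own small wrapper around openssl, and running the same command via sudo -u mysql shows what the mysqld user can actually read:

```shell
#!/bin/sh
# Succeeds only if the given file is readable and parses as a PEM
# certificate; prints the certificate subject on success.
check_cert() {
    openssl x509 -noout -subject -in "$1"
}

# What the mysqld user sees (path taken from the AppArmor denial above):
# sudo -u mysql sh -c 'openssl x509 -noout -subject -in /xxx/xxx/certificates/server-cert.pem'
```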

Two hours and a lot of frustration, but it seems to work now. I hope you spend less time finding this on Google :).

Have a nice weekend!

Subversion 1.6.3 on Ubuntu 9.04

Since installing subclipse I’ve got the same problem I already had on Mac OS X: my subversion command-line client (1.5.x) started to complain that the repository was already used with a newer version. Downgrading the repository version works, but is not a very nice solution, so I decided to get the latest version. Apparently this takes more than just an aptitude command. This is what finally worked for me:

Download the source code for 1.6.3 (or later, when available) and the corresponding dependencies from http://subversion.tigris.org. In the case of 1.6.3 you need the following files:

  • subversion-1.6.3.tar.gz
  • subversion-deps-1.6.3.tar.gz

Download the OpenSSL from http://openssl.org/source/ if you need https support:

  • openssl-0.9.8k.tar.gz

Then perform the following commands:

sudo apt-get install libssl-dev
sudo apt-get install zlib1g-dev


tar xvfz subversion-1.6.3.tar.gz
tar xvfz subversion-deps-1.6.3.tar.gz
tar xvfz openssl-0.9.8k.tar.gz


cd openssl-0.9.8k
./config
make


cd ../subversion-1.6.3
./configure --with-openssl=/usr/local/ssl --with-ssl --with-zlib=/usr/include --without-berkeley-db
make
sudo make install

With all this done I got a working subversion (client) 1.6.3 under Ubuntu 9.04.

TIP: If you have done all of the above and typing svn --version still shows the default 1.5.x, try another console. The installation of subversion apparently adds a new path which is not picked up on the fly.
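Alternatively, instead of a fresh console you can flush the shell’s command-location cache in the current one; a small sketch (the /usr/local/bin location in the comment is just where make install typically puts the binary):

```shell
#!/bin/sh
# The shell remembers the full path of commands it has already run; flush
# that cache so the freshly installed binary is found without a new console.
hash -r
# command -v svn    # should now print the new location, e.g. /usr/local/bin/svn
# svn --version
```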

XPlanner setup

I like XPlanner, but I am also very suspicious of betas being frozen for years (OK, GMail being a nice exception :)). Anyway, today I was happily changing the XPlanner configuration to get around a known bug in XPlanner, which fortunately didn’t cost too much time to find a solution for. XPlanner on Tomcat 5.5 + Java 6 on Ubuntu 8.04 apparently results in an exception during startup. The following recipe (thanks Alex!) gives a quick fix.

  1. Open /xplanner/WEB-INF/classes/spring-beans.xml
  2. Find the bean id="metaRepository"
  3. replace
    <map>......</map>
    with

    <property name="repositories">
    <bean class="java.util.HashMap">
    <constructor-arg>
    <map>.....</map>
    </constructor-arg>
    </bean>
    </property>

Arrgggh, I feel I am hacking too much just to get things working when I already expect them to work out of the box. Is this the way all admins feel?..

Killing time with vsftpd

It seems that I have been out of luck for the last 24 hours. I tried to set up virtual ftp users for my server, and given the number of posts on this topic it should have been a piece of cake. I wish… Some good posts are worth having a look at though (and reading them CAREFULLY!):

Several lessons on this topic:

  1. Make sure you have pam_pwdfile.so. By default it is NOT present on Ubuntu (at least 8.04). Get it by issuing

    sudo apt-get install libpam-pwdfile
  2. Don’t try to be smarter than you are. The following command (note: without the -m switch!) does the trick with the passwd file:

    sudo htpasswd -c /etc/vsftpd/passwd user_name
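For completeness, the PAM side that ties vsftpd to that passwd file looks roughly like the sketch below. This assumes the passwd file sits at /etc/vsftpd/passwd (as created above) and that your pam_pwdfile version accepts the classic pwdfile argument; double-check against your distribution’s docs:

```
# /etc/pam.d/vsftpd -- minimal sketch for virtual users via libpam-pwdfile
auth    required pam_pwdfile.so pwdfile /etc/vsftpd/passwd
account required pam_permit.so
```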

Damn, was it worth 4 sleepless hours of headaches?