Easier Rails in Ubuntu 7.04 beta 4/1/2007
With the upcoming release of Ubuntu 7.04, Apache 2.2 will be included in the standard package manager, meaning we no longer have to work in a hybrid managed and from-source environment to get a proper Rails stack up and running. Apache 2.2 added mod_proxy_balancer, which lets Apache serve as a frontend to several Mongrel application servers through a balancer:// proxy group and the BalancerMember directive.
Here is a quick-and-dirty guide to installing my application stack:
- Ubuntu 7.04 beta
- Apache 2.2 with mod_rewrite and mod_proxy
- Mongrel 1.0.1
- MySQL 5 (finally)
- Ruby 1.8.5 and Rails 1.2.3
For starters, get yourself a copy of Ubuntu 7.04 beta and get it installed. No one should have to give you directions on that part. Once that's up and running, we need to turn it into a development environment, so fire up a terminal and bang out the following:
$ sudo apt-get install build-essential
Now open Synaptic Package Manager and select and install mysql-server, libmysqlclient15-dev, ruby, irb, rdoc, ruby1.8-dev, and rubygems. Accept all the dependencies it alerts you to, and move on. It's very important you get ruby1.8-dev installed, or rubygems will not work correctly. I believe this to be an error in Synaptic, but we have to deal.
Now get back to the command line. Using Ruby's Gem installer, it's time to finish up the Rails side of our stack. This part is similar to my previous howto for Ubuntu 6.
$ sudo gem install rails --include-dependencies
$ sudo gem install mysql
$ sudo gem install daemons gem_plugin mongrel mongrel_cluster --include-dependencies
Again, accept the dependencies it asks you to build, and accept the highest-versioned, non-win32 option in the menus. At this point you have a fully-functioning Rails environment, so test it out. Start a new rails app and configure a cluster like this:
$ mkdir ~/rails ; cd ~/rails ; rails www
$ cd www
$ mongrel_rails cluster::configure -p 8001 -a 127.0.0.1 -N 3
$ mongrel_rails cluster::start
Test it out by browsing to http://127.0.0.1:8001/ and you should see the Rails welcome screen. If not, you've got something screwy happening and you should ask about it in the comments.
Now it's time to get our Apache 2.2 frontend running. Start by using Synaptic to install apache2 (and its dependencies). If you like, include PHP or whatever else you think will be fun. After you've got it all installed, run these commands to enable mod_rewrite and mod_proxy:
$ sudo a2enmod rewrite
$ sudo a2enmod proxy
$ sudo a2enmod proxy_balancer
$ sudo /etc/init.d/apache2 restart
Now add a line to the bottom of /etc/apache2/apache2.conf that directs Apache to your configuration file, which I store in ~/conf/httpd.conf.
Include /home/rcrowley/conf/httpd.conf
In this new config file, wherever you put it, place the following code (replacing whatever you need with new values):
NameVirtualHost *:80
<VirtualHost *:80>
ServerName richarddcrowley.org
Include /home/rcrowley/conf/www.conf
</VirtualHost>
<VirtualHost *:80>
ServerName www.richarddcrowley.org
Include /home/rcrowley/conf/www.conf
</VirtualHost>
I choose to link out to another config file to keep from repeating myself when configuring both with-www and without-www versions of my sites. In the new file:
DocumentRoot /home/rcrowley/rails/www/public
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ balancer://www_mongrel%{REQUEST_URI} [P,QSA,L]
<Proxy balancer://www_mongrel>
BalancerMember http://127.0.0.1:8001
BalancerMember http://127.0.0.1:8002
BalancerMember http://127.0.0.1:8003
</Proxy>
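The three BalancerMember ports line up with the -p 8001 -N 3 arguments given to cluster::configure earlier: one port per Mongrel instance, counting up from the first. As a rough sketch of that mapping (the helper names balancer_members and proxy_block are made up for illustration and are not part of Mongrel or Apache), the block could be generated like this:

```python
def balancer_members(first_port, count, address="127.0.0.1"):
    """Return one BalancerMember line per Mongrel instance."""
    return [
        "BalancerMember http://%s:%d" % (address, first_port + offset)
        for offset in range(count)
    ]


def proxy_block(name, first_port, count, address="127.0.0.1"):
    """Wrap the member lines in a <Proxy balancer://...> block."""
    lines = ["<Proxy balancer://%s>" % name]
    lines.extend(balancer_members(first_port, count, address))
    lines.append("</Proxy>")
    return "\n".join(lines)


if __name__ == "__main__":
    # Same values as the cluster configured above: -p 8001 -N 3
    print(proxy_block("www_mongrel", 8001, 3))
```

If you change the instance count in the cluster, the list of members in the Apache config has to change with it.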
Now you’re pretty much done, but it’s likely throwing 403 errors at you for most any request. Since this is undesirable behavior, you need to edit /etc/apache2/mods-available/proxy.conf. I found this method to be less-than-helpful, so I commented every line in the file out, restarted Apache, and life became better.
The last thing necessary is making your Mongrel cluster start up with your machine. This is exactly like last time. Create /etc/init.d/mongrel and put this in it:
#! /bin/sh
do_start()
{
echo "Starting Mongrel..."
mongrel_rails cluster::start -C ~rcrowley/rails/www/config/mongrel_cluster.yml
}
do_stop()
{
echo "Stopping Mongrel..."
mongrel_rails cluster::stop -C ~rcrowley/rails/www/config/mongrel_cluster.yml
}
case "$1" in
start)
do_start
;;
stop)
do_stop
;;
restart|force-reload)
do_stop
do_start
;;
*)
echo "Usage: $0 {start|stop|restart}" >&2
exit 3
;;
esac
Now make it run on startup:
$ sudo update-rc.d mongrel defaults
Go forth and hack.
Ruby on Rails from source on Ubuntu 1/15/2007
Update: Now that Ubuntu 7.04 beta is out, I tackled this problem again: Easier Rails in Ubuntu 7.04 beta.
A few weeks back I started playing a lot with Rails and got hooked. Everything that I had hacked before was just part of the workflow in Rails. In a bit of good timing, I broke my webserver while trying to upgrade X11 on Gentoo. Clearly it was time to start from scratch and to try Ubuntu. What follows is my way of getting from a blank Ubuntu install to a solid LAMP stack plus Ruby on Rails.
Format notes: a backslash (\) indicates a long line that I wrapped, a $ is a regular terminal and a # is a superuser terminal.
Gathering the pieces
$ su
# cd ~
# apt-get install zlib1g-dev
# wget ftp://ftp.ruby-lang.org/pub/ruby/ruby-1.8.5\
-p12.tar.gz
# tar xzf ruby-1.8.5-p12.tar.gz
# cd ruby-1.8.5-p12
# ./configure
# make && make install
# apt-get install postfix
# apt-get install mysql-server mysql-common \
mysql-client libmysqlclient15-dev libmysqlclient15off
# apt-get install libmysql-ruby1.8
# wget http://rubyforge.org/frs/download.php/\
11289/rubygems-0.9.0.tgz
# tar xzf rubygems-0.9.0.tgz
# cd rubygems-0.9.0
# ruby setup.rb
# cd ~
# gem install rails --include-dependencies
# gem install mysql
# gem install daemons gem_plugin mongrel \
mongrel_cluster --include-dependencies
# wget http://apache.mirror99.com/httpd/\
httpd-2.2.4.tar.bz2
# tar xjf httpd-2.2.4.tar.bz2
# cd httpd-2.2.4
# ./configure --enable-deflate --enable-proxy \
--enable-proxy-balancer --enable-http --enable-info \
--disable-cgi --enable-rewrite --enable-so
# make && make install
Apache startup script
# vi /etc/init.d/apache
The apache init script should look something like this:
#! /bin/sh
do_start()
{
echo "Starting apache..."
/usr/local/apache2/bin/apachectl start
}
do_stop()
{
echo "Stopping apache..."
/usr/local/apache2/bin/apachectl stop
}
case "$1" in
start)
do_start
;;
stop)
do_stop
;;
restart|force-reload)
do_stop
do_start
;;
*)
echo "Usage: $0 {start|stop|restart}" >&2
exit 3
;;
esac
Rails setup
As Steven pointed out in the comments, I missed a step. If you have previously built Rails apps, now is the time to copy them onto your Ubuntu system and remember where they are. If you're really starting from scratch, the following command will give you a fresh Rails app called "www" to play with.
$ mkdir ~/rails ; cd ~/rails ; rails www
That last command, rails www, is where the magic happens. That generates the skeleton of a Rails app for you to fill in. See OnLamp for a good intro to building a Rails app.
Mongrel cluster wrangling
Before we can serve a Rails app properly, we need a Mongrel cluster. These steps are nearly verbatim what is prescribed by Coda Hale. I find it to be pretty tasty, myself.
$ cd ~/rails/www
$ mongrel_rails cluster::configure -p 8001 \
-a 127.0.0.1 -N 3
$ mongrel_rails cluster::start
Now would be a good time to test the URL http://127.0.0.1:8001/ in a browser to see your Rails app working properly. Once it is, stop the cluster so we can set up the script that will start it at boot. This is pretty much exactly like the apache script (which is pretty much exactly like the Ubuntu skeleton script).
$ mongrel_rails cluster::stop
# cp /etc/init.d/apache /etc/init.d/mongrel
# vi /etc/init.d/mongrel
4,5c4,5
< echo "Starting apache..."
< /usr/local/apache2/bin/apachectl start
---
> echo "Starting mongrel cluster..."
> mongrel_rails cluster::start -C ~rcrowley/\
rails/www/config/mongrel_cluster.yml
9,10c9,10
< echo "Stopping apache..."
< /usr/local/apache2/bin/apachectl stop
---
> echo "Stopping mongrel cluster..."
> mongrel_rails cluster::stop -C ~rcrowley/\
rails/www/config/mongrel_cluster.yml
Install the init scripts
# update-rc.d apache defaults
# update-rc.d mongrel defaults
Put the puzzle together
Again following the gospel according to Coda, this is how to beat the entire system together. First uncomment the line that includes conf/extra/httpd-vhosts.conf in the main apache conf/httpd.conf file. Then we can set up virtual hosts in that file. Something like
NameVirtualHost *:80
<VirtualHost *:80>
ServerName richarddcrowley.org
Include conf/extra/www.conf
</VirtualHost>
<VirtualHost *:80>
ServerName www.richarddcrowley.org
Include conf/extra/www.conf
</VirtualHost>
will do just fine. You'll notice that I left a lot out by including conf/extra/www.conf. Indeed. That is where we'll do all the fun stuff:
DocumentRoot /home/rcrowley/rails/www/public
RewriteEngine On
RewriteCond %{DOCUMENT_ROOT}/%{REQUEST_FILENAME} !-f
RewriteRule ^/(.*)$ balancer://www_mongrel\
%{REQUEST_URI} [P,QSA,L]
<Proxy balancer://www_mongrel>
BalancerMember http://127.0.0.1:8001
BalancerMember http://127.0.0.1:8002
BalancerMember http://127.0.0.1:8003
</Proxy>
Now the Rails part of the config is done, so restart!
# /etc/init.d/apache restart
It works, right? Of course it does. I'll leave the "why" to experts but this is a fine "how" for making your Ubuntu box a sort of web server Swiss Army knife.
As a bonus, I have installed PHP 5.2 from source as well. PHP is better for hacking things together, so we have to keep it around. Before you run back to the command line, use Synaptic to make sure libcurl and libxml2 are installed, as any self-respecting PHP install will need those.
# wget http://us2.php.net/get/php-5.2.0.\
tar.bz2/from/this/mirror
# tar xjf php-5.2.0.tar.bz2
# cd php-5.2.0
# ./configure --disable-cgi --with-libxml-dir \
--with-apxs2=/usr/local/apache2/bin/apxs \
--with-curl --with-mysql --with-mysqli \
--with-pdo-mysql --with-pear
# make && make install
We're really done now. You now have a LAMPR server!
Python Web Platform 8/13/2006
I've been talking for some time about switching from PHP to Python for web programming, and I think now (at long last) I'm closing in on a switch. My knowledge of Python has gotten to the point now that I can properly leverage its hooks into Apache to make programming much easier for myself.
The first advance that Python gives you is two ways to write pages. First you have mod_python.publisher, which is good for writing more code-intensive modules. Basically, if mod_python.publisher is handling the file index.py, then a request for index.py calls the index() function with whatever arguments are given in the normal ?key1=value1&key2=value2 format. If I request index.py/qwerty, the qwerty() function gets called with arguments passed the same way. Additionally there is mod_python.psp, which gives you the PHP/ASP-style interpreter that runs Python code within <% and %>, and dumps HTML from everywhere else. The special advantage is that the PSP interpreter is available from a call within the Publisher handler.
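To make the Publisher's URL-to-function mapping concrete, here is a small standalone sketch of the idea. It is not mod_python's code: the dispatch() helper and the PAGE dictionary are invented for illustration, and the real Publisher also hands your function the req object, which I leave out here.

```python
from urllib.parse import parse_qsl, urlsplit


def dispatch(namespace, url):
    """Call the function that a Publisher-style URL points at.

    '/index.py?a=1' calls index(a='1');
    '/index.py/qwerty?a=1' calls qwerty(a='1').
    """
    parts = urlsplit(url)
    # Everything after ".py" in the path names the function; default to index.
    _script, _, func_name = parts.path.lstrip("/").partition(".py")
    func_name = func_name.strip("/") or "index"
    if func_name.startswith("_"):
        raise LookupError("private names are not published")
    func = namespace.get(func_name)
    if not callable(func):
        raise LookupError("no published function named %r" % func_name)
    # Query string pairs become keyword arguments.
    return func(**dict(parse_qsl(parts.query)))


def index(key1="", key2=""):
    return "index got %s and %s" % (key1, key2)


def qwerty(key1="", key2=""):
    return "qwerty got %s and %s" % (key1, key2)


# Stand-in for the contents of index.py
PAGE = {"index": index, "qwerty": qwerty}

if __name__ == "__main__":
    print(dispatch(PAGE, "/index.py?key1=value1&key2=value2"))
    print(dispatch(PAGE, "/index.py/qwerty?key1=value1&key2=value2"))
```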
This on the surface looks like enough, but there is one extra thing that I got working yesterday that makes life even easier. The PythonHandler directive in the Apache config file can allow any arbitrary module to be given control of the Python execution phase. All your custom module must do is define the handler() function. My handler leverages both Publisher and PSP and does so in a way that keeps the page author from having to include any templates or other extra code that would be common to every page. With my handler (in its current half-done state), a page looks like this:
def index( req ):
    return { 'title': '',
             'css': ( '/css/index.css', '/css/blog.css' ),
             'content': 'asdfqwerty' }
The dictionary returned can supply any number of arbitrary variables, including Javascript includes or actual code, multiple CSS files (by returning a tuple of CSS filenames rather than a string), or any other fields. Any fields that match up with fields defined in the site template will be filled in (that's the PSP at work) and others will be discarded. I created a modified version of Publisher that allows itself to be called from another handler rather than directly from Apache. The modifications were to provide a way to make it return its result rather than writing it out, and a way to have it return a dictionary rather than a string. The reason a dictionary must be returned is that otherwise pages would not have direct control over the title, CSS, and content (and other fields). Also of note, because my mod_python.publisher2 module still works if used like the normal Publisher, it still supports the definition of multiple functions within a source file, each of which can be called as file.py/function_name?key1=value1&key2=value2.
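The fill-in-what-matches, discard-the-rest behavior can be sketched without mod_python at all. This is not my actual handler, which uses PSP for the template; it is a stand-in built on string.Template, and SITE_TEMPLATE, TEMPLATE_FIELDS, css_links() and render() are names made up for the example.

```python
from string import Template

# Hypothetical site template; field names mirror the example page dictionary.
SITE_TEMPLATE = Template(
    "<html><head><title>$title</title>\n$css</head>\n"
    "<body>$content</body></html>"
)
TEMPLATE_FIELDS = ("title", "css", "content")


def css_links(css):
    """Accept one filename or a tuple of filenames and return <link> tags."""
    if isinstance(css, str):
        css = (css,)
    return "\n".join(
        '<link rel="stylesheet" type="text/css" href="%s" />' % href
        for href in css
    )


def render(page):
    """Fill template fields from the page dictionary, discarding unknown keys."""
    fields = {name: page.get(name, "") for name in TEMPLATE_FIELDS}
    fields["css"] = css_links(fields["css"]) if fields["css"] else ""
    return SITE_TEMPLATE.substitute(fields)


if __name__ == "__main__":
    print(render({"title": "Home",
                  "css": ("/css/index.css", "/css/blog.css"),
                  "content": "asdfqwerty",
                  "unused": "silently dropped"}))
```

Missing fields come out as empty strings here, which is one possible choice; a stricter handler could refuse to render instead.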
Here's the Apache config necessary to get mod_python running. You can replace soapbox.handler with your handler, or with mod_python.publisher or mod_python.psp.
<Directory "/home/rcrowley/py/public_html">
AddHandler mod_python .py
PythonHandler soapbox.handler
# Optional, prints and logs error messages
PythonDebug On
</Directory>
New Apache Log Viewer 7/30/2006
So I've made a few changes to my Apache Log Viewer, pretty much as requested by Amos and Scott in the comments on the original post. I still haven't installed logrotate on my machine, so the default setup only accesses the current access_log file, but that's easily changeable as noted in the comments.
It's still not the most memory-efficient, because quite frankly, it's not that important. The trick, if you do start pulling chunks of the file(s) down 4K or so at a time, is that you have to remember to complete lines that you chopped off when you broke the file at 4K, 8K, etc. So after each chunk, you must check whether the text left at the end is a complete line and if not, remember that ending text, grab another chunk, concatenate, and start parsing that new chunk. Not too bad, but not high priority, especially since using /bin/zcat -f to get all the files you want forces using a string rather than a file handle.
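For the curious, the chunk-and-carry idea looks roughly like this. The viewer itself is PHP and does not do this yet, so take it as a Python sketch of the technique only; iter_lines() is a name I made up.

```python
import io


def iter_lines(handle, chunk_size=4096):
    """Yield complete lines from handle, reading chunk_size characters at a time."""
    leftover = ""
    while True:
        chunk = handle.read(chunk_size)
        if not chunk:
            break
        # Glue the chopped-off tail of the previous chunk onto this one.
        lines = (leftover + chunk).split("\n")
        # The last piece is "" if the chunk ended on a newline,
        # otherwise it is a chopped line that must wait for more data.
        leftover = lines.pop()
        for line in lines:
            yield line
    if leftover:
        # The file did not end with a newline; emit the final partial line.
        yield leftover


if __name__ == "__main__":
    log = io.StringIO("first line\nsecond line\nthird")
    # A tiny chunk size forces every line to be split across reads.
    for line in iter_lines(log, chunk_size=4):
        print(line)
```

Only one chunk plus one partial line is ever held in memory, which is the whole point.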
The last monkey wrench I had thrown at me was in the regex to take out the URL. It was brought to my attention that scripts with no file extension (the example was something in /cgi-bin/) were not showing up in the list. This was indeed true, as I could not (and still cannot) figure out how to tell preg_match to match anything other than strings ending in '.gif', '.css', etc. Can anyone give me some pointers on how to make a regex do this? I Googled and Yahoo!-ed (I really wish Yahoo! had a verb, too) for it and the closest I got was being able to negate a character class by putting a ^ after the opening [. But this wouldn't work with subpatterns within the character class. That is, unless Phil was right and we just didn't throw enough escape characters in there.
Anyway, enjoy and someone please tell me how regex does the anything-but subpattern matching. It seems like too common of a request to be impossible with regex.
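One construct that does anything-but subpattern matching is the negative lookahead, (?!...), which PCRE (the engine behind preg_match) supports with the same syntax as Python's re module. Here is a sketch in Python; the extension list beyond .gif and .css is my own guess at what counts as a static file, and NOT_STATIC and is_page() are illustrative names.

```python
import re

# Match a request path that does NOT end in one of the listed extensions.
# (?!...) is a negative lookahead: it succeeds only when the inner pattern
# cannot match from that position, and it consumes no characters.
# The jpg, png and js entries are assumptions added for the example.
NOT_STATIC = re.compile(r"^(?!.*\.(?:gif|jpg|png|css|js)$)\S+$")


def is_page(path):
    """Return True for paths that are not static assets."""
    return NOT_STATIC.match(path) is not None


if __name__ == "__main__":
    for path in ("/cgi-bin/search", "/index.php", "/css/blog.css", "/logo.gif"):
        print(path, is_page(path))
```

The lookahead sits at the start of the pattern and scans ahead to the end of the string, so extensionless scripts in /cgi-bin/ pass while anything ending in a listed extension is rejected.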
Apache Log Viewer 7/16/2006
Unrelatedly, I checked out Superman Returns this evening, and I was reasonably impressed. It holds very true to the style I'd expect from Superman, though it definitely takes place today. And Clark Kent is a dweeb, there's really no other way to put it. Now, on to more important business: today I put together an Apache Log Viewer that makes sense of what seems like nonsense in Apache logs. I chose the most detailed of the standard log formats (those that are set up and ready in the standard httpd.conf file), which is labeled "combined." It saves the date and time, the remote IP, the actual request (in a form like "GET /index.html HTTP/1.1"), the referring URL, and the user agent. See the comments in the file for a few more details about the log format.
Now this is all very useful information, and who wouldn't want to know a little demographic info about their readers? But I find viewing the logs with less or some other command line tool a bit more than tedious. The solution of course is a pretty parsed package (alliteration!) that makes it all easily digestible online. The file (literally one file) I've put together does exactly this. Here is what I've completed so far:
- Unique Users (page views grouped by IP address)
- Most Popular Pages (page views grouped by URL)
- Best Referers (page views grouped by referer)
- User Agents (page views grouped by user agent)
- Entry Points (referers and the pages they referred users to)
- Hits (full log information printed in human-readable form)
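Each of those views is the same operation: parse every line of the combined format, then count hits grouped by one field. The viewer itself is a single PHP file, so the following is only a Python sketch of the approach, with made-up names (COMBINED, parse_line, group_by) and made-up sample log lines.

```python
import re
from collections import Counter

# Apache "combined" format: host, ident, user, [time], "request",
# status, size, "referer", "user agent".
COMBINED = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one combined-format line, or None."""
    match = COMBINED.match(line)
    if match is None:
        return None
    hit = match.groupdict()
    # The request looks like "GET /index.html HTTP/1.1"; pull out the URL.
    parts = hit["request"].split()
    hit["url"] = parts[1] if len(parts) >= 2 else ""
    return hit


def group_by(lines, field):
    """Count page views grouped by one field, most frequent first."""
    hits = (parse_line(line) for line in lines)
    return Counter(hit[field] for hit in hits if hit).most_common()


# Invented sample data, not taken from a real log.
SAMPLE = [
    '192.0.2.1 - - [16/Jul/2006:21:04:05 -0400] "GET /index.html HTTP/1.1" '
    '200 2326 "http://example.com/" "Mozilla/5.0"',
    '192.0.2.7 - - [16/Jul/2006:21:04:09 -0400] "GET /blog.html HTTP/1.1" '
    '200 5120 "-" "Opera/9.0"',
    '192.0.2.1 - - [16/Jul/2006:21:05:30 -0400] "GET /blog.html HTTP/1.1" '
    '200 5120 "http://example.com/" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(group_by(SAMPLE, "ip"))       # Unique Users
    print(group_by(SAMPLE, "url"))      # Most Popular Pages
    print(group_by(SAMPLE, "referer"))  # Best Referers
```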
At Panchenko's request, I made a small effort to make this just a little prettier than my usual utility page. Everything is float: left; so the page just kind of flows together as best it can. Perhaps later some more customization is in order, but for now this is all about getting the information out.
Here's the important part: I need suggestions from people that really use their logs to know what to do next. I think there's a lot of value to be had, but I don't know where it is. The goal of this page is to become a lightweight but useful tool for webmasters to react to their traffic.
The next thing on my list is breaking out the browser from the OS in the User Agents section, to get an even more usable breakdown of your readers.
You can check it out in action on www.richarddcrowley.org or download it yourself.