Thanks to Angie for taking the photo!
I found myself needing to see all of the 404 errors in the access logs for all virtual hosts on my web server. I put all of my logs for a given application (in this case WordPress) in one place (/srv/www/wordpress/logs/$host-access.log). Logrotate kicks in to keep them segmented and compress them by day.
A bunch of Unix magic later…
zgrep " 404 " *-access.log* | \
  cut -d " " -f 1,7 | \
  sed 's/:.* / /' | \
  sed 's/-access.* / /' | \
  sort | \
  uniq -c | \
  sort -n -r | \
  head -20
zgrep is just grep that handles both normal and gzipped files. Pipe that into cut to pull out just the fields we want: the first (the filename and requester IP that zgrep prints together) and the seventh (the request path). The two sed commands strip out data that would mess up the aggregation (the IP address of the requester and part of the filename). Sort prepares the stream for uniq to do the counting. Then do a numeric sort in reverse and show the top 20 404s across all log files.
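To make the cut and sed stages concrete, here is a small sketch that runs them on a single made-up line. The hostname, IP address, and request are hypothetical, and it assumes the logs use the common "combined" format, where the request path is the seventh space-separated field.

```shell
# Hypothetical line as zgrep prints it when searching several files:
# "<filename>:<log line>", with the log line in combined format (assumed).
line='example.com-access.log.2.gz:203.0.113.5 - - [10/Nov/2011:06:25:14 -0600] "GET /missing.png HTTP/1.1" 404 162 "-" "Mozilla/5.0"'

# The same cut and sed stages as the pipeline above, wrapped in a function.
host_and_path() {
  cut -d " " -f 1,7 |      # keep "<filename>:<ip>" and the request path
    sed 's/:.* / /' |      # drop ":<ip>", leaving the filename
    sed 's/-access.* / /'  # drop "-access.log...", leaving the host
}

printf '%s\n' "$line" | host_and_path
# prints: example.com /missing.png
```

After this step every 404 for the same host and path produces an identical line, which is what lets sort and uniq -c count them.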
Output looks like
380 thingelstad.com /wp-content/uploads/2011/09/cropped-20090816-101826-0200.jpg
301 thingelstad.com /wp-content/uploads/2009/06/Peppa-Pig-Cold-Winter-Day-DVD-Cover.jpg
300 thingelstad.com /wp-content/thingelstad/uploads/2011/10/Halloween-2011-1000x750.jpg
264 thingelstad.com /wp-content/uploads/2007/12/guitar-hero-iii-cover-image.jpg
130 thingelstad.com /apple-touch-icon.png
129 thingelstad.com /apple-touch-icon-precomposed.png
121 thingelstad.com /wp-content/uploads/import/o_nintendo-ds-lite.jpg
114 thingelstad.com /wp-content/thingelstad/uploads/2011/10/Crusty-Tofu-1000x750.jpg
Of course the next step would be to further the pipe into a curl --head command to see which 404s are still problematic. That just makes me smile. :-)
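One way that next step might look, as a rough sketch rather than anything I have run against the real logs. The function names to_urls and check_urls are made up for illustration, and it assumes every host answers over plain http.

```shell
# Turn "count host path" lines (the format shown above) into URLs.
# Assumes plain http for every host; adjust the scheme if that is wrong.
to_urls() {
  awk '{ print "http://" $2 $3 }'
}

# For each URL on stdin, send a HEAD request and print "<status> <url>".
# --write-out '%{http_code}' makes curl print only the status code.
check_urls() {
  while read -r url; do
    code=$(curl --silent --head --output /dev/null --write-out '%{http_code}' "$url")
    printf '%s %s\n' "$code" "$url"
  done
}

# Usage (this hits the network), keeping only URLs that still return 404:
#   ... | head -20 | to_urls | check_urls | grep '^404 '
```

Anything that drops out of the grep at the end has been fixed or redirected since the log entries were written.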
As an aside, sort combined with uniq -c has to be one of the most deceptively powerful yet simple set of commands out there. I’m amazed at how often they give me exactly what I’m looking for.
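A toy illustration of the pattern, using made-up input. The sort has to come first because uniq only collapses adjacent duplicate lines; the function name count_desc is just for this sketch.

```shell
# Count duplicate lines on stdin and list the most frequent first.
count_desc() {
  sort |          # group identical lines next to each other
    uniq -c |     # collapse each group, prefixing it with its count
    sort -n -r    # order by count, largest first
}

# Hypothetical input: one item per line, in no particular order.
printf '%s\n' b a c a b a | count_desc
# lists a with a count of 3, then b with 2, then c with 1
# (the padding in front of each count varies between uniq implementations)
```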
Tammy went digging deep into our Lightroom library looking for photos for our new exercise room. She told me she had found her “favorite picture” of me. I was very curious what picture it was. I like it. This is from the finish of the 2004 Chequamegon Fat 40. The kid in the background is out of place though; a little Photoshop could deal with that.
There is a ridiculous amount I could write about the move, and I’ll try to share what I think is most helpful to others. In general I can say that the Linode hosts are a lot faster than the Slicehost instance I had. Doing basic Linux stuff was 2-3 times faster on Linode than on Slicehost.
My new setup is also a lot faster due to how I deployed WordPress and MediaWiki. I’m now running everything on nginx instead of Apache. I’m also serving all my PHP out of php5-cgi instead of mod_php. Perhaps even more importantly, I got all of my wiki and blog instances running under one PHP install for MediaWiki and WordPress. As a result, the APC module for PHP can do its job right. I’m now getting 99.9% PHP cache hits.
With all that said, I fully expect that a thing or two may not be working right now. If you see anything broken, a comment or email would be great.
Four of my friends from the B-Squad Softball team are in a band called the Good Commies. I was finally able to add some of these wonderful tracks to my iTunes collection. I insist on high resolution cover art so I fired up the scanner. Here are the files for anyone else who is looking for them. I couldn’t find anything other than thumbnails online.
Apple launched iTunes Match today. With all my music in iTunes, spread across three different computers, an iPhone 4S, an iPad 2 and three Apple TVs, I figured I could benefit from using it. I enabled it and it is still in the process of doing its first update. I was very curious about how iTunes Match dealt with certain metadata.
I’m a big user of Smart Playlists and many of them rely on song ratings, play counts and last played dates. I also often have Smart Playlists that reference other Smart Playlists. Some quick learnings from an hour of playing around.
While doing our remodel we lost the use of our garage. We used it for storage and the contractors had a big dumpster in the driveway. For most of the summer there was also a small lumber yard in the driveway and a portable toilet. So we parked on the street all summer. Best time to do it but an inconvenience.
This last weekend we got the garage cleared back out and I was irrationally excited to park both cars in there again.