thingelstad

Jamie Thingelstad's personal website

Huge Impact of Linode Cloud Updates

I host all of my personal projects on two servers at Linode. Last week Linode announced new “cloud servers” with SSDs, double the RAM and a new chip architecture. I migrated both of my hosts over to the new servers that evening and the performance impact was immediately noticeable. WikiApiary is the most taxing project I run, and it was clearly much faster. This graph of the WikiApiary API response time, though, is the most telling. So much better!

response-time

PHP Post-Facebook

Facebook is famously built in PHP, and I think it is really interesting to watch what they are doing technically with that massive codebase. This week Facebook announced Hack, a new language that specifically targets their HipHop (HHVM) project, previously released in 2008. Facebook obviously has massive scale challenges, and they have little ability to cache content, so they are having to redefine how PHP works. So what’s interesting about this?

Compare this all to Twitter. Twitter was originally built on the very dynamic and popular Rails framework, using Ruby. Twitter had massive scaling problems with near-daily “fail whales” displayed during outages. These are now a thing of the past because Twitter brought in a ton of engineering talent, and they effectively engineered Ruby on Rails out of their environment. They replaced the entire architecture with code written in much more performant environments. In short, they grew out of their house and moved to a new house. Imagine what would have happened to the Rails ecosystem if they had instead decided that they would reinvent the ecosystem to scale to what they needed?

Facebook isn’t the first huge PHP-based website of course. Wikipedia is one of the three largest sites on the Internet and is built on PHP. All WordPress blogs are built on PHP. However, both of these examples have really good caching scenarios. Wikipedia uses Varnish to cache all the requests they can, and as a result they greatly reduce their dependency on PHP. WordPress uses caching in the same way. Both sites eschew calls to PHP for the vast majority of their requests by doing this. This is probably the most widely accepted pattern for scaling PHP, which doesn’t have a good track record for scaling. Just bypass PHP as much as possible. However, for Facebook this isn’t an option. They want to personalize every single request so caching just doesn’t work for them.
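To make the caching point concrete, here is a minimal sketch of the "bypass the application as much as possible" pattern in Python. It is not how Varnish, Wikipedia or WordPress actually implement it; the `PageCache` class and the `render` callback are hypothetical names, and there is no expiry or invalidation. Anonymous requests for the same path are rendered once and then served from the cache, while any request tied to a user has to be rendered every time.

```python
class PageCache:
    """Toy front-end cache: anonymous pages are cached, personalized ones are not."""

    def __init__(self, render):
        self.render = render   # hypothetical callback standing in for the PHP app
        self.store = {}        # path -> rendered page
        self.app_calls = 0     # how many times we had to hit the application

    def get(self, path, user=None):
        if user is not None:
            # Personalized response: it differs per user, so bypass the cache.
            self.app_calls += 1
            return self.render(path, user)
        if path not in self.store:
            # First anonymous request for this path: render once and keep it.
            self.app_calls += 1
            self.store[path] = self.render(path, None)
        return self.store[path]
```

With a site where nearly every visitor is anonymous, `app_calls` stays tiny compared to the number of requests. With a site where every request carries a user, it grows one for one, which is the position Facebook is in.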

I find it fun to watch Facebook doing this. Internally Facebook has a focus on speed and believes that using PHP and PHP-like tools is part of achieving that. They can’t cache. So, rather than move that massive codebase they are changing what it runs in. Instead of moving to a new house, they are remodeling!

First with the introduction of HHVM, and now with Hack, they are redefining the characteristics of the platform their code runs on to achieve the performance they want. I find this interesting because it is a path so rarely taken. Certainly no small startup could (should?) do this. There simply isn’t the time or money to do it, and it takes your focus off your main goal. You could look at Google and Go as something similar, but I don’t think their motives for making Go are anything like what Facebook is doing with HHVM and Hack.

I like to say that PHP is “the people’s language”. It draws the disdain of almost any developer you meet. It’s crufty, gross and houses some awesomely terrible code. Some of this is the language’s fault, but a bunch of it is also that many people first learned to program in PHP. PHP is also the language that nearly every blog and wiki you have ever visited uses. I would go so far as to suggest that there are more page views on the Internet served by PHP than by any other language in existence.

The Wikimedia Foundation, the non-profit that runs Wikipedia and hosts the MediaWiki engine, is running development versions of MediaWiki on the HHVM engine. Part of Wikipedia’s scaling plan is now coming from the byproduct of Facebook redefining how PHP works. That is really cool. By choosing to change their ecosystem, instead of moving to a new one, Facebook is building a path that millions of blogs and wikis may be able to follow. That is pretty interesting, and why this is a path worth watching.

Remembering Farhan Muhammad

On Tuesday morning I was shocked to get an email from Jennifer at ILM Professional Services informing us that Farhan Muhammad, the CEO of ILM, had passed away in his sleep on Sunday night. I saw Farhan regularly and I’ve worked with him and his firm on projects for several years. To my knowledge he had no health issues and he was a mere 41 years old! Farhan and I had been emailing back and forth just days prior on a project that I was framing up to get him and his firm involved in. This news was so unimaginable it took a while to even process it.

In the past decade I’ve done several projects with Farhan over the course of three different companies. He was a true partner in every sense of the word. I knew that if Farhan told me that the consultants he had to work on a project were great, I didn’t need to verify. I knew as a fellow technologist I could just go with what he said. He was an “ace in the hole” so to speak. When I knew I needed to extend beyond my team, and get some outside folks in, he and his firm would rise to the challenge. And, if he knew he couldn’t, he would be plain and direct and help redirect to a path that would work.

I remember on multiple occasions calling him to say that I needed help getting a project started, and needed to start yesterday with a solid team that could execute in weeks not months. He was always able to make magic happen.

On top of being a great business partner, he was also just an awesome guy. You couldn’t work with him long without seeing the pride and joy he had in his family. His wife and their two kids came up in conversations a lot. My heart goes out to them with this sudden and unexpected loss. Farhan was born in Pakistan and moved here straight after high school. His extended family was a big part of his life as well, and one of the reasons for the extra bedrooms in their home, so they could stay for weeks and months at a time.

Perhaps the most amazing thing is that after nearly a decade of doing projects, technology projects with high risk, I don’t remember a single meeting or discussion with Farhan that was frustrating, challenging or spent dredging through a project gone wrong or technology poorly designed. That’s amazing!

I was right in the middle of framing up the next project I wanted to work with Farhan and his team on. I was really eager to get Farhan’s take on the approach, and we were excited to totally reinvent how this service worked and how people would engage with it. I’m really saddened to not have the opportunity to do that again.

To all of Farhan’s family, friends and coworkers I want to send my deepest sympathies and regrets. I was able to attend the funeral service today to pay my respects and it was so nice to hear from so many that had great memories of working and living with him. His impact will be felt for a very long time, and he will not be forgotten. The Twin Cities technology community lost a great technologist, entrepreneur and supporter this week.

“Walk into a room of people just like you.”

Over recent years I’ve been growing increasingly concerned about the lack of women in technology careers. Perhaps it’s being a dad, or just getting older. Either way, I think this is bad for our industry. I believe we would have healthier cultures, better teams and make better software and products if we had more diversity.

I recently got an email invite to an event in town for tech entrepreneurs. The headline of the email exclaimed in large type…

Walk into a room of people just like you.

In the email were three photos to highlight the people just like you. All set in the gorgeous CoCo Minneapolis space.

Notably a room of people just like you, if you are a young, white man.

I’m not highlighting it because I think there was anything intentional with these images, but to point out something that I don’t think many in tech even see. We rarely notice the absence of any women in these scenes. We need to work harder to create an inclusive environment that draws the great women technologists into our events too.

On a related note, many know I’m on the board of minne✱ which hosts minnedemo and minnebar. We are continuing to work hard to make sure we get all technologists to our events. We have a lot of work to do, but making sure that our imagery displays an open and accepting event is an important start.

Dropbox Arbitration Opt-Out

If you missed the news that Dropbox is now automatically putting all users into an arbitration agreement, you should take a moment to opt out of this change. You can go to

https://www.dropbox.com/arbitration_optout

And easily opt out.

Opt-Out of Google Plus Gmail integration

It’s lame when the only time you need to log into a service is to disable privacy-invading features that you have no interest in.

google-plus-opt-out

One Year for WikiApiary

Yesterday WikiApiary had a very meta tweet when it wished itself Happy Birthday! A while back I realized that if you looked at the edit history of the “Main Page” of a MediaWiki website you could infer the date the wiki was started, its birthday. WikiApiary then wishes wikis a happy birthday on that date. Yesterday was WikiApiary’s big day.
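For the curious, the birthday trick can be sketched in a few lines of Python. This is an illustration, not the actual WikiApiary bot code. It assumes the standard MediaWiki API (`action=query` with `prop=revisions`, `rvdir=newer` and `rvlimit=1` to get the oldest revision) and that the wiki still calls its front page "Main Page". The function names are made up for this example.

```python
import json
from datetime import datetime
from urllib.parse import urlencode


def first_revision_url(api_url, title="Main Page"):
    """Build a MediaWiki API query for the oldest revision of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvlimit": 1,
        "rvdir": "newer",       # oldest revision first
        "rvprop": "timestamp",
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def wiki_birthday(api_response_text):
    """Return the timestamp of the oldest revision in an API response."""
    data = json.loads(api_response_text)
    # Results are keyed by page id; we asked for one title, so take the only page.
    page = next(iter(data["query"]["pages"].values()))
    timestamp = page["revisions"][0]["timestamp"]
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")


def is_birthday(birthday, today):
    """True when today falls on the same month and day as the birthday."""
    return (birthday.month, birthday.day) == (today.month, today.day)
```

You would fetch the URL from `first_revision_url()` with whatever HTTP client you like and hand the response body to `wiki_birthday()`. The inference is only as good as the history: a wiki whose Main Page was imported, renamed or recreated will report the wrong date.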

The first year of WikiApiary has been great! The comments people make about it and the great contributions that many people have made to the wiki reflect the utility and interest in the data. WikiApiary was a holiday break project for me in 2012 and it’s continued to get additions and modifications from a number of people throughout the world. It is the first project I’ve started that I feel has gotten a true community around it and people that are moving it forward independent of what I’m doing. That is really great! This idea of a “Wiki to track other wikis” clearly caught on with some people.

In WikiApiary’s first year it has collected 1,855,979,520 statistics samples in its database, just 2.6GB of data. As of today, WikiApiary is collecting data from 9,555 active wikis. It shows 2,478,637 active users over 384,870,041 pages with 2,894,060,197 edits in the part of the wikiverse that it monitors.

Looking at visitor activity during this first year, WikiApiary had 32,416 visits with 105,895 page views. 9,346 of those visits were from MediaWiki.org. The top 10 countries visiting the site were:

United States 10,243 31.6%
Germany 3,422 10.6%
United Kingdom 2,095 6.5%
Russian Federation 1,853 5.7%
France 1,084 3.3%
Canada 1,000 3.1%
Netherlands 814 2.5%
India 755 2.3%
Spain 714 2.2%
China 687 2.1%

WikiApiary visitors skew more heavily toward Linux than average.

Windows 7 14,989 46.2%
Linux 4,029 12.4%
Windows XP 3,876 12%
Mac OS 3,473 10.7%
Windows 8 1,707 5.3%

Chrome dominates the browser choice for WikiApiary visitors.

Chrome 26.0 2,377 7.3%
Safari 6.0 2,319 7.2%
Chrome 30.0 2,087 6.4%
Chrome 28.0 1,976 6.1%
Chrome 27.0 1,756 5.4%

2,455 of the 32,416 (7.6%) visits to WikiApiary were from logged-in users. All website statistics are from the amazing Piwik project. No data is shared with Google or other search engines.

WikiApiary is largely about graphs, so that seems like a logical way to explore the first year of WikiApiary. The number of active users on WikiApiary has hovered around 30 for most of the year, peaking over 50. This doesn’t seem like a ton, but most wikis that are monitored actually have fewer than 5 active users. The total number of users is over 250 and grows steadily. Those are all real accounts too, no spam accounts. Registration is required to edit so this is a good reflection of engagement.

WikiApiary-users

Edit activity on WikiApiary is mostly robotic. The bots are constantly tending to the data set and they do this with edits. You can see the edit rate jumped in October after I added tracking for MaxMind geo data as well as Whois records for wikis. Over 5 million edits in the first year.

WikiApiary-edits

Total pages of content largely reflect the number of wikis being tracked, plus the number of extensions and skins that exist. Notably you can see the initial load of sites in February and March. There were additional farmer bots that added in some reasonably sized farms in June. In October the pages spike again with the addition of more datasets.

WikiApiary-pages-articles

WikiApiary is itself the 11th largest Semantic MediaWiki installation that it tracks. The largest is Gyvosios gamtos enciklopedija with over 16 million properties (think of a property as a data value). WikiApiary has over 3.3 million property values.

WikiApiary-property-count

These 3.3 million properties are queried in MediaWiki templates so you can see the data. There are nearly 140,000 queries in WikiApiary.

WikiApiary-query-count

Special Thanks

WikiApiary has had a lot of contributions with additions of wikis and help with templates and bots. Karsten Hoffmeyer has been a huge part of WikiApiary and is also an administrator on the site. Karsten helps with adding wikis and fending off the occasional bad edits. WikiApiary also has a very distinctive look from the Foreground skin which was built by my friend Garrick van Buren. Mark Hershberger has also been an active part of WikiApiary and is exploring ways that MediaWiki installs can automatically add themselves to WikiApiary. Huge thanks to Federico Leva (Nemo) for linking extension pages on MediaWiki.org to their respective pages on WikiApiary. This drives a lot of exposure for WikiApiary and provides great value to visitors of MediaWiki.org. A big thank you also to Paul DeCoursey, who rewrote the JavaScript code to embed the charts into the pages and support multiple charts with usable controls.

Also, I think one of the things that makes WikiApiary unique is that it is built with MediaWiki and Semantic MediaWiki and the related suite of extensions. This is such a wonderful set of software and a special thanks to James HK, Jeroen De Dauw and Yaron Koren. All of them have helped out and provided input on WikiApiary at times in the first year.

Future Plans

I’ve got a ton of plans for WikiApiary, and I keep picking them off slowly. I’ve not had much time for the project the last couple of months but whatever time I have had has been going into rewriting the bots. The first versions were just hacked up and difficult to understand. I’m working on a rewrite that includes unit tests, a good object model and code that is easy enough to understand that I hope to get some more contributors involved. The other huge thing being added is parallel requests. Right now WikiApiary is limited in how many sites it can collect from because it collects in serial. The new bots will collect in parallel, which will dramatically change the cost of running a collection sequence. There should be no problem going from 10,000 to 100,000 or more wikis being monitored.
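To show why parallel requests matter so much here, this is a rough Python sketch of the two approaches. It is not the bot rewrite itself. `fetch` is a hypothetical function that takes a wiki's API URL and returns its statistics, and the worker count is an arbitrary assumption. Collection is almost entirely waiting on remote servers, so a thread pool lets many of those waits overlap.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_serial(wikis, fetch):
    """Fetch one wiki at a time; total time is the sum of every request."""
    return {wiki: fetch(wiki) for wiki in wikis}


def collect_parallel(wikis, fetch, max_workers=20):
    """Fetch many wikis at once; slow wikis no longer hold up the rest."""
    wikis = list(wikis)  # allow generators, since we iterate twice
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input list.
        results = pool.map(fetch, wikis)
        return dict(zip(wikis, results))
```

Both functions return the same dictionary, so the rest of a bot would not need to care which one ran. One thing the sketch skips: if `fetch` raises for one wiki, the exception surfaces when the results are read, so a real collector would want to catch errors inside `fetch` and record them per wiki.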

I would also like to see the Honey Bee MediaWiki extension get going which will be the first step of an extension that leverages WikiApiary inside of the wiki it’s running in.

Additionally I’d like to do a whole deeper level of analysis of MediaWiki websites and have been contacted by two groups who have written algorithms that do this and are interested in adding their code to WikiApiary. I hope to make that easier with the bot rewrite mentioned above.

I also want to provide a base Farmer class that can easily be extended so that bots that farm new wikis into WikiApiary are easier to write. My big objective is to finally pull in Wikia.
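As a thought experiment, a base Farmer class could look something like the Python sketch below. None of this exists yet; the method names (`discover`, `new_wikis`) and the `StaticListFarmer` subclass are hypothetical, made up for illustration. The idea is that the base class handles the bookkeeping of which wikis are already tracked, and each farm only has to say how its wikis are found.

```python
from abc import ABC, abstractmethod


class Farmer(ABC):
    """Hypothetical base class for bots that farm new wikis into a tracker."""

    def __init__(self, known_api_urls=()):
        # API URLs that are already being tracked.
        self.known = set(known_api_urls)

    @abstractmethod
    def discover(self):
        """Yield (name, api_url) pairs for every wiki in this farm."""

    def new_wikis(self):
        """Return discovered wikis that are not tracked yet, without duplicates."""
        found = []
        for name, api_url in self.discover():
            if api_url not in self.known:
                self.known.add(api_url)
                found.append((name, api_url))
        return found


class StaticListFarmer(Farmer):
    """Simplest possible farmer: the farm is just a list handed in up front."""

    def __init__(self, wikis, known_api_urls=()):
        super().__init__(known_api_urls)
        self.wikis = list(wikis)

    def discover(self):
        yield from self.wikis
```

A farmer for a large host would subclass `Farmer` and implement `discover()` against whatever listing that host provides, leaving the de-duplication to the base class.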

I’m proud of WikiApiary and plan on continuing to host it (no small feat actually, given its scale) and work on it. I see WikiApiary as one of my “decade projects” so I don’t have to move too fast. I just keep things rolling the right way.

Here’s Everywhere You Should Enable Two-Factor Authentication Right Now

Two-factor authentication is one of the best things you can do to make sure your accounts don’t get hacked. We’ve talked about it a bit before, but here’s a list of all the popular services that offer it, and where you should go to turn it on right now.

This is a great list. Highly recommended to go and enable all of these that are relevant to you.

Merry Christmas 2014

Christmas Card 2014 Front

 

Christmas Card 2014 Back

Kids Running for Kid’s World Marathon Challenge

Our neighbor Nicole organized the Kids Running for Kid’s team to raise money for the Save the Children World Marathon Challenge. Mazie and Tyler joined 50 other kids running segments of a marathon around the block and raised $1,526 for a great cause.

© 2014 thingelstad