Railsconf was a blast. I met so many amazing people and saw so many great talks. My own talk turned out pretty well, I think. Though next time I think I’ll add some lighter content at the 20 minute mark to break things up.
Here’s the transcript:
Starr Home: Thank you guys for coming, this is going to be an awesome talk, quite possibly the best talk you will ever hear in your life. But I have a built an escape hatch here so if things are going wrong, I might just bail a…. Yeah. Anyway, a couple weeks ago I was having a panic attack about this presentation, because I’m a programmer, my natural habitat is in some dark place looking at a screen. It’s not talking to a bunch of people with lights, what’s up with this?
Starr: [laughs] I was in the middle of this panic attack and I went out on a walk, because that always clears my mind. I was listening to a podcast. My favorite podcast is called Hardcore History, I highly recommend it. Yeah oh, somebody likes it! [laughter]
Starr: Yeah! [laughter]
Starr: This particular episode of Hardcore History was about the Punic Wars. In case you haven’t heard about the Punic Wars, don’t know the story, I’ll just give you the brief rundown. A long, long time ago there was this place called Rome, and they decided that they should take over the world. For the most part they did. There was this one country that a was a thorn in their side for a long time called Carthage.
They fought all these wars back and forth, it was a stale mate. One day, when I don’t know maybe he was making a sacrifice to the fire god or whatever, this guy Hannibal had this great idea.
He’s like “I’m going to lead this sneak attack on Rome, and I’m going to do so by marching my army through the Alps. Which is pretty cool, that’s pretty bad ass. I think the most awesome part of the story, for me at least, was the dude had war elephants!
I don’t know if you can see it in the slide, but there’s a war elephant. It’s of under the G. A lot of people have doubted this story over the years. In 1959 there was a British engineer who, on a bar bet, borrowed a circus elephant named Jumbo and marched it across the altitude.
I don’t know what my point is here. Don’t underestimate elephants, because, they don’t like it and they have long memories.
This talk is really about biggish data. What the hell is biggish data? It’s not big data. It’s not about hadoop clusters and server farms. That’s way over my head. It’s not about fancy architectures. I’m not going to show you how to make multi-write Postgres clusters that do automatic sharding.
That’s all wicked cool, but it’s not what we’re going to talk about. Instead, I’m going to talk about something that I think is more practical and is more interesting to me. How do you keep your app working as your production dataset grows, and grows into biggish data territory?
This is very easy. This is very easy to happen. Even if you don’t have a fire-hose of data coming in, if you run a popular e-commerce site or a site with a lot of users, over the course of years you can accumulate a ton of data.
As this data accumulates, you find that your site performance goes downhill gradually in ways that you don’t understand.
What is biggish data? This talk is based on my experience at Honey badger. In case you haven’t heard of us, we are an exception, performance and up time monitoring company.
That means that we essentially have a fire-hose of new data coming in every day. Right now, we have about a terabyte of errors in our database. The world has a lot of errors. You guys need to start doing a better job.
Starr: We get two gigabytes of new errors per day, roughly. All this goes into a plain vanilla Postgres database, that backs up pretty plain vanilla rails application. The good news is that Postgres can handle this pretty easily. Postgres got it covered. The bad news is that unless you’ve engineered your app to deal with this level of data, you’re screwed.
The reason is that a 100-megabyte database behaves fundamentally different than a one-terabyte database. A lot of the conventions, a lot of the normal things we do in rails’ apps stop working when you have this much data.
Starr: This is a live feed this crew has been running since this morning. [laughter]
Starr: Something I learned. Pretty much every pagination system breaks on page like 2,000. Even if you’re, “Dear God, make this stop. I want to delete half my data.” You’re still screwed, because deleting data takes a lot of time. I’m going to explain all of this. I’m going to explain why this happens, how you can work around it. How you can optimize the queries and your own database and optimize your stack.
But to really understand this, we’ve got to take a little trip back in time, to the summer of 1978.
Starr: The summer of Disco, free love, and the VT-100 terminal. [laughter]
Starr: The VT-100 was the first computer-shaped thing that you could buy from an actual company and set on your desk. To prove that you were the alpha nerd of the office. It was also the time that Oracle 1.0 was being developed. Which was one of the first databases that used the SQL language. It’s pretty cool, it’s written in PDP-11 assembly language. In case you are too young to have ever seen a PDP-11, this is what a PDP-11 looks like.
Starr: Yeah. [laughter]
Starr: Show some love for the PDP-11. [laughter] [applause]
Starr: This is pretty awesome. Nowadays, in our modern-day times, right this second, marketers down in the presentation…What do they call that, the exhibit hall? Are right now throwing around all sorts of words about cloud-computing, platform as a service and…I don’t know, some stuff I haven’t even heard of, probably.
Back then it was the same, but the buzzwords were different. The buzzword on everybody’s lips at that time was “real-time” computing.
I’m using air quotes around “real-time” because that meant that you could have your report generated in five minutes instead of five days. Which I’m sure was pretty awesome. I would have been really happy about that.
Looking at this, it doesn’t really seem like the thing I would want to build a web app on top of, right? I wouldn’t stick honey badger on that. But a funny thing happened in the past 36 years. Moore’s Law made this into a web stack.
I know everybody here knows vaguely what Moore’s law is, but I thought it would be good to go over the technical, precise definition of Moore’s law.
Starr: That is that computers get more awesome as time goes forward. Awesomeness is generally defined as, technically, being able to do more stuff, faster. Being able to process more data, faster. Based on this, I would like to postulate for the first time in public. You are very lucky to be able to hear this. Starr’s corollary to Moore’s law. Which is, as database growth outpaces Moore’s law, you literally travel back in time.
Starr: That was the TED moment of this talk. [laughter]
Starr: I can hear your minds being blown, it sounds like bubble wrap popping. [laughter]
Starr: [laughs] Based on my extensive research of time travel movies, the first rule for surviving your new time period is that you need to understand the methods and the motivations of the people in that time period. We need to start thinking like our ancestors. Our ancestors were very interested in this thing called hardware. I don’t mean to be flip. I want to bring this up first if you happen to be having database scaling issues.
If your app, right now, is getting slower because the amount of data in your database is getting bigger, and your app happens to live anywhere named DC2, Linode, whatever other companies offer that.
You can probably solve your scaling issue right now by going and buying a real, damn computer.
You’ll be shocked and amazed because the two things that databases need to perform well, to work at high performance with lots of data, is they need lots of ram and they need really fast disk I/O.
Virtual servers give you neither of those things. Go buy a real, damn computer. While you’re at it, throw in a bunch of disks. I’m not talking about a raid array.
You probably want each of your operating system, your Postgres data, your Postgres log file, you probably want all that stuff to be on separate hard drives. That’s going to make the operating system able to more efficiently schedule disk I/O.
If that solves your problems, great. You’re welcome. If you guys want to leave the room now, that’s fine. I won’t be offended. You can go call your local colo facility. Work something out. If that doesn’t solve your problem, or even if it does, you probably want to look at your queries next.
Most of us develop against smallish data sets. When you develop against a small data set, you don’t notice inefficient queries. It’s just life. Books have been written about query optimization. It’s a very huge topic. I can’t explain everything in 30 minutes.
I’m going to explain one thing. That’s called EXPLAIN. Fortunately, Postgres gives us an awesome tool called EXPLAIN. You pass it a query and it gives us a query plan.
Starr: Wait a second, that’s a chapter from the “Iliad.” Sorry. It gives us a query plan, which still looks like a chapter from the Iliad. Fortunately, we only have to look at one metric. The only thing we need to worry about is rows, for this talk. Here we have an example of a very efficient query. It’s going to use an index and it’s going to look at one row. That’s pretty sweet. Very fast. This is the type of query that biggish data works with.
There’s one thing you need to know about rows. Obviously, the more rows you’re dealing with, the more data your computer has to crunch to give you the answers you want. The whole name of the game in query optimization is to limit the number of rows that you have to touch.
Let’s go back to that inefficient COUNT query. If you run EXPLAIN on that, it turns out that when you ask Postgres to count all of the rows in a table, it literally counts all the rows in the table, one by one.
You wind up with one of these things. It’s not pretty. It’s often hard to know when stuff like this is going to crop up. For example, I mentioned that most pagination systems break at page 1,000 or 2,000.
That’s because the offset and limit operators work a lot like COUNT in that, if you do an offset of 500,000, Postgres is going to count up 500,000 rows and then if you have a limit of 100, it’s going to count up another 100. What you see is that pages one, two and three load extremely quickly.
You as a developer – when I say you, I mean me because I fell into this trap – you test it with one or two pages and it works fine so you move on. Later on, your customers are complaining because they can’t access page 500. It’s timing out. The solution to this is to use a range query.
Range queries are very fast. I’m going to give you a really stupid example here. Here’s a link with a much more detailed analysis of all this. Essentially, if you do a range query, you’re going to be able to use an end axis. It’s going to touch 100 rows and it’s going to be really fast.
I know what you guys are thinking. That’s not the same as an offset limit query. That’s the reason I gave you the link in the previous slide. Sorting. Sorting is tricky. Sorting is the devil.
Sometimes it’s superfast. Sometimes, if you happen to be asking for data sorted in exactly the same way that an index is sorted, it’ll be super fast.
Other times, even if you have an index on a column, because it’s not setup right, or if you’re asking for the data in a slightly unusual way, you’ll wind up sorting the whole damn data set. It’s no good. Here’s your rundown for query optimization.
You want to develop against a real data set, because otherwise you won’t know when things are going wrong until they go wrong in production. Use EXPLAIN. Use it a lot. The whole name of the game here is to limit the number of rows that you touch.
The more rows you touch, the slower your queries are, in general. I don’t know about you, but that was a lot of cognitive load happening right up here. I want to give you guys a cookie.
Starr: For the second half of this talk, we’re going to cover a lot of issues that relate to the infrastructure around biggish data, around big data sets and fire-hoses of data coming in. Here’s a legend of how we’re going to approach this. We’re going to have a cute picture that hopefully symbolizes something about the topic. We’re going to have the name of the topic. Then we’re going to have a link with more info about the topic, if you want to find this later.
You don’t have to write all these links down right now. At the end of the presentation, at the very last slide, I’ll have a URL where you can get a list of all of them. I promise, this isn’t malware. Just have to take my word on that.
Our ancestors were really interested in disk operating systems. It seems like they were building a new disk operating system every two seconds. What’s up with that? I don’t know. Now we have three.
The first thing that you should do if you’re using Linux, is to increase your read-ahead cache. A read-ahead cache is something that I found not a ton of people know about. It’s a super easy way to get up to a doubling of your read-throughput.
Essentially, what this means is that Linux examines the request that you make to the disk. If it sees that you’re asking for lots of blocks that come right after another in a row, it’s going to pre-load the next set of blocks into RAM.
The default for this is 256K RAM that it uses to prefix these blocks. If you update this to use 2 Megs, 4 Megs, you’ll get a really big increase in read performance.
I use a modern file system. That means ext3 is not an option. If you want to know why, check out the link. If you happen to be using ext4 or XFS, I can never say that unless I say it super slow, you might want to look into journaling settings.
If you have your database on a completely separate hard drive, and it’s running ext4, and you have full data journaling enabled.
Since Postgres does its own journaling, you’re going to have an inefficiency there, because you have two things journaling. You have to tell PG, you have to tell Paul Graham…
Starr: …about all the RAM you bought in that fancy new box. A real easy way to set this up is to use a pgtune script. It examines your computer and writes a Postgres configuration file that has some pretty reasonable values in it. You can tweak them from there. The stuff is all really easy to find. Finally, the bane of DB administrators is the VACUUM command. Postgres needs to vacuum, because it’s messy. When you run queries, when you delete things, when you update things, it leaves a lot of stuff undone. It does that in the name of speed.
It only does what is necessary at the time of the query to get you an answer to your query. Then it uses VACUUM to clean up after itself. The problem is that VACUUM can be really resource-intensive.
If your server is under a lot of load and you see that VACUUM is also causing a lot of load, you may be tempted to turn off VACUUM or to make auto-vacuum happen maybe once a night or something.
That’s generally a bad idea. We got bitten by this ourselves, so that’s why I bring it up. Usually the answer to VACUUM problems is to VACUUM more often, not less often.
Velocity. I didn’t have time, I really wanted to have a little TRON guy on a motorcycle going down the little grid there.
We’re going to talk about a couple of things that are important if you have a ton of data coming in, or a ton of read requests coming in, a ton of queries coming in.
The first is too many database connections. Each database connection in Postgres is its own process. Each process has its own ram overhead.
There’s a limited number of connections that you want to have to your database server. If you have 1,000 workers and web app processes, you don’t want to open 1,000 database connections. You probably already know about this. The solution is to pool connections.
There’s ways to do this in Ruby. If you’re interested in a more Opsy approach, you can check out something like PgBouncer, which is a proxy that sits in between your Ruby app and your database and functions as a connection pool.
You can also run into problems with too many locks. This is the problem that you don’t really ever run into if don’t have a fire-hose of data coming in. I don’t expect you to know everything about locks. It’s a pretty complex topic.
You should know that within a transaction, if you go to update a row, that transaction is going to put a lock on the row until it’s done. It’s going to say that nothing else can write to that row until it’s done. That makes sense.
Where this can bite you is, imagine you have two Rails models. You have a parent model and a child model. The parent model has a counter cache. It gets incremented every time you have a child.
Normally, this is no big deal. People do this thing all the time. If something crazy happens, and someone bombards your API and suddenly you have 15,000 children created, you’re going to have some locking issues.
What’s going to happen is your first query is going to execute fine. The child’s going to get created, it’s going to increment the counter, everything’s going to be fine.
While it was doing that, it put a lock on that row. Once the first query is done, the lock is removed and the second query happens. If you had infinite time, this would all be fine. It would all eventually work itself out.
What happens is that by the time you finish query 100, query 15,000 has timed out, which causes all sorts of fun in your unicorns and your sidekicks and all that. It’s a huge pain in the neck.
The way you avoid this is, it’s just an architectural thing, you have to avoid any situation where you could be updating the same record in the database a jillion times per second.
Intensive database queries. Sometimes we have our production database and we need that to be very performant for our users. We also need to maybe do some pre-processing on data as it comes in.
A really simple way to do this is to use Postgres’ streaming replication facilities to create a read-only replicant. Then you do your intensive queries against that and it doesn’t affect your users. It’s super simple. I feel dumb even talking about it here.
Partitioning is awesome. Partitioning is the best thing ever. It allows you to dramatically speed up data culling and data archiving. What you can set up in Postgres is a partitioning scheme in which data for different days go into different physical database tables. That’s all right.
The really cool thing is that you access that data by querying a parent virtual table that then propagates that query across all its children and magically does its results and spits them back to you.
You don’t have to update your Ruby code at all, which is sweet. This way, if you want to delete data that’s 180 days old, you drop the table that’s associated with 180 days ago. If you were to do a DELETE where created at is less than 180 days ago, you’re probably going to be waiting for weeks.
Last but not least, backups are a real pain in the neck when you have a huge data set. That cron job that dumps your entire database and uploads it to S3 doesn’t work so well when your database is two terabytes big. That’s a big S3 bill too.
Fortunately, we can take advantage of the same things that Postgres does to allow it to do streaming replication, to do on-the-fly incremental backups.
There’s a tool called wal-e which makes this super easy. It’s really cool, because it makes it very easy to upload incremental backups to S3. When you want to restore your database, it makes it really easy to restore a specific point in time. Wal-e’s really awesome.
I stayed in my hotel room last night. I didn’t get to go to the speakeasy thing, which bummed me out. I had to work on these slides for you people.
Starr: When I got done with this, when I reached this point, I looked back on all these slides. There’s 54 of them. I was like, “Holy crap. This is a lot of information.” The thing I want to stress is that, at least in my experience, these issues tend to come at you one at a time. This isn’t the thing where you have to know all of this stuff in advance, in order to deal with biggish data in Postgres. Things come at you one at a time. You can deal with them one at a time. I have faith in you. I think you can do it. You’re awesome.
Nobody told us that we could turn transistors into wall cats, but we did it. That’s the type of people we are. That’s why I’m proud to be at RailsConf.
If you’re interested in talking to me about this, if you think I’m full of shit about anything, which I probably am on at least one point, the statistics say so, feel free to come up to me after my talk here.
I have delicious hot and spicy Mexican candy as an incentive. If you want to learn more about the stuff that… Do you want to take a picture of the Mexican candy? OK. I can send it to you. [laughs] If you want to learn more, all of the links referenced in this talk are at this URL.
If you are interested in having more visibility into your production errors, check out Honey badger. We love Rails stuff. That’s it. It says end of show here. I guess it must be the end of show.