David Blevins – Deconstructing REST Security, Iterate 2018

Nate Barbettini: David Blevins is the founder and CEO of a company called Tomitribe. And together Tomitribe and David have been working in the enterprise Java world for a very long time. They've contributed majorly to big projects like OpenEJB, Apache TomEE, Apache Tomcat, and MicroProfile, which is a really big deal if you're in the Enterprise Java EE world right now. David has been at the forefront of working on decentralized applications and Java enterprise apps for many years. And he is here to share some knowledge and nuggets of wisdom with us tonight at Iterate about security, which is one of my favorite topics. Without further ado, welcome David Blevins to the Iterate stage.

David Blevins: There's a microphone, there we go. All right. Well, thanks for the introduction that I can't possibly live up to. "Best for last," that's not really a good thing to say to somebody before they go on stage. Anyway, now that I'm hopelessly going to disappoint you, yes, I'm David Blevins.

A lot of this talk came from kind of real-world experience as Nate mentions. We're very big in the middleware space. And a couple years ago we rolled out Apache TomEE with a very major customer and they said, "Great, now what? How do we secure this thing?" And we were like, "Basic auth." And of course that doesn't really fly. We had to get up to speed on all things modern security in a very short period of time. 

And as people who have been participating in Java-based standards with the Java Community Process and J2EE and Java EE and all that kind of stuff for very many years, you know that there are always some people in the specification who aren't happy and they always leave. The very first thing that we did to get up to speed was basically look at who was on the first version of these specifications and then who is on the most recent version and whose names are no longer there. You contact those people, ask them why they left, and then they can't wait to rant and tell you all sorts of things about what's wrong with a specification, which gives you a really good counterbalance to the everything's-sunny-and-perfect perspective you always get from the spec lead.

We talked with people like Facebook, Microsoft and Google and so on and so forth, and Amazon, and we got a lot of different perspectives on how to do this type of REST-based security. We noticed first that Google and Facebook, very much with user-based applications, are tending to go with JSON web tokens. But there's also the source code for the Amazon EC2 client, which by the way, that little command line client you use on EC2, that's written in Java. There's actually just a small little wrapper on the source code, and it was all on GitHub. We went digging through that thing and figured, "Well, if they're securing a few billion dollars worth of business on EC2, what are they using for security?" They use a really cool technique called message signing and we actually contacted the person who maintains that git repo and said, "Hey, what do you think about us using that?" And they pointed us towards a much simpler specification, which is basically identical in spirit to what Amazon does to secure all the REST API calls, and that's called message signing.

Basically, we're going to look at both of these two worlds. And if we have time, we're going to see how they might fit together to make kind of a perfect solution that kind of covers all use cases, business to consumer and business to business. Okay. 

First of all, I think this quote really sums up a lot of the problems and difficulties in digesting security standards, and standards in general. It says the great thing about standards is that there are so many to choose from. And I don't know who has ever tried to pick through and read security standards like the IETF ones or anything like that.

Okay. Basically what happens is when you're reading these specifications, number one, they can't help but rename everything that came before as if all other ideas are 100% new. That means it's really difficult to get a bearing, you have no footholds that you can stand on that feel familiar. Number two is that they enormously point to each other. You're halfway through a specification and then it suddenly stops and there's a bunch of pointers to other specifications and you're like, "Wait a minute, am I not done?" You start reading those specifications and then they also point to other specifications. And then inevitably this moment happens where you're at the one spec and you think this is the one that's going to solve it all and you realize, I already read this one. And you just got a circular reference pointer right back to the beginning.

It's maddening. And when you're looking through these things, the spec authors dream big. They're thinking of all these things that that spec can solve, but your problem set is very narrow. You have to sift through all sorts of ideas that don't apply to you and it becomes really difficult. And the more you read, the more you feel the less you know. What this talk is, is basically trying to give you a very narrow and opinionated perspective on how things like OAuth, JSON web tokens, and HTTP signatures and all these things apply specifically to REST communication in a microservices world.

You may be familiar with all these concepts and you think they can do much more than that, but understand this talk is very narrow and opinionated to get you basically from beginner to expert as quickly as possible. As I mentioned, we're going to be focusing on OAuth 2.0, JSON web signatures, what OAuth 2.0 looks like before JWTs and after, and the message signing aspect of it. To understand how these things impact our architecture, I think it's really, really important that we have a baseline way to have a conversation.

For all of these scenarios, we're going to basically say that we have an architecture where we have 1,000 users who are making three transactions per second to the front door. On the back end, we have a minimum of four micro-service hops before we can answer that request.

Okay. Basically what that means is we're going to have 3,000 TPS coming in through the front door, and that results in 12,000 TPS on the back side. Everything we do is going to be measured against that. First, basic auth, this is what we're all starting from. We're just going to talk about this to understand the world we're living in now. All right.
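The baseline numbers can be reproduced with a few lines of arithmetic. This is only a sketch of the scenario as stated in the talk; the function and parameter names are made up for illustration.

```python
# Baseline scenario: 1,000 users, 3 transactions per second each,
# and 4 microservice hops behind every front-door request.
# The function name and parameters are illustrative, not from the talk.
def baseline_load(users, tps_per_user, backend_hops):
    front_door_tps = users * tps_per_user
    backend_tps = front_door_tps * backend_hops
    return front_door_tps, backend_tps


front, back = baseline_load(users=1_000, tps_per_user=3, backend_hops=4)
print(front, back)  # 3000 12000
```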

Basic auth is an incredibly complicated and unbreakable security mechanism where you take a username and password, you concatenate them together with a colon and you Base64 encode the result. No one can ever figure that out. In thirty years of internet history dating back to the ARPANET, it's never been broken except by everybody who's ever done it because encoding is not encryption. Anybody who has the algorithm can encode and decode, and Base64 is our universal standard for making binary data be at least text legible.
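To make the "encoding is not encryption" point concrete, here is a minimal sketch using only Python's standard library. The helper names and the sample credentials are assumptions for illustration; the header format itself (username, colon, password, Base64) is what basic auth defines.

```python
import base64


def basic_auth_header(username, password):
    # Concatenate with a colon, then Base64 encode. No key is involved.
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def decode_basic_auth(header_value):
    # Anyone who sees the header can reverse it the same way.
    encoded = header_value.split(" ", 1)[1]
    decoded = base64.b64decode(encoded).decode("utf-8")
    username, password = decoded.split(":", 1)
    return username, password


header = basic_auth_header("joe", "secret")  # sample credentials, made up
print(header)
print(decode_basic_auth(header))  # ('joe', 'secret')
```

Decoding needs no secret at all, which is why the credentials are only as safe as the transport they travel over.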

Base64 is terrible, that username and password if it's stolen, you’ve got to shut down the whole person's account. There's lots and lots of issues architecturally with basic username and password based approaches. We're going to put the 3,000 TPS in red because we're signifying that the username and password is seen on the wire even over SSL, it's on the wire 3,000 transactions per second. Everybody's identities are on the wire 3,000 transactions per second. 

And also, you need something that's going to validate the username and password. Now, we introduce into our architecture something on the bottom, which is typically like an LDAP or a database. A lot of people are using Active Directory, so on so forth. That means if we have 3,000 transactions per second coming against the front gate, we have also an additional 3,000 transactions per second hitting our central store for user information, our LDAP, our database.

It has to keep up with the front door traffic. That's kind of an architectural smell right there. We got a center point of failure, we want to avoid these. On the backside, typically, people have no authentication information at all that they're passing around. Who authenticates traffic on the back end? All right. We got about three hands, four hands out of the whole room. And that's typically what I see. Did anybody here ever work for Target Corporation? Okay. Well, they had a very big case where they had lots and lots of credit cards stolen. And they weren't necessarily stolen from the outside, it turned out to be an inside job. Why do we want to secure the back end? Who has contractors coming in and out of your four walls all year long? Most of us. We're basically securing the front door with unbelievable lengths, but then on the back door, we're letting rotating temporary workers come into our enterprise and they're all equipped with laptops and knowledge and we don't expect anything bad is going to happen.

We try to lock down the back end a little bit, but we're not doing a very good job of it. What people typically do is, we'll just say, IP whitelisting. IP whitelisting was fine five years ago and in the previous ten years. But IP whitelisting is the natural enemy of elasticity. What we're trying to get to in our architecture is the ability to grow and shrink. Every time you spin up a new instance and every time you shut it down, you end up with a new IP. What are we going to expect to happen? We're going to have a team of network people who just type in IPs every time we spin up new instances? It's not scalable. IP whitelisting as our form of point-to-point security in the backend is the natural enemy of elasticity, which is where we're going.

What a lot of people end up doing is either staying on an architecture that's not elastic, or passing along the username and password that they got from the user. That's fine, but now we're having 12,000 transactions happen in the back, and each of them, if they had to validate the username and password, would cause 12,000 additional transactions per second against your LDAP.

Now, LDAP is hosting 15,000 transactions per second and we know that that is not scalable. We need to figure out something on how we can actually secure the backend and not just the front end. We also end up with very, very, very bad pieces of code that do things like this. Microservice A says to microservice D over there, "Hey, give me all of Joe's salary information." And of course microservice D says, "Sure, you got it, coming right up," and it gives it to it.

Why are we writing software like this? There is no proof that Joe is on the other side. And as a result, this is how you get these big bulk breaches. This is how someone can walk into your enterprise and get a list of everyone's names and then ask for all their salary information in bulk and leave. If they have the ability to get one, they can get them all because we've written our code so that it just blindly gives the answer because we trust microservice A. But what if microservice A was compromised? There's nothing there to prevent anything from happening.

Now, let's add the scenario that the evil country of Latveria wakes up one day and decides to hack us ... by the way, who knows where Latveria is? Yeah. All right. It's in the Marvel Universe, it's where Dr. Doom lives. Of course, you have to give a politically correct presentation so we can't list an actual evil country because there are no evil countries so we make one up. That's how you get around stuff like that. 

Okay. This is what the people of Latveria look like in my presentation, they look like Snidely Whiplash. These people are sending 6,000 transactions per second of basically trying to hack us. They're making up username and password combinations and they're trying to bang on the front door. Now, we have our regular old healthy users who are doing their standard old 3,000 transactions per second. And as a result, we're now getting 9,000 transactions per second against our central user store.

All right. This is not very good. What's happening is that the evil people are affecting the experience of the good people. We have no way to sift out the bad from the good and so they're all hitting our center point of failure. Not only have we built the center point of failure into our architecture, but we're letting the bad people use it too. Well, they can not only try and hack us and get in legitimately, but they can denial-of-service us. It's not a good architecture. Okay. 

To solve the password problem, OAuth 2.0 enters the picture. And this part is a little tongue-in-cheek, OAuth 2.0 in its original state and its problems. Basically, OAuth 2.0 tries to solve the password problem like this: when you log in, you get a token. And that token lives on each device that we have.

If I have an iPad, a laptop, and a phone, I might have three devices. And if I lose my iPad or lose my phone, I can just say, "Hey, revoke the access of my phone." Who uses Twitter and notices that you have the ability to revoke any of your authorized devices? You can go in there, you can see your iPad, you can see your desktop, you can see your phone. And if you lose one of them, you can just log in with the one that you do have access to and you can poke out the one you want. That's effectively what we have with OAuth, we have the ability to create a token per device and it’s associated with me. And if I lose that device, I can block out its access.

And so, it's very good. We have the ability now to log individual devices out because they're not associated, they're not using my username and password anymore. To get a token in OAuth 2.0, you do this amazing thing called a grant process, which is a revolutionary thing where you take a username and password and you put it in a form URL encoded payload and you post it to an endpoint. This is amazing, no one's ever done this at all in the last 20 years. We get back effectively what's called an access token and a refresh token, and it's a short little ID. 

Okay. First of all, who's used form URL login in your website at all in the last 20 years? All of your hands should be going up, basically, this is what we all have been doing.

And if you do a form URL login and you're working on a system that uses session IDs, you get something back that looks a little bit like that. So far it's feeling like it's all being presented as new, but it's all the same old stuff. This is kind of what we have to watch out for when we're evaluating technologies. We take this little access token, we put it in the authorization header instead of our Base64-encoded username and password. And we can just send as many REST requests as we like, doesn't matter. All the payloads are changing, the token is not.

And then eventually, bam we get a 401. And what that means is that we now have to re-up our session, our access token has expired, we can no longer use it to get entry. And we don't want to prompt the user for the username and password again. What we have in this scenario is a refresh token.

We take a refresh token, which we haven't been using at all up to this point and we post that to the same place we got the access token. And then now as a response, we get a new access token/refresh token pair. Here was the old access token/refresh token we were using, that's the new one we can use. We can no longer use this old pair. And, of course, the amazing thing about standards is that everything is optional, refresh tokens are entirely optional in the OAuth 2.0 spec. And if they are implemented, there are no semantics in exactly how they're implemented. I've seen some implementations where the refresh token isn't supported and they just say, "Take this access token, it lives forever."
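The requests in this flow can be sketched without any network calls. The `grant_type`, `username`, `password` and `refresh_token` parameter names come from the OAuth 2.0 spec; the endpoint URL, helper names and sample values below are hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical token endpoint; every deployment defines its own.
TOKEN_ENDPOINT = "https://auth.example.com/oauth2/token"


def password_grant_body(username, password):
    # The "grant process": a form URL encoded payload posted to an endpoint.
    return urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
    })


def refresh_grant_body(refresh_token):
    # Posted to the same endpoint once the access token starts returning 401.
    return urlencode({
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
    })


def bearer_header(access_token):
    # Sent on every REST request in place of the basic auth header.
    return {"Authorization": f"Bearer {access_token}"}


print(password_grant_body("joe", "secret"))
print(refresh_grant_body("made-up-refresh-token"))
print(bearer_header("made-up-access-token"))
```

Both bodies would be posted with a `Content-Type` of `application/x-www-form-urlencoded`, the same as an ordinary login form.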

I've seen some implementations that allow the refresh token to be used many, many, many, many times and I've seen some implementations that are smarter, which allow the refresh token to be used exactly once. Can anyone guess why we might not want to let the refresh token be used several, several times without expiring it as well? Yeah. Basically what happens is that if I allow the refresh token to spawn off many access tokens, effectively a user logs in once and then that refresh token can effectively fork multiple sessions. And if I'm a hacker, I can give them all those access tokens away and effectively I have taken one identity and one login session and I've forked it a thousand or a million times and now I have a brute denial-of-service attack that's all legit.

We do not want the refresh token to be used many, many times. Single use and it's done, you get a new refresh token. That basically means when someone logs in, they only ever have at one moment one valid refresh token and the second it's used, it expires. That's the ideal way to set this thing up. All of that is optional, the scenario I mentioned is never covered in the OAuth 2.0 spec. This is just real-world stuff that you’ve got to learn the hard way. That's why I really wanted to make sure these types of things are covered in the talk. Once we have that new access token, we can go ahead and use that in the authorization header and we just continue on with life.
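A single-use refresh token can be sketched with an in-memory store. This is an illustration of the idea only, assuming one store per server; the class and method names are invented, password checking is left out, and a real system would also expire tokens over time.

```python
import secrets


class TokenStore:
    """Toy in-memory store where each refresh token works exactly once."""

    def __init__(self):
        self._refresh_tokens = {}  # refresh token -> username

    def login(self, username):
        # Password validation is omitted from this sketch.
        return self._issue(username)

    def refresh(self, refresh_token):
        # pop() removes the token, so a second use of the same token fails.
        username = self._refresh_tokens.pop(refresh_token, None)
        if username is None:
            raise PermissionError("refresh token unknown or already used")
        return self._issue(username)

    def _issue(self, username):
        access_token = secrets.token_urlsafe(16)
        refresh_token = secrets.token_urlsafe(16)
        self._refresh_tokens[refresh_token] = username
        return access_token, refresh_token


store = TokenStore()
access, refresh = store.login("joe")
access, new_refresh = store.refresh(refresh)  # old refresh token is now dead
try:
    store.refresh(refresh)  # replaying the old token
except PermissionError as error:
    print("rejected:", error)
```

Because the old token is removed before the new pair is issued, one login session can never fork into many live sessions.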

All this stuff is probably 30 pages into the OAuth 2.0 spec by now. And what have we accomplished? You have a lot more passwords, that's basically what happened. If someone sees your access token, they can steal it and be you. You have to communicate over SSL just like basic auth. Effectively, an access token is just as susceptible to theft as a username and password. The one advantage is that they can't reverse engineer an access token to find the original username and password, that would be an advantage, but we do have effectively a lot more passwords.

Another thing that's been accomplished aside from this one gain of being able to put effectively a password, a temporary, short-lived password on each of our devices, we've achieved renaming form URL login, which we've been doing for the last 15 or more years.

Your first web app that had a form page posts that in URL form encoded to a place that gets some sort of ID back, nothing different. Now, I poke fun that we haven't achieved anything, we've managed to achieve renaming logging in to a very fancy term, grant process. The spec authors are thinking we want this thing to fit in what the industry is doing now, which is really, I feel that heart mentality like, good for you for not trying to change everything. But why do you have to rename it so we can't know what it is? Why are you presenting it as completely new and confusing all of us when basically this is the thing that we already know?

What we have is a scenario where there's 10% new and 90% of what we've already been doing has been renamed. We don't call it login in the spec, it's called a grant process, super fancy. And what do we get back from our grant? We get a token. What's a token? It's basically a made-up string, it's a fancy way of saying a chunk of text. We don't want to call it a chunk of text.

I like to say that that chunk of text is basically a slightly less crappy password or an equally crappy HTTP session ID. Let's now go through how OAuth 2.0 looks from our architectural perspective with the access tokens that are very small. Let's say, for the sake of our scenario, we have 1,000 people there. They're going to log in once a day, every day we make them log in with their username and password again.

It doesn't matter how many refreshes they do, every day they have to log in at least once. We're going to have 1,000 people logging into LDAP once a day, so now LDAP is getting 1,000 hits a day whereas it was getting 3,000 transactions per second. That's a big improvement, the load on LDAP has now gone down to nearly nothing.

But now, we have now 3,000 transactions per second against our token store. A token store is basically your session store. It's when someone gives us a little ID, we got to go to that store, we got to look up who they are and what that ID means. We've swapped one center point of failure for another point of failure. 

And then on the backside, we're still doing IP whitelisting because basically if I passed the OAuth 2.0 access token, the little short one from microservice to microservice, I'm going to end up with a scenario that looks a lot like this. That microservice is going to go, "What the heck does this mean?" And it's giving a little made-up string. And the other microservice says, "I have no idea who this is." All of those microservices go asking the token store, and now we're generating the 12,000 transactions per second against the token store, which of course now it's processing 15,000 transactions per second, which is 0.5, which is 55% of all of our traffic and of course it explodes in a fiery doom of death and now we're getting zero transactions per second because our center point of failure has just been railed.
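A quick back-of-the-envelope model of that shift in load, using the scenario's numbers. The function names are invented for illustration, and the model assumes every service that receives an opaque token performs exactly one lookup against the token store.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def ldap_logins_per_second(users, logins_per_user_per_day=1):
    # With tokens, LDAP is only hit when someone actually logs in.
    return users * logins_per_user_per_day / SECONDS_PER_DAY


def token_store_tps(front_door_tps, backend_hops, forward_token_to_backend):
    # Every request at the front door needs one token lookup.
    lookups = front_door_tps
    if forward_token_to_backend:
        # Each backend hop also has to ask what the opaque token means.
        lookups += front_door_tps * backend_hops
    return lookups


print(round(ldap_logins_per_second(1_000), 4))  # roughly 0.0116 per second
print(token_store_tps(3_000, 4, forward_token_to_backend=False))  # 3000
print(token_store_tps(3_000, 4, forward_token_to_backend=True))   # 15000
```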

Okay. We haven't avoided any of the downsides of basic auth, everything that I mentioned was a downside of basic auth. We have our same center point of failure, it's a different system now. There's an additional problem with this is that these things are pointers. What do they point to? That's a really big thing. Now, we have to store that information on what they point to. 

Do any of you have any legal requirements that require you to be able to know who logged in and did what a year in the future? Most people have these types of restrictions if you're dealing with anything that's financial-related. You've got PCI constraints and concerns. If you're doing HIPAA-related stuff, you’ve got to be able to know who did what when. You can't just simply say, "Oh, some things happened, but don't know who they were. There was just a little token here, I don't know who that belongs to. Well, I can't answer any basic questions. What do you mean I'm getting sued? I have to pay a big fine, this sucks."

You have to store these things if you're doing any sort of real business. What happens and what I've seen is people roll out these systems and then day one, that token store is very empty and they're going, "Man, this thing performs great. I love it, it's fantastic." Four years later there's millions and millions and billions of old tokens sitting in a store and suddenly it stops performing so well. It's way worse than a typical type of a system because that thing grows bigger and bigger over time. Since this talk is really trying to be buzzword averse, to not propagate everything old as new, let's come up with some better names for access token. How about we call it access pointer, would that sell? Probably not. Access primary key, not so attractive. It's honest, but it wouldn't sell.

I like to think of OAuth 2.0 as a high frequency password exchange algorithm. The funny thing is that if you are a vendor, these terms are really, really fantastic because what happens is a customer calls you in and they say, "You know what, we're trying to get rid of our basic auth system. All the passwords flowing around, it's a real bad idea. What do you got? Go." And you say, "Tokens." The typical thing for vendors is to be like, "You know what tokens are, don't you? All my smartest customers know what tokens are, you look like a smart customer." And they go, "Yeah. We totally know what those things are," and they're Googling on their phones.

That's how things happen. But if we’re all honest it would be like this, "We're trying to get rid of our basic auth and our password based system, what's your solution?" "More passwords." They would kick you right out the front door. And you would be explaining, "No. It has all the same architectural disadvantages of what you're using now, but there's a lot more of them." They're like, "That does not sound attractive at all." This is what we have, this is what OAuth 2.0 looked like for the first five years of its existence. To understand the next sections, there's really one critical concept that you have to know and understand, all things on top of it are syntactic sugar. We need to cover this concept, let's pose a programming problem for us.

We have a file system, and in a directory there are files. And our goal is to know when the contents of a file have changed. Do we A, monitor the date stamp of the file? Or B, do we see the byte size of a file and if that changes the file has changed? Who says A? All right. Who says B? Who says both? Okay. 

The answer is none of the above. 

The answer is, I heard it over here, you said checksum, that is the right answer. There is a concept of hashing that is fundamentally the basis of all REST security and all security concepts that are not crap. To be honest, I'll just go right out and say it. Hashing is this concept of protecting data. A hash is a short number that is statistically unique. 

Who uses GitHub?

Okay. All right. Just checking to see if your hands work. I feel sorry for those who either your hands don't work or you're not using GitHub. All of this is like git because there are other git providers, there we go, such as yourselves, you could be a git provider also. You see there's little kind of hex things at the beginning. And those are basically statistically unique numbers made out of thin air by looking at the data of your commit compared to the previous commits. And boom, you get a nice number that can serve as an ID. 

The way I kind of describe this is to imagine a snow globe. And I've got two analogies, if one doesn't work for you, the second one's going to do it. You have a snow globe, and in the snow globe there are 256 snowflakes and they represent bits in a 256-bit number. They all start out as zeros. And as the data pulls through, we shake that thing. We want those bits to be as random looking as possible, we don't want them to look like zeros anymore, we want them distributed. When the data stops coming through, we instantly stop shaking. And whatever position those ones and zeros are in, that's our hash. We crack the thing open and we take those snowflakes out and we lay them all out and that's our hash. And you can see the snowman, actually, he's crying now. We just cracked open his snow globe. That part is our hash.

Now, I've made it sound random. The key part of this is that it's deterministic. If the data passes through and it's different data, we would get a different number. If one byte of the input data is different, that hash comes out different. But if the data is one gig, it doesn't matter, it's still 256 bits. That 256 bits is statistically representative of the whole data set. And it seems mind-blowing that this can actually be true and it is mathematical magic. And there is a limit to how much you can really create a unique number based upon how many flakes you have. Effectively, the more flakes, the more bits in your number, the better.
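These properties are easy to check with Python's `hashlib`. A minimal sketch: the sample inputs are arbitrary, and one megabyte stands in for the "one gig" of data to keep the run short.

```python
import hashlib


def sha256_hex(data):
    return hashlib.sha256(data).hexdigest()


small = b"hello"
large = b"x" * 1_000_000  # stand-in for a much larger input

# Deterministic: the same input always produces the same hash.
print(sha256_hex(small) == sha256_hex(small))  # True

# Fixed size: each hex character carries 4 bits, so 64 characters = 256 bits.
print(len(sha256_hex(small)) * 4, len(sha256_hex(large)) * 4)  # 256 256

# One changed byte gives a different hash.
print(sha256_hex(b"hello") == sha256_hex(b"hellp"))  # False
```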

If I had eight bits, I can't represent a lot, maybe 256 different combinations of things. There's a really high chance that no matter what data I feed it, it's going to clash with something else. The more bits I add, the less chance there is of a clash and a collision and the more unique in the world the number is. Right now, SHA-256 is basically the standard, it used to be SHA-1, which is 160 bits and that got ruled to be breakable by anybody out there with a decent laptop or a mobile phone these days. And 512 is probably what we'll be doing in a couple years whenever computing power catches up. Now, you cannot get the data out of a hash. A hash is one way.

Here's another way of describing a hash, let's say I have a piano keyboard and it's got 256 keys on it. And below that, there is a zero, a bit for each key. And as the data streams through, I play it like music and I'm playing chords and I'm playing arpeggios and I'm doing all sorts of things. And when the music stops, I take my hands off the keyboard. And every time I have touched a key, that bit moved from positive to negative or negative to positive. Basically, those bits are getting toggled as I'm playing the music and when the music is done, my hands are off the keyboard and the resulting number at the bottom is my hash. If I change a note in the music, am I going to play different keys? Yes. Will the bits be different? Yes. Can I look at the bits and guess what the music was? No. That would be really, really hard.

The resulting thing is our hash, and sometimes that's referred to as a fingerprint. And if you think of the piano keys as the things we've touched and our fingerprints are literally all over them, it really helps to remember why it's called a fingerprint. 

Let's take a real-world scenario, I've got a text file. And in the text file I have Eagles beat Patriots 41 to 33, who's a sports ball fan? Not me, so I had to guess sports ball, I think you might like that. Anyway, the scenario is valid though. Let's say we have a game and that's the final score of the game and the hash has come out to be a certain way. And then this is what the hashes look like, just maybe look at the row. This is the hash of the text file with the contents of 41 to 33. If someone tries to change the contents of the file so the score is actually 31 to 33, or 41 to 34, the hashes will be this. And anybody can run it through these basic major hashing algorithms and you will see this output so it's totally stable.
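The score example can be reproduced by anyone with a standard library. This sketch assumes the file contains exactly the sentence from the talk, encoded as UTF-8 with no trailing newline; the choice of SHA-1, SHA-256 and SHA-512 as the "major hashing algorithms" is an assumption, since the slide itself is not in the transcript.

```python
import hashlib

ALGORITHMS = ["sha1", "sha256", "sha512"]  # assumed set of algorithms


def fingerprints(text):
    data = text.encode("utf-8")
    return {name: hashlib.new(name, data).hexdigest() for name in ALGORITHMS}


original = fingerprints("Eagles beat Patriots 41 to 33")
tampered = fingerprints("Eagles beat Patriots 41 to 34")

for name in ALGORITHMS:
    print(name, "original:", original[name])
    print(name, "tampered:", tampered[name])
    print(name, "match:", original[name] == tampered[name])
```

Running it twice, or on another machine, prints the same values for the same text, which is what makes a hash usable as a fingerprint.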

Now, the thing is, what if someone does try and sneak a change in and we're basically arguing whose hash is the real one? We need to establish an authority over which hash is the real one. That's the whole thing about distributed systems. Now, things like Bitcoin do it by masses. The more people who think the hash is this, they just win. And that's fine, but in a security system, we can't have it be like that, where the people who think that they're right get in. That does not define good security, all the hackers think they're correct so let's let them right in the front door.

You have to protect the hash. And the way that you protect the hash is by encrypting it. 

There are two styles of encryption, there's symmetric and asymmetric encryption. We're basically going to take that hash, which is that snow globe, and we're going to put it in a box and we're going to lock it. And if we use symmetric encryption, the people who have the symmetric key (there's only one key) can both lock and unlock the box. It's fine for certain kinds of security where we want to know that both sides have the same secret.

But the other kind is RSA, which is asymmetric, and basically one side can close the box and a whole other side can open the box, the responsibilities are split.

Here's a way to imagine this in architecture using real-world people. Let's say that that's Sue and that's Bob. Sue and Bob have equal ability to write data that's protected by the key and read data protected by the key, okay? They're equal, they can both do the same things. And if we see encrypted data, we don't know if Sue made it or Bob made it. At least they have a private way to talk between just the two of them.
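A shared-secret signature can be sketched with HMAC from Python's standard library. Note the swap: HMAC mixes the key into the hash rather than literally encrypting a finished hash, but it gives the same property described here, namely that only holders of the one shared key can produce or check the value. The key and message below are made up.

```python
import hashlib
import hmac

SHARED_KEY = b"key-known-to-sue-and-bob"  # made-up secret for the sketch


def sign(shared_key, message):
    return hmac.new(shared_key, message, hashlib.sha256).hexdigest()


def verify(shared_key, message, signature):
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(shared_key, message), signature)


message = b"Eagles beat Patriots 41 to 33"
signature = sign(SHARED_KEY, message)

print(verify(SHARED_KEY, message, signature))                            # True
print(verify(SHARED_KEY, b"Eagles beat Patriots 41 to 34", signature))   # False
print(verify(b"some-other-key", message, signature))                     # False
```

Because Sue and Bob hold the same key, a valid signature proves the message came from one of them but not which one.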

Asymmetric would be there's one key that's called the private key and another key that corresponds to the private key, and it's called a public key. And if you need a visual in your head to imagine this magic, think of yin-yang. You got one side that's white with a little black dot and one side that's black with a little white dot. That white dot corresponds to the other side and they sort of show that each side has balance.

Basically what happens is we take and break that apart and we use the black side with the white dot, the white dot links it to the public key and the black dot links that public key to the private, so they're connected. We distribute the white one to all the people we want to be able to validate the data, but we keep that black one private. In this scenario, Sue has the ability to sign data and everybody can know that Sue is the one who made it, but nobody else other than Sue can sign that data. Now, that signature is unique to Sue. For all those people on that side, the data is effectively read-only. And that's a very, very significant thing. We have distributed data that's read-only, we've solved how to distribute it and make it not modifiable, but still it's in plain text.

The signature is the part that's encrypted. The payload, the thing that we sign, the sheet music, anybody can read it. But if someone tries to modify the sheet music, we can check it with the public key that corresponds to Sue's and we know that it's not what Sue sent us, someone modified Sue's sheet music.
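To make the split responsibilities concrete, here is a toy sketch of hash-then-sign with textbook RSA, written with nothing but Python's built-in `pow` (Python 3.8 or later for the modular inverse). Everything in it is an assumption made for illustration: the primes are tiny by cryptographic standards and there is no padding, so this is not how a real library signs data. It only shows that the private exponent creates the signature and the public exponent checks it.

```python
import hashlib

# Toy key material built from two known Mersenne primes.
# Far too small for real use; chosen only so the example runs anywhere.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus, shared with everyone
E = 65537                          # public exponent (the half we distribute)
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (the half Sue keeps)


def digest_as_int(payload: bytes) -> int:
    """Hash the payload and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(payload).digest(), "big") % N


def sign(payload: bytes) -> int:
    """Only the holder of D can produce this value."""
    return pow(digest_as_int(payload), D, N)


def verify(payload: bytes, signature: int) -> bool:
    """Anyone with (N, E) can check the signature, but cannot create one."""
    return pow(signature, E, N) == digest_as_int(payload)


sheet_music = b"C D E F G"
signature = sign(sheet_music)

print(verify(sheet_music, signature))    # the payload Sue signed
print(verify(b"C D E F A", signature))   # a modified payload
```

The payload stays readable the whole time; only the signature depends on the private half of the key.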

Okay. Now, enter JSON web tokens. Now, we're going to see how OAuth 2.0 with JSON web tokens plays out. Of course, the OAuth 2.0 specification is really more of a design pattern than an actual spec because it doesn't actually give too many concrete details. You can literally read through this whole thing going, "Is the next chapter going to be the one that tells me all the real stuff?" And then you get to the end and realize, "They were never going to tell me how to actually do all of this, they're just describing things to me that are infinitely flexible."

One of those things that's infinitely flexible is that it says the access token could be anything you want, doesn't matter. And JSON web token says, "Here's how the access token could look." Now, of course, the JSON web token is way bigger than OAuth, they're trying to sign data globally for everybody. They're not going to talk about OAuth 2.0 a terrible amount in the JSON web token spec, which of course makes it really frustrating because you're coming from a very specific perspective.

All right. First of all, it's pronounced JOT. And, of course, that's what a W sounds like when I pronounce it. I don't know what you guys are talking about. And then it's basically just a fancy JSON map that's signed. It's got a built-in expiration, and here's what our access token looked like previously and here's what our access token looks like now. It's amazing, I love it. This is so much better. Now, this is why we have to show an architecture because every single one of us at this moment basically thought I want the smaller one because that's the perspective that we have, "Look, that's so much bigger. That's going to be slowing the network, we definitely can't use that one. We should use a smaller one, it's going to be way better."

First of all, this is what the payload looks like. We have a little thing at the beginning that says what things we used to sign this payload in yellow and then there's the actual Base64 encoded signature at the bottom. In this access token, we have the following information. We know who it belongs to, we know what permissions they have, we know when the access token expires. There's an EXP thing, we know the date that the token was issued. And it's got a unique ID right there. If this person goes to get a new access token and all the other information is more or less the same, that unique ID causes the hash to be different because the unique ID will change, and therefore the hash will change. This token is always going to be a unique token that represents this moment in time.
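The three-part layout described here (header, payload, signature) can be sketched with the standard library alone. Note the swap: the talk signs with an RSA private key, while this sketch uses HMAC-SHA256 (the `HS256` algorithm) because Python ships no RSA primitive. The claim names `sub`, `iat`, `exp` and `jti` come from the JSON Web Token specification; the `scope` claim, the key and all the values are assumptions for the example.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, the encoding JWTs use for each part."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_token(claims: dict, key: bytes) -> str:
    """Build header.payload.signature, signing the first two parts."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


KEY = b"demo-signing-key"  # hypothetical key for the sketch
claims = {
    "sub": "joe",               # who the token belongs to
    "scope": "read write",      # permissions (claim name is an assumption)
    "iat": 1_500_000_000,       # issued at
    "exp": 1_500_001_800,       # expires half an hour later
    "jti": "id-0001",           # unique ID for this token
}

token_a = make_token(claims, KEY)
token_b = make_token({**claims, "jti": "id-0002"}, KEY)

# Same user, same permissions, different unique ID: the signature changes.
print(token_a.split(".")[2] != token_b.split(".")[2])
```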

Okay. This is what our information looks like, and this has a subtle but very high architectural impact change. I'll skip this part because we're basically running low. What happens now is when someone logs in, I pull a bunch of information about them from LDAP or our database or whatever. I put it in a JSON map then I format it and I sign the JSON map with my private key. And then I put a little pointer in a database, which we will talk about later and that's for revocation purposes. And then I will send the token back to the client. 

From that moment, the server side holds a pointer and the client side holds state. Before, the server side held state and the client side held a pointer. The difference between these is the difference between an HTTP session ID and an HTTP cookie.

Okay. How does this look architecturally? I have all these mobile devices and they're now holding a lot of information about themselves in the mobile device aka cookie, but it's a digitally signed cookie. Every time they make a request, they send it across. And, of course, the way to get this new JSON web token style of access token is exactly the same. I take my username and password, I post it over to the token endpoint and I get back a much bigger looking access token and a much bigger refresh token. Both the refresh token and access token are going to be JSON web tokens.

Okay. When I send my messages, I send this bigger string. It's a lot bigger than before, but it's more effective on the network, it's more effective on our architecture. Here is what we have now. Let's walk over our pain points that we had before. Now, we still have 1,000 people logging in with a username and password once a day. When we're generating refresh tokens, the digitally signed access token is going to be sent on every request. Do we take that token and look it up in a database before we've even bothered to check the signature on it? No, we check the signature. If the signature is valid, we can go ahead and, if we need to, check to see if the ID is there. But we're going to say for the point of our architecture, we're not going to do a revocation check on each access token usage, we're only going to do it on the refresh token.

And we're going to say further that we're refreshing once a half-hour. We have a very aggressive window that we're expiring these access tokens. Architecturally, that means that I can do 3,000 transactions per second on my front door, never hitting any store at all in local memory using just the public key that's associated with the gateway. 
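A minimal sketch of that front-door check, assuming the same simplified token layout as a standard JWT and, again, HMAC-SHA256 standing in for the RSA public-key verification the talk describes. The function name `verify_at_gateway` and the key are hypothetical. What matters is what the function does not do: it never touches a database or the network, only the key held in memory.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(claims: dict, key: bytes) -> str:
    """Issuer side, included only so the sketch has a token to verify."""
    header = b64url_encode(b'{"alg":"HS256","typ":"JWT"}')
    payload = b64url_encode(json.dumps(claims).encode("utf-8"))
    signing_input = header + "." + payload
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(mac)


def verify_at_gateway(token: str, key: bytes, now: float):
    """Return the claims if signature and expiry hold, else None.

    Uses only CPU and the in-memory key; no lookup in any store.
    """
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = (header_b64 + "." + payload_b64).encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(b64url_decode(signature_b64), expected):
            return None
        claims = json.loads(b64url_decode(payload_b64))
    except ValueError:  # malformed token: wrong part count, bad Base64, bad JSON
        return None
    if claims.get("exp", 0) <= now:
        return None
    return claims


KEY = b"gateway-demo-key"  # hypothetical
token = sign_token({"sub": "joe", "exp": 2000}, KEY)
print(verify_at_gateway(token, KEY, now=1000))
```

Passing `now` in as a parameter keeps the sketch deterministic; a real gateway would read the clock.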

Now, here's a common question, how many public-private key pairs do I need to authenticate a million users? The answer is one. You only need one public-private key pair because the data is always different. Your website uses certs and you only need one cert for the whole website because the cert identifies the identity, the people, the relationship, your relationship to the world. You have one cert that's got a private counterpart and there's a public counterpart that everybody else has so they know when you speak it's you.

You only need one public-private key pair for all of the access tokens that you create. You can create millions and billions and trillions and eventually, maybe every two years, you should rotate your public-private key pair like you're supposed to rotate your certs every once in a while. But you can create lots and lots, and lots of tokens with one public-private key pair.

On that gateway, there is one public key that can verify all the incoming traffic without hitting a center store. It's a major architectural advantage. Our backend traffic now, we're going to say we're passing the JSON web token and they can also validate the JSON web token with the public key. We distribute the public key to all the micro-services in the backend, they're going to put it on say, the Docker image, something like that. And we only have to rotate this thing out every couple of years so it's not a very big architectural thing that we've done in terms of impact and rolling out and maintenance headaches.

And so, with one very stable piece of data, we can avoid all network traffic to a center store at the gateway and between our microservices. We have just achieved a stateless architecture. This is a very, very, very major thing. Now, the actual state checks, if we do that only on the refresh token, that's going to result in us doing it every half an hour. And each time they refresh, we check the database for the pointer. And if the pointer doesn't exist, we don't let them refresh.

At any moment, we can go to that thing and poke out the ID of the currently active refresh token and then they can't use it anymore. That's how we forcibly log somebody out and then they get prompted with the username and password screen. If that happens, we're now doing 0.55 transactions per second against our center point of failure, it's no longer a center point of failure.
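The pointer check on refresh can be sketched as follows. The class and method names are hypothetical, an in-memory dictionary stands in for the database, and issuing a fresh pointer on every refresh is an assumption of this sketch rather than something the talk spells out.

```python
import secrets


class RefreshRegistry:
    """In-memory stand-in for the database of refresh-token pointers."""

    def __init__(self) -> None:
        self._active = {}  # refresh token ID -> username

    def issue(self, username: str) -> str:
        """Record a pointer at login and hand back its ID."""
        token_id = secrets.token_hex(16)
        self._active[token_id] = username
        return token_id

    def refresh(self, token_id: str):
        """Swap a live pointer for a new one; None means log in again."""
        username = self._active.pop(token_id, None)
        if username is None:
            return None
        return self.issue(username)

    def revoke(self, token_id: str) -> None:
        """Poke out the pointer to force a logout at the next refresh."""
        self._active.pop(token_id, None)


registry = RefreshRegistry()
first = registry.issue("joe")
second = registry.refresh(first)   # normal half-hourly refresh
registry.revoke(second)            # an operator forces a logout
print(registry.refresh(second))    # None: back to the username and password screen
```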

Before, if we were going to validate all the tokens all the way to the back, it was 55% of our transactions. Now, it's 0.55, we have achieved an astronomically huge gain in architecture with this change, which is why it's really, really, really important to know the difference. One is an HTTP session, one is an HTTP cookie. A lot of times, people are seeing this as all the same and they go, "Oh, it's OAuth," and they go check, "I got the check box, I've earned the check box." Well, it's not the same. OAuth pre-JSON web tokens is a huge stateful system that grows heavier and heavier over time. OAuth after JSON web tokens is a stateless system that allows the identity of the user to be verified all the way to the very, very back of the enterprise.
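One reading that reproduces the 0.55 figure, assuming it comes from the 1,000 users each refreshing once per half hour against a front door handling 3,000 transactions per second:

```python
USERS = 1_000
REFRESH_INTERVAL_SECONDS = 30 * 60   # one refresh per half hour
FRONT_DOOR_TPS = 3_000               # token validations done in memory

# Store lookups now happen only on refresh.
refreshes_per_second = USERS / REFRESH_INTERVAL_SECONDS
share_of_traffic = refreshes_per_second / FRONT_DOOR_TPS

print(round(refreshes_per_second, 2))      # about 0.56 per second
print(round(share_of_traffic * 100, 3))    # percent of front-door traffic
```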

And so, what we should do now with these new capabilities that we have is, if microservice A says to microservice D, "Can I have all of Joe's salary information," and does not present the token that represents Joe proving Joe's on the other side, microservice D should say, "Not a chance. I don't care that you're one of us and you're in our same cluster or in our same four walls. And someone I know probably deployed you, but you're not Joe so you do not get Joe's information." Then what we should do is of course we should pass that JSON web token every hop.

We take it off the wire when we get a request, and when we make a request to the next web service, we put it in the outgoing message that we send. And in this manner, it propagates through all the different microservices and we go ahead and have the ability to verify that Joe is the actor on the other side of all of these microservice calls.
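The hop-by-hop propagation and microservice D's refusal can be sketched with plain dictionaries standing in for HTTP headers. The function names are hypothetical, and `verified_subject` is assumed to come from a token whose signature has already been checked.

```python
from typing import Optional


def bearer_token(headers: dict) -> Optional[str]:
    """Pull the token out of an 'Authorization: Bearer ...' header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        return None
    return token


def outgoing_headers(incoming: dict) -> dict:
    """Build headers for the next hop, carrying the caller's token along."""
    token = bearer_token(incoming)
    if token is None:
        raise PermissionError("no user token to propagate")
    return {"Authorization": "Bearer " + token}


def may_read_salary(verified_subject: Optional[str], record_owner: str) -> bool:
    """Microservice D's rule: being inside the cluster is not enough."""
    return verified_subject is not None and verified_subject == record_owner


incoming = {"Authorization": "Bearer abc.def.ghi"}
print(outgoing_headers(incoming))
print(may_read_salary("joe", "joe"))      # Joe's token, Joe's record
print(may_read_salary("joe", "susan"))    # Joe's token, Susan's record
print(may_read_salary(None, "joe"))       # no token presented at all
```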

And so, if something sensitive needs to be requested pertaining to Joe, we say yes. If they ask for Susan's information, we say, "No, this is Joe's token." Now, we get hacked again, Latveria comes back and they're at it again. Here's how this looks now. We were getting 3,000 valid transactions per second that are now token validations that can all be done with CPU and memory. We're not hitting any network or anything, we're not hitting any center point of failure. 

And now our people from Latveria are sending 6,000 transactions per second all of them containing fake access tokens and they're trying to guess their way in the front door just like they were trying to guess their way in with username and password. But they don't have the private key so they cannot make legitimate access tokens. They're just going to be trying to make them up and what we're going to be able to do is verify very quickly none of these pass the public key check.

When we go to verify the signature, we see that it doesn't work and we drop those 6,000 transactions per second right on the floor, no calling out to center or anything, no making a network hop. It all happens at the cost of just CPU time in our gateway. Making an RSA signature is this much effort and verifying an RSA signature is that much effort. The amount of CPU that you need to do a lot of verifications is nothing, it's tiny. These 6,000 transactions per second that are all invalid are not affecting the experience of these 3,000 people who are valid. There is no center point of failure being hit, there are no systems being starved of resources and everything is just fine. They can increase their attack as much as they want and we don't even care.

This is a very, very major thing. We've not only fixed the state problem, but we've fixed our denial-of-service problem as well. This is a pretty awesome thing. 

Now, we're going to look at the Amazon style of all this, which also involves signing. But the approach to Amazon is that it's actually signing the message itself. First of all, this specification is terrible, it's only 20 pages and it's easily understandable. I definitely recommend you don't read it. I think they did something wrong, it's supposed to have 20 million references to other specifications and it's got to be at least majorly incomplete in a couple of areas that you're forced to go read those specifications. And those specifications should have bigger goals than the one that pointed them there. You're basically more and more confused the more things you read.

I don't know what they did wrong, it's a real big failure. I apologize profusely. They don't even have any funny acronyms that are hard to pronounce, I don't know. 

Here's what a message that is using HTTP signatures looks like. Our goal is to say that the content length header, the host header, the date header and a virtual header called the request target, which points to the POST and that URI, are protected. And they're protected via HMAC-SHA256, which is basically symmetric signing. HMAC is a form of symmetric encryption and SHA-256 is our hashing. We've said how we're going to hash it, we've said how we're going to encrypt the hash: HMAC-SHA256. And that is the resulting signature when we use our key, which has the very fancy name, my key name.

When we use our key associated with my key name and we sign those four headers with that algorithm, we get that Base64 encoded version of the signature. How we create this is, as we've been talking about, in order to create a digital signature, you need a payload to hash and then you need to encrypt the hash.

What's our payload? Well, basically the logic for creating that payload is right there. We're going to take the content length header, we're going to take its name and its value. And then we're going to take the host header, its name and its value. We're going to put that after, separated by a new line, and then we're going to take the date header and we're going to put that after, separated by a new line. And then we're going to take that request target and we put that after, separated by a new line.

Effectively, what we have is the key and value of content length, host, date and request target all concatenated together in one big long string. And then we hash and sign that, and then that's what we put in our thing as a signature. That means if you look at this message, if someone tampers with the accept header, we don't see it. It's not been signed. If someone tampers with the date header, we see that. That has been signed. If someone tries to forge an HTTP message using our key, well, they won't be able to, because the only thing they could change is one of the headers that aren't signed. If they try to change one of the headers that are signed, we can catch them in the act.
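The construction of that string and its signature might look like the sketch below. The `name: value` line format, the lowercase request target and the layout of the signature parameters follow my reading of the HTTP Signatures draft and should be treated as assumptions; the header values, the key name `my-key-name` and the secret are made up. HMAC-SHA256 is a keyed hash, so hashing and keying happen in one step here.

```python
import base64
import hashlib
import hmac


def signing_string(method: str, path: str, headers: dict, signed_names: list) -> str:
    """Join 'name: value' lines, in the listed order, with newlines."""
    lines = []
    for name in signed_names:
        if name == "(request-target)":
            value = method.lower() + " " + path   # virtual header: verb and URI
        else:
            value = headers[name]
        lines.append(name + ": " + value)
    return "\n".join(lines)


def sign_request(method, path, headers, signed_names, key_id: str, secret: bytes) -> str:
    """Return the signature parameters a client would send with the request."""
    payload = signing_string(method, path, headers, signed_names).encode("utf-8")
    mac = hmac.new(secret, payload, hashlib.sha256).digest()
    return 'keyId="%s",algorithm="hmac-sha256",headers="%s",signature="%s"' % (
        key_id,
        " ".join(signed_names),
        base64.b64encode(mac).decode("ascii"),
    )


# Hypothetical request; header names are lowercased for the signing string.
REQUEST_HEADERS = {
    "content-length": "18",
    "host": "example.org",
    "date": "Tue, 07 Jun 2014 20:51:35 GMT",
    "accept": "*/*",
}
SIGNED_NAMES = ["content-length", "host", "date", "(request-target)"]
SECRET = b"hypothetical-api-secret"

header_value = sign_request("POST", "/foo", REQUEST_HEADERS, SIGNED_NAMES, "my-key-name", SECRET)
print(signing_string("POST", "/foo", REQUEST_HEADERS, SIGNED_NAMES))
print(header_value)
```

Because `accept` is not in the signed list, changing it leaves the signature untouched, while changing `date` produces a different one.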

How Amazon does this is basically you sign up, you go online and you get an API key, and you get an API secret. That key ID is your API key and the API secret is the key value you use to sign this message. When this goes across the wire to Amazon, they retrace your steps by taking these headers, they concatenate them together using the algorithm they know you have and then they sign it with what they know your key is. You said the key name right there, your API key ID is going to be that thing at the top. And if it doesn't match, then one of two things happened. Either you're not who you say you are, meaning you don't have the right key, or some part of the message has been changed in transit.
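The receiving side's "retrace your steps" check could be sketched like this. It is not Amazon's implementation: the secret store, the parameter parsing and the function names are all hypothetical, and the signing-string format is the same assumed one from the HTTP Signatures draft.

```python
import base64
import hashlib
import hmac
import re

# Hypothetical server-side store: API key ID -> API secret.
API_SECRETS = {"my-key-name": b"hypothetical-api-secret"}


def compute_signature(secret: bytes, method: str, path: str, headers: dict, names: list) -> str:
    """Rebuild the signing string and sign it with the given secret."""
    text = "\n".join(
        name + ": " + (method.lower() + " " + path if name == "(request-target)" else headers[name])
        for name in names
    )
    mac = hmac.new(secret, text.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(mac).decode("ascii")


def verify_request(method: str, path: str, headers: dict, signature_params: str) -> bool:
    """Look up the caller's secret by key ID and compare signatures."""
    fields = dict(re.findall(r'(\w+)="([^"]*)"', signature_params))
    secret = API_SECRETS.get(fields.get("keyId", ""))
    if secret is None or fields.get("algorithm") != "hmac-sha256":
        return False
    try:
        expected = compute_signature(secret, method, path, headers, fields.get("headers", "").split())
    except KeyError:  # a header listed as signed is missing from the request
        return False
    return hmac.compare_digest(expected.encode("utf-8"), fields.get("signature", "").encode("utf-8"))


headers = {"host": "example.org", "date": "Tue, 07 Jun 2014 20:51:35 GMT"}
names = ["host", "date", "(request-target)"]
client_signature = compute_signature(API_SECRETS["my-key-name"], "POST", "/foo", headers, names)
params = 'keyId="my-key-name",algorithm="hmac-sha256",headers="%s",signature="%s"' % (
    " ".join(names), client_signature)

print(verify_request("POST", "/foo", headers, params))
print(verify_request("POST", "/foo", dict(headers, host="evil.example"), params))
```

A mismatch does not say which of the two failure cases occurred, only that the request should not be processed.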

Either way, that's what they want to know because ultimately they're not going to process your request for 1,000 EC2 spot instances that are like C4 extra, extra triple large, which is coming out next month. Because they don't want to basically have a situation where you say, "Well, that was me. But I didn't ask for a thousand, I asked for a hundred." They want what's called non-repudiation, they want the ability to say that you signed off on everything you asked for. For them, it's not just about proving identity, it's about proving that what you said you wanted is exactly what you said and someone else didn't change it in transit.

Ultimately, Amazon's approach proves identity, proves the data that you sent is what you wanted. And by the way, this specification is so simple it says, if you wanted to make sure the body hasn't been tampered with, there's an RFC called the digest header and it has a way for you to digest the body and put the resulting digest as a header and then sign that too. It's really, really pragmatic and really, really simple. This was actually written by one of the people who was on the Amazon identity team who later left and worked for another company and wrote this nice wonderful bit of tech for us. This is very much the spirit of what Amazon uses to do all of their B2B stuff.
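A small sketch of the body-digest idea, assuming the `SHA-256=<Base64>` form of the Digest header value. The body and the list of signed header names are made up. Once the digest is a header, adding its name to the signed list is enough to cover the body.

```python
import base64
import hashlib


def digest_header(body: bytes) -> str:
    """Hash the body and format it as a Digest header value."""
    return "SHA-256=" + base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


body = b'{"hello": "world"}'
headers = {
    "host": "example.org",
    "content-length": str(len(body)),
    "digest": digest_header(body),
}

# Listing "digest" among the signed headers ties the body to the signature.
signed_names = ["(request-target)", "host", "content-length", "digest"]

print(headers["digest"])
print(digest_header(b'{"hello": "w0rld"}') == headers["digest"])  # tampered body
```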

Okay. I'm just going to skip to this part, it doesn't solve the back end situation where we're having a JSON web token passed from microservice to microservice. What we have is this new specification, it's called OAuth 2.0 Proof-of-Possession. And what we're doing in this is basically we're combining this concept of signed identity information and signed HTTP messages into one. When we generate our access token, we also generate a symmetric key. We put the name of the symmetric key in our access token. And when they login, here was our access token before, now here's our access token. It has the name of the key in it and they can't change the name of the key because then the token won't validate, everything in the token is protected.

They log in through their regular mechanism, they get back an access token, refresh token and a symmetric key. And then here's what the symmetric key looks like in JSON format. And then when they send their message, they both sign the message with the key that they got and they send the access token. The access token is no longer the thing that provides access, that's now the signature. And the access token is there to take the state off of our system so we still have a stateless system. With this scenario, we have basically the best of both worlds. We have a protected HTTP message and we have the ability to have a stateless backend with microservices that know who's on the other side 100% of the time. This is the nice little spec thing that you should check out if you want to take a picture.
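Putting the two ideas together might look like the sketch below. It is heavily simplified and every name in it is an assumption: the token is a two-part string rather than a real JWT, HMAC stands in for the RSA token signature, the `cnf`/`kid` claim layout follows my reading of the proof-of-possession key semantics for JWTs, and a dictionary stands in for however the verifying service obtains the symmetric key, which the talk does not cover.

```python
import base64
import hashlib
import hmac
import json
import secrets

SERVER_KEY = b"hypothetical-token-signing-key"
POP_KEYS = {}  # key name -> symmetric key (stand-in for key distribution)


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mac(key: bytes, text: str) -> str:
    return b64url(hmac.new(key, text.encode("utf-8"), hashlib.sha256).digest())


def same(a: str, b: str) -> bool:
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def issue(subject: str):
    """Login: return an access token naming a fresh key, plus the key itself."""
    key_name = secrets.token_hex(8)
    POP_KEYS[key_name] = secrets.token_bytes(32)
    body = b64url(json.dumps({"sub": subject, "cnf": {"kid": key_name}}).encode("utf-8"))
    return body + "." + mac(SERVER_KEY, body), POP_KEYS[key_name]


def accept(token: str, message: str, message_signature: str):
    """Return the subject only if the token AND the message signature hold."""
    body, _, token_signature = token.partition(".")
    if not same(mac(SERVER_KEY, body), token_signature):
        return None  # token was altered, including any change to the key name
    claims = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    pop_key = POP_KEYS.get(claims["cnf"]["kid"])
    if pop_key is None or not same(mac(pop_key, message), message_signature):
        return None  # caller holds the token but not the matching key
    return claims["sub"]


token, client_key = issue("joe")
message = "(request-target): post /salary"
print(accept(token, message, mac(client_key, message)))   # joe
print(accept(token, message, mac(b"thief", message)))     # None: stolen token alone is useless
```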

And, of course, all these slides will be online and available for you to see. And then, of course, we don't know all these things for no reason. We actually did write a security system ourselves, you can get the slides and you can actually check it out by going to that URL. All of the slides that we have are there. I don't know that we have time for questions, nope. If you want to ask questions, I will be just right back there. Thank you very much, and I hope you enjoyed the presentation.


David Blevins
Founder + CEO, Tomitribe