Hello everyone, Shawn Hernan here. I used to work on the SDL team, and I might have been a regular contributor to this space, but instead I joined the SQL Server security team. Ralph Hood, Microsoft SDL guru, asked me if I would contribute a post about “Life on the other side,” talking to what I’ve learned about the SDL from this new perspective — sort of the reverse of his recent post. I couldn’t turn down the opportunity.
First, let me say what I knew about the SDL going in: no policy can anticipate every situation; you have to make tradeoffs; the details matter; the big picture matters; you need tools; you need human insight; you need management support; and we’re never going to be perfect. All of the things you’ve read in this blog are true, and they really shouldn’t be controversial. Since joining SQL, I’ve learned a lot about SQL Server too, and what it means to ship a product – but that’s outside the scope of this blog. So instead, I’ll try to describe three real experiences that illustrate things that shouldn’t be controversial either, but aren’t usually covered under the rubric of security. They are crucial nonetheless.
Security is not the point, it’s the needs of the customer. It’s easy to believe that security is the point of producing a product. It’s not. We won’t produce an insecure product, but the primary driver for a product team is to produce a valuable, useful product. Yes, security is a big part of that, but security is not a goal in and of itself. For example, one of the areas of fierce competition in enterprise database products is performance, and we have to balance security with performance. One of the ways we do that is by verifying data we receive really well, but only when necessary. We define clear trust boundaries, and check the data thoroughly once on the way in, and then work very hard to enforce those trust boundaries.
I first encountered this in SQL when I helped review threat models for the database engine. The engine trusts that the data on the disk was written correctly by a trusted entity (with checksums to guard against random errors), and enforce that. Instead of a slavish adherence to the principle of total mediation or defense in depth, which, when taken to its extreme would say to “check everything, every time,” we are hard core about making the right checks, but only the right checks.
I will note that it is not an either/or choice between security and performance – it is possible to do both. Indeed, I would say that doing one without the other is pointless, but to get both 1) world class performance, and 2) world class security, you have to understand your data flows really well, and make detailed decisions.
Be polite, but don’t be afraid: Job interviews at Microsoft can be challenging. When I interviewed for this job, my final interview was with a very senior architect. The subject of integer overflows came up, and he asked me to describe the problems and solutions. So I started writing some code on the whiteboard. After about 10 minutes of describing my approach to integer overflows, he said to me, “What if I were to tell you that’s a really bad solution, and the interview is over?”
My heart sank.
But instead of rolling over, I said, “well, that’s a bad outcome, tell me why.” He proceeded to attack my solution on several grounds, including being unreadable and unmaintainable, and he proceeded to describe his solution to the problem. Now, this was a very serious, very senior technical architect, and I was in a high pressure, asymmetric situation. So, not willing to be intimidated, but unable to attack back, I pointed out several shortcomings of his solution, politely, but firmly. And we spent the next 40 minutes talking about various aspects of the problem, and me defending my solution, which I think was credible. I don’t know if he agreed with my solution or not, really, but I suspect it might have been a test to see if I would cave. Or maybe he thought it really was a bad solution, I don’t know. But I got the job.
As a security professional, you’re always going to be at a technical disadvantage when you’re reviewing another team’s components. They designed and implemented the system. You are an outsider, and it is absolutely impossible to understand the system to the degree as the people who built it. Nonetheless, you’ve got to find a way to ask hard, probing, impolite and sometimes even uninformed questions without being threatening or insulting, or undermining your own credibility.
Be polite, be firm, put your ego in a box, and ask questions until you understand.
“It should work” is not a good answer: We take the giblets problem very seriously, and managing giblets can be quite difficult at times. And in SQL, we have lots of giblets. We consume things from Windows, and Office, and Visual Studio, and others, and we provide giblets to other teams as well. In fact, we provide components that other teams use to build the giblets they provide to us – we consume our own giblets!
And as it happens, one of the components we use was updated recently. Even though it would get serviced through Microsoft Update, we want to ensure we have the latest and greatest version of any component we ship. But to consume the latest and greatest version of this particular component would require some small updates to either our installer or theirs. So we met with the team that owns the giblet in question to try to divvy up the work, and to avoid schedule disruptions on either side.
There was a lot of back and forth about various things to try, and we continued to refine a solution until we had reduced the problem to a single issue.
At this point, there was an air of hope in the room. If the idea actually worked, we had a solution at relatively low cost. But would it work? When the question of “will this work” comes up, all eyes turn towards test managers.
Our general manager was looking right at our test manager and she asked, “Will that work?” The test manager looked across the table at the development manager from the other group, and said, “I don’t know. That depends on their level of confidence in the behavior of their component under these conditions.”
Now, all eyes were starting at the dev manager, and the room got quiet. A somewhat sheepish look came over his face, because he knew the answer he was about to give would be unsatisfactory. He said, “Well, I’m not a tester, I’m just a developer, but it should work.”
At which point the room erupted into hysterical laughter.
“It should work” means “I think so, but we have to test it.” And that means the whole battery of tests for each of the affected components, across all of the supported platforms. And that has to be scheduled in test labs. To be clear, this wasn’t a lack of confidence in the developer, quite the contrary, he was laughing along with everyone else. We just know that writing software to satisfy all the scenarios in which our software is deployed requires far more testing than can reasonably be performed on a single desktop system.
So the tests were scheduled, the developer was proven correct, and we’re picking up the latest version. Even seemingly simple changes require a lot of testing.
So, that’s what I’ve learned: security isn’t the be-all-end-all,, things are really complex and hard to understand, and you don’t really know if anything works until you test it. None of which should be controversial, but none of the central ideas in the SDL are controversial either. The hard part is putting theory into practice, and recognizing that no venture is risk free, despite the natural inclination of security engineers to avoid any risk whatsoever. In this, I am reminded of one of my favorite books, “To Engineer is Human: The Role of Failure in Successful Design,” by Henry Petroski. He writes, “No one wants to learn by mistakes, but we cannot learn enough from successes to go beyond the state of the art. Contrary to their popular characterization as intellectual conservatives, engineers are really among the avant-garde. They are constantly seeking to employ new concepts [and are] constantly striving to do more with less.  The engineer always believes he is trying something without error, but the truth of the matter is the each new structure can be a new trial.  Such is the nature not only of science and engineering, but of all human endeavors.”