Read Along: The Chubby Lock Service for Loosely-Coupled Distributed Systems

This is a read along for Hacker School’s paper of the week series for The Chubby Lock Service for Loosely-Coupled Distributed Systems.

There’s so much I like about this paper! A lot of their decisions were made so that their service would be useful for humans, as opposed to for some technical reason. The entire reason they built a lock service instead of a library was so that it would be easy to use, and you can see that thinking reflected at multiple levels.

Hand Coded Assembly Beats Intrinsics in Speed and Simplicity

Every once in a while, I hear how intrinsics have improved enough that it’s safe to use them for high performance code. That would be nice. The promise of intrinsics is that you can write optimized code by calling out to functions (intrinsics) that correspond to particular assembly instructions. Since intrinsics act like normal functions, they can be cross-platform. And since your compiler has access to more computational power than your brain, as well as a detailed model of every CPU, the compiler should be able to do a better job of micro-optimizations. Despite decade-old claims that intrinsics can make your life easier, it never seems to work out.

The last time I tried intrinsics was around 2007; for more on why they were hopeless then see this exploration by the author of VirtualDub. I gave them another shot recently, and while they’ve improved, they’re still not worth the effort. The problem is that intrinsics are so unreliable that you have to manually check the result on every platform and every compiler you expect your code to be run on, and then tweak the intrinsics until you get a reasonable result. That’s more work than just writing the assembly by hand. If you don’t check the results by hand, it’s easy to get bad results.

Automated Bug Detection With Analytics

I can’t remember the last time I went a whole day without running into a software bug. For weeks, I couldn’t invite anyone to Facebook events due to a bug that caused the invite button to not display on the invite screen. Google Maps has been giving me illegal and sometimes impossible directions ever since I moved to a small city. And Google Docs regularly hangs when I paste an image in, giving me a busy icon until I delete the image.

It’s understandable that bugs escape testing. Testing is hard. Integration testing is very hard. End to end testing is extremely hard. But there’s an easier way. A third of bugs like this – bugs I run into daily – could be found automatically using analytics.
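As a rough illustration of the idea (not the method the post goes on to describe), here is a minimal sketch that flags UI elements whose click counts drop to zero even though they used to be clicked regularly, which is what a missing invite button would look like in the logs. The function name, the element names, the counts, and the `min_baseline` threshold are all made up for the example.

```python
from statistics import mean


def find_silent_elements(history, today, min_baseline=50.0):
    """Return UI elements that normally get clicks but got none today.

    history: dict mapping element name -> list of past daily click counts
    today:   dict mapping element name -> today's click count
    min_baseline is an arbitrary, hypothetical threshold that filters out
    elements that were rarely used to begin with.
    """
    suspects = []
    for name, past_counts in history.items():
        if not past_counts:
            continue  # no baseline to compare against
        baseline = mean(past_counts)
        if baseline >= min_baseline and today.get(name, 0) == 0:
            suspects.append(name)
    return sorted(suspects)


# Hypothetical data: the invite button suddenly stops receiving clicks.
history = {"invite_button": [1200, 1150, 1300], "help_link": [3, 0, 2]}
today = {"invite_button": 0, "help_link": 0}
suspects = find_silent_elements(history, today)
print(suspects)
```

A real system would need something better than a fixed threshold, but even a check this crude separates a button nobody can see from a link nobody ever used.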

Editing Binaries: Easier Than It Sounds

Editing binaries is a trick that comes in handy a few times a year. You don’t often need to, but when you do, there’s no alternative. When I mention patching binaries, I get one of two reactions: complete shock or no reaction at all. As far as I can tell, this is because most people have one of these two models of the world:

  1. There exists source code. Compilers do something to source code to make it runnable. If you change the source code, different things happen.

  2. There exists a processor. The processor takes some bits and decodes them to make things happen. If you change the bits, different things happen.

If you have the first view, breaking out a hex editor to modify a program is the action of a deranged lunatic. If you have the second view, editing binaries is the most natural thing in the world. Why wouldn’t you just edit the binary? It’s often the easiest way to get what you need.
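To make the second view concrete, here is a minimal sketch of a patch that flips a single x86 conditional jump from `je` (opcode `0x74`) to `jne` (opcode `0x75`). The helper name, the five-byte code fragment, and the offset are invented for illustration; in a real binary you would find the offset with a disassembler first.

```python
def patch_bytes(image: bytes, offset: int, expected: bytes, replacement: bytes) -> bytes:
    """Return a copy of image with `expected` at `offset` replaced.

    Refuses to patch if the bytes at `offset` are not what we expect,
    or if the patch would change the size of the binary.
    """
    if len(expected) != len(replacement):
        raise ValueError("patch must not change the size of the binary")
    found = image[offset:offset + len(expected)]
    if found != expected:
        raise ValueError(f"unexpected bytes at {offset:#x}: {found.hex()}")
    patched = bytearray(image)
    patched[offset:offset + len(replacement)] = replacement
    return bytes(patched)


# Hypothetical fragment: 83 F8 00 = cmp eax, 0 ; 74 05 = je +5
code = bytes.fromhex("83f8007405")
# Invert the branch: je (0x74) becomes jne (0x75).
flipped = patch_bytes(code, 3, b"\x74", b"\x75")
print(flipped.hex())
```

Checking the expected bytes before writing is the important part: it keeps you from patching the wrong offset or the wrong build of the program.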

There Is a Gender Gap in Tech Salaries

Last week, a journalist wrote a post titled “There is no gender gap in tech salaries” for Quartz, the Atlantic Media’s attempt at “a digitally native news outlet … for business people in the new global economy”. That resulted in linkbait copycat posts all over the internet, from obscure livejournals to Smithsonian.com. The claims are awfully strong, considering that the main study cited only looked at people who graduated with a B.S. exactly one year ago, not to mention the fact that the study makes literally the opposite claim.

Let’s look at the evidence from the AAUW study that all these posts cite.

Anonymous Benchmark Markets

It’s 1982. RSI decides to focus on its main product, Oracle Database; they rename themselves as Oracle. Meanwhile, at the University of Wisconsin, Dina Bitton, David DeWitt, and Carolyn Turbyfill create a database benchmarking framework. Oracle does not fare well.

Larry Ellison tries to have DeWitt fired. When that doesn’t work, he bans Oracle from hiring Wisconsin grads. Soon afterwards, every major commercial database vendor (other than IBM) adds a license clause that makes benchmarking their database illegal, without special permission.

Why Don’t Schools Teach Debugging, or, Why Are We Proud of How Fast the STEM Education Pipeline Is Leaking?

In the fall of 2000, I took my first engineering class: ECE 352, an entry-level digital design class for first-year computer engineers. It was standing room only, filled with waitlisted students who would find seats later in the semester as people dropped out. We had been warned in orientation that half of us wouldn’t survive the year. In class, we were warned again that half of us were doomed to fail, and that ECE 352 was the weedout class that would be responsible for much of the damage.

The class moved briskly. The first lecture wasted little time on matters of the syllabus, quickly diving into the real course material. Subsequent lectures built on previous lectures; anyone who couldn’t grasp one had no chance at the next. Projects began after two weeks, and also built upon their predecessors; anyone who didn’t finish one had no hope of doing the next.

How Much Math Do You Really Need for Software Development?

Dear David,

I’m afraid my off-the-cuff response the other day wasn’t too well thought out; when you talked about taking calc III and linear algebra, and getting resistance from one of your friends because “wolfram alpha can do all of that now,” my first reaction was horror – which is why I replied that while I’ve often regretted not taking a class seriously because I’ve later found myself in a situation where I could have put the skills to good use, I’ve never said to myself “what a waste of time it was to learn that fundamental mathematical concept and use it enough that I truly understand it.”

PCA Is Not a Panacea

Earlier this year, I interviewed with a well-known tech startup, one of the hundreds of companies that claim to have harder interviews, more challenging work, and smarter employees than Google. My first interviewer, John, gave me the standard tour: micro-kitchen stocked with a combination of healthy snacks and candy; white male 20-somethings gathered around a foosball table; bright spaces with cutesy themes; a giant TV set up for video games; and the restroom. Finally, he showed me a closet-sized conference room and we got down to business.

After the usual data structures and algorithms song and dance, we moved on to the main question: how would you design a classification system for foo? We had a discussion about design tradeoffs, but the key disagreement was about the algorithm. I said, if I had to code something up in an interview, I’d use a naïve matrix factorization algorithm, but that I didn’t expect that I would get great results because not everything can be decomposed easily. John disagreed – he was adamant that PCA was the solution for any classification problem.