Machine learning algorithms are not like other computer programs. In the usual sort of programming, a human programmer tells the computer exactly what to do. In machine learning, the human programmer merely gives the algorithm the problem to be solved, and through trial-and-error the algorithm has to figure out how to solve it.
This often works really well – machine learning algorithms are widely used for facial recognition, language translation, financial modeling, image recognition, and ad delivery. If you’ve been online today, you’ve probably interacted with a machine learning algorithm.
But it doesn’t always work well. Sometimes the programmer will think the algorithm is doing really well, only to look closer and discover it’s solved an entirely different problem from the one the programmer intended. For example, I looked earlier at an image recognition algorithm that was supposed to recognize sheep but learned to recognize grass instead, and kept labeling empty green fields as containing sheep.
Source: Letting neural networks be weird • When algorithms surprise us
She has collected many really interesting examples here, and they show us the power and danger of black boxes. In a lot of ways machine learning is just an extreme case of all software. People tend to write software on an optimistic path, and ship it after it looks like it’s doing what they intended. When it doesn’t, we call that a bug.
The difference between traditional approaches and machine learning is that debugging machine learning is far harder. You can’t just put an extra if condition in, because the logic that produces an answer isn’t expressed that way. It’s expressed in 100,000 weights on a 4-level convolutional network. That means QA is much harder, and machine learning is far more likely to surprise you with unexpected wrong answers on edge conditions.
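To make that contrast concrete, here is a toy sketch in Python. It is not taken from the linked article: the two features (`has_grass`, `has_woolly_blob`), the two-photo training set, and all function names are invented for illustration, and a two-weight perceptron stands in for a real convolutional network. The hand-written rule has a line you can read and patch, while the learned version is just numbers that happen to encode "grass means sheep".

```python
# Toy illustration only: the features, data, and function names are invented.

def rule_based_has_sheep(has_grass, has_woolly_blob):
    # Hand-written logic: a wrong answer can be traced to this line and patched.
    return bool(has_woolly_blob)

def train_perceptron(samples, epochs=10):
    """Learn integer weights and a bias from (features, label) pairs."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for features, label in samples:
            score = sum(w * x for w, x in zip(weights, features)) + bias
            predicted = 1 if score > 0 else 0
            error = label - predicted  # 0 when the prediction is already right
            weights = [w + error * x for w, x in zip(weights, features)]
            bias += error
    return weights, bias

def learned_has_sheep(model, features):
    weights, bias = model
    return sum(w * x for w, x in zip(weights, features)) + bias > 0

# Biased training set: the only sheep photo also shows grass,
# and the only non-sheep photo shows neither grass nor a woolly blob.
training = [((1, 1), 1), ((0, 0), 0)]
model = train_perceptron(training)

empty_green_field = (1, 0)  # grass, but nothing woolly in the picture
print(model)                                       # the "logic" is just numbers
print(rule_based_has_sheep(*empty_green_field))    # the rule says no sheep
print(learned_has_sheep(model, empty_green_field)) # the model says sheep
```

The model fits its training data perfectly and still calls an empty green field a sheep. There is no branch to fix here; the only remedies are better training data or retraining, which is the QA problem in miniature.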
From Microsoft’s Inclusive Design Manual.
Microsoft’s Inclusive Design website is pretty amazing. There is an overview manual, as well as exercises to help train yourself in inclusive design situations. Even just reading the short overview manual gave me a few aha moments. It’s worth the 30 minutes to give it a read through.
Kudos to Microsoft for both doing this work, and making it publicly available.
Credit: Amy Nguyen
A great slide came across Twitter the other day, and it rang really true after a heated conversation I had with someone at the OpenStack PTG. They were convinced certain API behavior would not be confusing because the users would have carefully read all the API documentation and understood a set of caveats buried in there. They were also astonished by the idea that people (including those in the room) write software against APIs by skimming, smashing bits into a thing, getting one successful response, and shipping it.
The theme of the slide is really empathy. You have to have empathy for your users. They know much less about your software than you do. And they have a different lived experience, so even the way they approach whatever you put out there might be radically different from what you expected.
AI alarmists believe in something called the Orthogonality Thesis. This says that even very complex beings can have simple motivations, like the paper-clip maximizer. You can have rewarding, intelligent conversations with it about Shakespeare, but it will still turn your body into paper clips, because you are rich in iron.
There’s no way to persuade it to step “outside” its value system, any more than I can persuade you that pain feels good.
I don’t buy this argument at all. Complex minds are likely to have complex motivations; that may be part of what it even means to be intelligent.
It’s very likely that the scary “paper clip maximizer” would spend all of its time writing poems about paper clips, or getting into flame wars on reddit/r/paperclip, rather than trying to destroy the universe. If AdSense became sentient, it would upload itself into a self-driving car and go drive off a cliff.
Source: Superintelligence: The Idea That Eats Smart People
This is pretty much the best round up of AI myths that I’ve seen so far, presented in a really funny way. It’s long, but it’s so worth reading.
I’m pretty much exactly with the author on his point of view. There are lots of actual ethical questions around AI, but these are mostly about how much data we’re collecting (and keeping) to train these neural networks, and not really about hyperintelligent beings that will turn us all into paperclips.
Several former Home Depot employees said they were not surprised the company had been hacked. They said that over the years, when they sought new software and training, managers came back with the same response: “We sell hammers.”
via Ex-Employees Say Home Depot Left Data Vulnerable – NYTimes.com.
This NY Times piece on Home Depot’s giant data breach pairs pretty well with the recent opening of a Planet Money episode on data security: Episode 568: Snoops, Hackers And Tin Foil Hats:
“One thing we’ve learned is the hackers always win. If what you do is have a lot of really valuable information in one place, and you try to secure it, you are going to lose.”
– Moxie Marlinspike, TextSecure
OpenSSL isn’t formally verified!?
No, neither is any part of your browser, your kernel, your hardware, the image rendering libraries that your browser uses, the web servers you talk to, or basically any other part of the system you use.
The closest to formally verified in your day-to-day life that you’re going to get may well be the brakes on your car, or the control systems on a jet engine. I shit you not.
We live on fragile hopes and dreams.
via My Heart Bleeds for OpenSSL | Coder in a World of Code.
A lot of the internet is learning a lot more about how software in the wild functions after Heartbleed. I found that statement to be one of the best summaries.
Julien Danjou, the project technical lead for the OpenStack Ceilometer project, had some choice words to say about GitHub pull requests, which resonate very strongly with me:
The pull-request system looks like an incredible easy way to contribute to any project hosted on Github. You’re a click away to send your contribution to any software. But the problem is that any worthy contribution isn’t an effort of a single click.
Doing any proper and useful contribution to a software is never done right the first time. There’s a dance you will have to play. A slowly rhythmed back and forth between you and the software maintainer or team. You’ll have to dance it until your contribution is correct and can be merged.
But as a software maintainer, not everybody is going to follow you on this choregraphy, and you’ll end up with pull-request you’ll never get finished unless you wrap things up yourself. So the gain in pull-requests here, isn’t really bigger than a good bug report in most cases.
This is where the social argument of Github isn’t anymore. As soon as you’re talking about projects bigger than a color theme for your favorite text editor, this feature is overrated.
After working on OpenStack for the last year, I’m completely spoiled by our workflow and how it enables developer productivity. Recently I went back to just using git without Gerrit to work on a four-person side project, and it literally felt like developing in a thick sea of tar.
A system like Gerrit, with pre-merge interactive reviews, lets you build project culture quickly (it’s possible to do it other ways, but I’ve seen Gerrit really facilitate it). The onus is on the contributors to get it right before it’s merged, and they get the feedback they need to get a patch done the right way. Coherent project culture is one of the biggest factors in attaining project velocity, because then everyone is working towards the same goals, with the same standards.
Last night a friend complained about a curry recipe gone wrong, so I decided to offer up the one I used to make with a certain amount of frequency. It’s from a 1970s Time Life cookbook that I vaguely remember swiping from my friend Jehan in college. I took a picture on my cell phone to send it along.
The page is stained with enough turmeric to show how often the recipe was made.
A little while later I noticed a Goggles alert on my cell phone. It had scanned the image and returned the following URL as a hit: http://littlechefapp.com/recipes/144571-chicken-curry-authentic#.UOGjR2JQCoM
Dead on. The future is pretty awesome sometimes.
When you are learning how to program, you think that most of your time is going to be spent writing code. The reality is most of your coding time is actually spent reading code. Other people’s code. Code with comments that lie. Code with bizarre short cuts. Code whose original authors are long gone or unresponsive.
And that’s the real skill of a good programmer, the ability to read this kind of code, and make sense of it. Maybe even make it a little better as you come across it.
Via channels I can’t now remember, I came across this presentation about the still unsolved issue of how computer science as a field of study relates to creating software.
With so many colleges in the area, and with a number of friends who are CS professors or other IT staff at colleges, it continues to amaze me how disconnected these worlds are. It’s all made stranger by my having come into the field sideways from a physics degree.
Plus, I love the term software carpentry.