Python functions on OpenWhisk

Part of the wonderful time I had at North Bay Python was getting to represent IBM on stage for a few minutes as part of our sponsorship of the conference. In those few minutes I showed some Python functions running in OpenWhisk on IBM’s Cloud Functions service.

A little bit about OpenWhisk

OpenWhisk is an Apache Software Foundation open source project for building a serverless / function-as-a-service environment. It uses Docker containers as the foundation: it spins up either a predefined container or a custom named one, runs the function to completion, then exits. It was started before Kubernetes, so it has its own Docker orchestration built in.

In addition to the runtime, it also has pretty solid logging and interactive editing through the web UI. This becomes critical when you do anything more than trivial with cloud functions, because the execution environment looks very different from your laptop.

What are Cloud Functions good for?

Cloud Functions are really good when you have code that you want to run after some event has occurred, and you don’t want to maintain a daemon sitting around polling or waiting for that event. A good concrete instance of this is GitHub webhooks.

If you have a repository where you’d like some things to happen automatically on a new issue or PR, doing it with Cloud Functions means you don’t need to maintain a full system just to run a small bit of code on these events.

They can also be used kind of like a web cron, so that you don’t need a full VM running if there is just something you want to fire off once a week to do 30 seconds of work.

GitHub Helpers

I wrote a few example uses of this for my open source work. Because my default mode for writing source code is open source, I have quite a few open source repositories on GitHub. They are all under very low levels of maintenance. That’s a thing I know, but others don’t. So instead of having issues and pull requests just sit in the void for a month, I thought it would be nice to auto respond to folks (especially new folks) and tell them the state of the world.

#
#
# main() will be invoked when you Run This Action
#
# @param Cloud Functions actions accept a single parameter, which must be a JSON object.
#
# @return The output of this action, which must be a JSON object.
#
#

import github
from openwhisk import openwhisk as ow


def thank_you(params):
    # Load the GitHub access token via the python-openwhisk helper.
    p = ow.params_from_pkg(params["github_creds"])
    g = github.Github(p["accessToken"], per_page=100)

    repo = g.get_repo(params["repository"]["full_name"])
    name = params["sender"]["login"]
    # Issues in this repository created by the sender of this event.
    user_issues = repo.get_issues(creator=name)
    num_issues = len(list(user_issues))

    # The issue that triggered the webhook; the comment is posted on it.
    issue = repo.get_issue(params["issue"]["number"])

    if num_issues < 3:
        comment = """
I really appreciate finding out how people are using this software in
the wide world, and people taking the time to report issues when they
find them.
I only get a chance to work on this project on the weekends, so please
be patient as it takes time to get around to looking into the issues
in depth.
"""
    else:
        comment = """
Thanks very much for reporting an issue. Always excited to see
returning contributors with %d issues created. This is a spare time
project so I only tend to get around to things on the weekends. Please
be patient for me getting a chance to look into this.
""" % num_issues

    issue.create_comment(comment)


def main(params):
    action = params["action"]
    # Only respond when an issue is first opened; skip every other action.
    if action == "opened":
        thank_you(params)
        return {'message': 'Success'}
    return {'message': 'Skipped invocation for %s' % action}

Pretty basic: it responds within a second or two of folks posting an issue, telling them what’s up. While you can do a lightweight version of this with GitHub’s native templates, using a cloud functions platform lets you be more specific to individuals based on their previous contribution rates. You can also see how you might extend it to do different things based on the content of the PR itself.
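As a rough sketch of that kind of extension, the function below picks a reply category from the webhook payload. The categories and keyword rules are made up for illustration, and it assumes the payload carries either a "pull_request" or an "issue" object with "title" and "body" fields.

```python
def pick_reply(params):
    """Choose a reply category from a webhook payload (hypothetical rules)."""
    # Pull request events carry "pull_request"; issue events carry "issue".
    item = params.get("pull_request") or params.get("issue") or {}
    title = (item.get("title") or "").lower()
    body = (item.get("body") or "").lower()  # the body may be null

    if "pull_request" in params:
        # Illustrative rule: treat doc-looking titles differently from code.
        if "doc" in title or "readme" in title:
            return "docs"
        return "code"
    if "traceback" in body:
        return "bug_with_traceback"
    return "general"
```

The returned category could then be used to select one of several comment templates before calling create_comment, in the same way the example above selects between two.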

Using a Custom Docker Image

IBM’s Cloud Functions provides a set of Docker images for different programming languages (JavaScript, Java, Go, Python2, Python3). In my case I needed more content than was available in the Python3 base image.

The entire system runs on Docker images, so extending those is straightforward. Here is the Dockerfile I used to do that:

# Dockerfile for example whisk docker action
FROM openwhisk/python3action

# add package build dependencies
RUN apk add --no-cache git

RUN pip install pygithub

RUN pip install git+https://github.com/sdague/python-openwhisk.git

This builds on the base image and installs 2 additional Python libraries: pygithub, to make GitHub API access (especially paging) easier, and a utility library I put up on GitHub to keep from repeating code to interact with the OpenWhisk environment.

When you create your actions in Cloud Functions, you just have to specify the Docker image instead of a language environment.

Weekly Emails

My spare time open source work mostly ends up falling between the hours of 6 – 8am on Saturdays and Sundays, when I’m awake before the rest of the family. One of the biggest problems is figuring out what I should look at then, because if I spend an hour figuring that out, there isn’t much time left to do anything that requires code. So I set up 2 weekly emails to myself using Cloud Functions.

The first email looks at all the projects I own, and provides a list of all the open issues & PRs for them. These are issues coming in from other folks, that I should probably respond to, or make some progress on. Even just tackling one a week would get me to a zero issue space by the middle of spring. That’s one of my 2018 goals.
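A minimal sketch of how such a digest could be assembled is below. It only covers turning a list of open items into an email body; fetching them from the API and sending the mail are left out. The dictionary keys (repo, number, title, is_pr) are hypothetical and not taken from the actual code.

```python
from collections import defaultdict


def format_digest(items):
    """Group open issues and PRs by repository into a plain-text email body.

    `items` is a list of dicts with hypothetical keys:
    repo, number, title, is_pr.
    """
    by_repo = defaultdict(list)
    for item in items:
        by_repo[item["repo"]].append(item)

    lines = []
    for repo in sorted(by_repo):
        lines.append("%s (%d open)" % (repo, len(by_repo[repo])))
        # Lowest number first, so the longest-waiting items lead each section.
        for item in sorted(by_repo[repo], key=lambda i: i["number"]):
            kind = "PR" if item["is_pr"] else "issue"
            lines.append("  #%d [%s] %s" % (item["number"], kind, item["title"]))
        lines.append("")  # blank line between repositories
    return "\n".join(lines).rstrip()
```

Sorting by issue number is a simple stand-in for "oldest first", which fits the goal of tackling one item a week.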

The second does a keyword search on Home Assistant’s issue tracker for components I wrote, or that I run in my house and am pretty familiar with. Those are issues that I can probably meaningfully contribute to. Home Assistant is a big enough project now that, as a part-time contributor, finding a narrower slice is important to getting anything done.
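One way the keyword search could be put together is sketched below: build one GitHub search query per keyword, then merge the results so an issue matching several keywords is listed once. The repository name and keywords are placeholders passed in by the caller, and the result dicts with a "number" key are an assumption for illustration.

```python
def build_queries(repo, keywords):
    """Return one GitHub issue-search query string per keyword."""
    queries = []
    for keyword in keywords:
        # Quote multi-word keywords so they are searched as a phrase.
        term = '"%s"' % keyword if " " in keyword else keyword
        queries.append("repo:%s is:issue is:open %s" % (repo, term))
    return queries


def merge_results(result_lists):
    """Merge per-keyword results, keeping each issue number only once."""
    seen = {}
    for results in result_lists:
        for issue in results:
            seen.setdefault(issue["number"], issue)
    return [seen[number] for number in sorted(seen)]
```

With pygithub, each query string could be passed to the client's search_issues method; that call is not shown here because it needs network access and credentials.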

Those show up in my inbox at 5am on Saturday, so they will be at the top of my email when I wake up, and a good reminder to have a look.

The Unknown Unknowns

This was my first dive down the function-as-a-service rabbit hole, and it was a very educational one. The biggest challenge I had was getting into a workflow of iterative development. The execution environment here is pretty specialized, including a bunch of environmental setup.
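One thing that can shorten the loop, sketched here as an assumption rather than what the repository actually does, is keeping the routing logic separate from the GitHub calls so it can be exercised on a laptop with a fake payload. The dispatch function and its on_opened parameter are hypothetical names.

```python
def dispatch(params, on_opened):
    """Route a webhook payload; `on_opened` does the real work.

    Passing the handler in makes it possible to swap the GitHub call
    for a stub when running outside the cloud environment.
    """
    action = params.get("action")
    if action == "opened":
        on_opened(params)
        return {"message": "Success"}
    return {"message": "Skipped invocation for %s" % action}


# Local check with a fake payload and a stub in place of the GitHub call.
calls = []
fake_payload = {"action": "opened", "issue": {"number": 1}}
result = dispatch(fake_payload, on_opened=calls.append)
```

In the deployed action, main(params) would call dispatch(params, on_opened=thank_you), so only the thin GitHub-facing part needs the real environment.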

I did not realize how truly valuable a robust web IDE and a detailed log server are in these environments. As someone who would typically just run a VM and put some code under cron, or run a daemon, I’m used to keeping all my normal tools. But the trade-off of getting rid of a server that you need to keep patched is sometimes worth it. I think that as we see a lot of new entrants into the function-as-a-service space, that is going to be what makes or breaks them: how good their tooling is for interactive debugging and iterative development.

Replicate and Extend

I’ve got a pretty detailed write-up in the README for how all this works, and how you would replicate this yourself. Pull requests are welcome, as are discussions of related things you might be doing.

This is code that I’ll continue to run to make my GitHub experience better. The pricing on IBM’s Cloud Functions means that this kind of basic usage works fine at the free tier.