Tab Dumping with AppleScript and back to Python


Goal: Iterate through all my (OS X Safari) browser windows and make a list of titles and URLs, which is then placed on the clipboard ready to be pasted into an email or blog post.

This is an update to Tab Dumping in Safari. That still works well as the basis for extending any Cocoa-based application at runtime, but it relies on SIMBL, which, great bit of code though it is, essentially abuses the InputManager interface. Some developers and users shun such hacks, and at least one Apple application checks for them at startup and warns you against using them.

I have been running the WebKit nightlies, which are like Safari, but with newer code and features (most importantly to me right now, a Firebug-like developer toolkit). WebKit warns at startup that if you’re running extensions (such as SIMBL plugins) it may make the application less stable. I was running both Saft and my own tab dumping plugin, and WebKit was crashing a lot. So I removed those and the crashes went away. I miss a handful of the Saft extensions (but not having to update it for every Safari point release), and I found I really miss my little tab-dumping tool.

I toyed with the idea of rewriting it as a service, which would then be available from the Services menu, but couldn’t figure out how to access the application’s windows and tabs from the service. So I tried looking at Safari’s scripting dictionary, using the AppleScript Script Editor. Long ago, John Gruber had written about the frustration with Safari’s tabs not being scriptable, but a glance at the scripting dictionary showed me this was no longer the case (and probably hasn’t been for years; I haven’t kept track).

I am a complete n00b at AppleScript. I find the attempt at English-like syntax just confuses (and irritates) me no end. But what I wanted looked achievable with it, so I armed myself with some examples from Google searches, and Apple’s intro pages and managed to get what I wanted working. It may not be the best possible solution (in fact I suspect the string concatenation may be one of the most pessimal methods), but it Works For Me™.

In Script Editor, paste in the following:

set url_list to ""
-- change WebKit to Safari if you are not running nightlies
tell application "WebKit"
  set window_list to windows
  repeat with w in window_list
    try
      set tab_list to tabs of w
      repeat with t in tab_list
        set url_list to url_list & name of t & "\n"
        set url_list to url_list & URL of t & "\n\n"
      end repeat
    on error
      -- not all windows have tabs
    end try
  end repeat
  set the clipboard to url_list
end tell

I had to use AppleScript Utility to add the script menu to my menu bar. From there it was easy to create script folders that are specific to both WebKit and Safari and save a copy of the script (with the appropriate substitution, see comment in script) into each folder. Now I can copy the title and URL of all my open tabs onto the clipboard easily again, without any InputManager hacks.

I had some recollection that there was a way to do this from Python, so I looked and found Appscript. I was able to install this with a simple easy_install appscript and quickly ported most of the AppleScript to Python. The only stumbling block was that I couldn’t find a way to access the clipboard with appscript, and I didn’t want to have to pull in the PyObjC framework just to write to the clipboard. So I used subprocess to call the command-line pbcopy utility.

#!/usr/local/bin/python
from appscript import app
import subprocess

tab_summaries = []
# change WebKit to Safari if you are not running nightlies
for window in app('WebKit').windows.get():
    try:
        for tab in window.tabs.get():
            # keep titles and URLs as text; encode once when writing
            name = tab.name.get()
            url = tab.URL.get()
            tab_summaries.append(u'%s\n%s' % (name, url))
    except Exception:
        # not all windows have tabs
        pass

clipboard = subprocess.Popen(['pbcopy'], stdin=subprocess.PIPE)
# communicate() sends the data, closes stdin, and waits for pbcopy to exit
clipboard.communicate(u'\n\n'.join(tab_summaries).encode('utf-8'))
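
The formatting step can also be kept separate from the clipboard step, which makes it easy to check without a browser running. The sketch below is not part of the original script: format_tabs and copy_to_clipboard are hypothetical helper names, the tab data is a plain list of (title, URL) pairs rather than live appscript objects, and it assumes Python 3. pbcopy is only touched if copy_to_clipboard is actually called.

```python
import subprocess


def format_tabs(tabs):
    """Return a title line and a URL line per tab, with blank lines between tabs."""
    return '\n\n'.join('%s\n%s' % (title, url) for title, url in tabs)


def copy_to_clipboard(text, command=('pbcopy',)):
    """Pipe text to a clipboard command (pbcopy on OS X) and return its exit code."""
    proc = subprocess.Popen(list(command), stdin=subprocess.PIPE)
    # communicate() writes the bytes, closes stdin, and waits for the command
    proc.communicate(text.encode('utf-8'))
    return proc.returncode


# Made-up sample data standing in for real browser tabs
sample = [('Example', 'http://example.com/'),
          ('Python', 'http://www.python.org/')]
summary = format_tabs(sample)
print(summary)
```

On a Mac, copy_to_clipboard(summary) would then place the same text on the clipboard.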

The remaining hurdle was simply to put the Python script I’d written into the same Scripting folder as my AppleScript version. For me this was ~/Library/Scripts/Applications/WebKit/. When run from the scripts folder, your usual environment is not inherited, so the #! line must point to the version of Python you are using (and which has Appscript installed). You should also make the script executable. Adding .py or any other extension is not necessary.
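
The install step amounts to a copy plus setting the executable bit. Here is a minimal sketch of that, assuming Python 3. install_script is a hypothetical helper, and the demo runs against a temporary directory rather than the real ~/Library/Scripts folder.

```python
import os
import shutil
import stat
import tempfile


def install_script(source, scripts_dir):
    """Copy source into scripts_dir, make it executable, and return the new path."""
    os.makedirs(scripts_dir, exist_ok=True)
    target = os.path.join(scripts_dir, os.path.basename(source))
    shutil.copyfile(source, target)
    mode = os.stat(target).st_mode
    # add execute permission for user, group, and others (roughly chmod a+x)
    os.chmod(target, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target


# Demo in a throwaway directory; the shebang path is the one used in the post
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, 'dump_tabs')
with open(source, 'w') as handle:
    handle.write('#!/usr/local/bin/python\nprint("hello")\n')

installed = install_script(source, os.path.join(workdir, 'Scripts', 'WebKit'))
print(installed, os.access(installed, os.X_OK))
```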

Overall, while I found AppleScript to be very powerful, and not quite as painful as I remembered, I found the Python version (warts and all) to be easier to work with. Combined with the fact that the script folder will run non-AppleScript scripts, this opens up new worlds for me. I have hesitated in the past to write a lot of SIMBL-based plugins, tempting though it may be, because they are hacks, and they run in every Cocoa-based application. But adding application-specific (or universal) scripts, in Python, is pure, unadulterated goodness.

First VanPyZ of 2009

I really need to start blogging these before they happen, but I will at least try to summarize the January Vancouver Python user group meeting. Our featured speaker wasn’t able to make it, and it was only with the nudging of Andy and the gracious help of Jim and Dane at Workspace that we even had a January meeting. But a meeting was had, a surprising number of people braved the heavy rain, and we had a good time talking about our varied interests and explorations, mostly with a Python theme.

What I did over winter vacation

We had an open discussion, starting with the nominal topic of the night, Python packaging. We discussed some of the pros and cons of distutils, setuptools, and easy_install. Our casual conclusion was that all of these tools work better if you first use Ian Bicking’s VirtualEnv to prevent library pollution in your main Python install. VirtualEnv puts a copy of Python, your required libraries, and any dependencies they have into a separate directory and then runs from that, keeping a project nicely sandboxed. The OS X Python application builder, py2app, works similarly, but after the fact, as a final build step, so you can deliver a complete application without worrying about what version of Python and which libraries are installed on a user’s computer. Ian Bicking also has a package installer you can use with (or without) VirtualEnv, called Pip, and if you check his two links here you may notice a pattern forming for his blog titles.
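
As a rough illustration of the sandboxing idea, the sketch below creates an isolated environment with the venv module from the Python 3 standard library. This is a stand-in, not VirtualEnv itself: venv arrived in later Python versions than the ones we were discussing, and the directory name is made up for the demo.

```python
import os
import tempfile
import venv

# Throwaway location for the demo environment (hypothetical name)
env_dir = os.path.join(tempfile.mkdtemp(), 'project-env')

# with_pip=False keeps the example quick and avoids bootstrapping pip
venv.create(env_dir, with_pip=False)

# The environment is a self-contained directory marked by its own pyvenv.cfg
print(sorted(os.listdir(env_dir)))
print(os.path.isfile(os.path.join(env_dir, 'pyvenv.cfg')))
```

Libraries installed into such a directory stay out of the main Python install, which is the property that made the other packaging tools behave better.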

From there we wandered around, topic-wise, discussing XML/HTML parsing libraries such as Beautiful Soup and lxml, from there to screen scrapers like mechanize, to web spiders like scrapy.

Conversation drifted into 3D for a bit, touching on VRML/X3D and the open-source FreeWRL viewer (built by the Canadian Government, yay!). While it isn’t really a Python project, there is a Python library to generate VRML as part of the Scientific library (not to be confused with SciPy). And if you don’t want to go the XML route, you can stay in Python because Scientific can also generate VPython code (although VPython does not yet support things like transparency or texture mapping).

We kept coming back to web frameworks, discussing the tradeoffs between Django and Zope/Plone and the impact of “cloud computing” platforms such as Amazon Web Services (AWS) and Google App Engine. While the AWS services give a lot more flexibility, they require a substantial amount of planning and configuration. App Engine has the easy deployment of a PHP app, along with the convenience of Python and the power of Bigtable, but it has the rather substantial disadvantage of still being a beta environment that you can’t actually buy services on yet (no scaling up).

There are some workarounds for these and other problems (the App Engine 10 app limit, for instance). Hadoop was mentioned as an open-source alternative to App Engine. Amazon Machine Images can give you a head start at deploying on AWS, although you will still need to make arrangements for data persistence. 10gen appears to be building using the App Engine model (more or less), but as a smaller player, they may be more responsive to user feedback. LowEndBox is not a provider, but a blog tracking ultra-cheap hosting with root access, so you could conceivably build your own AWS on a shoestring.

Since I spent part of the holidays building Lego Mindstorms models with my kids (a machine gun and a puppy), I kept trying to steer the conversation towards robotics. Unfortunately, the Python Robotics project does not support Mindstorms yet, but they do support Roombas, so I may still be in luck. I haven’t found a Python project for programming Mindstorms yet, but you could probably wrap Not Exactly C with Python fairly easily, and after the meeting I found this script to use Bluetooth to control Mindstorms from Python.

After that we continued the discussion at the pub, and I don’t have any browser history to help me remember that part of the discussion. I do remember that before the meeting started, I plugged Scratch again as the best way to introduce 6- to 12-year-olds to programming.

Finally, I want to thank everyone on the VanPyZ mailing list for helping to organize and to re-establish the group’s web site after recent hack attacks and crashing, especially Henry Prêcheur for restoring the content and moving it to the official Python wiki for the new VanPyZ page.
