Tab Dumping in Safari

The Problem

I first saw the term “tab dump” on Dori Smith’s blog, but I immediately recognized the concept. I keep Safari running all the time and with the help of Hao Li’s wonderful extension Saft I keep everything in tabs in one window. Among its many features, Saft will let you consolidate your windows into tabs of one window, and it can save the tabs you have open when you close (or crash) Safari, and re-open them automatically when you start Safari again. What it doesn’t do is give you a list of all the tabs you have open in text format, suitable for blog or email. I don’t currently put tab dumps on the blog because a) I’d feel guilty doing that without adding at least a short comment for each link, which would take too much time, and b) this isn’t really a link blog, more a place for me to bash out example code and tutorials. At least, that’s how I think of it.

I do, however, find Safari teetering on the brink of being unusably slow because I have so many tabs open, and often they’re only open because I want to remember to do something with them later, or come back to them, or some other reminder-type function. So I send myself a tab dump on a more-or-less daily basis. Firefox has tools to help you do this, but I haven’t seen anything for Safari, possibly because you can’t really do it with a Safari plugin, but need to use an InputManager, which is fairly deep magic, and basically a hack, an abuse of the system.

On the other hand, I couldn’t keep using Safari if it wasn’t for Saft, and Saft is an InputManager. Another tool for blocking ads and such (which Saft also does) is PithHelmet, but the interesting thing to me about PithHelmet isn’t that it is a popular ad blocker, but that Mike Solomon (who wrote PithHelmet) decided to not just make an InputManager, but to make the only InputManager you’ll ever need. You see, PithHelmet itself is not an InputManager; it is a plugin for SIMBL (also by Solomon), which is an InputManager that loads plugins based on the application(s) they claim to support. InputManagers get loaded by every application (Cocoa apps, at least), so you have to be careful you’re in the app you want to modify, and take steps not to break things. SIMBL takes care of the nasty business of being a well-behaved system hack, and your code can assume it is in the right app, because it doesn’t get loaded otherwise.

The Goal

Once I figured out that the only way I was going to get tab-dumping behaviour into Safari was with an InputManager (Safari tabs don’t play well with JavaScript, so that turned out to be a dead end), I decided to try writing one in Python. SIMBL is open-source, so at first I was looking at the code to see what I needed to do to create an InputManager (remember, this is a hack, so Apple doesn’t document it very well). I also read Mike’s essay Armchair Guide To Cocoa Reverse Engineering. What I decided was that, rather than recreate the functionality in SIMBL using Python, I would just create a SIMBL plugin in Python.

Getting started wasn’t too bad, but I found one issue in the above essay that stumped me for a while. Mike recommends you put your initialization code into a class method load() which gets called after your class is loaded. I don’t know if it is an artifact of using PyObjC or what, but my load() method was never getting called. What I did instead was to run the command-line utility class-dump on another SIMBL plugin to see what they were doing. They were using the class method initialize() rather than load(), and when I switched to that things started working, where by “things” I mean, “I could print to the console to see that my class had loaded.”

The Solution

The next step was to actually do something once I had my code loading into Safari. The tab behaviour of Safari isn’t part of WebKit, so it isn’t documented anywhere. Once again, I used the handy class-dump utility. This is a fabulous tool which will read any Cocoa library, bundle, or application and produce a pseudo-header file showing all the objects and methods defined. I still had to try a few different paths to get to the tab information I wanted, but it was pretty easy, armed as I was with Python and the output of class-dump. Here is the result:

import objc
from Foundation import *
from AppKit import *
class TabDump(NSObject):
    # We will retain a pointer to the plugin to prevent it
    # being garbage-collected
    plugin = None
    # the following is not strictly necessary, but we only
    # need one instance of our object
    @classmethod
    def sharedInstance(cls):
        if not cls.plugin:
            cls.plugin = cls.alloc().init()
        return cls.plugin
    @classmethod
    def initialize(cls):
        app = NSApp()
        menu = app.windowsMenu()
        cls.item = NSMenuItem.alloc().initWithTitle_action_keyEquivalent_(
            'Dump tabs to clipboard',
            'tabdump:',
            '')
        # should be after "Previous Tab" and "Next Tab"
        menu.insertItem_atIndex_(cls.item, 6)
        cls.item.setTarget_(cls.sharedInstance())
    def tabdump_(self, source):
        output = []
        app = NSApp()
        for window in app.windows():
            if window.className() == 'BrowserWindow':
                controller = window.windowController()
                for browserWebView in controller.orderedTabs():
                    output.append(browserWebView.mainFrameTitle().encode('utf8'))
                    output.append(browserWebView.mainFrameURL().encode('utf8'))
                    output.append('')
        self.copyToPasteboard_('\n'.join(output))
    def copyToPasteboard_(self, string):
        pasteboard = NSPasteboard.generalPasteboard()
        pasteboard.declareTypes_owner_([NSStringPboardType], self)
        pasteboard.setString_forType_(string, NSStringPboardType)

As you can see, on my class being initialized, I create a new menu item and insert it into the Windows menu. This could be more robust, by testing menu item names to make sure I’m in the right place, but it works for me, and simple code is more maintainable code. I create an instance of my object and make it the target of the menu item. Pretty basic stuff.
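The robustness check mentioned above might look something like the following sketch. It is plain Python, so it can be tried outside Safari; inside the plugin you would build the list of titles from the menu itself, for example with [item.title() for item in menu.itemArray()]. The function name and the anchor titles are my assumptions for illustration, so check what your version of Safari actually calls those items.

```python
def insertion_index(titles, anchors=("Previous Tab", "Next Tab"), fallback=6):
    """Pick where to insert a new menu item.

    titles   -- the current menu item titles, in order
    anchors  -- titles we want to sit just below (assumed names)
    fallback -- index to use when no anchor is found
    """
    positions = [i for i, title in enumerate(titles) if title in anchors]
    if not positions:
        # Never return an index past the end of the menu.
        return min(fallback, len(titles))
    return max(positions) + 1
```

The result would replace the hard-coded 6 in the call to menu.insertItem_atIndex_(). Falling back to the old index (clamped to the menu length) keeps the plugin working even when the menu doesn’t look the way we expect.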

When the tabdump method is called (by selecting the menu item in Safari), it walks through Safari’s window objects (of which there are many) until it finds browser windows, then it extracts the tabbed views from the browser windows to get the titles and URLs involved. When it has collected all the title/URL pairs, it turns them into one big string and puts the string on the pasteboard. Here is where we could be a lot fancier. I’m just putting title/URL pairs, separated by newlines in plain text, because that’s how I mail them to myself. You could easily create Markdown links or any other format here. You could turn them into HTML and put them on the pasteboard that way. There’s a lot you can do, and the Firefox tool I used to use to do this offered so many options that I was never sure what most of them actually did. Here you can customize the code to do exactly what you need, and keep it simple.
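To make the “other formats” idea concrete, here is a minimal sketch of a formatter that takes (title, URL) pairs and produces plain text, Markdown, or an HTML list. It is not part of the plugin above; the function name format_tabs and the style names are made up for illustration, and it is written for a current Python rather than the one the plugin was built against.

```python
from html import escape


def format_tabs(pairs, style="plain"):
    """Turn a list of (title, url) tuples into one string."""
    if style == "plain":
        # Same layout the plugin produces: title, URL, blank line.
        lines = []
        for title, url in pairs:
            lines.extend([title, url, ""])
        return "\n".join(lines)
    if style == "markdown":
        # Titles containing square brackets would need extra escaping.
        return "\n".join("[%s](%s)" % (title, url) for title, url in pairs)
    if style == "html":
        items = [
            '<li><a href="%s">%s</a></li>' % (escape(url, quote=True), escape(title))
            for title, url in pairs
        ]
        return "<ul>\n%s\n</ul>" % "\n".join(items)
    raise ValueError("unknown style: %r" % style)
```

In the plugin you would collect the pairs in tabdump_() instead of appending to a flat list, then hand the formatted string to copyToPasteboard_().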

Building the plugin

I haven’t tested this with multiple windows, or with a window with only one tab. It might work, might not. I don’t plan on using it that way, and if I do, it’s easy enough to fix. Now, there is one more thing you’ll need, which is the setup.py script to build it. Assuming you’ve saved the above code as TabDump.py, the following script should be what you need:

'''
    Minimalist build file for TabDump.py
    To build run 'python setup.py py2app' on the command line
'''
from distutils.core import setup
import py2app
plist = dict(
    NSPrincipalClass='TabDump',
    CFBundleName='TabDump',
    SIMBLTargetApplications=[
        dict(
            BundleIdentifier='com.apple.Safari',
            MinBundleVersion='312',
            MaxBundleVersion='420')],
)
setup(
    plugin=['TabDump.py'],
    options=dict(py2app=dict(
        extension='.bundle',
        plist=plist,
    )),
)

In the above file, MinBundleVersion and MaxBundleVersion can keep your code from being loaded if an untested version of the application is running. I have more-or-less dummy values there; don’t treat them as the right thing to do. The SIMBLTargetApplications key holds a list, so if you want your code to load in other applications, add more dictionaries to the list.
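As a sketch of what “add more dictionaries to the list” means, the plist below targets two applications. The helper simbl_target is hypothetical, the version numbers for Mail are placeholders in the same spirit as the Safari ones, and of course TabDump itself would find no browser windows in Mail; this only shows the shape of the data.

```python
def simbl_target(bundle_id, min_version, max_version):
    """Build one entry for SIMBLTargetApplications (hypothetical helper)."""
    return dict(
        BundleIdentifier=bundle_id,
        MinBundleVersion=str(min_version),
        MaxBundleVersion=str(max_version),
    )


plist = dict(
    NSPrincipalClass='TabDump',
    CFBundleName='TabDump',
    SIMBLTargetApplications=[
        simbl_target('com.apple.Safari', 312, 420),
        # Placeholder values; look up the real bundle versions you have tested.
        simbl_target('com.apple.mail', 1, 9999),
    ],
)
```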

Also note that you can build your bundle with python setup.py py2app -A to create a development version (can’t ship it that way) that is all symlinks, so you can edit TabDump.py to make changes without having to rebuild the plugin. If you modify the MinBundleVersion or MaxBundleVersion you will have to rebuild to regenerate the property list (or move the property list to be an external file rather than generating it in setup.py), but that should be an infrequent operation. More importantly, you can put a symlink to your bundle in your ~/Library/Application Support/SIMBL/Plugins/ directory. Then you can make changes to the Python code and test it by simply restarting Safari. WARNING: If you have a syntax error in your file, Safari will most likely hang on restart. Just force quit it and check your console for the error to fix.
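If you do want the property list as an external file, one way to get started is to write the same dictionary out once with the standard library’s plistlib, as in the sketch below (current Python; the function name and the default filename are my own choices). My understanding is that py2app’s plist option will also accept a path to such a file in place of the dict, but treat that as an assumption and check the py2app documentation before relying on it.

```python
import plistlib


def write_info_plist(plist, path="Info.plist"):
    """Write a plist dictionary to an XML property list file and return the path."""
    with open(path, "wb") as handle:
        plistlib.dump(plist, handle)
    return path


plist = dict(
    NSPrincipalClass='TabDump',
    CFBundleName='TabDump',
    SIMBLTargetApplications=[
        dict(
            BundleIdentifier='com.apple.Safari',
            MinBundleVersion='312',
            MaxBundleVersion='420')],
)
```

After calling write_info_plist(plist) once, the version numbers can be edited in the file by hand.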

The Promise

Now, if you’ve followed along with me so far, I’d like to point out a few things that are really freaking cool about this. Item the first: You now have Python running in Safari. Can you think of anything else you’d like it to do while you’re there? I bet you can. Item the second: You can do this in any Cocoa-based application just as easily. Problems in Mail.app? Frustrated by iChat? Just fix it. Take control of your own applications! Make the computer work for you, not the other way around. Item the third: class-dump gives you the keys to the kingdom. Seriously, the combination of being able to embed Python and get a listing of the objects and methods at will is so powerful that when I got TabDump working late last night and realized what I’d just done (i.e., these three things), I was barely able to get to sleep after that. The possibilities are endless.

If you use this and do something cool with it, please drop me a line and tell me about it. I’m really looking forward to hearing about what kind of cool ways we can push our existing applications.

Correction [2006-08-30]

The class-dump utility rocks, and you should add it to your arsenal of Cocoa tools, along with Python and PyObjC. Since I found it, it has already become indispensable for examining existing applications that I want to, er, adjust. Here’s what I’ve learned so far.

First, I want to update my previous post to talk a little bit more about the command-line utility class-dump. This is a fine tool that lets you introspect a Cocoa bundle (plugin, library, or application) and prints out a header file describing all the objects and methods in that bundle. I didn’t mention where to get it, and at BarCamp this weekend I gave some misinformation by telling people it came with Apple’s developer tools, which is not true. I assumed that’s where it came from, because I didn’t remember hearing of it before reading Mike Solomon’s Armchair Guide to Cocoa Reverse Engineering, which refers to classdump without any explanation of where to get it. I tried it, found class-dump worked (tab-completion is your friend), and assumed it came with my system, when in fact I had installed it earlier after reading about it on another blog (I’m afraid I don’t remember where), meaning to try it out, and then forgotten about it. So it was there, waiting for me, when I discovered a need for it.

So the truth is, class-dump is a utility written by Steve Nygard. He says it provides the same output as the developer tools command otool -ov, but formatted as a header file. Besides the basic output it can also do various kinds of filtering, sorting, and formatting.

So this is my Tool of the Week (and then some): class-dump. Use it, love it, thank Steve.

Bar Camp Tomorrow

Bar Camp Vancouver runs from 6 PM tomorrow to 6 PM on Saturday. I’ll be heading over there with my neighbor, former co-worker, and original member of Pluto, John Ounpuu, currently of Sutori fame. I’m planning on ducking out to sleep at home rather than camping there, but I’m sure it will be a great time. If there is time I may reprise my presentation on Python, OS X, and Kids from the Vancouver Python Workshop. If time is short I may still be able to demo Drawing Board. If time is really short I’ll still try to squeeze in a demo of the new hack I figured out last night (see next post). I’m also hoping to find some time to hack on turtle graphics for OS X, since I’m so close to having a working port of the standard library turtle graphics in PyObjC. But the main thing I’m excited about is meeting folks; it’s going to be a great crowd.

Vancouver is such a great place. There’s the standard stuff: Great weather, beautiful beaches, forests and mountains. Then there is all the rest: lots of interesting geeks of various stripes, cool places to work, small conferences to attend. I’ve had so much more fun at the Vancouver Python Workshop and Northern Voice than at big anonymous events like JavaOne and OOPSLA. There’s just no contest. And BarCamp is all about being a small, intimate event–that appears to be its whole purpose. I can hardly wait.

Vancouver Python Workshop

[Updated 2006-08-20 to add links. All (most?) presentations are being posted on the proceedings page]

Last weekend I attended the Vancouver Python Workshop here in town. I love this conference because it’s small and local, but attracts a great crowd. There were a few folks I know mainly from the edu-sig mailing list. Guido was there to give the keynote, I got to talk with Ian Bicking a fair bit, and I had never realized that Toby Donaldson is a fellow Vancouverite.

Friday night: Keynotes

Guido (van Rossum, creator of Python, for anyone reading this who is not soaking in Python every day) gave a keynote on the state of the upcoming Python 3000. Since I make an effort to keep up with this there wasn’t much in the way of surprises here, but it’s always good to hear it from the BDFL’s mouth. I’m excited by what’s happening in Python lately, both in language changes and in the libraries, and I think Python 3000 will be a move in the right direction when it comes.

Jim Hugunin, initiator of the Numeric extension, Jython (Python on the Java virtual machine), and most recently IronPython (Python on Microsoft’s dotNet virtual machine) also gave a keynote. He showed how having Python running in the same runtime as your C# code allows you to have stack traces all the way down, which is an advantage over PyObjC. It also helps to be binding to a language which is garbage collected. In PyObjC they go to lengths to make the runtime behave like garbage-collected Python, but there are still some edge cases that can sneak in and bite you. Of course, Apple’s announcement of garbage-collected Objective-C in Xcode 3 next year should fix that. Jim’s an engaging speaker and seems like a really nice guy. And since both Google (“do no evil”) and Microsoft (“evil empire”) were represented, balance was maintained in the Force.

After the keynotes there was a reception at the Steamworks brew pub, which is a great place. Staying up late drinking may not be the best way to kick off a conference that starts at 8 am the next day though. I was home by midnight, but I heard from many others that they were out until 3 am. Ouch.

Saturday: Workshop Day One

This was an intense day for me. I was scheduled to give a presentation, but then Paul (Prescod, my friend and co-worker, and one of the conference organizers) asked me to also give a lightning talk and be on a panel discussion. Since I didn’t have time to prepare much for these I was a bit nervous, on top of already being nearly sick with stage fright getting ready for the presentation I did prepare for.

Guido hadn’t had time for questions after his keynote the night before, so the conference kicked off with a Q&A session for him. After that were the lightning talks, and I gave the first one. Aside from not having time to prepare, I had never given a lightning talk before. I demo’d Drawing Board, the animation tool I’ve been working on for my kids, which filled my five minutes pretty easily. I got some nice feedback about it too, which is especially gratifying since after I’ve been working on something for a while I tend to only see all the problems I know it has, rather than what is good about it. Drawing Board has been a real struggle for me, both in learning PyObjC and the “Cocoa Way” of doing things, and trying to push it into new territory, so I was pleased with how its first public demo was received.

After the lightning talks I attended the beginning of Paul’s tutorial for newcomers to Python to try and pick up some tips. After lurking on the edu-sig mailing list, I’m trying to organize some Python classes at my daughter’s school for the kids there. Paul’s approach was more towards people already programming in other languages, so I didn’t get much there that I could use, but it gave me some things to think about at least.

Ian Bicking’s talk on WSGI got extended from one 45-minute slot to two, and I only got to sit in on the first half, because it overlapped with my presentation. I loved his hand-drawn slides and his take on the “Internet is a series of tubes” meme. He’s a good presenter and WSGI is important stuff for Python, giving it the kind of basic web framework that Servlets gives to Java, which Python has desperately needed for a long time. Now if I could just wrap my brain around Paste I think I could achieve web enlightenment…

My talk, “OS X, Python, and Kids” went well. I used the whole 45 minutes (not like last time I presented when I went nervously through my presentation in 10 minutes and was then left embarrassed on stage), took some good questions, and people seemed to be engaged. I’ve posted my slides, with notes based on what I was talking about during the presentation, here (3MB PDF). The feedback from this presentation, where I used several of my projects as examples of what you can accomplish with PyObjC, was very good, at least from the folks who came up to me. A couple of people even sounded like they were on the fence about whether to switch to Macs and I might have given them a push.

After my talk, and lunch, was the panel discussing how to embed C, C++, C#, and other languages in Python. We had Jim Hugunin (IronPython), Tom Weir (using SWIG on a proprietary project), Samuele Pedroni (of PyPy), and me. There was no moderator, and while I’ve followed the various tools to wrap libraries for Python, and I’ve used a lot of libraries that are wrapped for Python, I haven’t actually done much wrapping myself, so I tried to be a moderator and get the conversation going. SWIG seemed to be used by the most people in the room. As a user of SWIG-wrapped projects I’ve struggled with it, since any project that uses it seems to be dependent on a very specific version which you then have to find, download, and install. So I’m not a big SWIG fan. On the other hand, folks who use it for wrapping C code can freeze the version they’re using and more or less rely on it, so it works great for them (although I guess debugging is still a bear). I still think SWIG is part of the problem, not part of the solution.

At the end of day one was a great BBQ at Locarno Beach. Daniela and I took the kids, then we stayed on the beach to watch the final night of the Celebration of Light fireworks. It was Mexico’s turn to light up the sky and they did a spectacular job, winning the four-night competition with Italy, China, and the Czech Republic. So then there was an encore display, and then we tried to get home through the gridlock that follows fireworks in Vancouver. We got home by midnight, washed the sand off the kids, and put them to bed.

Sunday: Workshop Day Two

After two late nights (for me) and early mornings (for weekends) I was beginning to feel really beat. I was also coming down from the adrenaline high I had been on the day before, before and after presenting. And I was distracted by trying to port the turtle module in the standard library from Tkinter to PyObjC (and simultaneously port my Kutia turtle program from Tkinter to PyObjC). I actually got pretty far with it, but probably need to post some questions on the PyObjC mailing list to get it finished.

The morning started with a status report on PyPy from Samuele Pedroni, one of the furthest travellers to the conference, coming in from Italy. I’m fascinated by the progress PyPy has made, going from thousands of times slower than CPython to only a couple of times slower. PyPy is Python implemented in Python, specifically implemented in a subset of Python which can be efficiently compiled to C, called RPython. This is similar to how Squeak Smalltalk is implemented (a small, compilable subset of Smalltalk is used to implement the rest). The advantage of PyPy is that because the language definition itself is relatively high level, they can quickly implement all kinds of interesting things, like Stackless Python, compiling Python to dotNet or Java, even experimenting with compiling Python to JavaScript. It’s a very interesting project and a lot of fun to watch. I’m rooting for them.

Next was Wilson Fowlie’s presentation on PyParsing. I had lunch with Wilson and he’s a great guy. He shouldn’t be so self-deprecating in his presentation because he had great information and it was an interesting topic. Parsing libraries are an area where Python has an excess of riches, and it can be difficult to decide what to use. Wilson made his choice and is happy with it, and makes a pretty convincing case for choosing PyParsing. It would be interesting to see projects such as reStructured Text, PyMarkdown, and PyTextile all built on top of the same basic parsing engine. Next time I have a file type to parse I will give PyParsing a try, based on Wilson’s presentation.

After lunch, James Thiele gave a talk on embedding domain specific languages in Python, mainly revolving around the hooks Python gives you for importing libraries. By using the import hooks you can import files that are not Python code and have them show up as Python objects. This is pretty cool, especially when coupled with the previous talk on parsing.

There was a panel discussion on Little Languages, which both Wilson and James were on, but by now fatigue and distraction were taking their toll and I wasn’t a very good audience.

After lunch there was a presentation by Leonardo Almeida on Python and Zope in Brazil, the gist of which was that both are widely used, in business and in government, but companies that use them don’t want to talk about it because they regard Python as their secret weapon.

David Ostrowski gave a talk on teaching with Python in graduate computer science classes, but I didn’t really enjoy the presentation. I was too tired and his points about Python were too familiar. I didn’t hear anything new, but it all appeared to be new to him. More power to him, I guess.

There were some good lightning talks at the end of the conference, including Brian Quinlan showing a hack he’d whipped up after the earlier talk on importing domain specific languages. He showed how to use the import hooks and the csv module to load comma separated value files as native Python datatypes.
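I don’t have Brian’s code, but the idea can be sketched with today’s importlib machinery (which is not what was available at the time). The class below is a guess at the general shape, not his implementation: the name CsvFinder and the rows attribute are my inventions. Once installed, import foo will load foo.csv from any directory on sys.path when no ordinary module named foo exists.

```python
import csv
import importlib.abc
import importlib.util
import os
import sys


class CsvFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Import hook that loads NAME.csv as a module with a `rows` attribute."""

    def find_spec(self, fullname, path=None, target=None):
        filename = fullname.rpartition(".")[2] + ".csv"
        for directory in (path or sys.path):
            candidate = os.path.join(directory or os.curdir, filename)
            if os.path.isfile(candidate):
                # Reuse this object as the loader for the file we found.
                return importlib.util.spec_from_file_location(
                    fullname, candidate, loader=self)
        return None  # let the next finder (or ImportError) handle it

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        # Each row becomes a dict keyed by the CSV header line.
        with open(module.__spec__.origin, newline="") as handle:
            module.rows = list(csv.DictReader(handle))
```

To try it, append an instance to the import machinery with sys.meta_path.append(CsvFinder()). Appending rather than inserting at the front means real Python modules always win over CSV files of the same name.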

To close the conference, Ian Caven gave a great show about how he and a partner built up a successful business using Python (and OS X). They restore movies for DVD release and they use hundreds of Macs running in parallel, with all the processing managed by Python. It’s a great story and Ian is a great presenter, so that was a high note to end the conference on.

Looking forward to next time, for sure. Thanks to all the organizers!

Pastels

[Update: Thanks to Blake Winton for pointing out that the project page link to the Pastels download was broken, fixed now. Also added a link to the project page.]

Pastels is an example project for creating an OS X screensaver in Python using PyObjC. By extension it could be used as an example for building nearly any plugin or bundle for OS X. It started when I had an idea for drawing a simple squiggle, over and over, while cycling the colours and moving the squiggle around. I was very pleased with how it turned out.

Project page: http://livingcode.org/project/pastels/

It’s also my first attempt at hosting an open-source project at Google with their new hosting program. If it works out well I will add more of my projects there, which will save me trying to set up and configure Subversion on Dreamhost for public access (probably not difficult, but one more thing I don’t have to do means more time for writing example code and tutorials).

I’m working on the tutorial text to go along with this project, so ask any questions you have and I’ll try to get to them in the tutorial.

If you are seeing this on my site (as opposed to the Atom feed), there are some changes I’m making to the site that I’d like to point out. I’ve added pages for projects and mini-projects which use the same stylesheet and includes as the rest of the site. I know the stylesheet is less than completely attractive right now–the first thing was to get everything factored and consistent, then to make it pretty. The projects page only has one item on it (Pastels), but that should be changing now that I have the infrastructure set up the way I want it. Nearly all the projects I mentioned in my presentation at the Vancouver Python Workshop will get their own pages soon. More about that in my next post.

XML Article without the XML

My latest article for David Mertz’s column XML Matters is up at IBM developerWorks. “Lighter than microformats: Picoformats” (Ajax without X, Microformats without angle brackets) went live a couple of days ago. It isn’t so much about XML as how to avoid XML. My feelings towards XML are that it is useful and good, but overused and not a panacea. By providing some alternatives, maybe some of the backlash against the “XML everywhere for everything” meme can be averted.

I’ve been meaning to post about the article, but I keep getting caught up preparing my presentation for the Vancouver Python Workshop on Saturday (the workshop starts Friday August 3rd and goes through Sunday August 5th). My talk this year is on using PyObjC to create applications and plugins for OS X using Python. I’ll get the slides up after, as soon as I can. I’m also planning on doing a shorter version of this talk at Bar Camp Vancouver which is 6 pm Friday, August 25 to 6 pm Saturday, August 26.

And I should have mentioned the Google talk at the Vancouver High Performance Computing User Group before it happened on July 27th. Narayanan ‘Shiva’ Shivakumar came up from their Seattle office to present mostly old information from their published papers such as The Google File System, MapReduce, and BigTable (video). The talk over beers after was fun, and it was good to see my friend Mark and find out he has a blog, even if it’s over my head much of the time.

Well, that’s my update dump. More stuff on actually using PyObjC coming Real Soon Now.
