The explanation for the apparent insanity of this product is actually very simple. Akio Toyoda, the CEO of Toyota, loves fast cars. He fucking loves them! That’s it. That’s the big reason. It’s why the biggest car maker in the world spent ten long years and well over a billion dollars developing a car that almost no one will ever own—or even know about, for that matter.
I've been trying to morph my nascent machine learning library from haphazard research project into something more full-fledged. I have a great name, pyrouette, and some repositories for code that a few other people have found useful (I get emails).
I'm not sure what it takes to make something widely used. Other projects like scikits and pybrain seem to have multiple developers from the start, or at least emerged into a small but dedicated pool of early adopters. I have a few algorithms that I couldn't find elsewhere and a lot of the coding equivalent of duct tape.
Obviously a lot of work needs to be done, and there's no guarantee that anybody will care. Plenty of great projects never get a lot of traction. I sort of wonder if branding might be the key. Towards that end I've been thinking a bit about trying to come up with a logo for my project. A futile effort, probably, but one I thought I could have fun trying.
I call pyrouette a "Pythonic Artificial Intelligence" library. My vision is that the library include lots of light-weight, highly customizable implementations of advanced AI algorithms. Given my research into reinforcement and manifold learning, there's a clear bias in what's currently available, but the vision is more aspirational than practical.
Anyway, I wanted to take some inspiration from the logo for Python, which I thankfully found in SVG form:
Using my limited Inkscape skills I thought I'd play around with the Python logo a bit. I wanted to see what I could do to evoke the idea of the namesake of this project: the pirouette. My first idea was to make the two snakes dance with each other, but the image still seemed quite static, and so to capture the idea of a spinning motion I decided to include some form of speed lines. Not actually being a designer or artist, I opted for the out-of-the-box spiral from the Inkscape toolbar, suitably squashed.
Yeah, I'm not proud. Anyway, I put the project on hold, but just today I had one of those rare shower ideas. My logo could pay homage to the Python logo without being the Python logo. I could capture both motion and grace with smoother, slender figures (still suitably abstract), and keep the color scheme. Here's what I came up with:
The colors on the yellow snake aren't quite right, but I kind of like how the red accents add a nice character and even a bit more motion. I'm happy enough with this draft that I decided to go with it. The logo is now live at http://pyrouette.net. Not sure it will bring any new interest, but I like the new look.
There's a famous and massive comparison of iOS writing apps on the web and I think it's time for someone to put together a similar comparison of blogging/site generation engines. There's a new Python/Markdown based blog engine on Hacker News today called Letterpress, and at least one commenter has already chimed in with a Node/Markdown blog engine of his own.
Considering how the last static site generation post went, I'd expect a lot more programmer/bloggers to jump in with their own contributions. The proliferation of static site generation tools (consider this list) is something of a marvel. It's a great problem for reinvention: there are lots of parameters to play with and, given the state of modern tooling, nothing about building a blog or site generator takes much effort. Every project of this kind seems like a weekend project.
Also, Wordpress is slow as shit. Browsing the plugin library is like browsing for STDs. That's usually enough to get hobbyists like myself sifting around for something else, or give us the bump to put a few lines of our own code behind the problem.
Hacker News has highlighted a few pieces on Excel recently. This thread is a good example of the general sentiment. I've long thought that the real power of Excel is its ability to bring important programming concepts to the masses. The complex models that "non-programmers" can develop in Excel are surprising to programmers who haven't spent a lot of time with the software.
This comment captures a lot of what goes wrong whenever organizations attempt to replace Excel workflows:
Most of us who have been there and done that know what happens next: higher-level stakeholders get involved, broader objectives get defined, more team members are brought on, timetables are established, results are "metricized", paradigms are going to be shifted, etc.
As a computer scientist, it seems like the ground here is fertile for a bit more technology and a little less "process". Why not design an Excel compiler that takes in an Excel spreadsheet and outputs an application in a particular stack? Of course there's a bit of detail that needs to be filled in, but it seems like 90% of the work can be automated.
I'd imagine there would be fewer tales of failed porting projects if these projects took hours instead of weeks.
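To make the compiler idea a little more concrete, here's a rough Python sketch of the easy part. Everything in it is an assumption made for illustration: the sheet arrives as a plain dict of cell names to values or "=formula" strings, formulas use only cell references, numbers, parentheses and + - * /, and every referenced cell exists. The names compile_sheet and model are made up. Real Excel functions, ranges and operators like ^ are exactly the "bit of detail" that would still need filling in.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Matches simple cell references such as A1 or BC27 (no ranges, no sheet names).
CELL_REF = re.compile(r"\b[A-Z]+[0-9]+\b")


def compile_sheet(cells):
    """Turn {cell: value or '=formula'} into Python source for a function.

    Plain values become keyword arguments with defaults; formulas become
    assignments, emitted in dependency order.
    """
    formulas = {name: src[1:] for name, src in cells.items()
                if isinstance(src, str) and src.startswith("=")}
    inputs = sorted(name for name in cells if name not in formulas)

    # Each formula depends on the cells it mentions; sort so that every
    # cell is computed after the cells it reads (a cycle raises CycleError).
    deps = {name: set(CELL_REF.findall(expr)) for name, expr in formulas.items()}
    order = [n for n in TopologicalSorter(deps).static_order() if n in formulas]

    args = ", ".join(f"{n}={cells[n]!r}" for n in inputs)
    result = ", ".join(f"{n!r}: {n}" for n in order)
    lines = [f"def model({args}):"]
    lines += [f"    {n} = {formulas[n]}" for n in order]
    lines.append("    return {" + result + "}")
    return "\n".join(lines)


if __name__ == "__main__":
    sheet = {"A1": 100, "B1": 3, "C1": "=A1*B1", "D1": "=A1+C1"}
    print(compile_sheet(sheet))
```

The output is ordinary source code, so the spreadsheet's inputs turn into function arguments and the model can be dropped into whatever stack the porting project is targeting.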
The look and feel of this blog has changed a bit. Underneath the hood there is a new engine called lorem (a complementary project to ipsum). Lorem is a bespoke blog engine inspired by Jekyll but built using Python and docutils. I originally planned on ReST/docutils being a bigger part of the engine, but most of the post parsing is done prior to any application of docutils (and imported posts, already rendered in html, bypass docutils entirely). Even so, the post file format is compatible with ReST.
Turns out docutils wasn't the real hero of this project. It was the somewhat oddly named Jinja2 template engine. There are a few other nice features that are actually legit improvements over the Wordpress plugins I was using before. MathJax is an amazing way to format math on the web. Pygments provides very nice source code formatting for all my imported posts. New source code snippets now go in gists.
I never really liked the bells and whistles that Wordpress provided, and in my quest to minimize the features I needed, I realized I just wanted an engine that would take text files and render static html on a webserver. My shared hosting plan is sort of ghetto, so a static file blog, in addition to being exactly as "feature rich" as I desire, is also a bit faster. I wouldn't recommend using the code for anything at this time. As I grow into this new platform, I may get the chance to file down some of the rough edges.
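For a sense of how small the core of a "text files in, static html out" engine can be, here's a minimal sketch. It is not lorem's code. The post format (a block of "key: value" lines, a blank line, then the body) is made up for the example, and the standard library's string.Template stands in for Jinja2 so the snippet runs without installing anything. The names render_post and build are hypothetical too.

```python
import html
from pathlib import Path
from string import Template

# Stand-in page template; a real engine would load this from a template file.
PAGE = Template(
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n$body\n</body></html>\n"
)


def render_post(text):
    """Split the header block from the body and fill in the page template."""
    header, _, body = text.partition("\n\n")
    meta = {}
    for line in header.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    # Blank-line-separated chunks become escaped <p> elements.
    paragraphs = [f"<p>{html.escape(p.strip())}</p>"
                  for p in body.split("\n\n") if p.strip()]
    return PAGE.substitute(title=html.escape(meta.get("title", "Untitled")),
                           body="\n".join(paragraphs))


def build(src_dir, out_dir):
    """Render every .txt post in src_dir to an .html file in out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for post in sorted(Path(src_dir).glob("*.txt")):
        page = render_post(post.read_text(encoding="utf-8"))
        (out / f"{post.stem}.html").write_text(page, encoding="utf-8")
```

Calling build("posts", "site") would write one html file per post, and the webserver only ever has to serve those files.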
Apropos my recent post on Gaussian Processes, I dug up an old set of notes on how to use Gaussian Processes in reinforcement learning. You can check them out here.
A recent comment on Hacker News:
My guess is that a lot of big data is deployed on linux, and thus the development environment as well as the well known deployment routines on linux revolves around tools that traditionally work well on Linux, like the headless JRE (for the servers) and Eclipse/Netbeans/IntelliJ (for the development environment).
Can you even set up a bunch of windows server nodes without running into a licensing headache?
Me, from many years ago:
Finally, I sometimes need to run, say 200 or so processes in order to get experimental results. To do so would require ~200 Matlab licenses. That's a lot of money needed to harness the kind of distributed computing power that characterizes not just research institutions, but many modern business environments (I ran large distributed computational processes even when I had a real job.) In the past I’ve had to port to Octave, which ended up saving me days of waiting for a serial Matlab process to finish.
Lack of flexible licensing is death in big data.
A simple working example from Bishop's book: