With Octopress 2, the paradigm I used was to fork the canonical Octopress Git repository and maintain all my posts and theme files as a branch forked off of the Octopress "master" branch. I then periodically ran `git merge octopress/master` to pull upstream core Octopress infrastructure changes into my custom branch.
As mentioned in the official "Octopress 3.0 Is Coming" announcement, there were various downsides to the Octopress 2 paradigm. The main pain-point revolves around how Octopress 2 is basically just the skeleton of a Jekyll blog that you need to fork and modify – which means that when you want to take upstream changes to the Octopress 2 infrastructure, you need to merge those changes into your local forked branch and work through any merge-conflicts. Sometimes that's easy, sometimes it's time-consuming.
With Octopress 3, all of this has been re-thought so that your site is a Jekyll site first-and-foremost, and all the extra Octopress "goodies" are delivered via gems which can be used as Jekyll plugins. That creates a nice clean separation between your site content and the Jekyll site-building tools. Neat!
Side note: it's unclear what the future of Octopress 3 is.
Octopress 3 development was active and vibrant circa 2015, but all the activity in the Octopress plugin repositories seemed to tail-off towards the beginning of 2016. It's a shame, but I understand how these things sometimes go – priorities change, and life comes first.
I opted to embrace Octopress 3 as-is because I was a big Octopress 2 fan and I really like the Octopress 3 vision. Also, I wanted the small quality-of-life features which Octopress added above-and-beyond the default Jekyll scaffolding.
But I did run into some small bumps along the way – more on that later.
I created a `docker-compose.yml` file to compartmentalize my Jekyll build environment.
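I won't claim this is byte-for-byte what I ran, but the shape was roughly the following (the `jekyll/jekyll` image's mount point and bundle path here are assumptions on my part):

```yaml
version: "2"
services:
  jekyll:
    image: jekyll/jekyll:latest
    command: jekyll serve --watch
    ports:
      - "4000:4000"              # expose the Jekyll preview server
    environment:
      BUNDLE_PATH: /usr/local/bundle
    volumes:
      - .:/srv/jekyll            # the site source lives on the host
      - bundle:/usr/local/bundle # persist installed gems between runs
volumes:
  bundle: {}
```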
Some key points:

- It's based on the `jekyll/jekyll:latest` Docker image.
- It persists `$BUNDLE_PATH` to avoid needing to re-download and re-install all the `Gemfile` gems for each `docker-compose run` container.

Using Docker containers means everything is compartmentalized and it's easy to bootstrap my blog build environment onto another machine. Containers for the win.
Bootstrapping the new site goes like this:

- Create a new directory for your brand-new Jekyll site.
- Copy your `docker-compose.yml` file into the new directory.
- Use `docker-compose run` to start a shell in a new Docker container.
- Install the Octopress gem.
- Use the `octopress new` command to create a new Octopress-flavored Jekyll site. (Tip: you'll need to pass the `-f` flag to `octopress new` because you're trying to install into a directory which already exists; the full command sequence is sketched below.)
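Putting those steps together, it looks roughly like this (the directory name is an example, and I'm assuming the compose service is named `jekyll` as in the sketch above):

```sh
mkdir my-blog && cd my-blog
cp /path/to/docker-compose.yml .
docker-compose run jekyll /bin/bash   # drops you into a fresh container
gem install octopress                 # run inside the container
octopress new . -f                    # -f because the directory already exists
```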
Edit the `Gemfile` (created during `octopress new`) and minimally add the baseline Jekyll and Octopress gem dependencies.
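At a minimum, that means something like this (the version constraints are my guesses; pin to whatever you've actually tested):

```ruby
source "https://rubygems.org"

gem "jekyll", "~> 3.0"
gem "octopress", "~> 3.0"
```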
At this point, you have a brand-new empty Jekyll site with a few extra Octopress-flavored bits (like the `_templates` directory), and you can run normal Jekyll commands like `jekyll serve`, `jekyll build`, etc.
Jekyll supports several ways of managing gem-based plugins. I opted to manage all my gem plugins via a Bundler group in the `Gemfile`, which seemed like the most straightforward approach. It also seemed like a good idea to track both the `Gemfile` and its counterpart `Gemfile.lock` in source-control (e.g. Git), but your mileage may vary.
There are several neat Octopress 3 gem plugins, but I found that several of them didn't work out-of-the-box with Jekyll 3.x (because most of the plugins haven't been maintained in over a year now). But thanks to the community-effect of GitHub, several other folks have fixed the Jekyll incompatibility problems and submitted pull requests. You can install the patched version of a given Octopress 3 gem by specifying an explicit `git:` remote URL (and the specific `branch:`) for that gem in your `Gemfile`.
Here is the shape of the final `Gemfile` I ended-up with, with various patched Octopress gems.
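(The plugin name, fork URL, and branch below are placeholders; the real entries pointed at whichever pull-request branches were live at the time.)

```ruby
source "https://rubygems.org"

gem "jekyll", "~> 3.0"
gem "octopress", "~> 3.0"

group :jekyll_plugins do
  # A community-patched fork, pinned to the branch carrying the Jekyll 3 fixes:
  gem "octopress-some-plugin",
      git:    "https://github.com/someuser/octopress-some-plugin.git",
      branch: "jekyll3-fixes"
end
```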
You'll need to copy over whatever parts you need from your old Octopress 2 site to your new Octopress 3 site: `source/_posts` -> `_posts`, `source/images` -> `images`, `source/_includes` -> `_includes`, etc.

The original Octopress 2 theme was baked into the Octopress 2 git branch.
You could probably migrate over the old Octopress 2 theme from your old Octopress 2 site, but now that we're using plain-vanilla Jekyll 3, we can use any of the plethora of Jekyll-based themes. So I went hunting for a new theme.
I ended-up choosing the Pixyll theme because it was clean and modern.
I "installed" the Pixyll theme by:
I opted to create a "baseline" commit of the Pixyll theme (along with making note of which Git revision of the Pixyll repo I was forking from) to (in theory) make it easier to take and review upstream changes to the Pixyll git branch: create a local Git branch off that baseline commit, copy in any upstream changes to the Pixyll theme, and then `git merge` forward onto my local `master` branch and work through any merge conflicts.
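In shell terms, taking a future upstream drop would look something like this (the branch name and baseline SHA are placeholders):

```sh
git checkout -b pixyll-update <baseline-sha>   # branch off the baseline commit
# ...copy the new upstream Pixyll files over the working tree...
git commit -am "Update Pixyll to upstream revision <rev>"
git checkout master
git merge pixyll-update                        # work through any conflicts here
```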
Here are some of the customizations & personalizations I made along the way:

- Edited `_config.yml` to set various site flags which the Pixyll theme respects.
- Edited `_includes/footer.html` to include my name and copyright.
- Edited `_includes/head.html` to add the `{% css_asset_tag %}` Liquid tag, which was needed to get the `octopress-solarized` plugin to take effect (via the `octopress-asset-pipeline` plugin).

Here are some helpful posts which laid out the vast majority of the migration process:
Happy upgrading!
After 8 years on the job, Google Reader is retiring on July 1st. Thank you for being there for us all this time!
Thanks for being a great friend, Google Reader. And I'm happy to see that there will be plenty of worthy alternatives to fill your absence.
When I saw that headline on Wednesday evening, I felt a mixture of emotions: anger, sadness, and worry. Anger that Google could shut down a service that was so personally-valuable to me; sadness that a web-service I've depended upon for years was going away; worry that there might not be an alternative service out there that could fill the same role in all the same ways that the Google Reader ecosystem has.
You see, I've been using Google Reader since 2007. It's the web-app I use more than any other. I've had a dedicated Google Reader pinned-tab open in all my Firefox sessions for as long as I can remember.
I use it every single day. It's the way I read the Internet.
Google Reader is more than just a simple web-app: it's the back-end RSS aggregator service which crawls all the 120+ RSS feeds I follow and centrally stores all the state about what unread (and starred) articles I have. That kind of read-anywhere-sync-to-everywhere workflow is immensely valuable to me because I want to be able to access my news wherever I am and with whatever device is most handy: I use the Google Reader web-app on my home laptop, home desktop, and work desktop machines; I use Reeder on my iPhone; I use gReader on my Nexus 7 Android tablet.
It's been interesting to read the reactions/backlash from the tech community. I found some interesting blog-posts via Twitter:
Christian Heilmann wrote "RIP Google Reader - I'd Have Paid For You", where he talks about how the social web will never be a replacement for the one-stop aggregation service that Google Reader provides:
Yes, RSS has been declared dead many times and people keep banging on about the social web and that Facebook, Twitter, Reddit and others have replaced the old style of blogging and having an own feed. But I don’t buy it, sorry. Every social network is full of senseless chatter and organised advertising. Social media experts and PR folk make sure that information about certain products and celebrities get read and retweeted. I don’t care about that. I don’t want it. The same way I don’t watch public access channels or randomly surf channels but instead plan what I want to see on TV. Random exploration and finding things by chance is fun, but it is not helping you to keep up to date – it is the ADHD of information consumption.
Jonathan Poritsky wrote a fantastic blog-post titled "Reader's End and Google Today", where he points out that this could be the start of a disturbing shift in Google's priorities:
The biggest issue doesn’t seem to be the loss of Reader itself, but the recognition that Google’s priorities have shifted …
But walls have sprouted up. Google can’t access the massive amounts of data people pour into Facebook and Twitter, so they built Google+ as their own social walled garden. Twitter is exerting control over how users experience their product, which shuts out competitors like Instagram (which is owned by Facebook), which can no longer display images inline in tweets. The Web is getting smaller, not bigger, with each company working to become the umbrella under which you experience the Internet. So Google has taken steps to make sure that the Web as users know it exists under their company banner, and Reader doesn’t fit in with that plan anymore.
I was once a Google cheerleader. Like many I believed their goal was to make a better Web for everyone, with the one major tradeoff being that they would sell ads instead of charging users. That may once have been true but the Google of 2013 doesn’t want to build a better Web, it wants to build a better Google. I don’t think that goal is aligned with any of my own.
With this move, Google is sowing a lot of ill-will in the tech community. This feels very much against the "Don't Be Evil" slogan which Google touts. If Google can shut down a service as beloved as Reader, then it makes you wonder which services are safe from the chopping-block…
Even though Google says that "usage of Google Reader has declined", there's obviously still a lot of people in the tech community who still find great value in RSS and a Reader-like service. And a lot can happen in the next 3 months leading up to the July 1st shutdown.
So, the imminent death of Google Reader could be just the trigger-point needed to spark another renaissance in RSS readers like what we had circa 2005-2006.
And it looks like we will have several options. ReplaceReader is a neat little site I found for folks to suggest replacements for Google Reader.
I expect the biggest challenge (for me personally) in finding a suitable replacement will be finding a solution that I can still (easily) access (and seamlessly sync!) across multiple platforms. I've gotten spoiled-rotten by the native iOS and Android Google Reader clients. And whatever I pick, I want to make sure that I still have some exit-paths in case that service closes-up shop for some reason. Some kind of open-source/self-host option could be nice so that I can control my own data, but then again needing to maintain a DB-backed website isn't really something I want to do anymore.
Tiny Tiny RSS is an open-source self-host option which looks pretty mature. This looks promising if all you need is a simple web interface.
The most interesting option I've seen so far is NewsBlur. It's open-source (on GitHub) and self-host-able (nice to know I have options) but also has a paid hosted option. And it looks extremely polished and simply gorgeous. It keeps the same simple/functional interface principles as Google Reader while updating the UI for 2013. And there are iOS and Android clients so that I can still access my news on whatever device I want.
I expect (or at least hope) there will be a flurry of activity in the RSS reader space in the next few months leading up to the July 1st shutdown. It will be interesting to see what alternatives the community embraces. I plan to watch the space for a while before committing to any particular option, to see which options rise to the top.
I plan to write a follow-up post in a few months detailing what option I end-up going with…
ack is a search tool like `grep`. It's aimed at programmers and by default will only search a white-list of known file-extensions, so that it will only search the "code" in a directory.
`ack` looks at your `~/.ackrc` file to get any customized "default" settings you want. I use my (user-level) `~/.ackrc` to enable some personalized default options, e.g. color-ize output, always use a `$PAGER`, sort the output by filename, etc.
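For instance, mine has options along these lines in it (an `.ackrc` is just one command-line option per line; the exact pager is personal taste):

```
--color
--pager=less -RFX
--sort-files
```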
But sometimes I want to have directory-level (or project-level) additional settings, namely to always exclude/ignore certain directories when searching at the project-level.
For example, here are the exclusions I want for my Octopress build environment:
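In `.ackrc` terms, that's a handful of `--ignore-dir` lines (these directory names are from my Octopress tree; yours will differ):

```
--ignore-dir=public
--ignore-dir=_deploy
--ignore-dir=.themes
```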
So, I wrote-up a little `ack` wrapper script: ack-wrapper. It crawls the directory tree looking for an additional "local" `.ackrc` file, starting from the current working directory and crawling up through any parent directories until we find a `.ackrc` file (or until we reach either `$HOME` or `/`).
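The linked script is the real deal; the core idea boils down to something like this simplified sketch (it doesn't handle options containing whitespace):

```sh
#!/bin/sh
# Walk up from $PWD looking for a "local" .ackrc, stopping at $HOME or /.
extra_opts=""
dir=$PWD
while [ "$dir" != "/" ] && [ "$dir" != "$HOME" ]; do
    if [ -f "$dir/.ackrc" ]; then
        extra_opts=$(tr '\n' ' ' < "$dir/.ackrc")
        break
    fi
    dir=$(dirname "$dir")
done
exec ack $extra_opts "$@"   # word-splitting of $extra_opts is intentional
```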
(Update 2012-03-10: It looks like Ack v2.0 supports PWD .ackrc files natively, and has some other neat enhancements to boot!)
Then, to install it:

- Copy the script to somewhere in your `$PATH` (I like having a `~/bin/` directory), and make sure it's executable (`chmod 755 path/to/ack-wrapper`).
- Modify your `.bashrc` to `alias ack=ack-wrapper`, so that running `ack ...` will first call the wrapper script, search for any "local" `.ackrc` files, insert any additional options found into the original supplied command-line arguments, and finally call the (real) `ack` executable.

But I didn't realize there were other alternatives…
While reading through the back-archives of one of my new favorite finance blogs, I came across an interesting article, "Our New $10.00 Per Month iPhone Plans", talking about switching to a monthly pre-paid plan and paying only $10/month for an iPhone cellular plan. I had heard mention of some of these "pre-paid" cellular vendors before but had never really looked into them much or really understood what they were:
Many of these new options are called Mobile Virtual Network Operators (MVNOs), and they are in fact just re-selling access to the bigger carriers’ networks. So you get the same reception, coverage, and reliability as you had before.
I had never realized! Same cellular network/service, you're just paying a different middle-man. And there's quite a long list of these MVNOs, with lots of options for each of the major US mobile operators: AT&T, Sprint, Verizon, and T-Mobile. Reading the comments on that article was where a lot of the gold was; there's lots of great information in there. A lot of these MVNOs have been around for quite a while and are pretty well-established.
So, since my iPhone 4 hardware was still giving me reasonable performance, I decided to skip the expected hardware-upgrade, stick with my current handset, and try to milk it for all it's worth.
After doing some more reading, I ended-up choosing Airvoice Wireless because they seem to be one of the favorite (and well-established) AT&T-based MVNOs and they have lots of different plan options.
Since my phone was already set up to run on AT&T's GSM network, I didn't even need a carrier-unlock to jump to one of the AT&T MVNOs, because it's still the same carrier network behind-the-scenes. Though, truth be told, since I was out of my 2-year contract with AT&T, I did make use of AT&T's free carrier-unlock before ditching my AT&T plan so that I could use my handset when traveling internationally should the opportunity arise…
I considered going with the ultra-frugal $10/month Talk & Text plan, but that seemed a bit too restrictive. I'm coming from my AT&T plan, which had 450 anytime minutes (though I only used on average ~100 minutes per month and had a huge rollover pool) and my grandfathered-in Unlimited data plan from my original iPhone 3G plan. I usually averaged around 200MB-300MB of data per month, so I wasn't really reaping much from my Unlimited data-plan. The ultra-frugal $10/month plan would mean I'd really need to scrutinize my cellular data usage ($0.33/MB adds up quick!), as in turning-off cellular data most of the time and only turning it on when I absolutely needed it. That was just a bit too extreme for me — too much penny-pinching.
So, to make my transition from AT&T to Airvoice Wireless as painless as possible, I opted for their $40/month Unlimited Plan with Data which has unlimited talk & text (neither of which I use much of) and 500MB of data per month.
Here's a quick summary of the setup/transition process…
…the process ends with the AIRVOICE WIRELESS vendor banner showing on the upper-left of your iPhone screen.

Easy-peasy.
It's only been a few weeks so far, but all-in-all the transition has been extremely painless.
Even if some months I do happen to go over my 500MB/month cap, I can simply drop another $40 into my account early, and that restarts the 30-day expiration clock. As I understand it, a given prepayment expires either after 30 days or after you've used up your quota, whichever comes first.
If I find the $40/month plan is either too limiting or I'm consistently not using everything I'm paying for, I can always easily switch to a different plan/tier. Based on the Airvoice Wireless terms of service, it sounds like they won't refund a partial month at all; you need to make the cut at the end of your 30-day cycle (to make the most of your money), and you need to call customer-service to switch your account over to the new plan. I (obviously) haven't tried this yet, but it sounds pretty painless.
And by saving $500/year on my cell phone bill, I'm saving basically the price of a brand-new (unlocked) flagship phone. So, I could still upgrade my hardware and come-out ahead, because I'm not tied into an expensive contract with one of the big cellular companies.
It will be interesting to see how this plays-out over the next few months, to see what happens to my service at the end of a 30-day cycle (e.g. how apparent will it be that my service has cut off and that I need to pump more money into my account?), to see how often I use up my 500MB quota before the 30 days are up, etc. I'm just darn excited to have interesting new options/alternatives.
He's been able to put into clear words what I've been starting to realize lately:
The bottom line is this: by focusing on happiness itself, you can lead a much better life than those who focus on convenience, luxury, and following the lead of the financially illiterate herd that is the TV-ad-absorbing Middle Class of the United States today (and most of the other rich countries). Happiness comes from many sources, but none of these sources involve car or purse upgrades. No matter what the herd or the TV set tells you, this is the truth.
Living within your means and cutting-back on needless spending means lower monthly expenses. By cutting those recurring expenses, you're able to save more money. The more money you save, the more compound interest and investing work in your favor. By learning to live more frugally, you're changing your lifestyle and mindset, and you end-up not needing as much money for your post-retirement lifestyle. It becomes a snowball effect, and you may even be able to retire a lot earlier than you expected.
I use Cygwin on my Windows machines at work to get a UNIX-y environment with the usual tools: `screen`, `ssh`, `grep`, etc. I use SSH's public key authentication pretty extensively to get password-less authentication, making it dead-easy (and quick!) to SSH around to different machines.
On some of the non-UNIX machines at work, I couldn't get SSH public-key auth working, but those machines do support Kerberos auth (binding to Active Directory). Based on my Google searches, all I could find were articles talking about compiling OpenSSH from source to get a working Kerberos-enabled version of OpenSSH on Cygwin. So, that made it sound like it would be a pain to get this working. But, after doing some more playing around, I found this was actually easy to setup once you understand the various pieces. Since I couldn't find any helpful information online when I first tried to get this working, I figured I'd write up what worked for me in case that helps other people.
Install the `openssh` package. This gets you the OpenSSH client tools, e.g. `ssh`, `ssh-agent`, etc.
Install the `heimdal` package, which supplies an implementation of the Kerberos tools. This gets you things like `kinit`, `klist`, etc.
You may want to configure your `/etc/krb5.conf` file to list a default realm so that you don't need to specify it when doing the `kinit` later on.
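Something along these lines (the realm name is an example; yours would be your Active Directory domain):

```
[libdefaults]
    default_realm = EXAMPLE.COM
```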
Alternatively, rather than fiddling with `/etc/krb5.conf` on my Cygwin install, I opted to use the `KRB5_CONFIG` environment variable (see the `kinit` manpage) to point to a `~/.krb5.conf` file instead, to keep my Kerberos config confined to my `$HOME` directory (since I keep my `$HOME` directory under version control).
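That's just a one-liner in your shell start-up file:

```sh
# In ~/.bashrc: keep the Kerberos config confined to $HOME.
export KRB5_CONFIG=~/.krb5.conf
```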
Modify your `~/.ssh/config` file to enable GSSAPI authentication.
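For example (the host pattern is made up, and `GSSAPIDelegateCredentials` is optional; it forwards your ticket on to the remote host):

```
Host *.corp.example.com
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes
```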
Then, day-to-day usage looks like this:

- Before you `ssh` to a remote machine where you want to use Kerberized credentials, simply run `kinit` to acquire a new Kerberos ticket. (Pro tip: you can run `klist` to list all your active Kerberos tickets and their expiration dates.)
- With the `GSSAPIAuthentication` directive in your `.ssh/config` file, that should enable GSSAPI authentication for free. There's also a `-K` param to the `ssh` command which talks about enabling GSSAPI auth and forwarding; I'm not entirely sure what that controls, but my guess is that it's for opting into GSSAPI auth mode if you don't have that directive in your `.ssh/config` file.
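A typical session then looks something like this (the hostname is a placeholder):

```sh
$ kinit                            # prompts for your Kerberos/AD password
$ klist                            # sanity-check the ticket and its expiration
$ ssh somehost.corp.example.com    # no password prompt -- the ticket handles it
```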
I hope this helps someone else who's trying to get Kerberized SSH working on Cygwin. Happy SSH'ing!
Virtualization is huge in the enterprise-space these days. After talking with some friends from work about what kind of virtualization strategies we're using for our in-house datacenter, I became enticed by the idea of running a "bare metal" hypervisor: install virtualization software as the primary OS on the physical hardware and spin up VMs for different logical needs.
Over the years, I've accumulated a fair amount of old computers. Back in the day, those extra computers gave me an opportunity to try-out new OSes/software: playing with different Ubuntu configurations, tinkering with different *BSD flavors, etc. But once my spare-time dried up, all that old computer hardware just became clutter taking-up space and it seemed like such a pain to fire up an old computer just to tinker with some new configuration.
So, going virtualized held a lot of (obvious) appeal to me: minimize my physical hardware (less power consumed, less physical space, etc.) while letting me easily tinker with new configurations.
Based on previous research, I knew that I really wanted to move to some kind of ZFS-backed storage solution, to protect/maintain the integrity of my ever-increasing digital collection.
After following the "zfs-discuss" mailing list for a few months, I concluded that, given my simple home-needs, a simple mirrored configuration (along with proper backups) was the best solution. If I need more space, I can expand horizontally: add another pair of drives (of whatever size is appropriate) to expand my pool. It was this easy expansion which pushed me towards a mirrored configuration rather than a RAID-Z style configuration. The trouble with RAID-Z is that you can't add new devices to a vdev after you've initially created it; the only way to "grow" a RAID-Z vdev is by replacing each of the individual drives (one at a time, waiting for each to resilver) and setting the "autoexpand" property on the pool so that ZFS will auto-expand the pool based on the new common maximum size of the individual drives. That's a whole lot of moving parts (pun intended), and replacing all the drives in the pool isn't quite my idea of easy expansion. Mirroring just seems easier, given that I don't need lots of individual drives for raw performance reasons.
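That horizontal expansion is a single command (pool and device names are examples):

```sh
# Add a second mirrored pair of drives to the existing "tank" pool:
zpool add tank mirror c2t3d0 c2t4d0
```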
Somewhere along the line in my research, I stumbled upon napp-it, which introduced me to this idea of an "all-in-one" fileserver: using VMWare ESXi as a bare-metal hypervisor, having a Solaris-based VM running on the ESXi datastore which will control the mass-storage ZFS pool, and then exporting an NFS share back out to ESXi so that you can store the bulk of the VMs on ZFS-backed storage. It's a bit complicated at first glance, but it performs great (thanks to hardware-passthrough) and lets me easily manage the bulk of my VMs on ZFS-backed storage.
To make use of ESXi's hardware-passthrough support, you need to run server-grade hardware. For me, that meant going with a Xeon-based CPU and an appropriate motherboard chipset. But that also brought other server-grade wins, like ECC memory and IPMI (easy remote-control of the console, which is awesome).
Here is the hardware I ended-up going with:
The Xeon chipset was needed for some of the hardware passthrough features in ESXi, e.g. passing through the HBA card directly to the Solaris-based VM so that ZFS can have direct access to the physical drives.
The 650W power-supply ended-up being way more than I needed. The server uses less than 100W when idle, though that PSU is still quite efficient even with such low power draw.
The setup has been awesome so far. Even just getting to play around with server-grade hardware has been an eye-opening experience. Being able to remote-control the server, e.g. mounting an ISO remotely over the Java-based client and installing the OS on the computer all without needing to hook-up a keyboard or monitor, has been a revelation. (Goodbye old CRT monitor that I used to keep around for my server closet!)
I love the flexibility of ESXi. I have an Ubuntu VM for development/testing, a FreeBSD VM for running Mediatomb (PS3 media-server), etc. It even let me play around with different Solaris flavors while trying to figure out what OS I wanted to use for the ZFS back-end. It's just so easy to spin-up new VMs to test an idea or just to play around.
I started out using Solaris 11 Express (free), which ran fine for the better part of a year. I tried upgrading to Solaris 11 11/11 earlier this year, only to find that Solaris 11 apparently doesn't play nicely with ESXi. From there, I jumped over to using OpenIndiana, the open-source spin-off of the now-defunct OpenSolaris lineage.
ZFS has been a huge win too. Taking nightly snapshots makes it dead-easy to look back in time and see how things were several months ago. The snapshots also make it easy to send the incremental differences to a backup pool. The built-in CIFS server makes it dead-easy to mount the shares on Windows, and the filesystem snapshots are easily accessible via the "Previous Versions" tab in Windows Explorer. I also really love the idea of the "pool". I can create different filesystems to group/organize my data (e.g. photos vs. music vs. backups), enable compression on a per-filesystem basis, set quotas per filesystem, etc.
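All of that per-filesystem management is just a few commands (dataset names are examples):

```sh
zfs create tank/photos                  # one filesystem per data category
zfs set compression=on tank/backups     # compression where it pays off
zfs set quota=200G tank/backups         # cap a filesystem's growth
zfs snapshot tank/photos@2012-06-01     # cheap nightly point-in-time snapshot
```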
I really love the "all-in-one" idea: ZFS reliability combined with the flexibility of VMWare ESXi.
At work, we rely on tools like `svn blame` to drill-down into code-history. Our central SVN repository is some 4-5 years old and a whopping 300GB+ on-disk. (Yowza!) What we'd really like to do is dump just the `/trunk` history out to a new repo and roll forward with that, ditching any historical baggage from old topic branches (`/branches`). The trouble is, I haven't been able to find any tools to do this.
So, I ended-up writing my own tool to do this. But, first, some back-story…
In doing some searches for variations around "svn repo filter", I found a lot of people pointing to the "svndumpfilter" utility as the tool-of-choice. Sadly, it doesn't seem to be quite "smart" enough to do what I'd like it to do. It's aimed at filtering an `svnadmin dump` stream, taking only certain paths in the SVN "filesystem". That works fine if you're trying to take an isolated folder/project from the repository, but if that folder/project has ever been merged into from any paths outside of the target filter path, then things fall apart. The "trouble" is the copy-froms…
For example, say that you have a repo with a typical trunk/tags/branches setup. Say you create a new topic branch (e.g. `svn copy /trunk /branches/my-fix && svn co /branches/my-fix`) and happily work on your sandboxed branch. Say that you decide to rename some of the pre-existing files/folders, so you run the appropriate `svn move` commands and happily commit those changes to your branch. Once everything is working happily, you go to merge these changes into `/trunk` and that all works great. After the commit, if you run `svn log -v -l1 /trunk` to look at the details of the most recent commit to trunk, you'll see something like this:
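(I'm reconstructing the output from memory; the revision numbers and names are made up, but the shape is right.)

```
------------------------------------------------------------------------
r1240 | alice | 2012-01-15 10:32:41 -0500 (Sun, 15 Jan 2012) | 1 line
Changed paths:
   D /trunk/SomeFolder
   A /trunk/RenamedFolder (from /branches/my-fix/RenamedFolder:1239)

Merge the my-fix branch back into trunk.
------------------------------------------------------------------------
```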
…which only describes the (top-level) folder rename, not any adds/modifications/renames/etc. that might have happened inside that folder on the branch. At the full repo-level, SVN can get away with just doing a `svn copy` from the branch to trunk, so that when you do an `svn log` on the new "RenamedFolder" path it will walk from trunk back into that originating topic branch and follow the rename (copy or move) that happened within.
The `svnadmin dump` will show the same "copy-from" action as `svn log -v` did, since the dump stream will need to be able to recreate that same ancestry. When `svndumpfilter` sees this, it throws up its hands and returns an error, because it doesn't know how to recreate the logical history that happened between when that branch originally forked from trunk versus the final state of that branch. That is, assuming the copy-from path even has any ancestry back to trunk…
In my searches, I had found a few folks who had hacked together their own solutions. None of those quite fit my problem at hand, so naturally I decided to start working on my own solution.
The closest fit I found was a project named "svn2svn" hosted on Google Code. It had copied parts of the "hgsvn" project (synchronizing between Mercurial and Subversion repositories) and slapped them together in a way that worked for the original author's needs. It used "`svn log`" to walk the entire history of a given path in a source repository and then manually replayed those changes to a working-copy of some path in a target repository. Revision by revision, it would replay the delta, recreating the history of just the source path in the target repo.
This was oh-so-close to what I wanted, except that it also didn't correctly handle the copy-from case. And the repository I really wanted to replay was littered with copy-froms, since I'm trying to replay just the history from `/trunk` and we create topic-branches for everything.
So, I started with the svn2svn project, got familiar with the code, and started hacking to extend the code to solve my problem at hand.
The net-result is my own (nearly completely rewritten) take on the problem, a project which I'm also calling svn2svn.
I've made several enhancements to the original script:
Full ancestry (copy-from) support. This was the tricky part. It took several different iterations/rewrites to get something which worked for all the different edge-cases. The general idea here is that we can use `svn log` to walk backwards through the ancestry on a copy-from case: if we can trace back our ancestry to the same source path we're replaying, then we can do an `svn copy` from that original parent and then do an `svn export` to update the contents of all the files to match the final copy-from version. There's also some extra recursion that needs to happen here, to handle cases where child files got renamed inside of a parent-renamed folder. There are other edge-cases, like files getting replaced inside a parent-renamed folder.
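In hand-run shell terms, replaying that "RenamedFolder" commit from the earlier example into the target working-copy amounts to something like this (URLs and revision numbers are illustrative, and `svn export` here is only a rough stand-in for the real content-sync logic):

```sh
cd target-wc
svn move SomeFolder RenamedFolder   # recreate the ancestry from the replayed parent
svn export --force "$SOURCE_URL/trunk/RenamedFolder@1240" RenamedFolder
svn commit -m "Replay of source r1240"
```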
Use revprops for source-tracking and resume support. Subversion has revision properties, key+value pairs that are associated with a particular revision/commit. I'm setting some `svn2svn:*` rev-props to track the source URL, source repository UUID (i.e. in case the source URL now points at a physically different repo), and source revision #. This is all needed for proper resume support, since for the ancestry support we have to maintain a mapping-table of source_rev -> target_rev, so that when we find a copy-from from some revision # in the source repo, we can map this to the equivalent revision # in the target repo and do an equivalent `svn copy` command.
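For reference, setting a rev-prop from the command-line looks like this (the exact `svn2svn:*` property names here are just illustrative):

```sh
# Stamp target revision 42 with where it came from in the source repo.
# (The target repo needs its pre-revprop-change hook enabled for this.)
svn propset --revprop -r 42 svn2svn:source_rev 1240 "$TARGET_URL"
svn propset --revprop -r 42 svn2svn:source_url "$SOURCE_URL/trunk" "$TARGET_URL"
```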
Better verbosity output, and optional debug output. As I was playing with the rewrite, I quickly found I needed better debug output, to see which shell commands were being run and to display general debug/status messages as we do all this new complicated ancestry-walking logic. Bonus: the debugging messages have colored output, using ANSI escape codes which all self-respecting terminal emulators should respect.
All commits (including the initial import) go through the same central code-path, which means we could run a (client-side) pre-commit script to scrub the contents of the target working-copy before each commit. This is where the power of doing the manual replay of changes really starts to shine. We have full control over the pending changes, which means that if your original trunk history had some files which you didn't want to transition into the new replayed repo, you could easily `svn rm` those files from the working-copy before the commit happens. That could be as simple as excluding certain fixed paths, but it could be a lot more flexible, like searching the entire working-copy tree and removing any files which match a certain file-name. Heck, you could even modify the working-copy file-contents at this point…or add brand-new files if you want. We're replaying the SVN history here with full control of the target content, so this opens a lot of interesting options.
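For example, a scrub script could be as simple as (the file pattern is made up):

```sh
#!/bin/sh
# Drop scratch files from the pending working-copy before each replayed commit:
find . -name .svn -prune -o -name '*.tmp' -print | xargs -r svn rm --force
```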
Check-out the svn2svn project page for more details.
Also, I have the project mirrored on GitHub, so please feel free to fork the project, send me issues/enhancements, or send me pull-requests for any tweaks you've made.
It all started when I stumbled across the idea of tracking all your `$HOME` directory "dot-files" in a Git repository. That single idea led me on a personal crusade to better understand all the different configuration files that live in your UNIX (Linux, Mac OSX, etc.) home directory, and the end-result was creating my own "dotfiles" Git repository for synchronizing/tracking/deploying my dot-files between the various machines I work upon.
I learned a lot of neat stuff along the way, including some config options which I never knew were there and some tricks which really optimized my command-line shell experience.
I found it to be a very natural fit to use Git to track the edits to my dot-files. It makes it dead-easy to see what changes I've made since the last commit and easy to commit those changes and push them to a central Git repository.
Git also makes it dead-easy to "bootstrap" my dot-files environment into a brand-new home directory:
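One way that bootstrap can work, assuming the repository tracks `$HOME` directly (the URL is a placeholder):

```sh
cd ~
git init
git remote add origin git@github.com:someuser/dotfiles.git
git pull origin master   # the dot-files land straight in $HOME
```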
Ryan has a `~/bin/` directory in his dot-files repo with all kinds of nifty little utilities. There's a bunch of great Git-related utilities in there, e.g.:
- git-up / git-reup: like `git-pull`, but show a short and sexy log of changes immediately after merging (git-up) or rebasing (git-reup).

And there's also some great general-purpose utilities in there, e.g.:
- ack: a replacement for `grep`, designed for programmers with large trees of heterogeneous source code. Very handy for recursive file-searching.

Ryan has lots of nifty tricks in his `.gitconfig` file, tying in short-cuts for calling the helper utilities in `~/bin/`. Also, you can setup Git to use ANSI colors for various sub-commands (diff, status), which is very handy.
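The color settings are just a few lines of `.gitconfig`; the aliases here are my own illustration of the short-cut idea:

```ini
[alias]
    up = !git-up        # call the ~/bin/ helpers via git aliases
    reup = !git-reup
[color]
    diff = auto
    status = auto
    branch = auto
```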
There's a lot of neat options in the `.inputrc` file, for binding key-sequences to various command-line options. For example:
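The two bindings in question, in standard readline syntax:

```
"\C-p": history-search-backward
"\C-n": history-search-forward
```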
…are the two single-most time-saving key-bindings that I stumbled upon. This lets you type a partial command and then use `Ctrl-n` and `Ctrl-p` to completion-match the rest of the line based on your command history. Awesome! I find this incredibly useful to completion-match hostnames for `ssh` commands or other longer commands which I often type. Completion-matching for-the-win!
Before setting out on this project, I had used Vi some but hadn't really explored (or understood) the full power that is Vim. Digging into the `.vimrc` file, reading-up on all the various config options, and digging into all the `~/.vim/` plugins which Ryan had was an eye-opening experience. I came to realize just how much you could extend and customize Vim to meet your needs. Simple things like getting syntax-highlighting and color-schemes set by default made for an oh-so-much more pleasant Vim experience.
This all led me to learning a lot more about Vim and becoming much more proficient using Vim. It's now my text-based editor of choice. Once you get to know the key-bindings and the various commands, the command-chaining that you can do in Vim is incredibly powerful.
If you're at all interested in learning more about customizing your Unix shell environment, looking at other people's dot-files is a great learning experience. And publishing your dot-files on GitHub is a great way to share your shell-environment with the world so that others can learn and explore.
Octopress is an obsessively designed framework for Jekyll blogging. It’s easy to configure and easy to deploy. Sweet huh? – Octopress
Sweet indeed! The more I read about it, the more intrigued I became. After a few hours spent exploring the code, importing the posts/pages from my Wordpress install into Octopress, and tweaking the default theme to my taste, I've taken the jump and moved my site over to using the Octopress platform.
Octopress is a simple blogging platform aimed at hackers. It's built around the Jekyll engine, the blog-aware static site generator that powers GitHub Pages. The "database" for the site is simply a collection of flat text files using Markdown (or Textile) mark-up. You can set up either "posts" or "pages", all as you'd expect. When you're ready to deploy, you use `Rakefile` automation to "generate" the static site files.
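The day-to-day loop is just a few rake tasks:

```sh
rake generate   # build the static site into public/
rake preview    # serve a local preview at http://localhost:4000
rake deploy     # push the generated site to your web host
```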
It also has a great default theme (IMHO) – very clean and simple – which was actually my original attraction. It was only after reading more about the project that I realized it had huge appeal to my geek-side too.
It was Matt Gemmell's "Blogging With Octopress" post which resonated the most with me:
WordPress is excellent, but it’s over-featured for what I need, and its PHP/MySQL guts are opaque. I don’t really like the idea of all my writing being inside a big database either; it’s a single point of failure, and that makes me uneasy. – Matt Gemmell
For what I do with my website, Wordpress was overkill. It ended-up being Yet-Another-Thing-I-Needed-To-Maintain-Security-Patches-On™.
I was immediately attracted to the simplicity which Octopress provides: editing pages and posts as plain-text files, with everything static so there are no security-patches or vulnerabilities to worry about, and the whole site easily portable and backup-able. It combines a lot of things which I have grown to love: I can keep all my content in a Git repository so that I can publish from multiple locations (if need be), I can work on the site entirely in text (`vim` + `screen`), and it's dead easy to deploy and backup.
It really didn't take that long to migrate everything over to Octopress. It took a few hours to get a cleaned-up import of my old Wordpress content. Based on the recommendation of others, I used exitwp to create Jekyll-style posts based on my Wordpress content. This went fine for the most part, but there were some parts in the Wordpress export file that I needed to fiddle with to keep the exitwp script from crashing. The end result was that I had a directory full of simple `*.markdown` files which represented all the content from my Wordpress site. I spent a few hours going through those files and cleaning up the Markdown syntax until it matched what I wanted. Some of my original Wordpress posts were a mixture of Wordpress mark-up and raw HTML, and that threw the exitwp parser for a loop in some places. It did a great job overall though.
From there, I spent some time looking through the guts of the Octopress source code, getting acquainted with things and seeing how I might be able to fiddle with some of its inner fiddly-bits. It's written with native support for customization, trying to keep a clear separation between the core "code" of the site (which can be overwritten during a future upgrade) versus any "customization" files which the user might make changes to. It's a fantastic paradigm and one that you don't often see too much of in projects.
Last but not least, I spent some time tweaking the theme to my taste (…a bit of pepper here, a bit of salt there…). As I mentioned above, it natively carves out files which are earmarked for user-customizations, where you can tweak the CSS (SASS), colors, page/sidebar dimensions, etc. It's all just extremely well-thought-out and a joy to use and extend.
This has been a fun pet project for me. It's the first time I've used a Ruby environment.
It's also the first time I've played with SASS (`*.scss`), and boy is it going to be hard to go back to writing plain-old CSS. SASS's color functions are ridiculously slick: being able to do things like `desaturate(lighten($nav-bg, 8), 15)` in a `*.scss` file is awesome and makes tweaking a site's color-scheme oh-so-much easier.
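For example, you can derive an entire hover-state palette from a single base variable (the variable names here are just for illustration):

```scss
$nav-bg: #263347;
$nav-bg-hover: lighten($nav-bg, 8);
$nav-color: desaturate(lighten($nav-bg, 50), 20);
```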
I'm a bit sad to have lost the lifestream (wp-lifestream) functionality from my old Wordpress site. But then again, I'm already doing full exports of most of my lifestream'd websites (Google Reader, Delicious), so it might not be too hard to generate static lifestream pages using a cronjob or something. That's a project for another day…
…`*.markdown` files.

In reading through different online forums, the topic of ZFS-based systems kept coming-up again and again. I had heard the term "ZFS" thrown around before but I had never really spent the time to read-up on it. It's just another filesystem, right? How fancy can it be?
Well, ZFS is just damn cool. ZFS is a lot more than "just another filesystem"…
One example: filesystems can be replicated to another machine (via `zfs send` and `zfs receive`), snapshots and all, making it dead-easy to backup ZFS filesystems to remote machines.
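That replication is literally just a pipe (pool, dataset, and host names are examples):

```sh
zfs snapshot tank/media@weekly
zfs send tank/media@weekly | ssh backupbox zfs receive backup/media
```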
Very, very cool stuff! The more I read about it, the more I wanted it for my next-generation file-server solution. In particular, I really love the idea of end-to-end checksums. I want to make darn sure that the OS can realize when the disks are serving up garbage-data or if somehow bits have mysteriously been corrupted on-disk (as unlikely as that might be).
In reading about all these various historical problems that ZFS sets out to address, I'm really surprised that more OSes haven't tried to adopt superior filesystem technology. The classic problem for so many Windows-users seems to be that their installs get "corrupt", usually due to folks forcibly powering-off their computers before the hard-drives have a chance to flush their write-buffers to disk. And then folks get to cope with BSODs or broken Windows installs. Sad times for everyone. Filesystems seem like such a foundational part of the OS that you'd expect companies to spend the time to make them as robust as possible. I read some rumors that Apple was planning to incorporate ZFS into OSX a few years back, but apparently that fell apart for some reason. Sad…
Kudos to the Sun folks for designing such a fantastic filesystem and then open-sourcing it!
I run an Ubuntu box at home, and it was easy to install the `dovecot-imapd` package to get an IMAP server installed. Since my box is behind my router/firewall, I wasn't that concerned with tweaking Dovecot's default configuration, but I'm sure you could fiddle with the config to ensure that Dovecot only binds to `127.0.0.1`.
From there, it's just a matter of using `imapsync`, just like I ended-up using previously to initially transfer all my e-mail to my Google Apps account. Here's the script (sketched below).
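The account names, pass-file paths, and folder regexes here are illustrative rather than my exact values:

```sh
#!/bin/sh
imapsync \
  --host1 imap.gmail.com --ssl1 \
  --user1 username@somedomain --passfile1 ~/.passfile-gmail \
  --host2 localhost \
  --user2 username --passfile2 ~/.passfile-local \
  --include '\[Gmail\]/All Mail' --include '\[Gmail\]/Sent Mail' \
  --regextrans2 's!^\[Gmail\]!username@somedomain!'
```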
The `--regextrans2` option rewrites IMAP folder-names on-the-fly, so that my local IMAP folder structure can be different than the structure on Gmail's server. For example, the top Gmail IMAP folder is `[Gmail]`, which wasn't all that useful for me, so instead I rewrote that top-level folder to be `username@somedomain` so that the local folder name (e.g. in `~/mail/`) would match the source e-mail address.
You can also use the `--include` option to decide which IMAP folders to copy. I opted to just copy "All Mail" and "Sent Mail", which gives me a copy of all my mail but doesn't preserve any information about the labels I might have had assigned to those messages in Gmail.
The initial copy will definitely take a few hours (or more), depending on how much e-mail you have in your Gmail account. But this works great for me and stores the mail in "mbox" format locally so I can even access the mail locally via mutt/alpine/etc.
When I first moved my e-mail over to Google Apps, I found a helpful write-up about using `imapsync` to push data to a new Google Apps e-mail address: http://gemal.dk/blog/2008/04/08/completed_the_gmail_migration/
I adapted that script for my own needs, and I was able to successfully copy all the mail from my regular Gmail account to my new Google Apps account.
Here is the shape of the final script I ended-up with:
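(The addresses and pass-file paths below are placeholders; the options are the ones discussed next.)

```sh
#!/bin/sh
imapsync \
  --host1 imap.gmail.com --ssl1 \
  --user1 oldname@gmail.com --passfile1 ~/.passfile1 \
  --host2 imap.gmail.com --ssl2 \
  --user2 newname@mydomain.com --passfile2 ~/.passfile2 \
  --syncinternaldates --useheader 'Message-Id' --skipsize
```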
The `--syncinternaldates`, `--useheader 'Message-Id'`, and `--skipsize` options are all recommended by the imapsync FAQ (search for "Gmail"): http://www.linux-france.org/prj/imapsync/FAQ
I opted to use the `--passfile1`/`--passfile2` options rather than passing a plain-text password in via a command-line param for two reasons: first, because anyone with access to your system can use `ps` to view active processes and hence would see your password plain-as-day; second, because it just better abstracts the script-logic from the password-text, and we can control the file-permissions of those password files.
All your Gmail labels should sync-over automagically. Since this is going Google-to-Google, and Google presents labels as separate IMAP folders in their Gmail IMAP implementation, the process above should sync and preserve all your Gmail labels for free.
It's been two years since I've used this script, but I seem to remember it working pretty painlessly. I wanted to share it here for anyone else who might be looking to do this same thing.
(via kotaku.com)
(via autocompleteme.com)
Google auto-complete hilarity.
(via appolicious.com)
After hearing some rave-reviews from various places over the past few days, I picked-up GeoDefense for the iPhone this evening. Wow, it’s a blast! Nice balanced game-play, good amount of challenge, great graphics. Well worth the $2.
I find that I don't always have the inspiration or motivation to write blog posts that often, but I do have a Web 2.0 foot-print that other people might want to watch, whether it be sharing interesting news tidbits on Google Reader, bookmarking useful webpages on Del.icio.us, or noting what songs I love on Last.fm.
Hence, the birth of this lifestream: all my various online activities aggregated and presented in a unified "lifestream". This gives others an easy way to check-out what I'm up to. If you have any comments or feedback, drop me an e-mail!
More and more lately, I've been trying to find web apps that can replace the desktop applications I use on a daily basis. It's just so incredibly convenient to have the portability to be able to do everything you need from inside a web-browser, and more importantly to have the persistent-state data stored on the server-side rather than on each individual machine I use throughout the day. I'm already a huge fan of webapps like del.icio.us (for tracking all my bookmarks) and Google Reader (for all my RSS aggregation and reading needs), and finding a web-based password manager would be ideal. It really becomes a chore remembering what username or e-mail address I registered with at various websites: Amazon, eBay, Newegg, Google, Yahoo, credit cards, travel sites, online bill pay, banks, etc. (Oy!) It's just a lot of identities to juggle. I've thought about using KeePass or something, but that's a desktop app and I'd have to keep my database in-sync between work and home.
What really interested me about Clipperz is their claim at being a "zero-knowledge" web-application: they only store encrypted data on their server-side, and use Javascript to decrypt the data client-side based on the username and password which you supply. Your password is never sent to the server at all; it's merely used to decrypt the data locally. That's a pretty slick idea, IMO. But, I'm also a little nervous about trusting some third-party website with all my personal sensitive online account information. So, I'm tempted to sit down and write my own webapp to recreate that wheel, except that it's all code I can vouch for and know that it's not secretly sending my data off to some third-party site. You can already find open-source Javascript-based implementations of the various crypto algorithms that I'd need to use. And it seems like a simple AJAX-based webapp to save/load data from the server.
Also really intriguing is Clipperz' one-click-sign-in feature. They have a Javascript bookmark which pulls HTML FORM information off a given login page and then allows you to link the various FORM INPUT fields with the appropriate data-fields on the "card" for that website (e.g. username, password). The one-click sign-in then just has to do an HTTP POST to the desired web-address with the correct FORM data to act just like the real login page. It's simple enough, but that seems like the real time-saver here. Not only could you have a webapp which stores all your personal data, but it also provides a quick launchpad to login any site you need. That just takes it a step further and makes the webapp almost a blackbox: you don't really need to know what your username and password is anymore because the webapp database knows it and can feed it to the target website's login form for you.
This just seems like a really awesome idea, and I'm really tempted to just use Clipperz natively so that I don't have to re-invent the wheel, but I'm still just nervous about using someone else's website as a database to store all my personal sensitive data. It's basically just a trust issue, and I don't think I have any good reason to trust that their Javascript code will never do anything malicious. I'd much rather control the data myself on my private webserver. If only Clipperz was a SourceForge project… ;)
I already use a quasi-RAID system: I have two identical hdd's in my file-server box and I rsync the "master" drive to the "slave" drive every so often. Not only does this provide me some amount of rollback-ness (i.e. because I only rsync so often), but the decision to not RAID-mirror the drives was intentional: if the filesystem or partition table on the RAID somehow became corrupted, all my data could be lost.
But…what if my house burns down? (*gasp* Oh noes!!) Yes, that would be sad indeed! So, rather than just providing a single layer of redundancy locally, if I really want to invest in the survival of my important data, I really need to spread that data around; I need to diversify.
I've looked around at various web-based back-up solutions like Amazon's S3 service, but those don't seem very optimal for me because of the amount of data I want to backup (~150GB) and because it seems like it would be a PITA to do a full restore over my home cable internet pipe. Not to mention the monthly fees, paying someone else to store my data safe and sound. But, dare I trust my important (and partially sensitive) data to a stranger?
Currently, I'm tempted by a seemingly simple solution: just get an external USB hdd, mirror my data once locally, and then throw it at a friend's house and use rsync to keep it up-to-date. The main cost involved is the cost of the new hdd; there's no monthly fee because I already need to pay for my internet-access. And I can even return the favor by hosting drives on my end too. And this could even be expanded to multiple people, if you wanted to back-up your data in multiple off-sites. This seems almost too easy to me, but it seems perfectly effective. Anyway, it's making use of the hidden geek-factor: you're a geek and you have geek friends, so why not make the most of it and use them for geeky endeavors like helping each other back up each other's data? ;)
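The ongoing sync is then a one-liner over SSH (host and paths are placeholders):

```sh
rsync -avz --delete /data/ friend-host:/backups/my-data/
```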
The main problem I have with this plan is that my data wouldn't be encrypted at all. I'm not sure how paranoid I really need to be about my friends snooping around my data. Though, I think this is basically the same question as: what would happen if someone stole your computer? So, that's really more an argument that I should be locally encrypting my data so that even if prying eyes were to get at my local/master copy, they still wouldn't be able to do much with it.
Has anyone else given thought to getting more serious about backing up their data?