I recently completed the process of migrating Recycle-A-Bicycle to a new web host and CMS. I moved almost all of our content from Plone into Drupal, and learned a few things along the way. I have good reasons for making this move and I don’t want to get pounced on for my decision, so let us just start there: my reasons include hosting costs and the fact that our Plone host hasn’t been a good fit.
I hate to tempt you with titles that might suggest that I figured out how to automate the Plone-to-Drupal migration. I didn’t. What I did do was figure out a few nifty command-line options to get all the content from our old site into a usable format on my own laptop so that I could start plugging it into the new Drupal site.
One thing Plone lets you do is store pages with whatever shortnames you want. To get some consistency starting out, I pulled the whole site down thus:
wget
wget --mirror -p --html-extension --convert-links http://www.example.com
That ensured that all links were fixed so they’d work locally, and all pages had the html extension appended.
tidy
Then I copied /etc/tidy/tidy.conf to my home directory as .tidyrc and edited it to work just the way I like.
tidy -m ./*.html
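For anyone who has never touched a tidy config file, here is a minimal sketch of what one can look like. The option values are illustrative guesses, not the settings I actually used, and write_tidy_config is just a hypothetical helper name. It assumes tidy is installed when you get to the usage lines.

```shell
#!/bin/sh
# Write a small, hypothetical Tidy configuration to the path given as $1.
# The option values are illustrative assumptions, not the author's settings.
write_tidy_config() {
    cat > "$1" <<'EOF'
indent: auto
wrap: 0
tidy-mark: no
quiet: yes
EOF
}

# Usage (assumes tidy is installed and the mirrored pages are in the
# current directory):
#   write_tidy_config "$HOME/.tidyrc"
#   tidy -config "$HOME/.tidyrc" -m ./*.html
```

Passing the file explicitly with -config sidesteps the question of whether your build of tidy picks up ~/.tidyrc on its own.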
time for a real editor
With all of that done, I really needed a decent editor. I couldn’t get extended searches to work in Bluefish (nor could I sort out how to define a project), so I tried Screem, which seemed to be doing the trick, but I couldn’t figure out what it meant to search a “whole site”. Is that a project? And where do you define it? And how do you see what you’ve defined? Confusing. So I settled on Quanta, though diligent readers might have noticed that I wasn’t ever 100% thrilled with Quanta (KFileReplace is the real problem) and sometimes had to scoot back to gvim.
That is all. This wasn’t so much fun, but I think that wget and tidy are totally underused.
Plone gives me the willies. Drupal just makes me tilt my head quizically. Except I don’t think that’s a word.
Quizzically is totally a word. You just have to know how to spell it.
I secretly suspect that Plone is great. But if you really understand PHP, Apache and MySQL, Drupal is a lot easier. Also, if your Plone host stinks, anything is better. I’d hate to judge Apache based on my run-in with one fly-by-night ISP.
Just in case you are reading this and wondering why on earth I write about such mundane crap …
I am sitting at Recycle-A-Bicycle looking at the infinite page-not-found logs that have been churning up from the deep since we moved to Drupal. Most of the missing pages don’t even have referrers, which leads me to suspect that robots are involved. I’ve run the W3C Link Checker a few times over, so I know there aren’t broken links within the site (okay … there are a few, but I know what they are). So I figure one option for starting to identify real broken links from outside the site (as opposed to web crawler databases that host broken links) is to look at my Google Analytics account.
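One way to separate the robot noise from real broken inbound links is to keep only the 404s that arrived with a referrer. This is a rough sketch, assuming you can get at an Apache access log in the combined log format (the Drupal logs themselves may look different); real_404s is a made-up helper name and the log path in the usage line is hypothetical.

```shell
#!/bin/sh
# Print "count path referrer" for every 404 that arrived with a referrer,
# most frequent first. Assumes Apache "combined" log format, where the
# request, referrer, and user agent are the double-quoted fields.
real_404s() {
    awk -F'"' '
        {
            split($3, st, " ")    # st[1] is the status code
            split($2, req, " ")   # req[2] is the requested path
            if (st[1] == "404" && $4 != "-" && $4 != "")
                hits[req[2] " " $4]++
        }
        END { for (k in hits) print hits[k], k }
    ' "$1" | sort -rn
}

# Usage (hypothetical log location):
#   real_404s /var/log/apache2/access.log
```

Requests with "-" in the referrer field are dropped, which is exactly the pile I suspect the robots are responsible for.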
But, of course, I can’t do that. Because of that font thing with Flash and Ubuntu. Instead of starting from scratch and searching the internet to maybe (maybe not) find the solution to a problem I’ve solved before, I can search my notebook. Where I wrote down what I found useful. See how that works? Clever. Now that I can see the Flash text, it turns out that Google Analytics doesn’t have any information on broken links, but that is a different problem.
If you’re asking yourself what this has to do with Plone, or Drupal (you should know better than to bother asking, shouldn’t you?) it has to do with the transition, see. Because one of the things that Zope does is allow this funky (if you are used to Apache) phenomenon of ever-shifting URLs. Link to a page called “rideclub” and if the link isn’t technically accurate, Zope will scan the directory structure for a page by that name and the URL you see will be totally wackadoo. So a URL like resources/toolsforlife/39funds.html/56budget.html actually worked in Plone. Someone screwed up and put in a link to “56budget.html” that made it look downwind of “39funds.html” when it lived in the same directory. Plone went ahead and found the document anyhow and (I suspect) some spider or robot someplace stored that URL. And now it keeps looking for it and turning up naught. And I keep getting these Page Not Found errors that make it seem like there is a bigger problem than a bunch of robots with bad information checking up on pages. I want to be able to watch for real broken links, so I’m hoping this will all die down soon.
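If waiting for it to die down gets old, the wackadoo URLs could be collapsed back to where the page really lived. Here is a minimal sketch, assuming that any path segment ending in .html with more path after it is one of these artifacts (that holds for the example above, but it is my assumption, not a rule Plone guarantees). The name collapse_acquired_path is made up for illustration.

```shell
#!/bin/sh
# Strip every path segment that ends in ".html" and is followed by another
# segment, looping until no such segment remains.
# Assumption: only documents end in .html, so a .html segment in the middle
# of a path is an artifact of Zope finding the page by name.
collapse_acquired_path() {
    printf '%s\n' "$1" | sed -E -e ':a' -e 's#(^|/)[^/]*\.html/#\1#' -e 'ta'
}

# Usage:
#   collapse_acquired_path "resources/toolsforlife/39funds.html/56budget.html"
#   prints: resources/toolsforlife/56budget.html
```

The output could then feed whatever redirect mechanism the new site uses; that part is left out here because it depends on the setup.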
Want to read this Tools for Life?