[Members] wiki.xmpp.org data recovery

Guus der Kinderen guus.der.kinderen at gmail.com
Fri Jun 23 15:07:53 UTC 2017

I've taken Tobias' http://ayena.de/files/wiki.xmpp.org.zip archive and
pulled every HTML file in it through the Xidel / Pandoc wringer. Each
converted page is stored in a separate file with a different extension.
I've created a new archive of everything here:
http://goodbytes.nl/with-conversion.tar.gz (same as Tobias' archive, but
with additional files). With these files, and the list of manual
modifications that I mentioned in my last message, content can be restored
with relative ease.
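
For anyone who wants to repeat the Pandoc half of that batch run, here is a
rough sketch. It is not the script I used: it assumes the pages end in
.html, it assumes pandoc is on the PATH, the .mediawiki output extension is
made up for illustration, and it skips the Xidel extraction step entirely.

```python
import subprocess
from pathlib import Path


def plan_conversions(root, suffix=".mediawiki"):
    """Build one pandoc command per HTML file found under root.

    The output suffix is a hypothetical choice, not the extension used in
    the archive.
    """
    commands = []
    for html in sorted(Path(root).rglob("*.html")):
        target = html.with_suffix(suffix)
        commands.append(
            ["pandoc", "-f", "html", "-t", "mediawiki", str(html), "-o", str(target)]
        )
    return commands


def convert_all(root):
    """Run the planned commands; requires pandoc to be installed."""
    for command in plan_conversions(root):
        subprocess.run(command, check=True)
```

Calling plan_conversions() on the unpacked archive first lets you review
what would be written before convert_all() touches anything.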

On 23 June 2017 at 16:59, Kevin Smith <kevin.smith at isode.com> wrote:

> On 23 Jun 2017, at 11:07, Guus der Kinderen <guus.der.kinderen at gmail.com>
> wrote:
> >
> > I've manually restored my application pages and all pages from Tobi's
> > archive whose names start with Summer_of_Code.
> >
> > From that, I've learned that these manual modifications are needed for a
> > page that is transformed using the xidel / pandoc combination mentioned
> > earlier:
> >       • The table of contents needs to be removed (Mediawiki will add one
> >         automatically)
> >       • Everything that matches this regex needs to be removed:
> >         <span [^>]*> (these were used to create anchors for the old ToC,
> >         I think)
> >       • Everything that matches </span> needs to be removed (closing
> >         tags for the anchors mentioned above)
> >       • The old context root of the wiki was /web/, while the new one is
> >         /index.php/ - search the text for web/, which turns up some old
> >         references to pages and/or user profiles
> >       • Some pages start with a level 2 header - you'll have to reduce
> >         all header levels by one for these pages.
> >       • Generally, get rid of <div> and <br> tags
> >       • Images that are used on some pages are lost
> >       • When images were used, there now is a table of two columns, each
> >         column having a fixed width of 50%. You should drop that fixed
> >         width.
> > After that, Mediawiki's preview can be used for smell-testing your
> > resulting page.
> Thanks very very much, Guus.
> /K
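
The mechanical items in the list quoted above could be scripted roughly as
follows. This is only a sketch under assumptions: the function names are
made up for illustration, headings are assumed to be in MediaWiki "== ... =="
form, and the table of contents, the lost images and the 50% column widths
are left for manual editing because their exact markup is not shown in this
thread.

```python
import re

# Regexes taken from the list above; the heading pattern is an assumption
# about how the converted files mark up headings.
SPAN_OPEN = re.compile(r"<span [^>]*>")
SPAN_CLOSE = re.compile(r"</span>")
DIV_BR = re.compile(r"</?(?:div|br)\b[^>]*>", re.IGNORECASE)
HEADING = re.compile(r"^(=+)[ \t]*([^=\n].*?)[ \t]*(=+)[ \t]*$", re.MULTILINE)


def shift_headings_up(text):
    """Reduce every heading level by one if the page starts at level 2."""
    first = HEADING.search(text)
    if first is None or len(first.group(1)) != 2:
        return text

    def shift(match):
        level = len(match.group(1))
        if level < 2:
            return match.group(0)
        marks = "=" * (level - 1)
        return f"{marks} {match.group(2)} {marks}"

    return HEADING.sub(shift, text)


def clean_converted_page(text):
    """Strip anchor spans and <div>/<br> tags, then fix heading levels."""
    text = SPAN_OPEN.sub("", text)
    text = SPAN_CLOSE.sub("", text)
    text = DIV_BR.sub("", text)
    return shift_headings_up(text)


def find_old_references(text):
    """Return (line number, line) pairs that still mention web/."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if "web/" in line
    ]
```

The web/ references are only reported, not rewritten, since each one still
needs a look to decide whether /index.php/ is the right replacement. The
result should still go through Mediawiki's preview.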