wiki.libvirt.org replacement

https://gitlab.com/libvirt/libvirt-wiki/-/merge_requests/1
Hi,
in order to ease editing of the libvirt wiki, remove the need for registering users and remove the need to run PHP in an openshift instance, I've decided to convert the contents of the libvirt wiki into a statically generated page using (almost) the same approach we use to generate the libvirt web pages from rST documents in the repo.
All articles were converted into rST files [1] and images linked from currently existing articles were downloaded.
Advantages:
- same workflow as with editing the libvirt pages
- gitlab still provides a web editor
- local editing for users who hate web
- no need to deal with user registration
- no need to run PHP in openshift
- we still keep separate space for docs which don't really belong into the main repo
- even if we decide to kill off the wiki eventually, the valuable content will be easier to port to the kbase as it'll be in rST
- the conversion fixed many orphaned pages
Disadvantages:
- all links will be broken [2]
- changes will need to be reviewed/approved
- low quality/obsolete content is forward-ported as I didn't review anything
- the build script is in bash (This obviously can be changed if somebody cares more than I do.)
The generator is based on a cleaned up page.xsl and other assets from libvirt's repo and the check-html-references script [3] to validate linking.
The new wiki can for now be browsed from artifacts of the pipeline job that I've used to test it:
https://pipo.sk.gitlab.io/-/libvirt-wiki/-/jobs/3771076975/artifacts/website...
(I know it is unfortunate that my username contains a dot, which breaks the certificate)
If somebody wishes to see the extremely ugly conversion scripts:
https://gitlab.com/pipo.sk/libvirt-wiki/-/commit/fe36e37d3580a76ca18e5123541...
[1] The conversion was done using a collection of ugly scripts and pandoc. Unfortunately markdown-esque formats have the issue of not really having strict rules, and in certain cases the tools used to process them do not implement them correctly. This meant that the least painful path was actually the conversion from HTML!!! and not from the internal mediawiki format.
[2] If we really want to preserve links I can modify the script to generate a list of redirects for the webserver, but I really doubt that preserving links has value. Additionally some content was moved to the knowledge base, so I really want to just delete it from the wiki, so breaking links is a feature.
[3] It uses the version after the last patchset: https://listman.redhat.com/archives/libvir-list/2023-February/237785.html

On Tue, Feb 14, 2023 at 11:03:20PM +0100, Peter Krempa wrote:
https://gitlab.com/libvirt/libvirt-wiki/-/merge_requests/1
Hi,
in order to ease editing of the libvirt wiki, remove the need for registering users and remove the need to run PHP in an openshift instance, I've decided to convert the contents of the libvirt wiki into a statically generated page using (almost) the same approach we use to generate the libvirt web pages from rST documents in the repo.
Excellent !
All articles were converted into rST files [1] and images linked from currently existing articles were downloaded.
Advantages:
- same workflow as with editing the libvirt pages
- gitlab still provides a web editor
- local editing for users who hate web
- no need to deal with user registration
- no need to run PHP in openshift
- we still keep separate space for docs which don't really belong into the main repo
- even if we decide to kill-off the wiki eventually the valuable content will be easier to port to the kbase as it'll be in rST
- the conversion fixed many orphaned pages
You missed the single most important benefit
* Remove me as a single point of failure for hosting of the wiki
The wiki is attached to a personal openshift account and there's no way for me to grant other users access to co-admin it :-( Removing me as a failure point outweighs all the disadvantages !
Disadvantages:
- all links will be broken [2]
IIUC, the problem is that gitlab needs to have a file extension to make it correctly serve with HTML content type. All our wiki pages lack a file extension
It might be possible to achieve this by using sub-directories.
eg instead of converting
TroubleshootMacvtapHostFail.rst -> TroubleshootMacvtapHostFail.html
do
TroubleshootMacvtapHostFail.rst -> TroubleshootMacvtapHostFail/index.html
In theory then, if someone accesses
/TroubleshootMacvtapHostFail
they should get a redirect to
/TroubleshootMacvtapHostFail/
which should then serve
/TroubleshootMacvtapHostFail/index.html
- changes will need to be reviewed/approved
We could turn off the merge request approval requirement if really needed, which would allow any libvirt committer to merge without waiting; 3rd parties would still need help. I don't think it is a big deal though, as the frequency of edits is very small.
- low quality/obsolete content is forward-ported as I didn't review anything
Not an issue since we're keeping it separate from main libvirt docs.
- the build script is in bash (This obviously can be changed if somebody cares more than I do.)
The generator is based on a cleaned up page.xsl and other assets from libvirt's repo and the check-html-references script [3] to validate linking.
The new wiki can for now be browsed from artifacts of the pipeline job that I've used to test it:
https://pipo.sk.gitlab.io/-/libvirt-wiki/-/jobs/3771076975/artifacts/website...
That's a 404, the right URL seems to be this one (at least this is the only pipeline I see that exists in your fork)
https://pipo.sk.gitlab.io/-/libvirt-wiki/-/jobs/3771270309/artifacts/website...
[1] The conversion was done using a collection of ugly scripts and pandoc. Unfortunately markdown-esque formats have the issue of not really having strict rules, and in certain cases the tools used to process them do not implement them correctly. This meant that the least painful path was actually the conversion from HTML!!! and not from the internal mediawiki format.
[2] If we really want to preserve links I can modify the script to generate a list of redirects for the webserver, but I really doubt that preserving links has value. Additionally some content was moved to the knowledge base, so I really want to just delete it from the wiki, so breaking links is a feature.
AFAIK, you can't do redirects in gitlab pages, except the hacky way by actually creating a page with the desired name, and using the <meta> tag to redirect when the browser loads it. If you go that far you might as well just output the content in that page to start with.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
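As a rough illustration of the two output layouts discussed in this message, the following Python sketch maps an rST source name to its flat and its sub-directory HTML path. The function names are hypothetical; the actual build script is written in bash and may do this differently.

```python
from pathlib import PurePosixPath


def flat_output(source: str) -> PurePosixPath:
    """Flat layout: Page.rst -> Page.html"""
    return PurePosixPath(source).with_suffix(".html")


def subdir_output(source: str) -> PurePosixPath:
    """Sub-directory layout: Page.rst -> Page/index.html"""
    # Drop the .rst suffix, then place index.html inside a directory
    # named after the page.
    return PurePosixPath(source).with_suffix("") / "index.html"


if __name__ == "__main__":
    name = "TroubleshootMacvtapHostFail.rst"
    print(flat_output(name))
    print(subdir_output(name))
```

With the second layout a request for the extension-less page name can be answered from the directory's index.html, which is the behaviour the sub-directory suggestion relies on.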

On Wed, Feb 15, 2023 at 10:55:42 +0000, Daniel P. Berrangé wrote:
On Tue, Feb 14, 2023 at 11:03:20PM +0100, Peter Krempa wrote:
[...]
Disadvantages:
- all links will be broken [2]
IIUC, the problem is that gitlab needs to have a file extension to make it correctly serve with HTML content type. All our wiki pages lack a file extension
It might be possible to achieve this by using sub-directories.
eg instead of converting
TroubleshootMacvtapHostFail.rst -> TroubleshootMacvtapHostFail.html
do
TroubleshootMacvtapHostFail.rst -> TroubleshootMacvtapHostFail/index.html
I think I'd rather keep the conversion flat so that linking is not confusing. If we want to keep links we can then keep a directory and do a bunch of symlinks for all the pages. Eventually we'd be able to get rid of them.
In certain cases the conversion removed special characters from the filenames as xsltproc didn't like having ' and ". I've also purged : ()_ and a few others which I don't really want to keep as part of filenames.
I obviously vote for breaking all links.
In theory then, if someone accesses
/TroubleshootMacvtapHostFail
they should get a redirect to
/TroubleshootMacvtapHostFail/
which should then serve
/TroubleshootMacvtapHostFail/index.html
[...]
The new wiki can for now be browsed from artifacts of the pipeline job that I've used to test it:
https://pipo.sk.gitlab.io/-/libvirt-wiki/-/jobs/3771076975/artifacts/website...
That's a 404, the right URL seems to be this one (at least this is the only pipeline I see that exists in your fork)
https://pipo.sk.gitlab.io/-/libvirt-wiki/-/jobs/3771270309/artifacts/website...
Oops, I linked to an older job and then I've purged everything besides the last one :)
[...]
[2] If we really want to preserve links I can modify the script to generate a list of redirects for the webserver, but I really doubt that preserving links has value. Additionally some content was moved to the knowledge base, so I really want to just delete it from the wiki, so breaking links is a feature.
AFAIK, you can't do redirects in gitlab pages, except the hacky way by actually creating a page with the desired name, and using the <meta> tag to redirect when the browser loads it. If you go that far you might as well just output the content in that page to start with.
The original idea was that it'd be served the same way as libvirt's web through DV's server, so I originally didn't think about pages. There we could generate a .htaccess or whatever to keep the redirects.
Gitlab pages is obviously also a good idea here, but this will require changing the CI definition to generate into the proper directory and to name the job properly.
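The filename clean-up mentioned earlier in this message could look roughly like the Python sketch below. The exact set of dropped characters is an assumption (the message only names ' and " plus : ()_ and "a few others"), and the function names and sample page titles are made up for illustration; the real conversion scripts may behave differently.

```python
import re

# Assumed set of characters dropped from page names: both quote
# characters, colon, space, parentheses and underscore.
_DROPPED = re.compile(r"""['":()_ ]""")


def sanitize(title: str) -> str:
    """Return the page title with the assumed problem characters removed."""
    return _DROPPED.sub("", title)


def renamed_pages(titles):
    """Map each original title to its new name, only for titles that changed."""
    return {t: sanitize(t) for t in titles if sanitize(t) != t}


if __name__ == "__main__":
    # Hypothetical page titles, not taken from the actual wiki.
    for old, new in renamed_pages(["FAQ", "Main_Page", "What's new"]).items():
        print(f"{old!r} -> {new!r}")
```

A mapping like the one returned by `renamed_pages` is also what a redirect list (for example a generated .htaccess) would need as input.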

On Wed, Feb 15, 2023 at 12:22:04PM +0100, Peter Krempa wrote:
On Wed, Feb 15, 2023 at 10:55:42 +0000, Daniel P. Berrangé wrote:
On Tue, Feb 14, 2023 at 11:03:20PM +0100, Peter Krempa wrote:
[...]
Disadvantages:
- all links will be broken [2]
IIUC, the problem is that gitlab needs to have a file extension to make it correctly serve with HTML content type. All our wiki pages lack a file extension
It might be possible to achieve this by using sub-directories.
eg instead of converting
TroubleshootMacvtapHostFail.rst -> TroubleshootMacvtapHostFail.html
do
TroubleshootMacvtapHostFail.rst -> TroubleshootMacvtapHostFail/index.html
I think I'd rather keep the conversion flat so that linking is not confusing. If we want to keep links we can then keep a directory and do a bunch of symlinks for all the pages. Eventually we'd be able to get rid of them.
In certain cases the conversion removed special characters from the filenames as xsltproc didn't like having ' and ". I've also purged : ()_ and a few others which I don't really want to keep as part of filenames.
I obviously vote for breaking all links.
I'm not so bothered about the weirdly named pages as probably no one links to them. A few pages though are fairly widely linked, so it'd be a shame to break them.
If we want to keep the main pages flat, perhaps generate the subdirs with the gross <meta> redirect trick ?
[...]
AFAIK, you can't do redirects in gitlab pages, except the hacky way by actually creating a page with the desired name, and using the <meta> tag to redirect when the browser loads it. If you go that far you might as well just output the content in that page to start with.
The original idea was that it'd be served the same way as libvirt's web through DV's server, so I originally didn't think about pages. There we could generate a .htaccess or whatever to keep the redirects.
I'd really like to avoid adding stuff to the physical libvirt.org server. It is an alarming single point of failure and even though DV has backups captured, I'm not confident in correctly bringing everything back on a new server without a bunch of manual intervention. I would have already moved the main website to gitlab, except that the apache webroot has got content merged from multiple different sources into one, and it's difficult to untangle it.
Gitlab pages is obviously also a good idea here, but this will require changing the CI definition to generate into the proper directory and to name the job properly.
Essentially there needs to be a job called 'pages' and it needs to put all the content in a subdir called 'public/' under the source root, and should have 'rules' to limit publishing to a push event on the default branch.
With regards,
Daniel

On Wed, Feb 15, 2023 at 11:43:00 +0000, Daniel P. Berrangé wrote:
On Wed, Feb 15, 2023 at 12:22:04PM +0100, Peter Krempa wrote:
On Wed, Feb 15, 2023 at 10:55:42 +0000, Daniel P. Berrangé wrote:
On Tue, Feb 14, 2023 at 11:03:20PM +0100, Peter Krempa wrote:
[...]
I think I'd rather keep the conversion flat so that linking is not confusing. If we want to keep links we can then keep a directory and do a bunch of symlinks for all the pages. Eventually we'd be able to get rid of them.
In certain cases the conversion removed special characters from the filenames as xsltproc didn't like having ' and ". I've also purged : ()_ and a few others which I don't really want to keep as part of filenames.
I obviously vote for breaking all links.
I'm not so bothered about the weirdly named pages as probably no one links to them. A few pages though are fairly widely linked, so it'd be a shame to break them.
If we want to keep the main pages flat, perhaps generate the subdirs with the gross <meta> redirect trick ?
Okay, I'll try to hack the generation of the redirects into the conversion script. It should be fairly easy. I'll also dump the list of pages which got weird chars dropped, which we'll most likely just break.
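The "<meta> redirect trick" agreed on above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the stub template and the old-name to new-file mapping are hypothetical, the relative "../" target assumes the flat pages live directly in the parent directory of each stub, and the real conversion script (bash) may generate the redirects differently.

```python
import html
from pathlib import Path
from urllib.parse import quote

# Minimal stub page: the browser is sent to the flat page immediately.
STUB = """<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta http-equiv="refresh" content="0; url={target}">
    <link rel="canonical" href="{target}">
    <title>Page moved</title>
  </head>
  <body>
    <p>This page has moved to <a href="{target}">{target}</a>.</p>
  </body>
</html>
"""


def write_redirect_stubs(outdir: Path, pages: dict[str, str]) -> list[Path]:
    """Write <outdir>/<old name>/index.html redirecting to ../<new file>.

    `pages` maps the old extension-less wiki page name to the new flat
    HTML file name (a hypothetical mapping produced by the conversion).
    """
    written = []
    for old_name, new_file in pages.items():
        stub = outdir / old_name / "index.html"
        stub.parent.mkdir(parents=True, exist_ok=True)
        # Percent-encode the URL, then escape it for use in an attribute.
        target = html.escape("../" + quote(new_file), quote=True)
        stub.write_text(STUB.format(target=target), encoding="utf-8")
        written.append(stub)
    return written


if __name__ == "__main__":
    write_redirect_stubs(
        Path("public"),
        {"TroubleshootMacvtapHostFail": "TroubleshootMacvtapHostFail.html"},
    )
```

Only the widely linked pages would need an entry in the mapping; pages whose names lost characters in the conversion could simply be left out.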
participants (2)
- Daniel P. Berrangé
- Peter Krempa