[libvirt] [PATCH 0/3] Remove website search, just use google


Cole Robinson

4 Apr 2019

6:26 a.m.

Andrea's ChangeLog patches reminded me that I attempted something similarish
once, and also tried to drop our custom website search:

http://www.redhat.com/archives/libvir-list/2016-May/msg01616.html

danpb recommended I just make our search bar use google instead, but I never
made the attempt. 3 years later here it is.

Patch #1 makes the switch but doesn't remove anything
Patch #2 drops index.py, the search indexer
Patch #3 drops search.php and all the references to it.

libvirt.org/search.php is still a valid URL so this may break some links,
even though at the moment the page prints virtually nothing except some
boilerplate text. Maybe we can redirect it to the homepage or something,
ideas welcome

Cole Robinson (3):
  docs: Use google sitesearch for website search
  docs: Remove index.py
  docs: Remove search.php and all references

 .gitignore              |    1 -
 docs/Makefile.am        |   22 +-
 docs/devhelp/html.xsl   |    4 -
 docs/index.py           | 1266 ---------------------------------------
 docs/page.xsl           |    9 +-
 docs/search.php.code.in |  225 -------
 docs/search.php.in      |   16 -
 7 files changed, 6 insertions(+), 1537 deletions(-)
 delete mode 100755 docs/index.py
 delete mode 100644 docs/search.php.code.in
 delete mode 100644 docs/search.php.in

--
2.21.0


Cole Robinson

4 Apr

6:26 a.m.

New subject: [libvirt] [PATCH 1/3] docs: Use google sitesearch for website search

The website search is perpetually broken, has had XSS issues in the
past, and I suspect when it's working it's not as fast or capable as
a simple google site:libvirt.org search

Replace the <form> implementation with one that sends the user to
google.com with 'site:libvirt.org' appended to the search string

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 docs/page.xsl | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/docs/page.xsl b/docs/page.xsl
index 4698e2789e..3d007f486c 100644
--- a/docs/page.xsl
+++ b/docs/page.xsl
@@ -155,11 +155,10 @@
             </ul>
           </div>
           <div id="search">
-            <form action="{$href_base}search.php" enctype="application/x-www-form-urlencoded" method="get">
-              <div>
-                <input name="query" type="text" size="12" value=""/>
-                <input name="submit" type="submit" value="Go"/>
-              </div>
+            <form action="https://www.google.com/search" menctype="application/x-www-form-urlencoded" method="get">
+              <input name="sitesearch" type="hidden" value="libvirt.org"/>
+              <input name="q" type="text" size="12" value=""/>
+              <input type="submit" value="Go"/>
             </form>
           </div>
         </div>
--
2.21.0
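For readers who want to see what the new form actually sends, here is a minimal Python sketch of the GET URL a browser would build on submit. The field names (`sitesearch`, `q`) and the action URL come from the patch; the helper name `sitesearch_url` is hypothetical and exists only for this illustration. Note that the form restricts results through a hidden `sitesearch` field rather than by literally appending `site:libvirt.org` to the query text. The submit button has no `name` in the new form, so it contributes no parameter.

```python
from urllib.parse import urlencode

# Form action taken from the patch.
GOOGLE_SEARCH = "https://www.google.com/search"


def sitesearch_url(query, site="libvirt.org"):
    """Hypothetical helper: mimic the GET request produced by the new form.

    The hidden input contributes sitesearch=<site>; the text box contributes
    q=<query>. Both are form-urlencoded, as with method="get".
    """
    params = [("sitesearch", site), ("q", query)]
    return GOOGLE_SEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(sitesearch_url("virsh list"))
```

Because the query is percent-encoded by `urlencode` and handed straight to the external search page, nothing the user types is echoed back by libvirt.org itself.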

Daniel P. Berrangé

10:24 p.m.

New subject: [libvirt] [PATCH 1/3] docs: Use google sitesearch for website search

On Wed, Apr 03, 2019 at 06:26:49PM -0400, Cole Robinson wrote:


The website search is perpetually broken, has had XSS issues in the past, and I suspect when it's working it's not as fast or capable as a simple google site:libvirt.org search

Replace the <form> implementation with one that sends the user to google.com with 'site:libvirt.org' appended to the search string

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 docs/page.xsl | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Ján Tomko

10:35 p.m.

New subject: [libvirt] [PATCH 1/3] docs: Use google sitesearch for website search

On Wed, Apr 03, 2019 at 06:26:49PM -0400, Cole Robinson wrote:


The website search is perpetually broken, has had XSS issues in the past, and I suspect when it's working it's not as fast or capable as a simple google site:libvirt.org search

Replace the <form> implementation with one that sends the user to google.com with 'site:libvirt.org' appended to the search string

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 docs/page.xsl | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/docs/page.xsl b/docs/page.xsl
index 4698e2789e..3d007f486c 100644
--- a/docs/page.xsl
+++ b/docs/page.xsl
@@ -155,11 +155,10 @@
             </ul>
           </div>
           <div id="search">
-            <form action="{$href_base}search.php" enctype="application/x-www-form-urlencoded" method="get">
-              <div>
-                <input name="query" type="text" size="12" value=""/>
-                <input name="submit" type="submit" value="Go"/>
-              </div>
+            <form action="https://www.google.com/search" menctype="application/x-www-form-urlencoded" method="get">

s/menctype/enctype/ ?

Jano


+              <input name="sitesearch" type="hidden" value="libvirt.org"/>
+              <input name="q" type="text" size="12" value=""/>
+              <input type="submit" value="Go"/>
             </form>
           </div>
         </div>
--
2.21.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Cole Robinson

5 Apr

6:49 a.m.

New subject: [libvirt] [PATCH 1/3] docs: Use google sitesearch for website search

On 4/4/19 10:35 AM, Ján Tomko wrote:


On Wed, Apr 03, 2019 at 06:26:49PM -0400, Cole Robinson wrote:

The website search is perpetually broken, has had XSS issues in the past, and I suspect when it's working it's not as fast or capable as a simple google site:libvirt.org search

Replace the <form> implementation with one that sends the user to google.com with 'site:libvirt.org' appended to the search string

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 docs/page.xsl | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/docs/page.xsl b/docs/page.xsl
index 4698e2789e..3d007f486c 100644
--- a/docs/page.xsl
+++ b/docs/page.xsl
@@ -155,11 +155,10 @@
             </ul>
           </div>
           <div id="search">
-            <form action="{$href_base}search.php" enctype="application/x-www-form-urlencoded" method="get">
-              <div>
-                <input name="query" type="text" size="12" value=""/>
-                <input name="submit" type="submit" value="Go"/>
-              </div>
+            <form action="https://www.google.com/search" menctype="application/x-www-form-urlencoded" method="get">

s/menctype/enctype/ ?

Nice catch, the attribute misgendering was not intended. Fixed before pushing

Thanks,
Cole

+              <input name="sitesearch" type="hidden" value="libvirt.org"/>
+              <input name="q" type="text" size="12" value=""/>
+              <input type="submit" value="Go"/>
             </form>
           </div>
         </div>
--
2.21.0


- Cole
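The `menctype` slip caught in this review is the kind of thing a small attribute check could flag before posting. The following is only a sketch, not part of the libvirt build: the allow-list `KNOWN_FORM_ATTRS` is a deliberately small assumption for illustration (it is not the full set of attributes HTML permits on `<form>`), and the names `FormAttrChecker` and `unknown_form_attrs` are hypothetical.

```python
from html.parser import HTMLParser

# Assumption: a short allow-list for this sketch, not the complete HTML spec.
KNOWN_FORM_ATTRS = {"action", "method", "enctype", "id", "class", "name", "target"}


class FormAttrChecker(HTMLParser):
    """Collect attribute names on <form> tags that are not in the allow-list."""

    def __init__(self):
        super().__init__()
        self.unknown = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "form":
            for name, _value in attrs:
                if name not in KNOWN_FORM_ATTRS:
                    self.unknown.append(name)


def unknown_form_attrs(markup):
    """Return the unrecognised <form> attribute names found in markup."""
    checker = FormAttrChecker()
    checker.feed(markup)
    checker.close()
    return checker.unknown


if __name__ == "__main__":
    posted = ('<form action="https://www.google.com/search" '
              'menctype="application/x-www-form-urlencoded" method="get"></form>')
    print(unknown_form_attrs(posted))
```

Run against the `<form>` line as originally posted, the check reports `menctype`; with the attribute spelled `enctype` it reports nothing.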

Cole Robinson

4 Apr

6:26 a.m.

New subject: [libvirt] [PATCH 2/3] docs: Remove index.py

This was used for generating the website search, which now just
calls out to google. Remove it

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 docs/index.py | 1266 -------------------------------------------------
 1 file changed, 1266 deletions(-)
 delete mode 100755 docs/index.py

diff --git a/docs/index.py b/docs/index.py
deleted file mode 100755
index 0d07ca4d05..0000000000
--- a/docs/index.py
+++ /dev/null
@@ -1,1266 +0,0 @@
-#!/usr/bin/env python2
-#
-# imports the API description and fills up a database with
-# name relevance to modules, functions or web pages
-#
-# Operation needed:
-# =================
-#
-# install mysqld, the python wrappers for mysql and libxml2, start mysqld
-# - mysql-server
-# - mysql
-# - php-mysql
-# - MySQL-python
-# Change the root passwd of mysql:
-# mysqladmin -u root password new_password
-# Create the new database libvir
-# mysqladmin -p create libvir
-# Create a database user 'veillard' and give him password access
-# change veillard and abcde with the right user name and passwd
-# mysql -p
-# password:
-# mysql> GRANT ALL PRIVILEGES ON libvir TO veillard@localhost
-# IDENTIFIED BY 'abcde' WITH GRANT OPTION;
-# mysql> GRANT ALL PRIVILEGES ON libvir.* TO veillard@localhost
-# IDENTIFIED BY 'abcde' WITH GRANT OPTION;
-#
-# As the user check the access:
-# mysql -p libvir
-# Enter password:
-# Welcome to the MySQL monitor....
-# mysql> use libvir -# Database changed -# mysql> quit -# Bye -# -# Then run the script in the doc subdir, it will create the symbols and -# word tables and populate them with information extracted from -# the libvirt-api.xml API description, and make then accessible read-only -# by nobody@loaclhost the user expected to be Apache's one -# -# On the Apache configuration, make sure you have php support enabled -# - -import MySQLdb -import libxml2 -import sys -import string -import os - -# -# We are not interested in parsing errors here -# -def callback(ctx, str): - return -libxml2.registerErrorHandler(callback, None) - -# -# The dictionary of tables required and the SQL command needed -# to create them -# -TABLES = { - "symbols": """CREATE TABLE symbols ( - name varchar(255) BINARY NOT NULL, - module varchar(255) BINARY NOT NULL, - type varchar(25) NOT NULL, - descr varchar(255), - UNIQUE KEY name (name), - KEY module (module))""", - "words": """CREATE TABLE words ( - name varchar(50) BINARY NOT NULL, - symbol varchar(255) BINARY NOT NULL, - relevance int, - KEY name (name), - KEY symbol (symbol), - UNIQUE KEY ID (name, symbol))""", - "wordsHTML": """CREATE TABLE wordsHTML ( - name varchar(50) BINARY NOT NULL, - resource varchar(255) BINARY NOT NULL, - section varchar(255), - id varchar(50), - relevance int, - KEY name (name), - KEY resource (resource), - UNIQUE KEY ref (name, resource))""", - "wordsArchive": """CREATE TABLE wordsArchive ( - name varchar(50) BINARY NOT NULL, - ID int(11) NOT NULL, - relevance int, - KEY name (name), - UNIQUE KEY ref (name, ID))""", - "pages": """CREATE TABLE pages ( - resource varchar(255) BINARY NOT NULL, - title varchar(255) BINARY NOT NULL, - UNIQUE KEY name (resource))""", - "archives": """CREATE TABLE archives ( - ID int(11) NOT NULL auto_increment, - resource varchar(255) BINARY NOT NULL, - title varchar(255) BINARY NOT NULL, - UNIQUE KEY id (ID,resource(255)), - INDEX (ID), - INDEX (resource))""", - "Queries": """CREATE TABLE 
Queries ( - ID int(11) NOT NULL auto_increment, - Value varchar(50) NOT NULL, - Count int(11) NOT NULL, - UNIQUE KEY id (ID,Value(35)), - INDEX (ID))""", - "AllQueries": """CREATE TABLE AllQueries ( - ID int(11) NOT NULL auto_increment, - Value varchar(50) NOT NULL, - Count int(11) NOT NULL, - UNIQUE KEY id (ID,Value(35)), - INDEX (ID))""", -} - -# -# The XML API description file to parse -# -API = "libvirt-api.xml" -DB = None - -######################################################################### -# # -# MySQL database interfaces # -# # -######################################################################### -def createTable(db, name): - global TABLES - - if db is None: - return -1 - if name is None: - return -1 - c = db.cursor() - - ret = c.execute("DROP TABLE IF EXISTS %s" % (name)) - if ret == 1: - print "Removed table %s" % (name) - print "Creating table %s" % (name) - try: - ret = c.execute(TABLES[name]) - except: - print "Failed to create table %s" % (name) - return -1 - return ret - -def checkTables(db, verbose=1): - global TABLES - - if db is None: - return -1 - c = db.cursor() - nbtables = c.execute("show tables") - if verbose: - print "Found %d tables" % (nbtables) - tables = {} - i = 0 - while i < nbtables: - l = c.fetchone() - name = l[0] - tables[name] = {} - i = i + 1 - - for table in TABLES.keys(): - if not tables.has_key(table): - print "table %s missing" % (table) - createTable(db, table) - try: - ret = c.execute("SELECT count(*) from %s" % table) - row = c.fetchone() - if verbose: - print "Table %s contains %d records" % (table, row[0]) - except: - print "Troubles with table %s: repairing" % (table) - ret = c.execute("repair table %s" % table) - print "repairing returned %d" % (ret) - ret = c.execute("SELECT count(*) from %s" % table) - row = c.fetchone() - print "Table %s contains %d records" % (table, row[0]) - if verbose: - print "checkTables finished" - - # make sure apache can access the tables read-only - try: - ret = 
c.execute("GRANT SELECT ON libvir.* TO nobody@localhost") - ret = c.execute("GRANT INSERT,SELECT,UPDATE ON libvir.Queries TO nobody@localhost") - except: - pass - return 0 - -def openMySQL(db="libvir", passwd=None, verbose=1): - global DB - - if passwd is None: - try: - passwd = os.environ["MySQL_PASS"] - except: - print "No password available, set environment MySQL_PASS" - sys.exit(1) - - DB = MySQLdb.connect(passwd=passwd, db=db) - if DB is None: - return -1 - ret = checkTables(DB, verbose) - return ret - -def updateWord(name, symbol, relevance): - global DB - - if DB is None: - openMySQL() - if DB is None: - return -1 - if name is None: - return -1 - if symbol is None: - return -1 - - c = DB.cursor() - try: - ret = c.execute( -"""INSERT INTO words (name, symbol, relevance) VALUES ('%s','%s', %d)""" % - (name, symbol, relevance)) - except: - try: - ret = c.execute( - """UPDATE words SET relevance = %d where name = '%s' and symbol = '%s'""" % - (relevance, name, symbol)) - except: - print "Update word (%s, %s, %s) failed command" % (name, symbol, relevance) - print "UPDATE words SET relevance = %d where name = '%s' and symbol = '%s'" % (relevance, name, symbol) - print sys.exc_type, sys.exc_value - return -1 - - return ret - -def updateSymbol(name, module, type, desc): - global DB - - updateWord(name, name, 50) - if DB is None: - openMySQL() - if DB is None: - return -1 - if name is None: - return -1 - if module is None: - return -1 - if type is None: - return -1 - - try: - desc = string.replace(desc, "'", " ") - l = string.split(desc, ".") - desc = l[0] - desc = desc[0:99] - except: - desc = "" - - c = DB.cursor() - try: - ret = c.execute( -"""INSERT INTO symbols (name, module, type, descr) VALUES ('%s','%s', '%s', '%s')""" % - (name, module, type, desc)) - except: - try: - ret = c.execute( -"""UPDATE symbols SET module='%s', type='%s', descr='%s' where name='%s'""" % - (module, type, desc, name)) - except: - print "Update symbol (%s, %s, %s) failed command" % 
(name, module, type) - print """UPDATE symbols SET module='%s', type='%s', descr='%s' where name='%s'""" % (module, type, desc, name) - print sys.exc_type, sys.exc_value - return -1 - - return ret - -def addFunction(name, module, desc=""): - return updateSymbol(name, module, 'function', desc) - -def addMacro(name, module, desc=""): - return updateSymbol(name, module, 'macro', desc) - -def addEnum(name, module, desc=""): - return updateSymbol(name, module, 'enum', desc) - -def addStruct(name, module, desc=""): - return updateSymbol(name, module, 'struct', desc) - -def addConst(name, module, desc=""): - return updateSymbol(name, module, 'const', desc) - -def addType(name, module, desc=""): - return updateSymbol(name, module, 'type', desc) - -def addFunctype(name, module, desc=""): - return updateSymbol(name, module, 'functype', desc) - -def addPage(resource, title): - global DB - - if DB is None: - openMySQL() - if DB is None: - return -1 - if resource is None: - return -1 - - c = DB.cursor() - try: - ret = c.execute( - """INSERT INTO pages (resource, title) VALUES ('%s','%s')""" % - (resource, title)) - except: - try: - ret = c.execute( - """UPDATE pages SET title='%s' WHERE resource='%s'""" % - (title, resource)) - except: - print "Update symbol (%s, %s, %s) failed command" % (name, module, type) - print """UPDATE pages SET title='%s' WHERE resource='%s'""" % (title, resource) - print sys.exc_type, sys.exc_value - return -1 - - return ret - -def updateWordHTML(name, resource, desc, id, relevance): - global DB - - if DB is None: - openMySQL() - if DB is None: - return -1 - if name is None: - return -1 - if resource is None: - return -1 - if id is None: - id = "" - if desc is None: - desc = "" - else: - try: - desc = string.replace(desc, "'", " ") - desc = desc[0:99] - except: - desc = "" - - c = DB.cursor() - try: - ret = c.execute( -"""INSERT INTO wordsHTML (name, resource, section, id, relevance) VALUES ('%s','%s', '%s', '%s', '%d')""" % - (name, resource, desc, 
id, relevance)) - except: - try: - ret = c.execute( -"""UPDATE wordsHTML SET section='%s', id='%s', relevance='%d' where name='%s' and resource='%s'""" % - (desc, id, relevance, name, resource)) - except: - print "Update symbol (%s, %s, %d) failed command" % (name, resource, relevance) - print """UPDATE wordsHTML SET section='%s', id='%s', relevance='%d' where name='%s' and resource='%s'""" % (desc, id, relevance, name, resource) - print sys.exc_type, sys.exc_value - return -1 - - return ret - -def checkXMLMsgArchive(url): - global DB - - if DB is None: - openMySQL() - if DB is None: - return -1 - if url is None: - return -1 - - c = DB.cursor() - try: - ret = c.execute( - """SELECT ID FROM archives WHERE resource='%s'""" % (url)) - row = c.fetchone() - if row is None: - return -1 - except: - return -1 - - return row[0] - -def addXMLMsgArchive(url, title): - global DB - - if DB is None: - openMySQL() - if DB is None: - return -1 - if url is None: - return -1 - if title is None: - title = "" - else: - title = string.replace(title, "'", " ") - title = title[0:99] - - c = DB.cursor() - try: - cmd = """INSERT INTO archives (resource, title) VALUES ('%s','%s')""" % (url, title) - ret = c.execute(cmd) - cmd = """SELECT ID FROM archives WHERE resource='%s'""" % (url) - ret = c.execute(cmd) - row = c.fetchone() - if row is None: - print "addXMLMsgArchive failed to get the ID: %s" % (url) - return -1 - except: - print "addXMLMsgArchive failed command: %s" % (cmd) - return -1 - - return((int)(row[0])) - -def updateWordArchive(name, id, relevance): - global DB - - if DB is None: - openMySQL() - if DB is None: - return -1 - if name is None: - return -1 - if id is None: - return -1 - - c = DB.cursor() - try: - ret = c.execute( -"""INSERT INTO wordsArchive (name, id, relevance) VALUES ('%s', '%d', '%d')""" % - (name, id, relevance)) - except: - try: - ret = c.execute( -"""UPDATE wordsArchive SET relevance='%d' where name='%s' and ID='%d'""" % - (relevance, name, id)) - except: - 
print "Update word archive (%s, %d, %d) failed command" % (name, id, relevance) - print """UPDATE wordsArchive SET relevance='%d' where name='%s' and ID='%d'""" % (relevance, name, id) - print sys.exc_type, sys.exc_value - return -1 - - return ret - -######################################################################### -# # -# Word dictionary and analysis routines # -# # -######################################################################### - -# -# top 100 english word without the one len < 3 + own set -# -dropWords = { - 'the':0, 'this':0, 'can':0, 'man':0, 'had':0, 'him':0, 'only':0, - 'and':0, 'not':0, 'been':0, 'other':0, 'even':0, 'are':0, 'was':0, - 'new':0, 'most':0, 'but':0, 'when':0, 'some':0, 'made':0, 'from':0, - 'who':0, 'could':0, 'after':0, 'that':0, 'will':0, 'time':0, 'also':0, - 'have':0, 'more':0, 'these':0, 'did':0, 'was':0, 'two':0, 'many':0, - 'they':0, 'may':0, 'before':0, 'for':0, 'which':0, 'out':0, 'then':0, - 'must':0, 'one':0, 'through':0, 'with':0, 'you':0, 'said':0, - 'first':0, 'back':0, 'were':0, 'what':0, 'any':0, 'years':0, 'his':0, - 'her':0, 'where':0, 'all':0, 'its':0, 'now':0, 'much':0, 'she':0, - 'about':0, 'such':0, 'your':0, 'there':0, 'into':0, 'like':0, 'may':0, - 'would':0, 'than':0, 'our':0, 'well':0, 'their':0, 'them':0, 'over':0, - 'down':0, - 'net':0, 'www':0, 'bad':0, 'Okay':0, 'bin':0, 'cur':0, -} - -wordsDict = {} -wordsDictHTML = {} -wordsDictArchive = {} - -def cleanupWordsString(str): - str = string.replace(str, ".", " ") - str = string.replace(str, "!", " ") - str = string.replace(str, "?", " ") - str = string.replace(str, ",", " ") - str = string.replace(str, "'", " ") - str = string.replace(str, '"', " ") - str = string.replace(str, ";", " ") - str = string.replace(str, "(", " ") - str = string.replace(str, ")", " ") - str = string.replace(str, "{", " ") - str = string.replace(str, "}", " ") - str = string.replace(str, "<", " ") - str = string.replace(str, ">", " ") - str = string.replace(str, "=", " 
") - str = string.replace(str, "/", " ") - str = string.replace(str, "*", " ") - str = string.replace(str, ":", " ") - str = string.replace(str, "#", " ") - str = string.replace(str, "\\", " ") - str = string.replace(str, "\n", " ") - str = string.replace(str, "\r", " ") - str = string.replace(str, "\xc2", " ") - str = string.replace(str, "\xa0", " ") - return str - -def cleanupDescrString(str): - str = string.replace(str, "'", " ") - str = string.replace(str, "\n", " ") - str = string.replace(str, "\r", " ") - str = string.replace(str, "\xc2", " ") - str = string.replace(str, "\xa0", " ") - l = string.split(str) - str = string.join(str) - return str - -def splitIdentifier(str): - ret = [] - while str != "": - cur = string.lower(str[0]) - str = str[1:] - if ((cur < 'a') or (cur > 'z')): - continue - while (str != "") and (str[0] >= 'A') and (str[0] <= 'Z'): - cur = cur + string.lower(str[0]) - str = str[1:] - while (str != "") and (str[0] >= 'a') and (str[0] <= 'z'): - cur = cur + str[0] - str = str[1:] - while (str != "") and (str[0] >= '0') and (str[0] <= '9'): - str = str[1:] - ret.append(cur) - return ret - -def addWord(word, module, symbol, relevance): - global wordsDict - - if word is None or len(word) < 3: - return -1 - if module is None or symbol is None: - return -1 - if dropWords.has_key(word): - return 0 - if ord(word[0]) > 0x80: - return 0 - - if wordsDict.has_key(word): - d = wordsDict[word] - if d is None: - return 0 - if len(d) > 500: - wordsDict[word] = None - return 0 - try: - relevance = relevance + d[(module, symbol)] - except: - pass - else: - wordsDict[word] = {} - wordsDict[word][(module, symbol)] = relevance - return relevance - -def addString(str, module, symbol, relevance): - if str is None or len(str) < 3: - return -1 - ret = 0 - str = cleanupWordsString(str) - l = string.split(str) - for word in l: - if len(word) > 2: - ret = ret + addWord(word, module, symbol, 5) - - return ret - -def addWordHTML(word, resource, id, section, relevance): 
- global wordsDictHTML - - if word is None or len(word) < 3: - return -1 - if resource is None or section is None: - return -1 - if dropWords.has_key(word): - return 0 - if ord(word[0]) > 0x80: - return 0 - - section = cleanupDescrString(section) - - if wordsDictHTML.has_key(word): - d = wordsDictHTML[word] - if d is None: - print "skipped %s" % (word) - return 0 - try: - (r,i,s) = d[resource] - if i is not None: - id = i - if s is not None: - section = s - relevance = relevance + r - except: - pass - else: - wordsDictHTML[word] = {} - d = wordsDictHTML[word] - d[resource] = (relevance, id, section) - return relevance - -def addStringHTML(str, resource, id, section, relevance): - if str is None or len(str) < 3: - return -1 - ret = 0 - str = cleanupWordsString(str) - l = string.split(str) - for word in l: - if len(word) > 2: - try: - r = addWordHTML(word, resource, id, section, relevance) - if r < 0: - print "addWordHTML failed: %s %s" % (word, resource) - ret = ret + r - except: - print "addWordHTML failed: %s %s %d" % (word, resource, relevance) - print sys.exc_type, sys.exc_value - - return ret - -def addWordArchive(word, id, relevance): - global wordsDictArchive - - if word is None or len(word) < 3: - return -1 - if id is None or id == -1: - return -1 - if dropWords.has_key(word): - return 0 - if ord(word[0]) > 0x80: - return 0 - - if wordsDictArchive.has_key(word): - d = wordsDictArchive[word] - if d is None: - print "skipped %s" % (word) - return 0 - try: - r = d[id] - relevance = relevance + r - except: - pass - else: - wordsDictArchive[word] = {} - d = wordsDictArchive[word] - d[id] = relevance - return relevance - -def addStringArchive(str, id, relevance): - if str is None or len(str) < 3: - return -1 - ret = 0 - str = cleanupWordsString(str) - l = string.split(str) - for word in l: - i = len(word) - if i > 2: - try: - r = addWordArchive(word, id, relevance) - if r < 0: - print "addWordArchive failed: %s %s" % (word, id) - else: - ret = ret + r - except: - 
print "addWordArchive failed: %s %s %d" % (word, id, relevance) - print sys.exc_type, sys.exc_value - return ret - -######################################################################### -# # -# XML API description analysis # -# # -######################################################################### - -def loadAPI(filename): - doc = libxml2.parseFile(filename) - print "loaded %s" % (filename) - return doc - -def foundExport(file, symbol): - if file is None: - return 0 - if symbol is None: - return 0 - addFunction(symbol, file) - l = splitIdentifier(symbol) - for word in l: - addWord(word, file, symbol, 10) - return 1 - -def analyzeAPIFile(top): - count = 0 - name = top.prop("name") - cur = top.children - while cur is not None: - if cur.type == 'text': - cur = cur.next - continue - if cur.name == "exports": - count = count + foundExport(name, cur.prop("symbol")) - else: - print "unexpected element %s in API doc <file name='%s'>" % (name) - cur = cur.next - return count - -def analyzeAPIFiles(top): - count = 0 - cur = top.children - - while cur is not None: - if cur.type == 'text': - cur = cur.next - continue - if cur.name == "file": - count = count + analyzeAPIFile(cur) - else: - print "unexpected element %s in API doc <files>" % (cur.name) - cur = cur.next - return count - -def analyzeAPIEnum(top): - file = top.prop("file") - if file is None: - return 0 - symbol = top.prop("name") - if symbol is None: - return 0 - - addEnum(symbol, file) - l = splitIdentifier(symbol) - for word in l: - addWord(word, file, symbol, 10) - - return 1 - -def analyzeAPIConst(top): - file = top.prop("file") - if file is None: - return 0 - symbol = top.prop("name") - if symbol is None: - return 0 - - addConst(symbol, file) - l = splitIdentifier(symbol) - for word in l: - addWord(word, file, symbol, 10) - - return 1 - -def analyzeAPIType(top): - file = top.prop("file") - if file is None: - return 0 - symbol = top.prop("name") - if symbol is None: - return 0 - - addType(symbol, file) 
- l = splitIdentifier(symbol) - for word in l: - addWord(word, file, symbol, 10) - return 1 - -def analyzeAPIFunctype(top): - file = top.prop("file") - if file is None: - return 0 - symbol = top.prop("name") - if symbol is None: - return 0 - - addFunctype(symbol, file) - l = splitIdentifier(symbol) - for word in l: - addWord(word, file, symbol, 10) - return 1 - -def analyzeAPIStruct(top): - file = top.prop("file") - if file is None: - return 0 - symbol = top.prop("name") - if symbol is None: - return 0 - - addStruct(symbol, file) - l = splitIdentifier(symbol) - for word in l: - addWord(word, file, symbol, 10) - - info = top.prop("info") - if info is not None: - info = string.replace(info, "'", " ") - info = string.strip(info) - l = string.split(info) - for word in l: - if len(word) > 2: - addWord(word, file, symbol, 5) - return 1 - -def analyzeAPIMacro(top): - file = top.prop("file") - if file is None: - return 0 - symbol = top.prop("name") - if symbol is None: - return 0 - symbol = string.replace(symbol, "'", " ") - symbol = string.strip(symbol) - - info = None - cur = top.children - while cur is not None: - if cur.type == 'text': - cur = cur.next - continue - if cur.name == "info": - info = cur.content - break - cur = cur.next - - l = splitIdentifier(symbol) - for word in l: - addWord(word, file, symbol, 10) - - if info is None: - addMacro(symbol, file) - print "Macro %s description has no <info>" % (symbol) - return 0 - - info = string.replace(info, "'", " ") - info = string.strip(info) - addMacro(symbol, file, info) - l = string.split(info) - for word in l: - if len(word) > 2: - addWord(word, file, symbol, 5) - return 1 - -def analyzeAPIFunction(top): - file = top.prop("file") - if file is None: - return 0 - symbol = top.prop("name") - if symbol is None: - return 0 - - symbol = string.replace(symbol, "'", " ") - symbol = string.strip(symbol) - info = None - cur = top.children - while cur is not None: - if cur.type == 'text': - cur = cur.next - continue - if 
cur.name == "info":
-            info = cur.content
-        elif cur.name == "return":
-            rinfo = cur.prop("info")
-            if rinfo is not None:
-                rinfo = string.replace(rinfo, "'", " ")
-                rinfo = string.strip(rinfo)
-                addString(rinfo, file, symbol, 7)
-        elif cur.name == "arg":
-            ainfo = cur.prop("info")
-            if ainfo is not None:
-                ainfo = string.replace(ainfo, "'", " ")
-                ainfo = string.strip(ainfo)
-                addString(ainfo, file, symbol, 5)
-            name = cur.prop("name")
-            if name is not None:
-                name = string.replace(name, "'", " ")
-                name = string.strip(name)
-                addWord(name, file, symbol, 7)
-        cur = cur.next
-    if info is None:
-        print "Function %s description has no <info>" % (symbol)
-        addFunction(symbol, file, "")
-    else:
-        info = string.replace(info, "'", " ")
-        info = string.strip(info)
-        addFunction(symbol, file, info)
-        addString(info, file, symbol, 5)
-
-    l = splitIdentifier(symbol)
-    for word in l:
-        addWord(word, file, symbol, 10)
-
-    return 1
-
-def analyzeAPISymbols(top):
-    count = 0
-    cur = top.children
-
-    while cur is not None:
-        if cur.type == 'text':
-            cur = cur.next
-            continue
-        if cur.name == "macro":
-            count = count + analyzeAPIMacro(cur)
-        elif cur.name == "function":
-            count = count + analyzeAPIFunction(cur)
-        elif cur.name == "const":
-            count = count + analyzeAPIConst(cur)
-        elif cur.name == "typedef":
-            count = count + analyzeAPIType(cur)
-        elif cur.name == "struct":
-            count = count + analyzeAPIStruct(cur)
-        elif cur.name == "enum":
-            count = count + analyzeAPIEnum(cur)
-        elif cur.name == "functype":
-            count = count + analyzeAPIFunctype(cur)
-        else:
-            print "unexpected element %s in API doc <files>" % (cur.name)
-        cur = cur.next
-    return count
-
-def analyzeAPI(doc):
-    count = 0
-    if doc is None:
-        return -1
-    root = doc.getRootElement()
-    if root.name != "api":
-        print "Unexpected root name"
-        return -1
-    cur = root.children
-    while cur is not None:
-        if cur.type == 'text':
-            cur = cur.next
-            continue
-        if cur.name == "files":
-            pass
-#            count = count + analyzeAPIFiles(cur)
-        elif cur.name == "symbols":
-            count = count + analyzeAPISymbols(cur)
-        else:
-            print "unexpected element %s in API doc" % (cur.name)
-        cur = cur.next
-    return count
-
-#########################################################################
-#                                                                       #
-#                  Web pages parsing and analysis                       #
-#                                                                       #
-#########################################################################
-
-import glob
-
-def analyzeHTMLText(doc, resource, p, section, id):
-    words = 0
-    try:
-        content = p.content
-        words = words + addStringHTML(content, resource, id, section, 5)
-    except:
-        return -1
-    return words
-
-def analyzeHTMLPara(doc, resource, p, section, id):
-    words = 0
-    try:
-        content = p.content
-        words = words + addStringHTML(content, resource, id, section, 5)
-    except:
-        return -1
-    return words
-
-def analyzeHTMLPre(doc, resource, p, section, id):
-    words = 0
-    try:
-        content = p.content
-        words = words + addStringHTML(content, resource, id, section, 5)
-    except:
-        return -1
-    return words
-
-def analyzeHTML(doc, resource, p, section, id):
-    words = 0
-    try:
-        content = p.content
-        words = words + addStringHTML(content, resource, id, section, 5)
-    except:
-        return -1
-    return words
-
-def analyzeHTML(doc, resource):
-    para = 0
-    ctxt = doc.xpathNewContext()
-    try:
-        res = ctxt.xpathEval("//head/title")
-        title = res[0].content
-    except:
-        title = "Page %s" % (resource)
-    addPage(resource, title)
-    try:
-        items = ctxt.xpathEval("//h1 | //h2 | //h3 | //text()")
-        section = title
-        id = ""
-        for item in items:
-            if item.name == 'h1' or item.name == 'h2' or item.name == 'h3':
-                section = item.content
-                if item.prop("id"):
-                    id = item.prop("id")
-                elif item.prop("name"):
-                    id = item.prop("name")
-            elif item.type == 'text':
-                analyzeHTMLText(doc, resource, item, section, id)
-                para = para + 1
-            elif item.name == 'p':
-                analyzeHTMLPara(doc, resource, item, section, id)
-                para = para + 1
-            elif item.name == 'pre':
-                analyzeHTMLPre(doc, resource, item, section, id)
-                para = para + 1
-            else:
-                print "Page %s, unexpected %s element" % (resource, item.name)
-    except:
-        print "Page %s: problem analyzing" % (resource)
-        print sys.exc_type, sys.exc_value
-
-    return para
-
-def analyzeHTMLPages():
-    ret = 0
-    HTMLfiles = glob.glob("*.html") + glob.glob("tutorial/*.html") + \
-                glob.glob("CIM/*.html") + glob.glob("ocaml/*.html") + \
-                glob.glob("ruby/*.html")
-    for html in HTMLfiles:
-        if html[0:3] == "API":
-            continue
-        if html == "xml.html":
-            continue
-        try:
-            doc = libxml2.parseFile(html)
-        except:
-            doc = libxml2.htmlParseFile(html, None)
-        try:
-            res = analyzeHTML(doc, html)
-            print "Parsed %s: %d paragraphs" % (html, res)
-            ret = ret + 1
-        except:
-            print "could not parse %s" % (html)
-    return ret
-
-#########################################################################
-#                                                                       #
-#                  Mail archives parsing and analysis                   #
-#                                                                       #
-#########################################################################
-
-import time
-
-def getXMLDateArchive(t=None):
-    if t is None:
-        t = time.time()
-    T = time.gmtime(t)
-    month = time.strftime("%B", T)
-    year = T[0]
-    url = "http://www.redhat.com/archives/libvir-list/%d-%s/date.html" % (year, month)
-    return url
-
-def scanXMLMsgArchive(url, title, force=0):
-    if url is None or title is None:
-        return 0
-
-    ID = checkXMLMsgArchive(url)
-    if force == 0 and ID != -1:
-        return 0
-
-    if ID == -1:
-        ID = addXMLMsgArchive(url, title)
-        if ID == -1:
-            return 0
-
-    try:
-        print "Loading %s" % (url)
-        doc = libxml2.htmlParseFile(url, None)
-    except:
-        doc = None
-    if doc is None:
-        print "Failed to parse %s" % (url)
-        return 0
-
-    addStringArchive(title, ID, 20)
-    ctxt = doc.xpathNewContext()
-    texts = ctxt.xpathEval("//pre//text()")
-    for text in texts:
-        addStringArchive(text.content, ID, 5)
-
-    return 1
-
-def scanXMLDateArchive(t=None, force=0):
-    global wordsDictArchive
-
-    wordsDictArchive = {}
-
-    url = getXMLDateArchive(t)
-    print "loading %s" % (url)
-    try:
-        doc = libxml2.htmlParseFile(url, None)
-    except:
-        doc = None
-    if doc is None:
-        print "Failed to parse %s" % (url)
-        return -1
-    ctxt = doc.xpathNewContext()
-    anchors = ctxt.xpathEval("//a[@href]")
-    links = 0
-    newmsg = 0
-    for anchor in anchors:
-        href = anchor.prop("href")
-        if href is None or href[0:3] != "msg":
-            continue
-        try:
-            links = links + 1
-
-            msg = libxml2.buildURI(href, url)
-            title = anchor.content
-            if title is not None and title[0:4] == 'Re: ':
-                title = title[4:]
-            if title is not None and title[0:6] == '[xml] ':
-                title = title[6:]
-            newmsg = newmsg + scanXMLMsgArchive(msg, title, force)
-
-        except:
-            pass
-
-    return newmsg
-
-
-#########################################################################
-#                                                                       #
-#          Main code: open the DB, the API XML and analyze it           #
-#                                                                       #
-#########################################################################
-def analyzeArchives(t=None, force=0):
-    global wordsDictArchive
-
-    ret = scanXMLDateArchive(t, force)
-    print "Indexed %d words in %d archive pages" % (len(wordsDictArchive), ret)
-
-    i = 0
-    skipped = 0
-    for word in wordsDictArchive.keys():
-        refs = wordsDictArchive[word]
-        if refs is None:
-            skipped = skipped + 1
-            continue
-        for id in refs.keys():
-            relevance = refs[id]
-            updateWordArchive(word, id, relevance)
-            i = i + 1
-
-    print "Found %d associations in HTML pages" % (i)
-
-def analyzeHTMLTop():
-    global wordsDictHTML
-
-    ret = analyzeHTMLPages()
-    print "Indexed %d words in %d HTML pages" % (len(wordsDictHTML), ret)
-
-    i = 0
-    skipped = 0
-    for word in wordsDictHTML.keys():
-        refs = wordsDictHTML[word]
-        if refs is None:
-            skipped = skipped + 1
-            continue
-        for resource in refs.keys():
-            (relevance, id, section) = refs[resource]
-            updateWordHTML(word, resource, section, id, relevance)
-            i = i + 1
-
-    print "Found %d associations in HTML pages" % (i)
-
-def analyzeAPITop():
-    global wordsDict
-    global API
-
-    try:
-        doc = loadAPI(API)
-        ret = analyzeAPI(doc)
-        print "Analyzed %d blocs" % (ret)
-        doc.freeDoc()
-    except:
-        print "Failed to parse and analyze %s" % (API)
-        print sys.exc_type, sys.exc_value
-        sys.exit(1)
-
-    print "Indexed %d words" % (len(wordsDict))
-    i = 0
-    skipped = 0
-    for word in wordsDict.keys():
-        refs = wordsDict[word]
-        if refs is None:
-            skipped = skipped + 1
-            continue
-        for (module, symbol) in refs.keys():
-            updateWord(word, symbol, refs[(module, symbol)])
-            i = i + 1
-
-    print "Found %d associations, skipped %d words" % (i, skipped)
-
-def usage():
-    print "Usage index.py [--force] [--archive] [--archive-year year] [--archive-month month] [--API] [--docs]"
-    sys.exit(1)
-
-def main():
-    try:
-        openMySQL()
-    except:
-        print "Failed to open the database"
-        print sys.exc_type, sys.exc_value
-        sys.exit(1)
-
-    args = sys.argv[1:]
-    force = 0
-    if args:
-        i = 0
-        while i < len(args):
-            if args[i] == '--force':
-                force = 1
-            elif args[i] == '--archive':
-                analyzeArchives(None, force)
-            elif args[i] == '--archive-year':
-                i = i + 1
-                year = args[i]
-                months = ["January", "February", "March", "April", "May",
-                          "June", "July", "August", "September", "October",
-                          "November", "December"]
-                for month in months:
-                    try:
-                        str = "%s-%s" % (year, month)
-                        T = time.strptime(str, "%Y-%B")
-                        t = time.mktime(T) + 3600 * 24 * 10
-                        analyzeArchives(t, force)
-                    except:
-                        print "Failed to index month archive:"
-                        print sys.exc_type, sys.exc_value
-            elif args[i] == '--archive-month':
-                i = i + 1
-                month = args[i]
-                try:
-                    T = time.strptime(month, "%Y-%B")
-                    t = time.mktime(T) + 3600 * 24 * 10
-                    analyzeArchives(t, force)
-                except:
-                    print "Failed to index month archive:"
-                    print sys.exc_type, sys.exc_value
-            elif args[i] == '--API':
-                analyzeAPITop()
-            elif args[i] == '--docs':
-                analyzeHTMLTop()
-            else:
-                usage()
-            i = i + 1
-    else:
-        usage()
-
-if __name__ == "__main__":
-    main()
-- 
2.21.0
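
As a reading aid for the removed indexer: the `--archive-month` option parsed a `YEAR-Month` string, stepped ten days into that month, and handed the timestamp to `getXMLDateArchive()`, which formatted the monthly index URL. A minimal Python 3 sketch of that derivation follows. The function name `archive_index_url` is hypothetical, and `calendar.timegm` is swapped in for the original `time.mktime` so the result does not depend on the local time zone.

```python
import calendar
import time

# Base URL taken from the removed getXMLDateArchive() helper.
ARCHIVE_BASE = "http://www.redhat.com/archives/libvir-list"


def archive_index_url(month_spec):
    """Return the date-index URL for a month given as e.g. '2016-May'.

    Hypothetical helper: mirrors the removed code path, but uses UTC
    throughout (calendar.timegm) instead of time.mktime.
    """
    parsed = time.strptime(month_spec, "%Y-%B")
    # Same offset as the original: land ten days into the month.
    stamp = calendar.timegm(parsed) + 3600 * 24 * 10
    when = time.gmtime(stamp)
    month_name = time.strftime("%B", when)
    return "%s/%d-%s/date.html" % (ARCHIVE_BASE, when.tm_year, month_name)


if __name__ == "__main__":
    print(archive_index_url("2016-May"))
```

The month name comes from `%B`, so the sketch assumes an English (or C) locale, as the original script implicitly did.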

Daniel P. Berrangé

10:24 p.m.

New subject: [libvirt] [PATCH 2/3] docs: Remove index.py

On Wed, Apr 03, 2019 at 06:26:50PM -0400, Cole Robinson wrote:

...

This was used for generating the website search, which now just calls out to google. Remove it

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 docs/index.py | 1266 -------------------------------------------------
 1 file changed, 1266 deletions(-)
 delete mode 100755 docs/index.py
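
For readers wondering what "calls out to google" amounts to: the form from patch 1 submits a GET request to https://www.google.com/search with a hidden `sitesearch` field and the user's text in `q`. A rough Python sketch of the URL a browser would build from that form is below. The helper name `google_sitesearch_url` is an assumption for illustration, and how Google treats the `sitesearch` parameter is outside what this sketch can show.

```python
from urllib.parse import urlencode


def google_sitesearch_url(query, site="libvirt.org"):
    """Build the GET URL the patched search form would submit.

    Hypothetical helper; field names and order follow the form in
    docs/page.xsl (hidden 'sitesearch' first, then the text field 'q').
    """
    params = [("sitesearch", site), ("q", query)]
    return "https://www.google.com/search?" + urlencode(params)


if __name__ == "__main__":
    print(google_sitesearch_url("virDomainCreate XML"))
```

Form encoding turns spaces into `+`, which is what `urlencode` does by default.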

Cole Robinson

6:26 a.m.

New subject: [libvirt] [PATCH 3/3] docs: Remove search.php and all references

libvirt.org/search.php drops into some kind of screen which I guess is supposed to show a search bar with options, but presently for me renders as nothing but the following text:

  Search the documentation on Libvirt.org

  The search service indexes the libvirt APIs and documentation as well as the libvir-list@redhat.com mailing-list archives. To use it simply provide a set of keywords:

The main page search bar now redirects to google, this page is broken, I say we just remove it and move on.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 .gitignore              |   1 -
 docs/Makefile.am        |  22 +---
 docs/devhelp/html.xsl   |   4 -
 docs/search.php.code.in | 225 ----------------------------------------
 docs/search.php.in      |  16 ---
 5 files changed, 2 insertions(+), 266 deletions(-)
 delete mode 100644 docs/search.php.code.in
 delete mode 100644 docs/search.php.in

diff --git a/.gitignore b/.gitignore
index c918ec8226..7f99e5db2d 100644
--- a/.gitignore
+++ b/.gitignore
@@ -66,7 +66,6 @@
 /docs/libvirt-qemu-*.xml
 /docs/libvirt-refs.xml
 /docs/news.html.in
-/docs/search.php
 /docs/todo.html.in
 /examples/admin/client_close
 /examples/admin/client_info
diff --git a/docs/Makefile.am b/docs/Makefile.am
index bd7bc1a431..ebdc734ddd 100644
--- a/docs/Makefile.am
+++ b/docs/Makefile.am
@@ -128,10 +128,6 @@ dot_html_in = \
   $(notdir $(wildcard $(srcdir)/*.html.in))
 dot_html = $(dot_html_in:%.html.in=%.html)
 
-dot_php_in = $(notdir $(wildcard $(srcdir)/*.php.in))
-dot_php_code_in = $(dot_php_in:%.php.in=%.php.code.in)
-dot_php = $(dot_php_in:%.php.in=%.php)
-
 xml = \
   libvirt-api.xml \
   libvirt-refs.xml
@@ -175,7 +171,7 @@ EXTRA_DIST= \
   $(dot_html) $(dot_html_in) $(gif) $(apihtml) $(apipng) \
   $(devhelphtml) $(devhelppng) $(devhelpcss) $(devhelpxsl) \
   $(xml) $(qemu_xml) $(lxc_xml) $(admin_xml) $(fig) $(png) $(css) \
-  $(logofiles) $(dot_php_in) $(dot_php_code_in) $(dot_php)\
+  $(logofiles) \
   $(internals_html_in) $(internals_html) $(fonts) \
   aclperms.htmlinc \
   hvsupport.pl \
@@ -192,7 +188,6 @@ MAINTAINERCLEANFILES = \
   $(addprefix $(srcdir)/,$(apihtml)) \
   $(addprefix $(srcdir)/,$(devhelphtml)) \
   $(addprefix $(srcdir)/,$(internals_html)) \
-  $(addprefix $(srcdir)/,$(dot_php)) \
   $(srcdir)/hvsupport.html.in $(srcdir)/aclperms.htmlinc
 
 timestamp="$(shell if test -n "$$SOURCE_DATE_EPOCH"; \
@@ -209,8 +204,7 @@ qemu_api: $(srcdir)/libvirt-qemu-api.xml $(srcdir)/libvirt-qemu-refs.xml
 lxc_api: $(srcdir)/libvirt-lxc-api.xml $(srcdir)/libvirt-lxc-refs.xml
 admin_api: $(srcdir)/libvirt-admin-api.xml $(srcdir)/libvirt-admin-refs.xml
 
-web: $(dot_html) $(internals_html) html/index.html devhelp/index.html \
-  $(dot_php)
+web: $(dot_html) $(internals_html) html/index.html devhelp/index.html
 
 hvsupport.html: $(srcdir)/hvsupport.html.in
 
@@ -265,18 +259,6 @@ MAINTAINERCLEANFILES += \
 	$(AM_V_GEN)$(XMLLINT) --nonet --format $< > $(srcdir)/$@ \
 	 || { rm $(srcdir)/$@ && exit 1; }
 
-%.php.tmp: %.php.in site.xsl page.xsl
-	$(AM_V_GEN)$(XSLTPROC) --stringparam pagename $(@:.tmp=) \
-	 --stringparam timestamp $(timestamp) --nonet \
-	 $(top_srcdir)/docs/site.xsl $< > $@ \
-	 || { rm $@ && exit 1; }
-
-%.php: %.php.tmp %.php.code.in
-	$(AM_V_GEN)sed \
-	 -e '/<span id="php_placeholder"><\/span>/r '"$(srcdir)/$@.code.in" \
-	 -e /php_placeholder/d < $@.tmp > $(srcdir)/$@ \
-	 || { rm $(srcdir)/$@ && exit 1; }
-
 $(apihtml_generated): html/index.html
 
 html/index.html: libvirt-api.xml newapi.xsl page.xsl $(APIBUILD_STAMP)
diff --git a/docs/devhelp/html.xsl b/docs/devhelp/html.xsl
index eb10e362bf..9cdc049150 100644
--- a/docs/devhelp/html.xsl
+++ b/docs/devhelp/html.xsl
@@ -565,10 +565,6 @@ by a Linux instance.
 The library aim at providing long term stable C API
 initially for the <a href="http://www.cl.cam.ac.uk/Research/SRG/netos/xen/index.html">Xen paravirtualization</a>
 but should be able to integrate other virtualization mechanisms if needed.</p>
-<p> If you get lost searching for some specific API use, try
-<a href="https://libvirt.org/search.php">the online search
-engine</a> hosted on <a href="https://libvirt.org/">libvirt.org</a>
-it indexes the project page, the APIs as well as the mailing-list archives. </p>
 </body>
 </html>
 </xsl:document>
diff --git a/docs/search.php.code.in b/docs/search.php.code.in
deleted file mode 100644
index 01a6a64d28..0000000000
--- a/docs/search.php.code.in
+++ /dev/null
@@ -1,225 +0,0 @@
-<?php
-    $query = $_GET['query'];
-    // We handle only the first argument so far
-    $query = ltrim ($query);
-
-    $scope = $_GET['scope'];
-    if ($scope == NULL)
-        $scope = "any";
-    $scope = ltrim ($scope);
-    if ($scope == "")
-        $scope = "any";
-    $querystr = htmlspecialchars($query, ENT_QUOTES, 'UTF-8');
-?>
-
-<form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF'], ENT_QUOTES, 'UTF-8'), "?query=", rawurlencode($query) ?>"
-      enctype="application/x-www-form-urlencoded" method="get">
-  <input name="query" type="text" size="50" value="<?php echo $querystr ?>"/>
-  <select name="scope">
-    <option value="any">Search All</option>
-    <option value="API" <?php if ($scope == 'API') print "selected='selected'"?>>Only the APIs</option>
-    <option value="DOCS" <?php if ($scope == 'DOCS') print "selected='selected'"?>>Only the Documentation</option>
-    <option value="LISTS" <?php if ($scope == 'LISTS') print "selected='selected'"?>>Only the lists archives</option>
-  </select>
-  <input name="submit" type="submit" value="Search ..."/>
-</form>
-
-<?php
-    function logQueryWord($word) {
-        $result = mysql_query ("SELECT ID,Count FROM Queries WHERE Value='$word'");
-        if ($result) {
-            $i = mysql_num_rows($result);
-            if ($i == 0) {
-                mysql_free_result($result);
-                mysql_query ("INSERT INTO Queries (Value,Count) VALUES ('$word',1)");
-            } else {
-                $id = mysql_result($result, 0, 0);
-                $count = mysql_result($result, 0, 1);
-                $count ++;
-                mysql_query ("UPDATE Queries SET Count=$count WHERE ID=$id");
-            }
-        } else {
-            mysql_query ("INSERT INTO Queries (Value,Count) VALUES ('$word',1)");
-        }
-    }
-    function queryWord($word) {
-        $result = NULL;
-        $j = 0;
-        if ($word) {
-            $result = mysql_query ("SELECT words.relevance, symbols.name, symbols.type, symbols.module, symbols.descr FROM words, symbols WHERE LCASE(words.name) LIKE LCASE('$word') and words.symbol = symbols.name ORDER BY words.relevance DESC LIMIT 75");
-            if ($result) {
-                $j = mysql_num_rows($result);
-                if ($j == 0)
-                    mysql_free_result($result);
-            }
-            logQueryWord($word);
-        }
-        return array($result, $j);
-    }
-    function queryHTMLWord($word) {
-        $result = NULL;
-        $j = 0;
-        if ($word) {
-            $result = mysql_query ("SELECT relevance, name, id, resource, section FROM wordsHTML WHERE LCASE(name) LIKE LCASE('$word') ORDER BY relevance DESC LIMIT 75");
-            if ($result) {
-                $j = mysql_num_rows($result);
-                if ($j == 0)
-                    mysql_free_result($result);
-            }
-            logQueryWord($word);
-        }
-        return array($result, $j);
-    }
-    function queryArchiveWord($word) {
-        $result = NULL;
-        $j = 0;
-        if ($word) {
-            $result = mysql_query ("SELECT wordsArchive.relevance, wordsArchive.name, 'libvir-list', archives.resource, archives.title FROM wordsArchive, archives WHERE LCASE(wordsArchive.name) LIKE LCASE('$word') and wordsArchive.ID = archives.ID ORDER BY relevance DESC LIMIT 75");
-            if ($result) {
-                $j = mysql_num_rows($result);
-                if ($j == 0)
-                    mysql_free_result($result);
-            }
-            logQueryWord($word);
-        }
-        return array($result, $j);
-    }
-    function resSort ($a, $b) {
-        list($ra,$ta,$ma,$na,$da) = $a;
-        list($rb,$tb,$mb,$nb,$db) = $b;
-        if ($ra == $rb) return 0;
-        return ($ra > $rb) ? -1 : 1;
-    }
-    if (($query) && (strlen($query) <= 50)) {
-        $link = mysql_connect ("localhost", "nobody");
-        if (!$link) {
-            echo "<p> Could not connect to the database: ", mysql_error();
-        } else {
-            mysql_select_db("libvir", $link);
-            $list = explode (" ", $query);
-            $results = array();
-            $number = 0;
-            for ($number = 0;$number < count($list);$number++) {
-
-                $word = $list[$number];
-                if (($scope == 'any') || ($scope == 'API')) {
-                    list($result, $j) = queryWord($word);
-                    if ($j > 0) {
-                        for ($i = 0; $i < $j; $i++) {
-                            $relevance = mysql_result($result, $i, 0);
-                            $name = mysql_result($result, $i, 1);
-                            $type = mysql_result($result, $i, 2);
-                            $module = mysql_result($result, $i, 3);
-                            $desc = mysql_result($result, $i, 4);
-                            if (array_key_exists($name, $results)) {
-                                list($r,$t,$m,$d,$w,$u) = $results[$name];
-                                $results[$name] = array(($r + $relevance) * 2,
-                                                        $t,$m,$d,$w,$u);
-                            } else {
-                                $id = $name;
-                                $m = strtolower($module);
-                                $url = "html/libvirt-$module.html#$id";
-                                $results[$name] = array($relevance,$type,
-                                                        $module, $desc, $name, $url);
-                            }
-                        }
-                        mysql_free_result($result);
-                    }
-                }
-                if (($scope == 'any') || ($scope == 'DOCS')) {
-                    list($result, $k) = queryHTMLWord($word);
-                    if ($k > 0) {
-                        for ($i = 0; $i < $k; $i++) {
-                            $relevance = mysql_result($result, $i, 0);
-                            $name = mysql_result($result, $i, 1);
-                            $id = mysql_result($result, $i, 2);
-                            $module = mysql_result($result, $i, 3);
-                            $desc = mysql_result($result, $i, 4);
-                            $url = $module;
-                            if ($id != "") {
-                                $url = $url + "#$id";
-                            }
-                            $results["$name _html_ $number _ $i"] =
-                                array($relevance, "XML docs",
-                                      $module, $desc, $name, $url);
-                        }
-                        mysql_free_result($result);
-                    }
-                }
-                if (($scope == 'any') || ($scope == 'LISTS')) {
-                    list($result, $j) = queryArchiveWord($word);
-                    if ($j > 0) {
-                        for ($i = 0; $i < $j; $i++) {
-                            $relevance = mysql_result($result, $i, 0);
-                            $name = mysql_result($result, $i, 1);
-                            $type = mysql_result($result, $i, 2);
-                            $url = mysql_result($result, $i, 3);
-                            $desc = mysql_result($result, $i, 4);
-                            if (array_key_exists($url, $results)) {
-                                list($r,$t,$m,$d,$w,$u) = $results[$url];
-                                $results[$name] = array(($r + $relevance) * 2,
-                                                        $t,$m,$d,$w,$u);
-                            } else {
-                                $id = $name;
-                                $m = strtolower($module);
-                                $u = str_replace(
-                                    "http://www.redhat.com/archives/libvir-list/", "", $url);
-                                $results[$url] = array($relevance,$type,
-                                                       $u, $desc, $name, $url);
-                            }
-                        }
-                        mysql_free_result($result);
-                    }
-                }
-            }
-            if ((count($results) == 0) && (count($list) == 1)) {
-                $word = $list[0];
-                if (($scope == 'any') || ($scope == 'XMLAPI')) {
-                    list($result, $j) = queryWord("vir$word");
-                    if ($j > 0) {
-                        for ($i = 0; $i < $j; $i++) {
-                            $relevance = mysql_result($result, $i, 0);
-                            $name = mysql_result($result, $i, 1);
-                            $type = mysql_result($result, $i, 2);
-                            $module = mysql_result($result, $i, 3);
-                            $desc = mysql_result($result, $i, 4);
-                            if (array_key_exists($name, $results)) {
-                                list($r,$t,$m,$d,$w,$u) = $results[$name];
-                                $results[$name] = array(($r + $relevance) * 2,
-                                                        $t,$m,$d,$w,$u);
-                            } else {
-                                $id = $name;
-                                $m = strtolower($module);
-                                $url = "html/libvirt-$module.html#$id";
-                                $results[$name] = array($relevance,$type,
-                                                        $module, $desc, $name, $url);
-                            }
-                        }
-                        mysql_free_result($result);
-                    }
-                }
-            }
-            mysql_close($link);
-            $nb = count($results);
-            echo "<h3 align='center'>Found $nb results for query $querystr</h3>\n";
-            usort($results, "resSort");
-
-            if ($nb > 0) {
-                printf("<table><tbody>\n");
-                printf("<tr><td>Quality</td><td>Symbol</td><td>Type</td><td>module</td><td>Description</td></tr>\n");
-                $i = 0;
-                while (list ($name, $val) = each ($results)) {
-                    list($r,$t,$m,$d,$s,$u) = $val;
-                    $m = str_replace("<", "&lt;", $m);
-                    $s = str_replace("<", "&lt;", $s);
-                    $d = str_replace("<", "&lt;", $d);
-                    echo "<tr><td>$r</td><td><a href='$u'>$s</a></td><td>$t</td><td>$m</td><td>$d</td></tr>";
-                    $i = $i + 1;
-                    if ($i > 75)
-                        break;
-                }
-                printf("</tbody></table>\n");
-            }
-        }
-    }
-?>
diff --git a/docs/search.php.in b/docs/search.php.in
deleted file mode 100644
index 5de4fcee66..0000000000
--- a/docs/search.php.in
+++ /dev/null
@@ -1,16 +0,0 @@
-<?xml version="1.0"?>
-<!DOCTYPE html>
-<html xmlns="http://www.w3.org/1999/xhtml">
-  <body>
-    <h1>Search the documentation on Libvirt.org</h1>
-
-    <p>
-      The search service indexes the libvirt APIs and documentation as
-      well as the libvir-list@redhat.com mailing-list archives. To use
-      it simply provide a set of keywords:
-    </p>
-
-<span id="php_placeholder"/>
-
-  </body>
-</html>
-- 
2.21.0
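
The ranking rule buried in the removed search.php.code.in is simple once pulled out of the diff: each query word is looked up separately, a symbol hit by more than one word has its score replaced by `(old + new) * 2`, and results are then sorted by descending relevance (`resSort`). A minimal Python sketch of just that rule follows, assuming the hits for each word arrive as `(symbol, relevance)` pairs. `merge_hits` and its input shape are hypothetical and are not part of libvirt.

```python
def merge_hits(hits_per_word):
    """Combine per-word search hits into one ranked list.

    hits_per_word: one list per query word, each holding
    (symbol, relevance) pairs. Hypothetical input shape; the removed
    PHP read these rows from MySQL instead.
    """
    scores = {}
    for hits in hits_per_word:
        for symbol, relevance in hits:
            if symbol in scores:
                # Same boost as the PHP: (previous + new) * 2.
                scores[symbol] = (scores[symbol] + relevance) * 2
            else:
                scores[symbol] = relevance
    # Highest relevance first, as resSort did.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    ranked = merge_hits([
        [("virDomainCreate", 10), ("virDomainDestroy", 7)],
        [("virDomainCreate", 5)],
    ])
    for symbol, score in ranked:
        print(score, symbol)
```

The sketch leaves out the scope filter, the 75-row limits and the query logging that the PHP also performed.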

Daniel P. Berrangé

10:27 p.m.

New subject: [libvirt] [PATCH 3/3] docs: Remove search.php and all references

On Wed, Apr 03, 2019 at 06:26:51PM -0400, Cole Robinson wrote:

...

libvirt.org/search.php drops into some kind of screen which I guess is supposed to show a search bar with options, but presently for me renders as nothing but the following text:

Search the documentation on Libvirt.org

The search service indexes the libvirt APIs and documentation as well as the libvir-list@redhat.com mailing-list archives. To use it simply provide a set of keywords:

The main page search bar now redirects to google, this page is broken, I say we just remove it and move on.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
---
 .gitignore              |   1 -
 docs/Makefile.am        |  22 +---
 docs/devhelp/html.xsl   |   4 -
 docs/search.php.code.in | 225 ----------------------------------------
 docs/search.php.in      |  16 ---
 5 files changed, 2 insertions(+), 266 deletions(-)
 delete mode 100644 docs/search.php.code.in
 delete mode 100644 docs/search.php.in

Andrea Bolognani

4:25 p.m.

On Wed, 2019-04-03 at 18:26 -0400, Cole Robinson wrote:

...

Andrea's ChangeLog patches reminded me that I attempted something similarish once, and also tried to drop our custom website search:

http://www.redhat.com/archives/libvir-list/2016-May/msg01616.html

Glad to be an inspiration for you! :)

...

danpb recommended I just make our search bar use google instead but I never made the attempt. 3 years later here it is.

Patch #1 makes the switch but doesn't remove anything

Patch #2 drops index.py, the search indexer

Patch #3 drops search.php and all the references to it. libvirt.org/search.php is still a valid URL so this may break some links, even though at the moment the page prints virtually nothing except some boilerplate text. Maybe we can redirect it to the homepage or something, ideas welcome

I like the idea, and I like the diffstat even more O:-)

I would, however, suggest a slightly different implementation than what you have here, where using the search bar would still send you to https://libvirt.org/search.php?query=X, and that page would contain three links:

* "Search libvirt.org" https://google.com/search?q=site:libvirt.org+X

* "Search the libvirt wiki" https://wiki.libvirt.org/index.php?search=X

* "Search the libvir-list mailing list archives" https://google.com/search?q=site:redhat.com/archives/libvir-list+X

The advantages of this approach are that existing links pointing to search.php will keep working, people will be able to search the mailing list archives as well (this is a feature the current search implementation is supposed to have) and search the wiki without having to go through an extra hoop, and finally that extending the search interface to include more sources would become as simple as adding another generated link to the page.

What do you think?

-- 
Andrea Bolognani / Red Hat / Virtualization
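
The three generated links proposed in this message can be sketched in a few lines. The Python below is only an illustration of the URL construction, not the PHP or JavaScript the thread goes on to discuss. `search_links` is a hypothetical name, and the URL templates are copied from the list above with X replaced by the encoded query.

```python
from urllib.parse import quote_plus

# (label, URL prefix) pairs taken from the proposal; the encoded query
# is appended where the message writes X.
SOURCES = [
    ("Search libvirt.org",
     "https://google.com/search?q=site:libvirt.org+"),
    ("Search the libvirt wiki",
     "https://wiki.libvirt.org/index.php?search="),
    ("Search the libvir-list mailing list archives",
     "https://google.com/search?q=site:redhat.com/archives/libvir-list+"),
]


def search_links(query):
    """Return (label, url) pairs for one user query (hypothetical helper)."""
    encoded = quote_plus(query)
    return [(label, prefix + encoded) for label, prefix in SOURCES]


if __name__ == "__main__":
    for label, url in search_links("virsh list"):
        print(label, url)
```

Adding another source is then one more entry in `SOURCES`, which is the extensibility argument made in the message.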

Cole Robinson

10:18 p.m.

On 4/4/19 4:25 AM, Andrea Bolognani wrote:

...

On Wed, 2019-04-03 at 18:26 -0400, Cole Robinson wrote:

...
Andrea's ChangeLog patches reminded me that I attempted something similarish once, and also tried to drop our custom website search:

http://www.redhat.com/archives/libvir-list/2016-May/msg01616.html

Glad to be an inspiration for you! :)

...
danpb recommended I just make our search bar use google instead but I never made the attempt. 3 years later here it is.

Patch #1 makes the switch but doesn't remove anything

Patch #2 drops index.py, the search indexer

Patch #3 drops search.php and all the references to it. libvirt.org/search.php is still a valid URL so this may break some links, even though at the moment the page prints virtually nothing except some boilerplate text. Maybe we can redirect it to the homepage or something, ideas welcome

I like the idea, and I like the diffstat even more O:-)

I would, however, suggest a slightly different implementation than what you have here, where using the search bar would still send you to https://libvirt.org/search.php?query=X, and that page would contain three links:

* "Search libvirt.org" https://google.com/search?q=site:libvirt.org+X

* "Search the libvirt wiki" https://wiki.libvirt.org/index.php?search=X

* "Search the libvir-list mailing list archives" https://google.com/search?q=site:redhat.com/archives/libvir-list+X

The advantages of this approach are that existing links pointing to search.php will keep working, people will be able to search the mailing list archives as well (this is a feature the current search implementation is supposed to have) and search the wiki without having to go through an extra hoop, and finally that extending the search interface to include more sources would become as simple as adding another generated link to the page.

No objections from me, but considering the embarrassingly long time it took me to even get this solution working, I shouldn't be the one to implement it! Unless someone volunteers in the short term I think patch #1 and #2 can still go in, and search.php can be removed/replaced with the proper fix.

Thanks,
Cole

Daniel P. Berrangé

10:29 p.m.

On Thu, Apr 04, 2019 at 10:25:52AM +0200, Andrea Bolognani wrote:

...

On Wed, 2019-04-03 at 18:26 -0400, Cole Robinson wrote:

...
Andrea's ChangeLog patches reminded me that I attempted something similarish once, and also tried to drop our custom website search:

http://www.redhat.com/archives/libvir-list/2016-May/msg01616.html

Glad to be an inspiration for you! :)

...
danpb recommended I just make our search bar use google instead but I never made the attempt. 3 years later here it is.

Patch #1 makes the switch but doesn't remove anything

Patch #2 drops index.py, the search indexer

Patch #3 drops search.php and all the references to it. libvirt.org/search.php is still a valid URL so this may break some links, even though at the moment the page prints virtually nothing except some boilerplate text. Maybe we can redirect it to the homepage or something, ideas welcome

I like the idea, and I like the diffstat even more O:-)

I would, however, suggest a slightly different implementation than what you have here, where using the search bar would still send you to https://libvirt.org/search.php?query=X, and that page would contain three links:

* "Search libvirt.org" https://google.com/search?q=site:libvirt.org+X

* "Search the libvirt wiki" https://wiki.libvirt.org/index.php?search=X

* "Search the libvir-list mailing list archives" https://google.com/search?q=site:redhat.com/archives/libvir-list+X

The advantages of this approach are that existing links pointing to search.php will keep working, people will be able to search the mailing list archives as well (this is a feature the current search implementation is supposed to have) and search the wiki without having to go through an extra hoop, and finally that extending the search interface to include more sources would become as simple as adding another generated link to the page.

What do you think?

Sounds reasonable, though I don't think it should be a search.php page. It can be done with a plain search.html page and small amount of javascript.

Avoiding php has the benefit that it works for locally installed docs from the RPM.

I'm not fussed about broken links because search.php is long broken and so any bookmark to it is largely useless already.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Andrea Bolognani

10:46 p.m.

On Thu, 2019-04-04 at 15:29 +0100, Daniel P. Berrangé wrote:

...

On Thu, Apr 04, 2019 at 10:25:52AM +0200, Andrea Bolognani wrote:

...
I would, however, suggest a slightly different implementation than what you have here, where using the search bar would still send you to https://libvirt.org/search.php?query=X, and that page would contain three links:

* "Search libvirt.org" https://google.com/search?q=site:libvirt.org+X

* "Search the libvirt wiki" https://wiki.libvirt.org/index.php?search=X

* "Search the libvir-list mailing list archives" https://google.com/search?q=site:redhat.com/archives/libvir-list+X

The advantages of this approach are that existing links pointing to search.php will keep working, people will be able to search the mailing list archives as well (this is a feature the current search implementation is supposed to have) and search the wiki without having to go through an extra hoop, and finally that extending the search interface to include more sources would become as simple as adding another generated link to the page.

What do you think?

Sounds reasonable, though I don't think it should be a search.php page. It can be done with a plain search.html page and small amount of javascript.

Avoiding php has the benefit that it works for locally installed docs from the RPM.

I'm not a big fan of requiring JavaScript for websites, but on the other hand it looks like the search function is the only bit of PHP code we have in the repository so I'm okay with the idea of dropping it.

That said, while I'd be perfectly comfortable implementing the idea above in PHP, I wouldn't quite know where to start for a JavaScript version. Dan, would you be willing to write that code yourself?

-- 
Andrea Bolognani / Red Hat / Virtualization

Daniel P. Berrangé

5 Apr

3:31 p.m.

On Thu, Apr 04, 2019 at 04:46:25PM +0200, Andrea Bolognani wrote:

...

On Thu, 2019-04-04 at 15:29 +0100, Daniel P. Berrangé wrote:

...
On Thu, Apr 04, 2019 at 10:25:52AM +0200, Andrea Bolognani wrote:

...
I would, however, suggest a slightly different implementation than what you have here, where using the search bar would still send you to https://libvirt.org/search.php?query=X, and that page would contain three links:

* "Search libvirt.org" https://google.com/search?q=site:libvirt.org+X

* "Search the libvirt wiki" https://wiki.libvirt.org/index.php?search=X

* "Search the libvir-list mailing list archives" https://google.com/search?q=site:redhat.com/archives/libvir-list+X

The advantages of this approach are that existing links pointing to search.php will keep working, people will be able to search the mailing list archives as well (this is a feature the current search implementation is supposed to have) and search the wiki without having to go through an extra hoop, and finally that extending the search interface to include more sources would become as simple as adding another generated link to the page.

What do you think?

Sounds reasonable, though I don't think it should be a search.php page. It can be done with a plain search.html page and small amount of javascript.

Avoiding php has the benefit that it works for locally installed docs from the RPM.

I'm not a big fan of requiring JavaScript for websites, but on the other hand it looks like the search function is the only bit of PHP code we have in the repository so I'm okay with the idea of dropping it.

That said, while I'd be perfectly comfortable implementing the idea above in PHP, I wouldn't quite know where to start for a JavaScript version. Dan, would you be willing to write that code yourself?

Yes, I will have a go.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

participants (4)

Andrea Bolognani
Cole Robinson
Daniel P. Berrangé
Ján Tomko
