mirror of
https://github.com/Hopiu/linkchecker.git
synced 2026-04-27 17:44:42 +00:00
removed
git-svn-id: https://linkchecker.svn.sourceforge.net/svnroot/linkchecker/trunk/linkchecker@1420 e7d03fd6-7b0d-0410-9947-9c21f3af8025
This commit is contained in:
parent
288af2fb64
commit
9f7e3e67a9
39 changed files with 0 additions and 1558 deletions
FAQ | 152 lines deleted
@@ -1,152 +0,0 @@

Q1: LinkChecker produced an error, but my web page is ok with
Netscape/IE/Opera/... Is this a bug in LinkChecker?
A1: Please check your web pages first. Are they really ok? Use a
syntax-highlighting editor! Use HTML Tidy from www.w3c.org!
Check whether you are using a proxy that produces the error.


Q2: I still get an error, but the page is definitely ok.
A2: Some servers deny access to automated tools (also called robots)
like LinkChecker. This is not a bug in LinkChecker but rather a
policy of the webmaster running the website you are checking.
A website might even send robots different web pages than it sends
to normal browsers.


Q3: How can I tell LinkChecker which proxy to use?
A3: LinkChecker works transparently with proxies. In a Unix or Windows
environment, set the http_proxy, https_proxy, ftp_proxy or gopher_proxy
environment variables to a URL that identifies the proxy server before
starting LinkChecker. For example:

# http_proxy="http://www.someproxy.com:3128"
# export http_proxy

In a Macintosh environment, LinkChecker will retrieve proxy information
from Internet Config.


Q4: The link "mailto:john@company.com?subject=Hello John" is reported
as an error.
A4: You have to quote special characters (e.g. spaces) in the subject
field. The correct link should be "mailto:...?subject=Hello%20John".
Unfortunately, browsers like IE and Netscape do not enforce this.
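The percent-encoding described in A4 can be reproduced with Python's standard library; a minimal sketch using the example address from the question above:

```python
from urllib.parse import quote

# Percent-encode the subject so the mailto URL contains no raw spaces
subject = "Hello John"
link = "mailto:john@company.com?subject=" + quote(subject)
print(link)  # mailto:john@company.com?subject=Hello%20John
```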
Q5: Does LinkChecker support JavaScript?
A5: No, and it never will. If your page does not work without
JavaScript, your web design is broken.
Use PHP, Zope or ASP for dynamic content, and use JavaScript only
as an add-on for your web pages.


Q6: I have a pretty large site to check. How can I restrict link
checking to my own pages only?
A6: Look at the options --intern, --extern, --strict, --denyallow
and --recursion-level.


Q7: I don't get this --extern/--intern stuff.
A7: When it comes to checking, there are three types of URLs:
1) strict external URLs:
   Only syntax checking is done. Internal URLs are never strict.
2) external URLs:
   Like 1), but we additionally check whether they are valid by
   connect()ing to them.
3) internal URLs:
   Like 2), but we additionally check whether they are HTML pages
   and, if so, descend recursively into the link and check all the
   links in the HTML content.
The --recursion-level option restricts the number of such recursive
descents.

LinkChecker provides four options which determine which of those
three categories a URL falls into: --intern, --extern, --strict and
--denyallow.
By default all URLs are internal. With --extern you specify which
URLs are external; with --intern you specify which URLs are internal.
Now imagine you give both --extern and --intern. What happens when a
URL matches both patterns? Or when it matches none? In this situation
the --denyallow option specifies the order in which we match the URL.
By default the order is internal/external; with --denyallow it is
external/internal. Either way, the first match counts, and if none
matches, the last checked category is the category for the URL.
Finally, with --strict all external URLs are strict.

Oh, and just to boggle your mind: you can have more than one external
regular expression in a config file, and for each of those expressions
you can specify whether the matched external URLs should be strict
or not.

An example. Assume we want to check only URLs of our domains
'mydomain.com' and 'myotherdomain.com'. Then we specify
-i'^http://my(other)?domain\.com' as the internal regular expression;
all other URLs are treated as external. Easy.

Another example. We don't want to check mailto URLs. Then it's
-i'!^mailto:'. The '!' negates an expression. With --strict, we don't
even connect to any mail hosts.

Yet another example. We check our site www.mycompany.com, do not want
to recurse into external links pointing outside our site, and want to
ignore links to hollowood.com and hullabulla.com completely.
This can only be done with a configuration entry like

[filtering]
extern1=hollowood.com 1
extern2=hullabulla.com 1
# the 1 means strict external, i.e. don't even connect

and the command

linkchecker --intern=www.mycompany.com www.mycompany.com
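The internal/external split from A7 boils down to regular-expression matching; a minimal sketch using the -i pattern from the example above (the URLs are invented for illustration):

```python
import re

# The internal pattern from the -i example above
intern = re.compile(r"^http://my(other)?domain\.com")

urls = ["http://mydomain.com/index.html",
        "http://myotherdomain.com/faq.html",
        "http://hollowood.com/"]
for url in urls:
    # URLs matching the pattern are internal; everything else is external
    kind = "internal" if intern.match(url) else "external"
    print(url, "->", kind)
```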
Q8: Is LinkChecker's cookie feature insecure?
A8: Cookies cannot store more information than is in the HTTP request
itself, so you are not giving away any additional system information.
After being stored, however, cookies are sent back to the server on
request. Not to every server, but only to the one the cookie
originated from! This can be used to "track" subsequent requests to
that server, and this is what annoys some people (including me).
Cookies are only stored in memory. After LinkChecker finishes, they
are lost, so the tracking is restricted to the checking time.
The cookie feature is disabled by default.


Q9: I want to have my own logging class. How can I use it in
LinkChecker?
A9: Currently, only a Python API lets you define new logging classes.
Define your own logging class as a subclass of StandardLogger or any
other logging class in the log module.
Then call the addLogger function in Config.Configuration to register
your new Logger.
After this, append a new Logger instance to the fileoutput:

import linkcheck, MyLogger
log_format = 'mylog'
log_args = {'fileoutput': log_format, 'filename': 'foo.txt'}
cfg = linkcheck.Config.Configuration()
cfg.addLogger(log_format, MyLogger.MyLogger)
cfg['fileoutput'].append(cfg.newLogger(log_format, log_args))


Q10.1: LinkChecker does not ignore anchor references when caching.
Q10.2: Some links with anchors are checked twice.
A10: This is not a bug.
It is common to believe that if a URL ABC#anchor1 works, then
ABC#anchor2 works too. That is not specified anywhere, and I have
seen server-side scripts that fail on some anchors and not on others.
This is the reason for always checking URLs with different anchors.
If you really want to disable this, use --no-anchor-caching.


Q11: I see LinkChecker gets a "/robots.txt" file for every site it
checks. What is that about?
A11: LinkChecker follows the robots.txt exclusion standard. To avoid
misuse of LinkChecker, you cannot turn this feature off.
See http://www.robotstxt.org/wc/robots.html and
http://www.w3.org/Search/9605-Indexing-Workshop/ReportOutcomes/Spidering.txt
for more info.


Q12: Ctrl-C does not stop LinkChecker immediately. Why is that?
A12: The Python interpreter has to wait for all threads to finish,
and this means waiting for all open sockets to close. The default
timeout for sockets is 30 seconds, hence the delay.
You can change the default socket timeout with the --timeout option.
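The socket timeout behind A12 is a process-wide Python setting; a minimal sketch of what --timeout effectively adjusts (the value 10 is arbitrary):

```python
import socket

# Lower the global default timeout so stuck connections give up sooner,
# which also makes Ctrl-C take effect faster
socket.setdefaulttimeout(10)
print(socket.getdefaulttimeout())
```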
WONTDO | 16 lines deleted
@@ -1,16 +0,0 @@

This is a list of things LinkChecker will *not* do for you.

1) Support JavaScript.
   See the FAQ, question Q5.

2) Print unreachable/dead documents of your website.
   This would require
   - file system access to your web repository
   - access to your web server configuration
   You can instead store the LinkChecker results in a database
   and look for missing files.

3) HTML/XML syntax checking.
   Use the HTML Tidy program from http://tidy.sourceforge.net/ .
@@ -1,77 +0,0 @@

# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import sys
import os
import linkcheck.logger.Logger


class BlacklistLogger (linkcheck.logger.Logger.Logger):
    """Updates a blacklist of wrong links. If a link on the blacklist
    is working (again), it is removed from the list. So after n days
    we have only links on the list which failed for n days.
    """

    def __init__ (self, **args):
        super(BlacklistLogger, self).__init__(**args)
        self.errors = 0
        self.blacklist = {}
        if args.has_key('fileoutput'):
            self.fileoutput = True
            filename = args['filename']
            if os.path.exists(filename):
                self.readBlacklist(file(filename, "r"))
            self.fd = file(filename, "w")
        elif args.has_key('fd'):
            # fix: fileoutput was left unset in this branch, which broke
            # the self.fileoutput check in writeBlacklist()
            self.fileoutput = False
            self.fd = args['fd']
        else:
            self.fileoutput = False
            self.fd = sys.stdout

    def newUrl (self, urlData):
        if not urlData.cached:
            key = urlData.getCacheKey()
            if key in self.blacklist:
                if urlData.valid:
                    del self.blacklist[key]
                else:
                    self.blacklist[key] += 1
            else:
                if not urlData.valid:
                    self.blacklist[key] = 1

    def endOfOutput (self, linknumber=-1):
        self.writeBlacklist()

    def readBlacklist (self, fd):
        for line in fd:
            line = line.rstrip()
            if line.startswith('#') or not line:
                continue
            value, key = line.split(None, 1)
            self.blacklist[key] = int(value)
        fd.close()

    def writeBlacklist (self):
        """write the blacklist"""
        oldmask = os.umask(0077)
        for key, value in self.blacklist.items():
            self.fd.write("%d %s\n" % (value, key))
        if self.fileoutput:
            self.fd.close()
        # restore umask
        os.umask(oldmask)
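The blacklist file handled by readBlacklist/writeBlacklist above is plain "failure-count URL" lines; a round-trip sketch in current Python (the URLs are made up):

```python
import io

blacklist = {"http://dead.example.com/": 3, "http://gone.example.com/": 1}

# Write one "count url" pair per line, as writeBlacklist does
buf = io.StringIO()
for key, value in blacklist.items():
    buf.write("%d %s\n" % (value, key))

# Read it back, skipping comments and blank lines, as readBlacklist does
restored = {}
for line in buf.getvalue().splitlines():
    line = line.rstrip()
    if line.startswith("#") or not line:
        continue
    value, key = line.split(None, 1)
    restored[key] = int(value)

print(restored == blacklist)  # True
```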
@@ -1,92 +0,0 @@

# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import time
import csv
import bk.i18n
import linkcheck.logger.StandardLogger
import linkcheck.logger.Logger


class CSVLogger (linkcheck.logger.StandardLogger.StandardLogger):
    """CSV output. CSV consists of one line per entry. Entries are
    separated by a semicolon.
    """

    def __init__ (self, **args):
        super(CSVLogger, self).__init__(**args)
        self.separator = args['separator']
        self.lineterminator = "\n"

    def init (self):
        linkcheck.logger.Logger.Logger.init(self)
        if self.fd is None:
            return
        self.starttime = time.time()
        if self.has_field("intro"):
            self.fd.write("# "+(bk.i18n._("created by %s at %s%s") % (linkcheck.Config.AppName, bk.strtime.strtime(self.starttime), self.lineterminator)))
            self.fd.write("# "+(bk.i18n._("Get the newest version at %s%s") % (linkcheck.Config.Url, self.lineterminator)))
            self.fd.write("# "+(bk.i18n._("Write comments and bugs to %s%s%s") % \
                (linkcheck.Config.Email, self.lineterminator, self.lineterminator)))
            self.fd.write(
                bk.i18n._("# Format of the entries:")+self.lineterminator+\
                "# urlname;"+self.lineterminator+\
                "# recursionlevel;"+self.lineterminator+\
                "# parentname;"+self.lineterminator+\
                "# baseref;"+self.lineterminator+\
                "# errorstring;"+self.lineterminator+\
                "# validstring;"+self.lineterminator+\
                "# warningstring;"+self.lineterminator+\
                "# infostring;"+self.lineterminator+\
                "# valid;"+self.lineterminator+\
                "# url;"+self.lineterminator+\
                "# line;"+self.lineterminator+\
                "# column;"+self.lineterminator+\
                "# name;"+self.lineterminator+\
                "# dltime;"+self.lineterminator+\
                "# dlsize;"+self.lineterminator+\
                "# checktime;"+self.lineterminator+\
                "# cached;"+self.lineterminator)
            self.flush()
        self.writer = csv.writer(self.fd, dialect='excel', delimiter=self.separator, lineterminator=self.lineterminator)

    def newUrl (self, urlData):
        if self.fd is None:
            return
        row = [urlData.urlName, urlData.recursionLevel,
               urlData.parentName or "", urlData.baseRef,
               urlData.errorString, urlData.validString,
               urlData.warningString, urlData.infoString,
               urlData.valid, urlData.url,
               urlData.line, urlData.column,
               urlData.name, urlData.dltime,
               urlData.dlsize, urlData.checktime,
               urlData.cached]
        self.writer.writerow(row)
        self.flush()

    def endOfOutput (self, linknumber=-1):
        if self.fd is None:
            return
        self.stoptime = time.time()
        if self.has_field("outro"):
            duration = self.stoptime - self.starttime
            self.fd.write("# "+bk.i18n._("Stopped checking at %s (%s)%s") % \
                (bk.strtime.strtime(self.stoptime),
                 bk.strtime.strduration(duration), self.lineterminator))
        self.flush()
        self.fd.close()
        self.fd = None
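The semicolon-separated lines this logger emits can be read back with the same csv module; a minimal sketch with one invented row in the documented column order:

```python
import csv
import io

# One fake entry: urlname through cached (17 columns, matching the
# format comment the logger writes in its intro)
data = "http://example.com/;0;;;;Valid;;;True;http://example.com/;0;0;;0.1;1024;0.2;False\n"
for row in csv.reader(io.StringIO(data), delimiter=";"):
    print(row[0], row[8])  # urlname and the valid column
```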
@@ -1,156 +0,0 @@

# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import bk.i18n
import bk.ansicolor
import linkcheck.logger.StandardLogger


class ColoredLogger (linkcheck.logger.StandardLogger.StandardLogger):
    """ANSI colorized output"""

    def __init__ (self, **args):
        super(ColoredLogger, self).__init__(**args)
        self.colorparent = bk.ansicolor.esc_ansicolor(args['colorparent'])
        self.colorurl = bk.ansicolor.esc_ansicolor(args['colorurl'])
        self.colorname = bk.ansicolor.esc_ansicolor(args['colorname'])
        self.colorreal = bk.ansicolor.esc_ansicolor(args['colorreal'])
        self.colorbase = bk.ansicolor.esc_ansicolor(args['colorbase'])
        self.colorvalid = bk.ansicolor.esc_ansicolor(args['colorvalid'])
        self.colorinvalid = bk.ansicolor.esc_ansicolor(args['colorinvalid'])
        self.colorinfo = bk.ansicolor.esc_ansicolor(args['colorinfo'])
        self.colorwarning = bk.ansicolor.esc_ansicolor(args['colorwarning'])
        self.colordltime = bk.ansicolor.esc_ansicolor(args['colordltime'])
        self.colordlsize = bk.ansicolor.esc_ansicolor(args['colordlsize'])
        self.colorreset = bk.ansicolor.esc_ansicolor(args['colorreset'])
        self.currentPage = None
        self.prefix = 0

    def newUrl (self, urlData):
        if self.fd is None:
            return
        if self.has_field("parenturl"):
            if urlData.parentName:
                if self.currentPage != urlData.parentName:
                    if self.prefix:
                        self.fd.write("o\n")
                    self.fd.write("\n"+self.field("parenturl")+
                                  self.spaces("parenturl")+
                                  self.colorparent+
                                  (urlData.parentName or "")+
                                  self.colorreset+"\n")
                    self.currentPage = urlData.parentName
                    self.prefix = 1
            else:
                if self.prefix:
                    self.fd.write("o\n")
                self.prefix = 0
                self.currentPage = None
        if self.has_field("url"):
            if self.prefix:
                self.fd.write("|\n+- ")
            else:
                self.fd.write("\n")
            self.fd.write(self.field("url")+self.spaces("url")+self.colorurl+
                          urlData.urlName+self.colorreset)
            if urlData.line:
                self.fd.write(bk.i18n._(", line %d") % urlData.line)
            if urlData.column:
                self.fd.write(bk.i18n._(", col %d") % urlData.column)
            if urlData.cached:
                self.fd.write(bk.i18n._(" (cached)\n"))
            else:
                self.fd.write("\n")

        if urlData.name and self.has_field("name"):
            if self.prefix:
                self.fd.write("| ")
            self.fd.write(self.field("name")+self.spaces("name")+
                          self.colorname+urlData.name+self.colorreset+"\n")
        if urlData.baseRef and self.has_field("base"):
            if self.prefix:
                self.fd.write("| ")
            self.fd.write(self.field("base")+self.spaces("base")+
                          self.colorbase+urlData.baseRef+self.colorreset+"\n")

        if urlData.url and self.has_field("realurl"):
            if self.prefix:
                self.fd.write("| ")
            self.fd.write(self.field("realurl")+self.spaces("realurl")+
                          self.colorreal+urlData.url+
                          self.colorreset+"\n")
        if urlData.dltime >= 0 and self.has_field("dltime"):
            if self.prefix:
                self.fd.write("| ")
            self.fd.write(self.field("dltime")+self.spaces("dltime")+
                          self.colordltime+
                          (bk.i18n._("%.3f seconds") % urlData.dltime)+
                          self.colorreset+"\n")
        if urlData.dlsize >= 0 and self.has_field("dlsize"):
            if self.prefix:
                self.fd.write("| ")
            self.fd.write(self.field("dlsize")+self.spaces("dlsize")+
                          self.colordlsize+linkcheck.StringUtil.strsize(urlData.dlsize)+
                          self.colorreset+"\n")
        if urlData.checktime and self.has_field("checktime"):
            if self.prefix:
                self.fd.write("| ")
            self.fd.write(self.field("checktime")+self.spaces("checktime")+
                          self.colordltime+
                          (bk.i18n._("%.3f seconds") % urlData.checktime)+self.colorreset+"\n")

        if urlData.infoString and self.has_field("info"):
            if self.prefix:
                self.fd.write("| "+self.field("info")+self.spaces("info")+
                    linkcheck.StringUtil.indentWith(linkcheck.StringUtil.blocktext(
                        urlData.infoString, 65), "| "+self.spaces("info")))
            else:
                self.fd.write(self.field("info")+self.spaces("info")+
                    linkcheck.StringUtil.indentWith(linkcheck.StringUtil.blocktext(
                        urlData.infoString, 65), " "+self.spaces("info")))
            self.fd.write(self.colorreset+"\n")

        if urlData.warningString:
            #self.warnings += 1
            if self.has_field("warning"):
                if self.prefix:
                    self.fd.write("| ")
                self.fd.write(self.field("warning")+self.spaces("warning")+
                              self.colorwarning+
                              urlData.warningString+self.colorreset+"\n")

        if self.has_field("result"):
            if self.prefix:
                self.fd.write("| ")
            self.fd.write(self.field("result")+self.spaces("result"))
            if urlData.valid:
                self.fd.write(self.colorvalid+urlData.validString+
                              self.colorreset+"\n")
            else:
                self.errors += 1
                self.fd.write(self.colorinvalid+urlData.errorString+
                              self.colorreset+"\n")
        self.flush()

    def endOfOutput (self, linknumber=-1):
        if self.fd is None:
            return
        if self.has_field("outro"):
            if self.prefix:
                self.fd.write("o\n")
        super(ColoredLogger, self).endOfOutput(linknumber=linknumber)
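The esc_ansicolor calls above expand color names into ordinary ANSI escape sequences; a hedged stand-in (the code table here is assumed for illustration, not bk.ansicolor's actual one):

```python
# Assumed ANSI color codes; bk.ansicolor supports its own name table
COLORS = {"default": "0", "red": "31", "green": "32", "yellow": "33"}

def esc_ansicolor(name):
    """Build an escape sequence like \\x1b[31m for the given color name."""
    return "\x1b[%sm" % COLORS[name]

print(esc_ansicolor("green") + "valid" + esc_ansicolor("default"))
```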
@@ -1,99 +0,0 @@

# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import time
import linkcheck.logger.StandardLogger
import bk.i18n


class GMLLogger (linkcheck.logger.StandardLogger.StandardLogger):
    """GML means Graph Modeling Language. Use a GML tool to see
    your sitemap graph.
    """

    def __init__ (self, **args):
        super(GMLLogger, self).__init__(**args)
        self.nodes = {}
        self.nodeid = 0

    def init (self):
        linkcheck.logger.Logger.Logger.init(self)
        if self.fd is None:
            return
        self.starttime = time.time()
        if self.has_field("intro"):
            self.fd.write("# "+(bk.i18n._("created by %s at %s\n") % (linkcheck.Config.AppName,
                bk.strtime.strtime(self.starttime))))
            self.fd.write("# "+(bk.i18n._("Get the newest version at %s\n") % linkcheck.Config.Url))
            self.fd.write("# "+(bk.i18n._("Write comments and bugs to %s\n\n") % \
                linkcheck.Config.Email))
        self.fd.write("graph [\n  directed 1\n")
        self.flush()

    def newUrl (self, urlData):
        """write one node and all possible edges"""
        if self.fd is None:
            return
        node = urlData
        if node.url and not self.nodes.has_key(node.url):
            node.id = self.nodeid
            self.nodes[node.url] = node
            self.nodeid += 1
            self.fd.write("  node [\n")
            self.fd.write("    id %d\n" % node.id)
            if self.has_field("realurl"):
                self.fd.write('    label "%s"\n' % node.url)
            if node.dltime >= 0 and self.has_field("dltime"):
                self.fd.write("    dltime %d\n" % node.dltime)
            if node.dlsize >= 0 and self.has_field("dlsize"):
                self.fd.write("    dlsize %d\n" % node.dlsize)
            if node.checktime and self.has_field("checktime"):
                self.fd.write("    checktime %d\n" % node.checktime)
            if self.has_field("extern"):
                self.fd.write("    extern %d\n" % (node.extern and 1 or 0))
            self.fd.write("  ]\n")
        self.writeEdges()

    def writeEdges (self):
        """write all edges we can find in the graph in a brute-force
        manner. Better would be a mapping of parent URLs.
        """
        for node in self.nodes.values():
            if self.nodes.has_key(node.parentName):
                self.fd.write("  edge [\n")
                self.fd.write('    label "%s"\n' % node.urlName)
                if self.has_field("parenturl"):
                    self.fd.write("    source %d\n" % \
                        self.nodes[node.parentName].id)
                self.fd.write("    target %d\n" % node.id)
                if self.has_field("result"):
                    self.fd.write("    valid %d\n" % (node.valid and 1 or 0))
                self.fd.write("  ]\n")
        self.flush()

    def endOfOutput (self, linknumber=-1):
        if self.fd is None:
            return
        self.fd.write("]\n")
        if self.has_field("outro"):
            self.stoptime = time.time()
            duration = self.stoptime - self.starttime
            self.fd.write("# "+bk.i18n._("Stopped checking at %s (%s)\n") % \
                (bk.strtime.strtime(self.stoptime),
                 bk.strtime.strduration(duration)))
        self.flush()
        self.fd = None
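The GML this logger writes is plain indented text; a minimal sketch of the node/edge structure produced by newUrl and writeEdges above (the URLs are invented):

```python
nodes = {"http://a.example/": 0, "http://a.example/b": 1}
edges = [(0, 1, 1)]  # (source id, target id, valid flag)

lines = ["graph [", "  directed 1"]
for url, nid in nodes.items():
    # one node block per unique URL, as in newUrl
    lines += ["  node [", "    id %d" % nid, '    label "%s"' % url, "  ]"]
for src, dst, valid in edges:
    # one edge block per parent/child pair, as in writeEdges
    lines += ["  edge [", "    source %d" % src, "    target %d" % dst,
              "    valid %d" % valid, "  ]"]
lines.append("]")
print("\n".join(lines))
```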
@@ -1,175 +0,0 @@

# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import time
import linkcheck.logger.StandardLogger
import linkcheck.StringUtil
import linkcheck.Config
import bk.i18n


HTML_HEADER = """<!DOCTYPE html PUBLIC "-//W3C//DTD html 4.01//EN">
<html><head><title>%s</title>
<style type="text/css">\n<!--
 h2 { font-family: Verdana,sans-serif; font-size: 22pt;
 font-style: bold; font-weight: bold }
 body { font-family: Arial,sans-serif; font-size: 11pt }
 td { font-family: Arial,sans-serif; font-size: 11pt }
 code { font-family: Courier }
 a:hover { color: #34a4ef }
//-->
</style></head>
<body bgcolor="%s" link="%s" vlink="%s" alink="%s">
"""

class HtmlLogger (linkcheck.logger.StandardLogger.StandardLogger):
    """Logger with HTML output"""

    def __init__ (self, **args):
        super(HtmlLogger, self).__init__(**args)
        self.colorbackground = args['colorbackground']
        self.colorurl = args['colorurl']
        self.colorborder = args['colorborder']
        self.colorlink = args['colorlink']
        self.tablewarning = args['tablewarning']
        self.tableerror = args['tableerror']
        self.tableok = args['tableok']

    def init (self):
        linkcheck.logger.Logger.Logger.init(self)
        if self.fd is None:
            return
        self.starttime = time.time()
        self.fd.write(HTML_HEADER % (linkcheck.Config.App, self.colorbackground,
            self.colorlink, self.colorlink, self.colorlink))
        if self.has_field('intro'):
            self.fd.write("<center><h2>"+linkcheck.Config.App+"</h2></center>"+
                "<br><blockquote>"+linkcheck.Config.Freeware+"<br><br>"+
                (bk.i18n._("Start checking at %s\n") % \
                 bk.strtime.strtime(self.starttime))+
                "<br>")
        self.flush()

    def newUrl (self, urlData):
        if self.fd is None:
            return
        self.fd.write("<br clear=\"all\"><br>\n"+
            "<table align=\"left\" border=\"0\" cellspacing=\"0\" cellpadding=\"1\"\n"+
            " bgcolor=\""+self.colorborder+"\" summary=\"Border\">\n"+
            "<tr>\n"+
            "<td>\n"+
            "<table align=\"left\" border=\"0\" cellspacing=\"0\" cellpadding=\"3\"\n"+
            " summary=\"checked link\" bgcolor=\""+self.colorbackground+"\">\n")
        if self.has_field("url"):
            self.fd.write("<tr>\n"+
                "<td bgcolor=\""+self.colorurl+"\">"+self.field("url")+"</td>\n"+
                "<td bgcolor=\""+self.colorurl+"\">"+urlData.urlName)
            if urlData.cached:
                self.fd.write(bk.i18n._(" (cached)"))
            self.fd.write("</td>\n</tr>\n")
        if urlData.name and self.has_field("name"):
            self.fd.write("<tr>\n<td>"+self.field("name")+"</td>\n<td>"+
                urlData.name+"</td>\n</tr>\n")
        if urlData.parentName and self.has_field("parenturl"):
            self.fd.write("<tr>\n<td>"+self.field("parenturl")+
                '</td>\n<td><a target="top" href="'+
                (urlData.parentName or "")+'">'+
                (urlData.parentName or "")+"</a>")
            if urlData.line:
                self.fd.write(bk.i18n._(", line %d") % urlData.line)
            if urlData.column:
                self.fd.write(bk.i18n._(", col %d") % urlData.column)
            self.fd.write("</td>\n</tr>\n")
        if urlData.baseRef and self.has_field("base"):
            self.fd.write("<tr>\n<td>"+self.field("base")+"</td>\n<td>"+
                urlData.baseRef+"</td>\n</tr>\n")
        if urlData.url and self.has_field("realurl"):
            self.fd.write("<tr>\n<td>"+self.field("realurl")+"</td>\n<td>"+
                '<a target="top" href="'+urlData.url+
                '">'+urlData.url+"</a></td>\n</tr>\n")
        if urlData.dltime >= 0 and self.has_field("dltime"):
            self.fd.write("<tr>\n<td>"+self.field("dltime")+"</td>\n<td>"+
                (bk.i18n._("%.3f seconds") % urlData.dltime)+
                "</td>\n</tr>\n")
        if urlData.dlsize >= 0 and self.has_field("dlsize"):
            self.fd.write("<tr>\n<td>"+self.field("dlsize")+"</td>\n<td>"+
                linkcheck.StringUtil.strsize(urlData.dlsize)+
                "</td>\n</tr>\n")
        if urlData.checktime and self.has_field("checktime"):
            self.fd.write("<tr>\n<td>"+self.field("checktime")+
                "</td>\n<td>"+
                (bk.i18n._("%.3f seconds") % urlData.checktime)+
                "</td>\n</tr>\n")
        if urlData.infoString and self.has_field("info"):
            self.fd.write("<tr>\n<td>"+self.field("info")+"</td>\n<td>"+
                linkcheck.StringUtil.htmlify(urlData.infoString)+
                "</td>\n</tr>\n")
        if urlData.warningString:
            #self.warnings += 1
            if self.has_field("warning"):
                self.fd.write("<tr>\n"+
                    self.tablewarning+self.field("warning")+
                    "</td>\n"+self.tablewarning+
                    urlData.warningString.replace("\n", "<br>")+
                    "</td>\n</tr>\n")
        if self.has_field("result"):
            if urlData.valid:
                self.fd.write("<tr>\n"+self.tableok+
                    self.field("result")+"</td>\n"+
                    self.tableok+urlData.validString+"</td>\n</tr>\n")
            else:
                self.errors += 1
                self.fd.write("<tr>\n"+self.tableerror+self.field("result")+
                    "</td>\n"+self.tableerror+
                    urlData.errorString+"</td>\n</tr>\n")
        self.fd.write("</table></td></tr></table><br clear=\"all\">")
        self.flush()

    def endOfOutput (self, linknumber=-1):
        if self.fd is None:
            return
        if self.has_field("outro"):
            self.fd.write("\n"+bk.i18n._("Thats it. "))
            #if self.warnings == 1:
            #    self.fd.write(bk.i18n._("1 warning, "))
            #else:
            #    self.fd.write(str(self.warnings)+bk.i18n._(" warnings, "))
            if self.errors == 1:
                self.fd.write(bk.i18n._("1 error"))
            else:
                self.fd.write(str(self.errors)+bk.i18n._(" errors"))
            if linknumber >= 0:
                if linknumber == 1:
                    self.fd.write(bk.i18n._(" in 1 link"))
                else:
                    self.fd.write(bk.i18n._(" in %d links") % linknumber)
            self.fd.write(bk.i18n._(" found")+"\n<br>")
            self.stoptime = time.time()
            duration = self.stoptime - self.starttime
            self.fd.write(bk.i18n._("Stopped checking at %s (%s)\n") % \
                (bk.strtime.strtime(self.stoptime),
                 bk.strtime.strduration(duration)))
            self.fd.write("</blockquote><br><hr noshade size=\"1\"><small>"+
                linkcheck.Config.HtmlAppInfo+"<br>")
            self.fd.write(bk.i18n._("Get the newest version at %s\n") % \
                ('<a href="'+linkcheck.Config.Url+'" target="_top">'+linkcheck.Config.Url+
                 "</a>.<br>"))
            self.fd.write(bk.i18n._("Write comments and bugs to %s\n\n") % \
                ('<a href="mailto:'+linkcheck.Config.Email+'">'+linkcheck.Config.Email+"</a>."))
            self.fd.write("</small></body></html>")
        self.flush()
        self.fd = None
|
@ -1,81 +0,0 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import bk.i18n


class Logger (object):

    Fields = {
        "realurl": bk.i18n._("Real URL"),
        "result": bk.i18n._("Result"),
        "base": bk.i18n._("Base"),
        "name": bk.i18n._("Name"),
        "parenturl": bk.i18n._("Parent URL"),
        "extern": bk.i18n._("Extern"),
        "info": bk.i18n._("Info"),
        "warning": bk.i18n._("Warning"),
        "dltime": bk.i18n._("D/L Time"),
        "dlsize": bk.i18n._("D/L Size"),
        "checktime": bk.i18n._("Check Time"),
        "url": bk.i18n._("URL"),
    }

    def __init__ (self, **args):
        self.logfields = None # log all fields
        if args.has_key('fields'):
            if "all" not in args['fields']:
                self.logfields = args['fields']

    def has_field (self, name):
        if self.logfields is None:
            # log all fields
            return True
        return name in self.logfields

    def field (self, name):
        """return translated field name"""
        # XXX i18nreal._(self.Fields[name])
        return self.Fields[name]

    def spaces (self, name):
        return self.logspaces[name]

    def init (self):
        # map with spaces between field name and value
        self.logspaces = {}
        if self.logfields is None:
            fields = self.Fields.keys()
        else:
            fields = self.logfields
        values = [self.field(x) for x in fields]
        # maximum indent for localized log field names
        self.max_indent = max(map(lambda x: len(x), values))+1
        for key in fields:
            self.logspaces[key] = " "*(self.max_indent - len(self.field(key)))

    def newUrl (self, urlData):
        raise Exception, "abstract function"

    def endOfOutput (self, linknumber=-1):
        raise Exception, "abstract function"

    def __str__ (self):
        return self.__class__.__name__

    def __repr__ (self):
        return repr(self.__class__.__name__)

@ -1,28 +0,0 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import linkcheck.logger.Logger


class NoneLogger (linkcheck.logger.Logger.Logger):
    """Dummy logger printing nothing."""

    def newUrl (self, urlData):
        pass

    def endOfOutput (self, linknumber=-1):
        pass

@ -1,101 +0,0 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import time
import linkcheck
import bk.i18n
import linkcheck.logger.StandardLogger
import linkcheck.logger.Logger


def applyTable (table, s):
    "apply a table of replacement pairs to str"
    for mapping in table:
        s = s.replace(mapping[0], mapping[1])
    return s


SQLTable = [
    ("'","''")
]


def sqlify (s):
    "Escape special SQL chars and strings"
    if not s:
        return "NULL"
    return "'%s'"%applyTable(SQLTable, s)


class SQLLogger (linkcheck.logger.StandardLogger.StandardLogger):
    """ SQL output for PostgreSQL, not tested"""

    def __init__ (self, **args):
        super(SQLLogger, self).__init__(**args)
        self.dbname = args['dbname']
        self.separator = args['separator']

    def init (self):
        linkcheck.logger.Logger.Logger.init(self)
        if self.fd is None: return
        self.starttime = time.time()
        if self.has_field("intro"):
            self.fd.write("-- "+(bk.i18n._("created by %s at %s\n") % (linkcheck.Config.AppName,
                          bk.strtime.strtime(self.starttime))))
            self.fd.write("-- "+(bk.i18n._("Get the newest version at %s\n") % linkcheck.Config.Url))
            self.fd.write("-- "+(bk.i18n._("Write comments and bugs to %s\n\n") % \
                          linkcheck.Config.Email))
            self.flush()

    def newUrl (self, urlData):
        if self.fd is None: return
        self.fd.write("insert into %s(urlname,recursionlevel,parentname,"
                      "baseref,errorstring,validstring,warningstring,infostring,"
                      "valid,url,line,col,name,checktime,dltime,dlsize,cached)"
                      " values "
                      "(%s,%d,%s,%s,%s,%s,%s,%s,%d,%s,%d,%d,%s,%d,%d,%d,%d)%s\n" % \
                      (self.dbname,
                       sqlify(urlData.urlName),
                       urlData.recursionLevel,
                       sqlify((urlData.parentName or "")),
                       sqlify(urlData.baseRef),
                       sqlify(urlData.errorString),
                       sqlify(urlData.validString),
                       sqlify(urlData.warningString),
                       sqlify(urlData.infoString),
                       urlData.valid,
                       sqlify(bk.url.url_quote(urlData.url)),
                       urlData.line,
                       urlData.column,
                       sqlify(urlData.name),
                       urlData.checktime,
                       urlData.dltime,
                       urlData.dlsize,
                       urlData.cached,
                       self.separator))
        self.flush()

    def endOfOutput (self, linknumber=-1):
        if self.fd is None: return
        if self.has_field("outro"):
            self.stoptime = time.time()
            duration = self.stoptime - self.starttime
            self.fd.write("-- "+bk.i18n._("Stopped checking at %s (%s)\n")%\
                          (bk.strtime.strtime(self.stoptime),
                           bk.strtime.strduration(duration)))
        self.flush()
        self.fd = None

@ -1,172 +0,0 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import sys
import time
import bk.i18n
import linkcheck.logger.Logger
import linkcheck.StringUtil
import linkcheck.Config


class StandardLogger (linkcheck.logger.Logger.Logger):
    """Standard text logger.

    Every Logger has to implement the following functions:
    init(self)
      Called once to initialize the Logger. Why do we not use __init__(self)?
      Because we initialize the start time in init, and __init__ is not
      called at the time the checking starts but when the logger object is
      created.
      Another reason is that we might create several loggers
      as a default and then switch to another configured output. So we
      must not print anything out at __init__ time.

    newUrl(self,urlData)
      Called every time a URL finished checking. All data we checked is in
      the UrlData object urlData.

    endOfOutput(self)
      Called at the end of checking to close filehandles and such.

    Passing parameters to the constructor:
    __init__(self, **args)
      The args dictionary is filled in Config.py. There you can specify
      default parameters. Adjust these parameters in the configuration
      files in the appropriate logger section.

    Informal text output format spec:
    Output consists of a set of URL logs separated by one or more
    blank lines.
    A URL log consists of two or more lines. Each line consists of
    keyword and data, separated by whitespace.
    Unknown keywords will be ignored.
    """

    def __init__ (self, **args):
        super(StandardLogger, self).__init__(**args)
        self.errors = 0
        #self.warnings = 0
        if args.has_key('fileoutput'):
            self.fd = file(args['filename'], "w")
        elif args.has_key('fd'):
            self.fd = args['fd']
        else:
            self.fd = sys.stdout

    def init (self):
        super(StandardLogger, self).init()
        if self.fd is None:
            return
        self.starttime = time.time()
        if self.has_field('intro'):
            self.fd.write("%s\n%s\n" % (linkcheck.Config.AppInfo, linkcheck.Config.Freeware))
            self.fd.write(bk.i18n._("Get the newest version at %s\n") % linkcheck.Config.Url)
            self.fd.write(bk.i18n._("Write comments and bugs to %s\n\n") % linkcheck.Config.Email)
            self.fd.write(bk.i18n._("Start checking at %s\n") % bk.strtime.strtime(self.starttime))
            self.flush()

    def newUrl (self, urlData):
        if self.fd is None:
            return
        if self.has_field('url'):
            self.fd.write("\n"+self.field('url')+self.spaces('url')+
                          urlData.urlName)
            if urlData.cached:
                self.fd.write(bk.i18n._(" (cached)\n"))
            else:
                self.fd.write("\n")
        if urlData.name and self.has_field('name'):
            self.fd.write(self.field("name")+self.spaces("name")+
                          urlData.name+"\n")
        if urlData.parentName and self.has_field('parenturl'):
            self.fd.write(self.field('parenturl')+self.spaces("parenturl")+
                          (urlData.parentName or "")+
                          (bk.i18n._(", line %d")%urlData.line)+
                          (bk.i18n._(", col %d")%urlData.column)+"\n")
        if urlData.baseRef and self.has_field('base'):
            self.fd.write(self.field("base")+self.spaces("base")+
                          urlData.baseRef+"\n")
        if urlData.url and self.has_field('realurl'):
            self.fd.write(self.field("realurl")+self.spaces("realurl")+
                          urlData.url+"\n")
        if urlData.dltime>=0 and self.has_field('dltime'):
            self.fd.write(self.field("dltime")+self.spaces("dltime")+
                          bk.i18n._("%.3f seconds\n") % urlData.dltime)
        if urlData.dlsize>=0 and self.has_field('dlsize'):
            self.fd.write(self.field("dlsize")+self.spaces("dlsize")+
                          "%s\n"%linkcheck.StringUtil.strsize(urlData.dlsize))
        if urlData.checktime and self.has_field('checktime'):
            self.fd.write(self.field("checktime")+self.spaces("checktime")+
                          bk.i18n._("%.3f seconds\n") % urlData.checktime)
        if urlData.infoString and self.has_field('info'):
            self.fd.write(self.field("info")+self.spaces("info")+
                          linkcheck.StringUtil.indent(
                              linkcheck.StringUtil.blocktext(urlData.infoString, 65),
                              self.max_indent)+"\n")
        if urlData.warningString:
            #self.warnings += 1
            if self.has_field('warning'):
                self.fd.write(self.field("warning")+self.spaces("warning")+
                              linkcheck.StringUtil.indent(
                                  linkcheck.StringUtil.blocktext(urlData.warningString, 65),
                                  self.max_indent)+"\n")

        if self.has_field('result'):
            self.fd.write(self.field("result")+self.spaces("result"))
            if urlData.valid:
                self.fd.write(urlData.validString+"\n")
            else:
                self.errors += 1
                self.fd.write(urlData.errorString+"\n")
        self.flush()

    def endOfOutput (self, linknumber=-1):
        if self.fd is None:
            return
        if self.has_field('outro'):
            self.fd.write(bk.i18n._("\nThats it. "))
            #if self.warnings==1:
            #    self.fd.write(bk.i18n._("1 warning, "))
            #else:
            #    self.fd.write(str(self.warnings)+bk.i18n._(" warnings, "))
            if self.errors==1:
                self.fd.write(bk.i18n._("1 error"))
            else:
                self.fd.write(str(self.errors)+bk.i18n._(" errors"))
            if linknumber >= 0:
                if linknumber == 1:
                    self.fd.write(bk.i18n._(" in 1 link"))
                else:
                    self.fd.write(bk.i18n._(" in %d links") % linknumber)
            self.fd.write(bk.i18n._(" found\n"))
            self.stoptime = time.time()
            duration = self.stoptime - self.starttime
            self.fd.write(bk.i18n._("Stopped checking at %s (%s)\n") % \
                          (bk.strtime.strtime(self.stoptime),
                           bk.strtime.strduration(duration)))
        self.flush()
        self.fd = None

    def flush (self):
        """ignore flush errors since we are not responsible for proper
        flushing of log output streams"""
        if self.fd:
            try:
                self.fd.flush()
            except IOError:
                pass

@ -1,139 +0,0 @@
# -*- coding: iso-8859-1 -*-
# Copyright (C) 2000-2004 Bastian Kleineidam
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

import time
import xml.sax.saxutils
import linkcheck.logger.StandardLogger
import bk.i18n


xmlattr_entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "\"": "&quot;",
}


def xmlquote (s):
    """quote characters for XML"""
    return xml.sax.saxutils.escape(s)


def xmlquoteattr (s):
    """quote XML attribute, ready for inclusion with double quotes"""
    return xml.sax.saxutils.escape(s, xmlattr_entities)


def xmlunquote (s):
    """unquote characters from XML"""
    return xml.sax.saxutils.unescape(s)


def xmlunquoteattr (s):
    """unquote attributes from XML"""
    return xml.sax.saxutils.unescape(s, xmlattr_entities)


class XMLLogger (linkcheck.logger.StandardLogger.StandardLogger):
    """XML output mirroring the GML structure. Easy to parse with any XML
    tool."""

    def __init__ (self, **args):
        super(XMLLogger, self).__init__(**args)
        self.nodes = {}
        self.nodeid = 0

    def init (self):
        linkcheck.logger.Logger.Logger.init(self)
        if self.fd is None: return
        self.starttime = time.time()
        self.fd.write('<?xml version="1.0"?>\n')
        if self.has_field("intro"):
            self.fd.write("<!--\n")
            self.fd.write("  "+bk.i18n._("created by %s at %s\n") % \
                          (linkcheck.Config.AppName, bk.strtime.strtime(self.starttime)))
            self.fd.write("  "+bk.i18n._("Get the newest version at %s\n") % linkcheck.Config.Url)
            self.fd.write("  "+bk.i18n._("Write comments and bugs to %s\n\n") % \
                          linkcheck.Config.Email)
            self.fd.write("-->\n\n")
        self.fd.write('<GraphXML>\n<graph isDirected="true">\n')
        self.flush()

    def newUrl (self, urlData):
        """write one node and all possible edges"""
        if self.fd is None: return
        node = urlData
        if node.url and not self.nodes.has_key(node.url):
            node.id = self.nodeid
            self.nodes[node.url] = node
            self.nodeid += 1
            self.fd.write('  <node name="%d" ' % node.id)
            self.fd.write(">\n")
            if self.has_field("realurl"):
                self.fd.write("    <label>%s</label>\n" %\
                              xmlquote(node.url))
            self.fd.write("    <data>\n")
            if node.dltime>=0 and self.has_field("dltime"):
                self.fd.write("      <dltime>%f</dltime>\n" % node.dltime)
            if node.dlsize>=0 and self.has_field("dlsize"):
                self.fd.write("      <dlsize>%d</dlsize>\n" % node.dlsize)
            if node.checktime and self.has_field("checktime"):
                self.fd.write("      <checktime>%f</checktime>\n" \
                              % node.checktime)
            if self.has_field("extern"):
                self.fd.write("      <extern>%d</extern>\n" % \
                              (node.extern and 1 or 0))
            self.fd.write("    </data>\n")
            self.fd.write("  </node>\n")
        self.writeEdges()

    def writeEdges (self):
        """write all edges we can find in the graph in a brute-force
        manner. Better would be a mapping of parent urls.
        """
        for node in self.nodes.values():
            if self.nodes.has_key(node.parentName):
                self.fd.write("  <edge")
                self.fd.write(' source="%d"' % \
                              self.nodes[node.parentName].id)
                self.fd.write(' target="%d"' % node.id)
                self.fd.write(">\n")
                if self.has_field("url"):
                    self.fd.write("    <label>%s</label>\n" % \
                                  xmlquote(node.urlName))
                self.fd.write("    <data>\n")
                if self.has_field("result"):
                    self.fd.write("      <valid>%d</valid>\n" % \
                                  (node.valid and 1 or 0))
                self.fd.write("    </data>\n")
                self.fd.write("  </edge>\n")
        self.flush()

    def endOfOutput (self, linknumber=-1):
        if self.fd is None: return
        self.fd.write("</graph>\n</GraphXML>\n")
        if self.has_field("outro"):
            self.stoptime = time.time()
            duration = self.stoptime - self.starttime
            self.fd.write("<!-- ")
            self.fd.write(bk.i18n._("Stopped checking at %s (%s)\n")%\
                          (bk.strtime.strtime(self.stoptime),
                           bk.strtime.strduration(duration)))
            self.fd.write("-->")
        self.flush()
        self.fd = None

@ -1,4 +0,0 @@
<a href="#myid">Bla</a>
<ul>
<li id="myid">
</ul>

@ -1,8 +0,0 @@
<!-- base without href -->
<base target="_top">
<!-- meta url -->
<META HTTP-equiv="refresh" content="0; url=misc.html">
<!-- spaces between key and value -->
<a href
=
"misc.html">

@ -1,3 +0,0 @@
<!-- base with href -->
<base href="base/">
<a href="test.txt">

@ -1,2 +0,0 @@
<!-- codebase test -->
<applet codebase="base/" archive="test.txt">

@ -1 +0,0 @@
file:///etc/group

@ -1,4 +0,0 @@
@font-face {
src:url(misc.html)
}
background-image:url(news.html)

@ -1,8 +0,0 @@
<a href="http.html">relative url</a>
<a href="http.html#isnix">bad anchor</a>
<a href="http.html#iswas">good anchor</a>
<a href="file:///etc/group">good file</a>
<a href="file://etc/group">bad file</a>
<a href="file:/etc/group">good file</a>
<a href="file:etc/group">bad file</a>
<a href="file:/etc/">good dir</a>

@ -1 +0,0 @@
file:///etc/group

@ -1,5 +0,0 @@
<!-- frame src urls -->
<frameset border="0" frameborder="0" framespacing="0">
<frame name="top" src="base1.html" frameborder="0">
<frame name="bottom" src="http.html" frameborder="0">
</frameset>

@ -1,6 +0,0 @@
<a href="ftp:/ftp.debian.org/"> <!-- ftp one slash -->
<a href="ftp://ftp.debian.org/"> <!-- ftp two slashes -->
<a href="ftp://ftp.debian.org//debian/"> <!-- ftp two dir slashes -->
<a href="ftp://ftp.debian.org/debian"> <!-- missing trailing dir slash -->
<a href="ftp://ftp.debian.org////////debian/"> <!-- ftp many dir slashes -->
<a href="ftp:///ftp.debian.org/"> <!-- ftp three slashes -->

@ -1,23 +0,0 @@
Just some HTTP links
<a b=c "boo" href="http://www.garantiertnixgutt.bla">bad url</a>
<a href="http://www.heise.de">ok</a>
<a href="http:/www.heise.de">one slash</a>
<a href="http:www.heise.de">no slash</a>
<a href="http://">no url</a>
<a href="http:/">no url, one slash</a>
<a href="http:">no url, no slash</a>
<a href="http://www.blubb.de/stalter&sohn">unquoted ampersand</a>
<a name="iswas">anchor for anchor.html</a>
<a href=http://slashdot.org/>unquoted</a>
<a href="http://www.heise.de/#isnix">invalid anchor</a>
<a href="HtTP://WWW.hEIsE.DE">should be cached</a>
<a href="HTTP://WWW.HEISE.DE">should be cached</a>
<!-- entities -->
<a href="http://www.heise.de/?quoted=&uuml;">html entities</a>
<a
href="mailto:postmaster@aol.de">postmaster@aol.de</a>
<!-- <a href=http://nocheckin> no check because of comment -->
<a href=illegalquote1">no beginning quote</a>
<a href="illegalquote2>no ending quote</a>
<!-- check the parser at end of file -->
<a href="g

@ -1 +0,0 @@
<a href="https://sourceforge.net/">https</a>

@ -1,21 +0,0 @@
<!-- extra mail checking -->
<html><head></head>
<body>
<!-- legal -->
<a href=mailto:calvin@LocalHost?subject=Hallo&to=michi>1</a>
<a href="mailto:Dude <calvin@studcs.uni-sb.de> , Killer <calvin@cs.uni-sb.de>?subject=bla">2</a>
<a href="mailto:Bastian Kleineidam <calvin@studcs.uni-sb.de>?bcc=jsmith%40wummel.company.com">3</a>
<a href="mailto:Bastian Kleineidam <calvin@studcs.uni-sb.de>">4</a>
<a href="mailto:">6</a>
<a href="mailto:o'hara@cs.uni-sb.de">5</a>
<a href="mailto:?to=calvin@studcs.uni-sb.de&subject=blubb&cc=calvin_cc@studcs.uni-sb.de&CC=calvin_CC@studcs.uni-sb.de">...</a>
<a href="mailto:news-admins@freshmeat.net?subject=Re:%20[fm%20#11093]%20(news-admins)%20Submission%20report%20-%20Pretty%20CoLoRs">...</a>
<a href="mailto:jan@jan-dittberner.de?subject=test">...</a>
<!-- illegal -->
<!-- contains non-quoted characters -->
<a href="mailto:a@d?subject=äöü">5</a>
<a href="mailto:calvin@cs.uni-sb.de?subject=Halli hallo">_</a>
<!-- ? extension forbidden in <> construct -->
<a href="mailto:Bastian Kleineidam <calvin@host1?foo=bar>">3</a>
</body>
</html>

@ -1,9 +0,0 @@
<!-- meta url -->
<meta http-equiv="refresh" content="5; url=http://localhost/">
<a href="hutzli:nixgutt">bad scheme</a>
<a href="javascript:loadthis()">javascript url</a>
<!-- multiple links in one tag -->
<applet archive="misc.html" src="misc.html">
<!-- css urls -->
<img style="@font-face {src:url(misc.html)};background-image:url(news.html)"
title="CSS urls">

@ -1,19 +0,0 @@
<!-- news testing -->
<a href="news:comp.os.linux.misc">
<!-- snews -->
<a href="snews:de.comp.os.unix.linux.misc">
<!-- no group -->
<a href="news:">
<!-- illegal syntax -->
<a href="news:§$%&/´`(§%">
<!-- nntp scheme with host -->
<a href="nntp://news.rz.uni-sb.de/comp.lang.python">
<!-- article span -->
<a href="nntp://news.rz.uni-sb.de/comp.lang.python/1-5">
<!-- article number -->
<a href="nntp://news.rz.uni-sb.de/EFGJG4.7A@deshaw.com">
<!-- host but no group -->
<a href="nntp://news.rz.uni-sb.de/">
<!-- article span -->
<a href="news:comp.lang.python/1-5">

@ -1,5 +0,0 @@
<a href="telnet:localhost">
<a href="telnet:">
<a href="telnet://swindon.city.ac.uk">
<a href="telnet://user@swindon.city.ac.uk">
<a href="telnet://user:password@swindon.city.ac.uk">

|
|||
test_base
|
||||
url file:///home/calvin/projects/linkchecker/test/html/base1.html
|
||||
valid
|
||||
url file:///home/calvin/projects/linkchecker/test/html/base2.html
|
||||
valid
|
||||
url file:///home/calvin/projects/linkchecker/test/html/codebase.html
|
||||
valid
|
||||
url misc.html
|
||||
valid
|
||||
url misc.html
|
||||
cached
|
||||
valid
|
||||
url test.txt
|
||||
baseurl file:///home/calvin/projects/linkchecker/test/html/base/
|
||||
valid
|
||||
url test.txt
|
||||
cached
|
||||
baseurl file:///home/calvin/projects/linkchecker/test/html/base/
|
||||
valid
|
||||
|
|
@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('base1.html', 'base2.html', 'codebase.html'):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('file.html', "file.txt", "file.asc", "file.css"):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('frames.html',):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('ftp.html',):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,14 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
htmldir = "test/html"
for filename in ('http.html',):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('https.html',):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('mail.html',):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('misc.html','anchor.html', 'norobots.html'):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('news.html',):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)

@ -1,13 +0,0 @@
# -*- coding: iso-8859-1 -*-
import os, linkcheck
config = linkcheck.Config.Configuration()
config.addLogger('test', linkcheck.test_support.TestLogger)
config['recursionlevel'] = 1
config['log'] = config.newLogger('test')
config["anchors"] = True
config["verbose"] = True
config.setThreads(0)
for filename in ('telnet.html',):
    url = os.path.join("test", "html", filename)
    config.appendUrl(linkcheck.UrlData.GetUrlDataFrom(url, 0, config))
linkcheck.checkUrls(config)